I’m writing a book proposal based on the lecture notes from the R for Quants workshop I conducted at the Baruch MFE program. I’ve discovered that a book proposal is remarkably similar to a business plan: you need to identify your target audience, the size of that audience, what differentiates yours from other books in the market, and why you’re qualified to write that book. As an exercise to focus my own thoughts on this process, this post touches on the motivation for such a book.

The purpose of this book is to illustrate the value of bringing together quantitative analysis and modeling with computer science and systems development into a holistic discipline. The emphasis is on the use of functional programming as the conceptual force that binds the quantitative model with the systems model. Joining two fields that at times seem mutually repellent yields a program structure that readily supports a sequential development process that easily transitions from analysis to modeling to systems development. While this approach expects more from practitioners, we argue that the gains are well worth the effort. Regardless of our enthusiasm, quants face ever larger data sets, more computationally demanding problems, and real-time demands (not to mention shrinking budgets). It is thus more challenging to solve contemporary problems without having a solid foundation in both quantitative methods and computer science. Perhaps the term Financial Engineer has gained currency precisely because academics and practitioners recognize the need to emphasize both aspects of the discipline. The world of high frequency trading has already embraced this model where the superstars are experts in both domains. While not everybody needs to fit this mold, at a minimum practitioners must appreciate both sides of the coin to maximize the effectiveness of their analytical systems.

Convincing the reader of such a claim requires a stroll through computer science history, touching on the origins of mathematical thinking, statistical programming, and programming paradigms. On the surface such an exposition may seem superfluous, when in fact it provides a conceptual and contextual foundation for the design principles discussed later in the book. For example, discussing the merits of a functional programming paradigm requires one to understand other programming paradigms and how they impact systems design. These discussions are viewed through the lens of quantitative finance, with contrasting examples touching on modern portfolio theory, asset pricing, signal generation, portfolio optimization, etc. In most cases the principles highlighted in these examples are not exclusive to quantitative finance and can be applied to other computational fields. As an example, the price change of an instrument can be modeled in various ways. Certain situations may require balancing the precision of the model against the performance of the calculation. Identifying which implementation provides the optimal balance means swapping out one model for another. A poor system design makes this process tedious and error prone. The lesson applies equally well whether the model describes asset prices or sodium channels in the brain.
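To make the idea of pluggable models concrete, here is a minimal sketch in R. The names (gbm_returns, bootstrap_returns, simulate_price) are hypothetical and not drawn from the book; the point is that when each model exposes the same functional interface, swapping one model for another is a one-argument change rather than a rewrite.

```r
# Two interchangeable models of price changes, sharing one interface:
# each maps a historical price series to a vector of simulated log returns.
gbm_returns <- function(prices, n = 100) {
  # Geometric Brownian motion: log returns treated as i.i.d. normal
  r <- diff(log(prices))
  rnorm(n, mean = mean(r), sd = sd(r))
}

bootstrap_returns <- function(prices, n = 100) {
  # Historical bootstrap: resample observed log returns
  r <- diff(log(prices))
  sample(r, n, replace = TRUE)
}

# The caller depends only on the interface, so the model is pluggable
simulate_price <- function(last_price, return_model, prices, n = 100) {
  last_price * exp(cumsum(return_model(prices, n)))
}

prices <- cumprod(c(100, exp(rnorm(250, 0, 0.01))))
path_a <- simulate_price(tail(prices, 1), gbm_returns, prices)
path_b <- simulate_price(tail(prices, 1), bootstrap_returns, prices)
```

Trading precision for performance then amounts to passing a different function, which is exactly the kind of design flexibility the book argues for.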

Once the fundamentals are covered, the core of the book will examine various functional approaches and how these methods facilitate the modeling process. Functional programming concepts like first-class functions, higher-order functions, side effects, and pattern matching will be presented alongside numerical optimization and linear algebra to illustrate this interplay. Despite the plethora of programming models available in R, we rely on two parallel reference implementations (Reference Classes and our own futile.paradigm) to drive our main arguments. At times we also discuss the S3 and S4 object systems, since both are integral parts of R and unavoidable in daily use. To avoid reams of source listings, transformations between implementations will at times seem like magic, similar to jumps in a proof. We make heavy use of diagrams to show the conceptual connections between the quantitative model and the software model as a way to fill in these gaps.
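As a taste of what we mean by first-class and higher-order functions, consider the following sketch. The names (make_payoff, expected_payoff) are illustrative only and do not come from Reference Classes or futile.paradigm.

```r
# make_payoff returns a payoff function: a closure over strike and option type.
# Functions are first-class values in R, so they can be returned and passed around.
make_payoff <- function(strike, type = c("call", "put")) {
  type <- match.arg(type)
  if (type == "call") function(s) pmax(s - strike, 0)
  else                function(s) pmax(strike - s, 0)
}

# expected_payoff is a higher-order function: it takes a payoff function as input
expected_payoff <- function(payoff, terminal_prices) {
  mean(payoff(terminal_prices))
}

s_t <- 100 * exp(rnorm(10000, -0.005, 0.1))   # simulated terminal prices
expected_payoff(make_payoff(105, "call"), s_t)
expected_payoff(make_payoff(95, "put"), s_t)
```

The quantitative idea (a payoff) and the software construct (a function value) line up one-to-one, which is the interplay the core chapters explore.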

Next we examine more advanced topics, such as practical strategies for coping with big data, on-demand analytics, and high performance computing, and the impact these scenarios have on the modeling process. A common theme is how the superior modularity of functional programming enables the quant to easily tune and replace models as conditions and constraints change over time. An equally important benefit is that functional programs with limited side effects are easier to parallelize, which means that not only is the model pluggable but so is the wiring. Packages like ‘foreach’ and ‘snow’ can serve as drop-in implementations that leverage parallelism behind the scenes if used correctly. Similar strategies can be applied to GPU-based computations. When done incorrectly, these enhancements act as shackles that prevent alternative analytics, since the logical flow is stuck in a single path. Proper systems design and functional programming techniques simplify the process of adding these features without disrupting the modeling process.
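Here is a minimal sketch of what “pluggable wiring” looks like, assuming the foreach, doSNOW, and snow packages are installed; fit_model is a hypothetical stand-in for a calibration step. Because the work inside the loop is free of side effects, moving from serial to parallel execution is a change of operator, not a change of model.

```r
library(foreach)
library(doSNOW)

fit_model <- function(window) {
  # hypothetical stand-in for calibrating a model over one data window
  coef(lm(y ~ x, data = window))
}

windows <- lapply(1:8, function(i)
  data.frame(x = rnorm(100), y = rnorm(100)))

# Serial execution
serial <- foreach(w = windows) %do% fit_model(w)

# Parallel execution: register a snow cluster and swap the operator
cl <- snow::makeCluster(4, type = "SOCK")
registerDoSNOW(cl)
parallel <- foreach(w = windows) %dopar% fit_model(w)
snow::stopCluster(cl)
```

Had fit_model mutated shared state, the %dopar% version would quietly break; limiting side effects is what keeps the wiring swappable.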

An interdisciplinary book of this nature runs the risk of alienating all interested parties. To even pick up the book requires more than a passing knowledge of, and interest in, both quantitative fields and computer science. Many introductory concepts are glossed over in order to maintain focus on the core discussion. Knowledge of basic statistical and numerical methods, machine learning, and programming concepts is assumed, though we provide copious references to the literature as well as refresher notes in the appendix. When it makes sense, we spend extra time establishing core concepts. For example, we expect most quants to know the basics of object-oriented programming but little, if anything, about its dual, functional programming. Coming from the other side, software engineers transitioning into quantitative development should be comfortable with basic statistics and linear algebra. The non-finance audience will find that many of the analytical methods are industry-independent. For those in the financial industry, many of the techniques and systems discussed can be used as is.

While this book assumes a fair amount of knowledge, it is not a research book. Experts looking for new approaches to quantitative analysis or computer science will not find them here. The contribution of this book is the marriage of these fields and the approaches one must take to satisfy the demands of modern computational systems. Our hope is that the diverse background of quants provides fertile ground for such a book, and that the desire to optimize processes and systems applies equally to financial engineering itself.