UPDATE: This draft is obsolete. Please see latest draft at
I’m pleased to announce the availability of my latest chapter, on state-based systems, for my book “Modeling data with functional programming in R”. This chapter is the culmination of the ideas presented in the preceding chapters and works through numerous examples.
The chapter initially discusses the idea of state and how to manage it within closures. From this kernel we start to build some deterministic systems ranging from fractals, to cellular automata, and finally to a trading system modeled as a finite state machine. The chapter finishes with two probabilistic systems. The first is a Markov chain for modeling a corpus of text and the second is the Chinese restaurant process, which is used to generate the Dirichlet distribution.
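As a rough illustration of the first idea, here is a minimal sketch of state held in a closure. It is not code from the chapter, and `make_counter` is a hypothetical name chosen for this example.

```r
# Hypothetical helper (not from the book): returns a function that
# remembers how many times it has been called.
make_counter <- function() {
  count <- 0                # state lives in the enclosing environment
  function() {
    count <<- count + 1     # `<<-` updates the captured variable
    count
  }
}

counter <- make_counter()
counter()                   # first call
counter()                   # second call sees the updated state

other <- make_counter()     # a fresh closure gets its own independent state
other()
```

Each call to `make_counter()` creates a new environment, so the two counters do not interfere with each other.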
As usual, comments are appreciated. The most useful comments are around comprehension and flow. If there is anything that is unclear, needs more explanation, is inconsistent, or incorrect, please let me know in the comments.
Also, my editor is always looking for more reviewers. Please get in touch if you are able to do this.
Seems like you posted the entire book …
Yes, that’s by design. It’s easier for people to digest in toto as opposed to searching for each chapter on the blog.
Ah, okay. Thought it was accidental.
Appreciate the concern. LMK what you think of the book!
Thanks for posting the whole book. As a suggestion from the perspective of those following along from home, whether in a terminal, RStudio, or similar, you might negotiate with your publisher to typeset the assignment operator as <- rather than the symbol presently employed. New users will be confused, wonder why Brian's assignment looks different from theirs, and possibly assume they are doing it wrong. As R is a learn-by-doing language, this change might be beneficial. It is also good to learn that one is an accidental adept at lambda calculus, at whatever level, when using R.
Funny you mention that. I’ve spent the last 3 hours cleaning up formatting. The literate <- is actually intentional; I think it's more legible in book form. I'm also a believer in typing as muscle memory. That said, I will likely post all the examples to GitHub once the book is complete.
“All functions are first-class in R. [...] As a reminder, the syntax for function definition assigns a function to a variable. This is no different from assignment of a data structure to a variable.”
Yes, functions are first class, but R does neither of the things you say. In R, a name is assigned to an object (non-technical use of the word), not the other way ’round, as you indicate.
> library(pryr)
> x <- 5
> y <- x
> address(x)
[1] "0x108a93188"
> address(y)
[1] "0x108a93188"
‘x’ contains the address of the value 5
‘y’ does not contain the address of the variable ‘x’; it does contain the address of the value 5.
Similarly for functions (since they’re first-class):
> f <- function(x) x
> g <- f
> address(f)
[1] "0x106487800"
> address(g)
[1] "0x106487800"
It’s a subtle difference, but has significant implications for understanding R’s copy-on-modify semantics.
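To make the copy-on-modify point concrete, here is a minimal sketch. The base R part needs no packages; the address check at the end assumes the pryr package is installed and is skipped otherwise. The variable names are arbitrary.

```r
x <- c(1, 2, 3)
y <- x            # binds a second name to the same vector; nothing is copied yet

y[1] <- 100       # modifying through y is what triggers the copy
x                 # x is unchanged
y                 # only y sees the new value

# Optional address check, assuming pryr is available.
if (requireNamespace("pryr", quietly = TRUE)) {
  a <- c(1, 2, 3)
  b <- a
  same_before <- pryr::address(a) == pryr::address(b)  # expected to be TRUE
  b[1] <- 100
  same_after <- pryr::address(a) == pryr::address(b)   # expected to be FALSE
  print(c(before = same_before, after = same_after))
}
```

From the outside this looks like copy-by-value, because `x` never changes, but the copy is deferred until one of the names is used to modify the object.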
Thanks for pointing this out. I always believed R to be copy-by-value, but this makes sense in terms of being more memory efficient.
Hello,
I am interested in helping review your book, please let me know how to proceed regarding my giving you feedback, etc.
Looks like a very nice book.
Regards,
Jaime Suarez-Murias jaime_sm msn.com
Jaime, great, and thank you. I will forward your contact info to my editor, and he will be in touch.
Thank you.
I’m looking forward to hearing from your editor.
Take care,
Jaime
Hello Brian Lee,
I have not heard from your editor regarding how to provide feedback on your book. Do you want me to contact them directly?
Thank you,
Jaime
Jaime, he’s going to contact people after I finish the next chapter. You’ll probably hear from him sometime this month. It might also be easier to correspond over Twitter DM. I’m @cartesianfaith.
Hello Brian,
I’m following up regarding reviewing your book. I was never contacted by your editor about this.
You suggested I contact you via Twitter DM. I tried, but my message was rejected with a notice that you do not follow me. I’m @suarezmurias.
Thank you,
Jaime
Hi, I just followed you on Twitter, so please try again.
Brian, I have been looking over the several installments of your book that you have published on this blog. I have some general comments, concerns, and questions that follow from my reading, but they are for Chapter 1. I tried to comment on the old Chapter 1 post, but the system didn’t seem to accept the comment and gave no error feedback (I assume the post is too old for commentary).
Here goes:
While I thoroughly enjoy the content that you have published here (for free!), I find the example that runs through Section 1.2 (starting at Figure 1.1) somewhat perplexing. I don’t understand the purpose of juxtaposing an explicit object-oriented structure with an invocation of apply() and then comparing the complexity of the two implementations. The comparison makes no mention of the internal structure of apply(), so as a reader I’m curious how you expect it to be taken as an apples-to-apples comparison that should persuade someone to choose “implicit apply(…)” over “explicit aggregate_iris(…)”. Also, the object-oriented code you wrote for the comparison is trivial to implement; I would expect it to take no longer than 10 minutes to prepare (with 5 minutes to grab some coffee). I just don’t understand how that is supposed to persuade; the Strategy pattern is exceptionally simple to all but neophytes. For me, your statement of “complexity with no value” does not hold water. The object-oriented code that you have written has a very explicit structure, and I can see into your thought process behind accomplishing the task.
In practice, coupling the Strategy pattern with an underlying Command implementation is my bread-and-butter declarative technique when I am writing in object-oriented style. To review, a Strategy encapsulates an algorithm and Command encapsulates a calling convention (Command encapsulates a receiver, a selector, and the arguments necessary to execute.) With CPS implemented at the Strategy level, you can declare the steps of an algorithm and delegate each of those steps to a Command, injecting the next algorithmic step as its continuation.
I would appreciate a bit of insight into why you structured this example this way. As it stands, I feel you are comparing “apply(magic)” to a straw man and saying “See, wasn’t apply() easier?”, and I’m nearly certain you have something better to say than that.
I don’t know whether you have something planned to reconcile this, or whether there is simply a clarifying statement you can make to show how my reading is inaccurate, but I thought it might add value to say that this is how I thought as I was reading.
Thanks!
Hi, thanks for your comments. You raise good points regarding the example. As you note, the example is rather simple, but a full-blown example would take up too much space to be practical. I’m trying to concisely compare an FP approach to an OOP approach while also avoiding a book that is mired in these sorts of comparisons. Your point about coupling the Command pattern with Strategy actually highlights the bigger theme, which is that design patterns suffer from too narrowly defined use cases, most of which can be tossed aside when a language supports first-class functions. These aren’t meant to be fighting words, but in the context of data scientists, I haven’t met too many who have the capacity or inclination to debate the finer points of using one design pattern versus another. My view of functional programming is that I can do a lot efficiently and quickly with only a handful of concepts in my quiver. Given that I spend my day thinking about models, FP gives me a better trade-off in terms of writing clean code that doesn’t demand all of my meat-space computational resources.
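A minimal sketch of what I mean, assuming the built-in iris dataset: with first-class functions, the "strategy" is just an argument. `aggregate_columns` is a hypothetical name for this illustration, not the function from the book.

```r
# Hypothetical stand-in for an aggregation routine: the strategy is
# any function that reduces a numeric vector to a single number.
aggregate_columns <- function(data, strategy) {
  vapply(data, strategy, numeric(1))
}

measurements <- iris[, 1:4]   # the four numeric columns of iris

# Swapping strategies requires no class hierarchy, only a different function.
aggregate_columns(measurements, mean)
aggregate_columns(measurements, median)
aggregate_columns(measurements, function(x) diff(range(x)))   # ad hoc strategy
```

The last call shows the part that is awkward with a fixed set of strategy classes: a one-off algorithm can be supplied inline as an anonymous function.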