Solved – Computation speed in R

computational-statistics, r

I have been tasked with moving one of our current large stochastic models out of SAS and into a new language. Personally, I prefer a traditional compiled language, but the PI wants me to check out R, which I've never used. Our motivation for getting the model out of SAS is (1) many people don't have access to it because SAS is expensive, (2) we're looking to move away from an interpreted language, and (3) SAS is slow for the type of model we have.

For (1), obviously R satisfies the need for it to be free. For (2), ideally, we'd like to create an executable, but R is normally used as a scripting language. I see that someone has recently put out an R compiler. Has this been well received? Is it easy to use? We'd rather not force the user to download R themselves. For (3), our problem with SAS is all the time spent on I/O, writing and reading data sets. Our model is computationally intensive, and we are often limited by runtime. (For example, it's not uncommon for someone to hijack other people's computers over the weekend to perform runs.) We have a similar model built in Fortran that doesn't have the same problem because all work is done in memory. How does R work? Will it be the same as SAS, in that it works in data steps, reading and writing files? Or can it do array manipulation in memory?

Best Answer

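R works in memory, so your data need to fit into RAM for the majority of functions. Unlike SAS data steps, it does not read and write data sets on disk between operations unless you ask it to, and vectors, matrices and arrays are manipulated directly in memory.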
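As a rough illustration of in-memory array work, here is a minimal sketch of a toy simulation. The sizes, the variable names and the random-walk model are all made up for the example and have nothing to do with the model in the question.

```r
# Toy stochastic simulation held entirely in RAM (hypothetical sizes and names).
set.seed(1)                      # reproducible random draws
n_sims  <- 1000L                 # number of simulated paths (assumption)
n_steps <- 50L                   # time steps per path (assumption)

# One row per path, one column per time step; nothing is written to disk.
shocks <- matrix(rnorm(n_sims * n_steps), nrow = n_sims, ncol = n_steps)

# Cumulative sum along each row turns the shocks into random-walk paths.
# apply() returns the result column-wise, so transpose it back.
paths <- t(apply(shocks, 1, cumsum))

# Final value of every path, plus summary statistics computed in memory.
final <- paths[, n_steps]
summary_stats <- c(mean = mean(final), sd = sd(final))

print(dim(paths))
print(format(object.size(paths), units = "Kb"))  # memory used by the array
```

If the arrays of a full-size run do not fit in RAM, this approach breaks down and the work would have to be processed in chunks.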

The compiler package, if I am thinking of the same thing you are (Luke Tierney's compiler package, supplied with R), is not a compiler in the traditional sense of C or Fortran. It is a byte compiler for R, comparable to Java bytecode executed by the Java VM or to byte-compiled Emacs Lisp. It doesn't compile R code down to machine code; it translates the R code into bytecode that the R interpreter can execute more efficiently than the unprocessed source. It does not produce a standalone executable, so users would still need R installed.
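A minimal sketch of what byte compiling looks like in practice, using `cmpfun()` from the compiler package. The function `slow_sum_sq` is a hypothetical example written only for this illustration.

```r
library(compiler)  # ships with R; no extra installation needed

# A deliberately loop-heavy function (hypothetical example).
slow_sum_sq <- function(x) {
  total <- 0
  for (v in x) {
    total <- total + v * v
  }
  total
}

# cmpfun() returns a byte-compiled copy of the function.
fast_sum_sq <- cmpfun(slow_sum_sq)

x <- as.numeric(1:1000)

# Both versions compute the same result; only the execution path differs.
stopifnot(identical(slow_sum_sq(x), fast_sum_sq(x)))

print(fast_sum_sq)  # the printed function carries a <bytecode: ...> line
```

Recent versions of R byte-compile functions automatically through a just-in-time compiler, so an explicit `cmpfun()` call may make little measurable difference there. Either way the result still runs inside the R interpreter.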

Note that if you have well-formed Fortran, you could probably have the best of both worlds: R can call compiled Fortran routines.
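A hedged sketch of that route: the Fortran subroutine `scalevec` below is a hypothetical stand-in for real model code, and the example assumes a Fortran compiler that R's build tools can find. It builds a shared library with `R CMD SHLIB`, loads it with `dyn.load()` and calls it through `.Fortran()`.

```r
# Hypothetical Fortran routine: multiplies a vector by a scalar in place.
src <- c(
  "      subroutine scalevec(n, x, a)",
  "      integer n, i",
  "      double precision x(n), a",
  "      do i = 1, n",
  "         x(i) = a * x(i)",
  "      end do",
  "      end"
)

# Work in a scratch directory so build products do not clutter the project.
build_dir <- tempfile("fdemo")
dir.create(build_dir)
old_wd <- setwd(build_dir)
writeLines(src, "scalevec.f")

# Compile to a shared library with R's own toolchain.
# This step fails (non-zero status) if no Fortran compiler is available.
status <- system2(file.path(R.home("bin"), "R"),
                  c("CMD", "SHLIB", "scalevec.f"),
                  stdout = FALSE, stderr = FALSE)
lib <- file.path(build_dir, paste0("scalevec", .Platform$dynlib.ext))
setwd(old_wd)

out <- NULL
if (status == 0 && file.exists(lib)) {
  dyn.load(lib)
  # Argument types must match the Fortran declarations exactly.
  out <- .Fortran("scalevec",
                  n = 3L,
                  x = as.double(c(1, 2, 3)),
                  a = as.double(2))
  print(out$x)  # the scaled vector returned from Fortran
  dyn.unload(lib)
}
```

`.Fortran()` returns a list holding the possibly modified arguments, so the heavy computation can stay in Fortran while R handles setup, post-processing and plotting.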
