Suppose that you would like to create a function which does a series of computations on a data frame. You would like to pass a column as this function’s argument. LINK.
My new release of partools is now on CRAN.
The package is aimed at doing parallel data science in what I call an “un-MapReduce” manner. It takes the point of view that MapReduce-based frameworks such as Hadoop and Spark are fine for the types of applications their designers had in mind, namely rather simple SQL actions, but have fundamental handicaps that prevent them from performing well on many, if not most, of the types of computation that typical users need for large data sets and/or highly compute-bound applications. The distributed file/object nature of those MapReduce systems is retained, but the confining MapReduce computational paradigm is avoided. LINK.
I am pleased to update you on theData Science Boot Camp we ran at the Ted Rogers School of Management at Ryerson University in Toronto in collaboration with IBM’swww.BigDataUniversity.com. The 9-week long Boot Camp concluded on July 15. LINK.
© Sva prava pridržana, Kompjuter biblioteka, Beograd, Obalskih radnika 4a, Telefon: +381 11 252 0 272 |
||