Performance Comparison

MattNapsAlot edited this page Jan 3, 2012 · 3 revisions
> x <- makeBigFrame(ncol=10000,nrow=100000)
> xx <- makeBigFrame("data.frame",ncol=10000,nrow=10000)
> dim(x)
[1] 1e+05 1e+04

> dim(xx)
[1] 10000 10000


> system.time(xx[10000,])
   user  system elapsed 
  0.198   0.017   1.348 

> system.time(x[10000,])
   user  system elapsed 
  2.808   0.654   3.639 

It looks like the bigDataFrame package adds a lot of overhead. Reading a single row directly from the underlying HDF5 file takes only a few milliseconds:

> system.time(HDF5ReadData(hdfFile(x), "/all.data/dataValues", options=list(startindex=0,nrows=1)))
   user  system elapsed 
  0.003   0.000   0.005 

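Here is how the time for a single-row read (x[1,]) breaks down by task. Setting the storage mode and setting the row names account for most of it: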

> x <- BigDataFrame('bigDataFrame.h5')
> dim(x)
[1] 1e+05 1e+04
> x[1,]
Get Data
   user  system elapsed 
  0.007   0.000   0.007 
Make Data Frame
   user  system elapsed 
  0.067   0.009   0.076 
Set Row Names
   user  system elapsed 
  2.126   0.885   3.048 
Set Col Names
   user  system elapsed 
  0.137   0.034   0.173 
Getting col classes
   user  system elapsed 
  0.135   0.034   0.171 
Getting levels values
   user  system elapsed 
  0.273   0.036   0.311 
Setting Storage Mode
   user  system elapsed 
 46.729   0.366  47.553 
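
A per-step breakdown like the one above can be produced by wrapping each stage in `system.time`. The following is a minimal sketch in base R, not the package's actual instrumentation: `timeStep` is a hypothetical helper, and a vector of random numbers stands in for the row read from the HDF5 file.

```r
# Hypothetical helper (not part of bigDataFrame): time one step and
# print the label followed by the timing, as in the output above.
timeStep <- function(label, expr) {
  cat(label, "\n")
  # expr is a lazy argument, so it is evaluated here, inside system.time()
  timing <- system.time(result <- expr)
  print(timing)
  invisible(list(result = result, timing = timing))
}

ncols <- 1000
values <- runif(ncols)  # assumption: stand-in for one row read from disk

# Stage 1: build a one-row data frame from the raw values
step1 <- timeStep("Make Data Frame", as.data.frame(as.list(values)))
row <- step1$result

# Stage 2: attach column names
step2 <- timeStep("Set Col Names", {
  names(row) <- paste0("V", seq_len(ncols))
  row
})
row <- step2$result
```

Timing each stage separately like this makes it easier to see which one dominates before trying to optimize it.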
