Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is insert slow? #13

Open
talgalili opened this issue Nov 9, 2018 · 2 comments
Open

Is insert slow? #13

talgalili opened this issue Nov 9, 2018 · 2 comments

Comments

@talgalili
Copy link

I came across your packages (which is cool!), I've played with adding an as.list.stack function, but it appears to be very slow. Below is the code, please let me know if I've missed anything:

# install.packages("datastructures")
# install.packages("microbenchmark")

library("datastructures")
library("microbenchmark")

as.list.stack <- function(x, ..., mode) {
  # x is a stack
  # mode is based on the class of the top element
  n <- size(x)
  top <- peek(x)
  if(missing(mode)) {
    if(is.list(top)) {
      mode <- "list"
    } else {
      mode <- class(top)
    }
  }
  out <- vector(n, mode = mode)
  for(i in n:1) {
    out[[i]] <- pop(x)
  }
  
  out
}


##
# Basic benchmarking:
f1 <- function(n = 100) {
  x <- c()
  for(i in 1:n) {
    x <- c(x, i)
  }
  x
}

f2 <- function(n = 100) {
  x <- stack()
  for(i in 1:n) {
    insert(x, i)
  }
  as.list(x)
}

# f2(10)

Benchmarking:

> library("microbenchmark")
> microbenchmark(f1(100),f2(100), times = 10)
Unit: microseconds
    expr       min        lq       mean    median        uq       max neval
 f1(100)    73.661    99.357   141.3709   124.345   197.047   210.013    10
 f2(100) 10904.531 11115.825 13590.8941 11988.055 17615.234 18755.162    10
> microbenchmark(f1(10000),f2(10000), times = 10)
Unit: milliseconds
      expr     min       lq     mean   median       uq       max neval
 f1(10000) 190.201 203.9211 230.4935 227.3870 246.1155  291.2356    10
 f2(10000) 581.717 642.4793 699.8962 655.3675 701.4607 1004.4701    10

When doing some profiling, it seems that the insert operation is very expansive:

library(profvis)
profvis({
  n <- 10000
  x <- c()
  for(i in 1:n) {
    x <- c(x, i)
  }
  x
})
profvis({
  n <- 10000
  x <- stack()
  for(i in 1:n) {
    insert(x, i)
  }
  as.list(x)
})

I'm suspecting this might be because insert is a generic method and it has to do a lookup everytime I call it (and since it is S4, it might make it a bit slow).
I've tried to find the function directly but failed to get it:

> # showMethods("insert")
> getMethod("insert", "stack")
Error in getMethod("insert", "stack") : 
  no method found for function 'insert' and signature stack

Any thoughts?

@dirmeier
Copy link
Owner

dirmeier commented Nov 12, 2018

Hey, yes you are right. Iteratively inserting scalars is rather slow compared to inserting an entire vector. So I'd just do

insert(x, seq(n))

I guess that's due to the Rcpp/Boost call. I haven't found to time to benchmark the package, so your efforts are highly appreciated, thanks.

@talgalili
Copy link
Author

Interesting. But then, would that be faster than R?! (worth checking)

Do you see a way in which it could become faster?!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants