Data.table scope issue in functions #342

Open
DavideMessinaARS opened this issue Jun 9, 2021 · 2 comments

Comments

@DavideMessinaARS

I'm new to disk.frame so maybe I'm misunderstanding how it works with data.table.

I'm running disk.frame version 0.50 and data.table version 1.14.0.

library(disk.frame)
library(data.table)
setup_disk.frame()

test_dt = as.disk.frame(data.table(x = seq_len(10)), outdir = file.path(tempdir(), "test"), overwrite = TRUE)

test_fun <- function(fun_dt) {
  col_vect <- "x"
  print(fun_dt[, max(get(col_vect))])
}

col_vect <- "x"

test_fun(test_dt)
# return [1]  5 10

rm(col_vect)

test_fun(test_dt)
# return Error

The traceback for the error is:

Error in get(col_vect) : object 'col_vect' not found 
13. stop(condition) 
12. signalConditions(obj, exclude = getOption("future.relay.immediate", "immediateCondition"),
      resignal = resignal, ...) 
11. signalConditionsASAP(obj, resignal = FALSE, pos = ii) 
10. resolve.list(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE) 
9. resolve(y, result = TRUE, stdout = stdout, signal = signal, force = TRUE) 
8. value.list(fs) 
7. value(fs) 
6. future_xapply(FUN = FUN, nX = nX, chunk_args = X, args = list(...),
    get_chunk = `[`, expr = expr, envir = envir, future.globals = future.globals,
    future.packages = future.packages, future.scheduling = future.scheduling,
    future.chunk.size = future.chunk.size, future.stdout = future.stdout,  ... 
5. future.apply::future_lapply(get_chunk_ids(df, strip_extension = FALSE), 
    function(chunk_id) {
        chunk = get_chunk(df, chunk_id, keep = keep_for_future)
        data.table::setDT(chunk) ... 
4. `[.disk.frame`(fun_dt, , max(get(col_vect))) 
3. fun_dt[, max(get(col_vect))] 
2. print(fun_dt[, max(get(col_vect))]) 
1. test_fun(test_dt)
@xiaodaigh
Collaborator

There's an issue with disk.frame where it doesn't work within functions. It has to do with the global scope and NSE (non-standard evaluation). I am designing a revamp of how disk.frame handles NSE, but the caveat is that functions are unlikely to compose well.

So this is a "known" issue.
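To illustrate the kind of scoping problem described here, the following is a minimal base-R sketch. It is an analogy only, not disk.frame's actual implementation: `eval_on_chunk` is a hypothetical stand-in that evaluates a captured expression against the data with the global environment as its enclosure, so variables local to the calling function are invisible to it.

```r
# Hypothetical stand-in for a chunk evaluator (NOT disk.frame code):
# the captured expression is evaluated against the chunk, and any name not
# found in the chunk is looked up in the global environment, not in the
# frame of the function that wrote the expression.
eval_on_chunk <- function(chunk, expr) {
  eval(expr, envir = chunk, enclos = globalenv())
}

test_fun <- function(chunk) {
  col_vect <- "x"  # local to test_fun, so not reachable from globalenv()
  eval_on_chunk(chunk, quote(max(get(col_vect))))
}

chunk <- data.frame(x = 1:5)

# No global col_vect: the lookup fails even though test_fun defines one locally.
res_without_global <- tryCatch(
  test_fun(chunk),
  error = function(e) conditionMessage(e)
)

# With a global col_vect the same call succeeds, mirroring the report above.
assign("col_vect", "x", envir = globalenv())
res_with_global <- test_fun(chunk)
rm("col_vect", envir = globalenv())
```

In this sketch the first call fails with an "object not found" message and the second succeeds, which matches the behaviour reported in the issue, although the real cause inside disk.frame may differ in detail.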

@DavideMessinaARS
Author

I found a workaround to the scope issue by sending the objects to the GlobalEnv:

test_fun <- function(fun_dt) {
  col_vect <<- "x"
  print(fun_dt[, max(get(col_vect))])
}

(or using assign)

The problem is that I can't modify the function I'm using, so I'll need to wait for a fix to disk.frame or write a stopgap solution myself.
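One possible stopgap that leaves the function untouched is to define the needed variables globally only for the duration of the call. This is a sketch under assumptions: `with_temp_globals` is a hypothetical helper (not part of disk.frame or data.table), and it only helps if the caller knows which names and values the function uses internally.

```r
# Hypothetical helper: temporarily define variables in the global environment,
# evaluate `code`, then restore the previous global state on exit.
with_temp_globals <- function(vars, code) {
  genv <- globalenv()
  nms <- names(vars)
  # Remember which names already existed so they can be restored afterwards.
  existed <- vapply(nms, exists, logical(1), envir = genv, inherits = FALSE)
  old <- mget(nms[existed], envir = genv)
  on.exit({
    rm(list = nms[!existed], envir = genv)
    for (nm in names(old)) assign(nm, old[[nm]], envir = genv)
  }, add = TRUE)
  for (nm in nms) assign(nm, vars[[nm]], envir = genv)
  force(code)  # `code` is lazy, so it is evaluated only now, with the globals set
}

# Stand-in for a function that cannot be modified and whose inner lookup
# only sees the global environment (an assumption made for this demo).
unmodifiable_fun <- function() get("col_vect", envir = globalenv())

result <- with_temp_globals(list(col_vect = "x"), unmodifiable_fun())
leaked <- exists("col_vect", envir = globalenv(), inherits = FALSE)
```

With the code from this issue, the call would presumably look like `with_temp_globals(list(col_vect = "x"), test_fun(test_dt))`. That has not been verified against disk.frame, but it relies on the same observation as above: the call succeeded while a global `col_vect` existed.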

In any case, thanks for your help.
