Internal error: DT passed to assign has not been allocated enough column slots #4100

@tyner

It appears that set() does not always allocate enough slots. Example:

library(data.table)
DT = data.table(a = runif(10))

my.set = function(x, i = NULL, j, value) {
    # sanity check: allocated slots must never be fewer than used columns
    if (truelength(x) < length(x)) {
       stop("bad input")
    }
    # no spare slots left before adding a column
    if (truelength(x) == length(x)) {
        cat("do we need to call alloc.col or setDT here?\n")
    }
    # note: switching from set() to DT[, c(j) := value] works here
    set(x, i, j, value)
    
    # same sanity check after the assignment
    if (truelength(x) < length(x)) {
       stop("bad output")
    }
    invisible(x)
}

set.seed(6860)
while(ncol(DT) < 10000L){

    new.name = paste("V", ncol(DT) + 1L)
    new.value = sample(nrow(DT))
    
    my.set(DT, j = new.name, value = new.value)
}

As indicated in the comment above, switching from "set(x, i, j, value)" to "DT[, c(j) := value]" makes it work. The situation seems similar to what was reported in #1830.
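A possible workaround, sketched under the assumption that the error comes from set() running out of the spare column slots that data.table over-allocates: reserve enough slots up front with alloc.col() before the loop. The target of 2000 columns and the name n.target are illustrative only, and I have not confirmed that this is the intended usage.

```r
library(data.table)

DT = data.table(a = runif(10))

# Assumption: the failure is caused by exhausting the pre-allocated column
# slots; n.target is an illustrative column count, not taken from the report.
n.target = 2000L

# Reserve n.target spare column slots before adding columns by reference.
# (Newer data.table releases also expose this function as setalloccol.)
alloc.col(DT, n.target)

while (ncol(DT) < n.target) {
    new.name = paste0("V", ncol(DT) + 1L)
    set(DT, j = new.name, value = sample(nrow(DT)))
}

# After the loop, allocated slots should still cover the columns in use.
ncol.ok = ncol(DT) == n.target
slots.ok = truelength(DT) >= length(DT)
```

If this holds, it would suggest that "DT[, c(j) := value]" grows the allocation when needed, while set() relies on the slots already being there.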

sessionInfo() is:

R version 3.6.0 (2019-04-26)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS: /home/btyner/R360/lib64/R/lib/libRblas.so
LAPACK: /home/btyner/R360/lib64/R/lib/libRlapack.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.6.0
