Using a function inside by causes troubles #5583

@AbrJA

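I'm facing an issue when I summarize a data.table using a function inside the "by" clause: filtering the summarized table afterwards returns empty results for values that are clearly present.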

Here is an example:

> library(data.table)
> dt <- data.table(x = c(4, 5, 1, 3, 2), y = 1L, key = "x")
> dt
   x y
1: 1 1
2: 2 1
3: 3 1
4: 4 1
5: 5 1
> str(dt)
Classes ‘data.table’ and 'data.frame':	5 obs. of  2 variables:
 $ x: num  1 2 3 4 5
 $ y: int  1 1 1 1 1
 - attr(*, ".internal.selfref")=<externalptr> 
 - attr(*, "sorted")= chr "x"

> dt_sum <- dt[, .(.N), by = .(round(2 / x))]
> dt_sum
   round N
1:     2 1
2:     1 2
3:     0 2
> str(dt_sum)
Classes ‘data.table’ and 'data.frame':	3 obs. of  2 variables:
 $ round: num  2 1 0
 $ N    : int  1 2 2
 - attr(*, "sorted")= chr "round"
 - attr(*, ".internal.selfref")=<externalptr> 

> dt_sum[round == 0] # ERROR
Empty data.table (0 rows and 2 cols): round,N
> dt_sum[round == 1] # CORRECT
   round N
1:     1 2
> dt_sum[round == 2] # ERROR
Empty data.table (0 rows and 2 cols): round,N

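I think the issue is this line of the str() output: - attr(*, "sorted")= chr "round". dt_sum is marked as sorted by round, but its rows are not actually in ascending order (2, 1, 0). I don't know whether this is a known issue or whether it is documented anywhere; I couldn't find anything. Greetings!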
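
A possible workaround, sketched under the assumption that the stale "sorted" attribute really is the cause (this has not been confirmed by the maintainers): drop the key with setkey(dt_sum, NULL) before filtering, so the subset no longer relies on the claimed ordering. The names res0 and res2 are only illustrative.

```r
library(data.table)

# Reproduce the summarized table from the report
dt <- data.table(x = c(4, 5, 1, 3, 2), y = 1L, key = "x")
dt_sum <- dt[, .(.N), by = .(round(2 / x))]

# Check whether the grouping column is really in ascending order
is.unsorted(dt_sum$round)

# Remove the key (and with it the "sorted" attribute), so that
# subsetting does not depend on the claimed sort order
setkey(dt_sum, NULL)

# Filter again; both values should now be found
res0 <- dt_sum[round == 0]
res2 <- dt_sum[round == 2]
print(res0)
print(res2)
```

If a keyed result is actually wanted, using keyby instead of by when grouping is another option worth trying, since keyby sorts the result by the grouping columns before setting the key.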

> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=es_MX.UTF-8       LC_NUMERIC=C               LC_TIME=es_MX.UTF-8        LC_COLLATE=es_MX.UTF-8     LC_MONETARY=es_MX.UTF-8   
 [6] LC_MESSAGES=es_MX.UTF-8    LC_PAPER=es_MX.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=es_MX.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.14.6

loaded via a namespace (and not attached):
[1] compiler_4.2.2 tools_4.2.2   
