Description
Consider a simple data table:
library(data.table)
dt <- data.table(i = 1:2, na = rnorm(2), nb = rnorm(2),
                 ua = runif(2), ub = runif(2))
dt
## i na nb ua ub
## 1: 1 0.8675148 1.1900491 0.09394934 0.2717421
## 2: 2 -0.1700282 0.9188715 0.58017687 0.5443863
When melting it into multiple columns, we get:
melt(dt, measure.vars=list(c("na", "nb"), c("ua", "ub")))
## i variable value1 value2
## 1: 1 1 0.8675148 0.09394934
## 2: 2 1 -0.1700282 0.58017687
## 3: 1 2 1.1900491 0.27174213
## 4: 2 2 0.9188715 0.54438634
In particular, variable is a factor with levels "1" and "2". This behavior seems to be undocumented. ?melt says:
‘list’ is a generalization of the vector version - each
element of the list (which should be ‘integer’ or
‘character’ as above) will become a ‘melt’ed column.
and
From version ‘1.9.6’, ‘melt’ gains a feature with ‘measure.vars’
accepting a list of ‘character’ or ‘integer’ vectors as well to
melt into multiple columns in a single function call efficiently.
The function ‘patterns’ can be used to provide regular expression
patterns. When used along with ‘melt’, if ‘cols’ argument is not
provided, the patterns will be matched against ‘names(data)’, for
convenience.
However, I cannot find anything about
- the fact that the variable column will use numbered labels to denote the original columns, and
- what the relationship between the numeric label and the original column name is.
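For reference, both points can be checked empirically. The sketch below assumes, based only on the output above and not on the documentation, that label "k" corresponds to the k-th element of each measure.vars vector. The names cols, long, check1 and check2, and the seed, are only illustrative.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only so the run is reproducible
dt <- data.table(i = 1:2, na = rnorm(2), nb = rnorm(2),
                 ua = runif(2), ub = runif(2))
cols <- list(c("na", "nb"), c("ua", "ub"))  # illustrative name
long <- melt(dt, measure.vars = cols)

# Inspect how the original columns are labelled in the result
class(long$variable)
unique(as.character(long$variable))

# Observed (not documented): label "k" holds the k-th column of each group.
# dt$i equals the row number here, so long$i can index back into dt.
k <- as.integer(as.character(long$variable))
check1 <- all(long$value1 == ifelse(k == 1L, dt$na[long$i], dt$nb[long$i]))
check2 <- all(long$value2 == ifelse(k == 1L, dt$ua[long$i], dt$ub[long$i]))
```

If check1 and check2 are both TRUE, the labels follow the positions within each vector for this input. That is an observation about one run, not a guarantee.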
I know there are related feature requests (#2551 and #3396). I am also aware of related solutions (e.g. on SO) that revolve around renaming the corresponding factor levels. However, for such solutions to be considered safe, the behavior of numeric levels should be documented and considered part of the API.
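To illustrate the kind of workaround I mean, here is a minimal sketch of the relabeling approach. It relies on the same undocumented assumption (label "k" maps to position k in each measure.vars vector), which is exactly why I would like it documented. The names cols, suffix and long are hypothetical, and deriving the suffix by dropping the first character only fits this toy naming scheme.

```r
library(data.table)

set.seed(42)  # arbitrary seed, only so the run is reproducible
dt <- data.table(i = 1:2, na = rnorm(2), nb = rnorm(2),
                 ua = runif(2), ub = runif(2))
cols <- list(c("na", "nb"), c("ua", "ub"))  # illustrative name

# Every group must have the same length for positions to line up
stopifnot(all(lengths(cols) == length(cols[[1]])))

long <- melt(dt, measure.vars = cols)

# ASSUMPTION (undocumented): label "k" refers to position k in each vector
k <- as.integer(as.character(long$variable))

# Toy naming scheme: drop the first character of "na", "nb" to get "a", "b"
suffix <- substring(cols[[1]], 2)
long[, variable := factor(suffix[k], levels = suffix)]
```

After this, variable carries "a" and "b" instead of "1" and "2", but the result is only correct as long as the positional mapping holds.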
Tested with data.table 1.12.6 on R 3.4 and R 3.6.