The arsenal package relies fairly heavily on variable labels to make output more “pretty”. A label here is a single character string containing “pretty” text (i.e., not an “ugly” variable name). Three of the main arsenal functions use labels in their summary() output. There are several ways to set these labels.
We’ll use the mockstudy dataset for all examples here:
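The examples that follow assume a setup along these lines. This is a sketch: the construction of `tab.ex` is an assumption inferred from the three columns that the `freqlist()` examples label, and loading magrittr for the `%>%` pipe is likewise assumed.

```r
library(arsenal)
library(magrittr)  # assumed source of the %>% pipe used in later examples

data(mockstudy)

# Assumed construction of tab.ex: a three-way table over the columns
# labelled in the freqlist() examples, keeping missing values as a level
tab.ex <- table(mockstudy[, c("arm", "sex", "mdquality.s")], useNA = "ifany")
```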
The summary() method for tableby(), modelsum(), and freqlist() objects accepts a labelTranslations = argument for specifying labels in the function call. Note that freqlist() matches labels in order, whereas the other two match labels by name. The labels can be supplied as a list or a character vector.
summary(freqlist(tab.ex),
        labelTranslations = c(arm = "Treatment Arm", sex = "Gender", mdquality.s = "LASA QOL"))
| Treatment Arm | Gender | LASA QOL | Freq | Cumulative Freq | Percent | Cumulative Percent |
|---|---|---|---|---|---|---|
| A: IFL | Male | 0 | 29 | 29 | 1.93 | 1.93 |
| | | 1 | 214 | 243 | 14.28 | 16.21 |
| | | NA | 34 | 277 | 2.27 | 18.48 |
| | Female | 0 | 12 | 289 | 0.80 | 19.28 |
| | | 1 | 118 | 407 | 7.87 | 27.15 |
| | | NA | 21 | 428 | 1.40 | 28.55 |
| F: FOLFOX | Male | 0 | 31 | 459 | 2.07 | 30.62 |
| | | 1 | 285 | 744 | 19.01 | 49.63 |
| | | NA | 95 | 839 | 6.34 | 55.97 |
| | Female | 0 | 21 | 860 | 1.40 | 57.37 |
| | | 1 | 198 | 1058 | 13.21 | 70.58 |
| | | NA | 61 | 1119 | 4.07 | 74.65 |
| G: IROX | Male | 0 | 17 | 1136 | 1.13 | 75.78 |
| | | 1 | 187 | 1323 | 12.47 | 88.26 |
| | | NA | 24 | 1347 | 1.60 | 89.86 |
| | Female | 0 | 14 | 1361 | 0.93 | 90.79 |
| | | 1 | 121 | 1482 | 8.07 | 98.87 |
| | | NA | 17 | 1499 | 1.13 | 100.00 |
summary(tableby(arm ~ sex + age, data = mockstudy),
        labelTranslations = c(sex = "SEX", age = "Age, yrs"))
| | A: IFL (N=428) | F: FOLFOX (N=691) | G: IROX (N=380) | Total (N=1499) | p value |
|---|---|---|---|---|---|
| SEX | | | | | 0.190 |
| Male | 277 (64.7%) | 411 (59.5%) | 228 (60.0%) | 916 (61.1%) | |
| Female | 151 (35.3%) | 280 (40.5%) | 152 (40.0%) | 583 (38.9%) | |
| Age, yrs | | | | | 0.614 |
| Mean (SD) | 59.673 (11.365) | 60.301 (11.632) | 59.763 (11.499) | 59.985 (11.519) | |
| Range | 27.000 - 88.000 | 19.000 - 88.000 | 26.000 - 85.000 | 19.000 - 88.000 | |
summary(modelsum(bmi ~ age, adjust = ~sex, data = mockstudy),
        labelTranslations = list(sexFemale = "Female", age = "Age, yrs"))
| | estimate | std.error | p.value | adj.r.squared | Nmiss |
|---|---|---|---|---|---|
| (Intercept) | 26.793 | 0.766 | < 0.001 | 0.004 | 33 |
| Age, yrs | 0.012 | 0.012 | 0.348 | | |
| Female | -0.718 | 0.291 | 0.014 | | |
Another option is to add labels after you have created the object. To do this, you can use the form labels(x) <- value or the pipe-able version, set_labels().
# the non-pipe version; somewhat clunky
tmp <- freqlist(tab.ex)
labels(tmp) <- c(arm = "Treatment Arm", sex = "Gender", mdquality.s = "LASA QOL")
summary(tmp)
| Treatment Arm | Gender | LASA QOL | Freq | Cumulative Freq | Percent | Cumulative Percent |
|---|---|---|---|---|---|---|
| A: IFL | Male | 0 | 29 | 29 | 1.93 | 1.93 |
| | | 1 | 214 | 243 | 14.28 | 16.21 |
| | | NA | 34 | 277 | 2.27 | 18.48 |
| | Female | 0 | 12 | 289 | 0.80 | 19.28 |
| | | 1 | 118 | 407 | 7.87 | 27.15 |
| | | NA | 21 | 428 | 1.40 | 28.55 |
| F: FOLFOX | Male | 0 | 31 | 459 | 2.07 | 30.62 |
| | | 1 | 285 | 744 | 19.01 | 49.63 |
| | | NA | 95 | 839 | 6.34 | 55.97 |
| | Female | 0 | 21 | 860 | 1.40 | 57.37 |
| | | 1 | 198 | 1058 | 13.21 | 70.58 |
| | | NA | 61 | 1119 | 4.07 | 74.65 |
| G: IROX | Male | 0 | 17 | 1136 | 1.13 | 75.78 |
| | | 1 | 187 | 1323 | 12.47 | 88.26 |
| | | NA | 24 | 1347 | 1.60 | 89.86 |
| | Female | 0 | 14 | 1361 | 0.93 | 90.79 |
| | | 1 | 121 | 1482 | 8.07 | 98.87 |
| | | NA | 17 | 1499 | 1.13 | 100.00 |
# piped--much cleaner
mockstudy %>%
  tableby(arm ~ sex + age, data = .) %>%
  set_labels(c(sex = "SEX", age = "Age, yrs")) %>%
  summary()
| | A: IFL (N=428) | F: FOLFOX (N=691) | G: IROX (N=380) | Total (N=1499) | p value |
|---|---|---|---|---|---|
| SEX | | | | | 0.190 |
| Male | 277 (64.7%) | 411 (59.5%) | 228 (60.0%) | 916 (61.1%) | |
| Female | 151 (35.3%) | 280 (40.5%) | 152 (40.0%) | 583 (38.9%) | |
| Age, yrs | | | | | 0.614 |
| Mean (SD) | 59.673 (11.365) | 60.301 (11.632) | 59.763 (11.499) | 59.985 (11.519) | |
| Range | 27.000 - 88.000 | 19.000 - 88.000 | 26.000 - 85.000 | 19.000 - 88.000 | |
mockstudy %>%
  modelsum(bmi ~ age, adjust = ~ sex, data = .) %>%
  set_labels(list(sexFemale = "Female", age = "Age, yrs")) %>%
  summary()
| | estimate | std.error | p.value | adj.r.squared | Nmiss |
|---|---|---|---|---|---|
| (Intercept) | 26.793 | 0.766 | < 0.001 | 0.004 | 33 |
| Age, yrs | 0.012 | 0.012 | 0.348 | | |
| Female | -0.718 | 0.291 | 0.014 | | |
tableby() and modelsum() also allow you to have label attributes on the columns of a data.frame. Note that by default these attributes usually get dropped upon subsetting, but tableby() and modelsum() use the keep.labels() function to retain them.
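The two class() results that follow are consistent with a comparison like the one sketched here. The exact calls are an assumption; the object name `mockstudy.lab` is borrowed from the later examples.

```r
library(arsenal)
data(mockstudy)

# keep.labels() marks each column so its label attribute survives subsetting
mockstudy.lab <- keep.labels(mockstudy)

class(mockstudy$age)      # the plain column
class(mockstudy.lab$age)  # the same column after keep.labels()
```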
[1] "integer"
[1] "keep_labels" "integer"
To undo this, simply use loosen.labels():
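A minimal sketch of such a call, assuming a `mockstudy.lab` object created with keep.labels() (the exact call behind the result below is an assumption):

```r
library(arsenal)
data(mockstudy)

mockstudy.lab <- keep.labels(mockstudy)

# loosen.labels() strips the "keep_labels" class from the columns again
class(loosen.labels(mockstudy.lab)$age)
```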
[1] "integer"
You can set attributes one at a time in two ways:
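One plausible pair of approaches is sketched here: setting the "label" attribute directly with base R's attr(), or using arsenal's labels() replacement form on a single column. That these are the two ways the text means is an assumption, as is the `mockstudy.lab` setup.

```r
library(arsenal)
data(mockstudy)

mockstudy.lab <- keep.labels(mockstudy)

# Way 1: set the "label" attribute directly with base R
attr(mockstudy.lab$sex, "label") <- "Sex"

# Way 2: use the labels() replacement form on a single column
labels(mockstudy.lab$age) <- "Age, yrs"
```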
…or all at once:
labels(mockstudy.lab) <- list(sex = "Sex", age = "Age, yrs")
summary(tableby(arm ~ sex + age, data = mockstudy.lab))
| | A: IFL (N=428) | F: FOLFOX (N=691) | G: IROX (N=380) | Total (N=1499) | p value |
|---|---|---|---|---|---|
| Sex | | | | | 0.190 |
| Male | 277 (64.7%) | 411 (59.5%) | 228 (60.0%) | 916 (61.1%) | |
| Female | 151 (35.3%) | 280 (40.5%) | 152 (40.0%) | 583 (38.9%) | |
| Age, yrs | | | | | 0.614 |
| Mean (SD) | 59.673 (11.365) | 60.301 (11.632) | 59.763 (11.499) | 59.985 (11.519) | |
| Range | 27.000 - 88.000 | 19.000 - 88.000 | 26.000 - 85.000 | 19.000 - 88.000 | |
You can pipe this, too.
mockstudy %>%
  set_labels(list(sex = "SEX", age = "Age, yrs")) %>%
  modelsum(bmi ~ age, adjust = ~ sex, data = .) %>%
  summary()
| | estimate | std.error | p.value | adj.r.squared | Nmiss |
|---|---|---|---|---|---|
| (Intercept) | 26.793 | 0.766 | < 0.001 | 0.004 | 33 |
| Age, yrs | 0.012 | 0.012 | 0.348 | | |
| SEX Female | -0.718 | 0.291 | 0.014 | | |
To extract labels from a data.frame, simply use the labels() function:
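A sketch of a call that would return a list like the one below, assuming `mockstudy.lab` carries the labels set earlier (the setup lines are an assumption, repeated so the snippet runs on its own):

```r
library(arsenal)
data(mockstudy)

mockstudy.lab <- keep.labels(mockstudy)
labels(mockstudy.lab) <- list(sex = "Sex", age = "Age, yrs")

# Returns a list with one element per column; unlabelled columns give NULL
labels(mockstudy.lab)
```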
## $case
## NULL
##
## $age
## [1] "Age, yrs"
##
## $arm
## [1] "Treatment Arm"
##
## $sex
## [1] "Sex"
##
## $race
## [1] "Race"
##
## $fu.time
## NULL
##
## $fu.stat
## NULL
##
## $ps
## NULL
##
## $hgb
## NULL
##
## $bmi
## [1] "Body Mass Index (kg/m^2)"
##
## $alk.phos
## NULL
##
## $ast
## NULL
##
## $mdquality.s
## NULL
##
## $age.ord
## NULL
tableby() and modelsum() both support the wrapping of long labels. Consider the width= argument in the print() function:
mockstudy %>%
  set_labels(list(age = "This is a really long label for the arm variable")) %>%
  tableby(sex ~ age, data = .) %>%
  summary() %>%
  print(width = 20)
| | Male (N=916) | Female (N=583) | Total (N=1499) | p value |
|---|---|---|---|---|
| This is a really | | | | 0.048 |
| long label for the | | | | |
| arm variable | | | | |
| Mean (SD) | 60.455 (11.369) | 59.247 (11.722) | 59.985 (11.519) | |
| Range | 19.000 - 88.000 | 22.000 - 88.000 | 19.000 - 88.000 | |