Package 'arsenal' reference manual

Title:	An Arsenal of 'R' Functions for Large-Scale Statistical Summaries
Description:	An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models. The primary functions include tableby(), a Table-1-like summary of multiple variable types 'by' the levels of one or more categorical variables; paired(), a Table-1-like summary of multiple variable types paired across two time points; modelsum(), which performs simple model fits on one or more endpoints for many variables (univariate or adjusted for covariates); freqlist(), a powerful frequency table across many categorical variables; comparedf(), a function for comparing data.frames; and write2(), a function to output tables to a document.
Authors:	Ethan Heinzen [aut, cre], Jason Sinnwell [aut], Elizabeth Atkinson [aut], Tina Gunderson [aut], Gregory Dougherty [aut], Patrick Votruba [ctb], Ryan Lennon [ctb], Andrew Hanson [ctb], Krista Goergen [ctb], Emily Lundt [ctb], Brendan Broderick [ctb], Maddie McCullough [art]
Maintainer:	Ethan Heinzen <[email protected]>
License:	GPL (>= 2)
Version:	3.6.4.0000
Built:	2025-03-25 06:06:49 UTC
Source:	https://github.com/mayoverse/arsenal

Not in

Description

The not-in operator for R.

Usage

x %nin% table
x %nin% table

Arguments

`x`	vector or `NULL`: the values to be matched.
`table`	vector or `NULL`: the values to be matched against.

Value

The negation of %in%.

Author(s)

Raymond Moore

Examples

1 %nin% 2:10
c("a", "b") %nin% c("a", "c", "d")
1 %nin% 2:10
c("a", "b") %nin% c("a", "c", "d")

An Arsenal of 'R' Functions for Large-Scale Statistical Summaries

Description

An Arsenal of 'R' functions for large-scale statistical summaries, which are streamlined to work within the latest reporting tools in 'R' and 'RStudio' and which use formulas and versatile summary statistics for summary tables and models.

Details

The package download, NEWS, and README are available on CRAN: https://cran.r-project.org/package=arsenal

Functions

Below are listed some of the most widely used functions available in arsenal:

tableby: Summary statistics of a set of independent variables by a categorical variable.

paired: Summary statistics of a set of independent variables paired across two timepoints.

modelsum: Fit models over each of a set of independent variables with a response variable.

freqlist: Approximate the output from SAS's PROC FREQ procedure when using the /list option of the TABLE statement.

comparedf: Compare two data.frames and report any differences between them, much like SAS's PROC COMPARE procedure.

write2word, write2html, write2pdf: Functions to output tables to a single Word, HTML, or PDF document.

write2: Functions to output tables to a single document. (Also the S3 backbone behind the write2* functions.)

keep.labels: Keep the 'label' attribute on an R object when subsetting.

formulize: A shortcut to generate one-, two-, or many-sided formulas.

mdy.Date and Date.mdy: Convert numeric dates for month, day, and year to Date object, and vice versa.

is.Date: Test if an object is a date.

%nin%: Test for "not in".

allNA and includeNA: some useful functions for dealing with NAs.

Data

mockstudy: Mock study data for examples.

Examples

library(arsenal)

library(arsenal)

`arsenal` tables with common structure

Description

arsenal tables with common structure

Usage

has_strata(x)

## S3 method for class 'arsenal_table'
x[i, j, ...]

## S3 method for class 'tableby'
x[i, j, ...]

## S3 method for class 'arsenal_table'
labels(object, ...)

## S3 replacement method for class 'arsenal_table'
labels(x) <- value

## S3 method for class 'arsenal_table'
print(x, ...)

## S3 method for class 'arsenal_table'
merge(x, y, all = FALSE, all.x = all, all.y = all, ...)

## S3 method for class 'freqlist'
merge(x, y, all = TRUE, ...)

## S3 method for class 'summary.arsenal_table'
print(
  x,
  ...,
  format = if (!is.null(x$text) && x$text %in% c("html", "latex")) x$text else
    "markdown",
  escape = x$text %nin% c("html", "latex"),
  width = NULL,
  min.split = NULL
)
has_strata(x)

## S3 method for class 'arsenal_table'
x[i, j, ...]

## S3 method for class 'tableby'
x[i, j, ...]

## S3 method for class 'arsenal_table'
labels(object, ...)

## S3 replacement method for class 'arsenal_table'
labels(x) <- value

## S3 method for class 'arsenal_table'
print(x, ...)

## S3 method for class 'arsenal_table'
merge(x, y, all = FALSE, all.x = all, all.y = all, ...)

## S3 method for class 'freqlist'
merge(x, y, all = TRUE, ...)

## S3 method for class 'summary.arsenal_table'
print(
  x,
  ...,
  format = if (!is.null(x$text) && x$text %in% c("html", "latex")) x$text else
    "markdown",
  escape = x$text %nin% c("html", "latex"),
  width = NULL,
  min.split = NULL
)

Arguments

`x`, `y`, `object`	An object of class `"arsenal_table"`
`i`, `j`	A vector to index `x` with: either names of variables, a numeric vector, or a logical vector of appropriate length. `i` indexes the x-variables, and `j` indexes the by-variables.
`...`	Other arguments (only used in `print.summary.arsenal_table`)
`value`	A list of new labels.
`all`, `all.x`, `all.y`	Logicals, denoting which terms to keep if not all are in common.
`format`	Passed to `kable`: the format for the table. The default here is "markdown". To use the default in `kable`, pass `NULL`. If `x$text` specifies LaTeX or HTML formatting, that format is used in the table.
`escape`	Passed to `kable`: should special characters be escaped when printed?
`width`, `min.split`	Passed to `smart.split` for formatting of the "term" column.

Defunct functions in `arsenal`

Description

Details about defunct functions in arsenal

Arguments

`x`, `y`	See `comparedf`.
`...`	Other arguments.

Details

comparison.control was renamed to comparedf.control in version 3.0.0.

compare.data.frame was renamed to comparedf in version 3.0.0.

length.tableby was removed in version 2.0.0.

includeNA.character and includeNA.numeric were removed in version 2.0.0 and replaced with a default method.

rangeTime was removed in version 1.5.0.

Deprecated functions in `arsenal`

Description

Details about deprecated functions in arsenal

as.data.frame.freqlist

Description

Convert freqlist object to a data.frame.

Usage

## S3 method for class 'freqlist'
as.data.frame(x, ..., labelTranslations = NULL, list.ok = FALSE)
## S3 method for class 'freqlist'
as.data.frame(x, ..., labelTranslations = NULL, list.ok = FALSE)

Arguments

`x`	An object of class `"freqlist"`.
`...`	Arguments to pass to `freq.control`
`labelTranslations`	A named list (or vector) where the name is the label in the output to be replaced in the pretty rendering by the character string value for the named element of the list, e.g., `list(age = "Age(Years)", meansd = "Mean(SD)")`.
`list.ok`	If the object has multiple by-variables, is it okay to return a list of data.frames instead of a single data.frame? If `FALSE` but there are multiple by-variables, a warning is issued.

Value

A data.frame corresponding to the freqlist object.

as.data.frame.modelsum

Description

Coerce a modelsum object to a data.frame.

Usage

## S3 method for class 'modelsum'
as.data.frame(x, ..., labelTranslations = NULL, list.ok = FALSE)
## S3 method for class 'modelsum'
as.data.frame(x, ..., labelTranslations = NULL, list.ok = FALSE)

Arguments

`x`	A `modelsum` object.
`...`	Arguments to pass to `modelsum.control`.
`labelTranslations`	A named list (or vector) where the name is the label in the output to be replaced in the pretty rendering by the character string value for the named element of the list, e.g., `list(age = "Age(Years)", meansd = "Mean(SD)")`.
`list.ok`	If the object has multiple by-variables, is it okay to return a list of data.frames instead of a single data.frame? If `FALSE` but there are multiple by-variables, a warning is issued.

Value

A data.frame.

Author(s)

Ethan Heinzen, based on code originally by Greg Dougherty

as.data.frame.tableby

Description

Coerce a tableby object to a data.frame.

Usage

## S3 method for class 'tableby'
as.data.frame(x, ..., labelTranslations = NULL, list.ok = FALSE)
## S3 method for class 'tableby'
as.data.frame(x, ..., labelTranslations = NULL, list.ok = FALSE)

Arguments

`x`	A `tableby` object.
`...`	Arguments to pass to `tableby.control`.
`labelTranslations`	A named list (or vector) where the name is the label in the output to be replaced in the pretty rendering by the character string value for the named element of the list, e.g., `list(age = "Age(Years)", meansd = "Mean(SD)")`.
`list.ok`	If the object has multiple by-variables, is it okay to return a list of data.frames instead of a single data.frame? If `FALSE` but there are multiple by-variables, a warning is issued.

Value

A data.frame.

Author(s)

Ethan Heinzen, based on code originally by Greg Dougherty

Compare two data.frames and report differences

Description

Compare two data.frames and report any differences between them, much like SAS's PROC COMPARE procedure.

Usage

comparedf(x, y, by = NULL, by.x = by, by.y = by, control = NULL, ...)

## S3 method for class 'comparedf'
print(x, ...)
comparedf(x, y, by = NULL, by.x = by, by.y = by, control = NULL, ...)

## S3 method for class 'comparedf'
print(x, ...)

Arguments

`x`, `y`	A data.frame to compare
`by`, `by.x`, `by.y`	Which variables are IDs to merge the two data.frames? If set to `"row.names"`, merging will occur over the row.names. If set to `NULL` (default), merging will occur row-by-row.
`control`	A list of control parameters from `comparedf.control`.
`...`	Other arguments, passed to `comparedf.control` when appropriate.

Author(s)

Ethan Heinzen, adapted from code from Andrew Hanson

Examples


df1 <- data.frame(id = paste0("person", 1:3), a = c("a", "b", "c"),
                  b = c(1, 3, 4), c = c("f", "e", "d"),
                  row.names = paste0("rn", 1:3), stringsAsFactors = FALSE)
df2 <- data.frame(id = paste0("person", 3:1), a = c("c", "b", "a"),
                  b = c(1, 3, 4), d = paste0("rn", 1:3),
                  row.names = paste0("rn", c(1,3,2)), stringsAsFactors = FALSE)
summary(comparedf(df1, df2))
summary(comparedf(df1, df2, by = "id"))
summary(comparedf(df1, df2, by = "row.names"))
df1 <- data.frame(id = paste0("person", 1:3), a = c("a", "b", "c"),
                  b = c(1, 3, 4), c = c("f", "e", "d"),
                  row.names = paste0("rn", 1:3), stringsAsFactors = FALSE)
df2 <- data.frame(id = paste0("person", 3:1), a = c("c", "b", "a"),
                  b = c(1, 3, 4), d = paste0("rn", 1:3),
                  row.names = paste0("rn", c(1,3,2)), stringsAsFactors = FALSE)
summary(comparedf(df1, df2))
summary(comparedf(df1, df2, by = "id"))
summary(comparedf(df1, df2, by = "row.names"))

Control settings for `comparedf` function

Description

Control tolerance definitions for the comparedf function.

Usage

comparedf.control(
  tol.logical = "none",
  tol.num = c("absolute", "percent", "pct"),
  tol.num.val = sqrt(.Machine$double.eps),
  int.as.num = FALSE,
  tol.char = c("none", "trim", "case", "both"),
  tol.factor = c("none", "levels", "labels"),
  factor.as.char = FALSE,
  tol.date = "absolute",
  tol.date.val = 0,
  tol.other = "none",
  tol.vars = "none",
  max.print.vars = NA,
  max.print.obs = NA,
  max.print.diffs.per.var = 10,
  max.print.diffs = 50,
  max.print.attrs = NA,
  ...,
  max.print.diff = 10
)
comparedf.control(
  tol.logical = "none",
  tol.num = c("absolute", "percent", "pct"),
  tol.num.val = sqrt(.Machine$double.eps),
  int.as.num = FALSE,
  tol.char = c("none", "trim", "case", "both"),
  tol.factor = c("none", "levels", "labels"),
  factor.as.char = FALSE,
  tol.date = "absolute",
  tol.date.val = 0,
  tol.other = "none",
  tol.vars = "none",
  max.print.vars = NA,
  max.print.obs = NA,
  max.print.diffs.per.var = 10,
  max.print.diffs = 50,
  max.print.attrs = NA,
  ...,
  max.print.diff = 10
)

Arguments

`tol.logical`, `tol.num`, `tol.char`, `tol.factor`, `tol.date`, `tol.other`	A function or one of the shortcut character strings or a list thereof, denoting the tolerance function to use for a given data type. See "details", below.
`tol.num.val`	Numeric; maximum value of differences allowed in numerics (fed to the function given in `tol.num`).
`int.as.num`	Logical; should integers be coerced to numeric before comparison? Default FALSE.
`factor.as.char`	Logical; should factors be coerced to character before comparison? Default FALSE.
`tol.date.val`	Numeric; maximum value of differences allowed in dates (fed to the function given in `tol.date`).
`tol.vars`	Either `"none"` (the default), denoting that variable names are to be matched as-is, a named vector manually specifying variable names to compare (where the names correspond to columns of `x` and the values correspond to columns of `y`), or a character vector denoting equivalence classes for characters in the variable names. See "details", below.
`max.print.vars`	Integer denoting maximum number of variables to report in the "variables not shared" and "variables not compared" output. `NA` will print all differences.
`max.print.obs`	Integer denoting maximum number of not-shared observations to report. `NA` will print all differences.
`max.print.diffs.per.var`, `max.print.diffs`	Integers denoting the maximum number of differences to report for each variable or overall. `NA` will print all differences for each variable or overall.
`max.print.attrs`	Integers denoting the maximum number of non-identical attributes to report.`NA` will print all differences.
`...`	Other arguments (not in use at this time).
`max.print.diff`	Deprecated.

Details

The following character strings are accepted:

tol.logical = "none": compare logicals exactly as they are.
tol.num = "absolute": compare absolute differences in numerics.
tol.num = "percent", tol.num = "pct" compare percent differences in numerics.
tol.char = "none": compare character strings exactly as they are.
tol.char = "trim": left-justify and trim all trailing white space.
tol.char = "case": allow differences in upper/lower case.
tol.char = "both": combine "trim" and "case".
tol.factor = "none": match both character labels and numeric levels.
tol.factor = "levels": match only the numeric levels.
tol.factor = "labels": match only the labels.
tol.date = "absolute": compare absolute differences in dates.
tol.other = "none": expect objects of other classes to be exactly identical.

A list with names mapped to x can be used to specify tolerances by variable. One unnamed element is supported as the default.

tol.vars: If not set to "none" (the default) or a named vector, the tol.vars argument is a character vector denoting equivalence classes for the characters in the variable names. A single character in this vector means to replace that character with "". All other strings in this vector are split by character and replaced by the first character in the string.

E.g., a character vector c("._", "aA", " ") would denote that the dot and underscore are equivalent (to be translated to a dot), that "a" and "A" are equivalent (to be translated to "a"), and that spaces should be removed.

The special character string "case" in this vector is the same as specifying paste0(letters, LETTERS).

Value

A list containing the necessary parameters for the comparedf function.

Author(s)

Ethan Heinzen

Examples

cntl <- comparedf.control(
  tol.num = "pct",     # calculate percent differences
  tol.vars = c("case", # ignore case
               "._",   # set all underscores to dots.
               "e")    # remove all letter e's
)

cntl <- comparedf.control(tol.char = list(
  "none",      # the default
  x1 = "case", # be case-insensitive for the variable "x1"
  x2 = function(x, y) tol.NA(x, y, x != y | y == "NA") # a custom-defined tolerance
))
cntl <- comparedf.control(
  tol.num = "pct",     # calculate percent differences
  tol.vars = c("case", # ignore case
               "._",   # set all underscores to dots.
               "e")    # remove all letter e's
)

cntl <- comparedf.control(tol.char = list(
  "none",      # the default
  x1 = "case", # be case-insensitive for the variable "x1"
  x2 = function(x, y) tol.NA(x, y, x != y | y == "NA") # a custom-defined tolerance
))

`comparedf` tolerances

Description

Internal functions defining tolerances for the comparedf.control function. To create your own tolerance definitions, see the vignette.

Usage

tol.NA(x, y, idx)

tol.num.absolute(x, y, tol)

tol.num.percent(x, y, tol)

tol.num.pct(x, y, tol)

tol.factor.none(x, y)

tol.factor.levels(x, y)

tol.factor.labels(x, y)

tol.char.both(x, y)

tol.char.case(x, y)

tol.char.trim(x, y)

tol.char.none(x, y)

tol.date.absolute(x, y, tol)

tol.logical.none(x, y)

tol.other.none(x, y)
tol.NA(x, y, idx)

tol.num.absolute(x, y, tol)

tol.num.percent(x, y, tol)

tol.num.pct(x, y, tol)

tol.factor.none(x, y)

tol.factor.levels(x, y)

tol.factor.labels(x, y)

tol.char.both(x, y)

tol.char.case(x, y)

tol.char.trim(x, y)

tol.char.none(x, y)

tol.date.absolute(x, y, tol)

tol.logical.none(x, y)

tol.other.none(x, y)

Arguments

`x`, `y`	vectors of the appropriate lengths and types.
`idx`	A logical vector of appropriate length.
`tol`	A numeric tolerance

Details

tol.NA takes as differences between two vectors any elements which are NA in one but not the other, or which are non-NA in both and TRUE in idx. It is useful for handling NAs in custom tolerance functions.

Value

A logical vector of length equal to that of x and y, where TRUE denotes a difference between x and y, and FALSE denotes no difference between x and y.

Author(s)

Ethan Heinzen

Extract differences

Description

Extract differences (diffs()), number of differences (n.diffs()), or number of not-shared observations (n.diff.obs()) from a comparedf object.

Usage

n.diff.obs(object, ...)

## S3 method for class 'comparedf'
n.diff.obs(object, ...)

## S3 method for class 'summary.comparedf'
n.diff.obs(object, ...)

n.diffs(object, ...)

## S3 method for class 'comparedf'
n.diffs(object, ...)

## S3 method for class 'summary.comparedf'
n.diffs(object, ...)

diffs(object, ...)

## S3 method for class 'comparedf'
diffs(
  object,
  what = c("differences", "observations"),
  vars = NULL,
  ...,
  by.var = FALSE
)

## S3 method for class 'summary.comparedf'
diffs(
  object,
  what = c("differences", "observations"),
  vars = NULL,
  ...,
  by.var = FALSE
)
n.diff.obs(object, ...)

## S3 method for class 'comparedf'
n.diff.obs(object, ...)

## S3 method for class 'summary.comparedf'
n.diff.obs(object, ...)

n.diffs(object, ...)

## S3 method for class 'comparedf'
n.diffs(object, ...)

## S3 method for class 'summary.comparedf'
n.diffs(object, ...)

diffs(object, ...)

## S3 method for class 'comparedf'
diffs(
  object,
  what = c("differences", "observations"),
  vars = NULL,
  ...,
  by.var = FALSE
)

## S3 method for class 'summary.comparedf'
diffs(
  object,
  what = c("differences", "observations"),
  vars = NULL,
  ...,
  by.var = FALSE
)

Arguments

`object`	An object of class `comparedf` or `summary.comparedf`.
`...`	Other arguments (not in use at this time).
`what`	Should differences or the not-shared observations be returned?
`vars`	A character vector of variable names to subset the results to.
`by.var`	Logical: should the number of differences by variable be reported, or should all differences be reported (the default).

Author(s)

Ethan Heinzen

formulize

Description

A shortcut to generate one-, two-, or many-sided formulas from vectors of variable names.

Usage

formulize(
  y = "",
  x,
  ...,
  data = NULL,
  collapse = "+",
  collapse.y = collapse,
  escape = FALSE
)
formulize(
  y = "",
  x,
  ...,
  data = NULL,
  collapse = "+",
  collapse.y = collapse,
  escape = FALSE
)

Arguments

`y`, `x`, `...`	Character vectors, names, or calls to be collapsed (by `"+"`) and put left-to-right in the formula. If `data` is supplied, these can also be numeric, denoting which column name to use. See examples.
`data`	An R object with non-null column names.
`collapse`	How should terms be collapsed? Default is addition.
`collapse.y`	How should the y-terms be collapsed? Default is addition. Also accepts the special string "list", which combines them into a multiple-left-hand-side formula, for use in other functions.
`escape`	A logical indicating whether character vectors should be coerced to names (that is, whether names with spaces should be surrounded with backticks or not)

Author(s)

Ethan Heinzen

Examples

## two-sided formula
f1 <- formulize("y", c("x1", "x2", "x3"))

## one-sided formula
f2 <- formulize(x = c("x1", "x2", "x3"))

## multi-sided formula
f3 <- formulize("y", c("x1", "x2", "x3"), c("z1", "z2"), "w1")

## can use numerics for column names
data(mockstudy)
f4 <- formulize(y = 1, x = 2:4, data = mockstudy)

## mix and match
f5 <- formulize(1, c("x1", "x2", "x3"), data = mockstudy)

## get an interaction
f6 <- formulize("y", c("x1*x2", "x3"))

## get only interactions
f7 <- formulize("y", c("x1", "x2", "x3"), collapse = "*")

## no intercept
f8 <- formulize("y", "x1 - 1")
f9 <- formulize("y", c("x1", "x2", "-1"))

## LHS as a list to use in arsenal functions
f10 <- formulize(c("y1", "y2", "y3"), c("x", "z"), collapse.y = "list")

## use in an lm
f11 <- formulize(2, 3:4, data = mockstudy)
summary(lm(f11, data = mockstudy))

## using non-syntactic names or calls (like reformulate example)
f12 <- formulize(as.name("+-"), c("`P/E`", "`% Growth`"))
f12 <- formulize("+-", c("P/E", "% Growth"), escape = TRUE)

f <- Surv(ft, case) ~ a + b
f13 <- formulize(f[[2]], f[[3]])

## two-sided formula
f1 <- formulize("y", c("x1", "x2", "x3"))

## one-sided formula
f2 <- formulize(x = c("x1", "x2", "x3"))

## multi-sided formula
f3 <- formulize("y", c("x1", "x2", "x3"), c("z1", "z2"), "w1")

## can use numerics for column names
data(mockstudy)
f4 <- formulize(y = 1, x = 2:4, data = mockstudy)

## mix and match
f5 <- formulize(1, c("x1", "x2", "x3"), data = mockstudy)

## get an interaction
f6 <- formulize("y", c("x1*x2", "x3"))

## get only interactions
f7 <- formulize("y", c("x1", "x2", "x3"), collapse = "*")

## no intercept
f8 <- formulize("y", "x1 - 1")
f9 <- formulize("y", c("x1", "x2", "-1"))

## LHS as a list to use in arsenal functions
f10 <- formulize(c("y1", "y2", "y3"), c("x", "z"), collapse.y = "list")

## use in an lm
f11 <- formulize(2, 3:4, data = mockstudy)
summary(lm(f11, data = mockstudy))

## using non-syntactic names or calls (like reformulate example)
f12 <- formulize(as.name("+-"), c("`P/E`", "`% Growth`"))
f12 <- formulize("+-", c("P/E", "% Growth"), escape = TRUE)

f <- Surv(ft, case) ~ a + b
f13 <- formulize(f[[2]], f[[3]])

Control settings for `freqlist` function

Description

Control test and summary settings for the freqlist function.

Usage

freq.control(
  sparse = FALSE,
  single = FALSE,
  dupLabels = FALSE,
  digits.count = 0L,
  digits.pct = 2L,
  ...,
  digits = NULL
)
freq.control(
  sparse = FALSE,
  single = FALSE,
  dupLabels = FALSE,
  digits.count = 0L,
  digits.pct = 2L,
  ...,
  digits = NULL
)

Arguments

`sparse`	a logical value indicating whether to keep rows with counts of zero. The default is `FALSE` (drop zero-count rows).
`single`	logical, indicating whether to collapse results created using a strata variable into a single table for printing
`dupLabels`	logical: should labels which are the same as the row above be printed? The default (`FALSE`) more closely approximates `PROC FREQ` output from SAS, where a label carried down from the row above is left blank.
`digits.count`	Number of decimal places for count values.
`digits.pct`	Number of decimal places for percents.
`...`	additional arguments.
`digits`	A deprecated argument

Value

A list with settings to be used within the freqlist function.

Author(s)

Ethan Heinzen

freqlist

Description

Approximate the output from SAS's PROC FREQ procedure when using the /list option of the TABLE statement.

Usage

freqlist(object, ...)

## S3 method for class 'table'
freqlist(
  object,
  na.options = c("include", "showexclude", "remove"),
  strata = NULL,
  labelTranslations = NULL,
  control = NULL,
  ...
)

## S3 method for class 'formula'
freqlist(
  formula,
  data,
  subset,
  na.action,
  na.options = c("include", "showexclude", "remove"),
  strata = NULL,
  labelTranslations = NULL,
  control = NULL,
  addNA,
  exclude,
  drop.unused.levels,
  ...
)
freqlist(object, ...)

## S3 method for class 'table'
freqlist(
  object,
  na.options = c("include", "showexclude", "remove"),
  strata = NULL,
  labelTranslations = NULL,
  control = NULL,
  ...
)

## S3 method for class 'formula'
freqlist(
  formula,
  data,
  subset,
  na.action,
  na.options = c("include", "showexclude", "remove"),
  strata = NULL,
  labelTranslations = NULL,
  control = NULL,
  addNA,
  exclude,
  drop.unused.levels,
  ...
)

Arguments

`object`	An R object, usually of class `"table"` or class `"xtabs"`
`...`	additional arguments. In the formula method, these are passed to the table method. These are also passed to `freq.control`
`na.options`	a character string indicating how to handling missing values: `"include"` (include values with NAs in counts and percentages), `"showexclude"` (show NAs but exclude from cumulative counts and all percentages), `"remove"` (remove values with NAs); default is `"include"`.
`strata`	(formerly `groupBy`) an optional character string specifying a variable(s) to use for grouping when calculating cumulative counts and percentages. `summary.freqlist` will also separate by grouping variable for printing. Note that this is different from `modelsum` and `tableby`, which take bare column names (and only one, at that!)
`labelTranslations`	an optional character string (or list) of labels to use for variable levels when summarizing. Names will be matched appropriately.
`control`	control parameters to handle optional settings within `freqlist`. See `freq.control`
`formula`, `data`, `subset`, `na.action`, `addNA`, `exclude`, `drop.unused.levels`	Arguments passed to `xtabs`.

Value

An object of class c("freqlist", "arsenal_table")

Author(s)

Tina Gunderson, with revisions by Ethan Heinzen

Examples

# load mockstudy data
data(mockstudy)
tab.ex <- table(mockstudy[c("arm", "sex", "mdquality.s")], useNA = "ifany")
noby <- freqlist(tab.ex, na.options = "include")
summary(noby)

# show the top 6 rows' frequencies and percents
head(summary(sort(noby, decreasing = TRUE)[c(1:4, 6)]))

withby <- freqlist(tab.ex, strata = c("arm","sex"), na.options = "showexclude")
summary(withby)
# load mockstudy data
data(mockstudy)
tab.ex <- table(mockstudy[c("arm", "sex", "mdquality.s")], useNA = "ifany")
noby <- freqlist(tab.ex, na.options = "include")
summary(noby)

# show the top 6 rows' frequencies and percents
head(summary(sort(noby, decreasing = TRUE)[c(1:4, 6)]))

withby <- freqlist(tab.ex, strata = c("arm","sex"), na.options = "showexclude")
summary(withby)

Helper functions for freqlist

Description

A set of helper functions for freqlist.

Usage

is.freqlist(x)

is.summary.freqlist(x)

## S3 method for class 'summary.freqlist'
head(x, n = 6L, ...)

## S3 method for class 'summary.freqlist'
tail(x, n = 6L, ...)

## S3 method for class 'freqlist'
sort(x, decreasing = FALSE, ...)
is.freqlist(x)

is.summary.freqlist(x)

## S3 method for class 'summary.freqlist'
head(x, n = 6L, ...)

## S3 method for class 'summary.freqlist'
tail(x, n = 6L, ...)

## S3 method for class 'freqlist'
sort(x, decreasing = FALSE, ...)

Arguments

`x`	A `freqlist` object.
`n`	A single integer. See `head` or `tail` for more details
`...`	Other arguments.
`decreasing`	Should the sort be increasing or decreasing?

Details

Note that sort() has to recalculate cumulative statistics. Note also that the reordering of rows will also affect which labels are duplicates; you may also want to consider using dupLabels=TRUE in freq.control().

Internal Functions

Description

Internal Functions

Usage

smart.split(string, width = Inf, min.split = -Inf)

replace2(x, list, values)
smart.split(string, width = Inf, min.split = -Inf)

replace2(x, list, values)

Arguments

`string`	A character vector
`width`	Either `Inf` or `NULL` to specify no splitting, or a positive integer giving the largest allowed string length.
`min.split`	Either `-Inf` or `NULL` to specify no lower bound on the string length, or a positive integer giving the minimum string length.
`x`	vector
`list`	an index vector
`values`	replacement values

Value

For smart.split, a list of the same length as string, with each element being the "intelligently" split string.

For replace2, a vector with the proper values replaced.

Keep Labels

Description

Keep the 'label' attribute on an R object when subsetting. loosen.labels allows the 'label' attribute to be lost again.

Usage

keep.labels(x, ...)

## S3 method for class 'data.frame'
keep.labels(x, ...)

## Default S3 method:
keep.labels(x, ...)

## S3 method for class 'keep_labels'
x[...]

## S3 replacement method for class 'keep_labels'
x[i] <- value

loosen.labels(x, ...)

## S3 method for class 'data.frame'
loosen.labels(x, ...)

## Default S3 method:
loosen.labels(x, ...)
keep.labels(x, ...)

## S3 method for class 'data.frame'
keep.labels(x, ...)

## Default S3 method:
keep.labels(x, ...)

## S3 method for class 'keep_labels'
x[...]

## S3 replacement method for class 'keep_labels'
x[i] <- value

loosen.labels(x, ...)

## S3 method for class 'data.frame'
loosen.labels(x, ...)

## Default S3 method:
loosen.labels(x, ...)

Arguments

`x`	An R object
`...`	Other arguments (not in use at this time).
`i`, `value`	See `[<-`.

Value

A copy of x with a "keep_labels" class appended on or removed. Note that for the data.frame method, only classes on the columns are changed; the data.frame won't have an extra class appended. This is different from previous versions of arsenal.

Author(s)

Ethan Heinzen

Labels

Description

Assign and extract the 'label' attribute on an R object. set_labels is the same as labels(x) <- value but returns x for use in a pipe chain. set_attr is the same as attr(x, which) <- value but returns x for use in a pipe chain.

Usage

## S3 method for class 'data.frame'
labels(object, ...)

## S3 method for class 'keep_labels'
labels(object, ...)

labels(x) <- value

## S3 replacement method for class 'keep_labels'
labels(x) <- value

## Default S3 replacement method:
labels(x) <- value

## S3 replacement method for class 'data.frame'
labels(x) <- value

set_labels(x, value)

set_attr(x, which, value)
## S3 method for class 'data.frame'
labels(object, ...)

## S3 method for class 'keep_labels'
labels(object, ...)

labels(x) <- value

## S3 replacement method for class 'keep_labels'
labels(x) <- value

## Default S3 replacement method:
labels(x) <- value

## S3 replacement method for class 'data.frame'
labels(x) <- value

set_labels(x, value)

set_attr(x, which, value)

Arguments

`...`	Other arguments (not in use at this time).
`x`, `object`	An R object.
`value`	A vector or list containing labels to assign. Labels are assigned based on names, if available; otherwise, they're assigned in order. Can pass `NULL` to remove all labels.
`which`	See `attr<-`

Details

The data.frame methods put labels on and extract labels from the columns of object.

Value

The labels of object, or object with new labels.

Author(s)

Ethan Heinzen

Convert numeric dates to Date object, and vice versa

Description

Convert numeric dates for month, day, and year to Date object, and vice versa.

Usage

mdy.Date(month, day, year, yearcut = 120)

Date.mdy(date)

is.Date(x)
mdy.Date(month, day, year, yearcut = 120)

Date.mdy(date)

is.Date(x)

Arguments

`month`	integer, month (1-12).
`day`	integer, day of the month (1-31, depending on the month).
`year`	integer, either 2- or 4-digit year. If two-digit number, will add 1900 onto it, depending on range.
`yearcut`	cutoff for method to know if to convert to 4-digit year.
`date`	A date value.
`x`	An object.

Details

Test if an object is a date.

More work may need to be done with yearcut and 2-digit years. Best to give a full 4-digit year.

Value

mdy.Date returns a Date object, and Date.mdy returns a list with integer values for month, day, and year. is.Date returns a single logical value.

Examples

mdy.Date(9, 2, 2013)

tmp <- mdy.Date(9, 2, 2013)
Date.mdy(tmp)

is.Date(tmp)
mdy.Date(9, 2, 2013)

tmp <- mdy.Date(9, 2, 2013)
Date.mdy(tmp)

is.Date(tmp)

Mock study data for examples

Description

Mock clinical study data for examples to test data manipulation and statistical functions. The function muck_up_mockstudy() is used in examples for comparedf.

Usage

mockstudy

muck_up_mockstudy()
mockstudy

muck_up_mockstudy()

Format

A data frame with 1499 observations on the following 15 variables:

case: a numeric identifier-patient ID
age: age in years
arm: treatment arm divided into 3 groups, character string
sex: a factor with levels Male Female
race: self-reported race/ethnicity, character string
fu.time: survival or censoring time in years
fu.stat: censoring status; 1=censor, 2=death
ps: integer, ECOG performance score
hgb: numeric, hemoglobin count
bmi: numeric, body mass index, kg/m^2
alk.phos: numeric, alkaline phosphatase
ast: numeric, aspartate transaminase
mdquality.s: integer, LASA QOL 0=Clinically Deficient, 1=Not Clinically Deficient
age.ord: an ordered factor split of age, with levels 10-19 < 20-29 < 30-39 < 40-49 < 50-59 < 60-69 < 70-79 < 80-89

An object of class data.frame with 1499 rows and 14 columns.

Examples

data(mockstudy)
str(mockstudy)
data(mockstudy)
str(mockstudy)

Fit models over each of a set of independent variables with a response variable

Description

Fit and summarize models for each independent (x) variable with a response variable (y), with options to adjust by variables for each model.

Usage

modelsum(
  formula,
  family = "gaussian",
  data,
  adjust = NULL,
  na.action = NULL,
  subset = NULL,
  weights = NULL,
  id,
  strata,
  control = NULL,
  ...
)
modelsum(
  formula,
  family = "gaussian",
  data,
  adjust = NULL,
  na.action = NULL,
  subset = NULL,
  weights = NULL,
  id,
  strata,
  control = NULL,
  ...
)

Arguments

`formula`	an object of class `formula`; a symbolic description of the variables to be modeled. See "Details" for more information.
`family`	similar mechanism to `glm`, where the model to be fit is driven by the family. Options include: binomial, gaussian, survival, poisson, negbin, clog, and ordinal. These can be passed as a string, as a function, or as a list resulting from a call to one of the functions. See `modelsum.family` for details on survival, ordinal, negbin, and clog families.
`data`	an optional data.frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in `data`, the variables are taken from `environment(formula)`, typically the environment from which `modelsum` is called.
`adjust`	an object of class `formula` or a list of formulas, listing variables to adjust by in all models. Specify as a one-sided formula, like: `~Age+ Sex`. If a list, the names are used for the summary function. Unadjusted models can be specified as `~ 1` or as a list: `list(Unadjusted = NULL)`.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default (`NULL`) is to use the defaults of `lm`, `glm`, or `coxph`, depending on the `family` specifications.
`subset`	an optional vector specifying a subset of observations (rows of `data`) to be used in the results. If `strata` is missing, this works as vector of logicals or an index; otherwise, it should be a logical vector.
`weights`	an optional vector specifying the weights to apply to each data observation (rows of `data`)
`id`	A vector to identify clusters. Only used for `relrisk` at this time.
`strata`	a vector of strata to separate model summaries by an additional group. Note that for families like "clog", the "usual" strata term to indicate subject groupings should be given in the `adjust` argument.
`control`	control parameters to handle optional settings within `modelsum`. Arguments for `modelsum.control` can be passed to `modelsum` via the `...` argument, but if a control object and `...` arguments are both supplied, the latter are used. See `modelsum.control` for other details.
`...`	additional arguments to be passed to internal `modelsum` functions.

Value

An object with class c("modelsum", "arsenal_table")

Author(s)

Jason Sinnwell, Patrick Votruba, Beth Atkinson, Gregory Dougherty, and Ethan Heinzen, adapted from SAS Macro of the same name

Examples


data(mockstudy)

tab1 <- modelsum(bmi ~ sex + age, data = mockstudy)
summary(tab1, text = TRUE)

tab2 <- modelsum(alk.phos ~ arm + ps + hgb, adjust = ~ age + sex,
                 family = "gaussian", data = mockstudy)
summary(tab2, text = TRUE)

summary(tab2, show.intercept = FALSE, text = TRUE)

tab2.df <- as.data.frame(tab2)

tab2.df[1:5,]
data(mockstudy)

tab1 <- modelsum(bmi ~ sex + age, data = mockstudy)
summary(tab1, text = TRUE)

tab2 <- modelsum(alk.phos ~ arm + ps + hgb, adjust = ~ age + sex,
                 family = "gaussian", data = mockstudy)
summary(tab2, text = TRUE)

summary(tab2, show.intercept = FALSE, text = TRUE)

tab2.df <- as.data.frame(tab2)

tab2.df[1:5,]

Control settings for `modelsum` function

Description

Control test and summary settings for modelsum function.

Usage

modelsum.control(
  digits = 3L,
  digits.ratio = 3L,
  digits.p = 3L,
  format.p = TRUE,
  show.adjust = TRUE,
  show.intercept = TRUE,
  conf.level = 0.95,
  ordinal.stats = c("OR", "CI.lower.OR", "CI.upper.OR", "p.value", "Nmiss"),
  binomial.stats = c("OR", "CI.lower.OR", "CI.upper.OR", "p.value", "concordance",
    "Nmiss"),
  gaussian.stats = c("estimate", "std.error", "p.value", "adj.r.squared", "Nmiss"),
  poisson.stats = c("RR", "CI.lower.RR", "CI.upper.RR", "p.value", "Nmiss"),
  negbin.stats = c("RR", "CI.lower.RR", "CI.upper.RR", "p.value", "Nmiss"),
  relrisk.stats = c("RR", "CI.lower.RR", "CI.upper.RR", "p.value", "Nmiss"),
  clog.stats = c("OR", "CI.lower.OR", "CI.upper.OR", "p.value", "concordance", "Nmiss"),
  survival.stats = c("HR", "CI.lower.HR", "CI.upper.HR", "p.value", "concordance",
    "Nmiss"),
  stat.labels = list(),
  ...
)
modelsum.control(
  digits = 3L,
  digits.ratio = 3L,
  digits.p = 3L,
  format.p = TRUE,
  show.adjust = TRUE,
  show.intercept = TRUE,
  conf.level = 0.95,
  ordinal.stats = c("OR", "CI.lower.OR", "CI.upper.OR", "p.value", "Nmiss"),
  binomial.stats = c("OR", "CI.lower.OR", "CI.upper.OR", "p.value", "concordance",
    "Nmiss"),
  gaussian.stats = c("estimate", "std.error", "p.value", "adj.r.squared", "Nmiss"),
  poisson.stats = c("RR", "CI.lower.RR", "CI.upper.RR", "p.value", "Nmiss"),
  negbin.stats = c("RR", "CI.lower.RR", "CI.upper.RR", "p.value", "Nmiss"),
  relrisk.stats = c("RR", "CI.lower.RR", "CI.upper.RR", "p.value", "Nmiss"),
  clog.stats = c("OR", "CI.lower.OR", "CI.upper.OR", "p.value", "concordance", "Nmiss"),
  survival.stats = c("HR", "CI.lower.HR", "CI.upper.HR", "p.value", "concordance",
    "Nmiss"),
  stat.labels = list(),
  ...
)

Arguments

`digits`	Numeric, denoting the number of digits after the decimal point for beta coefficients and standard errors.
`digits.ratio`	Numeric, denoting the number of digits after the decimal point for ratios, e.g. OR, RR, HR.
`digits.p`	Numeric, denoting the number of digits for p-values. See "Details", below.
`format.p`	Logical, denoting whether to format p-values. See "Details", below.
`show.adjust`	Logical, denoting whether to show adjustment terms.
`show.intercept`	Logical, denoting whether to show intercept terms.
`conf.level`	Numeric, giving the confidence level.
`ordinal.stats`, `binomial.stats`, `survival.stats`, `gaussian.stats`, `poisson.stats`, `negbin.stats`, `clog.stats`, `relrisk.stats`	Character vectors denoting which stats to show for the various model types.
`stat.labels`	A named list of labels for all the stats used above.
`...`	Other arguments (not in use at this time).

Details

If format.p is FALSE, digits.p denotes the number of significant digits shown. The p-values will be in exponential notation if necessary. If format.p is TRUE, digits.p will determine the number of digits after the decimal point to show. If the p-value is less than the resulting number of places, it will be formatted to show so.

Value

A list with settings to be used within the modelsum function.

Family functions for modelsum

Description

A set of family functions for modelsum.

Usage

survival()

ordinal(method = c("logistic", "probit", "loglog", "cloglog", "cauchit"))

negbin(link = c("log", "identity", "sqrt"))

clog()

relrisk(link = "log")
survival()

ordinal(method = c("logistic", "probit", "loglog", "cloglog", "cauchit"))

negbin(link = c("log", "identity", "sqrt"))

clog()

relrisk(link = "log")

Arguments

`method`	See `MASS::polr`.
`link`	See `MASS::glm.nb`.

Value

A list, in particular with element family.

Helper functions for modelsum

Description

A set of helper functions for modelsum.

Usage

is.modelsum(x)

is.summary.modelsum(x)

na.modelsum(object, ...)
is.modelsum(x)

is.summary.modelsum(x)

na.modelsum(object, ...)

Arguments

`x`	A `modelsum` object.
`object`	A `data.frame` resulting from evaluating a `modelsum` formula.
`...`	Other arguments, or a vector of indices for extracting.

Value

na.modelsum returns a subsetted version of object (with attributes).

Some functions to handle NAs

Description

allNA tests if all elements are NA, and includeNA sets the NAs in a character vector or factor to an explicit label.

Usage

allNA(x)

includeNA(x, label, ...)

## S3 method for class 'factor'
includeNA(x, label = "(Missing)", first = FALSE, ...)

## Default S3 method:
includeNA(x, label = "(Missing)", ...)
allNA(x)

includeNA(x, label, ...)

## S3 method for class 'factor'
includeNA(x, label = "(Missing)", first = FALSE, ...)

## Default S3 method:
includeNA(x, label = "(Missing)", ...)

Arguments

`x`	An object
`label`	A character string denoting the label to set `NA`s to.
`...`	Other arguments (not in use at this time).
`first`	Logical; should the new label be the first level?

Author(s)

Ethan Heinzen

Adjust P-values for Multiple Comparisons

Description

Adjust P-values for Multiple Comparisons

Usage

padjust(p, method, n, ...)

## Default S3 method:
padjust(p, method, n, ...)

## S3 method for class 'tableby'
padjust(p, method, n, suffix = " (adjusted for multiple comparisons)", ...)

## S3 method for class 'summary.tableby'
padjust(p, method, n, suffix = " (adjusted for multiple comparisons)", ...)
padjust(p, method, n, ...)

## Default S3 method:
padjust(p, method, n, ...)

## S3 method for class 'tableby'
padjust(p, method, n, suffix = " (adjusted for multiple comparisons)", ...)

## S3 method for class 'summary.tableby'
padjust(p, method, n, suffix = " (adjusted for multiple comparisons)", ...)

Arguments

`p`	An object.
`method`	correction method. Can be abbreviated.
`n`	number of comparisons, must be at least `length(p)`; only set this (to non-default) when you know what you are doing!
`...`	Other arguments.
`suffix`	A suffix to add to the footnotes indicating that the tests were adjusted.

Summary Statistics of a Set of Independent Variables Paired Across Two Timepoints

Description

Summarize one or more variables (x) by a paired time variable (y). Variables on the right side of the formula, i.e. independent variables, are summarized by the two time points on the left of the formula. Optionally, an appropriate test is performed to test the distribution of the independent variables across the time points.

Usage

paired(
  formula,
  data,
  id,
  na.action,
  subset = NULL,
  strata,
  control = NULL,
  ...
)
paired(
  formula,
  data,
  id,
  na.action,
  subset = NULL,
  strata,
  control = NULL,
  ...
)

Arguments

`formula`	an object of class `formula` of the form `time ~ var1 + ...`. See "Details" for more information.
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which the function is called.
`id`	The vector giving IDs to match up data for the same subject across two timepoints.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is `na.paired("in.both")`. See `na.paired` for more details
`subset`	an optional vector specifying a subset of observations (rows of data) to be used in the results. Works as vector of logicals or an index.
`strata`	a vector of strata to separate summaries by an additional group.
`control`	control parameters to handle optional settings within `paired`. Two aspects of `paired` are controlled with these: test options of RHS variables and x variable summaries. Arguments for `paired.control` can be passed to `paired` via the `...` argument, but if a control object and `...` arguments are both supplied, the latter are used. See `paired.control` for more details.
`...`	additional arguments to be passed to internal `paired` functions or `paired.control`.

Details

Do note that this function piggybacks off of tableby quite heavily, so there is no summary.paired function (for instance).

These tests are accepted:

paired.t: a paired t-test.
mcnemar: McNemar's test.
signed.rank: a signed rank test.
sign.test: a sign test.
notest: no test is performed.

Value

An object with class c("paired", "tableby", "arsenal_table")

Author(s)

Jason Sinnwell, Beth Atkinson, Ryan Lennon, and Ethan Heinzen

Control settings for `paired` function

Description

Control test and summary settings for the paired function.

Usage

paired.control(
  diff = TRUE,
  numeric.test = "paired.t",
  cat.test = "mcnemar",
  ordered.test = "signed.rank",
  date.test = "paired.t",
  mcnemar.correct = TRUE,
  signed.rank.exact = NULL,
  signed.rank.correct = TRUE,
  ...
)
paired.control(
  diff = TRUE,
  numeric.test = "paired.t",
  cat.test = "mcnemar",
  ordered.test = "signed.rank",
  date.test = "paired.t",
  mcnemar.correct = TRUE,
  signed.rank.exact = NULL,
  signed.rank.correct = TRUE,
  ...
)

Arguments

`diff`	logical, telling `paired` whether to calculate a column of differences between time points.
`numeric.test`	name of test for numeric RHS variables in `paired`: paired.t, signed.rank, sign.test.
`cat.test`	name of test for categorical variables: mcnemar
`ordered.test`	name of test for ordered variables: signed.rank, sign.test
`date.test`	name of test to perform for date variables: paired.t, signed.rank, sign.test
`mcnemar.correct`, `signed.rank.exact`, `signed.rank.correct`	Options for statistical tests. See `wilcox.test` and `mcnemar.test` for details.
`...`	Arguments passed to `tableby.control`

Details

Note that (with the exception of total) all arguments to tableby.control are accepted in this function (in fact, this function passes everything through to tableby.control). However, there are different defaults for the statistical tests (shown here). For details on the other arguments, please see the help page for tableby.control.

Value

A list with settings to be used within the paired function.

Author(s)

Ethan Heinzen

Helper functions for paired

Description

A set of helper functions for paired.

Usage

na.paired(missings = c("in.both", "fill", "asis"))
na.paired(missings = c("in.both", "fill", "asis"))

Arguments

missings

A character string denoting which action to take. See "Details", below.

Details

All methods subset out any NA time points or IDs. "in.both" (the default) subsets the data.frame to individuals who appear at both time points. "fill" adds explicit missings for the people missing second time points. "asis" does nothing to add or remove missings.

Value

na.paired returns a function used to subset data.frames in paired.

Make a column for "select all" input

Description

Make a column for "select all" input

Usage

selectall(...)

as.selectall(x)

## S3 method for class 'selectall'
as.matrix(x, ...)

## S3 method for class 'selectall'
x[i, j, drop = FALSE]

## S3 method for class 'selectall'
is.na(x)

is.selectall(x)
selectall(...)

as.selectall(x)

## S3 method for class 'selectall'
as.matrix(x, ...)

## S3 method for class 'selectall'
x[i, j, drop = FALSE]

## S3 method for class 'selectall'
is.na(x)

is.selectall(x)

Arguments

`...`	Named arguments of the same length. These should be logical, numeric (0/1) or a factor with two levels.
`x`	An object of class "selectall"
`i`, `j`, `drop`	Arguments to '[.matrix'

Examples

d <- data.frame(grp = rep(c("A", "B"), each = 5))
d$s <- selectall(
  `Option 1` = c(rep(1, 4), rep(0, 6)),
  `Option 2` = c(0, 1, 0, 0, 0, 1, 1, 1, 0, 0),
  `Option 3` = 1,
  `Option 4` = 0
)
summary(tableby(grp ~ s, data = d), text = TRUE)
d <- data.frame(grp = rep(c("A", "B"), each = 5))
d$s <- selectall(
  `Option 1` = c(rep(1, 4), rep(0, 6)),
  `Option 2` = c(0, 1, 0, 0, 0, 1, 1, 1, 0, 0),
  `Option 3` = 1,
  `Option 4` = 0
)
summary(tableby(grp ~ s, data = d), text = TRUE)

The summary method for a `comparedf` object

Description

Print a more detailed output of the comparedf object.

Usage

## S3 method for class 'comparedf'
summary(object, ..., show.attrs = FALSE)

## S3 method for class 'summary.comparedf'
print(x, ..., format = "pandoc")
## S3 method for class 'comparedf'
summary(object, ..., show.attrs = FALSE)

## S3 method for class 'summary.comparedf'
print(x, ..., format = "pandoc")

Arguments

`object`	An object of class `"comparedf"`, as made by the `comparedf` S3 method.
`...`	Other arguments passed to `comparedf.control`. In `print`, these are passed to `kable`.
`show.attrs`	Logical, denoting whether to show the actual attributes which are different. For (e.g.) factors with lots of levels, this can make the tables quite wide, so this feature is `FALSE` by default.
`x`	An object returned by the `summary.comparedf` function.
`format`	Passed to `kable`: the format for the table. The default here is "pandoc". To use the default in `kable`, pass `NULL`.

Value

An object of class "summary.comparedf" is returned.

summary.freqlist

Description

Summarize the freqlist object.

Usage

## S3 method for class 'freqlist'
summary(object, ..., labelTranslations = NULL, title = NULL)

## S3 method for class 'summary.freqlist'
as.data.frame(x, ..., list.ok = FALSE)
## S3 method for class 'freqlist'
summary(object, ..., labelTranslations = NULL, title = NULL)

## S3 method for class 'summary.freqlist'
as.data.frame(x, ..., list.ok = FALSE)

Arguments

`object`	an object of class `freqlist`
`...`	For `summary.freqlist`, these are passed to `as.data.frame.freqlist` (and hence to `freq.control`). For the print method, these are additional arguments passed to the `kable` function.
`labelTranslations`	A named list (or vector) where the name is the label in the output to be replaced in the pretty rendering by the character string value for the named element of the list, e.g., `list(age = "Age(Years)", meansd = "Mean(SD)")`.
`title`	Title/caption for the table, defaulting to `NULL` (no title). Passed to `kable`. Can be length > 1 if the more than one table is being printed.
`x`	An object of class `summary.freqlist`.
`list.ok`	If the object has multiple by-variables, is it okay to return a list of data.frames instead of a single data.frame? If `FALSE` but there are multiple by-variables, a warning is issued.

Value

An object of class "summary.freqlist" (invisibly for the print method).

Author(s)

Tina Gunderson, with major revisions by Ethan Heinzen

Examples

# load mockstudy data
data(mockstudy)
tab.ex <- table(mockstudy[c("arm", "sex", "mdquality.s")], useNA = "ifany")
noby <- freqlist(tab.ex, na.options = "include")
summary(noby)
withby <- freqlist(tab.ex, strata = c("arm","sex"), na.options = "showexclude")
summary(withby)
summary(withby, dupLabels = TRUE)
# load mockstudy data
data(mockstudy)
tab.ex <- table(mockstudy[c("arm", "sex", "mdquality.s")], useNA = "ifany")
noby <- freqlist(tab.ex, na.options = "include")
summary(noby)
withby <- freqlist(tab.ex, strata = c("arm","sex"), na.options = "showexclude")
summary(withby)
summary(withby, dupLabels = TRUE)

Summarize a `modelsum` object.

Description

Format the information in object as a table using Pandoc coding or plain text, and cat it to stdout.

Usage

## S3 method for class 'modelsum'
summary(
  object,
  ...,
  labelTranslations = NULL,
  text = FALSE,
  title = NULL,
  term.name = "",
  adjustment.names = FALSE
)

## S3 method for class 'summary.modelsum'
as.data.frame(
  x,
  ...,
  text = x$text,
  term.name = x$term.name,
  adjustment.names = x$adjustment.names,
  width = NULL,
  min.split = NULL,
  list.ok = FALSE
)
## S3 method for class 'modelsum'
summary(
  object,
  ...,
  labelTranslations = NULL,
  text = FALSE,
  title = NULL,
  term.name = "",
  adjustment.names = FALSE
)

## S3 method for class 'summary.modelsum'
as.data.frame(
  x,
  ...,
  text = x$text,
  term.name = x$term.name,
  adjustment.names = x$adjustment.names,
  width = NULL,
  min.split = NULL,
  list.ok = FALSE
)

Arguments

`object`	A `modelsum` object.
`...`	For `summary.modelsum`, other arguments passed to `as.data.frame.modelsum`. For `as.data.frame.summary.modelsum`, "width" and "min.split" are passed to `smart.split`. For `print`ing the summary object, these are passed to both `as.data.frame.summary.modelsum` and `kable`.
`labelTranslations`	A named list (or vector) where the name is the label in the output to be replaced in the pretty rendering by the character string value for the named element of the list, e.g., `list(age = "Age(Years)", meansd = "Mean(SD)")`.
`text`	An argument denoting how to print the summary to the screen. Default is `FALSE` (show markdown output). `TRUE` and `NULL` output a text-only version, with the latter avoiding all formatting. `"html"` uses the HTML tag `<strong>` instead of the markdown formatting, and `"latex"` uses the LaTeX command `\textbf`.
`title`	Title/caption for the table, defaulting to `NULL` (no title). Passed to `kable`. Can be length > 1 if the more than one table is being printed.
`term.name`	A character vector denoting the column name for the "terms" column. It should be the same length as the number of tables or less (it will get recycled if needed). The special value `TRUE` will use the y-variable's label for each table.
`adjustment.names`	Logical, denoting whether the names of the adjustment models should be printed.
`x`	An object of class `"summary.modelsum"`.
`width`	Passed to `smart.split` for formatting of the "term" column.
`min.split`	Passed to `smart.split` for formatting of the "term" column.
`list.ok`	If the object has multiple by-variables, is it okay to return a list of data.frames instead of a single data.frame? If `FALSE` but there are multiple by-variables, a warning is issued.

Value

An object of class "summary.modelsum"

Author(s)

Ethan Heinzen, based on code originally by Greg Dougherty

The summary method for a `tableby` object

Description

The summary method for a tableby object, which is a pretty rendering of a tableby object into a publication-quality results table in R Markdown, and can render well in text-only.

Usage

## S3 method for class 'tableby'
summary(
  object,
  ...,
  labelTranslations = NULL,
  text = FALSE,
  title = NULL,
  pfootnote = FALSE,
  term.name = ""
)

## S3 method for class 'summary.tableby'
as.data.frame(
  x,
  ...,
  text = x$text,
  pfootnote = x$pfootnote,
  term.name = x$term.name,
  width = NULL,
  min.split = NULL,
  list.ok = FALSE
)
## S3 method for class 'tableby'
summary(
  object,
  ...,
  labelTranslations = NULL,
  text = FALSE,
  title = NULL,
  pfootnote = FALSE,
  term.name = ""
)

## S3 method for class 'summary.tableby'
as.data.frame(
  x,
  ...,
  text = x$text,
  pfootnote = x$pfootnote,
  term.name = x$term.name,
  width = NULL,
  min.split = NULL,
  list.ok = FALSE
)

Arguments

`object`	An object of class `"tableby"`, made by the `tableby` function.
`...`	For `summary.tableby`, other arguments passed to `as.data.frame.tableby`. For `print`ing the summary object, these are passed to both `as.data.frame.summary.tableby` and `kable`.
`labelTranslations`	A named list (or vector) where the name is the label in the output to be replaced in the pretty rendering by the character string value for the named element of the list, e.g., `list(age = "Age(Years)", meansd = "Mean(SD)")`.
`text`	An argument denoting how to print the summary to the screen. Default is `FALSE` (show markdown output). `TRUE` and `NULL` output a text-only version, with the latter avoiding all formatting. `"html"` uses the HTML tag `<strong>` instead of the markdown formatting, and `"latex"` uses the LaTeX command `\textbf`.
`title`	Title/caption for the table, defaulting to `NULL` (no title). Passed to `kable`. Can be length > 1 if the more than one table is being printed.
`pfootnote`	Logical, denoting whether to put footnotes describing the tests used to generate the p-values. Alternatively, "html" to surround the outputted footnotes with `<li>`.
`term.name`	A character vector denoting the column name for the "terms" column. It should be the same length as the number of tables or less (it will get recycled if needed). The special value `TRUE` will use the y-variable's label for each table.
`x`	An object of class `"summary.tableby"`.
`width`	Passed to `smart.split` for formatting of the "term" column.
`min.split`	Passed to `smart.split` for formatting of the "term" column.
`list.ok`	If the object has multiple by-variables, is it okay to return a list of data.frames instead of a single data.frame? If `FALSE` but there are multiple by-variables, a warning is issued.

Value

An object of class summary.tableby

Author(s)

Ethan Heinzen, based on code by Gregory Dougherty, Jason Sinnwell, Beth Atkinson, adapted from SAS Macros written by Paul Novotny and Ryan Lennon

Examples


set.seed(100)
## make 3+ categories for response
nsubj <- 90
mdat <- data.frame(Response=sample(c(1,2,3),nsubj, replace=TRUE),
                   Sex=sample(c("Male", "Female"), nsubj,replace=TRUE),
                   Age=round(rnorm(nsubj,mean=40, sd=5)),
                   HtIn=round(rnorm(nsubj,mean=65,sd=5)))

## allow default summaries on RHS variables
out <- tableby(Response ~ Sex + Age + HtIn, data=mdat)
summary(out, text=TRUE)
labels(out)
labels(out) <- c(Age="Age (years)", HtIn="Height (inches)")
summary(out, stats.labels=c(meansd="Mean-SD", q1q3 = "Q1-Q3"), text=TRUE)

set.seed(100)
## make 3+ categories for response
nsubj <- 90
mdat <- data.frame(Response=sample(c(1,2,3),nsubj, replace=TRUE),
                   Sex=sample(c("Male", "Female"), nsubj,replace=TRUE),
                   Age=round(rnorm(nsubj,mean=40, sd=5)),
                   HtIn=round(rnorm(nsubj,mean=65,sd=5)))

## allow default summaries on RHS variables
out <- tableby(Response ~ Sex + Age + HtIn, data=mdat)
summary(out, text=TRUE)
labels(out)
labels(out) <- c(Age="Age (years)", HtIn="Height (inches)")
summary(out, stats.labels=c(meansd="Mean-SD", q1q3 = "Q1-Q3"), text=TRUE)

Summary Statistics of a Set of Independent Variables by a Categorical Variable

Description

Summarize one or more variables (x) by a categorical variable (y). Variables on the right side of the formula, i.e. independent variables, are summarized by the levels of a categorical variable on the left of the formula. Optionally, an appropriate test is performed to test the distribution of the independent variables across the levels of the categorical variable.

Usage

tableby(
  formula,
  data,
  na.action,
  subset = NULL,
  weights = NULL,
  strata,
  control = NULL,
  ...
)
tableby(
  formula,
  data,
  na.action,
  subset = NULL,
  weights = NULL,
  strata,
  control = NULL,
  ...
)

Arguments

`formula`	an object of class `formula`; a symbolic description of the variables to be summarized by the group, or categorical variable, of interest. See "Details" for more information. To only view overall summary statistics, a one-sided formula can be used.
`data`	an optional data frame, list or environment (or object coercible by `as.data.frame` to a data frame) containing the variables in the model. If not found in data, the variables are taken from `environment(formula)`, typically the environment from which the function is called.
`na.action`	a function which indicates what should happen when the data contain `NA`s. The default is `na.tableby(TRUE)` if there is a by-variable, and `na.tableby(FALSE)` if there is not. This schema thus includes observations with `NA`s in x variables, but removes those with `NA` in the categorical group variable and strata (if used).
`subset`	an optional vector specifying a subset of observations (rows of data) to be used in the results. Works as vector of logicals or an index.
`weights`	a vector of weights. Using weights will disable statistical tests.
`strata`	a vector of strata to separate summaries by an additional group.
`control`	control parameters to handle optional settings within `tableby`. Two aspects of `tableby` are controlled with these: test options of RHS variables across levels of the categorical grouping variable, and x variable summaries within the grouping variable. Arguments for `tableby.control` can be passed to `tableby` via the `...` argument, but if a control object and `...` arguments are both supplied, the latter are used. See `tableby.control` for more details.
`...`	additional arguments to be passed to internal `tableby` functions or `tableby.control`.

Details

The group variable (if any) is categorical, which could be an integer, character, factor, or ordered factor. tableby makes a simple summary of the counts within the k-levels of the independent variables on the right side of the formula. Note that unused levels are dropped.

The data argument allows data.frames with label attributes for the columns, and those labels will be used in the summary methods for the tableby class.

The independent variables are a mixture of types: categorical (discrete), numeric (continuous), and time to event (survival). These variables are split by the levels of the group variable (if any), then summarized within those levels, specific to the variable type. A statistical test is performed to compare the distribution of the independent variables across the levels of the grouping variable.

The tests differ by the independent variable type, but can be specified explicitly in the formula statement or in the control function. These tests are accepted:

anova: analysis of variance test; the default test for continuous variables. When LHS variable has two levels, equivalent to two-sample t-test.
kwt: Kruskal-Wallis Rank Test, optional test for continuous variables. When LHS variable has two levels, equivalent to Wilcoxon test.
wt: An explicit Wilcoxon test.
medtest: A median test.
chisq: chi-square goodness of fit test for equal counts of a categorical variable across categories; the default for categorical or factor variables
fe: Fisher's exact test for categorical variables
trend: trend test for equal distribution of an ordered variable across a categorical variable; the default for ordered factor variables
logrank: log-rank, the default for time-to-event variables
notest: no test is performed.

To perform a mixture of asymptotic and rank-based tests on two different continuous variables, an example formula is: formula = group ~ anova(age) + kwt(height). The test settings in tableby.control apply to all independent variables of a given type.

The summary statistics reported for each independent variable within the group variable can be set in tableby.control.

Finally, multiple by-variables can be set using list(). See the examples for more details.

Value

An object with class c("tableby", "arsenal_table")

Author(s)

Jason Sinnwell, Beth Atkinson, Gregory Dougherty, and Ethan Heinzen, adapted from SAS Macros written by Paul Novotny and Ryan Lennon

Examples

data(mockstudy)
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
summary(tab1, text=TRUE)

mylabels <- list(sex = "SEX", age ="Age, yrs")
summary(tab1, labelTranslations = mylabels, text=TRUE)

tab3 <- tableby(arm ~ sex + age, data=mockstudy, test=FALSE, total=FALSE,
                numeric.stats=c("median","q1q3"), numeric.test="kwt")
summary(tab3, text=TRUE)

# multiple LHS
summary(tableby(list(arm, sex) ~ age, data = mockstudy, strata = ps), text = TRUE)

tab.test <- tableby(arm ~ kwt(age) + anova(bmi) + kwt(ast), data=mockstudy)
tests(tab.test)

data(mockstudy)
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
summary(tab1, text=TRUE)

mylabels <- list(sex = "SEX", age ="Age, yrs")
summary(tab1, labelTranslations = mylabels, text=TRUE)

tab3 <- tableby(arm ~ sex + age, data=mockstudy, test=FALSE, total=FALSE,
                numeric.stats=c("median","q1q3"), numeric.test="kwt")
summary(tab3, text=TRUE)

# multiple LHS
summary(tableby(list(arm, sex) ~ age, data = mockstudy, strata = ps), text = TRUE)

tab.test <- tableby(arm ~ kwt(age) + anova(bmi) + kwt(ast), data=mockstudy)
tests(tab.test)

Control settings for `tableby` function

Description

Control test and summary settings for the tableby function.

Usage

tableby.control(
  test = TRUE,
  total = TRUE,
  total.pos = c("after", "before"),
  test.pname = NULL,
  numeric.simplify = FALSE,
  cat.simplify = FALSE,
  cat.droplevels = FALSE,
  ordered.simplify = FALSE,
  date.simplify = FALSE,
  numeric.test = "anova",
  cat.test = "chisq",
  ordered.test = "trend",
  surv.test = "logrank",
  date.test = "kwt",
  selectall.test = "notest",
  test.always = FALSE,
  numeric.stats = c("Nmiss", "meansd", "range"),
  cat.stats = c("Nmiss", "countpct"),
  ordered.stats = c("Nmiss", "countpct"),
  surv.stats = c("Nmiss", "Nevents", "medSurv"),
  date.stats = c("Nmiss", "median", "range"),
  selectall.stats = c("Nmiss", "countpct"),
  stats.labels = list(),
  digits = 3L,
  digits.count = 0L,
  digits.pct = 1L,
  digits.p = 3L,
  format.p = TRUE,
  digits.n = 0L,
  conf.level = 0.95,
  wilcox.correct = FALSE,
  wilcox.exact = NULL,
  chisq.correct = FALSE,
  simulate.p.value = FALSE,
  B = 2000,
  times = 1:5,
  ...
)
tableby.control(
  test = TRUE,
  total = TRUE,
  total.pos = c("after", "before"),
  test.pname = NULL,
  numeric.simplify = FALSE,
  cat.simplify = FALSE,
  cat.droplevels = FALSE,
  ordered.simplify = FALSE,
  date.simplify = FALSE,
  numeric.test = "anova",
  cat.test = "chisq",
  ordered.test = "trend",
  surv.test = "logrank",
  date.test = "kwt",
  selectall.test = "notest",
  test.always = FALSE,
  numeric.stats = c("Nmiss", "meansd", "range"),
  cat.stats = c("Nmiss", "countpct"),
  ordered.stats = c("Nmiss", "countpct"),
  surv.stats = c("Nmiss", "Nevents", "medSurv"),
  date.stats = c("Nmiss", "median", "range"),
  selectall.stats = c("Nmiss", "countpct"),
  stats.labels = list(),
  digits = 3L,
  digits.count = 0L,
  digits.pct = 1L,
  digits.p = 3L,
  format.p = TRUE,
  digits.n = 0L,
  conf.level = 0.95,
  wilcox.correct = FALSE,
  wilcox.exact = NULL,
  chisq.correct = FALSE,
  simulate.p.value = FALSE,
  B = 2000,
  times = 1:5,
  ...
)

Arguments

`test`	logical, telling `tableby` whether to perform tests of x variables across levels of the group variable.
`total`	logical, telling `tableby` whether to calculate a column of totals across group variable.
`total.pos`	One of `"before"` or `"after"`, denoting where to put the total column relative to the by-variable columns.
`test.pname`	character string denoting the p-value column name in `summary.tableby`. Modifiable also with `modpval.tableby`.
`numeric.simplify`, `date.simplify`	logical, tell `tableby` whether to condense numeric/date output to a single line. NOTE: this only simplifies to one line if there is only one statistic reported, such as `meansd`. In particular, if `Nmiss` is specified and there are missings, then the output is not simplified.
`cat.simplify`, `ordered.simplify`	logical, tell `tableby` whether to remove the first level of the categorical/ordinal variable if binary. If `TRUE`, only the summary stats of the second level are reported (unless there's only one level, in which case it's reported). If `"label"`, the second level's label is additionally appended to the label. NOTE: this only simplifies to one line if there is only one statistic reported, such as `countpct`. In particular, if `Nmiss` is specified and there are missings, then the output is not simplified.
`cat.droplevels`	Should levels be dropped for categorical variables? If set to true, p-values will not be displayed unless `test.always = TRUE` as well.
`numeric.test`	name of test for numeric RHS variables in `tableby`: anova, kwt (Kruskal-Wallis), medtest (median test). If no LHS variable exists, then a mean is required for a univariate test.
`cat.test`	name of test for categorical variables: chisq, fe (Fisher's Exact)
`ordered.test`	name of test for ordered variables: trend
`surv.test`	name of test for survival variables: logrank
`date.test`	name of test for date variables: kwt
`selectall.test`	name of test for date variables: notest
`test.always`	Should the test be performed even if one or more by-group has 0 observations? Relevant for kwt and anova.
`numeric.stats`, `cat.stats`, `ordered.stats`, `surv.stats`, `date.stats`, `selectall.stats`	summary statistics to include for the respective class of RHS variables within the levels of the group LHS variable.
`stats.labels`	A named list of labels for all the statistics function names, where the function name is the named element in the list and the value that goes with it is a string containing the formal name that will be printed in all printed renderings of the output, e.g., `list(countpct="Count (Pct)")`. Any unnamed elements will be ignored. Passing `NULL` will disable labels.
`digits`	Number of decimal places for numeric values.
`digits.count`	Number of decimal places for count values.
`digits.pct`	Number of decimal places for percents.
`digits.p`	Number of decimal places for p-values.
`format.p`	Logical, denoting whether to format p-values, or character, a `glue` specification for how to format. See "Details", below.
`digits.n`	Number of decimal places for N's in the header. Set it to NA to suppress the N's.
`conf.level`	Numeric, denoting what confidence level to use for confidence intervals. (See, e.g., `binomCI`)
`wilcox.correct`, `wilcox.exact`	See `wilcox.test`
`chisq.correct`	logical, correction factor for chisq.test
`simulate.p.value`	logical, simulate p-value for categorical tests (fe and chisq)
`B`	number of simulations to perform for simulation-based p-value
`times`	A vector of times to use for survival summaries.
`...`	additional arguments.

Details

All tests can be turned off by setting test to FALSE. Otherwise, test are set to default settings in this list, or set explicitly in the formula of tableby.

Options for statistics are described more thoroughly in the vignette and are listed in tableby.stats

Value

A list with settings to be used within the tableby function.

Author(s)

Jason Sinnwell, Beth Atkinson, Ethan Heinzen, Terry Therneau, adapted from SAS Macros written by Paul Novotny and Ryan Lennon

Examples

set.seed(100)
## make 3+ categories for Response
mdat <- data.frame(Response=c(0,0,0,0,0,1,1,1,1,1),
                   Sex=sample(c("Male", "Female"), 10,replace=TRUE),
                   Age=round(rnorm(10,mean=40, sd=5)),
                   HtIn=round(rnorm(10,mean=65,sd=5)))

## allow default summaries in RHS variables, and pass control args to
## main function, to be picked up with ... when calling tableby.control
outResp <- tableby(Response ~ Sex + Age + HtIn, data=mdat, total=FALSE, test=TRUE)
outCtl <- tableby(Response ~ Sex + Age + HtIn, data=mdat,
                  control=tableby.control(total=TRUE, cat.simplify=TRUE,
                  cat.stats=c("Nmiss","countpct"),digits=1))
summary(outResp, text=TRUE)
summary(outCtl, text=TRUE)
set.seed(100)
## make 3+ categories for Response
mdat <- data.frame(Response=c(0,0,0,0,0,1,1,1,1,1),
                   Sex=sample(c("Male", "Female"), 10,replace=TRUE),
                   Age=round(rnorm(10,mean=40, sd=5)),
                   HtIn=round(rnorm(10,mean=65,sd=5)))

## allow default summaries in RHS variables, and pass control args to
## main function, to be picked up with ... when calling tableby.control
outResp <- tableby(Response ~ Sex + Age + HtIn, data=mdat, total=FALSE, test=TRUE)
outCtl <- tableby(Response ~ Sex + Age + HtIn, data=mdat,
                  control=tableby.control(total=TRUE, cat.simplify=TRUE,
                  cat.stats=c("Nmiss","countpct"),digits=1))
summary(outResp, text=TRUE)
summary(outCtl, text=TRUE)

Helper functions for tableby

Description

A set of helper functions for tableby.

Usage

is.tableby(x)

is.summary.tableby(x)

modpval.tableby(x, pdata, use.pname = FALSE)

tests(x)

## S3 method for class 'tableby'
tests(x)

na.tableby(lhs = TRUE)

## S3 method for class 'tableby'
xtfrm(x)

## S3 method for class 'tableby'
sort(x, ...)

## S3 method for class 'tableby'
Ops(e1, e2)

## S3 method for class 'tableby'
head(x, n = 6L, ...)

## S3 method for class 'tableby'
tail(x, n = 6L, ...)
is.tableby(x)

is.summary.tableby(x)

modpval.tableby(x, pdata, use.pname = FALSE)

tests(x)

## S3 method for class 'tableby'
tests(x)

na.tableby(lhs = TRUE)

## S3 method for class 'tableby'
xtfrm(x)

## S3 method for class 'tableby'
sort(x, ...)

## S3 method for class 'tableby'
Ops(e1, e2)

## S3 method for class 'tableby'
head(x, n = 6L, ...)

## S3 method for class 'tableby'
tail(x, n = 6L, ...)

Arguments

`x`	A `tableby` object.
`pdata`	A named data.frame where the first column is the by-variable names, the (optional) second is the strata value, the next is the x variable names, the next is p-values (or some test stat), and the (optional) next column is the method name.
`use.pname`	Logical, denoting whether the column name in `pdata` corresponding to the p-values should be used in the output of the object.
`lhs`	Logical, denoting whether to remove `NA`s from the first column of the data.frame (the "left-hand side")
`...`	Other arguments.
`e1`, `e2`	`tableby` objects, or numbers to compare them to.
`n`	A single integer. See `head` or `tail` for more details

Details

Logical comparisons are implemented for Ops.tableby.

Value

na.tableby returns a subsetted version of object (with attributes). Ops.tableby returns a logical vector. xtfrm.tableby returns the p-values (which are ordered by order to sort).

tableby Summary Statistics Functions

Description

A collection of functions that will report summary statistics. To create a custom function, consider using a function with all three arguments and .... See the tableby vignette for an example.

Usage

arsenal_sum(x, na.rm = TRUE, ...)

arsenal_min(x, na.rm = TRUE, ...)

arsenal_max(x, na.rm = TRUE, ...)

arsenal_mean(x, na.rm = TRUE, weights = NULL, ...)

arsenal_sd(x, na.rm = TRUE, weights = NULL, ...)

arsenal_var(x, na.rm = TRUE, weights = NULL, ...)

meansd(x, na.rm = TRUE, weights = NULL, ...)

meanse(x, na.rm = TRUE, weights = NULL, ...)

meanpmsd(x, na.rm = TRUE, weights = NULL, ...)

meanpmse(x, na.rm = TRUE, weights = NULL, ...)

meanCI(x, na.rm = TRUE, weights = NULL, conf.level = 0.95, ...)

medianrange(x, na.rm = TRUE, weights = NULL, ...)

medianmad(x, na.rm = TRUE, weights = NULL, ...)

arsenal_median(x, na.rm = TRUE, weights = NULL, ...)

arsenal_range(x, na.rm = TRUE, ...)

gmean(x, na.rm = TRUE, weights = NULL, ...)

gsd(x, na.rm = TRUE, weights = NULL, ...)

gmeansd(x, na.rm = TRUE, weights = NULL, ...)

gmeanCI(x, na.rm = TRUE, weights = NULL, conf.level = 0.95, ...)

Nsigntest(x, na.rm = TRUE, weights = NULL, ...)

Nevents(x, na.rm = TRUE, weights = NULL, ...)

medSurv(x, na.rm = TRUE, weights = NULL, ...)

NeventsSurv(x, na.rm = TRUE, weights = NULL, times = 1:5, ...)

NriskSurv(x, na.rm = TRUE, weights = NULL, times = 1:5, ...)

Nrisk(x, na.rm = TRUE, weights = NULL, times = 1:5, ...)

medTime(x, na.rm = TRUE, weights = NULL, ...)

q1q3(x, na.rm = TRUE, weights = NULL, ...)

medianq1q3(x, na.rm = TRUE, weights = NULL, ...)

iqr(x, na.rm = TRUE, weights = NULL, ...)

Nmiss(x, weights = NULL, ...)

Nmisspct(x, weights = NULL, ...)

Nmiss2(x, weights = NULL, ...)

Nmisspct2(x, weights = NULL, ...)

N(x, weights = NULL, ...)

Npct(x, weights = NULL, ...)

Nrowpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  weights = NULL,
  ...,
  totallab = "Total"
)

count(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

countpct(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

pct(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

countN(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

countrowpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  ...,
  totallab = "Total"
)

rowpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  ...,
  totallab = "Total"
)

countcellpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  ...,
  totallab = "Total"
)

binomCI(x, levels = NULL, na.rm = TRUE, weights = NULL, conf.level = 0.95, ...)

rowbinomCI(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  conf.level = 0.95,
  ...,
  totallab = "Total"
)
arsenal_sum(x, na.rm = TRUE, ...)

arsenal_min(x, na.rm = TRUE, ...)

arsenal_max(x, na.rm = TRUE, ...)

arsenal_mean(x, na.rm = TRUE, weights = NULL, ...)

arsenal_sd(x, na.rm = TRUE, weights = NULL, ...)

arsenal_var(x, na.rm = TRUE, weights = NULL, ...)

meansd(x, na.rm = TRUE, weights = NULL, ...)

meanse(x, na.rm = TRUE, weights = NULL, ...)

meanpmsd(x, na.rm = TRUE, weights = NULL, ...)

meanpmse(x, na.rm = TRUE, weights = NULL, ...)

meanCI(x, na.rm = TRUE, weights = NULL, conf.level = 0.95, ...)

medianrange(x, na.rm = TRUE, weights = NULL, ...)

medianmad(x, na.rm = TRUE, weights = NULL, ...)

arsenal_median(x, na.rm = TRUE, weights = NULL, ...)

arsenal_range(x, na.rm = TRUE, ...)

gmean(x, na.rm = TRUE, weights = NULL, ...)

gsd(x, na.rm = TRUE, weights = NULL, ...)

gmeansd(x, na.rm = TRUE, weights = NULL, ...)

gmeanCI(x, na.rm = TRUE, weights = NULL, conf.level = 0.95, ...)

Nsigntest(x, na.rm = TRUE, weights = NULL, ...)

Nevents(x, na.rm = TRUE, weights = NULL, ...)

medSurv(x, na.rm = TRUE, weights = NULL, ...)

NeventsSurv(x, na.rm = TRUE, weights = NULL, times = 1:5, ...)

NriskSurv(x, na.rm = TRUE, weights = NULL, times = 1:5, ...)

Nrisk(x, na.rm = TRUE, weights = NULL, times = 1:5, ...)

medTime(x, na.rm = TRUE, weights = NULL, ...)

q1q3(x, na.rm = TRUE, weights = NULL, ...)

medianq1q3(x, na.rm = TRUE, weights = NULL, ...)

iqr(x, na.rm = TRUE, weights = NULL, ...)

Nmiss(x, weights = NULL, ...)

Nmisspct(x, weights = NULL, ...)

Nmiss2(x, weights = NULL, ...)

Nmisspct2(x, weights = NULL, ...)

N(x, weights = NULL, ...)

Npct(x, weights = NULL, ...)

Nrowpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  weights = NULL,
  ...,
  totallab = "Total"
)

count(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

countpct(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

pct(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

countN(x, levels = NULL, na.rm = TRUE, weights = NULL, ...)

countrowpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  ...,
  totallab = "Total"
)

rowpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  ...,
  totallab = "Total"
)

countcellpct(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  ...,
  totallab = "Total"
)

binomCI(x, levels = NULL, na.rm = TRUE, weights = NULL, conf.level = 0.95, ...)

rowbinomCI(
  x,
  levels = NULL,
  by,
  by.levels = sort(unique(by)),
  na.rm = TRUE,
  weights = NULL,
  conf.level = 0.95,
  ...,
  totallab = "Total"
)

Arguments

`x`	Usually a vector.
`na.rm`	Should NAs be removed?
`...`	Other arguments.
`weights`	A vector of weights.
`conf.level`	Numeric, denoting what confidence level to use for confidence intervals.
`times`	A vector of times to use for survival summaries.
`levels`	A vector of levels that character `x`s should have.
`by`	a vector of the by-values.
`by.levels`	a vector of the levels of `by`.
`totallab`	What to call the total "column"

Details

Not all these functions are exported, in order to avoid conflicting NAMESPACES. Note also that the functions prefixed with "arsenal_" can be referred to by their short names (e.g., "min" for "arsenal_min").

Value

Usually a vector of the appropriate numbers.

Internal `tableby` functions

Description

A collection of functions that may help users create custom functions that are formatted correctly.

Usage

tbfmt(x, digits = NULL, digits.count = NULL, digits.pct = NULL, ...)

as.tbstat(
  x,
  oldClass = NULL,
  fmt = "{y}",
  which.count = 0L,
  which.pct = 0L,
  ...
)

as.tbstat_multirow(x)
tbfmt(x, digits = NULL, digits.count = NULL, digits.pct = NULL, ...)

as.tbstat(
  x,
  oldClass = NULL,
  fmt = "{y}",
  which.count = 0L,
  which.pct = 0L,
  ...
)

as.tbstat_multirow(x)

Arguments

`x`	Usually a vector.
`digits`, `digits.pct`, `digits.count`	Digits specifications
`...`	arguments to pass to `as.tbstat`.
`oldClass`	class(es) to add to the resulting object.
`fmt`	A `glue` string, where the object is exposed as the variable `x`, and a default-formatted version (using `tbfmt`) exposed as the variable `y`. `digits`, `digits.count`, and `digits.pct` are also exposed.
`which.count`	Which statistics are counts? The default is 0, indicating that none are.
`which.pct`	Which statistics are percents? The default is 0, indicating that none are.

Details

The vignette has an example on how to use these.

as.tbstat defines a tableby statistic with its appropriate formatting.

tbfmt applies some default formatting.

as.tbstat_multirow marks an object (usually a list) for multiple-row printing.

write2

Description

Functions to output tables to a single document. (Also the S3 backbone behind the write2* functions.)

Usage

write2(object, file, ..., output_format)

## S3 method for class 'arsenal_table'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'summary.arsenal_table'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'comparedf'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'summary.comparedf'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'verbatim'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'yaml'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'code.chunk'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'knitr_kable'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'xtable'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'character'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'list'
write2(
  object,
  file,
  ...,
  append. = FALSE,
  render. = TRUE,
  keep.rmd = !render.,
  output_format = NULL
)

## Default S3 method:
write2(
  object,
  file,
  FUN = NULL,
  ...,
  append. = FALSE,
  render. = TRUE,
  keep.rmd = !render.,
  output_format = NULL
)
write2(object, file, ..., output_format)

## S3 method for class 'arsenal_table'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'summary.arsenal_table'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'comparedf'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'summary.comparedf'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'verbatim'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'yaml'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'code.chunk'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'knitr_kable'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'xtable'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'character'
write2(object, file, ..., output_format = NULL)

## S3 method for class 'list'
write2(
  object,
  file,
  ...,
  append. = FALSE,
  render. = TRUE,
  keep.rmd = !render.,
  output_format = NULL
)

## Default S3 method:
write2(
  object,
  file,
  FUN = NULL,
  ...,
  append. = FALSE,
  render. = TRUE,
  keep.rmd = !render.,
  output_format = NULL
)

Arguments

`object`	An object.
`file`	A single character string denoting the filename for the output document.
`...`	Additional arguments to be passed to `FUN`, `rmarkdown::render`, etc. One popular option is to use `quiet = TRUE` to suppress the command line output.
`output_format`	One of the following: An output format object, e.g. `rmarkdown::html_document(...)`. A character string denoting such a format function, e.g. `"html_document"`. In this case, the `"..."` are NOT passed. The format function itself, e.g. `rmarkdown::html_document`. In this case, the `"..."` arguments are passed. One of `"html"`, `"pdf"`, and `"word"`, shortcuts implemented here. In this case, the `"..."` arguments are passed. `NULL`, in which the output is HTML by default. See `rmarkdown::render` for details.
`append.`	Logical, denoting whether (if a temporary `.Rmd` file of the same name already exists) to append on. Used mostly for `write2.list`.
`render.`	Logical, denoting whether to render the temporary `.Rmd` file. Used mostly for `write2.list`.
`keep.rmd`	Logical, denoting whether to keep the intermediate `.Rmd` file. Used mostly for `write2.list`.
`FUN`	The summary-like or print-like function to use to generate the markdown content. Can be passed as a function or a character string. It's expected that `FUN(object, ...)` looks "good" when put directly in a `.Rmd` file.

Details

write2 is an S3 method. The default prints the object (using print) inside a section surrounded by three back ticks. See verbatim for details.

There are methods implemented for tableby, modelsum, and freqlist, all of which use the summary function. There are also methods compatible with kable, xtable, and pander_return. Another option is to coerce an object using verbatim() to print out the results monospaced (as if they were in the terminal). To output multiple tables into a document, simply make a list of them and call the same function as before. Finally, to output code chunks to be evaluated, use code.chunk.

For more information, see vignette("write2").

Value

object is returned invisibly, and file is written.

Author(s)

Ethan Heinzen, adapted from code from Krista Goergen

Examples

## Not run: 
data(mockstudy)
# tableby example
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
write2(tab1, tempfile(fileext = ".rtf"),
  toc = TRUE, # passed to rmarkdown::rtf_document, though in this case it's not practical
  quiet = TRUE, # passed to rmarkdown::render
  title = "My cool new title", # passed to summary.tableby
  output_format = rmarkdown::rtf_document)

write2html(list(
  "# Header 1", # a header
  code.chunk(a <- 1, b <- 2, a + b), # a code chunk
  verbatim("hi there") # verbatim output
),
  tempfile(fileext = ".html"),
  quite = TRUE)

## End(Not run)
## Not run: 
data(mockstudy)
# tableby example
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
write2(tab1, tempfile(fileext = ".rtf"),
  toc = TRUE, # passed to rmarkdown::rtf_document, though in this case it's not practical
  quiet = TRUE, # passed to rmarkdown::render
  title = "My cool new title", # passed to summary.tableby
  output_format = rmarkdown::rtf_document)

write2html(list(
  "# Header 1", # a header
  code.chunk(a <- 1, b <- 2, a + b), # a code chunk
  verbatim("hi there") # verbatim output
),
  tempfile(fileext = ".html"),
  quite = TRUE)

## End(Not run)

Helper functions for `write2`

Description

Helper functions for write2.

Usage

verbatim(...)

code.chunk(..., chunk.opts = "r")
verbatim(...)

code.chunk(..., chunk.opts = "r")

Arguments

`...`	For `verbatim`, objects to print out monospaced (as if in the terminal). For `code.chunk`, either expressions or single character strings to paste into the code chunk.
`chunk.opts`	A single character string giving the code chunk options. Make sure to specify the engine!

Details

The "verbatim" class is to tell write2 to print the object inside a section surrounded by three back ticks. The results will look like it would in the terminal (monospaced).

code.chunk() is to write explicit code chunks in the .Rmd file; it captures the call and writes it to the file, to execute upon knitting.

write2word, write2html, write2pdf

Description

Functions to output tables to a single Word, HTML, or PDF document.

Usage

write2word(object, file, ...)

write2pdf(object, file, ...)

write2html(object, file, ...)
write2word(object, file, ...)

write2pdf(object, file, ...)

write2html(object, file, ...)

Arguments

`object`	An object.
`file`	A single character string denoting the filename for the output document.
`...`	Additional arguments to be passed to `FUN`, `rmarkdown::render`, etc. One popular option is to use `quiet = TRUE` to suppress the command line output.

Details

To generate the appropriate file type, the write2* functions use one of rmarkdown::word_document, rmarkdown::html_document, and rmarkdown::pdf_document to get the job done. "..." arguments are passed to these functions, too.

Value

object is returned invisibly, and file is written.

Author(s)

Ethan Heinzen, adapted from code from Krista Goergen

Examples

## Not run: 
data(mockstudy)
# tableby example
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
write2html(tab1, "~/trash.html")

# freqlist example
tab.ex <- table(mockstudy[, c("arm", "sex", "mdquality.s")], useNA = "ifany")
noby <- freqlist(tab.ex, na.options = "include")
write2pdf(noby, "~/trash2.pdf")

# A more complicated example
write2word(tab1, "~/trash.doc",
  keep.md = TRUE,
  reference_docx = mystyles.docx, # passed to rmarkdown::word_document
  quiet = TRUE, # passed to rmarkdown::render
  title = "My cool new title") # passed to summary.tableby

## End(Not run)
## Not run: 
data(mockstudy)
# tableby example
tab1 <- tableby(arm ~ sex + age, data=mockstudy)
write2html(tab1, "~/trash.html")

# freqlist example
tab.ex <- table(mockstudy[, c("arm", "sex", "mdquality.s")], useNA = "ifany")
noby <- freqlist(tab.ex, na.options = "include")
write2pdf(noby, "~/trash2.pdf")

# A more complicated example
write2word(tab1, "~/trash.doc",
  keep.md = TRUE,
  reference_docx = mystyles.docx, # passed to rmarkdown::word_document
  quiet = TRUE, # passed to rmarkdown::render
  title = "My cool new title") # passed to summary.tableby

## End(Not run)

Include a YAML header in `write2`

Description

Include a YAML header in write2

Usage

yaml(...)

## S3 method for class 'yaml'
print(x, ...)

## S3 method for class 'yaml'
c(..., recursive = FALSE)

is.yaml(x)
yaml(...)

## S3 method for class 'yaml'
print(x, ...)

## S3 method for class 'yaml'
c(..., recursive = FALSE)

is.yaml(x)

Arguments

`...`	For `yaml()`, arguments to be bundled into a list and passed to `as.yaml`. For `print.yaml()`, extra arguments. For `c.yaml()`, "yaml" objects to be concatenated.
`x`	An object of class `"yaml"`.
`recursive`	Not in use at this time.

Value

A text string of class "yaml".

Author(s)

Ethan Heinzen, adapted from an idea by Brendan Broderick

Examples

x <- yaml(title = "My cool title", author = "Ethan P Heinzen")
x
y <- yaml("header-includes" = list("\\usepackage[labelformat=empty]{caption}"))
y
c(x, y)
x <- yaml(title = "My cool title", author = "Ethan P Heinzen")
x
y <- yaml("header-includes" = list("\\usepackage[labelformat=empty]{caption}"))
y
c(x, y)

Package 'arsenal'

Help Index

Not in

Description

Usage

Arguments

Value

Author(s)

See Also

Examples

An Arsenal of 'R' Functions for Large-Scale Statistical Summaries

Description

Details

Functions

Data

Examples

arsenal tables with common structure

Description

Usage

Arguments

See Also

Defunct functions in arsenal

Description

Arguments

Details

See Also

Deprecated functions in arsenal

Description

See Also

as.data.frame.freqlist

Description

Usage

Arguments

Value

as.data.frame.modelsum

Description

Usage

Arguments

Value

Author(s)

See Also

as.data.frame.tableby

Description

Usage

Arguments

Value

Author(s)

See Also

Compare two data.frames and report differences

Description

Usage

Arguments

Author(s)

See Also

Examples

Control settings for comparedf function

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

comparedf tolerances

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Extract differences

Description

Usage

Arguments

Author(s)

See Also

formulize

Description

`arsenal` tables with common structure

Defunct functions in `arsenal`

Deprecated functions in `arsenal`

Control settings for `comparedf` function

`comparedf` tolerances

Control settings for `freqlist` function

Control settings for `modelsum` function