Package 'rigr' reference manual

Title:	Regression, Inference, and General Data Analysis Tools in R
Description:	A set of tools to streamline data analysis. Learning both R and introductory statistics at the same time can be challenging, and so we created 'rigr' to facilitate common data analysis tasks and enable learners to focus on statistical concepts. We provide easy-to-use interfaces for descriptive statistics, one- and two-sample inference, and regression analyses. 'rigr' output includes key information while omitting unnecessary details that can be confusing to beginners. Heteroscedasticity-robust ("sandwich") standard errors are returned by default, and multiple partial F-tests and tests for contrasts are easy to specify. A single regression function can fit both linear and generalized linear models, allowing students to more easily make connections between different classes of models.
Authors:	Amy D Willis [aut, cre], Taylor Okonek [aut], Charles J Wolock [aut], Brian D Williamson [aut], Scott S Emerson [aut], Andrew J Spieker [aut], Yiqun T Chen [aut], Travis Y Hee Wai [ctb], James P Hughes [ctb], R Core Team [ctb], Akhil S Bhel [ctb], Thomas Lumley [ctb]
Maintainer:	Amy D Willis <[email protected]>
License:	MIT + file LICENSE
Version:	1.0.7
Built:	2025-03-11 05:32:49 UTC
Source:	https://github.com/statdivlab/rigr

Regression, Inference, and General Data Analysis Tools in R

Description

Developed by Scott S. Emerson, Andrew J. Spieker, Brian D. Williamson, and Travis Y. Hee Wai at the University of Washington Department of Biostatistics. Currently maintained by Prof. Amy Willis at the University of Washington Department of Biostatistics. Previously maintained by Charles Wolock and Taylor Okonek, also at the University of Washington Department of Biostatistics. Aims to facilitate regression, descriptive statistics, and one- and two-sample inference by implementing more intuitive layout and functionality for existing R functions.

Details

Package:	rigr
Type:	Package
Version:	1.0.0
Date:	2021-09-10
License:	MIT

A set of tools designed to facilitate easy adoption of R for students in introductory classes with little programming experience. Compiles output from existing routines together in an intuitive format, and adds functionality to existing functions. For instance, the regression function can perform linear models and generalized linear models. The user can also specify multiple-partial F-tests to print out with the model coefficients, and robust standard errors are provided automatically. We also provide functions for descriptive statistics and one- and two-sample inference with improved, legible output.

Author(s)

Scott S. Emerson, Andrew J. Spieker, Brian D. Williamson, Amy D. Willis, Charles Wolock, and Taylor Okonek

Maintainer: Amy Willis <[email protected]>

ANOVA

Description

Compute analysis of variance (or deviance) tables for two fitted, nested uRegress objects. The model with more parameters is referred to as the full model (or the larger model), and the model with fewer parameters is referred to as the null model (or the smaller model).

Usage

## S3 method for class 'uRegress'
anova(object, full_object, test = "LRT", robustSE = TRUE, useFdstn = TRUE, ...)

Arguments

`object`	an object of class `uRegress`, the model with fewer parameters (i.e. the null model).
`full_object`	an object of class `uRegress`, the model with more parameters (i.e. the full model).
`test`	a character string specifying the test statistic to be used. Can be one of `'Wald'` or `'LRT'`, which corresponds to Wald or likelihood (partial likelihood for hazard regressions) ratio tests. Note that currently the Wald test is only supported for symbolically nested models; that is, when the larger model contains all the covariates (with the same names) in the smaller model.
`robustSE`	a logical value indicating whether or not to use robust standard errors in calculation. Defaults to `TRUE`. If `TRUE`, then `robustSE` must have been `TRUE` when `object` and `full_object` were created.
`useFdstn`	a logical indicator that the F distribution should be used for test statistics instead of the chi squared distribution. Defaults to `TRUE`. This option is not supported when the input models are hazard regressions (i.e., `fnctl="hazard"`).
`...`	argument to be passed in

Value

A list of class anova.uRegress with the following components:

`printMat`	A formatted table with inferential results (i.e., test statistics and p-values) for comparing two nested models.
`null model`	The null model in the comparison.
`full model`	The full model in the comparison.
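
For intuition about the kind of comparison reported here, the classical partial F-test for two nested linear models can be computed by hand in base R. This is only a sketch under stated assumptions: it uses `stats::lm` and the built-in `mtcars` data rather than `uRegress` objects, and it does not reproduce the robust standard errors that 'rigr' uses by default.

```r
# Null (smaller) and full (larger) nested linear models on a built-in dataset
fit_null <- lm(mpg ~ wt, data = mtcars)
fit_full <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Residual sums of squares and residual degrees of freedom
rss_null <- sum(residuals(fit_null)^2)
rss_full <- sum(residuals(fit_full)^2)
df_null <- df.residual(fit_null)
df_full <- df.residual(fit_full)

# Partial F statistic for the extra parameters in the full model
q <- df_null - df_full
f_stat <- ((rss_null - rss_full) / q) / (rss_full / df_full)
p_val <- pf(f_stat, q, df_full, lower.tail = FALSE)

# The same (non-robust) test from stats::anova, for comparison
tab <- anova(fit_null, fit_full)
c(by_hand = f_stat, anova = tab$F[2])
```

The two numbers printed at the end should agree, since `stats::anova` computes the same statistic for `lm` fits.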

Examples

# Loading required libraries
library(sandwich)

# Reading in a dataset
data(mri)

# Linear regression of LDL on age and stroke (with robust SE by default)
testReg_null <- regress ("mean", ldl~age+stroke, data = mri)

# Linear regression of LDL on age, stroke, and race (with robust SE by default)
testReg_full <- regress ("mean", ldl~age+stroke+race, data = mri)
# Comparing the two models using the Wald test with robust SE
anova(testReg_null, testReg_full, test = "Wald")

Calculate Cook's distances from `uRegress` objects

Description

Extracts Cook's distances from uRegress objects by relying on functionality from the stats package.

Usage

## S3 method for class 'uRegress'
cooks.distance(model, ...)

Arguments

`model`	an object of class `uRegress`, as returned by regress.
`...`	other arguments to pass to `stats::cooks.distance`

Value

a vector of Cook's distances
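
As a reminder of what the underlying `stats::cooks.distance` computes for a linear model, the distances can be rebuilt from the residuals and leverages. The sketch below assumes an ordinary `lm` fit on the built-in `mtcars` data; it illustrates the quantity, not the internals of the `uRegress` method.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

e <- residuals(fit)                 # raw residuals
h <- hatvalues(fit)                 # leverages
p <- length(coef(fit))              # number of coefficients
s2 <- sum(e^2) / df.residual(fit)   # residual variance estimate

# Cook's distance: D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2
d_by_hand <- e^2 / (p * s2) * h / (1 - h)^2

# Compare with the stats implementation
all.equal(d_by_hand, cooks.distance(fit))
```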

Descriptive Statistics

Description

Produces table of relevant descriptive statistics for an arbitrary number of variables of class integer, numeric, Surv, Date, or factor. Descriptive statistics can be obtained within strata, and the user can specify that only a subset of the data be used. Descriptive statistics include the count of observations, the count of cases with missing values, the mean, standard deviation, geometric mean, minimum, and maximum. The user can specify arbitrary quantiles to be estimated, as well as specifying the estimation of proportions of observations within specified ranges.

Usage

descrip(
  ...,
  strata = NULL,
  subset = NULL,
  probs = c(0.25, 0.5, 0.75),
  geomInclude = FALSE,
  replaceZeroes = FALSE,
  restriction = Inf,
  above = NULL,
  below = NULL,
  labove = NULL,
  rbelow = NULL,
  lbetween = NULL,
  rbetween = NULL,
  interval = NULL,
  linterval = NULL,
  rinterval = NULL,
  lrinterval = NULL
)

Arguments

`...`	an arbitrary number of variables for which descriptive statistics are desired. The arguments can be vectors, matrices, or lists. Individual columns of a matrix or elements of a list may be of class `numeric`, `factor`, `Surv`, or `Date`. Factor variables are converted to integers. Character vectors will be coerced to numeric. Variables may be of different lengths, unless `strata` or `subset` are non-`NULL`. A single `data.frame` or `tibble` may also be entered, in which case each variable in the object will be described.
`strata`	a vector, matrix, or list of stratification variables. Descriptive statistics will be computed within strata defined by each unique combination of the stratification variables, as well as in the combined sample. If `strata` is supplied, all variables must be of that same length.
`subset`	a vector indicating a subset to be used for all descriptive statistics. If `subset` is supplied, all variables must be of that same length.
`probs`	a vector of probabilities between 0 and 1 indicating quantile estimates to be included in the descriptive statistics. Default is to compute 25th, 50th (median) and 75th percentiles.
`geomInclude`	if not `FALSE` (the default), includes the geometric mean in the descriptive statistics.
`replaceZeroes`	if not `FALSE` (the default), this indicates a value to be used in place of zeroes when computing a geometric mean. If `TRUE`, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used for all variables.
`restriction`	a value used for computing restricted means, standard deviations, and geometric means with censored time-to-event data. The default value of `Inf` will cause restrictions at the highest observation. Note that the same value is used for all variables of class `Surv`.
`above`	a vector of values used to dichotomize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values greater than each element of `above`.
`below`	a vector of values used to dichotomize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values less than each element of `below`.
`labove`	a vector of values used to dichotomize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values greater than or equal to each element of `labove`.
`rbelow`	a vector of values used to dichotomize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values less than or equal to each element of `rbelow`.
`lbetween`	a vector of values with `-Inf` and `Inf` appended is used as cutpoints to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between successive elements of `lbetween`, with the left-hand endpoint included in each interval.
`rbetween`	a vector of values with `-Inf` and `Inf` appended is used as cutpoints to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between successive elements of `rbetween`, with the right-hand endpoint included in each interval.
`interval`	a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with neither endpoint included in each interval.
`linterval`	a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with the left-hand endpoint included in each interval.
`rinterval`	a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with the right-hand endpoint included in each interval.
`lrinterval`	a two-column matrix of values in which each row is used to define intervals of interest to categorize variables. The descriptive statistics will include an estimate for each variable of the proportion of measurements with values between two elements in a row, with both endpoints included in each interval.
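
The endpoint conventions above differ only in whether a cutoff value itself is counted. A minimal base R sketch of the proportions that `above`, `labove`, `lbetween`, and `rbetween` describe is given below; it uses a toy vector and `cut()`, and is an illustration of the definitions rather than the code `descrip` runs.

```r
x <- 1:10

# `above` vs `labove`: strict vs non-strict inequality at the cutoff 5
prop_above  <- mean(x > 5)    # proportion strictly greater than 5
prop_labove <- mean(x >= 5)   # proportion greater than or equal to 5

# `lbetween` vs `rbetween` with cutpoints 3 and 7 (-Inf and Inf appended)
cuts <- c(-Inf, 3, 7, Inf)
prop_lbetween <- prop.table(table(cut(x, cuts, right = FALSE)))  # [a, b)
prop_rbetween <- prop.table(table(cut(x, cuts, right = TRUE)))   # (a, b]

prop_above
prop_labove
prop_lbetween
prop_rbetween
```

With these data the value 5 moves the first proportion from 0.5 to 0.6, and the values 3 and 7 change bins between the two `cut()` calls.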

Details

This function depends on the survival R package. You should execute library(survival) if that package has not already been loaded. Quantiles are computed for uncensored data using the default method in quantile(). For variables of class factor, descriptive statistics will be computed using the integer coding for factors. For variables of class Surv, estimated proportions and quantiles will be computed from Kaplan-Meier estimates, as will be restricted means, restricted standard deviations, and restricted geometric means. For variables of class Date, estimated proportions will be labeled using the Julian date since January 1, 1970.

Value

An object of class uDescriptives is returned. Descriptive statistics for each variable in the entire subsetted sample, as well as within each stratum if any is defined, are contained in a matrix with rows corresponding to variables and strata and columns corresponding to the descriptive statistics. Descriptive statistics include

N: the number of observations.
Msng: the number of observations with missing values.
Mean: the mean of the nonmissing observations (this is potentially a restricted mean for right-censored time-to-event data).
Std Dev: the standard deviation of the nonmissing observations (this is potentially a restricted standard deviation for right-censored time to event data).
Geom Mn: the geometric mean of the nonmissing observations (this is potentially a restricted geometric mean for right-censored time to event data). Nonpositive values in the variable will generate NA, unless replaceZeroes was specified.
Min: the minimum value of the nonmissing observations (this is potentially restricted for right-censored time-to-event data).
Quantiles: columns corresponding to the quantiles specified by probs (these are potentially restricted for right-censored time-to-event data).
Max: the maximum value of the nonmissing observations (this is potentially restricted for right-censored time-to-event data).
Proportions: columns corresponding to the proportions as specified by above, below, labove, rbelow, lbetween, rbetween, interval, linterval, rinterval, and lrinterval.
restriction: the threshold for restricted means, standard deviations, and geometric means.
firstEvent: the time of the first event for censored time-to-event variables.
lastEvent: the time of the last event for censored time-to-event variables.
isDate: an indicator that the variable is a Date object.
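
The geometric mean and the zero-replacement rule described for `replaceZeroes = TRUE` can be sketched in base R as follows. The helper `geom_mean` and the toy vector are hypothetical names introduced for illustration; the sketch follows the rule as stated in the argument description and is not taken from the package source.

```r
x <- c(0, 2, 4, 8)

# Geometric mean: exponentiated mean of the logs
geom_mean <- function(v) exp(mean(log(v)))

# Replacement rule described for replaceZeroes = TRUE:
# substitute one-half of the lowest nonzero value for each zero
replacement <- min(x[x != 0]) / 2
x_replaced <- ifelse(x == 0, replacement, x)

geom_mean(x_replaced)
```

Here the zero is replaced by 1, so the result is the fourth root of 1 * 2 * 4 * 8 = 64.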

Examples


# Read in the data
data(mri) 

# Create the table 
descrip(mri)

Calculate dfbeta from `uRegress` objects

Description

Extracts dfbeta from uRegress objects by relying on functionality from the stats package. Note that dfbeta and dfbetas are not the same: dfbetas are the dfbeta values rescaled by a factor that reflects both the leverage of the observation in question and the residual model error.

Usage

## S3 method for class 'uRegress'
dfbeta(model, ...)

Arguments

`model`	an object of class `uRegress`, as returned by regress.
`...`	other arguments to pass to `stats::dfbeta`

Value

a matrix of dfbeta values, with a row for each observation and a column for each model coefficient

Calculate dfbetas from `uRegress` objects

Description

Extracts dfbetas from uRegress objects by relying on functionality from the stats package. Note that dfbeta and dfbetas are not the same: dfbetas are the dfbeta values rescaled by a factor that reflects both the leverage of the observation in question and the residual model error.

Usage

## S3 method for class 'uRegress'
dfbetas(model, ...)

Arguments

`model`	an object of class `uRegress`, as returned by regress.
`...`	other arguments to pass to `stats::dfbetas`

Value

a matrix of dfbetas values, with a row for each observation and a column for each model coefficient
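
The relationship between dfbeta and dfbetas can be checked directly for an ordinary linear model. The sketch below assumes a plain `lm` fit on the built-in `mtcars` data and uses the scaling applied by `stats::dfbetas` for `lm` objects: the leave-one-out residual standard deviation for each observation times the square root of the corresponding diagonal entry of the unscaled coefficient covariance matrix.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Leave-one-out residual standard deviations, one per observation
sigma_loo <- lm.influence(fit)$sigma

# Square roots of the diagonal of (X'X)^{-1}, one per coefficient
xtx_inv_sqrt <- sqrt(diag(summary(fit)$cov.unscaled))

# dfbetas = dfbeta divided, entry by entry, by sigma_(i) * sqrt((X'X)^{-1}_jj)
scaling <- outer(sigma_loo, xtx_inv_sqrt)
all.equal(dfbeta(fit) / scaling, dfbetas(fit), check.attributes = FALSE)
```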

Create Dummy Variables

Description

Create Dummy Variables

Usage

dummy(
  x,
  subset = rep(TRUE, length(x)),
  reference = sort(unique(x[!is.na(x)])),
  includeAll = FALSE
)

Arguments

`x`	variable used to create the dummy variables.
`subset`	a subset of the data, if desired.
`reference`	the reference value for the dummy variables to compare to.
`includeAll`	logical value indicating whether all of the dummy variables should be returned (including the reference).

Value

A matrix containing the dummy variables.
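
The idea of indicator (dummy) coding against a reference value can be sketched in base R. This is an illustration of the concept only: taking the smallest value as the reference and the column names used below are assumptions made for the example, not documented behavior of `dummy`.

```r
x <- c(0, 1, 2, 1, 0, NA)

# Distinct non-missing values; the first is taken as the reference here (assumption)
lvls <- sort(unique(x[!is.na(x)]))
reference <- lvls[1]

# One 0/1 indicator column per non-reference value; NA stays NA
non_ref <- setdiff(lvls, reference)
ind <- sapply(non_ref, function(l) as.numeric(x == l))
colnames(ind) <- paste0("x.", non_ref, ".vs.", reference)  # hypothetical labels
ind
```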

Examples


data(mri)

# Create a dummy variable for chd
dummy(mri$chd)

FEV dataset

Description

Data from a study of 654 children on the relationship between smoking status and lung function (measured by FEV). Each row corresponds to a single clinic visit and contains information on age, height, sex, FEV, and smoking status. More information, including a coding key, is available at http://www.emersonstatistics.com/datasets/fev.doc.

Usage

fev

Format

A data frame with 654 rows and 7 variables:

seqnbr: case number (the numbers 1 to 654)
subjid: subject identification number (unique for each different child)
age: subject age at time of measurement (years)
fev: measured forced expiratory volume (liters per second)
height: subject height at time of measurement (inches)
sex: subject sex
smoke: smoking habits ("yes" or "no")

Source

http://www.emersonstatistics.com/datasets/fev.txt

Calculate the hat-values (leverages) from `uRegress` objects

Description

Extracts hat-values (leverages) from uRegress objects by relying on functionality from the stats package.

Usage

## S3 method for class 'uRegress'
hatvalues(model, ...)

Arguments

`model`	an object of class `uRegress`, as returned by regress.
`...`	other arguments to pass to `stats::hatvalues`

Value

a vector of hat-values (leverages)
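
For a linear model, the hat-values are the diagonal of the hat matrix built from the design matrix. A short base R sketch, assuming an ordinary `lm` fit on the built-in `mtcars` data rather than a `uRegress` object:

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hat matrix H = X (X'X)^{-1} X'; its diagonal holds the leverages
X <- model.matrix(fit)
H <- X %*% solve(crossprod(X)) %*% t(X)
h_by_hand <- diag(H)

all.equal(h_by_hand, hatvalues(fit), check.attributes = FALSE)
sum(h_by_hand)  # leverages sum to the number of coefficients
```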

Tests of Linear Combinations of Regression Coefficients

Description

Produces point estimates, interval estimates, and p-values for linear combinations of regression coefficients using a uRegress object.

Usage

lincom(
  reg,
  comb,
  null.hypoth = 0,
  conf.level = 0.95,
  robustSE = TRUE,
  joint.test = FALSE,
  useFdstn = FALSE,
  eform = reg$fnctl != "mean"
)

Arguments

`reg`	an object of class `uRegress`.
`comb`	a vector or matrix containing the values of the constants which create the linear combination of the form $c_0 + c_1\beta_1 + \dots$ Zeroes must be given if coefficients aren't going to be included. For testing multiple combinations, this must be a matrix with number of columns equal to the number of coefficients in the model.
`null.hypoth`	the null hypothesis to compare the linear combination of coefficients against. This is a scalar if one combination is given, and a vector or matrix otherwise. The default value is `0`.
`conf.level`	a number between 0 and 1, indicating the desired confidence level for intervals.
`robustSE`	a logical value indicating whether or not to use robust standard errors in calculation. Defaults to `TRUE`. If `TRUE`, then `robustSE` must have been `TRUE` when `reg` was created.
`joint.test`	a logical value indicating whether or not to use a joint Chi-square test for all the null hypotheses. If joint.test is `TRUE`, then no confidence interval is calculated. Defaults to `FALSE`.
`useFdstn`	a logical indicator that the F distribution should be used for test statistics instead of the chi squared distribution. Defaults to `FALSE`. This option is not supported when input `reg` is a hazard regression (i.e., `fnctl="hazard"`).
`eform`	a logical value indicating whether or not to exponentiate the estimated coefficient. By default this is performed based on the type of regression used.

Value

A list of class lincom (if joint.test is FALSE) or lincom.joint (if joint.test is TRUE). For the lincom class, comb entries in the list are labeled comb1, comb2, etc. for as many linear combinations as were used. Each is a list with the following components:

`printMat`	A formatted table with inferential results for the linear combination of coefficients. These include the point estimate, standard error, confidence interval, and t-test for the linear combination.
`nms`	The name of the linear combination, for printing.
`null.hypoth`	The null hypothesis for the linear combination.
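
The quantities in `printMat` follow from the coefficient vector and its covariance matrix. The sketch below works through one linear combination by hand for an ordinary `lm` fit on the built-in `mtcars` data. It uses the classical covariance matrix from `vcov()` and a t reference distribution, which are assumptions of this example; `lincom` uses robust standard errors by default, so its numbers would generally differ.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Combination 0 * Intercept + 0.5 * wt - 1 * hp, tested against 0
comb <- c(0, 0.5, -1)
null_value <- 0

est <- sum(comb * coef(fit))                       # point estimate c'beta
se <- sqrt(drop(t(comb) %*% vcov(fit) %*% comb))   # classical (non-robust) SE
t_stat <- (est - null_value) / se
df <- df.residual(fit)

ci <- est + c(-1, 1) * qt(0.975, df) * se          # 95% confidence interval
p_val <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

c(estimate = est, se = se, lower = ci[1], upper = ci[2], p = p_val)
```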

Examples

# Loading required libraries
library(sandwich)

# Reading in a dataset
data(mri)

# Linear regression of LDL on age and stroke (with robust SE by default)
testReg <- regress ("mean", ldl~age+stroke, data = mri)

# Testing coefficient created by .5*age - stroke (the first 0 comes from excluding the intercept)
testC <- c(0, 0.5, -1)
lincom(testReg, testC)

# Test multiple combinations: 
# test separately whether .5*age - stroke = 0 and whether Intercept + 60*age = 125
testC <- matrix(c(0, 0.5, -1, 1, 60, 0), byrow = TRUE, nrow = 2)
lincom(testReg, testC, null.hypoth = c(0, 125))

# Test joint null hypothesis:
# H0: .5*age - stroke = 0 AND Intercept + 60*age = 125 
lincom(testReg, testC, null.hypoth = c(0, 125), joint.test = TRUE)

MRI dataset

Description

Data from an observational study of the incidence of cardiovascular disease (especially heart attacks and congestive heart failure) and cerebrovascular disease (especially strokes) in the U.S. elderly. More information, including a coding key, is available at http://www.emersonstatistics.com/datasets/mri.doc.

Usage

mri

Format

A data frame with 735 rows and 30 variables:

ptid: Participant identification number.
mridate: The date on which the participant underwent MRI scan in MMDDYY format.
age: Participant age at time of MRI, in years.
sex: The sex of the participant. Only 'Male' and 'Female' are represented.
race: Participant's race. One of the following: 'White', 'Black', 'Asian', or 'Subject did not identify as White, Black or Asian'. It is unclear if study participants self-identified their race, or if it was guessed by the study organisers.
weight: Participant's weight at time of MRI (pounds).
height: Participant's height at time of MRI (centimeters).
packyrs: Participant smoking history in pack years (1 pack year = smoking 1 pack of cigarettes per day for 1 year). A participant who has never smoked has 0 pack years.
yrsquit: Number of years since quitting smoking. A current smoker will have a nonzero packyrs and a 0 for yrsquit. A never smoker will have a zero for both variables.
alcoh: Average alcohol intake for the participant for the two weeks prior to MRI (drinks per week, where one drink is 1 oz. whiskey, 4 oz. wine, or 12 oz. beer).
physact: Physical activity of the participant for the week prior to MRI (1,000 kcal).
chf: Indicator of whether the participant had been diagnosed with congestive heart failure prior to MRI (0=no, 1=yes).
chd: Indicator of whether the participant had been diagnosed with coronary heart disease prior to MRI (0=no, 1=diagnosis of angina, 2=diagnosis of myocardial infarction).
stroke: Indicator of whether the participant had been diagnosed with a cerebrovascular event prior to MRI (0=no, 1=diagnosis of a transient ischemic attack, 2=diagnosis of stroke).
diabetes: Indicator of whether the participant had been diagnosed with diabetes prior to MRI (0=no, 1=yes).
genhlth: an indicator of the participant's view of their own health (1=excellent, 2=very good, 3=good, 4=fair, 5=poor)
ldl: a laboratory measure of low density lipoprotein (a kind of cholesterol) in the participant's blood at the time of MRI (mg/dL).
alb: a laboratory measure of albumin, a kind of protein, in the participant's blood at the time of MRI (g/L).
crt: a laboratory measure of creatinine, a waste product, in the participant's blood at the time of MRI (mg/dL).
plt: a laboratory measure of the number of platelets circulating in the participant's blood at the time of MRI (1000 per cubic mm).
sbp: a measurement of the participant's systolic blood pressure in their arm at the time of MRI (mm Hg).
aai: the ratio of systolic blood pressure measured in the participant's ankle at time of MRI to the systolic blood pressure in the participant's arm.
fev: a measure of the forced expiratory volume in the participant at the time of MRI (L/sec).
dsst: a measure of cognitive function (Digit Symbol Substitution Test) for the participant at the time of MRI. Maximum score possible is 100.
atrophy: a measure of loss of neurons estimated by the degree of ventricular enlargement relative to the predicted ventricular size; with 0 indicating no atrophy and 100 indicating the most severe degree of atrophy.
whgrd: a measure of white matter changes detected on MRI. 0 means no changes, 9 means marked changes.
numinf: a count of the number of distinct regions identified on MRI scan which were suggestive of infarcts.
volinf: a measure of the total volume of infarct-like lesions found on MRI scan (cubic cm).
obstime: the total time (in days) that the participant was observed on study between the date of MRI and death or September 16, 1997, whichever came first.
death: an indicator that the participant was observed to die while on study. If 1, the number of days recorded in obstime is the number of days from that participant's MRI to their death. If 0, the number of days in obstime is the number of days between that participant's MRI and September 16, 1997.

Source

http://www.emersonstatistics.com/datasets/mri.txt

Create Polynomials

Description

Creates polynomial variables, to be used in regression. Will create polynomials of degree less than or equal to the degree specified, and will mean center variables by default.

Usage

polynomial(x, degree = 2, center = mean(x, na.rm = TRUE))

Arguments

`x`	variable used to create the polynomials.
`degree`	the maximum degree polynomial to be returned. Polynomials of degree <= `degree` will be returned.
`center`	the value to center the polynomials at.

Value

A matrix containing the polynomial terms.
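
Centered polynomial terms of the kind described above can be sketched in base R. The column layout and column names below are assumptions made for illustration; only the centering at the mean and the inclusion of every degree up to `degree` come from the description.

```r
x <- c(1, 2, 3, 4, 10)
degree <- 3
center <- mean(x, na.rm = TRUE)

# Columns (x - center)^1, ..., (x - center)^degree
poly_terms <- sapply(seq_len(degree), function(d) (x - center)^d)
colnames(poly_terms) <- paste0("x^", seq_len(degree))  # hypothetical labels
poly_terms
```

Centering before raising to a power typically reduces the correlation between the linear and higher-order columns, which is one common motivation for a mean-centered default.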

Examples


# Reading in a dataset
data(mri)

# Create a polynomial on ldl
polynomial(mri$ldl, degree=3)

# Use a polynomial in regress
regress("mean", atrophy ~ polynomial(age, degree = 2), data = mri)

Prediction Intervals for `uRegress` objects

Description

Produces prediction intervals for objects of class uRegress.

Usage

## S3 method for class 'uRegress'
predict(object, interval = "prediction", level = 0.95, ...)

Arguments

`object`	an object of class `uRegress`.
`interval`	Type of interval calculation
`level`	Tolerance/confidence level
`...`	other arguments to pass to the appropriate predict function for the class of `object$fit`. See `predict.lm`, or `predict.glm` for more details.

Value

Returns a matrix with the fitted value and prediction interval for the entered X.
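
Because additional arguments are passed on to the underlying predict function, it may help to recall how `stats::predict.lm` handles new data and interval types. The sketch below uses a plain `lm` fit on the built-in `mtcars` data; whether `newdata` behaves identically when supplied to the `uRegress` method is an assumption that should be checked against the package.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# 95% prediction interval for a single new observation with wt = 3
new_obs <- data.frame(wt = 3)
pred <- predict(fit, newdata = new_obs, interval = "prediction", level = 0.95)
pred  # columns: fit, lwr, upr

# A confidence interval for the mean response at the same point is narrower
conf <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
conf
```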

Examples


# Loading required libraries
library(survival)
library(sandwich)

# Reading in a dataset
data(mri)

# Linear regression of LDL on age (with robust SE by default)
testReg <- regress ("mean", ldl~age, data = mri)

# 95% prediction intervals for the observations used to fit the model
predict(testReg)

Test of proportions with improved layout

Description

Performs a one- or two-sample test of proportions using data. This test can be approximate or exact.

Usage

proptest(
  var1,
  var2 = NULL,
  by = NULL,
  exact = FALSE,
  null.hypoth = ifelse(is.null(var2) && is.null(by), 0.5, 0),
  alternative = "two.sided",
  conf.level = 0.95,
  correct = FALSE,
  more.digits = 0
)

Arguments

`var1`	a (non-empty) vector of binary numeric (0-1), binary factor, or logical data values
`var2`	an optional (non-empty) vector of binary numeric (0-1), binary factor, or logical data values
`by`	a variable of equal length to that of `var1` with two outcomes (numeric or factor). This will be used to define strata for a prop test on `var1`.
`exact`	If true, performs a test of equality of proportions using exact binomial probabilities.
`null.hypoth`	a number specifying the null hypothesis for the proportion (or difference in proportions if performing a two-sample test). Defaults to 0.5 for a one-sample test and 0 for a two-sample test.
`alternative`	a string: one of `"less"`, `"two.sided"`, or `"greater"` specifying the form of the test. Defaults to a two-sided test.
`conf.level`	confidence level of the test. Defaults to 0.95.
`correct`	a logical indicating whether to perform a continuity correction
`more.digits`	a numeric value specifying whether or not to display more or fewer digits in the output. Non-integers are automatically rounded down.

Details

Missing values must be given by "NA"s to be recognized as missing values. Numeric data must be given in 0-1 form. This function also accepts binary factor variables, treating the higher level as 1 and the lower level as 0, or logical variables.
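
The approximate one-sample test can be illustrated with the textbook z statistic, computed by hand from 0-1 data containing a missing value. This is a sketch under the assumption that the standard error is evaluated at the null proportion; the source does not state which standard error `proptest` uses for its test statistic, so the numbers need not match its output exactly.

```r
# 0-1 data with a missing value
x <- c(1, 0, 1, 1, 0, 1, 1, 1, NA, 0, 1, 1)
p0 <- 0.5                         # null hypothesis proportion

n <- sum(!is.na(x))               # non-missing observations
p_hat <- mean(x, na.rm = TRUE)    # sample proportion

# z statistic using the standard error under the null hypothesis
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_val <- 2 * pnorm(abs(z), lower.tail = FALSE)

# Cross-check: stats::prop.test reports z^2 as its X-squared statistic
chk <- prop.test(sum(x, na.rm = TRUE), n, p = p0, correct = FALSE)
c(z_squared = z^2, prop_test = unname(chk$statistic),
  p_by_hand = p_val, p_prop_test = chk$p.value)
```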

Value

A list of class proptest. The print method lays out the information in an easy-to-read format.

`tab`	A formatted table of descriptive and inferential results (total number of observations, number of missing observations, sample proportion, standard error of the proportion estimate), along with a confidence interval for the underlying proportion.
`zstat`	the value of the test statistic, if using an approximate test.
`pval`	the p-value for the test
`var1`	The user-supplied first data vector.
`var2`	The user-supplied second data vector.
`by`	The user-supplied stratification variable.
`par`	A vector of information about the type of test (null hypothesis, alternative hypothesis, etc.)

Examples


# Read in data set
data(psa)
attach(psa)

# Define new binary variable as indicator
# of whether or not bss was worst possible
bssworst <- bss
bssworst[bss == 1] <- 0
bssworst[bss == 2] <- 0
bssworst[bss == 3] <- 1


# Perform test comparing proportion in remission
# between bss strata
proptest(factor(inrem), by = bssworst)

Test of proportions from summary statistics

Description

Performs a one- or two-sample test of proportions using counts of successes and trials, rather than binary data. This test can be approximate or exact.

Usage

proptesti(
  x1,
  n1,
  x2 = NULL,
  n2 = NULL,
  exact = FALSE,
  null.hypoth = ifelse(is.null(x2) && is.null(n2), 0.5, 0),
  conf.level = 0.95,
  alternative = "two.sided",
  correct = FALSE,
  more.digits = 0
)

Arguments

`x1`	Number of successes in first sample
`n1`	Number of trials in first sample
`x2`	Number of successes in second sample
`n2`	Number of trials in second sample
`exact`	a logical; if `TRUE`, performs a test of equality of proportions with exact binomial confidence intervals.
`null.hypoth`	a number specifying the null hypothesis for the proportion (or difference in proportions if performing a two-sample test). Defaults to 0.5 for one-sample and 0 for two-sample.
`conf.level`	confidence level of the test. Defaults to 0.95.
`alternative`	a string: one of `"less"`, `"two.sided"`, or `"greater"` specifying the form of the test. Defaults to a two-sided test. When either `"less"` or `"greater"` is used, the corresponding one-sided confidence interval is returned.
`correct`	a logical indicating whether to perform a continuity correction
`more.digits`	a numeric value specifying how many more (or fewer) digits to display in the output. Non-integers are automatically rounded down.

Details

If either x2 or n2 is specified, then both must be specified, and a two-sample test is run.
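For orientation, the approximate two-sample test can be written out by hand in base R. This is a textbook sketch, not the rigr source: the helper name `two_prop_z` is hypothetical, and the use of a pooled standard error under the null hypothesis of no difference is an assumption about which variant is intended.

```r
# Hypothetical helper: pooled two-sample z-test for H0: p1 - p2 = 0
two_prop_z <- function(x1, n1, x2, n2,
                       alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  p1 <- x1 / n1
  p2 <- x2 / n2
  pooled <- (x1 + x2) / (n1 + n2)
  se <- sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
  z <- (p1 - p2) / se
  pval <- switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),
    less      = pnorm(z),
    greater   = pnorm(z, lower.tail = FALSE)
  )
  list(estimate = p1 - p2, zstat = z, pval = pval)
}

res <- two_prop_z(10, 100, 15, 200, alternative = "less")
res

# Cross-check against stats::prop.test without continuity correction,
# whose chi-squared statistic equals z^2 for two groups
ref <- prop.test(c(10, 15), c(100, 200), alternative = "less", correct = FALSE)
all.equal(unname(ref$statistic), res$zstat^2)
```

The counts are the same as in the two-sample example for proptesti, so the two outputs can be compared directly; any discrepancy would point to a different choice of standard error.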

Value

A list of class proptesti. The print method lays out the information in an easy-to-read format.

`tab`	A formatted table of descriptive and inferential results (total number of observations, sample proportion, standard error of the proportion estimate), along with a confidence interval for the underlying proportion.
`zstat`	the value of the test statistic, if using an approximate test.
`pval`	the p-value for the test
`par`	A vector of information about the type of test (null hypothesis, alternative hypothesis, etc.)

Examples

# Two-sample test
proptesti(10, 100, 15, 200, alternative = "less")

PSA dataset

Description

Data from a study of 50 men having hormonally treated prostate cancer. Includes information on PSA levels, tumor characteristics, remission status, age, and disease state. More information, including a coding key, is available at http://www.emersonstatistics.com/datasets/PSA.doc.

Usage

psa

Format

A data frame with 50 rows and 9 variables:

ptid: patient identifier
nadirpsa: lowest PSA value attained post therapy (ng/ml)
pretxpsa: PSA value prior to therapy (ng/ml)
ps: performance status (0= worst, 100= best)
bss: bone scan score (1= least disease, 3= most)
grade: tumor grade (1= least aggressive, 3= most)
age: patient's age (years)
obstime: time observed in remission (months)
inrem: Indicator whether patient still in remission at last follow-up (yes or no)

Source

http://www.emersonstatistics.com/datasets/psa.txt

General Regression for an Arbitrary Functional

Description

Produces point estimates, interval estimates, and p-values for an arbitrary functional (mean, geometric mean, proportion, odds, hazard) of a variable of class integer or numeric when regressed on an arbitrary number of covariates. Multiple-partial F-tests can be specified using the U function.

Usage

regress(
  fnctl,
  formula,
  data,
  intercept = TRUE,
  weights = rep(1, nrow(data.frame(data))),
  subset = rep(TRUE, nrow(data.frame(data))),
  robustSE = TRUE,
  conf.level = 0.95,
  exponentiate = fnctl != "mean",
  replaceZeroes,
  useFdstn = TRUE,
  suppress = FALSE,
  na.action,
  method = "qr",
  qr = TRUE,
  singular.ok = TRUE,
  contrasts = NULL,
  init = NULL,
  ties = "efron",
  offset,
  control = list(...),
  ...
)

Arguments

`fnctl`	a character string indicating the functional (summary measure of the distribution) for which inference is desired. Choices include `"mean"`, `"geometric mean"`, `"odds"`, `"rate"`, `"hazard"`.
`formula`	an object of class `formula` as might be passed to `lm`, `glm`, or `coxph`. Functions of variables, specified using `dummy` or `polynomial` may also be included in `formula`.
`data`	a data frame, matrix, or other data structure with matching names to those entered in `formula`.
`intercept`	a logical value indicating whether an intercept exists or not. Default value is `TRUE` for all functionals. Intercept may also be removed if a "-1" is present in `formula`. If "-1" is present in `formula` but `intercept = TRUE` is specified, the model will fit without an intercept. Note that when `fnctl = "hazard"`, the intercept is always set to `FALSE` because Cox proportional hazards regression models do not explicitly estimate an intercept.
`weights`	vector indicating optional weights for weighted regression.
`subset`	vector indicating a subset to be used for all inference.
`robustSE`	a logical indicator that standard errors (and confidence intervals) are to be computed using the Huber-White sandwich estimator. The default is TRUE.
`conf.level`	a numeric scalar indicating the level of confidence to be used in computing confidence intervals. The default is 0.95.
`exponentiate`	a logical indicator that the regression parameters should be exponentiated. This is by default true for all functionals except the mean.
`replaceZeroes`	if not `FALSE`, this indicates a value to be used in place of zeroes when computing a geometric mean. If `TRUE`, a value equal to one-half the lowest nonzero value is used. If a numeric value is supplied, that value is used. Defaults to `TRUE` when `fnctl = "geometric mean"`. This parameter is always `FALSE` for all other values of `fnctl`.
`useFdstn`	a logical indicator that the F distribution should be used for test statistics instead of the chi squared distribution even in logistic regression models. When using the F distribution, the degrees of freedom are taken to be the sample size minus the number of parameters, as it would be in a linear regression model.
`suppress`	if `TRUE`, and a model which requires exponentiation (for instance, regression on the geometric mean) is computed, then a table with only the exponentiated coefficients and confidence interval is returned. Otherwise, two tables are returned - one with the original unexponentiated coefficients, and one with the exponentiated coefficients.
`na.action`, `qr`, `singular.ok`, `offset`, `contrasts`, `control`	optional arguments that are passed to the functionality of `lm` or `glm`.
`method`	the method to be used in fitting the model. The default value for `fnctl = "mean"` and `fnctl = "geometric mean"` is `"qr"`, and the default value for `fnctl = "odds"` and `fnctl = "rate"` is `"glm.fit"`. This argument is passed into the lm() or glm() function, respectively. You may optionally specify `method = "model.frame"`, which returns the model frame and does no fitting.
`init`	a numeric vector of initial values for the regression parameters for the hazard regression. Default initial value is zero for all variables.
`ties`	a character string describing the method for breaking ties in hazard regression. Only `"efron"`, `"breslow"`, or `"exact"` is accepted. See the documentation for this argument in survival::coxph for more details. Defaults to `"efron"`.
`...`	additional arguments to be passed to the `lm` function call
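
To make the `robustSE` argument concrete, the Huber-White estimator can be assembled by hand for an ordinary linear model. The sketch below uses the built-in `mtcars` data and the basic (HC0) form with no small-sample adjustment; which adjustment, if any, regress applies is not stated here, so its reported standard errors may differ somewhat.

```r
# Ordinary least squares fit on a built-in dataset (illustration only)
fit <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(fit)   # design matrix, including the intercept column
e <- residuals(fit)      # raw residuals

# Sandwich: (X'X)^{-1} [ X' diag(e^2) X ] (X'X)^{-1}
bread <- solve(crossprod(X))
meat  <- crossprod(X * e)   # each row of X is scaled by its residual
V_robust <- bread %*% meat %*% bread

# Robust vs. model-based standard errors, side by side
cbind(robust = sqrt(diag(V_robust)),
      classical = sqrt(diag(vcov(fit))))
```

The two columns coincide only when the residual variance is roughly constant; a large gap between them is a sign that the equal-variance assumption matters for the inference.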

Details

Regression models include linear regression (for the “mean” functional), logistic regression with logit link (for the “odds” functional), Poisson regression with log link (for the “rate” functional), linear regression of a log-transformed outcome (for the “geometric mean” functional), and Cox proportional hazards regression (for the “hazard” functional).
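
As a rough guide to the mapping above, each functional corresponds to a familiar base R fit. The calls below use the built-in `mtcars` data purely for illustration and reproduce only the point estimates one would expect; the inference regress reports (robust standard errors by default) will generally differ from the model-based summaries of these fits. The hazard functional is omitted to keep the sketch free of non-base packages.

```r
# "mean": linear regression
fit_mean <- lm(mpg ~ wt, data = mtcars)

# "geometric mean": linear regression of the log-transformed outcome
fit_geom <- lm(log(mpg) ~ wt, data = mtcars)

# "odds": logistic regression with logit link (am is coded 0/1)
fit_odds <- glm(am ~ wt, family = binomial(link = "logit"), data = mtcars)

# "rate": Poisson regression with log link (carb is a count)
fit_rate <- glm(carb ~ wt, family = poisson(link = "log"), data = mtcars)

# For every functional except the mean, coefficients are usually read
# on the exponentiated scale (ratios of geometric means, odds, or rates)
exp(coef(fit_geom))
exp(coef(fit_odds))
exp(coef(fit_rate))
```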

Currently, for the hazard functional, only 'coxph' syntax is supported; in other words, using the 'dummy', 'polynomial', and 'U' functions will result in an error when 'fnctl = "hazard"'.

Note that the only possible link function in 'regress' with 'fnctl = "odds"' is the logit link. Similarly, the only possible link function in 'regress' with 'fnctl = "rate"' is the log link.

Objects created using the U function can also be passed in. If the U call involves a partial formula of the form ~ var1 + var2, then regress will return a multiple-partial F-test involving var1 and var2. If an F-statistic will already be calculated regardless of the U specification, then any naming convention specified via name ~ var1 will be ignored. The multiple partial tests must be the last terms specified in the model (i.e. no other predictors can follow them).
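
The idea behind a multiple-partial F-test can be sketched in base R by comparing nested models. This is the classical, model-based version on the built-in `mtcars` data; with `robustSE = TRUE`, regress bases its tests on the robust covariance estimate instead, so its statistic would not be expected to match this one exactly.

```r
# Reduced model omits the terms being tested; full model includes them
reduced <- lm(mpg ~ wt, data = mtcars)
full    <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Classical F-test of H0: coefficients of hp and qsec are both zero
cmp <- anova(reduced, full)
cmp

# The same statistic from the residual sums of squares
rss0 <- sum(residuals(reduced)^2)
rss1 <- sum(residuals(full)^2)
q    <- 2                      # number of coefficients tested
f_manual <- ((rss0 - rss1) / q) / (rss1 / df.residual(full))
f_manual
```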

Value

An object of class uRegress is returned. Parameter estimates, confidence intervals, and p values are contained in a matrix $augCoefficients.

Examples

# Loading dataset
data(mri)

# Linear regression of atrophy on age
regress("mean", atrophy ~ age, data = mri)

# Linear regression of atrophy on sex and height and their interaction, 
# with a multiple-partial F-test on the height-sex interaction
regress("mean", atrophy ~ height + sex + U(hs=~height:sex), data = mri)

# Logistic regression of sex on atrophy
mri$sex_bin <- ifelse(mri$sex == "Female", 1, 0)
regress("odds", sex_bin ~ atrophy, data = mri)

# Cox regression of survival on age
library(survival)
regress("hazard", Surv(obstime, death) ~ age, data = mri)

Extract Residuals from `uRegress` objects

Description

Extracts residuals (unstandardized, standardized, studentized, or jackknife) from uRegress objects.

Usage

## S3 method for class 'uRegress'
residuals(object, type = "", ...)

Arguments

`object`	an object of class `uRegress`, as returned by regress.
`type`	denotes the type of residuals to return. Default value is `""`, which returns unstandardized residuals. `"standardized"`, `"studentized"`, and `"jackknife"` return the expected type of residuals.
`...`	other arguments

Details

Relies on functionality from the stats package to return residuals from the uRegress object. "studentized" residuals are computed as internally studentized residuals, while "jackknife" computes the externally studentized residuals.
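
The distinction between internally and externally studentized residuals can be checked against base R on an ordinary lm fit. The sketch uses the built-in `mtcars` data; it shows the standard formulas that stats::rstandard (internal) and stats::rstudent (external) implement and is not taken from the residuals.uRegress source.

```r
fit <- lm(mpg ~ wt, data = mtcars)

e <- residuals(fit)
h <- hatvalues(fit)            # leverages
s <- summary(fit)$sigma        # residual standard error
n <- nobs(fit)
p <- length(coef(fit))

# Internally studentized: scale estimate uses all observations
r_int <- e / (s * sqrt(1 - h))

# Externally studentized: scale estimate leaves out observation i
r_ext <- r_int * sqrt((n - p - 1) / (n - p - r_int^2))

# Both agree with the stats package
all.equal(r_int, rstandard(fit))
all.equal(r_ext, rstudent(fit))
```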

Value

Returns the type of residuals requested.

Examples


# Reading in a dataset
data(mri)

# Create a uRegress object, regressing age on ldl
ldlReg <- regress("mean", age ~ ldl, data = mri)

# Get the studentized residuals
residuals(ldlReg, "studentized")

# Get the jackknifed residuals
residuals(ldlReg, "jackknife")

Extract standardized residuals from `uRegress` objects

Description

Extracts standardized residuals from uRegress objects by relying on functionality from the stats package.

Usage

## S3 method for class 'uRegress'
rstandard(model, ...)

Arguments

`model`	an object of class `uRegress`, as returned by regress.
`...`	other arguments to pass to `residuals.uRegress`

Value

a vector of standardized residuals

Extract Studentized residuals from `uRegress` objects

Description

Extracts Studentized residuals from uRegress objects by relying on functionality from the stats package.

Usage

## S3 method for class 'uRegress'
rstudent(model, ...)

Arguments

`model`	an object of class `uRegress`, as returned by regress.
`...`	other arguments to pass to `residuals.uRegress`

Value

a vector of Studentized residuals

Salary dataset

Description

Data from a study of 1,597 faculty members at a single US university. Includes information on monthly salary each year from 1976 through 1995, as well as sex, highest degree attained, year of highest degree, field, year hired, rank, and administrative duties. More information, including a coding key, is available at http://www.emersonstatistics.com/datasets/salary.doc.

Usage

salary

Format

A data frame with 19792 rows and 11 variables:

case: case number
id: identification number for the faculty member
sex: M (male) or F (female)
deg: highest degree attained: PhD, Prof (professional degree, e.g., medicine or law), or Other (Master's or Bachelor's degree)
yrdeg: year highest degree attained
field: Arts (Arts and Humanities), Prof (professional school, e.g., Business, Law, Engineering or Public Affairs), or Other
startyr: year in which the faculty member was hired (2 digits)
year: year (2 digits)
rank: rank of the faculty member in this year: Assist (Assistant), Assoc (Associate), or Full (Full)
admin: Indicator of whether the faculty member had administrative duties (e.g., department chair) in this year: 1 (yes), or 0 (no)
salary: monthly salary of the faculty member in this year in dollars

Source

http://www.emersonstatistics.com/datasets/salary.txt

T-test with Improved Layout

Description

Performs a one- or two-sample t-test using data. In the two-sample case, the user can specify whether or not observations are matched, and whether or not equal variances should be presumed.

Usage

ttest(
  var1,
  var2 = NA,
  by = NA,
  geom = FALSE,
  null.hypoth = 0,
  alternative = "two.sided",
  var.eq = FALSE,
  conf.level = 0.95,
  matched = FALSE,
  more.digits = 0
)

Arguments

`var1`	a (non-empty) numeric vector of data values.
`var2`	an optional (non-empty) numeric vector of data.
`by`	a variable of equal length to that of `var1` with two outcomes. This will be used to define strata for a t-test on `var1`.
`geom`	a logical indicating whether the geometric mean should be calculated and displayed.
`null.hypoth`	a number specifying the null hypothesis for the mean (or difference in means if performing a two-sample test). Defaults to zero.
`alternative`	a string: one of `"less"`, `"two.sided"`, or `"greater"` specifying the form of the test. Defaults to a two-sided test.
`var.eq`	a logical value, either `TRUE` or `FALSE` (default), specifying whether or not equal variances should be presumed in a two-sample t-test. Also controls robust standard errors.
`conf.level`	confidence level of the test. Defaults to 0.95.
`matched`	a logical value, either `TRUE` or `FALSE`, indicating whether or not the variables of a two-sample t-test are matched. Variables must be of equal length.
`more.digits`	a numeric value specifying how many more (or fewer) digits to display in the output. Non-integers are automatically rounded down.

Details

Missing values must be given by NA to be recognized as missing values.

Value

a list of class ttest. The print method lays out the information in an easy-to-read format.

`tab`	A formatted table of descriptive and inferential statistics (total number of observations, number of missing observations, mean, standard error of the mean estimate, standard deviation), along with a confidence interval for the mean.
`df`	Degrees of freedom for the t-test.
`p`	P-value for the t-test.
`tstat`	Test statistic for the t-test.
`var1`	The user-supplied first data vector.
`var2`	The user-supplied second data vector.
`by`	The user-supplied stratification variable.
`par`	A vector of information about the type of test (null hypothesis, alternative hypothesis, etc.)
`geo`	A formatted table of descriptive and inferential statistics for the geometric mean.
`call`	The call made to the `ttest` function.

Examples


# Read in data set
data(psa)
attach(psa)

# Perform t-test
ttest(pretxpsa, null.hypoth = 100, alternative = "greater", more.digits = 1)

# Define new binary variable as indicator
# of whether or not bss was worst possible
bssworst <- bss
bssworst[bss == 1] <- 0
bssworst[bss == 2] <- 0
bssworst[bss == 3] <- 1

# Perform t-test allowing for unequal
# variances between strata
ttest(pretxpsa, by = bssworst)

# Perform matched t-test
ttest(pretxpsa, nadirpsa, matched = TRUE, conf.level = 99/100, more.digits = 1)


T-test Given Summary Statistics with Improved Layout

Description

Performs a one- or two-sample t-test given summary statistics. In the two-sample case, the user can specify whether or not equal variances should be presumed.

Usage

ttesti(
  obs,
  mean,
  sd,
  obs2 = NA,
  mean2 = NA,
  sd2 = NA,
  null.hypoth = 0,
  conf.level = 0.95,
  alternative = "two.sided",
  var.eq = FALSE,
  more.digits = 0
)

Arguments

`obs`	number of observations for the first sample.
`mean`	the sample mean of the first sample.
`sd`	the sample standard deviation of the first sample.
`obs2`	number of observations for the second sample (this is optional).
`mean2`	if `obs2` is supplied, then sample mean of the second sample must be supplied.
`sd2`	if `obs2` is supplied, then sample standard deviation of the second sample must be supplied.
`null.hypoth`	a number specifying the null hypothesis for the mean (or difference in means if performing a two-sample test). Defaults to zero.
`conf.level`	confidence level of the test. Defaults to 0.95.
`alternative`	a string: one of `"less"`, `"two.sided"`, or `"greater"` specifying the form of the test. Defaults to a two-sided test.
`var.eq`	a logical value, either `TRUE` or `FALSE` (default), specifying whether or not equal variances should be presumed in a two-sample t-test.
`more.digits`	a numeric value specifying how many more (or fewer) digits to display in the output. Non-integers are automatically rounded down.

Details

If obs2, mean2, or sd2 is specified, then all three must be specified and a two-sample t-test is run.
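
The two-sample calculation from summary statistics can be sketched in base R as follows. The helper name `welch_from_summary` is hypothetical, and the sketch covers only the unequal-variance (Welch) case with the Satterthwaite degrees of freedom, which corresponds to `var.eq = FALSE`, and only the two-sided alternative. It illustrates the standard formulas and is not the ttesti source.

```r
# Hypothetical helper: Welch t-test from n, mean, and sd of each sample
welch_from_summary <- function(obs, mean, sd, obs2, mean2, sd2,
                               null.hypoth = 0) {
  v1 <- sd^2 / obs
  v2 <- sd2^2 / obs2
  se <- sqrt(v1 + v2)
  tstat <- (mean - mean2 - null.hypoth) / se
  # Satterthwaite approximation to the degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (obs - 1) + v2^2 / (obs2 - 1))
  pval <- 2 * pt(-abs(tstat), df)   # two-sided p-value
  list(tstat = tstat, df = df, p = pval)
}

# Same summary statistics as the two-sample example in this section
welch_from_summary(10, -1.6, 1.5, 30, -0.7, 2.1)
```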

Value

a list of class ttesti. The print method lays out the information in an easy-to-read format.

`tab`	A formatted table of descriptive and inferential statistics (number of observations, mean, standard error of the mean estimate, standard deviation), along with a confidence interval for the mean.
`df`	Degrees of freedom for the t-test.
`p`	P-value for the t-test.
`tstat`	Test statistic for the t-test.
`par`	A vector of information about the type of test (null hypothesis, alternative hypothesis, etc.)
`twosamp`	A logical value indicating whether a two-sample test was performed.
`call`	The call made to the `ttesti` function.

Examples


# t-test given sample descriptives
ttesti(24, 175, 35, null.hypoth=230)

# two-sample test
ttesti(10, -1.6, 1.5, 30, -.7, 2.1)

Create a Partial Formula

Description

Creates a partial formula of the form ~var1 + var2. The partial formula can be named by placing a name and an equals sign before the tilde (for example, ma = ~var1 + var2).

Usage

U(...)

Arguments

`...`	partial formula of the form ~var1 + var2.

Value

A partial formula (potentially named) for use in regress.

Examples


# Reading in a dataset
data(mri)

# Create a named partial formula
U(ma=~male+age)

# Create an unnamed partial formula
U(~male+age)

Wilcoxon Signed Rank and Mann-Whitney-Wilcoxon Rank Sum Test

Description

Performs Wilcoxon signed rank test or Mann-Whitney-Wilcoxon rank sum test depending on data and logicals entered. Relies heavily on the function wilcox.test. Adds formatting and variances.

Usage

wilcoxon(
  var1,
  var2 = NULL,
  alternative = "two.sided",
  null.hypoth = 0,
  paired = FALSE,
  exact = FALSE,
  correct = FALSE,
  conf.int = FALSE,
  conf.level = 0.95
)

Arguments

`var1`	numeric vector of data values. Non-finite (missing or infinite) values will be omitted.
`var2`	optional numeric vector of data values. Non-finite (missing or infinite) values will be omitted.
`alternative`	specifies the alternative hypothesis for the test; acceptable values are `"two.sided"`, `"greater"`, or `"less"`.
`null.hypoth`	the value of the null hypothesis.
`paired`	logical indicating whether the data are paired or not. Default is `FALSE`. If `TRUE`, data must be the same length.
`exact`	logical value indicating whether or not an exact test should be computed.
`correct`	logical indicating whether or not a continuity correction should be used and displayed.
`conf.int`	logical indicating whether or not to calculate and display a confidence interval
`conf.level`	confidence level for the interval. Defaults to 0.95.

Details

In the one-sample case, the returned confidence interval (when conf.int = TRUE) is a confidence interval for the pseudo-median of the underlying distribution. In the two-sample case, the function returns a confidence interval for the median of the difference between samples from the two distributions. See wilcox.test for more information.
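
The pseudo-median mentioned above is the Hodges-Lehmann estimate: the median of all pairwise (Walsh) averages. The base R sketch below computes it by hand for a set of made-up paired differences and compares it with the estimate from stats::wilcox.test. The data are hypothetical and chosen to have no zeros and no tied absolute values, so that wilcox.test uses its exact method.

```r
# Hypothetical paired differences (no zeros, no tied absolute values)
d <- c(1.2, -0.5, 2.3, 0.8, 3.1, -1.7, 0.4, 2.9)

# Walsh averages: (d_i + d_j) / 2 for all pairs with i <= j
walsh <- outer(d, d, "+") / 2
walsh <- walsh[!lower.tri(walsh)]

# Pseudo-median (Hodges-Lehmann estimate)
pseudo_median <- median(walsh)
pseudo_median

# One-sample signed rank test on the differences, with interval
wt <- wilcox.test(d, conf.int = TRUE)
wt$estimate
wt$conf.int
```

For a symmetric distribution the pseudo-median coincides with the median, which is why the interval is often read as an interval for the center of the differences.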

Value

A list of class wilcoxon is returned. The print method lays out the information in an easy-to-read format.

`statistic`	the value of the test statistic with a name describing it.
`parameter`	the parameter(s) for the exact distribution of the test statistic.
`p.value`	the p-value for the test (calculated for the test statistic).
`null.value`	the parameter `null.hypoth`.
`alternative`	character string describing the alternative hypothesis.
`method`	the type of test applied.
`data.name`	a character string giving the names of the data.
`conf.int`	a confidence interval for the location parameter (only present if the argument `conf.int=TRUE`).
`estimate`	an estimate of the location parameter (only present if the argument `conf.int=TRUE`).
`table`	a formatted table of rank sum and number of observation values, for printing.
`vars`	a formatted table of variances, for printing.
`hyps`	a formatted table of the hypotheses, for printing.
`inf`	a formatted table of inference values, for printing.

Examples


#- Create the data -#
cf <- c(1153, 1132, 1165, 1460, 1162, 1493, 1358, 1453, 1185, 1824, 1793, 1930, 2075)
healthy <- c(996, 1080, 1182, 1452, 1634, 1619, 1140, 1123, 1113, 1463, 1632, 1614, 1836)

#- Perform the paired (signed rank) test -#
wilcoxon(cf, healthy, paired=TRUE)

#- Perform the unpaired (rank sum) test with a confidence interval -#
wilcoxon(cf, healthy, conf.int=TRUE)
