| Title: | Bayesian Treed Distributed Lag Models |
|---|---|
| Description: | Estimation of distributed lag models (DLMs) based on a Bayesian additive regression trees framework. Includes several extensions of DLMs: treed DLMs and distributed lag mixture models (Mork and Wilson, 2023) <doi:10.1111/biom.13568>; treed distributed lag nonlinear models (Mork and Wilson, 2022) <doi:10.1093/biostatistics/kxaa051>; heterogeneous DLMs (Mork, et. al., 2024) <doi:10.1080/01621459.2023.2258595>; monotone DLMs (Mork and Wilson, 2024) <doi:10.1214/23-BA1412>. The package also includes visualization tools and a 'shiny' interface to check model convergence and to help interpret results. |
| Authors: | Daniel Mork [aut, cre, cph] (ORCID: <https://orcid.org/0000-0002-7924-0706>), Seongwon Im [aut] (ORCID: <https://orcid.org/0009-0000-8447-5852>), Ander Wilson [aut] (ORCID: <https://orcid.org/0000-0003-4774-3883>) |
| Maintainer: | Daniel Mork <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.1.1 |
| Built: | 2026-05-11 20:16:31 UTC |
| Source: | https://github.com/danielmork/dlmtree |
Estimates the marginal effects of an exposure while accounting for expected changes in co-occurring exposures at the same time point. Values of co-occurring exposures are modeled nonlinearly using a spline model with predictions made at the lower an upper values for the exposure of interest.
adj_coexposure( exposure.data, object, contrast_perc = c(0.25, 0.75), contrast_exp = list(), conf.level = 0.95, keep.mcmc = FALSE, verbose = TRUE )adj_coexposure( exposure.data, object, contrast_perc = c(0.25, 0.75), contrast_exp = list(), conf.level = 0.95, keep.mcmc = FALSE, verbose = TRUE )
exposure.data |
Named list of exposure matrices used as input to TDLMM. |
object |
Model output for TDLMM from dlmtree() function. |
contrast_perc |
2-length vector of percentiles or named list corresponding to lower and upper exposure percentiles of interest. Names must equal list names in 'exposure.data'. |
contrast_exp |
Named list consisting lower and upper exposure values. This takes precedence over contrast_perc if both inputs are used. |
conf.level |
Confidence level used for estimating credible intervals. Default is 0.95. |
keep.mcmc |
If TRUE, return posterior samples. |
verbose |
TRUE (default) or FALSE: print output |
adj_coexposure
A list with the following components (or posterior samples if keep.mcmc = TRUE):
Name |
vector of exposure names |
Time |
integer vector of lags |
Effect |
posterior mean of marginal effects |
SE |
standard error of the estimate |
Lower |
lower bound of credible interval of the marginal effect estimate |
Upper |
upper bound of credible interval of the marginal effect estimate |
cEffect |
cumulative marginal effects |
cLower |
lower bound of credible interval of the cumulative marginal effect |
cUpper |
upper bound of credible interval of the cumulative marginal effect |
CW |
boolean vector indicating critical window |
Matrix of five different exposures, each measured over 40 weeks.
data(coExp)data(coExp)
matrix
https://aqs.epa.gov/aqsweb/airdata/download_files.html
https://www.epa.gov/outdoor-air-quality-data
Method for combining information from DLMs of single exposure
combine.models(mlist)combine.models(mlist)
mlist |
a list of models |
combine.models
A data frame with model fit information of the models included in the list
Method for combining information from DLMs of mixture exposures.
combine.models.tdlmm(mlist)combine.models.tdlmm(mlist)
mlist |
a list of models |
combine.models.tdlmm
A data frame with model fit information of the models included in the list
fast set intersection tool assumes sorted vectors A and B
cppIntersection(A, B)cppIntersection(A, B)
A |
sorted integer vector A |
B |
sorted integer vector B |
vector of resulting intersection
diagnose generic function for S3method
## S3 method for class 'summary.hdlm' diagnose(x, ...) ## S3 method for class 'summary.hdlmm' diagnose(x, ...) ## S3 method for class 'summary.monotone' diagnose(x, ...) ## S3 method for class 'summary.tdlm' diagnose(x, ...) ## S3 method for class 'summary.tdlmm' diagnose(x, ...) ## S3 method for class 'summary.tdlnm' diagnose(x, ...) diagnose(x, ...)## S3 method for class 'summary.hdlm' diagnose(x, ...) ## S3 method for class 'summary.hdlmm' diagnose(x, ...) ## S3 method for class 'summary.monotone' diagnose(x, ...) ## S3 method for class 'summary.tdlm' diagnose(x, ...) ## S3 method for class 'summary.tdlmm' diagnose(x, ...) ## S3 method for class 'summary.tdlnm' diagnose(x, ...) diagnose(x, ...)
x |
a summary object resulting from summary() applied to an object of class 'tdlm', 'tdlmm', 'tdlnm', 'hdlm', 'hdlmm', 'monotone' |
... |
not used. |
shiny interface for assessing model convergence. The interface includes tabs for MCMC diagnostics such as trace plots, density plots, and convergence measures for distributed lag effects, DLM tree sizes, and hyperparameters.
Calculates the distributed lag effect with DLM matrix for linear models.
dlmEst(dlm, nlags, nsamp)dlmEst(dlm, nlags, nsamp)
dlm |
A numeric matrix containing the model fit information |
nlags |
total number of lags |
nsamp |
number of mcmc iterations |
A cube object of lag effect x lag x mcmc
The 'dlmtree' function accommodates various response variable types, including continuous, binary, and zero-inflated count values. The function is designed to handle both single exposure and exposure mixtures. For a single exposure, users are offered options to model non-linear effects (tdlnm), linear effects (tdlm), or heterogeneous subgroup/individualized effects (hdlm). In the case of exposure mixtures, the function supports lagged interactions (tdlmm), and heterogeneous subgroup/individualized effects (hdlmm) allowing for a comprehensive exploration of mixture exposure heterogeneity. Additionally, users can fine-tune parameters to impose effect shrinkage and perform exposure selection, enhancing the adaptability and precision of the modeling process. For more detailed documentation, visit: dlmtree website.
dlmtree( formula, data, exposure.data, dlm.type = "linear", family = "gaussian", mixture = FALSE, het = FALSE, control.mcmc = list(), control.hyper = list(), control.family = list(), control.tdlnm = list(), control.het = list(), control.mix = list(), control.monotone = list(), control.diagnose = list() )dlmtree( formula, data, exposure.data, dlm.type = "linear", family = "gaussian", mixture = FALSE, het = FALSE, control.mcmc = list(), control.hyper = list(), control.family = list(), control.tdlnm = list(), control.het = list(), control.mix = list(), control.monotone = list(), control.diagnose = list() )
formula |
object of class formula, a symbolic description of the fixed effect model to be fitted, e.g. y ~ a + b. |
data |
data frame containing variables used in the formula. |
exposure.data |
numerical matrix of exposure data with same length as data, for a mixture setting (tdlmm, hdlmm): named list containing equally sized numerical matrices of exposure data having same length as data. |
dlm.type |
dlm model specification: "linear" (default), "nonlinear", "monotone". |
family |
'gaussian' for continuous response, 'logit' for binomial, 'zinb' for zero-inflated negative binomial. |
mixture |
flag for mixture, set to TRUE for tdlmm and hdlmm (default: FALSE). |
het |
flag for heterogeneity, set to TRUE for hdlm and hdlmm (default: FALSE). |
control.mcmc |
list of MCMC control parameters. This is passed to dlmtree.control.mcmc. |
control.hyper |
list of hyperparameter control parameters. This is passed to dlmtree.control.hyper |
control.family |
list of family control parameters. This is passed to dlmtree.control.family |
control.tdlnm |
list of TDLNM control parameters. This is passed to dlmtree.control.tdlnm |
control.het |
list of control parameters for heterogeneous models. This is passed to dlmtree.control.het |
control.mix |
list of mixture control parameters. This is passed to dlmtree.control.mix |
control.monotone |
list of control parameters for monotone model. This is passed to dlmtree.control.monotone |
control.diagnose |
list of control parameters for diagnostics. This is passed to dlmtree.control.diagnose |
dlmtree
Model is recommended to be run for at minimum 5000 burn-in iterations followed by 15000 sampling iterations with a thinning factor of 5. Convergence can be checked by re-running the model and validating consistency of results. Examples are provided below for the syntax for running different types of models. For more examples, visit: dlmtree website.
object of one of the classes: tdlm, tdlmm, tdlnm, hdlm, hdlmm, monotone
# The first three examples are for one lagged exposure # treed distributed lag model (TDLM) # binary outcome with logit link D <- sim.tdlmm(sim = "A", mean.p = 0.5, n = 1000) tdlm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures[[1]], dlm.type = "linear", family = "logit", control.family = list(binomial.size = 1)) # summarize results tdlm.sum <- summary(tdlm.fit) tdlm.sum # plot results plot(tdlm.sum) # Treed distributed lag nonlinear model (TDLNM) # Gaussian regression model D <- sim.tdlnm(sim = "A", error.to.signal = 1) tdlnm.fit <- dlmtree(formula = y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "nonlinear", family = "gaussian") # summarize results tdlnm.sum <- summary(tdlnm.fit) tdlnm.sum # plot results plot(tdlnm.sum) # Heterogeneous TDLM (HDLM), similar to first example but with heterogeneous exposure response D <- sim.hdlmm(sim = "B", n = 1000) hdlm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", het = TRUE) # summarize results hdlm.sum <- summary(hdlm.fit) hdlm.sum # shiny app for HDLM if (interactive()) { shiny(hdlm.fit) } # The next two examples are for a mixture (or multivariate) exposure # Treed distributed lag mixture model (TDLMM) # Model for mixutre (or multivariate) lagged exposures # with a homogenious exposure-time-response function D <- sim.tdlmm(sim = "B", error = 25, n = 1000) tdlmm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, control.mix = list(interactions = "noself"), dlm.type = "linear", family = "gaussian", mixture = TRUE) # summarize results tdlmm.sum <- summary(tdlmm.fit) # plot the marginal exposure-response for one exposure plot(tdlmm.sum, exposure1 = "e1") # plot exposure-response surface plot(tdlmm.sum, exposure1 = "e1", exposure2 = "e2") # heterogeneous version of TDLMM D <- sim.hdlmm(sim = "D", n = 1000) hdlmm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", mixture = TRUE, het = TRUE) # summarize results hdlmm.sum <- summary(hdlmm.fit) hdlmm.sum # summarize results if (interactive()) { shiny(hdlmm.fit) }# The first three examples are for one lagged exposure # treed distributed lag model (TDLM) # binary outcome with logit link D <- sim.tdlmm(sim = "A", mean.p = 0.5, n = 1000) tdlm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures[[1]], dlm.type = "linear", family = "logit", control.family = list(binomial.size = 1)) # summarize results tdlm.sum <- summary(tdlm.fit) tdlm.sum # plot results plot(tdlm.sum) # Treed distributed lag nonlinear model (TDLNM) # Gaussian regression model D <- sim.tdlnm(sim = "A", error.to.signal = 1) tdlnm.fit <- dlmtree(formula = y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "nonlinear", family = "gaussian") # summarize results tdlnm.sum <- summary(tdlnm.fit) tdlnm.sum # plot results plot(tdlnm.sum) # Heterogeneous TDLM (HDLM), similar to first example but with heterogeneous exposure response D <- sim.hdlmm(sim = "B", n = 1000) hdlm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", het = TRUE) # summarize results hdlm.sum <- summary(hdlm.fit) hdlm.sum # shiny app for HDLM if (interactive()) { shiny(hdlm.fit) } # The next two examples are for a mixture (or multivariate) exposure # Treed distributed lag mixture model (TDLMM) # Model for mixutre (or multivariate) lagged exposures # with a homogenious exposure-time-response function D <- sim.tdlmm(sim = "B", error = 25, n = 1000) tdlmm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, control.mix = list(interactions = "noself"), dlm.type = "linear", family = "gaussian", mixture = TRUE) # summarize results tdlmm.sum <- summary(tdlmm.fit) # plot the marginal exposure-response for one exposure plot(tdlmm.sum, exposure1 = "e1") # plot exposure-response surface plot(tdlmm.sum, exposure1 = "e1", exposure2 = "e2") # heterogeneous version of TDLMM D <- sim.hdlmm(sim = "D", n = 1000) hdlmm.fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", mixture = TRUE, het = TRUE) # summarize results hdlmm.sum <- summary(hdlmm.fit) hdlmm.sum # summarize results if (interactive()) { shiny(hdlmm.fit) }
Diagnostic control settings for dlmtree model fitting
dlmtree.control.diagnose( subset = NULL, lowmem = FALSE, verbose = TRUE, save.data = TRUE, diagnostics = FALSE, initial.params = NULL )dlmtree.control.diagnose( subset = NULL, lowmem = FALSE, verbose = TRUE, save.data = TRUE, diagnostics = FALSE, initial.params = NULL )
subset |
integer vector to analyze only a subset of data and exposures. |
lowmem |
TRUE or FALSE (default): turn on memory saver for DLNM, slower computation time. |
verbose |
TRUE (default) or FALSE: print output |
save.data |
TRUE (default) or FALSE: save data used for model fitting. This must be set to TRUE to use shiny() function on hdlm or hdlmm |
diagnostics |
TRUE or FALSE (default) keep model diagnostic such as the number of terminal nodes and acceptance ratio. |
initial.params |
initial parameters for fixed effects model, FALSE = none (default), "glm" = generate using GLM, or user defined, length must equal number of parameters in fixed effects model. |
list of control parameters for diagnostics.
Family control settings for dlmtree model fitting
dlmtree.control.family(binomial.size = 1, formula.zi = NULL)dlmtree.control.family(binomial.size = 1, formula.zi = NULL)
binomial.size |
integer type scalar (if all equal, default: 1) or vector defining binomial size for 'logit' family. |
formula.zi |
(only applies to family = 'zinb') object of class formula, a symbolic description of the fixed effect of zero-inflated (ZI) model to be fitted, e.g. y ~ a + b. This only applies to ZINB where covariates for ZI model are different from NB model. This is set to the argument 'formula' by default. |
list of family control parameters.
Control settings for dlmtree model fitting, when used for heterogeneous models
dlmtree.control.het( modifiers = "all", modifier.splits = 20, modtree.params = c(0.95, 2), modtree.step.prob = c(0.25, 0.25, 0.25), dlmtree.type = "shared", selection.prior = 0.5 )dlmtree.control.het( modifiers = "all", modifier.splits = 20, modtree.params = c(0.95, 2), modtree.step.prob = c(0.25, 0.25, 0.25), dlmtree.type = "shared", selection.prior = 0.5 )
modifiers |
string vector containing desired modifiers to be included in a modifier tree. The strings in the vector must match the names of the columns of the data. By default, a modifier tree considers all covariates in the formula as modifiers unless stated otherwise. |
modifier.splits |
integer value to determine the possible number of splitting points that will be used for a modifier tree. |
modtree.params |
numerical vector of alpha and beta hyperparameters controlling modifier tree depth. (default: alpha = 0.95, beta = 2) |
modtree.step.prob |
numerical vector for probability of each step for modifier tree updates: 1) grow, 2) prune, 3) change. (default: c(0.25, 0.25, 0.25)) |
dlmtree.type |
specification of dlmtree type for HDLM: shared (default) or nested. |
selection.prior |
scalar hyperparameter for sparsity of modifiers. Must be between 0.5 and 1. Smaller value corresponds to increased sparsity of modifiers. |
list of control parameters for heterogeneous models.
Hyperparameter control settings for dlmtree model fitting
dlmtree.control.hyper( shrinkage = "all", params = c(0.95, 2), step.prob = c(0.25, 0.25) )dlmtree.control.hyper( shrinkage = "all", params = c(0.95, 2), step.prob = c(0.25, 0.25) )
shrinkage |
character "all" (default), "trees", "exposures", "none", turns on horseshoe-like shrinkage priors for different parts of model. |
params |
numerical vector of alpha and beta hyperparameters controlling dlm tree depth. (default: alpha = 0.95, beta = 2) |
step.prob |
numerical vector for probability of each step for dlm tree updates: 1) grow/prune, 2) change, 3) switch exposure. (default: c(0.25, 0.25, 0.25)) |
list of hyperparameter control parameters.
MCMC control settings for dlmtree model fitting
dlmtree.control.mcmc(n.trees = 20, n.burn = 1000, n.iter = 2000, n.thin = 10)dlmtree.control.mcmc(n.trees = 20, n.burn = 1000, n.iter = 2000, n.thin = 10)
n.trees |
integer for number of trees in ensemble. |
n.burn |
integer for length of MCMC burn-in. |
n.iter |
integer for number of MCMC iterations to run model after burn-in. |
n.thin |
integer MCMC thinning factor, i.e. keep every tenth iteration. |
list of MCMC control parameters.
Control settings for dlmtree model fitting, when used for mixture models
dlmtree.control.mix(interactions = "noself", sparsity.prior = 1)dlmtree.control.mix(interactions = "noself", sparsity.prior = 1)
interactions |
'noself' (default) which estimates interactions only between two different exposures, 'all' which also allows interactions within the same exposure, or 'none' which eliminates all interactions and estimates only main effects of each exposure. |
sparsity.prior |
positive scalar hyperparameter for sparsity of exposures. (default: 1) |
list of mixture control parameters.
Control settings for dlmtree model fitting, when used for monotone model
dlmtree.control.monotone( gamma0 = NULL, sigma = NULL, tree.time.params = c(0.95, 2), tree.exp.params = c(0.95, 2), time.kappa = NULL )dlmtree.control.monotone( gamma0 = NULL, sigma = NULL, tree.time.params = c(0.95, 2), tree.exp.params = c(0.95, 2), time.kappa = NULL )
gamma0 |
vector (with length equal to number of lags) of means for logit-transformed prior probability of split at each lag; e.g., gamma_0l = 0 implies mean prior probability of split at lag l = 0.5. |
sigma |
symmetric matrix (usually with only diagonal elements) corresponding to gamma_0 to define variances on prior probability of split; e.g., gamma_0l = 0 with lth diagonal element of sigma=2.701 implies that 95% of the time the prior probability of split is between 0.005 and 0.995, as a second example setting gamma_0l=4.119 and the corresponding diagonal element of sigma=0.599 implies that 95% of the time the prior probability of a split is between 0.8 and 0.99. |
tree.time.params |
numerical vector of hyperparameters for monotone time tree. |
tree.exp.params |
numerical vector of hyperparameters for monotone exposure tree. |
time.kappa |
scaling factor in dirichlet prior that goes alongside 'time.split.prob' to control the amount of prior information given to the model for deciding probabilities of splits between adjacent lags. |
list of control parameters for monotone model.
Control settings for dlmtree model fitting, when used for TDLNM
dlmtree.control.tdlnm( exposure.splits = 20, time.split.prob = NULL, exposure.se = NULL )dlmtree.control.tdlnm( exposure.splits = 20, time.split.prob = NULL, exposure.se = NULL )
exposure.splits |
scalar indicating the number of splits (divided evenly across quantiles of the exposure data) or list with two components: 'type' = 'values' or 'quantiles', and 'split.vals' = a numerical vector indicating the corresponding exposure values or quantiles for splits. |
time.split.prob |
probability vector of a spliting probabilities for time lags. (default: uniform probabilities) |
exposure.se |
numerical matrix of exposure standard errors with same size as exposure.data or a scalar smoothing factor representing a uniform smoothing factor applied to each exposure measurement. (default: sd(exposure.data)/2) |
list of TDLNM control parameters.
dlmtree model with fixed Gaussian process approach
dlmtreeGPFixedGaussian(model)dlmtreeGPFixedGaussian(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with Gaussian process approach
dlmtreeGPGaussian(model)dlmtreeGPGaussian(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with shared HDLM approach
dlmtreeHDLMGaussian(model)dlmtreeHDLMGaussian(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with HDLMM approach
dlmtreeHDLMMGaussian(model)dlmtreeHDLMMGaussian(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with nested HDLM approach
dlmtreeTDLM_cpp(model)dlmtreeTDLM_cpp(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with fixed Gaussian approach
dlmtreeTDLMFixedGaussian(model)dlmtreeTDLMFixedGaussian(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with nested Gaussian approach
dlmtreeTDLMNestedGaussian(model)dlmtreeTDLMNestedGaussian(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
Calculates the distributed lag effect with DLM matrix for non-linear models.
dlnmEst(dlnm, predAt, nlags, nsamp, center, se)dlnmEst(dlnm, predAt, nlags, nsamp, center, se)
dlnm |
A numeric matrix containing the model fit information |
predAt |
Number of splits in the model |
nlags |
total number of lags |
nsamp |
number of mcmc iterations |
center |
center parameter |
se |
Standard error parameter |
A cube object of lag effect x lag x mcmc
Calculates the distributed lag effect with DLM matrix for non-linear models.
dlnmPLEst(dlnm, predAt, nlags, nsamp, center)dlnmPLEst(dlnm, predAt, nlags, nsamp, center)
dlnm |
A numeric matrix containing the model fit information |
predAt |
Number of splits in the model |
nlags |
total number of lags |
nsamp |
number of mcmc iterations |
center |
center parameter |
A cube object of lag effect x lag x mcmc
A recursive method for drawing a new tree structure
drawTree(depth, alpha, beta)drawTree(depth, alpha, beta)
depth |
depth of a tree |
alpha |
tree shape parameter, 0 < alpha < 1 |
beta |
tree size parameter, beta > 0 |
drawTree
integer value of number of terminal nodes
Method for calculating subgroup-specific lag effects for heterogeneous models: HDLM, HDLMM
estDLM( object, new.data, group.index, conf.level = 0.95, exposure = NULL, keep.mcmc = FALSE, mem.safe = FALSE, verbose = TRUE )estDLM( object, new.data, group.index, conf.level = 0.95, exposure = NULL, keep.mcmc = FALSE, mem.safe = FALSE, verbose = TRUE )
object |
object of a model fit. Must be 'hdlm' or 'hdlmm' |
new.data |
data frame with new observations with the same number of modifiers |
group.index |
list of index (row numbers) for subgroup specification |
conf.level |
confidence level for credible interval of effects |
exposure |
exposure of interest for 'hdlmm' method |
keep.mcmc |
store mcmc in the output |
mem.safe |
boolean memory parameter for rule index |
verbose |
TRUE (default) or FALSE: print output |
estDLM
A list with the following components:
conf.level |
Specified confidence level |
mod |
a list of modifers with a vector of values from the model |
n |
Number of observation per specified subgroup |
groupIndex |
list of index (row numbers) for specified subgroup |
dlmMean |
distributed lag effects per subgroups |
dlmCI |
credible intervals for distributed lag effects per subgroups |
dlmCum |
cumulative effects per subgroups |
dlFunction |
type of DLM class |
plotData |
data frame built for easier visualization of distributed lag effects for each subgroup (facet) |
Matrix containing pairwise covariances for real exposure data consisting of five different exposures, each measured over 37 weeks.
data(exposureCov)data(exposureCov)
matrix
https://aqs.epa.gov/aqsweb/airdata/download_files.html
https://www.epa.gov/outdoor-air-quality-data
Download simulated data for dlmtree articles
get_sbd_dlmtree()get_sbd_dlmtree()
A data frame with 10000 rows (observations) and 202 variables. All data is simulated. The variables are:
bwgaz |
Outcome to be used. Simulated birth weight for gestational age z-score. |
ChildSex |
Binary sex of child. |
MomAge |
Continuous age in years. |
GestAge |
Continuous estimated gestational age at birth in weeks. |
MomHeightIn |
Continuous maternal height in inches. |
MomPriorWeightLbs |
Continuous mothers pre-pregnancy weight in pounds. |
MomPriorBMI |
Continuous mothers pre-pregnancy BMI. |
race |
Categorical race. |
Hispanic |
Binary indicator of Hispanic. |
MomEdu |
Categorical maternal highest educational attainment. |
SmkAny |
Binary indicator of any smoking during pregnancy. |
Marital |
Categorical maternal marital status. |
Income |
Categorical income. |
EstDateConcept |
Estimated date of conception. |
EstMonthConcept |
Estimated month of conception. |
EstYearConcept |
Estimated year of conception. |
pm25_1 - pm25_37 |
Weekly average exposure to PM2.5 for weeks 1 to 37. The columns are already scaled by the exposure IQR of 0.35. |
no2_1 - no2_37 |
Weekly average exposure to NO2 for weeks 1 to 37. The columns are already scaled by the exposure IQR of 9.13. |
so2_1 - so2_37 |
Weekly average exposure to SO2 for weeks 1 to 37. The columns are already scaled by the exposure IQR of 0.96. |
co2_1 - co2_37 |
Weekly average exposure to CO for weeks 1 to 37. The columns are already scaled by the exposure IQR of 0.15. |
temp_1 - temp_37 |
Weekly average exposure to temperature for weeks 1 to 37. The columns are already scaled by the exposure IQR of 27.93 |
source |
Variable indicating that the data came from the bdlim package. |
sbd_dlmtree <- get_sbd_dlmtree()sbd_dlmtree <- get_sbd_dlmtree()
Calculates the lagged interaction effects with MIX matrix for linear models.
mixEst(dlm, nlags, nsamp)mixEst(dlm, nlags, nsamp)
dlm |
A numeric matrix containing the model fit information |
nlags |
total number of lags |
nsamp |
number of mcmc iterations |
A cube object of interaction effect x lag x mcmc
dlmtree model with monotone tdlnm approach
monotdlnm_Cpp(model)monotdlnm_Cpp(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
Method for calculating posterior inclusion probabilities (PIPs) for modifiers in HDLM & HDLMM
pip(object, type = 1)pip(object, type = 1)
object |
An object of class dlmtree. |
type |
Type=1 indicates single modifier PIPs. Type=2 indicates joint modifier PIPs for two modifiers. |
pip
numeric vector of PIPs named with modifiers (type=1) or data.frame of PIPs with the following columns (type=2):
var1 |
first modifier of joint modifiers |
var2 |
second modifier of joint modifiers |
pip |
joint PIPs for the two modifiers |
# Posterior inclusion probability with HDLM D <- sim.hdlmm(sim = "B", n = 1000) fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", het = TRUE) pip(fit) pip(fit, type = 2)# Posterior inclusion probability with HDLM D <- sim.hdlmm(sim = "B", n = 1000) fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", het = TRUE) pip(fit) pip(fit, type = 2)
Method for returning variety of plots for model summary of class 'monotone'
## S3 method for class 'summary.monotone' plot(x, plot.type = "mean", val = c(), time = c(), ...)## S3 method for class 'summary.monotone' plot(x, plot.type = "mean", val = c(), time = c(), ...)
x |
object of class 'summary.monotone', output of summary of 'monotone' |
plot.type |
string indicating plot type, options are 'mean' (default) which shows mean exposure-time response surface, 'se', 'ci-min', 'ci-max', 'slice' which takes a slice of the plot at a given 'val' or 'time', 'animate' which creates a animation of slices of the surface plot across exposure values (requires package gganimate) |
val |
exposure value for slice plot |
time |
time value for slice plot |
... |
additional parameters to alter plots: 'main', 'xlab', 'ylab', 'flab' which sets the effect label for surface plots, 'start.time' which sets the first time value |
plot.summary.monotone
A plot of distributed lag effect estimated with monotone-TDLNM
Method for plotting a distributed lag function for model summary of 'tdlm'
## S3 method for class 'summary.tdlm' plot(x, ...)## S3 method for class 'summary.tdlm' plot(x, ...)
x |
object of class 'summary.tdlm', output of summary of 'tdlm' |
... |
additional plotting parameters for title and labels 'start.time' which sets the first time value |
plot.summary.tdlm
A plot of distributed lag effect estimated with tdlm
Method for plotting DLMMs for model summary of class 'tdlmm'. Includes plots for marginal exposure effects as well as interactions between two exposures.
## S3 method for class 'summary.tdlmm' plot( x, type = "marginal", exposure1 = NULL, exposure2 = NULL, time1 = c(), time2 = c(), show.cw = TRUE, cw.plots.only = TRUE, trueDLM = NULL, scale = NULL, ... )## S3 method for class 'summary.tdlmm' plot( x, type = "marginal", exposure1 = NULL, exposure2 = NULL, time1 = c(), time2 = c(), show.cw = TRUE, cw.plots.only = TRUE, trueDLM = NULL, scale = NULL, ... )
x |
an object of type 'summary.tdlmm' from summary.tdlmm() output |
type |
plot type, 'marginal' (default) |
exposure1 |
exposure for plotting DLM |
exposure2 |
exposure paired with 'exposure1' for plotting interaction |
time1 |
plot a cross section from an interaction plot at specific time for 'exposure1' |
time2 |
plot a cross section from an interaction plot at specific time for 'exposure2' |
show.cw |
indicate location of critical windows in interaction plot with red points |
cw.plots.only |
show only plots with critical windows |
trueDLM |
A vector of true effects that can be obtained from the simulated data. Only applicable for simulation studies |
scale |
default = NULL, if scale is not NULL, the effects are exponentiated |
... |
additional plotting parameters for title and labels |
plot.summary.tdlmm
A plot of distributed lag effect or interaction surface estimated with tdlmm
Method for returning variety of plots for model summary of class 'tdlnm'
## S3 method for class 'summary.tdlnm' plot(x, plot.type = "mean", val = c(), time = c(), ...)## S3 method for class 'summary.tdlnm' plot(x, plot.type = "mean", val = c(), time = c(), ...)
x |
object of class 'summary.tdlnm', output of summary of 'tdlnm' |
plot.type |
string indicating plot type, options are 'mean' (default) which shows mean exposure-time response surface, 'cumulative' which shows the cumulative effects per exposure-concentration level, 'effect' which returns a grid of exposure concentration and lag to determine if the credible interval contains zero, with the direction of the effect indicated, 'se', 'ci-min', 'ci-max', 'slice' which takes a slice of the plot at a given 'val' or 'time', 'animate' which creates a animation of slices of the surface plot across exposure values (requires package gganimate) |
val |
exposure value for slice plot |
time |
time value for slice plot |
... |
additional plotting parameters for title and labels 'flab' which sets the effect label for surface plots, 'start.time' which sets the first time value |
plot.summary.tdlnm
A plot of distributed lag effect estimated with tdlnm
Data.frame containing a sample of weekly average PM2.5 exposures across a range of states/counties. The PM2.5 data was downloaded from US EPA (https://aqs.epa.gov/aqsweb/airdata/download_files.html) daily data summaries and averaged by week. Forty-week ranges were assess for non-missingness and grouped for this dataset.
data(pm25Exposures)data(pm25Exposures)
data.frame; columns: S = state, C = city, 1-40 = weekly exposure data
https://aqs.epa.gov/aqsweb/airdata/download_files.html
https://www.epa.gov/outdoor-air-quality-data
Method for making a 'pretty' output of a group of numbers. For example: 2,3,4,5,8,9,12,15,16 becomes 2-5,8-9,12,15-16
ppRange(r)ppRange(r)
r |
set of integers to make 'pretty' |
ppRange
character string of values representing 'r'
predict generic function for S3method
predict( x, new.data, new.exposure.data, ci.level = 0.95, type = "response", outcome = NULL, fixed.idx = list(), est.dlm = FALSE, verbose = TRUE, ... ) ## S3 method for class 'hdlm' predict( x, new.data, new.exposure.data, ci.level = 0.95, type = "response", outcome = NULL, fixed.idx = list(), est.dlm = FALSE, verbose = TRUE, ... ) ## S3 method for class 'hdlmm' predict( x, new.data, new.exposure.data, ci.level = 0.95, type = "response", outcome = NULL, fixed.idx = list(), est.dlm = FALSE, verbose = TRUE, ... )predict( x, new.data, new.exposure.data, ci.level = 0.95, type = "response", outcome = NULL, fixed.idx = list(), est.dlm = FALSE, verbose = TRUE, ... ) ## S3 method for class 'hdlm' predict( x, new.data, new.exposure.data, ci.level = 0.95, type = "response", outcome = NULL, fixed.idx = list(), est.dlm = FALSE, verbose = TRUE, ... ) ## S3 method for class 'hdlmm' predict( x, new.data, new.exposure.data, ci.level = 0.95, type = "response", outcome = NULL, fixed.idx = list(), est.dlm = FALSE, verbose = TRUE, ... )
x |
fitted dlmtree model with class 'hdlm', 'hdlmm' |
new.data |
new data frame which contains the same covariates and modifiers used to fit the model |
new.exposure.data |
new data frame/list which contains the same length of exposure lags used to fit the model |
ci.level |
credible interval level for posterior predictive distribution |
type |
type of prediction: "response" (default) or "waic". "waic" must be specified with 'outcome' parameter |
outcome |
outcome required for WAIC calculation |
fixed.idx |
fixed index |
est.dlm |
flag for estimating dlm effect |
verbose |
TRUE (default) or FALSE: print output |
... |
not used |
list with the following elements:
posterior predictive mean of fixed effect
lower/upper bound of posterior predictive distribution of fixed effect
estimated exposure effect
lower bound of estimated exposure effect
upper bound of estimated exposure effect
posterior predictive mean of exposure effect
lower/upper bound of posterior predictive distribution of exposure effect
posterior predictive mean
lower/upper bound of posterior predictive distribution
print generic function for S3method
print(x, ...) ## S3 method for class 'tdlnm' print(x, ...) ## S3 method for class 'tdlm' print(x, ...) ## S3 method for class 'tdlmm' print(x, ...) ## S3 method for class 'hdlm' print(x, ...) ## S3 method for class 'hdlmm' print(x, ...) ## S3 method for class 'monotone' print(x, ...) ## S3 method for class 'summary.hdlm' print(x, digits = 3, ...) ## S3 method for class 'summary.hdlmm' print(x, digits = 3, ...) ## S3 method for class 'summary.monotone' print(x, digits = 3, ...) ## S3 method for class 'summary.tdlm' print(x, digits = 3, ...) ## S3 method for class 'summary.tdlmm' print(x, digits = 3, ...) ## S3 method for class 'summary.tdlnm' print(x, digits = 3, ...)print(x, ...) ## S3 method for class 'tdlnm' print(x, ...) ## S3 method for class 'tdlm' print(x, ...) ## S3 method for class 'tdlmm' print(x, ...) ## S3 method for class 'hdlm' print(x, ...) ## S3 method for class 'hdlmm' print(x, ...) ## S3 method for class 'monotone' print(x, ...) ## S3 method for class 'summary.hdlm' print(x, digits = 3, ...) ## S3 method for class 'summary.hdlmm' print(x, digits = 3, ...) ## S3 method for class 'summary.monotone' print(x, digits = 3, ...) ## S3 method for class 'summary.tdlm' print(x, digits = 3, ...) ## S3 method for class 'summary.tdlmm' print(x, digits = 3, ...) ## S3 method for class 'summary.tdlnm' print(x, digits = 3, ...)
x |
An object of class 'tdlm', 'tdlmm', 'tdlnm', 'hdlm', 'hdlmm', 'monotone', representing a fitted model using dlmtree(); or a summary object produced by applying summary() to one of these model objects. |
... |
additional parameters |
digits |
number of decimal places to round the numeric values to |
For a fitted model object, prints an assorted model output including model formula call and available methods. For a summary object, prints a summary output of a model fit in the R console.
Multiple draw polya gamma latent variable for var c[i] with size b[i]
rcpp_pgdraw(b, z)rcpp_pgdraw(b, z)
b |
vector of binomial sizes |
z |
vector of parameters |
Eigen::VectorXd
Truncated multivariate normal sampler, mean mu, cov sigma, truncated (0, Inf)
rtmvnorm(mu, sigma, iter)rtmvnorm(mu, sigma, iter)
mu |
vector of mean parameters |
sigma |
covariance matrix |
iter |
number of iterations |
VectorXd
Method for calculating the weights for each modifier rule
ruleIdx(mod, mem.safe = FALSE)ruleIdx(mod, mem.safe = FALSE)
mod |
a list of modifier splitting rules |
mem.safe |
boolean memory parameter |
A list of weights per rule with modifiers
Method for centering and scaling a matrix
scaleModelMatrix(M)scaleModelMatrix(M)
M |
a matrix to center and scale |
scaleModelMatrix
a scaled matrix
shiny generic function for S3method
shiny(fit) ## S3 method for class 'hdlm' shiny(fit) ## S3 method for class 'hdlmm' shiny(fit)shiny(fit) ## S3 method for class 'hdlm' shiny(fit) ## S3 method for class 'hdlmm' shiny(fit)
fit |
object of class 'hdlm', 'hdlmm' to which S3method is applied |
shiny interface for further analysis on heterogeneous analyses. The interface includes tabs for modifier selection, personalized exposure effects and subgroup-specific effects.
Method for creating simulated data for HDLM & HDLMM
sim.hdlmm( sim = "A", n = 1000, error = 1, effect.size = 1, exposure.data = NULL )sim.hdlmm( sim = "A", n = 1000, error = 1, effect.size = 1, exposure.data = NULL )
sim |
character (A - E) specifying simulation scenario |
n |
sample size |
error |
positive scalar specifying error variance for Gaussian response |
effect.size |
the effect size of the window of susceptibility |
exposure.data |
exposure data. A matrix of exposure data for simulation A, B, C and a named list of exposure data for simulation D, E |
sim.hdlmm
Simulation scenarios:
Scenario A: Two subgroups with early/late windows determined by continuous and binary modifiers
Scenario B: Two subgroups with scaled effect determined by a continuous modifier
Scenario C: No heterogeneity i.e., same effect on all individuals
Scenario D: Three subgroups with three corresponding exposures. Subgroups are determined by continuous and binary modifiers
Scenario E: Two subgroups with two exposures. First group is associated with the scaled main effect and lagged interaction while the second group is only associated with the scaled main effect, no interaction.
Simulated data and true parameters
sim.hdlmm(sim = "A", n = 1000)sim.hdlmm(sim = "A", n = 1000)
Method for creating simulated data for TDLM & TDLMM
sim.tdlmm( sim = "A", n = 5000, error = 10, mean.p = 0.5, prop.active = 0.05, expList = NULL, r = 1 )sim.tdlmm( sim = "A", n = 5000, error = 10, mean.p = 0.5, prop.active = 0.05, expList = NULL, r = 1 )
sim |
character (A - F) specifying simulation scenario |
n |
sample size for simulation |
error |
positive scalar specifying error variance for Gaussian response |
mean.p |
scalar between zero and one specifying mean probability for simulation scenario A |
prop.active |
proportion of active exposures for simulation scenario C |
expList |
named list of exposure data |
r |
dispersion parameter of negative binomial distribution |
sim.tdlmm
Simulation scenarios:
Scenario A: Binary response with single exposure effect
Scenario B: Continuous response with main effect of PM2.5 and interaction
Scenario C: Continuous response to test exposure selection using exposure
Scenario D: Continuous response to test exposure selection using one exposure of main effect and two interaction effects among four exposures
Scenario E: Zero-inflated count response with single exposure effect
Scenario F: Zero-inflated count response with single exposure effect with main effect of PM2.5 and interaction
Simulated data and true parameters
sim.tdlmm(sim = "A", mean.p = 0.5, n = 1000)sim.tdlmm(sim = "A", mean.p = 0.5, n = 1000)
Method for creating simulated data for TDLNM
sim.tdlnm(sim = "A", error.to.signal = 1)sim.tdlnm(sim = "A", error.to.signal = 1)
sim |
character (A - D) specifying simulation scenario |
error.to.signal |
scalar value setting error: sigma^2/var(f) |
sim.tdlnm
Simulation scenarios:
Scenario A: Piecewise constant effect
Scenario B: Linear effect
Scenario C: Logistic effect, piecewise in time
Scenario D: Logistic effect, smooth in time
Simulated data and true parameters
sim.tdlnm(sim = "A", error.to.signal = 1)sim.tdlnm(sim = "A", error.to.signal = 1)
Calculates the posterior inclusion probability (PIP).
splitPIP(dlnm, nlags, niter)splitPIP(dlnm, nlags, niter)
dlnm |
A numeric matrix containing the model fit information |
nlags |
total number of lags |
niter |
number of mcmc iterations |
A matrix of split counts per mcmc
Method for determining split points for continuous modifiers
splitpoints(object, var, round = NULL)splitpoints(object, var, round = NULL)
object |
An object of class 'hdlm', 'hdlmm' |
var |
The name of a continuous variable for which the split points will be reported |
round |
The number of decimal places to round the variable (var) to. No rounding occurs if round=NULL (default) For positive integer values of round, the variable will be rounded and split points will be reported at the resulting level |
splitpoints
A data frame with split points and the probability that a split point was >= that split point value
# Split points with HDLM D <- sim.hdlmm(sim = "B", n = 1000) fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", het = TRUE) splitpoints(fit, var = "mod_num", round = 2) splitpoints(fit, var = "mod_scale", round = 2)# Split points with HDLM D <- sim.hdlmm(sim = "B", n = 1000) fit <- dlmtree(y ~ ., data = D$dat, exposure.data = D$exposures, dlm.type = "linear", family = "gaussian", het = TRUE) splitpoints(fit, var = "mod_num", round = 2) splitpoints(fit, var = "mod_scale", round = 2)
summary generic function for S3method
summary(x, conf.level = 0.95, ...) ## S3 method for class 'hdlm' summary(x, conf.level = 0.95, mcmc = FALSE, ...) ## S3 method for class 'hdlmm' summary(x, conf.level = 0.95, mcmc = FALSE, ...) ## S3 method for class 'monotone' summary( x, conf.level = 0.95, pred.at = NULL, cenval = 0, exposure.se = NULL, mcmc = FALSE, verbose = TRUE, ... ) ## S3 method for class 'tdlm' summary(x, conf.level = 0.95, mcmc = FALSE, ...) ## S3 method for class 'tdlmm' summary( x, conf.level = 0.95, marginalize = "mean", log10BF.crit = 0.5, mcmc = FALSE, verbose = TRUE, ... ) ## S3 method for class 'tdlnm' summary( x, conf.level = 0.95, pred.at = NULL, cenval = 0, exposure.se = NULL, mcmc = FALSE, verbose = TRUE, ... )summary(x, conf.level = 0.95, ...) ## S3 method for class 'hdlm' summary(x, conf.level = 0.95, mcmc = FALSE, ...) ## S3 method for class 'hdlmm' summary(x, conf.level = 0.95, mcmc = FALSE, ...) ## S3 method for class 'monotone' summary( x, conf.level = 0.95, pred.at = NULL, cenval = 0, exposure.se = NULL, mcmc = FALSE, verbose = TRUE, ... ) ## S3 method for class 'tdlm' summary(x, conf.level = 0.95, mcmc = FALSE, ...) ## S3 method for class 'tdlmm' summary( x, conf.level = 0.95, marginalize = "mean", log10BF.crit = 0.5, mcmc = FALSE, verbose = TRUE, ... ) ## S3 method for class 'tdlnm' summary( x, conf.level = 0.95, pred.at = NULL, cenval = 0, exposure.se = NULL, mcmc = FALSE, verbose = TRUE, ... )
x |
an object of class 'tdlm', 'tdlmm', 'tdlnm', 'hdlm', 'hdlmm', 'monotone' |
conf.level |
confidence level for computation of credible intervals |
... |
additional parameters |
mcmc |
keep all mcmc iterations (large memory requirement) |
pred.at |
numerical vector of exposure values to make predictions for at each time period |
cenval |
scalar exposure value that acts as a reference point for predictions at all other exposure values |
exposure.se |
scalar smoothing factor, if different from model |
verbose |
show progress in console |
marginalize |
value(s) for calculating marginal DLMs, defaults to "mean", can also specify a percentile from 1-99 for all other exposures, or a named vector with specific values for each exposure |
log10BF.crit |
Bayes Factor criteria for selecting exposures and interactions, such that log10(BayesFactor) > x. Default = 0.5. |
list of summary outputs of the model fit
dlmtree model with tdlmm approach
tdlmm_Cpp(model)tdlmm_Cpp(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
dlmtree model with tdlnm approach
tdlnm_Cpp(model)tdlnm_Cpp(model)
model |
A list of parameter and data contained for the model fitting |
A list of dlmtree model fit, mainly posterior mcmc samples
Integrates (0,inf) over multivariate normal
zeroToInfNormCDF(mu, sigma)zeroToInfNormCDF(mu, sigma)
mu |
vector of mean parameters |
sigma |
covariance matrix |
double
Data.frame containing a sample of weekly average wildfire PM, PM2.5, O3 across a range of counties of Colorado. The exposure data was downloaded from US EPA (https://aqs.epa.gov/aqsweb/airdata/download_files.html) daily data summaries and averaged by week.
data(zinbCo)data(zinbCo)
data.frame;
https://aqs.epa.gov/aqsweb/airdata/download_files.html
https://www.epa.gov/outdoor-air-quality-data