R/create_data.R
create_data.Rd
Performs minimal processing of the data so that the result can be passed as an argument to the fitting function
create_data(
  data,
  min_number = 0,
  variable = "number",
  time = "year",
  date = "doy",
  asymmetric_model = TRUE,
  mu = ~1,
  sigma = ~1,
  covar_data = NULL,
  est_sigma_re = TRUE,
  est_mu_re = TRUE,
  tail_model = "student_t",
  family = "lognormal",
  max_theta = 10,
  share_shape = TRUE,
  nu_prior = c(2, 10),
  beta_prior = c(2, 1)
)
A data frame
A minimum threshold to use, defaults to 0
A character string of the name of the variable in 'data' that contains the response (e.g. counts)
A character string of the name of the variable in 'data' that contains the time variable (e.g. year)
A character string of the name of the variable in 'data' that contains the date variable (e.g. day of year). The column itself should be numeric, for example the result of lubridate::yday(x)
Boolean, whether or not to let model be asymmetric (e.g. run timing before peak has a different shape than run timing after peak)
An optional formula allowing the mean to be a function of covariates. Random effects are not included in the formula but are specified with the est_mu_re argument
An optional formula allowing the standard deviation to be a function of covariates. For asymmetric models, each side of the distribution is allowed a different set of covariates. Random effects are not included in the formula but are specified with the est_sigma_re argument
A data frame containing covariates specific to each time step. These are used in the formulas mu and sigma
Whether to estimate random effects by year in the sigma parameter, which controls the tails of the distribution. Defaults to TRUE
Whether to estimate random effects by year in the mu parameter, which controls the location of the distribution. Defaults to TRUE
Which distribution to fit: Gaussian ("gaussian"), Student-t ("student_t") or generalized normal ("gnorm"). Defaults to "student_t"
Response family for the observation model; options are "gaussian", "poisson", "negbin", "binomial", "lognormal". The default ("lognormal") is not a true lognormal distribution but a normal model on the log scale, in that it assumes log(y) ~ Normal()
Maximum value of log(pred) when limits=TRUE. Defaults to 10
Boolean argument for whether asymmetric student-t and generalized normal distributions should share the shape parameter (nu for the student-t; beta for the generalized normal). Defaults to TRUE
Optional two-element vector for the penalized prior on the Student-t degrees of freedom (nu); defaults to a Gamma(shape=2, scale=10) distribution
Optional two-element vector for the penalized prior on the generalized normal shape parameter (beta); defaults to a Normal(2, 1) distribution
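To give a feel for the default priors named above, the following base R sketch evaluates the Gamma(shape = 2, scale = 10) and Normal(2, 1) densities at a few values. It only illustrates the distributions themselves. How the fitting function applies them as penalties is not shown, and the candidate values and variable names are chosen purely for illustration.

```r
# Default hyperparameters as documented for nu_prior and beta_prior
nu_prior <- c(2, 10)   # Gamma shape and scale for the Student-t df (nu)
beta_prior <- c(2, 1)  # Normal mean and sd for the generalized normal beta

# Candidate parameter values (illustrative only)
nu <- c(5, 10, 20, 50)
beta <- c(1, 2, 3)

# Prior densities at those values
nu_density <- dgamma(nu, shape = nu_prior[1], scale = nu_prior[2])
beta_density <- dnorm(beta, mean = beta_prior[1], sd = beta_prior[2])

# Prior mean (shape * scale) and mode ((shape - 1) * scale) of the Gamma prior
nu_prior_mean <- nu_prior[1] * nu_prior[2]
nu_prior_mode <- (nu_prior[1] - 1) * nu_prior[2]

print(data.frame(nu = nu, density = nu_density))
print(data.frame(beta = beta, density = beta_density))
```

With these defaults the nu prior has mean 20 and mode 10, while the beta prior is centred on 2, the value at which the generalized normal reduces to a Gaussian.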
data(fishdist)
datalist <- create_data(fishdist,
  min_number = 0, variable = "number", time = "year",
  date = "doy", asymmetric_model = TRUE, family = "gaussian"
)
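The date column is expected to be numeric, so raw calendar dates need converting before the call. Below is a minimal base R sketch, assuming a hypothetical data frame obs with a Date column named sample_date. format(x, "%j") is used here as a base R stand-in for lubridate::yday(x), and the create_data() call is left commented out because it requires the package to be loaded.

```r
# Hypothetical observations; 'obs' and 'sample_date' are assumptions made
# for this illustration, not names used by the package
obs <- data.frame(
  sample_date = as.Date(c("2020-03-01", "2020-07-15", "2021-03-01")),
  number = c(12, 340, 9)
)

# Numeric year and day of year derived from the calendar date
obs$year <- as.integer(format(obs$sample_date, "%Y"))
obs$doy <- as.integer(format(obs$sample_date, "%j"))

print(obs)

# The column names now match the default argument values:
# datalist <- create_data(obs, variable = "number", time = "year", date = "doy")
```

Keeping the default column names (number, year, doy) means the variable, time and date arguments can be omitted from the call.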