Generate Simulated Data Replicates by controlling dosing, covariates, parameters, response, missingness and interims
Usage
generateData(
replicateN,
subjects = NULL,
treatSubj = subjects,
treatDoses,
treatSeq,
treatType = "Parallel",
treatPeriod,
genParNames,
genParMean,
genParVCov = 0,
respEqn,
respName = getEctdColName("Response"),
treatProp,
treatOrder = FALSE,
conCovNames,
conCovMean,
conCovVCov,
conCovCrit = NULL,
conCovDigits = 3,
conCovMaxDraws = 100,
disCovNames,
disCovVals,
disCovProb,
disCovProbArray,
extCovNames,
extCovFile,
extCovSubset,
extCovRefCol,
extCovSameRow = TRUE,
extCovDataId = idCol,
timeCovNames,
timeCovMean,
timeCovVCov,
timeCovCrit = NULL,
genParCrit,
genParBtwNames,
genParBtwMean,
genParBtwVCov,
genParErrStruc = "None",
genParMaxDraws = 100,
genParRangeTolerance = 0.5,
extParFile,
extParNames,
extParBtwNames,
extParBtwNums,
extParSubset = NULL,
extParCrit,
extParErrStruc = "None",
extParRefColData,
extParRefColName,
extParDataId = idCol,
respInvLink,
respDist = "Normal",
respVCov,
respErrStruc = "Additive",
respCrit,
respDigits = 3,
mcarProp = 0,
mcarRule,
dropFun,
dropFunExtraArgs = list(),
interimSubj,
interimMethod = "Sample",
seed = .deriveFromMasterSeed(),
idCol = getEctdColName("Subject"),
doseCol = getEctdColName("Dose"),
timeCol = getEctdColName("Time"),
trtCol = getEctdColName("Trt"),
parOmitFlag = getEctdColName("ParOmit"),
respOmitFlag = getEctdColName("RespOmit"),
missingFlag = getEctdColName("Missing"),
interimCol = getEctdColName("Interim"),
parBtwSuffix = ".Between",
deleteCurrData = TRUE,
covDiff = TRUE,
treatDiff = TRUE,
workingPath = getwd()
)
Arguments
- replicateN
(Required) Number of replicates for which to create simulated data
- subjects
(Required) Number of subjects in simulation
- treatSubj
(Optional) Number of subjects to which to allocate treatments, or a vector of allocations. By default, this is the same as the "subjects" input
- treatDoses
(Optional) Vector of numeric treatment doses
- treatSeq
(Optional) Treatment matrix for crossover designs. Missing by default, but this is required when treatType is set to "Crossover"
- treatType
(Optional) Treatment type: 'Parallel' or 'Crossover'. Default is "Parallel"
- treatPeriod
(Optional) Vector of numeric treatment time points. Missing by default, resulting in no "time" element in the generated data
- genParNames
(Optional) Names of fixed effects to generate. Missing by default, resulting in no fixed parameters being created
- genParMean
(Optional) Means for generating fixed parameters. Missing by default
- genParVCov
(Optional) Covariance matrix for generating fixed parameters. By default, this is a matrix of zeros
- respEqn
(Required) Formula for creating the simulated response
- respName
(Optional) Response variable name. Default is "RESP"
- treatProp
(Optional) Proportions for sampling. Missing by default, resulting in unbiased sampling
- treatOrder
(Optional) Logical flag: should allocations be assigned in order. FALSE by default
- conCovNames
(Optional) Continuous covariate names. Missing by default, resulting in no continuous covariates being created
- conCovMean
(Optional) Continuous covariate means. Missing by default
- conCovVCov
(Optional) Continuous covariate covariance matrix. Missing by default
- conCovCrit
(Optional) Continuous covariate acceptable range. Missing by default
- conCovDigits
(Optional) Continuous covariate rounding digits. 3 by default
- conCovMaxDraws
(Optional) Continuous covariate maximum draws. 100 by default
- disCovNames
(Optional) Discrete covariate names. Missing by default, resulting in no discrete covariates being created
- disCovVals
(Optional) Discrete covariate values. Missing by default
- disCovProb
(Optional) Discrete covariate probabilities. Missing by default
- disCovProbArray
(Optional) Array of probabilities for multivariate sampling. Missing by default
- extCovNames
(Optional) Names for the continuous covariates. Missing by default, resulting in no imported covariates
- extCovFile
(Optional) File from which to import (including full or relative path). Missing by default
- extCovSubset
(Optional) Subset to apply to data. Missing by default
- extCovRefCol
(Optional) Reference variable. Missing by default
- extCovSameRow
(Optional) Logical flag: should covariates sampled be from the same row. TRUE by default
- extCovDataId
(Optional) Subject variable name from file. Same as "idCol" by default
- timeCovNames
(Optional) Time-varying covariate names. Missing by default, resulting in no time-varying covariates being created
- timeCovMean
(Optional) Time-varying covariate means. Missing by default
- timeCovVCov
(Optional) Time-varying covariate covariance matrix. Missing by default
- timeCovCrit
(Optional) Time-varying covariate acceptable range. Missing by default
- genParCrit
(Optional) Range of acceptable values for generated fixed effects. Missing by default
- genParBtwNames
(Optional) Between subject effects to generate. Missing by default, resulting in no created between subject effects
- genParBtwMean
(Optional) Means for generated between subject effects. Missing by default
- genParBtwVCov
(Optional) Covariance matrix for generated between subject effects. Missing by default
- genParErrStruc
(Optional) Function to map generated effects: Additive, Proportional or None. "None" by default
- genParMaxDraws
(Optional) Maximum number of iterations to generate valid parameters. 100 by default
- genParRangeTolerance
(Optional) Proportion of subjects with "in range" parameter data that is acceptable for proceeding. 0.5 by default
- extParFile
(Optional) File name for external parameter data to import. Missing by default, resulting in no imported parameter variables
- extParNames
(Optional) Names of parameters to import from external file. Missing by default
- extParBtwNames
(Optional) Between subject effects variables to import from external file. Missing by default
- extParBtwNums
(Optional) Integer mapping between random and fixed effects in imported parameter data. Missing by default
- extParSubset
(Optional) Subsets to be applied to imported parameter before sampling. Missing by default
- extParCrit
(Optional) Acceptance range for imported parameter columns
- extParErrStruc
(Optional) Function to map effects from imported parameter data: Additive, Proportional or None. "None" by default
- extParRefColData
(Optional) Reference column in imported parameter data. Missing by default
- extParRefColName
(Optional) Reference column name from imported parameter data. Missing by default
- extParDataId
(Optional) Subject variable name in external parameter file. Same as "idCol" by default
- respInvLink
(Optional) Inverse link function for the linear predictor. Missing by default, resulting in no inverse link to be applied
- respDist
(Optional) Outcome response variable distribution ("Normal" by default)
- respVCov
(Optional) Residual error (co)variance to apply to generated response. None by default
- respErrStruc
(Optional) Function describing how to apply residual error to the generated response: Additive, Log-Normal or Proportional. "Additive" by default
- respCrit
(Optional) Range of acceptable values for created response. Missing (no criteria) by default
- respDigits
(Optional) Number of digits to which to round the created response. 3 by default
- mcarProp
(Optional) Proportion of observations to set to missing completely at random (MCAR). 0 by default
- mcarRule
(Optional) Rule to specify which observations of the data should be included for MCAR allocation. Missing by default
- dropFun
(Optional) User defined function to define criteria for subject dropout. Missing (no dropout) by default
- dropFunExtraArgs
(Optional) Additional arguments to the dropout function. None by default
- interimSubj
(Optional) Proportion of total subjects to be assigned to each interim analysis. Missing by default, resulting in no "interim" variable derived
- interimMethod
(Optional) Method for creating interim variable: 'Sample' or 'Proportion'. "Sample" by default
- seed
(Optional) Random seed. By default, this is derived from the current session random seed
- idCol
(Optional) Subject variable name ("SUBJ" by default)
- doseCol
(Optional) Dose variable name ("DOSE" by default)
- timeCol
(Optional) Time variable name ("TIME" by default)
- trtCol
(Optional) Treatment variable name ("TRT" by default)
- parOmitFlag
(Optional) Parameter omit flag name ("PAROMIT" by default)
- respOmitFlag
(Optional) Response omit flag name ("RESPOMIT" by default)
- missingFlag
(Optional) Missingness flag name ("MISSING" by default)
- interimCol
(Optional) Interim variable name ("INTERIM" by default)
- parBtwSuffix
(Optional) Suffix for retained between subject effects variables. Suffix ".Between" is used by default
- deleteCurrData
(Optional) Should existing data be deleted before starting generation phase (TRUE by default)
- covDiff
(Optional) Should covariates differ between replicates (TRUE by default)
- treatDiff
(Optional) Should treatment allocation differ between replicates (TRUE by default)
- workingPath
(Optional) Working directory from which to create data. By default, the current working directory is used
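The way conCovCrit, conCovMaxDraws and conCovDigits work together can be pictured as a redraw loop. The base R sketch below is an illustration only and is not the package's implementation: the helper drawInRange and all of its arguments are hypothetical, and it assumes a single normally distributed covariate with the acceptance criterion given as numeric bounds.

```r
# Hypothetical helper (not part of the package): draw n values from a normal
# distribution, then redraw any value outside [lower, upper] at most maxDraws times.
drawInRange <- function(n, mean, sd, lower, upper, digits = 3, maxDraws = 100) {
  x <- rnorm(n, mean, sd)
  for (i in seq_len(maxDraws)) {
    bad <- x < lower | x > upper
    if (!any(bad)) break
    x[bad] <- rnorm(sum(bad), mean, sd)  # redraw only the out-of-range values
  }
  if (any(x < lower | x > upper)) {
    stop("Could not satisfy the range criterion within maxDraws redraws")
  }
  round(x, digits)
}

set.seed(1)
# Mirrors conCovMean = 55, an SD of 10, conCovCrit = "18 <= age <= 65", conCovDigits = 1
age <- drawInRange(400, mean = 55, sd = 10, lower = 18, upper = 65, digits = 1)
range(age)
```

Redrawing only the rejected values keeps the accepted draws intact, and the cap on redraws guards against a criterion that the distribution can rarely satisfy.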
Value
No value is returned from the generateData function. However, as a side effect, a number of simulated replicate datasets are created.
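Because the result is written to disk rather than returned, a quick check after a run is to list what was created. The sketch below uses only base R; the helper listReplicateFiles is hypothetical, and it assumes nothing about file names or formats beyond the "ReplicateData" subdirectory of the working directory mentioned in the Details section.

```r
# Hypothetical helper (not part of the package): list the files found in the
# "ReplicateData" subdirectory of a working path, or character(0) if it is absent.
listReplicateFiles <- function(workingPath = getwd()) {
  repDir <- file.path(workingPath, "ReplicateData")
  if (!dir.exists(repDir)) return(character(0))
  list.files(repDir, full.names = TRUE)
}

repFiles <- listReplicateFiles()
length(repFiles)  # number of files currently present
```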
Details
The generateData function calls the low-level data generation components to create sets of simulated data. The following components are called to create aspects of the simulated trial data:
- createTreatments
Creates a dataset of all possible treatment regimes to be allocated to subjects
- allocateTreatments
Allocates treatments to subjects in the simulated study
- createCovariates
Creates a set of fixed covariates for a simulated population
- createParameters
Creates simulated fixed and between subject parameters for subjects in each replicate
- createResponse
Creates a simulated response variable based on available derived data
- createMCAR
Adds a simulated "missing" flag to the data
- createDropout
Adds a simulated "missing" flag to the data based on a dropout function
- createInterims
Assigns subjects in the study to interim analyses
The function iteratively builds and combines the data components for each replicate, and stores the data in the "ReplicateData" subdirectory of the working directory. This data can then be analyzed using a call to the analyzeData function.
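The order in which the components described above build up a single replicate can be pictured with a small base R sketch. Nothing here is package code: every object name, the sample size, the parameter values and the simple rules for missingness and interims are assumptions chosen for illustration, and the dropout step is omitted. In particular, the interim split below is a simplification and may not match how interimSubj is interpreted.

```r
# Illustrative base R sketch of one replicate; not the package's implementation.
set.seed(42)
nSubj <- 12
doses <- c(0, 50, 100)

# 1. Treatments: one row per possible treatment regime
treatments <- data.frame(TRT = seq_along(doses), DOSE = doses)

# 2. Allocation: assign each subject to a treatment at random
alloc <- data.frame(SUBJ = seq_len(nSubj),
                    TRT = sample(treatments$TRT, nSubj, replace = TRUE))
dat <- merge(alloc, treatments, by = "TRT")
dat <- dat[order(dat$SUBJ), ]

# 3. Covariates: one fixed covariate per subject
dat$wt <- round(rnorm(nSubj, mean = 83, sd = 14), 1)

# 4. Parameters: fixed values plus additive between subject effects
dat$E0   <- 2 + rnorm(nSubj, 0, 1)
dat$ED50 <- 50
dat$EMAX <- 10 + rnorm(nSubj, 0, 1)

# 5. Response: evaluate the equation, then add residual error with variance 5
pred <- with(dat, E0 + (DOSE * EMAX) / (DOSE + ED50))
dat$RESP <- round(pred + rnorm(nSubj, 0, sqrt(5)), 3)

# 6. Missingness: flag each observation as missing with probability 0.1
dat$MISSING <- rbinom(nSubj, 1, 0.1)

# 7. Interims (simplified rule): first 30% of subjects in interim 1, rest in interim 2
dat$INTERIM <- ifelse(dat$SUBJ <= ceiling(0.3 * nSubj), 1, 2)

head(dat)
```

Each step only adds columns to the same subject-level table, which is why the components can be built independently and then combined.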
Author
Mike K Smith mstoolkit@googlemail.com
Examples
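if (FALSE) {
# Simulate 500 replicates of a 400-subject trial with an Emax dose-response model
generateData(
  replicateN = 500, subjects = 400, treatDoses = c(0, 5, 25, 50, 100),
  # Continuous covariates, with age restricted to the range 18 to 65
  conCovNames = c("wt", "age"), conCovMean = c(83, 55), conCovVCov = c(14, 10)^2,
  conCovDigits = 1, conCovCrit = "18 <= age <= 65",
  # Fixed and between subject effects for the model parameters
  genParNames = "E0,ED50,EMAX", genParMean = c(2, 50, 10), genParVCov = diag(c(.5, 30, 10)),
  genParBtwNames = "E0,ED50,EMAX", genParBtwMean = c(0, 0, 0), genParBtwVCov = diag(3),
  # Response equation and residual error (co)variance
  respEqn = "E0 + ((DOSE * EMAX)/(DOSE + ED50))", respVCov = 5,
  # Proportions of subjects assigned to each interim analysis
  interimSubj = ".3,.7"
)
}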
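To see what the respEqn in the example above implies before any variability is added, the equation can be evaluated by hand at the mean parameter values (E0 = 2, ED50 = 50, EMAX = 10). This is a plain base R calculation, not a call into the package, and the function name emaxResponse is hypothetical.

```r
# Hypothetical helper: the Emax equation from respEqn, written as an R function
emaxResponse <- function(dose, e0, ed50, emax) {
  e0 + (dose * emax) / (dose + ed50)
}

doses <- c(0, 5, 25, 50, 100)
pred <- emaxResponse(doses, e0 = 2, ed50 = 50, emax = 10)
round(pred, 3)
# [1] 2.000 2.909 5.333 7.000 8.667
```

At a dose equal to ED50 (50), the predicted response is E0 plus half of EMAX, which is 7 here.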