Analyzes a set of simulated trial data, possibly including interim analyses
Usage
analyzeData(
  replicates = "*",
  analysisCode,
  macroCode,
  interimCode = NULL,
  software = "R",
  grid = FALSE,
  waitAndCombine = TRUE,
  cleanUp = FALSE,
  removeMissing = TRUE,
  removeParOmit = TRUE,
  removeRespOmit = TRUE,
  seed = .deriveFromMasterSeed(),
  parOmitFlag = getEctdColName("ParOmit"),
  respOmitFlag = getEctdColName("RespOmit"),
  missingFlag = getEctdColName("Missing"),
  interimCol = getEctdColName("Interim"),
  doseCol = getEctdColName("Dose"),
  sleepTime = 15,
  deleteCurrData = TRUE,
  initialDoses = NULL,
  stayDropped = TRUE,
  fullAnalysis = TRUE,
  workingPath = getwd(),
  method = getEctdDataMethod()
)
Arguments
- replicates
(Optional) Vector of replicates on which to perform analysis: all replicates are analyzed by default
- analysisCode
(Required) File containing analysis code (for R or SAS) or an R function for analysis (R only)
- macroCode
(Required) An R function to be used for macro evaluation of the result datasets. See the help file for the macroEvaluation function for more information
- interimCode
(Optional) An R function to be applied to interim datasets in order to create interim decisions. See the help file for the interimAnalysis function for more information. By default, no function is provided, resulting in no interim analyses being performed
- software
(Optional) The software to be used for analysis: either "R" or "SAS". "R" is the default
- grid
(Optional) If available, should the analysis be split across available CPUs? Uses the "parallel" package to split jobs across cores, taking the minimum of the number of cores minus one and getOption("max.clusters") (usually 2). FALSE by default
- waitAndCombine
(Optional) Should the process wait for all analyses to finish, then combine into micro and macro summary files? TRUE by default
- cleanUp
(Optional) Should micro/macro directories be removed on completion? FALSE by default
- removeMissing
(Optional) Should rows marked as 'Missing' during the data generation step be removed from the data before analysis is performed? TRUE by default
- removeParOmit
(Optional) Should any rows marked as 'Omitted' during the parameter data generation step (i.e. parameters out of range) be removed from the data before analysis is performed? TRUE by default
- removeRespOmit
(Optional) Should any rows marked as 'Omitted' during the response generation step (i.e. responses out of range) be removed from the data before analysis is performed? TRUE by default
- seed
(Optional) Random number seed to use for the analysis. Based on the current random seed by default
- parOmitFlag
(Optional) Parameter omit flag name. "PAROMIT" by default
- respOmitFlag
(Optional) Response omit flag name. "RESPOMIT" by default
- missingFlag
(Optional) Missing flag name. "MISSING" by default
- interimCol
(Optional) Interim variable name. "INTERIM" by default
- doseCol
(Optional) Dose variable name. "DOSE" by default
- sleepTime
(Optional) Number of seconds to sleep between iterative checks for grid job completion. 15 seconds are used by default
- deleteCurrData
(Optional) Should any existing micro evaluation and macro evaluation data be removed before new analysis is performed? TRUE by default
- initialDoses
(Optional) For interim analyses, which doses should be present in interim 1? All are included by default
- stayDropped
(Optional) For interim analyses, if a dose is dropped, should it stay dropped in subsequent interims (as opposed to allowing the interim step to reopen the dose)? TRUE by default
- fullAnalysis
(Optional) Should a "full" analysis be performed on all doses? Default TRUE
- workingPath
(Optional) Root directory in which replicate data is stored, and in which we should perform the analysis. Current working directory is used by default
- method
(Optional) Data storage method (i.e. where the replicate data is stored). Given by getEctdDataMethod by default
Value
This function produces no direct return value. It does, however, write many analysis, summary and log files.
Details
The first task of the function is to check the options specified:

- If the "grid" network is unavailable, or if the length of the "replicates" input is 1, the "grid" flag will be set to FALSE
- If the "grid" flag is TRUE, the call to analyzeData will be split across multiple processors using the "parallel" library
- If the length of the "replicates" vector is 1, the "waitAndCombine" flag will be set to FALSE
- If the "waitAndCombine" flag is set to FALSE, the "cleanUp" flag will also be set to FALSE
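The flag coercion above can be restated as a small sketch. This is an illustrative restatement under the assumptions described in this section, not the package source; the function name resolveFlags and the gridAvailable argument are hypothetical.

```r
# Hypothetical restatement of the option checks performed by analyzeData
resolveFlags <- function(grid, waitAndCombine, cleanUp, replicates, gridAvailable) {
  # Grid use requires an available grid and more than one replicate
  if (!gridAvailable || length(replicates) == 1) grid <- FALSE
  # A single replicate needs no wait-and-combine step
  if (length(replicates) == 1) waitAndCombine <- FALSE
  # Without combining, intermediate directories must be kept
  if (!waitAndCombine) cleanUp <- FALSE
  list(grid = grid, waitAndCombine = waitAndCombine, cleanUp = cleanUp)
}
```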
The analyzeData function will iterate over each replicate specified in the
"replicates" vector. For each replicate, the function will first call
analyzeRep with the required inputs. The output from the call to analyzeRep
will be a data frame containing micro evaluation data. This data frame will
be checked to ensure it is of the correct format. If the return from
analyzeRep is a valid "Micro Evaluation" dataset, it will be saved to the
"MicroEvaluation" folder, and also passed to the macroEvaluation function
for further analysis. If the return from macroEvaluation is a valid "Macro
Evaluation" dataset, it will be saved to the "MacroEvaluation" folder.
If the "waitAndCombine" flag is set to TRUE, the function will wait until
all grid jobs are finished (if grid has been used), then compile the "Micro"
and "Macro" evaluation results into single summary files (using the
compileSummary
function).
Note
There are some restrictions on the code inputs to the
analyzeData
function. These restrictions are discussed here:
Analysis Code: The "analysisCode" input must be either an R function or a
reference to an external file. If it is a reference to an external file, it
must contain either SAS code (if software is "SAS") or R code (if software
is "R"). If the code is an R function, or an external R script, it must
accept a data frame as its only argument and return an acceptable "Micro
Evaluation" data frame as set out in checkMicroFormat. If the code is an
external SAS script, it must use a SAS dataset called "work.infile" and
create a SAS dataset called "work.outfile" that conforms to the "Micro
Evaluation" format as set out in checkMicroFormat. More information on
"Micro Evaluation" structures can be found in the help file for the
checkMicroFormat function.
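As a minimal sketch of an R "analysisCode" function satisfying the rules above (assuming the replicate data contain the default DOSE and RESP columns, and returning one row per dose; the name simpleCode is hypothetical):

```r
# Hypothetical analysisCode: summarise the response by dose,
# returning a "Micro Evaluation" style data frame
simpleCode <- function(data) {
  means <- tapply(data$RESP, data$DOSE, mean)
  ses <- tapply(data$RESP, data$DOSE, function(x) sd(x) / sqrt(length(x)))
  data.frame(DOSE = as.numeric(names(means)),
             MEAN = as.vector(means),
             SE = as.vector(ses))
}
```

A function like this could then be passed directly as the analysisCode argument, as in the Examples section below.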
Interim Code: The "interimCode" input must be an R function that accepts a
single "Micro Evaluation" data input, and returns an R "list" structure that
is either empty or contains one or more of the following elements:

- An element called "STOP": a logical vector of length 1, telling the analyzeData function whether the analysis should be halted at this interim
- An element called "DROP": a vector of numeric values relating to doses in the data to drop before the next interim is analyzed

More information on these structures can be found in the help file for the interimAnalysis function.
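A sketch of one possible "interimCode" rule, assuming the micro evaluation data carry the DOSE, LOWER and UPPER columns used in the Examples below (the name ciInterim is hypothetical): drop any non-placebo dose whose interval spans zero, and stop once every such dose has been dropped.

```r
# Hypothetical interimCode: returns a list with optional DROP and STOP elements
ciInterim <- function(data) {
  out <- list()
  # A dose whose interval spans zero shows no clear effect
  spansZero <- sign(data$LOWER) != sign(data$UPPER) & data$DOSE != 0
  if (any(spansZero)) out$DROP <- data$DOSE[spansZero]
  # Stop if all non-placebo doses have been dropped
  out$STOP <- sum(spansZero) == nrow(data) - 1
  out
}
```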
Macro Code: The "macroCode" input must be an R function that accepts an
enhanced "Micro Evaluation" data input, and returns a valid "Macro
Evaluation" data structure (as specified in the help file for the
checkMacroFormat function).
Author
Mike K Smith mstoolkit@googlemail.com
Examples
if (FALSE) {
# Standard analysis code
emaxCode <- function(data) {
  library(DoseResponse)
  with(data, {
    uniDoses <- sort(unique(DOSE))
    eFit <- emaxalt(RESP, DOSE)
    outDf <- data.frame(DOSE = uniDoses,
                        MEAN = eFit$dm[as.character(uniDoses)],
                        SE = eFit$dsd[as.character(uniDoses)])
    outDf$LOWER <- outDf$MEAN - 2 * outDf$SE
    outDf$UPPER <- outDf$MEAN + 2 * outDf$SE
    outDf$N <- table(DOSE)[as.character(uniDoses)]
    outDf
  })
}
# Macro evaluation code
macrocode <- function(data) {
  # A made-up t-test comparing the top dose against placebo at the final look
  mu0 <- data$MEAN[data$DOSE == 0 & data$INTERIM == 0]
  mu100 <- data$MEAN[data$DOSE == 100 & data$INTERIM == 0]
  n0 <- data$N[data$DOSE == 0 & data$INTERIM == 0]
  n100 <- data$N[data$DOSE == 100 & data$INTERIM == 0]
  sd0 <- data$SE[data$DOSE == 0 & data$INTERIM == 0]
  sd100 <- data$SE[data$DOSE == 100 & data$INTERIM == 0]
  sddiff <- if (n0 == n100) {
    sqrt((sd0^2 + sd100^2) / (n0 + n100))
  } else {
    sqrt((1/n0 + 1/n100) * ((n0 - 1) * sd0^2 + (n100 - 1) * sd100^2) / (n0 + n100 - 2))
  }
  tstat <- (mu100 - mu0) / sddiff
  success <- abs(tstat) > qt(.975, n0 + n100 - 2)
  data.frame(SUCCESS = success, TSTAT = tstat)
}
# Interim analysis code
interimCode <- function(data) {
  # Drop any non-placebo dose whose interval spans zero
  dropdose <- with(data, DOSE[sign(UPPER) != sign(LOWER) & DOSE != 0])
  outList <- list()
  if (length(dropdose) > 0) outList$DROP <- dropdose
  # Stop if all non-placebo doses have been dropped
  outList$STOP <- length(dropdose) == nrow(data) - 1
  outList
}
# Run analysis
analyzeData( 1:5, analysisCode = emaxCode, macroCode = macrocode,
interimCode = interimCode )
}