Create covariates from a discrete distribution
Source: R/createDiscreteCovariates.R
createDiscreteCovariates.Rd
The values and probs arguments are parsed using the parseHashString helper function. They can be given in any of the following forms:
- a vector giving the values for each variable.
c("1,2", "1,2,3")
would mean that the first variable takes values 1 and 2, and the second variable takes values 1, 2 and 3.
- a list giving the values for each variable.
list(c(1,2), c(1,2,3))
would mean that the first variable takes values 1 and 2, and the second variable takes values 1, 2 and 3.
- a compact notation using the hash symbol to separate variables.
"1,2#1,2,3"
Usage
createDiscreteCovariates(
subjects,
names,
values,
probs,
probArray,
seed = .deriveFromMasterSeed(),
idCol = getEctdColName("Subject"),
includeIDCol = TRUE
)
Arguments
- subjects
(Required) Vector of subjects (or number of subjects) for which to create covariates
- names
(Required) Names of the discrete covariates to be created. All the names should be valid R names. See validNames.
- values
(Required) Values that the covariates can take. See details section.
- probs
(Optional) Probabilities for each covariate. See details section.
- probArray
(Optional) Probability array for uneven sampling. See details section.
- seed
(Optional) Random seed to use. By default, it is based on the current random seed.
- idCol
(Optional) Name of the subject column. Must be a valid R name (see validNames) and must not duplicate any of the names. "SUBJ" by default.
- includeIDCol
(Optional) A logical value. Should the subject column be included in the output? TRUE by default.
Details
Additionally, for the probs argument, a check is performed to make sure that the probabilities of each variable sum to 1.
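The check described above can be sketched in base R. This is only an illustration under stated assumptions: splitHash is a hypothetical stand-in written for this example, not the package's parseHashString, and the package's actual check and error message may differ.

```r
# Hypothetical helper for illustration; not the package's parseHashString
splitHash <- function(x) {
  # One element per variable, split on the hash symbol
  parts <- strsplit(x, "#", fixed = TRUE)[[1]]
  # Numeric values for each variable, split on commas
  lapply(strsplit(parts, ",", fixed = TRUE), as.numeric)
}

probs <- splitHash(".1,.9#.5,.4,.1")

# Compare with a tolerance, since exact equality can fail in floating point
sumsToOne <- vapply(probs, function(p) isTRUE(all.equal(sum(p), 1)), logical(1))
if (!all(sumsToOne)) {
  stop("probabilities do not sum to 1 for variable(s): ",
       paste(which(!sumsToOne), collapse = ", "))
}
```

Using all.equal rather than == avoids rejecting inputs such as .5, .4, .1 whose floating-point sum may not be exactly 1.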
Alternatively, a probArray argument can be given. This should be a data frame containing one more column (named "prob") than the number of variables to create. Each variable has a column which contains the values it can take. The prob column gives the probability for each combination of values and should sum to one. See examples.
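To make the idea of a probability array concrete, the following base R sketch draws joint combinations with sample(). It is an illustration only and is not taken from the package source: the seed, the object names rows and covs, and the way the SUBJ column is built are all assumptions made for this example.

```r
# Probability array: one row per combination of F1 and F2, plus its probability
pa <- data.frame(F1 = rep(0:1, 3),
                 F2 = rep(1:3, each = 2),
                 PROB = c(.1, .2, .1, .2, .2, .2))

# The joint probabilities must sum to one (within tolerance)
stopifnot(isTRUE(all.equal(sum(pa$PROB), 1)))

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
n <- 100

# Sample row indices according to the joint probabilities,
# then look up the covariate values for each sampled row
rows <- sample(nrow(pa), size = n, replace = TRUE, prob = pa$PROB)
covs <- data.frame(SUBJ = seq_len(n), pa[rows, c("F1", "F2")], row.names = NULL)

head(covs)
```

Sampling whole rows, rather than each column separately, is what preserves the dependence between the variables encoded in the array.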
Examples
# 10 samples of X and Y where:
# P[ X = 1 ] = .1
# P[ X = 2 ] = .9
# -
# P[ Y = 7 ] = .5
# P[ Y = 8 ] = .4
# P[ Y = 9 ] = .1
createDiscreteCovariates(10,
  names = "X, Y",
  probs = ".1,.9#.5,.4,.1",
  values = "1,2#7,8,9")
#> SUBJ X Y
#> 1 1 2 7
#> 2 2 2 8
#> 3 3 2 8
#> 4 4 2 7
#> 5 5 2 8
#> 6 6 2 8
#> 7 7 2 7
#> 8 8 2 7
#> 9 9 1 7
#> 10 10 1 7
# using the probArray version
pa <- data.frame(F1 = rep(0:1, 3),
                 F2 = rep(1:3, each = 2),
                 PROB = c(.1, .2, .1, .2, .2, .2))
createDiscreteCovariates( 100 , probArray = pa )
#> SUBJ F1 F2
#> 1 1 1 2
#> 2 2 0 1
#> 3 3 1 1
#> 4 4 1 2
#> 5 5 1 1
#> 6 6 1 1
#> 7 7 0 3
#> 8 8 1 1
#> 9 9 0 3
#> 10 10 0 3
#> 11 11 1 3
#> 12 12 1 1
#> 13 13 1 3
#> 14 14 1 2
#> 15 15 1 1
#> 16 16 0 3
#> 17 17 0 3
#> 18 18 1 3
#> 19 19 1 2
#> 20 20 0 3
#> 21 21 1 1
#> 22 22 1 3
#> 23 23 1 3
#> 24 24 1 3
#> 25 25 1 1
#> 26 26 1 2
#> 27 27 1 3
#> 28 28 1 3
#> 29 29 1 3
#> 30 30 1 2
#> 31 31 1 1
#> 32 32 1 3
#> 33 33 0 1
#> 34 34 1 2
#> 35 35 0 3
#> 36 36 1 2
#> 37 37 0 1
#> 38 38 1 3
#> 39 39 1 1
#> 40 40 1 1
#> 41 41 1 2
#> 42 42 1 2
#> 43 43 1 2
#> 44 44 0 2
#> 45 45 0 3
#> 46 46 0 2
#> 47 47 1 1
#> 48 48 1 2
#> 49 49 1 2
#> 50 50 1 2
#> 51 51 0 2
#> 52 52 1 1
#> 53 53 0 3
#> 54 54 1 1
#> 55 55 0 1
#> 56 56 0 3
#> 57 57 1 3
#> 58 58 0 2
#> 59 59 0 3
#> 60 60 1 1
#> 61 61 0 2
#> 62 62 0 1
#> 63 63 1 1
#> 64 64 1 1
#> 65 65 1 2
#> 66 66 1 2
#> 67 67 1 3
#> 68 68 1 2
#> 69 69 0 3
#> 70 70 1 1
#> 71 71 1 3
#> 72 72 0 3
#> 73 73 1 2
#> 74 74 0 3
#> 75 75 0 3
#> 76 76 0 3
#> 77 77 0 3
#> 78 78 1 2
#> 79 79 0 3
#> 80 80 1 1
#> 81 81 0 3
#> 82 82 0 1
#> 83 83 1 1
#> 84 84 1 1
#> 85 85 0 2
#> 86 86 1 2
#> 87 87 1 1
#> 88 88 1 2
#> 89 89 1 1
#> 90 90 0 1
#> 91 91 0 3
#> 92 92 1 3
#> 93 93 1 1
#> 94 94 1 2
#> 95 95 1 1
#> 96 96 0 1
#> 97 97 1 1
#> 98 98 1 3
#> 99 99 0 3
#> 100 100 1 2