Create covariates from a discrete distribution

The values and probs arguments are parsed using the parseHashString helper function. Each can be given in one of three forms:

a vector giving the values for each variable. c("1,2", "1,2,3") would mean that the first variable takes values 1 and 2, and the second variable takes values 1, 2 and 3.
a list giving the values for each variable. list(c(1,2), c(1,2,3)) would mean that the first variable takes values 1 and 2, and the second variable takes values 1, 2 and 3.
a compact notation using the hash symbol to separate variables, for example "1,2#1,2,3".
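
The three forms listed above describe the same two variables. The sketch below illustrates that equivalence in base R. The function parse_hash_sketch is a hypothetical stand-in written for this illustration; it is not the package's parseHashString, whose return type and validation may differ.

```r
# Hypothetical stand-in for illustration only; NOT the package's parseHashString.
parse_hash_sketch <- function(x) {
  if (is.list(x)) {
    # List form: one element per variable, already split into values
    return(lapply(x, as.character))
  }
  # "#" separates variables; a character vector already has one element per variable
  pieces <- unlist(strsplit(as.character(x), "#", fixed = TRUE))
  # "," separates the values of a single variable
  lapply(strsplit(pieces, ",", fixed = TRUE), trimws)
}

from_vector  <- parse_hash_sketch(c("1,2", "1,2,3"))
from_list    <- parse_hash_sketch(list(c(1, 2), c(1, 2, 3)))
from_compact <- parse_hash_sketch("1,2#1,2,3")

# All three inputs yield one set of values per variable
identical(from_vector, from_compact)
identical(from_list, from_compact)
```

In this sketch the values are kept as character strings so that the three forms compare equal; the real helper may convert them to numbers.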

Usage

createDiscreteCovariates(
  subjects,
  names,
  values,
  probs,
  probArray,
  seed = .deriveFromMasterSeed(),
  idCol = getEctdColName("Subject"),
  includeIDCol = TRUE
)

Arguments

subjects: (Required) Vector of subjects (or number of subjects) for which to create covariates
names: (Required) Names of the discrete covariates to be created. All the names should be valid R names. See validNames.
values: (Required) Values that the covariates can take. See details section.
probs: (Optional) Probabilities for each covariate. See details section.
probArray: (Optional) Probability array for uneven sampling. See details section.
seed: (Optional) Random seed to use. By default, it is derived from the current random seed.
idCol: (Optional) Name of the subject column. Must be a valid R name (see validNames) and must not duplicate any entry in names. "SUBJ" by default.
includeIDCol: (Optional) A logical value. Should the subject column be included in the output?

Value

A data frame.

Details

For the probs argument, an additional check ensures that the probabilities given for each variable sum to 1.
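
As a rough illustration of that check, the base R sketch below splits a probs string in the compact hash notation and tests each variable's probabilities against 1 within a small tolerance. The function probs_sum_to_one and the tolerance tol are assumptions made for this example; the package's own check may behave differently, for instance by raising an error instead of returning a logical vector.

```r
# Hypothetical check for illustration; not the package's own validation code.
probs_sum_to_one <- function(probs, tol = 1e-8) {
  # Split "p1,p2#q1,q2,q3" into one character vector per variable
  per_variable <- strsplit(strsplit(probs, "#", fixed = TRUE)[[1]], ",", fixed = TRUE)
  # Convert to numbers and compare each sum with 1, allowing for rounding error
  vapply(per_variable,
         function(p) abs(sum(as.numeric(p)) - 1) < tol,
         logical(1))
}

ok  <- probs_sum_to_one(".1,.9#.5,.4,.1")  # both variables sum to 1
bad <- probs_sum_to_one(".1,.8#.5,.4,.1")  # first variable sums to 0.9
```

A tolerance is used rather than an exact comparison because sums of decimal fractions are not always exactly 1 in floating-point arithmetic.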

Alternatively, a probArray argument can be given. This should be a data frame containing one more column (named "prob") than the number of variables to create. Each variable has a column containing the values it can take, and each row describes one combination of values. The prob column gives the probability of each combination, and its values should sum to 1. See examples.
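
To make the idea of sampling combinations concrete, here is a minimal base R sketch that draws rows of such a data frame with the given joint probabilities. The function sample_from_prob_array and its prob_col parameter are hypothetical names introduced for this illustration; this is not the package's implementation, which may differ in how it validates the input and builds the result.

```r
# Hypothetical sketch of joint sampling; the package's implementation may differ.
sample_from_prob_array <- function(n, prob_array, prob_col = "prob") {
  stopifnot(prob_col %in% names(prob_array))
  p <- prob_array[[prob_col]]
  # The joint probabilities must sum to 1 (within rounding error)
  stopifnot(abs(sum(p) - 1) < 1e-8)
  # Each row is one combination of values; draw row indices with probability p
  rows <- sample(nrow(prob_array), size = n, replace = TRUE, prob = p)
  out <- prob_array[rows, setdiff(names(prob_array), prob_col), drop = FALSE]
  rownames(out) <- NULL
  out
}

pa <- data.frame(F1   = rep(0:1, 3),
                 F2   = rep(1:3, each = 2),
                 prob = c(.1, .2, .1, .2, .2, .2))

set.seed(1)  # fixed seed so the draw is reproducible
draws <- sample_from_prob_array(100, pa)
dim(draws)
```

Sampling whole rows, rather than each column separately, is what allows the variables to be correlated: the probability attached to a row applies to that exact combination of values.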

Examples



  # 10 samples of X and Y where:
  # P[ X = 1 ] = .1
  # P[ X = 2 ] = .9
  # -
  # P[ Y = 7 ] = .5
  # P[ Y = 8 ] = .4
  # P[ Y = 9 ] = .1
  createDiscreteCovariates( 10 ,
                            names = "X, Y",
                            probs = ".1,.9#.5,.4,.1",
                            values = "1,2#7,8,9")
#>    SUBJ X Y
#> 1     1 2 7
#> 2     2 2 8
#> 3     3 2 8
#> 4     4 2 7
#> 5     5 2 8
#> 6     6 2 8
#> 7     7 2 7
#> 8     8 2 7
#> 9     9 1 7
#> 10   10 1 7
  # using the probArray version
  pa <- data.frame( F1 = rep(0:1, 3),
                    F2 = rep(1:3, each = 2),
                    PROB = c(.1,.2,.1,.2,.2,.2) )

  createDiscreteCovariates( 100 , probArray = pa )
#>     SUBJ F1 F2
#> 1      1  1  2
#> 2      2  0  1
#> 3      3  1  1
#> 4      4  1  2
#> 5      5  1  1
#> 6      6  1  1
#> 7      7  0  3
#> 8      8  1  1
#> 9      9  0  3
#> 10    10  0  3
#> 11    11  1  3
#> 12    12  1  1
#> 13    13  1  3
#> 14    14  1  2
#> 15    15  1  1
#> 16    16  0  3
#> 17    17  0  3
#> 18    18  1  3
#> 19    19  1  2
#> 20    20  0  3
#> 21    21  1  1
#> 22    22  1  3
#> 23    23  1  3
#> 24    24  1  3
#> 25    25  1  1
#> 26    26  1  2
#> 27    27  1  3
#> 28    28  1  3
#> 29    29  1  3
#> 30    30  1  2
#> 31    31  1  1
#> 32    32  1  3
#> 33    33  0  1
#> 34    34  1  2
#> 35    35  0  3
#> 36    36  1  2
#> 37    37  0  1
#> 38    38  1  3
#> 39    39  1  1
#> 40    40  1  1
#> 41    41  1  2
#> 42    42  1  2
#> 43    43  1  2
#> 44    44  0  2
#> 45    45  0  3
#> 46    46  0  2
#> 47    47  1  1
#> 48    48  1  2
#> 49    49  1  2
#> 50    50  1  2
#> 51    51  0  2
#> 52    52  1  1
#> 53    53  0  3
#> 54    54  1  1
#> 55    55  0  1
#> 56    56  0  3
#> 57    57  1  3
#> 58    58  0  2
#> 59    59  0  3
#> 60    60  1  1
#> 61    61  0  2
#> 62    62  0  1
#> 63    63  1  1
#> 64    64  1  1
#> 65    65  1  2
#> 66    66  1  2
#> 67    67  1  3
#> 68    68  1  2
#> 69    69  0  3
#> 70    70  1  1
#> 71    71  1  3
#> 72    72  0  3
#> 73    73  1  2
#> 74    74  0  3
#> 75    75  0  3
#> 76    76  0  3
#> 77    77  0  3
#> 78    78  1  2
#> 79    79  0  3
#> 80    80  1  1
#> 81    81  0  3
#> 82    82  0  1
#> 83    83  1  1
#> 84    84  1  1
#> 85    85  0  2
#> 86    86  1  2
#> 87    87  1  1
#> 88    88  1  2
#> 89    89  1  1
#> 90    90  0  1
#> 91    91  0  3
#> 92    92  1  3
#> 93    93  1  1
#> 94    94  1  2
#> 95    95  1  1
#> 96    96  0  1
#> 97    97  1  1
#> 98    98  1  3
#> 99    99  0  3
#> 100  100  1  2
