I am generating a data set. First, I want to randomly draw a number for each observation from a discrete distribution and fill in var1 with these numbers. Next, I want to draw another number from the distribution for each row, but the catch is that the number in var1 for that observation is not eligible to be drawn anymore. I want to repeat this a relatively large number of times.

To make this make more sense, suppose I start with:
    id
    1
    2
    3
    ...
    999
    1000
Suppose the distribution I have is ["a", "b", "c", "d", "e"], whose values happen with probability [.2, .3, .1, .15, .25].
I first randomly draw from the distribution to fill in var1. Suppose the result of this is:
    id    var1
    1     e
    2     e
    3     c
    ...
    999   b
    1000  a
Now e is not eligible to be drawn for observations 1 and 2. c, b, and a are ineligible for observations 3, 999, and 1000, respectively.
After all the columns are filled in, we may end up with this:
    id    var1  var2  var3  var4  var5
    1     e     c     b     d
    2     e     b     d     c
    3     c     b     e     d
    ...
    999   b     d     c     e
    1000  a     e     b     c     d
I am not sure how to approach this in Stata. One way to fill in var1 is something like:
    gen random1 = runiform()
    * var1 must exist before -replace- can fill it
    gen str1 var1 = ""
    replace var1 = "a" if random1<.2
    replace var1 = "b" if random1>=.2 & random1<.5
    * etc.
Note that sticking to the (scaled) probabilities after creating var1 is desirable, but not required for me.
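For the first step described above (filling in var1), the chain of replace statements can also be collapsed into one expression. The following is only a minimal sketch: the seed, the number of observations, and the helper variable u are assumptions made for illustration, and the cutoffs are the cumulative sums of the probabilities given above (.2, .5, .6, .75, 1).

```stata
clear
set seed 12345                 // arbitrary seed, only for reproducibility
set obs 1000
gen id = _n

* one uniform draw per observation (u is a hypothetical helper variable)
gen double u = runiform()

* map the draw to a value using the cumulative cutoffs .2, .5, .6, .75, 1
gen str1 var1 = cond(u < .2,  "a", ///
                cond(u < .5,  "b", ///
                cond(u < .6,  "c", ///
                cond(u < .75, "d", "e"))))

drop u
tabulate var1
```

The tabulate output should show shares roughly in line with the stated probabilities. This covers only the unrestricted first draw; it does nothing to exclude already drawn values in later columns.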
Here's a solution that works in long form to select from the distribution. Once values are selected, they are flagged as done, and the next selection is made from the groups that contain the remaining values. The probabilities are scaled at each pass.
    version 14
    set seed 3241234

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte ip str1 y double p
    1 "a" .2
    2 "b" .3
    3 "c" .1
    4 "d" .15
    5 "e" .25
    end
    local nval = _N

    * the following should be true
    isid y

    expand 1000
    bysort y: gen id = _n
    sort id ip

    gen done = 0
    forvalues i = 1/`nval' {

        // scale probabilities
        bysort id done (ip): gen double ptot = sum(p)   // running sum
        by id done: gen double phigh = sum(p / ptot[_N])
        by id done: gen double plow = cond(_n == 1, 0, phigh[_n-1])

        // a random number in the range of (0,1) for the group
        bysort id done (ip): gen double x = runiform()

        // pick from the not done group; choose the first x to represent the group
        by id done: gen pick = !done & inrange(x[1], plow, phigh)

        // put the picked obs at the end and create the new var
        bysort id (pick ip): gen v`i' = y[_N]

        // they are done for the obs that was picked
        bysort id: replace done = 1 if _n == _N

        drop x pick ptot phigh plow
    }

    bysort id: keep if _n == 1
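To make the rescaling step concrete, here is a small sketch of what one pass does for a single id, assuming "e" was the value drawn first. The variable names ptot and pscaled are chosen for illustration only; the code above keeps the equivalent quantities in ptot, plow and phigh.

```stata
clear
input str1 y double p
"a" .2
"b" .3
"c" .1
"d" .15
"e" .25
end

* assume "e" was drawn in the first pass, so it is no longer eligible
drop if y == "e"

* rescale the remaining probabilities so that they sum to 1
egen double ptot = total(p)        // .75 for the four remaining values
gen double pscaled = p / ptot      // a: .2667, b: .4, c: .1333, d: .2

list y p pscaled
```

The relative odds among the remaining values are unchanged (b is still three times as likely as c), which is what sticking to the scaled probabilities amounts to.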