--------------------------------------------------------------------------------- log: c:\TEACHING\bayeskoop\probitch9.txt log type: text opened on: 15 Nov 2005, 11:02:44 . . ** Program probitch9.do October 2005 for Stata version 9 . . ********** OVERVIEW OF PROBITCH9.DO ********** . . * Stata program based on Koop's MATLAB program chapter9b.m . * Probit example of Koop chapter 9.3 . . * This program does . * (1) Bayesian Gibbs for linear probit regression . * with uninformative prior . * using generated data . . ********** SETUP ********** . . set mem 5m (5120k) . set more off . version 9 . . ********** CREATE DATA AND SUMMARIZE ********** . . * Based on ch9artdatb.m . . * Generate artificial data set for probit illustration in Chapter 9 . * - explanatory variable x ~ N[0,1] . * - dependent variable y = 1(0.5*x + e > 0) . * for e ~ N[0,1] . . set obs 100 obs was 0, now 100 . set seed 1234567 . gen x = invnorm(uniform()) . gen ystar = 0.5*x + invnorm(uniform()) . gen y = (ystar > 0) . gen cons = 1 . sum Variable | Obs Mean Std. Dev. Min Max -------------+-------------------------------------------------------- x | 100 -.05759 1.070252 -2.228272 3.027187 ystar | 100 -.280678 1.254698 -3.067405 2.495417 y | 100 .42 .496045 0 1 cons | 100 1 0 1 1 . . probit y x Iteration 0: log likelihood = -68.0292 Iteration 1: log likelihood = -62.273262 Iteration 2: log likelihood = -62.210099 Iteration 3: log likelihood = -62.210079 Probit regression Number of obs = 100 LR chi2(1) = 11.64 Prob > chi2 = 0.0006 Log likelihood = -62.210079 Pseudo R2 = 0.0855 ------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | .4356952 .1345176 3.24 0.001 .1720455 .6993449 _cons | -.1914617 .1310428 -1.46 0.144 -.4483008 .0653774 ------------------------------------------------------------------------------ . . ********** DO PROBIT BAYESIAN ********** . . mata ------------------------------------------------- mata (type end to exit) ----- : // Create y vector and X matrix from Stat data set using st_view command : st_view(y=., ., "y") // dependent : st_view(X=., ., ("cons", "x")) // regressors : Xnames = ("cons", "x") // used to label output : : // Calculate a few quantities outside the loop for later use : n = rows(X) : k = cols(X) : Xsquare = X'*X : Xtxinv=invsym(Xsquare) : Xtxinvchol = cholesky(Xtxinv) : : v1 = n : : // Specify the number of replications : // To get almost same as probit MLE increase to s0 = 1000 s1 = 10000 : s0 = 10 // number of burnin reps : s1 = 1000 // number of retained reps : s = s0+s1 // total reps : : // store all draws in the following matrices : y1 = J(s0+s1,1,0) : b_all = J(k,s1,0) : probs_all = J(3,s1,0) : : // beta conditional on h is Normal : // h is Gamma : : // choose a starting value for latent data : ystar = y : : // Now do Gibbs loop : : for (irep=1; irep<=s; irep++) { > > // Posterior-step: draw from beta | y* ~ N[bols, (X'X)^-1] > bols = Xtxinv*X'*ystar > b1 = bols > bdraw = b1 + Xtxinvchol*invnormal(uniform(k,1)) > > // Imputation step: make one draw of vector ystar > // where for ith observation ystar_i | y,b is truncated normal > // Right: If y = 1 we need draw from N[0,1] with ystar > -mu > // Left: If y = 0 we need draw from N[0,1] with ystar < -mu > // The method used here is explained at the end of the program > > for (i=1; i<=n; i++) { > mu = X[i,.]*bdraw > if (y[i,1]==0) { > uright = normal(-mu)*uniform(1,1) > ystar[i,1] = mu + invnormal(uright) > } > else { > uleft = normal(-mu) + (1-normal(-mu))*uniform(1,1) > ystar[i,1] = mu + invnormal(uleft) > } > } > > // Store the draws of b after burn-in plus a diagnostic used in Koop > if (irep>s0) { > // after discarding burnin, store all draws > j = irep-s0 > b_all[.,j] = bdraw // These are the posterior draws > // Calculate probability of making choice one for obs with x = +2, 0, > -2 > muterm = J(3,1,0) > muterm[1,1] = -bdraw[1,1] - 2*bdraw[2,1] > muterm[2,1] = -bdraw[1,1] > muterm[3,1] = -bdraw[1,1] + 2*bdraw[2,1] > probs = J(3,1,1) - normal(muterm); > probs_all[.,j] = probs > } > } : // The Gibbs loop has ended : : // OUTPUT : : // Calculate the mean of the draws etc. : // The variance and se assumes independence over draws which is not the cas > e : mean(b_all',1) 1 2 +-------------------------------+ 1 | -.2040995116 .4272618871 | +-------------------------------+ : variance(b_all',1) [symmetric] 1 2 +-----------------------------+ 1 | .0175302669 | 2 | .0000138771 .0178979639 | +-----------------------------+ : se = sqrt(diagonal(variance(b_all',1))) : se 1 +---------------+ 1 | .1324019141 | 2 | .1337832722 | +---------------+ : : // The correlation coeeficients up to lag 5 of the slopes : b2 = b_all'[|6,2 \ s1,2|] : b2lag1 = b_all'[|5,2 \ (s1-1),2|] : b2lag2 = b_all'[|4,2 \ (s1-2),2|] : b2lag3 = b_all'[|3,2 \ (s1-3),2|] : b2lag4 = b_all'[|2,2 \ (s1-4),2|] : b2lag5 = b_all'[|1,2 \ (s1-5),2|] : bslopeandlags = b2,b2lag1,b2lag2,b2lag3,b2lag4,b2lag5 : correlation(bslopeandlags,1) [symmetric] 1 2 3 4 5 +----------------------------------------------------------------------- 1 | 1 2 | .4516274791 1 3 | .2236612051 .450910037 1 4 | .1007102719 .2226282344 .4520865924 1 5 | .0322276428 .1002831305 .2229962429 .4523804908 1 6 | .0741816244 .0318863731 .0991774171 .222742873 .4522707387 +----------------------------------------------------------------------- 6 ---------------+ 6 1 | ---------------+ : : // The last 20 draws of b : last20drawsofb = b_all'[|(s1-19),1 \ (s1),.|] : last20drawsofb 1 2 +-------------------------------+ 1 | -.0459982448 .5934011165 | 2 | .0942317117 .4378678007 | 3 | -.1603931616 .4248903672 | 4 | -.2368542657 .4502867821 | 5 | -.0673048789 .1822288592 | 6 | .1396780197 .4355811575 | 7 | -.1177268524 .2742788854 | 8 | -.2497370933 .4137552133 | 9 | -.3182302181 .4800161821 | 10 | -.3755565479 .1829228889 | 11 | -.2026314112 .305437169 | 12 | -.1086700595 .383761822 | 13 | -.1693578725 .5715437313 | 14 | -.1559423692 .5727243798 | 15 | -.3071964087 .3747980206 | 16 | -.2500946395 .3260624744 | 17 | -.2909057514 .438265513 | 18 | -.0833342123 .5184623841 | 19 | -.4661469804 .1944185675 | 20 | -.2126789044 .3108386159 | +-------------------------------+ : : // The probabilities at various values of x : mean(probs_all',1) 1 2 3 +-------------------------------------------+ 1 | .7333700774 .4198128812 .1551474682 | +-------------------------------------------+ : : end ------------------------------------------------------------------------------- . . * Compare to probit . probit y x Iteration 0: log likelihood = -68.0292 Iteration 1: log likelihood = -62.273262 Iteration 2: log likelihood = -62.210099 Iteration 3: log likelihood = -62.210079 Probit regression Number of obs = 100 LR chi2(1) = 11.64 Prob > chi2 = 0.0006 Log likelihood = -62.210079 Pseudo R2 = 0.0855 ------------------------------------------------------------------------------ y | Coef. Std. Err. z P>|z| [95% Conf. Interval] -------------+---------------------------------------------------------------- x | .4356952 .1345176 3.24 0.001 .1720455 .6993449 _cons | -.1914617 .1310428 -1.46 0.144 -.4483008 .0653774 ------------------------------------------------------------------------------ . . /* Inverse Transformation Method for drawing truncated normals. > > 1. If there was no truncation we just draw a uniform in (0,1) > and then let x = invcdf(u) where invcdf is inverse of the normal cdf > > 2. If there is truncation then we draw a uniform on (a,b) where > a>=0 and b<=1 are determined by the truncation > and then let x = invcdf(u) where invcdf is inverse of the normal cdf > > 3. Specifically suppose x ~ N(mu, s^2) with restriction that left < x < ri > ght > Then the untruncated cdf lies in (lowerprob, upperprob) where > lowerprob = Prob[x = left] = Prob[z = (left-mu)/s] = Phi((left-mu)/s)) > upperprob = Prob[x = right] = Prob[z = (right-mu)/s] = Phi((right-mu)/s > )) > To draw a uniform within lowerprob and upperprob > we transform a (0,1) uniform draw u01 as follows: > u = lowerprob + (upperprob-lowerprob)*u01 > Finally to evaluate invcdf(u) using we use > draw = mu + s*Phiinv(u) where Phiinv is inverse of the standard norma > l cdf > > 4. For left-truncated at 0, left = 0, right = infinity > and lowerprob = Phi(-mu/s) and upperprob = 1 > For right-truncated at 0, left = 0, right = infinity > and lowerprob = 0 and upperprob = 1 > */ . . ********** CLOSE OUTPUT ********** . . * log close . * clear . * exit . end of do-file . exit, clear