SAS中文论坛

 找回密码
 立即注册

扫一扫,访问微社区

查看: 735|回复: 0
打印 上一主题 下一主题

PROC MIXED: Macro to assess fixed and random effects for sig

[复制链接]

49

主题

76

帖子

1462

积分

管理员

Rank: 9Rank: 9Rank: 9

积分
1462
楼主
 楼主| 发表于 2010-11-4 21:21:02 | 只看该作者

PROC MIXED: Macro to assess fixed and random effects for sig

From LCChien's blog on blogspot

原文載點:<span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: 13px;"><a href="https://docs.google.com/viewer?url=http://analytics.ncsu.edu/sesug/2009/SC013.Bardenheier.pdf"><!-- m --><a class="postlink" href="http://analytics.ncsu.edu/sesug/2009/SC013.Bardenheier.pdf">http://analytics.ncsu.edu/sesug/2009/SC ... nheier.pdf</a><!-- m --></a></span><br /><br />若要從兩個nested mixed model挑選較佳的模式,比較專業的人會推薦使用Likelihood Ratio Test(以下簡稱LRT)來處理這類的問題。PROC MIXED程序並沒有語法來進行這個檢定,但這個檢定需要使用到的參數,卻全部都可以從PROC MIXED程序的輸出報表內找到。懂得運作的使用者可以從報表拿擷取相關參數,然後自己算一算再查個表大概就知道結果了。我自己曾經在準備博士班資格考時寫了一個簡易的macro來做LRT,但沒想到真的有人把這個觀念發表出來。我稍微看了一下這份技術文件內的macro寫法,與我當年寫的大同小異,但作者有將輸出報表格式設計的比較好看一點,所以我就來介紹一下這個方便進行LRT的macro。<br /><br /><a name='more'></a><br />假設兩個nested mixed model分別稱為reduced model和full model,通常reduced model會被設定在虛無假設H0下,full model會被設定在對立假設H1下。而整個LRT會從這兩個模式各取出兩個數據出來,一個是他們的自由度,另一個是他們-2log likelihood function的值。這兩個數據全部都可以從PROC MIXED的報表內找到。我以PROC MIXED線上手冊所附的一個範例為例,請參照這個網頁:<br /><br /><a href="http://www.tau.ac.il/cc/pages/docs/sas8/stat/chap41/sect30.htm"><!-- m --><a class="postlink" href="http://www.tau.ac.il/cc/pages/docs/sas8/stat/chap41/sect30.htm">http://www.tau.ac.il/cc/pages/docs/sas8 ... sect30.htm</a><!-- m --></a><br /><br />這兩個數據可以在這個地方找到,圖示如下(紅色箭頭處):<br /><span class="Apple-style-span" style="color: #4c4c4c; font-family: arial, helvetica, sans-serif; font-size: 12px;"><img id="imgThumbnail" src="http://rookeryiis2.aviary.com/storage/3472500/3472607_252c_rss.jpg" style="border-bottom-style: none; border-bottom-width: 0px; border-color: initial; border-left-style: none; border-left-width: 0px; border-right-style: none; border-right-width: 0px; border-top-style: none; border-top-width: 0px; border-width: initial;" /></span><br /><span class="Apple-style-span" style="color: #4c4c4c; font-family: arial, helvetica, sans-serif; font-size: 12px;"><br /></span><br />至於公式,引用 wiki 的資料:<br /><br /><span class="Apple-style-span" style="font-family: sans-serif; font-size: 13px; line-height: 19px;"><img alt="\begin{align}D &amp; = -2(\ln(\text{likelihood for null model}) - \ln(\text{likelihood for alternative model})) \\&amp; = -2\ln\left( \frac{\text{likelihood for null model}}{\text{likelihood for alternative model}} \right).\end{align}" class="tex" src="http://upload.wikimedia.org/math/a/c/6/ac612f42c3987cbac42333ad2c338013.png" style="border-bottom-style: none; border-color: initial; border-left-style: none; border-right-style: none; border-top-style: none; border-width: initial; vertical-align: middle;" /></span><br /><br />白話一點,就是把 reduced model 的 -2 log likelihood function 的值減掉 full model 的 -2 log likelihood function 的值,就可以算出上面那個 D。然後兩個模式的自由度數量的差異套進卡方分配裡面,就可以算出 p-value 了。如果 p-value &lt; 0.05,就表示代表 reduced model 的 H0 被拒絕了,也就是說 full model 比 reduced model 好。反之亦然。<br /><br />這種計算其實根本不需要使用到電腦的,但懶人永遠可以找藉口,所以原文作者還是花了將近兩頁的篇幅寫了這個 macro 來服務大眾。<br /><br /><code><span class="Apple-style-span" style="color: blue;">%macro LRMixtureTest(fullmodel=,redmodel=,DFfull=,DFred=);&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">/*&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">This macro calculates the Likelihood Ratio Test for fixed effects in a&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">mixed model&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">See Singer Applied Longitudinal Data Analysis pp 116-120.&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">and also calculates the mixture method pvalue for random effects in a&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">mixed model -see KKNM chapter 26.&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">The macro argument &amp;fullmodel refers to the output dataset produced by&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">running the full model in proc mixed and using the statement:&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">&nbsp;&nbsp; &nbsp;ods output FitStatistics=fm;&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">The macro argument &amp;DFfull refers to the output dataset produced by&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">running the full model in proc mixed and adding to the statement:&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">&nbsp;&nbsp; &nbsp;ods output FitStatistics=fm SolutionF=SFfm; The number of&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">parameters being tested (ie, fixed effects) determine the degrees of&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">freedom in the LRT and come from this dataset.&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">The argument &amp;redmodel refers to the output dataset produced by running&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">the reduced model in proc mixed using the statement:&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">&nbsp;&nbsp; ods output FitStatistics=rm &nbsp;SolutionF=SFrm ;&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">Maximum Likelihood should be used when comparing fixed effects because&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">it maximizes the likelihood of the sample data. Using the Restricted&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">Maximum likelihood should only be used when assessing a random term&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">(ie, the fixed effects should be the same in both models) &nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">because in RML we maximize the likelihood of the residuals. The degrees&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">of freedom would not be correct if the number of fixed effects were&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">different in each model when using RML -which is the default in PROC&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">MIXED.&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">*/&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">data &amp;fullmodel; set &amp;fullmodel;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;if descr='-2 Res Log Likelihood' then chisqfullRML=value; &nbsp;</span><span class="Apple-style-span" style="color: #38761d;">*RML;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;else if descr='-2 Log Likelihood' then chisqfullFML=value; </span><span class="Apple-style-span" style="color: #38761d;">*FML;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">run;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">data &amp;redmodel; set &amp;redmodel;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp;if descr='-2 Res Log Likelihood' then chisqredRML=value; </span><span class="Apple-style-span" style="color: #38761d;">*RML;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp;else &nbsp;if descr='-2 Log Likelihood' then chisqredFML=value; </span><span class="Apple-style-span" style="color: #38761d;">*FML;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">run;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">proc sql;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">CREATE TABLE flDF AS SELECT &nbsp;effect, count(*) AS fullDF from&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">work.&amp;DFfull; quit; </span><span class="Apple-style-span" style="color: #38761d;">*Count number of effects to get degrees of freedom;&nbsp;</span><br /><br /><span class="Apple-style-span" style="color: blue;">proc sql;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">create table rdDF as select &nbsp;effect, count(*) as redDF from&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">work.&amp;DFred; quit;&nbsp;</span><br /><br /><span class="Apple-style-span" style="color: blue;">data degfree (drop=effect);&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp;merge &nbsp;work.flDF work.rdDF;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">if _n_=1 then &nbsp;LRTDF= fullDF &nbsp;- redDF; run;&nbsp;</span><br /><br /><span class="Apple-style-span" style="color: blue;">data likelihood;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;merge &amp;fullmodel &amp;redmodel degfree;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;testintRML=abs((chisqredRML)-(chisqfullRML)); </span><span class="Apple-style-span" style="color: #38761d;">**Models can yield&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">negative LLs, those that are smaller in absolute value -ie, closer to 0&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">fit better (pg 116-117, Singer); &nbsp;</span><br /><br /><span class="Apple-style-span" style="color: blue;">7&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">testintFML=abs((chisqredFML)-(chisqfullFML)); </span><span class="Apple-style-span" style="color: #38761d;">**Models can yield&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">negative LLs, those that are smaller in absolute value -ie, closer to 0&nbsp;</span><br /><span class="Apple-style-span" style="color: #38761d;">fit better (pg 116-117, Singer);&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;pvaluemixture=(( .5*(1-probchi(testintRML,2)) + .5*(1-</span><br /><span class="Apple-style-span" style="color: blue;">probchi(testintRML,1)) )); &nbsp; &nbsp;</span><span class="Apple-style-span" style="color: #38761d;">*for random terms;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp;pvalueLRT=(1-probchi(testintFML,LRTDF)); &nbsp; &nbsp; &nbsp;</span><span class="Apple-style-span" style="color: #38761d;">*For fixed terms;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">proc print data=likelihood split='*' noobs;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; var &nbsp;testintFML &nbsp;pvalueLRT;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;format &nbsp;pvalueLRT 6.4;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;where testintFML ne .;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; label testintFML='likelihood ratio*test statistic*&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; comparing*reduced model to full model'&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; pvalueLRT='p-value LRT';&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; &nbsp; &nbsp;title "Likelihood Ratio Test for fixed effects"; run;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">proc print data=likelihood split='*' noobs;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; var &nbsp;testintRML &nbsp;pvaluemixture;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;format pvaluemixture &nbsp;6.4;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;where &nbsp;testintRML ne . ;&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; label testintRML='likelihood ratio*test statistic*&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; comparing*reduced model to full model'&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; pvaluemixture='mixture method p-value';&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;&nbsp; &nbsp; &nbsp;title "Mixture method Test for random effects";&nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">&nbsp;run; &nbsp;</span><br /><span class="Apple-style-span" style="color: blue;">%mend LRMixtureTest;&nbsp;</span></code><br /><br />這個 macro 裡面綠色的部分完全都是作者的註釋和說明,只有藍色的部分才是真正 SAS 會去執行的程式碼。其中只有四個參數要說明:<br /><br /><ul><li>fullmodel:full model 的 -2 log likelihood function 值</li><li>redmodel:reduced model 的&nbsp;-2 log likelihood function 值</li><li>DFfull:full model 的自由度</li><li>DFred:reduced model 的自由度</li></ul><br />四個參數設定就和上一段的敘述相同。全部都是必要的參數,缺一不可。因此,只要呼叫&nbsp;%macro LRMixtureTest 再套上四個參數值,電腦就會幫你做 LRT。我們來看原文內的一個範例。<br /><br />範例模式:<br /><br /><img alt="doodle.png" src="http://rookeryiis1.aviary.com/storage/3472500/3472730_b795_625x625.jpg" /><br /><br />這模式只使用一個因變數叫做 PCTBLK,但他同時是固定效應也是隨機效應。假設這是個 full model,而 reduced model 就設定為拿掉 PCTBLK 固定效應的模式,也就是拿掉上面劃紅線的那一段。<br /><br />因此,先跑兩個程式把那四個要用到的參數算出來:<br /><br /><code><span class="Apple-style-span" style="color: blue;">%include '\\cdc\private\mixture method pvalue macro1.sas';<br />proc mixed data=bhb7.allUS2levge10blk method=ml covtest empirical <br />noclprint ;<br />class &nbsp;state_id;<br />model diffcov=pctblk/ ddfm=contain s;<br />random intercept pctblk /subject=state_id type=ar(1) s ;<br />ods output FitStatistics=fm &nbsp;SolutionF=SFfm ;<br />run;<br /><br />proc mixed data=bhb7.allUS2levge10blk method=ml covtest empirical<br />noclprint;<br />class &nbsp;state_id;<br />model diffcov=/ddfm=contain &nbsp; s;<br />random intercept &nbsp;pctblk /subject=state_id type=ar(1) s;<br />ods output FitStatistics=rm SolutionF=SFrm ;<br />run;</span> </code><br /><br />跑完後,兩個模式的 -2 log likelihood function 值和自由度會被 ods output 輸出到不同的資料裡面,接著使用那個 macro 去把從這四個資料裡面抓數值出來算:<br /><br /><code><span class="Apple-style-span" style="color: blue;">%LRMixtureTest(fullmodel=fm,redmodel=rm,DFfull=SFfm,DFred=SFrm); </span></code><br /><br />最後這個 macro 就會輸出下面這個報表給你:<br /><span class="Apple-style-span" style="color: #4c4c4c; font-family: arial, helvetica, sans-serif; font-size: 12px;"><img id="imgThumbnail" src="http://rookeryiis1.aviary.com/storage/3472500/3472768_226b_rss.jpg" style="border-bottom-style: none; border-bottom-width: 0px; border-color: initial; border-left-style: none; border-left-width: 0px; border-right-style: none; border-right-width: 0px; border-top-style: none; border-top-width: 0px; border-width: initial;" /></span><br /><br />結果顯示 p-value &lt; 0.05,所以 full model 比 reduced model 好。<br /><br />從這個範例中,特別要強調一點是,如果兩個 nested mixed model 差異只有在固定效應時,一開始那兩個 PROC MIXED 執行時,後面的 method option 必須強制設定為 ML,因為如果沒有特別去設定的話,PROC MIXED 預設的參數估計方法是 REML,但這只有在兩個模式的差異是在隨機效應或者是 covariance structure 時才能使用。原文中的第一個範例就是用 REML 再估計參數,因為兩個模式的差別只有在一個 random intercept 而已。<br /><br /><br /><b><span class="Apple-style-span" style="font-size: x-large;">Contact Information:&nbsp;</span></b><br />Barbara Bardenheier, MPH, MA<br />Centers for Disease Control and Prevention<br />1600 Clifton Rd, MS E-52<br />Atlanta, GA 30333<br />Tel : (404) 639-8789<br />Fax : (404) 639-8614<br />Email : <!-- e --><a href="mailto:bfb7@cdc.gov">bfb7@cdc.gov</a><!-- e --><div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6268919072942670865-6835902750623542080?l=sugiclub.blogspot.com' alt='' /></div>
回复 支持 反对

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

QQ|小黑屋|手机版|Archiver|SAS中文论坛  

GMT+8, 2026-2-3 18:26 , Processed in 0.087865 second(s), 20 queries .

Powered by Discuz! X3.2

© 2001-2013 Comsenz Inc.

快速回复 返回顶部 返回列表