SAS中文论坛

 找回密码
 立即注册

扫一扫,访问微社区

查看: 602|回复: 0
打印 上一主题 下一主题

A Macro for Computing the Mantel-Fleisś Criterion

[复制链接]

49

主题

76

帖子

1462

积分

管理员

Rank: 9Rank: 9Rank: 9

积分
1462
楼主
 楼主| 发表于 2010-11-4 21:20:58 | 只看该作者

A Macro for Computing the Mantel-Fleisś Criterion

From LCChien's blog on blogspot

原文載點:<span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: 13px;"><a href="https://docs.google.com/viewer?url=http://analytics.ncsu.edu/sesug/2009/PO008.Welch.pdf"><!-- m --><a class="postlink" href="http://analytics.ncsu.edu/sesug/2009/PO008.Welch.pdf">http://analytics.ncsu.edu/sesug/2009/PO008.Welch.pdf</a><!-- m --></a></span><br /><br />在作2乘2的卡方分析時,我們通常會看每一格裡面的樣本期望值是不是大於五,如果不大於五的比例超過一半的話,便需要使用 Fisher's exact test 的結果來取代原本的卡方分析結果。但這種判斷方法,在2乘2乘H的表格下並不適用。Mantel 和&nbsp;Fleisś 在 1980 年代發明了一種判斷方法來專門處理這種情況,不過這個方法並沒有內建在 SAS 現存的程序內,因此 Brandon Welch, Jane Eslinger 和 Rob Woolson 在 2009 年的 SESUG 發表了一篇技術文件,提供一個簡便的 macro 供使用者使用。<br /><a name='more'></a><br />有關 Mantel-Fleisś criterion 的理論,可詳見原文。以下就直接切入這個 macro 的使用方法:<br /><code><span class="Apple-style-span" style="color: blue;">%MantelFleiss(In,Strat,Var1,Var2,Count)</span></code><br /><br />這個名為 MantelFleiss 的 macro 只有五個參數,定義如下:<br /><ul><li>In:資料集名稱</li><li>Strat: 層數,亦即2乘2表格的數量(H)</li><li>Var1:行變數</li><li>Var2:列變數</li><li>Count:此變數是加權變數(weight),是非必要的變數。如果省略的話,程式會自動定義為「1」。</li></ul>原始碼如下:<br /><code><span class="Apple-style-span" style="color: blue;">%macro MantelFleiss(In,Strat,Var1,Var2,Count);<br /><br />%*-----------------------------------------------------------;<br />%* &amp;In  - input data set                 ;<br />%* &amp;Strat - stratification variable (e.g., site, race, etc.);<br />%* &amp;Var1 - row variable                  ;<br />%* &amp;Var2 - column variable                 ;<br />%* &amp;Count - optional weight variable            ;<br />%*-----------------------------------------------------------;<br /><br />%*reset output data sets;<br />PROC DATASETS nodetails nolist;<br />delete _counts _expect _mantel:;<br />RUN;<br />QUIT;<br /><br />%*determine number of strata levels;<br />PROC SQL noprint;<br />select count(distinct &amp;strat) into: stratflag from &amp;In;<br />QUIT;<br />%put **Number of strata levels &amp;stratflag;<br /><br />%*get expected counts and totals in each partial table;<br />PROC FREQ data = &amp;In;<br />tables &amp;Strat*&amp;Var1*&amp;Var2 / expected outexpect out = _expect;<br />%if %nrbquote(&amp;Count.) ne %then weight &amp;Count.;;<br />ODS output CrossTabFreqs = _counts;<br />RUN;<br /><br />%*get sum of expected cell counts in cell n11 in each partial table<br />(assign to macro var SUMEXP);<br />DATA _null_; <br />set _expect end = eof;<br />by &amp;Strat;<br />retain sumexp 0;<br />if first.&amp;Strat then do;<br />sumexp = sumexp + expected;<br />end;<br />if eof then call symput('sumexp',put(sumexp,8.4));<br />RUN;<br />%put Sum of expected values in cells nh11: sum(mh11) = %cmpres(&amp;sumexp);<br /><br />PROC SORT data = _counts; <br />by &amp;Strat &amp;Var1 &amp;Var2;<br />RUN;<br /><br /><br />%*Subset to the needed totals output by PROC FREQ<br />number the &amp;Strat, &amp;Var1, and &amp;Var2 for use below<br />0 - totals<br />1 - first &amp;Strat/&amp;Var1/&amp;Var2<br />2 - second &amp;Strat/&amp;Var1/&amp;Var2<br />...;<br /><br />DATA _mantelf1;<br />%*only include marginal totals from ODS;<br />set _counts(where = (_type_ not in ('100' '111'))); <br />by &amp;Strat &amp;Var1 &amp;Var2;<br />retain &amp;Strat._ 0 &amp;Var1._ &amp;Var2._;<br /><br />%*reset for change in strata;<br />if first.&amp;Strat then do;<br />&amp;Var1._ = 0; &amp;Var2._ = 0;<br />end;<br /><br />%*get numeric version of strata variable;<br />if first.&amp;Strat then &amp;Strat._ = &amp;Strat._ + 1;<br /><br />%*define zero as total for any &amp;Var1/&amp;Var2;<br />if missing(&amp;Var1) then &amp;Var1._ = 0; <br />else &amp;Var1._ + 1;<br /><br />if missing(&amp;Var2) then &amp;Var2._ = 0; <br />else &amp;Var2._ + 1;<br /><br />keep &amp;Strat._ &amp;Var1._ &amp;Var2._ &amp;Strat. &amp;Var1. &amp;Var2. frequency;<br />RUN;<br /><br />DATA _mantelf2;<br />set _mantelf1;<br />by &amp;Strat._;<br /><br />retain n11_ n1_1 n1_2;<br /><br />if first.&amp;Strat._ then do;<br />n11_ = 0;<br />n1_1 = 0;<br />n1_2 = 0; <br />end;<br /><br />%*get each total;<br />if &amp;Var1._ = 1 then n11_ = frequency; %*for row  one in the hth strata;<br />if &amp;Var2._ = 1 then n1_1 = frequency; %*for column one in the hth strata;<br />if &amp;Var2._ = 2 then n1_2 = frequency; %*for column two in the hth strata;<br /><br /><br />%*define (nh11)L and (nh11)U for final computation;<br />max_ = max(0,n11_-n1_2);<br />min_ = min(n1_1,n11_);<br /><br />if last.&amp;Strat._;<br /><br />RUN;<br /><br />%*sum acoss min and max to get lower/upper bounds of criterion;<br />DATA _mantelf3(keep = mf_suml mf_sumu mflendpt mfrendpt mf_r);<br />set _mantelf2 end = eof;<br /><br />retain mf_suml mf_sumu;<br /><br />if _n_ = 1 then do;<br />mf_suml = 0; mf_sumu = 0;<br /><br />end;<br /><br />mf_suml = mf_suml + max_;<br />mf_sumu = mf_sumu + min_;<br /><br />if eof then do;<br />mflendpt = &amp;sumexp-mf_suml;<br />mfrendpt = mf_sumu-&amp;sumexp;<br />mf_r   = min(mflendpt,mfrendpt);<br /><br />put "Sum of lower bound (sum[(nh11)L]) = " mf_suml;<br />put "Sum of upper bound (sum[(nh11)U]) = " mf_sumu;<br />put "Left end-point: sum[mh11] - sum[(nh11)L] = " mflendpt;<br />put "Right endpoint: sum[(nh11)U] - sum[mh11] = " mfrendpt;<br />put "Mantel-Fleiss R = min[" mflendpt "," mfrendpt "] = " mf_r;<br /><br />if mf_r&gt;5 then put "Mantel-Fleiss criterion satisfied (since R &gt; 5)";<br />else      put "Mantel-Fleiss criterion NOT satisfied (since R &lt;= 5)";<br />output; <br />end;<br /><br />label<br />mf_suml = "Summation of Maximum (sum[(nh11)L])"<br />mf_sumu = "Summation of Minumum (sum[(nh11)U])"<br />mflendpt = "Left End-Point: sum[mh11] - sum[(nh11)L]"<br />mfrendpt = "Right End-Point: sum[(nh11)U] - sum[mh11]"<br />mf_r   = "R Value: min[sum[mh11] - sum[(nh11)L],sum[(nh11)U] -  sum[mh11]]";<br />RUN;<br />%mend;</span>                               </code><br /><br />範例一:<br />資料檔:<br /><code><span class="Apple-style-span" style="color: blue;"> PROC FORMAT;<br />value trtf<br />1 = 'TreatmentA'<br />2 = 'TreatmentB'<br />;<br /><br />value sitef<br />1 = 'Arkansas'<br />2 = 'Indiana'<br />3 = 'Illinois'<br />;<br /><br />value respf<br />1 = 'Present'<br />2 = 'Absent'<br />;<br />RUN;<br /><br />DATA test;<br />do site = 1 to 3;<br />do treat = 1 to 2;<br />do resp = 1 to 2;<br />input counts @@;<br />output;<br />end;<br />end;<br />end;<br />label site = "Site" treat = "Treatment" resp = "Response";<br />format treat trtf. site sitef. resp respf.;<br />cards;<br />12 5 7 3<br />1 2 5 2 <br />0 9 5 6<br />;<br />RUN;</span>   </code><br /><br />上述程式在表格內呈現的結果如下:<br /><span class="Apple-style-span" style="color: #4c4c4c; font-family: arial, helvetica, sans-serif; font-size: 12px;"><img id="imgThumbnail" src="http://rookery5.aviary.com/storagev12/3432000/3432014_e602_rss.jpg" style="border-bottom-style: none; border-bottom-width: 0px; border-color: initial; border-left-style: none; border-left-width: 0px; border-right-style: none; border-right-width: 0px; border-top-style: none; border-top-width: 0px; border-width: initial;" /></span><br /><br />接著執行 macro:<br /><code><span class="Apple-style-span" style="color: blue;">%MantelFleiss(test,site,treat,resp,counts) ;</span></code><br /><br />要特別注意的一點是,這個 macro 的結果並不是呈現在 output 視窗裡面,而是直接呈現在 log 視窗裡面,所以送出這行程式碼後,便可以直接在 log 視窗上看到結果,如下所示:<br /><code><span class="Apple-style-span" style="color: blue;">Sum of expected values in cells nh11: sum(mh11) = 20.5130<br />Sum of lower bound (sum[(nh11)L]) = 9<br />Sum of upper bound (sum[(nh11)U]) = 25<br />Left end-point: sum[mh11] - sum[(nh11)L] = 11.513<br />Right endpoint: sum[(nh11)U] - sum[mh11] = 4.487<br />Mantel-Fleiss R = min[11.513 ,4.487 ] = 4.487<br />Mantel-Fleiss criterion NOT satisfied (since R &lt;= 5)</span></code><br /><br />原文作者很好心地幫使用者做出該資料是否有滿足 Mantel-Fleiss criterion 的判斷句,所以就不用管上面的數據所代表的意義了,但若在寫 paper 時,可能還是要把一些數據呈現出來,但其實這些數據並沒有解釋上的明顯意義。<br /><br />這個例子並沒有滿足 Mantel-Fleiss criterion,也就是說我們得使用 Fisher's exact test 來取代原本的卡方檢定。<br /><br />我們再來看另一個有滿足 Mantel-Fleiss criterion 的例子。<br /><br />範例二:<br /><br />我們直接來看資料的表格結構。<br /><span class="Apple-style-span" style="color: #4c4c4c; font-family: arial, helvetica, sans-serif; font-size: 12px;"><img id="imgThumbnail" src="http://rookeryiis2.aviary.com/storage/3432000/3432079_0028_rss.jpg" style="border-bottom-style: none; border-bottom-width: 0px; border-color: initial; border-left-style: none; border-left-width: 0px; border-right-style: none; border-right-width: 0px; border-top-style: none; border-top-width: 0px; border-width: initial;" /></span><br />所有參數設定都和範例一同樣,只是數據改變而已。執行 macro 後的結果如下:<br /><code><span class="Apple-style-span" style="color: blue;">Sum of expected values in cells nh11: sum(mh11) = 19.3630<br />Sum of lower bound (sum[(nh11)L]) = 9<br />Sum of upper bound (sum[(nh11)U]) = 30<br />Left end-point: sum[mh11] - sum[(nh11)L] = 10.363<br />Right endpoint: sum[(nh11)U] - sum[mh11] = 10.637<br />Mantel-Fleiss R = min[10.363 ,10.637 ] = 10.363<br />Mantel-Fleiss criterion satisfied (since R &gt; 5)</span></code><br /><br />從最後一句話得知這份資料是有滿足 Mantel-Fleiss criterion 的,所以我們便可以直接使用卡方檢定的結果。<br /><br /><b>CONTACT INFORMATION</b><br />Brandon Welch<br />Rho®, Inc.<br />Statistical Programming<br />6330 Quadrangle Dr., Ste. 500<br />Chapel Hill, NC 27517<br />Email: <!-- e --><a href="mailto:Brandon_Welch@rhoworld.com">Brandon_Welch@rhoworld.com</a><!-- e --><br />Phone: 919-595-6339<div class="blogger-post-footer"><img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6268919072942670865-1667469089538034965?l=sugiclub.blogspot.com' alt='' /></div>
回复 支持 反对

使用道具 举报

您需要登录后才可以回帖 登录 | 立即注册

本版积分规则

QQ|小黑屋|手机版|Archiver|SAS中文论坛  

GMT+8, 2026-2-3 18:26 , Processed in 0.084607 second(s), 20 queries .

Powered by Discuz! X3.2

© 2001-2013 Comsenz Inc.

快速回复 返回顶部 返回列表