Posted on 2011-5-14 07:18:17
Creating High-Quality Scatter Plots: An Old Story Told by th
From LCChien's blog on blogspot
Original paper: <a class="postlink" href="http://support.sas.com/resources/papers/proceedings10/057-2010.pdf">http://support.sas.com/resources/papers/proceedings10/057-2010.pdf</a><br /><br />
The scatter plot is one of the most basic statistical charts for showing how data are distributed, but scatter plots drawn with PROC GPLOT are not very attractive. A Chinese student in the Department of Mathematics at the University of Cincinnati wrote this technical paper, which uses PROC SGSCATTER to take the quality of a plain scatter plot to another level.<br /><br />
The data come from sashelp.cars, a data set that ships with SAS, but the author uses only a subset of it:<br />
<pre><code>ods html style=harvest;
/* keep only four makes from the shipped sample data */
data cars;
  set sashelp.cars;
  where make in ('Jeep' 'Chevrolet' 'Ford' 'Chrysler');
run;</code></pre>
The variables are defined as follows:<br />
<div style="text-align: center;"><img height="153" src="http://farm6.static.flickr.com/5190/5665240480_7c67e59022_b.jpg" width="400" /></div><br />
All the examples that follow are sent through ODS as HTML output using the harvest style. If you do not like harvest, there are three other styles to choose from:<br /><br />
<img height="531" src="http://farm6.static.flickr.com/5266/5665250128_c395797cd6_b.jpg" width="640" /><br /><br />
<b>Example 1</b><br />
With PROC GPLOT, a request of the form Y*(X1 X2) produced two separate graphs. PROC SGSCATTER automatically combines them into a single figure.<br />
<pre><code>proc sgscatter data=cars;
  plot invoice*(weight length);
run;</code></pre>
<img src="http://farm6.static.flickr.com/5185/5665282390_f5b5b396c3_z.jpg" /><br /><br />
<b>Example 2</b><br />
To make the panels share a common axis, switch to the COMPARE statement and supply y= and x=. This leaves a little more visible area for the scatter plots.<br />
<pre><code>proc sgscatter data=cars;
  compare <span class="Apple-style-span" style="color: red;">y=</span>invoice <span class="Apple-style-span" style="color: red;">x=</span>(weight length);
run;</code></pre>
<img src="http://farm6.static.flickr.com/5021/5664722017_c8bd529fb4_z.jpg" /><br /><br />
<b>Example 3</b><br />
The scatter plot matrix, covered in an earlier post, is produced in PROC SGSCATTER with the MATRIX statement.<br />
<pre><code>proc sgscatter data=cars;
  <span class="Apple-style-span" style="color: red;">matrix </span>invoice weight length;
run;</code></pre>
<img src="http://farm6.static.flickr.com/5225/5665288368_7a0c888aca_z.jpg" /><br /><br />
<b>Example 4</b><br />
Use rows= and columns= to set the number of rows and columns in the combined figure.<br />
<pre><code>proc sgscatter data=cars;
  plot invoice*(weight length) / rows=2 columns=1;
run;</code></pre>
<img src="http://farm6.static.flickr.com/5221/5664741877_c281da23a9_z.jpg" /><br /><br />
<b>Example 5</b><br />
To see where the different car makers fall, add a group= option to the PLOT statement and give it the maker variable, make. PROC SGSCATTER then groups the points by that variable and automatically assigns each maker its own color and marker symbol.<br />
<pre><code>proc sgscatter data=cars;
  plot MPG_city*weight / <span class="Apple-style-span" style="color: red;">group=make</span>;
  where make in ('Ford' 'Chrysler' 'Chevrolet');
  title 'Scatter Plot by Make';
run;</code></pre>
<img src="http://farm6.static.flickr.com/5068/5665307742_c84f3530c1_z.jpg" /><br /><br />
<b>Example 6</b><br />
To label the points with the maker names, use the datalabel= option.<br />
<pre><code>/* average price and mileage per make within each origin */
proc sql;
  create table cars2 as
  select origin, make, mean(MSRP) as MSRP,
         mean(MPG_city) as MPG_city,
         mean(MPG_highway) as MPG_highway
  from sashelp.cars
  group by origin, make
  order by origin, make;
quit;
proc sgscatter data=cars2;
  plot MSRP*MPG_highway / <span class="Apple-style-span" style="color: red;">datalabel=make</span> group=origin grid;
  title 'Averaged MSRP vs. Highway MPG for Car Makers by Origin';
  format MSRP dollar6.0;
  label MSRP='Manufacturer Suggested Retail Price' MPG_highway='Highway MPG';
run;</code></pre>
<img src="http://farm6.static.flickr.com/5103/5664742869_3953e069d8_z.jpg" /><br /><br />
<b>Example 7</b><br />
To add a regression line with confidence limits, use the reg= option. Here degree=2 requests a quadratic fit, clm adds confidence limits for the mean, and nogroup fits one curve to all the points instead of one per group.<br />
<pre><code>proc sgscatter data=cars2;
  plot MSRP*MPG_highway / datalabel=make group=origin grid <span class="Apple-style-span" style="color: red;">reg=(degree=2 clm nogroup)</span>;
  title 'Averaged MSRP vs. Highway MPG for Car Makers by Origin';
  title2 '-- with quadratic regression fitting and conf. intervals --';
  format MSRP dollar6.0;
  label MSRP='Manufacturer Suggested Retail Price' MPG_highway='Highway MPG';
run;</code></pre>
<img src="http://farm6.static.flickr.com/5189/5664743243_8b70a18a20_z.jpg" /><br /><br />
<b>Example 8</b><br />
To draw a 95% prediction ellipse, use the ellipse= option.<br />
<pre><code>proc sgscatter data=cars2;
  compare y=MSRP x=(MPG_highway MPG_city) / group=origin <span class="Apple-style-span" style="color: red;">ellipse=(alpha=0.05 type=predicted)</span>;
  title 'Averaged MSRP vs. Highway/City MPG for car makers by Origin';
  title2 '-- with 95% prediction ellipse --';
  format MSRP dollar6.0;
  label MSRP='Manufacturer Suggested Retail Price'
        MPG_highway='Highway MPG' MPG_city='CITY MPG';
run;</code></pre>
<img src="http://farm6.static.flickr.com/5223/5665309356_af429163f5_z.jpg" /><br /><br />
<b>Example 9</b><br />
Back to the scatter plot matrix: to draw each variable's histogram and a fitted normal curve along the diagonal, use the diagonal= option.<br />
<pre><code>title 'Scatter Plot Matrix with Histograms and Normal Fitting Curves';
proc sgscatter data=cars;
  matrix invoice weight length / <span class="Apple-style-span" style="color: red;">diagonal=(histogram normal)</span>;
run; quit;</code></pre>
<img src="http://farm6.static.flickr.com/5143/5664743805_fbce6809a5_z.jpg" /><br /><br />
<b>Example 10</b><br />
Finally, use ODS GRAPHICS to set the parameters of the graph itself, and add a path (gpath=) to the ODS HTML statement so that the generated image is saved to a location of your choice.<br />
<pre><code><span class="Apple-style-span" style="color: red;">ods html gpath='C:\' style=harvest; </span>
<span class="Apple-style-span" style="color: red;">ods graphics / reset=all width=12in height=6in border=off imagename='example' imagefmt=png;</span>
proc sgscatter data=cars2;
  plot MSRP*(MPG_highway MPG_city)
       / datalabel=make group=origin
         grid reg=(degree=2 clm nogroup);
  title 'Averaged MSRP vs. Highway/City MPG for Car Makers by Origin';
  title2 '-- with quadratic regression fitting and conf. intervals --';
  format MSRP dollar6.0;
  label MSRP='Manufacturer Suggested Retail Price'
        MPG_highway='Highway MPG'
        MPG_city='City MPG';
run;
ods html close;</code></pre>
A few important ODS GRAPHICS options are worth describing here:<br /><br />
<ul><li>WIDTH=, HEIGHT=: set the width and height of the graph</li><li>IMAGENAME=, IMAGEFMT=: set the name and file format of the image</li><li>BORDER=ON|OFF: controls whether a border is drawn around the graph</li><li>RESET=ALL: resets all ODS GRAPHICS options to their default values</li></ul>
<img src="http://farm6.static.flickr.com/5101/5664744051_6e299a1c76_z.jpg" /><br /><br /><br />
<b>CONTACT INFORMATION</b><br />
Your comments and questions are valued and encouraged. Contact the author at:<br />
Xiangxiang Meng<br />
Department of Mathematical Science, University of Cincinnati<br />
<a href="mailto:mengxa@mail.uc.edu">mengxa@mail.uc.edu</a>
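<br /><br />
The grouping option from Example 5 and the layout options from Example 4 can be combined in a single PLOT statement. The following is only a minimal sketch and is not taken from the paper: the choice of variables, the 1-by-2 layout, and the title text are assumptions made for illustration. It uses only options that already appear in the examples above (group=, rows=, columns=, grid).<br />
<pre><code>/* Illustrative sketch, not from the paper: variables, layout, and title are assumptions */
title 'City and Highway MPG vs. Weight by Make';
proc sgscatter data=sashelp.cars;
  /* same three makes as in Example 5 */
  where make in ('Ford' 'Chrysler' 'Chevrolet');
  /* two y variables against one x, side by side, colored by make */
  plot (MPG_city MPG_highway)*weight / group=make rows=1 columns=2 grid;
run;
title; /* clear the title so it does not carry over to later output */</code></pre>
Reading from sashelp.cars directly, with the WHERE statement inside the procedure, keeps the sketch independent of the cars and cars2 data sets built earlier.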