SAS中文论坛 (SAS Chinese Forum)

Title: How to fit an ordered probit model

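Author: shiyiming Time: 2004-3-27 22:34
Title: How to fit an ordered probit model
The dependent variable is ordinal with three levels. There are many independent variables: some ordinal, some categorical, and some ordinary continuous variables. I want to run a regression using an ordered probit model. How do I do this?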

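Author: shiyiming Time: 2004-3-28 11:26
Title: Re: How to fit an ordered probit model
1. The probit model is much like the logistic model. In SAS, specify LINK=NORMIT (an alias of LINK=PROBIT) among the options of the MODEL statement in PROC LOGISTIC.

2. I suggest reading the article "多分类有序logistics回归" (Multi-category ordinal logistic regression) in 中国卫生统计 (Chinese Journal of Health Statistics), 2002.12.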
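
As a minimal sketch of the suggestion in point 1, an ordered probit fit with PROC LOGISTIC might look like the following. The data set name `mydata` and the variable names `y`, `grp`, `grade`, and `x1` are hypothetical placeholders, not taken from the thread.

```sas
/* Sketch only: mydata, y, grp, grade, and x1 are hypothetical names. */
proc logistic data=mydata;
   /* categorical and ordinal predictors entered as class effects */
   class grp grade / param=ref;
   /* y has three ordered levels; LINK=PROBIT (alias NORMIT) requests a
      cumulative probit model instead of the default cumulative logit */
   model y = grp grade x1 / link=probit;
run;
```

By default PROC LOGISTIC models the cumulative probabilities of the lower-ordered response levels; the DESCENDING option on the PROC LOGISTIC statement reverses that ordering.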

Author: shiyiming Time: 2004-3-28 18:23
Take a look at William Greene's book, Econometric Analysis (计量经济分析).

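Author: shiyiming Time: 2004-3-28 22:00
Title: Re: How to fit an ordered probit model
[quote="collen":ca61a]1. The probit model is much like the logistic model. In SAS, specify LINK=NORMIT (an alias of LINK=PROBIT) among the options of the MODEL statement in PROC LOGISTIC.

2. I suggest reading the article "多分类有序logistics回归" (Multi-category ordinal logistic regression) in 中国卫生统计 (Chinese Journal of Health Statistics), 2002.12.[/quote:ca61a]

I have now switched to a logistic model.
To fit a "multi-category ordinal logistic regression", is the ordinary PROC LOGISTIC procedure enough? For example:
proc logistic;
class.........;
model.......;
run;
No special option is needed for multi-category ordinal data, right?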

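Author: shiyiming Time: 2004-3-28 23:39
For data with a multi-category dependent variable, the SAS LOGISTIC procedure fits (n-1) regression equations, where n is the number of levels of the dependent variable. Each regression equation yields its own parameter estimates.

Whether the data are ordinal does not seem to matter here, because the results are always given in the form above. The effect contrast between levels can be obtained by comparing two equations (that is, their respective parameter estimates).

This is only my personal view; please point out anything that is wrong.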

Author: shiyiming Time: 2004-3-29 07:10
Title: Sort first
proc sort;
by .........;   /* the dependent variable */
run;

proc logistic;
class.........;
model......... / link=normit;
run;

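Author: shiyiming Time: 2004-3-30 02:58
Please note that:
When PROC LOGISTIC is used for multi-category ordinal regression, the default is the cumulative logit model. You don't need to sort by the dependent variable.

There is a very good book written by a SAS user; I think it is called Logistic Regression Using SAS Software.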

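Author: shiyiming Time: 2004-3-31 23:58
Title: Have a look at Logistic回归模型 (Logistic Regression Models)
It was written by 王济川 (Wang Jichuan) and 郭志刚 (Guo Zhigang).
It mainly discusses methods for analyzing categorical variables, and includes a brief introduction to probit.
In practice, logistic should really be the main approach.
[quote="charles":318bf]Please note that:
When PROC LOGISTIC is used for multi-category ordinal regression, the default is the cumulative logit model. You don't need to sort by the dependent variable.

There is a very good book written by a SAS user; I think it is called Logistic Regression Using SAS Software.[/quote:318bf]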

Author: shiyiming Time: 2004-4-3 20:23
Where can I buy the logistic regression book by Wang Jichuan and Guo Zhigang?

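Author: shiyiming Time: 2004-4-3 21:47
[quote="student":17666]For data with a multi-category dependent variable, the SAS LOGISTIC procedure fits (n-1) regression equations, where n is the number of levels of the dependent variable. Each regression equation yields its own parameter estimates.

Whether the data are ordinal does not seem to matter here, because the results are always given in the form above. The effect contrast between levels can be obtained by comparing two equations (that is, their respective parameter estimates).

This is only my personal view; please point out anything that is wrong.[/quote:17666]
Hi student, what you describe is the case of a multi-category unordered (nominal) dependent variable, where n-1 equations are fitted. For a multi-category ordered dependent variable there is only one equation, but it has n-1 intercepts, which Stata calls cut points. See the chapter on limited dependent variables in William Greene's book.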
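
To make the single-equation structure concrete, a cumulative (ordered) probit model for a response Y with n ordered levels can be written as below. The symbols \alpha_j (intercepts or cut points), \boldsymbol{\beta} (the shared slope vector), and \mathbf{x} (the covariates) are notation introduced here for illustration; sign conventions and intercept parameterizations differ between software packages.

```latex
P(Y \le j \mid \mathbf{x}) = \Phi\!\left(\alpha_j + \mathbf{x}^{\top}\boldsymbol{\beta}\right),
\qquad j = 1, \dots, n-1,
\qquad \alpha_1 < \alpha_2 < \dots < \alpha_{n-1}
```

With n = 3 levels, as in the original question, this gives one slope vector and two intercepts, rather than two separate sets of slopes.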

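Author: shiyiming Time: 2004-4-3 21:52
Title: Re: How to fit an ordered probit model
[quote="forestshen":506d4][quote="collen":506d4]1. The probit model is much like the logistic model. In SAS, specify LINK=NORMIT (an alias of LINK=PROBIT) among the options of the MODEL statement in PROC LOGISTIC.

2. I suggest reading the article "多分类有序logistics回归" (Multi-category ordinal logistic regression) in 中国卫生统计 (Chinese Journal of Health Statistics), 2002.12.[/quote:506d4]

I have now switched to a logistic model.
To fit a "multi-category ordinal logistic regression", is the ordinary PROC LOGISTIC procedure enough? For example:
proc logistic;
class.........;
model.......;
run;
No special option is needed for multi-category ordinal data, right?[/quote:506d4]
Probit and logistic models are similar, but the probit model is more common in econometrics because most researchers consider the error terms in economic data to be normally distributed. The probit model assumes normally distributed errors, whereas the logistic model assumes errors that follow a logistic distribution.
I found an example in the SAS help; see whether it is useful (I could not copy the images over).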
Example 57.2: Multilevel Response
In this example, two preparations, a standard preparation and a test preparation, are each given at several dose levels to groups of insects. The symptoms are recorded for each insect within each group, and two multilevel probit models are fit. Because the natural sort order of the three levels is not the same as the response order, the ORDER=DATA option is specified in the PROC PROBIT statement to get the desired order.

The following statements produce Output 57.2.1:

data multi;
   input Prep $ Dose Symptoms $ N;
   LDose=log10(Dose);
   if Prep='test' then PrepDose=LDose;
   else PrepDose=0;
   datalines;
stand    10    None    33
stand    10    Mild       7
stand    10    Severe    10
stand    20    None    17
stand    20    Mild    13
stand    20    Severe    17
stand    30    None    14
stand    30    Mild       3
stand    30    Severe    28
stand    40    None       9
stand    40    Mild       8
stand    40    Severe    32
test    10    None    44
test    10    Mild       6
test    10    Severe    0
test    20    None    32
test    20    Mild    10
test    20    Severe    12
test    30    None    23
test    30    Mild       7
test    30    Severe    21
test    40    None    16
test    40    Mild       6
test    40    Severe    19
;

proc probit order=data;
   class Prep Symptoms;
   nonpara: model Symptoms=Prep LDose PrepDose / lackfit;
   weight N;
   title 'Probit Models for Symptom Severity';
run;

proc probit order=data;
   class Prep Symptoms;
   parallel: model Symptoms=Prep LDose / lackfit;
   weight N;
   title 'Probit Models for Symptom Severity';
run;

The first model allows for nonparallelism between the dose response curves for the two preparations by inclusion of an interaction between Prep and LDose. The interaction term is labeled PrepDose in the "Analysis of Parameter Estimates" table. The results of this first model indicate that the parameter for the interaction term is not significant, having a Wald chi-square of 0.73. Also, since the first model is a generalization of the second, a likelihood ratio test statistic for this same parameter can be obtained by multiplying the difference in log likelihoods between the two models by 2. The value obtained, 2 ×(-345.94 - (-346.31)), is 0.73. This is in close agreement with the Wald chi-square from the first model. The lack-of-fit test statistics for the two models do not indicate a problem with either fit.
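
The likelihood ratio calculation described above can be reproduced with a short DATA step. This is only a sketch: it uses the two log likelihoods reported in the Model Information tables of Output 57.2.1, the variable names `ll_nonpara`, `ll_parallel`, `lr`, and `p` are made up for illustration, and one degree of freedom is assumed because the two models differ by a single parameter (PrepDose).

```sas
data _null_;
   ll_nonpara  = -345.9401767;   /* log likelihood, model with PrepDose    */
   ll_parallel = -346.306141;    /* log likelihood, model without PrepDose */
   lr = 2 * (ll_nonpara - ll_parallel);   /* likelihood ratio statistic    */
   p  = 1 - probchi(lr, 1);      /* upper-tail chi-square p-value, 1 df    */
   put lr= p=;                   /* write both values to the SAS log       */
run;
```

With the unrounded log likelihoods the statistic is about 0.73, which agrees with the Wald chi-square quoted above.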

Output 57.2.1: Multilevel Response: PROC PROBIT

Probit Models for Symptom Severity

Probit Procedure

Class Level Information
Name      Levels  Values
Prep           2  stand test
Symptoms       3  None Mild Severe

Model Information
Data Set                WORK.MULTI
Dependent Variable      Symptoms
Weight Variable         N
Number of Observations  23
Missing Values          1
Name of Distribution    Normal
Log Likelihood          -345.9401767

Analysis of Parameter Estimates
Parameter    DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept     1    3.8080          0.6252    2.5827       5.0333       37.10      <.0001
Intercept2    1    0.4684          0.0559    0.3589       0.5780       70.19      <.0001
Prep stand    1   -1.2573          0.8190   -2.8624       0.3479        2.36      0.1247
Prep test     0    0.0000          0.0000    0.0000       0.0000           .           .
LDose         1   -2.1512          0.3909   -2.9173      -1.3851       30.29      <.0001
PrepDose      1   -0.5072          0.5945   -1.6724       0.6580        0.73      0.3935


Probit Models for Symptom Severity

Probit Procedure

Class Level Information
Name      Levels  Values
Prep           2  stand test
Symptoms       3  None Mild Severe

Model Information
Data Set                WORK.MULTI
Dependent Variable      Symptoms
Weight Variable         N
Number of Observations  23
Missing Values          1
Name of Distribution    Normal
Log Likelihood          -346.306141

Analysis of Parameter Estimates
Parameter    DF  Estimate  Standard Error  95% Confidence Limits  Chi-Square  Pr > ChiSq
Intercept     1    3.4148          0.4126    2.6061       4.2235       68.50      <.0001
Intercept2    1    0.4678          0.0558    0.3584       0.5772       70.19      <.0001
Prep stand    1   -0.5675          0.1259   -0.8142      -0.3208       20.33      <.0001
Prep test     0    0.0000          0.0000    0.0000       0.0000           .           .
LDose         1   -2.3721          0.2949   -2.9502      -1.7940       64.68      <.0001

The negative coefficient associated with LDose indicates that the probability of having no symptoms (Symptoms='None') or no or mild symptoms (Symptoms='None' or Symptoms='Mild') decreases as LDose increases; that is, the probability of a severe symptom increases with LDose. This association is apparent for both treatment groups.

The negative coefficient associated with the standard treatment group (Prep = stand) indicates that the standard treatment is associated with more severe symptoms across all LDose values.

The following statements use the PREDPPLOT statement to create the plot shown in Output 57.2.2 of the probabilities of the response taking on individual levels as a function of LDose. Since there are two covariates, LDose and Prep, the value of the CLASS variable Prep is fixed at the highest level, test. Although not shown here, the CDFPLOT statement creates similar plots of the cumulative response probabilities, instead of individual response level probabilities.

proc probit data=multi order=data;
   class Prep Symptoms;
   parallel: model Symptoms=Prep LDose / lackfit;
   predpplot var=ldose  level=("None" "Mild" "Severe")
            cfit=blue cframe=ligr inborder noconf ;
   weight N;
   title 'Probit Models for Symptom Severity';
run;

Output 57.2.2: Plot of Predicted Probabilities for the Test Preparation Group


The following statements use the XDATA= data set to create a plot of the predicted probabilities with Prep set to the stand level. The resulting plot is shown in Output 57.2.3.

data xrow;
   input Prep $ Dose Symptoms $ N;
   LDose=log10(Dose);
   datalines;
stand    40    Severe    32
run;

proc probit data=multi order=data xdata=xrow;
   class Prep Symptoms;
   parallel: model Symptoms=Prep LDose / lackfit;
   predpplot var=ldose  level=("None" "Mild" "Severe")
            cfit=blue cframe=ligr inborder noconf ;
   weight N;
   title 'Predicted Probabilities for Standard Preparation';
run;

Output 57.2.3: Plot of Predicted Probabilities for the Standard Preparation Group

Welcome to SAS中文论坛 (http://mysas.net/forum/)