I am now switching to a logistic model.
To fit an ordinal (ordered multi-category) logistic regression, is the ordinary PROC LOGISTIC procedure enough? For example:
proc logistic;
class.........;
model.......;
run;
No special option is needed for ordered multi-category data, right?[/quote:506d4]
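The probit and logistic models are similar, but probit is the more common choice in econometrics, because most researchers assume that the error term in economic data is normally distributed. The probit model assumes exactly that, a normally distributed error, whereas the logistic model assumes the error follows a logistic distribution.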
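In latent-variable terms the difference can be written out as below. This is the standard textbook formulation for a binary response, not something specific to SAS, and the symbols y*, x, β and ε are notation assumed for this sketch:

```latex
\begin{align*}
y^{*} &= x^{\top}\beta + \varepsilon, \qquad y = 1 \text{ if } y^{*} > 0, \text{ otherwise } y = 0 \\
\text{probit:} \quad \varepsilon &\sim N(0,1)
  \;\Rightarrow\; \Pr(y = 1 \mid x) = \Phi(x^{\top}\beta) \\
\text{logit:} \quad \varepsilon &\sim \text{standard logistic}
  \;\Rightarrow\; \Pr(y = 1 \mid x) = \frac{1}{1 + e^{-x^{\top}\beta}}
\end{align*}
```

The two link functions have very similar shapes, which is why the fitted probabilities usually agree closely even though the coefficients are on different scales.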
I found an example in the SAS help; see whether it is useful (I could not copy the figures over).
Example 57.2: Multilevel Response
In this example, two preparations, a standard preparation and a test preparation, are each given at several dose levels to groups of insects. The symptoms are recorded for each insect within each group, and two multilevel probit models are fit. Because the natural sort order of the three levels is not the same as the response order, the ORDER=DATA option is specified in the PROC PROBIT statement to get the desired order.
The following statements produce Output 57.2.1:
data multi;
input Prep $ Dose Symptoms $ N;
LDose=log10(Dose);
if Prep='test' then PrepDose=LDose;
else PrepDose=0;
datalines;
stand 10 None 33
stand 10 Mild 7
stand 10 Severe 10
stand 20 None 17
stand 20 Mild 13
stand 20 Severe 17
stand 30 None 14
stand 30 Mild 3
stand 30 Severe 28
stand 40 None 9
stand 40 Mild 8
stand 40 Severe 32
test 10 None 44
test 10 Mild 6
test 10 Severe 0
test 20 None 32
test 20 Mild 10
test 20 Severe 12
test 30 None 23
test 30 Mild 7
test 30 Severe 21
test 40 None 16
test 40 Mild 6
test 40 Severe 19
;
proc probit order=data;
class Prep Symptoms;
nonpara: model Symptoms=Prep LDose PrepDose / lackfit;
weight N;
title 'Probit Models for Symptom Severity';
run;
proc probit order=data;
class Prep Symptoms;
parallel: model Symptoms=Prep LDose / lackfit;
weight N;
title 'Probit Models for Symptom Severity';
run;
The first model allows for nonparallelism between the dose response curves for the two preparations by inclusion of an interaction between Prep and LDose. The interaction term is labeled PrepDose in the "Analysis of Parameter Estimates" table. The results of this first model indicate that the parameter for the interaction term is not significant, having a Wald chi-square of 0.73. Also, since the first model is a generalization of the second, a likelihood ratio test statistic for this same parameter can be obtained by multiplying the difference in log likelihoods between the two models by 2. The value obtained, 2 ×(-345.94 - (-346.31)), is 0.73. This is in close agreement with the Wald chi-square from the first model. The lack-of-fit test statistics for the two models do not indicate a problem with either fit.
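Written out with the unrounded log likelihoods reported in Output 57.2.1, the likelihood ratio calculation looks like this (the symbols ℓ are just notation for this sketch):

```latex
\mathrm{LR}
  = 2\,\bigl(\ell_{\text{nonpara}} - \ell_{\text{parallel}}\bigr)
  = 2\,\bigl(-345.9401767 - (-346.306141)\bigr)
  = 2 \times 0.3659643
  \approx 0.73
```

Since the two models differ by one parameter (PrepDose), the statistic is compared with a chi-square distribution with 1 degree of freedom, whose 5% critical value is 3.84, so 0.73 is nowhere near significant.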
Output 57.2.1: Multilevel Response: PROC PROBIT

Probit Models for Symptom Severity
Probit Procedure

Class Level Information
Name        Levels    Values
Prep             2    stand test
Symptoms         3    None Mild Severe

Probit Models for Symptom Severity
Probit Procedure

Model Information
Data Set                  WORK.MULTI
Dependent Variable        Symptoms
Weight Variable           N
Number of Observations    23
Missing Values            1
Name of Distribution      Normal
Log Likelihood            -345.9401767

Class Level Information
Name        Levels    Values
Prep             2    stand test
Symptoms         3    None Mild Severe

Probit Models for Symptom Severity
Probit Procedure

Model Information
Data Set                  WORK.MULTI
Dependent Variable        Symptoms
Weight Variable           N
Number of Observations    23
Missing Values            1
Name of Distribution      Normal
Log Likelihood            -346.306141
The negative coefficient associated with LDose indicates that the probability of having no symptoms (Symptoms='None') or no or mild symptoms (Symptoms='None' or Symptoms='Mild') decreases as LDose increases; that is, the probability of a severe symptom increases with LDose. This association is apparent for both treatment groups.
The negative coefficient associated with the standard treatment group (Prep = stand) indicates that the standard treatment is associated with more severe symptoms across all LDose values.
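The sign interpretation follows from the form of the model. As far as I recall from the PROC PROBIT documentation, an ordinal response is fitted with a common-slopes cumulative model; the symbols below (α, β, x) are notation for this sketch and do not appear in the output:

```latex
\begin{align*}
\Pr(\text{None} \mid x) &= \Phi(x^{\top}\beta) \\
\Pr(\text{None or Mild} \mid x) &= \Phi(\alpha_{2} + x^{\top}\beta) \\
\Pr(\text{Severe} \mid x) &= 1 - \Phi(\alpha_{2} + x^{\top}\beta)
\end{align*}
```

Because Φ is increasing, a negative coefficient on a covariate lowers both cumulative probabilities as that covariate increases, and therefore raises the probability of Severe.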
The following statements use the PREDPPLOT statement to create the plot shown in Output 57.2.2 of the probabilities of the response taking on individual levels as a function of LDose. Since there are two covariates, LDose and Prep, the value of the CLASS variable Prep is fixed at the highest level, test. Although not shown here, the CDFPLOT statement creates similar plots of the cumulative response probabilities, instead of individual response level probabilities.
proc probit data=multi order=data;
class Prep Symptoms;
parallel: model Symptoms=Prep LDose / lackfit;
predpplot var=ldose level=("None" "Mild" "Severe")
cfit=blue cframe=ligr inborder noconf ;
weight N;
title 'Probit Models for Symptom Severity';
run;
Output 57.2.2: Plot of Predicted Probabilities for the Test Preparation Group
The following statements use the XDATA= data set to create a plot of the predicted probabilities with Prep set to the stand level. The resulting plot is shown in Output 57.2.3.
data xrow;
input Prep $ Dose Symptoms $ N;
LDose=log10(Dose);
datalines;
stand 40 Severe 32
;
run;
proc probit data=multi order=data xdata=xrow;
class Prep Symptoms;
parallel: model Symptoms=Prep LDose / lackfit;
predpplot var=ldose level=("None" "Mild" "Severe")
cfit=blue cframe=ligr inborder noconf ;
weight N;
title 'Predicted Probabilities for Standard Preparation';
run;
Output 57.2.3: Plot of Predicted Probabilities for the Standard Preparation Group
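Coming back to the original question about ordinal logistic regression, here is a minimal sketch, assuming the multi data set created above is still in the WORK library. As far as I know, PROC LOGISTIC fits a cumulative logit (proportional odds) model by default whenever the response has more than two levels, so no special option is needed beyond getting the response order right. The FREQ statement, the PARAM=REF coding and the title text are my own choices for this sketch and are not part of the SAS example.

```sas
/* Sketch only: ordinal logistic counterpart of the "parallel" probit model. */
/* Assumes WORK.MULTI from the example above already exists.                 */
proc logistic data=multi;
   /* Prep is a classification covariate; reference coding is an assumption */
   class Prep / param=ref;
   /* N holds the number of insects per row, so treat it as a frequency     */
   freq N;
   /* order=data keeps the response levels as None < Mild < Severe, the     */
   /* order in which they first appear in the data                          */
   model Symptoms(order=data) = Prep LDose;
   title 'Cumulative Logit Model for Symptom Severity';
run;
```

For an ordinal response the output includes a score test for the proportional odds assumption, which is worth checking before interpreting the common slopes. Adding / link=probit to the MODEL statement requests a cumulative probit fit instead of the logit one.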