SAS Chinese Forum

Ten ways to build a wrong scoring model (reposted)

Original poster | Posted on 2011-3-28 10:33:51

From supersasmacro's blog on Sina

<div>Below are some ways to build a wrong scoring model. The author
offers no guarantee if your modeling team uses one of these and
still gets a correct model.</DIV>
<div><br /></DIV>
<div>1) Overfit the model to the sample. Overfitting can be checked
by drawing a fresh random sample, applying the scoring equation to
it, and comparing predicted conversion rates with actual conversion
rates. An overfit model does not rank-order: deciles with a lower
average probability may show equal or more conversions than deciles
with higher probability scores.</DIV>
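<div><br /></DIV>
<div>The decile check described in item 1 can be sketched roughly as
follows. This is a minimal illustration in plain Python, not the
author's code: the function names <code>decile_table</code> and
<code>rank_order_breaks</code> are hypothetical, and it assumes you
already have a predicted score and a 0/1 conversion flag for each
record in the fresh sample.</DIV>

```python
from typing import List, Tuple


def decile_table(scores: List[float], outcomes: List[int],
                 bins: int = 10) -> List[Tuple[float, float]]:
    """Return (mean predicted score, actual conversion rate) per bin,
    ordered from the highest-scoring bin to the lowest."""
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    if len(scores) < bins:
        raise ValueError("need at least one record per bin")
    # Sort records by predicted score, highest first.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    table = []
    for b in range(bins):
        # Integer arithmetic spreads any remainder across the bins.
        chunk = ranked[b * n // bins:(b + 1) * n // bins]
        mean_score = sum(s for s, _ in chunk) / len(chunk)
        actual_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_score, actual_rate))
    return table


def rank_order_breaks(table: List[Tuple[float, float]]) -> List[int]:
    """Return indices of bins whose actual rate is equal to or higher than
    the rate of the bin scored just above them (a rank-ordering break)."""
    return [i for i in range(1, len(table)) if table[i][1] >= table[i - 1][1]]
```

<div>An empty list from <code>rank_order_breaks</code> means the actual
conversion rate falls from each decile to the next; any index it
returns marks a decile worth a closer look.</DIV>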
<div><br /></DIV>
<div>2) Choose non-random samples for building and validating the
scoring equation. See overfitting above.</DIV>
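<div><br /></DIV>
<div>To avoid the mistake in item 2, the build and validation samples
can be drawn by a seeded random shuffle. The sketch below is only an
illustration: the helper name <code>random_split</code>, the 30%
validation share, and the seed are all assumptions, not values from
the original post.</DIV>

```python
import random
from typing import Iterable, List, Tuple


def random_split(records: Iterable, validation_fraction: float = 0.3,
                 seed: int = 42) -> Tuple[List, List]:
    """Shuffle records with a fixed seed and return (build, validation)."""
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    rng = random.Random(seed)   # local generator, so the split is reproducible
    shuffled = list(records)    # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * validation_fraction)
    return shuffled[cut:], shuffled[:cut]
```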
<div><br /></DIV>
<div>3) Use multicollinearity
(<!-- m --><a class="postlink" href="http://en.wikipedia.org/wiki/Multicollinearity">http://en.wikipedia.org/wiki/Multicollinearity</a><!-- m -->) without business
judgment to remove variables that may make business sense. This usually
happens a few years after you studied, and then forgot,
multicollinearity.</DIV>
<div><br /></DIV>
<div>If you don't know the difference between multicollinearity and
heteroskedasticity (<!-- m --><a class="postlink" href="http://en.wikipedia.org/wiki/Heteroskedasticity">http://en.wikipedia.org/wiki/Heteroskedasticity</a><!-- m -->),
this could be the real deal breaker for you.</DIV>
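<div><br /></DIV>
<div>One way to keep business judgment in the loop for item 3 is to
flag correlated variables for review instead of dropping them
automatically. The sketch below uses pairwise Pearson correlation,
which is a simpler stand-in for fuller multicollinearity diagnostics
such as variance inflation factors. The function names and the 0.8
threshold are assumptions chosen for illustration.</DIV>

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


def flag_correlated_pairs(columns: Dict[str, Sequence[float]],
                          threshold: float = 0.8) -> List[Tuple[str, str, float]]:
    """List variable pairs with |r| >= threshold so a person can decide
    which one, if any, to drop. Nothing is removed here."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged
```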
<div><br /></DIV>
<div>4) Using legacy code to run the scoring, usually with stepwise
forward and backward regression. This usually happens on Fridays
and when in a hurry to build models.</DIV>
<div><br /></DIV>
<div>5) Ignoring the signs or magnitudes of parameter estimates (that is,
the output coefficients, or the weight of each variable in the equation).</DIV>
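<div><br /></DIV>
<div>A sign check like the one item 5 calls for can be as simple as
comparing each estimate with the direction the business expects. The
sketch below is hypothetical: the function name
<code>unexpected_signs</code> and the variable names in the usage
note are invented for illustration, and the expected signs (+1 or -1)
must come from your own domain knowledge.</DIV>

```python
from typing import Dict, List, Tuple


def unexpected_signs(estimates: Dict[str, float],
                     expected_signs: Dict[str, int]) -> List[Tuple[str, float]]:
    """Return (variable, estimate) pairs whose sign differs from the
    expected sign. A zero estimate never matches +1 or -1, so it is
    reported as well."""
    problems = []
    for name, expected in expected_signs.items():
        if name not in estimates:
            continue  # variable is not in the model, nothing to compare
        value = estimates[name]
        actual = (value > 0) - (value < 0)  # evaluates to 1, 0, or -1
        if actual != expected:
            problems.append((name, value))
    return problems
```

<div>For example, with estimates <code>{"income": 0.4, "late_payments": 0.9}</code>
and expected signs <code>{"income": 1, "late_payments": -1}</code>, only
<code>late_payments</code> is reported.</DIV>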
<div><br /></DIV>
<div>6) Not knowing the difference between Type 1 and Type 2 errors,
especially when rejecting variables based on p-value. (If you don't
know what a p-value is, you may kindly stop reading and click the
YouTube video in the right margin.)</DIV>
<div><br /></DIV>
<div>7) Excessive zeal in removing variables. Why? Ask yourself
this question every time you remove a variable.</DIV>
<div><br /></DIV>
<div>8) Using the wrong causal event (like mailings for loans) to
predict the future with a scoring model (for mailings of deposit
accounts), or using the right causal event in the wrong environment
(a rapid decline or rise in sales due to factors not present in the
model, like a competitor entering or going out of business, oil
prices, or credit shocks; sob, sob, sigh).</DIV>
<div><br /></DIV>
<div>9) Overfitting</DIV>
<div><br /></DIV>
<div>10) Learning about creating models from blogs and not
reading and refreshing your old statistics
textbooks.</DIV>