Title: Beginner question about a DATA step, advice appreciated, thanks
Author: shiyiming Time: 2009-12-23 18:29
My dataset has many rows like the ones below. There are three main columns: "number", "type", and "control".
one A N
one B Y
one C N
one D Y
two E N
two F N
two D N
two Z N
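What I actually want is a single output row per number: when control contains one or more Y values for that number, output e.g. "one Y"; when control contains no Y at all, output e.g. "two N".
In other words, for the example above the output I want is: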
one Y
two N
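Is there a reasonably quick way to do this? Thanks in advance for your answers; I hope we can all learn together.
Author: shiyiming Time: 2009-12-23 19:07 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
[code:20ouiyn7]data raw;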
input number $ type $ control $;
datalines;
one A N
one B Y
one C N
one D Y
two E N
two F N
two D N
two Z N
;
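data temp(drop=type temp);
    temp='N';                              /* reset to 'N' at the start of each number group */
    do _n_=1 by 1 until(last.number);      /* read every row of the current group in one pass */
        set raw;
        by number;                         /* raw must already be sorted by number */
        temp=ifc(upcase(control)='Y','Y',temp);  /* switch to 'Y' once any row has control='Y' */
    end;
    control=temp;                          /* one row per number is written at the end of the step */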
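run;[/code:20ouiyn7]
Author: shiyiming Time: 2009-12-24 15:54 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
Thanks for the answer, hopewell; I learned something. It also made me think of a related question: is it possible to set a priority for selecting the data?
Looking forward to everyone's suggestions...
Author: shiyiming Time: 2009-12-24 18:13 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
[code:1tjt6eae]data raw;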
input number $ type $ control $;
datalines;
one A Y
one B Y
one C Y
one D N
two E Y
two F N
two D N
;
data temp(drop=type);
    set raw;
    by number;
    if last.number then output;   /* keeps only the last row of each number group */
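run;[/code:1tjt6eae]
Author: shiyiming Time: 2009-12-24 23:39 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
Doesn't this just take the last row of each group? It doesn't really use any priority, does it?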
one A Y
one B Y
one D N
one C Y
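If the order of the rows is changed like this, won't the result be wrong? Could the program check D's control value first and, if it is Y, output it and stop; if it is N, go on to check C's control value, and if that is Y, output it and stop; and so on down the priority order D > C > B > A?
Author: shiyiming Time: 2009-12-25 08:47 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
After thinking it over, priority seems a bit like if ... else if ... else if ...; is that right? Please advise.
Author: shiyiming Time: 2009-12-25 09:40 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
If I can assume that a type denoted by a larger letter has higher priority, namely, Z > Y > X > ... > C > B > A, then the following might be it: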
[code:2z869kv0]data raw;
input number $ type $ control $;
datalines;
one A Y
one B Y
one C Y
one D N
two E Y
two F N
two D N
three Z N
three X N
three Y N
;
run;
proc sort data = raw
out = sorted;
by number
descending type;
run;
data results(drop = output_already);
set sorted;
by number
descending type;
retain output_already;
if first.number then output_already = 0;
if output_already = 0 then do;
if last.number or control = 'Y' then do;
output_already = 1;
output;
end;
end;
run; [/code:2z869kv0]
Author: shiyiming Time: 2009-12-25 10:19 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
If I understand correctly, last.var after sorting is the result you want.
Author: shiyiming Time: 2009-12-26 00:13 Title: Re: Beginner question about a DATA step, advice appreciated, thanks
[quote="cloudpan2002":2ijcdt3a]If I can assume that a type denoted by a larger letter has higher priority, namely, Z > Y > X > ... > C > B > A, then the following might be it:
[code:2ijcdt3a]data raw;
input number $ type $ control $;
datalines;
one A Y
one B Y
one C Y
one D N
two E Y
two F N
two D N
three Z N
three X N
three Y N
;
run;
proc sort data = raw
out = sorted;
by number
descending type;
run;
data results(drop = output_already);
set sorted;
by number
descending type;
retain output_already;
if first.number then output_already = 0;
if output_already = 0 then do;
if last.number or control = 'Y' then do;
output_already = 1;
output;
end;
end;
run; [/code:2ijcdt3a][/quote:2ijcdt3a]
Thanks for the feedback. The flag [i:nc8a3ov3]output_already[/i:nc8a3ov3] is there so that once a control value has been output for a given number, the step ignores the remaining entries for that number and moves on to the next number. If a number has no entry where control = 'Y', the step has to output the last entry for that number anyway, even though that entry's control value is 'N'. I hope my understanding is right. :)
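As a quick check of the flag logic described above, here is a minimal sketch that reruns it on the reordered rows posted earlier in the thread (one A Y, one B Y, one D N, one C Y). The dataset names raw2, sorted2, and results2 are made up for this illustration, and the priority rule is still the assumed one (a larger letter means higher priority).

```sas
/* Reordered rows from earlier in the thread: D is no longer the last row. */
/* raw2, sorted2 and results2 are hypothetical names for this sketch.     */
data raw2;
    input number $ type $ control $;
    datalines;
one A Y
one B Y
one D N
one C Y
;
run;

/* Put the highest-priority type first within each number */
/* (assumes Z > Y > ... > B > A, as in the post above).   */
proc sort data=raw2 out=sorted2;
    by number descending type;
run;

data results2(drop=output_already);
    set sorted2;
    by number descending type;
    retain output_already;
    if first.number then output_already = 0;   /* new group: nothing written yet */
    if output_already = 0 and (last.number or control = 'Y') then do;
        output_already = 1;                    /* skip the remaining rows of this group */
        output;
    end;
run;

proc print data=results2 noobs;
run;
```

After sorting, the rows for "one" are read in the order D, C, B, A. D has control = 'N' and is skipped, C is the first row with control = 'Y' and is written, and the flag then suppresses B and A, so the result no longer depends on the original row order.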