data a;
input num1 num2 add $20. ;
cards;
0 5 bj
6 10 sh
11 15 gz
16 21 sz
22 29 hz
30 36 cd
37 41 gy
42 52 km
53 58 hk
59 100 qt
30 100 sxlion
;
run;
proc sort data=a;
by num1 num2;
run;
data b;
input id $ num;
cards;
001 8
002 15
003 43
004 97
;
run;
proc sort data=b;
by num;
run;
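data ahuige(drop=num1 num2);
set b;
/* keep the current range from a across observations of b */
retain num1 num2 add;
/* both inputs are sorted, so only step forward through a until num falls inside a range */
do while (not (num1<=num<=num2) and (point+1<=maxA)) ;
point+1;
set a point=point nobs=maxA;
end;
run;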
proc print;
run;

Author: shiyiming  Time: 2009-2-13 13:23  Title: Re: Help: searching and comparing two data sets
Thanks to sxlion for the reminder; I should have been more rigorous. :D
I haven't used SAS much lately, so I'm getting a bit rusty... :shock:

Author: shiyiming  Time: 2009-2-13 13:25  Title: Re: Help: searching and comparing two data sets
That's a lot of data. Could you lend it to me to play with? I think there is better code.

Author: shiyiming  Time: 2009-2-13 13:32  Title: Re: Help: searching and comparing two data sets
:( That's not really convenient... but this program is already quite efficient, haha.

Author: shiyiming  Time: 2009-2-13 14:20  Title: Re: Help: searching and comparing two data sets
Without sorting:
data ab(keep=id num add);
set b;
do i=1 to n;
set a point=i nobs=n;
/* explicit OUTPUT so that only matched rows are written */
if num1<=num<=num2 then do;
output;
leave;
end;
end;
run;
If both data sets have already been sorted:
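data ab (keep=id num add);
set b;
/* start: first row of a that can still match (assumes ranges in a do not overlap) */
retain start 1;
do i=start to n;
set a point=i nobs=n;
/* a is sorted, so once num1 exceeds num no later range can match */
if num1>num then leave;
if num1<=num<=num2 then do ;
output ;
/* num is ascending, so the next search can resume at this range */
start=i;
leave;
end;
end;
run;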
http://sasor.feoh.net/viewtopic.php?f=1&t=629&start=0&st=0&sk=t&sd=a
NOTE: There were 365462 observations read from the data set WORK.A.
NOTE: The data set WORK.A has 365462 observations and 3 variables.
NOTE: PROCEDURE SORT used:
real time 8.65 seconds
cpu time 0.79 seconds
493 proc sort data=b;
494 by num;
495 run;
NOTE: There were 111157 observations read from the data set WORK.B.
NOTE: The data set WORK.B has 111157 observations and 2 variables.
NOTE: PROCEDURE SORT used:
real time 0.21 seconds
cpu time 0.21 seconds
496
497 data ahuige(drop=num1 num2);
498 set b;
499 retain num1 num2 add;
500 do while (^(num1<=num<=num2) and (point+1<=maxA)) ;
501 point+1;
502 set a point=point nobs=maxA;
503 end;
504 if num^=.;
505 run;
NOTE: There were 111157 observations read from the data set WORK.B.
NOTE: The data set WORK.AHUIGE has 111054 observations and 3 variables.
NOTE: DATA statement used:
real time 1.78 seconds
cpu time 0.32 seconds
This one is 徐福贵's approach; the generated format takes up 28.5 MB.
538 data xxx/view=xxx;
539 fmtname = 'tianwild';
540 do until(eof);
541 set a(keep=num1 num2 add rename=(num1=start num2=end add=label)) end=eof;
NOTE: DATA STEP view saved on file WORK.XXX.
NOTE: A stored DATA STEP view cannot run under a different operating system.
NOTE: DATA statement used:
real time 0.00 seconds
cpu time 0.00 seconds
548 proc format cntlin=xxx;
NOTE: Format TIANWILD has been output.
NOTE: View WORK.XXX.VIEW used:
real time 5.07 seconds
cpu time 0.14 seconds
NOTE: There were 365462 observations read from the data set WORK.A.
NOTE: There were 365463 observations read from the data set WORK.XXX.
NOTE: PROCEDURE FORMAT used:
real time 5.09 seconds
cpu time 2.21 seconds
549 data c;
550 set b;
551 city = put(num, tianwild.);
552 if city ne '';
553 run;
NOTE: There were 111157 observations read from the data set WORK.B.
NOTE: The data set WORK.C has 111054 observations and 3 variables.
NOTE: DATA statement used:
real time 0.71 seconds
cpu time 0.71 seconds
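
For reference, here is a minimal sketch of the PROC FORMAT CNTLIN= approach that the log above shows only in part. Lines 542-547 are not in the post, so this is not the original code. The names ctrl and rngfmt are made up for illustration. The trailing hlo='O' row is an assumption, suggested by the view returning one more observation than WORK.A. The sketch also assumes that the ranges in a do not overlap, since PROC FORMAT rejects overlapping ranges.

```sas
/* Sketch only: data set CTRL and format RNGFMT are hypothetical names. */
data ctrl;
  length fmtname $8 label $20 hlo $1;
  retain fmtname 'rngfmt' type 'N';   /* numeric format */
  set a(keep=num1 num2 add rename=(num1=start num2=end add=label)) end=eof;
  hlo = ' ';
  output;
  if eof then do;
    /* assumed catch-all row: values outside every range get a blank label */
    call missing(start, end);
    hlo = 'O';
    label = ' ';
    output;
  end;
run;

/* build the format from the control data set */
proc format cntlin=ctrl;
run;

/* look up each num in b and keep only the rows that fall inside a range */
data c;
  set b;
  length city $20;
  city = put(num, rngfmt.);
  if city ne ' ';
run;
```

The lookup step needs neither data set to be sorted, at the cost of first building the format.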