如何在data step中使用macro variable?

shiyiming · 发表于 2010-8-3 13:50:32

有5个个大于200G的dataset（d1,d2,d3,d4,d5的所有变量完全一致并存储在同一服务器里）需要合并后进行其他的处理，但由于服务器的内存的限制，无法直接合并他们并pull出来使用，只能加上一些限制条件来pull，比如，变量name,同时也存在在另外一个相对小一些的dataset（s1）里，就想把那5个大的dataset含有与s1里相同name的记录挑出来。
proc sql noprint;
select distinct quote(name) into :id separated by ','
from s1;
quit;

data d;
set d1 d2 d3 d4 d5;
where name in (&id.);
run;
结果log出现错误提示，因为新建的macro变量最长只能允许64,000字节，可id实际长度是其20倍还多。
问题：有没有其他办法来处理我这个问题？
1.我曾想到用full join来合并，同加上限制条件name。
但不知道full join能不能同时合并5个dataset?
2.用call symput,不知道symput能不能建立name系列值?
3。用select distinct quote(name) into :id1-:id&n. 但却不知道在data step里如何调用这些mmacro变量?
所以，就想请教大家。
谢谢！

shiyiming · 发表于 2010-8-3 22:04:50

how about this way:
proc sql;
create table flag as
select distinct quote(name) from s1;
quit;

proc datasets lib=yourlib;
modify d1;
index create name/unique; *if name value is not unique, you may like to remove "/unique", like bollow;
run;
modify d2;
INDEX create name;
run;
modify d3;
INDEX create name;
run;
modify d4;
INDEX create name;
run;
modify d5;
INDEX create name;
run;
quit;

data sub_d1;
set flag;
set youlib.d1 key=name;
if _iorc_ then delete;*this line may not needed, you can test by yourself;
run;

* do the rest 4 same way;
.
.
.

*concatenate the sub_d1-sub_d5;

data yourdate;
set sub_:;
run;

********* if the number of records in flag much smaller than that of D1-D5, the way will much faster than SQL join, but need time to create index, so if your dataset d1-d5 are freqently updated, this may not the proper way.

shiyiming · 发表于 2010-8-3 23:34:55

dealing with those 5 datasets separately may be a good way, but not an efficient way, so I try to pull all the records from the 5 datasets at the same time using a SQL procedure, or, a SQL procedure to generate a macro variable(for name) and a data step to combine 5 datasets meanwhile using the established macro variable (name) to subset just like what I posted in 1st floor.
Anyway, your thought is pretty good and appreciate. Thanks.

		自动登录	找回密码
密码			立即注册

如何在data step中使用macro variable?

如何在data step中使用macro variable?

Re: 如何在data step中使用macro variable?

Re: 如何在data step中使用macro variable?