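This turned out not to be a SAS problem; I later got someone to do it in C++.
Author: shiyiming Time: 2011-4-6 23:47 Subject: Re: Help needed: compressing data
As I recall, SAS datasets have a compress option that can shrink them quite a bit.
Author: shiyiming Time: 2011-4-8 03:55 Subject: Re: Help needed: compressing data
The compress= option only works on SAS datasets, not on the TXT output wanted here.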
Here is my experiment; I hope it helps sxlion. Compare the file sizes of ex1.txt and ex.txt (9.57MB vs. 19.1MB) with 50% of the values missing.
But my question is: if you omit missing values entirely, leaving no blank in their place, how can you put the remaining values back in the right positions when reading the file?
[code:ue6dtzw7]
data ex;
array var{1000};
do id=1 to 1e4;
do j=1 to dim(var);
if ranuni(0)<0.5 then var[j]=floor(ranuni(0)*10);
else var[j]=.;
end;
keep var:;
output;
end;
run;
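/* Baseline: tab-delimited, missing values written as "." */
data _null_;
/* lrecl is raised here (an assumption) so one observation fits on one line */
file "c:\ex.txt" dlm='09'x lrecl=32767;
set ex;
/* write all 1000 variables; the quoted 19.1MB size implies the full row */
put var1-var1000;
run;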
data _null_;
file "c:\ex1.txt" dlm=' ';
set ex;
array var{1000};
do j=1 to dim(var);
if var[j]^=. then put var[j] @@;
end;
run;
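[/code:ue6dtzw7]
Author: shiyiming Time: 2011-4-8 12:17 Subject: Re: Help needed: compressing data
In this situation I always add delimiters and then compress the file with 7zip.
That said, saving the data in a binary (hexadecimal) form such as bin is the most compact. My guess is that this is what your C++ program does.
Author: shiyiming Time: 2011-4-8 14:56 Subject: Re: Help needed: compressing data
oloolo, in each record the variables with missing values all come last (your version is more general, of course), so there is no need to worry about positions; just use missover when reading.
I added a carriage return to each record in your code, so the positions are unambiguous too.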
[code:2f9zv0mv]data _null_;
file "c:\ex1.txt" dlm=' ';
set ex;
array var{1000};
do j=1 to dim(var);
if var[j]^=. then put var[j] @@;
if j=dim(var) then put '0D'x;
end;
run;[/code:2f9zv0mv]
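For anyone following along, here is a minimal sketch of the missover read-back mentioned above. It only applies when the missing values in each record are all trailing, as in sxlion's data. The fileref demo, the dataset name back, and the variable names v1-v3 are made up for this illustration and do not come from the thread.

```sas
/* Hypothetical scratch file; the fileref "demo" is invented for this sketch */
filename demo temp;

/* Write three records whose missing values are all trailing and omitted */
data _null_;
  file demo;
  put '1 2 3';
  put '4 5';
  put '6';
run;

/* MISSOVER: when a record runs out of values, the remaining variables
   are set to missing instead of being read from the next record */
data back;
  infile demo missover;
  input v1-v3;
run;

proc print data=back;
run;
```

The result should have three observations: (1, 2, 3), (4, 5, .) and (6, ., .). Without missover, the default behaviour is to continue reading from the next record, which would shift values across rows. This does not recover positions when missing values are dropped from the middle of a record, which is the case oloolo asked about.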
[quote:17bm48w4]data _null_;
file 'j:\fileout.txt';
do i=1 to 1000000;
put '1 2 3 4 5 6 7 8';
end;
run;
data a;
infile 'j:\fileout.txt';
input a b c d e f g h;
run;
data _null_;
set a;
file 'j:\fileout.dat';
put @1 a PIB1.
@2 b PIB1.
@3 c PIB1.
@4 d PIB1.
@5 e PIB1.
@6 f PIB1.
@7 g PIB1.
@8 h PIB1.
;
run;