If the length attribute is different, SAS takes the length from the first data set that contains the variable. In the following example, all data sets that are listed in the MERGE statement contain the variable Mileage. In QUARTER1, the length of the variable Mileage is four bytes; in QUARTER2, it is eight bytes; and in QUARTER3 and QUARTER4, it is six bytes. In the output data set YEARLY, the length of the variable Mileage is four bytes, which is the length derived from QUARTER1.
data yearly;
merge quarter1 quarter2 quarter3 quarter4;
by Account;
run;
To override the default and set the length yourself, specify the appropriate length in a LENGTH statement that precedes the SET, MERGE, or UPDATE statement.
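As a minimal sketch of that override, applied to the YEARLY example above: the LENGTH statement is placed before MERGE so that its value is used instead of the length stored in QUARTER1. The choice of 8 bytes is an assumption made here (the largest of the lengths mentioned above), not something the Help text prescribes.

```sas
data yearly;
   /* Assumption: use 8 bytes, the largest length among the four inputs. */
   /* Because LENGTH comes before MERGE, it takes precedence over the    */
   /* 4-byte length that would otherwise be inherited from QUARTER1.     */
   length Mileage 8;
   merge quarter1 quarter2 quarter3 quarter4;
   by Account;
run;
```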
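The text above is quoted verbatim from the SAS 9.1 Help. Search keywords: "Reading, Combining, and Modifying SAS Data Sets", "Combining SAS Data Sets: Basic Concepts".
Author: shiyiming  Time: 2009-10-10 09:19  Subject: Re: Variable length problem with MERGE
I tried this out later yesterday. The example is as follows: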
In data set A, the LENGTH, INFORMAT, and FORMAT lengths of USER_ID are 32, 32, and 32; in data set B, the LENGTH, INFORMAT, and FORMAT lengths of USER_ID are 18, 18, and 18.
If I run
DATA C;
MERGE A B;
BY USER_ID;
RUN;
there is no problem.
But if I run
DATA C;
MERGE B A;
BY USER_ID;
RUN;
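the log shows "WARNING: Multiple lengths were specified for the BY variable user_id by input data sets. This may cause unexpected results."
If the problem is the latter case, it can be worked around as follows: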
data a;
length user_id $ 18;
set a;
run;
and then running
DATA C;
MERGE B A;
BY USER_ID;
RUN;
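Note that the workaround above shortens USER_ID in A from 32 to 18 characters, so any value in A longer than 18 characters would be truncated. A hedged alternative, following the Help text quoted at the top, is to declare the longer length in a LENGTH statement before MERGE. The sketch below is self-contained; the sample values (u001, u002, u003) are hypothetical and only stand in for the real data sets A and B. This is the approach usually suggested for this warning, but it is worth checking the log of your own SAS version to confirm.

```sas
/* Hypothetical sample data: same BY variable, different stored lengths */
data a;
   length user_id $ 32;
   input user_id $;
   datalines;
u001
u002
;
run;

data b;
   length user_id $ 18;
   input user_id $;
   datalines;
u001
u003
;
run;

/* Declare the longer length before MERGE so that USER_ID is not   */
/* limited to the 18 bytes inherited from B, the first data set.   */
data c;
   length user_id $ 32;
   merge b a;
   by user_id;
run;

/* Inspect the resulting length of USER_ID in C */
proc contents data=c;
run;
```

This also suggests why the order matters: with MERGE A B the length comes from A (32), which is long enough for both inputs, whereas with MERGE B A it comes from B (18), which is shorter than what A stores.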
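Guess 1: Is this related to how MERGE works in SAS? That is, when merging, is it mainly the LENGTH attribute of the BY variable that matters, rather than the lengths of its FORMAT and INFORMAT?
Guess 2: Does the WARNING above actually have no effect on the result?
Could anyone with more experience shed some light on this?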