程序总是找不到数据地址

shiyiming · 发表于 2010-4-30 10:44:43

*--------------------------------ExampleF2.sas------------------------------*
|                                                                         |
| An example SAS program to detect epistatic interactions among locus    |
| pairs based on genetic marker data and phenotypic data in an F2 maize |
| population with 150 individuals. Each plant was genotyped at 114       |
| codominant marker loci and also measured for plant height.  Data were |
| kindly supplied by D. Asmono and M. Lee, Iowa State University for use  |
| with this example program.                                           |
| Version 2.0 Updated May 30, 2001                                     |
|                                                                         |
|                                                                         |
| Written by Jim Holland, USDA-ARS, Department of Crop Science,          |
| North Carolina State University, Raleigh, NC 27695                   |
| and Harsha Ingle, The Bayer Corporation, Clayton, NC.                |
|                                                                         |
*---------------------------------------------------------------------------;

*---------------------------------------------------------------------------;

*----------------------------------------------------------------------------
* Before implementing the epistacy program, the genotypic and phenotypic data
* should be organized in a separate data file in a manner that can be
* processed easily by SAS.  Generally, each line or genotype should be listed
* in rows, the genetic data from the different marker loci listed in columns,
* and the dependent phenotypic data also listed in columns.  This program
* analyzes one trait at a time, although it could be modified to analyze
* multiple traits.
*
* It is assumed that the marker data are codominant, so that heterozygotes
* can be distinguished. For recombinant inbred or doubled haploid data, there
* is a companion version of the program - "epistacy.ri" - which will analyze
* each locus pair for additive-by-additive epistasis.
*
* It is also assumed that the genotypic data are coded 0 = homozygous for
* parent A allele, 1 = heterozygous, and 2 = homozygous for parent B allele.
* Other codings are possible, but the program will have to be modified, as
* done later in this example program.
*---------------------------------------------------------------------------;

*----------------------------------------------------------------------------
* Begin by setting options to determine the look of the output.  Here, it is
* chosen to have the output printed to pages sized 76 lines long and 64
* characters wide, without centering.
*
* IMPORTANT NOTE FOR MAINFRAME USERS: If you attempt to run "epistacy" as a
* batch job on a mainframe system, the log file created by the program will
* become too large and will cause the program to exit without completing
* execution.  To avoid this, include the options "nosource" and "nonotes"
* here.  Only include these options after ensuring that the program functions
* on a small subset of the marker loci - otherwise you will have no error
* messages to guide troubleshooting.
*---------------------------------------------------------------------------;

options nocenter ls=76 ps=64;

*----------------------------------------------------------------------------
* The data are maintained in two external files, "mo17rflp.txt" and
* "mo17rfl2.txt".  Input the two files and combine them into a single data
* set, named "all", using the set command in SAS.  Note that the loci used in
* this data set include loci with names like "BNL6.32".  SAS will not accept
* a variable name with a "." character in it.  Therefore, all "." characters
* in locus names have been changed to "z" in the input commands and
* throughout the rest of the program. It is assumed that all
* files are on the root "C:" directory.  If otherwise, the correct directory
* must be specified in each "infile" statement.  If the program is run on a
* system other than PC-SAS the directory identification will need to be
* altered to suit the system used.
*---------------------------------------------------------------------------;
data loci;
FILENAME ALLLOCI DDE "EXCEL|[MaizeData.xls]MaizeLoci!R1C1:R114C1";
INFILE ALLLOCI NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;
input locname $;

N=_n_;
call symput ('nummark', trim(left(N)));
run;

proc sort data=loci;
by locname;
run;

* Creates macro variables genmark1-genmark(&nummark)for
each unique genetic marker;

data a;
set loci;
by locname;
if first.locname then do;
i+1;
put locname= ;
ii=left(put(i,3.));
/* if you have more than 999 loci, increase the 3. to 4. for up to 10,000 markers, etc...*/
call symput ('genmark'||ii, trim(upcase(locname)));
end;
run;
%put _user_; * Displays the macro variables in the log window;

data one;
FILENAME one DDE "EXCEL|[MaizeData.xls]MaizeF2a!R2C1:R151C65";
INFILE one NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;
input PROG UMC94 $ UMC164 $ UMC157 $ NPI234 $ UMC13 $ P1 $ NPI429 $ NPI236 $
UMC37 $ PANXX2z3 $ ISU018 $ ISU019A $ ISU006 $ UMC86A $ BNL6z32 $ UMC53 $
UMC78 $ UMC131 $ UMC98A $ UMC4 $ BNL8z44B $ BNL8z15 $ UMC121 $ BNL8z35 $
UMC26 $ BNL3z18 $ Sh2 $ UMC123 $ PIO20713 $ BNL5z46 $ Bt2 $ NPI292 $ PIO10z25
$ UMC111 $ UMC86B $ BNL6z25 $ UMC72 $ UMC27 $ BNL10z06 $ Bt1 $ BNL10z12 $
ISU019B $ UMC51 $ UMC68  $ NPI235 $ UMC65 $ PL1 $ UMC21 $ NPI280 $ UMC62 $
AGP1 $ BNL15z40 $ UMC98B $ UMC110 $ BNL7z61 $ BNL8z39 $ BNL8z44A $ UMC35 $
BNL9z11 $ UMC103 $ BNL9z08 $ UMC48 $ NPI268 $ UMC7 $;

data two;
FILENAME two DDE "EXCEL|[MaizeData.xls]MaizeF2b!R2C1:R151C65";
INFILE two NOTAB DLM= '09'X DSD MISSOVER lrecl = 10240;
input PROG  C1 $ BNL3z06 $ UMC153 $ BNL8z17 $ BNL14z28 $ NPI209 $ UMC64 $ ISU005 $
ISU012 $ NPI287 $ ISU049 $ ISU040 $  ISU064 $ ISU104 $ ISU141 $ ISU169  $
ISU98 $ ISU119 $ ISU150 $ ISU133 $ ISU115 $ ISU036 $ ISU069 $ ISU109 $ ISU120
$ ISU139 $ ISU116 $ ISU074 $ ISU152 $ ISU124 $ ISU136A $ ISU136B $ ISU032 $
ISU088 $ ISU045 $ ISU048 $ ISU075 $ ISU147 $ ISU138 $ ISU093 $ ISU058 $
ISU047 $ ISU046 $ ISU100 $ ISU053 $ ISU021 $ ISU132A $ ISU132B $ ISU168A $
UMC165 $ PLHT;

data all; merge one two; by prog;
proc print;

run;

data all; merge one two; by prog;
proc print; run;

*create work sets that are empty to clear any residual memory;
data selects;
data output;
data means;
data empty;

*macro f2 performs anova for each pair of loci;
*options mprint symbolgen mlogic;
%macro f2(trait, pvalue);
%do i=1 %to &nummark -1;

dm log 'clear' continue;

   %do j=&i+1 %to &nummark;
      proc glm data=all outstat=out noprint;
      class &&genmark&i &&genmark&j;
      model &trait = &&genmark&i &&genmark&j &&genmark&i*&&genmark&j;

      contrast "AxA" &&genmark&i*&&genmark&j 1 -1 0 -1 1 0 0 0 0;
      contrast "AxD" &&genmark&i*&&genmark&j 1 1 -2 -1 -1 2 0 0 0;
      contrast "DxA" &&genmark&i*&&genmark&j 1 -1 0 1 -1 0 -2 2 0;
      contrast "DxD" &&genmark&i*&&genmark&j 1 1 -2 1 1 -2 -2 -2 4;

      lsmeans &&genmark&i*&&genmark&j/noprint out=mns;
      run;

      data check; set out; if upcase(_source_) = "&&genmark&i*&&genmark&j"
      and _type_ = "SS3";
      if prob le &pvalue then call symput("SELECT","YES");
      else call symput("SELECT", "NO");
      data check; set check;
      %if &select = YES
      %then %do;
            data out;length locus1 locus2 $12; set out;
            locus1 = "&&genmark&i";locus2 = "&&genmark&j";
            rename _name_ = &trait;
            if upcase(_source_) = "&&genmark&i" then _source_ = "M1";
            if upcase(_source_) = "&&genmark&j" then _source_ = "M2";
            if upcase(_source_) = "&&genmark&i*&&genmark&j" then _source_ = "INT";

            data output; set output out;

            data mns;length locus1 locus2 $12; set mns;
            locus1 = "&&genmark&i";locus2 = "&&genmark&j";
            rename &&genmark&i = geno1 &&genmark&j = geno2;

            data means; set means mns;

            data ds1 ds2 ds3 dscont;set output;
            if _type_ = "ERROR" then output ds1;
            if _type_ = "SS3" then output ds2;
            if _type_ = "SS1" then output ds3;
            if _type_ = "CONTRAST" then output dscont;

            data ds1; set ds1;
            rename df=dferr ss=sserr; drop f prob;
            proc sort; by locus1 locus2;

            data ds2; set ds2;if _source_ = "INT";
            rename df=dfint ss=ssint f = fint prob = probint;
            proc sort; by locus1 locus2;

            data a; set ds3; if _source_ = "M1";
               _type_ = "MODEL"; ssa = ss;
            data b; set ds3; if _source_ = "M2";
            _type_ = "MODEL"; ssb=ss;
            data c; set ds3; if _source_ = "INT";
            _type_ = "MODEL"; ssc=ss;
            data model; merge a b c; by _type_; ssmod = ssa+ssb+ssc;
            proc sort; by locus1 locus2;

            data axa; set dscont; if _source_ = "AxA";
            rename ss = ssaxa f = faxa prob = probaxa;
            proc sort; by locus1 locus2;
            data axd; set dscont; if _source_ = "AxD";
            rename ss = ssaxd f = faxd prob = probaxd;
            proc sort; by locus1 locus2;
            data dxa; set dscont; if _source_ = "DxA";
            rename ss = ssdxa f = fdxa prob = probdxa;
            proc sort; by locus1 locus2;
            data dxd; set dscont; if _source_ = "DxD";
            rename ss = ssdxd f = fdxd prob = probdxd;
            proc sort; by locus1 locus2;

            data ds4; merge ds1 ds2 model axa axd dxa dxd; by locus1 locus2;
            sstotal = sserr+ssmod;partr2 = ssint/sstotal;
            drop df  _source_ _type_;

            data geno1; set means;if geno1 = "A" and geno2 = "A"; geno00 = lsmean;
            proc sort; by locus1 locus2;
            data geno2; set means;if geno1 = "A" and geno2 = "H"; geno01 = lsmean;
            proc sort; by locus1 locus2;
            data geno3; set means;if geno1 = "A" and geno2 = "B"; geno02 = lsmean;
            proc sort; by locus1 locus2;
            data geno4; set means;if geno1 = "H" and geno2 = "A"; geno10 = lsmean;
            proc sort; by locus1 locus2;
            data geno5; set means;if geno1 = "H" and geno2 = "H"; geno11 = lsmean;
            proc sort; by locus1 locus2;
            data geno6; set means;if geno1 = "H" and geno2 = "B"; geno12 = lsmean;
            proc sort; by locus1 locus2;
            data geno7; set means;if geno1 = "B" and geno2 = "A"; geno20 = lsmean;
            proc sort; by locus1 locus2;
            data geno8; set means;if geno1 = "B" and geno2 = "H"; geno21 = lsmean;
            proc sort; by locus1 locus2;
            data geno9; set means;if geno1 = "B" and geno2 = "B"; geno22 = lsmean;
            proc sort; by locus1 locus2;

            data allgeno; merge geno1 geno2 geno3 geno4 geno5 geno6 geno7
            geno8 geno9;by locus1 locus2;
            rename _name_ = trait; drop stderr lsmean geno1 geno2;data geno1; set means;if geno1 = 0 and geno2 = 0; geno00 = lsmean;

            data ds4; merge ds4 allgeno; by locus1 locus2;
            if dfint = 0 then delete;
            proc sort; by locus1 locus2; run;

            data selects; set selects ds4;
            data output; set empty;
            data means; set empty;
            %end; /*ends commands invoked when p-value of interaction is < threshold*/

         %end;  /*end i do loop*/

%end;  /* end j do-loop*/
%mend glm;

%f2(plht, 0.001);

data selects; set selects; if dfint ne ".";
proc print; run;
run;

这段程序看不懂，地址怎么总说不对！数据和这个.sas文件在同一个目录下

shiyiming · 发表于 2010-4-30 11:20:12

[code:1qfmxrns]/* It is assumed that all files are on the root "C:" directory. */[/code:1qfmxrns]
注释里说把输入的MaizeData.xls放到C盘根目录下

by the way,10年前能把程序写成这样算是相当了得了,那会我连excel都不会用,sas更是连听都没听说过 <img src="{SMILIES_PATH}/icon_eek.gif" alt=":shock:" title="Shocked" />

shiyiming · 发表于 2010-4-30 11:49:36

ERROR: Physical file does not exist,
EXCEL|[MaizeData.xls]MaizeLoci!R1C1:R114C1
放到c盘根目录下，还是不行

shiyiming · 发表于 2010-4-30 12:16:47

excel表的sheet的名字是MaizeLoci吗? 我试了一下没报错呀.

shiyiming · 发表于 2010-4-30 13:42:50

我也没法把excel表传到网上

		自动登录	找回密码
密码			立即注册

程序总是找不到数据地址

程序总是找不到数据地址

Re: 程序总是找不到数据地址

Re: 程序总是找不到数据地址

Re: 程序总是找不到数据地址

Re: 还是不行，没有运行结果，同样出错