Original poster
Posted on 2009-10-29 13:37:49
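Another problem importing a HapMap file into SAS
Thanks again to hopewell from the previous thread. I did not have high hopes, but I got such warm and effective help that my confidence in learning SAS has grown a lot :)
There is actually one more problem, possibly a bit trickier than the last one. Take a look if you are interested~
HapMap file 2 (everything up to each ; is a single line):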
#Wed Oct 28 01:29:38 2009: HapMap genotype data dump, SNPs genotyped in population MEX on chr7:26924045..27424045;
#For details on file format, see http://www.hapmap.org/genotypes/;
rs# alleles chrom pos strand assembly# center protLSID assayLSID panelLSID QCcode NA19663 NA19664 NA19665 NA19722 NA19723 NA19649 NA19669 NA19656 NA19657 NA19658 NA19686 NA19719 NA19720 NA19724 NA19726 NA19747 NA19759 NA19773 NA19780 NA19675 NA19676 NA19677 NA19651 NA19653 NA19683 NA19684 NA19725 NA19727 NA19755 NA19756 NA19757 NA19772 NA19774 NA19775 NA19776 NA19777 NA19778 NA19783 NA19784 NA19796 NA19650 NA19671 NA19661 NA19682 NA19771 NA19779 NA19781 NA19782 NA19788 NA19659 NA19660 NA19662 NA19678 NA19680 NA19681 NA19746 NA19721 NA19748 NA19760 NA19718 NA19790 NA19794 NA19795 NA19654 NA19749 NA19751 NA19761 NA19762 NA19763 NA19770 NA19670 NA19716 NA19750 NA19789 NA19685 NA19679 NA19652;
rs774265 A/G chr7 26925442 + ncbi_b36 bbs urn:lsid:bbs.hapmap.org:Protocol:Phase3_Draft2:1 urn:lsid:bbs.hapmap.org:Assay:Phase3_Draft2_rs774265:1 urn:lsid:dcc.hapmap.org:Panel:US_Mexican-30-trios:3 QC+ GG GG GG GG GG GG GG GG GG AG GG GG GG GG GG GG GG GG AG GG GG GG GG GG GG GG GG GG GG AG AG AG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG GG AG GG GG GG AG GG GG GG;
......
rs4722699 C/T chr7 27423025 + ncbi_b36 bbs urn:lsid:bbs.hapmap.org:Protocol:Phase3_Draft2:1 urn:lsid:bbs.hapmap.org:Assay:Phase3_Draft2_rs4722699:1 urn:lsid:dcc.hapmap.org:Panel:US_Mexican-30-trios:3 QC+ CT CT TT CT CT CC CC CC CT CC CC CT CC TT CC CT CC CC CC CC CT CT CC CC CC CC CC CC CT CT TT CC CC CC CC CT CT CT TT CC CC CC CT CC CC CT CT CT CC CT CC CT CC CT CC TT CC TT CC CT CC CC CC CC CC CC CC CT CT CT CC CC CC CC CC CT CC;
The software needs two files (formats):
dat:
M rs774265
...
M rs4722699
This one does not seem too hard. Following hopewell's example, I wrote this:
data datas;
infile 'E:\imputation\mex-ii.txt';
input text;
if substr(text,1,2)='rs' then
do;
text='M'||scan(text,1,' ');
output datas;
end;
run;
data _null_;
set datas;
file 'd:\dats.txt';
put text;
run;
SAS reports these errors:
NOTE: Invalid data for text in line 1 1-4.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9
1 #Wed Oct 28 01:29:38 2009: HapMap genotype data dump, SNPs genotyped in population MEX on
91 chr7:26924045..27424045 113
text=. _ERROR_=1 _N_=1
....(the first three lines of text give similar errors)
NOTE: Invalid data for text in line 4 1-8.
4 rs774265 A/G chr7 26925442 + ncbi_b36 bbs urn:lsid:bbs.hapmap.org:Protocol:Phase3_Draft2:1
91 urn:lsid:bbs.hapmap.org:Assay:Phase3_Draft2_rs774265:1 urn:lsid:dcc.hapmap.org:Panel:US_M
181 exican-30-trios:3 QC+ GG GG GG GG GG GG GG GG GG AG GG GG GG GG GG GG GG GG
text=. _ERROR_=1 _N_=4
.....(the remaining n lines give similar errors)
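The output file ends up empty.
I suspect I am still using scan incorrectly. The third line of text (the header row starting with rs#) also gets in the way.
I cannot make sense of that error...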
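One plausible reading of the log: `input text;` has no `$`, so SAS treats `text` as a numeric variable. Every record then raises "Invalid data" and `text` is set to missing (`.`). The `substr(text,1,2)='rs'` test is therefore never true, no row is output, and the file comes out empty. Below is a minimal sketch of a possible fix, not a tested solution. The paths are copied from the code above. The variable name `rsid`, the `$20` length and the `lrecl=32767` setting are assumptions made for illustration. It reads only the first blank-delimited token of each record as character data, skips the `#` comment lines and the `rs#` header row, and writes the `M` and the ID separated by a space, as in the dat format shown earlier.

```sas
data _null_;
    /* Paths copied from the post. LRECL is raised (assumed value) because   */
    /* each genotype record is far longer than the default in older releases. */
    infile 'E:\imputation\mex-ii.txt' lrecl=32767 truncover;
    file 'd:\dats.txt';

    /* Assumed upper bound for an rs ID. Without it the default of 8 bytes   */
    /* would truncate a 9-character ID such as rs4722699.                    */
    length rsid $20;

    /* The $ makes this character list input: only the first token is read. */
    input rsid $;

    /* Keep SNP rows only: '#' comment lines fail the 'rs' test, and the     */
    /* header row is excluded explicitly because its first token is 'rs#'.   */
    if substr(rsid, 1, 2) = 'rs' and rsid ne 'rs#' then put 'M ' rsid;
run;
```

Writing with `put` inside the same `data _null_` step avoids the intermediate `datas` data set. The original `'M'||scan(text,1,' ')` would also have joined the two parts without a space even if `text` had been read as character.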
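There is one more file to generate. I will describe it in the next post.
As before, sincere thanks for reading this far!
Sorry for the basic questions, and thanks :)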