|
|
楼主

楼主 |
发表于 2008-7-28 18:22:38
|
只看该作者
"FASTLOAD" option for loading data set into Teradata
Here is some tips about loading the SAS data sets into the Teradata database.
(1) When writing SAS data sets to Teradata database, we may use the “FASTLOAD” option:
Here is the example:
[code:ubwyelru] libname target teradata user=yourname password=yourpwd database=target tdpid="yourtdpid" OVERRIDE_RESP_LEN=YES DBCOMMIT=0;
libname source'/sasdata/';
data target.temp(FASTLOAD=YES FBUFSIZE=32000);
set source.temp;
format user_id 18.;
run;[/code:ubwyelru]
Follow this example, we manage to load a SAS data set of 1,500,000 columns into the Teradata in just 40 seconds while it seems that the loading would never succeed without the FASTLOAD option.
The parameter ‘FBUFSIZE’ means the buffer size for the loading to the Teradata. Its value depends on the environment of SAS system、Teradata system and the size for the SAS data set. As the experiment result, the value ‘32000’ is the optimal value for the loading. But it would change according to the different situations.
While using libname statement and data step to load the data set into our Teradata , we would probably meet with such a problem:the data types might change to an unexpected format when the data set has been loaded into. In this situation , we should use ‘FORMAT’ statement to assign an certain format. In the previous example, we set the user_id to the numeric 18 format. For more details for the data types difference between the SAS and Teradata , you may see the following link:
<!-- m --><a class="postlink" href="http://support.sas.com/documentation/cdl/en/acreldb/59618/HTML/default/a001384390.htm">http://support.sas.com/documentation/cd ... 384390.htm</a><!-- m -->
The “FASTLOAD” option has such limitations:
Ø FastLoad can load only empty tables; it cannot append to a table that already contains data. If you attempt to use FastLoad when appending to a table that contains rows, the append step fails.
Ø Both the Teradata FastLoad Utility and the SAS/ACCESS FastLoad facility log data errors to tables. Error recovery can be difficult. To find the error that corresponds to the code stored in the error table, see the Teradata FastLoad documentation.
Ø FastLoad does not load duplicate rows (rows where all corresponding fields contain identical data) into a Teradata table. If your SAS data set contains duplicate rows, you can use the normal insert (load) process.
Cautions: When the SAS data set has few column(I think the boundary would be 1000 columns), you may load it without the “FASTLOAD”option. It would be faster.
(2) When we have to load a very large table into Teradata, we have to concern the performance of the table stored in Teradata , especially the Primary Index of the table. In this condition, we could pre-create a table in the Teradata using the pass-through method and load the data into the created table with the PROC APPEND. Here is the example:
[code:ubwyelru] libname target teradata user=yourname password=yourpwd database=target tdpid="yourtdpid" OVERRIDE_RESP_LEN=YES DBCOMMIT=0;
proc sql;
connect to teradata as tera(user=yourname password=yourpwd database=target tdpid=yourtdpid);
execute (CREATE SET TABLE target.test ,NO FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT
( name varchar(20) not null,
sex varchar(20) not null,
age integer ,
height decimal(5,2) ,
weight decimal(5,2)
)
PRIMARY INDEX ( name )) by tera;
execute (commit) by tera;
disconnect to tera;
quit;
proc append data=sashelp.class base = target.test (FASTLOAD=YES FBUFSIZE=32000);
run;[/code:ubwyelru]
Follow this example, we may also avoid the problem of data type mismatch for we have pre-define the column in the pass-through SQL block. So we may also use this method to load the data. |
|