How to read a dynamically changing file and load into Oracle?

I have a tab delimited file which has 27 character fields. The file needs to be loaded into an Oracle table. But the challenge is that everytime the file comes it may or may not have values in all 27 fields.

Column Definition of the 27 fields:

TYPE: Char (1)
NAME: Char (30)
CUSTOM_VAL: Char (2)
CUSTOM_DATANAME1: Char (30)
CUSTOM_DATATYPE1: Char(1)
CUSTOM_DATAFORMAT1: Char(8)
CUSTOM_DATANAME2: Char (30)
CUSTOM_DATATYPE2: Char(1)
CUSTOM_DATAFORMAT2: Char(8)

...... and so on till CUSTOM_DATAFORMAT8

For example: If today's file has 5 records, three records have values in all 27 fields and two records have values in 10 fields. The rest of them are blank or empty.

Tomorrow's file will be different...I am not sure how to deal with this dynamically changing file.

I did an extensive search on the forum, but unable to find any related hits. There are hits on how to invoke SQL through UNIX.

Any help will be highly appreciated...

Thank You,
Madhu

Use SQL Loader

Thank You Ambikesh..

Our requirement is only to read this file in UNIX, invoke Oracle and load it into the specific table...

Any shell script examples for this will really help me a lot....

A shell script solution is simple enough but a better solution is to use an external table definition. This is defined in the database and has its own sqlldr like control file incorporated into the table definition. This then should give you access to whatever the file updates are as they occur throughout the day without invoking a shell script.

Whether or not same columns are provided in the file, as long as their location in the fixed width or CSV formatted flat file, you will be fine.

Thank you tmarikle...

I did not quite understand what exactly you meant by it...

Yes...It is a tab delimited file. But the problem is that if there are no values in the rest of the columns, they are not filled with spaces or tabs...

Row1: A B C D E F G H
Row2: A B C D
Row3: A B

I am absolutely not sure how I can do this...

Both sqlldr and external tables can process data in the format that you have described. I'll edit this post with some examples shortly.

Additional info:

Here is how you can define an external table using another table and a sqlldr control file to start with.

1) Create your table:

CREATE TABLE my_external_table (
my_key_no VARCHAR2(45)
,load_date DATE
,col_a VARCHAR2(30)
,col_b VARCHAR2(30)
,col_c VARCHAR2(30)
,col_d VARCHAR2(30)
,col_e VARCHAR2(30)
,col_f VARCHAR2(30)
)

2) Create a control file for sqlldr that allows for "missing" columns as per your requirement.

LOAD DATA

INTO TABLE "MY_EXTERNAL_TABLE"
APPEND

FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' AND '"'
TRAILING NULLCOLS

(
my_key_no "SUBSTRB(:my_key_no, 1, 45)"
,load_date "TO_DATE(:load_date, 'YYYYMMDD')"
,col_a "SUBSTRB(:col_a , 1, 30)"
,col_b "SUBSTRB(:col_a , 1, 30)"
,col_c "SUBSTRB(:col_a , 1, 30)"
,col_d "SUBSTRB(:col_a , 1, 30)"
,col_e "SUBSTRB(:col_a , 1, 30)"
,col_f "SUBSTRB(:col_a , 1, 30)"
)

3) Use sqlldr to define a comperable external table definition.

sqlldr user/pwd test.ctl external_table=generate_only

4) The resulting log file contains the external table DDL and examples on how to use it.

CREATE TABLE "SYS_SQLLDR_X_EXT_MY_EXTERNAL_T"
(
"MY_KEY_NO" VARCHAR(255),
"LOAD_DATE" VARCHAR(255),
"COL_A" VARCHAR(255),
"COL_B" VARCHAR(255),
"COL_C" VARCHAR(255),
"COL_D" VARCHAR(255),
"COL_E" VARCHAR(255),
"COL_F" VARCHAR(255)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY SYS_SQLLDR_XT_TMPDIR_00000
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET WE8MSWIN1252
BADFILE 'SYS_SQLLDR_XT_TMPDIR_00000':'test.bad'
LOGFILE 'test.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ LDRTRIM
MISSING FIELD VALUES ARE NULL
REJECT ROWS WITH ALL NULL FIELDS
(
"MY_KEY_NO" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"LOAD_DATE" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"COL_A" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"COL_B" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"COL_C" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"COL_D" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"COL_E" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """ ,
"COL_F" CHAR(255)
TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' AND """
)
)
location
(
'test.dat'
)
)REJECT LIMIT UNLIMITED

At any rate, you can read up on external tables at asktom.oracle.com.

By the way, if you don't want to use an external table, you can still use sqlldr. You need a controlfile as follows:

LOAD DATA

INTO TABLE "MY_EXTERNAL_TABLE"
APPEND

FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' AND '"'
TRAILING NULLCOLS
(
my_key_no "SUBSTRB(:my_key_no, 1, 45)"
,load_date "TO_DATE(:load_date, 'YYYYMMDD')"
,col_a "SUBSTRB(:col_a , 1, 30)"
,col_b "SUBSTRB(:col_a , 1, 30)"
,col_c "SUBSTRB(:col_a , 1, 30)"
,col_d "SUBSTRB(:col_a , 1, 30)"
,col_e "SUBSTRB(:col_a , 1, 30)"
,col_f "SUBSTRB(:col_a , 1, 30)"
)

Then you can script your sqlldr call in Unix as follows:

sqlldr user/password control=test.ctl data=test.dat

Any column missing will be set to null. This is accompished in the "fields terminated by" statement in the control file.

Please note that this example assume CSV not tab delimited. Tab delimited use this syntax:

FIELDS TERMINATED BY X'09' OPTIONALLY ENCLOSED BY '"' AND '"'
TRAILING NULLCOLS

Thank you Thomas...

I didn't understand point 4....

I have invoked SQL LOADER through UNIX like this...Is the way you have suggested?

$ORACLE_HOME/bin/sqlldr userid=username/password@db control=$SCRIPT_DIR/metadatacontrolfile.ctl log=$SCRIPT_DIR/metadatacontrolfile.log data=$SCRIPT_DIR/file1
Control File is like this:

LOAD DATA
append
INTO TABLE ACXIOM_METADATA_A
FIELDS TERMINATED BY "\t"
TRAILING NULLCOLS
(
 CLIENT_PRODUCT_CD "SUBSTRB(:CLIENT_PRODUCT_CD , 1, 30)"
,PARTNER_NAME "SUBSTRB(:PARTNER_NAME , 1, 30)"
,NO_CUSTOM_VALUE INTEGER
,CUSTOM_VALUE1_NAME "SUBSTRB(:CUSTOM_VALUE1_NAME , 1, 30)"
,CUSTOM_VALUE1_TYPE "SUBSTRB(:CUSTOM_VALUE1_TYPE , 1, 1)"
,CUSTOM_VALUE1_FORMAT "SUBSTRB(:CUSTOM_VALUE1_FORMAT , 1, 8)"
,CUSTOM_VALUE2_NAME "SUBSTRB(:CUSTOM_VALUE2_NAME , 1, 30)"
,CUSTOM_VALUE2_TYPE "SUBSTRB(:CUSTOM_VALUE2_TYPE , 1, 1)"
,CUSTOM_VALUE2_FORMAT "SUBSTRB(:CUSTOM_VALUE2_FORMAT , 1, 8)"
,CUSTOM_VALUE3_NAME "SUBSTRB(:CUSTOM_VALUE3_NAME , 1, 30)"
,CUSTOM_VALUE3_TYPE "SUBSTRB(:CUSTOM_VALUE3_TYPE , 1, 1)"
,CUSTOM_VALUE3_FORMAT "SUBSTRB(:CUSTOM_VALUE3_FORMAT , 1, 8)"
,CUSTOM_VALUE4_NAME "SUBSTRB(:CUSTOM_VALUE4_NAME , 1, 30)"
,CUSTOM_VALUE4_TYPE "SUBSTRB(:CUSTOM_VALUE4_TYPE , 1, 1)"
,CUSTOM_VALUE4_FORMAT "SUBSTRB(:CUSTOM_VALUE4_FORMAT , 1, 8)"
,CUSTOM_VALUE5_NAME "SUBSTRB(:CUSTOM_VALUE5_NAME , 1, 30)"
,CUSTOM_VALUE5_TYPE "SUBSTRB(:CUSTOM_VALUE5_TYPE , 1, 1)"
,CUSTOM_VALUE5_FORMAT "SUBSTRB(:CUSTOM_VALUE5_FORMAT , 1, 8)"
,CUSTOM_VALUE6_NAME "SUBSTRB(:CUSTOM_VALUE6_NAME , 1, 30)"
,CUSTOM_VALUE6_TYPE "SUBSTRB(:CUSTOM_VALUE6_TYPE , 1, 1)"
,CUSTOM_VALUE6_FORMAT "SUBSTRB(:CUSTOM_VALUE6_FORMAT , 1, 8)"
,CUSTOM_VALUE7_NAME "SUBSTRB(:CUSTOM_VALUE7_NAME , 1, 30)"
,CUSTOM_VALUE7_TYPE "SUBSTRB(:CUSTOM_VALUE7_TYPE , 1, 1)"
,CUSTOM_VALUE7_FORMAT "SUBSTRB(:CUSTOM_VALUE7_FORMAT , 1, 8)"
,CUSTOM_VALUE8_NAME "SUBSTRB(:CUSTOM_VALUE8_NAME , 1, 30)"
,CUSTOM_VALUE8_TYPE "SUBSTRB(:CUSTOM_VALUE8_TYPE , 1, 1)"
,CUSTOM_VALUE8_FORMAT "SUBSTRB(:CUSTOM_VALUE8_FORMAT , 1, 8)"
,CUSTOM_VALUE9_NAME "SUBSTRB(:CUSTOM_VALUE9_NAME , 1, 30)"
,CUSTOM_VALUE9_TYPE "SUBSTRB(:CUSTOM_VALUE9_TYPE , 1, 1)"
,CUSTOM_VALUE9_FORMAT "SUBSTRB(:CUSTOM_VALUE9_FORMAT , 1, 8)"
,CUSTOM_VALUE10_NAME "SUBSTRB(:CUSTOM_VALUE10_NAME , 1, 30)"
,CUSTOM_VALUE10_TYPE "SUBSTRB(:CUSTOM_VALUE10_TYPE , 1, 1)"
,CUSTOM_VALUE10_FORMAT "SUBSTRB(:CUSTOM_VALUE10_FORMAT , 1, 8)"
,CUSTOM_VALUE11_NAME "SUBSTRB(:CUSTOM_VALUE11_NAME , 1, 30)"
,CUSTOM_VALUE11_TYPE "SUBSTRB(:CUSTOM_VALUE11_TYPE , 1, 1)"
,CUSTOM_VALUE11_FORMAT "SUBSTRB(:CUSTOM_VALUE11_FORMAT , 1, 8)"
,CUSTOM_VALUE12_NAME "SUBSTRB(:CUSTOM_VALUE12_NAME , 1, 30)"
,CUSTOM_VALUE12_TYPE "SUBSTRB(:CUSTOM_VALUE12_TYPE , 1, 1)"
,CUSTOM_VALUE12_FORMAT "SUBSTRB(:CUSTOM_VALUE12_FORMAT , 1, 8)"
,CUSTOM_VALUE13_NAME "SUBSTRB(:CUSTOM_VALUE13_NAME , 1, 30)"
,CUSTOM_VALUE13_TYPE "SUBSTRB(:CUSTOM_VALUE13_TYPE , 1, 1)"
,CUSTOM_VALUE13_FORMAT "SUBSTRB(:CUSTOM_VALUE13_FORMAT , 1, 8)"
,CUSTOM_VALUE14_NAME "SUBSTRB(:CUSTOM_VALUE14_NAME , 1, 30)"
,CUSTOM_VALUE14_TYPE "SUBSTRB(:CUSTOM_VALUE14_TYPE , 1, 1)"
,CUSTOM_VALUE14_FORMAT "SUBSTRB(:CUSTOM_VALUE14_FORMAT , 1, 8)"
,CUSTOM_VALUE15_NAME "SUBSTRB(:CUSTOM_VALUE15_NAME , 1, 30)"
,CUSTOM_VALUE15_TYPE "SUBSTRB(:CUSTOM_VALUE15_TYPE , 1, 1)"
,CUSTOM_VALUE15_FORMAT "SUBSTRB(:CUSTOM_VALUE15_FORMAT , 1, 8)"
,CUSTOM_VALUE16_NAME "SUBSTRB(:CUSTOM_VALUE16_NAME , 1, 30)"
,CUSTOM_VALUE16_TYPE "SUBSTRB(:CUSTOM_VALUE16_TYPE , 1, 1)"
,CUSTOM_VALUE16_FORMAT "SUBSTRB(:CUSTOM_VALUE16_FORMAT , 1, 8)"
,CUSTOM_VALUE17_NAME "SUBSTRB(:CUSTOM_VALUE17_NAME , 1, 30)"
,CUSTOM_VALUE17_TYPE "SUBSTRB(:CUSTOM_VALUE17_TYPE , 1, 1)"
,CUSTOM_VALUE17_FORMAT "SUBSTRB(:CUSTOM_VALUE17_FORMAT , 1, 8)"
,CUSTOM_VALUE18_NAME "SUBSTRB(:CUSTOM_VALUE18_NAME , 1, 30)"
,CUSTOM_VALUE18_TYPE "SUBSTRB(:CUSTOM_VALUE18_TYPE , 1, 1)"
,CUSTOM_VALUE18_FORMAT "SUBSTRB(:CUSTOM_VALUE18_FORMAT , 1, 8)"
,CUSTOM_VALUE19_NAME "SUBSTRB(:CUSTOM_VALUE19_NAME , 1, 30)"
,CUSTOM_VALUE19_TYPE "SUBSTRB(:CUSTOM_VALUE19_TYPE , 1, 1)"
,CUSTOM_VALUE19_FORMAT "SUBSTRB(:CUSTOM_VALUE19_FORMAT , 1, 8)"
,CUSTOM_VALUE20_NAME "SUBSTRB(:CUSTOM_VALUE20_NAME , 1, 30)"
,CUSTOM_VALUE20_TYPE "SUBSTRB(:CUSTOM_VALUE20_TYPE , 1, 1)"
,CUSTOM_VALUE20_FORMAT "SUBSTRB(:CUSTOM_VALUE20_FORMAT , 1, 8)"
,CUSTOM_VALUE21_NAME "SUBSTRB(:CUSTOM_VALUE21_NAME , 1, 30)"
,CUSTOM_VALUE21_TYPE "SUBSTRB(:CUSTOM_VALUE21_TYPE , 1, 1)"
,CUSTOM_VALUE21_FORMAT "SUBSTRB(:CUSTOM_VALUE21_FORMAT , 1, 8)"
,CUSTOM_VALUE22_NAME "SUBSTRB(:CUSTOM_VALUE22_NAME , 1, 30)"
,CUSTOM_VALUE22_TYPE "SUBSTRB(:CUSTOM_VALUE22_TYPE , 1, 1)"
,CUSTOM_VALUE22_FORMAT "SUBSTRB(:CUSTOM_VALUE22_FORMAT , 1, 8)"
,CUSTOM_VALUE23_NAME "SUBSTRB(:CUSTOM_VALUE23_NAME , 1, 30)"
,CUSTOM_VALUE23_TYPE "SUBSTRB(:CUSTOM_VALUE23_TYPE , 1, 1)"
,CUSTOM_VALUE23_FORMAT "SUBSTRB(:CUSTOM_VALUE23_FORMAT , 1, 8)"
,CUSTOM_VALUE24_NAME "SUBSTRB(:CUSTOM_VALUE24_NAME , 1, 30)"
,CUSTOM_VALUE24_TYPE "SUBSTRB(:CUSTOM_VALUE24_TYPE , 1, 1)"
,CUSTOM_VALUE24_FORMAT "SUBSTRB(:CUSTOM_VALUE24_FORMAT , 1, 8)"
,CUSTOM_VALUE25_NAME "SUBSTRB(:CUSTOM_VALUE25_NAME , 1, 30)"
,CUSTOM_VALUE25_TYPE "SUBSTRB(:CUSTOM_VALUE25_TYPE , 1, 1)"
,CUSTOM_VALUE25_FORMAT "SUBSTRB(:CUSTOM_VALUE25_FORMAT , 1, 8)"
,RECORD_LOAD_DT "TO_DATE(:lRECORD_LOAD_DT, 'YYYYMMDD')"
,RECORD_LOAD_USER "SUBSTRB(:RECORD_LOAD_USER , 1, 32)"
)

Looks like a good start. :stuck_out_tongue: I wasn't aware that "\t" could be substituted for X'09' field terminators. Also SUBSTRB may or may not be suitable for your environment. Its purpose is to chop a string on a byte boundary rather than a character boundary. This is used sometimes when you have a character column that is specified as VARCHAR2, CHAR, etc of some size that can't contain Unicode characters using several bytes for a single character. It's better to define the table as NVARCHAR2, NCHAR, etc. The column will allow for multiple bytes worth of storage for any single or multiple byte sequences of characters.