Intelligent Script to Insert Records in Database Table

Hello All Members,

I am new to this forum and to shell scripting. I want to write a shell script to do the following:

Scenario: I have a pipe-delimited .txt file with multiple fields in it. The very first row of the file contains the column names, which match the column names of the corresponding database table. An example of the file is below:

“Name”|“Dept”|“Empno”|“Salary”|“DOB”|“DOJ”
“Alexander”|“10”|“200”|“$1000”|“25/05/1977”|“01/01/2015”
“Graham”|“11”|“300”|“$1500”|“05/09/1970”|“19/02/2011”
“Bell” |“12”|“400”|“$2000”|“31/01/1965”|“01/10/1999”

Table Structure:

Create table EMPLOYEE (
Name VARCHAR2(255),
Dept  NUMBER,
Empno NUMBER,
Salary NUMBER,
DOB DATE,
DOJ DATE);

Requirement: I want to write a script which can read the first row of the text file and create a dynamic insert statement based on the columns available in the file. The reasons for this requirement are:

  1. I do not know whether the same number of columns will arrive in the file every time. The file may contain all the columns or only a few of them.
  2. The order of the columns is also not fixed. They could be in random order, e.g. the file can arrive with “Dept”|“Empno”|“DOB”|“DOJ”|“Name”|“Salary”.

Currently I am using an external table, but I need to change the table definition every time the files arrive in a different column order. I cannot force the format of the file on the supplier, so I have to work with this.
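
For context, my current external table looks roughly like the sketch below (simplified; the directory object and file name are just placeholders). The field list has to mirror the exact column order of the incoming file, which is the part I have to keep changing:

-- simplified example only; the column order must match the file
CREATE TABLE employee_ext (
  Name   VARCHAR2(255),
  Dept   NUMBER,
  Empno  NUMBER,
  Salary VARCHAR2(20),
  DOB    VARCHAR2(10),
  DOJ    VARCHAR2(10)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    SKIP 1
    FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"'
  )
  LOCATION ('employee.txt')
)
REJECT LIMIT UNLIMITED;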

Pseudocode: With my limited knowledge, I think it would be something like the below (a rough awk sketch of the idea follows the list).

  • Read the first line of the file.
  • Read the columns available in the first line.
  • Create insert statement using the columns available in the first line.
  • And put the values to the above insert statement from the second row onwards.
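
Just to illustrate the idea (untested, with the table name EMPLOYEE and file name employee.txt hard-coded for the example, and assuming plain double quotes around the fields), something along these lines is what I have in mind:

awk -F'|' '
    NR == 1 {                              # header row: collect the column list
        gsub(/"/, "")                      # strip the quotes around the column names
        cols = $0
        gsub(/\|/, ",", cols)              # turn the field separators into commas
        next
    }
    {                                      # data rows: build one INSERT per line
        vals = $0
        gsub(/"/, "\047", vals)            # the values need single quotes for Oracle
        gsub(/\|/, ",", vals)
        print "INSERT INTO EMPLOYEE (" cols ") VALUES (" vals ");"
    }' employee.txt > employee.sql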

I know this will not be a simple script (at least for me); hence I am posting it in the forum. I would appreciate it if any of the senior Unix members could help me out with this.

Thanks.

Did you consider the threads proposed at the bottom of this page?

Some questions regarding your requirement:

  • How does the script know the table name?
  • Is it always the same?
  • Will the table be deleted somewhere else?
  • How does the script know the data type of the columns?
  • Will the date format always be the same?

Dear RudiC,

Please see my replies below. I hope I am making sense.

Could this help you?

#!/bin/sh

_fileName=$1
_tableName=$(basename ${_fileName})
_SQLFile=${_tableName}".sql"

tr "|" "," < ${_fileName} | awk -F"," -v tableNM=$_tableName -v sqlfn=${_SQLFile} 'BEGIN{sqlInsert="INSERT INTO table (columns) values (cValues)"}
                                  { if( NR==1)
                                    {
                                      gsub("\"","",$0)
                                      clName=$0
                                      next
                                    }
                                    gsub("table",tableNM,sqlInsert)
                                    gsub("columns",clName,sqlInsert)
                                    gsub("cValues",$0,sqlInsert)
                                    print  sqlInsert";\n" > sqlfn;
                                  }'



sqlplus -s user/passwrd <<ENDOFSQL
whenever sqlerror exit sql.sqlcode;
@${_SQLFile}
select sysdate from dual;
exit;
ENDOFSQL

RetCode=$?

#Check the return code from SQL Plus
if [ $RetCode != 0 ]
then
    echo "********************"
    echo "ERROR: The SQL Plus Command Failed. RetCode: $RetCode"
else
    echo "********************"
    echo "SQL Plus Successfully Ran. RetCode: $RetCode"
fi

Invocation

/path/toscript/InsertSql.sh /file/path/employee

Thanks RudiC,

I will run this in my environment and get back to you. The table will already exist in the database, so I do not need the table creation/drop part of the code.

I am trying to understand how this will cater for the changing order of the columns (and column names). I needed this to read the first row of the file to establish the number (and list) of columns that arrive in the file; this list is then added to the INSERT statement, followed by the values.

I think I have lost that second post of yours which used an awk command. I had put that code in action and was expecting output as below:

insert into EMPLOYEE (Name, Deptno,Empno, Salary, DOB, DOJ)  values("Alexander","10","200","$1000","25/05/1977","01/01/2015"); 
insert into EMPLOYEE (Name, Deptno,Empno, Salary, DOB, DOJ) values("Graham","11","300","$1500","05/09/1970","19/02/2011"); 
insert into EMPLOYEE (Name, Deptno,Empno, Salary, DOB, DOJ) values("Bell" ,"12","400","$2000","31/01/1965","01/10/1999");

For that, try

awk     '       {gsub (/\|/,",")
                }
         NR==1  {gsub (/"/,"")
                 TMP = "insert into EMPLOYEE (" $0 ") values ("
                 next
                }
                {print TMP, $0, ");"
                }
        ' FS="\|" file
insert into EMPLOYEE (Name,Dept,Empno,Salary,DOB,DOJ) values ( "Alexander","10","200","$1000","25/05/1977","01/01/2015" );
insert into EMPLOYEE (Name,Dept,Empno,Salary,DOB,DOJ) values ( "Graham","11","300","$1500","05/09/1970","19/02/2011" );
insert into EMPLOYEE (Name,Dept,Empno,Salary,DOB,DOJ) values ( "Bell" ,"12","400","$2000","31/01/1965","01/10/1999" );

To allow for your word processor's non-ASCII double quotes, you may want to add gsub (/“|”/, "\"") upfront.

I had to remove my former post as I missed the requirement by far. pravin27's suggestion will work for you after a little polishing.

Hi Pravin,

I ran the script in my environment. The output I am getting from it is shown below.

1. I ran the script and it is populating the correct insert statements in the file, but it is also giving the below error after execution.

./InsertSQL.sh: line 30: [: 0: unary operator expected

2. Besides, I am also getting an error that it is not able to find the employee.sql file. But when I look in the directory, I can see a file named {_SQLFile}.

I think the script is meant to create an employee.sql file and then run that file through the sqlplus command, which will eventually fire the insert statements created in the file.

INSERT INTO TABLE (NAME,DEPT,EMPNO,SALARY,DOB,DOJ) values ("Alexander","10","1000","10000","25/05/1977","01/01/2015");


I ran Pravin's code. After correcting many typos, which I had introduced, I managed to run it, and I am getting the error below:

INSERT INTO EMPLOYEE (NAME, DEPT,EMPNO,SALARY,DOB,DOJ) VALUES ("Alexander","10","1000","10000","25/05/1977","01/01/2015");
ORA-00984: Column not allowed here.

I ran the command in sqlplus and realised that the values part of the INSERT statement is carrying double quotes around the VARCHAR columns. These should be single quotes, so ideally the insert statement should look like the one below. I am amazed; with help from you guys, I am getting there very fast. I think this is one of the last few hurdles and then the script will be ready to roll.

INSERT INTO EMPLOYEE (NAME, DEPT,EMPNO,SALARY,DOB,DOJ) VALUES ('Alexander','10','1000','10000','25/05/1977','01/01/2015');

Try gsub (/“|”/, "\'") in the first line of the awk part.

You mean as shown below?

tr "|" "," < ${_fileName} | awk -F"," -v tableNM=$_tableName -v sqlfn=${_SQLFile} 'BEGIN{sqlInsert="INSERT INTO table (columns) values (cValues)"}                                   { if( NR==1)                                     { 
                                      gsub (/"|"/,"\'")
                                      gsub("\"","",$0)                                       clName=$0                                       next                                     }                                     gsub("table",tableNM,sqlInsert)                                     gsub("columns",clName,sqlInsert)                                     gsub("cValues",$0,sqlInsert)                                     print  sqlInsert";\n" > sqlfn;                                   }'

OR

awk     '      {gsub (/"|"/,"\'")
               }
               {gsub (/\|/,",")
                }
         NR==1  {gsub (/"/,"")
                 TMP = "insert into EMPLOYEE (" $0 ") values ("
                 next
                }
                {print TMP, $0, ");"
                }
        ' FS="\|" file

Sorry, I can't read your post as it is formatted. Did you try 'Go Advanced' and the 'Preview Post' option there?

However, put it into the general action before the NR==1 pattern, like

awk     '       {gsub (/\|/,",")
                 gsub (/“|”/, "\"")
                }
         NR==1  {gsub (/\'/,"")
                 TMP = "insert into EMPLOYEE (" $0 ") values ("
                 next
                }
                {print TMP, $0, ");"
                }
        ' FS="\|" file

But then take care to remove the "\'" in the NR==1 action! And make sure you know what you're working on: these “ ” are three-byte chars, as opposed to the ASCII 0x22 double quotes.
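
If in doubt, a quick way to see what the header line really contains is to dump its raw bytes, e.g.

head -1 file | od -c

In a UTF-8 locale the curly quotes show up as three-byte sequences (octal 342 200 234 and 342 200 235), whereas a plain ASCII double quote shows up as " itself.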

This might help you

akshay@Aix:/tmp$ cat infile
“Name”|“Dept”|“Empno”|“Salary”|“DOB”|“DOJ”
“Alexander”|“10”|“200”|“$1000”|“25/05/1977”|“01/01/2015”
“Graham”|“11”|“300”|“$1500”|“05/09/1970”|“19/02/2011”
“Bell” |“12”|“400”|“$2000”|“31/01/1965”|“01/10/1999”
akshay@Aix:/tmp$ cat import.awk
function clean(x){ gsub(/[“”\047\042]/,x); gsub(/\|/,",") }
FNR==1{ clean(); h = $0 ; next}
{
	clean("\047");
	print "insert into",table,"("h")","values","("$0");"
}
akshay@Aix:/tmp$ awk -vtable="Employee" -f import.awk infile >test.sql
akshay@Aix:/tmp$ cat test.sql 
insert into Employee (Name,Dept,Empno,Salary,DOB,DOJ) values ('Alexander','10','200','$1000','25/05/1977','01/01/2015');
insert into Employee (Name,Dept,Empno,Salary,DOB,DOJ) values ('Graham','11','300','$1500','05/09/1970','19/02/2011');
insert into Employee (Name,Dept,Empno,Salary,DOB,DOJ) values ('Bell' ,'12','400','$2000','31/01/1965','01/10/1999');

OR try this

akshay@Aix:/tmp$ cat import_2.awk
function clean(x){ gsub(/[“”\047\042]/,x); gsub(/\|/,",") }
FNR==1{ clean("`"); print "INSERT INTO","`"table"`","("$0")","VALUES"; next}
{
	if(s)print s",";
	clean("\047");
	s = "("$0")"
}
END{
	print s";"
}
akshay@Aix:/tmp$ awk -vtable="Employee" -f import_2.awk infile
INSERT INTO `Employee` (`Name`,`Dept`,`Empno`,`Salary`,`DOB`,`DOJ`) VALUES
('Alexander','10','200','$1000','25/05/1977','01/01/2015'),
('Graham','11','300','$1500','05/09/1970','19/02/2011'),
('Bell' ,'12','400','$2000','31/01/1965','01/10/1999');

This is what I get from above:

insert into Employee (Name,Dept,Empno,Salary,DOB,DOJ) values ('''Bell''' ,'''12''','''400''','''$2000''','''31/01/1965''','''01/10/1999''');

Don't forget, those “ ” are three-byte chars (at least in my locale)!

Modified code

#!/bin/sh

_fileName=$1
_tableName=$(basename ${_fileName})
_SQLFile=${_tableName}".sql"

tr "|" "," < ${_fileName} | awk -F"," -v tableNM=$_tableName -v sqlfn=${_SQLFile} 'BEGIN{sqlInsert="INSERT INTO table (columns) values (cValues)"}
                                  { if( NR==1)
                                    {
                                      gsub("\"","",$0)
                                      clName=$0
                                      next
                                    }
                                    gsub("table",tableNM,sqlInsert)
                                    gsub("columns",clName,sqlInsert)
                                    gsub("\"","'\''",$0)
                                    gsub("cValues",$0,sqlInsert)
                                    print  sqlInsert";\n" > sqlfn;
                                  }'



sqlplus -s $user/$passwrd <<ENDOFSQL
whenever sqlerror exit sql.sqlcode;
@${_SQLFile}
select sysdate from dual;
exit;
ENDOFSQL

RetCode=$?

#Check the return code from SQL Plus
if [ $RetCode != 0 ]
then
    echo "********************"
    echo "ERROR: The SQL Plus Command Failed. RetCode: $RetCode"
else
    echo "********************"
    echo "SQL Plus Successfully Ran. RetCode: $RetCode"
fi

Hi Akshay,

The first code works perfectly for me, but I need this combined into one script, so that I can run everything from that single script: once I fire the script, it should create the test.sql (or any other intermediate file), fire it against the database, and insert the records into the table.

One more question: what do I need to change in the case of ~-separated fields?

Thanks

Chaps, this must be my last question on this topic now.

How can I also include the tilde (~) along with the pipe separator?

Some files have tildes and some have pipes.

OK, use this:

function clean(x){ gsub(/[“”\047\042]/,x); gsub(FS,",") }
FNR==1{ clean(); h = $0 ; next}
{
	clean("\047");
	print "insert into",table,"("h")","values","("$0");"
}
akshay@Aix:/tmp$ awk -vFS="~" -vtable="Employee" -f import.awk infile >test.sql

This doesn't work: when executing it, the INSERT statement is created with the pipe still in it instead of a comma, as shown below:

INSERT INTO EMPLOYEE (NAME|DEPT|EMPNO|SALARY|DOB|DOJ).......

Or try this.

If the separator is ~:

awk -vFS="\\\~" -vtable="Employee" -f import.awk infile

If the separator is |:

awk -vFS="\\\|" -vtable="Employee" -f import.awk infile

Try

awk     '       {gsub (/[|~]/,",")
                 gsub (/“|”/, "\047")
                }
         NR==1  {gsub (/\047/,"")
                 TMP = "insert into EMPLOYEE (" $0 ") values ("
                 next
                }
                {print TMP, $0, ");"
                }
        '  file

Guys,

Thanks a lot for your prompt and detailed help. I am now able to put together a script which dynamically creates the insert statements with different separators, inserts the data (using the insert statements created with your help) into the database table, and cleans up the intermediate files.
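
For anyone who finds this thread later, the combined wrapper ended up roughly along the lines of the sketch below. This is a simplified outline rather than the exact script: DB_USER/DB_PASS are placeholder credentials, the table name is taken from the file name, and the awk part is RudiC's last version with the table name passed in as a variable.

#!/bin/sh
# Rough sketch only - generate the INSERT statements, run them, then clean up.

_fileName=$1
_tableName=$(basename "${_fileName}")
_SQLFile="${_tableName}.sql"

# build one INSERT per data row; handles | and ~ separators and turns the curly quotes into single quotes
awk '   {gsub (/[|~]/, ",")
         gsub (/“|”/, "\047")
        }
 NR==1  {gsub (/\047/, "")
         TMP = "insert into " T " (" $0 ") values ("
         next
        }
        {print TMP, $0, ");"
        }
' T="${_tableName}" "${_fileName}" > "${_SQLFile}"

# run the generated statements; DB_USER/DB_PASS are placeholders
sqlplus -s "${DB_USER}/${DB_PASS}" <<ENDOFSQL
whenever sqlerror exit sql.sqlcode;
@${_SQLFile}
exit;
ENDOFSQL
RetCode=$?

# remove the intermediate file and report the outcome
rm -f "${_SQLFile}"
if [ "$RetCode" -ne 0 ]
then
    echo "ERROR: SQL*Plus failed. RetCode: $RetCode"
else
    echo "SQL*Plus ran successfully. RetCode: $RetCode"
fi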

Once again, thanks a ton for your help, especially RudiC, Akshay and Pravin. You guys really rock!! Keep up the good work.

Thanks.