Wrap lines with awk to create SQL script

alan · June 15, 2011, 7:46pm

Greetings!

Some of my files list hardware errors (we test electronic components), some have none. If the file name has no errors, I still want to display a message like "No error", else I display the error from the file itself.

I came up with this (with help)

for myfile in `find . -name "R*VER" -mtime +1`
do
     somestr=`grep -H ^err $myfile || echo -e "$myfile\tNo error"`
     echo "$somestr" |sed 's:./::;s:tmp/::;s/:/\t/;'
     echo $somestr | awk -F~ '{print $1"\t"$2"\t"$7"\t"$8"\t"$9"\t"$10}' | sed 's:./::;s:tmp/::;'
done

It works but outputs on two lines i.e.

RRR1~COS~COSMETICS~99537~jgmdtv132~1~P~R22-200~029053252648~20110607~094718.VER    No error
RRR1    COS    P    R22-200    029053252648    20110607
RRR1~COS~COSMETICS~ETT03~jgm14652.~1~F~R16-300~000894980523~20110607~084053.VER    err    ->IR Remote Key 1    3310    err   
RRR1    COS    F    R16-300    000894980523    20110607

I am looking to generate dynamically a SQL script with insert statements like so:

INSERT INTO MYTABLE (COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8)
VALUES ('RRR1~COS~COSMETICS~99537~jgmdtv132~1~P~R22-200~029053252648~20110607~094718.VER','No error','RRR1','COS','P','R22-200','029053252648','20110607');
INSERT INTO MYTABLE (COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8)
VALUES  ('RRR1~COS~COSMETICS~ETT03~jgm14652.~1~F~R16-300~000894980523~20110607~084053.VER',' err    ->IR Remote Key 1    3310    err','RRR1','COS','P','R16-300','000894980523','20110607');

All I would need to do is wrap both lines but I am not clear how to do it (still learning awk/sed). Is it possible?

Thanks for your help.

agama · June 15, 2011, 10:36pm

Yep, it's possible. Here is one way:

#!/usr/bin/env ksh

# if you have a lot of these files, this method will
# prevent arg length errors to the for
find . -name "R*VER" -mtime +1 | while read fname
do
    ( grep "^err" $fname || echo "No error" ) | awk  -v base=${fname##*/} '
        {
            split( base, a, "~" );          # split filename into components
            printf( "%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n", base, $0, a[1], a[2], a[7], a[8], a[9], a[10] );
        }'
done

With some minor changes, it will generate your SQL (based on what you provided; I don't do SQL so I don't know if it's exactly what you need, but you'll get the drift):

#!/usr/bin/env ksh

# if you have a lot of these files, this method will
# prevent arg length errors to the for
find . -name "R*VER" -mtime +1 | while read fname
do
    ( grep "^err" $fname || echo "No error" ) | awk -v squote="'" -v base=${fname##*/} '
        {
            split( base, a, "~" );          # split filename into components
            printf( "INSERT INTO MYTABLE (COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8)\n" );
            str = sprintf( "VALUES (\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\",\"%s\")", base, $0, a[1], a[2], a[7], a[8], a[9], a[10] );
            gsub( "\"", squote, str );          # replace double quotes with singles
            print str;
        }'
done

The trick with the single quote, and substitution into str gets round the difficulty of using a single quote inside of an awk programme. It's possible, very messy, and makes for code that is difficult to maintain. There are other ways to do the quoting that are more efficient, but I think this is the most readable.

Other things that might not be too obvious:
${fname##*/} strips the leading path (./tmp/xxx/) from the filename returned by find.

The parens round this statement ( grep "^err" $fname || echo "No error" ) cause the statement to be executed in a subshell. This is necessary to cause output from the grep and the echo to pipe properly into the awk.

Hope this helps.

alan · June 16, 2011, 7:36pm

Thanks for taking the time. With a couple of friends, this is what we got

find . -name "R*VER" -exec sh -c 'grep -H ^err "{}" || echo "{}:No error"' \; |
        awk -F: '
                BEGIN { q="\047" }
                {
                        file=$1
                        result=$2
                        sub(/^.*\//,"",file)
                        split(file,a,/~/)
                        print "INSERT INTO MYTABLE (COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8) VALUES (" q file q "," q result q "," q a[1] q "," q a[2] q "," q a[7] q "," q a[8] q "," q a[9] q "," q a[10] q ")"
                }
        '

Works great. Thanks again.