Reading files under a folder and formatting content of each file

rocking77 · February 11, 2011, 4:20am

I have n files in a folder. Each file in the folder "myfolder" has content like this:

COLNAME
------------
AAAAAA
BBBBBB
CCCCCC
DDDDDD
...
...

...
ZZZZZZ
26 record(s) selected.

I want to read each file in "myfolder" and reformat it by deleting the "COLNAME" line, the "------" line, the "26 record(s) selected." line, and any blank lines. The output I need looks like this:

AAAAA
BBBBB
CCCCC
DDDDD
....

....

ZZZZZZ

Note:

This should be done for every file in the folder called "myfolder". After formatting, I need to print:

INSERT INTO <file-name> (AAAAA,
BBBB,
CCCC,
DDDD,
...........
......
zzzzzz

Can somebody help me with this?

pludi · February 11, 2011, 5:24am

What have you tried so far, and where are you stuck?

rocking77 · February 11, 2011, 5:30am

I don't know how to read the files in a folder, which is why I posted.

I was just trying awk on the content of a single file:

awk '{print $1}' 1.txt

It prints the blank lines as well. Please suggest how to get started, at least.

pludi · February 11, 2011, 5:45am

Looping over all files in a folder is as simple as

for file in /path/to/directory/*
do
    # Do something with $file here
done

Other than that I'd say you'll need man basename (POSIX) and man sed (POSIX)
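
As a minimal sketch of the loop and basename together (list_files is a hypothetical helper name, and the demo directory is a throwaway created with mktemp, not the poster's real folder):

```shell
#!/bin/sh
# list_files: hypothetical helper that prints "path -> name"
# for every regular file in the directory given as $1.
list_files() {
    dir=$1
    for file in "$dir"/*
    do
        # An empty directory leaves the glob unexpanded; skip that case.
        [ -f "$file" ] || continue
        name=$(basename "$file")
        printf '%s -> %s\n' "$file" "$name"
    done
}

# Demo on a temporary directory with two empty files.
demo=$(mktemp -d)
touch "$demo/tab1.sql" "$demo/tab2.sql"
list_files "$demo"
rm -rf "$demo"
```

Quoting "$file" keeps the loop working for names that contain spaces.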

rocking77 · February 11, 2011, 6:21am

Yes pludi,

I started from this code and wrote:

for fname in  `ls /xxx/sss`
do
cat $fname|awk '{print $1}'
done

The output looks like this:

Columnname
---------
AAAAA
BBBBB
CCCCC
....

4

Columnname
----------
AAAAAA
BBBBBB
CCCCCC
....

FFFFF

5...

From this output I need to delete the blank lines, the "Columnname" line, and the lines containing 4, 5, etc.

Can you suggest how to add a negative condition in awk so that it does not print specific lines?

pludi · February 11, 2011, 6:25am

First, there's no need to invoke ls for the loop; the shell can expand the pattern just fine by itself. Second, for removing certain lines I've found sed to do a much better job:

for file in /xxx/sss/*
do
    sed -e '1,2d;$d' "$file"
done
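
For the awk question, a negated or "skip" condition is also possible. The following is only a sketch: filter_rows is a hypothetical helper name, and it assumes the first non-blank line is always the column heading and that the footer contains the text "record(s) selected".

```shell
#!/bin/sh
# filter_rows: hypothetical helper that prints only the data values of file $1.
filter_rows() {
    awk '
        NF == 0                 { next }  # blank or whitespace-only line
        !header_seen++          { next }  # first non-blank line: column heading
        /^-+ *$/                { next }  # dashed underline
        /record\(s\) selected/  { next }  # row-count footer
                                { print $1 }
    ' "$1"
}

# Demo with a sample shaped like the files in this thread.
sample=$(mktemp)
printf '\nCOLUMNNAME \n--------------------\nAAAA\nBBBB\nCCCC\n\n  3 record(s) selected.\n\n' > "$sample"
filter_rows "$sample"
rm -f "$sample"
```

Each rule that ends in next drops the current line, so only lines that survive every test reach the final print.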

rocking77 · February 11, 2011, 7:02am

Thanks pludi.

I tried:

for fname in  `ls /xxx/sss`
do
cat $fname|awk '/./ {print $1}'
done

Even sed works. But how do I get the output below? Any suggestions, please?

INSERT INTO FILENAME1 (AAAA,BBB,CCC);
INSERT INTO FILENAME2 (AAAA,BBBB,CCCC,DDD,FFF);

pludi · February 11, 2011, 7:49am

OK, let's tackle this step by step.

First, don't start by attacking the loop. That's just a simplification so that you don't have to handle each and every file manually. Instead, develop the necessary steps using 1 file as example, in a way that can be applied to all other files.

The first thing you want to do with each file is remove the first 2 lines, and the last one. This can easily be done using man sed (POSIX):

sed -e '1,2d; $d' file

Next, you want to remove all empty lines. This can be included with the previous sed statement:

sed -e '1,2d; $d; /^[[:space:]]*$/d' file

Once that's done you want to surround the values with an SQL INSERT statement. For that, we need a second sed statement:

sed -e '1,2d; $d; /^[[:space:]]*$/d' file | \
sed -e '1s/^/INSERT INTO file VALUES (/; $!s/$/,/; $s/$/);/'

This can be read as "On the first line, replace the beginning of the line with this string. On all but the last line, replace the end of the line with a comma. On the last line, append the string to the end of the line."
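
To see the wrapping step in isolation, here is a small sketch that feeds three already-cleaned values through the second sed statement. wrap_insert is a hypothetical function name, and the values are made up for the demo.

```shell
#!/bin/sh
# wrap_insert: hypothetical helper that applies the second sed statement to stdin.
wrap_insert() {
    sed -e '1s/^/INSERT INTO file VALUES (/; $!s/$/,/; $s/$/);/'
}

# Three sample values, one per line.
printf 'AAAA\nBBBB\nCCCC\n' | wrap_insert
# Prints:
# INSERT INTO file VALUES (AAAA,
# BBBB,
# CCCC);
```

A one-line input works too, because the first line is then also the last line and gets both the prefix and the closing ");".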

Now that a single file can be processed as needed, we can start working on the loop. A simple for loop is all that is needed:

for file in /xxx/sss/*
do
    sed -e '1,2d; $d; /^[[:space:]]*$/d' "${file}" | \
    sed -e '1s/^/INSERT INTO file VALUES (/; $!s/$/,/; $s/$/);/'
done

But wait! The table used should be the one the file is named after, so we'll have to use that somewhere too. For that we need man basename (POSIX), and we have to alter the second sed statement a bit:

for file in /xxx/sss/*
do
    filename=$( basename "$file" )
    sed -e '1,2d; $d; /^[[:space:]]*$/d' "${file}" | \
    sed -e '1s/^/INSERT INTO '"${filename}"' VALUES (/; $!s/$/,/; $s/$/);/'
done

That's all there is to it. There's no magic, arcane knowledge, or infinite UNIX wisdom needed as long as you can break down the steps as far as needed, and are not afraid of reading the man pages for the commands you think you'll need.

rocking77 · February 11, 2011, 8:25am

Hi Pludi,

When I try the same, it gives a syntax error. Do we need to add something?

./column.sh: line 14: syntax error near unexpected token `('
./column.sh: line 14: ` sed -e '1s/^/INSERT INTO '${filename}' VALUES(/; $!s/$/,/; $s/$/);/''

pludi · February 11, 2011, 8:34am

That's an error message from the shell. Can you post the part of the complete script where you're using the loop, including the surrounding lines (copy-paste please)?

rocking77 · February 11, 2011, 8:47am

It's simple, actually.

I have n .sql files in a folder, and each file contains column names.

I saved your code and ran it against that folder:

for file in /home/db2inst1/test/JAVA/DDL/*
do
filename=$( basename $file )
sed -e '1,2d; $d; /^[:space:]*$/d ${file} \
sed -e '1s/^/INSERT INTO '${filename}' VALUES(/; $!s/$/,/; $s/$/);/'
done

When I run the file with ./column.sh, it gives an error. Once it succeeds, I will use this part of the code in another script.

./column.sh: line 14: syntax error near unexpected token `('
./column.sh: line 14: ` sed -e '1s/^/INSERT INTO '${filename}' VALUES(/; $!s/$/,/; $s/$/);/''

pludi · February 11, 2011, 8:54am

Ah, I see the error, typo on my part. On the first sed command I've left out the closing quote. The line should read

sed -e '1,2d; $d; /^[[:space:]]*$/d' "${file}" | \
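
A quick way to catch this kind of quoting slip before running a script is the shell's -n option, which parses the file without executing it. The sketch below is only an illustration: check_syntax is a hypothetical helper name, and both scripts are shortened stand-ins written to a temporary directory.

```shell
#!/bin/sh
# check_syntax: hypothetical helper that parses script $1 without running it.
check_syntax() {
    if sh -n "$1" 2>/dev/null
    then echo "syntax ok"
    else echo "syntax error"
    fi
}

work=$(mktemp -d)

# Closing quote missing after the first sed expression.
cat > "$work/broken.sh" <<'EOF'
for file in ./*
do
    sed -e '1,2d; $d ${file} | \
    sed -e '1s/^/INSERT INTO file VALUES (/; $s/$/);/'
done
EOF

# Same loop with the quote restored.
cat > "$work/fixed.sh" <<'EOF'
for file in ./*
do
    sed -e '1,2d; $d' ${file} | \
    sed -e '1s/^/INSERT INTO file VALUES (/; $s/$/);/'
done
EOF

check_syntax "$work/broken.sh"
check_syntax "$work/fixed.sh"
rm -rf "$work"
```

With the quote missing, the quoted string runs on into the next line, so the shell only complains once it reaches the unquoted "(" of the second sed command. That is why the error message points at the second line rather than the one with the typo.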

rocking77 · February 11, 2011, 9:11am

Hi pludi,

Yes, it's partly working, but the output looks like this. I also don't want the last line, "3 record(s) selected", in each file's output.
sed: can't read sed: No such file or directory

--------------------,
AAAAAAA,
BBBBBBB,
CCCCCCC ,
3 record(s) selected.,
sed: can't read sed: No such file or directory
--------------------,
AAAAAAA,
BBBBBBB,
CCCCCCC....
......
......
.......

21 record(s) selected.,

Do we need to edit something else? I don't see any INSERT statement or parentheses "( )" printed.

pludi · February 11, 2011, 9:27am

Yes, I've missed another typo (shame on me). There should be a pipe before the backslash, again on the first sed line. I've corrected and marked it in my previous post.

rocking77 · February 11, 2011, 9:45am

Hi Pludi,
Sorry to trouble you. My request is almost solved, so please help me. I have already started learning sed and awk, and I will definitely come better prepared next time.

The output now looks like this:

INSERT INTO xxxxxx.sql VALUES(--------------------,
AAAAAAA,
BBBBBBB,
CCCCCCC,
3 records(s) selected );

INSERT INTO xxxxx.sql VALUES(--------------------,
AAAAAAA,
BBBBBBB,
CCCCCCC....
......
......
.......

21 record(s) selected.);

I don't want the table name to carry the .sql extension, the first line (----------) should be deleted, and the last line ("x record(s) selected") should be deleted. The complete output should be:

INSERT into tab1 (AAAAAAA,
BBBBBBB,
CCCCCCC) select 
AAAAAAA,
BBBBBBB,
CCCCCCC
from tab1;

pludi · February 11, 2011, 9:52am

If those lines are still coming in there's a difference between the layout of the input files you've given and reality. Can you post the first and last few lines of a file?

rocking77 · February 11, 2011, 10:04am

The input in each file looks like this. It is almost the same as what I gave at the very start. Is there a problem with the line numbers in the first sed?

1,2d??

/home/db2inst/test/JAVA/DDL $ more file1.sql

COLUMNNAME 
--------------------
AAAA
BBBB
CCCC
3 record(s) selected.

pludi · February 11, 2011, 10:20am

OK, so there is an empty line before the column name that wasn't there before. No big deal, just change the 1,2d to 1,3d. However, there has to be something after the "last" line ("... selected"). Even if it's just an empty line. Please check this.

If there is something trailing, you'll have to change the first sed line to read

sed -e '1,3d; $d; /^[[:space:]]*$/d; /selected\.$/d' "${file}" | \
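
The .sql extension in the table name, raised a few posts up, was not addressed in the replies so far. One possible approach, sketched here, is basename's optional suffix argument. table_name is a hypothetical helper name, and the sketch assumes every input file ends in .sql.

```shell
#!/bin/sh
# table_name: hypothetical helper that strips the directory and a trailing .sql.
table_name() {
    basename "$1" .sql
}

table_name /home/db2inst1/test/JAVA/DDL/file1.sql
# Prints: file1
```

Inside the loop, filename=$( basename "$file" .sql ) would then yield the bare table name.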

rocking77 · February 11, 2011, 10:39am

Perfect, it's working fine. Only the next part is left: a SELECT with all the columns, as below. Can you please help me with this last step?

INSERT into tab1 (AAAAAAA,
BBBBBBB,
CCCCCCC) select 
AAAAAAA,
BBBBBBB,
CCCCCCC
from tab1;

This is the only part left.

Should we repeat the second sed again?
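
The thread ends without an answer to this last question. Rather than repeating the second sed, one possible approach is to collect the cleaned column names once and print them twice. This is a sketch under stated assumptions, not the method from the thread: make_insert_select is a hypothetical helper name, each file is assumed to hold a heading line, a dashed line, the column names, and a "record(s) selected" footer, and the columns are joined on one line instead of one per line.

```shell
#!/bin/sh
# make_insert_select: hypothetical helper that prints
#   INSERT INTO <table> (<cols>) SELECT <cols> FROM <table>;
# for the column-list file given as $1.
make_insert_select() {
    file=$1
    table=$(basename "$file" .sql)       # table name = file name without .sql
    cols=$(
        # Trim trailing whitespace and drop blank lines first, so the heading
        # and the dashed line are always lines 1 and 2 of what remains.
        sed -e 's/[[:space:]]*$//' -e '/^$/d' "$file" |
        sed -e '1,2d' -e '/record(s) selected/d' |
        paste -s -d ',' -                # join the remaining lines with commas
    )
    printf 'INSERT INTO %s (%s) SELECT %s FROM %s;\n' "$table" "$cols" "$cols" "$table"
}

# Demo on a throwaway directory with one sample file.
demo=$(mktemp -d)
printf '\nCOLUMNNAME \n--------------------\nAAAA\nBBBB\nCCCC\n\n  3 record(s) selected.\n\n' > "$demo/tab1.sql"
for file in "$demo"/*.sql
do
    make_insert_select "$file"
done
rm -rf "$demo"
```

Deleting blank lines before the 1,2d step means it no longer matters whether a file starts with an empty line, which was the cause of the 1,2d versus 1,3d confusion earlier.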