Splitting a text file into 2 separate files?

shekharjchandra · November 12, 2010, 4:25am

Hi All,
I am new to this forum as well as to UNIX. I have basic knowledge of UNIX, which I studied some years ago. Now I have to do some shell scripting to load data into an Oracle database using the sqlldr utility, which I am able to do. I have a requirement where I need to do the following operation.
I have one text file in which the data will be in the following format, for example:
.................... FILE1.txt ....................
A1|1234561|010|065|
aaaaa
sssss
ddddd
fffff
A2|1234562|011|066|
qqqq
ww
eeeeeeee
r
A3|1234563|012|067|
ttttttttt
A4|1234564|013|068|
yyyyy
uuu
A5|1234565|014|069|
sdfsdfsd
werw345
feewwe
A6|1234566|015|060|
A7|1234567|016|061|
....................................................
Now I have to split the above file into 2 different files, for example:
File_A.txt
....................................................
A1|1234561|010|065|
A2|1234562|011|066|
A3|1234563|012|067|
A4|1234564|013|068|
A5|1234565|014|069|
A6|1234566|015|060|
A7|1234567|016|061|
....................................................

File_B.txt
....................................................
aaaaa
sssss
ddddd
fffff
qqqq
ww
eeeeeeee
r
ttttttttt
yyyyy
uuu
sdfsdfsd
werw345
feewwe
....................................................
I am able to create the first file, FILE_A.txt, using an awk command, but I am not able to create the second file, FILE_B.txt.
I would like to know whether there is a quick way of creating the 2 separate files as mentioned above, or whether I can use the same awk command to create the second file, FILE_B.txt.
Any help is highly appreciated.
Regards
JC

danmero · November 12, 2010, 4:37am

awk '{print > ("File"((/\|/)?"A":"B")".txt")}' File1.txt

shekharjchandra · November 12, 2010, 4:50am

Hi
I tried the given solution, specifying my file name in the command,
i.e.
awk '{print > ("File"((/\|/)?"A":"B")".txt")}' spgen.txt

But I am getting the following error. Also, any idea how to get the second file, i.e. FILE_B.txt, from FILE1.txt?

awk: syntax error near line 1
awk: illegal statement near line 1

I feel I am doing something wrong, please correct me...

I was able to create the first file by using the following command:
awk '/\|/' spgen.txt > FILE_A.txt

Regards
JC

danmero · November 12, 2010, 5:08am

Use gawk, nawk, mawk OR /usr/xpg4/bin/awk on Solaris.

shekharjchandra · November 12, 2010, 5:13am

Thanks Danmero,
That was a really quick and apt solution. I used nawk and it worked. But I would like to understand how it works; I will probably look into the details in the nawk man pages.

Once again thanks for your help
regards
JC

danmero · November 12, 2010, 5:22am

It is simple: we just switch the output file name. In ((/\|/)?"A":"B"), if the record matches the regular expression (a pipe character), the value is A; otherwise the value is B.
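For anyone who wants to see the conditional file name spelled out, here is a minimal sketch of the same idea written with if/else instead of the ternary operator. The file name sample.txt, the variable name out, and the scratch directory are made up for illustration, and the sketch assumes the awk on your PATH is a reasonably modern one (gawk, nawk, or mawk).

```shell
# Work in a scratch directory so nothing in the current directory is touched.
cd "$(mktemp -d)" || exit 1

# A small sample in the same shape as FILE1.txt (hypothetical file name).
printf '%s\n' 'A1|1234561|010|065|' 'aaaaa' 'sssss' 'A2|1234562|011|066|' 'qqqq' > sample.txt

# Same logic as the one-liner, spelled out:
# lines containing "|" go to FileA.txt, all other lines go to FileB.txt.
awk '{
    if ($0 ~ /\|/) out = "FileA.txt"
    else           out = "FileB.txt"
    print > out
}' sample.txt

# Show what ended up in each file.
cat FileA.txt
cat FileB.txt
```

Inside awk, `print > name` opens the file once and keeps appending to it for the rest of the run, so each output file collects all of its lines in input order.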

shekharjchandra · November 12, 2010, 7:48am

Further adding to my previous query ....

I want to create the 2 files based on the above nawk command. I will be passing the project number dynamically through a runtime input variable,

i.e. by using read v_projname
If my variable has v_projname=PROJECT1234,
I have to use something like $v_projname in my nawk command. I tried various combinations, but I was not able to get the substituted value of the variable.
I used it in this way: nawk '{print > ("ABS_"((/\|/)?"ADDR_":"SLICE_")"$v_projname_NETWORKID.txt")}' spfile.txt

My output files should be something like
ABS_ADDR_PROJECT1234_NETWORKID.txt
ABS_SLICE_PROJECT1234_NETWORKID.txt
But I am getting
ABS_ADDR_$v_projname_NETWORKID.txt
ABS_SLICE_$v_projname_NETWORKID.txt

ctsgnb · November 12, 2010, 8:31am

nawk -vVAR="$v_projname" '{print > ("ABS_"((/\|/)?"ADDR_":"SLICE_")VAR"_NETWORKID.txt")}' spfile.txt

shekharjchandra · November 12, 2010, 9:04am

Thanks for the solution !!

Just a minor correction (typo: missing space between -v and VAR)

nawk -v VAR="$v_projname" '{print > ("ABS_"((/\|/)?"ADDR_":"SLICE_")VAR"_NETWORKID.txt")}' spfile.txt

Regards
JC

ctsgnb · November 12, 2010, 9:40am

Some awk and/or nawk versions support it without the space.
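To make the difference from the failing attempt concrete: the shell does not expand $v_projname inside single quotes, so awk only ever sees the literal text. Passing the value with -v hands it to awk as an awk variable instead. A minimal sketch, assuming a modern awk on the PATH and using a hard-coded value in place of the one obtained with read:

```shell
# Work in a scratch directory so nothing in the current directory is touched.
cd "$(mktemp -d)" || exit 1

# Stand-in for the value that would come from: read v_projname
v_projname=PROJECT1234

# Two sample lines: one with a pipe, one without.
printf '%s\n' 'A1|1234561|010|065|' 'aaaaa' > spfile.txt

# -v VAR=value (with a space after -v) defines the awk variable VAR
# before the program starts; the shell expands "$v_projname" here
# because it sits outside the single-quoted awk program.
awk -v VAR="$v_projname" '{
    print > ("ABS_" ((/\|/) ? "ADDR_" : "SLICE_") VAR "_NETWORKID.txt")
}' spfile.txt

# List the files that were created.
ls
```

The form with a space after -v is the one described by POSIX, so it is the safer choice when the script may run under different awk implementations.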

shekharjchandra · November 17, 2010, 5:12am

Hi
Extending my previous query ...

nawk -v invar1="$aa" '{print > ("ABS_"((/\|/)?"A_":"B_")invar1"_NETWORKID.txt")}' spfile.txt

Similar to the invar1 variable in nawk, I also need one more variable, invar2, to be passed into nawk. I want to use it in place of NETWORKID,

i.e. something similar to
nawk -v invar1="$aa" invar2="$bb" '{print > ("ABS_"((/\|/)?"A_":"B_")invar1"_"invar2".txt")}' spfile.txt

The above syntax is obviously wrong. I would just like to know whether there is a way to pass two different variables to nawk as above, or some alternative way.

- - - -

Also, can I use a substring function in nawk? I need to find the '|' character at a certain position in every line,

i.e. I need to pick only those lines whose fifth character is '|' and put them in the first file, and put all other lines in the second file.
E.g.:
if
aa=xxxx
bb=yyyy
and spfile.txt is

aaa1|bbb1|ccc1
11|21|31|
12|22|32|
13|23|33|
14|24|34|
aaa2|bbb2|ccc2
31|51|71|
32|52|72|
33|53|73|
34|54|74|
aaa3|bbb3|ccc3
41|61|81|
42|62|82|
43|63|93|
44|64|94|
aaa4|bbb4|ccc4

THEN

First file output should be (File ABS_A_xxxx_yyyy.txt)
aaa1|bbb1|ccc1
aaa2|bbb2|ccc2
aaa3|bbb3|ccc3
aaa4|bbb4|ccc4

Second file output should be (File ABS_B_xxxx_yyyy.txt)

11|21|31|
12|22|32|
13|23|33|
14|24|34|
31|51|71|
32|52|72|
33|53|73|
34|54|74|
41|61|81|
42|62|82|
43|63|93|
44|64|94|

Many many thanks in advance

regards
jc
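One possible way to handle both points, offered as a sketch rather than a tested answer for your system: awk accepts the -v option more than once, so each variable gets its own -v, and substr($0, 5, 1) returns the single character at position 5 of the line (positions start at 1). The variable name kind is made up for illustration, the input is a shortened copy of the sample above, and a modern awk (gawk, nawk, or mawk) is assumed.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
cd "$(mktemp -d)" || exit 1

# Example values for the two shell variables.
aa=xxxx
bb=yyyy

# Shortened sample: header lines have "|" as the fifth character.
printf '%s\n' 'aaa1|bbb1|ccc1' '11|21|31|' '12|22|32|' 'aaa2|bbb2|ccc2' '31|51|71|' > spfile.txt

# One -v per variable. substr($0, 5, 1) is the fifth character of the line;
# lines where it is "|" go to the A file, everything else to the B file.
awk -v invar1="$aa" -v invar2="$bb" '{
    kind = (substr($0, 5, 1) == "|") ? "A" : "B"
    print > ("ABS_" kind "_" invar1 "_" invar2 ".txt")
}' spfile.txt

# List the files that were created.
ls
```

Testing the character position instead of matching /\|/ matters here because every line in this input contains a pipe somewhere, so the plain regular expression would send all lines to the first file.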