Bash Script to read a file and parse each record

3vilwyatt · February 19, 2010, 3:14am

Hi Guys,

I am new to unix scripting and I am tasked to parse through a CSV file delimited by #.

Sample:

sample.csv

H#A#B#C
D#A#B#C
T#A#B#C

H = Header
D = Detail Record
T = Tail

What I need is to read the file and parse through it to get the columns.
I have no idea how to do this in a bash script.

I'd appreciate any assistance. Thanks.

zaxxon · February 19, 2010, 3:23am

What output do you need (example inside code tags please)?
Maybe a while/read construct or awk is an option for you.
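
A minimal sketch of the while/read option, assuming the four-column layout of sample.csv from the first post. The variable names (rectype, c1, c2, c3, details) and the temporary file are made up for illustration and are not from the thread.

```shell
#!/bin/bash
# Recreate the sample from the first post in a temporary file (illustration only).
infile=$(mktemp)
printf '%s\n' 'H#A#B#C' 'D#A#B#C' 'T#A#B#C' > "$infile"

details=""
# IFS='#' applies to this read only, so the global IFS is left untouched.
while IFS='#' read -r rectype c1 c2 c3
do
    # Keep only the detail records.
    if [ "$rectype" = "D" ]; then
        details="$c1 $c2 $c3"
        printf '%s\n' "$c1" "$c2" "$c3"
    fi
done < "$infile"

rm -f "$infile"
```

With the sample data this prints A, B and C on separate lines. If a row has more than three columns after the record type, the last variable (c3 here) receives the rest of the line, delimiters included.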

3vilwyatt · February 19, 2010, 3:51am

Hi

Something like this. It's not working very well.

#!/bin/bash
i=0
cat sample.csv | while read fileline
do
echo "$fileline"
row=$fileline
i=$i+1
done

*** parse all rows in row array ***

I would like to output the detail record [D] only:

A
B
C
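
A note on the loop above, offered as a sketch rather than a definitive fix. In bash's default configuration, piping cat into while runs the loop in a subshell, so i and row are gone after "done". Also, i=$i+1 appends text instead of adding, and row=$fileline overwrites a single variable rather than filling an array. One possible rewrite, using a temporary copy of the sample for illustration:

```shell
#!/bin/bash
# Temporary copy of the sample data (illustration only).
infile=$(mktemp)
printf '%s\n' 'H#A#B#C' 'D#A#B#C' 'T#A#B#C' > "$infile"

i=0
row=()
# Redirecting the file into the loop keeps it in the current shell,
# so i and row are still set after "done".
while IFS= read -r fileline
do
    row[i]=$fileline    # store each line as one array element
    i=$((i + 1))        # arithmetic expansion; i=$i+1 would build the string 0+1
done < "$infile"

echo "read $i lines; second line is ${row[1]}"
rm -f "$infile"
```

After the loop, row holds one line per element and can be processed in a second pass.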

zaxxon · February 19, 2010, 5:08am

If just the output is desired and column number is fixed:

awk -F"#" '/^D/ {print $2,$3,$4}' infile
A B C

If the number of columns varies from row to row, maybe something like this (the split function in awk would also work):

$> cat infile
H#A#B#C
D#A#B#C#D#E
T#A#B#C
$> awk -F# '/^D/ {for(a=2; a<=NF; a++){print $a}}END{printf"\n"}' ORS=" " infile
A B C D E
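
For the split variant mentioned above, a possible sketch. It prints one column per line instead of one row; the array name f and the temporary file are illustrative choices.

```shell
#!/bin/bash
# Temporary copy of the varying-width sample (illustration only).
infile=$(mktemp)
printf '%s\n' 'H#A#B#C' 'D#A#B#C#D#E' 'T#A#B#C' > "$infile"

# split() copies the pieces of the whole line ($0) into the array f and
# returns how many there are; f[1] is the record type, so start at 2.
out=$(awk '/^D/ { n = split($0, f, "#"); for (a = 2; a <= n; a++) print f[a] }' "$infile")
printf '%s\n' "$out"

rm -f "$infile"
```

Because the line is split explicitly, no -F option is needed here.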

If you want to have every element of the row as element of a bash array:

$> arr=(`awk -F"#" '/^D/ {print $2,$3,$4}' infile`)
$> echo ${arr[*]}
A B C
$> echo ${arr[1]}
B
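
The array can also be filled without awk. A sketch using bash's read -a, assuming the line is already in a variable (the name line is illustrative):

```shell
#!/bin/bash
line='D#A#B#C'

# read -a splits the line on IFS and stores the pieces in the array arr;
# the IFS setting applies to this one command only.
IFS='#' read -r -a arr <<< "$line"

# Element 0 is the record type; the columns start at index 1.
echo "${arr[@]:1}"
echo "${arr[2]}"
```

Note that the indices are shifted by one compared with the awk version above, because the record type is kept as element 0; B is therefore arr[2] here.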

If you prefer to rely more on the shell's own capabilities:

$> LINE=`grep ^D infile`
$> echo ${LINE#*#}| tr -s '#' ' '
A B C

and there are many more ways to do it.

Jerald_Nathan · February 19, 2010, 5:23am

cut -d# -f1 sample.csv >> firstcolumn.csv

Basically, this copies the first column of every line and appends it to firstcolumn.csv:

H
D
T
.
.
.

Alternatively, you can change the fields as

cut -d# -f2 sample.csv >> secondcolumn.csv

or

cut -d# -f1,2 sample.csv >> firstandsecondcolumn.csv
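
The cut commands above act on every line. To get only the detail record, one option (a sketch, not the only way) is to filter with grep first. The field list 2- means "field 2 through the last field"; the temporary file is for illustration only.

```shell
#!/bin/bash
# Temporary copy of the sample data (illustration only).
infile=$(mktemp)
printf '%s\n' 'H#A#B#C' 'D#A#B#C' 'T#A#B#C' > "$infile"

# Keep only lines starting with D, drop the record-type field,
# then put each remaining column on its own line.
out=$(grep '^D' "$infile" | cut -d'#' -f2- | tr '#' '\n')
printf '%s\n' "$out"

rm -f "$infile"
```

Also worth knowing: >> appends to the output file, so running the same command twice doubles its contents, whereas > starts the file afresh.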

Good Bye.

3vilwyatt · February 19, 2010, 5:53am

Thanks for your replies, guys. I appreciate it.

Now I am trying to get something like this to work:

read the csv file
put the lines in a row array
next loop would be to parse each row to get the columns

csv file

Record Type#Batch Job ID#Batch Number#FileCreation Date#FileCreation Time#Production/Test Fileindicator#File Character
H#0002#0002#20100218#17.25#P#barani

Record Type#A#B#C#D#E#F#G#H#J#K#L
1#2#3#4#5#6#7#8#9#10#11

Record Type#NumberOFRecords
T#4500

N

script:

#!/bin/bash
i=0
cat 1.csv | while read fileline
do
echo "$fileline"
row[$i]=$fileline
i=$(($i+1))
done

j=0
while [ $j -le $i ]
do
col1=$( cut -f 2-7  -d '#' -s $row)
echo "first char done"
col2=$( cut -f 1,3-7  -d '#' -s $row)
echo "sec char done"
col3=$( cut -f 1-2,4-7  -d '#' -s $row)
echo "3rd char done"
echo "$col1,$col2,$col3"
j=$(($j+1))
done

is this possible?
appreciate all your replies guys. thank you!
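
The two-pass plan described above (read the lines into an array, then split each row) is possible. A minimal sketch, assuming bash 4 or later for mapfile; the names rows and cols and the shortened sample data are illustrative. One thing to keep in mind is that cut treats its trailing argument as a file name, so a line held in a variable has to be fed to it on standard input, or split by the shell as done here.

```shell
#!/bin/bash
# Shortened sample with a header and a tail record (illustration only).
infile=$(mktemp)
printf '%s\n' 'H#0002#0002#20100218#17.25#P#barani' 'T#4500' > "$infile"

# First pass: load every line of the file into the array rows.
mapfile -t rows < "$infile"

# Second pass: split each stored line into its columns.
for line in "${rows[@]}"
do
    IFS='#' read -r -a cols <<< "$line"
    echo "record type ${cols[0]} has ${#cols[@]} columns"
done

rm -f "$infile"
```

Inside the second loop, cols[1], cols[2] and so on give the individual columns of the current row.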

kshji · February 21, 2010, 3:30am

# you can parse the line using IFS and array
IFS="#" flds=( $fileline )
nrofflds=${#flds[@]}

fld1="${flds[0]}"  
fld=0
while (( fld<nrofflds ))
do
       #...  
       ((fld+=1))
done
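
A filled-in sketch of the snippet above, under a few assumptions of mine: fileline already holds one record, the old IFS is saved and restored (set this way, IFS otherwise stays "#" for the rest of the script), and set -f is added as an optional safeguard so that a field such as * is not expanded to file names.

```shell
#!/bin/bash
fileline='D#A#B#C'

oldifs=$IFS             # remember the current separator
set -f                  # optional: no filename expansion while splitting
IFS="#"
flds=( $fileline )      # unquoted on purpose so the shell splits on '#'
set +f
IFS=$oldifs             # restore, otherwise later expansions split on '#'

nrofflds=${#flds[@]}
fld=1                   # index 0 holds the record type
while (( fld < nrofflds ))
do
    echo "${flds[fld]}"
    ((fld+=1))
done
```

For the sample record this prints A, B and C, one per line.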

3vilwyatt · February 21, 2010, 8:09pm

Thanks kshji,
I'll go and try this...

3vilwyatt · February 25, 2010, 1:04am

Hi guys,

thanks for all your inputs.
I was able to solve my problem using this:

#!/bin/bash

FILE_NAME=1.csv

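while read -r fileline
do
	OLDIFS=$IFS			# save the current field separator
	IFS="#" flds=( $fileline )	# split the line on the # delimiter
	nrofflds=${#flds[@]}

	ctr=0

	# print every column of the current record on its own line
	while [[ $ctr -lt $nrofflds ]]
	do
		echo "${flds[$ctr]}"
		ctr=$(($ctr+1))
	done

	IFS=$OLDIFS			# restore the separator before the next read
done < "$FILE_NAME"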

This prints the value of each column on its own line, for every record in the file.
thanks for all your help! :):)