Hi Guys,
I am new to unix scripting and I am tasked to parse through a CSV file delimited by #.
Sample:
sample.csv
H#A#B#C
D#A#B#C
T#A#B#C
H = Header
D = Detail Record
T = Tail
What I need is to read the file and parse through it to get the columns.
I have no idea how to do this in a bash script.
I'd appreciate any assistance. Thanks.
zaxxon
February 19, 2010, 3:23am
2
What output do you need (example inside code tags please)?
Maybe a while/read construct or awk is an option for you.
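As a rough sketch of the while/read idea, assuming the sample layout above (record type in the first field, # as the delimiter): setting IFS only for the read command splits each line into fields without touching the rest of the script. The function name print_detail_fields is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the fields of every detail (D) record, one per line.
print_detail_fields() {
    local rectype rest fields
    # IFS='#' applies only to this read; the first field goes to rectype,
    # the remainder of the line (still '#'-separated) goes to rest.
    while IFS='#' read -r rectype rest; do
        [ "$rectype" = "D" ] || continue
        # Split the remaining fields into an array and print them.
        IFS='#' read -r -a fields <<< "$rest"
        printf '%s\n' "${fields[@]}"
    done
}

print_detail_fields <<'EOF'
H#A#B#C
D#A#B#C
T#A#B#C
EOF
```

The function reads from standard input, so it can also be called as print_detail_fields < sample.csv.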
Hi,
Something like this. It's not working very well.
#!/bin/bash
i=0
cat sample.csv | while read fileline
do
echo "$fileline"
row=$fileline
i=$i+1
done
*** parse all rows in row array ***
I would like to output the detail record [D] only:
A
B
C
zaxxon
February 19, 2010, 5:08am
4
If just the output is desired and column number is fixed:
awk -F"#" '/^D/ {print $2,$3,$4}' infile
A B C
If there are varying number of columns in a row, maybe something like this (could also use something with split function in awk):
$> cat infile
H#A#B#C
D#A#B#C#D#E
T#A#B#C
$> awk -F# '/^D/ {for(a=2; a<=NF; a++){print $a}}END{printf"\n"}' ORS=" " infile
A B C D E
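The split variant mentioned above might look roughly like this. It is only a sketch: the function name detail_fields_split is invented here, and it assumes the record type is always the first field.

```shell
#!/bin/bash
# Hypothetical helper: print the data fields of each D record on one line,
# using awk's split() instead of the -F option.
detail_fields_split() {
    awk '/^D#/ {
        n = split($0, f, "#")    # f[1] is the record type, f[2..n] are the data
        line = ""
        for (i = 2; i <= n; i++)
            line = line (i > 2 ? " " : "") f[i]
        print line
    }'
}

detail_fields_split <<'EOF'
H#A#B#C
D#A#B#C#D#E
T#A#B#C
EOF
```

Building the line inside awk avoids the trailing blank that the ORS=" " approach leaves at the end of the output.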
If you want to have every element of the row as element of a bash array:
$> arr=(`awk -F"#" '/^D/ {print $2,$3,$4}' infile`)
$> echo ${arr[*]}
A B C
$> echo ${arr[1]}
B
If you like it more with usage of shell capabilities:
$> LINE=`grep ^D infile`
$> echo ${LINE#*#}| tr -s '#' ' '
A B C
and there are many more ways to do it.
cut -d# -f1 sample.csv >> firstcolumn.csv
Basically, this copies the first column of sample.csv and appends it to firstcolumn.csv, which then contains:
H
D
T
.
.
.
Alternatively, you can select other fields:
cut -d# -f2 sample.csv >> secondcolumn.csv
or
cut -d# -f1,2 sample.csv >> firstandsecondcolumn.csv
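If only the detail records are wanted, cut can be combined with grep. A small sketch, assuming the record type sits in the first column; detail_columns is a made-up name.

```shell
#!/bin/bash
# Hypothetical helper: keep only D records, drop the record-type column,
# then put each remaining column on its own line.
detail_columns() {
    grep '^D#' | cut -d'#' -f2- | tr '#' '\n'
}

detail_columns <<'EOF'
H#A#B#C
D#A#B#C
T#A#B#C
EOF
```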
Good Bye.
Thanks for your replies, guys. I appreciate it.
Now I am trying to get something like this to work:
read the csv file
put the lines in a row array
next loop would be to parse each row to get the columns
csv file
Record Type#Batch Job ID#Batch Number#FileCreation Date#FileCreation Time#Production/Test Fileindicator#File Character
H#0002#0002#20100218#17.25#P#barani
Record Type#A#B#C#D#E#F#G#H#J#K#L
1#2#3#4#5#6#7#8#9#10#11
Record Type#NumberOFRecords
T#4500
N
script:
#!/bin/bash
i=0
cat 1.csv | while read fileline
do
echo "$fileline"
row=$fileline]
i=$(($i+1))
done
j=0
while [ $j -le $i ]
do
col1=$( cut -f 2-7 -d '#' -s $row)
echo "first char done"
col2=$( cut -f 1,3-7 -d '#' -s $row)
echo "sec char done"
col3=$( cut -f 1-2,4-7 -d '#' -s $row)
echo "3rd char done"
echo "$col1,$col2,$col3"
j=$(($j+1))
done
Is this possible?
I appreciate all your replies, guys. Thank you!
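Two likely reasons a script like the one above misbehaves in bash: cat file | while read runs the loop in a subshell, so i and row are lost once the loop ends, and cut treats its trailing argument as a file name rather than as a line of text. Here is a sketch of the two-loop plan that avoids both, assuming bash 4 or later for mapfile. The function name parse_rows and the output format are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read all lines of a file into an array,
# then split each row into columns in a second loop.
parse_rows() {
    local file=$1 rows cols r c
    mapfile -t rows < "$file"        # first pass: one array element per line
    for r in "${!rows[@]}"; do       # second pass: split each row on '#'
        IFS='#' read -r -a cols <<< "${rows[r]}"
        for c in "${!cols[@]}"; do
            printf 'row %d col %d: %s\n' "$((r + 1))" "$((c + 1))" "${cols[c]}"
        done
    done
}

# Demo on a throwaway file holding two lines of the sample data.
tmp=$(mktemp)
printf '%s\n' 'H#0002#0002#20100218#17.25#P#barani' 'T#4500' > "$tmp"
parse_rows "$tmp"
rm -f "$tmp"
```

Redirecting the file into mapfile instead of piping it keeps the array in the current shell, so the second loop can see it.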
kshji
February 21, 2010, 3:30am
7
# you can parse the line using IFS and array
IFS="#" flds=( $fileline )
nrofflds=${#flds[@]}
fld1="${flds[0]}"
fld=0
while (( fld<nrofflds ))
do
#...
((fld+=1))
done
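One thing to watch with the snippet above: since both parts of that line are plain assignments, IFS stays set to # for the rest of the script. A possible variant, as a sketch only, limits the change to a single read. The function name split_line is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: split one '#'-delimited line and list its fields by index.
split_line() {
    local line=$1 flds nrofflds fld
    # IFS is changed only for this read, so the rest of the script is unaffected.
    IFS='#' read -r -a flds <<< "$line"
    nrofflds=${#flds[@]}
    for (( fld = 0; fld < nrofflds; fld++ )); do
        printf '%d: %s\n' "$fld" "${flds[fld]}"
    done
}

split_line 'D#A#B#C'
```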
Thanks kshji,
I'll go and try this...
Hi guys,
Thanks for all your input.
I was able to solve my problem using this:
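#!/bin/bash
FILE_NAME=1.csv
while read -r fileline
do
# save the current IFS so it can be restored after splitting
OLDIFS=$IFS
# split the line on the # delimiter into the flds array
IFS="#" flds=( $fileline )
nrofflds=${#flds[@]}
ctr=0
while [[ $ctr -lt $nrofflds ]]
do
echo "${flds[$ctr]}"
ctr=$(($ctr+1))
done
IFS=$OLDIFS
done < "$FILE_NAME"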
This prints the value of each column on its own line, for every line of the file.
Thanks for all your help! :):)