Unix shell script to parse the contents of a comma-separated file

Dear All,

I have a comma-separated file.

  1. The first line of the file (the header) should have 4 commas (5 fields).
  2. The last line of the file should have 1 comma (2 fields).

Please help me check these conditions in a shell script.

Also, the number of lines between the first and last lines of the file should match the last field of both the first line and the last line.

That is, the last field of the first and last lines holds a number, and it should equal {number of lines in the file} - 2.

Please help me out with this.

Example file: QDB_2008.txt

1.1,20070427151500,99567,99669,0009
00001,20070427,00567,6012345671,2081,I
00002,20070427,00568,6012345672,2054,I
00003,20070427,00569,6012345673,2063,I
00004,20070427,00570,6012345674,2081,D
00005,20070427,00571,6012345675,2054,D
00006,20070427,00572,6012345676,2063,D
00007,20070427,00573,6012345677,2081,U
00008,20070427,00574,6012345678,2054,U
00009,20070427,00575,6012345679,2063,U
101.1.0,0009#

Regards,
Krishna

awk -F, '
# Header: must have 5 fields; its last field is the expected detail count.
NR == 1 {
  if (NF != 5) print NR ": header should have 5 fields: " $0
  expect = $NF + 0
}
# Remember the most recent line; at END it is the trailer.
{ last_nf = NF; last_val = $NF; last_line = $0 }
END {
  sub(/#$/, "", last_val)   # the sample trailer ends in "#"
  if (last_nf != 2) print NR ": trailer should have 2 fields: " last_line
  if (last_val + 0 != expect) print NR ": trailer count differs from header count"
  if (expect != NR - 2) print "header count " expect " does not equal line count minus two (" (NR - 2) ")"
}' QDB_2008.txt

Slightly unwieldy, but should hopefully at least get you started.

> cat chk_valid 
#! /bin/bash
#
# script to check on file conditions

ifile="QDB_2008.txt"

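# first line, last line, and total line count
line_1=$(head -n 1 "$ifile")
line_lst=$(tail -n 1 "$ifile")
line_cnt=$(wc -l < "$ifile")   # counts newlines, so the file must end with one
detl_cnt=$((line_cnt-2))

# expected counts: 5th field of the header, 2nd field of the trailer (minus the trailing "#")
line_1_val=$(echo "$line_1" | cut -d"," -f5)
line_lst_val=$(echo "$line_lst" | cut -d"," -f2 | cut -d"#" -f1)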

if [ "$line_1_val" -ne "$line_lst_val" ]
   then
   echo "Error - header & footer line counts differ"
fi

if [ "$detl_cnt" -ne "$line_1_val" ]
   then
   echo "Error - # detail lines does not match expected counts"
fi
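
The script above compares the counts but does not check the number of commas on the first and last lines. A minimal sketch of that check could look like the following; count_fields and check_field_counts are hypothetical helper names, not part of the script above, and the 5-field and 2-field limits are taken from the original question.

```shell
#!/bin/bash
# Sketch only: check the field counts of the header and trailer lines.
# count_fields and check_field_counts are illustrative names.

# Print the number of comma-separated fields in the given line.
count_fields() {
    printf '%s\n' "$1" | awk -F, '{ print NF }'
}

# Return 0 if the header has 5 fields and the trailer has 2;
# otherwise print a message for each problem and return 1.
check_field_counts() {
    local file=$1 status=0
    local header trailer
    header=$(head -n 1 "$file")
    trailer=$(tail -n 1 "$file")

    if [ "$(count_fields "$header")" -ne 5 ]; then
        echo "Error - header should have 5 fields: $header"
        status=1
    fi
    if [ "$(count_fields "$trailer")" -ne 2 ]; then
        echo "Error - trailer should have 2 fields: $trailer"
        status=1
    fi
    return "$status"
}
```

It could be called as check_field_counts QDB_2008.txt before the count comparison.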

Hi,

The above code works fine.
I want to run the script after adding a newline character at the end of the file.

Please let me know how to append a newline character at the end of the file if one does not already exist.

Regards,
Krishna

awk 1 file
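
For context, the pattern 1 is always true, so awk prints every record followed by a newline, which supplies the final newline when it is missing and leaves other lines unchanged. The output goes to standard output, so the file itself is not modified. A rough sketch of writing the result back, assuming a common awk implementation and using a hypothetical helper name ensure_final_newline:

```shell
#!/bin/bash
# Sketch only: rewrite a file through "awk 1" so that it ends with a newline.
# ensure_final_newline is an illustrative name.
ensure_final_newline() {
    local file=$1 tmp
    tmp=$(mktemp) || return 1
    # awk 1 prints each record followed by a newline, including a last
    # line that had no terminating newline in the input.
    if awk 1 "$file" > "$tmp"; then
        # Copy the contents back so the original file keeps its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
}
```

Running it on a file that already ends with a newline should leave the contents unchanged.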

Regards

...or

$ echo "" >> file

This always appends a newline at the end of the file, whereas the OP wants to append a newline character only if one does not already exist.

Regards

Hi,

The above command does not seem to work.

I tried echo >> file
It appends a newline at the end of the file.

But can you help me wrap it in a condition that checks whether the file already ends with a newline:
if {newline does not exist at end of file}
then
echo >> file

Please help me out with this.

Regards,
Krishna

Have you tried the awk solution?

Hi,

I got it working like this :
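# Command substitution strips a trailing newline, so the result is
# non-empty only when the last byte of the file is not a newline.
if [ -n "$(tail -c 1 "$file")" ]
then
echo >> "$file"
fi

This will append a newline at the end of the file if one does not already exist.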
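
A variation on the same idea, offered only as a sketch: wc -l counts newline characters, so piping the last byte into it yields 1 when the file ends with a newline and 0 otherwise. The function name append_newline_if_missing is hypothetical, and skipping empty files is an assumption about what is wanted here.

```shell
#!/bin/bash
# Sketch only: append a newline when the last byte of the file is not one.
# append_newline_if_missing is an illustrative name.
append_newline_if_missing() {
    local file=$1
    # Assumption: a missing or empty file is left alone.
    [ -s "$file" ] || return 0
    # wc -l counts newlines in the single byte that tail -c 1 outputs.
    if [ "$(tail -c 1 "$file" | wc -l)" -eq 0 ]; then
        echo >> "$file"
    fi
}
```

Calling it twice on the same file should add at most one newline.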

I have some more file-level validation to take care of and need your input on it.

My (comma-separated) file is something like below:

1.0,20080317081500,00001,00006,6
00001,20080317,00001,60213000071,2105,I
00002,20080317,00002,60213000071,0,D
00003,20080317,00003,60213000072,2104,I
00004,20080317,00004,60213000073,2103,I
00005,20080317,00005,60213000074,2102,I
00006,20080317,00006,60213000074,0,D
10.1.254.21

Ignoring the first and last lines, I need to retrieve the last 3 fields of the other lines for further validation. Copying these fields to a separate file is also OK.

The new file can be:

60213000071,2105,I
60213000071,0,D
60213000072,2104,I
60213000073,2103,I
60213000074,2102,I
60213000074,0,D

Here I need to validate 2 points:

  1. In each line, the first field should have a length of 11 and start with 60.
  2. In each line, the last field should be "D", "I" or "U".

Please let me know how to do this.

With Regards,
Krishna
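
One possible approach to the two checks above, as a sketch rather than a tested solution for the real data: drop the first and last lines with sed, then let awk print the last three fields of each remaining line and report any line that fails a check. The function name validate_details is hypothetical, and the choice to report problems on standard error while still printing every extracted line is an assumption.

```shell
#!/bin/bash
# Sketch only: extract and validate the last three fields of each detail line.
# validate_details is an illustrative name.
validate_details() {
    # sed drops the header (line 1) and the trailer (last line).
    sed '1d;$d' "$1" | awk -F, -v OFS=, '
        # NR + 1 is the line number in the original file (the header was removed).
        NF < 3 {
            print "line " (NR + 1) ": fewer than 3 fields" > "/dev/stderr"
            bad = 1
            next
        }
        {
            id = $(NF - 2); op = $NF
            # Check 1: length 11 and starting with 60.
            if (length(id) != 11 || id !~ /^60/) {
                print "line " (NR + 1) ": bad first field: " id > "/dev/stderr"
                bad = 1
            }
            # Check 2: last field is D, I or U.
            if (op != "D" && op != "I" && op != "U") {
                print "line " (NR + 1) ": bad last field: " op > "/dev/stderr"
                bad = 1
            }
            print id, $(NF - 1), op
        }
        END { exit bad }'
}
```

For example, validate_details file > newfile would write the extracted fields to newfile and return a non-zero status if any line failed a check.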

Hi All,

Need someone's help urgently on this.

Please let me know.

With Regards

It's not allowed to bump up questions, please read the rules.

The moderator team.