Please help!!!

All,

I need help writing a script.

I have a file with a header on the first row, data rows after it, and a last row that contains the record count of the data rows only. I need to remove the first and last lines and check that the number of data rows matches the count on the last line. Can anyone please help?

For example, the file looks like this:

market|code|recordno|usage
a|2|3|4o
b|6|7|80
2

I need to remove the header and the last row and check whether the number of data rows equals the count at the bottom. Here there are 2 data rows and the count is 2, so the file is good.
If not, I need to throw an error: "data rows don't match count". Please help.

awk 'END{ printf ($1+2 != NR ) ? "Error\n" : "" }' file
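
To spell out what the one-liner checks: NR is the total number of lines, and the header plus the trailer account for 2 of them, so the trailer value plus 2 should equal NR. Below is a minimal sketch of the same test as a shell function. The name check_count is made up for illustration, and the sketch saves the last line in a variable instead of reading $1 inside END, on the assumption that some older awk implementations may not keep the last record there.

```shell
# check_count FILE: hypothetical helper, returns 0 when the trailer count
# matches the number of data rows and 1 otherwise.
check_count() {
    awk '
        { last = $0 }                   # remember the most recent line (ends up as the trailer)
        END {
            if (NR - 2 != last + 0) {   # NR counts header + data rows + trailer
                print "data rows do not match count"
                exit 1
            }
        }
    ' "$1"
}
```

With the sample file above, check_count would print nothing and return 0, since 4 lines minus 2 equals the trailer value 2.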

Reborg,

Thanks very much for the code! But after checking whether the record count matches, I need to remove the first and last records from the file and write the data rows to a new file. How can I achieve this? Please help.

awk '{data[NR] = $0} END { if ( NR - 2 == $1) { for ( i=2 ; i < NR; i++ ) { print data[i]}}else { print "error"} }' file > newfile
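
One thing to watch with the redirect to newfile: when the counts do not match, the word "error" is what ends up in newfile. A rough sketch of a variant that creates the output file only on success and reports failure through the exit status is below. strip_and_check and the out variable are hypothetical names, not from the posts above.

```shell
# strip_and_check INFILE OUTFILE: hypothetical helper.
# Writes the data rows of INFILE to OUTFILE only when the trailer count matches.
strip_and_check() {
    awk -v out="$2" '
        { data[NR] = $0 }                    # buffer every line
        END {
            if (NR - 2 != data[NR] + 0) {    # trailer must equal the number of data rows
                print "error: data rows do not match count"
                exit 1
            }
            for (i = 2; i < NR; i++)         # skip header (line 1) and trailer (line NR)
                print data[i] > out
            close(out)
        }
    ' "$1"
}
```

Because the message goes to standard output and not to the new file, a failed check leaves no half-written output behind.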

Reborg,

Thank you very much for your great response. I am new to awk, and it's a great feeling learning what it can do with files. I have one more question: if I need to do this on multiple files and write multiple output files, as in the earlier reply, is that possible?

If I have 20 files with the same format in a directory, can I apply the same code to all of them at one time and produce 20 output files, provided the data rows in each file match the count in it? How can the same code read multiple files whose names start with a similar pattern, e.g. REG1, REG2, REG3, etc.? Kindly let me know whether that is possible or whether I have to run it on each file independently.

Once again thank you very much.

One idea, based on the one from Reborg:

awk '{data[NR] = $0; out= FILENAME "new"; file[NR]=out;} END { if ( NR - 2 == $1) { for ( i=2 ; i < NR; i++ ) { print data[i]} > file[i]; close(file[i]}else { print "error"} }' REG*

Hi Vgersh,

I am getting an error like this

awk '{data[NR] = $0; out= FILENAME "new"; file[NR]=out;} END { if ( NR -
2 == $1) { for ( i=2 ; i < NR; i++ ) { print data[i]} > file[i];
close(file[i]}else { print "error"} }' arun* syntax error The source
line is 1.
The error context is
{data[NR] = $0; out= FILENAME "new"; file[NR]=out;} END
{ if ( NR - 2 == $1) { for ( i=2 ; i < NR; i++ ) { print data[i]} >>> >
<<< file[i]; close(file[i]}else { print "error"} }
awk: The statement cannot be correctly parsed.
The source line is 1.
awk: There is a missing ) character

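The missing ) character goes at the end of the close() call:

awk '{data[NR] = $0; out= FILENAME "new"; file[NR]=out;} END { if ( NR - 2 == $1) { for ( i=2 ; i < NR; i++ ) { print data[i]} > file[i]; close(file[i])}else { print "error"} }' REG*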

I tried this out of interest and got an error similar to the one mhssatya got. Here is the error:

syntax error The source line is 1.
The error context is
{data[NR] = $0; out= FILENAME "new"; file[NR]=out;} END { if ( NR - 2 == $1) { for ( i=2 ; i < NR; i++ ) { print data[i]} >>> > <<< file[i]; close(file[i])}else { print "error"} }
awk: The statement cannot be correctly parsed.
The source line is 1.
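
The >>> > <<< marker in the error context points at the > sign. In awk, output redirection is part of the print statement itself, so it has to sit inside the braces of the loop body; a > placed after the closing brace cannot be parsed. A tiny sketch of the accepted form follows (demo_redirect and out are illustrative names only):

```shell
# demo_redirect OUTFILE: hypothetical helper that writes three sample rows.
demo_redirect() {
    awk -v out="$1" 'BEGIN {
        for (i = 1; i <= 3; i++) {
            print "row " i > out    # the ">" belongs to print, inside the braces
        }
        close(out)                  # close() needs its closing parenthesis
    }'
}
```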

Make sure you used the code exactly as Klashxx corrected it.

If that doesn't work, try nawk in place of awk.

Reborg,

I tried exactly what Klashxx suggested but got the error again, and when I used nawk I got a "nawk: not found" error.

I created 2 files, arun and arun1, like this:
arun:
run
1
1
2

arun1:

run
1
1
2

Then I ran the code, but it didn't work. I am getting the same error that mhssatya got. I don't know whether he is still getting the same error or not.

Reborg,

I am getting the same error as sravan, and I also need to process 30 files at the same time and write them to 30 new files if the count in each file matches the number of data rows in that file. I need to append _new to the old file name. For example, if the file name is

CARE01_DLY_MKT_YYYYMMDD and the count matches the data rows in the file, then I need to write only the data rows to a new file named CARE01_DLY_MKT_YYYYMMDD_new.

This has to be repeated for all 30 files in the directory. How can I achieve this? Please help.

awk '{data[FNR] = $0; out=FILENAME "_new"; file[FNR]=out;} END { if ( FNR - 2 == $1) { for ( i=2 ; i < FNR; i++ ) { print data[i]  > file[i]}  close(file[i])}else { print "error"} }' CARE01_DLY_???_`date +%Y%m%d`
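
A caveat for several input files: the END block runs only once, after the last file has been read, so a single END check sees only the final file. One possible way around it, offered as a sketch and not as the code posted above, is to finish each file when the next one starts (FNR == 1) and once more in END. The names split_all and finish_file are invented here, and user-defined functions assume a POSIX awk (nawk, gawk, or similar).

```shell
# split_all FILE...: hypothetical helper.
# For every input file whose trailer count matches, writes the data rows to FILE_new.
split_all() {
    awk '
        # Validate the lines buffered so far and write the data rows to NAME_new.
        function finish_file(   i, out) {
            if (n > 0) {
                if (n - 2 == data[n] + 0) {
                    out = name "_new"
                    for (i = 2; i < n; i++) print data[i] > out
                    close(out)
                } else {
                    print name ": data rows do not match count"
                }
            }
            n = 0
        }
        FNR == 1 { finish_file(); name = FILENAME }   # a new input file begins
        { data[++n] = $0 }                            # buffer each line of the current file
        END { finish_file() }                         # handle the last file
    ' "$@"
}
```

It could then be called with the same kind of pattern, for example split_all CARE01_DLY_???_`date +%Y%m%d`.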

Reborg,

In the file name CARE01_DLY_MKT_YYYYMMDD, only MKT changes from file to file. For example, CARE01_DLY_IRL_20060720 for the 1st file, CARE01_DLY_IND_20060720 for the 2nd file, and so on.

So what do I need to specify as the file name in the code? I don't know where to put the name in the code. Please suggest.

CARE01_DLY_???_YYYYMMDD

Reborg,

If I give CARE01_DLY_???_YYYYMMDD, how can the system work out that it's today's file only? I need to make sure that it is today's file.

Do I need to write it like this?

awk '{data[FNR] = $0; out=CARE01_DLY_???_date+%Y%m%d"_new"; file[FNR]=out;} END { if ( FNR - 2 == $1) { for ( i=2 ; i < FNR; i++ ) { print data[i] > file[i]} close(file[i])}else { print "error"} }' CARE01_DLY_???_date+%Y%m%d

This is not working for me. Please suggest

run it EXACTLY as updated above.
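
On the question of how today's date gets into the file name: nothing inside the single-quoted awk program is expanded by the shell, so the pattern does not belong there. The shell expands the backquoted date command and the ??? wildcard on the command line before awk starts, and awk then sees each real file name in FILENAME. A small sketch to preview which files the shell would pass (todays_files is a made-up helper name; the CARE01_DLY_ prefix is taken from the posts above):

```shell
# todays_files: hypothetical helper that lists the files in the current
# directory matching CARE01_DLY_???_<today>.
todays_files() {
    today=`date +%Y%m%d`                     # for example 20060720
    for f in CARE01_DLY_???_"$today"; do     # ??? matches any three characters
        [ -f "$f" ] && printf '%s\n' "$f"    # skips the literal pattern when nothing matches
    done
    return 0
}
```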

Reborg,

I am getting the below error.

syntax error The source line is 1.
The error context is
{data[FNR] = $0; >>> out=CARE01_DLY_?? <<<
awk: The statement cannot be correctly parsed.
The source line is 1

Please suggest

That does not look exactly like what I posted.

Reborg,

I am typing the whole code, but after the last CARE0 statement it doesn't go any further. What should I do? If I copy and paste, the last statements are not printed. Do I need to put it in a script? Kindly let me know.

Please suggest.
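
For a command this long, one option is to keep the awk program in a file and run it with awk -f, so that only a short command has to be typed or pasted. A minimal sketch, assuming the file name strip_rows.awk (hypothetical) and reusing the single-file logic from earlier in this thread:

```shell
# Write the awk program to a file once (strip_rows.awk is a hypothetical name).
cat > strip_rows.awk <<'EOF'
{ data[NR] = $0 }                       # buffer every line
END {
    if (NR - 2 == data[NR] + 0) {       # trailer equals the number of data rows
        for (i = 2; i < NR; i++)        # skip header and trailer
            print data[i]
    } else {
        print "error"
    }
}
EOF
```

It would then be run as awk -f strip_rows.awk CARE01_DLY_IRL_20060720 > CARE01_DLY_IRL_20060720_new.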