Replace a number in the last line of a delimited file.

Hi all,

I am fairly new to UNIX and I was wondering if you could provide me with some help! Lets say i have a file as below :

Line 1  
Line 2 
Line 3 
ABC|12|4|2

Now the number 4 in bold, this number will represent the number of row there is in the file excluding the header and footer so the result should be 2.

So what i want to do it to write a code that counts the number of rows in the file in this case 4, minus it by 2 (because excluding header and footer), go to the last line, replace the third value seperated by the delimiter and save the file.

Currently i am running the code below:

awk 'BEGIN{FS=OFS="|"} END{$3=NR-2;print};1' $1

This is the output it get after running the code:

Line 1  
Line 2 
Line 3 
ABC|12|4|2
ABC|12|2|2

My desired output is :

Line 1  
Line 2 
Line 3 
ABC|12|2|2

then save it.

can someone help me please ? Hope that all makes sense.

If, as shown in your example, only the trailer line contains pipe symbols; try:

awk '{BEGIN{FS=OFS="|"}NF==4{$3=NR-2}1' "$1"

Otherwise, try:

awk 'BEGIN{FS=OFS="|"}NR>1{print last}{last=$0}END{$3=NR-2;print}' "$1"
1 Like

Thanks Don i used the second code and it worked perfectly!

Hi so now i have another requirement.

Say i have a variable

ABC=CHICKEN
Line 1  
Line 2 
Line 3 
ABC|12|4|2
ABC|12|2|BOO

I want to replace Boo with chicken.

awk 'BEGIN{FS=OFS="|"}NR>1{print last}{last=$0}END{$3=$ABC;print}' "$1"

I have been trying it out like this but it doesn't work is there a way i can replace BOO with CHICKEN

awk 'NR>1{print last}{last=$0}END{$NF=ABC; print $0}' ABC="$ABC" FS="|" OFS="|" "$1"
1 Like

Your requirements are not at all clear:

  • Saying: "I want to replace Boo with chicken." is strange since "Boo" does not appear anywhere in your sample input and "chicken" is not what is stored in your variable. In BSD-, Linux-, and UNIX-systems strings, variable assignments, filenames, and almost everything else is case sensitive.
  • You can't directly use shell variables inside a single-quoted awk script.
  • Your awk script unconditionally replaces the 3rd field on every line in your input with the contents of the field specified by the numeric value of the awk variable ABC (which, since it hasn't been assigned a value, is 0). How this relates to changing the value of the 4th field (only when it has the value BOO ) to CHICKEN is unclear.

If what you are trying to do is to replace every occurrence of the string BOO in your input file with the contents of the shell variable ABC , you might want to try something more like:

ABC=CHICKEN
awk -v ABC="$ABC" '
BEGIN {	FS = OFS = "|"
}
{	gsub(/BOO/, ABC)
}
1' "$1"

If what you are trying to do is to replace the contents of every field whose value is entirely the case insensitive string BOO with the contents of the shell variable ABC , you might want to try something more like:

ABC=CHICKEN
awk -v ABC="$ABC" '
BEGIN {	FS = OFS = "|"
}
{	for(i = 1; i <= NF; i++)
		if($i ~ /^[Bb][Oo][Oo]$/)
			$i = ABC
}
1' "$1"

If what you are trying to do is to replace the contents of field $4 with the contents of the shell variable ABC if and only if that field contains the string BOO , you might want to try something more like:

ABC=CHICKEN
awk -v ABC="$ABC" '
BEGIN {	FS = OFS = "|"
}
$4 == "BOO" {
	$4 = ABC
}
1' "$1"

Of course, all of the above assume that you are using a shell that is based on Bourne shell syntax (not a shell that is based on csh syntax) and that you are not using a Solaris/SunOS system (in which case you would need to use /usr/xpg4/bin/awk or nawk instead of awk ).

If none of the above are what you are trying to do and what rdrtx1 guessed is not what you are trying to do then please give a much more detailed, clear description of what your code is supposed to do along with a clear specification of what shell and operating system you're using so we might have a better chance of determining what needs to be done to achieve your desired outcome.

2 Likes

Hi Don,

Apologies for this, my requirements are always changing and i've been trying to learn and implement stuff at the same time so sorry if i haven't been clear enough.

It is Shell based so i will be using AWK , i used your 3rd Sample Code and that work perfectly.

So i have another file that i would like to change.

Take below as a sample code that i currently have that i want to change. So what i want to do is replace everything beyond the second COMMA with whatever i want.

Line 1
Line 2
Line 1
ABC,2,CHICKEN,PASTA

The below code is my desired output

Line 1
Line 2
Line 1
ABC,2,TUNA,PIE

The below code is what i am doing currently using to replace this.

CHICKEN=CHICKEN,PASTA
TUNA=TUNA,PIE
sed "s/$CHICKEN/$TUNA/g" "$1" > temp2

The problem with this is that the code won't if someones accidentally spells something wrong.

So i would like to have a code which:

  1. Looks at the bottom line of the file only.
  2. Replace everything beyond the second COMMA regardless of whatever is there because i might not necessary know what is there.
  3. i want to replace everything beyond the second point with whatever i store in $TUNA .

Please can i get some help.

Thank You

Spelling things wrongly usually is fatal in IT unless specially taken care of in the code. So instead of softening the conditions for your replacement, you might want to install error checking upfront.
Howsoever, for your problem, try

sed -r "\$s/(([^,],){2}).*/\1$TUNA/g" file
Line 1
Line 2
Line 1
ABC,2,TUNA,PIE

or, if your sed version doesn't allow for extended regexes,

sed  "\$s/\(\([^,],\)\{2\}\).*/\1$TUNA/g" file

Unfortunately, when using double quotes, you need to escape the $ sign designating the last line of the input stream.

Hi RudiC,

I used you first sample code and it seems to only work with files with fewer lines, i have a file which consists of 40+ lines and it doesn't seem to work. Is there a way to resolve this?

Also would it matter if i have a file that consists of multiple Commas as a delimiter like below? Would it potentially affect the way the script works?

"FOOD",CAT,20160606,20160606,001
"FOOD1",TUNA,"PASTA"
"FOOD2",CHICKEN,"PASTA"
"FOOD3",HAM,"PASTA"
"FOOD4",EGG,"PASTA"
"FOOD5",NOODLE,"PASTA"
"FOOD6",BACON,"PASTA"
"FOOD7",FISH,"PASTA"
"FOOD8",BIG,"PASTA"
"TOTAL",8,FOOD,ABC1234567

Also apologies in advance if i keep changing the way the code in the file looks it's that i'm trying to keep it as generic as possible.

Generic is one thing. But, without knowing how it doesn't work and without knowing what the last line really looks like in a file where it doesn't work, we are all left guessing with no solid idea of what might be wrong.

If I were to make a bunch of wild guesses, might I be correct in saying that the sed command RudiC suggested in post #8:

sed  "\$s/\(\([^,],\)\{2\}\).*/\1$TUNA/g" file

works just fine as long as there is exactly one digit in the second field in the last line in file (as found in all of your sample input files), but it doesn't do anything if there is more than one digit in that field? If my wild guess is correct, would the following slight modification of RudiC's code work correctly for you?

sed "\$s/^\(\([^,]*,\)\{2\}\).*/\1$TUNA/" file

We don't necessarily need the exact contents of the file(s) you're working on to help you get working code, but we do need representative contents that accurately show us the format of the data you're going to be processing with a clear English description of the variations that would be expected in your real data.

1 Like

As the sed script is going for the last line in the input stream, it doesn't matter if it's 40 lines or thousands of lines of data.
You may want to anchor the regex (after applying Don Cragun's "patch") at the begin-of-line to be even safer to retain the FIRST two fields.

What do you mean by "multiple Commas as a delimiter like below"? I can't recognize any in your sample...