AWK question in the KORN shell

penfold · February 3, 2005, 9:26am

Hi,

I have two files with the following content:

gmrd.txt
235649;03;2563;598
291802;00;2563;598
314634;00;235649;598
235649;03;2563;598
393692;00;2563;598
411805;00;2563;598
411805;00;2563;598
235649;03;2563;598
414037;00;2563;598
575200;00;2563;598
70710;00;2563;598
70710;00;2563;598
235649;03;2563;598
70710;00;2563;598
808932;00;2563;5980
903857;00;2563;5980
979217;00;2563;598
235649;03;2563;598
A0ABVB;00;2563;598
235649;03;2563;598

and val_id.txt
235649;05;2563;598
235649;05;2563;598
564564;05;2563;598
235649;05;2563;598
235649;05;2563;598
212564;05;2563;598

(these are small samples of the actual files)

What I need to do is use awk to take the first column of val_id.txt and then search gmrd.txt for any records containing a value that occurs in the first column of val_id.txt. If it finds any, it needs to replace the second column of that record with 05.

Any help appreciated

I get the following error message:

./split.sh[2]: 235649;05;2563;598: syntax error

with the following code:

for i in $(< val_id.txt );do
index[i]="$i"
export index

awk 'BEGIN { FS = ";"; OFS = ";" }
{
if ($1 == "${index[i]")
$2 = "05"
print $0 ;
}' gmrd.txt
done

Perderabo · February 3, 2005, 1:23pm

When you do "for i in $(< val_id.txt );do", i will be set to one line of input after another. For example, "235649;05;2563;598". This is not a valid subscript, yet the next line, "index[i]=$i" tries to use i as an index. Even if that had worked, you cannot usefully export an array. You should probably rewrite this to use just ksh or just awk. It's not very clear what you really want to do, so I can't really suggest specific code.
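To illustrate the point about the loop variable, here is a minimal sketch, assuming a POSIX-style shell. It shows that each iteration receives a whole record, and one way to cut the first field off with parameter expansion. The names `line`, `key` and `keys` and the scratch directory are made up for this example and are not part of the original script.

```shell
#!/bin/sh
# Illustration only: build a small val_id.txt in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '235649;05;2563;598' '564564;05;2563;598' > "$tmpdir/val_id.txt"

keys=""
for line in $(cat "$tmpdir/val_id.txt"); do
    # $line is a whole record such as 235649;05;2563;598,
    # which is why it cannot serve as an array subscript.
    key=${line%%;*}          # drop everything from the first ';' onward
    keys="$keys $key"
done
echo "first fields:$keys"
rm -rf "$tmpdir"
```

This only demonstrates the splitting; it does not yet touch gmrd.txt.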

penfold · February 3, 2005, 1:27pm

Thank you for your response -

What I am trying to do is look in the file val_id.txt and take the values in there. If I find instances of those values occurring in gmrd.txt, then I want to replace the second column with something else, say '05'.

Adam

Perderabo · February 3, 2005, 1:35pm

How big are these files? You can only have 1k elements in an array. Would that be enough?

penfold · February 3, 2005, 2:26pm

more than enough

Perderabo · February 3, 2005, 2:50pm

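#! /usr/bin/ksh

# Load the first column of val_id.txt into the array "index".
exec < val_id.txt
i=0
while IFS=";" read f1 f2 f3 f4 ; do
        index[i]="$f1"
        ((i=i+1))
done
# Debug: list the values that were loaded.
i=0
while ((i<${#index[*]})) ; do
        echo ${index[i]}
        ((i=i+1))
done

# Copy gmrd.txt to stdout, replacing field 2 when field 1 is in the array.
exec < gmrd.txt
while IFS=";" read f1 f2 f3 f4 ; do
        found=0
        i=0
        while ((i<${#index[*]})) ; do
                [[ $f1 = "${index[i]}" ]] && found=1
                ((i=i+1))
        done
        if ((found)) ; then
                f2="xyz"        # placeholder; use 05 for the original request
        fi
        echo "${f1};${f2};${f3};${f4}"
done
exit 0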

Ygor · February 4, 2005, 1:48am

Using awk...

awk '
    BEGIN {
        FS = OFS = ";"
        while (getline < "val_id.txt" > 0)
            arr[$1] = 1
    }
    $1 in arr {
        $2 = 50
        $NF =  $NF "<---debug: line changed"
    }
    {print}
' gmrd.txt

Tested...

235649;50;2563;598<---debug: line changed
291802;00;2563;598
314634;00;235649;598
235649;50;2563;598<---debug: line changed
393692;00;2563;598
411805;00;2563;598
411805;00;2563;598
235649;50;2563;598<---debug: line changed
414037;00;2563;598
etc.

Remove the debug line if this is what you wanted.
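The original request asked for 05 in the second column, whereas the script above writes 50. Here is a hedged sketch of the same lookup with the replacement given as the string "05", so that the leading zero survives. The sample data is a cut-down copy of the files in the question; the scratch directory and the names `ids`, `wanted` and `result` exist only for this illustration.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf '%s\n' '235649;05;2563;598' '564564;05;2563;598' > "$tmpdir/val_id.txt"
printf '%s\n' '235649;03;2563;598' '291802;00;2563;598' '314634;00;235649;598' > "$tmpdir/gmrd.txt"

result=$(awk -v ids="$tmpdir/val_id.txt" '
    BEGIN {
        FS = OFS = ";"
        # Load the first column of the ID file into a lookup table.
        while ((getline line < ids) > 0) {
            split(line, f, ";")
            wanted[f[1]] = 1
        }
        close(ids)
    }
    $1 in wanted { $2 = "05" }   # a string, so the leading zero is kept
    { print }
' "$tmpdir/gmrd.txt")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Only the first sample record should change; the third has 235649 in column three, not column one, so it is left alone.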

penfold · February 4, 2005, 3:55am

What can I say...if I could offer you guys a job I would!

I tested it and it works great

vgersh99 · February 4, 2005, 9:11am

A slight variation on the awk theme:

nawk '
    BEGIN {
        FS = OFS = ";"
    }
    FNR==NR { arr[$1]; next} 
    $1 in arr {
        $2 = 50
    }
    1
' val_id.txt gmrd.txt
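For readers unfamiliar with the `FNR==NR` test: NR counts records across all input files, while FNR restarts at 1 for each file, so the two are equal only while the first file is being read. Here is a small sketch that prints both counters, using two made-up files `a.txt` and `b.txt` in a scratch directory (the names and the `counters` variable are assumptions for the example):

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf '%s\n' 'x' 'y' > "$tmpdir/a.txt"
printf '%s\n' 'p' 'q' 'r' > "$tmpdir/b.txt"

# Print each record with NR, FNR and whether FNR == NR holds.
counters=$(awk '{ print $0, NR, FNR, (FNR == NR ? "first" : "later") }' \
    "$tmpdir/a.txt" "$tmpdir/b.txt")
printf '%s\n' "$counters"
rm -rf "$tmpdir"
```

One caveat worth knowing: if the first file is empty, `FNR==NR` is also true for the second file, so the lookup table would be built from the wrong input.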

dbrundrett · February 8, 2005, 9:13am

Ygor,
In your code you have the line 'while (getline < "val_id.txt" > 0)'. What is the '> 0'? Is it a redirection or a relational operator?

Thanks in advance

Ygor · February 8, 2005, 3:38pm

I am checking that the return value from getline is greater than zero.

This quote is from: http://www.cs.uu.nl/docs/vakken/st/nawk/nawk_25.html#SEC28
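As a small illustration of that return value: `getline` yields 1 when a record was read, 0 at end of file, and -1 on an error such as an unreadable file, which is why the loop compares against zero instead of testing the bare result. The sketch below parenthesizes the `getline` so the comparison cannot be confused with the redirection. The file names and the variables `good`, `missing`, `n`, `rc` and `report` are invented for the example.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)
printf '%s\n' '235649;05;2563;598' '564564;05;2563;598' > "$tmpdir/val_id.txt"

report=$(awk -v good="$tmpdir/val_id.txt" -v missing="$tmpdir/no_such_file" '
    BEGIN {
        n = 0
        # Parenthesized getline: "> 0" is a comparison on the return value.
        while ((getline line < good) > 0)
            n++
        close(good)
        # A missing file makes getline return -1, so "> 0" would end a loop
        # where a bare "while (getline line < missing)" would never stop.
        rc = (getline line < missing)
        print "records read: " n
        print "return value for missing file: " rc
    }
')
printf '%s\n' "$report"
rm -rf "$tmpdir"
```

Writing `(getline line < file) > 0` also avoids relying on how a particular awk parses the unparenthesized form.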

dbrundrett · February 9, 2005, 8:14am

Thanks Ygor