Awk print to file problem

TasosARISFC · March 3, 2011, 8:52am

Hello,

I have a file separated with "|". I want to search the 11th field and, if it matches certain words, change it to an empty space.

I have managed to do that, but now I need it to replace the file.

this is my code:

 
awk 'BEGIN{OFS=FS="|"}$11=="to follow"||$11=="tbc"||$11=="123456"{$11=" "}{print}' file

I tried print > file, print > "file" and print $* > "file", but I still get the output on the screen and the file is not updated. Any ideas?

Thanks in advance
Tasos

in2nix4life · March 3, 2011, 9:09am

Hope this helps.

awk 'BEGIN{OFS=FS="|"}$11=="to follow"||$11=="tbc"||$11=="123456"{$11=" "}{print}' < file > newfile
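
For anyone who wants to try that approach on throwaway data first, here is a minimal sketch. The names sample.txt and sample.new and the sample records are made up for the demo. The mv at the end is an addition that is not in the reply above; it replaces the original only if awk exited successfully.

```shell
# Sketch only: sample.txt / sample.new are hypothetical names for this demo.
workdir=$(mktemp -d)

# Two records with 12 pipe-separated fields; only the first has "tbc" in field 11.
printf '%s\n' 'a|b|c|d|e|f|g|h|i|j|tbc|l' 'a|b|c|d|e|f|g|h|i|j|keep|l' > "$workdir/sample.txt"

# Write the result to a new file, then move it over the original
# only if awk succeeded.
awk 'BEGIN{OFS=FS="|"} $11=="to follow"||$11=="tbc"||$11=="123456"{$11=" "} {print}' \
    "$workdir/sample.txt" > "$workdir/sample.new" &&
    mv "$workdir/sample.new" "$workdir/sample.txt"

cat "$workdir/sample.txt"
```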

TasosARISFC · March 3, 2011, 11:20am

Sorted it: print > "file" worked. For some reason it did not before; I just deleted the script, wrote it again, and it worked.

thanks anyway

---------- Post updated at 04:20 PM ---------- Previous update was at 02:17 PM ----------

OK, now I have a new problem. I am sure I saw a similar post here but wasn't able to locate it.

What I am trying to do now is have the script look into a folder and, if any files match a prefix, run the awk command on them and update them. Here is the code:

 
CMR='mypath'
ls $CMR |grep TEST_
var1=$(ls $CMR | grep TEST_)
if [ $? -eq 0 ]
then
  for filename in $var1
  do
 
    awk 'BEGIN{OFS=FS="|"}$11=="to follow"||$11=="tbc"||$11=="123456"{$11=" "}{print > "$filename"}' $filename

done
fi

However, this does not update the files.
Any ideas what the syntax might be to print > an iterative variable?

thanks in advance
tasos
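
On the syntax question: inside the single-quoted awk program the shell never expands $filename, so awk is handed the literal string "$filename" and writes to a file with that name. One common way to pass a shell value in is awk's -v option. The sketch below assumes a made-up file TEST_demo and an awk variable named out, and deliberately writes to a different file than the one being read, because overwriting the input while awk is still reading it is not safe.

```shell
# Sketch only: TEST_demo and the awk variable "out" are hypothetical names.
workdir=$(mktemp -d)
filename="$workdir/TEST_demo"
printf '%s\n' 'a|b|c|d|e|f|g|h|i|j|tbc|l' > "$filename"

# -v copies the shell value into the awk variable "out";
# print > out then writes to that path.
awk -v out="$filename.new" \
    'BEGIN{OFS=FS="|"} $11=="to follow"||$11=="tbc"||$11=="123456"{$11=" "} {print > out}' \
    "$filename"

cat "$filename.new"
```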

Corona688 · March 3, 2011, 11:25am

You really can't edit-in-place like that. awk doesn't work that way. It'll either do nothing, or trash the entire file.

Just print in awk instead of printing to file, and do the redirection outside awk.

That's also a useless use of ls | grep, you can just glob it like this:

for FILE in "${CMR}"/TEST_*
do
        # Store output in temporary file /tmp/$$
        # then overwrite the original's contents
        awk '{stuff}' < "$FILE" > /tmp/$$ &&
                cat /tmp/$$ > "$FILE"
done

rm -f /tmp/$$

Only allow cat to overwrite it once you're very very sure awk is doing what you want.
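
A runnable sketch of the same loop with the original awk program in place of {stuff}, for trying on scratch data. Assumptions: the directory and the two TEST_ files are created here just for the demo, and mktemp is swapped in for /tmp/$$ as the temporary file name.

```shell
# Sketch only: CMR points at a scratch directory with made-up TEST_ files.
CMR=$(mktemp -d)
printf '%s\n' 'a|b|c|d|e|f|g|h|i|j|to follow|l'        > "$CMR/TEST_one"
printf '%s\n' 'a|b|c|d|e|f|g|h|i|j|number to follow|l' > "$CMR/TEST_two"

tmp=$(mktemp) || exit 1     # mktemp used here instead of /tmp/$$

for FILE in "$CMR"/TEST_*
do
    # awk writes to the temporary file; cat overwrites the original
    # only if awk exited successfully.
    awk 'BEGIN{OFS=FS="|"} $11=="to follow"||$11=="tbc"||$11=="123456"{$11=" "} {print}' \
        < "$FILE" > "$tmp" &&
        cat "$tmp" > "$FILE"
done

rm -f "$tmp"
cat "$CMR"/TEST_*
```

Using cat rather than mv keeps the original file's owner and permissions, at the cost of not being atomic.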

TasosARISFC · March 3, 2011, 11:58am

Sadly the above did not work: either the awk command was not executed at all, or the amended file was replaced with the original again.

yinyuemi · March 3, 2011, 2:41pm

hope this can help you.

sed -i -r -e 's/\|/\n/10' -e 's/\n(to follow|tbc|123456)(\||$)/| \2/' -e 's/\n/|/' file

Chubler_XL · March 3, 2011, 6:51pm

The ls | grep wasn't entirely useless: it was testing for the existence of a file matching the glob, to avoid issues if there are no TEST_* files in the directory.

TasosARISFC, the {stuff} was supposed to be your actual awk script, not to be inserted literally.

This should be close to what you need (checks for no TEST_ files; {stuff} now replaced with awk code to do the job):

if [ ! "$CRM"/TEST_* = "$CRM"'/TEST_*' ]
then
    for FILE in "${CMR}"/TEST_*
    do
        # Store output in temporary file /tmp/$$
        # then overwrite the original's contents
        awk -F\| '$11~/(tbc|123456|to follow)/{$11=x} 1' OFS="|" < "$FILE" > /tmp/$$ &&
            cat /tmp/$$ > "$FILE"
    done
    rm /tmp/$$
fi

TasosARISFC · March 4, 2011, 10:52am

Thank you for your reply. I was using the original code inside {stuff}; I did not take it literally.

The code above works, but not as I wanted. My previous code matched the search criteria exactly; now it replaces the 11th field with a space if it finds a matching word anywhere in the field. For example:

field 11= tbc ----gets replaced by " "
field 11= 0005533 tbc --- also gets replaced by " "

I want it to replace only the first case, i.e. when the whole field matches exactly.

another example

field 11= number to follow --- should not be replaced
field 11= to follow --- this should be replaced

---------- Post updated at 03:52 PM ---------- Previous update was at 03:08 PM ----------

This is what eventually worked for me. It does exactly what I want.
Chubler_XL, I think the if check in your reply is wrong.

Anyway this is the code that worked:

 
CRM='mypath'
ls $CRM | grep TEST
if [ $? -eq 0 ]
  then
    for FILE in "${CRM}"/TEST*
    do
        # Store output in temporary file /tmp/$$
        # then overwrite the original's contents
        awk -F\| '$11=="to follow"||$11=="test"||$11=="tbc"||$11=="123456"||$11=="TBC"{$11=""} 1' OFS="|" < "$FILE" > /tmp/$$ &&
            cat /tmp/$$ > "$FILE"
 
    done
  rm /tmp/$$
fi

Thank you all for your help

Corona688 · March 4, 2011, 1:03pm

if [ ! "$CRM"/TEST_* = "$CRM"'/TEST_*' ]

That will fail with "too many arguments" whenever more than one file matches the glob, and bomb the whole script!

I usually try something like

for FILE in "${CMR}"/TEST_*
do
    [ -f "$FILE" ] || continue # Don't run awk on nonexistent files
    # Store output in temporary file /tmp/$$
    # then overwrite the original's contents
    awk -F\| '$11~/(tbc|123456|to follow)/{$11=x} 1' OFS="|" < "$FILE" > /tmp/$$ &&
        cat /tmp/$$ > "$FILE"
done
rm -f /tmp/$$

...but even without the 'continue' check nothing bad will happen beyond awk printing an error message.
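
If an explicit "did the glob match anything?" test is wanted before the loop, one way that avoids the "too many arguments" problem is to hand the expanded glob to a function and test only its first argument. This is a sketch: has_match is a made-up helper name, and it assumes the shell's default behaviour of leaving an unmatched pattern as a literal string.

```shell
# Sketch only: has_match is a hypothetical helper, not part of the thread's code.
has_match() {
    # With no match, $1 is the unexpanded pattern, which does not exist
    # as a file; with one or more matches, $1 is the first real file.
    [ -e "$1" ]
}

dir=$(mktemp -d)

if has_match "$dir"/TEST_*; then echo "before: found"; else echo "before: none"; fi

touch "$dir/TEST_a" "$dir/TEST_b"

if has_match "$dir"/TEST_*; then echo "after: found"; else echo "after: none"; fi
```

Because test only ever sees one argument, the number of matching files does not matter.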

---------- Post updated at 12:03 PM ---------- Previous update was at 12:01 PM ----------

tasosarisfc:

 
CRM='mypath'
ls $CRM | grep TEST
if [ $? -eq 0 ]
  then
   for FILE in "${CRM}"/TEST*
   do
   # Store output in temporary file /tmp/$$
   # then overwrite the original's contents
   awk -F\| '$11=="to follow"||$11=="test"||$11=="tbc"||$11=="123456"||$11=="TBC"{$11=""} 1' OFS="|" < "$FILE" > /tmp/$$ &&
   cat /tmp/$$ > "$FILE"
 
   done
  rm /tmp/$$
fi

Thank you all for your help

The "ls $CRM | grep TEST" does absolutely nothing now unless you just want it printing to console for some reason. You can remove it.

Chubler_XL · March 6, 2011, 4:44pm

OK, if you want an exact match on that string list, put ^ in front and $ on the end of the pattern.

So we have the following, with Corona688's enhanced check (thanks given) for no TEST_ files:

for FILE in "${CMR}"/TEST_*
do
    [ -f "$FILE" ] || continue # Don't run awk on nonexistent files
    # Store output in temporary file /tmp/$$
    # then overwrite the original's contents
    awk -F\| '$11~/^(tbc|123456|to follow)$/{$11=x} 1' OFS="|" < "$FILE" > /tmp/$$ &&
        cat /tmp/$$ > "$FILE"
done
rm -f /tmp/$$
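
To see what the anchors change, the pattern can be tried against the example values from earlier in the thread. This is a small sketch: check is a made-up helper that feeds one value to awk as a whole line and reports whether the anchored pattern would blank it.

```shell
# Sketch only: check is a hypothetical helper for trying the pattern.
check() {
    # Prints "blank" if the whole value matches one of the words, else "keep".
    printf '%s\n' "$1" |
        awk '{ if ($0 ~ /^(tbc|123456|to follow)$/) print "blank"; else print "keep" }'
}

check 'tbc'                # blank
check '0005533 tbc'        # keep
check 'to follow'          # blank
check 'number to follow'   # keep
```

Without ^ and $, the second and fourth values would also match, which is the behaviour TasosARISFC reported.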