How to replace first line of every file using awk

quincyjones · November 26, 2013, 3:41am

How can I replace the first line of every file with >mm200? Thanks in advance.

file.mail

>mm89589585989*
GGG
>HH
DG

file1.mail

>mm454695879357
dg

output

file.mail

>mm200
GGG
>HH
DG

file1.mail

>mm200
dg

Akshay_Hegde · November 26, 2013, 4:01am

Try this to replace the first line:

$ cat <<eof | awk '$0 = NR==1 ? replace : $0' replace=">mm200"
>mm89589585989*
GGG
>HH
DG
eof

>mm200
GGG
>HH
DG

For a single file, use it like this:

$ awk '$0 = NR==1 ? replace : $0' replace=">mm200" file

For many files:

for file in *.type; do
       awk '$0 = NR==1 ? replace : $0' replace=">mm200" $file >tmpfile && mv tmpfile $file
done

chacko193 · November 26, 2013, 4:01am

Assuming that all the files are in the same directory:

#!/bin/sh

for file in /path/to/files/*
do
  if [ -f "$file" ]
  then
    # Replace the whole first line; the ">" matches the requested output.
    sed '1 s/.*/>mm200/' "$file" >tmpFile
    cp tmpFile "$file"
    rm tmpFile
  fi
done

quincyjones · November 26, 2013, 4:26am

@Akshay

For many files, this works fine like this:

awk '$0 = NR==1 ? replace : $0' replace=">mm200" file*

Akshay_Hegde · November 26, 2013, 4:34am

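For many files this won't work as intended: awk treats everything matched by file* as one continuous input stream, and NR keeps counting across files, so NR==1 is true only for the first line of the first file. We are not checking FNR, which resets to 1 at the start of each file. The output also goes to standard output rather than back into the files. For many files, copy the last code block in post #1; that will work.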
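To make the NR versus FNR point concrete, here is a minimal sketch. The scratch directory and the names f1.txt and f2.txt are made up for the demonstration. Note that even the FNR version only prints to standard output, so it still does not rewrite the files by itself.

```shell
# Work in a scratch directory so no real files are touched (names are hypothetical).
demo_dir=$(mktemp -d)
printf '>a1\nx\n' > "$demo_dir/f1.txt"
printf '>b1\ny\n' > "$demo_dir/f2.txt"

# NR counts lines across all inputs, so only the first file's first line changes.
nr_out=$(awk 'NR==1 {$0=replace} 1' replace=">mm200" "$demo_dir/f1.txt" "$demo_dir/f2.txt")

# FNR restarts at 1 for every input file, so each file's first line changes.
fnr_out=$(awk 'FNR==1 {$0=replace} 1' replace=">mm200" "$demo_dir/f1.txt" "$demo_dir/f2.txt")

printf 'With NR:\n%s\n\nWith FNR:\n%s\n' "$nr_out" "$fnr_out"
rm -rf "$demo_dir"
```

With NR the second file keeps its original >b1 line; with FNR both first lines become >mm200.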

--edit--

Change mv tmpfile $file to mv -f tmpfile $file once you have finished testing with sample data. Sorry, I forgot to put this note in post #1.

RudiC · November 26, 2013, 8:56am

@Akshay Hegde: This is working fine, but it replaces every single line in the file by itself or by replace's contents, and - admittedly remote possibility - should there be an empty or zero-valued line, it would be suppressed. Why not simply use

awk 'NR==1 {$0=replace} 1' replace=">mm200" file

Akshay_Hegde · November 26, 2013, 9:05am

Thanks RudiC. Yes what you said is true.. I just tested like this...thanks for bringing this to my notice.

$ awk 'BEGIN{for(i=1;i<=5;i++)print i;print}' | awk '$0 = NR==1 ? replace : $0' replace=">mm200"
>mm200
2
3
4
5

$ awk 'BEGIN{for(i=1;i<=5;i++)print i;print}' | awk 'NR==1 {$0=replace} 1' replace=">mm200"
>mm200
2
3
4
5
            --- > empty line

quincyjones · November 26, 2013, 10:11am

Then the final version is like this?

for file in *.type; do
       awk 'NR==1 {$0=replace} 1' replace=">mm200" $file >tmpfile && mv -f tmpfile $file
done

Akshay_Hegde · November 26, 2013, 10:53am

@quincyjones Yes, the final version is like that. What I posted will work as long as there is no empty line; if there is one, my solution suppresses it (removes the empty line).
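The same suppression applies to a line that contains only 0, as RudiC mentioned. A small sketch with made-up sample data (the >old header and the ACGT line are placeholders):

```shell
# Sample input: a header, a line holding only 0, an empty line, a normal line.
sample=$(printf '>old\n0\n\nACGT')

# Assignment used as the pattern: a line whose value is empty or 0 is false,
# so awk does not print it.
as_pattern=$(printf '%s\n' "$sample" | awk '$0 = NR==1 ? replace : $0' replace=">mm200")

# Explicit action plus the always-true pattern 1: every line is printed.
with_rule=$(printf '%s\n' "$sample" | awk 'NR==1 {$0=replace} 1' replace=">mm200")

printf 'Assignment as pattern:\n%s\n\nExplicit rule:\n%s\n' "$as_pattern" "$with_rule"
```

The first form loses both the 0 line and the empty line; the second keeps all four lines.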

Corona688 · November 26, 2013, 10:55am

Not bad but I would modify it a little to make it safer:

# VITAL:  Back up first!  Otherwise one typo could mangle all your input files
# and make you very sad.
tar -cf type-backup.tar *.type

for file in *.type; do
       awk 'NR==1 {$0=replace} 1' replace=">mm200" "$file" > tmpfile
       # Using cat instead of mv means $file will not change ownership
       # or permissions by accident.  This is because it overwrites the file,
       # rather than deleting it and putting a brand-new file in its place.
       cat tmpfile > "$file"
done

rm -f tmpfile
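One way to see the difference between cat and mv is to compare inode numbers before and after. This is only a sketch in a scratch directory; a.type, b.type and the inode_of helper are invented for the demonstration.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

printf '>old\nGGG\n' > a.type
printf '>old\nGGG\n' > b.type

# Hypothetical helper: print the inode number of a file.
inode_of() { ls -i "$1" | awk '{print $1}'; }

a_before=$(inode_of a.type)
b_before=$(inode_of b.type)

# cat writes the new contents into the existing file, so the inode
# (and with it owner and permissions) stays the same.
awk 'NR==1 {$0=replace} 1' replace=">mm200" a.type > tmpfile
cat tmpfile > a.type
a_after=$(inode_of a.type)

# mv puts the temporary file in place of the original, so the name
# now points at a different inode.
awk 'NR==1 {$0=replace} 1' replace=">mm200" b.type > tmpfile
mv -f tmpfile b.type
b_after=$(inode_of b.type)

echo "cat: $a_before -> $a_after"
echo "mv:  $b_before -> $b_after"

cd / && rm -rf "$demo_dir"
```

The cat line should show the same number twice, while the mv line should show two different numbers.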

Subbeh · November 26, 2013, 11:12am

Alternatively you can overwrite the file in awk like this (test first, and correct me if I'm wrong):

awk '{if(NR==1)$0=replace; print > FILENAME}' replace=">mm200" *.type

Corona688 · November 26, 2013, 11:21am

Sorry, that is incorrect. Doing that will truncate the file before awk is finished reading it, throwing away most of its contents.

Even things like sed -i actually use temporary files to store changes to the file until they're ready to replace it.
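A rough sketch of that temporary-file pattern done by hand, assuming mktemp is available (it is common but not part of POSIX). The function name replace_first_line and the x.type test file are made up for illustration.

```shell
# Hypothetical helper: replace the first line of one file via a temporary file.
replace_first_line() {
    new_line=$1
    target=$2
    # Create the temporary file next to the target so the final mv
    # stays on the same filesystem.
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if awk 'NR==1 {$0=replace} 1' replace="$new_line" "$target" > "$tmp"; then
        # Only swap the file in once awk has finished successfully.
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage on a scratch file.
demo_dir=$(mktemp -d)
printf '>old\nGGG\n' > "$demo_dir/x.type"
replace_first_line ">mm200" "$demo_dir/x.type"
result=$(cat "$demo_dir/x.type")
printf '%s\n' "$result"
rm -rf "$demo_dir"
```

Because this ends with mv, the result is a new file, so ownership and permissions can change; swap the mv for cat "$tmp" > "$target" followed by rm -f "$tmp" if that matters.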

Subbeh · November 26, 2013, 11:24am

You're right, it only works on small files, thanks

Corona688 · November 26, 2013, 11:32am

That it even works on small files means awk must be reading several lines in advance for you. You trash the original file pretty instantly, but it won't realize until it runs out of data and needs to read again.

Akshay_Hegde · November 26, 2013, 12:52pm

How about this?

Generate some files for testing:

$ awk 'BEGIN{for(i=1;i<=5;i++)for(j=1;j<=5;j++)print j >"file"i".tmp"}'
$ ls -1 file*.tmp
file1.tmp
file2.tmp
file3.tmp
file4.tmp
file5.tmp

BEFORE

$ awk '{ print FNR==1 && NR!=1 || NR==1 ? $0 FS FILENAME : $0 }' file*.tmp
1 file1.tmp
2
3
4
5
1 file2.tmp
2
3
4
5
1 file3.tmp
2
3
4
5
1 file4.tmp
2
3
4
5
1 file5.tmp
2
3
4
5

awk '

function write_2_og_file(file){
                                 system(sprintf("%s %s %s","cat","tmpfile>",file))
                              }

     NR==1 || NR!=1 && FNR==1{
                                  w  = "tmpfile"
                                  close(w)
                                  $0 = replace
                                  f = FILENAME
                             }
              NR!=1 && FNR==1{
                                  write_2_og_file(p)
                             }
                             {
                                  print $0 >w
                                  p=f
                             }
                          END{
                                  close(w)
                                  write_2_og_file(p)
                                  system("rm tmpfile")
                             }
    '   replace="<m200" file*.tmp

AFTER

$ awk '{ print FNR==1 && NR!=1 || NR==1 ? $0 FS FILENAME : $0 }' file*.tmp
<m200 file1.tmp
2
3
4
5
<m200 file2.tmp
2
3
4
5
<m200 file3.tmp
2
3
4
5
<m200 file4.tmp
2
3
4
5
<m200 file5.tmp
2
3
4
5

Let me know if I missed anything...
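A shorter variant of the same idea, offered only as a sketch: let awk write each input to a sibling file, then copy the results back in the shell. The .new suffix is an arbitrary choice for this example and assumes no files with that name already exist.

```shell
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
for i in 1 2 3; do printf '1\n2\n3\n' > "file$i.tmp"; done

# FNR==1 marks the first line of each input file: close the previous
# output, pick the new output name, and replace the line.
awk '
    FNR==1 { if (out != "") close(out); out = FILENAME ".new"; $0 = replace }
    { print > out }
' replace="<m200" file*.tmp

# Copy each result back with cat so the originals keep their owner and
# permissions; skip inputs that produced no output (for example, empty files).
for f in file*.tmp; do
    [ -f "$f.new" ] && cat "$f.new" > "$f" && rm -f "$f.new"
done

result=$(cat file1.tmp file2.tmp file3.tmp)
printf '%s\n' "$result"

cd / && rm -rf "$demo_dir"
```

This avoids calling system() from inside awk, at the cost of a second loop in the shell.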

Don_Cragun · November 26, 2013, 1:27pm

I realize that the title of this thread requests a solution in awk, but most of the discussion in this thread seems to revolve around the fact that awk is not the correct tool for this job. Using ed or ex as in:

#!/bin/ksh
for i in "$@"
do      ed -s "$i" <<EOF
                1s/.*/>mm200/
                w
                q
EOF
done

avoids issues with input files being truncated, permissions changing, ownership changing, etc.