Deleting all lines except last 500

ailnilanjan · May 5, 2013, 3:18am

Hi All,

I want to write a script that first checks the line count of a file and, if it is more than 500, deletes everything except the last 500 lines.

I tried sed, but it looks like sed counts line numbers from the head and not from the tail. Maybe I need a wc -l first, then an if statement that passes the line count on to the sed statement.

Can you please suggest a way to do this?

I am using a Solaris 10 box.

Jotne · May 5, 2013, 3:40am

Maybe something like this works:

awk '(s-NR)<500' s=$(cat file | wc -l) file

RudiC · May 5, 2013, 3:41am

man tail

Jotne · May 5, 2013, 3:47am

Yes, why make it complex when there is a command for this?
Maybe I am in love with awk.
tail -n 500

ailnilanjan · May 5, 2013, 3:56am

tail does not delete anything; every use of tail I tried just displays the lines.

awk '(s-NR)<500' s=$(cat file | wc -l) file

is returning

"awk: can't open 664"

664 is actually the line count of the file.

Jotne · May 5, 2013, 4:42am

cat t

one
two
three
four
five
six
seven
eight
nine
ten

tail -n 3 t
eight
nine
ten

awk '(s-NR)<3' s=$(cat t | wc -l) t
eight
nine
ten

ailnilanjan · May 5, 2013, 5:08am

My purpose is to delete all lines except the last few.

For just viewing the last few I would have used the shorter one:

tail -3 t

Jotne · May 5, 2013, 5:15am

This is how to delete

tail -n 500 oldfile > newfile

newfile then contains what you want.
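
To turn this into a deletion in the original file, the new file can be moved back over the old one. A minimal sketch, assuming a POSIX shell; the function name keep_last and the temporary file name are made up for illustration:

```shell
# keep_last FILE [N]: cut FILE down to its last N lines (default 500).
# Hypothetical helper; its variables are global, as plain sh has no "local".
keep_last() {
    file=$1
    n=${2:-500}
    # Arithmetic expansion strips any blanks that wc pads its output with.
    lines=$(( $(wc -l < "$file") ))
    if [ "$lines" -gt "$n" ]; then
        # Write the tail to a temporary file, then replace the original.
        tail -n "$n" "$file" > "$file.tmp.$$" && mv "$file.tmp.$$" "$file"
    fi
}
```

It would be called as keep_last file 500. Note that mv gives the file a new inode, so a process that still has the old file open keeps writing to the old one.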

MadeInGermany · May 5, 2013, 5:22am

Most efficient is to read the file once and store the lines in a circular buffer.
An attempt with awk

awk '
{s[i++]=$0}
{i=i%500}
(i in s){print s[i]}
' file

Perl with its compact arrays might save some bytes of memory.
But there is certainly an option in head or tail ...

Jotne · May 5, 2013, 5:58am

This removes the last 500-1 lines. He wants to keep the last 500 lines, not delete them.

MadeInGermany · May 5, 2013, 7:19am

Oh sorry, then it's simply

tail -500 file

For academic interest, the circular buffer can still be used, but it must be printed at the very end:

awk '
{s[NR%n]=$0}
END {
for (i=NR+1;i<=NR+n;i++) {if (i%n in s) print s[i%n]}
}
' n=500 file
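
The same ring-buffer idea, wrapped in a shell function so it can be tried on a small file. This is only a sketch, assuming a POSIX awk (on Solaris that would be nawk or /usr/xpg4/bin/awk); the name last_n is made up:

```shell
# last_n N FILE: print the last N lines of FILE using a ring buffer in awk.
last_n() {
    awk -v n="$1" '
        { buf[NR % n] = $0 }    # slot NR%n always holds the newest line
        END {
            # Oldest line still in the buffer: NR-n+1, or 1 for short files.
            start = (NR > n) ? NR - n + 1 : 1
            for (i = start; i <= NR; i++) print buf[i % n]
        }
    ' "$2"
}
```

For the ten-line file t shown earlier, last_n 3 t prints eight, nine and ten. The file is read once and at most N lines are held in memory.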

Scrutinizer · May 5, 2013, 9:17am

Not just academic interest. With some implementations of tail the internal buffer is so small that it may not be able to handle 500 lines...

--
Alternative circular buffer (on Solaris use /usr/xpg4/bin/awk rather than awk):

awk 'NR>n{sub("[^" RS "]*" RS,x,buf)} {buf=buf $0 RS} END{printf "%s",buf}' n=500 file

--
Without a buffer:

awk 'NR==FNR{next} FNR==1{m=NR-1} FNR>m-n' n=500 file file

or

awk 'NR>m-n' m="$(wc -l<file)" n=500 file
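
Both buffer-free variants print every line whose number is greater than m-n, where m is the total line count. The same arithmetic also works with sed, which is roughly what the original question had in mind. A sketch only, assuming a POSIX shell; the name tail_by_sed is made up:

```shell
# tail_by_sed FILE [N]: print the last N lines of FILE (default 500)
# by deleting the first m-N lines with sed, where m is the line count.
tail_by_sed() {
    n=${2:-500}
    # Arithmetic expansion strips any blanks that wc pads its output with.
    m=$(( $(wc -l < "$1") ))
    if [ "$m" -gt "$n" ]; then
        sed "1,$(( m - n ))d" "$1"
    else
        cat "$1"    # N lines or fewer: nothing to delete
    fi
}
```

Like tail, this only prints; the output still has to be redirected to a new file to change anything on disk.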

Yoda · May 5, 2013, 10:19am

Another approach:

[ $( wc -l < file ) -gt 500 ] && tac file | head -500 | tac

MadeInGermany · May 5, 2013, 12:00pm

For academic interest, I have bug-fixed my 2nd sample.
It also makes use of NR instead of an extra variable.

Scrutinizer presented a fix for Jotne's 1st sample

awk '(s-NR)<500' s="$(wc -l <file)" file

The bug occurred with certain versions of wc that print leading spaces.
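
The "awk: can't open 664" message reported earlier follows from shell word splitting: with an unquoted command substitution the padding separates s= from the number, so awk sees 664 as a file name. A small demonstration, where count_args is a made-up helper and the padded value only simulates such a wc:

```shell
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

# Simulated output of a wc that pads the count with leading blanks.
padded='     664'

# Unquoted: the shell splits this into three words: s=, 664 and file.
count_args s=$padded file

# Quoted: s= and the padded number stay together as one word.
count_args s="$padded" file
```

The first call reports 3 arguments, the second 2, which is why quoting the substitution fixes the awk command.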

RudiC · May 5, 2013, 3:47pm

$ tac file | awk 'NR<=500' | tac

alister · May 6, 2013, 12:32am

printf '%s\n' 1,-500d w q | ed -s file 2>/dev/null

Regards,
alister

ailnilanjan · May 6, 2013, 1:43am

Thanks all..

I see many responses!! ..

I will try them one by one and respond by the end of the day.

Thank you all again...

ailnilanjan · May 7, 2013, 3:10am

tac is not found on my box:

man tac
No manual entry for tac.

The one below worked perfectly, and it is a really smart way indeed:

printf '%s\n' 1,-500d w q | ed -s file 2>/dev/null