retain last 1000 lines in a file

I have a large file with around 100k+ lines. I want to retain only the last 1000 lines in that file. One way I thought of was:

 
tail -1000 filename > filename1
mv filename1 filename

But there should be a better solution. Is there a way I can use sed or some such command to change the same file so that just the last 1000 lines are retained? That way we don't need to copy the output to another file and rename it back.

Many Thanks

No, you can't use a command to change the same file. You will have to put the desired output, after applying the proper filters, in a separate file and then rename it to the original one... :D

Thank you, Reebot!

But is there any faster operation than tail that can help out here?

The approach will depend on the Operating System and the related tools, whether the file is currently open by a process (like syslogd), and whether it is a normal unix text file suitable for processing with shell tools.

It's worth looking at the actual size of 1000 records, because most versions of "tail" are limited in how much data they will buffer (you could, for example, ask for 1000 lines and only get 300).

If we assume that the file is a static, normal unix text file and is not open by another process, my first inclination would be to use "wc" to count the records and then "sed" to output the required number of records to a temporary file.
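A minimal sketch of that wc-then-sed approach (assuming a static file named `filename`; the `.tmp` suffix for the temporary file is arbitrary):

```shell
# Count the lines, then have sed print only the last 1000 into a temp file.
total=$(wc -l < filename)
start=$((total - 1000 + 1))
if [ "$start" -lt 1 ]; then start=1; fi   # file may already have fewer than 1000 lines
sed -n "${start},\$p" filename > filename.tmp
mv filename.tmp filename
```

It still goes through a temporary file and a rename, but the counting avoids any tail buffering limit.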

1 Like

Try the following:

sed -e :a -e '$q;N;1001,$D;ba'  file_name

Hope this works for you....:b::b:

I'm thinking your issue is more I/O related than an issue with tail; the bottleneck may be the write speed rather than accessing the last 1000 lines of the file. You could even read the lines into memory and then spit them back out to overwrite your file, but be careful, as that may bite you if the file is too big.
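A hedged sketch of that read-into-memory idea, shell-only (assumptions: the last 1000 lines fit comfortably in memory, and the file is plain text; note that command substitution strips trailing newlines, so any trailing blank lines would be lost):

```shell
# Hold the last 1000 lines in a shell variable, then overwrite the file in place.
buf=$(tail -1000 filename)        # read the tail of the file into memory
printf '%s\n' "$buf" > filename   # truncate and rewrite the original file
```

This avoids the explicit temp-file-and-rename step, at the cost of buffering everything in a variable.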

Otherwise, tail is pretty fast, since the alternatives would need to parse logic around your line population. Some are better than others, but tail is very simple in its approach, effectively $((EOF - integer)):

-> wc -l /opt/worklibs/tmp/i*114054.dat
  434641 /opt/worklibs/tmp/i_PPGLBL_20081231114054.dat

-> time sed -n "400000,1000p" /opt/worklibs/tmp/i_PPGLBL_20081231114054.dat >/dev/null

real    0m7.91s
user    0m3.96s
sys     0m3.94s

-> time tail -1000 /opt/worklibs/tmp/i_*114054.dat >/dev/null                                

real    0m0.03s
user    0m0.02s
sys     0m0.01s

-> time awk 'NR>=400000 && NR<=401001 {print $0;}' /opt/worklibs/tmp/i*114054.dat >/dev/null 

real    0m23.77s
user    0m23.06s
sys     0m0.71s

-> time nawk 'NR>=400000 && NR<=401001 {print $0;}' /opt/worklibs/tmp/i*114054.dat >/dev/null

real    0m4.57s
user    0m4.06s
sys     0m0.50s

1 Like

Or else you can use two commands in a single stretch:

sed -e :a -e '$q;N;1001,$D;ba' orig_file > copy_file; mv copy_file orig_file
1 Like

If you go for "tail", first run a quick test to see whether your tail command is suitable:

tail -1000 filename | wc -l

Hopefully the answer is 1000.

Thank you very much, Curleb, for that detailed explanation and info. I will be going with tail; that will serve my purpose.
Also thank you, Reebot and Methyl, for your answers.

Many Thanks & Regards.

You can edit a file in place with ed. To delete all but the last 1000 lines:

ed -s file <<'EOF'
1,-1000d
wq
EOF

or, more clumsily:

printf '1,-1000d\nwq\n' | ed -s file

Regards,
Alister