Select only the last line from the pattern

Hi,
I am really new to shell scripting, but it is really useful for me to learn.
I have one question.
I have a large text document (actually a few of them). Each contains between 10 and 20 lines with information about energies; the number varies from one document to another. My questions, using the bash shell, are:

  1. How can I select only the last one (to be printed to a new document)?
  2. How can I select the one before the last, and the one before that as well?

Thanks for any help

Use the head and tail commands to get the desired results.

last line: tail -1 filename.txt
line before the last: tail -2 filename.txt | head -1

You can also use sed to go to any particular line.
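
As a rough sketch of the sed variant mentioned above: `sed -n 'Np'` prints only line N, and `tail -n 1` is the POSIX spelling of `tail -1`. The file name sample.txt and its contents are made up for illustration.

```shell
# Create a small throwaway file (hypothetical data) to demonstrate on.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmpdir/sample.txt"

# Last line; "-n 1" is the POSIX form of the older "tail -1".
last=$(tail -n 1 "$tmpdir/sample.txt")

# Line before the last: take the last two lines, keep the first of them.
before_last=$(tail -n 2 "$tmpdir/sample.txt" | head -n 1)

# Any particular line with sed: -n suppresses the default output,
# and '3p' prints line 3 only.
third=$(sed -n '3p' "$tmpdir/sample.txt")

printf '%s\n' "$last" "$before_last" "$third"
rm -r "$tmpdir"
```

Note that all of these count lines of the whole file, so they only help once the lines of interest have been filtered out first.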

It might be important to note that I am not looking for the last line of the file; I am looking for the last energy line, which is far from the end of the document. Thanks

Could you please post your sample data and the output you desire?

You'll need to post a sample of your file in order for someone to answer your question. Assuming that the energy information spans multiple lines in the file, please post a sample of a complete set of the energy data and a few lines that precede and follow the data. Without these kinds of examples it is impossible to help.

Here is a file sample. Since I do not know how to attach a document here (if that is possible), I am attaching the Dropbox link.
https://dl.dropbox.com/u/63216126/6dialkene.out
If you grep "E(" you will find a series of energies. The name between the parentheses changes from file to file, and the number of energies varies from file to file. I need the last energy and the two before that.
Also, since I am attaching the file here, I would like to ask an extra question. If you grep "input orientation" you will also find a series of XYZ coordinates. How can I find and copy the last one and the two before that, also using a bash shell script, but separate from the energy script?
Thanks a million for the help.

Also, and just as important, show us your desired output. If pasted here, enclose with code tags to preserve formatting.

Regards and welcome to the forum,
Alister

You could put together your info and the hints given in this thread:

grep "E(" 6dialkene.out |tail -3|cut -d" " -f8

if you need the energies by themselves. If you need the whole line, drop the cut filter.

grep -i "input orientation" finds 16 occurrences, and the data for these fills screen after screen. Which ones do you need?

This definitely works, but I suspect the printing routine produces formatted, fixed-width output. If the number of blanks varies because, for instance, a two-digit value gets one more leading space than a three-digit value ("-234" vs. " -23"; I don't know whether such values are possible at all), your line might fail. I therefore suggest a slight modification (replace "<spc>" with a blank and "<tab>" with a literal tab character):

sed -n '/E(/ s/[<spc><tab>][<spc><tab>]*/<spc>/gp' 6dialkene.out | tail -3 | cut -d' ' -f6

The sed command does the same as the grep but also replaces every run of consecutive whitespace with a single space. This works regardless of how many blanks separate the fields.
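
An alternative sketch, not taken from the posts in this thread: awk splits fields on runs of blanks and tabs by default and ignores leading whitespace, so the energy should land in field 5 no matter how the line is padded. The energy lines are copied from the output quoted in this thread; the file name energies.out and the filler lines are made up for illustration.

```shell
# Build a small stand-in file (filler lines are hypothetical).
tmpdir=$(mktemp -d)
cat > "$tmpdir/energies.out" <<'EOF'
 some unrelated line
 SCF Done:  E(RB3LYP) =  -234.626899214     A.U. after    8 cycles
 more unrelated text
 SCF Done:  E(RB3LYP) =  -234.626900474     A.U. after    8 cycles
 SCF Done:  E(RB3LYP) =  -234.626900695     A.U. after    7 cycles
 trailing text
EOF

# index() does a literal substring match, so "E(" needs no regex escaping.
# Fields: $1=SCF $2=Done: $3=E(RB3LYP) $4== $5=<energy>
energies=$(awk 'index($0, "E(") { print $5 }' "$tmpdir/energies.out" | tail -n 3)

printf '%s\n' "$energies"
rm -r "$tmpdir"
```

This assumes every matching line has the same layout as the "SCF Done" lines shown here; a line that contains "E(" in some other position would print the wrong field.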

I hope this helps.

bakunin

Basically, I just need to create one script for each of the following outputs. Thanks a million for the help; it is really useful for me.

output 1 (Last energy):

SCF Done:  E(RB3LYP) =  -234.626900695     A.U. after    7 cycles

output 2 (-1 and -2 energies):

 SCF Done:  E(RB3LYP) =  -234.626899214     A.U. after    8 cycles
 SCF Done:  E(RB3LYP) =  -234.626900474     A.U. after    8 cycles

Output 3 (Last "input orientation"):
Input orientation:

      1          6           0       -0.399347   -5.262276    0.924477
      2          1           0        0.036285   -6.254346    0.851459
      3          1           0       -1.018756   -5.065127    1.796487
      4          6           0       -0.191339   -4.333337   -0.008437
      5          1           0        0.440298   -4.576219   -0.864753
      6          6           0       -0.747870   -2.936770    0.017425
      7          1           0       -1.402957   -2.809077    0.886816
      8          1           0       -1.372645   -2.775463   -0.873590
      9          6           0        0.357075   -1.853859    0.042264
     10          1           0        1.026052   -2.019980   -0.815204
     11          1           0        0.967516   -1.976710    0.944220
     12          6           0       -0.196861   -0.457411   -0.019460
     13          1           0       -0.788822   -0.220813   -0.905375
     14          6           0       -0.030242    0.478629    0.914665
     15          1           0        0.549025    0.287676    1.815162
     16          1           0       -0.460889    1.470485    0.814332

@bakunin: You're absolutely right about the unreliable number of spaces (e.g. one would expect the energies to be in field 6 rather than the -f8 that cut needs). I kept it on the simple side; your sed proposal is much safer here (that would be field 6 then, btw).

@hmartine1983: as you obviously only want entire lines, try this (shamelessly borrowing aashish.sharma8's proposals):
output 1:

grep "E(" 6dialkene.out |tail -1
 SCF Done:  E(RB3LYP) =  -234.626900695     A.U. after    7 cycles

output 2:

grep "E(" 6dialkene.out |tail -3|head -2
 SCF Done:  E(RB3LYP) =  -234.626899214     A.U. after    8 cycles
 SCF Done:  E(RB3LYP) =  -234.626900474     A.U. after    8 cycles

output 3:

grep -A20 "Input orientation" 6dialkene.out |tail -16

will result in the 16 lines requested.
These rely heavily on data structures being fixed as shown in your example; any deviation might need more sophisticated analysis in the way bakunin pointed out.
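
For the coordinate blocks, one possible sketch that avoids hard-coded line counts is to let awk collect the rows itself. It assumes that a coordinate row always has six whitespace-separated fields with an integer in the first one, as in the sample posted above, and that each block starts at a line containing "Input orientation". The function name last_blocks, the file coords.out, and all numbers in it are made up for illustration.

```shell
# last_blocks FILE N: print the coordinate rows of the last N
# "Input orientation" blocks found in FILE.
last_blocks() {
    awk -v keep="$2" '
        /Input orientation/ { n++; in_block = 1; block[n] = ""; next }
        # A coordinate row: six fields, the first one an integer.
        in_block && NF == 6 && $1 ~ /^[0-9]+$/ {
            block[n] = block[n] $0 "\n"; started[n] = 1; next
        }
        # Any other line after rows have started ends the current block.
        in_block && started[n] { in_block = 0 }
        END {
            first = n - keep + 1
            if (first < 1) first = 1
            for (i = first; i <= n; i++)
                printf "Input orientation (block %d of %d):\n%s", i, n, block[i]
        }
    ' "$1"
}

# Demonstration on a tiny made-up file with three two-atom blocks.
tmpdir=$(mktemp -d)
for k in 1 2 3; do
    printf ' Input orientation:\n ----------\n'
    printf ' Number     Number       Type             X           Y           Z\n'
    printf ' ----------\n'
    printf '      1          6           0        %s.000000    0.000000    0.000000\n' "$k"
    printf '      2          1           0        %s.500000    0.000000    0.000000\n' "$k"
    printf ' ----------\n SCF Done: (energy line omitted)\n'
done > "$tmpdir/coords.out"

result=$(last_blocks "$tmpdir/coords.out" 2)
printf '%s\n' "$result"
rm -r "$tmpdir"
```

Calling it with 3 instead of 2 would give the last block and the two before it; redirecting the output with `>` writes it to a new file. Header rows are skipped because their first field is not an integer.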

Thanks a lot to everyone, very helpful