Copy selective lines from text file

ajayram · May 10, 2011, 4:23pm

Hello,

I have a text file which I need to check for the presence of certain tags, and then copy the portion of text that follows each tag into another file. The tag matching can be done with grep, but I do not know how to copy selected lines from one file to another. Is it possible to do that?

I looked at some options. The "cp" command copies the entire file, which I do not want.

Can someone please help me out??

ahamed101 · May 10, 2011, 4:26pm

sample input and required output please

regards,
Ahamed

ajayram · May 10, 2011, 4:46pm

Input: a tagged dataset.

.I 1
.T
Preliminary Report-International Algebraic Language
.B
CACM December, 1958
.A
Perlis, A. J.
Samelson,K.
.N
CA581203 JB March 22, 1978  8:28 PM
.I 2
.T
Extraction of Roots by Repeated Subtractions for Digital Computers
.B
CACM December, 1958
.A
Sugai, I.
.N
CA581202 JB March 22, 1978  8:29 PM

and so on..

Output: for each .I tag, the details of the .T and .N tags should be stored in a separate file (2 files for the sample above).
For example,

File 1

.T
Preliminary Report-International Algebraic Language
.N
CA581203 JB March 22, 1978  8:28 PM

and file 2

.T
Extraction of Roots by Repeated Subtractions for Digital Computers
.N
CA581202 JB March 22, 1978  8:29 PM

ahamed101 · May 10, 2011, 5:02pm

Try this

awk '/^\.I/ {_2="file"$2".txt"} /^\.T|^\.N/{_1=$0;getline;print "echo -e \""_1"\\n"$0"\">>"_2 }' inputfile | bash

regards,
Ahamed

ajayram · May 10, 2011, 6:00pm

Hello,
It worked fine but then it threw up an error..

bash: line 4321: syntax error near unexpected token `('
bash: line 4321: `echo -e ".T\nAn Algorithm for the Blocks and Cutnodes of a Graph (Corrigendum)">>file2161.txt'

Any ideas?

Regards,
Ajay

frankkoenen · May 10, 2011, 6:38pm

Piping content into a shell can be dangerous, so avoid the pipe to "bash". Here is the suggested script, revised so that "awk" writes directly to the output files without the need to pipe to bash:

awk '
  /^\.I/ {
    _2="file"$2".txt"
  }
  /^\.T|^\.N/ {
    _1=$0
    getline
    print _1 "\n" $0 >> _2
  }
' inputfile
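
As a quick check, here is a minimal sketch that runs the same logic on the two sample records posted earlier in the thread. The name sample.txt is made up for the illustration, and the variables are renamed (out, tag) only for readability; the output names file1.txt and file2.txt follow from the .I numbers.

```shell
# Recreate the two sample records from the thread (sample.txt is an assumed name).
cat > sample.txt <<'EOF'
.I 1
.T
Preliminary Report-International Algebraic Language
.B
CACM December, 1958
.A
Perlis, A. J.
Samelson,K.
.N
CA581203 JB March 22, 1978  8:28 PM
.I 2
.T
Extraction of Roots by Repeated Subtractions for Digital Computers
.B
CACM December, 1958
.A
Sugai, I.
.N
CA581202 JB March 22, 1978  8:29 PM
EOF

# ">>" appends, so remove the results of any earlier run first.
rm -f file1.txt file2.txt

awk '
  /^\.I/ { out = "file" $2 ".txt" }   # new record: pick the output file
  /^\.T|^\.N/ {
    tag = $0                           # remember the tag line
    getline                            # read the line that follows the tag
    print tag "\n" $0 >> out           # write the tag and its value
  }
' sample.txt

cat file1.txt
```

After the run, file1.txt and file2.txt should each hold the .T and .N pairs of their record, matching the output requested above. Note that this still copies only one line per tag.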

ahamed101 · May 10, 2011, 11:17pm

Thanks frank.
Ajay try this

awk '/^\.I/ {_2="file"$2".txt"} /^\.T|^\.N/{_1=$0;getline;print _1"\n"$0 >> _2  }' infile

regards,
Ahamed
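
One more thing that may matter for a large dataset: the earlier error message mentions file2161.txt, so there are thousands of records, and each record opens a new output file. Some awk implementations limit how many files can be open at once, so it may be worth closing the previous file whenever a new .I line starts. A sketch of that variation, using a tiny made-up input (many.txt and its contents are placeholders, not real data):

```shell
# Tiny made-up input with three records (many.txt is an assumed name).
printf '%s\n' '.I 1' '.T' 'First title'  '.N' 'note 1' \
              '.I 2' '.T' 'Second title' '.N' 'note 2' \
              '.I 3' '.T' 'Third title'  '.N' 'note 3' > many.txt

# ">>" appends, so start from a clean slate.
rm -f file1.txt file2.txt file3.txt

awk '
  /^\.I/ {
    if (out != "") close(out)          # finished with the previous record
    out = "file" $2 ".txt"             # output file for the new record
  }
  /^\.T|^\.N/ {
    tag = $0
    getline
    print tag "\n" $0 >> out
  }
' many.txt

cat file3.txt
```

Because each file is written with ">>", closing it and moving on does not lose anything; at most one output file is open at any time.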

ajayram · May 11, 2011, 4:06am

@ ahamed101 :

The code is now working, thank you very much. But I have some other doubts which I would like to clarify with you.

I am running this on Fedora 14. When I gave this command, it said:

bash: cacm dataset/cacm.all: Permission denied

This file is located on a removable NTFS partition of my hard drive. I then tried to change the permissions of the file with the following command:

chmod 644 cacm\ dataset/cacm.all

But the permissions of the file do not change; even if I try as superuser, they stay the same. Then I copied the file to my home directory and, with the old permissions, it worked. Is there any way to get around this? Or do we need to copy the file to a Linux partition and only then work on it?

Some of the content in the .T tag does not fit on a single line, e.g. there is this one:

.T
The Problem of Programming Communication with
Changing Machines A Proposed Solution-Part 2
.B

So is it possible to copy multiple lines, that is, from the start of the .T tag to the start of the next tag, .B?

The script you have given me has a lot of awk scripting, and I am not very good at shell scripts. I thought of doing the job in C first, but then realized that a shell script would be faster. I tried grep first, but realized that grep is not very powerful and will not take me very far. I know very little awk. Could you please tell me where I can learn more?

ahamed101 · May 11, 2011, 8:42am

Not sure

awk '/^\.I/{_2="file"$2".txt"} /^\.T/{print $0>>_2; while((getline)>0 && !/^\./){print $0>>_2}} /^\.N/{x=$0; getline; print x"\n"$0>>_2}' infile

Check this

regards,
Ahamed
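
An alternative that avoids getline altogether is to keep a flag that says whether the current line belongs to a tag you want to copy. This is only a sketch, under the assumption that every tag line has a dot plus one capital letter as its first word and that continuation lines never do. The file name titles.txt, the record number 7 and the lines marked "placeholder" are made up for the illustration.

```shell
# Made-up record with a two-line title, modelled on the one quoted above.
cat > titles.txt <<'EOF'
.I 7
.T
The Problem of Programming Communication with
Changing Machines A Proposed Solution-Part 2
.B
placeholder date line
.N
placeholder note line
EOF

# ">>" appends, so remove the result of any earlier run first.
rm -f file7.txt

awk '
  $1 == ".I" {                         # new record: switch output file
    if (out != "") close(out)
    out = "file" $2 ".txt"
    copy = 0
    next
  }
  $1 ~ /^\.[A-Z]$/ {                   # any other tag line turns copying on or off
    copy = ($1 == ".T" || $1 == ".N")
  }
  copy { print >> out }                # tag line plus everything up to the next tag
' titles.txt

cat file7.txt
```

With this layout both lines of the title end up in file7.txt, followed by the .N tag and its line, and adding another tag to copy only means extending the comparison in the middle rule.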