I need to extract the following lines from this text and put them in different files.
From xxxx@gmail.com Thu Jun 10 21:15:46 2010
Return-Path: <xxxxx@gmail.com>
X-Original-To: xxx@localhost
Delivered-To:xxxx@localhost
Received: from ubuntu (localhost [127.0.0.1])
by ubuntu (Postfix) with ESMTP id 53FDD2575A
for <xxxxxx@localhost>; Thu, 10 Jun 2010 12:15:46 -0700 (PDT)
MIME-Version: 1.0
Received: from gmail-pop.l.google.com [xxxxx]
by ubuntu with POP3 (fetchmail-6.3.9-rc2)
for <xxxxxx@localhost> (single-drop); Thu, 10 Jun 2010 21:15:46 +0200 (CEST)
Received: by xxxxxx with HTTP; Thu, 10 Jun 2010 12:13:40 -0700 (PDT)
Date: Thu, 10 Jun 2010 21:13:40 +0200
Delivered-To: xxxxxxxr@gmail.com
Message-ID: <xxxxxxxxx@mail.gmail.com>
Subject: TOPIC
From: NAME <xxxxxxxx@gmail.com>
To: xxxxxxxxxxx@gmail.com
Content-Type: multipart/alternative; boundary=001485f1ea94fa4e4d0488b1d13c
X-Antivirus: avast! (VPS 100610-0, 10/06/2010), Inbound message
X-Antivirus-Status: Clean
--001485f1ea94fa4e4d0488b1d13c
Content-Type: text/plain; charset=ISO-8859-1
This is an exemple from text
--001485f1ea94fa4e4aaaa8b1d13c
Content-Type: text/html; charset=ISO-8859-1
This is an exemple from text
--001485f1ea94fa4e4aaaa8b1d13c--
From xxxx@gmail.com Thu Jun 10 21:15:46 2010
Return-Path: <xxxxx@gmail.com>
X-Original-To: xxx@localhost
Delivered-To:xxxx@localhost
Received: from ubuntu (localhost [127.0.0.1])
by ubuntu (Postfix) with ESMTP id 53FDD2575A
for <xxxxxx@localhost>; Thu, 10 Jun 2010 12:15:46 -0700 (PDT)
MIME-Version: 1.0
Received: from gmail-pop.l.google.com [xxxxx]
by ubuntu with POP3 (fetchmail-6.3.9-rc2)
for <xxxxxx@localhost> (single-drop); Thu, 10 Jun 2010 21:15:46 +0200 (CEST)
Received: by xxxxxx with HTTP; Thu, 10 Jun 2010 12:13:40 -0700 (PDT)
Date: Thu, 10 Jun 2010 21:13:40 +0200
Delivered-To: xxxxxxxr@gmail.com
Message-ID: <xxxxxxxxx@mail.gmail.com>
Subject: TOPIC
From: NAME <xxxxxxxx@gmail.com>
To: xxxxxxxxxxx@gmail.com
Content-Type: multipart/alternative; boundary=001485f1ea94fa4e4d0488b1d13c
X-Antivirus: avast! (VPS 100610-0, 10/06/2010), Inbound message
X-Antivirus-Status: Clean
--001485f1ea94fa4e4d0488b1d13c
Content-Type: text/plain; charset=ISO-8859-1
this text can be
1 or more lines
like this
--001485f1ea94fa4e4d0asdfadgad3c
Content-Type: text/html; charset=ISO-8859-1
this text can be
1 or more lines
like this
--001485f1ea94fa4e4d0asdfadgad3c--
I need an output file like this:
Subject: TOPIC
From: NAME <xxxxxxxx@gmail.com>
this text can be
1 or more lines
like this
If there is only one file, you can use the code below:
egrep "Subject|From|Text" infile
But judging from your sample, I am sure the text has more than one line. Is there any specific pattern in your text that you can check for?
This is not the best solution, but it will work.
For anything better, wait for the masters of awk and sed.
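A small caveat on the egrep line above: the pattern is not anchored, so on the sample it would also match the mbox separator line that starts with "From " (no colon), and "Text" does not match the body at all. A minimal sketch of a tighter variant follows; the function name headers_only is made up for illustration, and it only picks out the two header lines, not the message text.

```shell
# Hypothetical helper: print only the "Subject:" and "From:" header lines.
#   $1 = input file
headers_only() {
    # ^ anchors the match at the start of the line; the trailing colon
    # excludes the mbox separator line "From someone@... Thu Jun 10 ...".
    grep -E '^(Subject|From):' "$1"
}
```

Called as `headers_only infile`, this leaves the body text out, so it is only half of what was asked for.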
Can someone explain to me what exactly this code does?
Thanks.
I think it first searches for the text between "Content-Type.+plain" and "--" and prints it.
Then it searches for the lines containing Subject|From and puts them at the top. Does (f=0) mean it prints at line 0?
When a line matching "^Content-Type.+plain" is found, the next line is read into $0 by getline, and "f" is set to 1.
/^--/{f=0}
When a line matches "^--", "f" is set back to 0. So those two commands together set the "f" variable to 1 for all lines between "^Content-Type.+plain" and "^--" in the file. Next command:
/^(Subject|From)/||f{print}
checks whether the line starts with "Subject" or "From", or whether the "f" variable is true (anything other than 0). If either is the case, the line is printed. So the last command prints the lines matched by "^(Subject|From)" and those between "^Content-Type.+plain" and "^--", because the "f" variable was set to 1 for them.
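Putting the three commands described above back together, the script presumably looked something like the sketch below. This is a reconstruction under assumptions, not the original poster's exact code: the wrapper function extract_mail, the prefix argument, and the rule that starts a new output file at each mbox "From " separator are additions, made to cover the "different files" part of the question. The header pattern also has a colon added, because "^(Subject|From)" on its own would match the separator line "From xxxx@gmail.com Thu Jun 10 ..." as well.

```shell
# Hypothetical helper: write one summary file per message of an mbox-style file.
#   $1 = input file, $2 = prefix for the output files (prefix1.txt, prefix2.txt, ...)
# Assumes every message begins with an mbox "From " separator line.
extract_mail() {
    awk -v prefix="$2" '
        # A separator line starts a new message: switch to the next output file.
        /^From / { if (out) close(out); n++; out = prefix n ".txt"; f = 0 }
        # Start of the text/plain part: replace $0 with the next line, raise the flag.
        /^Content-Type.+plain/ { getline; f = 1 }
        # Any MIME boundary line lowers the flag again.
        /^--/ { f = 0 }
        # Print the two wanted headers, plus every line seen while f is set.
        /^(Subject|From):/ || f { print > out }
    ' "$1"
}
```

For example, `extract_mail infile mail` would create mail1.txt, mail2.txt and so on, each holding the Subject line, the From line and the plain-text body of one message. Note that in a real mailbox the headers of each MIME part are followed by a blank line, which getline would consume here instead of the first body line.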