bash dump raw email body txt from Maildir

unclecameron · December 1, 2011, 1:44am

I'm setting up a cronjob that will hopefully dump the contents of an email delivered to Maildir/new/ to /home/user/raw.txt (and then delete that email).

I could manually remove the bottom line of the raw email and then place the contents of the next line up in /home/user/raw.txt, but it seems to me there's probably a more elegant way to script mutt (or mail/pine/whatever) to do it, which would also parse html email if the user forgot. Raw sed/awk/whatever also doesn't seem very error tolerant. Is there a better way to do this? This is for an automated weather report I want to push to a website via a junk email account I've created on the webserver (that part is working fine).

I guess if I could figure out how to get procmail to do it directly, that'd work too. Right now I'm using this in postfix:

mailbox_command = /usr/bin/procmail -a "$EXTENSION" DEFAULT=$HOME/Maildir/ MAILDIR=$HOME/Maildir

zaxxon · December 1, 2011, 2:15am

I have only used procmail as a mail filter, so I don't know whether it has any editing capabilities or something similar.
If you want to cut out some lines of your raw email, it's best to post a snippet using code tags so we can maybe help with something like sed/awk etc.

unclecameron · December 1, 2011, 1:27pm

cat /home/user/Maildir/new/1322721060.6211_0.www

Return-Path: <user1@blah.com>
X-Original-To: blah@blah.com
Delivered-To: blah@blah.com
Received: from smtp178.blah.com (smtp178.blah.com [207.97.245.178])
    by hostname.com (Postfix) with ESMTPS id C10893DD2F
    for <blah@blah.com>; Wed, 30 Nov 2011 22:30:59 -0800 (PST)
Received: from smtp27.blah.com (localhost.localdomain [127.0.0.1])
    by smtp27.blah.com (SMTP Server) with ESMTP id 04736118FC4
    for <blah@blah.com>; Thu,  1 Dec 2011 01:30:54 -0500 (EST)
X-SMTPDoctor-Processed: csmtpprox 2.7.4
Received: from localhost (localhost.localdomain [127.0.0.1])
    by smtp27.blahcom (SMTP Server) with ESMTP id F3BCC118FC5
    for <blah@blah.com>; Thu,  1 Dec 2011 01:30:53 -0500 (EST)
X-Virus-Scanned: OK
Received: by smtp27.blah.com (Authenticated sender: blah-AT-blah.com) with ESMTPSA id 78556118FC4
    for <blah@user.com>; Thu,  1 Dec 2011 01:30:53 -0500 (EST)
From: "blah <blah1@user.com>
To: <blah@user.com>
Subject: test4
Date: Wed, 30 Nov 2011 22:30:52 -0800
Message-ID: <00bc01ccaff2$c6b88670$54299350$@com>
MIME-Version: 1.0
Content-Type: text/plain;
    charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Mailer: Microsoft Office Outlook 12.0
Thread-Index: Acyv8sWFTCt6aRGGS22J2iL97tXyVw==
Content-Language: en-us

This is the text I need, it may possibly span two lines or more
but it will always have a blank line above and below it.

The closing code tag truncated the blank line after the text I need, but it's really there.

I keep thinking mailx/mutt should somehow be able to dump the body to something.txt, but I just can't seem to figure out how.

zaxxon · December 2, 2011, 9:40am

$ awk 'NR==2 {printf("%s",$0)}' RS= infile
This is the text I need, it may possibly span two lines or more
but it will always have a blank line above and below it.
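
For the cron job itself, that one-liner could be wrapped roughly like this. It is only a minimal sketch under some assumptions: the function name dump_new_mail and the ".tmp" suffix are made up for illustration, the two paths are passed in as arguments, and only the first paragraph of the body is kept. It uses print instead of printf so the output file ends with a newline.

```shell
# Hypothetical helper: dump_new_mail NEW_DIR OUT_FILE
# Writes the first body paragraph of each message in NEW_DIR to OUT_FILE,
# then deletes the message. With several messages, the last one in glob
# order is what remains in OUT_FILE.
dump_new_mail() {
    new_dir=$1
    out=$2
    for msg in "$new_dir"/*; do
        # Skip the unexpanded pattern (empty directory) and non-regular files.
        [ -f "$msg" ] || continue
        # Record 1 is the header block, record 2 the first body paragraph.
        # Write to a temp file first; delete the mail only if that worked.
        awk 'NR==2 { print; exit }' RS= "$msg" > "$out.tmp" &&
            mv "$out.tmp" "$out" &&
            rm -- "$msg"
    done
}
```

A script run from cron would then end with something like: dump_new_mail "$HOME/Maildir/new" "$HOME/raw.txt"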

unclecameron · December 3, 2011, 1:13am

hey, that's great Zaxxon, thanks

I don't understand how the NR==2 works, can you explain what's happening with that? I've been a bit fuzzy on awk, just trying to understand it better. Thanks again

zaxxon · December 4, 2011, 4:13am

No problem^^

NR is the awk built-in variable that holds the number of records read so far, so while a record is being processed it is that record's number. With the default settings every line is a record, so NR is simply the current line number.
If you have more than one file as input, there is also FNR, which is the record number within the file currently being processed: NR keeps counting across all files, while FNR starts again at 1 for each file.
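
To see the difference between the two counters, something like the following can be tried on any two small files. The function name nr_vs_fnr is just made up for this sketch.

```shell
# Hypothetical helper: print NR and FNR next to every line of the given files.
nr_vs_fnr() {
    awk '{ printf "NR=%d FNR=%d %s\n", NR, FNR, $0 }' "$@"
}
```

Called with two files, NR keeps increasing over both of them, while FNR goes back to 1 on the first line of the second file.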

RS is the "Record Separator" variable, which stores what separates records. By default that is a newline "\n", so every line is a record. Setting it to an empty string (RS=) puts awk into paragraph mode: it behaves roughly as if RS were "\n\n+", so one or more empty lines separate records.
The headers form the first block and the text of the mail is the second block, so we just check NR==2; when that evaluates to true, the commands inside {} are executed. I used printf because print would append a newline after the record, and printf("%s",$0) leaves that out.
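
To check how awk splits a particular mail into records, a small sketch like this can help. The function name show_records is made up for illustration. Setting FS to a newline is an addition to the original one-liner, so that NF counts the lines in each record.

```shell
# Hypothetical helper: list each paragraph-mode record of a file with its
# record number, its line count, and its first line.
show_records() {
    # RS= enables paragraph mode; FS='\n' makes every line one field.
    awk -v RS= -v FS='\n' \
        '{ printf "record %d (%d lines): %s\n", NR, NF, $1 }' "$1"
}
```

On a mail like the one posted above, record 1 should be the whole header block and record 2 the wanted text. If the body has further paragraphs (a signature, for example), they show up as records 3 and up, which NR==2 ignores.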