Problems with sed/awk/grep and line endings

Hello
I have created the following script, which is designed to manipulate a text document:

#!/bin/sh
# Get 3 lines, (last of which is "Quantity"); adjust order; put all three on one line with tabs.
FILENAME=~/Desktop/email.txt
LIST=$(grep -B2 "Quantity" ${FILENAME} |awk 'BEGIN { FS = "\n"; RS = "--"; } 
{ if ($4 != "")
 printf ( $4 "\t" $2 "\t" $3 "\n") 
 else { printf ($3 "\t" $1 "\t" $2 "\n") }
 } 
 END { }');

# Remove "Quantity :" and "Price : ".
LIST=$(echo -e $LIST |sed s/"Quantity: "//g);
LIST=$(echo -e $LIST |sed s/"Price: "//g);
echo -e $LIST > email2.txt;

#Remove asterisks; get range of text between strings; remove up to colon on each line.
ADDRESS=$(cat $FILENAME |sed s/\*//g);
ADDRESS=$(echo -e $ADDRESS |sed -n '/BILLING DETAILS/,/DELIVERY DETAILS/p');
ADDRESS=$(echo -e $ADDRESS |awk -F: '{ printf $2 "\n" }');
echo -e $ADDRESS >> email2.txt;

I have a number of problems with it.

  1. The \t tabs and \n newlines in the printf section of awk don't get written to the file, only spaces.
  2. The last three commands don't seem to work in the script, though they seem to work individually on the command line. Again, echoing to the Terminal displays linefeeds, but echoing to the file in the script does not produce linefeeds, just spaces.
  3. Some of the text being processed is a few lines of "***********". When these lines are present, I end up with a directory listing in my final output file.

Can anyone explain why these problems are happening and how to stop them? Thanks.

The script processes two parts of a text file in two different ways. The second half (between two strings) should just be written to a new file without everything up to and including a colon on each line.
The first half takes three lines (the 3rd of which starts "Quantity: "), rearranges their order and then removes some text. I seem to have an off-by-one error for the first item, which is why there's an if..then.

Hope this makes sense! (Oh yes, I'm running OS X 10.6.3)

Always quote variable references, and you only need one call to sed:

echo -e "$LIST" | sed -e 's/Quantity: //g' -e 's/Price: //g' > email2.txt

UUOC.

ADDRESS=$(sed 's/*//g' "$FILENAME" );
echo -e "$ADDRESS" |awk -F: '/BILLING DETAILS/,/DELIVERY DETAILS/ { printf $2 "\n" }' >> email2.txt
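
To illustrate why the quotes matter for the tabs and linefeeds, here is a minimal sketch. The sample records are made up and are not taken from the original email. An unquoted variable is split into words on spaces, tabs and newlines, and echo then joins those words with single spaces.

```shell
#!/bin/sh
# Hypothetical sample data: two records with tabs between the fields.
LIST=$(printf 'Widget\t2\t9.99\nGadget\t1\t4.50\n')

# Unquoted: the shell splits $LIST on spaces, tabs and newlines, and echo
# rejoins the resulting words with single spaces.
unquoted=$(echo $LIST)

# Quoted: the value is passed as a single argument, whitespace intact.
quoted=$(printf '%s\n' "$LIST")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```

The unquoted version comes out as one space-separated line, while the quoted version keeps both lines and their tabs.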

Many thanks. That's made the code simpler for starters.
It's also sorted out the problems of the linebreaks in the second bit; but the lack of tabs and linebreaks in the awk command is still there.

Any thoughts on why?

#!/bin/sh
# Get Items from first half of email and sort the groups of lines
FILENAME=~/Desktop/email.txt

LIST=$(grep -B2 "Quantity" ${FILENAME} |awk 'BEGIN { FS = "\n"; RS = "--"; } 
{ if ($4 != "")
 printf ( $4 "\t" $2 "\t" $3 "\n") 
 else { printf ($3 "\t" $1 "\t" $2 "\n") }
 } 
 END { }');
 
echo -e "$LIST" | sed -e 's/Quantity: //g' -e 's/Price: //g' > email2.txt
echo -e $LIST > email2.txt;
 
#Get Addresses from second half
ADDRESS=$(sed 's/*//g' "$FILENAME" );
echo -e "$ADDRESS" |awk -F: '/BILLING DETAILS/,/DELIVERY DETAILS/ { printf $2 "\n" }' >> email2.txt

Can you also explain the significance of asterisks in the text file being turned into a directory listing?

UUOC? Ah. Useless use of cat. :o

The braces around FILENAME don't do anything; the variable should be quoted: "$FILENAME"
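
A small sketch of the difference between braces and quotes, using a made-up file name that contains a space (the name and the variable names other than FILENAME are only for illustration):

```shell
#!/bin/sh
FILENAME='my email.txt'   # hypothetical name containing a space

plain="$FILENAME"
braced="${FILENAME}"      # expands to exactly the same value as $FILENAME

# Braces only matter when the name would otherwise run into following text.
backup="${FILENAME}_bak"

# Quotes are what keep a value with spaces together as one argument.
set -- ${FILENAME};   unquoted_count=$#
set -- "${FILENAME}"; quoted_count=$#

printf 'same value: %s\n' "$( [ "$plain" = "$braced" ] && echo yes || echo no )"
printf 'backup name: %s\n' "$backup"
printf 'words unquoted: %s, quoted: %s\n' "$unquoted_count" "$quoted_count"
```

With or without braces, the unquoted form is split into two words; only the quoted form stays as one.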

Why is that second echo there? Why is $LIST unquoted?

An unquoted asterisk is expanded by the shell to the names of all (non-hidden) files in the current directory.
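
As a rough demonstration, assuming a scratch directory with two made-up files (the directory template and file names are hypothetical), the following shows the expansion happening, and two ways to avoid it: quoting the variable, or switching pathname expansion off with set -f.

```shell
#!/bin/sh
# Work in a scratch directory so the listing is predictable.
dir=$(mktemp -d "${TMPDIR:-/tmp}/globdemo.XXXXXX") || exit 1
cd "$dir" || exit 1
touch a.txt b.txt

LINE='*********'

# Unquoted: the run of asterisks is a pattern that matches every file name.
unquoted=$(echo $LINE)

# Quoted: the asterisks are passed through literally.
quoted=$(printf '%s\n' "$LINE")

# set -f turns pathname expansion off, so even an unquoted $LINE is not
# globbed (it is still split into words on whitespace, though).
set -f
noglob=$(echo $LINE)
set +f

printf '%s\n' "$unquoted" "$quoted" "$noglob"

cd / && rm -r "$dir"
```

Quoting is usually the simpler fix; set -f is mainly useful where word splitting is wanted but globbing is not.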


Ah. OK. I have seen this in other scripts, so thought it was necessary. Interestingly, removing the braces has fixed the lack of tabs and linefeeds, so perhaps braces in this form DO do something.

Sorry, that's a vestigial command from days gone by. I've removed it.

The script now seems to be working as it should. Many thanks.

So, short of removing asterisks in the target file as I have done, how do you escape this behaviour, to deal with text files/strings that contain asterisks?

One last thing: Do I need to make any changes if the input $FILENAME isn't the name of a file, but is simply a string containing all the data?

Quote them.

The same way as you used "$LIST".
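
For the case where the data is a string and not a file, one possible approach is to send the quoted variable to the command on standard input. This is only a sketch: the sample text and the DATA variable are invented, and the awk call is a slight variation on the one in the script (it splits on colon plus space and skips lines without a colon).

```shell
#!/bin/sh
# Hypothetical sample standing in for the email text, held in a variable.
DATA='Order summary
BILLING DETAILS
Name: Jane Doe
City: Springfield
DELIVERY DETAILS
Name: John Doe'

# Feed the quoted variable to awk on standard input instead of naming a file.
ADDRESS=$(printf '%s\n' "$DATA" |
  awk -F': ' '/BILLING DETAILS/,/DELIVERY DETAILS/ { if (NF > 1) print $2 }')

printf '%s\n' "$ADDRESS"
```

Using printf '%s\n' here avoids relying on echo -e, whose behaviour differs between shells.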

It all seems to be working now, although I'm still having trouble caused by asterisks in the text. How can I stop the shell from interpreting "*********" as an instruction to display a folder listing?