Problem with sed and awk

jisha · April 9, 2008, 8:14am

Hi All,

I have a problem.
I have two files as below:

1.txt contains

-----
-----
-----
column  1,  "cat",
column  24,  "dog",
column  100,  "rat",
-----
-----
-----

2.sh should contain

-----
-----
-----
awk 'BEGIN { printf ("%1s","cat")}' 
awk 'BEGIN { printf ("%24s","dog")}'
-----
-----
-----

When I execute 2.sh I should get the formatted, printed output.

I have tried the following code:

awk '/column/ { print <> }' format.txt > we
where the <> is awk 'BEGIN .....
The problem with this command is that I cannot get the lines above or below the match,
i.e. only the lines containing "column" are replaced and written to 2.sh.

If I use sed, I cannot get the fields from the file 1.txt,
i.e. I cannot extract the "%1s" or "%100s" values from 1.txt.

Is there any way we can overcome these two problems?

Thanks in advance
JS

Franklin52 · April 9, 2008, 9:05am

If I understand the question you can do something like:

awk '{gsub(",|\"","");printf("%*s\n", $2, $3)}' 1.txt

Regards

jisha · April 10, 2008, 12:03am

Hi Franklin52,

Thank you for the reply.
The code doesn't give me the output I need, but I think a bit of tuning can make it work.

Can you explain how the code works here?

Thanks a lot
JS

Franklin52 · April 10, 2008, 2:40am

The gsub function removes the commas and the quotes.
The printf function uses the value of $2 for the width option to format the output.
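
As a rough illustration of the two steps above, here is the one-liner run on two of the sample lines from the first post. The sample input and the variable name out are only for this sketch. Note that the "*" width in "%*s" is accepted by common awks such as gawk, but it is an assumption that every awk supports it.

```shell
# Feed two sample "column" lines through the one-liner.
# gsub() strips commas and double quotes from the whole line, which
# re-splits the fields, so $2 is the width and $3 is the bare word.
out=$(printf '%s\n' 'column  1,  "cat",' 'column  24,  "dog",' |
  awk '{ gsub(",|\"", ""); printf("%*s\n", $2, $3) }')

# Show the result: the second word is right-aligned in a 24-character field.
printf '%s\n' "$out"
```

The first line comes out as plain cat, because a width of 1 is smaller than the word itself.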

Regards

era · April 10, 2008, 2:43am

Understanding the context of this problem would probably help. Your first awk script specifically looks only for lines containing "column"; if that's not what you want, why do you do that?

Getting a number which is always the second field on a line with sed is not at all impossible.

sed -e 's/^[^ ]* *\([1-9][0-9]*\), *\("[^"]*"\),.*/"%\1s", \2/'

Franklin's code simply removes the commas and the double quotes (from between the fields, and around the third field, but for simplicity anywhere else too), then uses the selected fields in a printf directly.
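
A small sketch of what the sed expression above does, on made-up input. The "row" line is hypothetical and is there to show that "^[^ ]*" matches any first word, not specifically "column"; lines that do not fit the pattern pass through unchanged.

```shell
# Three sample lines: a "column" line, a line with a different first word,
# and a line that does not fit the pattern at all.
out=$(printf '%s\n' 'column  24,  "dog",' 'row 7, "emu",' 'some other line' |
  sed -e 's/^[^ ]* *\([1-9][0-9]*\), *\("[^"]*"\),.*/"%\1s", \2/')

# \1 is the number, \2 is the quoted string including its quotes.
printf '%s\n' "$out"
```

Since the s command is applied to every input line, every matching line is rewritten, not only the first one.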

jisha · April 10, 2008, 2:56am

I am specifically looking for lines that begin with "column" and replacing those lines with an "awk" command.

In short, if I have a line:
column 1, "cat",
I need to replace this line with:
awk 'BEGIN { printf ("%1s","cat")}' (given that no change should occur to the other lines in the file, including the line numbers)

I need a parser to convert these lines.

Thanks to all
JS

era · April 10, 2008, 3:08am

What does this mean then? You want to print the whole file but replace those lines which are "column" lines?

sed -e 's/^column *\([1-9][0-9]*\), *\("[^"]*"\),.*/awk '"'"'BEGIN { printf ("%\1s", \2) }'"'/"

Still, a bigger context would be useful for understanding the problem and your attempt at a solution, as it seems highly inefficient to create umpteen simple awk scripts with only a BEGIN part.
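
For what it's worth, a minimal end-to-end sketch of 1.txt >> sed >> 2.sh using the expression above. The contents of 1.txt here are hypothetical: shell comments stand in for the unrelated lines so that the generated 2.sh can actually be run.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical 1.txt: comment lines stand in for the unrelated content.
cat > 1.txt <<'EOF'
# header
column  1,  "cat",
column  24,  "dog",
# footer
EOF

# Rewrite only the lines that start with "column"; everything else is copied as-is.
sed -e 's/^column *\([1-9][0-9]*\), *\("[^"]*"\),.*/awk '"'"'BEGIN { printf ("%\1s", \2) }'"'/" 1.txt > 2.sh

# Inspect the generated script, then run it.
cat 2.sh
result=$(sh 2.sh)
```

The line count of 2.sh stays the same as 1.txt, since each "column" line is replaced by exactly one awk line.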

jisha · April 10, 2008, 3:12am

sed -e 's/^[^ ]* *\([1-9][0-9]*\), *\("[^"]*"\),.*/"%\1s", \2/'

The above code seems to work,
but I have some doubts here.
Where is this code searching for "column" (though it worked on the first column statement)?
I have many statements beginning with "column". The sed here works for the first "column" found. How can I make it work for all the "column" lines?

Thanks in advance
JS

jisha · April 10, 2008, 3:14am

Exactly !!

1.txt >> parser >> 2.sh

This "parser" is where I am getting stuck.

era · April 10, 2008, 3:17am

Scroll back, I managed to edit my previous reply while you were responding to it.

jisha · April 10, 2008, 5:06am

Hello era,

I tried using the code below, as you gave it:
cat 1.txt | sed -e 's/^[^ ]* *\([1-9][0-9]*\), *\("[^"]*"\),.*/awk '"'"'BEGIN {printf("%\1s", \2)}'"'"'>> 2.sh/' > temp.txt

This worked for the first occurrence of "column".

Whereas when I used the code given below, there was no change in 1.txt and 2.sh.

sed -e 's/^column *\([1-9][0-9]*\), *\("[^"]*"\),.*/awk '"'"'BEGIN { printf ("%\1s", \2) }'"'/"

I made some changes to the above code, and even then there was no difference between the two files.

I think I can go ahead with the first code. But can you tell me:

How is it searching for the first occurrence of "column"?
How can I make the code search for every occurrence of "column" and replace it?

Thanks for the help ...
JS

era · April 10, 2008, 5:12am

It's not searching for the first occurrence. The updated code contains precisely the modification to only match lines which start with "column". Probably there is something in the regular expression which doesn't match your examples exactly. Worked on the samples I copy/pasted from above.

jisha · April 10, 2008, 5:19am

The format of 1.txt is somewhat like this:

fish in the water
print
column 1, "cat",
column 24, "dog",
column 100, "rat"hill and jill
run another hat
print column 23, "elephant"I commit to work
print
column 45, "friend"
column 34, " mother"
The junk stuff I added are other lines of the file 1.txt

era · April 10, 2008, 7:19am

And you specifically want to turn this into six tiny awk scripts so it won't run too fast?

Sarcasm aside, Franklin's solution looks much closer to what you really ought to be doing.

awk '$1 ~ "column"{gsub(",|\"","");printf("%*s\n", $2, $3)}' 1.txt

The sed script I wrote before assumes -- based on the examples you gave earlier -- that "column" would always be right at the start of a line. If it's not really, you want to add a " *" after the "^". Lesson: don't obfuscate your examples too much.
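
A quick sketch of that " *" change, on lines adapted from the sample above (the indentation and the trailing comma on the "friend" line are assumptions for the demo). Note that the pattern still requires a comma right after the closing quote, so a line like column 100, "rat"hill and jill is left alone.

```shell
# " *" after "^" lets the match start after any leading spaces.
out=$(printf '%s\n' '   column 45, "friend",' 'run another hat' 'column 100, "rat"hill and jill' |
  sed -e 's/^ *column *\([1-9][0-9]*\), *\("[^"]*"\),.*/awk '"'"'BEGIN { printf ("%\1s", \2) }'"'/")

# Only the first line is rewritten; the other two are printed unchanged.
printf '%s\n' "$out"
```

The leading spaces are part of the match, so they are dropped from the rewritten line.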

jisha · April 10, 2008, 7:52am

Thanks era for pointing that out... it's indeed a lesson.
I still don't have a solution, though I am working on it.

jisha · April 10, 2008, 7:58am

But Franklin's solution directly runs the script and gives the output, which is not what I need.

1.txt > temp.sh > 2.sh
When 2.sh runs I should get the output, not when temp.sh runs.
I need the solution to be written in temp.sh so that it accesses 1.txt, replaces certain lines and writes to 2.sh. Then I am supposed to run 2.sh to get the formatted printing.

Franklin's solution jumps directly to the output. I don't get a 2.sh!

Hope I made it clear.

era · April 10, 2008, 8:09am

echo 'BEGIN {' >temp.awk
 awk '$1 ~ "column"{gsub(",|\"","");
  printf("printf (\"%%*s\\n\", \"" $2 "\", \"" $3 "\");\n") }' 1.txt >>temp.awk
echo '}' >>temp.awk

This creates an awk script, not a shell script. I'm sure you can figure out how to adapt it if you absolutely want a shell script.
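
One possible adaptation, as a sketch only: let awk emit shell printf commands instead of awk statements. It assumes the third field is a single word without shell-special characters, since the quotes are stripped and re-added blindly; the sample input and the variable name out are made up for the demo.

```shell
# For each "column" line, emit a shell printf command.
# In the awk format string, %% is a literal percent sign, %d is the width
# from $2, and \\n leaves a literal backslash-n for the shell's printf.
out=$(printf '%s\n' 'print' 'column 1, "cat",' 'column 24, "dog",' |
  awk '$1 == "column" { gsub(",|\"", ""); printf("printf \"%%%ds\\n\" \"%s\"\n", $2, $3) }')

# Show the generated commands; redirect this to 2.sh to keep them.
printf '%s\n' "$out"
```

Each generated line has the form printf "%24s\n" "dog", so running the result with sh prints the padded words one per line.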

jisha · April 11, 2008, 3:59am

I am still facing problems with this.
I am thinking of another approach.

Keeping the same file format (I have written it in bold -- scroll up), can I search for the patterns "print" and "column" using a single grep (or any other search command) in the file 1.txt and write the matches to "temp.txt"?

Thanks

era · April 11, 2008, 4:03am

egrep '(print|column)' 1.txt >temp.txt
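
The same filter, sketched with grep -E (the POSIX spelling of egrep) on a few of the sample lines. The parentheses are optional here, and the match is a plain substring match, so a line containing "print" or "column" anywhere is kept.

```shell
# Keep only the lines that contain "print" or "column".
out=$(printf '%s\n' 'fish in the water' 'print' 'column 1, "cat",' 'run another hat' |
  grep -E 'print|column')

printf '%s\n' "$out"
```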

jisha · April 11, 2008, 5:20am

Hi,

After using the above command and a bit of tuning, I made the file format as below:

  print

# Print first some lines #
column 1, "cat",
column 9, rat,
column 56, goat

# this is a comment statement #
print column 46, naming function
print heaven
print hell
print
column 23, market"&&&",
column 30, fill
######
print
column 55, descrition,

Can you help me get some variables as below:
var1="print column 1, "cat", column 9, rat, column 56, goat"
var2="print column 46, naming function"
var3="print heaven"
-----
var5="print column 23, market"&&&",column 30, fill"

I don't want the comment statements, and I can remove them. In short, I need to select from the first "print" up to (but excluding) the next "print" and store that in a variable.

I can do the storing, but the cutting is not successful. I tried using awk, but that too did not give the desired output.
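
One way the cutting might be approached, as a sketch under assumptions: a statement starts at any line whose first word is "print", comment lines start with "#", and the pieces are joined with a single space. It prints one statement per line, which can then be read into variables. The file below is a shortened, hypothetical copy of the format shown above.

```shell
# Work in a scratch directory so nothing existing is overwritten.
cd "$(mktemp -d)" || exit 1

cat > temp.txt <<'EOF'
  print

# Print first some lines #
column 1, "cat",
column 9, rat,
column 56, goat

# this is a comment statement #
print column 46, naming function
print heaven
print
column 23, market"&&&",
column 30, fill
######
EOF

out=$(awk '
  /^[ \t]*#/ || /^[ \t]*$/ { next }        # skip comment and blank lines
  { sub(/^[ \t]+/, "") }                   # trim leading blanks
  $1 == "print" {                          # a new statement starts here
    if (rec != "") print rec               # flush the previous statement
    rec = $0
    next
  }
  { rec = (rec == "" ? $0 : rec " " $0) }  # continuation line: append it
  END { if (rec != "") print rec }         # flush the last statement
' temp.txt)

printf '%s\n' "$out"
```

With this input the first line printed is print column 1, "cat", column 9, rat, column 56, goat and the last is print column 23, market"&&&", column 30, fill.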

Thanks