I'm trying to parse COBOL code to combine variables into one string. I have two variable names that get literals moved into them and I'd like to use sed, awk, or similar to find these lines and combine the variables into the final component. These variable names are always VAR1 and VAR2. For example, I'd like to find these:
MOVE 'LIT1' TO VAR1
MOVE 'LIT2' TO VAR2
...and print this:
LIT1LIT2
This could happen multiple times in one file. For example, if these were the entries:
some junk
MOVE 'LIT1' TO VAR1
other junk
MOVE 'LIT2' TO VAR2
more junk
more junk
MOVE 'LIT3' TO VAR1
MOVE 'LIT4' TO VAR2
junk again
MOVE 'LIT5' TO VAR1
more junk
more junk
MOVE 'LIT6' TO VAR2
junk
I'd like to print this:
LIT1LIT2
LIT3LIT4
LIT5LIT6
What tells you that only two MOVE statements get combined?
MOVE 'LIT1' TO VAR1.
Do all lines end with a period? Probably not.
Try this to start with:
egrep '(VAR1$|VAR1\.$|VAR2$|VAR2\.$)' mycobol.cob |
awk '{ printf("%s", $2); if(FNR%2==0) {print} }' | tr -d \'
cat /your/file | sed -rn "/MOVE '[^']*' TO VAR[12]/p" | sed -r "/1$/N; s/MOVE '([^']*)' TO VAR1\nMOVE '([^']*)' TO VAR2/\1\2/"
Hope this helps.
Hey Jim,
Thanks, that is really close; I'm getting the output below. How can I print a new line instead of the second grep output (i.e. 'MOVE LIT2 TO VAR2') and a new line?
LIT1LIT2MOVE LIT2 TO VAR2
LIT3LIT4MOVE LIT4 TO VAR2
LIT5LIT6MOVE LIT6 TO VAR2
-Jay
Hey chebarbudo,
My sed doesn't like the -r option; I get 'sed: illegal option -- r'.
---------- Post updated at 05:27 PM ---------- Previous update was at 05:24 PM ----------
Jim,
Also, I forgot to answer your questions.
There are always 2 move statements with the same variable names as a coding convention our programmers use.
Right, there will not always be a period at the end of the statement.
-Jay
---------- Post updated at 05:33 PM ---------- Previous update was at 05:27 PM ----------
Thanks Jim, I got it to work by using this:
egrep '(VAR1$|VAR1\.$|VAR2$|VAR2\.$)' $program |
awk '{ printf("%s", $2); if(FNR%2==0) {printf("\n")}}' | tr -d \'
LIT1LIT2
LIT3LIT4
LIT5LIT6
Thanks again!
-Jay
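For anyone reading this later, the extra text in the first attempt came from the bare print: with no arguments, awk's print outputs the whole current record ($0), so every second line was appended after the literals. Here is a minimal sketch that reproduces both behaviours on two sample lines (the sample data and the variable names lines, with_print and with_newline are only for illustration):

```shell
# Two sample lines as egrep would pass them on (illustrative data only).
lines="MOVE 'LIT1' TO VAR1
MOVE 'LIT2' TO VAR2"

# A bare print emits $0, so the second MOVE line follows the literals.
with_print=$(printf '%s\n' "$lines" |
  awk '{ printf("%s", $2); if (FNR%2==0) { print } }' | tr -d \')

# printf("\n") emits only the newline.
with_newline=$(printf '%s\n' "$lines" |
  awk '{ printf("%s", $2); if (FNR%2==0) { printf("\n") } }' | tr -d \')

echo "$with_print"
echo "$with_newline"
```

The first echo should show the same kind of line Jay reported, and the second only the combined literals.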
Hey Jay,
I'm glad you already found the answer. Could you still tell me whether this new syntax works?
cat /your/file | sed -n "/MOVE '[^']*' TO VAR[12]/p" | sed "/1$/N; s/MOVE '\([^']*\)' TO VAR1\nMOVE '\([^']*\)' TO VAR2/\1\2/"
With awk's -F option, you don't need the tr command.
awk -F[\'\ ] '{ printf("%s", $3); if(FNR%2==0) {printf("\n")}}'
Hey chebarbudo,
Yes, that worked as well.
cat /myfiles/test.cob | sed -n "/MOVE '[^']*' TO VAR[12]/p" | sed "/1$/N; s/MOVE '\([^']*\)' TO VAR1\nMOVE '\([^']*\)' TO VAR2/\1\2/"
LIT1LIT2
LIT3LIT4
LIT5LIT6
Also, thanks for the tip rdcwayx.
Banking on the fact that all literals are in single quotes, you could also use:
grep -o "'.*'" infile|xargs -n2|tr -d ' '
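One thing to watch with this approach: it pairs up every quoted literal in the file, not only the ones moved to VAR1 and VAR2, so a stray quoted string elsewhere would shift the pairing. A possible variation, assuming the MOVE statements end the line (with or without a period) and that grep supports -o, is to filter the lines first. The file contents and the names infile and paired below are made up for the example:

```shell
# Sample file (hypothetical contents) with one unrelated quoted literal.
infile=$(mktemp)
cat > "$infile" <<'EOF'
DISPLAY 'HELLO'
MOVE 'LIT1' TO VAR1
MOVE 'LIT2' TO VAR2.
EOF

# Keep only the VAR1/VAR2 MOVE lines, then extract and pair the literals.
paired=$(grep -E 'TO VAR[12]\.?$' "$infile" | grep -o "'.*'" | xargs -n2 | tr -d ' ')
echo "$paired"
rm -f "$infile"
```

Note that tr -d ' ' would also strip spaces inside a literal, so this only suits literals without embedded blanks.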
What about:
awk -F[\'\ ] '/VAR1/{x=$3}/VAR2/{print x $3}' inFile
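A slightly more defensive take on the same idea, offered only as a sketch: splitting on the single quote alone makes $2 the literal regardless of indentation, and anchoring the patterns to the end of the line allows for the optional period Jay mentioned. The function name combine_moves and the sample file are hypothetical, and the patterns assume nothing but spaces follows the variable name.

```shell
# Sample input modelled on the first post (temporary file, illustrative only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
some junk
MOVE 'LIT1' TO VAR1
other junk
MOVE 'LIT2' TO VAR2.
more junk
       MOVE 'LIT3' TO VAR1
       MOVE 'LIT4' TO VAR2
junk again
EOF

# Split on single quotes so $2 is the literal whatever the indentation.
# Remember the VAR1 literal and print the pair when the VAR2 line arrives.
combine_moves() {
  awk -F"'" '
    /MOVE .* TO VAR1\.? *$/ { x = $2 }
    /MOVE .* TO VAR2\.? *$/ { print x $2; x = "" }
  ' "$1"
}

combine_moves "$sample"
```

Unlike the FNR%2 version, this does not depend on the matching lines arriving in strict pairs; a VAR2 line with no preceding VAR1 simply prints its own literal.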