Multiple line match using sed

Please help!

Input pattern, where ... could be any number of lines
struct A {
Blah1
Blah2
Blah3
...
} B;

output pattern
struct AB {
Blah1
Blah2
Blah3
...
};

I need help extracting everything between { and }.
If it were all on a single line, { \(.*\)} should have worked.

perl -0777 -pe 's/\A[^\{]*\{//s; s/\}.*?\{/\n/sg; s/\}[^\}]*\Z//s'

... assuming there is no nesting of braces. Put in a better separator than just a newline if you like. The \A and \Z patterns match beginning of file and end of file, respectively; the middle substitution is the meat of the program.

Thanks era, but I am not seeing expected output using perl command

$ perl -0777 -pe 's/\A[^\{]*\{//s; s/\}.*?\{/\n/sg; s/\}[^\}]*\Z//s' test

Blah1
Blah2
Blah3
...

So it looks like your script is extracting everything between { and }.

Is there any way that this can be cooked in sed?

I was thinking along the lines of
sed -e "/^struct.*{.*/,/.*}.*/s/^struct[[:blank:]]*\(.*\)[[:blank:]]*{\(.*\)}\(.*\);/struct \1\3 {\2};/g"

This is where I have a problem extracting the multiline "\2" and "\3".

I know I am not an advanced sed user, and it needs some N/H kind of magic :)

You are contradicting yourself. If that's not what you want, then what do you want?

Probably it could, if you are handy with sed, but I would not go there. (I have, more times than I care to remember, but it's so much easier in Perl.)
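For the record, a sed-only version of the A/B merge from the first post might look roughly like the sketch below. It is only a sketch under several assumptions: GNU sed, no nested braces, the struct keyword at the start of a line, and the closing brace at the start of its own line. The function name merge_names and the sample input are made up for illustration.

```shell
# Hypothetical sample input.
input='int x;
struct A {
Blah1
Blah2
} B;
int y;'

merge_names() {
  # On a "struct ... {" line, keep appending lines with N until the
  # pattern space ends in "} name;". Then one substitution can see the
  # whole block: \1 is the name before {, \2 the body (with its final
  # newline), \3 the name after }.
  sed '/^struct.*{/{
:a
/\n}[^;]*;$/!{
N
ba
}
s/^struct[[:blank:]]*\([^[:blank:]{]*\)[[:blank:]]*{\(.*\n\)}[[:blank:]]*\([^;]*\);$/struct \1\3 {\2};/
}'
}

out=$(printf '%s\n' "$input" | merge_names)
printf '%s\n' "$out"
```

The loop with N is what lets a single s command see several lines at once; without it, sed only ever looks at one line per cycle, which is why the single-line attempt cannot fill \2 and \3.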

As an aside, you don't need to specify leading and trailing .* wildcards; sed will find the requested pattern anywhere on the line anyway (and in fact you are making it a bit harder for it).
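A quick way to convince yourself of this is to print the same range with and without the extra wildcards. The sample input and variable names below are made up for the illustration.

```shell
# Hypothetical sample input.
input='int x;
struct A {
Blah1
} B;
int y;'

# Address patterns are unanchored unless you write ^ or $ yourself,
# so the leading and trailing .* add nothing.
with_wildcards=$(printf '%s\n' "$input" | sed -n '/^struct.*{.*/,/.*}.*/p')
without_wildcards=$(printf '%s\n' "$input" | sed -n '/^struct.*{/,/}/p')

printf '%s\n' "$without_wildcards"
```

Both commands select the same three lines, from the struct line through the closing brace line.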

sed -n '/{/,/}/{
s/^[^{]*{//
s/}[^}]*$//
p
}'

But you said this is not what you want, so you will need to explain what you do want.

... or maybe

sed -n '/{/,/}/{
/{/d
/}/d
p
}'

would be more to your liking. Feel free to replace either { or } with some sort of separator, again.
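To see how the two sed variants differ, here they are side by side on a made-up input. The variable names and the sample text are assumptions for the illustration only.

```shell
# Hypothetical sample input.
input='int x;
struct A {
Blah1
Blah2
} B;
int y;'

# Variant 1: keep every line in the range, but cut away everything up
# to { on the opening line and everything from } on the closing line.
# The brace lines survive as empty lines.
trimmed=$(printf '%s\n' "$input" | sed -n '/{/,/}/{
s/^[^{]*{//
s/}[^}]*$//
p
}')

# Variant 2: delete the brace lines outright and print what is left.
dropped=$(printf '%s\n' "$input" | sed -n '/{/,/}/{
/{/d
/}/d
p
}')

printf '[%s]\n' "$trimmed" "$dropped"
```

The first form is the safer choice if real content can share a line with a brace; the second gives cleaner output when the braces always sit on lines of their own.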

era, thanks for these scripts.
Sorry about the confusing first comment.
I was trying to get the output pattern
struct AB {
Blah1
Blah2
Blah3
...
};

and in the process I was unable to extract and replace the multiline pattern from the input
struct A {
Blah1
Blah2
Blah3
...
} B;

even when I tweak my sed command to include \n
sed -e "/^struct.*{.*/,/.*}.*/s/^struct[[:blank:]]*\(.*\)[[:blank:]]*{\(.*\)}\(.*\);/struct \1\3 {\2};/g"

As you can see, I am not proficient with multiline pattern handling; that's why I am seeking help from the sed experts in this forum.

So the beef is the B after the closing brace, and you want that lifted up there before the opening brace? Or what?

perl -0777 -pe 's/(\s*\{.*?\})\s*(\S+);/\2\1;/sg'

That's right.
I have to check before and after the { } and combine A and B (and leave the rest of the file intact).

Thanks a lot for your perl script, it works! I will try to make it work with actual header file

Good Morning ERA,
Your perl magic works fine. It needs a bit of tweaking in my actual test case
perl -0777 -pe 's/(\s*\{.*?\})\s*(\S*);/\2\1;/sg'

Also, irrespective of
struct A{
...
}B;
or enum A{
...
}B;
It moves B up in both cases. Could you please help me make it work only for the struct scenario? Thanks

Tighten up the regular expression for the opening brace. The \s* means whitespace so you'd want the "struct" keyword there in front of that.

perl -0777 -pe 's/struct(\s*\{.*?\})\s*(\S*);/struct \2\1;/sg' $file

worked only for struct, leaving enum intact.

You can lift the struct inside the parens to reduce the duplication, too.

Thanks era. Good tips and Great help.

One more twist:
if A exists (is non-empty), don't move B, just remove it.

Logic:
if A == ""
then
    move B before {
else
    delete B

The reason for doing this is to handle a scenario like
struct A{
...
}A_T;
Your earlier conversion script is able to handle
typedef struct{
...
}A;
and
struct A{
...
};
In all three cases the output has to be
struct A{
...
};

So far we have only been doing substitutions. Adding if-then-else logic would require a quite different script. Maybe you could rephrase this using different rules? Like if A is typedef then remove A? That would still be easy to squeeze in.

perl -0777 -pe 's/(?:typedef\s+)?struct(\s*\{.*?\})\s*(\S*);/struct \2\1;/sg'

The (?:...) is just like a regular set of parentheses, except any match is not assigned to \1 or \2, so I didn't have to change the rest of the script. And the trailing ? says the whole thing is optional.

I was confused about putting the struct inside the parens; it needs to go before \2, so I'm keeping it separate after all.

Or you could post-process the output if you can tell after the fact which ones are wrong.

Sorry, I wasn't reading properly. This is a quite separate case so you can create a different program for that.

perl -0777 -pe 's/(struct\s+\S+\s*\{.*?\})(\s*\S*);/\1;/sg'

That will remove B if you have a "struct A { ... } B" and not touch any others -- it only substitutes if there is a "struct A" part.