thefang
February 24, 2014, 10:33am
1
Dear Unix Forums,
I am hoping you can help me with a pattern matching problem.
What am I trying to do?
I want to replace multiple lines of a text file (that match a multi-line pattern) with a single line of text. These patterns can span several lines and do not always have the same number of line breaks in between.
Example input file
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
One of the simpler patterns may look like this (pseudocode; meaning that only parts of the line are relevant and that the number of line breaks can vary):
^ * @CAL RtlInitAnsiString @PA1 0x0012f740 * ((1 to 3 line breaks)) * @CAL memmove @PA1 0x0012f740 * $
The matching text should be replaced with a string (e.g. "@MATCH ").
In the end, the file should look like this:
Desired output
@MATCH //replaced lines 1 and 2
@MATCH //replaced lines 3 to 5, including the irrelevant BlaCall
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
Current solution for adjacent lines only:
sed 'N;s/\@LIB.*\@CAL RtlInitAnsiString .*\@PA1 0x0012f740.*\n.*\@CAL memmove \@PA1 0x0012f740.*/\@MATCH/' inputfile
Unfortunately, this does not seem to work across several line breaks (i.e. when there is a "gap" between the lines containing RtlInitAnsiString and memmove).
Stuff I tried that didn't match anything:
perl -pne 'BEGIN {undef $/} s/\@LIB.*\@CAL RtlInitAnsiString \@PA1 0x0012f740.*\@CAL memmove \@PA1 0x0012f740.*/\@MATCH/' inputfile
perl -0pe 's/^\@LIB.*\@CAL RtlInitAnsiString .*\@PA1 0x0012f740.*\@CAL memmove \@PA1 0x0012f740.*$/\@MATCH/gm' inputfile
perl -0pe 's/^\@LIB*\@CAL RtlInitAnsiString *\@PA1 0x0012f740*.*\@CAL memmove \@PA1 0x0012f740*$/\@MATCH/s' inputfile
Any ideas how to get this kind of multi-line pattern matching to work? I'd prefer sed or perl, but awk is fine too.
Thanks in advance!
Klashxx
February 24, 2014, 12:04pm
2
From python shell:
>>> import re
>>> text = '''
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... '''
>>>
>>> pattern = re.compile('^.*?@CAL RtlInitAnsiString @PA1 0x0012f740.*?@CAL memmove @PA1 0x0012f740.*?$',re.MULTILINE|re.DOTALL)
>>>
>>> print(re.sub(pattern, '@MATCH', text))
@MATCH
@MATCH
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
I am not working with Perl right now, but I'm sure the translation should be pretty straightforward.
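A caveat worth checking before adopting the pattern above: `.*?` combined with `re.DOTALL` can cross any number of newlines, so the expression sets no upper limit on the gap between the two calls. The following is a minimal Python 3 sketch of that behaviour. The `FillerCall` line and the variable names are made up for illustration and do not come from the original log.

```python
import re

START = '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0'
END = '@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc'
FILLER = '@LIB ADVAPI32.dll @CAL FillerCall @PA1 0x0 @RET 0'  # hypothetical log line

# Same expression as in the post above, written as a raw string.
pattern = re.compile(
    r'^.*?@CAL RtlInitAnsiString @PA1 0x0012f740.*?@CAL memmove @PA1 0x0012f740.*?$',
    re.MULTILINE | re.DOTALL)

# One start line, ten unrelated lines, one end line.
text = '\n'.join([START] + [FILLER] * 10 + [END])

result = pattern.sub('@MATCH', text)
print(result)  # prints: @MATCH (all twelve lines were replaced)
```

If the gap has to be bounded, the number of lines between the two calls needs to be counted explicitly rather than left to `.*?`.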
2 Likes
What should the output be for the input:
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
(which is your sample input file with line 2 removed)? Is line 1 kept in the output as is, or should lines 1 through 4 be changed to a single @MATCH output line?
What should happen if there are more than 3 newlines between @CAL RtlAnsiStringToUnicodeString and @CAL memmove if there are no other occurrences of @CAL RtlAnsiStringToUnicodeString between them?
1 Like
We need a response to Don's questions, as it looks like there can be a lot of varying cases.
The requirement analysis is not complete.
With the data you have provided so far, try this:
awk '
/@CAL RtlInitAnsiString @PA1 0x0012f740/{s=1}
s && /@CAL memmove @PA1 0x0012f740/{ print "@MATCH"; s=0; next }
!s' infile
--ahamed
1 Like
thefang
February 25, 2014, 4:07am
5
Thank you all for your replies!
In response to Don's questions: This input file...
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
...should turn into this (minimal "destruction"):
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@MATCH
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
If there are more than 3 newlines between the first and second part of the pattern, nothing should happen. The replacement should only be performed as long as the "maximum gap" (in this case 3) is not exceeded. So if the input file looked like this:
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
...the script should NOT replace the large block of "RtlAnsiStringToUnicodeString".
Thanks again!
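The rules stated above (a bounded gap, and a repeated start line restarting the search so that as little as possible is replaced) can also be sketched as a line-by-line scan instead of a regex. This is only an illustration in Python 3, not one of the solutions posted in the thread. The function name `collapse`, its parameters, and the reading of "gap" as the number of lines strictly between the two calls are assumptions.

```python
START = '@CAL RtlInitAnsiString @PA1 0x0012f740'
END = '@CAL memmove @PA1 0x0012f740'


def collapse(lines, start, end, max_between=3, marker='@MATCH'):
    """Replace each start..end block with marker if at most
    max_between lines lie strictly between the two (assumed meaning of gap)."""
    out = []
    buf = []  # lines of the current candidate block; empty when outside one
    for line in lines:
        if start in line:
            out.extend(buf)   # an earlier candidate never completed: keep it
            buf = [line]      # restart the search at this start line
        elif buf and end in line:
            out.append(marker)  # complete block: emit the marker only
            buf = []
        elif buf:
            buf.append(line)
            if len(buf) - 1 > max_between:  # gap exceeded: give up, keep lines
                out.extend(buf)
                buf = []
        else:
            out.append(line)
    out.extend(buf)  # unfinished candidate at end of input
    return out


sample = [
    '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0',
    '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0',
    '@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc',
    '@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc',
    '@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS',
]
result = collapse(sample, START, END)
print('\n'.join(result))
```

On the five-line sample this keeps the first RtlInitAnsiString line, emits one @MATCH, and keeps the last line. A start line followed by ten unrelated lines is left untouched.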
Klashxx
February 25, 2014, 8:03am
6
Interesting regex (python shell again):
>>> text = '''
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... '''
>>>
>>> pattern = re.compile(r'''
... ^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n
... (?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){1,3}
... ^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+
... ''', re.X|re.M|re.S)
>>>
>>> print(re.sub(pattern, '@MATCH', text))
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@MATCH
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
>>>
>>> text2 = '''
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... @LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
... @LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
... '''
>>> print(re.sub(pattern, '@MATCH', text2))
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@MATCH
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
1 Like
thefang
February 25, 2014, 9:32am
7
Thanks Klashxx,
your python script works!
I changed the pattern to...
>>> pattern = re.compile(r'''
... ^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n
... (?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){0,3}
... ^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+
... ''', re.X|re.M|re.S)
...so it also matches adjacent lines. Now I have to figure out how to turn this into a "one-liner" (I currently use "eval" to loop through a file of pattern-matching commands, mostly "sed") and what each part of the expression does (up to now, my scripting endeavors have been limited to rather basic stuff).
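For the question of what each part of the expression does, the same pattern can be written with inline comments, since `re.X` ignores unescaped whitespace and `#` comments inside the pattern. The sketch below is Python 3 and only annotates the expression from this post. The comments are one reading of it, and the names `lines`, `text` and `result` are illustrative.

```python
import re

pattern = re.compile(r'''
    ^[^\n]+ @CAL\sRtlInitAnsiString\s@PA1\s0x0012f740 [^\n]+ \n   # start line plus its newline
    (?:                                   # one gap line:
        (?! ^[^\n]+ RtlInitAnsiString )   #   it must not contain RtlInitAnsiString
        [^\n]+ \n                         #   consume the line and its newline
    ){0,3}                                # zero to three gap lines
    ^[^\n]+ @CAL\smemmove\s@PA1\s0x0012f740 [^\n]+   # end line, newline not consumed
    ''', re.X | re.M | re.S)

lines = [
    '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0',
    '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0',
    '@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc',
    '@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc',
    '@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS',
]
text = '\n'.join(lines) + '\n'

result = pattern.sub('@MATCH', text)
print(result, end='')
```

The negative lookahead is what keeps the first of two consecutive RtlInitAnsiString lines out of the match: the search can only succeed from the second one.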
Does anyone know how python compares to other approaches (awk, etc.) in terms of performance? The files I plan to analyze have upwards of 50,000 lines each and are matched against hundreds of single-line and multi-line patterns.
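Relative speed depends on the tool versions, the patterns, and the data, so measuring on representative input is more reliable than a general answer. As a hedged starting point, this Python 3 sketch times one substitution over a synthetic 50,000-line text. The synthetic layout (one matching block per ten lines) is an assumption chosen only to have something to measure, and no timing figures are claimed here.

```python
import re
import time

pattern = re.compile(r'''
    ^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n
    (?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){0,3}
    ^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+
    ''', re.X | re.M | re.S)

START = '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0'
OTHER = '@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc'
END = '@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc'

# Synthetic input: 5,000 blocks of ten lines, each with one matching sequence.
block = [START, OTHER, END] + [OTHER] * 7
lines = block * 5000
text = '\n'.join(lines) + '\n'

t0 = time.perf_counter()
result, count = pattern.subn('@MATCH', text)
elapsed = time.perf_counter() - t0

print(f'{len(lines)} lines, {count} replacements, {elapsed:.3f} s')
```

Running the awk and perl variants under `time` on the same file would give comparable figures for the other tools.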
Cheers
$
$
$ cat input1
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
$
$ perl -lne 'BEGIN {$in = 0; $count = 0}
if (/\@CAL RtlInitAnsiString/) {
if ($x[0] =~ /\@CAL RtlInitAnsiString/) {print foreach (@x); @x = (); $count=0}
push @x, $_; $in = 1; $count++
}
elsif (/\@CAL memmove/ and $x[0] =~ /\@CAL RtlInitAnsiString/ and $in) {
print "\@MATCH"; @x = (); $in = 0; $count = 0
}
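elsif ($in) {
$count++;
# gap too long: flush the buffer and keep the current line as well
if ($count > 3) {print foreach (@x); print; @x=(); $count=0; $in=0}
else { push @x, $_ }
}
else {print}
# print a start line that is still buffered at end of input
END {print foreach (@x)}
' input1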
@MATCH
@MATCH
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
$
$
$ cat input2
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
$
$ perl -lne 'BEGIN {$in = 0; $count = 0}
if (/\@CAL RtlInitAnsiString/) {
if ($x[0] =~ /\@CAL RtlInitAnsiString/) {print foreach (@x); @x = (); $count=0}
push @x, $_; $in = 1; $count++
}
elsif (/\@CAL memmove/ and $x[0] =~ /\@CAL RtlInitAnsiString/ and $in) {
print "\@MATCH"; @x = (); $in = 0; $count = 0
}
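elsif ($in) {
$count++;
# gap too long: flush the buffer and keep the current line as well
if ($count > 3) {print foreach (@x); print; @x=(); $count=0; $in=0}
else { push @x, $_ }
}
else {print}
# print a start line that is still buffered at end of input
END {print foreach (@x)}
' input2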
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@MATCH
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
$
$
$ cat input3
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
$
$ perl -lne 'BEGIN {$in = 0; $count = 0}
if (/\@CAL RtlInitAnsiString/) {
if ($x[0] =~ /\@CAL RtlInitAnsiString/) {print foreach (@x); @x = (); $count=0}
push @x, $_; $in = 1; $count++
}
elsif (/\@CAL memmove/ and $x[0] =~ /\@CAL RtlInitAnsiString/ and $in) {
print "\@MATCH"; @x = (); $in = 0; $count = 0
}
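elsif ($in) {
$count++;
# gap too long: flush the buffer and keep the current line as well
if ($count > 3) {print foreach (@x); print; @x=(); $count=0; $in=0}
else { push @x, $_ }
}
else {print}
# print a start line that is still buffered at end of input
END {print foreach (@x)}
' input3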
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@MATCH
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
$
$
$ cat input4
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
$
$
$ perl -lne 'BEGIN {$in = 0; $count = 0}
if (/\@CAL RtlInitAnsiString/) {
if ($x[0] =~ /\@CAL RtlInitAnsiString/) {print foreach (@x); @x = (); $count=0}
push @x, $_; $in = 1; $count++
}
elsif (/\@CAL memmove/ and $x[0] =~ /\@CAL RtlInitAnsiString/ and $in) {
print "\@MATCH"; @x = (); $in = 0; $count = 0
}
elsif ($in) {
$count++;
# gap too long: flush the buffer and keep the current line as well
if ($count > 3) {print foreach (@x); print; @x=(); $count=0; $in=0}
else { push @x, $_ }
}
else {print}
# print a start line that is still buffered at end of input
END {print foreach (@x)}
' input4
@MATCH
@MATCH
@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS
@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc
$
$
$
1 Like
Klashxx
February 25, 2014, 4:16pm
9
You should pick one tool and stick to it.
A starting point to give python a try:
#!/usr/bin/env python
import re
import sys

# Start line, up to three gap lines that are not another start line, then the end line.
pattern = re.compile(r'''
    ^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n
    (?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){0,3}
    ^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+
    ''', re.X|re.M|re.S)

# Read the whole file given as the first argument.
with open(sys.argv[1], 'r') as f:
    text = f.read()

tfilter = re.sub(pattern, '@MATCH', text)
print(tfilter)
sys.exit(0)
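Since the real task involves many patterns, one possible extension of the script above is to keep the rules in a list, compile each once, and apply them in order. This is a sketch under assumptions: the names `RULES`, `compile_rules` and `apply_rules`, and the second rule with its `@BLA` replacement, are hypothetical and only illustrate the shape.

```python
import re

# Hypothetical rule list: (regular expression, replacement) pairs.
RULES = [
    (r'^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n'
     r'(?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){0,3}'
     r'^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+', '@MATCH'),
    (r'^[^\n]+@CAL\sBlaCall[^\n]+', '@BLA'),  # made-up single-line rule
]


def compile_rules(rules):
    """Compile every expression once, so repeated use does not recompile."""
    return [(re.compile(rx, re.MULTILINE), repl) for rx, repl in rules]


def apply_rules(text, compiled):
    """Apply the rules in list order; later rules see earlier replacements."""
    for rx, repl in compiled:
        text = rx.sub(repl, text)
    return text


lines = [
    '@LIB ADVAPI32.dll @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 "CriticalSectionTimeout" @RET0',
    '@LIB ADVAPI32.dll @CAL memmove @PA1 0x0012f740 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc',
    '@LIB ADVAPI32.dll @CAL BlaCall @PA1 0x0012f741 @PA2 0x0012f68c @PA3 4 @RET 0x0012f8bc',
    '@LIB ADVAPI32.dll @CAL RtlAnsiStringToUnicodeString @PA1 0x7ffdfbf8 @PA2 0x0012f740 @PA3 FALSE @RET STATUS_SUCCESS',
]
text = '\n'.join(lines) + '\n'

result = apply_rules(text, compile_rules(RULES))
print(result, end='')
```

Because each rule runs over the output of the previous one, the order of the list matters: a multi-line rule should come before any single-line rule that would rewrite one of its lines.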
1 Like
Also a starting point to give awk a try:
awk \
-v S="@CAL RtlInitAnsiString @PA1 0x0012f740" \
-v E="@CAL memmove @PA1 0x0012f740" \
-v M="@MATCH" \
-v L=3 '
$0~S{if(R)print V;V=$0;R=FNR+L;next}
FNR==R{print V; R=V=x}
R&&$0~E {$0=M; R=V=x}
R{V=V"\n"$0;next}
1
END{if(V)print V}' $1
1 Like
thefang
February 26, 2014, 7:47am
11
Wow, three very nice approaches using three powerful tools (awk integrates best into the existing analysis scripts but the python and perl solutions look very promising too) - I will now test them on different and (much) larger sets of input files/patterns and then probably stick to the fastest one.
Thanks again, your input was extremely helpful and sure motivates me to get deeper into this kind of scripting.
Cheers!