We have data as below.
/* ------- pattern_1 --- */
kjfhas
/* ----------------- string ----------------- */
aadaew
/*--keyword-----*/
2134asdf
@@@@asdf
The requirement is to get the keywords between the comment delimiters (/* -- ... -- */), like below.
pattern_1
string
keyword
Code:
awk '/\/\*.*[a-z0-9_]+.*\*\// {print $3}' file
It works for the first 2 comment lines because they contain spaces. If there are no spaces in the comment line (like the 3rd one), it does not work. Any help?
The $3 does not fit then.
The simplest approach is a capture group and a reference to it.
sed can do it, and perl is the master:
perl -lne 'm#/\*.*?(\w+).*?\*/# and print $1' file
-l: strip the newline on input and print with a newline
-n: loop around the input, no default print
-e: next argument is perl code
m: match
#: delimiter
.*?: minimal "catch all"
( ): capture group
$1: reference to 1st capture group
\w: a "word" character
This would also show comment strings that are preceded or succeeded by other text:
pretext /* -- comment -- */ posttext
And
perl -lne 'print $1 while m#/\*.*?(\w+).*?\*/#g' file
would even print repeated comments:
pretext /* -- comment1 -- */ midtext /* -- comment2 -- */ posttext
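The sed variant mentioned above could look like the following sketch. It uses only POSIX basic regular expressions; the scratch file and the variable names input and keywords are made up for illustration. Because the leading .* is greedy, a line with several comments yields only the last one.

```shell
# Sample input from the question, written to a scratch file for illustration.
input=$(mktemp)
cat > "$input" <<'EOF'
/* ------- pattern_1 --- */
kjfhas
/* ----------------- string ----------------- */
aadaew
/*--keyword-----*/
2134asdf
@@@@asdf
EOF

# -n suppresses the default output; the trailing p prints only lines
# where the substitution succeeded.
# \( \) captures the first run of word characters after "/*",
# and \1 replaces the whole line with that capture.
keywords=$(sed -n 's#.*/\*[^[:alnum:]_]*\([[:alnum:]_]\{1,\}\).*\*/.*#\1#p' "$input")
printf '%s\n' "$keywords"
rm -f "$input"
```

With the sample data this should print pattern_1, string and keyword, one per line.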
Another possibility:
grep '/\*' file | tr -d '/ *-'
pattern_1
string
keyword
Hi @MadeInGermany, Thanks much for the useful information.
An extension of the requirement is to split the file into 3 separate files, as below:
pattern_1.txt
/* ------- pattern_1 --- */
kjfhas
string.txt
/* ----------------- string ----------------- */
aadaew
keyword.txt
/*--keyword-----*/
2134asdf
@@@@asdf
The awk code below works if there are spaces in the comment lines. If there are no spaces (like the 3rd one), it does not work.
awk '/\/\*.*[a-z0-9_]+.*\*\// {if (x) close(x); split($0,a," "); if (a[3] != "") {x=a[3]".jil"} else {next}} {if (x) print > x}' file
Can this be achieved in perl?
Sure.
perl -lpe 'if (m#/\*.*?(\w+).*?\*/#) { open(FH, ">", $1.".jil"); select FH; }' file
-p: loop around the input, default print
open FH, ">", $1.".jil": handle FH, for writing, filename $1.jil
.: string concatenation operator
select FH: use FH as default
awk '$0 ~ /^\/\*/ { outputFile=gensub(/[ \*\-\/]/, "","G" ) ".txt"}
{ print $0 > outputFile }
' file
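gensub is a GNU awk extension, so the command above needs gawk. A sketch that should behave the same way on other awks uses gsub on a copy of the line instead; it also closes each finished file and ignores lines before the first comment. The scratch directory and the variable names workdir, name and out are assumptions for illustration.

```shell
# Scratch directory with the sample input from the question.
workdir=$(mktemp -d)
cat > "$workdir/file" <<'EOF'
/* ------- pattern_1 --- */
kjfhas
/* ----------------- string ----------------- */
aadaew
/*--keyword-----*/
2134asdf
@@@@asdf
EOF

(
  cd "$workdir" || exit 1
  awk '
  /^\/\*/ {
      if (out != "") close(out)   # finish the previous output file
      name = $0
      gsub("[ */-]", "", name)    # drop spaces, asterisks, slashes, hyphens
      out = name ".txt"
  }
  out != "" { print > out }       # skip lines before the first comment
  ' file
)
ls "$workdir"
```

Like the original, this assumes the keyword itself contains none of the deleted characters.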
Here's a quick test of GNU awk on Linux.
awk '($1 ~ /^\/\*/){sub(/\/\* *--* */, "", $0); sub(/ *--* *\*\//, "", $0); print}' file
Work on the entire line because the number of fields is variable. sub(/pattern/, "replacement", field) is your friend.
/\/\* *--* */ looks for a literal forward slash / followed by a literal asterisk *, followed by zero or more space characters, followed by one or more hyphens, followed by zero or more space characters.
"" replaces the previous pattern with nothing.
$0 operates on the entire line.
The second sub(...) does similar, but it looks for zero or more spaces followed by one or more hyphen characters, followed by zero or more space characters, followed by a literal asterisk *, followed by a literal forward slash /.
Since these are just regular-expression substitutions, you could probably do this with sed as well. But you asked about awk, so that's how I answered.
N.B. Not all awks are created equal. I usually use nawk on Solaris, as the default awk there is very basic.
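For awks without GNU extensions, a sketch built on match(), RSTART and RLENGTH might be an option. It is untested on Solaris nawk, so treat that as an assumption. It takes the first run of word characters after "/*" and does not require the closing "*/". The scratch file and the names input, keywords and s are made up for illustration.

```shell
# Sample input from the question, written to a scratch file for illustration.
input=$(mktemp)
cat > "$input" <<'EOF'
/* ------- pattern_1 --- */
kjfhas
/* ----------------- string ----------------- */
aadaew
/*--keyword-----*/
2134asdf
@@@@asdf
EOF

# match() sets RSTART and RLENGTH for "/*", any non-word characters,
# and the first run of word characters; the sub() then strips the prefix.
keywords=$(awk 'match($0, /\/\*[^A-Za-z0-9_]*[A-Za-z0-9_]+/) {
    s = substr($0, RSTART, RLENGTH)
    sub(/^\/\*[^A-Za-z0-9_]*/, "", s)
    print s
}' "$input")
printf '%s\n' "$keywords"
rm -f "$input"
```

Since it works on the whole line rather than on fields, it does not depend on spaces inside the comment.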