Text parsing

Hi All!

Is it possible to convert text file:

to:

?

 awk -F"[:,]" '{if ($1~/@2 line/){print $1": <XO>"$2"<XC,\""$2"\",\"\",\"\",\"\",0,0,\"\">,<XO>"$3"\",\"\",\"\",\"\",0,0,\"\">"}else{print $0}}' file

I don't understand anything, but... it works! :))

Is it possible to save the output to a file?

ps. THANKS!

Add an output redirection at the end of the command:

awk '...' file > fileToSave
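
A minimal sketch of that redirection, assuming a small sample input with the layout used in this thread. The names `file` and `fileToSave` are only placeholders, and the awk program is a trivial stand-in for the real one-liner. Note that `>` creates the target or overwrites it if it already exists, while `>>` would append instead.

```shell
# Work in a scratch directory so nothing existing gets overwritten.
cd "$(mktemp -d)"

# Small sample input (contents taken from the example in this thread).
cat > file <<'EOF'
@1 line:Any text aaa
@2 line:tag1, tag2
@3 line:Any text bbb
EOF

# Send awk's standard output to a file instead of the terminal.
# fileToSave is an arbitrary name chosen for this sketch.
awk -F'[:,]' '{ print $1 }' file > fileToSave

# Inspect the saved result.
cat fileToSave
```

Do not redirect to the same file you are reading from, because the shell truncates the target before awk starts reading it.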

Great! You saved me a few days of work! I will check it on a very large file.

akshay@nio:/tmp$ cat file
@1 line:Any text aaa
@2 line:tag1, tag2
@3 line:Any text bbb
@1 line:Any text ccc
@2 line:tag3, tag4
@3 line:Any text ddd 
akshay@nio:/tmp$ awk '{print ( /@2 line/ ) ? $1":"sprintf(fmt,$2,$2) OFS sprintf(fmt,$3,$3) : $0}' FS='[:, ]' OFS=',' fmt='<XO>%s<XC,"%s","","","",0,0,"">' file
@1 line:Any text aaa
@2:<XO>line<XC,"line","","","",0,0,"">,<XO>tag1<XC,"tag1","","","",0,0,"">
@3 line:Any text bbb
@1 line:Any text ccc
@2:<XO>line<XC,"line","","","",0,0,"">,<XO>tag3<XC,"tag3","","","",0,0,"">
@3 line:Any text ddd 
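
In the transcript above, the space inside FS='[:, ]' also splits "@2 line", which is why $2 comes out as "line" and $3 as "tag1". A possible variant, offered only as a sketch: split on a colon or comma followed by optional spaces, so that $2 and $3 are the two tags. It is an assumption that this layout is the one actually wanted, since the target format is not shown in the thread.

```shell
# Sample lines from the thread, fed through a pipe instead of a file.
result=$(printf '%s\n' \
    '@1 line:Any text aaa' \
    '@2 line:tag1, tag2' \
    '@3 line:Any text bbb' |
awk -F'[:,] *' -v fmt='<XO>%s<XC,"%s","","","",0,0,"">' '
    # FS is ":" or "," plus any following spaces, so "@2 line" stays in $1.
    /^@2 line/ {
        print $1 ":" sprintf(fmt, $2, $2) "," sprintf(fmt, $3, $3)
        next
    }
    # Every other line is passed through unchanged.
    { print }
')

printf '%s\n' "$result"
```

With this separator the "@2" line keeps its "@2 line:" prefix and each tag is wrapped once, without the leading space that a plain [:,] split would leave in $3.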

Hello,

May I explain this one?

awk -F"[:,]" '{if ($1~/@2 line/){print $1": <XO>"$2"<XC,\""$2"\",\"\",\"\",\"\",0,0,\"\">,<XO>"$3"\",\"\",\"\",\"\",0,0,\"\">"}else{print $0}}' file

-F is the option that sets the field separator, which can be a single character or a regular expression. Here it is a regular expression: a bracket expression, which matches any one of the characters listed between the brackets.

[:,] therefore means that the separator is either : or ,

So awk splits each line into fields at every : and , like this:

@1 line:Any text aaa
@2 line:tag1, tag2

For example, printing the first field,

awk -F'[:,]' '{print $1}' file
@1 line
@2 line
@3 line
@1 line
@2 line
@3 line

Printing the second field,

awk -F'[:,]' '{print $2}' file
Any text aaa
tag1
Any text bbb
Any text ccc
tag3
Any text ddd
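
To see all the fields of one line at once, here is a small sketch that loops from 1 to NF and prints each field between square brackets. The brackets are just a display choice for this example; they make leading spaces visible.

```shell
# One sample line from the thread.
line='@2 line:tag1, tag2'

# Print every field with its index; NF is the number of fields on the line.
fields=$(printf '%s\n' "$line" |
    awk -F'[:,]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
# $1=[@2 line]
# $2=[tag1]
# $3=[ tag2]
```

Note that $3 keeps the space that follows the comma, because only : and , are separators here.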

Now that we have the fields, we can work with them to build the output we want:

if ($1~/@2 line/)
{
    print $1": <XO>"$2"<XC,\""$2"\",\"\",\"\",\"\",0,0,\"\">,<XO>"$3"\",\"\",\"\",\"\",0,0,\"\">"
}
else
{
    print $0
}

This condition tests whether the first field matches the regular expression between the slashes / /:

$1~/@2 line/

print $0 means printing the whole line.
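
To watch the match test on its own, here is a small sketch that stores the result of $1 ~ /@2 line/ in a variable (called m here, a name chosen only for this example) and prints it in front of each line. In awk a match expression evaluates to 1 when the field matches and 0 when it does not.

```shell
# Sample lines from the thread, fed through a pipe instead of a file.
flags=$(printf '%s\n' \
    '@1 line:Any text aaa' \
    '@2 line:tag1, tag2' \
    '@3 line:Any text bbb' |
awk -F'[:,]' '{
    m = ($1 ~ /@2 line/)   # 1 if the first field matches, otherwise 0
    print m, $0
}')

printf '%s\n' "$flags"
# 0 @1 line:Any text aaa
# 1 @2 line:tag1, tag2
# 0 @3 line:Any text bbb
```

Only the line flagged with 1 would go through the first branch of the if; the others fall through to print $0.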