Extract a portion of string from each line in Linux

Hi, I have to extract the destination path information from each record. The records are of variable length, so I cannot simply print a fixed field number. The match should start at the key "destinationPath" and end at the next ",". The first field also has to be printed.

Input File:

1|"delete":true,"destinationPath":"/cd/rest/stone/data","dryRun":false,"write":true
20|"bandwidth":null,"delete":true,"destinationPath":"/ab/test/rock/archive","CopyRun":false

Expected Output:

1|"destinationPath":"/cd/rest/stone/data"
20|"destinationPath":"/ab/test/rock/archive"

Thanks in advance..!

Is this a homework assignment? Homework and coursework questions can only be posted in the Homework & Coursework forum under special homework rules.

Please review the rules, which you agreed to when you registered, if you have not already done so.

If you did not post homework, please explain the source of the data you are processing and the nature of the problem you are working on. Please also tell us what operating system and shell you're using and show us what you have tried to solve this problem on your own.

If you did post homework in the main forums, please review the guidelines for posting homework and repost.

This is not homework or an assignment. This is backend data from the Cloudera Manager scheduler (BDR). The first field is the BDR id, and each record contains several other settings, but we are concentrating on the BDR id and the destination path. We also changed the paths to comply with security requirements.

Welcome to the forum.

How far would this get you?

awk -F"[|,]" '{for (i=2; i<=NF; i++) if ($i ~ /destinationPath/)  print $1, $i}' OFS="|" file
1|"destinationPath":"/cd/rest/stone/data"
20|"destinationPath":"/ab/test/rock/archive"
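
A note on how the one-liner above works: `-F"[|,]"` splits each record on both `|` and `,`, so `$1` is the BDR id and every key/value pair becomes its own field; the loop then prints any field that contains `destinationPath`. This would break if a value ever contained a comma. A minimal alternative sketch, assuming the value is always a double-quoted string with no embedded quotes, uses POSIX awk's `match()` to cut the pair out directly. The function name `extract_dest` and the file name `bdr_sample.txt` are made up for illustration.

```shell
# Sample input copied from the question (paths are already anonymised there).
cat > bdr_sample.txt <<'EOF'
1|"delete":true,"destinationPath":"/cd/rest/stone/data","dryRun":false,"write":true
20|"bandwidth":null,"delete":true,"destinationPath":"/ab/test/rock/archive","CopyRun":false
EOF

# extract_dest FILE (hypothetical helper name)
# Prints "<first field>|<destinationPath pair>" for every line that has one.
extract_dest() {
    awk -F'|' '
    # match() sets RSTART/RLENGTH to the position and length of the pair;
    # lines without a destinationPath are silently skipped.
    match($0, /"destinationPath":"[^"]*"/) {
        print $1 "|" substr($0, RSTART, RLENGTH)
    }' "$1"
}

extract_dest bdr_sample.txt
```

Unlike the field-splitting version, this stops at the closing quote of the value rather than at the next comma, which gives the same result for the sample data.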

Thanks RudiC, this really serves my purpose. I have just one more question: what if I want to grep for multiple keywords in each line? In this case we picked the field "destinationPath"; suppose that along with it we also want to collect the field "delete".

Expected output for same input:

1|"destinationPath":"/cd/rest/stone/data","delete":true
20|"destinationPath":"/ab/test/rock/archive","delete":true

Try

awk -F"[|,]" '
NR == 1         {for (n=split (TGTSTR, T); n; n--) TGT[T[n]]
                }
                {for (i=2; i<=NF; i++) for (t in TGT) if ($i ~ t) TMP = TMP (TMP?",":"") $i
                 print $1, substr (TMP, 2)
                 TMP = ""
                }
' OFS="|" TGTSTR="destinationPath,delete" file
1|delete":true,"destinationPath":"/cd/rest/stone/data"
20|delete":true,"destinationPath":"/ab/test/rock/archive"

To get back the leading double-quote character on the field that starts with "delete": in the output, I think you want to change the line:

                 print $1, substr (TMP, 2)

in RudiC's suggested code to:

                 print $1, TMP

or change the line:

                {for (i=2; i<=NF; i++) for (t in TGT) if ($i ~ t) TMP = TMP (TMP?",":"") $i

to:

                {for (i=2; i<=NF; i++) for (t in TGT) if ($i ~ t) TMP = TMP "," $i

Thanks, Don Cragun - I switched from one version to the other on the fly and didn't test / control thoroughly.
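
For later readers: with either of the corrections above, the pairs still come out in input order ("delete" before "destinationPath"), and `$i ~ t` matches any field that merely contains the keyword. A hedged sketch of a variant that prints the pairs in the order the keys are requested and matches the quoted key name exactly is below. It is not RudiC's script; the function name `extract_keys` and the file name `bdr_sample.txt` are hypothetical, and it still assumes that no value contains a `,` or `|`.

```shell
# Sample input copied from the question.
cat > bdr_sample.txt <<'EOF'
1|"delete":true,"destinationPath":"/cd/rest/stone/data","dryRun":false,"write":true
20|"bandwidth":null,"delete":true,"destinationPath":"/ab/test/rock/archive","CopyRun":false
EOF

# extract_keys KEYLIST FILE (hypothetical helper name)
# KEYLIST is a comma-separated list of key names; output order follows it.
extract_keys() {
    awk -F'[|,]' -v keys="$1" '
    BEGIN { n = split(keys, K, ",") }       # K[1..n] keeps the requested order
    {
        out = ""
        for (k = 1; k <= n; k++)
            for (i = 2; i <= NF; i++)
                # Exact key test: the field must start with "<key>":
                if (index($i, "\"" K[k] "\":") == 1) {
                    out = out (out == "" ? "" : ",") $i
                    break                   # first match per key is enough
                }
        print $1 "|" out
    }' "$2"
}

extract_keys "destinationPath,delete" bdr_sample.txt
```

For the sample input this produces the two lines given as the expected output in the follow-up question. A key that is absent from a record is simply left out of that record's output line.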