Parsing through a file with awk/sed

OrangeYaGlad · December 14, 2010, 9:59am

I don't necessarily have a problem, as I already have a solution. It is just that there may be a better one.

GOAL: Part one: Parse data from a file using "\" as the delimiter and extract only the last delimited item. Part two: Parse the same file and extract everything but the last delimited item.

Background: I was given 600+ registry keys that needed to be queried, in a file with the keys and values concatenated, e.g. HKLM\Software\Microsoft\Driver Signing\Policy. I need the value "Policy" separated from the rest of the key.

Sample data:

HKLM\System\CurrentControlSet\Control\Lsa\LimitBlankPasswordUse
Password Policy security settings are not registry keys.
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\DontDisplayLastUserName
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\DisableCAD
[snip]......

My Solution:
Part one of the problem was a snap to solve. I ran the following command and got all the values.

awk --field-separator='\' '{ print $NF }' regkeydump

OUTPUT:

LimitBlankPasswordUse

DontDisplayLastUserName
DisableCAD
[snip]......

Perfect, works like a charm. Part two is where problems begin. No matter what I try, I cannot get this command to recognize the "\"

awk 'BEGIN {FS=ORS="\"} {for (i=1;i<NF;i++) print $i}'
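The likely reason this fails: inside an awk string literal the backslash is the escape character, so "\" is read as an opening quote followed by an escaped quote, and the string never ends. Doubling the backslash gives a literal one. The sketch below keeps the loop idea but prints one line per record instead of using ORS as the joiner. The function name keys_via_loop is made up for illustration, and writing FS as "\\\\" (a regex that matches one backslash) is an assumption about what is most portable across awk implementations.

```shell
# Hypothetical helper: print every field except the last, re-joined with
# a backslash. Reads registry-style lines on stdin; lines that contain no
# backslash are skipped.
keys_via_loop() {
    awk 'BEGIN { FS = "\\\\" }            # FS is a regex matching one backslash
         NF > 1 {
             out = $1
             for (i = 2; i < NF; i++)     # stop before the last field
                 out = out "\\" $i        # the string holds a single backslash
             print out
         }'
}

printf '%s\n' 'HKLM\System\CurrentControlSet\Control\Lsa\LimitBlankPasswordUse' |
    keys_via_loop
```

With the sample line above this should print the key without its final component.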

So as a workaround I replaced the "\" with ":" in my data file

cat regkeydump | tr '\' ':' > regkeydumpprep

Then ran the awk command

awk 'BEGIN {FS=ORS=":"} {for (i=1;i<NF;i++) print $i}' regkeydumpprep |sed 's/$/\n/' > regkeysonly

OUTPUT:

HKLM:System:CurrentControlSet:Control:Lsa:HKLM:Software:Microsoft:Windows:CurrentVersion:Policies:HKLM:Software:Microsoft:Windows:CurrentVersion:Policies:System
[snip]......

Two problems arise. First, the sed command does not create the newlines I expected. Second, the line that did not contain the delimiter was removed. This isn't a huge deal, but it means that I now have to line up the data manually. I now run a new sed command to replace each :HKLM with \nHKLM

sed -r "s/:HKLM/\\`echo -e '\nHKLM'`/g" regkeysonly > regkeysclean

OUTPUT:

HKLM:System:CurrentControlSet:Control:Lsa
HKLM:Software:Microsoft:Windows:CurrentVersion:Policies
HKLM:Software:Microsoft:Windows:CurrentVersion:Policies:System
[snip]......

I now replace the ":" with the "\" to get the data back to its original state

cat regkeysclean | tr ':' '\' > finished

OUTPUT:

HKLM\System\CurrentControlSet\Control\Lsa
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System
[snip]......

This works, but I think that if I could get the awk loop to use the "\" as the delimiter and the sed command to work this would be a two liner. Any thoughts?

ctsgnb · December 14, 2010, 10:08am

sed 's:\\[^\\]*$::' yourfile
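To unpack that one-liner: ":" is used as the delimiter of the s command, "\\" matches one literal backslash, "[^\\]*" matches any run of characters that are not backslashes, and "$" anchors the match at the end of the line. The last backslash and everything after it is therefore deleted, and a line with no backslash is left untouched. Part one can be sketched the same way. The function names below are hypothetical.

```shell
# Hypothetical helpers built on the sed approach; both read lines on stdin.

# Part two: drop the last backslash and everything after it.
key_part() {
    sed 's:\\[^\\]*$::'
}

# Part one: keep only what follows the last backslash.
# -n together with the p flag prints only lines where a backslash was found.
value_part() {
    sed -n 's:.*\\::p'
}

printf '%s\n' 'HKLM\Software\Microsoft\Driver Signing\Policy' | key_part
printf '%s\n' 'HKLM\Software\Microsoft\Driver Signing\Policy' | value_part
```

Note that value_part silently skips lines without a delimiter, while key_part passes them through unchanged.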

anurag.singh · December 14, 2010, 10:14am

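while IFS= read -r line    # -r stops read from eating the backslashes
do
   # Greedy \(.*\) runs up to the last backslash; \1 is the key, \2 the value.
   firstPart=$(printf '%s\n' "$line" | sed 's/\(.*\)\\\(.*\)/\1/')
   secondPart=$(printf '%s\n' "$line" | sed 's/\(.*\)\\\(.*\)/\2/')
   printf '%s\n' "$firstPart"
   printf '%s\n' "$secondPart"
done < inputFile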

OR

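while IFS= read -r line    # -r stops read from eating the backslashes
do
   firstPart=${line%\\*}      # remove the shortest suffix that starts with a backslash
   secondPart=${line##*\\}    # remove the longest prefix that ends with a backslash
   printf '%s\n' "$firstPart"
   printf '%s\n' "$secondPart"
done < inputFile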

Scrutinizer · December 14, 2010, 10:38am

awk NF-=1 FS=\\ OFS=\\ file
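How this appears to work: the shell turns each unquoted \\ into a single backslash before awk sees it, and NF-=1 is used as a pattern whose value is the new field count. A line with no delimiter has NF equal to 1, so the result is 0 (false) and the line is not printed. A spelled-out sketch of the same idea follows. The function name is hypothetical, and the code relies on awk rebuilding $0 when NF is assigned, which gawk and mawk do but some older awks may not.

```shell
# Hypothetical helper: the NF-=1 idea written out as an explicit rule.
keys_only() {
    awk 'BEGIN { FS = "\\\\"; OFS = "\\" }   # split on a backslash, rejoin with one
         NF > 1 { NF--; print }'
    # NF-- drops the last field; lines with a single field are never printed.
}

printf '%s\n' 'HKLM\System\CurrentControlSet\Control\Lsa\LimitBlankPasswordUse' \
              'Password Policy security settings are not registry keys.' |
    keys_only
```

Unlike the bare NF-=1 pattern, the NF > 1 guard also keeps NF from going negative on empty input lines.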

ahmad.diab · December 14, 2010, 11:41am

If you are on Solaris, FS="\\" will only work with /usr/xpg4/bin/awk and will not work with nawk. On a Linux system it will work with gawk:

/usr/xpg4/bin/awk 'NF>1{$NF=""}1' FS="\\"  OFS="\\" infile.txt

O/P:-
HKLM\System\CurrentControlSet\Control\Lsa\
Password Policy security settings are not registry keys.
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\
HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\
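Note the trailing backslash in that output: assigning an empty string to $NF empties the last field but leaves the separator in front of it. If the trailing backslash is unwanted, one possible tweak is to strip it with sub() after the assignment. This is a sketch rather than a tested replacement for the Solaris command above. The function name is made up, and FS is written as "\\\\" on the assumption that a regex matching one backslash is accepted by more awk variants.

```shell
# Hypothetical helper: blank the last field, then remove the leftover
# trailing backslash. Lines without a backslash pass through unchanged.
keys_keep_plain() {
    awk 'BEGIN { FS = "\\\\"; OFS = "\\" }
         NF > 1 { $NF = ""; sub(/\\$/, "") }   # regex: one backslash at end of line
         { print }'
}

printf '%s\n' 'HKLM\System\CurrentControlSet\Control\Lsa\LimitBlankPasswordUse' \
              'Password Policy security settings are not registry keys.' |
    keys_keep_plain
```

Because it only assigns to an existing field, this variant does not depend on how a given awk handles a decremented NF.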

OrangeYaGlad · December 14, 2010, 11:59am

Thank you both, ctsgnb and Scrutinizer.

Both commands work and are nice one-liners.

The only difference is that ctsgnb's sed one-liner does not strip out the lines without the delimiter, which means a little less manual work for me.

I am more familiar with awk and understand exactly what it is doing, but I will need to do a little research on the sed command to get a full understanding. Unfortunately I don't get to use *nix as often as I would like, but now I know to use \\ to ensure the backslash is recognized.

ahmad.diab · December 14, 2010, 12:06pm

Did you try my code? I am afraid I may have misunderstood you. Did I?

What output do you need to get?

Scrutinizer · December 14, 2010, 12:46pm

Hi, if you do not want to parse out the lines, you can do this:

awk NF-- FS=\\ OFS=\\ file
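The difference from NF-=1 lies in the value of the pattern: NF-- evaluates to the old field count and only then decrements, so a one-field line is still selected, ends up with zero fields, and prints as an empty line. That keeps the output aligned line for line with the input. Spelled out below, with a hypothetical function name and without relying on NF being set to 0:

```shell
# Hypothetical helper: exactly one output line per input line.
# Lines with a backslash lose their last field; other lines become empty.
keys_aligned() {
    awk 'BEGIN { FS = "\\\\"; OFS = "\\" }
         { if (NF > 1) NF--; else $0 = ""; print }'
}

printf '%s\n' 'HKLM\System\CurrentControlSet\Control\Lsa\LimitBlankPasswordUse' \
              'Password Policy security settings are not registry keys.' \
              'HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\DisableCAD' |
    keys_aligned
```

The empty placeholder lines make it straightforward to match each key against a list of values produced from the same input.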

OrangeYaGlad · December 14, 2010, 2:22pm

Ahmad,

Your code does work; sorry, I must have missed it. Yours also doesn't delete any lines without a delimiter, which is a plus for me.

Scrutinizer,

Yours keeps the line breaks but removes the data, which is even better. I delete the lines that do not have associated keys once I line the data up. Saves even more time, thanks.