Parsing a control file loop

The_Gamemaster · May 10, 2018, 11:48pm

Hi,

Let's say I have a control file like this:

RHEL apple "echo apple"
RHEL bravo "ls -l bravo*"
RHEL church "chmod church.txt"

SUSE drive "chown user1 drive.txt"
SUSE eagle "echo "eagle flies""
SUSE feather "ls -l feather*"

HP-UX google "sed 's/^Google.*$/&\
ACTION: go to www.google.com/' google.txt"
HP-UX heartache "ls -l heartache.txt* | awk '{ print $9 }'"

...and so on.

In my main script, after I get the operating system of the machine I'm executing the script on and assign it to the variable $OPER_SYS, the script will read this control file. For every line that starts with the value of $OPER_SYS, I want the command in the last column to be executed.

Please take note of the line break in the first HP-UX entry. It's intended, because I want the sed command to find the line starting with Google and add a new line after it with the text on the right-hand side... and \n doesn't work in HP-UX sed for some reason (if you guys know a way to simplify that, please let me know).

Obviously, the script I'm creating will be executed on several servers with different platforms i.e. RHEL, SUSE, HP-UX, Solaris so I want the code to be executable on all of them (I'm having difficulties with HP-UX obviously).

Thanks in advance.

bakunin · May 11, 2018, 4:00pm

Ok, let us start with the easy part: the most basic thing you have to take care of is the difference between these systems: HP-UX has (IIRC) a Korn Shell as default shell, some Linux systems (sorry, Linux is not my strong side) have bash, others have even more obscure shells (i remember running into some dash a while ago, whatever that is). Furthermore the exact composition of the default PATH variable may differ, e.g. /bin is sometimes a link to /usr/bin, sometimes not, etc. My suggestion is to first create a large case-switch where you take care of these differences, like this sketch:

case $OPER_SYS in
     RHEL)
          export PATH=....
          export TERM=....   # in case you need it
          ....
          ;;

     SUSE)
          export PATH=....
          export TERM=....   # in case you need it
          ....
          ;;

     HP-UX)
          export PATH=....
          export TERM=....   # in case you need it
          ....
          ;;

esac

...rest of your "OS-independent" code here...

Notice that e.g. HP-UX has some peculiarities: the df command exists, but if you want an output that resembles what you would expect you have to use bdf. (This might not be relevant in your case, but there might be similar things that are.) Also notice that commands dealing with package management, user management, logical volumes and similar things differ between platforms. You may need to create a function for these things which gets some parameters and then does OS-dependent things depending on the value of $OPER_SYS.
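A minimal sketch of such an OS-dependent function, offered only as an illustration: the name disk_usage_cmd is hypothetical, and the choice of "df -k" as the Linux counterpart of bdf is an assumption, not something stated in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print the disk-usage command that fits the given OS.
# $1 is the OS label, i.e. the value the main script keeps in $OPER_SYS.
disk_usage_cmd() {
    case $1 in
        HP-UX)
            echo "bdf" ;;        # HP-UX: bdf gives the familiar layout
        RHEL|SUSE)
            echo "df -k" ;;      # assumption: df -k is what is wanted on Linux
        *)
            echo "unsupported OS: $1" >&2
            return 1 ;;
    esac
}
```

The main script could then call it as `$(disk_usage_cmd "$OPER_SYS") /some/filesystem`, so the OS-specific knowledge stays in one place.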

The embedded newline - and similar problems with special characters - is a problem which IMHO cannot be fully solved in shell: shells always interpret in some way what they read. Masking/escaping characters in your input will only get you so far. A real solution would be a parser, but it would have to be written in a HLL.

So, if you can narrow down a list of special characters you need to be able to cope with - starting with the newline - a (maybe not very pretty but workable) solution might be found, otherwise i suggest you implement your problem not in shell but in some high-level language.

I hope this helps.

bakunin

The_Gamemaster · May 11, 2018, 5:03pm

Thanks for the tips here. I wish I could learn a new high-level language right now (Python comes to mind), but it's out of the question at the moment because I have a deadline for this script. The commands I put there are just examples; the script that I'm trying to create will update the same configuration file on all servers that we manage. So in reality, that control file will only have sed commands (or any other command that can modify the configuration file if I hit a wall using sed, as is the case with HP-UX servers). So far I'm only having a problem with that line break, because all of the sed statements will append a new line after the line matched by the regexp. I hope I have narrowed it down for you.

The_Gamemaster · May 12, 2018, 7:35pm

Can anyone just answer my first question?

I'll just find a solution regarding special characters somehow. Hopefully someone can help me out, I really need to finish this script this week.

MadeInGermany · May 13, 2018, 2:58am

The commands in quotes are troublesome.
Better have a tag that indicates where the (unquoted) command ends.
E.g. an empty line could do it.

Or perhaps you can give a filename?
And the given file stores the command(s)?

Chubler_XL · May 13, 2018, 10:27pm

You might have more success using awk to filter your control file.

Slight change in format of control file:

RHEL apple
    echo apple
RHEL bravo 
    ls -l bravo*
RHEL church 
    chmod church.txt

SUSE drive
    chown user1 drive.txt
SUSE eagle 
    echo "eagle flies"
SUSE feather
    ls -l feather*

HP-UX google
    sed 's/^Google.*$/&\
ACTION: go to www.google.com/' google.txt

HP-UX heartache
    ls -l heartache.txt* | awk '{ print $9 }'

I set the variable CGRP to be the 2nd param on your control lines. The awk output could be piped to a shell once you're happy it's looking OK:

OPER_SYS="$1"
  
case $OPER_SYS in
    RHEL|SUSE|HP-UX)
    ;;
    *)  echo "Usage $0 [RHEL|SUSE|HP-UX]" >&2
        exit 2
    ;;
esac

awk -v SYS="$OPER_SYS" '
  NF==2 && $1==SYS { exec = 1 ; print "CGRP="$2 ; next }
  NF==2 && $1 ~ "(RHEL|SUSE|HP-UX)" { exec = 0 }
  exec { print }' control

example:

$ ./do_control HPUX
Usage ./do_control [RHEL|SUSE|HP-UX]

$ ./do_control HP-UX
CGRP=google
    sed 's/^Google.*$/&\
ACTION: go to www.google.com/' google.txt

CGRP=heartache
    ls -l heartache.txt* | awk '{ print $9 }'

The_Gamemaster · May 14, 2018, 12:42am

I apologize to everyone helping out here, I posted such a bad sample of what I'm actually doing. Anyway this one below is much closer to what I'm actually doing.

RHEL:syslogd:"sed -i 's/^syslogd.*$/&\n*ACTION \/etc\/init.d\/syslog start/' $CFG_DIR/ps_mon.cfg"
RHEL:ntpd:"sed -i 's/^ntpd.*$/&\n*ACTION \/etc\/init.d\/ntpd start/' $CFG_DIR/ps_mon.cfg"
RHEL:scopeux:"sed -i 's/^scopeux.*$/&\n*ACTION \/opt\/perf\/bin\/ovpa start all/' $CFG_DIR/ps_mon.cfg"

SUSE:cron:"sed -i 's/^cron.*$/&\n*ACTION \/etc\/init.d\/cron start/' $CFG_DIR/ps_mon.cfg"
SUSE:scopeux:"sed -i 's/^scopeux.*$/&\n*ACTION \/opt\/perf\/bin\/ovpa start all/' $CFG_DIR/ps_mon.cfg"
SUSE:midaemon:"sed -i 's/^midaemon.*$/&\n*ACTION \/opt\/perf\/bin\/ovpa start all/' $CFG_DIR/ps_mon.cfg"

HP-UX:scopeux:"find $CFG_DIR -name "ps_mon.cfg" | while IFS= read -r file; do sed 's/^scopeux.*$/&@*ACTION \/opt\/perf\/bin\/ovpa start all/' "$file" | tr '@' '\n' > tmp && mv tmp "$file"; done"
HP-UX:midaemon:"find $CFG_DIR -name "ps_mon.cfg" | while IFS= read -r file; do sed 's/^midaemon.*$/&@*ACTION \/opt\/perf\/bin\/ovpa start all/' "$file" | tr '@' '\n' > tmp && mv tmp "$file"; done"
HP-UX:perfalarm:"find $CFG_DIR -name "ps_mon.cfg" | while IFS= read -r file; do sed 's/^perfalarm.*$/&@*ACTION \/opt\/perf\/bin\/ovpa start all/' "$file" | tr '@' '\n' > tmp && mv tmp "$file"; done"

I've just found a way to deal with the \n special character on HP-UX, hence the above. Anyway, I replaced the delimiters with ":" because spaces inside the quoted commands are problematic. When I try the code below on HP-UX:

for line in `cat sample.ctl`
do
echo $line
done

I get the output below:

HP-UX:scopeux:"find
$CFG_DIR
-name
"ps_mon.cfg"
|
while
IFS=
read
-r
file;
do
sed
's/^scopeux.*$/&@*ACTION
\/opt\/perf\/bin\/ovpa
start
all/'
"$file"
|
tr
'@'
'
'
>
tmp
&&
mv
tmp
"$file";
done"

That's why I can't use awk to assign variables. The only way I've seen so far to solve this is to replace the spaces with another character, like a comma, then remove the comma with a sed substitution when I assign the command to a variable. Too much of a hassle. If you have a way to simplify this, please let me know.

Thanks for the help so far.

RudiC · May 14, 2018, 3:49am

Enclose the $line in double quotes to have the shell preserve spaces within.

But, I may be thick-witted, but the entire problem escapes me, even with the new sample.
OK, you have the OS - however determined - in the first field. What is the second field for? And, why not just list the commands to be executed in a list right after the OS paragraph header?

Please give some consistent context / background to help people understand what you're after. The HP-UX problem from post#1 seems to have disappeared in the new sample?
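One note on the loop from the previous post: in `for line in `cat sample.ctl``, the shell splits the file into words before $line is ever assigned, so quoting "$line" inside the loop cannot bring the spaces back on its own. A common way to get whole lines is a while/read loop. The following is only a sketch; the function name print_lines is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a file line by line without word splitting.
# IFS= keeps leading and trailing blanks, -r keeps backslashes literal.
print_lines() {
    while IFS= read -r line
    do
        printf '%s\n' "$line"
    done < "$1"
}
```

Called as `print_lines sample.ctl`, each control line should come out in one piece, quotes and backslashes included.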

The_Gamemaster · May 14, 2018, 5:05am

The script is for modifying a specific configuration file (ps_mon.cfg) on different servers. The 2nd field is there so that the script can first search the config file for that keyword; if an *ACTION line already follows the keyword line, the script will not execute the command in the 3rd field. Otherwise, it will execute the command to insert that *ACTION line after the keyword. I already have that part sorted out, though I'm open to suggestions if you have simpler code, since I'm a clunky coder.

Once the script got the OS of the server it will execute on, for example RHEL, how can I get only the lines with RHEL, then assign the second field to $PROCESS and the third field to $COMMAND in a loop, so that I can accomplish the above?

MadeInGermany · May 14, 2018, 12:34pm

One comment:
assuming there is embedded sed code, then

 's/^syslogd.*$/&\n*ACTION \/etc\/init.d\/syslog start/'

substitutes the current line by itself plus a new line.
This is the same as appending a new line, and the a command does it nicely

 '/^syslogd/ a*ACTION /etc/init.d/syslog start'

And no backslashes needed!
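The one-line form `a text` is accepted by GNU sed; older seds such as the one on HP-UX generally want the classic form, `a\` followed by the text on the next line. Below is a sketch of a wrapper that uses the classic form and a temporary file instead of -i, so it does not depend on GNU extensions. The name append_after is hypothetical, and the sketch assumes the keyword contains no "/" and the text contains no backslashes.

```shell
#!/bin/sh
# Hypothetical helper: append a line of text after every line that starts
# with a keyword.
# $1 = keyword, $2 = text to append, $3 = file to modify
append_after() {
    # Inside double quotes, "\\" plus the line break becomes the
    # backslash-newline that the classic "a" command expects.
    sed "/^$1/ a\\
$2" "$3" > "$3.tmp" && mv "$3.tmp" "$3"
}
```

For example, `append_after syslogd '*ACTION /etc/init.d/syslog start' "$CFG_DIR/ps_mon.cfg"` would add the *ACTION line right after the syslogd entry.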

MadeInGermany · May 14, 2018, 1:23pm

The following file has another separator at the end of the command field

RHEL:syslogd:sed -i '/^syslogd/ a*ACTION /etc/init.d/syslog start' $CFG_DIR/ps_mon.cfg:
RHEL:ntpd:sed -i '/^ntpd/ a*ACTION /etc/init.d/ntpd start' $CFG_DIR/ps_mon.cfg:
RHEL:scopeux:sed -i '/^scopeux/ a*ACTION /opt/perf/bin/ovpa start all' $CFG_DIR/ps_mon.cfg:

SUSE:cron:sed -i '/^cron/ a*ACTION /etc/init.d/cron start' $CFG_DIR/ps_mon.cfg:
SUSE:scopeux:sed -i '/^scopeux/ a*ACTION /opt/perf/bin/ovpa start all' $CFG_DIR/ps_mon.cfg:
SUSE:midaemon:sed -i '/^midaemon/ a*ACTION /opt/perf/bin/ovpa start all' $CFG_DIR/ps_mon.cfg:

HP-UX:scopeux:file=$CFG_DIR/ps_mon.cfg; sed '/^scopeux/ a\
*ACTION /opt/perf/bin/ovpa start all' "$file" > "$file".tmp && mv "$file".tmp "$file":
HP-UX:midaemon:file=$CFG_DIR/ps_mon.cfg; sed '/^midaemon/ a\
*ACTION /opt/perf/bin/ovpa start all' "$file" > "$file".tmp && mv "$file".tmp "$file":
HP-UX:perfalarm:file=$CFG_DIR/ps_mon.cfg; sed '/^perfalarm/ a\
*ACTION /opt/perf/bin/ovpa start all' "$file" > "$file".tmp && mv "$file".tmp "$file":

It can be parsed by the following Bourne-compatible shell script

#!/bin/sh
PATH=/bin:/usr/bin
sep=":"
while IFS=$sep read -r f1 f2 f3
do
  [ -n "$f1" ] || continue
  cmd=$f3
  while
    case $cmd in
    (*$sep) break;;
    esac
  do
    IFS= read -r line || break   # keep blanks and backslashes; stop at end of input
    cmd="$cmd
$line"
  done
  cmd=`expr X"$cmd" : X"\(.*\)$sep"` 
  echo "\
field1 = $f1
field2 = $f2 
cmd = $cmd"
done

The good thing is, the command can contain any characters: \n or newlines, and all quoting characters, and even the : character!
NB the shell script reads from stdin. Run it with /bin/sh /path/to/script < input_file .

The_Gamemaster · May 21, 2018, 11:00pm

Thank you very much for this input. I apologize if I wasn't able to reply in the past few days. I got sick.

I haven't tested it yet, but will do today and let you know how it goes.

---------- Post updated at 11:00 AM ---------- Previous update was at 10:33 AM ----------

Ok tested the code at last, and while the parsing works pretty well, the variable $CFG_DIR is taken literally and I can't substitute any value for it. So when I try to execute the $cmd, I get the error "$CFG_DIR/ps_mon.cfg not found".

This is the only problem I have left, and once it's solved then I can finish the script.

bakunin · May 22, 2018, 6:00pm

For this kind of situation there is the eval builtin. Its use is always the last resort, so this is not a "recommendation" and the advice is to handle it with extreme care.

Since you don't show us your code you will have to find out how to incorporate it yourself. In short: it basically starts the parsing of the command line a second time. Here is an example of what that means:

var="abc"
abc="123"
print - $$var

Now, this will NOT work. One might expect that "$var" is expanded to "abc" and "$abc" is expanded to "123", but this is not the case. (What the shell actually sees is "$$", its own process ID, followed by the literal text "var".) In fact all variables are expanded at the same time and once they are - they are. There is no going back and expanding a second time.

In comes eval . This indeed starts the whole parsing process a second time and hence:

var="abc"
abc="123"
eval print - \$$var

will do the second indirection. This now works exactly as described above. Notice that the first "$" had to be escaped to protect it from being interpreted in the first parsing pass.
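The same indirection can be sketched for shells that lack the Korn shell's print. The variable name value is made up for this illustration; printf is used so that the snippet also runs under bash or dash.

```shell
#!/bin/sh
var="abc"
abc="123"
# First pass: \$ stays a literal "$" and $var becomes "abc", so eval
# receives the string  value=$abc  and expands it in a second pass.
eval "value=\$$var"
printf '%s\n' "$value"    # prints 123
```

Assigning the result to a variable inside the eval keeps the part that is parsed twice as small as possible.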

I hope this helps.

bakunin

The_Gamemaster · May 22, 2018, 8:01pm

This basically solved it. Thanks so much for everyone's help here.

MadeInGermany · May 23, 2018, 1:02pm

If you have the command stored in cmd, you can evaluate+run it with

eval "$cmd"

The quotes prevent word splitting and globbing before it is passed to eval.
Then eval does all the parsing as the shell normally does with direct shell code.
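Putting the pieces of this thread together, a loop that runs only the commands for one OS might look like the sketch below. It is an assumption-laden illustration, not the original poster's script: the function name run_for_os is hypothetical, each record is assumed to sit on a single line in the form OS:PROCESS:COMMAND with the command unquoted, and the check for an existing *ACTION line (the purpose of the 2nd field) is left out.

```shell
#!/bin/sh
# Hypothetical helper: run the commands of all control lines for one OS.
# $1 = OS label (the value of $OPER_SYS), $2 = control file
run_for_os() {
    # The control file is read on file descriptor 3 so that the commands
    # being run can still use standard input themselves.
    while IFS=: read -r os process command <&3
    do
        [ "$os" = "$1" ] || continue   # also skips empty lines
        # eval re-parses the stored text, so variables such as $CFG_DIR
        # are expanded now, at execution time.
        eval "$command"
    done 3< "$2"
}
```

With CFG_DIR set beforehand, `run_for_os "$OPER_SYS" sample.ctl` would execute only the matching lines. Because eval runs whatever text is in the file, the control file should be writable only by trusted users.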