Using SED to generate new file from template

ppucci · July 25, 2009, 4:32pm

Hi there!

I am using a BASH script to read a CSV file (containing variable values) with while read. For every record, I want SED to take a template from a file and, using the variables read from the CSV, write a new file.

#!/bin/bash
current_ifs=$IFS ; #backup original IFS, need "," for the CSV file
IFS=,
while read VAR1 VAR2 VAR3 FILENAME; do
 sed -n -e 's/VAR1/$VAR1/g'\
 -e 's/VAR2/$VAR2/g'\
 -e 's/VAR3/$VAR3/g'\
 TEMPLATE w $FILENAME 
done < $CSV
IFS=$current_ifs; #redefines original IFS
exit

But it looks like I am missing something. TEMPLATE is in the same folder; the new file can go to any folder (it would be nice to be able to set it).

Help anyone?

edidataguy · July 26, 2009, 6:32am

Just the code is not enough.
Give a listing of the contents of all the files involved.
What is this TEMPLATE thing?
What is the "w" in TEMPLATE w $FILENAME?
What should the output look like?
In this forum, if you don't get a response in 1 or 2 hours, then most probably your question is not clear.

ppucci · July 26, 2009, 7:56am

I've read somewhere about this "w" option on writing the output to a different file than input, but I guess I read it wrong or not sure how to use it...

Anyway, I went back to redirect using > and it worked!

Just for the record... the template files are Cisco router config files with placeholders, like:

!configlet for HNAME (IP_ADD)
conf t
interface INT
 ip address IP_ADD mask MASK
end

And the source CSV file contains the hostname, interface number, IP addresses and subnet masks for each of the 1000+ devices I needed to create a configuration file for... The CSV also had the device type for each device (using the same types as TEMPLATE) and a FILENAME for each device, so the script would write a different file for each device.

As I had many different templates, I also improved the code to save the output in different folders according to the type.

I'll post the final script later

Thank you for your reply!

edidataguy · July 26, 2009, 12:17pm

Yes, you are right, with the "w" you can write the output to a file.
but the syntax should be as follows:

w TEMPLATE $FILENAME

Where "TEMPLATE" will be the output file.
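
For readers comparing this with the sed documentation: as far as I know, "w" is a sed command (or a flag on "s") that is followed by the name of the file to write, and that name runs to the end of the expression, while the input file is still passed as an ordinary argument. A minimal sketch, assuming hypothetical file names template.txt, out.txt and out2.txt in a scratch directory:

```shell
#!/bin/bash
# Hypothetical scratch files, created only for this illustration.
workdir=$(mktemp -d)
printf 'hostname HNAME\nend\n' > "$workdir/template.txt"

HNAME=router1

# 'w FILE' is a sed command that writes the current line to FILE.
# The file name extends to the end of the expression, so it gets its own -e.
# -n suppresses the normal copy to stdout; the input file is the last argument.
sed -n -e "s/HNAME/$HNAME/g" -e "w $workdir/out.txt" "$workdir/template.txt"
w_result=$(cat "$workdir/out.txt")

# Plain shell redirection produces the same file contents.
sed -e "s/HNAME/$HNAME/g" "$workdir/template.txt" > "$workdir/out2.txt"
redirect_result=$(cat "$workdir/out2.txt")

echo "$w_result"
```

When the whole output goes to one file, the two approaches give the same result. "w" is mainly useful when only some lines should be copied to a separate file, for example as a flag on a substitution (s/old/new/w file), which writes only the lines that were changed.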

Did you realize that you have used the FILENAME variable both in the "read" and as the input file name?

danmero · July 26, 2009, 12:30pm

@ ppucci
Why don't you post a sample input and a sample of the required output?

ppucci · July 27, 2009, 2:10pm

edidataguy:

Yes, you are right, with the "w" you can write the output to a file.
but the syntax should be as follows:
w TEMPLATE $FILENAME
Where "TEMPLATE" will be the output file.

Did you realize that you have used the FILENAME variable both in the "read" and as the input file name?

Well... guess I'll try "w" next time, now that I know how it works :)... is there any difference from > after all?

As for the FILENAME var... I actually read it from the CSV file (it is one of the columns) and then use it as the name of the output file (that's what I did in my finished script... but right, in my first example it was (wrongly) on the input file). That "while read" now reads:

while read VAR1 VAR2 VAR3 FILENAME; do
 sed -e "s/VAR1/$VAR1/g" \
 -e "s/VAR2/$VAR2/g" \
 -e "s/VAR3/$VAR3/g" \
 TEMPLATE > "$FILENAME"
done < "$CSV"

Again, this is just an example of how it looks now... I'll post the actual script later
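
One difference from the first attempt is the quoting: the sed expressions are now in double quotes instead of single quotes. A small sketch of why that matters, using a made-up value for VAR1:

```shell
#!/bin/bash
VAR1=10.0.0.1   # made-up value for illustration

# Single quotes: the shell passes $VAR1 to sed literally, so the
# placeholder is replaced by the text "$VAR1", not by the value.
single=$(echo 'ip address VAR1' | sed 's/VAR1/$VAR1/g')

# Double quotes: the shell expands $VAR1 before sed runs.
double=$(echo 'ip address VAR1' | sed "s/VAR1/$VAR1/g")

echo "$single"
echo "$double"
```

The first echo shows the literal text $VAR1 in the output; the second shows the address.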

---------- Post updated 07-27-09 at 01:10 PM ---------- Previous update was 07-26-09 at 02:08 PM ----------

One last thing...

One of the columns in the CSV file has values with spaces in them (e.g. the interface name), but sed outputs them wrapped in quotes (e.g. "interface name"), and in some cases it outputs the quotes and also appends a string to the end (e.g. "interface name"ESS).

What gives? Is there a way to tell sed not to do that?
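
A likely explanation for the quotes, offered as a guess rather than a diagnosis: "read" with IFS=, only splits on commas and knows nothing about CSV quoting, so any quotes the spreadsheet export put around a field are stored in the variable and sed just passes them through. A sketch with a made-up record, assuming no field contains a comma or a quote that should be kept:

```shell
#!/bin/bash
# A made-up record in which the exporter quoted the field containing a space.
record='coresw1,"ospf 12345",area 0'

IFS=, read -r HNAME ROUTING_PROC OSPF_AREA <<< "$record"

# The quotes are part of the value that read stored.
raw=$ROUTING_PROC

# Strip every double quote with bash parameter expansion.
clean=${ROUTING_PROC//\"/}

echo "raw:   $raw"
echo "clean: $clean"
```

This only removes the quote characters. Fields that legitimately contain commas would need a real CSV parser.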

By the way... as promised, here is the script that is working now:

#!/bin/bash
current_ifs=$IFS; #backup original IFS
IFS=,
echo "reading $1"
# while and sed are broken down for readability
while read REGION COUNTRY SITE_ID BU MGMT_SUBNET GATEWAY_MGMT\
 IP_MGMT NEW_MASK BLDG_NAME HNAME MODEL SERIAL SOC IOS\
 CUR_MGMT_IP CUR_MGMT_MASK CUR_MGMT_INT DEVTYPE TRUNK_TYPE\
 ROU_1_IP ROU_2_IP DIST_SW1_IP_OR_NH DIST_SW2_IP\
 ROUTING_PROTO_TYPE ROUTING_PROC OSPF_AREA HSRP AB\
 STATUS_VLAN_997 OBS UnID TRUNK_DATA FILENAME; do
 sed -e "s|HNAME|$HNAME|g"\
 -e "s|REGION|$REGION|g"\
 -e "s|SITE_ID|$SITE_ID|g"\
 -e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\
 -e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
 -e "s|IP_MGMT|$IP_MGMT|g"\
 -e "s|NEW_MASK|$NEW_MASK|g"\
 -e "s|GATEWAY_MGMT|$GATEWAY_MGMT|g"\
 -e "s|MGMT_SUBNET|$MGMT_SUBNET|g"\
 -e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\
 -e "s|CUR_MGMT_MASK|$CUR_MGMT_MASK|g"\
 -e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
 -e "s|ROU_1_IP|$ROU_1_IP|g"\
 -e "s|ROU_2_IP|$ROU_2_IP|g"\
 -e "s|DIST_SW1_IP_OR_NH|$DIST_SW1_IP_OR_NH|g"\
 -e "s|DIST_SW2_IP|$DIST_SW2_IP|g"\
 -e "s|ROUTING_PROTO_TYPE|$ROUTING_PROTO_TYPE|g"\
 -e "s|ROUTING_PROC|$ROUTING_PROC|g"\
 -e "s|OSPF_AREA|$OSPF_AREA|g"\
 -e "s|TRUNK_DATA|$TRUNK_DATA|g"\
 -e "s|HSRP|$HSRP|g"\
 $UnID > "./CONFIGS/$UnID/$FILENAME"
done < $1
IFS=$current_ifs; #redefines original IFS
exit
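
A possible simplification of the IFS handling above, sketched with a hypothetical three-column CSV (hostname, template name, output file name) rather than the real column list: putting IFS=, directly in front of "read" limits the change to that one command, so nothing needs to be backed up or restored. The mkdir -p line is an addition of this sketch and assumes the per-type folders may not exist yet.

```shell
#!/bin/bash
workdir=$(mktemp -d)
# Hypothetical two-record CSV: hostname, template name, output file name.
printf 'sw1,SWL3-2,sw1.txt\nsw2,SWL3-2,sw2.txt\n' > "$workdir/devices.csv"
printf 'hostname HNAME\n' > "$workdir/SWL3-2"

# IFS=, applies to this read only, so the shell's IFS is never changed.
# -r stops read from treating backslashes as escape characters.
while IFS=, read -r HNAME UnID FILENAME; do
    mkdir -p "$workdir/CONFIGS/$UnID"    # create the per-type folder if missing
    sed -e "s|HNAME|$HNAME|g" "$workdir/$UnID" > "$workdir/CONFIGS/$UnID/$FILENAME"
done < "$workdir/devices.csv"

generated=$(cat "$workdir/CONFIGS/SWL3-2/sw2.txt")
echo "$generated"
```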

edidataguy · July 27, 2009, 11:57pm

Double check your code.
Why are these lines repeated in your code? Intentionally done?

-e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
 -e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
 -e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\
 -e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\

I would suggest changing all the lines as follows:

; "s|CUR_MGMT_INT|$CUR_MGMT_INT|"\
; "s|CUR_MGMT_INT|$CUR_MGMT_INT|"\

Why do you need the "-e" and the "/g" options?

Regarding the " " issue, post the input and the output.

ppucci · July 28, 2009, 8:33am

Hey, thank you for that... I did not see it was duplicated... it did not keep the script from running, it just added unnecessary steps to it.

Well, I am new to sed, but from what I read, -e tells sed that what follows is another part of the same script (instead of running sed "1..."; sed "2..."; sed "3...", you use sed -e "1..." -e "2..." -e "3..."). As for g, it replaces all occurrences of the searched text, doesn't it? The placeholders may appear more than once in the template files...
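
A quick way to see what the g flag and -e do, using made-up input lines rather than the real templates:

```shell
#!/bin/bash
line='IP_ADD mask IP_ADD'   # made-up line with the placeholder twice

# Without g, only the first match on each line is replaced.
first_only=$(echo "$line" | sed "s/IP_ADD/10.0.0.1/")

# With g, every match on the line is replaced.
all=$(echo "$line" | sed "s/IP_ADD/10.0.0.1/g")

# Several -e expressions are combined into one script for a single sed run.
two=$(echo 'HNAME INT' | sed -e 's/HNAME/r1/' -e 's/INT/Vlan997/')

echo "$first_only"
echo "$all"
echo "$two"
```

So g only matters when a placeholder can occur more than once on the same line. A substitution without g is still applied to every line of the file.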

Here is one record from the source file (all sensitive data was replaced, line break for readability):

persio@dynaserver:~/CUS$ more test
,"United States",226,,10.10.10.@,,10.10.10.@,255.255.255.128,Springfield,coresw1.cutomer.com,
WS-C3750G-12S-E,SERIALNO,SWIA8,12.2(35)SE5,10.10.10.@,255.255.255.192,Vlan997,"Switch Co
re L3",1,10.10.10.10,,,,2,ospf 12345,area 0,hsrp,,,,SWL3-2,#N/A,226-10_10_10_@-SWL3-2.txt
persio@dynaserver:~/CUS$

This record will use the following template according to its UnID (SWL3-2):

persio@dynaserver:~/CUS$ more SWL3-2
config t
!
vlan 996
desc Management VLAN
!
interface vlan 996
ip address IP_MGMT NEW_MASK
!
router ROUTING_PROCESS
network IP_MGMT 0.0.0.0 OSPF_AREA
!
! Trunk Configuration
TRUNK_DATA
! End trunk configuration
end
persio@dynaserver:~/CUS$

And when I run the script, this is the output file:

persio@dynaserver:~/CUS$ ./generate_config.sh test
reading test
persio@dynaserver:~/CUS$ more CONFIGS/SWL3-2/226-10_10_10_@-SWL3-2.txt
config t
!
vlan 996
 desc Management VLAN
!
interface vlan 996
 ip address 10.10.10.@ 255.255.255.128
!
router "ospf 10226"ESS
 network 10.10.10.@ 0.0.0.0 "area 0"
!
! Trunk Configuration
interface FastEthernet0/0
 switchport trunk allowed vlan add 996
! End trunk configuration
end
persio@dynaserver:~/CUS$

In the listings above, the placeholders (in the template) are replaced by the data (in the output). The quotes and the trailing ESS are unwanted characters that sed is including in the output file. I want to understand (1) why this is happening and (2) what I can do to prevent it.

edidataguy · July 28, 2009, 2:37pm

Well, too many things in one go.
Let's split them.
Regarding:

 
-e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
-e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
-e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\
-e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\

You need not call sed again and again.
As an example, the code can be as follows:

sed \
"s|HNAME|$HNAME|;   \
 s|REGION|$REGION|;  \
 s|SITE_ID|$SITE_ID|; \
.............."

In your case you don't need the "/g". Without it, the command will run faster.
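
For comparison, a sketch of the single-script form with semicolons, using made-up values. As far as I know, several -e expressions and one semicolon-separated script are both executed by a single sed process, so the choice between them is mostly about readability:

```shell
#!/bin/bash
# Made-up values for illustration.
HNAME=coresw1
REGION=NA
SITE_ID=226

# One quoted script with the commands separated by semicolons.
semicolons=$(echo 'HNAME REGION SITE_ID' |
    sed "s|HNAME|$HNAME|g; s|REGION|$REGION|g; s|SITE_ID|$SITE_ID|g")

# The same commands passed as separate -e expressions.
dash_e=$(echo 'HNAME REGION SITE_ID' |
    sed -e "s|HNAME|$HNAME|g" -e "s|REGION|$REGION|g" -e "s|SITE_ID|$SITE_ID|g")

echo "$semicolons"
echo "$dash_e"
```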

ppucci · July 28, 2009, 2:51pm

edidataguy:

Well, too many things in one go.
Let's split them.
Regarding:
 
-e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
-e "s|CUR_MGMT_INT|$CUR_MGMT_INT|g"\
-e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\
-e "s|CUR_MGMT_IP|$CUR_MGMT_IP|g"\
You need not call sed again and again.
As an example, the code can be as follows:
sed \
"s|HNAME|$HNAME|;   \
 s|REGION|$REGION|;  \
 s|SITE_ID|$SITE_ID|; \
.............."
In your case you don't need the "/g". Without it, the command will run faster.

Ok, I'll give it a try without "-e" and "/g"... as for the duplicate replacements, I already scrapped them

---------- Post updated at 01:51 PM ---------- Previous update was at 01:43 PM ----------

Tried it... worked just fine. In this particular case, speed was not really an issue, as the script ran in about 10 seconds or less (for a 950-record source file).

Will keep that in mind on my next scripts!

edidataguy · July 28, 2009, 7:15pm

Regarding the " " and the ESS thing, all that I can think of is that there is some junk char somewhere.
The first thing I would do is search for \r\n and replace it with \n in both the input and the code, and run them again.
Else look for some other chars which you cannot see.
Beyond this I cannot help you, sorry.
If you do fix it, please do get back.
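
One way to do the \r\n check suggested above, sketched with a hypothetical CSV file created just for the test:

```shell
#!/bin/bash
workdir=$(mktemp -d)
# Hypothetical one-line CSV saved with Windows (CRLF) line endings.
printf 'coresw1,ospf 12345\r\n' > "$workdir/dos.csv"

# od -c prints control characters visibly; a \r shows up right before \n.
od -c "$workdir/dos.csv"

# Count the lines that contain a carriage return.
cr_lines=$(grep -c $'\r' "$workdir/dos.csv" || true)

# tr -d '\r' deletes every carriage return from the stream.
tr -d '\r' < "$workdir/dos.csv" > "$workdir/unix.csv"
cr_after=$(grep -c $'\r' "$workdir/unix.csv" || true)

echo "before: $cr_lines, after: $cr_after"
```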

ppucci · July 28, 2009, 8:28pm

I fixed the quotes... I actually rebuilt the CSV file from the Excel original. No more quotes now. However, I still get the ESS. I was careful to open the CSV file in Notepad++ (a Windows application), which allows me to convert the file to UNIX format. After that, I opened the file on Ubuntu with gnumeric: all good, no hidden chars.

Funny thing is that I only get it on one specific field (ROUTING_PROC), where I usually have ospf 1234 or eigrp 1234 as a value (the numbers may vary). It keeps producing ospf 1234ESS, BUT it will not do it to eigrp 1234 values. That is rather strange. Did you have the chance to try it using the script and the sample input record?

Thank you again for your attention!

edidataguy · July 28, 2009, 8:46pm

Post both the input records,
"ospf 1234" and "eigrp 1234".
I have a feeling it is messed up because of the field next to or before it (#?).
Just a hunch.

Another way is to change this line in your template:
router ROUTING_PROCESS
to
router xxx ROUTING_PROCESS yyy
see what happens.
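
One detail that is visible in the posts above and might explain the ESS: the script substitutes ROUTING_PROC, while the SWL3-2 template contains ROUTING_PROCESS. sed replaces the part that matches and leaves the remaining ESS in place. Whether the templates used for the eigrp devices spell the placeholder differently is not shown in the thread, so treat this as a hypothesis. A sketch with a made-up value:

```shell
#!/bin/bash
ROUTING_PROC='ospf 12345'   # made-up value for illustration

# The placeholder in the template is longer than the pattern in the script,
# so the unmatched tail "ESS" survives the substitution.
partial=$(echo 'router ROUTING_PROCESS' | sed "s|ROUTING_PROC|$ROUTING_PROC|g")

# Matching the full placeholder name leaves nothing behind.
full=$(echo 'router ROUTING_PROCESS' | sed "s|ROUTING_PROCESS|$ROUTING_PROC|g")

echo "$partial"
echo "$full"
```

The same can happen with any pair of placeholder names where one is contained in the other, so giving placeholders unambiguous names (or delimiters around them) in both the template and the script would avoid it.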