ReV
February 23, 2006, 9:13am
1
Hi,
I am not sure how to approach this, so I am hoping for some advice on how to start.
I have 2 files. The source file contains the data I need, in columns delimited by ";". For example, in this format:
"CONTINENT","COUNTRY","CITY","ID"
"asia","japan","tokyo","123"
"europe","germany","munich","456"
"africa","eygpt","cairo","789"
The output is in XML format and its contents are something like:
<?xml version="1.0"?>
<!DOCTYPE ALR [
<!ENTITY gt ">">
<!ENTITY lt "<">
<!ENTITY quot """>
]>
<datafile>
<info ID="#id should be here#">
<info1>#continent name should be here#</info1>
<info2> </info2>
<info3>#country name should be here#</info3>
<info4>#city name should be here#</info4>
<info5>some fixed data</info5>
<info6> </info6>
<info7> </info7>
<info8>some fixed data</info8>
<info9>some fixed data</info9>
<info10>some fixed data</info10>
<info11>some fixed data</info11>
</info>
<action>
<attributes>
<attributkey id="011" >N</attributkey>
<attributkey id="106" >N</attributkey>
<attributkey id="114" >N</attributkey>
<attributkey id="119" >N</attributkey>
</attributes>
<comment> </comment>
</action>
</datafile>
There should be one output file created for each line in the source file.
Can anyone help? Thanks a lot!
vino
February 23, 2006, 9:33am
2
See the solution in "Want to show files on web page".
You need to do something similar.
Something like this:
HEADER='<?xml version="1.0"?>
<!DOCTYPE ALR [
<!ENTITY gt ">">
<!ENTITY lt "<">
<!ENTITY quot """>
]>
<datafile>
<info ID="'
function header {
echo "$HEADER" > output.xml
echo "$1" >> output.xml
echo "\">" >> output.xml
}
Parse the country specific details as
while read line
do
CONTINENT=$(echo "$line" | awk -F, '{ print $1 }')
COUNTRY=$(echo "$line" | awk -F, '{ print $2 }')
.
.
header $ID
done < input.file
where header is the method given above.
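Note that the sample data in the first post wraps every field in double quotes, so a plain split on "," leaves the quotes in each value. A minimal sketch of one way to strip them before splitting, assuming no field contains an embedded comma or quote (the variable names are only for illustration):

```shell
#!/bin/sh
# Sample line in the quoted, comma-separated format from the first post.
line='"asia","japan","tokyo","123"'

# Remove every double quote, then let read split the rest on commas.
# Assumption: no field contains an embedded comma or double quote.
clean=$(printf '%s\n' "$line" | tr -d '"')

# IFS is set only for this one read command.
IFS=',' read -r continent country city id <<EOF
$clean
EOF

printf 'continent=%s country=%s city=%s id=%s\n' \
    "$continent" "$country" "$city" "$id"
```

This also avoids starting one awk process per field, since read does the splitting itself.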
Use Ruby:
$out_count = 0
def process( str )
str.strip!
continent,country,city,id = str[1..-2].split(/","/)
$out_count += 1
File.open( "output_#{ $out_count }.xml", "w" ){|handle|
handle.puts <<"HERE_DOC"
<?xml version="1.0"?>
<!DOCTYPE ALR [
<!ENTITY gt ">">
<!ENTITY lt "<">
<!ENTITY quot """>
]>
<datafile>
<info ID="#{ id }">
<info1>#{ continent }</info1>
<info2> </info2>
<info3>#{ country }</info3>
<info4>#{ city }</info4>
<info5>some fixed data</info5>
<info6> </info6>
<info7> </info7>
<info8>some fixed data</info8>
<info9>some fixed data</info9>
<info10>some fixed data</info10>
<info11>some fixed data</info11>
</info>
<action>
<attributes>
<attributkey id="011" >N</attributkey>
<attributkey id="106" >N</attributkey>
<attributkey id="114" >N</attributkey>
<attributkey id="119" >N</attributkey>
</attributes>
<comment> </comment>
</action>
</datafile>
HERE_DOC
}
end
# Discard 1st line of file.
gets
# Read rest of file.
while gets
process( $_ )
end
Save this code in file "make_xml.rb".
Run it with
ruby make_xml.rb my_input_file
The output files are named "output_1.xml", etc.
ReV
February 24, 2006, 4:30am
4
What is the ruby command? When I tried to enter the command line you have written, it says "ruby:command not found"
Ruby is a free programming language.
ReV
February 24, 2006, 4:40am
6
I can't seem to use it in my unix interface. What should I do?
It's readily available and open source. Could you download and install it?
Try this link: http://www.ruby-lang.org/en/20020102.html
ReV
February 24, 2006, 5:21am
8
Hi,
I cannot install it because I am not the system admin, and I am using a company server. I guess I can't use Ruby.
ReV
February 24, 2006, 5:28am
9
Hi,
I couldn't get the 2nd field into the output file. Can you please advise on how to add the 2nd field and so on?
vino
February 24, 2006, 5:29am
10
Show us what you have done so far.
vino
February 24, 2006, 5:34am
11
vino:
Parse the country specific details as
while read line
do
CONTINENT=$(echo "$line" | awk -F, '{ print $1 }')
COUNTRY=$(echo "$line" | awk -F, '{ print $2 }')
.
.
header $ID
done < input.file
where header is the method given above.
would be much better off as:
IFS=","
while read conti country city id
do
CONT=$conti
COUNTRY=$country
CITY=$city
ID=$id
header $ID
done < input.file
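Putting the pieces together, here is a rough sketch of the whole job with one output file per input line, as the first post asks. It is only an illustration under a few assumptions: the input is unquoted and comma-separated, the XML template is shortened (the DOCTYPE and the fixed elements are left out), and the names write_record, workdir and output_<id>.xml are made up for this example.

```shell
#!/bin/sh
# Hypothetical working directory and sample input, for illustration only.
workdir=$(mktemp -d)
cat > "$workdir/input.file" <<'EOF'
asia,japan,tokyo,123
europe,germany,munich,456
EOF

# Write one XML file for a single record.
# Arguments: $1 = continent, $2 = country, $3 = city, $4 = id.
write_record() {
    cat > "$workdir/output_$4.xml" <<EOF
<?xml version="1.0"?>
<datafile>
<info ID="$4">
<info1>$1</info1>
<info2> </info2>
<info3>$2</info3>
<info4>$3</info4>
</info>
</datafile>
EOF
}

# IFS is set only for the read command, so the rest of the script
# keeps the default field separator.
while IFS=',' read -r conti country city id
do
    write_record "$conti" "$country" "$city" "$id"
done < "$workdir/input.file"

ls "$workdir"
```

A here-document keeps the template readable in one place and puts each value on the same line as its tags, and naming the file after the ID means a later record does not overwrite an earlier one.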
ReV
February 24, 2006, 6:55am
12
There are some changes in the requirements for the output file: one more field is needed.
There are also 2 more fields in the input file, and its 2nd line is not needed.
This is what I have done:
#!/bin/ksh
HEADER='<?xml version="1.0"?>
<!DOCTYPE ALR [
<!ENTITY gt ">">
<!ENTITY lt "<">
<!ENTITY quot """>
]>
<datafile>
<info ID="'
BODY1=' <info1>'
BODY2=' <info2> </info2>
<info3>'
BODY3=' <info4>'
BODY4=' <info5>0</info5>
<info6> </info6>
<info7> </info7>
<info8>'
BODY5=' <info9>some fixed data</info9>
<info10>some fixed data</info10>
<info11>some fixed data</info11>
</info>
<action>
<attributes>
<attributkey id="011" >N</attributkey>
<attributkey id="106" >N</attributkey>
<attributkey id="114" >N</attributkey>
<attributkey id="119" >N</attributkey>
</attributes>
<comment> </comment>
</action>
</datafile>'
INPUTFILE=/home/dir/input.csv
OUTPUTFILE=/home/dir/output.xml
function header {
echo "$HEADER" > $OUTPUTFILE
echo "$id" >> $OUTPUTFILE
echo "\">" >> $OUTPUTFILE
}
function body1 {
echo "$BODY1" >>$OUTPUTFILE
echo "$continent" >>$OUTPUTFILE
echo "</info1>" >>$OUTPUTFILE
}
function body2 {
echo "$BODY2" >>$OUTPUTFILE
echo "$country" >>$OUTPUTFILE
echo "</info3>" >>$OUTPUTFILE
}
function body3 {
echo "$BODY3" >>$OUTPUTFILE
echo "$city" >>$OUTPUTFILE
echo "</info4>" >>$OUTPUTFILE
}
function body4 {
echo "$BODY4" >>$OUTPUTFILE
echo "$date" >>$OUTPUTFILE
echo "</info8>" >>$OUTPUTFILE
}
function body5 {
echo "$BODY5" >>$OUTPUTFILE
}
IFS=;
while read continent country city id address date
do
ID=$id
CONTINENT=$continent
COUNTRY=$country
CITY=$city
DATE=$date
header $ID
body1 $CONTINENT
body2 $COUNTRY
body3 $CITY
body4 $DATE
body5
done < $INPUTFILE
The input file contents:
continent;country;city;id;address;date
---------;-----------;----------------;----------;---------;--------
asia;japan;tokyo;123;apple road;12012000
europe;germany;munich;456;orange street;13072001
africa;eygpt;cairo;789;banana lane;06121999
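Since the 2nd line (the row of dashes) is not needed, it can be dropped before the loop sees it. The sketch below also drops the column-name line, which is an assumption on my part: otherwise it would be turned into a record too. The file names and the temporary directory are made up for the example.

```shell
#!/bin/sh
# Hypothetical working directory and a shortened copy of the input file.
workdir=$(mktemp -d)
cat > "$workdir/input.csv" <<'EOF'
continent;country;city;id;address;date
---------;-------;----;--;-------;----
asia;japan;tokyo;123;apple road;12012000
europe;germany;munich;456;orange street;13072001
EOF

# sed '1,2d' deletes lines 1 and 2, so only the data rows remain.
sed '1,2d' "$workdir/input.csv" > "$workdir/data.csv"

# Split each remaining row on ";" only, so "apple road" stays one field.
while IFS=';' read -r continent country city id address date
do
    printf '%s|%s|%s\n' "$id" "$address" "$date"
done < "$workdir/data.csv" > "$workdir/rows.txt"

cat "$workdir/rows.txt"
```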
vino
February 24, 2006, 7:06am
13
In
while read continent country city id address date
do
ID=$id
CONTINENT=$continent
COUNTRY=$country
CITY=$city
DATE=$date
header $ID
done < $INPUTFILE
do you know what header $ID means?
You are invoking the function header with $ID as a parameter.
So inside the header function, this parameter is accessible as $1 or through $@ (whichever you like).
Shell variables are global by default, so header can still see $id, but it ignores the parameter you pass it. It is cleaner to use that parameter, so it should be
function header {
echo "$HEADER" > $OUTPUTFILE
echo "$@" >> $OUTPUTFILE
echo "\">" >> $OUTPUTFILE
}
Likewise, for all the other functions, you should follow suit.
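As a small illustration of positional parameters, the sketch below uses a made-up helper named tag (not part of the script above) that receives the element name as $1 and the value as $2. It uses printf rather than three echo calls, because each echo ends its output with a newline and would put the opening tag, the value and the closing tag on separate lines.

```shell
#!/bin/sh
# Hypothetical helper: wraps its second argument in the tag named by the first.
tag() {
    printf '<%s>%s</%s>\n' "$1" "$2" "$1"
}

continent="asia"

# The caller passes the values; the function only looks at $1 and $2.
line=$(tag info1 "$continent")
printf '%s\n' "$line"
```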
ReV
February 24, 2006, 7:25am
14
Hi,
I have tried with
function header {
echo "$HEADER" > $OUTPUTFILE
echo "$@" >> $OUTPUTFILE
echo "\">" >> $OUTPUTFILE
}
but the values are not carried over from the input to the output file. I get the layout of the output file without any of the values.
vino
February 24, 2006, 7:33am
15
Within your while loop, try echoing all the variables you have assigned. See if they contain the values or not.
ReV
February 24, 2006, 7:40am
16
Hmm.. it seems that none of the variables are assigned. echo produces empty values throughout the while loop.
vino
February 24, 2006, 7:53am
17
Your while loop works just fine. You will have to see what's happening at your end. Probably, the IFS is wrong.
ReV
February 24, 2006, 8:08am
18
I have checked the IFS. It is correct. I am confused.
vino
February 24, 2006, 8:12am
19
Once again, show us what you are doing...
ReV
February 24, 2006, 8:37am
20
Ah... I have got it! The field separator has to be quoted: IFS=";" instead of IFS=;. Thanks a lot for the help!
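For anyone who hits the same problem: an unquoted ";" is the shell's command separator, so IFS=; is read as "set IFS to the empty string" followed by the end of the command. With an empty IFS, read does no splitting, and the whole line lands in the first variable. A small sketch to show the difference (the variable names are only for illustration):

```shell
#!/bin/sh
line='asia;japan;tokyo'
old_ifs=$IFS

# Unquoted: this is "IFS=" (empty value) followed by the command
# separator ";", so read performs no field splitting at all.
IFS=; read -r a b c <<EOF
$line
EOF
unquoted_first=$a
unquoted_second=$b

# Restore the default separator before the second attempt.
IFS=$old_ifs

# Quoted: IFS really is a semicolon, so the line is split into fields.
IFS=';' read -r a b c <<EOF
$line
EOF
quoted_first=$a
quoted_second=$b

printf 'unquoted: first=[%s] second=[%s]\n' "$unquoted_first" "$unquoted_second"
printf 'quoted:   first=[%s] second=[%s]\n' "$quoted_first" "$quoted_second"
```

In the unquoted case the first variable holds the whole line and the second is empty, which matches the empty values seen when echoing inside the loop.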