Hello,
I have a spec file that contains a lot of strings that look like this:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 4GB (1x4GB) DDR3, PC3-1600MHz, 750GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
Each of these strings needs to be converted to an HTML table and then inserted into the SPECS column of a master .csv for uploading.
The master .csv looks like this:
price,product code, SPECS,other things,
300.00,CODE 2112334, ,OTHER STRINGS,
500.00,CODE 2222222, ,OTHER STRINGS,
And the final .csv file should look like this:
price,product code, SPECS,other things,
300.00,CODE 2112334, <table style="width:300px"><tr><td>Processor</td><td>Intel i3 3220 (Dual Core 3.30GHz)</td></tr><tr><td>Memory</td><td> 2GB (1x2GB) DDR3 PC3-1600MHz</td></tr><tr><td>Hard Disk</td><td>500GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Other features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Other features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></table>,OTHER STRINGS,
500.00,CODE 2222222, <table style="width:300px"><tr><td>Processor</td><td>Intel i5 3470 (Quad Core 3.20GHz)</td></tr><tr><td>Memory</td><td> 4GB (1x4GB) DDR3 PC3-1600MHz</td></tr><tr><td>Hard Disk</td><td>750GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Other features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Other features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></table>,OTHER STRINGS,
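The SPECS field above is a single-line table made of label/value rows. A minimal sketch of building such a table with printf, assuming the labels and values are already known and contain no characters that need HTML escaping (the function name spec_table is made up for illustration):

```shell
#!/bin/bash
# spec_table LABEL VALUE [LABEL VALUE ...]
# Prints a one-line HTML table with one <tr> per label/value pair.
# Hypothetical helper; no HTML escaping is done on the arguments.
spec_table() {
  printf '<table style="width:300px">'
  while [ "$#" -ge 2 ]; do
    printf '<tr><td>%s</td><td>%s</td></tr>' "$1" "$2"
    shift 2
  done
  printf '</table>\n'
}
```

For example, spec_table Processor "Intel i3 3220" Memory "2GB DDR3" prints a table with two rows on one line, which keeps the result usable as a single CSV field as long as the values contain no commas.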
This is what I have done so far:
First I "clean" the specs to get proper CSV strings, using the following script:
# iterate with a glob instead of parsing ls output
for f in *.csv
do
# read the whole file, join each title line with its detail line (NBD<newline> becomes NBD,), then strip quotes
sed -i ':a;$!{N;ba};s/NBD *\n/NBD,/g;s/"//g' "$f"
# fix stray commas and remove unnecessary strings
sed -i 's/Core,/Core/g;s/3,/3./g;s/3MB,//g;s/6MB,//g;s/6MB//g;s/w \///g;s/7,200/7200/g;s/3MB//g;s/w\///g;s/Cache,)/Cache/g;s/ Internal Dell Business Audio Speaker,//g' "$f"
# don't know how to remove symbols with sed, using awk instead:
# "template" holds one pattern/replacement pair per line; a[i] is the replacement for pattern i
awk 'NR==FNR {a[$1]=$2;next} {for (i in a) gsub(i,a[i])}1' template "$f" >temp.txt
mv temp.txt "$f"
done
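The line-joining step can be tried on its own, without editing any file in place. A minimal sketch, assuming GNU sed and that every title line ends in "NBD" (optionally followed by spaces); the function name join_spec_lines is made up for illustration:

```shell
#!/bin/bash
# join_spec_lines: reads spec text on stdin and writes the result to stdout.
# The sed program first collects the whole input into the pattern space
# (:a;$!{N;ba}), then turns every "NBD<newline>" into "NBD," and finally
# removes all double quotes. Other newlines (between records) are kept.
join_spec_lines() {
  sed ':a;$!{N;ba};s/NBD *\n/NBD,/g;s/"//g'
}
```

Running join_spec_lines < specfile.csv (file name assumed) should print one line per product, which makes it easy to check the result before switching back to sed -i.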
Then I use this script to populate the HTML table:
#!/bin/bash
# iterate with a glob instead of parsing ls output
for f in *.csv
do
# split the csv into one-line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_
# populate the html template and create one .html file per line
for file in output/*.csv
do
# set IFS only for read so the rest of the script is unaffected
while IFS=, read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do
echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> "
echo "<tbody>"
echo "<tr> "
echo "<td class=\"specsTitle\">Box</td> "
echo "<td class=\"specsDescript stripeBottom\">$f2</td> "
echo "</tr> "
echo "<tr> "
<snip>
done <"$file" > output/temp.txt
mv output/temp.txt "$file.html"
done
done
# remove the intermediate one-line .csv files
rm output/*.csv
So at this point I have several .html files in the output folder that need to go into the final .csv.
I am not sure how to do this step.
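A rough sketch of one possible approach, under several assumptions: the N-th data row of the master .csv corresponds to the N-th HTML file passed in, no master field contains an embedded comma, and the HTML itself contains no commas (otherwise the field would need CSV quoting). The function name fill_specs and the file names in the usage note are made up for illustration.

```shell
#!/bin/bash
# fill_specs MASTER HTML_FILE...
# Prints MASTER with field 3 (SPECS) of the N-th data row replaced by the
# contents of the N-th HTML file, flattened to a single line.
# Hypothetical helper; assumes simple comma-separated fields without quoting.
fill_specs() {
  local master=$1
  shift
  local i=0 line html
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "$i" -eq 0 ] || [ "$i" -gt "$#" ]; then
      # header row, or no HTML file left for this row: copy it unchanged
      printf '%s\n' "$line"
    else
      # drop newlines so the whole table fits into one CSV field
      html=$(tr -d '\n' < "${!i}")
      # pass the HTML through the environment so awk does not
      # interpret backslashes in it, then rewrite only field 3
      printf '%s\n' "$line" |
        h="$html" awk -F, -v OFS=, '{ $3 = ENVIRON["h"]; print }'
    fi
    i=$((i + 1))
  done < "$master"
}
```

It could be called as fill_specs master.csv output/*.html > final.csv. The glob expands in sorted order, so this relies on the numeric suffixes from split matching the row order of the master file.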
Any comments or improvements on the above scripts are also welcome.