Inserting a column into a CSV from multiple files

g9100 · February 15, 2014, 4:44am

Hello,

I have a spec file that contains a lot of strings that look like this:

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
  Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
   Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 4GB (1x4GB) DDR3, PC3-1600MHz, 750GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

These strings need to be converted to an HTML table and then inserted into a master .csv for uploading.
The master .csv looks like this:

price,product code, SPECS,other things,
  300.00,CODE 2112334,    ,OTHER STRINGS,
  500.00,CODE 2222222,    ,OTHER STRINGS,

And the final .csv file should look like this:

price,product code, SPECS,other things,
  300.00,CODE 2112334, <table style="width:300px"><tr><td>Processor</td><td>Intel i3 3220 (Dual Core 3.30GHz)</td></tr><tr><td>Memory</td><td> 2GB (1x2GB) DDR3 PC3-1600MHz</td></tr><tr><td>Hard Disk</td><td>500GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Other features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Other features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></table>,OTHER STRINGS,
  500.00,CODE 2222222, <table style="width:300px"><tr><td>Processor</td><td>Intel i5 3470 (Quad Core 3.20GHz)</td></tr><tr><td>Memory</td><td> 4GB (1x4GB) DDR3 PC3-1600MHz</td></tr><tr><td>Hard Disk</td><td>750GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Other features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Other features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></table>,OTHER STRINGS,

This is what I have done so far:

First I need to "clean" the specs and get proper CSV strings, using the following script:

#iterate with a glob instead of parsing ls output
 for f in *.csv
 do
 #fix newline from file (read the whole file first so that both substitutions run)
 sed -i ':a;N;$!ba;s/NBD   \n/NBD,/g;s/"//g' "$f" 

 #fix csv & and remove unessesery strings
 sed -i 's/"PC/PC/g;s/Core\,/Core/g;s/3\,/3./g;s/3MB\,//g;s/6MB\,//g;s/6MB//g;s/w   \///g;s/7,200/7200/g;s/site\"/site/g;s/3MB//g;s/3\,/3\./g;s/w\///g;s/3\,/3\./g;s/Cache\,)/Cache/g;s/ Internal Dell Business Audio Speaker\,//g;' "$f"

#don't know how to remove symbols with sed, using awk instead
awk 'NR==FNR {a[$1]=$2;next} {for (i in a) gsub(i,a[i])}1' template "$f" >temp.txt
mv temp.txt "$f"
done

Then I use this script to populate the HTML table:

#!/bin/bash

#iterate with a glob instead of parsing ls output
for f in *.csv
do
#split csv into 1 line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_

#populate html file and create .html files
for file in output/*.csv
do

#set IFS for read only, so word splitting elsewhere in the script is unaffected
while IFS=, read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do

echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> " 
echo "<tbody>"  
echo "<tr>  "   
echo "<td class=\"specsTitle\">Box</td> "
echo "<td class=\"specsDescript stripeBottom\">$f2</td> "
echo "</tr>     "   
echo "<tr>  "   
<snip>
done < "$file" > output/temp.txt
mv output/temp.txt "$file.html"
done
done
#remove not important .csv
rm output/*.csv

So at this point I have several .html files in the output folder that need to go into the final .csv.
This is the part I am not sure how to do.
Any comments/improvements on the above scripts are also welcome.

spacebar · February 17, 2014, 8:19pm

Take a look at the "join" command but you will need to assign a key to each record so that it will join 1 to 1, 2 to 2 and so on.
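
As a rough sketch of that approach (an illustration, not something run against the real files): number the lines of both files to create the key, then let join pair them up. The function name join_by_line_number and the tab separator are assumptions made for this example. The keys are zero-padded because join expects its input sorted on the key as text, and it assumes each record sits on a single line.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): pair line N of file $1 with
# line N of file $2 by using the line number as the join key.
join_by_line_number() {
    tab=$(printf '\t')
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    # Prefix each line with a zero-padded line number and a tab.
    awk '{ printf "%08d\t%s\n", NR, $0 }' "$1" > "$tmp1"
    awk '{ printf "%08d\t%s\n", NR, $0 }' "$2" > "$tmp2"
    # Join on the key (field 1), then drop the key again.
    join -t "$tab" "$tmp1" "$tmp2" | cut -f 2-
    rm -f "$tmp1" "$tmp2"
}
```

Called as join_by_line_number csvfile.csv specfile.txt, it prints each csv line, a tab, and the matching spec line. Lines without a partner in the other file are dropped, which is join's default behaviour.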

You could also do it all in one script. Here is a quick little Perl script as an example:

my $in_file_1   =  '/temp/tmp/m';             ## file with quoted strings
my $in_file_2   =  '/temp/tmp/t';             ## csv file with text to join info from file_1 to
#my $out_file    =  '/temp/tmp/new_file.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache'
);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
#open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
#  print $out_file_fh $out_line . "\n";

}

g9100 · February 19, 2014, 2:17am

Hi,

Thank you for helping, but I must be doing something wrong.
I am using a specfile.txt that looks like this:

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

and a csvfile.csv input file like this

,243720,1,
,243721,2,
,244773,3,

I modified your script a bit; it now looks like this:

#!/bin/perl

my $in_file_1   =  'specfile.txt';             ## file with quoted strings
my $in_file_2   =  'csvfile.csv';             ## csv file with text to join info from file_1 to
my $out_file    =  'outputfile.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
  print $out_file_fh $out_line . "\n";

}

The output that I get is

,243720,1,
,243721,2,
,2447733

and if I use > goodcsv as an output I get

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   ,243720,1,
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site",243721,2,
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   ,2447733

And the desired output should look more like this:

'<p><font size=""2""><span style=""font-weight: bold;""></span></font></p>
  DellTM OptiPlexTM 380  ,           .     : <br />     <br/>     <br />         
<br/> <hr/> <br/>
<table cellspacing="0" cellpadding="0" border="0" width="100%" class="col4"><tbody>
<snip>
</table>    
',243720,1
'<p><font size=""2""><span style=""font-weight: bold;""></span></font></p>
  DellTM OptiPlexTM 380  ,           .     : <br />     <br/>     <br />         
<br/> <hr/> <br/>
<table cellspacing="0" cellpadding="0" border="0" width="100%" class="col4"><tbody>
<tr> <td class="prodTitlespec">         </td> 
</tr> 
</tbody> 
</table> 
<table cellspacing="0" cellpadding="0" border="0" width="100%"> 
<snip>     
</table>    
',243721,2
,244773,3

I guess I am doing something wrong, but I don't know what.

Kind Regards

spacebar · February 19, 2014, 9:34pm

Are the quoted strings in file(csvfile.csv) each on a separate record?
The reason I ask is because your code works for me:

$ cat csvfile.csv

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

$ cat specfile.txt

,243720,1,
,243721,2,
,244773,3,

-- Your code

#!/usr/bin/perl -w
# Script: testit.pl
use strict;
use warnings;

my $in_file_1   =  '/temp/tmp/specfile.txt';            ## file with quoted strings
my $in_file_2   =  '/temp/tmp/csvfile.csv';             ## csv file with text to join info from file_1 to
#my $out_file    =  'outputfile.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
#open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
  chomp $line_f2;
  # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
  chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
#  print $out_file_fh $out_line . "\n";

}

-- output from your code
$ testit.pl

,243720,1,PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core 3.30GHz, HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site
,243721,2,PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i5 3470 (Quad Core 3.20GHz,HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site

g9100 · February 20, 2014, 3:41am

Hi, thank you for taking the time to help me,

The csvfile.csv is like

,243720,1,
,243721,2,
,244773,3,

And the specfile is

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

You posted them the other way around (probably a typo).
Regarding the specfile, keep in mind that there are normally about 10 double lines (I didn't post them all to save space), and each double line is 1 spec that continues after NBD,
e.g. the double line

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

is 1 specline like

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
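
For illustration, the double-line to single-line conversion described here could be sketched in shell with awk. This is only a sketch under assumptions: the function name join_spec_lines is made up, and it assumes that every spec starts with a double quote and spans at most two lines.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): glue each two-line spec in
# file $1 into a single line.
join_spec_lines() {
    awk '
        # A line that opens a quote but does not close it is the first half:
        # keep it (without trailing blanks) and wait for the next line.
        /^"/ && !/"$/ { held = $0; sub(/[ \t]+$/, "", held); next }
        # The line after a held first half is the second half: print both.
        held != ""    { sub(/^[ \t]+/, ""); print held " " $0; held = ""; next }
        # Anything else is already a complete line.
                      { print }
        END           { if (held != "") print held }
    ' "$1"
}
```

Running join_spec_lines specfile.txt > specfile_1line.txt would leave the original file untouched, so the result can be checked before it is used.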

If that causes a problem, I don't mind editing the specfile manually to put each spec on one line,
so that's not a big deal for me.
But one major thing that is missing from this script is that, before the final output, each specline has to be inserted into a table, so the line

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD  Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

should be converted to an HTML table before merging with the final .csv.
I do this with a script like this:

#!/bin/bash

for f in *.csv
do
#split csv into 1line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_

#populate html file and create .html files
for file in output/*.csv
do

#set IFS for read only, so word splitting elsewhere in the script is unaffected
while IFS=, read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do

echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> "    
echo "<tbody>"    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Case    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    CPU    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f2</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Operating System    </td> "
echo "<td class=\"specsDescript stripeBottom\">$f7</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Hard disk    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f5</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Memory     </td>"
echo "<td class=\"specsDescript stripeBottom\">$f4</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f6</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    VGA    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Mobile Intel® Graphics Media Accelerator $f3</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Keyboard    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Mouse    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> "
echo "</tr>     "            
echo "</tbody>     "    
echo "</table>    "

done < "$file" > output/temp.txt
mv output/temp.txt "$file.html"
done
done
#remove not important .csv
rm output/*.csv

The above script produces several .html files in the output folder. The content of these .html files should then be merged with the csvfile.csv file.
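
A minimal shell sketch of that merge step, under stated assumptions: the .html files are named data_00.csv.html, data_01.csv.html and so on (which is what split -d plus the mv above would produce), file N belongs to row N of csvfile.csv, and the HTML goes in front of the row as one quoted CSV field. The function names are hypothetical. Doubling the embedded double quotes and wrapping the field in quotes is the usual CSV convention for fields that contain quotes or commas.

```shell
#!/bin/sh
# Hypothetical helper: print the HTML file $1 as a single quoted CSV field.
html_to_csv_field() {
    # Put everything on one line, double embedded quotes, wrap in quotes.
    tr -d '\r' < "$1" | tr '\n' ' ' | sed 's/"/""/g; s/^/"/; s/ *$/"/'
}

# Hypothetical helper: prefix row N of the CSV file $1 with the HTML found
# in $2/data_NN.csv.html; rows without an HTML file are passed through.
merge_html_into_csv() {
    n=0
    while IFS= read -r row; do
        html=$(printf '%s/data_%02d.csv.html' "$2" "$n")
        if [ -f "$html" ]; then
            printf '%s%s\n' "$(html_to_csv_field "$html")" "$row"
        else
            printf '%s\n' "$row"
        fi
        n=$((n + 1))
    done < "$1"
}
```

It would be called as merge_html_into_csv csvfile.csv output > final.csv. Keeping each table on one line is a choice made for simplicity here; a quoted CSV field may also contain line breaks, but not every importer accepts them.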

Kind Regards

spacebar · February 20, 2014, 8:05pm

OK, the code has been updated to handle quoted strings that span 2 lines in file(specfile) and to create a formatted file (i.e. HTML) as output:

use strict;
use warnings;

my $in_file_1   =  '/temp/tmp/specfile.txt';            ## file with quoted strings
my $in_file_2   =  '/temp/tmp/csvfile.csv';             ## csv file with text to join info from specfile to
my $out_file    =  '/temp/tmp/outputfile.txt';          ## formatted file
my $line_f1;
my $line_f2;
my $work_line = '';     # start empty so the first concatenation does not warn
my @fields;
my $fields_index;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

LINE: while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f1    =  <$in_file_fh_1>;       # Read first part of quoted string from file that is on two lines
  chomp $line_f1;
  $work_line = $work_line . $line_f1;   # Place line just read at end of variable
  next LINE if $line_f1 =~ /^\"/;       # Read next line to get 2nd part of quoted string
  # perform hash replacements
  $work_line  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $work_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $work_line  =~  s!^\"!!;
  $work_line  =~  s!\"$!!;

  $line_f2    =  <$in_file_fh_2>;           # Read matching line from csvfile
  chomp $line_f2;
  $work_line = $line_f2 . $work_line;       # Place csvfile line just read at beginning of variable
  
  print $work_line . "\n";                # Print complete line for visual reference/debugging

  # Split line into fields array and print each element for visual reference/debugging
  @fields  =  split( ( ',' ),$work_line );
  for $fields_index ( 0 .. $#fields ) {
      print $fields[$fields_index] . "\n";
  }
  print "\n";

  # Create output formatted lines with text and fields from array
  for $fields_index ( 0 .. $#fields ) {
    $out_line = "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> \n" .
                "<tbody>\n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Case    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    CPU    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[3]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Operating System    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">$fields[8]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Hard disk    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[6]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Memory     </td>\n" .
                "<td class=\"specsDescript stripeBottom\">$fields[5]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[7]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    VGA    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Mobile Intel® Graphics Media Accelerator $fields[4]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Keyboard    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Mouse    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> \n" .
                "</tr>     \n" .
                "</tbody>     \n" .
                "</table>    \n";
  }
  print $out_file_fh $out_line . "\n";     # Print formatted line to outfile
  $work_line = "";
}
close $in_file_fh_1;
close $in_file_fh_2;
close $out_file_fh;

Hopefully this is an example to show how you can do all in one script!

g9100 · February 21, 2014, 5:24pm

Hi,

It seems that I managed to get the result I wanted,
mainly an outputfile.csv file that I can import to my website.
I decided to manually edit the specfile from a 2-line spec to a 1-line spec so I can use this script on other products too.
Although it seems to work fine (I opened the outputfile.csv file with LibreOffice and it seems to be OK), I hope that I could get a final comment from you regarding this script, as this is my first attempt with Perl and I have almost no clue what I am doing.
So here it goes:

#!/usr/bin/perl -w

my $in_file_2   =  'specfile.txt';             ## file with quoted strings
my $in_file_1   =  'csvfile.csv';             ## csv file with text to join info from file_1 to
my $out_file    =  'outputfile.csv';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty:' => ' ',
 'Intel'      => ',Intel',
 '00)'      => ',00',
 '(D'      => 'D'

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
  print $out_file_fh $line_f1 . "\' \n";
  $work_line = $line_f1 . $out_line . "\n";


  # Split line into fields array and print each element for visual reference/debugging
  @fields  =  split( ( ',' ),$work_line );
  for $fields_index ( 0 .. $#fields ) {
      print $fields[$fields_index] . "\n";
  }
  print "\n";
  # Create output formatted lines with text and fields from array
  for $fields_index ( 0 .. $#fields ) {
    $out_line = "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> \n" .
                "<tbody>\n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Case    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    CPU    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[3]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Operating System    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">$fields[8]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Hard disk    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[6]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Memory     </td>\n" .
                "<td class=\"specsDescript stripeBottom\">$fields[5]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[7]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    VGA    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Mobile Intel® Graphics Media Accelerator $fields[4]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Keyboard    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Mouse    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> \n" .
                "</tr>     \n" .
		 "<tr>     \n" .
                "<td class=\"specsTitle\">    Warranty    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">  $fields[9] </td> \n" .
                "</tr>     \n" .
                "</tbody>     \n" .
                "</table>    \n \'";
  }
  print $out_file_fh $out_line . "\n";     # Print formatted line to outfile

}

Kind Regards

spacebar · February 21, 2014, 8:11pm

Check out this info: Beginner's Introduction to Perl - Perl.com

I would add the closes from my last example and get into the habit of closing files when you are through with them. From perlintro - perldoc.perl.org:
When you're done with your filehandles, you should close() them (though to be honest, Perl will clean up after you if you forget)

Glad I could help!!

g9100 · March 1, 2014, 8:26am

Hi again,

I totally forgot that the csv is missing the headers,
something like header1,header2,header3, at the beginning of the output file.
I can't seem to find a way to add them only at the top of the output.
Could you please help?

Kind regards

spacebar · March 10, 2014, 7:12pm

Just write the headers right after opening the output file and before writing the detail lines; see the example below:

open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";
# Create header line in output file before writing detail lines.
print $out_file_fh "header1,header2,header3,etc.....\n";