Inserting a column in a CSV from multiple files

Hello,

I have a spec file that contains a lot of strings that look like this:

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
  Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
   Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 4GB (1x4GB)      DDR3, PC3-1600MHz, 750GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr   Basic Warranty NBD on site"

These strings need to be converted to an HTML table and then inserted into a master .csv for uploading.
The master .csv looks like this:

price,product code, SPECS,other things,
  300.00,CODE 2112334,    ,OTHER STRINGS,
  500.00,CODE 2222222,    ,OTHER STRINGS,

And the final .csv file should look like this:

price,product code, SPECS,other things,
  300.00,CODE 2112334, <table style="width:300px"><tr><td>Processor</td><td>Intel i3 3220 (Dual Core 3.30GHz)</td></tr><tr><td>Memory</td><td> 2GB (1x2GB) DDR3 PC3-1600MHz</td></tr><tr><td>Hard Disk</td><td>500GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Other features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Other features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></table>,OTHER STRINGS,
  500.00,CODE 2222222, <table style="width:300px"><tr><td>Processor</td><td>Intel i5 3470 (Quad Core 3.20GHz)</td></tr><tr><td>Memory</td><td> 4GB (1x4GB) DDR3 PC3-1600MHz</td></tr><tr><td>Hard Disk</td><td>750GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Other features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Other features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></table>,OTHER STRINGS,

This is what I have done so far:

First I need to "clean" the specs and get proper CSV strings, using the following script:

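# Iterate with a glob instead of parsing the output of ls
for f in *.csv
do
    # Read the whole file, then join each two-line record at "NBD" and drop the double quotes.
    # (In the original loop the quote removal was never reached, because "ba" branched unconditionally.)
    sed -i ':a;$!{N;ba};s/NBD   \n/NBD,/g;s/"//g' "$f"

    # Fix the CSV and remove unnecessary strings
    sed -i 's/"PC/PC/g;s/Core\,/Core/g;s/3\,/3./g;s/3MB\,//g;s/6MB\,//g;s/6MB//g;s/w   \///g;s/7,200/7200/g;s/site\"/site/g;s/3MB//g;s/3\,/3\./g;s/w\///g;s/3\,/3\./g;s/Cache\,)/Cache/g;s/ Internal Dell Business Audio Speaker\,//g;' "$f"

    # I don't know how to remove symbols with sed, so I use awk instead.
    # The template file holds a pattern in column 1 and its replacement in column 2.
    awk 'NR==FNR {a[$1]=$2;next} {for (i in a) gsub(i,a[i])}1' template "$f" > temp.txt
    mv temp.txt "$f"
done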

Then I use this script to populate the HTML table:

#!/bin/bash

# Iterate with a glob instead of parsing the output of ls
for f in *.csv
do
# split csv into 1 line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_

# populate html file and create .html files
for file in output/*.csv
do

# IFS is set only for read, so the comma separator does not leak into the rest of the script
while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do

echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> " 
echo "<tbody>"  
echo "<tr>  "   
echo "<td class=\"specsTitle\">Box</td> "
echo "<td class=\"specsDescript stripeBottom\">$f2</td> "
echo "</tr>     "   
echo "<tr>  "   
<snip>
done < "$file" > output/temp.txt
mv output/temp.txt "$file.html"
done
done
# remove the .csv files that are no longer needed
rm output/*.csv

So at this point I have several .html files in the output folder that need to go into the final .csv.
This is the part I am not sure how to do.
Any comments/improvements on the above scripts are also welcome :)

Take a look at the "join" command but you will need to assign a key to each record so that it will join 1 to 1, 2 to 2 and so on.
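As a rough illustration of the key idea, here is a minimal shell sketch. The sample rows, the file names and the add_key helper are made up for the example, and it assumes each HTML table has already been flattened to one line and that no row contains a "|" character. The keys are zero-padded because join expects its input sorted on the key as text, and unpadded numbers would sort 1, 10, 2.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the real files.
workdir=$(mktemp -d)
printf '%s\n' ',243720,1,' ',243721,2,' > "$workdir/csvfile.csv"
printf '%s\n' '<table>spec one</table>' '<table>spec two</table>' > "$workdir/tables.txt"

# add_key (hypothetical helper): prefix every record with a zero-padded
# line number followed by "|", so that both files share a join key.
add_key() {
  awk '{ printf "%06d|%s\n", NR, $0 }' "$1"
}
add_key "$workdir/csvfile.csv" > "$workdir/csv.keyed"
add_key "$workdir/tables.txt"  > "$workdir/tables.keyed"

# Join record 1 to record 1, 2 to 2, and so on; then drop the key
# and the "|" that join leaves between the two halves.
merged=$(join -t '|' "$workdir/csv.keyed" "$workdir/tables.keyed" | cut -d '|' -f 2- | sed 's/|//')
printf '%s\n' "$merged"
rm -rf "$workdir"
```

This only appends the table after the existing columns; putting it into a specific column would need an extra step.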

You could also do it all in one script. Here is a quick little Perl script as an example:

my $in_file_1   =  '/temp/tmp/m';             ## file with quoted strings
my $in_file_2   =  '/temp/tmp/t';             ## csv file with text to join info from file_1 to
#my $out_file    =  '/temp/tmp/new_file.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache'
);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
#open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
#  print $out_file_fh $out_line . "\n";

}

Hi,

Thank you for helping, but I must be doing something wrong.
Using a specfile.txt that looks like this:

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

and a csvfile.csv input file like this:

,243720,1,
,243721,2,
,244773,3,

I modified your script a bit; it now looks like this:

#!/bin/perl

my $in_file_1   =  'specfile.txt';             ## file with quoted strings
my $in_file_2   =  'csvfile.csv';             ## csv file with text to join info from file_1 to
my $out_file    =  'outputfile.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
  print $out_file_fh $out_line . "\n";

}

The output that I get is:

,243720,1,
,243721,2,
,2447733

and if I redirect the output with > goodcsv I get:

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   ,243720,1,
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site",243721,2,
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   ,2447733

And the desired output should look more like this:

'<p><font size=""2""><span style=""font-weight: bold;""></span></font></p>
  DellTM OptiPlexTM 380  ,           .     : <br />     <br/>     <br />         
<br/> <hr/> <br/>
<table cellspacing="0" cellpadding="0" border="0" width="100%" class="col4"><tbody>
<snip>
</table>    
',243720,1
'<p><font size=""2""><span style=""font-weight: bold;""></span></font></p>
  DellTM OptiPlexTM 380  ,           .     : <br />     <br/>     <br />         
<br/> <hr/> <br/>
<table cellspacing="0" cellpadding="0" border="0" width="100%" class="col4"><tbody>
<tr> <td class="prodTitlespec">         </td> 
</tr> 
</tbody> 
</table> 
<table cellspacing="0" cellpadding="0" border="0" width="100%"> 
<snip>     
</table>    
',243721,2
,244773,3

I guess I am doing something wrong, but I don't know what.

Kind Regards

Are the quoted strings in file(csvfile.csv) each on a separate record?
The reason I ask is because your code works for me:

$ cat csvfile.csv

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

$ cat specfile.txt

,243720,1,
,243721,2,
,244773,3,

-- Your code

#!/usr/bin/perl -w
# Script: testit.pl
use strict;
use warnings;

my $in_file_1   =  '/temp/tmp/specfile.txt';            ## file with quoted strings
my $in_file_2   =  '/temp/tmp/csvfile.csv';             ## csv file with text to join info from file_1 to
#my $out_file    =  'outputfile.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
#open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
  chomp $line_f2;
  # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
  chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
#  print $out_file_fh $out_line . "\n";

}

-- output from your code
$ testit.pl

,243720,1,PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core 3.30GHz, HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site
,243721,2,PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i5 3470 (Quad Core 3.20GHz,HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site

Hi, thank you for taking the time to help me,

The csvfile.csv is like

,243720,1,
,243721,2,
,244773,3,

And the specfile is

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

You posted them the other way around (probably a typo).
Regarding the specfile, keep in mind that there are normally about 10 double lines (I didn't post them all, to save space), and each double line is one spec that continues after NBD.
E.g. the double line

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

is one spec line, like this:

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

If that causes a problem, I don't mind editing the specfile manually so that each spec is on one line.
So that's not a big deal for me.
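For reference, the manual edit could probably be avoided. The following is a minimal sketch, assuming every spec is wrapped in exactly one pair of double quotes and contains no other double quotes; the sample records are shortened, made-up stand-ins for the real specfile. It keeps appending physical lines until the number of double quotes seen is even, which means the quoted string has been closed.

```shell
#!/bin/sh
# Hypothetical, shortened sample with the same shape as specfile.txt:
# each record starts with a quote on one line and ends with a quote on the next.
specfile=$(mktemp)
printf '%s\n' \
  '"PC DELL i3 / 5Y NBD   ' \
  'Intel i3 3220, 2GB, 500GB HDD"' \
  '"PC DELL i5 / 5Y NBD   ' \
  'Intel i5 3470, 4GB, 750GB HDD"' > "$specfile"

one_line=$(awk '
  {
    buf = buf $0                       # append this physical line to the pending record
    quotes += gsub(/"/, "&")           # count the double quotes seen so far in the record
    if (quotes > 0 && quotes % 2 == 0) {
      gsub(/  +/, " ", buf)            # squeeze the padding left in front of the line break
      print buf                        # the quoted string is closed: emit one logical line
      buf = ""; quotes = 0
    }
  }
' "$specfile")
printf '%s\n' "$one_line"
rm -f "$specfile"
```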
But one major thing that is missing from your Perl script is that, before the final output, each spec line has to be inserted into a table, so the line

"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD  Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

should be converted to an HTML table before merging with the final .csv.
I do this with a script like this:

#!/bin/bash

for f in *.csv
do
#split csv into 1line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_

#populate html file and create .html files
for file in output/*.csv
do

# IFS is set only for read, so the comma separator does not leak into the rest of the script
while IFS="," read -r f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do

echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> "    
echo "<tbody>"    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Case    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    CPU    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f2</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Operating System    </td> "
echo "<td class=\"specsDescript stripeBottom\">$f7</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Hard disk    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f5</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Memory     </td>"
echo "<td class=\"specsDescript stripeBottom\">$f4</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f6</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    VGA    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Mobile Intel&reg; Graphics Media Accelerator $f3</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Keyboard    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Mouse    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> "
echo "</tr>     "            
echo "</tbody>     "    
echo "</table>    "

done < "$file" > output/temp.txt
mv output/temp.txt "$file.html"
done
done
# remove the .csv files that are no longer needed
rm output/*.csv

The above script produces several .html files in the output folder. The content of these .html files should then be merged with the csvfile.csv file.
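One thing to watch when merging: the HTML contains double quotes and line breaks, so it has to be quoted before it can sit in a single CSV field. Below is a minimal sketch that assumes the importer accepts the common convention of wrapping the field in double quotes and doubling any embedded double quote; whether the target website expects exactly that is not confirmed here. csv_quote is a made-up helper name, and the two-line fragment is a shortened sample.

```shell
#!/bin/sh
# csv_quote (hypothetical helper): read an HTML fragment on stdin,
# double every embedded double quote, join the lines with a space,
# and wrap the result in one pair of double quotes.
csv_quote() {
  awk '{ gsub(/"/, "\"\""); out = out (NR > 1 ? " " : "") $0 }
       END { printf "\"%s\"", out }'
}

# Shortened sample fragment standing in for one generated .html file.
html='<td class="specsTitle">CPU</td>
<td class="specsDescript">Intel i3 3220</td>'

field=$(printf '%s\n' "$html" | csv_quote)

# Emit one CSV record: the quoted HTML followed by the other columns.
printf '%s,%s,%s\n' "$field" 243720 1
```

Flattening the line breaks keeps each record on one physical line, which makes the result easier to check with line-based tools.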

Kind Regards

OK, the code has been updated to handle quoted strings that span two lines in the file (specfile) and to create a formatted file (i.e. HTML) as output:

use strict;
use warnings;

my $in_file_1   =  '/temp/tmp/specfile.txt';            ## file with quoted strings
my $in_file_2   =  '/temp/tmp/csvfile.csv';             ## csv file with text to join info from specfile to
my $out_file    =  '/temp/tmp/outputfile.txt';          ## formatted file
my $line_f1;
my $line_f2;
my $work_line = '';                                     # start empty so the first concatenation does not warn
my @fields;
my $fields_index;
my $out_line;

# hash for values that have a replacement with value
# (keys are matched literally because they pass through quotemeta below, so they take no backslash escapes)
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,)'   => 'Cache',
 '0)'        => '0',
 'Graphics)' => ' ',
 'GHz GHz)'  => ' ',
 'GHz) GHz)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W)'      => ' ',
 'Warranty:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

LINE: while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f1    =  <$in_file_fh_1>;       # Read first part of quoted string from file that is on two lines
  chomp $line_f1;
  $work_line = $work_line . $line_f1;   # Place line just read at end of variable
  next LINE if $line_f1 =~ /^\"/;       # Read next line to get 2nd part of quoted string
  # perform hash replacements
  $work_line  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $work_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $work_line  =~  s!^\"!!;
  $work_line  =~  s!\"$!!;

  $line_f2    =  <$in_file_fh_2>;           # Read matching line from csvfile
  chomp $line_f2;
  $work_line = $line_f2 . $work_line;       # Place csvfile line just read at beginning of variable
  
  print $work_line . "\n";                # Print complete line for visual reference/debugging

  # Split line into fields array and print each element for visual reference/debugging
  @fields  =  split( ( ',' ),$work_line );
  for $fields_index ( 0 .. $#fields ) {
      print $fields[$fields_index] . "\n";
  }
  print "\n";

  # Create output formatted lines with text and fields from array
  for $fields_index ( 0 .. $#fields ) {
    $out_line = "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> \n" .
                "<tbody>\n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Case    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    CPU    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[3]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Operating System    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">$fields[8]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Hard disk    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[6]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Memory     </td>\n" .
                "<td class=\"specsDescript stripeBottom\">$fields[5]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[7]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    VGA    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Mobile Intel&reg; Graphics Media Accelerator $fields[4]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Keyboard    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Mouse    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> \n" .
                "</tr>     \n" .
                "</tbody>     \n" .
                "</table>    \n";
  }
  print $out_file_fh $out_line . "\n";     # Print formatted line to outfile
  $work_line = "";
}
close $in_file_fh_1;
close $in_file_fh_2;
close $out_file_fh;

Hopefully this is an example to show how you can do all in one script!

Hi,

It seems that I managed to get the result I wanted :D
Mainly an outputfile.csv file that I can import into my website.
I decided to manually edit the specfile from a two-line spec to a one-line spec, so I can use this script on other products too.
Although it seems to work fine (I opened the outputfile.csv file with LibreOffice and it seems to be OK), I hope I could get a final comment from you regarding this script, as this is my first attempt with Perl and I have almost no clue what I am doing :)
So here it goes:

#!/usr/bin/perl -w

my $in_file_2   =  'specfile.txt';             ## file with quoted strings
my $in_file_1   =  'csvfile.csv';             ## csv file with text to join info from file_1 to
my $out_file    =  'outputfile.csv';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty:' => ' ',
 'Intel'      => ',Intel',
 '00)'      => ',00',
 '(D'      => 'D'

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove beginning and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
  print $out_file_fh $line_f1 . "\' \n";
  $work_line = $line_f1 . $out_line . "\n";


  # Split line into fields array and print each element for visual reference/debugging
  @fields  =  split( ( ',' ),$work_line );
  for $fields_index ( 0 .. $#fields ) {
      print $fields[$fields_index] . "\n";
  }
  print "\n";
  # Create output formatted lines with text and fields from array
  for $fields_index ( 0 .. $#fields ) {
    $out_line = "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> \n" .
                "<tbody>\n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Case    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    CPU    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[3]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Operating System    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">$fields[8]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Hard disk    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[6]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Memory     </td>\n" .
                "<td class=\"specsDescript stripeBottom\">$fields[5]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[7]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    VGA    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Mobile Intel&reg; Graphics Media Accelerator $fields[4]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Keyboard    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Mouse    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Warranty    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">  $fields[9] </td> \n" .
                "</tr>     \n" .
                "</tbody>     \n" .
                "</table>    \n \'";
  }
  print $out_file_fh $out_line . "\n";     # Print formatted line to outfile

}

Kind Regards

Check out this info: Beginner's Introduction to Perl - Perl.com

I would add the close calls from my last example and get into the habit of closing files when you are through with them. From perlintro (perldoc.perl.org):
When you're done with your filehandles, you should close() them (though to be honest, Perl will clean up after you if you forget)

Glad I could help!! :)

Hi again,

I totally forgot that the CSV is missing the headers,
something like header1,header2,header3, at the beginning of the output file.
I can't seem to find a way to add it only at the top of the output.
Could you please help?

Kind regards

Just write the headers right after opening the output file and before writing the detail lines; see the example below:

open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";
# Create header line in output file before writing detail lines.
print $out_file_fh "header1,header2,header3,etc.....\n";
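
The same ordering works in a shell script. This is a small sketch with made-up header names and rows: the header is written once with ">", which creates or truncates the file, and every detail line is then appended with ">>", so the header stays at the top.

```shell
#!/bin/sh
# Hypothetical output file; the header names and rows are placeholders.
out_file=$(mktemp)

# Write the header first. ">" truncates, so it ends up as the only first line.
printf '%s\n' 'header1,header2,header3' > "$out_file"

# Append the detail lines afterwards. ">>" never overwrites the header.
for row in ',243720,1' ',243721,2'; do
  printf '%s\n' "$row" >> "$out_file"
done

cat "$out_file"
```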