Unix Script to parse a CSV

I am writing a unix script that will parse a CSV and edit the values. My CSV looks like this:
0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229
366,438,315,319,382,287,398,320,416,382,407,397,342,448,276,392,297,368,237,347,336,332,384,405,412,284,329,350,396,326,356

This script would run every hour on the hour, deleting the first value on a specific line and adding a new record to the end of that same line. For instance, at 8:00am I might delete the 0 from the first line and add a 10 at the end. I also have a version of the CSV that has the time as the first value of each row. I have tried both awk and sed, and I can't figure out how to replace a value at a specific location. I think I can delete the first value in row 1 using the code snippet below, but I don't know how to add the value to the end of the same row. Is there a way to parse the CSV into a two-dimensional array and then output it back into a CSV at the end? It probably isn't the most efficient way, but it would work.
I hope this isn't too confusing. Please let me know if you need more information. Thanks.

 
sed '1s/^[^,]*,//' file.csv

In perl, you'd 'slurp' in the file and break it into an array, one line per element.
You'd request the line of the array that you wanted and split it into a new array.
You'd shift the new array to drop its first value, then push your new value onto the end of it.
Then, you'd turn the array back into a string, and put it back into the same place in the original array.
Finally, you'd rewrite the original file with the new data.

It's not too hard, but expect it to take a couple of hours to write.
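
The split, shift, push, join steps described above can also be done without Perl. Here is a minimal awk sketch, assuming a plain comma-separated file with no quoted fields. The name rotate_row and its argument order are made up for illustration, and the line number is 1-based:

```shell
#!/bin/sh
# rotate_row FILE LINE VALUE   (hypothetical helper, not from the thread)
# Prints FILE to stdout with the first field of line LINE removed and
# VALUE appended to the end of that same line. Other lines are unchanged.
rotate_row() {
    awk -F, -v OFS=, -v row="$2" -v val="$3" '
        NR == row {
            # shift every field one position to the left ...
            for (i = 1; i < NF; i++) $i = $(i + 1)
            # ... and overwrite the last field with the new value
            $NF = val
        }
        { print }
    ' "$1"
}
```

Something like `rotate_row file.csv 1 10 > file.new && mv file.new file.csv` would then update the file. Writing to a temporary file first avoids truncating the input while awk is still reading it.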

sed -e '1s/^[^,]*,//' -e '1s/$/,10/' file

danmero,
Thanks for the help. Is it possible to add the 10 to the end of the line instead of the beginning? This line of code converts
0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
to
,10,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0

Ideally the new line would look like this
0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,10

Here is the output:

#  echo "0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0" | sed -e '1s/^[^,]*,//' -e '1s/$/,10/'
0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,10

It does work when I use the echo command on just the one line, but when I run it on my entire CSV file it shows
,10,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229

I don't know why this is; it doesn't make sense to me.

I'm beginning to think that I should spend a bit more time reviewing sed....

I threw the script together in perl - as an exercise for me, mostly. But if you find value in it, great. It's much longer than the line that danmero provided, and I'm sure that there are perl experts that can turn my code into a "one-liner"...

My script has two inputs <line number> and <new value> - as well as a debug or help flag:

-bash-3.00$ ./parse.pl -l 5 -v 300
-bash-3.00$ ./parse.pl -d -l 5 -v 300
input:
 41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
output:
 18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19,300

And here's the code:

#!/usr/bin/perl -w

################################################################################
# Pragma
use strict;
use Getopt::Std;
use vars qw/ %opt /;

################################################################################
# Forward declaration of subroutines
sub do_init();
sub do_usage();

################################################################################
# Variable declaration
my $input = "./file";
my $output = "./file.output";
my @entireList;
my $entireList;
my $listLineNumber;
my @oneLine;
my $newLine;
my $line;
my $DEBUG;

################################################################################
# You don't need to get your command line variables this way
#   - it's a habit that I got into, and it works for me - your mileage may vary
do_init();

# set DEBUG flag according to command line options
if ( $opt{d} )
{
   $DEBUG = 1;
}
else
{
   $DEBUG = 0;
}

################################################################################
# Begin the MAIN portion of the script
################################################################################

# Open your input file and place the contents into the array @entireList
# - each line is a separate element in the array
open(FILE, "<$input") or die "Cannot open $input for read :$!";
chomp (@entireList = <FILE>);
close( FILE );

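# I haven't really added any error checking for the options. If you don't
# assign a line item or value, the script will still run - it won't substitute
# unless you provide -l <linenumber> AND -v <value>.
# Note: the line number is used directly as an array index, so it is
# zero-based (-l 5 selects the sixth line of the file).

# "defined" is used so that a line number or a value of 0 is still accepted
if ( defined $opt{l} && defined $opt{v} )
{
   $listLineNumber = $opt{l};
   # split string (one line) into a temporary array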
   print "input:\n $entireList[$listLineNumber]\n" if $DEBUG;
   @oneLine = split(/,/, $entireList[$listLineNumber]);
   shift @oneLine;
   push @oneLine, $opt{v};
   # put array back into a string - and back into the original array
   $entireList[$listLineNumber] = join(",",@oneLine);
   print "output:\n $entireList[$listLineNumber]\n" if $DEBUG;
}

# Write the file - here I have it writing to a new file - you can have it
# overwrite the original file if you prefer - simply change the variable
# declaration at the top for $output to match $input - or change the code
# below to use $input instead of $output. I recommend changing this in the
# variable declaration - so that you can reuse the script later
open(FILE2, ">$output") or die "Cannot open $output for write :$!";
foreach $line(@entireList)
{
   print FILE2 "$line\n";
}
close( FILE2 );


################################################################################
# Standard handling message - maybe too much for simple utilities - oh well... #
################################################################################
sub do_init()
{
   my $opt_string = 'dhl:v:';
   getopts( "$opt_string", \%opt ) or do_usage();
   do_usage() if $opt{h};
}
sub do_usage()
{
   print "\nusage: $0 [-h] [-d] [-l {line number}] [-v {new value}]\n\n";
   exit;
}

avronius,
Thanks for your help, but I really need a unix script, not perl; perl is a last resort. If I can't get a unix command that works, I will look into using perl in my script. I have looked into sed, and it seems as though a one-line sed command should work. I just can't get it to work for adding the new value to the end of the line.

Should work, check for typo, use copy/paste

# cat file
0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229
366,438,315,319,382,287,398,320,416,382,407,397,342,448,276,392,297,368,237,347,336,332,384,405,412,284,329,350,396,326,356
# sed -e '1s/^[^,]*,//' -e '1s/$/,10/' file
0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,10
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229
366,438,315,319,382,287,398,320,416,382,407,397,342,448,276,392,297,368,237,347,336,332,384,405,412,284,329,350,396,326,356
# bash --version
GNU bash, version 3.1.17(1)-release (i486-pc-linux-gnu)
Copyright (C) 2005 Free Software Foundation, Inc.
# sed --version
GNU sed version 4.1.5
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
to the extent permitted by law.

Here is a copy and paste of my putty session. It isn't working like yours and I don't know why. Maybe you can see something. I am unable to check the versions on the server. What about using the append option of sed to add the value to the end of the line? I am new to sed and still learning.

 
210> cat file
0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229
366,438,315,319,382,287,398,320,416,382,407,397,342,448,276,392,297,368,237,347,336,332,384,405,412,284,329,350,396,326,356
 
211> sed -e '1s/^[^,]*,//' -e '1s/$/,10/' file
,10,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229
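
On the question of sed's append option: as far as I know, the "a" command adds a whole new line after the addressed line rather than extending that line, so it would not put the value at the end of row 1. A small sketch of the difference follows. The function names are made up for illustration, and both read from standard input:

```shell
#!/bin/sh
# append_line: sed's "a" command inserts a NEW line after line 1
append_line() {
    sed '1a\
10'
}

# extend_line: substituting at the end-of-line anchor extends line 1 itself
extend_line() {
    sed '1s/$/,10/'
}
```

For a two-line input of "a,b" and "c,d", append_line gives three lines (a,b / 10 / c,d), while extend_line keeps two lines and turns the first into a,b,10.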
#!/usr/bin/ksh93
#
# two arguments required ARG1: lineno  ARG2: new_number
#

[[ $# != 2 ]] && {
   print "ERROR - expecting two parameters"
   exit 1
}

# do more validation here if needed
line=$1
new_number=$2

TMP=./file.$$

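# drop the first field of the chosen line ([^,]* so that multi-digit values
# such as 125 are matched too), then append the new value to that line
sed -e "${line}s/^[^,]*,//" -e "${line}s/\$/,${new_number}/" file > $TMP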

mv $TMP file
cat file

exit 0

I get the exact same result using this script as I do when I run the command recommended by danmero.

I figured out the reason things weren't working. I was using an Excel file that was saved as a CSV, so it was not a unix file. When I converted the file to a unix file, things worked fine. Thanks for all your help.
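
For anyone who hits the same symptom, the likely mechanism is this: a CSV saved from Excel on Windows normally ends each line with a carriage return plus a newline. sed's $ matches after the carriage return, so ",10" is written after it, and the terminal then draws ",10" over the start of the line. That would explain output beginning with ",10,0,1". Here is a minimal sketch for detecting and removing the carriage returns. The function names has_cr and to_unix are made up for illustration:

```shell
#!/bin/sh
# has_cr FILE: succeed if at least one line of FILE ends in a carriage return
has_cr() {
    cr=$(printf '\r')
    grep -q "${cr}\$" "$1"
}

# to_unix FILE: print FILE to stdout with all carriage returns removed
to_unix() {
    tr -d '\r' < "$1"
}
```

For example, `has_cr file.csv && to_unix file.csv > file.unix.csv` converts only when needed. Where it is installed, the dos2unix utility does the same job.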

See next post

OK, so my last post wasn't very well thought out. I need to be able to find the standard deviation of each row. The awk resources I have found seem to indicate how to do this on each column, but not on each row.

Here is what I have so far. The problem is I can't get the square root. Also none of my math functions give me floating point numbers but that is a secondary problem.

 
for i in `cat file.csv ` 
do
     x1=0
     x2=0
     sigma=0
     IFS=, 
     for f in $i 
          do  
          let x1=$x1+$f
          let x2=$f*$f+$x2
     done 
     let x1=$x1/30
     let x2=$x2/30
     let sigma=sqrt($x2-$x1*$x1)
     echo "Mean = " $x1
     echo "Standard Deviation = " $sigma
done
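
One possible way around both problems is to do the arithmetic in awk, which works in floating point and has a built-in sqrt(). The sketch below assumes plain numeric comma-separated rows and computes the population standard deviation (the same formula as the loop above). It divides by the number of fields, NF, rather than a fixed 30, since the sample rows appear to hold 31 values. The name row_stats is made up for illustration:

```shell
#!/bin/sh
# row_stats FILE: print the mean and population standard deviation of each row
row_stats() {
    awk -F, '
        NF > 0 {
            sum = 0; sumsq = 0
            for (i = 1; i <= NF; i++) {
                sum   += $i          # running total of the values
                sumsq += $i * $i     # running total of the squares
            }
            mean = sum / NF
            var  = sumsq / NF - mean * mean
            if (var < 0) var = 0     # guard against tiny negative rounding error
            printf "Mean = %.2f\nStandard Deviation = %.2f\n", mean, sqrt(var)
        }
    ' "$1"
}
```

Calling `row_stats file.csv` prints two lines per row. For the sample standard deviation instead, the variance would be scaled by NF / (NF - 1) before taking the square root.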

Please start a new thread for your new problem.