FreddyG
January 28, 2009, 10:05am
1
I have a file, not really a CSV, but containing delimited data just the same. Let's call that file "raw_data.txt". It contains data in the format company name:FEIN number, like this:
first company name:123456789
second company name:987654321
What I need to do is read this file, apply SHA-1 to the FEIN field, and then write the result to a new file, which we'll call "hashed_data.txt". So hashed_data.txt should contain data like
first company name:f7c3bc1d808e04732adf679965ccc34ca7ae3441
second company name:bfe54caa6d483cc3887dce9d1b8eb91408f1ea7a
I have tried looping through the file with awk like so
read FILE
while read line
do
awk 'BEGIN{FS=OFS=":"}{print $1, $2}'
done < $FILE
but I'm stuck on how to run the second field through sha1sum.
danmero
January 28, 2009, 10:20am
2
bash solution.
IFS=":" && \
while read name fein
do echo "$name:"$(sha1 -qs $fein)
done < file > newfile
Note: You should read the sha1sum man page.
FreddyG
January 28, 2009, 11:20am
3
Thanks! Checking it out now.
rwuerth
January 28, 2009, 11:27am
4
If your echo supports '-n', have you tried putting that into your echo statement?
echo -n "$name:"$(sha1sum <<< $fein)
I'm not sure if it will or won't work; I just don't know if it's been tried.
FreddyG
January 28, 2009, 11:29am
5
I think the command I'm actually going for is sha1sum. When I modified the first snippet you posted to this, it seemed to work:
IFS=":" && \
while read name fein
do echo "$name:"$(sha1sum <<< $fein)
done < raw_data.txt > hashed_data.txt
However, a trailing newline is getting hashed along with the value.
The correct hash of
123456789
should be
f7c3bc1d808e04732adf679965ccc34ca7ae3441
echo -n 123456789 | sha1sum
will produce this hash,
so I need a way to trim the newline out of there, something like
tr -d '\n'
but where would it go in the above snippet?
FreddyG
January 28, 2009, 11:34am
6
rwuerth:
If your echo supports '-n', have you tried putting that into your echo statement?
echo -n "$name:"$(sha1sum <<< $fein)
I'm not sure if it will or won't work; I just don't know if it's been tried.
That seems to just remove the newline between the output lines, i.e. the output is now
first company name:179c94cf45c6e383baf52621687305204cef16f9 -second company name:a1b42d633e975efc2f665bda21f94e419c1b6074 -
rwuerth
January 28, 2009, 11:47am
7
Okay, take the -n out of the echo.
Try this before the echo statement:
fein=`echo $fein | tr -d '\n'`
Or, as I think about it, just
fein=`echo -n $fein`
FreddyG
January 28, 2009, 11:56am
8
OK, I have
IFS=":" && \
while read name fein
do
fein=`echo -n $fein`
echo "$name:"$(sha1sum <<< $fein)
done < raw_data.txt > hashed_data.txt
Same result:
first company name:179c94cf45c6e383baf52621687305204cef16f9 -
second company name:a1b42d633e975efc2f665bda21f94e419c1b6074 -
meh...
rwuerth
January 28, 2009, 11:57am
9
Also, I just realized: the "<<<" used for input to sha1sum automatically supplies a final newline!
Can you do it with a normal redirect, i.e. '<'?
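To make the newline point concrete, here is a small sketch comparing the two ways of feeding a value to sha1sum. The function names sha1_exact and sha1_herestring are made up for illustration, and the sketch assumes bash plus GNU coreutils' sha1sum and cut. Only the first call should reproduce the f7c3bc1d... hash quoted earlier in the thread; the second hashes "123456789" plus a newline and so prints a different value.

```shell
#!/usr/bin/env bash

# Hypothetical helper: hash the argument exactly as given.
# printf '%s' writes the string with no trailing newline.
sha1_exact() {
    printf '%s' "$1" | sha1sum | cut -d' ' -f1
}

# Hypothetical helper: hash the argument via a here-string.
# bash appends a newline to here-string input, so the hash differs.
sha1_herestring() {
    sha1sum <<< "$1" | cut -d' ' -f1
}

sha1_exact 123456789
sha1_herestring 123456789
```

The cut at the end only drops the "  -" that sha1sum prints in place of a file name when it reads standard input.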
rwuerth
January 28, 2009, 12:04pm
10
It's not pretty, but maybe this will work:
echo "$name:"$(echo -n $fein | sha1sum)
FreddyG
January 28, 2009, 12:13pm
11
I think we have it!
IFS=":" && \
while read name fein
do
echo "$name:"$(echo -n $fein | sha1sum)
done < raw_data.txt > hashed_data.txt
FreddyG
January 28, 2009, 12:17pm
12
Yup, that got it! Thanks so much for your help, everybody.
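For anyone finding this thread later, here is a slightly more defensive sketch of the same loop. It is not what was posted above, just one way to tidy it: hash_fein_file is a hypothetical function name, and the sketch assumes bash and GNU coreutils' sha1sum. It scopes IFS to the read instead of changing it for the whole shell, quotes the variables, uses printf rather than echo -n (whose behavior varies between shells), and strips the trailing "  -" that sha1sum prints for standard input.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not from the thread.
# Usage: hash_fein_file INPUT_FILE OUTPUT_FILE
hash_fein_file() {
    local infile=$1 outfile=$2 name fein hash
    # IFS=: applies only to this read, so the rest of the shell is unaffected.
    # The || test still processes a last line that lacks a trailing newline.
    while IFS=: read -r name fein || [ -n "$name" ]; do
        # printf '%s' passes the FEIN to sha1sum without a trailing newline;
        # cut keeps only the hex digest and drops the "  -" marker.
        hash=$(printf '%s' "$fein" | sha1sum | cut -d' ' -f1)
        printf '%s:%s\n' "$name" "$hash"
    done < "$infile" > "$outfile"
}
```

With the file names used in this thread, the call would be: hash_fein_file raw_data.txt hashed_data.txt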