Replacing values in a single column. sed/PERL

satir · September 23, 2014, 4:26pm

Hello everyone,
I need to change the second column of my CSV file: remove the dots and replace the dash with a comma, like this:

Input:

1 2 1;12.111.312-2;1.2;2;1-3
2 1 1;11.212.331-1;3.3;1;2-2

Output:

1 2 1;12111312;2;1.2;2;1-3
2 1 1;11212331;1;3.3;1;2-2

SED or PERL commands preferably.

Thanks!

Don_Cragun · September 23, 2014, 4:34pm

You said you wanted "-" in field 2 to be replaced by ","; but your sample output used ";" instead of ",". Which do you really want?

What have you tried?

Why are sed and perl preferable to awk for this?

satir · September 23, 2014, 4:43pm

In fact, I want to replace "-" with a semicolon rather than a comma (mea culpa).
I prefer sed or perl because, in my experience, sed and perl commands have been faster than awk.

Don_Cragun · September 23, 2014, 4:55pm

And, what have you tried?

junior-helper · September 23, 2014, 8:02pm

I assume he doesn't know where to start.

# accurate and fast (recommended)
awk -F";" '{gsub(/\./,"",$2); sub(/-/,";",$2); print}' OFS=";" input

# accurate, but relatively slow
perl -e 'while(<>){ @f=split(/;/); $f[1]=~s/\.//g; $f[1]=~s/-/;/; print join(";",@f); }' input

# very fast, but it ignores columns: it simply deletes the first two occurrences of "." on the line
# and replaces the first occurrence of "-" with ";". It might work for you if the first column
# contains no "." and no "-", AND the second column always contains exactly two "."s and one "-".
sed 's/\.//;s/\.//;s/-/;/' input
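
To make the caveat on the sed one-liner concrete, here is a small sketch. The input line is made up for illustration (it is the first sample line with a "." added to column 1, not data from the original file), and OFS is passed with -v instead of as a trailing assignment, which should behave the same way. The position-blind sed command edits the wrong dots, while the field-aware awk command only touches column 2:

```shell
# Hypothetical line: the first sample line, but with a "." in column 1.
line='1.5 2 1;12.111.312-2;1.2;2;1-3'

# Position-blind sed: removes the first two "." on the line, wherever they are.
naive=$(printf '%s\n' "$line" | sed 's/\.//;s/\.//;s/-/;/')

# Field-aware awk: only edits the second ";"-separated field.
aware=$(printf '%s\n' "$line" |
    awk -F";" -v OFS=";" '{gsub(/\./,"",$2); sub(/-/,";",$2); print}')

printf 'sed: %s\nawk: %s\n' "$naive" "$aware"
```

With this line, sed eats the dot in column 1 and leaves one dot behind in column 2, so the shortcut is only safe when the data really has the fixed shape described above.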

satir · September 24, 2014, 2:13am

I can't select just the 2nd column. Replacing is easy with the sed -i command, but I can only do that for the whole file. To remove the dots, I can use grep -Ev, but how do I select just the 2nd column to make these changes?

Thanks

Julien

Don_Cragun · September 24, 2014, 2:51am

I prefer using awk (as suggested by junior-helper) because it is easy for me to read and immediately understand. I believe the following sed script also does what you want, but for many users this is less readable and harder to understand:

sed '
:again
/^\([^;]*;[^.;]*\)[.]/s//\1/
t again
/^\([^;]*;[^-;]*\)-/s//\1;/
' Input

Fast is nice; but I'm not sure that sed is going to be any faster than awk for this. Which script do you find easier to understand?
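
Before comparing speed, it is worth confirming that the two commands agree. The following is only a sketch: it feeds the two sample lines from the first post to both the awk command and the sed script and compares the results. The variable names are made up for the example, and OFS is passed with -v rather than as a trailing assignment:

```shell
# The two sample input lines from the first post.
sample='1 2 1;12.111.312-2;1.2;2;1-3
2 1 1;11.212.331-1;3.3;1;2-2'

# junior-helper's awk command, reading from standard input.
awk_out=$(printf '%s\n' "$sample" |
    awk -F";" -v OFS=";" '{gsub(/\./,"",$2); sub(/-/,";",$2); print}')

# The sed script from this post, reading from standard input.
sed_out=$(printf '%s\n' "$sample" | sed '
:again
/^\([^;]*;[^.;]*\)[.]/s//\1/
t again
/^\([^;]*;[^-;]*\)-/s//\1;/
')

if [ "$awk_out" = "$sed_out" ]; then
    echo "awk and sed agree:"
    printf '%s\n' "$sed_out"
else
    echo "outputs differ" >&2
fi
```

Once both produce the same result, prefixing each command with time and running it on a realistically large file is a simple way to see whether there is any meaningful speed difference on your system.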

pilnet101 · September 24, 2014, 3:17am

Depending on your shell version, you can use bash alone without calling external tools.

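# IFS=";" applies only to read, which splits each line into
# column 1, column 2, and the rest of the line.
while IFS=";" read -r first second rest; do
    second=${second//./}        # delete every "." in column 2
    printf '%s\n' "$first;${second/-/;};$rest"    # first "-" in column 2 becomes ";"
done < inputfile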

Because it is simpler and the performance difference is trivial, I would still recommend using awk. It's nice to have several approaches to choose from, though.