Removing trailing zeros using sed

grajp002 · March 23, 2010, 9:44pm

Hello All,
I have a CSV file with 3 columns. The file looks like this:

47850000,100,233
23560000,10000,456
78650000,560000,54
34000000,3456,3

Each value in the first column has 4 trailing zeroes. I have to remove those 4 trailing zeroes from the first field. The output file should look like this:

4785,100,233
2356,10000,456
7865,560000,54
3400,3456,3

How can I achieve this using sed?

durden_tyler · March 23, 2010, 9:53pm

Maybe something like this ?

$ cat -n f9
     1 47850000,100,233
     2 23560000,10000,456
     3 78650000,560000,54
     4 34000000,3456,3
$ sed 's/^\([0-9][0-9][0-9][0-9]\)[^,]*\(,.*\)$/\1\2/' f9
4785,100,233
2356,10000,456
7865,560000,54
3400,3456,3

tyler_durden

kurumi · March 23, 2010, 9:57pm

sed 's/^\(.[^,]*\)0000,\(.*\)/\1,\2/' file

rdcwayx · March 23, 2010, 10:17pm

awk -F "," '{ if ($1~/0000$/) {$1=substr($1,1,length($1)-4)}}1' OFS="," urfile

alister · March 23, 2010, 10:42pm

sed 's/0000,/,/' file

daptal · March 23, 2010, 10:48pm

I think it would remove 0's from each field.

It should be

sed 's/0000,/,/1' file

and that would still be wrong: if the first field does not end in zeroes, it would remove them from a subsequent field.

alister · March 23, 2010, 11:30pm

The 1 substitution flag is redundant. By default, sed substitutes only the first match on each line, unless you give it a number or g. There is no difference at all between "sed 's/0000,/,/1'" and "sed 's/0000,/,/'".
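
For anyone who wants to check the flag behaviour, here is a small sketch that runs the same substitution with no flag, with 1, with 2, and with g. The sample line and the variable names are made up for illustration and are not from the original poster's file.

```shell
# Hypothetical sample line: three fields ending in "0000," plus a last field.
line='10000,20000,30000,x'

# No flag: only the first match on the line is replaced.
no_flag=$(printf '%s\n' "$line" | sed 's/0000,/,/')

# Flag 1: replace the first match only, which is the same as the default.
flag_1=$(printf '%s\n' "$line" | sed 's/0000,/,/1')

# Flag 2: replace only the second match.
flag_2=$(printf '%s\n' "$line" | sed 's/0000,/,/2')

# Flag g: replace every match on the line.
flag_g=$(printf '%s\n' "$line" | sed 's/0000,/,/g')

printf '%s\n' "$no_flag" "$flag_1" "$flag_2" "$flag_g"
# 1,20000,30000,x
# 1,20000,30000,x
# 10000,2,30000,x
# 1,2,3,x
```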

The original poster stated that the first column contains 4 trailing zeroes. My sed will remove them. If only some of the rows in the first column contain the trailing zeroes, then the mistake is in the original poster's problem statement for being imprecise. I assume that the problem is stated correctly and choose not to overcomplicate my proposed solution.

Regards,
Alister

---------- Post updated at 11:30 PM ---------- Previous update was at 11:25 PM ----------

However, if it did need to handle a first field that may not contain the four trailing zeroes, the correct sed would be:

sed 's/^\([^,]*\)0000,/\1,/' file
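
A quick sketch of the difference between the unanchored and the anchored command, assuming a second row whose first field has no trailing zeroes. That row and the variable names are hypothetical and only there to show the edge case.

```shell
# First row is from the original sample; the second row is a made-up
# edge case where only the second field ends in 0000.
rows=$(printf '%s\n' '47850000,100,233' '4785,10000,233')

# Unanchored: the first "0000," anywhere on the line is replaced,
# so the second row loses zeroes from its second field.
unanchored=$(printf '%s\n' "$rows" | sed 's/0000,/,/')

# Anchored: [^,]* cannot cross a comma, so the match is confined to the
# first field and the second row is left untouched.
anchored=$(printf '%s\n' "$rows" | sed 's/^\([^,]*\)0000,/\1,/')

printf 'unanchored:\n%s\n' "$unanchored"
# 4785,100,233
# 4785,1,233
printf 'anchored:\n%s\n' "$anchored"
# 4785,100,233
# 4785,10000,233
```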

ygemici · March 24, 2010, 5:42am

Or maybe:
sed 's/\(^[0-9][0-9]*\)0000,/\1,/g' file

alister · March 24, 2010, 10:40am

The 'g' flag has no effect on this substitution command. Since the pattern is anchored to the beginning of the line, it cannot match in more than one place (if it could match multiple times, then the global flag would be an error since this problem only requires substituting the first occurrence).
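
This can be checked with a line that has "0000," in more than one field. The line below is made up for the test, and the sketch writes the anchor as ^\( rather than \(^ on the assumption that this form is the more portable one. The unanchored command is included only for contrast.

```shell
# Hypothetical line: the first three fields all end in "0000,".
line='47850000,10000,20000,5'

# Anchored pattern, with and without g: ^ can only match at the start of
# the line, so there is at most one match either way.
with_g=$(printf '%s\n' "$line" | sed 's/^\([0-9][0-9]*\)0000,/\1,/g')
without_g=$(printf '%s\n' "$line" | sed 's/^\([0-9][0-9]*\)0000,/\1,/')

# Unanchored pattern with g, for contrast: every field ending in 0000 is hit.
unanchored_g=$(printf '%s\n' "$line" | sed 's/0000,/,/g')

printf '%s\n' "$with_g" "$without_g" "$unanchored_g"
# 4785,10000,20000,5
# 4785,10000,20000,5
# 4785,1,2,5
```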

Regards,
Alister

asalman.qazi · March 24, 2010, 12:41pm

Try this one:

sed 's/^\([^,|0]*\)00*\(.*\)/\1\2/' filename

Jairaj · March 24, 2010, 12:49pm

Try this:

rev infile | awk -F ',' '{ print $1 "," $2 "," substr($3,5) }' | rev

alister · March 24, 2010, 1:39pm

asalman.qazi's sed command assumes that there are no zeroes in the first column before the trailing four. If there is one, the pattern will match a leading or embedded zero, leaving the trailing four intact. Also, if that other zero immediately precedes the trailing four, the command will remove five zeroes (or more).

$ cat data
47850000,100,233
20560000,10000,456
78600000,560000,54
34000000,3456,3

$ sed 's/^\([^,|0]*\)00*\(.*\)/\1\2/' data
4785,100,233
2560000,10000,456
786,560000,54
34,3456,3

I strongly suggest that the above code not be used unless the only zeroes in the first field are the trailing four.

Regards,
Alister

ygemici · March 24, 2010, 5:13pm

I know this already, but there is no problem with this command.
The global flag is not necessary in this example, but I used it because it makes no difference.

For example, I added one more field that ends in zeroes (1212000000) to the first line:

[root@rhnserver ~]# sed 's/\(^[0-9][0-9]*\)0000,/\1,/g' data
4785,100,233,1212000000
2056,10000,456
7860,560000,54
3400,3456,3
[root@rhnserver ~]# sed 's/\(^[0-9][0-9]*\)0000,/\1,/' data
4785,100,233,1212000000
2056,10000,456
7860,560000,54
3400,3456,3

This shows that the g flag does not cause any error and that the result is the same.
The g flag replaces every match on the line instead of only the first one. Our pattern starts with ^, so there can be only one match per line, whether or not the g flag is given.

Thanks for your ideas