perl - Removing dots (".") from the data

Hi Friends,

I have a file as below:

file1.txt

 
100000.||1925-01-10|00:00|1862SHERMA NAVE#1SE.||EVTON|IL|60201||22509.|BDSS|62007|2639.|26670
100001.||1935-01-10|00:00|1862NEW . YRK NO.||EVTON|IL|60201||22509.|BDSS|62007|2639.|26670.
100002.||1965-01-10|00:00|1862 IND . INC,CL .NAVE#1SE.||EVTON|IL|60201||22.509.|BDSS|62007.|2.639.|26670

I am using the command below to remove the dots (".") that come before a "|":

 
perl -pe 's/\.(?=[|\s])//g' file1.txt
 
100000||1925-01-10|00:00|1862SHERMA NAVE#1SE||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100001||1935-01-10|00:00|1862NEW  YRK NO||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100002||1965-01-10|00:00|1862 IND  INC,CL .NAVE#1SE||EVTON|IL|60201||22.509|BDSS|62007|2.639|26670

This removes the trailing dots from the integer fields, but it also removes dots from the address field (the 5th field). I want to remove a dot only when the field is numeric and the dot comes immediately before "|".

Expected output:

 
100000||1925-01-10|00:00|1862SHERMA NAVE#1SE.||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100001||1935-01-10|00:00|1862NEW . YRK NO.||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100002||1965-01-10|00:00|1862 IND . INC,CL .NAVE#1SE.||EVTON|IL|60201||22.509|BDSS|62007|2.639|26670

Please help.

Try

perl -pe 's/\.\|/\|/g' file
100000||1925-01-10|00:00|1862SHERMA NAVE#1SE||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100001||1935-01-10|00:00|1862NEW . YRK NO||EVTON|IL|60201||22509|BDSS|62007|2639|26670.
100002||1965-01-10|00:00|1862 IND . INC,CL .NAVE#1SE||EVTON|IL|60201||22.509|BDSS|62007|2.639|26670

@Pamu Thanks for the quick response. The dot in the 5th field is still being removed. I don't want to remove the dots (".") from text fields; I only want to remove the "." from the number fields.

Expected output:

 
100000||1925-01-10|00:00|1862SHERMA NAVE#1SE.||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100001||1935-01-10|00:00|1862NEW . YRK NO.||EVTON|IL|60201||22509|BDSS|62007|2639|26670
100002||1965-01-10|00:00|1862 IND . INC,CL .NAVE#1SE.||EVTON|IL|60201||22.509|BDSS|62007|2.639|26670

See if this works:

perl -pe 's/(^|(?<=\|))([0-9.]+)\.(\||$)/$2$3/g' file

@Scrutinizer: Thanks for the reply. It works as expected. Thank you :) Would you mind explaining the command?

Hi, the substitution applies wherever there is a match that consists of:
group 1: the beginning of a line (^) or a position just after a field separator ((?<=\|)), followed by
group 2: one or more digits or dots ([0-9.]+), followed by
a dot (\.), followed by
group 3: a field separator (\|) or the end of the line ($).

The match is then replaced by the contents of groups 2 and 3, which drops the dot between them.
Group 1 uses a lookbehind for the previous field separator ((?<=\|)), so that separator is not part of the actual match. This is important: group 3 already consumes the separator that follows the field, and the regex resumes searching where the previous match ended. If group 1 also had to consume a separator, a field that directly follows a matched field could never match.


@Scrutinizer: Thank you so much :)