Positional insertion for multibyte characters

Hi

I have a requirement to insert a dot "." after a given position in each line, say the 110th position.

For this, I have written the command below.

cat filename | sed 's/./&\./110' > new_filename

The command works fine, but when the input file contains multibyte (2- or 3-byte) characters, the dot is not inserted at the 110th position. Is there a way to resolve this issue?

What operating system and version of sed are you using?

What codeset is being used to encode the multi-byte characters in your input file?

What locale was being used when you ran the command above?

What do you mean by the 110th location? Do you want to insert a period as the 111th character on the line or do you want to insert a period as the 111th byte on the line?
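
A minimal sketch of how those details might be gathered on a GNU/Linux system. The function names show_environment and is_valid_utf8 are hypothetical, and relying on iconv is an assumption about what is installed. The check only tells you whether the file decodes as UTF-8, not which other codeset it might be in.

```shell
# Hypothetical helper: print the sed version (GNU sed only) and the
# locale settings that a command started from this shell would inherit.
show_environment() {
    sed --version | head -n 1
    locale
}

# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
# iconv exits non-zero when it meets a byte sequence that is invalid
# in the source codeset.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" > /dev/null 2>&1
}
```

For example, running show_environment and then is_valid_utf8 filename && echo "valid UTF-8" would cover the version, locale, and codeset questions.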

Using cat in this pipeline wastes system resources and slows down your script:

sed 's/./&\./110' < filename > new_filename

but fixing that won't change the problem you are reporting.

Hi,

Please find my responses below.

Linux 2.6.32-573.7.1.el6.x86_64
GNU sed version 4.2.1
I am unaware of how the source file was processed
en_US.UTF-8
I want the period at the 110th location

I'm a bit surprised: when I try to reproduce your problem, it does not occur with my sed (GNU sed) 4.2.2 unless sed is run in the C locale:

sed 'p;s/./&\./17' file
1234567890123456789
12345678901234567.89
abcdefghijklmnopqrs
abcdefghijklmnopq.rs
abc����hijklmnopqrs
abc����hijklmnopq.rs
�������ߧ��������ߧ
�������ߧ��������.ߧ
abcdefghijklmnopqrs
abcdefghijklmnopq.rs
LC_ALL=C sed 'p;s/./&\./17' file
1234567890123456789
12345678901234567.89
abcdefghijklmnopqrs
abcdefghijklmnopq.rs
abc����hijklmnopqrs
abc����hijk.lmnopqrs
�������ߧ��������ߧ
��������.��������ߧ
abcdefghijklmnopqrs
abcdefghijklmnopq.rs

Could you be somewhat more specific with your problem?
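
The locale dependence shown in the output above can be made explicit by setting LC_ALL for the sed call itself. This is only a sketch: the function names are hypothetical, and the character variant assumes that the en_US.UTF-8 locale is installed and that the input is valid UTF-8.

```shell
# Hypothetical helper: insert "." after the Nth byte of every input line.
# In the C locale sed treats every byte as one character.
insert_dot_after_byte() {
    LC_ALL=C sed "s/./&./$1"
}

# Hypothetical helper: insert "." after the Nth character of every input line.
# Assumes en_US.UTF-8 is installed; with a UTF-8 locale GNU sed matches
# a whole multibyte character with ".".
insert_dot_after_char() {
    LC_ALL=en_US.UTF-8 sed "s/./&./$1"
}
```

Usage would look like insert_dot_after_char 110 < filename > new_filename. If the two helpers give different results for a line, that line contains multibyte characters before the insertion point.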