Help awk/sed: putting a space after numbers to separate numbers and characters.

rveri · March 5, 2013, 12:07pm

Hi Experts,

How can I separate the digits from the letters with a space at the point where the letters begin, or in other words, where the digits end?

file

52087mo(enbatl)
52049mo(enbatl)
52085mo(enbatl)
25051mo(enbatl)

The output should look like:

52087 mo(enbatl)
52049 mo(enbatl)
52085 mo(enbatl)
25051 mo(enbatl)

Thanks a lot,

mirni · March 5, 2013, 12:16pm

Try

sed 's/^[0-9]*/& /' file

Yoda · March 5, 2013, 12:24pm

Another approach:

awk '{ match($0,/^[0-9]*/); print substr($0, RSTART, RLENGTH) " " substr($0, RSTART+RLENGTH) } ' file

Scrutinizer · March 5, 2013, 12:32pm

@mirni. That would work in this case, but if there is also text that begins with a non-digit, then a space would be inserted at the start of the line. To avoid that one could use:

sed 's/^[0-9][0-9]*/& /' file

@bipinajiith, the same applies to your awk approach..
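
For the awk approach, a minimal sketch of the same fix might look like the following. The sample input is made up for illustration (its second line starts with a letter), and the variable names are hypothetical. Requiring at least one digit means match() returns 0 for a line without leading digits, so that line can be printed untouched:

```shell
# Hypothetical sample input: two lines start with digits, one does not.
input='52087mo(enbatl)
mo(enbatl)52087
25051mo(enbatl)'

# [0-9][0-9]* needs at least one digit, so match() returns 0 on the
# second line and the else branch prints it unchanged.
result=$(printf '%s\n' "$input" | awk '{
    if (match($0, /^[0-9][0-9]*/))
        print substr($0, 1, RLENGTH) " " substr($0, RLENGTH + 1)
    else
        print
}')
printf '%s\n' "$result"
```

With the original /^[0-9]*/ pattern, the second line would instead come out with a leading space.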

mirni · March 5, 2013, 12:36pm

Yes, I was matching nothing also. Thank you.

I have a feeling lines may start with letters too, re-reading the OP's post:

In which case I would omit the anchor, and let it match the first group of digits:

sed 's/[0-9][0-9]*/& /' file
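
As a rough illustration (the second input line is invented, not from the OP's file), this is how the unanchored version behaves. Without a g flag, only the first group of digits on each line gets the trailing space:

```shell
# Hypothetical input: one line from the OP's file, one made-up line
# that starts with letters and contains two groups of digits.
input='52087mo(enbatl)
abce1234jklm56'

# No ^ anchor and no g flag: the first run of digits on each line is
# matched, wherever it starts, and later runs are left alone.
first_run=$(printf '%s\n' "$input" | sed 's/[0-9][0-9]*/& /')
printf '%s\n' "$first_run"
```

Here the trailing 56 on the second line stays attached to jklm, because the substitution stops after the first match.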

elixir_sinari · March 5, 2013, 12:39pm

scrutinizer:

@mirni. That would work in this case, but if there is also text that begins with a non-digit, then a space would be inserted at the start of the line. To avoid that one could use:
sed 's/^[0-9][0-9]*/& /' file
@bipinajiith, the same applies to your awk approach..

Ah yes..that's a consequence of the * trying to match as much as possible but also as early as possible. "Being early" wins over "being greedy", if it can.
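
A quick way to see "early beats greedy" is to wrap the match in visible markers. This is only a sketch; the angle brackets and variable names are chosen for illustration:

```shell
line='abce1234jklm'

# [0-9]* may match the empty string, and the earliest place it can do
# so is right before the "a", so that empty match wins.
zero_or_more=$(printf '%s\n' "$line" | sed 's/[0-9]*/<&>/')

# [0-9][0-9]* cannot match empty, so the earliest match is the digit
# run itself, and greediness then extends it to all four digits.
one_or_more=$(printf '%s\n' "$line" | sed 's/[0-9][0-9]*/<&>/')

printf '%s\n' "$zero_or_more" "$one_or_more"
# prints:
# <>abce1234jklm
# abce<1234>jklm
```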

rveri · March 5, 2013, 1:47pm

Thanks all, it worked. Scrutinizer thanks for explaining 'text that begins with non-digit' as well.

I am wondering what the code would be if the digits are in the middle:

Example text:
abce1234jklm

I want the output as:

abce 1234 jklm

echo "abce1234jklm" | sed 's/[0-9]*/ & /'

It doesn't work.

Thanks a lot.

---------- Post updated at 01:47 PM ---------- Previous update was at 01:39 PM ----------

OK, I was able to figure it out:

echo "abce1234jklm" | sed 's/^[a-z][a-z]*/& /;s/[0-9][0-9]*/&  /'

abce 1234  jklm

Thanks for all the help.

mirni · March 5, 2013, 2:00pm

It's because you missed scrutinizer's point.

[0-9]* matches zero or more digits, so it matches the beginning of the line (containing zero digits)
[0-9][0-9]* matches one or more digits

echo "abce1234jklm" | sed 's/[0-9][0-9]*/ & /'

works as you want

rveri · March 5, 2013, 2:31pm

Bipinajith,
this is nice coding, by the way, and it worked for me too. Thanks a lot.

awk '{ match($0,/^[0-9]*/); print substr($0, RSTART, RLENGTH) " " substr($0, RSTART+RLENGTH) } ' file

---------- Post updated at 02:31 PM ---------- Previous update was at 02:27 PM ----------

Mirni,

Thanks for pointing that out and explaining what Scrutinizer mentioned earlier. I got it fixed, and it works like a charm.

echo "abce1234jklm" | sed 's/[0-9][0-9]*/ & /'
abce 1234 jklm

Thanks much...

Jotne · March 5, 2013, 4:03pm

Can this be used?

sed 's/[0-9]+/ & /'

This should match at least one digit?
Does [0-9]+ give the same result as [0-9][0-9]*?

Scrutinizer · March 5, 2013, 4:05pm

You can, with GNU sed and BSD sed, using the -E (extended regex) option...

sed -E 's/[0-9]+/ & /' file

With regular sed, an alternative to the earlier solution would be:

sed 's/[0-9]\{1,\}/ & /' file
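
To compare the spellings side by side, a small sketch along these lines could be used. It assumes a sed that accepts -E, and the variable names are made up for illustration:

```shell
line='abce1234jklm'

# Three ways to say "one or more digits".
bre_star=$(printf '%s\n' "$line" | sed 's/[0-9][0-9]*/ & /')
bre_interval=$(printf '%s\n' "$line" | sed 's/[0-9]\{1,\}/ & /')
ere_plus=$(printf '%s\n' "$line" | sed -E 's/[0-9]+/ & /')

# Without -E, + is an ordinary character in a basic regex, so this
# looks for a digit followed by a literal "+" and changes nothing here.
plain_plus=$(printf '%s\n' "$line" | sed 's/[0-9]+/ & /')

printf '%s\n' "$bre_star" "$bre_interval" "$ere_plus" "$plain_plus"
```

The first three should all give abce 1234 jklm, while the last leaves the line as abce1234jklm.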