Sed Question?

Kweekwom · February 2, 2008, 4:03pm

Hi,

Here's what I'm trying to do. I have a set of numbers like this:

12-13 15-18 23-28 36-38 42-43 53-56 70-72 76 80-86 93-110 119-128

But I want to echo the full ranges of numbers. Is there any way I can get sed to replace, for example, "12-13 15-18" with "{12..13} {15..18}" so it will echo the entire range?

Or, is there a better way of accomplishing this?

Thanks!

KevinADC · February 2, 2008, 6:12pm

I'm not really a sed coder, but I think it is the same as perl in this situation, with the exception that sed does in-place editing by default (I could be wrong). The problem is that you have not described your input beyond saying it contains the sequence of ranges you posted.

sed -e 's/([0-9]+)(-)([0-9]+)/{\1..\2}/g' file

Anyway, see if it works, but back up your file first. If I am totally wrong, someone will correct me.

Kweekwom · February 2, 2008, 6:36pm

KevinADC,

Thanks for looking into it. I tried it and got this error:

sed: -e expression #1, char 32: invalid reference \2 on `s' command's RHS

Is there any more info I can give you to make my situation clearer?

Thanks again.

drl · February 2, 2008, 6:48pm

Hi.

The seds I have used do not do "in-place" processing by default. If they allow it, it's usually with the "-i" option.
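To illustrate the point about in-place processing, here is a minimal sketch. It assumes a sed that behaves like GNU or POSIX sed and a system that provides mktemp; the file and variable names are made up for the demonstration. Without "-i", sed writes the edited text to standard output and leaves the input file as it was.

```shell
# Create a scratch file holding one range (hypothetical demo data).
tmp=$(mktemp)
echo "12-13" > "$tmp"

# No -i option: the substitution goes to stdout only.
out=$(sed 's/-/../' "$tmp")
after=$(cat "$tmp")

echo "sed output:    $out"
echo "file contents: $after"

rm -f "$tmp"
```

So a backup is only strictly needed when "-i" (where available) is used, or when the output is redirected over the original file.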

If your sed allows "-r", here's one way:

#!/bin/bash -

# @(#) s1       Demonstrate sed extended regular expression match, substitution.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1) sed

FILE=${1-data1}
echo
echo " Input file $FILE:"
cat $FILE

echo
echo " As processed by sed:"
sed -r -e 's/([0-9]+)-([0-9]+)[ ]*/{\1..\2} /g' $FILE

exit 0

Producing:

% ./s1
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0
GNU sed version 4.1.2

 Input file data1:
12-13 15-18 23-28 36-38 42-43 53-56 70-72 76 80-86 93-110 119-128

 As processed by sed:
{12..13} {15..18} {23..28} {36..38} {42..43} {53..56} {70..72} 76 {80..86} {93..110} {119..128}

See man sed for details ... cheers, drl

drl · February 2, 2008, 7:03pm

Hi.

If you are stuck with a sed that does not recognize "-r" such as on Solaris, you can use:

#!/bin/bash -

# @(#) s1       Demonstrate sed extended regular expression match, substitution.

SED=/usr/xpg4/bin/sed
SED=sed

echo "(Versions displayed with local utility \"version\")"
# version >/dev/null 2>&1 && version =o $(_eat $0 $1) $SED
uname -rs
version >/dev/null 2>&1 && version bash $SED

FILE=${1-data1}
echo
echo " Input file $FILE:"
cat $FILE

echo
echo " As processed by sed:"
# $SED -r -e 's/([0-9]+)-([0-9]+)[ ]*/{\1..\2} /g' $FILE
$SED  -e 's/\([0-9][0-9]*\)-\([0-9][0-9]*\)[ ]*/{\1..\2} /g' $FILE

exit 0

Which yields:

$ ./s1
(Versions displayed with local utility "version")
SunOS 5.10
GNU bash 3.00.16
sed (local) - no version provided.

 Input file data1:
12-13 15-18 23-28 36-38 42-43 53-56 70-72 76 80-86 93-110 119-128

 As processed by sed:
{12..13} {15..18} {23..28} {36..38} {42..43} {53..56} {70..72} 76 {80..86} {93..110} {119..128}

It seemed to have worked with both versions of Solaris sed ... cheers, drl

shamrock · February 2, 2008, 7:04pm

sed 's/\([0-9][0-9]*\)-\([0-9][0-9]*\)/{\1..\2}/g' file
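A small comparison of this one-liner with drl's version, as a sketch on made-up input with two blanks between the ranges. The variable names are only for illustration. drl's pattern also consumes any blanks after a range and puts back exactly one, which normalizes spacing but leaves a trailing blank; the one-liner above leaves the original spacing untouched.

```shell
in="12-13  15-18"

# drl's form: trailing blanks are matched and replaced by a single blank.
a=$(echo "$in" | sed 's/\([0-9][0-9]*\)-\([0-9][0-9]*\)[ ]*/{\1..\2} /g')

# shamrock's form: only the range itself is rewritten.
b=$(echo "$in" | sed 's/\([0-9][0-9]*\)-\([0-9][0-9]*\)/{\1..\2}/g')

# Square brackets make the blanks visible.
printf '[%s]\n' "$a" "$b"
```

Either form is fine when the result is later passed through echo unquoted, since the shell collapses the extra blanks anyway.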

Kweekwom · February 2, 2008, 7:32pm

Thank you drl and everyone else for your help, it worked great!

Thanks again!

Kweekwom · February 2, 2008, 7:56pm

Okay, now I'm having another problem with my script.

If I type:

echo {12..18}

It will spit out:

12 13 14 15 16 17 18

However, my script looks like this:

echo "Input first set of nodes" 

read node1 # This is where you insert the string of numbers

result=`echo $node1 | sed -r -e 's/([0-9]+)-([0-9]+)[ ]*/{\1..\2} /g'`

echo $result

I was hoping that when it echoed $result, the numbers enclosed in braces, for example {12..18}, would be printed as "12 13 14 15 16 17 18". That is not what happens; it merely echoes the literal "{12..18}". Is there any way for me to get this to work?

Thanks again.

KevinADC · February 2, 2008, 10:36pm

Actually mine should have been:

sed -e 's/([0-9]+)(-)([0-9]+)/{\1..\3}/g' file

(\3 instead of \2) but it may still not have worked. Like I said, I am not really a sed coder; I just learn what I have to on occasion. I thought I would give it a shot, though, as I am trying to pick up more sed, ksh, and similar tools.
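For anyone puzzling over the earlier "invalid reference" error, a short sketch follows. It assumes a sed that accepts "-E" for extended regular expressions (GNU sed also accepts "-r", as used elsewhere in this thread). With three groups, \2 is the hyphen and \3 is the second number. Without "-E" or "-r", unescaped parentheses and "+" are ordinary characters, so the pattern defines no groups at all, which is why sed rejected the back-reference.

```shell
# Extended syntax: \1 = first number, \2 = the hyphen, \3 = second number.
three=$(echo "12-13 15-18 76" | sed -E 's/([0-9]+)(-)([0-9]+)/{\1..\3}/g')
echo "$three"

# Basic syntax (no -E): "(", ")" and "+" are matched literally. A plain
# range contains none of them, so the input passes through unchanged.
literal=$(echo "12-13" | sed 's/([0-9]+)-([0-9]+)/X/')
echo "$literal"
```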

KevinADC · February 2, 2008, 10:39pm

Thanks for the correction. I really was not sure. sed is fairly new to me, and I am trying to pick it up in my spare time. I should have known, though: in this regard sed and perl are very similar (-e, -i, etc.), and the regular expression syntax appears to be identical as well.

drl · February 2, 2008, 11:21pm

Hi.

You are welcome; use bash3 and eval for this:

#!/bin/bash3 -

# @(#) user1    Demonstrate eval.

# echo "Input first set of nodes"

# read node1 # This is where you insert the string of numbers
node1="12-18"

result=`echo $node1 | sed -r -e 's/([0-9]+)-([0-9]+)[ ]*/{\1..\2} /g'`

echo {1..5}

eval echo $result

exit 0

Producing:

% ./user1
1 2 3 4 5
12 13 14 15 16 17 18

The useful page at http://www.tldp.org/LDP/abs/html/index.html can help you through these questions ... cheers, drl
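A note on why eval is needed: bash performs brace expansion before it expands variables, so braces that only appear after $result has been substituted are never expanded on the first pass. eval makes the shell parse the line a second time. The sketch below assumes bash 3 or later. Since eval runs whatever text it is given, the sketch also checks the input first; the function name expand_ranges and the check are illustrative additions, not part of the script above.

```shell
#!/bin/bash
# Hypothetical helper: expand "12-14 76" into "12 13 14 76" using eval.
expand_ranges() {
    # Illustrative safety check: accept only digits, hyphens and blanks,
    # because eval would execute anything else as shell code.
    case $1 in
        ''|*[!0-9\ -]*) echo "expand_ranges: unexpected input" >&2; return 1 ;;
    esac
    local braces
    braces=$(echo "$1" | sed 's/\([0-9][0-9]*\)-\([0-9][0-9]*\)/{\1..\2}/g')
    eval echo $braces
}

result='{12..14} 76'
echo $result               # braces arrive via a variable: printed literally
expand_ranges "12-14 76"   # second parse by eval: braces are expanded
```

With input read from a user, as in the original script, some check of this kind seems prudent before calling eval.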

Kweekwom · February 3, 2008, 5:33pm

Awesome! Thanks drl, it worked perfectly.

Hopefully I only have one more question now.

I do what you suggested, and I want to sort the output of "eval echo" numerically, so I do this:

#!/bin/bash3 -

# @(#) user1    Demonstrate eval.

# echo "Input first set of nodes"

# read node1 # This is where you insert the string of numbers
node1="435-437,476-492 70-72,76,80-86"

result=`echo $node1 | sed -r -e 's/([0-9]+)-([0-9]+)[ ]*/{\1..\2} /g'`

echo {1..5}

eval echo $result | sort -n

exit 0

But, it doesn't sort them numerically. Is there something I'm doing wrong that is causing this?

Thanks again in advance, and thanks for the link!

drl · February 3, 2008, 6:09pm

Hi.

The result of the final echo is a single line, so it is already in the correct order as sort looks at it.

What exactly are you looking for? If you were doing this manually, how would you go about it?

Note that you have interspersed a comma (,) at places in the list ... cheers, drl
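A quick way to see this for yourself, as a sketch on made-up numbers: sort compares whole lines, and a single line is trivially already in order.

```shell
line="435 436 70 76"

# The data is one line as far as line-oriented tools are concerned.
count=$(echo "$line" | grep -c '')
echo "number of lines: $count"

# sort -n has nothing to reorder, so the line comes back unchanged.
sorted=$(echo "$line" | sort -n)
echo "after sort -n:   $sorted"
```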

Kweekwom · February 3, 2008, 6:32pm

I really don't know how I would go about doing it manually.

I'm sorry, I didn't mean to have those commas in there. Even with no commas in the line, it still doesn't sort.

I guess I need to come at the whole thing from a different angle?

All I basically need it to do is take a list of numbers, like this:

435-437 70-72 76 80-86

and print the ranges and sort all of it numerically.

drl · February 3, 2008, 6:48pm

Hi.

OK, let us assume that we can easily get rid of the commas.

Then we are left with one line that contains a list of numbers, each separated from the next by a blank character. We have an arranging program -- sort -- that orders lines. What transformation would get those two ideas together? What do we need to do? ... cheers, drl

Kweekwom · February 3, 2008, 7:24pm

drl,

Are you saying I need to get each of the numbers into a separate line?

ghostdog74 · February 3, 2008, 8:39pm

#!/bin/sh
s="12-13 15-18 23-28 36-38 42-43 53-56 70-72 76 80-86 93-110 119-128"
echo $s | awk '
{
  for (i=1;i<=NF;i++){
    n=split($i,a,"-")
    for (j=a[1];j<=a[n];j++){
        printf  "%d " ,j  
    }
    print ""
  }
}'

output:

# ./test.sh
12 13
15 16 17 18
23 24 25 26 27 28
36 37 38
42 43
53 54 55 56
70 71 72
76
80 81 82 83 84 85 86
93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110
119 120 121 122 123 124 125 126 127 128
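A note for readers following the awk approach: split($i, a, "-") returns 1 for a lone number such as 76, so a[1] and a[n] are the same element and the inner loop runs exactly once. The variation below is a sketch, not ghostdog74's code. It assumes the goal is the numerically sorted list asked for earlier, so it prints one number per line and hands the result to sort -n.

```shell
s="435-437 70-72 76"

sorted=$(echo "$s" | awk '
{
  for (i = 1; i <= NF; i++) {
    n = split($i, a, "-")              # n is 1 for "76", 2 for "70-72"
    for (j = a[1] + 0; j <= a[n] + 0; j++)
      print j                          # one number per line for sort
  }
}' | sort -n | tr '\n' ' ')

echo "$sorted"
```

The final tr joins the lines with blanks again, which leaves one trailing blank at the end of the string.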

drl · February 3, 2008, 9:30pm

Hi.

Yes, exactly! Then, presumably, you'd want to put them back together again. Here's one way to do that:

#!/bin/bash3 -

# @(#) user2    Demonstrate eval and tr.

# echo "Input first set of nodes"

# read node1 # This is where you insert the string of numbers
# node1="435-437,476-492 70-72,76,80-86"
node1="435-437 476-492 70-72 76 80-86"

result=`echo $node1 | sed -r -e 's/([0-9]+)-([0-9]+)[ ]*/{\1..\2} /g'`

echo {1..5}

eval echo $result |
tee t1 |
tr '[, ]' '\n' |
tee t2 |
sort -n |
tr '\n' ' '
echo

exit 0

Producing:

% ./user2
1 2 3 4 5
70 71 72 76 80 81 82 83 84 85 86 435 436 437 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492

The tee allows you to look at the intermediate form of the data. See the man pages for details on the rest. The tr especially often varies from system to system.

That's part of the design of *nix -- put general tools together to solve specific problems.

Best wishes ... cheers, drl
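On the remark that tr varies from system to system: with GNU tr, the brackets in '[, ]' are simply extra characters in the first set, and the shorter second set is padded with its last character. As far as I know, POSIX leaves the case of a shorter second set unspecified, so giving two sets of equal length may travel better. A sketch of just the splitting stage, on made-up input:

```shell
# Map both comma and blank to newline; the two sets have the same length.
split=$(printf '%s\n' "70 71,72 76" | tr ', ' '\n\n')
echo "$split"
```

This prints each number on its own line, which is the form sort -n needs.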

Kweekwom · February 3, 2008, 10:17pm

Wow drl, that's really cool. It works just as I need it to! I'll definitely delve deeper into those commands so I can understand them better. Thanks so much for your help!

drl · February 3, 2008, 10:36pm

Hi.

You're welcome.

If you get a chance, awk, as in the solution that ghostdog74 posted, is worthwhile to learn. It breaks up input lines into fields, which can then be re-arranged, deleted, assigned to, etc. Very useful, and, once you get the idea of:

  pattern-part { action-part }

you can do a lot with very little work ... cheers, drl
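As a tiny illustration of the pattern-part { action-part } idea, here is a sketch on made-up input in the style of this thread. Each rule runs its action only on lines that match its pattern; the labels "range:" and "single:" are arbitrary.

```shell
out=$(printf '%s\n' 12-13 76 80-86 | awk '
/-/  { print "range:", $0 }     # pattern: the line contains a hyphen
!/-/ { print "single:", $0 }    # pattern: the line has no hyphen
')
echo "$out"
```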