Create range from a list

run_time_error · May 14, 2010, 10:19am

Hello everyone,

I am trying to create a script that will make a range or ranges based on a sorted list of numbers.

E.g., if the list is:

1
2
3
4
5
6
7
12
13
14
15

The output ranges should be:

1-7
12-15

Would be grateful if you could point me in the right direction.

Cheers,
rte

radoulov · May 14, 2010, 10:27am

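awk 'NR == 1 { min = $1 }       # first value opens the first range
NR > 1 && $1 != prev + 1 {      # gap found: close the current range
  print min "-" prev
  min = $1
  }
{ prev = $1 }
END {
  if (NR)                       # print the last range unless input was empty
    print min "-" prev
  }' infile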

run_time_error · May 14, 2010, 10:41am

Doesn't work for me.

I have created a file called test with the following content:
1
2
3
4
5
10
11
12

but when I try to run

awk 'END {
  if (prev != max)
    print min, "-", prev
  }
$1 != prev + 1 {
  print min, "-", prev
  min = $1 
  }
{ 
  prev = $1 
  NR == 1 && min = $1 
  }' test

I get the error:

awk: syntax error near line 4
awk: bailing out near line 4

What am I doing wrong?

radoulov · May 14, 2010, 10:42am

Use nawk or /usr/xpg4/bin/awk on Solaris.
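
If the script has to run on more than one system, a minimal sketch like the following could pick a suitable awk at run time. The function name pick_awk and the search order are assumptions for illustration, not something from this thread; the idea is only to try the POSIX-capable interpreters before the plain awk, which on older Solaris releases is the pre-POSIX one.

```shell
#!/bin/sh
# pick_awk is a hypothetical helper: print the first awk found,
# preferring the POSIX-capable ones named in this thread.
pick_awk() {
  for candidate in /usr/xpg4/bin/awk nawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      command -v "$candidate"
      return 0
    fi
  done
  return 1
}

AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }

# Quick smoke test: count the input lines with whichever awk was picked.
printf '1\n2\n3\n' | "$AWK" 'END { print NR " lines read" }'
```

The range scripts in this thread can then be invoked through "$AWK" instead of a hard-coded awk.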

run_time_error · May 14, 2010, 11:12am

Thanks radoulov,

That works perfectly !!

Franklin52 · May 14, 2010, 11:13am

Another approach:

awk 'NR==1{r=n=$1;next}
++n!=$1{print r "-" --n;r=n=$1}
END{print r "-" n}
' file

Use nawk or /usr/xpg4/bin/awk on Solaris.

vgersh99 · May 14, 2010, 11:28am

Wasn't something similar answered here?

radoulov · May 14, 2010, 2:25pm

... and therefore it's most probably homework.

alister · May 14, 2010, 4:30pm

For the userland deviant who isn't satisfied with a simple, straightforward AWK solution ;):

echo '[dlb=Qdlb1+=MlPxlRx]sC [sbq]sM [lan[-]nlbpst]sP [q]sQ [dsasb]sR' | cat - data \
| sed '2,$s/-/_/; 2s/.*/& lRx/; 3,$s/.*/& lCx/; $s/.*/& lPx/' | dc

a = lower bound of sequence
b = upper bound of sequence
R = reset macro. Set a and b equal to the latest value read (this is the beginning of a new sequence).
M = max macro. Set the new upper bound of the sequence, b.
P = printing macro. Prints "a-b".
C = comparison macro. Determines if the newest value is part of an ongoing sequence. If it is, call the M (max) macro to store it in b. If it is not, call the P (print sequence) macro and then the R (reset) macro to begin a new sequence.
Q = quit macro. Called by C when the newest value equals b, so a repeated value is skipped.
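
To see how the macros get invoked, it may help to run the sed stage on its own and look at the program dc would receive. This is only a sketch: the preprocess function name and the three-value sample are made up for illustration, and a placeholder line stands in for the macro definitions; the sed script itself is the one from the pipeline above.

```shell
#!/bin/sh
# preprocess is a hypothetical wrapper around the sed stage of the pipeline.
# Line 1 (the macro definitions) is left alone; every data line gets its
# minus sign turned into the underscore dc uses for negative numbers, and
# a macro call appended: lRx on the first value, lCx on the rest, plus a
# final lPx on the last line.
preprocess() {
  sed '2,$s/-/_/; 2s/.*/& lRx/; 3,$s/.*/& lCx/; $s/.*/& lPx/'
}

# Placeholder first line instead of the real macro definitions.
printf '%s\n' '[macro definitions]' -3 -2 4 | preprocess
# prints:
# [macro definitions]
# _3 lRx
# _2 lCx
# 4 lCx lPx
```

So the first value runs R, every later value runs C, and the last line additionally runs P to flush the final sequence.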

It handles negative values, sequences that span negative to positive (-1, 0, 1), and sequences which repeat the same value (1,2,3,3,3,4,5 => 1-5).
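
For comparison, the same register logic can be sketched in awk. This is not one of the solutions posted earlier, just an illustrative translation; the ranges function name is made up, and a and b play the same roles as the dc registers.

```shell
#!/bin/sh
# ranges is a hypothetical wrapper; a and b mirror the dc registers.
ranges() {
  awk '
    NR == 1     { a = b = $1; next }   # like R: start the first sequence
    $1 == b     { next }               # like Q: repeated value, skip it
    $1 == b + 1 { b = $1; next }       # like M: extend the upper bound
                { print a "-" b        # like P: print the finished sequence
                  a = b = $1 }         # like R: start a new sequence
    END         { if (NR) print a "-" b }
  '
}

printf '%s\n' -1 0 1 1 2 7 8 | ranges
# prints:
# -1-2
# 7-8
```

As in the dc version, a negative upper bound produces a doubled minus sign (for example -100--96), since the separator is also a hyphen.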

I'm a relative dc newbie, so if someone can do it better (with dc), I'd be interested in seeing your approach.

Test run:

$ cat data
-100
-99
-98
-97
-96
-4
-3
-2
-1
0
1
2
3
4
5
1
2
3
3
3
4
5
14
15
16
17
18
$
$ echo '[dlb=Qdlb1+=MlPxlRx]sC [sbq]sM [lan[-]nlbpst]sP [q]sQ [dsasb]sR' | cat - data \
> | sed '2,$s/-/_/; 2s/.*/& lRx/; 3,$s/.*/& lCx/; $s/.*/& lPx/' | dc
-100--96
-4-5
1-5
14-18

Regards,
Alister

pseudocoder · May 14, 2010, 9:07pm

#!/usr/local/bin/perl

use strict;
use warnings;

# Bounds of the current run of consecutive numbers.
my ($start, $prev);

while (my $rec = <DATA>) {
  chomp $rec;
  next unless length $rec;      # skip blank lines

  if (!defined $start) {
    $start = $rec;              # first value opens the first range
  }
  elsif ($rec != $prev + 1) {
    print "$start-$prev\n";     # gap found: close the current range
    $start = $rec;
  }
  $prev = $rec;
}

# Print the final range, if any input was read.
print "$start-$prev\n" if defined $start;

__DATA__
1
2
3
4
5
6
7
12
13
14
15