Suggestions for command line parsing

Hi all

I need to put a command line parser together to parse numeric fields and ranges passed to a script. I'm looking for a bash function that is as elegant and simple as possible.

So the input would be of the following form -

1,2,8-12

This would return -

1,2,8,9,10,11,12

Input can contain multiple ascending or descending ranges of the form -

9-12,19-15

in any order, with comma-delimited single integers interspersed anywhere on the line.

I've come up with some code, but frankly it's ugly and I suspect that this is a problem that's been solved before.

Does anyone have a good solution?

Brad

Here is a bash approach:

#!/bin/bash

while read line
do
        for num in ${line//,/ }
        do
                if [[ "$num" =~ [0-9]- ]]   # range: a digit precedes the dash (a lone negative integer has none)
                then
                        if [[ "$num" =~ ^\- ]]
                        then
                                st="${num%-*}"
                                [[ "$st" =~ \-$ ]] && st="${st%-*}"
                                en="${num#*-}"
                                [[ "$en" =~ [0-9]*-[0-9]* ]] && en="${en#*-}"
                        else
                                st=${num%%-*}
                                en=${num#*-}
                        fi
                        if [ $st -le $en ]
                        then
                                while [ $st -le $en ]
                                do
                                        [ -z "$tmp" ] && tmp="$st" || tmp="$tmp $st"
                                        (( st++ ))
                                done
                        else
                                while [ $st -ge $en ]
                                do
                                        [ -z "$tmp" ] && tmp="$st" || tmp="$tmp $st"
                                        (( st-- ))
                                done
                        fi
                else
                        if [ -z "$tmp" ] && [ ! -z "$num" ]
                        then
                                tmp="$num"
                        elif [ ! -z "$num" ]
                        then
                                tmp="$tmp $num"
                        fi
                fi
        done

        printf "%s\n" "${tmp// /,}"
        tmp=""

done < file

Input & Output:

$ cat file
1,2,8-12,15,20-25
9-12,19-15,20-20,0--5,-1-2,-5--1

$ ./scr
1,2,8,9,10,11,12,15,20,21,22,23,24,25
9,10,11,12,19,18,17,16,15,20,0,-1,-2,-3,-4,-5,-1,0,1,2,-5,-4,-3,-2,-1

That's much better than anything I've come up with so far.

Many thanks, I'm sure a few people will find that useful.

Brad

A Perl solution:

#!/usr/bin/perl -w
use strict;

my $cur_dir = $ENV{PWD};
my ($value,@values,@chars,@sep,$i,@ext,$idx);
my ($startVal,$endVal,$secNegFlg);

@values=split(/,/,$ARGV[0]);

foreach $value (@values) {
  $idx++;
  @chars=split(//,$value);
  @sep=grep{ $_ eq "-" } @chars;

  #one or more (3 max) "-" found -> range format (xx-xx) detected
  # define starting and ending values of range
  if ($#sep >= 0) {
    # if 1st char is -, first number of range is negative
    if ($chars[0] eq "-") { ($startVal) = $value =~ m/^(-\d+)/ }
      else { ($startVal) = $value =~ m/^(\d+)/ } ;

    # if -- found, second number of range is negative
    ($secNegFlg) = $value =~ m/(--\d+)$/;
    if( defined($secNegFlg) ) {  ($endVal) = $value =~ m/(-\d+)$/ }
      else { ($endVal) = $value =~ m/(\d+)$/ } ;
  }

  # Printing stage
  if ($#sep >= 0) {
    if ($startVal > $endVal) {
      for($i=$startVal; $i>=$endVal; $i--) {
        print "$i";
        print "," if( $i > $endVal);
      }
    } else {
      for($i=$startVal; $i<=$endVal; $i++) {
        print "$i";
        print "," if( $i < $endVal);
      }
    }
    print "," if( $idx <= $#values);
  } else {
    print "$value";
    print "," if( $idx <= $#values);
  }
}

print "\n";

Output:

 %./file034.pl 1,3-5,10,15-17,20,-25--22,0--3,-2-2

1,3,4,5,10,15,16,17,20,-25,-24,-23,-22,0,-1,-2,-3,-2,-1,0,1,2

And another one:

perl -le'
  ($_ = shift) =~ 
    s/(-?\d+)-(-?\d+)/
    join ",", $1 > $2 ? reverse @{[$2..$1]} : @{[$1..$2]}
    /xeg;
  print
  ' -- <your_string>
% perl -le'
  ($_ = shift) =~
    s/(-?\d+)-(-?\d+)/
    join ",", $1 > $2 ? reverse @{[$2..$1]} : @{[$1..$2]}
    /xeg;
  print
  ' -- 9-12,19-15,20-20,0--5,-1-2,-5--1
9,10,11,12,19,18,17,16,15,20,0,-1,-2,-3,-4,-5,-1,0,1,2,-5,-4,-3,-2,-1

Works in recent bash, but not for negative numbers:

T=1,4,5,6-10,13-25
IFS="," A=($T) 
for ((I=0; I<${#A[@]}; I++ )); do [ ${A[I]/-} = ${A[I]} ] && echo -n ${A[I]} || eval echo -n {${A[I]/-/..}}; echo -n " "; done; echo
1 4 5 6 7 8 9 10 13 14 15 16 17 18 19 20 21 22 23 24 25 

You could also read variable T from a file in a loop...
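
Reading from a file could look roughly like the sketch below. The function names expand_line and expand_file are made up for illustration, and the file path is whatever you pass in. Like the loop above, it relies on brace expansion and only handles non-negative numbers.

```shell
#!/bin/bash

# Hypothetical helper: expand one comma-separated line of non-negative
# integers and ranges, printing the result space-separated.
expand_line() {
  local item out=() fields
  IFS=, read -r -a fields <<< "$1"
  for item in "${fields[@]}"; do
    if [[ $item =~ ^([0-9]+)-([0-9]+)$ ]]; then
      # Brace expansion happens before variable expansion, hence the eval.
      # The regex above guarantees that only digits reach it.
      out+=( $(eval "echo {${BASH_REMATCH[1]}..${BASH_REMATCH[2]}}") )
    else
      out+=( "$item" )
    fi
  done
  echo "${out[*]}"
}

# Hypothetical helper: expand every line of the file named in $1.
expand_file() {
  local line
  while IFS= read -r line || [[ -n $line ]]; do
    expand_line "$line"
  done < "$1"
}
```

Calling expand_file with the name of a file holding one list per line prints one expanded line per input line.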

EDIT: Actually, with a minor modification, we're getting quite far into the negatives:

T=1,-4,5,6--10,13-25
IFS="," A=($T) 
for ((I=0; I<${#A[@]}; I++ )); do [ ${A[I]/?-} = ${A[I]} ] && echo -n ${A[I]} || eval echo -n {${A[I]/-/..}}; echo -n " "; done; echo
1 -4 5 6 5 4 3 2 1 0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10 13 14 15 16 17 18 19 20 21 22 23 24 25

Thanks for the perl examples.

I need to pull this off in bash so I can't use them but they certainly look powerful. The last example being particularly compact...

Cheers

This is the best I have managed to come up with. It doesn't handle negative numbers, but that was never in my original requirement -

#! /bin/bash

error_exit()
{
    echo $1
    exit 1
}

expand_range()
{
    [[ $@ =~ ^[0-9]*-[0-9]*$ ]] || error_exit "Usage: args ${FUNCNAME}"
    local LA=${1#*-} FA=${1%-*}
    local TEST="\(\(FA++\)\) -lt \${LA}"
    [[ ${FA} -gt ${LA} ]] && TEST="\(\(FA--\)\) -gt \${LA}"

    OUT="${OUT},${FA}"
    while eval [[ ${TEST} ]]
    do
        OUT="${OUT},${FA}"
    done
}

parse_line()
{
    [[ $@ =~ ^[0-9]+[0-9,-]*[0-9-]+$ ]] || error_exit "Usage: args ${FUNCNAME}"
    OUT=
    for SEG in $(echo $@ | awk 1 RS=",")
    do
        if [[ ${SEG} =~ ^[0-9]*-[0-9]*$ ]]
        then
            expand_range ${SEG}
        else
            OUT="${OUT},${SEG}"
        fi
    done
    echo ${OUT#,*}
}

parse_line 2,15-9,3,5-8,1

Output -

2,15,14,13,12,11,10,9,3,5,6,7,8,1

Well I'm glad to see so many people enjoying the problem as much as I am :)

Bash only (without external programs, e.g. awk, paste, sed etc.):

_expand() {
  local _s=$1 _q=${2:-,} _a _n=() _re i j IFS
  # reject anything other than digits, the separator and "-"
  _re="[^${_q}[:digit:]-]"
  [[ $_s =~ $_re ]] && return 1
  IFS=$_q read -r -a _a <<< "$_s"
  for ((i = 0; i < ${#_a[@]}; i++)); do
    if [[ ${_a[i]} =~ ^(-?[[:digit:]]+)-(-?[[:digit:]]+)$ ]]; then
      if ((${BASH_REMATCH[1]} < ${BASH_REMATCH[2]})); then
        # ascending range
        for ((j = ${BASH_REMATCH[1]}; j <= ${BASH_REMATCH[2]}; j++)); do
          _n+=( "$j" )
        done
      else
        # descending range, or a range with equal endpoints
        for ((j = ${BASH_REMATCH[1]}; j >= ${BASH_REMATCH[2]}; j--)); do
          _n+=( "$j" )
        done
      fi
    else
      # single integer: keep it as is
      _n+=( "${_a[i]}" )
    fi
  done
  IFS=$_q; printf "%s\n" "${_n[*]}"
}
$ _expand 9-12,19-15,20-20,0--5,-1-2,-5--1
9,10,11,12,19,18,17,16,15,20,0,-1,-2,-3,-4,-5,-1,0,1,2,-5,-4,-3,-2,-1

You may use a custom separator (the default is ,):

$ _expand 9-12:19-15:20-20:0--5:-1-2:-5--1 :
9:10:11:12:19:18:17:16:15:20:0:-1:-2:-3:-4:-5:-1:0:1:2:-5:-4:-3:-2:-1

The regular expression operator `=~', the append operator `+=' and the here-string syntax `<<<' are version-specific: old bash versions don't support them. As far as I know, here-strings need bash 2.05b, `=~' needs bash 3.0 and `+=' needs bash 3.1 or later.
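
If the script may land on an old system, a guard along these lines could be placed before the function is used. This is only a sketch: require_bash is a made-up name, and the default threshold of 3.1 is an assumption about which release first supports all three features.

```shell
#!/bin/bash

# Hypothetical guard: succeed only if the running bash is at least
# version major.minor (defaults to 3.1, an assumed cutoff).
require_bash() {
  local major=${1:-3} minor=${2:-1}
  if (( BASH_VERSINFO[0] > major )); then
    return 0
  fi
  if (( BASH_VERSINFO[0] == major && BASH_VERSINFO[1] >= minor )); then
    return 0
  fi
  return 1
}
```

A script could then start with something like: require_bash 3 1 || { echo "bash 3.1 or later required" >&2; exit 1; }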