How to count how many of the last array elements are equal to the last one?

Scott37 · November 22, 2020, 12:12pm

The question refers to Linux distributions like Debian and Ubuntu.

How can I count how many of the trailing array elements are equal to the last element? I don't understand bash well enough yet to solve this myself, so I hope for your help. Thanks.

Given an already filled array like the following:

my_array=( 1 2 3 7 7 6 7 7 7 )

For me, the right count for this sample is "3" and not "5".

For the following sample, the right count is "1":

my_array=( 1 2 3 4 5 6 7 8 9 )

I need output like the following:

echo $count

The following prints the total number of elements in the array; it does not count only the trailing elements that are equal:

echo "${#my_array[@]}"

Thanks for helping with this.

nezabudka · November 22, 2020, 1:07pm

Hi.
Are negative values allowed in elements?

count=($(grep -Eo '(\S+)( \1)*$' <<<"${my_array[@]}"))
echo ${#count[@]}

Scott37 · November 22, 2020, 1:28pm

Each of the (for example) 10 array elements can be something like a domain name, e.g. www.domainname.extension. Domain names can usually consist of up to 63 characters from [a-z, 0-9, _, -, .]. If possible, it would be safer to support the full ASCII character set.

nezabudka · November 22, 2020, 1:33pm

Well, then what I wrote should work.

Scott37 · November 22, 2020, 2:06pm

Thanks. This seems to work perfectly for me.

MadeInGermany · November 23, 2020, 9:19am

Nice solution!
However, I had my doubts. Indeed, the following goes wrong: it matches "1 1 1", because the trailing 1 of 51 is taken as the start of the run.

grep -Po '(\S+)( \1)*$' <<< "55 51 1 1"

Quick and dirty improvement: a word boundary anchor.

grep -Po '\b(\S+)( \1)*$' <<< "55 51 1 1"

It still goes wrong with "55 5.1 1 1", because \b also matches between the . and the 1.
I use -P rather than -E because \S and \b come from PCRE and are not in the ERE standard.
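
One possible way to close that gap, offered only as a sketch: replace the word boundary with a negative lookbehind, so that a match may only start at the beginning of the line or right after a space. This assumes GNU grep built with PCRE support (-P) and elements that contain no whitespace. The variable names match and run are made up for this example.

```shell
#!/bin/bash
# Assumption: GNU grep with -P (PCRE); elements contain no whitespace.
my_array=( 55 5.1 1 1 )

# (?<!\S) : the character before the match must not be a non-space,
#           so the first \S+ always starts at the beginning of an element.
match=$(grep -Po '(?<!\S)(\S+)( \1)*$' <<<"${my_array[*]}")

# Split the matched run back into words and count them.
read -ra run <<<"$match"
echo "${#run[@]}"
```

For the sample above this should print 2, where the \b version reports 3.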

--
Reminds me of the indent- script (contributed by Don Cragun some years ago):

sed 's/^\( *\)\1/\1/' filename

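This relies on the "greediness" of the *: the group takes as many spaces as possible while \1 can still match the same number again, i.e. half of the indentation.
The opposite, an indent+ script, is easy: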

sed 's/^ */&&/' filename
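
To see both one-liners at work, here is a small sketch that feeds them a made-up sample instead of a file. The variable name sample and its three lines (4, 2 and 0 leading spaces) are assumptions for illustration only.

```shell
#!/bin/bash
# Made-up sample: lines with 4, 2 and 0 leading spaces.
sample=$(printf '    a\n  b\nc')

# indent-: the group and \1 share the leading spaces, so one half is dropped.
printf '%s\n' "$sample" | sed 's/^\( *\)\1/\1/'

# indent+: & is the whole match (the leading spaces), written twice.
printf '%s\n' "$sample" | sed 's/^ */&&/'
```

The first command should leave 2, 1 and 0 leading spaces, the second 8, 4 and 0. With an odd number of spaces, indent- cannot halve exactly: 3 leading spaces become 2.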

MadeInGermany · November 23, 2020, 2:10pm

Here is a correct solution with shell builtins:

#!/bin/bash
rev_are_equal(){
local i cmp
for ((i=$#; i>=1; i--))
do
  if ((i==$#))
  then
    cmp=${!i}
  else
    [ "${!i}" = "$cmp" ] || break
  fi
done
echo $(($#-i))
}

my_array=( 1 2 3 7 7 6 7 7 7 )
rev_are_equal "${my_array[@]}"

nezabudka · November 23, 2020, 8:14pm

count=($(fmt -1 <<<"${my_array[@]}" | uniq -c | tail -1))
echo $count

nezabudka · November 23, 2020, 8:23pm

A similar solution:

#!/bin/bash
last_cmp() {
        declare -n l=$1
        for _ in ${!l[@]}; do
                [ "${l[-1]}" = "${l[--k]}" ] || break
                ((count++))
        done
        echo $((count))
}

my_array=(one two three three)
last_cmp my_array

MadeInGermany · November 23, 2020, 8:35pm

printf "%s\n" "${my_array[@]}" | uniq -c | tail -1

Surprisingly simple.

MadeInGermany · November 23, 2020, 10:26pm

That's very efficient! And it cannot overflow with "too many arguments".
Thanks for showing this!

At first further calls to the function failed. Why? Because it uses variables that are global and not initialized (so they have the values from the previous run).
Fix:

local count=0 k=0
declare -n l=$1
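
Put together, the function with this fix applied might look like the sketch below. It needs bash 4.3 or newer for namerefs and negative subscripts, and it would still fail if the caller's array were itself named l, because the nameref would then refer to itself.

```shell
#!/bin/bash
# Sketch only: needs bash 4.3+ (namerefs, negative array subscripts).
last_cmp() {
        local count=0 k=0        # reset on every call
        declare -n l=$1          # l refers to the array whose name was passed in
        for _ in "${!l[@]}"; do  # at most one pass per element
                [ "${l[-1]}" = "${l[--k]}" ] || break
                ((count++))
        done
        echo "$count"
}

my_array=(one two three three)
last_cmp my_array
last_cmp my_array
```

Both calls should now print 2, because nothing is left over from the previous run.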

vgersh99 · November 23, 2020, 10:39pm

This, as well as typeset -n, is a very cool way of passing variables by reference into and out of functions. I find it extremely helpful.
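
As an illustration of the "out of" direction, here is a rough sketch in which the result is written into a variable named by the caller instead of being echoed. The function name count_trailing and the underscore-prefixed names are made up for this example. It assumes bash 4.3+ and a normally indexed (non-sparse) array.

```shell
#!/bin/bash
# Hypothetical helper: count_trailing ARRAY_NAME RESULT_NAME
# Assumptions: bash 4.3+ (namerefs); array indices run from 0 to n-1.
count_trailing() {
    # Underscore prefixes lower the risk of clashing with the caller's names.
    local -n _arr=$1 _out=$2
    local i n=${#_arr[@]}
    _out=0
    for ((i = n - 1; i >= 0; i--)); do
        # Quote the right-hand side so it is compared literally, not as a pattern.
        [[ ${_arr[i]} == "${_arr[n-1]}" ]] || break
        _out=$((_out + 1))
    done
}

my_array=( 1 2 3 7 7 6 7 7 7 )
count_trailing my_array count
echo "$count"
```

For the sample array this should print 3. No command substitution is involved, so no subshell is started to get the result back.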

Scott37 · February 8, 2021, 12:43pm

This one looks very good to me. I replaced my old one with it today, because ShellCheck reports no remarks or errors for it.

Now I use it in the following way:

#!/bin/bash
my_array=( 1 2 3 7 7 6 7 7 7 )
var=$(printf "%s\n" "${my_array[@]}" | uniq -c | tail -1)
echo "$var"

With both solutions above, I get the following output:

3 7

If I run the following by hand in the terminal, I get:

echo 2

Output on terminal:

It looks like both snippets above print some blank space before their output, unlike the "2" here.

It may be possible to remove the spaces with the following. I still need to find out how to use it:

| sed 's/ //'

RudiC · February 8, 2021, 1:33pm

That 2 is a bit surprising given that we have 7 7 7 as the last elements.
Try

$ var=$(printf "%s\n" "${my_array[@]}" | uniq -c | sed -n '${s/^ *//; p; }')
$ echo "$var"
3 7

Scott37 · February 8, 2021, 5:44pm

My mistake:

The solution I am looking for should return the count of the trailing equal elements:

my_array=( 1 2 3 7 7 6 7 7 7 )

For this array, that is:
3

nezabudka · February 8, 2021, 5:48pm

Hi @RudiC
Maybe I misunderstood, but the point was in the first element of the array:

var=($(printf "%s\n" "${my_array[@]}" | uniq -c | tail -1))
$ echo "$var"

Scott37 · February 8, 2021, 6:11pm

This one gives the right answer, "3". It triggers the following ShellCheck remark:

count=($(fmt -1 <<<"${my_array[@]}" | uniq -c | tail -1))
^-- [SC2207](https://github.com/koalaman/shellcheck/wiki/SC2207): Prefer mapfile or read -a to split command output (or quote to avoid splitting).

This one gives the right answer, "3", too. It triggers the following ShellCheck remark:

var=($(printf "%s\n" "${my_array[@]}" | uniq -c | tail -1))
     ^-- SC2207: Prefer mapfile or read -a to split command output (or quote to avoid splitting).
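
One way to follow the hint in the warning, sketched here as an assumption rather than a tested replacement for the commands above, is to let read pick the first word of the last uniq -c line. The leading blanks printed by uniq -c are dropped by read, and the rest of the line goes into the throwaway variable _.

```shell
#!/bin/bash
my_array=( 1 2 3 7 7 6 7 7 7 )

# read splits the single line from tail on whitespace:
# the first word (the count) goes to count, the remainder to _.
read -r count _ < <(printf '%s\n' "${my_array[@]}" | uniq -c | tail -n 1)
echo "$count"
```

For the sample array this should print 3. Like the other printf-based variants, it assumes that no element contains a newline.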

nezabudka · February 8, 2021, 6:48pm

Honestly, I see no point in these warnings. The content of the array is wrapped in a string:

time printf "%s\n" "$(seq 10000000)" | uniq -c | tail -1
      1 10000000

real	0m8,418s

Now, if there were no quotes:

time printf "%s\n" $(seq 10000000) | uniq -c | tail -1

      1 10000000

real	0m28,176s

MadeInGermany · February 8, 2021, 8:44pm

ShellCheck gives a general warning because

var=($( ))

splits on IFS, i.e. it does not distinguish between spaces and newlines.
It does not see that tail -1 yields only one line.

Here is another one that will calm ShellCheck:

printf "%s\n" "${my_array[@]}" | uniq -c | tail -1 | awk '{print $1}'

Or all in one awk:

printf "%s\n" "${my_array[@]}" | awk '$0!=p{p=$0;c=0}{c++}END{print c}'
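
For readers less used to awk, the same one-liner can be spelled out with longer names and comments. This is only an annotated restatement; the names prev and run are made up in place of p and c.

```shell
#!/bin/bash
my_array=( 1 2 3 7 7 6 7 7 7 )

count=$(printf '%s\n' "${my_array[@]}" | awk '
    $0 != prev { prev = $0; run = 0 }   # a different value starts a new run
    { run++ }                           # every line extends the current run
    END { print run }                   # the run still open at the end is the trailing one
')
echo "$count"
```

For the sample array this should print 3.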

RudiC · February 8, 2021, 9:09pm

Or, all in ONE awk:

awk -v"ARR=${my_array[*]}" 'BEGIN {for (n = MX = split(ARR, T); n && (T[n] == T[n-1]); n--); print MX - n + 1}'
3