Get unique elements from Array

mohtashims · January 1, 2020, 7:05am

I have code that fills an array. The code and its output are below:

    
echo $1

while read -r fline; do
    echo "%%%%%%$fline%%%%%"
    fmy_array+=("$fline")
    done <<< "$1"

Output:

CR30903 YU0007 SRIL CR30903 Yogesh SRIL
%%%%%%CR30903    YU0007     SRIL%%%%%
%%%%%%CR30903    Yogesh SRIL%%%%%

Desired output:

TKT: 30903

ACTOR:
YU0007
Yogesh

DEPT: SRIL

As you can see, TKT and DEPT each print a single unique value, while ACTOR prints two values because they differ.

I know how to get the unique elements of an array using sort -u, but here we are dealing with sub-elements (fields) of each array element.

Can you please suggest how I can achieve the desired output?

nezabudka · January 1, 2020, 8:20am

Hi,
To get started, simply print the value of this parameter:

echo "$1"

--- Post updated at 17:15 ---

Something like this?

cat file
30903	 YU0007	SRIL
30903	 Yogesh	SRIL
awk '{ B[$2] } END { print "TKT:", $1 RS "ACTOR:"; for (i in B) print i; print "DEPT:", $3 }' file

--- Post updated at 17:20 ---

or maybe

if [[ "${my_array[@]}" =~ $fline ]]; then
    continue
fi

mohtashims · January 1, 2020, 8:31am

Unfortunately, I am away from my system right now. Can we do without that information?

nezabudka · January 1, 2020, 8:39am

Here's another:

for fline in $1; do
    if ! [[ "${my_array[@]}" =~ "$fline" ]]; then
        my_array+=("$fline")
    fi
done
echo "${my_array[@]}"
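
One caveat with the =~ test: it matches $fline as a substring of the space-joined array, so a word such as SRI would be treated as already present once SRIL is stored. Below is a minimal sketch of an exact-match variant, assuming bash 4 or later for associative arrays. The function name unique_words and the array names are illustrative only, not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each whitespace-separated word of $1 once,
# in first-seen order, using exact key lookups instead of a regex test.
unique_words() {
    local -A seen=()      # associative array, needs bash 4+
    local -a out=()
    local word
    for word in $1; do    # deliberate word splitting (also subject to globbing)
        if [[ -z ${seen[$word]+x} ]]; then
            seen[$word]=1
            out+=("$word")
        fi
    done
    printf '%s\n' "${out[@]}"
}

unique_words 'CR30903 YU0007 SRIL CR30903 Yogesh SRIL'
```

With the sample input this should print CR30903, YU0007, SRIL and Yogesh, one per line.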

mohtashims · January 1, 2020, 8:55am

nezabudka:

This won't work for dynamic data. We need to eliminate duplicate entries only if they exist. Nowhere in your logic do you check for unique or duplicate entries.

MadeInGermany · January 1, 2020, 9:30am

Assuming the input file has got many TKT records in sequence, an associative array is perfect:

awk '
NF>=3 {
  A[$1]=(A[$1] ORS $2)
  D[$1]=$3
}
END {
  for (t in A) {
    print "TKT:", t
    print  "ACTOR:", A[t]
    print "DEPT:", D[t]
    print ""
  }
}' file

Also doable in bash 4 (using its associative arrays).
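
A rough sketch of what the bash 4 variant could look like, assuming the records arrive in a string with one "TKT ACTOR DEPT" triple per line. The function name report_tickets and all variable names are made up for illustration. Unlike the awk version above, this sketch also skips repeated actors for the same ticket.

```shell
#!/usr/bin/env bash
# Hypothetical function: group "TKT ACTOR DEPT" lines of $1 by ticket.
report_tickets() {
    local -A actors=() dept=() seen=()   # associative arrays, bash 4+
    local -a order=()                    # tickets in first-seen order
    local tkt actor dep rest key
    while read -r tkt actor dep rest; do
        [[ -n $dep ]] || continue        # skip lines with fewer than 3 fields
        [[ -n ${dept[$tkt]+x} ]] || order+=("$tkt")
        dept[$tkt]=$dep                  # last DEPT seen for a ticket wins
        key="$tkt/$actor"
        if [[ -z ${seen[$key]+x} ]]; then
            seen[$key]=1
            actors[$tkt]+=$'\n'$actor    # one actor per line
        fi
    done <<< "$1"
    for tkt in "${order[@]}"; do
        printf 'TKT: %s\n\nACTOR:%s\n\nDEPT: %s\n\n' \
            "$tkt" "${actors[$tkt]}" "${dept[$tkt]}"
    done
}
```

It would be called as report_tickets "$1" from the script that receives the data.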

nezabudka · January 1, 2020, 9:34am

awk '!T[$0]++' RS='[[:space:]]+' <<<"$1"

or

grep -o '\S*' <<<"$1" | sort -u

Further, if you want, you can store the result in an array. Here is a sample:

my_array=($(grep -o '\S*' <<<"$1" | sort -u))
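
The my_array=($(...)) form relies on unquoted word splitting, so the words are also subject to filename globbing. A possible alternative is mapfile (bash 4+), which reads one array element per line. This is only a sketch: the wrapper name fill_unique is hypothetical, and the POSIX class [^[:space:]] with grep -E is swapped in for \S.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: fill the global array my_array with the sorted
# unique words of $1, one array element per line of output.
fill_unique() {
    mapfile -t my_array < <(grep -oE '[^[:space:]]+' <<< "$1" | sort -u)
}

fill_unique 'CR30903 YU0007 SRIL CR30903 Yogesh SRIL'
printf '%s\n' "${my_array[@]}"
```

The order of the elements is whatever sort -u produces, which depends on the current locale.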

And please pay attention to post #4.

mohtashims · January 1, 2020, 11:43am

madeingermany:

There is no file involved here.

I tried your solution but I get a syntax error:

    while read -r fline; do
      fmy_array+=("$fline")
echo $fline | awk 'NF>=3 {A[$1]=(A[$1] ORS $2) D[$1]=$3 } END {   for (t in A) {    print "TKT:", t    print  "ACTOR:", A[t]    print "DEPT:", D[t]    print ""  }}'
    done <<< "$1"

error output:
awk: cmd. line:1: NF>=3 {A[$1]=(A[$1] ORS $2) D[$1]=$3 } END {   for (t in A) {    print "TKT:", t    print  "ACTOR:", A[t]    print "DEPT:", D[t]    print ""  }}
awk: cmd. line:1:                                  ^ syntax error
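
The syntax error comes from joining the awk statements onto one line without separators: awk needs a newline or a semicolon between statements. Running awk once per line inside the while loop would also trigger the END block for every line. A sketch of the one-line form with semicolons, fed once from a here-string instead of a file (the wrapper name group_by_ticket is hypothetical):

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run the awk program once over all lines of $1.
group_by_ticket() {
    awk 'NF>=3 { A[$1]=(A[$1] ORS $2); D[$1]=$3 } END { for (t in A) { print "TKT:", t; print "ACTOR:", A[t]; print "DEPT:", D[t]; print "" } }' <<< "$1"
}

group_by_ticket "$(printf 'CR30903 YU0007 SRIL\nCR30903 Yogesh SRIL')"
```

As in the multi-line original, repeated actors for one ticket are appended again rather than deduplicated.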

--- Post updated at 12:52 PM ---

This solution did not help. Although it took care of the duplicate entries, there is no way to tell which of the three fields of each array element is the "TKT", "ACTOR" or "DEPT".

Here is the output I got:

--- Post updated at 01:06 PM ---

nezabudka:

There are no files involved here.

Also, I have updated the original post with the output of $1.

--- Post updated at 01:43 PM ---

nezabudka:

grep -o '\S*' <<<"$1" | sort -u

<- This works, but the output is in no particular order, so I don't know which element is which, as requested in the desired output. Can you please help arrange it so that it is displayed as below:

Current output:

nezabudka · January 1, 2020, 1:51pm

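grep -Eo '(\S+\s+){2}\S+' <<<"$1" |
  awk '
        # Collect the distinct values of each column as array keys.
        { A[$1]; B[$3]; C[$2] }
  END   { print "TKT:" cat(A) "\n\nDEPT:" cat(B) "\n\nACTOR:" cat(C) }
  # Join the keys of T, one per line; s and i are local variables.
  function cat(T,    s, i)
        {
          s = ""
          for (i in T)
            s = s RS i
          return s
        }'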
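
The for (i in T) loop visits keys in an unspecified order, and the command above prints DEPT before ACTOR. If the TKT, ACTOR, DEPT order of the desired output and first-seen ordering matter, something along these lines might work. It is only a sketch: the names print_report, add, seen and list are invented here, the POSIX class [^[:space:]] replaces \S, and the values are printed on the lines after each label, as in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical function: print the unique TKT, ACTOR and DEPT values found
# in the whitespace-separated triples of $1, keeping first-seen order.
print_report() {
    grep -Eo '([^[:space:]]+[[:space:]]+){2}[^[:space:]]+' <<< "$1" |
    awk '
        # add(): record value v for column k unless it was already seen.
        function add(k, v) {
            if (!((k, v) in seen)) {
                seen[k, v]
                list[k] = list[k] "\n" v
            }
        }
        { add(1, $1); add(2, $2); add(3, $3) }
        END { printf "TKT:%s\n\nACTOR:%s\n\nDEPT:%s\n", list[1], list[2], list[3] }
    '
}

print_report 'CR30903 YU0007 SRIL CR30903 Yogesh SRIL'
```

The ticket is printed exactly as it appears in the input (CR30903), since the thread does not say how the CR prefix should be handled.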