Get unique elements from Array

I have some code that fills an array; the code and its output are below:

    
echo $1

while read -r fline; do
    echo "%%%%%%$fline%%%%%"
    fmy_array+=("$fline")
    done <<< "$1"

Output:

CR30903 YU0007 SRIL CR30903 Yogesh SRIL
%%%%%%CR30903    YU0007     SRIL%%%%%
%%%%%%CR30903    Yogesh SRIL%%%%%

Desired output:

TKT: 30903

ACTOR:
YU0007
Yogesh

DEPT: SRIL

As you can see, TKT and DEPT each print a single unique value, while ACTOR prints two values because they differ.

I know how to get unique elements from an array using sort -u, but here we are dealing with a sub-element (a field) of each array element.

Can you please suggest how I can achieve the desired output?

Hi
To get started, simply print the value of this parameter

echo "$1"

--- Post updated at 17:15 ---

Something like this?

cat file
30903	 YU0007	SRIL
30903	 Yogesh	SRIL
awk '{ B[$2] } END { print "TKT:", $1 RS "ACTOR:"; for (i in B) print i; print "DEPT:", $3 }' file

--- Post updated at 17:20 ---

or maybe

if [[ "${my_array[@]}" =~ $fline ]]; then
    continue
fi

Unfortunately I moved away from my system. Can we do without that information?

here's another

for fline in $1; do
    if ! [[ "${my_array[@]}" =~ "$fline" ]]; then
        my_array+=("$fline")
    fi
done
echo "${my_array[@]}"
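
One caveat with the `=~` test: it matches against the whole array joined into one string, so a word that is merely a substring of a stored word (say SRI after SRIL) would be treated as a duplicate. A minimal sketch of an exact-match variant follows, assuming bash 4 or later; the function name `unique_words` is made up for illustration.

```shell
# Sketch only: unique_words is a hypothetical helper; it fills the global array my_array.
# Requires bash 4+ (associative arrays).
unique_words() {
    local -A seen=()
    local -a words=()
    local word
    my_array=()
    # Split the whole argument on spaces, tabs and newlines, without globbing.
    # read returns non-zero at end of input here, hence the "|| true".
    read -r -d '' -a words <<< "$1" || true
    for word in "${words[@]}"; do
        if [[ -z ${seen[$word]} ]]; then   # exact key lookup, not a substring match
            seen[$word]=1
            my_array+=("$word")
        fi
    done
}
```

Calling `unique_words "$1"` and then `printf '%s\n' "${my_array[@]}"` should list each word once, in order of first appearance.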

This won't work for dynamic data. We need to eliminate duplicate entries only if they exist. Nowhere in your logic do you check for unique or duplicate entries.

Assuming the input file has got many TKT records in sequence, an associative array is perfect:

awk '
NF>=3 {
  A[$1]=(A[$1] ORS $2)
  D[$1]=$3
}
END {
  for (t in A) {
    print "TKT:", t
    print  "ACTOR:", A[t]
    print "DEPT:", D[t]
    print ""
  }
}' file

Also doable in bash 4 (using its associative arrays).
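
A rough sketch of what the bash 4 version could look like, assuming the records arrive as a multi-line string in "$1" with three whitespace-separated fields per line. The function name `summarize_records` and the exact output layout are my own choices, and the ticket value is printed as it appears in the input (no prefix is stripped).

```shell
# Sketch only: summarize_records is a hypothetical helper name.
# Requires bash 4+ for associative arrays.
summarize_records() {
    local -A seen_tkt=() seen_actor=() seen_dept=()
    local -a tkts=() actors=() depts=()
    local tkt actor dept _rest

    # read splits each line on whitespace into the three fields
    while read -r tkt actor dept _rest; do
        [[ -n $dept ]] || continue   # skip lines with fewer than three fields
        # Record each value only the first time it is seen, keeping input order
        [[ -n ${seen_tkt[$tkt]} ]]     || { seen_tkt[$tkt]=1;     tkts+=("$tkt"); }
        [[ -n ${seen_actor[$actor]} ]] || { seen_actor[$actor]=1; actors+=("$actor"); }
        [[ -n ${seen_dept[$dept]} ]]   || { seen_dept[$dept]=1;   depts+=("$dept"); }
    done <<< "$1"

    printf 'TKT: %s\n\n' "${tkts[*]}"
    printf 'ACTOR:\n'
    printf '%s\n' "${actors[@]}"
    printf '\nDEPT: %s\n' "${depts[*]}"
}
```

It would be called as `summarize_records "$1"`. If several tickets or departments occur, they are printed space-separated on the TKT and DEPT lines.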

awk '!T[$0]++' RS='[[:space:]]+' <<<"$1"

or

grep -o '\S*' <<<"$1" | sort -u

Further, if you want, you can put the result into an array. Here is a sample:

my_array=($(grep -o '\S*' <<<"$1" | sort -u))
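
The unquoted `$(...)` in that sample relies on word splitting and is also subject to pathname expansion, and `\S` is a GNU grep extension. A possible variant, assuming bash 4 or later for `mapfile`, is sketched here; `unique_sorted_words` is a made-up name.

```shell
# Sketch only: unique_sorted_words is a hypothetical helper; it fills the global array my_array.
# Requires bash 4+ (mapfile).
unique_sorted_words() {
    # tr puts every word on its own line, grep drops empty lines,
    # sort -u removes duplicates (the order depends on the locale).
    mapfile -t my_array < <(tr -s '[:space:]' '\n' <<< "$1" | grep -v '^$' | sort -u)
}
```

As with `sort -u` in general, the original field order is lost, so this only answers "which distinct words are there".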

And please pay attention to post number #4

There is no file involved here.

I tried your solution but I get a syntax error:

    while read -r fline; do
      fmy_array+=("$fline")
echo $fline | awk 'NF>=3 {A[$1]=(A[$1] ORS $2) D[$1]=$3 } END {   for (t in A) {    print "TKT:", t    print  "ACTOR:", A[t]    print "DEPT:", D[t]    print ""  }}'
    done <<< "$1"

error output:
awk: cmd. line:1: NF>=3 {A[$1]=(A[$1] ORS $2) D[$1]=$3 } END {   for (t in A) {    print "TKT:", t    print  "ACTOR:", A[t]    print "DEPT:", D[t]    print ""  }}
awk: cmd. line:1:                                  ^ syntax error
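
The syntax error most likely comes from joining the multi-line awk program onto one line: awk needs a `;` (or a newline) between statements. In addition, running awk once per line inside the loop means each run sees only one record, so nothing can be grouped. A sketch that keeps the same program, adds the separators and feeds the whole string once is below; the wrapper name `group_by_tkt` is hypothetical.

```shell
# Sketch only: group_by_tkt is a hypothetical wrapper around the awk program above.
# It reads the whole multi-line string from "$1" in a single awk run.
group_by_tkt() {
    awk 'NF>=3 { A[$1]=(A[$1] ORS $2); D[$1]=$3 } END { for (t in A) { print "TKT:", t; print "ACTOR:", A[t]; print "DEPT:", D[t]; print "" } }' <<< "$1"
}
```

As in the original program, repeated actors within one ticket are not removed, and the order of tickets in `for (t in A)` is unspecified.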

--- Post updated at 12:52 PM ---

This solution did not help. Although it took care of the duplicate entries, there is no way to tell which of the three fields of each array element is "TKT", "ACTOR" or "DEPT".

here is the output I got:

--- Post updated at 01:06 PM ---

There are no files involved here.

Also, I have updated the Original post with the output of $1.

--- Post updated at 01:43 PM ---

grep -o '\S*' <<<"$1" | sort -u

<- This works, but the output has no order, so I don't know which element is which, as requested in the desired output. Can you please help get it displayed as below:

Current output:

grep -Eo '(\S+\s+){2}\S+' <<<"$1" |
  awk '
        {A[$1]; B[$3]; C[$2]}
  END   { print "TKT:" cat(A) "\n\nDEPT:" cat(B) "\n\nACTOR:" cat(C)}
  function cat(T,    s, i)
        { s=""
          for (i in T)
                  s = s RS i
          return s
        }'