Trying to print the unique values in $2 before the - (currently only the count is displayed). Hopefully, the below is close. Thank you :).
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 105.6
awk '{sub(/\-.*/,"",$2)}!seen[$2]++{c++}END{print c}' file.txt
output is 3
desired output
EPAS1
CPS1
NOTCH3
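
A possible sketch of the change, assuming the goal is to list each name once in order of first appearance: the attempt above increments a counter `c` for each new value and prints only `c` in `END`, so printing `$2` at the point where a value is first seen should give the names instead. The helper name `unique_names` is made up for illustration, and the sample rows are copied into `file.txt` only so the snippet is self-contained. The backslash before `-` is dropped because it is not needed outside a bracket expression.

```shell
# Recreate the sample input from the post (file.txt is the name used in the awk command).
cat > file.txt <<'EOF'
chr2:46603668-46603902 EPAS1-902|gc=54.3 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 105.6
EOF

# Hypothetical helper: strip everything from the first "-" in $2,
# then print the remaining name only the first time it is seen.
unique_names() {
    awk '{ sub(/-.*/, "", $2) } !seen[$2]++ { print $2 }' "$1"
}

unique_names file.txt
```

Because `seen[$2]++` is evaluated after the `sub()`, the second `CPS1` row is skipped and the names come out in the order they first appear in the file.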