So I've got a problem that follows on from my previous one (from a few months ago:
Unix Linux Community - Technical support for all Unix and Linux users ).
The good, proven, working solutions for that old problem are these:
awk '{cur=$0; gsub(/[^[:alnum:]]/, "", cur); if (!a[tolower(cur)]++) print}'
and
awk '{s=tolower($0);gsub("[^[:alnum:]]","",s);x[s]=$0} END {for(i in x) print x[i]}'
These two approaches yield the same results (but with a different final order of lines, which is really unimportant to me).
Either of these lines is also what I now need modified to work a little differently, and that is the purpose of this new topic:
I no longer need awk (in its search for duplicate lines in the file) to consider and compare whole lines, but only the first part of each line, up to the first '*' (asterisk) character. The asterisk is the separator in my file, and awk should not bother with anything that comes after it (as if it had reached the end of the line). An asterisk occurs in every line of the file, but sometimes there is more than one per line. This should not confuse awk: it should still take into account only the first part of the line, up to where the first asterisk appears.
If someone can come up with a good solution for this, it would save me a week of work... and earn my eternal gratitude.
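
For what it's worth, here is a minimal sketch of one way the first one-liner could be adapted. It is untested against the real file: it assumes the key is everything before the first asterisk, and that the same normalisation as before (ignore case, ignore non-alphanumeric characters) should still apply to that key. The function name dedup_prefix is just a hypothetical wrapper for illustration.

```shell
# dedup_prefix: hypothetical helper name, not from the original thread.
# Prints each line whose normalised prefix (the text before the first '*')
# has not been seen before; the first occurrence of each prefix is kept.
dedup_prefix() {
    awk '{
        key = $0
        sub(/\*.*/, "", key)            # cut from the first asterisk to end of line
        key = tolower(key)              # compare case-insensitively, as before
        gsub(/[^[:alnum:]]/, "", key)   # ignore non-alphanumeric characters, as before
        if (!seen[key]++) print         # print the whole original line once per key
    }' "$@"
}
```

It could be called as dedup_prefix input.txt > output.txt (both file names are placeholders). Because the regex match starts at the leftmost asterisk, extra asterisks later in the line should make no difference, and a line with no asterisk at all would simply be compared as a whole.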