awk '
# If NR==FNR this is the first file, so get rid of
#+ the "(",")","-"," " characters ("gsub" is global substitution),
#+ and populate the x array: x[$0].
NR==FNR{gsub(/[ \(\)-]/,"");x[$0];next}
# Otherwise, it is the second file, so
#+ remove the spaces. Now we have
#+ the right formatting.
{gsub(/ /,"")}
# If the current record is not
#+ previously stored in the x array,
#+ print it (default action).
!($0 in x)' file1 file2
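For anyone who wants to try the idiom, here is a minimal sketch using made-up sample data. The phone numbers, the temporary directory and the `result` variable are assumptions for illustration, not the original poster's files.

```shell
# Hypothetical sample files, created in a throwaway directory.
dir=$(mktemp -d)
printf '%s\n' '(555) 123-4567' '555-222 3333' > "$dir/file1"
printf '%s\n' '555 123 4567' '555 999 0000' > "$dir/file2"

# First file: strip "(", ")", "-" and spaces, remember each number as a key of x.
# Second file: strip spaces, then print only the numbers that are not keys of x.
result=$(awk 'NR==FNR{gsub(/[ ()-]/,"");x[$0];next}
{gsub(/ /,"")}
!($0 in x)' "$dir/file1" "$dir/file2")

printf '%s\n' "$result"
rm -r "$dir"
```

Only the second number of file2 is missing from file1, so that is the one printed. Note that it is printed in its normalized form (without spaces), because gsub modified the record before the test.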
A small clarification on this:
How can we apply sed to one particular column only (the third column in this example)?
sed 's/[-() ]//g' processes all the columns.
I have two files to compare.
The script was written against different input data
(the post was modified afterwards;
ghostdog74's post shows the original sample).
Try this:
awk 'NR==FNR{ gsub(/[ \(\)-][A-Z]*/,"");x[$0];next}
{gsub(/ /,"")}!($0 in x)' file1 file2
sub is not there for comparisons; it substitutes values, in the form sub(/match pattern/, "replacement", target). In this case, sub(/ /,"",$3) replaces the first space in the third column with "", which means it removes that space. gsub works the same way in this script, except that it replaces every match: gsub(/[ \(\)-]/,"",$3) has the match pattern /[ \(\)-]/, i.e. match a space, ( or ) or -, and replaces each match with "" (the empty string), which means it removes it. The actual comparison is done through arrays, and Radoulov has described it earlier.
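To limit the clean-up to one column, the third argument of gsub can be used as described above. This is only a sketch: it assumes the columns are separated by "|" (so a phone number containing spaces stays in a single field), and the sample records and the `out` variable are made up.

```shell
# Hypothetical input: three "|"-separated columns, phone number in the third.
tmp=$(mktemp)
printf '%s\n' 'john|smith|(555) 123-4567' 'jane|doe|555 987-6543' > "$tmp"

# Strip "(", ")", "-" and spaces from the third column only;
# the trailing 1 is an always-true pattern, so every record is printed.
out=$(awk 'BEGIN{FS=OFS="|"} {gsub(/[ ()-]/,"",$3)} 1' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

The first two columns are left untouched. Setting OFS to the same value as FS matters here, because awk rebuilds the record with OFS once a field has been modified.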