One is master and the other is detail. Because there is no uniqueness between them I have to rely on the row count of each file before processing.
With that said, I run cut -d "|" -f2 on both files and redirect the output into aa.out and ab.out.
Once this is complete, I sort both files and do a diff to get the records in one and not the other and vice versa.
Now that I have a diff.out (most of the time it will only contain a record or two), I need to be able to remove those records from the original files before application processing.
I will attempt to illustrate for your review:
File 1
111111
222222
333333
555555
File 2
111111
222222
333333
444444
cat diff.out
03c4400
444444
03c5556
555555
I now need to remove those two records from both files - please let me know if you have any ideas.
I assume you have something like file1.orig and file2.orig with more fields but using | as a field separator. Here is a way to do this with comm and sed:
$
$ cat file2.orig
111111|sdjksd
222222|sdjksd
333333|sdjksd
444444|sdjksd
$ comm -13 file1 file2 | sed 's=^=/^=;s=$=\|/d=' > file2.sed
$ sed -f file2.sed < file2.orig
111111|sdjksd
222222|sdjksd
333333|sdjksd
$
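The same idea should carry over to the other file by switching to comm -23, which prints the keys found only in file1. Below is a minimal sketch of that, assuming file1.orig also carries the key as its first field and that the keys are plain digits with no regex special characters. The sample rows and the names file1.sed and file1.clean are made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: sorted key lists and the matching original file.
printf '%s\n' 111111 222222 333333 555555 > file1
printf '%s\n' 111111 222222 333333 444444 > file2
printf '%s\n' '111111|aaa' '222222|bbb' '333333|ccc' '555555|ddd' > file1.orig

# comm -23 prints keys found only in file1 (both inputs must be sorted);
# sed then wraps each key as a delete command of the form /^KEY|/d.
comm -23 file1 file2 | sed 's=^=/^=; s=$=|/d=' > file1.sed
cat file1.sed

# Apply the generated delete script to the original file.
sed -f file1.sed < file1.orig > file1.clean
cat file1.clean
```

Writing the result to a separate file (file1.clean here) keeps file1.orig untouched, so the row counts can be compared before anything is replaced.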
Correct - I am using a "|" delimited file, and the only common field between the 2 files is the one I cut (-f3) before comparing; the rest of the fields are different.
Specifically, I am struggling with the step of removing those records from the individual files once the difference has been determined.
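If the key really is the third field, the sed patterns above will not line up, because they anchor the key at the start of the line. One possible alternative is awk, which can compare a specific field directly. This is only a sketch: it assumes the unmatched keys have already been collected one per line in a file (called drop.keys here, a hypothetical name), and the sample rows are invented.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical data: the key lives in the third |-separated field.
printf '%s\n' 'a|b|111111|x' 'a|b|444444|y' 'a|b|333333|z' > file2.orig
printf '%s\n' 444444 555555 > drop.keys

# While reading the first file named on the command line (drop.keys),
# remember each key. For the second file, print only the rows whose
# third field is not one of the remembered keys.
awk -F'|' 'FILENAME == ARGV[1] { drop[$1]; next } !($3 in drop)' \
    drop.keys file2.orig > file2.clean
cat file2.clean
```

The same command can be run against each original file in turn. One way to build drop.keys from the two sorted key lists would be comm -3 aa.out ab.out | tr -d '\t', which lists the keys present in only one of the files.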