Mac OS X 10.9
Let me preface this by saying this is not for marketing or spamming purposes.
I have a script that scans all the email messages in a directory (~/Library/Mail/Mailboxes) and outputs a single-column list of email addresses. It will run multiple times a day and append new entries to the output file.
If an address appears more than once in the mail folder, it is duplicated in the output file. How do I remove these duplicates from the output file? It's just a single column of data, one address per line. I'm not sure whether I should check for and exclude duplicates as they are written, or simply scan for duplicates after the output file has been appended to.
This list is being used as input for LDAP queries.
For reference, the scanning/output portion of my script is below:
find "$SRC" -type f -name '*.emlx' |
while IFS= read -r FILE
do
    # Print the address between < and > on each From: line
    awk '/^From:/ && gsub(/.*<|>.*/, "")' "$FILE"
done > ~/Desktop/output.txt
echo "complete"
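
For the second option (scan for duplicates after the output file has been written), here is a minimal sketch. The function name dedupe_file is hypothetical, and it assumes the file has exactly one address per line. It keeps the first occurrence of each line and preserves the original order.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): remove duplicate
# lines from a file in place, keeping the first occurrence of each line.
dedupe_file() {
    file=$1
    # Write to a temporary file first so the input is never truncated
    # while awk is still reading it.
    tmp=$(mktemp "${TMPDIR:-/tmp}/dedupe.XXXXXX") || return 1
    # seen[$0]++ evaluates to 0 the first time a line appears, so
    # !seen[$0]++ is true only for first occurrences, which awk prints.
    awk '!seen[$0]++' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Example call, run after the scan has finished:
# dedupe_file ~/Desktop/output.txt
```

If the order of the list does not matter, piping the output through sort -u should give the same set of addresses. Addresses that differ only in letter case are treated as distinct here; keying on tolower($0) instead of $0 would fold them together, if that is what the LDAP queries need.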