Comparing lists.. Arrays maybe?

Problem
Part 1.
Gather data from a Linux server and output it to a file named data_DDMMYY.
The output file must contain each file's name and size.

Part 2.
Compare today's data_DDMMYY to yesterday's data_DDMMYY and output the results to a file named difference_DDMMYY.
The output file must show the difference in size between files with the same name, and identify new and deleted files.

Solution
Part 1.
Log into Linux and run the command "find . -daystart -ctime 0 -type f | xargs ls -sSh > data_DDMMYY" (thank you, Yazu).
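A possible variation on that command, as a sketch only. It assumes GNU find, fills in the DDMMYY stamp with date, and records sizes in plain bytes via -printf so that Part 2 can subtract them (ls -sSh prints human-readable sizes such as 4.0K, which are awkward to do arithmetic on). The src and out directories and the two sample files are made up for the example; the report is written outside the scanned directory so that it does not list itself.

```shell
# Sketch only: assumes GNU find (-daystart and -printf are GNU extensions).
src=$(mktemp -d)   # made-up directory to scan
out=$(mktemp -d)   # made-up directory for the report
printf 'hello\n' > "$src/a.txt"
printf 'hello world\n' > "$src/b.txt"

today=$(date +%d%m%y)   # DDMMYY stamp for the file name

# One "size-in-bytes name" line per file changed today, sorted by name.
(cd "$src" && find . -daystart -ctime 0 -type f -printf '%s %p\n' | sort -k2) > "$out/data_$today"

cat "$out/data_$today"
```

File names containing spaces or newlines would still need extra care in whatever reads this file later.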
Sample output:

Part 2.
Perhaps use of an array and iterator? (my knowledge is very limited when it comes to arrays and iterators)

Idea
Take file 1, go to line 1, read the first string and save it to variable x, then read the second string and save it to list2.
Take file 2 and, using the current entry in list2, look for a match.

If a match is found, go to that line, read the first string and save it to variable y.
Calculate the difference x-y and save it to list1, then output the first line of list1 and list2 to difference_DDMMYY.

If a match is not found, output <string>.notFound on a new line in difference_DDMMYY.

Take file 1, go to line 2, read the first string and save it to x, read the second string and save it to a new line in list2,
and so on until all lines in file 1 have been used.

Then somehow I'd like to find the lines in file 2 that weren't copied to a list and output them to difference_DDMMYY as <string>.newFile.
I'm sure there's some sort of for loop I can use to solve this, but as of right now I'm stumped and need ideas >.<
Any and all help is appreciated.

Have you tried using awk?
Awk can read multiple files and you can use the array concept in there.

ex: awk 'NR==FNR{A[NR]=$0;next}
{do compare operations}{save the value}' file1 file2 > file3
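To make that pattern concrete, here is a rough sketch rather than a finished solution. It assumes each line of both files is "size name", with the size in plain bytes and no spaces in the name (ls -sSh output would need converting first). The file names yesterday, today and difference, and the sample data, are made up. The array is keyed by file name instead of line number so that today's lines can look up yesterday's size directly, and the difference is printed as today minus yesterday.

```shell
# Sketch only: made-up sample data, one "size-in-bytes name" pair per line.
work=$(mktemp -d)
printf '%s\n' '100 ./a.txt' '200 ./b.txt' '300 ./c.txt' > "$work/yesterday"
printf '%s\n' '150 ./a.txt' '200 ./b.txt' '50 ./d.txt'  > "$work/today"

awk '
  # First file (yesterday): remember each size, keyed by file name.
  NR == FNR { old[$2] = $1; next }
  # Second file (today): name was seen yesterday, so print the size change.
  ($2 in old) { print $2, $1 - old[$2]; delete old[$2]; next }
  # Name was not seen yesterday, so it is a new file.
  { print $2 ".newFile" }
  # Names still left in old[] did not appear in the second file.
  END { for (name in old) print name ".notFound" }
' "$work/yesterday" "$work/today" > "$work/difference"

cat "$work/difference"
# ./a.txt 50
# ./b.txt 0
# ./d.txt.newFile
# ./c.txt.notFound
```

With more than one missing file, the order of the .notFound lines is not guaranteed, because awk does not define the order of a for-in loop; piping through sort would fix that. The NR == FNR test also misfires if the first file is empty.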


After you save your files as data_DDMMYY, you can pass them to the code below as file1 and file2. This assumes the files being compared have only one column:

awk '{if(FNR==NR) {arr[$0]++;next} if($0 in arr) { arr[$0]--; if (arr[$0] == 0) delete arr[$0];next}{print $0 >"newfile2"}} END {for(i in arr){print i >"newfile1"}}' file1 file2

newfile1 and newfile2 are the difference files: newfile1 holds the lines of file1 that are not in file2, and newfile2 holds the lines of file2 that are not in file1.
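To see what this does in practice, here is a quick run of the same one-liner on made-up two-column data (size, then name); the sample contents and the temporary directory are assumptions for illustration only. Because whole lines are compared, a file whose size changed shows up in both outputs: its old line in newfile1 and its new line in newfile2.

```shell
# Sketch only: made-up sample data in a scratch directory.
work=$(mktemp -d)
cd "$work" || exit 1
printf '%s\n' '100 ./a.txt' '200 ./b.txt' '300 ./c.txt' > file1
printf '%s\n' '150 ./a.txt' '200 ./b.txt' '50 ./d.txt'  > file2

# The one-liner from above, unchanged.
awk '{if(FNR==NR) {arr[$0]++;next} if($0 in arr) { arr[$0]--; if (arr[$0] == 0) delete arr[$0];next}{print $0 >"newfile2"}} END {for(i in arr){print i >"newfile1"}}' file1 file2

# Sorted, because the END loop writes newfile1 in no particular order.
sort newfile1   # lines only in file1
sort newfile2   # lines only in file2
```

Here newfile1 ends up with the a.txt and c.txt lines from file1, and newfile2 with the a.txt and d.txt lines from file2; the unchanged b.txt line appears in neither.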

regards


Thanks heaps. Until I started playing with Linux I had never heard of awk. I'm doing some research now and will hopefully have a better understanding of the language soon ^.^