I have 2 report files generated on Linux that share a common format and were produced across multiple hosts with the same setup/configs. The files do some simple reporting on resource allocation and user sessions. So, essentially, say, 10 hosts, each producing the same 2 system reports, for 20 files total.
Neither file contains any reference to the host name, but both contain resource info and correlating user info per machine.
I need to meet some basic reporting objectives that include:
- taking cumulative reporting data on the resources and ranking the machines by availability
- taking cumulative reporting data on the users and ranking them by usage
I created a simple Python script that runs through all the files and produces output that summarizes per machine, but I haven't converted the parsed data into real numeric values, so it's not sortable.
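On the "not sortable" point: casting each parsed string to `int`/`float` before calling `sorted()` is usually all that's needed. A minimal sketch, where the dict layout and the field names (`machine`, `free_mem`) are made-up placeholders for whatever your script actually parses:

```python
# Sketch: cast parsed report fields to numbers so machines can be ranked.
# The field names ("machine", "free_mem") and the summary dicts are
# hypothetical stand-ins for whatever the real script already extracts.

def rank_by_availability(summaries):
    """Sort per-machine summaries by a numeric availability value, descending."""
    for s in summaries:
        # str -> float is the key step: "2048" sorts before "512" as a string,
        # but correctly as a number.
        s["free_mem"] = float(s["free_mem"])
    return sorted(summaries, key=lambda s: s["free_mem"], reverse=True)

machines = [
    {"machine": "host01", "free_mem": "512"},
    {"machine": "host02", "free_mem": "2048"},
    {"machine": "host03", "free_mem": "100"},
]
ranked = rank_by_availability(machines)
print([m["machine"] for m in ranked])  # host02 first
```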
I also think that the following needs to happen:
- merge the data files in Linux with piping
- append the machine name/# to the first column
- read that with Python slicing (since the machine name will always be the first "X" characters of the line)
- use a wrapped bash script, then execute the Python script after bash has merged all the data files
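For what it's worth, the merge-and-tag steps above can also stay inside Python, which avoids the bash wrapper entirely. A sketch under assumed paths (the `reports/<hostname>/usage.txt` layout and the 8-character column width are invented; adjust the glob and the name extraction to your setup):

```python
# Sketch: merge all per-host report files into one file, prepending each
# host's name as a fixed-width first column so later slicing works.
# Paths and WIDTH are assumptions, not your real layout.
import glob
import os

WIDTH = 8  # fixed width for the machine-name column, padded so slicing works


def merge_reports(pattern="reports/*/usage.txt", out_path="big_data_file"):
    with open(out_path, "w") as out:
        for path in sorted(glob.glob(pattern)):
            # Here the machine name comes from the directory name, since the
            # files themselves don't mention the host.
            machine = os.path.basename(os.path.dirname(path)).ljust(WIDTH)
            with open(path) as f:
                for line in f:
                    out.write(machine + line)  # name occupies columns 0..WIDTH

# The parser can later recover the name with a simple slice:
#   machine = line[:WIDTH].strip(); record = line[WIDTH:]
```

The fixed-width padding (`ljust`) is what makes the "first X characters" slicing reliable even when host names have different lengths.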
so something like:
#!/bin/bash
merge the data files
push to python
I might need to change my Python script so that the function input isn't the machine name, but something more like "big_data_file", etc.
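That signature change could look something like this: the function takes the merged file's path and groups lines back into per-machine buckets via the fixed-width slice. `WIDTH` and the tagging format are assumptions matching the merge-and-prepend idea described above:

```python
# Sketch: summarizer that takes the merged file instead of a machine name,
# recovering each line's machine via the fixed-width first column.
from collections import defaultdict

WIDTH = 8  # assumed fixed width of the prepended machine-name column


def summarize(big_data_file):
    """Group raw report lines by machine, keyed on the sliced-off name."""
    per_machine = defaultdict(list)
    with open(big_data_file) as f:
        for line in f:
            machine = line[:WIDTH].strip()        # slice off the tag...
            per_machine[machine].append(line[WIDTH:])  # ...keep the record
    return per_machine
```

From there, each bucket can be summed/averaged and fed into the ranking step.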
Does that make sense?