Hi,
I have a directory /home/datasets/ that contains 720 subdirectories named hour_1/, hour_2/, and so on. Each subdirectory holds a single text file named after it (hour_1.txt in hour_1/, hour_2.txt in hour_2/, etc.), and I would like to do some text processing on these files.
Each of these text files contains records (every record is unique; there are no duplicates). As a first step I want to write each record to its own file, named after the record's second field ($2), which is an identifier of the form cust_xxx_yyy. I'm currently doing this (example for the file hour_1/hour_1.txt):
(1)
awk '{print $0 > $2".txt"}' hour_1.txt
which results in multiple .txt files whose names start with cust_.
Then I want each of these files to become a single-column file, so I do this:
(2)
awk '{print > "n_"FILENAME}' RS=" " cust_*
Finally, I want to remove the first 3 records of the newly created files, so I do the following:
(3)
awk 'FNR>3 {print > "fin_"FILENAME}' n_cust*
I know there may be an easier way of doing this even for a single directory, but is there a way to write a universal script that performs these 3 commands in all the directories?
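To make the question concrete, this is roughly the kind of wrapper I have in mind. It is only a sketch: the function name process_all is made up, it assumes every directory follows the hour_N/hour_N.txt layout described above, and it assumes $2 really is unique within each file. The close() calls are my own addition to commands (1)-(3), meant to avoid running into the open-file limit when there are many records.

```shell
# Sketch only: process_all is a hypothetical name, and the layout
# <base>/hour_N/hour_N.txt is assumed from the description above.
process_all() {
    base=$1
    for dir in "$base"/hour_*/; do
        name=$(basename "$dir")
        # Skip any directory that does not contain the expected file.
        [ -f "${dir}${name}.txt" ] || continue
        (
            # Work inside the directory so output files land next to the input.
            cd "$dir" || exit 1

            # (1) One file per record, named after field 2.
            #     Assumes $2 is unique, since each record reopens its file.
            awk 'NF >= 2 { out = $2 ".txt"; print > out; close(out) }' "$name.txt"

            # (2) Split every cust_ file on single spaces, one value per line.
            awk 'FNR == 1 && out != "" { close(out) }
                 { out = "n_" FILENAME; print > out }' RS=" " cust_*.txt

            # (3) Drop the first 3 lines of every n_cust_ file.
            awk 'FNR == 1 && out != "" { close(out) }
                 FNR > 3 { out = "fin_" FILENAME; print > out }' n_cust_*.txt
        )
    done
}
```

I would then call it as process_all /home/datasets. The subshell around the cd is there so that each directory is processed independently and the loop always returns to where it started.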
Thanks in advance!