Difficulty creating .txt files from a loop in bash

Tim2424 · March 5, 2020, 10:13am

I have this data:

data1.txt

2020-01-27-06-00;/dev/hd1;100;/
2020-01-27-12-00;/dev/hd1;100;/
2020-01-27-18-00;/dev/hd1;100;/
2020-01-27-06-00;/dev/hd2;200;/usr
2020-01-27-12-00;/dev/hd2;200;/usr
2020-01-27-18-00;/dev/hd2;200;/usr

data2.txt

2020-02-27-06-00;/dev/hd1;120;/
2020-02-27-12-00;/dev/hd1;120;/
2020-02-27-18-00;/dev/hd1;120;/
2020-02-27-06-00;/dev/hd2;230;/usr
2020-02-27-12-00;/dev/hd2;230;/usr
2020-02-27-18-00;/dev/hd2;230;/usr

data3.txt

2020-03-27-06-00;/dev/hd1;130;/
2020-03-27-12-00;/dev/hd1;130;/
2020-03-27-18-00;/dev/hd1;130;/
2020-03-27-06-00;/dev/hd2;240;/usr
2020-03-27-12-00;/dev/hd2;240;/usr
2020-03-27-18-00;/dev/hd2;240;/usr

I would like to create a .txt file for each filesystem (so hd1.txt, hd2.txt, hd3.txt and hd4.txt) and put in each one the sum of the values for that filesystem from each dataX.txt. I find it hard to explain in English what I want, so here is an example of the desired result.

Expected content for the output file `hd1.txt`:

2020-01;/dev/hd1;300;/
2020-02;/dev/hd1;360;/
2020-03;/dev/hd1;390;/

Expected content for the file `hd2.txt`:

2020-01;/dev/hd2;600;/usr
2020-02;/dev/hd2;690;/usr
2020-03;/dev/hd2;720;/usr

The implementation I've currently tried:

for i in $(cat *.txt | awk -F';' '{print $2}' | cut -d '/' -f3| uniq)
do
    cat *.txt | grep -w $i | awk -F';' -v date="$(cat *.txt | awk -F';' '{print $1}' | cut -d'-' -f-2 | uniq )" '{sum+=$3} END {print date";"$2";"sum}' >> $i

done

But it doesn't works...

Can you show me how to do that ?

RudiC · March 5, 2020, 12:34pm

"But it doesn't works..." is not something people can start working / analysing / debugging upon. Be more precise / descriptive; include error messages, warnings, non-satisfying output, etc.

Try (assuming mount points are unique)

awk -F\; '
        {MTH = substr($1,1,7)           # year-month part of the timestamp
         SUM[MTH,$2] += $3              # total per month and device
         MNT[MTH,$2]   = $4             # remember the mount point
        }
END     {for (s in SUM)         {n = split (s, T, "[;/]")
                                 print s, SUM[s], MNT[s]  >  (T[n] ".txt")
                                }
        }
' OFS=\; SUBSEP=\; file[123]
cf *.txt

---------- hd1.txt: ----------

2020-03;/dev/hd1;390;/
2020-02;/dev/hd1;360;/
2020-01;/dev/hd1;300;/

---------- hd2.txt: ----------

2020-03;/dev/hd2;720;/usr
2020-02;/dev/hd2;690;/usr
2020-01;/dev/hd2;600;/usr
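The listing above comes out newest month first because `for (s in SUM)` visits the array keys in an unspecified order. A minimal sketch, assuming ascending month order is wanted as in the expected output: send each output line through `sort` instead of writing it directly. The scratch directory, the `make_data` helper and the `CMD` array are hypothetical names used only to make the sketch self-contained; the input names follow the question (data1.txt to data3.txt).

```shell
#!/bin/sh
# Build the sample inputs from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical helper. Usage: make_data FILE MONTH HD1_VALUE HD2_VALUE
make_data() {
    for hour in 06 12 18; do
        printf '%s-27-%s-00;/dev/hd1;%s;/\n' "$2" "$hour" "$3"
    done > "$1"
    for hour in 06 12 18; do
        printf '%s-27-%s-00;/dev/hd2;%s;/usr\n' "$2" "$hour" "$4"
    done >> "$1"
}
make_data data1.txt 2020-01 100 200
make_data data2.txt 2020-02 120 230
make_data data3.txt 2020-03 130 240

awk -F';' '
    {
        MTH = substr($1, 1, 7)      # "2020-01-27-06-00" becomes "2020-01"
        SUM[MTH, $2] += $3          # total per month and device
        MNT[MTH, $2]  = $4          # mount point, assumed unique per device
    }
END {
        for (s in SUM) {
            n   = split(s, T, "[;/]")           # T[n] is "hd1", "hd2", ...
            cmd = "sort > " T[n] ".txt"         # one sort process per output file
            CMD[cmd] = 1
            print s, SUM[s], MNT[s] | cmd
        }
        for (c in CMD)
            close(c)                # wait until each sort has written its file
    }
' OFS=';' SUBSEP=';' data[123].txt

cat hd1.txt hd2.txt
```

All lines for one device go to the same `sort` process because awk keeps one pipe open per distinct command string, so each hdN.txt ends up sorted by month.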

Tim2424 · March 6, 2020, 4:55am

Hello !

Thanks for your help !

I've tried many different things in bash, and my last script, the one that came closest to what I want, takes much longer (3 minutes) to do what your script does instantly. So thanks for that !

One last thing: if I want to specify the path where my data is, do I just have to do this ? :

' OFS=\; SUBSEP=\; /my/full/path/*.txt

And, assuming I've understood how to change the location of my data, at this step :

END     {for (s in SUM)         {n = split (s, T, "[;/]")
                                 print s, SUM[s], MNT[s]  >  (T[n] ".txt")
                                }
        }

How can I specify the path where I want to put the result of the script ?

Thanks again !

RudiC · March 6, 2020, 10:37am

1) Yes
2) How do you want to convey the target path to the script? In case you want to use the source files' path, and this is constant, to place the hd*.txt files next to the sources, try (in the END section)

END     {PTH = FILENAME
         sub (/[^\/]*$/, "", PTH)
         for (s in SUM)         {n = split (s, T, "[;/]")
                                 print s, SUM[s], MNT[s]  >  (PTH T[n] ".txt")
                                }
        }

assuming the FILENAME variable is still defined and valid in that section.

Otherwise, you can use awk's -v mechanism to supply the path.
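The -v route could be sketched as follows. The variable name `outdir` and the two scratch directories are assumptions made for illustration, and the input is a shortened sample; only the `-v` option itself is standard awk.

```shell
#!/bin/sh
# Scratch source and target directories so the sketch runs on its own.
src=$(mktemp -d)
dst=$(mktemp -d)

# Shortened sample input (not the full data from the question).
printf '%s\n' \
    '2020-01-27-06-00;/dev/hd1;100;/' \
    '2020-01-27-12-00;/dev/hd1;100;/' \
    '2020-01-27-06-00;/dev/hd2;200;/usr' > "$src/data1.txt"

# outdir is a hypothetical variable name, filled in from the shell via -v.
awk -F';' -v outdir="$dst" '
    {
        MTH = substr($1, 1, 7)
        SUM[MTH, $2] += $3
        MNT[MTH, $2]  = $4
    }
END {
        for (s in SUM) {
            n = split(s, T, "[;/]")
            print s, SUM[s], MNT[s] > (outdir "/" T[n] ".txt")
        }
    }
' OFS=';' SUBSEP=';' "$src"/*.txt

cat "$dst/hd1.txt" "$dst/hd2.txt"
```

Writing the results to a directory other than the source one also keeps a later run of `"$src"/*.txt` from picking up the generated hdN.txt files as input.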