The cd error occurs because you use $DIR unquoted, so the shell splits its value at the space into two words. Put it in double quotes: "$DIR". But I don't think you need cd at all.
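To see the word splitting in isolation, here is a minimal sketch. The directory name "my compound" is made up for the demonstration; everything happens under a throwaway temp directory.

```shell
#!/bin/sh
# Hypothetical directory name containing a space, created under a temp dir.
base=$(mktemp -d)
DIR="$base/my compound"
mkdir -p "$DIR"

# Unquoted: the shell splits $DIR at the space, so cd no longer gets the
# real path as a single argument. Run in a subshell; this is expected to fail.
if (cd $DIR) 2>/dev/null; then unquoted=ok; else unquoted=failed; fi

# Quoted: cd receives the whole path as one argument.
if (cd "$DIR") 2>/dev/null; then quoted=ok; else quoted=failed; fi

echo "unquoted: $unquoted, quoted: $quoted"
rm -rf "$base"
```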
It would be easier to search for your log files and read them directly than to find directories and cd into them.
Please show a sample of your input data, too. Without that, I'm just guessing wildly. You should do either the whole thing in awk or none of it; starting awk to process a single line is pointlessly wasteful.
# create a file, /tmp/$$ where $$ is this script's process id.
# it will have lines like /path/to/input-s.out final-energy
find . -name 'input-s.out' | while IFS= read -r FILENAME
do
awk '/HURRAY/ { opt=1 } ; opt && /^FINAL SINGLE POINT ENERGY/ { print FILENAME, $5 }' OFS="\t" "$FILENAME"
done > /tmp/$$
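while IFS=$'\t' read -r FILE ENERGY
do
# Do whatever you want to do with this energy data.
# A loop body must not be empty, so print the pair as a placeholder.
printf '%s\t%s\n' "$FILE" "$ENERGY"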
done < /tmp/$$
rm -f /tmp/$$
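If you want the "whole thing in awk" variant, something along these lines might do. It is only a sketch: the sample file is made up so the snippet can run anywhere, and it assumes the energy is the fifth field of the "FINAL SINGLE POINT ENERGY" line, as in the loop above.

```shell
#!/bin/sh
# Made-up sample data so the sketch can be run anywhere.
tmp=$(mktemp -d)
mkdir "$tmp/run1"
printf '%s\n' \
    'FINAL SINGLE POINT ENERGY      -1.000' \
    '*** OPTIMIZATION CONVERGED, HURRAY ***' \
    'FINAL SINGLE POINT ENERGY      -2.500' > "$tmp/run1/input-s.out"

# find hands all matching files to awk in one go ({} +),
# so no temp file and no per-file awk start-up is needed.
result=$(cd "$tmp" && find . -name 'input-s.out' -exec awk '
    FNR == 1 { opt = 0 }      # new file: forget the flag from the previous one
    /HURRAY/ { opt = 1 }      # only trust energies after this marker
    opt && /^FINAL SINGLE POINT ENERGY/ { print FILENAME, $5 }
' OFS='\t' {} +)

printf '%s\n' "$result"
rm -rf "$tmp"
```

The FNR == 1 rule matters once awk reads more than one file per run; without it, a HURRAY in one file would still count for the next.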
OK, there's something I should explain.
I'll try to lay it out logically.
Inside my home folder there are subfolders, and inside each subfolder there are log files.
Each subfolder is named after the formula of a compound. Each compound has four log files in its folder, and their names all start with the compound's formula, the same as the subfolder.
an example is:
CHSiH3 (subfolder) -> CHSiH3-t.out, CHSiH3-s.out, CHSiH3-CC-s.out, CHSiH3-CC-t.out
So, as you said, simply searching from the current directory should find four log files for each subdirectory, all with the .out extension.
I want to classify the energies based on the names of the log files:
t -> triplet; s (or anything that isn't t) -> singlet; CC -> coupled cluster; no CC -> DFT
so that in the end I have tabular data, like a CSV file:
Compound State Method Energy
CHSiH3 triplet CC 10.3243
CHSiH3 singlet DFT 9.9498
....
Do you think you can help me with this? I feel a bit lost here.
So, even though there are four per folder, they can all be processed individually, as long as their names are taken into account? Or do you need them grouped in fours?
I am also feeling a bit lost, because knowing there are four files per folder doesn't explain what you want done with them all.
I just want to read the energies from them and make a table like the one I showed in my former post.
I can only classify the energies in that table if I have the file names.
The file names all differ, so the files can only be found by their .out extension.
Specifications are a difficult thing to write; wild guessing led me to offer this as a zeroth approximation for the four files in one directory as posted above:
awk '
BEGIN {print "Compound\tState\tMethod\tEnergy"}
FNR==1 {n=split(FILENAME, T, "-")    # once per file, not once per input line
printf "%s\t%s\t%s\n", T[1], substr(T[n],1,1)=="t"?"triplet":"singlet", T[2]=="CC"?"CC":"DFT"
}
' *.out
Compound State Method Energy
CHSiH3 singlet CC
CHSiH3 triplet CC
CHSiH3 singlet DFT
CHSiH3 triplet DFT
I'm in no position to fill in the energy column, as the sample input file you posted has 97 occurrences of the word "energy" in it, and even the "FINAL SINGLE POINT ENERGY" phrase that you try to match in your code snippet appears with 6 different values in that file.
The energy that needs to go there is the "FINAL SINGLE POINT ENERGY" that appears after the keyword "Hurray" shows up.
I had written code in Mathematica to deal with it, but shell is so different and I'm mixed up!
awk: syntax error at source line 4
context is
printf "%s\t%s\t%s\n", T[1], >>> substr(T[n],1,1)== <<<
awk: illegal statement at source line 4
awk: illegal statement at source line 4
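That syntax error looks like the kind some awk implementations raise for an unparenthesized comparison inside a printf argument list (which awk is in use here is a guess on my part); wrapping each conditional expression in parentheses is the usual workaround. Below is a minimal sketch that does so and also fills in the energy column. It assumes names of the form FORMULA[-CC]-s.out or FORMULA[-CC]-t.out, a formula without "-" in it, and that the wanted value is the last "FINAL SINGLE POINT ENERGY" after "HURRAY". The function names make_table, write_sample and emit are mine, and the sample files and their energies are made up for the demonstration.

```shell
#!/bin/sh
# Sketch: build the Compound/State/Method/Energy table from */*.out.
make_table() {
    awk '
    BEGIN { OFS = "\t"; print "Compound", "State", "Method", "Energy" }
    FNR == 1 { emit(); file = FILENAME; opt = 0; energy = "" }
    /HURRAY/ { opt = 1 }                       # converged from here on
    opt && /^FINAL SINGLE POINT ENERGY/ { energy = $5 }   # keep the last one
    END { emit() }

    # Print one table row for the file that was just finished.
    function emit(   base, T, n, state, method) {
        if (file == "") return                 # nothing read yet
        base = file
        sub(/.*\//, "", base)                  # drop the directory part
        sub(/\.out$/, "", base)                # drop the extension
        n = split(base, T, "-")
        state  = (T[n] == "t")  ? "triplet" : "singlet"
        method = (T[2] == "CC") ? "CC" : "DFT"
        print T[1], state, method, energy
    }
    ' "$@"
}

# Made-up sample file: $1 = path, $2 = energy reported after convergence.
write_sample() {
    printf 'FINAL SINGLE POINT ENERGY   -0.1\n*** HURRAY ***\nFINAL SINGLE POINT ENERGY   %s\n' "$2" > "$1"
}

tmp=$(mktemp -d)
mkdir "$tmp/CHSiH3"
write_sample "$tmp/CHSiH3/CHSiH3-s.out"    -1.0
write_sample "$tmp/CHSiH3/CHSiH3-t.out"    -2.0
write_sample "$tmp/CHSiH3/CHSiH3-CC-s.out" -3.0
write_sample "$tmp/CHSiH3/CHSiH3-CC-t.out" -4.0

table=$(cd "$tmp" && make_table */*.out)
printf '%s\n' "$table"
rm -rf "$tmp"
```

On the real data, run make_table */*.out from the folder that holds the compound subfolders. A file that never reaches HURRAY gets an empty energy column, which at least makes it easy to spot.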