Constructing a Matrix

Hi,

I have a number of files in a folder. Their names all follow the same pattern:

ahet_005678.txt
ahet_005898.txt
ahet_007678.txt
ahet_004778.txt
...
...
ahet_002378.txt

Each of these files has the same layout: four columns, with a header that names only the last three.


vi ahet_005678.txt

  A1 A2 A3
1 0.5 0 0
2 0 1 0.571
3 0.010 6 3.333
4 0 0 0
5 0.2 0 0
6 0 0 0
7 0.00527 1 1.667
8 0 0 0
9 0 0 0


vi ahet_005898.txt

  A1 A2 A3
1 0 0 0
2 0.5 1 0.571
3 0.010 6 3.333
4 0.34 0 0
5 0 0 0
6 0.09 0 0
7 0.00527 1 1.667
8 0 0 0
9 0 0 0

vi ahet_007678.txt

  A1 A2 A3
1 0.9 0 0
2 0.5 1 0.571
3 0.010 6 3.333
4 0 0 0
5 0.67 0 0
6 0 0 0
7 0.00527 1 1.667
8 0 0 0
9 0.45 0 0


I would like to build a single output file from the contents of all the files plus part of each file name. Specifically, for each file I want to extract the numeric part of the name after the '_' (for example, 005898 from ahet_005898.txt) together with the 2nd column of that file, and paste them horizontally as one row of the result file. The desired output file is in matrix format:

005678 0.5 0 0.010 0 0.2 0 0.00527 0 0
005898 0 0.5 0.010 0.34 0 0.09 0.00527 0 0
007678 0.9 0.5 0.010 0 0.67 0 0.00527 0 0.45
....
.....
....

Please let me know the best way in awk to extract part of each file name along with the second column, and to build a matrix from all the files in a directory.

What have you tried?

I tried the following in sed

sed -e :a -e '{N; s/\n/ /g; ta}' infile

But it only selects the first column. I am not sure how to extract the file name and print the second column horizontally for the thousands of files I have in the directory.

An awk approach

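awk 'FNR==1 { if (NR>1) printf "\n"; g=FILENAME; gsub(/.*_|\.txt$/,"",g); printf "%s", g }  # new file: start a row with the ID
     NF>3   { printf " %s", $2 }   # data rows have 4 fields, the header only 3
     END    { printf "\n" }' ahet_00*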

--ahamed
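
For anyone less used to awk: the file-name handling above just deletes everything up to the last underscore, plus the .txt suffix, and keeps what is left. A rough Python equivalent of that one step might look like this (file_id is a made-up helper name, not something from the awk answer):

```python
import re

def file_id(path):
    """Return the part of the name between the last '_' and the '.txt' suffix."""
    # Same idea as the awk gsub: remove "everything up to the last underscore"
    # and a trailing ".txt", leaving only the numeric ID.
    return re.sub(r'.*_|\.txt$', '', path)

print(file_id('ahet_005898.txt'))  # 005898
```

Because the ID is kept as a string, the leading zeros survive.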

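import glob

# 'leo' is the directory holding the .txt files; change it to suit
for path in sorted(glob.glob('leo/*.txt')):
    # numeric part of the name, between the last '_' and the extension
    name = path[path.rindex('_') + 1:path.rindex('.')]
    row = [name]
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) == 4:  # data rows have 4 fields; the header has 3
                row.append(fields[1])
    print(' '.join(row))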
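
If you would rather have the rows written to a result file than printed, the loop above can be wrapped in a couple of functions. This is only a sketch under some assumptions: build_matrix and write_matrix are names I made up, the default pattern 'ahet_*.txt' is taken from the file names in the question, data rows are assumed to have exactly 4 whitespace-separated fields, and rows come out sorted by file name.

```python
import glob
import os

def build_matrix(directory, pattern='ahet_*.txt'):
    """Return one 'ID v1 v2 ...' string per matching file, sorted by file name."""
    rows = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        base = os.path.basename(path)
        # numeric part of the name, between the last '_' and the extension
        file_id = base[base.rindex('_') + 1:base.rindex('.')]
        values = []
        with open(path) as fh:
            for line in fh:
                fields = line.split()
                if len(fields) == 4:      # data row; the 3-field header is skipped
                    values.append(fields[1])
        rows.append(' '.join([file_id] + values))
    return rows

def write_matrix(directory, out_path):
    """Write the rows from build_matrix to out_path, one per line."""
    with open(out_path, 'w') as out:
        for row in build_matrix(directory):
            out.write(row + '\n')
```

Calling write_matrix('.', 'result.txt') from the data directory should then produce the matrix file; result.txt does not match the 'ahet_*.txt' pattern, so it is not picked up as input on a second run.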