Content of filtered files write as columns into one file

Hi everybody!

I have a lot of files from which I filter out data.

#!/bin/bash
f=sample_*.Spe
for i in $f `eval echo ls sample_*.Spe`
do
        if test -f "$i" 
        then
awk 'FNR==8 ||FNR==10 || (FNR>=13 && FNR<=268) {print $1}' $i > test$i.txt
paste test$i.txt test_f.txt > test_f.txt
        fi
done

What I have: several files from which I filter out the first column.

What I want: the filtered columns written into one file, as a table:

001          002          003          ...
05/13/2015   05/13/2015   05/13/2015
600          600          600
4            8            1
5            6            4
6            4            2
7            2            3
.            .            .
.            .            .
.            .            .

As you can see in the example above, I tried to do it with paste, but that does not work properly. Moreover, I want a part of the filename at the top of every column:

echo $i | cut -c 8-10

I don't have experience with awk arrays; can someone help me solve this with a reliably working script?

thanks in advance,
IMPe

There are enough strange constructs in your sample shell script that I am not at all sure what you are trying to do, what your input files look like, or what your output file is supposed to look like. As written, the script is just a very expensive way of copying the last regular file it processes to the file named test_f.txt.

Please show us three short sample input files (including their full filenames; not just the pattern that matches them), and show us the exact output you want to produce from those three sample input files. And, it is always helpful to know what operating system (including its release/version information) you're using and the version of the shell you're using.
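[Editor's note: a minimal sketch, with made-up two-line files rather than the real sample data, of the pitfall described above: the shell opens and truncates the redirection target before paste ever reads it, so redirecting paste's output to one of its own inputs destroys the accumulated file.]

```shell
# Demonstration with made-up files (not the real sample_*.Spe data):
# redirecting paste's output to one of its own input files truncates
# that input before paste runs, losing the accumulated columns.
cd "$(mktemp -d)"
printf '1\n2\n' > col.txt        # newly filtered column
printf 'a\nb\n' > acc.txt        # columns accumulated so far
paste col.txt acc.txt > acc.txt  # the shell truncates acc.txt first
cat acc.txt                      # only col.txt's data survives
```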


Hi Don!

Sorry for not being precise!
OS: Linux ass210 3.16.7-13-desktop #1 SMP PREEMPT Wed Mar 18 17:31:15 UTC 2015 (ba2afab) x86_64 x86_64 x86_64 GNU/Linux
Shell: GNU bash, version 4.2.53(1)-release (x86_64-suse-linux-gnu)
Input file: sample_001.Spe. There are up to 40 files with the structure below and an ascending number after the underscore (sample_001.Spe, sample_002.Spe ... sample_040.Spe).
Structure of the input file: I haven't included all 255 data lines; I have shortened the main part, but all of them are numbers. Moreover, I have marked the parts important to me in blue.

$SPEC_ID:
No sample description was entered.
$SPEC_REM:
DET# 1
DETDESC# GBS178 Model 927 SN 9244437 Input 1
AP# Maestro Version 7.01
$DATE_MEA:
05/13/2015 09:21:27
$MEAS_TIM:
600 600
$DATA:
0 255
       0
       0
       0
       0
       0
.
.
.
      1
       0
       0
       0
       0
       0
       0
       0
       0
       0
       0
$ROI:
1
57 105
$PRESETS:
Live Time
600
0
$ENER_FIT:
-1.172989 0.081834
$MCA_CAL:
3
-1.172989E+000 8.183417E-002 0.000000E+000 keV
$SHAPE_CAL:
3
1.428736E+001 0.000000E+000 0.000000E+000

I want to extract the interesting data from each file using

f=sample_*.Spe
for i in $f `eval echo ls sample_*.Spe`
do
        if test -f "$i" 
        then
awk 'FNR==8 ||FNR==10 || (FNR>=13 && FNR<=268) {print $1}' $i > test$i.txt
...

and put the extracted columns (i.e., 40 columns from 40 files) into one file.
Furthermore, the output file should be in CSV format so that it can be imported into MS Excel. It should look like the following example; the three dots in every column and in every line represent the vertical and horizontal ellipses:

001           002           003          ...  040
05/13/2015    05/13/2015    05/13/2015   ...  05/13/2015
600           600           600          ...  600
4             8             1            ...  2
5             6             4            ...  3
6             4             2            ...  5
7             2             3            ...  7
.             .             .            ...  .
.             .             .            ...  .
.             .             .            ...  .
3             4             5            ...  9                 

Can you please help me?

Thanks in advance,
IMPe

You might want to try this instead:

#!/bin/bash
awk '
FNR == 1 {
	close(fn)
	seq = substr(FILENAME, 8, 3)
	fn = "test" FILENAME ".txt"
	printf("%-13s\n", seq) > fn
}
FNR == 8 || FNR == 10 || (FNR >= 13 && FNR <= 268) {
	printf("%-13s\n", $0) > fn
}' sample_*.Spe
paste -d ' ' testsample_*.Spe.txt > test_f.txt

It invokes awk once instead of 80 times, creates each of your testsample_*.Spe.txt files once instead of twice, invokes paste once instead of 80 times, and doesn't redirect the output of paste to one of its input files (which wiped out most of the work you were trying to do).

This assumes that your data doesn't have any fields wider than 13 characters and that you want data left aligned in each column (which matches the first two columns of output in your second sample output file).
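[Editor's note: since the stated goal was a CSV file for Excel, a comma delimiter could be used in the final paste instead. This is a sketch with made-up stand-in files, not the real generated output; the fixed-width %-13s padding in the awk printf calls would then best be dropped so the fields carry no trailing blanks.]

```shell
# Hypothetical CSV variant, shown with stand-ins for the
# testsample_*.Spe.txt column files the awk step would generate.
cd "$(mktemp -d)"
printf '001\n05/13/2015\n600\n' > testsample_001.Spe.txt
printf '002\n05/13/2015\n600\n' > testsample_002.Spe.txt
paste -d ',' testsample_*.Spe.txt > test_f.csv
head -n 1 test_f.csv             # → 001,002
```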


Hi Don,

Thanks a lot for solving my problem.
Unfortunately, I don't understand your
script completely, but I can work with it.
It gives me what I want, properly and efficiently!

Thanks a lot!!
IMPe

You're welcome.
I'm glad it is working for you.
What don't you understand about how it works?