I have results from some statistical analyses.
The format of each results file is shown below:
- I want to select the lines that have a p-value (last column) less than 0.05 from all the results files (*.result) and concatenate them into a new results file.
- It would be very nice if a new column were added giving the name of the original results file for each line.
awk is preferred, but Python is also fine.
For example, Stat1.result:
Bin ID ZIP P
3 Individual-1813375 11972 1.99E-05
3 Individual-4681817 58156 0.0001712
2 Individual-13020362 23877 0.0006332
1 Individual-17226184 20192 0.003108
4 Individual-17037125 84105 0.008756
3 Individual-4680035 15428 0.01189
4 Individual-759458 46333 0.0283
1 Individual-2762682 29182 0.03233
4 Individual-11099561 23056 0.03826
3 Individual-7551560 13650 0.0576
2 Individual-1036543 65098 0.0579
2 Individual-235385 13339 0.05882
2 Individual-1430261 12487 0.08075
3 Individual-4677602 71462 0.09687
1 Individual-9398631 11902 0.1085
1 Individual-3767635 16008 0.1127
3 Individual-11733459 64659 0.1555
2 Individual-1867856 32616 0.1628
1 Individual-11581364 16013 0.1708
The result file should look like this:
File Bin ID ZIP P
Stat1.result 3 Individual-1813375 11972 1.99E-05
Stat1.result 3 Individual-4681817 58156 0.0001712
Stat1.result 2 Individual-13020362 23877 0.0006332
Stat1.result 1 Individual-17226184 20192 0.003108
Stat1.result 4 Individual-17037125 84105 0.008756
Stat1.result 3 Individual-4680035 15428 0.01189
Stat1.result 4 Individual-759458 46333 0.0283
Stat1.result 1 Individual-2762682 29182 0.03233
Stat2.result 4 Individual-2985340 29244 0.016565
Stat3.result 3 Individual-10177001 19173 0.0466
Stat6.result 2 Individual-1036543 65098 0.0479
Thank you for your help.
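
For reference, here is a minimal Python sketch of the filtering described above. It assumes every file is whitespace-delimited, starts with a single header line, and has the p-value in the last column. The function name filter_significant, its alpha parameter, and the output file name used in the note below are hypothetical and chosen only for illustration.

```python
import os


def filter_significant(paths, out_path, alpha=0.05):
    """Copy rows whose last column (P) is below alpha into out_path,
    prefixing each row with the name of the file it came from.
    Returns the number of rows written (header excluded)."""
    kept = 0
    header_written = False
    with open(out_path, "w") as out:
        for path in sorted(paths):
            name = os.path.basename(path)
            with open(path) as fh:
                # Assumption: the first line of every file is the header.
                header = fh.readline().split()
                if header and not header_written:
                    out.write(" ".join(["File"] + header) + "\n")
                    header_written = True
                for line in fh:
                    fields = line.split()
                    if not fields:
                        continue  # skip blank lines
                    try:
                        p = float(fields[-1])  # handles 1.99E-05 style values
                    except ValueError:
                        continue  # skip rows with a non-numeric p-value
                    if p < alpha:
                        out.write(" ".join([name] + fields) + "\n")
                        kept += 1
    return kept
```

It could be called as filter_significant(glob.glob("*.result"), "significant.txt") after importing glob. The output name should not match the input pattern, otherwise a second run would read its own output as input.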