action over multiple files in a directory. It is a simple action: I want to print the first two columns (i.e. $1 and $2) of each tab-separated document and write the result to a new file, *.pp.
This is the awk that I have come up with so far, which is not giving me a result. Can someone help me identify the error?
awk FNR == 1 {if (o)close(o) o = FILENAME sub(/\*/, ".pp", o)} NR % $2,$1 {print > }
Could you please try the following and let me know if this helps?
for i in *.pp
do
awk '{print $1 OFS $2 >> "new_input_file"}' OFS="\t" "$i"
done
OR
awk '{print $1 OFS $2 >> "new_output_file.txt";close(FILENAME)}' OFS="\t" *.pp
I haven't tested it, though; let me know if you have any questions about it.
I have tried it and unfortunately it does not help: it seems to output all of the files into one file called *.pp, rather than into individual files with the suffix .pp.
Let me describe the data a bit more. The files are tab-separated, but each column contains a line of text, spaces included.
For instance, the information in the files would look something like this:
File1:
I love you man THIS IS GREAT NEWS 5 www.url.com
File2:
I love you girl THIS IS AWESOME NEWS 6 www.url.org
File3:
I love you son THIS IS BAD NEWS 7 www.url.co.uk
I need to print just the first two columns to individual output files, so the output would be:
File1.pp
I love you man THIS IS GREAT NEWS
File2.pp
I love you girl THIS IS AWESOME NEWS
File3.pp
I love you son THIS IS BAD NEWS
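The requirement described above (one .pp file per input, holding only the first two tab-separated columns) could be sketched roughly as follows. This is only a sketch under some assumptions: the inputs are named File1 and File2 as in the example, the output name is simply the input name with .pp appended, and the columns are separated by real tab characters. The scratch directory and the sample lines are there just to make the example self-contained.

```shell
# Work in a scratch directory so the demo does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample inputs modelled on the example above; columns are separated by tabs.
printf 'I love you man\tTHIS IS GREAT NEWS\t5\twww.url.com\n'  > File1
printf 'I love you girl\tTHIS IS AWESOME NEWS\t6\twww.url.org\n' > File2

# -F'\t' splits on tabs only, so the spaces inside a column are preserved.
# On the first line of each input (FNR == 1), close the previous output
# file and derive the new output name from FILENAME (assumed naming scheme).
awk -F'\t' -v OFS='\t' '
    FNR == 1 { if (out) close(out); out = FILENAME ".pp" }
    { print $1, $2 > out }
' File1 File2

# Show one of the results.
cat File1.pp
```

If a glob is used instead of explicit file names, it should not match the generated .pp files themselves, otherwise a second run would also process the earlier outputs.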
When I need to quickly extract information from a column, I usually also define the separators when querying the document: