awk for loop help

Rabu · January 23, 2015, 1:45pm

Hello,
I have an input file that looks like so:

 1 2 3
4 5 6
7 8 9

and I just want to print the first and third columns. (Note: my actual file contains many more fields, so I don't want to type '{ print $NF }' for each field I want.)

I tried using:

awk 'BEGIN {FS=" "} { for (i=1; i<=NF; i++) if (i<2 || i>2) print $i }' input.txt

but I do not get the output I want.

Any help is appreciated. Thank you.

Scott · January 23, 2015, 1:56pm

Hello.

Me no understand.

If you want the first and third columns, why not use $1 and $3, instead of a for-loop?

$ echo "1 2 3
> 4 5 6
> 7 8 9" | awk '{print $1, $3}'
1 3
4 6
7 9
$

There's a bunch of other stuff to consider. Such as, the "input" you show starts with a leading space (presumably it's a space), and you set the field separator to a space, which means "1" would be the second field. Also, if you want the 1st and 3rd fields, $NF has no role to play. Also, you say the input file "looks like this", then proceed to claim that actually, it doesn't, etc.

Rabu · January 23, 2015, 2:15pm

Hi,
What I meant was that my actual file has the same format, but is much larger. It contains many more fields. I just put in three for the sake of simplicity.

There should be no space at the beginning, it should look like:

1 2 3
4 5 6
7 8 9

Again, my actual file contains many fields that are space separated, not just three. That's why I wanted to use the for loop.

Scott · January 23, 2015, 2:17pm

Hi.

That's fine.

But it's the same answer: if you want to extract specific fields, just use the field numbers. There's no need for a loop.

awk '{ print $1, $3 }' file

The tab and the space are the default field separators, so there is no need to set FS if your file is space-separated (doing so (FS=" ") would mean every space adds a new field).

edit: Actually, the parenthetical above (that FS=" " would mean every space adds a new field) is not true, at least in GNU Awk:

(two spaces between A, B and C...)
$ echo "A  B  C" | awk -F" " '{print $1, $2}'
A B
$ echo "A  B  C" | cut -d" " -f1,2
A

RudiC · January 23, 2015, 2:55pm

Try this as a starting point:

awk 'NR==1  {MX=split(COLS,C)} {for (i=1; i<=MX; i++) printf "%s ", $(C[i]); printf "\n"}' COLS="1 3" file
1 3 
4 6 
7 9

Don_Cragun · January 23, 2015, 3:51pm

As with many of Rabu's threads, the problem specification isn't really clear.

Scott is absolutely correct that a for loop is not needed with the specified problem (and would be inefficient if the goal really is to always print the 1st and 3rd fields from every input line).

But, if the real goal is to print every field except the 2nd field, then a for loop makes sense. However, print always ends its output with a newline, so using print instead of printf (without a \n in the format string) would put each printed field on a separate line instead of producing the desired single line of output.
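
To make that difference concrete, here is a small sketch using the three-field sample line from the first post. The variable names (line, with_print, with_printf) are made up for the illustration, and the printf version leaves a trailing space before the newline.

```shell
line='1 2 3'

# print appends a newline (ORS) after every call,
# so each selected field ends up on its own line
with_print=$(echo "$line" | awk '{ for (i = 1; i <= NF; i++) if (i != 2) print $i }')

# printf writes only what the format string asks for,
# so the selected fields stay on one line; the newline is added once at the end
with_printf=$(echo "$line" | awk '{ for (i = 1; i <= NF; i++) if (i != 2) printf "%s ", $i; printf "\n" }')

echo "$with_print"
echo "$with_printf"
```

The first echo shows 1 and 3 on two separate lines; the second shows them together on one line.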

Scott,
The latest update to the POSIX standards says this about a single character ERE being used for FS:

Note that this is why FS="." and FS="*" work without needing FS="[.]" or FS="\*" when the goal is to use a period or an asterisk as the field separator, with no need to worry about it being interpreted as a metacharacter. But if you want both the period and the asterisk to act as field separators, you need something like FS="[.]|[*]" instead of just FS=".|*".
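
A quick sketch of both cases, using made-up sample strings (a.b.c and a.b*c) and made-up variable names (dot, both):

```shell
# A single-character FS (other than a space) is taken literally,
# even though . is an ERE metacharacter
dot=$(echo 'a.b.c' | awk 'BEGIN { FS = "." } { print $2 }')

# A longer FS is treated as an ERE, so the literal characters
# go in bracket expressions
both=$(echo 'a.b*c' | awk 'BEGIN { FS = "[.]|[*]" } { print $1, $2, $3 }')

echo "$dot"
echo "$both"
```

The first command prints b; the second prints a b c.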

Rabu,
Since {print $1,$3} does exactly what you have requested, please clearly explain why you don't want to use it (or clarify what you are trying to do so we understand why this code is not the best solution for your problem).

Scott · January 23, 2015, 4:02pm

Don Cragun, you sir, are awesome.

Rabu · January 23, 2015, 4:13pm

Hi Don and Scott,

I apologize for not clearly stating the problem. Say my actual input file contained 14 fields, and I wanted to print all fields except 6 and 7. Wouldn't a for loop be better suited to printing the fields I want than typing each individual field number?

I am still quite new to bash and awk, so please forgive any lapses in clarification.

Thanks to both of you.

Don_Cragun · January 23, 2015, 4:46pm

Untested (since you didn't provide any sample input and output), but pretty simple:

awk '
{	for(i = 1; i < 6; i++)
		printf("%s ", $i)
	for(i = 8; i < 14; i++)
		printf("%s ", $i)
	print $14
}' file

or:

awk '
{	for(i = 1; i <= NF; i++)
		if(i != 6 && i != 7)
			printf("%s%s", $i, (i == NF) ? "\n" : " ")
}' file

RudiC · January 23, 2015, 5:04pm

Try

awk 'NR==1  {MX=split(COLS,C)} {for (i=1; i<=MX; i++) $(C[i])=""; $0=$0; $1=$1}1' COLS="6 7" file

Scrutinizer · January 24, 2015, 2:39am

Or perhaps just:

awk '{$6=$7=x; $0=$0; $1=$1}1'  file

ken6503 · January 25, 2015, 6:09pm

Hi RudiC,

the code works.
one thing I don't understand is:

$0=$0; $1=$1

I know

$1=$1

rebuilds the line with the default single-space separator.
But what does "$0=$0; $1=$1" mean?

RudiC · January 26, 2015, 3:59am

man awk :
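
As a rough illustration of what the two assignments do, here is a sketch on a made-up four-field line (the variable names are only for the example). Emptying a field leaves an empty slot and a doubled space; $0=$0 makes awk split the record again, which drops the empty field from the count; $1=$1 then makes awk rebuild $0 with single OFS separators.

```shell
line='a b c d'

# Emptying $2 keeps four fields, one of them empty, so two spaces remain
blanked=$(echo "$line" | awk '{ $2 = ""; print NF ": " $0 }')

# $0=$0 re-splits the record: the empty field disappears from NF,
# but the text of $0 still has the doubled space
resplit=$(echo "$line" | awk '{ $2 = ""; $0 = $0; print NF ": " $0 }')

# $1=$1 forces $0 to be rebuilt from the remaining fields, joined by OFS
rebuilt=$(echo "$line" | awk '{ $2 = ""; $0 = $0; $1 = $1; print NF ": " $0 }')

echo "$blanked"
echo "$resplit"
echo "$rebuilt"
```

The three lines show the field count dropping from 4 to 3 and, in the last step, the doubled space collapsing to one.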

ken6503 · January 26, 2015, 10:01am

Thanks RudiC.