Assign number of records to a variable

Geneanalyst · January 22, 2018, 10:31am

How does one set a variable, x, to the number of records in a different file?

I have a simple command such as below:

awk -F "\t" '(NR>5) { if(($x == "0/0")) { print $0} }' a.txt > a1.txt

but I want x to equal the number of records in a different file, b.txt

RudiC · January 22, 2018, 10:46am

How do you define and determine the "number of records in a different file"? How can that number equal "0/0"?

rdrtx1 · January 22, 2018, 10:53am

try something like:

awk -F "\t" 'NR>5 {if (x==8) print $0}' x=$(wc -l b.txt) a.txt

Note: x will be a number

Don_Cragun · January 22, 2018, 11:25am

Come on RudiC and rdrtx1,
In awk, $x is the contents of the field whose number is stored in x, and that field might or might not be a number.

Note that the output of the command wc -l b.txt will be something like:

    3465 b.txt

if the file b.txt contains 3465 lines (AKA records) of text. With that, the command:

awk -F "\t" 'NR>5 {if (x==8) print $0}' x=$(wc -l b.txt) a.txt

would invoke awk with the arguments:

awk -F "\t" 'NR>5 {if (x==8) print $0}' x= 3465 b.txt a.txt

Shell variable assignments only happen in shell command line processing when they occur at the start of a command line. In a shell variable assignment, the leading spaces in the command substitution would be assigned to the variable, but when it appears as a parameter on a command-line after the utility name, the spaces before and after the line count in the output from wc act as field separators.
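The word splitting described above can be observed without awk. This is only a minimal sketch, assuming a POSIX shell; the temporary file and the names nargs and first are made up for illustration. Note that some wc implementations (GNU, for example) do not pad the count with leading spaces, so the number of resulting words can differ between platforms.

```shell
#!/bin/sh
# Hypothetical sample file with 3 lines, standing in for b.txt.
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

# Unquoted command substitution in argument position: the shell splits
# the wc output on whitespace, so "x=" may end up as a word of its own
# and the file name always becomes a separate word.
set -- x=$(wc -l "$tmp")
nargs=$#

# Reading from stdin keeps the file name out of the wc output, and the
# arithmetic expansion strips any padding, leaving a single word.
set -- x=$(( $(wc -l < "$tmp") + 0 ))
first=$1

echo "unquoted form: $nargs words; stdin form: $first"
rm -f "$tmp"
```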

Hi Geneanalyst,
One might guess that one of the following might do what you requested.

awk -F '\t' -v x=$(( $(wc -l < b.txt) + 0)) '(NR>5) { if(($x == "0/0")) { print $0} }' a.txt > a1.txt

or:

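awk -F '\t' '
# First file (b.txt): count its records in x.
FNR == NR {
	x++
	next
}
# Second file (a.txt): skip its first 5 records, then test field number x.
FNR > 5 {
	if(($x == "0/0")) { print }
}' b.txt a.txt > a1.txt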

or:

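awk -F '\t' '
# First file (b.txt): count its records in x.
FNR == NR {
	x++
	next
}
# Second file (a.txt): print records after the fifth whose field x is "0/0".
FNR > 5 && $x == "0/0"' b.txt a.txt > a1.txt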

One might also guess that you don't really want the awk variable x to be the number of records in b.txt , but instead want it to be the number of fields in that file. If that is what you want, it is even more important than in many other cases to know what operating system and shell you're using because some versions of awk have a nextfile command and others don't.
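If the field count is what is wanted, one way to stay clear of nextfile is to look only at the first record of b.txt and skip the rest with next. The sketch below assumes that the first record of b.txt is representative of the whole file; the sample data is invented for illustration.

```shell
#!/bin/sh
# Hypothetical stand-ins for b.txt and a.txt.
b=$(mktemp); a=$(mktemp)
printf 'c1\tc2\tc3\nd1\td2\td3\n' > "$b"   # 3 fields per record
{
  printf 'h\nh\nh\nh\nh\n'                 # 5 records to skip
  printf 'r1\tA\t0/0\n'
  printf 'r2\tA\t1/1\n'
} > "$a"

out=$(awk -F '\t' '
# First file: remember the field count of its first record only.
FNR == NR { if (FNR == 1) x = NF; next }
# Second file: skip 5 records, then test field number x.
FNR > 5 && $x == "0/0"' "$b" "$a")
echo "$out"
rm -f "$a" "$b"
```

This still reads every record of the first file, which nextfile would avoid, but it behaves the same in awk versions with and without that command.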

And, as always, sample input files and desired output would help us help you.

Geneanalyst · January 22, 2018, 11:44am

Thanks everyone,

Just to clarify, if b.txt has 300 lines, then I would like x = 300.

EDIT: Also, in some cases I would like to set x=number of lines in b.txt + 9
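For the "+ 9" case, the offset could go inside a shell arithmetic expansion before the value is handed to awk with -v. This is a sketch under that assumption, using invented temporary files in place of b.txt and a.txt:

```shell
#!/bin/sh
# Hypothetical stand-ins for b.txt and a.txt.
b=$(mktemp); a=$(mktemp)
printf '1\n2\n3\n' > "$b"        # 3 records
printf 'one line\n' > "$a"

# x = number of lines in the first file, plus a fixed offset of 9.
x=$(( $(wc -l < "$b") + 9 ))

# Pass the shell value to awk with -v; inside awk, x is the number
# and $x would be the field at that position.
result=$(awk -v x="$x" 'NR == 1 { print x }' "$a")
echo "$result"    # prints 12
rm -f "$a" "$b"
```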

rdrtx1 · January 22, 2018, 11:51am

Wrongo. In the above, x=$(wc -l b.txt) will set x to the line count, and b.txt will be read by the script. I'm not sure what the intent of the script is or whether b.txt needs to be read. If only the line count is needed, try something like:

awk '{print $0, x}' $(wc -l b.txt| read x y; echo x=$x) infile

or just set x prior like:

wc -l b.txt | read x y
awk '{print $0, x}' x=$x a.txt
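Whether wc -l b.txt | read x y leaves x set afterwards depends on the shell: ksh runs the last command of a pipeline in the current shell, while bash by default runs it in a subshell, so x is empty once the pipeline ends. A sketch that avoids the pipeline, assuming any POSIX shell and placeholder files:

```shell
#!/bin/sh
# Hypothetical stand-ins for b.txt and a.txt.
b=$(mktemp); a=$(mktemp)
printf 'p\nq\n' > "$b"           # 2 lines
printf 'row1\nrow2\n' > "$a"

# Capture the count directly; reading from stdin omits the file name,
# and the arithmetic expansion drops any padding around the number.
x=$(( $(wc -l < "$b") ))

# Operand-style assignment: awk sets x before it starts reading the file.
out=$(awk '{ print $0, x }' x="$x" "$a")
echo "$out"
rm -f "$a" "$b"
```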

Don_Cragun · January 22, 2018, 12:08pm

Sorry to disagree with you, but:

set -xv
awk -F'\t' '{print x}' x=$(wc -l b.txt) a.txt
+ wc -l b.txt
+ awk '-F\t' '{print x}' x= 3466 b.txt a.txt
awk: can't open file 3466
 source line number 1

As I said before, x is set to the empty string, and the number of lines counted by wc and the file processed by wc both become separate arguments to awk.

This was tested with both bash and ksh on macOS High Sierra version 10.13.2.

Don_Cragun · January 22, 2018, 12:13pm

Note that I corrected the last two awk suggestions in my post #4 after you posted the above (and didn't notice that you had responded until just now). Of course, this is still untested since you haven't supplied any sample data.

rdrtx1 · January 22, 2018, 12:19pm

We are UNIX people separated by different platforms.

Geneanalyst · January 22, 2018, 1:09pm

Hi Don,

It just occurred to me that x also happens to be the last column of a.txt. Perhaps it would be easier to assign x to be the last column number of a.txt. Any idea how to do that?

Don_Cragun · January 22, 2018, 1:19pm

Very simply:

awk -F '\t' 'NR > 5 && $NF == "0/0"' a.txt > a1.txt

Forget about x and just use the awk NF variable, which is the Number of Fields in the current input record.
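To see the effect on data, here is a small sketch with an invented tab-separated file in place of a.txt; the records deliberately have different numbers of fields to show that $NF follows the last one.

```shell
#!/bin/sh
a=$(mktemp)
# 5 header records followed by 3 data records (tab-separated).
{
  printf 'h1\nh2\nh3\nh4\nh5\n'
  printf 's1\t0/1\t0/0\n'
  printf 's2\t0/0\t1/1\n'
  printf 's3\t0/0\n'
} > "$a"

# $NF is the last field of the current record, whatever its position.
kept=$(awk -F '\t' 'NR > 5 && $NF == "0/0"' "$a")
echo "$kept"
rm -f "$a"
```

Only the s1 and s3 records are printed, since s2 ends in 1/1.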