I want to print all the rows that have a pattern in column 3.
Is it possible to read the patterns from a file in awk?
How can I do it in a shell script (without using awk)?
The idea is to read the pattern file completely before doing anything else.
But you're right, in this case A[$3] would never be set as there is no field three in the pattern file. But if that was to change, using next means you wouldn't have to worry about it later!
Can you explain how the two parts of NR == FNR { A[$1] = $1; next } A[$3] relate to each other?
I am quite confused about why A[$X] is used to print the rows that have a pattern in a specific column.
Thanks for your reply
First, realize that awk arrays are associative: they are indexed by string values, not by numeric positions. An array index may look like a number, but that is not how awk sees it.
The first clause sets up an array (A) indexed by values from the pattern file.
Indexes to A are cd003, cd005, etc... so A["cd003"] is a valid entry in the A array.
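A small sketch of what string indexing means in practice (the keys and values here are made up for illustration): "07" and 7 look like the same number, but awk stores them under two different string keys, so the array ends up with three elements.

```shell
# Illustrative keys only. A["07"] and A[7] become two separate
# elements because awk converts every subscript to a string.
count=$(awk 'BEGIN {
    A["cd003"] = 1
    A["07"] = "a"
    A[7] = "b"
    n = 0
    for (k in A) n++    # count the elements
    print n
}')
echo "$count"
```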
NR is the number of records awk has read so far, across all input files. It is 1 for the first record read.
FNR is the number of records read from the current file. It starts again at 1 for the first record of each new file.
Both are incremented when a record is read.
So, if NR is equal to FNR, then we are reading from the first file (pattern file) since the record counts are the same.
If NR is not equal to FNR, then we are reading from a subsequent file (i.e. data file).
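To see the two counters side by side, here is a minimal sketch. The file contents are made-up samples in the style of this thread, and the files live in a temporary directory:

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'cd003\ncd005\n' > "$tmp/pattern_file"
printf 'bca cd002 cd003 cza\nabc cd001 cd002 zca\nbac cd004 cd005 zac\n' > "$tmp/data_file"

# Print both counters for every record that awk reads.
counters=$(cd "$tmp" && awk '{ print FILENAME ": NR=" NR " FNR=" FNR }' pattern_file data_file)
printf '%s\n' "$counters"
rm -rf "$tmp"
```

NR and FNR agree on the two pattern_file lines; from the first data_file line onward NR keeps counting while FNR starts over at 1.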
The A[$3] clause (where $3 is the third field of a data-file line) says: if the entry holds a true value (e.g. A["cd003"], which is non-empty and non-zero), do the default action (print the line); otherwise ignore the line.
This clause is not executed on the pattern file because the next statement says "skip all following code, read the next line, and start processing clauses from the top". The 'next' statement adds to the robustness of the code.
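Putting the two clauses together, the script can be sketched with comments like this. The sample files are assumptions made up for illustration, not the original poster's data:

```shell
# Hypothetical sample files in a temporary directory.
tmp=$(mktemp -d)
printf 'cd003\ncd005\n' > "$tmp/pattern_file"
printf 'bca cd002 cd003 cza\nabc cd001 cd002 zca\nbac cd004 cd005 zac\n' > "$tmp/data_file"

matches=$(awk '
    # First file only: remember each pattern as an array key, then
    # skip the remaining clauses for this record.
    NR == FNR { A[$1] = $1; next }
    # Later files: print the line when A[$3] holds a true value.
    A[$3]
' "$tmp/pattern_file" "$tmp/data_file")
printf '%s\n' "$matches"
rm -rf "$tmp"
```

Only the two data lines whose third field is cd003 or cd005 are printed.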
I tried the commands below:
awk 'NR == FNR { A[$1] = $2; next } A[$3]'
awk 'NR == FNR { A[$1] = $3; next } A[$3]'
awk 'NR == FNR { A[$1] = $4; next } A[$3]'
All of them fail to produce my desired output.
So I'm interested in why it needs to be set like A[$1] = $1.
Thanks a lot.
I'm a new awk user ^^
But I like awk.
Scottn's script works fine based on your spec. What exactly are you trying to do?
Changing the value assigned to A[$1] will not change the program's behavior, as long as that value is non-empty and non-zero. If what you posted represents the command line you gave to the shell, then you forgot to specify the pattern and data files.
Hi,
Sorry, I missed some hints.
As you said, "If what you posted represents the command line you put to the shell, then you forgot to specify the pattern and data files"
How do I specify the pattern and data files?
pattern_file and data_file are just placeholders for the actual paths to your files.
For instance if /export/home/user/real.txt is the path to the file with the patterns and /export/home/user/thedata.txt is the path to the data file, then the command line would be:
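With those two paths, the command line would presumably be the script from earlier in the thread followed by the pattern file and then the data file. In the sketch below a temporary directory stands in for /export/home/user so that it can be run anywhere, and the file contents are made-up samples:

```shell
# Stand-in for /export/home/user. On the real system the command would be
#   awk 'NR == FNR { A[$1] = $1; next } A[$3]' /export/home/user/real.txt /export/home/user/thedata.txt
dir=$(mktemp -d)
printf 'cd003\ncd005\n' > "$dir/real.txt"                                 # sample patterns
printf 'bca cd002 cd003 cza\nabc cd001 cd002 zca\n' > "$dir/thedata.txt"  # sample data

# Pattern file first, data file second: the order matters.
out=$(awk 'NR == FNR { A[$1] = $1; next } A[$3]' "$dir/real.txt" "$dir/thedata.txt")
printf '%s\n' "$out"
rm -rf "$dir"
```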
A[$1] = $1 is used to set up the array of patterns. It runs when NR is equal to FNR, in other words while the first file (the pattern file) is being read. It means: set the element of the associative array A whose key is $1 to the value $1, where $1 is the first field of your pattern file. So with the OP's provided inputs it gets filled like so: A["cd003"] = "cd003", A["cd005"] = "cd005", and so on.
Once the pattern file is done, awk starts reading the input file. FNR is no longer equal to NR, so only the "A[$3]" part is executed for each line, which means: print the current line if field 3 is a key in the array with a true (non-empty, non-zero) value.
The script only tests whether the array element for the key holds a true value; it never uses the stored contents. So IMO the use of $1 is a tiny bit superfluous. I think we could also just set it to 1 instead:
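The variant with 1 could look like the sketch below. It also shows two related forms for comparison: the `in` operator, which tests for the key itself rather than for a true value, and the A[$1] = $2 form tried earlier in the thread. The sample files are made up, and the one-column pattern file is an assumption:

```shell
# Hypothetical sample files; the pattern file is assumed to have one column.
tmp=$(mktemp -d)
printf 'cd003\ncd005\n' > "$tmp/pattern_file"
printf 'bca cd002 cd003 cza\nabc cd001 cd002 zca\nbac cd004 cd005 zac\n' > "$tmp/data_file"

# Store a constant 1: any non-empty, non-zero value makes A[$3] true.
with_one=$(awk 'NR == FNR { A[$1] = 1; next } A[$3]' "$tmp/pattern_file" "$tmp/data_file")

# Test for the key itself; referencing A[$1] is enough to create it.
with_in=$(awk 'NR == FNR { A[$1]; next } $3 in A' "$tmp/pattern_file" "$tmp/data_file")

# With a one-column pattern file $2 is empty, so A[$3] is never true.
with_field2=$(awk 'NR == FNR { A[$1] = $2; next } A[$3]' "$tmp/pattern_file" "$tmp/data_file")

printf '%s\n' "$with_one"
rm -rf "$tmp"
```

The first two commands print the same two matching lines, while the third prints nothing. If the pattern file really has a single column, that would explain why the $2, $3 and $4 variants produced no output.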
As Scrutinizer says, what is assigned to A[$1] is unimportant. It just has to be a non-empty, non-zero value. Why are you changing the A[$1] assignment? What are you trying to do?
BTW, the 'next' statement has a good side effect: it prevents the A[$3] clause from being executed on the pattern file. If a novice decides to modify the script, it will prevent some undesired behavior. And awk doesn't have to do useless work processing a clause that isn't useful for the pattern file.
Thanks a lot.
I fully understand all the code now :)
Can I ask one more thing?
When I use A[$x]=$y,
x and y MUST be the same number, right?
Besides that, if I use A[$x]=1,
the result contains some empty lines for the entries that do not match the pattern file.
awk 'NR == FNR { A[$1]=1 } A[$3]' pattern_file input_file
x
x
bca cd002 cd003 cza
bac cd004 cd005 zac
acb cd006 cd007 caz
cab cd007 cd008 azc
x
x
x represents an empty line.
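One possible cause of those empty lines, sketched under the assumption that both files contain blank lines (the thread does not show the actual files, so the samples below are made up): on a blank pattern-file line $1 is empty, so A[""] is set to 1, and without 'next' the A[$3] clause then matches every line whose third field is empty, including the blank lines themselves.

```shell
# Hypothetical sample files, each ending with two blank lines.
tmp=$(mktemp -d)
printf 'cd003\ncd005\n\n\n' > "$tmp/pattern_file"
printf 'bca cd002 cd003 cza\nabc cd001 cd002 zca\nbac cd004 cd005 zac\n\n\n' > "$tmp/input_file"

# Command as posted (no next): count the blank lines it prints.
blank_posted=$(awk 'NR == FNR { A[$1]=1 } A[$3]' "$tmp/pattern_file" "$tmp/input_file" |
    awk '$0 == "" { n++ } END { print n + 0 }')

# Variant that ignores blank pattern lines (NF is 0 there) and keeps next.
blank_guarded=$(awk 'NR == FNR { if (NF) A[$1]=1; next } A[$3]' "$tmp/pattern_file" "$tmp/input_file" |
    awk '$0 == "" { n++ } END { print n + 0 }')

echo "posted: $blank_posted blank lines, guarded: $blank_guarded blank lines"
rm -rf "$tmp"
```

With these sample files the posted command prints four blank lines around the two matches, and the guarded variant prints none.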
Thanks jp2542a,
I fully understand the code now :)
Thanks a lot for all of your explanations ;)