Hi,
We need to compare a text file, File1.txt, with a config file, File2.cfg, and check whether each value in File1.txt falls within one of the ranges listed in File2.cfg.
Each range runs from col1 to col2 of File2.cfg, inclusive.
For every value in File1.txt that falls within a range, the count for that range's col3 value should go up by one, and the output file should list each col3 value with its count.
For Example :
File1.txt
65005
65007
65006
27117
68700
68399
File2.cfg
col1 col2 col3
65005 65008 A
68399 68399 A
68700 68700 A
22980 22999 B
27109 27125 C
Output File :
col3 Count
A 5
B 0
C 1
Could anyone please help me do this?
awk '
### THIS STORES THE RANGES FROM THE FIRST FILE ON THE COMMAND LINE (file2) ###
### IN AN ARRAY IN MEMORY. REQUIRED FOR LOOKUP LATER WHEN WE START READING ###
### file1. ###
FNR==NR{range[$1,$2]=$3;next}
##############################################################################
{
### FOR EACH LINE FROM file1, THIS LOOPS THROUGH THE ARRAY STORED IN MEMORY###
### AND CHECKS IF THE LINE READ IS IN ANY OF THE RANGES. IF YES, THE COUNT ###
### OF THE RANGE NAME (A,B,C,D,ETC.) IS INCREMENTED BY ONE. ###
for(i in range)
{
### range[i] IS THE RANGE NAME (COL3); +=0 MAKES NAMES WITH NO HITS PRINT 0 ###
c[range[i]]+=0
split(i,r,SUBSEP)
if($1>=r[1] && $1<=r[2])
c[range[i]]++
}
###############################################################################
}
### THIS IS DONE AFTER READING THE 2 FILES COMPLETELY. THIS SIMPLY PRINTS ###
### THE ARRAY INDEX (RANGE NAME) AND THE CORRESPONDING VALUE (COUNT). THE ###
### OUTPUT IS PIPED TO THE SORT COMMAND TO GET THE OUTPUT DESIRED. ###
END{
for(i in c)
print i,c[i]|"sort"}
################################################################################
' file2 file1
The only problem is that our cfg file contains some descriptive header lines in addition to the ranges. When we run the code, those lines are read as ranges too, so the descriptive headers show up in the output with the total count against them.
Example:
File2.cfg :
#Start End session
#----- ----- -------
65005 65008 A
68399 68399 A
68700 68700 A
22980 22999 B
This is for your information
27109 27125 C
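One possible way around the header lines, offered as a sketch rather than a tested fix for your real files: only store a config line as a range when its first two fields are plain integers and a third field is present. The function name count_ranges, the integer-only pattern, and the assumption that the config file contains at least one line are mine, not something from the posts above. The comparison is also forced to be numeric with +0 so that stray text can never match as a string.

```shell
# Hypothetical helper: count_ranges CONFIG_FILE DATA_FILE
# CONFIG_FILE has "start end label" lines mixed with headers or free text.
# DATA_FILE has one number per line.
count_ranges() {
  awk '
    # First file (config): keep only lines that look like "int int label".
    # Header lines such as "#Start End session" fail the integer test
    # and are skipped.
    FNR == NR {
      if (NF >= 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/) {
        range[$1, $2] = $3
        count[$3] += 0      # labels with no hits still print 0
      }
      next
    }
    # Second file (data): ignore blank or non-numeric lines.
    $1 !~ /^[0-9]+$/ { next }
    {
      for (key in range) {
        split(key, r, SUBSEP)
        # +0 forces a numeric comparison on both sides.
        if ($1 + 0 >= r[1] + 0 && $1 + 0 <= r[2] + 0)
          count[range[key]]++
      }
    }
    END {
      for (label in count)
        print label, count[label] | "sort"
    }
  ' "$1" "$2"
}
```

It could be called as count_ranges File2.cfg File1.txt > Outputfile. It prints only the label and count pairs, so the "col3 Count" heading would have to be added separately if you need it.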