Hello,
I am parsing a large input file, FILE1 (field CFA). I have to compare the FILE1 field (CFA, bytes 88-96) with the contents of FILE2 (it contains only one field) and write the matching rows
to another file.
Here is my code and a sample input file:
#########################################
# Function: CheckNBS
#########################################
function CheckNBS
{
    writeInfo "************************************************************************************************"
    writeInfo "----------------- CHECK FILE NBS FILE2 Start: $d ------------------"
    FILE2="$DIR_OUT/FILE2_${NamingDate}.data"
    FILE_OUT="$DIR_OUT/OUT_CAMPIONE_NBS_${NamingDate}.ctrl"
    # list the matching file names (the glob must stay unquoted so it expands)
    ListFILE=$(cd "$NBSPATH" && ls *"${DATA_RIFERIMENTO}"*)
    for FILE1 in ${ListFILE}
    do
        writeInfo "Processing FILE1: ${FILE1}"
        ListCFA=$(cat "${FILE2}")
        for CFA in ${ListCFA}
        do
            zcat "$NBSPATH"/"$FILE1" | grep "$CFA" | awk '$1 == "201" { print $0 }' >> "${FILE_OUT}"
        done
    done
}
Execution is very slow. Can I also use awk on compressed files?
You're running zcat on "$NBSPATH"/"$FILE1" and running grep | awk once for every CFA in $FILE2. That consumes a lot of resources. Why not uncompress once into a temp file and run e.g. grep -f $FILE2 on that temp file? Does your system offer the zgrep command?
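A minimal sketch of that "uncompress once, grep once" idea. File names and contents here are invented stand-ins for your uncompressed FILE1 and for FILE2:

```shell
# Stand-ins for the real data (names and contents invented for illustration).
printf '201 xCFA1x\n201 xCFA2x\n999 xCFA1x\n' > file1.txt   # uncompressed FILE1
printf 'CFA1\nCFA2\n' > file2.txt                           # FILE2: one pattern per line

# One grep pass over the whole file instead of one zcat | grep per CFA:
grep -f file2.txt file1.txt | awk '$1 == "201"' > matched.out
```

With grep -f the patterns are read once and every input line is checked against all of them in a single pass, so FILE1 is decompressed and scanned exactly once instead of once per CFA.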
Yes, you can use the awk substr() function to grab substrings. If your uncompressed FILE1 contains any of the strings in FILE2, the following awk script prints the matching lines:
awk '
# first file (FILE2): remember each key from field 1
FNR == NR {
    CPA[$1]
    next
}
# second file (FILE1): print lines whose 12 characters starting at byte 88 match a key
substr($0, 88, 12) in CPA' FILE2 FILE1
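If you would rather keep FILE1 compressed, the same script can read the data from a zcat pipe, using "-" as awk's stdin placeholder. A self-contained sketch (the file names and the key are invented, and it assumes the 12-character field starting at byte 88 used above):

```shell
# Invented sample data: one key in FILE2, and one gzipped record whose
# bytes 88-99 carry that key ("201" plus 84 spaces pads out to column 87).
printf 'KEY000000001\n' > FILE2
printf '201%84sKEY000000001 rest-of-record\n' '' | gzip > FILE1.gz

# Stream the compressed file into awk; "-" stands for stdin.
zcat FILE1.gz | awk '
FNR == NR { CPA[$1]; next }
substr($0, 88, 12) in CPA' FILE2 - > out.txt
```

This keeps the single-pass structure: FILE2 is loaded into the array first (FNR == NR is only true for the first file), then each decompressed line is checked with one array lookup.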
but, as has already been stated, no lines in your sample files match.
If we add the following line to your sample FILE2 :