NOTE: dot(.) indicates one space
To be precise, I have a file (file1.txt) of fixed-length records with no field separator. Each record is 21 characters long. I have a second file (file2.txt) with only two fields, separated by a comma (,). I need to parse through the first file. Whenever there is a value in characters 10-13, I will search for it in the first field of file2.txt. If I find it, I will take the second field from file2.txt and replace characters 16-19 of the record in the first file with it. Note that each record in the final file must keep the same length (21) as in the first file.
I hope this can be done with AWK, but as I am new to it, any help is appreciated. I need this urgently.
Thanks in advance,
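Since the question asks about AWK, here is one way the requirement above might be sketched, wrapped in a small shell script so it can be run as is. The two sample records and the temporary directory are made up for illustration; the sketch also assumes that the second field of file2.txt is always exactly 4 characters wide and that keys in file2.txt are unique. It is not tested against the real data.

```shell
#!/bin/sh
# Sketch only: the sample data below is invented for illustration.
workdir=$(mktemp -d)
printf '%s\n' 'AAAAAAAAA1235BB0000CC' 'BBBBBBBBB    BB1111CC' > "$workdir/file1.txt"
printf '%s\n' '1235,9998' > "$workdir/file2.txt"

result=$(awk -F, '
    # First file on the command line (file2.txt): build the lookup table.
    NR == FNR { map[$1] = $2; next }
    # Second file (file1.txt): characters 10-13 are the key,
    # characters 16-19 are replaced when the key is found.
    {
        key = substr($0, 10, 4)
        if (key in map)
            $0 = substr($0, 1, 15) map[key] substr($0, 20)
        print
    }
' "$workdir/file2.txt" "$workdir/file1.txt")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Reading file2.txt once into an array avoids rescanning it for every record of file1.txt. With the real files, only the awk command is needed, with file2.txt named first and file1.txt second.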
This ksh script seems to work with your posted data:
#! /usr/bin/ksh
exec < file1.txt
integer saved limit
saved=0
limit=1000
#
#  read a line and break it into fields
while IFS="" read line ; do
    tmp=${line#?????????}
    field1=${line%$tmp}
    line="$tmp"
    tmp=${line#????}
    field2=${line%$tmp}
    line="$tmp"
    tmp=${line#??}
    field3=${line%$tmp}
    line="$tmp"
    tmp=${line#????}
    field4=${line%$tmp}
    field5="$tmp"
    #
    #  If field2 is numeric we will use it to search for a new field4
    if [[ $field2 == +([0-9]) ]] ; then
        #
        #  See if we previously saved the data for this field2
        eval data=\${XX${field2}:-NOT_THERE}
        if [[ $data != NOT_THERE ]] ; then
            field4="$data"
        else
            #
            #  See if we can find field2 in the second file
            if data=$(grep "^${field2}," file2.txt) ; then
                data=${data##*,}
                #  Debug message goes to stderr so it stays out of the output records
                echo "found data = $data" >&2
                field4="$data"
                #
                #  Save the first $limit records we find in memory to avoid re-examining the file each time
                if ((saved<limit)) ; then
                    ((saved=saved+1))
                    eval XX${field2}=\${data}
                fi
            fi
        fi
    fi
    echo "${field1}${field2}${field3}${field4}${field5}"
done
exit 0
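The field-splitting idiom in the script above may not be obvious at first: `${line#?????????}` removes the first nine characters, and `${line%...}` then strips that remainder off the end, leaving just the nine characters that were removed. A minimal sketch of the same idea on one made-up 21-character record follows; the variable name `rest` is only for illustration. Quoting `"$rest"` inside the expansion makes it match literally, which is a small precaution not in the original script.

```shell
#!/bin/sh
# Sketch with an invented 21-character record.
line='AAAAAAAAA1235BB0000CC'

rest=${line#?????????}     # drop the first 9 characters
field1=${line%"$rest"}     # what was dropped: characters 1-9
line=$rest

rest=${line#????}          # drop the next 4 characters
field2=${line%"$rest"}     # characters 10-13 (the lookup key)
line=$rest

rest=${line#??}            # drop the next 2 characters
field3=${line%"$rest"}     # characters 14-15
line=$rest

rest=${line#????}          # drop the next 4 characters
field4=${line%"$rest"}     # characters 16-19 (the part to replace)
field5=$rest               # characters 20-21

printf '%s|%s|%s|%s|%s\n' "$field1" "$field2" "$field3" "$field4" "$field5"
```

Because only parameter expansion is used, no external command such as cut is started for each record.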
I don't believe that I have ever used a statement like:
if data=$(grep "^${field2}," file2.txt) ; then
before. It is a cool technique.
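The technique works because a command that consists only of a variable assignment with a command substitution returns the exit status of the substituted command, so grep's "found / not found" status can be tested directly while its output is captured. A small sketch, using a made-up file2.txt and a hypothetical helper function named `lookup`:

```shell
#!/bin/sh
# Sketch: the sample file and the lookup function are invented for illustration.
workdir=$(mktemp -d)
printf '%s\n' '1235,9998' > "$workdir/file2.txt"

lookup() {
    # $1 is the key. Prints the second field and returns 0 when found,
    # returns 1 when grep finds no matching line.
    if data=$(grep "^$1," "$workdir/file2.txt"); then
        printf '%s\n' "${data##*,}"
    else
        return 1
    fi
}

if hit=$(lookup 1235); then hit_status=0; else hit_status=1; fi
if miss=$(lookup 0000); then miss_status=0; else miss_status=1; fi

echo "hit=$hit status=$hit_status"
echo "miss=$miss status=$miss_status"
rm -rf "$workdir"
```

The same pattern is used twice here: once inside `lookup` around grep, and once around the calls to `lookup` itself.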
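#!/usr/bin/python
f2data = {}  # lookup table: first field of file2.txt -> second field
for f2 in open("file2.txt"):
    fone, ftwo = f2.strip().split(",")  # get 1235, 9998 etc
    f2data[fone] = ftwo
for f1 in open("file1.txt"):
    f1 = f1.rstrip("\n")  # drop only the newline; spaces are part of the record
    if f1[9:13] in f2data:
        # keep chars 1-15, replace chars 16-19, keep chars 20-21
        print(f1[0:15] + f2data[f1[9:13]] + f1[19:])
    else:
        print(f1)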