I have a requirement to mask a few specific fields in a UNIX file. The details are as follows:
The file is a fixed-length file; each record is 250 characters long.
Two fields need to be masked: positions 21:30 and 110:120.
The masking needs to be done character by character, which means each character in these fields (21:30 and 110:120) should be replaced with another character.
The replacement patterns should be kept in a separate file. For example:
A:B
B:C
This means A should be replaced with B, B with C, and so on.
Is this a homework assignment? Homework assignments must be filed in the Homework & Coursework forum and the 1st post in threads in that forum must contain a completely filled out homework template.
If it isn't homework, you need to make your requirements much clearer. A masking operation would blank out or remove a specific character or a specific set of characters; not replace one set of characters with another set of characters.
And without sample input and corresponding sample output (in CODE tags), your specification is ambiguous. If an A is found and converted to a B (as in your sample), is the output for that position supposed to be a B, or should it be a C because the 2nd line in your sample input file says B should be changed to C?
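To make the two readings concrete, here is a small sketch (not from the original post) that hard-codes the sample rules A:B and B:C. The function names single_pass and chained are made up for illustration. The first looks each input character up once; the second applies each rule to the output of the previous rule.

```shell
# Single pass: every input character is looked up exactly once in the map.
single_pass() {
    printf '%s\n' "$1" | awk '
    BEGIN { rp["A"] = "B"; rp["B"] = "C" }   # sample rules A:B and B:C
    {
        out = ""
        for (i = 1; i <= length($0); i++) {
            c = substr($0, i, 1)
            out = out ((c in rp) ? rp[c] : c)
        }
        print out
    }'
}

# Chained: each rule is applied to the result of the previous rule.
chained() {
    printf '%s\n' "$1" | sed -e 's/A/B/g' -e 's/B/C/g'
}

single_pass ABBA   # prints BCCB
chained ABBA       # prints CCCC
```

Only the single-pass reading keeps the translation reversible, because distinct input characters stay distinct in the output.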
This is not an assignment; this is a real-world challenge. Apologies for not articulating the problem correctly and not fully explaining the issue.
Maybe calling it a masking solution is not the right thing. My actual requirement is to replace one set of characters with another set of characters, character by character, so that the reverse translation (decryption) can also be done easily.
As it's a character-by-character replacement, every character will have one and only one replacement character. Hence ABCDE should convert to BCDEF, where the replacement rules are:
A : B
B : C
C : D
D : E
E : F
The first character on each line is the one to be found; it is replaced by the second character.
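Since reversibility is part of the requirement, one possible way to get the reverse rules is to swap the two columns of the replacement file. The sketch below assumes one "from : to" pair per line, as in the examples in this thread. The function name invert_map is hypothetical. It stops with an error if two characters share the same replacement, since that mapping could not be undone.

```shell
# invert_map FILE: print the reverse rules (to:from) for a replacement file.
invert_map() {
    awk -F'[ \t]*:[ \t]*' '
    NF == 2 {
        # Two sources mapping to the same target cannot be reversed.
        if ($2 in seen) {
            printf "duplicate replacement %s: not reversible\n", $2 > "/dev/stderr"
            exit 1
        }
        seen[$2] = 1
        print $2 ":" $1
    }' "$1"
}

# Usage (assuming map.txt holds the lines "A : B" and "B : C"):
#   invert_map map.txt > reverse_map.txt
```

Feeding the reversed file to the same translation step should then restore the original field contents, provided the forward translation was done in a single pass.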
If your replacement pattern file is in the format shown in post #1 or in the format shown in post #3, or if you are working on a system where setting FS to the empty string doesn't treat each input character as a separate field, you could try this (although all of the field sizes and offsets are built into the code instead of being derived from an input operand):
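awk -F'[[:blank:]]*:[[:blank:]]*' '
# First file: load the replacement pairs (original character -> replacement).
FNR == NR {
	rp[$1] = $2
	next
}
# Second file: rebuild each record, translating columns 21-30 and 110-120.
{	o = substr($0, 1, 20)			# columns 1-20 unchanged
	for(i = 21; i <= 30; i++)
		o = o (((c = substr($0, i, 1)) in rp) ? rp[c] : c)
	o = o substr($0, 31, 79)		# columns 31-109 unchanged
	for(i = 110; i <= 120; i++)
		o = o (((c = substr($0, i, 1)) in rp) ? rp[c] : c)
	print o substr($0, 121)			# columns 121 to end unchanged
}' replacement_pattern_file data_file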
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk .
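If the hard-coded offsets become a nuisance, the column ranges could be passed in as an operand instead. The following is only a sketch under some assumptions. The function name translate_fields and the "21-30,110-120" range syntax are made up for illustration. Ranges must be given in ascending order and must not overlap. The separator is written as [ \t] rather than [[:blank:]] only because some older awk versions lack POSIX character classes.

```shell
# translate_fields RANGES MAP_FILE DATA_FILE
#   RANGES: comma-separated first-last column pairs, e.g. "21-30,110-120"
translate_fields() {
    awk -F'[ \t]*:[ \t]*' -v ranges="$1" '
    BEGIN {
        # Split "21-30,110-120" into first[k] and last[k].
        n = split(ranges, r, ",")
        for (k = 1; k <= n; k++) {
            split(r[k], se, "-")
            first[k] = se[1] + 0
            last[k] = se[2] + 0
        }
    }
    FNR == NR { rp[$1] = $2; next }      # first file: replacement pairs
    {
        out = ""
        pos = 1                          # next column not yet copied
        for (k = 1; k <= n; k++) {
            out = out substr($0, pos, first[k] - pos)   # untouched gap
            for (i = first[k]; i <= last[k]; i++) {
                c = substr($0, i, 1)
                out = out ((c in rp) ? rp[c] : c)
            }
            pos = last[k] + 1
        }
        print out substr($0, pos)        # remainder of the record
    }' "$2" "$3"
}

# Usage: translate_fields "21-30,110-120" replacement_pattern_file data_file
```

With the ranges given above this should produce the same output as the fixed-offset script, and a file with different field positions only needs a different first argument.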
The code is working beautifully and is doing the intended translations.
My next challenge is performance, as this code may need to run on files with 10M records. However, I had a test run with 1M records and it completed in less than 8 minutes.
I will post again if I run into any further challenges with this. I need to build a complete solution across files (with multiple translation fields).