substitution of varying digits

mad_man12 · January 19, 2010, 7:24am

I had a requirement to mask (with *) some digits of a variable (input): the first six digits stay visible, the next six are
replaced by *, and the rest stay visible.

ex:
Input - 123456789012345
Output - 123456******345
ex:
Input - 1234567890123456
Output - 123456******3456

So I tried something like the below, and it worked.

if(length($(i+12))>=15)
  {sub(substr($(12+i),7,6),"******",$(12+i))}

But now the updated requirement is that only the first six and the last four digits stay visible, and
everything in between (the count varies with the length of the input) is replaced by *, for any
length greater than or equal to 15.
Please advise how to achieve the above.

panyam · January 19, 2010, 8:10am

Something like this:

echo "1234567890000000000012345" | sed 's/\([0-9]\{6\}\)\(.*\)\([0-9]\{4\}\)/\1(\2)\3/' | awk -F"[()]" '{gsub(".","*",$2); print }' OFS=""

mad_man12 · January 19, 2010, 10:12am

hey Panyam,

It's you again!! Thanks a lot!! But I need a bit more,

as the change has to be embedded in the following code:

 
ls *.txt | while read file ; do
awk -F: '/\+ABC/{for(i=0;++i<NF;){if($i~/\+ABC/&&length($(i+12))>=15){sub(substr($(12+i),7,6),"******",$(12+i))}}}1' OFS=":" $file > "$file"_encrypted
mv  "$file"_encrypted $file
done

If you remember (by chance), the requirement behind the above code was to look for ABC in the .txt file and mask the 12th field
as described above.

deindorfer · January 19, 2010, 5:39pm

perl -lne '/(\d{6})(\d+)(\d{4})$/; print $1, "*" x length $2, $3' file

rdcwayx · January 20, 2010, 12:08am

$ echo "1234567890000000000012345"  |awk -F "" '{for (i=7;i<=(NF-4);i++) $i="*"}1' OFS=""
123456***************2345
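
Splitting a record into single characters with an empty field separator (-F "") is left unspecified by POSIX, so it may not behave the same in every awk. As a rough sketch only, the same masking can be done with substr and length, assuming the input is a plain run of digits with no spaces. The function name mask_digits is made up for illustration, and the threshold of 15 is taken from the original post.

```shell
# mask_digits: hypothetical helper, not from the thread.
# Keeps the first 6 and last 4 characters and stars out everything between.
# Inputs shorter than 15 characters are printed unchanged.
mask_digits() {
    printf '%s\n' "$1" | awk '{
        n = length($0)
        if (n >= 15) {
            stars = ""
            # one star for each position from 7 up to n-4
            for (i = 7; i <= n - 4; i++) stars = stars "*"
            $0 = substr($0, 1, 6) stars substr($0, n - 3)
        }
        print
    }'
}

mask_digits 1234567890000000000012345
```

For the 25-digit sample this should print the same line as the gawk one-liner above.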

Provide a sample of your txt file here and we will give you a one-line solution.

mad_man12 · January 22, 2010, 10:59am

BAT:0310:2009-08-0:Y4   :H:D:00003721:03103721.IFH:00138770:05767:
00000000001279'
 
EXR:CLP:912.570000'
 
STA:A:9071559:2009-08-10::Ward::Mrs'
 
DEF::531.97:531.97:310221661617::+ABC:BAL:1:N::::5:40.00:0.00:2009-08-10:CN:11627877495099621::3:N:missc :N:PH:00010833:
0001+ABC:FPT:4:N::::5:19.99:0.00:2009-08-10:CN:1162 7987 9509 9621::3:N:miss c ross:N:AI:00220600:S3IA'
 
VDI:2004-03-12:133030431725:4:M:00001912:AT:BSP:9124029676:2004-05-06:Parker:4:12:::::I:::::N::129.00:129.00
:1234567887234567678:0:155.40::6:::::+TAX:UB:6.30+TAX:XT:15.10'
 
CTR:2009-08-10:0.00:0.00:30.00:30.00:7819.00:7819.00'
 
GTR:11.50:0.00:0.00:28457.81:149449.38:21298.48:154882.82:1725.89'
TRA'

I have a txt file as above, and I need to mask the middle digits of the credit card number so that only the first 6
and last 4 are visible.

The credit card number appears in the 12th position of the +ABC segment, separated by :
(the 12th position can also hold other things besides the credit card number, and those shouldn't be masked). The way to
identify a credit card number is the field before it: if the 11th field of the ABC section has any of these values
TXE,AF,XT,TT,IT,TX,DX,TY,DT,MO,SE,CF,AXE,DF,CX,TF,DE,XF,CNE,IX,CN,SC,XTE,AX,CX
then the credit card number is in the 12th position and needs to be masked.

The credit card number can have a varying number of digits (16, 17, 19, ...), and the digits can appear
together or separated by spaces.

The credit card number also appears in the 26th position of the VDI segment, separated by :

eg
1162 7987 9509 9621
1162798795099621
1162 7987 9509 9621 1234
11627987950996211234

The output needs to be

1162 79********9621
116279******9621
1162 79*************1234
116279**********1234

I used the code from my post above, but it doesn't
handle the varying number of digits or the 11th-field check of the ABC section.

Please advise how to achieve the above.
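
The masking rule described above (keep the first 6 and last 4 digits, star out everything in between, spaces included) could be sketched roughly as follows. This is an illustration, not a tested production script: the name mask_card is hypothetical, the 15-digit minimum is carried over from the first post, and any value containing something other than digits and spaces is assumed to be "not a card" and is left alone.

```shell
# mask_card: hypothetical helper, not from the thread.
# Counts digits rather than characters, so embedded spaces do not shift
# the visible first-6 / last-4 window.
mask_card() {
    printf '%s\n' "$1" | awk '
    function mask(s,    n, i, c, total, seen, hide, out) {
        if (s !~ /^[0-9 ]+$/) return s          # other content: leave alone
        n = length(s)
        for (i = 1; i <= n; i++)
            if (substr(s, i, 1) != " ") total++  # number of digits
        if (total < 15) return s
        for (i = 1; i <= n; i++) {
            c = substr(s, i, 1)
            if (c == " ") {
                # a space is hidden when it sits after digit 6 and
                # before the first of the last 4 digits
                hide = (seen >= 6 && seen <= total - 4)
            } else {
                seen++
                hide = (seen > 6 && seen <= total - 4)
            }
            out = out (hide ? "*" : c)
        }
        return out
    }
    { print mask($0) }'
}

mask_card "1162 7987 9509 9621"
```

Run against the four sample inputs listed above, this should reproduce the four expected outputs.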

steadyonabix · January 22, 2010, 11:26am

It is a seriously bad idea, not to mention unlawful in most countries, to post people's credit card details in a forum like this. I can see the data looks old, but nonetheless you should sanitize the data first. I work for a credit card company and they do scour the web looking for posts by their employees. If they found me posting this I would be marched out the door. I am sure Mrs Ward and Miss Ross would be less than delighted to find this here.

deindorfer · January 23, 2010, 2:19am

This might be better off as a script, but it will do what you ask.

perl -pe 's/\n//g; s/\x27/\n/; s/^\s+//' infile | perl -F: -lane 'print $F[26] if /^VDI/; print $1 while ( /\+ABC(?::.*?){12}(.+?):/g )' | perl -lpe 's/\s+//g' | perl -lne '/(\d{6})(\d+)(\d{4})$/; print $1, "*" x length $2, $3'

Scrutinizer · January 23, 2010, 4:20am

Try this modification to rdcwayx' code:

awk -F "" '{j=0; for(i=1;i<7+j;i++)if($i==" ")j++; for(;i<=(NF-4);i++)$i="*"}1' OFS="" infile

deindorfer · January 23, 2010, 2:39pm

The awk script above produces this output when run against the example data:

BAT:03********************************************************767:
000000*****279'

EXR:CL*********000'

STA:A:**************************Mrs'

DEF::5**************************************************************************************************************833:
0001+A*********************************************************************************************3IA'

VDI:20**************************************************************************************************9.00
:12345*****************************************************.10'

CTR:20*******************************************.00'

GTR:11********************************************************.89'
TRA'

DEF::5**************************************************************************************************************833:
0001+A*************************************************************************************************************3IA'

Here is what the perl I posted produces:

perl -pe 's/\n//g; s/\x27/\n/; s/^\s+//' infile | perl -F: -lane 'print $F[26] if
/^VDI/; print $1 while ( /\+ABC(?::.*?){12}(.+?):/g )' | perl -lpe 's/\s+//g' | perl -lne '/(\d{6})(\d+)(\d{4})$/; print $1, "*" x length $2, $3'

116278*******9621
116279******9629
123456*********7678

I'll post a script. This is getting too silly for a one-liner.

Scrutinizer · January 23, 2010, 6:32pm

The awk was created for the original requirement, and there it produces:

1162 79********9621
116279******9621
1162 79*************1234
116279**********1234

To get this out of the revised specs, including the additional code restriction on field 11, similar to your Perl, I ended up with something crazy like this:

awk -F: '{gsub(/\n/,"")}1' RS="\'" infile2 |
awk -F'\+ABC' '/^ *VDI:/{print$1}NF>1{for (i=2;i<=NF;i++)print $i}' OFS='\n' |
awk -F: '{if ($27)print $27; else if ($12 ~ /^TXE$|^AF$|^XT$|^TT$|^IT$|^TX$|^DX$|^TY$|^DT$|^MO$|^SE$|^CF$|^AXE$|^DF$|^CX$|^TF$|^DE$|^XF$|^CNE$|^IX$|^CN$|^SC$|^XTE$|^AX$|^CX$/) print $13}' |
awk -F "" '{j=0; for(i=1;i<7+j;i++)if($i==" ")j++; for(;i<=(NF-4);i++)$i="*"}1' OFS=""

Output:

116278*******9621
1162 79********9621
123456*********7678

I am sure this can be further optimized
Anyone?

deindorfer · January 24, 2010, 1:36am

I created perl that produces the correct output and which scales forever, provided the requester has posted *ALL* the use cases, so awk away; but the problem IS solved --in perl....

It's solved in awk, too. Y'all just have to golf it some more. Really, it should be a script at this point. Let's call it quits: we solved it in two languages, yay us!!

mad_man12 · January 27, 2010, 4:03am

Thanks Unix and Perl Gurus!!

I am looking for the replacement to happen in the input file itself, so that
the script, once run, replaces the digits with * in the input file. The above code gives me separate output containing only the masked card numbers.

I need the whole file as output, with only the cc number masked.

Please advise.
Thanks in advance!!
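
One possible way to get the whole file back with only the card numbers masked is a single awk pass that treats the single quote as the record terminator and rewrites fields in place. This is a sketch under several assumptions drawn from the sample data, not a verified solution: records end with ' and may wrap over several lines; the card sits 12 fields after a field ending in +ABC, with the type code 11 fields after it; in VDI records the card is field 27 when VDI itself is counted as field 1; and a card number is never split across a line break. The name mask_file and the .masked temporary suffix are made up for illustration.

```shell
# mask_file: hypothetical helper, not from the thread. Rewrites one file in place.
# usage: for f in *.txt; do mask_file "$f"; done
mask_file() {
    awk -v q="'" '
    # keep first 6 and last 4 digits, star out the rest (spaces included)
    function mask(s,    n, i, c, total, seen, hide, out) {
        if (s !~ /^[0-9 ]+$/) return s
        n = length(s)
        for (i = 1; i <= n; i++)
            if (substr(s, i, 1) != " ") total++
        if (total < 15) return s
        for (i = 1; i <= n; i++) {
            c = substr(s, i, 1)
            if (c == " ") hide = (seen >= 6 && seen <= total - 4)
            else { seen++; hide = (seen > 6 && seen <= total - 4) }
            out = out (hide ? "*" : c)
        }
        return out
    }
    BEGIN {
        FS = OFS = ":"
        RS = q                                   # one record per quote-terminated segment
        n = split("TXE AF XT TT IT TX DX TY DT MO SE CF AXE DF CX TF DE XF CNE IX CN SC XTE AX", tmp, " ")
        for (k = 1; k <= n; k++) codes[tmp[k]] = 1
    }
    {
        # VDI segment: card in field 27 (VDI counted as field 1)
        if (NF >= 27 && $1 ~ /^[ \t\n]*VDI$/) {
            m = mask($27); if (m != $27) $27 = m
        }
        # +ABC segments: check the type code, then mask the next field
        for (i = 1; i + 12 <= NF; i++)
            if ($i ~ /\+ABC$/ && ($(i + 11) in codes)) {
                m = mask($(i + 12)); if (m != $(i + 12)) $(i + 12) = m
            }
        # put the quote back between records, leaving line breaks untouched
        printf "%s%s", (NR > 1 ? q : ""), $0
    }' "$1" > "$1.masked" && mv "$1.masked" "$1"
}
```

Records that are not modified are printed exactly as read, so the original wrapping is preserved. It would be wise to try this on a copy of the data first.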