Masking Bank Account Number except last 4 digits in the file

Hello Unix Gurus,

I need help masking the bank account number, except for its last 4 digits, in a file, using either a Unix command or a shell script.

I'd greatly appreciate your help.

File Name: Sample.txt
560|101012|4267||||||||520114025017|Balance_bank|06/30/2018||||151716.41|AUD
448|101034|3148||||||||232041005|Balance_bank|06/30/2018||||991.23|AUD
270|101034|2770||||||||121339005|Balance_bank|06/30/2018||||594217.11|AUD

The modified file should look like this:

560|101012|4267||||||||XXXXXXXX5017|Balance_bank|06/30/2018||||151716.41|AUD
448|101034|3148||||||||XXXXX1005|Balance_bank|06/30/2018||||991.23|AUD
270|101034|2770||||||||XXXXX9005|Balance_bank|06/30/2018||||594217.11|AUD

Welcome to the forum.

Any attempts / ideas / thoughts from your side?

I know how to do it in Oracle (as shown below); however, the client needs it as a shell script.

select length('520114025017'),
       '520114025017' bank_account_number,
       lpad(substr('520114025017', length('520114025017')-3), length('520114025017'), 'X')
  from dual;

Thanks,
Pradeep

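Here is a hack with sed.

sed -e ':L' -e 's/^\(\([^|]*|\)\{10\}X*\)[0-9]\([0-9]\{4\}\)/\1X\3/;tL' sample.txt

The ^\(\([^|]*|\)\{10\}X*\) matches the first 10 fields plus any Xes already substituted in field 11, and is put back as \1. The ^ anchor matters: without it, once field 11 is finished the pattern can match starting at a later field and mask digits elsewhere in the line, such as the leading digits of the amount in field 17.
The 4 trailing digits \([0-9]\{4\}\) act as a look-ahead: they are matched and put back as \3.
Only the single [0-9] is actually replaced by the X.
The whole thing repeats in a loop until nothing is left that can be substituted. (A /g flag does not work here, because after a match sed resumes scanning after the last matched character, i.e. past the look-ahead, while each further substitution has to match from the start of the line again.)
A solution with awk, perl or bash would look simpler, because you can split on "|" and work on field 11 only.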
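To illustrate the "split on | and only touch field 11" idea, here is a minimal sketch with awk. The function name mask_field11 is made up for this example, and leaving values of 4 characters or fewer untouched is an assumption on my part, not something the original poster specified.

```shell
#!/bin/sh
# mask_field11 (hypothetical helper): mask all but the last 4 characters
# of field 11 in pipe-delimited input read from stdin.
mask_field11() {
  awk 'BEGIN { FS = OFS = "|" }
  {
    n = length($11)
    if (n > 4) {                      # assumption: short values stay untouched
      masked = ""
      for (i = 1; i <= n - 4; i++)    # one X per hidden character
        masked = masked "X"
      $11 = masked substr($11, n - 3) # keep only the last 4 characters
    }
    print
  }'
}

# Demo with the first sample line from the question
printf '%s\n' '560|101012|4267||||||||520114025017|Balance_bank|06/30/2018||||151716.41|AUD' | mask_field11
```

Since the mask is built in a loop, it should cope with account numbers of any length, and it masks non-digit characters such as dashes as well.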

Depending on what the rest of the shell script looks like (and, most prominently, WHICH SHELL YOU ARE USING; you might have considered telling us), you can do it with variable expansion. Here is a solution in Korn shell. I haven't tested it in bash, but it should work there too (replace "print -" with "echo" then):

#! /bin/ksh

function maskpart
{
mask="${1%????}"                # extract everything save for the last 4 digits
mask="${mask//?/x}"             # and replace all characters with x's

print - "${mask}${1#${1%????}}" # print the masked part and the last 4 digits

return 0
}

# here are examples of how to use the function:
maskpart "123456-7890"
maskpart "blafoo1234"

myvar="$(maskpart "123456-7890")" ; print - "$myvar"

while read -r BAN ; do
     printf '%s \t==> ' "$BAN" ; maskpart "$BAN"
done < /file/with/bank-account-numbers
exit 0

The function does not handle bank account numbers that are 4 characters or shorter (or otherwise malformed). If this can happen, you will have to provide extra logic.
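As a rough idea of what such extra logic could look like, here is a sketch that sticks to POSIX parameter expansion so it should run in ksh, bash and plain sh alike. The name maskpart_safe is hypothetical, and passing short values through unchanged is only one possible policy (masking them completely would be another).

```shell
# maskpart_safe (hypothetical name): like maskpart, but values of 4
# characters or fewer are printed unchanged (an assumed policy).
# Note: plain sh has no "local", so these variables are global.
maskpart_safe() {
  value=$1
  hidden=${value%????}          # everything except the last 4 characters
  if [ "$hidden" = "$value" ] || [ -z "$hidden" ]; then
    printf '%s\n' "$value"      # 4 characters or fewer: nothing to mask
    return 0
  fi
  tail=${value#"$hidden"}       # the last 4 characters
  mask=
  while [ -n "$hidden" ]; do    # build one X per hidden character
    mask=${mask}X
    hidden=${hidden#?}
  done
  printf '%s\n' "${mask}${tail}"
}

maskpart_safe "520114025017"
maskpart_safe "1234"
```

Checks for otherwise malformed input (letters, embedded blanks and so on) would still have to be added according to whatever counts as valid in your data.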

I hope this helps.

bakunin

Here is a solution using Zsh. Assuming that you have stored your bank account number in a variable called 'number', i.e.

number=123456789

you can get the number masked with X by

xnumber=${${number:0:$((${#number}-4))}//?/X}${number:$((${#number}-4)):4}

Here comes a bash script, using a short version of bakunin's function and a print function. (The shell lacks a builtin join function, which gives me the chance to demonstrate the power of a custom function.)

#!/bin/bash

sep="|"

maskpart(){
  local mask="${1%????}"
  echo "${mask//?/X}${1#$mask}"
}

joinprint(){
  local a s=$1 out=$2
  shift 2
  for a
  do
    out=$out$s$a
  done
  echo "$out"
}

while IFS=$sep read -r -a arr
do
  arr[10]=$(maskpart "${arr[10]}")
  joinprint "$sep" "${arr[@]}"
done

Hello MadeinGermany,

How do I pass the file below to your script so that the 11th column in the file is updated with the masked value?

File Name: Sample.txt

560|101012|4267||||||||520114025017|Balance_bank|06/30/2018||||151716.41|AUD
448|101034|3148||||||||232041005|Balance_bank|06/30/2018||||991.23|AUD
270|101034|2770||||||||121339005|Balance_bank|06/30/2018||||594217.11|AUD
P59|101017|1765||||||||177-000072-011|Balance_bank|06/30/2018||||0.00|CNY

Thank you

echo '560|101012|4267||||||||520114025017|Balance_bank|06/30/2018||||151716.41|AUD' | awk -F'|' 'BEGIN{m="XXXXXXXXXXXXX"} {l=length($11);$11=substr(m,1,l-4) substr($11,l-3)}1' OFS='|'

My script in post #7 reads from stdin.
You must redirect it from the input file:

bash /path/to/script < Sample.txt

Additionally redirect the stdout to a new file:

bash /path/to/script < Sample.txt > Newfile.txt

If the x-bit is set on /path/to/script, you can run it directly; it will take the interpreter from its #! shebang:

/path/to/script < Sample.txt > Newfile.txt

Hi Vgerh99,

It works perfectly. How can I write these results back into the same file?

awk -F'|' 'BEGIN{m="XXXXXXXXXXXXX"} {l=length($11);$11=substr(m,1,l-4) substr($11,l-3)}1' OFS='|' NOAD_BK_BAL_MULTI_CURR_*

thanks,
Pradeep

Firstly, please start using code tags as outlined in the Forum's rules.

Standard awk has no in-place editing capability (GNU awk 4.1 and later offers it as the "-i inplace" extension). You'll have to wrap awk in a shell script that saves the output of awk to a temp file and then moves (mv) the temp file over the original, once you are satisfied with the results.
E.g.

awk -F'|' 'BEGIN{m="XXXXXXXXXXXXX"} {l=length($11);$11=substr(m,1,l-4) substr($11,l-3)}1' OFS='|' myFile >|/tmp/myTempFile
mv /tmp/myTempFile myFile
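With a wildcard such as NOAD_BK_BAL_MULTI_CURR_*, a single redirect would concatenate all input files into one temp file, so each file needs its own pass. Below is a sketch of such a loop. The function name mask_in_place is invented for this example, it assumes the mktemp utility is available (it is not part of POSIX, though GNU and BSD systems ship it), and it leaves values of 4 characters or fewer unmasked.

```shell
# mask_in_place (hypothetical helper): rewrite each named file with
# field 11 masked, going through a temp file created next to the original.
mask_in_place() {
  for f in "$@"; do
    [ -f "$f" ] || continue                 # skip unmatched globs and non-files
    tmp=$(mktemp "$f.XXXXXX") || return 1   # assumes mktemp is installed
    if awk -F'|' -v OFS='|' '
         { l = length($11)
           if (l > 4) {                     # assumption: short values untouched
             m = substr($11, 1, l - 4)
             gsub(/./, "X", m)              # one X per hidden character
             $11 = m substr($11, l - 3)
           }
           print }' "$f" > "$tmp"
    then
      mv "$tmp" "$f"                        # replace only after awk succeeded
    else
      rm -f "$tmp"
      return 1
    fi
  done
}

# Possible usage: mask_in_place NOAD_BK_BAL_MULTI_CURR_*
```

Keep in mind that mktemp creates the temp file readable by its owner only, so after the mv the file may have tighter permissions than before; restore them with chmod if other users need access.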

Hi,
For fun, another bash script:

while IFS=\| read -r -a ARR
do
  [[ ${#ARR[10]} -gt 4 ]] && { 
    XX="${ARR[10]::((${#ARR[10]}-4))}"
    YY="${ARR[10]:((${#ARR[10]}-4)):4}"
    ARR[10]="${XX//?/X}$YY"
  }
  printf "%s" "${ARR[0]}"
  unset 'ARR[0]'
  printf "|%s" "${ARR[@]}"
  printf "\n"
done </tmp/sample.txt

Regards.

Thank you so much, guys. It really helped me.