ernesto
September 28, 2016, 9:39am
1
Hi Gurus,
I have a problem and hope you can help me work out how to approach it. I have a file whose records have no field delimiters, and I find it impossible to decipher. Please see below.
INPUT
121014259981568JBAUT30JaniceBautistaMANGER TECHNICIANActiveTech
121014259879380DMASH03DavidMashawLEAD TECHNICIANActiveClerk
I want the output to look like this:
OUTPUT
JBAUT30 Janice Bautista MANGER TECHNICIAN
DMASH03 David Mashaw LEAD TECHNICIAN
Thanks.
Hello Ernesto,
Could you please try the following and let me know if it helps? Note that the Input_file must have exactly the same layout as the sample shown; otherwise the output will differ.
awk '{sub(/[0-9]+/,X,$0);match($0,/[[:alpha:]]+[[:digit:]]+/);ID=substr($0,RSTART,RLENGTH);DES=substr($0,RLENGTH+1);match(DES,/[a-zA-Z]+/);NAME=substr(DES,RSTART,RLENGTH);gsub(/.* |Active.*/,X,DES);sub(/[a-z]+/,"& ",NAME);VAL=ID FS NAME FS DES;gsub(/LEAD|MANGER/," &",VAL);print VAL}' Input_file
EDIT: Adding a multi-line form of the above solution too.
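awk '{
  # Remove the leading run of digits (X is an unset variable, so it acts as the empty string).
  sub(/[0-9]+/,X,$0);
  # The ID is the first run of letters followed by digits, e.g. JBAUT30.
  match($0,/[[:alpha:]]+[[:digit:]]+/);
  ID=substr($0,RSTART,RLENGTH);
  # Everything after the ID (the ID now starts the line, so RSTART is 1).
  DES=substr($0,RLENGTH+1);
  # First run of letters: first name, last name and first title word, still joined.
  match(DES,/[a-zA-Z]+/);
  NAME=substr(DES,RSTART,RLENGTH);
  # Keep only the last title word: drop text up to the last space and from Active onward.
  gsub(/.* |Active.*/,X,DES);
  # Put a space after the first name.
  sub(/[a-z]+/,"& ",NAME);
  VAL=ID FS NAME FS DES;
  # Put a space before the hard-coded title words.
  gsub(/LEAD|MANGER/," &",VAL);
  print VAL
}
' Input_file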
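The solution above hard-codes the title words LEAD and MANGER. As a rough sketch of one way around that, the variant below inserts a space at every lowercase-to-uppercase boundary instead. It assumes ASCII letters, an ID made of capitals followed by digits, and a status field that always starts with Active; the function name split_fields is made up for illustration.

```shell
# split_fields: hypothetical helper. Reads records from stdin (or the files
# given as arguments) and prints the ID, the name and the title.
split_fields() {
  LC_ALL=C awk '{
    sub(/^[0-9]+/, "")        # drop the leading numeric prefix
    sub(/Active.*$/, "")      # drop the status and everything after it (assumed to start with Active)
    if (!match($0, /^[A-Z]+[0-9]+/)) { print; next }   # no ID found: pass the line through
    id   = substr($0, 1, RLENGTH)
    rest = substr($0, RLENGTH + 1)
    # insert a space at each lowercase-to-uppercase boundary
    while (match(rest, /[a-z][A-Z]/))
      rest = substr(rest, 1, RSTART) " " substr(rest, RSTART + 1)
    print id, rest
  }' "$@"
}

printf '%s\n' '121014259981568JBAUT30JaniceBautistaMANGER TECHNICIANActiveTech' | split_fields
# prints: JBAUT30 Janice Bautista MANGER TECHNICIAN
```

One limitation of this sketch: a surname with an inner capital, such as McDonald, would also be split at that boundary.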
Thanks,
R. Singh
Hopefully this sed one-liner will work:
sed -e 's/\([0-9]\)\([A-Z]\)/\1 \2/g;s/\([a-z]\)\([A-Z]\)/\1 \2/g;s/\([A-Z]\)\([A-Z]\)\([a-z]\)/\1 \2\3/g;s/^\([0-9]*\) //' file
Hi,
Another version. (The sed expression is based purely on the input given in post #1; it might not work if the input lines are different.)
$ cat file
121014259981568JBAUT30JaniceBautistaMANGER TECHNICIANActiveTech
121014259879380DMASH03DavidMashawLEAD TECHNICIANActiveClerk
sed -re 's/^[0-9]+([A-Z]{5}[0-9]{2})([A-Z^A-Z][a-z]*)([A-Z][^A-Z][a-z]*)([a-z]*)([A-Z][^a-z]*)([A-Z][A-Za-z]*)/\1 \2 \3\4 \5/' file
Gives output:
JBAUT30 Janice Bautista MANGER TECHNICIAN
DMASH03 David Mashaw LEAD TECHNICIAN
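The expression above can be read as roughly one capture group per output field. The sketch below is a simplified variation under the same assumptions (a five-letter, two-digit ID, and first and last names that are each one capital followed by lowercase letters). It is not the exact command posted above, and extract_fields is a hypothetical name. It uses -E, which current GNU and BSD sed both accept for extended regular expressions.

```shell
# extract_fields: hypothetical helper; one capture group per output field.
#   \1 = ID (5 capitals + 2 digits)    \2 = first name
#   \3 = last name                     \4 = title (capitals and spaces)
# The trailing [A-Z][a-z].* consumes the status (e.g. ActiveTech) so it is dropped.
extract_fields() {
  LC_ALL=C sed -E \
    's/^[0-9]+([A-Z]{5}[0-9]{2})([A-Z][a-z]*)([A-Z][a-z]*)([A-Z][^a-z]*)[A-Z][a-z].*$/\1 \2 \3 \4/' "$@"
}

printf '%s\n' '121014259879380DMASH03DavidMashawLEAD TECHNICIANActiveClerk' | extract_fields
# prints: DMASH03 David Mashaw LEAD TECHNICIAN
```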
Hello Shamrock,
Thank you for the nice code. I would just like to add, IMHO, that the above code leaves an extra string at the end, from ActiveTech or ActiveClerk. To remove it too, a small modification of your code, as follows, should do the trick.
sed -e 's/\([0-9]\)\([A-Z]\)/\1 \2/g;s/\([a-z]\)\([A-Z]\)/\1 \2/g;s/\([A-Z]\)\([A-Z]\)\([a-z]\)/\1 \2\3/g;s/^\([0-9]*\) //;s/ Active.*//g' Input_file
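To see what each substitution in that chain contributes, it can be run one step at a time on the first sample line. This is only an illustrative trace; the variable names s1 to s5 are made up, and LC_ALL=C is set on the assumption that the letter ranges should be treated as plain ASCII.

```shell
LC_ALL=C; export LC_ALL
line='121014259981568JBAUT30JaniceBautistaMANGER TECHNICIANActiveTech'

# 1. space between a digit and a following capital
s1=$(printf '%s\n' "$line" | sed -e 's/\([0-9]\)\([A-Z]\)/\1 \2/g')
# 2. space between a lowercase letter and a following capital
s2=$(printf '%s\n' "$s1" | sed -e 's/\([a-z]\)\([A-Z]\)/\1 \2/g')
# 3. space before a capital that begins a capitalised word right after a capital
s3=$(printf '%s\n' "$s2" | sed -e 's/\([A-Z]\)\([A-Z]\)\([a-z]\)/\1 \2\3/g')
# 4. drop the leading number and the space after it
s4=$(printf '%s\n' "$s3" | sed -e 's/^\([0-9]*\) //')
# 5. drop the status and everything after it
s5=$(printf '%s\n' "$s4" | sed -e 's/ Active.*//')

printf '%s\n' "$s1" "$s2" "$s3" "$s4" "$s5"
# 121014259981568 JBAUT30 JaniceBautistaMANGER TECHNICIANActiveTech
# 121014259981568 JBAUT30 Janice Bautista MANGER TECHNICIANActive Tech
# 121014259981568 JBAUT30 Janice Bautista MANGER TECHNICIAN Active Tech
# JBAUT30 Janice Bautista MANGER TECHNICIAN Active Tech
# JBAUT30 Janice Bautista MANGER TECHNICIAN
```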
Thanks,
R. Singh