Script to extract numbers from gecos field

Several (but not all) of my gecos fields contain numbers. The format is:
joanjett0123456
joebelow02347689
catdown

I need to be able to strip the numbers out of the gecos field. Any suggestions on how to approach this?

Welcome to the community, @JCM!
We usually ask the OPs to post their own attempt(s) at solving their issue, but I'm going to bend the rules a bit, and get you started:

$ echo 'joanjett0123456' | sed 's/[0-9]//g'
joanjett

P.S. Going forward, please use the markdown tags when posting data/code samples - it's the rule here.


Presuming gecos is the 5th field in your /etc/passwd file (if NOT, YOU need to provide CONCISE details, not some random text):

cat /tmp/JCM
_logd:*:272:272:Log Daemon3333 :/var/db/diagnostics:/usr/bin/false
_appinstalld:*:273:273:App 444444Install Daemon:/var/db/appinstalld:/usr/bin/false
_installcoordinationd:*:274:274:Install 8888888Coordination Daemon:/var/db/installcoordinationd:/usr/bin/false
_demod:*:275:275:Demo Daemon:/var/empty:/usr/bin/false
_rmd:*:277:277:Remote Management Daemon:/var/db/rmd:/usr/bin/false
_fud:*:278:278:2222Firmware 333Update 444Daemon9999:/var/db/fud:/usr/bin/false
_knowledgegraphd:*:279:279:Knowledge Graph Daemon:/var/db/knowledgegraphd:/usr/bin/false
_coreml:*:280:280:CoreML Services:/var/empty:/usr/bin/false
_trustd:*:282:282:trustd:/var/empty:/usr/bin/false
_oahd:*:441:441:OAH Daemon23456:/var/empty:/usr/bin/false

awk -F: '{gsub(/[0-9]+/,"",$5)}1' OFS=: /tmp/JCM
_logd:*:272:272:Log Daemon :/var/db/diagnostics:/usr/bin/false
_appinstalld:*:273:273:App Install Daemon:/var/db/appinstalld:/usr/bin/false
_installcoordinationd:*:274:274:Install Coordination Daemon:/var/db/installcoordinationd:/usr/bin/false
_demod:*:275:275:Demo Daemon:/var/empty:/usr/bin/false
_rmd:*:277:277:Remote Management Daemon:/var/db/rmd:/usr/bin/false
_fud:*:278:278:Firmware Update Daemon:/var/db/fud:/usr/bin/false
_knowledgegraphd:*:279:279:Knowledge Graph Daemon:/var/db/knowledgegraphd:/usr/bin/false
_coreml:*:280:280:CoreML Services:/var/empty:/usr/bin/false
_trustd:*:282:282:trustd:/var/empty:/usr/bin/false
_oahd:*:441:441:OAH Daemon:/var/empty:/usr/bin/false

@munkeHoller , I'm not sure if

_fud:*:278:278:2222Firmware 333Update 444Daemon9999:/var/db/fud:/usr/bin/false

should be changed to

_fud:*:278:278:Firmware 333Update 444Daemon9999:/var/db/fud:/usr/bin/false

or to

_fud:*:278:278:Firmware Update Daemon:/var/db/fud:/usr/bin/false

Maybe you meant gsub instead of sub? The OP will have to be more specific about the requirements.
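The two readings can be compared directly on that one record. This is only a sketch using the sample line quoted above; the variable names are made up for illustration, and `-v OFS=:` is used instead of a trailing `OFS=:` operand because the input comes from a pipe.

```shell
line='_fud:*:278:278:2222Firmware 333Update 444Daemon9999:/var/db/fud:/usr/bin/false'

# sub() replaces only the first run of digits in field 5
first_only=$(printf '%s\n' "$line" | awk -F: -v OFS=: '{sub(/[0-9]+/,"",$5)}1')

# gsub() replaces every run of digits in field 5
all_runs=$(printf '%s\n' "$line" | awk -F: -v OFS=: '{gsub(/[0-9]+/,"",$5)}1')

printf '%s\n%s\n' "$first_only" "$all_runs"
```

In both cases the numeric uid/gid fields (3 and 4) are left alone, because the substitution is restricted to $5.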


@vgersh99, edited to use gsub, thanks for the heads up.

As stated, since the 'data' supplied was less than adequate, we're just guessing.


Remove only the trailing number in the gecos field:

awk -F: '{sub(/[0-9]+$/,"",$5)}1' OFS=: passwd

The same with sed (for practical reasons, it locates the gecos field by counting two fields back from the right end):

sed 's/[0-9]*\(\(:[^:]*\)\{2\}\)$/\1/' passwd

While awk takes ERE, sed (by default) takes BRE:
https://en.wikibooks.org/wiki/Regular_Expressions/POSIX_Basic_Regular_Expressions
In sed, \1 inserts the match of the 1st capture group (the "subexpression" that starts with the 1st \( ).
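To make the BRE/ERE difference concrete, here is the same substitution written both ways. This is a sketch that assumes a sed accepting the -E option (GNU sed and BSD sed both do); the sample line is taken from the test data earlier in the thread, and the variable names are only illustrative.

```shell
line='_oahd:*:441:441:OAH Daemon23456:/var/empty:/usr/bin/false'

# BRE (sed default): groups and interval braces need backslashes
bre=$(printf '%s\n' "$line" | sed 's/[0-9]*\(\(:[^:]*\)\{2\}\)$/\1/')

# ERE (sed -E): the same expression without the backslashes
ere=$(printf '%s\n' "$line" | sed -E 's/[0-9]*((:[^:]*){2})$/\1/')

printf '%s\n%s\n' "$bre" "$ere"
```

Both forms should print the record with "OAH Daemon" as the gecos field; \1 puts back the two trailing fields that the outer group captured.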

Looking at the title of this thread, here is the opposite: extract/show only the trailing numbers:

awk -F: '{sub(/.*[^0-9]/,"",$5)} $5!="" {print $5}' OFS=: passwd
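As a quick check of that extraction, the sketch below runs it on two records from the sample data earlier in the thread, one with a trailing number and one without. The variable names are hypothetical, and OFS is left out because only $5 is printed.

```shell
# Two sample records taken from the thread's test data
sample='_oahd:*:441:441:OAH Daemon23456:/var/empty:/usr/bin/false
_demod:*:275:275:Demo Daemon:/var/empty:/usr/bin/false'

# Delete everything up to and including the last non-digit in field 5,
# then print the field only if something (the trailing number) is left
trailing=$(printf '%s\n' "$sample" | awk -F: '{sub(/.*[^0-9]/,"",$5)} $5!="" {print $5}')

printf '%s\n' "$trailing"
```

Records whose gecos field does not end in digits are reduced to an empty string and produce no output line.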

Thank you for the solution. I apologize for the data sample not being clear. Suggestions have been noted.

Apologies. I have taken note of the suggestions on the post. I was trying to be brief but will provide more details in any future posts.

Thank you for responding and the instruction for future posts.

Thank you for the solution. I apologize for not clearly defining my issue. Suggestions have been noted.
