awk to find number in a field then print the line and the number

Hi

I want to use awk to find lines where field 3 contains a number within a string, then print the line with just the number appended as a new field.
The source file is pipe-delimited and looks something like this:

1|net|ABC Letr1|1530|||
1|net|EXP_1040 ABC|1121|||
1|net|EXP_TG1224|1122|||
1|net|R_North|1123|||
1|net|RExp 123 456 X|1234|||

What I want as an output is:

1|net|ABC Letr1|1530|||1|
1|net|EXP_1040 ABC|1121|||1040|
1|net|EXP_TG1224|1122|||1224|
1|net|RExp 123 456 X|1234|||123 456|

i.e. field $7 holds just the number from field $3.
I got as far as matching the lines where field 3 contains a number
(the input file is called 'x'):

cat x|gawk -F"|" '$3 ~ /[0-9]/'

but various attempts with substr have so far failed.
Any help appreciated.

thanks

Hello Mudshark,

The following may help.

awk -F"|" '{Q=$3;gsub(/[[:alpha:]]|[[:punct:]]/,X,Q);sub(/^[[:space:]]/,X,Q);sub(/[[:space:]]+$/,X,Q);$(NF)=Q;if(Q){print $0 OFS}}' OFS="|" Input_file
 

Output will be as follows.

1|net|ABC Letr1|1530|||1|
1|net|EXP_1040 ABC|1121|||1040|
1|net|EXP_TG1224|1122|||1224|
1|net|RExp 123 456 X|1234|||123 456|
 

EDIT: Adding a non-one-liner version of the same solution.

awk -F"|" '{
    Q=$3;
    gsub(/[[:alpha:]]|[[:punct:]]/,X,Q);
    sub(/^[[:space:]]/,X,Q);
    sub(/[[:space:]]+$/,X,Q);
    $(NF)=Q;
    if(Q){
        print $0 OFS
    }
}' OFS="|" Input_file
 

Thanks,
R. Singh


Try also

awk -F\| '{$7=$3; $8=""; gsub(/^[^0-9]*|[^0-9]*$/, "",$7)} $7' OFS=\| file
1|net|ABC Letr1|1530|||1|
1|net|EXP_1040 ABC|1121|||1040|
1|net|EXP_TG1224|1122|||1224|
1|net|RExp 123 456 X|1234|||123 456|

Ravinder, instead of listing the alpha and punct classes, you could simply use:

gsub(/[^[:digit:]]/,X,Q)

This works with the English description of the desired output, but doesn't preserve the space between strings of digits as shown in the last line of the desired output. To get the desired output, one could use:

gsub(/[^[:digit:] ]/,X,Q)

if only <space> characters are to be allowed between strings of digits, or:

gsub(/[^[:digit:][:space:]]/,X,Q)

if any combination of <space> and <tab> characters is to be allowed (as assumed by Ravinder's following sub() statements, which get rid of leading and trailing space class characters).
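
To make the difference between the three patterns concrete, here is a small sketch that applies each of them to the last sample value from the question. The function name strip_demo is made up for this illustration, and the sketch assumes an awk that understands POSIX character classes (gawk does; very old mawk versions may not).

```shell
# Apply the three gsub() patterns discussed above to one string and
# print each result in brackets so leading/trailing blanks are visible.
# strip_demo is a hypothetical helper name used only for this sketch.
strip_demo() {
  printf '%s\n' "$1" | awk '{
    a = b = c = $0
    gsub(/[^[:digit:]]/, "", a)            # keep digits only
    gsub(/[^[:digit:] ]/, "", b)           # keep digits and <space>
    gsub(/[^[:digit:][:space:]]/, "", c)   # keep digits and any whitespace
    printf "[%s] [%s] [%s]\n", a, b, c
  }'
}

strip_demo 'RExp 123 456 X'
# prints: [123456] [ 123 456 ] [ 123 456 ]
```

The first pattern glues the two digit strings together, while the other two keep the separating space but also leave a leading and a trailing blank that still has to be trimmed.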

Note that Ravinder's suggestion to use:

                sub(/^[[:space:]]/,X,Q);
                sub(/[[:space:]]+$/,X,Q);

after getting rid of alphabetic and punctuation class characters only removes one leading whitespace character (when it seems that there could be more than one). And, both of these could be replaced by a single gsub() call:

                gsub(/^[[:space:]]+|[[:space:]]+$/,X,Q);

or, if only space characters are to be allowed between digit strings:

                gsub(/^ +| +$/,X,Q);
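
Putting these suggestions together, a complete command might look like the sketch below. It is one possible combination, not Ravinder's original: the names sample and extract_numbers are invented for the example, and it assumes that only <space> characters separate the digit strings and that every input line ends with an empty last field, as in the sample data.

```shell
# Sample data copied from the question.
sample='1|net|ABC Letr1|1530|||
1|net|EXP_1040 ABC|1121|||
1|net|EXP_TG1224|1122|||
1|net|R_North|1123|||
1|net|RExp 123 456 X|1234|||'

# extract_numbers is a hypothetical helper name for this sketch.
extract_numbers() {
  awk -F'|' -v OFS='|' '{
    q = $3
    gsub(/[^[:digit:] ]/, "", q)   # drop everything except digits and spaces
    gsub(/^ +| +$/, "", q)         # trim leading and trailing spaces
    if (q != "") {                 # skip lines with no digits in field 3
      $NF = q                      # the empty last field receives the number
      print $0 OFS                 # put the trailing delimiter back
    }
  }'
}

printf '%s\n' "$sample" | extract_numbers
```

On the sample data this should reproduce the four output lines requested in the question. Note that digit strings separated by letters plus spaces (for example "12 ab 34") would keep both spaces between them.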

Many thanks, all, for your help on this.
Both suggestions worked perfectly on my example data.

Appreciate the help.

Thanks Don for the explanations.