Need HELP with AWK split. Need to check for "special characters" in string before splitting the file

Hi Experts.

I'm stuck with the AWK code below, where I'm trying to move the records that contain any special characters in the last field to a bad file.

awk -F, '{if ($NF ~ /^[0-9]|^[A-Za-z]/) print >"goodfile";else print >"badfile"}' filename

sample data

1,abc,def,1234,A *
2,bed,dec,342,* A         
3,dec,345,23,*&^          
4,sdf,fgh,234,  
5,ert,345,ghj,C**
6,ert,345,sdf,123          ---- only valid record

The required output must contain the first 5 records in badfile and the last record in goodfile.
But my awk logic above considers only the records below as badfile records:

2,bed,dec,342,* A 
3,dec,345,23,*&^ 
4,sdf,fgh,234, 

The other two invalid records ("A *" and "C**") are being written to goodfile, which is wrong.
Please help me fix this.

Note: the $NF values can contain spaces between alphanumeric characters. However, a value that is all spaces or empty is considered a bad record.

Thanks Gurus!

Tune your regexp to:

awk ' /[0-9]$|[A-Za-z]$/{print >"goodfile";next}{print >"badfile"}' infile

I tuned the regex as suggested, but it does not give the required output.
Code used:

awk -F, '{if ($NF ~ /[0-9]$|[A-Za-z]$/) print >"goodfile"; else print >"badfile"}' samp.txt

samp.txt

1,abc,def,1234,A *
2,bed,dec,342,* A
3,dec,345,23,*&^
4,sdf,fgh,234,
5,ert,345,ghj,C*2
6,ert,345,sdf,123

Output

$ cat goodfile
2,bed,dec,342,* A
5,ert,345,ghj,C*2
6,ert,345,sdf,123
$ cat badfile
1,abc,def,1234,A *
3,dec,345,23,*&^
4,sdf,fgh,234,

Check your syntax:

awk -F, '{if ($NF ~ /[0-9]$|[A-Za-z]$/) {print >"goodfile"} else {print >"badfile" }}'  infile

Sorry, Klashxx.
It gives the same output.

$ cat samp.txt
1,abc,def,1234,A *
2,bed,dec,342,* A
3,dec,345,23,*&^
4,sdf,fgh,234,
5,ert,345,ghj,C*2
6,ert,345,sdf,123

$ awk -F, '{if ($NF ~ /[0-9]$|[A-Za-z]$/) {print >"goodfile"} else {print >"badfile" }}' samp.txt

$ cat goodfile
2,bed,dec,342,* A
5,ert,345,ghj,C*2
6,ert,345,sdf,123

$ cat badfile
1,abc,def,1234,A *
3,dec,345,23,*&^
4,sdf,fgh,234,


I'm using the Korn shell. Maybe that makes a difference?

That regex only looks at the last character of $NF. This

awk -F, '{if ($NF~/[0-9A-Za-z][0-9A-Za-z][0-9A-Za-z]/) print >"goodfile"; else print >"badfile"}' samp.txt

will work on the example, but it does not take into account the possibly variable length of $NF. The repetition term /.../{length($NF)} does not seem to work, nor does the regex [[:alnum:]] construct.
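
One way to cope with a variable-length $NF without interval expressions or POSIX character classes is to test for what must not be there: reject the field if it contains any character other than a letter, a digit or a space, and separately require at least one letter or digit so that an empty or all-space field is rejected too. A minimal sketch, assuming the samp.txt data posted earlier in the thread and the same goodfile/badfile names (the workdir variable is hypothetical and only keeps the demo files out of the way):

```shell
# Work in a scratch directory so the demo does not clobber real files.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample data as posted in the thread.
cat > samp.txt <<'EOF'
1,abc,def,1234,A *
2,bed,dec,342,* A
3,dec,345,23,*&^
4,sdf,fgh,234,
5,ert,345,ghj,C*2
6,ert,345,sdf,123
EOF

awk -F, '{
    # good: no character outside letters, digits and space,
    #       and at least one letter or digit somewhere in the field
    if ($NF !~ /[^0-9A-Za-z ]/ && $NF ~ /[0-9A-Za-z]/)
        print > "goodfile"
    else
        print > "badfile"
}' samp.txt
```

With these six records, only record 6 ends up in goodfile. Note that a stray carriage return at the end of a line would count as a special character in $NF and send the record to badfile.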


Try this:

awk -F, '{if($NF~/^[[:alnum:][:blank:]]+$/) print > "goodfile"; else print > "badfile"}' infile
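
One caveat with the pattern above: a last field made up only of spaces still matches, although the original note says such records are bad. A possible tweak, sketched here under the assumption that your awk supports POSIX character classes, is to add a second test that requires at least one alphanumeric character. Records 4 and 6 are from the thread; records 7 and 8 and the file name extra.txt are made up for illustration:

```shell
# Work in a scratch directory so the demo does not clobber real files.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Records 7 and 8 are hypothetical: one with an embedded space, one all spaces.
printf '%s\n' \
    '4,sdf,fgh,234,' \
    '6,ert,345,sdf,123' \
    '7,ert,345,sdf,ab 12' \
    '8,ert,345,sdf,   ' > extra.txt

awk -F, '{
    # first test: only alphanumerics and blanks from start to end of $NF
    # second test: at least one alphanumeric, so blank-only fields fail
    if ($NF ~ /^[[:alnum:][:blank:]]+$/ && $NF ~ /[[:alnum:]]/)
        print > "goodfile"
    else
        print > "badfile"
}' extra.txt
```

Here records 6 and 7 go to goodfile, while the empty field (record 4) and the all-space field (record 8) go to badfile.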