awk is printing folders with only numbers as expected, but I can't explain the 'total' line.

I am trying to get folder names that contain only numbers.

Can someone explain why the following command prints 'total 450' as part of its output?

 
$>  ls -lt | awk '$9 ~ /^[1-9]*$/' | more
total 450
drwxr-x--x   3 user1  group1     512 Mar  9  2008 329227163
drwxr-x--x   3 user1  group1     512 Mar  9  2008 1285642344
drwxr-x--x   3 user1  group1     512 Mar  9  2008 94825883

The rest of the output is as expected.

ls -lt | awk '$9 ~ /^[1-9]+$/' | more

[1-9]* means zero or more, so you're matching a 9th field that is either empty or made up only of the digits 1-9.
You need one or more: [1-9][1-9]* or [1-9]+ (if supported).

Because /^[1-9]*$/ matches zero or more digits.
As there's no field 9 in 'total 450', $9 is the empty string and the condition is true.
Try this instead; you need one or more occurrences of digits:

ls -lt | awk '$9 ~ /^[1-9][1-9]*$/'
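
To see the difference without depending on a real directory, here is a minimal sketch that feeds made-up `ls -lt`-style lines through both patterns and prints the numbers of the lines that match. The sample data and the variable names `sample`, `star` and `plus` are assumptions for illustration only.

```shell
# Hypothetical sample resembling `ls -lt` output (not taken from a real directory).
sample='total 450
drwxr-x--x   3 user1  group1     512 Mar  9  2008 329227163
drwxr-x--x   3 user1  group1     512 Mar  9  2008 tmpdir'

# `*` allows zero characters, so the empty $9 of the "total" line also matches.
star=$(printf '%s\n' "$sample" | awk '$9 ~ /^[1-9]*$/ {print NR}')

# `[1-9][1-9]*` requires at least one digit, so the "total" line is rejected.
plus=$(printf '%s\n' "$sample" | awk '$9 ~ /^[1-9][1-9]*$/ {print NR}')

printf 'zero or more matches lines: %s\n' "$(echo $star)"
printf 'one or more matches lines:  %s\n' "$plus"
```

With `*`, line 1 (the 'total' line) is reported alongside line 2, because an empty 9th field satisfies "zero or more digits". With `[1-9][1-9]*`, only line 2 is reported.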

Wow, so quick, and so many working solutions. Thanks to everyone. All the solutions work.

Yes, ^[1-9]+$ requires one or more digits, so only names made up entirely of digits are printed. Thanks to bartus11.

ls -lt | awk '$9 ~ /^[1-9]+$/'

Just to add that recent shells support extended globbing and if you're using one of them, you don't need awk for this task.

With ksh:

ls -lt !(*[!1-9]*)

With bash:

shopt -s extglob
ls -lt !(*[!1-9]*)

Or 0-9, depending on your needs.

With zsh:

ls -lt <->

It appears that on Solaris 8 I have the extended set here --> /usr/xpg4/bin/ls
But I am not sure where in your regular expression you are telling it to act only on folder names (column 9).

Also, why can't we tell the extended globber to understand something like this -->



Those are the X/Open versions of some Unix programs. Filename generation (globbing), however, is a shell feature:
the ls program never sees the glob, only the expanded list of filenames.

I'm just passing the filenames to the ls program. ls operates directly on filenames, so there is no need to specify any column
(if I understand the question correctly ...)

No, but you can use the pattern I posted.

Do you need user/group/size/etc?

If not, you can use:

# touch 1 2 3 4
# ls -1t | egrep '^[0-9]+$'
1
2
3
4

And you don't even need the -1 option in this context.

P.S. Actually, if you don't need the modification time info, you don't need the external commands ls and grep either. With ksh (or bash with extglob enabled) you could simply use:

printf '%s\n' !(*[!0-9]*)
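
For a shell without extended globbing, the same "no non-digit character" test can be sketched with a plain POSIX case statement. This is a swapped-in technique, not the extglob pattern itself, and the function name `is_all_digits` and the sample names are made up for illustration.

```shell
# Succeed only if $1 is non-empty and contains no non-digit character.
# The pattern *[!0-9]* is the same negated class used inside !(*[!0-9]*).
is_all_digits() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical names; in a real directory you would loop over * instead.
for name in 329227163 sjs_cache 12ab 007; do
  if is_all_digits "$name"; then
    printf '%s\n' "$name"
  fi
done
```

Of the four sample names, only 329227163 and 007 are printed. Use [!1-9] instead of [!0-9] if names containing a zero should be excluded.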
shopt -s extglob
ls -lt !(*[!1-9]*)   # This lists the contents of the folders it finds. I wanted only the folder names
ls -lt <->   # This also lists the contents of the folders it finds. I wanted only the folder names

The commands above would have listed only the matching folders if something like this could be made to work

Regarding why I started using awk instead of egrep: I also needed to filter folders based on their owner. Here is what I have working so far. I am trying to delete a lot of old folders under /var/tmp.

VAR_USERS='userA|userB|userC'
/usr/xpg4/bin/awk -v vUsers="$VAR_USERS" '$3 ~ vUsers && $9 ~ /^[1-9]+$|sjs*|JCE*/ {print $9}'

Ideally I would want to use a parameter for the folder-name pattern as well, but I could not get it to work.

VAR_USERS='userA|userB|userC'  # Folders owned by only these users.
VAR_FILTER='^[1-9]+$|sjs*|JCE*'    # (Folders with only Numbers) OR (that start with 'sjs') OR (that start with 'JCE')

/usr/xpg4/bin/awk -v vUsers="$VAR_USERS" -v vFilter="$VAR_FILTER" '$3 ~ vUsers && $9 ~ vFilter {print $9}'

I am all for not using expensive awk and getting this done with printf, bash's extglob, or egrep instead (but I want to use parameters for all my search strings).



This should work

ls -lt | awk '/^d/&&($NF+0)==$NF'

So just add -d option to ls:

ls -ltd ...
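
A small sketch of why -d matters, run in a throwaway directory. The directory layout, the variable names, and the plain glob [0-9]* (used here as a stand-in for the extended pattern, so it only checks the first character) are assumptions for illustration.

```shell
# Build a scratch directory with one numeric folder that has a file inside.
tmp=$(mktemp -d)
mkdir "$tmp/123"
touch "$tmp/123/inner.txt" "$tmp/notes"

# Without -d, ls descends into the matching directory and lists its contents.
without_d=$(cd "$tmp" && ls [0-9]*)

# With -d, ls lists the directory entry itself.
with_d=$(cd "$tmp" && ls -d [0-9]*)

printf 'without -d: %s\n' "$without_d"
printf 'with -d:    %s\n' "$with_d"

rm -rf "$tmp"
```

Here the first command reports inner.txt (the contents of 123), while the second reports 123 itself, which is what the question was after.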

Well, as it often happens, you begin with a simple question and you end up with a completely different requirement.
You do use parameters for all your search strings. Could you please elaborate further?

This works -->

VAR_USERS='userA|userB|userC'  # Folders owned by only these users.
/usr/xpg4/bin/awk -v vUsers="$VAR_USERS" '$3 ~ vUsers && $9 ~ /^[1-9]+$|sjs*|JCE*/ {print $9}'

This is not working -->

VAR_USERS='userA|userB|userC'  # Folders owned by only these users.
VAR_FILTER='^[1-9]+$|sjs*|JCE*'    # (Folders with only Numbers) OR (that start with 'sjs') OR (that start with 'JCE')

/usr/xpg4/bin/awk -v vUsers="$VAR_USERS" -v vFilter="$VAR_FILTER" '$3 ~ vUsers && $9 ~ vFilter {print $9}'

How can I pass regular expression as a parameter VAR_FILTER ?

I suppose you need a slightly different regex:

VAR_FILTER='^([1-9]+$|(sjs|JCE))'

# Folders owned by only these users.
VAR_USERS='userA|userB|userC'
# (Folders with only Numbers) OR (start with 'sjs') OR (start with 'JCE')
VAR_FILTER='^[1-9]+$|sjs*|JCE*'
VAR_FILTER='^([1-9]+$|(sjs|JCE))'

ls -lt | /usr/xpg4/bin/awk -v vUsers="$VAR_USERS" -v vFilter="$VAR_FILTER" '$3 ~ vUsers && $9 ~ vFilter {print $9}'

Both versions of VAR_FILTER work when passed as variables. My mistake was that I had not fed awk any input (the ls -lt | at the start of the command).
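
For anyone who wants to try the variable-based filter without touching a real /var/tmp, here is a minimal sketch using made-up input. The sample lines, the extra owner userZ, and the use of plain awk instead of /usr/xpg4/bin/awk are assumptions; on Solaris you would keep the xpg4 path.

```shell
# Hypothetical `ls -lt`-style lines; $3 is the owner and $9 is the name.
sample='total 12
drwxr-x--x 3 userA group1 512 Mar  9  2008 329227163
drwxr-x--x 3 userB group1 512 Mar  9  2008 sjs_cache
drwxr-x--x 3 userZ group1 512 Mar  9  2008 94825883
drwxr-x--x 3 userC group1 512 Mar  9  2008 notes'

VAR_USERS='userA|userB|userC'          # owners to keep
VAR_FILTER='^([1-9]+$|(sjs|JCE))'      # all digits, or starts with sjs or JCE

# A string on the right-hand side of ~ is treated by awk as a dynamic regex.
matched=$(printf '%s\n' "$sample" |
  awk -v vUsers="$VAR_USERS" -v vFilter="$VAR_FILTER" \
      '$3 ~ vUsers && $9 ~ vFilter {print $9}')

printf '%s\n' "$matched"
```

This prints 329227163 and sjs_cache. The userZ folder is dropped by the owner test and notes by the name test. One thing to watch with -v: awk processes backslash escapes in the assigned value, so a pattern containing backslashes may need them doubled.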

Thanks a lot to everyone that helped. This post is closed.

They certainly do different things, though:

sjs* and JCE* match sj and JC, respectively, anywhere in the filename (s* and E* mean zero or more, not one or more).
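
The difference can be seen with a few made-up names (the list and the variable names `loose` and `strict` are assumptions for illustration):

```shell
# Hypothetical folder names, one per line.
names='sjs_cache
JCE_tmp
my_sj_backup
123'

# Unanchored alternatives: "sj" or "JC" anywhere in the name is enough.
loose=$(printf '%s\n' "$names" | awk '$0 ~ /^[1-9]+$|sjs*|JCE*/')

# Anchored group: all digits, or a name that starts with sjs or JCE.
strict=$(printf '%s\n' "$names" | awk '$0 ~ /^([1-9]+$|(sjs|JCE))/')

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

The loose pattern keeps all four names, including my_sj_backup, while the anchored one keeps only sjs_cache, JCE_tmp and 123.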



VAR_FILTER='^[1-9]+$|sjs*|JCE*' 

You are correct: the above would print a folder if sj or JC is found anywhere in its name. I wanted to find names starting with either sjs or JCE, so I will go with the following.