Help with awk/shell script

I would like to write a shell script that reads a list of filenames (with absolute paths) from a file and, for each one, determines the filename component (without the path and the file extension) and the file extension (if any).

An example list of filenames is below:

/home/mark.williams/a.java
/usr/local/bin/perl
/usr/local/bin/gtar
/usr/local/packages/abcd.efgh.ijkl.mnop.xml

So the filenames may or may not have extensions and also, the filenames might have special characters like ".", "_" embedded in them.

Any help will be really appreciated!

so where is your shell script?
here's mine:

awk 'BEGIN{FS="/"}{print $NF}' file

Here is my script

for cfilename in `cat filelist.txt | sort -u`
do
file_extension=`echo $cfilename | awk -F\/ '{print $NF}' | awk -F. '{print $NF}'`
export file_extension
filename=$(basename $cfilename | awk -F. -v OFS=. '{$NF=""; sub("[.]$", ""); print}')
export filename
filename_w_ext=$filename.$file_extension
export filename_w_ext
done

But the problem is that this script craps out when it encounters files with NO extension, e.g. /usr/local/bin/perl
So I have to make it a little more robust so that it also handles filenames with no extension.

I guess, your script just returns the last field (NF) of the record, which unfortunately doesn't solve my problem. But thanks for responding to my post.

For filenames with no embedded new lines:

$ cat infile
/home/mark.williams/a.java
/usr/local/bin/perl
/usr/local/bin/gtar
/usr/local/packages/abcd.efgh.ijkl.mnop.xml

$ while read;do f="${REPLY##*/}";echo "${f%.*}";done<infile
a
perl
gtar
abcd.efgh.ijkl.mnop
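
For readers unfamiliar with the expansions used in that loop, here is a minimal sketch of what each operator removes. The variable names are only illustrative, and the sample path is the last one from the list in the question.

```shell
#!/bin/sh
# Split one path with POSIX parameter expansion only (no external commands).
path=/usr/local/packages/abcd.efgh.ijkl.mnop.xml

file=${path##*/}    # remove the longest prefix matching "*/"  -> base name
name=${file%.*}     # remove the shortest suffix matching ".*" -> drops only the last extension
first=${file%%.*}   # remove the longest suffix matching ".*"  -> keeps only the text before the first dot

printf '%s\n' "$file" "$name" "$first"
```

The difference between % and %% is what decides how a name such as abcde.tar.gz is cut: % leaves abcde.tar, while %% leaves abcde.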

With zsh:

while read;do print "$REPLY:t:r";done<infile

If your filenames have no embedded spaces or other pathological characters:

With zsh:

set -- $(<infile)
print -l "$@:t:r"

And if your extension is .tar.gz?

Thanks for your reply!

Your solution works fine to get the filename component (without the extension) and so does mine.

My problem is getting the file extension (especially returning null when there is no extension per se, e.g. /usr/local/bin/perl). That's the part I am struggling with.

Also, if the file is /tmp/abcde.tar.gz, I would like to return "gz" as the extension.
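
Since the thread started with awk, here is one possible single-pass awk sketch of that requirement: an empty extension when the base name has no dot, and only the text after the last dot otherwise. The function name split_names and the "name|extension" output format are assumptions made for illustration, not something from the posts above.

```shell
#!/bin/sh
# split_names (hypothetical helper): reads paths on stdin and prints
# "name|extension" per line; the extension is empty when there is none.
split_names() {
  awk -F/ '{
    base = $NF                          # last "/"-separated field is the file name
    if (match(base, /\.[^.]*$/)) {      # last dot and everything after it
      name = substr(base, 1, RSTART - 1)
      ext  = substr(base, RSTART + 1)   # e.g. "gz" for abcde.tar.gz
    } else {
      name = base                       # no dot in the base name
      ext  = ""
    }
    print name "|" ext
  }'
}

printf '%s\n' /home/mark.williams/a.java /usr/local/bin/perl /tmp/abcde.tar.gz | split_names
```

Because the match is applied to the last field only, a dot in a directory name (as in /home/mark.williams) does not affect the result.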

Could you post the expected output?

while read mLine
do
  mNameAndExt=`basename $mLine`
  mName=`echo $mNameAndExt | sed 's;\(.*\)\..*;\1;'`
  if [ ${#mNameAndExt} -eq ${#mName} ]; then
    mExt=''
  else
    mExt=`echo $mNameAndExt | sed 's/.*\.//'`
  fi
done < input_file

You can do it entirely within the shell; there is no need for external commands:


while IFS= read -r path
do
  file=${path##*/}
  ext=${path##*.}
  printf "PATH:  %s\nFILE: %s\n EXT: %s\n\n" "$path" "$file" "$ext"
done < FILENAME

###

Do you mean something like this?
(this is zsh)

$ while read;do print "name: \"$REPLY:t:r\"  extension: \"${${REPLY:e}:-null}\"";done<infile
name: "a"  extension: "java"
name: "perl"  extension: "null"
name: "gtar"  extension: "null"
name: "abcd,efgh_ijkl.mnop"  extension: "xml"

Thanks guys for your quick replies! Here is a summary of my testing of the solutions proposed above:

cfajohnson - For some reason your solution returned PATH value for EXT when filename didn't have extension (e.g. /usr/local/bin/perl)
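
The likely reason, sketched below: when the pattern in ${var##pattern} matches nothing, the value is left unchanged, so a path with no dot comes back whole. The second path is a made-up example showing that a dot in a directory name causes a similar surprise.

```shell
#!/bin/sh
# When the pattern does not match, ${var##pattern} returns the value unchanged.
p1=/usr/local/bin/perl
ext1=${p1##*.}                 # no dot anywhere, so nothing is removed

# Hypothetical path: the only dot is in a directory name.
p2=/home/mark.williams/notes
ext2=${p2##*.}                 # everything after that dot, slash included

# Checking the base name for a dot first avoids both cases.
f2=${p2##*/}
case $f2 in
  *.*) ext3=${f2##*.} ;;
  *)   ext3= ;;
esac

printf '[%s]\n' "$ext1" "$ext2" "$ext3"
```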

radoulov - Sorry couldn't test your solution because I am using sh and ksh only on our boxes. So can't really tell whether it works

Shell Life - Your solution worked like a charm and handled all the file types I have in my test environment

thanks again for helping me out! appreciate it.

So here is the solution I ended up using in my script:

#################################################

#!/bin/sh

for cfilename in `cat filelist.txt | sort -u`
do
filename_w_ext=`basename $cfilename`
filename=`echo $filename_w_ext | sed 's;\(.*\)\..*;\1;'`
if [ ${#filename_w_ext} -eq ${#filename} ]; then
file_extension=''
else
file_extension=`echo $filename_w_ext | sed 's/.*\.//'`
fi
echo $filename_w_ext
echo $filename
echo $file_extension
done

Which is why I followed it up with a correction.

That will fail if any of the lines contain spaces.
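
A small demonstration of the difference, assuming mktemp is available; the temporary list and the path containing a space are made up for this sketch.

```shell
#!/bin/sh
# Build a two-line list in which one path contains a space.
list=$(mktemp)
printf '%s\n' '/tmp/my report.txt' '/usr/local/bin/perl' > "$list"

# for + command substitution splits on any whitespace: 3 words from 2 lines.
for_count=0
for p in $(sort -u "$list"); do
  for_count=$((for_count + 1))
done

# while + read consumes one whole line per iteration: 2 paths from 2 lines.
read_count=0
while IFS= read -r p; do
  read_count=$((read_count + 1))
done < "$list"

echo "for loop saw $for_count items, read loop saw $read_count"
rm -f "$list"
```

The read loop here takes its input from a redirection rather than a pipe so that the counter is still set after the loop; in many shells the body of a piped while loop runs in a subshell.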

And cat is unnecessary:

for cfilename in `sort -u filelist.txt`

If you want the file sorted, and duplicates removed, use:

sort -u filelist.txt | while IFS= read -r path
do
  file=${path##*/}
  case $file in
    *.*) ext=${path##*.} ;;
    *) ext= ;;
  esac
  printf "PATH:  %s\nFILE: %s\n EXT: %s\n\n" "$path" "$file" "$ext"
done



###

This script will be orders of magnitude faster than one which calls external commands (three of them!) for every line of the file.