Use awk to print the fourth column when it contains spaces

Hi Gurus,
We have an FTP server on which we run a dir command and save the output to a local file.
The content of ftpfile is:

07-15-09  06:06AM                 5466 ABC_123_ER19057320090714082723.ZIP
07-15-09  06:07AM                 3801 ABC_123_ER19155920090714082842.ZIP
07-15-09  06:07AM                 2034 ABC_123_ER19257020090714083003.ZIP
07-15-09  06:07AM                 5346 ABC_123_ER19456120090714083105.ZIP
07-15-09  06:07AM                50188 ABC_123 4507131004299717363.ZIP
07-15-09  06:07AM                10867 ABC_69 ER194561.ZIP
07-15-09  06:07AM                73183 ABC_69_ER194631.ZIP
07-15-09  06:07AM                 1576 ABC_69_ER195427.ZIP
07-15-09  06:07AM                 5880 ABC_69_ER195428.ZIP

I generally use awk '{print $4}' ftpfile. But I realized that file names on the FTP server might contain spaces.
So I came up with this command:

awk '{i=4;while (NF>=i) {print $i;i++}}' ftplist

But the above command starts a new line for every field it prints. I want the whole name on one line. I tried using cut, but it doesn't help.
As a stopgap, I am currently using the command below:
awk '{if (NF!=4) {print $4,$NF}else {print $4}}' ftplist
The above command prints what I want, but it assumes that the filename contains only one space.
Please suggest.

Try:

awk '{for (i=4; i<=NF; i++) printf "%s%s", $i, (i<NF ? " " : "\n")}' ftpfile

Or:

awk 'sub(/[^ ]* *[^ ]* *[^ ]* */,"")' ftpfile

If you have GNU awk (gawk) you can avoid the repeating pattern using the --re-interval option.

And, of course, you can use sed for this.
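
A sed version, as a minimal sketch: it deletes three "field plus trailing spaces" groups from the start of each line and leaves the rest untouched, so spaces inside the file name survive. The sample lines and the function name strip_three_fields are only for illustration; the \{3\} interval is standard basic-regex syntax, so this should work with any POSIX sed.

```shell
# Two sample lines in the same layout as the dir listing (illustrative data).
listing='07-15-09  06:07AM                50188 ABC_123 4507131004299717363.ZIP
10-14-09  08:57AM               304435 091009  ABC DEF'

# Remove the first three fields, each followed by its run of spaces.
strip_three_fields() {
  sed 's/^\([^ ]* *\)\{3\}//'
}

printf '%s\n' "$listing" | strip_three_fields
```

The double space in the second name is kept because sed only edits the start of the line and never re-splits it.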

Or plain with awk:

awk '{$1=$2=$3="";sub("   ","")}1' file

Regards

Thanks, that command helped. I have one more question: I am unable to use shell variables in awk. Could you help me with that as well? I have a sample file like this:

PROJECT PATH
-----------------
ABC  /ABC/Datafiles/FILES
CLNT /CLNT/DataFiles/FILES 
BHL    /BHL/MISC/FILES
JMH /JMH/DataFiles/FILES
SMS /CLA/Datafiles/FILES
PCS /CLA/Datafiles/FILES

Now I have a script where the user enters a project name, and I grep the file for it to get the path. But I realized that grep matches anywhere on the line rather than only in the first column, so whatever the user types can match the wrong line and produce undesirable output.

My desired output: if the user gives PCS, he should see /CLA/Datafiles/FILES

I initially used:
grep "$a" filename|awk '{print $2}'
where a is the name of the project which user gives.
I tried to use:
awk '{if ($1=$a) {print $2}else{print "WRONG INPUT"}}' filename

But the above awk doesn't work. Please help
Thanks

You could do something like:

awk 'BEGIN {
  printf "Enter project name: "
  getline name < "-"
}
$1==name {print $2}' file

I found one more alternative:
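awk '{if ( $1== "'"$a"'") {print $2}}' filename

Enclose the shell variable in double quotes, single quotes, double quotes. The outer double quotes delimit the awk string; the single quotes end and restart the quoted awk program so that the shell expands "$a" in between.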
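
Another common way to get a shell variable into awk is the -v option, which avoids the quote juggling. A minimal sketch, assuming the two-column layout of the sample file; the function name lookup_path and the inline sample data are made up for illustration:

```shell
# Inline stand-in for the project file (same layout as the sample above).
projects='PROJECT PATH
-----------------
ABC  /ABC/Datafiles/FILES
PCS /CLA/Datafiles/FILES'

# lookup_path NAME: read the project list on stdin and print the path
# whose first column equals NAME; report bad input otherwise.
lookup_path() {
  awk -v name="$1" '
    $1 == name { print $2; found = 1 }
    END        { if (!found) print "WRONG INPUT" }
  '
}

printf '%s\n' "$projects" | lookup_path PCS   # prints /CLA/Datafiles/FILES
printf '%s\n' "$projects" | lookup_path XYZ   # prints WRONG INPUT
```

Because $1 == name compares against the whole first column, a partial name such as PC matches nothing. Note that -v interprets backslash escapes in the value.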

Your script also works..

Thank you.

Hi Franklin,
Could you please explain your command, if you don't mind?

Regards,
Don

awk '{$1=$2=$3="";sub("   ","")}1' file

Explanation

Empty the fields 1, 2 and 3:

$1=$2=$3=""

Remove the three leading spaces (the field separators left behind by the emptied fields):

sub("   ","")

The trailing 1 is a pattern that is always true; with no action given, awk applies its default action and prints the record:

1
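
One side effect is worth knowing: assigning to any field makes awk rebuild the whole record, joining the fields with the output field separator (a single space by default). Runs of spaces inside the remaining fields are therefore squeezed. A small sketch with a made-up sample line in the same layout as the listing; the second command edits $0 directly with sub() and so keeps the original spacing:

```shell
# Sample line whose file name contains a double space (illustrative data).
line='10-14-09  08:57AM               304435 091009  ABC DEF'

# Assigning to a field rebuilds $0 with single spaces between fields.
printf '%s\n' "$line" | awk '{ $1 = $1 } 1'
# prints: 10-14-09 08:57AM 304435 091009 ABC DEF

# Editing $0 itself does not re-split the record, so the spacing survives.
printf '%s\n' "$line" | awk '{ sub(/^[^ ]+ +[^ ]+ +[^ ]+ +/, "") } 1'
# prints: 091009  ABC DEF
```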

Regards

cut can do this..

-bash-3.2$ cat test
07-15-09  06:06AM                 5466 ABC_123_ER19057320090714082723.ZIP
07-15-09  06:07AM                 3801 ABC_123_ER19155920090714082842.ZIP
07-15-09  06:07AM                 2034 ABC_123_ER19257020090714083003.ZIP
07-15-09  06:07AM                 5346 ABC_123_ER19456120090714083105.ZIP
07-15-09  06:07AM                50188 ABC_123 4507131004299717363.ZIP
07-15-09  06:07AM                10867 ABC_69 ER194561.ZIP
07-15-09  06:07AM                73183 ABC_69_ER194631.ZIP
07-15-09  06:07AM                 1576 ABC_69_ER195427.ZIP
07-15-09  06:07AM                 5880 ABC_69_ER195428.ZIP
07-15-09  06:07AM                10867 ABC_69 ER194561 test1.zip
07-15-09  06:07AM                10867 ABC_69 ER194561 test2.ZIP
07-15-09  06:07AM                10867 ABC_69 ER194561 test3.ZIP
-bash-3.2$ cat test | tr -s " " | cut -d ' ' -f4-
ABC_123_ER19057320090714082723.ZIP
ABC_123_ER19155920090714082842.ZIP
ABC_123_ER19257020090714083003.ZIP
ABC_123_ER19456120090714083105.ZIP
ABC_123 4507131004299717363.ZIP
ABC_69 ER194561.ZIP
ABC_69_ER194631.ZIP
ABC_69_ER195427.ZIP
ABC_69_ER195428.ZIP
ABC_69 ER194561 test1.zip
ABC_69 ER194561 test2.ZIP
ABC_69 ER194561 test3.ZIP
-bash-3.2$

Hi Franklin and ryandegreat,
Thanks for the replies. Both your commands work fine as long as the fourth field contains only single spaces.
If the fourth field contains a run of more than one space, the run is reduced to a single space, e.g.

My input is:

10-14-09  08:57AM               164117 091008 ABC DEF
10-14-09  08:57AM               304435 091009  ABC DEF
10-14-09  08:57AM               199438 091013   ABC DEF
10-14-09  08:57AM                  974 10_09_2009.160923090008095.abc.def

My output is:

091008 ABC DEF
091009 ABC DEF
091013 ABC DEF
10_09_2009.160923090008095.abc.def

My command shouldn't behave that way ...

Use nawk or /usr/xpg4/bin/awk on Solaris if you get errors:

awk -F" |!" '{$1=$2=$3="";sub(" {1,}","")}1' file | sed 's/[^ ]* //'

The command doesn't work ..
My output should not trim the spaces in the fourth field. It doesn't have to be an awk command; please tell me if there is any other way to extract the fourth field. The problem is that the fourth field is actually the file name on the FTP server, and I am automating the transfer.
So, it has to copy the file.
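
One approach that does not need awk at all, as a sketch: the shell's read builtin splits off the first three fields and assigns the rest of the line, inner spaces included, to the last variable. The variable names and the function name list_names are illustrative, and the two sample lines stand in for the saved listing:

```shell
# Stand-in for the saved dir listing (illustrative data).
listing='10-14-09  08:57AM               304435 091009  ABC DEF
10-14-09  08:57AM                  974 10_09_2009.160923090008095.abc.def'

# Print the file name part of each dir line read from stdin.
list_names() {
  while read -r fdate ftime fsize name; do
    printf '%s\n' "$name"
  done
}

printf '%s\n' "$listing" | list_names
```

In a real script the loop body would pass "$name" (quoted) to the copy or get command. Leading and trailing spaces of a name are still stripped by read.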

---------- Post updated at 07:57 PM ---------- Previous update was at 07:51 PM ----------

Hi Radulov,
Your command works great!! Thanks a ton.

Would request you to post explanation please...
Don

With a slight modification of the sed command:

awk -F' |!' '{$1=$2=$3="";sub(" {1,}","")}1' file | sed 's/^[^ ]* //'