Help with extract using awk

dsravan · May 18, 2010, 4:30pm

I am trying to get the filenames, which are field $9 in the output of ls -lrt,

but I am getting only part of some names with the command below.

ls -lrt *.edf

Is it because the filenames have spaces in them?
How can I get the complete file name in cases like this?

m_kapur83 · May 18, 2010, 4:50pm

Why don't you try:
ls -a *.edf > output

dsravan · May 18, 2010, 4:58pm

I also need the file size along with the name, so I can't use plain ls.

anbu23 · May 18, 2010, 5:58pm

ls -l *.edf | cut -d" " -f5,9-

dsravan · May 19, 2010, 10:14am

This is not working. I am not getting the filename and size correctly. All my files have spaces in their names and I have to get the full name along with the size.

joeyg · May 19, 2010, 11:21am

instead of

ls -l *.edf | cut -d" " -f5,9-

try

ls -l *.edf | tr -s " " | cut -d" " -f5,9-

vgersh99 · May 19, 2010, 11:26am

This will produce wrong results if the file name contains multiple sequential spaces.
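
To illustrate the point, here is a minimal sketch that feeds one made-up "ls -l" style line through the same tr/cut pipeline. The helper name size_and_name and the sample line are assumptions for the demonstration, not output from the original poster's system.

```shell
#!/bin/sh
# Hypothetical helper: run the tr/cut pipeline on a single "ls -l" style line.
size_and_name() {
    printf '%s\n' "$1" | tr -s ' ' | cut -d' ' -f5,9-
}

# Made-up sample line; the file name "foo   bar" contains three spaces in a row.
line='-rw-rw-rw-   1 user group      407 Oct 14  2009 foo   bar'

size_and_name "$line"
```

This prints "407 foo bar": tr -s squeezes the three spaces inside the name down to one, so the printed name no longer matches the file on disk.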

methyl · May 19, 2010, 11:45am

Two stage process. Generate the filename(s) with "ls" (or "find") then find out the size.

ls -1 *\.edf | while IFS= read -r filename
do
        # Field 5 of "ls -l" is the size in bytes
        filesize=`ls -lad "${filename}" | awk '{print $5}'`
        echo "filename : ${filename}"
        echo "filesize : ${filesize}"
done
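
A variation on the same two-stage idea, sketched here as an alternative and not as what was posted above: let the shell expand the pattern itself and take the byte count from wc -c, so no ls output is parsed at all. The function name list_sizes is hypothetical, and the sketch assumes the files are in the current directory.

```shell
#!/bin/sh
# Hypothetical helper: print "size<TAB>name" for each *.edf file in the
# current directory without parsing any ls output.
list_sizes() {
    for f in ./*.edf
    do
        [ -f "$f" ] || continue               # pattern matched nothing, or not a regular file
        size=$(wc -c < "$f" | tr -d ' ')      # byte count; tr strips padding some wc versions add
        printf '%s\t%s\n' "$size" "${f#./}"   # drop the leading "./" for display
    done
}

list_sizes
```

Because "$f" is always quoted, any run of spaces in a name is kept exactly as it is.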

vgersh99 · May 19, 2010, 11:52am

or:

#!/bin/ksh

ls -l *\.edf| while read -r a a a a s a a a n
do
  echo "[$s] [$n]"
done

pseudocoder · May 19, 2010, 12:08pm

ls -lrt *.edf | cut -d" " -f7- | sed 's/[A-Z][a-z][a-z] [0-9][0-9 ] [0-9][0-9]:[0-9][0-9] //'

methyl · May 19, 2010, 12:15pm

Does not work with a normal "ls" listing (though the one posted is strangely spaced).

pseudocoder · May 19, 2010, 12:29pm

methyl,
did you run my command and it did not work for you? There was a small error, because the initial regex was for systems which have 00 Mon format in the ls -l output (where 00 represents the day and Mon month). I've adjusted the regex for Mon 00 format. Maybe you want to run it again?
However "ls -lrt *.edf | cut -d" " -f7-" always worked pretty nice for me

verdepollo · May 19, 2010, 2:27pm

How about this?:

ls -lrt -D %\  *.edf | awk '{print substr($0, index($0,$5))}'

Animefoo · May 19, 2010, 2:35pm

Why not just use sed?

ls -lrt *.edf | sed 's/.*:.. //g'

pseudocoder · May 19, 2010, 3:03pm

Because he needs the size field as well
Else he could simply run ls *.edf

vgersh99 · May 19, 2010, 3:08pm

... that and:

-rw-rw-rw-   1 user group      407 Oct 14  2009 foo   bar
-rw-rw-rw-   1 user group      407 Oct 14  2009 foo:ba .txt
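
A quick way to see the problem is to feed those two sample lines through the sed command directly. This is only a sketch; the wrapper name strip_to_name is made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed expression suggested earlier in the thread.
strip_to_name() {
    sed 's/.*:.. //g'
}

# The first sample has a year instead of hh:mm, so there is no colon to match.
printf '%s\n' '-rw-rw-rw-   1 user group      407 Oct 14  2009 foo   bar' | strip_to_name

# The second sample has a colon inside the file name itself.
printf '%s\n' '-rw-rw-rw-   1 user group      407 Oct 14  2009 foo:ba .txt' | strip_to_name
```

The first line comes back unchanged because nothing matches, and the second is cut down to ".txt" because the greedy match runs up to the colon in the name.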

dsravan · May 20, 2010, 2:32pm

Thanks, vgersh. Your solution worked fine. Many thanks to all who gave wonderful responses, but I needed both the file size and the name, and vgersh's solution gave me that.

vgersh,

Can you please explain the script?

methyl · May 24, 2010, 12:58pm

On behalf of vgersh99: the "while read -r" statement reads each line produced by "ls -l *\.edf" in turn and splits it on whitespace into the shell variables $a, $s and $n. $a is a throwaway variable that is overwritten by each unwanted field, $s is the file size and $n is the file name. The clever bit of the script is that $n contains the whole filename even when it has embedded spaces. This is because the last variable in a "read" statement is filled with all the remaining characters from the input line. The "-r" option stops "read" from treating backslashes as escape characters. See "man read" or your shell's manual page.
The square brackets in the output echo are not special syntax; they just help to show where each field starts and stops when testing the script.
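
The behaviour of the last variable can be tried on a single made-up line without touching any real files. In this sketch the function name split_ls_line and the sample line are assumptions for illustration only.

```shell
#!/bin/sh
# Hypothetical helper: split one "ls -l" style line the same way the loop does.
split_ls_line() {
    printf '%s\n' "$1" | while read -r a a a a s a a a n
    do
        # $s holds field 5 (size); $n holds everything from field 9 onwards
        printf '[%s] [%s]\n' "$s" "$n"
    done
}

split_ls_line '-rw-rw-rw-   1 user group      407 Oct 14  2009 foo   bar'
```

This prints "[407] [foo   bar]", with the three spaces inside the name left intact.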

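Note the backslash in "ls -l *\.edf". Unix is not MS-DOS: a full stop in a filename is just another character. It can have a special meaning in a regular expression, but this is a shell filename pattern and not a regular expression, so the dot is matched literally here and the backslash is a harmless habit, not a requirement.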

trey85stang · May 24, 2010, 9:56pm

if you have du

du -s *.edf

methyl · May 25, 2010, 6:49am

@trey85stang
The file sizes output from "du -s" are quite different from "ls -la", but they are the same as those from "ls -las".
The file sizes from "du -s" and "ls -las" are in units of disc blocks and reflect the actual space taken by the files. A disc block is usually 512 bytes ... but it depends on the system.
The size shown by "ls -l" is the length of the data in the file in bytes. The end of the data may fall part-way through a disc block, and the rest of that block is still allocated to the file.

On one of my systems a single byte text file takes 8,192 bytes of disc space.
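
The difference is easy to check with a one-byte file. This is a rough sketch: the helper names byte_length and disc_kib are made up, and the figure reported by du depends on the filesystem.

```shell
#!/bin/sh
# Hypothetical helpers: length of the data versus space allocated on disc.
byte_length() { wc -c < "$1" | tr -d ' '; }   # length in bytes
disc_kib()    { du -k "$1" | cut -f1; }       # allocated space in units of 1024 bytes

tmp=$(mktemp) || exit 1
printf 'x' > "$tmp"                           # exactly one byte of data

echo "length in bytes  : $(byte_length "$tmp")"
echo "disc usage (KiB) : $(disc_kib "$tmp")"

rm -f "$tmp"
```

The first figure is always 1. The second varies from system to system, which is why the du numbers cannot stand in for the sizes from "ls -l".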