Hi Experts,
I have a file like the one below and want to extract the filenames from it by deleting the fields to the left and right of each filename. (Note that the filenames contain spaces.)
Sun Jan 11 11:20:10 2009 1 0 /home/output/file2311_recent.list user1 user2 0 done
Sun Jan 11 11:20:10 2009 1 0 /home/output/file2312 jan recent.list user1 user2 0 done
Sun Jan 11 11:20:10 2009 1 0 /home/output/Output2313 feb recent.text user1 user2 0 done
I want to eliminate the first 7 fields from the left and the last 4 fields from the right, and get the output below. Is there a way to do this with scripting, sed, or awk?
Output should be:
/home/output/file2311_recent.list
/home/output/file2312 jan recent.list
/home/output/Output2313 feb recent.text
Scrutinizer, the rev and grep code worked nicely. However, I could not fully understand the grep code; I would appreciate it if you could explain it a bit.
vgersh99,
The nawk code looks nice, but I could not use it, as nawk is not available on Red Hat Linux.
# python -c "for line in open('file'): print(' '.join(line.split()[7:-4]))"
/home/output/file2311_recent.list
/home/output/file2312 jan recent.list
/home/output/Output2313 feb recent.text
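For anyone who prefers a small script over the one-liner, here is a minimal sketch of the same field-slicing idea. The function name `extract_filename` and its parameters `skip_left` and `skip_right` are made up for illustration; the counts 7 and 4 come from the original question.

```python
def extract_filename(line, skip_left=7, skip_right=4):
    """Drop skip_left leading and skip_right trailing whitespace-separated fields."""
    fields = line.split()
    # Use an explicit end index so that skip_right=0 keeps everything to the end.
    kept = fields[skip_left:len(fields) - skip_right]
    # Re-join with single spaces (runs of spaces inside a name are collapsed).
    return " ".join(kept)


sample = [
    "Sun Jan 11 11:20:10 2009 1 0 /home/output/file2311_recent.list user1 user2 0 done",
    "Sun Jan 11 11:20:10 2009 1 0 /home/output/file2312 jan recent.list user1 user2 0 done",
    "Sun Jan 11 11:20:10 2009 1 0 /home/output/Output2313 feb recent.text user1 user2 0 done",
]

for line in sample:
    print(extract_filename(line))
```

One caveat: because the line is split on whitespace and re-joined with single spaces, a filename containing two consecutive spaces would come out with only one.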
/       the character /,
[^.]*   followed by any number of characters that are not a dot,
\.      followed by a dot,
[^ ]*   followed by any number of characters that are not a space
and then from man grep:
-o Print only the matched (non-empty) parts of a matching line,
with each such part on a separate output line.
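If it helps to experiment with the pieces, here is a small sketch using Python's `re` module. It assumes the pattern under discussion is `/[^.]*\.[^ ]*`, assembled from the four parts listed above; `re.findall` plays roughly the role of `grep -o` here, returning only the matched parts of the line.

```python
import re

# Assumed pattern, put together from the pieces explained above:
#   /      a literal slash
#   [^.]*  any number of non-dot characters (spaces are allowed)
#   \.     a literal dot
#   [^ ]*  any number of non-space characters
PATTERN = re.compile(r"/[^.]*\.[^ ]*")

line = ("Sun Jan 11 11:20:10 2009 1 0 "
        "/home/output/file2312 jan recent.list user1 user2 0 done")

# findall returns every non-overlapping match, like grep -o prints each match.
matches = PATTERN.findall(line)
print(matches)
```

The match starts at the first slash, runs through the spaces in the name because `[^.]*` only stops at a dot, and ends at the first space after that dot.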
Actually, it is good that you asked, because now I realize that \. is not necessary, so the expression becomes: grep -o '/[^.]*[^ ]*' file
Although the OP's sample has no dots in the other fields and no extra dots inside the filenames, it is prudent to check for those cases as well if you use the grep command above. Otherwise, eliminating by column is a more flexible way to go.
$ more file
Sun Jan 11 11:20:10 2009 1 0 /home/output/file2311_recent.list user1 user2 0 done
Sun Jan 11 11:20:10 2009 1 0 /home/output/file2312 jan recent.list firstname.lastname user2 0 done
Sun Jan 11 11:20:10 2009 1 0 /home/output/Output.2313 feb recent.text user1 user2 0 done
$ grep -o '/[^.]*[^ ]*' file
/home/output/file2311_recent.list
/home/output/file2312 jan recent.list
/home/output/Output.2313
The first two are OK; the last is not, because there is a "." inside the filename. Going by fields, on the other hand, the results stay consistent as long as the number of leading and trailing fields is fixed.
# python -c "for line in open('file'): print(' '.join(line.split()[7:-4]))"
/home/output/file2311_recent.list
/home/output/file2312 jan recent.list
/home/output/Output.2313 feb recent.text
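To see the two approaches side by side on the modified sample, here is a rough sketch. The helper names `by_regex` and `by_fields` are hypothetical; the regex is the one from the grep command above and the slice is the one from the python one-liner.

```python
import re

lines = [
    "Sun Jan 11 11:20:10 2009 1 0 /home/output/file2311_recent.list user1 user2 0 done",
    "Sun Jan 11 11:20:10 2009 1 0 /home/output/file2312 jan recent.list firstname.lastname user2 0 done",
    "Sun Jan 11 11:20:10 2009 1 0 /home/output/Output.2313 feb recent.text user1 user2 0 done",
]


def by_regex(line):
    """First match of the grep pattern, or None if the line has no slash."""
    m = re.search(r"/[^.]*[^ ]*", line)
    return m.group(0) if m else None


def by_fields(line):
    """Drop the first 7 and last 4 whitespace-separated fields."""
    return " ".join(line.split()[7:-4])


for line in lines:
    # Print a marker showing whether the two methods agree on this line.
    r, f = by_regex(line), by_fields(line)
    print("same" if r == f else "DIFF", "|", r, "|", f)
```

The two methods should disagree only on the third line, where the regex stops at the first space after the first dot in the name.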
All the code is great. Many thanks!
Scrutinizer, thanks for explaining the regular expression; it is crystal clear.
Thanks, Ghostdog74, for pointing out the issue with an extra dot (.) in the filename.
The idea of eliminating the left and right fields and keeping the desired one makes it perfect. sed is fine except for the extra-dot issue, I could not check nawk, and rev and python worked very well. Thanks, Rveri.