Cutting out text from a specific portion of a filename

Hi,

How do I go about cutting out the first numeric characters after the word "access"?

access1005101228.merged-00.15.17.86.d8.b8.log.gz

$ echo access1005101228.merged-00.15.17.86.d8.b8.log.gz | cut -c7-12
100510

Oh, sorry. I did not make it clear that I am running a find command, and there might be different directories before the actual filename, so a fixed-position cut will not work.

This command gives the 6 characters after the word access:

sed -e 's/.*access\(......\).*/\1/'
echo 'access1005101228.merged-00.15.17.86.d8.b8.log.gz' | sed 's/.*access\([^.][^.]*\).*/\1/'

The Bash solution is simpler:

a="access1005101228.merged-00.15.17.86.d8.b8.log.gz"
echo ${a:6:6}
100510

As mentioned by the OP:

there might be different directories before the actual filename

... and the length of the stream of numbers following 'access' might vary as well (most likely).
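To cover both points at once (leading directories and a digit run of unknown length), one possible sketch uses only POSIX parameter expansion. The function name digits_after_access is made up for illustration, and it assumes the file name itself always starts with the literal word access.

```shell
# Hypothetical helper: print the digits that directly follow "access"
# in the file name part of a path. The body runs in a subshell ( ... )
# so the variable does not leak into the caller.
digits_after_access() (
    name=${1##*/}                     # strip leading directories
    name=${name#access}               # strip the literal prefix "access"
    printf '%s\n' "${name%%[!0-9]*}"  # cut from the first non-digit onward
)

digits_after_access /root/test/access1005101228.merged-00.15.17.86.d8.b8.log.gz
# prints 1005101228
```

If only the first six digits are wanted, piping the result through cut -c1-6 gives 100510 as in the earlier replies.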

bash way

A="access1005101228.merged-00.15.17.86.d8.b8.log.gz"
D="access"

A=${A#$D}
A=${A:0:6}

try:

find .....  | awk -F "/" '{print substr($NF,7,6)}'

Yes, you are right, I had forgotten this.

All right then:

a=/root/test/sdasda/sdasdasd/access1005101228.merged-00.15.17.86.d8.b8.log.gz
a=${a#*access*} ; echo ${a:0:6}
100510
 

Hi,

Thanks for all your help. I like the sed solution a lot, and my ueber script is nearly finished. However, I now have one more file with a different structure:

/dir1/dir2/maybeotherDIRS/http_log.SOMENAME.2010.05.01.00.45.00.230.gz

Again, I need to extract the date portion.

One way:

sed 's/.*\([0-9]\{4\}\).\([0-9]\{2\}\).\([0-9]\{2\}\).*/\1\2\3/' 
echo '/dir1/dir2/maybeotherDIRS/http_log.SOMENAME.2010.05.01.00.45.00.230.gz' | sed  -e 's#.*/##' -e 's#[^0-9][^0-9]*[.]\([0-9][0-9.][0-9.]*\)[.].*#\1#'
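If only a compact YYYYMMDD string is wanted, the two ideas above can be combined: strip the directories first, then match the date with escaped dots so that "." only matches a literal dot. This is a rough sketch, not a tested production command; the function name log_date is invented here, and it assumes the file name contains exactly one .YYYY.MM.DD. group.

```shell
# Hypothetical helper: print YYYYMMDD from an http_log path.
log_date() {
    printf '%s\n' "$1" |
        sed -e 's#.*/##' \
            -e 's/.*\.\([0-9]\{4\}\)\.\([0-9]\{2\}\)\.\([0-9]\{2\}\)\..*/\1\2\3/'
}

log_date /dir1/dir2/maybeotherDIRS/http_log.SOMENAME.2010.05.01.00.45.00.230.gz
# prints 20100501
```

The first expression removes everything up to the last slash, so digits in directory names cannot be mistaken for the date.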

Or, with Perl -

$ echo '/dir1/dir2/maybeotherDIRS/http_log.SOMENAME.2010.05.01.00.45.00.230.gz' | perl -pe 's/^.*?\.([\d\.]+)\..*$/$1/'
2010.05.01.00.45.00.230

tyler_durden

Thanks.

ls -R1 /EMEA/*/*/http_log*.gz |sed  -e 's#.*/##' -e 's#[^0-9][^0-9]*[.]\([0-9][0-9.][0-9.]*\)[.].*#\1#' | cut -d. -f3,4,5 | tr -d "."

I don't think it's a good idea to invoke three external commands and pipes just to get a substring (at least for your requirement).

If your file name patterns are always the same:

ls -R1 /EMEA/*/*/http_log*.gz | awk -F"." '{print $5$6$7}'
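One caveat with counting fields from the left: a dot anywhere in a directory name shifts $5, $6 and $7. A possible variation counts from the end of the line instead, assuming every name ends in YYYY.MM.DD.hh.mm.ss.mmm.gz as in the example. The function name day_hour_min and the v1.2 directory are invented for the demonstration.

```shell
# Hypothetical helper: print the same three fields as $5$6$7 above
# (day, hour, minute), but addressed relative to the last field.
day_hour_min() {
    printf '%s\n' "$1" | awk -F. '{ print $(NF-5) $(NF-4) $(NF-3) }'
}

day_hour_min /EMEA/v1.2/x/http_log.SOMENAME.2010.05.01.00.45.00.230.gz
# prints 010045
```

The same awk program can be placed after the ls pipeline in place of '{print $5$6$7}'.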