Finding a string in a text file and posting part of the line

busdude · March 24, 2010, 12:31am

What would be the most succinct way of doing this (preferably in 1 line, maybe 2):
searching the first 10 characters of every line in a text file for a specific string, and if it was found, print out characters 11-20 of the line on which the string was found.
In this case, it's known that there are no duplicates in the file, so it either finds the string on one line or none of them.

I'm guessing one of grep, sed or awk can be used but I can't figure out the best way.

kurumi · March 24, 2010, 12:36am

sed -n 's/^helloworld//p' file

sed -n '/^hello/s/..........//p' file
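
Both commands above print everything after the first 10 characters, not just characters 11-20. A minimal sketch of a variant that keeps only the next 10 characters, assuming the search string is the 10-character "helloworld" and using a made-up sample line:

```shell
# Hypothetical sample line: 10-character key followed by 16 more characters.
line='helloworldABCDEFGHIJKLMNOP'

# Original form: strips the key and prints the whole remainder of the line.
rest=$(printf '%s\n' "$line" | sed -n 's/^helloworld//p')

# Variant: capture exactly 10 characters after the key and drop the rest.
ten=$(printf '%s\n' "$line" | sed -n 's/^helloworld\(.\{10\}\).*/\1/p')

printf 'rest: %s\n' "$rest"
printf 'ten:  %s\n' "$ten"
```

Note that this variant prints nothing when fewer than 10 characters follow the key, which may or may not be what the original poster wants.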

ygemici · March 24, 2010, 6:17am

maybe like this

# cat 1
thisiscomm11thisisthisis
thisiscomm22thisisthisis
thisiscomm33thisisthisis

# sed 's/^thisiscomm//g' 1
11thisisthisis
22thisisthisis
33thisisthisis

Jairaj · March 24, 2010, 7:33am

Try this also using awk:

awk '{if (substr($0,1,10) == "helloworld") {print substr($0,11)} else {print substr($0,1)}}' file

danmero · March 24, 2010, 7:48am

awk -v v=string 'v==substr($0,1,10){print substr($0,11,10)}' infile

Jairaj · March 24, 2010, 7:57am

The correct one is:

awk '{if (substr($0,1,10) == "helloworld") {print substr($0,11,20)} else {print substr($0,1)}}' file

danmero · March 24, 2010, 2:58pm

Jairaj

Always read the OP's requirements.
Please use [code] tags when you post code.
Don't double-post!
Your solution is wrong; check the OP.

alister · March 24, 2010, 4:28pm

Hi, danmero:

I agree. Jairaj's proposed solution is definitely incorrect. However, without some clarification from the original poster, yours could be as well.

If the string being searched must occur in the first ten characters, but can itself be less than 10 characters, it could then occur at different locations within the first 10 characters. If this scenario is a possibility, then

v==substr($0,1,10)

would be insufficient.

If this is the case, then the following would be the correct approach:

awk -v v=string 'i=index($0,v){if (i+length(v)-1<=10) print substr($0,11,10)}'

Again, I think the original poster's problem statement is a bit too vague, leaving some wiggle room for this possibility.
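
A minimal sketch of the difference between the two readings, using made-up sample lines and the hypothetical key "abc" (neither comes from the original poster):

```shell
# Hypothetical data: "abc" starts at column 1 on the first line, at column 4
# on the second, and at column 11 (outside the window) on the third.
data='abc4567890ABCDEFGHIJrest
123abc7890KLMNOPQRSTrest
1234567890abcXXXXXXXrest'

# Reading 1: the key must equal the entire 10-character prefix.
exact=$(printf '%s\n' "$data" |
  awk -v v=abc 'v == substr($0, 1, 10) { print substr($0, 11, 10) }')

# Reading 2: the key may sit anywhere, as long as it ends by column 10.
window=$(printf '%s\n' "$data" |
  awk -v v=abc '(i = index($0, v)) && i + length(v) - 1 <= 10 { print substr($0, 11, 10) }')

printf 'exact:  [%s]\n' "$exact"
printf 'window: [%s]\n' "$window"
```

With this data the exact comparison matches nothing, while the index() test prints characters 11-20 of the first two lines and skips the third.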

Regards,
Alister

danmero · March 24, 2010, 11:09pm

This is not about right vs. wrong OR what if suppositions.
The OP asked about "searching the first 10 characters"; that's a fact, and we should start from there. Everything else is only supposition.
I can provide short answers/solutions based on facts; otherwise the list would be too long.

Regards,

alister · March 25, 2010, 1:46am

Your solution is incorrect, regardless (as was mine). The problem requested printing characters 11-20 if a match is found; that second substr call returns 11-30 (the third argument is length, not an index). The 20 should be a 10.
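
A quick way to see the length-versus-index point, sketched with a made-up sample line:

```shell
# Hypothetical line: "helloworld" followed by the 26 uppercase letters.
line='helloworldABCDEFGHIJKLMNOPQRSTUVWXYZ'

# Third argument 20 means "20 characters starting at 11", i.e. columns 11-30.
wrong=$(printf '%s\n' "$line" | awk '{ print substr($0, 11, 20) }')

# Third argument 10 gives columns 11-20, which is what was asked for.
right=$(printf '%s\n' "$line" | awk '{ print substr($0, 11, 10) }')

printf 'substr($0,11,20): %s\n' "$wrong"
printf 'substr($0,11,10): %s\n' "$right"
```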

Regards,
Alister

danmero · March 25, 2010, 7:18am

That's correct. I fixed my original post.

durden_tyler · March 25, 2010, 7:47am

Well, Perl scripts are known to be succinct...
For the dummy file below, assuming the search string is the 10-character "helloworld":

$
$ cat f2
helloworld>abcdefghijklmnopqrstuvwxyz
holamundo>abcdefghijklmnopqrstuvwxyz
helloworld>ABCDEFGHIJKLMNOPQRSTUVWXYZ
helloworld>0123456789
bonjourmonde>abcdefghijklmnopqrstuvwxyz
hallowelt>ABCDEFGHIJKLMNOPQRSTUVWXYZ
$
$
$ perl -lne 'print $1 if /^helloworld(.{10})/' f2
>abcdefghi
>ABCDEFGHI
>012345678
$
$

tyler_durden

Jairaj · March 25, 2010, 9:41am

What is wrong, Mr. danmero?

danmero · March 25, 2010, 11:20am

awk '{if (substr($0,1,10) == "helloworld") {print substr($0,11,10)}}'

You don't want to print the record when there is no match (your else branch). My mistake: I took the 20 from you; see alister's comments above.

To simplify we can write

awk 'substr($0,1,10) == "helloworld" {print substr($0,11,10)}'
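
As a rough check, here is the simplified form run against a few lines borrowed from durden_tyler's sample file, plus one hypothetical short line added for illustration:

```shell
# Three lines from the f2 sample above, plus a made-up line with fewer than
# 10 characters after the key.
data='helloworld>abcdefghijklmnopqrstuvwxyz
holamundo>abcdefghijklmnopqrstuvwxyz
helloworld>0123456789
helloworldshort'

# Pattern-action form: the action runs only on lines whose prefix matches.
out=$(printf '%s\n' "$data" |
  awk 'substr($0, 1, 10) == "helloworld" { print substr($0, 11, 10) }')

printf '%s\n' "$out"
```

Unlike the Perl one-liner, which requires a full 10 characters after the key, substr simply returns whatever is available, so the short line prints "short".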