Search for a string between two strings

File name : Sample.txt

<ownername>Oracle< ownername>

I am new to the Unix world. I would like to search for a string and return it to another sh script.

Basically, I want to read the file Sample.txt and find the string between <ownername> Sample.txt < ownername>.
I am looking for a generic way to find the string between <>x<>. Could you please help me with the code for this?

Output should be : Oracle

Best Regards,
Baaaalaaaa

grep '<ownername>[^<]*Sample.txt[^<]*</ownername>'

Narrative: grep for the tag and then some characters not < and then the string and then more not < and then the closing tag.

grep can find the lines with string in tags, but sed can dig out the string. This assumes one per line:

sed '
  s/.*<ownername>\([^<]*\)<\/ownername>.*/\1/
  t
  d
 '

Narrative: on every line, apply a regex that matches the whole line, with a capture group around the string between the tags, and replace the whole line with that group. If the replacement occurs, branch to the end of the script (print and go to the next line); otherwise delete the line.

I like sed because the bits and pieces are reusable in sed, vi, ex, ed, grep, egrep, ksh command line editing, C, JAVA, PERL, awk.
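Since the question was about handing the value back to another sh script, command substitution is one way to do that. This is only a minimal sketch: it assumes the file uses a proper closing tag (</ownername>), and the variable name owner and the temporary sample file are made up for illustration.

```shell
# Build a small sample file (hypothetical content, modelled on the question).
tmp=$(mktemp)
printf '%s\n' 'line 1' '<ownername>Oracle</ownername>' 'line 2' > "$tmp"

# Command substitution captures whatever sed prints into a shell variable.
owner=$(sed '
  s/.*<ownername>\([^<]*\)<\/ownername>.*/\1/
  t
  d
 ' "$tmp")

echo "$owner"    # prints: Oracle
rm -f "$tmp"
```

The calling script can then use "$owner" directly, or the snippet can live in its own script that echoes the value for the caller to capture the same way.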

Hi,
Thank you so much for your help. Actually, I left out a few details.

Input File name : Sample.txt

line 1
<schema>database<schema>
line 2
line 3
line 4
<schema>Oracle<schema>
line 5

Actually, I would like to read the <schema>database<schema> string from the input file and return database only once as my output.

output : database

Sorry to bother you. Please guide me on this.
Thanks in advance

Baaalaaa

---------- Post updated at 11:17 PM ---------- Previous update was at 11:01 PM ----------

I just executed this command at the Unix command prompt: grep '<ownername>[^<]*test.txt[^<]*</ownername>'

I didn't get any output, and I had to press Ctrl-C to get out.

Please clarify what the input format is. In your first posting, there is a space before the ending tag label; in your last one, the space is gone. Everyone is expecting a / there, but your files might not be XML. In that case, the pattern might be:

'<ownername>[^<]*test.txt[^<]*<ownername>'

Input file : test.txt

<owner_name>balaji<owner_name>

I have tried the same command on a UNIX box, but I am not getting any output; control stays in the same place.

w : 32 :/sh
%
grep '<owner_name>[^<]*test.txt[^<]*<owner_name>'

grep '<owner_name>[^<]*<owner_name>'

Well, not having given it a file or piped in a stream, it was grep'ing stdin = your keyboard.
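To make that concrete, here is a small sketch of the two usual ways to give grep something to read. The temporary file and the variable names from_file and from_stdin are only for illustration; the pattern is the one without the slash, matching the test.txt shown above.

```shell
# Sample input modelled on the posted test.txt (no slash in the closing tag).
tmp=$(mktemp)
printf '%s\n' 'line 1' '<owner_name>balaji<owner_name>' > "$tmp"

# 1. Name the file as an argument after the pattern.
from_file=$(grep '<owner_name>[^<]*<owner_name>' "$tmp")

# 2. Or feed the data on stdin (a redirect here; a pipe works the same way).
from_stdin=$(grep '<owner_name>[^<]*<owner_name>' < "$tmp")

echo "$from_file"
rm -f "$tmp"
```

Either way grep reaches end of input and returns to the prompt instead of waiting on the keyboard.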

The sed command q stops sed at that point, printing the current buffer unless -n is given.

sed '
  s/.*<schema>\([^<]*\)<schema>.*/\1/
  t quit
  d
  :quit
  q
 ' Sample.txt

or use sed -n, not usually my choice as it ends up being longer:

sed -n '
  s/.*<schema>\([^<]*\)<schema>.*/\1/
  t pquit
  b
  :pquit
  p
  q
 ' Sample.txt

Or:

awk -F"<|>" '/<schema>/{print $3}'
sed -n 's|.*<\(ownername\)>\(.*\)</*\1>.*|\2|p'

Thank you so much for your help.

My input file Sample.txt format has changed a little bit: the closing tag is now </owner_name>.

line 1
<owner_name>balaji</owner_name>
line 2
line 3
line 4

Please help me with this.

Since slash is a meta-char, you need \/ or [/].

---------- Post updated at 03:16 PM ---------- Previous update was at 03:15 PM ----------

sed '
  s/.*<schema>\([^<]*\)<\/schema>.*/\1/
  t quit
  d
  :quit
  q
 ' Sample.txt
sed -n 's|.*<\(owner_name\)>\(.*\)</*\1>.*|\2|p' infile

---------- Post updated at 21:20 ---------- Previous update was at 21:17 ----------

Hi DGPickett, actually I don't think I do, since I am using | as the separator, so it isn't a metacharacter in this case.

True, I stand corrected -- must have had my head tilted.

You just do not exit after the first hit is printed.

I have not used many options, such as the p flag after s (only good with -n), because they are utterly redundant to other, more modular, less limited, more general pieces, and space in my head is more precious than disk space. Commands are designed by committee, it seems.

You're right, it is not that efficient, but the idea is that there may be more than one pair of tags (each pair on the same line, otherwise it will not work). To quit, I'll gladly use your solution :wink:

sed 's|.*<\(owner_name\)>\(.*\)</*\1>.*|\2|;te;d;:e;q' infile

---------- Post updated at 22:02 ---------- Previous update was at 21:36 ----------

This should work with content spread over more than one line:

awk '$2=="/owner_name"{gsub(/\n/," ",$1);print $1}' FS=\< RS=\> infile
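As an illustration of the multi-line case, the sketch below runs the same awk idea on a made-up file in which one value is split across two lines. The sample content and the variable name value are assumptions, and the added exit (not in the one-liner above) stops after the first hit.

```shell
# Hypothetical input: the first value spans two lines, a second value follows.
tmp=$(mktemp)
printf '%s\n' 'line 1' '<owner_name>shell' 'script</owner_name>' \
              'line 2' '<owner_name>other</owner_name>' > "$tmp"

# RS='>' ends a record at every '>', FS='<' splits it at '<', so the record
# closing a tag pair has the content in $1 and "/owner_name" in $2.
value=$(awk '$2=="/owner_name"{gsub(/\n/," ",$1); print $1; exit}' FS='<' RS='>' "$tmp")

echo "$value"    # prints: shell script
rm -f "$tmp"
```

Dropping the exit prints every matching value, one per line.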

file name: text.txt

line 1
<owner_name>shell</owner_name>
line 2
line 3
line 4
<owner_name>shell</owner_name>

It worked, but I am getting the output twice, and I need it only once.

sed -n 's|.*<\(owner_name\)>\(.*\)</*\1>.*|\2|p' text.txt

output
shell
shell

thank you so much for all your help

Hi, that is by design. See the alternative sed solution in my previous post and the solutions by other posters.
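For reference, one way to combine the -n/p form with quitting after the first hit might look like the sketch below. It assumes the text.txt layout shown above (one tag pair per line, closing tag with a slash); the temporary file and the variable name first are illustrative.

```shell
# Recreate the posted text.txt in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'line 1' '<owner_name>shell</owner_name>' 'line 2' 'line 3' \
              'line 4' '<owner_name>shell</owner_name>' > "$tmp"

# On the first line containing the opening tag: substitute, print, then quit,
# so later matches are never read.
first=$(sed -n '/<owner_name>/{
  s|.*<owner_name>\([^<]*\)</owner_name>.*|\1|p
  q
}' "$tmp")

echo "$first"    # prints: shell
rm -f "$tmp"
```

Because | is the delimiter of the s command here, the / in the closing tag needs no escaping.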