Help with extracting text from a string

I don't know if I am making any sense here, but I need to do something like this.

I have a variable that contains the result of the svnlook command in a post-commit hook script.

test=`/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$"`

and I get

test=A  /content/qa/lesson1/index.html A  /content/qa/lesson2/index.html

Basically, those two files have just been added to the repository.

Now I need to write a shell script to extract the directory names lesson1 and lesson2 from that text.

How can I do this? Any help or suggestion is highly appreciated.

Thanks

KM

Unclear with your sample -- is the output on just one line? Or, is it on multiple lines?

Thanks for your response, joeyg.

Sorry, my question was not very clear. Let me know if you have any questions.

I would like to have the output in an array because I need to do more things with those names,
like

dir[0]=lesson1
dir[1]=lesson2

Thanks.

Is your output currently

test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html

or

test=
A /content/qa/lesson1/index.html 
A /content/qa/lesson2/index.html

or something else

And, do you always want the 3rd field in your

/aaa/bbb/ccc/ddd/eee.html 

lines?

Yes, I am always looking at the third field. And yes, my output usually looks like this:
test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html

I am checking the svn repository to see if a new index.html file has been added. There could be many index.html files, each in a different folder; in the current example I have two. Based on that, I need to create a redirect file that points to that location on a server, so I need the folder name to build the URL.

>echo test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html | gawk '{print $2"\n"$4}' | gawk -F"/"'{print "dir["NR-1"]=",$4}'
dir[0]= lesson1
dir[1]= lesson2

or simply append this to your current command?

| gawk '{print $2"\n"$4}' | gawk -F"/"'{print "dir["NR-1"]=",$4}'

Thanks joeyg

Sorry if I am not understanding something, but I am getting an invalid range error:

$ echo test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html | gawk '{print $2"\n"$4}' | gawk -F"/"'{print "dir["NR-1"]=",$4}'
gawk: fatal: Invalid range end: //{print "dir["NR-1"]=",$4}/

$ echo "A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html" | gawk '{print $2"\n"$4}' | gawk -F"/"'{print "dir["NR-1"]=",$4}'
gawk: fatal: Invalid range end: //{print "dir["NR-1"]=",$4}/
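
A likely cause of the "Invalid range end" error, judging from the command as typed: there is no space between -F"/" and the quoted program, so the shell passes gawk a single argument and everything after -F is taken as the field separator. gawk then tries to compile that text as a regular expression, and the bracket expression ["NR-1"] contains the backwards range R-1. A minimal sketch with the space restored, using plain awk on the assumption that it behaves like gawk for this one-liner (the sample line is copied from the post):

```shell
# Sample line copied from the post; in the hook it would come from svnlook.
sample='A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html'

# Note the space between -F"/" and the program text.
# Parentheses around NR-1 make the intended arithmetic explicit.
result=$(echo "$sample" |
  awk '{print $2 "\n" $4}' |
  awk -F"/" '{print "dir[" (NR-1) "]=" $4}')

echo "$result"
```

With a leading slash in each path, field 1 is empty, so the folder name is field 4.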

Are you using bash, or ksh, or ???

i am using bash.

Thanks.

Does this

$ echo test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html | gawk '{print $2"\n"$4}'

give you

/content/qa/lesson1/index.html
/content/qa/lesson2/index.html

thanks joeyg.

Yes, that works. But when I added it after the grep part in my hook script, I got nothing. Just the echo works fine, as you said.

here is a part of my hook script.

changes=`/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$" | gwak 'print $2"\n"$4}'`
echo '>>>'$changes >> $ACTION_LOG

So what I am doing here is checking to see if a new index.html file has been added to the repository, and from there I need to extract the folder name.

changes=`/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$"`

gives you the

A  /content/qa/lesson1/index.html A /content/qa/lesson2/index.html

so i added the script you gave me to it.

changes=`/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$"| gwak 'print $2"\n"$4}'`

It didn't give me anything back.
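
As typed in the post, that line has two problems that would each stop it from producing output: the command is spelled gwak rather than gawk, and the awk program is missing its opening brace. Below is a sketch of the corrected shape. The fake_svnlook function is a hypothetical stand-in for the real svnlook call, and the sketch assumes svnlook prints one changed path per line, in which case only $2 is needed:

```shell
# Hypothetical stand-in for: /usr/bin/svnlook changed "$REPOS" -r "$REV"
fake_svnlook() {
  printf 'A   /content/qa/lesson1/index.html\nU   /content/qa/other.html\nA   /content/qa/lesson2/index.html\n'
}

# Keep only added index.html entries, then print the path column.
changes=$(fake_svnlook | grep '^A.*index\.html$' | awk '{print $2}')

echo "$changes"
```

In a hook script, error messages such as "command not found" usually go to stderr and are easy to miss, so running the pipeline by hand first can help.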

After you set the variable changes

>echo "$changes"

to see it on the screen
then, you might want

echo ">>>$changes" >> $ACTION_LOG

as sometimes spaces and other characters can confuse things
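
To illustrate the point about spaces: an unquoted variable is split into words by the shell, so line breaks and runs of spaces are lost before echo ever sees them. A small sketch, assuming a hypothetical two-line value in place of the real svnlook output:

```shell
# Hypothetical two-line value standing in for the real svnlook output.
changes=$(printf 'A   /content/qa/lesson1/index.html\nA   /content/qa/lesson2/index.html')

unquoted=$(echo '>>>'$changes)   # word splitting joins everything with single spaces
quoted=$(echo ">>>$changes")     # newlines and repeated spaces are kept

echo "$unquoted"
echo "$quoted"
```

If the real output is one entry per line, this may be why it shows up as a single line in the log.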

No, that didn't work either. I do get a response printed in the log file for the previous command

changes=`/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$"`
echo '>>>'$changes >> $ACTION_LOG

as

>>>A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html

thanks.

One thing I also realised is that gawk '{print $2"\n"$4}' works only if there are exactly two files. If three files are checked in to the repository, it still returns only two.

$  echo test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html A /content/qa/lesson3/index.html| gawk '{print $2"\n"$4}'
/content/qa/lesson1/index.html
/content/qa/lesson2/index.html

this is not what i am looking for.
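
One possible way around the two-file limit is to loop over every field instead of naming $2 and $4. This is only a sketch: it assumes each path starts with "/" as in the samples above, and it uses plain awk rather than gawk:

```shell
# Sample line copied from the post, with three entries.
sample='A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html A /content/qa/lesson3/index.html'

# Print every field that starts with "/", however many there are.
paths=$(echo "$sample" | awk '{for (i = 1; i <= NF; i++) if ($i ~ /^\//) print $i}')

echo "$paths"
```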

Can you just enter the command?
The way you have it, you are writing output from the command to a string and then writing that to a file. Thus, why do you need to write the >>> ?
Do this

changes=`/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$"`
echo '>>>'$changes >> $ACTION_LOG
/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$" | gawk '{print $2"\n"$4}'

which should write your stuff, but also execute the same command and send new output to the screen

/content/qa/lesson1/index.html
/content/qa/lesson2/index.html

Sorry, yes, I write the output to a text file, $ACTION_LOG, and I run
tail -f on it while I check in files through my IDE to see if anything is coming out.

Thanks.

Can you do the following:

>echo test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html A /content/qa/lesson3/index.html | sed 's/A/~A/g' | tr '~' '\n' | grep "^A" | cut -d" " -f2
/content/qa/lesson1/index.html
/content/qa/lesson2/index.html
/content/qa/lesson3/index.html

Essentially, do your command with a new filter at end

/usr/bin/svnlook changed $REPOS -r $REV | grep "^A.*index.html$" | sed 's/A/~A/g' | tr '~' '\n' | grep "^A" | cut -d" " -f2

Append the following to the end of your command

| sed 's/A/~A/g' | tr '~' '\n' | grep "^A" | cut -d" " -f2

to get one line for each instance

Does that much work?

I used this and got it to work.

echo test=A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html A /content/qa/lesson3/index.html | \
sed 's/A/~A/g' | tr '~' '\n' | grep "^A" | cut -d" " -f2 | cut -d"/" -f4

I get

lesson1
lesson2
lesson3

Now I need to plug this into my actual code and see if it works as I want. Thanks for helping me so far. I will let you know once I get this going.
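
For the array asked about earlier in the thread, here is a rough sketch of how the working pipeline could feed a bash array. It assumes bash (arrays and process substitution are not available in plain sh) and reuses the sample line from the post in place of the svnlook output:

```shell
# Sample line copied from the post; in the hook it would come from svnlook.
sample='A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html A /content/qa/lesson3/index.html'

# Read one folder name per line into the array dir.
dir=()
while IFS= read -r name; do
  dir+=("$name")
done < <(echo "$sample" |
  sed 's/A/~A/g' | tr '~' '\n' | grep "^A" | cut -d" " -f2 | cut -d"/" -f4)

# Show what ended up in the array.
for i in "${!dir[@]}"; do
  echo "dir[$i]=${dir[$i]}"
done
```

One caveat: the sed step replaces every capital A, so a path that itself contains an "A" would be split in the wrong place.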

Here's another way but it's a couple milliseconds slower than sed | tr | grep | cut | cut

# echo "A /content/qa/lesson1/index.html A /content/qa/lesson2/index.html" | awk '{print $2"\n"$4}' | xargs -n1 dirname | xargs -n1 basename
lesson1
lesson2
# 

Unclear whether appending the sed, etc., commands gave you what you needed.