I am requesting help with the text-parsing task below. Any help is highly appreciated.
<tr valign="top"><td nowrap>Source name</td>
<td style="text-align: justify">Sample Name<br></td>
I want Sample Name from above.
In the same file, I have to search for another pattern like this
<td><a href="http://www.unix.com/sra?term=SRX12345">SRX12345</a></td>
I want SRX12345 from this pattern.
Now, my final output will be
SRX12345 Sample Name
---------- Post updated 09-28-12 at 08:42 AM ---------- Previous update was 09-27-12 at 03:03 PM ----------
Hi Friends,
I worked on this task and got this far.
Could someone please improve it?
For the first pattern, I used this:
cat input | awk -F'>' '/nowrap>Source/ {getline; print $2}'
The output was
Sample Name<br
For the second pattern, I wrote this
cat input | awk '/^<td><a href=/' | grep -o 'http://www.unix.com/sra?term=[^"]*' | awk -F'=' '{print $2}'
and the output was
SRX12345
But I would like to join the two results, and the expected final output is
Sample Name SRX12345
I can't use join or separate awk scripts, because I run the two searches as separate commands and the order of the results changes between them. I have more than 500 patterns to search this way, and I want each pair of matches side by side in two columns.
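One way to keep each pair together might be to do both searches in a single awk pass, remembering the sample name until the matching accession link shows up. The sketch below is only a rough idea under some assumptions: the function name extract_pairs is made up, the sample name is assumed to appear before its SRX link within each record, and the two columns are separated by a tab because the sample name itself contains a space.

```shell
# Hypothetical helper: print "sample name<TAB>accession" for each record
# found in the HTML file given as the first argument.
extract_pairs() {
    awk '
    /nowrap>Source name/ {
        # The value cell is on the line after the label cell.
        if ((getline line) > 0) {
            sub(/^[^>]*>/, "", line)   # drop the opening <td ...> tag
            sub(/<.*/, "", line)       # drop <br></td> and anything after it
            name = line
        }
        next
    }
    /sra\?term=/ {
        id = $0
        sub(/.*sra\?term=/, "", id)    # keep the text after "term="
        sub(/".*/, "", id)             # cut at the closing quote of href
        print name "\t" id
        name = ""                      # do not reuse a name for the next link
    }
    ' "$1"
}

# Example call, assuming the file is named "input" as in the commands above:
# extract_pairs input
```

Because both values come out of the same pass over the file, the order can no longer drift between two separate commands. If the link comes before the sample name in the real files, the logic would have to be flipped (store the ID, print when the name is found).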