extract and format information from a file

Hi,

The following is a sample portion of the file:

<JDBCConnectionPool DriverName="oracle.jdbc.OracleDriver"
MaxCapacity="10" Name="MyApp_DevPool"
PasswordEncrypted="{3DES}7tXFH69Xg1c="
Properties="user=MYAPP_ADMIN" ShrinkingEnabled="false"
Targets="myapp_server" TestTableName="SQL SELECT 1 FROM DUAL" URL="jdbc:oracle:thin:@myserver.net:1521:dbs130"/>

Is there any way (through a shell script) to select the information as follows:

DriverName=="oracle.jdbc.OracleDriver"
Name=="MyApp_DevPool"
URL="jdbc:oracle:thin:@myserver.net:1521:dbs130"

Any suggestion will be highly appreciated.

Thanks

Sujoy

If your formatting is clean and exactly as shown:

#  awk '{print $NF}' infile | egrep "Name|URL" | sed 's#/>$##'
DriverName="oracle.jdbc.OracleDriver"
Name="MyApp_DevPool"
URL="jdbc:oracle:thin:@myserver.net:1521:dbs130"

If we can assume that you are only interested in attributes within the file, and that their values are always in double quotes, how about this?

  1. Convert the file to one attribute per line
  2. Grep the ones you want from that
sed -e 's%/>%%' -e 's/\(^\| \)\([A-Za-z]*="[^"]*"\)/\
\2/g' phile.xml | egrep '^(DriverName|Name|URL)='

If your sed can't handle a literal newline (yes, that's slash, backslash, newline, \2/g, in the wrap between the first and second line) then it's a bit tricky. Some seds also understand \n to mean a literal newline in the substitution part.

Tytalus' solution assumes your fields will always be the final field on a line, which sounds kind of precarious. (Also awk | grep is Useless; awk is perfectly capable of taking care of most of what grep can do.)
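To illustrate the point that awk can do the selection on its own, here is a minimal sketch that does not depend on where an attribute falls on the line. The function name extract_attrs is made up for this example, and it assumes every value is double-quoted, contains no embedded double quotes, and does not span lines.

```shell
# Hypothetical helper: print the wanted attributes, one per line, using awk alone.
# Assumption: values are double-quoted, have no embedded double quotes,
# and each name="value" pair sits on a single line.
extract_attrs() {
    awk '
    {
        line = $0
        # Repeatedly pull the next name="value" pair off the front of the line.
        while (match(line, /[A-Za-z]+="[^"]*"/)) {
            pair = substr(line, RSTART, RLENGTH)
            name = substr(pair, 1, index(pair, "=") - 1)
            # Compare the whole attribute name, so TestTableName is not
            # mistaken for Name.
            if (name == "DriverName" || name == "Name" || name == "URL")
                print pair
            line = substr(line, RSTART + RLENGTH)
        }
    }' "$@"
}
```

Call it as `extract_attrs infile`. Because the attribute name is compared as a whole string rather than grepped for, neither PasswordEncrypted nor TestTableName slips through.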

I assume that the snippet you provided is part of a valid well-formed XML document. If so, the following XSL stylesheet will transform the XML document into the output you want and is a better solution than using sed/awk/etc.

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text" />

<xsl:template match="JDBCConnectionPool">
DriverName=="<xsl:value-of select="@DriverName"/>"
Name=="<xsl:value-of select="@Name"/>"
URL=="<xsl:value-of select="@URL"/>"
</xsl:template>

</xsl:stylesheet>
$ xsltproc file.xsl file.xml
DriverName=="oracle.jdbc.OracleDriver"
Name=="MyApp_DevPool"
URL=="jdbc:oracle:thin:@myserver.net:1521:dbs130"
$

Hi Murphy,

Thanks a lot for your XSL. Yes, it fetches the required info perfectly.
Could you please clarify the following:

1. The output contains a lot of whitespace between results.

2. How can I insert the file name with every result in the attached output, so that the details for a specific file can be identified?

3. If I wish to add one more "template match" section within the same XSL, how do I do that?

regards

Sujoy

This should give you the desired output:

awk 'BEGIN{FS="\""}
FNR == 1{printf("Filename= %s\n\n", FILENAME)}
$1 ~ /.* DriverName=$/{print "DriverName==" FS $2 FS}
$3 ~ /.* Name=$/{print "Name==" FS $4 FS}
$5 ~ /.* URL=$/{print "URL==" FS $6 FS ;print ""}
' file

Regards

Hi Franklyn,

I have modified the same with

awk 'BEGIN{FS="\""; printf("Filename= %s\n\n", /usr/data/weblogic/config/mktmixDomain/config.xml)}
$1 ~ /.* DriverName=$/{print "DriverName==" FS $2 FS}
$3 ~ /.* Name=$/{print "Name==" FS $4 FS}
$5 ~ /.* URL=$/{print "URL==" FS $6 FS ;print ""}
' file

but it gives the following error:

./new.sh
awk: syntax error near line 1
awk: illegal statement near line 1

Where am I going wrong?

Hi Murphy,

One more thing: how can I capture the output in an HTML file, so that the details can be viewed or arranged in a proper format?

Thanks in advance.

Quote the string in the printf statement:

printf("Filename= %s\n\n", "/usr/data/weblogic/config/mktmixDomain/config.xml")

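But you should use the FILENAME variable instead of a hard-coded path. Note that FILENAME is not set inside a BEGIN block, so print it from a rule that runs on input, such as FNR == 1.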

XSLT 1.0 has no facility for determining the name of the document being transformed from within the stylesheet. The name must be passed in as a top-level param.

See the following stylesheet, which includes support for both a filename and XHTML output:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    indent="yes"/>
<xsl:param name="FNAME"/>

<xsl:template match="JDBCConnectionPool">
   <html>
      <head>
      </head>
      <body>
         Filename=="<xsl:value-of select="$FNAME"/>"
         DriverName=="<xsl:value-of select="@DriverName"/>"
         Name=="<xsl:value-of select="@Name"/>"
         URL=="<xsl:value-of select="@URL"/>"
      </body>
   </html>
</xsl:template>
</xsl:stylesheet>

Invoke as follows

xsltproc --param FNAME "'MYFILENAME'" file.xsl file.xml

Which results in the following output:

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  </head>
  <body>
         Filename=="MYFILENAME"
         DriverName=="oracle.jdbc.OracleDriver"
         Name=="MyApp_DevPool"
         URL=="jdbc:oracle:thin:@myserver.net:1521:dbs130"
   </body>
</html>

Hope this helps you!

Hi Murphy,

Thanks so much for your reply. One more thing: what if I want to extract "Name" from the following element in the same XML file, along with the previous details for the JDBCConnectionPool?

<?xml version="1.0" encoding="UTF-8"?>
<Domain ConfigurationVersion="8.1.5.0" Name="accSys815Domain">
<Server AcceptBacklog="50" DefaultProtocol="t3"
DefaultSecureProtocol="t3s" ExpectedToRun="false"

What would be the changes to be done in the xsl?

TIA

Sujoy

I don't want to use FNAME; instead, the stylesheet should capture the domain name from the section above.

Hi Murphy,

Please find attached the modified XSL, which I want to use for multiple config files.

The following command is used to get the output.html file (as attached).

xsltproc --param FNAME "'mktmixDomain'" fname.xsl mktmixDomain.xml > OUTPUT.html

Can you please suggest the changes needed to address the following:

  1. The domain name should be captured from this block of config.xml:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://dev2dev.bea.com/blog/euxx/config.xsl"?>
<Domain ConfigurationVersion="8.1.5.0" Name="mktmixDomain" ProductionModeEnabled="true">

  2. The script should process multiple config.xml files under a folder.

  3. The output should be in a single HTML file.

My basic objective is to create an inventory of all JDBC details for all WebLogic domains.
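One possible way to meet these three requirements without XSLT is sketched below with awk. This is only an illustration: the function name jdbc_inventory and the table layout are invented here. It assumes every start tag begins on its own line, values are double-quoted with no embedded double quotes, and each name="value" pair sits on one line.

```shell
# Hypothetical helper: one HTML table row per JDBCConnectionPool, across many files.
jdbc_inventory() {
    echo '<html><body><table border="1">'
    echo '<tr><th>File</th><th>Domain</th><th>Pool</th><th>Driver</th><th>URL</th></tr>'
    awk '
    # Return the value of attribute attrname on the current line, or "".
    function attr(attrname,    re, s) {
        re = "(^|[ \t])" attrname "=\"[^\"]*\""
        if (!match($0, re)) return ""
        s = substr($0, RSTART, RLENGTH)
        sub(/^[^"]*"/, "", s)    # strip everything up to the opening quote
        sub(/"$/, "", s)         # strip the closing quote
        return s
    }
    # Keep the old value unless this line supplies a new one.
    function grab(attrname, old,    v) {
        v = attr(attrname)
        return (v != "") ? v : old
    }
    FNR == 1 { domain = ""; elem = "" }    # a new input file starts
    {
        # Track which start tag the current line belongs to.
        if ($0 ~ /<Domain[ \t]/)                  elem = "Domain"
        else if ($0 ~ /<JDBCConnectionPool[ \t]/) { elem = "Pool"; pool = driver = url = "" }
        else if ($0 ~ /<[A-Za-z?!]/)              elem = ""
    }
    elem == "Domain" { domain = grab("Name", domain) }
    elem == "Pool" {
        driver = grab("DriverName", driver)
        pool   = grab("Name", pool)
        url    = grab("URL", url)
        if ($0 ~ />[ \t\r]*$/) {           # the start tag ends on this line
            printf "<tr><td>%s</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td></tr>\n", FILENAME, domain, pool, driver, url
            elem = ""
        }
    }' "$@"
    echo '</table></body></html>'
}
```

It could be invoked as `jdbc_inventory */config.xml > OUTPUT.html`, where the glob stands in for wherever the domain directories actually live. The domain name comes from the Name attribute of the Domain element in each file, so no FNAME parameter is needed.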

Regards
Sujoy