How to extract text from an XML file

I have some XML files that were created by exporting a website from RedDot. I would like to extract the cost,
course number, description, and meeting information from them.

<?xml version="1.0" encoding="UTF-16" standalone="yes" ?>

<PAG PAG0="3AE6FCFD86D34896A82FCA3B7B76FF90" PAG3="525312" PAG7="38574.3936342593" PAG8="48E1DBCD03594F0E8CE93D9736BD5698" PAG9="C8E8FB21EE5343FEBA77C040EF1C9BFC" PAG11="39160.5590162037" PAG12="C8E8FB21EE5343FEBA77C040EF1C9BFC" PAG13="39160.5937384259" PAG14="C8E8FB21EE5343FEBA77C040EF1C9BFC" PAG15="" PAG16="" PAG17="0" PAG18="1" PAG19="48E1DBCD03594F0E8CE93D9736BD5698" PAG20="" PAG21="79EA41233D5F4B36B0BAC07286866783" PAG22="0" PAG23="0" PAG29="39160.5937384259" PAG30="0" PAG31="38574.3936342593" PAG32="0" PAG33="0">
  <IO_VAL>
    <VAL VAL1="3AE6FCFD86D34896A82FCA3B7B76FF90" VAL2="2" VAL3="PAG" VAL4="Advanced HVAC Maintenance" VAL6="3AE6FCFD86D34896A82FCA3B7B76FF90" VAL7="0" VAL8="0" VAL9="38748.7126851852" VAL10="0" />
    <VAL VAL1="B6FC365A81BA49F6B87D5F83A385FF50" VAL2="1" VAL3="PGE" VAL4="text" VAL6="B6FC365A81BA49F6B87D5F83A385FF50" VAL7="0" VAL8="0" VAL9="39160.5590046296" VAL10="0">$400<BR>$400</VAL>
    <VAL VAL1="0DE7DBA40D9C4570AF7E1052369443CF" VAL2="1" VAL3="PGE" VAL4="text" VAL6="CE65E148437444F6BE216C8C6889B241" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">XPOB 556-501<BR>XPOB 556-502</VAL>
    <VAL VAL1="6407D6626D1F448389C817DABD01C51F" VAL2="1" VAL3="PGE" VAL4="text" VAL6="6407D6626D1F448389C817DABD01C51F" VAL7="0" VAL8="0" VAL9="39160.3767361111" VAL10="0">6/2-8/4 <BR>6/4-7/11*</VAL>
    <VAL VAL1="8B3B923981B346B499770E3DCA8230F0" VAL2="1" VAL3="PGE" VAL4="text" VAL6="D1E8B01771824275997556D439647E4E" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">S<BR>MW</VAL>
    <VAL VAL1="BAA7472ACAD742E1A8BAED1FDABCE2E9" VAL2="1" VAL3="PGE" VAL4="text" VAL6="BAA7472ACAD742E1A8BAED1FDABCE2E9" VAL7="0" VAL8="0" VAL9="38755.6905902778" VAL10="0">This 40-hour course expands upon the topics covered in the Basic HVAC Maintenance course.<EM>Prerequisite: Basic Heating and Air Conditioning Equipment Maintenance course or instructor approval required prior to registering.</EM> Books not included</VAL>
    <VAL VAL1="D48131678F254EDF9D8ABDB2C13EDC6A" VAL2="1" VAL3="PGE" VAL4="text" VAL6="8B75B8517379488CBEBD4E55DBD76E7C" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">M<BR>M</VAL>
    <VAL VAL1="E316E14FFDC94C4CBC856554ADF971C1" VAL2="1" VAL3="PGE" VAL4="text" VAL6="E316E14FFDC94C4CBC856554ADF971C1" VAL7="0" VAL8="0" VAL9="39160.3768287037" VAL10="0">*No class 7/2-4</VAL>
    <VAL VAL1="DF2EF049448F41A7AC18B4B71BA6F66D" VAL2="1" VAL3="PGE" VAL4="text" VAL6="467A8FEB25964EE2924BC3183C5FB424" VAL7="0" VAL8="0" VAL9="38574.3936342593" VAL10="0">8 a.m.-noon<BR>8 a.m.-noon</VAL>
    </IO_VAL>
    </PAG>

The text I would like to extract is from this area

VAL10="0">$400<BR>$400</VAL>
VAL10="0">XPOB 556-501<BR>XPOB 556-502</VAL>
VAL10="0">6/2-8/4 <BR>6/4-7/11*</VAL>
VAL10="0">S<BR>MW</VAL>
VAL10="0">This 40-hour course expands upon the topics covered in the Basic HVAC Maintenance course. Course is held in Bldg. <EM>Prerequisite: Basic Heating and Air Conditioning Equipment Maintenance course or instructor approval required prior to registering.</EM> Books not included</VAL>
VAL10="0">M<BR>M</VAL>
VAL10="0">*No class 7/2-4</VAL>
VAL10="0">8 a.m.-noon<BR>8 a.m.-noon</VAL>

I have AIX version 5. Any suggestions would be deeply appreciated.

Perl.

Try writing a program in Perl.

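awk '/VAL10="0">/ {
    # Position where the VAL10="0"> marker starts on this line
    match($0, "VAL10=\"0\">")
    v1start = RSTART
    # Position where the closing </VAL> tag starts
    match($0, "</VAL>")
    v2start = RSTART
    # substr takes a length, not an end position:
    # print from the marker through the end of </VAL>
    print substr($0, v1start, v2start - v1start + length("</VAL>"))
}
' "file"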

output:

# ./test.sh
VAL10="0">$400<BR>$400</VAL>
VAL10="0">XPOB 556-501<BR>XPOB 556-502</VAL>
VAL10="0">6/2-8/4 <BR>6/4-7/11*</VAL>
VAL10="0">S<BR>MW</VAL>
VAL10="0">This 40-hour course expands upon the topics covered in the Basic HVAC Maintenance course.<EM>Prerequisite: Basic Heating and Air Conditioning Equipment Maintenance course or instructor approval required prior to registering.</EM> Books not included</VAL>
VAL10="0">M<BR>M</VAL>
VAL10="0">*No class 7/2-4</VAL>
VAL10="0">8 a.m.-noon<BR>8 a.m.-noon</VAL>
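
If only the text itself is wanted, without the VAL10="0"> prefix and the closing </VAL>, a variation along these lines might work. This is only a sketch and I have not tried it on AIX 5: strip_val is a name I made up, the " | " separator used in place of <BR> is my own choice, and it assumes each VAL element sits on a single line as in the sample.

```shell
# strip_val (hypothetical name): print only the text inside
# <VAL ... VAL10="0">text</VAL> lines, read from files or stdin.
strip_val() {
    awk '/VAL10="0">/ {
        line = $0
        # Drop everything up to and including the VAL10="0"> marker
        sub(/.*VAL10="0">/, "", line)
        # Drop the closing tag and anything after it
        sub(/<\/VAL>.*/, "", line)
        # Show <BR> as a visible separator (an arbitrary choice)
        gsub(/<BR>/, " | ", line)
        # Replace any other inline tag, such as <EM>, with a space
        gsub(/<[^>]*>/, " ", line)
        # Collapse runs of spaces and trim both ends
        gsub(/ +/, " ", line)
        sub(/^ /, "", line)
        sub(/ $/, "", line)
        print line
    }' "$@"
}
```

Called as strip_val file, the first matching line of the sample would come out as $400 | $400. Self-closing elements such as the title line (VAL10="0" />) are skipped, because the pattern requires the > directly after the closing quote.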

That does the trick. Thank you so much for your help.