Parsing xml files

I want to search for all the XML files on the server that have "Status" in them. Is this the correct code to use? Can anyone explain exactly what this code does?

xmlFileNames=$(find . -name "*.xml" -exec grep -l ".*Status" {} \; 2>/dev/null)

It searches all XML files in the current directory hierarchy (find . -name "*.xml") for the string Status (-exec grep -l ".*Status"; the .* in this search pattern is superfluous) and stores the names of the files that match (grep -l) in a variable called xmlFileNames (xmlFileNames=$(...)).
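To see this on a small scale, here is a minimal sketch that runs the same command against a throwaway directory. The directory layout and file names are made up for the demo; it also runs the command with the plain "Status" pattern so the two results can be compared.

```shell
# Build a throwaway directory tree (hypothetical file names) to try the command on.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '<job><Status>ok</Status></job>\n'  > "$demo/a.xml"
printf '<job><State>ok</State></job>\n'    > "$demo/b.xml"
printf '<job><Status>bad</Status></job>\n' > "$demo/sub/c.xml"
printf 'Status in a non-XML file\n'        > "$demo/notes.txt"

cd "$demo" || exit 1

# Same command as in the question: grep -l prints only the names of matching files.
withDotStar=$(find . -name "*.xml" -exec grep -l ".*Status" {} \; 2>/dev/null | sort)

# The plain pattern, for comparison: grep already searches anywhere in a line.
plain=$(find . -name "*.xml" -exec grep -l "Status" {} \; 2>/dev/null | sort)

printf '%s\n' "$withDotStar"
```

Only the .xml files that contain Status are listed; b.xml and notes.txt are skipped.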

Thanks Scott!

Is my code correct then? lol

Only if you run it from / (the root directory), otherwise:

xmlFileNames=$(find / ...)

would be required (/ instead of .)

You might encounter a lot of errors (access denied, or other) along the way, so:

xmlFileNames=$(find / ... 2> /dev/null)

(put 2> /dev/null inside the closing parenthesis to throw away the errors)
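A small sketch of what that redirection does, using a temporary directory and a deliberately missing path (both invented for the demo) instead of /. Command substitution only captures standard output, so the variable stays clean either way; the redirection just keeps the error messages off the terminal.

```shell
# Throwaway directory with one XML file (hypothetical names, demo only).
demo=$(mktemp -d)
printf '<Status/>\n' > "$demo/a.xml"

# A path that does not exist, so find writes a complaint to stderr.
missing="$demo/no-such-dir"

# The redirection sits inside $(...): stdout is captured, stderr is discarded.
# find exits non-zero because of the missing path, hence the "|| true".
xmlFileNames=$(find "$demo" "$missing" -name "*.xml" 2>/dev/null) || true

printf '%s\n' "$xmlFileNames"
```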

Thanks a bunch Scott!

".*Status" is a bit redundant, "Status" would do.

If I use

 xmlfilename=$(grep -l "Status" *.xml )

can I pull out each file name separately in a for loop?

Yes.

xmlFileNames=$(grep -l Status *.xml)
for file in $xmlFileNames; do
  ...
done

Just be aware that if there are too many file names, the expanded *.xml can exceed the system's argument limit and you get an error along the lines of "arg list too long".

You might want to add some recursion to the grep command to more closely emulate the find behaviour.

A better approach is to use a while-loop:

find ... | while IFS= read -r file; do
  ...
done
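
As a rough sketch of that pattern with the find part filled in: the temporary directory and file names below are invented for the demo, and the loop body only prints each name. Here find passes the files to grep in batches with {} +, so the shell never has to expand every name at once.

```shell
# Throwaway directory (hypothetical names); one file name contains a space.
demo=$(mktemp -d)
printf '<Status>ok</Status>\n' > "$demo/one file.xml"
printf '<Other/>\n'            > "$demo/two.xml"

# grep -l prints one matching file name per line; the loop reads them one by one.
# IFS= and -r keep leading blanks and backslashes in the names intact.
matched=$(
  find "$demo" -name "*.xml" -exec grep -l "Status" {} + |
  while IFS= read -r file; do
    printf 'found: %s\n' "${file##*/}"
  done
)

printf '%s\n' "$matched"
```

Unlike the for loop, this copes with spaces in file names. Names containing newlines would still be split.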

Is this correct?

for xmlFileName in ${xmlFileNames}
do
    xmlFileName=$(echo $xmlFileName | sed 's|./||')     # Remove leading ./ path that find command prefixes to filenames
    cp -f $xmlFileName $DIR/$xmlFileName
done

I'm not using the find command anymore, just grep. Would that create a problem with the echo command?

Every time I run the script I'm writing, I get this error:

 filename.ksh 89:no memory : Not enough storage is available to process this command.

I assume it's the line with the echo and sed commands that is causing the problem. I'm only trying to copy fewer than 1000 XML files, and they're not even large files. I don't know why this happens.

I'm not sure how to translate this code to use the read command:

for XmlFileName in ${xmlFileNames}
do
    XmlFileName=$(echo $XmlFileName | sed 's|./||')     # Remove leading ./ path that find command prefixes to filenames
    cp $XmlFileName $NEW_DIR/
done
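
One possible translation, as a sketch rather than a tested fix for the error above: the loop reads the names from find and grep, and strips the leading ./ with the shell's own ${var#pattern} expansion instead of echo and sed. SRC_DIR and the sample files are invented for the demo, and NEW_DIR points to a temporary directory here. This avoids starting two extra processes per file; whether that is what triggers the "no memory" message is not confirmed.

```shell
# Hypothetical source and destination directories, created only for this demo.
SRC_DIR=$(mktemp -d)
NEW_DIR=$(mktemp -d)
mkdir "$SRC_DIR/sub"
printf '<Status>ok</Status>\n' > "$SRC_DIR/a.xml"
printf '<Other/>\n'            > "$SRC_DIR/b.xml"
printf '<Status>x</Status>\n'  > "$SRC_DIR/sub/c.xml"

cd "$SRC_DIR" || exit 1

find . -name "*.xml" -exec grep -l "Status" {} + |
while IFS= read -r XmlFileName; do
    XmlFileName=${XmlFileName#./}   # strip a leading ./ without calling echo and sed
    cp "$XmlFileName" "$NEW_DIR/"   # quoted, so names with spaces survive
done

# List what ended up in the destination directory.
copied=$(ls "$NEW_DIR")
printf '%s\n' "$copied"
```

Note that the sed pattern 's|./||' in the original is not anchored and its . matches any character, so it can also eat part of a path such as sub/c.xml; the ${XmlFileName#./} form only removes a literal ./ at the start.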