Firstly, I do not require a lot of help. I am right at the end of finishing my script but cannot find a solution to the last part.
What I need to do is prompt the user for a file to work with, which I have done,
and prompt the user for an output file, which is also done.
#!/bin/bash
echo "Get my XML"
echo -n "Enter the source file name : "
read infile
echo -n "Enter output file name : "
read outfile
sed -n 1433,1615p "$infile" >> "$outfile"
echo "Data should be in $outfile if this compiled correctly"
The file is a .txt and is massive. I only need the last 200 lines or so, which are XML. I know I can use sed to specify which line numbers to extract to the output file, but not all documents that use this script will need the last 200 lines; it could be the middle 50.
Which leads me to my problem: using sed or awk, I would like to extract all the XML after 'Sending XML', which is consistent across all documents, up until the words ' Message sending ended.'
I have been reading various articles and forums, which have helped and led me to my current example. However, using line numbers is not feasible because they differ across documents, whereas the words above are always present.
I really hope someone can help, as I have spent far too much time on this!
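One way to do what is asked above is a sed address range, which prints from the first line matching one pattern through the first later line matching another. The following is only a minimal sketch: `extract_xml` is a hypothetical helper name and the sample file contents are made up, so the patterns may need adjusting to the real file.

```shell
#!/bin/bash
# extract_xml is a hypothetical helper name, not part of the original script.
# It prints every line from the first marker through the second, inclusive.
extract_xml() {
    sed -n '/Sending XML/,/Message sending ended/p' "$1"
}

# Tiny made-up sample standing in for the real, much larger .txt file.
sample=$(mktemp)
printf '%s\n' 'log line' 'Sending XML :' '<document>' '</document>' \
    'Message sending ended.' 'more log' > "$sample"

extract_xml "$sample"
```

The two marker lines themselves are included in the output. Because the range is found by pattern rather than by line number, it should not matter whether the XML sits at the end or in the middle of the file.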
I cannot thank you enough. I have posted on so many forums and no one ever gets back to me! I will be using this more often!
I now have another issue: the XML I have been left with has line breaks, see below:
Sending XML :
<document> <docRequestID>2010-10-22-11.57.22.903813</docRequestID><docStylesh
eet>Thunderhead</docStylesheet><requestType>claim</requestType><level0Object>
<objectType>transaction</objectType><objectID>900</objectID><objectSeq>1</ob
Line break has affected this tag
jectSeq><level1Object> <objectType>lifelite</objectType><objectID>901</object
As you can see, the line break has affected this tag: half of it is on the line below, and the XML cannot be used like that. I would like to remove the line breaks and the spaces at the start of each line, so that the tag looks like: </object>
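A possible way to stitch the wrapped lines back together is to strip the line ending and the leading space from each line and print everything without newlines. This is a sketch under assumptions: `join_wrapped` is a hypothetical helper name, each line is assumed to end in LF or CR+LF, and each line is assumed to start with exactly one padding space that is not part of the XML. If a real space happens to fall at a wrap point it would be lost, so the result is worth checking.

```shell
#!/bin/bash
# join_wrapped is a hypothetical helper name.
# Assumptions: lines end in LF or CR+LF, and each line begins with one
# padding space that does not belong to the XML itself.
join_wrapped() {
    awk '{
        sub(/\r$/, "")      # drop a trailing carriage return, if present
        sub(/^ /, "")       # drop one leading space, if present
        printf "%s", $0     # print the line without a newline
    }
    END { print "" }' "$1"  # finish with a single newline
}

# Made-up sample with tags split across lines, as in the post above.
sample=$(mktemp)
printf ' <objectSeq>1</ob\r\n jectSeq><objectID>901</object\r\n ID>\r\n' > "$sample"

join_wrapped "$sample"
```

With the sample above this prints `<objectSeq>1</objectSeq><objectID>901</objectID>` on one line.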
Hello, I added your code to my script. I'm not sure which file you were referring to, so I have attached what I used.
#!/bin/bash
echo "getXML"
echo -n "Enter the source file name WITH extension : "
read infile
echo "Processing... : "
sleep 1
echo -n "Enter output file name (extension not applicable) : "
read outfile
sed -n '/Sending XML/,/Message sending ended/p' $outfile | od -bc
echo "Processing XML... : "
sleep 1
echo "Success..Data should be in '$outfile' if compiled correctly"
The outcome...
I get "Unexpected error: Incomplete multibyte sequence in input" when I open the output file that was created.
On the terminal I got loads of different numbers flying across the screen. I'm not sure if they are even related to the infile I have. Attached below:
e l d I D > < f i e l d N a m
0031640 145 076 144 141 164 145 117 015 012 040 146 102 151 162 164 150
e > d a t e O \r \n f B i r t h
0031660 074 057 146 151 145 154 144 116 141 155 145 076 074 146 151 145
< / f i e l d N a m e > < f i e
0031700 154 144 126 141 154 165 145 057 076 074 057 157 142 152 145 143
l d V a l u e / > < / o b j e c
0031720 164 106 151 145 154 144 076 074 157 142 152 145 143 164 106 151
t F i e l d > < o b j e c t F i
0031740 145 154 144 076 040 074 146 151 145 154 144 111 104 076 061 065
e l d > < f i e l d I D > 1 5
0031760 061 067 074 057 146 151 145 015 012 040 154 144 111 104 076 074
1 7 < / f i e \r \n l d I D > <
0032000 146 151 145 154 144 116 141 155 145 076 154 151 146 145 164 151
f i e l d N a m e > l i f e t i
0032020 155 145 123 154 141 101 155 157 165 156 164 074 057 146 151 145
m e S l a A m o u n t < / f i e
0032040 154 144 116 141 155 145 076 074 146 151 145 154 144 126 141 154
l d N a m e > < f i e l d V a l
0032060 165 145 076 061 070 060 060 060 060 060 074 057 146 151 145 154
What I wanted was for you to execute my command at your command prompt (the Linux dollar prompt).
The file I was referring to was the source file, that is, the one being read in your Bash script.
Since you are going to test your Bash script, I am sure you know the name of the source file that you'll enter at the prompt above. That file name will be assigned to the variable "infile" in your script.
Now, let's say the source file name you have in mind is "abc.txt".
This file has some XML stuff embedded in it. My hunch is that there are Unicode characters in that XML stuff.
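A quick way to test that hunch is to count the bytes in the file that fall outside the plain ASCII range. The sketch below is only an illustration: `count_non_ascii` is a hypothetical helper name and the sample content is made up. A count of zero would suggest the file is pure ASCII and the problem lies elsewhere.

```shell
#!/bin/bash
# count_non_ascii is a hypothetical helper name.
# It deletes every byte in the ASCII range (octal 000-177) and counts
# whatever is left, i.e. the bytes with the high bit set.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c | tr -d ' '
}

# Made-up sample: one ASCII line, then "caf" followed by a UTF-8 e-acute,
# which is encoded as the two bytes 303 251 (octal).
sample=$(mktemp)
printf 'plain ascii\n' > "$sample"
printf 'caf\303\251\n' >> "$sample"

count_non_ascii "$sample"
```

For the sample this prints 2. On the real source file, the same function could be called as `count_non_ascii abc.txt`.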
I tried that and replaced the file with my source file, which in my case was trace.txt. I am not sure where the output file is, though. I checked trace.txt and it was the same document. Do I not need to specify where the output goes?
Sorry if I'm being slow, I only started learning three weeks ago.
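On the question of where the output goes: sed does not modify the file it reads unless told to, and by default it writes its result to the terminal, so a redirect is needed to get a file. The following is a minimal sketch with made-up file contents, using temporary files in place of trace.txt and the chosen output name.

```shell
#!/bin/bash
# Temporary stand-ins for trace.txt and the output file (made-up content).
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' 'noise' 'Sending XML :' '<a/>' 'Message sending ended.' 'noise' > "$infile"

# Without a redirect, sed prints to the terminal and "$infile" stays unchanged.
# '>' sends the output to "$outfile" instead, replacing any previous contents
# ('>>' would append to them).
sed -n '/Sending XML/,/Message sending ended/p' "$infile" > "$outfile"

cat "$outfile"
```

After this runs, the source file still has all of its lines, and only the three lines from 'Sending XML' through 'Message sending ended.' are in the output file.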