Hi, I was hoping someone could help me with a sed script I am trying to write. I am on a Mac running El Capitan.
I have some text that I have converted from a PDF and want to format into an XML file.
In the file I have managed to delete all the text I do not need. Each line I have left starts with an x and a y coordinate giving its position on the page, measured from the bottom left, where x is 0 and y is 0. This is fortunate for me, as everything with the same x coordinate is in the same column and everything with the same y coordinate is on the same line.
My problem is that I need to sort these lines to get them into an order I can use. I also need to prefix the y coordinates from page 2 onward with 02 to 99 to denote the page each line is on, so that when I do the sort, items from page 2 onward stay in the correct page order.
This is an example of the text I have to work with:
227.25 214.100 PLACEHOLDER (Qty) Tj
409. 214.100 PLACEHOLDER (Ink) Tj
19.8999 214.149 PLACEHOLDER (Prt) Tj
250.100 214.100 PLACEHOLDER (Hours) Tj
I want to add leading and trailing zeros so that the decimal point is always in the same position from the left. I was thinking that 6 places before the decimal point and 6 after would probably work fine and leave some wiggle room if I come across something unexpected. Hopefully the text would then appear as:
000227.250000 000214.100000 PLACEHOLDER (Qty) Tj
000409.000000 000214.100000 PLACEHOLDER (Ink) Tj
000019.899900 000214.149000 PLACEHOLDER (Prt) Tj
000250.100000 000214.100000 PLACEHOLDER (Hours) Tj
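To make the target concrete, here is a rough sketch of the padding done in awk rather than sed. It assumes the coordinates are always the first two space-separated fields and are never negative, and pad_coords is just a name made up for illustration. The format %013.6f means 13 characters in total (6 digits, the decimal point, 6 digits), and a value with more than 6 decimals would be rounded.

```shell
# pad_coords: hypothetical helper name, not part of any existing tool.
# Reads lines on stdin and rewrites the first two fields (x and y) as
# fixed-width numbers: 6 digits, a decimal point, 6 digits.
pad_coords() {
    awk '{
        # "+ 0" forces a numeric conversion, so "409." is read as 409.
        $1 = sprintf("%013.6f", $1 + 0)
        $2 = sprintf("%013.6f", $2 + 0)
        # Assigning to a field makes awk rebuild the line with single
        # spaces between fields, which is fine for this data.
        print
    }'
}

# Try it on the four sample lines.
printf '%s\n' \
    '227.25 214.100 PLACEHOLDER (Qty) Tj' \
    '409. 214.100 PLACEHOLDER (Ink) Tj' \
    '19.8999 214.149 PLACEHOLDER (Prt) Tj' \
    '250.100 214.100 PLACEHOLDER (Hours) Tj' | pad_coords
```

On a real file this would be run as pad_coords < input.txt > padded.txt, where both file names are placeholders.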
I am fine with creating the script to put a page number at the start of the number, but I just cannot seem to get the leading and trailing zeros where they need to be. If you could point me in the right direction or give me some suggestions to try, I would greatly appreciate it.
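For context, this is roughly how I expect to sort once the padding works, since fixed-width numbers should compare correctly as plain text. It is only a sketch: sorting by y first and then by x is my assumption about the order I will need, and sort_by_position is a made-up name.

```shell
# sort_by_position: hypothetical helper name.
# Sorts already-padded lines by field 2 (y), then by field 1 (x).
# LC_ALL=C forces plain byte-wise comparison, which is enough because
# every number has the same width.
sort_by_position() {
    LC_ALL=C sort -t ' ' -k2,2 -k1,1
}

# Try it on the padded lines I am hoping to end up with.
printf '%s\n' \
    '000227.250000 000214.100000 PLACEHOLDER (Qty) Tj' \
    '000409.000000 000214.100000 PLACEHOLDER (Ink) Tj' \
    '000019.899900 000214.149000 PLACEHOLDER (Prt) Tj' \
    '000250.100000 000214.100000 PLACEHOLDER (Hours) Tj' | sort_by_position
```

With this input the three lines at y 214.100 come out first, ordered Qty, Hours, Ink by x, followed by the Prt line at y 214.149.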
Thank you very much, Paul