A file reports.txt (see attachment) contains 17 pages of patient reports. Each patient is identified by a prefix (i.e. 11) and a 7-digit number. There are six patient reports in the file in total. One patient report may span multiple pages. The page counts for each Lab no. (the seven-digit number) are as follows.
I am looking for an awk or perl solution to split the file by the 7-digit number. The expected file name is the prefix (i.e. 11) followed by the 7-digit number.
What a "page" is depends on your paper and font, so I can't tell if I have enough pages. But this splits as you ask.
nawk '{ print > ("11" $3 ".txt") }' < file.txt
[edit] Okay, your actual data is nothing like the data you showed in your post. Working on it.
---------- Post updated at 03:08 PM ---------- Previous update was at 02:33 PM ----------
The data was so scrambled it took a while to see any patterns. I look for "Lab." on each page and take the number after it. If no "Lab." is found on a page, the last number found is used.
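A minimal sketch of that approach, not the script originally posted. It assumes pages are separated by form-feed characters and that the lab number is the first run of seven digits after "Lab."; neither is confirmed for the real file. The function name split_reports is only illustrative.

```shell
# Sketch: split a report file into one output file per lab number.
# Assumptions: pages end with a form feed (\f), and the lab number is the
# first 7-digit run following the text "Lab." on a page.
split_reports() {
    awk -v prefix="11" '
    BEGIN { RS = "\f"; ORS = "\f" }   # one record per page, keep page breaks
    {
        # Look for "Lab." followed by optional non-digits and seven digits.
        if (match($0, /Lab\.[^0-9]*[0-9][0-9][0-9][0-9][0-9][0-9][0-9]/)) {
            hit = substr($0, RSTART, RLENGTH)
            lab = substr(hit, length(hit) - 6)   # keep only the seven digits
        }
        # Pages without "Lab." reuse the last number seen;
        # pages before the first number are skipped.
        if (lab != "") {
            out = prefix lab ".txt"
            print > out
        }
    }' "$1"
}
```

Calling split_reports reports.txt would write one file per lab number into the current directory, named with the prefix 11 followed by the seven digits.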
Thanks a lot for giving me a hand. So far I have copied your code to a file called yy in the same directory as a copy of reports.txt. When I ran "awk yy", it did nothing for 15 minutes. Could you please check whether I am using the wrong command?
As both of you suggested, I have put the same code in a shell script and run it with sh yy, and I have even tried it from the command line too. This time it finished instantly but produced no output and no error either :(
Well, the solution doesn't work. (And my second suggestion, about output, is wrong. I was inattentive, sorry.)
Your file was produced by some text processor, not a text editor. It has a lot of special escape sequences. Is it possible to convert your file to plain text in your text processor?
If not, it would be hard to give you a solution: it would take some binary hacking to find the boundaries of the chunks in order to split the file.
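One rough way to check whether a file still contains such sequences is to count the bytes that are neither printable ASCII nor whitespace. This is only a sketch: the function name count_control_bytes is made up for illustration, and in the C locale it also counts any byte above 0x7E, so non-ASCII text would show up too.

```shell
# Sketch: count bytes that are neither printable ASCII nor whitespace.
# A plain-text ASCII file should report 0; escape sequences raise the count
# (each ESC byte is counted; the printable characters after it are not).
count_control_bytes() {
    LC_ALL=C tr -d '[:print:][:space:]' < "$1" | wc -c | tr -d ' '
}
```

For example, count_control_bytes reports.txt prints a single number; anything well above zero suggests the file is not plain text.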