Thanks for giving your time and effort to answer questions and help newbies like me understand awk.
I have a huge file (millions of lines), and Perl takes quite a while on it, so I'd like to convert these Perl one-liners to awk.
Basically, I'd like every ISA that is sandwiched between non-word characters to start its own line.
To get there, I'd like to remove the first non-word character in front of each "sandwiched" ISA; put another way, each "sandwiched" ISA should end up at the beginning of a line.
How would I do this in awk? Thanks so much for the help, I really do appreciate it. Please let me know if I can explain this more clearly or if you need data examples.
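One possible approach, offered as a minimal sketch rather than a drop-in replacement for the Perl: loop over each line with match(), and wherever ISA has a non-word character on both sides, drop the character in front and start a new line at ISA. The function name isa_to_line_start is made up for illustration, and the sketch assumes "word character" means ASCII letters, digits, and underscore (the usual meaning of \w).

```shell
# Hypothetical helper: reads the named files (or stdin) and writes to stdout.
# Usage (file names are placeholders): isa_to_line_start input.txt > output.txt
isa_to_line_start() {
  awk '
  {
    line = $0
    out = ""
    # Look for ISA with a non-word character on each side.
    while (match(line, /[^A-Za-z0-9_]ISA[^A-Za-z0-9_]/)) {
      # Keep the text before the match, drop the non-word character
      # in front of ISA, and start a new line there instead.
      out = out substr(line, 1, RSTART - 1) "\n"
      # Resume at ISA itself so the separator after it is preserved.
      line = substr(line, RSTART + 1)
    }
    print out line
  }' "$@"
}
```

An ISA that is already at the start of a line, or that is part of a longer word, is left alone because it has no non-word character directly in front of it. With GNU awk, gensub() and \W would make this shorter, but both are gawk extensions, so the loop above sticks to features that other awk implementations should also accept.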
Here are a few sample lines ... I only want the lines with the red ISA on a new line, not the ones with the purple ISA ... I know it's a bit messy ... I can explain the logic/syntax of the file, if you'd like
---------- Post updated 07-06-11 at 11:04 AM ---------- Previous update was 07-05-11 at 05:38 PM ----------
Thought I'd add some details on the file.
ISA, GS, ST, AK1, AK2, AK5, AK9, SE, GE, IEA are line headers and generally follow the same order. ISA is the beginning of the record, IEA is the end of the record. There are tens of thousands of records in a given file.
The file also has non-word character field separators (i.e. ~ !). It also has line separators (either a newline or a non-word character; later an awk script will change all [\W] to newlines).
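The later step mentioned above, turning every non-word character into a newline, could look roughly like this in portable awk. This is only a sketch: the function name nonword_to_newline is hypothetical, and it assumes ASCII word characters, so spaces and punctuation are all treated as separators.

```shell
# Hypothetical helper: reads the named files (or stdin) and writes to stdout.
nonword_to_newline() {
  awk '
  {
    # Replace each character that is not a letter, digit, or underscore
    # with a newline, then print the resulting pieces.
    gsub(/[^A-Za-z0-9_]/, "\n")
    print
  }' "$@"
}
```

Note that consecutive separators would produce empty lines in the output, which may or may not be wanted for this file.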
Do you have any way to test that code on a Linux machine? I ran a simple test with a file containing one line of 4 million random characters, and it ran successfully on Linux: