Copy data to new file based on input pattern

Hi All,

I want to create a new file that contains only the data meeting certain conditions.

The input data looks like this.

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Modify|32|32|1617
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1617
ORDER|Header|Add|32|32|1618
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1618
ORDER|Header|Add|32|32|1619
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1619

I want to copy to a new file only the data whose header starts with ORDER|Header|Add (for example ORDER|Header|Add|32|32|1618), followed by the details lines, up to the termination line, which looks like ORDER|T|32|32|1618.

The numbers 1618, 1619, and so on keep incrementing by 1, and each number identifies one chunk of data. So from the data shown above I need only the data below.

Sample output:

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Add|32|32|1618
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1618
ORDER|Header|Add|32|32|1619
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1619

Thanks & Regards
Gaurav

Hello grvk101,

Welcome to the forums; I hope you will enjoy learning and sharing knowledge here. Please use code tags for the sample Input_file and the expected output too.
Please try the following and let me know if it helps.

awk '!/ORDER\|Header\|Add\|32\|32\|161[689]/ && !/ORDER\|Details/ && !/ORDER\|T/{flag=""} /ORDER\|Header\|Add\|32\|32\|161[689]/{flag=1} flag'   Input_file

Output will be as follows.

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Add|32|32|1618
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1618
ORDER|Header|Add|32|32|1619
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1619

EDIT: Adding a non-one-liner form of the solution here too.

awk '
!/ORDER\|Header\|Add\|32\|32\|161[689]/ && !/ORDER\|Details/ && !/ORDER\|T/{
  flag=""
}
/ORDER\|Header\|Add\|32\|32\|161[689]/{
  flag=1
}
flag
'   Input_file
 

Thanks,
R. Singh

Hello Ravinder,

I am getting the error below.

$ awk !/ORDER\|Header\|Add\|32\|32\|161[68]/ && !/ORDER\|Details/ && !/ORDER\|T/{flag=""} /ORDER\|Header\|Add\|32|\32\|161[68]/{flag=1} flag samplefile.dat
./test.sh[6]: $:  not found

In addition, I need to search based only on the leading fields of the header; the rest of the columns in the header need to be skipped.

Eg -
Search only on basis of below format

ORDER|Header|Add..........................
Followed by details
and then by Terminate ie ORDER|T|............

If possible, kindly share the entire code at once so that I can get a feel for it and reuse it in my next, related scripts.

Not clear. What needs to be done apart from removing the "Modify" blocks?

Hello grvk101,

I am not sure you pasted the complete command. Please copy my complete command, including the single quotes around the awk program, and try again. You also haven't shown us which OS you are on; if you are on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.

Also, your first post and your last post seem to have different requirements; please give all the details of your question, in code tags.

Thanks,
R. Singh
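
The "$: not found" message in the error above suggests that the shell prompt character was pasted into test.sh and that the single quotes around the awk program were lost. As a rough illustration only, here is a minimal sketch of how a complete script might look. The function name select_add_blocks and the output name newfile.dat are made up for this example; the sketch also assumes that only the first three fields of the header matter, as asked in the post above.

```shell
#!/bin/sh
# Hypothetical helper: print only the Add blocks of the file given as $1.
# The awk program sits inside single quotes so that the shell passes it
# to awk unchanged; unquoted, the shell would interpret |, &&, ! and {}.
select_add_blocks() {
    awk '
        # At every header line, decide whether the block is an Add block.
        /^ORDER\|Header\|/ { keep = ($0 ~ /^ORDER\|Header\|Add\|/) }
        # Print the current line while inside an Add block.
        keep
    ' "$1"
}

# Example call (samplefile.dat is the file name from the failing command):
# select_add_blocks samplefile.dat > newfile.dat
```

The remaining header columns are never inspected, so the same script should keep working as the order numbers change.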

Hi,
If I understand correctly, this sed command writes to a file all the text from the ..1618 header line through the ..1618 termination line.

sed -E '/([^|]*\|){5}1618$/!d;:A;N;/\n([^|]*\|){4}1618$/!bA;w savefile' infile
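
For comparison, the same single-order selection could probably also be written in awk. This is only a sketch under assumptions: the function name select_order is invented here, and it assumes the order number is always the last field of both the header and the termination line, as in the sample data.

```shell
#!/bin/sh
# Hypothetical helper: print the block for one order number.
# $1 = order number (for example 1618), $2 = input file.
select_order() {
    awk -F'|' -v id="$1" '
        $2 == "Header" && $NF == id { inblock = 1 }   # header of the wanted order
        inblock                                       # print lines inside the block
        $2 == "T" && $NF == id      { inblock = 0 }   # terminator closes the block
    ' "$2"
}

# Example call:
# select_order 1618 infile > savefile
```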

I'm expecting another shoe to drop soon with the rest of your requirements, but here are three simple awk scripts and one simple sed script that seem to do what you requested in post #1:

awk -F'|' '$1 == "ORDER" && $2 == "Header" && $3 == "Add",$1 == "ORDER" && $2 == "T"' file

awk '/^ORDER\|Header\|Add\|/,/^ORDER\|T\|/' file

awk 'substr($0, 1, 17) == "ORDER|Header|Add|",substr($0, 1, 8) == "ORDER|T|"' file

sed -n '/^ORDER|Header|Add|/,/^ORDER|T|/p' file

Maybe you can modify one of these to work with your other, unspecified requirements to get something that will work for you.
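
Since the original request was to copy the selected data to a new file, here is a minimal sketch of one of the range-pattern commands above with its output redirected. The output name add_orders.dat is a placeholder, and the sample data is a shortened version of the input from post #1.

```shell
#!/bin/sh
# Build a small sample input (shortened from the data in post #1).
cat > samplefile.dat <<'EOF'
ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|T|32|32|1616
ORDER|Header|Modify|32|32|1617
ORDER|Details1.........
ORDER|T|32|32|1617
ORDER|Header|Add|32|32|1618
ORDER|Details1.........
ORDER|T|32|32|1618
EOF

# Print every line from an Add header through the next termination line
# and redirect the result into a new file.
awk '/^ORDER\|Header\|Add\|/,/^ORDER\|T\|/' samplefile.dat > add_orders.dat
```

With this sample, add_orders.dat should end up holding the 1616 and 1618 blocks and nothing from the Modify block.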

Dear Don,

Thanks for your effort. The command you provided worked perfectly, and I was able to select all the records with Add headers and their subsequent lines.

Dear Ravinder,

Thanks for your effort.

Below is my requirement and output.
Input file:

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Modify|32|32|1617
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1617
ORDER|Header|Add|32|32|1618
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1618
ORDER|Header|Add|32|32|1619
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1619

Output:

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Add|32|32|1618
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1618
ORDER|Header|Add|32|32|1619
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1619

Regards.
Gaurav

Dear Don,

I was going through my output and found that the blocks whose header is M (Modify) were also included and had been converted to A (Add).

But I want the blocks with M (Modify) headers to be ignored (skipped).

Please find a sample below.

Input

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Modify|32|32|1617
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1617

Required Output

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616

As of now using the code you have provided, I got the output as follows.

ORDER|Header|Add|32|32|1616
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1616
ORDER|Header|Add|32|32|1617
ORDER|Details1.........
ORDER|Details2.........
ORDER|Details3.........
ORDER|Details4.........
ORDER|T|32|32|1617

Note:- The Modify is getting changed to Add, whereas I need to ignore the same.

Regards.
Gaurav

This is an interesting story, but there is absolutely nothing in any of the four code suggestions I supplied in post #7 in this thread that could change the string Modify to the string Add .

If a transaction termination line does not start with the exact sequence:

ORDER|T|

and a modify transaction follows it, that modify transaction might also be copied to the output but the Modify would not be changed to Add .

Please show us a sample input file that exhibits the behavior you have described above (including the exact line before the start of the modify transaction that is changed to an add transaction). And tell us which of the four suggestions I provided exhibits the behavior you're seeing with that input.
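
To illustrate the failure mode described above, here is a small sketch with made-up data in which the first termination line has a leading space. The file name runaway.dat is hypothetical.

```shell
#!/bin/sh
# Sample data: the 1616 terminator has a leading space, so it does not
# start with the exact sequence ORDER|T|.
cat > runaway.dat <<'EOF'
ORDER|Header|Add|32|32|1616
ORDER|Details1.........
 ORDER|T|32|32|1616
ORDER|Header|Modify|32|32|1617
ORDER|Details1.........
ORDER|T|32|32|1617
EOF

# The range opened by the Add header is not closed until the 1617
# terminator, so the Modify block is printed too, with Modify unchanged.
awk '/^ORDER\|Header\|Add\|/,/^ORDER\|T\|/' runaway.dat

# One way to spot such lines: list terminator-like lines that do not
# begin with ORDER|T| in the first column.
grep -n 'ORDER|T|' runaway.dat | grep -v '^[0-9]*:ORDER|T|'
```

All six lines are printed by the awk command, and the grep pipeline points at line 3 as the malformed terminator.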

Dear Don,

I found out where the issue was. Thanks for your prompt reply.
There was duplicate data in 2 different files for the same number, with the header as Add in one and Modify in the other, so the Modify block from one file was ignored and the Add block from the other file was added.
Sorry for the inconvenience caused.
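
In case it helps anyone hitting the same situation, here is a rough sketch of how such duplicates might be spotted before filtering. The function name find_conflicting_orders is invented for this example, and it assumes the action is the third field and the order number the last field of each header line.

```shell
#!/bin/sh
# Hypothetical helper: report order numbers whose header appears with
# more than one action (for example both Add and Modify) in the given files.
find_conflicting_orders() {
    awk -F'|' '
        $2 == "Header" {
            key = $NF SUBSEP $3
            if (!(key in seen)) {          # count each (order, action) pair once
                seen[key] = 1
                actions[$NF] = actions[$NF] " " $3
                count[$NF]++
            }
        }
        END {
            for (id in count)              # output order is not guaranteed
                if (count[id] > 1)
                    print id ":" actions[id]
        }
    ' "$@"
}

# Example call with two hypothetical files:
# find_conflicting_orders file1.dat file2.dat
```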

Regards.
Gaurav