Help me please: splitting a single file in Unix into different files based on its data

I have a file in Unix with sample data as follows:

--------------------------------------------------------------
--------------------------------------------------------------
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

I want this file to be split into different files. For the sample data that means 2 files, named main and tag, and those files must contain the data below:
main:

{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

___________________________________________________________________________________________
tag:

{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

____________________________________________________________________________________________
I need to do some further analysis on these files. My file contains many similar partitions. Can anyone help me with this in Unix? Thanks in advance.

What criteria are you following to split the file?

Try this:

awk '{if($0 ~ /XXparameter\|Layout\|/){a++;print > ("file_" a)}else{if(a){print > ("file_" a)}}}' file

For your sample records above, two files are created:

$ ls file_*
file_1  file_2

I have to second raj_saini20's request to be more specific since, for example, all the characteristic lines are identical. However, assuming the line containing "Layout" is the separator, this should do the task:

 awk '/Layout/{fn="file"++x} x{print > fn}' file
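
One caveat with a one-liner like this: every output file stays open until awk exits, and some awk implementations limit how many files can be open at once. A minimal sketch, assuming "Layout" really is the separator (the input file sample.txt, its contents and the "file" prefix are placeholders made up for illustration), closes the previous output before starting the next one:

```shell
# Work in a scratch directory so the sketch does not clutter anything.
cd "$(mktemp -d)" || exit 1

# Tiny stand-in for the real input (placeholder data, not the original file).
cat > sample.txt <<'EOF'
header line, ignored
{1|XXparameter|Layout|first
body of part one
{1|XXparameter|Layout|second
body of part two
trailing line
EOF

# Start a new output file at every "Layout" line; close the previous one first.
# Lines before the first "Layout" line are skipped because x is still 0.
awk '/Layout/ { if (fn) close(fn); fn = "file" (++x) }
     x        { print > fn }' sample.txt

ls file*
```

With many partitions this keeps only one output file open at a time; the splitting logic itself is unchanged.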

How can I name my new files with the names taken from the text?

---------- Post updated at 11:02 PM ---------- Previous update was at 11:00 PM ----------

With the code above I can get the starting text, but how do I make multiple files from one file that contains all the data? Can we achieve this using a loop? Can you please help me extract the lines, i.e. from the starting line up to and including the ending line? Thanks in advance.

I don't understand. As I pointed out before, all the "Layout" lines in your sample are identical. What should be the filename?

WAIT! There's main and tag in the line after the "Layout" line. Try

$ awk 'BEGIN{FS="[\\\|]"} /Layout/{a=$0; getline; fn=$14;  print a >fn } a{print > fn}' file

---------- Post updated at 11:02 PM ---------- Previous update was at 11:00 PM ----------

Again, I don't understand. The code is producing a different, new output file every time it encounters a line containing "Layout". BTW, you did not specify a criterion about how to split the file, as already requested by raj_saini20. Did you try the code? What do you mean by "starting position and ending line"? Where and when do we get these and where should we put these?
Please provide meaningful input samples and the desired output.

--------------------------------------------------------------
--------------------------------------------------------------
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts 
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\tag|3|2|Pw$|@{0|}}
-------------------------------------------------------
----------------------------------------------------------
----------------------------------------------------------
{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{30001002|XXparameter|metadata||7|8|RF=||{0|}}

The starting line of each file (the Layout line) should be highlighted in green and the ending line (the metadata line) in violet. The last line can repeat identically, as you can see in the sample data; I need everything up to the very last of those ending lines in the file. My original file has many partitions like this sample, and Layout even appears twice in each partition I want. Sorry, I did not notice that when I posted the sample data. I'll provide the smallest original file on Monday. Please try to solve that problem. Are the first and last line criteria clear now?

---------- Post updated at 12:42 PM ---------- Previous update was at 11:41 AM ----------

The file names (main and tag in the sample data) also repeat. So can we initialize a variable to zero, increment it every time a file is created, and append it to the end of the file name?
For example file names:

main1
tag2
main3
main4
tag5

Is this possible? Or, as another suggestion, could we make it:

main1
tag1
main2
main3
tag2

That is, there are 3 files for main and 2 for tag. Is this possible? Please help me out.

I leave the colouring of lines as an exercise for you. For the first line, you may want to print your colour escape sequences around the a in the print a > fn statement.
For the file numbering, try:

awk 'BEGIN{FS="[\\\|]"} /Layout/{a=$0; getline; fn=$14(++A[$14]);  print a >fn } a{print > fn}' file
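
The two numbering schemes from the question differ only in which counter gets appended to the name. A small sketch, assuming the names simply arrive one per line (this input is made up to mirror the two example lists, and the variable names are illustrative): a single counter n gives main1, tag2, ..., while an array indexed by name, as in A[$14] above, counts each name separately.

```shell
# The five names from the example, in order of appearance.
out=$(printf '%s\n' main tag main main tag |
awk '{
    global  = $1 (++n)          # one counter shared by all names
    pername = $1 (++seen[$1])   # separate counter for each name
    print global, pername
}')

# Left column: shared counter. Right column: per-name counter.
printf '%s\n' "$out"
```

The left column reproduces the first list (main1, tag2, main3, main4, tag5) and the right column the second (main1, tag1, main2, main3, tag2).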

awk '{if($0 ~ /XXparameter\|Layout\|/){L=$0}else{if(L){if($0 ~ /tag/){t++;s="tag"t}else{m++;s="main"m}{print L>s;L=""}}{print > s}}}' file

I am getting an error:

 awk 'BEGIN{FS="[\\\|]"} /Layout/{a=$0; getline; fn=$14;  print a >fn } a{print > fn}' temp.txt
awk: warning: escape sequence `\|' treated as plain `|'
awk: (FILENAME=temp.txt FNR=136) fatal: expression for `>' redirection has null string value

---------- Post updated at 02:19 PM ---------- Previous update was at 02:09 PM ----------

Each of the partitions I need looks like the one below; it contains Layout twice:

{2010503005|XXGfvertex|46|0|99|0|{|{30100001|XXparameter_set|@@@@
{{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts [vector _interp_("mfile:$\{INF_ENTRPRSDWUNFYRETLCRED_MFS\}/m_cdp2_uedw_v_cls_uld.dat", "dollar_substitution")]]]|3|9||@{0|}}
{30001002|XXparameter|read_metadata|$\{INF_ENTRPRSDWUNFYRETLCRED_DML\}/cdp2_uedw_v_cls.dml|3|2|f$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2|Pw$|@{0|}}
{30001002|XXparameter|eme_dataset_location|$\{INF_ENTRPRSDWUNFYRETLCRED_MFS\}/m_cdp2_uedw_v_cls_uld.dat|3|2|$|@{0|}}
}}@0|@127358|686578|152000|707000|40000|40000|37115|m_cdp2_uedw_v_cls_uld.dat|SunTrust Bank Inc.||1|10|-1||6||32769|-1|-1|}}
{2010203004|XXGoport|47|0|101|0|{@{}@191000|721000|11000|11000|read|0.0|@@@2160|0|}}
{2010503005|XXGfvertex|48|0|104|0|{Represents one file, many files, or a multifile as an input to your graph.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|protection|0666|12|2|RF$||{0|}}
{30001002|XXparameter|mode|0x0001|1|2|FH$|modes of access|{0|}}
{30001002|XXparameter|Layout|@28|2|RF$||{0|}}
{30001002|XXparameter|read_metadata||7|1|RFl||{0|}}
{30001002|XXparameter|mpcmodtime|1196372206|1|1|Hl|The last modification time of this component's template|{0|}}
{30001002|XXparameter|eme_dataset_location|@3|9|F|Place in the EME to create a dataset corresponding to this file.|{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|-1|-1|}}
{2010203004|XXGoport|49|0|106|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}
}}@0|0|0|0|read|0.0|@@@2160|0|}}

So while using this code:

awk '/Layout/{fn="file"++x} x{print > fn}' temp.txt

I am getting the wrong files, so I tried:

awk '/Layout|$[[record/{fn="file"++x} x{print > fn}' temp.txt

But it throws an error. Can you please resolve this?

Thanks a lot in advance

It is still not clear what you want.
Your requirement changes in every post.
Going by the code you tried above, try this:

awk '/Layout\|\$\[\[recor/{fn="file_"++x} x{print > fn}' file

The code below now works fine, but can it be restricted to the exact word Layout?

awk '/Layout\|\$\[\[recor/{fn="file_"++x} x{print > fn}' temp.txt

I already mentioned that I have some lines containing InputLayout and OutputLayout; the file is also being split at these, which is not needed.
So the split must happen strictly at 'Layout' followed by '$'.

Thanks in advance
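
Some background on the patterns in this exchange: inside an awk regular expression an unescaped | means "or", $ anchors the end of the line and [ opens a bracket expression, which is most likely why /Layout|$[[record/ was rejected. One way to keep InputLayout and OutputLayout out, sketched under the assumption that the parameter name is always preceded by a | as in the posted samples, is to include that leading separator in the pattern. The three lines below are shortened from the posted data and the file name is a placeholder.

```shell
cd "$(mktemp -d)" || exit 1

# Shortened versions of three lines from the posted data.
cat > layout_lines.txt <<'EOF'
{30001002|XXparameter|Layout|$[[record kind 85
{30001002|XXparameter|InputLayout|$[[record kind 85
{30001002|XXparameter|Layout|@28|2|RF$||{0|}}
EOF

# \| is a literal pipe, \$ a literal dollar sign, \[ a literal bracket.
# The leading \| means "Layout" must be the whole parameter name.
matched=$(awk '/\|Layout\|\$\[\[record/ { print NR }' layout_lines.txt)
echo "matching line numbers: $matched"
```

Only the first line matches: the second is excluded by the leading pipe, the third because no $[[record follows.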

---------- Post updated at 05:22 PM ---------- Previous update was at 03:36 PM ----------

With the code :

awk 'BEGIN{FS="[\\\|]"} /Layout/{a=$0; getline; fn=$14;  print a >fn } a{print > fn}' temp.txt

I am getting the following error:

awk: warning: escape sequence `\|' treated as plain `|'
awk: (FILENAME=temp.txt FNR=136) fatal: expression for `>' redirection has null string value

Can you please resolve this? Thanks in advance.
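
A note on the two messages. The warning appears because \| is not a recognized escape sequence inside an awk string, so gawk falls back to a plain |; writing the separator as "[\\\\|]" avoids it. The fatal error means fn was empty at input line 136, most likely because the line read by getline had fewer than 14 fields, as happens in the posted data whenever a Layout line is followed by a short read_metadata line instead of a path. The following is only a sketch on shortened stand-in data: it drops getline and, as an assumption about the desired behaviour, keeps writing to the current file when no name can be found.

```shell
cd "$(mktemp -d)" || exit 1

# Shortened stand-in data: the second "Layout" line is not followed by a path.
cat > temp.txt <<'EOF'
{1|XXparameter|Layout|$[[record kind 85
{1|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\main|3|2|Pw$|@{0|}}
{1|XXparameter|metadata||7|8|RF=||{0|}}
{1|XXparameter|Layout|@28|2|RF$||{0|}}
{1|XXparameter|read_metadata||7|1|RFl||{0|}}
EOF

# "[\\\\|]" reaches the regex engine as [\\|]: split on a backslash or a pipe.
awk 'BEGIN { FS = "[\\\\|]" }
     # A Layout line is held back until the next line shows whether a name follows.
     pending != "" { if ($14 != "") fn = $14      # only switch files if a name exists
                     if (fn != "") print pending > fn
                     pending = "" }
     /Layout/      { pending = $0; next }
     fn != ""      { print > fn }
     END           { if (pending != "" && fn != "") print pending > fn }' temp.txt

cat main
```

Here all five lines end up in main, because the second Layout line is followed by a line without a 14th field and therefore does not start a new file.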

Splitting on "Layout" alone has already been answered in this thread;
see posts 3 and 4.

And for a quick response, please provide your input and desired output in detail.

 
{2010503005|XXGfvertex|72|0|189|0|{|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|mode|0x0001|3|2|$|@{0|}}
{30001002|XXparameter|key|\{retl_prod_hier_dim_id\}|3|2|$|@{0|}}
{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts [vector _interp_("mfile:$\{INF_RETLDATAMART_MFS\}/m_cdp2_rdm_dt_retl_prod_hier_dim_lkp.dat", "dollar_substitution")]]]|3|9||@{0|}}
{30001002|XXparameter|read_metadata|$\{INF_RETLDATAMART_DML\}/cdp2_rdm_dt_retl_prod_hier_dim.dml|3|2|f$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2|Pw$|@{0|}}
}}@0|@119757|267507|145000|288000|40000|40000|37101|m_cdp2_rdm_ dt_retl_prod_hier_dim_lkp.dat|SunTrust Bank Inc.||1|10|-1||6||32769|-1|-1|}}
{2010203004|XXGoport|73|0|191|0|{@{}@184000|302000|11000|11000|read|0.0|@@@2160|0|}}
{2010503005|XXGfvertex|74|0|194|0|{Lookup files are components containing shared data. Use lookup files with the DML lookup functions to access records according to a key.|
{30100001|XXparameter_set|@@@@{{30001002|XXparameter|protection|0666|12|2|RF$||{0|}}
{30001002|XXparameter|mode|0x0200|1|2|FH$|modes of access|{0|}}
{30001002|XXparameter|condition||3|2|F$||{0|}}
{30001002|XXparameter|conditionInputPort||3|2|F$||{0|}}
{30001002|XXparameter|conditionOutputPort||3|2|F$||{0|}}
{30001002|XXparameter|condition_interpretation|Remove completely|15|1|Fl||{2|Replace with flow|Remove completely|}}
{30001002|XXparameter|condition_interpretation.display_name|condition-interpretation|3|9|P|@{0|}}
{30001002|XXparameter|key||19|2|RF$|Key specifier For Lookup File|{0|}}
{30001002|XXparameter|key.condition|mode lookup|3|15|P?|@{0|}}
{30001002|XXparameter|Layout|@28|2|RF$||{0|}}
{30001002|XXparameter|read_metadata||7|2|RF$|Record Format|{0|}}
{30001002|XXparameter|mpcmodtime|1196372206|1|1|Hl|The last modification time of this component's template|{0|}}
{30001002|XXparameter|eme_dataset_location||3|2|F$|Place in the EME to create a dataset corresponding to this file.|{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|-1|-1|}}
{2010203004|XXGoport|75|0|196|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=|Record Format|{0|}}
{2010503005|XXGfvertex|46|0|99|0|{|{30100001|XXparameter_set|@@@@
{{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts [vector _interp_("mfile:$\{INF_ENTRPRSDWUNFYRETLCRED_MFS\}/m_cdp2_uedw_v_cls_uld.dat", "dollar_substitution")]]]|3|9||@{0|}}
{30001002|XXparameter|read_metadata|$\{INF_ENTRPRSDWUNFYRETLCRED_DML\}/cdp2_uedw_v_cls.dml|3|2|f$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2|Pw$|@{0|}}
{30001002|XXparameter|eme_dataset_location|$\{INF_ENTRPRSDWUNFYRETLCRED_MFS\}/m_cdp2_uedw_v_cls_uld.dat|3|2|$|@{0|}}
}}@0|@127358|686578|152000|707000|40000|40000|37115|m_cdp2_uedw_v_cls_uld.dat|SunTrust Bank Inc.||1|10|-1||6||32769|-1|-1|}}
{2010203004|XXGoport|47|0|101|0|{@{}@191000|721000|11000|11000|read|0.0|@@@2160|0|}}
{2010503005|XXGfvertex|48|0|104|0|{Represents one file, many files, or a multifile as an input to your graph.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|protection|0666|12|2|RF$||{0|}}
{30001002|XXparameter|mode|0x0001|1|2|FH$|modes of access|{0|}}
{30001002|XXparameter|Layout|@28|2|RF$||{0|}}
{30001002|XXparameter|read_metadata||7|1|RFl||{0|}}
{30001002|XXparameter|mpcmodtime|1196372206|1|1|Hl|The last modification time of this component's template|{0|}}
{30001002|XXparameter|eme_dataset_location|@3|9|F|Place in the EME to create a dataset corresponding to this file.|{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|-1|-1|}}
{2010203004|XXGoport|49|0|106|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}
{2010600005|XXGgraph|136|0|409|0|{Repartitions data records by key values and then sorts the records within each partition.|
{30100001|XXparameter_set|@@@@{{30001002|XXparameter|Key|\{retl_sub_lob_cd\}|3|2|$|@{0|}}
{30001002|XXparameter|InputLayout|$[[record kind 85 subkind 0 parts [vector _interp_("$\{INF_RETLDATAMART_MFS\}", "dollar_substitution")]]]|3|9||@{0|}}
{30001002|XXparameter|Max_core|$AI_GRAPH_MAX_CORE|3|2|$|@{0|}}
{30001002|XXparameter|OutputLayout|$[[record kind 85 subkind 0 parts [vector _interp_("$\{INF_RETLDATAMART_MFS\}", "dollar_substitution")]]]|3|9||@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Sort\\Partition_by_Key_and_Sort.mp|3|2|Pw$|@{0|}}
}}@0|@298442|218821|318000|239000|481000|303000|37023|PKS1 - \{retl_sub_lob_cd\}|Ab Initio Software|Created 04/22/98 12:54:47|1|10|-1||6||32769|{0|}0|0|{0|}{0|}{0|}{0|}.4407484531402588|481000|303000|0|}}
{2010210004|XXGflow|137|0|411|0|{@{}@384|.5|.5|{8|217000|171000|237000|171000|296000|171000|316000|171000|}0|20|}}
{2010501005|XXGpvertex|138|0|413|0|{Groups data according to a collator. 
 
 
A Hash Partition component is generally followed by a Local Sort component.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|in_metadata||3|8|s=|@{0|}}
{30001002|XXparameter|out_metadata||3|8|s=|@{0|}}
}}@0|@61000|118000|81000|138000|126000|68000|0|Partition by Key|Ab Initio Software|Built-in|1|10|-1||6||32769|1|{1|0|}}}
{2010203004|XXGoport|139|0|415|0|{@{}@206000|166000|11000|11000|out|0.0|@@@2322|0|}}
{2010202004|XXGiport|140|0|418|0|{@{}@71000|166000|11000|11000|in|0.0|@@@1808|0|}}
{2010501005|XXGpvertex|141|0|422|0|{Orders your data according to a collating expression.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|in_metadata||3|8|s=|@{0|}}
{30001002|XXparameter|out_metadata||3|8|s=|@{0|}}
}}@0|@307000|118504|327000|139000|104000|65000|0|Sort|Ab Initio Software|Built-in|1|10|-1||6||32769|1|{1|0|}}}
{2010203004|XXGoport|142|0|424|0|{@{}@430000|166000|11000|11000|out|0.0|@@@2448|0|}}
{2010202004|XXGiport|143|0|426|0|{@{}@317000|166000|11000|11000|in|0.0|@@@1808|0|}}
{2010203004|XXGoport|144|0|431|0|{@{}@529000|301000|11000|11000|out0|.5|@@@14736|0|}}
{2010202004|XXGiport|145|0|435|0|{@{}@308000|301000|11000|11000|in0|.5|@@@14096|0|}}
{2010600005|XXGgraph|146|0|439|0|{Repartitions data records by key values and then sorts the records within each partition.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|Key||19|2|RF$|Field to partition on|{0|}}
{30001002|XXparameter|InputLayout|@9|2|RF$||{0|}}
{30001002|XXparameter|Max_core|100663296|1|2|F$|maximum memory usage (before spilling to disk) in bytes|{0|}}
{30001002|XXparameter|OutputLayout|@9|2|RF$||{0|}}
{30001002|XXparameter|_parameters_of_Sort|@34|9|FHK|@{0|}}
{30001002|XXparameter|_parameters_of_Partition_by_Key|@34|9|FHK|@{0|}}
{30001002|XXparameter|conditionInputPort|in0|3|1|Fl||{0|}}
{30001002|XXparameter|conditionOutputPort|out0|3|1|Fl||{0|}}
{30001002|XXparameter|condition_interpretation|Replace with flow|15|1|Fl||{2|Replace with flow|Remove completely|}}
{30001002|XXparameter|condition_interpretation.display_name|condition-interpretation|3|9|P|@{0|}}
{30001002|XXparameter|mpcmodtime|1196372210|1|1|Hl|The last modification time of this component's template|{0|}}
{30001002|XXparameter|HelpID|comp_partition_by_key_and_sort|3|2|R$||{0|}}
{30001002|XXparameter|_ab_rexec_username|$USERNAME|3|1|RHl||{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|{0|}0|0|{0|}{0|}{0|}{0|}1.0|0|0|7|}}
{2010210004|XXGflow|147|0|441|0|{@{}@384|.5|.5|{8|217000|171000|237000|171000|296000|171000|316000|171000|}0|20|}}
{2010501005|XXGpvertex|148|0|443|0|{Groups data according to a collator. 
 
__________________________________________________________________________________________________________________________________________________________
A Hash Partition component is generally followed by a Local Sort component.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|Layout||3|8|=|@{0|}}
{30001002|XXparameter|key||3|8|=|@{0|}}
{30001002|XXparameter|!prototype_path|C:\\gui\\src\\mpc\\Partition\\Hash.mpc|3|2|Pw$|@{0|}}
{30001002|XXparameter|_propagate_through|metadata type: out = in
metadata type: in = out|3|9||@{0|}}
}}@0|@61000|118000|81000|138000|126000|68000|0|Partition by Key|Ab Initio Software|Built-in|1|10|-1||6||32769|1|{1|0|}}}
{2010203004|XXGoport|149|0|445|0|{@{}@206000|166000|11000|11000|out|0.0|@@@2322|0|}}
{2010202004|XXGiport|150|0|448|0|{@{}@71000|166000|11000|11000|in|0.0|@@@1808|0|}}
{2010501005|XXGpvertex|151|0|450|0|{|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|mpname|hash-partition|3|2|H$|The name used on the mp command line for this component|{0|}}
{30001002|XXparameter|image__|unitool|3|2|H$|The image used if this component was a custom component|{0|}}
{30001002|XXparameter|Layout|@9|2|RF$||{0|}}
{30001002|XXparameter|key||19|2|RFO$|Field to partition on|{0|}}
{30001002|XXparameter|in_metadata||7|1|RFsl||{0|}}
{30001002|XXparameter|out_metadata||7|1|RFsl||{0|}}
{30001002|XXparameter|doc_transform||8|2|FHs$|Document your transformation for dependency analysis|{0|}}
{30001002|XXparameter|doc_operation1|out::document(in)|3|1|RHl|The custom transformation|{0|}}
{30001002|XXparameter|port_analysis|out=in|3|2|H$||{0|}}
{30001002|XXparameter|continuous_analysis||3|2|H$||{0|}}
{30001002|XXparameter|_propagate_through||3|1|FHKl|@{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|1|{1|0|}}}
{2010203004|XXGoport|152|0|452|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}
}}@0|0|0|0|out|0.0|@@@2322|0|}}
{2010202004|XXGiport|153|0|455|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}
}}@0|0|0|0|in|0.0|@@@1808|0|}}
--------------------------------------------------------------------------------------------------------------------------------------------------------------
{2010501005|XXGpvertex|154|0|460|0|{Orders your data according to a collating expression.|
{30100001|XXparameter_set|@@@@{{30001002|XXparameter|Layout||3|8|=|@{0|}}
{30001002|XXparameter|key||3|8|=|@{0|}}
{30001002|XXparameter|max_core||3|8|=|@{0|}}
{30001002|XXparameter|!prototype_path|C:\\gui\\src\\mpc\\Sort-Merge\\Sort.mpc|3|2|Pw$|@{0|}}
{30001002|XXparameter|_propagate_through|metadata type: out = in
metadata type: in = out|3|9||@{0|}}
}}@0|@307000|118504|327000|139000|104000|65000|0|Sort|Ab Initio Software|Built-in|1|10|-1||6||32769|1|{1|0|}}}
{2010203004|XXGoport|155|0|462|0|{@{}@430000|166000|11000|11000|out|0.0|@@@2448|0|}}
{2010202004|XXGiport|156|0|464|0|{@{}@317000|166000|11000|11000|in|0.0|@@@1808|0|}}
{2010501005|XXGpvertex|157|0|467|0|{|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|mpname|local-sort|3|2|H$|The name used on the mp command line for this component|{0|}}
{30001002|XXparameter|image__|unitool|3|2|H$|The image used if this component was a custom component|{0|}}
{30001002|XXparameter|Layout|@9|2|RF$||{0|}}
{30001002|XXparameter|key||19|2|RFO$|Field to sort on|{0|}}
{30001002|XXparameter|max_core||1|2|FK$|maximum memory usage (before spilling to disk) in bytes|{0|}}
{30001002|XXparameter|max_core.display_name|max-core|3|9|P|@{0|}}
{30001002|XXparameter|max_core.keyword|max-core|3|9|P|@{0|}}
{30001002|XXparameter|in_metadata||7|1|RFsl||{0|}}
{30001002|XXparameter|out_metadata||7|1|RFsl||{0|}}
{30001002|XXparameter|doc_transform||8|2|FHs$|Document your transformation for dependency analysis|{0|}}
{30001002|XXparameter|doc_operation1|out::document(in)|3|1|RHl|The custom transformation|{0|}}
{30001002|XXparameter|port_analysis|out=in|3|2|H$||{0|}}
{30001002|XXparameter|continuous_analysis||3|2|H$||{0|}}
{30001002|XXparameter|_propagate_through||3|1|FHKl|@{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|1|{1|0|}}}
{2010203004|XXGoport|158|0|469|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}
}}@0|0|0|0|out|0.0|@@@2448|0|}}
{2010202004|XXGiport|159|0|472|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}
}}@0|0|0|0|in|0.0|@@@1808|0|}}
{2010203004|XXGoport|160|0|478|0|{@{}@0|0|0|0|out0|.5|@@@14736|0|}}
{2010202004|XXGiport|161|0|481|0|{@{}@0|0|0|0|in0|.5|@@@14096|0|}}
{2010503005|XXGfvertex|132|0|393|0|{|{30100001|XXparameter_set|@@@@
{{30001002|XXparameter|Layout|$[[record kind 85 subkind 0 parts [vector _interp_("mfile:$\{INF_RETLDATAMART_MFS\}/m_cdp2_rdm_dt_retl_prod_hier_xref.dat", "dollar_substitution")]]]|3|9||@{0|}}
{30001002|XXparameter|write_metadata|$\{INF_RETLDATAMART_DML\}/cdp2_rdm_dt_retl_prod_hier_xref.dml|3|2|f$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Output_File.mdc|3|2|Pw$|@{0|}}
}}@0|@1194511|289945|1220000|310000|40000|40000|37109|m_cdp2_rdm_ dt_retl_prod_hier_xref.dat|SunTrust Bank Inc.||1|10|0||6||32769|-1|-1|}}
{2010202004|XXGiport|133|0|395|0|{@{}@1210000|324000|11000|11000|write|0.0|@@@1776|0|}}
{2010503005|XXGfvertex|134|0|398|0|{Represents one file, many files, or a multifile as an output from your graph.|{30100001|XXparameter_set|@@@@{{30001002|XXparameter|protection|0666|12|2|RF$||{0|}}
{30001002|XXparameter|mode|0x0062|1|2|FH$|modes of access|{0|}}
{30001002|XXparameter|condition||3|2|F$||{0|}}
{30001002|XXparameter|conditionInputPort||3|2|F$||{0|}}
{30001002|XXparameter|conditionOutputPort||3|2|F$||{0|}}
{30001002|XXparameter|condition_interpretation|Remove completely|15|1|Fl||{2|Replace with flow|Remove completely|}}
{30001002|XXparameter|condition_interpretation.display_name|condition-interpretation|3|9|P|@{0|}}
{30001002|XXparameter|key||19|2|RF$|Key specifier For Lookup File|{0|}}
{30001002|XXparameter|key.condition|mode lookup|3|15|P?|@{0|}}
{30001002|XXparameter|Layout|@28|2|RF$||{0|}}
{30001002|XXparameter|write_metadata||7|1|RFl||{0|}}
{30001002|XXparameter|mpcmodtime|1196372208|1|1|Hl|The last modification time of this component's template|{0|}}
{30001002|XXparameter|eme_dataset_location||3|2|F$|Place in the EME to create a dataset corresponding to this file.|{0|}}
}}@0|@0|0|0|0|0|0|0|@@@1|10|-1|@6|@1|-1|-1|}}
{2010202004|XXGiport|135|0|400|0|{@{30100001|XXparameter_set|@@@@{{30001002|XXparameter|metadata||7|8|RF=||{0|}}

I need four files from the above data, named Input_File1, Input_File2, Partition_by_Key_and_Sort3 and Output_File4.

This is my requirement. Please solve it. Thanks for your patience with me, and thanks a lot in advance.

Is this what you want?

awk -F "\\" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if($0~/Ab Initio 1438\\/){split($11,a,".");x++;fn=a[1]x;{print s > fn ;s=""}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}' file

I have many such files, each with a different number of partitions (not only four). So can you please suggest code that works for any number of partitions?
Thanks in advance.

Just find one line that is common to all the partitions. If that common pattern is the same everywhere, the code will work for any number of partitions.
For example, "Ab Initio 1438" is common to all the partitions here; just find that pattern and its position within the partition.

{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2

Hope this helps you.


{30001002|XXparameter|!prototype_path|C:\\Program Files\\Ab Initio\\Ab Initio GDE\\Components\\Transform\\Join.mpc|3|2|Pw$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2|Pw$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2|Pw$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Transform\\Join.mpc|3|2|Pw$|@{0|}} 
 

Sorry yaar, yet another issue has come up. Please look at the lines above.
The line highlighted in red (the one with the C:\\Program Files path) is a little different from the others, so the file name I got is Transform instead of Join.
Can we do something so that whatever comes after the last \\ and before .mpc|3|2|Pw$|@{0|}} is used as the name of the file?
I found that this holds in all the files.

//Join.mpc|3|2|Pw$|@{0|}}
 

Thanks in advance.

From what I can see, the difference is that this path is on the C: drive; see below:

{30001002|XXparameter|!prototype_path|C:\\Program Files\\Ab Initio\\Ab Initio GDE\\Components\\Transform\\Join.mpc|3|2|Pw$|@{0|}}

Try this:

awk -F "\\" '{if($0~/Layout\|\$\[\[recor/){s=$0;}
else if($0~/E:\\\\program files\\\\Ab Initio/){split($11,a,".");x++;fn=a[1]x;{print s > fn ;s=""}}
else if($0~/C:\\\\Program Files\\\\Ab Initio/){split($13,a,".");x++;fn=a[1]x;{print s > fn ;s=""}}
else if(s){s=s"\n"$0}
else{if(fn){print > fn}}}' file

Can't we get the name like this: whatever comes after the last \\ and before .mpc|3|2|Pw$|@{0|}}?
Otherwise, if I find yet another variation in further files, the code will need to be changed again, whereas this part must be exactly the same in all cases.
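
One possible way to do this, offered as a sketch rather than a tested answer for the full file: instead of counting backslash-separated fields, cut the path out of the !prototype_path line, drop everything up to the last backslash, then drop the extension. It assumes the path never contains a | and that the name is the last path component; the file name paths.txt is a placeholder, and the two input lines are copied from the post above.

```shell
cd "$(mktemp -d)" || exit 1

# Two of the !prototype_path lines quoted above, with different path depths.
cat > paths.txt <<'EOF'
{30001002|XXparameter|!prototype_path|C:\\Program Files\\Ab Initio\\Ab Initio GDE\\Components\\Transform\\Join.mpc|3|2|Pw$|@{0|}}
{30001002|XXparameter|!prototype_path|E:\\program files\\Ab Initio 1438\\Components\\Datasets\\Input_File.mdc|3|2|Pw$|@{0|}}
EOF

names=$(awk '
/!prototype_path\|/ {
    name = $0
    sub(/.*!prototype_path\|/, "", name)  # keep what follows the parameter name
    sub(/\|.*/, "", name)                 # cut at the next pipe: only the path is left
    sub(/.*\\/, "", name)                 # drop everything up to the last backslash
    sub(/\.[^.]*$/, "", name)             # drop the extension (.mpc, .mdc, ...)
    print name
}' paths.txt)

printf '%s\n' "$names"
```

The same four sub() calls could replace the split($11,a,".") and split($13,a,".") branches in the script above, so that the drive letter and the depth of the path no longer matter.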

Thanks a lot in advance