Hi All,
Sorry for more question today. I am having a text file . Like below
704925680_TOTAL->MANUAL->TT IOR GSB
775116444_TOTAL->POO TO->TT
-572275295_TOTAL->MANUAL->MTO
-611408278_TOTAL->PRIE LEL
456690129_TOTAL->BTT TOO
475919266_TOTAL->MANUAL->COM
-172680236_TOTAL->BTT TOO->MTO
481202389_TOTAL->MANUAL->TT COM
655296953_TOTAL->POO TO->COLE
-434977163_TOTAL->POO TO->SPI
-766208436_TOTAL->MANUAL->TT 3W GSB
-1903630442_TOTAL->COLE
-541283452_TOTAL->MANUAL->CRT
These are the hierarchical data. But not on the hierarchy order. If you see all fall on total and after total the next data come in the file in the example above it is MANUAL
. But the MANUAL
data are not group and scattered. I am trying to bring these on the group. The data to be grouped always falls after ->
. The example I have given is 3 level we may have 4 or level. The above files needs to be grouped as below
-766208436_TOTAL->MANUAL->TT 3W GSB
704925680_TOTAL->MANUAL->TT IOR GSB
-572275295_TOTAL->MANUAL->MTO
-541283452_TOTAL->MANUAL->CRT
481202389_TOTAL->MANUAL->TT COM
475919266_TOTAL->MANUAL->COM
-611408278_TOTAL->PRIE LEL
456690129_TOTAL->BTT TOO
-172680236_TOTAL->BTT TOO->MTO
775116444_TOTAL->POO TO->TT
655296953_TOTAL->POO TO->COLE
-434977163_TOTAL->POO TO->SPI
-1903630442_TOTAL->COLE
Is there any easy way to do in Perl or awk. I am thinking to do C code. Any help will be great.
RudiC
October 4, 2017, 6:30pm
2
Try
sort -t">" -k2 file
456690129_TOTAL->BTT TOO
-172680236_TOTAL->BTT TOO->MTO
-1903630442_TOTAL->COLE
475919266_TOTAL->MANUAL->COM
-541283452_TOTAL->MANUAL->CRT
-572275295_TOTAL->MANUAL->MTO
-766208436_TOTAL->MANUAL->TT 3W GSB
481202389_TOTAL->MANUAL->TT COM
704925680_TOTAL->MANUAL->TT IOR GSB
655296953_TOTAL->POO TO->COLE
-434977163_TOTAL->POO TO->SPI
775116444_TOTAL->POO TO->TT
-611408278_TOTAL->PRIE LEL
If this doesn't satisfy your needs, pls post why and how M(ANUAL) sorts before B(TT), and C(OLE) after P(OO).
1 Like
Thanks that worked. I have the ID's too with delimited by ~. Tried with cut
and sort
it is sorting only the description. I need the whole record grouped by tag ( Manual,POO TO etc
)
Also one more question I have multiple level after >
I can sort by each column and pipe it to next K2 right ?
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
Expected output
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
RudiC
October 5, 2017, 3:20am
4
What do you mean by I can sort by each column and pipe it to next K2
?[/FONT][/COLOR]
Please ignore the statement.
I was asking for below where it worked using sort -K4
TOT-COL-TOOT-SOF-KID
TOT-GEK-TOOT-SOF-MAN
Now the issue I am having is with the ID's as mentioned
RudiC
October 5, 2017, 6:32am
6
Please be aware that a sort
key by default stops at line end; to confine it to a single field, the stop position needs to be specified as well, e.g. -k4,4
. Also note that it's a lower case k
.
1 Like
Thanks. You mean to say I can make the below worked with -k4,4
Original
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
Expected output
Code:
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
Output after execution. Not sure how can traverse through the ~
delimiter and sort the whole record
sort -t">" -k4,4 ttt
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
RudiC
October 5, 2017, 7:30am
8
With a ">" separator, there's no 4th field to serve as a sort key in above data. I'm sorry, I don't get what you're after.
sorry If I am not clear. Basically my input is like below with ID's. I want to sort 4th columns with the hierarchy sort as MANUAL,POO TO
etc . I have added expected output below
Original
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
Expected Output
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
RudiC
October 5, 2017, 7:48am
10
Applying copious amounts of phantasy, (or wildly guessing, whatever you prefer) let me try to paraphrase your request, which, btw, dramatically changed during the course of the thread:
A file of records consisting of fields separated by "~" needs to be rearranged by a sub field of the 4th field which in turn is separated by "->". I use the term "rearranged" as I can't dream up an algorithm to sort character B after P after M. The algorithm with which to rearrange is not defined.
1 Like
From a row delimited by ~
take the 4th field and from that delimited by >
take the last record and group record based on the entire row . I don't need to sort I need to group basically it is ending in sort. As you mentioned I am not worried about where B comes. I need to group MANUAL
and others If any it can be of any order.
The desired output can be in a form below.
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
OR
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
179341~1704509~1704480~456690129_TOTAL->BTT TOO
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
OR
179341~1704509~1704480~456690129_TOTAL->BTT TOO
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
RudiC
October 5, 2017, 8:07am
12
"the last record" - even if I translate "record" to "field", "MANUAL" would NOT be the last.
"group record based on the entire row" - what does this mean? Is that bespoke subfield the key, or the entire row?
oops it is not last record it is second last . My bad.
When I mean entire row is the the record group needs to with ID as shown in example. I need to ID to be preserved with the grouping. I the example below the ID associated with the records needs be stick to it where the record moved based on grouping
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
RudiC
October 5, 2017, 8:17am
14
Try
sed 's/[~>]/&#/g' file | sort -t"#" -k5,5 | sed 's/#//g'
179341~1704509~1704480~456690129_TOTAL->BTT TOO
179341~1704481~1704480~475919266_TOTAL->MANUAL->COM
179341~1704489~1704480~704925680_TOTAL->MANUAL->TT IOR GSB
179341~1704505~1704480~-572275295_TOTAL->MANUAL->MTO
179341~1704497~1704480~775116444_TOTAL->POO TO->TT
179341~1704507~1704480~-611408278_TOTAL->PRIE LEL
1 Like
Works with the example. Not sure why it is not doing on the below file
~ 179341~1704503~1704480~1818623254_TOTAL ->MANUAL ->COL
~ 179341~1704504~1704480 ~1883140245_TOTAL ->MANUAL ->PRIV
~ 179341~1704505~1704480 ~1978359756_TOTAL
~ 179341~1704506~1704480 ~2058772066_TOTAL ->MANUAL ->ORAL-B PRO HEALTH GS B
~ 179341~1704507~1704480 ~2138579186_TOTAL ->PPNNH
~ 179341~1704508~1704480 ~257669199_TOTAL ->POWER TOOTHBRUSH->PPNNH
~ 179341~1704509~1704480 ~359869040_TOTAL ->RRRLOP
~ 179341~1704510~1704480 ~361375824_TOTAL ->POWER TOOTHBRUSH->AO BRANDS
~ 179341~1704511~1704480 ~372808786_TOTAL ->MANUAL ->REACH
~ 179341~1704512~1704480 ~456690129_TOTAL ->BATTERY TOOTHBRUSH
~ 179341~1704513~1704480 ~475919266_TOTAL ->MANUAL ->COL KIDS GSB
~ 179341~1704514~1704480 ~481202389_TOTAL ->MANUAL ->IRRR COMPLETE GSB
~ 179341~1704515~1704480 ~655296953_TOTAL ->POWER TOOTHBRUSH->COL
~ 179341~1704516~1704480 ~704925680_TOTAL ->MANUAL ->IRRR INDICATOR GSB
~ 179341~1704517~1704480 ~775116444_TOTAL ->POWER TOOTHBRUSH->IRRR
~ 179341~1704518~1704480 ~939829720_TOTAL ->REFILL HEADS
~ 179341~1704480~ 0 ~TOTAL CCCC
~ 179341~1704481~1704480 ~-1235133484_TOTAL ->MANUAL ->IRRR PULSAR GSB
~ 179341~1704482~1704480 ~-1308801667_TOTAL ->MANUAL ->IRRR
~ 179341~1704483~1704480 ~-1366207571_TOTAL ->POWER TOOTHBRUSH->PRIV
~ 179341~1704484~1704480 ~-1477892976_TOTAL ->POWER TOOTHBRUSH->IOPP SONICARE
~ 179341~1704485~1704480 ~-1565345926_TOTAL ->MANUAL
~ 179341~1704486~1704480 ~-1600570243_TOTAL ->MANUAL ->AO BRANDS
~ 179341~1704487~1704480 ~-172680236_TOTAL ->BATTERY TOOTHBRUSH->PPNNH
~ 179341~1704488~1704480 ~-1838093819_TOTAL ->MANUAL ->FIREFLY
~ 179341~1704489~1704480 ~-1903630442_TOTAL ->COL
~ 179341~1704490~1704480 ~-2042984783_TOTAL ->POWER TOOTHBRUSH
~ 179341~1704491~1704480 ~-2137023982_TOTAL ->MANUAL ->P&G KIDS GSB
~ 179341~1704492~1704480 ~-258140860_TOTAL ->MANUAL ->COL
~ 179341~1704493~1704480 ~-408734766_TOTAL ->REFILL HEADS->PPNNH
~ 179341~1704494~1704480 ~-434977163_TOTAL ->POWER TOOTHBRUSH->ORPRR
~ 179341~1704495~1704480 ~-541283452_TOTAL ->MANUAL ->CREST
~ 179341~1704496~1704480 ~-572275295_TOTAL ->MANUAL ->PPNNH
~ 179341~1704497~1704480 ~-611408278_TOTAL ->PRIV
~ 179341~1704498~1704480 ~-766208436_TOTAL ->MANUAL ->IRRR 3DW GSB
~ 179341~1704499~1704480 ~1321224044_TOTAL ->RECHARGEABLE TOOTHBRUSH
~ 179341~1704500~1704480 ~1452856872_TOTAL ->RECHARGEABLE TOOTHBRUSH->PPNNH
~ 179341~1704501~1704480 ~1684494481_TOTAL ->MANUAL ->COL MAX GSB
~ 179341~1704502~1704480 ~1733685778_TOTAL ->MANUAL ->COL 360 GSB
---------- Post updated at 07:06 PM ---------- Previous update was at 06:06 PM ----------
Again my mistake the ~ at start caused the issue. It worked..Thanks a million
RudiC
October 5, 2017, 10:29am
16
For the future - PLEASE
provide representative samples - AND STICK TO THEM!
provide a detailed and specific description of your problem - DON'T reference ID not explaining what that is, explain what you need (sorting / grouping / whatever) EVEN IF IT TAKES A FEW MORE LINES OF TEXT.
Don't let helpers guess what you want.
I recognize people in here are struggling with the "language barrier" - I do so myself - but that shouldn't keep them from telling necessary details from gaping info holes from blabber.