Problems with delimiters

Hello,

I have data in a file that looks something like this:

UNB+UNOA:1+006415160:1+AR0000012360:ZZ+080701:0552+2++DELFOR++++T'UNH+2+DELFOR:D:97A:UN

Here, the delimiters used are +, : and '. I have a set of such files, and these delimiters vary from one file to another.

I am developing a shell script that needs to take certain values from fields in this file, depending on their positions. I plan to do this with the cut command.

My problem is how to find these delimiters when they vary from file to file. That is, in every file, how can I find out what the delimiter is after UNB (here +), after UNOA (here :) or before UNH (here ')? I will store them in variables and then search for my required fields.

Please help me. It is very urgent.

Is there a reason for so many different delimiters? Why not change the + to |, then the : to |, then the ' to |? Then your file would be consistent.

> echo $inp
UNB+UNOA:1+006415160:1+AR0000012360:ZZ+080701:0552+2++DELFOR++++T'UNH+2+DELFOR:97A:UN
> echo $inp | tr "+" "|" | tr ":" "|" | tr "'" "|"
UNB|UNOA|1|006415160|1|AR0000012360|ZZ|080701|0552|2||DELFOR||||T|UNH|2|DELFOR|97A|UN

Exactly. Create a "filter". Make all delimiters consistent (1 delimiter) then use that output to do your cuts, etc.

InputWithManyDelimiters | filterScript | cut -fn -d "~" where "~" is whatever delimiter you choose as the canonical one. "~" (tilde) is a good choice, at least for English text, because the character is not commonly used in data.
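A minimal sketch of such a filter, under some assumptions that are not stated in the thread: every file starts directly with a UNB segment, the syntax identifier is four characters long (as in UNOA), the next segment is UNH with no line break before it, and "~" never occurs in the data. The function name normalize_delims is made up for illustration. It reads the delimiters from those fixed positions and then rewrites all three as "~".

```shell
#!/bin/sh
# Hypothetical filter: read one interchange from stdin, detect its three
# delimiters from positions in the header, and rewrite them all as "~".
normalize_delims() {
    data=$(cat)

    # Character 4 follows "UNB": the element separator ("+" in the sample).
    elem=$(printf '%s\n' "$data" | head -n 1 | cut -c4)

    # Character 9 follows the 4-letter syntax identifier such as "UNOA":
    # the component separator (":" in the sample).
    comp=$(printf '%s\n' "$data" | head -n 1 | cut -c9)

    # The character just before the first "UNH" is the segment terminator
    # ("'" in the sample). Assumes no newline between the two segments.
    before=${data%%UNH*}
    term=$(printf '%s' "$before" | tail -c 1)

    # Map all three detected delimiters to "~". Delimiters that are special
    # to tr (such as "\" or "-") would need extra escaping.
    printf '%s\n' "$data" | tr "$elem$comp$term" '~~~'
}
```

It could then be used as normalize_delims < somefile | cut -d "~" -f 4, where somefile is a placeholder for one of the input files. The detected values are also left in the variables elem, comp and term if the script needs them separately.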

.... or specify multiple delimiters/field separators with awk.
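As a rough illustration of the awk approach: -F accepts a regular expression, so a bracket expression can list several delimiters at once. The sketch below assumes the three delimiters from the sample line and reuses the variable name inp from the earlier post; the variable sender is made up for this example.

```shell
inp="UNB+UNOA:1+006415160:1+AR0000012360:ZZ+080701:0552+2++DELFOR++++T'UNH+2+DELFOR:D:97A:UN"

# Split on any of + : ' and pick the 4th field, with no intermediate tr step.
sender=$(printf '%s\n' "$inp" | awk -F"[+:']" '{ print $4 }')
echo "$sender"
```

With the sample line this prints 006415160. If the delimiters differ per file, the bracket expression would have to be built from the detected characters rather than hard-coded.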

Useless Use Of tr

> echo $inp | tr ":+'" "|||"