Create Carriage Return using SED

I am hoping someone can help me solve this problem using sed. My issue is actually two-fold. First, we receive an order file from another system, but a change in our processes now requires a carriage return at the end of each order. The file sent to us is a .CSV in which the orders run into each other, as shown below:

0000235298~0080387170~~0000013338~COMPANY~CONTACT~ADDRESS~CITY~STATE~ZIP~US~INFO1~EMAIL~~INFO2~INFO3~PS0000235700~0080387811~~0000402296~ COMPANY ~~ ADDRESS ~ CITY ~STATE~ZIP~US~~ EMAIL ~~INFO2~INFO3~FS

The first order ends at PS and the next order begins directly after. The current software is capable of reading this properly. The problem begins with an additional software package that requires there to be a Carriage Return after PS.

My second dilemma is that there are two new fields being added to each order as shown below:

0000235298~0080387170~~0000013338~COMPANY~CONTACT~ADDRESS~CITY~STATE~ZIP~US~INFO1~EMAIL~~INFO2~INFO3~PS~Y~30000235700~0080387811~~0000402296~ COMPANY ~~ ADDRESS ~ CITY ~STATE~ZIP~US~~ EMAIL ~~INFO2~INFO3~FS~Y~3

Now there is a 3 at the end of the order. The final output of the information above must look like the format below:

0000235298~0080387170~~0000013338~COMPANY~CONTACT~ADDRESS~CITY~STATE~ZIP~US~INFO1~EMAIL~~INFO2~INFO3~PS~Y~3
0000235700~0080387811~~0000402296~ COMPANY ~~ ADDRESS ~ CITY ~STATE~ZIP~US~~ EMAIL ~~INFO2~INFO3~FS ~Y~

Also keep in mind there may or may not be a Y or even a 3 at the end, so the data may look like this: PS~~. However, the system will know what to do as long as there is a carriage return at the end. The only tip I can think of regarding patterns is that there will always be 18 ~ delimiters, if that helps with finding a pattern.

I hope I have not confused this and that someone might be able to help me out with this. Thank you for any help you might have as well as your time.

---------- Post updated at 02:44 PM ---------- Previous update was at 12:14 PM ----------

In addition to the pattern I already mentioned, the rule is basically this: if a 3 follows the 18th ~ delimiter, add a carriage return after the 3; otherwise, add a carriage return right after the 18th ~ delimiter.

awk solution. not sed.

 
awk -f a.awk infile

where a.awk:

 
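{
 FS="~";
 for (i=1; i<=NF; i++) {
  # A field longer than 10 characters that ends in 10 digits holds the end
  # of one order fused to the 10-digit first field of the next order.
  if (length($(i))>10 && $i ~ /.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/) {
    w1=$(i);
    sub(/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/, "", w1);
    print w1;                        # finish the current order with a newline
    w2=substr($(i), 1+length(w1));   # the 10 digits that start the next order
    printf "%s~", w2;                # "%s" keeps any "%" in the data literal
  } else {
    if (i==NF) {
       print $(i);
    } else {
       printf "%s~", $(i);
    }
  }
 }
}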
 

This works as long as the first field of each order is 10 digits.

This is nearly exactly what we need, rdrtx1, thank you. However, for some reason it adds delimiters where they shouldn't be, but only in the first line. All other lines are perfect.

Original first line:

0000255844~0080417885~~0000031501~FIRST LAST~DBA ANY COMPANY~142 W 1050 N~CITY~STATE~ZIP~US~INFO1~EMAIL~INFO2~INFO3~INFO4~CS~Y~3

After script:

0000255844~0080417885~~0000031501~FIRST~LAST~DBA~ANY~COMPANY~142~W~1050~N~CITY~STATE~ZIP~US~INFO1~EMAIL~INFO2~INFO3~INFO4~CS~Y~3

Using 1 line in input file:

0000255844~0080417885~~0000031501~FIRST LAST~DBA ANY COMPANY~142 W 1050 N~CITY~STATE~ZIP~US~INFO1~EMAIL~INFO2~INFO3~INFO4~CS~Y~3

Output:

 
0000255844~0080417885~~0000031501~FIRST LAST~DBA ANY COMPANY~142 W 1050 N~CITY~STATE~ZIP~US~INFO1~EMAIL~INFO2~INFO3~INFO4~CS~Y~3
 

I do not get spaces replaced by "~".

Maybe use a sledgehammer to escape the spaces:

New awk:

 
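{
 FS="~";
 # Hide spaces behind "!" (this assumes the data contains no literal "!").
 # Changing $0 also makes awk re-split the record with the FS set above.
 gsub(" ","!");
 for (i=1; i<=NF; i++) {
  if (length($(i))>10 && $i ~ /.[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/) {
    w1=$(i);
    sub(/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/, "", w1);
    gsub("!"," ", w1);               # restore the spaces before printing
    print w1;
    w2=substr($(i), 1+length(w1));
    gsub("!"," ", w2);
    printf "%s~", w2;                # "%s" keeps any "%" in the data literal
  } else {
    gsub("!"," ", $(i));
    if (i==NF) {
       print $(i);
    } else {
       printf "%s~", $(i);
    }
  }
 }
}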
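
One possible explanation for the first-line difference, offered as an assumption rather than something confirmed in this thread: FS is assigned inside the main action, and POSIX awk applies a new FS starting with the next record, so the first record may already have been split on whitespace. Awk implementations have differed on this, which could be why it did not reproduce. A minimal sketch under that assumption sets FS in a BEGIN block, which makes the "!" substitution unnecessary. The function name split_orders and the variable names head and next_id are hypothetical.

```shell
#!/bin/sh
# split_orders: hypothetical helper name, not from the thread.
# Reads "~"-delimited order data on stdin and starts a new line wherever a
# field ends in the 10-digit number that begins the next order.
split_orders() {
  awk '
    BEGIN { FS = "~" }   # set before the first record is read and split
    {
      for (i = 1; i <= NF; i++) {
        if (length($i) > 10 && $i ~ /[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]$/) {
          head = substr($i, 1, length($i) - 10)   # end of the current order
          next_id = substr($i, length($i) - 9)    # first field of the next order
          printf "%s\n%s~", head, next_id
        } else if (i == NF) {
          print $i
        } else {
          printf "%s~", $i
        }
      }
    }
  '
}
```

Usage would be something like `split_orders < infile > outfile`. Like the scripts above, this only splits where at least one character precedes the 10 digits, so an order that ends in an empty field (PS~~) is not split.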


I don't know if I understand the problem exactly, but maybe this sed solves it:

sed -E 's/([^~]*~){18}3?/&\
/g'
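
To try the counting approach on a small sample, here is a sketch that wraps the same substitution with the g flag, so that every order on the line is split and not only the first. It assumes a sed that supports -E (GNU and BSD sed do); the function name break_orders is hypothetical.

```shell
#!/bin/sh
# break_orders: hypothetical helper name, not from the thread.
# Inserts a newline after every 18th "~", or after a "3" that directly
# follows the 18th "~". Reads stdin, writes stdout.
break_orders() {
  sed -E 's/([^~]*~){18}3?/&\
/g'
}

# Sample: 16 one-letter fields, then PS, Y and 3, fused to the next order.
printf '%s\n' 'a~b~c~d~e~f~g~h~i~j~k~l~m~n~o~p~PS~Y~30000000002~x' | break_orders
```

Because it counts delimiters instead of looking for 10 digits, this also splits an order whose last fields are empty (PS~~). If the final order on the line carries all 18 delimiters as well, the output ends with one empty line. The rule cannot tell a trailing 3 apart from a next order that happens to start with 3.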

Success!! That worked great, rdrtx1. Thank you again. 244an, I have not tried your solution yet, but thank you for responding. I will post again when I try it.