Removing duplicate terms in a file

Hi everybody,
I have a .txt file that contains some assembly code. To optimize it, I need to remove some replicated parts.
For example, I have:

e_li r0,-1 
e_li r25,-1  
e_lis r25,0000  
 
add r31, r31 ,r0 
       
e_li r28,-1  
e_lis r28,0000  
 
add r31, r31 ,r0 
       
e_li r28,-1  
e_lis r28,0000  
 
add r31, r31 ,r0 
       
e_li r2,-1  
e_lis r2,0000  
 
add r31, r31 ,r0 
       
e_li r9,-1  
e_lis r9,0000  
 
add r31, r31 ,r0 
       
e_li r24,-1  
e_lis r24,0000  
 
add r31, r31 ,r0 
       
e_li r21,-1  
e_lis r21,0000  
 
add r31, r31 ,r0 
       
e_li r28,-1  
e_lis r28,0000  
 
add r31, r31 ,r0 

If I could remove the replicated parts, the final code would look like this:

e_li r0,-1 
e_li r25,-1  
e_lis r25,0000  
 
add r31, r31 ,r0 
       
e_li r28,-1  
e_lis r28,0000  
 
add r31, r31 ,r0 
              
e_li r2,-1  
e_lis r2,0000  
 
add r31, r31 ,r0 
       
e_li r9,-1  
e_lis r9,0000  
 
add r31, r31 ,r0 
       
e_li r24,-1  
e_lis r24,0000  
 
add r31, r31 ,r0 
       
e_li r21,-1  
e_lis r21,0000  
 
add r31, r31 ,r0 
       

Thanks for your help

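Try:

awk '
# Trim trailing/leading spaces and append the line to the current block, ":"-separated
{sub(" *$",""); sub("^ *",""); l=l":"$0; }
# An "add" line ends a block: keep the block only the first time it is seen
/add/ {if (b[l]) {l=""; next;} else {a[c++]=l; b[l]=l;};l=""}
END {
  for (i=0; i<c; i++) {
    sub("^:", "", a[i]);     # drop the leading separator
    gsub(":", "\n", a[i]);   # turn separators back into newlines
    print a[i];
  }
}
' a.txt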

Thanks rdrtx1, it seems to work!

Alternatively (just for fun):

awk '{getline p} !A[$0,p]++{print $0 ORS p}' RS= ORS='\n\n' infile

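But this is probably not practical, since it is sensitive to extra spaces in the input file: with RS set to the empty string, awk only splits records on truly empty lines, so a separator line that contains spaces does not end a block.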
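
One possible way around the whitespace problem is to normalize the lines before they reach the paragraph-mode awk. This is only a sketch, assuming that blocks are separated by blank or whitespace-only lines and that every block is followed by its own "add" line. The function name dedup_blocks is made up for illustration, and the variables are passed with -v instead of as trailing operands because the input comes from a pipe.

```shell
# dedup_blocks: hypothetical helper name, not from the posts above.
# Reads the assembly text on stdin and writes the de-duplicated text to stdout.
dedup_blocks() {
  # Step 1 (sed): strip leading and trailing whitespace, so that
  #               whitespace-only lines become truly empty lines.
  # Step 2 (awk): RS= (empty) reads one blank-line-separated block per record;
  #               getline p fetches the following block (the "add" line), and
  #               the pair is printed only the first time it is seen.
  sed 's/^[[:space:]]*//; s/[[:space:]]*$//' |
    awk -v RS= -v ORS='\n\n' '{getline p} !A[$0,p]++{print $0 ORS p}'
}

# Small sample: the same block twice, with trailing spaces and a
# whitespace-only separator line mixed in.
printf '%s\n' \
  'e_li r28,-1  ' 'e_lis r28,0000' ' ' 'add r31, r31 ,r0 ' '   ' \
  'e_li r28,-1' 'e_lis r28,0000  ' '' 'add r31, r31 ,r0' |
  dedup_blocks
```

On a real file this would be called as dedup_blocks < a.txt > cleaned.txt. Note that the trimming also removes any indentation, and that blocks are compared as exact text after trimming, so differences in spacing inside a line (for example "r31 ,r0" versus "r31, r0") still count as different blocks.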