Grep values from column 2 with reference to column 1

Gents

Is it possible to update the code below to produce the desired output files from the input list? I treat the first column as the variable.

I need to use the first column as the key and collect the values in the second column, as shown in the desired outputs.

input list
(attached)

output1

111111              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
                    116,118,124-125,120,122.                                   
232323              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
                    116,118,124-125,120,122-123,126,132.                       

output2

111111              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
111111              116,118,124-125,120,122.                                   
232323              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
232323              116,118,124-125,120,122-123,126,132.                       

Code tried

tp=1

awk '
    function printrange() { print start (start == last ? "" : "-" last) }
    NR == 1 {start=last=$1; next} 
    $1 == last+1 {last=$1; next} 
    {printrange(); start=last=$1}
    END {printrange()}
' file | paste -sd" " | fold -sw 60 | tr ' ' ',' | sed "s/^/$tp                   /"

Thanks in advance.

I think the awk code in your solution attempt should work on $2 not $1.
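A minimal sketch of that change, assuming whitespace-separated input with the key in column 1 and the values in column 2. The function name collapse_col2 and the sample data are made up for illustration; only the field number differs from the original attempt.

```shell
# Hypothetical helper: collapse consecutive values of column 2 into ranges.
collapse_col2() {
  awk '
    function printrange() { print start (start == last ? "" : "-" last) }
    NR == 1 { start = last = $2; next }
    $2 == last + 1 { last = $2; next }
    { printrange(); start = last = $2 }
    END { if (NR) printrange() }
  '
}

# Made-up sample: one key, six values, one range per output line.
printf '111111 %s\n' 21 22 23 87 85 86 | collapse_col2
```

Each range still comes out on its own line, so the paste, fold, and tr steps of the original pipeline would apply unchanged.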


Hi MadeInGermany.
Thanks a lot for answering.
Yes, the code works for the second output, but only with a single variable, $tp. The goal is to use column 1 as the variable, since it can hold many different numbers.
I appreciate your support.

If that's not the output you want, what is the output you want?


Hi Corona688.

I would like to get the two outputs I posted in my request. My code produces the second output, but it uses a single variable that I enter manually. I would like the value to be taken automatically from the first column.

Please help me.

output 1

111111              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
                    116,118,124-125,120,122.                                   
232323              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
                    116,118,124-125,120,122-123,126,132.

output 2

111111              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
111111              116,118,124-125,120,122.                                   
232323              21-84,87,85-86,88-106,108,111,109,112,110,113,115,114,117, 
232323              116,118,124-125,120,122-123,126,132.

And how would that look different from what you have?

You only have one input file anyway. What is grepping what?

I have no idea how to add the automatic variable to the code.

And I have no idea what the automatic variable is supposed to do.

Wouldn't grepping the entire file against itself just print the entire file?

Building on your awk code, I get this all-in-one:

awk '
  function range_to_out() {
    out=(out sep (start == last ? start : (start "-" last)))
  }
  function print_out() {
    printf "%s  %s\n", p1, out
  }
  NR == 1 { start=last=$2; p1=$1; next } 
  {
    if ($2 == last+1) { last=$2 } else {
      range_to_out(); sep=","; start=last=$2
    }
  }
  $1 != p1 || length(out) > 50 { print_out(); sep=out=""; p1=$1 }
  END { range_to_out(); print_out() }
' file

MadeInGermany, many thanks for your help.

MadeInGermany,

I am trying to figure out how to get the same output as my code produces.

The values from column 2 of the input file need to occupy columns 21 to 80 of the output file, filling that whole range.

This is what I got using your code:

X52152              21-34,36-82,84,83,85-107,109-191,193,192,194-197,206,                         
X52152              198-199,207,200,202,209,203,211,204-205,213,208,210,212,                            
X52152              214-216,218-221,233,222,245,223,246,249,251,248,224-232,                            
X52152              234-243,247,250,253-330,332,331,333-336,338,337,339-462,                            
X52152              467,463-466,468-480,482-514,516,515,517-529,531-700,702,                            
X52152              701,703-752,754-764,766-801,803,802,804-1016,1019,1017-1018,                        
X52152              1020-1035,1037-1104,1107,1105,1108,1106,1109-1169,1172-1176,                        
X52152              1178-1209,1221,1210-1220,1222-1242,1244-1264,1266-1281,                             
X52152              1283,1282,1284-1312,1314,1313,1315-1319,1321-1366,1368-1403,                        
X52152              1405-1465,1467-1498,1500-1502,1504-1527,1529-1598,1600-1796,                        
        

This is what I got using my code; I want to get the same output with your code:

X52152              21-34,36-82,84,83,85-107,109-191,193,192,194-197,206,                                                   
X52152              198-199,207,200,202,209,203,211,204-205,213,208,210,212,                                                
X52152              214-216,218-221,233,222,245,223,246,249,251,248,224-232,                                                
X52152              234-243,247,250,253-330,332,331,333-336,338,337,339-462,467,                                            
X52152              463-466,468-480,482-514,516,515,517-529,531-700,702,701,                                                
X52152              703-752,754-764,766-801,803,802,804-1016,1019,1017-1018,                                                
X52152              1020-1035,1037-1104,1107,1105,1108,1106,1109-1169,1172-1176,                                            
X52152              1178-1209,1221,1210-1220,1222-1242,1244-1264,1266-1281,1283,                                            
X52152              1282,1284-1312,1314,1313,1315-1319,1321-1366,1368-1403,                                                 
X52152              1405-1465,1467-1498,1500-1502,1504-1527,1529-1598,1600-1796.                                            

I changed this line in your code

 $1 != p1 || length(out) > 51 { print_out(); sep=out=""; p1=$1 }

to get closer to my output file, but I could not.

I have attached a file named file.zp, which contains the input data.

I appreciate your help.
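The layout requirement above (values occupying columns 21 to 80, which leaves 60 characters per line after a 20-column key field) can be sketched with awk's printf and a left-justified width. This is only an illustration: pad_key and the sample line are made up, and it assumes the range list is already in the second field.

```shell
# Hypothetical helper: pad the key to 20 columns so the list starts at column 21.
pad_key() {
  awk '{ printf "%-20s%s\n", $1, $2 }'
}

# Made-up sample line: key, then an already collapsed range list.
printf '%s\n' 'X52152 21-34,36-82,84' | pad_key
```

Using a width in the format string avoids counting literal spaces by hand when keys differ in length.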

The following seems to do it

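awk '
  # Append the pending range (start..last) to the current output line,
  # printing the line first if the addition would reach 60 characters.
  function range_to_out() {
    out=(out sep)
    addout=(start == last ? start : (start "-" last))
    if (length(out addout) >= 60) {
      print_out()
    }
    out=(out addout)
    sep=","
  }
  # Print the key and the collected ranges, then reset the line buffer.
  function print_out() {
    printf "%s  %s\n", p1, out
    sep=out=""
  }
  NR == 1 { start=last=$2; p1=$1; next } 
  {
    # Extend the current range, or close it and start a new one.
    if ($2 == last+1) { last=$2 } else {
      range_to_out(); start=last=$2
    }
  }
  # Key changed: print what was collected for the previous key.
  $1 != p1 { print_out(); p1=$1 }
  END { range_to_out(); print_out() }
' file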
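For readers who want both requested layouts from one script, here is a hedged variation on the same idea, not the code posted above. It assumes the key field is 20 columns wide, that each line may hold at most 60 characters of values including the trailing comma, and that each key's list ends with a period, all inferred from the sample outputs in the question. The function name format_ranges and its mode argument are made up.

```shell
# Hypothetical wrapper: mode 1 blanks the key on continuation lines (output 1),
# mode 2 repeats the key on every line (output 2).
format_ranges() {
  awk -v mode="${1:-2}" '
    # Append the pending range to the line buffer, wrapping at 60 columns.
    function add_range(   r) {
      r = (start == last ? start : start "-" last)
      if (out != "" && length(out "," r ",") > 60) emit(",")
      out = (out == "" ? r : out "," r)
    }
    # Print one line: key (or blanks) padded to 20 columns, the list, the tail.
    function emit(tail) {
      printf "%-20s%s%s\n", (mode == 1 && !first ? "" : key), out, tail
      out = ""; first = 0
    }
    NR == 1 { key = $1; start = last = $2; first = 1; next }
    $1 == key && $2 == last + 1 { last = $2; next }
    {
      add_range()
      if ($1 != key) { emit("."); key = $1; first = 1 }
      start = last = $2
    }
    END { if (NR) { add_range(); emit(".") } }
  '
}

# Made-up sample with two keys, formatted in the output 1 style.
printf '%s\n' '111111 21' '111111 22' '111111 87' '232323 21' | format_ranges 1
```

Calling format_ranges 2 instead repeats the key on every wrapped line. A range is never merged across two keys here, because the range is only extended while the key stays the same.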

Hi MadeInGermany.
I appreciate your support. I will try the code and let you know. Many thanks.