Relocating strings using awk/sed based on an index file

Hi All,

I always appreciate all the help from this website. I would like to relocate strings based on the index number from an index file.

Index numbers appear in the first column of the index file (index.txt), and I would like to relocate each "path" based on its index number. Paths with the same index number are placed in the same row. For example, index 0 appears twice, so path_sparc_ifu_dec_in_3826 is placed on the first row and path_sparc_ifu_dec_in_4349 is placed next to it on the same row.

index.txt:

     0        path_sparc_ifu_dec_in_3826  str    DR     -         -
     0        path_sparc_ifu_dec_in_4349  stf    DR     -         -
     1        path_sparc_ifu_dec_in_2374  stf    DR     -         -
     1        path_sparc_ifu_dec_in_4011  stf    DR     -         -
     2        path_sparc_ifu_dec_in_3078  stf    DR     -         -

However, strings are written in another file (source.txt) and each "path" has four lines of strings.

source.txt:

    path_sparc_ifu_dec_in_3826
    dtu_inst_d[14]
    dec_fcl_rdsr_sel_pc_d
    0.8664
    path_sparc_ifu_dec_in_4349
    dtu_inst_d[18]
    dec_swl_rdsr_sel_thr_d
    0.795429
    path_sparc_ifu_dec_in_2374
    dtu_inst_d[13]
    dec_dcl_cctype_d[2]
    0.938914
    path_sparc_ifu_dec_in_4011
    dtu_inst_d[13]
    ifu_exu_useimm_d
    0.843643
    path_sparc_ifu_dec_in_3078
    dtu_inst_d[12]
    ifu_exu_shiftop_d[2]
    0.915818

The desired output is:

    path_sparc_ifu_dec_in_3826	    path_sparc_ifu_dec_in_4349
    dtu_inst_d[14]	    dtu_inst_d[18]
    dec_fcl_rdsr_sel_pc_d	    dec_swl_rdsr_sel_thr_d
    0.8664	0.795429
    path_sparc_ifu_dec_in_2374	    path_sparc_ifu_dec_in_4011
    dtu_inst_d[13]	    dtu_inst_d[13]
    dec_dcl_cctype_d[2]	    ifu_exu_useimm_d
    0.938914	0.843643
    path_sparc_ifu_dec_in_3078	
    dtu_inst_d[12]	
    ifu_exu_shiftop_d[2]	
    0.915818	

My idea is to (1) combine the two files first and (2) relocate the path info using the index number, but I don't know how to do this. sed/awk is probably the appropriate tool.

Any help is appreciated.

Best,

Jaeyoung

How about

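awk '
# First file (index): map each path name to its index number.
NR==FNR         {T[$2] = $1
                 MX = $1                # last index seen; assumes ascending order
                 next
                }
# Second file (source): a path name line selects the current index.
$1 in T         {IX = T[$1]
                }
# Append each line to the row slot (0-3) of the block for that index.
                {P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] $0
                }
END             {for (i=0; i<=MX; i++) for (j=0; j<4; j++) print P[i, j]
                }
' file[12]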
    path_sparc_ifu_dec_in_3826    path_sparc_ifu_dec_in_4349
    dtu_inst_d[14]    dtu_inst_d[18]
    dec_fcl_rdsr_sel_pc_d    dec_swl_rdsr_sel_thr_d
    0.8664    0.795429
    path_sparc_ifu_dec_in_2374    path_sparc_ifu_dec_in_4011
    dtu_inst_d[13]    dtu_inst_d[13]
    dec_dcl_cctype_d[2]    ifu_exu_useimm_d
    0.938914    0.843643
    path_sparc_ifu_dec_in_3078
    dtu_inst_d[12]
    ifu_exu_shiftop_d[2]
    0.915818
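
To see how the row slot is chosen, here is a small sketch of the (FNR+3)%4 expression on its own. It assumes, as in the sample, that every record in source.txt is exactly four lines long; the loop variable n is just a stand-in for FNR over eight input lines.

```shell
# Print "line:slot" pairs for eight stand-in line numbers.
slots=$(awk 'BEGIN {
    for (n = 1; n <= 8; n++)
        printf "%s%d:%d", (n > 1 ? " " : ""), n, (n + 3) % 4
    print ""
}')
printf '%s\n' "$slots"
```

Lines 1 and 5 land in slot 0, lines 2 and 6 in slot 1, and so on, which is why the path names of one index end up on the same output row.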

Thank you. I think I need to replace file[12] with index.txt and source.txt. Is that correct?

awk '
NR==FNR         {T[$2] = $1
                 MX = $1
                 next
                }
$1 in T         {IX = T[$1]
                }
                {P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] $0
                }
END             {for (i=0; i<=MX; i++) for (j=0; j<4; j++) print P[i, j]
                }
' index.txt source.txt

What happens if you do?

Nothing to show. Is this a bash/C shell issue? I will try it with C shell.

No, no - it should run on bash! And yes, file1 is index.txt and file2 is source.txt. Does "nothing to show" mean there is no output at all?

Hi RudiC,

I switched to a different server to run this code and got the desired result. One quick question: how do I insert a tab between the strings? I tried print P[i, j]"\t" and print "\t"P[i, j], but neither worked.

Thank you.

My result:

path_sparc_ifu_dec_in_3826path_sparc_ifu_dec_in_4349
dtu_inst_d[14]dtu_inst_d[18]
dec_fcl_rdsr_sel_pc_ddec_swl_rdsr_sel_thr_d
0.86640.795429
path_sparc_ifu_dec_in_2374path_sparc_ifu_dec_in_4011
dtu_inst_d[13]dtu_inst_d[13]
dec_dcl_cctype_d[2]ifu_exu_useimm_d
0.9389140.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818

"\t" is a <TAB> char:

awk 'BEGIN {print "\t"}' | hd
00000000  09 0a                                             |..|

, so something else must be going wrong there.
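
In case hd is not installed on your server, the same check can be sketched with od -c, which POSIX specifies. The tr step only strips the padding that od adds, and the variable name bytes is just for illustration.

```shell
# Dump what awk emits for "\t" plus the newline added by print.
bytes=$(awk 'BEGIN {print "\t"}' | od -An -c | tr -d ' \n')
printf '%s\n' "$bytes"
```

If awk emits a real tab, od shows it as the two characters \t, followed by \n for the newline.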

Can anybody help me fix the last step? I need a tab or a space between the strings. I tried many ways, but none worked. Thank you in advance.

My current code:

awk '
NR==FNR         {T[$2] = $1
                 MX = $1
                 next
                }
$1 in T         {IX = T[$1]
                }
                {P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] $0
                }
END             {for (i=0; i<=MX; i++) for (j=0; j<4; j++) print P[i, j]
                }
' index.txt source.txt

My current output:

path_sparc_ifu_dec_in_3826path_sparc_ifu_dec_in_4349
dtu_inst_d[14]dtu_inst_d[18]
dec_fcl_rdsr_sel_pc_ddec_swl_rdsr_sel_thr_d
0.86640.795429
path_sparc_ifu_dec_in_2374path_sparc_ifu_dec_in_4011
dtu_inst_d[13]dtu_inst_d[13]
dec_dcl_cctype_d[2]ifu_exu_useimm_d
0.9389140.843643
path_sparc_ifu_dec_in_3078
dtu_inst_d[12]
ifu_exu_shiftop_d[2]
0.915818

In post #1 in this thread, you showed us index.txt and source.txt files that had leading spaces on every line in both files. To get the output you are showing above, we have to assume that those spaces do not appear in source.txt.

If you want a tab in the output separating the text from the various lines in source.txt, change the following line in your script from:
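
A quick way to test that assumption is to count the lines that begin with whitespace. This is only a sketch: the temp file below is a hypothetical two-line stand-in for source.txt (one indented line, one not), so point grep at the real file instead.

```shell
# Hypothetical stand-in for source.txt: first line indented, second not.
tmp=$(mktemp)
printf '    path_a\ndtu_inst_d[14]\n' > "$tmp"

# Count the lines that start with a space or a tab.
indented=$(grep -c '^[[:space:]]' "$tmp")
printf '%s\n' "$indented"
rm -f "$tmp"
```

A count of 0 on the real source.txt would explain why the strings run together in the output.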

                {P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] $0

to:

                {P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] "\t" $0

I am assuming you're executing the code from post #3.
Modify the line below from

P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4] $0

to

P[IX, (FNR+3)%4] = P[IX, (FNR+3)%4]"\t"$0
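
One side effect of prepending "\t" unconditionally is that every output line also starts with a tab. If that matters, a possible variant (a sketch, not the code posted above) adds the separator only when the slot already holds text. The idx and src temp files are hypothetical miniature stand-ins for index.txt and source.txt, and the names k and sep are made up for this example.

```shell
# Hypothetical miniature stand-ins for index.txt and source.txt.
idx=$(mktemp); src=$(mktemp)
printf '0 pA str DR - -\n0 pB stf DR - -\n1 pC stf DR - -\n' > "$idx"
printf 'pA\na1\na2\n0.1\npB\nb1\nb2\n0.2\npC\nc1\nc2\n0.3\n' > "$src"

out=$(awk '
NR==FNR   { T[$2] = $1; MX = $1; next }     # index file: path to index
$1 in T   { IX = T[$1] }                    # a path line selects the index
          { k = IX SUBSEP (FNR+3)%4         # same key as P[IX, (FNR+3)%4]
            sep = (k in P) ? "\t" : ""      # tab only if the slot is in use
            P[k] = P[k] sep $0 }
END       { for (i=0; i<=MX; i++) for (j=0; j<4; j++) print P[i, j] }
' "$idx" "$src")
printf '%s\n' "$out"
rm -f "$idx" "$src"
```

With the real files, the last line of the awk call would be ' index.txt source.txt instead of the temp files.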

Thank you so much, Don.

It was resolved when I added "\t" before $0.

Best,

Jaeyoung