Output formatting

I have input file in this way

John   1234   BASIC       26000 
John   1234   ALLOWC      01550
John   1234   INCER       01700
John   1234   REL         20000
Debi   2345   BASIC       29000
Debi   2345   ALLOWC      01600
Debi   2345   INCR        01900
Debi   2345   REL         21000
Rob    3456   BASIC       15000
Rob    3456   REL         01500

The output require is in the fowllowing format

Name   EMP     BASIC     ALLOWC     INCER     REL
John    1234   26000     01550      01700      20000
Debi    2345   29000     01600      01900      21000
Rob     3456   15000      --         --        01500

Here some employee may have joined newly so may not get all the perq.
That thas to be left blank .
Had it been equal numbers of lines for each employee then we can use the command paste .

Thanks in advance
Vakharia M J:confused:

Try...

$ cat file1
John   1234   BASIC       26000
John   1234   ALLOWC      01550
John   1234   INCER       01700
John   1234   REL         20000
Debi   2345   BASIC       29000
Debi   2345   ALLOWC      01600
Debi   2345   INCR        01900
Debi   2345   REL         21000
Rob    3456   BASIC       15000
Rob    3456   REL         01500

$ awk ' { d[$2, $3] = $4 }
      ! ($2 in k) { k[$2] = $1 ; h[++c] = $2 }
      ! ($3 in j) { j[$3] = 1 ; i[++m] = $3 }
      ! d["", $3] { d["", $3] = $3 }
      END {
            for (x = 0; x <= c; x++) {
                  z = h[x]
                  printf "%s%s%s%s", k[z], OFS, z, OFS
                  for (y = 1; y <= m; y++) {
                        w = i[y]
                        printf "%s%s", d[z, w], (y == m ? ORS : OFS)
                  }
            }
      }
    ' OFS="\t" file1 > file2

$ cat file2
                BASIC   ALLOWC  INCER   REL     INCR
John    1234    26000   01550   01700   20000
Debi    2345    29000   01600           21000   01900
Rob     3456    15000                   01500

$
1 Like

Ygor Thanks lot for your code and it worked,one more thing I have data
for more then 700 employees and when the data of two line and then
four lines and then threee lines , i.e. if the data varies as per the length of serivce , means Seniors have data of 4 lines and new recurits may have 2 lines there it does not give the reqiuied output ,
the red shows the change made to suit my requirement but it failed,

awk ' { d[$2, $3] = $NF }
      ! ($2 in k) { k[$2] = $1 ; h[++c] = $2 }
      ! ($3 in j) { j[$3] = 1 ; i[++m] = $3 }
      ! d["", $3] { d["", $3] = $3 }
      END {
            for (x = 0; x <= c; x++) {
                  z = h[x]
                  printf "%s%s%s%s", k[z], OFS, z, OFS
                  for (y = 1; y <= m; y++) {
                        w = i[y]
                        printf "%s%s", d[z, w], (y == m ? ORS : OFS)
                  }
            }
      }
    ' OFS="\t" file1 > file2

It works well if the datas are in descending order.
Any suggestion ?? Thanks for your code .
Vakharia M J

See if this code works for what you want:

#!/usr/bin/ksh
mPrevN="First_Time"
while read mName mEmp mDesc mValue
do
  if [[ "${mName}" != "${mPrevN}" ]]; then
    if [[ "${mPrevN}" = "First_Time" ]]; then
      echo 'Name EMP BASIC ALLOWC INCER REL'
    else
      echo ${mPrevN} ${mPrevE} ${mDBasic} ${mDAllowc} ${mDIncer} ${mDRel}
    fi
    mDBasic='--'
    mDAllowc='--'
    mDIncer='--'
    mDRel='--'
  fi
  case "${mDesc}" in
    "BASIC")  mDBasic=${mValue};;
    "ALLOWC") mDAllowc=${mValue};;
    "INCER")  mDIncer=${mValue};;
    "REL")    mDRel=${mValue};;
    *)  echo '??? Error: No description matched.'
        echo 'mDesc = <'${mDesc}'>';;
  esac
  mPrevN=${mName}
  mPrevE=${mEmp}
done < Input_File
if [[ "${mPrevN}" != "First_Time" ]]; then
  echo ${mPrevN} ${mPrevE} ${mDBasic} ${mDAllowc} ${mDIncer} ${mDRel}
fi
awk '{a[$1 FS $3]=$4; b[$3];c[$1]=$2} 
END {   printf "Name\tEMP\t";for (i in b) printf i OFS;printf RS; 
        for (i in c)  
           { printf i OFS c 
             for (j in b) printf OFS (a[i FS j]?a[i FS j]:" --")
             printf RS}
     }' OFS="\t" infile

Name    EMP     ALLOWC  BASIC   INCER   INCR    REL
John    1234    01550   26000   01700    --     20000
Rob     3456     --     15000    --      --     01500
Debi    2345    01600   29000    --     01900   21000

Rdcwayx, probably best to use emp# rather than name as your hash - just in case the data has two employees with the same name.

awk '{a[$2 FS $3]=$4; b[$3];c[$2]=$1} 
END {   printf "Name\tEMP\t";for (i in b) printf i OFS;printf RS; 
        for (i in c)  
           { printf c OFS i 
             for (j in b) printf OFS (a[i FS j]?a[i FS j]:" --")
             printf RS}
     }' OFS="\t" infile
1 Like

Yes, you are right, emp# should be always unique.

Hi Chubler_XL , rdcwayx and Shell_life

Thanks lot all of you for your code , Shell_life your code is not working but got the point to ponder and idea also so ,thanks for that .
Chubler_XL and recwayx your code works fine.

Thanks to you friends for your precious help .Keep it up.

with due respect and regards

:slight_smile: Vakharia M j

Vakharia, could you please explain why you say my code is not working?

My code even validate the input data.

I had to correct your input data as based on your desired output:

Name   EMP     BASIC     ALLOWC     INCER     REL
John    1234   26000     01550      01700      20000
Debi    2345   29000     01600      01900      21000
Rob     3456   15000      --         --        01500

Your input data with incorrect value:

John   1234   BASIC       26000 
John   1234   ALLOWC      01550
John   1234   INCER       01700
John   1234   REL         20000
Debi   2345   BASIC       29000
Debi   2345   ALLOWC      01600
Debi   2345   INCR       01900
Debi   2345   REL         21000
Rob    3456   BASIC       15000
Rob    3456   REL         01500