Help with awk

Hi,

I have a stream of data like below:

abcdef1CXYZ1999PQR
ghijkl2MNOJ2012GHI

I am extracting the data with awk and substr, but the position of one substr depends on the year. If the year is less than 2011, I have to extract the 9th character for the 3rd output column; if the year is greater than or equal to 2011, I have to extract the 11th character for the 3rd output column. Expected output:

abc|def|C|1999
ghi|jkl|J|2012

Thanks.

awk '{X=11} substr($0,12,4)<2011{X=8}
    {print substr($0,1,3),substr($0,4,3),substr($0,X,1),substr($0,12,4)}' OFS="|" file

Unfortunately, your written request is not consistent with the sample you gave (Pos 9 as opposed to Pos 8). Try this, based on your written request:

$ awk '{YR=$12$13$14$15; print $1$2$3, $4$5$6, (YR<2011?$9:$11), YR}' FS="" OFS="|" file
abc|def|X|1999
ghi|jkl|J|2012
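
A note on portability: splitting a record into one field per character with FS="" is documented by gawk as an extension, and other awk implementations may treat the whole record as a single field instead. Below is a minimal sketch of the same idea using only substr, assuming the positions implied by the sample output (8 before 2011, 11 otherwise) are the intended ones. The variable names yr and pos are chosen here for illustration.

```shell
out=$(printf '%s\n' 'abcdef1CXYZ1999PQR' 'ghijkl2MNOJ2012GHI' |
  awk -v OFS='|' '{
    yr  = substr($0, 12, 4) + 0      # +0 forces a numeric comparison
    pos = (yr < 2011) ? 8 : 11       # positions taken from the sample output
    print substr($0, 1, 3), substr($0, 4, 3), substr($0, pos, 1), yr
  }')
printf '%s\n' "$out"
```

For the two sample records this prints abc|def|C|1999 and ghi|jkl|J|2012. If the 9th character is really what is wanted, change the 8 to 9.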

Let me explain more clearly.
I have input file as:

C0000169811J4GL48K83W6854582003-11-19KJJH74200321796572863292268120001-01-01NY1NN 0001-01-01NN2012-10-15+0017.2O+0008.1N018861244178   60  NN11            001
C0001583911B7FN14X6JS7417842006-09-15 N1L61198813156468100000676860001-01-01UY1NN 0001-01-01YN2006-11-10+0000.0A+0008.2N106031946596   00  NN97            001
C0001583911C3EJ56H5SN5823592007-01-08JACP41199500680246900000676860001-01-01UY1NN 0001-01-01YN2007-03-09+0000.0A+0008.2N106031946596   00  NN97            001
C0005698011C3CDFBH5DD1580052012-10-19PFDP41201328423902466684666840001-01-01NY1NN 0001-01-01NN0001-01-01+0020.4 +0020.4N357526219604   00  NN01            001

My code is like below:

 awk 'BEGIN {OFS="\t"} {if  (substr($0,79,1) ~ "1" || substr($0,79,1) ~ "C" || substr($0,79,1) ~ "L" || substr($0,79,1) ~ "B" || substr($0,
79,1) ~ "E") print substr($0,02,09),
        substr($0,48,09),
        substr($0,28,10),
        substr($0,62,05),
        substr($0,67,10),
        substr($0,12,01),
        substr($0,93,01)}'
 

The above code works fine, but the requirement has changed for substr($0,12,01), the second-to-last field:

{if (substr($0,28,4)  > 2011)
 print substr($0,15,01)
 else
 print substr($0,12,01)};

I need to integrate this into the existing command, i.e. a print with an if condition.

---------- Post updated at 11:07 PM ---------- Previous update was at 05:25 PM ----------

Is it possible to use an if-else condition inside a print statement?

Try (untested):

$ awk 'BEGIN  {OFS="\t"}
              {YR = substr($0,28,4), X = substr($0,79,1)}
       X ~ /[1CLBE]/ {print substr($0,02,09), 
               ...,
               substr($0, YR>2011?15:12, 1),
               ...}
       '

Syntax error:
awk: cmd. line:1: {YR = substr($0,28,4), X = substr($0,79,1)}
awk: cmd. line:1: ^ syntax error

I am trying as follows:

INPUT_FILE:

C0000169811J4GL48K83W6854582003-11-19KJJH74200321796572863292268120001-01-01NY1NN 0001-01-01NN2012-10-15+0017.2O+0008.1N018861244178   60  NN11            001
C0001583911B7FN14X6JS7417842006-09-15 N1L61198813156468100000676860001-01-01UY1NN 0001-01-01YN2006-11-10+0000.0A+0008.2N106031946596   00  NN97            001
C0001583911C3EJ56H5SN5823592007-01-08JACP41199500680246900000676860001-01-01UY1NN 0001-01-01YN2007-03-09+0000.0A+0008.2N106031946596   00  NN97            001
C0005698011C3CDFBH5DD1580052012-10-19PFDP41201328423902466684666840001-01-01NY1NN 0001-01-01NN0001-01-01+0020.4 +0020.4N357526219604   00  NN01            001

AWK:

cat INPUT_FILE | \
awk 'BEGIN {OFS="\t"} {if  (substr($0,79,1) ~ "1" || substr($0,79,1) ~ "C" || substr($0,79,1) ~ "L" || substr($0,79,1) ~ "B" || substr($0,
79,1) ~ "E") print substr($0,02,09),
        substr($0,48,09),
        substr($0,28,10),
        substr($0,62,05),
        substr($0,67,10),
 
{ if (substr($0,28,4)  > 2011)
 print substr($0,15,01)
 else
 print substr($0,12,01) };
 
        substr($0,93,01)}'

How can I implement the if inside print in awk?

Replace , with ; here:

           {YR = substr($0,28,4), X = substr($0,79,1)}
                                ^--- ";" 
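
As for the attempt with if/else placed between the print arguments: print takes a list of expressions, and if/else is a statement, so it cannot appear there. Besides the ?: operator, another way is to compute the value into a variable first and then print it. The following is only a sketch, assembled from the field list in the original command and fed two of the sample records. The names x, yr and c are illustrative. The leading zeros (02, 09, ...) are dropped because gawk can read a leading-zero constant in program text as octal.

```shell
out=$(printf '%s\n' \
  'C0000169811J4GL48K83W6854582003-11-19KJJH74200321796572863292268120001-01-01NY1NN 0001-01-01NN2012-10-15+0017.2O+0008.1N018861244178   60  NN11            001' \
  'C0005698011C3CDFBH5DD1580052012-10-19PFDP41201328423902466684666840001-01-01NY1NN 0001-01-01NN0001-01-01+0020.4 +0020.4N357526219604   00  NN01            001' |
  awk 'BEGIN { OFS = "\t" }
  {
    x  = substr($0, 79, 1)           # selector character
    yr = substr($0, 28, 4) + 0       # +0 forces a numeric comparison
  }
  x ~ /[1CLBE]/ {
    # decide which character to print before calling print
    if (yr > 2011) c = substr($0, 15, 1)
    else           c = substr($0, 12, 1)
    print substr($0, 2, 9), substr($0, 48, 9), substr($0, 28, 10),
          substr($0, 62, 5), substr($0, 67, 10), c, substr($0, 93, 1)
  }')
printf '%s\n' "$out"
```

With these two records the sixth column comes out as J for the 2003 record (position 12) and D for the 2012 record (position 15).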

Thanks, it's working, but I need a small modification.

awk 'BEGIN {OFS="\t"} {if  (substr($0,79,1) ~ "1" || substr($0,79,1) ~ "C" || substr($0,79,1) ~ "L" || substr($0,79,1) ~ "B" || substr($0,
79,1) ~ "E") print substr($0,02,09),.....
 
This was replaced by:
X = substr($0,79,1)}
     X ~ /[1CLBE]/ {print substr($0,02,09)

How can I add the additional condition substr($0,93,1) ~ "N" to the same, as in:

 awk 'BEGIN {OFS="\t"} {if (substr($0,93,1) ~ "N" && (substr($0,79,1) ~ "1" || substr($0,79,1) ~ "C" || substr($0,79,1) ~ "L" || 
substr($0,79,1) ~ "B" || substr($0,79,1) ~ "E")) print substr($0,02,09),.....

Thanks.

Try

$ awk 'BEGIN  {OFS = "\t"}
              {YR = substr ($0, 28, 4)
               X  = substr ($0, 79, 1)
               Z  = substr ($0, 93, 1)
              }
       X ~ /[1CLBE]/ && Z == "N"  {
               print substr ($0, 02, 09),
               ...,
               substr ($0, YR>2011?15:12, 1),
               ...
              }
      '
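
To see the combined pattern at work, here is a sketch with the elided fields filled in from the original command's field list (an assumption about what the "..." stands for), run against three of the sample records. The second record has Y at position 93, so it should be filtered out.

```shell
out=$(printf '%s\n' \
  'C0000169811J4GL48K83W6854582003-11-19KJJH74200321796572863292268120001-01-01NY1NN 0001-01-01NN2012-10-15+0017.2O+0008.1N018861244178   60  NN11            001' \
  'C0001583911B7FN14X6JS7417842006-09-15 N1L61198813156468100000676860001-01-01UY1NN 0001-01-01YN2006-11-10+0000.0A+0008.2N106031946596   00  NN97            001' \
  'C0005698011C3CDFBH5DD1580052012-10-19PFDP41201328423902466684666840001-01-01NY1NN 0001-01-01NN0001-01-01+0020.4 +0020.4N357526219604   00  NN01            001' |
  awk 'BEGIN { OFS = "\t" }
  {
    yr = substr($0, 28, 4) + 0       # +0 forces a numeric comparison
    x  = substr($0, 79, 1)
    z  = substr($0, 93, 1)
  }
  x ~ /[1CLBE]/ && z == "N" {
    print substr($0, 2, 9), substr($0, 48, 9), substr($0, 28, 10),
          substr($0, 62, 5), substr($0, 67, 10),
          substr($0, (yr > 2011 ? 15 : 12), 1),   # position depends on the year
          z
  }')
printf '%s\n' "$out"
```

Only the first and third records are printed. The parentheses around the ?: expression are not strictly required inside the substr call, but they keep the > from being mistaken for output redirection if the expression is ever moved directly into the print list.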