awk script to extract a column, replace one of the header and replace year(from ddmmyy to yyyy)

I have a csv which has lot of columns . I was looking for an awk script which would extract a column twice. for the first occurance the header and data needs to be intact but for the second occurance i want to replace the header name since it a duplicate and extract year value which is in ddmmyy and replace it with yyyy.

my csv looks like

"abc","sdf","sdf","we","qe","fr","ty","hgf","er","jy","PERIOD_END_DATE"
"123","","234","ety","","ghj","tyu","Inh","dn","ngnh","17-FEB-19"

what i want to do is use an awk script to print the 11th column twice and extract the year from the data and print it in yyyy format as below:

"PERIOD_END_DATE" "YEAR"
"17-FEB-19" "2019"

Currently i can extract the following from my csv:

"PERIOD_END_DATE" "PERIOD_END_DATE"
"17-FEB-19" "17-FEB-19"

using the awk script as follows:

awk -F ',' '{print $11,$11"\r"}' test1.csv > test2.csv

But what i need is the second column header to be replaced to "Year" and extract the year from the columns and print it in yyyy format for the whole column, so it will look like :

 "PERIOD_END_DATE" "YEAR"
"17-FEB-19" "2019"

Can anyone help me with this, Please ask for more clarity if this is unclear.

$ awk -F, -v yr='"YEAR"' ' { $0=$0 OFS (NR == 1 ? yr : "20" substr($0,length($0)-2,3)) } 1 ' OFS=","  file
"abc","sdf","sdf","we","qe","fr","ty","hgf","er","jy","END_DATE","YEAR"
"123","","234","ety","","ghj","tyu","Inh","dn","ngnh","17-FEB-19",2019"

will this year replacement will replace any year, say the data might have 2018, 2019, 2015, 2016, so all it gotta do is extract 18, 19 and add 20 in front so the value becomes 2014,2015,2016,2017,2018, 2019. Irrespective of the year. is this possible? Also, the file is a csv so I want to pass the csv file to the awk script since the file is huge, as can be seen i had input csv file to the script
test1.csv > test2.csv

Yes this will replace any year(If year is less than 2000 then prefix 19) and you can pass csv file to awk

@anbu23 for some reason it acts funny. the command I used is:

awk -F, -v yr='"YEAR"' ' { $0=$0 OFS (NR == 1 ? yr : "20" substr($0,length($0)-2,3)) } 1 ' OFS="," testk3.csv > test5.csv

and the output is:

,"YEAR"_END_DATE" "PERIOD_END_DATE"
,209"EB-19" "17-FEB-19"
,209"EB-19" "24-FEB-19"
 ,209"EB-19" "24-FEB-19"

Note: I made 1 mistake on my part while asking the question and before making an edit i mentioned "END_DATE" instead of "PERIOD_END_DATE". I apologize for the inconvenience.

can you post sample data from testk3.csv?

These are some lines from the file:

"PERIOD_END_DATE" "PERIOD_END_DATE"
"17-FEB-19" "17-FEB-19"
"24-FEB-19" "24-FEB-19"
"24-FEB-19" "24-FEB-19"

$ cat testk3.csv
"PERIOD_END_DATE" "PERIOD_END_DATE"
"17-FEB-19" "17-FEB-19"
"24-FEB-19" "24-FEB-19"
"24-FEB-19" "24-FEB-19"
$ awk -F, -v yr='"YEAR"' ' { $0=$0 OFS (NR == 1 ? yr : "20" substr($0,length($0)-2,3)) } 1 ' OFS="," testk3.csv
"PERIOD_END_DATE" "PERIOD_END_DATE","YEAR"
"17-FEB-19" "17-FEB-19",2019"
"24-FEB-19" "24-FEB-19",2019"
"24-FEB-19" "24-FEB-19",2019"

Strange, when i run the following command:

awk -F, -v yr='"YEAR"' ' { $0=$0 OFS (NR == 1 ? yr : "20" substr($0,length($0)-2,3)) } 1 ' OFS="," testk3.csv > testk5.csv

it shows me output as before:

$ head testk3.csv
"PERIOD_END_DATE" "PERIOD_END_DATE"
"17-FEB-19" "17-FEB-19"
"24-FEB-19" "24-FEB-19"
"24-FEB-19" "24-FEB-19"
"17-FEB-19" "17-FEB-19"
"03-MAR-19" "03-MAR-19"
"10-MAR-19" "10-MAR-19"
"17-FEB-19" "17-FEB-19"
"03-MAR-19" "03-MAR-19"
"24-FEB-19" "24-FEB-19"



$ head testk5.csv
,"YEAR"_END_DATE" "PERIOD_END_DATE"
,209"EB-19" "17-FEB-19"
,209"EB-19" "24-FEB-19"
,209"EB-19" "24-FEB-19"
,209"EB-19" "17-FEB-19"
,209"AR-19" "03-MAR-19"
,209"AR-19" "10-MAR-19"
,209"EB-19" "17-FEB-19"
,209"AR-19" "03-MAR-19"
,209"EB-19" "24-FEB-19"

You seem to have DOS line terminators (<CR> = ^M = 0x0D = \r) in your file - how /where did you produce it? Remove them like

$ awk -F, -v yr='"YEAR"' ' { sub (/\r$/, ""); $0=$0 OFS (NR == 1 ? yr : "20" substr($0,length($0)-2,2)) } 1 ' OFS="," 
"17-FEB-19" "17-FEB-19",2019
"24-FEB-19" "24-FEB-19",2019
"24-FEB-19" "24-FEB-19",2019

If we go back to post #1 in this thread, you might notice that Kunalcurious used the awk script:

awk -F ',' '{print $11,$11"\r"}' test1.csv > test2.csv

so there no reason to wonder where the <carriage-return>s came from...