I need to split the incoming source file into multiple files using awk.
The split key is at character positions 6 to 13 (8 characters).
All records that are greater than 20170101 and less than or equal to 20181231 should go in a split file named source filename_greaterthan_20170101_lessthan_20181231 + yyyymmddhhmmss.
All records that are less than 20170101 should go in a file named source filename_lessthan_20170101 + yyyymmddhhmmss.
All records that are greater than 20190101 should go in a file named source filename_20190101 + yyyymmddhhmmss.
Additionally, instead of hard-coding the condition in the script/command, can we pass it as a variable to the script, so the script remains dynamic?
awk '
{
y=substr($2,1,4) # Set the variable y to first 4 characters of
# the second field of the input file
f=b # set the output to the name in variable b
}
y<lt { # if the year is less than the min threshold
f=a # set the variable f to the name in variable a
}
y>gt { # if the year is more than the max threshold
f=c # set the variable f to the name in variable c
}
{
print>f # print the line to the appropriate file
}
' lt=2017 gt=2018 a=y1 b=y2 c=y3 infile # set variables lt, gt, a, b, and c and specify file name.
a) Appending or prefixing the actual file name to the output file name would be much easier than inserting it into the y1/y2/y3 strings. Try f=a FILENAME etc. If you are not happy with this, construct the f variable with a few substr() calls.
b) Feel free to adjust the selection criteria to whatever you like, but note that your idea above would not yield identical results, as $2 starts at character position 6 in your sample.
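To make both notes concrete, here is a rough sketch that compares the full 8-character date instead of the year and builds the output names from FILENAME. It is only one way to do it and rests on several assumptions of mine: the sample records are made up, the variable names lo, hi and ts are hypothetical, records equal to either threshold go to the middle file, and the third file is named with _greaterthan_ plus the upper threshold rather than the 20190101 name from the question. Adjust the comparisons and names if your boundaries differ.

```shell
#!/bin/sh
# Hypothetical sample input: the date key occupies characters 6-13 of each record.
cat > infile <<'EOF'
0001 20161231 alpha
0002 20170101 beta
0003 20180615 gamma
0004 20181231 delta
0005 20190102 epsilon
EOF

lo=20170101                 # lower threshold (assumption: equal dates go to the middle file)
hi=20181231                 # upper threshold (assumption: equal dates go to the middle file)
ts=$(date +%Y%m%d%H%M%S)    # yyyymmddhhmmss suffix for the output names

awk -v lo="$lo" -v hi="$hi" -v ts="$ts" '
{
    key = substr($0, 6, 8) + 0     # characters 6-13, forced to a number
    # default target: the "between" file
    f = FILENAME "_greaterthan_" lo "_lessthan_" hi "_" ts
}
key < lo { f = FILENAME "_lessthan_" lo "_" ts }      # before the lower threshold
key > hi { f = FILENAME "_greaterthan_" hi "_" ts }   # after the upper threshold
{ print > f }                      # write the record to the selected file
' infile
```

Because lo and hi are passed in with -v, the same awk program can be reused with other thresholds, for example by taking them from "$1" and "$2" in a wrapper script.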