Date wise calculations?

nikhil_jain · April 15, 2016, 8:46am

POST_DATE	CHECK_NUMBER	TYPE	LOGIN_NAME
2015.09.09	XXXXXXXXXX	mark XXXXXXXXXX
2015.09.09	XXXXXXXXXX	fsadf	XXXXXXXXXX
2015.10.05	XXXXXXXXXX	defaa	XXXXXXXXXX
2015.10.05	XXXXXXXXXX	dewe	XXXXXXXXXX
2015.10.06	XXXXXXXXXX	dqwe	XXXXXXXXXX
2015.09.14	XXXXXXXXXX	dt4e	XXXXXXXXXX
2015.10.26	XXXXXXXXXX	dpwp	XXXXXXXXXX
2015.10.19	XXXXXXXXXX	dfsef	XXXXXXXXXX
2015.11.18	XXXXXXXXXX	deew	XXXXXXXXXX

[/CODE]

I want to find the dates for any 2months, for example if i ask for sept and October it should list me as below,

POST_DATE	CHECK_NUMBER	TYPE	LOGIN_NAME
2015.09.09	XXXXXXXXXX	mark XXXXXXXXXX
2015.09.09	XXXXXXXXXX	fsadf	XXXXXXXXXX
2015.10.05	XXXXXXXXXX	defaa	XXXXXXXXXX
2015.10.05	XXXXXXXXXX	dewe	XXXXXXXXXX
2015.10.06	XXXXXXXXXX	dqwe	XXXXXXXXXX
2015.09.14	XXXXXXXXXX	dt4e	XXXXXXXXXX
2015.10.26	XXXXXXXXXX	dpwp	XXXXXXXXXX
2015.10.19	XXXXXXXXXX	dfsef	XXXXXXXXXX

[/CODE]

my parameters would be explicit.

vbe · April 15, 2016, 9:10am

Yes... And what have you tried so far?

nikhil_jain · April 15, 2016, 9:25am

awk -F "|" '{print $1}' file name | head  | sort -u  | egrep "2015.09.??|2015.10.??"

[/CODE]

The above is giving me only the date part, but for the content i tried using other methodologies but no luck hence turned up to this forum

One more thing, those months i'll be passing as parameter.

RudiC · April 15, 2016, 9:29am

Why all that piping? Did you know that (e)grep by default prints the entire line if either pattern is matched?

nikhil_jain · April 15, 2016, 9:32am

actually few of the pipes is for head, my file is vast with million records, i just gave the dummy content.. also it has some 40 columns with many having values as date.

RavinderSingh13 · April 15, 2016, 9:35am

Hello Nikhil,

As a simple and starting point you could try with following code and let me know if this helps you, if you have more requirements then please do show us expected output with full requirement details on same.

awk 'NR==1{print;next}{split($1, A,".");if(A[2]==09 || A[2]==10){print}}'  Input_file

Output will be as follows.

POST_DATE       CHECK_NUMBER    TYPE    LOGIN_NAME
2015.09.09      XXXXXXXXXX      mark XXXXXXXXXX
2015.09.09      XXXXXXXXXX      fsadf   XXXXXXXXXX
2015.10.05      XXXXXXXXXX      defaa   XXXXXXXXXX
2015.10.05      XXXXXXXXXX      dewe    XXXXXXXXXX
2015.10.06      XXXXXXXXXX      dqwe    XXXXXXXXXX
2015.09.14      XXXXXXXXXX      dt4e    XXXXXXXXXX
2015.10.26      XXXXXXXXXX      dpwp    XXXXXXXXXX
2015.10.19      XXXXXXXXXX      dfsef   XXXXXXXXXX

Thanks,
R. Singh

Don_Cragun · April 15, 2016, 6:42pm

You might also want to try something like:

#!/bin/ksh
IAm=${0##*/}
if [ $# -ne 3 ]
then	printf 'Usage: %s year.month1 year.month2 filename\n' "$IAm" >&2
	exit 1
fi
awk -F'.' -v ym1="$1" -v ym2="$2" 'NR==1 || $1 FS $2==ym1 || $1 FS $2==ym2' "$3"

(I usually use the Korn shell, but you can use any shell that recognizes POSIX parameter expansions (e.g., ash , bash , dash , and many others in addition to ksh .)
If the sample data shown in post #1 in this thread is stored in a file named data and the above script is in an executable file named tester , then the command:

./tester 2015.09 2015.10 data

produces the output you requested and the command:

./tester 2015.09 2015.11 data

produces the output:

POST_DATE	CHECK_NUMBER	TYPE	LOGIN_NAME
2015.09.09	XXXXXXXXXX	mark XXXXXXXXXX
2015.09.09	XXXXXXXXXX	fsadf	XXXXXXXXXX
2015.09.14	XXXXXXXXXX	dt4e	XXXXXXXXXX
2015.11.18	XXXXXXXXXX	drew	XXXXXXXXXX

If you want to try this on a Solaris/SunOS system, change awk in the script to /usr/xpg4/bin/awk or nawk .

RudiC · April 16, 2016, 4:45am

For exactly your sample and request, this might do:

egrep "2015.(09|10)|POST" file
POST_DATE    CHECK_NUMBER    TYPE    LOGIN_NAME
2015.09.09    XXXXXXXXXX    mark XXXXXXXXXX
2015.09.09    XXXXXXXXXX    fsadf    XXXXXXXXXX
2015.10.05    XXXXXXXXXX    defaa    XXXXXXXXXX
2015.10.05    XXXXXXXXXX    dewe    XXXXXXXXXX
2015.10.06    XXXXXXXXXX    dqwe    XXXXXXXXXX
2015.09.14    XXXXXXXXXX    dt4e    XXXXXXXXXX
2015.10.26    XXXXXXXXXX    dpwp    XXXXXXXXXX
2015.10.19    XXXXXXXXXX    dfsef    XXXXXXXXXX

nikhil_jain · April 18, 2016, 2:05am

hey don,

thanks for the solution, how does it interprets the POST_DATE column, can u plz guide me?

Hey Ravinder

Thanks for the help, but it is not working as expected.

---------- Post updated at 11:35 AM ---------- Previous update was at 11:15 AM ----------

Rudi,

It is not working, It is even filtering the 11 and 12 month rows as well.

For 09 & 10 it works as expected, but as i change the parameter to 11 & 12 it does not

egrep -w "2015.(11|12)|POST" EXTRACT.txt | awk -F "|" '{print $12}' | head
2015.09.09
2015.11.02
2015.10.05
2015.10.05
2015.10.06
2015.09.14
2015.10.26
2015.10.19
2015.11.18
2015.09.15

[/CODE]

rudi,

On further analysis i found out that it is actually grepping some 03 available in that particular line, I wanted only those available in POST_DATE Column.
the month wise calculations, i'll be doing it there.

RudiC · April 18, 2016, 2:31am

Some creativity with the regex might yield the correct result for column 1:

egrep "^2015.(11|12)|POST"  file

nikhil_jain · April 18, 2016, 2:46am

my POST_DATE is not the column 1, It is column 12, Even this is not helping.

egrep "^2015.(03|04)|POST" EXTRACT.txt | head | awk -F "|" '{print $12}'
POST_DATE
POST_DATE
2015.12.02
 
2015.12.10
2015.12.28
2015.12.03
2015.12.21
2015.12.09

[/CODE]

RudiC · April 18, 2016, 2:49am

Where did you specify that we're dealing with field 12?

nikhil_jain · April 18, 2016, 2:54am

sorry my bad I forgot to.

RudiC · April 18, 2016, 2:56am

PLEASE consider giving a clear, crisp, and detailed specification in the very first post!
Try

egrep "^([^ ]+ +){11}2015.(11|12)|POST"  file

nikhil_jain · April 18, 2016, 3:01am

I'm really sorry for that rudi, but still no luck

egrep "([^ ]+ +){11}2015.(03|04)|POST"  EXTRACT.txt | awk -F "|" {'print $12'}
POST_DATE
POST_DATE
2015.12.02
 
2015.12.10
2015.12.28
2015.12.03
2015.12.21
2015.12.09
2015.12.09
2015.11.25
2015.12.01
2015.12.24
 
2015.12.09

[/CODE]

RudiC · April 18, 2016, 3:08am

Without a decent real life sample I can't help further.

nikhil_jain · April 18, 2016, 3:17am

Rudi,

PF the data below, I have masked lot of things, we have some more columns, dint included him here thinking it may not be required.

1008	116823	33	A	118	1431240091	15	ABCD 6510		2015.09.09	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	33
1008	116823	33	A	118	-1142149389	23.05	ABCD 6510		2015.09.09	XXXXXXXXXX	d	XXXXXXXXXX	debit	XXXXXXXXXX	2015.12.01
1008	116823	33	A	118	-392122611	18.9	ABCD 6510		2015.10.05	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	39
1008	116823	33	A	118	-54753935	84.34	ABCD 6510		2015.10.05	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	39
1008	116823	33	A	118	1966358768	2.91	ABCD 6510		2015.10.06	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	22
1008	116823	33	A	118	469200831	13.47	ABCD 6510		2015.09.14	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	23
1008	116823	33	A	118	1284265141	2.76	ABCD 6510		2015.10.26	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	1
1008	116823	33	A	118	-1911676106	4	ABCD 6510		2015.10.19	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	1
1008	116823	33	A	118	-2064109421	40	ABCD 6510		2015.11.18	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	11
1008	116823	33	A	118	2079801662	25	ABCD 6510		2015.09.15	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	28
1008	116823	33	A	118	1954930603	7	ABCD 6510		2015.10.19	XXXXXXXXXX	d	XXXXXXXXXX	2015.12.01	1	1
M	Mqq	BAN	DISPLAY_NAME	IT_A_I	TR_SEC_I	ANT	DESCRIPTION	TRANSACTION_DATE	POST_DATE	CHECK_NUMBER	Methid	LNAME	ASOF	Tstatus	Tcat

[/CODE]

Expecting this will help ... Have included even the column name below

RudiC · April 18, 2016, 3:27am

Listen, this is getting lengthy. There's no | delimiter as indicated in post#15. Using the default field separators, field 12 is

awk '{print $12}' file
d
d
d
d
d
d
d
d
d
d
d

Unless you take a decent step back, start over and present a meaningful and useful specification, I won't be in a position to help.

nikhil_jain · April 18, 2016, 3:30am

rudi,

Really sorry, I had just copied it from excel just noticed that some of the column i had to wrap it to mask.

Plz chk below if it helps

1008	116823	33	A	118	1431240091	282483235	1431240091	15	ABCD 6510		2015.09.09	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	-1142149389	282483125	-1142149389	23.05	ABCD 6510		2015.09.09	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	-392122611	282483183	-392122611	18.9	ABCD 6510		2015.10.05	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	-54753935	282483182	-54753935	84.34	ABCD 6510		2015.10.05	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	1966358768	282483181	1966358768	2.91	ABCD 6510		2015.10.06	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	469200831	282483225	469200831	13.47	ABCD 6510		2015.09.14	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	1284265141	282483140	1284265141	2.76	ABCD 6510		2015.10.26	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	-1911676106	282483161	-1911676106	4	ABCD 6510		2015.10.19	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	-2064109421	282483043	-2064109421	40	ABCD 6510		2015.11.18	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	2079801662	282483253	2079801662	25	ABCD 6510		2015.09.15	XXXXXXXXXX	d	XXXXXXXXXX
1008	116823	33	A	118	1954930603	282483160	1954930603	7	ABCD 6510		2015.10.19	XXXXXXXXXX	d	XXXXXXXXXX
M	Mqq	BAN	DISPLAY_NAME	IT_A_Id	Aitem_id	Btransid	TR_SEC_I	ANT	DESCRIPTION	TRANSACTION_DATE	POST_DATE	CHECK_NUMBER	Methid	LNAME

[/CODE]

Don_Cragun · April 18, 2016, 4:09am

Here is a commented version of the script I suggested:

#!/bin/ksh
# Set IAm to the final component of the pathname used to invoke this script.
IAm=${0##*/}

# If the number of operands given to this script is not 3, print a usage
# message to the standard error output and exit.
if [ $# -ne 3 ]
then	printf 'Usage: %s year.month1 year.month2 filename\n' "$IAm" >&2
	exit 1
fi
# Use awk to print any line in the file whose pathname is given by the 3rd
# operand to this script ("$3") where the field separator for lines in the file
# is set to a <period> character (-F'.') and:
# 1.  the input line number is 1 (NR==1): to print the header, or (||)
# 2.  the 1st field followed by the field separator followed by the 2nd field
#	is the string specified by the 1st operand to this script (-v ym1="$1")
#	and ($1 FS $2==ym1): to print lines matching the 1st given year
#	and month, or (||)
# 3.  the 1st field followed by the field separator followed by the 2nd field
#	is the string specified by the 2nd operand to this script (-v ym2="$2")
#	and ($1 FS $2==ym2): to print lines matching the 2nd given year
#	and month.
awk -F'.' -v ym1="$1" -v ym2="$2" 'NR==1 || $1 FS $2==ym1 || $1 FS $2==ym2' "$3"

Is this enough explanation, or is there something that still isn't clear?

If the date you want to process is in the 12th field and the field separator is a <tab> character and there is no header line (or the header line does not appear as the 1st line and doesn't need to be copied to the output) as in your latest sample input, you could try the following script instead:

#!/bin/ksh
# Set IAm to the final component of the pathname used to invoke this script.
IAm=${0##*/}

# If the number of operands given to this script is not 3, print a usage
# message and exit.
if [ $# -ne 3 ]
then	printf 'Usage: %s year.month1 year.month2 filename\n' "$IAm" >&2
	exit 1
fi

awk -F'\t' -v ym1="$1" -v ym2="$2" '
#NR == 1 {
# The following code to print the header has been removed because the sample
# data does not have a header line.
#	# Print the header line and skip further processing for this line.
#	print
#	next
#}
{	# Split the 12th tab separated field into the ymd[] array using
	# <period> as the field # separator within this field.
	split($12, ymd, /[.]/)
	#printf("$12=%s, ymd[1]=%s, ymd[2]=%s\n", $12, ymd[1], ymd[2])

	# Print this line if ym1 or ym2 is the same string as ymd[1] "." ymd[2].
	if(ym1 == ymd[1] "." ymd[2] || ym2 == ymd[1] "." ymd[2])
		print
}' "$3"

If someone wants to try either of the above scripts on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk .

As has been said before, giving us data that is not at all representative of the data you want to process, not telling us what the field separator is in the data you want to process, and continually changing your requirements is a huge waste of everyone's time.

I hope the above helps you, but if this still doesn't meet your requirements, we have little incentive to try to help if your requirements change again.