Need ideas for a daily transaction volume check

Hi All,

Good morning. :)

I need some ideas for checking transaction volume on a daily basis.
I already have a basic idea:

  • Take the mean of 3 months of historical data, then add/subtract some number 'n' (anywhere from 100 to 10000, depending on the source system) to create a daily MAX and MIN range (maybe?).
  • If the day's volume does not fall within this range, an e-mail should be sent (SMTP e-mail functionality is already in place), or the job should fail, or something similar.
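
A minimal sketch of the mean ± n idea above, in Python (one of the languages the poster is open to). The function names and the margin n = 100 are assumptions for illustration only; the history values are the six most recent weekday Src1 counts from the sample data posted later in this thread.

```python
from statistics import mean


def volume_range(history, n):
    """Return (low, high): the historical mean minus/plus a fixed margin n."""
    avg = mean(history)
    # A transaction count cannot be negative, so clamp the lower bound at 0.
    return max(0, avg - n), avg + n


def in_range(volume, history, n):
    """True if the day's volume falls inside the mean +/- n range."""
    low, high = volume_range(history, n)
    return low <= volume <= high


# Six weekday Src1 counts from the thread's sample data.
history = [288, 258, 419, 204, 282, 295]
n = 100  # assumed margin; it would have to be tuned per source system

low, high = volume_range(history, n)
print(f"allowed range: {low} to {high}")
print("150 in range?", in_range(150, history, n))
```

A fixed margin is simple to explain, but it has to be chosen separately for each source, since a margin that suits a source with a few hundred transactions a day is meaningless for one with twelve thousand.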

Please let me know if this is a reasonable approach or if there is a better way to check this.

I'm open to ideas (coding: sh/ksh/py).

Cheers,

Please give details about what you are talking about.

  • MySQL database transactions on Linux RedHat?
  • Email transactions on HP-UX email server?
  • An Apache web server on AIX?

Before anyone can offer any useful suggestions, you need to describe in detail your system architecture and what you are trying to accomplish.

Transaction volume as in bank transaction volume (currency/cash/wire transactions by individuals [you, me, others :)]).

Each day, the source system delivers these transactions.
I want to see whether there is a volume deviation on any given day by comparing it against the historical volume.

For example, on a given date, 14-Dec-2019, say we received 150 transactions for SRC1, which is far fewer than the historical volume (which ranged between 250 and 450 transactions daily; sample data is given below).
So in the above case:

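volume=150          # today's count for SRC1 (from the example above)
maxVolRange=450
minVolRange=250
if [ "$volume" -lt "$minVolRange" ] || [ "$volume" -gt "$maxVolRange" ]
then
    # validation job fails (or an e-mail is sent)
    echo "Volume $volume is outside $minVolRange-$maxVolRange" >&2
    exit 1
else
    :   # do bla bla (continue with normal processing)
fi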

The values of maxVolRange and minVolRange should change daily, based on the historical data of transactions received from the source.

Sample data:

Date,Day of Week,Src1 Trxn Count,Src2 Trxn Count
13-Dec-19,Friday,288,12705
12-Dec-19,Thursday,258,12064
11-Dec-19,Wednesday,419,11622
10-Dec-19,Tuesday,204,12287
9-Dec-19,Monday,282,11842
7-Dec-19,Saturday,0,335
6-Dec-19,Friday,295,12577
5-Dec-19,Thursday,513,12240
4-Dec-19,Wednesday,529,12257
3-Dec-19,Tuesday,442,11598
2-Dec-19,Monday,252,12496

So I want to derive maxVolRange and minVolRange on a daily basis by calculating an average/mean (maybe?), to check whether we are receiving the correct data and that no data is missing.
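
One way to derive the range from the history itself, rather than from a hand-picked number, is to use the mean plus or minus k standard deviations. The sketch below applies that to the sample data above. It is only an illustration under assumptions that are not in the original post: the multiplier k, the function names, and the decision to leave weekends out of the baseline (the Saturday row shows that weekend volumes are very different) are all choices made here, and the CSV header spelling is normalized.

```python
import csv
import io
from statistics import mean, stdev

# Sample data from the post (header spelling normalized).
SAMPLE = """Date,Day of Week,Src1 Trxn Count,Src2 Trxn Count
13-Dec-19,Friday,288,12705
12-Dec-19,Thursday,258,12064
11-Dec-19,Wednesday,419,11622
10-Dec-19,Tuesday,204,12287
9-Dec-19,Monday,282,11842
7-Dec-19,Saturday,0,335
6-Dec-19,Friday,295,12577
5-Dec-19,Thursday,513,12240
4-Dec-19,Wednesday,529,12257
3-Dec-19,Tuesday,442,11598
2-Dec-19,Monday,252,12496
"""


def weekday_counts(text, column):
    """Read one count column, skipping Saturday and Sunday rows."""
    reader = csv.DictReader(io.StringIO(text))
    return [int(row[column]) for row in reader
            if row["Day of Week"] not in ("Saturday", "Sunday")]


def sigma_range(counts, k):
    """Return (low, high) as mean -/+ k sample standard deviations."""
    avg = mean(counts)
    spread = stdev(counts)
    return max(0.0, avg - k * spread), avg + k * spread


src1 = weekday_counts(SAMPLE, "Src1 Trxn Count")
for k in (1.5, 2.0):  # assumed multipliers, to be tuned
    low, high = sigma_range(src1, k)
    print(f"k={k}: range {low:.1f} to {high:.1f}, 150 flagged: {not low <= 150 <= high}")
```

With only ten weekdays of history, Src1 is noisy enough that the choice of k decides whether a count of 150 is flagged, which is a good argument for using the full 3 to 6 months of history and for keeping weekends (and probably holidays) in a separate baseline.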

Please let me know if I was able to explain this.

~ Saps.

Thanks. But where do you get these numbers?

  • Do these numbers come from a DB query?
  • Counting web server page views?
  • Summing up hamsters as they fly in and out of the room?

When building a new system task, we should clearly know the interface, and you have not provided any of the details (even in your second reply).

You have described "what you want", but you have not described or provided any sample input files, the spec of the API, etc.

Hi Neo,

Apologies... Let me try to answer your queries:

It is an existing system. The current high-level architecture is:

Source systems (Reuters, Bloomberg, etc.) ---> provide transaction details as raw flat files [.dat files] to an internal team ---> that team applies transformation logic to generate the files our application requires [Infra: Hadoop, Unix] ---> we load these files [Infra: Unix, Oracle] ---> apply ETL transformation logic [Infra: Unix, Oracle] ---> run rules/logic [Infra: Unix, Oracle] ---> generate alerts [Infra: Unix, Oracle].

Yes, the numbers come from an Oracle SQL query (run manually in the database each day; the SQL output was provided as the sample CSV data in my previous post).

Now I would like to verify the data volume just before the ETL transformation layer.

Please let me know if I have provided the required answers.

Cheers,
~ Saps

Thanks for that high-level overview, Saps.

That sounds interesting.

Would you mind, then, providing a fragment (sample input) for the code you need help with, maybe a few dozen lines or so?

Hi Neo,

I have not started the code development yet, but I have something in mind:

  • Get an average or median, then add/subtract some number to decide the MAX and MIN range.
  • Then check whether today's transaction count falls within this range.
  • If not, the job fails or a mail is sent.
  • The MAX and MIN values will change daily, based on the historical data over a period of time (say 3 to 6 months).
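
The steps above could be wired together roughly as follows. This is a sketch under stated assumptions, not a finished job: it uses the median with a percentage tolerance instead of a fixed number (a swapped-in variant of the idea), and the window size, the 40% tolerance, the function names and the e-mail addresses are all hypothetical. Actually sending the message is left to the SMTP functionality that already exists.

```python
from email.message import EmailMessage
from statistics import median


def check_volume(today_count, history, window=90, tolerance=0.4):
    """Compare today's count with the median of the last `window` days.

    `history` is ordered oldest to newest. Returns (ok, low, high).
    """
    recent = history[-window:]  # the range moves forward every day
    mid = median(recent)
    low, high = mid * (1 - tolerance), mid * (1 + tolerance)
    return low <= today_count <= high, low, high


def build_alert(source, today_count, low, high):
    """Build (but do not send) an alert mail; addresses are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = f"Volume check failed for {source}"
    msg["From"] = "etl-monitor@example.com"
    msg["To"] = "support@example.com"
    msg.set_content(
        f"{source}: received {today_count} transactions, "
        f"expected between {low:.0f} and {high:.0f}."
    )
    return msg


def run_check(source, today_count, history):
    """Return a job exit code (0 = ok, 1 = fail) and the alert, if any."""
    ok, low, high = check_volume(today_count, history)
    if ok:
        return 0, None
    return 1, build_alert(source, today_count, low, high)


# Weekday Src1 counts from the sample data, oldest first.
history = [252, 442, 529, 513, 295, 282, 204, 419, 258, 288]
exit_code, alert = run_check("SRC1", 150, history)
print(exit_code)
if alert is not None:
    print(alert["Subject"])
# A real job would hand `alert` to the existing mail routine and then
# call sys.exit(exit_code) so the scheduler sees the failure.
```

The median is less sensitive than the mean to a single unusual day in the history, and a percentage tolerance scales with the source, so the same setting can be tried on both a small and a large feed before tuning.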

I wanted to know whether this is a sound approach for determining the "daily volume range", or whether there are better ways to determine it.

Cheers,
Saps.

Hi Saps,

If you read my post carefully, I did not ask for any code samples. My message to you said:

Kindly read more carefully. Thanks so much.