Website RTO. What monitoring can I set up and how do I track the issue?

Hello,

I have installed a WordPress theme on CentOS and brought up a website on AWS. I have added that website to Cloudflare. While I was working on web page development, I noticed that the website is sometimes unreachable. In 8 hours I noticed it 2-3 times, and after a few seconds it would come back online. The website is not in production yet, but before it goes live I want to fix this RTO issue.

1- I would probably set up some kind of monitoring (with notification to my email) that can tell me when the site goes down. What kind of monitoring can I set up? Something in AWS or Cloudflare, or a third-party tool?
I was checking UptimeRobot, but it checks the website at a minimum interval of 1 minute.

2- How do I track at what level it is failing? I checked the AWS instance and it was always up, with no downtime on that CentOS Linux system.

Please suggest solutions, preferably inexpensive ones.

Thanks

I don't think it is a matter of tools but a matter of organised debugging. Let us first consider what could have gone wrong:

1) the server('s OS) - you ruled that out

2) the application software, i.e. Apache and/or whatever works on top of it.

3) the network connection of your server: possible reasons include network congestion, broadcast storms, intermittent hardware outage, ...

4) the connection between you and your server: caching mechanisms like Cloudflare may influence the connectivity until the cache is filled.

This off-the-top-of-my-head list is probably neither complete nor detailed enough. You are welcome to edit it until it fits your environment. Once you have done that, you start ruling out one point or sub-point after the other: for instance, the application stack you use could be tested automatically by a client running on the server itself, thereby bypassing the network connections otherwise necessary. Once you have established that, you move on to the next point in the list.
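To illustrate the "client working from within the server" idea, here is a minimal sketch in Python, not a finished monitoring tool. It probes the same site through two routes at a short interval and records which route failed and when. The names `probe` and `watch`, the 10-second interval and both URLs are my own assumptions; replace `https://example.com/` with your real address.

```python
import http.client
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return True if url answers with a non-5xx HTTP status within timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx or 5xx).
        return err.code < 500
    except (OSError, http.client.HTTPException):
        # DNS failure, refused or reset connection, timeout, broken reply.
        return False


def watch(targets, rounds, interval=10.0, check=probe, sleep=time.sleep):
    """Probe every target once per round; return (timestamp, name) failures."""
    failures = []
    for i in range(rounds):
        for name, url in targets.items():
            if not check(url):
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                failures.append((stamp, name))
                print(f"{stamp} {name} unreachable: {url}")
        if i < rounds - 1:
            sleep(interval)
    return failures


if __name__ == "__main__":
    # Hypothetical targets: "local" bypasses the network, "public" does not.
    watch(
        {"local": "http://127.0.0.1/", "public": "https://example.com/"},
        rounds=360,  # roughly one hour at a 10-second interval
    )
```

Run on the server itself, this separates two cases: if only "public" shows up in the output, the application stack answered and the problem lies somewhere on the network path; if "local" fails as well, look at Apache and whatever runs on top of it.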

Debugging is just the organised application of logic and a few usually rather simple tests once you have properly envisioned how things are supposed to work and what depends on what.
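As an example of such a simple test built on "what depends on what": an HTTP answer depends on a TCP connection, which depends on name resolution. The following is a rough sketch, assuming Python on the client side; the function name `first_failing_layer` and its parameters are made up for illustration. It reports the first of these steps that fails for a single request.

```python
import http.client
import socket


def first_failing_layer(host, port=80, path="/", timeout=5.0, tls=False):
    """Return 'dns', 'tcp', 'http' or 'ok' for one probe of host:port."""
    # Step 1: name resolution. A failure here points away from the server.
    try:
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns"

    # Step 2: plain TCP connect, no HTTP spoken yet.
    try:
        socket.create_connection((host, port), timeout=timeout).close()
    except OSError:
        return "tcp"

    # Step 3: a real HTTP request over a fresh connection.
    conn_class = http.client.HTTPSConnection if tls else http.client.HTTPConnection
    conn = conn_class(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        status = conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return "http"
    finally:
        conn.close()
    # 5xx means something answered but could not serve the page.
    return "ok" if status < 500 else "http"
```

For a site behind HTTPS you would call it as `first_failing_layer("example.com", 443, tls=True)`, with your own host name in place of the placeholder. Logging the result together with a timestamp whenever it is not "ok" tells you at which level the short outages happen.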

I hope this helps.

bakunin