A lot has been written about how important monitoring is. Rather than add to the frenzy over more tools with heuristic end-to-end capabilities for DevOps, or restate the importance of monitoring, which has been known since the 1980s, I would like to offer a disruptive view. From the moment a tool presents its findings on a screen, the benefit is only evolutionary; it cannot contribute to a vision of a non-stop infrastructure for DevOps or any other operational model. We forget that monitoring is a process, not a technology.
What is reliability? Reliability is the confidence we have in a component or a system that it will not fail. How to improve reliability. How to assess reliability.
What is availability? Availability is the percentage of a given period during which a service was available. How to measure availability. How to improve availability.
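As a rough illustration of the definition above, availability over a measurement window can be computed from the total time and the recorded downtime. This is a minimal sketch, not a method taken from the article: the function name `availability_percent` and the example figures (a 30-day month with 2,592 seconds of downtime) are assumptions chosen for illustration.

```python
def availability_percent(total_seconds: float, downtime_seconds: float) -> float:
    """Return availability as a percentage of the measurement window.

    Hypothetical helper for illustration: availability is taken to be
    (total time - downtime) / total time, expressed as a percentage.
    """
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    if not 0 <= downtime_seconds <= total_seconds:
        raise ValueError("downtime_seconds must be between 0 and total_seconds")
    uptime_seconds = total_seconds - downtime_seconds
    return 100.0 * uptime_seconds / total_seconds


# Assumed example: a 30-day month with 2,592 seconds (43.2 minutes) of downtime.
month_seconds = 30 * 24 * 3600
monthly_availability = availability_percent(month_seconds, 2592)
print(f"{monthly_availability:.3f}%")
```

Under these assumed figures the result corresponds to roughly "three nines". The definition depends on what counts as downtime, for example whether planned maintenance is included, so that choice has to be made before the number means anything.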
To build a non-stop IT infrastructure capable of supporting demanding IT services in a dynamic DevOps environment, four self-contained principles are needed: Abstraction, Redundancy, Automation, and Proactiveness.
Machines, People and Processes - When we refer to IT infrastructure, we tend to forget that it is much more than hardware. In fact, the hardware you use is the least of your concerns. You buy the best hardware with the shiniest technology, plug it into power, connect it to the network, and it is up and running. Is this enough? Definitely not. If you look into the history of your problems ...