How to Ensure Uptime with Advanced Monitoring in PegaSys Plus
PegaSys Plus, launched October 22, is a commercially licensed Ethereum platform designed for enterprises that want to move their blockchain solution to production quickly. In the previous post in this series, we talked about PegaSys Plus’ increased security features using blockchain database encryption. In this post, we dive into another of PegaSys Plus’ enterprise-focused features: advanced monitoring.
Costs of System Downtime
Production-grade blockchain requires consistent uptime of your platform. Getting as close as possible to 100% uptime is every infrastructure provider's goal and every customer’s long-awaited dream. System downtime, whether planned or accidental, can be costly: it frustrates users, creates more work for maintainers, and reduces productivity for everyone. Entire workflows are dedicated to ensuring maximum availability of systems and a speedy turnaround in the event of a system failure. Monitoring your system's performance to ensure it is optimal is the most effective way to identify potential bottlenecks early and address them before they become an issue.
Pain Points for Sysadmins
Giving the people responsible for maintaining the health of your system the right tools is a necessity rather than an option. Without the ability to monitor the current health and behavior of the system, you won’t know what needs addressing or where system administrators should focus their energy. Metrics and monitoring tools for blockchain networks help ensure that nodes maintain maximum availability and that sufficient redundancy is in place to approach the ultimate aim of 100% uptime. Without these tools, you run the risk of your nodes underperforming, falling behind on synchronizing with other peers in the network, or falling over because they cannot keep up with the demands of the applications that rely on blockchain data.
Why Monitoring Your Validators is Important
Hyperledger Besu comes with a set of metrics out of the box that can be exposed to enterprise-grade monitoring systems to show what is happening “under the hood” of your network and where bottlenecks might be occurring. System hardware metrics such as memory, I/O usage, and CPU utilization are among the most commonly consulted, because they show whether the system's specifications meet the demands placed on it.
Validators in an Enterprise Ethereum blockchain network play a special role. They are responsible for processing the transactions distributed in the network and producing valid blocks that all other validators agree to, based on the predefined set of rules for the network. An IBFT 2.0 consensus-based network requires a ⅔ majority of validators to be available and in agreement to produce a block. In such networks, monitoring the health of these validators, beyond just the basics of hardware utilization, is critically important for identifying consensus issues that may arise. If too many validators are offline for the remaining ones to form the required majority, your network is effectively on “pause” until enough of them come back online.
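To make the ⅔ requirement concrete, here is a minimal sketch of the arithmetic. It assumes the majority is rounded up to a whole number of validators; the function names are hypothetical and are not part of Besu or PegaSys Plus.

```python
def required_quorum(validator_count: int) -> int:
    """Smallest whole number of validators that is at least 2/3 of the set.

    Assumption: the two-thirds majority is rounded up (ceiling division).
    """
    if validator_count < 1:
        raise ValueError("a network needs at least one validator")
    # Integer ceiling of (2 * n) / 3 without floating point.
    return (2 * validator_count + 2) // 3


def can_produce_blocks(validator_count: int, online_count: int) -> bool:
    """True if enough validators are online to reach the required majority."""
    return online_count >= required_quorum(validator_count)


def tolerated_offline(validator_count: int) -> int:
    """How many validators can be offline before block production pauses."""
    return validator_count - required_quorum(validator_count)
```

Under this assumption, a four-validator network needs three validators online, so losing a second validator is enough to pause block production. That is why alerting on the first validator outage matters.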
Advanced Monitoring in PegaSys Plus
Advanced monitoring with PegaSys Plus gives you the necessary metrics through a plugin for Hyperledger Besu. The plugin exposes details such as the last time a validator produced a block and how many of the previous X blocks (e.g. 100 blocks) it has produced. In a four-validator network running at a healthy rate, you would expect each validator to produce about 25% of the blocks. Being able to identify when a validator is not pulling its weight gives system administrators insight into where to optimize their network, or even into when a validator may be attempting to act in a Byzantine manner. The next release of PegaSys Plus will add to these insights by providing ways to measure and monitor more fine-grained details of the IBFT 2.0 round messages exchanged between validators, helping system administrators make timely, well-informed decisions.
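The "share of the last X blocks" check can be sketched in a few lines. This is an illustration only: it assumes you already have the proposer of each recent block as a plain list, and the function names, the validator labels, and the 50% tolerance are hypothetical choices, not the plugin's actual metric names or thresholds.

```python
from collections import Counter


def proposer_shares(recent_proposers, validators):
    """Fraction of the recent blocks produced by each validator."""
    counts = Counter(recent_proposers)
    total = len(recent_proposers)
    return {v: (counts[v] / total if total else 0.0) for v in validators}


def underperformers(recent_proposers, validators, tolerance=0.5):
    """Validators whose share is below `tolerance` times the expected share.

    The expected share is 1 / number of validators (25% with four validators).
    The default tolerance of 0.5 is an arbitrary illustrative threshold.
    """
    expected = 1 / len(validators)
    shares = proposer_shares(recent_proposers, validators)
    return [v for v in validators if shares[v] < expected * tolerance]


# Hypothetical window of 100 blocks in which validator "D" produced only one.
validators = ["A", "B", "C", "D"]
recent = ["A", "B", "C"] * 33 + ["D"]
lagging = underperformers(recent, validators)
```

In practice the tolerance would be tuned to the window size, since a short window makes an even split less likely even on a healthy network.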
Interested in learning more about how you can ensure system uptime with advanced monitoring using PegaSys Plus? Reach out to us here.