More and more organizations are digitally transforming their business by moving to the cloud. This isn’t surprising, but it does present a challenge to the IT team, which is increasingly becoming a “broker” for these cloud services. Despite not controlling the underlying infrastructure, platforms, or even the applications that provide the cloud service, IT is still on the line for the continuity of that service, as well as the impact on the business if the service is impaired or down. The dilemma is simple: Do I really know how anything outside my data center is operating? How do I monitor anything I put in or consume from the cloud?
First of all, “cloud monitoring” can mean many different things to different people. Mainly, though, it boils down to whether the contracted service is being delivered and whether the service provider is meeting the agreed-upon service-level agreements (SLAs). However, monitoring such SLAs can be difficult for IT. Often, they must rely on the service provider to do the monitoring (and the reporting) for them. While this isn’t necessarily bad from a “reducing complexity” standpoint, IT must still be able to aggregate any data provided by the service provider into a holistic view of their environment.
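To make the SLA point concrete, here is a minimal sketch of how you might check a provider's availability commitment yourself once you have outage durations in hand. The names `SlaReport` and `evaluate_sla` are made up for illustration, and the sketch assumes the SLA is expressed as a simple availability percentage over a fixed period; real contracts often add exclusions (planned maintenance, for example) that this does not model.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SlaReport:
    target_pct: float            # availability promised in the SLA, e.g. 99.9
    measured_pct: float          # availability actually observed
    allowed_downtime_min: float  # downtime budget implied by the target
    actual_downtime_min: float   # total downtime recorded in the period

    @property
    def met(self) -> bool:
        # The SLA is met when observed availability reaches the target.
        return self.measured_pct >= self.target_pct


def evaluate_sla(target_pct: float,
                 period_minutes: float,
                 outage_minutes: Sequence[float]) -> SlaReport:
    """Compare recorded outages against an availability target.

    Assumption: every outage counts fully against the SLA.
    """
    if period_minutes <= 0:
        raise ValueError("period_minutes must be positive")
    downtime = sum(outage_minutes)
    measured = 100.0 * (period_minutes - downtime) / period_minutes
    allowed = period_minutes * (1.0 - target_pct / 100.0)
    return SlaReport(target_pct, measured, allowed, downtime)


if __name__ == "__main__":
    # A 30-day month with two outages of 20 and 30 minutes.
    report = evaluate_sla(99.9, 30 * 24 * 60, [20, 30])
    print(report, "met" if report.met else "missed")
```

For scale, a 99.9% target over a 30-day month leaves a budget of roughly 43.2 minutes of downtime, so the two outages in the example would put the provider over the line.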
So, if I want to do a better job at “cloud monitoring,” what should I be focused on?
Agents vs. Agent-less
This is an ongoing debate in the monitoring community, and there are pros and cons on both sides. Deploying agents usually allows for richer, more in-depth metrics. However, deploying and configuring agents takes time and may not make sense for the type of cloud services you’re consuming. For highly elastic computing scenarios where VMs may exist for only a short time, the strategy should be agent-less. If you are using IaaS as a backbone to run critical IT and business services, then monitoring via agents may make more sense. The agent vs. agent-less debate needs input from your cloud service provider (CSP) too. Some providers are OK with deployed agents reporting monitoring data back to you, while others prefer to handle certain types of monitoring on their end and simply provide you with the data or, potentially worse, just a dashboard. While IT monitoring dashboards are essential, you still need a way to integrate the data your cloud service provider gives you into your own monitoring solution.
You will rarely get any data about the CSP’s underlying physical infrastructure, so you’re limited to data on the VMs themselves. For the most part, this is what you really care about anyway. IT monitoring solutions have been tackling VM monitoring for years, so as long as your monitoring solution can handle major virtualization platforms like Xen, vSphere, Hyper-V, and KVM, you should be good to go. Whether you deploy an agent or go agent-less should be determined by your own cloud strategy and discussed with your IaaS provider, so that you come up with a solution that gives you the types of monitoring data you need.
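The rules of thumb in this section can be written down as a small decision helper. This is only a sketch of the reasoning above, not a definitive policy: the `Workload` fields, the `choose_monitoring` function, and the one-hour cut-off for a “short-lived” VM are all assumptions chosen for illustration, and you would tune them to your own cloud strategy.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    expected_lifetime_hours: float  # how long a typical VM instance lives
    business_critical: bool         # runs a critical IT or business service
    provider_allows_agents: bool    # confirmed with your IaaS provider


# Hypothetical cut-off: VMs that live less than this count as short-lived.
SHORT_LIVED_HOURS = 1.0


def choose_monitoring(w: Workload) -> str:
    """Return 'agent', 'agentless', or 'either' for a workload."""
    if not w.provider_allows_agents:
        # The provider collects the data; you consume what they expose.
        return "agentless"
    if w.expected_lifetime_hours < SHORT_LIVED_HOURS:
        # Highly elastic: deploying and configuring an agent costs
        # more time than the VM is likely to exist.
        return "agentless"
    if w.business_critical:
        # Long-running and critical: richer metrics justify the agent.
        return "agent"
    # The rules of thumb do not settle this case; decide per cloud strategy.
    return "either"


if __name__ == "__main__":
    for w in (
        Workload("batch-render", 0.25, False, True),
        Workload("erp-backend", 24 * 365, True, True),
    ):
        print(w.name, "->", choose_monitoring(w))
```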
Monitoring the Cloud Management Platform
There are plenty of cloud management platforms available, but being able to cover popular ones like OpenStack, CloudStack, AWS, Azure, CloudPlatform, OpenShift, and Cloud Foundry should be key in your cloud monitoring strategy. Make sure your monitoring solution can accommodate your cloud management platform of choice. The last thing you need is to add yet another monitoring solution to the mix just to cover your cloud strategy.
Monitoring the Application Itself
Identifying the key applications that are hosted in, or provided entirely through, the service provider is critical. Knowing how the VM and platform are operating is great, but if the application or service on top of the platform goes down, you’d better know that. Make sure you can adequately monitor any key applications that are cloud-based, but also know how those applications tie into your entire service delivery model.
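One simple way to know whether the application itself is up, independent of the VM and platform beneath it, is an outside-in check against an HTTP endpoint the application exposes. The sketch below uses only the Python standard library. The `probe` function, its thresholds, and the health URL in the example are hypothetical, and it assumes the application answers over HTTP; a real check would usually also validate the response content.

```python
import time
import urllib.error
import urllib.request


def probe(url, timeout=5.0, slow_after=2.0, fetch=urllib.request.urlopen):
    """Run one availability check and classify the result.

    `fetch` defaults to urllib's urlopen; it is a parameter only so the
    check can be exercised without a network connection.
    """
    started = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # urlopen raises for 4xx/5xx; keep the status code it carries.
        status = exc.code
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return {"url": url, "state": "down", "status": None,
                "seconds": time.monotonic() - started, "error": str(exc)}
    elapsed = time.monotonic() - started

    if status >= 400:
        state = "down"        # the application is not serving the check
    elif elapsed > slow_after:
        state = "degraded"    # answering, but slower than acceptable
    else:
        state = "up"
    return {"url": url, "state": state, "status": status,
            "seconds": elapsed, "error": None}


if __name__ == "__main__":
    # Hypothetical endpoint; replace with one your application exposes.
    print(probe("https://app.example.com/health"))
```

Running a check like this from outside the provider’s network measures what your users actually experience, which is the view the provider’s own dashboard may not give you.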
What Kinds of Data Do I Need?
Two kinds of data should be tracked when it comes to cloud services:
Uniting the Picture
Last, but not least, no cloud application really operates in isolation; it interacts with many other components in your organization to deliver a service. Knowing how the performance of the cloud application impacts the overall delivery of any IT or business service is critical. Monitoring often occurs in silos, and this is particularly true when the cloud is involved. You need to be able to unite all the monitoring data you have (on-premises, off-premises, and cloud-based) into a clear picture of how your IT as a whole is running and how any individual service is affecting your business. Without that clearer picture, it is often difficult for IT (and the CEO) to relate esoteric stats about network performance or SLA compliance to the actual operation of the business itself.
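As a rough illustration of what “uniting the picture” can mean in practice, the sketch below rolls component states from several monitoring sources up into one state per business service, using a worst-state-wins rule. Everything here is an assumption for the example: the state names, the `service_health` function, the service map, and the sample readings. A real monitoring solution would carry far more context than a single state string.

```python
# Higher number = worse. "unknown" marks a component nobody reported on.
SEVERITY = {"ok": 0, "unknown": 1, "degraded": 2, "down": 3}


def service_health(service_map, readings):
    """Roll component readings up to one state per business service.

    service_map: dict of service name -> list of component names
    readings:    iterable of (source, component, state) tuples, where
                 source is e.g. "on-premises" or "iaas-provider"
    """
    # Keep the worst state seen for each component across all sources.
    worst = {}
    for _source, component, state in readings:
        current = worst.get(component)
        if current is None or SEVERITY[state] > SEVERITY[current]:
            worst[component] = state

    # A service is only as healthy as its least healthy component.
    result = {}
    for service, components in service_map.items():
        states = [worst.get(c, "unknown") for c in components]
        result[service] = max(states, key=SEVERITY.__getitem__, default="unknown")
    return result


if __name__ == "__main__":
    service_map = {
        "online-store": ["web-vm", "payments-saas", "db-onprem"],
        "intranet": ["wiki-vm"],
    }
    readings = [
        ("on-premises", "db-onprem", "ok"),
        ("iaas-provider", "web-vm", "degraded"),
        ("saas-provider", "payments-saas", "ok"),
    ]
    print(service_health(service_map, readings))
```

Treating a component with no data as “unknown” rather than “ok” is a deliberate choice in this sketch: a silo that never reports should show up as a gap in the picture, not as good news.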
There’s no doubt that cloud services are increasingly important to businesses everywhere. As organizations seek to operate with greater agility, they are transforming themselves by moving to the cloud. However, having a solid cloud monitoring approach is just as important as the cloud service itself. It’s critical that you know how you’ll monitor any service you outsource to or consume from the cloud, or you may find yourself experiencing the service outages and delays you’d hoped were a thing of the past.
Disclaimer: As with everything else at NetIQ Cool Solutions, this content is definitely not supported by NetIQ, so Customer Support will not be able to help you if it has any adverse effect on your environment. It just worked for at least one person, and perhaps it will be useful for you too. Be sure to test in a non-production environment.