vRealize Operations Manager 6 - Deep Dive Part 1

vRealize Operations Manager, also known as vROps or vROM and formerly known as vCOps, is a VMware monitoring engine that examines VMware components and provides recommendations, alerts, reports, and dashboards that show how the environment is performing.  It is not limited to virtualized components: depending on the management packs you have installed, it can also monitor storage, networking, Windows components, applications, and other pieces.

vROps has numerous add-ins that extend its reach into the environment.  This deep dive starts with the basics: architecture, installation, setup, upgrades / hotfixes, management pack installation, and initial views.  Before starting the vROps installation, a VMware environment should be in place with a couple of ESXi hosts and a vCenter 5.5 install.

The architecture of vROps is fairly straightforward and is simplified from the 5.8 and older implementations.  The vApp that 5.8 and older used was a two-VM implementation, with a Web UI VM and an Analytics DB VM making up the pair.  With the complete redesign of the application, the design moved to a single VM with all the components running within the "Node".  Additional nodes are added to the cluster for resiliency (High Availability) or for remote collection.

All of the components run within a single system but remain separate in how they communicate.  The Collector pulls data from the multiple sources and pushes it down the stack as the data becomes more stale.  The Persistence layer is where the historical data is stored, and the Analytics layer is the active layer where processing and calculations are performed.  The Controller layer decides whether gathered information is needed now or should be archived, and the UI is where the information is displayed.

As the environment needs more scale, adding nodes is fairly simple, but each new node must use the same build type chosen in item 1 of the installation section below.  Mixing Standalone and Appliance implementations is not recommended.

Master Node:  The master node is the first node installed in any environment and is where all the data resides.  It stores the user-created content, the collected data, and the historical data.

Replica Node:  The replica node is exactly that, a replica of the master node.  It keeps all the same data so that it can take over if the master node fails or becomes unavailable.

Data Node:  The data node helps distribute the historical, persistence, and analytics functions across the cluster, but it doesn't hold any user-based dashboards or data.

Remote Collector:  The remote collector handles collection within a DMZ, a branch office, or other protected areas of the environment.  It doesn't contain the product UI, the database for historical information, or the user-based information held on the master node.

The master, replica, and data nodes each run a local in-memory database for enhanced analytics performance.
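
To keep the four node types straight, here is a minimal Python sketch that restates the descriptions above as a small lookup table.  This is only my summary of the text, not anything vROps exposes: the names `Role`, `Capabilities`, `ROLES`, and `failover_candidates` are made up for illustration, and the remote collector's lack of an in-memory database is inferred from the sentence above rather than stated outright.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Role(Enum):
    MASTER = "master"
    REPLICA = "replica"
    DATA = "data"
    REMOTE_COLLECTOR = "remote collector"


@dataclass(frozen=True)
class Capabilities:
    user_content: bool     # dashboards and other user-created content
    historical_data: bool  # historical / persistence data
    in_memory_db: bool     # local in-memory database for analytics


# Hypothetical summary table built only from the node descriptions in this post.
ROLES: Dict[Role, Capabilities] = {
    Role.MASTER: Capabilities(user_content=True, historical_data=True, in_memory_db=True),
    Role.REPLICA: Capabilities(user_content=True, historical_data=True, in_memory_db=True),
    Role.DATA: Capabilities(user_content=False, historical_data=True, in_memory_db=True),
    # Assumption: no in-memory DB, since only the other three roles are listed as running one.
    Role.REMOTE_COLLECTOR: Capabilities(user_content=False, historical_data=False, in_memory_db=False),
}


def failover_candidates() -> List[Role]:
    """Return the roles that hold everything the master holds (master excluded)."""
    master = ROLES[Role.MASTER]
    return [role for role, caps in ROLES.items()
            if role is not Role.MASTER and caps == master]
```

Calling `failover_candidates()` on this table returns only the replica role, which matches the description: the replica is the only node type that carries a full copy of what the master holds.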

The Product UI provides a web interface for advanced dashboards, report generation, and system analytics that help resolve issues based on trending, capacity, and performance metrics.  This will be discussed in later posts.

This post assumes a new, or greenfield, implementation.  Installing vROps is a fairly simple procedure, and there are two methods of installation: appliance or standalone.  As with any engagement that I do, I first determine a few requirements:
  1. Is an appliance installation acceptable, or is the install required to run on a customized OS build?  (Standalone can install on Linux or on Windows 2008 R2.)  Before you ask, I have seen no difference from one implementation to the other, except that the appliance is quicker to implement.
  2. Next, determine the sizing for the environment: the number of metrics that will be analyzed and whether High Availability is needed.  A great sizing document is available at http://kb.vmware.com/kb/2109312 .  The KB has a spreadsheet near the bottom that calculates how many metrics your organization will or could produce based on certain packs and components.  It is a great exercise in deciding what you do and don't monitor.
  3. Standard things: a static IP for each appliance / standalone node, a DNS entry for each node with a PTR record, a service account to access users and groups within AD, and a password to use for the root / admin account of the application.
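
The DNS requirement in item 3 is easy to verify before the install.  Below is a rough Python sketch, using only the standard library, that checks that a node name resolves to its static IP and that the PTR record points back to the same name.  The function name `check_dns` and the example host name and address in the comment are placeholders of mine, not anything from a real environment; the resolver arguments exist only so the check can be exercised without touching real DNS.

```python
import socket
from typing import Callable, List, Tuple

ForwardLookup = Callable[[str], str]
ReverseLookup = Callable[[str], Tuple[str, List[str], List[str]]]


def check_dns(hostname: str,
              expected_ip: str,
              forward: ForwardLookup = socket.gethostbyname,
              reverse: ReverseLookup = socket.gethostbyaddr) -> List[str]:
    """Return a list of problems found; an empty list means both records look right."""
    problems: List[str] = []

    # Forward lookup: the node name should resolve to the static IP.
    try:
        resolved = forward(hostname)
    except OSError as exc:
        problems.append(f"forward lookup failed for {hostname}: {exc}")
    else:
        if resolved != expected_ip:
            problems.append(f"{hostname} resolves to {resolved}, expected {expected_ip}")

    # Reverse lookup: the PTR record should point back to the node name.
    try:
        ptr_name, _aliases, _addresses = reverse(expected_ip)
    except OSError as exc:
        problems.append(f"reverse lookup failed for {expected_ip}: {exc}")
    else:
        if ptr_name.rstrip(".").lower() != hostname.rstrip(".").lower():
            problems.append(f"PTR for {expected_ip} is {ptr_name}, expected {hostname}")

    return problems


# Example call (placeholder name and address, substitute your own):
#   for problem in check_dns("vrops-01.example.local", "10.0.0.10"):
#       print(problem)
```

Run it once per planned node, including any replica, data, or remote collector nodes, from a machine that uses the same DNS servers the nodes will use.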
Part 2 of this deep dive will discuss the installation procedure for both the standalone and appliance installs...