Friday, January 29, 2016

A Tale of a Solution And Three Points of View

What are a vSphere Administrator, a Storage Administrator and a Security Administrator doing having lunch together? They are embracing the new world order. That’s the Data Center world we live in now. There are great expectations that, as we move towards Services-based infrastructure, the lines between Data Center IT silos will blur, forcing Administrators from “unrelated” fields to cooperate much more than they did before. Who knows; maybe not too far down the road there will be only two of them left.

Let’s talk about a neat (unsupported) feature of Virtual SAN. Virtual SAN relies on the IP Network to provide the transport between its clients (the Operating System, such as ESXi) and its servers (the Storage Container, such as the Virtual SAN hosts) to securely store Virtual Machines. The Architecture of Virtual SAN is such that the Configuration Plane resides with vCenter and the Control/Data Planes reside with the Virtual SAN nodes. Or is it? If you read about Virtual SAN in William Lam’s VirtualGhetto (or, if you prefer your reading in Spanish, stop by Leandro Leonhardt’s page, BlogVMware), you will realize (as the vSphere Administrator has) that you can do significant Virtual SAN configuration on the ESXi hosts via the CLI. VMware only supports Virtual SAN for nodes that are members of the same vSphere cluster, but what got my attention was the fact that you can add an ESXi host to any Virtual SAN cluster. This would get the attention of the Storage Administrator and the Security Administrator.

So I set up a 3-node Virtual SAN cluster using vCenter (all in the same cluster, following VMware’s instructions), and SSH’d into one of the nodes to get the Virtual SAN cluster UUID (using the command esxcli vsan cluster get; the field is called Sub-Cluster UUID).

On a fourth ESXi host in a different vSphere cluster (it doesn’t have to be in a cluster or even in the same vCenter, but I thought "why not? put it in a cluster anyway"), I created a VMkernel port using the vSphere Web Client and enabled it for Virtual SAN. I could’ve done this via the CLI but I was feeling lazy. I then SSH’d into the fourth ESXi host and ran the join command:

I gave it a minute and sure enough the fourth host had joined the Virtual SAN cluster. Just to be sure, I did the same on a fifth host. Below is the output of esxcli vsan cluster get on the fifth host.

Not only did the two hosts from a different vSphere cluster join the Virtual SAN cluster, but one of them, the fifth one above, actually became the Virtual SAN Master for the Virtual SAN cluster.
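The two CLI steps above (read the cluster UUID on an existing node, then join from the outside host) can be sketched in Python. This is only an illustration under assumptions: the sample output is a trimmed, made-up stand-in for what esxcli vsan cluster get prints, the all-zero UUID is a placeholder, and I am assuming the cluster UUID is the value shown as Sub-Cluster UUID and that the join is done with esxcli vsan cluster join -u. The helper names are hypothetical.

```python
import shlex

# Made-up, trimmed stand-in for `esxcli vsan cluster get` output.
# The UUID is a placeholder, not a value from a real cluster.
SAMPLE_OUTPUT = """\
Cluster Information
   Enabled: true
   Local Node State: MASTER
   Sub-Cluster UUID: 00000000-0000-0000-0000-000000000000
"""


def parse_esxcli_fields(output):
    """Collect the 'Key: Value' lines of esxcli-style output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition(": ")
        if sep:  # skip headings such as 'Cluster Information'
            fields[key] = value.strip()
    return fields


def build_join_command(cluster_uuid):
    """Build the join command to run on the host that should join (assumed syntax)."""
    return shlex.join(["esxcli", "vsan", "cluster", "join", "-u", cluster_uuid])


if __name__ == "__main__":
    fields = parse_esxcli_fields(SAMPLE_OUTPUT)
    print(build_join_command(fields["Sub-Cluster UUID"]))
```

The same parser can be pointed at the output from the joining host afterwards to check its Local Node State.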

Back in vCenter…let’s just say that it wasn’t happy. The first thing I noticed is that vCenter spit out an error message under Network status, as shown in the figure below.

From what I’ve been able to gather, this is a generic error message that means something like “Something is going on with the Network and I, vCenter, have no clue what it is. Go grab the Network Administrator and the two of you go do some Network troubleshooting. Don’t forget to check the Network Multicast configuration while you are at it”. By the way, you will also lose some of the Virtual SAN monitoring capabilities that vCenter provides.

Now, looking again at the picture above, notice that the Total capacity of Virtual SAN datastore is 59.23 GB. That’s because each of the 3 original Virtual SAN nodes (the 3 that are in the same vSphere cluster) is contributing 20 GB (plus 4 GB SSD) to the Virtual SAN Datastore. And more importantly, the two new nodes (the fourth and the fifth) were able to access this new Datastore, a fact that will most definitely intrigue the Security Administrator.

Here is a screenshot of the fifth host’s Datastores.

So that got me thinking (and it didn’t hurt): Is there a way to make the fourth node provide storage capacity to the Virtual SAN Datastore? It turns out there is, which will probably please the Storage Administrator (AGAIN, this is UNSUPPORTED by VMware). First, identify the disks (at least one SSD) in the host that you want to add to the Virtual SAN Datastore and run this command (where -s is the SSD disk and -d is the HDD disk) to create the Virtual SAN disk group in the host:

You won’t get an acknowledgement that the command was successful, so run this command to confirm the Virtual SAN disk group was created:

Back again in vCenter, it will report the storage we just added by adding it to the 59.23 GB from earlier. Now the Total capacity of Virtual SAN datastore reads 78.97 GB:
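The jump from 59.23 GB to 78.97 GB is consistent with the fourth host contributing one more disk of the same size as the original three. Here is a quick sanity check in Python; the equal-disk-size assumption is mine (it is not stated above), and the function name is just for illustration.

```python
def projected_total_gb(total_gb, disks, added_disks=1):
    """Estimate the new datastore total, assuming every capacity disk
    contributes the same usable amount as the existing ones."""
    per_disk_gb = total_gb / disks  # about 19.74 GB usable per 20 GB disk
    return round(per_disk_gb * (disks + added_disks), 2)


if __name__ == "__main__":
    # 3 original disks reporting 59.23 GB, plus 1 disk from the fourth host
    print(projected_total_gb(59.23, 3))  # 78.97
```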

And for the record, all disks in the Virtual SAN Datastore were usable. I created a storage policy of Host Failures to Tolerate 2, added disks in the fifth host to the Virtual SAN Datastore (since a Host Failures to Tolerate of n uses the formula 2n + 1 to determine how many hosts are required to provide storage to the Virtual SAN Datastore, which comes to 5 hosts for n = 2), assigned a Virtual Machine to it and sure enough…

In the image above we can see the consumed space in the Virtual SAN Datastore is greater than 0. I can only see the used storage for the Virtual SAN Disk Groups seen by vCenter (the original 3 nodes). So of the 3.32 GB currently used in the Virtual SAN Datastore, about 2.2 GB (1.1 GB + 372 MB + 768 MB) is being consumed in the original 3-node Virtual SAN cluster and the remainder (roughly 1.1 GB) is being stored on the two other ESXi hosts.
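Going back to the 2n + 1 rule behind that storage policy, here is a tiny Python sketch of why the fifth host had to contribute disks before Host Failures to Tolerate 2 could be satisfied. The function name is made up for illustration; only the formula comes from the post.

```python
def hosts_required(failures_to_tolerate):
    """Hosts that must contribute storage for a given Host Failures to
    Tolerate value n, using the 2n + 1 rule."""
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be non-negative")
    return 2 * failures_to_tolerate + 1


if __name__ == "__main__":
    for n in (1, 2):
        print(f"Host Failures to Tolerate {n}: {hosts_required(n)} hosts")
```

With n = 2 the rule asks for 5 hosts, so the original 3 nodes plus the fourth host were one short.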

Let me stop here since the lunch break is over and everyone must get back to work.

Elver’s Opinion: The vSphere Administrator thinks this is awesome. Although not supported by VMware in Virtual SAN 6.1, the foundation is already embedded in the Virtual SAN code to support a Multi-vSphere Clusters Virtual SAN deployment. One application that quickly comes to mind: a single dedicated Virtual SAN cluster for the vSphere environment’s storage needs that can be quickly and relatively inexpensively scaled up.

For the Storage Administrator, it is a mixed bag. The good of it is that his job just got so much easier. Now he will have some time back to be more efficient at his day-to-day job. He can now go on vacation without having to announce his plans 8 months in advance and be concerned that the sky will fall when he is not at the office. The bad of it is that his job just got so much easier. Unless he figures out a way to continue being productive and providing value to the business, his role might be chopped up and distributed among other Data Center Administrators. For reference cases of this, google Voice Engineer circa 2003 or Server Administrator circa 2007.

The Security Administrator, on the other hand, is probably concerned. But she is not concerned about job security. She just witnessed how two potentially rogue entities (the two ESXi hosts in their own little cluster) were able to access company data in the Virtual SAN Datastore with nothing more than the Virtual SAN cluster UUID. What gives her some peace of mind is that you would need access to one of the Virtual SAN nodes to obtain the Virtual SAN cluster UUID. However, she knows she has some work ahead of her to tighten security around Virtual SAN, as Virtual SAN is an "Infrastructure-Consumer Solution" that does not have built-in security access-restriction mechanisms.

