Ten years ago, we were asking how to connect SAN/NAS devices to XenServer, and how to deal with high availability with such devices. We talked about logical volumes, how to make them look like local physical disks to XenServer, and which filesystem is best for containing a XenServer Storage Repository.
It goes without saying that there’s been an explosion of “cloud” technologies in the last decade, which has also affected how we store our data, and the price associated with that.
In this series of blog posts, I will explain some of these new technologies, and how I leveraged some of them as a virtual machine image store. I’ll also share some of the code I wrote, and some of the lessons I learned along the way.
How the Demo Center started: Here in the Citrix Demo Center, our customers can order demos on demand. For each order we build, from the ground up, a XenServer with that demo’s bits, running in the data center of the user’s choice. Orders are placed on a website where Citrix customers can choose a particular demo focus.
While thinking about the initial concept, we had visions of salespeople or SEs being able to demonstrate products to a potential customer with nothing but a laptop or even a tablet and the connection instructions we provided them, and give a full-blown demo. They could also extend an invitation to that demo to their key contact, which could help Citrix go “viral” within that organization, because a demo from the Demo Center shows, in explicit detail, how Citrix technology works in the cloud.
One of our unusual cloud requirements was an elastic supply of hardware for use with our hypervisor of choice: XenServer. We partner with IBM Cloud (SoftLayer), which maintains the hardware and, through its excellent API, allows us to order, cancel, and reload specific hardware configurations along with the correct version of XenServer needed for the demo. (A tip of the hat to the WW Technical Sales Enablement Development team, who have put a massive amount of work into the front-end and backend communications with SoftLayer, creating a world-class experience for our customers.)
How it works: When a Demo Center customer orders an “admin” demo, SoftLayer presents the correct XenServer on the correct hardware to our backend systems. Our backend systems then upload some initialization files to that server, and the server begins to build itself. These small files, sketched below, are nothing more than:
- Information about the server, including networks assigned and the data center
- Information about the demo, and where to get the bits to build the demo, and
- A small script that boots the build into existence.
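To make that concrete, here is a minimal sketch of what such initialization files might look like, assuming a simple config file plus a bootstrap script; the file names, variables, and backend URL are hypothetical, not our actual ones.

```bash
#!/bin/bash
# bootstrap.sh -- hypothetical "boot the build into existence" script.
# It assumes a companion /root/demo.conf dropped by the backend, e.g.:
#   DATACENTER="ams01"
#   DEMO_NAME="cws"
#   DEMO_VERSION="7.6"
#   BITS_URL="https://backend.example.com/demos/cws/7.6"
set -euo pipefail
source /root/demo.conf

# Fetch the full build engine from the backend, then hand control to it.
curl -fsSL "${BITS_URL}/build-engine.tar.gz" -o /root/build-engine.tar.gz
tar -xzf /root/build-engine.tar.gz -C /root
exec /root/build-engine/build.sh "${DEMO_NAME}" "${DEMO_VERSION}"
```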
Our first thought when designing this provisioning process was to have our backend system communicate over xapi (the XenServer API) to control every aspect of the build. But then we realized that the server was capable of building itself, checking in with our backend every now and then. This meant we would make full use of the processing power of the server itself, and that the model would scale in the cloud.
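A minimal sketch of that check-in pattern follows, assuming a hypothetical status endpoint on our backend; the real build code is more involved, but the point is that the server reports progress itself rather than being driven remotely over xapi.

```bash
#!/bin/bash
# check-in.sh -- hypothetical periodic status report to the backend.
BACKEND="https://backend.example.com/api/servers"   # hypothetical endpoint
# The host's installation UUID is recorded in the standard XenServer inventory.
SERVER_ID=$(awk -F"'" '/INSTALLATION_UUID/ {print $2}' /etc/xensource-inventory)

while true; do
    PHASE=$(cat /root/build-phase 2>/dev/null || echo "starting")
    # Report where we are in the build; the backend never has to drive us.
    curl -fsS -X POST "${BACKEND}/${SERVER_ID}/status" -d "phase=${PHASE}" || true
    sleep 300
done
```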
When we build out a XenServer, there are two kinds of data it needs in order to build the demo (an import example follows the list):
- The demo metadata, which is a small set of files that describes and controls the build-out and start-order of the virtual machines.
- Access to the virtual machine bits themselves, which are nothing more than compressed archives (i.e. xva files; the same as those produced by the XenCenter export function).
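For the second kind of data, importing an xva archive on XenServer is a one-liner with the stock xe CLI. The path below is a hypothetical placeholder for a file the build code has already downloaded.

```bash
# Import a previously exported VM archive into the local XenServer.
# xe vm-import prints the UUID of the newly created VM.
VM_UUID=$(xe vm-import filename=/var/opt/demo/cws-controller.xva)

# The build code can then rename the VM, wire up its networks, and start it
# in the order the demo metadata dictates.
xe vm-start uuid="${VM_UUID}"
```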
The latter can consume a large amount of storage space, depending on the specific demo being built. For example, as of this writing, our most popular demo (CWS – Citrix Workspace Suite) runs at around 150GB for its working set. We also like to keep the last three versions of any particular demo.
Solving the storage issue: In the beginning we solved the storage issue by hosting the bits on a mega-NFS server in our home data center of Dallas. Each XenServer you order from SoftLayer comes with a public and a private address, the private address being on their ultra-fast backend network. Private network traffic comes at no cost, so wiring the NFS storage server to our XenServers over the private network was a no-brainer. One of the things we offer in the Demo Center is to let customers who have their own SoftLayer account use our services at no cost and with no time limits; they pay SoftLayer directly for the XenServer host the demo uses. On their backend network, SoftLayer isolates customer accounts from each other by assigning a VLAN (or VLANs) to each account. For our storage device, this meant that only the original account that created the NFS server could “see” that service. This problem could be overcome, but not without introducing serious scaling issues on the switching infrastructure.
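For context, attaching an NFS export to a XenServer as a Storage Repository is a single xe command; the server address and export path below are hypothetical stand-ins for the Dallas NFS box on the private network.

```bash
# Create a shared NFS Storage Repository on this XenServer host.
# The 10.x address and export path are hypothetical placeholders.
xe sr-create \
    name-label="Demo Center NFS" \
    type=nfs \
    shared=true \
    content-type=user \
    device-config:server=10.0.20.5 \
    device-config:serverpath=/export/democenter
```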
Also, without code in place to synchronize the bits to other data centers, it quickly became apparent that this was not a scalable solution. For example, dragging the bits from Dallas to Amsterdam to build a server there cost us time, and time is directly related to the cost of the server: the more time you spend building a server, the costlier providing demos-as-a-service becomes. In truth, I had coded the build software to be storage independent, and NFS was used experimentally while we got the bugs out; we never really went live with it.
Sitting on top of the private SoftLayer network is their management layer, which is accessible from any VLAN owned by a particular account and is used for things such as licensing services. Luckily for us, SoftLayer introduced a BitTorrent (BT) seeder service in all their major data centers, and all we needed to do was integrate it into our build code. We had the added benefit that every demo XenServer itself became a BT seeder: as each server carrying a particular demo torrent came up, it was added to the BT mesh, reducing the time it took other XenServers to download the demo bits. This served us well for many years, but there were several downsides:
- The core BT service (the tracker) was not elastic – it didn’t grow and shrink on demand as any true cloud service should. Because it ran on a single server, maintenance or downtime could seriously impact our service, and adding more servers, adding more disk, or updating the OS was a manual task.
- We underestimated the amount of disk storage we needed, and were soon consumed with pruning; we wanted to keep the last three versions of any particular demo. We were also sharing this service with the lab team, whose XenServer-side build code is of the same lineage as ours.
- It was hard to give an accurate build time. A brand-new version of a demo torrent had not yet formed into a full mesh, and a new version was probably at its greatest demand just after release, so the initial builds using the new torrent were slow (sometimes 12 to 15 hours).
- Scale was an issue when facilitating demos and labs for special events such as Summit and Synergy. We had to start building servers weeks in advance so as not to overwhelm the BT infrastructure by huddling the builds too closely together. Again, there’s a cost implication to that extra time.
Object Storage: I happened to attend an OpenStack seminar a couple of years ago, where I was introduced to Swift (also known as OpenStack Object Storage), described as “…open source software designed to manage the storage of large amounts of data cost-effectively on a long-term basis across clusters of standard server hardware.” http://searchstorage.techtarget.com/definition/OpenStack-Swift
This intrigued me, because the storage array was no longer some bespoke, high-end, purpose-built kit; it was an array of computers with plain (or sophisticated) disks attached. This makes it possible for a service provider to build an array from its standard equipment inventory. Indeed, in the seminar we built an array using a bunch of laptops.
In essence: Resilience comes from Object Storage making multiple copies, and from internally splitting larger objects into 5GB chunks (which it reassembles on a read). Each object is stored in a container, and there is no concept of a folder/directory, but object names can contain delimiters such as ”/”. (This sounds simple, but trust me, in practice it will mess with your mind.) The objects in containers are tracked via a directory database, which is itself kept in Object Storage. There is of course a lot more detail that I won’t go into here, but for the purposes of this blog it’s sufficient to know the following (a concrete example follows the list):
- The interface to the SoftLayer implementation of Object Storage is through a RESTful API, and so all commands and data are accessible through https.
- While there are device drivers out there that can use Object Storage like a block device, Object Storage is serial in that there is no random access to any particular part of an object. You write and read objects as a whole.
- We have a well-known container called ‘democenter-files’. Under that we have a pseudo folder structure to help segregate the different types of objects.
- For this implementation the Object Storage is wholly contained within a SoftLayer data center; it’s up to us to replicate our objects to other data centers.
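To ground the RESTful point, here is a minimal sketch of talking to SoftLayer Object Storage with curl, using the standard Swift v1 calls. The data-center endpoint, credentials, and object names are hypothetical placeholders; only the ‘democenter-files’ container name comes from our setup.

```bash
#!/bin/bash
# Hypothetical credentials and endpoint; the flow is standard Swift v1 auth.
SL_USER="SLOS12345-2:demo.user"
SL_KEY="0123456789abcdef"
AUTH_URL="https://dal05.objectstorage.softlayer.net/auth/v1.0"

# 1. Authenticate: the response headers carry a token and our storage URL.
HDRS=$(curl -fsS -i "${AUTH_URL}" \
            -H "X-Auth-User: ${SL_USER}" \
            -H "X-Auth-Key: ${SL_KEY}")
TOKEN=$(echo "${HDRS}"   | grep -i '^X-Auth-Token:'  | awk '{print $2}' | tr -d '\r')
STORAGE=$(echo "${HDRS}" | grep -i '^X-Storage-Url:' | awk '{print $2}' | tr -d '\r')

# 2. Upload an object. The "/" in its name is only a naming convention;
#    there are no real folders, just pseudo paths inside the container.
curl -fsS -X PUT "${STORAGE}/democenter-files/xva/cws/controller.xva" \
     -H "X-Auth-Token: ${TOKEN}" \
     -T /var/opt/demo/controller.xva

# 3. List only the "xva/cws/" pseudo folder using prefix and delimiter.
curl -fsS "${STORAGE}/democenter-files?prefix=xva/cws/&delimiter=/" \
     -H "X-Auth-Token: ${TOKEN}"
```

Note that the upload and the read both move whole objects over https, which matches the “serial, no random access” point above.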
In the next installment, I’ll start diving into the data structures and code I used to build an Object Storage library using bash scripts for XenServer.