Amazon Web Services – A Technical Primer for VMware Admins
Yes, yes, I know. Long time no blog. Still, isn’t it meant to be about quality and not quantity? That could spawn a million dirty jokes, so let’s leave it there. So to the matter in hand. Recently I’ve been working on a project that’s required me to have a much closer look at Amazon Web Services (or AWS for the lazy). I think, probably like most, I’d heard the name and in my head just thought of it as web servers in the cloud and not much more than that. How wrong I was.
However, like most “cloud” concepts, because ultimately it’s based on the idea of virtualisation, it’s actually not that hard to get your head around what’s what and how AWS could be a useful addition to your armoury of solutions for all sorts of use cases. So with that in mind, I thought it would be really useful to put together a short article for folks who are dyed in the wool vSphere admins who might need to add an AWS string to their bow at some time in the near future. Let’s get started.
As you can see from the picture below, logging into the AWS console gives us a bewildering array of services from which to pick, most of which have exotic and funky names such as “Elastic Beanstalk” and “Route 53”. What I’m going to try and do here is to separate out (at a high level) the services AWS offers and how they kind of map into a vSphere world.
The AWS Console
Elastic Compute Cloud (EC2)
Arguably the main foundation of AWS, EC2 is the infrastructure as a service element. Herein comes the first of the differences. We no longer refer to the VMs as VMs, but we now refer to them as “instances”. In much the same way we might define it in vRealize or vCD, there are sizes of instances, from nano up to 8 x extra large, which should cater for most use cases. Each instance type has varying sizes of RAM, numbers of vCPUs and also workload optimisations, such as “Compute Optimised” or “Storage Optimised”.
Additionally, instance images are referred to as AMIs, which stands for “Amazon Machine Image”. Similar in concept I suppose to an OVA or OVF. It’s a pre-packaged virtual machine image that can be picked from the service catalog to provision services for end users. As you might expect, AMIs include both Windows and Linux platforms and there is also an AWS Marketplace from where you can trial or purchase pre-packaged AMIs for specific applications or services. In the example screen shot below, you can see that when we go into the “Launch Instance” wizard (think “create a new VM”) we can choose from both Amazon’s service catalog but also the AWS Marketplace. Why re-invent the wheel? If the vendor has pre-packaged it for you, you can trial it and also use it on a pay-as-you-go basis.
As you can see above, there is a huge amount from which to pick, and it’s very much the same in concept as the VMware Solution Exchange. What’s notable here is the billing concept. Whereas with vSphere we might be thinking in terms of a one-off cost for a licence, with AWS we need to start thinking about recurring monthly billing cycles, which will also dictate whether or not AWS is suitable and represents value for money.
You can also take an existing AMI, perform some customisation on it (install your application for example) and then save this as an AMI that you can use to create new instances, but these AMIs are only visible to you, not others. I suppose the closest match to this is a template in vCenter. So again, many similarities, just different terminology and slight differences in workflows etc.
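To make the “Launch Instance” idea a bit more concrete, here is a minimal sketch of what the wizard boils down to when driven from a script. The parameter names (`ImageId`, `InstanceType`, `MinCount`, `MaxCount`) are the ones the EC2 RunInstances call expects; the AMI ID and instance size are made-up placeholders, and the helper name `build_launch_request` is purely my own invention for illustration.

```python
# Sketch only: the AMI ID and instance type used below are placeholders.

def build_launch_request(ami_id, instance_type, count=1):
    """Return keyword arguments shaped for the EC2 RunInstances call."""
    return {
        "ImageId": ami_id,              # the AMI, i.e. the "template" to clone
        "InstanceType": instance_type,  # the size, e.g. a small general purpose type
        "MinCount": count,              # launch exactly `count` instances
        "MaxCount": count,
    }


request = build_launch_request("ami-12345678", "t2.micro")
print(request)

# Assuming boto3 is installed and credentials are configured, the request
# could then be sent with something like:
#   import boto3
#   boto3.client("ec2").run_instances(**request)
```

Swap in the ID of an AMI you have saved yourself and you have the rough equivalent of deploying from your own vCenter template.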
It’s also worth adding at this point, before I move properly onto storage, that the main storage platform is called EBS, or Elastic Block Store. It’s Elastic because it can expand and contract, it’s Block because, well, it’s block level storage (think iSCSI, SAN etc.) and Store because, well, it’s storage. At this level, you don’t deal with LUNs and datastores, you just deal with the concept of an unlimited pool of storage, albeit with different definitions. In this sense, it’s similar to the vSphere concept of Storage Profiles.
Storage Profiles can help an administrator place workloads on the appropriate type of storage to ensure consistent and predictable performance. In AWS’s case, you have a choice of three – General Purpose, Provisioned IOPS and Magnetic. More on this in the storage section, but remember that EBS storage is persistent, so when an instance is restarted or powered off, the data remains. You can also add disks to an instance using EBS, for example if you wanted to create a software RAID within your instance.
You may also see references to Instance Storage. This is basically using storage on the host itself, rather than enterprise grade EBS storage. This type of storage is entirely transitory and only lasts for the lifetime of the instance. Once the instance is powered off or destroyed (terminated in AWS parlance), the storage goes with it. Remember that!
One of the good things about EBS is that, in the main, SSD storage is used. General Purpose is SSD and is used for exactly that. Provisioned IOPS is also SSD and is used mainly for high I/O workloads such as databases and messaging servers. Magnetic is spinning disk, so the cheapest of the cheap and used for workloads with modest I/O requirements.
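If it helps to think of the three EBS types the way you would a set of Storage Profiles, here is a rough sketch of a picker. The identifiers `gp2`, `io1` and `standard` are the API names for General Purpose, Provisioned IOPS and Magnetic respectively; the IOPS thresholds are numbers I have plucked out of the air for illustration, not AWS guidance.

```python
# API identifiers for the three EBS volume types discussed above.
VOLUME_TYPES = {
    "General Purpose": "gp2",
    "Provisioned IOPS": "io1",
    "Magnetic": "standard",
}


def pick_volume_type(expected_iops):
    """Pick a volume type from a rough IOPS estimate.

    The thresholds are illustrative assumptions only.
    """
    if expected_iops > 3000:
        return VOLUME_TYPES["Provisioned IOPS"]  # databases, messaging servers
    if expected_iops > 100:
        return VOLUME_TYPES["General Purpose"]   # everyday SSD-backed workloads
    return VOLUME_TYPES["Magnetic"]              # modest I/O, lowest cost


for workload, iops in [("database", 5000), ("web server", 500), ("archive", 50)]:
    print(workload, "->", pick_volume_type(iops))
```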
So to another service with an exotic hipster name, Amazon S3. This stands for Simple Storage Service and is Amazon’s main storage service. It differs from EBS in that it’s an object based storage service rather than a block based one, and block storage is, I suppose, more like what vSphere admins are used to.
Amazon refers to S3 locations as “buckets”, and it’s easy to think of them as a bunch of folders. You can have as many buckets as you like and again this storage is persistent. You can upload and download content, set permissions and even publish static websites from an S3 bucket. It’s also worth noting that bucket contents are highly available by way of replication across the region’s Availability Zones, but more about that later. By using IAM (Identity and Access Management) you can allow newly provisioned instances to copy content from an S3 bucket, say into a web server content directory, when they are provisioned, so you are good to go as soon as the instance is.
You can also have versioning, multi-factor authentication and lifecycle policies, but that’s beyond the scope of this article.
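For the IAM piece mentioned above, the permission itself is just a JSON policy document. Below is a minimal sketch of a read-only policy for a single bucket, built in Python so it is easy to tweak. The bucket name `example-web-content` and the function name are hypothetical; the `Version`, `Effect`, `Action` and `Resource` fields and the two S3 actions are standard IAM policy elements.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build an IAM policy document allowing list and read access to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Listing applies to the bucket itself.
            {"Effect": "Allow", "Action": ["s3:ListBucket"], "Resource": bucket_arn},
            # Reading applies to the objects inside the bucket.
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": f"{bucket_arn}/*"},
        ],
    }


policy = read_only_bucket_policy("example-web-content")
print(json.dumps(policy, indent=2))
```

Attach something like this to a role that your instances assume and they can pull their web content at boot without any credentials being baked into the AMI.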
It’s not easy to map S3 to a vSphere concept, so we’ll leave it here for now, but at least you know in broad terms what S3 is.
One thing that AWS does very well (or very frustratingly, depending on your viewpoint) is hiding the complexity of networking and simplifying into a couple of key concepts and wizards.
In vSphere, we have the concepts of vSwitches, VDSes, port groups, VLAN tags, etc. In AWS, you pick a VPC (more on that later), a subnet and whether or not you want it to have an internet facing IP address. That’s pretty much it.
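To show just how little there is to it, here is a sketch of the network part of a launch request. A subnet belongs to exactly one VPC, so picking the subnet effectively picks the VPC too, which leaves two settings. The field names are those used by the EC2 RunInstances call; the subnet ID and the helper name `network_settings` are placeholders of my own.

```python
def network_settings(subnet_id, public_ip):
    """Build the NetworkInterfaces section of an EC2 launch request."""
    return [
        {
            "DeviceIndex": 0,                        # the instance's first (primary) NIC
            "SubnetId": subnet_id,                   # the subnet, which implies the VPC
            "AssociatePublicIpAddress": public_ip,   # internet facing address or not
        }
    ]


settings = network_settings("subnet-0abc1234", public_ip=True)
print(settings)
```

Compare that with building a port group, VLAN tag and uplink configuration and you can see why I say the complexity is hidden.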
In terms of configuring the networking environment, when you sign up to AWS you get a default VPC. This stands for “Virtual Private Cloud” and is what it says it is – your own little bubble inside of AWS that nobody can see but you (analogous to a vCloud Director Organisational DC). You can add your own VPCs (up to a limit of 5, for now) if you want to silo off different departments or lines of business, for example. Think of a VPC as your vCenter view, but without clusters. VPCs operate pretty much on a simple, flat management model. If you have a PluralSight sub, it’s a good idea to check out Nigel Poulton’s VPC videos for a much better insight on how this all works.
VPCs don’t talk to each other by default, but you can link them together (and link VPCs from other AWS accounts if you want to). Again, it’s difficult to map this to a vSphere concept, but this helps explain what a VPC is.
Each instance will get an internal RFC 1918 type network address (say 10.x or 192.168.x, depending how CIDR blocks are configured) and those instances requiring external IP addresses will have this added transparently, so basically NAT because the VM does not know about the external facing address. I know it sounds a bit complicated, but actually it’s not, I’m just not good at explaining it!
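If the CIDR talk is a bit abstract, Python’s standard `ipaddress` module makes it easy to play with. The sketch below assumes a VPC using the 10.0.0.0/16 block (my choice for illustration, not something AWS mandates) and carves the first few /24 subnets out of it, then checks which addresses fall in private (RFC 1918) space.

```python
import ipaddress

# Assumed VPC address range for illustration.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC range into /24 subnets and keep the first three.
subnets = list(vpc.subnets(new_prefix=24))[:3]
for subnet in subnets:
    print(subnet, "holds", subnet.num_addresses, "addresses")

# The internal address an instance sees is private; the external one is
# mapped to it by NAT and never appears inside the guest.
internal = ipaddress.ip_address("10.0.1.25")
external = ipaddress.ip_address("8.8.8.8")  # stand-in for any public address
print(internal, "private?", internal.is_private)
print(external, "private?", external.is_private)
print(internal, "in second subnet?", internal in subnets[1])
```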
One last concept to cover is Availability Zones (AZ). Generally there are three per region, and right now there are 11 regions worldwide. You can put workloads wherever you like, but if you want to add things like Elastic Load Balancer, you can’t just scatter gun your instances all over the planet.
An AZ in its most basic sense is a physical data centre, so easy to understand from a vSphere perspective. However, in AWS, as there are three AZs per region connected together via high speed, low latency network links, services such as S3 and Elastic Load Balancer (ELB) can take advantage of this. The region is the logical boundary for these services and means that S3 data is replicated around all AZs in the region and load balanced services that sit behind a single ELB can be placed in all three AZs if need be. All of this is configured by default; you don’t need to do anything yourself to let this magic happen.
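Where your own instances land is still your call, though, so if you want a load balanced service to survive the loss of a data centre you need to spread the instances out yourself. Here is a simple round-robin sketch of that idea. The zone names follow the usual region-plus-letter pattern, but both they and the instance names are just examples.

```python
from itertools import cycle


def spread_across_zones(instance_names, zones):
    """Assign instances to Availability Zones in round-robin order."""
    # cycle() repeats the zone list forever; zip() stops at the last instance.
    return dict(zip(instance_names, cycle(zones)))


zones = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]
placement = spread_across_zones(["web1", "web2", "web3", "web4"], zones)
for name, zone in placement.items():
    print(name, "->", zone)
```

With four web servers and three zones, the fourth wraps back round to the first zone, which is roughly what you would aim for with anti-affinity across sites in vSphere.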
Managing AWS from vCenter
In all the AWS concepts I’ve mentioned so far, I’ve discussed how things are done from the AWS web console. It’s also possible to manage and migrate VMs to AWS from vCenter Server; this is done with the AWS Management Portal for vCenter. I haven’t yet tried it, but when I do, I’ll come back and write an article about it. This is a key piece of the puzzle though, as it allows “single pane of glass” management for vSphere and AWS.
Hopefully this has been a useful primer in mapping AWS concepts to vSphere ones. There are lots of services and constructs that are unique to AWS that don’t necessarily map back, but it’s still important to know what they are. I’ve summarised some of the mappings in the table below (not all of them are directly 1-1 in concept). Hopefully I can add more articles in the coming weeks.
Availability Zone = Data Centre (physical)
VPC = Datacenter (vCenter logical)
EBS = Storage Profiles (similar, but not exactly the same)
Instance = Virtual Machine
AMI = OVA/OVF