05-05-17

Event Review – Google Cloud Next London – Day One

I was fortunate enough to spend the last couple of days at the Google Cloud Next London event at the ExCeL centre and I have a few thoughts about it I’d like to share. The main takeaway I got from the event is that while Google Cloud Platform (GCP) may not have the breadth of services that AWS or Azure do, it is not a “me too” public cloud hyperscaler.

While some core services such as cloud storage, VPC networking, IaaS and databases are available, there are some key differences with GCP that are worth knowing about. My interpretation of what I saw over the couple of days was that Google have taken some of the core services they’ve been delivering for years, such as Machine Learning, Maps and Artificial Intelligence, and are presenting them as APIs for customers to consume within their GCP account.

This is a massive difference from what I can see with AWS and Azure. Sure, there are components of the above available in those platforms, but these are services which have been at the heart of Google’s consumer services for over a decade and they have incredible power. In terms of market size, both AWS and Azure dwarf GCP, but don’t be fooled into thinking this is not a priority area for Google, because it is. They have ground to make up, but they have very big war chests of capital to spend and also have some of the smartest people on the planet working for them.

To start with, in the keynote there was the usual run-down of event numbers, but the one that was most interesting for me was that there were 4,500 delegates, up a whopping 300% on last year, and 67% of registered attendees described themselves as developers. Google Cloud is made up of GCP, G Suite (Gmail and the other productivity apps), Maps and APIs, Chrome and Android. Google Cloud provides services to 1 billion people worldwide per day. Incredible!

Gratuitous GC partner slide

There was the usual shout out of thanks to the event sponsors. One thing I did notice in contrast to other vendor events I’ve been to was the paucity of partners in the exhibition hall. There were several big names, including Rackspace, Intel and Equinix, but obviously building a strong partner ecosystem is still very much a work in progress.

We then had a short section with Diane Greene, who many industry veterans will know as one of the founders of VMware. She is now Senior VP for Google Cloud and it’s her job to get Google Cloud better recognition in the market. Something I found quite odd about this section is that she seemed quite ill-prepared for her content and brought paper notes with her on stage, which is very unusual these days. There were several quite long pauses and it seemed very under-rehearsed, which surprised me. Normally keynote speakers are well versed and very slick.

GDPR and GC investment

Anyway, moving on to other factoids – Greene committed Google to be fully GDPR compliant by the time it becomes law next May. She also stated there has been $29.4 billion spent on Google Cloud in the last three years. The Google fibre backbone carries one third of all internet traffic. Let that sink in for a minute!

There is ongoing investment in the GC infrastructure and when complete in late 2017/early 2018, there will be 17 regions and 50 availability zones in the GC environment, which will be market leading.


GCP regions, planned and current

Google Cloud billing model

One aspect of the conference that was really interesting was the billing model for virtual machines. In the field, my experience with AWS and Azure has been one of pain when trying to determine the most cost-effective way to provide compute services. It becomes a minefield of right-sizing instances, purchasing reserved instances, deciding what you might need in three years’ time, and looking at Microsoft enterprise agreements to try to leverage the Hybrid Use Benefit. Painful!

The GCP billing model is one in which you can have custom VM sizes (much like we’ve always had with vSphere, Hyper-V and KVM), so there is less waste per VM. Also, the longer you use a VM, the cheaper it becomes (this is referred to as the sustained usage discount). Billing is also done per minute, in contrast to AWS and Azure, which bill per hour – so there, even if you only use part of an hour, you still pay for the full hour.

It is estimated that 45% of public cloud compute spend is wasted, and the GC billing model should help reduce this figure. You can also change VM sizes at any time, and the sustained usage discount can result in “up to” 57% savings. Worth looking at, I think you’ll agree.
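
To make that a bit more concrete, here’s a rough back-of-the-envelope sketch of how per-minute billing plus a tiered sustained usage discount plays out. The tier percentages below are only illustrative of the published model at the time (each successive quarter of the month billed at a lower rate), so treat the numbers as an example rather than a price list.

```typescript
// Rough sketch of GCP-style per-minute billing with a sustained usage discount.
// Tier rates are illustrative, not a quote from the price list.

const HOURS_IN_MONTH = 730;

// Fraction of the month used -> multiplier applied to the base hourly rate.
const SUSTAINED_USE_TIERS: Array<{ upTo: number; rate: number }> = [
  { upTo: 0.25, rate: 1.0 },
  { upTo: 0.5, rate: 0.8 },
  { upTo: 0.75, rate: 0.6 },
  { upTo: 1.0, rate: 0.4 },
];

function monthlyCost(baseHourlyRate: number, minutesUsed: number): number {
  const hoursUsed = minutesUsed / 60; // per-minute billing, no rounding up to the hour
  let cost = 0;
  let hoursAccounted = 0;

  for (const tier of SUSTAINED_USE_TIERS) {
    const tierCeiling = tier.upTo * HOURS_IN_MONTH;
    const hoursInTier = Math.min(hoursUsed, tierCeiling) - hoursAccounted;
    if (hoursInTier <= 0) break;
    cost += hoursInTier * baseHourlyRate * tier.rate;
    hoursAccounted += hoursInTier;
  }
  return cost;
}

// A VM left running all month effectively pays around 70% of the list price.
console.log(monthlyCost(0.05, HOURS_IN_MONTH * 60)); // full month
console.log(monthlyCost(0.05, 90));                  // 90 minutes, billed as 1.5 hours
```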

Lush from the UK were brought on stage to discuss their migration to GCP, which they completed in 22 days, and they calculate a 40% saving on hosting charges per year. Not bad!

Co-existence and migration

There has also been a lot of work done within GCP to support Windows-native tools such as PowerShell (there are GCP cmdlets) and Visual Studio. There are also migration tools that can live-move VMs from vSphere, Hyper-V and KVM, as you’d probably expect. Worth mentioning too at this point that GCP has live migration for running VMs, as per vSphere and Hyper-V, which is unique to GCP among the public clouds right now, certainly to the best of my knowledge.

G Suite improvements

Lots of work has been done around G Suite, including improvements to Drive to allow for team sharing of documents, and the use of predictive algorithms to put documents at the top of the Drive page, one click away, rather than having to search through folders for the document you’re looking for. Google claim a 40% hit rate for the suggested documents.

There are also add-ons from the likes of QuickBooks, where you can raise an invoice directly within Gmail and reconcile it when you get back to QuickBooks. Nice!

Encryption in the cloud

Once the opening keynote wrapped, I went to my first breakout session, which was about encryption within GC. I’m not going to pretend I’m an expert in this field, but Maya Kaczorowski, a security PM at Google, clearly is. The process of encrypting data within the GC environment can be summarised as follows:

  • Data uploaded to GC is “chunked” into small pieces (variable size)
  • Each chunk is encrypted and has its own key
  • Chunks are written randomly across the GC environment
  • Compromising a single chunk is effectively useless, as you would still need all the other chunks (and their keys)
  • There is a strict hierarchy to the Key Management Service (shown below)

Google key hierarchy

A replay of this session is available on YouTube and is well worth a watch. Probably a couple of times so you actually understand it!
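
To make the idea a bit more tangible, here’s a minimal envelope-encryption sketch using Node’s crypto module. This is emphatically not Google’s implementation – just the general pattern of each chunk getting its own data encryption key (DEK), which is then wrapped by a higher-level key encryption key (KEK) from the key hierarchy.

```typescript
// Minimal sketch of per-chunk envelope encryption, NOT Google's actual implementation.
import { randomBytes, createCipheriv } from "crypto";

// Stand-in KEK; in GC this would live inside the Key Management Service hierarchy.
const kek = randomBytes(32);

function encryptChunk(chunk: Buffer) {
  const dek = randomBytes(32); // per-chunk data encryption key
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", dek, iv);
  const ciphertext = Buffer.concat([cipher.update(chunk), cipher.final()]);
  const tag = cipher.getAuthTag();

  // Wrap the DEK with the KEK so only the wrapped form is stored alongside the chunk.
  const wrapIv = randomBytes(12);
  const wrapper = createCipheriv("aes-256-gcm", kek, wrapIv);
  const wrappedDek = Buffer.concat([wrapper.update(dek), wrapper.final()]);
  const wrapTag = wrapper.getAuthTag();

  return { ciphertext, iv, tag, wrappedDek, wrapIv, wrapTag };
}

// Usage: split data into chunks, encrypt each independently, scatter the results.
const chunks = [Buffer.from("chunk one"), Buffer.from("chunk two")];
const stored = chunks.map(encryptChunk);
console.log(`${stored.length} chunks encrypted, each with its own wrapped key`);
```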

What’s new in Kubernetes and Google Container Engine

Next up was a Kubernetes session covering how it works with Google Container Engine (GKE). I have to say, I’ve heard the name Kubernetes thrown around a lot, but never really had the time or the inclination to see what all the fuss is about. As I understand it, Kubernetes is a management layer over the top of container technologies such as Docker, providing enterprise features such as clustering and scaling.

Kubernetes was initially written by Google before being open-sourced and it’s rapidly becoming one of the biggest open source projects ever. One of the key drivers for using containers and Kubernetes is the ability to port your environment to any platform. Containers and Kubernetes can be run on Azure, AWS, GC or even on-prem. Using this technology avoids vendor lock-in, if that is a concern for you.

Kubernetes contributors and users

There is also a very high release cadence – a new version ships every three months, and version 1.7 is due at the end of June (1.6 is the current version). The essence of containerisation is that you can start to develop and use microservices (services broken down into very small, fast-moving parts rather than one huge, inflexible monolithic stack). Containers are also stateless in the sense that data is stored elsewhere (a cloud storage bucket, etc.) and are disposable items.

In a Kubernetes cluster, you can now scale up to 5,000 nodes per cluster. A cluster is a collection of nodes (think VMs), and pods are groups of containers running isolated from each other on a node. Clusters can be multi-zone and multi-region, and there is now also the concept of “taints” and “tolerations”. Think of a taint as a node characteristic, such as having a GPU or a certain RAM or CPU size, that repels pods by default. A toleration is a rule on the pod that allows it to be scheduled onto a node with a matching taint – for example, a toleration would allow a container to run on a GPU node that other workloads are kept off.
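
Here’s roughly what that looks like in practice. You’d normally express a pod as a YAML manifest; the object below just mirrors that shape, and the node, pod and image names are invented for illustration.

```typescript
// Sketch of the taint/toleration idea (names are made up for illustration).
//
// First, the node is tainted so that ordinary pods are repelled from it:
//   kubectl taint nodes gpu-node-1 gpu=true:NoSchedule

const gpuPod = {
  apiVersion: "v1",
  kind: "Pod",
  metadata: { name: "training-job" },
  spec: {
    containers: [{ name: "trainer", image: "gcr.io/my-project/trainer:latest" }],
    // The toleration lets this pod (and only pods carrying it) land on the tainted GPU node.
    tolerations: [
      { key: "gpu", operator: "Equal", value: "true", effect: "NoSchedule" },
    ],
  },
};

console.log(JSON.stringify(gpuPod, null, 2));
```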

The final point of note here is that Google offer a managed Kubernetes service called Google Container Engine.

From Blobs to Relational Tables, where do I store my data?

My next breakout was an attempt to get a better view of the different storage options within GC. One of the first points made was really interesting: Rolls-Royce actually lease engines to airlines so they can collect telemetry data and have the ability to tune engines, as well as perform proactive maintenance based on data received back from the engines.

In summary, your storage options include:

  • RDBMS – Cloud SQL
  • Data Warehousing – BigQuery
  • Hadoop – Cloud Storage
  • NoSQL – Cloud BigTable
  • NoSQL Docs – Cloud Datastore
  • Scalable RDBMS – Cloud Spanner

Cloud Storage can have several different characteristics, including multi-region, regional, nearline and coldline. This is very similar to the options provided by AWS and Azure. Cloud Storage has an availability SLA of 99.95% and you use the same API to access all storage tiers.

Data lifecycle policies are available in a similar way to S3, moving data between the tiers when rules are triggered. Content delivery is performed using the Cloud CDN product and message queuing is performed using Cloud Pub/Sub. Cloud Storage for hybrid environments is also available, in a similar way to StorSimple or the AWS Storage Gateway, using partner solutions such as Panzura (cold storage, backup, tiering device, etc.).
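
As a hypothetical example, a lifecycle policy that demotes ageing objects down the tiers and eventually deletes them looks something like the sketch below. The bucket name and age thresholds are made up; the JSON shape is what you could apply with a command along the lines of `gsutil lifecycle set policy.json gs://my-bucket`.

```typescript
// Sketch of a Cloud Storage lifecycle policy, similar in spirit to S3 lifecycle rules.
const lifecyclePolicy = {
  rule: [
    { action: { type: "SetStorageClass", storageClass: "NEARLINE" }, condition: { age: 30 } },
    { action: { type: "SetStorageClass", storageClass: "COLDLINE" }, condition: { age: 365 } },
    { action: { type: "Delete" }, condition: { age: 2555 } }, // roughly 7 years
  ],
};

console.log(JSON.stringify(lifecyclePolicy, null, 2));
```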

Cloud SQL – 99.95% SLA, with failover replicas and read replicas, which seemed very similar to how AWS RDS works. One interesting product was Cloud Spanner. This is a horizontally scalable RDBMS solution that offers typical relational features such as ACID transactions, but with the scalability of typical cloud NoSQL solutions. This seemed to me a pretty unique feature of GC – I haven’t seen it elsewhere. Cloud Spanner also provides global consistency, a 99.99% uptime SLA and a 99.999% multi-region availability SLA. Cool stuff!

Serverless Options on GCP

My next breakout was on serverless options on GCP. Serverless seems to be the latest trend in cloud computing that for some people is the answer to everything and nothing. Both AWS and Azure provide serverless products, and there are a lot of similarities with the Google Cloud Functions product.

To briefly deconstruct serverless tech, this is where an event-driven process performs a specific task. For example, a file gets uploaded to a storage bucket and this causes an event trigger, where “stuff” is performed by a fleet of managed servers. Once the task is complete, the process goes back to sleep again.

The main benefits of serverless are cost and management. You aren’t spinning VMs up and down and you aren’t paying compute fees for idle VMs. Cloud Functions is charged per 100ms of usage and by how much RAM is assigned to the process. The back end also auto-scales, so you don’t have to worry about setting up your own auto-scaling policies.

Cloud Functions is in its infancy right now, so only Node.js is supported, but more language support will be added over time. Cloud Storage, Pub/Sub channels and HTTP webhooks can be used to capture events for serverless processes.
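
As a rough illustration, a storage-triggered function in the Node.js style of the time might look like the sketch below. The function and bucket names are invented and the deployment command is only indicative, something like `gcloud beta functions deploy processUpload --trigger-bucket my-upload-bucket`.

```typescript
// Minimal sketch of a storage-triggered Cloud Function, following the
// background-function signature of the time (event, callback).
export function processUpload(
  event: { data: { name?: string; bucket?: string; resourceState?: string } },
  callback: (err?: Error) => void
) {
  const file = event.data;

  if (file.resourceState === "not_exists") {
    // The object was deleted rather than created; nothing to do.
    return callback();
  }

  // "Stuff" happens here: resize an image, parse a CSV, kick off a pipeline, etc.
  console.log(`New object ${file.name} landed in bucket ${file.bucket}`);
  callback();
}
```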

Day Two wrap up to come in the next post!
