05-01-18 : 6/7 Ain’t Bad : AWS Certified Big Data – Specialty Exam Tips

I’m pleased to say I’ve just returned from sitting the AWS Certified Big Data Specialty exam and I managed to just about pass it first time. As always, I try to give some feedback to the community to help those who are planning on having a go themselves.

The exam itself is 65 questions over 170 minutes, which works out at a little over two and a half minutes per question. In terms of difficulty, it’s definitely harder than the Associate level exams and in some cases as tough as the Professional level exams. I didn’t feel as time constrained as I have on some other AWS exams, as most of the questions are reasonably short (although a couple of them don’t make sense, meaning you need to take a best guess at those).

In terms of preparation, I was lucky enough to be sent on the AWS Big Data course by my employer just before Christmas. It certainly helped, but there was some exam content I didn’t remember the course covering. I also chose LinuxAcademy over A Cloud Guru, but really only because LA had hands-on labs with its course and I don’t think ACG has them right now. There’s really no substitute for hands-on labs to help understand a concept beyond the documentation.

I also use QwikLabs for hands-on lab practice. There are a number of free labs you can use to help with some of the basics. For the more advanced labs, I’d recommend buying an Advantage Subscription, which allows you to take unlimited labs for a monthly charge. It’s about £40 if you’re in the UK, around $55 for US based folks. It might sound like a lot, but it’s cheaper than paying for an exam resit!

I won’t lie, Big Data is not my strong point and it’s also a topic I find quite dry, having been an infrastructure guy for 20 years or more. That being said, Big Data is a large part of the technology landscape we live in, and I always say a good architect knows a little bit about a lot of things.

As with other AWS exams, the questions are worded in a certain way. For example, “the most cost effective method”, “the most efficient method” or “the quickest method”. Maybe the latter two are more subjective, but the most cost effective answer usually points to S3 and Lambda rather than massive EMR and Redshift clusters, for example.

What should you focus on? Well, the exam blueprint is always a good place to start. Some of the objectives are a bit generic, but you should have a sound grasp of what all the products are, how they are architected, and their design patterns and anti-patterns (i.e. when not to use them). From there, you should be able to weed out some of the clearly incorrect answers to give you a statistically better chance of picking the correct answer.
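To put a rough number on that "statistically better chance", here is a minimal sketch. It assumes a question with four options and one correct answer, and a blind guess among whatever is left; those numbers and the `guess_odds` helper are my own illustration, not anything taken from the exam blueprint.

```python
from fractions import Fraction


def guess_odds(options: int, eliminated: int, correct: int = 1) -> Fraction:
    """Chance of a blind guess being right after ruling out wrong options.

    Assumes every remaining option is equally likely to be picked and that
    only incorrect options have been eliminated (an idealised model).
    """
    remaining = options - eliminated
    if correct < 1 or eliminated < 0 or remaining < correct:
        raise ValueError("cannot eliminate more options than there are wrong answers")
    return Fraction(correct, remaining)


if __name__ == "__main__":
    # Hypothetical four-option question with a single correct answer.
    for eliminated in range(3):
        print(f"{eliminated} ruled out -> odds of a correct guess: {guess_odds(4, eliminated)}")
```

Under those assumptions, ruling out two obviously wrong answers takes a blind guess from 1 in 4 to 1 in 2, which is why knowing the anti-patterns pays off even on questions you can’t fully answer.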

Topic wise, I’d advise focusing on the following:-

  • Kinesis (Streams, Firehose, Analytics, data ingestion and export to other AWS services, tuning)
  • DynamoDB (Performance tuning, partitioning, use patterns and anti-patterns, indexing)
  • S3 (Patterns and anti-patterns, IA/Glacier and lifecycling, partitioning)
  • Elastic MapReduce (Products used in conjunction and what they do – Spark, Hadoop, Zeppelin, Sqoop, Pig, Hive, etc.)
  • QuickSight (Use patterns/anti-patterns, chart types)
  • Redshift (Data ingestion, data export, slicing design, indexing, schema types)
  • Instance types (compute intensive, a smaller number of large instances vs a larger number of small instances)
  • Compression (performance, compression sizes)
  • Machine Learning (machine learning model types and when you’d use them)
  • IoT (understand the basics of AWS IoT architecture)
  • What services are multi-AZ and/or multi-region and how to work around geographic constraints
  • Data Import/Export (when to use, options)
  • Security (IAM, KMS, HSM, CloudTrail)
  • CloudWatch (log files, metrics, etc.)

As with many AWS exams, the topics seem very broad, so it’s well worth knowing a little about all of the above, but certainly focus on EMR and Redshift as they are the bedrock products of Big Data. If you know them well, I’d say you’re halfway there.

You may also find re:Invent videos helpful, especially the Deep Dive ones at the 300 or 400 level. The exam is passable; if I can do it, anyone can! Hopefully this blog helped you out, as there doesn’t seem to be much information out there on the exam since it went GA.

Just the Networking Specialty to do now for the full set. Hopefully I’ll get that done before my SA Professional expires in June!




QwikLabs Competition Winner


Just a quick post today to say thanks to everyone who entered the QwikLabs competition and as promised, we have a winner! The random number generator picked out Hardik Mistry and he has already unwrapped his prize! Thanks again to QwikLabs for the token and for their support. If you haven’t yet swung by their site, I highly recommend it.



Blog Renaming and QwikLabs Competition

The more sharp-eyed among you (and the hardy few that follow me on Twitter) will have noticed that Virtual Fabric is no more and Blue Clouds is the new name for this blog. Why? Well, VF worked back in the days when pretty much all I did was on-premises virtualisation with VMware products. Those products still rock and have their place, but these days I spend more or less 100% of my time in the public cloud space, with Azure, Office 365 and AWS.

As such, Virtual Fabric doesn’t work as a name anymore, and in fact, I’ve been asked on more than one occasion if I sell curtains and soft furnishings. The answer is of course, NO! So I wondered what worked better, and of course all the cool names have gone, .cloud domains cost an arm and a leg and I lack the imagination to find something more memorable.

Anyway, seeing as I like cloud computing and I am a season ticket holder for Wigan Athletic, I thought “Blue Clouds”. Has a nice double meaning, and the .com was available, which was nice. All your existing bookmarks will still work, as I still own virtual-fabric.com and will do for the foreseeable future.


To celebrate this modest re-branding, I have a competition giveaway! That’s right! Hold onto your hats folks, as the awesome people at QwikLabs donated a free AWS lab token a little while back after I gave them a shout out on this very blog. If you want to enter the competition to win this prize, all you have to do is enter your name and e-mail address into the virtual tombola and I will pick a winner at random on Friday August 12th. Rules below :-

  • Winner chosen at random and my decision on the winner is final
  • No cash alternatives (awesomeness has no price)
  • QwikLabs token is good until 31/12/2017
  • Names and e-mail addresses are for draw purposes only, and I will not share them elsewhere. I hate spam as much as anyone
  • Please don’t enter multiple times under false names, that’s just bad sport
  • No bribes please, I’m not a politician
  • Any post draw whiners will be directed to talk to the hand
  • Winner will be posted on this blog when the competition closes

To enter, click here and simply provide your name and e-mail address.

Good luck, and even if you don’t win, please give QwikLabs a spin if you’re studying for AWS exams. They’re very good!