05-01-18

05-01-18 : 6/7 Ain’t Bad : AWS Certified Big Data – Specialty Exam Tips

I’m pleased to say I’ve just returned from sitting the AWS Certified Big Data Specialty exam, and I managed to just about pass it first time. As always, I try to give some feedback to the community to help those who are planning on having a go themselves.

The exam itself is 65 questions over 170 minutes, which works out at a little over two and a half minutes per question. In terms of difficulty, it’s definitely harder than the Associate level exams and, in places, as tough as the Professional level exams. I didn’t feel as time constrained as I have with some other AWS exams, as most of the questions are reasonably short (although a couple of them don’t make sense, meaning you need to take a best guess).

In terms of preparation, I was lucky enough to be sent on the AWS Big Data course by my employer just before Christmas. It certainly helped, but there was some exam content I didn’t remember the course covering. I also chose LinuxAcademy over A Cloud Guru, but really only because LA had hands-on labs with its course and I don’t think ACG has them right now. There’s really no substitute for a hands-on lab to help understand a concept beyond the documentation.

I also use QwikLabs for hands-on lab practice. There are a number of free labs you can use to help with some of the basics; for the more advanced labs, I’d recommend buying an Advantage Subscription, which allows you to take unlimited labs for a monthly charge. It’s about £40 if you’re in the UK, around $55 for US-based folks. It might sound like a lot, but it’s cheaper than paying for an exam resit!

I won’t lie, Big Data is not my strong point and it’s also a topic I find quite dry, having been an infrastructure guy for 20 years or more. That being said, Big Data is a large part of the technology landscape we live in, and I always say a good architect knows a little bit about a lot of things.

As with other AWS exams, the questions are worded in a certain way, for example “the most cost effective method”, “the most efficient method” or “the quickest method”. The latter two are perhaps more subjective, but cost effectiveness usually points towards S3 and Lambda as opposed to massive EMR and Redshift clusters.

What should you focus on? Well, the exam blueprint is always a good place to start. Some of the objectives are a bit generic, but you should have a sound grasp of what all the products are, their architecture, and their design patterns and anti-patterns (i.e. when not to use them). From there, you should be able to weed out some of the clearly incorrect answers, which gives you a statistically better chance of picking the correct one.

Topic-wise, I’d advise focusing on the following:

  • Kinesis (Streams, Firehose, Analytics, data ingestion and export to other AWS services, tuning)
  • DynamoDB (Performance tuning, partitioning, use patterns and anti-patterns, indexing)
  • S3 (Patterns and anti-patterns, IA/Glacier and lifecycling, partitioning)
  • Elastic MapReduce (Products used in conjunction and what they do – Spark, Hadoop, Zeppelin, Sqoop, Pig, Hive, etc.)
  • QuickSight (Use patterns/anti-patterns, chart types)
  • Redshift (Data ingestion, data export, slicing design, indexing, schema types)
  • Instance types (compute intensive, a smaller number of large instances vs a larger number of smaller instances)
  • Compression (performance, compression sizes)
  • Machine Learning (machine learning model types and when you’d use them)
  • IoT (understand the basics of AWS IoT architecture)
  • What services are multi-AZ and/or multi-region and how to work around geographic constraints
  • Data Import/Export (when to use, options)
  • Security (IAM, KMS, HSM, CloudTrail)
  • CloudWatch (log files, metrics, etc.)

As with many AWS exams, the topics are very broad, so it’s well worth knowing a little about all of the above, but certainly focus on EMR and Redshift as they are the bedrock products of Big Data. If you know them well, I’d say you’re halfway there.

You may also find the re:Invent videos helpful, especially the Deep Dive ones at the 300 or 400 level. The exam is passable: if I can do it, anyone can! Hopefully this blog has helped you out, as there doesn’t seem to be much information out there on the exam since it went GA.

Just the Networking Specialty to do now for the full set. Hopefully I’ll get that done before my SA Professional expires in June!