Azure VNet Peering Preview Now Available




One of the networking features I preferred AWS to Azure for was the ease of peering VPCs together. As a quick primer, an AWS VPC is basically your own private cloud within AWS, with subnets and instances and all that good stuff. Azure VNets are very similar in that they are a logical grouping of subnets, instances, address spaces, etc. Previously, to link VNets together you had to use a VPN connection. That’s all well and good, but it’s a little clunky and, in my opinion, not as elegant as VPC peering.

Anyway, Microsoft has recently announced that VNet peering within a region is now available as a preview feature. This means that it’s available for you to try out, but be warned it’s pre-release software (much like a beta programme) and it’s a bit warts and all. It’s not meant to be used for production purposes and it is not covered by any SLAs.

The benefits of VNet peering include:-

  • Eliminates the need for VPN connections between VNets
  • Connects classic (ASM) and Resource Manager (ARM) networks together
  • High speed connectivity between VNets across the Azure backbone

Many of the same restrictions that govern the use of VPC peering in AWS also apply to VNet peering, including:-

  • Peering must occur in the same region
  • There is no transitive peering between VNets (if VNet A is peered with VNet B, and VNet B is peered with VNet C, VNet A still has no peering with VNet C)
  • There must be no overlap in the IP address space

While VNet peering is in preview, there is no charge for this service. Take a look at the documentation and give it a spin, in a test environment, obviously 😉




AWS : Keeping up with the changes


As we all know, working in the public cloud space means changes in the blink of an eye. Services are added, updated (and in some cases removed) at short notice, and it’s vital, not just from a Solutions Architect’s perspective but from an end user or operational standpoint, that we keep up to date with these announcements as and when they happen.

In days of old, we’d keep an eye on a vendor’s annual conference when they’d reveal something cool in their keynote, with a release on that day or to follow shortly after. In the public cloud, innovation happens much quicker and it’s no longer a case of waiting for “Geek’s Christmas”.

To that end, today I was pointed towards the AWS “What’s New” blog, which is in essence a change log for AWS services. Yesterday’s entries alone list 8 announcements or service updates.

It’s a site well worth bookmarking and reviewing on a regular basis; I’d suggest weekly if you have time. If you’re designing AWS infrastructures or running your business on AWS, you need to know what’s on the roadmap so you can plan accordingly.

You can visit the What’s New blog site here.



AWS Certified Solutions Architect Professional – Study Guide – Domain 8.0: Cloud Migration and Hybrid Architecture (10%)


The final part of the study guide is below – thanks to all those who have tuned in over the past few weeks and given some very positive feedback. I hope it helps (or has helped) you get into the Solutions Architect Pro club. It’s a tough exam to pass and the feeling of achievement is immense. Good luck!

8.1 Plan and execute for applications migrations

  • AWS Management Portal available to plug AWS infrastructure into vCenter. This uses a virtual appliance and can enable migration of vSphere workloads into AWS
  • Right-click on a VM and select “Migrate to EC2”
  • You then select the region, environment, subnet, instance type, security group and private IP address
  • Use cases:-
    • Migrate VMs to EC2 (VM must be powered off and configured for DHCP)
    • Reach new regions from vCenter to use for DR etc
    • Self service AWS portal in vCenter
    • Create new EC2 instances using VM templates
  • The inventory view is presented as :-
    • Region
      • Environment (family of templates and subnets in AWS)
        • Template (prototype for EC2 instance)
          • Running instance
            • Folder for storing migrated VMs
  • Templates map to AMIs and can be used to let admins pick a type for their deployment
  • Storage Gateway can be used as a migration tool
    • Gateway cached volumes (block based iSCSI)
    • Gateway stored volumes (block based iSCSI)
    • Virtual tape library (iSCSI based VTL)
    • Takes snapshots of mounted iSCSI volumes and replicates them via HTTPS to AWS. From here they are stored in S3 as snapshots and then you can mount them as EBS volumes
    • It is recommended to get a consistent snapshot of the VM by powering it off, taking a VM snapshot and then replicating this
  • AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon Elastic MapReduce (EMR).
  • AWS Data Pipeline helps you easily create complex data processing workloads that are fault tolerant, repeatable, and highly available. You don’t have to worry about ensuring resource availability, managing inter-task dependencies, retrying transient failures or timeouts in individual tasks, or creating a failure notification system. AWS Data Pipeline also allows you to move and process data that was previously locked up in on-premises data silos
  • Pipeline has the following concepts:-
    • Pipeline (container node that is made up of the items below, can run on either EC2 instance or EMR node which are provisioned automatically by DP)
    • Datanode (end point destination, such as S3 bucket)
    • Activity (job kicked off by DP, such as database dump, command line script)
    • Precondition (readiness check optionally associated with a data source or activity. The activity will not run if the check fails. Standard and custom preconditions are available: DynamoDBTableExists, DynamoDBDataExists, S3KeyExists, S3PrefixExists, ShellCommandPrecondition)
    • Schedule
  • Pipelines can also be used with on premises resources such as databases etc
  • Task Runner package is installed on the on premises resource to poll the Data Pipeline queue for work to do (database dump etc, copy to S3)
  • Much of Data Pipeline’s functionality can now be replaced by Lambda
  • Set up logging to S3 so you can troubleshoot your pipelines
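As a rough sketch of the Data Pipeline concepts above (schedule, datanode, activity), the snippet below builds the `pipelineObjects` structure that the `put_pipeline_definition` API call expects. The ids, names, bucket and command are invented for illustration; this is not a complete, production-ready pipeline.

```python
# Hypothetical sketch: define a daily shell-command job that writes its
# output to an S3 datanode. Each pipeline object is an id/name plus a list
# of key/value fields; refValue links one object to another.

def build_pipeline_objects(bucket):
    """Return pipelineObjects for a daily dump job (illustrative only)."""
    return [
        {"id": "DefaultSchedule", "name": "Every24Hours", "fields": [
            {"key": "type", "stringValue": "Schedule"},
            {"key": "period", "stringValue": "24 hours"},
            {"key": "startAt", "stringValue": "FIRST_ACTIVATION_DATE_TIME"},
        ]},
        {"id": "OutputData", "name": "S3Output", "fields": [
            {"key": "type", "stringValue": "S3DataNode"},
            {"key": "directoryPath", "stringValue": "s3://%s/dumps" % bucket},
        ]},
        {"id": "DumpActivity", "name": "NightlyDump", "fields": [
            {"key": "type", "stringValue": "ShellCommandActivity"},
            {"key": "command", "stringValue": "mysqldump ... > dump.sql"},
            {"key": "output", "refValue": "OutputData"},      # link to datanode
            {"key": "schedule", "refValue": "DefaultSchedule"},
        ]},
    ]
```

With credentials in place, you would pass this to `boto3.client("datapipeline").put_pipeline_definition(pipelineId=..., pipelineObjects=build_pipeline_objects("my-bucket"))`.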

8.2 Demonstrate ability to design hybrid cloud architectures

  • The biggest CIDR block a VPC can have is a /16 and the smallest is a /28
  • The first four IP addresses and the last one in each subnet are reserved by AWS – always 5 reserved. Taking 10.0.0.0/24 as an example:-
    • 10.0.0.0 – Network address
    • 10.0.0.1 – Reserved for the VPC router
    • 10.0.0.2 – Reserved by AWS for DNS services
    • 10.0.0.3 – Reserved by AWS for future use
    • 10.0.0.255 – Reserved for network broadcast. Network broadcast is not supported in a VPC, so this address is reserved
  • When migrating from a VPN to Direct Connect, make the VPN connection and the Direct Connect connection(s) part of the same BGP ASN, then make the VPN the less preferred path. AS path prepending achieves this: BGP is a path-vector protocol that prefers the shortest AS path, so a route advertising a single ASN is more preferable than one where the same ASN has been prepended three or four times
  • For applications that require multicast, you need to configure a VPN between the EC2 instances with in-instance software, so the underlying AWS infrastructure is not aware of it. Multicast is not supported by AWS
  • VPN network must be a different CIDR block than the underlying instances are using (for example 10.x address for EC2 instances and 172.16.x addresses for VPN connection to another VPC)
  • SQL Server can be migrated by exporting the database as flat files from SQL Server Management Studio; it can’t be replicated to another region or from on premises to AWS
  • CloudSearch can index documents stored in S3 and is powered by Apache Solr
    • Full text search
    • Drill down searching
    • Highlighting
    • Boolean search
    • Autocomplete
    • CSV, PDF, HTML, Office docs and text files are supported
  • Can also search DynamoDB with CloudSearch
  • CloudSearch can automatically scale based on load or can be manually scaled ahead of expected load increase
  • Multi-AZ is supported. CloudSearch is essentially a service hosted on EC2 instances, and this is how its costs are derived
  • EMR can be used to run batch processing jobs, such as filtering log files and putting results into S3
  • EMR uses Hadoop which uses HDFS, a distributed file system across all nodes in the cluster where there are multiple copies of the data, meaning resilience of the data and also enables parallel processing across multiple nodes
  • Hive is used to perform SQL like queries on the data in Hadoop, uses simple syntax to process large data sets
  • Pig is used to write MapReduce programs
  • EMR cluster has three components:-
    • Master node (manages data distribution)
    • Core node (stores data on HDFS from tasks run by task nodes and are managed by the master node)
    • Task nodes (managed by the master node and perform processing tasks only, do not form part of HDFS and pass processed data back to core nodes for storage)
  • EMRFS can be used to output data to S3 instead of HDFS
  • Can use spot, on demand or reserved instances for EMR cluster nodes
  • S3DistCp is an extension of DistCp that is optimized to work with AWS, particularly Amazon S3. You use S3DistCp by adding it as a step in a cluster or at the command line. Using S3DistCp, you can efficiently copy large amounts of data from Amazon S3 into HDFS where it can be processed by subsequent steps in your Amazon EMR cluster
  • Larger data files are more efficient than smaller ones in EMR
  • Storing data persistently on S3 may well be cheaper than leveraging HDFS, as large data sets require large instance sizes in the EMR cluster
  • A smaller EMR cluster with larger nodes may be just as efficient but more cost effective
  • Try to complete jobs within 59 minutes to save money (EMR is billed by the hour)
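The five reserved addresses per subnet described at the top of this section can be listed with nothing more than Python’s standard `ipaddress` module; the 10.0.0.0/24 block is just an arbitrary example.

```python
# Sketch: enumerate the five addresses AWS reserves in any VPC subnet
# (the first four plus the broadcast address) using only the stdlib.
import ipaddress

def reserved_addresses(cidr):
    """Return the 5 AWS-reserved addresses in a VPC subnet."""
    net = ipaddress.ip_network(cidr)
    first_four = [net.network_address + i for i in range(4)]  # .0 .1 .2 .3
    return [str(a) for a in first_four] + [str(net.broadcast_address)]

print(reserved_addresses("10.0.0.0/24"))
# → ['10.0.0.0', '10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.255']
```

This also makes the sizing rule concrete: a /28 subnet holds 16 addresses, so only 11 are usable once the 5 reservations are taken out.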


QwikLabs Competition Winner


Just a quick post today to say thanks to everyone who entered the QwikLabs competition and as promised, we have a winner! The random number generator picked out Hardik Mistry and he has already unwrapped his prize! Thanks again to QwikLabs for the token and for their support. If you haven’t yet swung by their site, I highly recommend it.



AWS Certified Solutions Architect Professional – Study Guide – Domain 7.0: Scalability and Elasticity (15%)


7.1 Demonstrate the ability to design a loosely coupled system

  • Amazon CloudFront is a web service (CDN) that speeds up distribution of your static and dynamic web content, for example, .html, .css, .php, image, and media files, to end users. CloudFront delivers your content through a worldwide network of edge locations. When an end user requests content that you’re serving with CloudFront, the user is routed to the edge location that provides the lowest latency, so content is delivered with the best possible performance. If the content is already in that edge location, CloudFront delivers it immediately. If the content is not currently in that edge location, CloudFront retrieves it from an Amazon S3 bucket or an HTTP server (for example, a web server) that you have identified as the source for the definitive version of your content.
  • CloudFront has two aspects – origin and distribution. You create a distribution and link it to an origin, such as S3, an EC2 instance, existing website etc
  • Two types of distributions, web and RTMP
  • Geo restrictions can be used to whitelist or blacklist traffic from specific countries, blocking access to the distribution
  • GET, HEAD, PUT, POST, PATCH, DELETE and OPTIONS HTTP methods are supported
  • Allowed methods are what CloudFront will pass on to the origin server. If you do not need to modify content, consider not allowing PUT, POST, PATCH and DELETE to ensure users cannot modify content
  • CloudFront does not cache responses to POST, PUT, DELETE and PATCH requests; content can be POSTed to an edge location and is then sent on to the origin server
  • SSL can be used to provide HTTPS. Can either use CloudFront’s own certificate or use your own
    • To support older browsers, you need a dedicated IP SSL certificate per edge location, which can be very expensive
    • SNI (Server Name Indication) custom SSL certs can be used by adding all hostnames behind the certificate, presented from a single IP address. Relies on SNI extensions in newer browsers
  • 100 CNAME aliases per distribution, can use wildcard CNAMEs
  • Use Invalidation Requests to forcibly remove content from Edge locations. Need to use API call to do this or do it from the console, or set a TTL on the content
  • Alias records can be used to map a friendly name to a CloudFront URL (Route 53 supports this). Supports zone apex entry (name without www, such as example.com). DNS records for the same name must have the same routing type (simple, weighted, latency, etc) or you will get an error in the console
  • Alias records can then have “evaluate target” set to yes so that existing health checks are used to ensure the underlying resources are up before sending traffic onwards. If a health check for the underlying resource does not exist, evaluate target settings have no effect
  • AWS doesn’t charge for mapping alias records to CloudFront distributions
  • CloudFront supports dynamic web content using cookies to forward on to the origin server
  • Forward query strings passes the whole URL to the origin if configured in CloudFront, but only for a web server or application as S3 does not support this feature
  • Cookie values can then be logged into CloudFront access logs
  • CloudFront can be used to proxy upload requests back to the origin to speed up data transfers
  • Use a zero value TTL for dynamic content
  • Different URL patterns can send traffic to different origins
  • Whitelist certain HTTP headers such as cloudfront-viewer-country so that locale details can be passed through to the web server for custom content
  • Device detection can serve different content based on the User Agent string in the header request
  • Invalidating objects removes them from CloudFront edge caches. A faster and less expensive method is to use versioned object or directory names
  • Enable access logs in CloudFront and then send them to an S3 bucket. EMR can be used to analyse the logs
  • Signed URLs can be used to provide time limited access or access to private content on CloudFront. Signed cookies can be used to limit secure access to certain parts of the site. Use cases are signed URLs for a marketing e-mail and signed cookies for web site streaming or whole site authentication
  • The Cache-Control max-age header is sent to the browser to control how long content stays in the local browser cache; this can help improve delivery, especially of static items
  • If-Modified-Since allows the browser to send a request for content only if it is newer than the modification date specified in the request. If the content has not changed, it is pulled from the browser cache
  • Set a low TTL for dynamic content as most content can be cached even if it’s only for a few seconds. CloudFront can also present stale data if TTL is long
  • Popular Objects report and cache statistics can help you tune CloudFront behaviour
  • Only forward cookies that are used to vary or tailor user based content
  • Use Smooth Streaming on a web distribution for live streaming using Microsoft technology
  • RTMP is true media streaming; progressive download delivers the file in chunks (to a mobile device, for example). RTMP is Flash only
  • Supports existing WAF policies
  • You can create custom error response pages
  • Two ElastiCache engines available – Redis and Memcached. Exam will give scenarios and you must select the most appropriate
  • As a rule of thumb, simple caching is done by memcached and complex caching is done by Redis
  • Only Redis is multi-AZ and has backup and restore and persistence capabilities, sorting, publisher/subscriber, failover
  • Redis is a persistent key/value store as well as a caching engine
  • Redis has backup and restore and automatic failover and is best used for frequently changing, complex data at scale
  • Unlike Memcached, Redis doesn’t need a database behind it
  • Leader boards is a good use case for Redis
  • Redis can be configured to use an Append Only File (AOF) that will repopulate the cache in case all nodes are lost and cache is cleared. This is disabled by default. AOF is like a replay log
  • Redis has a primary node and read only nodes. If the primary fails, a read only node is promoted to primary. Writes done to primary node, reads done from read replicas (asynchronous replication)
  • Redis snapshots are used to increase the size of nodes. This is not the same as an EC2 snapshot; the snapshot creates a new node, whose size is picked at launch
  • Redis can be configured to back up automatically each day within a window, or manual snapshots can be taken. Automatic backups have retention limits, manual ones don’t
  • Memcached can scale horizontally and is multi-threaded, supports sharding
  • Memcached uses lazy loading, so if an app doesn’t get a hit from the cache, it requests it from the DB and then puts that into cache. Write through updates the cache when the database is updated
  • TTL can be used to expire out stale or unread data from the cache
  • Memcached does not maintain its own data persistence (the database does this); scale by adding more nodes to a cluster
  • Vertically scaling memcached nodes requires standing up a new cluster of required instance sizes/types. All instance types in a cluster are the same type
  • Single endpoint for all memcached nodes
  • Put memcached nodes in different AZs
  • Memcache nodes are empty when first provisioned, bear this in mind when scaling out as this will affect cache performance while the nodes warm up
  • For low latency applications, place Memcache clusters in the same AZ as the application stack. More configuration and management but better performance
  • When deciding between Memcached and Redis, here are a few questions to consider:
    • Is object caching your primary goal, for example to offload your database? If so, use Memcached.
    • Are you interested in as simple a caching model as possible? If so, use Memcached.
    • Are you planning on running large cache nodes, and require multithreaded performance with utilization of multiple cores? If so, use Memcached.
    • Do you want the ability to scale your cache horizontally as you grow? If so, use Memcached.
    • Does your app need to atomically increment or decrement counters? If so, use either Redis or Memcached.
    • Are you looking for more advanced data types, such as lists, hashes, and sets? If so, use Redis.
    • Does sorting and ranking datasets in memory help you, such as with leaderboards? If so, use Redis.
    • Are publish and subscribe (pub/sub) capabilities of use to your application? If so, use Redis.
    • Is persistence of your key store important? If so, use Redis.
    • Do you want to run in multiple AWS Availability Zones (Multi-AZ) with failover? If so, use Redis.
  • Amazon Kinesis is a managed service that scales elastically for real-time processing of streaming data at a massive scale. The service collects large streams of data records that can then be consumed in real time by multiple data-processing applications that can be run on Amazon EC2 instances.
  • You’ll create data-processing applications, known as Amazon Kinesis Streams applications. A typical Amazon Kinesis Streams application reads data from an Amazon Kinesis stream as data records. These applications can use the Amazon Kinesis Client Library, and they can run on Amazon EC2 instances. The processed records can be sent to dashboards, used to generate alerts, dynamically change pricing and advertising strategies, or send data to a variety of other AWS services. The PutRecord command is used to put data into a stream
  • Data is stored in Kinesis for 24 hours, but this can go up to 7 days
  • You can use Streams for rapid and continuous data intake and aggregation. The type of data used includes IT infrastructure log data, application logs, social media, market data feeds, and web clickstream data. Because the response time for the data intake and processing is in real time, the processing is typically lightweight
  • The following are typical scenarios for using Streams
    • Accelerated log and data feed intake and processing
    • Real-time metrics and reporting
    • Real-time data analytics
    • Complex stream processing
  • An Amazon Kinesis stream is an ordered sequence of data records. Each record in the stream has a sequence number that is assigned by Streams. The data records in the stream are distributed into shards
  • A data record is the unit of data stored in an Amazon Kinesis stream. Data records are composed of a sequence number, partition key, and data blob, which is an immutable sequence of bytes. Streams does not inspect, interpret, or change the data in the blob in any way. A data blob can be up to 1 MB
  • Retention Period is the length of time data records are accessible after they are added to the stream. A stream’s retention period is set to a default of 24 hours after creation. You can increase the retention period up to 168 hours (7 days) using the IncreaseRetentionPeriod operation
  • A partition key is used to group data by shard within a stream
  • Each data record has a unique sequence number. The sequence number is assigned by Streams after you write to the stream with client.putRecords or client.putRecord
  • In summary, a record has three things:-
    • Sequence number
    • Partition key
    • Data BLOB
  • Producers put records into Amazon Kinesis Streams. For example, a web server sending log data to a stream is a producer
  • Consumers get records from Amazon Kinesis Streams and process them. These consumers are known as Amazon Kinesis Streams Applications
  • An Amazon Kinesis Streams application is a consumer of a stream that commonly runs on a fleet of EC2 instances
  • A shard is a uniquely identified group of data records in a stream. A stream is composed of one or more shards, each of which provides a fixed unit of capacity
  • Once a stream is created, you can add data to it in the form of records. A record is a data structure that contains the data to be processed in the form of a data blob. After you store the data in the record, Streams does not inspect, interpret, or change the data in any way. Each record also has an associated sequence number and partition key
  • There are two different operations in the Streams API that add data to a stream, PutRecords and PutRecord. The PutRecords operation sends multiple records to your stream per HTTP request, and the singular PutRecord operation sends records to your stream one at a time (a separate HTTP request is required for each record). You should prefer using PutRecords for most applications because it will achieve higher throughput per data producer
  • An Amazon Kinesis Streams producer is any application that puts user data records into an Amazon Kinesis stream (also called data ingestion). The Amazon Kinesis Producer Library (KPL) simplifies producer application development, allowing developers to achieve high write throughput to a Amazon Kinesis stream.
  • You can monitor the KPL with Amazon CloudWatch
  • The agent is a stand-alone Java software application that offers an easier way to collect and ingest data into Streams. The agent continuously monitors a set of log files and sends new data records to your Amazon Kinesis stream. By default, records within each file are determined by a new line, but can also be configured to handle multi-line records. The agent handles file rotation, checkpointing, and retry upon failures. It delivers all of your data in a reliable, timely, and simple manner. It also emits CloudWatch metrics to help you better monitor and troubleshoot the streaming process.
  • You can install the agent on Linux-based server environments such as web servers, front ends, log servers, and database servers. After installing, configure the agent by specifying the log files to monitor and the Amazon Kinesis stream names. After it is configured, the agent durably collects data from the log files and reliably submits the data to the Amazon Kinesis stream
  • SNS (Simple Notification Service) – a publisher creates a topic and subscribers receive updates published to that topic. This can be push to Android, iOS, etc
  • Use SNS to send push notifications via Amazon Device Messaging, Apple Push Notification Service for iOS and OS X, Baidu Cloud Push, Google Cloud Messaging for Android, Microsoft Push Notification Service for Windows Phone and Windows Push Notification Services
  • Steps to create mobile push:-
    • Request credentials from mobile platforms
    • Request token from mobile platforms
    • Create platform application object
    • Publish message to mobile endpoint
  • Grid computing vs cluster computing
    • Grid computing is generally loosely coupled, often used with spot instances, and grids tend to grow and shrink as required. Can use different regions and instance types. Workloads are distributed and designed for resilience (auto scaling) – horizontal scaling rather than vertical scaling
    • Cluster computing has two or more instances working together in low latency, high throughput environments. Uses the same instance types. Note that GPU instances do not support SR-IOV networking
  • Elastic Transcoder encodes media files and uses a pipeline with a source and destination bucket, a job and a pre-set (what media type, watermarks etc). Pre-sets are templates and may be altered to provide custom settings. Pipelines can only have one source and one destination bucket
  • Integrates into SNS for job status updates and alerts
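As a rough sketch of how the Kinesis partition key and shard concepts above fit together: the partition key is MD5-hashed to a 128-bit integer and matched against each shard's hash key range. The two shard ranges below are illustrative, not taken from a real stream (DescribeStream returns the real ranges).

```python
# Sketch of Streams record routing: MD5(partition key) as a 128-bit integer,
# matched against per-shard hash key ranges.
import hashlib

def hash_partition_key(key):
    """The 128-bit integer Kinesis derives from a partition key."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")

def pick_shard(key, shard_ranges):
    """Return the shard whose hash key range contains the hashed key."""
    h = hash_partition_key(key)
    for shard_id, (start, end) in shard_ranges.items():
        if start <= h <= end:
            return shard_id

# An illustrative 2-shard stream splitting the 128-bit key space in half
two_shards = {
    "shardId-000000000000": (0, 2**127 - 1),
    "shardId-000000000001": (2**127, 2**128 - 1),
}
print(pick_shard("web-server-42", two_shards))
```

This is why the choice of partition key matters: records sharing a key always land on the same shard, so a skewed key distribution creates hot shards.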



AWS Certified Solutions Architect Professional – Study Guide – Domain 6.0: Security (20%)


6.1 Design information security management systems and compliance controls

  • AWS Directory Service is a hosted service that allows you to connect your EC2 instances to an Active Directory, either on premises or standalone in the AWS cloud
  • Comes in two flavours:-
    • AD Connector
    • Simple AD
  • AD Connector permits access to resources such as Workspaces, WorkMail, EC2 etc via existing AD credentials using IAM
  • AD Connector enforces on-premises policies such as password complexity, history and lockout policies
  • AD can also use MFA by leveraging RADIUS services
  • Simple AD is based within AWS and runs on a Samba 4 compatible server. Supports:-
    • User and group accounts
    • Kerberos based SSO
    • GPOs
    • Domain joining EC2 instances
    • Automated daily snapshots
  • Simple AD limitations:-
    • Does not support MFA
    • Cannot add additional AD servers
    • Can’t create trust relationships
    • Cannot transfer FSMO roles
    • Doesn’t support PowerShell scripting
  • In most cases, Simple AD is the least expensive option and your best choice if you have 5,000 or fewer users and don’t need the more advanced Microsoft Active Directory features.
  • AWS Directory Service for Microsoft Active Directory (Enterprise Edition) is a managed Microsoft Active Directory hosted on the AWS Cloud. It provides much of the functionality offered by Microsoft Active Directory plus integration with AWS applications. With the additional Active Directory functionality, you can, for example, easily set up trust relationships with your existing Active Directory domains to extend those directories to AWS services.
  • Microsoft AD is your best choice if you have more than 5,000 users and need a trust relationship setup between an AWS hosted directory and your on-premises directories
  • AD Connector is your best choice when you want to use your existing on-premises directory with AWS services.
  • CloudTrail is used for logging all API calls and events made in all regions in your AWS account. This can be either from the console or via the command line. It is more an auditing tool rather than a logging tool
  • CloudWatch is a monitoring service for AWS services. You can collect and track metrics, collect and track log files and set alarms. Works with EC2, DynamoDB, RDS instances as well as any custom metrics from your applications or log files those apps generate
  • By default, CloudWatch Logs will store your log files indefinitely. You can change the log group retention period at any time
  • Log groups are used to capture log files from instances and can gather them in a single folder structure, grouped by instance ID
  • CloudWatch alarms are only stored for 14 days
  • CloudWatch logging is billed per GB ingested and per GB archived per month, charged per alarm per month
  • Can work out cheaper to store your logs in S3, depending on your environment
  • CloudWatch can be used to monitor CloudTrail by creating log groups that alert when particular terms, phrases or values are found in a log file (“error”, etc.). This is the CloudWatch Logs feature. Define a metric filter to create alerts based on keywords or phrases in the log files; this defines a measurable metric
  • Events can be monitored and shipped to CloudWatch, S3 or to a third party product such as Splunk
  • Don’t log to non-persistent storage, such as an EC2 instance’s root volume (deleted by default when the instance terminates). Log to S3 or CloudWatch
  • CloudTrail can log across multiple accounts and put logs in a single S3 bucket (needs cross account access)
  • CloudWatch can be used to monitor multiple AWS accounts
  • The awslogs package on Linux installs the log agent and forwards system logs to CloudWatch for collection and alerting
  • awslogs.conf can be configured to send specific logging information to CloudWatch
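The metric filter idea above (alert on “error” in a log file) can be sketched as follows. The kwargs mirror the CloudWatch Logs put_metric_filter API, but the log group, filter and namespace names are invented examples.

```python
# Hypothetical sketch: build the arguments for a CloudWatch Logs metric
# filter that counts log events containing the word "error".
def error_filter_kwargs(log_group):
    return {
        "logGroupName": log_group,
        "filterName": "ErrorCount",
        "filterPattern": '"error"',     # match events containing "error"
        "metricTransformations": [{
            "metricName": "ErrorCount",
            "metricNamespace": "MyApp/Logs",   # invented namespace
            "metricValue": "1",                # add 1 to the metric per match
        }],
    }

# Usage (requires AWS credentials):
#   boto3.client("logs").put_metric_filter(
#       **error_filter_kwargs("/var/log/messages"))
```

A CloudWatch alarm on the resulting ErrorCount metric then completes the alerting loop the bullet describes.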

6.2 Design security controls with the AWS shared responsibility model and global infrastructure

  • Inline policies are policies that are directly associated to an object (user, for example) and are deleted when the object is deleted. Use cases include:-
    • Requirement for strict one to one policy relationship
    • Ensuring the policy is deleted when the object is deleted
  • Managed policies are created and managed separately, use cases include:-
    • Version management (up to five versions)
    • Configuration rollback
    • Reusability
    • Central management
    • Delegation of permissions management
    • Larger policy size (up to 5K)
    • Can be customer managed or AWS managed (AWS managed policies have a little AWS icon next to them)
    • Assign to groups, roles, users etc
    • Up to 10 managed policies may be assigned per object
  • Variables also supported in policies
  • Default policy position is to deny. Explicit deny trumps everything
  • Tags can be used to control access by adding a condition clause into policies – the condition must match a tag for access to be effective (eg. All EC2 instances where the tag matches Cost Centre : IT)
  • IAM policies follow the PARC model:-
    • Principal (IAM user, group, role)
    • Action (launch instance, terminate instance, etc.)
    • Resource (EC2 instance, S3 bucket, etc.)
    • Condition (where instance = i23523, for example)
    • Effect (Deny, Allow)
  • Wildcards are supported, both asterisks and question marks, for granularity
  • NotAction provides a method to exempt or exclude a permission from a resource set; for example, an Allow statement with NotAction iam:* grants permissions for everything except IAM actions
  • When specifying multiple values in a policy JSON file, this is classed as an array and therefore the values must be wrapped in square brackets []
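Tying the last few points together (tag-based conditions and square brackets for multiple values), here is a minimal policy built as a Python dict and serialised with json.dumps. The actions and the CostCentre tag are arbitrary examples, not from the exam guide.

```python
# Sketch of an IAM policy: the Action element takes multiple values, so it
# must be a JSON array (square brackets); the Condition clause restricts the
# statement to instances tagged CostCentre = IT (example tag only).
import json

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["ec2:StartInstances", "ec2:StopInstances"],  # array => []
        "Resource": "*",
        "Condition": {
            "StringEquals": {"ec2:ResourceTag/CostCentre": "IT"}
        },
    }],
}
print(json.dumps(policy, indent=2))
```

Note how a single value ("Resource": "*") can be a bare string, while the multi-valued Action element becomes an array, which is exactly the square-bracket rule above.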

6.3 Design identity and access management controls

  • The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users)
  • Users come from one of three sources:-
    • Federated (Active Directory via SAML) – uses AD credentials and the user does not need to be an IAM user; SSO allows login to the console without assigning IAM credentials
    • Federation with OpenID web applications (Facebook, Google, Amazon etc)
    • Cross account access (IAM user from another account)
  • Federation is joining users in one domain (IAM) with another (AD, Facebook etc)
  • Identity Broker joins domain A to domain B
  • Identity Store/Provider is AD, Facebook etc
  • Identity is a user of that service or member of that domain
  • On a correct user ID and password, STS returns 4 items – access key, secret access key, token and duration (the token’s lifetime; between 1 and 36 hours, default is 12 hours for GetFederationToken and 1 hour for AssumeRole)
  • Identity Broker takes credentials from the application and checks them against LDAP. If they are correct, it calls STS and requests a token for a role via the GetFederationToken call, using its own IAM credentials. STS passes the access token with permissions back to the broker, which passes it back to the app, which then accesses the respective resource (such as S3). The resource then verifies the token has appropriate access
    • Develop an identity broker to communicate with LDAP and STS
    • Broker always communicates with LDAP first and then with STS
    • Application gets temporary access to AWS resources
  • The AssumeRole action returns a set of temporary security credentials (consisting of an access key ID, a secret access key and a security token) that you can use to access AWS resources that you might not normally have access to. Typically, you use AssumeRole for cross-account access or federation. You can optionally include multi-factor authentication (MFA) information when you call AssumeRole. This is useful for cross-account scenarios in which you want to make sure that the user who is assuming the role has been authenticated using an AWS MFA device
  • AssumeRoleWithWebIdentity returns a set of temporary security credentials for users who have been authenticated in a mobile or web application with a web identity provider, such as Amazon Cognito, Login with Amazon, Facebook, Google, or any OpenID Connect-compatible identity provider. Calling AssumeRoleWithWebIdentity does not require the use of AWS security credentials. Therefore, you can distribute an application (for example, on mobile devices) that requests temporary security credentials without including long-term AWS credentials in the application, and without deploying server-based proxy services that use long-term AWS credentials. Instead, the identity of the caller is validated by using a token from the web identity provider.
  • AssumeRoleWithSAML is generally used for AD federation requests
  • DecodeAuthorizationMessage decodes additional information about the authorization status of a request from an encoded message returned in response to an AWS request. For example, if a user is not authorized to perform an action that he or she has requested, the request returns a Client.UnauthorizedOperation response (an HTTP 403 response). Some AWS actions additionally return an encoded message that can provide details about this authorization failure
  • GetFederationToken returns a set of temporary security credentials (consisting of an access key ID, a secret access key, and a security token) for a federated user. A typical use is in a proxy application that gets temporary security credentials on behalf of distributed applications inside a corporate network. Because you must call the GetFederationToken action using the long-term security credentials of an IAM user, this call is appropriate in contexts where those credentials can be safely stored, usually in a server-based application. If you are creating a mobile-based or browser-based app that can authenticate users using a web identity provider like Login with Amazon, Facebook, Google, or an OpenID Connect-compatible identity provider, we recommend that you use Amazon Cognito or AssumeRoleWithWebIdentity
  • GetSessionToken returns a set of temporary credentials for an AWS account or IAM user. The credentials consist of an access key ID, a secret access key, and a security token. Typically, you use GetSessionToken if you want to use MFA to protect programmatic calls to specific AWS APIs like Amazon EC2 StopInstances. MFA-enabled IAM users would need to call GetSessionToken and submit an MFA code that is associated with their MFA device. Using the temporary security credentials that are returned from the call, IAM users can then make programmatic calls to APIs that require MFA authentication. If you do not supply a correct MFA code, then the API returns an access denied error.
  • The GetSessionToken action must be called by using the long-term AWS security credentials of the AWS account or an IAM user. Credentials that are created by IAM users are valid for the duration that you specify, between 900 seconds (15 minutes) and 129600 seconds (36 hours); credentials that are created by using account credentials have a maximum duration of 3600 seconds (1 hour)
  • Assertions are used in SAML to map AD groups to AWS roles
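The broker flow and duration limits above can be sketched as follows. This is a hedged, stdlib-only illustration: the function name and parameters are hypothetical, and it only builds the request parameters the broker would send – the actual STS call (shown in a comment) assumes boto3 and valid IAM credentials:

```python
# Sketch of the broker's GetFederationToken request parameters (names hypothetical).
# Duration limits follow the notes above: 900s (15 mins) to 129600s (36 hours),
# with a 12-hour (43200s) default for GetFederationToken.

def federation_token_request(name, duration_seconds=43200):
    """Build the parameter set the broker would send to STS GetFederationToken."""
    if not 900 <= duration_seconds <= 129600:
        raise ValueError("DurationSeconds must be between 900 and 129600")
    return {"Name": name, "DurationSeconds": duration_seconds}

params = federation_token_request("app-user")
# With boto3 installed and long-term IAM credentials configured, the broker
# would then call:
#   boto3.client("sts").get_federation_token(**params)
print(params)  # {'Name': 'app-user', 'DurationSeconds': 43200}
```

The returned temporary credentials (access key, secret access key, token) would then be handed back to the application, never the broker's long-term IAM credentials.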

6.4 Design protection of Data at Rest controls

  • HSM is a Hardware Security Module, a physical device that safeguards and manages cryptographic keys, usually either a plug-in card or a physical box
  • HSMs would previously have to be hosted on premises, which could mean latency between the application in AWS and the HSM on the customer site
  • Amazon provides CloudHSM. Keys can be created, stored and managed in a way that is only accessible to you
  • CloudHSM is charged with an upfront fee and then per hour until the instance is terminated. A two-week evaluation is available on request
  • CloudHSM is single tenanted. When you purchase an instance, it’s dedicated to you
  • Has to be deployed in a VPC (EC2-Classic users will need to create a VPC)
  • VPC peering can be used to access CloudHSM
  • You can use EBS volume encryption, S3 object encryption and key management with CloudHSM, but this does require custom scripting
  • If you need fault tolerance, add a second CloudHSM in a cluster – if you lose your single HSM, you lose all the keys
  • Can integrate CloudHSM with RDS as well as Redshift
  • Monitor with syslog
  • AWS Key Management Service is used from the IAM console and allows an administrator to define keys for the encryption of data
  • KMS is region based
  • The CMK (Customer Master Key) is at the top of the key hierarchy. You can add KMS administrators using IAM; users also need permissions via IAM or they are not allowed to use keys to perform encryption tasks
  • Users from other AWS accounts can be added as key users
  • Key rotation changes the backing key; all previous backing keys are kept, as they are needed to decrypt data they encrypted. A CMK would need to be disabled to prevent any of its backing keys being used for encryption or decryption
  • Data encrypted using a key is lost if the key is lost
  • You can select which encryption key is used to create an encrypted EBS volume, for example. If none is selected, the default is the EBS key pre-created in KMS
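The last point can be illustrated with a small sketch. The helper below is hypothetical (name and parameters are mine, not an AWS API) and simply builds the parameter set for EC2's CreateVolume call; if no KMS key is chosen, AWS falls back to the default EBS key pre-created in KMS:

```python
# Hypothetical helper building EC2 CreateVolume parameters for an encrypted
# EBS volume. If no KMS key is selected, AWS uses the default EBS key in KMS.

def encrypted_volume_params(az, size_gib, kms_key_id=None):
    params = {"AvailabilityZone": az, "Size": size_gib, "Encrypted": True}
    if kms_key_id:
        # Select a specific CMK instead of the pre-created default EBS key
        params["KmsKeyId"] = kms_key_id
    return params

params = encrypted_volume_params(
    "eu-west-1a", 100, "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE-KEY-ID"
)
# With boto3 installed and configured, this would become:
#   boto3.client("ec2").create_volume(**params)
print(params)
```

Omitting the third argument leaves KmsKeyId out of the parameters entirely, which is what triggers the default EBS key behaviour.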

6.5 Design protection of Data in Flight and Network Perimeter controls

  • NTP amplification can be used with a spoof IP address to return a large packet back to a different target (the intended victim) and flood the target with traffic
  • Reflection attacks involve eliciting a response from a server to a spoofed IP address where the compromised server acts like a reflector
  • Attacks can also take place at the application layer (layer 7) by flooding the web server with GET requests.
  • Slowloris attack is deliberately slow GET requests to open up lots of connections on the web server
  • Limit the attack surface by opening only required ports, use bastion hosts where appropriate and use private subnets
  • WAF is a Web Application Firewall and provides protection at layer 7
  • Can use a community based WAF appliance or use the AWS WAF service
  • Stacks can also be scaled horizontally and vertically to meet the additional load placed on your infrastructure by a DDoS attack
  • Scaling out is easier than scaling up, as adding instances results in no downtime
  • Geo restrictions or blocking can be used with CloudFront to prevent attacks from certain countries. This can be achieved using either whitelisting or blacklisting
  • Origin Access Identity restricts access to S3 buckets by preventing direct user access and forcing users to access objects via CloudFront URLs
  • Alias records in Route 53 can be used to redirect traffic from an existing infrastructure to a new one with greater capacity and WAFs, built to withstand a DDoS attack. No DNS changes and no propagation delays
  • You also need to learn normal behaviour for an application so that you don’t block any traffic during month end spikes, for example
  • With C3, C4, R3, D2, and I2 instances, you can enable Enhanced Networking capabilities, which provides higher network performance (packets per second). This feature uses a network virtualization stack that provides higher I/O performance and lower CPU utilization compared to traditional implementations. With Enhanced Networking, your application can benefit from features that can aid in building resilience against DDoS attacks, such as high packet-per-second performance, low latency networking, and improved scalability.
  • Amazon Route 53 has two capabilities that work together to help ensure end users can access your application even under DDoS attack: shuffle sharding and anycast routing
  • Amazon Route 53 uses shuffle sharding to spread DNS requests over numerous PoPs, thus providing multiple paths and routes for your application
  • Anycast routing increases redundancy by advertising the same IP address from multiple PoPs. In the event that a DDoS attack overwhelms one endpoint, shuffle sharding isolates failures while providing additional routes to your infrastructure
  • Alias Record Sets can save you time and provide additional tools while under attack. For example, suppose an Alias Record Set for example.com points to an ELB load balancer, which is distributing traffic across several EC2 instances running your application. If your application came under attack, you could change the Alias Record Set to point to an Amazon CloudFront distribution or to a different ELB load balancer with higher capacity EC2 instances running WAFs or your own security tools. Amazon Route 53 would then automatically reflect those changes in DNS answers for example.com without any changes to the hosted zone that contains Alias Record Sets for example.com.
  • IDS is an Intrusion Detection System; IPS is an Intrusion Prevention System
  • IDS/IPS is a virtual appliance installed into the public subnet. It may communicate with a SOC (such as Trend Micro’s) and sends logs to S3; an agent is required in each instance to capture and analyse traffic and requests
  • It is possible to restrict access to resources using tags. An explicit deny permission overrides everything. Use Action: API permissions to prevent actions via the command line or AWS console
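The tag-based restriction and the fact that an explicit deny trumps any allow can be combined into a single policy statement. This is an illustrative sketch only (the tag key and value are hypothetical):

```python
import json

# An explicit Deny overrides any Allow: this sketch denies stop/terminate on
# any EC2 instance unless it carries a matching tag (tag key and value are
# hypothetical examples).
deny_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": ["ec2:StopInstances", "ec2:TerminateInstances"],
            "Resource": "*",
            "Condition": {
                # Deny applies wherever the Environment tag is NOT "Sandbox"
                "StringNotEquals": {"ec2:ResourceTag/Environment": "Sandbox"}
            },
        }
    ],
}

print(json.dumps(deny_policy, indent=2))
```

Even if another attached policy allows ec2:* on these instances, the explicit deny above wins for anything not tagged Environment: Sandbox.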


Blog Renaming and QwikLabs Competition

The more sharp eyed among you (and the hardy few that follow me on Twitter) will have noticed that Virtual Fabric is no more and Blue Clouds is the new name for this blog. Why? Well VF worked back in the days when pretty much all I did was on premises virtualisation with VMware products. Those products still rock and have their place, but obviously these days I spend more or less 100% of my time in the public cloud space, with Azure, Office 365 and AWS.

As such, Virtual Fabric doesn’t work as a name anymore, and in fact, I’ve been asked on more than one occasion if I sell curtains and soft furnishings. The answer is of course, NO! So I wondered what worked better, and of course all the cool names have gone, .cloud domains cost an arm and a leg and I lack the imagination to find something more memorable.

Anyway, seeing as I like cloud computing and I am a season ticket holder for Wigan Athletic, I thought “Blue Clouds”. Has a nice double meaning, and the .com was available, which was nice. All your existing bookmarks will still work, as I still own virtual-fabric.com and will do for the foreseeable future.


To celebrate this modest re-branding, I have a competition giveaway! That’s right! Hold onto your hats folks, as the awesome people at QwikLabs donated a free AWS lab token a little while back after I gave them a shout out on this very blog. If you want to enter the competition to win this prize, all you have to do is enter your name and e-mail address into the virtual tombola and I will pick a winner at random on Friday August 12th. Rules below:-

  • Winner chosen at random and my decision on the winner is final
  • No cash alternatives (awesomeness has no price)
  • QwikLabs token is good until 31/12/2017
  • Names and e-mail addresses are for draw purposes only, and I will not share them elsewhere. I hate spam as much as anyone
  • Please don’t enter multiple times under false names, that’s just bad sport
  • No bribes please, I’m not a politician
  • Any post draw whiners will be directed to talk to the hand
  • Winner will be posted on this blog when the competition closes

To enter, click here and simply provide your name and e-mail address.

Good luck and even if you don’t win, please give QwikLabs a spin if you’re studying for AWS exams, they’re very good!