Share your cloud infra with Terraform modules

The cloud comes with many advantages, and one really nice feature is infrastructure as code (IaC). IaC lets you manage a data center through definition files instead of physically configuring and setting up resources. A very popular tool for IaC is Terraform.

Terraform is an IaC tool that works with multiple clouds. Terraform configuration files are run from a developer’s machine or as part of a CI/CD pipeline. Terraform allows you to create modules: reusable parts of the infrastructure. A module is a container for multiple resources that are used together. Modules are nice even for a simple setup, as you do not need to repeat yourself, but they are especially handy with more resource-heavy setups. For example, setting up even a fairly simple AWS virtual private cloud (VPC) network can be resource-heavy and somewhat complex to do with IaC. As VPCs are typically set up in a similar fashion, generic Terraform modules can ease these deployments greatly.

Share your work with your team and the world

A nice feature of Terraform modules is that you can share them fairly easily. When you use modules, you can source them from multiple locations, such as the local file system, version control repositories, GitHub, Bitbucket, AWS S3 or an HTTP URL. If, and when, you have your configuration files in version control, you can simply point your module’s source to this location. This makes sharing modules across teams handy.
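
As a minimal sketch of the source options above, the blocks below pull modules from a local path, a Git repository and the Terraform Registry. The local path, the Git URL and the module labels are placeholders assumed for illustration; `terraform-aws-modules/vpc/aws` is one publicly available VPC module in the Registry.

```hcl
# Module stored next to the root configuration (hypothetical path).
module "network" {
  source = "./modules/network"
}

# Module fetched from a Git repository (hypothetical URL).
module "shared_vpc" {
  source = "git::https://example.com/infra/terraform-vpc.git"
}

# Module fetched from the public Terraform Registry.
module "registry_vpc" {
  source = "terraform-aws-modules/vpc/aws"
}
```

Each module would also need whatever input variables it declares; those are omitted here.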

Terraform also has the Terraform Registry, which is an index of publicly shared modules. Here you can share your modules with the rest of the world and really help out fellow developers. Going back to the VPC configuration, you can find really good Terraform modules to help you get started with this. Sharing your own code is really easy, and Terraform has very good documentation about it [1]. What you need is a GitHub repo named according to the Terraform conventions, with a description, the right module structure and a tag. That’s all.

Of course, when sharing you should be careful not to share anything sensitive or environment-specific. Good Terraform Registry modules are typically very generic and self-contained. When sourcing directly from outside locations, keep in mind that at times they might not be available, and your deployments might fail. To overcome this, taking snapshots of the modules you use might be a good idea.

Also, I find it good practice to have a disable variable in modules. This way the user of the module can choose whether to deploy it by setting a single variable. It is good to take this kind of variable into consideration from the very beginning, because in many cases it affects all the resources in the module. I’ll explain this with the example below.
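
One way to reduce surprises from external sources is to pin the exact module version you have tested, or to keep a copy in your own repository. The sketch below assumes a hypothetical Git repository and tag, an illustrative Registry version number and a hypothetical vendored path.

```hcl
# Pin a Git-sourced module to a tag with the ref argument
# (hypothetical URL and tag).
module "pinned_git_vpc" {
  source = "git::https://example.com/infra/terraform-vpc.git?ref=v1.2.0"
}

# Pin a Registry module to an exact version (version number is illustrative).
module "pinned_registry_vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "2.21.0"
}

# Use a snapshot of the module committed to your own repository
# (hypothetical path).
module "vendored_vpc" {
  source = "./vendor/terraform-vpc"
}
```

The vendored copy keeps working even when the upstream location is unavailable, at the cost of updating the snapshot by hand.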

Send alarms to Teams channel – example

You start to build an application and early on want to have some monitoring in place. You identify the first key metrics and start thinking about how to get notified about them. I run into this problem all the time. I’m not keen on emails, as those seem to get lost and require you to define who to send them to. On the other hand, I really like chats. Teams and Slack give you channels where you can collaborate on emerging issues, and it is easy to add people to the channels.

In AWS, I typically create CloudWatch alarms and route them to one SNS topic. By attaching a simple Lambda function to this SNS topic, you can send the message to Teams, for example. In Teams, you control the message format with Teams cards. I created a simple card that has some information about the alarm and a link to the metric. I found myself doing this over and over again, so I decided to build a Terraform module for it.

Here is a simple image of the setup. The Terraform module sets up an SNS topic that in turn triggers a Lambda function. The Lambda function sends all the messages it receives to a Teams channel. The setup is really simple, but handy. All I need to do is route my CloudWatch alarms to the SNS topic that is set up by the module, and I will get notifications in my Teams channel.

Simple image of the module and how it plugs into CloudWatch events and Teams
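
The wiring in the picture could be sketched in Terraform roughly as follows. This is not the module’s actual source: the resource names, the `lambda.zip` package, the handler name, the runtime and the `lambda_role_arn` variable are all assumptions made for illustration, and the code that posts to Teams is not shown.

```hcl
variable "lambda_role_arn" {
  description = "ARN of an existing IAM role for the function (assumed to exist)"
  type        = string
}

# Topic that CloudWatch alarms publish to.
resource "aws_sns_topic" "alarms" {
  name = "alarm-notifications"
}

# Function that forwards each message to Teams (hypothetical package and handler).
resource "aws_lambda_function" "notify" {
  function_name = "alarm-to-teams"
  filename      = "lambda.zip"
  handler       = "handler.lambda_handler"
  runtime       = "python3.12"
  role          = var.lambda_role_arn
}

# Deliver every message published to the topic to the function.
resource "aws_sns_topic_subscription" "to_lambda" {
  topic_arn = aws_sns_topic.alarms.arn
  protocol  = "lambda"
  endpoint  = aws_lambda_function.notify.arn
}

# Allow SNS to invoke the function.
resource "aws_lambda_permission" "from_sns" {
  statement_id  = "AllowExecutionFromSNS"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.notify.function_name
  principal     = "sns.amazonaws.com"
  source_arn    = aws_sns_topic.alarms.arn
}
```

Without the permission resource the subscription exists but SNS is not allowed to call the function, which is an easy step to forget.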

The module only requires you to give the Teams channel webhook URL where the messages are sent. When you create CloudWatch metric alarms, you just need to send them to the SNS topic that the module creates. The SNS topic ARN is in the module output.

You can now find the Terraform module in the Terraform Registry under the name “alarm-chat-notification” or by following the link in the footer [2]. I hope you find it useful for getting started with alarms.
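
Using the module could look roughly like the sketch below. The source address follows the Registry link [2], but the input name `teams_webhook_url` and the output name `sns_topic_arn` are assumptions made for illustration, as are the webhook URL and the alarm settings; check the module’s documentation for the real names.

```hcl
module "alarm_chat_notification" {
  source = "aloukiala/alarm-chat-notification/aws"

  # Hypothetical input name and placeholder URL.
  teams_webhook_url = "https://example.com/teams-webhook"
}

resource "aws_cloudwatch_metric_alarm" "high_cpu" {
  alarm_name          = "high-cpu"
  namespace           = "AWS/EC2"
  metric_name         = "CPUUtilization"
  statistic           = "Average"
  period              = 300
  evaluation_periods  = 2
  comparison_operator = "GreaterThanThreshold"
  threshold           = 80

  # Route alarm state changes to the topic created by the module
  # (hypothetical output name).
  alarm_actions = [module.alarm_chat_notification.sns_topic_arn]
}
```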

Disable variable

As I mentioned before, it is good practice to have a disable variable in the module. Doing this in Terraform is a bit tricky. First, create a variable to control this; in my repo it is called “create”, and it is a boolean defaulting to true. Now every resource in my module has to have the following line:

count = var.create ? 1 : 0

In Terraform this simply means that if my variable is false, the count is 0 and no resource will be created. Not the most intuitive, but it makes sense. This also means that every resource becomes a list. Therefore, if you refer to other resources, you have to do it with a list operation, even when you know that there is only one. For example, my Lambda function refers to the role by taking the first element in the list as follows:

aws_iam_role.iam_for_lambda[0].arn

Again, this makes sense and is good to keep in mind.
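
Putting the pieces together, a minimal sketch of the pattern might look like this. The `create` variable and the `aws_iam_role.iam_for_lambda` reference come from the text above; the resource names, the `lambda.zip` package, the handler, the runtime and the `lambda_arn` output are assumptions for illustration, not the module’s actual code.

```hcl
variable "create" {
  description = "Set to false to skip creating every resource in this module"
  type        = bool
  default     = true
}

resource "aws_iam_role" "iam_for_lambda" {
  count = var.create ? 1 : 0

  name = "alarm-lambda-role" # hypothetical name
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "lambda.amazonaws.com" }
    }]
  })
}

resource "aws_lambda_function" "notify" {
  count = var.create ? 1 : 0

  function_name = "alarm-to-teams" # hypothetical name
  filename      = "lambda.zip"     # hypothetical deployment package
  handler       = "handler.lambda_handler"
  runtime       = "python3.12"

  # The role is a list of zero or one elements, so index the first one.
  role = aws_iam_role.iam_for_lambda[0].arn
}

output "lambda_arn" {
  # Return null instead of failing when the module is disabled.
  value = var.create ? aws_lambda_function.notify[0].arn : null
}
```

Outputs need the same care as resource references: indexing `[0]` unconditionally in an output would fail when `create` is false, which is why the output is guarded by the same variable.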

I hope this blog inspires you to create reusable Terraform modules for the world to use. And please, feel free to source the alarm module.

[1] https://www.terraform.io/docs/registry/modules/publish.html
[2] https://registry.terraform.io/modules/aloukiala/alarm-chat-notification/aws/

The author of this blog post is a data engineer who has built many cloud-based data platforms for some of the largest Nordic companies.

My Summary of AWS re:Invent 2019

The re:Invent event for 2019 is officially over. However, the learning and innovation never stop. It was a full week of learning new things, mingling with other AWS users and basically having a good time in Las Vegas. You can continue learning by following the AWS Events YouTube channel: https://www.youtube.com/channel/UCdoadna9HFHsxXWhafhNvKw

Personally, I would like to thank all my colleagues, Solita’s customers and the employees of the event organisation for a simply magnificent conference. Thanks!

As fresh as I could be after a re:Invent.

The view of Oakland in the second picture was very pretty. The gentleman next to me from Datadog said that he has landed at SFO tens of times, and our plane’s approach direction was a first for him too. Amazing view!

It’s not just about the services, actually it’s more about having the bold mindset to try new things

You don’t have to be an expert in everything you do. If you are, it probably means that you are not following what is out there and are repeatedly doing things already familiar to you. I’m not saying that you always have to be Gyro Gearloose. I mean that you should push the limits a bit, take controlled risks for reward and have the will to learn new things.

The three announcements that caught my attention

Fargate Spot. This is what my project has been waiting for for a while. It will bring cost savings and make it possible to stop using ECS EC2 clusters for cost optimization. The rule of thumb is that you can save 70% of your costs with spot instances.
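
As a rough sketch of how Fargate Spot can be mixed with regular Fargate, the Terraform below sets a default capacity provider strategy for an ECS cluster. It assumes a current AWS provider for Terraform; the cluster name and the base and weight values are illustrative choices, not recommendations from the announcement.

```hcl
resource "aws_ecs_cluster" "app" {
  name = "app-cluster" # hypothetical name
}

resource "aws_ecs_cluster_capacity_providers" "app" {
  cluster_name       = aws_ecs_cluster.app.name
  capacity_providers = ["FARGATE", "FARGATE_SPOT"]

  # Keep one task on regular Fargate as a baseline.
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE"
    base              = 1
    weight            = 1
  }

  # Beyond the baseline, tasks are split 1:3 between Fargate and Fargate Spot.
  default_capacity_provider_strategy {
    capacity_provider = "FARGATE_SPOT"
    weight            = 3
  }
}
```

Spot capacity can be reclaimed, so a small on-demand baseline like this is one way to keep the service alive during interruptions.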

Outposts. I really like the idea that you can get computing power integrated with the AWS ecosystem right inside corporate data centers. Hybrid environments are the only way for many customers. In the future, I would like to see some kind of control panel inside the Outpost as well. For now, all information indicates that you basically cannot control the servers inside an Outpost beyond the OS level (e.g. logging in via SSH or Remote Desktop).

Warm Lambdas. I think most Lambda developers have thought about warming up their Lambda functions manually via CloudWatch Events etc. This simplifies the work to what it should always have been. Now you can be sure that if there is a request coming, you will have some warm computing capacity to serve it fast. The pricing starts from 1.50 $/month per 128 MB for one unit of provisioned concurrency (= a warm Lambda).
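
In Terraform, keeping one execution environment warm could be sketched as below. The two variables are assumptions for illustration: they stand for an existing function and one of its published versions or aliases, since provisioned concurrency is configured on a version or alias.

```hcl
variable "function_name" {
  description = "Name of an existing Lambda function (assumed to exist)"
  type        = string
}

variable "function_qualifier" {
  description = "Published version number or alias of the function"
  type        = string
}

# Keep one execution environment initialized and ready to serve requests.
resource "aws_lambda_provisioned_concurrency_config" "warm" {
  function_name                     = var.function_name
  qualifier                         = var.function_qualifier
  provisioned_concurrent_executions = 1
}
```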

re:Play 2019 photos

We organized a pre-party at The Still in the Mirage beforehand.

Would you like to hear more about what happened at re:Invent 2019? Sign up to our team’s WhatsApp group to chat with them, and register for the “What happens in Vegas won’t stay in Vegas” webinar to hear a full summary after the event.

My Thursday at AWS re:Invent 2019

Keynote by Dr. Werner Vogels

Dr. Vogels is CTO at AWS. The keynote started with very detailed information about virtual machine architecture and its evolution during the last 20-25 years. He said that the AWS Nitro microVM is perhaps the most essential component for providing a secure and performant virtual machine environment. It has enabled rapid development of new instance types.

Ms. Clare Liguori (Principal Software Engineer, AWS) gave detailed information about container services. In AWS there are two container platforms: ECS on EC2 with virtual machines, and serverless Fargate. If you compare scaling speed, Fargate can follow the needed capacity much faster and avoid both under-provisioning (which hurts performance) and over-provisioning (which hurts cost). With ECS on EC2 you have two scaling phases: first you need to scale up your EC2 instances, and only after that can you launch the tasks.

During the keynote Mr. Jeff Dowds (Information Technology Executive, Vanguard) described their journey to AWS from a corporate data center. Vanguard is a registered investment advisor company located in the USA and has over 5 trillion US dollars in assets. Mr. Dowds made the case for the public cloud with hard facts: 30% savings in compute costs, 30% savings in build costs, and finally 20x deployment frequency via automation. I think changing the mindset of the deployment philosophy is the most important change for Vanguard. As the slides said, they now have the ability to innovate!

Building a pocket platform as a service with Amazon Lightsail – CMP348

Mr. Robert Zhu (Principal Technical Evangelist, AWS) held a chalk talk session about the Amazon Lightsail service. He started by saying that this would be the most “anti” talk at re:Invent when it comes to scaling, high availability and so on. The crowd laughed out loud.

The example app deployed in the chalk talk was a progressive web app (PWA). PWAs try to look like a native app, e.g. on different phones. PWAs typically use a web browser in the background, with UI code shared between operating systems.

The Lightsail service provides public static IP addresses and a simple DNS service that you can use to connect the static IP address to your user-friendly domain name. It supports wildcard records and a default record, which is nice. The price for outbound traffic is very affordable: with the 10 USD plan you get 2 TB of outbound traffic.

We spent a lot of time on configuring a server in the traditional way via an SSH prompt: installing Docker, acquiring a certificate from Let’s Encrypt etc.

The Lightsail service has no connection to VPC, no IAM roles, and so on. It is basically only a virtual server, so it is not suitable for building a modern enterprise public cloud experience.

Selecting the right instance for your HPC workloads – CMP409

Mr. Nathan Stornetta (Senior Product Manager, AWS) held this builder session. He is the product manager for AWS ParallelCluster. In on-premises solutions you almost always need to make choices about what to run and when to run it. With the public cloud’s elastic capacity, you don’t have to queue for resources, and you don’t pay for what you are not using.

The term HPC stands for high performance computing, which basically means that your workload does not fit into one server and you need a cluster of servers with high-speed networking. Within the cluster, the proximity between servers is essential.

In AWS there are more than 270 different instance types. Selecting the right instance type requires experience with both the workload and the offering. Here is a nice cheat sheet for instance types:

If your workload needs high disk performance in and out of the server, the default AWS-recommended choice is the Amazon FSx for Lustre cluster storage solution.

If you decide to use the Elastic File System (EFS) service, you should first think about how much performance you need rather than what size you need. The design of EFS promises 200 MBps of performance per 1 TB of data. So you should size for the required performance, so that your application has enough IO bandwidth available.

The newest choice is Elastic Fabric Adapter (EFA), which was announced a couple of months ago. More information about EFA can be found here: https://aws.amazon.com/hpc/efa/

If you don’t have experience of which storage would work best for your workload, it is strongly recommended to test each one and make the decision after that.
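
One way to size EFS by performance instead of stored data is provisioned throughput mode. The Terraform sketch below is an illustration under that assumption, not something shown in the session; the throughput figure and the tag are placeholders.

```hcl
# Decouple throughput from the amount of data stored by provisioning it.
resource "aws_efs_file_system" "shared" {
  throughput_mode                 = "provisioned"
  provisioned_throughput_in_mibps = 200 # illustrative value

  tags = {
    Name = "hpc-shared" # hypothetical name
  }
}
```

With the default bursting mode, throughput instead follows the amount of data stored, which is the trade-off described above.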

Intelligently automating cloud operations – ENT305

This session was a workshop. In workshop sessions there are multiple tables working on the same topic, whereas in builder sessions there is one small table for each topic. So there were more than a hundred people doing the same exercise.

First, Mr. Francesco Penta (Principal Cloud Support Engineer, AWS) and Mr. Tipu Qureshi (Principal Engineer, AWS) gave a short overview of the services that we would use in this session. I want to mention a few of them. AWS Health keeps track of the health of the different services in your account. For example, it can raise an alarm if your ACM certificate cannot be renewed automatically (e.g. due to missing DNS records) or if a VPN tunnel is down.

The other service was AWS Auto Scaling predictive scaling. It is important if you want to avoid serious under-provisioning. If you just use e.g. the CPU metric from the last 5 minutes, you are already late. Also, if your horizontal scaling needs a while to get new nodes into service, predictive scaling helps you get more stable performance.

The workshop can be found here: https://intelligent-cloud-operations.workshop.aws/

I’m familiar with the tooling, so I could have yelled “Bingo” as one of the first people to finish. I was happy to finish early and go to the hotel for a short break before Solita’s customer event and re:Play. re:Play starts at 8 pm at the Las Vegas Festival Grounds with music, food, drinks and more than 30 000 pairs of eyes.

My Wednesday at AWS re:Invent 2019

It was an early morning today because the alarm clock woke me up around 6 am. The day started with the Worldwide Public Sector keynote at 7 am in the Venetian Palazzo O hall.

Worldwide Public Sector Breakfast Keynote – WPS01

This was my first time taking part in the Public Sector keynote. I’m not sure how worldwide it was. At least Singapore and Australia were mentioned, but I cannot remember anything special being said about Europe.

Anyone who follows the international security industry even a little cannot have missed how many cities, communities etc. have faced ransomware attacks. Some victims paid the ransom in bitcoins, some did not pay, and many victims are just staying quiet. The public cloud environment is a great way to protect your infrastructure and important data. Here is a summary of how to protect yourself:

RMIT University from Australia has multiple education programs for AWS competencies, and it was announced that they are now an official AWS Cloud Innovation Centre (CIC). Typical students have some educational background (e.g. a bachelor’s degree in IT) and want to make a move in the job market through re-education. Sounds like a great path!

The speaker, Mr. Martin Bean from RMIT, showed a picture from The Jetsons (1963, by Hanna-Barbera Productions) in which you could already spot multiple things that were invented for mass markets much later. Mr. Bean also mentioned two things that got my attention: more people own a cellphone than a toothbrush, and 50 percent of jobs are going to transform into something else in the next 20 years.

Visit to expo area

After the keynote I visited the expo in the Venetian Sands Expo area before heading to the Aria for the rest of Wednesday. The expo was huge, noisy, crowded etc. The more detailed experience from last year was enough for me. At the AWS Cloud Cafe I took a panorama picture and that was it, I was ready to leave.

I took the shuttle bus towards the Aria. I was very happy that the bus driver dropped us off next to the main door of the Aria hotel, which saves on average 20-30 minutes of queueing in the Aria’s parking garage. An important change! On the way I passed the Manhattan of New York.

Get started with Amazon ElastiCache in 60 minutes – DAT407

Mr. Kevin McGehee (Principal Software Engineer, AWS) was the instructor for the ElastiCache for Redis builder session. In the session we logged in to the AWS console, opened the Cloud9 development environment and then just followed the clearly written instructions.

The guide for the builder session can be found here: https://reinvent2019-elasticache-workshop.s3.amazonaws.com/guide.pdf

This session was about how to import data into Redis via Python and how to index and refine the data during the import phase. In refinement the data becomes information, with aggregated scoring, geolocation etc., which makes it easier for the requester to use. That was interesting and looked easy.

Build an effective resource compliance program – MGT405

Mr. Faraz Kazmi (Software Development Engineer, AWS) held this builder session.

Conformance packs for the AWS Config service were published last week. They can be integrated at the AWS Organizations level in the account structure. With conformance packs you can easily group Config rules (~governance rules for common settings) in a YAML template and get a consolidated view over those rules. There are a few AWS managed packs currently available; “Operational Best Practices For PCI-DSS” is one example. It’s clear that AWS will provide more and more of those rule sets in the upcoming months, and so will the community via GitHub.

There are a timeline view and a compliance view of all your resources, which makes this tool very effective for getting a consolidated view of resource compliance.

You can find the material here: https://reinvent2019.aws-management.tools/mgt405/en/

By the way, if you cannot find conformance packs, you are possibly using the old Config service UI in the AWS Console. Make sure to switch to the new UI. All new features are added only to the new UI.

The clean-up phase in the guide is not perfect. In addition to the steps in the guide, you have to manually delete the SNS topic and IAM roles that were created in the wizards. It was a disappointment that no sandbox account was provided.
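
To give an idea of what such a YAML template looks like, here is a minimal sketch of a custom conformance pack deployed with Terraform. It is not the workshop’s material: the pack name and the choice of two AWS managed rules for public S3 access are assumptions for illustration, and the account is assumed to already have an AWS Config recorder running.

```hcl
resource "aws_config_conformance_pack" "s3_public_access" {
  name = "s3-public-access" # hypothetical pack name

  # The pack itself is a YAML template that groups Config rules.
  template_body = <<-EOT
    Resources:
      S3BucketPublicReadProhibited:
        Type: AWS::Config::ConfigRule
        Properties:
          ConfigRuleName: s3-bucket-public-read-prohibited
          Source:
            Owner: AWS
            SourceIdentifier: S3_BUCKET_PUBLIC_READ_PROHIBITED
      S3BucketPublicWriteProhibited:
        Type: AWS::Config::ConfigRule
        Properties:
          ConfigRuleName: s3-bucket-public-write-prohibited
          Source:
            Owner: AWS
            SourceIdentifier: S3_BUCKET_PUBLIC_WRITE_PROHIBITED
  EOT
}
```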

Best practices for detecting and preventing data exposure – MGT408

Ms. Claudia Charro (Enterprise Solutions Architect, AWS) from Brasilia was the teacher in this session. It was very similar to my previous session, which I had not been aware of. In both sessions we used Config rules and blocked public S3 usage.

The material can be found here: https://reinvent2019.aws-management.tools/mgt408/en/cont/testingourenforcement.html

AWS Certification Appreciation Reception

The Wednesday evening started (as usual) with a reception for certified people at the Brooklyn Bowl. It is again a nice venue to have some food, drinks and music and to mingle with other people. I’m writing this around 8 pm, as I left a bit early to get a good night’s sleep for Thursday, which is the last full day.

Photos: Brooklyn Bowl outside, inside, bowling lanes and dance floor.

On the way back to my hotel (Paris Las Vegas) I found the version 3 Tesla Supercharger station, which was one of the first v3 stations in the world. It was not too crowded. The station was big compared with the Supercharger stations in Finland. The v3 Supercharger stations can provide up to 250 kW of charging power for Model 3 Long Range (LR) models, which have a 75 kWh battery. I would have liked to see the new (experimental) Cybertruck model.

My Tuesday at AWS re:Invent 2019

Please also check my blog post from Monday.

Starting from Tuesday, each event venue provides a great breakfast with lightning-speed service. It scales and works. It’s always amazing how each venue can provide food services for thousands of people in a short period of time. Most people are there for the first time, so guidance has to be very clear and simple.

Today started with the keynote by Mr. Andy Jassy. I was not able to join the live session at the Venetian because of my next session. Moving from one location to another takes at least 15 minutes, and you have to be at your session at least 10 minutes before it starts to claim your reserved seat. Starting last year, the booking system enforces a one-hour gap between sessions in different venues.

Keynote by Andy Jassy on Tuesday

You can find the full recap written by my colleagues here: Andy Jassy’s release roller coaster at AWS re:Invent 2019

Machine learning was the thing today. The SageMaker service received tons of new features. Those are explained by our ML specialists here: Coming soon!

So I joined the overflow room at the Mirage for Jassy’s keynote session. Everyone had their personal headphones. I have more than 15 years of background in software development, so it was love at first sight with CodeGuru. There is good news and bad news. It is a service for static analysis of your code, for automated code review and, last but definitely not least, for real-time profiling via an installed agent.

The profiling information is provided in 5-minute periods and covers several factors: CPU, memory and latency. It looks like a promising product, because Mr. Jassy said that Amazon has used it internally for a couple of years already. So it is already a mature product.

So, what was the bad news? It supports only Java. Nothing to add to that.

The other interesting announcement for me was the general availability of Outposts. Finally, also in Europe, you can have fully AWS-managed servers inside your corporate data center. Those servers integrate fully with the AWS Console and can be used e.g. for running ECS container services. The starting price of 8300 USD per month is very competitive, because it already includes roughly 200 cores, 800 GB of memory and 2.7 TB of instance storage. You can additionally add EBS storage starting from 2.7 TB.

You can find more information here: https://aws.amazon.com/blogs/aws/aws-outposts-now-available-order-your-racks-today/

Performing analytics at the edge – IOT405

This session was a workshop at level 400 (the highest). It was held by Mr. Sudhir Jena (Sr. IoT Consultant, AWS) and Mr. Rob Marano (Sr. Practice Manager, AWS).

Industry 4.0, a.k.a. IoT, was a totally new sector for me. It was a very informative and pleasant session. It was all about the AWS IoT Greengrass service, which can provide low-latency responses while still being a managed platform with tons of features for handling data streams from IoT devices locally.

For many people it was their first touch with the AWS Cloud Development Kit, which I fell in love with about three months ago. It has multiple advantages, like refactoring, strong typing and good IDE support. You can find more information about AWS CDK here: https://docs.aws.amazon.com/cdk/latest/guide/home.html

In our workshop session we demonstrated receiving a time series data stream of temperature, humidity etc. from an IoT device. In our case the IoT device was an EC2 instance that simulated one. From the AWS IoT Greengrass console you can e.g. deploy new versions of analytics functions to the IoT devices.

Material for the workshop can be found on GitHub: https://github.com/aws-samples/aws-iot-greengrass-edge-analytics-workshop

AWS Transit Gateway reference architectures for many VPCs – NET406

This was a session held by Nick Matthews (Principal Solutions Architect, AWS) in the glamorous Ballroom F at the Mirage. It fits more than a thousand people. The room was almost full, so it was a very popular session.

To summarise the topic, there are several good ways to build interconnectivity between multiple VPCs and a corporate data center. At small scale you can do things more manually, but at large scale you need automation.

One of the presented solutions for automation was based on the use of tags. The autonomous team (owner of account A) can tag their shared resources in a predefined way. The transit account can read those changes via CloudTrail logging: each modification creates a CloudTrail audit event, which triggers a Lambda function. The function checks whether a change is required and writes a change request item to a metadata table in DynamoDB to wait for approval. The network operator is notified via SNS (Simple Notification Service). The operator can then approve (or decline) the modification. Another Lambda function then makes the needed route table modifications for the transit account and for account A.

If you are interested, you can watch a video from August 2019: https://pages.awscloud.com/AWS-Transit-Gateway-Reference-Architectures-for-Many-Amazon-VPCs_2019_0811-NET_OD.html

If you want to wait, I’m pretty sure that this re:Invent talk was also recorded and can be found on the AWS YouTube channel in a few weeks: https://www.youtube.com/user/AmazonWebServices

Fortifying web apps against bots and scrapers with AWS WAF – SEC357

Mr. Yuri Duchovny (Solution architect, AWS) held the session. It was the most intensive session, with lots to do and many architectural examples and usage scenarios on the demo screen. The AWS WAF service has got a shiny new UI in the AWS Console. AWS has also published a few new features in the last few weeks, e.g. managed rules that give more protection in a non-disruptive way. Previously, WAF itself did not have many predefined rules for protection; only XSS (cross-site scripting) and SQLi (SQL injection) were supported. All other rules needed to be configured manually, e.g. as regular expressions.

WAF is a service that should always be turned on for CloudFront distributions, Application Load Balancers (ALB) and API Gateway.

The workshop material is again public and can be found here: https://github.com/gtaws/ProtectWithWAF

Encryption options for AWS Direct Connect – NET405

Mr. Sohaib Tahir (Sr Solutions Architect, AWS) from Seattle was the teacher in this session. It was more listening than doing because of the short time available. We (the attendees) were a group of seven from the USA, Japan and Finland.

There were five options for encrypting a Direct Connect connection:

1. Private VIF (virtual interface) + application-layer TLS
2. Private VIF + virtual VPN appliances (can be in transit VPC)
3. Private VIF + detached VGW + AWS Site-to-site VPN (CloudHub functionality)
4. Public VIF + AWS Virtual Private Gateway (BGP, IPSec tunnel, BGP)
5. Public VIF + AWS Transit Gateway (BGP, IPSec tunnel, BGP) NEW!

It’s good to remember that a single VPN connection has a 1.25 Gbps limit, which can easily be hit with a DX connection and e.g. data-intensive migration jobs. The AWS recommendation is to use architecture number five if possible. Using the fifth architecture requires having your own Direct Connect connection, so you cannot use a shared-model connection from a third-party operator.

AWS published cross-region VPC connectivity via Transit Gateway yesterday. During the session Mr. Tahir started to demonstrate this new feature ad hoc, but we ran out of time.

My Monday at AWS re:Invent 2019

I started the day with breakfast at Denny’s. It was nice to have a typical (I think) American breakfast. Thanks to Mr. Heikki Hämäläinen for the company. By the way, all attendees from Solita are wearing the bright red hoodies shown in the picture. Thanks to our Cloud Ambassador Anton Floor. The hoodie makes it a lot easier to spot a colleague in crowded places. Okay, let’s start going through my actual sessions.

How NextRoll leverages AWS Batch for daily business operations – CMP311

The advertising company’s Tech Lead, Mr. Roozbeh Zabihollahi, briefly described their journey with the AWS Batch service. If I remember correctly, they use about 5000 CPU years, which is a huge amount of computing power. It was nice to hear that NextRoll lets its teams choose quite freely which services they want to use. Nowadays Mr. Zabihollahi sees more and more teams looking into AWS Batch as a promising choice, rather than Hadoop or Spark.

Mr. Zabihollahi believes that AWS Batch is good for several things:

AWS Batch is good for

If you are considering starting to use AWS Batch, you should be familiar with at least these challenges:

Mr. Steve Kendrex (Sr. Technical Product Manager, AWS) presented the roadmap of the AWS Batch service. Support for Fargate (a.k.a. the serverless container service) is coming, but Steve could not provide details to a wide audience. My personal guess is that spot instance support for Fargate is coming soon, which would provide a key cost-efficiency factor for batch operations.

Build self-service registration with facial recognition – ARC320

My first builder session this year was about integrating facial recognition into registering guests for an event. Four other attendees and I were guided through this interesting topic by Mr. Alan Newcomer (Solutions Architect, AWS). Mr. Newcomer had previously lived near Las Vegas, which was interesting to hear about.

Each builder session starts with a short queue for the right table, at which you have hopefully reserved a spot beforehand:

The hall has multiple tables, each with 7 chairs (one for the teacher and 6 for participants) and a screen for guidance purposes.

Typically the teacher provides a website which has all the information required to do the exercise. In addition, the teacher provides a unique password for each participant, e.g. for the AWS Console login. After that each participant can start doing the exercise by themselves. The teacher provides help whenever needed. You need to keep a good pace all the time to be able to complete the whole exercise.

During the recognition session we built an application which had three main functionalities: user registration, RSVP one day before the event, and finally registering the user at the event via facial recognition. You can actually look up the workshop material yourself here: http://regappworkshop.com/

Managing DNS across hundreds of VPCs – NET411

This was my second chalk talk today. It started very well because right at the beginning audience heard real life problems from different attendees. The chalk talk was guided by Mr. Matt Johnson (Manager, Solutions Architecture, WWPS, AWS) and Mr. Gavin McCullagh (Principal System Development Engineer, AWS). They did extremely well.

It was reminded that the support for overlapping private zones was published recently. It enables autonomous structured dns management in multi-account environment.  For more information go to: https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-route-53-now-supports-overlapping-namespaces-for-private-hosted-zones/

During the session we looked up four different architecture for sharing DNS information with multiple VPCs (~accounts). The number four “Share and Associate Zones and Rules” was the most interesting which suites for massive number of private DNS zones and VPCs. It has hub account for outbound DNS traffic to corporate network and it uses private zone associating between VPC’s. The associating does not yet have native CloudFormation support but there are several ways to handle association, eg. using CloudFormation custom resources or custom Lamda function.

One major feature request was that AWS should support DNS query logging (query and response) in the VPC. The audience wanted to receive the logging information in CloudWatch log groups. The logging is needed for security/audit and debugging purposes.

Processing AWS Ground Station data in AWS – NET409

My second builder session sounded very fancy: handling data from satellites. The attendees had very different levels of experience with AWS and with satellites, everything from one to five in both topics. After the session I needed to update my CV…

There are a few open satellites in the sky. The AWS Ground Station service can listen to them and deliver the received data to your AWS account. The data link between the Ground Station and your VPC is made via an elastic network interface (ENI).

In the example case we received a 15 Mbps stream for 15 minutes, which was the period that the satellite was visible to the Ground Station’s antenna system. The stream from Ground Station always needs to be received by the Kratos DataDefender software, which parses the UDP traffic. The Ground Station traffic does not arrive in the right order and sometimes has missing pieces, which DataDefender handles.
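
To make the reordering problem concrete, here is a small Python sketch of putting out-of-order packets back in sequence and spotting the gaps. It only illustrates the general idea and is not how DataDefender works internally; the (sequence_number, payload) packet shape and the function name reorder_packets are assumptions made for the example.

```python
def reorder_packets(packets):
    """Sort (sequence_number, payload) pairs and report missing sequence numbers.

    Duplicates are dropped, keeping the first copy that arrived.
    """
    by_seq = {}
    for seq, payload in packets:
        by_seq.setdefault(seq, payload)

    if not by_seq:
        return [], []

    # Payloads in ascending sequence order.
    ordered = [by_seq[seq] for seq in sorted(by_seq)]

    # Any sequence number between the first and last seen that never arrived.
    first, last = min(by_seq), max(by_seq)
    missing = [seq for seq in range(first, last + 1) if seq not in by_seq]
    return ordered, missing


# Made-up packets: number 4 is lost and number 3 arrives twice.
received = [(3, b"c"), (1, b"a"), (5, b"e"), (2, b"b"), (3, b"c")]
ordered, missing = reorder_packets(received)
print(ordered)  # [b'a', b'b', b'c', b'e']
print(missing)  # [4]
```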

The data stream was analyzed in a few phases via an S3 bucket and EC2 instances. The final product was a precise TIFF format picture of the view from the satellite as it passed the Ground Station antenna. The resolution was about 1 megapixel per kilometer.

Nordics Customer Reception

The evening ended with the pleasant and well-organised Nordics Customer Reception event at the Barrymore. Solita was one of the sponsors of the event. From the terrace we had a great view towards the Encore hotel.

Would you like to hear more about what happens at re:Invent 2019? Sign up to our team’s WhatsApp group to chat with them, and register for the What happens in Vegas won’t stay in Vegas webinar to hear a full summary after the event.

AWS re:Inforce 2019 review

Looking at new releases, sessions which I attended, comparison to re:Invent and overall value of the first-ever re:Inforce.

This year I attended re:Inforce, the first incarnation of the AWS conference specializing in security. The conference was a two-day event at the Boston Convention & Exhibition Center.

New releases

Amazon VPC Traffic Mirroring was probably the biggest new release at the event, but it doesn’t touch my projects much. Still, if you have systems for analyzing network traffic, this could be useful.

AWS Security Hub and AWS Control Tower are generally available. I haven’t tested these much yet, but they were already announced at re:Invent.

Amazon EC2 Instance Connect was actually released after re:Inforce, but should have been released during the conference. It is a new way to connect to instances in case you don’t want to use Session Manager.

Attended sessions

Keynote by Stephen E. Schmidt, VP and CISO of AWS

Keynote

The keynote speakers were great and overall the whole keynote was good. I would have liked to see more new releases; now the main content was the importance of security and existing solutions.

GRC346 – DNS governance in multi-account and hybrid environments

Builder sessions were already available at last year’s re:Invent, but I didn’t get into any of them. DNS is not really my main focus area, but it is an interesting topic nonetheless. Still, I am probably leaving the setup of this to network specialists.

The setup was a bit of a letdown, because the room was quite noisy with multiple builder sessions going on at the same time and participants didn’t do much themselves. But it was very easy to ask questions and there was good discussion between the AWS architect and participants.

SEJ001 Security Jam

Hackathon / Jam was again a highlight of the event. Sadly, I arrived only just in time and hence didn’t have much time to talk with the team beforehand. The duration was actually 3.5 hours, which felt a little bit short.

We had a three-person team, but we didn’t really achieve much synergy. At first, we decided everyone would begin as a solo player and ask for help when needed. During the whole jam I worked with one teammate for only about an hour, and the third one worked solo the whole time. We did call out some questions and answers to each other from time to time, but there was very minimal teamwork.

One lesson that I relearned was to double-check everything. In one task, a private endpoint was needed for API Gateway. In security jams, some of the setup is already done for you. So, when I checked the list of private endpoints and there was one, I assumed that it was the correct one. But it was for AWS Systems Manager, and therefore I would have needed to add a new one.
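
A quick check along these lines would have caught the mistake. This is only a sketch: it assumes a list shaped like the "VpcEndpoints" list that the EC2 DescribeVpcEndpoints API returns, and the function name has_endpoint_for, the endpoint ID and the region are made up for the example.

```python
def has_endpoint_for(endpoints, service):
    """Return True if any VPC endpoint in the list serves the given AWS service.

    `endpoints` is a list of dicts with a "ServiceName" key such as
    "com.amazonaws.us-east-1.execute-api"; `service` is the short name,
    e.g. "execute-api" for API Gateway or "ssm" for Systems Manager.
    """
    suffix = "." + service
    return any(ep.get("ServiceName", "").endswith(suffix) for ep in endpoints)


# The situation in the jam: one endpoint exists, but for the wrong service.
existing = [
    {"VpcEndpointId": "vpce-0example", "ServiceName": "com.amazonaws.us-east-1.ssm"},
]
print(has_endpoint_for(existing, "ssm"))          # True
print(has_endpoint_for(existing, "execute-api"))  # False
```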

AWS has been improving the platform so that companies can request access to jams led either by an AWS architect or by the company’s own personnel. I am going to look into this and maybe hold an internal jam. But the cost was unclear, and the number of interested colleagues was low the last time I tried to hold a GameDay.

Other sessions:

I also attended one lecture-type session, one workshop and a couple of chalk talks. To keep the length of the post manageable, I will skip them. But feel free to ask about them.

Other activities

Security Hub (Expo)

Expo floor

Many of the partner solutions were of course about web firewalls etc., which aren’t the main interest for a data developer/architect. But there were also companies focusing on data masking, encryption and audit trails. I have received many follow-up emails and phone calls after the event. Luckily, many of them are interesting even though I might not have a use case right away.

Networking

There were multiple unofficial after-parties on Tuesday evening, but I only attended the Nordic gathering sponsored by AWS. It was quite a small gathering, which made discussion easier. Most of the evening there were two people from Sweden and one from Finland at my table, but a couple of others visited.

The closing reception was really informal, with food, drinks and games inside the conference center and outside on a lawn. Very nice, but not the best setup for me to network. I did exchange a couple of words with the people I met at the Nordic gathering.

Comparison to re:Invent

From my point of view, one major reason for attending re:Invent is to be able to hear and question AWS about brand new systems that only selected companies have been able to test under strict NDAs. Even if they aren’t available in Europe right away, you know the basic capabilities of the system and can plan for the future. And usually the technical people giving workshops and sessions give much more honest opinions compared to the marketing material released about the service. This was mostly missing from re:Inforce, because only Amazon VPC Traffic Mirroring was a completely new service.

A good thing was that having everything in one place made logistics much easier and there wasn’t so much moving around. The expo was also much smaller and interesting companies were easier to find.

Re:Invent has four main days and two shorter ones. Compared to that, two days of re:Inforce is quite a short time. You don’t get familiar with the location, which would make moving around faster, and you don’t have time to reorganize your calendar if you would like to learn more about a certain topic. Also, from a traveling perspective, the travel vs conference ratio is much worse with re:Inforce.

Summary

My first feeling after the conference was that it was ok, but it has risen to a good level after thinking about it more objectively. The first impression came mostly because I was automatically comparing re:Inforce to re:Invent. In that comparison re:Inforce is lacking in multiple areas. But if we look at re:Inforce objectively, there was quite a lot to learn and a chance to meet new AWS users. And for some, the shorter length and cheaper tickets might make it possible to attend where re:Invent isn’t a possibility.

If attending again, I should keep more free time in the calendar and participate in the background events like the ongoing security jam and capture the flag. I should also plan more beforehand, because with the conference being only two days there really isn’t much time to reorganize days during the event.

The next re:Inforce will be in Houston, but the feedback form had a question about holding re:Inforce outside the USA. So, there might be hope for one in Europe at some time in the future.

Additional reading

Got a laptop case with badges from AWS booths.

Kicking the tires of AWS Textract

Amazon Web Services' new ML/AI service Amazon Textract became generally available and I gave it a quick test.

AWS has multiple services in the AI/ML field. These include, for example, Amazon Comprehend for text analysis, Amazon Forecast for predicting the future from a set of data and Amazon Rekognition for extracting information from pictures. Amazon Textract is a new service in this field and it was just announced to be generally available. Textract is a service which does Optical Character Recognition (OCR) on multiple file formats and returns the output in a more usable JSON format.

At the time of release, Amazon Textract can detect Latin-script characters from the standard English alphabet and ASCII symbols. It can use PNG, JPEG and PDF as input files. I would say that there are enough input formats, but I would have wanted to see more languages available. Of course Finnish is not something that I expect to see anytime soon, or at all. Textract is now available in three regions in the US and in Ireland in Europe.

Analyse test

Textract allows one to easily test what kind of results they can get with it. One can open the Textract service and first see a sample document created by AWS. This helps to get started and to get some idea of how to use it. Documents can be uploaded directly from the console, which automatically creates an S3 bucket to store them.

Textract sample document

 

I did tests with multiple files and file formats to see how it performs, but used one PDF document as an example for this post. The PDF I used was the AWS Landing Zone immersion day information sheet, because it was handily available and had text, a table and an image in it. On the left in the picture, we can again see the areas where Textract has identified content, and on the right is the extraction. From this kind of clear and simple document it seems to have picked up everything easily. It took around 10 seconds for this document to be analysed.

Test document

 

I would say that Textract handled all the files I gave it without much trouble. The view of the file and the places where it finds text do not always align, even though the text output is correct. This happened, for example, with my CV, where the visual representation was off in many places.

Visual analyse sample

Results

Outputs can also be downloaded directly from the console as a zip file, which contains these four files:

  • apiResponse.json
  • tables.csv
  • keyValues.csv
  • rawText.txt

tables.csv, keyValues.csv and rawText.txt are all quite clear. tables.csv holds all the tables and fields Textract found in the document and keyValues.csv holds form data. This is the table that was found in the document. It has been correctly read and put in a table. Interestingly, Textract has also added empty columns for the long empty spaces between texts.

Test document table

 

rawText.txt contains the extracted text from the document in a raw format. It has all the text in unedited form, all the words just one after another.

H Automated Landing Zone Immersion Day Please join the AWS Nordics Partner team for an immersion day for the Automated Landing Zone. Learn how to set up an account structure according to best practices with the help of the ALZ solution. After you have performed this training, you will get access to the ALZ solution tools and materials sO you can use when setting up customer environments. This training will also be helpful for those of you interested in the AWS Control Tower service that will be available later this year. WHEN: April 1st 2019 (no joke) WHERE: AWS Office at Kungsgatan 49 in Stockholm Preliminary agenda 10:00 10:30 Welcome and Registration 10:30 10:40………

Textract also gives a full output of the process. This information is in JSON format and contains all the information about the findings. There is detailed information about what was found and where. It also gives a confidence percentage for each finding. This is a very large JSON document even with a small PDF, almost as big a file as the original PDF.

    {
      "BlockType": "WORD",
      "Confidence": 99.962646484375,
      "Text": "account",
      "Geometry": {
        "BoundingBox": {
          "Width": 0.0724315419793129,
          "Height": 0.012798813171684742,
          "Left": 0.448628693819046,
          "Top": 0.37925970554351807
        },
        "Polygon": [
          {
            "X": 0.448628693819046,
            "Y": 0.37925970554351807
          },
          {
            "X": 0.5210602283477783,
            "Y": 0.37925970554351807
          },
          {
            "X": 0.5210602283477783,
            "Y": 0.39205852150917053
          },
          {
            "X": 0.448628693819046,
            "Y": 0.39205852150917053
          }
        ]
      },
      "Id": "f1c9bdeb-f76a-44ff-8037-6cb746d5613d",
      "Page": 1
    },
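
To work with this output programmatically, a short Python sketch like the following could pull out the detected words and flag the low-confidence ones. It assumes the response has a top-level "Blocks" list whose items look like the block above. The function name words_from_blocks and the 90 percent threshold are my own choices for illustration, and the second and third sample blocks are made up.

```python
import json


def words_from_blocks(response, min_confidence=90.0):
    """Split WORD blocks into confident and uncertain (text, confidence) pairs."""
    confident, uncertain = [], []
    for block in response.get("Blocks", []):
        if block.get("BlockType") != "WORD":
            continue  # skip PAGE, LINE and other block types
        item = (block["Text"], block["Confidence"])
        if item[1] >= min_confidence:
            confident.append(item)
        else:
            uncertain.append(item)
    return confident, uncertain


# First block is from the output above; the other two are invented test data.
sample = json.loads("""
{"Blocks": [
  {"BlockType": "WORD", "Confidence": 99.962646484375, "Text": "account", "Page": 1},
  {"BlockType": "WORD", "Confidence": 71.5, "Text": "sO", "Page": 1},
  {"BlockType": "LINE", "Confidence": 98.0, "Text": "account sO", "Page": 1}
]}
""")
confident, uncertain = words_from_blocks(sample)
print(confident)  # [('account', 99.962646484375)]
print(uncertain)  # [('sO', 71.5)]
```

With a real result, the sample would be replaced by loading apiResponse.json from the downloaded zip file.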

 

Conclusion

Textract is a needed addition to the AWS AI/ML service family and fills a gap in the analysis tools. Textract promises to read English from multiple file formats and seems to do that well. All tests with PDFs and pictures were successful. Of course one wouldn’t normally use the service like this, uploading single files manually. Textract is supported in the AWS CLI and in both the Java and Python SDKs. That makes it possible to have, for example, automatic triggers on an S3 bucket that launch Textract to do its thing when new files are uploaded. Overall a nice service which will probably be very useful for text analysis use cases.
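
As a sketch of the S3 trigger idea, the Python function below is shaped like an AWS Lambda handler that reacts to an S3 upload event and sends the new object to Textract. It assumes boto3 is available and uses its detect_document_text call. The handler name and the client parameter, which exists only so the function can be tried without AWS access, are my own additions.

```python
from urllib.parse import unquote_plus


def handler(event, context=None, client=None):
    """Run Textract text detection on every object named in an S3 event."""
    if client is None:
        import boto3  # imported lazily so the function can be tested without AWS

        client = boto3.client("textract")

    results = {}
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = client.detect_document_text(
            Document={"S3Object": {"Bucket": bucket, "Name": key}}
        )
        # Keep only the detected lines of text for each uploaded file.
        results[key] = [
            block["Text"]
            for block in response.get("Blocks", [])
            if block.get("BlockType") == "LINE"
        ]
    return results
```

Note that this synchronous call is meant for single images; multi-page PDFs go through the asynchronous start_document_text_detection API instead.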

No public cloud? Then kiss AI goodbye

What’s the crucial enabling factor that’s often missing from the debate about the myriad uses of AI? The fact that there is no AI without a proper backend for data (cloud data warehouses/data lakes) or without pre-built components. Examples of the latter are Cloud Machine Learning (ML) in Google Cloud Platform (GCP) and SageMaker in Amazon Web Services (AWS). In this cloud blog I will explain why public cloud offers the optimum solution for machine learning (ML) and AI environments.

Why is public cloud essential to AI/ML projects?

  • AWS, Microsoft Azure and GCP offer plenty of pre-built machine learning components. This helps projects to build AI/ML solutions without requiring a deep understanding of ML theory, knowledge of AI or PhD level data scientists.
  • Public cloud is built for workloads which need peaking CPU/IO performance. This lets you pay for an unlimited amount of computing power on a per-minute basis instead of investing millions into your own data centres.
  • Rapid innovation/prototyping is possible using public cloud – you can test and deploy early and scale up in production if needed.

Public cloud: the superpower of AI

Across many types of projects, AI capabilities are being democratised. Public cloud vendors deliver products, like SageMaker or Cloud ML, that allow you to build AI capabilities for your products without a deep theoretical understanding. This means that soon a shortage of AI/ML scientists won’t be your biggest challenge. Projects can use existing AI tools to build world-class solutions for areas such as customer support, fraud detection, and business intelligence.

My recommendation is that you should head towards data enablement. First invest in data pipelines, data quality, integrations, and cloud-based data warehouses/data lakes. So rather than using over-skilled AI/ML scientists, build up the essential twin pillars: cloud ops and a skilled team of data engineers.

Enablement – not enforcement

In my experience, many organisations have been struggling to transition to public cloud due to data confidentiality and classification issues. Business units have been driving the adoption of modern AI-based technology. IT organisations have been pushing back due to security concerns.  After plenty of heated debate we have been able to find a way forward. The benefits of using public cloud components in advanced data processing have been so huge that IT has to find ways to enable the use of public cloud.

The solution for this challenge has proven to be proper data classification and the use of private on-premises facilities to support operations in public cloud. Data location should be defined based on the data classification. Solita has been building secure but flexible automated cloud governance controls. These enable business requests but keep the control in your hands, as well as meeting the requirements usually defined by a company’s chief information security officer (CISO). Modern cloud governance is built on automation and enablement – rather than enforcing policies.

Conclusion

  • The pathway to effective AI adoption usually begins by kickstarting or boosting the public cloud journey and competence within the company.
  • Our recommendation – the public cloud journey should start with proper analyses and planning.
  • Solita is able to help with data confidentiality issues: classification, hybrid/private cloud usage and transformation.
  • Build cloud governance based on enablement and automation rather than enforcement.

AWS Summit Berlin 2019

My thoughts on the Berlin AWS Summit 2019

What is an AWS Summit?

AWS Summits are small, free events that happen in various cities around the world. They are “satellite” events of re:Invent, which takes place in Las Vegas every year in November. If you cannot attend re:Invent, you should definitely try to attend an AWS Summit.

Berlin AWS Summit

I have had the pleasure of attending the Berlin AWS Summit for 4 years in a row.

Werner Vogels

The event was a two-day event held on 26-27 February 2019 in Berlin. The first day was focused more on management and new cloud users, and the second day had more deep-dive technical sessions. The event started with a keynote held by Werner Vogels, CTO of Amazon. This year the Berlin AWS Summit seemed to be very focused on topics around Machine Learning and AI. I also think there were more people attending this year compared to 2018 or 2017.

You will always find other sessions that are interesting to you, even if ML and AI are currently not on your radar. For example, I attended the session about “Observability for Modern Applications”, which showed how to use AWS X-Ray and App Mesh to monitor and control large-scale microservices running in Amazon EKS or similar. App Mesh is currently in public preview and it looks very interesting!

The partners

Every year there are a lot of stands by various partners showcasing their products to passers-by. You can also participate in raffles at the cost of your email address (and the obvious marketing emails that will ensue). Most of them will also hand out free swag, such as stickers or pens.

Solita Oy is an AWS Partner, please check our qualifications on the AWS Partners page.

Differences to previous years

This year there was no AWS Certified lounge which was a surprise to me. It is a restricted area for people who have an active AWS Certification where they can network with other certified people. I hope it will return next year again.

 

Thank you for the event!

Thank you and goodbye