I was recently helping a company work through a set of white hat penetration test results for a legacy web application. Among other things, there was a note that while the web application did support TLS 1.2, it was still accepting connections via TLS 1.0/1.1. As to why this is a security risk, here is a great article on the subject from DigiCert: https://www.digicert.com/blog/depreciating-tls-1-0-and-1-1. But in short, most modern web applications should only accept TLS 1.2 (or newer), and should actively reject TLS 1.1 and 1.0.

Luckily, if we are using Azure App Services, this is a fairly trivial fix with no (or minimal) code changes!

Setting The Minimum TLS Version Via Portal

If we are looking to set the minimum TLS version via the portal, we first have to open up our App Service, and look down the left hand menu for TLS/SSL settings.

On this screen we can edit the TLS minimum version, which should really be 1.2 at all times.

And we are done! A very easy setting to change that adds a tonne of security benefits.

Setting The Minimum TLS Version Via ARM Template

While editing this setting via the portal is great, chances are you have an ARM template that you use to automate your deployments.

While it’s hard to show the full ARM template here as it’s rather verbose, inside your template you likely already have a Microsoft.Web/sites/config element, and inside that properties. Adding a minTlsVersion property will allow you to set the minimum TLS version of your web application.

{
	"type": "Microsoft.Web/sites/config",
	"name": "myAppServiceName/web",
	"apiVersion": "2018-11-01",
	[...]
	"properties": {
		[...]
		"minTlsVersion": "1.2"
	}
}
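If you maintain several templates, the same change can be scripted. The following is a minimal Python sketch rather than anything from the original template: the helper name `set_min_tls` and the trimmed sample template are hypothetical, and it assumes the template has already been loaded into a dict (for example with `json.load`). It only looks at top-level resources, so nested child resources would need extra handling.

```python
import json

CONFIG_TYPE = "Microsoft.Web/sites/config"


def set_min_tls(template: dict, version: str = "1.2") -> int:
    """Set properties.minTlsVersion on every top-level sites/config resource.

    Returns the number of resources that were actually changed.
    """
    changed = 0
    for resource in template.get("resources", []):
        if resource.get("type") != CONFIG_TYPE:
            continue
        properties = resource.setdefault("properties", {})
        if properties.get("minTlsVersion") != version:
            properties["minTlsVersion"] = version
            changed += 1
    return changed


# Hypothetical, heavily trimmed template used only for illustration.
sample_template = {
    "resources": [
        {
            "type": "Microsoft.Web/sites/config",
            "name": "myAppServiceName/web",
            "apiVersion": "2018-11-01",
            "properties": {"minTlsVersion": "1.0"},
        }
    ]
}

if __name__ == "__main__":
    print(set_min_tls(sample_template), "resource(s) updated")
    print(json.dumps(sample_template, indent=2))
```

Returning the number of changed resources makes it easy to tell whether a template was already compliant.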

Default TLS Version

It’s important to note that this is mostly a legacy issue. If you created a fresh Azure App Service any time after June 2018, the default minimum TLS version was automatically set to 1.2. However, existing App Services are left unchanged, so you may have to do a quick pass over all existing services and upgrade them.

Additionally, if you for whatever reason did need to support TLS 1.0 (Which you really really shouldn’t!), then you would need to downgrade this setting on any new services created.
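For that pass over existing services, a small script can flag which ones still sit below 1.2. This is only a sketch: the input shape (a list of dicts with `name` and `minTlsVersion` keys) is an assumption made for illustration, so you would need to map whatever your own inventory export looks like onto it.

```python
def parse_tls_version(value: str) -> tuple:
    """Turn a version string such as '1.2' into a comparable tuple (1, 2)."""
    major, minor = value.split(".")
    return (int(major), int(minor))


def apps_below_minimum(apps, minimum: str = "1.2") -> list:
    """Return the names of apps whose minimum TLS version is lower than `minimum`."""
    floor = parse_tls_version(minimum)
    return [
        app["name"]
        for app in apps
        if parse_tls_version(app["minTlsVersion"]) < floor
    ]


# Hypothetical inventory, shaped for this example only.
inventory = [
    {"name": "legacy-api", "minTlsVersion": "1.0"},
    {"name": "new-site", "minTlsVersion": "1.2"},
    {"name": "old-admin", "minTlsVersion": "1.1"},
]

if __name__ == "__main__":
    print(apps_below_minimum(inventory))
```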

I’ve recently been reviewing content and training material to help both technical and non-technical people alike pass their AZ-900 Azure Fundamentals exam. The official exam description is :

The exam is intended for candidates who are just beginning to work with cloud-based solutions and services or are new to Azure.

And :

Azure Fundamentals exam is an opportunity to prove knowledge of cloud concepts, Azure services, Azure workloads, security and privacy in Azure, as well as Azure pricing and support. Candidates should be familiar with the general technology concepts, including concepts of networking, storage, compute, application support, and application development.

But what I really found was that the AZ-900 exam is all about your fundamental knowledge of cloud in general, plus a couple of services that you will almost always be using no matter which cloud setup you go for.

As an example, you should have a broad base of knowledge on cloud computing to answer questions like “What is a hybrid cloud model?” or “What is the difference between SaaS, PaaS and IaaS?”, but also still be able to handle left field questions such as “What are Azure Availability Zones?” or “Which feature of Azure Active Directory will require users to have their mobile phone in order to be able to log in?”.

Essentially, it’s a very broad intro to Azure for anyone to get the gist of what cloud computing is about, and some general Azure terms and concepts. If you are using Azure in *any* capacity, whether it be sales, software development, infrastructure, security engineer, even a business analyst, it’s a worthwhile exam to take.

Exam Courses / Guides For AZ-900

There are really only two main sources of material for the AZ-900 that I recommend.

The first is actually the official Microsoft learning pathways for Azure Fundamentals.

It’s fairly verbose but it’s extremely content rich. In total, you are looking at 15 and a half hours worth of videos and reading material to get you prepared for the exam. Personally, I think it’s a bit overkill and if you have a bit of experience with Azure and Cloud already, it may be a bit of a slog to get through all 15 hours. But on the flip side, you’re going to be getting a real in-depth look at all things Azure and basically not have to go anywhere else for exam prep. And, it’s free, so no complaints there.

My second recommendation is Scott Duffy’s AZ-900 Microsoft Azure Fundamentals Exam Prep video course available on Udemy.

This course is a much more concise and chopped down version, clocking in at just over 5 hours. It also comes with a 50 question practice test, and you can download the audio if you’re on the go as well. There is a cost associated with the course, but like everything on Udemy, you can usually pick it up on sale for $12 or so. It’s had 125k+ students through the door so it must be doing something right! The other benefit is that if you use Scott Duffy for other Microsoft exam prep, he often sends out discount codes for his other courses. So if you are interested in, say, the Scott Duffy course for the Azure Architecture exam, you can start with him on the fundamentals and use him all the way through your learning path.

Practice Exams For AZ-900

Microsoft actually offer a paid practice exam through Mindhub. The only issue is, the practice exam is (depending on your currency) more than the actual exam itself. While the actual exam may cost somewhere in the vicinity of $99USD, the practice exam for a 60 day period is $109USD. In my view, it’s just not worth it. Specifically for the AZ-900 exam, you really don’t need to be cramming for it. It really is a fundamental exam and so spending a tonne of money on practice questions just isn’t needed.

Again, Scott Duffy also has an AZ-900 practice exam on Udemy. This is much more reasonably priced, and usually you can get a bundle deal if you buy the course with the practice questions. I think some people have a worry that buying a third party practice exam won’t be close to the real thing, but if anything, I found people complaining a lot more about the MindHub practice exams than any others. The thing is as well, you can usually get these for around $10, and it comes with 150 practice questions. Overall, these should point you in the right direction of where you need to study more, and for very little cost.

Areas For Free Points

Any time I do an exam, I look for areas that are very easy to memorize, and yet will always come up in the exam. Many, many years ago, I remember doing the CompTIA A+ exam, which involved memorizing a set of 15 IRQ numbers and their corresponding devices. Just a list of 15 numbers. And in every single exam there were at least 2 questions asking “What is IRQ number 2?”, for example.

So with that in mind, when I was looking at AZ-900 and doing practice exams, I looked for areas where I knew I could easily memorize the answers, and that were almost certain to come up in the exam :

  • Learn about availability sets, regions, update domains, fault domains, and in general know how Azure offers high availability.
  • Know the different levels of feature release (Private preview, Public Preview, General Availability) and their relevant SLAs.
  • Understand how Microsoft offers support and when.
  • Understand the benefits of cloud hosting in general. There is almost always a question about cloud computing being “elastic” or questions about scaling.
  • Understand the different network security offerings in Azure, for example Firewall, DDoS Protection, NSG etc, and when you would use them.
  • Try and understand when you would use Cosmos, Data Lake, SQL Data Warehouse, Azure SQL, Blob Storage or any other storage mechanism that comes up. You will almost certainly be asked something about which is the right data storage mechanism.
  • 100% know the difference between hot and cold storage in Azure Blob, you will always be asked something about this.
  • Read a bit on Azure AD. It’s kind of a broad topic but in general as long as you understand what it does, and what it can offer (For example MFA), you should be good.

Overall, the exam is honestly pretty straightforward. There are always curly ones like “What is the maximum number of VMs allowed in a scale set?”, but overall from the list above, you will get one or two questions in each of those areas. For example, simply memorizing the difference between hot and cold storage in Azure Blob Storage gets you an extra mark for about 10 minutes of reading.

Braindumps For AZ-900

I don’t think it would be a tips post for a Microsoft Exam without mentioning the infamous brain dumps. So let’s get this out of the way early. Braindumps are just cheating the exam in a different way. If you’ve never heard of braindumps, it’s essentially buying the questions for the real exam, and studying to be able to answer those specific questions only.

Yes you can use TestKing. Yes you can use Pass4Sure. But let me just say this. If you need braindumps to pass AZ-900, just find a different job. It will be easier that way.

You honestly should not need a braindump for any fundamentals course. I know some people try and justify using exam dumps sometimes by saying that the exam questions are worded poorly. Or the training material is garbage and there is no way you could pass the exam without a braindump. But AZ-900 is not that. It’s a very straight forward simple exam that as long as you do any amount of study, or have been working in IT using Azure (Or any cloud) for any length of time, you can pass. You do not need braindumps for this exam.

This is going to be a nice short and sweet article, but one that I’ve felt the need to write for a while.

Many times when I’ve built small internal applications, hosted in Azure, there has been a need for some level of authentication. Pretty often, this is going to be against the client’s Azure AD instance. I’ve seen people tie themselves up in knots trying to use the “Microsoft Authentication Library (MSAL)” inside their code. Often this comes with many code changes, configuration, and banging your head along the way.

Sometimes that headache is unavoidable, but other times for your simple 5 page website, it’s just way overkill.

Did you know that you can set up Azure Authentication across a web application, from inside the Azure Portal, without any code changes whatsoever? It’s really simple! Simply navigate to your Azure App Service and select “Authentication” under settings on the left hand menu.

Next, add an identity provider. As noted in the screenshot below, this can be Microsoft (Active Directory), Facebook, Google or Twitter.

The settings are fairly self-explanatory and work much like how you would set up app registrations within Active Directory normally (but this time it’s mostly done for you).

Once added, any access to your application will be forced to authenticate with your chosen identity provider.

Now you’re probably asking, what if I want to limit access to certain groups or users? Unfortunately Azure App Service only provides an “Authentication” service, it does not provide an “Authorization” service. So even though it can force users to log in, it simply passes those claims through to your application, which then has to validate whether a user should or should not be able to access that page.

What this means in practice is that if you need a complicated setup of roles and permissions, maybe the built in Authentication with Azure App Service isn’t right for you (although it is definitely doable), as it somewhat disconnects the authentication and authorization pieces. However, what I’ve found is that for small internal applications where we simply want to say “Anyone in the org can use this, but not the public”, this is a great little way of achieving that with zero code changes.
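If you do end up checking claims yourself, the sketch below shows roughly what that validation could look like. It assumes the claims arrive in the `X-MS-CLIENT-PRINCIPAL` request header as base64-encoded JSON containing a `claims` list of `typ`/`val` pairs, which is my understanding of how App Service passes them through. The `groups` claim type and the group id used here are placeholders, so check what your own identity provider actually emits.

```python
import base64
import json


def decode_client_principal(header_value: str) -> dict:
    """Decode the base64-encoded JSON principal passed through by App Service."""
    return json.loads(base64.b64decode(header_value).decode("utf-8"))


def has_claim(principal: dict, claim_type: str, expected_value: str) -> bool:
    """Return True if the principal carries a claim with this type and value."""
    return any(
        claim.get("typ") == claim_type and claim.get("val") == expected_value
        for claim in principal.get("claims", [])
    )


# Hypothetical principal, encoded the same way so the example is self-contained.
example_principal = {
    "auth_typ": "aad",
    "claims": [
        {"typ": "name", "val": "Jane Doe"},
        {"typ": "groups", "val": "placeholder-group-id"},
    ],
}
example_header = base64.b64encode(
    json.dumps(example_principal).encode("utf-8")
).decode("ascii")

if __name__ == "__main__":
    principal = decode_client_principal(example_header)
    print(has_claim(principal, "groups", "placeholder-group-id"))
```

In a real application you would read the header from the incoming request and return a 403 when the expected claim is missing.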

Apologies for the absolute word soup of a title above, but I wasn’t sure how else to describe it! So instead, let me explain the problem I’ve been trying to solve lately.

Like many organizations using Azure Devops, we are slowly switching our pipelines to use YAML instead of the GUI editor. As part of this, I’ve been investigating the best way to conditionally deploy our CI build to environments. Notably, I want our CI build to run for every check in, on every branch, but only move to the “release” stage if we are building the develop branch and/or the main trunk. As we’ll find out later, there also needs to be an override mechanism for this because while it’s a general rule, it’s also something that may need to be flexed at times.

YAML pipelines documentation can be a bit shaky at times, so most of this came from trial and error, but here’s what I found to solve the problem of conditionally deploying Azure Pipelines based on a branch.

Using Environments

Your first option is to use Environments inside Azure Devops. You can add an “Approval and Check” to an environment, and then select Branch Control.

You can then specify a comma separated list of branches that are allowed to pass this environment gate, and be deployed to the environment :

But here’s the problem I had with this approach. Environment gates such as the above are not in source control. Meaning that it’s hard to roll out across multiple projects at once (Compared to copy and pasting a YAML file). Now that’s a small issue, but the next one is a big one for me.

These checks based on a branch actually *fail* the build, they don’t “skip” it. So for example, if a branch does not match the correct pattern, you will see this :

This can be incredibly frustrating on some screens because it’s unclear whether your build/release pipeline actually failed, or it just failed the “check”. There is also no way to override this check on an adhoc basis. Maybe that’s something you desire, but there are rare cases where I actually need to deploy a feature branch to an environment to test something, and going through the GUI to disable branch control, release, then add it back just doesn’t make sense.

Using Pipeline Variables

A more explicit way I found to control this was to use variables inside the YAML itself. For example, every project of mine currently has the following variables utilized :

variables:
  isPullRequest: $[eq(variables['Build.Reason'], 'PullRequest')]
  isDevelopment: $[eq(variables['Build.SourceBranch'], 'refs/heads/develop')]

Now anywhere in my pipeline, I can use either variables.isPullRequest or variables.isDevelopment and make conditional deployments based on these. For example, I can edit my YAML to read like so for releasing to my development environment :

- stage: Development
  condition: and(succeeded(), eq(variables.isDevelopment, true))

This basically says that the previous stages must have succeeded *and* we must be building the develop branch. When these conditions are not met, instead of a failure we see :

This is so much nicer than a failure and actually makes more sense given the context we are adding these gates. I don’t want the CI build to “fail”, I just want it to skip being released.

Adding Overrides

Remember how earlier I said that on occasion, we may want to deploy a feature branch even though it’s not in develop? Well we can actually add an override variable that when set, will push through the release.

First we must go to our YAML pipeline in Azure Devops, and edit it. Up the top right, you should see a button labelled Variables. Click it and add a new variable called “forceRelease” like so :

Unfortunately, we have to do this via the GUI for every build we wish to add this variable. At this time, there is no way to add it in the YAML and have Azure Devops recognize it (But there is hope for the future!).

In our YAML, we don’t need to declare the variable ourselves, instead it’s just available for use immediately. We can just modify our Development stage to look like so :

- stage: Development
  condition: and(succeeded(), or(eq(variables.isDevelopment, true), eq(variables.forceRelease, 'true')))

Now we are saying, if the branch is development or the variable forceRelease is set to true, then push through the release. If we try and kick off a build, we can now set the runtime variable at build time to push things through, no matter the branch.
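To sanity check the logic of that condition outside of a pipeline, it can help to model it as a plain function. This Python sketch is not Azure DevOps code, and the function names are made up for illustration. It simply mirrors the and/or structure of the YAML condition, and it only treats the exact string 'true' as the override being set.

```python
def is_development(source_branch: str) -> bool:
    """Mirror of the isDevelopment variable: true only for the develop branch."""
    return source_branch == "refs/heads/develop"


def development_stage_runs(
    previous_succeeded: bool,
    source_branch: str,
    force_release: str = "",
) -> bool:
    """Model of: and(succeeded(), or(isDevelopment, forceRelease == 'true'))."""
    forced = force_release == "true"
    return previous_succeeded and (is_development(source_branch) or forced)


if __name__ == "__main__":
    cases = [
        (True, "refs/heads/develop", ""),
        (True, "refs/heads/feature/my-feature", ""),
        (True, "refs/heads/feature/my-feature", "true"),
        (False, "refs/heads/develop", "true"),
    ]
    for succeeded, branch, force in cases:
        outcome = "run" if development_stage_runs(succeeded, branch, force) else "skip"
        print(succeeded, branch, repr(force), "->", outcome)
```

Note the last case: the override never rescues a build whose earlier stages failed, because succeeded() sits outside the or().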

Back in the day, Microsoft SQL Server Tuning Wizard along with the SQL Server Profiler was the best way to track performance of SQL queries. In production, you might even add in custom perfmon metrics to the mix. But these days, Azure SQL has you covered with an extremely powerful query performance insights tool that does all of the heavy lifting for you.

Accessing Query Performance Insights

On an Azure SQL Database, simply access the Query Performance Insight tool under the Intelligent Performance sub-heading. Note that this is at the database level, not the server level. While some metrics (Such as DTU/CPU) can be tracked at the server level, when looking at individual queries, we have to look at each database individually.

From here, we can access :

  • Resource Consuming Queries – These are queries that cost the most resource (CPU, Data) as a *sum* of all queries. That means even if a query is performant, but is executed often, it may appear in this list.
  • Long Running Queries – These are queries that take the most time to execute, but again are the *sum* of all queries. So even if a query returns fast, if it’s called often, it will appear in this list.
  • Custom – This is where we can create custom reports to better drill down into poorly performing SQL queries. This is generally our best bet at finding bad queries.
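To make the “sum” behaviour of the first two reports concrete, here is a small Python sketch. The execution samples are invented and this is not how Azure computes its reports internally. It just shows how a fast query that runs constantly can outrank a genuinely slow query when you sort by total duration.

```python
from collections import defaultdict


def total_duration_by_query(executions) -> dict:
    """Sum the duration of every execution, grouped by query id."""
    totals = defaultdict(float)
    for query_id, duration_ms in executions:
        totals[query_id] += duration_ms
    return dict(totals)


def top_by_total(executions, limit: int = 5) -> list:
    """Query ids ordered by total duration, highest first."""
    totals = total_duration_by_query(executions)
    return sorted(totals, key=totals.get, reverse=True)[:limit]


# Invented samples: a 10 ms lookup run 1000 times and a 2000 ms report run twice.
samples = [("fast_lookup", 10.0)] * 1000 + [("weekly_report", 2000.0)] * 2

if __name__ == "__main__":
    print(total_duration_by_query(samples))
    print(top_by_total(samples))
```

Here the lookup totals 10,000 ms against the report’s 4,000 ms, so the “performant” query tops the list.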

Selecting any query allows you to view the actual query text :

As well as the average CPU, Data, Duration and execution count over the time period :

Importantly, there is also a chart below which allows you to track the same metrics at hourly intervals. This can help you pinpoint certain times of day that may be more problematic for certain SQL queries :

Overall, utilizing this data can go a long way to giving you very simple metrics to act upon, all with very digestible queries, charts, and graphs.

The thing to note with all of these graphs, is that there isn’t one single metric that will be able to tell you the exact performance issues with your application. For example, a SQL query may run 100 times across 100 different users in your application, but is only non-performant on a single user (Maybe they have far more data than all the others). If you look at the average of all of these queries, it may look perfectly fine, whereas sorting by “max” may pinpoint that at times, this query is non performant.
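A quick worked example of why the average can hide a problem, using made-up numbers: 99 executions at 20 ms and a single execution at 5000 ms for the one user with far more data.

```python
from statistics import mean


def summarize(durations_ms) -> dict:
    """Return the execution count, average and maximum duration."""
    return {
        "count": len(durations_ms),
        "avg": mean(durations_ms),
        "max": max(durations_ms),
    }


# Invented durations: 99 fast executions and one very slow one.
durations = [20] * 99 + [5000]
summary = summarize(durations)

if __name__ == "__main__":
    print(summary)
```

The average works out to just under 70 ms, which looks healthy, while the max of 5000 ms is what that one user actually experienced.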

Custom Queries To Utilize

Earlier, we talked a little bit about how using Custom queries were the best way to diagnose performance issues. Here’s some of the queries that I utilize to better understand the performance of my Azure SQL Databases, and what I’m looking for when running them.

Execution Count Metrics

I utilize the Execution Count metric to understand if there are additional caching needs for my application. A good example is if every page load requires you to return how many “unread notifications” a user has in your system. Or maybe every page load, we check the current logged in user in the database.

For the former (notifications), maybe we can cache this value so we don’t hit the database so often for something that isn’t *too* important. For example, if a user gets a notification, does their notification count really need to increase in real time, or is it OK to be cached every 30 seconds?

For the latter, sometimes there isn’t anything you can do. Checking whether someone’s JWT/Authentication Cookie corresponds with a valid user in the database is probably unavoidable.

But what I try to look for is outliers and things that really don’t need to be happening in real time.
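For the notification count case, the caching could be as simple as a small time-based cache in front of the database call. This is a minimal in-memory sketch: the class name and the loader function are hypothetical, and a real application would more likely reach for whatever caching layer it already has.

```python
import time


class TtlCache:
    """Tiny in-memory cache that reloads a value once it is older than the TTL."""

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (time stored, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh, skip the database
        value = loader()
        self._entries[key] = (now, value)
        return value


def load_unread_count_from_database() -> int:
    """Hypothetical stand-in for the real query."""
    return 3


if __name__ == "__main__":
    cache = TtlCache(ttl_seconds=30.0)
    print(cache.get_or_load("user-1", load_unread_count_from_database))
    # A second call within 30 seconds is served from the cache.
    print(cache.get_or_load("user-1", load_unread_count_from_database))
```

Passing the clock in as a parameter keeps the expiry logic easy to test without sleeping.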

Duration/CPU Average

I utilize both CPU and Duration average to find the queries that are slowest to execute on average. But we need to be careful here, because sometimes the queries in these reports truly are slow, but are unavoidable. A good example might be generating an admin report that happens once per week. Sure, we could offload this to something better at number crunching, but if it’s getting run once a week, it’s probably not a big issue.

The real gold finds are when we can take a query that appears on the slowest average duration and on the execution count report. This means not only is it one of the slowest queries overall, but it’s also getting executed often. Sometimes the “sum” query aggregation can help you here, but not always, so I often run the two independently.
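Since I run the two reports independently, finding the overlap is just a matter of intersecting the two lists. A rough sketch, assuming you have copied the query ids from each report into plain lists (the ids below are made up):

```python
def slow_and_frequent(slowest_by_average, most_executed) -> list:
    """Query ids that appear in both reports, kept in slowest-first order."""
    frequent = set(most_executed)
    return [query_id for query_id in slowest_by_average if query_id in frequent]


# Invented query ids from the two custom reports.
slowest = [412, 87, 903, 15]
busiest = [15, 640, 87, 22]

if __name__ == "__main__":
    print(slow_and_frequent(slowest, busiest))
```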

Duration/CPU Max

Finally, I utilize the Duration and CPU max to find outliers in queries that may not on average be slow, but are slow under certain conditions. Often these can be a bit of a guess. When looking at a query within the Azure Portal, you won’t be able to see the query parameters. Therefore you can’t always know the exact conditions that caused the query to slow down, but often you can start making educated guesses, and from there do test scenarios locally.

Really, what you look for out of queries from this panel are queries you wouldn’t expect to be slow, but could under certain conditions be loading a lot of data. A good example might be a user on an ecommerce site who buys things regularly. They may have hundreds or even thousands of “orders” attached to their user, but the average user may only have a couple. Here we may see the query show up here due to the max duration being extremely long for that one customer, but not show up on the average report.

Azure SQL Performance Recommendations

Spend any time using Azure SQL and you’re going to run across its own “Performance recommendations” engine. These are performance recommendations (generally indexes) that Azure makes periodically to improve your application’s performance. Personally, I don’t utilize them that much, and here’s why :

  • Generally speaking, Azure Performance Recommendations mostly end up recommending you create indexes. While this can be helpful, for the most part if you are watching your slow running queries using the Query Performance Insights tool, you’re going to find them yourself anyway and probably have a better understanding of the actual issue.
  • The recommendation engine can also update your database behind the scenes without you having to lift a finger. This is bad. In most scenarios, you’re going to want to add that missing index in your own source control. It’s very rare that I accept a change via this performance recommendation engine, and let Azure implement it for me.
  • The performance recommendations don’t take business logic or domain knowledge into account. There may be specific reasons why queries are acceptably slow, and/or a query may only be slow in some use cases which you are happy with.

In general, I think that the performance recommendations are a helpful tool for any developer, but maybe not as automated as they appear on the surface. Generally, I’ve had to go away and validate the findings and then implement the changes myself rather than use the one click tool.

I recently ran into an issue where I wanted to test out a couple of the new pieces of functionality that Microsoft Teams apps can do (Notably, things around creating custom tabs within Teams). To test this out, I figured the easiest way would be to create a free teams account under my personal Microsoft account (So, not using Office 365), so I could play around with various test applications. What I found was that it is extremely hard to follow any guide to upload custom sideloaded apps to a free teams account, but it is possible!

If you want to skip right to the end to “What does work”, then I will forgive you. However, first I want to lay out exactly what doesn’t work, and why this took me so long to figure out!

What Doesn’t Work

When guides out there (including Microsoft’s own documentation) describe uploading custom apps to Microsoft Teams, they talk about using the custom app “App Studio”. This is essentially an app, within Teams, that allows you to upload your own custom apps. That’s maybe a bit confusing, but in simple terms, it allows you to build a manifest file, upload logos, set privacy page URL’s all within a WYSIWYG editor, instead of editing JSON manually.

Once you’ve filled out all options, you’ll hit this step to start distributing your application.

The first option you are going to try is “Install”. Makes sense to try and install it for testing right? Then you’re probably likely to get this :

Or in text form :

Permissions needed. Ask your IT admin to add XYZ to this team.

Interestingly.. I am the IT admin since I created this teams account. This will lead you on a wild goose chase, notably to find either the “Teams Admin Center” or the “Office 365 Enterprise Admin Portal”. The problem is.. You aren’t an Office 365 customer. If you follow any of these links you find on the web to enable side loading applications, you’ll pretty often get the following.

You can’t sign in here with a personal account. Use your work or school account instead.

Very. Very. Frustrating.

Knowing that I couldn’t get around this limitation, I instead decided to select the option to “Publish” from this same screen within App Studio. It looked promising until I got to a screen that said my “IT Admin would review my application and approve it”. Well.. I’m the IT admin so I guess I should receive an email soon with a nice link to approve everything? Nope! Nothing.

Doing this seems to just send it out into the ether. I never saw any link, option, or email to approve this app. Another dead end.

What Did Work

Finally, I saw another poor soul with the same issue and the usual unhelpful advice of logging into your non-existent Office 365 admin account. Then someone left a nothing comment.

You can still just upload the custom app normally.

What did “normally” mean in this context? Well I went back to App Studio and this time around selected the option to download my app to a zip.

Then at the very bottom of the Apps screen inside Teams, I selected the option to “Upload a customised app” (Note, *not* “Submit to app catalogue”).

And by magic, after a long wait of the screen doing nothing, it worked!

So what’s going on here? At a guess, I have a feeling that Free Teams accounts have the option to sideload apps into the account, but they have other restrictions that cause “App Studio” to report that the IT Admin will need to enable settings. It’s essentially bombing out and blaming a setting that it shouldn’t!

But there you have it. If you need to sideload custom apps into Free Teams, you *can* do it, you just can’t do it via App Studio.
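Given how long the upload screen sits there doing nothing, it can be worth checking the downloaded zip before you upload it. The sketch below assumes the package keeps its `manifest.json` at the root of the archive, which is my understanding of the Teams app package layout. The sample package is built in memory purely so the example is self-contained, and its id and version are placeholders.

```python
import io
import json
import zipfile


def read_manifest(package: bytes) -> dict:
    """Read manifest.json from the root of an app package zip."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        if "manifest.json" not in archive.namelist():
            raise ValueError("manifest.json not found at the root of the package")
        return json.loads(archive.read("manifest.json").decode("utf-8-sig"))


def build_sample_package() -> bytes:
    """Build a tiny stand-in package in memory (placeholder values only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "manifest.json",
            json.dumps({"id": "placeholder-app-id", "version": "1.0.0"}),
        )
    return buffer.getvalue()


if __name__ == "__main__":
    manifest = read_manifest(build_sample_package())
    print(manifest["id"], manifest["version"])
```

For a real package you would read the bytes from the zip that App Studio gave you, for example with `open(path, "rb").read()`.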

For a long time now, Azure QnA Maker has been a staple of any Microsoft Bot Framework integration. At its simplest, QnA Maker is an extremely easy to use key/value pair knowledge base, where an incoming chat is matched with the closest question inside QnA and that answer is returned. Unfortunately, it’s rather basic, and for a while it has been limited to answering questions in a one question to one answer fashion. Essentially, QnA Maker lacked the ability to ask “follow up” questions to better drill down to an answer.

As an example, imagine the following question and answer.

Question : Where can I park?

Answer : If you are in Seattle, then you have to park around the back of the building using code 1234. If you are on the San Francisco campus, then unfortunately you will have to park on the street. Usually there are parks available on Smith Street. 

While we have answered the user’s question, we had to combine two different answers, one for parking in Seattle, the other for San Francisco. If we add another campus, or want to elaborate further on a particular location, things can get confusing for the user fast. It would be much better if, when a user asks where they can park, the first response asks where they are located.

Thankfully, QnA Maker has recently released “Follow Up Prompts”, which allow a bot to have a “Multi-Turn” conversation to better drill down to an answer. There are a couple of gotchas with the interface at the moment, but for the most part it’s rather simple. Let’s take our example from above and see how it works.

Adding Follow Up Prompts To QnA Maker

The first thing we need to do is head to our KB Editor at https://www.qnamaker.ai/. This interface is generally fine as-is, but this time around we actually want to add one additional column. Select View options and select “Show Context”. It won’t be immediately evident what this does, but it is super important as we add Follow Up Prompts.

Next, I’ll add the question “Where can I park?” like so :

Notice how our “answer” is actually the follow up question. Also notice the “Add follow-up prompt” link. Clicking it, we need to fill out the resulting popup like so :

The options are as follows :

Display Text is what our follow up button text will show. In our case, because our drill down question is asking the user which campus they are located at, we want to display a simple option of “Seattle”.

Link to QnA will actually be the initial answer. So we can fill this out as to how it will be answered if a user selects Seattle.

Importantly, we select “Context-Only” as this enforces that the only way someone can reach this, is by following the prompts from parking. Otherwise, a user can simply type “Seattle” even without first asking about parking.

After hitting save, because earlier we turned on the option to “Show context”, we will be shown a tree view of our conversation flow :

Let’s Save and Train, then Test.

Perfect! And if we ask “Seattle” out of the blue, we also see that it doesn’t return our parking answer!

We can of course go back and add other options to the original question as often as we want.
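If it helps to picture what the context tree is doing, here is a toy model of the parking example in Python. This is not the QnA Maker API or its matching algorithm (the real service ranks answers rather than doing exact string matches), and the ids and structure are invented. It only illustrates why a context-only answer is unreachable unless the previous turn offered it as a prompt.

```python
# Invented knowledge base: id -> entry. Entry 2 is only reachable via a prompt.
KNOWLEDGE_BASE = {
    1: {
        "questions": ["Where can I park?"],
        "answer": "Which campus are you located at?",
        "prompts": {"Seattle": 2},
        "context_only": False,
    },
    2: {
        "questions": ["Seattle"],
        "answer": "Park around the back of the building using code 1234.",
        "prompts": {},
        "context_only": True,
    },
}

NO_MATCH = "No good match found."


def find_answer(text: str, previous_id=None):
    """Return (qna id, answer), following a prompt from the previous turn if any."""
    if previous_id is not None:
        target = KNOWLEDGE_BASE[previous_id]["prompts"].get(text)
        if target is not None:
            return target, KNOWLEDGE_BASE[target]["answer"]
    for qna_id, entry in KNOWLEDGE_BASE.items():
        if entry["context_only"]:
            continue  # only reachable through a follow-up prompt
        if text in entry["questions"]:
            return qna_id, entry["answer"]
    return None, NO_MATCH


if __name__ == "__main__":
    first_id, first_answer = find_answer("Where can I park?")
    print(first_answer)
    print(find_answer("Seattle", previous_id=first_id)[1])
    print(find_answer("Seattle")[1])
```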

Linking Existing QnA

One final thing I want to mention is that if you have QnA options that are somewhat close to each other, and you want to link between them, you can now also use Follow Up Prompts to do this. Most notably, I created a QnA answer to handle bad answers. I then can add it as a follow up question by typing the start of the question “Bad Answer”, and selecting the existing QnA question.

Obviously this is a great way to have a common method for handling bad answers, but you can also use this as a way to show “Related” QnA within the QnA Maker, and not have to handle conversation flow within your bot at all!

This post will be a continuation of our post introducing the AWS Fault Injection Simulator.

The idea was to run an experiment and remediate our findings, but as it turned out, the post was already too long with a simple setup, so I split it into two parts.

I’d recommend you check the first part to better understand the context of this entry, but the “tl;dr” is that we set up an experiment with FIS that would target for termination all EC2 instances of an application managed by Elastic Beanstalk. The Beanstalk configuration has an autoscaling group with a minimum of 1 instance, which meant terminating it caused an outage.

The Remediation

On the application side, we need to make sure our environment runs on a minimum of 2 instances.

This is a good reminder that even though you make use of managed services, you’re still in charge of how they behave. Managed services (regardless of being compute, databases, containers, etc) will do all the heavy lifting, but they will only operate in whichever way you tell them to. Our first FIS experiment showed that the application setup wasn’t resilient enough to failure. Whilst Beanstalk made sure to spin up a new instance to replace the terminated one, there was still a minute or two of downtime.

The New Experiment

Now that we’re running more instances, I’m also going to update the experiment template. In its current form, it would still target all instances because it was based only on tags, which are shared by all EC2 resources managed by Beanstalk.

The action can remain as is, that is, a terminate-ec2 type. The change will be at the target level. Here, we need to update it so that it targets only a subset of the instances, and FIS provides two options to do so.

  1. Count: A fixed number of the resources matching the criteria will be targeted.
  2. Percentage: A percentage of the matching instances will be targeted. NB: FIS will round down the resulting number of targets when the percentage does not work out to a whole number (for example, an odd number of resources at 50%).

I want to test how my application behaves if I lose half my fleet, so I’ll set it up with Percent mode at 50%. In this particular case, this is equivalent to choosing Count with a value of 1.

After running this new experiment, we can test our application and see that there are no perceived changes to it. However, upon closer inspection of our resources, we’ll learn a few things:
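To make the arithmetic concrete, here is a small sketch of how the two modes translate into a number of targeted instances. The helper names are my own, and the `COUNT(n)` / `PERCENT(n)` strings assume the format the FIS API uses for a target’s `selectionMode`; the Count branch also assumes the fleet has at least that many instances.

```python
def selection_mode(mode, value):
    """Format a selection mode string such as COUNT(1) or PERCENT(50)."""
    if mode not in ("COUNT", "PERCENT"):
        raise ValueError("mode must be COUNT or PERCENT")
    return f"{mode}({value})"


def targeted_instances(fleet_size, mode, value):
    """Estimate how many instances from the fleet would be selected."""
    if mode == "COUNT":
        # Assumes the fleet has at least `value` matching instances.
        return value
    # PERCENT: integer division rounds down when the result is not whole.
    return fleet_size * value // 100


print(selection_mode("PERCENT", 50), "->", targeted_instances(2, "PERCENT", 50))
print(selection_mode("COUNT", 1), "->", targeted_instances(2, "COUNT", 1))
```

With a fleet of 2, both modes select a single instance. With a fleet of 3, 50% still selects only one instance because of the rounding down.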

  1. Our EC2 fleet was downsized to 1 (which means our action ran as intended)
  2. Beanstalk is showing a Degraded state because 1 of the instances stopped sending data. If you remember, our application state was Unknown when the entire fleet disappeared.

   

We now have a new configuration to withstand certain types of failure and an experiment we can run on a regular basis to make sure our application configuration is up to it.

There are many more types of actions you can perform with FIS that we can explore in future entries.

Chaos Engineering has been around for a while, after being popularized by Netflix during their migration to the cloud. However, despite their best efforts to open source their tooling, a proper, secure and reliable setup was too complicated for most people.

Fast forward to re:Invent 2020, where AWS announced a limited preview of a new managed chaos engineering service called AWS Fault Injection Simulator. After a couple of months of limited access, the service is now GA (us-east-1 only at the time of this post) and today’s post is about getting started with it.

The Setup

There are a number of actions the service can perform (stop/terminate instances, throttle APIs, etc.) against a number of different targets (EC2, ECS, RDS, with more to come). For this entry, we’ll keep it simple and just focus on an experiment that terminates a production EC2 instance. In this particular case, I’ll be using the sample NodeJS application managed by Elastic Beanstalk.

The Application

As mentioned before, I’m just using the sample NodeJS application that Elastic Beanstalk offers you to quickly get started. However, I wanted to highlight some of the configuration choices that I made for my environment.

The first bit of configuration (and one to pay attention to) is around the high availability of my environment. You’ll notice that while it is load balanced and can scale up to 4 instances, the minimum has actually been set at 1.

You can also see the resources the service created for us, which in this case is one EC2 instance to which I’ve applied a resource tag at the application level. This tag is of the form chaos:ready, which is descriptive enough for me to understand which instances I want FIS to target during its experiments. You could choose whatever key-value pair you like for the tag, or not have one at all.

Finally, here’s what the sample application looks like. It also serves as a way to see how our environment is running.

Experiment Time

From the FIS homepage, you’ll see that your option is to create a new experiment template, so go ahead and hit that button.

Disclaimer: FIS will execute whatever actions you define against your resources. The service doesn’t produce fake metrics or use wizardry to simulate how a potential disruption affects your system. The service will indeed terminate your instances if that’s the action you have chosen. You will be provided with a number of warning signs along the way, but it’s better to be safe than sorry.

Think of the template as the definition for your experiments, the place in which you can specify actions, targets and alarms on top of the usual name, role (the role requires a trust relationship with ‘fis.amazonaws.com’) and tags that we’re used to from other AWS services. As previously mentioned, today’s experiment will only perform a terminate instance action.
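For reference, the trust relationship on the role can be sketched as a standard IAM trust policy that lets the `fis.amazonaws.com` service principal assume the role. This only covers the trust side; the role would also need permissions for whatever actions the experiment performs, which are not shown here.

```python
import json

# Trust policy that lets the FIS service assume the experiment role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "fis.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# The JSON string is what you would paste in as the role's trust policy document.
trust_policy_json = json.dumps(trust_policy, indent=2)
print(trust_policy_json)
```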

When creating our action, we’re asked to provide a name for it as well as an action type from a predefined list. Once you’ve selected your action type, the Target dropdown will appear with a prepopulated value created for you. The last option is something called “Start after”. This means that in cases where a template has multiple actions, you can choose to run them in parallel or in sequence. Right now, it can be ignored given we’re going for just the one action.

Now, let’s edit the target FIS created for us. I’ll start by updating the name to something a bit more descriptive. The Resource Type can stay as is because we’re indeed targeting EC2 instances. Now comes the fun part, and arguably the area in which you need to focus the most: how we are going to target these resources.
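To illustrate what “Start after” means outside of the console, here is a rough sketch of how two actions might be expressed in a template definition. The action and target names are hypothetical, and the second action (`aws:fis:wait`) is only there to show sequencing; it is not part of the experiment in this post.

```python
def build_actions():
    """Two actions: the second one only starts once the first has finished."""
    return {
        "terminate-instances": {
            "actionId": "aws:ec2:terminate-instances",
            # "chaos-ready-instances" is a hypothetical target name.
            "targets": {"Instances": "chaos-ready-instances"},
        },
        "wait-two-minutes": {
            "actionId": "aws:fis:wait",
            "parameters": {"duration": "PT2M"},
            # Without startAfter, both actions would start in parallel.
            "startAfter": ["terminate-instances"],
        },
    }


def starts_immediately(actions):
    """Names of the actions that do not wait for another action."""
    return [name for name, action in actions.items() if not action.get("startAfter")]


actions = build_actions()
print(starts_immediately(actions))
```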

We see the selected method by default is using a resource ID. For our particular example, it might look like that’s enough, and it indeed could be for a one-off execution. It is true we’re only running one EC2 instance, but we would need to save the template with a fixed ID. That means we’re not really in a position to reuse the template: if we succeed and actually terminate the instance, that particular ID will be gone.

So let’s use tags and filters instead. As soon as we select that method, a couple of “resource” options will appear. The first one is tags and, as you can imagine, the experiment will only run against resources with the specified tags. This will be the place in which I’ll use that chaos:ready tag from before.

The second option is called filters, and I highly recommend you follow the documentation link, as this is the area where targets become truly powerful. For the sake of simplicity (this post is already too long), but so as not to leave you hanging, I’ll create one that targets only EC2 instances that are in a running state.
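If you were defining the same target outside of the console, a sketch of it might look like this. It assumes the chaos:ready tag maps to a key of `chaos` and a value of `ready`, and that the running-state filter uses the `State.Name` attribute path; treat the exact shape as an assumption to check against the FIS documentation.

```python
def build_target(tag_key, tag_value):
    """Describe a target: tagged EC2 instances that are currently running."""
    return {
        "resourceType": "aws:ec2:instance",
        "resourceTags": {tag_key: tag_value},
        # Only consider instances whose state is "running".
        "filters": [{"path": "State.Name", "values": ["running"]}],
        "selectionMode": "ALL",
    }


target = build_target("chaos", "ready")
print(target)
```

Because the target is described by tags and state rather than by an instance ID, the same template keeps working after Beanstalk replaces a terminated instance.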

The Stop Condition section provides you with the necessary safeguard to stop the experiment if certain criteria are met. It is an optional value and I won’t be using it now, but I’d suggest always having one for serious experiments.

Go ahead and finish the creation of the template. The service will make sure you’re sure about it with a nice warning sign.

I’m now ready to start the template, which will in turn create an experiment instance. The start process comes with the same warning as the creation one and it should run successfully.
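As a rough sketch, a stop condition in a template definition is either an explicit “none” or a reference to a CloudWatch alarm. The alarm ARN below is made up for illustration.

```python
def build_stop_conditions(alarm_arn=None):
    """Return the stop conditions for a template, with or without an alarm."""
    if alarm_arn is None:
        # No safeguard: the experiment runs until its actions complete.
        return [{"source": "none"}]
    # Stop the experiment if the given CloudWatch alarm goes into alarm state.
    return [{"source": "aws:cloudwatch:alarm", "value": alarm_arn}]


# Hypothetical alarm ARN, for illustration only.
example_arn = "arn:aws:cloudwatch:us-east-1:123456789012:alarm:my-app-5xx-errors"
print(build_stop_conditions())
print(build_stop_conditions(example_arn))
```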
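The same step can be scripted. The sketch below assumes a boto3 `fis` client and a hypothetical template ID; the helper takes the client as a parameter so it can be exercised without touching a real account.

```python
import uuid


def start_experiment(fis_client, template_id):
    """Start an experiment from a template and return its ID and status."""
    response = fis_client.start_experiment(
        clientToken=str(uuid.uuid4()),  # idempotency token for the request
        experimentTemplateId=template_id,
    )
    experiment = response["experiment"]
    return experiment["id"], experiment["state"]["status"]


# Usage (requires boto3 and AWS credentials; the template ID is hypothetical):
# import boto3
# fis = boto3.client("fis")
# print(start_experiment(fis, "EXT-hypothetical-id"))
```

Remember the disclaimer above: calling this against a real template will really terminate the targeted instances.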

Now that the experiment has finished, let’s have a look at the chaos it caused.

My beanstalk URL now returns an error, which means the underlying EC2 instance has been successfully terminated.

We can confirm our suspicions by looking at the health of our environment, as well as the specific time at which it happened, by looking at the metrics.

Beanstalk will automatically spin up a new instance and your environment will be back to healthy in a minute or two, but it is a good reminder that even if you’re using a managed service, the service can only do what you tell it to do. In our case, because our minimum configuration was one instance, terminating it meant a complete disruption of our application.

Our follow-up post covers a way of mitigating that while still being able to run chaos experiments on our environments. Check it out here : https://tutorialsforcloud.com/2021/03/25/programmatic-chaos-with-aws-fault-injection-simulator-continued/

For some time now, Azure Cognitive Services has offered a “Text Analytics” feature, which can be used for finding topics within a piece of text, or even sentiment analysis to see if the overall sentiment of the text was positive or negative.

In early 2020, Azure released an additional feature to this API called “Opinion Mining”. Opinion mining is almost a cross between topic discovery and sentiment analysis. Instead of finding the overall sentiment of a piece of text, it finds the sentiment of individual topics. For example, in a piece of text such as :

The food here was terrible!

We would expect it to understand that not only is this a negative sentence, but specifically, we are talking negatively about the food. Being able to understand not just whether something is overall positive or negative, but also what is being talked about in that light can be invaluable in machine learning scenarios.

So let’s jump right in!

Setting Up Azure Cognitive Services For Testing

For the purposes of this article, we’re not going to get into individual SDKs for Python, C#, Java, or any other language (although these are available). Instead, we’re just going to use a simple Postman example of calling the API, with our key as a header, and retrieving results. This should be enough for us to see how the API works, and what sort of results we can get from it.

The first thing we need to do is head to our Cognitive Services account in the Azure Portal (Or go ahead and make one if you need to, the first 5000 requests are free so there is no immediate cost to creating the account!).

Under Keys and Endpoint, copy out your endpoint and one of your keys from this screen :

For our test, we are going to call a POST URL in the format of :

https://ABC.cognitiveservices.azure.com/text/analytics/v3.1-preview.3/sentiment?opinionmining=true

Where ABC is replaced with your cognitive endpoint taken from the above screenshot.

Additionally, we will be sending a header of “Ocp-Apim-Subscription-Key” which will be our key, again taken from the screenshot above. In Postman it will end up looking like so :

The body of our request will always look like the following :

{
  "documents": [
  {
    "language": "en",
    "id": "1",
    "text": "Horrible location as it's right next to a construction site. But the food was amazing! Really friendly waiter too!"
  }]
}

Documents is actually an array because you can send multiple documents at once to the API to have them all mined at once. You still pay per document, so it isn’t a cost saver, but sending multiple documents at once can save time over sending them one by one.
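If you want to reproduce the Postman call from a script, here is a minimal sketch using only the Python standard library. It builds the request with the URL format and header from above and puts several documents in one body; the endpoint and key are placeholders, and the `Content-Type` header is my own addition for a JSON body. The actual network call is left commented out.

```python
import json
import urllib.request


def build_sentiment_request(endpoint, key, texts, language="en"):
    """Build (but do not send) the opinion mining POST request."""
    url = (
        endpoint.rstrip("/")
        + "/text/analytics/v3.1-preview.3/sentiment?opinionmining=true"
    )
    # One document per text, with sequential string IDs starting at "1".
    documents = [
        {"language": language, "id": str(index), "text": text}
        for index, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


request = build_sentiment_request(
    "https://ABC.cognitiveservices.azure.com",  # placeholder endpoint
    "YOUR-KEY",  # placeholder key
    ["The food here was terrible!", "Really friendly waiter too!"],
)

# To actually call the API (this uses up part of your request quota):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```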

Now we’re all set up, let’s get mining!

Testing Opinion Mining Out

First let’s try out a typical restaurant review :

Horrible location as it’s right next to a construction site. But the food was amazing! Really friendly waiter too!

So what we are looking for here is that it identifies that the location is negative, but that the food and waiter were positive. And what do you know (note that the full API response is much more verbose, I’m just cutting it down to show what we need!) :

{
  "sentiment": "negative",
  "confidenceScores": {
    "positive": 0.0,
    "negative": 1.0
  },
  "text": "location"
},
{
  "sentiment": "positive",
  "confidenceScores": {
    "positive": 1.0,
    "negative": 0.0
  },
  "text": "food"
},
{
  "sentiment": "positive",
  "confidenceScores": {
    "positive": 1.0,
    "negative": 0.0
  },
  "text": "waiter"
}

So as we can see it’s actually identified the noun that we are trying to describe, and whether our opinion was positive or negative.
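Once you have pulled these trimmed objects out of the response, summarising them takes only a few lines. The sketch below works on the cut-down shape shown above; where exactly these objects sit inside the full, more verbose response is not shown here, so the sample list is an assumption for illustration.

```python
def summarise_aspects(aspects):
    """Map each aspect's text to its sentiment and the score for that sentiment."""
    summary = {}
    for aspect in aspects:
        sentiment = aspect["sentiment"]
        score = aspect["confidenceScores"][sentiment]
        summary[aspect["text"]] = (sentiment, score)
    return summary


# The three trimmed results from the restaurant review above.
aspects = [
    {"sentiment": "negative", "confidenceScores": {"positive": 0.0, "negative": 1.0}, "text": "location"},
    {"sentiment": "positive", "confidenceScores": {"positive": 1.0, "negative": 0.0}, "text": "food"},
    {"sentiment": "positive", "confidenceScores": {"positive": 1.0, "negative": 0.0}, "text": "waiter"},
]

for text, (sentiment, score) in summarise_aspects(aspects).items():
    print(f"{text}: {sentiment} ({score:.2f})")
```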

Let’s try something slightly harder. What I noticed was that the opinion mining spotted the adjectives of “Horrible” and “Amazing” which should be fairly easy to spot. But how about this sentence :

I felt the food was bland. The music was also very loud so we couldn’t hear anything anyone said.

So again we are leaving a review, but specifically we are saying that the food was “bland” and the music was “loud”. These are very specific to the sentence and aren’t common adjectives you might use to describe something. But again :

{
  "sentiment": "negative",
  "confidenceScores": {
    "positive": 0.01,
    "negative": 0.99
  },
  "text": "food"
},
{
  "sentiment": "negative",
  "confidenceScores": {
    "positive": 0.04,
    "negative": 0.96
  },
  "text": "music"
}

And more importantly we see that it even picked up that the food being bland and the music being loud is why the opinion is negative.

"opinions": [
  {
    "sentiment": "negative",
    "confidenceScores": {
      "positive": 0.01,
      "negative": 0.99
    },
    "text": "bland"
  }
]

Really impressive stuff! Does that mean it always gets it right? Absolutely not. Using sentences with colloquial terms (for example, “The food here is the bee’s knees!”) just returns neutral scores, but for out of the box opinion mining with no training required at all (and very little developer legwork), opinion mining with Azure Cognitive Services is pretty impressive!