For some time now, Azure Cognitive Services has offered a “Text Analytics” feature, which can be used for finding topics within a piece of text, or even sentiment analysis to see if the overall sentiment of the text was positive or negative.
In early 2020, Azure released an additional feature to this API called “Opinion Mining”. Opinion mining is almost a cross between topic discovery and sentiment analysis. Rather than finding the overall sentiment of a piece of text, it finds the sentiment of individual topics. For example, in a piece of text such as :
The food here was terrible!
We would expect it to understand that not only is this a negative sentence, but specifically, we are talking negatively about the food. Being able to understand not just whether something is overall positive or negative, but also what is being talked about in that light can be invaluable in machine learning scenarios.
So let’s jump right in!
Setting Up Azure Cognitive Services For Testing
For the purposes of this article, we’re not going to get into the individual SDKs for Python, C#, Java, or any other language (although these are available). Instead, we’re just going to use a simple Postman example of calling the API, with our key as a header, and retrieving results. This should be enough for us to see how the API works, and what sort of results we can get from it.
The first thing we need to do is head to our Cognitive Services account in the Azure Portal (or go ahead and make one if you need to; the first 5000 requests are free, so there is no immediate cost to creating the account!).
Under Keys and Endpoint, copy out your endpoint and one of your keys from this screen :
For our test, we are going to call a POST URL in the format of :
https://ABC.cognitiveservices.azure.com/text/analytics/v3.1-preview.3/sentiment?opinionmining=true
Where ABC is replaced with your cognitive endpoint taken from the above screenshot.
Additionally, we will be sending a header of “Ocp-Apim-Subscription-Key” which will be our key, again taken from the screenshot above. In Postman it will end up looking like so :
The body of our request will always look like the following :
{
  "documents": [
    {
      "language": "en",
      "id": "1",
      "text": "Horrible location as it's right next to a construction site. But the food was amazing! Really friendly waiter too!"
    }
  ]
}
Documents is an array because you can send multiple documents to the API in a single request and have them all mined together. You still pay per document, so it isn’t a cost saver, but sending a batch can save time over sending them one by one.
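If you would rather script the call than click through Postman, here is a minimal Python sketch that builds the same request using only the standard library. The helper name build_request and the placeholder endpoint and key are my own, purely for illustration. It only constructs the request; the commented-out lines at the bottom show one way it could be sent.

```python
import json
import urllib.request

# Path and query string taken from the URL format shown above.
API_PATH = "/text/analytics/v3.1-preview.3/sentiment?opinionmining=true"


def build_request(endpoint, key, texts, language="en"):
    """Build (but do not send) the opinion mining POST request.

    build_request is a hypothetical helper for this article, not part of any SDK.
    """
    # Each document gets its own id so results can be matched back to inputs.
    documents = [
        {"language": language, "id": str(i), "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = json.dumps({"documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url=endpoint.rstrip("/") + API_PATH,
        data=body,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_request(
        "https://ABC.cognitiveservices.azure.com",  # placeholder endpoint
        "YOUR-KEY-HERE",  # placeholder key
        [
            "Horrible location as it's right next to a construction site. "
            "But the food was amazing! Really friendly waiter too!"
        ],
    )
    print(request.full_url)
    # To actually call the API, uncomment the following:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
```

Passing several strings in the list sends them as one batch, with ids "1", "2", "3" and so on.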
Now we’re all set up, let’s get mining!
Testing Opinion Mining Out
First let’s try out a typical restaurant review :
Horrible location as it’s right next to a construction site. But the food was amazing! Really friendly waiter too!
So what we are looking for here is that it identifies that the location is negative, but that the food and waiter were positive. And what do you know, it does! (Note that the full API response is much more verbose; I’m just cutting it down to what we need.)
{
  "sentiment": "negative",
  "confidenceScores": { "positive": 0.0, "negative": 1.0 },
  "text": "location"
},
{
  "sentiment": "positive",
  "confidenceScores": { "positive": 1.0, "negative": 0.0 },
  "text": "food"
},
{
  "sentiment": "positive",
  "confidenceScores": { "positive": 1.0, "negative": 0.0 },
  "text": "waiter"
}
So as we can see it’s actually identified the noun that we are trying to describe, and whether our opinion was positive or negative.
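To pull just those topic and sentiment pairs out of the full response in code, something like the following Python sketch could work. It assumes the trimmed objects above live in an "aspects" array under each sentence of each document, which is how I understand the v3.1-preview.3 response to be laid out, so check it against your own output. The sample data is hand-built from the results above rather than a captured response, and aspect_sentiments is a made-up helper name.

```python
def aspect_sentiments(response):
    """Return (document id, aspect text, sentiment) for every aspect found.

    Assumes a documents -> sentences -> aspects layout (an assumption,
    not something shown in the trimmed output).
    """
    results = []
    for document in response.get("documents", []):
        for sentence in document.get("sentences", []):
            # "aspects" is the assumed key holding the per-topic results.
            for aspect in sentence.get("aspects", []):
                results.append(
                    (document["id"], aspect["text"], aspect["sentiment"])
                )
    return results


# Hand-built sample mirroring the trimmed output, not a captured response.
sample = {
    "documents": [
        {
            "id": "1",
            "sentences": [
                {"aspects": [{
                    "sentiment": "negative",
                    "confidenceScores": {"positive": 0.0, "negative": 1.0},
                    "text": "location",
                }]},
                {"aspects": [{
                    "sentiment": "positive",
                    "confidenceScores": {"positive": 1.0, "negative": 0.0},
                    "text": "food",
                }]},
                {"aspects": [{
                    "sentiment": "positive",
                    "confidenceScores": {"positive": 1.0, "negative": 0.0},
                    "text": "waiter",
                }]},
            ],
        }
    ]
}

for doc_id, text, sentiment in aspect_sentiments(sample):
    print(f"document {doc_id}: {text} -> {sentiment}")
```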
Let’s try something slightly harder. What I noticed was that the opinion mining spotted the adjectives of “Horrible” and “Amazing” which should be fairly easy to spot. But how about this sentence :
I felt the food was bland. The music was also very loud so we couldn’t hear anything anyone said.
So again we are leaving a review, but specifically we are saying that the food is “bland” and the music was “loud”. These are very specific to the sentence and aren’t common adjectives you might use to describe something. But again :
{
  "sentiment": "negative",
  "confidenceScores": { "positive": 0.01, "negative": 0.99 },
  "text": "food"
},
{
  "sentiment": "negative",
  "confidenceScores": { "positive": 0.04, "negative": 0.96 },
  "text": "music"
}
And more importantly we see that it even picked up that the food being bland and the music being loud is why the opinion is negative.
"opinions": [
  {
    "sentiment": "negative",
    "confidenceScores": { "positive": 0.01, "negative": 0.99 },
    "text": "bland"
  }
]
Really impressive stuff! Does that mean it always gets it right? Absolutely not. Using sentences with colloquial terms (for example, “The food here is the bees knees!”) just returns neutral scores. But for out-of-the-box opinion mining with no training required at all (and very little developer legwork), opinion mining with Azure Cognitive Services is pretty impressive!
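If you want to pair each topic with the opinion words behind it in code (like “food” and “bland” earlier), here is a rough Python sketch. It leans on an assumption that is not visible in the trimmed output: that each aspect carries a "relations" list whose "ref" is a path such as "#/documents/0/sentences/0/opinions/0" pointing at the matching entry in that sentence's "opinions" array. That is my understanding of the preview response, so verify it against what your endpoint actually returns. The helper names and the sample data are made up for illustration.

```python
def resolve_ref(response, ref):
    """Follow a '#/documents/0/sentences/0/opinions/0' style path."""
    node = response
    for part in ref.lstrip("#/").split("/"):
        # List levels are indexed by number, object levels by key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


def aspects_with_opinions(response):
    """Return (aspect text, sentiment, [opinion texts]) for each aspect."""
    pairs = []
    for document in response.get("documents", []):
        for sentence in document.get("sentences", []):
            for aspect in sentence.get("aspects", []):
                # "relations" and "ref" are assumed field names.
                opinions = [
                    resolve_ref(response, relation["ref"])["text"]
                    for relation in aspect.get("relations", [])
                    if relation.get("relationType") == "opinion"
                ]
                pairs.append((aspect["text"], aspect["sentiment"], opinions))
    return pairs


# Hand-built sample for illustration, not a captured API response.
sample = {
    "documents": [{
        "id": "1",
        "sentences": [
            {
                "aspects": [{
                    "sentiment": "negative",
                    "text": "food",
                    "relations": [{
                        "relationType": "opinion",
                        "ref": "#/documents/0/sentences/0/opinions/0",
                    }],
                }],
                "opinions": [{"sentiment": "negative", "text": "bland"}],
            },
            {
                "aspects": [{
                    "sentiment": "negative",
                    "text": "music",
                    "relations": [{
                        "relationType": "opinion",
                        "ref": "#/documents/0/sentences/1/opinions/0",
                    }],
                }],
                "opinions": [{"sentiment": "negative", "text": "loud"}],
            },
        ],
    }]
}

for text, sentiment, opinions in aspects_with_opinions(sample):
    print(f"{text} ({sentiment}): {', '.join(opinions)}")
```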