“How can we help you?” Anyone working in business, in virtually any capacity, ponders this all-important question about their customers and users. And those of us in the field of data science and AI have a single word for it: intent.

As AI and machine learning have risen among the ranks of tech’s hottest buzzwords, many now understand the role of intent in AI systems. But I don’t think many people truly understand the importance of an “intent architecture” — and many underestimate just how difficult it can be to create a successful one.

Done right, a well-designed intent architecture for an AI-powered chatbot is the foundation for a great customer experience — and the kind of intelligent automation that helps businesses scale their customer service. Done poorly, it can doom your AI system from the start and frustrate your valuable customers. 

In this post, we’ll explain five common mistakes customer service teams (and their developers) make with their intent architecture. But first, let me briefly explain how intents fit within the context of an AI-powered conversational interface. 

Intent: The trigger that drives a chatbot response

Exactly how does an AI-powered chatbot — or, as we call it, a “virtual agent” — determine what to say? The process goes roughly like this:

-Customers ask questions or make comments, often referred to as utterances.

-Those utterances are labeled, typically by using machine learning to cluster similar topics, and then verified by humans. 

-These clusters of similar topics are established as intents — the thing customers want in the simplest and clearest terms, e.g. “Reset password.”

-The intents become the trigger for how your virtual agent responds. For each intent, conversations are designed and answers and dialog are written.

-When the natural language processing (NLP) engine thinks a new utterance matches an intent, the virtual agent responds with the predefined dialog.
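
To make that flow concrete, here’s a deliberately minimal Python sketch of the matching step. It isn’t our production system or any particular bot framework; the intents, example utterances, responses, and confidence threshold are all invented for illustration.

```python
# A toy intent matcher: train on labeled utterances, then map new
# messages to a predefined response when the model is confident enough.
# (Intents, utterances, and responses below are invented for illustration;
# a real system would have far more training examples per intent.)
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Labeled utterances -> intents (normally produced by the labeling step)
training_data = [
    ("I can't log in to my account", "reset_password"),
    ("forgot my password", "reset_password"),
    ("where is my order", "order_status"),
    ("my package hasn't arrived yet", "order_status"),
]
# Predefined dialog written for each intent
responses = {
    "reset_password": "Let's reset your password. First, ...",
    "order_status": "I can check on that order. What's your order number?",
}

texts, intents = zip(*training_data)
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, intents)

def respond(utterance, threshold=0.5):
    """Return the predefined dialog for the best-matching intent,
    or hand off to a human when confidence is too low."""
    probs = model.predict_proba([utterance])[0]
    best = probs.argmax()
    if probs[best] < threshold:
        return "Let me connect you with a human agent."
    return responses[model.classes_[best]]

print(respond("i forgot my password, can you reset it"))
```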

The all-important intent, then, is ultimately the trigger that tells a chatbot how to respond. So, what could go wrong?

Mistake #1: Choosing the wrong process for labeling

One of the challenges with the labeling process is finding the delicate balance between quality and quantity. If you’re building your first virtual agent, you might have millions of old messages to label before any intents can be defined.

So, there are two main ways companies do this:  

One is to ask a small internal team to sort through the messages — or, similarly, to outsource it to a small BPO (business process outsourcing) team that you can train and manage. In either case, this will ensure quality work. However, labeling all those utterances will be very costly and can take an exceptionally long time (which is costly in a different way). This approach simply doesn’t scale very well.

The other is to outsource it to a crowdsourcing service (such as Amazon’s Mechanical Turk), offering a small incentive to each member of a large community. While this might scale and get you the quantity of work you need, the problem is quality. Are the community members knowledgeable about the subject matter? These anonymous communities are incentivized to simply complete tasks, not necessarily to complete them well. And if the labeling isn’t up to snuff, you’ll end up confusing your algorithm and ultimately dooming your virtual agent.

How we do it

Most companies will choose a blend of the two approaches, using the scalability of crowdsourcing to get through the mass of messages and applying some measure of quality control by having an internal team review and correct the results.
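
Here’s a minimal sketch of the review queue that blend implies: keep the crowd labels where annotators agree, and route the rest to the internal team. The utterances, labels, and agreement threshold below are invented for illustration.

```python
# Minimal sketch of the "blend" described above: accept crowd labels only
# when annotators agree, and queue the rest for internal review.
# (Utterances and labels are invented for illustration.)
from collections import Counter

crowd_labels = {
    "i cant log in": ["reset_password", "reset_password", "reset_password"],
    "where's my stuff": ["order_status", "shipping_issue", "order_status"],
    "app keeps crashing": ["bug_report", "reset_password", "order_status"],
}

accepted, needs_review = {}, []
for utterance, labels in crowd_labels.items():
    label, votes = Counter(labels).most_common(1)[0]
    agreement = votes / len(labels)
    if agreement >= 0.67:              # at least 2 of 3 annotators agree
        accepted[utterance] = label
    else:
        needs_review.append(utterance)  # send to the internal team

print(accepted)       # high-agreement labels go into training data
print(needs_review)   # low-agreement utterances get a human review
```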

At Directly, we help our clients tap into their expert users. Through our platform, companies can task their most knowledgeable superusers to label utterances. This solution scales in a manner similar to Mechanical Turk while also ensuring a high level of quality. The users are diligent about the tasks because there is a financial incentive and because they are proud to serve as ambassadors for the products they love.

Mistake #2: People doing the labeling don’t understand the subject 

In the world of AI, context is king. An NLP engine needs to understand not just the literal meaning of words, but also the context. For a company’s AI-powered virtual agent, that means it needs to understand product nuances and the business landscape to effectively answer a customer’s question.

For example, one of our clients is Airbnb. If a customer message reads, “I had a bad experience,” that message might be labeled with something like “rental property problem.” But Airbnb sells local tours and services as a product, named “Airbnb Experiences.” So a “bad experience” might not have anything to do with a rental property. 

So for intents, the labeling of utterances needs to account not just for linguistics, but for context. The people doing the labeling need to understand the business — yet many companies choose to have anonymous communities, or even data scientists with little or no product expertise, do the labeling.

How we do it

As we said above, we tap our clients’ expert users to do much of the labeling — not just because that approach scales, but because those expert users understand the product, in many cases better than internal teams. 

Mistake #3: Creating intents that are too broad or too narrow

While data science is clearly a logic-driven profession, there’s an art to creating a good intent structure, especially when it comes to finding the middle ground between intents that are too broad and intents that are too narrow.

Some companies will minimize the number of intents because it spares them the effort involved in content creation for each intent. But this generic approach will end up failing to solve some specific customer problems. For example, a telecommunications company may have a customer who complains about an issue with signing into a Galaxy S10 Android phone. A broad intent might simply be “Phone sign-in.” But with the many variations of mobile phones — in particular the many different Android OS versions and phone models — the instructions for solving that issue differ. And answering that question — based on the broad intent — with a generic list of 50 things to try would be a terrible user experience.

On the flip side, creating an intent that’s too narrow presents its own challenge. For example, any SaaS business will naturally have customers who forget their passwords. And it may have channels that include web and mobile (iOS and Android) apps. Companies might create multiple intents — e.g. “Android app password reset” — one for each channel. But password resets are usually done at the account level, across channels. Adding unnecessary detail like this complicates the intent structure — and is more likely to confuse your NLP engine because of the overlap of key terms, as the sketch below illustrates.
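
One way to spot intents that have been sliced too thin is to check how much their example utterances overlap. Here’s a rough sketch of that idea — not a feature of our platform, just an illustration; the intents, utterances, and similarity threshold are invented.

```python
# Sketch: flag intents whose example utterances are so similar that the
# NLP engine is likely to confuse them (a sign they may be too narrow).
# Intents and utterances are invented for illustration.
from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

intent_examples = {
    "android_password_reset": "reset my password in the Android app",
    "ios_password_reset": "reset my password in the iOS app",
    "web_password_reset": "reset my password on the website",
    "order_status": "where is my order, it hasn't shipped",
}

names = list(intent_examples)
vectors = TfidfVectorizer().fit_transform(intent_examples.values())
scores = cosine_similarity(vectors)

for i, j in combinations(range(len(names)), 2):
    if scores[i, j] > 0.5:   # heavily overlapping vocabulary
        print(f"Consider merging: {names[i]} / {names[j]} "
              f"(similarity {scores[i, j]:.2f})")
```

In this toy example, the three password-reset intents get flagged as near-duplicates, while “order_status” does not — a hint that one account-level “password reset” intent would serve better than three channel-specific ones.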

How we do it

There’s no formula for finding the middle ground here. It’s simply a matter of being mindful about the nuances of intents, an understanding that can only come through experience and expertise. Our team here at Directly also has a wealth of experience in building and deploying NLP models, and we work with some of the industry’s leading experts.

Mistake #4: Picking the wrong modeling technique for the data 

There’s a misconception out there that since deep learning came into existence, it is the be-all and end-all for machine learning needs, including conversational AI. That’s simply not true. There are a variety of machine learning modeling techniques, each with its own strengths and weaknesses. And one of the biggest mistakes a team can make is to choose the wrong one.

Sometimes, customer service leaders choose a specific bot framework (e.g. Amazon Lex, Microsoft Bot Framework, Zendesk Smooch, and Salesforce’s Einstein, among others) without understanding the underlying ML model it includes. In a sense, they commit to a modeling technique without even realizing they’re making that decision.

Before choosing a modeling technique, you must consider the following questions: 

-How much data is the model going to be processing? 

-What is the structure of the model?

-How many intents will I try to automate? 

-How quickly does the model need to run?

-How many languages will my virtual agent support? 

Deep learning, in particular, is great at processing shorter bits of text (like SMS messaging), which is why it’s a popular choice for conversational AI. But how you answer all of these questions might steer you in a different direction. 
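
One common sanity check before committing to a heavier model — or to a framework that quietly locks one in — is to benchmark a lightweight baseline on your own labeled data. Here’s a minimal sketch of that idea; the tiny dataset is invented, and the point is the workflow, not the numbers.

```python
# Sketch: benchmark a simple linear baseline on your own labeled utterances
# before committing to a heavier modeling technique.
# The dataset below is invented; in practice you'd use your real labels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

utterances = [
    "i forgot my password", "reset my password please", "can't log in",
    "where is my order", "my order hasn't shipped", "track my package",
    "cancel my subscription", "stop billing me", "close my account",
]
intents = ["reset_password"] * 3 + ["order_status"] * 3 + ["cancel_account"] * 3

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                         LogisticRegression(max_iter=1000))
scores = cross_val_score(baseline, utterances, intents, cv=3)
print(f"Baseline accuracy: {scores.mean():.0%} (+/- {scores.std():.0%})")
# If a simple baseline already meets your accuracy, latency, and language
# requirements, a heavier deep learning model may not be worth the cost.
```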

How we do it

This is a decision that should be left to data science professionals, as only they can weigh all the factors and make the best decision. When we work with our clients, we look at the overall data landscape and the business context before recommending a modeling technique (and before choosing a bot framework).

Mistake #5: Failing to feed the model new data for intent discovery

Like a human, an ML model will deteriorate if you don’t continuously feed it. A model’s food is data. And data comes in the form of new utterances, labels, and intents — along with the corresponding new content it will serve up.

Of course, AI is just code. And the basic code doesn’t change on its own — but the world around it does, and your model needs constant input to “drift” with it. 

There are two kinds of drift we track: data drift and concept drift. 

In the world of customer service, data drift takes place when an entirely new topic emerges. That could be a new product line, a new bug customers are experiencing, or a seasonal influence that affects your business in some capacity. In this case, teams need to identify new intents and content to account for the new utterances from customers.

Concept drift is when there are updates or changes to a product or service, but the intent structure doesn’t necessarily change. However, the content — the answers and language your virtual agent shares — may shift. 

In both kinds of drift, your model needs constant input and updates to maintain its performance. We see it too often: Companies build and launch a virtual agent but neglect to maintain and train it. 
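
As an illustration of what that constant input can look like in practice, here’s a minimal sketch of one simple drift signal: collect the utterances your model couldn’t match confidently, then cluster them to surface candidate new intents for human review. The utterances and cluster count below are invented for illustration.

```python
# Sketch: cluster the utterances the live model couldn't match confidently
# to surface candidate new intents (a simple response to data drift).
# Utterances below are invented for illustration.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Utterances the virtual agent matched with low confidence this week
unmatched = [
    "does the new family plan include streaming",
    "how do i add streaming to the family plan",
    "is streaming part of the family bundle",
    "my invoice shows a charge i don't recognize",
    "why was i billed twice this month",
    "there's an extra charge on my bill",
]

vectors = TfidfVectorizer().fit_transform(unmatched)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for cluster_id in set(clusters):
    print(f"Candidate new intent {cluster_id}:")
    for text, c in zip(unmatched, clusters):
        if c == cluster_id:
            print("  ", text)
# A human (ideally someone who knows the product) reviews each cluster,
# names the new intent, and writes the dialog and content for it.
```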

How we do it

Our platform is designed to streamline the virtual agent and AI training process. As noted above, we can task a company’s expert users to train AI and update content so that it’s always current. 

So, ‘How can we help you?’ 

Understanding intents — and what your customer wants — is the easy part. Actually creating the intent structure so you can build, manage, and train an AI-powered virtual agent — and provide customers with the help they need — is challenging. That’s why we at Directly have become a go-to partner for companies in their quest for intelligent customer service automation. Set up a demo today to see how our platform works.

Authored by Sinan Ozdemir and Lauren Senna, who help lead Directly’s data science team.