Martin Kearn - Bot Framework V4 What I learnt in 4 days in July 2018

This is out of date content. This article was written as a point-in-time snapshot of the pre-release version of Bot Framework V4 as it was in July 2018. The Bot Framework has now been fully released and you should refer to https://docs.microsoft.com/en-us/azure/bot-service/?view=azure-bot-service-4.0 for the documentation. Many of the concepts in this article have now been retired or moved on. Please contact me if you have questions.

I was recently involved in a short, 4-day customer hack based on Microsoft Bot Framework V4 (C# SDK), Azure Bot Service and Language Understanding Intelligent Service (LUIS, part of Cognitive Services).

It was a frustrating yet insightful 4 days and I have some observations which might be helpful to anyone else who is thinking about Bot Framework V4 (BFv4), what state it is in right now and whether it is worth seriously looking at yet.

This article is intended as a point-in-time brain dump based on my limited exposure to the new framework and probably only has a shelf life of a few months, but if you are considering a BFv4 bot now, this could be useful.

BFv4 is currently in public preview, which was announced at the Build conference back in May 2018. We do not have any dates for the final release. You can read the initial announcement here; the blog.botframework.com blog is a good one to watch for announcements on dates etc.

It is worth noting that I'm writing from the perspective of someone who has done a fair few bots using the V3 framework, almost exclusively with C#. I’m also reasonably well experienced with both Azure Bot Service and LUIS for use with V3 bots. That said, as you’ll see when you read on, V3 experience is of limited value for BFv4. In some ways I think it even made BFv4 harder to learn.

Headlines

I've certainly seen the headlines about BFv4 and watched a few conference videos, but had never got 'hands on' with any preview version before. Here are the main headline changes I was aware of before my 4-day project:

Re-write. BFv4 is a complete re-write of the framework with new concepts, terminology, documentation, architecture etc.
More Languages. BFv4 SDK is available for JavaScript, C#, Python and Java.
Open Source. BFv4 is open source and always has been. You can see the GitHub repositories here
Overall service architecture remains the same. The concept of the bot as a single code base which is published to multiple channels (Skype, Cortana, Facebook etc) remains the same. The Azure Bot Service idea also remains the same.
.NET Core 2.0. For the .netters amongst you, you’ll be glad to learn that the C# BFv4 SDK is built on .NET Core 2.0, which means that you get to use all those cool .NET Core features like middleware and DI. See the open source .NET Core SDK here

Key Learnings

These are just some of the key learnings about BFv4. They are things I had not already gleaned before starting my 4-day project, and things I spent slightly longer figuring out than I'd have expected.

Terminology & Concepts

There are several new concepts in BFv4 which bring new terminology with them. The concepts are covered in some detail in the docs, but here are some of the key new terms you'll hear and need to understand to build BFv4 bots.

Adapter: The Adapter is like the orchestration engine for the bot and is responsible for directing incoming and outgoing communication, authentication, and so on. When your bot receives an activity, the adapter wraps up everything about that activity, creates a TurnContext object, passes it to your bot's application logic, and sends responses generated by your bot back to the user's channel. We don't typically work directly with the adapter. Read about activity processing and the adapter here
Middleware: Middleware is a pipeline which sits between the adapter and the bot code. The pipeline can contain multiple middleware components, and many of the built-in capabilities, such as state, are implemented as middleware. Read more about middleware here
Turn: A Turn is the action of the bot receiving an activity (i.e. a message from the user), and subsequently processing it, normally involving the bot replying back to the user and awaiting further input. A Turn carries a TurnContext object which contains useful information such as Conversation, Activity, Intent, State and other information.
Dialogs and Conversation flow: The way conversation flows through the bot has changed significantly compared to BFv3. Key docs include Manage conversation flow with dialogs and Create modular bot logic with a dialog container. These are some of the key concepts:
- Dialog: A Dialog is a little different to BFv3. In BFv4, a Dialog is used for very simple, single turn interactions. For example, if you ask the bot "what is 2 + 2", it would reply "4" and that would be the end of the dialog. A Dialog can receive data either via arguments passed in from the OnTurn function or via state. Dialogs cannot contain child dialogs, so they cannot be used on their own for complex branched conversations. That is where the DialogContainer comes in, as it contains a collection of Dialogs.
- Prompt: A Prompt is a type of built-in dialog intended to capture and verify specific pre-defined data from the user such as text, numbers, dates, confirmation or choices. Conceptually this is the same as BFv3.
- DialogContainer: A DialogContainer is a collection of Dialogs or Prompts which are executed sequentially as WaterfallSteps. DialogContainers can and do contain child dialogs and are the logical equivalent of the Dialog in BFv3.
- DialogSet: A DialogSet is a collection which can contain child Dialog, Prompt or DialogContainer. DialogSets are generally used to manage the top level menu for your bot and then branch out to different DialogContainers for different branches of the conversation. This is often known as a "root dialog", but should more accurately be described as "root dialog set".
- WaterfallStep: A WaterfallStep can be thought of as a granular step in the conversation, either prompting the user for an utterance or processing what the user said.
State: State is conceptually similar to BFv3 in that it stores data relating to either the conversation or the user. State is a middleware component. Read about Managing conversation state here.
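To make the relationship between Adapter, Middleware, Turn and State more concrete, here is a minimal Python sketch of the flow described above. It does not use the real Bot Framework SDK: every class and function name (`TurnContext`, `StateMiddleware`, `Adapter`, `counting_bot`) is a hypothetical stand-in, and the in-memory store is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TurnContext:
    """Hypothetical stand-in for the per-turn context the adapter creates."""
    activity_text: str
    conversation_id: str
    state: Dict[str, object] = field(default_factory=dict)
    responses: List[str] = field(default_factory=list)

    def send(self, text: str) -> None:
        # Queue a reply; the adapter hands the replies back to the channel.
        self.responses.append(text)


# Bot logic receives the context; middleware also receives `next_step`.
Handler = Callable[[TurnContext], None]
Middleware = Callable[[TurnContext, Callable[[], None]], None]


class StateMiddleware:
    """Loads conversation state before the bot runs and saves it afterwards."""

    def __init__(self) -> None:
        # Assumption: a plain in-memory dict keyed by conversation id.
        self._store: Dict[str, Dict[str, object]] = {}

    def __call__(self, context: TurnContext, next_step: Callable[[], None]) -> None:
        context.state = dict(self._store.get(context.conversation_id, {}))
        next_step()  # run the rest of the pipeline, ending with the bot logic
        self._store[context.conversation_id] = context.state


class Adapter:
    """Builds a TurnContext per activity and runs it through the pipeline."""

    def __init__(self, middleware: List[Middleware]) -> None:
        self._middleware = middleware

    def process_activity(self, text: str, conversation_id: str, bot: Handler) -> List[str]:
        context = TurnContext(activity_text=text, conversation_id=conversation_id)

        def run(index: int) -> None:
            if index == len(self._middleware):
                bot(context)  # end of the pipeline: the bot's own logic
            else:
                self._middleware[index](context, lambda: run(index + 1))

        run(0)
        return context.responses


def counting_bot(context: TurnContext) -> None:
    # Bot logic for one turn: count the turns seen in this conversation.
    turns = int(context.state.get("turns", 0)) + 1
    context.state["turns"] = turns
    context.send(f"Turn {turns}: you said '{context.activity_text}'")


adapter = Adapter([StateMiddleware()])
first = adapter.process_activity("hello", "conv-1", counting_bot)
second = adapter.process_activity("again", "conv-1", counting_bot)
print(first, second)
```

The point of the sketch is the ordering: the state middleware runs before and after the bot logic on every turn, which is why the second turn in the same conversation sees the counter left behind by the first.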

Docs

The docs are actually in quite decent shape at the time of writing. The main, high level conceptual stuff is documented fairly well.

As is usual with a preview technology, the low-level code samples are fairly out of date compared to what you can find on GitHub (more on that next).

The V4 docs are available here: https://docs.microsoft.com/en-us/azure/bot-service/?view=azure-bot-service-4.0

If you try to search for them, you may end up on the V3 docs, but if you hit the 'Azure Bot Service' drop-down in the top left corner and select "SDK V4.x (preview)", it will filter on the BFv4 docs. Rather embarrassingly, this took me several hours to figure out!

Samples

As is usual with open source technologies, the samples are maintained on GitHub first and then updated into the docs at a later stage, so I'd always use GitHub as your 'go to' location for code samples and examples.

Being an open source SDK, you can track progress of the SDK itself here: https://github.com/Microsoft/botbuilder-dotnet

You'll find a selection of fairly good samples in the samples-final folder of this repository which cover how to do the most common tasks.

However, the best samples I found for .net were on a specific contosocafe-v4-dotnet branch of the general BotFramework-Samples repository. The samples are focused on a Contoso Cafe scenario and seem to be very fresh (last updated 29th June 2018): https://github.com/Microsoft/BotFramework-Samples/tree/contosocafe-v4-dotnet/docs-samples/V4/dotnet/ContosoCafe/ContosoCafe-5-DialogsWithLUISEntities/ContosoCafe

Azure Bot Service

A bot is essentially a web API and so in theory, it can be hosted on any web service. However, it seems very clear to me that the intention is that Bot Framework bots are hosted as part of the Azure Bot Service.

My experience with the Azure Bot Service in both BFv3 and BFv4 has been good and I struggle to think of a reason not to use the service for hosting as it has many advantages, including:

Easy channel publication
Azure build, deploy and continuous integration capabilities
Templates with popular Cognitive Service integrations
Speech priming
Analytics

That said, the option is yours but my advice would be that if you are going to 'stray from the beaten path' and host elsewhere, you may find that many of the docs and samples become harder to follow. Certainly as you are learning BFv4, you may find it easier to stick with the Azure Bot Service.

LUIS Middleware & Strongly Typed Class

As with BFv3, almost every bot will need some level of natural language processing, which is where the Cognitive Services Language Understanding service (LUIS) comes in. LUIS extracts intent and entities from a user's utterance and makes that data available to the bot to inform the application logic.

Intent is normally used to define the top level branch of the conversation. Once the top level intent is known, this information is no longer required.

Entities are not always required, but can be useful to capture data points from the user via natural language rather than using Dialogs and Prompts.

In BFv4, there is a middleware component called LuisRecognizerMiddleware. You can read about how to use it at Using LUIS for Language Understanding.

In my experience, I found a few conceptual issues with the LuisRecognizerMiddleware:

LUIS is generally used at the top of the conversation; once you have the intent and entities, LUIS is no longer required. However, middleware is executed on every Turn, which will become expensive and wasteful in terms of bandwidth, cost and latency for most scenarios.
The LuisRecognizerMiddleware does not provide a strongly typed object to work with. You can still get to the same data but it is presented as a very deep collection of Dictionary objects.

An alternative approach is to use a LuisRecognizer as and when it is required in your bot's root DialogSet to gather the top level intent and entities, and then branch out to DialogContainers from there, thus reducing the use of LUIS.
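The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CountingRecognizer` is a hypothetical stand-in for a LUIS call (it just matches a keyword), `RootBot` is a made-up simplification of a root dialog set, and each container is modelled as a plain list of waterfall step functions rather than the SDK's real types.

```python
from typing import Callable, Dict, List, Optional


class CountingRecognizer:
    """Hypothetical stand-in for a LUIS call; counts how often it is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def recognize(self, utterance: str) -> str:
        self.calls += 1  # in a real bot each call is a network round trip
        return "transfer" if "transfer" in utterance.lower() else "balance"


# A waterfall step takes the utterance and shared data and returns a reply.
Step = Callable[[str, Dict[str, str]], str]


def start_transfer(utterance: str, data: Dict[str, str]) -> str:
    return "How much would you like to transfer?"


def store_amount(utterance: str, data: Dict[str, str]) -> str:
    data["amount"] = utterance
    return "Who should receive it?"


def finish_transfer(utterance: str, data: Dict[str, str]) -> str:
    data["payee"] = utterance
    return f"Transferring {data['amount']} to {data['payee']}."


def show_balance(utterance: str, data: Dict[str, str]) -> str:
    return "Here is your balance."


class RootBot:
    """Calls the recognizer only when no container is already running."""

    def __init__(self, recognizer: CountingRecognizer, containers: Dict[str, List[Step]]) -> None:
        self._recognizer = recognizer
        self._containers = containers
        self._active: Optional[str] = None  # name of the running container
        self._step = 0
        self._data: Dict[str, str] = {}

    def on_turn(self, utterance: str) -> str:
        if self._active is None:
            # Top of the conversation: the only place the recognizer is used.
            self._active = self._recognizer.recognize(utterance)
            self._step = 0
            self._data = {}
        steps = self._containers[self._active]
        reply = steps[self._step](utterance, self._data)
        self._step += 1
        if self._step == len(steps):
            self._active = None  # container finished; next turn starts afresh
        return reply


recognizer = CountingRecognizer()
bot = RootBot(recognizer, {
    "transfer": [start_transfer, store_amount, finish_transfer],
    "balance": [show_balance],
})
replies = [bot.on_turn(u) for u in
           ["I want to transfer money", "50", "Alice", "What is my balance?"]]
print(replies, recognizer.calls)
```

Across these four turns the recognizer is invoked twice, once at the start of each branch. A recognizer wired in as middleware would have been invoked on all four turns.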

The other main advantage of the LuisRecognizer class is that you can generate a strongly typed class based on your actual LUIS model using an npm tool called LuisGen.
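To show why a strongly typed class is nicer to work with than nested dictionaries, here is a small hand-written Python sketch. It is not the output of LuisGen and does not reflect the SDK's result type: the `raw` payload shape, the `Transfer` intent and the `Amount` and `Payee` entity names are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Entity:
    type: str
    value: str


@dataclass(frozen=True)
class LuisResult:
    """Hypothetical typed view over a loosely typed recognizer result."""
    top_intent: str
    score: float
    entities: List[Entity]

    def first(self, entity_type: str) -> Optional[str]:
        # Return the first entity value of the given type, if any.
        for entity in self.entities:
            if entity.type == entity_type:
                return entity.value
        return None

    @classmethod
    def from_raw(cls, raw: Dict[str, Any]) -> "LuisResult":
        # All dictionary digging happens once, here, instead of in bot code.
        top = raw["topScoringIntent"]
        return cls(
            top_intent=top["intent"],
            score=float(top["score"]),
            entities=[Entity(type=e["type"], value=e["entity"])
                      for e in raw.get("entities", [])],
        )


# Assumed payload shape, for illustration only.
raw = {
    "query": "transfer 50 to Alice",
    "topScoringIntent": {"intent": "Transfer", "score": 0.97},
    "entities": [
        {"entity": "50", "type": "Amount"},
        {"entity": "alice", "type": "Payee"},
    ],
}

result = LuisResult.from_raw(raw)
print(result.top_intent, result.first("Amount"), result.first("Payee"))
```

With a class like this the bot code reads `result.top_intent` and `result.first("Amount")` rather than indexing several levels into dictionaries, and a misspelt property is caught by tooling rather than at run time.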

The Extract intents and entities using LUISGen docs give you a high level overview of this approach. However, at the time of writing, the samples in the docs were incomplete and difficult to follow.

For a complete example of the LuisRecognizer in action, please see my Bot-V4-Banko example on GitHub. This is based on a fictitious bank which enables balance checks and money transfer via their bot. This is also a good example of multiple DialogContainers in use.

In Summary

If you are a BFv3 developer, be prepared to discard a lot of your knowledge, samples and experience for BFv4 as it really is a big change in terms of terminology, capabilities and overall architecture.

However, as with all platform re-writes of this nature, BFv4 is a very good developer platform and once the initial learning curve has been overcome, I think it is a much easier and overall better platform for writing bots compared to BFv3.

I'll attempt to keep my Bot-V4-Banko example up to date with the latest patterns and I'll also re-visit this article from time-to-time as my understanding develops and patterns and best practices emerge.

This article is just what I learnt in 4 days, your mileage may vary!

