December 12, 2023

How we built a generative AI bot


By summer 2023 I was convinced that Section needed to use AI to improve our student experience. Other education providers were coming out with AI models – most of them not very good, but movement nonetheless. I knew we had a small window to be “first” at figuring out AI for live learning. 

Plus, we’d heard the same feedback from students over the last four years: they want more personalized, relevant learning. Our students come from all industries, roles, and company sizes. It’s the blessing and curse of our business. And early indications were that AI could help personalize a learning experience.

So from August to October, we designed, built, and prototyped an AI course tutor called ProfAI. Today I’ll walk you through that process, from building our business case to piloting the first bots. 

Have an idea for a bot? Shoot me an email. I love to talk about the nitty-gritty details of building something new. 

Want to make AI your superpower in 2024? We’re launching 20+ AI courses in Q1 and Q2. Use code SECTIONSTUDENT for 25% off Section membership for a limited time.

As I teach in my Generative AI Business Strategy course, every AI project should start with a memo: a hypothesis, followed by three stages.

  1. Hypothesis: How your AI-powered project will increase revenue or lower costs
  2. Prototype: How you’ll validate that you can build/find a solution, and that the solution works
  3. Pilot: How you’ll validate duration and scale – does it work with enough users that represent the wider deployment?
  4. Deployment: How you’ll execute a full roll-out and mitigate the risks that come with it 

Hypothesis + business case

Our hypothesis: A course tutor would increase customer retention rates by 10%, driving an additional $600,000 in revenue per year. 

The business case is pretty simple: We estimated it would cost about $150,000 to stand up, plus $60,000/year to operate ($210,000 total in year one), netting us $390,000 in incremental revenue the first year and then $540,000 every year going forward. Not bad if it works.
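The arithmetic behind those numbers, as a quick sanity check (all figures are from the memo above):

```python
# Back-of-the-envelope business case, using the figures from the memo.
incremental_revenue = 600_000   # projected revenue from a 10% retention lift, per year
setup_cost = 150_000            # one-time cost to stand up the bot (year one only)
running_cost = 60_000           # ongoing operating cost per year

year_one_net = incremental_revenue - setup_cost - running_cost
ongoing_net = incremental_revenue - running_cost

print(year_one_net)  # 390000
print(ongoing_net)   # 540000
```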

Here’s our memo – you’ll notice as we walk through each stage that the details changed once we actually got started, but it was a good starting point.

Phase 1: Prototype

The prototype stage is about learning quickly and cheaply. Spend only what you’re willing to lose to test and refine a concept – for us, $15,000, but in many cases it might only be a few hundred dollars. 

For our prototype phase, we used an off-the-shelf bot builder (Dify) and hired a consultant to develop separate bots for eight different sprints. Note: If custom GPTs had existed at the time, we likely would have used those to prototype.

Each bot was trained on the course material and equipped to provide project feedback when prompted by the student. 

We then tested the bots with 10 TAs – enough to feel confident that they worked – and rolled them out to four courses.

Prototype results:

  • 41% of the students tried the bot
  • 30% engaged meaningfully (which we defined as 3+ interactions)
  • 8.2/10 value score according to students who used it

What we learned

  • Relying on students to provide the prompts was too high friction. It’s a pain to copy and paste canned prompts into AI – we need the AI to guide the student instead of the other way around.
  • Using an off-the-shelf builder like Dify only worked if we restricted the training material to one course. Dify couldn’t build a bot that serviced more than one course, and it’s not scalable to build 50 bots. 
  • GPT-3.5 gave much more concise answers than GPT-4 and adhered more closely to the course material.
  • The best training materials were shorter. We thought more data would help, but we were wrong – constraining the inputs to course one-pagers made the output better.
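One way to act on that last learning is to constrain the model’s context to a single course one-pager in the system prompt. A minimal sketch of the idea – the prompt wording and sample material here are illustrative, not our actual implementation:

```python
# Sketch: ground a course tutor in one course's one-pager only.
# The prompt wording and course text are illustrative, not Section's actual setup.

def build_tutor_messages(course_one_pager: str, student_question: str) -> list[dict]:
    """Build a chat payload that keeps the bot grounded in one course's material."""
    system_prompt = (
        "You are a course tutor. Answer ONLY from the course material below. "
        "If the answer is not in the material, say so and point the student "
        "to the relevant sprint.\n\n--- COURSE MATERIAL ---\n" + course_one_pager
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": student_question},
    ]

one_pager = "Sprint 1: Write an AI hypothesis memo. Sprint 2: Prototype cheaply."
messages = build_tutor_messages(one_pager, "What goes in the hypothesis memo?")
# `messages` can then be passed to any chat-completion API (e.g., GPT-3.5).
```

Keeping the grounding material short is the whole point: a tight one-pager in the system prompt beats a sprawling corpus.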

Phase 2: Pilot

We’re entering our pilot phase now, with the goal of building a single AI-powered bot for 8-10 courses. Unlike our prototype, this bot will guide the student with pre-programmed prompts – answering questions, quizzing the student, and walking them through project completion.
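That “bot guides the student” flow can be thought of as a small sequence of pre-programmed steps, with the model filling in content at each one. A rough sketch – the step names, wording, and context fields are hypothetical, not our actual spec:

```python
# Sketch of an AI-guided tutoring flow: the bot initiates each step instead of
# waiting for the student to paste in prompts. Steps and wording are hypothetical.

TUTOR_STEPS = [
    ("greet",    "Welcome back! Ready to pick up where you left off in {course}?"),
    ("quiz",     "Quick check: can you explain {concept} in one sentence?"),
    ("feedback", "Paste your project draft and I'll give feedback against the rubric."),
    ("wrap_up",  "Nice work. Next session we'll cover {next_topic}."),
]

def next_prompt(step_index: int, context: dict) -> str:
    """Return the pre-programmed prompt for the current step, filled with context."""
    _, template = TUTOR_STEPS[step_index]
    return template.format(**context)

ctx = {"course": "Generative AI Business Strategy",
       "concept": "the prototype phase", "next_topic": "piloting"}
print(next_prompt(1, ctx))  # the "quiz" step, personalized with the concept
```

The inversion matters: the student never has to know what to ask, because the flow always hands them the next step.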

In October, we took our learnings from Phase 1 and built a project spec, which we shopped around to a few different AI development shops. All our quotes ended up in the range of $125,000-$150,000 to develop our spec by the end of February.

The developer we chose has built about 10 AI prototypes already this year, and pitched us a phased approach with monthly milestones.

Our goals for this pilot phase: 

  • Prove (or disprove) that we can build a single AI-powered bot that scales to all courses (so we don’t have to build 50 bots for 50 courses)
  • Increase student engagement through an AI-powered (more personalized) experience
  • Increase intent to renew/renewal (higher for bot users)

We’re not reinventing the wheel on chat UI for now, aiming for an experience that should feel familiar to ChatGPT users.

We’ll spend the next 3 months building our pilot and then roll out to a beta test group to get feedback. (If you’d like to be part of this group and get early access, sign up here). 

But even a month or so in, a few learnings are emerging from this phase:

  • The output from GPT-3.5 is very strong – higher quality than what we experienced with Dify, with much more affordable token costs than GPT-4.
  • Even with our corpus of content, training materials must be reworked to improve output. I’ll quickly need a dedicated resource to adapt our scripts and content for AI.
  • AI prompting the user (vs. user prompting AI) is a key differentiator, and where we’ll need to focus in our UI.
  • The winner will have a bench of human business experts to verify outputs and fine-tune the model – this is where so much of the time is spent.

Phase 3: Deployment

If our pilot is successful, we’ll move to a larger deployment with a larger audience. That would mean scaling beyond 8-10 courses and building a tighter integration with the Section platform. We’d likely also build bots for additional use cases – such as skills assessments – and build private enterprise instances.

What we’ll watch out for as we scale:

  • Token costs: How much will the typical user engage, and what will our monthly token costs be?
  • User data sensitivity: Will users be okay with us using their responses to train the model?
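For token costs specifically, a rough per-user model is easy to sketch. Every number below is a placeholder assumption, not actual usage data or GPT-3.5 pricing – plug in current rates and real engagement figures:

```python
# Rough monthly token-cost model per active user. All numbers are placeholder
# assumptions -- substitute your model's actual per-token pricing and usage data.

sessions_per_month = 8          # assumed tutoring sessions per active student
tokens_per_session = 4_000      # prompt + completion tokens per session, assumed
price_per_1k_tokens = 0.002     # placeholder rate in USD per 1,000 tokens

monthly_cost_per_user = (sessions_per_month * tokens_per_session / 1_000
                         * price_per_1k_tokens)
print(f"${monthly_cost_per_user:.2f} per user per month")  # $0.06 with these inputs
```

Even generous usage assumptions keep per-user costs small at GPT-3.5-class pricing; the risk is the multiplier across thousands of students, which is why we’ll watch it at deployment scale.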


Greg Shove, CEO