ML Case-study Interview Question: LLM-Powered RAG Chatbot Streamlines BPMN Creation from Internal Knowledge Bases
Case-Study question
A multinational organization with 30,000+ employees struggles with process modeling activities across various teams. Documentation is scattered across different repositories, making it difficult for process modelers to find and use the right information. The organization wants to create a Large Language Model (LLM)-powered chatbot that can retrieve and interpret their internal documentation, then generate a Business Process Model and Notation (BPMN) diagram in a text-based format. They also want the chatbot to guide process modelers in following internal standards. How would you structure a solution approach? How would you handle data privacy, system integration, iterative improvements, and user acceptance? What steps would you take to ensure the solution is robust?
Detailed Solution
Context and Requirements
This organization needs a system that mines scattered process documents, interprets them in real time, and produces a text-based BPMN syntax that users can paste into a modeling tool. The system must run behind the company firewall to ensure data confidentiality, handle diverse domains and departmental standards, and integrate with the company's existing knowledge base, which contains detailed process descriptions.
Architecture
An LLM-based chatbot can be hosted using a self-managed platform that supports private deployments. The chatbot can be set up to retrieve relevant documentation via retrieval-augmented generation techniques. The retrieval component indexes process documents so the chatbot references them when forming responses. A text-based BPMN output ensures modelers can easily copy the chatbot’s output into an online BPMN visualizer. The final design includes a governance mechanism to manage updates, user feedback, and version control.
Data Privacy
All confidential documents must remain inside the company’s secured environment. The retrieval pipeline can live on on-premise servers. Any calls to external model APIs are either blocked or replaced by a locally hosted LLM that is fine-tuned on internal data. If the organization uses a vendor LLM, they can implement encryption at rest and in transit and disable data logging on the vendor’s side.
Guided Prompting
Process modelers should receive prompt templates. Each template clarifies the desired level of detail, domain context, and output structure. For instance: “Generate a process model for order handling. Include tasks for order entry, fulfillment, and customer communication. Use text-based BPMN with correct swimlanes.”
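A template like the one above can be captured programmatically so every modeler starts from the same structure. The sketch below is illustrative: the field names (`process_name`, `domain_context`, `detail_level`, `tasks`) are assumptions, not an internal standard.

```python
# A minimal prompt-template sketch; field names are illustrative, not an internal standard
BPMN_PROMPT_TEMPLATE = (
    "Generate a process model for {process_name}.\n"
    "Domain context: {domain_context}\n"
    "Level of detail: {detail_level}\n"
    "Include tasks for: {tasks}.\n"
    "Output format: text-based BPMN with correct swimlanes."
)

def build_prompt(process_name, domain_context, detail_level, tasks):
    """Fill the template so every modeler supplies the same required fields."""
    return BPMN_PROMPT_TEMPLATE.format(
        process_name=process_name,
        domain_context=domain_context,
        detail_level=detail_level,
        tasks=", ".join(tasks),
    )

prompt = build_prompt(
    "order handling",
    "retail order management",
    "high-level",
    ["order entry", "fulfillment", "customer communication"],
)
```

Centralizing templates this way also gives the governance team one place to update wording when standards change.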
RAG Implementation
Below is a sketch of how to integrate a retrieval pipeline in Python:
```python
import numpy as np
import faiss
import torch
from transformers import AutoTokenizer, AutoModel

# Load an embedding model; any sentence-embedding checkpoint works,
# as long as the index dimension matches the model's output dimension
MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

def get_embedding(text: str) -> np.ndarray:
    """Mean-pool the last hidden state into a single vector."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

# 1. Index embeddings for each document snippet
docs = ["Doc snippet 1...", "Doc snippet 2..."]  # internal process documentation chunks
doc_vectors = np.array([get_embedding(d) for d in docs], dtype="float32")  # faiss needs float32
doc_index = faiss.IndexFlatL2(doc_vectors.shape[1])  # dim depends on embedding model
doc_index.add(doc_vectors)

def retrieve_relevant_docs(query: str, top_k: int = 3) -> list[str]:
    query_vec = np.array([get_embedding(query)], dtype="float32")
    distances, indices = doc_index.search(query_vec, top_k)
    return [docs[i] for i in indices[0] if i != -1]  # -1 pads results when top_k > len(docs)

# 2. Combine retrieved text with the user query before calling the LLM
def rag_pipeline(query: str) -> str:
    relevant_docs = retrieve_relevant_docs(query)
    combined_prompt = "Relevant docs:\n"
    for doc in relevant_docs:
        combined_prompt += doc + "\n"
    combined_prompt += "User query:\n" + query
    # local_llm is the locally hosted model's text-generation function
    return local_llm(combined_prompt)
```
This pipeline indexes and retrieves relevant chunks from the internal repository. The final response can be shaped by a few-shot prompt that instructs the LLM to output text-based BPMN syntax.
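One way to implement that few-shot shaping is to prepend an example request/output pair to the combined prompt. The sketch below assumes a simple indented text-BPMN convention (`pool`, `task`, `gateway`); the example pair is hypothetical, and in practice the examples would come from approved internal diagrams.

```python
# Hypothetical few-shot example; real examples would come from approved internal diagrams
FEW_SHOT_EXAMPLE = (
    "Request: Model a leave-approval process.\n"
    "Output:\n"
    "pool Employee\n"
    "  task Submit leave request\n"
    "pool Manager\n"
    "  task Review request\n"
    "  gateway Approved?\n"
)

def build_bpmn_prompt(retrieved_docs, user_query):
    """Prepend a few-shot example and retrieved context so the LLM emits text-based BPMN."""
    parts = [
        "You generate text-based BPMN syntax only.",
        "Example:\n" + FEW_SHOT_EXAMPLE,
        "Relevant docs:\n" + "\n".join(retrieved_docs),
        "Request: " + user_query,
        "Output:",
    ]
    return "\n\n".join(parts)
```

Ending the prompt at "Output:" nudges the model to continue directly with BPMN syntax rather than free-form prose.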
Operating Model
A governance team supervises the chatbot’s evolution. They gather user feedback, monitor chatbot logs, and refine index structures. Modelers provide prompt examples, new process documentation, and clarifications about domain-specific best practices. This feedback loop helps the chatbot learn from real usage. The team schedules regular updates for the LLM and reevaluates the embedded documents. Process owners keep the domain knowledge current. A champion user group receives new versions of the chatbot to confirm improvements before broad release.
Handling User Acceptance
User acceptance depends on setting realistic expectations. The chatbot is an assistant, not a final decision-maker. The organization can provide training sessions to show how the chatbot retrieves and formats process documentation. Tutorials can clarify how to refine prompts and interpret BPMN outputs. Emphasizing human oversight ensures process modelers maintain accountability.
Follow-up question: How would you evaluate the performance of the LLM-based system?
Evaluations can include time-to-completion metrics and accuracy checks against established process standards. One approach is to measure how long it takes modelers to produce a final BPMN diagram with and without the chatbot. Another approach is to collect feedback after each modeling session, asking if the chatbot’s recommended tasks match known best practices. The governance team can periodically run test prompts to see if the chatbot’s outputs stay consistent. They can also compare the LLM’s suggestions with official reference diagrams. A feedback form can capture user satisfaction, correctness of output, and clarity of explanations. Over time, this feedback shapes iterative refinements.
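Two of those metrics are easy to compute from session logs. The sketch below shows a relative time-to-completion improvement and a simple consistency rate for repeated test prompts; the input numbers are invented for illustration.

```python
from collections import Counter
from statistics import mean

def time_savings(baseline_minutes, assisted_minutes):
    """Relative reduction in average time-to-completion (manual vs. chatbot-assisted)."""
    baseline, assisted = mean(baseline_minutes), mean(assisted_minutes)
    return (baseline - assisted) / baseline

def consistency_rate(outputs):
    """Share of repeated test-prompt runs that match the most common output."""
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)

# Illustrative session timings (minutes per diagram), not real data
savings = time_savings([60, 75, 90], [30, 45, 60])  # → 0.4, i.e. 40% faster
```

Exact-match consistency is a coarse signal; a production check might instead compare parsed BPMN elements so that harmless formatting differences do not count as drift.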
Follow-up question: How would you address hallucinations or incorrect outputs?
LLMs can produce inaccurate results if the retrieval pipeline fails or if the system's prompt logic is insufficient. A robust retrieval step mitigates this by grounding responses in domain-specific text, and keeping organizational references up to date strengthens that grounding. Modelers should remain vigilant and review outputs for plausibility. A fallback mechanism can display a confidence score, and when the chatbot is unsure it can return a disclaimer for further review. Careful prompt engineering also reduces hallucinations: for process steps with uncertain details, the chatbot can ask clarifying questions rather than returning random guesses.
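One simple fallback uses the retrieval distances as a confidence proxy: if no retrieved snippet is close enough to the query, the chatbot asks for clarification instead of answering. The sketch below is a minimal version; the distance threshold and the `retrieve_fn`/`llm_fn` interfaces are assumptions.

```python
def answer_with_fallback(query, retrieve_fn, llm_fn, max_distance=1.0):
    """Answer only when retrieval is confident; otherwise ask a clarifying question.

    retrieve_fn(query) -> (snippets, distances); lower distance = closer match.
    The max_distance threshold is illustrative and would be tuned on real queries.
    """
    snippets, distances = retrieve_fn(query)
    if not snippets or min(distances) > max_distance:
        return ("I could not find supporting documentation for this step. "
                "Could you clarify the process or point me to the relevant document?")
    prompt = "Relevant docs:\n" + "\n".join(snippets) + "\nUser query:\n" + query
    return llm_fn(prompt)
```

Because the threshold gates every answer, tuning it against a labeled set of good and bad retrievals matters more than the exact value chosen here.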
Follow-up question: How would you integrate this solution with existing BPM tools?
Text-based BPMN output can be pasted directly into an existing BPMN visualizer. A second integration path is to connect the chatbot’s API to the BPM platform’s import endpoint. The chatbot sends a structured BPMN file, which the BPM tool parses and renders. Version control can be managed in the BPM tool’s repository, ensuring changes are documented. The user can also run comparisons between older and newer BPMN diagrams. An optional plugin might let users launch the chatbot from inside the BPM suite’s user interface, enabling direct question-and-answer flows while modeling.
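The API integration path can be as simple as wrapping the generated BPMN in a JSON payload and POSTing it to the BPM tool's import endpoint. The sketch below is hypothetical throughout: the endpoint path, host, and payload field names depend entirely on the BPM platform's actual API.

```python
import json
import urllib.request

def build_import_payload(diagram_name, bpmn_xml, version_note):
    """Wrap a BPMN XML document for a BPM platform's import endpoint.
    Field names are hypothetical; check the platform's API documentation."""
    return {
        "name": diagram_name,
        "content": bpmn_xml,
        "contentType": "application/xml",
        "versionNote": version_note,
    }

def push_to_bpm_tool(payload, base_url="https://bpm.internal.example.com"):
    """POST the diagram to a hypothetical import endpoint behind the firewall."""
    req = urllib.request.Request(
        url=base_url + "/api/diagrams/import",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Routing imports through this single function also gives a natural hook for version notes, so every chatbot-generated diagram arrives in the BPM repository with provenance attached.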