This is part two of a three-part series on Retrieval-Augmented Generation (RAG).
In Part 1, we explored the background, capabilities, and high-level architecture of a RAG-based chat application. In Part 2, we take a deeper look at its components through my experience building and deploying a corporate Travel Policy AI Advisor using the Microsoft Fabric Preview. My goal was to gain a deep technical understanding of Generative AI app architecture, translate my hands-on experience into practical business advice, and take you along on the journey.
I must confess that at the beginning of this project, I had serious doubts about whether someone with just an idea could build and launch a working Generative AI application without a significant investment in engineering and infrastructure. However, recent launches of Microsoft Fabric and Amazon Bedrock, which claim to have significantly reduced the entry barriers and simplified the Generative AI development lifecycle, gave me some hope. Is it actually possible? Let's find out.
Company travel policies are notoriously arcane and hard to understand. It would be great for travelers to get accurate answers to their questions without having to wade through pages of policy text to figure out what's permissible.
Our hypothetical technology company needs a natural language Q&A chat based on its absurdly long and complex travel policy. Travelers should be able to ask questions such as:
Will the company pay for my spouse?
Can I fly premium economy?
Can I travel on a fighter jet?
The AI application should understand the intent of these questions and respond based on the policy or indicate “I don’t know” if the information is unavailable.
The first step included creating an Azure subscription, identifying the relevant Microsoft services, and deploying the necessary resources. Completing these initial tasks took significant effort: it involved developing a detailed understanding of the MS Fabric architecture, the function of each of its components, where each one lives, and how to access them. I created the following table to capture my understanding and to help keep track of the essential tool URLs.
Links:
Azure Account
Azure Portal
Azure AI Studio
Microsoft Fabric Portal
Synapse Data Science App
Manual OpenAI Request Form
Initially, I found the environment setup to be very confusing. The user experience in each of the portals was so different that they resembled tools built by different vendors. At one point during the setup process, I started doubting that they could effectively work together. However, after several iterations, the services set up in Azure Portal and the models deployed in AI Studio linked up and started working from my Synapse Data Science PySpark notebook.
Setting up the infrastructure, selecting AI models, and deploying APIs turned out to be quite simple once the baseline understanding and all the prerequisites were in place. The next step was to add code for the business logic.
1. Connecting to APIs
This step involves configuring the necessary credentials and endpoints for Azure AI Search, Azure AI Services, and OpenAI. These settings allow the rest of the notebook to interact seamlessly with the Azure and OpenAI services.
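For illustration, here is a rough sketch of what that configuration can look like in the notebook. All service names, keys, and deployment names below are placeholders, not the actual values I used; the later sketches in this post reuse these variables.

# Python sketch
# Placeholder configuration; substitute your own service names and keys.
aisearch_service = "<your-search-service>"      # Azure AI Search service name
aisearch_index = "travel-policy-index"          # index created in step 6
aisearch_key = "<your-search-admin-key>"

ai_services_key = "<your-ai-services-key>"      # Azure AI Services (Document Intelligence)
ai_services_location = "eastus"                 # your service region

aoai_service_name = "<your-openai-resource>"    # Azure OpenAI resource name
aoai_key = "<your-openai-key>"
aoai_embedding_deployment = "text-embedding-ada-002"  # embedding model deployment
aoai_chat_deployment = "gpt-35-turbo"                 # chat model deployment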
2. Loading and Analyzing the Travel Policy
Using Apache Spark, the policy PDF is loaded into a dataframe. This approach enables efficient processing of large volumes of data, preparing it for further analysis and extraction.
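A minimal sketch of the load, assuming the PDF sits at a lakehouse path (the path below is a placeholder):

# Python sketch
# Load the travel policy PDF as binary content into a Spark dataframe.
df = (
    spark.read.format("binaryFile")
    .load("Files/travel_policy.pdf")            # placeholder lakehouse path
    .select("path", "content")                  # file path and raw PDF bytes
)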
3. Extracting Text from the PDF
Azure AI Document Intelligence, accessed through SynapseML, is employed to extract text from PDF documents. This process transforms raw document data into clean, structured text that can be easily manipulated in subsequent steps.
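A sketch of that call using SynapseML's AnalyzeDocument transformer; note that the import path varies across SynapseML versions, and the output column names are illustrative:

# Python sketch
from synapse.ml.cognitive import AnalyzeDocument
from pyspark.sql.functions import col

# Send each row's PDF bytes to Azure AI Document Intelligence.
analyze_document = (
    AnalyzeDocument()
    .setPrebuiltModelId("prebuilt-layout")      # generic text/layout extraction model
    .setSubscriptionKey(ai_services_key)
    .setLocation(ai_services_location)
    .setImageBytesCol("content")
    .setOutputCol("result")
)

# Keep only the extracted plain text.
analyzed_df = (
    analyze_document.transform(df)
    .withColumn("output_content", col("result.analyzeResult.content"))
    .drop("result")
)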
4. Splitting the text into Chunks
SynapseML's PageSplitter is used to divide the travel policy text into manageable chunks. Each chunk is then stored as its own row, which makes downstream processing more efficient and retrieval more precise.
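A sketch of the split, continuing from the dataframe above; the chunk-size bounds are illustrative, not tuned:

# Python sketch
from synapse.ml.featurize.text import PageSplitter
from pyspark.sql.functions import posexplode, col

# Split the extracted text into chunks of roughly 3,000-4,000 characters.
ps = (
    PageSplitter()
    .setInputCol("output_content")
    .setMinimumPageLength(3000)
    .setMaximumPageLength(4000)
    .setOutputCol("chunks")
)

# Explode the array of chunks so each chunk becomes its own row.
chunks_df = (
    ps.transform(analyzed_df)
    .select("path", posexplode(col("chunks")).alias("chunk_index", "chunk"))
)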
5. Generating Embeddings
OpenAI Service is utilized to create embeddings for each text chunk. This process transforms the text chunks into numerical vectors that capture their semantic meaning, translating them into the language Large Language Models understand.
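With SynapseML this is a distributed transform as well; a sketch reusing the configuration from step 1 (again, the import path depends on the SynapseML version):

# Python sketch
from synapse.ml.cognitive import OpenAIEmbedding

# Generate an embedding vector for each text chunk via Azure OpenAI.
embedding = (
    OpenAIEmbedding()
    .setSubscriptionKey(aoai_key)
    .setCustomServiceName(aoai_service_name)
    .setDeploymentName(aoai_embedding_deployment)
    .setTextCol("chunk")
    .setErrorCol("error")                       # capture per-row API failures
    .setOutputCol("embeddings")
)

embedded_df = embedding.transform(chunks_df)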
6. Creating the Search Index
Next, an Azure AI Search index is created and populated with the processed document chunks and their embeddings. This step makes the travel policy searchable, so the most relevant chunks can be retrieved quickly when needed.
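A condensed sketch of the index definition via the Azure AI Search REST API. The schema, the 1,536 vector dimensions (for text-embedding-ada-002), and the API version are my assumptions based on a standard vector-search setup:

# Python sketch
import json
import requests

# Minimal index schema: an id, the chunk text, and a vector field.
index_def = {
    "name": aisearch_index,
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "chunk", "type": "Edm.String", "searchable": True},
        {"name": "embeddings", "type": "Collection(Edm.Single)",
         "searchable": True, "dimensions": 1536,
         "vectorSearchProfile": "vector-profile"},
    ],
    "vectorSearch": {
        "algorithms": [{"name": "hnsw-config", "kind": "hnsw"}],
        "profiles": [{"name": "vector-profile", "algorithm": "hnsw-config"}],
    },
}

# Create (or update) the index; the chunks and embeddings are then
# uploaded to its /docs/index endpoint in batches.
requests.put(
    f"https://{aisearch_service}.search.windows.net/indexes/{aisearch_index}"
    "?api-version=2023-11-01",
    headers={"Content-Type": "application/json", "api-key": aisearch_key},
    data=json.dumps(index_def),
)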
With the index built, everything is in place for the application to answer questions at query time. This happens in two steps.
1. Finding the Relevant Context
In this step, we embed the user's question with the same model used for the policy chunks and retrieve the most relevant travel policy text chunks from the search index. This process leverages the vector search capabilities of Azure AI Search to find semantically similar content.
# Pseudocode
1 Generate embedding for the user question
2 Prepare the search query with the user question embedding
3 Send a request to Azure AI Search
4 Retrieve top relevant travel policy chunks
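In code, that might look like the sketch below. The request shape follows the Azure AI Search vector query format; the function name is mine, and the configuration variables come from step 1:

# Python sketch
import requests
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=f"https://{aoai_service_name}.openai.azure.com",
    api_key=aoai_key,
    api_version="2023-05-15",
)

def retrieve_context(question: str, k: int = 3) -> list[str]:
    # 1. Generate an embedding for the user question (same model as the chunks).
    question_vector = client.embeddings.create(
        input=question, model=aoai_embedding_deployment
    ).data[0].embedding

    # 2-3. Prepare the vector query and send it to Azure AI Search.
    response = requests.post(
        f"https://{aisearch_service}.search.windows.net/indexes/"
        f"{aisearch_index}/docs/search?api-version=2023-11-01",
        headers={"Content-Type": "application/json", "api-key": aisearch_key},
        json={
            "vectorQueries": [{"kind": "vector", "vector": question_vector,
                               "fields": "embeddings", "k": k}],
            "select": "chunk",
        },
    )

    # 4. Return the text of the top-k matching policy chunks.
    return [doc["chunk"] for doc in response.json()["value"]]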
2. Generating Answers
The retrieved travel policy chunks are packaged as context and passed to the OpenAI API to generate answers to user questions. A carefully crafted prompt ensures that the system answers only from the provided context; it includes the retrieved policy text and explicit instructions for the AI model.
Here's how this worked:
# Pseudocode
1 Define ChatGPT prompt template
2 Fill in the template with policy chunks and user question
3 Include a clear instruction: "Answer the question based on
the context above. If the information to answer the
question is not present in the given context then reply
'I don't know'."
4 Send the prompt to OpenAI API for completion
5 Return generated answer
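A minimal sketch of this step, reusing the client and retrieve_context from the previous snippet; the grounding instruction is taken verbatim from the pseudocode:

# Python sketch
def answer_question(question: str) -> str:
    # 1-2. Fill the prompt template with retrieved chunks and the question.
    context = "\n\n".join(retrieve_context(question))
    # 3. Include the grounding instruction.
    prompt = (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n\n"
        "Answer the question based on the context above. "
        "If the information to answer the question is not present "
        "in the given context then reply 'I don't know'."
    )

    # 4. Send the prompt to the Azure OpenAI chat deployment.
    response = client.chat.completions.create(
        model=aoai_chat_deployment,
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep answers deterministic and grounded
    )
    # 5. Return the generated answer.
    return response.choices[0].message.content

# Example usage with one of the sample questions:
print(answer_question("Can I fly premium economy?"))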
OpenAI was instructed to use only the information provided in the travel policy, preventing it from making up information or drawing on knowledge from its pre-training. If the relevant information isn't in the policy, the model admits its lack of knowledge rather than speculating.
Much trial-and-error debugging followed...
In addition to input-output validation, I had to make sure the answers made sense, weren't made up, and were meaningfully worded.
...and finally, it started working!
Here are the results:
Initially, I was skeptical that the experiment would succeed. I have tried and failed at similar personal learning endeavors in the past. Most of those failures were due to significant entry barriers: steep learning curves for engineering, infrastructure, and security concepts; lack of access to low-cost tools; and the significant time needed to set up and troubleshoot environment issues. This time the odds weren't in my favor either: brand-new concepts, prerelease versions of software, little expertise in the community, an extremely high pace of change, and gaps in my knowledge. But to my surprise, it worked. So the answer to my original question is YES: it is possible for someone with an idea and no infrastructure to build and launch a working generative AI application with Microsoft Fabric. It took approximately 10 hours of heads-down work over one week and $0.17 in Azure service costs.
As businesses continue to look for ways to gain an advantage through AI, RAG applications that unlock intelligent chat with companies’ internal data will become increasingly crucial for efficient knowledge management, employee self-service, and customer support. This application demonstrates how new AI capabilities can take a business idea and bring it to life with relatively low cost and effort.
In part three of this series, I will share my experience building the same Travel Policy AI Advisor RAG chat application using Amazon Bedrock.
Stay tuned!
#AI #BusinessIntelligence #RAGArchitecture #AIinnovation #EnterpriseAI #DataDriven #DigitalTransformation #GenerativeAI #CDAO #CDO