How to create a ChatGPT-like system that bases its responses on the enterprise’s own data using Azure OpenAI

Selvakumar Palanisamy
6 min read · Dec 11, 2023

An efficient solution to this problem is provided by combining Azure Cognitive Search with Azure OpenAI Service. This pairing brings together ChatGPT’s remarkable natural language capabilities, the enterprise-grade features of Azure, and Cognitive Search’s capacity to index, comprehend, and retrieve the relevant portions of your own data across extensive knowledge bases.

The retrieval-augmented generation approach lets you start simple and grow more sophisticated as needed. Prompt construction, query formulation for efficient knowledge-base retrieval, and the back-and-forth interaction between ChatGPT and the knowledge base are all highly customisable.

To make an LLM understand and respond to enterprise domain-specific queries, we have the following two options:

1. Fine-tune the LLM on text data covering the relevant domain.

2. Use Retrieval-Augmented Generation (RAG), a method that incorporates a retrieval component into the generation process, making it possible to fetch relevant data and feed it to the generation model as a secondary source.

We will go with option 2.

  1. To locate content that might be relevant to the user’s query, run a semantic search against your knowledge base.
  2. Build a prompt by combining the data retrieved from the knowledge base, followed by “Given the above content, answer the following question”, and then the user’s question.
  3. Send the prompt to GPT-4 and return the response it generates.

All we’re really doing here is pulling information that appears relevant to the user’s query from a database, building a prompt with that information, and instructing the LLM to generate its answer based only on the information provided in the prompt.
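The retrieve → prompt → generate flow above can be sketched in a few lines. Here an in-memory keyword search stands in for Azure Cognitive Search, and the GPT-4 call is left as a note at the end; the knowledge-base contents and the query are illustrative assumptions, not real enterprise data.

```python
# Toy sketch of the retrieve -> prompt -> generate flow.
# The in-memory keyword scorer below is a stand-in for Cognitive Search.

KNOWLEDGE_BASE = [
    "Contoso expense reports must be filed within 30 days of travel.",
    "Contoso employees accrue 25 days of annual leave per year.",
    "The Contoso VPN requires multi-factor authentication.",
]

def retrieve(query: str, top: int = 2) -> list[str]:
    """Score each document by how many query terms it contains (toy retrieval)."""
    terms = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(terms & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top]

def build_prompt(query: str, passages: list[str]) -> str:
    """Combine retrieved passages with the instruction and the user's question."""
    context = "\n".join(passages)
    return (
        f"{context}\n\n"
        "Given the above content, answer the following question. "
        "Answer only from the content provided.\n\n"
        f"Question: {query}"
    )

question = "How many days of annual leave do employees get?"
prompt = build_prompt(question, retrieve(question))
print(prompt)
# In production, this prompt would be sent to an Azure OpenAI GPT-4 deployment
# and the model's completion returned to the user.
```

The real system swaps the toy scorer for a Cognitive Search query and the final comment for an Azure OpenAI chat-completion call, but the prompt shape is the same.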

Enhancing knowledge-base retrieval

Retrieval quality is a key component of these solutions, as answers ultimately depend on what can be extracted from the knowledge base.

Semantic ranking: By default, Cognitive Search scores keyword matches with a basic probabilistic model. You have the option to activate Semantic Ranking, which improves accuracy by adding an advanced deep-learning secondary ranking layer.
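To illustrate the shape of that two-stage pipeline: a cheap first-pass score produces candidates, then a secondary scorer reranks them. In Cognitive Search the second stage is a deep-learning model; the cosine-over-term-counts scorer below is only a toy stand-in to show the structure, not the actual semantic ranker.

```python
# Toy two-stage ranking: keyword candidates, then a secondary reranker.
# Cosine similarity over term counts stands in for the deep-learning layer.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Reorder first-stage candidates with the secondary scorer."""
    q = Counter(query.lower().split())
    return sorted(
        candidates,
        key=lambda d: cosine(q, Counter(d.lower().split())),
        reverse=True,
    )
```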

Document chunking: When indexing content in Cognitive Search with the express intent of enabling ChatGPT scenarios, you want chunks of the appropriate length. If a chunk is too short, it lacks context; if it is too long, finding the right passage for ChatGPT to “read” becomes challenging. If your data permits, the recommendation is to start with a small number of sentences (about 1/4 to 1/3 of a page) using a sliding window of text. There are situations where it makes sense to keep a document whole so that it provides the complete description of a single item.
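A sliding window of sentences can be sketched as below. The sentence splitter is deliberately naive (it splits on punctuation followed by whitespace); real documents would need a proper sentence tokenizer, and the window/stride sizes are illustrative defaults, not recommendations from the Cognitive Search documentation.

```python
# Minimal sliding-window chunker: fixed-size windows of sentences with
# overlap, so context that spans a chunk boundary is not lost.
import re

def chunk_sentences(text: str, window: int = 4, stride: int = 2) -> list[str]:
    """Split text into sentences, then emit overlapping windows of them."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks = []
    for start in range(0, len(sentences), stride):
        chunk = " ".join(sentences[start:start + window])
        if chunk:
            chunks.append(chunk)
        if start + window >= len(sentences):
            break
    return chunks
```

Because `stride` is smaller than `window`, consecutive chunks share sentences, which is the overlap the sliding-window recommendation is after.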

Summarization: Even after chunking, you might occasionally wish to shorten each candidate in order to fit more of them into a prompt. This can be accomplished with a summarization step. Options include Semantic Captions (a query-contextualized summarization step supported directly in Cognitive Search), hit highlighting (a more lexical, rather than semantic, mechanism for extracting snippets), or post-processing the search results with an external summarization model.
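As a rough stand-in for the lexical hit-highlighting option, one can keep only the sentences of a retrieved chunk that mention a query term, so the snippet placed in the prompt is shorter. This is a toy, not the Cognitive Search feature itself; the regex tokenization is an assumption for illustration.

```python
# Lexical snippet extraction: keep only sentences containing a query term.
import re

def extract_snippet(chunk: str, query: str, max_sentences: int = 2) -> str:
    """Return up to max_sentences sentences from chunk that mention a query term."""
    terms = set(query.lower().split())
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", chunk) if s.strip()]
    hits = [s for s in sentences if terms & set(re.findall(r"\w+", s.lower()))]
    return " ".join(hits[:max_sentences])
```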

Solution Deployment

We will make use of just three services:

  1. Azure OpenAI to interact with the LLM models.
  2. Azure Search and a Storage account to set up the knowledge base.
  3. An Azure web app to provide a simple user interface.

The enterprise-domain, GPT-like search architecture is shown in the diagram below.

One of Azure OpenAI’s best features is that it is highly scalable and flexible, and it includes security features such as encryption and access controls to help protect customer data and systems from cyber threats.

Here are some of the ways security is managed for the service:

Encryption: Data is encrypted, which helps limit unwanted access to private information.

Access Controls: To help prevent unauthorised access to customer data and systems, Azure OpenAI offers a number of access controls, including network security groups, multi-factor authentication, and role-based access control.

Threat Detection: Azure OpenAI has sophisticated threat detection features that help enterprises identify and address potential security risks, covering real-time threat monitoring, security alerts, and automated threat response.

Data Protection: To help guarantee that client data is safeguarded in the event of data loss or system failure, Azure OpenAI offers a number of data protection tools, such as backup and disaster recovery.

Private Endpoint: Azure OpenAI, Search, and the datastore (storage account) use Private Endpoints, allowing you to limit access to only the resources in your virtual network that need it — authorised resources can reach what they require while the services remain safe and secure.

I have created a Terraform deployment that provisions the Azure OpenAI, Search, and storage account resources, plus their private endpoints:

azurerm_private_endpoint.pep_st: Still creating... [1m20s elapsed]
azurerm_private_endpoint.openaisearch-pe01: Still creating... [1m30s elapsed]
azurerm_private_endpoint.pep_st: Still creating... [1m30s elapsed]
azurerm_private_endpoint.openaisearch-pe01: Still creating... [1m40s elapsed]
azurerm_private_endpoint.pep_st: Still creating... [1m40s elapsed]
azurerm_private_endpoint.openaisearch-pe01: Creation complete after 1m48s [id=/subscriptions/369c6a47-2323-409e-8284-e40ed50570c7/resourceGroups/byod-resources/providers/Microsoft.Network/privateEndpoints/pe-openaisearch-byod]
azurerm_private_endpoint.pep_st: Still creating... [1m50s elapsed]
azurerm_private_endpoint.pep_st: Still creating... [2m0s elapsed]
azurerm_private_endpoint.pep_st: Still creating... [2m10s elapsed]
azurerm_private_endpoint.pep_st: Still creating... [2m20s elapsed]
azurerm_private_endpoint.pep_st: Still creating... [2m30s elapsed]
azurerm_private_endpoint.pep_st: Creation complete after 2m31s [id=/subscriptions/369c6a47-2323-409e-8284-e40ed50570c7/resourceGroups/byod-resources/providers/Microsoft.Network/privateEndpoints/pep-byodkedb-st]
azurerm_private_dns_a_record.dns_a_sta: Creating...
azurerm_private_dns_a_record.dns_a_sta: Creation complete after 3s [id=/subscriptions/369c6a47-2323-409e-8284-e40ed50570c7/resourceGroups/byod-resources/providers/Microsoft.Network/privateDnsZones/privatelink.blob.core.windows.net/A/sta_a_record]

Apply complete! Resources: 17 added, 0 changed, 0 destroyed.

Adding the data source and deploying the Azure web app are performed manually; I will update this blog once the end-to-end automation is completed.

To be able to upload your data, you need a Storage account with a Blob Container and Cognitive Search. Unfortunately, there is one surprising thing not mentioned in the Microsoft documentation: as of now, both Storage and Cognitive Search need public access enabled in the Networking tab.

Enabling chat history in the web app incurs Cosmos DB usage on your account.

The manual steps to add the data source files and the web app are given below:

  1. Go to Azure OpenAI Studio.
  2. In the Chat section, select Add your data, then Upload.
  3. Deploy to web app.

Launch the web app and search your private domain data in the same manner as a regular ChatGPT search.

If you ask for details not present in the private data source you supplied, you get the message shown below.

Amazing, isn’t it?


Selvakumar Palanisamy

Cloud solution architect with extensive experience in cloud architecture, serverless solutions, cloud big data solutions, strategy, and IT operations.