
Microsoft and Red Hat have shared a long history of collaboration based on open innovation. Through our combined efforts, enterprises have been empowered to more confidently build and run mission-critical workloads across hybrid environments. Most recently, this collaboration has taken a significant step forward with new technical integrations, one of the most notable being the full certification of SQL Server 2022 on Red Hat Enterprise Linux (RHEL) 9.

With this certification, organizations can now run SQL Server 2022 as a confined application on RHEL 9, gaining stronger security boundaries and tighter integration with enterprise identity systems. This includes full Active Directory support via Microsoft's adutil for Linux, and AES encryption enabled by default at both 128-bit and 256-bit levels. These updates make SQL Server an excellent option for enterprises embracing open source infrastructure without sacrificing the security capabilities, compliance, or features they rely on.
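
To make the Active Directory piece concrete, here is a minimal sketch of connecting from Python with integrated (Kerberos) authentication. It assumes the host has already been joined to the domain (for example, with adutil and kinit), that the Microsoft ODBC Driver 18 is installed, and it uses a placeholder server name:

    # Connect to SQL Server 2022 on RHEL 9 using the current AD identity.
    # Assumes a valid Kerberos ticket (e.g., from kinit); no SQL login needed.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=sql.example.com;"      # placeholder server name
        "DATABASE=master;"
        "Trusted_Connection=yes;"      # integrated (AD/Kerberos) authentication
        "Encrypt=yes;"                 # encrypt traffic in transit
    )
    print(conn.execute("SELECT SYSTEM_USER;").fetchone()[0])
    conn.close()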

Building on that strong foundation, we’re excited to share another milestone in our collaboration with Microsoft: a validated pattern for retrieval-augmented generation (RAG) using open source large language models (LLMs) deployed on Red Hat OpenShift with Azure.

This solution provides an architecture for deploying scalable LLM-based applications that combine private data with generative AI (gen AI), all running on trusted Red Hat infrastructure with the scalability of Microsoft Azure. Please note that while vector support in Azure SQL is generally available (GA), vector support in the on-premises SQL Server 2025 is still in public preview and not intended for production workloads.

What are validated patterns?

Validated patterns are production-ready, open source reference architectures developed by Red Hat to help teams solve real-world problems. Each pattern is a collection of applications that demonstrates aspects of hub/edge computing, generally pairing a hub (centralized) component with an edge component that interact in different ways.

These validated patterns represent an evolution in how applications are deployed in hybrid cloud environments—combining automation, documentation, and best practices into a reusable, GitOps-based framework. Each pattern enables the automatic deployment of a full application stack while supporting continuous integration (CI) and business-centric use cases. Designed for scalability and reliability, validated patterns give developers a trusted, repeatable foundation to build and extend enterprise solutions across clouds.

Tackling GPU availability in the cloud

During initial testing, GPU compatibility presented a challenge, specifically with 24 GB GPU virtual machine (VM) types on Azure. The workaround? Shifting to 16 GB NC T4 VMs, which had sufficient quota and performed well for our target workloads. These VMs now host a fine-tuned inference server running a quantized LLM in AWQ (Activation-aware Weight Quantization) format, optimized for the available GPU memory.
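
As a rough illustration of this setup (the pattern's actual inference server and model may differ), the sketch below uses vLLM, which can load AWQ-quantized weights and cap memory use to fit within a T4's 16 GB; the model name is a placeholder:

    # Serve an AWQ-quantized model on a 16 GB GPU with vLLM (illustrative).
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # placeholder AWQ model
        quantization="awq",           # load activation-aware quantized weights
        gpu_memory_utilization=0.90,  # stay within the T4's 16 GB
        max_model_len=4096,           # bound the KV cache to save memory
    )
    outputs = llm.generate(["What is retrieval-augmented generation?"],
                           SamplingParams(max_tokens=128))
    print(outputs[0].outputs[0].text)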

SQL Server 2025 as your vector store

A key advantage of this validated pattern is how it integrates with Azure SQL. Administering embeddings and RAG content is simple and offers improved security, especially compared to the local SQL Server setup, which required workarounds for self-signed certificates. Whether you're connecting to a cloud-based or on-premises instance, the deployment treats both environments identically, relying solely on a connection string that is straightforward to configure.
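
Here is a minimal sketch of that parity, assuming the connection string arrives via an environment variable and using hypothetical table and column names; the VECTOR type and VECTOR_DISTANCE function are GA in Azure SQL but still in preview in SQL Server 2025:

    # Same code path for Azure SQL or on-prem SQL Server 2025; only the
    # connection string differs. Table/column names are hypothetical.
    import json
    import os

    import pyodbc

    query_vec = json.dumps([0.0] * 1024)  # placeholder query embedding
    conn = pyodbc.connect(os.environ["MSSQL_CONNECTION_STRING"])
    rows = conn.execute(
        """
        SELECT TOP 5 chunk_text,
               VECTOR_DISTANCE('cosine', embedding,
                               CAST(? AS VECTOR(1024))) AS dist
        FROM rag_chunks
        ORDER BY dist
        """,
        query_vec,
    ).fetchall()
    for text, dist in rows:
        print(f"{dist:.4f}  {text[:60]}")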

Container ready

The architecture supports the SQL Server 2025 public preview, which can be run in a container. This not only removes the need for Docker secrets but also gives users the flexibility to deploy SQL Server outside of Azure, anywhere OpenShift runs. We tested on OpenShift 4.18 on AWS as well, and it runs just as smoothly there.

Some highlights of this modular pattern include:

  • A UI supporting multiple LLMs via dropdown
  • A GPU-powered inference server deployed with Kubernetes taints for workload isolation (see the toleration sketch after this list)
  • A RAG database for storing and retrieving vector embeddings
  • Integrated secrets management for Azure SQL credentials
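
For the taint-based isolation mentioned above, here is a hedged sketch of how an inference pod might tolerate a GPU node taint; the taint key and effect are illustrative, not the pattern's exact configuration:

    # Build a pod spec whose toleration lets it schedule onto tainted GPU
    # nodes, keeping non-GPU workloads off them (illustrative values).
    from kubernetes import client

    gpu_toleration = client.V1Toleration(
        key="nvidia.com/gpu",  # taint assumed to be applied to GPU nodes
        operator="Exists",
        effect="NoSchedule",
    )
    pod_spec = client.V1PodSpec(
        containers=[client.V1Container(
            name="inference-server",
            image="example.com/inference:latest",  # placeholder image
        )],
        tolerations=[gpu_toleration],
    )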

Managing embeddings made easy

Embeddings are generated as part of the pattern installation and stored in the vector database. While the embedding model is chosen at deployment via an environment variable, changing it later requires redeployment to avoid mismatches in vector dimensions. However, the system is designed with flexibility in mind—embedding size and chunk size are configurable. We've tested this with 1024-token chunks and batch sizes up to 50.
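
A minimal sketch of that chunk-and-embed step, assuming sentence-transformers as the embedding library (the pattern's actual pipeline may differ); the model name and input file are placeholders, chunking here approximates tokens with characters, and the batch size mirrors the value tested above:

    # Chunk a document and embed the chunks in batches (illustrative).
    import os

    from sentence_transformers import SentenceTransformer

    # The embedding model is fixed at deployment via an environment variable.
    model = SentenceTransformer(os.environ.get(
        "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"))

    def chunk_text(text: str, chunk_chars: int = 4000) -> list[str]:
        # Rough stand-in for 1024-token chunks (tokens are not characters).
        return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]

    chunks = chunk_text(open("manual.txt").read())  # placeholder document
    vectors = model.encode(chunks, batch_size=50)   # batch size tested up to 50
    print(vectors.shape)  # (n_chunks, embedding_dim); dim is fixed by the model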

Deploying on Azure? We got you

A provided script simplifies GPU node provisioning on Azure. Once that's complete, users just need to create the Azure SQL server, define the appropriate secrets, and install the pattern. The UI, embedding pipeline, and inference server then connect and operate together.
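
As a hedged sketch of the "define the secrets" step, the snippet below creates a Kubernetes secret holding the Azure SQL connection string; the secret name, namespace, and key are hypothetical placeholders, not the pattern's exact names:

    # Create a Kubernetes secret for the Azure SQL connection string.
    from kubernetes import client, config

    config.load_kube_config()  # reuse the current oc/kubectl login
    client.CoreV1Api().create_namespaced_secret(
        namespace="rag-llm",  # placeholder namespace
        body=client.V1Secret(
            metadata=client.V1ObjectMeta(name="azure-sql-credentials"),
            string_data={
                # Placeholder value; use your real Azure SQL connection string.
                "connection-string":
                    "Server=tcp:myserver.database.windows.net,1433;...",
            },
        ),
    )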

If you prefer an on-premises, container-based deployment of SQL Server, please deploy the SQL Server 2025 public preview container image using the 2025-latest tag. For details, refer to: Quickstart: Install & connect using Docker.
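
For orientation, here is a minimal sketch of such a pod using the Kubernetes Python client and the image's standard ACCEPT_EULA and MSSQL_SA_PASSWORD environment variables; the pod name, namespace, and password are placeholders, and on OpenShift the pod may additionally need appropriate security context constraints (see the quickstart):

    # Run the SQL Server 2025 public preview container (2025-latest tag).
    from kubernetes import client, config

    config.load_kube_config()
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="mssql-2025"),
        spec=client.V1PodSpec(containers=[client.V1Container(
            name="mssql",
            image="mcr.microsoft.com/mssql/server:2025-latest",
            env=[client.V1EnvVar(name="ACCEPT_EULA", value="Y"),
                 client.V1EnvVar(name="MSSQL_SA_PASSWORD",
                                 value="ChangeMe!Str0ng")],  # placeholder
            ports=[client.V1ContainerPort(container_port=1433)],
        )]),
    )
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)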

What are the use cases for this validated pattern?

There are a variety of use cases for this validated pattern, including:

  • Conversational BI and natural-language analytics: Enable users to ask complex business questions in plain English, such as “What were our top-selling products last quarter?”, and get real-time answers. LLMs generate and execute SQL queries against Azure SQL, returning clear, contextual insights without requiring users to know T-SQL (a minimal sketch of this flow follows the list).
  • Enterprise knowledge base search: Break down internal documents, such as product manuals, HR policies, or IT runbooks, into manageable chunks and convert them into vector embeddings stored in Azure SQL. This enables the RAG pipeline to semantically retrieve the most relevant content, allowing the LLM to deliver accurate, contextual answers to employee or customer questions.
  • Intelligent customer support: Connect structured customer data and unstructured documentation in Azure SQL to power smart virtual agents. The LLM fetches ticket history, support documentation, and product details to generate personalized, contextual support responses.
  • Compliance and audit automation: Store logs, policy docs, and audit trails in Azure SQL and use the LLM to analyze and summarize relevant content for auditors. This helps streamline compliance reporting while keeping all sensitive data inside a secure, governed SQL environment.
  • Regulated industry chatbots (healthcare, finance, legal): In sectors with strict data compliance needs, hosting vector search and data within Azure SQL helps ensure regulatory alignment. RAG-enabled assistants can safely surface relevant laws, medical guidelines, or financial policies, without data leaving your secure infrastructure.
  • Developer productivity and code generation: Embed database schemas, code snippets, and documentation into Azure SQL, enabling the LLM to generate boilerplate code, optimize SQL queries, or explain performance tuning—all tailored to your environment.
  • Predictive maintenance and incident insights: For industries like manufacturing or energy, historical incident reports and sensor logs can be vectorized and stored in Azure SQL. The LLM can surface similar past incidents, suggest root causes, or summarize troubleshooting steps to speed up resolution times.
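
Tying the first use case together, here is a minimal sketch of the question-to-SQL flow; generate_sql() is a stand-in for a call to the pattern's inference server, not a real API, and the generated query is hard-coded for illustration:

    # Conversational BI: turn a question into T-SQL, then run it (sketch).
    import os

    import pyodbc

    def generate_sql(question: str) -> str:
        # Placeholder for prompting the LLM with the schema and question;
        # a real implementation would return the model's generated T-SQL.
        return ("SELECT TOP 5 product_name, SUM(quantity) AS units "
                "FROM sales GROUP BY product_name ORDER BY units DESC")

    conn = pyodbc.connect(os.environ["MSSQL_CONNECTION_STRING"])
    for row in conn.execute(generate_sql(
            "What were our top-selling products last quarter?")):
        print(row)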

See this article to learn more about configuring the MSSQL pattern for your use case.

Final thoughts

By combining Red Hat's AI GitOps capabilities with Microsoft Azure SQL, enterprises now have a powerful, production-ready foundation to build and scale generative AI solutions. Validated patterns on Red Hat OpenShift make deploying these solutions repeatable and more consistent, while Azure SQL and SQL Server 2025 deliver high-performance storage for vector data and embeddings. Whether you're tackling conversational analytics or intelligent support, this collaboration helps you run workloads more seamlessly across hybrid and cloud environments. Explore the possibilities of the RAG-LLM pattern on Azure SQL yourself here. To learn more about Azure SQL, see this link.

See the AI GitOps pattern in action with this demo.


About the authors

Vivien Wang is a Senior Engineering Partner Manager. She works with Red Hat partners to co-build, certify, and integrate solutions in the Red Hat ecosystem for OpenShift, Ansible, Virtualization, and RHEL. Before joining Red Hat in 2019, she worked at SAP UKI, and as an in-house technical writer for ASUS and Mitsubishi Heavy Industries. She holds a master's degree from King's College London.

As a Senior Software Engineer on Red Hat's Validated Patterns team, Drew architects and builds automation for OpenShift to create reliable, repeatable solutions for enterprise customers. He joined the team in February 2025, bringing a unique perspective shaped by his experience in front-end and back-end development, QA, and DevOps. Outside of work, Drew enjoys cooking, watching culinary shows, and taking his dog for long walks.


Amit is a Principal Product Manager at Microsoft with over 15 years of experience. He has been pivotal in the development and evolution of SQL Server on Linux, significantly contributing to Microsoft's cross-platform solutions. Currently, he oversees SQL Server on Linux and Containers. With over a decade of database experience, he has designed SQL Server-based data platforms for Tier 1 customers across diverse business segments.

