Enhance AI safety with Azure Prompt Shields and Azure AI Content Safety

Defend your AI systems with Prompt Shields, a unified API that analyzes inputs to your LLM-based solution to guard against direct and indirect threats.

A powerful defense against prompt injection attacks

The AI security landscape is constantly changing, with prompt injection attacks emerging as one of the most significant threats to generative AI app developers today. These attacks occur when an adversary manipulates an LLM's input to change its behavior or access unauthorized information. According to the Open Worldwide Application Security Project (OWASP), prompt injection is the top threat facing LLMs today1. Help defend your AI systems against this growing threat with Azure AI Content Safety, featuring Prompt Shields: a unified API that analyzes inputs to your LLM-based solution to guard against direct and indirect threats. These exploits can include circumventing existing safety measures, exfiltrating sensitive data, or getting AI systems to take unintended actions within your environment.

Prompt injection attacks

In a prompt injection attack, malicious actors input deceptive prompts to provoke unintended or harmful responses from AI models. These attacks can be classified into two main categories: direct and indirect prompt injection attacks.

  • Direct prompt injection attacks, including jailbreak attempts, occur when an end user inputs a malicious prompt designed to bypass security layers and extract sensitive information. For instance, an attacker might prompt an AI model to divulge confidential data, such as social security numbers or private emails.
  • Indirect, or cross-prompt injection attacks (XPIA), involve embedding malicious prompts within seemingly innocuous external content, such as documents or emails. When an AI model processes this content, it inadvertently executes the embedded instructions, potentially compromising the system.

Prompt Shields seamlessly integrates with Azure OpenAI content filters and is available in Azure AI Content Safety. It defends against many types of prompt injection attacks, and new defenses are continually added as new attack types are discovered. By leveraging advanced machine learning algorithms and natural language processing, Prompt Shields effectively identifies and mitigates potential threats in user prompts and third-party data. This cutting-edge capability supports the security and integrity of your AI applications, helping to safeguard your systems against malicious attempts at manipulation or exploitation.

Prompt Shields capabilities include:

  • Contextual awareness: Prompt Shields can discern the context in which prompts are issued, providing an additional layer of security by understanding the intent behind user inputs. Contextual awareness also leads to fewer false positives because it can distinguish actual attacks from genuine user prompts.
  • Spotlighting: At Microsoft Build 2025, we announced Spotlighting, a powerful new capability that enhances Prompt Shields' ability to detect and block indirect prompt injection attacks. By distinguishing between trusted and untrusted inputs, this innovation empowers developers to better secure generative AI applications against adversarial prompts embedded in documents, emails, and web content.
  • Real-time response: Prompt Shields operates in real time and is one of the first real-time capabilities to be made generally available. It can swiftly identify and mitigate threats before they can compromise the AI model. This proactive approach minimizes the risk of data breaches and maintains system integrity.
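As a rough sketch of what calling this unified API can look like, the snippet below posts a user prompt together with untrusted documents to the Shield Prompt operation and checks the verdict. The endpoint path, the `2024-09-01` API version, and the response fields (`userPromptAnalysis`, `documentsAnalysis`, `attackDetected`) reflect the public REST reference as I understand it; verify them against the current Azure AI Content Safety documentation before relying on this, and substitute your own resource endpoint and key.

```python
import json
import urllib.request

# Placeholder values: replace with your Content Safety resource and key.
ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"
API_KEY = "<your-key>"


def build_shield_request(user_prompt: str, documents: list[str]) -> dict:
    """Build the request body: the end user's prompt plus third-party content."""
    return {"userPrompt": user_prompt, "documents": documents}


def attack_detected(result: dict) -> bool:
    """True if the user prompt or any document was flagged as an injection attempt."""
    if result.get("userPromptAnalysis", {}).get("attackDetected", False):
        return True
    return any(doc.get("attackDetected", False)
               for doc in result.get("documentsAnalysis", []))


def shield_prompt(user_prompt: str, documents: list[str]) -> dict:
    """Call the shieldPrompt operation (assumed path and API version)."""
    url = f"{ENDPOINT}/contentsafety/text:shieldPrompt?api-version=2024-09-01"
    body = json.dumps(build_shield_request(user_prompt, documents)).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": API_KEY,
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

In practice you would run `shield_prompt` on every request that mixes user input with retrieved documents or email content, and treat `attack_detected(...) == True` as a signal to refuse or quarantine the request.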

End-to-end approach

  • Risk and safety evaluations: Azure AI Foundry offers risk and safety evaluations to let users assess the output of their generative AI application for content risks: hateful and unfair content, sexual content, violent content, self-harm-related content, direct and indirect jailbreak vulnerability, and protected material.
  • Red-teaming agent: Enable automated scans and adversarial probing to identify known risks at scale. Help teams shift left by moving from reactive incident response to proactive safety testing earlier in development. Safety evaluations also support red teaming by generating adversarial datasets that strengthen testing and accelerate issue detection.
  • Robust controls and guardrails: Prompt Shields is just one of Azure AI Foundry's robust content filters. Azure AI Foundry offers numerous content filters to detect and mitigate risks and harms, prompt injection attacks, ungrounded output, protected material, and more.
  • Defender for Cloud integration: Microsoft Defender now integrates directly into Azure AI Foundry, surfacing AI security posture recommendations and runtime threat protection alerts within the development environment. This integration helps close the gap between security and engineering teams, allowing developers to proactively identify and mitigate AI risks, such as prompt injection attacks detected by Prompt Shields. Alerts are viewable in the Risks and Alerts tab, empowering teams to reduce surface-area risk and build more secure AI applications from the start.

Customer use cases

AI Content Safety Prompt Shields offers numerous benefits. In addition to defending against jailbreaks, prompt injections, and document attacks, it can help ensure that LLMs behave as designed by blocking prompts that explicitly attempt to circumvent rules and policies defined by the developer. The following use cases and customer testimonials highlight the impact of these capabilities.

AXA: Ensuring reliability and security

AXA, a global leader in insurance, uses Azure OpenAI to power its Secure GPT solution. By integrating Azure's content filtering technology and adding its own security layer, AXA prevents prompt injection attacks and helps ensure the reliability of its AI models. Secure GPT is based on Azure OpenAI in Foundry Models, taking advantage of models that have already been fine-tuned using reinforcement learning from human feedback. In addition, AXA can also rely on Azure content filtering technology, to which the company added its own security layer to prevent any jailbreaking of the model using Prompt Shields, ensuring an optimal level of reliability. These layers are regularly updated to maintain advanced safeguarding.

Wrtn: Scaling securely with Azure AI Content Safety

Wrtn Technologies, a leading enterprise in Korea, relies on Azure AI Content Safety to maintain compliance and security across its products. At its core, Wrtn's flagship technology compiles an array of AI use cases and services localized for Korean users to integrate AI into their everyday lives. The platform fuses elements of AI-powered search, chat functionality, and customizable templates, empowering users to interact seamlessly with an "Emotional Companion" AI-infused agent. These AI agents have engaging, lifelike personalities, interacting in conversation with their creators. The vision is a highly interactive personal agent that's unique and specific to you, your data, and your memories.

Because the product is highly customizable to specific users, the built-in ability to toggle content filters and Prompt Shields is extremely advantageous, allowing Wrtn to efficiently customize its security measures for different end users. This lets developers scale products while staying compliant, customizable, and responsive to users across Korea.

“It’s not just about the security and privacy, but also safety. Through Azure, we can easily activate or deactivate content filters. It just has so many features that add to our product performance,” says Dongjae “DJ” Lee, Chief Product Officer.

Integrate Prompt Shields into your AI strategy

For IT decision makers looking to enhance the security of their AI deployments, integrating Azure's Prompt Shields is a strategic imperative. Fortunately, enabling Prompt Shields is straightforward.
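One way to wire this into an application, sketched below with hypothetical helper names (`is_flagged`, `guarded_completion`) and the response shape assumed above, is to gate every model call on the Prompt Shields verdict: if either the user prompt or any attached document is flagged, the request is blocked before it ever reaches the LLM.

```python
def is_flagged(shield_result: dict) -> bool:
    """Check a Prompt Shields result (assumed response shape) for any detection."""
    if shield_result.get("userPromptAnalysis", {}).get("attackDetected", False):
        return True
    return any(doc.get("attackDetected", False)
               for doc in shield_result.get("documentsAnalysis", []))


def guarded_completion(prompt: str, shield_result: dict, call_llm) -> dict:
    """Forward the prompt to the model only when Prompt Shields reports no attack.

    call_llm is whatever callable invokes your model (e.g., an Azure OpenAI
    chat completion); it is a stand-in here, not a specific SDK method.
    """
    if is_flagged(shield_result):
        return {"blocked": True, "reason": "potential prompt injection detected"}
    return {"blocked": False, "response": call_llm(prompt)}
```

Keeping the guard in one place like this makes it easy to log blocked requests for red-team review rather than silently dropping them.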

Azure’s Prompt Shields and integrated AI security features offer an unparalleled level of protection for AI models, helping to ensure that organizations can harness the power of AI without compromising on security. Microsoft is a leader in identifying and mitigating prompt injection attacks, and uses best practices developed through decades of research, policy, product engineering, and learnings from building AI products at scale, so you can achieve your AI transformation with confidence. By integrating these capabilities into your AI strategy, you can help safeguard your systems from prompt injection attacks and help maintain the trust and confidence of your users.

Our commitment to Trustworthy AI

Organizations across industries are using Azure AI Foundry and Microsoft 365 Copilot capabilities to drive growth, enhance productivity, and create value-added experiences.

We’re committed to helping organizations use and build AI that is trustworthy, meaning it is secure, private, and safe. Trustworthy AI is only possible when you combine our commitments, such as our Secure Future Initiative and Responsible AI principles, with our product capabilities to unlock AI transformation with confidence.

Get started with Azure AI Content Safety


1OWASP Top 10 for Large Language Model Applications
