Relevance

The Relevance Assessment detector measures the semantic similarity between the language model's generated output and the input prompt. Its primary purpose is to ensure that the generated content is contextually relevant and aligns closely with the user's query or instructions.

Vulnerability

Irrelevant or off-topic responses can lead to misunderstandings, customer dissatisfaction, and impaired decision-making processes. This vulnerability can erode user trust and undermine the firm's reputation.

The Relevance detector is employed to identify responses that diverge significantly from the given prompt, enabling developers to refine the model and enhance its accuracy and usefulness.

Usage

This detector utilizes SentenceTransformers to encode both the input prompt and the model's output into vector embeddings. It then calculates the cosine similarity between these embeddings. If the similarity falls below a specified threshold, the output is flagged as irrelevant.

A low cosine similarity score means the output is not closely related to the prompt, which triggers a warning. The reported risk score reflects how far the output falls short of relevance, helping developers pinpoint and address the issue promptly.
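
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the detector's actual implementation: the function names (toy_encode, cosine_similarity, is_relevant) are hypothetical, and a simple bag-of-words encoder stands in for SentenceTransformers so the example runs without downloading a model. Only the cosine-similarity and threshold logic mirror the description.

```python
import math
from collections import Counter
from typing import Callable, Dict, Tuple

Vector = Dict[str, float]


def toy_encode(text: str) -> Vector:
    """Hypothetical stand-in encoder: lowercase word counts.

    The real detector uses SentenceTransformers embeddings; this
    bag-of-words vector only keeps the sketch self-contained.
    """
    return dict(Counter(text.lower().split()))


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_relevant(
    prompt: str,
    output: str,
    threshold: float = 0.7,
    encode: Callable[[str], Vector] = toy_encode,
) -> Tuple[bool, float]:
    """Return (passes_threshold, similarity) for a prompt/output pair."""
    similarity = cosine_similarity(encode(prompt), encode(output))
    return similarity >= threshold, similarity


if __name__ == "__main__":
    # An output that shares no words with the prompt is flagged as irrelevant.
    print(is_relevant("how do I reset my password", "bananas are yellow"))
```

With a real sentence-embedding model, paraphrases that share few words would still score highly, which is the main reason to prefer learned embeddings over the word-count stand-in used here.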

Configuration

To integrate the Relevance detector into your security framework, initialize the Relevance class with the desired threshold:

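# Import path for Firewall is assumed; adjust it to match your installed version.
from guardrail.firewall import Firewall
from guardrail.firewall.output_detectors import Relevance

# Start with no default detectors so that only Relevance is applied.
firewall = Firewall(no_defaults=True)
output_detectors = [Relevance(threshold=0.7)]

# sanitized_prompt is the (already scanned) prompt; response_text is the raw model output.
sanitized_response, valid_results, risk_score = firewall.scan_output(sanitized_prompt, response_text, output_detectors)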

By leveraging the Relevance Assessment detector, firms can enhance the precision of AI-generated responses, ensuring that interactions with users are contextually relevant and fostering a positive user experience.