
Stop Output Substrings

The StopOutputSubstrings detector checks LLM outputs for banned substrings and filters them out.

Vulnerability

Usage

The default dictionary includes common malware requests, eicar_signature, gtube_signature, gtphish_signature, and more.

Configuration

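from guardrail.firewall import Firewall  # assumed import path; the original snippet omits it
from guardrail.firewall.output_detectors import StopOutputSubstrings

# Substrings to block, checked in addition to the default dictionary
substrings = ["PHI Project 214", "Project Hermes", "Patent #718", "Hermes", "Chiron", "Jailbreak"]
firewall = Firewall()
output_detectors = [StopOutputSubstrings(substrings=substrings)]

# sanitized_prompt is the prompt after input scanning; response_text is the raw LLM output
sanitized_response, valid_results, risk_score = firewall.scan_output(sanitized_prompt, response_text, output_detectors)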

Here's what each option is for:

  • substrings (List[str]): user-provided substrings to block, in addition to the default patterns.
  • case_sensitive (bool, default False): whether substring matching is case-sensitive.
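To make the two options concrete, here is a minimal standalone sketch of how substring filtering with a case-sensitivity switch could work. It is not the library's implementation: the function name `redact_substrings`, the `placeholder` parameter, and the `[REDACTED]` marker are all hypothetical and chosen only for illustration.

```python
import re
from typing import List, Tuple


def redact_substrings(
    text: str,
    substrings: List[str],
    case_sensitive: bool = False,
    placeholder: str = "[REDACTED]",  # hypothetical marker, not from the library
) -> Tuple[str, bool]:
    """Replace every occurrence of a banned substring with a placeholder.

    Returns the filtered text and a flag saying whether anything matched.
    """
    # Drop empty entries (they would match everywhere) and try longer
    # entries first so "Project Hermes" wins over the shorter "Hermes".
    ordered = sorted((s for s in substrings if s), key=len, reverse=True)
    if not ordered:
        return text, False

    # Escape each entry so characters like "#" are matched literally.
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile("|".join(re.escape(s) for s in ordered), flags)

    redacted, count = pattern.subn(placeholder, text)
    return redacted, count > 0


if __name__ == "__main__":
    banned = ["PHI Project 214", "Project Hermes", "Patent #718", "Hermes"]
    print(redact_substrings("Details on project hermes are secret.", banned))
```

With the default `case_sensitive=False`, the lowercase "project hermes" in the sample text is still caught; with `case_sensitive=True`, only exact-case occurrences would be replaced.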