Data Security
In the realm of data security, understanding vulnerabilities like Prompt Injections (OWAST LLM01), Insecure Output Handling (LLM02), Model Denial of Service (LLM04), and Sensitive Information Disclosure (LLM06) is crucial. These vulnerabilities can compromise the integrity and confidentiality of sensitive data within AI systems, posing significant risks to businesses and users alike.
Example Scenario: AI-Driven Customer Support Chatbot for Financial Institutions
In the competitive landscape of financial services, providing seamless customer support is paramount. A leading financial institution deploys an AI-driven customer support chatbot to handle customer inquiries, ranging from account balance queries to loan application statuses.
Challenge:
The institution faces the challenge of securing customer data and maintaining the integrity of the AI chatbot amidst the following OWAST threats: Prompt Injections (OWAST LLM01), Insecure Output Handling (LLM02), Model Denial of Service (LLM04), and Sensitive Information Disclosure (LLM06).
Prompt Injections (OWAST LLM01)
Prompt Injections involve manipulating the input prompts of AI systems. Imagine an attacker injecting malicious prompts to deceive the AI into producing unintended or harmful outputs. Safeguarding against such attacks is imperative to maintain the reliability of AI applications, ensuring that input prompts are validated and sanitized effectively.
Insecure Output Handling (LLM02)
Insecure Output Handling refers to mishandling AI-generated outputs. Picture an AI system producing sensitive information without proper encryption or access controls. Addressing this vulnerability requires implementing robust security measures to protect outputs, guaranteeing that sensitive data is encrypted, and access is strictly controlled and monitored.
Model Denial of Service (LLM04)
Model Denial of Service attacks can overwhelm AI models, disrupting their functionality. Visualize a scenario where an attacker floods the model with requests, rendering it unresponsive. Mitigating this threat involves deploying rate-limiting mechanisms and scalable infrastructure, ensuring the AI system remains operational even under heavy loads.
Sensitive Information Disclosure (LLM06)
Sensitive Information Disclosure occurs when AI systems inadvertently reveal confidential data. Picture an AI application inadvertently leaking user data due to poor coding practices. Preventing this vulnerability requires meticulous data handling, employing encryption, and regular security audits to identify and rectify potential weaknesses.
By addressing these vulnerabilities, organizations can fortify their AI systems, safeguarding data integrity, confidentiality, and user trust. Vigilance, regular security assessments, and adherence to best practices are essential to ensure robust data security in the ever-evolving landscape of AI technology.
Guardrail's Firewall Detectors
To prevent such risks from OWAST, our Firewall detectors can be used to sanitize and filter malicious inputs and validate outputs.
We will leverage the following input detectors: Anonymize
, Malware URL
, Secrets
, Prompt Injections
, DoS Tokens
, RegexInput
, Coding Language
and the following output detectors: Deanonymize
, Sensitive PII
, Coding Language
from guardrail.firewall.input_detectors import Anonymize, MalwareInputURL, Secrets, PromptInjections, DoSTokens, CodingLanguageInput
from guardrail.firewall.output_detectors import Deanonymize, SensitivePII, CodingLanguageOutput
from guardrail.firewall.vault import Vault
vault = Vault()
firewall = Firewall(no_defaults=True)
allowed_languages = ["Python", "Go"]
denied_languages = ["JavaScript", "PHP", "Ruby", "Java"]
input_detectors = [Anonymize(), MalwareInputURL(), Secrets(), PromptInjections(), RegexInput(), CodingLanguageInput(allowed=allowed_languages, denied=denied_languages), DoSTokens(limit=4096)]
output_detectors = [Deanonymize(), SensitivePII(), CodingLanguageOutput(allowed=allowed_languages, denied=denied_languages)]
prompt = """Create a Python script to insert a new customer into our e-commerce database using SQLAlchemy. The customer's information is provided in a dictionary as follows:
customer_info = {
"name": "Emily Johnson",
"email": "emily@customercompany.com",
"alt_email": "emily.j@customercompany.com",
"phone_number": "617-555-5555",
"shipping_address": "123 Brattle Street, Cambridge, USA",
"loyalty_program_key": "eyJhbGciO..."
}
Ensure the script inserts this customer's data into the database and handles exceptions gracefully."""
sanitized_prompt, valid_results, risk_score = firewall.scan_input(input_detectors, prompt)
Output logs:
WARNING:guardrail.firewall.input_detectors.anonymize:Found sensitive data in the prompt and replaced it: [type: EMAIL_ADDRESS, start: 223, end: 248, score: 1.0, type: EMAIL_ADDRESS, start: 269, end: 296, score: 1.0, type: PERSON, start: 193, end: 206, score: 0.85, type: PHONE_NUMBER, start: 320, end: 332, score: 0.75], risk score: 1.0
Sanitized prompt: Create a Python script to insert a new customer into our e-commerce database using SQLAlchemy. The customer's information is provided in a dictionary as follows:
customer_info = {
"name": "[REDACTED_PERSON_1]",
"email": "[REDACTED_EMAIL_ADDRESS_1]",
"alt_email": "[REDACTED_EMAIL_ADDRESS_2]",
"phone_number": "[REDACTED_PHONE_NUMBER_1]",
"shipping_address": "123 Brattle Street, Cambridge, USA",
"loyalty_program_key": "eyJhbGciO..."
}
Run output detection
sanitized_response, valid_results, risk_score = firewall.scan_output(
output_detectors, sanitized_prompt, response_text
)
Output logs:
# Create a new customer instance
customer_info = {
"name": "Emily Johnson",
"email": "emily@customercompany.com",
"alt_email": "emily.j@customercompany.com",
"phone_number": "617-555-5555",
"shipping_address": "123 Brattle Street, Cambridge, USA",
"loyalty_program_key": "eyJhbGciO...
Tokens used: 260 prompt + 256 completion = 516 tokens
Total cost for gpt-3.5-turbo: $0.000902
200
Stored logs successfully: <Response [200]>