A Deep Dive into Preventing Sensitive Information Disclosure
Sensitive disclosures can occur during training, inference, and interaction flows, often surfacing only under specific prompts or usage patterns.
To win, your GenAI solutions must consistently detect, prevent, and test against sensitive information disclosure across inputs, outputs, and model behavior.
Sensitive data exposure in GenAI systems is subtle, persistent, and easy to underestimate.
• Unclassified disclosure risks: Teams lack a clear taxonomy of what constitutes sensitive information in GenAI contexts.
• Blind spots across phases: Leaks occur during both training and inference, but detection is applied inconsistently.
• Insufficient redaction controls: PII and PHI protections are uneven across inputs, outputs, and downstream systems.
These gaps lead to compliance violations, loss of trust, and serious legal and reputational impact.
In this hands-on workshop, your team designs and evaluates controls to prevent sensitive information disclosure through guided exercises and adversarial testing.
• Classify types of sensitive disclosures relevant to GenAI use cases.
• Detect information leaks across both training and inference phases.
• Apply PII and PHI redaction consistently at input and output boundaries.
• Implement anonymization and obfuscation tools to reduce exposure risk.
• Test systems using adversarial prompts to surface hidden disclosure paths.
Classifying Types of Sensitive Disclosures
Detecting Leaks in Training and Inference Phases
Applying PII and PHI Redaction at Input/Output
Implementing Anonymization and Obfuscation Tools
Testing Exposure with Adversarial Prompts
• Identify and classify sensitive disclosure risks in GenAI systems.
• Detect leakage patterns during both model training and inference.
• Apply effective redaction controls for PII and PHI.
• Reduce exposure through anonymization and obfuscation techniques.
• Leave with tested methods to validate disclosure defenses under adversarial conditions.
Who Should Attend:
Solution Essentials
Virtual or in-person
4 hours
Intermediate
Redaction patterns, anonymization techniques, and adversarial prompt testing exercises