🔒 Safety Implementation Made Simple

Right now, you have content moderation working in your application, automatically detecting harmful content in text and images. But what if you could build a comprehensive safety system that protects users proactively?

Safety implementation builds protection into your AI application end to end. Instead of just detecting problems after they happen, you’ll build systems that prevent issues, monitor behavior in real time, and automatically enforce safety policies across your entire application.

You’re about to learn exactly how to implement production-grade safety systems in your existing application.


🧠 Step 1: Understanding Safety Implementation

Before we write any code, let’s understand what comprehensive safety implementation actually means and why it’s different from basic content moderation.

Safety implementation is like building a complete security and protection ecosystem for your AI application. It goes beyond detecting harmful content to create proactive safeguards, automated responses, and comprehensive monitoring systems.

Real-world analogy: Content moderation is like having security cameras that detect incidents. Safety implementation is like having cameras, alarm systems, automatic locks, security guards, emergency protocols, and prevention systems all working together.

Why Safety Implementation vs. Content Moderation

You already have content moderation working, but safety implementation is different:

  • 🛡️ Content Moderation - Detects harmful content after creation (reactive)
  • 🔒 Safety Implementation - Prevents, monitors, and responds to safety issues (proactive)
  • 🏗️ Safety Architecture - Builds safety into every part of your application (systematic)

The key difference: Safety implementation creates a comprehensive protection system that prevents problems, not just detects them.

Think about all the safety challenges AI applications face:

  • User protection - Preventing harassment, abuse, and harmful interactions
  • Content safety - Ensuring all generated content meets safety standards
  • Behavioral monitoring - Detecting unusual patterns and potential misuse
  • Policy enforcement - Automatically applying safety rules and consequences
  • Incident response - Handling safety violations with appropriate actions

Without comprehensive safety implementation:

  1. Safety is an afterthought, not built-in (reactive approach)
  2. Inconsistent protection across different features (gaps in coverage)
  3. Manual response to safety issues (slow and unreliable)
  4. No prevention, only detection (problems still occur)

With safety implementation, you have proactive, automated, comprehensive protection built into every aspect of your application.

Your safety implementation will include multiple integrated systems:

🛡️ Proactive Content Filtering - The Prevention Layer

  • Best for: Stopping harmful content before it’s created or shown
  • Strengths: Real-time filtering, user input validation, output screening
  • Use cases: Chat filtering, image generation safety, user input validation

📊 Behavioral Monitoring - The Detection Layer

  • Best for: Identifying patterns and unusual behavior over time
  • Strengths: Pattern recognition, trend analysis, early warning systems
  • Use cases: Abuse detection, spam prevention, account security

⚡ Automated Response - The Action Layer

  • Best for: Taking immediate action when safety issues are detected
  • Strengths: Instant response, consistent enforcement, escalation workflows
  • Use cases: Content blocking, user warnings, account restrictions

📈 Safety Analytics - The Intelligence Layer

  • Best for: Understanding safety trends and improving protection systems
  • Strengths: Data analysis, reporting, continuous improvement
  • Use cases: Safety dashboards, trend analysis, policy optimization

🔧 Step 2: Building Safety Implementation Backend

Let’s build a comprehensive safety system on top of your existing backend. We’ll extend your content moderation with proactive safety features.

Building on your foundation: You already have content moderation working. We’re extending it to create a complete safety ecosystem with prevention, monitoring, and automated response capabilities.

Step 2A: Understanding Safety Implementation Architecture

Before writing code, let’s understand how a safety-first architecture works:

// 🔒 SAFETY IMPLEMENTATION ARCHITECTURE:
// 1. Input Filtering - Validate and filter all user inputs before processing
// 2. Output Screening - Check all AI outputs before showing to users
// 3. Behavioral Tracking - Monitor user actions and detect patterns
// 4. Policy Engine - Automated rule enforcement and consequence application
// 5. Safety Dashboard - Real-time monitoring and management interface
// 6. Incident Response - Automated and manual response to safety violations
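The six layers above amount to a chain of checks where each layer either passes the request along or stops it. Here is a minimal, framework-free sketch of that defense-in-depth idea (the layer functions and their names are illustrative, not part of the tutorial’s code):

```javascript
// Minimal defense-in-depth pipeline: each layer can pass or block a request.
// Layer names are illustrative — the tutorial's real code uses Express middleware.
const inputFilter = (req) =>
  /badword/i.test(req.text) ? { blocked: true, layer: 'input-filter' } : null;

const rateLimit = (() => {
  const counts = new Map(); // naive per-user request counter
  return (req) => {
    const n = (counts.get(req.userId) || 0) + 1;
    counts.set(req.userId, n);
    return n > 3 ? { blocked: true, layer: 'rate-limit' } : null;
  };
})();

const layers = [inputFilter, rateLimit];

// Run every layer in order; the first layer that blocks wins.
function runPipeline(req) {
  for (const layer of layers) {
    const verdict = layer(req);
    if (verdict) return verdict;
  }
  return { blocked: false };
}

console.log(runPipeline({ userId: 'u1', text: 'hello' }));   // allowed
console.log(runPipeline({ userId: 'u1', text: 'badword!' })); // blocked by input filter
```

The real system follows the same shape: the earliest layer that can make a decision makes it, so expensive checks (moderation API calls) only run on traffic that already passed the cheap ones.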

Key safety implementation concepts:

  • Defense in Depth: Multiple layers of protection working together
  • Proactive Prevention: Stopping problems before they occur
  • Automated Enforcement: Consistent policy application without human intervention
  • Continuous Monitoring: Real-time tracking of safety metrics and incidents

Step 2B: Installing Additional Safety Dependencies

Add safety monitoring dependencies to your backend. In your backend folder, run:

npm install rate-limiter-flexible winston express-rate-limit

What these packages do:

  • rate-limiter-flexible: Advanced rate limiting and abuse prevention
  • winston: Comprehensive logging for safety monitoring
  • express-rate-limit: Basic rate limiting middleware
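To build intuition for what rate-limiter-flexible is doing, here is a toy fixed-window limiter in plain JavaScript. This is only a sketch of the points-per-duration idea — the real package adds burst smoothing (execEvenly), Redis and cluster backends, and more, so use the package rather than this in your app:

```javascript
// Toy fixed-window rate limiter: roughly what "points per duration" means.
// Simplified sketch only — rate-limiter-flexible handles this properly.
class ToyRateLimiter {
  constructor({ points, duration }) {
    this.points = points;              // allowed requests per window
    this.duration = duration * 1000;   // window length in ms
    this.windows = new Map();          // key -> { start, used }
  }
  consume(key, now = Date.now()) {
    let w = this.windows.get(key);
    if (!w || now - w.start >= this.duration) {
      w = { start: now, used: 0 };     // start a fresh window
      this.windows.set(key, w);
    }
    if (w.used >= this.points) {
      const msBeforeNext = w.start + this.duration - now;
      throw { msBeforeNext };          // mirrors the package's rejection shape
    }
    w.used += 1;
  }
}

const limiter = new ToyRateLimiter({ points: 2, duration: 60 });
limiter.consume('1.2.3.4:anonymous'); // ok
limiter.consume('1.2.3.4:anonymous'); // ok
// A third call inside the same window would throw with msBeforeNext set.
```

The key passed to consume() is what the middleware below builds from the client IP and user ID, so each client gets its own window.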

Step 2C: Adding Safety Implementation System

Add these safety system components to your existing index.js file. Keep in mind that Express applies middleware in registration order, so the app.use() safety middleware below only protects routes that are registered after it:

import { RateLimiterMemory } from 'rate-limiter-flexible';
import winston from 'winston';
import rateLimit from 'express-rate-limit';
// 🔒 SAFETY CONFIGURATION: System-wide safety settings
const SAFETY_CONFIG = {
// Content filtering settings
content_filtering: {
enabled: true,
block_threshold: 0.8, // Block content with >80% harmful probability
warn_threshold: 0.5, // Warn for content with >50% harmful probability
auto_escalate: true // Automatically escalate high-risk content
},
// Rate limiting settings
rate_limiting: {
requests_per_minute: 60, // Max requests per user per minute
requests_per_hour: 1000, // Max requests per user per hour
burst_protection: true // Enable burst detection
},
// Behavioral monitoring
monitoring: {
track_user_behavior: true,
detection_window: 24, // Hours to analyze behavior patterns
violation_threshold: 3, // Violations before escalation
auto_restrictions: true // Automatic user restrictions
},
// Automated responses
responses: {
content_blocking: true, // Automatically block harmful content
user_warnings: true, // Send warnings for policy violations
account_restrictions: true, // Restrict accounts for repeated violations
admin_notifications: true // Notify admins of serious violations
}
};
// 🛡️ SAFETY LOGGER: Comprehensive safety event logging
const safetyLogger = winston.createLogger({
level: 'info',
format: winston.format.combine(
winston.format.timestamp(),
winston.format.errors({ stack: true }),
winston.format.json()
),
defaultMeta: { service: 'openai-safety-system' },
transports: [
// Note: winston will not create the logs/ directory — create it first (mkdir logs)
new winston.transports.File({
filename: 'logs/safety-error.log',
level: 'error',
maxsize: 5242880, // 5MB
maxFiles: 5
}),
new winston.transports.File({
filename: 'logs/safety-combined.log',
maxsize: 5242880, // 5MB
maxFiles: 10
}),
new winston.transports.Console({
format: winston.format.simple()
})
]
});
// 🚨 RATE LIMITERS: Advanced abuse prevention
const createRateLimiters = () => {
// Note: rate-limiter-flexible has no keyGenerator option — the key
// (IP + user ID) is built and passed to consume() in the middleware below
// Basic request rate limiter (per minute)
const requestLimiter = new RateLimiterMemory({
points: SAFETY_CONFIG.rate_limiting.requests_per_minute,
duration: 60, // 1 minute
execEvenly: true
});
// Hourly request limiter
const hourlyLimiter = new RateLimiterMemory({
points: SAFETY_CONFIG.rate_limiting.requests_per_hour,
duration: 3600, // 1 hour
execEvenly: true
});
// Content generation limiter (stricter for AI features)
const contentLimiter = new RateLimiterMemory({
points: 30, // 30 content generations per hour
duration: 3600,
execEvenly: true
});
return { requestLimiter, hourlyLimiter, contentLimiter };
};
const { requestLimiter, hourlyLimiter, contentLimiter } = createRateLimiters();
// 📊 SAFETY DATABASE: In-memory safety tracking (use real database in production)
const safetyDatabase = {
userViolations: new Map(), // Track user safety violations
contentBlocked: new Map(), // Track blocked content patterns
behaviorPatterns: new Map(), // Track user behavior patterns
safetyMetrics: {
totalRequests: 0,
blockedRequests: 0,
warningsIssued: 0,
accountsRestricted: 0,
lastReset: new Date()
}
};
// 🔧 SAFETY HELPER FUNCTIONS
// Record safety violation
const recordViolation = (userId, violationType, severity, details) => {
const userKey = userId || 'anonymous';
if (!safetyDatabase.userViolations.has(userKey)) {
safetyDatabase.userViolations.set(userKey, []);
}
const violation = {
type: violationType,
severity,
details,
timestamp: new Date(),
id: Date.now() + Math.random()
};
safetyDatabase.userViolations.get(userKey).push(violation);
// Log violation
safetyLogger.warn('Safety violation recorded', {
userId: userKey,
violation
});
// Check if user needs restrictions
if (SAFETY_CONFIG.responses.auto_restrictions) {
checkUserRestrictions(userKey);
}
return violation;
};
// Check if user needs restrictions based on violation history
const checkUserRestrictions = (userId) => {
const violations = safetyDatabase.userViolations.get(userId) || [];
const recentViolations = violations.filter(v =>
Date.now() - v.timestamp.getTime() < (SAFETY_CONFIG.monitoring.detection_window * 60 * 60 * 1000)
);
if (recentViolations.length >= SAFETY_CONFIG.monitoring.violation_threshold) {
// Apply restrictions
restrictUser(userId, 'multiple_violations', recentViolations);
}
};
// Apply user restrictions
const restrictUser = (userId, reason, violations) => {
safetyLogger.error('User restricted', {
userId,
reason,
violationCount: violations.length,
violations: violations.map(v => ({ type: v.type, severity: v.severity }))
});
safetyDatabase.safetyMetrics.accountsRestricted++;
// In a real application, you would update user permissions in your database
console.log(`🚨 User ${userId} restricted for: ${reason}`);
};
// Update safety metrics
const updateSafetyMetrics = (action) => {
safetyDatabase.safetyMetrics.totalRequests++;
switch (action) {
case 'blocked':
safetyDatabase.safetyMetrics.blockedRequests++;
break;
case 'warning':
safetyDatabase.safetyMetrics.warningsIssued++;
break;
}
};
// 🛡️ SAFETY MIDDLEWARE: Proactive protection for all routes
// Rate limiting middleware
const rateLimitMiddleware = async (req, res, next) => {
try {
await requestLimiter.consume(req.ip + ':' + (req.user?.id || 'anonymous'));
await hourlyLimiter.consume(req.ip + ':' + (req.user?.id || 'anonymous'));
next();
} catch (rateLimiterRes) {
const msBeforeNext = rateLimiterRes.msBeforeNext || 1000;
safetyLogger.warn('Rate limit exceeded', {
ip: req.ip,
userId: req.user?.id,
msBeforeNext
});
updateSafetyMetrics('blocked');
res.status(429).json({
error: 'Too many requests',
retryAfter: Math.round(msBeforeNext / 1000) || 1,
success: false
});
}
};
// Content generation rate limiting
const contentRateLimitMiddleware = async (req, res, next) => {
try {
await contentLimiter.consume(req.ip + ':' + (req.user?.id || 'anonymous'));
next();
} catch (rateLimiterRes) {
const msBeforeNext = rateLimiterRes.msBeforeNext || 1000;
safetyLogger.warn('Content generation rate limit exceeded', {
ip: req.ip,
userId: req.user?.id,
endpoint: req.path
});
updateSafetyMetrics('blocked');
res.status(429).json({
error: 'Content generation rate limit exceeded',
message: 'Please wait before generating more content',
retryAfter: Math.round(msBeforeNext / 1000) || 1,
success: false
});
}
};
// Input validation middleware
const inputValidationMiddleware = async (req, res, next) => {
try {
updateSafetyMetrics('request');
// Check for suspicious patterns in request body
const requestText = JSON.stringify(req.body).toLowerCase();
// Basic suspicious pattern detection (illustrative only — these will
// false-positive on legitimate traffic, e.g. login bodies containing
// "password"; tune the list for your application)
const suspiciousPatterns = [
/(?:hack|exploit|vulnerability|injection|xss|csrf)/i,
/(?:admin|root|sudo|password|token|secret)/i,
/(?:delete|drop|truncate|alter)\s+(?:table|database|user)/i
];
const hasSuspiciousContent = suspiciousPatterns.some(pattern =>
pattern.test(requestText)
);
if (hasSuspiciousContent) {
const violation = recordViolation(
req.user?.id,
'suspicious_input',
'medium',
{ patterns: 'security_related', endpoint: req.path }
);
safetyLogger.warn('Suspicious input detected', {
ip: req.ip,
userId: req.user?.id,
endpoint: req.path,
violation
});
updateSafetyMetrics('warning');
}
next();
} catch (error) {
safetyLogger.error('Input validation error', { error: error.message });
next();
}
};
// Content safety wrapper function
const withContentSafety = (originalFunction) => {
return async (req, res) => {
try {
// Call original function but intercept the response
const originalSend = res.send;
let responseData = null;
res.send = function(data) {
responseData = data;
return originalSend.call(this, data);
};
// Execute original function
await originalFunction(req, res);
// Analyze response for safety if it was successful
if (res.statusCode === 200 && responseData) {
try {
const parsedResponse = typeof responseData === 'string' ?
JSON.parse(responseData) : responseData;
if (parsedResponse.response || parsedResponse.result) {
const content = parsedResponse.response || parsedResponse.result;
await analyzeOutputSafety(content, req.user?.id, req.path);
}
} catch (parseError) {
// Response wasn't JSON, skip safety analysis
}
}
} catch (error) {
safetyLogger.error('Content safety wrapper error', {
error: error.message,
endpoint: req.path
});
res.status(500).json({
error: 'Safety system error',
details: 'Content could not be processed safely',
success: false
});
}
};
};
// Analyze output safety
const analyzeOutputSafety = async (content, userId, endpoint) => {
try {
// Use existing moderation endpoint for analysis
const moderationResponse = await fetch('http://localhost:8000/api/moderation/text', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ text: content })
});
const moderationResult = await moderationResponse.json();
if (moderationResult.flagged) {
const severity = moderationResult.category_scores?.some(score =>
score.score > SAFETY_CONFIG.content_filtering.block_threshold
) ? 'high' : 'medium';
recordViolation(
userId,
'harmful_output',
severity,
{
categories: moderationResult.categories,
endpoint,
blocked: severity === 'high'
}
);
if (severity === 'high') {
safetyLogger.error('High-risk content generated', {
userId,
endpoint,
categories: moderationResult.categories
});
}
}
} catch (error) {
safetyLogger.error('Output safety analysis failed', { error: error.message });
}
};
// 🔒 SAFETY ENDPOINTS: Safety management and monitoring
// Apply safety middleware to content generation routes
// ⚠️ Middleware only runs for routes registered after these app.use() calls —
// make sure these lines come before the route handlers they protect
app.use('/api/chat', rateLimitMiddleware);
app.use('/api/images', contentRateLimitMiddleware);
app.use('/api/voice', contentRateLimitMiddleware);
app.use('/api/structured', contentRateLimitMiddleware);
// Apply input validation to all API routes
app.use('/api', inputValidationMiddleware);
// Safety dashboard endpoint
app.get("/api/safety/dashboard", (req, res) => {
try {
// Calculate safety metrics
const now = new Date();
const timeSinceReset = now - safetyDatabase.safetyMetrics.lastReset;
const hoursActive = timeSinceReset / (1000 * 60 * 60);
const metrics = {
current_metrics: safetyDatabase.safetyMetrics,
rates: {
requests_per_hour: hoursActive > 0 ?
Math.round(safetyDatabase.safetyMetrics.totalRequests / hoursActive) : 0,
block_rate: safetyDatabase.safetyMetrics.totalRequests > 0 ?
(safetyDatabase.safetyMetrics.blockedRequests / safetyDatabase.safetyMetrics.totalRequests * 100).toFixed(2) : 0,
warning_rate: safetyDatabase.safetyMetrics.totalRequests > 0 ?
(safetyDatabase.safetyMetrics.warningsIssued / safetyDatabase.safetyMetrics.totalRequests * 100).toFixed(2) : 0
},
system_status: {
content_filtering: SAFETY_CONFIG.content_filtering.enabled,
rate_limiting: SAFETY_CONFIG.rate_limiting.requests_per_minute > 0,
behavioral_monitoring: SAFETY_CONFIG.monitoring.track_user_behavior,
automated_responses: SAFETY_CONFIG.responses.content_blocking
},
recent_violations: Array.from(safetyDatabase.userViolations.entries())
.flatMap(([userId, violations]) =>
violations.slice(-5).map(v => ({ ...v, userId }))
)
.sort((a, b) => b.timestamp - a.timestamp)
.slice(0, 10)
};
res.json({
success: true,
...metrics,
timestamp: now.toISOString()
});
} catch (error) {
safetyLogger.error('Dashboard endpoint error', { error: error.message });
res.status(500).json({
error: 'Failed to load safety dashboard',
details: error.message,
success: false
});
}
});
// User safety status endpoint
app.get("/api/safety/user/:userId", (req, res) => {
try {
const { userId } = req.params;
const violations = safetyDatabase.userViolations.get(userId) || [];
const recentViolations = violations.filter(v =>
Date.now() - v.timestamp.getTime() < (24 * 60 * 60 * 1000) // Last 24 hours
);
const riskLevel = recentViolations.length >= 3 ? 'high' :
recentViolations.length >= 1 ? 'medium' : 'low';
res.json({
success: true,
user_id: userId,
risk_level: riskLevel,
total_violations: violations.length,
recent_violations: recentViolations.length,
violations: violations.slice(-10), // Last 10 violations
restrictions_active: riskLevel === 'high',
last_violation: violations.length > 0 ? violations[violations.length - 1].timestamp : null
});
} catch (error) {
safetyLogger.error('User safety status error', { error: error.message });
res.status(500).json({
error: 'Failed to get user safety status',
details: error.message,
success: false
});
}
});
// Safety incident reporting endpoint
app.post("/api/safety/report", (req, res) => {
try {
const {
incident_type,
description,
content_id = null,
user_id = null,
severity = 'medium'
} = req.body;
if (!incident_type || !description) {
return res.status(400).json({
error: 'Incident type and description are required',
success: false
});
}
const incident = {
id: Date.now() + Math.random(),
type: incident_type,
description,
content_id,
user_id,
severity,
reported_by: req.user?.id || 'anonymous',
timestamp: new Date(),
status: 'reported'
};
// Record as violation if user_id provided
if (user_id) {
recordViolation(user_id, 'reported_incident', severity, {
incident_type,
description,
reported_by: req.user?.id
});
}
safetyLogger.warn('Safety incident reported', { incident });
res.json({
success: true,
incident_id: incident.id,
status: 'reported',
message: 'Incident has been recorded and will be reviewed'
});
} catch (error) {
safetyLogger.error('Incident reporting error', { error: error.message });
res.status(500).json({
error: 'Failed to report incident',
details: error.message,
success: false
});
}
});
// Safety configuration endpoint
app.get("/api/safety/config", (req, res) => {
try {
res.json({
success: true,
config: SAFETY_CONFIG,
timestamp: new Date().toISOString()
});
} catch (error) {
res.status(500).json({
error: 'Failed to get safety configuration',
success: false
});
}
});
// Update safety configuration endpoint (admin only)
app.post("/api/safety/config", (req, res) => {
try {
const { config } = req.body;
if (!config) {
return res.status(400).json({
error: 'Configuration object is required',
success: false
});
}
// Merge with existing config (in production, validate admin permissions)
Object.assign(SAFETY_CONFIG, config);
safetyLogger.info('Safety configuration updated', {
newConfig: config,
updatedBy: req.user?.id || 'anonymous'
});
res.json({
success: true,
message: 'Safety configuration updated',
current_config: SAFETY_CONFIG
});
} catch (error) {
safetyLogger.error('Configuration update error', { error: error.message });
res.status(500).json({
error: 'Failed to update safety configuration',
details: error.message,
success: false
});
}
});
// Initialize safety system
console.log('🔒 Safety implementation system initialized');
safetyLogger.info('Safety system started', { config: SAFETY_CONFIG });

Function breakdown:

  1. Proactive filtering - Prevent harmful content before it’s processed
  2. Rate limiting - Protect against abuse and overuse
  3. Behavioral monitoring - Track patterns and detect violations
  4. Automated responses - Take action when safety issues are detected
  5. Comprehensive logging - Record all safety events for analysis
  6. Safety dashboard - Monitor system health and safety metrics
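The escalation rule at the heart of the behavioral monitoring — restrict a user once they accumulate violation_threshold violations inside the detection_window — can be checked in isolation. This sketch reimplements just that check with plain numeric timestamps (the config values mirror the tutorial’s defaults; the real code stores Date objects):

```javascript
// Standalone check of the escalation rule used by checkUserRestrictions():
// restrict once `violation_threshold` violations fall inside the
// `detection_window` (hours). Timestamps here are plain epoch-ms numbers.
const MONITORING = { detection_window: 24, violation_threshold: 3 };

function shouldRestrict(violations, now = Date.now()) {
  const windowMs = MONITORING.detection_window * 60 * 60 * 1000;
  const recent = violations.filter(v => now - v.timestamp < windowMs);
  return recent.length >= MONITORING.violation_threshold;
}

const now = Date.now();
const hours = (h) => h * 60 * 60 * 1000;

// Two recent violations plus one stale one: below the threshold.
console.log(shouldRestrict([
  { timestamp: now - hours(1) },
  { timestamp: now - hours(2) },
  { timestamp: now - hours(48) } // outside the 24h window — ignored
], now)); // false

// Three violations inside the window: restriction triggers.
console.log(shouldRestrict([
  { timestamp: now - hours(1) },
  { timestamp: now - hours(2) },
  { timestamp: now - hours(3) }
], now)); // true
```

Sliding the window like this means old violations age out automatically — a user who misbehaved last week but has been clean since will not be restricted by a single new violation.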

🔧 Step 3: Building the React Safety Dashboard Component

Now let’s create a comprehensive safety management interface that integrates with your existing application.

Step 3A: Creating the Safety Dashboard Component

Create a new file src/SafetyDashboard.jsx:

import { useState, useEffect } from "react";
import { Shield, AlertTriangle, Users, Activity, Settings, TrendingUp, Eye, Ban } from "lucide-react";
function SafetyDashboard() {
// 🧠 STATE: Safety dashboard data management
const [safetyMetrics, setSafetyMetrics] = useState(null); // Dashboard metrics
const [isLoading, setIsLoading] = useState(true); // Loading status
const [error, setError] = useState(null); // Error messages
const [selectedUser, setSelectedUser] = useState(""); // User lookup
const [userStatus, setUserStatus] = useState(null); // User safety status
const [incidentReport, setIncidentReport] = useState({ // Incident reporting
type: "",
description: "",
userId: "",
severity: "medium"
});
const [config, setConfig] = useState(null); // Safety configuration
const [activeTab, setActiveTab] = useState("overview"); // Active dashboard tab
// 🔧 FUNCTIONS: Safety dashboard logic engine
// Load safety dashboard data
const loadSafetyMetrics = async () => {
setIsLoading(true);
setError(null);
try {
const response = await fetch("http://localhost:8000/api/safety/dashboard");
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Failed to load safety metrics');
}
setSafetyMetrics(data);
} catch (error) {
console.error('Failed to load safety metrics:', error);
setError(error.message || 'Could not load safety dashboard');
} finally {
setIsLoading(false);
}
};
// Load safety configuration
const loadSafetyConfig = async () => {
try {
const response = await fetch("http://localhost:8000/api/safety/config");
const data = await response.json();
if (response.ok) {
setConfig(data.config);
}
} catch (error) {
console.error('Failed to load safety config:', error);
}
};
// Look up user safety status
const lookupUserStatus = async () => {
if (!selectedUser.trim()) {
setError('User ID is required');
return;
}
setError(null);
try {
const response = await fetch(`http://localhost:8000/api/safety/user/${selectedUser.trim()}`);
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Failed to get user status');
}
setUserStatus(data);
} catch (error) {
console.error('User lookup failed:', error);
setError(error.message || 'Could not look up user status');
}
};
// Report safety incident
const reportIncident = async () => {
if (!incidentReport.type || !incidentReport.description) {
setError('Incident type and description are required');
return;
}
setError(null);
try {
const response = await fetch("http://localhost:8000/api/safety/report", {
method: "POST",
headers: {
"Content-Type": "application/json"
},
body: JSON.stringify({
incident_type: incidentReport.type,
description: incidentReport.description,
user_id: incidentReport.userId || null,
severity: incidentReport.severity
})
});
const data = await response.json();
if (!response.ok) {
throw new Error(data.error || 'Failed to report incident');
}
// Reset form
setIncidentReport({
type: "",
description: "",
userId: "",
severity: "medium"
});
// Reload metrics to show new incident
loadSafetyMetrics();
alert('Incident reported successfully');
} catch (error) {
console.error('Incident reporting failed:', error);
setError(error.message || 'Could not report incident');
}
};
// Format timestamp for display
const formatTimestamp = (timestamp) => {
return new Date(timestamp).toLocaleString();
};
// Get risk level color
const getRiskColor = (level) => {
switch (level) {
case 'high': return 'text-red-600 bg-red-100';
case 'medium': return 'text-yellow-600 bg-yellow-100';
case 'low': return 'text-green-600 bg-green-100';
default: return 'text-gray-600 bg-gray-100';
}
};
// Get severity color
const getSeverityColor = (severity) => {
switch (severity) {
case 'high': return 'bg-red-500';
case 'medium': return 'bg-yellow-500';
case 'low': return 'bg-green-500';
default: return 'bg-gray-500';
}
};
// Load data on component mount
useEffect(() => {
loadSafetyMetrics();
loadSafetyConfig();
// Set up auto-refresh every 30 seconds
const interval = setInterval(loadSafetyMetrics, 30000);
return () => clearInterval(interval);
}, []);
// 🎨 UI: Safety dashboard interface
return (
<div className="min-h-screen bg-gradient-to-br from-red-50 to-orange-50 flex items-center justify-center p-4">
<div className="bg-white rounded-2xl shadow-2xl w-full max-w-7xl flex flex-col overflow-hidden">
{/* Header */}
<div className="bg-gradient-to-r from-red-600 to-orange-600 text-white p-6">
<div className="flex items-center space-x-3">
<div className="w-10 h-10 bg-white bg-opacity-20 rounded-full flex items-center justify-center">
<Shield className="w-5 h-5" />
</div>
<div>
<h1 className="text-xl font-bold">🔒 Safety Implementation</h1>
<p className="text-red-100 text-sm">Comprehensive safety monitoring and management system!</p>
</div>
</div>
</div>
{/* Tab Navigation */}
<div className="border-b border-gray-200">
<nav className="flex">
<button
onClick={() => setActiveTab('overview')}
className={`px-6 py-3 font-medium text-sm border-b-2 transition-colors duration-200 ${
activeTab === 'overview'
? 'border-red-500 text-red-600'
: 'border-transparent text-gray-500 hover:text-gray-700'
}`}
>
<Activity className="w-4 h-4 inline mr-2" />
Overview
</button>
<button
onClick={() => setActiveTab('users')}
className={`px-6 py-3 font-medium text-sm border-b-2 transition-colors duration-200 ${
activeTab === 'users'
? 'border-red-500 text-red-600'
: 'border-transparent text-gray-500 hover:text-gray-700'
}`}
>
<Users className="w-4 h-4 inline mr-2" />
User Safety
</button>
<button
onClick={() => setActiveTab('incidents')}
className={`px-6 py-3 font-medium text-sm border-b-2 transition-colors duration-200 ${
activeTab === 'incidents'
? 'border-red-500 text-red-600'
: 'border-transparent text-gray-500 hover:text-gray-700'
}`}
>
<AlertTriangle className="w-4 h-4 inline mr-2" />
Incidents
</button>
<button
onClick={() => setActiveTab('config')}
className={`px-6 py-3 font-medium text-sm border-b-2 transition-colors duration-200 ${
activeTab === 'config'
? 'border-red-500 text-red-600'
: 'border-transparent text-gray-500 hover:text-gray-700'
}`}
>
<Settings className="w-4 h-4 inline mr-2" />
Configuration
</button>
</nav>
</div>
{/* Error Display */}
{error && (
<div className="p-4 bg-red-50 border-b border-red-200">
<p className="text-red-700 text-sm">
<strong>Error:</strong> {error}
</p>
</div>
)}
{/* Main Content */}
<div className="flex-1 p-6">
{/* Overview Tab */}
{activeTab === 'overview' && (
<div className="space-y-6">
{isLoading ? (
<div className="text-center py-12">
<div className="animate-spin w-8 h-8 border-4 border-red-500 border-t-transparent rounded-full mx-auto mb-4"></div>
<p className="text-gray-600">Loading safety metrics...</p>
</div>
) : safetyMetrics ? (
<>
{/* Metrics Cards */}
<div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-4 gap-4">
<div className="bg-blue-50 rounded-lg p-4">
<div className="flex items-center">
<TrendingUp className="w-8 h-8 text-blue-600" />
<div className="ml-3">
<p className="text-sm font-medium text-blue-600">Total Requests</p>
<p className="text-2xl font-bold text-blue-900">
{safetyMetrics.current_metrics.totalRequests.toLocaleString()}
</p>
</div>
</div>
</div>
<div className="bg-red-50 rounded-lg p-4">
<div className="flex items-center">
<Ban className="w-8 h-8 text-red-600" />
<div className="ml-3">
<p className="text-sm font-medium text-red-600">Blocked Requests</p>
<p className="text-2xl font-bold text-red-900">
{safetyMetrics.current_metrics.blockedRequests.toLocaleString()}
</p>
<p className="text-xs text-red-700">
{safetyMetrics.rates.block_rate}% of total
</p>
</div>
</div>
</div>
<div className="bg-yellow-50 rounded-lg p-4">
<div className="flex items-center">
<AlertTriangle className="w-8 h-8 text-yellow-600" />
<div className="ml-3">
<p className="text-sm font-medium text-yellow-600">Warnings Issued</p>
<p className="text-2xl font-bold text-yellow-900">
{safetyMetrics.current_metrics.warningsIssued.toLocaleString()}
</p>
<p className="text-xs text-yellow-700">
{safetyMetrics.rates.warning_rate}% of total
</p>
</div>
</div>
</div>
<div className="bg-purple-50 rounded-lg p-4">
<div className="flex items-center">
<Users className="w-8 h-8 text-purple-600" />
<div className="ml-3">
<p className="text-sm font-medium text-purple-600">Accounts Restricted</p>
<p className="text-2xl font-bold text-purple-900">
{safetyMetrics.current_metrics.accountsRestricted.toLocaleString()}
</p>
</div>
</div>
</div>
</div>
{/* System Status */}
<div className="bg-gray-50 rounded-lg p-6">
<h3 className="font-semibold text-gray-900 mb-4 flex items-center">
<Shield className="w-5 h-5 mr-2 text-red-600" />
System Status
</h3>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4">
{Object.entries(safetyMetrics.system_status).map(([key, status]) => (
<div key={key} className="flex items-center space-x-2">
<div className={`w-3 h-3 rounded-full ${status ? 'bg-green-500' : 'bg-red-500'}`}></div>
<span className="text-sm text-gray-700 capitalize">
{key.replace(/_/g, ' ')}
</span>
</div>
))}
</div>
</div>
{/* Recent Violations */}
<div className="bg-white border rounded-lg p-6">
<h3 className="font-semibold text-gray-900 mb-4 flex items-center">
<Eye className="w-5 h-5 mr-2 text-red-600" />
Recent Violations ({safetyMetrics.recent_violations.length})
</h3>
{safetyMetrics.recent_violations.length === 0 ? (
<p className="text-gray-500 text-center py-4">No recent violations</p>
) : (
<div className="space-y-3">
{safetyMetrics.recent_violations.map((violation, index) => (
<div key={violation.id || index} className="flex items-center justify-between p-3 bg-gray-50 rounded-lg">
<div className="flex items-center space-x-3">
<div className={`w-3 h-3 rounded-full ${getSeverityColor(violation.severity)}`}></div>
<div>
<p className="font-medium text-gray-900">{violation.type.replace(/_/g, ' ')}</p>
<p className="text-sm text-gray-600">User: {violation.userId}</p>
</div>
</div>
<div className="text-right">
<p className="text-sm text-gray-500">{formatTimestamp(violation.timestamp)}</p>
<span className={`px-2 py-1 rounded text-xs font-medium ${
violation.severity === 'high' ? 'bg-red-100 text-red-700' :
violation.severity === 'medium' ? 'bg-yellow-100 text-yellow-700' :
'bg-green-100 text-green-700'
}`}>
{violation.severity}
</span>
</div>
</div>
))}
</div>
)}
</div>
</>
) : (
<div className="text-center py-12">
<Shield className="w-16 h-16 text-gray-400 mx-auto mb-4" />
<p className="text-gray-600">No safety metrics available</p>
</div>
)}
</div>
)}
{/* User Safety Tab */}
{activeTab === 'users' && (
<div className="space-y-6">
<div className="bg-white border rounded-lg p-6">
<h3 className="font-semibold text-gray-900 mb-4">User Safety Lookup</h3>
<div className="flex space-x-3 mb-4">
<input
type="text"
value={selectedUser}
onChange={(e) => setSelectedUser(e.target.value)}
placeholder="Enter user ID..."
className="flex-1 px-4 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500"
/>
<button
onClick={lookupUserStatus}
disabled={!selectedUser.trim()}
className="px-6 py-2 bg-red-600 text-white rounded-lg hover:bg-red-700 disabled:opacity-50 transition-colors duration-200"
>
Lookup
</button>
</div>
{userStatus && (
<div className="mt-6 p-4 bg-gray-50 rounded-lg">
<div className="flex items-center justify-between mb-4">
<h4 className="font-medium text-gray-900">User: {userStatus.user_id}</h4>
<span className={`px-3 py-1 rounded-lg text-sm font-medium ${getRiskColor(userStatus.risk_level)}`}>
{userStatus.risk_level.toUpperCase()} RISK
</span>
</div>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4 mb-4">
<div>
<p className="text-sm text-gray-600">Total Violations</p>
<p className="text-lg font-semibold">{userStatus.total_violations}</p>
</div>
<div>
<p className="text-sm text-gray-600">Recent Violations</p>
<p className="text-lg font-semibold">{userStatus.recent_violations}</p>
</div>
<div>
<p className="text-sm text-gray-600">Restrictions</p>
<p className={`text-lg font-semibold ${userStatus.restrictions_active ? 'text-red-600' : 'text-green-600'}`}>
{userStatus.restrictions_active ? 'Active' : 'None'}
</p>
</div>
<div>
<p className="text-sm text-gray-600">Last Violation</p>
<p className="text-sm text-gray-700">
{userStatus.last_violation ? formatTimestamp(userStatus.last_violation) : 'None'}
</p>
</div>
</div>
{userStatus.violations.length > 0 && (
<div>
<h5 className="font-medium text-gray-900 mb-2">Recent Violations</h5>
<div className="space-y-2 max-h-40 overflow-y-auto">
{userStatus.violations.map((violation, index) => (
<div key={violation.id || index} className="flex items-center justify-between p-2 bg-white rounded">
<div>
<p className="text-sm font-medium">{violation.type.replace(/_/g, ' ')}</p>
<p className="text-xs text-gray-600">{formatTimestamp(violation.timestamp)}</p>
</div>
<span className={`px-2 py-1 rounded text-xs ${
violation.severity === 'high' ? 'bg-red-100 text-red-700' :
violation.severity === 'medium' ? 'bg-yellow-100 text-yellow-700' :
'bg-green-100 text-green-700'
}`}>
{violation.severity}
</span>
</div>
))}
</div>
</div>
)}
</div>
)}
</div>
</div>
)}
{/* Incidents Tab */}
{activeTab === 'incidents' && (
<div className="space-y-6">
<div className="bg-white border rounded-lg p-6">
<h3 className="font-semibold text-gray-900 mb-4">Report Safety Incident</h3>
<div className="space-y-4">
<div className="grid grid-cols-1 md:grid-cols-2 gap-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Incident Type
</label>
<select
value={incidentReport.type}
onChange={(e) => setIncidentReport({...incidentReport, type: e.target.value})}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500"
>
<option value="">Select incident type...</option>
<option value="harassment">Harassment</option>
<option value="harmful_content">Harmful Content</option>
<option value="spam">Spam</option>
<option value="abuse">System Abuse</option>
<option value="security">Security Issue</option>
<option value="other">Other</option>
</select>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Severity
</label>
<select
value={incidentReport.severity}
onChange={(e) => setIncidentReport({...incidentReport, severity: e.target.value})}
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500"
>
<option value="low">Low</option>
<option value="medium">Medium</option>
<option value="high">High</option>
</select>
</div>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
User ID (Optional)
</label>
<input
type="text"
value={incidentReport.userId}
onChange={(e) => setIncidentReport({...incidentReport, userId: e.target.value})}
placeholder="User ID if applicable..."
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500"
/>
</div>
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Description
</label>
<textarea
value={incidentReport.description}
onChange={(e) => setIncidentReport({...incidentReport, description: e.target.value})}
placeholder="Describe the safety incident..."
rows="4"
className="w-full px-3 py-2 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500"
/>
</div>
<button
onClick={reportIncident}
disabled={!incidentReport.type || !incidentReport.description}
className="px-6 py-2 bg-red-600 text-white rounded-lg hover:bg-red-700 disabled:opacity-50 transition-colors duration-200"
>
Report Incident
</button>
</div>
</div>
</div>
)}
{/* Configuration Tab */}
{activeTab === 'config' && (
<div className="space-y-6">
<div className="bg-white border rounded-lg p-6">
<h3 className="font-semibold text-gray-900 mb-4">Safety Configuration</h3>
{config ? (
<div className="space-y-6">
<div>
<h4 className="font-medium text-gray-900 mb-3">Content Filtering</h4>
<div className="grid grid-cols-1 md:grid-cols-3 gap-4 text-sm">
<div>
<span className="text-gray-600">Enabled:</span>
<span className={`ml-2 ${config.content_filtering.enabled ? 'text-green-600' : 'text-red-600'}`}>
{config.content_filtering.enabled ? 'Yes' : 'No'}
</span>
</div>
<div>
<span className="text-gray-600">Block Threshold:</span>
<span className="ml-2 text-gray-900">{config.content_filtering.block_threshold}</span>
</div>
<div>
<span className="text-gray-600">Warn Threshold:</span>
<span className="ml-2 text-gray-900">{config.content_filtering.warn_threshold}</span>
</div>
</div>
</div>
<div>
<h4 className="font-medium text-gray-900 mb-3">Rate Limiting</h4>
<div className="grid grid-cols-1 md:grid-cols-2 gap-4 text-sm">
<div>
<span className="text-gray-600">Requests/Minute:</span>
<span className="ml-2 text-gray-900">{config.rate_limiting.requests_per_minute}</span>
</div>
<div>
<span className="text-gray-600">Requests/Hour:</span>
<span className="ml-2 text-gray-900">{config.rate_limiting.requests_per_hour}</span>
</div>
</div>
</div>
<div>
<h4 className="font-medium text-gray-900 mb-3">Monitoring</h4>
<div className="grid grid-cols-1 md:grid-cols-3 gap-4 text-sm">
<div>
<span className="text-gray-600">Behavior Tracking:</span>
<span className={`ml-2 ${config.monitoring.track_user_behavior ? 'text-green-600' : 'text-red-600'}`}>
{config.monitoring.track_user_behavior ? 'Enabled' : 'Disabled'}
</span>
</div>
<div>
<span className="text-gray-600">Detection Window:</span>
<span className="ml-2 text-gray-900">{config.monitoring.detection_window} hours</span>
</div>
<div>
<span className="text-gray-600">Violation Threshold:</span>
<span className="ml-2 text-gray-900">{config.monitoring.violation_threshold}</span>
</div>
</div>
</div>
<div>
<h4 className="font-medium text-gray-900 mb-3">Automated Responses</h4>
<div className="grid grid-cols-2 md:grid-cols-4 gap-4 text-sm">
{Object.entries(config.responses).map(([key, enabled]) => (
<div key={key}>
<span className="text-gray-600 capitalize">{key.replace(/_/g, ' ')}:</span>
<span className={`ml-2 ${enabled ? 'text-green-600' : 'text-red-600'}`}>
{enabled ? 'On' : 'Off'}
</span>
</div>
))}
</div>
</div>
</div>
) : (
<p className="text-gray-500">Loading configuration...</p>
)}
</div>
</div>
)}
</div>
{/* Refresh Button */}
<div className="p-4 border-t border-gray-200">
<button
onClick={loadSafetyMetrics}
disabled={isLoading}
className="px-4 py-2 bg-red-100 text-red-700 rounded-lg hover:bg-red-200 disabled:opacity-50 transition-colors duration-200"
>
{isLoading ? 'Refreshing...' : 'Refresh Data'}
</button>
</div>
</div>
</div>
);
}
export default SafetyDashboard;
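One small cleanup worth considering: the severity badge classes appear twice in the component above (once in the recent-violations feed, once in the user-lookup panel). A pure helper (a suggested refactor, not part of the original tutorial code) keeps that mapping in one place:

```javascript
// Map a violation severity to the Tailwind badge classes used in the dashboard.
// Unknown values fall back to the green/low styling, matching the component's
// existing ternary chain.
function severityBadgeClasses(severity) {
  switch (severity) {
    case 'high':
      return 'bg-red-100 text-red-700';
    case 'medium':
      return 'bg-yellow-100 text-yellow-700';
    default:
      return 'bg-green-100 text-green-700';
  }
}

// Usage inside JSX:
// <span className={`px-2 py-1 rounded text-xs font-medium ${severityBadgeClasses(violation.severity)}`}>
console.log(severityBadgeClasses('high')); // 'bg-red-100 text-red-700'
```

This way a future severity level (say, `critical`) only needs to be added in one spot.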

Step 3B: Adding Safety Dashboard to Navigation

Update your src/App.jsx to include the safety implementation tab in Module 3:

// Add to your existing imports
import SafetyDashboard from "./SafetyDashboard";
import { MessageSquare, Image, Mic, Folder, Volume2, Eye, Phone, Link, FileText, Shield } from "lucide-react";

// Update your currentView state to include 'safety'
const [currentView, setCurrentView] = useState("chat"); // Add 'safety' to options

// Add this Safety tab button after your existing tabs
// (you can group it with your other Module 3 features):
<button
  onClick={() => setCurrentView("safety")}
  className={`px-3 py-2 rounded-lg flex items-center space-x-2 transition-all duration-200 whitespace-nowrap ${
    currentView === "safety"
      ? "bg-red-100 text-red-700 shadow-sm"
      : "text-gray-600 hover:text-gray-900 hover:bg-gray-100"
  }`}
>
  <Shield className="w-4 h-4" />
  <span>Safety</span>
</button>

// Add to your main content section:
{currentView === "safety" && <SafetyDashboard />}

Let’s test your comprehensive safety system step by step.

Test safety dashboard:

# Test the safety dashboard endpoint
curl http://localhost:8000/api/safety/dashboard
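The dashboard component reads `current_metrics`, `system_status`, and `recent_violations` from this response, so a quick shape check can catch backend mismatches early. Here is a minimal validator (a sketch assuming only those three keys; your actual payload may carry more fields):

```javascript
// Verify a dashboard payload has the keys the SafetyDashboard component reads.
// Returns an array of error strings; an empty array means the shape looks right.
function validateDashboardPayload(payload) {
  if (!payload || typeof payload !== 'object') return ['payload is not an object'];
  const errors = [];
  if (typeof payload.current_metrics !== 'object' || payload.current_metrics === null) {
    errors.push('missing current_metrics object');
  }
  if (typeof payload.system_status !== 'object' || payload.system_status === null) {
    errors.push('missing system_status object');
  }
  if (!Array.isArray(payload.recent_violations)) {
    errors.push('missing recent_violations array');
  }
  return errors;
}

// Example: a payload shaped like the component expects passes with no errors.
const sample = {
  current_metrics: { accountsRestricted: 3 },
  system_status: { content_filtering: true },
  recent_violations: [],
};
console.log(validateDashboardPayload(sample)); // []
```

Running this against the `curl` output before wiring up the UI saves a round of debugging blank dashboard panels.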

Test user lookup:

# Test user safety status
curl http://localhost:8000/api/safety/user/test-user
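The lookup response drives the risk badge via `getRiskColor(userStatus.risk_level)`. How the backend derives `risk_level` is up to your implementation; one plausible policy (a hypothetical classifier for illustration, not the tutorial's actual server logic) grades users by recent violation count and active restrictions:

```javascript
// Hypothetical risk classification from a user's safety record.
// Thresholds are illustrative — tune them to your own safety policy.
function classifyRisk({ recentViolations, restrictionsActive }) {
  if (restrictionsActive || recentViolations >= 5) return 'high';
  if (recentViolations >= 2) return 'medium';
  return 'low';
}

console.log(classifyRisk({ recentViolations: 0, restrictionsActive: false })); // 'low'
console.log(classifyRisk({ recentViolations: 3, restrictionsActive: false })); // 'medium'
console.log(classifyRisk({ recentViolations: 1, restrictionsActive: true }));  // 'high'
```

Whatever rule you choose, keep it in one server-side function so the badge in the UI always agrees with the enforcement logic.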

Start both servers and test the complete safety flow:

  1. Navigate to Safety → Click the “Safety” tab
  2. View safety metrics → Check system overview and status
  3. Look up user safety → Search for user safety status
  4. Report incidents → Test incident reporting system
  5. Monitor violations → Watch real-time safety violations
  6. Test rate limiting → Make rapid requests to trigger limits
  7. Review configuration → Check safety system settings

Test real safety scenarios:

🔴 Rate limiting: Make multiple rapid API calls
🔴 Suspicious input: Try potentially harmful content
🔴 Violation patterns: Simulate repeated policy violations
🔴 Incident reporting: Report various types of safety issues
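The rate-limiting scenario above is easiest to reason about as a sliding window: keep timestamps of each user's recent requests and reject once the window is full. A simplified in-memory sketch (illustrative only — production systems typically back this with Redis or similar shared storage):

```javascript
// Simple sliding-window rate limiter: allow at most `limit` requests
// per `windowMs` milliseconds for each key (e.g. a user ID).
function createRateLimiter(limit, windowMs) {
  const hits = new Map(); // key -> array of request timestamps

  return function allow(key, now = Date.now()) {
    // Drop timestamps that have aged out of the window.
    const timestamps = (hits.get(key) || []).filter((t) => now - t < windowMs);
    if (timestamps.length >= limit) {
      hits.set(key, timestamps);
      return false; // over the limit — reject this request
    }
    timestamps.push(now);
    hits.set(key, timestamps);
    return true;
  };
}

// Example: 3 requests per minute (timestamps passed explicitly for clarity).
const allow = createRateLimiter(3, 60_000);
console.log(allow('user-1', 0));      // true
console.log(allow('user-1', 1));      // true
console.log(allow('user-1', 2));      // true
console.log(allow('user-1', 3));      // false — fourth request inside the window
console.log(allow('user-1', 60_001)); // true — earliest requests have expired
```

When you hammer the API in the test above, this is the kind of behavior you should see: a burst of successes, then rejections until the window slides forward.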

Congratulations! You’ve implemented a comprehensive safety system:

  • Proactive content filtering with real-time input/output screening
  • Advanced rate limiting with abuse prevention and burst protection
  • Behavioral monitoring with pattern detection and violation tracking
  • Automated responses with policy enforcement and user restrictions
  • Safety dashboard with real-time monitoring and incident management
  • Comprehensive logging with detailed safety event tracking

Your Module 3 safety implementation includes:

  • Content moderation (existing) - Detect harmful content
  • Safety implementation (new) - Prevent, monitor, and respond to safety issues
  • Integrated protection across all application features
  • Real-time monitoring with automated responses
  • Professional safety dashboard for system management

Next up: Performance optimization, cost management, and production deployment strategies to complete Module 3.

Your OpenAI application now has enterprise-grade safety protection! 🔒