🛡️ AI Content Moderation Made Simple
Right now, your AI application can chat, generate images, transcribe audio, analyze files, synthesize speech, understand vision, have voice conversations, call functions, and search the web. But what if users input harmful content, or your AI generates inappropriate responses?
Content moderation makes safe AI deployment possible. Instead of hoping users behave appropriately, your application can automatically detect harmful content in both user inputs and AI outputs, and take action to protect users and maintain a safe environment.
You’re about to learn exactly how to add comprehensive content safety to your existing application.
🧠 Step 1: Understanding AI Content Moderation
Before we write any code, let’s understand what AI content moderation actually means and why it’s essential for any production AI application.
What AI Content Moderation Actually Means
AI content moderation is like having a professional safety team monitoring your application 24/7. It automatically scans all text and images for harmful content, flagging or filtering anything that violates safety policies before users see it.
Real-world analogy: It’s like hiring security guards for your application who can instantly recognize inappropriate content and take action. Instead of manually reviewing every message and image, your AI safety system works automatically to maintain a safe environment.
Why Content Moderation is Critical
Without content moderation, your AI application faces serious risks:
- User safety - Exposure to harmful, threatening, or inappropriate content
- Legal liability - Potential violations of content policies and regulations
- Reputation damage - Association with harmful or offensive material
- Platform compliance - Risk of removal from app stores or hosting platforms
- Business continuity - Potential service disruptions due to safety incidents
Content Categories Your Moderation Will Detect
Your content moderation system will identify multiple categories of harmful content:
🚫 Harassment Content
- What it detects: Harassing language, threats, bullying behavior
- Examples: Personal attacks, intimidation, targeted harassment
- Action: Flag and optionally filter or warn users
🔥 Hate Speech
- What it detects: Content targeting protected groups
- Examples: Discrimination based on race, gender, religion, nationality
- Action: Automatic removal and user education
⚔️ Violence & Graphic Content
- What it detects: Depictions of violence, graphic injury, death
- Examples: Violent imagery, graphic descriptions, threats of violence
- Action: Immediate filtering and incident logging
💔 Self-Harm Content
- What it detects: Content promoting or instructing self-harm
- Examples: Suicide instructions, self-injury promotion
- Action: Block content and provide mental health resources
🔞 Sexual Content
- What it detects: Sexually explicit material, especially involving minors
- Examples: Adult content, inappropriate sexual material
- Action: Age-gate or remove based on application type
⚖️ Illicit Content
- What it detects: Instructions for illegal activities
- Examples: Drug manufacturing, weapons instructions, illegal advice
- Action: Immediate removal and potential reporting
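To make these categories concrete, here is a small sketch of the result shape the Moderations API hands back for them: boolean flags in `categories` plus 0–1 confidence values in `category_scores`. The sample scores below are invented for illustration; `flaggedCategories` is a hypothetical helper, not part of the API.

```javascript
// Hypothetical moderation result, shaped like an OpenAI Moderations API
// response (booleans in `categories`, 0–1 confidences in `category_scores`).
// The scores here are made up for illustration.
const sampleResult = {
  flagged: true,
  categories: { harassment: true, hate: false, violence: false },
  category_scores: { harassment: 0.91, hate: 0.03, violence: 0.01 },
};

// Pull out just the category names that were flagged
const flaggedCategories = (result) =>
  Object.entries(result.categories)
    .filter(([, isFlagged]) => isFlagged)
    .map(([category]) => category);

console.log(flaggedCategories(sampleResult)); // → ["harassment"]
```

Later in this module, the backend helper functions use exactly this pattern to turn raw results into policy decisions.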
Multi-Modal Detection Capabilities
Your moderation system will handle both text and images:
📝 Text Moderation
- Chat messages and user inputs
- AI-generated responses and content
- File uploads with text content
- Voice transcription results
🖼️ Image Moderation
- User-uploaded images and photos
- AI-generated images and artwork
- Screenshots and visual content
- Profile pictures and avatars
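Under the hood, both modalities go through a single `input` array that mixes text items and image items. A minimal sketch of building such an array, assuming plain strings and data URLs (the function name `buildModerationInput` is illustrative):

```javascript
// Build a multi-modal moderation input array from optional text and an
// optional image data URL. Item shapes follow the OpenAI Moderations API:
// { type: "text", text } and { type: "image_url", image_url: { url } }.
function buildModerationInput(text, imageDataUrl) {
  const input = [];
  if (text?.trim()) {
    input.push({ type: "text", text: text.trim() });
  }
  if (imageDataUrl) {
    input.push({ type: "image_url", image_url: { url: imageDataUrl } });
  }
  return input;
}

console.log(buildModerationInput("hello", null));
// → [{ type: "text", text: "hello" }]
```

The mixed-content endpoint you'll build in Step 2B constructs its input exactly this way.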
🔧 Step 2: Adding Content Moderation to Your Backend
Let’s add comprehensive content moderation to your existing backend using the same patterns you learned in previous modules. We’ll add new routes to handle safety and moderation.
Building on your foundation: You already have a working Node.js server with OpenAI integration. We’re simply adding content safety capabilities to what you’ve built.
Step 2A: Understanding Moderation State
Before writing code, let’s understand what data our moderation system needs to manage:
```javascript
// 🧠 CONTENT MODERATION STATE CONCEPTS:
// 1. Content Input - Text or images that need to be checked
// 2. Moderation Results - Safety analysis and category scores
// 3. Policy Decisions - Actions to take based on moderation results
// 4. Violation Tracking - Logging and monitoring safety incidents
// 5. User Feedback - Informing users about moderation actions
```
Key moderation concepts:
- Multi-modal Analysis: Checking both text and images for harmful content
- Category Scoring: Confidence levels for different types of harmful content
- Policy Enforcement: Automated actions based on moderation results
- Audit Trails: Comprehensive logging for safety and compliance
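One way to picture category scoring feeding into policy enforcement is a simple threshold ladder: the highest category score decides the action. This mirrors the 0.8 / 0.5 / 0.2 thresholds used in the helper functions later in this step; the values are illustrative, not part of the OpenAI API, so tune them for your own application.

```javascript
// Map the highest category score to a policy action.
// Thresholds are illustrative — adjust them for your own risk tolerance.
function scoreToAction(categoryScores) {
  const maxScore = Math.max(...Object.values(categoryScores));
  if (maxScore >= 0.8) return "block";   // very likely harmful
  if (maxScore >= 0.5) return "warn";    // possibly harmful
  if (maxScore >= 0.2) return "monitor"; // low risk, keep an eye on it
  return "allow";                        // safe
}

console.log(scoreToAction({ harassment: 0.91, hate: 0.03 })); // → "block"
console.log(scoreToAction({ harassment: 0.01, hate: 0.02 })); // → "allow"
```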
Step 2B: Adding the Content Moderation Routes
Add these new endpoints to your existing `index.js` file, right after your web search routes:
```javascript
// 🛡️ CONTENT MODERATION ENDPOINTS: Add these to your existing server

// Moderate text content
app.post("/api/moderation/text", async (req, res) => {
  try {
    // 🛡️ VALIDATION: Check required inputs
    const { text, context = "user_input" } = req.body;

    if (!text?.trim()) {
      return res.status(400).json({
        error: "Text content is required for moderation",
        success: false
      });
    }

    console.log(`🛡️ Moderating text content (${text.length} chars)`);

    // 🔍 MODERATION: Check content with OpenAI
    const moderation = await openai.moderations.create({
      model: "omni-moderation-latest",
      input: [{ type: "text", text: text.trim() }],
    });

    const result = moderation.results[0];

    // 📊 SAFETY ANALYSIS: Analyze moderation results
    const safetyAnalysis = analyzeModerationResult(result);

    // 📝 AUDIT LOG: Record moderation check
    const auditEntry = {
      id: uuidv4(),
      type: "text_moderation",
      context: context,
      content_length: text.length,
      flagged: result.flagged,
      categories: result.categories,
      category_scores: result.category_scores,
      safety_analysis: safetyAnalysis,
      timestamp: new Date().toISOString()
    };

    // 📤 SUCCESS RESPONSE: Send moderation results
    res.json({
      success: true,
      moderation_id: auditEntry.id,
      flagged: result.flagged,
      categories: result.categories,
      category_scores: result.category_scores,
      safety_analysis: safetyAnalysis,
      action_recommended: safetyAnalysis.recommended_action,
      user_message: safetyAnalysis.user_message,
      model: "omni-moderation-latest",
      timestamp: auditEntry.timestamp
    });
  } catch (error) {
    // 🚨 ERROR HANDLING: Handle moderation failures
    console.error("Text moderation error:", error);

    res.status(500).json({
      error: "Failed to moderate text content",
      details: error.message,
      success: false
    });
  }
});

// Moderate image content
app.post("/api/moderation/image", upload.single("image"), async (req, res) => {
  try {
    // 🛡️ VALIDATION: Check if image was uploaded
    const uploadedImage = req.file;
    const { context = "user_upload" } = req.body;

    if (!uploadedImage) {
      return res.status(400).json({
        error: "Image file is required for moderation",
        success: false
      });
    }

    console.log(`🛡️ Moderating image: ${uploadedImage.originalname} (${uploadedImage.size} bytes)`);

    // 🖼️ IMAGE PREPARATION: Convert to base64 for moderation
    const base64Image = uploadedImage.buffer.toString('base64');
    const imageUrl = `data:${uploadedImage.mimetype};base64,${base64Image}`;

    // 🔍 MODERATION: Check image with OpenAI
    const moderation = await openai.moderations.create({
      model: "omni-moderation-latest",
      input: [{ type: "image_url", image_url: { url: imageUrl } }],
    });

    const result = moderation.results[0];

    // 📊 SAFETY ANALYSIS: Analyze moderation results
    const safetyAnalysis = analyzeModerationResult(result);

    // 📝 AUDIT LOG: Record moderation check
    const auditEntry = {
      id: uuidv4(),
      type: "image_moderation",
      context: context,
      file_info: {
        name: uploadedImage.originalname,
        size: uploadedImage.size,
        type: uploadedImage.mimetype
      },
      flagged: result.flagged,
      categories: result.categories,
      category_scores: result.category_scores,
      category_applied_input_types: result.category_applied_input_types,
      safety_analysis: safetyAnalysis,
      timestamp: new Date().toISOString()
    };

    // 📤 SUCCESS RESPONSE: Send moderation results
    res.json({
      success: true,
      moderation_id: auditEntry.id,
      flagged: result.flagged,
      categories: result.categories,
      category_scores: result.category_scores,
      category_applied_input_types: result.category_applied_input_types,
      safety_analysis: safetyAnalysis,
      action_recommended: safetyAnalysis.recommended_action,
      user_message: safetyAnalysis.user_message,
      file_info: auditEntry.file_info,
      model: "omni-moderation-latest",
      timestamp: auditEntry.timestamp
    });
  } catch (error) {
    // 🚨 ERROR HANDLING: Handle moderation failures
    console.error("Image moderation error:", error);

    res.status(500).json({
      error: "Failed to moderate image content",
      details: error.message,
      success: false
    });
  }
});

// Moderate mixed content (text + image)
app.post("/api/moderation/mixed", upload.single("image"), async (req, res) => {
  try {
    // 🛡️ VALIDATION: Check inputs
    const { text, context = "mixed_content" } = req.body;
    const uploadedImage = req.file;

    if (!text?.trim() && !uploadedImage) {
      return res.status(400).json({
        error: "Either text or image content is required for moderation",
        success: false
      });
    }

    console.log(`🛡️ Moderating mixed content: text(${text?.length || 0} chars) + image(${uploadedImage?.size || 0} bytes)`);

    // 📝 BUILD INPUT: Prepare moderation input array
    const moderationInput = [];

    if (text?.trim()) {
      moderationInput.push({ type: "text", text: text.trim() });
    }

    if (uploadedImage) {
      const base64Image = uploadedImage.buffer.toString('base64');
      const imageUrl = `data:${uploadedImage.mimetype};base64,${base64Image}`;
      moderationInput.push({ type: "image_url", image_url: { url: imageUrl } });
    }

    // 🔍 MODERATION: Check mixed content with OpenAI
    const moderation = await openai.moderations.create({
      model: "omni-moderation-latest",
      input: moderationInput,
    });

    const result = moderation.results[0];

    // 📊 SAFETY ANALYSIS: Analyze moderation results
    const safetyAnalysis = analyzeModerationResult(result);

    // 📝 AUDIT LOG: Record moderation check
    const auditEntry = {
      id: uuidv4(),
      type: "mixed_moderation",
      context: context,
      content_info: {
        has_text: !!text?.trim(),
        text_length: text?.length || 0,
        has_image: !!uploadedImage,
        image_info: uploadedImage ? {
          name: uploadedImage.originalname,
          size: uploadedImage.size,
          type: uploadedImage.mimetype
        } : null
      },
      flagged: result.flagged,
      categories: result.categories,
      category_scores: result.category_scores,
      category_applied_input_types: result.category_applied_input_types,
      safety_analysis: safetyAnalysis,
      timestamp: new Date().toISOString()
    };

    // 📤 SUCCESS RESPONSE: Send moderation results
    res.json({
      success: true,
      moderation_id: auditEntry.id,
      flagged: result.flagged,
      categories: result.categories,
      category_scores: result.category_scores,
      category_applied_input_types: result.category_applied_input_types,
      safety_analysis: safetyAnalysis,
      action_recommended: safetyAnalysis.recommended_action,
      user_message: safetyAnalysis.user_message,
      content_info: auditEntry.content_info,
      model: "omni-moderation-latest",
      timestamp: auditEntry.timestamp
    });
  } catch (error) {
    // 🚨 ERROR HANDLING: Handle moderation failures
    console.error("Mixed content moderation error:", error);

    res.status(500).json({
      error: "Failed to moderate mixed content",
      details: error.message,
      success: false
    });
  }
});

// 🔧 HELPER FUNCTIONS: Moderation analysis utilities

// Analyze moderation results and recommend actions
const analyzeModerationResult = (result) => {
  const { flagged, categories, category_scores } = result;

  // Define severity thresholds
  const HIGH_THRESHOLD = 0.8;
  const MEDIUM_THRESHOLD = 0.5;
  const LOW_THRESHOLD = 0.2;

  // Analyze category scores
  const highRiskCategories = [];
  const mediumRiskCategories = [];
  const lowRiskCategories = [];

  Object.entries(category_scores).forEach(([category, score]) => {
    if (score >= HIGH_THRESHOLD) {
      highRiskCategories.push({ category, score });
    } else if (score >= MEDIUM_THRESHOLD) {
      mediumRiskCategories.push({ category, score });
    } else if (score >= LOW_THRESHOLD) {
      lowRiskCategories.push({ category, score });
    }
  });

  // Determine recommended action
  let recommendedAction = "allow";
  let userMessage = "Content approved";
  let severity = "none";

  if (flagged || highRiskCategories.length > 0) {
    recommendedAction = "block";
    userMessage = "This content violates our community guidelines and cannot be displayed.";
    severity = "high";
  } else if (mediumRiskCategories.length > 0) {
    recommendedAction = "warn";
    userMessage = "This content may be inappropriate. Please review our community guidelines.";
    severity = "medium";
  } else if (lowRiskCategories.length > 0) {
    recommendedAction = "monitor";
    userMessage = "Content approved with monitoring";
    severity = "low";
  }

  // Generate detailed analysis
  const flaggedCategories = Object.entries(categories)
    .filter(([_, flagged]) => flagged)
    .map(([category, _]) => category);

  return {
    flagged: flagged,
    severity: severity,
    recommended_action: recommendedAction,
    user_message: userMessage,
    flagged_categories: flaggedCategories,
    risk_analysis: {
      high_risk: highRiskCategories,
      medium_risk: mediumRiskCategories,
      low_risk: lowRiskCategories
    },
    policy_guidance: generatePolicyGuidance(flaggedCategories),
    timestamp: new Date().toISOString()
  };
};

// Generate policy guidance based on flagged categories
const generatePolicyGuidance = (flaggedCategories) => {
  const guidance = {
    "harassment": "Content contains harassing language. Please ensure all interactions are respectful and constructive.",
    "harassment/threatening": "Content contains threatening harassment. This type of content is strictly prohibited.",
    "hate": "Content contains hate speech targeting protected groups. Please review our diversity and inclusion policies.",
    "hate/threatening": "Content contains threatening hate speech. This is a serious policy violation.",
    "self-harm": "Content promotes self-harm. If you need support, please contact mental health resources.",
    "self-harm/intent": "Content indicates intent for self-harm. Please seek immediate professional help.",
    "self-harm/instructions": "Content provides self-harm instructions. This type of content is not permitted.",
    "sexual": "Content contains sexual material. Please ensure content is appropriate for the intended audience.",
    "sexual/minors": "Content contains inappropriate material involving minors. This is strictly prohibited.",
    "violence": "Content depicts violence. Please consider if this content is necessary and appropriate.",
    "violence/graphic": "Content contains graphic violence. This type of content may be disturbing to users.",
    "illicit": "Content provides instructions for illicit activities. This type of content is not permitted.",
    "illicit/violent": "Content combines illicit activities with violence. This is a serious policy violation."
  };

  return flaggedCategories.map(category => ({
    category: category,
    guidance: guidance[category] || "Please review our community guidelines for more information."
  }));
};

// Get moderation statistics
app.get("/api/moderation/stats", (req, res) => {
  try {
    // In a real application, you would pull this from a database
    // This is a mock implementation
    const stats = {
      total_checks: 15420,
      flagged_content: 342,
      flag_rate: 2.2,
      top_categories: [
        { category: "spam", count: 156, percentage: 45.6 },
        { category: "harassment", count: 89, percentage: 26.0 },
        { category: "hate", count: 45, percentage: 13.2 },
        { category: "violence", count: 32, percentage: 9.4 },
        { category: "sexual", count: 20, percentage: 5.8 }
      ],
      daily_trends: [
        { date: "2024-01-15", total: 1200, flagged: 28 },
        { date: "2024-01-16", total: 1350, flagged: 31 },
        { date: "2024-01-17", total: 1180, flagged: 25 },
        { date: "2024-01-18", total: 1420, flagged: 33 },
        { date: "2024-01-19", total: 1590, flagged: 38 }
      ],
      last_updated: new Date().toISOString()
    };

    res.json({
      success: true,
      stats: stats,
      timestamp: new Date().toISOString()
    });
  } catch (error) {
    console.error("Moderation stats error:", error);
    res.status(500).json({
      error: "Failed to get moderation statistics",
      details: error.message,
      success: false
    });
  }
});
```
Function breakdown:
- Text moderation - Check text content for harmful categories
- Image moderation - Analyze images for inappropriate visual content
- Mixed content moderation - Handle text and images together
- Safety analysis - Intelligent interpretation of moderation results
- Policy enforcement - Automated action recommendations
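A typical way to use these endpoints in practice is as a gate in front of your existing chat route: moderate the user's message first, and only call the model if the recommended action permits it. A minimal sketch of that decision, where `moderationResponse` follows the shape the routes above return and `shouldProceed` is a hypothetical helper (the example response values are invented):

```javascript
// Decide whether to continue processing based on a moderation response
// returned by /api/moderation/text. "block" stops the request; "warn" and
// "monitor" let it through (you might also surface user_message to the user).
function shouldProceed(moderationResponse) {
  return moderationResponse.action_recommended !== "block";
}

// Example responses shaped like the route's output (values invented):
const safe = {
  success: true,
  flagged: false,
  action_recommended: "allow",
  user_message: "Content approved"
};
const harmful = {
  success: true,
  flagged: true,
  action_recommended: "block",
  user_message: "This content violates our community guidelines and cannot be displayed."
};

console.log(shouldProceed(safe));    // → true
console.log(shouldProceed(harmful)); // → false
```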
Your backend now supports:
- Text chat (existing functionality)
- Image generation (existing functionality)
- Audio transcription (existing functionality)
- File analysis (existing functionality)
- Text-to-speech (existing functionality)
- Vision analysis (existing functionality)
- Voice interaction (existing functionality)
- Function calling (existing functionality)
- Web search (existing functionality)
- Content moderation (new functionality)
🔧 Step 3: Building the React Content Moderation Component
Now let’s create a React component for content moderation using the same patterns from your existing components.
Step 3A: Creating the Content Moderation Component
Create a new file `src/ContentModeration.jsx`:
```javascript
import { useState, useRef } from "react";
import { Shield, Upload, AlertTriangle, CheckCircle, XCircle, BarChart3, Eye, FileText } from "lucide-react";

function ContentModeration() {
  // 🧠 STATE: Content moderation data management
  const [textContent, setTextContent] = useState("");             // Text to moderate
  const [selectedImage, setSelectedImage] = useState(null);       // Image to moderate
  const [previewUrl, setPreviewUrl] = useState(null);             // Image preview
  const [isChecking, setIsChecking] = useState(false);            // Processing status
  const [moderationResults, setModerationResults] = useState([]); // Results history
  const [stats, setStats] = useState(null);                       // Moderation statistics
  const [error, setError] = useState(null);                       // Error messages
  const [activeTab, setActiveTab] = useState("text");             // Active moderation type

  const fileInputRef = useRef(null);

  // 🔧 FUNCTIONS: Content moderation logic engine

  // Handle image selection
  const handleImageSelect = (event) => {
    const file = event.target.files[0];
    if (file) {
      // Validate file size (25MB limit)
      if (file.size > 25 * 1024 * 1024) {
        setError('Image too large. Maximum size is 25MB.');
        return;
      }

      // Validate file type
      const allowedTypes = ['image/jpeg', 'image/png', 'image/webp', 'image/gif'];
      if (!allowedTypes.includes(file.type)) {
        setError('Unsupported image type. Please upload JPEG, PNG, WebP, or GIF files.');
        return;
      }

      setSelectedImage(file);
      setError(null);

      // Create preview URL
      const url = URL.createObjectURL(file);
      setPreviewUrl(url);
    }
  };

  // Clear selected image
  const clearImage = () => {
    setSelectedImage(null);
    if (previewUrl) {
      URL.revokeObjectURL(previewUrl);
      setPreviewUrl(null);
    }
    if (fileInputRef.current) {
      fileInputRef.current.value = '';
    }
  };

  // Moderate text content
  const moderateText = async () => {
    if (!textContent.trim() || isChecking) return;

    setIsChecking(true);
    setError(null);

    try {
      const response = await fetch("http://localhost:8000/api/moderation/text", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ text: textContent.trim(), context: "manual_check" }),
      });

      const data = await response.json();

      if (!response.ok) {
        throw new Error(data.error || 'Failed to moderate text');
      }

      // Add result to history
      const result = {
        ...data,
        type: "text",
        content: textContent.trim(),
        id: data.moderation_id
      };

      setModerationResults(prev => [result, ...prev]);
    } catch (error) {
      console.error('Text moderation failed:', error);
      setError(error.message || 'Something went wrong while checking text content');
    } finally {
      setIsChecking(false);
    }
  };

  // Moderate image content
  const moderateImage = async () => {
    if (!selectedImage || isChecking) return;

    setIsChecking(true);
    setError(null);

    try {
      const formData = new FormData();
      formData.append('image', selectedImage);
      formData.append('context', 'manual_check');

      const response = await fetch("http://localhost:8000/api/moderation/image", {
        method: "POST",
        body: formData
      });

      const data = await response.json();

      if (!response.ok) {
        throw new Error(data.error || 'Failed to moderate image');
      }

      // Add result to history
      const result = {
        ...data,
        type: "image",
        content: selectedImage.name,
        id: data.moderation_id
      };

      setModerationResults(prev => [result, ...prev]);
      clearImage();
    } catch (error) {
      console.error('Image moderation failed:', error);
      setError(error.message || 'Something went wrong while checking image content');
    } finally {
      setIsChecking(false);
    }
  };

  // Moderate mixed content (text + image)
  const moderateMixed = async () => {
    if ((!textContent.trim() && !selectedImage) || isChecking) return;

    setIsChecking(true);
    setError(null);

    try {
      const formData = new FormData();
      if (textContent.trim()) {
        formData.append('text', textContent.trim());
      }
      if (selectedImage) {
        formData.append('image', selectedImage);
      }
      formData.append('context', 'mixed_check');

      const response = await fetch("http://localhost:8000/api/moderation/mixed", {
        method: "POST",
        body: formData
      });

      const data = await response.json();

      if (!response.ok) {
        throw new Error(data.error || 'Failed to moderate content');
      }

      // Add result to history
      const result = {
        ...data,
        type: "mixed",
        content: `Text: ${textContent.trim().substring(0, 50)}${textContent.length > 50 ? '...' : ''} + Image: ${selectedImage?.name || 'none'}`,
        id: data.moderation_id
      };

      setModerationResults(prev => [result, ...prev]);
      setTextContent("");
      clearImage();
    } catch (error) {
      console.error('Mixed content moderation failed:', error);
      setError(error.message || 'Something went wrong while checking content');
    } finally {
      setIsChecking(false);
    }
  };

  // Load moderation statistics
  const loadStats = async () => {
    try {
      const response = await fetch("http://localhost:8000/api/moderation/stats");
      const data = await response.json();

      if (data.success) {
        setStats(data.stats);
      }
    } catch (error) {
      console.error('Failed to load stats:', error);
    }
  };

  // Get severity badge color
  const getSeverityColor = (severity) => {
    switch (severity) {
      case 'high': return 'bg-red-100 text-red-700 border-red-200';
      case 'medium': return 'bg-yellow-100 text-yellow-700 border-yellow-200';
      case 'low': return 'bg-blue-100 text-blue-700 border-blue-200';
      default: return 'bg-green-100 text-green-700 border-green-200';
    }
  };

  // Get action icon
  const getActionIcon = (action) => {
    switch (action) {
      case 'block': return <XCircle className="w-4 h-4 text-red-600" />;
      case 'warn': return <AlertTriangle className="w-4 h-4 text-yellow-600" />;
      case 'monitor': return <Eye className="w-4 h-4 text-blue-600" />;
      default: return <CheckCircle className="w-4 h-4 text-green-600" />;
    }
  };

  // Format category scores for display
  const formatCategoryScores = (categoryScores) => {
    return Object.entries(categoryScores)
      .sort(([, a], [, b]) => b - a)
      .slice(0, 5)
      .map(([category, score]) => ({
        category: category.replace(/[/_]/g, ' '),
        score: (score * 100).toFixed(1)
      }));
  };

  // 🎨 UI: Interface components
  return (
    <div className="min-h-screen bg-gradient-to-br from-red-50 to-orange-50 flex items-center justify-center p-4">
      <div className="bg-white rounded-2xl shadow-2xl w-full max-w-6xl flex flex-col overflow-hidden">

        {/* Header */}
        <div className="bg-gradient-to-r from-red-600 to-orange-600 text-white p-6">
          <div className="flex items-center justify-between">
            <div className="flex items-center space-x-3">
              <div className="w-10 h-10 bg-white bg-opacity-20 rounded-full flex items-center justify-center">
                <Shield className="w-5 h-5" />
              </div>
              <div>
                <h1 className="text-xl font-bold">🛡️ AI Content Moderation</h1>
                <p className="text-red-100 text-sm">Keep your application safe and compliant!</p>
              </div>
            </div>

            <button
              onClick={loadStats}
              className="px-4 py-2 bg-white bg-opacity-20 rounded-lg hover:bg-opacity-30 transition-colors duration-200 flex items-center space-x-2"
            >
              <BarChart3 className="w-4 h-4" />
              <span>Load Stats</span>
            </button>
          </div>
        </div>

        {/* Moderation Statistics */}
        {stats && (
          <div className="p-4 bg-gray-50 border-b border-gray-200">
            <h3 className="font-semibold text-gray-900 mb-3 text-sm">Moderation Statistics</h3>
            <div className="grid grid-cols-2 md:grid-cols-4 gap-4">
              <div className="bg-white rounded-lg p-3 text-center">
                <p className="text-2xl font-bold text-blue-600">{stats.total_checks.toLocaleString()}</p>
                <p className="text-xs text-gray-600">Total Checks</p>
              </div>
              <div className="bg-white rounded-lg p-3 text-center">
                <p className="text-2xl font-bold text-red-600">{stats.flagged_content.toLocaleString()}</p>
                <p className="text-xs text-gray-600">Flagged Content</p>
              </div>
              <div className="bg-white rounded-lg p-3 text-center">
                <p className="text-2xl font-bold text-yellow-600">{stats.flag_rate}%</p>
                <p className="text-xs text-gray-600">Flag Rate</p>
              </div>
              <div className="bg-white rounded-lg p-3 text-center">
                <p className="text-sm font-medium text-gray-900">{stats.top_categories[0]?.category}</p>
                <p className="text-xs text-gray-600">Top Category</p>
              </div>
            </div>
          </div>
        )}

        {/* Moderation Tabs */}
        <div className="border-b border-gray-200">
          <div className="flex">
            <button
              onClick={() => setActiveTab("text")}
              className={`px-6 py-3 text-sm font-medium border-b-2 transition-colors duration-200 ${
                activeTab === "text"
                  ? "border-red-500 text-red-600 bg-red-50"
                  : "border-transparent text-gray-500 hover:text-gray-700 hover:bg-gray-50"
              }`}
            >
              <FileText className="w-4 h-4 inline mr-2" />
              Text Moderation
            </button>
            <button
              onClick={() => setActiveTab("image")}
              className={`px-6 py-3 text-sm font-medium border-b-2 transition-colors duration-200 ${
                activeTab === "image"
                  ? "border-red-500 text-red-600 bg-red-50"
                  : "border-transparent text-gray-500 hover:text-gray-700 hover:bg-gray-50"
              }`}
            >
              <Upload className="w-4 h-4 inline mr-2" />
              Image Moderation
            </button>
            <button
              onClick={() => setActiveTab("mixed")}
              className={`px-6 py-3 text-sm font-medium border-b-2 transition-colors duration-200 ${
                activeTab === "mixed"
                  ? "border-red-500 text-red-600 bg-red-50"
                  : "border-transparent text-gray-500 hover:text-gray-700 hover:bg-gray-50"
              }`}
            >
              <Shield className="w-4 h-4 inline mr-2" />
              Mixed Content
            </button>
          </div>
        </div>

        {/* Content Input Area */}
        <div className="p-6 border-b border-gray-200">
          {activeTab === "text" && (
            <div>
              <label className="block text-sm font-medium text-gray-700 mb-2">
                Text Content to Moderate
              </label>
              <textarea
                value={textContent}
                onChange={(e) => setTextContent(e.target.value)}
                rows="4"
                placeholder="Enter text content to check for harmful material..."
                disabled={isChecking}
                className="w-full px-4 py-3 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500 focus:border-transparent transition-all duration-200 resize-none disabled:bg-gray-100"
              />
              <div className="mt-4 flex justify-between items-center">
                <p className="text-sm text-gray-500">{textContent.length} characters</p>
                <button
                  onClick={moderateText}
                  disabled={isChecking || !textContent.trim()}
                  className="px-6 py-2 bg-gradient-to-r from-red-600 to-orange-600 hover:from-red-700 hover:to-orange-700 disabled:from-gray-300 disabled:to-gray-300 text-white rounded-lg transition-all duration-200 flex items-center space-x-2 shadow-lg disabled:shadow-none"
                >
                  <Shield className="w-4 h-4" />
                  <span>{isChecking ? "Checking..." : "Check Text"}</span>
                </button>
              </div>
            </div>
          )}

          {activeTab === "image" && (
            <div>
              <label className="block text-sm font-medium text-gray-700 mb-2">
                Image Content to Moderate
              </label>
              {!selectedImage ? (
                <div
                  onClick={() => fileInputRef.current?.click()}
                  className="border-2 border-dashed border-gray-300 rounded-lg p-8 text-center cursor-pointer hover:border-red-400 hover:bg-red-50 transition-colors duration-200"
                >
                  <Upload className="w-12 h-12 text-gray-400 mx-auto mb-4" />
                  <h4 className="text-lg font-semibold text-gray-700 mb-2">Upload Image</h4>
                  <p className="text-gray-600">Support for JPEG, PNG, WebP, and GIF files up to 25MB</p>
                </div>
              ) : (
                <div className="bg-gray-50 rounded-lg p-4 border border-gray-200">
                  <div className="flex items-center justify-between mb-4">
                    <div className="flex items-center space-x-3">
                      <img
                        src={previewUrl}
                        alt={selectedImage.name}
                        className="w-16 h-16 object-cover rounded border border-gray-200"
                      />
                      <div>
                        <h4 className="font-medium text-gray-900">{selectedImage.name}</h4>
                        <p className="text-sm text-gray-600">{(selectedImage.size / 1024 / 1024).toFixed(2)} MB</p>
                      </div>
                    </div>
                    <button
                      onClick={clearImage}
                      className="p-2 text-gray-400 hover:text-red-600 transition-colors duration-200"
                    >
                      ×
                    </button>
                  </div>
                  <button
                    onClick={moderateImage}
                    disabled={isChecking}
                    className="w-full bg-gradient-to-r from-red-600 to-orange-600 hover:from-red-700 hover:to-orange-700 disabled:from-gray-300 disabled:to-gray-300 text-white px-6 py-2 rounded-lg transition-all duration-200 flex items-center justify-center space-x-2 shadow-lg disabled:shadow-none"
                  >
                    <Shield className="w-4 h-4" />
                    <span>{isChecking ? "Checking..." : "Check Image"}</span>
                  </button>
                </div>
              )}
              <input
                ref={fileInputRef}
                type="file"
                accept="image/jpeg,image/png,image/webp,image/gif"
                onChange={handleImageSelect}
                className="hidden"
              />
            </div>
          )}

          {activeTab === "mixed" && (
            <div className="space-y-4">
              <div>
                <label className="block text-sm font-medium text-gray-700 mb-2">
                  Text Content (Optional)
                </label>
                <textarea
                  value={textContent}
                  onChange={(e) => setTextContent(e.target.value)}
                  rows="3"
                  placeholder="Enter text content to check along with image..."
                  disabled={isChecking}
                  className="w-full px-4 py-3 border border-gray-300 rounded-lg focus:outline-none focus:ring-2 focus:ring-red-500 focus:border-transparent transition-all duration-200 resize-none disabled:bg-gray-100"
                />
              </div>

              <div>
                <label className="block text-sm font-medium text-gray-700 mb-2">
                  Image Content (Optional)
                </label>
                {!selectedImage ? (
                  <div
                    onClick={() => fileInputRef.current?.click()}
                    className="border-2 border-dashed border-gray-300 rounded-lg p-6 text-center cursor-pointer hover:border-red-400 hover:bg-red-50 transition-colors duration-200"
                  >
                    <Upload className="w-8 h-8 text-gray-400 mx-auto mb-2" />
                    <p className="text-gray-600">Click to upload image</p>
                  </div>
                ) : (
                  <div className="flex items-center space-x-3 p-3 bg-gray-50 rounded border border-gray-200">
                    <img
                      src={previewUrl}
                      alt={selectedImage.name}
                      className="w-12 h-12 object-cover rounded border border-gray-200"
                    />
                    <div className="flex-1">
                      <p className="font-medium text-gray-900">{selectedImage.name}</p>
                      <p className="text-sm text-gray-600">{(selectedImage.size / 1024 / 1024).toFixed(2)} MB</p>
                    </div>
                    <button
                      onClick={clearImage}
                      className="p-1 text-gray-400 hover:text-red-600 transition-colors duration-200"
                    >
                      ×
                    </button>
                  </div>
                )}
                <input
                  ref={fileInputRef}
                  type="file"
                  accept="image/jpeg,image/png,image/webp,image/gif"
                  onChange={handleImageSelect}
                  className="hidden"
                />
              </div>

              <button
                onClick={moderateMixed}
                disabled={isChecking || (!textContent.trim() && !selectedImage)}
                className="w-full bg-gradient-to-r from-red-600 to-orange-600 hover:from-red-700 hover:to-orange-700 disabled:from-gray-300 disabled:to-gray-300 text-white px-6 py-3 rounded-lg transition-all duration-200 flex items-center justify-center space-x-2 shadow-lg disabled:shadow-none"
              >
                <Shield className="w-4 h-4" />
                <span>{isChecking ? "Checking..." : "Check Mixed Content"}</span>
              </button>
            </div>
          )}
        </div>

        {/* Results Area */}
        <div className="flex-1 p-6">
          {/* Error Display */}
          {error && (
            <div className="bg-red-50 border border-red-200 rounded-lg p-4 mb-4">
              <p className="text-red-700">
                <strong>Error:</strong> {error}
              </p>
            </div>
          )}

          {/* Moderation Results */}
          {moderationResults.length === 0 ? (
            <div className="text-center py-12">
              <div className="w-16 h-16 bg-red-100 rounded-2xl flex items-center justify-center mx-auto mb-4">
                <Shield className="w-8 h-8 text-red-600" />
              </div>
              <h3 className="text-lg font-semibold text-gray-700 mb-2">
                Ready to Check Content!
              </h3>
              <p className="text-gray-600 max-w-md mx-auto">
                Upload text or images to check for harmful content and ensure your application stays safe and compliant.
              </p>
            </div>
          ) : (
            <div className="space-y-4">
              <h4 className="font-semibold text-gray-900 flex items-center">
                <Shield className="w-5 h-5 mr-2 text-red-600" />
                Moderation Results ({moderationResults.length})
              </h4>

              <div className="space-y-3 max-h-96 overflow-y-auto">
                {moderationResults.map((result) => (
                  <div key={result.id} className="bg-gray-50 rounded-lg p-4 border border-gray-200">
                    <div className="flex items-start justify-between mb-3">
                      <div className="flex items-center space-x-3">
                        {getActionIcon(result.action_recommended)}
                        <div>
                          <h5 className="font-medium text-gray-900 capitalize">{result.type} Content</h5>
                          <p className="text-sm text-gray-600">{result.content}</p>
                        </div>
                      </div>

                      <div className="flex items-center space-x-2">
                        <span className={`px-2 py-1 rounded text-xs border ${getSeverityColor(result.safety_analysis.severity)}`}>
                          {result.safety_analysis.severity || 'safe'}
                        </span>
                        <span className={`px-2 py-1 rounded text-xs ${
                          result.flagged ? 'bg-red-100 text-red-700' : 'bg-green-100 text-green-700'
                        }`}>
                          {result.flagged ? 'Flagged' : 'Safe'}
                        </span>
                      </div>
                    </div>

                    {result.flagged && (
                      <div className="bg-white rounded p-3 mb-3">
                        <p className="text-sm text-red-700 mb-2">
                          <strong>Policy Violation:</strong> {result.user_message}
                        </p>
                        <p className="text-xs text-gray-600">
                          <strong>Flagged Categories:</strong> {result.safety_analysis.flagged_categories.join(', ')}
                        </p>
                      </div>
                    )}

                    {result.category_scores && (
                      <div className="bg-white rounded p-3">
                        <h6 className="text-sm font-medium text-gray-700 mb-2">Top Category Scores:</h6>
                        <div className="grid grid-cols-2 md:grid-cols-3 gap-2">
                          {formatCategoryScores(result.category_scores).map((item, idx) => (
                            <div key={idx} className="text-xs">
                              <span className="text-gray-600 capitalize">{item.category}:</span>
                              <span className="font-medium ml-1">{item.score}%</span>
                            </div>
                          ))}
                        </div>
                      </div>
                    )}
```
<p className="text-xs text-gray-500 mt-2"> {new Date(result.timestamp).toLocaleString()} </p> </div> ))} </div> </div> )} </div> </div> </div> );}
export default ContentModeration;
Step 3B: Adding Content Moderation to Navigation
Update your src/App.jsx to include the new content moderation component:
import { useState } from "react";
import StreamingChat from "./StreamingChat";
import ImageGenerator from "./ImageGenerator";
import AudioTranscription from "./AudioTranscription";
import FileAnalysis from "./FileAnalysis";
import TextToSpeech from "./TextToSpeech";
import VisionAnalysis from "./VisionAnalysis";
import VoiceInteraction from "./VoiceInteraction";
import FunctionCalling from "./FunctionCalling";
import WebSearch from "./WebSearch";
import ContentModeration from "./ContentModeration";
import { MessageSquare, Image, Mic, Folder, Volume2, Eye, Phone, Zap, Globe, Shield } from "lucide-react";
function App() { // 🧠 STATE: Navigation management const [currentView, setCurrentView] = useState("chat"); // Add 'moderation' option
// 🎨 UI: Main app with navigation (updated for smaller screens) return ( <div className="min-h-screen bg-gray-100"> {/* Navigation Header */} <nav className="bg-white shadow-sm border-b border-gray-200"> <div className="max-w-7xl mx-auto px-4"> <div className="flex items-center justify-between h-16"> {/* Logo */} <div className="flex items-center space-x-3"> <div className="w-8 h-8 bg-gradient-to-r from-blue-500 to-purple-600 rounded-lg flex items-center justify-center"> <span className="text-white font-bold text-sm">AI</span> </div> <h1 className="text-xl font-bold text-gray-900">OpenAI Mastery</h1> </div>
{/* Navigation Buttons */} <div className="flex space-x-1 overflow-x-auto"> <button onClick={() => setCurrentView("chat")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "chat" ? "bg-blue-100 text-blue-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <MessageSquare className="w-4 h-4" /> <span className="hidden sm:inline">Chat</span> </button>
<button onClick={() => setCurrentView("images")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "images" ? "bg-purple-100 text-purple-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Image className="w-4 h-4" /> <span className="hidden sm:inline">Images</span> </button>
<button onClick={() => setCurrentView("audio")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "audio" ? "bg-blue-100 text-blue-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Mic className="w-4 h-4" /> <span className="hidden sm:inline">Audio</span> </button>
<button onClick={() => setCurrentView("files")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "files" ? "bg-green-100 text-green-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Folder className="w-4 h-4" /> <span className="hidden sm:inline">Files</span> </button>
<button onClick={() => setCurrentView("speech")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "speech" ? "bg-orange-100 text-orange-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Volume2 className="w-4 h-4" /> <span className="hidden sm:inline">Speech</span> </button>
<button onClick={() => setCurrentView("vision")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "vision" ? "bg-indigo-100 text-indigo-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Eye className="w-4 h-4" /> <span className="hidden sm:inline">Vision</span> </button>
<button onClick={() => setCurrentView("voice")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "voice" ? "bg-blue-100 text-blue-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Phone className="w-4 h-4" /> <span className="hidden sm:inline">Voice</span> </button>
<button onClick={() => setCurrentView("functions")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "functions" ? "bg-cyan-100 text-cyan-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Zap className="w-4 h-4" /> <span className="hidden sm:inline">Functions</span> </button>
<button onClick={() => setCurrentView("websearch")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "websearch" ? "bg-green-100 text-green-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Globe className="w-4 h-4" /> <span className="hidden sm:inline">Web</span> </button>
<button onClick={() => setCurrentView("moderation")} className={`px-2 py-2 rounded-lg flex items-center space-x-1 transition-all duration-200 whitespace-nowrap text-sm ${ currentView === "moderation" ? "bg-red-100 text-red-700 shadow-sm" : "text-gray-600 hover:text-gray-900 hover:bg-gray-100" }`} > <Shield className="w-4 h-4" /> <span className="hidden sm:inline">Safety</span> </button> </div> </div> </div> </nav>
{/* Main Content */} <main className="h-[calc(100vh-4rem)]"> {currentView === "chat" && <StreamingChat />} {currentView === "images" && <ImageGenerator />} {currentView === "audio" && <AudioTranscription />} {currentView === "files" && <FileAnalysis />} {currentView === "speech" && <TextToSpeech />} {currentView === "vision" && <VisionAnalysis />} {currentView === "voice" && <VoiceInteraction />} {currentView === "functions" && <FunctionCalling />} {currentView === "websearch" && <WebSearch />} {currentView === "moderation" && <ContentModeration />} </main> </div> );}
export default App;
🧪 Testing Your Content Moderation
Let’s test your content moderation feature step by step to make sure everything works correctly.
Step 1: Backend Route Test
First, verify your backend routes work:
Test text moderation:
curl -X POST http://localhost:8000/api/moderation/text \
  -H "Content-Type: application/json" \
  -d '{"text": "This is a test message for content moderation"}'
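To make sense of what this route returns, it helps to see how a backend might shape a single OpenAI moderation result into the `flagged`, `severity`, and `flagged_categories` fields the results panel displays. The sketch below is illustrative, not the tutorial's exact implementation: the helper name `buildSafetyAnalysis` and the severity thresholds are assumptions.

```javascript
// Illustrative sketch: derive the safety fields the UI expects from one
// OpenAI moderation result ({ flagged, categories, category_scores }).
// The 0.8 / 0.4 severity cutoffs are assumed values, not official ones.
function buildSafetyAnalysis(result) {
  // Collect the category names the API marked as violations.
  const flaggedCategories = Object.entries(result.categories || {})
    .filter(([, isFlagged]) => isFlagged)
    .map(([category]) => category);

  // Use the highest category score to pick a coarse severity bucket.
  const maxScore = Math.max(0, ...Object.values(result.category_scores || {}));

  let severity = "safe";
  if (result.flagged) {
    severity = maxScore >= 0.8 ? "high" : maxScore >= 0.4 ? "medium" : "low";
  }

  return {
    flagged: result.flagged,
    severity,
    flagged_categories: flaggedCategories,
  };
}
```

A route handler would call something like this on each result from the moderation API response before sending it to the frontend.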
Test moderation stats:
curl http://localhost:8000/api/moderation/stats
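The stats endpoint presumably aggregates past moderation checks into summary metrics. As a rough sketch of what that aggregation could look like (the function name `summarizeModerationChecks` and the response field names are assumptions, not the tutorial's actual backend):

```javascript
// Illustrative sketch: roll up stored moderation checks into the kind of
// summary a stats endpoint might return. Field names are assumptions.
function summarizeModerationChecks(checks) {
  const total = checks.length;
  const flagged = checks.filter((c) => c.flagged).length;

  // Count how often each category was flagged across all checks.
  const byCategory = {};
  for (const check of checks) {
    for (const category of check.flagged_categories || []) {
      byCategory[category] = (byCategory[category] || 0) + 1;
    }
  }

  return {
    total_checks: total,
    flagged_count: flagged,
    flag_rate: total ? +(flagged / total).toFixed(3) : 0,
    by_category: byCategory,
  };
}
```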
Step 2: Full Application Test
Start both servers:
Backend (in your backend folder):
npm run dev
Frontend (in your frontend folder):
npm run dev
Test the complete flow:
- Navigate to Safety → Click the “Safety” tab in navigation
- Test text moderation → Enter various text samples including potentially harmful content
- Test image moderation → Upload various images to check visual content
- Test mixed content → Combine text and images for comprehensive checking
- Review results → See detailed moderation analysis and policy recommendations
- Check statistics → Load moderation stats to see overall safety metrics
- Test different content types → Try various categories of content to see detection accuracy
Step 3: Safety Testing Scenarios
Test different types of content:
- ✅ Safe content: Normal, appropriate text and images
- ⚠️ Borderline content: Content that might trigger warnings
- ❌ Harmful content: Content that should be blocked
- 🔍 Edge cases: Very short text, large images, mixed content
Expected behavior:
- Accurate detection of harmful content categories
- Clear safety recommendations and user messages
- Detailed category scores and confidence levels
- Comprehensive audit trail for all moderation checks
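The "clear safety recommendations" above correspond to the `action_recommended` field the results panel renders with `getActionIcon`. A minimal sketch of how a severity bucket might map to such a recommendation — the specific mapping here is an illustrative assumption, not the tutorial's exact policy:

```javascript
// Illustrative sketch: map a severity bucket to a recommended action,
// mirroring the `action_recommended` field the results panel displays.
// The severity-to-action mapping is an assumed policy.
function recommendAction(severity) {
  switch (severity) {
    case "high":
      return "block"; // reject the content outright
    case "medium":
      return "review"; // queue for human review
    case "low":
      return "warn"; // allow, but warn the user
    default:
      return "allow"; // safe content passes through
  }
}
```

Keeping this mapping in one small function makes it easy to adjust your moderation policy without touching the UI.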
✅ What You Built
Congratulations! You’ve extended your existing application with comprehensive AI content moderation:
- ✅ Extended your backend with OpenAI moderation API integration
- ✅ Added React safety component following the same patterns as your other features
- ✅ Implemented multi-modal moderation for both text and image content
- ✅ Created safety analysis system with intelligent policy recommendations
- ✅ Added moderation statistics with comprehensive safety metrics
- ✅ Maintained consistent design with your existing application
Your application now has:
- Text chat with streaming responses
- Image generation with DALL-E 3 and GPT-Image-1
- Audio transcription with Whisper voice recognition
- File analysis with intelligent document processing
- Text-to-speech with natural voice synthesis
- Vision analysis with GPT-4o visual intelligence
- Voice interaction with GPT-4o Audio natural conversations
- Function calling with real-world tool integration
- Web search with real-time internet access
- Content moderation with comprehensive safety checking
- Unified navigation between all features
- Professional UI with consistent TailwindCSS styling
Next up: You’ll learn about Production Best Practices, where you’ll implement security, monitoring, error handling, and optimization to make your application truly production-ready.
Your OpenAI mastery application is now safe and compliant! 🛡️