Auto-moderation lets AI Services automatically filter inappropriate or harmful content. When a moderation model is configured and the service method is annotated with @Moderate, the AI Service checks the user's messages and throws a ModerationException when a violation is detected.
How Auto-Moderation Works
- The user sends a message to the AI Service
- The message is forwarded to the LLM while, in parallel, the moderation model analyzes it for policy violations
- If the moderation model flags the message, a ModerationException is thrown
- If the message is safe, the LLM response is returned as usual
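The steps above can be sketched in plain Java. All names here are hypothetical stand-ins, not langchain4j types, and for simplicity the sketch runs the moderation check before the LLM call rather than in parallel:

```java
import java.util.List;

public class ModerationFlowSketch {

    // Stand-in for a moderation model: flags text containing banned keywords.
    static boolean isFlagged(String text) {
        List<String> banned = List.of("hate", "violence", "explicit");
        String lower = text.toLowerCase();
        return banned.stream().anyMatch(lower::contains);
    }

    // Stand-in for the AI Service: moderate first, call the LLM only if safe.
    static String chat(String userMessage) {
        if (isFlagged(userMessage)) {
            throw new IllegalStateException(userMessage); // stand-in for ModerationException
        }
        return "LLM response to: " + userMessage; // stand-in for the real LLM call
    }

    public static void main(String[] args) {
        System.out.println(chat("What is the capital of France?"));
        try {
            chat("This is hate speech");
        } catch (IllegalStateException e) {
            System.out.println("Blocked: " + e.getMessage());
        }
    }
}
```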
Configuration
Register a moderation model on the AiServices builder. Note that moderation only runs for service methods annotated with @Moderate:

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(model)
        .moderationModel(moderationModel)
        .build();
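The example further below uses a toy keyword-based model for demonstration; in production you would typically plug in a dedicated moderation model. As a hedged sketch, assuming the langchain4j-open-ai module is on the classpath, its OpenAiModerationModel can be wired in the same way (verify the builder API against your langchain4j version):

```java
// Assumption: langchain4j-open-ai module and its OpenAiModerationModel builder.
ModerationModel moderationModel = OpenAiModerationModel.builder()
        .apiKey(System.getenv("OPENAI_API_KEY"))
        .build();

Assistant assistant = AiServices.builder(Assistant.class)
        .chatModel(model)
        .moderationModel(moderationModel)
        .build();
```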
ModerationException
Thrown when content is flagged. It carries the original Moderation object; call moderation().flaggedText() to retrieve the text that was flagged.
Example
package com.logicbig.example;

import dev.langchain4j.data.message.ChatMessage;
import dev.langchain4j.data.message.TextContent;
import dev.langchain4j.model.moderation.Moderation;
import dev.langchain4j.model.moderation.ModerationModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.model.output.Response;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.Moderate;
import dev.langchain4j.service.ModerationException;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;

import java.util.List;

public class AutoModerationExample {

    interface Assistant {
        @SystemMessage("You are a helpful assistant.")
        @Moderate // required: moderation only runs for @Moderate-annotated methods
        String chat(@UserMessage String message);
    }
    static class SimpleModerationModel implements ModerationModel {

        @Override
        public Response<Moderation> moderate(String text) {
            String lowerText = text.toLowerCase();
            if (lowerText.contains("hate") || lowerText.contains("violence") ||
                    lowerText.contains("explicit") || lowerText.contains("stupid")) {
                return Response.from(Moderation.flagged(text));
            }
            return Response.from(Moderation.notFlagged());
        }

        @Override
        public Response<Moderation> moderate(List<ChatMessage> messages) {
            StringBuilder combined = new StringBuilder();
            for (ChatMessage msg : messages) {
                // A ChatMessage is never a TextContent; extract the text from
                // user messages instead (fully qualified to avoid clashing with
                // the dev.langchain4j.service.UserMessage annotation import)
                if (msg instanceof dev.langchain4j.data.message.UserMessage userMessage) {
                    userMessage.contents().stream()
                            .filter(c -> c instanceof TextContent)
                            .map(c -> ((TextContent) c).text())
                            .forEach(t -> combined.append(t).append(" "));
                }
            }
            return moderate(combined.toString().trim());
        }
    }
    public static void main(String[] args) {
        var chatModel = OllamaChatModel.builder()
                .baseUrl("http://localhost:11434")
                .modelName("phi3:mini-128k")
                .build();

        var moderationModel = new SimpleModerationModel();

        // With moderation
        Assistant moderatedAssistant = AiServices.builder(Assistant.class)
                .chatModel(chatModel)
                .moderationModel(moderationModel)
                .build();

        // Without moderation
        Assistant unmoderatedAssistant = AiServices.create(Assistant.class, chatModel);

        System.out.println("=== With Auto-Moderation ===");
        try {
            String response = moderatedAssistant.chat("This is hate speech");
            System.out.println("Response: " + response);
        } catch (ModerationException e) {
            System.out.println("Blocked: " + e.moderation().flaggedText());
        }

        System.out.println("\n=== Without Moderation ===");
        String response = unmoderatedAssistant.chat("This is hate speech");
        System.out.println("Response: " + response);
    }
}
Output
=== With Auto-Moderation ===
Blocked: This is hate speech

=== Without Moderation ===
Response: As an AI developed to promote positive communication and respect, I cannot assist with generating or spreading hate speech in any form. I'm sorry, but I can't fulfill this request.
Conclusion
The output shows how auto-moderation intercepts inappropriate content: safe messages are processed normally, while flagged content triggers a ModerationException instead of an LLM response. This built-in safety layer helps your AI applications enforce content standards.
Example Project
Dependencies and Technologies Used:
- langchain4j 1.10.0 (Build LLM-powered applications in Java: chatbots, agents, RAG, and much more)
- langchain4j-ollama 1.10.0 (LangChain4j :: Integration :: Ollama)
- slf4j-simple 2.0.9 (SLF4J Simple Provider)
- JDK 17
- Maven 3.9.11