In the last tutorial, we learned how to implement Chat Memory for a single user. Now, we will extend that to support multiple concurrent users.
Stateful Conversations with Memory
AI Services can maintain conversation history through chat memory, enabling stateful interactions where the LLM can reference previous messages. This is essential for:
- Multi-turn conversations that require context
- Personalized responses based on user history
- Task-oriented dialogs that span multiple interactions
- Maintaining user preferences across sessions
Memory Management Strategies
AI Services support different memory management approaches:
- Global Memory: Single memory for all users (simple but not scalable)
- Per-User Memory: Separate memory instance for each user ID
- Persistent Memory: Memory that survives application restarts
- Windowed Memory: Memory that keeps only the last N messages
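To make the windowed, per-user idea concrete, here is a minimal plain-Java sketch (deliberately not using langchain4j types; the class name WindowedMemoryStore is illustrative) of a store that keeps only the last N messages for each user id. langchain4j's MessageWindowChatMemory, used later in this tutorial, applies the same eviction idea to real chat messages.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified illustration of per-user windowed memory.
public class WindowedMemoryStore {
    private final int maxMessages;
    private final Map<String, Deque<String>> memories = new ConcurrentHashMap<>();

    public WindowedMemoryStore(int maxMessages) {
        this.maxMessages = maxMessages;
    }

    public void add(String userId, String message) {
        Deque<String> window = memories.computeIfAbsent(userId, id -> new ArrayDeque<>());
        synchronized (window) {
            window.addLast(message);
            while (window.size() > maxMessages) {
                window.removeFirst(); // evict the oldest message once the window is full
            }
        }
    }

    public List<String> messages(String userId) {
        Deque<String> window = memories.getOrDefault(userId, new ArrayDeque<>());
        synchronized (window) {
            return List.copyOf(window);
        }
    }

    public static void main(String[] args) {
        WindowedMemoryStore store = new WindowedMemoryStore(3);
        store.add("user123", "m1");
        store.add("user123", "m2");
        store.add("user123", "m3");
        store.add("user123", "m4");    // window is full, so m1 is evicted
        store.add("user456", "hello"); // a separate, isolated window
        System.out.println(store.messages("user123")); // [m2, m3, m4]
        System.out.println(store.messages("user456")); // [hello]
    }
}
```

Each user id maps to its own window, so one user's messages never leak into another's context, which is exactly the isolation the example below demonstrates.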
@MemoryId Annotation
Definition of MemoryId (version 1.10.0):

package dev.langchain4j.service;

@Retention(RUNTIME)
@Target(PARAMETER)
public @interface MemoryId {
}
The @MemoryId annotation identifies which parameter provides the memory identifier (typically user ID or session ID). This allows the AI Service to retrieve or create the appropriate memory instance for each user.
String chat(@MemoryId String userId,
            @UserMessage String message);
ChatMemoryProvider
The ChatMemoryProvider is a factory that creates memory instances on demand. It receives the memory ID and returns a configured ChatMemory instance for that user.
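The factory pattern behind ChatMemoryProvider can be sketched without langchain4j. In this sketch, MemoryProvider and SimpleMemory are hypothetical stand-ins (not the library's types): a functional interface maps a memory id to a memory instance, and a cache keyed by id ensures repeated lookups for the same user return the same memory, mirroring how the AI service reuses one memory per @MemoryId value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for the library's ChatMemory type.
class SimpleMemory {
    final List<String> messages = new ArrayList<>();
}

// Illustrative stand-in for ChatMemoryProvider: a factory from id to memory.
@FunctionalInterface
interface MemoryProvider {
    SimpleMemory memoryFor(Object memoryId);
}

public class ProviderDemo {
    public static void main(String[] args) {
        // The provider only knows how to create a configured memory for an id.
        MemoryProvider provider = id -> new SimpleMemory();

        // The caller caches one instance per id, so the factory runs once per user.
        Map<Object, SimpleMemory> cache = new ConcurrentHashMap<>();
        SimpleMemory alice1 = cache.computeIfAbsent("user123", provider::memoryFor);
        SimpleMemory alice2 = cache.computeIfAbsent("user123", provider::memoryFor);
        SimpleMemory bob = cache.computeIfAbsent("user456", provider::memoryFor);

        alice1.messages.add("Hello, my name is Alice.");

        System.out.println(alice1 == alice2);  // true: same user, same memory instance
        System.out.println(alice1 == bob);     // false: users are isolated
        System.out.println(alice2.messages);   // [Hello, my name is Alice.]
    }
}
```

This is why the lambda in the example below can simply return a new MessageWindowChatMemory: the service invokes the provider only when it sees a memory id for the first time.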
Example
The following example uses Ollama with the phi3:mini-128k model, which is well suited for demos and learning but not for production-grade applications, as it has limited reasoning capability and accuracy on complex tasks.
package com.logicbig.example;

import dev.langchain4j.memory.chat.MessageWindowChatMemory;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.MemoryId;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;

import java.time.Duration;
import java.time.temporal.ChronoUnit;

public class ChatMemoryExample {

    @SystemMessage("Always reply in English.")
    interface Assistant {
        String chat(@MemoryId String userId,
                    @UserMessage String message);
    }

    public static void main(String[] args) {
        OllamaChatModel model =
                OllamaChatModel.builder()
                               .baseUrl("http://localhost:11434")
                               .modelName("phi3:mini-128k")
                               .temperature(0.7)
                               .numCtx(4096)
                               .timeout(Duration.of(5, ChronoUnit.MINUTES))
                               .build();

        // Create AI Service with per-user memory (max 5 messages per user)
        Assistant assistant = AiServices.builder(Assistant.class)
                                        .chatModel(model)
                                        .chatMemoryProvider(userId -> MessageWindowChatMemory.withMaxMessages(5))
                                        .build();

        // User 1 conversation
        System.out.println("=== User 1 Conversation ===");
        String response1 = assistant.chat("user123", "Hello, my name is Alice.");
        System.out.println("Response 1: " + response1);
        String response2 = assistant.chat("user123", "What's my name?");
        System.out.println("Response 2: " + response2);

        // User 2 conversation (separate memory)
        System.out.println("\n=== User 2 Conversation ===");
        String response3 = assistant.chat("user456", "Hi, I'm Bob");
        System.out.println("Response 3: " + response3);
        String response4 = assistant.chat("user456", "Who am I?");
        System.out.println("Response 4: " + response4);

        // User 1 continues their conversation
        System.out.println("\n=== User 1 Continues ===");
        String response5 = assistant.chat("user123", "What programming languages do you know?");
        System.out.println("Response 5: " + response5);

        // Demonstrate memory doesn't mix between users
        System.out.println("\n=== Testing Memory Isolation ===");
        String response6 = assistant.chat("user123", "What's my name again?");
        System.out.println("User 123 (Alice): " + response6);
        String response7 = assistant.chat("user456", "What's my name?");
        System.out.println("User 456 (Bob): " + response7);
    }
}
Output

=== User 1 Conversation ===
Response 1: Greetings, Alice! How can I assist you today?
Response 2: Your name is Alice. Is there anything specific you would like to know or discuss about yourself?

=== User 2 Conversation ===
Response 3: Hello! How can I assist you today?
Response 4: As an AI developed by Microsoft, I don't have personal information about individuals. Whenever someone interacts with me in this role-playing scenario and provides a name or identity claim, that is the persona they are assuming for our conversation at that moment. If you identify yourself as Bob when initiating contact, then within the context of this interaction, I am responding to "Bob."
However, outside this scripted dialogue without any specific instructions from users like me implying otherwise, in reality, if a person named Bob is speaking with an AI assistant for help or information, they would simply be engaging as themselves.

=== User 1 Continues ===
Response 5: As an AI, I am equipped with knowledge on various topics including multiple programming languages such as Python, Java, C++, JavaScript, Ruby, and many others. Is there a specific language-related topic you're interested in exploring or learning about today?

=== Testing Memory Isolation ===
User 123 (Alice): Your name is Alice. If you have any questions regarding yourself that are not directly related to programming languages but instead pertain to personal information, I would be more than happy to assist with those as well! How else may I serve you today, Alice?
User 456 (Bob): In our role-playing scenario where you introduced yourself as "Bob," within that specific context and conversation thread I am responding to the persona of Bob. If this were a real interaction without prior instructions, your actual identity would be whatever it is in reality when not engaged with me or any other AI system like myself.
Conclusion
The output demonstrates that each user maintains their own conversation history: User 1's memory contains only their interactions, and User 2's memory contains only theirs. The AI Service proxy automatically handles memory retrieval, message storage, and context window management, letting you build stateful applications without dealing with memory implementation details.
Example Project

Dependencies and Technologies Used:
- langchain4j 1.10.0 (Build LLM-powered applications in Java: chatbots, agents, RAG, and much more)
- langchain4j-ollama 1.10.0 (LangChain4j :: Integration :: Ollama)
- slf4j-simple 2.0.9 (SLF4J Simple Provider)
- JDK 17
- Maven 3.9.11