AI LangChain4j - Chaining Multiple AI Services

[Last Updated: Jan 29, 2026]

Modular AI Application Design

Chaining multiple AI Services allows you to build complex applications by combining specialized components. Each service handles a specific task, making the system more maintainable, testable, and efficient. This approach is beneficial for:

Decomposing complex workflows into simpler steps
Using different models for different tasks (cost optimization)
Implementing conditional logic with AI decisions
Building reusable AI components

Chaining Strategies

Sequential Chaining: Output of one service becomes input to another
Conditional Chaining: AI Services returning booleans/enums control flow
Parallel Chaining: Multiple services process the same input independently
Mixed Chaining: Combine AI Services with regular Java logic
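
The sequential and parallel strategies above can be sketched in plain Java. This is a minimal illustration, not LangChain4j API: Summarizer, Translator and SentimentAnalyzer are hypothetical service interfaces, and simple lambdas stand in for the instances that AiServices would normally create, so the sketch runs without a model.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ChainingStrategiesSketch {

    // Hypothetical service interfaces. In a real application each one would be
    // backed by a model via AiServices; here they are implemented by lambdas.
    interface Summarizer { String summarize(String text); }
    interface Translator { String translate(String text); }
    interface SentimentAnalyzer { String analyze(String text); }

    // Sequential chaining: the summarizer's output is the translator's input.
    static String summarizeThenTranslate(Summarizer summarizer,
                                         Translator translator,
                                         String text) {
        String summary = summarizer.summarize(text);
        return translator.translate(summary);
    }

    // Parallel chaining: both services receive the same input independently.
    static List<String> summarizeAndAnalyze(Summarizer summarizer,
                                            SentimentAnalyzer analyzer,
                                            String text) {
        CompletableFuture<String> summary =
                CompletableFuture.supplyAsync(() -> summarizer.summarize(text));
        CompletableFuture<String> sentiment =
                CompletableFuture.supplyAsync(() -> analyzer.analyze(text));
        // join() blocks until each result is available
        return List.of(summary.join(), sentiment.join());
    }

    public static void main(String[] args) {
        // Stub implementations standing in for real AI Services
        Summarizer summarizer = text -> "summary(" + text + ")";
        Translator translator = text -> "translated(" + text + ")";
        SentimentAnalyzer analyzer = text -> "sentiment(" + text + ")";

        System.out.println(summarizeThenTranslate(summarizer, translator, "input"));
        System.out.println(summarizeAndAnalyze(summarizer, analyzer, "input"));
    }
}
```

With the stubs shown, the first call prints translated(summary(input)) and the second prints [summary(input), sentiment(input)]. With real models the parallel variant mainly saves waiting time, since the two calls do not depend on each other.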

Benefits of Service Chaining

Cost Optimization: Use cheaper models for simple tasks
Specialization: Different services can use different prompts/tools
Testability: Each service can be unit tested independently
Maintainability: Changes to one service don't affect others
Reusability: Services can be reused across different applications
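
To make the testability point concrete, here is a small sketch of how routing logic could be checked without calling any model. It assumes hypothetical Classifier and Responder interfaces (they are not part of LangChain4j). Because AI Services are ordinary Java interfaces, lambdas can serve as test doubles.

```java
import java.util.ArrayList;
import java.util.List;

public class RouterTestSketch {

    // Hypothetical service interfaces; in production these would be AI Services.
    interface Classifier { String classify(String query); }
    interface Responder { String respond(String query); }

    // Deterministic routing logic under test
    static String route(Classifier classifier,
                        Responder greeter,
                        Responder expert,
                        String query) {
        return switch (classifier.classify(query)) {
            case "GREETING" -> greeter.respond(query);
            case "QUESTION" -> expert.respond(query);
            default -> "UNHANDLED";
        };
    }

    public static void main(String[] args) {
        // Recording stubs: each one notes that it was called
        List<String> calls = new ArrayList<>();
        Responder greeter = q -> { calls.add("greeter"); return "hi"; };
        Responder expert = q -> { calls.add("expert"); return "answer"; };

        // The classifier stub always answers QUESTION
        String result = route(q -> "QUESTION", greeter, expert, "What is a JVM?");

        System.out.println(result);
        System.out.println(calls);
    }
}
```

This prints answer followed by [expert], showing that only the expert stub was invoked. In a real project the same checks would typically live in a JUnit test rather than in main.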

Integration Patterns

AI Services integrate seamlessly with regular Java code:

Call AI Services from business logic methods
Use AI Service return values in if/else statements
Iterate with AI Services in loops
Mock AI Services in unit tests
Combine multiple AI Services with deterministic logic
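
The patterns above can be combined in a few lines. The following sketch is illustrative only: SpamDetector, ReplyWriter and ReplyReviewer are hypothetical interfaces, MAX_ATTEMPTS is an arbitrary limit, and stubs replace the model-backed instances. It shows an AI decision used in an if statement, AI Services called in a loop, and a deterministic fallback.

```java
public class IntegrationPatternsSketch {

    // Hypothetical AI Service interfaces; real ones would be built with AiServices.
    interface SpamDetector { boolean isSpam(String message); }
    interface ReplyWriter { String writeReply(String message); }
    interface ReplyReviewer { boolean isAcceptable(String reply); }

    // Arbitrary retry limit chosen for this sketch
    static final int MAX_ATTEMPTS = 3;

    static String handle(SpamDetector detector,
                         ReplyWriter writer,
                         ReplyReviewer reviewer,
                         String message) {
        // AI decision used in a regular if statement
        if (detector.isSpam(message)) {
            return "IGNORED";
        }
        // AI Services called in a loop: regenerate until the reviewer accepts
        // the reply or the attempts run out
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            String reply = writer.writeReply(message);
            if (reviewer.isAcceptable(reply)) {
                return reply;
            }
        }
        // Deterministic fallback when no reply was accepted
        return "ESCALATED";
    }

    public static void main(String[] args) {
        // Stubs standing in for model-backed services
        SpamDetector detector = m -> m.contains("lottery");
        int[] drafts = {0};
        ReplyWriter writer = m -> "draft " + (++drafts[0]);
        ReplyReviewer reviewer = r -> r.equals("draft 2");

        System.out.println(handle(detector, writer, reviewer, "You won the lottery"));
        System.out.println(handle(detector, writer, reviewer, "Where is my order?"));
    }
}
```

With these stubs the first message is rejected as spam (IGNORED) and the second is answered on the second attempt (draft 2). Bounding the loop matters with real models, because each iteration is a paid or time-consuming call.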

Example

The following example uses Ollama with two small local models, phi3:mini-128k and llama3.2:3b. They are fine for demos and learning but not for production-grade applications, because their reasoning capability and accuracy on complex tasks are limited.

package com.logicbig.example;

import dev.langchain4j.agent.tool.P;
import dev.langchain4j.agent.tool.Tool;
import dev.langchain4j.model.chat.ChatModel;
import dev.langchain4j.model.ollama.OllamaChatModel;
import dev.langchain4j.service.AiServices;
import dev.langchain4j.service.SystemMessage;
import dev.langchain4j.service.UserMessage;
import java.time.Duration;

public class ChainingExample {

    // Service 1: Query Classifier
    interface QueryClassifier {
        @UserMessage("Classify this query into one of these categories: "
                + "GREETING, QUESTION, COMMAND, or OTHER. Only return the category. "
                + "No explanation needed. Query: {{it}}")
        String classify(String query);
    }

    // Service 2: Greeting Responder (cheaper model for a simple task)
    interface GreetingResponder {
        @SystemMessage("You are a friendly assistant. "
                + "Respond to greetings warmly but briefly.")
        String respondToGreeting(String greeting);
    }

    // Service 3: Technical Expert (more capable model for complex questions)
    interface TechnicalExpert {
        @SystemMessage("You are a technical expert. Provide short, "
                + "accurate answers to technical questions.")
        String answerTechnicalQuestion(String question);
    }

    static class CommandTools {
        @Tool("Book a flight from one city to another")
        String bookFlight(@P("Departure city") String from,
                          @P("Destination city") String to) {
            System.out.printf("\n[TOOL CALLED] 'bookFlight' with params: "
                                      + "from='%s', to='%s'\n", from, to);
            //simulating booking flight
            return String.format("Flight booked from %s to %s at 3pm tomorrow. "
                                         + "Confirmation code: ABC123", from, to);
        }
    }

    // Service 4: Command Processor
    interface CommandProcessor {
        String processCommand(String command);
    }

    // Main orchestrator that chains services
    static class SmartAssistant {
        private final QueryClassifier classifier;
        private final GreetingResponder greeter;
        private final TechnicalExpert expert;
        private final CommandProcessor commandProcessor;

        public SmartAssistant(QueryClassifier classifier,
                              GreetingResponder greeter,
                              TechnicalExpert expert,
                              CommandProcessor commandProcessor) {
            this.classifier = classifier;
            this.greeter = greeter;
            this.expert = expert;
            this.commandProcessor = commandProcessor;
        }

        public String handleQuery(String query) {
            System.out.println("Query: " + query);
            // Step 1: Classify the query
            String classification = classifier.classify(query);
            System.out.println("Classification: " + classification);

            // Step 2: Route to appropriate service
            // trim() guards against stray whitespace or newlines in the model's reply
            return switch (classification.trim().toUpperCase()) {
                case "GREETING" -> {
                    System.out.println("Routing to Greeting Responder");
                    yield greeter.respondToGreeting(query);
                }
                case "QUESTION" -> {
                    System.out.println("Routing to Technical Expert");
                    yield expert.answerTechnicalQuestion(query);
                }
                case "COMMAND" -> {
                    System.out.println("Routing to Command processor");
                    yield commandProcessor.processCommand(query);
                }
                default -> {
                    System.out.println("Using default response");
                    yield "I'm not sure how to handle that. Could you rephrase?";
                }
            };
        }
    }

    public static void main(String[] args) {
        // Create different models for different tasks
        ChatModel cheapModel =
                OllamaChatModel.builder()
                               .baseUrl("http://localhost:11434")
                               .modelName("phi3:mini-128k")
                               .numCtx(4096)
                               .timeout(Duration.ofMinutes(3))
                               .temperature(0.1) // Low temperature for short, predictable replies
                               .build();

        ChatModel capableModel =
                OllamaChatModel.builder()
                               .baseUrl("http://localhost:11434")
                               .modelName("llama3.2:3b")
                               .numCtx(4096)
                               .timeout(Duration.ofMinutes(3))
                               .temperature(0.7) // Higher temp for creative responses
                               .build();

        // Create all services
        GreetingResponder greeter = AiServices.create(GreetingResponder.class, cheapModel);
        QueryClassifier classifier = AiServices.create(QueryClassifier.class, capableModel);
        TechnicalExpert expert = AiServices.create(TechnicalExpert.class, capableModel);
        CommandProcessor commandProcessor = AiServices.builder(CommandProcessor.class)
                                                      .chatModel(capableModel)
                                                      .tools(new CommandTools())
                                                      .build();

        // Create orchestrator
        SmartAssistant assistant = new SmartAssistant(classifier, greeter, expert, commandProcessor);

        // Test different queries
        System.out.println("-- Example 1: Greeting --");
        String response1 = assistant.handleQuery("Hello there!");
        System.out.println("Response: " + response1);

        System.out.println("\n-- Example 2: Technical Question --");
        String response2 = assistant.handleQuery("Explain how neural networks work "
                                                         + "in 2 sentences");
        System.out.println("Response: " + response2);
        System.out.println("\n-- Example 3: Command --");
        String response3 = assistant.handleQuery("Please book a flight from Seoul to Singapore.");
        System.out.println("Response: " + response3);

        System.out.println("\n-- Example 4: Question  --");
        String response4 = assistant.handleQuery("What's an AI agent "
                                                         + "in 2 sentences?");
        System.out.println("Response: " + response4);
    }
}

Output

-- Example 1: Greeting --
Query: Hello there!
Classification: GREETING
Routing to Greeting Responder
Response: Hi! How can I help you today?

-- Example 2: Technical Question --
Query: Explain how neural networks work in 2 sentences
Classification: QUESTION
Routing to Technical Expert
Response: Neural networks work by mimicking the human brain's structure, consisting of interconnected nodes (neurons) that process and transmit information through complex algorithms, allowing them to learn patterns and make predictions from large datasets. Through a series of iterative training cycles, these algorithms adjust the strength of connections between nodes based on input data, enabling the network to improve its performance over time.

-- Example 3: Command --
Query: Please book a flight from Seoul to Singapore.
Classification: COMMAND
Routing to Command processor

[TOOL CALLED] 'bookFlight' with params: from='Seoul', to='Singapore'
Response: The flight you've been booked on is operated by Korean Air, and it will depart from Incheon International Airport (ICN) in Seoul and arrive at Changi Airport (SIN) in Singapore.

Here are the details of your flight:

* Departure airport: Incheon International Airport (ICN)
* Destination airport: Changi Airport (SIN)
* Departure time: 3:00 PM (tomorrow)
* Flight number: KE123
* Airline: Korean Air
* Class: Economy

Please note that the flight schedule and availability may be subject to change, so it's always a good idea to check with the airline or a travel booking website for the latest information.

Also, don't forget to check in online at least 2 hours before your flight departs, and print out your boarding pass or access it on your mobile device. Have a safe trip!

-- Example 4: Question  --
Query: What's an AI agent in 2 sentences?
Classification: QUESTION
Routing to Technical Expert
Response: An AI agent is a software component that perceives its environment, takes actions, and learns from experiences to achieve a specific goal or set of goals. It acts as an intelligent decision-maker, using various techniques such as machine learning, rule-based systems, and optimization algorithms to interact with its surroundings and make informed decisions.

Conclusion

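The output shows how the AI Services work together: the classifier identifies the query type, the switch statement in SmartAssistant routes the query to the matching specialized service, and that service produces the final response. Each service can be given its own model, temperature, prompt, and tools, and the overall flow is controlled by a mix of AI decisions and deterministic logic. Example 3 also illustrates the small-model limitation noted earlier: the bookFlight tool was called with the correct arguments, but the reply adds details the tool never returned (airline, flight number, class) and omits the confirmation code. This modular approach makes the system easier to test and maintain than a single monolithic AI Service.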
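
Because the routing step depends on the exact text the classifier returns, it can help to normalize that text before switching on it. The sketch below is one possible approach under stated assumptions: QueryType and parse are hypothetical names, and the cleanup rule (keep letters only) is a simplification. Declaring the classifier's return type as an enum, as mentioned under Chaining Strategies, is another option.

```java
import java.util.Locale;

public class ClassificationParsingSketch {

    // Hypothetical enum mirroring the categories used in the classifier prompt
    enum QueryType { GREETING, QUESTION, COMMAND, OTHER }

    // Maps the classifier's raw text to a QueryType.
    // Anything that is not exactly one known category becomes OTHER.
    static QueryType parse(String raw) {
        if (raw == null) {
            return QueryType.OTHER;
        }
        // Keep letters only, so " greeting.\n" and "**COMMAND**" still match
        String cleaned = raw.replaceAll("[^A-Za-z]", "").toUpperCase(Locale.ROOT);
        for (QueryType type : QueryType.values()) {
            if (type.name().equals(cleaned)) {
                return type;
            }
        }
        return QueryType.OTHER;
    }

    public static void main(String[] args) {
        System.out.println(parse(" greeting.\n"));
        System.out.println(parse("**COMMAND**"));
        System.out.println(parse("This is a question"));
    }
}
```

This prints GREETING, COMMAND and OTHER. The last input falls through to OTHER because a full sentence is not a bare category name, which matches the default branch of handleQuery.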

Example Project

Dependencies and Technologies Used:

langchain4j 1.10.0 (Build LLM-powered applications in Java: chatbots, agents, RAG, and much more)
langchain4j-ollama 1.10.0 (LangChain4j :: Integration :: Ollama)
slf4j-simple 2.0.9 (SLF4J Simple Provider)
JDK 17
Maven 3.9.11