Integrate Multiple LLMs with Spring AI: OpenAI, Ollama and Gemini

    23/07/2025

    Introduction

    If you are building AI-powered applications, having access to multiple Large Language Models (LLMs) can significantly enhance your application's capabilities. Each LLM has its own strengths, pricing model, and characteristics. Spring AI, Spring's official AI framework, makes it remarkably easy to integrate multiple LLM providers and switch between them.

    In this blog, we'll build a Spring Boot application that integrates three LLMs: OpenAI, Ollama (running Mistral), and Google Gemini.

    Why Multiple LLMs?

    Before diving into the implementation, let's go through some of the reasons why you might want to integrate multiple LLMs in your application:

    Cost Optimization: Different LLM providers offer varying pricing models. You might use OpenAI's GPT-4 for complex reasoning tasks while using Ollama's local models for simple queries to reduce costs.

    Performance Characteristics: Each model excels in different areas. For example, OpenAI's models are great for general-purpose tasks, while Ollama's models can be deployed locally for privacy-sensitive applications.

    Reliability and Fallbacks: Having multiple providers ensures your application remains functional even if one service experiences downtime or rate limiting.

    Specialized Use Cases: Different models might be optimized for specific domains like code generation, creative writing, or technical documentation.

    Project Setup

    Let's start by setting up our Spring Boot project with the necessary dependencies.

    Maven Dependencies

    First, create a new Spring Boot project and add the following dependencies to your pom.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <parent>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-parent</artifactId>
            <version>3.5.3</version>
            <relativePath/>
        </parent>
        <groupId>com.codewiz</groupId>
        <artifactId>multillm</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <name>multillm</name>
        <description>Multi-LLM Integration with Spring AI</description>

        <properties>
            <java.version>24</java.version>
            <spring-ai.version>1.0.0</spring-ai.version>
        </properties>

        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-web</artifactId>
            </dependency>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-starter-model-ollama</artifactId>
            </dependency>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-starter-model-openai</artifactId>
            </dependency>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-test</artifactId>
                <scope>test</scope>
            </dependency>
        </dependencies>

        <dependencyManagement>
            <dependencies>
                <dependency>
                    <groupId>org.springframework.ai</groupId>
                    <artifactId>spring-ai-bom</artifactId>
                    <version>${spring-ai.version}</version>
                    <type>pom</type>
                    <scope>import</scope>
                </dependency>
            </dependencies>
        </dependencyManagement>
    </project>

    Application Configuration

    Next, configure the application properties in src/main/resources/application.properties:

    spring.application.name=multillm

    # OpenAI Configuration
    spring.ai.openai.api-key=${OPENAI_API_KEY}
    spring.ai.openai.chat.options.model=gpt-4.1-nano

    # Ollama Configuration
    spring.ai.ollama.chat.options.model=mistral:7b

    # Gemini Configuration (using OpenAI-compatible endpoint)
    gemini.api.key=${GEMINI_API_KEY}
    gemini.api.url=https://generativelanguage.googleapis.com/v1beta/openai
    gemini.api.completions.path=/chat/completions
    gemini.model.name=gemini-2.5-flash

    # Server Configuration
    server.port=8100

    Note: Make sure to set your environment variables:

    export OPENAI_API_KEY="your-openai-api-key"
    export GEMINI_API_KEY="your-gemini-api-key"

    Building the Application Structure

    Let us first create an enum to represent the different LLM types:

    package com.codewiz.multillm.dto;

    public enum LLMType {
        OPENAI("openai"),
        OLLAMA("ollama"),
        GEMINI("gemini");

        private final String value;

        LLMType(String value) {
            this.value = value;
        }

        public String getValue() {
            return value;
        }
    }

    Now let us add a chat response DTO. Its constructor stamps the current time, so the controller only needs to supply the response text, the LLM name, and the original message:

    package com.codewiz.multillm.dto;

    public class ChatResponse {

        private final String response;
        private final String llm;
        private final String originalMessage;
        private final long timestamp;

        public ChatResponse(String response, String llm, String originalMessage) {
            this.response = response;
            this.llm = llm;
            this.originalMessage = originalMessage;
            // Stamp the response with the current time; this is the "timestamp"
            // field visible in the sample responses below
            this.timestamp = System.currentTimeMillis();
        }

        // Getters omitted for brevity
    }

    Controller and Service Implementation

    REST Controller

    Create a simple REST controller to handle chat requests:

    @RestController
    public class ChatController {

        private final ChatService chatService;

        @Autowired
        public ChatController(ChatService chatService) {
            this.chatService = chatService;
        }

        @GetMapping("/chat")
        public ResponseEntity<ChatResponse> chat(@RequestParam String message,
                                                 @RequestParam String llm) {
            String response = chatService.chat(llm, message);
            return ResponseEntity.ok(new ChatResponse(response, llm, message));
        }
    }

    Chat Service

    The service layer handles the logic for routing requests to different LLM providers:

    @Service
    public class ChatService {

        private final ChatClient openAIChatClient;
        private final ChatClient ollamaChatClient;
        private final ChatClient geminiChatClient;

        @Autowired
        public ChatService(OpenAiChatModel openAiChatModel,
                           OllamaChatModel ollamaChatModel,
                           @Qualifier("geminiChatClient") ChatClient geminiChatClient) {
            this.openAIChatClient = ChatClient.create(openAiChatModel);
            this.ollamaChatClient = ChatClient.create(ollamaChatModel);
            this.geminiChatClient = geminiChatClient;
        }

        public String chat(String llmName, String message) {
            var chatClient = getChatClient(LLMType.valueOf(llmName.toUpperCase()));
            return chatClient.prompt()
                    .user(message)
                    .call()
                    .content();
        }

        private ChatClient getChatClient(LLMType llmName) {
            return switch (llmName) {
                case OPENAI -> openAIChatClient;
                case OLLAMA -> ollamaChatClient;
                case GEMINI -> geminiChatClient;
            };
        }
    }
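    One detail worth noting: LLMType.valueOf throws an IllegalArgumentException for any unrecognized llm value, which would surface as an HTTP 500. As a minimal sketch (the ChatExceptionHandler class below is illustrative, not part of the original project), a small advice class can turn that into a friendlier 400 response:

    import org.springframework.http.ResponseEntity;
    import org.springframework.web.bind.annotation.ExceptionHandler;
    import org.springframework.web.bind.annotation.RestControllerAdvice;

    // Hypothetical error handler: maps unknown "llm" values to HTTP 400
    @RestControllerAdvice
    public class ChatExceptionHandler {

        @ExceptionHandler(IllegalArgumentException.class)
        public ResponseEntity<String> handleUnknownLlm(IllegalArgumentException ex) {
            return ResponseEntity.badRequest()
                    .body("Unsupported llm value. Use one of: openai, ollama, gemini");
        }
    }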

    Adding OpenAI and Ollama Support

    OpenAI and Ollama integration is straightforward with Spring AI's starter dependencies:

    OpenAI Integration

    Spring AI provides native support for OpenAI through the spring-ai-starter-model-openai dependency. The configuration is handled automatically using the properties we defined:

    spring.ai.openai.api-key=${OPENAI_API_KEY}
    spring.ai.openai.chat.options.model=gpt-4.1-nano

    Ollama Integration

    Ollama integration is equally simple with the spring-ai-starter-model-ollama dependency. Ollama runs locally, so you'll need to:

    1. Install Ollama on your machine
    2. Pull the desired model: ollama pull mistral:7b
    3. Configure the model in properties:
    spring.ai.ollama.chat.options.model=mistral:7b

    Once you add the dependencies and configure the properties, Spring AI auto-configures a ChatModel bean for each of OpenAI and Ollama.
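    To confirm the wiring, here's a minimal sketch (the StartupSanityCheck class is illustrative, not part of the project) that injects both auto-configured beans and sends each a one-line prompt at startup:

    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.ai.ollama.OllamaChatModel;
    import org.springframework.ai.openai.OpenAiChatModel;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class StartupSanityCheck {

        // Pings each auto-configured model once at startup; remove after verifying
        @Bean
        CommandLineRunner verifyModels(OpenAiChatModel openAi, OllamaChatModel ollama) {
            return args -> {
                System.out.println("OpenAI says: " + ChatClient.create(openAi)
                        .prompt().user("Reply with one word: ready?").call().content());
                System.out.println("Ollama says: " + ChatClient.create(ollama)
                        .prompt().user("Reply with one word: ready?").call().content());
            };
        }
    }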

    Integrating Gemini with OpenAI Dependency

    Google Gemini doesn't have a dedicated Spring AI starter, but we can leverage Gemini's OpenAI-compatible API endpoint using the existing OpenAI dependency.

    LLM Configuration for Gemini

    Create a configuration class to set up Gemini using the OpenAI client:

    @Configuration
    public class LLMConfig {

        @Bean
        @Qualifier("geminiChatClient")
        public ChatClient geminiChatClient(OpenAiChatModel baseChatModel,
                                           @Value("${gemini.api.key}") String apiKey,
                                           @Value("${gemini.api.url}") String geminiUrl,
                                           @Value("${gemini.api.completions.path}") String completionsPath,
                                           @Value("${gemini.model.name}") String modelName) {

            // Point the OpenAI client at Gemini's OpenAI-compatible endpoint
            var geminiApi = OpenAiApi.builder()
                    .baseUrl(geminiUrl)
                    .completionsPath(completionsPath)
                    .apiKey(apiKey)
                    .build();

            // Clone the auto-configured model, swapping in the Gemini API and model name
            var customModel = baseChatModel.mutate()
                    .openAiApi(geminiApi)
                    .defaultOptions(OpenAiChatOptions.builder().model(modelName).build())
                    .build();

            return ChatClient.create(customModel);
        }
    }

    This configuration:

    1. Uses OpenAI API structure: Leverages the existing OpenAI client infrastructure
    2. Customizes endpoints: Points to Gemini's OpenAI-compatible endpoint
    3. Mutates the base model: Creates a new instance with Gemini-specific settings
    4. Maintains compatibility: Works seamlessly with Spring AI's ChatClient interface
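
    Because the mutated model sits behind the standard ChatClient interface, everything else in Spring AI works unchanged, including per-request option overrides. The snippet below is a minimal sketch (the GeminiDemo class and the temperature value are illustrative, not part of the original project):

    import org.springframework.ai.chat.client.ChatClient;
    import org.springframework.ai.openai.OpenAiChatOptions;
    import org.springframework.beans.factory.annotation.Qualifier;
    import org.springframework.stereotype.Service;

    @Service
    public class GeminiDemo {

        private final ChatClient gemini;

        public GeminiDemo(@Qualifier("geminiChatClient") ChatClient gemini) {
            this.gemini = gemini;
        }

        public String summarise() {
            // Per-request override: the same fluent API as the other providers
            return gemini.prompt()
                    .user("Summarise Spring AI in one sentence")
                    .options(OpenAiChatOptions.builder()
                            .temperature(0.2) // illustrative value
                            .build())
                    .call()
                    .content();
        }
    }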

    Testing with HTTPie

    Now that our application is complete, let's test it using HTTPie. First, start your application:

    ./mvnw spring-boot:run

    Testing OpenAI

    # Test OpenAI and get model information
    http GET localhost:8100/chat \
        message=="What model are you? Please provide your name, version, and key capabilities." \
        llm==openai

    Sample Response:

    { "llm": "openai", "originalMessage": "What model are you? Please provide your name, version, and key capabilities.", "response": "I am ChatGPT, based on the GPT-4 architecture developed by OpenAI. My version includes improvements in understanding and generating human-like text, enabling me to assist with a wide range of tasks such as answering questions, providing explanations, composing creative writing, and more. I can understand context, handle complex prompts, and generate coherent and relevant responses across various topics.", "timestamp": 1753268746222 }

    Testing Ollama

    # Test Ollama and get model information
    http GET localhost:8100/chat \
        message=="What model are you? Please tell me your name, version, and what you're good at." \
        llm==ollama

    Sample Response:

    { "llm": "ollama", "originalMessage": "What model are you? Please tell me your name, version, and what you're good at.", "response": " I am a model of the Chat Model developed by Mistral AI. My primary function is to assist with various tasks by providing information, answering questions, and engaging in conversation. I strive to provide precise, helpful, and courteous responses.\n\nWhile I don't have a personal name, you can think of me as your digital assistant designed to make your interactions more enjoyable and productive. My capabilities include but are not limited to: answering questions, providing explanations, discussing a wide range of topics, assisting with scheduling and organization, offering recommendations, and much more.\n\nIn terms of my version, I am part of the latest generation of models, continually learning and improving from the data it encounters during interactions like this one.", "timestamp": 1753268772790 }

    Testing Gemini

    # Test Gemini and get model information
    http GET localhost:8100/chat \
        message=="Please identify yourself. What model are you, what version, and what are your strengths?" \
        llm==gemini

    Sample Response:

    { "llm": "gemini", "originalMessage": "Please identify yourself. What model are you, what version, and what are your strengths?", "response": "I am a large language model, trained by Google.\n\n**Model & Version:**\nUnlike traditional software with specific version numbers, large language models like me are continuously updated and refined. There isn't a single, publicly accessible \"version number\" in the way you might think of software like....", "timestamp": 1753268800297 }

    Conclusion

    Integrating multiple LLMs with Spring AI is quite simple. By leveraging Spring AI's capabilities, you can easily switch between different LLM providers based on your application's needs. This flexibility allows you to optimize for cost, performance, and reliability while providing a seamless user experience.

    You can find the complete source code for this project on my GitHub repository: CodeWizzard01/spring-ai-multiple-llm


    For more in-depth tutorials on Java, Spring, and modern software development practices, follow me for more content:

    🔗 Blog 🔗 YouTube 🔗 LinkedIn 🔗 Medium 🔗 Github

    Stay tuned for more content on the latest in AI and software engineering!
