If you are building AI-powered applications, having access to multiple Large Language Models (LLMs) can significantly enhance your application's capabilities. Each LLM has its own strengths, pricing model, and characteristics. Spring AI, Spring's official AI framework, makes it remarkably easy to integrate multiple LLM providers and switch between them.
In this blog, we'll build a Spring Boot application that integrates three LLM providers: OpenAI, Ollama (running Mistral), and Google Gemini.
Before diving into the implementation, let's go through some of the reasons why you might want to integrate multiple LLMs in your application:
- **Cost Optimization:** Different LLM providers offer varying pricing models. You might use OpenAI's GPT-4 for complex reasoning tasks while using Ollama's local models for simple queries to reduce costs.
- **Performance Characteristics:** Each model excels in different areas. For example, OpenAI's models are great for general-purpose tasks, while Ollama's models can be deployed locally for privacy-sensitive applications.
- **Reliability and Fallbacks:** Having multiple providers keeps your application functional even if one service experiences downtime or rate limiting (see the sketch after this list).
- **Specialized Use Cases:** Different models might be optimized for specific domains like code generation, creative writing, or technical documentation.
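To make the reliability point concrete, here is a minimal sketch of a fallback wrapper. The `FallbackChatService` class and its preference ordering are assumptions for illustration, not code from this project; it simply tries each configured `ChatClient` in turn and returns the first successful response.

```java
import java.util.List;

import org.springframework.ai.chat.client.ChatClient;

// Hypothetical fallback wrapper (illustration only, not part of this project):
// tries each provider's ChatClient in preference order and returns the first
// successful answer.
public class FallbackChatService {

    private final List<ChatClient> clientsInPreferenceOrder;

    public FallbackChatService(List<ChatClient> clientsInPreferenceOrder) {
        this.clientsInPreferenceOrder = clientsInPreferenceOrder;
    }

    public String chat(String message) {
        RuntimeException lastFailure = null;
        for (ChatClient client : clientsInPreferenceOrder) {
            try {
                return client.prompt().user(message).call().content();
            } catch (RuntimeException e) {
                lastFailure = e; // e.g. downtime or rate limiting; try the next provider
            }
        }
        throw new IllegalStateException("All LLM providers failed", lastFailure);
    }
}
```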
Let's start by setting up our Spring Boot project with the necessary dependencies.
First, create a new Spring Boot project and add the following dependencies to your `pom.xml`:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.5.3</version>
        <relativePath/>
    </parent>
    <groupId>com.codewiz</groupId>
    <artifactId>multillm</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>multillm</name>
    <description>Multi-LLM Integration with Spring AI</description>
    <properties>
        <java.version>24</java.version>
        <spring-ai.version>1.0.0</spring-ai.version>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-ollama</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-starter-model-openai</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.springframework.ai</groupId>
                <artifactId>spring-ai-bom</artifactId>
                <version>${spring-ai.version}</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>
</project>
```
Next, configure the application properties in `src/main/resources/application.properties`:
```properties
spring.application.name=multillm

# OpenAI Configuration
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-4.1-nano

# Ollama Configuration
spring.ai.ollama.chat.options.model=mistral:7b

# Gemini Configuration (using OpenAI-compatible endpoint)
gemini.api.key=${GEMINI_API_KEY}
gemini.api.url=https://generativelanguage.googleapis.com/v1beta/openai
gemini.api.completions.path=/chat/completions
gemini.model.name=gemini-2.5-flash

# Server Configuration
server.port=8100
```
Note: Make sure to set your environment variables:
```bash
export OPENAI_API_KEY="your-openai-api-key"
export GEMINI_API_KEY="your-gemini-api-key"
```
First, create an enum to represent the different LLM types:
```java
package com.codewiz.multillm.dto;

public enum LLMType {
    OPENAI("openai"),
    OLLAMA("ollama"),
    GEMINI("gemini");

    private final String value;

    LLMType(String value) {
        this.value = value;
    }

    public String getValue() {
        return value;
    }
}
```
Next, add a chat response DTO:
```java
package com.codewiz.multillm.dto;

public class ChatResponse {

    private String response;
    private String llm;
    private String originalMessage;
    private long timestamp;

    // Constructor used by the controller; stamps the response time
    public ChatResponse(String response, String llm, String originalMessage) {
        this.response = response;
        this.llm = llm;
        this.originalMessage = originalMessage;
        this.timestamp = System.currentTimeMillis();
    }

    // Getters and setters
}
```
Create a simple REST controller to handle chat requests:
```java
package com.codewiz.multillm.controller;

import com.codewiz.multillm.dto.ChatResponse;
import com.codewiz.multillm.service.ChatService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatService chatService;

    @Autowired
    public ChatController(ChatService chatService) {
        this.chatService = chatService;
    }

    @GetMapping("/chat")
    public ResponseEntity<ChatResponse> chat(
            @RequestParam String message,
            @RequestParam String llm) {
        String response = chatService.chat(llm, message);
        return ResponseEntity.ok(new ChatResponse(response, llm, message));
    }
}
```
The service layer handles the logic for routing requests to different LLM providers:
```java
package com.codewiz.multillm.service;

import com.codewiz.multillm.dto.LLMType;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;

@Service
public class ChatService {

    private final ChatClient openAIChatClient;
    private final ChatClient ollamaChatClient;
    private final ChatClient geminiChatClient;

    @Autowired
    public ChatService(OpenAiChatModel openAiChatModel,
                       OllamaChatModel ollamaChatModel,
                       @Qualifier("geminiChatClient") ChatClient geminiChatClient) {
        this.openAIChatClient = ChatClient.create(openAiChatModel);
        this.ollamaChatClient = ChatClient.create(ollamaChatModel);
        this.geminiChatClient = geminiChatClient;
    }

    public String chat(String llmName, String message) {
        var chatClient = getChatModel(LLMType.valueOf(llmName.toUpperCase()));
        return chatClient.prompt()
                .user(message)
                .call()
                .content();
    }

    private ChatClient getChatModel(LLMType llmName) {
        return switch (llmName) {
            case OPENAI -> openAIChatClient;
            case OLLAMA -> ollamaChatClient;
            case GEMINI -> geminiChatClient;
        };
    }
}
```
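One caveat: `LLMType.valueOf(llmName.toUpperCase())` throws an `IllegalArgumentException` for an unrecognized `llm` parameter, which Spring would surface as a 500 error. Below is a minimal sketch of one way to turn that into a 400 response; the `ChatErrorHandler` advice is an assumption for illustration, not part of the original project.

```java
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;

// Hypothetical error handler (not in the original project): maps an unknown
// "llm" request parameter to a 400 Bad Request instead of a generic 500.
@RestControllerAdvice
class ChatErrorHandler {

    @ExceptionHandler(IllegalArgumentException.class)
    ResponseEntity<String> handleUnknownLlm(IllegalArgumentException ex) {
        return ResponseEntity.badRequest()
                .body("Unsupported llm value; expected one of: openai, ollama, gemini");
    }
}
```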
OpenAI and Ollama integration is straightforward with Spring AI's starter dependencies:
Spring AI provides native support for OpenAI through the `spring-ai-starter-model-openai` dependency. The configuration is handled automatically using the properties we defined:
```properties
spring.ai.openai.api-key=${OPENAI_API_KEY}
spring.ai.openai.chat.options.model=gpt-4.1-nano
```
Ollama integration is equally simple with the `spring-ai-starter-model-ollama` dependency. Ollama runs locally, so you'll need to install it and pull the Mistral model first:
```bash
ollama pull mistral:7b
```
Then reference the model in `application.properties`:

```properties
spring.ai.ollama.chat.options.model=mistral:7b
```
Once you add the dependencies and configure the properties, Spring AI automatically creates the `ChatModel` beans for both OpenAI and Ollama.
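As a quick illustration, here is a minimal sketch showing that both auto-configured models can be injected directly and wrapped in a fluent `ChatClient`. The `ModelSmokeTest` class and its prompts are illustrative assumptions, not part of the project:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;

// Illustrative only: verifies at startup that the auto-configured
// OpenAiChatModel and OllamaChatModel beans are present and usable.
@Component
class ModelSmokeTest implements CommandLineRunner {

    private final OpenAiChatModel openAiChatModel;
    private final OllamaChatModel ollamaChatModel;

    ModelSmokeTest(OpenAiChatModel openAiChatModel, OllamaChatModel ollamaChatModel) {
        this.openAiChatModel = openAiChatModel;
        this.ollamaChatModel = ollamaChatModel;
    }

    @Override
    public void run(String... args) {
        // Each auto-configured model can be wrapped in a fluent ChatClient.
        String openAiReply = ChatClient.create(openAiChatModel)
                .prompt().user("Reply with one word: ready?").call().content();
        String ollamaReply = ChatClient.create(ollamaChatModel)
                .prompt().user("Reply with one word: ready?").call().content();
        System.out.println("OpenAI: " + openAiReply + " | Ollama: " + ollamaReply);
    }
}
```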
Google Gemini doesn't have a dedicated Spring AI starter, but we can leverage Gemini's OpenAI-compatible API endpoint using the existing OpenAI dependency.
Create a configuration class to set up Gemini using the OpenAI client:
```java
package com.codewiz.multillm.config;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.openai.OpenAiChatModel;
import org.springframework.ai.openai.OpenAiChatOptions;
import org.springframework.ai.openai.api.OpenAiApi;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class LLMConfig {

    @Bean
    @Qualifier("geminiChatClient")
    public ChatClient geminiChatModel(
            OpenAiChatModel baseChatModel,
            @Value("${gemini.api.key}") String apiKey,
            @Value("${gemini.api.url}") String geminiUrl,
            @Value("${gemini.api.completions.path}") String completionsPath,
            @Value("${gemini.model.name}") String modelName) {

        // Point the OpenAI client at Gemini's OpenAI-compatible endpoint
        var geminiApi = OpenAiApi.builder()
                .baseUrl(geminiUrl)
                .completionsPath(completionsPath)
                .apiKey(apiKey)
                .build();

        // Clone the auto-configured model, swapping in the Gemini API and model name
        var customModel = baseChatModel.mutate()
                .openAiApi(geminiApi)
                .defaultOptions(OpenAiChatOptions.builder().model(modelName).build())
                .build();

        return ChatClient.create(customModel);
    }
}
```
This configuration:

- builds an `OpenAiApi` client pointed at Gemini's OpenAI-compatible base URL and completions path,
- mutates the auto-configured `OpenAiChatModel` to use that client with the Gemini model name, and
- exposes the result as a `ChatClient` bean qualified as `geminiChatClient`.
Now that our application is complete, let's test it using HTTPie. First, start your application:
```bash
./mvnw spring-boot:run
```
```bash
# Test OpenAI and get model information
http GET localhost:8100/chat \
  message=="What model are you? Please provide your name, version, and key capabilities." \
  llm==openai
```
Sample Response:
{ "llm": "openai", "originalMessage": "What model are you? Please provide your name, version, and key capabilities.", "response": "I am ChatGPT, based on the GPT-4 architecture developed by OpenAI. My version includes improvements in understanding and generating human-like text, enabling me to assist with a wide range of tasks such as answering questions, providing explanations, composing creative writing, and more. I can understand context, handle complex prompts, and generate coherent and relevant responses across various topics.", "timestamp": 1753268746222 }
```bash
# Test Ollama and get model information
http GET localhost:8100/chat \
  message=="What model are you? Please tell me your name, version, and what you're good at." \
  llm==ollama
```
Sample Response:
{ "llm": "ollama", "originalMessage": "What model are you? Please tell me your name, version, and what you're good at.", "response": " I am a model of the Chat Model developed by Mistral AI. My primary function is to assist with various tasks by providing information, answering questions, and engaging in conversation. I strive to provide precise, helpful, and courteous responses.\n\nWhile I don't have a personal name, you can think of me as your digital assistant designed to make your interactions more enjoyable and productive. My capabilities include but are not limited to: answering questions, providing explanations, discussing a wide range of topics, assisting with scheduling and organization, offering recommendations, and much more.\n\nIn terms of my version, I am part of the latest generation of models, continually learning and improving from the data it encounters during interactions like this one.", "timestamp": 1753268772790 }
```bash
# Test Gemini and get model information
http GET localhost:8100/chat \
  message=="Please identify yourself. What model are you, what version, and what are your strengths?" \
  llm==gemini
```
Sample Response:
{ "llm": "gemini", "originalMessage": "Please identify yourself. What model are you, what version, and what are your strengths?", "response": "I am a large language model, trained by Google.\n\n**Model & Version:**\nUnlike traditional software with specific version numbers, large language models like me are continuously updated and refined. There isn't a single, publicly accessible \"version number\" in the way you might think of software like....", "timestamp": 1753268800297 }
Integrating multiple LLMs with Spring AI is quite simple. By leveraging Spring AI's capabilities, you can easily switch between different LLM providers based on your application's needs. This flexibility allows you to optimize for cost, performance, and reliability while providing a seamless user experience.
You can find the complete source code for this project on my GitHub repository: CodeWizzard01/spring-ai-multiple-llm
For more in-depth tutorials on Java, Spring, and modern software development practices, follow me for more content:
🔗 Blog 🔗 YouTube 🔗 LinkedIn 🔗 Medium 🔗 Github
Stay tuned for more content on the latest in AI and software engineering!