Two years ago, wiring an LLM into a Java backend meant writing bespoke HTTP client code and manually parsing JSON responses. Python had LangChain, LlamaIndex, and a growing ecosystem. Java had patience and a generic HttpClient.
That gap is closed. As of 2026, six production-viable frameworks can take a Java application from zero to multi-agent with tool calling, MCP support, and durable workflows. The hard part now isn't finding a framework—it's choosing between them.
Here's a practical comparison based on the current versions and real usage tradeoffs.
| Framework | Language | Spring native | Planning model | MCP | A2A | Checkpointing | Status |
|---|---|---|---|---|---|---|---|
| LangChain4j | Java | ✅ | ReAct loop | ✅ | ✅ | ❌ | GA |
| Spring AI | Java | Native | ReAct loop | ✅ | ❌ | ❌ | GA |
| Embabel | Kotlin/Java | ✅ | GOAP + Utility AI | ✅ | ✅ | Planned | Beta |
| Koog | Kotlin (Java API) | ✅ | Graph-based | ✅ | ❌ | ✅ | Beta |
| Google ADK | Java | ✅ | Hierarchical | ✅ | ✅ | ❌ | Pre-GA |
| Semantic Kernel | Java | ❌ | ReAct loop | ✅ | ❌ | ❌ | GA |
A2A = Agent-to-Agent protocol support. Checkpointing = durable mid-workflow state persistence.
LangChain4j established dominance early by abstracting the LLM provider landscape into clean Java interfaces. In 1.x, it formally split agentic capabilities into a dedicated module (langchain4j-agentic) with sub-modules for MCP, A2A, and common agentic patterns — things that previously required custom wiring now have first-class support.
The core developer experience is declarative. You define an interface, and the framework proxies the implementation:
```java
public interface CustomerSupportAgent {

    @SystemMessage("You are a helpful support agent. Only answer questions related to your tools.")
    String handleQuery(@UserMessage String query);
}
```
Attaching tools, memory, and guardrails is done at build time:
```java
CustomerSupportAgent agent = AiServices.builder(CustomerSupportAgent.class)
        .chatLanguageModel(model)
        .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
        .tools(new OrderLookupService(), new RefundService())
        .inputGuardrail(new PiiRedactionGuardrail())   // 1.x: blocks/redacts PII before the LLM sees it
        .outputGuardrail(new HallucinationGuardrail()) // 1.x: validates response before returning
        .build();
```
The guardrails API (langchain4j-guardrails) is one of the more important 1.x additions. Input guardrails intercept user messages before they reach the model — you can block prompt injections, redact PII, or enforce input schemas. Output guardrails validate model responses before they reach your application. In production, where you can't fully trust either the user or the model, this is essential.
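The underlying idea is easy to prototype outside the framework. Here is a minimal, framework-free sketch of the kind of redaction an input guardrail performs before a message reaches the model — the class name and regex patterns are simplified illustrations, not LangChain4j's implementation:

```java
import java.util.regex.Pattern;

// Illustrative only: the kind of PII redaction an input guardrail applies
// before the user message reaches the model. Patterns are simplified examples.
public class PiiRedactor {
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    private static final Pattern SSN =
            Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    public static String redact(String userMessage) {
        String cleaned = EMAIL.matcher(userMessage).replaceAll("[EMAIL]");
        return SSN.matcher(cleaned).replaceAll("[SSN]");
    }
}
```

In the real guardrails module, logic like this lives behind the framework's guardrail interfaces so it runs automatically on every request.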
What to watch for: LangChain4j abstracts over 20+ providers. When a provider ships a new capability (OpenAI's native code execution, Gemini's grounding API), LangChain4j typically lags by a few weeks before exposing it through the unified API. If you need bleeding-edge provider features immediately, you'll need to reach into the raw client, which defeats the abstraction. The tradeoff is explicit: breadth over novelty.
📚 Related: Building a Smart Investment Portfolio Advisor with Java, Spring Boot, and LangChain4j
Spring AI is the right choice when your codebase is already Spring Boot. It uses the Spring component model throughout — autoconfiguration, beans, property-driven setup — so there's no context switching.
Tool calling is entirely native. Annotate a service method and Spring AI generates the JSON schema and handles parameter mapping:
```java
@Service
public class OrderService {

    @Tool(description = "Fetches the current shipping status of an order by its ID.")
    public String getOrderStatus(String orderId) {
        return orderRepository.findById(orderId)
                .map(Order::getShippingStatus)
                .orElseThrow(() -> new OrderNotFoundException(orderId));
    }
}
```
The Advisors API, the most underrated feature in the framework, wraps the LLM call and lets you composably modify requests going in and responses coming out:
```java
ChatClient chatClient = ChatClient.builder(chatModel)
        .defaultAdvisors(
                new MessageChatMemoryAdvisor(chatMemory), // adds conversation memory
                new QuestionAnswerAdvisor(vectorStore)    // adds RAG context injection
        )
        .build();
```
Custom advisors are straightforward to write. You can inject tenant context, sanitize outputs, log token usage per request, or enforce a rate limit — all reusable across different ChatClient instances.
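The pattern behind advisors is ordinary function composition: each advisor wraps the next one, transforming the request on the way in and the response on the way out. A framework-free sketch with hypothetical names (not Spring AI's actual advisor interface):

```java
import java.util.List;
import java.util.function.Function;

// Illustrative: each "advisor" wraps the call; the first registered
// advisor runs outermost, exactly like servlet filters or interceptors.
public class AdvisorChain {
    public interface Advisor {
        String around(String request, Function<String, String> next);
    }

    public static String execute(String request, List<Advisor> advisors,
                                 Function<String, String> model) {
        Function<String, String> chain = model;
        // Build the chain inside-out so advisors.get(0) runs outermost.
        for (int i = advisors.size() - 1; i >= 0; i--) {
            Advisor a = advisors.get(i);
            Function<String, String> next = chain;
            chain = req -> a.around(req, next);
        }
        return chain.apply(request);
    }
}
```

One advisor can prepend tenant context to the request while another post-processes the response, and neither needs to know the other exists — that is what makes them reusable across ChatClient instances.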
Structured Output is another key feature. It maps the LLM response directly to a Java record without manual parsing:
```java
record ResearchSummary(String title, List<String> keyPoints, String conclusion) {}

ResearchSummary summary = chatClient.prompt()
        .user("Summarize the security implications of post-quantum cryptography for TLS")
        .call()
        .entity(ResearchSummary.class);
```
Spring AI generates the JSON schema from the record, injects it into the prompt, and handles deserialization. It retries on parse failures with a corrective prompt.
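The retry behavior is worth understanding even if you never implement it yourself. A simplified, framework-free sketch of a parse-retry loop with a corrective prompt — the names are hypothetical, and real structured-output engines do considerably more (schema injection, partial repair):

```java
import java.util.function.Function;

// Illustrative: call the model, try to parse the reply; on failure,
// re-prompt with the error message appended, up to maxAttempts times.
public class StructuredOutputLoop {
    public interface Parser<T> { T parse(String raw) throws Exception; }

    public static <T> T callWithRetry(String prompt,
                                      Function<String, String> model,
                                      Parser<T> parser,
                                      int maxAttempts) {
        String currentPrompt = prompt;
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String raw = model.apply(currentPrompt);
            try {
                return parser.parse(raw);
            } catch (Exception e) {
                last = e;
                currentPrompt = prompt
                        + "\nYour previous reply was not valid: " + e.getMessage()
                        + "\nRespond with valid output only.";
            }
        }
        throw new IllegalStateException("Giving up after " + maxAttempts + " attempts", last);
    }
}
```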
What to watch for: Spring AI ships features fast. The gap between 1.0 and 1.1 introduced the Advisors API, Structured Output, and a redesigned MCP integration. That velocity is great, but it means breaking API changes between minor versions are real. Pin versions explicitly with the Spring AI BOM and review changelogs on every upgrade.
Most agent frameworks run reactive loops: prompt the LLM, get a tool call back, execute it, append the result, repeat. The LLM decides what to do at each step. Embabel, created by Rod Johnson (creator of Spring), takes a fundamentally different approach: Goal-Oriented Action Planning (GOAP).
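That reactive loop is small enough to sketch in plain Java. This is a conceptual illustration with a stubbed model and a toy tool-call convention (the `CALL name:arg` format is invented for the example; real frameworks use structured tool-call messages):

```java
import java.util.Map;
import java.util.function.Function;

// Illustrative ReAct-style loop: ask the model, execute any tool call it
// requests, append the result to the transcript, and repeat until it
// answers directly or the step budget runs out.
public class ReactLoop {
    public static String run(String question,
                             Function<String, String> model,
                             Map<String, Function<String, String>> tools,
                             int maxSteps) {
        String transcript = "Q: " + question;
        for (int step = 0; step < maxSteps; step++) {
            String reply = model.apply(transcript);
            if (reply.startsWith("CALL ")) {              // e.g. "CALL lookup:42"
                String[] parts = reply.substring(5).split(":", 2);
                String result = tools.get(parts[0]).apply(parts[1]);
                transcript += "\nTOOL " + parts[0] + " -> " + result;
            } else {
                return reply;                             // final answer
            }
        }
        throw new IllegalStateException("No answer after " + maxSteps + " steps");
    }
}
```

The key property: the LLM decides the next step at every iteration, which is flexible but makes execution paths hard to predict — the weakness Embabel's planner targets.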
Instead of asking the LLM to drive, you define discrete actions whose inputs and outputs are Java types. The Embabel planner, using a non-LLM AI algorithm borrowed from game AI, determines the sequence of actions needed to transform your start state into your goal state.
```java
@Agent(description = "Research agent that compiles reports from source material")
public class ResearchAgent {

    @Action
    public RawData fetchData(UserInput input) {
        // Pure Java — no LLM involved here
        return dataService.fetch(input.getContent());
    }

    @Action
    public Summary synthesize(RawData data, Ai ai) {
        // The Ai parameter is injected; choose your model explicitly
        return ai.withLlm(OpenAiModels.GPT_41)
                .createObject(
                        "Synthesize the following data into a concise summary:\n"
                                + data.getContent(),
                        Summary.class);
    }

    @AchievesGoal(description = "Compile a final report from a research summary")
    @Action
    public Report compile(Summary summary) {
        return new Report(summary.text(), LocalDate.now());
    }
}
```
The planner infers the execution graph from types: Report requires Summary, Summary requires RawData, RawData requires UserInput. You declare what produces what, not when to call it. When the synthesize step fails, the planner can replan — potentially choosing a different action that also produces a Summary — rather than crashing.
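The type-driven idea can be sketched without the framework: treat each action as "produces type X from type Y" and chain backward from the goal type. This is a deliberately tiny illustration with hypothetical names — Embabel's actual planner handles multiple inputs, replanning, and cost, which this does not:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative backward-chaining over type dependencies: each action
// produces one type and requires one type; walk from the goal type back
// to the start type, then reverse to get the execution order.
public class TypePlanner {
    // Map key: produced type; value: {actionName, requiredType}
    public static List<String> plan(Map<String, String[]> actions,
                                    String startType, String goalType) {
        List<String> reversed = new ArrayList<>();
        String need = goalType;
        while (!need.equals(startType)) {
            String[] action = actions.get(need);
            if (action == null) {
                throw new IllegalStateException("No action produces " + need);
            }
            reversed.add(action[0]);   // record the action
            need = action[1];          // now we need its input type
        }
        List<String> plan = new ArrayList<>();
        for (int i = reversed.size() - 1; i >= 0; i--) plan.add(reversed.get(i));
        return plan;
    }
}
```

Fed the ResearchAgent's signatures, this recovers fetchData → synthesize → compile without anyone writing that sequence down.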
Beyond GOAP, Embabel also supports Utility AI: actions are scored by configurable utility functions, and the highest-scoring action runs at each step. This is better suited to open-ended exploration tasks where there's no fixed goal state.
Embabel supports three execution modes that determine how the platform resolves which agent to run:
```xml
<dependency>
    <groupId>com.embabel.agent</groupId>
    <artifactId>embabel-agent-starter</artifactId>
    <version>0.3.0</version>
</dependency>
```
What to watch for: Embabel is at 0.3.0 with 0.4.0-SNAPSHOT actively in development. The API surface moves. The production PaaS deployment story isn't fully landed yet — it's designed to separate local execution from production deployment, but that gap requires manual bridging today. If you need a framework that's been stress-tested in production at scale, wait for 1.0.
JetBrains built Koog with a different architectural bet: agent behavior is modeled as explicit directed graphs, not LLM-driven loops or declarative planners. You define nodes (individual steps or LLM calls) and edges (transitions), and the framework executes the graph with full control over retry logic, branching, and state.
The primary API is a Kotlin DSL:
```kotlin
val researchAgent = AIAgent(
    promptExecutor = simpleOpenAIExecutor(apiKey),
    systemPrompt = """
        You are a research assistant. Use the available tools to answer questions accurately.
        Cite your sources. If you cannot find reliable information, say so.
    """.trimIndent(),
    llmModel = OpenAIModels.Chat.GPT4o,
    tools = listOf(WebSearchTool(), DocumentReaderTool())
)

val result = researchAgent.run("What are the security implications of MCP for enterprise deployments?")
```
Koog's checkpointing is the feature that genuinely differentiates it for production use. If your agent is in the middle of a multi-step graph and a downstream service returns a 503, Koog restores from the last checkpoint node rather than restarting from scratch. For long-running agentic workflows that call slow external APIs, this is the difference between "retry everything" and "resume from where we were."
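The value of checkpointing is easiest to see in a sketch. Plain Java with hypothetical names (not Koog's API): a step runner that persists state after each node and, on restart, skips every step that already has a checkpoint:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative: run named steps in order, checkpointing the state after
// each one; on a re-run, steps with an existing checkpoint are skipped
// and their saved result is reused.
public class CheckpointRunner {
    public static String run(List<Map.Entry<String, Function<String, String>>> steps,
                             String input,
                             Map<String, String> checkpoints) { // persisted store
        String state = input;
        for (Map.Entry<String, Function<String, String>> step : steps) {
            String saved = checkpoints.get(step.getKey());
            if (saved != null) {
                state = saved;                         // resume from checkpoint
                continue;
            }
            state = step.getValue().apply(state);      // may throw (e.g. a 503)
            checkpoints.put(step.getKey(), state);
        }
        return state;
    }
}
```

If step two fails, the checkpoint for step one survives, so the retry re-executes only step two onward — "resume from where we were" instead of "retry everything."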
For Spring Boot projects:
```xml
<dependency>
    <groupId>ai.koog</groupId>
    <artifactId>koog-spring-boot-starter-jvm</artifactId>
    <version>0.8.0</version>
</dependency>
```
Two more features worth knowing: intelligent history compression and LLM switching. History compression automatically summarizes older turns when approaching context limits, preserving semantic content rather than simply truncating. LLM switching lets you change the model mid-conversation without losing history — useful when you want a cheap model for tool dispatch and an expensive one for final synthesis.
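History compression is conceptually simple. A framework-free sketch (hypothetical names, not Koog's implementation) that replaces the oldest turns with a summary once the transcript exceeds a budget, keeping recent turns verbatim:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Illustrative: when the transcript exceeds charBudget, replace all but
// the last keepRecent turns with a single summary line produced by the
// supplied summarizer (in a real system, an LLM call).
public class HistoryCompressor {
    public static List<String> compress(List<String> turns, int charBudget,
                                        int keepRecent,
                                        Function<List<String>, String> summarizer) {
        int total = turns.stream().mapToInt(String::length).sum();
        if (total <= charBudget || turns.size() <= keepRecent) {
            return turns;  // under budget: nothing to compress
        }
        List<String> old = turns.subList(0, turns.size() - keepRecent);
        List<String> recent = turns.subList(turns.size() - keepRecent, turns.size());
        List<String> compressed = new ArrayList<>();
        compressed.add("[summary] " + summarizer.apply(old));
        compressed.addAll(recent);
        return compressed;
    }
}
```

The point of summarizing rather than truncating is that the summary preserves semantic content the model may still need, at a fraction of the token cost.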
Koog also targets Kotlin Multiplatform: the same agent code runs on JVM, Android, iOS, and browser (WasmJS). If your agent logic needs to run on mobile or at the edge, no other framework on this list can match that.
What to watch for: The primary API is Kotlin DSL. The Java API was added in 0.7.0 and works, but community resources, documentation examples, and Stack Overflow answers are predominantly Kotlin. If your team isn't writing Kotlin, expect translation friction. Koog is still in Beta and minor versions have had breaking changes.
Google's Agent Development Kit for Java takes a code-first approach with LlmAgent as the primary building block:
```java
LlmAgent researchAgent = LlmAgent.builder()
        .name("research_agent")
        .description("Researches topics using Google Search and synthesizes findings")
        .model("gemini-2.0-flash")
        .instruction("""
                You are a precise research assistant.
                Use the search tool to find current information.
                Cite sources in every response. Acknowledge uncertainty where it exists.
                """)
        .tools(new GoogleSearchTool())
        .build();
```
ADK's architectural focus is the A2A (Agent-to-Agent) protocol. Rather than proprietary agent orchestration, A2A lets agents discover and invoke each other over a standardized interface — effectively treating agents as services. LangChain4j and Embabel also support A2A, but Google treats it as the primary multi-agent mechanism in ADK.
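"Agents as services" is the core idea. A framework-free sketch of discovery plus invocation through a shared registry — the names and the in-process registry are invented for illustration; A2A itself works over HTTP with standardized agent descriptions:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Illustrative: agents register a capability description; callers discover
// by capability and invoke over a uniform interface, without knowing the
// implementation behind it.
public class AgentRegistry {
    public record AgentCard(String name, String capability,
                            Function<String, String> handler) {}

    private final Map<String, AgentCard> agents = new LinkedHashMap<>();

    public void register(AgentCard card) {
        agents.put(card.name(), card);
    }

    public Optional<String> invoke(String capability, String task) {
        return agents.values().stream()
                .filter(a -> a.capability().equals(capability))
                .findFirst()
                .map(a -> a.handler().apply(task));
    }
}
```

Because the calling side depends only on a capability, an agent built with ADK can, in principle, delegate to one built with LangChain4j or Embabel — that interoperability is the point of a shared protocol.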
ADK 1.2.0 ships a built-in development UI (the same one from Python ADK) for tracing tool calls, inspecting intermediate states, and replaying agent runs. This genuinely speeds up debugging during development.
Important: Google ADK for Java is still Pre-GA. Google applies its "Pre-GA Offerings Terms" — no SLA, no guarantee against breaking changes, limited support. Version 1.2.0 is capable, but don't build a critical production path on it unless you have a specific Gemini or Google Cloud requirement that outweighs the stability risk.
📚 Related: Building AI Agents with Google Agent Development Kit (ADK) and Java
Microsoft's Semantic Kernel Java implementation now lives in its own repository (microsoft/semantic-kernel-java) with a v1.0+ stability commitment — no breaking changes within 1.x. It excels at prompt chaining and integrates tightly with Azure OpenAI Service and the broader Azure AI ecosystem.
The Java port is faithful but reflects its C# origins. If you compare the API to LangChain4j, Semantic Kernel feels heavier and occasionally awkward from a Java idiom perspective. For Azure-first teams already invested in the Microsoft AI stack, that tradeoff is acceptable. For teams without that constraint, LangChain4j provides broader provider coverage with a more natural Java developer experience.
Pick based on what you're building, not on GitHub stars.
The Python ecosystem advantage is gone. The question now is which Java framework fits your architecture.