Top 10 Microservice Questions Asked in Interviews

    09/01/2026

    I've sat through hundreds of microservice interviews, both as a candidate and an interviewer. The questions that separate senior engineers from juniors aren't about definitions; they're about production experience and how well you can reason about distributed systems. Can you explain distributed transactions? Sure. But have you debugged a compensating transaction that failed halfway through?

    Here are the top 10 microservice interview questions that actually matter. These aren't theory questions—they're designed to reveal your experience with distributed systems that handle real traffic, real failures, and real complexity.

    1. How do you handle distributed transactions and data consistency across microservices?

    This is where most candidates stumble. They know ACID transactions don't work across services, but they can't articulate the practical alternatives.

    What They're Really Asking:

    Can you manage data consistency without a global transaction? In a monolith, you'd wrap everything in a single database transaction. In microservices, that approach fails because each service owns its own database. You need different strategies.

    The Answer:

    We can use the Saga pattern for long-running transactions that span multiple services. There are two approaches: orchestration and choreography.

    Orchestration Pattern: A central orchestrator coordinates the workflow. It's easier to understand and debug, but creates a single point of coordination. We can use this when we need explicit control over the transaction flow.

    Java code example:

    @Service
    public class OrderOrchestrator {

        @Autowired
        private InventoryService inventoryService;
        @Autowired
        private PaymentService paymentService;
        @Autowired
        private OrderService orderService;
        @Autowired
        private NotificationService notificationService;

        public void createOrder(OrderRequest request) {
            try {
                // Step 1: Reserve inventory
                InventoryResponse inventory = inventoryService.reserve(request.getItems());
                // Step 2: Process payment
                PaymentResponse payment = paymentService.process(request.getPayment());
                // Step 3: Create order
                Order order = orderService.create(request);
                // Step 4: Send confirmation
                notificationService.sendConfirmation(order);
            } catch (Exception e) {
                // Compensation logic: undo the steps that already completed, in reverse order
                compensate(request, e);
            }
        }
    }

    Choreography Pattern: Each service publishes events and reacts to events from others. No central coordinator. Better for scalability, but harder to debug. We can use this when services are loosely coupled and we want to avoid a coordination bottleneck.
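    A minimal sketch of the choreography side, assuming Spring Kafka and a hypothetical OrderCreatedEvent payload:

    @Service
    public class InventoryEventHandler {

        @Autowired
        private InventoryService inventoryService;

        // Reacts to events published by the order service; there is no central coordinator
        @KafkaListener(topics = "order-created", groupId = "inventory-service")
        public void onOrderCreated(OrderCreatedEvent event) {
            inventoryService.reserve(event.getItems());
            // On success, this service would publish its own "inventory-reserved" event
        }
    }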

    [Diagram: Orchestration pattern - a central orchestrator calls the Order, Inventory, Payment, Shipping, and Notification services in sequence (reserve, process, ship, notify). Choreography pattern - each service reacts independently to events such as OrderCreated on an event bus (Kafka), with no central coordinator.]

    The tradeoff is eventual consistency. You can't guarantee immediate consistency across services. That's okay for most business scenarios—users can tolerate a few seconds of delay for order confirmation.

    Production Gotchas:

    I've been bitten by compensation failures before. If your compensation logic itself fails, you're in a partial state. Always make compensation operations idempotent. Also, ensure your saga steps are idempotent—network retries can cause duplicate executions.
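    A minimal sketch of such an idempotency guard, assuming a hypothetical CompensationLogRepository that records which saga steps have already been compensated:

    @Service
    public class InventoryCompensation {

        @Autowired
        private InventoryService inventoryService;
        @Autowired
        private CompensationLogRepository compensationLog; // hypothetical log of applied compensations

        @Transactional
        public void releaseReservation(String sagaId, List<OrderItem> items) {
            // Idempotency guard: a retried compensation must not release the same inventory twice
            if (compensationLog.existsBySagaIdAndStep(sagaId, "RELEASE_INVENTORY")) {
                return;
            }
            inventoryService.release(items);
            compensationLog.save(new CompensationRecord(sagaId, "RELEASE_INVENTORY"));
        }
    }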

    2. Explain your approach to inter-service communication. When would you use synchronous vs asynchronous patterns?

    This question reveals whether you understand the fundamental tradeoffs of distributed systems. Every communication pattern has consequences.

    What They're Really Asking:

    Do you know when to use REST vs message queues? Can you explain why one approach might cascade failures while the other doesn't?

    The Answer:

    We can choose synchronous communication (REST, gRPC) when we need immediate responses and can tolerate tight coupling, and asynchronous communication (Kafka, RabbitMQ) when we need resilience and can accept eventual consistency.

    Synchronous Communication (REST/gRPC):

    We can use REST for user-facing operations where response time matters. When a user clicks "Place Order," they need to know immediately if it succeeded. But synchronous calls create dependencies—if the payment service is down, orders can't be placed.

    @RestController
    public class OrderController {

        @Autowired
        private PaymentServiceClient paymentClient;
        @Autowired
        private OrderService orderService;

        @PostMapping("/orders")
        public ResponseEntity<Order> createOrder(@RequestBody OrderRequest request) {
            // Synchronous call - blocks until the payment service responds
            PaymentResponse payment = paymentClient.processPayment(request.getPayment());
            if (payment.isSuccessful()) {
                return ResponseEntity.ok(orderService.create(request));
            }
            throw new PaymentException("Payment failed");
        }
    }

    The catch? Cascading failures. If the payment service is slow, your order service becomes slow too. That's why I always add timeouts and circuit breakers to synchronous calls.
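    A minimal sketch of adding explicit timeouts to such a call, assuming Spring's RestClient with a SimpleClientHttpRequestFactory:

    @Configuration
    public class PaymentClientConfig {

        @Bean
        public RestClient paymentRestClient() {
            SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
            factory.setConnectTimeout(2000); // fail fast if a connection can't be established
            factory.setReadTimeout(3000);    // don't let a slow payment service tie up order threads
            return RestClient.builder()
                    .baseUrl("http://payment-service")
                    .requestFactory(factory)
                    .build();
        }
    }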

    Asynchronous Communication (Message Brokers):

    We can use message queues when we want loose coupling and better fault tolerance. The order service publishes an "OrderCreated" event. Payment and inventory services consume it independently. If payment service is down, the order still gets created—payment happens eventually.

    @Service
    public class OrderService {

        @Autowired
        private OrderRepository orderRepository;
        @Autowired
        private KafkaTemplate<String, OrderEvent> kafkaTemplate;

        public Order createOrder(OrderRequest request) {
            Order order = orderRepository.save(new Order(request));
            // Fire and forget - doesn't wait for consumers
            kafkaTemplate.send("order-created", new OrderCreatedEvent(order));
            return order;
        }
    }

    The tradeoff is eventual consistency. Users might see their order before payment is confirmed. That's fine for most scenarios, but you need clear UX patterns.

    [Diagram: Synchronous integration (REST, GraphQL, gRPC) - the client waits, blocking, for a response routed through the API gateway to the order, payment, and inventory services; request-response, tightly coupled, real-time but prone to cascading failures. Asynchronous integration (Kafka, RabbitMQ, ActiveMQ) - the producer publishes to a message broker and continues without waiting; fire-and-forget, non-blocking, loosely coupled, better resilience but eventual consistency.]

    I've seen teams use synchronous calls everywhere "because it's simpler." Then production hits, and a single slow service brings down the entire system. The fix? Move non-critical paths to async.

    📚 Related: Learn more about resilience patterns in Top Resilience Patterns in Microservices.

    3. How do you implement service discovery in a dynamic microservices environment? How do you handle service unavailability?

    Service discovery is one of those infrastructure concerns that seems simple until you're debugging why service A can't find service B at 3 AM.

    What They're Really Asking:

    Do you understand how services find each other when IPs change constantly? Can you handle the edge cases?

    The Answer:

    We can use client-side service discovery with a service registry like Consul or Eureka. Services register themselves on startup and deregister on shutdown. Clients query the registry to find available instances.

    Service Registry Pattern:

    Each service registers its network location (IP + port) when it starts. The registry maintains a list of healthy services. Clients query the registry periodically to get fresh service instances.
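    A minimal sketch of the client side, assuming Spring Cloud's DiscoveryClient backed by Eureka or Consul:

    @Service
    public class PaymentInstanceResolver {

        @Autowired
        private DiscoveryClient discoveryClient;

        public URI resolvePaymentService() {
            // Ask the registry for the currently healthy instances of the payment service
            List<ServiceInstance> instances = discoveryClient.getInstances("payment-service");
            if (instances.isEmpty()) {
                throw new IllegalStateException("No payment-service instances registered");
            }
            // Naive selection; in practice a client-side load balancer picks and retries across instances
            return instances.get(0).getUri();
        }
    }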

    Handling Service Unavailability:

    The registry marks services as unhealthy when health checks fail. But clients might have cached stale instances. I always implement retry logic with exponential backoff and failover to different instances.

    For production, we can use client-side load balancing (Ribbon or Spring Cloud LoadBalancer) that automatically handles instance selection and retries. We should also configure timeouts shorter than the registry's health check interval—if a service becomes unhealthy, clients stop calling it quickly.

    Production Gotchas:

    DNS caching is the silent killer. Java's default DNS TTL is infinite, meaning clients cache IP addresses forever. If a service instance dies and a new one takes its IP, clients might keep hitting the old IP. I always set networkaddress.cache.ttl to 30 seconds in production.
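    A minimal sketch of setting that TTL at application startup (it can also be set in the JVM's java.security file):

    import java.security.Security;

    public class DnsCacheConfig {

        public static void init() {
            // Cache successful DNS lookups for 30 seconds instead of holding them indefinitely
            Security.setProperty("networkaddress.cache.ttl", "30");
        }
    }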

    Also, watch out for the thundering herd problem. When a service comes back online, all clients try to connect simultaneously. Implement staggered retry with jitter to spread the load.
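    A minimal sketch of exponential backoff with jitter in plain Java (libraries such as Resilience4j provide this out of the box):

    import java.util.concurrent.ThreadLocalRandom;
    import java.util.function.Supplier;

    public class RetryWithJitter {

        public <T> T call(Supplier<T> operation, int maxAttempts) throws InterruptedException {
            long baseDelayMs = 100;
            for (int attempt = 1; ; attempt++) {
                try {
                    return operation.get();
                } catch (RuntimeException e) {
                    if (attempt >= maxAttempts) {
                        throw e;
                    }
                    // Exponential backoff capped at 5 seconds...
                    long backoff = Math.min(baseDelayMs * (1L << attempt), 5000);
                    // ...plus random jitter so a recovering service isn't hit by every client at once
                    long jitter = ThreadLocalRandom.current().nextLong(backoff / 2 + 1);
                    Thread.sleep(backoff / 2 + jitter);
                }
            }
        }
    }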

    In Kubernetes environments, we can use built-in service discovery. Kubernetes DNS automatically resolves service names to IPs, and endpoints are updated dynamically. Simpler than running a separate registry.

    [Diagram: A service registry (Consul/Eureka) health-checking three registered instances - Payment at 10.0.1.1:8080, Inventory at 10.0.1.2:8080, and Order at 10.0.1.3:8080 (unhealthy). The client queries the registry and receives only the healthy instances; the unhealthy one is filtered out.]

    📚 Related: Explore more system design patterns in Top 15 System Design Patterns.

    4. Describe your strategy for monitoring and observability in a distributed system. How do you trace a request across multiple services?

    Observability is what keeps you sane in production. Without it, debugging distributed systems is like searching for a needle in a haystack blindfolded.

    What They're Really Asking:

    Can you debug a request that spans 10 services? Do you understand correlation IDs and distributed tracing?

    The Answer:

    We can implement the three pillars of observability: metrics, logs, and traces. Each serves a different purpose, and you need all three.

    Metrics (What's happening now):

    We can use Prometheus for metrics collection. Key metrics to track: request rate, latency percentiles (p50, p95, p99), error rate, and business metrics like orders per second. Dashboards in Grafana provide real-time visibility.

    The trick is choosing the right metrics. Too many metrics create noise. Too few and you miss critical issues. We can focus on SLOs—if we promise 200ms p95 latency, that's what we monitor.
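    A minimal sketch of recording such metrics with Micrometer, which Prometheus then scrapes, assuming a Spring Boot setup:

    @Service
    public class OrderMetrics {

        private final Counter ordersCreated;
        private final Timer checkoutLatency;

        public OrderMetrics(MeterRegistry registry) {
            // Business metric: orders per second is derived from this counter's rate
            this.ordersCreated = registry.counter("orders.created");
            // Latency distribution; p50/p95/p99 come from the recorded samples
            this.checkoutLatency = Timer.builder("checkout.latency")
                    .publishPercentiles(0.5, 0.95, 0.99)
                    .register(registry);
        }

        public void recordCheckout(Runnable checkout) {
            checkoutLatency.record(checkout);
            ordersCreated.increment();
        }
    }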

    Logs (What happened):

    Centralized logging with correlation IDs. Every request gets a unique correlation ID that propagates through all service calls. When something breaks, we can grep for that ID and see the entire request flow.

    We can use structured logging (JSON format) to query logs effectively. Tools like ELK stack or Loki make searching millions of log lines manageable.
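    A minimal sketch of propagating a correlation ID into the logging context, assuming a servlet filter and SLF4J's MDC:

    @Component
    public class CorrelationIdFilter extends OncePerRequestFilter {

        private static final String HEADER = "X-Correlation-Id";

        @Override
        protected void doFilterInternal(HttpServletRequest request,
                                        HttpServletResponse response,
                                        FilterChain chain) throws ServletException, IOException {
            // Reuse the ID set by an upstream service, otherwise create one
            String correlationId = request.getHeader(HEADER);
            if (correlationId == null || correlationId.isBlank()) {
                correlationId = UUID.randomUUID().toString();
            }
            MDC.put("correlationId", correlationId); // every log line in this request now carries the ID
            try {
                response.setHeader(HEADER, correlationId);
                chain.doFilter(request, response);
            } finally {
                MDC.remove("correlationId");
            }
        }
    }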

    Distributed Tracing (Why it's slow):

    Tracing shows the exact path a request takes through services and where time is spent. We can use OpenTelemetry with Jaeger or Zipkin.

    The key is trace context propagation. Each service adds itself to the trace and forwards the trace context to downstream services. At the end, we get a complete timeline showing that 80% of the request time was spent in the payment service database query.
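    A minimal sketch of creating a span with the OpenTelemetry API (in practice, agent or Spring instrumentation creates most spans and propagates context automatically):

    @Service
    public class TracedPaymentClient {

        private final Tracer tracer = GlobalOpenTelemetry.getTracer("order-service");

        @Autowired
        private PaymentServiceClient paymentClient;

        public PaymentResponse processPayment(PaymentRequest request) {
            Span span = tracer.spanBuilder("payment.process").startSpan();
            try (Scope scope = span.makeCurrent()) {
                // Instrumented HTTP clients pick up the current span and forward the
                // trace context (traceparent header) to the payment service
                return paymentClient.processPayment(request);
            } catch (Exception e) {
                span.recordException(e);
                throw e;
            } finally {
                span.end();
            }
        }
    }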

    Production Gotchas:

    Sampling is critical. You can't trace every request—it's too expensive. We can sample 1% of requests normally, but 100% during incidents. Also, trace context propagation can break if you use async processing. Make sure to pass trace context through message headers.
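    A minimal sketch of ratio-based sampling with the OpenTelemetry SDK; the ratio is what you might raise during an incident:

    import io.opentelemetry.sdk.trace.SdkTracerProvider;
    import io.opentelemetry.sdk.trace.samplers.Sampler;

    public class TracingConfig {

        public static SdkTracerProvider sampledTracerProvider() {
            return SdkTracerProvider.builder()
                    // Keep roughly 1% of traces, while respecting the parent span's sampling decision
                    .setSampler(Sampler.parentBased(Sampler.traceIdRatioBased(0.01)))
                    .build();
        }
    }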

    [Diagram: A distributed trace with Trace-ID abc123 propagated from the client through the API gateway (50ms), order service (120ms), payment service (80ms), and inventory service (200ms - the bottleneck, with a 180ms database query). All spans share the same trace ID; total request time 450ms.]

    Alert fatigue is real. I only alert on metrics that require immediate action. Everything else goes to dashboards. If you get 50 alerts per hour, you'll start ignoring them all.

    5. What is the Circuit Breaker pattern, and can you describe a production scenario where you implemented it?

    This question tests whether you understand failure modes in distributed systems. Circuit breakers prevent cascading failures—they're essential in production.

    What They're Really Asking:

    Have you experienced cascading failures? Do you understand when to fail fast vs retry?

    The Answer:

    A circuit breaker is like an electrical circuit breaker—it stops current flow when there's too much load. In software, it stops calling a failing service to prevent cascading failures.

    The circuit has three states: Closed (normal operation), Open (failing, reject requests immediately), and Half-Open (testing if service recovered).

    [Diagram: Circuit breaker states. CLOSED - requests flow through to the payment service and failures are counted. OPEN - the failure threshold has been breached, requests are rejected immediately and the fallback runs. HALF-OPEN - a single test request is allowed; success closes the circuit, failure reopens it.]

    How It Works:

    We can configure a failure threshold—say, 50% failures in the last 100 requests. When threshold is breached, the circuit opens. All requests fail fast without calling the service. After a timeout period, the circuit moves to half-open and allows one test request. If it succeeds, the circuit closes. If it fails, it opens again.
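    A minimal sketch of that configuration with Resilience4j (the same values can also be set in application.yml):

    import io.github.resilience4j.circuitbreaker.CircuitBreaker;
    import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
    import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;
    import java.time.Duration;

    public class PaymentCircuitBreakerFactory {

        public static CircuitBreaker paymentCircuitBreaker() {
            CircuitBreakerConfig config = CircuitBreakerConfig.custom()
                    .slidingWindowSize(100)                          // evaluate the last 100 calls
                    .failureRateThreshold(50)                        // open at 50% failures
                    .waitDurationInOpenState(Duration.ofSeconds(30)) // stay open before probing
                    .permittedNumberOfCallsInHalfOpenState(1)        // one test call in half-open
                    .build();
            return CircuitBreakerRegistry.of(config).circuitBreaker("payment-service");
        }
    }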

    Sample Code:

    @Service
    public class PaymentServiceClient {

        @Autowired
        private RestClient restClient;

        @CircuitBreaker(name = "payment-service", fallbackMethod = "processPaymentFallback")
        @Retry(name = "payment-service")
        public PaymentResponse processPayment(PaymentRequest request) {
            return restClient.post()
                    .uri("http://payment-service/process")
                    .contentType(MediaType.APPLICATION_JSON)
                    .body(request)
                    .retrieve()
                    .body(PaymentResponse.class);
        }

        public PaymentResponse processPaymentFallback(PaymentRequest request, Exception e) {
            // Fallback: queue payment for later processing
            return new PaymentResponse()
                    .status("QUEUED")
                    .message("Payment queued due to service unavailability");
        }
    }

    Production Gotchas:

    Don't set thresholds too aggressively. If you open the circuit after 5% failures, you'll open it during normal blips. Also, half-open state is tricky—if multiple requests hit during half-open, you might close the circuit prematurely. Most libraries handle this, but it's worth understanding.

    I've also seen teams forget fallback logic. Opening the circuit is good, but if you have no fallback, you're just failing in a different way. Always design fallback behavior.

    📚 Related: Learn more about implementing circuit breakers and other resilience patterns in Top Resilience Patterns in Microservices.

    6. How do you handle API versioning and backward compatibility in microservices? What's your deployment strategy?

    This question separates engineers who've shipped breaking changes in production from those who haven't. Versioning seems straightforward until you realize you've got 50 services depending on service A's API, and service A needs to change.

    What They're Really Asking:

    Have you broken production by deploying an incompatible API change? Do you understand the different versioning strategies and when each makes sense?

    The Answer:

    Versioning refers to the practice of managing changes to APIs so that existing clients continue to work while new features or breaking changes are introduced. It's essential for evolving microservices without disrupting running systems, especially when multiple teams or clients depend on your APIs. In practice, it usually means running multiple versions of the same API side by side.

    There are several techniques for API versioning:

    1. URI Path Versioning:
    Include the version number in the API path (e.g., /api/v1/orders). This makes it clear which version is being used and allows multiple versions to coexist.

    2. Header Versioning:
    Clients specify the desired API version in a custom HTTP header, such as Accept-Version: v2. This keeps URLs cleaner and can make versioning less visible to end-users.

    3. Content Negotiation:
    The API version is embedded in the Accept or Content-Type headers using MIME types (e.g., application/vnd.myservice.v2+json). This decouples versioning from URLs but can be harder to discover and debug.

    4. Query Parameter Versioning:
    Clients provide a version as a query parameter (e.g., /orders?version=2). While easy to implement, this approach is less common for APIs serving external clients.

    Each technique has trade-offs in terms of discoverability, cacheability, and tooling support. The choice largely depends on team preferences, consumer needs, and the nature of the API. Whichever technique is chosen, the key is to ensure backward compatibility so that existing clients are not broken by new releases.
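    A minimal sketch of URI path versioning in Spring, assuming a hypothetical OrderQueryService and separate v1/v2 response types; v2 adds fields while v1 keeps serving its original contract:

    // OrderControllerV1.java
    @RestController
    @RequestMapping("/api/v1/orders")
    public class OrderControllerV1 {

        @Autowired
        private OrderQueryService orderQueryService;

        @GetMapping("/{id}")
        public OrderV1Response getOrder(@PathVariable String id) {
            // Original contract: served unchanged while clients migrate at their own pace
            return orderQueryService.findV1(id);
        }
    }

    // OrderControllerV2.java
    @RestController
    @RequestMapping("/api/v2/orders")
    public class OrderControllerV2 {

        @Autowired
        private OrderQueryService orderQueryService;

        @GetMapping("/{id}")
        public OrderV2Response getOrder(@PathVariable String id) {
            // v2 adds new fields; existing v1 fields are deprecated, never removed
            return orderQueryService.findV2(id);
        }
    }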

    Backward Compatibility:

    The golden rule: add, don't remove. When introducing v2, keep v1 running. Never remove fields—deprecate them instead. Clients migrate at their own pace.

    Deployment Strategy:

    Deploy v2 alongside v1, not replacing it. Run both versions simultaneously during migration. For internal services, use canary deployments. External APIs require explicit versioning since we don't control client upgrade schedules.

    Production Gotchas:

    Always version breaking changes—never change response structures in place. Maintain at most two versions (current and previous). Watch out for transitive dependencies—if service A calls service B (v1) and service B upgrades to v2, service A might break.

    📚 Related: Learn more about implementing API versioning in Spring Boot in A Developer's Guide to API Versioning and Implementation in Spring Boot.

    7. How do you ensure security in a microservices architecture, especially for service-to-service communication?

    Security in microservices is harder than in monoliths because you've got network boundaries everywhere. A vulnerability in one service can expose the entire system if service-to-service communication isn't secured properly.

    What They're Really Asking:

    Do you understand the security implications of distributed systems? Have you implemented mTLS, service meshes, or other production-grade security patterns?

    The Answer:

    We can use mutual TLS (mTLS) for service-to-service communication. Each service presents a certificate, and both services verify each other's identity. This prevents man-in-the-middle attacks and ensures only authorized services can communicate.

    The challenge is certificate management. In Kubernetes, we can use cert-manager to automatically issue and rotate certificates from Let's Encrypt or an internal CA. Without automation, certificate rotation becomes a nightmare. I've seen teams forget to rotate certificates and wake up to production outages when certificates expired.

    For user-facing APIs, we can use OAuth 2.0 with JWT tokens. The API Gateway validates tokens and forwards them to services. Services trust the gateway and extract user context from the token. This pattern centralizes authentication while keeping services stateless.
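    A minimal sketch of that validation with Spring Security's resource server support, assuming a servlet-based gateway or edge service and an issuer URI configured elsewhere:

    @Configuration
    @EnableWebSecurity
    public class EdgeSecurityConfig {

        @Bean
        public SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
            http
                .authorizeHttpRequests(auth -> auth
                    .requestMatchers("/actuator/health").permitAll() // unauthenticated health checks
                    .anyRequest().authenticated())
                // Validate the JWT signature and claims; downstream services trust the forwarded user context
                .oauth2ResourceServer(oauth2 -> oauth2.jwt(Customizer.withDefaults()));
            return http.build();
        }
    }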

    API keys are fine for service-to-service calls in trusted environments, but they're static secrets. Anyone with the key has access. I prefer certificates because they can be rotated without code changes and provide better audit trails.

    A service mesh can also provide this security layer. Istio or Linkerd handle mTLS automatically, encrypting all service-to-service traffic without code changes. I've used Istio in production, and while the operational complexity is higher, the automatic security posture improvement is worth it for large deployments.

    📚 Related: Learn how to secure service-to-service calls in depth with JWT and Spring Security in JWT Authentication with Spring 6 Security: Complete Guide.

    8. What is eventual consistency, and how do you handle scenarios where your service needs strongly consistent data from another microservice?

    This question tests whether you understand the fundamental tradeoffs of distributed systems. Eventual consistency isn't a bug—it's a feature you design for. But sometimes you need stronger guarantees.

    What They're Really Asking:

    Do you understand CAP theorem in practice? Can you explain when eventual consistency is acceptable and when it isn't? Have you built systems that need strong consistency across service boundaries?

    The Answer:

    Eventual consistency means that if I stop making updates, all services will eventually see the same data. There's no guarantee of immediate consistency. If service A updates a customer's balance, service B might not see that change for seconds or even minutes, depending on how you propagate updates.

    Most business scenarios can tolerate eventual consistency. If I place an order, it's fine if the inventory service updates a few seconds later. Users don't need to see real-time inventory counts. If payment processing takes a moment, that's acceptable. Eventual consistency gives you availability and partition tolerance, which are usually more important than immediate consistency.

    But sometimes you need strong consistency. If I'm checking if a customer has sufficient balance before processing a payment, eventual consistency isn't good enough. I need to know the current balance right now, not eventually.

    The solution depends on where the data lives. If both services need strongly consistent data, we can use one of these patterns:

    Pattern 1: Move the data into one service. If order service needs customer balance, maybe the order service should own customer balances. But this breaks service boundaries and creates tight coupling. We should avoid this unless the data naturally belongs together.

    Pattern 2: Use distributed locks or transactions. We can use a distributed lock to ensure only one service updates customer balance at a time. This gives strong consistency but creates contention and potential deadlocks. The performance impact is significant, so we should only use this when absolutely necessary.

    Pattern 3: Use a transactional outbox. When the payment service updates a balance, it writes the update to its database and also writes an event to an outbox table in the same transaction. A separate process reads the outbox and publishes events. The outbox ensures the event is published exactly once, giving eventual consistency with strong delivery guarantees.

    Pattern 4: Query the source directly. If order service needs customer balance, it can call customer service synchronously and trust the response. This gives strong consistency for reads, but if customer service is down, order service can't function. This is the tradeoff—you get consistency but lose availability.
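    A minimal sketch of the transactional outbox described in Pattern 3, assuming JPA and a hypothetical outbox_events table; a separate relay (for example Debezium or a scheduled poller) publishes the rows to the broker:

    @Service
    public class BalanceService {

        @Autowired
        private AccountRepository accountRepository;
        @Autowired
        private OutboxEventRepository outboxRepository; // hypothetical JPA repository for outbox_events

        @Transactional
        public void debit(String accountId, BigDecimal amount) {
            Account account = accountRepository.findById(accountId).orElseThrow();
            account.debit(amount);

            // Same database transaction: either the balance change and the event are both
            // committed, or neither is. A separate relay publishes the event afterwards.
            outboxRepository.save(new OutboxEvent(
                    "BalanceDebited",
                    accountId,
                    "{\"accountId\":\"" + accountId + "\",\"amount\":\"" + amount + "\"}"));
        }
    }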

    In practice, we should design for eventual consistency by default. Most scenarios don't need strong consistency, and the performance and availability benefits are huge. When we do need strong consistency, we can use synchronous calls with appropriate fallbacks. If customer service is down, we can queue the order and process it when customer service recovers.

    The production gotcha is user experience. If your UI shows data that's eventually consistent, users might see stale data and get confused. I always design UIs to handle this gracefully. Show "updating..." indicators, use optimistic updates, or explain that data might be slightly delayed. Don't pretend everything is real-time when it isn't.

    📚 Related: Learn about event sourcing patterns in Event Sourcing Explained and how to ensure data consistency between microservices using the Outbox Pattern in The Outbox Pattern: Ensuring Data Consistency in Microservices.

    9. What is the role of an API Gateway in a microservices architecture, and how have you used it in production?

    API Gateways are one of those patterns that everyone implements differently. Some teams use them for everything, others avoid them entirely. The right answer depends on context, which is what interviewers are really probing for.

    What They're Really Asking:

    Do you understand the single entry point pattern? Have you built systems with and without gateways? Can you explain the tradeoffs?

    The Answer:

    An API Gateway is a single entry point for all client requests. Instead of clients calling multiple services directly, they call the gateway, which routes requests to the appropriate services. The gateway handles cross-cutting concerns like authentication, rate limiting, routing, and response aggregation.

    We can use API Gateways for external-facing APIs. When external clients need to access our system, they shouldn't know about our internal service topology. The gateway provides a stable interface while we refactor services behind it. We can add, remove, or split services without breaking clients.

    The gateway also centralizes authentication and authorization. Instead of implementing OAuth in every service, I do it once in the gateway. Services receive requests with user context already validated. This reduces code duplication and ensures consistent security policies.

    Rate limiting is another common use case. I can throttle abusive clients at the gateway before they hit my services. This protects my infrastructure from traffic spikes or attacks. Without a gateway, I'd need to implement rate limiting in every service, which is inefficient and inconsistent.

    Response aggregation is useful when clients need data from multiple services. Instead of making multiple round trips, the gateway can call multiple services and combine the responses. This reduces latency and simplifies client code. The tradeoff is tight coupling—if one service is slow, the entire aggregated response is slow.
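    A minimal sketch of gateway routing with Spring Cloud Gateway, assuming service discovery backs the lb:// URIs:

    @Configuration
    public class GatewayRoutes {

        @Bean
        public RouteLocator routes(RouteLocatorBuilder builder) {
            return builder.routes()
                    // Clients only see /api/orders/**; the internal service topology stays hidden
                    .route("orders", r -> r.path("/api/orders/**")
                            .filters(f -> f.stripPrefix(1))
                            .uri("lb://order-service"))
                    .route("payments", r -> r.path("/api/payments/**")
                            .filters(f -> f.stripPrefix(1))
                            .uri("lb://payment-service"))
                    .build();
        }
    }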

    In production, I've used both API Gateway-as-a-Service offerings like AWS API Gateway and self-hosted solutions like Kong or Zuul. The managed services are simpler but less flexible. Self-hosted gives me more control but requires more operational overhead.

    The production gotcha is the gateway becoming a bottleneck. If all traffic flows through a single gateway, its failure brings down everything. I always run multiple gateway instances behind a load balancer. I also monitor gateway performance closely—if it becomes slow, everything becomes slow.

    [Diagram: Web, mobile, and third-party clients all call a single API Gateway, which handles authentication, authorization, rate limiting, request routing, and response aggregation before routing to the Order, Payment, Inventory, User, Shipping, Catalog, Reviews, Search, and Analytics services. Gateway benefits: single entry point for all clients, centralized authentication and authorization, rate limiting and throttling, request/response transformation.]

    Some teams avoid gateways entirely and let clients call services directly. This works if you have few clients or if clients are internal and can handle service discovery. But for external APIs with many clients, a gateway is essential. The complexity tradeoff is worth it for the operational benefits.

    10. Describe a time when you had to debug a production issue in a microservices environment. What was your approach?

    This is the most revealing question. Anyone can explain theory, but debugging production issues in distributed systems requires real experience. The answer reveals whether you've actually built and operated microservices at scale.

    What They're Really Asking:

    Have you been on-call at 2 AM debugging a distributed system? Can you systematically debug issues across multiple services? Do you understand the tools and techniques?

    The Answer:

    Here's a systematic framework for debugging production issues:

    1. Understand Scope (Metrics First):

    Check dashboards for error rates, latency percentiles, and affected services. Identify if the issue is isolated or widespread, and correlate with recent deployments or traffic spikes.

    2. Trace Request Flow (Correlation IDs & Tracing):

    Extract correlation IDs from failed requests and trace the path across services. Distributed tracing shows where requests fail or slow down, identifying bottlenecks.

    3. Analyze Logs:

    Pull logs using correlation IDs for detailed errors and stack traces. Search for error patterns, connection failures, timeouts, and compare logs across services.

    4. Check Infrastructure:

    Verify service registry and health checks—services might be marked healthy but failing. Check resource utilization (CPU, memory, connection pools) and network connectivity.

    5. Narrow Down & Fix:

    Review code paths for resource leaks, missing error handling, or logic errors. Deploy fix with monitoring enabled, verify metrics return to baseline, and document the incident.

    Common Strategies:

    Start broad, narrow down (metrics → logs → traces → code review). Compare before/after deployments. Check dependencies (downstream services, databases, message brokers). Review recent changes and correlate timing.

    Production Gotchas:

    Health checks that don't verify actual functionality can mask issues. Without correlation IDs and distributed tracing, debugging becomes guesswork. Good observability is essential—metrics show what's wrong, logs show why, traces show where.

    Conclusion

    These ten questions reveal production experience—understanding not just what these patterns are, but when to use them and what can go wrong.


    For a comprehensive collection of microservice interview questions with detailed answers and code examples, check out our Microservices Interview Questions page.
