Kafka Admin Client localhost:9092 Warning Fix

Symptom

During sj_api (Spring Boot) startup, the following warnings appear repeatedly:

WARN [kafka-admin-client-thread | smart-api-admin-0]
  Connection to node -1 (localhost/127.0.0.1:9092) could not be established.
  Node may not be available.

The application eventually starts successfully but takes ~60 seconds due to retry loops.

Root Cause

Two separate issues compound each other:

1. Kafka has two listeners — services were using the wrong one

services/kafka.yaml defines:

KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:29092,EXTERNAL_LISTENER://192.168.200.114:9092
KAFKA_LISTENERS:            PLAINTEXT://0.0.0.0:29092,EXTERNAL_LISTENER://0.0.0.0:9092
  • PLAINTEXT://kafka:29092 — internal Docker network (for container-to-container)
  • EXTERNAL_LISTENER://192.168.200.114:9092 — external host access (for outside Docker)

The .env had kafkaservice=kafka:9092, which connects to the external listener. When a client connects through a listener, Kafka replies with metadata advertising that listener's address, here 192.168.200.114:9092, and the client uses that address for all further broker connections. From inside a container, this address routes back through the host instead of staying on the Docker network, which caused connection confusion, including resolving to localhost.

Fix: Change .env to use the internal PLAINTEXT listener:

kafkaservice=kafka:29092
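
A guard against regressing to the external port could look like the sketch below. It assumes the .env file holds a kafkaservice=host:port line; the helper name kafkaservice_port and the throwaway sample file are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# kafkaservice_port is a hypothetical helper, not part of the project.
# It prints the port from the last kafkaservice=host:port line in a .env-style file.
kafkaservice_port() {
  sed -n 's/^kafkaservice=.*:\([0-9][0-9]*\)$/\1/p' "$1" | tail -n 1
}

# Demo on a throwaway file; on a real host, pass /opt/smartjournal/.env instead.
sample_env=$(mktemp)
printf 'mfa_enabled=false\nkafkaservice=kafka:29092\n' > "$sample_env"

port=$(kafkaservice_port "$sample_env")
if [ "$port" = "29092" ]; then
  echo "kafkaservice uses the internal listener port ($port)"
else
  echo "kafkaservice uses port '$port', expected the internal port 29092" >&2
fi
rm -f "$sample_env"
```

The expected port (29092) comes from the PLAINTEXT listener in services/kafka.yaml; another environment may use a different internal port.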

2. Spring Boot dev profile hardcodes localhost:9092 for the Kafka admin client

The application jar's application-dev.yml sets localhost:9092 as the default Kafka bootstrap server. The KAFKASERVICE env var overrides only the producer/consumer clients; the Spring Kafka admin client reads spring.kafka.bootstrap-servers, which was still falling back to the dev profile's localhost:9092.

Fix: Add SPRING_KAFKA_BOOTSTRAP_SERVERS to the api service environment in docker-compose-prod.yaml, pointing at the same value as KAFKASERVICE:

environment:
  - KAFKASERVICE=${kafkaservice}
  - SPRING_KAFKA_BOOTSTRAP_SERVERS=${kafkaservice}   # <-- add this

This overrides the dev profile default for the admin client at container startup.
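
To confirm that both variables reach the container with the same value, a check along these lines may help. It is a sketch only: check_kafka_env is a hypothetical helper that reads NAME=value lines on stdin, and the demo input is made up.

```shell
#!/bin/sh
# check_kafka_env is a hypothetical helper, not part of the project.
# It succeeds only if KAFKASERVICE is set and SPRING_KAFKA_BOOTSTRAP_SERVERS has the same value.
check_kafka_env() {
  env_dump=$(cat)
  ks=$(printf '%s\n' "$env_dump" | sed -n 's/^KAFKASERVICE=//p')
  sb=$(printf '%s\n' "$env_dump" | sed -n 's/^SPRING_KAFKA_BOOTSTRAP_SERVERS=//p')
  [ -n "$ks" ] && [ "$ks" = "$sb" ]
}

# Demo with made-up input in the same NAME=value form a container env listing uses.
if printf 'KAFKASERVICE=kafka:29092\nSPRING_KAFKA_BOOTSTRAP_SERVERS=kafka:29092\n' | check_kafka_env; then
  echo "admin client and producer/consumer clients share one bootstrap address"
else
  echo "SPRING_KAFKA_BOOTSTRAP_SERVERS is missing or differs from KAFKASERVICE" >&2
fi
```

Against the running container, the input could come from docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' sj_api, assuming the container is named sj_api as in the verification step.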

Files Changed

| File | Change |
| --- | --- |
| /opt/smartjournal/.env | kafkaservice=kafka:9092 → kafkaservice=kafka:29092 |
| /opt/smartjournal/docker-compose-prod.yaml | Added SPRING_KAFKA_BOOTSTRAP_SERVERS=${kafkaservice} to api service environment |

Result

  • No more localhost:9092 warnings
  • Startup time: ~60 seconds → ~20 seconds

Applying to Another Environment

  1. Check Kafka listeners — ensure the internal listener (PLAINTEXT) is on a different port from the external listener and that kafkaservice in .env points to the internal one:

    kafkaservice=kafka:<internal-port>
    
  2. Add the Spring override to the api service in docker-compose-prod.yaml:

    - SPRING_KAFKA_BOOTSTRAP_SERVERS=${kafkaservice}
    
  3. Recreate the api container (a restart is not sufficient, because environment variable changes only take effect when the container is recreated):

    docker compose -f docker-compose-prod.yaml up -d --force-recreate api
    
  4. Verify — startup should complete in ~20 seconds with no localhost warnings:

    docker logs sj_api 2>&1 | grep -E 'localhost.*9092|Started UiApplication'
    

    Expected: only the Started UiApplication in XX seconds line, no localhost warnings.
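
The verification in step 4 can be wrapped in a small pass/fail check. This is a minimal sketch: check_api_log is a hypothetical helper that reads log text on stdin, and the demo line is made up rather than real sj_api output.

```shell
#!/bin/sh
# check_api_log is a hypothetical helper, not part of the project.
# Return codes: 0 = started cleanly, 1 = localhost:9092 warnings present, 2 = not started yet.
check_api_log() {
  log=$(cat)
  if printf '%s\n' "$log" | grep -Eq 'localhost.*9092'; then
    echo "FAIL: localhost:9092 warnings still present"
    return 1
  fi
  if ! printf '%s\n' "$log" | grep -q 'Started UiApplication'; then
    echo "PENDING: application has not finished starting"
    return 2
  fi
  echo "OK: started with no localhost:9092 warnings"
}

# Demo on a made-up log line; real input would be the sj_api container log.
printf 'Started UiApplication in XX seconds\n' | check_api_log
```

On a real host, the same function can be fed with docker logs sj_api 2>&1 | check_api_log.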

Other Issues Fixed

  • mfa_enabled=fasle typo in .env: caused an Invalid boolean value startup crash. Fixed by correcting to mfa_enabled=false.

  • Duplicate env vars with hyphens vs underscores in docker-compose-prod.yaml:

    - SAML-MAPPER-GRAPH-PROXY-PORT=${saml-mapper-graph-proxy-port}  # broken (hyphen)
    - SAML-MAPPER-GRAPH-PROXY-PORT=${saml_mapper_graph_proxy_port}  # correct (underscore)
    

    Docker Compose's variable interpolation follows shell parameter-expansion syntax, so ${saml-mapper-graph-proxy-port} is read as the variable saml with the default value mapper-graph-proxy-port. With saml unset, the port env var receives that string instead of an integer, crashing Spring Boot. Fixed by removing the hyphenated duplicate lines.
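
The expansion rule can be reproduced in a plain POSIX shell, which uses the same ${name-default} syntax. In this sketch the value 8443 is a made-up placeholder for the real port setting.

```shell
#!/bin/sh
# Reproduce the hyphen-vs-underscore expansion in plain sh.
unset saml
saml_mapper_graph_proxy_port=8443   # hypothetical value, for illustration only

# Parsed as ${saml-<default>}: saml is unset, so the default text is used.
hyphenated="${saml-mapper-graph-proxy-port}"
# An ordinary variable lookup, because underscores are valid in names.
underscored="${saml_mapper_graph_proxy_port}"

echo "hyphenated:  $hyphenated"
echo "underscored: $underscored"
```

The hyphenated form yields the literal text mapper-graph-proxy-port, while the underscored form yields the configured value, matching the crash described above.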