Project Memory: dlx-ansible
Infrastructure Overview
- NPM Server: nginx (192.168.200.71) - Nginx Proxy Manager for SSL termination
- Application Servers: hiveops (192.168.200.112), smartjournal (192.168.200.114)
- CI/CD Server: jenkins (192.168.200.91) - Jenkins + SonarQube
- All servers use dlxadminuser with passwordless sudo
Critical Learnings
SSL Certificate Offloading with Nginx Proxy Manager
Problem: Spring Boot applications behind NPM experience redirect loops when accessed via HTTPS.
Root Cause: Spring Boot doesn't trust X-Forwarded-* headers by default. When NPM terminates SSL and forwards plain HTTP to the backend, Spring sees an HTTP request and redirects to HTTPS, creating an infinite loop.
Solution: Configure Spring Boot to trust forwarded headers:
environment:
  SERVER_FORWARD_HEADERS_STRATEGY: native
  SERVER_USE_FORWARD_HEADERS: "true"
Key Points:
- Containers must be recreated (not restarted) for env vars to take effect
- Verify with: curl -I -H 'X-Forwarded-Proto: https' http://localhost:8080/
- Success indicator: Strict-Transport-Security header in response
- Documentation: docs/SSL-OFFLOADING-FIX.md
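The verification step can be scripted. The following is a minimal sketch, assuming the service answers on localhost:8080 as in the check above; the function name has_hsts is hypothetical and not part of the repository.

```shell
# has_hsts: read HTTP response headers on stdin and succeed (exit 0)
# if a Strict-Transport-Security header is present.
# Header names are case-insensitive, so the match ignores case.
has_hsts() {
    grep -qi '^strict-transport-security:'
}

# Example (not run here; requires the backend to be up):
#   curl -sI -H 'X-Forwarded-Proto: https' http://localhost:8080/ | has_hsts \
#       && echo "forwarded headers trusted" \
#       || echo "HSTS header missing"
```

A non-zero exit status from has_hsts suggests the container is still running with the old environment and needs to be recreated.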
Docker Compose Best Practices
Environment Variable Loading:
- Use --env-file flag when .env is not in same directory as compose file
- Example: docker compose -f docker/docker-compose.yml --env-file .env up -d
Container Updates:
- Restart: Keeps existing container, doesn't apply env changes
- Recreate: Removes old container, creates new one with latest env/config
- Always recreate when changing environment variables
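A minimal sketch of the recreate step, assuming the compose file and .env paths from the example above. The function name recreate_service and the variables COMPOSE_FILE_PATH, ENV_FILE_PATH, and DRY_RUN are hypothetical names used only here; --force-recreate is the docker compose up flag that replaces a container even when Compose detects no configuration change.

```shell
# recreate_service SERVICE
# Recreates one compose service so it picks up changed environment variables.
# With DRY_RUN=1 the command is printed instead of executed.
# COMPOSE_FILE_PATH / ENV_FILE_PATH are hypothetical overrides for this sketch.
recreate_service() {
    service=$1
    compose_file=${COMPOSE_FILE_PATH:-docker/docker-compose.yml}
    env_file=${ENV_FILE_PATH:-.env}

    # Build the command as positional parameters so quoting is preserved.
    set -- docker compose -f "$compose_file" --env-file "$env_file" \
        up -d --force-recreate "$service"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running `DRY_RUN=1 recreate_service incident-backend` first shows the exact command before anything is touched.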
HiveOps Application Structure
Main Deployment (/opt/hiveops-deploy/):
- Full microservices stack
- Services: incident-backend, incident-frontend, mgmt, remote
- Managed via docker-compose
Standalone Deployment (/home/hiveops/):
- Simplified incident management system
- Separate from main deployment
- Used for direct hiveops.directlx.dev access
Jenkins Firewall Blocking (2026-02-09)
Problem: Jenkins and SonarQube were unreachable from the network.
Root Cause: The server had no host_vars file, so it inherited the default firewall config (SSH only).
Solution: Created host_vars/jenkins.yml with ports 22, 8080 (Jenkins), 9000 (SonarQube).
Quick Fix:
ansible jenkins -m community.general.ufw -a "rule=allow port=8080 proto=tcp" -b
ansible jenkins -m community.general.ufw -a "rule=allow port=9000 proto=tcp" -b
ansible jenkins -m shell -a "docker start postgresql sonarqube" -b
Key Points:
- Jenkins runs as Java system service (not Docker) on port 8080
- SonarQube runs in Docker with PostgreSQL backend
- Always create host_vars file for servers with specific firewall needs
- Documentation: docs/JENKINS-CONNECTIVITY-FIX.md
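To confirm the firewall rules are in place after the fix, the output of ufw status can be checked for the required ports. This is a rough sketch; the function name missing_ports is hypothetical, and it assumes the plain (non-verbose) ufw status table format.

```shell
# missing_ports PORT...
# Reads `ufw status` output on stdin and prints each required TCP port
# that has no ALLOW rule. Prints nothing when all ports are allowed.
missing_ports() {
    status=$(cat)
    for port in "$@"; do
        printf '%s\n' "$status" \
            | grep -Eq "^${port}(/tcp)?[[:space:]]+ALLOW" \
            || echo "$port"
    done
}

# Example (not run here; requires access to the jenkins host):
#   ansible jenkins -m command -a "ufw status" -b | missing_ports 22 8080 9000
```

Empty output means ports 22, 8080, and 9000 all have ALLOW rules.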
File Locations
Host Variables
- /source/dlx-src/dlx-ansible/host_vars/npm.yml - NPM firewall config
- /source/dlx-src/dlx-ansible/host_vars/hiveops.yml - HiveOps settings
- /source/dlx-src/dlx-ansible/host_vars/smartjournal.yml - SmartJournal settings
- /source/dlx-src/dlx-ansible/host_vars/jenkins.yml - Jenkins/SonarQube firewall config
Playbooks Created
- playbooks/fix-hiveops-ssl-offload.yml - SSL offload fix automation
- playbooks/fix-hiveops-compose-indentation.yml - Compose file corrections
- playbooks/fix-hiveops-mgmt-ssl.yml - Management service SSL fix
Templates
- templates/hiveops-docker-compose.prod.yml.j2 - Corrected compose template
Storage Remediation (2026-02-08)
Critical Issues Identified:
- proxmox-00 root FS: 84.5% full (CRITICAL)
- proxmox-01 dlx-docker: 81.1% full (HIGH)
- Unused containers: 1.2 TB allocated
- SonarQube: 354 GB (82% of allocation)
Remediation Playbooks Created:
- remediate-storage-critical-issues.yml: Log cleanup, Docker prune, audits
- remediate-docker-storage.yml: Deep Docker cleanup + automation
- remediate-stopped-containers.yml: Safe container removal with backups
- configure-storage-monitoring.yml: Proactive monitoring (5/10 min checks)
Documentation:
- STORAGE-AUDIT.md: Full hardware/storage analysis (550 lines)
- STORAGE-REMEDIATION-GUIDE.md: Step-by-step execution (480 lines)
- REMEDIATION-SUMMARY.md: Quick reference (300 lines)
Expected Results:
- Total space freed: 1-2 TB
- proxmox-00: 84.5% → 70% (10-15 GB freed)
- proxmox-01: 81.1% → 70% (50-150 GB freed)
- Automation prevents regrowth (weekly prune + hourly monitoring)
Commit: 90ed5c1
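The expected results follow from simple arithmetic: the space to free equals the filesystem size times the drop in usage percentage. The sketch below illustrates this; the function name gb_to_free is hypothetical, and the 100 GB size in the example is an assumption, since the actual filesystem sizes are not recorded here. A 14.5-point drop that frees 10-15 GB implies a root filesystem of roughly 70 to 100 GB.

```shell
# gb_to_free SIZE_GB USED_PCT TARGET_PCT
# Prints how many GB must be freed to bring a filesystem of SIZE_GB
# from USED_PCT down to TARGET_PCT.
gb_to_free() {
    awk -v size="$1" -v used="$2" -v target="$3" \
        'BEGIN { printf "%.1f\n", size * (used - target) / 100 }'
}

# Hypothetical 100 GB root filesystem going from 84.5% to 70%:
gb_to_free 100 84.5 70
```

The real size can be read from `df -h /` on the host and substituted for the assumed value.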
Common Tasks
Fix SSL Offloading for Spring Boot Service
- Add env vars to .env: SERVER_FORWARD_HEADERS_STRATEGY=native, SERVER_USE_FORWARD_HEADERS=true
- Add to docker-compose environment section
- Recreate container: docker stop <name> && docker rm <name> && docker compose up -d <service>
- Verify: Check for Strict-Transport-Security header
Apply Firewall Configuration
- Firewall is managed by common role (roles/common/tasks/security.yml)
- Controlled per-host via common_firewall_enabled and common_firewall_allowed_ports
- Some hosts (docker, hiveops, smartjournal) have firewall disabled for Docker networking
Run Storage Remediation
- Test with --check: ansible-playbook playbooks/remediate-storage-critical-issues.yml --check
- Deploy monitoring: ansible-playbook playbooks/configure-storage-monitoring.yml -l proxmox
- Fix proxmox-00: ansible-playbook playbooks/remediate-storage-critical-issues.yml -l proxmox-00
- Fix proxmox-01: ansible-playbook playbooks/remediate-docker-storage.yml -l proxmox-01
- Monitor: tail -f /var/log/storage-monitor.log
- Remove containers (optional): ansible-playbook playbooks/remediate-stopped-containers.yml -e dry_run=false
Security Notes
- Only trust forwarded headers when backend is not internet-accessible
- NPM server (192.168.200.71) should be the only server that can reach backend ports
- Backend ports should bind to localhost only: 127.0.0.1:8080:8080
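One way to audit port bindings is to scan the docker ps port column for entries published on all interfaces. This is a minimal sketch; the function name exposed_bindings is hypothetical, and it assumes the usual docker ps port notation (for example 0.0.0.0:8080->8080/tcp).

```shell
# exposed_bindings
# Reads "name<TAB>ports" lines on stdin, as produced by
#   docker ps --format '{{.Names}}\t{{.Ports}}'
# and prints the lines for containers that publish a port on all
# interfaces (0.0.0.0 for IPv4, :: or [::] for IPv6).
exposed_bindings() {
    grep -E '(^|[[:space:],])(0\.0\.0\.0|\[::\]|::):[0-9-]+->'
}

# Example (not run here; requires Docker on the host):
#   docker ps --format '{{.Names}}\t{{.Ports}}' | exposed_bindings
```

Containers bound as 127.0.0.1:8080->8080/tcp are not reported, so empty output means nothing is published beyond localhost.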