CLAUDE.md - dlx-ansible

Infrastructure as Code for DirectLX - Ansible playbooks, roles, and inventory for managing a Proxmox-based homelab infrastructure with multiple services.

Project Overview

This repository manages 16 servers across Proxmox hypervisors, databases, web services, infrastructure services, and applications using Ansible automation.

Infrastructure

Server Inventory

Proxmox Cluster:

  • proxmox-00 (192.168.200.10) - Primary hypervisor
  • proxmox-01 (192.168.200.11) - Secondary hypervisor
  • proxmox-02 (192.168.200.12) - Tertiary hypervisor

Database Servers:

  • postgres (192.168.200.103) - PostgreSQL database
  • mysql (192.168.200.110) - MySQL/MariaDB database
  • mongo (192.168.200.111) - MongoDB database

Web/Proxy Servers:

  • nginx (192.168.200.65) - Web server
  • npm (192.168.200.71) - Nginx Proxy Manager for SSL termination

Infrastructure Services:

  • docker (192.168.200.200) - Docker host for various containerized services
  • pihole (192.168.200.100) - DNS server and ad-blocking
  • gitea (192.168.200.102) - Self-hosted Git service
  • jenkins (192.168.200.91) - CI/CD server + SonarQube

Application Servers:

  • hiveops (192.168.200.112) - HiveOps incident management (Spring Boot)
  • smartjournal (192.168.200.114) - Journal tracking application
  • odoo (192.168.200.61) - ERP system

Control:

  • ansible-node (192.168.200.106) - Ansible control node

Common Access Patterns

  • User: dlxadmin (passwordless sudo on all servers)
  • SSH: Key-based authentication (password disabled on most servers)
  • Exception: Jenkins server has password auth enabled for AWS Jenkins Master connection
  • Firewall: UFW managed via common role
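
These access conventions could be captured once as inventory-wide connection variables. The following is a minimal sketch, assuming a group_vars/all.yml file; the group_vars/ directory exists in this repository, but this particular file and the key path are hypothetical.

```yaml
# group_vars/all.yml (hypothetical file; adjust to the real layout)
ansible_user: dlxadmin    # SSH login user with passwordless sudo
# Assumed key location; the real path is not stated in this document
ansible_ssh_private_key_file: ~/.ssh/id_ed25519
```

With variables like these in place, the ad-hoc commands in this file still need -b wherever root privileges are required.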

Quick Start Commands

Basic Ansible Operations

# Check connectivity to all servers
ansible all -m ping

# Check connectivity to specific group
ansible webservers -m ping

# Run ad-hoc command
ansible all -m shell -a "uptime" -b

# Gather facts about servers
ansible all -m setup

Playbook Execution

# Run main site playbook
ansible-playbook playbooks/site.yml

# Limit to specific servers
ansible-playbook playbooks/site.yml -l jenkins,npm

# Limit to server group
ansible-playbook playbooks/site.yml -l webservers

# Use tags
ansible-playbook playbooks/site.yml --tags firewall

# Dry run (check mode)
ansible-playbook playbooks/site.yml --check

# Verbose output
ansible-playbook playbooks/site.yml -v
ansible-playbook playbooks/site.yml -vvv  # very verbose

Security Operations

# Run comprehensive security audit
ansible-playbook playbooks/security-audit-v2.yml

# View audit results
cat /tmp/security-audit-*/report.txt
cat docs/SECURITY-AUDIT-SUMMARY.md

# Apply security updates
ansible all -m apt -a "update_cache=yes upgrade=dist" -b

# Check firewall status
ansible all -m shell -a "ufw status verbose" -b

# Configure Docker server firewall (when ready)
ansible-playbook playbooks/secure-docker-server-firewall.yml

Server Management

# Reboot servers ("all" includes the Proxmox nodes and the control node; narrow with -l)
ansible all -m reboot -b

# Check disk space
ansible all -m shell -a "df -h" -b

# Check memory usage
ansible all -m shell -a "free -h" -b

# Check running services
ansible all -m shell -a "systemctl list-units --type=service --state=running --no-pager" -b

# Refresh the package cache (does not upgrade packages)
ansible all -m apt -a "update_cache=yes" -b
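
Rebooting every host at once with the ad-hoc command takes the hypervisors and the control node down together. A rolling reboot is one possible alternative. This is only a sketch: the playbook name and the timeout value are assumptions, not files or settings from this repository.

```yaml
# playbooks/rolling-reboot.yml (hypothetical playbook name)
- name: Reboot servers one at a time
  hosts: "all:!ansible-node"   # exclude the control node running the play
  become: true
  serial: 1                    # finish each host before starting the next
  tasks:
    - name: Reboot and wait for the host to come back
      ansible.builtin.reboot:
        reboot_timeout: 600    # seconds; assumed value
```

Rebooting a Proxmox node also interrupts the guests it hosts, so narrowing the run with -l may still be the safer choice.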

Directory Structure

dlx-ansible/
├── inventory/
│   └── hosts.yml              # Server inventory with IPs and groups
│
├── host_vars/                 # Per-host configuration
│   ├── jenkins.yml            # Jenkins-specific vars (firewall ports)
│   ├── npm.yml                # NPM firewall configuration
│   ├── hiveops.yml            # HiveOps settings
│   └── ...
│
├── group_vars/                # Per-group configuration
│
├── roles/                     # Ansible roles
│   └── common/                # Common configuration for all servers
│       ├── tasks/
│       │   ├── main.yml
│       │   ├── packages.yml
│       │   ├── security.yml   # Firewall, SSH hardening
│       │   ├── users.yml
│       │   └── timezone.yml
│       └── defaults/
│           └── main.yml       # Default variables
│
├── playbooks/                 # Ansible playbooks
│   ├── site.yml               # Main playbook (includes all roles)
│   ├── security-audit-v2.yml  # Security audit
│   ├── secure-docker-server-firewall.yml
│   └── ...
│
├── templates/                 # Jinja2 templates
│
└── docs/                      # Documentation
    ├── SECURITY-AUDIT-SUMMARY.md
    ├── JENKINS-CONNECTIVITY-FIX.md
    └── ...
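
For orientation, a main playbook that applies the common role to every host might look like the sketch below. It is illustrative only; the actual contents of playbooks/site.yml are not shown in this document, and it assumes roles/ is reachable from playbooks/ (for example through roles_path in ansible.cfg).

```yaml
# playbooks/site.yml (illustrative sketch, not the actual file)
- name: Apply common configuration to all servers
  hosts: all
  become: true
  roles:
    - role: common
```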

Key Configuration Patterns

Firewall Management

Firewall is managed by the common role. Configuration is per-host in host_vars/:

# Example: host_vars/jenkins.yml
common_firewall_enabled: true
common_firewall_allowed_ports:
  - "22/tcp"    # SSH
  - "8080/tcp"  # Jenkins
  - "9000/tcp"  # SonarQube
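
One way a role could consume these variables is sketched below. This is an assumption about how roles/common/tasks/security.yml might work, not a copy of it; only the two variable names come from this document.

```yaml
# Sketch of firewall tasks (assumed, not the actual role code)
- name: Allow configured ports through UFW
  community.general.ufw:
    rule: allow
    port: "{{ item.split('/')[0] }}"    # "8080/tcp" -> "8080"
    proto: "{{ item.split('/')[1] }}"   # "8080/tcp" -> "tcp"
  loop: "{{ common_firewall_allowed_ports }}"
  when: common_firewall_enabled | bool

- name: Enable UFW with a default-deny inbound policy
  community.general.ufw:
    state: enabled
    direction: incoming
    policy: deny
  when: common_firewall_enabled | bool
```

The allow rules run before UFW is enabled so that the SSH rule is in place when the default-deny policy takes effect.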

Firewall Disabled Hosts:

  • docker, hiveops, smartjournal, odoo (disabled for Docker networking)
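
For these hosts the per-host override would presumably be a single variable. The file below is illustrative; the real host_vars contents are not shown in this document.

```yaml
# host_vars/docker.yml (illustrative; actual file contents are not shown here)
common_firewall_enabled: false   # UFW left off so Docker networking is not disturbed
```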

SSH Configuration

Most servers use key-only authentication:

PasswordAuthentication no
PubkeyAuthentication yes
# Proxmox nodes are the exception and may permit root login
PermitRootLogin no

Exception: Jenkins has password authentication enabled for AWS Jenkins Master.

Spring Boot SSL Offloading

For Spring Boot applications behind Nginx Proxy Manager:

environment:
  SERVER_FORWARD_HEADERS_STRATEGY: native
  SERVER_USE_FORWARD_HEADERS: "true"  # legacy property, deprecated since Spring Boot 2.2; quoted so YAML keeps it a string

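With these settings the application trusts the X-Forwarded-* headers that NPM adds and builds its redirect URLs with https, which prevents redirect loops when NPM terminates SSL.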

Docker Compose

When .env is not in the same directory as the compose file:

docker compose -f docker/docker-compose.yml --env-file .env up -d

Container updates: Always recreate (not restart) when changing environment variables.
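
A restart reuses the existing container definition, so changed environment variables are not picked up until the container is recreated. The task below is a minimal sketch of forcing recreation from Ansible; the project directory is hypothetical, and the compose command simply extends the one above with --force-recreate.

```yaml
- name: Recreate containers so updated environment variables take effect
  ansible.builtin.command:
    cmd: docker compose -f docker/docker-compose.yml --env-file .env up -d --force-recreate
    chdir: /opt/app        # hypothetical project directory
  changed_when: true       # recreation always changes the running containers
```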

Critical Knowledge

See ~/.claude/projects/-source-dlx-src-dlx-ansible/memory/MEMORY.md for detailed infrastructure knowledge including:

  • SSL offloading configuration
  • Jenkins connectivity troubleshooting
  • Storage remediation procedures
  • Security audit findings
  • Common fixes and solutions

Common Tasks

Add New Server

  1. Add to inventory/hosts.yml:

    newserver:
      ansible_host: 192.168.200.xxx
    
  2. Create host_vars/newserver.yml (if custom config needed)

  3. Run setup:

    ansible-playbook playbooks/site.yml -l newserver
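
To make the new host reachable through group targets such as -l webservers, step 1 can place it under a group. The sketch below assumes the standard YAML inventory nesting; the group placement is an assumption, since the real structure of inventory/hosts.yml is not shown here.

```yaml
# inventory/hosts.yml (sketch; the group placement is an assumption)
all:
  children:
    webservers:
      hosts:
        newserver:
          ansible_host: 192.168.200.xxx   # placeholder, as in step 1
```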
    

Update Firewall Rules

  1. Edit host_vars/<server>.yml:

    common_firewall_allowed_ports:
      - "22/tcp"
      - "80/tcp"
      - "443/tcp"
    
  2. Apply changes:

    ansible-playbook playbooks/site.yml -l <server> --tags firewall
    

Enable Automatic Security Updates

ansible all -m apt -a "name=unattended-upgrades state=present" -b
ansible all -m copy -a "dest=/etc/apt/apt.conf.d/20auto-upgrades content='APT::Periodic::Update-Package-Lists \"1\";\nAPT::Periodic::Unattended-Upgrade \"1\";' mode=0644" -b
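
The same two steps can be written as a play, which avoids the quoting and escaping in the ad-hoc copy command. This is a sketch of an equivalent playbook, not an existing file in this repository.

```yaml
- name: Enable automatic security updates
  hosts: all
  become: true
  tasks:
    - name: Install unattended-upgrades
      ansible.builtin.apt:
        name: unattended-upgrades
        state: present

    - name: Turn on periodic list updates and unattended upgrades
      ansible.builtin.copy:
        dest: /etc/apt/apt.conf.d/20auto-upgrades
        mode: "0644"
        content: |
          APT::Periodic::Update-Package-Lists "1";
          APT::Periodic::Unattended-Upgrade "1";
```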

Run Monthly Security Audit

ansible-playbook playbooks/security-audit-v2.yml
cat docs/SECURITY-AUDIT-SUMMARY.md

Git Workflow

  • Main Branch: Production-ready configurations
  • Commit Messages: Descriptive, include what was changed and why
  • Co-Authored-By: Include for Claude-assisted work
  • Testing: Always test with --check before applying changes

Example commit:

git add playbooks/new-playbook.yml
git commit -m "Add playbook for X configuration

This playbook automates Y to solve Z problem.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>"

Troubleshooting

SSH Connection Issues

# Test SSH connectivity
ansible <server> -m ping

# Check SSH with verbose output
ssh -vvv dlxadmin@<server-ip>

# Verify privilege escalation from the control node (should print "root")
ansible <server> -m shell -a "whoami" -b

Firewall Issues

# Check firewall status
ansible <server> -m shell -a "ufw status verbose" -b

# Temporarily disable (for debugging)
ansible <server> -m ufw -a "state=disabled" -b

# Re-enable
ansible <server> -m ufw -a "state=enabled" -b

Playbook Failures

# Run with verbose output
ansible-playbook playbooks/site.yml -vvv

# Check syntax
ansible-playbook playbooks/site.yml --syntax-check

# List tasks
ansible-playbook playbooks/site.yml --list-tasks

# Start at specific task
ansible-playbook playbooks/site.yml --start-at-task="task name"

Security Best Practices

  1. Always test with --check first
  2. Limit scope with -l when testing
  3. Keep firewall rules minimal
  4. Use key-based SSH authentication
  5. Enable automatic security updates
  6. Run monthly security audits
  7. Document changes in the Claude memory file (MEMORY.md)
  8. Never commit secrets (use Ansible Vault when needed)
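
A common way to follow item 8 is to keep the secret in a separate vault-encrypted file and reference it from plain variables. The sketch below uses a hypothetical per-host directory layout and hypothetical variable names; nothing in it is taken from this repository.

```yaml
# host_vars/postgres/vars.yml (hypothetical layout and variable names)
postgres_admin_password: "{{ vault_postgres_admin_password }}"

# host_vars/postgres/vault.yml holds the real value and is encrypted with:
#   ansible-vault encrypt host_vars/postgres/vault.yml
# Before encryption it would contain a single line such as:
#   vault_postgres_admin_password: "<secret>"
```

Playbook runs then need the vault password, for example by passing --ask-vault-pass to ansible-playbook.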

Important Notes

  • Jenkins password auth is intentional (for AWS Jenkins Master access)
  • Firewall disabled on docker/hiveops/smartjournal/odoo for Docker networking
  • Proxmox nodes may require root login for management
  • NPM server (192.168.200.71) handles SSL termination for web services
  • Pi-hole (192.168.200.100) provides DNS for internal services

Resources

  • Documentation: docs/ directory
  • Security Audit: docs/SECURITY-AUDIT-SUMMARY.md
  • Claude Memory: ~/.claude/projects/-source-dlx-src-dlx-ansible/memory/MEMORY.md
  • Version Controlled Config: http://192.168.200.102/directlx/dlx-claude

Maintenance Schedule

  • Daily: Monitor server health, check failed logins
  • Weekly: Review and apply security updates
  • Monthly: Run security audit, review firewall rules
  • Quarterly: Review and update documentation

Last Updated: 2026-02-09
Repository: http://192.168.200.102/directlx/dlx-ansible (Gitea)
Claude Memory: Maintained in ~/.claude/projects/
Version Controlled: http://192.168.200.102/directlx/dlx-claude