# Local DNS Configuration for DirectLX Services
## Problem
DirectLX services (incident.directlx.dev, hiveops.directlx.dev, etc.) resolve to the public IP `45.16.76.42`, where connectivity is unreliable:
- Connection timeouts (2+ minutes)
- Only ~33% success rate
- Intermittent failures
## Solution
Configure local DNS resolution to use the internal NPM server (`192.168.200.71`) instead of going through the public IP.
### Benefits
- ✅ Fast local network access (2-4ms vs 2+ minute timeouts)
- ✅ 100% reliability on local network
- ✅ No dependency on public IP routing
- ✅ Seamless HTTPS with valid certificates
## Playbooks
### 1. `configure-local-dns-localhost.yml`
Configure DNS on your local workstation **right now**.
**Usage:**
```bash
cd /source/dlx-src/dlx-ansible
ansible-playbook playbooks/configure-local-dns-localhost.yml --ask-become-pass
```
**What it does:**
- Adds directlx.dev entries to `/etc/hosts`
- Points all domains to `192.168.200.71` (NPM server)
- Tests connectivity automatically
- Shows before/after results
**Duration:** ~5 seconds
---
### 2. `configure-local-dns.yml`
Deploy local DNS configuration across **all infrastructure hosts**.
**Usage:**
```bash
cd /source/dlx-src/dlx-ansible
# Apply to all hosts except NPM server
ansible-playbook -i inventory/hosts.yml playbooks/configure-local-dns.yml
# Apply to specific group
ansible-playbook -i inventory/hosts.yml playbooks/configure-local-dns.yml --limit application
# Apply to specific host
ansible-playbook -i inventory/hosts.yml playbooks/configure-local-dns.yml --limit hiveops
```
**What it does:**
- Configures all hosts to use local NPM server
- Creates backup of `/etc/hosts` before changes
- Uses blockinfile for idempotent updates
- Verifies DNS resolution after configuration
- Tests HTTPS connectivity (optional)
**Target hosts:** All hosts except NPM server itself
**Duration:** ~10-20 seconds per host
---
## Manual Configuration
If you prefer to configure manually:
```bash
sudo tee -a /etc/hosts <<'EOF'
# DirectLX Local DNS Entries
192.168.200.71 incident.directlx.dev
192.168.200.71 hiveops.directlx.dev
192.168.200.71 mgmt.directlx.dev
192.168.200.71 release.directlx.dev
192.168.200.71 gitea.directlx.dev
192.168.200.71 smartjournal.directlx.dev
192.168.200.71 directlx.dev
192.168.200.71 www.directlx.dev
192.168.200.71 registry.directlx.dev
EOF
```
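The `tee -a` command above appends a new copy of the entries every time it runs. A minimal sketch of a rerunnable alternative follows; it wraps the same nine entries in marker lines and replaces any earlier copy of the block. The function name and the `BEGIN`/`END` marker strings are assumptions made for this example and are not taken from the playbooks.

```bash
# Sketch only: function name and marker strings are hypothetical.
NPM_IP="192.168.200.71"
BEGIN_MARK="# BEGIN DirectLX Local DNS"
END_MARK="# END DirectLX Local DNS"
# Subdomains from the manual list above; the apex domain is added separately.
SUBDOMAINS="incident hiveops mgmt release gitea smartjournal www registry"

write_directlx_block() {
  hosts_file="$1"
  tmp="$(mktemp)"
  # Copy every line except a previously written marker-delimited block.
  awk -v b="$BEGIN_MARK" -v e="$END_MARK" '
    $0 == b { skip = 1; next }
    $0 == e { skip = 0; next }
    !skip   { print }
  ' "$hosts_file" > "$tmp"
  # Append a fresh block.
  {
    echo "$BEGIN_MARK"
    echo "$NPM_IP directlx.dev"
    for name in $SUBDOMAINS; do
      echo "$NPM_IP $name.directlx.dev"
    done
    echo "$END_MARK"
  } >> "$tmp"
  # Writing through cat keeps the owner and mode of the original file.
  cat "$tmp" > "$hosts_file"
  rm -f "$tmp"
}
```

Run as root, `write_directlx_block /etc/hosts` can be repeated without producing duplicate entries.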
## Verification
### Test DNS Resolution
```bash
getent hosts incident.directlx.dev
# Expected: 192.168.200.71 incident.directlx.dev
```
### Test HTTPS Connectivity
```bash
curl -I https://incident.directlx.dev
# Expected: a 200 status line such as "HTTP/1.1 200 OK" or "HTTP/2 200" (within 1 second)
```
### Test All Domains
```bash
for domain in incident.directlx.dev hiveops.directlx.dev mgmt.directlx.dev; do
  echo "Testing $domain..."
  # -s hides the progress meter that curl otherwise prints when its output is piped
  time curl -sI --max-time 5 "https://$domain" | head -1
done
```
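The loop above exercises HTTPS only. To confirm that name resolution itself returns the NPM address, a small helper along these lines may be useful. It is a sketch: the function name `check_domain` is hypothetical, and it relies only on `getent hosts`, which is already used in this document.

```bash
# Sketch only: check_domain is a hypothetical helper name.
EXPECTED_IP="192.168.200.71"

check_domain() {
  domain="$1"
  # getent prints "<ip> <name>"; keep the first field of the first line.
  resolved="$(getent hosts "$domain" | awk 'NR == 1 { print $1 }')"
  if [ "$resolved" = "$EXPECTED_IP" ]; then
    echo "OK   $domain -> $resolved"
  else
    echo "FAIL $domain -> ${resolved:-unresolved} (expected $EXPECTED_IP)"
    return 1
  fi
}
```

For example, `for d in incident hiveops mgmt; do check_domain "$d.directlx.dev"; done` prints one `OK` or `FAIL` line per domain.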
## Rollback
If you need to remove the configuration:
```bash
# Restore from backup
sudo cp /etc/hosts.backup-<timestamp> /etc/hosts
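# Or remove the Ansible-managed block. Check that both marker lines exist first:
# if the END marker is missing, sed deletes from the first marker to the end of the file.
# (The manual block above has no END marker.)
sudo sed -i '/# DirectLX Local DNS/,/# END DirectLX Local DNS/d' /etc/hosts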
```
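Because a `sed` range with no closing match runs to the end of the file, it can help to verify the marker pair before deleting anything. The sketch below does that check, keeps a copy of the file, and then runs the same `sed` expression as above. The function name and the `.pre-rollback` suffix are assumptions for illustration, and `sed -i` is used in its GNU form.

```bash
# Sketch only: function name and backup suffix are hypothetical.
remove_marked_block() {
  hosts_file="$1"
  begin_re='# DirectLX Local DNS'
  end_re='# END DirectLX Local DNS'

  # Succeed only if a begin marker is followed later by an end marker.
  if ! awk -v b="$begin_re" -v e="$end_re" '
        !open && index($0, b) { open = 1; next }
        open && index($0, e)  { closed = 1; exit }
        END { exit closed ? 0 : 1 }
      ' "$hosts_file"; then
    echo "marker pair not found in $hosts_file; nothing removed" >&2
    return 1
  fi

  cp -p "$hosts_file" "$hosts_file.pre-rollback"
  sed -i "/$begin_re/,/$end_re/d" "$hosts_file"
}
```

Run as root, `remove_marked_block /etc/hosts` leaves the file untouched when the block was added manually without an END marker.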
## Production Considerations
### Option 1: /etc/hosts (Current Solution)
- ✅ Simple and immediate
- ✅ No additional dependencies
- ❌ Must be applied to each host
- ❌ Manual management required
### Option 2: Pi-hole Split-Horizon DNS (Recommended Long-term)
Configure Pi-hole at `192.168.200.100` with local DNS records:
**Benefits:**
- ✅ Centralized management
- ✅ Automatic for all network clients
- ✅ No per-host configuration
- ✅ Easy to update
**Implementation:**
1. Log in to Pi-hole admin (`http://192.168.200.100/admin`)
2. Navigate to **Local DNS** → **DNS Records**
3. Add A records for each domain:
   - Domain: `incident.directlx.dev` → IP: `192.168.200.71`
   - Domain: `hiveops.directlx.dev` → IP: `192.168.200.71`
   - Domain: `mgmt.directlx.dev` → IP: `192.168.200.71`
   - Domain: `registry.directlx.dev` → IP: `192.168.200.71`
   - etc.
### Option 3: Internal DNS Server (Enterprise)
Set up authoritative DNS for `directlx.dev` zone with split-horizon configuration.
## How It Works
### Before
```
Client → DNS Query → 45.16.76.42 (public IP)
Client → HTTPS Request → 45.16.76.42 → [timeout/unreliable]
```
### After
```
Client → /etc/hosts → 192.168.200.71 (local NPM)
Client → HTTPS Request → 192.168.200.71 (NPM) → 192.168.200.112 (backend)
```
## Network Architecture
```
┌─────────────────┐
│ Your Workstation│
│ 10.10.10.119    │
└────────┬────────┘
         │ Local Network Route
    ┌────▼─────────────────┐
    │ NPM (Nginx Proxy)    │
    │ 192.168.200.71       │
    │ Port 443 (HTTPS)     │
    └────┬─────────────────┘
         │ Proxy Pass
    ┌────▼──────────────────┐
    │ HiveOps Backend       │
    │ 192.168.200.112:8080  │
    └───────────────────────┘
```
## Troubleshooting
### Issue: Still getting timeouts
**Check DNS resolution:**
```bash
getent hosts incident.directlx.dev
```
Should show `192.168.200.71`, not `45.16.76.42`.
**Check /etc/hosts:**
```bash
grep directlx /etc/hosts
```
Should show entries with `192.168.200.71`.
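If both checks look correct and requests still time out, it can help to separate name resolution from reachability of NPM itself. One way, sketched below, is curl's `--resolve` option, which pins a hostname to an address for a single request while still sending the hostname for TLS. The function name `probe_via_npm` is hypothetical.

```bash
# Sketch only: probe_via_npm is a hypothetical helper name.
probe_via_npm() {
  domain="$1"
  ip="${2:-192.168.200.71}"
  # --resolve maps <domain>:443 to <ip> for this request only,
  # bypassing /etc/hosts and DNS; -w prints a one-line summary.
  curl -sS -o /dev/null --max-time 5 \
       --resolve "$domain:443:$ip" \
       -w '%{http_code} %{remote_ip} %{time_total}s\n' \
       "https://$domain/"
}
```

`probe_via_npm incident.directlx.dev` prints the HTTP status code, the address curl connected to, and the total request time. A failure here points at NPM or the network path rather than at the hosts entries.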
### Issue: DNS resolving to wrong IP
**Flush the systemd-resolved cache:**
```bash
# Current systemd releases
sudo resolvectl flush-caches
# Older releases that still ship the systemd-resolve command
sudo systemd-resolve --flush-caches
```
**Check NSS resolution order:**
```bash
grep hosts /etc/nsswitch.conf
```
Should show `files` before `dns`: `hosts: files dns`
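The ordering check can also be scripted. This is a minimal sketch with a hypothetical function name; it only compares the positions of `files` and `dns` on the `hosts:` line and ignores any other NSS modules listed there.

```bash
# Sketch only: files_before_dns is a hypothetical helper name.
files_before_dns() {
  conf="${1:-/etc/nsswitch.conf}"
  awk '
    $1 == "hosts:" {
      found = 1
      for (i = 2; i <= NF; i++) {
        if ($i == "files" && !f) f = i   # position of first "files"
        if ($i == "dns" && !d)   d = i   # position of first "dns"
      }
    }
    # Pass when "files" is present and comes before "dns" (or "dns" is absent).
    END { exit (found && f && (!d || f < d)) ? 0 : 1 }
  ' "$conf"
}
```

For example: `files_before_dns && echo "order OK" || echo "files is missing or listed after dns"`.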
### Issue: Certificate warnings
If you get SSL certificate warnings, you are most likely connecting by IP address instead of by hostname, so the names in the certificate do not match the URL.
**Solution:** Ensure you're using the domain name, not the IP:
```bash
# Wrong
curl https://192.168.200.71
# Correct
curl https://incident.directlx.dev
```
## Related Documentation
- [SSL Offloading Fix](SSL-OFFLOADING-FIX.md) - Spring Boot SSL configuration
- [Express Proxy Config](EXPRESS-PROXY-CONFIG.md) - Express.js proxy settings
- NPM Configuration - Located in `/data/nginx/proxy_host/` on 192.168.200.71
## Author
- **Created:** 2026-02-06
- **Purpose:** Fix unreliable public IP connectivity to DirectLX services
- **Repository:** dlx-src/dlx-ansible