# Project Memory
## Git Preferences
- **Default branch**: Always use `main` instead of `master`
  - When initializing repos: `git init -b main`
  - When creating the first commit, use the `main` branch
  - When pushing: `git push -u origin main`
- **Standardized Git Identity**: All HiveOps repositories use a consistent git configuration
  - **User name**: `directlx`
  - **User email**: `directlx.dev@gmail.com`
  - Set in each repository: `git config user.name "directlx" && git config user.email "directlx.dev@gmail.com"`
  - Verify before committing: `git config user.name && git config user.email`
  - All 12 HiveOps repositories have been standardized (as of 2026-02-15)
## Documentation Organization
- **Markdown files location**: Always place documentation `.md` files in the `docs/` directory
  - Exception: `README.md` stays in repository root
  - All other `.md` files go in `docs/`
  - Update references in `README.md` to point to `docs/` paths
  - Keep documentation organized and centralized
- **Deployment Documentation**: Comprehensive guides for microservice deployment
  - `docs/DEPLOYMENT-GUIDE.md` - Complete deployment procedures, troubleshooting, rollback
  - `docs/DEPLOYMENT-QUICKSTART.md` - Fast reference for quick deployments
  - Both guides include standardized git configuration verification steps
  - Deployment workflow: Build → Push → SSH to production → Pull → Deploy → Verify
## Browser-Only Access Restriction
**Context**: HiveOps Incident Management is accessible only through the HiveOps Browser application, not through a regular web browser.
### Implementation Pattern (Dual-Layer Security)
1. **Nginx Layer** (`instances/services/nginx/conf.d/default.conf`):
   - **API/Agent endpoints** (`^/(api|atm|actuator)/`): NO browser check - agents don't have browser headers
   - **Frontend** (`/`): Browser check required - serves blocked page if unauthorized
   - Location: `/blocked.html` serves `instances/services/nginx/conf.d/blocked.html`
   - **IMPORTANT**: Agents run on ATMs without HiveOps Browser headers - nginx must allow `/api/`, `/atm/`, `/actuator/` paths through without browser checks
2. **Backend Layer** (Java Spring Boot):
   - `BrowserOnlyFilter.java` - Servlet filter that checks headers, allows `/api/**`, `/atm/**`, `/actuator/**`
   - Registered in `SecurityConfig.java` via `.addFilterBefore()`
   - Spring Security handles authentication:
     - `/api/**` - requires JWT authentication (`hasRole("USER")`)
     - `/atm/**` - allows unauthenticated access (`permitAll`) for agent communication
     - `/actuator/health`, `/actuator/info` - public endpoints
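
The two layers above amount to a per-path decision. The sketch below models that decision in JavaScript purely as an illustration. It is not the nginx config or `BrowserOnlyFilter.java`; the name `accessPolicy` and the shape of its return value are hypothetical.

```javascript
// Illustrative model of the path rules described above.
// `accessPolicy` and its return shape are hypothetical, not project code.
const AGENT_PATHS = /^\/(api|atm|actuator)\//;

function accessPolicy(path) {
  // The browser header check applies only to paths outside the agent/API prefixes.
  const browserCheck = !AGENT_PATHS.test(path);

  let auth;
  if (path.startsWith('/api/')) {
    auth = 'jwt';          // hasRole("USER")
  } else if (path.startsWith('/atm/')) {
    auth = 'none';         // permitAll for agent communication
  } else if (path === '/actuator/health' || path === '/actuator/info') {
    auth = 'none';         // public endpoints
  } else {
    auth = 'unspecified';  // not covered by the notes above
  }
  return { browserCheck, auth };
}

console.log(accessPolicy('/api/atms/config/sync'));
console.log(accessPolicy('/'));
```

An agent `POST` to `/api/atms/config/sync` therefore skips the browser check but still needs a JWT, while the frontend root keeps the browser check.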
### HiveOps Browser Header Injection
**File**: `hiveops-browser/src/main/main.js` — attached to `incidentView.webContents.session`
Injects `X-HiveOps-Browser`, `X-HiveOps-Browser-Version`, `User-Agent` suffix, and `Authorization: Bearer <token>` (from `authManager.getToken()`). See "Browser JWT Injection" section below for full details.
### Critical Nginx Configuration Pattern
**Always use DNS resolver + variables for proxy_pass**:
```nginx
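# Docker DNS resolver (prevents "host not found" errors at startup)
resolver 127.0.0.11 valid=30s;
location / {
    set $backend "service-name:port";
    # No URI part after $backend: with a variable in proxy_pass, a trailing "/"
    # would replace every request URI with "/" (see "Nginx Proxy Path Issue")
    proxy_pass http://$backend;
    # ... proxy headers
}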
```
**Why**: Nginx resolves hostnames in `proxy_pass` at config load time. If services aren't ready, nginx fails to start. Using variables defers DNS resolution to request time.
### Files Modified for Browser Restriction
- `hiveops-openmetal/instances/services/nginx/conf.d/default.conf` - nginx config with browser checks
- `hiveops-openmetal/instances/services/nginx/conf.d/blocked.html` - blocked access page
- `hiveops-incident/backend/.../filter/BrowserOnlyFilter.java` - backend filter
- `hiveops-incident/backend/.../config/SecurityConfig.java` - filter registration
- `hiveops-browser/src/main/main.js` - header injection
### Spring Security 6.x Authentication Issue (RESOLVED 2026-02-16)
**Problem**: `AnonymousAuthenticationFilter` overwrites custom authentication, causing 403 errors even when authentication is successfully set.
**Solution**: Disable anonymous authentication in SecurityConfig:
```java
http.anonymous(anonymous -> anonymous.disable())
```
Use `hasRole("USER")` instead of `.authenticated()` for authorization rules.
### Nginx Proxy Path Issue (RESOLVED 2026-02-16)
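**Problem**: A simple `proxy_pass http://$backend/;` inside `location /incident/` does not forward the request path: all requests arrive at the backend as `GET /` (observed with HTTP/2 clients). Because `proxy_pass` contains a variable, nginx sends the URI part of the directive (`/`) as-is instead of the original request URI.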
**Solution**: Use regex location to capture and forward the full path:
```nginx
location ~ ^/incident/(.*)$ {
    set $backend "hiveops-incident:8081";
    set $incident_path /$1;
    proxy_pass http://$backend$incident_path$is_args$args;
    proxy_pass_request_headers on;
}
```
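
As a worked example of what the regex location forwards, the sketch below models the rewrite in JavaScript. The helper `upstreamUri` is hypothetical and only mirrors the `$1`, `$is_args` and `$args` handling; it ignores nginx's URI normalization and percent-decoding.

```javascript
// Models the URI that the regex location above sends upstream.
// `upstreamUri` is a hypothetical helper for illustration only.
function upstreamUri(requestUri) {
  const q = requestUri.indexOf('?');
  const path = q === -1 ? requestUri : requestUri.slice(0, q);
  const args = q === -1 ? '' : requestUri.slice(q + 1);

  const match = /^\/incident\/(.*)$/.exec(path);
  if (match === null) return null;        // location does not match

  const incidentPath = '/' + match[1];    // set $incident_path /$1;
  const isArgs = args === '' ? '' : '?';  // $is_args
  return incidentPath + isArgs + args;    // $incident_path$is_args$args
}

console.log(upstreamUri('/incident/api/incidents?page=2'));
console.log(upstreamUri('/incident/'));
```

Under these assumptions, the `/incident/` prefix is stripped and the rest of the path and the query string reach the backend unchanged.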
### Agent 405 Errors - Nginx Method/Browser Check Issue (RESOLVED 2026-02-16)
**Problem**: Agents getting 405 Method Not Allowed errors when calling:
- `/api/atms/config/sync` (POST)
- `/atm/fm/modules/{country}/{atm}` (POST)
**Root Causes**:
1. Global CORS headers restricted all methods to `GET, OPTIONS` on `incident.bcos.cloud`
2. Nginx was enforcing HiveOps Browser header check on API/agent endpoints
**Solution**:
1. Remove global CORS method restrictions from server block
2. Create separate location blocks:
   - `^/(api|atm|actuator)/` → NO browser check (agents don't have headers), allow all methods
   - `/` → Browser check required (frontend only), restrictive CORS
3. Backend Spring Security handles authentication (JWT for `/api/**`, permitAll for `/atm/**`)
### JWT Session Expiry Bug (RESOLVED 2026-02-26)
**Problem**: Users see 403 on incident page approximately every 4 hours.
**Root cause**:
1. `auth-manager.js:getToken()` returned expired tokens without checking expiry
2. `main.js` injected expired `Authorization: Bearer <token>` into all incident view requests
3. Backend's `JwtAuthenticationFilter` rejected invalid token → no authentication set
4. Spring Security with `anonymous.disable()` returned **403** instead of 401
**Fix 1 — `hiveops-browser/src/main/auth-manager.js`**:
```javascript
getToken() {
  if (!this.isAuthenticated()) { return null; } // ← added expiry check
  const auth = this.getAuth();
  return auth && auth.token ? auth.token : null;
}
```
**Fix 2 — `hiveops-incident/backend/.../config/SecurityConfig.java`**:
```java
.exceptionHandling(ex -> ex
    .authenticationEntryPoint(new HttpStatusEntryPoint(HttpStatus.UNAUTHORIZED))
)
```
Unauthenticated requests now return **401** (not 403) so clients can distinguish session expiry from access denied.
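
A client can act on that distinction roughly as follows. This is a minimal sketch; `nextActionForStatus` and the action names are hypothetical, not existing project code.

```javascript
// Hypothetical client-side handling of the two status codes.
function nextActionForStatus(status) {
  if (status === 401) return 'reauthenticate';      // missing or expired session
  if (status === 403) return 'show-access-denied';  // authenticated but not permitted
  return 'continue';
}

console.log(nextActionForStatus(401));
console.log(nextActionForStatus(403));
```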
**Note**: 4-hour session expiry is hardcoded in `auth-manager.js:storeAuth()`. Background validator checks every 15 minutes and forces re-login on expiry.
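
The expiry behaviour can be sketched as below. The function names match those mentioned in these notes, but the bodies, the `{ token, expiresAt }` record shape and the `startValidator` helper are assumptions for illustration, not the actual `auth-manager.js` code.

```javascript
const SESSION_MS = 4 * 60 * 60 * 1000;     // 4-hour session
const VALIDATE_EVERY_MS = 15 * 60 * 1000;  // background validator interval

// Assumed record shape: { token, expiresAt }, with expiresAt in epoch milliseconds.
function storeAuth(token, now = Date.now()) {
  return { token, expiresAt: now + SESSION_MS };
}

function isAuthenticated(auth, now = Date.now()) {
  return Boolean(auth && auth.token) && now < auth.expiresAt;
}

// Never hand out an expired token: return null instead.
function getToken(auth, now = Date.now()) {
  return isAuthenticated(auth, now) ? auth.token : null;
}

// Hypothetical validator: calls onExpired once the stored session has lapsed.
function startValidator(getAuth, onExpired) {
  return setInterval(() => {
    if (!isAuthenticated(getAuth())) onExpired();
  }, VALIDATE_EVERY_MS);
}

const auth = storeAuth('example-token', 0);
console.log(getToken(auth, SESSION_MS - 1)); // still inside the session window
console.log(getToken(auth, SESSION_MS));     // at the expiry instant
```

Passing `now` explicitly keeps the check testable without waiting four hours.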
### Browser JWT Injection
**File**: `hiveops-browser/src/main/main.js`, in `incidentView.webContents.session.webRequest.onBeforeSendHeaders`
Injects into the incident view (not all requests):
- `X-HiveOps-Browser: true`
- `X-HiveOps-Browser-Version: <version>`
- `User-Agent: ... HiveOps/<version>`
- `Authorization: Bearer <token>` (from `authManager.getToken()`)
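
The header set above can be sketched as a pure function plus the Electron wiring. `withHiveOpsHeaders` is a hypothetical helper and the wiring is shown only as a comment. The actual `main.js` may differ, including where it obtains the version (the comment assumes `app.getVersion()`).

```javascript
// Hypothetical helper: returns a copy of the request headers with the
// HiveOps headers added. Skips Authorization when there is no valid token.
function withHiveOpsHeaders(requestHeaders, version, token) {
  const headers = { ...requestHeaders };
  headers['X-HiveOps-Browser'] = 'true';
  headers['X-HiveOps-Browser-Version'] = version;
  const userAgent = headers['User-Agent'] || '';
  headers['User-Agent'] = `${userAgent} HiveOps/${version}`.trim();
  if (token) {
    headers['Authorization'] = `Bearer ${token}`;
  }
  return headers;
}

// Assumed wiring in the Electron main process (not executed here):
// incidentView.webContents.session.webRequest.onBeforeSendHeaders((details, callback) => {
//   callback({
//     requestHeaders: withHiveOpsHeaders(
//       details.requestHeaders, app.getVersion(), authManager.getToken()),
//   });
// });

console.log(withHiveOpsHeaders({ 'User-Agent': 'Mozilla/5.0' }, '2.0.47', 'example-token'));
```

Omitting `Authorization` when `getToken()` returns `null` means an expired session reaches the backend as unauthenticated rather than carrying a stale token.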
### Browser Release Workflow
```bash
# Run from /source/hiveops-src/hiveops-browser/
./build-all.sh
# Auto-bumps patch version, builds all platforms (Linux + Windows via Wine),
# copies installers to ../hiveops-openmetal/hiveops/instances/browser/downloads/
# SCP installers to CDN server
scp "downloads/HiveOps Browser Setup X.X.XX.exe" \
"downloads/HiveOps Browser-X.X.XX.AppImage" \
"downloads/hiveops-browser_X.X.XX_amd64.deb" \
hiveops@173.231.252.43:~/hiveops/hiveops-openmetal/instances/browser/downloads/
# Run release script on CDN server
ssh hiveops@173.231.252.43
cd ~/hiveops/hiveops-openmetal/instances/browser
./scripts/release-browser.sh X.X.XX \
"HiveOps Browser Setup X.X.XX.exe" \
"HiveOps Browser-X.X.XX.AppImage" \
"hiveops-browser_X.X.XX_amd64.deb"
# Commit version bump in hiveops-browser, commit downloads + CLAUDE.md in hiveops-openmetal
```
**Latest released version: 2.0.47**
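
To avoid typos when substituting `X.X.XX`, the three installer file names can be derived from the version. This is a small sketch; `installerNames` is a hypothetical helper that only reproduces the naming patterns used in the commands above.

```javascript
// Hypothetical helper: builds the installer file names for a given version,
// following the patterns in the release commands above.
function installerNames(version) {
  return [
    `HiveOps Browser Setup ${version}.exe`,
    `HiveOps Browser-${version}.AppImage`,
    `hiveops-browser_${version}_amd64.deb`,
  ];
}

console.log(installerNames('2.0.47').join('\n'));
```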
### Deployment
```bash
# SSH user for production servers
ssh hiveops@173.231.252.40
# Instance 1 (Services) deployment path, same on both local and production:
#   ~/hiveops/hiveops-openmetal/instances/services/
# After nginx config changes, always copy from local (run on the local machine):
scp /source/hiveops-src/hiveops-openmetal/hiveops/instances/services/nginx/conf.d/default.conf \
    hiveops@173.231.252.40:~/hiveops/hiveops-openmetal/instances/services/nginx/conf.d/
# Restart services (run on the server)
cd ~/hiveops/hiveops-openmetal/instances/services
docker compose restart nginx
docker compose up -d hiveops-incident
```
### Server IPs
- Services (`incident.bcos.cloud`, `api.bcos.cloud`, etc.) → `173.231.252.40`
- CDN (`cdn.bcos.cloud`, `bcos.cloud`) → `173.231.252.43`
- Database → `173.231.252.45`
For CDN work: `ssh hiveops@173.231.252.43` and `cd ~/hiveops/hiveops-openmetal/instances/browser`
### Reorganized Structure (2026-02-20)
- `hiveops/instances/services/` — was `instances/services/`
- `hiveops/instances/browser/` — was `instances/browser/`
- `shared/database/` — was `instances/database/`
- `hiveops/docker-compose*.yml` — was root `docker-compose*.yml`
- `hiveops/.env` — was root `.env`
- `smartjournal/` — NEW, SmartJournal skeleton