# Plan: hiveops-qds-monitor Microservice
## Context
QDS (Quality Data Systems) sends automated ATM alert emails from `atmalerts@qualitydatasystems.com` via MoniManager to `hiveops@branchcos.com`. These are currently landing in a Gmail label called **QDS** but are never acted on programmatically. The goal is a new HiveOps microservice that monitors this inbox and creates (or resolves) incidents in `hiveops-incident` automatically.
---
## Email Format (Observed)
Two event types, identified by subject prefix:
- **`[External] Event Occurred`** → create incident
- **`[External] Canceled`** → resolve existing incident
Body contains a structured key-value table:
```
Terminal # 763330
Fault Contents Cash Dispenser Bill Cassette PCU3 Abnormal
Fault Device Cash Dispenser
Error Code (optional)
Date and Time 02/23/2026 06:56:58
Ticket No 287886
Remaining Total USD90,000.00 (optional)
Priority High
Severity Critical
Problem Desc ...
```
---
## New Service: `hiveops-qds-monitor`
**Location:** `/source/hiveops-src/hiveops-qds-monitor/` (new repo, Java Spring Boot)
**Tech:** Spring Boot 3.4.x · Gmail API (OAuth2) · RestTemplate · H2 embedded DB · Spring Scheduler
---
## Architecture
```
Gmail (QDS label)
    │  poll every 2 min (unread only)
GmailPollerService
    │  parse body
EmailParserService ──→ QdsEmailEvent { terminalId, ticketNo, faultDevice,
    │                                  faultContents, severity, eventType }
Event Occurred?                     Canceled?
    │                                   │
    ▼                                   ▼
IncidentCreationFlow                IncidentResolutionFlow
  1. AuthService.getJwt()             1. TicketMappingRepo.findByTicketNo()
  2. AtmLookupService                 2. PUT /api/incidents/{id} → RESOLVED
     GET /api/atms/search             3. TicketMappingRepo.delete()
  3. QdsIncidentMapper
     (fault → type, severity)
  4. POST /api/incidents
  5. TicketMappingRepo.save(ticketNo → incidentId)
    │                                   │
    └────────── mark email as read ─────┘
```
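The two branches of the diagram can be sketched in plain Java as follows. This is only an illustration under stated assumptions: `QdsFlowSketch` and its `IncidentApi` interface are hypothetical names, the in-memory map stands in for the H2-backed `TicketMappingRepo`, and the auth and ATM lookup steps are left out.

```java
import java.util.HashMap;
import java.util.Map;

class QdsFlowSketch {
    /** Hypothetical abstraction over the hiveops-incident REST calls. */
    interface IncidentApi {
        long create(String terminalId, String description); // POST /api/incidents
        void resolve(long incidentId);                       // PUT /api/incidents/{id}
    }

    private final IncidentApi api;
    // Stand-in for TicketMappingRepo: QDS ticket number -> HiveOps incident id.
    private final Map<String, Long> ticketToIncident = new HashMap<>();

    QdsFlowSketch(IncidentApi api) {
        this.api = api;
    }

    /** "Event Occurred": create the incident and remember which ticket it belongs to. */
    void onOccurred(String ticketNo, String terminalId, String description) {
        long incidentId = api.create(terminalId, description);
        ticketToIncident.put(ticketNo, incidentId);
    }

    /**
     * "Canceled": resolve the mapped incident and drop the mapping.
     * Returns false when no mapping exists for the ticket, so the caller can log it.
     */
    boolean onCanceled(String ticketNo) {
        Long incidentId = ticketToIncident.get(ticketNo);
        if (incidentId == null) {
            return false;
        }
        api.resolve(incidentId);
        ticketToIncident.remove(ticketNo);
        return true;
    }
}
```

In both branches the caller would mark the email as read afterwards, as the diagram shows.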
---
## Project Structure
```
hiveops-qds-monitor/
├── pom.xml
├── Dockerfile
└── src/main/
    ├── java/com/hiveops/qdsmonitor/
    │   ├── HiveOpsQdsMonitorApplication.java
    │   ├── config/
    │   │   └── AppConfig.java                 # RestTemplate, Gmail client beans
    │   ├── scheduler/
    │   │   └── GmailPollerService.java        # @Scheduled, calls Gmail API
    │   ├── service/
    │   │   ├── EmailParserService.java        # Parse raw email body → QdsEmailEvent
    │   │   ├── AtmLookupService.java          # GET /api/atms/search
    │   │   ├── AuthService.java               # Login + token cache + auto-refresh
    │   │   └── IncidentApiService.java        # POST /api/incidents, PUT .../status
    │   ├── mapper/
    │   │   └── QdsIncidentMapper.java         # Fault → IncidentType + Severity
    │   ├── dto/
    │   │   ├── QdsEmailEvent.java
    │   │   ├── CreateIncidentRequest.java
    │   │   └── UpdateIncidentRequest.java
    │   └── repository/
    │       └── TicketMappingRepository.java   # H2: ticketNo ↔ hiveops incidentId
    └── resources/
        └── application.properties
```
---
## Key Implementation Details
### 1. Gmail Polling (`GmailPollerService`)
- Use `google-api-services-gmail` Java client library
- OAuth2 credentials (client_id, client_secret, refresh_token) from env vars
- Query: `label:QDS is:unread` — fetch only unread messages
- After processing: mark email as read (`UNREAD` label removed)
- Run on `@Scheduled(fixedDelayString = "${qds.poll.interval-ms:120000}")`
### 2. Email Parsing (`EmailParserService`)
- Detect event type from subject: contains `"Event Occurred"` vs `"Canceled"`
- Parse body as key-value pairs using a regex in multiline mode: `^(\w[\w\s#]+?)\s{2,}(.+)$` (the key class includes `#` so that the `Terminal #` row matches)
- Extract: Terminal #, Fault Contents, Fault Device, Error Code, Date and Time, Ticket No, Priority, Severity
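A minimal sketch of the parsing step, assuming the body arrives as plain text with each key separated from its value by two or more spaces or tabs. `QdsBodyParser` and the `OCCURRED` / `CANCELED` / `UNKNOWN` labels are hypothetical names. The pattern is a per-line variant of the regex above that allows `#` in keys and restricts the gap to spaces and tabs so a key can never span a line break.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QdsBodyParser {
    // Key: a word character followed by word characters, spaces or '#'.
    // Value: everything after a gap of at least two spaces/tabs.
    private static final Pattern ROW =
            Pattern.compile("^(\\w[\\w #]*?)[ \\t]{2,}(\\S.*)$");

    /** Classifies the email by subject, as described in the plan. */
    static String eventType(String subject) {
        if (subject.contains("Event Occurred")) return "OCCURRED";
        if (subject.contains("Canceled")) return "CANCELED";
        return "UNKNOWN";
    }

    /** Returns the key-value rows in document order; rows without a value are skipped. */
    static Map<String, String> parseBody(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : body.split("\\R")) {
            Matcher m = ROW.matcher(line.strip());
            if (m.matches()) {
                fields.put(m.group(1), m.group(2).strip());
            }
        }
        return fields;
    }
}
```

Optional rows such as `Error Code` with no value simply do not appear in the returned map, so callers can treat a missing key as "not provided".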
### 3. Fault → IncidentType Mapping (`QdsIncidentMapper`)
| Fault Device / Contents | IncidentType |
|---|---|
| Card Reader | CARD_READER_FAIL |
| Cash Dispenser + "cassette" | CASSETTE_LOW / CASSETTE_EMPTY |
| Cash Dispenser (other) | DISPENSER_JAM |
| Network | NETWORK_ERROR |
| Power | POWER_FAILURE |
| Item Processing Module | HARDWARE_ERROR |
| Out of service / Service | SOFTWARE_ERROR |
| Default | HARDWARE_ERROR |
QDS Severity → HiveOps Severity:
- `Critical` → `CRITICAL`
- `High` (priority) → `HIGH`
- `Normal` → `MEDIUM`
- `Low` → `LOW`
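The two mappings above could look roughly like this. It is a sketch, not the final mapper, and it rests on several assumptions:

- `QdsIncidentMapperSketch` is a hypothetical name.
- Matching is done by case-insensitive substring.
- The plan does not say how `CASSETTE_LOW` and `CASSETTE_EMPTY` are told apart, so the sketch picks `CASSETTE_EMPTY` only when the fault contents mention "empty".
- Unknown severity levels fall back to `MEDIUM`.

```java
import java.util.Locale;

class QdsIncidentMapperSketch {
    /** Maps the QDS fault device/contents to an incident type name from the table. */
    static String incidentType(String faultDevice, String faultContents) {
        String device = faultDevice == null ? "" : faultDevice.toLowerCase(Locale.ROOT);
        String contents = faultContents == null ? "" : faultContents.toLowerCase(Locale.ROOT);

        if (device.contains("card reader")) return "CARD_READER_FAIL";
        if (device.contains("cash dispenser")) {
            if (contents.contains("cassette")) {
                // Assumption: "empty" in the contents distinguishes EMPTY from LOW.
                return contents.contains("empty") ? "CASSETTE_EMPTY" : "CASSETTE_LOW";
            }
            return "DISPENSER_JAM";
        }
        if (device.contains("network")) return "NETWORK_ERROR";
        if (device.contains("power")) return "POWER_FAILURE";
        if (device.contains("item processing module")) return "HARDWARE_ERROR";
        if (device.contains("service") || contents.contains("service")) return "SOFTWARE_ERROR";
        return "HARDWARE_ERROR"; // default row of the table
    }

    /** Maps a QDS severity/priority level to a HiveOps severity name. */
    static String severity(String qdsLevel) {
        String level = qdsLevel == null ? "" : qdsLevel.trim().toLowerCase(Locale.ROOT);
        switch (level) {
            case "critical": return "CRITICAL";
            case "high":     return "HIGH";
            case "normal":   return "MEDIUM";
            case "low":      return "LOW";
            default:         return "MEDIUM"; // Assumption: fallback for unknown levels
        }
    }
}
```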
### 4. Auth Service (`AuthService`)
- `POST http://hiveops-auth:8082/auth/api/login` with service account credentials
- Cache JWT token + expiry in memory
- Auto-refresh using refresh token before expiry (check on each API call)
- Env vars: `SERVICE_ACCOUNT_EMAIL`, `SERVICE_ACCOUNT_PASSWORD`
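One way to sketch the "cache and refresh before expiry" behaviour is shown below. `CachedTokenProvider`, its `Token` record and the refresh margin are hypothetical. The sketch also folds login and refresh into a single supplier rather than modelling a separate refresh-token call, and it takes the current time from a supplier so the expiry check can be exercised without waiting.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

class CachedTokenProvider {
    /** Hypothetical holder for the JWT and its expiry time. */
    record Token(String jwt, Instant expiresAt) {}

    private final Supplier<Token> login;     // performs the actual auth call
    private final Supplier<Instant> now;     // injectable time source
    private final Duration refreshMargin;    // how early to refresh before expiry
    private Token cached;

    CachedTokenProvider(Supplier<Token> login, Supplier<Instant> now, Duration refreshMargin) {
        this.login = login;
        this.now = now;
        this.refreshMargin = refreshMargin;
    }

    /** Returns the cached JWT, fetching a new one if none exists or expiry is near. */
    synchronized String getJwt() {
        Instant current = now.get();
        boolean stale = cached == null
                || !current.isBefore(cached.expiresAt().minus(refreshMargin));
        if (stale) {
            cached = login.get();
        }
        return cached.jwt();
    }
}
```

Checking on every call keeps the logic free of background threads; the cost is that the first API call after expiry pays for the login round trip.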
### 5. ATM Lookup (`AtmLookupService`)
- `GET http://hiveops-incident:8081/api/atms/search?query={terminalId}`
- Return first match's numeric `id` (Long)
- If no match found: log `WARN` and skip incident creation (don't create orphan incidents)
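The lookup rules above can be sketched without any HTTP client, assuming the search response has already been deserialized into a list. `AtmLookupSketch` and `AtmSummary` are hypothetical names, and the real response shape of `/api/atms/search` is not specified in this plan.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Optional;

class AtmLookupSketch {
    /** Hypothetical shape of one search result; only the id is needed here. */
    record AtmSummary(long id, String terminalId) {}

    private static final System.Logger LOG = System.getLogger("AtmLookupSketch");

    /** Builds the search URI, URL-encoding the terminal id. */
    static URI searchUri(String baseUrl, String terminalId) {
        String query = URLEncoder.encode(terminalId, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/atms/search?query=" + query);
    }

    /** Returns the first match's id, or empty (with a WARN log) when nothing matched. */
    static Optional<Long> firstMatchId(String terminalId, List<AtmSummary> results) {
        if (results == null || results.isEmpty()) {
            LOG.log(System.Logger.Level.WARNING,
                    "No ATM found for terminal {0}; skipping incident creation", terminalId);
            return Optional.empty();
        }
        return Optional.of(results.get(0).id());
    }
}
```

Returning `Optional<Long>` makes the "skip incident creation" branch explicit at the call site instead of relying on a null check.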
### 6. Incident API (`IncidentApiService`)
- Create: `POST http://hiveops-incident:8081/api/incidents`
- Description format: `[QDS] Terminal: {terminalId} | Ticket: #{ticketNo} | {faultContents} | {dateTime}`
- Resolve: `PUT http://hiveops-incident:8081/api/incidents/{id}` with `{ "status": "RESOLVED" }`
### 7. Ticket Mapping (H2 Embedded DB)
- Table: `ticket_mapping(qds_ticket_no VARCHAR PK, hiveops_incident_id BIGINT, created_at)`
- Spring Data JPA with H2 — no external DB needed
- Persist across restarts (H2 file mode: `./data/qds-monitor`)
---
## Incident Description Format
```
[QDS] Terminal: 763330 | Ticket: #287886 | Cash Dispenser Bill Cassette PCU3 Abnormal | 02/23/2026 06:56:58
```
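Producing this string is a one-line format call. `IncidentDescriptions` is a hypothetical helper name, and the arguments are assumed to be the raw strings taken from the parsed email.

```java
class IncidentDescriptions {
    /** Builds the incident description in the "[QDS] Terminal: ... | Ticket: #..." format. */
    static String format(String terminalId, String ticketNo,
                         String faultContents, String dateTime) {
        return String.format("[QDS] Terminal: %s | Ticket: #%s | %s | %s",
                terminalId, ticketNo, faultContents, dateTime);
    }
}
```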
---
## Files to Create
| File | Action |
|---|---|
| `/source/hiveops-src/hiveops-qds-monitor/pom.xml` | New — Spring Boot 3.4, Gmail API, H2 |
| `/source/hiveops-src/hiveops-qds-monitor/Dockerfile` | New — multi-stage Java build |
| `/source/hiveops-src/hiveops-qds-monitor/src/...` | New — all Java source files above |
| `/source/hiveops-src/hiveops-qds-monitor/src/main/resources/application.properties` | New |
## Files to Modify
| File | Change |
|---|---|
| `hiveops/instances/services/docker-compose.yml` | Add `hiveops-qds-monitor` service block |
| `hiveops/instances/services/.env` | Add Gmail OAuth + service account env vars |
| `hiveops/.env` | Add `QDS_MONITOR_VERSION=latest` |
---
## Docker Service Definition (to add to docker-compose.yml)
```yaml
  hiveops-qds-monitor:
    image: ${REGISTRY_URL}/hiveops-qds-monitor:${QDS_MONITOR_VERSION:-latest}
    restart: unless-stopped
    environment:
      - GMAIL_CLIENT_ID=${GMAIL_CLIENT_ID}
      - GMAIL_CLIENT_SECRET=${GMAIL_CLIENT_SECRET}
      - GMAIL_REFRESH_TOKEN=${GMAIL_REFRESH_TOKEN}
      - GMAIL_LABEL_NAME=QDS
      - INCIDENT_API_URL=http://hiveops-incident:8081
      - AUTH_API_URL=http://hiveops-auth:8082
      - SERVICE_ACCOUNT_EMAIL=${QDS_SERVICE_ACCOUNT_EMAIL}
      - SERVICE_ACCOUNT_PASSWORD=${QDS_SERVICE_ACCOUNT_PASSWORD}
      - POLL_INTERVAL_MS=120000
    volumes:
      - ./data/qds-monitor:/app/data
    depends_on:
      - hiveops-incident
      - hiveops-auth
    networks:
      - hiveops-network
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    healthcheck:
      test: ["CMD", "bash", "-c", "exec 3<>/dev/tcp/127.0.0.1/8080 && echo -e 'GET /actuator/health HTTP/1.0\r\n\r\n' >&3 && cat <&3 | grep -q 'UP'"]
      interval: 30s
      timeout: 10s
      retries: 3
```
---
## Required Gmail OAuth Setup (one-time, before deployment)
Before deploying, a one-time OAuth2 flow must be run to generate a refresh token for `hiveops@branchcos.com`:
1. Create a Google Cloud project with Gmail API enabled
2. Create OAuth2 credentials (Desktop app type)
3. Run OAuth consent flow to get `refresh_token`
4. Store `client_id`, `client_secret`, `refresh_token` in `.env`
---
## Verification
1. Build the jar locally: `mvn clean package -DskipTests`
2. Run locally with env vars set, verify it connects to Gmail and logs emails found
3. Deploy to docker-compose, check logs: `docker compose logs -f hiveops-qds-monitor`
4. Send a test "Event Occurred" email that lands under the QDS label and confirm an incident is created in the hiveops-incident UI
5. Send a "Canceled" test email with the same Ticket No and confirm the incident status changes to RESOLVED
6. Restart the service and check that the H2 ticket mapping persisted