feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized
Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
services/dashboard/.github/copilot-instructions.md
# Copilot Instructions — Bot Detector Dashboard

## Architecture Overview

This is a **SOC (Security Operations Center) dashboard** for visualizing bot detections from an upstream `bot_detector_ai` service. It is a **single-service, full-stack app**: the FastAPI backend serves the built React frontend as static files *and* exposes a REST API, all on port 8000. There is no separate frontend server in production and **no authentication**.

**Data source:** ClickHouse database (`mabase_prod`), primarily the `ml_detected_anomalies` table and the `view_dashboard_entities` view.
```
dashboard/
├── backend/                    # Python 3.11 + FastAPI — REST API + static file serving
│   ├── main.py                 # App entry point: CORS, router registration, SPA catch-all
│   ├── config.py               # pydantic-settings Settings, reads .env
│   ├── database.py             # ClickHouseClient singleton (db)
│   ├── models.py               # All Pydantic v2 response models
│   ├── routes/                 # One module per domain: metrics, detections, variability,
│   │                           # attributes, analysis, entities, incidents, audit, reputation
│   └── services/
│       └── reputation_ip.py    # Async httpx → ip-api.com + ipinfo.io (no API keys)
└── frontend/                   # React 18 + TypeScript 5 + Vite 5 + Tailwind CSS 3
    └── src/
        ├── App.tsx             # BrowserRouter + Sidebar + TopHeader + all Routes
        ├── ThemeContext.tsx    # dark/light/auto, persisted to localStorage (key: soc_theme)
        ├── api/client.ts       # Axios instance (baseURL: /api) + all TS interfaces
        ├── components/         # One component per route view + shared panels + ui/
        ├── hooks/              # useMetrics, useDetections, useVariability (polling wrappers)
        └── utils/STIXExporter.ts
```
## Dev Commands

```bash
# Backend (run from repo root)
pip install -r requirements.txt
python -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

# Frontend (separate terminal)
cd frontend && npm install
npm run dev       # :3000 with HMR, proxies /api → localhost:8000
npm run build     # tsc type-check + vite build → frontend/dist/
npm run preview   # preview the production build

# Docker (production)
docker compose up -d dashboard_web
docker compose build dashboard_web && docker compose up -d dashboard_web
docker compose logs -f dashboard_web
```
There is no test suite or linter configured (no pytest, vitest, ESLint, Black, etc.).

```bash
# Manual smoke tests
curl http://localhost:8000/health
curl http://localhost:8000/api/metrics | jq '.summary'
curl "http://localhost:8000/api/detections?page=1&page_size=5" | jq '.items | length'
```
## Key Conventions

### Backend

- **All routes are raw SQL** — no ORM. Results are accessed by positional index: `result.result_rows[0][n]`. Column order is determined by the `SELECT` statement.
- **Query parameters** use `%(name)s` dict syntax: `db.query(sql, {"param": value})`.
- **Every router module** defines `router = APIRouter(prefix="/api/<domain>", tags=["..."])` and is registered in `main.py` via `app.include_router(...)`.
- **SPA catch-all** (`/{full_path:path}`) **must remain the last registered route** in `main.py`. New routers must be added with `app.include_router()` before it.
- **IPv4 IPs** are stored as IPv6-mapped (`::ffff:x.x.x.x`) in `src_ip`; queries normalize with `replaceRegexpAll(toString(src_ip), '^::ffff:', '')`.
- **NULL guards** — all row fields are coalesced: `row[n] or ""`, `row[n] or 0`, `row[n] or "LOW"`.
- **`anomaly_score`** can be negative in the DB; always normalize with `abs()` for display.
- **`analysis.py`** stores SOC classifications in a `classifications` ClickHouse table. The `audit_logs` table is optional — routes silently return empty results if absent.
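The backend conventions above (parameter dicts, positional row access, NULL guards, `abs()` on `anomaly_score`, IPv6-mapped prefix stripping) can be sketched in a few lines. This is a minimal illustration and not code from the repo: the column order, the `threat_level` and `ja4` column names, and the helpers `normalize_ip` and `row_to_detection` are assumptions, and the sample tuples stand in for `result.result_rows`.

```python
from typing import Any, Dict, Sequence

# Illustrative query only; the column list (and therefore the positional
# indexes used below) is an assumption, not a query from the codebase.
SQL = """
SELECT src_ip, anomaly_score, threat_level, ja4
FROM ml_detected_anomalies
WHERE threat_level = %(level)s
"""
PARAMS = {"level": "HIGH"}
# In a route this would run as: db.query(SQL, PARAMS).result_rows


def normalize_ip(value: Any) -> str:
    """Python mirror of replaceRegexpAll(toString(src_ip), '^::ffff:', '')."""
    text = str(value or "")
    prefix = "::ffff:"
    return text[len(prefix):] if text.startswith(prefix) else text


def row_to_detection(row: Sequence[Any]) -> Dict[str, Any]:
    """Map one positional row to a dict, applying the NULL guards."""
    return {
        "src_ip": normalize_ip(row[0]),
        "anomaly_score": abs(row[1] or 0),  # scores can be negative in the DB
        "threat_level": row[2] or "LOW",
        "ja4": row[3] or "",
    }


# Stand-in for result.result_rows (hypothetical values).
rows = [
    ("::ffff:192.0.2.10", -0.82, "HIGH", "ja4-placeholder"),
    ("2001:db8::1", None, None, None),
]
detections = [row_to_detection(r) for r in rows]
```

Because access is positional, any change to the `SELECT` column list has to be mirrored in the indexes used by the mapping function.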
### Frontend

- **API calls** use the axios instance from `src/api/client.ts` (baseURL `/api`) or direct `fetch('/api/...')`. There is **no global state manager** — components use `useState`/`useEffect` or custom hooks directly.
- **TypeScript interfaces** in `client.ts` mirror the Pydantic models in `backend/models.py`. Both must be kept in sync when changing data shapes.
- **Tailwind uses semantic CSS-variable tokens** — always use `bg-background`, `bg-background-secondary`, `bg-background-card`, `text-text-primary`, `text-text-secondary`, `text-text-disabled`, `bg-accent-primary`, `threat-critical/high/medium/low` rather than raw Tailwind color classes (e.g., `slate-800`). This ensures dark/light theme compatibility.
- **Threat level taxonomy**: `CRITICAL` > `HIGH` > `MEDIUM` > `LOW` — always uppercase strings; colors: red / orange / yellow / green.
- **URL encoding**: entity values with special characters (JA4 fingerprints, subnets) are `encodeURIComponent`-encoded. Subnets use `_24` in place of `/24` (e.g., `/entities/subnet/141.98.11.0_24`).
- **Recent investigations** are stored in `localStorage` under `soc_recent_investigations` (max 8). Tracked by `RouteTracker` component. Only types `ip`, `ja4`, `subnet` are tracked.
- **Auto-refresh**: metrics every 30 s, incidents every 60 s.
- **French UI text** — all user-facing strings and log messages are in French; code identifiers are in English.
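The URL-encoding rule for entity values can be illustrated outside the browser. The sketch below is written in Python rather than the frontend's TypeScript, and it uses `urllib.parse.quote(value, safe="")` as a rough stand-in for `encodeURIComponent` (the two differ on a few characters such as `!`, `*`, `'`, `(` and `)`). The helpers `encode_entity` and `decode_entity` are hypothetical names, not functions from the repo.

```python
from urllib.parse import quote, unquote


def encode_entity(entity_type: str, value: str) -> str:
    """Build the path segment for an entity value."""
    if entity_type == "subnet":
        # Subnets carry the prefix length after "_" instead of "/".
        value = value.replace("/", "_")
    return quote(value, safe="")


def decode_entity(entity_type: str, segment: str) -> str:
    """Reverse encode_entity for a path segment."""
    value = unquote(segment)
    if entity_type == "subnet":
        network, sep, prefix_len = value.rpartition("_")
        if sep:
            value = f"{network}/{prefix_len}"
    return value


path = f"/entities/subnet/{encode_entity('subnet', '141.98.11.0/24')}"
round_trip = decode_entity("subnet", "141.98.11.0_24")
encoded_ip = encode_entity("ip", "2001:db8::1")
```

Swapping `/` for `_` keeps the subnet in a single path segment, so it cannot be mistaken for an extra path level.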
### Frontend → Backend in Dev vs Production

- **Dev**: Vite dev server on `:3000` proxies `/api/*` to `http://localhost:8000` (see `vite.config.ts`).
- **Production**: React SPA is served by FastAPI from `frontend/dist/`. API calls hit the same origin at `:8000` — no proxy needed.

### Docker

- Single service using `network_mode: "host"` — no port mapping; the container shares the host network stack.
- Multi-stage Dockerfile: `node:20-alpine` builds the frontend → `python:3.11-slim` installs deps → final image copies both.
## Environment Variables (`.env`)

| Variable | Default | Description |
|---|---|---|
| `CLICKHOUSE_HOST` | `clickhouse` | ClickHouse hostname |
| `CLICKHOUSE_PORT` | `8123` | ClickHouse HTTP port (set in code) |
| `CLICKHOUSE_DB` | `mabase_prod` | Database name |
| `CLICKHOUSE_USER` | `admin` | ClickHouse username |
| `CLICKHOUSE_PASSWORD` | (empty) | ClickHouse password |
| `API_HOST` | `0.0.0.0` | Uvicorn bind host |
| `API_PORT` | `8000` | Uvicorn bind port |
| `CORS_ORIGINS` | `["http://localhost:3000", ...]` | Allowed origins |

> ⚠️ The `.env` file contains real credentials — never commit it to public repos.
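As a rough illustration of how the defaults in the table resolve, here is a stdlib-only sketch. The actual `config.py` uses pydantic-settings, so the `Settings` dataclass and `load_settings` function below are stand-ins, not the real implementation. Two assumptions are made: `CLICKHOUSE_PORT` is treated as a code constant (one reading of "set in code"), and `CORS_ORIGINS` is left out because its default is elided in the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: "set in code" means the port is not read from the environment.
CLICKHOUSE_PORT = 8123


@dataclass(frozen=True)
class Settings:
    clickhouse_host: str
    clickhouse_port: int
    clickhouse_db: str
    clickhouse_user: str
    clickhouse_password: str
    api_host: str
    api_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Resolve each variable from env, falling back to the documented default."""
    return Settings(
        clickhouse_host=env.get("CLICKHOUSE_HOST", "clickhouse"),
        clickhouse_port=CLICKHOUSE_PORT,
        clickhouse_db=env.get("CLICKHOUSE_DB", "mabase_prod"),
        clickhouse_user=env.get("CLICKHOUSE_USER", "admin"),
        clickhouse_password=env.get("CLICKHOUSE_PASSWORD", ""),
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
    )


# An empty mapping yields the documented defaults; overrides win when present.
defaults = load_settings({})
overridden = load_settings({"CLICKHOUSE_HOST": "db.internal", "API_PORT": "9000"})
```

Passing the environment in as a mapping keeps the sketch testable without touching the real process environment; `db.internal` is a made-up hostname.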
## ClickHouse Tables

| Table / View | Used by |
|---|---|
| `ml_detected_anomalies` | Primary source for detections, metrics, variability, analysis |
| `view_dashboard_entities` | User agents, client headers, paths, query params (entities routes) |
| `classifications` | SOC analyst classifications (created by `analysis.py`) |
| `mabase_prod.audit_logs` | Audit trail (optional — missing table is handled silently) |
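The "handled silently" behavior for the optional `audit_logs` table could look roughly like the sketch below. It is an assumption about the pattern, not the code in the audit routes: `query_optional` is a hypothetical helper, the broad `except Exception` stands in for whatever driver error the real code catches, and the two callables simulate a missing and a present table.

```python
from typing import Any, Callable, List, Sequence

Rows = List[Sequence[Any]]


def query_optional(run_query: Callable[[], Rows]) -> Rows:
    """Run a query against an optional table; return no rows if it fails."""
    try:
        return run_query()
    except Exception:
        # A missing table is not treated as an error; the route returns
        # an empty result instead of a 500.
        return []


def missing_table() -> Rows:
    # Simulates the error raised when audit_logs does not exist.
    raise RuntimeError("table mabase_prod.audit_logs does not exist")


def present_table() -> Rows:
    # Hypothetical row shape, for illustration only.
    return [("analyst", "classification_created")]


empty = query_optional(missing_table)
found = query_optional(present_table)
```

One trade-off of swallowing the error is that connection failures and a genuinely absent table look identical to the caller, so a narrower exception type may be preferable in practice.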