dashboard/.github/copilot-instructions.md
2026-03-15 23:10:35 +01:00


Copilot Instructions — Bot Detector Dashboard

Architecture Overview

This is a SOC (Security Operations Center) dashboard for visualizing bot detections from an upstream bot_detector_ai service. It is a single-service, full-stack app: the FastAPI backend serves the built React frontend as static files and exposes a REST API, all on port 8000. There is no separate frontend server in production and no authentication.

Data source: ClickHouse database (mabase_prod), primarily the ml_detected_anomalies table and the view_dashboard_entities view.

dashboard/
├── backend/           # Python 3.11 + FastAPI — REST API + static file serving
│   ├── main.py        # App entry point: CORS, router registration, SPA catch-all
│   ├── config.py      # pydantic-settings Settings, reads .env
│   ├── database.py    # ClickHouseClient singleton (db)
│   ├── models.py      # All Pydantic v2 response models
│   ├── routes/        # One module per domain: metrics, detections, variability,
│   │                  # attributes, analysis, entities, incidents, audit, reputation
│   └── services/
│       └── reputation_ip.py  # Async httpx → ip-api.com + ipinfo.io (no API keys)
└── frontend/          # React 18 + TypeScript 5 + Vite 5 + Tailwind CSS 3
    └── src/
        ├── App.tsx         # BrowserRouter + Sidebar + TopHeader + all Routes
        ├── ThemeContext.tsx # dark/light/auto, persisted to localStorage (key: soc_theme)
        ├── api/client.ts   # Axios instance (baseURL: /api) + all TS interfaces
        ├── components/     # One component per route view + shared panels + ui/
        ├── hooks/          # useMetrics, useDetections, useVariability (polling wrappers)
        └── utils/STIXExporter.ts

Dev Commands

# Backend (run from repo root)
pip install -r requirements.txt
python -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000

# Frontend (separate terminal)
cd frontend && npm install
npm run dev        # :3000 with HMR, proxies /api → localhost:8000
npm run build      # tsc type-check + vite build → frontend/dist/
npm run preview    # preview the production build

# Docker (production)
docker compose up -d dashboard_web
docker compose build dashboard_web && docker compose up -d dashboard_web
docker compose logs -f dashboard_web

There is no test suite or linter configured (no pytest, vitest, ESLint, Black, etc.).

# Manual smoke tests
curl http://localhost:8000/health
curl http://localhost:8000/api/metrics | jq '.summary'
curl "http://localhost:8000/api/detections?page=1&page_size=5" | jq '.items | length'

Key Conventions

Backend

  • All routes use raw SQL, with no ORM. Results are accessed by positional index: result.result_rows[0][n]. Column order is determined by the SELECT statement, so reordering the SELECT list requires updating every index that reads from it.
  • Query parameters use %(name)s dict syntax: db.query(sql, {"param": value}).
  • Every router module defines router = APIRouter(prefix="/api/<domain>", tags=["..."]) and is registered in main.py via app.include_router(...).
  • SPA catch-all (/{full_path:path}) must remain the last registered route in main.py. New routers must be added with app.include_router() before it.
  • IPv4 addresses are stored in IPv6-mapped form (::ffff:x.x.x.x) in src_ip; queries normalize them with replaceRegexpAll(toString(src_ip), '^::ffff:', '').
  • NULL guards — all row fields are coalesced: row[n] or "", row[n] or 0, row[n] or "LOW".
  • anomaly_score can be negative in the DB; always normalize with abs() for display.
  • analysis.py stores SOC classifications in a classifications ClickHouse table. The audit_logs table is optional — routes silently return empty results if absent.
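
The row-handling conventions above (positional access, NULL guards, abs() on anomaly_score, stripping the IPv6-mapped prefix) can be sketched in plain Python. This is a minimal illustration, not code from the repository: the function names, the four-column SELECT order, and the country field are hypothetical.

```python
import ipaddress
from typing import Any, Sequence

MAPPED_PREFIX = "::ffff:"


def normalize_ip(raw: Any) -> str:
    """Return the dotted IPv4 form for IPv6-mapped values, else the value as text."""
    # Address objects: use ipv4_mapped rather than str(), whose output format
    # for mapped addresses differs between Python versions.
    if isinstance(raw, ipaddress.IPv6Address) and raw.ipv4_mapped is not None:
        return str(raw.ipv4_mapped)
    text = str(raw)  # convert before using any string method
    return text[len(MAPPED_PREFIX):] if text.startswith(MAPPED_PREFIX) else text


def row_to_detection(row: Sequence[Any]) -> dict[str, Any]:
    """Map one result row to a dict.

    Assumed (hypothetical) SELECT order:
    src_ip, anomaly_score, threat_level, country
    """
    return {
        "src_ip": normalize_ip(row[0] or ""),
        "anomaly_score": abs(row[1] or 0),   # scores can be negative in the DB
        "threat_level": row[2] or "LOW",     # NULL guard with a default level
        "country": row[3] or "",
    }


if __name__ == "__main__":
    print(row_to_detection(("::ffff:141.98.11.5", -0.42, None, None)))
```

Keeping the mapping in one small function per query means a change to the SELECT list touches a single place where indices are read.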

Frontend

  • API calls use the axios instance from src/api/client.ts (baseURL /api) or direct fetch('/api/...'). There is no global state manager — components use useState/useEffect or custom hooks directly.
  • TypeScript interfaces in client.ts mirror the Pydantic models in backend/models.py. Both must be kept in sync when changing data shapes.
  • Tailwind uses semantic CSS-variable tokens — always use bg-background, bg-background-secondary, bg-background-card, text-text-primary, text-text-secondary, text-text-disabled, bg-accent-primary, threat-critical/high/medium/low rather than raw Tailwind color classes (e.g., slate-800). This ensures dark/light theme compatibility.
  • Threat level taxonomy: CRITICAL > HIGH > MEDIUM > LOW — always uppercase strings; colors: red / orange / yellow / green.
  • URL encoding: entity values with special characters (JA4 fingerprints, subnets) are encodeURIComponent-encoded. Subnets use _24 in place of /24 (e.g., /entities/subnet/141.98.11.0_24).
  • Recent investigations are stored in localStorage under soc_recent_investigations (max 8) and tracked by the RouteTracker component. Only the types ip, ja4, and subnet are tracked.
  • Auto-refresh: metrics every 30 s, incidents every 60 s.
  • French UI text — all user-facing strings and log messages are in French; code identifiers are in English.
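
The subnet path convention and the threat ordering listed above are easy to get subtly wrong, so here is a small Python sketch of both. It is an illustration only: the frontend itself is TypeScript and uses encodeURIComponent, for which urllib.parse.quote with safe='' is used here as a close stand-in, and all function and constant names are hypothetical.

```python
import ipaddress
from urllib.parse import quote, unquote

# Highest severity first, always uppercase.
THREAT_RANK = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


def subnet_to_segment(cidr: str) -> str:
    """'141.98.11.0/24' -> '141.98.11.0_24' (the slash cannot sit in a path segment)."""
    network = ipaddress.ip_network(cidr, strict=True)
    return f"{network.network_address}_{network.prefixlen}"


def segment_to_subnet(segment: str) -> str:
    """Inverse of subnet_to_segment: '141.98.11.0_24' -> '141.98.11.0/24'."""
    address, _, prefix = unquote(segment).rpartition("_")
    return str(ipaddress.ip_network(f"{address}/{prefix}"))


def entity_path(entity_type: str, value: str) -> str:
    """Build an /entities/<type>/<value> path with the value percent-encoded."""
    if entity_type == "subnet":
        value = subnet_to_segment(value)
    return f"/entities/{entity_type}/{quote(value, safe='')}"


def sort_by_threat(levels: list[str]) -> list[str]:
    """Order threat levels from CRITICAL down to LOW."""
    return sorted(levels, key=THREAT_RANK.index)


if __name__ == "__main__":
    print(entity_path("subnet", "141.98.11.0/24"))
    print(segment_to_subnet("141.98.11.0_24"))
    print(sort_by_threat(["LOW", "CRITICAL", "MEDIUM"]))
```

Validating through ipaddress in both directions means a malformed segment raises ValueError instead of reaching a query.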

Frontend → Backend in Dev vs Production

  • Dev: Vite dev server on :3000 proxies /api/* to http://localhost:8000 (see vite.config.ts).
  • Production: React SPA is served by FastAPI from frontend/dist/. API calls hit the same origin at :8000 — no proxy needed.

Docker

  • Single service using network_mode: "host" — no port mapping; the container shares the host network stack.
  • Multi-stage Dockerfile: node:20-alpine builds the frontend → python:3.11-slim installs deps → final image copies both.

Environment Variables (.env)

| Variable | Default | Description |
|---|---|---|
| CLICKHOUSE_HOST | clickhouse | ClickHouse hostname |
| CLICKHOUSE_PORT | 8123 | ClickHouse HTTP port (set in code) |
| CLICKHOUSE_DB | mabase_prod | Database name |
| CLICKHOUSE_USER | admin | |
| CLICKHOUSE_PASSWORD | (empty) | |
| API_HOST | 0.0.0.0 | Uvicorn bind host |
| API_PORT | 8000 | Uvicorn bind port |
| CORS_ORIGINS | ["http://localhost:3000", ...] | Allowed origins |

⚠️ The .env file contains real credentials — never commit it to public repos.
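
To make the variable names and defaults concrete, the sketch below loads them with the standard library only. It is a stand-in for illustration, not the project's config.py, which uses pydantic-settings. The Settings field names and load_settings are hypothetical, CORS_ORIGINS is assumed to be a JSON list, and only the first origin shown in the source is used as its default because the rest is elided.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    clickhouse_host: str = "clickhouse"
    clickhouse_port: int = 8123  # the source notes this value is set in code
    clickhouse_db: str = "mabase_prod"
    clickhouse_user: str = "admin"
    clickhouse_password: str = ""
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    # Remaining default origins are elided in the source and not guessed here.
    cors_origins: tuple[str, ...] = ("http://localhost:3000",)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping, falling back to defaults."""
    d = Settings()
    origins = (
        tuple(json.loads(env["CORS_ORIGINS"]))  # assumed JSON list syntax
        if "CORS_ORIGINS" in env
        else d.cors_origins
    )
    return Settings(
        clickhouse_host=env.get("CLICKHOUSE_HOST", d.clickhouse_host),
        clickhouse_port=int(env.get("CLICKHOUSE_PORT", d.clickhouse_port)),
        clickhouse_db=env.get("CLICKHOUSE_DB", d.clickhouse_db),
        clickhouse_user=env.get("CLICKHOUSE_USER", d.clickhouse_user),
        clickhouse_password=env.get("CLICKHOUSE_PASSWORD", d.clickhouse_password),
        api_host=env.get("API_HOST", d.api_host),
        api_port=int(env.get("API_PORT", d.api_port)),
        cors_origins=origins,
    )


if __name__ == "__main__":
    print(load_settings({"API_PORT": "9000"}))
```

Passing the environment as a mapping keeps the loader easy to exercise without touching the real .env file or its credentials.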

ClickHouse Tables

| Table / View | Used by |
|---|---|
| ml_detected_anomalies | Primary source for detections, metrics, variability, analysis |
| view_dashboard_entities | User agents, client headers, paths, query params (entities routes) |
| classifications | SOC analyst classifications (created by analysis.py) |
| mabase_prod.audit_logs | Audit trail (optional; a missing table is handled silently) |
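
One possible way to handle an optional table silently is sketched below. How the routes actually detect the missing table is not stated here, so this is only an assumption: the helper name rows_or_empty is hypothetical, and the check relies on the driver's error text containing ClickHouse's UNKNOWN_TABLE error name.

```python
from typing import Any, Callable, Sequence


def rows_or_empty(run_query: Callable[[], Sequence[Sequence[Any]]]) -> list:
    """Run a query callable and return its rows, or [] if the table is missing.

    Any other failure is re-raised so real errors are not hidden.
    """
    try:
        return list(run_query())
    except Exception as exc:
        # Assumption: the error message carries the UNKNOWN_TABLE error name.
        if "UNKNOWN_TABLE" in str(exc):
            return []
        raise


if __name__ == "__main__":
    def missing_table() -> list:
        raise RuntimeError(
            "Code: 60. DB::Exception: Table mabase_prod.audit_logs "
            "does not exist. (UNKNOWN_TABLE)"
        )

    print(rows_or_empty(missing_table))
    print(rows_or_empty(lambda: [("2026-03-15", "login")]))
```

Matching on one specific error keeps connection failures and SQL mistakes visible instead of turning them into empty audit pages.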