Copilot Instructions — ja4-platform
What is this?
A monorepo for a JA4/JA3 TLS fingerprinting security pipeline. Five services capture network traffic, correlate logs, detect bots via ML, and present results in a SOC dashboard, all backed by ClickHouse.
Data flow: mod-reqin-log (Apache HTTP logs) → unix socket → correlator ← unix socket ← sentinel (TLS/TCP capture) → ClickHouse → bot-detector (ML scoring) → dashboard (FastAPI SOC UI)
Build, test, lint
All builds run in Docker — no native Go/Python/C toolchain required on the host.
# Full suite
make test-all # run all tests (Docker)
make build-all # build all service images
make rpm-all # build RPMs (sentinel, correlator, mod-reqin-log) for el8/el9/el10
# Per-service tests
make test-sentinel # Go tests (needs --cap-add=NET_RAW inside)
make test-correlator # Go tests with 60% coverage gate
make test-bot-detector # Python pytest
make test-dashboard # Python pytest
make test-ja4common-python # Python pytest (shared lib)
make test-mod-reqin-log # C cmocka tests
# Single Go test (from service dir, or via Docker):
docker run --rm -v $(pwd):/build -w /build/services/correlator golang:1.24 \
go test -v -run TestConfigLoad ./internal/config/
# Single Python test (from repo root):
docker build -f services/dashboard/Dockerfile.tests -t dash-tests .
docker run --rm dash-tests pytest backend/tests/test_metrics.py -v -k test_health
# Faster correlator build (skip tests):
docker build --target builder --build-arg SKIP_TESTS=true -f services/correlator/Dockerfile .
# Linting (Go only — no Python linter configured)
cd services/sentinel && go vet ./... && gofmt -l .
cd services/correlator && go vet ./... && gofmt -l .
# Full-stack integration tests (Docker Compose, resets DB each run)
make test-integration # runs tests/integration/run-tests.sh → down -v + up + traffic + verify
make test-integration-keep # same but leaves stack running after
make test-integration-down # tear down integration stack
Architecture
Go workspace (go.work, Go 1.24.6)
Three modules in the workspace:
- services/sentinel: TLS/TCP packet capture daemon (gopacket/pcap, systemd)
- services/correlator: log correlation engine, hexagonal architecture
- shared/go/ja4common: shared logger, config, shutdown, ipfilter
Both services have a replace directive in their go.mod pointing to ../../shared/go/ja4common. The workspace takes precedence for local dev; the replace is needed for Docker builds.
Correlator hexagonal architecture
ports/source.go → EventSource, CorrelatedLogSink, CorrelationProcessor interfaces
adapters/inbound/ → unixsocket (reads from sentinel + mod-reqin-log)
adapters/outbound/ → clickhouse, file, stdout, multi (fan-out wrapper)
domain/ → CorrelationService, CorrelatedLog, NormalizedEvent
app/ → Orchestrator (wires everything together)
config/ → YAML config loader
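To make the port/adapter split concrete, here is a minimal sketch of the fan-out idea behind the multi adapter. It is written in Python purely for illustration: the correlator itself is Go, and the write method, the MultiSink and ListSink classes, and the dict-shaped record are assumptions, not the real interfaces in ports/source.go.

```python
from typing import Protocol


class CorrelatedLogSink(Protocol):
    """Outbound port: anything that can accept a correlated log record."""

    def write(self, record: dict) -> None: ...


class ListSink:
    """Hypothetical in-memory adapter, standing in for clickhouse/file/stdout."""

    def __init__(self) -> None:
        self.records: list[dict] = []

    def write(self, record: dict) -> None:
        self.records.append(record)


class MultiSink:
    """Fan-out wrapper: forwards every record to each wrapped sink in order."""

    def __init__(self, *sinks: CorrelatedLogSink) -> None:
        self._sinks = sinks

    def write(self, record: dict) -> None:
        for sink in self._sinks:
            sink.write(record)


# Both wrapped sinks receive the same record through a single write call.
primary, secondary = ListSink(), ListSink()
MultiSink(primary, secondary).write({"src_ip": "192.0.2.10", "ja4": "example"})
```

The point of the pattern is that the domain code only sees one sink, so adding or removing an output is a wiring change in the orchestrator rather than a change to correlation logic.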
Python services
- bot-detector: scikit-learn IsolationForest + DBSCAN. Single monolithic module (bot_detector.py). Uses os.getenv() directly for config, NOT pydantic-settings.
- dashboard: FastAPI + React SPA. 20 route modules in backend/routes/. Uses pydantic-settings (backend/config.py).
- shared/python/ja4_common: ClickHouseClient singleton + ClickHouseSettings (pydantic-settings). Installed as a local package in each Python Dockerfile.
C module
- mod-reqin-log: Apache HTTPD module (C11, built with apxs). Logs HTTP requests as JSON to a Unix socket. Tests use cmocka.
ClickHouse dual-database pattern
Two configurable databases (env vars with defaults):
| Env var | Default | Contains |
|---|---|---|
| CLICKHOUSE_DB_LOGS | ja4_logs | http_logs_raw, http_logs, mv_http_logs |
| CLICKHOUSE_DB_PROCESSING | ja4_processing | Aggregations, ML tables, views, dicts, audit |
Cross-database references exist — materialized views in one DB read from the other:
- ja4_logs.mv_http_logs references ja4_processing.dict_anubis_* and ja4_processing.dict_iplocate_asn
- ja4_processing.mv_agg_* reads FROM ja4_logs.http_logs
In Python code, always use fully qualified table names:
from ..config import settings
query = f"SELECT ... FROM {settings.CLICKHOUSE_DB_PROCESSING}.ml_detected_anomalies ..."
query = f"SELECT ... FROM {settings.CLICKHOUSE_DB_LOGS}.http_logs ..."
Never hardcode database names in queries.
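A stdlib-only sketch of the same rule, for code that does not have the settings object at hand. The env var names and defaults come from the table above; the qualified helper is hypothetical and not part of the repo.

```python
import os

# Defaults match the documented ones; the environment can override them.
DB_LOGS = os.getenv("CLICKHOUSE_DB_LOGS", "ja4_logs")
DB_PROCESSING = os.getenv("CLICKHOUSE_DB_PROCESSING", "ja4_processing")


def qualified(database: str, table: str) -> str:
    """Return `database.table` (hypothetical helper, not part of the repo)."""
    return f"{database}.{table}"


# The database name is resolved from configuration, never typed into the query.
query = f"SELECT count() FROM {qualified(DB_LOGS, 'http_logs')}"
```

With the defaults in place the query targets ja4_logs.http_logs; a deployment that renames the database only has to set CLICKHOUSE_DB_LOGS.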
In Go (correlator), the database is part of the ClickHouse DSN (clickhouse://user:pass@host:9000/ja4_logs). The target table is configurable via YAML (outputs.clickhouse.table).
SQL migrations live in shared/clickhouse/ (10 ordered files). Deploy with shared/clickhouse/deploy_schema.sh which substitutes DB names from env vars.
Key conventions
Docker-first builds
Every service has Dockerfile (prod), Dockerfile.dev or Dockerfile.tests (tests), and Go/C services have Dockerfile.package (RPM packaging via 3-stage: builder → rpmbuild × 3 distros → alpine output).
Go config: YAML + env vars
- Sentinel: config.yml, env prefix JA4SENTINEL_
- Correlator: config.yml, env prefix LOGCORRELATOR_
- Both support SIGHUP for log rotation
Python config: pydantic-settings
- Dashboard: backend/config.py → Settings(BaseSettings) with .env file
- ja4_common: ClickHouseSettings(BaseSettings), singleton at settings
- bot-detector: exception; uses raw os.getenv(), not pydantic-settings
Dashboard route structure
Every route file follows this pattern:
from fastapi import APIRouter, HTTPException, Query
from ..config import settings
from ..database import db

router = APIRouter()

@router.get("/api/something")
async def get_something():
    query = f"SELECT ... FROM {settings.CLICKHOUSE_DB_PROCESSING}.table_name ..."
    result = db.query(query)
    ...
RPM spec files
Located at services/<name>/packaging/rpm/<name>.spec. Version injected via --define "build_version X.Y.Z" at build time.
Inter-service communication
Services communicate via Unix sockets, not HTTP:
- sentinel → /var/run/logcorrelator/network.socket → correlator (source B: TLS/TCP data)
- mod-reqin-log → /var/run/logcorrelator/http.socket → correlator (source A: HTTP data)
- correlator → ClickHouse (batch inserts into ja4_logs.http_logs_raw)
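For readers unfamiliar with Unix-socket IPC, the round trip looks roughly like this. The sketch assumes a stream socket carrying newline-delimited JSON, which this document does not confirm; the event fields and the temporary socket path are also placeholders.

```python
import json
import os
import socket
import tempfile

# Temporary path for the demo; the real sockets live under /var/run/logcorrelator/.
sock_dir = tempfile.mkdtemp()
sock_path = os.path.join(sock_dir, "http.socket")

# Receiver side (the correlator's role): bind and listen on the socket path.
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)

# Sender side (the role of mod-reqin-log or sentinel): connect and write one event.
client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(sock_path)
event = {"method": "GET", "path": "/", "src_ip": "192.0.2.10"}  # hypothetical fields
client.sendall(json.dumps(event).encode() + b"\n")
client.close()

# Receiver accepts the queued connection, reads up to the newline, decodes the JSON.
conn, _ = server.accept()
with conn, conn.makefile("rb") as stream:
    received = json.loads(stream.readline())

# Clean up the listening socket and its temporary directory.
server.close()
os.unlink(sock_path)
os.rmdir(sock_dir)
```

Because the transport is a filesystem path, access control is a matter of file permissions on the socket, and nothing is exposed on a TCP port.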
Sentinel requires elevated privileges
Tests need --cap-add=NET_RAW --cap-add=NET_ADMIN for packet capture (pcap).
Comments standard
All code is commented in French (identifiers stay in English). Standard defined in docs/commenting-standard.md:
- Go: godoc // FuncName does X, package-level // Package foo fournit...
- Python: PEP-257 triple-quoted French docstrings on all functions/classes/modules
- C: Doxygen /** @brief ... @param ... @return ... */ before every function, /* ====== Section ====== */ banners
- Bash: standardized header block with Usage: and Variables d'environnement:
- SQL: -- === filename.sql — description === banner + -- --- Table --- section headers
Known gotchas
go.work and Docker build contexts
When building either sentinel or correlator in Docker, the build context must include both service directories because go.work references them both. The root-level Makefiles always use . (repo root) as context — don't change this.
Correlator YAML does not expand env vars
Go's YAML parser performs no shell-style interpolation, so it reads ${VAR:-default} as a literal string. Write literal values directly in the YAML file. This is why tests/integration/platform/correlator.yml has a hardcoded DSN.
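To show exactly which substitution is missing, the sketch below implements the shell-style ${VAR} / ${VAR:-default} expansion that the YAML loader does not do. It is a hypothetical pre-rendering step, not something the repo ships; the dsn key and the CLICKHOUSE_DSN variable name are invented for the example.

```python
import re

# Matches ${NAME} and ${NAME:-default}.
_PATTERN = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")


def expand_env(text: str, env: dict[str, str]) -> str:
    """Replace ${VAR} and ${VAR:-default} using shell-like rules."""

    def substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2) or ""
        # Like the shell's `:-`, fall back when the variable is unset or empty.
        return env.get(name) or default

    return _PATTERN.sub(substitute, text)


# Hypothetical template line; without expansion the loader would see it verbatim.
template = "dsn: ${CLICKHOUSE_DSN:-clickhouse://localhost:9000/ja4_logs}"
rendered = expand_env(template, {})
```

Passing os.environ as the env argument and writing the result to the file the correlator loads would yield a plain YAML file with literal values, which is the form the gotcha above requires.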
REGEXP_TREE dictionary requires ≥1 rule
dict_anubis_ua uses LAYOUT(REGEXP_TREE). If anubis_ua_rules is empty, every INSERT into http_logs_raw fails because the materialized view mv_http_logs calls dictGet() on it. The integration test init script seeds a catch-all rule.
TLS/pcap capture needs non-loopback traffic
sentinel listens on a network interface (e.g., eth0), not loopback. Traffic sent to localhost or 127.0.0.1 from the same container is invisible to pcap. In integration tests, traffic must come from a separate container crossing the Docker bridge network.
ClickHouse initialization timing
ClickHouse takes ~15-20s to initialize all 10 SQL files. Integration health checks use a 120s timeout (not the default 60s).
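The waiting logic amounts to polling with a deadline. A minimal sketch follows, assuming a caller-supplied check function; how the integration stack actually probes ClickHouse is not shown in this document, and wait_until_ready is a hypothetical name.

```python
import time
from typing import Callable


def wait_until_ready(
    check: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 2.0,
) -> bool:
    """Poll `check` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated service that becomes ready on the third probe.
probes = iter([False, False, True])
ready = wait_until_ready(lambda: next(probes), timeout=5.0, interval=0.01)
```

The 120s default leaves generous headroom over the ~15-20s schema load, so a slow CI runner does not produce a false failure.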
RPM builds must use Rocky Linux
All Dockerfile.package files use rockylinux:9 (or rockylinux:8/almalinux:10) as the build base — never Debian-based images. Reason: Rocky provides libpcap.so.1; Debian provides libpcap.so.0.8. Building sentinel on Debian and running on Rocky produces a missing library error at runtime.
ClickHouse FLAT() layout requires numeric keys
If adding a new dictionary with a String primary key, use COMPLEX_KEY_HASHED() not FLAT().