feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized

Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
426
services/correlator/README.md
Normal file
@ -0,0 +1,426 @@
# logcorrelator

HTTP and network log correlation service written in Go.

## Description

**logcorrelator** receives two JSON log streams over Unix datagram sockets (SOCK_DGRAM):

- **Source A**: application HTTP logs (Apache, reverse proxy)
- **Source B**: network logs (IP/TCP metadata, JA3/JA4, etc.)

It correlates events on `src_ip + src_port` within a configurable time window and writes the correlated logs to:

- A local file (JSON lines)
- ClickHouse (for analysis and archiving)

The service's operational logs (startup, errors, metrics) are written to **stderr** and collected by journald. No correlated data is ever written to stdout.
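
To make the matching rule concrete, here is a minimal Python sketch of correlation on `src_ip + src_port` within a time window. It is an illustration only, not the service's Go implementation: the `Correlator` class, its method names, the explicit nanosecond timestamps, and the "closest B wins" choice are all assumptions.

```python
from collections import defaultdict

WINDOW_NS = 10 * 1_000_000_000  # 10 s window, as in the example configuration


class Correlator:
    """Toy correlator (hypothetical): buffers B events, matches incoming A events."""

    def __init__(self, window_ns=WINDOW_NS):
        self.window_ns = window_ns
        # (src_ip, src_port) -> list of (timestamp_ns, event) from source B
        self.pending_b = defaultdict(list)

    @staticmethod
    def key(event):
        return (event["src_ip"], event["src_port"])

    def add_b(self, event, ts_ns):
        self.pending_b[self.key(event)].append((ts_ns, event))

    def match_a(self, event, ts_ns):
        """Return the closest B event inside the window, or None (orphan A)."""
        candidates = [
            (abs(ts_ns - b_ts), b_event)
            for b_ts, b_event in self.pending_b.get(self.key(event), [])
            if abs(ts_ns - b_ts) <= self.window_ns
        ]
        if not candidates:
            return None
        return min(candidates, key=lambda c: c[0])[1]
```

An A event whose key is unknown, or whose timestamp is more than the window away from every buffered B event, gets `None` back and would be treated as an orphan.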

## Architecture

```
┌─────────────────┐     ┌──────────────────┐     ┌─────────────────┐
│ Source A        │────▶│                  │────▶│ File Sink       │
│ HTTP/Apache     │     │ Correlation      │     │ (JSON lines)    │
│ (Unix DGRAM)    │     │ Service          │     └─────────────────┘
└─────────────────┘     │                  │
                        │ - Buffers        │     ┌─────────────────┐
┌─────────────────┐     │ - Time Window    │────▶│ ClickHouse      │
│ Source B        │────▶│ - Orphan Policy  │     │ Sink            │
│ Network/JA4     │     │ - Keep-Alive     │     └─────────────────┘
│ (Unix DGRAM)    │     └──────────────────┘
└─────────────────┘
```

Hexagonal architecture: pure domain (`internal/domain`), abstract ports (`internal/ports`), adapters (`internal/adapters`), orchestration (`internal/app`).

## Build (100% Docker)

The build, the tests, and the RPM packaging all run inside containers:

```bash
# Full build with tests (builder stage)
make docker-build-dev

# RPM packaging (el8, el9, el10)
make package-rpm

# Fast build without tests
make docker-build-dev-no-test

# Local tests (requires Go 1.21+)
make test
```

### Prerequisites

- Docker 20.10+

## Installation

### RPM packages

```bash
# Build the packages
make package-rpm

# Install the package that matches your distribution (Rocky Linux / AlmaLinux)
sudo dnf install -y dist/rpm/el8/logcorrelator-1.1.12-1.el8.x86_64.rpm
sudo dnf install -y dist/rpm/el9/logcorrelator-1.1.12-1.el9.x86_64.rpm
sudo dnf install -y dist/rpm/el10/logcorrelator-1.1.12-1.el10.x86_64.rpm

# Start the service
sudo systemctl enable --now logcorrelator
sudo systemctl status logcorrelator
```

### Manual build

```bash
# Local binary (requires Go 1.21+)
go build -o logcorrelator ./cmd/logcorrelator
./logcorrelator -config config.example.yml
```

## Configuration

The service is configured with a YAML file. See `config.example.yml` for a complete example.

```yaml
log:
  level: INFO   # DEBUG, INFO, WARN, ERROR

inputs:
  unix_sockets:
    - name: http
      source_type: A   # HTTP source
      path: /var/run/logcorrelator/http.socket
      format: json
      socket_permissions: "0666"
    - name: network
      source_type: B   # Network source
      path: /var/run/logcorrelator/network.socket
      format: json
      socket_permissions: "0666"

outputs:
  file:
    path: /var/log/logcorrelator/correlated.log
  clickhouse:
    enabled: false
    dsn: clickhouse://user:pass@localhost:9000/db
    table: http_logs_raw
    batch_size: 500
    flush_interval_ms: 200
    max_buffer_size: 5000
    drop_on_overflow: true
    timeout_ms: 1000
  stdout:
    enabled: false   # no-op for data; operational logs always go to stderr

correlation:
  time_window:
    value: 10
    unit: s
  orphan_policy:
    apache_always_emit: true
    apache_emit_delay_ms: 500   # delay before an orphan A is emitted (ms)
    network_emit: false
  matching:
    mode: one_to_many   # Keep-Alive: one B can correlate with several successive A events
  buffers:
    max_http_items: 10000
    max_network_items: 20000
  ttl:
    network_ttl_s: 120   # TTL is reset on every correlation (Keep-Alive)
  # Exclude source IPs (single IPs or CIDR ranges)
  exclude_source_ips:
    - 10.0.0.1
    - 172.16.0.0/12
  # Restrict correlation to specific destination ports (optional)
  # If the list is empty, all ports are correlated
  include_dest_ports:
    - 80
    - 443

metrics:
  enabled: false
  addr: ":8080"
```
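
The comments on `matching.mode: one_to_many` and `ttl.network_ttl_s` describe a Keep-Alive behaviour: a B event stays in the buffer after a match and its TTL restarts. The sketch below shows one way that could work. The `NetworkBuffer` class and its methods are hypothetical, and the real service may evict or expire entries differently.

```python
class NetworkBuffer:
    """Hypothetical buffer of B events keyed by (src_ip, src_port)."""

    def __init__(self, ttl_s=120):
        self.ttl_s = ttl_s
        self._entries = {}  # key -> (event, expires_at)

    def put(self, key, event, now):
        self._entries[key] = (event, now + self.ttl_s)

    def correlate(self, key, now):
        """Return the B event for key and restart its TTL, or None if absent or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        event, expires_at = entry
        if now >= expires_at:
            del self._entries[key]
            return None
        # one_to_many: keep B in the buffer and reset its TTL (assumed behaviour)
        self._entries[key] = (event, now + self.ttl_s)
        return event
```

With `ttl_s=120`, a B event stored at t=0 and matched at t=100 would remain usable until t=220, which is how a long-lived Keep-Alive connection can keep correlating successive requests.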

### ClickHouse DSN format

```
clickhouse://username:password@host:port/database
```

Ports: `9000` (native protocol, recommended) or `8123` (HTTP).
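
As a quick sanity check of a DSN before putting it in the configuration, the following Python sketch splits it into its parts using only the standard library. The `parse_clickhouse_dsn` helper is hypothetical and is not how the service itself parses the DSN.

```python
from urllib.parse import urlsplit


def parse_clickhouse_dsn(dsn):
    """Split a clickhouse:// DSN into its components (illustrative helper)."""
    parts = urlsplit(dsn)
    if parts.scheme != "clickhouse":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/"),
    }
```

Note that `urlsplit` does not decode percent-encoded characters, so a password containing `@` or `/` would need separate handling.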

## Log formats

### Source A (HTTP)

```json
{
  "src_ip": "192.168.1.1", "src_port": 8080,
  "dst_ip": "10.0.0.1", "dst_port": 443,
  "timestamp": 1704110400000000000,
  "method": "GET", "path": "/api/test"
}
```
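
The magnitude of the `timestamp` value above suggests Unix time in nanoseconds; the README does not state the unit, so treat that as an assumption. Under that assumption, this small sketch converts such a value to a UTC string. The `ns_to_rfc3339` name is made up for the example.

```python
from datetime import datetime, timezone


def ns_to_rfc3339(ts_ns):
    """Convert Unix nanoseconds to a UTC timestamp with second precision."""
    seconds = ts_ns // 1_000_000_000  # integer division avoids float rounding
    dt = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
```

For the example value, `ns_to_rfc3339(1704110400000000000)` returns `"2024-01-01T12:00:00Z"`.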

### Source B (network)

```json
{
  "src_ip": "192.168.1.1", "src_port": 8080,
  "dst_ip": "10.0.0.1", "dst_port": 443,
  "ja3": "abc123", "ja4": "xyz789"
}
```
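
For manual testing, an event like the ones above can be pushed into one of the input sockets from Python. This is a minimal sketch that assumes one JSON document per datagram; `send_event` is a hypothetical helper, not part of the project.

```python
import json
import socket


def send_event(socket_path, event):
    """Send one JSON event as a single datagram to a Unix SOCK_DGRAM socket."""
    payload = json.dumps(event).encode("utf-8")
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, socket_path)
    return len(payload)
```

With the example configuration, a call could look like `send_event("/var/run/logcorrelator/network.socket", {"src_ip": "192.168.1.1", "src_port": 8080, "ja4": "xyz789"})`.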

### Correlated log (output)

Flat JSON structure, with all A and B fields merged at the root:

```json
{
  "timestamp": "2024-01-01T12:00:00Z",
  "src_ip": "192.168.1.1", "src_port": 8080,
  "dst_ip": "10.0.0.1", "dst_port": 443,
  "correlated": true,
  "method": "GET", "path": "/api/test",
  "ja3": "abc123", "ja4": "xyz789"
}
```

If A and B both define the same field, both values are kept with the prefixes `a_` and `b_`.

Orphan A events (with no matching B) are emitted with `"correlated": false, "orphan_side": "A"`.
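
The merge and orphan rules can be sketched in a few lines of Python. This is an approximation under one loud assumption: a field present on both sides counts as a collision only when the two values differ, which is consistent with `src_ip` appearing once in the example output. The function names are hypothetical.

```python
def merge_events(a, b):
    """Merge A and B into one flat dict, prefixing conflicting fields."""
    merged = {"correlated": True}
    for field in a.keys() | b.keys():
        in_a, in_b = field in a, field in b
        if in_a and in_b and a[field] != b[field]:
            # Assumed collision rule: same field name, different values
            merged["a_" + field] = a[field]
            merged["b_" + field] = b[field]
        else:
            merged[field] = a[field] if in_a else b[field]
    return merged


def orphan_a(a):
    """Shape of an A event emitted without a matching B."""
    return {**a, "correlated": False, "orphan_side": "A"}
```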

## ClickHouse schema

The file `sql/init.sql` contains the complete, ready-to-use schema.

```bash
clickhouse-client --multiquery < sql/init.sql
```

### Table architecture

```
http_logs_raw            ← inserts from the service (raw_json String)
    │
    └─ mv_http_logs      ← materialized view (parses JSON → typed columns)
           │
           ▼
       http_logs         ← table queried by analysts
```

### `http_logs` table columns

| Group | Columns |
|---|---|
| Time | `time` DateTime, `log_date` Date |
| Network | `src_ip` IPv4, `src_port` UInt16, `dst_ip` IPv4, `dst_port` UInt16 |
| HTTP | `method`, `scheme`, `host`, `path`, `query`, `http_version` (LowCardinality) |
| Correlation | `orphan_side`, `correlated` UInt8, `keepalives` UInt16, `a_timestamp`/`b_timestamp` UInt64, `conn_id` |
| IP meta | `ip_meta_df` UInt8, `ip_meta_id` UInt16, `ip_meta_total_length` UInt16, `ip_meta_ttl` UInt8 |
| TCP meta | `tcp_meta_options`, `tcp_meta_window_size` UInt32, `tcp_meta_mss` UInt16, `tcp_meta_window_scale` UInt8, `syn_to_clienthello_ms` Int32 |
| TLS / fingerprint | `tls_version`, `tls_sni`, `tls_alpn` (LowCardinality), `ja3`, `ja3_hash`, `ja4` |
| HTTP headers | `header_user_agent`, `header_accept`, `header_accept_encoding`, `header_accept_language`, `header_x_request_id`, `header_x_trace_id`, `header_x_forwarded_for`, `header_sec_ch_ua*`, `header_sec_fetch_*` |

### Users and permissions

```sql
-- data_writer (service account): INSERT on http_logs_raw,
-- plus the SELECT that the materialized view needs to read it
GRANT INSERT ON mabase_prod.http_logs_raw TO data_writer;
GRANT SELECT ON mabase_prod.http_logs_raw TO data_writer;

-- analyst: read access to the parsed table
GRANT SELECT ON mabase_prod.http_logs TO analyst;
```

### Verifying ingestion

```sql
-- Raw data received
SELECT count(*), min(ingest_time), max(ingest_time) FROM mabase_prod.http_logs_raw;

-- Data parsed by the materialized view
SELECT count(*), min(time), max(time) FROM mabase_prod.http_logs;

-- Most recent correlated logs
SELECT time, src_ip, dst_ip, method, host, path, ja4
FROM mabase_prod.http_logs
WHERE correlated = 1
ORDER BY time DESC LIMIT 10;
```

## Signals

| Signal | Behavior |
|--------|----------|
| `SIGINT` / `SIGTERM` | Graceful shutdown (drain buffers, flush sinks) |
| `SIGHUP` | Reopen output files (log rotation) |
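
The reason for reopening on `SIGHUP` is that log rotation renames the output file while the process still holds the old handle. The Python sketch below illustrates the pattern; `ReopenableFile` and `install_sighup_handler` are hypothetical names, and the service's Go file sink may be structured differently.

```python
import signal


class ReopenableFile:
    """Append-only file that can be reopened after logrotate renames it."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "a", encoding="utf-8")

    def write_line(self, line):
        self._fh.write(line + "\n")
        self._fh.flush()

    def reopen(self):
        # Close the old handle (now pointing at the rotated file) and
        # open the configured path again, creating it if needed.
        self._fh.close()
        self._fh = open(self.path, "a", encoding="utf-8")

    def close(self):
        self._fh.close()


def install_sighup_handler(sink):
    """Reopen the sink whenever the process receives SIGHUP."""
    signal.signal(signal.SIGHUP, lambda signum, frame: sink.reopen())
```

A logrotate `postrotate` hook would then only need to send `SIGHUP` to the process for new lines to land in a fresh file.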

## Internal logs

Operational logs go to **stderr**:

```bash
# systemd
journalctl -u logcorrelator -f

# Docker
docker logs -f logcorrelator
```

## Project structure

```
cmd/logcorrelator/          # Entry point
internal/
  adapters/
    inbound/unixsocket/     # Reads SOCK_DGRAM → NormalizedEvent
    outbound/
      clickhouse/           # ClickHouse sink (batching, retry, full logging)
      file/                 # File sink (JSON lines, reopen on SIGHUP)
      multi/                # Fan-out to several sinks
      stdout/               # No-op for data (operational logs go to stderr)
  app/                      # Orchestrator (sources → correlation → sinks)
  config/                   # YAML loading/validation
  domain/                   # CorrelationService, NormalizedEvent, CorrelatedLog
  observability/            # Logger, metrics, HTTP server (/metrics, /health)
  ports/                    # EventSource, CorrelatedLogSink, CorrelationProcessor interfaces
config.example.yml          # Example configuration
Dockerfile                  # Multi-stage build (builder, runtime, dev)
Dockerfile.package          # Multi-distro RPM packaging (el8, el9, el10)
Makefile                    # Build targets
architecture.yml            # Architecture specification
logcorrelator.service       # systemd unit
```

## Debugging

### DEBUG logs

```yaml
log:
  level: DEBUG
```

Examples of the resulting log lines:

```
[unixsocket:http] DEBUG event received: source=A src_ip=192.168.1.1 src_port=8080
[correlation] DEBUG processing A event: key=192.168.1.1:8080
[correlation] DEBUG correlation found: A(src_ip=... src_port=... ts=...) + B(...)
[correlation] DEBUG A event has no matching B key in buffer: key=...
[correlation] DEBUG event excluded by IP filter: source=A src_ip=10.0.0.1 src_port=8080
[correlation] DEBUG event excluded by dest port filter: source=A dst_port=22
[correlation] DEBUG TTL reset for B event (Keep-Alive): key=... new_ttl=120s
[clickhouse] DEBUG batch sent: rows=42 table=http_logs_raw
```

### Metrics server

```yaml
metrics:
  enabled: true
  addr: ":8080"
```

`GET /health` → `{"status":"healthy"}`

`GET /metrics`:

```json
{
  "events_received_a": 1542, "events_received_b": 1498,
  "correlations_success": 1450, "correlations_failed": 92,
  "failed_no_match_key": 45, "failed_time_window": 23,
  "failed_buffer_eviction": 5, "failed_ttl_expired": 12,
  "failed_ip_excluded": 7, "failed_dest_port_filtered": 3,
  "buffer_a_size": 23, "buffer_b_size": 18,
  "orphans_emitted_a": 92, "orphans_pending_a": 4,
  "keepalive_resets": 892
}
```

### Diagnosing with metrics

| High metric | Cause | Fix |
|---|---|---|
| `failed_no_match_key` | A and B do not share the same `src_ip:src_port` | Check both sources |
| `failed_time_window` | Timestamps too far apart | Increase `time_window.value` or check NTP |
| `failed_ttl_expired` | B expires before it is correlated | Increase `ttl.network_ttl_s` |
| `failed_buffer_eviction` | Buffers too small | Increase `buffers.max_http_items` / `max_network_items` |
| `failed_ip_excluded` | Traffic from excluded IPs | Normal if expected |
| `failed_dest_port_filtered` | Traffic on ports not in the list | Check `include_dest_ports` |
| `orphans_emitted_a` | Many A events without a B | Check that source B is sending events |
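
When reading these counters regularly, a small script can point at the row of the table to look at first. The sketch below assumes the `/metrics` endpoint returns the JSON document shown earlier and that the server listens on `localhost:8080`; `fetch_metrics` and `summarize` are hypothetical helpers.

```python
import json
from urllib.request import urlopen


def fetch_metrics(base_url="http://localhost:8080"):
    """Fetch the JSON document served by GET /metrics (address is an assumption)."""
    with urlopen(base_url + "/metrics", timeout=2) as resp:
        return json.load(resp)


def summarize(metrics):
    """Return the correlation success ratio and the largest failed_* counter."""
    success = metrics.get("correlations_success", 0)
    failed = metrics.get("correlations_failed", 0)
    total = success + failed
    ratio = success / total if total else None
    reasons = {k: v for k, v in metrics.items() if k.startswith("failed_")}
    top = max(reasons, key=reasons.get) if reasons else None
    return ratio, top
```

With the sample values from the `/metrics` example, the ratio is about 0.94 and the dominant failure counter is `failed_no_match_key`.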

### Filtering by source IP

```yaml
correlation:
  exclude_source_ips:
    - 10.0.0.1        # Single IP (health checks)
    - 172.16.0.0/12   # CIDR range
```

Events from these IPs are silently dropped: they are neither correlated nor emitted as orphans. The `failed_ip_excluded` metric counts the exclusions.
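
To check which addresses a given exclusion list would catch, the rule can be reproduced with Python's standard `ipaddress` module. This is a sketch of the matching logic only; `build_exclusions` and `is_excluded` are hypothetical names, and the service's own filter lives in Go.

```python
import ipaddress


def build_exclusions(entries):
    """Parse single IPs and CIDR ranges into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_excluded(src_ip, networks):
    """True if src_ip falls inside any excluded IP or range."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in networks)
```

A single IP such as `10.0.0.1` is parsed as a `/32` network, so both kinds of entries go through the same membership test.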

### Filtering by destination port

```yaml
correlation:
  include_dest_ports:
    - 80     # HTTP
    - 443    # HTTPS
    - 8080
    - 8443
```

If the list is non-empty, only events whose `dst_port` is in the list take part in correlation; all others are silently dropped. An empty list means every port is correlated (the default). The `failed_dest_port_filtered` metric counts the exclusions.
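
The "empty list means everything" rule is easy to get backwards, so here it is as a tiny Python sketch. `port_allowed` is a hypothetical helper that mirrors the described behaviour.

```python
def port_allowed(dst_port, include_dest_ports):
    """Empty list: every destination port takes part in correlation."""
    if not include_dest_ports:
        return True
    return dst_port in set(include_dest_ports)
```

For example, with `[80, 443]` an event on port 22 is filtered out, while with `[]` it is kept.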

### Test scripts

```bash
# Bash script (simple)
./scripts/test-correlation.sh -c 10 -v

# Python script (full scenarios: basic, time window, keepalive, different IPs)
pip install requests
python3 scripts/test-correlation-advanced.py --all
```

## Troubleshooting

### ClickHouse: insert errors

- **`No such column`**: check that the `http_logs_raw` table uses the single `raw_json` column (not separate columns)
- **`ACCESS_DENIED`**: `GRANT INSERT ON mabase_prod.http_logs_raw TO data_writer;`
- Flush errors are logged at ERROR level in the service logs

### Empty materialized view

If `http_logs_raw` contains data but `http_logs` is empty:

```sql
-- Check the view definition
SHOW CREATE TABLE mabase_prod.mv_http_logs;
-- Check permissions (the MV runs under the service account)
GRANT SELECT ON mabase_prod.http_logs_raw TO data_writer;
```

### Unix sockets: permission denied

Check that `socket_permissions: "0666"` is configured and that the `/var/run/logcorrelator` directory is owned by the `logcorrelator` user.

### systemd service does not start

```bash
journalctl -u logcorrelator -n 50 --no-pager
/usr/bin/logcorrelator -config /etc/logcorrelator/logcorrelator.yml
```

## License

MIT