ecceb041742dd9a3bbc4de2c5672286b7728ef45
view_ip_recurrence:
Add WHERE detected_at >= now() - INTERVAL 30 DAY
→ With PARTITION BY (P1), ClickHouse prunes the partitions outside this
range before reading any data. The view scans only the active partitions
(instead of the 30 full daily partitions).
→ ORDER BY (src_ip) ensures that GROUP BY src_ip reads contiguous
data (no in-memory reordering).
rotation.py: remove FINAL on ml_detected_anomalies:
FINAL forces a full in-memory deduplication of the ReplacingMergeTree
(equivalent to a DISTINCT over the whole table), one of the most
expensive operations in ClickHouse.
Fix: replace the FINAL sub-SELECT with view_ip_recurrence (already aggregated
by src_ip; it returns recurrence directly without FINAL).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ja4-platform
ja4-platform is a monorepo security pipeline for TLS fingerprinting (JA4/JA3) and bot detection. It captures live network traffic, correlates TLS handshakes with HTTP requests, detects anomalous behavior using machine learning (Isolation Forest), and presents results through a SOC analyst dashboard — all backed by ClickHouse as the central data store.
Pipeline Overview
┌─────────────────────────────────────────────────────────────────────────────┐
│ Linux Server (Apache) │
│ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ mod-reqin-log │───────▶│ UNIX socket (HTTP) │──┐ │
│ │ (Apache module) │ JSON │ /var/run/logcorr/ │ │ │
│ │ C · httpd DSO │ │ http.socket │ │ │
│ └─────────────────┘ └─────────────────────┘ │ │
│ ▼ │
│ ┌─────────────────┐ ┌─────────────────────┐ ┌──────────────────┐ │
│ │ sentinel │───────▶│ UNIX socket (TLS) │─▶│ correlator │ │
│ │ (TLS capture) │ JSON │ /var/run/logcorr/ │ │ (event join) │ │
│ │ Go · libpcap │ │ network.socket │ │ Go · hex. arch │ │
│ └─────────────────┘ └─────────────────────┘ └────────┬─────────┘ │
│ │ │
└────────────────────────────────────────────────────────────────┼────────────┘
│ INSERT
▼
┌──────────────────┐
│ ClickHouse │
│ ja4_processing │
│ (all tables) │
└────────┬─────────┘
│ SELECT
┌────────────────────┼────────────────────┐
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ bot-detector │ │ dashboard │
│ (ML anomaly det) │ │ (SOC web UI) │
│ Python · sklearn │ │ FastAPI + React │
└──────────────────┘ └──────────────────┘
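The diagram shows both producers handing JSON events to the correlator over UNIX sockets. As a rough illustration of that hand-off, here is a minimal Python sketch. It assumes a stream socket and newline-delimited JSON framing, neither of which is stated in this README, and the helper names `send_event`, `read_events`, and `demo` are hypothetical. The real producers are written in C and Go.

```python
import json
import socket


def send_event(sock, event):
    """Serialize one event as a single JSON line and send it."""
    sock.sendall(json.dumps(event).encode("utf-8") + b"\n")


def read_events(sock):
    """Read newline-delimited JSON events until the peer closes the socket."""
    buffer = b""
    events = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break  # peer closed the connection
        buffer += chunk
        # Extract every complete line currently in the buffer.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            events.append(json.loads(line))
    return events


def demo():
    # socketpair() yields two connected UNIX sockets without touching the filesystem,
    # standing in for a real socket path such as those under /var/run/logcorr/.
    producer, consumer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_event(producer, {"src_ip": "192.0.2.10", "src_port": 50000, "ja4": "example"})
    producer.close()
    events = read_events(consumer)
    consumer.close()
    return events
```

Calling `demo()` sends one illustrative event through the socket pair and returns the list of decoded events on the consumer side.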
Services
| Service | Language | Purpose | Interface |
|---|---|---|---|
| sentinel | Go | Live TLS packet capture, JA4/JA3 fingerprint generation | UNIX socket (network.socket) |
| mod-reqin-log | C | Apache HTTPD module, HTTP request JSON logging | UNIX socket (http.socket) |
| correlator | Go | Joins HTTP + TLS events by src_ip:src_port + time window | ClickHouse INSERT, file, stdout |
| bot-detector | Python | Isolation Forest ML anomaly detection on aggregated traffic | ClickHouse read/write, HTTP :8080 |
| dashboard | Python/JS | SOC analyst web dashboard (FastAPI + React) | HTTP :8000 |
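The correlator row above describes a join on src_ip:src_port within a time window. The sketch below illustrates that idea in Python. It is not the correlator's actual Go implementation: the event field names (`ts`, `ja4`), the 2-second window, and the "closest TLS event wins" rule are all assumptions made for the example.

```python
from collections import defaultdict

WINDOW_SECONDS = 2.0  # assumed value; the real window is not given in this README


def correlate(http_events, tls_events, window=WINDOW_SECONDS):
    """Pair each HTTP event with the closest-in-time TLS event on the same src_ip:src_port."""
    # Index TLS events by connection key for constant-time lookup.
    tls_by_key = defaultdict(list)
    for tls in tls_events:
        tls_by_key[(tls["src_ip"], tls["src_port"])].append(tls)

    joined = []
    for http in http_events:
        key = (http["src_ip"], http["src_port"])
        # Keep only TLS events that fall inside the time window.
        candidates = [
            tls for tls in tls_by_key.get(key, [])
            if abs(tls["ts"] - http["ts"]) <= window
        ]
        if not candidates:
            continue  # no TLS handshake seen within the window
        best = min(candidates, key=lambda tls: abs(tls["ts"] - http["ts"]))
        # Enrich the HTTP event with the TLS fingerprint.
        joined.append({**http, "ja4": best["ja4"]})
    return joined
```

In this sketch, HTTP events without a matching TLS event are dropped. A real pipeline might instead emit them uncorrelated, which is a design choice this README does not specify.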
Shared Libraries
| Library | Language | Description |
|---|---|---|
| go/ja4common | Go | Logger, config loader, shutdown handler, IP filter |
| python/ja4_common | Python | ClickHouse client singleton, settings |
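The table lists a "ClickHouse client singleton" in python/ja4_common. One common way to build such a singleton in Python is to cache a factory function, as sketched below. Everything here is hypothetical: `Settings`, `StubClient`, `get_client`, and the `CLICKHOUSE_HOST` environment variable are illustrative names, not the library's actual API, and `StubClient` merely stands in for a real ClickHouse client object.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    # Hypothetical settings; the real library's field names may differ.
    host: str = "localhost"
    database: str = "ja4_processing"


class StubClient:
    """Placeholder for a real ClickHouse client object."""

    def __init__(self, settings: Settings):
        self.settings = settings


@lru_cache(maxsize=1)
def get_client() -> StubClient:
    """Create the client on first call and return the same instance afterwards."""
    settings = Settings(host=os.environ.get("CLICKHOUSE_HOST", "localhost"))
    return StubClient(settings)
```

Every caller that uses `get_client()` shares one instance, so the connection configuration is read once per process.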
Quickstart
Prerequisites
- Docker (with BuildKit) and Docker Compose
- make
- No native Go, Python, or C toolchains required; all builds run inside Docker
Build All Services
make build-all
Run All Tests
make test-all
Build RPM Packages
make rpm-all
# RPMs written to services/<service>/dist/
Documentation
| Document | Description |
|---|---|
| Architecture | System architecture, data flow, component interactions |
| Development | Build, test, package, and extend the platform |
| Database Schema | Every ClickHouse table, view, dictionary, and materialized view |
| Database Migrations | Migration order, application, verification, and rollback |
Service Documentation
- Sentinel — TLS capture daemon
- mod-reqin-log — Apache HTTP logging module
- Correlator — HTTP/TLS event correlation engine
- Bot Detector — ML anomaly detection
- Dashboard — SOC web dashboard and API
Shared Library Documentation
- go-ja4common — Go shared library
- python-ja4common — Python shared library
Go Workspace
The repository uses a Go workspace (go.work) to link the Go modules:
go 1.21
use (
./services/sentinel
./services/correlator
./shared/go/ja4common
)
License
See individual service directories for license information.