Architecture:
- ja4_logs: raw log ingestion (http_logs_raw, http_logs, mv_http_logs)
- ja4_processing: analytics, aggregation, ML, dictionaries, audit

Configuration (env vars):
- CLICKHOUSE_DB_LOGS (default: ja4_logs)
- CLICKHOUSE_DB_PROCESSING (default: ja4_processing)

Changes:
- SQL migrations (10 files): all mabase_prod refs → ja4_logs or ja4_processing with correct cross-database references (MVs, views, dicts)
- deploy_schema.sh: substitutes DB names from env vars at deploy time
- Python shared settings: added CLICKHOUSE_DB_LOGS + CLICKHOUSE_DB_PROCESSING
- Dashboard routes (19 files): replaced ~80 hardcoded mabase_prod refs with settings.CLICKHOUSE_DB_LOGS / settings.CLICKHOUSE_DB_PROCESSING
- Bot-detector: DB → CLICKHOUSE_DB_PROCESSING, fetch_rules.py configurable
- Correlator: DSN example updated to ja4_logs
- Docker-compose + .env files: new env vars with defaults
- All documentation updated (14 markdown files)

All tests pass: sentinel 10/10, correlator 67.1%, bot-detector 11, dashboard 20, ja4_common 18

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

# ja4-platform

**ja4-platform** is a monorepo security pipeline for TLS fingerprinting (JA4/JA3) and bot detection. It captures live network traffic, correlates TLS handshakes with HTTP requests, detects anomalous behavior using machine learning (Isolation Forest), and presents results through a SOC analyst dashboard, all backed by ClickHouse as the central data store.

## Pipeline Overview

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                            Linux Server (Apache)                            │
│                                                                             │
│   ┌─────────────────┐        ┌─────────────────────┐                        │
│   │ mod-reqin-log   │───────▶│ UNIX socket (HTTP)  │──┐                     │
│   │ (Apache module) │  JSON  │ /var/run/logcorr/   │  │                     │
│   │ C · httpd DSO   │        │ http.socket         │  │                     │
│   └─────────────────┘        └─────────────────────┘  │                     │
│                                                       ▼                     │
│   ┌─────────────────┐        ┌─────────────────────┐  ┌──────────────────┐  │
│   │ sentinel        │───────▶│ UNIX socket (TLS)   │─▶│ correlator       │  │
│   │ (TLS capture)   │  JSON  │ /var/run/logcorr/   │  │ (event join)     │  │
│   │ Go · libpcap    │        │ network.socket      │  │ Go · hex. arch   │  │
│   └─────────────────┘        └─────────────────────┘  └────────┬─────────┘  │
│                                                                │            │
└────────────────────────────────────────────────────────────────┼────────────┘
                                                                 │ INSERT
                                                                 ▼
                                                        ┌──────────────────┐
                                                        │    ClickHouse    │
                                                        │  ja4_processing  │
                                                        │   (all tables)   │
                                                        └────────┬─────────┘
                                                                 │ SELECT
                                            ┌────────────────────┼────────────────────┐
                                            ▼                                         ▼
                                   ┌──────────────────┐                      ┌──────────────────┐
                                   │   bot-detector   │                      │    dashboard     │
                                   │ (ML anomaly det) │                      │   (SOC web UI)   │
                                   │ Python · sklearn │                      │ FastAPI + React  │
                                   └──────────────────┘                      └──────────────────┘
```

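The diagram shows both producers handing JSON events to the correlator over UNIX sockets. The following is a minimal Python sketch of that hand-off, not the services' actual code (mod-reqin-log is written in C, sentinel and correlator in Go). The newline-delimited framing, the stream socket type, and the field names `src_ip`, `src_port`, and `path` are assumptions made for illustration, and `socket.socketpair()` stands in for `/var/run/logcorr/http.socket` so the example runs without touching the filesystem.

```python
import json
import socket


def encode_event(event: dict) -> bytes:
    """Serialize one event as a single JSON line (newline framing is an assumption)."""
    return json.dumps(event, separators=(",", ":")).encode("utf-8") + b"\n"


def decode_events(buffer: bytes) -> list:
    """Split a byte buffer on newlines and parse each non-empty line as JSON."""
    return [json.loads(line) for line in buffer.splitlines() if line.strip()]


# A connected AF_UNIX pair stands in for the real socket path in this sketch.
producer, consumer = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
try:
    # Hypothetical event fields; the real schema is defined by each service.
    producer.sendall(
        encode_event({"src_ip": "203.0.113.7", "src_port": 51544, "path": "/login"})
    )
    producer.shutdown(socket.SHUT_WR)  # signal end of stream to the reader

    received = b""
    while True:
        chunk = consumer.recv(4096)
        if not chunk:  # empty read means the producer closed its side
            break
        received += chunk
finally:
    producer.close()
    consumer.close()

events = decode_events(received)
```

A line-per-event framing keeps the reader simple, since each complete line can be parsed on its own; the real services may frame messages differently.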
## Services

| Service | Language | Purpose | Interface |
|---------|----------|---------|-----------|
| [sentinel](docs/services/sentinel.md) | Go | Live TLS packet capture, JA4/JA3 fingerprint generation | UNIX socket (`network.socket`) |
| [mod-reqin-log](docs/services/mod-reqin-log.md) | C | Apache HTTPD module, HTTP request JSON logging | UNIX socket (`http.socket`) |
| [correlator](docs/services/correlator.md) | Go | Joins HTTP + TLS events by `src_ip:src_port` + time window | ClickHouse INSERT, file, stdout |
| [bot-detector](docs/services/bot-detector.md) | Python | Isolation Forest ML anomaly detection on aggregated traffic | ClickHouse read/write, HTTP `:8080` |
| [dashboard](docs/services/dashboard.md) | Python/JS | SOC analyst web dashboard (FastAPI + React) | HTTP `:8000` |

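The correlator row describes a join on `src_ip:src_port` plus a time window. The idea can be sketched in Python as follows; this is an illustration only, since the real correlator is a Go service whose matching rules may differ. The `Event` class, the `correlate` function, the 5-second default window, and the `ja4` and `path` payload keys are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Event:
    src_ip: str
    src_port: int
    timestamp: float  # seconds since the epoch
    payload: dict


def correlate(http_events, tls_events, window_seconds=5.0):
    """Pair each HTTP event with the nearest-in-time TLS event that shares
    its (src_ip, src_port) key and lies within window_seconds."""
    # Index TLS events by connection key so each lookup is a dict access.
    tls_by_key = {}
    for tls in tls_events:
        tls_by_key.setdefault((tls.src_ip, tls.src_port), []).append(tls)

    joined = []
    for http in http_events:
        candidates = [
            tls
            for tls in tls_by_key.get((http.src_ip, http.src_port), [])
            if abs(tls.timestamp - http.timestamp) <= window_seconds
        ]
        if not candidates:
            continue  # no TLS handshake seen for this connection in the window
        best = min(candidates, key=lambda tls: abs(tls.timestamp - http.timestamp))
        joined.append(
            {**best.payload, **http.payload,
             "src_ip": http.src_ip, "src_port": http.src_port}
        )
    return joined


http = [Event("203.0.113.7", 51544, 100.2, {"path": "/login"})]
tls = [
    Event("203.0.113.7", 51544, 100.0, {"ja4": "example-fingerprint"}),
    Event("203.0.113.7", 51544, 200.0, {"ja4": "outside-window"}),
    Event("198.51.100.9", 40000, 100.1, {"ja4": "other-client"}),
]
rows = correlate(http, tls)
```

Here only the first TLS event matches: the second shares the key but falls outside the window, and the third belongs to a different connection.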
## Shared Libraries

| Library | Language | Description |
|---------|----------|-------------|
| [go/ja4common](docs/shared/go-ja4common.md) | Go | Logger, config loader, shutdown handler, IP filter |
| [python/ja4_common](docs/shared/python-ja4common.md) | Python | ClickHouse client singleton, settings |

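As a rough illustration of the "settings" and "singleton" ideas in the Python library row, the sketch below reads the two database names from the environment variables `CLICKHOUSE_DB_LOGS` and `CLICKHOUSE_DB_PROCESSING`, falling back to the defaults `ja4_logs` and `ja4_processing`. It is not the actual `ja4_common` API: `Settings`, `load_settings`, `get_settings`, and `qualified` are hypothetical names, and no ClickHouse connection is opened.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Settings:
    CLICKHOUSE_DB_LOGS: str
    CLICKHOUSE_DB_PROCESSING: str


def load_settings(environ=None) -> Settings:
    """Build Settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if environ is None else environ
    return Settings(
        CLICKHOUSE_DB_LOGS=env.get("CLICKHOUSE_DB_LOGS", "ja4_logs"),
        CLICKHOUSE_DB_PROCESSING=env.get("CLICKHOUSE_DB_PROCESSING", "ja4_processing"),
    )


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    """Return one shared Settings instance per process (the singleton part)."""
    return load_settings()


def qualified(database: str, table: str) -> str:
    """Build a database-qualified table name for use in a query string."""
    return f"{database}.{table}"


defaults = load_settings({})
logs_table = qualified(defaults.CLICKHOUSE_DB_LOGS, "http_logs")
```

Passing the environment as an argument keeps `load_settings` easy to test, while `get_settings` gives callers a single cached instance instead of re-reading the environment on every query.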
## Quickstart

### Prerequisites

- Docker (with BuildKit) and Docker Compose
- `make`
- No native Go, Python, or C toolchains required: all builds run inside Docker

### Build All Services

```bash
make build-all
```

### Run All Tests

```bash
make test-all
```

### Build RPM Packages

```bash
make rpm-all
# RPMs written to services/<service>/dist/
```

## Documentation

| Document | Description |
|----------|-------------|
| [Architecture](docs/architecture.md) | System architecture, data flow, component interactions |
| [Development](docs/development.md) | Build, test, package, and extend the platform |
| [Database Schema](docs/database/schema.md) | Every ClickHouse table, view, dictionary, and materialized view |
| [Database Migrations](docs/database/migrations.md) | Migration order, application, verification, and rollback |

### Service Documentation

- [Sentinel](docs/services/sentinel.md): TLS capture daemon
- [mod-reqin-log](docs/services/mod-reqin-log.md): Apache HTTP logging module
- [Correlator](docs/services/correlator.md): HTTP/TLS event correlation engine
- [Bot Detector](docs/services/bot-detector.md): ML anomaly detection
- [Dashboard](docs/services/dashboard.md): SOC web dashboard and API

### Shared Library Documentation

- [go-ja4common](docs/shared/go-ja4common.md): Go shared library
- [python-ja4common](docs/shared/python-ja4common.md): Python shared library

## Go Workspace

The repository uses a Go workspace (`go.work`) to link the Go modules:

```
go 1.21

use (
	./services/sentinel
	./services/correlator
	./shared/go/ja4common
)
```

## License

See individual service directories for license information.