ClickHouse Migrations — ja4-platform
Migration Order
Apply these files in numeric order against the ClickHouse server:
clickhouse-client --multiquery < 00_database.sql
clickhouse-client --multiquery < 01_raw_tables.sql
clickhouse-client --multiquery < 02_dictionaries.sql
clickhouse-client --multiquery < 03_anubis_tables.sql
clickhouse-client --multiquery < 04_mv_http_logs.sql
clickhouse-client --multiquery < 05_aggregation_tables.sql
clickhouse-client --multiquery < 06_ml_tables.sql
clickhouse-client --multiquery < 07_ai_features_view.sql
clickhouse-client --multiquery < 08_users.sql
clickhouse-client --multiquery < 09_audit_table.sql
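The ten commands above can also be run as a loop. The following is a minimal sketch, not a script shipped with the repository: the function name `apply_migrations` and the `CLICKHOUSE_CLIENT` and `DRY_RUN` variables are hypothetical. It relies on the two-digit zero-padded prefixes so that glob order equals numeric order, and it stops at the first failing file.

```shell
#!/usr/bin/env bash
# Sketch: apply every NN_*.sql file in a directory in numeric order.
# apply_migrations, CLICKHOUSE_CLIENT and DRY_RUN are illustrative names,
# not part of the repository.

apply_migrations() {
    local dir="${1:-.}"
    local client="${CLICKHOUSE_CLIENT:-clickhouse-client}"
    local file
    # The glob expands in sorted order, which matches numeric order
    # because every prefix is zero-padded to two digits.
    for file in "$dir"/[0-9][0-9]_*.sql; do
        # An unmatched glob stays literal, so check the file really exists.
        [ -e "$file" ] || { echo "no migration files in $dir" >&2; return 1; }
        echo "applying $(basename "$file")"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            continue
        fi
        # Stop at the first failure so later files never run on a partial schema.
        "$client" --multiquery < "$file" || {
            echo "failed: $file" >&2
            return 1
        }
    done
}
```

Running `DRY_RUN=1 apply_migrations .` from the migrations directory prints the order without contacting the server; dropping `DRY_RUN` applies the files.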
File Descriptions
| File | Contents |
|---|---|
| `00_database.sql` | `CREATE DATABASE` |
| `01_raw_tables.sql` | `http_logs_raw` ingest table |
| `02_dictionaries.sql` | ASN geo dict, bot IP/JA4/network reference tables |
| `03_anubis_tables.sql` | Anubis crawler rule tables and dictionaries (UA, IP, ASN, country) |
| `04_mv_http_logs.sql` | Canonical `http_logs` target table + `mv_http_logs` materialized view with full Anubis enrichment |
| `05_aggregation_tables.sql` | `agg_host_ip_ja4_1h`, `agg_header_fingerprint_1h` + their MVs |
| `06_ml_tables.sql` | `ml_detected_anomalies`, `ml_all_scores` |
| `07_ai_features_view.sql` | `view_ai_features_1h` with Anubis enrichment |
| `08_users.sql` | ClickHouse users and grants |
| `09_audit_table.sql` | `audit_logs` table for SOC dashboard audit trail |
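After applying the files, one way to confirm that the tables named above were created is to ask the server with `EXISTS TABLE`. This is a sketch under assumptions: the function name `verify_tables` and the `CLICKHOUSE_CLIENT` variable are hypothetical, the database name is passed in by the caller because this README does not state it, and all listed tables are assumed to live in that one database.

```shell
#!/usr/bin/env bash
# Sketch: check that the plain tables listed in the file descriptions exist.
# verify_tables and CLICKHOUSE_CLIENT are illustrative names; the database
# name is supplied by the caller (it is not stated in this README).

verify_tables() {
    local db="$1"
    local client="${CLICKHOUSE_CLIENT:-clickhouse-client}"
    local table status=0
    for table in http_logs_raw http_logs agg_host_ip_ja4_1h \
                 agg_header_fingerprint_1h ml_detected_anomalies \
                 ml_all_scores audit_logs; do
        # EXISTS TABLE returns a single row containing 1 or 0.
        if [ "$("$client" --query "EXISTS TABLE ${db}.${table}")" = "1" ]; then
            echo "ok      ${table}"
        else
            echo "MISSING ${table}"
            status=1
        fi
    done
    return "$status"
}
```

Call it as `verify_tables <database>`; a non-zero exit status means at least one table was not found.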
Prerequisites
Place CSV data files in /var/lib/clickhouse/user_files/:
- `iplocate-ip-to-asn.csv`: IP-to-ASN mapping (from IPLocate)
- `bot_ip.csv`: Known bot IP prefixes
- `bot_ja4.csv`: Known bot JA4 fingerprints
- `asn_reputation.csv`: ASN reputation labels
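A small preflight check can catch a missing CSV before the migrations that depend on it are run. The sketch below is illustrative only; the function name `check_csv_prereqs` is hypothetical, and the optional directory argument exists so the check can be pointed at another path.

```shell
#!/usr/bin/env bash
# Sketch: confirm the four prerequisite CSV files are present and readable.
# check_csv_prereqs is an illustrative name, not part of the repository.

check_csv_prereqs() {
    local dir="${1:-/var/lib/clickhouse/user_files}"
    local missing=0 name
    for name in iplocate-ip-to-asn.csv bot_ip.csv bot_ja4.csv asn_reputation.csv; do
        if [ ! -r "$dir/$name" ]; then
            echo "missing or unreadable: $dir/$name" >&2
            missing=1
        fi
    done
    # Exit status 0 only when every file was found.
    return "$missing"
}
```

Note that the test runs as the calling user; the ClickHouse server process needs read access as well, which this sketch does not verify.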
Notes
- `04_mv_http_logs.sql` is the canonical version of the MV, superseding the base version in `services/correlator/sql/init.sql`. It includes full Anubis enrichment.
- All migrations are idempotent (use `IF NOT EXISTS` / `IF EXISTS`).
- Anubis dictionary passwords in `03_anubis_tables.sql` must be changed before production use.
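The idempotency convention can be spot-checked with a simple lint. The following sketch is a heuristic, not a SQL parser: it assumes each `CREATE` or `DROP` statement starts at the beginning of a line and keeps its guard on that same line, and the function name `find_unguarded` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: list CREATE/DROP statements that lack an IF [NOT] EXISTS guard.
# find_unguarded is an illustrative name; the check is line-based and
# will miss statements whose guard sits on a following line.

find_unguarded() {
    # First grep: lines starting a CREATE or DROP statement (with file:line prefix).
    # Second grep: drop the ones that already carry IF EXISTS / IF NOT EXISTS.
    grep -HniE '^[[:space:]]*(CREATE|DROP)[[:space:]]' "$@" \
        | grep -viE 'IF[[:space:]]+(NOT[[:space:]]+)?EXISTS'
}
```

Running `find_unguarded *.sql` prints any suspicious lines as `file:line:text`; no output and a non-zero exit status mean nothing was flagged.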