feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized
Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
162
docs/architecture.md
Normal file
@ -0,0 +1,162 @@

# Architecture

The ja4-platform is a security pipeline that captures live network traffic, generates JA4/JA3 TLS fingerprints, correlates them with HTTP requests, applies machine-learning anomaly detection, and surfaces results through a SOC analyst dashboard. ClickHouse serves as the central data store linking all services.

## System Architecture

```
┌───────────────────────────────────────────────────────────────────────────────────┐
│                                Target Linux Server                                │
│                                                                                   │
│   ┌─────────────┐  HTTP req   ┌───────────────────────┐  UNIX socket (DGRAM)      │
│   │   Client    │────────────▶│     Apache HTTPD      │──────────────┐            │
│   │ (browser /  │             │   + mod-reqin-log     │              │            │
│   │    bot)     │             └───────────────────────┘              │            │
│   │             │                                                    ▼            │
│   │             │   TLS CH    ┌───────────────────────┐   ┌─────────────────────┐ │
│   │             │────────────▶│       sentinel        │   │     correlator      │ │
│   │             │   (pcap)    │   (packet capture)    │──▶│    (event join)     │ │
│   └─────────────┘             └───────────────────────┘   └────────┬────────────┘ │
│                                                                    │              │
└────────────────────────────────────────────────────────────────────┼──────────────┘
                                                                     │ INSERT JSON
                                                                     ▼
                                                          ┌─────────────────────┐
                                                          │     ClickHouse      │
                                                          │     mabase_prod     │
                                                          │                     │
                                                          │  http_logs_raw      │
                                                          │  ──(MV)──▶ http_logs│
                                                          │  ──(MV)──▶ agg_*    │
                                                          │  view_ai_features   │
                                                          │  ml_detected_anom.  │
                                                          │  ml_all_scores      │
                                                          └──────┬──────┬───────┘
                                                                 │      │
                                              ┌──────────────────┘      └──────────────────┐
                                              ▼                                            ▼
                                   ┌──────────────────────┐                     ┌──────────────────────┐
                                   │     bot-detector     │                     │      dashboard       │
                                   │      (Python)        │                     │  (FastAPI + React)   │
                                   │                      │                     │                      │
                                   │  Reads:              │                     │  Reads:              │
                                   │   view_ai_features   │                     │   ml_detected_anom.  │
                                   │   view_ip_recurrence │                     │   ml_all_scores      │
                                   │  Writes:             │                     │   http_logs          │
                                   │   ml_detected_anom.  │                     │   agg_* tables       │
                                   │   ml_all_scores      │                     │   audit_logs         │
                                   └──────────────────────┘                     └──────────────────────┘
```

## Data Flow

### 1. Capture Phase

1. **mod-reqin-log** (Apache C module) hooks into `post_read_request`. On each HTTP request, it serializes method, path, headers, client IP/port into JSON and sends it via UNIX datagram socket to `/var/run/logcorrelator/http.socket`.

2. **sentinel** (Go daemon) uses libpcap to capture live TLS ClientHello packets on configured ports (default: 443, 8443). It extracts IP/TCP metadata, generates JA4 and JA3 fingerprints, and sends the result as JSON via UNIX datagram socket to `/var/run/logcorrelator/network.socket`.
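
Both producers above emit one JSON object per datagram. The following is a minimal Python sketch of that handoff, not the C or Go implementation. The exact JSON field layout is an assumption (only `src_ip`, `src_port`, `method`, and `path` are named in this document), and a connected socket pair stands in for the real socket paths so the example runs anywhere UNIX sockets are available.

```python
import json
import socket


def encode_event(src_ip: str, src_port: int, **fields) -> bytes:
    """Serialize one event as a single JSON datagram payload."""
    event = {"src_ip": src_ip, "src_port": src_port, **fields}
    return json.dumps(event, separators=(",", ":")).encode("utf-8")


def send_event(sock: socket.socket, payload: bytes) -> None:
    """Send one event; a datagram socket preserves message boundaries."""
    sock.send(payload)


# Hypothetical demo: a socket pair replaces /var/run/logcorrelator/*.socket.
sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
send_event(sender, encode_event("203.0.113.7", 51234, method="GET", path="/"))
received = json.loads(receiver.recv(65535))
sender.close()
receiver.close()
```

Because each datagram carries exactly one event, the receiver needs no framing or delimiter logic: one `recv` call yields one complete JSON document.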

### 2. Correlation Phase

3. **correlator** (Go daemon) listens on both UNIX sockets. It buffers incoming events and correlates them by matching `src_ip:src_port` within a configurable time window (default: 10 s). HTTP Keep-Alive connections are supported via `one_to_many` matching mode where a single TLS handshake (source B) is reused for multiple HTTP requests (source A). Correlated events merge HTTP fields (method, path, headers) with TLS fields (JA4, JA3, IP/TCP metadata) into a single `CorrelatedLog` JSON object, which is inserted into `http_logs_raw`.

### 3. Enrichment Phase (ClickHouse)

4. **mv_http_logs** materialized view automatically transforms `http_logs_raw` JSON into the structured `http_logs` table, enriching each row with:
   - ASN/geo data via `dict_iplocate_asn`
   - Anubis bot identification via `dict_anubis_ua`, `dict_anubis_ip`, `dict_anubis_asn`, `dict_anubis_country`

5. **mv_agg_host_ip_ja4_1h** and **mv_agg_header_fingerprint_1h** aggregate `http_logs` into 1-hour behavioral windows.

6. **view_ai_features_1h** joins the two aggregation tables and computes 50+ ML features per `(src_ip, ja4, host)` tuple.
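
To make the 1-hour windows concrete, here is a small Python sketch of the bucketing idea: each request is assigned to the start of its hour and counted per `(window_start, src_ip, ja4, host)`. It only mirrors a `hits`-style counter; the real aggregation runs inside ClickHouse materialized views, and the sample rows are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime


def start_of_hour(ts: datetime) -> datetime:
    """Truncate a timestamp to the beginning of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def aggregate_hits(rows):
    """Count requests per (window_start, src_ip, ja4, host)."""
    hits = defaultdict(int)
    for ts, src_ip, ja4, host in rows:
        hits[(start_of_hour(ts), src_ip, ja4, host)] += 1
    return dict(hits)


# Illustrative rows only: two requests in the 10:00 window, one in 11:00.
JA4 = "t13d1516h2_8daaf6152771_b0da82dd1658"
rows = [
    (datetime(2024, 1, 1, 10, 5), "203.0.113.7", JA4, "example.org"),
    (datetime(2024, 1, 1, 10, 55), "203.0.113.7", JA4, "example.org"),
    (datetime(2024, 1, 1, 11, 1), "203.0.113.7", JA4, "example.org"),
]
windows = aggregate_hits(rows)
```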

### 4. Detection Phase

7. **bot-detector** (Python) runs on a 5-minute cycle:
   - Reads `view_ai_features_1h` for the last 24 hours
   - Separates known bots (via reputation dictionaries) from unknown traffic
   - Trains/loads Isolation Forest models on human-baseline traffic
   - Scores unknown traffic and writes anomalies to `ml_detected_anomalies` and all scores to `ml_all_scores`
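
The control flow of one detection cycle can be sketched without the model itself. The Python below is a simplified illustration under stated assumptions: known bots are identified by JA4 alone (the real pipeline uses several reputation dictionaries), scores are precomputed rather than produced by an Isolation Forest, and the threshold value and the "higher means more anomalous" convention are hypothetical.

```python
def split_known_unknown(rows, known_bot_ja4):
    """Partition feature rows into (known bots, unknown traffic)."""
    known = [row for row in rows if row["ja4"] in known_bot_ja4]
    unknown = [row for row in rows if row["ja4"] not in known_bot_ja4]
    return known, unknown


def route_scores(scored_rows, threshold):
    """Return (anomalies, all_scores); flagged rows appear in both outputs."""
    anomalies = [row for row in scored_rows if row["anomaly_score"] >= threshold]
    return anomalies, list(scored_rows)


# Made-up rows; anomaly_score would normally come from the model.
rows = [
    {"src_ip": "203.0.113.7", "ja4": "ja4-known-crawler", "anomaly_score": 0.0},
    {"src_ip": "203.0.113.8", "ja4": "ja4-unseen-1", "anomaly_score": 0.9},
    {"src_ip": "203.0.113.9", "ja4": "ja4-unseen-2", "anomaly_score": 0.1},
]
known, unknown = split_known_unknown(rows, known_bot_ja4={"ja4-known-crawler"})
anomalies, all_scores = route_scores(unknown, threshold=0.5)
```

In this sketch `anomalies` corresponds to what would be written to `ml_detected_anomalies`, and `all_scores` to `ml_all_scores`.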

### 5. Visualization Phase

8. **dashboard** (FastAPI + React) queries ClickHouse to display detections, feature analysis, investigation summaries, and clustering to SOC analysts.

## Component Interaction Matrix

| From → To | mod-reqin-log | sentinel | correlator | ClickHouse | bot-detector | dashboard |
|-----------|:---:|:---:|:---:|:---:|:---:|:---:|
| **mod-reqin-log** | — | — | UNIX socket (DGRAM) | — | — | — |
| **sentinel** | — | — | UNIX socket (DGRAM) | — | — | — |
| **correlator** | — | — | — | Native TCP :9000 (INSERT) | — | — |
| **ClickHouse** | — | — | — | — | — | — |
| **bot-detector** | — | — | — | HTTP :8123 (SELECT/INSERT) | — | — |
| **dashboard** | — | — | — | HTTP :8123 (SELECT/INSERT) | — | — |

## ClickHouse Table Ownership

| Table/View | Written By | Read By |
|------------|-----------|---------|
| `http_logs_raw` | correlator | mv_http_logs (MV) |
| `http_logs` | mv_http_logs (MV) | mv_agg_*, dashboard |
| `agg_host_ip_ja4_1h` | mv_agg_host_ip_ja4_1h (MV) | view_ai_features_1h |
| `agg_header_fingerprint_1h` | mv_agg_header_fingerprint_1h (MV) | view_ai_features_1h |
| `view_ai_features_1h` | — (view) | bot-detector |
| `view_ip_recurrence` | — (view) | bot-detector |
| `ml_detected_anomalies` | bot-detector | dashboard |
| `ml_all_scores` | bot-detector | dashboard |
| `audit_logs` | dashboard | dashboard |

## Correlation Algorithm

The correlator joins HTTP events (source A) with TLS/network events (source B) on a composite connection key, subject to a time window:

1. **Key**: `src_ip + src_port` — the client's source IP and ephemeral port uniquely identify a TCP connection.
2. **Time window**: Events must arrive within the configured window (default 10 seconds).
3. **Matching mode**:
   - `one_to_one`: Each B event matches at most one A event (consumed after match).
   - `one_to_many` (default, Keep-Alive): A single B (TLS handshake) can match multiple A events (HTTP requests) on the same connection. The B event has a configurable TTL (default 120 s) that resets on each match.
4. **Orphan handling**: Unmatched A events are emitted after a configurable delay (default 500 ms) with `correlated=false` and `orphan_side=A`.
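
The rules above can be sketched as a small in-memory matcher. This is an illustrative Python model, not the Go correlator. It assumes that in `one_to_one` mode the time window decides a match, while in `one_to_many` mode the refreshed TTL decides it; the class and method names are hypothetical, and the 500 ms orphan delay is omitted, so unmatched A events are reported immediately.

```python
from dataclasses import dataclass


@dataclass
class TLSEvent:
    ts: float          # arrival time of the B event, in seconds
    ja4: str
    expires_at: float  # TTL deadline, refreshed on every one_to_many match


class Correlator:
    """Toy matcher keyed on (src_ip, src_port)."""

    def __init__(self, mode="one_to_many", window=10.0, ttl=120.0):
        self.mode = mode
        self.window = window
        self.ttl = ttl
        self.pending_b = {}  # (src_ip, src_port) -> TLSEvent

    def add_tls(self, key, ts, ja4):
        """Buffer a B event until an A event with the same key arrives."""
        self.pending_b[key] = TLSEvent(ts, ja4, ts + self.ttl)

    def _matches(self, b, ts):
        if self.mode == "one_to_one":
            return abs(ts - b.ts) <= self.window
        return ts <= b.expires_at

    def match_http(self, key, ts):
        """Return a correlated record, or an orphan record for side A."""
        b = self.pending_b.get(key)
        if b is not None and self._matches(b, ts):
            if self.mode == "one_to_one":
                del self.pending_b[key]       # B is consumed after one match
            else:
                b.expires_at = ts + self.ttl  # Keep-Alive: TTL resets
            return {"correlated": True, "orphan_side": "", "ja4": b.ja4}
        return {"correlated": False, "orphan_side": "A", "ja4": ""}


c = Correlator()
key = ("203.0.113.7", 51234)
c.add_tls(key, ts=0.0, ja4="t13d1516h2_8daaf6152771_b0da82dd1658")
first = c.match_http(key, ts=1.0)     # matches, TTL now runs until 121.0
second = c.match_http(key, ts=100.0)  # same connection, still within TTL
late = c.match_http(key, ts=400.0)    # TTL expired at 220.0
other = c.match_http(("203.0.113.7", 40000), ts=1.0)  # no B event for key
```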

## JA4/JA3 Fingerprint Format

### JA4

JA4 is a modern TLS fingerprinting format (successor to JA3) with the structure:

```
t{TLS_VER}{SNI}{CIPHER_COUNT}{EXT_COUNT}{ALPN}_{CIPHER_HASH}_{EXT_HASH}
```

Example: `t13d1516h2_8daaf6152771_b0da82dd1658`

- Prefix `t` = TLS over TCP, followed by the version (`13` = TLS 1.3)
- `d` = SNI present, `i` = SNI absent
- Cipher suite count and extension count, two digits each (`15` and `16` in the example)
- ALPN marker: first and last character of the first ALPN value (`h2` in the example)
- Truncated SHA-256 hashes (12 hex characters each) of the sorted cipher suites and of the sorted extensions
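
As a quick illustration, the example fingerprint can be split into its fields with fixed offsets. This Python sketch assumes the 10-character first section of the published JA4 layout, in which the two characters after the counts (`h2` in the example) are the ALPN marker; `parse_ja4` and the `JA4` tuple are hypothetical helper names, not part of sentinel.

```python
from typing import NamedTuple


class JA4(NamedTuple):
    protocol: str
    tls_version: str
    sni: str
    cipher_count: int
    extension_count: int
    alpn: str
    cipher_hash: str
    extension_hash: str


def parse_ja4(fingerprint: str) -> JA4:
    """Split a JA4 string into its fixed-width fields and two hashes."""
    head, cipher_hash, extension_hash = fingerprint.split("_")
    if len(head) != 10:
        raise ValueError(f"unexpected JA4 prefix length: {head!r}")
    return JA4(
        protocol=head[0],
        tls_version=head[1:3],
        sni=head[3],
        cipher_count=int(head[4:6]),
        extension_count=int(head[6:8]),
        alpn=head[8:10],
        cipher_hash=cipher_hash,
        extension_hash=extension_hash,
    )


parsed = parse_ja4("t13d1516h2_8daaf6152771_b0da82dd1658")
```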

### JA3

JA3 is the original TLS fingerprinting format:

```
{TLS_VER},{CIPHERS},{EXTENSIONS},{ELLIPTIC_CURVES},{EC_POINT_FORMATS}
```

The `ja3_hash` is the MD5 hash of the JA3 string.

Both fingerprints are generated by sentinel from the TLS ClientHello payload.
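
The relationship between the JA3 string and `ja3_hash` can be shown in a few lines of Python. This is a sketch rather than sentinel's Go code: it assumes the usual JA3 convention of decimal values joined by `-` inside each comma-separated field, and the input values are arbitrary sample numbers.

```python
import hashlib


def build_ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Join the five JA3 fields: commas between fields, dashes within them."""
    lists = [ciphers, extensions, curves, point_formats]
    parts = [str(version)] + ["-".join(str(v) for v in values) for values in lists]
    return ",".join(parts)


def ja3_hash(ja3: str) -> str:
    """Return the 32-character hex MD5 digest of a JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


# Sample values only, not taken from a real capture.
ja3 = build_ja3_string(771, [4865, 4866], [0, 10], [29, 23], [0])
digest = ja3_hash(ja3)
```

MD5 is used here purely as a compact identifier for lookups and joins, not as a security primitive.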

## Technology Stack

| Component | Technology |
|-----------|-----------|
| Packet capture | Go + libpcap (gopacket) |
| HTTP logging | C Apache module (APR) |
| Event correlation | Go (hexagonal architecture) |
| ML detection | Python 3.11 + scikit-learn |
| Dashboard backend | FastAPI (Python) |
| Dashboard frontend | React + Vite |
| Data store | ClickHouse |
| Deployment | systemd, Docker, RPM |
| IPC | UNIX datagram sockets |
256
docs/database/migrations.md
Normal file
@ -0,0 +1,256 @@

# Database Migrations

The ClickHouse schema for ja4-platform is managed through numbered SQL migration files in `shared/clickhouse/`. Migrations are idempotent (using `IF NOT EXISTS` / `IF EXISTS`) and must be applied in numeric order.

## Migration Order

| File | Purpose |
|------|---------|
| `00_database.sql` | Creates the `mabase_prod` database |
| `01_raw_tables.sql` | Creates `http_logs_raw` ingest table (MergeTree, 1-day TTL) |
| `02_dictionaries.sql` | Creates ASN geo dictionary (`dict_iplocate_asn`), bot IP/JA4 reference tables, `ref_bot_networks` |
| `03_anubis_tables.sql` | Creates Anubis crawler rule tables (`anubis_ua_rules`, `anubis_ip_rules`, `anubis_asn_rules`, `anubis_country_rules`) and their dictionaries (`dict_anubis_ua`, `dict_anubis_ip`, `dict_anubis_asn`, `dict_anubis_country`) |
| `04_mv_http_logs.sql` | Creates the canonical `http_logs` table and `mv_http_logs` materialized view with full Anubis enrichment |
| `05_aggregation_tables.sql` | Creates reputation dictionaries (`dict_bot_ip`, `dict_bot_ja4`, `dict_asn_reputation`), behavioral aggregation tables (`agg_host_ip_ja4_1h`, `agg_header_fingerprint_1h`), and their materialized views |
| `06_ml_tables.sql` | Creates ML output tables (`ml_detected_anomalies`, `ml_all_scores`) and `view_ip_recurrence` |
| `07_ai_features_view.sql` | Creates `view_ai_features_1h` — the 50+ feature view used by bot-detector |
| `08_users.sql` | Creates ClickHouse users (`data_writer`, `analyst`) and grants permissions |
| `09_audit_table.sql` | Creates `audit_logs` table for SOC dashboard audit trail |

## Prerequisites

### 1. ClickHouse Server

A running ClickHouse server (version 23.8+ recommended for `REGEXP_TREE` dictionary support).

### 2. CSV Data Files

Place the following files in `/var/lib/clickhouse/user_files/`:

| File | Source | Description |
|------|--------|-------------|
| `iplocate-ip-to-asn.csv` | [IPLocate](https://iplocate.io) | IP-to-ASN mapping with country, org, domain |
| `bot_ip.csv` | Custom | Known bot IP prefixes (CIDR format) |
| `bot_ja4.csv` | Custom | Known bot JA4 fingerprints |
| `asn_reputation.csv` | Custom | ASN reputation labels (`human`, `bot`, `unknown`) |

### 3. Anubis Passwords

Migration `03_anubis_tables.sql` contains placeholder passwords (`CHANGE_ME`) for the Anubis dictionaries. Replace these with the actual ClickHouse admin password before applying:

```bash
sed -i "s/CHANGE_ME/your_actual_password/g" 03_anubis_tables.sql
```

## How to Apply

### Full Initial Setup

Apply all migrations in order:

```bash
cd shared/clickhouse/

clickhouse-client --multiquery < 00_database.sql
clickhouse-client --multiquery < 01_raw_tables.sql
clickhouse-client --multiquery < 02_dictionaries.sql
clickhouse-client --multiquery < 03_anubis_tables.sql
clickhouse-client --multiquery < 04_mv_http_logs.sql
clickhouse-client --multiquery < 05_aggregation_tables.sql
clickhouse-client --multiquery < 06_ml_tables.sql
clickhouse-client --multiquery < 07_ai_features_view.sql
clickhouse-client --multiquery < 08_users.sql
clickhouse-client --multiquery < 09_audit_table.sql
```

### With Authentication

```bash
clickhouse-client --user admin --password 'your_password' --multiquery < 00_database.sql
# ... repeat for each file
```

### One-Liner (All at Once)

```bash
cd shared/clickhouse/
for f in 0*.sql; do
    echo "Applying $f..."
    # Stop at the first failure: later migrations depend on earlier ones.
    clickhouse-client --multiquery < "$f" || { echo "Failed on $f" >&2; break; }
done
```
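
The shell glob `0*.sql` only covers files whose names start with `0`, so a future `10_*.sql` would be skipped. A possible alternative, sketched here in Python, sorts by the numeric prefix instead. The helper names are hypothetical, `10_future.sql` is an invented file used only to show the ordering, and `apply_all` assumes `clickhouse-client` is on the `PATH` with the same `--multiquery` invocation used above.

```python
import re
import subprocess
import tempfile
from pathlib import Path

MIGRATION_RE = re.compile(r"^(\d+)_.+\.sql$")


def ordered_migrations(directory):
    """Return migration file names sorted by their numeric prefix."""
    numbered = []
    for path in Path(directory).iterdir():
        match = MIGRATION_RE.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path.name))
    return [name for _, name in sorted(numbered)]


def apply_all(directory):
    """Feed each migration to clickhouse-client on stdin; stop on failure."""
    for name in ordered_migrations(directory):
        with open(Path(directory) / name, "rb") as sql:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(["clickhouse-client", "--multiquery"], stdin=sql, check=True)


# Ordering demo on a scratch directory (no ClickHouse server needed).
with tempfile.TemporaryDirectory() as scratch:
    for name in ("10_future.sql", "02_dictionaries.sql", "00_database.sql", "README.md"):
        (Path(scratch) / name).touch()
    demo_order = ordered_migrations(scratch)
```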

## How to Verify

After applying all migrations, run these queries to verify each migration was successful:

### 00 — Database

```sql
SHOW DATABASES LIKE 'mabase_prod';
-- Expected: mabase_prod
```

### 01 — Raw Tables

```sql
EXISTS mabase_prod.http_logs_raw;
-- Expected: 1
```

### 02 — Dictionaries

```sql
SELECT dictGetOrDefault('mabase_prod.dict_iplocate_asn', 'country_code',
    toIPv6(toIPv4('8.8.8.8')), 'MISSING');
-- Expected: US (if CSV loaded) or MISSING
```

### 03 — Anubis Tables

```sql
EXISTS mabase_prod.anubis_ua_rules;
EXISTS mabase_prod.anubis_ip_rules;
EXISTS mabase_prod.anubis_asn_rules;
EXISTS mabase_prod.anubis_country_rules;
-- Expected: 1 for each
```

### 04 — MV + http_logs

```sql
EXISTS mabase_prod.http_logs;
SELECT name FROM system.tables WHERE database = 'mabase_prod' AND name = 'mv_http_logs';
-- Expected: mv_http_logs
```

### 05 — Aggregation Tables

```sql
EXISTS mabase_prod.agg_host_ip_ja4_1h;
EXISTS mabase_prod.agg_header_fingerprint_1h;
SELECT name FROM system.dictionaries WHERE database = 'mabase_prod' AND name = 'dict_bot_ip';
-- Expected: dict_bot_ip
```

### 06 — ML Tables

```sql
EXISTS mabase_prod.ml_detected_anomalies;
EXISTS mabase_prod.ml_all_scores;
SELECT name FROM system.tables WHERE database = 'mabase_prod' AND name LIKE 'view_ip%';
-- Expected: view_ip_recurrence
```

### 07 — AI Features View

```sql
SELECT name FROM system.tables WHERE database = 'mabase_prod' AND name = 'view_ai_features_1h';
-- Expected: view_ai_features_1h
```

### 08 — Users

```sql
SHOW GRANTS FOR data_writer;
-- Expected: GRANT INSERT, SELECT ON mabase_prod.http_logs_raw TO data_writer
SHOW GRANTS FOR analyst;
-- Expected: GRANT SELECT ON multiple tables
```

### 09 — Audit Table

```sql
EXISTS mabase_prod.audit_logs;
-- Expected: 1
```

### Full Verification Query

```sql
SELECT
    count() AS total_tables
FROM system.tables
WHERE database = 'mabase_prod'
  AND name IN (
    'http_logs_raw', 'http_logs', 'agg_host_ip_ja4_1h', 'agg_header_fingerprint_1h',
    'ml_detected_anomalies', 'ml_all_scores', 'ref_bot_networks',
    'anubis_ua_rules', 'anubis_ip_rules', 'anubis_asn_rules', 'anubis_country_rules',
    'audit_logs', 'bot_ip', 'bot_ja4'
  );
-- Expected: 14
```
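
A bare count of 14 confirms success but does not say which table is absent when the number is lower. The Python sketch below shows one way to report the missing names; it assumes the table names have already been fetched from `system.tables` by whatever client is in use, and `missing_tables` is a hypothetical helper.

```python
EXPECTED_TABLES = frozenset({
    "http_logs_raw", "http_logs", "agg_host_ip_ja4_1h", "agg_header_fingerprint_1h",
    "ml_detected_anomalies", "ml_all_scores", "ref_bot_networks",
    "anubis_ua_rules", "anubis_ip_rules", "anubis_asn_rules", "anubis_country_rules",
    "audit_logs", "bot_ip", "bot_ja4",
})


def missing_tables(found_names):
    """Return the expected tables that are absent, sorted by name."""
    return sorted(EXPECTED_TABLES - set(found_names))


# Simulated query result in which two tables were never created.
found = EXPECTED_TABLES - {"audit_logs", "bot_ja4"}
report = missing_tables(found)
```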

## Rollback Notes

### General Approach

ClickHouse does not support transactional DDL. To roll back a migration:

1. **Tables**: `DROP TABLE IF EXISTS mabase_prod.<table_name>`
2. **Materialized Views**: `DROP VIEW IF EXISTS mabase_prod.<mv_name>` (drop MV before its target table)
3. **Dictionaries**: `DROP DICTIONARY IF EXISTS mabase_prod.<dict_name>`
4. **Views**: `DROP VIEW IF EXISTS mabase_prod.<view_name>`
5. **Users**: `DROP USER IF EXISTS <username>`

### Rollback Order (Reverse of Apply)

```sql
-- 09: Audit
DROP TABLE IF EXISTS mabase_prod.audit_logs;

-- 08: Users
DROP USER IF EXISTS data_writer;
DROP USER IF EXISTS analyst;

-- 07: AI Features View
DROP VIEW IF EXISTS mabase_prod.view_ai_features_1h;

-- 06: ML Tables
DROP VIEW IF EXISTS mabase_prod.view_ip_recurrence;
DROP TABLE IF EXISTS mabase_prod.ml_all_scores;
DROP TABLE IF EXISTS mabase_prod.ml_detected_anomalies;

-- 05: Aggregation
DROP VIEW IF EXISTS mabase_prod.mv_agg_header_fingerprint_1h;
DROP VIEW IF EXISTS mabase_prod.mv_agg_host_ip_ja4_1h;
DROP TABLE IF EXISTS mabase_prod.agg_header_fingerprint_1h;
DROP TABLE IF EXISTS mabase_prod.agg_host_ip_ja4_1h;
DROP DICTIONARY IF EXISTS mabase_prod.dict_asn_reputation;
DROP DICTIONARY IF EXISTS mabase_prod.dict_bot_ja4;
DROP DICTIONARY IF EXISTS mabase_prod.dict_bot_ip;

-- 04: MV + http_logs
DROP VIEW IF EXISTS mabase_prod.mv_http_logs;
DROP TABLE IF EXISTS mabase_prod.http_logs;

-- 03: Anubis
DROP DICTIONARY IF EXISTS mabase_prod.dict_anubis_country;
DROP DICTIONARY IF EXISTS mabase_prod.dict_anubis_asn;
DROP DICTIONARY IF EXISTS mabase_prod.dict_anubis_ip;
DROP DICTIONARY IF EXISTS mabase_prod.dict_anubis_ua;
DROP TABLE IF EXISTS mabase_prod.anubis_country_rules;
DROP TABLE IF EXISTS mabase_prod.anubis_asn_rules;
DROP TABLE IF EXISTS mabase_prod.anubis_ip_rules;
DROP TABLE IF EXISTS mabase_prod.anubis_ua_rules;

-- 02: Dictionaries
DROP DICTIONARY IF EXISTS mabase_prod.dict_iplocate_asn;
DROP TABLE IF EXISTS mabase_prod.bot_ja4;
DROP TABLE IF EXISTS mabase_prod.bot_ip;
DROP TABLE IF EXISTS mabase_prod.ref_bot_networks;

-- 01: Raw Tables
DROP TABLE IF EXISTS mabase_prod.http_logs_raw;

-- 00: Database
DROP DATABASE IF EXISTS mabase_prod;
```

### Important Notes

- **Data loss**: Dropping tables destroys all data. Always back up before rollback.
- **MV dependency**: Materialized views must be dropped before their target tables.
- **Dictionary dependency**: Views/MVs using dictionaries will fail if dictionaries are dropped while they still reference them.
- **Idempotent re-apply**: After rollback, migrations can be safely re-applied since they use `IF NOT EXISTS`.
- **`04_mv_http_logs.sql`** is the canonical version of the MV, superseding any base version in `services/correlator/sql/init.sql`.
334
docs/database/schema.md
Normal file
@ -0,0 +1,334 @@

# Database Schema

The ja4-platform uses ClickHouse as its central data store with database `mabase_prod`. This document describes every table, materialized view, dictionary, and view in the schema.

## Tables

### http_logs_raw

Raw JSON ingest table — direct target for correlator INSERTs.

| Column | Type | Description |
|--------|------|-------------|
| `raw_json` | String (ZSTD(3)) | Complete correlated log as JSON string |
| `ingest_time` | DateTime | Insertion timestamp (default: `now()`) |

- **Engine**: MergeTree
- **Partition by**: `toDate(ingest_time)`
- **Order by**: `ingest_time`
- **TTL**: `ingest_time + INTERVAL 1 DAY`

---

### http_logs

Parsed and enriched HTTP log table — populated by `mv_http_logs` materialized view.

| Column | Type | Nullable | Description |
|--------|------|----------|-------------|
| `time` | DateTime | No | Request timestamp |
| `log_date` | Date | No | Date partition key (default: `toDate(time)`) |
| `src_ip` | IPv4 | No | Client source IP |
| `src_port` | UInt16 | No | Client source port |
| `dst_ip` | IPv4 | No | Server destination IP |
| `dst_port` | UInt16 | No | Server destination port |
| `src_asn` | UInt32 | No | Source ASN (enriched via dict_iplocate_asn) |
| `src_country_code` | LowCardinality(String) | No | Source country code |
| `src_as_name` | LowCardinality(String) | No | AS name |
| `src_org` | LowCardinality(String) | No | AS organization |
| `src_domain` | LowCardinality(String) | No | AS domain |
| `method` | LowCardinality(String) | No | HTTP method |
| `scheme` | LowCardinality(String) | No | URL scheme (http/https) |
| `host` | LowCardinality(String) | No | HTTP Host header |
| `path` | String (ZSTD(3)) | No | Request path |
| `query` | String (ZSTD(3)) | No | Query string |
| `http_version` | LowCardinality(String) | No | HTTP version |
| `orphan_side` | LowCardinality(String) | No | Orphan side (A, B, or empty) |
| `correlated` | UInt8 | No | 1 if HTTP+TLS correlated |
| `keepalives` | UInt16 | No | Keep-alive request sequence |
| `a_timestamp` | UInt64 | No | Source A event timestamp (ns) |
| `b_timestamp` | UInt64 | No | Source B event timestamp (ns) |
| `conn_id` | String (ZSTD(3)) | No | TCP connection identifier |
| `ip_meta_df` | UInt8 | No | IP Don't Fragment flag |
| `ip_meta_id` | UInt16 | No | IP identification |
| `ip_meta_total_length` | UInt16 | No | IP total length |
| `ip_meta_ttl` | UInt8 | No | IP TTL |
| `tcp_meta_options` | LowCardinality(String) | No | TCP options list |
| `tcp_meta_window_size` | UInt32 | No | TCP window size |
| `tcp_meta_mss` | UInt16 | No | TCP MSS |
| `tcp_meta_window_scale` | UInt8 | No | TCP window scale |
| `syn_to_clienthello_ms` | Int32 | No | SYN-to-ClientHello timing (ms) |
| `tls_version` | LowCardinality(String) | No | TLS version |
| `tls_sni` | LowCardinality(String) | No | TLS SNI |
| `tls_alpn` | LowCardinality(String) | No | TLS ALPN |
| `ja3` | String (ZSTD(3)) | No | JA3 fingerprint |
| `ja3_hash` | String (ZSTD(3)) | No | JA3 MD5 hash |
| `ja4` | String (ZSTD(3)) | No | JA4 fingerprint |
| `client_headers` | String (ZSTD(3)) | No | Comma-separated header names |
| `header_user_agent` | String (ZSTD(3)) | No | User-Agent header |
| `header_accept` | String (ZSTD(3)) | No | Accept header |
| `header_accept_encoding` | String (ZSTD(3)) | No | Accept-Encoding header |
| `header_accept_language` | String (ZSTD(3)) | No | Accept-Language header |
| `header_content_type` | String (ZSTD(3)) | No | Content-Type header |
| `header_x_request_id` | String (ZSTD(3)) | No | X-Request-Id header |
| `header_x_trace_id` | String (ZSTD(3)) | No | X-Trace-Id header |
| `header_x_forwarded_for` | String (ZSTD(3)) | No | X-Forwarded-For header |
| `header_sec_ch_ua` | String (ZSTD(3)) | No | Sec-CH-UA header |
| `header_sec_ch_ua_mobile` | String (ZSTD(3)) | No | Sec-CH-UA-Mobile header |
| `header_sec_ch_ua_platform` | String (ZSTD(3)) | No | Sec-CH-UA-Platform header |
| `header_sec_fetch_dest` | String (ZSTD(3)) | No | Sec-Fetch-Dest header |
| `header_sec_fetch_mode` | String (ZSTD(3)) | No | Sec-Fetch-Mode header |
| `header_sec_fetch_site` | String (ZSTD(3)) | No | Sec-Fetch-Site header |
| `anubis_bot_name` | LowCardinality(String) | No | Anubis-detected bot name (default: '') |
| `anubis_bot_action` | LowCardinality(String) | No | Anubis-detected bot action (default: '') |
| `anubis_bot_category` | LowCardinality(String) | No | Anubis-detected bot category (default: '') |

- **Engine**: MergeTree
- **Partition by**: `log_date`
- **Order by**: `(time, src_ip, dst_ip, ja4)`
- **TTL**: `log_date + INTERVAL 7 DAY`

---

### agg_host_ip_ja4_1h

Behavioral aggregation per `(src_ip, ja4, host)` per hour. Uses `AggregatingMergeTree` with `SimpleAggregateFunction` and `AggregateFunction` columns for incremental aggregation.

Key columns include: `window_start`, `src_ip`, `ja4`, `host`, `src_asn`, `hits`, `count_post`, `uniq_paths`, `uniq_query_params`, `tcp_jitter_variance`, `unique_src_ports`, `unique_conn_id`, `orphan_count`, `ip_id_zero_count`, `mss_1460_count`, `uniq_ua`, `url_depth_variance`, `count_anomalous_payload`, `uniq_ja3`, `avg_syn_ms`, `tls12_count`, `count_head`, `count_no_sec_fetch`, `count_generic_accept`, `count_http10`, `ip_df_var`, `avg_ttl`, `ttl_var`, `count_no_wscale`, `count_correlated`, `count_no_accept_enc`, `count_http_scheme`.

- **Engine**: AggregatingMergeTree
- **Order by**: `(window_start, src_ip, ja4, host)`

---

### agg_header_fingerprint_1h

Header-level behavioral fingerprint aggregation per `(src_ip)` per hour.

| Column | Type | Description |
|--------|------|-------------|
| `window_start` | DateTime | Hour window start |
| `src_ip` | IPv6 | Source IP |
| `header_order_hash` | SimpleAggregateFunction(any, String) | Hash of header order |
| `header_count` | SimpleAggregateFunction(max, UInt16) | Max header count |
| `has_accept_language` | SimpleAggregateFunction(max, UInt8) | Accept-Language presence |
| `has_cookie` | SimpleAggregateFunction(max, UInt8) | Cookie presence |
| `has_referer` | SimpleAggregateFunction(max, UInt8) | Referer presence |
| `modern_browser_score` | SimpleAggregateFunction(max, UInt8) | Browser compliance score |
| `ua_ch_mismatch` | SimpleAggregateFunction(max, UInt8) | UA/Client Hints mismatch |
| `sec_fetch_mode` | SimpleAggregateFunction(any, String) | Sec-Fetch-Mode value |
| `sec_fetch_dest` | SimpleAggregateFunction(any, String) | Sec-Fetch-Dest value |

- **Engine**: AggregatingMergeTree
- **Order by**: `(window_start, src_ip)`

---

### ml_detected_anomalies

Anomaly detections above the threat threshold.

Key columns: `detected_at`, `src_ip` (IPv6), `ja4`, `host`, `bot_name`, `anomaly_score` (Float32), `raw_anomaly_score` (Float32), `threat_level`, `model_name`, `recurrence` (UInt32), `campaign_id` (Int32), `reason`, plus all ML feature columns and Anubis enrichment (`anubis_bot_name`, `anubis_bot_action`, `anubis_bot_category`).

- **Engine**: ReplacingMergeTree(detected_at)
- **Order by**: `(src_ip)`
- **TTL**: `detected_at + INTERVAL 30 DAY`

---

### ml_all_scores

All ML classifications (no threshold filter) for observability.

Key columns: `detected_at`, `window_start`, `src_ip`, `ja4`, `host`, `bot_name`, `anomaly_score`, `raw_anomaly_score`, `threat_level`, `model_name`, `correlated`, `campaign_id`, plus ASN and Anubis enrichment.

- **Engine**: ReplacingMergeTree(detected_at)
- **Order by**: `(window_start, src_ip, ja4, host, model_name)`
- **TTL**: `window_start + INTERVAL 3 DAY`

---

### ref_bot_networks

Bot network CIDR reference table.

| Column | Type | Description |
|--------|------|-------------|
| `network` | IPv6CIDR | Network CIDR |
| `bot_name` | LowCardinality(String) | Bot name |
| `is_legitimate` | UInt8 | 1 = legitimate bot |
| `last_update` | DateTime | Last update timestamp |

- **Engine**: ReplacingMergeTree(last_update)
- **Order by**: `(network, bot_name)`

---

### bot_ip / bot_ja4

CSV-backed flat tables for quick bot lookups.

- `bot_ip`: single column `ip` (String) — Engine: File(CSV, 'bot_ip.csv')
- `bot_ja4`: single column `ja4` (String) — Engine: File(CSV, 'bot_ja4.csv')

---

### Anubis Rule Tables

| Table | Key | Columns | Engine |
|-------|-----|---------|--------|
| `anubis_ua_rules` | `id` (UInt64) | `parent_id`, `regexp`, `keys`, `values` | ReplacingMergeTree |
| `anubis_ip_rules` | `prefix` (String) | `bot_name`, `action`, `rule_id`, `has_ua`, `category` | ReplacingMergeTree |
| `anubis_asn_rules` | `asn` (UInt32) | `bot_name`, `action`, `category` | ReplacingMergeTree |
| `anubis_country_rules` | `country_code` (String) | `bot_name`, `action`, `category` | ReplacingMergeTree |

---

### audit_logs

SOC audit trail for dashboard activity.

| Column | Type | Default | Description |
|--------|------|---------|-------------|
| `timestamp` | DateTime | `now()` | Event time |
| `user_name` | LowCardinality(String) | `'soc_user'` | Analyst name |
| `action` | LowCardinality(String) | — | Action performed |
| `entity_type` | LowCardinality(String) | `''` | Entity type (ip, ja4, etc.) |
| `entity_id` | String | `''` | Entity identifier |
| `entity_count` | UInt32 | `0` | Entity count |
| `details` | String (ZSTD(3)) | `''` | JSON details |
| `client_ip` | String | `''` | Analyst client IP |

- **Engine**: MergeTree
- **Partition by**: `toDate(timestamp)`
- **Order by**: `(timestamp, user_name, action)`
- **TTL**: `toDate(timestamp) + INTERVAL 90 DAY`

---

## Materialized Views

### mv_http_logs

- **Source**: `http_logs_raw`
- **Target**: `http_logs`
- **Transformation**: Parses `raw_json` via `JSONExtract*` functions, enriches with ASN data from `dict_iplocate_asn` and Anubis bot detection from `dict_anubis_ua`, `dict_anubis_ip`, `dict_anubis_asn`, `dict_anubis_country`. Uses a 5-level priority cascade for Anubis: UA+IP combined > UA only > IP only > ASN > Country.
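
The priority order can be illustrated as a chain of checks. The Python below is a sketch of the cascade only, not the SQL in the materialized view. How a "UA+IP combined" rule is recognized is an assumption here: the sketch treats a UA hit and an IP hit that share the same `rule_id` as combined, and uses the `has_ip` / `has_ua` attributes to tell standalone rules apart. The function name and sample rules are hypothetical.

```python
def anubis_verdict(ua=None, ip=None, asn=None, country=None):
    """Return (level, bot_name) for the highest-priority matching rule."""
    if ua and ip and ua["rule_id"] == ip["rule_id"]:
        return "ua+ip", ua["bot_name"]
    if ua and not ua.get("has_ip"):
        return "ua", ua["bot_name"]
    if ip and not ip.get("has_ua"):
        return "ip", ip["bot_name"]
    if asn:
        return "asn", asn["bot_name"]
    if country:
        return "country", country["bot_name"]
    return "", ""


# Made-up dictionary hits for one request.
ua_hit = {"rule_id": 7, "bot_name": "examplebot", "has_ip": 1}
ip_hit = {"rule_id": 7, "bot_name": "examplebot", "has_ua": 1}
asn_hit = {"bot_name": "cloud-asn"}

combined = anubis_verdict(ua=ua_hit, ip=ip_hit, asn=asn_hit)
asn_only = anubis_verdict(asn=asn_hit, country={"bot_name": "geo-rule"})
no_match = anubis_verdict()
```

Returning empty strings when nothing matches lines up with the `''` defaults of the `anubis_bot_*` columns in `http_logs`.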

### mv_agg_host_ip_ja4_1h

- **Source**: `http_logs`
- **Target**: `agg_host_ip_ja4_1h`
- **Transformation**: Groups by `(toStartOfHour(time), src_ip, ja4, host, src_asn)`. Computes counts, unique values, variances, and aggregate functions for 50+ behavioral features.

### mv_agg_header_fingerprint_1h

- **Source**: `http_logs`
- **Target**: `agg_header_fingerprint_1h`
- **Transformation**: Groups by `(toStartOfHour(time), src_ip)`. Computes header order hash, header count, browser compliance score, Client Hints mismatch.

---

## Dictionaries

### dict_iplocate_asn

- **Source**: CSV file `/var/lib/clickhouse/user_files/iplocate-ip-to-asn.csv`
- **Key**: `network` (String)
- **Layout**: `IP_TRIE`
- **Attributes**: `asn` (UInt32), `country_code`, `name`, `org`, `domain`
- **Lifetime**: 3600–7200 seconds
|
||||
|
||||
### dict_bot_ip
|
||||
|
||||
- **Source**: CSV file `/var/lib/clickhouse/user_files/bot_ip.csv`
|
||||
- **Key**: `prefix` (String)
|
||||
- **Layout**: `IP_TRIE`
|
||||
- **Attributes**: `bot_name` (String)
|
||||
- **Lifetime**: 300 seconds
|
||||
|
||||
### dict_bot_ja4
|
||||
|
||||
- **Source**: CSV file `/var/lib/clickhouse/user_files/bot_ja4.csv`
|
||||
- **Key**: `ja4` (String)
|
||||
- **Layout**: `COMPLEX_KEY_HASHED`
|
||||
- **Attributes**: `bot_name` (String)
|
||||
- **Lifetime**: 300 seconds
|
||||
|
||||
### dict_asn_reputation
|
||||
|
||||
- **Source**: CSV file `/var/lib/clickhouse/user_files/asn_reputation.csv`
|
||||
- **Key**: `src_asn` (UInt64)
|
||||
- **Layout**: `HASHED`
|
||||
- **Attributes**: `label` (String)
|
||||
- **Lifetime**: 300 seconds
|
||||
|
||||
### dict_anubis_ua
|
||||
|
||||
- **Source**: ClickHouse table `anubis_ua_rules`
|
||||
- **Key**: `regexp` (String)
|
||||
- **Layout**: `REGEXP_TREE`
|
||||
- **Attributes**: `bot_name`, `action`, `has_ip`, `rule_id`, `category`
|
||||
- **Lifetime**: 300–600 seconds
|
||||
|
||||
### dict_anubis_ip
|
||||
|
||||
- **Source**: ClickHouse table `anubis_ip_rules`
|
||||
- **Key**: `prefix` (String)
|
||||
- **Layout**: `IP_TRIE`
|
||||
- **Attributes**: `bot_name`, `action`, `rule_id`, `has_ua`, `category`
|
||||
- **Lifetime**: 300–600 seconds
|
||||
|
||||
### dict_anubis_asn
|
||||
|
||||
- **Source**: ClickHouse table `anubis_asn_rules`
|
||||
- **Key**: `asn` (UInt32)
|
||||
- **Layout**: `FLAT`
|
||||
- **Attributes**: `bot_name`, `action`, `category`
|
||||
- **Lifetime**: 300–600 seconds
|
||||
|
||||
### dict_anubis_country
|
||||
|
||||
- **Source**: ClickHouse table `anubis_country_rules`
|
||||
- **Key**: `country_code` (String)
|
||||
- **Layout**: `FLAT`
|
||||
- **Attributes**: `bot_name`, `action`, `category`
|
||||
- **Lifetime**: 300–600 seconds
|
||||
|
||||
---
|
||||
|
||||
## Views
|
||||
|
||||
### view_ai_features_1h
|
||||
|
||||
Computes 50+ ML features per `(src_ip, ja4, host)` from the last 24 hours by joining `agg_host_ip_ja4_1h` and `agg_header_fingerprint_1h`. Includes:
|
||||
|
||||
- Behavioral features: `hits`, `hit_velocity`, `fuzzing_index`, `post_ratio`, `orphan_ratio`
|
||||
- Connection features: `max_keepalives`, `multiplexing_efficiency`, `port_exhaustion_ratio`
|
||||
- Browser features: `modern_browser_score`, `ua_ch_mismatch`, `header_order_shared_count`
|
||||
- TLS features: `alpn_http_mismatch`, `is_alpn_missing`, `sni_host_mismatch`
|
||||
- L4 features: `tcp_jitter_variance`, `avg_ttl`, `ttl_std`, `syn_timing_cv`
|
||||
- Reputation: `bot_name` (from dict_bot_ip/dict_bot_ja4), `anubis_bot_name/action/category`
|
||||
- Derived: `temporal_entropy`, `ja3_diversity_ratio`
|
||||
|
||||
### view_ip_recurrence
|
||||
|
||||
Aggregates recurrence data from `ml_detected_anomalies`:
|
||||
|
||||
```sql
SELECT src_ip, count() AS recurrence,
       min(detected_at) AS first_seen, max(detected_at) AS last_seen,
       min(anomaly_score) AS worst_score,
       argMin(threat_level, anomaly_score) AS worst_threat_level
FROM ml_detected_anomalies GROUP BY src_ip;
```
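
The same aggregation can be mirrored in plain Python to make the `argMin` semantics concrete. This is only an illustrative sketch: the `ip_recurrence` function and the row dictionaries are hypothetical and not part of the platform code.

```python
from collections import defaultdict


def ip_recurrence(rows):
    """Mimic view_ip_recurrence over an iterable of anomaly rows (dicts)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["src_ip"]].append(row)

    result = {}
    for src_ip, items in grouped.items():
        # Lower scores are more anomalous, so the minimum is the "worst" one.
        worst = min(items, key=lambda r: r["anomaly_score"])
        result[src_ip] = {
            "recurrence": len(items),
            "first_seen": min(r["detected_at"] for r in items),
            "last_seen": max(r["detected_at"] for r in items),
            "worst_score": worst["anomaly_score"],
            # argMin(threat_level, anomaly_score): level of the lowest-scoring row.
            "worst_threat_level": worst["threat_level"],
        }
    return result
```

Because lower scores are more anomalous, `min(anomaly_score)` and `argMin(threat_level, anomaly_score)` both refer to the same, most severe detection for each IP.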
|
||||
|
||||
---
|
||||
|
||||
## User Accounts
|
||||
|
||||
| User | Permissions | Purpose |
|
||||
|------|------------|---------|
|
||||
| `data_writer` | INSERT + SELECT on `http_logs_raw` | Used by correlator service |
|
||||
| `analyst` | SELECT on `http_logs`, `ml_detected_anomalies`, `ml_all_scores`, `view_ai_features_1h`, `view_ip_recurrence`, `audit_logs` | Used by dashboard/SOC analysts |
|
||||
|
||||
> **Security note**: Default passwords are `ChangeMe` — replace with strong passwords before production use. Store credentials in a secrets manager.
|
||||
246
docs/development.md
Normal file
@ -0,0 +1,246 @@
|
||||
# Development Guide
|
||||
|
||||
This guide covers building, testing, packaging, and extending the ja4-platform monorepo. All build and test operations run inside Docker — no native Go, Python, or C toolchains are required on the host.
|
||||
|
||||
## Prerequisites
|
||||
|
||||
| Requirement | Minimum Version | Notes |
|
||||
|-------------|----------------|-------|
|
||||
| Docker | 20.10+ | BuildKit enabled (`DOCKER_BUILDKIT=1`) |
|
||||
| Docker Compose | 2.x | For bot-detector and dashboard |
|
||||
| make | 3.81+ | GNU Make |
|
||||
| git | 2.x | For version tagging |
|
||||
|
||||
No Go, Python, or C compilers are needed on the host machine.
|
||||
|
||||
## Building All Services
|
||||
|
||||
```bash
|
||||
make build-all
|
||||
```
|
||||
|
||||
This builds Docker images for:
|
||||
- `ja4-platform/sentinel:latest`
|
||||
- `ja4-platform/correlator:latest`
|
||||
- `ja4-platform/bot-detector:latest`
|
||||
- `ja4-platform/dashboard:latest`
|
||||
|
||||
mod-reqin-log is an Apache module and is only built as part of the RPM packaging process.
|
||||
|
||||
### Building Individual Services
|
||||
|
||||
```bash
|
||||
make build-sentinel # Go binary in Docker
|
||||
make build-correlator # Go binary in Docker
|
||||
make build-bot-detector # Python image
|
||||
make build-dashboard # FastAPI + React image
|
||||
```
|
||||
|
||||
## Running Tests
|
||||
|
||||
```bash
|
||||
make test-all
|
||||
```
|
||||
|
||||
### Per-Service Testing
|
||||
|
||||
| Service | Command | Details |
|
||||
|---------|---------|---------|
|
||||
| sentinel | `make test-sentinel` | Go tests with `-race` flag, requires `NET_RAW`/`NET_ADMIN` caps |
|
||||
| correlator | `make test-correlator` | Go tests with 80% coverage gate enforced |
|
||||
| mod-reqin-log | `make test-mod-reqin-log` | C unit tests (JSON serialization, config parsing, header handling) |
|
||||
| bot-detector | `make test-bot-detector` | Python pytest suite |
|
||||
| dashboard | `make test-dashboard` | Python pytest for FastAPI routes |
|
||||
| ja4_common (Python) | `make test-ja4common-python` | Shared Python library tests |
|
||||
|
||||
## Building RPM Packages
|
||||
|
||||
```bash
|
||||
make rpm-all
|
||||
```
|
||||
|
||||
Builds RPMs for sentinel, correlator, and mod-reqin-log targeting Rocky Linux 8/9/10:
|
||||
|
||||
```bash
|
||||
make rpm-sentinel # → services/sentinel/dist/rpm/
|
||||
make rpm-correlator # → services/correlator/dist/rpm/
|
||||
make rpm-mod-reqin-log # → services/mod-reqin-log/dist/rpm/
|
||||
```
|
||||
|
||||
Each RPM build uses a multi-stage Docker pipeline:
|
||||
1. Builder stage compiles the binary (Go) or shared object (C)
|
||||
2. RPM builder stage runs `rpmbuild` for each target distro (el8, el9, el10)
|
||||
3. Output stage copies RPMs to the host via `--output type=local`
|
||||
|
||||
### Distribution Packages
|
||||
|
||||
```bash
|
||||
make dist # Alias for rpm-all
|
||||
# RPMs in services/<service>/dist/rpm/el{8,9,10}/
|
||||
```
|
||||
|
||||
## Local Development Workflow
|
||||
|
||||
### Go Services (sentinel, correlator)
|
||||
|
||||
The `go.work` workspace links Go modules:
|
||||
|
||||
```
go 1.24.6

use (
	./services/sentinel
	./services/correlator
	./shared/go/ja4common
)
```
|
||||
|
||||
If you have Go 1.24.6 or newer installed locally (the version required by sentinel), you can develop without Docker:
|
||||
|
||||
```bash
|
||||
# Run sentinel tests locally
|
||||
cd services/sentinel && go test ./... -race -v
|
||||
|
||||
# Run correlator tests locally
|
||||
cd services/correlator && go test ./... -race -cover -v
|
||||
|
||||
# Build sentinel binary locally (requires libpcap-dev)
|
||||
cd services/sentinel && go build -o ja4sentinel ./cmd/ja4sentinel/
|
||||
```
|
||||
|
||||
### Python Services (bot-detector, dashboard)
|
||||
|
||||
```bash
|
||||
# Install shared library in development mode
|
||||
cd shared/python/ja4_common && pip install -e .
|
||||
|
||||
# Run bot-detector locally
|
||||
cd services/bot-detector && pip install -r bot_detector/requirements.txt
|
||||
python -m bot_detector.bot_detector
|
||||
|
||||
# Run dashboard locally
|
||||
cd services/dashboard && pip install -r backend/requirements.txt
|
||||
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
|
||||
```
|
||||
|
||||
### C Module (mod-reqin-log)
|
||||
|
||||
Requires `apxs` (Apache extension tool) and development headers:
|
||||
|
||||
```bash
|
||||
cd services/mod-reqin-log
|
||||
make build # Compiles mod_reqin_log.so
|
||||
make test # Runs unit tests
|
||||
make rpm # Builds RPM packages
|
||||
```
|
||||
|
||||
## Adding a New Service
|
||||
|
||||
### Go Service
|
||||
|
||||
1. Create the service directory:
|
||||
```bash
|
||||
mkdir -p services/my-service/cmd/my-service
|
||||
mkdir -p services/my-service/internal
|
||||
```
|
||||
|
||||
2. Initialize the Go module:
|
||||
```bash
|
||||
cd services/my-service
|
||||
go mod init github.com/antitbone/ja4/my-service
|
||||
```
|
||||
|
||||
3. Add to `go.work`:
|
||||
```
|
||||
use (
|
||||
./services/sentinel
|
||||
./services/correlator
|
||||
./services/my-service # ← add this
|
||||
./shared/go/ja4common
|
||||
)
|
||||
```
|
||||
|
||||
4. Import the shared library:
|
||||
```go
|
||||
import (
|
||||
"github.com/antitbone/ja4/ja4common/logger"
|
||||
"github.com/antitbone/ja4/ja4common/config"
|
||||
"github.com/antitbone/ja4/ja4common/shutdown"
|
||||
)
|
||||
```
|
||||
|
||||
5. Add Makefile targets:
|
||||
```makefile
build-my-service:
	docker build -f services/my-service/Dockerfile -t ja4-platform/my-service:latest .

test-my-service:
	docker build -f services/my-service/Dockerfile.dev -t ja4-platform/my-service-tests:latest .
	docker run --rm ja4-platform/my-service-tests:latest
```
|
||||
|
||||
6. Update `build-all` and `test-all` dependencies.
|
||||
|
||||
### Python Service
|
||||
|
||||
1. Create the service directory with a `requirements.txt` or `pyproject.toml`.
|
||||
2. Add `ja4-common` as a dependency (installed from `shared/python/ja4_common`).
|
||||
3. Use `from ja4_common.clickhouse import get_client` for ClickHouse access.
|
||||
4. Add Makefile targets following the bot-detector/dashboard pattern.
|
||||
|
||||
## go.work Workspace
|
||||
|
||||
The `go.work` file at the repository root links all Go modules, allowing cross-module development without publishing:
|
||||
|
||||
```
go 1.24.6

use (
	./services/sentinel
	./services/correlator
	./shared/go/ja4common
)
```
|
||||
|
||||
When adding a new Go module:
|
||||
1. `go mod init` in the service directory
|
||||
2. Add the path to `go.work`
|
||||
3. Reference shared packages via their module path: `github.com/antitbone/ja4/ja4common/...`
|
||||
4. Run `go work sync` to update the workspace
|
||||
|
||||
## ja4_common Python Package
|
||||
|
||||
The shared Python package (`shared/python/ja4_common`) provides:
|
||||
|
||||
- `ClickHouseSettings` — pydantic-settings model reading from `.env`
|
||||
- `ClickHouseClient` — singleton client with auto-reconnect
|
||||
- `get_client()` — module-level singleton accessor
|
||||
|
||||
### Extending ja4_common
|
||||
|
||||
1. Add new modules under `shared/python/ja4_common/ja4_common/`
|
||||
2. Export them in `__init__.py`
|
||||
3. Add dependencies to `pyproject.toml`
|
||||
4. Run tests: `make test-ja4common-python`
|
||||
|
||||
### Using in a New Service
|
||||
|
||||
Add to `requirements.txt`:
|
||||
```
|
||||
ja4-common @ file:///app/shared/python/ja4_common
|
||||
```
|
||||
|
||||
Or in Docker, copy the shared library and install:
|
||||
```dockerfile
|
||||
COPY shared/python/ja4_common /app/shared/python/ja4_common
|
||||
RUN pip install /app/shared/python/ja4_common
|
||||
```
|
||||
|
||||
## Environment Variables
|
||||
|
||||
Each service reads configuration from environment variables and/or YAML config files. See individual service documentation for the full reference:
|
||||
|
||||
- [Sentinel configuration](services/sentinel.md#configuration-reference)
|
||||
- [Correlator configuration](services/correlator.md#configuration-reference)
|
||||
- [Bot Detector configuration](services/bot-detector.md#environment-variables)
|
||||
- [Dashboard configuration](services/dashboard.md#configuration)
|
||||
265
docs/services/bot-detector.md
Normal file
@ -0,0 +1,265 @@
|
||||
# Bot Detector
|
||||
|
||||
The bot-detector is a Python service that performs machine-learning anomaly detection on aggregated HTTP/TLS traffic features stored in ClickHouse. It runs on a continuous cycle (default: every 5 minutes), using Isolation Forest to identify suspicious traffic patterns, enriched with SHAP explainability, DBSCAN clustering, and Anubis bot-rule enrichment.
|
||||
|
||||
## ML Algorithm
|
||||
|
||||
### Isolation Forest (Semi-Supervised)
|
||||
|
||||
The core algorithm is **Isolation Forest** (Liu, Ting & Zhou, 2008) — an unsupervised anomaly detection algorithm that isolates anomalies by randomly partitioning feature space. Anomalies require fewer partitions to isolate than normal points.
|
||||
|
||||
The approach is **semi-supervised** because:
|
||||
1. **Known bots** are identified a priori via reputation dictionaries (IP, JA4, ASN)
|
||||
2. **Human baseline** is identified via ASN reputation labels (`asn_label = 'human'`)
|
||||
3. The model trains **only on human-baseline traffic** (minimum 500 sessions required)
|
||||
4. Unknown traffic is scored by deviation from the human profile
|
||||
|
||||
### Two-Model Architecture
|
||||
|
||||
| Model | Condition | Features | Data |
|
||||
|-------|-----------|----------|------|
|
||||
| **Complet** | `correlated = 1` | 35 | HTTP + TCP + TLS (full pipeline data) |
|
||||
| **Applicatif** | `correlated = 0` | 31 | HTTP only (no TLS correlation available) |
|
||||
|
||||
### Threat Levels
|
||||
|
||||
| Score Range | Level | Interpretation |
|
||||
|------------|-------|----------------|
|
||||
| `< -0.30` | **CRITICAL** | Extremely anomalous behavior |
|
||||
| `< -0.15` | **HIGH** | Strong anomaly signal |
|
||||
| `< -0.05` | **MEDIUM** | Moderate anomaly |
|
||||
| `≥ -0.05` | **LOW** | Slightly unusual |
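
The mapping in the table can be written as a small function. This is a minimal sketch, assuming the bounds are strict (`<`) exactly as written above; the `threat_level` function name is hypothetical and not taken from the service code.

```python
def threat_level(score: float) -> str:
    """Map a normalized anomaly score (lower = more anomalous) to a level."""
    if score < -0.30:
        return "CRITICAL"
    if score < -0.15:
        return "HIGH"
    if score < -0.05:
        return "MEDIUM"
    return "LOW"
```

With strict comparisons, a score of exactly `-0.30` is classified as HIGH and a score of exactly `-0.05` as LOW.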
|
||||
|
||||
## Feature List
|
||||
|
||||
### Common Features (31 — Applicatif model)
|
||||
|
||||
#### HTTP Behavior
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `hits` | Request count in the window |
|
||||
| `hit_velocity` | Requests per second |
|
||||
| `fuzzing_index` | Path/parameter diversity anomaly score |
|
||||
| `post_ratio` | Fraction of POST requests |
|
||||
| `port_exhaustion_ratio` | Fraction of distinct source ports / total |
|
||||
| `orphan_ratio` | Requests without TLS correlation |
|
||||
| `head_ratio` | Fraction of HEAD requests |
|
||||
| `http10_ratio` | Fraction of HTTP/1.0 requests |
|
||||
| `generic_accept_ratio` | Fraction of short Accept headers |
|
||||
| `sec_fetch_absence_rate` | Fraction missing Sec-Fetch-Site |
|
||||
| `missing_accept_enc_ratio` | Fraction missing Accept-Encoding |
|
||||
| `http_scheme_ratio` | Fraction using HTTP (not HTTPS) |
|
||||
|
||||
#### Connection Management
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `max_keepalives` | Max requests on a single Keep-Alive connection |
|
||||
| `tcp_shared_count` | TCP connections shared between sessions |
|
||||
| `multiplexing_efficiency` | HTTP/2 multiplexing efficiency |
|
||||
|
||||
#### Browser Fingerprint
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `header_count` | HTTP headers sent |
|
||||
| `has_accept_language` | Accept-Language header presence |
|
||||
| `has_cookie` | Cookie header presence |
|
||||
| `has_referer` | Referer header presence |
|
||||
| `modern_browser_score` | Composite browser compliance score (0–100) |
|
||||
| `ua_ch_mismatch` | User-Agent vs Client Hints inconsistency |
|
||||
| `ip_id_zero_ratio` | IP packets with ID=0 (headless/minimal stack) |
|
||||
| `header_order_shared_count` | IPs sharing same header order |
|
||||
| `header_order_confidence` | Normalized entropy of header order |
|
||||
| `distinct_header_orders` | Distinct header orderings per IP |
|
||||
| `is_fake_navigation` | Sec-Fetch-Mode=navigate with non-document dest |
|
||||
|
||||
#### Navigation Patterns
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `request_size_variance` | Variance of request sizes |
|
||||
| `mss_mobile_mismatch` | TCP MSS vs mobile profile inconsistency |
|
||||
| `asset_ratio` | Static asset request fraction |
|
||||
| `direct_access_ratio` | Direct accesses (no referer) |
|
||||
| `is_ua_rotating` | User-Agent rotation detected (flag) |
|
||||
| `distinct_ja4_count` | Distinct JA4 fingerprints per IP |
|
||||
| `anomalous_payload_ratio` | Anomalous payload size fraction |
|
||||
|
||||
#### Concentration & Rarity
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `src_port_density` | Source port entropy |
|
||||
| `ja4_asn_concentration` | JA4 concentration within ASN |
|
||||
| `ja4_country_concentration` | JA4 concentration per country |
|
||||
| `is_rare_ja4` | Rare JA4 fingerprint (< 100 total hits) |
|
||||
|
||||
#### Temporal & Diversity
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `temporal_entropy` | Temporal distribution entropy |
|
||||
| `path_diversity_ratio` | URL path diversity |
|
||||
| `url_depth_variance` | URL depth variance |
|
||||
| `ja3_diversity_ratio` | JA3 diversity ratio per IP |
|
||||
|
||||
### Additional TCP/TLS Features (Complet model only — 4 extra)
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `tcp_jitter_variance` | TCP inter-packet jitter variance |
|
||||
| `alpn_http_mismatch` | ALPN vs actual HTTP protocol mismatch |
|
||||
| `is_alpn_missing` | ALPN absent in ClientHello |
|
||||
| `sni_host_mismatch` | TLS SNI vs HTTP Host mismatch |
|
||||
|
||||
### L4 Fingerprint Features (Complet model)
|
||||
|
||||
| Feature | Description |
|
||||
|---------|-------------|
|
||||
| `avg_ttl` | Average IP TTL (OS fingerprint) |
|
||||
| `ttl_std` | TTL standard deviation |
|
||||
| `no_window_scale_ratio` | Fraction without TCP window scale |
|
||||
| `syn_timing_cv` | SYN timing coefficient of variation |
|
||||
| `tls12_ratio` | Fraction of TLS 1.2 connections |
|
||||
| `ip_df_variance` | IP Don't-Fragment flag variance |
|
||||
|
||||
## Detection Pipeline
|
||||
|
||||
```
1. Read view_ai_features_1h (last 24h) → DataFrame
2. Read view_ip_recurrence → recurrence map
3. Clean columns (fillna, astype)
4. Split by correlated=1 / correlated=0
5. For each model (Complet, Applicatif):
   a. A7: Validate features (exclude missing/constant)
   b. Separate known bots → log as KNOWN_BOT
   c. Filter human baseline (asn_label='human', min 500 sessions)
   d. Load or train Isolation Forest model
   e. A1: Check concept drift (KS test on features)
   f. Score unknown traffic
   g. A10: Normalize scores to [-1, 0]
   h. A2: Compute adaptive threshold = min(percentile_5, ANOMALY_THRESHOLD)
   i. A6: Apply recurrence weighting
   j. Filter scores below threshold
   k. A4: SHAP explainability (top 5 features)
   l. A8: DBSCAN clustering (campaign detection)
6. Concatenate results, deduplicate by src_ip (keep lowest score)
7. A5: Deduplication with TTL (skip recently reported IPs)
8. Insert into ml_detected_anomalies + ml_all_scores
```
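
Steps 5h and 5j (adaptive threshold, then filtering) can be sketched as follows. This is an illustration under stated assumptions rather than the service's implementation: the helper names are hypothetical, the percentile uses linear interpolation, and the filter uses a strict `<` comparison.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between neighbouring ranks."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no scores to compute a percentile from")
    rank = (len(ordered) - 1) * pct / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def adaptive_threshold(scores, anomaly_threshold=-0.03, pct=5):
    """A2: take the stricter (lower) of the percentile and the fixed threshold."""
    return min(percentile(scores, pct), anomaly_threshold)


def flag_anomalies(scores, anomaly_threshold=-0.03, pct=5):
    """Keep only the scores that fall below the adaptive threshold."""
    threshold = adaptive_threshold(scores, anomaly_threshold, pct)
    return [s for s in scores if s < threshold]
```

Taking the minimum means the fixed `ANOMALY_THRESHOLD` acts as an upper bound: when most traffic scores close to zero, the percentile alone would flag ordinary sessions, and the fixed value prevents that.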
|
||||
|
||||
## Concept Drift Detection (A1)
|
||||
|
||||
Uses the **Kolmogorov-Smirnov test** to compare feature distributions between the current data and the training data. If the fraction of drifted features exceeds `DRIFT_THRESHOLD` (default: 0.30), the model is retrained.
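
The drift check can be illustrated with a standard-library two-sample KS statistic. This sketch makes several assumptions that are not confirmed by the service code: the function names are hypothetical, a feature counts as drifted when the statistic exceeds the asymptotic critical value for a significance level of about 0.05 (coefficient 1.358), and retraining triggers when the drifted fraction is strictly greater than `DRIFT_THRESHOLD`.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def has_drifted(train, current, c_alpha=1.358):
    """Compare the statistic with the asymptotic critical value (alpha ~ 0.05)."""
    n, m = len(train), len(current)
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(train, current) > critical


def should_retrain(train_features, current_features, drift_threshold=0.30):
    """Retrain when the fraction of drifted features exceeds the threshold."""
    drifted = sum(
        has_drifted(train_features[name], current_features[name])
        for name in train_features
    )
    return drifted / len(train_features) > drift_threshold
```

Both arguments to `should_retrain` are dictionaries that map a feature name to its list of observed values.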
|
||||
|
||||
## SHAP Explainability (A4)
|
||||
|
||||
When enabled (`ENABLE_SHAP=true`), computes SHAP values for each detected anomaly using `shap.TreeExplainer`. The top 5 contributing features are stored in the `reason` field.
|
||||
|
||||
## DBSCAN Clustering (A8)
|
||||
|
||||
When enabled (`ENABLE_CLUSTERING=true`), applies DBSCAN on anomaly feature vectors to group related anomalies into campaigns. Each anomaly gets a `campaign_id` (-1 = no cluster).
|
||||
|
||||
## Anubis Bot-Rule Enrichment
|
||||
|
||||
The `view_ai_features_1h` view enriches each IP with Anubis bot detection using a priority cascade:
|
||||
1. **UA + IP combined** (same `rule_id`) — highest confidence
|
||||
2. **UA only** (no IP requirement)
|
||||
3. **IP only** (no UA requirement)
|
||||
4. **ASN match**
|
||||
5. **Country match**
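
The cascade amounts to a first-match-wins lookup. The sketch below is hypothetical: in the platform the logic lives in ClickHouse SQL, and the interpretation of the `has_ip` / `has_ua` flags (a rule that also requires the other dimension) is an assumption inferred from the dictionary attribute names.

```python
def resolve_anubis(ua_hit=None, ip_hit=None, asn_hit=None, country_hit=None):
    """Return (level, rule) for the highest-priority match, or (None, None).

    Each *_hit is the dictionary lookup result (a dict) or None when no rule matched.
    """
    # 1. UA and IP rules that belong to the same rule_id: highest confidence.
    if ua_hit and ip_hit and ua_hit["rule_id"] == ip_hit["rule_id"]:
        return "ua+ip", ua_hit
    # 2. UA rule that does not also require an IP match.
    if ua_hit and not ua_hit.get("has_ip"):
        return "ua", ua_hit
    # 3. IP rule that does not also require a UA match.
    if ip_hit and not ip_hit.get("has_ua"):
        return "ip", ip_hit
    # 4. and 5. Coarser fallbacks.
    if asn_hit:
        return "asn", asn_hit
    if country_hit:
        return "country", country_hit
    return None, None
```

Ordering the checks from most to least specific ensures that a broad ASN or country rule never overrides a precise UA or IP rule.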
|
||||
|
||||
## Environment Variables
|
||||
|
||||
| Variable | Type | Default | Description |
|
||||
|----------|------|---------|-------------|
|
||||
| `CLICKHOUSE_HOST` | string | `clickhouse` | ClickHouse server hostname |
|
||||
| `CLICKHOUSE_PORT` | int | `8123` | ClickHouse HTTP port |
|
||||
| `CLICKHOUSE_DB` | string | `mabase_prod` | Database name |
|
||||
| `CLICKHOUSE_USER` | string | `admin` | ClickHouse username |
|
||||
| `CLICKHOUSE_PASSWORD` | string | `""` | ClickHouse password |
|
||||
| `ISOLATION_CONTAMINATION` | float | `0.02` | Contamination parameter for Isolation Forest |
|
||||
| `ANOMALY_THRESHOLD` | float | `-0.03` | Score threshold for anomaly detection |
|
||||
| `ANOMALY_PERCENTILE` | int | `5` | Percentile for adaptive threshold (A2) |
|
||||
| `CYCLE_INTERVAL_SEC` | int | `300` | Seconds between detection cycles |
|
||||
| `MAX_CONSECUTIVE_FAILURES` | int | `3` | Max consecutive failures before exit |
|
||||
| `BOT_DETECTOR_LOG` | string | `/var/log/bot_detector/decisions.jsonl` | Decision log file path |
|
||||
| `LOG_BACKUP_COUNT` | int | `7` | Number of rotated log backups |
|
||||
| `MODEL_DIR` | string | `/var/lib/bot_detector` | Model persistence directory |
|
||||
| `RETRAIN_INTERVAL_HOURS` | int | `24` | Hours between model retraining |
|
||||
| `MODEL_HISTORY_COUNT` | int | `10` | Number of model versions to keep |
|
||||
| `DRIFT_THRESHOLD` | float | `0.30` | KS-test drift threshold (A1) |
|
||||
| `ENABLE_MULTIWINDOW` | bool | `false` | Enable 24h multi-window analysis (A3) |
|
||||
| `MULTIWINDOW_VIEW` | string | `view_ai_features_24h` | View for multi-window mode |
|
||||
| `ENABLE_SHAP` | bool | `true` | Enable SHAP explainability (A4) |
|
||||
| `DEDUP_TTL_MIN` | int | `60` | Deduplication TTL in minutes (A5) |
|
||||
| `RECURRENCE_WEIGHT` | float | `0.005` | Recurrence score weighting factor (A6) |
|
||||
| `MIN_VALID_FEATURE_RATIO` | float | `0.50` | Min valid feature ratio (A7) |
|
||||
| `ENABLE_CLUSTERING` | bool | `true` | Enable DBSCAN clustering (A8) |
|
||||
| `CLUSTERING_MIN_SAMPLES` | int | `3` | DBSCAN min samples per cluster |
|
||||
| `HEALTH_PORT` | int | `8080` | Health check HTTP server port |
|
||||
|
||||
## Output Tables
|
||||
|
||||
### ml_detected_anomalies
|
||||
|
||||
Detections whose anomaly score falls below the detection threshold (lower scores are more anomalous). Engine: `ReplacingMergeTree(detected_at)`, ORDER BY `(src_ip)`, TTL 30 days.
|
||||
|
||||
Key columns: `detected_at`, `src_ip`, `ja4`, `host`, `bot_name`, `anomaly_score`, `raw_anomaly_score`, `threat_level`, `model_name`, `recurrence`, `campaign_id`, `reason`, `anubis_bot_name`, `anubis_bot_action`, `anubis_bot_category`, plus all ML features.
|
||||
|
||||
### ml_all_scores
|
||||
|
||||
All classifications (no threshold filter) for observability. Engine: `ReplacingMergeTree(detected_at)`, ORDER BY `(window_start, src_ip, ja4, host, model_name)`, TTL 3 days.
|
||||
|
||||
## Decision Log Format
|
||||
|
||||
The `decisions.jsonl` file contains structured JSONL entries:
|
||||
|
||||
```json
|
||||
{"event": "CYCLE_START", "cycle_id": "20260309T143000", "total": 5000, "human": 1500, "known_bot": 200, "correlated": 3000}
|
||||
{"event": "ANOMALY", "src_ip": "203.0.113.42", "score": -0.25, "threat_level": "HIGH", "reason": "hit_velocity=45.2, fuzzing_index=0.8, ...", "campaign_id": 3}
|
||||
{"event": "KNOWN_BOT", "src_ip": "198.51.100.10", "bot_name": "AhrefsBot"}
|
||||
{"event": "CYCLE_END", "cycle_id": "20260309T143000", "anomalies": 15, "known_bots": 200, "duration_sec": 12.5}
|
||||
```
|
||||
|
||||
Log rotation: 50 MB max size × `LOG_BACKUP_COUNT` backups (default 7).
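
A size-based rotation policy like this one maps directly onto Python's standard `RotatingFileHandler`. The following is a minimal sketch, not the service's logging setup: the function names and the logger name are hypothetical, and 50 MB is assumed to mean 50 × 1024 × 1024 bytes.

```python
import json
import logging
from logging.handlers import RotatingFileHandler


def make_decision_logger(path, backup_count=7, max_bytes=50 * 1024 * 1024):
    """Create a logger that writes one JSON object per line and rotates by size."""
    logger = logging.getLogger("decisions-sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.handlers = [handler]
    return logger


def log_event(logger, event, **fields):
    """Serialize one decision as a single JSONL entry."""
    logger.info(json.dumps({"event": event, **fields}))
```

For example, `log_event(logger, "KNOWN_BOT", src_ip="198.51.100.10", bot_name="AhrefsBot")` produces a line in the same shape as the entries shown above.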
|
||||
|
||||
## Health Check Endpoint
|
||||
|
||||
- **URL**: `GET http://localhost:8080/`
|
||||
- **Response**: `200 OK` with status JSON
|
||||
- Runs in a separate thread
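
An endpoint of this kind can be served from a background thread with the standard library alone. This is a hedged sketch: the `HealthHandler` class, the `start_health_server` function, and the `{"status": "ok"}` payload are assumptions, since the actual response body is not documented here.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = json.dumps({"status": "ok"}).encode()  # assumed payload
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep health probes out of the service log


def start_health_server(port=8080, host="0.0.0.0"):
    """Serve health checks from a daemon thread so the detection loop is not blocked."""
    server = HTTPServer((host, port), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

The default port matches `HEALTH_PORT`. Because the thread is a daemon, it does not keep the process alive once the main detection loop exits.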
|
||||
|
||||
## Model Persistence
|
||||
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| `model_<name>_<version>.joblib` | Serialized Isolation Forest (joblib) |
|
||||
| `model_<name>_<version>.meta.json` | Model metadata (features, thresholds, training stats) |
|
||||
| `model_<name>.current` | Pointer to active model version |
|
||||
| `training_history.jsonl` | Training history log |
|
||||
|
||||
Models are rotated: only the last `MODEL_HISTORY_COUNT` versions (default 10) are kept.
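
The rotation rule can be sketched with `pathlib`. The `prune_models` function is hypothetical, and the sketch assumes that version strings sort lexicographically in chronological order (for example, timestamps), which the file naming scheme above does not guarantee.

```python
from pathlib import Path


def prune_models(model_dir, name, keep=10):
    """Delete all but the newest `keep` versions of model_<name>_<version>.joblib."""
    directory = Path(model_dir)
    models = sorted(directory.glob(f"model_{name}_*.joblib"))
    stale = models[:-keep] if keep > 0 else models
    for model in stale:
        model.unlink()
        # Remove the matching metadata file alongside the serialized model.
        meta = model.with_name(model.name[: -len(".joblib")] + ".meta.json")
        if meta.exists():
            meta.unlink()
    return [m.name for m in stale]
```

The function returns the names of the removed model files, which is convenient for logging what was pruned.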
|
||||
|
||||
## Docker Deployment
|
||||
|
||||
```bash
|
||||
# Build
|
||||
make build-bot-detector
|
||||
|
||||
# Run with docker-compose
|
||||
cd services/bot-detector
|
||||
docker-compose up -d
|
||||
```
|
||||
|
||||
### Volumes
|
||||
|
||||
| Host Path | Container Path | Description |
|
||||
|-----------|---------------|-------------|
|
||||
| `./bot_detector_logs` | `/var/log/bot_detector` | Decision logs (JSONL) |
|
||||
| `./bot_detector_models` | `/var/lib/bot_detector` | Persisted ML models |
|
||||
| `./reputation/data/user_files/bot_ip.csv` | `/data/bot_ip.csv` (ro) | Known bot IP list |
|
||||
| `./reputation/data/user_files/bot_ja4.csv` | `/data/bot_ja4.csv` (ro) | Known bot JA4 list |
|
||||
| `./reputation/data/user_files/asn_reputation.csv` | `/data/asn_reputation.csv` (ro) | ASN reputation labels |
|
||||
220
docs/services/correlator.md
Normal file
@ -0,0 +1,220 @@
|
||||
# Correlator
|
||||
|
||||
The correlator (`logcorrelator`) is a Go daemon that joins HTTP events from [mod-reqin-log](mod-reqin-log.md) (source A) with TLS/network events from [sentinel](sentinel.md) (source B) into unified correlated log entries. It uses a `src_ip:src_port` key with a configurable time window to match events, supports HTTP Keep-Alive connections, and writes results to ClickHouse, file, and/or stdout.
|
||||
|
||||
## Correlation Algorithm
|
||||
|
||||
### Key Matching
|
||||
|
||||
Events are correlated by their **correlation key**: `src_ip:src_port`. Since a client's ephemeral source port uniquely identifies a TCP connection, matching on this pair reliably joins the HTTP request (seen by Apache) with the TLS handshake (seen by sentinel) from the same connection.
|
||||
|
||||
### Time Window
|
||||
|
||||
Events must arrive within the configured time window (default: **10 seconds**) to be matched. This accounts for:
|
||||
- Processing latency between Apache and sentinel
|
||||
- Packet capture buffering
|
||||
- UNIX socket delivery ordering
|
||||
|
||||
### Keep-Alive Support
|
||||
|
||||
In `one_to_many` mode (default), a single TLS handshake event (source B) can match **multiple** HTTP requests (source A) on the same TCP connection:
|
||||
|
||||
1. Source B event arrives → buffered with TTL (default: 120 s)
|
||||
2. Source A event arrives with same key → correlation match, B event TTL resets
|
||||
3. Next A event on same connection → matches same B event (TTL resets again)
|
||||
4. Connection closes → B event expires after TTL
|
||||
|
||||
Each A event within a Keep-Alive session gets an incrementing `keepalives` counter.
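
The `one_to_many` flow can be illustrated with a small buffer keyed by `src_ip:src_port`. The real correlator is written in Go; this Python sketch is only a model of the behavior described above. The `NetworkBuffer` class is hypothetical, the counter is assumed to start at 1 for the first request, and the separate correlation time window is not modeled.

```python
class NetworkBuffer:
    """Buffers source B events by 'src_ip:src_port' with a sliding TTL."""

    def __init__(self, ttl_s=120.0):
        self.ttl_s = ttl_s
        self._entries = {}  # key -> [b_event, expires_at, keepalives]

    @staticmethod
    def key(event):
        return f"{event['src_ip']}:{event['src_port']}"

    def add_b(self, event, now):
        """Step 1: buffer the TLS/network event with a fresh TTL."""
        self._entries[self.key(event)] = [event, now + self.ttl_s, 0]

    def match_a(self, event, now):
        """Steps 2-4: return (b_event, keepalives), or None if no live B event."""
        key = self.key(event)
        entry = self._entries.get(key)
        if entry is None:
            return None
        if now > entry[1]:
            del self._entries[key]  # connection idle for longer than the TTL
            return None
        entry[1] = now + self.ttl_s  # each match resets the TTL
        entry[2] += 1
        return entry[0], entry[2]
```

Resetting the TTL on every match keeps a long-lived Keep-Alive connection correlated for as long as requests continue to arrive within `network_ttl_s` of each other.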
|
||||
|
||||
### Orphan Handling
|
||||
|
||||
- **Source A orphans** (HTTP without TLS match): Emitted after `apache_emit_delay_ms` (default: 500 ms) with `correlated=false`, `orphan_side=A`
|
||||
- **Source B orphans** (TLS without HTTP match): Not emitted by default (`network_emit: false`)
|
||||
- **Buffer overflow**: Oldest events are rotated out and emitted as orphans
|
||||
|
||||
### Field Merging
|
||||
|
||||
When two events are correlated:
|
||||
- HTTP fields (method, path, headers, etc.) come from source A
|
||||
- TLS/network fields (JA4, JA3, IP/TCP metadata) come from source B
|
||||
- On field collision with different values: both are kept with `a_` and `b_` prefixes
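
The merge rules can be sketched as a dictionary merge. The correlator itself is implemented in Go, so this Python function (`merge_events`, a hypothetical name) only illustrates the collision rule; it assumes that a field present on both sides with equal values is kept once under its original name.

```python
def merge_events(a_event, b_event):
    """Merge an HTTP event (source A) with a network event (source B)."""
    merged = {}
    for name in a_event.keys() | b_event.keys():
        in_a, in_b = name in a_event, name in b_event
        if in_a and in_b and a_event[name] != b_event[name]:
            # Collision with different values: keep both, prefixed by source.
            merged[f"a_{name}"] = a_event[name]
            merged[f"b_{name}"] = b_event[name]
        elif in_a:
            merged[name] = a_event[name]
        else:
            merged[name] = b_event[name]
    merged["correlated"] = True
    return merged
```

Prefixing conflicting values instead of choosing one side preserves both observations, which is useful when a mismatch is itself a signal.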
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
Configuration is loaded from a YAML file (default: `/etc/logcorrelator/logcorrelator.yml`).
|
||||
|
||||
### Log Settings
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `log.level` | string | `INFO` | Log level: `DEBUG`, `INFO`, `WARN`, `ERROR` |
|
||||
|
||||
### Input Settings
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `inputs.unix_sockets[].name` | string | — | Human-readable source name (e.g., `http`, `network`) |
|
||||
| `inputs.unix_sockets[].path` | string | — | UNIX socket path to listen on |
|
||||
| `inputs.unix_sockets[].format` | string | `json` | Input format |
|
||||
| `inputs.unix_sockets[].source_type` | string | — | Event source: `A` (HTTP), `B` (Network) |
|
||||
| `inputs.unix_sockets[].socket_permissions` | string | `0666` | Socket file permissions (octal) |
|
||||
|
||||
### Output Settings
|
||||
|
||||
#### File Output
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `outputs.file.enabled` | bool | `true` | Enable file output |
|
||||
| `outputs.file.path` | string | `/var/log/logcorrelator/correlated.log` | Output file path |
|
||||
|
||||
#### ClickHouse Output
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `outputs.clickhouse.enabled` | bool | `false` | Enable ClickHouse output |
|
||||
| `outputs.clickhouse.dsn` | string | — | ClickHouse DSN (e.g., `clickhouse://user:pass@host:9000/db`) |
|
||||
| `outputs.clickhouse.table` | string | — | Target table name |
|
||||
| `outputs.clickhouse.batch_size` | int | `500` | Records per batch insert |
|
||||
| `outputs.clickhouse.flush_interval_ms` | int | `200` | Flush interval in milliseconds |
|
||||
| `outputs.clickhouse.max_buffer_size` | int | `5000` | Maximum in-memory buffer size |
|
||||
| `outputs.clickhouse.drop_on_overflow` | bool | `true` | Drop records when buffer is full |
|
||||
| `outputs.clickhouse.async_insert` | bool | `true` | Use ClickHouse async inserts |
|
||||
| `outputs.clickhouse.timeout_ms` | int | `1000` | Operation timeout in milliseconds |
|
||||
|
||||
#### Stdout Output
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `outputs.stdout.enabled` | bool | `false` | Enable stdout output |
|
||||
| `outputs.stdout.level` | string | — | Output verbosity filter |
|
||||
|
||||
### Correlation Settings
|
||||
|
||||
| Name | Type | Default | Description |
|------|------|---------|-------------|
| `correlation.time_window.value` | int | `10` | Time window value |
| `correlation.time_window.unit` | string | `s` | Time window unit (`s`, `ms`) |
| `correlation.orphan_policy.apache_always_emit` | bool | `true` | Always emit A events even without B match |
| `correlation.orphan_policy.apache_emit_delay_ms` | int | `500` | Delay before emitting orphan A (ms) |
| `correlation.orphan_policy.network_emit` | bool | `false` | Emit B events without A match |
| `correlation.matching.mode` | string | `one_to_many` | Matching mode: `one_to_one` or `one_to_many` |
| `correlation.buffers.max_http_items` | int | `10000` | Max buffered HTTP (source A) events |
| `correlation.buffers.max_network_items` | int | `20000` | Max buffered network (source B) events |
| `correlation.ttl.network_ttl_s` | int | `120` | TTL for source B events (seconds) |
| `correlation.exclude_source_ips` | []string | `[]` | IPs or CIDRs to exclude from correlation |
| `correlation.include_dest_ports` | []int | `[]` | If non-empty, only correlate events on these ports |
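
The time window setting can be illustrated with a minimal sketch. It assumes events are joined on `(src_ip, src_port)`, which this page does not state explicitly, and that each event carries a parsed datetime under the hypothetical key `ts`. The function names are illustrative and are not taken from the correlator source.

```python
from datetime import datetime, timedelta


def window(value: int, unit: str) -> timedelta:
    """Convert correlation.time_window.{value,unit} into a timedelta."""
    if unit == "s":
        return timedelta(seconds=value)
    if unit == "ms":
        return timedelta(milliseconds=value)
    raise ValueError(f"unsupported time window unit: {unit!r}")


def match(http_event: dict, network_events: list[dict], win: timedelta) -> list[dict]:
    """Return the network (B) events that fall inside the window of an HTTP (A) event.

    Assumption: the join key is (src_ip, src_port); "ts" is a hypothetical
    field holding an already-parsed datetime.
    """
    key = (http_event["src_ip"], http_event["src_port"])
    return [
        b
        for b in network_events
        if (b["src_ip"], b["src_port"]) == key
        and abs(b["ts"] - http_event["ts"]) <= win
    ]
```

With the default window of 10 s, a B event seen 3 s before the A event on the same source IP and port would match, while one seen 30 s earlier would not.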
|
||||
|
||||
### Metrics Settings
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `metrics.enabled` | bool | `false` | Enable metrics HTTP server |
|
||||
| `metrics.addr` | string | `:8080` | Metrics server listen address |
|
||||
|
||||
## Input Events
|
||||
|
||||
### Source A (HTTP — from mod-reqin-log)
|
||||
|
||||
JSON fields: `time`, `src_ip`, `src_port`, `dst_ip`, `dst_port`, `method`, `scheme`, `host`, `path`, `query`, `http_version`, `client_headers`, `header_*`
|
||||
|
||||
### Source B (Network — from sentinel)
|
||||
|
||||
JSON fields: `src_ip`, `src_port`, `dst_ip`, `dst_port`, `ip_meta_*`, `tcp_meta_*`, `tls_version`, `tls_sni`, `tls_alpn`, `ja4`, `ja3`, `ja3_hash`, `conn_id`, `syn_to_clienthello_ms`, `timestamp`
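
The two sources encode time differently: the mod-reqin-log page documents `time` as an ISO 8601 UTC string, and the sentinel page documents `timestamp` as Unix nanoseconds. A consumer comparing both has to normalize them first. The sketch below shows one way to do that; the helper names are hypothetical, and the conversion keeps only microsecond precision.

```python
from datetime import datetime, timezone


def parse_source_a_time(value: str) -> datetime:
    """Parse an ISO 8601 UTC string such as '2026-03-09T14:30:00Z'."""
    # datetime.fromisoformat() only accepts a trailing 'Z' from Python 3.11 on,
    # so the suffix is rewritten to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_source_b_timestamp(nanos: int) -> datetime:
    """Convert Unix nanoseconds to an aware UTC datetime (microsecond precision)."""
    seconds, rem_ns = divmod(nanos, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return base.replace(microsecond=rem_ns // 1000)
```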
|
||||
|
||||
## Output CorrelatedLog JSON Schema
|
||||
|
||||
```json
{
  "timestamp": "2026-03-09T14:30:00Z",
  "src_ip": "203.0.113.42",
  "src_port": 52341,
  "dst_ip": "192.168.1.10",
  "dst_port": 443,
  "correlated": true,
  "method": "GET",
  "host": "example.com",
  "path": "/api/v1/users",
  "ja4": "t13d1516h2_8daaf6152771_b0da82dd1658",
  "ja3_hash": "e7d705a3286e19ea42f587b344ee6865",
  "ip_meta_ttl": 64,
  "tcp_meta_window_size": 65535,
  "tls_version": "1.3",
  "tls_sni": "example.com",
  "tls_alpn": "h2",
  "header_User-Agent": "Mozilla/5.0 ...",
  "keepalives": 3
}
```
|
||||
|
||||
Core fields are always present; additional fields are merged from A and B event raw data.
|
||||
|
||||
## ClickHouse Sink
|
||||
|
||||
- **Protocol**: ClickHouse native TCP (port 9000) via `clickhouse-go/v2`
- **Target table**: `http_logs_raw` (raw JSON is stored, then parsed by materialized views)
- **Batch inserts**: Records are buffered and inserted in batches of up to `batch_size` (default 500)
- **Flush interval**: A timer (`flush_interval_ms`, default 200 ms) flushes a batch that has not filled up
- **Retry behavior**: Up to 3 retries with exponential backoff (100 ms base)
- **Connection ping**: 5-second timeout on startup
- **Buffer overflow**: When the buffer is full (`max_buffer_size`, default 5000), records are dropped if `drop_on_overflow` is `true` (the default)
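
The retry and overflow rules in this list can be sketched as follows. This is an illustration under stated assumptions, not the correlator's Go implementation: the doubling factor, the choice to drop the incoming record rather than the oldest one, and all names are hypothetical.

```python
from collections import deque


def backoff_delays_ms(base_ms: int = 100, retries: int = 3, factor: int = 2) -> list[int]:
    """Delay before each retry. The doubling factor is an assumption."""
    return [base_ms * factor**attempt for attempt in range(retries)]


class BoundedBuffer:
    """In-memory record buffer that drops incoming records once it is full."""

    def __init__(self, max_buffer_size: int = 5000, batch_size: int = 500):
        self.max_buffer_size = max_buffer_size
        self.batch_size = batch_size
        self.records: deque = deque()
        self.dropped = 0

    def add(self, record: dict) -> bool:
        """Queue a record; return False if it was dropped on overflow."""
        if len(self.records) >= self.max_buffer_size:
            self.dropped += 1
            return False
        self.records.append(record)
        return True

    def take_batch(self) -> list[dict]:
        """Remove and return up to batch_size records for one insert."""
        n = min(self.batch_size, len(self.records))
        return [self.records.popleft() for _ in range(n)]
```

A flush timer would call `take_batch()` every `flush_interval_ms` even when fewer than `batch_size` records are queued.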
|
||||
|
||||
## Metrics HTTP Server
|
||||
|
||||
When `metrics.enabled: true`, exposes:
|
||||
|
||||
| Endpoint | Description |
|
||||
|----------|-------------|
|
||||
| `GET /metrics` | Correlation metrics as JSON (events received, correlated, orphans, buffer sizes) |
|
||||
| `GET /health` | Health check endpoint |
|
||||
|
||||
## systemd Service
|
||||
|
||||
```ini
[Unit]
Description=logcorrelator service
After=network.target

[Service]
Type=simple
User=logcorrelator
Group=logcorrelator
ExecStart=/usr/bin/logcorrelator -config /etc/logcorrelator/logcorrelator.yml
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
RestartSec=5
RuntimeDirectory=logcorrelator
RuntimeDirectoryMode=0755

# Security hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/log/logcorrelator /etc/logcorrelator

# Resource limits
LimitNOFILE=65536
TimeoutStartSec=10
TimeoutStopSec=30

[Install]
WantedBy=multi-user.target
```
|
||||
|
||||
### Security Hardening
|
||||
|
||||
- Runs as dedicated `logcorrelator` user/group
|
||||
- `NoNewPrivileges=true` — prevents privilege escalation
|
||||
- `ProtectSystem=strict` — read-only filesystem except `ReadWritePaths`
|
||||
- `ProtectHome=true` — no access to home directories
|
||||
- `RuntimeDirectory=logcorrelator` — systemd creates socket directory with correct ownership
|
||||
|
||||
## RPM Package Contents
|
||||
|
||||
| Path | Description |
|
||||
|------|-------------|
|
||||
| `/usr/bin/logcorrelator` | Binary |
|
||||
| `/etc/logcorrelator/logcorrelator.yml` | Configuration file |
|
||||
| `/usr/lib/systemd/system/logcorrelator.service` | systemd unit |
|
||||
| `/var/log/logcorrelator/` | Log directory |
|
||||
| `/var/run/logcorrelator/` | Socket directory (RuntimeDirectory) |
|
||||
308
docs/services/dashboard.md
Normal file
@ -0,0 +1,308 @@
|
||||
# Dashboard
|
||||
|
||||
The dashboard is a SOC (Security Operations Center) web application built with FastAPI (backend) and React (frontend) that provides real-time visualization, investigation, and analysis of bot detections generated by the [bot-detector](bot-detector.md). It queries ClickHouse (`mabase_prod`) for all data.
|
||||
|
||||
## Technology Stack
|
||||
|
||||
| Component | Technology |
|
||||
|-----------|-----------|
|
||||
| Backend | Python 3.11 + FastAPI |
|
||||
| Frontend | React + Vite |
|
||||
| Database | ClickHouse (via `ja4_common` shared client) |
|
||||
| API Docs | Swagger UI (`/docs`) and ReDoc (`/redoc`) |
|
||||
|
||||
## Configuration
|
||||
|
||||
| Variable | Type | Default | Description |
|----------|------|---------|-------------|
| `CLICKHOUSE_HOST` | string | `clickhouse` | ClickHouse hostname |
| `CLICKHOUSE_PORT` | int | `8123` | ClickHouse HTTP port |
| `CLICKHOUSE_DB` | string | `mabase_prod` | Database name |
| `CLICKHOUSE_USER` | string | `admin` | ClickHouse user |
| `CLICKHOUSE_PASSWORD` | string | `""` | ClickHouse password |
| `API_HOST` | string | `0.0.0.0` | API listen address |
| `API_PORT` | int | `8000` | API listen port |
| `CORS_ORIGINS` | list | `["http://localhost:3000", "http://127.0.0.1:3000"]` | Allowed CORS origins |
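
As a rough illustration of how these variables and defaults fit together, the sketch below reads them from the environment. `DashboardSettings` and `load_settings` are hypothetical names, not the dashboard's actual settings class, and parsing `CORS_ORIGINS` as a JSON array is an assumption based on the default shown in the table.

```python
import json
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional

DEFAULT_CORS = ["http://localhost:3000", "http://127.0.0.1:3000"]


@dataclass
class DashboardSettings:
    """Hypothetical container mirroring the defaults in the table above."""

    clickhouse_host: str = "clickhouse"
    clickhouse_port: int = 8123
    clickhouse_db: str = "mabase_prod"
    clickhouse_user: str = "admin"
    clickhouse_password: str = ""
    api_host: str = "0.0.0.0"
    api_port: int = 8000
    cors_origins: list = field(default_factory=lambda: list(DEFAULT_CORS))


def load_settings(env: Optional[Mapping[str, str]] = None) -> DashboardSettings:
    """Build settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    d = DashboardSettings()
    return DashboardSettings(
        clickhouse_host=env.get("CLICKHOUSE_HOST", d.clickhouse_host),
        clickhouse_port=int(env.get("CLICKHOUSE_PORT", d.clickhouse_port)),
        clickhouse_db=env.get("CLICKHOUSE_DB", d.clickhouse_db),
        clickhouse_user=env.get("CLICKHOUSE_USER", d.clickhouse_user),
        clickhouse_password=env.get("CLICKHOUSE_PASSWORD", d.clickhouse_password),
        api_host=env.get("API_HOST", d.api_host),
        api_port=int(env.get("API_PORT", d.api_port)),
        # Assumption: the list is supplied as a JSON array string.
        cors_origins=json.loads(env["CORS_ORIGINS"]) if "CORS_ORIGINS" in env else d.cors_origins,
    )
```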
|
||||
|
||||
## API Reference
|
||||
|
||||
All endpoints except the `/health` check are prefixed with `/api/`. The dashboard exposes **74+ endpoints** across 20 routers.
|
||||
|
||||
### Health
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/health` | Health check — returns ClickHouse connection status |
|
||||
|
||||
---
|
||||
|
||||
### Metrics (`/api/metrics`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/metrics` | Global dashboard metrics: detection counts by threat level, unique IPs, time series |
|
||||
| GET | `/api/metrics/threats` | Threat distribution summary |
|
||||
| GET | `/api/metrics/baseline` | Human baseline statistics |
|
||||
|
||||
---
|
||||
|
||||
### Detections (`/api/detections`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/detections` | Paginated detection list with filtering, sorting, and text search |
|
||||
| GET | `/api/detections/{detection_id}` | Single detection details |
|
||||
|
||||
**Query Parameters** (GET `/api/detections`):
|
||||
|
||||
| Parameter | Type | Description |
|-----------|------|-------------|
| `page` | int | Page number (default: 1) |
| `page_size` | int | Items per page (default: 20) |
| `threat_level` | string | Filter by threat level |
| `model_name` | string | Filter by model name |
| `search` | string | Full-text search across IP, JA4, host, bot_name |
| `sort_by` | string | Sort field |
| `sort_order` | string | `asc` or `desc` |
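
A client can assemble these parameters with the standard library. The sketch below builds a request URL only. The base URL, the `threat_level` value `high`, and the sort field `detected_at` are placeholders for illustration, since the accepted values are not listed here.

```python
from urllib.parse import urlencode


def detections_url(base_url: str, **params) -> str:
    """Build a GET /api/detections URL, skipping parameters left as None."""
    query = {k: v for k, v in params.items() if v is not None}
    url = f"{base_url.rstrip('/')}/api/detections"
    return f"{url}?{urlencode(query)}" if query else url


# Placeholder values: "high" and "detected_at" are illustrative only.
url = detections_url(
    "http://localhost:8000",
    page=2,
    page_size=50,
    threat_level="high",
    sort_by="detected_at",
    sort_order="desc",
    search=None,
)
```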
|
||||
|
||||
---
|
||||
|
||||
### Investigation (`/api/investigation`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/investigation/{ip}/summary` | **Primary investigation endpoint.** Aggregates ML score, brute-force, TCP spoofing, JA4 rotation, persistence, and 24h timeline into a single response with a `risk_score` (0–100) |
|
||||
|
||||
---
|
||||
|
||||
### Reputation (`/api/reputation`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/reputation/ip/{ip_address}` | Full IP reputation from IP-API.com and IPinfo.io (proxy, VPN, Tor, hosting detection) |
|
||||
| GET | `/api/reputation/ip/{ip_address}/summary` | Simplified reputation summary |
|
||||
|
||||
---
|
||||
|
||||
### Analysis (`/api/analysis`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/analysis/{ip}/subnet` | Subnet analysis for an IP (related IPs in same /24) |
|
||||
| GET | `/api/analysis/{ip}/country` | Country-level analysis for an IP |
|
||||
| GET | `/api/analysis/country` | Global country analysis across all detections |
|
||||
| GET | `/api/analysis/{ip}/ja4` | JA4 fingerprint analysis for an IP |
|
||||
| GET | `/api/analysis/{ip}/user-agents` | User-agent analysis for an IP |
|
||||
| GET | `/api/analysis/{ip}/recommendation` | SOC classification recommendation |
|
||||
| POST | `/api/analysis/classifications` | Create a classification (legitimate/suspicious/malicious) |
|
||||
| GET | `/api/analysis/classifications` | List all classifications |
|
||||
| GET | `/api/analysis/classifications/stats` | Classification statistics |
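
The subnet endpoint groups an IP with others in the same /24. How the dashboard computes this is not shown here; the sketch below only demonstrates the /24 derivation itself with Python's `ipaddress` module, for IPv4 addresses. Function names are hypothetical.

```python
import ipaddress


def subnet_24(ip: str) -> ipaddress.IPv4Network:
    """Return the /24 network containing an IPv4 address."""
    # strict=False lets us pass a host address instead of the network address.
    return ipaddress.IPv4Network(f"{ip}/24", strict=False)


def same_subnet(ip: str, candidates: list[str]) -> list[str]:
    """Return the candidates (other than ip itself) that share ip's /24."""
    net = subnet_24(ip)
    return [c for c in candidates if c != ip and ipaddress.ip_address(c) in net]
```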
|
||||
|
||||
---
|
||||
|
||||
### Entities (`/api/entities`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/entities/types` | List available entity types |
|
||||
| GET | `/api/entities/subnet/{subnet}` | Investigate a subnet |
|
||||
| GET | `/api/entities/{entity_type}/{entity_value}` | Investigate any entity (IP, JA4, subnet, UA, host) |
|
||||
| GET | `/api/entities/{entity_type}/{entity_value}/related` | Related entities |
|
||||
| GET | `/api/entities/{entity_type}/{entity_value}/user_agents` | User-agents for entity |
|
||||
| GET | `/api/entities/{entity_type}/{entity_value}/client_headers` | Client headers for entity |
|
||||
| GET | `/api/entities/{entity_type}/{entity_value}/paths` | URL paths for entity |
|
||||
| GET | `/api/entities/{entity_type}/{entity_value}/query_params` | Query parameters for entity |
|
||||
|
||||
---
|
||||
|
||||
### Incidents (`/api/incidents`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/incidents` | List all incidents |
|
||||
| GET | `/api/incidents/clusters` | Active incident clusters (behavioral similarity grouping) |
|
||||
| GET | `/api/incidents/{cluster_id}` | Incident cluster details |
|
||||
| POST | `/api/incidents/{cluster_id}/classify` | Classify an incident cluster |
|
||||
|
||||
---
|
||||
|
||||
### Fingerprints (`/api/fingerprints`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/fingerprints/spoofing` | TLS fingerprint spoofing detection |
|
||||
| GET | `/api/fingerprints/ja4-ua-matrix` | JA4 ↔ User-Agent correlation matrix |
|
||||
| GET | `/api/fingerprints/ua-analysis` | Suspicious user-agent analysis |
|
||||
| GET | `/api/fingerprints/ip/{ip}/coherence` | Fingerprint coherence analysis per IP |
|
||||
| GET | `/api/fingerprints/legitimate-ja4` | Known legitimate JA4 fingerprints |
|
||||
| GET | `/api/fingerprints/asn-correlation` | JA4-ASN correlation analysis |
|
||||
|
||||
---
|
||||
|
||||
### Brute Force (`/api/bruteforce`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/bruteforce/targets` | Brute-force target hosts |
|
||||
| GET | `/api/bruteforce/attackers` | Brute-force attacker IPs |
|
||||
| GET | `/api/bruteforce/timeline` | Brute-force attack timeline |
|
||||
| GET | `/api/bruteforce/host/{host}/attackers` | Attackers for a specific host |
|
||||
|
||||
---
|
||||
|
||||
### TCP Spoofing (`/api/tcp-spoofing`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/tcp-spoofing/overview` | TCP/OS fingerprint spoofing overview |
|
||||
| GET | `/api/tcp-spoofing/list` | Spoofing detection list |
|
||||
| GET | `/api/tcp-spoofing/matrix` | TTL × MSS anomaly matrix |
|
||||
|
||||
---
|
||||
|
||||
### Header Fingerprint (`/api/headers`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/headers/clusters` | Header fingerprint clusters (suspicious patterns) |
|
||||
| GET | `/api/headers/cluster/{hash}/ips` | IPs sharing a header fingerprint |
|
||||
|
||||
---
|
||||
|
||||
### Heatmap (`/api/heatmap`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/heatmap/hourly` | Hourly traffic heatmap |
|
||||
| GET | `/api/heatmap/top-hosts` | Top hosts by traffic volume |
|
||||
| GET | `/api/heatmap/matrix` | Activity/hour matrix |
|
||||
|
||||
---
|
||||
|
||||
### Botnets (`/api/botnets`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/botnets/ja4-spread` | JA4 geographic spread (botnet indicator) |
|
||||
| GET | `/api/botnets/ja4/{ja4}/countries` | Country distribution for a JA4 fingerprint |
|
||||
| GET | `/api/botnets/summary` | Global botnet detection summary |
|
||||
|
||||
---
|
||||
|
||||
### Rotation (`/api/rotation`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/rotation/ja4-rotators` | IPs rotating JA4 fingerprints (evasion detection) |
|
||||
| GET | `/api/rotation/persistent-threats` | Persistent threats across time windows |
|
||||
| GET | `/api/rotation/ip/{ip}/ja4-history` | JA4 fingerprint history for an IP |
|
||||
| GET | `/api/rotation/sophistication` | Sophistication score analysis |
|
||||
| GET | `/api/rotation/proactive-hunt` | Proactive threat hunting suggestions |
|
||||
|
||||
---
|
||||
|
||||
### ML Features (`/api/ml`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/ml/top-anomalies` | Top anomalies with feature details |
|
||||
| GET | `/api/ml/ip/{ip}/radar` | Feature radar chart data for an IP |
|
||||
| GET | `/api/ml/score-distribution` | Anomaly score distribution histogram |
|
||||
| GET | `/api/ml/score-trends` | Score trends over time |
|
||||
| GET | `/api/ml/b-features` | Source B (TCP/TLS) feature analysis |
|
||||
| GET | `/api/ml/campaigns` | ML-detected campaign analysis |
|
||||
| GET | `/api/ml/scatter` | Feature scatter plot data |
|
||||
|
||||
---
|
||||
|
||||
### Attributes (`/api/attributes`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/attributes/{attr_type}` | List distinct values for an attribute (ja4, user_agent, asn, country, host) with counts |
|
||||
|
||||
---
|
||||
|
||||
### Variability (`/api/variability`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/variability/{attr_type}/{value}` | Behavioral variability analysis for an attribute value |
|
||||
| GET | `/api/variability/{attr_type}/{value}/ips` | IPs associated with an attribute value |
|
||||
| GET | `/api/variability/{attr_type}/{value}/attributes` | Attribute breakdown for a value |
|
||||
| GET | `/api/variability/{attr_type}/{value}/user_agents` | User-agents for an attribute value |
|
||||
|
||||
---
|
||||
|
||||
### Clustering (`/api/clustering`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/clustering/status` | Clustering cache status |
|
||||
| GET | `/api/clustering/clusters` | K-Means cluster list |
|
||||
| GET | `/api/clustering/cluster/{cluster_id}/points` | Data points in a cluster |
|
||||
| GET | `/api/clustering/cluster/{cluster_id}/ips` | IPs in a cluster |
|
||||
|
||||
---
|
||||
|
||||
### Search (`/api/search`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| GET | `/api/search/quick` | Cross-entity search (IP, JA4, host, UA, country, ASN) |
|
||||
|
||||
---
|
||||
|
||||
### Audit (`/api/audit`)
|
||||
|
||||
| Method | Path | Description |
|
||||
|--------|------|-------------|
|
||||
| POST | `/api/audit/logs` | Create an audit log entry |
|
||||
| GET | `/api/audit/logs` | Query audit logs (filtered, paginated) |
|
||||
| GET | `/api/audit/stats` | Audit statistics |
|
||||
| GET | `/api/audit/users/activity` | Per-user activity summary |
|
||||
|
||||
## Frontend Structure
|
||||
|
||||
The React frontend is built with Vite and served as static assets:
|
||||
|
||||
- **Entry point**: `/` → `frontend/dist/index.html`
|
||||
- **Static assets**: `/assets/*` → `frontend/dist/assets/`
|
||||
- **SPA routing**: All non-`/api/` paths fall through to `index.html` (React Router)
|
||||
- **API proxy**: Frontend calls `/api/*` which is handled by FastAPI routers
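
The routing rules in this list amount to a small decision function. The sketch below is a simplified model of that decision, not the dashboard's FastAPI code. Treating `/health`, `/docs`, and `/redoc` as backend routes is an assumption drawn from the endpoints listed elsewhere on this page, and the function does no path sanitization.

```python
# Assumption: these backend paths are served by FastAPI outside the /api/ prefix.
BACKEND_EXACT = {"/health", "/docs", "/redoc"}


def route(path: str) -> str:
    """Classify a request path as backend, static asset, or SPA fallback."""
    if path in BACKEND_EXACT or path == "/api" or path.startswith("/api/"):
        return "backend"
    if path.startswith("/assets/"):
        return "frontend/dist/assets/" + path[len("/assets/"):]
    # Everything else is handled client-side by React Router.
    return "frontend/dist/index.html"
```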
|
||||
|
||||
## Services
|
||||
|
||||
### IPReputationService
|
||||
|
||||
Queries public IP reputation databases (IP-API.com, IPinfo.io) without API keys:
|
||||
- Proxy/VPN/Tor detection
|
||||
- ASN, country, ISP information
|
||||
- Hosting provider identification
|
||||
|
||||
### ClusteringEngine
|
||||
|
||||
K-Means clustering on ML features with caching:
|
||||
- Automatic cluster count selection
|
||||
- Feature normalization via StandardScaler
|
||||
- In-memory cache with TTL
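
The caching part of this design can be sketched without any ML dependency. The class below is a minimal single-value TTL cache under assumed names; the real engine's cache keys, TTL value, and invalidation rules are not documented here.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache one computed value and recompute it after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._value: Any = None
        self._expires_at: Optional[float] = None

    def get_or_compute(self, compute: Callable[[], Any]) -> Any:
        """Return the cached value, calling compute() only when it is stale."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = compute()
            self._expires_at = now + self._ttl
        return self._value
```

An expensive clustering run would be passed as `compute`, so repeated requests inside the TTL reuse the previous result.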
|
||||
|
||||
## Deployment
|
||||
|
||||
```bash
# Build Docker image
make build-dashboard

# Run tests
make test-dashboard

# Run locally (development)
cd services/dashboard
uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
```
|
||||
|
||||
### Health Check
|
||||
|
||||
```
|
||||
GET /health → {"status": "healthy", "clickhouse": "connected"}
|
||||
```
|
||||
200
docs/services/mod-reqin-log.md
Normal file
@ -0,0 +1,200 @@
|
||||
# mod-reqin-log
|
||||
|
||||
`mod_reqin_log` is an Apache HTTPD module (C shared object) that captures HTTP request metadata and sends it as JSON to a UNIX datagram socket. It serves as the HTTP-layer ingestion point for the ja4-platform pipeline, feeding request data to the [correlator](correlator.md) for joining with TLS fingerprint data from [sentinel](sentinel.md).
|
||||
|
||||
## Purpose
|
||||
|
||||
Apache processes HTTP requests after TLS termination, so it has access to the decoded HTTP method, path, headers, and client IP/port. mod-reqin-log hooks into the `post_read_request` phase to serialize this data immediately, before any rewrite or auth module modifies the request.
|
||||
|
||||
## Apache Directives Reference
|
||||
|
||||
All directives are server-level (`RSRC_CONF`):
|
||||
|
||||
| Directive | Type | Default | Description |
|-----------|------|---------|-------------|
| `JsonSockLogEnabled` | Flag (On/Off) | Off | Enable or disable the module |
| `JsonSockLogSocket` | String | — | UNIX domain socket path for JSON output |
| `JsonSockLogHeaders` | String list | — | HTTP header names to log (repeatable) |
| `JsonSockLogMaxHeaders` | Integer | `25` | Maximum number of headers to log |
| `JsonSockLogMaxHeaderValueLen` | Integer | `256` | Maximum length of each header value (truncated beyond) |
| `JsonSockLogReconnectInterval` | Integer (seconds) | `10` | Minimum seconds between reconnection attempts |
| `JsonSockLogErrorReportInterval` | Integer (seconds) | `10` | Minimum seconds between error log entries (throttling) |
| `JsonSockLogLevel` | String | `WARNING` | Module log level: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `EMERG` |
|
||||
|
||||
### Example httpd.conf
|
||||
|
||||
```apache
LoadModule reqin_log_module modules/mod_reqin_log.so

JsonSockLogEnabled On
JsonSockLogSocket /var/run/logcorrelator/http.socket
JsonSockLogHeaders User-Agent Accept Accept-Encoding Accept-Language
JsonSockLogHeaders Content-Type X-Request-Id X-Trace-Id X-Forwarded-For
JsonSockLogHeaders Sec-CH-UA Sec-CH-UA-Mobile Sec-CH-UA-Platform
JsonSockLogHeaders Sec-Fetch-Dest Sec-Fetch-Mode Sec-Fetch-Site
JsonSockLogMaxHeaders 25
JsonSockLogMaxHeaderValueLen 256
JsonSockLogReconnectInterval 10
JsonSockLogErrorReportInterval 10
JsonSockLogLevel WARNING
```
|
||||
|
||||
## Output JSON Schema
|
||||
|
||||
Each HTTP request is serialized as a flat JSON object and sent as a single UNIX datagram:
|
||||
|
||||
```json
{
  "time": "2026-03-09T14:30:00Z",
  "src_ip": "203.0.113.42",
  "src_port": 52341,
  "dst_ip": "192.168.1.10",
  "dst_port": 443,
  "method": "GET",
  "scheme": "https",
  "host": "example.com",
  "path": "/api/v1/users",
  "query": "page=1&limit=20",
  "http_version": "HTTP/2.0",
  "client_headers": "User-Agent,Accept,Accept-Encoding,Accept-Language",
  "header_User-Agent": "Mozilla/5.0 ...",
  "header_Accept": "text/html,application/xhtml+xml",
  "header_Accept-Encoding": "gzip, deflate, br",
  "header_Accept-Language": "en-US,en;q=0.9",
  "header_Sec-Fetch-Dest": "document",
  "header_Sec-Fetch-Mode": "navigate",
  "header_Sec-Fetch-Site": "none"
}
```
|
||||
|
||||
### Field Reference
|
||||
|
||||
| Field | Type | Description |
|
||||
|-------|------|-------------|
|
||||
| `time` | string (ISO 8601) | Request timestamp (UTC) |
|
||||
| `src_ip` | string | Client IP address |
|
||||
| `src_port` | int | Client source port |
|
||||
| `dst_ip` | string | Server IP address |
|
||||
| `dst_port` | int | Server port |
|
||||
| `method` | string | HTTP method (`GET`, `POST`, etc.) |
|
||||
| `scheme` | string | URL scheme (`http` or `https`) |
|
||||
| `host` | string | HTTP Host header value |
|
||||
| `path` | string | Request URI path |
|
||||
| `query` | string | Query string (without `?`) |
|
||||
| `http_version` | string | HTTP version (`HTTP/1.1`, `HTTP/2.0`) |
|
||||
| `client_headers` | string | Comma-separated list of header names sent by client (order preserved) |
|
||||
| `header_<Name>` | string | Value of each configured header (one field per header) |
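
Because the record is flat, a consumer has to separate the fixed fields from the dynamic `header_<Name>` fields itself. The sketch below shows one way to do that for a single received datagram; `split_record` is a hypothetical helper, not part of the module or the correlator.

```python
import json


def split_record(datagram: bytes) -> tuple[dict, dict, list[str]]:
    """Split one JSON datagram into core fields, logged headers, and header order."""
    record = json.loads(datagram)
    prefix = "header_"
    headers = {k[len(prefix):]: v for k, v in record.items() if k.startswith(prefix)}
    core = {k: v for k, v in record.items() if not k.startswith(prefix)}
    # client_headers is a comma-separated list that preserves the client's order.
    order = [name for name in core.pop("client_headers", "").split(",") if name]
    return core, headers, order
```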
|
||||
|
||||
### Sensitive Headers
|
||||
|
||||
The following headers are **always excluded** from output regardless of `JsonSockLogHeaders`:
|
||||
|
||||
- `Authorization`
|
||||
- `Cookie`
|
||||
- `Set-Cookie`
|
||||
- `X-Api-Key`
|
||||
- `X-Auth-Token`
|
||||
- `Proxy-Authorization`
|
||||
- `WWW-Authenticate`
|
||||
|
||||
### Size Limits
|
||||
|
||||
- Maximum JSON size: **64 KB** (prevents memory exhaustion DoS)
|
||||
- Header values are truncated to `JsonSockLogMaxHeaderValueLen` bytes
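
The header rules described here (configured list, sensitive-header exclusion, maximum count, value truncation) can be modeled in a few lines. This is a behavioral sketch in Python rather than the module's C code. Case-insensitive name matching, counting only emitted headers against the maximum, and truncating by characters instead of bytes are simplifying assumptions.

```python
SENSITIVE = frozenset(
    name.lower()
    for name in (
        "Authorization", "Cookie", "Set-Cookie", "X-Api-Key",
        "X-Auth-Token", "Proxy-Authorization", "WWW-Authenticate",
    )
)


def select_headers(
    request_headers: dict[str, str],
    configured: list[str],
    max_headers: int = 25,
    max_value_len: int = 256,
) -> dict[str, str]:
    """Return the header_<Name> fields that would be logged for one request."""
    lookup = {name.lower(): value for name, value in request_headers.items()}
    out: dict[str, str] = {}
    for name in configured:
        if len(out) >= max_headers:
            break
        key = name.lower()
        # Sensitive headers are never logged, even if configured.
        if key in SENSITIVE or key not in lookup:
            continue
        out[f"header_{name}"] = lookup[key][:max_value_len]
    return out
```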
|
||||
|
||||
## Thread Safety
|
||||
|
||||
mod-reqin-log is designed for Apache's `worker` and `event` MPMs (multi-threaded):
|
||||
|
||||
- **Socket FD** is protected by an `apr_thread_mutex_t` (`fd_mutex`)
|
||||
- **Per-child process state** includes the socket file descriptor, mutex, and error tracking
|
||||
- **Error reporting** uses `LOG_THROTTLED` macro with timestamp-based deduplication
|
||||
- All JSON serialization uses per-request pool allocation — no shared buffers
|
||||
|
||||
### Architecture
|
||||
|
||||
```
Apache HTTPD process
├── child process 1
│   ├── fd_mutex (apr_thread_mutex_t)
│   ├── socket_fd (shared across threads)
│   ├── thread 1 → post_read_request → serialize JSON → mutex lock → sendto() → unlock
│   ├── thread 2 → post_read_request → serialize JSON → mutex lock → sendto() → unlock
│   └── ...
├── child process 2
│   ├── fd_mutex
│   ├── socket_fd (independent)
│   └── ...
```
|
||||
|
||||
## Reconnection Behavior
|
||||
|
||||
- Socket is opened during `child_init` (per-child process startup)
|
||||
- If the socket is unavailable at startup, connection is deferred
|
||||
- On send failure, reconnection is attempted respecting `JsonSockLogReconnectInterval`
|
||||
- Failed sends are silently dropped (HTTP request processing is not blocked)
|
||||
- Error log entries are throttled by `JsonSockLogErrorReportInterval`
|
||||
- Socket type: `SOCK_DGRAM` (connectionless UNIX datagram)
|
||||
- Non-blocking sends with `MSG_NOSIGNAL`
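
Both `JsonSockLogReconnectInterval` and `JsonSockLogErrorReportInterval` describe the same pattern: allow an action at most once per interval. The sketch below models that throttle in Python with hypothetical names; the module itself implements it in C, and its exact bookkeeping is not shown on this page.

```python
from typing import Optional


class Throttle:
    """Allow an action at most once per interval (in seconds)."""

    def __init__(self, interval: float):
        self.interval = interval
        self.last: Optional[float] = None

    def allow(self, now: float) -> bool:
        """Return True and record the time if the interval has elapsed."""
        if self.last is not None and now - self.last < self.interval:
            return False
        self.last = now
        return True


# One throttle per concern, each using the documented 10-second default.
reconnect = Throttle(10)
error_report = Throttle(10)
```

On a failed send, the caller would drop the record, attempt a reconnect only if `reconnect.allow(now)` is true, and write an error log line only if `error_report.allow(now)` is true.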
|
||||
|
||||
## Deployment
|
||||
|
||||
### Installation via RPM
|
||||
|
||||
```bash
|
||||
rpm -ivh mod_reqin_log-1.0.19-1.el10.x86_64.rpm
|
||||
```
|
||||
|
||||
### LoadModule Directive
|
||||
|
||||
```apache
|
||||
LoadModule reqin_log_module modules/mod_reqin_log.so
|
||||
```
|
||||
|
||||
### Verifying Installation
|
||||
|
||||
```bash
|
||||
httpd -M | grep reqin_log
|
||||
# Expected: reqin_log_module (shared)
|
||||
```
|
||||
|
||||
## Build
|
||||
|
||||
All builds run inside Docker:
|
||||
|
||||
```bash
|
||||
# Run unit tests
|
||||
make test-mod-reqin-log
|
||||
|
||||
# Build RPM packages (el8, el9, el10)
|
||||
make rpm-mod-reqin-log
|
||||
# RPMs in services/mod-reqin-log/dist/rpm/el{8,9,10}/
|
||||
```
|
||||
|
||||
### Local Build (requires Apache development headers)
|
||||
|
||||
```bash
|
||||
cd services/mod-reqin-log
|
||||
make build # Compiles mod_reqin_log.so via apxs
|
||||
make test # Runs unit tests
|
||||
```
|
||||
|
||||
### Test Coverage
|
||||
|
||||
Unit tests cover:
|
||||
- JSON serialization (escaping, size limits, field output)
|
||||
- Config parsing (all directives, edge cases)
|
||||
- Header handling (sensitive header exclusion, max headers, truncation)
|
||||
- Module integration (real Apache module hooks)
|
||||
|
||||
## Source Files
|
||||
|
||||
| File | Description |
|
||||
|------|-------------|
|
||||
| `src/mod_reqin_log.c` | Main module source |
|
||||
| `src/mod_reqin_log.h` | Header with types, constants, defaults |
|
||||
| `conf/mod_reqin_log.conf` | Example Apache configuration |
|
||||
| `tests/unit/test_json_serialization.c` | JSON output tests |
|
||||
| `tests/unit/test_config_parsing.c` | Directive parsing tests |
|
||||
| `tests/unit/test_header_handling.c` | Header filtering tests |
|
||||
| `tests/unit/test_module_real.c` | Integration tests |
|
||||
247
docs/services/sentinel.md
Normal file
@ -0,0 +1,247 @@
|
||||
# Sentinel
|
||||
|
||||
Sentinel (`ja4sentinel`) is a Go daemon that performs live network packet capture on a Linux server, extracts TLS ClientHello handshakes, generates JA4 and JA3 fingerprints, enriches them with IP/TCP metadata, and outputs structured JSON log records to configurable destinations (UNIX socket, file, or stdout).
|
||||
|
||||
## Role in the Pipeline
|
||||
|
||||
Sentinel is the **network-layer ingestion point**. It sits on the target server, captures TLS traffic via libpcap, and feeds fingerprinted events to the [correlator](correlator.md) through a UNIX datagram socket.
|
||||
|
||||
```
Network traffic (port 443/8443)
        │ pcap
        ▼
┌───────────────┐
│   sentinel    │
│  ┌─────────┐  │
│  │ capture │──▶ Raw packets
│  └─────────┘  │
│  ┌─────────┐  │
│  │ tlsparse│──▶ TLS ClientHello extraction + TCP reassembly
│  └─────────┘  │
│  ┌─────────┐  │
│  │ finger- │──▶ JA4/JA3 fingerprint generation
│  │ print   │  │
│  └─────────┘  │
│  ┌─────────┐  │
│  │ output  │──▶ UNIX socket / file / stdout
│  └─────────┘  │
└───────────────┘
```
|
||||
|
||||
## Architecture
|
||||
|
||||
Sentinel uses a pipeline of goroutines:
|
||||
|
||||
1. **Capture goroutine** — Opens pcap handle on the configured interface, applies BPF filter, reads raw packets into a buffered channel (`packet_buffer_size`).
|
||||
2. **Packet processor goroutine** — Reads from the channel, feeds packets to the TLS parser, generates fingerprints, and writes output.
|
||||
3. **Watchdog goroutine** — Sends systemd watchdog heartbeats at half the configured interval.
|
||||
4. **Signal handler** — Listens for `SIGINT`/`SIGTERM` (graceful shutdown) and `SIGHUP` (log rotation).
|
||||
|
||||
### Key Interfaces
|
||||
|
||||
| Interface | Package | Description |
|
||||
|-----------|---------|-------------|
|
||||
| `Capture` | `internal/capture` | Packet capture via libpcap |
|
||||
| `Parser` | `internal/tlsparse` | TCP reassembly + ClientHello extraction |
|
||||
| `Engine` | `internal/fingerprint` | JA4/JA3 fingerprint generation |
|
||||
| `Writer` | `internal/output` | Log record output (stdout, file, UNIX socket) |
|
||||
| `MultiWriter` | `internal/output` | Fan-out to multiple writers |
|
||||
| `Builder` | `internal/output` | Factory for constructing writers from config |
|
||||
|
||||
## Configuration Reference
|
||||
|
||||
Configuration is loaded from a YAML file (default: `config.yml`) with environment variable overrides.
|
||||
|
||||
### Core Settings
|
||||
|
||||
| Name | Type | Default | Env Override | Description |
|------|------|---------|-------------|-------------|
| `core.interface` | string | `any` | `JA4SENTINEL_INTERFACE` | Network interface to capture (`any` = all interfaces) |
| `core.listen_ports` | []uint16 | `[443]` | `JA4SENTINEL_PORTS` | TCP ports to monitor (comma-separated in env) |
| `core.bpf_filter` | string | `""` (auto) | `JA4SENTINEL_BPF_FILTER` | Custom BPF filter (empty = auto-generated) |
| `core.local_ips` | []string | `[]` (auto) | — | Local IPs to monitor (empty = auto-detect, excludes loopback) |
| `core.exclude_source_ips` | []string | `[]` | — | Source IPs or CIDRs to exclude (e.g., `["10.0.0.0/8"]`) |
| `core.flow_timeout_sec` | int | `30` | `JA4SENTINEL_FLOW_TIMEOUT` | Timeout for TLS handshake extraction (1–300) |
| `core.packet_buffer_size` | int | `1000` | `JA4SENTINEL_PACKET_BUFFER_SIZE` | Packet channel buffer size (1–1,000,000) |
| `core.log_level` | string | `info` | — | Log level: `debug`, `info`, `warn`, `error` (YAML only) |
|
||||
|
||||
> **Note:** `log_level` is intentionally not overridable via environment variable (architecture decision since v1.1.12).
|
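
The `JA4SENTINEL_PORTS` override takes a comma-separated string and must yield a list of 16-bit ports. The sketch below shows that parsing step in Python for illustration; sentinel itself is written in Go, and tolerating surrounding whitespace, skipping empty items, and rejecting port 0 are assumptions of this example.

```python
def parse_ports(value: str) -> list[int]:
    """Parse a comma-separated port list such as '443,8443'."""
    ports: list[int] = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # assumption: ignore empty items like a trailing comma
        port = int(item)  # raises ValueError on non-numeric input
        if not 1 <= port <= 65535:
            raise ValueError(f"port out of range: {port}")
        ports.append(port)
    return ports
```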
||||
|
||||
### Output Settings
|
||||
|
||||
Each output is an entry in the `outputs` array:
|
||||
|
||||
| Name | Type | Default | Description |
|
||||
|------|------|---------|-------------|
|
||||
| `type` | string | — | Output type: `unix_socket`, `stdout`, `file` |
|
||||
| `enabled` | bool | — | Whether this output is active |
|
||||
| `async_buffer` | int | `1000` | Queue size for async writes |
|
||||
| `params.socket_path` | string | — | Path for `unix_socket` type |
|
||||
| `params.path` | string | — | File path for `file` type |
|
||||
|
||||
### Example Configuration
|
||||
|
||||
```yaml
core:
  interface: any
  listen_ports: [443, 8443]
  bpf_filter: ""
  local_ips: []
  exclude_source_ips: ["10.0.0.0/8", "192.168.1.1"]
  flow_timeout_sec: 30
  packet_buffer_size: 1000
  log_level: info

outputs:
  - type: unix_socket
    enabled: true
    params:
      socket_path: /var/run/logcorrelator/network.socket
  - type: file
    enabled: false
    params:
      path: /var/log/ja4sentinel/ja4.log
```
|
||||
|
||||
## Output Format (LogRecord JSON Schema)
|
||||
|
||||
Each output record is a flat JSON object:
|
||||
|
||||
```json
{
  "src_ip": "203.0.113.42",
  "src_port": 52341,
  "dst_ip": "192.168.1.10",
  "dst_port": 443,
  "ip_meta_ttl": 64,
  "ip_meta_total_length": 583,
  "ip_meta_id": 12345,
  "ip_meta_df": true,
  "tcp_meta_window_size": 65535,
  "tcp_meta_mss": 1460,
  "tcp_meta_window_scale": 8,
  "tcp_meta_options": "MSS,NOP,WScale,NOP,NOP,Timestamps,SACK",
  "conn_id": "203.0.113.42:52341-192.168.1.10:443",
  "sensor_id": "",
  "tls_version": "1.3",
  "tls_sni": "example.com",
  "tls_alpn": "h2",
  "syn_to_clienthello_ms": 12,
  "ja4": "t13d1516h2_8daaf6152771_b0da82dd1658",
  "ja3": "771,4866-4867-4865-49196-49200...",
  "ja3_hash": "e7d705a3286e19ea42f587b344ee6865",
  "timestamp": 1709312345678901234
}
```
|
||||
|
||||
### Field Reference

| Field | Type | Description |
|-------|------|-------------|
| `src_ip` | string | Client source IP address |
| `src_port` | uint16 | Client source port |
| `dst_ip` | string | Server destination IP address |
| `dst_port` | uint16 | Server destination port |
| `ip_meta_ttl` | uint8 | IP Time-To-Live |
| `ip_meta_total_length` | uint16 | IP total packet length |
| `ip_meta_id` | uint16 | IP identification field |
| `ip_meta_df` | bool | IP Don't Fragment flag |
| `tcp_meta_window_size` | uint16 | TCP window size |
| `tcp_meta_mss` | uint16 | TCP Maximum Segment Size (omitted if 0) |
| `tcp_meta_window_scale` | uint8 | TCP window scale factor (omitted if 0) |
| `tcp_meta_options` | string | Comma-separated TCP options |
| `conn_id` | string | Unique flow identifier |
| `sensor_id` | string | Sensor/captor identifier |
| `tls_version` | string | Max TLS version from ClientHello |
| `tls_sni` | string | Server Name Indication |
| `tls_alpn` | string | ALPN protocol (e.g., `h2`, `http/1.1`) |
| `syn_to_clienthello_ms` | uint32 | Time from SYN to ClientHello (ms) |
| `ja4` | string | JA4 TLS fingerprint |
| `ja3` | string | JA3 TLS fingerprint |
| `ja3_hash` | string | MD5 hash of JA3 string |
| `timestamp` | int64 | Unix nanoseconds |

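
A consumer of these records might decode them roughly as follows. This is a minimal sketch covering only a subset of the documented fields; the Go type name, field names, `omitempty` tags, and the helpers `parseRecord` and `capturedAt` are assumptions for illustration, not sentinel's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// LogRecord mirrors a subset of the documented JSON fields.
// Field names and tags here are illustrative assumptions.
type LogRecord struct {
	SrcIP              string `json:"src_ip"`
	SrcPort            uint16 `json:"src_port"`
	DstIP              string `json:"dst_ip"`
	DstPort            uint16 `json:"dst_port"`
	TCPMetaMSS         uint16 `json:"tcp_meta_mss,omitempty"`          // omitted if 0
	TCPMetaWindowScale uint8  `json:"tcp_meta_window_scale,omitempty"` // omitted if 0
	TLSSNI             string `json:"tls_sni"`
	JA4                string `json:"ja4"`
	Timestamp          int64  `json:"timestamp"` // Unix nanoseconds
}

// parseRecord decodes one JSON record; fields not listed above are ignored.
func parseRecord(data []byte) (LogRecord, error) {
	var rec LogRecord
	err := json.Unmarshal(data, &rec)
	return rec, err
}

// capturedAt converts the nanosecond timestamp into a UTC time.Time.
func (r LogRecord) capturedAt() time.Time {
	return time.Unix(0, r.Timestamp).UTC()
}

func main() {
	raw := []byte(`{"src_ip":"203.0.113.42","src_port":52341,"dst_ip":"192.168.1.10","dst_port":443,"tls_sni":"example.com","ja4":"t13d1516h2_8daaf6152771_b0da82dd1658","timestamp":1709312345678901234}`)
	rec, err := parseRecord(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(rec.SrcIP, rec.DstPort, rec.JA4, rec.capturedAt().Format(time.RFC3339Nano))
}
```

Because `tcp_meta_mss` and `tcp_meta_window_scale` may be absent, a decoder should treat a zero value as "not present" rather than as a measured value.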
## UNIX Socket Output Protocol

- **Socket type**: `unixgram` (DGRAM, connectionless)
- **Encoding**: One JSON object per datagram (no delimiter)
- **Max datagram size**: 64 KB
- **Reconnection**: Exponential backoff (100 ms → 2 s), max 3 attempts per write
- **Queue**: Async write queue (default 1000 items) absorbs transient socket failures
- **Error callback**: Consecutive failures are tracked and reported

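
The receiving side of this protocol can be sketched with Go's standard `net` package. The example below binds a `unixgram` socket in a temporary directory, sends one JSON datagram to it, and reads it back with a 64 KB buffer. The helper names `readOne` and `demoRoundTrip` are hypothetical, and this is not the correlator's actual listener.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// maxDatagram matches the documented 64 KB maximum datagram size.
const maxDatagram = 64 * 1024

// readOne blocks until one datagram arrives and decodes it as a JSON object.
// No delimiter handling is needed: one datagram carries exactly one record.
func readOne(conn *net.UnixConn) (map[string]any, error) {
	buf := make([]byte, maxDatagram)
	n, _, err := conn.ReadFromUnix(buf)
	if err != nil {
		return nil, err
	}
	var rec map[string]any
	if err := json.Unmarshal(buf[:n], &rec); err != nil {
		return nil, err
	}
	return rec, nil
}

// demoRoundTrip sends one record over a temporary unixgram socket and
// returns the src_ip that the receiver decoded.
func demoRoundTrip() (string, error) {
	dir, err := os.MkdirTemp("", "ja4demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	addr := &net.UnixAddr{Name: filepath.Join(dir, "network.socket"), Net: "unixgram"}
	srv, err := net.ListenUnixgram("unixgram", addr)
	if err != nil {
		return "", err
	}
	defer srv.Close()

	cli, err := net.DialUnix("unixgram", nil, addr)
	if err != nil {
		return "", err
	}
	defer cli.Close()

	if _, err := cli.Write([]byte(`{"src_ip":"203.0.113.42","dst_port":443}`)); err != nil {
		return "", err
	}
	rec, err := readOne(srv)
	if err != nil {
		return "", err
	}
	ip, _ := rec["src_ip"].(string)
	return ip, nil
}

func main() {
	ip, err := demoRoundTrip()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("received record from", ip)
}
```

Since datagram sockets preserve message boundaries, the reader never has to split or reassemble records, which is why the protocol needs no delimiter.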
## Signal Handling

| Signal | Behavior |
|--------|----------|
| `SIGTERM` / `SIGINT` | Graceful shutdown: cancel context, close capture, flush outputs, log filter stats |
| `SIGHUP` | Log rotation: reopen file outputs (used by `systemctl reload` + logrotate) |

## JA4 Fingerprint Algorithm

1. Extract TLS ClientHello from the TCP payload (with TCP reassembly for fragmented handshakes)
2. Parse cipher suites, extensions, ALPN, SNI, supported versions
3. Build JA4 string: `t{version}{sni_flag}{cipher_count}{ext_count}{alpn}_{cipher_hash}_{ext_hash}`
4. Build JA3 string: `{version},{ciphers},{extensions},{curves},{formats}`
5. Compute JA3 MD5 hash

Sentinel uses the `tlsfingerprint` library for ALPN and TLS version parsing, with custom sanitization for malformed/truncated ClientHellos.

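
Two of these steps are simple enough to sketch directly: the JA3 hash (an MD5 digest of the JA3 string) and the readable first section of a JA4 fingerprint, such as `t13d1516h2` in the sample record. The sketch below follows the publicly documented JA4 layout (two-digit counts capped at 99, `d`/`i` for SNI present/absent, first and last character of the first ALPN value, `00` when there is none). The function names are hypothetical, the hashed sections are left out, and this is not sentinel's implementation.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// ja3Hash returns the lowercase hex MD5 digest of a JA3 string.
func ja3Hash(ja3 string) string {
	sum := md5.Sum([]byte(ja3))
	return hex.EncodeToString(sum[:])
}

// cap99 limits a count to two decimal digits.
func cap99(n int) int {
	if n > 99 {
		return 99
	}
	return n
}

// ja4Prefix assembles the first JA4 section for a TCP ClientHello.
// version is the two-character TLS version code, e.g. "13" for TLS 1.3.
// Non-alphanumeric ALPN values are not handled in this sketch.
func ja4Prefix(version string, hasSNI bool, cipherCount, extCount int, alpn string) string {
	sni := "i" // no SNI present
	if hasSNI {
		sni = "d" // SNI (domain) present
	}
	tag := "00" // no ALPN offered
	if len(alpn) > 0 {
		tag = alpn[:1] + alpn[len(alpn)-1:]
	}
	return fmt.Sprintf("t%s%s%02d%02d%s", version, sni, cap99(cipherCount), cap99(extCount), tag)
}

func main() {
	fmt.Println(ja4Prefix("13", true, 15, 16, "h2"))
	fmt.Println(ja3Hash("771,4865-4866,0-11-10,29-23,0"))
}
```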
## Deployment

### systemd

```ini
[Unit]
Description=ja4sentinel TLS fingerprinting daemon
After=network.target

[Service]
Type=notify
ExecStart=/usr/bin/ja4sentinel -config /etc/ja4sentinel/config.yml
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
WatchdogSec=30
TimeoutStopSec=2

[Install]
WantedBy=multi-user.target
```

Sentinel uses systemd `sd_notify` for:
- `READY`: sent after initialization
- `WATCHDOG`: sent at half the `WatchdogSec` interval
- `STOPPING`: sent before shutdown

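
For readers unfamiliar with the notify protocol: a service sends short state strings (`READY=1`, `WATCHDOG=1`, `STOPPING=1`) as datagrams to the UNIX socket named in the `NOTIFY_SOCKET` environment variable, and systemd exports the watchdog timeout in microseconds as `WATCHDOG_USEC`. The following is a minimal sketch of that mechanism using only the standard library; `sdNotify` and `watchdogInterval` are hypothetical helpers, not sentinel's code.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"strconv"
	"time"
)

// sdNotify sends one state string (e.g. "READY=1") to the socket named by
// NOTIFY_SOCKET. It returns false, nil when the variable is unset, which is
// the normal case when the process is not started by systemd.
func sdNotify(state string) (bool, error) {
	path := os.Getenv("NOTIFY_SOCKET")
	if path == "" {
		return false, nil
	}
	conn, err := net.DialUnix("unixgram", nil, &net.UnixAddr{Name: path, Net: "unixgram"})
	if err != nil {
		return false, err
	}
	defer conn.Close()
	_, err = conn.Write([]byte(state))
	return err == nil, err
}

// watchdogInterval converts a WATCHDOG_USEC value into the ping period,
// which is half of the configured timeout.
func watchdogInterval(usec string) (time.Duration, bool) {
	n, err := strconv.ParseInt(usec, 10, 64)
	if err != nil || n <= 0 {
		return 0, false
	}
	return time.Duration(n) * time.Microsecond / 2, true
}

func main() {
	if ok, err := sdNotify("READY=1"); err != nil {
		fmt.Println("notify failed:", err)
	} else if !ok {
		fmt.Println("NOTIFY_SOCKET not set; skipping notifications")
	}
	if d, ok := watchdogInterval(os.Getenv("WATCHDOG_USEC")); ok {
		fmt.Println("would send WATCHDOG=1 every", d)
	}
}
```

With `WatchdogSec=30` as in the unit above, `WATCHDOG_USEC` is `30000000`, so the ping period works out to 15 seconds.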
### Docker

```bash
make build-sentinel
docker run --cap-add=NET_RAW --cap-add=NET_ADMIN \
  -v /var/run/logcorrelator:/var/run/logcorrelator \
  ja4-platform/sentinel:latest
```

## RPM Package Contents

| Path | Description |
|------|-------------|
| `/usr/bin/ja4sentinel` | Binary (Go, dynamically linked against libpcap) |
| `/etc/ja4sentinel/config.yml.default` | Default configuration (noreplace) |
| `/usr/share/ja4sentinel/config.yml` | Reference configuration |
| `/usr/lib/systemd/system/ja4sentinel.service` | systemd unit |
| `/etc/logrotate.d/ja4sentinel` | logrotate configuration |
| `/var/lib/ja4sentinel/` | State directory |
| `/var/log/ja4sentinel/` | Log directory |
| `/var/run/logcorrelator/` | Socket directory |

### RPM Dependencies

- `systemd`
- `libpcap >= 1.9.0`

### Supported Distributions

- Rocky Linux 8, 9, 10
- AlmaLinux 8, 9
- RHEL 8, 9

244
docs/shared/go-ja4common.md
Normal file
@ -0,0 +1,244 @@
# go-ja4common

`ja4common` is the shared Go library for the ja4-platform, providing unified logging, YAML configuration loading with environment variable overrides, graceful shutdown handling, and IP address filtering. It is used by both [sentinel](../services/sentinel.md) and [correlator](../services/correlator.md).

**Module path**: `github.com/antitbone/ja4/ja4common`

**Go version**: 1.21+

**Dependencies**: `gopkg.in/yaml.v3`

## Packages

### logger

Unified structured logging with two styles:
- **Prefix+Fields style** (correlator pattern): `Logger`
- **Component style** (sentinel pattern): `ComponentLogger`

#### Types

```go
type LogLevel int

const (
	DEBUG LogLevel = iota
	INFO
	WARN
	ERROR
)
```

#### Logger API

| Method | Signature | Description |
|--------|-----------|-------------|
| `New` | `New(prefix string) *Logger` | Create logger with INFO level |
| `NewWithLevel` | `NewWithLevel(prefix, level string) *Logger` | Create logger with specified level |
| `SetLevel` | `(l *Logger) SetLevel(level string)` | Change minimum log level at runtime |
| `ShouldLog` | `(l *Logger) ShouldLog(level LogLevel) bool` | Check if level would be logged |
| `WithFields` | `(l *Logger) WithFields(fields map[string]any) *Logger` | Return new logger with additional fields |
| `Info` | `(l *Logger) Info(msg string)` | Log info message |
| `Infof` | `(l *Logger) Infof(msg string, args ...any)` | Log formatted info |
| `Warn` | `(l *Logger) Warn(msg string)` | Log warning |
| `Warnf` | `(l *Logger) Warnf(msg string, args ...any)` | Log formatted warning |
| `Error` | `(l *Logger) Error(msg string, err error)` | Log error with optional error value |
| `Debug` | `(l *Logger) Debug(msg string)` | Log debug message |
| `Debugf` | `(l *Logger) Debugf(msg string, args ...any)` | Log formatted debug |
| `ParseLogLevel` | `ParseLogLevel(level string) LogLevel` | Parse string to LogLevel |

#### ComponentLogger API

Wraps `Logger` to satisfy sentinel's component-based logging interface:

| Method | Signature | Description |
|--------|-----------|-------------|
| `NewComponentLogger` | `NewComponentLogger(level string) *ComponentLogger` | Create component logger |
| `Log` | `(c *ComponentLogger) Log(component, level, message string, details map[string]string)` | Log with component context |
| `Debug` | `(c *ComponentLogger) Debug(component, message string, details map[string]string)` | Debug with component |
| `Info` | `(c *ComponentLogger) Info(component, message string, details map[string]string)` | Info with component |
| `Warn` | `(c *ComponentLogger) Warn(component, message string, details map[string]string)` | Warn with component |
| `Error` | `(c *ComponentLogger) Error(component, message string, details map[string]string)` | Error with component |

#### Usage Example

```go
import "github.com/antitbone/ja4/ja4common/logger"

// Prefix+Fields style
log := logger.NewWithLevel("myservice", "DEBUG")
log.Info("starting up")
log.WithFields(map[string]any{"port": 8080}).Info("listening")

// Component style (sentinel compatibility)
clog := logger.NewComponentLogger("info")
clog.Info("capture", "packets received", map[string]string{"count": "1000"})
```

---

### config

Generic YAML configuration loading with environment variable overrides using struct tags.

#### API

| Function | Signature | Description |
|----------|-----------|-------------|
| `LoadYAML` | `LoadYAML[T any](path string, optional bool) (T, error)` | Load and unmarshal YAML file |
| `OverrideFromEnv` | `OverrideFromEnv[T any](cfg *T, envPrefix string) error` | Apply env var overrides via `env` struct tags |

#### Supported Types for Environment Override

- `string`
- `int`, `int8`, `int16`, `int32`, `int64`
- `uint`, `uint8`, `uint16`, `uint32`, `uint64`
- `bool`
- `[]string` (comma-separated)

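
To make the tag-driven override concrete, here is one way such a function could be written with `reflect`. It is a sketch under stated assumptions, not the library's source: the name `overrideFrom`, the injected `lookup` function, the `PREFIX_TAG` key format, and the `demoConfig` type are all illustrative, and nested structs are not handled.

```go
package main

import (
	"fmt"
	"os"
	"reflect"
	"strconv"
	"strings"
)

// demoConfig is a hypothetical config struct using `env` tags.
type demoConfig struct {
	Host  string   `env:"HOST"`
	Port  int      `env:"PORT"`
	Debug bool     `env:"DEBUG"`
	Tags  []string `env:"TAGS"`
}

// overrideFrom walks the struct fields and, for each `env` tag, replaces the
// field value when lookup finds a variable named PREFIX_TAG (or TAG if the
// prefix is empty).
func overrideFrom(cfg any, prefix string, lookup func(string) (string, bool)) error {
	v := reflect.ValueOf(cfg)
	if v.Kind() != reflect.Ptr || v.Elem().Kind() != reflect.Struct {
		return fmt.Errorf("cfg must be a pointer to a struct")
	}
	v = v.Elem()
	t := v.Type()
	for i := 0; i < t.NumField(); i++ {
		tag := t.Field(i).Tag.Get("env")
		f := v.Field(i)
		if tag == "" || !f.CanSet() {
			continue
		}
		key := tag
		if prefix != "" {
			key = prefix + "_" + tag
		}
		raw, ok := lookup(key)
		if !ok {
			continue // variable not set: keep the YAML/default value
		}
		switch f.Kind() {
		case reflect.String:
			f.SetString(raw)
		case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
			n, err := strconv.ParseInt(raw, 10, f.Type().Bits())
			if err != nil {
				return fmt.Errorf("%s: %w", key, err)
			}
			f.SetInt(n)
		case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
			n, err := strconv.ParseUint(raw, 10, f.Type().Bits())
			if err != nil {
				return fmt.Errorf("%s: %w", key, err)
			}
			f.SetUint(n)
		case reflect.Bool:
			b, err := strconv.ParseBool(raw)
			if err != nil {
				return fmt.Errorf("%s: %w", key, err)
			}
			f.SetBool(b)
		case reflect.Slice:
			if f.Type().Elem().Kind() == reflect.String {
				f.Set(reflect.ValueOf(strings.Split(raw, ","))) // comma-separated
			}
		}
	}
	return nil
}

func main() {
	cfg := demoConfig{Host: "localhost", Port: 80}
	if err := overrideFrom(&cfg, "MYAPP", os.LookupEnv); err != nil {
		fmt.Println("override error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function as a parameter keeps the sketch testable without touching the process environment; in `main` it is simply `os.LookupEnv`.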
#### Usage Example

```go
import "github.com/antitbone/ja4/ja4common/config"

type MyConfig struct {
	Host  string   `yaml:"host" env:"HOST"`
	Port  int      `yaml:"port" env:"PORT"`
	Debug bool     `yaml:"debug" env:"DEBUG"`
	Tags  []string `yaml:"tags" env:"TAGS"`
}

// Load YAML (optional=true means a missing file returns the zero value)
cfg, err := config.LoadYAML[MyConfig]("config.yml", true)
if err != nil {
	return err
}

// Override from environment. With prefix "MYAPP" this reads
// MYAPP_HOST, MYAPP_PORT, MYAPP_DEBUG, MYAPP_TAGS;
// an empty prefix means the tags are used directly.
if err := config.OverrideFromEnv(&cfg, "MYAPP"); err != nil {
	return err
}
```

---

### shutdown

Graceful shutdown handler that blocks until `SIGTERM`/`SIGINT`, then runs cleanup hooks.

#### API

```go
type Hook struct {
	Name string
	Fn   func() error
}

func Handle(ctx context.Context, cancel context.CancelFunc, hooks []Hook, logger simpleLogger)
```

The `Handle` function:
1. Blocks until `SIGTERM`, `SIGINT`, or context cancellation
2. Calls `cancel()` to propagate shutdown
3. Runs all hooks in order, logging errors but not aborting

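
The three steps can be sketched as a small standalone function. This is an illustration of the described behavior, not the library's code: the lowercase `handle`, the `logf` parameter (standing in for the logger argument, whose interface is not shown here), and the `demo` helper are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"
)

// Hook pairs a name with a cleanup function, as in the documented API.
type Hook struct {
	Name string
	Fn   func() error
}

// handle blocks, cancels, then runs every hook in order.
func handle(ctx context.Context, cancel context.CancelFunc, hooks []Hook, logf func(string, ...any)) {
	sigCh := make(chan os.Signal, 1)
	signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGINT)
	defer signal.Stop(sigCh)

	// 1. Block until a signal arrives or the context is cancelled.
	select {
	case sig := <-sigCh:
		logf("received %v, shutting down", sig)
	case <-ctx.Done():
	}

	// 2. Propagate shutdown to everything sharing the context.
	cancel()

	// 3. Run all hooks; a failing hook is logged but does not stop the rest.
	for _, h := range hooks {
		if err := h.Fn(); err != nil {
			logf("hook %s failed: %v", h.Name, err)
		}
	}
}

// demo runs handle with an already-cancelled context so it returns at once,
// and reports the order in which the hooks ran.
func demo() []string {
	var order []string
	hooks := []Hook{
		{Name: "close-db", Fn: func() error {
			order = append(order, "close-db")
			return errors.New("simulated failure")
		}},
		{Name: "flush-logs", Fn: func() error {
			order = append(order, "flush-logs")
			return nil
		}},
	}
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	handle(ctx, cancel, hooks, log.Printf)
	return order
}

func main() {
	fmt.Println(demo())
}
```

In `demo`, the first hook fails on purpose to show that the second one still runs.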
#### Usage Example

```go
import "github.com/antitbone/ja4/ja4common/shutdown"

ctx, cancel := context.WithCancel(context.Background())

hooks := []shutdown.Hook{
	{Name: "close-db", Fn: func() error { return db.Close() }},
	{Name: "flush-logs", Fn: func() error { return logger.Flush() }},
}

// This blocks until signal received
shutdown.Handle(ctx, cancel, hooks, myLogger)
```

---

### ipfilter

IP address and CIDR range matching for source IP exclusion.

#### API

| Method | Signature | Description |
|--------|-----------|-------------|
| `New` | `New(excludeList []string) (*Filter, error)` | Create filter from IP/CIDR list |
| `ShouldExclude` | `(f *Filter) ShouldExclude(ipStr string) bool` | Check if IP should be excluded |
| `Count` | `(f *Filter) Count() (ips int, networks int)` | Return number of loaded entries |

Accepts: single IPs (`192.168.1.1`), CIDR ranges (`10.0.0.0/8`), IPv6 addresses and ranges.

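
A filter with this shape can be built on the standard `net/netip` package. The sketch below is one possible approach, assuming exact-match lookups for single IPs and a linear scan over CIDR prefixes; the lowercase type and method names are hypothetical and the real `ipfilter` package may be implemented differently.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// filter holds single addresses in a set and CIDR ranges in a slice.
type filter struct {
	ips  map[netip.Addr]struct{}
	nets []netip.Prefix
}

// newFilter parses each entry as a CIDR range (if it contains "/") or as a
// single IPv4/IPv6 address, and fails on the first invalid entry.
func newFilter(entries []string) (*filter, error) {
	f := &filter{ips: make(map[netip.Addr]struct{})}
	for _, e := range entries {
		e = strings.TrimSpace(e)
		if strings.Contains(e, "/") {
			p, err := netip.ParsePrefix(e)
			if err != nil {
				return nil, fmt.Errorf("invalid CIDR %q: %w", e, err)
			}
			f.nets = append(f.nets, p.Masked())
			continue
		}
		a, err := netip.ParseAddr(e)
		if err != nil {
			return nil, fmt.Errorf("invalid IP %q: %w", e, err)
		}
		f.ips[a.Unmap()] = struct{}{}
	}
	return f, nil
}

// shouldExclude reports whether ipStr matches a listed IP or falls inside a
// listed range. Unparseable input is never excluded.
func (f *filter) shouldExclude(ipStr string) bool {
	a, err := netip.ParseAddr(ipStr)
	if err != nil {
		return false
	}
	a = a.Unmap() // treat IPv4-mapped IPv6 addresses as IPv4
	if _, ok := f.ips[a]; ok {
		return true
	}
	for _, p := range f.nets {
		if p.Contains(a) {
			return true
		}
	}
	return false
}

// count returns the number of single IPs and CIDR ranges loaded.
func (f *filter) count() (ips, networks int) {
	return len(f.ips), len(f.nets)
}

func main() {
	f, err := newFilter([]string{"10.0.0.0/8", "192.168.1.1", "2001:db8::/32"})
	if err != nil {
		fmt.Println("filter error:", err)
		return
	}
	fmt.Println(f.shouldExclude("10.0.0.5"), f.shouldExclude("8.8.8.8"))
}
```

A linear scan is adequate for short exclusion lists; a deployment with thousands of ranges would likely want a prefix tree instead.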
#### Usage Example

```go
import "github.com/antitbone/ja4/ja4common/ipfilter"

filter, err := ipfilter.New([]string{
	"10.0.0.0/8",
	"192.168.1.1",
	"2001:db8::/32",
})
if err != nil {
	return err
}

if filter.ShouldExclude("10.0.0.5") {
	// Skip this IP
}

ips, nets := filter.Count() // 1 IP, 2 networks
```

## Using from a New Service

### 1. Add to go.mod

```bash
cd services/my-service
go mod init github.com/antitbone/ja4/my-service
```

Add the dependency:
```
require github.com/antitbone/ja4/ja4common v0.0.0
```

### 2. Add to go.work

In the repository root `go.work`:
```
use (
	./services/sentinel
	./services/correlator
	./services/my-service  // ← add
	./shared/go/ja4common
)
```

### 3. Import and Use

```go
package main

import (
	"context"

	"github.com/antitbone/ja4/ja4common/config"
	"github.com/antitbone/ja4/ja4common/logger"
	"github.com/antitbone/ja4/ja4common/shutdown"
)

// MyConfig is the service's own configuration struct (placeholder fields).
type MyConfig struct {
	Host string `yaml:"host" env:"HOST"`
}

func main() {
	log := logger.NewWithLevel("myservice", "INFO")

	cfg, err := config.LoadYAML[MyConfig]("config.yml", true)
	if err != nil {
		log.Error("loading config", err)
		return
	}
	if err := config.OverrideFromEnv(&cfg, "MYSERVICE"); err != nil {
		log.Error("applying env overrides", err)
		return
	}

	ctx, cancel := context.WithCancel(context.Background())
	shutdown.Handle(ctx, cancel, nil, log)
}
```

### 4. Sync Workspace

```bash
go work sync
```

216
docs/shared/python-ja4common.md
Normal file
@ -0,0 +1,216 @@
# python-ja4common

`ja4_common` is the shared Python library for the ja4-platform, providing a unified ClickHouse client singleton and configuration settings. It is used by [bot-detector](../services/bot-detector.md) and [dashboard](../services/dashboard.md).

**Package name**: `ja4-common`

**Python version**: ≥ 3.11

**Dependencies**:
- `clickhouse-connect >= 0.8.0`
- `pydantic-settings >= 2.1.0`

## ClickHouseSettings

Pydantic-settings model that reads configuration from environment variables and `.env` files.

### Fields

| Field | Type | Default | Env Variable | Description |
|-------|------|---------|-------------|-------------|
| `CLICKHOUSE_HOST` | str | `"clickhouse"` | `CLICKHOUSE_HOST` | ClickHouse server hostname |
| `CLICKHOUSE_PORT` | int | `8123` | `CLICKHOUSE_PORT` | ClickHouse HTTP API port |
| `CLICKHOUSE_DB` | str | `"mabase_prod"` | `CLICKHOUSE_DB` | Database name |
| `CLICKHOUSE_USER` | str | `"admin"` | `CLICKHOUSE_USER` | Username for authentication |
| `CLICKHOUSE_PASSWORD` | str | `""` | `CLICKHOUSE_PASSWORD` | Password for authentication |

### Configuration Sources

Settings are loaded in order of precedence:
1. **Environment variables** (highest priority)
2. **`.env` file** in the current working directory
3. **Default values** (lowest priority)

Environment variable names are **case-sensitive** (e.g., `CLICKHOUSE_HOST`, not `clickhouse_host`).

### Usage

```python
from ja4_common.settings import settings

print(settings.CLICKHOUSE_HOST)  # "clickhouse" or from env
print(settings.CLICKHOUSE_PORT)  # 8123 or from env
```

## ClickHouseClient

Wraps `clickhouse_connect` with auto-reconnection and a clean API.

### Methods

| Method | Signature | Description |
|--------|-----------|-------------|
| `connect` | `connect() -> Client` | Returns the underlying `clickhouse_connect` client, creating or reconnecting as needed |
| `query` | `query(query: str, params: dict = None)` | Execute a SELECT query, returns result set |
| `command` | `command(query: str, params: dict = None)` | Execute a DDL/DML command (CREATE, INSERT, etc.) |
| `insert` | `insert(table: str, data, column_names=None)` | Bulk insert data into a table |
| `close` | `close()` | Close the connection and release resources |

### Auto-Reconnection

The `connect()` method automatically reconnects if the current connection is lost:

```python
def connect(self):
    if self._client is None or not self._ping():
        self._client = clickhouse_connect.get_client(
            host=settings.CLICKHOUSE_HOST,
            port=settings.CLICKHOUSE_PORT,
            database=settings.CLICKHOUSE_DB,
            user=settings.CLICKHOUSE_USER,
            password=settings.CLICKHOUSE_PASSWORD,
            connect_timeout=10,
        )
    return self._client
```

### Usage Example

```python
from datetime import datetime

from ja4_common.clickhouse import get_client

client = get_client()

# SELECT query
result = client.query("SELECT count() FROM http_logs WHERE src_ip = {ip:String}", {"ip": "203.0.113.42"})
print(result.result_rows)

# INSERT
client.insert("audit_logs", [[datetime.now(), "analyst1", "investigate", "ip", "203.0.113.42"]],
              column_names=["timestamp", "user_name", "action", "entity_type", "entity_id"])

# Command
client.command("OPTIMIZE TABLE http_logs FINAL")
```

## get_client() Singleton

The `get_client()` function provides a module-level singleton `ClickHouseClient`:

```python
from ja4_common.clickhouse import get_client

# First call creates the client
client1 = get_client()

# Subsequent calls return the same instance
client2 = get_client()
assert client1 is client2
```

### Implementation

```python
from typing import Optional

_client: Optional[ClickHouseClient] = None

def get_client() -> ClickHouseClient:
    global _client
    if _client is None:
        _client = ClickHouseClient()
    return _client
```

## Using from a New Service

### 1. Add Dependency

In your service's `requirements.txt`:
```
ja4-common @ file:///app/shared/python/ja4_common
```

Or in `pyproject.toml`:
```toml
[project]
dependencies = [
    "ja4-common",
]
```

### 2. Docker Setup

```dockerfile
# Copy shared library
COPY shared/python/ja4_common /app/shared/python/ja4_common
RUN pip install /app/shared/python/ja4_common

# Copy service code
COPY services/my-service /app/services/my-service
```

### 3. Use in Code

```python
from ja4_common.clickhouse import get_client
from ja4_common.settings import settings

# Access settings
print(f"Connecting to {settings.CLICKHOUSE_HOST}:{settings.CLICKHOUSE_PORT}")

# Use client
db = get_client()
result = db.query("SELECT count() FROM ml_detected_anomalies")
```

### 4. Environment Configuration

Create a `.env` file or set environment variables:
```bash
CLICKHOUSE_HOST=clickhouse.example.com
CLICKHOUSE_PORT=8123
CLICKHOUSE_DB=mabase_prod
CLICKHOUSE_USER=data_writer
CLICKHOUSE_PASSWORD=secret
```

## Testing: Mocking the Client

### Using unittest.mock

```python
from unittest.mock import MagicMock, patch
from ja4_common.clickhouse import ClickHouseClient

def test_my_service():
    mock_client = MagicMock(spec=ClickHouseClient)
    mock_client.query.return_value = MagicMock(result_rows=[(42,)])

    with patch("ja4_common.clickhouse._client", mock_client):
        from ja4_common.clickhouse import get_client
        client = get_client()
        result = client.query("SELECT count() FROM http_logs")
        assert result.result_rows == [(42,)]
```

### Overriding Settings in Tests

```python
from ja4_common.settings import ClickHouseSettings

# Create custom settings for tests
test_settings = ClickHouseSettings(
    CLICKHOUSE_HOST="localhost",
    CLICKHOUSE_PORT=8123,
    CLICKHOUSE_DB="test_db",
    CLICKHOUSE_USER="test_user",
    CLICKHOUSE_PASSWORD="test_pass",
)
```

## Source Files

| File | Description |
|------|-------------|
| `ja4_common/settings.py` | `ClickHouseSettings` pydantic-settings model |
| `ja4_common/clickhouse.py` | `ClickHouseClient` class and `get_client()` singleton |
| `pyproject.toml` | Package metadata and dependencies |