ja4-platform/services/correlator/config.example.yml
toto d469e39da7 feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized
Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-07 16:42:59 +02:00

93 lines
2.7 KiB
YAML

# logcorrelator configuration file
# Format: YAML

# Logging configuration
log:
  level: INFO  # DEBUG, INFO, WARN, ERROR

inputs:
  unix_sockets:
    - name: http
      source_type: A
      path: /var/run/logcorrelator/http.socket
      format: json
      socket_permissions: "0666"  # world read/write
    - name: network
      source_type: B
      path: /var/run/logcorrelator/network.socket
      format: json
      socket_permissions: "0666"

outputs:
  file:
    enabled: true
    path: /var/log/logcorrelator/correlated.log
  clickhouse:
    enabled: false
    dsn: clickhouse://user:pass@localhost:9000/db
    table: correlated_logs_http_network
    batch_size: 500
    flush_interval_ms: 200
    max_buffer_size: 5000
    drop_on_overflow: true
    async_insert: true
    timeout_ms: 1000
  stdout:
    enabled: false

correlation:
  # Time window for correlation (A and B must be within this window)
  # Increased to 10s to support HTTP Keep-Alive scenarios
  time_window:
    value: 10
    unit: s
  # Orphan policy: what to do when no match is found
  orphan_policy:
    apache_always_emit: true   # Always emit A events, even without B match
    apache_emit_delay_ms: 500  # Wait 500ms before emitting as orphan (allows B to arrive)
    network_emit: false        # Never emit B events alone
  # Matching mode: one_to_one or one_to_many (Keep-Alive)
  matching:
    mode: one_to_many
  # Buffer limits (max events in memory)
  buffers:
    max_http_items: 10000
    max_network_items: 20000
  # TTL for network events (source B)
  # Increased to 120s to support long-lived HTTP Keep-Alive sessions
  ttl:
    network_ttl_s: 120
  # Exclude specific source IPs or CIDR ranges from correlation
  # Events from these IPs will be silently dropped (not correlated, not emitted)
  # Useful for excluding health checks, internal traffic, or known bad actors
  exclude_source_ips:
    - 10.0.0.1       # Single IP
    - 192.168.1.100  # Another single IP
    - 172.16.0.0/12  # CIDR range (private network)
    - 10.10.10.0/24  # Another CIDR range
  # Restrict correlation to specific destination ports (optional)
  # If non-empty, only events whose dst_port matches one of these values will be correlated
  # Events on other ports are silently ignored (not correlated, not emitted as orphans)
  # Useful to focus on HTTP/HTTPS traffic only and ignore unrelated connections
  # include_dest_ports:
  #   - 80    # HTTP
  #   - 443   # HTTPS
  #   - 8080  # HTTP alt
  #   - 8443  # HTTPS alt

# Metrics server configuration (optional, for debugging/monitoring)
metrics:
  enabled: false
  addr: ":8080"  # Address to listen on (e.g., ":8080", "localhost:8080")
  # Endpoints:
  #   GET /metrics - Returns correlation metrics as JSON
  #   GET /health  - Health check endpoint
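
The `exclude_source_ips` list above mixes single addresses and CIDR ranges. The sketch below shows one way such a list could be matched in Go with the standard `net/netip` package. It is an illustration only, not the correlator's actual code: the function names `parseExcludeList` and `isExcluded` are hypothetical, and treating an unparsable source address as "not excluded" is an assumption.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// parseExcludeList (hypothetical helper) turns entries such as "10.0.0.1"
// or "172.16.0.0/12" into prefixes. A bare IP becomes a single-address
// prefix (/32 for IPv4, /128 for IPv6).
func parseExcludeList(entries []string) ([]netip.Prefix, error) {
	prefixes := make([]netip.Prefix, 0, len(entries))
	for _, e := range entries {
		e = strings.TrimSpace(e)
		if strings.Contains(e, "/") {
			p, err := netip.ParsePrefix(e)
			if err != nil {
				return nil, fmt.Errorf("invalid CIDR %q: %w", e, err)
			}
			// Masked() zeroes any host bits so the prefix is canonical.
			prefixes = append(prefixes, p.Masked())
			continue
		}
		addr, err := netip.ParseAddr(e)
		if err != nil {
			return nil, fmt.Errorf("invalid IP %q: %w", e, err)
		}
		prefixes = append(prefixes, netip.PrefixFrom(addr, addr.BitLen()))
	}
	return prefixes, nil
}

// isExcluded (hypothetical helper) reports whether src falls inside any
// configured prefix. Assumption: an unparsable source is not excluded.
func isExcluded(prefixes []netip.Prefix, src string) bool {
	addr, err := netip.ParseAddr(src)
	if err != nil {
		return false
	}
	// Unmap so that IPv4-mapped IPv6 addresses match IPv4 prefixes.
	addr = addr.Unmap()
	for _, p := range prefixes {
		if p.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	// Same entries as the example configuration.
	prefixes, err := parseExcludeList([]string{
		"10.0.0.1", "192.168.1.100", "172.16.0.0/12", "10.10.10.0/24",
	})
	if err != nil {
		panic(err)
	}
	for _, src := range []string{"10.0.0.1", "172.20.1.1", "8.8.8.8"} {
		fmt.Printf("%s excluded: %v\n", src, isExcluded(prefixes, src))
	}
}
```

Parsing the list once at startup and keeping the resulting prefixes in memory avoids re-parsing strings for every event; for very long lists a prefix trie would likely scale better than the linear scan used here.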