ja4-platform/docs/services/sentinel.md
toto d469e39da7 feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized
Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-07 16:42:59 +02:00


Sentinel

Sentinel (ja4sentinel) is a Go daemon that performs live network packet capture on a Linux server, extracts TLS ClientHello handshakes, generates JA4 and JA3 fingerprints, enriches them with IP/TCP metadata, and outputs structured JSON log records to configurable destinations (UNIX socket, file, or stdout).

Role in the Pipeline

Sentinel is the network-layer ingestion point. It sits on the target server, captures TLS traffic via libpcap, and feeds fingerprinted events to the correlator through a UNIX datagram socket.

Network traffic (port 443/8443)
        │ pcap
        ▼
┌───────────────┐
│   sentinel    │
│  ┌─────────┐  │
│  │ capture  │──▶ Raw packets
│  └─────────┘  │
│  ┌─────────┐  │
│  │ tlsparse│──▶ TLS ClientHello extraction + TCP reassembly
│  └─────────┘  │
│  ┌─────────┐  │
│  │ finger- │──▶ JA4/JA3 fingerprint generation
│  │ print   │  │
│  └─────────┘  │
│  ┌─────────┐  │
│  │ output  │──▶ UNIX socket / file / stdout
│  └─────────┘  │
└───────────────┘

Architecture

Sentinel uses a pipeline of goroutines:

  1. Capture goroutine — Opens pcap handle on the configured interface, applies BPF filter, reads raw packets into a buffered channel (packet_buffer_size).
  2. Packet processor goroutine — Reads from the channel, feeds packets to the TLS parser, generates fingerprints, and writes output.
  3. Watchdog goroutine — Sends systemd watchdog heartbeats at half the configured interval.
  4. Signal handler — Listens for SIGINT/SIGTERM (graceful shutdown) and SIGHUP (log rotation).

Key Interfaces

| Interface | Package | Description |
|---|---|---|
| Capture | internal/capture | Packet capture via libpcap |
| Parser | internal/tlsparse | TCP reassembly + ClientHello extraction |
| Engine | internal/fingerprint | JA4/JA3 fingerprint generation |
| Writer | internal/output | Log record output (stdout, file, UNIX socket) |
| MultiWriter | internal/output | Fan-out to multiple writers |
| Builder | internal/output | Factory for constructing writers from config |

Configuration Reference

Configuration is loaded from a YAML file (default: config.yml) with environment variable overrides.

Core Settings

| Name | Type | Default | Env Override | Description |
|---|---|---|---|---|
| core.interface | string | any | JA4SENTINEL_INTERFACE | Network interface to capture (any = all interfaces) |
| core.listen_ports | []uint16 | [443] | JA4SENTINEL_PORTS | TCP ports to monitor (comma-separated in env) |
| core.bpf_filter | string | "" (auto) | JA4SENTINEL_BPF_FILTER | Custom BPF filter (empty = auto-generated) |
| core.local_ips | []string | [] (auto) | - | Local IPs to monitor (empty = auto-detect, excludes loopback) |
| core.exclude_source_ips | []string | [] | - | Source IPs or CIDRs to exclude (e.g., ["10.0.0.0/8"]) |
| core.flow_timeout_sec | int | 30 | JA4SENTINEL_FLOW_TIMEOUT | Timeout in seconds for TLS handshake extraction (range: 1 to 300) |
| core.packet_buffer_size | int | 1000 | JA4SENTINEL_PACKET_BUFFER_SIZE | Packet channel buffer size (range: 1 to 1,000,000) |
| core.log_level | string | info | - | Log level: debug, info, warn, error (YAML only) |

Note: log_level is intentionally not overridable via environment variable (architecture decision since v1.1.12).

Output Settings

Each output is an entry in the outputs array:

| Name | Type | Default | Description |
|---|---|---|---|
| type | string | - | Output type: unix_socket, stdout, file |
| enabled | bool | - | Whether this output is active |
| async_buffer | int | 1000 | Queue size for async writes |
| params.socket_path | string | - | Path for unix_socket type |
| params.path | string | - | File path for file type |

Example Configuration

core:
  interface: any
  listen_ports: [443, 8443]
  bpf_filter: ""
  local_ips: []
  exclude_source_ips: ["10.0.0.0/8", "192.168.1.1"]
  flow_timeout_sec: 30
  packet_buffer_size: 1000
  log_level: info

outputs:
  - type: unix_socket
    enabled: true
    params:
      socket_path: /var/run/logcorrelator/network.socket
  - type: file
    enabled: false
    params:
      path: /var/log/ja4sentinel/ja4.log

Output Format (LogRecord JSON Schema)

Each output record is a flat JSON object:

{
  "src_ip": "203.0.113.42",
  "src_port": 52341,
  "dst_ip": "192.168.1.10",
  "dst_port": 443,
  "ip_meta_ttl": 64,
  "ip_meta_total_length": 583,
  "ip_meta_id": 12345,
  "ip_meta_df": true,
  "tcp_meta_window_size": 65535,
  "tcp_meta_mss": 1460,
  "tcp_meta_window_scale": 8,
  "tcp_meta_options": "MSS,NOP,WScale,NOP,NOP,Timestamps,SACK",
  "conn_id": "203.0.113.42:52341-192.168.1.10:443",
  "sensor_id": "",
  "tls_version": "1.3",
  "tls_sni": "example.com",
  "tls_alpn": "h2",
  "syn_to_clienthello_ms": 12,
  "ja4": "t13d1516h2_8daaf6152771_b0da82dd1658",
  "ja3": "771,4866-4867-4865-49196-49200...",
  "ja3_hash": "e7d705a3286e19ea42f587b344ee6865",
  "timestamp": 1709312345678901234
}

Field Reference

| Field | Type | Description |
|---|---|---|
| src_ip | string | Client source IP address |
| src_port | uint16 | Client source port |
| dst_ip | string | Server destination IP address |
| dst_port | uint16 | Server destination port |
| ip_meta_ttl | uint8 | IP Time-To-Live |
| ip_meta_total_length | uint16 | IP total packet length |
| ip_meta_id | uint16 | IP identification field |
| ip_meta_df | bool | IP Don't Fragment flag |
| tcp_meta_window_size | uint16 | TCP window size |
| tcp_meta_mss | uint16 | TCP Maximum Segment Size (omitted if 0) |
| tcp_meta_window_scale | uint8 | TCP window scale factor (omitted if 0) |
| tcp_meta_options | string | Comma-separated TCP options |
| conn_id | string | Unique flow identifier |
| sensor_id | string | Sensor identifier |
| tls_version | string | Max TLS version from ClientHello |
| tls_sni | string | Server Name Indication |
| tls_alpn | string | ALPN protocol (e.g., h2, http/1.1) |
| syn_to_clienthello_ms | uint32 | Time from SYN to ClientHello (ms) |
| ja4 | string | JA4 TLS fingerprint |
| ja3 | string | JA3 TLS fingerprint |
| ja3_hash | string | MD5 hash of JA3 string |
| timestamp | int64 | Unix nanoseconds |

UNIX Socket Output Protocol

  • Socket type: unixgram (DGRAM — connectionless)
  • Encoding: One JSON object per datagram (no delimiter)
  • Max datagram size: 64 KB
  • Reconnection: Exponential backoff (100 ms → 2 s), max 3 attempts per write
  • Queue: Async write queue (default 1000 items) absorbs transient socket failures
  • Error callback: Consecutive failures are tracked and reported

Signal Handling

| Signal | Behavior |
|---|---|
| SIGTERM / SIGINT | Graceful shutdown: cancel context, close capture, flush outputs, log filter stats |
| SIGHUP | Log rotation: reopen file outputs (used by systemctl reload + logrotate) |

JA4 Fingerprint Algorithm

  1. Extract TLS ClientHello from the TCP payload (with TCP reassembly for fragmented handshakes)
  2. Parse cipher suites, extensions, ALPN, SNI, supported versions
  3. Build JA4 string: t{version}{sni_flag}{cipher_count}{ext_count}{alpn}_{cipher_hash}_{ext_hash} (e.g., t13d1516h2_8daaf6152771_b0da82dd1658, where h2 is the ALPN marker)
  4. Build JA3 string: {version},{ciphers},{extensions},{curves},{formats}
  5. Compute JA3 MD5 hash

Sentinel uses the tlsfingerprint library for ALPN and TLS version parsing, with custom sanitization for malformed/truncated ClientHellos.
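To make the string layout concrete, the sketch below builds only the first JA4 section (the part before the first underscore, such as t13d1516h2 in the example record) and the JA3 MD5 hash. It is a simplified illustration: the `hello` type and function names are hypothetical, the inputs are assumed to be already parsed with GREASE values excluded, and the truncated SHA-256 hashes of the second and third JA4 sections are left out.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// hello carries the pre-parsed ClientHello facts needed here (hypothetical type).
type hello struct {
	Version     string // two-character code, e.g. "13" for TLS 1.3
	HasSNI      bool
	CipherCount int    // cipher suites, GREASE excluded
	ExtCount    int    // extensions, GREASE excluded
	ALPN        string // first ALPN value, empty if absent
}

// cap99 limits a count to the two digits available in the fingerprint.
func cap99(n int) int {
	if n > 99 {
		return 99
	}
	return n
}

// ja4Prefix builds the first JA4 section: protocol, version, SNI flag,
// two-digit cipher and extension counts, and the first and last ALPN characters.
func ja4Prefix(h hello) string {
	sni := "i" // no SNI present
	if h.HasSNI {
		sni = "d" // SNI carries a domain
	}
	alpn := "00"
	if len(h.ALPN) > 0 {
		alpn = string([]byte{h.ALPN[0], h.ALPN[len(h.ALPN)-1]})
	}
	return fmt.Sprintf("t%s%s%02d%02d%s", h.Version, sni, cap99(h.CipherCount), cap99(h.ExtCount), alpn)
}

// ja3Hash returns the lowercase hex MD5 of a JA3 string.
func ja3Hash(ja3 string) string {
	sum := md5.Sum([]byte(ja3))
	return hex.EncodeToString(sum[:])
}

func main() {
	h := hello{Version: "13", HasSNI: true, CipherCount: 15, ExtCount: 16, ALPN: "h2"}
	fmt.Println(ja4Prefix(h))
	fmt.Println(len(ja3Hash("771,4865-4866,0-10,29,0")))
}
```

The leading t denotes TLS over TCP. The JA3 input string in `main` is made up for the demo and does not correspond to any real client.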

Deployment

systemd

[Unit]
Description=ja4sentinel TLS fingerprinting daemon
After=network.target

[Service]
Type=notify
ExecStart=/usr/bin/ja4sentinel -config /etc/ja4sentinel/config.yml
ExecReload=/bin/kill -HUP $MAINPID
Restart=on-failure
WatchdogSec=30
TimeoutStopSec=2

[Install]
WantedBy=multi-user.target

Sentinel uses systemd sd_notify for:

  • READY — sent after initialization
  • WATCHDOG — sent at half the WatchdogSec interval
  • STOPPING — sent before shutdown

Docker

make build-sentinel
docker run --cap-add=NET_RAW --cap-add=NET_ADMIN \
  -v /var/run/logcorrelator:/var/run/logcorrelator \
  ja4-platform/sentinel:latest

RPM Package Contents

| Path | Description |
|---|---|
| /usr/bin/ja4sentinel | Go binary (libpcap is linked dynamically; see RPM Dependencies) |
| /etc/ja4sentinel/config.yml.default | Default configuration (noreplace) |
| /usr/share/ja4sentinel/config.yml | Reference configuration |
| /usr/lib/systemd/system/ja4sentinel.service | systemd unit |
| /etc/logrotate.d/ja4sentinel | logrotate configuration |
| /var/lib/ja4sentinel/ | State directory |
| /var/log/ja4sentinel/ | Log directory |
| /var/run/logcorrelator/ | Socket directory |

RPM Dependencies

  • systemd
  • libpcap >= 1.9.0

Supported Distributions

  • Rocky Linux 8, 9, 10
  • AlmaLinux 8, 9
  • RHEL 8, 9