mod-reqin-log
mod_reqin_log is an Apache HTTPD module (C shared object) that captures HTTP request metadata and sends it as JSON to a UNIX datagram socket. It serves as the HTTP-layer ingestion point for the ja4-platform pipeline, feeding request data to the correlator for joining with TLS fingerprint data from sentinel.
Purpose
Apache processes HTTP requests after TLS termination, so it has access to the decoded HTTP method, path, headers, and client IP/port. mod-reqin-log hooks into the post_read_request phase to serialize this data immediately, before any rewrite or auth module modifies the request.
Apache Directives Reference
All directives are server-level (RSRC_CONF):
| Directive | Type | Default | Description |
|---|---|---|---|
| `JsonSockLogEnabled` | Flag (On/Off) | Off | Enable or disable the module |
| `JsonSockLogSocket` | String | (none) | UNIX domain socket path for JSON output |
| `JsonSockLogHeaders` | String list | (none) | HTTP header names to log (repeatable) |
| `JsonSockLogMaxHeaders` | Integer | 25 | Maximum number of headers to log |
| `JsonSockLogMaxHeaderValueLen` | Integer | 256 | Maximum length of each header value (truncated beyond) |
| `JsonSockLogReconnectInterval` | Integer (seconds) | 10 | Minimum seconds between reconnection attempts |
| `JsonSockLogErrorReportInterval` | Integer (seconds) | 10 | Minimum seconds between error log entries (throttling) |
| `JsonSockLogLevel` | String | WARNING | Module log level: DEBUG, INFO, WARNING, ERROR, EMERG |
Example httpd.conf
LoadModule reqin_log_module modules/mod_reqin_log.so
JsonSockLogEnabled On
JsonSockLogSocket /var/run/logcorrelator/http.socket
JsonSockLogHeaders User-Agent Accept Accept-Encoding Accept-Language
JsonSockLogHeaders Content-Type X-Request-Id X-Trace-Id X-Forwarded-For
JsonSockLogHeaders Sec-CH-UA Sec-CH-UA-Mobile Sec-CH-UA-Platform
JsonSockLogHeaders Sec-Fetch-Dest Sec-Fetch-Mode Sec-Fetch-Site
JsonSockLogMaxHeaders 25
JsonSockLogMaxHeaderValueLen 256
JsonSockLogReconnectInterval 10
JsonSockLogErrorReportInterval 10
JsonSockLogLevel WARNING
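As an illustration of how a string directive such as `JsonSockLogLevel` might be validated, here is a minimal sketch in plain C. The `log_level` enum and `parse_log_level` function are hypothetical names, not taken from the module source, and the case-insensitive matching is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical enum mirroring the five documented level names. */
enum log_level { LVL_DEBUG, LVL_INFO, LVL_WARNING, LVL_ERROR, LVL_EMERG };

/* Hypothetical parser: map a directive argument to a level.
 * Returns 0 on success, -1 for an unknown name (*out is left untouched).
 * Case-insensitive matching is an assumption of this sketch. */
int parse_log_level(const char *arg, enum log_level *out)
{
    static const char *const names[] = {
        "DEBUG", "INFO", "WARNING", "ERROR", "EMERG",
    };
    size_t i;

    assert(arg != NULL && out != NULL);
    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        if (strcasecmp(arg, names[i]) == 0) {
            *out = (enum log_level)i;   /* array order matches the enum */
            return 0;
        }
    }
    return -1;
}
```

Rejecting unknown names at configuration time, rather than silently falling back to the default, makes typos in `httpd.conf` visible at startup.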
Output JSON Schema
Each HTTP request is serialized as a flat JSON object and sent as a single UNIX datagram:
{
"time": "2026-03-09T14:30:00Z",
"src_ip": "203.0.113.42",
"src_port": 52341,
"dst_ip": "192.168.1.10",
"dst_port": 443,
"method": "GET",
"scheme": "https",
"host": "example.com",
"path": "/api/v1/users",
"query": "page=1&limit=20",
"http_version": "HTTP/2.0",
"client_headers": "User-Agent,Accept,Accept-Encoding,Accept-Language",
"header_User-Agent": "Mozilla/5.0 ...",
"header_Accept": "text/html,application/xhtml+xml",
"header_Accept-Encoding": "gzip, deflate, br",
"header_Accept-Language": "en-US,en;q=0.9",
"header_Sec-Fetch-Dest": "document",
"header_Sec-Fetch-Mode": "navigate",
"header_Sec-Fetch-Site": "none"
}
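To inspect these datagrams during development, a small receiver can bind the socket path and read one JSON object per `recv()` call. The following is a sketch only: `open_receiver` and `read_event` are hypothetical helper names, and the real consumer in this platform is the correlator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: bind a UNIX datagram socket at `path`.
 * Returns the descriptor, or -1 on failure. */
int open_receiver(const char *path)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;                      /* path too long for sockaddr_un */

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);        /* length checked above */
    unlink(path);                       /* remove a stale socket file */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: read one datagram (one JSON object) into `buf`
 * and NUL-terminate it. Returns the byte count, or -1 on error. */
ssize_t read_event(int fd, char *buf, size_t cap)
{
    ssize_t n;

    assert(buf != NULL && cap > 0);
    n = recv(fd, buf, cap - 1, 0);      /* one call returns one datagram */
    if (n >= 0)
        buf[n] = '\0';
    return n;
}
```

Because the socket is `SOCK_DGRAM`, message boundaries are preserved: the receiver never needs to split or reassemble JSON objects, but a buffer smaller than the datagram truncates it.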
Field Reference
| Field | Type | Description |
|---|---|---|
| `time` | string (ISO 8601) | Request timestamp (UTC) |
| `src_ip` | string | Client IP address |
| `src_port` | int | Client source port |
| `dst_ip` | string | Server IP address |
| `dst_port` | int | Server port |
| `method` | string | HTTP method (GET, POST, etc.) |
| `scheme` | string | URL scheme (http or https) |
| `host` | string | HTTP Host header value |
| `path` | string | Request URI path |
| `query` | string | Query string (without ?) |
| `http_version` | string | HTTP version (HTTP/1.1, HTTP/2.0) |
| `client_headers` | string | Comma-separated list of header names sent by client (order preserved) |
| `header_<Name>` | string | Value of each configured header (one field per header) |
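Every string field above carries client-controlled data, so values must be escaped before they are embedded in the JSON object. The sketch below shows one way to do that in plain C. `json_escape` is a hypothetical name and this is not the module's actual serializer; bytes at or above 0x80 are passed through unchanged on the assumption that the input is UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write `src` into `dst` as the body of a JSON string
 * (without the surrounding quotes). Returns the number of bytes written,
 * or -1 if `dst` (capacity `cap`, including the NUL) is too small. */
int json_escape(const char *src, char *dst, size_t cap)
{
    size_t o = 0;

    assert(src != NULL && dst != NULL);
    if (cap == 0)
        return -1;

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;
        char tmp[7];                    /* longest escape: \uXXXX + NUL */
        int n;

        switch (c) {
        case '"':  n = snprintf(tmp, sizeof(tmp), "\\\""); break;
        case '\\': n = snprintf(tmp, sizeof(tmp), "\\\\"); break;
        case '\n': n = snprintf(tmp, sizeof(tmp), "\\n");  break;
        case '\r': n = snprintf(tmp, sizeof(tmp), "\\r");  break;
        case '\t': n = snprintf(tmp, sizeof(tmp), "\\t");  break;
        default:
            if (c < 0x20)               /* other control characters */
                n = snprintf(tmp, sizeof(tmp), "\\u%04x", (unsigned)c);
            else                        /* printable ASCII and UTF-8 bytes */
                n = snprintf(tmp, sizeof(tmp), "%c", c);
        }
        if (o + (size_t)n + 1 > cap)    /* keep room for the final NUL */
            return -1;
        memcpy(dst + o, tmp, (size_t)n);
        o += (size_t)n;
    }
    dst[o] = '\0';
    return (int)o;
}
```

Returning -1 instead of writing a partial value lets the caller decide whether to drop the field or the whole record.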
Sensitive Headers
The following headers are always excluded from output regardless of JsonSockLogHeaders:
- `Authorization`
- `Cookie`
- `Set-Cookie`
- `X-Api-Key`
- `X-Auth-Token`
- `Proxy-Authorization`
- `WWW-Authenticate`
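A deny-list check of this kind can be sketched in a few lines of C. `is_sensitive_header` is a hypothetical name, and the case-insensitive comparison is an assumption based on HTTP header names being case-insensitive rather than something confirmed from the module source.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Deny-list copied from the documentation above. */
static const char *const SENSITIVE_HEADERS[] = {
    "Authorization", "Cookie", "Set-Cookie", "X-Api-Key",
    "X-Auth-Token", "Proxy-Authorization", "WWW-Authenticate",
};

/* Hypothetical helper: returns 1 if `name` must never be logged, else 0.
 * The comparison ignores case (assumption of this sketch). */
int is_sensitive_header(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof(SENSITIVE_HEADERS) / sizeof(SENSITIVE_HEADERS[0]); i++) {
        if (strcasecmp(name, SENSITIVE_HEADERS[i]) == 0)
            return 1;
    }
    return 0;
}
```

Applying the check when a header is about to be serialized, rather than only when `JsonSockLogHeaders` is parsed, keeps the exclusion effective even if the configured list changes.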
Size Limits
- Maximum JSON size: 64 KB (prevents memory exhaustion DoS)
- Header values are truncated to `JsonSockLogMaxHeaderValueLen` bytes
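Both limits can be expressed as small, independent checks. The sketch below uses hypothetical names (`truncate_value`, `fits_in_json`, `MAX_JSON_SIZE`) and assumes that 64 KB means 65536 bytes. It truncates by bytes, so a multi-byte UTF-8 sequence may be cut in the middle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_JSON_SIZE (64 * 1024)   /* assumed: 64 KB = 65536 bytes */

/* Hypothetical helper: copy at most `max_len` bytes of `value` into `dst`
 * (capacity `cap`), always NUL-terminating. Returns the bytes copied. */
size_t truncate_value(const char *value, size_t max_len, char *dst, size_t cap)
{
    size_t n;

    assert(value != NULL && dst != NULL && cap > 0);
    n = strlen(value);
    if (n > max_len)
        n = max_len;                /* directive limit */
    if (n > cap - 1)
        n = cap - 1;                /* destination buffer limit */
    memcpy(dst, value, n);
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: returns 1 if appending `extra` bytes to a record of
 * `current_len` bytes stays within MAX_JSON_SIZE, else 0. Written to avoid
 * overflow in the addition. */
int fits_in_json(size_t current_len, size_t extra)
{
    return extra <= MAX_JSON_SIZE && current_len <= MAX_JSON_SIZE - extra;
}
```

For example, with `JsonSockLogMaxHeaderValueLen 256`, a 4 KB `User-Agent` value contributes at most 256 bytes (before escaping) to the record.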
Thread Safety
mod-reqin-log is designed for Apache's worker and event MPMs (multi-threaded):
- Socket FD is protected by an `apr_thread_mutex_t` (`fd_mutex`)
- Per-child process state includes the socket file descriptor, mutex, and error tracking
- Error reporting uses the `LOG_THROTTLED` macro with timestamp-based deduplication
- All JSON serialization uses per-request pool allocation, with no shared buffers
Architecture
Apache HTTPD process
├── child process 1
│ ├── fd_mutex (apr_thread_mutex_t)
│ ├── socket_fd (shared across threads)
│ ├── thread 1 → post_read_request → serialize JSON → mutex lock → sendto() → unlock
│ ├── thread 2 → post_read_request → serialize JSON → mutex lock → sendto() → unlock
│ └── ...
├── child process 2
│ ├── fd_mutex
│ ├── socket_fd (independent)
│ └── ...
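The per-thread flow in the diagram (serialize, lock, `sendto()`, unlock) can be sketched outside Apache with a plain pthread mutex standing in for `apr_thread_mutex_t`. The `child_state` struct and `guarded_send` function are hypothetical names. The sketch assumes the socket has already been connected to its peer, and `MSG_DONTWAIT` is an assumption about how non-blocking behaviour is obtained.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical per-child state. The module uses an apr_thread_mutex_t;
 * this sketch substitutes a pthread mutex. */
struct child_state {
    pthread_mutex_t fd_mutex;
    int socket_fd;                  /* -1 while disconnected */
};

/* Send one already-serialized JSON datagram. Serialization happens before
 * the lock is taken, so the critical section covers only the send.
 * Returns the bytes sent, or -1 if disconnected or the send failed. */
ssize_t guarded_send(struct child_state *st, const char *json, size_t len)
{
    ssize_t n = -1;

    assert(st != NULL && json != NULL);
    pthread_mutex_lock(&st->fd_mutex);
    if (st->socket_fd >= 0) {
        /* NULL address: the socket is assumed to be connected already. */
        n = sendto(st->socket_fd, json, len,
                   MSG_NOSIGNAL | MSG_DONTWAIT, NULL, 0);
    }
    pthread_mutex_unlock(&st->fd_mutex);
    return n;
}
```

Keeping serialization outside the lock means threads contend only for the duration of a single system call, which matters under the worker and event MPMs.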
Reconnection Behavior
- Socket is opened during `child_init` (per-child process startup)
- If the socket is unavailable at startup, connection is deferred
- On send failure, reconnection is attempted respecting `JsonSockLogReconnectInterval`
- Failed sends are silently dropped (HTTP request processing is not blocked)
- Error log entries are throttled by `JsonSockLogErrorReportInterval`
- Socket type: `SOCK_DGRAM` (connectionless UNIX datagram)
- Non-blocking sends with `MSG_NOSIGNAL`
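The two interval directives describe the same pattern: allow an action at most once per N seconds. A minimal sketch of that pattern follows. The `throttle` struct and `throttle_allow` function are hypothetical names, and the module's own `LOG_THROTTLED` macro may be implemented differently.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical throttle state: one instance per throttled action. */
struct throttle {
    time_t last;        /* time of the last allowed attempt, 0 = never */
    int interval;       /* seconds, e.g. JsonSockLogReconnectInterval */
};

/* Returns 1 and records `now` if an attempt is allowed, else 0.
 * `now` is passed in so the logic can be exercised without a real clock. */
int throttle_allow(struct throttle *t, time_t now)
{
    assert(t != NULL);
    if (t->last != 0 && now - t->last < t->interval)
        return 0;       /* still inside the quiet period */
    t->last = now;
    return 1;
}
```

A send path built on this would call `throttle_allow(&reconnect, time(NULL))` after a failed send and attempt to reconnect only when it returns 1, dropping the record otherwise. In a multi-threaded child, this state would need the same protection as the socket descriptor.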
Deployment
Installation via RPM
rpm -ivh mod_reqin_log-1.0.19-1.el10.x86_64.rpm
LoadModule Directive
LoadModule reqin_log_module modules/mod_reqin_log.so
Verifying Installation
httpd -M | grep reqin_log
# Expected: reqin_log_module (shared)
Build
All builds run inside Docker:
# Run unit tests
make test-mod-reqin-log
# Build RPM packages (el8, el9, el10)
make rpm-mod-reqin-log
# RPMs in services/mod-reqin-log/dist/rpm/el{8,9,10}/
Local Build (requires Apache development headers)
cd services/mod-reqin-log
make build # Compiles mod_reqin_log.so via apxs
make test # Runs unit tests
Test Coverage
Unit tests cover:
- JSON serialization (escaping, size limits, field output)
- Config parsing (all directives, edge cases)
- Header handling (sensitive header exclusion, max headers, truncation)
- Module integration (real Apache module hooks)
Source Files
| File | Description |
|---|---|
| `src/mod_reqin_log.c` | Main module source |
| `src/mod_reqin_log.h` | Header with types, constants, defaults |
| `conf/mod_reqin_log.conf` | Example Apache configuration |
| `tests/unit/test_json_serialization.c` | JSON output tests |
| `tests/unit/test_config_parsing.c` | Directive parsing tests |
| `tests/unit/test_header_handling.c` | Header filtering tests |
| `tests/unit/test_module_real.c` | Integration tests |