project: name: mod_reqin_log description: > Apache HTTPD 2.4 module logging all incoming HTTP requests as JSON lines to a Unix domain socket at request reception time (no processing time). language: c author: name: Jacquin Antoine email: rpm@arkel.fr target: server: apache-httpd version: "2.4" os: rocky-linux-8+, almalinux-10+ build: toolchain: gcc apache_dev: httpd-devel (apxs) artifacts: - mod_reqin_log.so context: architecture: pattern: native-apache-module scope: global mpm_compatibility: - prefork - worker - event request_phase: hook: post_read_request rationale: > Log as soon as the HTTP request is fully read to capture input-side data (client/server addresses, request line, headers) without waiting for application processing. logging_scope: coverage: all-traffic description: > Every HTTP request handled by the Apache instance is considered for logging when the module is enabled and the Unix socket is configured. module: name: mod_reqin_log files: source: - src/mod_reqin_log.c - src/mod_reqin_log.h packaging: - mod_reqin_log.spec tests: - tests/unit/test_module_real.c - tests/unit/test_config_parsing.c - tests/unit/test_header_handling.c - tests/unit/test_json_serialization.c hooks: - name: register_hooks responsibilities: - Register post_read_request hook for logging at request reception. - Register child_init hook for per-process state initialization. - Initialize per-process server configuration structure. - name: child_init responsibilities: - Initialize module state for each Apache child process. - Reset per-process socket state (fd, timers, error flags). - Attempt initial non-blocking connection to Unix socket if configured. - name: post_read_request responsibilities: - Retrieve per-process server configuration (thread-safe). - Ensure Unix socket is connected (with periodic reconnect). - Build JSON log document for the request. - Write JSON line to Unix socket using non-blocking I/O. - Handle errors by dropping the current log line and rate-limiting error reports into Apache error_log. thread_safety: model: per-process-state description: > Each Apache child process maintains its own socket state stored in the server configuration structure (reqin_log_server_conf_t). This avoids race conditions in worker and event MPMs where multiple threads share a process. implementation: - State stored via ap_get_module_config(s->module_config) - No global variables for socket state - Each process has independent: socket_fd, connect timers, error timers data_model: json_line: description: > One JSON object per HTTP request, serialized on a single line and terminated by "\n". Uses flat structure with header fields at root level. structure: flat fields: - name: time type: string format: iso8601-with-timezone example: "2026-02-26T11:59:30Z" - name: timestamp type: integer unit: microseconds (expressed as nanoseconds) description: > Wall-clock timestamp in microseconds since Unix epoch, expressed as nanoseconds for compatibility (multiplied by 1000). Note: apr_time_now() returns microseconds with microsecond precision. The nanosecond representation is for API compatibility only. example: 1708948770000000000 - name: src_ip type: string example: "192.0.2.10" - name: src_port type: integer example: 45678 - name: dst_ip type: string example: "198.51.100.5" - name: dst_port type: integer example: 443 - name: method type: string example: "GET" - name: path type: string example: "/foo/bar" - name: host type: string example: "example.com" - name: http_version type: string example: "HTTP/1.1" - name: header_ type: string description: > Flattened header fields at root level. For each configured header , a field 'header_' is added directly to the JSON root object. Headers are only included if present in the request. key_pattern: "header_" optional: true example: header_X-Request-Id: "abcd-1234" header_User-Agent: "curl/7.70.0" example_full: time: "2026-02-26T11:59:30Z" timestamp: 1708948770000000000 src_ip: "192.0.2.10" src_port: 45678 dst_ip: "198.51.100.5" dst_port: 443 method: "GET" path: "/api/users" host: "example.com" http_version: "HTTP/1.1" header_X-Request-Id: "abcd-1234" header_User-Agent: "curl/7.70.0" configuration: scope: global directives: - name: JsonSockLogEnabled type: flag context: server-config default: "Off" description: > Enable or disable mod_reqin_log logging globally. Logging only occurs when this directive is On and JsonSockLogSocket is set. - name: JsonSockLogSocket type: string context: server-config required_when_enabled: true example: "/var/run/mod_reqin_log.sock" description: > Filesystem path of the Unix domain socket to which JSON log lines will be written. - name: JsonSockLogHeaders type: list context: server-config value_example: ["X-Request-Id", "X-Trace-Id", "User-Agent"] description: > List of HTTP header names to log. For each configured header , the module adds a JSON field 'header_' at the root level of the JSON log entry (flat structure). Order matters for applying the JsonSockLogMaxHeaders limit. - name: JsonSockLogMaxHeaders type: integer context: server-config default: 10 min: 0 description: > Maximum number of headers from JsonSockLogHeaders to actually log. If more headers are configured, only the first N are considered. - name: JsonSockLogMaxHeaderValueLen type: integer context: server-config default: 256 min: 1 description: > Maximum length in characters for each logged header value. Values longer than this limit are truncated before JSON encoding. - name: JsonSockLogReconnectInterval type: integer context: server-config default: 10 unit: seconds description: > Minimal delay between two connection attempts to the Unix socket after a failure. Used to avoid reconnect attempts on every request. - name: JsonSockLogErrorReportInterval type: integer context: server-config default: 10 unit: seconds description: > Minimal delay between two error messages emitted into Apache error_log for repeated I/O or connection errors on the Unix socket. behavior: enabling_rules: - JsonSockLogEnabled must be On. - JsonSockLogSocket must be set to a non-empty path. header_handling: - Built-in blacklist prevents logging of sensitive headers by default. - Blacklisted headers: Authorization, Cookie, Set-Cookie, X-Api-Key, X-Auth-Token, Proxy-Authorization, WWW-Authenticate. - Blacklisted headers are silently skipped (logged at DEBUG level only). - If a configured header is absent in a request, the corresponding JSON key is omitted from the log entry. - Header values are truncated to JsonSockLogMaxHeaderValueLen characters. io: socket: type: unix-domain mode: client path_source: JsonSockLogSocket connection: persistence: true non_blocking: true lifecycle: open: - Attempt initial connection during child_init if enabled. - On first log attempt after reconnect interval expiry if not yet connected. failure: - On connection failure, mark socket as unavailable. - Do not block the worker process. reconnect: strategy: time-based interval_seconds: "@config.JsonSockLogReconnectInterval" trigger: > When a request arrives and the last connect attempt time is older than reconnect interval, a new connect is attempted. write: format: "json_object + '\\n'" mode: non-blocking error_handling: on_eagain: action: drop-current-log-line note: do not retry for this request. on_epipe_or_conn_reset: action: - close_socket - mark_unavailable - schedule_reconnect generic_errors: action: drop-current-log-line drop_policy: description: > Logging errors never impact client response. The current log line is silently dropped (except for throttled error_log reporting). error_handling: apache_error_log_reporting: enabled: true throttle_interval_seconds: "@config.JsonSockLogErrorReportInterval" events: - type: connect_failure message_template: "[mod_reqin_log] Unix socket connect failed: /" - type: write_failure message_template: "[mod_reqin_log] Unix socket write failed: /" fatal_conditions: - description: > Misconfiguration (JsonSockLogEnabled On but missing JsonSockLogSocket) should be reported at startup as a configuration error. - description: > Any internal JSON-encoding failure should be treated as non-fatal: drop current log and optionally emit a throttled error_log entry. constraints: performance: objectives: - Logging overhead per request should be minimal and non-blocking. - No dynamic allocations in hot path beyond what is strictly necessary (prefer APR pools where possible). design_choices: - Single JSON serialization pass per request. - Use non-blocking I/O to avoid stalling worker threads/processes. - Avoid reconnect attempts on every request via time-based backoff. security: notes: - Module includes built-in blacklist of sensitive headers to prevent accidental credential leakage (Authorization, Cookie, X-Api-Key, etc.). - Socket permissions default to 0o660 (owner+group only) for security. - Recommended socket path: /var/run/mod_reqin_log.sock (not /tmp). - Use environment variable MOD_REQIN_LOG_SOCKET to configure socket path. - Module does not anonymize IPs; data protection is delegated to configuration. - No requests are rejected due to logging failures. hardening: - Socket path length validated against system limit (108 bytes). - JSON log line size limited to 64KB to prevent memory exhaustion DoS. - NULL pointer checks on all connection/request fields. - Thread-safe socket FD access via mutex (worker/event MPMs). - Error logging reduced to prevent information disclosure. robustness: requirements: - Logging failures must not crash Apache worker processes. - Module must behave correctly under high traffic, socket disappearance, and repeated connect failures. testing: strategy: unit_tests: framework: cmocka location: tests/unit/test_module_real.c focus: - JSON serialization with header truncation and header count limits. - Dynamic buffer operations (dynbuf_t) with resize handling. - ISO8601 timestamp formatting. - Header value truncation to JsonSockLogMaxHeaderValueLen. - Control character escaping in JSON strings. execution: - docker build -f Dockerfile.tests . - docker run --rm ctest --output-on-failure ci: strategy: description: > All builds, tests and packaging are executed inside Docker containers using GitLab CI with Docker-in-Docker (dind). tools: orchestrator: GitLab CI container_engine: docker dind: true workflow_file: .gitlab-ci.yml rpm_strategy: > Separate RPMs are built for each major RHEL/CentOS/Rocky/AlmaLinux version (el8, el9, el10) due to glibc and httpd-devel incompatibilities across major versions. A single RPM cannot work across all versions. RPM packages are built using rpmbuild with mod_reqin_log.spec file. stages: - name: build description: > Build all RPM packages (el8, el9, el10) using Dockerfile.package with multi-stage build. dockerfile: Dockerfile.package artifacts: - dist/rpm/*.el8.*.rpm - dist/rpm/*.el9.*.rpm - dist/rpm/*.el10.*.rpm - name: test description: > Run unit tests (C with cmocka) inside Docker containers. dockerfile: Dockerfile.tests execution: ctest --output-on-failure - name: verify description: > Verify package installation on each target distribution. jobs: - name: verify-rpm-el8 image: rockylinux:8 check: "rpm -qi mod_reqin_log && httpd -M | grep reqin_log" - name: verify-rpm-el9 image: rockylinux:9 check: "rpm -qi mod_reqin_log && httpd -M | grep reqin_log" - name: verify-rpm-el10 image: almalinux:10 check: "rpm -qi mod_reqin_log && httpd -M | grep reqin_log"