Initial commit: mod_reqin_log Apache module

Features:
- JSON logging of HTTP requests to Unix domain socket
- Configurable HTTP headers logging (flat JSON structure)
- Header value truncation and count limits
- Automatic reconnect on socket disconnection
- Error reporting with throttling

Configuration directives:
- JsonSockLogEnabled: Enable/disable logging
- JsonSockLogSocket: Unix socket path
- JsonSockLogHeaders: List of headers to log
- JsonSockLogMaxHeaders: Maximum headers to log
- JsonSockLogMaxHeaderValueLen: Max header value length
- JsonSockLogReconnectInterval: Reconnect delay
- JsonSockLogErrorReportInterval: Error log throttle

Includes:
- Module source code (src/)
- Unit and integration tests (tests/, scripts/)
- Documentation (README.md, architecture.yml)
- Build configuration (CMakeLists.txt, Makefile)
- Packaging (deb/rpm)

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
Jacquin Antoine
2026-02-26 13:55:07 +01:00
commit 66549acf5c
27 changed files with 3550 additions and 0 deletions

architecture.yml

@@ -0,0 +1,396 @@
project:
name: mod_reqin_log
description: >
Apache HTTPD 2.4 module logging all incoming HTTP requests as JSON lines
to a Unix domain socket at request reception time (request processing
time is not recorded).
language: c
target:
server: apache-httpd
version: "2.4"
os: rocky-linux-8+
build:
toolchain: gcc
apache_dev: httpd-devel (apxs)
artifacts:
- mod_reqin_log.so
context:
architecture:
pattern: native-apache-module
scope: global
mpm_compatibility:
- prefork
- worker
- event
request_phase:
hook: post_read_request
rationale: >
Log as soon as the HTTP request is fully read to capture input-side data
(client/server addresses, request line, headers) without waiting for
application processing.
logging_scope:
coverage: all-traffic
description: >
Every HTTP request handled by the Apache instance is considered for logging
when the module is enabled and the Unix socket is configured.
module:
name: mod_reqin_log
hooks:
- name: register_hooks
responsibilities:
- Register post_read_request hook for logging at request reception.
- Register the child_init hook; the per-process Unix socket connection is
opened there, not at hook registration time.
- name: child_init
responsibilities:
- Initialize module state for each Apache child process.
- Attempt initial non-blocking connection to Unix socket if configured.
- name: child_exit
responsibilities:
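- Cleanly close Unix socket file descriptor if open (Apache 2.4 has no
child_exit hook; this is typically done through a cleanup registered on
the child pool during child_init).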
- name: post_read_request
responsibilities:
- Ensure Unix socket is connected (with periodic reconnect).
- Build JSON log document for the request.
- Write JSON line to Unix socket using non-blocking I/O.
- Handle errors by dropping the current log line and rate-limiting
error reports into Apache error_log.
data_model:
json_line:
description: >
One JSON object per HTTP request, serialized on a single line and
terminated by "\n".
fields:
- name: time
type: string
format: iso8601-with-timezone
example: "2026-02-26T11:59:30Z"
- name: timestamp
type: integer
unit: nanoseconds
description: >
Monotonic or wall-clock based timestamp in nanoseconds since an
implementation-defined epoch (stable enough for ordering and latency analysis).
- name: src_ip
type: string
example: "192.0.2.10"
- name: src_port
type: integer
example: 45678
- name: dst_ip
type: string
example: "198.51.100.5"
- name: dst_port
type: integer
example: 443
- name: method
type: string
example: "GET"
- name: path
type: string
example: "/foo/bar"
- name: host
type: string
example: "example.com"
- name: http_version
type: string
example: "HTTP/1.1"
- name: headers
type: object
description: >
Flattened headers from the configured header list, emitted as root-level
keys of the JSON object rather than as a nested object. Keys are derived
from configured header names prefixed with 'header_'.
key_pattern: "header_<configured_header_name>"
example:
header_X-Request-Id: "abcd-1234"
header_User-Agent: "curl/7.70.0"
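# Illustrative example of one log line built from the field examples above.
# This is a sketch, not captured module output. It assumes the flat layout
# (header_* keys at the root) and a wall-clock timestamp in nanoseconds since
# the Unix epoch; the epoch is implementation-defined, so the timestamp value
# is an assumption.
#
# {"time":"2026-02-26T11:59:30Z","timestamp":1772107170000000000,"src_ip":"192.0.2.10","src_port":45678,"dst_ip":"198.51.100.5","dst_port":443,"method":"GET","path":"/foo/bar","host":"example.com","http_version":"HTTP/1.1","header_X-Request-Id":"abcd-1234","header_User-Agent":"curl/7.70.0"}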
configuration:
scope: global
directives:
- name: JsonSockLogEnabled
type: flag
context: server-config
default: "Off"
description: >
Enable or disable mod_reqin_log logging globally. Logging only occurs
when this directive is On and JsonSockLogSocket is set.
- name: JsonSockLogSocket
type: string
context: server-config
required_when_enabled: true
example: "/var/run/mod_reqin_log.sock"
description: >
Filesystem path of the Unix domain socket to which JSON log lines
will be written.
- name: JsonSockLogHeaders
type: list
context: server-config
value_example: ["X-Request-Id", "X-Trace-Id", "User-Agent"]
description: >
List of HTTP header names to log. For each configured header <H>,
the module adds a JSON field 'header_<H>' at the root level of the
JSON log entry (flat structure). Order matters for applying the
JsonSockLogMaxHeaders limit.
- name: JsonSockLogMaxHeaders
type: integer
context: server-config
default: 10
min: 0
description: >
Maximum number of headers from JsonSockLogHeaders to actually log.
If more headers are configured, only the first N are considered.
- name: JsonSockLogMaxHeaderValueLen
type: integer
context: server-config
default: 256
min: 1
description: >
Maximum length in characters for each logged header value.
Values longer than this limit are truncated before JSON encoding.
- name: JsonSockLogReconnectInterval
type: integer
context: server-config
default: 10
unit: seconds
description: >
Minimum delay between two connection attempts to the Unix socket after
a failure. Prevents a reconnect attempt on every request.
- name: JsonSockLogErrorReportInterval
type: integer
context: server-config
default: 10
unit: seconds
description: >
Minimum delay between two error messages emitted into Apache error_log
for repeated I/O or connection errors on the Unix socket.
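# Illustrative httpd.conf fragment using the directives above with their
# documented defaults. This is a sketch under assumptions: the module
# identifier (reqin_log_module), the .so location, and the space-separated
# argument form of JsonSockLogHeaders are hypothetical and not confirmed by
# this file.
#
#   LoadModule reqin_log_module modules/mod_reqin_log.so
#   JsonSockLogEnabled On
#   JsonSockLogSocket /var/run/mod_reqin_log.sock
#   JsonSockLogHeaders X-Request-Id X-Trace-Id User-Agent
#   JsonSockLogMaxHeaders 10
#   JsonSockLogMaxHeaderValueLen 256
#   JsonSockLogReconnectInterval 10
#   JsonSockLogErrorReportInterval 10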
behavior:
enabling_rules:
- JsonSockLogEnabled must be On.
- JsonSockLogSocket must be set to a non-empty path.
header_handling:
- No built-in blacklist; admin is fully responsible for excluding
sensitive headers (Authorization, Cookie, etc.).
- If a configured header is absent in a request, the corresponding
JSON key may be omitted or set to null (implementation choice, but
must be consistent).
- Header values are truncated to JsonSockLogMaxHeaderValueLen characters.
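# Worked example of the header rules above (values chosen for illustration
# only, not taken from the test suite):
#
#   JsonSockLogHeaders = [X-Request-Id, X-Trace-Id, User-Agent]
#   JsonSockLogMaxHeaders = 2
#   JsonSockLogMaxHeaderValueLen = 8
#
#   Request headers:
#     X-Request-Id: abcd-1234-efgh
#     User-Agent: curl/7.70.0
#
#   Expected outcome:
#     header_X-Request-Id -> "abcd-123"  (truncated to 8 characters)
#     header_X-Trace-Id   -> omitted or null (absent from the request;
#                            whichever the implementation chooses, consistently)
#     header_User-Agent   -> not logged (third configured header, beyond the
#                            limit of 2)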
io:
socket:
type: unix-domain
mode: client
path_source: JsonSockLogSocket
connection:
persistence: true
non_blocking: true
lifecycle:
open:
- Attempt initial connection during child_init if enabled.
- On first log attempt after reconnect interval expiry if not yet connected.
failure:
- On connection failure, mark socket as unavailable.
- Do not block the worker process.
reconnect:
strategy: time-based
interval_seconds: "@config.JsonSockLogReconnectInterval"
trigger: >
When a request arrives and the last connect attempt time is older
than reconnect interval, a new connect is attempted.
write:
format: "json_object + '\\n'"
mode: non-blocking
error_handling:
on_eagain:
action: drop-current-log-line
note: do not retry for this request.
on_epipe_or_conn_reset:
action:
- close_socket
- mark_unavailable
- schedule_reconnect
generic_errors:
action: drop-current-log-line
drop_policy:
description: >
Logging errors never impact client response. The current log line
is silently dropped (except for throttled error_log reporting).
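# Worked timeline for the reconnect and drop rules above, assuming
# JsonSockLogReconnectInterval = 10 and JsonSockLogErrorReportInterval = 10,
# with the socket consumer down until t = 20 s. Times are illustrative.
#
#   t = 0 s   child_init: connect fails; socket marked unavailable;
#             one connect_failure line written to error_log.
#   t = 3 s   request: last attempt was 3 s ago (< 10 s), so no connect
#             attempt; log line dropped; nothing written to error_log.
#   t = 12 s  request: last attempt was 12 s ago (> 10 s), so connect is
#             attempted and fails; log line dropped; error_log entry
#             allowed because the last report was 12 s ago.
#   t = 15 s  request: no attempt (3 s since last); log line dropped.
#   t = 23 s  request: attempt succeeds; JSON line written non-blocking.
#
# In every case the client response is unaffected.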
error_handling:
apache_error_log_reporting:
enabled: true
throttle_interval_seconds: "@config.JsonSockLogErrorReportInterval"
events:
- type: connect_failure
message_template: "[mod_reqin_log] Unix socket connect failed: <errno>/<detail>"
- type: write_failure
message_template: "[mod_reqin_log] Unix socket write failed: <errno>/<detail>"
fatal_conditions:
- description: >
Misconfiguration (JsonSockLogEnabled On but missing JsonSockLogSocket)
should be reported at startup as a configuration error.
- description: >
Any internal JSON-encoding failure should be treated as non-fatal:
drop current log and optionally emit a throttled error_log entry.
constraints:
performance:
objectives:
- Logging overhead per request should be minimal and non-blocking.
- No dynamic allocations in hot path beyond what is strictly necessary
(prefer APR pools where possible).
design_choices:
- Single JSON serialization pass per request.
- Use non-blocking I/O to avoid stalling worker threads/processes.
- Avoid reconnect attempts on every request by enforcing a minimum interval
between attempts.
security:
notes:
- Module does not anonymize IPs nor scrub headers; it is intentionally
transparent. Data protection and header choices are delegated to configuration.
- No requests are rejected due to logging failures.
robustness:
requirements:
- Logging failures must not crash Apache worker processes.
- Module must behave correctly under high traffic, socket disappearance,
and repeated connect failures.
testing:
strategy:
unit_tests:
focus:
- JSON serialization with header truncation and header count limits.
- Directive parsing and configuration merging (global scope).
- Error-handling branches for non-blocking write and reconnect logic.
integration_tests:
env:
server: apache-httpd 2.4
os: rocky-linux-8+
log_consumer: simple Unix socket server capturing JSON lines
scenarios:
- name: basic_logging
description: >
With JsonSockLogEnabled On and valid socket, verify that each request
produces a valid JSON line with expected fields.
- name: header_limits
description: >
Configure more headers than JsonSockLogMaxHeaders and verify only
the first N are logged and values are truncated according to
JsonSockLogMaxHeaderValueLen.
- name: socket_unavailable_on_start
description: >
Start Apache with JsonSockLogEnabled On but socket not yet created;
verify periodic reconnect attempts and throttled error logging.
- name: runtime_socket_loss
description: >
Drop the Unix socket while traffic is ongoing; verify that log lines
are dropped, worker threads are not blocked, and reconnect attempts
resume once the socket reappears.
ci:
strategy:
description: >
All builds, tests and packaging are executed inside Docker containers.
The host only needs Docker and the CI runner.
tools:
orchestrator: "to be defined (GitLab CI / GitHub Actions / other)"
container_engine: docker
stages:
- name: build
description: >
Compile mod_reqin_log as an Apache 2.4 module inside Docker images
dedicated to each target distribution.
jobs:
- name: build-rocky-8
image: "rockylinux:8"
steps:
- install_deps:
- gcc
- make
- httpd
- httpd-devel
- apr-devel
- apr-util-devel
- rpm-build
- build_module:
command: "apxs -c -i src/mod_reqin_log.c"
- name: build-debian
image: "debian:stable"
steps:
- install_deps:
- build-essential
- apache2
- apache2-dev
- debhelper
- build_module:
command: "apxs -c -i src/mod_reqin_log.c"
- name: test
description: >
Run unit tests (C) and integration tests (Apache + Unix socket consumer)
inside Docker containers.
jobs:
- name: unit-tests
image: "debian:stable"
steps:
- install_test_deps:
- build-essential
- cmake
- "test-framework (to be defined: cmocka, criterion, ...)"
- run_tests:
command: "ctest || make test"
- name: integration-tests-rocky-8
image: "rockylinux:8"
steps:
- prepare_apache_and_module
- start_unix_socket_consumer
- run_http_scenarios:
description: >
Validate JSON logs, header limits, socket loss and reconnect
behaviour using curl/ab/siege or similar tools.
- name: package
description: >
Build RPM and DEB packages for mod_reqin_log inside Docker.
jobs:
- name: rpm-rocky-8
image: "rockylinux:8"
steps:
- install_deps:
- rpm-build
- rpmlint
- "build deps same as 'build-rocky-8'"
- build_rpm:
spec_file: "packaging/rpm/mod_reqin_log.spec"
command: "rpmbuild -ba packaging/rpm/mod_reqin_log.spec"
- artifacts:
paths:
- "dist/rpm/**/*.rpm"
- name: deb-debian
image: "debian:stable"
steps:
- install_deps:
- devscripts
- debhelper
- dpkg-dev
- "build deps same as 'build-debian'"
- build_deb:
command: |
cd packaging/deb
debuild -us -uc
- artifacts:
paths:
- "dist/deb/**/*.deb"
artifacts:
retention:
policy: "keep build logs and packages long enough for debugging (to be defined)"
outputs:
- type: module
path: "dist/modules/mod_reqin_log.so"
- type: rpm
path: "dist/rpm/"
- type: deb
path: "dist/deb/"