Add supporting infrastructure for nginx HTTP capture using kretprobe
on __x64_sys_recvfrom to replace the blocked tracepoint sys_exit_recvfrom.
Changes:
- bpf/bpf_types.h: Add nginx_pid_map for filtering recvfrom by PID
- cmd/ja4ebpf/main.go: Add Uprobes configuration section
- Makefile: Add test targets for recvfrom validation
- internal/loader: Generate nginx HTTP event structures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes "permission denied" error when attaching tracepoint sys_exit_recvfrom
on Rocky Linux 9 (kernel 5.14+). The tracepoint exit has stricter permissions
than entry tracepoints.
Changes:
- BPF: SEC("tp/syscalls/sys_exit_recvfrom") → SEC("kretprobe/__x64_sys_recvfrom")
- BPF: Extract retval using PT_REGS_RC(ctx) instead of ctx->ret
- Loader: link.Tracepoint() → link.Kretprobe()
- Add nginxPidMap for filtering recvfrom calls by nginx PID
Validation:
- All HTTP fields captured without truncation (path up to 39 chars, query up to 244 chars)
- Custom headers (X-Request-ID, X-Custom-Header) fully captured
- Unit tests added and passing (TestKretprobeRecvfromAttachment, TestKretprobeVsTracepoint)
- ClickHouse validation complete: http_logs and http_logs_raw tables verified
Tested on:
- Rocky Linux 9 (kernel 5.14+)
- bpftool shows: kprobe name tp_sys_exit_recvfrom (kretprobe active)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added warning logs when HTTP/1.x request/response parsing fails
to aid in debugging malformed traffic and parser issues.
Changes:
- Log warning when ParseHTTP1Response returns nil
- Log warning when ParseHTTP1Request returns nil (SSL events)
- Log warning when ParseHTTP1Request returns nil (plain HTTP)
Includes src IP and data length for context.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ReadyCh consumer goroutine could block indefinitely during shutdown
because it only checked for channel closure but not context cancellation.
Changes:
- Add select statement to check both ctx.Done() and ReadyCh closure
- Add shutdown logging for better debugging
- Add 100ms grace period for goroutines to exit
Fixes potential hangs when receiving SIGTERM/SIGINT.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add dst_ip and dst_port fields to tls_hello_event BPF struct and populate
them in tc_capture.c. Update Go TLS event handler with new byte offsets
(payload[2048]+src_ip(4)+dst_ip(4)+src_port(2)+dst_port(2)+payload_len(2)+
timestamp_ns(8) = 2070 bytes). Read dst_ip/dst_port from HTTP plain events
and use them to populate L3L4 when SYN was not captured, ensuring dst_ip
and dst_port are always available in ClickHouse for both TLS and HTTP sessions.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix two critical offset bugs introduced when ip_total_length was added to
tcp_syn_event: tcp_options_raw offset 21→23 and tcp_options_len offset 61→63,
plus minimum size check 70→72. Fix ssl_data_event direction field offset from
4118 (inside timestamp_ns) to 4126. Simplify attachSSLWrite to use generated
objects directly instead of dynamic spec loading. Regenerate BPF objects with
SSL_write uprobe programs included.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add run-e2e-test.sh with CLI parameters (--hits, --http-ratio, --dns, --tls,
--src-ips, --keep-analysis, --up) for configurable traffic generation. Traffic
runs from VM endpoints with multiple source IPs (alias IPs on eth0) to produce
distinct sessions for the ML pipeline. Fix curl TLS flags (--tlsv1.2 instead
of --tls-v1-2), skip redundant local verification in distributed mode, and
fix dashboard is_available() cache that never retried after ClickHouse recovery.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Dockerfile.package: migre go-builder de golang:bookworm (Debian) vers
rockylinux:9, installe Go depuis le tarball officiel, remplace apt par
dnf (clang llvm libbpf-devel bpftool)
- Suppression du champ 'correlated' de l'agent ja4ebpf : avec eBPF/XDP,
la corrélation L3/L4↔L7 est toujours implicite par présence des champs.
Supprimé de : session.go, manager.go, main.go (x5), clickhouse.go
- Thèse (6 corrections listées + cohérence correlated) :
1. §3.5 + §3.9.1 : SSL_read retourne des octets bruts sans respecter les
frontières H2 → buffer circulaire de réassemblage en Go userspace
2. §3.1 : supprimé libpcap + CAP_NET_RAW, remplacé par définition uprobe
3. §4 + §7 : compte exact 96 features en 8 familles (Famille 1–8),
supprimé taxonomie F1–F11 obsolète, tous les totaux mis à jour
4. §2.4 + §8 : remplacé 7 fausses URLs arXiv par [Référence à vérifier]
5. §4 Famille 2 : ja4_drift_ratio → renvoi à Famille 8 (définition complète)
6. §6.4 : ajouté limite 'Overhead de l'uprobe SSL_read'
+ §3.6 : supprimé correlated=0/1 du texte architectural
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Increase MAX_TLS_PAYLOAD from 512 to 2048 bytes to capture full
TLS ClientHellos (modern browsers/curl send 1000-1543 byte ClientHellos)
- Fix ParseClientHello to tolerate XDP-truncated payloads: clamp
recordLength and chLen to available data instead of returning error
- Fix cipher suites, compression, extensions truncation to use clamping
- Fix consumeSynEvents struct field offsets: dst_ip (4 bytes at offset 4)
was not accounted for, causing all L3/L4 metadata to be read from
wrong positions (TTL was actually dst_ip[0], windowSize was dst_port, etc.)
- Add parseTCPOptions() to extract MSS and Window Scale from raw TCP options
(C code sets defaults of mss=0, window_scale=0xFF, expects Go to parse)
- Fix consumeAcceptEvents: skip zero-IP events to avoid phantom sessions
- Fix consumeSSLEvents: filter zero-IP/port events when proc fallback fails
- Add missing consumeHTTPPlainEvents goroutine (was defined but never called)
- Fix race condition: SYN consumer sets Correlated=true if TLS already present
- Update tls_hello_event struct offsets in Go consumer (payload_len now at
offset 2054, was 518, due to payload array growing from 512 to 2048 bytes)
- Remove debug logging from consumers and GC
E2E verified: HTTP plain (port 80) and HTTPS (port 443) both produce
fully correlated sessions in ClickHouse with correct:
- ip_meta_ttl=64, ip_meta_df=true, ip_meta_id
- tcp_meta_window_size=64240, tcp_meta_window_scale=10, tcp_meta_mss=1460
- ja4=t13i3010_1d37bd780c83_95d2a80e6515
- tls_alpn=http/1.1
- method=GET, path=/, header_order_signature=Host;User-Agent;Accept
- correlated=1
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>