- Implemented Apache HTTP capture using recvfrom syscall (model identical to nginx)
- Added sys_enter_recvfrom + kretprobe __x64_sys_recvfrom approach
- Renamed Apache BPF maps (apache_http_pid_map, apache_http_recv_args_map) to avoid conflicts with nginx
- Added support for recvfrom and recvmsg syscalls (recvmsg support incomplete)
Test results:
- Rocky 9 (kernel 5.14): nginx HTTP capture works perfectly with full headers
- Rocky 10 (kernel 6.12): Apache HTTP capture NOT working (headers=0)
- CentOS 8 (kernel 4.18): Apache HTTP capture NOT working (headers=0)
Suspected root cause: the Apache event MPM uses an async epoll model that
does not issue recvfrom syscalls the same way nginx does. Further
investigation is needed to find an Apache-specific capture method.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Rename apache_pid_map to apache_http_pid_map
- Rename apache_read_args_map to apache_http_recv_args_map
- Update all references in C code and Go loader
- Attempt both tracepoints and kretprobe for Apache HTTP capture
Test results:
- Rocky 9 (kernel 5.14): nginx HTTP capture works perfectly
- Rocky 10 (kernel 6.12): Apache HTTP capture not working (headers=0)
- CentOS 8 (kernel 4.18): Apache HTTP capture not working
The issue appears to be that Apache event MPM may not use recvfrom()
in the same way as nginx, or uses a different code path.
Further investigation needed for Apache HTTP capture.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add uprobe_apache.c with kretprobe on __x64_sys_recvfrom for Apache HTTP capture
- Update loader.go to support unified "servers" configuration instead of separate nginx_bin_path/apache_enabled
- Add consumeApacheHTTPEvents() function to process Apache HTTP events
- Update bpf_types.h to add Apache-specific BPF maps and structs
- Fix perf event array value_size for pb_apache_http (must be sizeof(__u32) not struct size)
- Add NGINX_APACHE_GUIDE.md documentation for HTTP capture from both servers
Validation results:
- nginx HTTP capture: ✅ Working (57 headers captured, no truncation)
- Apache HTTP capture: ⚠️ Under investigation (kretprobe not triggering on CentOS 8 kernel 4.18)
Configuration:
- JA4EBPF_UPROBES_ENABLED=true
- JA4EBPF_UPROBES_SERVERS=nginx,apache (or "both")
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Refactor uprobes configuration to use a single server list instead
of separate nginx_bin_path and apache_enabled options.
Configuration changes:
- Uprobes.Servers: []string (was: NginxBinPath + ApacheEnabled)
- Accepts: ["nginx"], ["apache"], or ["nginx", "apache"]
- Can also use "both" to enable both servers
- Environment variable: JA4EBPF_UPROBES_SERVERS (was: separate vars)
Examples:
YAML:
uprobes:
enabled: true
servers: ["nginx", "apache"]
Environment:
JA4EBPF_UPROBES_SERVERS=nginx,apache
Code changes:
- Generic loop over cfg.Uprobes.Servers for attachment and consumption
- Remove duplicate checks for Enabled/ApacheEnabled
- Update attachNginxUprobesWithRetry to use default nginx path
- Update attachApacheUprobesWithRetry to remove ApacheEnabled check
- Update documentation to reflect both nginx and apache support
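A minimal sketch of how the unified server list described above could be parsed from the environment value. `parseServers` is a hypothetical helper written for illustration, not the project's actual function; it assumes "both" expands to nginx plus apache and that unknown names are rejected.

```go
package main

import (
	"fmt"
	"strings"
)

// parseServers expands a JA4EBPF_UPROBES_SERVERS-style value such as
// "nginx,apache" or "both" into a de-duplicated, ordered server list.
// Hypothetical helper: the real loader may behave differently.
func parseServers(raw string) ([]string, error) {
	var out []string
	seen := map[string]bool{}
	add := func(name string) {
		if !seen[name] {
			seen[name] = true
			out = append(out, name)
		}
	}
	for _, item := range strings.Split(raw, ",") {
		switch name := strings.ToLower(strings.TrimSpace(item)); name {
		case "":
			// Ignore empty items such as a trailing comma.
		case "both":
			add("nginx")
			add("apache")
		case "nginx", "apache":
			add(name)
		default:
			return nil, fmt.Errorf("unknown server %q", name)
		}
	}
	return out, nil
}

func main() {
	servers, err := parseServers("nginx,apache")
	fmt.Println(servers, err)
}
```

A generic loop over the returned slice can then drive both attachment and event consumption, which is the shape the code changes above describe.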
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add supporting infrastructure for nginx HTTP capture using kretprobe
on __x64_sys_recvfrom to replace the blocked tracepoint sys_exit_recvfrom.
Changes:
- bpf/bpf_types.h: Add nginx_pid_map for filtering recvfrom by PID
- cmd/ja4ebpf/main.go: Add Uprobes configuration section
- Makefile: Add test targets for recvfrom validation
- internal/loader: Generate nginx HTTP event structures
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update all documentation to reflect that the nginx HTTP capture issue
is resolved by the kretprobe on __x64_sys_recvfrom.
Changes:
- README.md: Update HTTP status table showing kretprobe is now working
- docs/services/ja4ebpf.md: Replace tracepoint with kretprobe in hooks table,
mark issue as resolved with validation reference
- docs/architecture.md: Clarify TC HTTP plain capture is packet-level only
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixes the "permission denied" error when attaching the sys_exit_recvfrom
tracepoint on Rocky Linux 9 (kernel 5.14+). The exit tracepoint is subject
to stricter permission checks than the entry tracepoints.
Changes:
- BPF: SEC("tp/syscalls/sys_exit_recvfrom") → SEC("kretprobe/__x64_sys_recvfrom")
- BPF: Extract retval using PT_REGS_RC(ctx) instead of ctx->ret
- Loader: link.Tracepoint() → link.Kretprobe()
- Add nginxPidMap for filtering recvfrom calls by nginx PID
Validation:
- All HTTP fields captured without truncation (path up to 39 chars, query up to 244 chars)
- Custom headers (X-Request-ID, X-Custom-Header) fully captured
- Unit tests added and passing (TestKretprobeRecvfromAttachment, TestKretprobeVsTracepoint)
- ClickHouse validation complete: http_logs and http_logs_raw tables verified
Tested on:
- Rocky Linux 9 (kernel 5.14+)
- bpftool shows: kprobe name tp_sys_exit_recvfrom (kretprobe active)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add initial implementation of nginx uprobes to capture complete HTTP
headers at application layer. This addresses the limitation of TC-based
capture which truncates headers spanning multiple packets.
Changes:
- Add uprobe_nginx.c with read() syscall interception
- Add nginx_read_args map for uretprobe correlation
- Add AttachUprobesNginx() method with retry support
- Config via uprobes.enabled in YAML or JA4EBPF_UPROBES_ENABLED env var
Current status:
- ✅ HTTPS (TLS) capture works perfectly - complete headers via SSL_read
- ❌ HTTP plain nginx uprobes don't fire - nginx uses recv() not read()
- ⚠️ HTTP plain TC capture truncates headers (fundamental limitation)
Note: The nginx uprobes approach has limitations:
1. nginx uses recv()/recvmsg() syscalls, not read()
2. PLT attachment to glibc recv() doesn't trigger properly
3. Consider kprobes on sys_recvfrom or packet reassembly for future
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed a boundary error in uprobe_ssl.c where bpf_probe_read_user
was called with a `data_len & (MAX_SSL_DATA - 1)` mask, causing a
0-byte read when data_len was exactly 4096 (4096 & 4095 = 0).
This caused HTTP headers to be truncated when SSL_read returned
exactly 4096 bytes, resulting in host header values like "p"
instead of "platform".
The fix removes the incorrect bitwise operation and uses data_len
directly since it's already limited to MAX_SSL_DATA.
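The arithmetic behind this bug can be reproduced outside BPF. The sketch below is illustrative Go, not the C code from uprobe_ssl.c; `maskedLen` and `clampedLen` are hypothetical names, and the explicit clamp stands in for the commit's "use data_len directly" fix under the assumption that the length is already bounded to MAX_SSL_DATA.

```go
package main

import "fmt"

// maxSSLData mirrors MAX_SSL_DATA (4096) from the commit message.
const maxSSLData = 4096

// maskedLen reproduces the buggy expression data_len & (MAX_SSL_DATA - 1).
// The mask keeps only the low 12 bits, so a length of exactly 4096 becomes 0.
func maskedLen(dataLen uint32) uint32 {
	return dataLen & (maxSSLData - 1)
}

// clampedLen bounds the length to maxSSLData without wrapping 4096 to 0.
func clampedLen(dataLen uint32) uint32 {
	if dataLen > maxSSLData {
		return maxSSLData
	}
	return dataLen
}

func main() {
	for _, n := range []uint32{1, 4095, 4096} {
		fmt.Printf("len=%d masked=%d clamped=%d\n", n, maskedLen(n), clampedLen(n))
	}
}
```

The mask is only safe when the maximum valid length is MAX_SSL_DATA - 1; once a full 4096-byte read is legal, the mask and the bound disagree at exactly that value.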
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed race condition where ja4ebpf would fail to connect to
ClickHouse at startup because ClickHouse HTTP port wasn't ready yet,
even though Docker healthcheck passed.
Changes:
- Add 30s wait loop with ClickHouse /ping endpoint check
- Log success message when ClickHouse is ready
- Applied to all 4 stacks: nginx, apache, nginx-varnish, hitch-varnish
Test results after fix:
- nginx: 240 rows, 175 JA4 fingerprints ✅
- apache: 257 rows, 191 JA4 fingerprints ✅
- nginx-varnish: 298 rows, 242 JA4 fingerprints ✅
- hitch-varnish: 247 rows, 177 JA4 fingerprints ✅
All L3/L4 metadata (TTL, MSS, Window), TLS fingerprinting (JA4, SNI),
and HTTP layer data are correctly captured and persisted.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changed from zero-value check to existence check for clearer intent.
Both approaches have similar performance characteristics for map lookups,
but using 'ok' makes it explicit we're checking for key presence.
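For illustration, the two styles differ in behaviour only when a key is present with a zero value. This is a generic Go sketch with hypothetical function names, not the code changed by this commit.

```go
package main

import "fmt"

// presentByZeroValue treats a key as present only if its value is non-zero.
// A key stored with value 0 is indistinguishable from a missing key.
func presentByZeroValue(m map[string]int, key string) bool {
	return m[key] != 0
}

// presentByOk uses the comma-ok form, which reports key presence directly.
func presentByOk(m map[string]int, key string) bool {
	_, ok := m[key]
	return ok
}

func main() {
	m := map[string]int{"stored-zero": 0, "stored-one": 1}
	for _, k := range []string{"stored-zero", "stored-one", "missing"} {
		fmt.Println(k, presentByZeroValue(m, k), presentByOk(m, k))
	}
}
```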
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Added warning logs when HTTP/1.x request/response parsing fails
to aid in debugging malformed traffic and parser issues.
Changes:
- Log warning when ParseHTTP1Response returns nil
- Log warning when ParseHTTP1Request returns nil (SSL events)
- Log warning when ParseHTTP1Request returns nil (plain HTTP)
Includes src IP and data length for context.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fixed potential use-after-free in Update() when gcRound deletes
a session between GetOrCreate() and acquiring the session lock.
Changes:
- Add 'deleted' flag to SessionState
- Mark sessions as deleted before removing from map in gcRound
- Check deleted flag in Update and recreate session if needed
This ensures updates to deleted sessions create a new session
instead of modifying a freed/dangling reference.
Race detector verified: go test ./internal/correlation/... -race
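A minimal sketch of the deleted-flag pattern described above. The `Manager` type, its fields, and `gcRemove` are hypothetical stand-ins for the real correlation package; only `SessionState`, `GetOrCreate`, `Update`, and the `deleted` flag are named in the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionState carries its own lock plus the deleted flag from the commit.
type SessionState struct {
	mu      sync.Mutex
	deleted bool
	hits    int
}

// Manager is a hypothetical owner of the session map.
type Manager struct {
	mu       sync.Mutex
	sessions map[string]*SessionState
}

func NewManager() *Manager {
	return &Manager{sessions: map[string]*SessionState{}}
}

// GetOrCreate returns the session for key, creating it if absent.
func (m *Manager) GetOrCreate(key string) *SessionState {
	m.mu.Lock()
	defer m.mu.Unlock()
	s, ok := m.sessions[key]
	if !ok {
		s = &SessionState{}
		m.sessions[key] = s
	}
	return s
}

// gcRemove marks the session deleted before removing it from the map,
// so a concurrent holder of the pointer can detect that it is stale.
func (m *Manager) gcRemove(key string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if s, ok := m.sessions[key]; ok {
		s.mu.Lock()
		s.deleted = true
		s.mu.Unlock()
		delete(m.sessions, key)
	}
}

// Update retries with a fresh session if the GC deleted the one it fetched.
func (m *Manager) Update(key string) {
	for {
		s := m.GetOrCreate(key)
		s.mu.Lock()
		if s.deleted {
			s.mu.Unlock()
			continue // GC won the race; fetch or create a live session
		}
		s.hits++
		s.mu.Unlock()
		return
	}
}

func main() {
	m := NewManager()
	stale := m.GetOrCreate("10.0.0.1:443")
	m.gcRemove("10.0.0.1:443")
	m.Update("10.0.0.1:443")
	fmt.Println(stale.hits, m.GetOrCreate("10.0.0.1:443").hits)
}
```

In this sketch the map lock is never held while waiting on a session lock in `Update`, which keeps the lock order consistent with `gcRemove`.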
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The ReadyCh consumer goroutine could block indefinitely during shutdown
because it only checked for channel closure but not context cancellation.
Changes:
- Add select statement to check both ctx.Done() and ReadyCh closure
- Add shutdown logging for better debugging
- Add 100ms grace period for goroutines to exit
Fixes potential hangs when receiving SIGTERM/SIGINT.
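The select-based loop described above can be sketched as follows. `consume` and `demo` are hypothetical names and the channel element type is an assumption; the real ReadyCh carries whatever the correlation manager emits.

```go
package main

import (
	"context"
	"fmt"
)

// consume handles items from readyCh until the channel is closed or the
// context is cancelled, and reports which of the two ended the loop.
func consume(ctx context.Context, readyCh <-chan string, handle func(string)) string {
	for {
		select {
		case <-ctx.Done():
			return "cancelled"
		case item, ok := <-readyCh:
			if !ok {
				return "closed"
			}
			handle(item)
		}
	}
}

// demo runs consume against a channel that is either left open and empty
// (cancelFirst, as during shutdown) or closed after one item.
func demo(cancelFirst bool) (reason string, handled int) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	readyCh := make(chan string, 1)
	if cancelFirst {
		cancel()
	} else {
		readyCh <- "session"
		close(readyCh)
	}
	reason = consume(ctx, readyCh, func(string) { handled++ })
	return reason, handled
}

func main() {
	fmt.Println(demo(true))
	fmt.Println(demo(false))
}
```

A loop that only ranges over the channel would never return in the first scenario, which is the hang this commit addresses.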
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace single-service-per-endpoint with all-ips mode running nginx, apache,
and hitch+varnish simultaneously on 3 dedicated IPs per VM (eth1 alias IPs).
Add a dedicated traffic VM with curl-impersonate for realistic TLS fingerprints,
parallelized traffic generation, and paired SNI_HOSTS/TARGET_IPS lists for
per-VM per-service hostname identification (e.g. rocky9-nginx-platform.test).
Key changes:
- run-tests-vm.sh: add setup_all_ips(), IP-specific Listen/bind directives
with reset-before-apply pattern, graceful service availability checks
- run-e2e-test.sh: traffic VM architecture, all-ips mode, eth1 network,
paired IP/SNI lists, updated cleanup for alias IPs
- generate-traffic.sh: parallel background jobs, curl-impersonate detection,
auto source interface detection via ip route get, Host header in HTTP traffic
- Vagrantfile: add traffic VM with provision-traffic.sh
- provision-traffic.sh: install curl-impersonate and httpx for traffic gen
- test-rpm.sh: multi-interface TC check, updated ja4ebpf config
- clickhouse-init.sh: load CSV stubs for Anubis/bot-networks dictionaries
- Remove obsolete correlator/sentinel/mod-reqin-log docs
- Add h2_settings_ack column to http_logs schema
- Upgrade Go toolchain to 1.25.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The HPACK static table was completely wrong from index 15 onwards — entries
were shifted and missing, causing all header name lookups to return wrong
names (e.g. index 19 returned "cookie" instead of "accept"). Rewrite the
entire table as hpackStaticEntry{Name,Value} structs matching RFC 7541 Appendix
A (indices 1-61) plus browser extensions (62-100). Fix DecodeH2HeadersBlock to
properly decode fully-indexed representations (6.1) which were silently dropped
before — now both name and value are extracted from the static table entry.
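As an illustration of the fully-indexed case, here is a small sketch of decoding an Indexed Header Field (RFC 7541 section 6.1) against a static table. The table below is a deliberately partial excerpt of Appendix A, and `decodeIndexed` is a hypothetical helper, not the project's DecodeH2HeadersBlock.

```go
package main

import "fmt"

// hpackStaticEntry follows the {Name, Value} shape named in the commit.
type hpackStaticEntry struct{ Name, Value string }

// staticTable holds only a few RFC 7541 Appendix A entries for the demo.
var staticTable = map[int]hpackStaticEntry{
	2:  {":method", "GET"},
	4:  {":path", "/"},
	7:  {":scheme", "https"},
	19: {"accept", ""},
	32: {"cookie", ""},
}

// decodeIndexed decodes one Indexed Header Field at the start of buf and
// returns the table entry and the number of bytes consumed.
func decodeIndexed(buf []byte) (hpackStaticEntry, int, error) {
	if len(buf) == 0 || buf[0]&0x80 == 0 {
		return hpackStaticEntry{}, 0, fmt.Errorf("not an indexed header field")
	}
	idx := int(buf[0] & 0x7f) // 7-bit prefix integer
	n := 1
	if idx == 0x7f { // prefix saturated: continuation bytes follow (section 5.1)
		for shift := 0; ; shift += 7 {
			if n >= len(buf) || shift > 28 {
				return hpackStaticEntry{}, 0, fmt.Errorf("bad integer encoding")
			}
			b := buf[n]
			n++
			idx += int(b&0x7f) << shift
			if b&0x80 == 0 {
				break
			}
		}
	}
	entry, ok := staticTable[idx]
	if !ok {
		return hpackStaticEntry{}, 0, fmt.Errorf("index %d not in this partial table", idx)
	}
	return entry, n, nil
}

func main() {
	for _, b := range []byte{0x82, 0x93} {
		entry, n, err := decodeIndexed([]byte{b})
		fmt.Printf("%#x -> %q=%q (%d byte) err=%v\n", b, entry.Name, entry.Value, n, err)
	}
}
```

Both the name and the value come from the table entry, so a decoder that only handles literal representations silently loses headers such as `:method: GET`.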
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add dst_ip and dst_port fields to tls_hello_event BPF struct and populate
them in tc_capture.c. Update Go TLS event handler with new byte offsets
(payload[2048]+src_ip(4)+dst_ip(4)+src_port(2)+dst_port(2)+payload_len(2)+
timestamp_ns(8) = 2070 bytes). Read dst_ip/dst_port from HTTP plain events
and use them to populate L3L4 when SYN was not captured, ensuring dst_ip
and dst_port are always available in ClickHouse for both TLS and HTTP sessions.
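The byte offsets implied by the field list above can be derived mechanically. The sketch below assumes a packed struct in the listed order and little-endian scalar fields, with ports read as host-order values; those are assumptions for illustration, and `parseTLSHello` is a hypothetical name rather than the project's handler.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Offsets derived from the field order and sizes in the commit message.
const (
	maxTLSPayload = 2048
	offSrcIP      = maxTLSPayload     // 2048
	offDstIP      = offSrcIP + 4      // 2052
	offSrcPort    = offDstIP + 4      // 2056
	offDstPort    = offSrcPort + 2    // 2058
	offPayloadLen = offDstPort + 2    // 2060
	offTimestamp  = offPayloadLen + 2 // 2062
	tlsHelloSize  = offTimestamp + 8  // 2070
)

type tlsHello struct {
	SrcIP, DstIP     [4]byte
	SrcPort, DstPort uint16
	PayloadLen       uint16
	TimestampNs      uint64
	Payload          []byte
}

// parseTLSHello decodes one raw event, assuming a packed little-endian layout.
func parseTLSHello(raw []byte) (tlsHello, error) {
	if len(raw) < tlsHelloSize {
		return tlsHello{}, fmt.Errorf("short event: %d < %d bytes", len(raw), tlsHelloSize)
	}
	var ev tlsHello
	copy(ev.SrcIP[:], raw[offSrcIP:offDstIP])
	copy(ev.DstIP[:], raw[offDstIP:offSrcPort])
	ev.SrcPort = binary.LittleEndian.Uint16(raw[offSrcPort:])
	ev.DstPort = binary.LittleEndian.Uint16(raw[offDstPort:])
	ev.PayloadLen = binary.LittleEndian.Uint16(raw[offPayloadLen:])
	ev.TimestampNs = binary.LittleEndian.Uint64(raw[offTimestamp:])
	n := int(ev.PayloadLen)
	if n > maxTLSPayload {
		n = maxTLSPayload // never read past the payload array
	}
	ev.Payload = raw[:n]
	return ev, nil
}

func main() {
	raw := make([]byte, tlsHelloSize)
	copy(raw[offDstIP:], []byte{192, 0, 2, 10})
	binary.LittleEndian.PutUint16(raw[offDstPort:], 443)
	ev, err := parseTLSHello(raw)
	fmt.Println(tlsHelloSize, ev.DstIP, ev.DstPort, err)
}
```

Deriving each offset from the previous one keeps the Go consumer in step with the C struct when a field is inserted, which is the class of drift the surrounding commits had to correct by hand.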
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fix two critical offset bugs introduced when ip_total_length was added to
tcp_syn_event: tcp_options_raw offset 21→23 and tcp_options_len offset 61→63,
plus minimum size check 70→72. Fix ssl_data_event direction field offset from
4118 (inside timestamp_ns) to 4126. Simplify attachSSLWrite to use generated
objects directly instead of dynamic spec loading. Regenerate BPF objects with
SSL_write uprobe programs included.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add uprobe_ssl_write_entry/uretprobe_ssl_write_exit to capture server HTTP
responses via SSL_write with direction=1. Implement full HPACK decoder
(RFC 7541 static table, multi-byte integers, literal representations) for
HTTP/2 header extraction. Add AcceptCache mapping {tgid,fd}→SessionKey
from accept4 events as authoritative source for SSL correlation when BPF
ssl_conn_map has src_ip=0. Add ip_total_length to tcp_syn_event BPF struct.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The manual byte assembly (sa_buf[2]<<8 | sa_buf[3]) already produces
a host-byte-order port value; __builtin_bswap16 was swapping it again,
causing SSL events to use wrong source ports and preventing TLS/HTTP
session correlation.
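The double swap is easy to reproduce with concrete bytes. The following Go sketch mirrors the expression from the commit; the buffer contents are an illustrative sockaddr_in prefix (two family bytes, then the port in network byte order), and the function names are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// portFromSockaddr assembles the port from bytes 2 and 3, high byte first.
// The result is already a plain host-order integer.
func portFromSockaddr(saBuf []byte) uint16 {
	return uint16(saBuf[2])<<8 | uint16(saBuf[3])
}

// doubleSwapped reproduces the bug: swapping the assembled value again,
// as __builtin_bswap16 did in the BPF code.
func doubleSwapped(saBuf []byte) uint16 {
	return bits.ReverseBytes16(portFromSockaddr(saBuf))
}

func main() {
	sa := []byte{0x02, 0x00, 0x01, 0xBB} // port bytes 0x01 0xBB encode 443
	fmt.Println(portFromSockaddr(sa), doubleSwapped(sa))
}
```

Shifting and OR-ing individual bytes is endianness-independent, so no further conversion is needed; a byte swap is only required when the two bytes are loaded as a single 16-bit value on a little-endian host.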
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add run-e2e-test.sh with CLI parameters (--hits, --http-ratio, --dns, --tls,
--src-ips, --keep-analysis, --up) for configurable traffic generation. Traffic
runs from VM endpoints with multiple source IPs (alias IPs on eth0) to produce
distinct sessions for the ML pipeline. Fix curl TLS flags (--tlsv1.2 instead
of --tls-v1-2), skip redundant local verification in distributed mode, and
fix dashboard is_available() cache that never retried after ClickHouse recovery.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the LogisticRegression meta-learner with a PyTorch MetaFusionMLP
(Linear(3,16)->BN->ReLU->Dropout->Linear(16,1)->Sigmoid) for non-linear
fusion of EIF, NF, and XGBoost scores. Replace KS-test + quantile digest
drift detection with ADWIN (adaptive sliding window, Hoeffding bound).
Replace weekly XGBoost batch retraining with River HoeffdingAdaptiveTree
for incremental online learning (learn_one per cycle). Update all thesis
documentation sections (2.4.2c, 2.4.3, 3.8, discussion, conclusion).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rewrite fleet.py to use a GNN-based approach: nodes are src_ip with ML feature
vectors, edges connect IPs sharing (JA4, ASN) pairs, GraphSAGE (2 SAGEConv
layers, in→64→32) produces 32D embeddings clustered by HDBSCAN. PyG NeighborLoader
activates for >50k nodes. Update thesis docs (§5.2, §6.4, §2, §8) to reflect
GraphSAGE architecture and PyG scalability.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split monolithic thesis into separate chapter markdown files under
docs/thesis/. Remove fabricated bibliography entries, correct inflated
claims, add GNN/Transformers section, and rename MetaLearner to Fusion LR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add .gitignore rules for generated CSV data, eBPF compiled objects,
and vmlinux.h header. Remove 19 tracked files (~175 MB) that can be
regenerated from scripts (generate_*.py), bpftool, or bpf2go.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implement offline profile building (profile_builder.py) and real-time
dynamic scoring (browser_matcher_dynamic.py) using HDBSCAN-based browser
fingerprint clustering. Add ClickHouse materialized view (13_h2_profiling.sql)
for h2_profile_stats aggregation. Update thesis and project documentation
to cover the new dynamic profiling architecture.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Dockerfile.package: migrate go-builder from golang:bookworm (Debian) to
rockylinux:9, install Go from the official tarball, replace apt with
dnf (clang llvm libbpf-devel bpftool)
- Remove the 'correlated' field from the ja4ebpf agent: with eBPF/XDP,
L3/L4↔L7 correlation is always implied by the presence of the fields.
Removed from: session.go, manager.go, main.go (x5), clickhouse.go
- Thesis (6 listed corrections + 'correlated' consistency):
1. §3.5 + §3.9.1: SSL_read returns raw bytes without respecting H2
frame boundaries → circular reassembly buffer in Go userspace
2. §3.1: removed libpcap + CAP_NET_RAW, replaced with the uprobe definition
3. §4 + §7: exact count of 96 features in 8 families (Family 1–8),
removed the obsolete F1–F11 taxonomy, all totals updated
4. §2.4 + §8: replaced 7 fake arXiv URLs with [Référence à vérifier] (reference to verify)
5. §4 Family 2: ja4_drift_ratio → cross-reference to Family 8 (full definition)
6. §6.4: added the limitation 'SSL_read uprobe overhead'
+ §3.6: removed correlated=0/1 from the architecture text
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Increase MAX_TLS_PAYLOAD from 512 to 2048 bytes to capture full
TLS ClientHellos (modern browsers/curl send 1000-1543 byte ClientHellos)
- Fix ParseClientHello to tolerate XDP-truncated payloads: clamp
recordLength and chLen to available data instead of returning error
- Fix cipher suites, compression, extensions truncation to use clamping
- Fix consumeSynEvents struct field offsets: dst_ip (4 bytes at offset 4)
was not accounted for, causing all L3/L4 metadata to be read from
wrong positions (TTL was actually dst_ip[0], windowSize was dst_port, etc.)
- Add parseTCPOptions() to extract MSS and Window Scale from raw TCP options
(C code sets defaults of mss=0, window_scale=0xFF, expects Go to parse)
- Fix consumeAcceptEvents: skip zero-IP events to avoid phantom sessions
- Fix consumeSSLEvents: filter zero-IP/port events when proc fallback fails
- Add missing consumeHTTPPlainEvents goroutine (was defined but never called)
- Fix race condition: SYN consumer sets Correlated=true if TLS already present
- Update tls_hello_event struct offsets in Go consumer (payload_len now at
offset 2054, was 518, due to payload array growing from 512 to 2048 bytes)
- Remove debug logging from consumers and GC
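A minimal sketch of what a `parseTCPOptions` helper like the one listed above might look like. It assumes the defaults stated in the commit (mss=0, window_scale=0xFF when the option is absent) and standard TCP option encoding (kind 2 for MSS, kind 3 for Window Scale); the project's actual signature and error handling may differ.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// parseTCPOptions walks raw TCP options and extracts MSS and Window Scale.
// Absent options keep the defaults mss=0 and windowScale=0xFF.
func parseTCPOptions(opts []byte) (mss uint16, windowScale uint8) {
	mss, windowScale = 0, 0xFF
	for i := 0; i < len(opts); {
		kind := opts[i]
		switch kind {
		case 0: // End of Option List
			return
		case 1: // No-Operation: single byte, no length field
			i++
			continue
		}
		if i+1 >= len(opts) {
			return // truncated: no room for the length byte
		}
		length := int(opts[i+1])
		if length < 2 || i+length > len(opts) {
			return // malformed or truncated option
		}
		switch {
		case kind == 2 && length == 4: // Maximum Segment Size
			mss = binary.BigEndian.Uint16(opts[i+2:])
		case kind == 3 && length == 3: // Window Scale
			windowScale = opts[i+2]
		}
		i += length
	}
	return
}

func main() {
	// MSS 1460, SACK permitted, NOP, Window Scale 10.
	opts := []byte{2, 4, 0x05, 0xB4, 4, 2, 1, 3, 3, 10}
	fmt.Println(parseTCPOptions(opts))
}
```

The sample bytes decode to the same MSS and window scale values reported in the E2E results below (1460 and 10).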
E2E verified: HTTP plain (port 80) and HTTPS (port 443) both produce
fully correlated sessions in ClickHouse with correct:
- ip_meta_ttl=64, ip_meta_df=true, ip_meta_id
- tcp_meta_window_size=64240, tcp_meta_window_scale=10, tcp_meta_mss=1460
- ja4=t13i3010_1d37bd780c83_95d2a80e6515
- tls_alpn=http/1.1
- method=GET, path=/, header_order_signature=Host;User-Agent;Accept
- correlated=1
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Use two separate //go:generate directives (Ja4Tc for tc_capture.c, Ja4Ssl
for uprobe_ssl.c) to avoid duplicate LICENSE symbol and multi-file clang issue
- Update loader.go to hold tcObjs/sslObjs separately with correct field names:
UprobeSslSetFd, UprobeSslReadEntry, UretprobeSslReadExit,
KprobeAccept4Entry, KretprobeAccept4Exit
- Add systemd-rpm-macros to all three RPM build stages (el8/el9/el10)
so that %{_unitdir} macro resolves correctly
- RPMs now build successfully for el8, el9, el10
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>