feat: JA3 fingerprinting, SSL correlation fix, ML pipeline overhaul, E2E test infra

ja4ebpf:
- Add JA3 raw + MD5 hash fingerprinting (ComputeJA3 in TLS parser)
- Fix accept4 port double-swap bug (__builtin_bswap16 on already-host-order value)
- Fix scheme override bug in ClickHouse writer (HTTP block clearing HTTPS)
- Add HTTP/2 passive fingerprinting (Akamai H2 FP, SETTINGS, pseudo-header order)
- Enrich ClickHouse schema with IP/TCP metadata, H2 settings, Sec-* headers
- Ensure maximum data completeness: all available L3/L4, TLS, HTTP fields emitted
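
For reference, the JA3 raw string and its MD5 hash mentioned above can be sketched as follows. This is a minimal illustration, not the actual `ComputeJA3` code from this commit: the `helloFields` type, the `ja3` function and the helper names are hypothetical, and the sketch assumes the ClientHello has already been parsed into numeric fields. GREASE values (RFC 8701) are skipped, as the public JA3 description recommends.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"strconv"
	"strings"
)

// helloFields is a hypothetical container for values already parsed
// from a TLS ClientHello (not the struct used in this repository).
type helloFields struct {
	Version      uint16   // ClientHello version field
	Ciphers      []uint16 // cipher suites, in wire order
	Extensions   []uint16 // extension types, in wire order
	Groups       []uint16 // supported groups (elliptic curves)
	PointFormats []uint8  // EC point formats
}

// isGREASE reports whether v is a GREASE value (0x0a0a, 0x1a1a, ..., 0xfafa).
func isGREASE(v uint16) bool {
	return v&0x0f0f == 0x0a0a && v>>8 == v&0xff
}

// joinU16 renders values as dash-separated decimals, skipping GREASE.
func joinU16(vals []uint16) string {
	parts := make([]string, 0, len(vals))
	for _, v := range vals {
		if isGREASE(v) {
			continue
		}
		parts = append(parts, strconv.Itoa(int(v)))
	}
	return strings.Join(parts, "-")
}

// ja3 returns the raw JA3 string (version,ciphers,exts,groups,ecfmts)
// and the hex-encoded MD5 digest of that string.
func ja3(h helloFields) (raw, hash string) {
	fmts := make([]string, 0, len(h.PointFormats))
	for _, f := range h.PointFormats {
		fmts = append(fmts, strconv.Itoa(int(f)))
	}
	raw = strings.Join([]string{
		strconv.Itoa(int(h.Version)),
		joinU16(h.Ciphers),
		joinU16(h.Extensions),
		joinU16(h.Groups),
		strings.Join(fmts, "-"),
	}, ",")
	sum := md5.Sum([]byte(raw))
	return raw, hex.EncodeToString(sum[:])
}
```

Calling `ja3` on a parsed hello yields the two values that would populate `JA3Raw` and `JA3Hash`. Empty lists produce empty fields, so the four commas are always present in the raw string.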

bot-detector:
- Replace logistic regression with MLP fusion classifier
- Replace KS drift detection with ADWIN online learning
- Replace NetworkX/Louvain with PyTorch Geometric GraphSAGE for fleet detection
- Replace autoencoder with RealNVP normalizing flow + SessionTransformer embeddings

infra:
- Add distributed E2E test infrastructure (4 VMs: endpoints + analysis)
- Add Vagrant provisioning for analysis VM, e2e Makefile targets, run scripts

docs:
- Restructure thesis into chapter files with corrected references
- Add E2E testing documentation
- Update architecture, schema, deployment, service docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jacquin Antoine
2026-04-15 02:57:07 +02:00
parent f88b739992
commit 61addc8cfa
5 changed files with 171 additions and 160 deletions

@@ -31,6 +31,8 @@ type L3L4 struct {
type TLSInfo struct {
ClientHelloRaw []byte // raw ClientHello payload
JA4Hash string // computed JA4 fingerprint
JA3Raw string // raw JA3 fingerprint (format: version,ciphers,exts,groups,ecfmts)
JA3Hash string // MD5 hash of the raw JA3 fingerprint
SNI string // Server Name Indication
ALPN []string // Application-Layer Protocol Negotiation protocols
TLSVersion uint16 // highest TLS version advertised