feat: JA3 fingerprinting, SSL correlation fix, ML pipeline overhaul, E2E test infra

ja4ebpf:
- Add JA3 raw + MD5 hash fingerprinting (ComputeJA3 in TLS parser)
- Fix accept4 port double-swap bug (__builtin_bswap16 on already-host-order value)
- Fix scheme override bug in ClickHouse writer (HTTP block clearing HTTPS)
- Add HTTP/2 passive fingerprinting (Akamai H2 FP, SETTINGS, pseudo-header order)
- Enrich ClickHouse schema with IP/TCP metadata, H2 settings, Sec-* headers
- Emit every available L3/L4, TLS, and HTTP field for maximum data completeness
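
Note on the accept4 port fix: a minimal Go sketch of why a second byte swap corrupts the port, assuming a little-endian host. The helper name toHostOrder is hypothetical and only mirrors what __builtin_bswap16 does on the eBPF side; it is not code from this commit.

```go
package main

import (
	"fmt"
	"math/bits"
)

// toHostOrder is a hypothetical helper: on a little-endian host, converting a
// 16-bit network-order value to host order is a plain byte swap.
func toHostOrder(v uint16) uint16 {
	return bits.ReverseBytes16(v)
}

func main() {
	// Port 443 is 0x01BB. Its network-order bytes (0x01, 0xBB), read as a
	// little-endian uint16, give 0xBB01.
	wire := uint16(0xBB01)

	once := toHostOrder(wire)  // correct: one swap yields the real port
	twice := toHostOrder(once) // bug: swapping a host-order value undoes the fix

	fmt.Println(once, twice)
}
```

Swapping is its own inverse, so applying it to a value that is already in host order silently restores the wire representation, which is the symptom described in the bullet above.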

bot-detector:
- Replace logistic regression with MLP fusion classifier
- Replace KS drift detection with ADWIN online learning
- Replace NetworkX/Louvain with PyTorch Geometric GraphSAGE for fleet detection
- Replace autoencoder with RealNVP normalizing flow + SessionTransformer embeddings

infra:
- Add distributed E2E test infrastructure (4 VMs: endpoints + analysis)
- Add Vagrant provisioning for analysis VM, e2e Makefile targets, run scripts

docs:
- Restructure thesis into chapter files with corrected references
- Add E2E testing documentation
- Update architecture, schema, deployment, service docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jacquin Antoine
2026-04-15 02:57:07 +02:00
parent f88b739992
commit 61addc8cfa
5 changed files with 171 additions and 160 deletions

@@ -47,6 +47,8 @@ type sessionRecord struct {
 // TLS (names expected by the MV)
 JA4Hash string `json:"ja4,omitempty"`
+JA3Raw string `json:"ja3,omitempty"`
+JA3Hash string `json:"ja3_hash,omitempty"`
 TLSSNI string `json:"tls_sni,omitempty"`
 TLSALPN string `json:"tls_alpn,omitempty"`
 TLSVersion string `json:"tls_version,omitempty"`
@@ -242,6 +244,8 @@ func sessionToRecord(s *correlation.SessionState) sessionRecord {
 // TLS fields
 if s.TLS != nil {
 rec.JA4Hash = s.TLS.JA4Hash
+rec.JA3Raw = s.TLS.JA3Raw
+rec.JA3Hash = s.TLS.JA3Hash
 rec.TLSSNI = s.TLS.SNI
 rec.TLSALPN = strings.Join(s.TLS.ALPN, ",")
 rec.TLSVersion = formatTLSVersion(s.TLS.TLSVersion)
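
Note on JA3Raw and JA3Hash: the sketch below shows how a JA3 string and its MD5 hash are conventionally derived from a ClientHello, following the public JA3 definition (five comma-separated fields, values in decimal joined by "-", GREASE values ignored). The clientHello type and the computeJA3 function are hypothetical stand-ins; the commit's actual ComputeJA3 in the TLS parser is not shown in this diff and may differ.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// clientHello is a hypothetical container for the parsed ClientHello fields.
type clientHello struct {
	Version      uint16
	Ciphers      []uint16
	Extensions   []uint16
	Curves       []uint16
	PointFormats []uint8
}

// isGREASE reports whether v is a GREASE value (0x0a0a, 0x1a1a, ..., 0xfafa).
func isGREASE(v uint16) bool {
	return v&0x0f0f == 0x0a0a && v>>8 == v&0xff
}

// joinDec renders values in decimal, joined by "-", skipping GREASE values.
func joinDec(vals []uint16) string {
	parts := make([]string, 0, len(vals))
	for _, v := range vals {
		if isGREASE(v) {
			continue
		}
		parts = append(parts, strconv.Itoa(int(v)))
	}
	return strings.Join(parts, "-")
}

// computeJA3 returns the raw JA3 string and its lowercase hex MD5 digest.
func computeJA3(h clientHello) (raw, hash string) {
	formats := make([]uint16, len(h.PointFormats))
	for i, f := range h.PointFormats {
		formats[i] = uint16(f)
	}
	raw = fmt.Sprintf("%d,%s,%s,%s,%s",
		h.Version, joinDec(h.Ciphers), joinDec(h.Extensions),
		joinDec(h.Curves), joinDec(formats))
	sum := md5.Sum([]byte(raw))
	return raw, hex.EncodeToString(sum[:])
}

func main() {
	raw, hash := computeJA3(clientHello{
		Version:      771, // TLS 1.2 (0x0303)
		Ciphers:      []uint16{0x0a0a, 4865, 4866},
		Extensions:   []uint16{0, 10, 11},
		Curves:       []uint16{29, 23},
		PointFormats: []uint8{0},
	})
	fmt.Println(raw)
	fmt.Println(hash)
}
```

Keeping both the raw string and the hash, as the record does, lets downstream queries group on the short hash while still being able to inspect which ciphers or extensions differ.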
@@ -262,7 +266,9 @@ func sessionToRecord(s *correlation.SessionState) sessionRecord {
 rec.Path = last.Path
 rec.QueryString = last.QueryString
 rec.Host = last.Host
-rec.Scheme = "" // will be filled in by the dispatcher if TLS
+if last.Host != "" && s.TLS != nil {
+rec.Scheme = "https"
+}
 rec.HTTPVersion = last.HTTPVersion
 rec.StatusCode = &last.StatusCode
 rec.ResponseSize = &last.ResponseSize
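
Note on the scheme fix: the change above can be read as a small pure function. The sketch below is an illustration only; applyScheme and its parameters are hypothetical names, and it assumes the scheme may already have been set before the HTTP block runs, as the commit message ("HTTP block clearing HTTPS") suggests.

```go
package main

import "fmt"

// applyScheme is a hypothetical restatement of the patched logic: it upgrades
// the scheme to "https" when the session has both a Host and TLS state, and
// otherwise leaves the current value untouched instead of clearing it.
func applyScheme(current, host string, hasTLS bool) string {
	if host != "" && hasTLS {
		return "https"
	}
	return current
}

func main() {
	fmt.Println(applyScheme("", "example.org", true))       // TLS session with a Host
	fmt.Println(applyScheme("https", "", true))             // no Host: value is preserved
	fmt.Println(applyScheme("http", "example.org", false))  // no TLS: value is preserved
}
```

The previous code assigned an empty string unconditionally, so any scheme set earlier was lost; the patched version only ever writes "https" and never resets the field.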