feat(ml): replace Autoencoder with RealNVP Normalizing Flow and add SessionTransformer embeddings

Replace TrafficAutoEncoder (MSE reconstruction scoring) with TrafficNormalizingFlow
(RealNVP via FrEIA, 4 affine coupling blocks, anomaly score = -log p(x)), so that
scoring rests on an exact log-likelihood rather than on reconstruction error. Add a
SessionTransformer module that produces 32-dimensional sequence embeddings from raw
HTTP request sequences (path, method, timing) via a lightweight TransformerEncoder;
these embeddings replace the path_transition_entropy and cadence_cv features. Update
thesis documentation sections 2.4.2b and 3.8 accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Jacquin Antoine
2026-04-13 15:11:21 +02:00
parent 0e5f94dd0d
commit c1821dcbc4
14 changed files with 515 additions and 3590 deletions
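
The scoring rule named in the commit message (anomaly score = -log p(x) through a stack of affine coupling blocks) can be illustrated with a minimal sketch. This is not the project's `TrafficNormalizingFlow` and it does not use FrEIA: it is a pure-Python toy in which the scale and shift functions (`toy_scale`, `toy_shift`) are fixed, hypothetical stand-ins for the learned subnetworks, and the input dimension is assumed to be even.

```python
import math


def coupling_forward(x, scale_fn, shift_fn):
    """One RealNVP-style affine coupling step on a list of floats.

    The first half passes through unchanged; the second half is scaled and
    shifted by functions of the first half. Returns (y, log|det J|).
    """
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    s = scale_fn(x1)  # log-scales, one per element of x2
    t = shift_fn(x1)  # shifts, one per element of x2
    y2 = [xi * math.exp(si) + ti for xi, si, ti in zip(x2, s, t)]
    return x1 + y2, sum(s)


def coupling_inverse(y, scale_fn, shift_fn):
    """Exact inverse of coupling_forward (possible because y1 == x1)."""
    d = len(y) // 2
    y1, y2 = y[:d], y[d:]
    s = scale_fn(y1)
    t = shift_fn(y1)
    x2 = [(yi - ti) * math.exp(-si) for yi, si, ti in zip(y2, s, t)]
    return y1 + x2


def anomaly_score(x, blocks):
    """Return -log p(x) under a standard-normal base distribution.

    Change of variables: log p(x) = log N(z; 0, I) + sum of log|det J|.
    """
    z, log_det = list(x), 0.0
    for scale_fn, shift_fn in blocks:
        z, block_log_det = coupling_forward(z, scale_fn, shift_fn)
        log_det += block_log_det
        z = z[::-1]  # permutation (|det| = 1) so the other half is transformed next
    log_pz = -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2 * math.pi)
    return -(log_pz + log_det)


# Hypothetical fixed "subnetworks"; a trained flow would learn these.
def toy_scale(h):
    return [0.1 * math.tanh(v) for v in h]


def toy_shift(h):
    return [0.5 * v for v in h]


TOY_BLOCKS = [(toy_scale, toy_shift)] * 4  # 4 coupling blocks, as in the commit

if __name__ == "__main__":
    print(anomaly_score([0.1, -0.2, 0.3, 0.0], TOY_BLOCKS))
```

Because every block is invertible with a triangular Jacobian, the log-determinant is just the sum of the log-scales, which is what makes the likelihood exact and cheap to evaluate. Higher scores mean lower density under the model.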

@@ -595,8 +595,7 @@ Integration of SOC analyst feedback from the `audit_logs` table:
 **Module**: `preprocessing.py` + `cycle.py`
 9 features from the thesis (§5), enriched from `view_thesis_features_1h`:
-- `path_transition_entropy`: entropy of transitions between paths
-- `cadence_cv`: coefficient of variation of the cadence
+- `seq_emb_0`..`seq_emb_31`: sequence embeddings from SessionTransformer (§5.2, replaces `path_transition_entropy` + `cadence_cv`)
 - `burst_ratio` / `pause_ratio`: burst and pause ratios
 - `lag1_autocorrelation`: lag-1 autocorrelation of inter-arrival times
 - `benford_deviation`: deviation from Benford's law
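
As an illustration of the `seq_emb_0`..`seq_emb_31` naming used in the list above, the following sketch expands one 32-dimensional session embedding into named feature columns. The helper names (`embedding_columns`, `embedding_to_features`) are hypothetical and are not taken from `preprocessing.py`; only the column prefix and the dimension come from the diff.

```python
EMBEDDING_DIM = 32  # from the diff: seq_emb_0 .. seq_emb_31


def embedding_columns(dim=EMBEDDING_DIM, prefix="seq_emb_"):
    """Return the ordered column names for a sequence embedding."""
    return [f"{prefix}{i}" for i in range(dim)]


def embedding_to_features(embedding, prefix="seq_emb_"):
    """Flatten one session embedding into a {column_name: value} dict."""
    if len(embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} values, got {len(embedding)}"
        )
    return {f"{prefix}{i}": float(v) for i, v in enumerate(embedding)}
```

Keeping the names in one helper would make it harder for the feature view and the model input to drift apart if the embedding size changes.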