feat(ml): replace Autoencoder with RealNVP Normalizing Flow and add SessionTransformer embeddings

Replace TrafficAutoEncoder (MSE reconstruction scoring) with TrafficNormalizingFlow (RealNVP via FrEIA, 4 affine coupling blocks, anomaly score = -log p(x)) for mathematically rigorous density estimation.

Add SessionTransformer module producing 32-dimensional sequence embeddings from raw HTTP request sequences (path, method, timing) via a lightweight TransformerEncoder, replacing path_transition_entropy and cadence_cv features.

Update thesis documentation sections 2.4.2b and 3.8 accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
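
To make the "anomaly score = -log p(x)" part of the message concrete, here is a minimal pure-Python sketch of how a RealNVP-style flow turns a feature vector into a negative log-likelihood. It is not the TrafficNormalizingFlow code from this commit and does not use FrEIA. The names affine_coupling, anomaly_score and TOY_BLOCKS are hypothetical, the scale and shift functions are fixed toy functions standing in for learned subnetworks, and the base density is assumed to be a standard normal.

```python
import math

def affine_coupling(x, scale_fn, shift_fn):
    """One RealNVP-style coupling step (assumes an even number of features).

    The first half passes through unchanged; the second half is scaled and
    shifted by values computed from the first half.
    """
    assert len(x) % 2 == 0, "this sketch assumes an even feature count"
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s = scale_fn(x1)  # log-scales, one per element of x2
    t = shift_fn(x1)  # shifts, one per element of x2
    z2 = [v * math.exp(si) + ti for v, si, ti in zip(x2, s, t)]
    log_det = sum(s)  # log|det Jacobian| of this step
    return x1 + z2, log_det

def anomaly_score(x, blocks):
    """Return -log p(x) using the change-of-variables formula."""
    z, total_log_det = list(x), 0.0
    for scale_fn, shift_fn in blocks:
        z, log_det = affine_coupling(z, scale_fn, shift_fn)
        total_log_det += log_det
        z = z[::-1]  # fixed permutation (|det| = 1) so both halves get transformed
    # Log-density of z under a standard normal base distribution.
    log_pz = -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2 * math.pi)
    return -(log_pz + total_log_det)

# Four toy coupling blocks, mirroring the "4 affine coupling blocks" in the
# message; a real model would learn scale_fn and shift_fn from traffic data.
TOY_BLOCKS = [
    (lambda h: [0.1 * math.tanh(v) for v in h], lambda h: [0.5 * v for v in h])
] * 4

if __name__ == "__main__":
    print(anomaly_score([0.2, -0.1, 0.4, 0.0], TOY_BLOCKS))
    print(anomaly_score([5.0, -4.0, 6.0, 3.0], TOY_BLOCKS))
```

Unlike an MSE reconstruction error, this score is an exact log-density under the model, so a threshold on it can be read as a likelihood cutoff.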
@@ -168,6 +168,22 @@ def fetch_and_analyze():
     except Exception as e:
         log_info(f'[Thèse §5] view_thesis_features_1h inaccessible : {e} — features avancées ignorées.')
 
+    # ── §5.2: Sequence Transformer embeddings (replaces path_transition_entropy + cadence_cv)
+    try:
+        from .session_transformer import extract_sequence_embeddings
+        df_embs = extract_sequence_embeddings(df, client)
+        if df_embs is not None and not df_embs.empty:
+            df = df.merge(df_embs, on=['src_ip', 'ja4', 'host'], how='left')
+            for i in range(32):
+                col = f'seq_emb_{i}'
+                if col in df.columns:
+                    df[col] = df[col].fillna(0.0)
+            log_info(f'[Transformer §5.2] {len(df_embs)} sessions enrichies avec 32 embeddings séquentiels.')
+    except Exception as e:
+        log_info(f'[Transformer §5.2] Embeddings indisponibles : {e}')
+        for i in range(32):
+            df[f'seq_emb_{i}'] = 0.0
+
     df = preprocess_df(df)
 
     # §5: Enrichment with the JA4×ASN fleet score (bipartite fleet detection)
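
In the hunk above, the zero-valued seq_emb_* columns are only created in the except branch. If extract_sequence_embeddings returns None or an empty frame without raising, the columns appear to be left out entirely. Below is a hedged sketch of a variant that guarantees all 32 columns in every case. The helper name attach_embeddings and the constants EMB_DIM and KEYS are hypothetical and not part of this commit; only pandas calls that exist (DataFrame.merge, DataFrame.fillna) are used.

```python
import pandas as pd

EMB_DIM = 32                       # embedding width stated in the commit message
KEYS = ["src_ip", "ja4", "host"]   # join keys used by the merge in the hunk

def attach_embeddings(df, df_embs):
    """Left-join session embeddings onto df and zero-fill anything missing.

    Hypothetical helper: always returns a frame with seq_emb_0..seq_emb_31,
    whether or not embeddings were available.
    """
    emb_cols = [f"seq_emb_{i}" for i in range(EMB_DIM)]
    out = df.copy()  # avoid mutating the caller's frame
    if df_embs is not None and not df_embs.empty:
        out = out.merge(df_embs, on=KEYS, how="left")
    # Create any embedding column the merge did not provide.
    for col in emb_cols:
        if col not in out.columns:
            out[col] = 0.0
    # Sessions without a matching embedding row get zeros instead of NaN.
    out[emb_cols] = out[emb_cols].fillna(0.0)
    return out

if __name__ == "__main__":
    sessions = pd.DataFrame(
        {"src_ip": ["1.1.1.1", "2.2.2.2"], "ja4": ["a", "b"], "host": ["h", "h"]}
    )
    print(attach_embeddings(sessions, None).shape)
```

Keeping the zero-fill in one place means downstream code such as preprocess_df can rely on a fixed column set, at the cost of treating "no embedding" and "all-zero embedding" identically.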