Rewrite fleet.py to use a GNN-based approach

Nodes are src_ip values carrying ML feature vectors; edges connect IPs that share a (JA4, ASN) pair. GraphSAGE (2 SAGEConv layers, in→64→32) produces 32-dimensional embeddings, which are then clustered by HDBSCAN. PyG NeighborLoader activates for graphs with more than 50k nodes.

Update thesis docs (§5.2, §6.4, §2, §8) to reflect the GraphSAGE architecture and PyG scalability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
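The edge rule described in the commit message (two IPs are linked when they share a (JA4, ASN) pair) can be illustrated with a minimal, standard-library sketch. This is not the code in fleet.py: the function name `build_edges`, the `(src_ip, ja4, asn)` record layout, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_edges(records):
    """Return (nodes, edges) linking IPs that share a (JA4, ASN) pair.

    `records` is an iterable of (src_ip, ja4, asn) tuples. This layout is
    a hypothetical one chosen for the example, not the fleet.py schema.
    """
    # Group source IPs by the (JA4, ASN) pair they were observed with.
    groups = defaultdict(set)
    for src_ip, ja4, asn in records:
        groups[(ja4, asn)].add(src_ip)

    # Assign each distinct IP a stable integer node index.
    nodes = sorted({ip for ips in groups.values() for ip in ips})
    index = {ip: i for i, ip in enumerate(nodes)}

    # Connect every pair of IPs inside a group; a set removes duplicates
    # when two IPs share more than one (JA4, ASN) pair.
    edges = set()
    for ips in groups.values():
        for a, b in combinations(sorted(ips), 2):
            edges.add((index[a], index[b]))
    return nodes, sorted(edges)


# Placeholder data: documentation-range IPs and ASNs, made-up JA4 labels.
records = [
    ("192.0.2.1", "ja4_a", 64500),
    ("192.0.2.2", "ja4_a", 64500),
    ("192.0.2.3", "ja4_a", 64501),
    ("192.0.2.1", "ja4_b", 64501),
]
nodes, edges = build_edges(records)
```

Here only the first two records share both the JA4 label and the ASN, so a single undirected edge is produced. Fully connecting each group costs a quadratic number of edges in the group size, so a real pipeline would likely cap or sample very large groups. The resulting pairs could then be transposed into the 2×E `edge_index` tensor that PyG layers such as SAGEConv take as input.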
Detection and Classification of Malicious HTTP Traffic
Technical document, April 2026, Version 4.0
This document is divided into 9 parts:
| File | Contents | Lines |
|---|---|---|
| 00_resume.md | Title, abstract, table of contents | 75 |
| 01_introduction.md | Section 1: Introduction, context, generations of defenses | 50 |
| 02_etat_de_lart.md | Section 2: State of the art (static rules, fingerprinting, ML) | 208 |
| 03_architecture.md | Sections 3.1–3.8: Multi-layer architecture, ML pipeline | 767 |
| 04_browser_matcher.md | Section 3.9: Browser Signature Detection (browser_matcher) | 481 |
| 05_features.md | Section 4: Taxonomy of the 96 features (8 families) | 682 |
| 06_techniques_avancees.md | Section 5: Advanced behavioral techniques (§5.1–5.8) | 669 |
| 07_discussion_limites.md | Section 6: Discussion, limitations, scalability, GDPR | 207 |
| 08_conclusion_references.md | Sections 7–8: Conclusion and references | 277 |