ja4-platform/services/dashboard/backend/templates/health.html
toto a108814a56 feat: bot-detection roadmap §2-9: HTTP/2, coherence, drift, fleet, Jaccard, ExIFFI, meta-learner, metrics
Step 2: HTTP/2 fingerprinting in the ML pipeline:
- Added the dict_browser_h2 dictionary (11 browser families) to 05_aggregation_tables.sql
- Added the h2_agg CTE and 4 HTTP/2 features to 07_ai_features_view.sql:
  h2_settings_known, h2_pseudo_order_match, h2_ja4_coherence, h2_settings_rare
- fingerprint_coherence_score (5 weighted axes) is computed in the view
- Added a 6th axis, axis_h2_coherence, in browser.py (weights rebalanced)
- browser_h2.csv: maps 11 Akamai fingerprints to browser families

Step 3: Coherence pre-filter on the human baseline:
- pipeline.py excludes sessions with fingerprint_coherence_score < threshold from the training baseline
- FINGERPRINT_COHERENCE_THRESHOLD is configurable via env (default 0.25)
- Excluded sessions are logged for SOC analysis
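
The pre-filter in Step 3 amounts to a threshold split. The following is a minimal sketch in JavaScript (to match the dashboard code, not the Python pipeline); `splitBaseline` and the session objects are hypothetical, and only the field name `fingerprint_coherence_score` and the 0.25 default come from the notes above.

```javascript
// Hypothetical helper, not the actual pipeline.py code.
// Only `fingerprint_coherence_score` and the 0.25 default come from the commit notes.
const DEFAULT_COHERENCE_THRESHOLD = 0.25;

function splitBaseline(sessions, threshold = DEFAULT_COHERENCE_THRESHOLD) {
  const kept = [];
  const excluded = [];
  for (const s of sessions) {
    // Sessions strictly below the threshold are left out of the training baseline.
    if (s.fingerprint_coherence_score < threshold) excluded.push(s);
    else kept.push(s);
  }
  // `excluded` is returned so the caller can log it for later review.
  return { kept, excluded };
}

const demo = splitBaseline([
  { id: 'a', fingerprint_coherence_score: 0.9 },
  { id: 'b', fingerprint_coherence_score: 0.1 },
  { id: 'c', fingerprint_coherence_score: 0.25 },
]);
console.log(demo.kept.map(s => s.id), demo.excluded.map(s => s.id)); // kept: a, c; excluded: b
```

Returning the excluded sessions instead of discarding them keeps the split compatible with the logging requirement in the last bullet.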

Step 4: Improved drift detection:
- scoring.py: moved from 5 to 9 quantiles (p5…p95)
- Added KL divergence alongside the KS test
- Adversarial drift detection (≥80% of features drift in the same direction)
- Strict temporal split for validation
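
Two of the Step 4 checks can be illustrated briefly. This is a sketch under assumptions, not the scoring.py implementation: `klDivergence` compares two histograms over identical bins with a small smoothing constant (the value of `eps` is an assumption), and `isAdversarialDrift` reads the 80% rule as "at least 80% of the drifting features shift with the same sign", which is one plausible interpretation of the bullet above.

```javascript
// KL(P || Q) for two histograms defined over the same bins.
// `eps` avoids log(0) on empty bins; its value is an assumption.
function klDivergence(p, q, eps = 1e-9) {
  const normalize = h => {
    const total = h.reduce((acc, v) => acc + v + eps, 0);
    return h.map(v => (v + eps) / total);
  };
  const pn = normalize(p);
  const qn = normalize(q);
  return pn.reduce((sum, pi, i) => sum + pi * Math.log(pi / qn[i]), 0);
}

// `shifts` holds one signed shift per drifting feature
// (for example current median minus baseline median).
function isAdversarialDrift(shifts, share = 0.8) {
  const moving = shifts.filter(d => d !== 0);
  if (moving.length === 0) return false;
  const up = moving.filter(d => d > 0).length;
  const sameDirection = Math.max(up, moving.length - up);
  return sameDirection / moving.length >= share;
}

console.log(klDivergence([9, 1], [1, 9]) > klDivergence([5, 5], [5, 5])); // true
console.log(isAdversarialDrift([0.4, 0.2, 0.1, 0.3, -0.1])); // true (4 of 5 shift upward)
```

KL divergence is asymmetric, so the argument order (current versus baseline) has to be fixed once and kept consistent across cycles.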

Step 5: JA4×ASN bipartite graph (§5.2):
- fleet.py: fleet detection via NetworkX + Louvain (optional imports)
- enrich_with_fleet_score(): adds fleet_score + fleet_campaign_flag to the DataFrame
- cycle.py: called after preprocess_df, logging the number of sessions in a fleet
- SQL migration 05_fleet_metrics_tables.sql: fleet_detections table (7-day TTL)
- Dashboard: /fleet + /api/fleet (detected communities) + fleet.html template

Step 6: Cross-domain Jaccard (§5.8):
- 12_thesis_features.sql: jaccard_paths CTE → cross_domain_path_similarity
- Signal: the same paths (/admin, /wp-login) on several hosts = scanner
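
The scanner signal in Step 6 rests on Jaccard similarity between the path sets requested on different hosts. The sketch below is illustrative only: how the `jaccard_paths` CTE aggregates across hosts is not stated, so averaging over all host pairs in `crossDomainPathSimilarity` is an assumption, and both function names are hypothetical.

```javascript
// Jaccard similarity |A ∩ B| / |A ∪ B| between two collections of paths.
function jaccard(a, b) {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 0;
  let intersection = 0;
  for (const x of setA) if (setB.has(x)) intersection++;
  return intersection / (setA.size + setB.size - intersection);
}

// Assumed aggregation: mean Jaccard over every pair of hosts seen for one client.
function crossDomainPathSimilarity(pathsByHost) {
  const hosts = Object.keys(pathsByHost);
  if (hosts.length < 2) return 0;
  let sum = 0;
  let pairs = 0;
  for (let i = 0; i < hosts.length; i++) {
    for (let j = i + 1; j < hosts.length; j++) {
      sum += jaccard(pathsByHost[hosts[i]], pathsByHost[hosts[j]]);
      pairs++;
    }
  }
  return sum / pairs;
}

// A client probing the same two paths on three hosts scores 1.
console.log(crossDomainPathSimilarity({
  'a.example': ['/admin', '/wp-login'],
  'b.example': ['/admin', '/wp-login'],
  'c.example': ['/wp-login', '/admin'],
})); // 1
```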

Step 7: ExIFFI + per-feature AE errors:
- scoring.py: compute_exiffi_importance() by permutation, compute_ae_feature_errors()
- pipeline.py: ExIFFI computed on X_test, index → dict mapping for anomalies
- build_reason() enriched with exiffi_top when SHAP is inactive

Step 8: Meta-learner for ensemble weighting:
- scoring.py: MetaLearner class (LogisticRegression, falls back to fixed weights below 1000 labels)
- Labels collected from the current cycle (known_bots, legitimate, Anubis)
- pipeline.py: fixed weights replaced by MetaLearner.predict()
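
The fallback rule in Step 8 can be sketched as follows. This is a simplified illustration, not the MetaLearner class: the weights, coefficients, and the `combineScores` signature are hypothetical, and only the 1000-label cutoff and the use of a logistic model come from the notes above.

```javascript
const MIN_LABELS = 1000; // cutoff taken from the commit notes

function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

// `scores` are the per-model anomaly scores for one session.
// `fixedWeights`, `coef` and `intercept` are hypothetical placeholders.
function combineScores(scores, { fixedWeights, coef, intercept, labelCount }) {
  const dot = weights => scores.reduce((sum, s, i) => sum + s * weights[i], 0);
  if (labelCount < MIN_LABELS) {
    // Too few labels to trust a fitted model: weighted average with fixed weights.
    return { score: dot(fixedWeights), metaLearnerActive: false };
  }
  // Enough labels: logistic combination of the individual scores.
  return { score: sigmoid(dot(coef) + intercept), metaLearnerActive: true };
}

const params = { fixedWeights: [0.5, 0.5], coef: [2, 2], intercept: -2, labelCount: 120 };
console.log(combineScores([0.2, 0.4], params).metaLearnerActive); // false
```

The `metaLearnerActive` flag mirrors the `meta_learner_active` column that the health page shows in its "Meta" column.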

Step 9: Performance metrics and monitoring:
- metrics.py: record_cycle_metrics(): anomaly rate, drift, correlation, latency
- SQL migration 05_fleet_metrics_tables.sql: ml_performance_metrics table (90-day TTL)
- Dashboard: /health + /api/health + health.html template
- cycle.py: record_cycle_metrics called at the end of each cycle (Complet + Applicatif)
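
The alert rules documented on the health page (drift alert, anomaly rate above 10%, latency above 300 s, correlation below 50%) can be condensed into one check per cycle. This is a sketch, not code from the repository: `cycleAlerts` and the alert labels are hypothetical, while the field names match those the template reads from /api/health.

```javascript
// Returns the list of alerts raised by one row of ml_performance_metrics.
// Thresholds follow the documentation banner of health.html.
function cycleAlerts(m) {
  const alerts = [];
  if (m.drift_alert) alerts.push('drift');
  if (typeof m.anomaly_rate === 'number' && m.anomaly_rate > 0.10) alerts.push('over-detection');
  if (typeof m.cycle_latency_ms === 'number' && m.cycle_latency_ms > 300000) alerts.push('latency');
  if (typeof m.correlated_rate === 'number' && m.correlated_rate < 0.50) alerts.push('correlation');
  return alerts;
}

console.log(cycleAlerts({
  drift_alert: false,
  anomaly_rate: 0.12,
  cycle_latency_ms: 45000,
  correlated_rate: 0.42,
})); // [ 'over-detection', 'correlation' ]
```

Missing fields raise no alert here, whereas the template's badges treat a missing value as 0; which behavior is preferable depends on how the API reports incomplete cycles.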

Tests: 36/36 bot-detector tests pass

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-10 00:11:35 +02:00

372 lines
19 KiB
HTML

{% extends "base.html" %}
{% block page_title %}ML Pipeline Health (§9){% endblock %}
{% block content %}
<div class="p-4 lg:p-6 space-y-4 max-w-[1920px] mx-auto">
<!-- ═══ Header KPIs ═══ -->
<div class="flex flex-wrap items-center gap-4 mb-2">
<h1 class="text-xl font-bold text-white flex items-center gap-2">
<svg class="w-6 h-6 text-green-400" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 19v-6a2 2 0 00-2-2H5a2 2 0 00-2 2v6a2 2 0 002 2h2a2 2 0 002-2zm0 0V9a2 2 0 012-2h2a2 2 0 012 2v10m-6 0a2 2 0 002 2h2a2 2 0 002-2m0 0V5a2 2 0 012-2h2a2 2 0 012 2v14a2 2 0 01-2 2h-2a2 2 0 01-2-2z"/></svg>
ML Pipeline Health
</h1>
<div class="ml-auto flex items-center gap-3 flex-wrap">
<div class="text-center px-3">
<div class="text-2xl font-bold text-brand-500" id="kpi-cycles"></div>
<div class="text-[10px] text-gray-500 uppercase tracking-wider">Cycles (7d)</div>
</div>
<div class="text-center px-3 border-l border-gray-700">
<div class="text-2xl font-bold text-orange-400" id="kpi-anomaly"></div>
<div class="text-[10px] text-gray-500 uppercase tracking-wider">Avg. anomaly rate</div>
</div>
<div class="text-center px-3 border-l border-gray-700">
<div class="text-2xl font-bold text-yellow-400" id="kpi-latency"></div>
<div class="text-[10px] text-gray-500 uppercase tracking-wider">Avg. latency (ms)</div>
</div>
<div class="text-center px-3 border-l border-gray-700">
<div class="text-2xl font-bold text-red-400" id="kpi-drifts"></div>
<div class="text-[10px] text-gray-500 uppercase tracking-wider">Drift alerts</div>
</div>
</div>
</div>
<!-- ═══ Doc banner ═══ -->
<div class="bg-gray-900/50 border border-gray-800 rounded-lg px-4 py-3 text-xs text-gray-400 leading-relaxed">
<strong class="text-green-300">Performance metrics (Step 9)</strong>: each bot-detector cycle
(~5 min) records its anomaly rate, latency, network correlation rate,
number of valid features, and drift alerts.
<br><strong>Alerts:</strong>
<span class="text-red-400">drift_alert</span> = feature drift &gt; 30%;
<span class="text-orange-400">anomaly_rate &gt; 10%</span> = risk of over-detection;
<span class="text-yellow-400">latency &gt; 300 s</span> = performance problem.
<br><strong>SOC action:</strong> prolonged drift requires manual retraining.
A low correlation rate (&lt; 50%) points to a sentinel/correlator problem.
</div>
<!-- ═══ Time-series charts ═══ -->
<div class="grid grid-cols-1 lg:grid-cols-2 gap-4">
<!-- Anomaly rate -->
<div class="section-card">
<div class="section-header">
<span class="section-title">
<svg class="w-4 h-4 text-orange-400" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 17h8m0 0V9m0 8l-8-8-4 4-6-6"/></svg>
Anomaly rate per cycle
<span class="relative inline-block"><button onclick="docToggle(this)" class="doc-btn"></button><div class="doc-panel">
<h4>Anomaly rate</h4>
<p>Percentage of sessions classified HIGH/CRITICAL/MEDIUM/LOW per cycle, per model.</p>
<p><strong>Alert threshold:</strong> &gt; 10% → probable over-detection. &lt; 0.5% → under-detection.</p>
<p class="doc-source">Source: ml_performance_metrics</p>
</div></span>
</span>
</div>
<div class="section-body"><div id="anomaly-chart" style="height:240px"></div></div>
</div>
<!-- Latency -->
<div class="section-card">
<div class="section-header">
<span class="section-title">
<svg class="w-4 h-4 text-yellow-400" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 8v4l3 3m6-3a9 9 0 11-18 0 9 9 0 0118 0z"/></svg>
Cycle latency (ms)
<span class="relative inline-block"><button onclick="docToggle(this)" class="doc-btn"></button><div class="doc-panel">
<h4>Processing latency</h4>
<p>Total duration of the bot-detector cycle in milliseconds (ClickHouse fetch + ML scoring + insert).</p>
<p><strong>Alert threshold:</strong> &gt; 300,000 ms (5 min) → the cycle overruns its scheduled interval.</p>
<p class="doc-source">Source: ml_performance_metrics.cycle_latency_ms</p>
</div></span>
</span>
</div>
<div class="section-body"><div id="latency-chart" style="height:240px"></div></div>
</div>
</div>
<!-- Drift rate and correlation rate -->
<div class="grid grid-cols-1 lg:grid-cols-2 gap-4">
<div class="section-card">
<div class="section-header">
<span class="section-title">
<svg class="w-4 h-4 text-red-400" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 9v2m0 4h.01m-6.938 4h13.856c1.54 0 2.502-1.667 1.732-2.5L13.732 4c-.77-.833-1.964-.833-2.732 0L4.082 16.5c-.77.833.192 2.5 1.732 2.5z"/></svg>
Feature drift
</span>
</div>
<div class="section-body"><div id="drift-chart" style="height:200px"></div></div>
</div>
<div class="section-card">
<div class="section-header">
<span class="section-title">
<svg class="w-4 h-4 text-blue-400" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13.828 10.172a4 4 0 00-5.656 0l-4 4a4 4 0 105.656 5.656l1.102-1.101m-.758-4.899a4 4 0 005.656 0l4-4a4 4 0 00-5.656-5.656l-1.1 1.1"/></svg>
Network correlation rate
<span class="relative inline-block"><button onclick="docToggle(this)" class="doc-btn"></button><div class="doc-panel">
<h4>Correlation rate</h4>
<p>Proportion of sessions with <code>correlated=1</code> (TLS JA4 correlated by sentinel).</p>
<p><strong>Expected value:</strong> &gt; 50%. Below that, check sentinel and logcorrelator.</p>
<p class="doc-source">Source: ml_performance_metrics.correlated_rate</p>
</div></span>
</span>
</div>
<div class="section-body"><div id="corr-chart" style="height:200px"></div></div>
</div>
</div>
<!-- ═══ Recent cycles table ═══ -->
<div class="section-card overflow-hidden">
<div class="section-header">
<span class="section-title">
<svg class="w-4 h-4 text-green-400" fill="none" stroke="currentColor" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"/></svg>
Recent cycles
</span>
<div class="flex items-center gap-2">
<select id="model-filter" class="bg-gray-800 text-gray-300 text-xs border border-gray-700 rounded px-2 py-1">
<option value="">All models</option>
</select>
<span id="table-status" class="text-[10px] text-gray-500">Loading…</span>
</div>
</div>
<div class="overflow-x-auto">
<table class="data-table">
<thead>
<tr>
<th>Cycle</th>
<th>Model</th>
<th>Sessions</th>
<th>Correlation</th>
<th>Anomalies</th>
<th>CRITICAL</th>
<th>HIGH</th>
<th>Drift</th>
<th>Latency (ms)</th>
<th>Features</th>
<th>Baseline</th>
<th>Meta</th>
</tr>
</thead>
<tbody id="metrics-body">
<tr><td colspan="12" class="text-center text-gray-500 py-8">Loading…</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<script>
/* ════════════════════════════════════════════════════════════════════════════
* ML Pipeline Health page: loading and rendering
* ════════════════════════════════════════════════════════════════════════════ */
let _allMetrics = [];
function fmtTs(ts) {
if (!ts) return '—';
return new Date(ts).toLocaleString('fr-FR', {dateStyle:'short', timeStyle:'short'});
}
function pct(v) {
return (v * 100).toFixed(1) + '%';
}
function driftBadge(rate, alert) {
if (alert) return `<span class="badge badge-critical">${pct(rate)} ⚠</span>`;
if (rate > 0.15) return `<span class="badge badge-high">${pct(rate)}</span>`;
return `<span class="badge badge-low">${pct(rate)}</span>`;
}
function anomalyBadge(rate) {
if (rate > 0.10) return `<span class="badge badge-critical">${pct(rate)}</span>`;
if (rate > 0.05) return `<span class="badge badge-high">${pct(rate)}</span>`;
if (rate > 0.01) return `<span class="badge badge-medium">${pct(rate)}</span>`;
return `<span class="badge badge-low">${pct(rate)}</span>`;
}
function corrBadge(rate) {
if (rate < 0.30) return `<span class="badge badge-critical">${pct(rate)}</span>`;
if (rate < 0.50) return `<span class="badge badge-high">${pct(rate)}</span>`;
return `<span class="badge badge-low">${pct(rate)}</span>`;
}
function latencyBadge(ms) {
if (ms > 300000) return `<span class="badge badge-critical">${ms.toLocaleString()}</span>`;
if (ms > 120000) return `<span class="badge badge-high">${ms.toLocaleString()}</span>`;
return `<span class="text-gray-300 text-xs">${ms.toLocaleString()}</span>`;
}
function renderTable(metrics) {
const tbody = document.getElementById('metrics-body');
if (!metrics.length) {
tbody.innerHTML = '<tr><td colspan="12" class="text-center text-gray-500 py-8">No data for the last 7 days</td></tr>';
return;
}
tbody.innerHTML = metrics.map(m => `
<tr class="${m.drift_alert ? 'bg-red-900/10' : ''}">
<td class="text-gray-400 text-xs">${fmtTs(m.cycle_at)}</td>
<td><span class="badge badge-known">${m.model_name || '—'}</span></td>
<td class="text-right font-mono text-xs">${(m.total_sessions || 0).toLocaleString('fr-FR')}</td>
<td class="text-right">${corrBadge(m.correlated_rate || 0)}</td>
<td class="text-right">${anomalyBadge(m.anomaly_rate || 0)}</td>
<td class="text-right font-bold text-red-400">${m.critical_count || 0}</td>
<td class="text-right font-bold text-orange-400">${m.high_count || 0}</td>
<td class="text-right">${driftBadge(m.drift_rate || 0, m.drift_alert)}</td>
<td class="text-right">${latencyBadge(m.cycle_latency_ms || 0)}</td>
<td class="text-right text-xs text-gray-400">${m.features_valid || 0}/${m.features_total || 0}</td>
<td class="text-right text-xs text-gray-400">${(m.baseline_size || 0).toLocaleString()}</td>
<td class="text-center">${m.meta_learner_active ? '✓' : '—'}</td>
</tr>
`).join('');
}
function filterAndRender() {
const modelFilter = document.getElementById('model-filter').value;
const filtered = modelFilter
? _allMetrics.filter(m => m.model_name === modelFilter)
: _allMetrics;
renderTable(filtered);
}
async function loadHealth() {
try {
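const res = await fetch('/api/health');
// Treat HTTP errors as failures so the catch block below reports them.
if (!res.ok) throw new Error(`HTTP ${res.status}`);
const data = await res.json();
_allMetrics = data.metrics || [];
// Global KPIs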
const driftAlerts = _allMetrics.filter(m => m.drift_alert).length;
document.getElementById('kpi-cycles').textContent = _allMetrics.length;
document.getElementById('kpi-anomaly').textContent = ((data.avg_anomaly_rate || 0) * 100).toFixed(2) + '%';
document.getElementById('kpi-latency').textContent = Math.round(data.avg_latency_ms || 0).toLocaleString('fr-FR');
document.getElementById('kpi-drifts').textContent = driftAlerts;
document.getElementById('table-status').textContent = `${_allMetrics.length} cycle(s)`;
// Model filter
const models = [...new Set(_allMetrics.map(m => m.model_name).filter(Boolean))];
const sel = document.getElementById('model-filter');
models.forEach(name => {
const opt = document.createElement('option');
opt.value = name;
opt.textContent = name;
sel.appendChild(opt);
});
sel.addEventListener('change', filterAndRender);
renderTable(_allMetrics);
renderCharts(_allMetrics);
} catch (err) {
console.error('Failed to load metrics:', err);
document.getElementById('table-status').textContent = 'Loading error';
document.getElementById('metrics-body').innerHTML =
'<tr><td colspan="12" class="text-center text-red-500 py-8">Loading error</td></tr>';
}
}
function renderCharts(metrics) {
if (!metrics.length) return;
// Chronological order for the charts (assumes the API returns the newest cycle first)
const ordered = [...metrics].reverse();
const labels = ordered.map(m => fmtTs(m.cycle_at));
// Colors per model
const modelColors = {'Complet':'#6366f1', 'Applicatif':'#22d3ee'};
const colorFor = name => modelColors[name] || '#9ca3af';
// --- Anomaly rate chart ---
const anomalyEl = document.getElementById('anomaly-chart');
if (anomalyEl) {
// One array per model, aligned index-for-index with `labels`; cycles of other models stay null.
const byModel = {};
ordered.forEach((m, i) => {
byModel[m.model_name] = byModel[m.model_name] || new Array(ordered.length).fill(null);
byModel[m.model_name][i] = +((m.anomaly_rate || 0) * 100).toFixed(2);
});
const anomalyChart = echarts.init(anomalyEl, 'dark');
anomalyChart.setOption({
backgroundColor: 'transparent',
tooltip: { trigger:'axis', valueFormatter: v => v == null ? '-' : v + '%' },
legend: { bottom: 0, textStyle:{color:'#9ca3af', fontSize:10} },
xAxis: { type:'category', data: labels, axisLabel:{color:'#6b7280', fontSize:9, rotate:30, interval:'auto'} },
yAxis: { type:'value', name:'%', axisLabel:{color:'#6b7280', fontSize:10},
axisLine:{lineStyle:{color:'#374151'}} },
series: Object.entries(byModel).map(([name, vals], idx) => ({
name, type:'line', data: vals, smooth: true, connectNulls: true,
lineStyle:{color: colorFor(name), width:2},
itemStyle:{color: colorFor(name)},
areaStyle:{color: colorFor(name), opacity:0.08},
symbol:'none',
// markLine is a series-level option in ECharts, so the 10% threshold is attached to the first series only.
...(idx === 0 ? { markLine: { symbol:'none', data:[{yAxis:10, label:{formatter:'10% threshold', fontSize:9}, lineStyle:{color:'#ef4444', type:'dashed'}}] } } : {}),
})),
});
window.addEventListener('resize', () => anomalyChart.resize());
}
// --- Latency chart ---
const latencyEl = document.getElementById('latency-chart');
if (latencyEl) {
// One array per model, aligned index-for-index with `labels`; cycles of other models stay null.
const byModel = {};
ordered.forEach((m, i) => {
byModel[m.model_name] = byModel[m.model_name] || new Array(ordered.length).fill(null);
byModel[m.model_name][i] = m.cycle_latency_ms || 0;
});
const latencyChart = echarts.init(latencyEl, 'dark');
latencyChart.setOption({
backgroundColor: 'transparent',
tooltip: { trigger:'axis', valueFormatter: v => v == null ? '-' : v.toLocaleString() + ' ms' },
legend: { bottom: 0, textStyle:{color:'#9ca3af', fontSize:10} },
xAxis: { type:'category', data: labels, axisLabel:{color:'#6b7280', fontSize:9, rotate:30, interval:'auto'} },
yAxis: { type:'value', name:'ms', axisLabel:{color:'#6b7280', fontSize:10} },
series: Object.entries(byModel).map(([name, vals]) => ({
name, type:'bar', data: vals, stack:'total',
itemStyle:{color: colorFor(name)}, barMaxWidth:20,
})),
});
window.addEventListener('resize', () => latencyChart.resize());
}
// --- Drift chart ---
const driftEl = document.getElementById('drift-chart');
if (driftEl) {
const driftChart = echarts.init(driftEl, 'dark');
driftChart.setOption({
backgroundColor: 'transparent',
tooltip: { trigger:'axis', valueFormatter: v => (v * 100).toFixed(1) + '%' },
xAxis: { type:'category', data: labels, axisLabel:{color:'#6b7280', fontSize:9, rotate:30, interval:'auto'} },
yAxis: { type:'value', max:1, axisLabel:{color:'#6b7280', fontSize:10,
formatter: v => (v*100).toFixed(0) + '%'} },
series: [{
type: 'bar',
data: ordered.map(m => ({
value: m.drift_rate || 0,
itemStyle: { color: m.drift_alert ? '#ef4444' : '#6366f1' },
})),
barMaxWidth: 20,
}],
});
window.addEventListener('resize', () => driftChart.resize());
}
// --- Network correlation chart ---
const corrEl = document.getElementById('corr-chart');
if (corrEl) {
// One array per model, aligned index-for-index with `labels`; cycles of other models stay null.
const byModel = {};
ordered.forEach((m, i) => {
byModel[m.model_name] = byModel[m.model_name] || new Array(ordered.length).fill(null);
byModel[m.model_name][i] = +((m.correlated_rate || 0) * 100).toFixed(1);
});
const corrChart = echarts.init(corrEl, 'dark');
corrChart.setOption({
backgroundColor: 'transparent',
tooltip: { trigger:'axis', valueFormatter: v => v == null ? '-' : v + '%' },
legend: { bottom: 0, textStyle:{color:'#9ca3af', fontSize:10} },
xAxis: { type:'category', data: labels, axisLabel:{color:'#6b7280', fontSize:9, rotate:30, interval:'auto'} },
yAxis: { type:'value', max:100, axisLabel:{color:'#6b7280', fontSize:10,
formatter: v => v + '%'} },
series: Object.entries(byModel).map(([name, vals]) => ({
name, type:'line', data: vals, smooth:true, connectNulls: true,
lineStyle:{color: colorFor(name), width:2},
itemStyle:{color: colorFor(name)}, symbol:'none',
})),
});
window.addEventListener('resize', () => corrChart.resize());
}
}
loadHealth();
</script>
{% endblock %}