Triple-voice ensemble architecture:
- EIF (unsupervised, zero-day anomalies)
- Autoencoder (unsupervised, non-linear correlations)
- XGBoost (supervised, known patterns + SOC feedback)

XGBoost implementation:
- Trained on historical ml_all_scores labels (NORMAL=0, HIGH/CRITICAL/DENY/KNOWN=1)
- Weekly retraining (XGB_RETRAIN_INTERVAL_H=168), min 100 labels required
- Score = predict_proba, combined via meta-learner: (1-β)*(EIF+AE) + β*xgb_prob
- Configurable: XGB_WEIGHT (β=0.20), XGB_MIN_LABELS, XGB_RETRAIN_INTERVAL_HOURS
- Graceful fallback: if xgboost is unavailable or labels are insufficient, EIF+AE only
- ClickHouse: xgb_prob column added to ml_all_scores
- Tests: 4 new tests (availability, train/predict, meta-learner, save/load)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
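The label mapping and the meta-learner blend described in the commit message can be sketched as follows. This is a minimal illustration, not the code from this commit: the function names `to_binary_label` and `combine_scores` are hypothetical, and `base_score` is assumed to be the already-combined EIF+AE score, since the message does not say how the two unsupervised scores are merged. Passing `xgb_prob=None` stands in for the fallback case (xgboost unavailable or too few labels).

```python
from typing import Optional

# β default as stated in the commit message (XGB_WEIGHT).
XGB_WEIGHT = 0.20

# Labels treated as the positive class, per the commit message.
POSITIVE_LABELS = {"HIGH", "CRITICAL", "DENY", "KNOWN"}


def to_binary_label(label: str) -> int:
    """Map an ml_all_scores label to the binary training target."""
    if label == "NORMAL":
        return 0
    if label in POSITIVE_LABELS:
        return 1
    # Assumption: any other label is rejected rather than silently mapped.
    raise ValueError(f"unexpected label: {label!r}")


def combine_scores(
    base_score: float,
    xgb_prob: Optional[float],
    beta: float = XGB_WEIGHT,
) -> float:
    """Blend the EIF+AE score with the XGBoost probability.

    Implements (1 - beta) * base_score + beta * xgb_prob. When xgb_prob
    is None, the unsupervised score is returned unchanged (fallback).
    """
    if xgb_prob is None:
        return base_score
    return (1.0 - beta) * base_score + beta * xgb_prob
```

With the default β of 0.20, the supervised model can shift the final score by at most 0.20, so the unsupervised detectors still dominate the result.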
12 lines
196 B
Plaintext
clickhouse-connect==0.8.12
pandas==2.2.3
scikit-learn==1.6.1
shap==0.47.2
scipy>=1.14
hdbscan>=0.8.38
isotree>=0.6.1
torch>=2.0
xgboost>=2.0
pyyaml>=6.0
ja4-common @ file:///app/shared/ja4_common