Rewrite fleet.py to use a GNN-based approach: nodes are src_ip with ML feature vectors, edges connect IPs sharing (JA4, ASN) pairs, GraphSAGE (2 SAGEConv layers, in→64→32) produces 32D embeddings clustered by HDBSCAN. PyG NeighborLoader activates for >50k nodes. Update thesis docs (§5.2, §6.4, §2, §8) to reflect GraphSAGE architecture and PyG scalability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
15 lines
242 B
Plaintext
clickhouse-connect==0.8.12
pandas==2.2.3
scikit-learn==1.6.1
shap==0.47.2
scipy>=1.14
hdbscan>=0.8.38
isotree>=0.6.1
torch>=2.0
torch_geometric>=2.4
FrEIA>=0.2
xgboost>=2.0
cleanlab>=2.6
pyyaml>=6.0
ja4-common @ file:///app/shared/ja4_common
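
The edge rule in the commit message (IPs are connected when they share a (JA4, ASN) pair) can be illustrated with a short standard-library sketch. This is not the code from fleet.py, which is not shown here: the function name `build_edges`, the record layout `(src_ip, ja4, asn)`, and the shortened JA4 strings are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_edges(records):
    """Return sorted undirected edges between IPs sharing a (ja4, asn) pair.

    `records` is an iterable of (src_ip, ja4, asn) tuples. This layout is an
    assumption made for this sketch, not the actual fleet.py schema.
    """
    # Group source IPs by the (JA4, ASN) pair they were observed with.
    groups = defaultdict(set)
    for src_ip, ja4, asn in records:
        groups[(ja4, asn)].add(src_ip)

    # Connect every pair of IPs inside each group. Sorting each group first
    # gives every edge a canonical (smaller, larger) orientation, so the set
    # deduplicates edges that arise from more than one shared pair.
    edges = set()
    for ips in groups.values():
        for a, b in combinations(sorted(ips), 2):
            edges.add((a, b))
    return sorted(edges)


# Placeholder data: the JA4 values below are shortened, made-up strings.
records = [
    ("10.0.0.1", "t13d1516h2_aaa", 64500),
    ("10.0.0.2", "t13d1516h2_aaa", 64500),
    ("10.0.0.3", "t13d1516h2_aaa", 64501),  # same JA4, different ASN: no edge
    ("10.0.0.3", "t13d1715h2_bbb", 64500),
    ("10.0.0.1", "t13d1715h2_bbb", 64500),
]
edges = build_edges(records)
print(edges)
```

Pairwise linking grows quadratically with group size, so a very common (JA4, ASN) pair would produce a dense clique; a real pipeline would likely need a cap or sampling step for such groups, which is presumably part of why neighbor sampling is enabled for large graphs.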