feat: ja4-platform monorepo — 5 services unified, tests & RPM builds standardized

Services:
- ja4sentinel: TLS/JA4 fingerprint capture daemon (Go, libpcap)
- logcorrelator: JA4 log correlation engine (Go, ClickHouse)
- mod_reqin_log: Apache module (C, JSON request logging)
- bot_detector: ML bot detection pipeline (Python)
- dashboard: FastAPI/Streamlit analytics UI (Python)

Shared libraries:
- shared/go/ja4common: logger, config, shutdown, ipfilter (Go module)
- shared/python/ja4_common: ClickHouseClient, ClickHouseSettings (Python package)
- shared/clickhouse/: canonical SQL migrations (10 files)

Build & packaging:
- Unified 3-stage Dockerfile.package for Go RPMs (el8/el9/el10)
- go.work workspace linking sentinel, correlator, ja4common
- Makefile with test-all, build-all, rpm-* targets

Fixes applied:
- go.work: 1.21 → 1.24.6 (required by sentinel)
- correlator Dockerfiles: golang:1.21 → golang:1.24
- replace directives in go.mod for ja4common local path
- pyproject.toml: setuptools.backends → setuptools.build_meta
- Removed static libpcap linking (unavailable on Rocky 9)
- Fixed data races in output/writers_test.go (sync.Mutex + atomic.Int32)
- Rewrote corrupted test files (logger_test.go × 2)

Test coverage:
- correlator: 67.1% total (unixsocket 80.5%, config 91.7%, app 83.3%, multi 87.7%, stdout 100%)
- sentinel: all 10 packages pass (api, capture, config, fingerprint, ipfilter, logging, output, tlsparse)

Documentation:
- README.md + docs/ (architecture, development, 5 services, shared libs, DB schema & migrations)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
toto
2026-04-07 16:42:59 +02:00
commit d469e39da7
278 changed files with 1621301 additions and 0 deletions

@@ -0,0 +1,92 @@
"""
Endpoints for listing unique attribute values
"""
from fastapi import APIRouter, HTTPException, Query

from ..database import db
from ..models import AttributeListResponse, AttributeListItem

router = APIRouter(prefix="/api/attributes", tags=["attributes"])


@router.get("/{attr_type}", response_model=AttributeListResponse)
async def get_attributes(
    attr_type: str,
    limit: int = Query(100, ge=1, le=1000, description="Maximum number of results")
):
    """
    Return the list of unique values for an attribute type
    """
    try:
        # Map public attribute types to column names.
        # Only whitelisted columns are ever interpolated into the SQL text.
        type_column_map = {
            "ip": "src_ip",
            "ja4": "ja4",
            "country": "country_code",
            "asn": "asn_number",
            "host": "host",
            "threat_level": "threat_level",
            "model_name": "model_name",
            "asn_org": "asn_org"
        }

        if attr_type not in type_column_map:
            raise HTTPException(
                status_code=400,
                detail=f"Invalid type. Supported types: {', '.join(type_column_map.keys())}"
            )

        column = type_column_map[attr_type]

        # Base query
        base_query = f"""
            SELECT
                {column} AS value,
                count() AS count
            FROM ml_detected_anomalies
            WHERE detected_at >= now() - INTERVAL 24 HOUR
        """

        # Add a filter that excludes empty/null values.
        # IPv6/IPv4 columns need special handling: they cannot be compared to ''.
        if attr_type == "ip":
            # For IP addresses, convert to string first, then filter
            query = f"""
                SELECT value, count FROM (
                    SELECT toString({column}) AS value, count() AS count
                    FROM ml_detected_anomalies
                    WHERE detected_at >= now() - INTERVAL 24 HOUR
                    GROUP BY {column}
                )
                WHERE value != '' AND value IS NOT NULL
                ORDER BY count DESC
                LIMIT %(limit)s
            """
        else:
            query = f"""
                {base_query}
                AND {column} != '' AND {column} IS NOT NULL
                GROUP BY value
                ORDER BY count DESC
                LIMIT %(limit)s
            """

        result = db.query(query, {"limit": limit})

        items = [
            AttributeListItem(
                value=str(row[0]),
                count=row[1]
            )
            for row in result.result_rows
        ]

        return AttributeListResponse(
            type=attr_type,
            items=items,
            total=len(items)
        )
    except HTTPException:
        # Re-raise HTTP errors (e.g. the 400 above) unchanged
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Error: {e}") from e
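
The endpoint above interpolates a column name into the SQL text and passes only `limit` as a bound parameter, so the whitelist lookup is what keeps user input out of the query. A minimal sketch of that pattern as a pure function follows, which makes it testable without FastAPI or ClickHouse. The name `build_attribute_query` and the module-level `TYPE_COLUMN_MAP` are hypothetical and not part of this commit. The sketch also uses the subquery form for every attribute type, which differs from the non-IP branch in the file.

```python
from typing import Dict, Tuple

# Same whitelist as the endpoint: public attribute name -> column name.
TYPE_COLUMN_MAP: Dict[str, str] = {
    "ip": "src_ip",
    "ja4": "ja4",
    "country": "country_code",
    "asn": "asn_number",
    "host": "host",
    "threat_level": "threat_level",
    "model_name": "model_name",
    "asn_org": "asn_org",
}


def build_attribute_query(attr_type: str, limit: int) -> Tuple[str, Dict[str, int]]:
    """Hypothetical helper: return (sql, params) for a whitelisted attribute type."""
    if attr_type not in TYPE_COLUMN_MAP:
        # Reject anything outside the whitelist before it reaches the SQL text.
        raise ValueError(f"unsupported attribute type: {attr_type!r}")
    if not 1 <= limit <= 1000:
        raise ValueError("limit must be between 1 and 1000")

    column = TYPE_COLUMN_MAP[attr_type]
    # IP columns are cast to String so the outer filter can compare against ''.
    value_expr = f"toString({column})" if attr_type == "ip" else column
    sql = f"""
        SELECT value, count FROM (
            SELECT {value_expr} AS value, count() AS count
            FROM ml_detected_anomalies
            WHERE detected_at >= now() - INTERVAL 24 HOUR
            GROUP BY {column}
        )
        WHERE value != '' AND value IS NOT NULL
        ORDER BY count DESC
        LIMIT %(limit)s
    """
    # The limit stays a bound parameter; it is never formatted into the string.
    return sql, {"limit": limit}
```

Under these assumptions the endpoint body would reduce to `sql, params = build_attribute_query(attr_type, limit)` followed by `db.query(sql, params)`, with `ValueError` mapped to a 400 response.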