docs: add sanity check queries for ClickHouse ingestion

- Add 6 verification queries in README
- Check tables exist, MV definition, row counts
- Display raw and parsed log samples
- Add interpretation guide for troubleshooting

Co-authored-by: Qwen-Coder <qwen-coder@alibabacloud.com>
toto
2026-03-03 14:40:35 +01:00
parent eed376d749
commit a6327cc36f

@@ -378,6 +378,65 @@ SELECT raw_json
FROM mabase_prod.http_logs_raw;
```
### Sanity checks - Ingestion verification
After deploying the service, verify that data is flowing correctly:
```sql
-- 1. Tables exist
SELECT
    database,
    table,
    engine
FROM system.tables
WHERE database = currentDatabase()
  AND table IN ('http_logs_raw', 'http_logs', 'mv_http_logs');

-- 2. Materialized view definition
SHOW CREATE TABLE mv_http_logs;

-- 3. Check that raw inserts are arriving
SELECT
    count(*) AS rows_raw,
    min(ingest_time) AS min_ingest,
    max(ingest_time) AS max_ingest
FROM http_logs_raw;

-- 4. Show the latest raw logs
SELECT
    ingest_time,
    raw_json
FROM http_logs_raw
ORDER BY ingest_time DESC
LIMIT 5;

-- 5. Check that the MV is populating http_logs
SELECT
    count(*) AS rows_flat,
    min(time) AS min_time,
    max(time) AS max_time
FROM http_logs;

-- 6. Show the latest parsed logs
SELECT
    time,
    src_ip,
    dst_ip,
    method,
    host,
    path,
    header_user_agent,
    tls_version,
    ja4
FROM http_logs
ORDER BY time DESC
LIMIT 10;
```
**Interpretation:**
- If `rows_raw` > 0 but `rows_flat` = 0: the materialized view is not populating `http_logs` (check the SELECT privileges on `http_logs_raw`)
- If both counts are > 0: ingestion and parsing are working correctly
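
The two counts can also be compared in a single query. The sketch below is not part of the original checks. It assumes the same `http_logs_raw` and `http_logs` tables in the current database, and the `status` column and its messages are illustrative only. It also reports a third case the list above does not cover, when no raw rows have arrived yet.

```sql
-- Sketch: compare raw and parsed row counts in one pass.
-- The status labels are illustrative, not produced by the service.
SELECT
    (SELECT count() FROM http_logs_raw) AS rows_raw,
    (SELECT count() FROM http_logs) AS rows_flat,
    multiIf(
        rows_raw = 0, 'no raw inserts yet',
        rows_flat = 0, 'materialized view is not populating http_logs',
        'ingestion and parsing OK'
    ) AS status;
```

If `status` points at the materialized view, running `SHOW GRANTS` as the user that performs the inserts is one way to start checking the SELECT privileges mentioned above.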
## Tests
```bash