feat(ebpf): add Apache httpd HTTP capture via kretprobe recvfrom

- Add uprobe_apache.c with kretprobe on __x64_sys_recvfrom for Apache HTTP capture
- Update loader.go to support unified "servers" configuration instead of separate nginx_bin_path/apache_enabled
- Add consumeApacheHTTPEvents() function to process Apache HTTP events
- Update bpf_types.h to add Apache-specific BPF maps and structs
- Fix perf event array value_size for pb_apache_http (must be sizeof(__u32) not struct size)
- Add NGINX_APACHE_GUIDE.md documentation for HTTP capture from both servers

Validation results:
- nginx HTTP capture: ✅ Working (57 headers captured, no truncation)
- Apache HTTP capture: ⚠️ Under investigation (kretprobe not triggering on CentOS 8 kernel 4.18)

Configuration:
- JA4EBPF_UPROBES_ENABLED=true
- JA4EBPF_UPROBES_SERVERS=nginx,apache (or "both")

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs/services/ja4ebpf/NGINX_APACHE_GUIDE.md
# Guide: Nginx and Apache httpd HTTP capture with ja4ebpf

## Overview

ja4ebpf can capture full HTTP traffic from two different web servers:

- **Nginx** ✅: via the `recvfrom()` syscall (kretprobe on `__x64_sys_recvfrom`)
- **Apache httpd** ⚠️: validation in progress, using the same kretprobe on `__x64_sys_recvfrom`

### Validation status

| Server | OS (kernel) | Status | Headers captured |
|--------|-------------|--------|------------------|
| nginx | Rocky Linux 9 (5.14+) | ✅ Validated | All (no truncation) |
| Apache httpd | CentOS 8 (4.18) | ⚠️ In progress | Investigation needed |
| Apache httpd | Rocky Linux 9 (5.14+) | ⚠️ Not yet tested | - |

## Configuration

### YAML configuration

```yaml
uprobes:
  enabled: true
  servers: ["nginx", "apache"]  # Enables both servers
```

### Environment variables

```bash
JA4EBPF_UPROBES_ENABLED=true
JA4EBPF_UPROBES_SERVERS=nginx,apache  # or "both" for both servers
```

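The `JA4EBPF_UPROBES_SERVERS` value accepts either a comma-separated list or the shorthand `both`. The following Go snippet is a minimal sketch of how such a value could be normalized. `parseServers` is a hypothetical helper written for illustration; it is not taken from `loader.go`, and the real parsing rules (case handling, duplicates, unknown names) may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseServers turns a JA4EBPF_UPROBES_SERVERS-style value into a list of
// server names. Hypothetical helper, not the actual loader.go code.
func parseServers(value string) ([]string, error) {
	known := map[string]bool{"nginx": true, "apache": true}
	seen := map[string]bool{}
	var servers []string
	for _, item := range strings.Split(value, ",") {
		name := strings.ToLower(strings.TrimSpace(item))
		if name == "" {
			continue // tolerate empty entries such as "nginx,,apache"
		}
		// "both" is treated as shorthand for nginx plus apache.
		expanded := []string{name}
		if name == "both" {
			expanded = []string{"nginx", "apache"}
		}
		for _, s := range expanded {
			if !known[s] {
				return nil, fmt.Errorf("unknown server %q", s)
			}
			if !seen[s] {
				seen[s] = true
				servers = append(servers, s)
			}
		}
	}
	return servers, nil
}

func main() {
	for _, v := range []string{"nginx,apache", "both", "apache"} {
		servers, err := parseServers(v)
		fmt.Println(v, "->", servers, err)
	}
}
```

Rejecting unknown names early, rather than silently ignoring them, makes a typo in the environment variable visible at startup.
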
## Capture architecture

### Nginx (rocky9: 192.168.42.40)

```
┌──────────────┐
│ nginx worker │─┐
└──────────────┘ │
                 ├─ recvfrom() ──┐
                 │               │
          ┌──────▼──────┐        │
          │  kretprobe  │        │
          │  __x64_sys  │        │
          │  recvfrom   │        │
          └─────────────┘        │
                                 │
                         ┌───────▼──────┐
                         │   ja4ebpf    │
                         │  user space  │
                         └──────────────┘
```

### Apache httpd (centos8: 192.168.42.228) - validation in progress

```
┌──────────────┐
│ httpd worker │─┐
└──────────────┘ │
                 ├─ recvfrom() ──┐
                 │               │
          ┌──────▼──────┐        │
          │  kretprobe  │        │
          │  __x64_sys  │        │
          │  recvfrom   │        │
          └─────────────┘        │
                                 │
                         ┌───────▼──────┐
                         │   ja4ebpf    │
                         │  user space  │
                         └──────────────┘
```

**Note**: Apache httpd with the event MPM may use different syscalls depending on its configuration.
The tests in progress use a kretprobe on `__x64_sys_recvfrom` (same as nginx).

## Multi-server deployment

### Scenario 1: ja4ebpf on each web server

```
┌──────────────────┐    ┌──────────────────┐
│ rocky9 (nginx)   │    │ centos8 (apache) │
│                  │    │                  │
│ ┌────────────┐   │    │ ┌────────────┐   │
│ │   nginx    │   │    │ │   Apache   │   │
│ └─────┬──────┘   │    │ └─────┬──────┘   │
│       │          │    │       │          │
│ ┌─────▼──────┐   │    │ ┌─────▼──────┐   │
│ │  ja4ebpf   │   │    │ │  ja4ebpf   │   │
│ └────────────┘   │    │ └────────────┘   │
│                  │    │                  │
│ capture: recvfrom│    │ capture: recvfrom│
└──────────────────┘    └──────────────────┘

  IP: 192.168.42.40       IP: 192.168.42.228
```

**Configuration**:

```bash
# rocky9
JA4EBPF_UPROBES_SERVERS=nginx

# centos8
JA4EBPF_UPROBES_SERVERS=apache
```

### Scenario 2: ja4ebpf on a separate analysis VM (recommended)

In this setup both web servers run on the same VM as ja4ebpf, because capture is local to the host.

```
┌──────────────────────────────────────────────┐
│             analysis VM (ja4ebpf)            │
│                                              │
│   ┌────────────┐        ┌────────────┐       │
│   │   nginx    │        │   Apache   │       │
│   └─────┬──────┘        └─────┬──────┘       │
│         │                     │              │
│         └──────────┬──────────┘              │
│                    │                         │
│          ┌─────────▼─────────┐               │
│          │      ja4ebpf      │               │
│          │  (read/recvfrom)  │               │
│          └───────────────────┘               │
└──────────────────────────────────────────────┘
```

**Configuration**:

```yaml
uprobes:
  enabled: true
  servers: ["nginx", "apache"]
```

## Validation

### Checking nginx

```bash
# Send a test request to check that nginx traffic is captured
curl http://192.168.42.40/test -H "User-Agent: Test" -H "X-Request-ID: test-nginx-001"

# ja4ebpf logs
tail -f /tmp/ja4ebpf-test.log | grep "\[nginx\]"
# Example: [nginx] HTTP: pid=116276 fd=8 GET /test (headers=5)

# ClickHouse
sudo docker exec analysis-clickhouse-1 clickhouse-client --query \
  "SELECT method, path, header_user_agent FROM ja4_logs.http_logs \
   WHERE time > now() - INTERVAL 1 MINUTE ORDER BY time DESC LIMIT 10"
```

### Checking Apache

```bash
# Send a test request to check that Apache traffic is captured
curl http://192.168.42.228/test -H "User-Agent: Test" -H "X-Request-ID: test-apache-001"

# ja4ebpf logs
tail -f /tmp/ja4ebpf-apache.log | grep "\[apache\]"
# Example: [apache] HTTP: pid=48914 fd=8 GET /test (headers=5)

# ClickHouse
sudo docker exec analysis-clickhouse-1 clickhouse-client --query \
  "SELECT method, path, header_user_agent FROM ja4_logs.http_logs \
   WHERE time > now() - INTERVAL 1 MINUTE ORDER BY time DESC LIMIT 10"
```

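When checking many requests, the log lines shown in the examples can be parsed instead of read by eye. The sketch below assumes the log format is exactly the one in the example comments (`[server] HTTP: pid=… fd=… METHOD PATH (headers=N)`); the `httpEvent` type and `parseLogLine` function are hypothetical names introduced here and are not part of ja4ebpf.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// httpEvent holds the fields visible in a ja4ebpf HTTP log line.
// Hypothetical type, modeled only on the example log lines in this guide.
type httpEvent struct {
	Server  string
	PID     int
	FD      int
	Method  string
	Path    string
	Headers int
}

// logLineRe matches lines such as:
//   [nginx] HTTP: pid=116276 fd=8 GET /test (headers=5)
var logLineRe = regexp.MustCompile(
	`\[(\w+)\] HTTP: pid=(\d+) fd=(\d+) (\S+) (\S+) \(headers=(\d+)\)`)

// parseLogLine extracts an httpEvent from one log line.
// It returns false when the line does not match the assumed format.
func parseLogLine(line string) (httpEvent, bool) {
	m := logLineRe.FindStringSubmatch(line)
	if m == nil {
		return httpEvent{}, false
	}
	pid, err1 := strconv.Atoi(m[2])
	fd, err2 := strconv.Atoi(m[3])
	headers, err3 := strconv.Atoi(m[6])
	if err1 != nil || err2 != nil || err3 != nil {
		return httpEvent{}, false
	}
	return httpEvent{
		Server: m[1], PID: pid, FD: fd,
		Method: m[4], Path: m[5], Headers: headers,
	}, true
}

func main() {
	line := "[apache] HTTP: pid=48914 fd=8 GET /test (headers=5)"
	if ev, ok := parseLogLine(line); ok {
		fmt.Printf("%+v\n", ev)
	}
}
```
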
## Validation tests

### Test 1: Full headers

```bash
# nginx test (20+ headers)
curl http://192.168.42.40/api/test \
  -H "User-Agent: Mozilla/5.0 (Validation-Agent)" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer token" \
  -H "X-Custom-1: value1" \
  -H "X-Custom-2: value2"
# ... (add further -H options, up to 20 headers)

# Apache test (20+ headers)
curl http://192.168.42.228/api/test \
  -H "User-Agent: Mozilla/5.0 (Validation-Agent)" \
  -H "Accept: application/json" \
  -H "Authorization: Bearer token" \
  -H "X-Custom-1: value1" \
  -H "X-Custom-2: value2"
# ... (add further -H options, up to 20 headers)
```

### Test 2: Long path and query string

```bash
# nginx
curl "http://192.168.42.40/api/v1/users/12345/profile/preferences?include=all&filter=active&sort=desc"

# Apache
curl "http://192.168.42.228/api/v1/users/12345/profile/preferences?include=all&filter=active&sort=desc"
```

### ClickHouse validation

```sql
-- Query to check the capture
-- header_order_signature is a ';'-separated string, so headers are counted by splitting it
SELECT
    src_ip,
    method,
    path,
    query,
    substring(header_user_agent, 1, 40) AS ua_preview,
    length(splitByChar(';', header_order_signature)) AS header_count,
    substring(header_order_signature, 1, 60) AS headers_preview
FROM ja4_logs.http_logs
WHERE time > now() - INTERVAL 5 MINUTE
ORDER BY time DESC
LIMIT 20
FORMAT Pretty
```

## Validation results

### Nginx (via recvfrom) - ✅ VALIDATED on Rocky Linux 9

- ✅ HTTP method captured
- ✅ Full path, no truncation
- ✅ Full query string
- ✅ All headers captured (including custom X-*)
- ✅ Full User-Agent
- ✅ Header order preserved
- ✅ ClickHouse data stored without truncation

**Example of a validated capture**:

```sql
SELECT method, path, host,
       length(splitByChar(';', header_order_signature)) AS headers_count,
       header_order_signature
FROM ja4_logs.http_logs
WHERE path = '/test-nginx-final'
-- Result: headers_count=6
-- header_order_signature: host;accept;user-agent;x-request-id;x-custom-1;x-custom-2
```

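The header count in the result above is the number of `;`-separated names in `header_order_signature`, not the length of the string. The Go sketch below illustrates that relationship on the signature from the example. `headerNames` is a hypothetical helper, and the assumption that the signature is always a plain `;`-joined list of lowercase names comes only from this one example.

```go
package main

import (
	"fmt"
	"strings"
)

// headerNames splits a header_order_signature value into header names,
// preserving their order. Hypothetical helper for illustration.
func headerNames(signature string) []string {
	if signature == "" {
		return nil // no headers captured
	}
	return strings.Split(signature, ";")
}

func main() {
	sig := "host;accept;user-agent;x-request-id;x-custom-1;x-custom-2"
	names := headerNames(sig)
	fmt.Println("header count:", len(names)) // 6 for the example signature
	fmt.Println("string length:", len(sig))  // differs from the header count
	for i, name := range names {
		fmt.Printf("%d: %s\n", i+1, name)
	}
}
```
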
### Apache httpd - ⚠️ VALIDATION IN PROGRESS

On CentOS 8 (kernel 4.18):

- ⚠️ The `__x64_sys_recvfrom` kretprobe does not fire any events
- ⚠️ The TC layer captures the connection (src_ip available)
- ❌ The HTTP layer does not capture headers

**Investigation leads**:

1. Check whether Apache with the event MPM uses recv() or recvfrom()
2. Test Apache on Rocky 9 (kernel 5.14+)
3. Consider the `sys_enter_recvfrom` tracepoint as an alternative

## Troubleshooting

### Apache does not capture

```bash
# Check which syscall Apache httpd uses to read requests (read() or recvfrom())
sudo strace -p 48914 -e trace=read,recvfrom 2>&1 | grep -A5 "GET "

# Check that the Apache PIDs are in the map (dump shows the entries)
sudo bpftool map dump name apache_pid_map

# Check that the recvfrom kretprobe program is loaded
sudo bpftool prog show | grep recvfrom
```

### Nginx does not capture

```bash
# Check the attached BPF programs
sudo bpftool prog show | grep recvfrom

# Check the nginx PIDs
pgrep -a nginx | wc -l

# Check the ja4ebpf logs
tail -f /tmp/ja4ebpf-test.log | grep nginx
```

## BPF files

### uprobe_nginx.c

- `SEC("tp/syscalls/sys_enter_recvfrom")`: saves the `recvfrom` arguments
- `SEC("kretprobe/__x64_sys_recvfrom")`: captures the HTTP data and sends it to `pb_nginx_http`

### uprobe_apache.c

- `SEC("kretprobe/__x64_sys_recvfrom")`: captures the HTTP data and sends it to `pb_apache_http`
- Uses `PT_REGS_PARM2()` to access the user buffer

## Limitations

1. **Architecture**: the `__x64_sys_recvfrom` kretprobe is specific to the x86_64 architecture
2. **Local only**: capture must run on the same machine as the web server (to see its syscalls)
3. **Performance**: each intercepted syscall generates a BPF event, so very high traffic can impact performance

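The architecture limitation comes from the syscall wrapper naming used by the kernel: the wrapper is `__x64_sys_recvfrom` on x86_64 and `__arm64_sys_recvfrom` on arm64. The Go sketch below shows one way to derive the symbol for the current architecture and check that it is listed in `/proc/kallsyms` before attaching. `syscallSymbol` and `hasSymbol` are hypothetical helpers and are not part of ja4ebpf; only the two prefixes listed are covered.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

// syscallSymbol returns the kernel syscall wrapper symbol for a Go
// architecture name. Only amd64 and arm64 are handled in this sketch.
func syscallSymbol(goarch, syscall string) (string, error) {
	switch goarch {
	case "amd64":
		return "__x64_sys_" + syscall, nil
	case "arm64":
		return "__arm64_sys_" + syscall, nil
	default:
		return "", fmt.Errorf("no known syscall wrapper prefix for %s", goarch)
	}
}

// hasSymbol reports whether kallsyms-formatted text
// ("address type name [module]" per line) contains the exact symbol.
func hasSymbol(kallsyms, symbol string) bool {
	for _, line := range strings.Split(kallsyms, "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 3 && fields[2] == symbol {
			return true
		}
	}
	return false
}

func main() {
	sym, err := syscallSymbol(runtime.GOARCH, "recvfrom")
	if err != nil {
		fmt.Println(err)
		return
	}
	data, err := os.ReadFile("/proc/kallsyms")
	if err != nil {
		fmt.Println("cannot read /proc/kallsyms:", err)
		return
	}
	fmt.Printf("%s present: %v\n", sym, hasSymbol(string(data), sym))
}
```

A symbol that is present but never fires, as observed with Apache on CentOS 8, points to the process using a different syscall rather than to a missing probe target.
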
## References

- nginx recvfrom documentation: `docs/services/ja4ebpf.md`
- ClickHouse validation report: `services/ja4ebpf/docs/CLICKHOUSE_VALIDATION_REPORT.md`
- recvfrom kretprobe fix: `services/ja4ebpf/docs/RECVFROM_FIX.md`