Wazuh AI/ML Security Rules
41 custom SIEM rules and 3 log decoders for monitoring AI inference infrastructure — covering jailbreak detection, RAG pipeline operations, MCP agent security, and model supply chain integrity.
01 Problem
AI inference stacks generate structured logs that traditional SIEM rules don't parse. Jailbreak attempts, prompt extraction, MCP tool abuse, and poisoned model files all produce telemetry — but without purpose-built decoders and detection rules, that telemetry goes unanalyzed.
These rules were developed from hands-on operation of a multi-layer inference gateway, RAG pipeline with MCP tool integration, and model scanning pipeline. Each rule exists because the corresponding attack pattern was observed, tested, or deliberately simulated in a lab environment.
02 Rule Domains
Inference Gateway Security
16 rules · IDs 100100–100191 · Jailbreak defense, evasion detection, privacy controls
| Rule | Level | Detection | Type |
|---|---|---|---|
| 100100 | 10 | Jailbreak attempt blocked (regex/classifier layer) | Alert |
| 100101 | 8 | Deep analysis escalation triggered | Alert |
| 100102 | 7 | Gateway authentication failure | Alert |
| 100103 | 6 | Rate limit exceeded | Alert |
| 100104 | 12 | Multiple jailbreaks from same source (3+ in 5 min) | Correlation |
| 100105 | 11 | Request blocked by deep analysis engine | Alert |
| 100110 | 9 | Encoding evasion attempt (base64, hex, rot13) | Alert |
| 100111 | 11 | Multiple encoding evasion attempts from same source | Correlation |
| 100112 | 9 | System prompt extraction attempt | Alert |
| 100113 | 11 | Multiple extraction attempts from same source | Correlation |
| 100114 | 6 | Token budget exceeded | Alert |
| 100115 | 11 | Same block pattern triggered repeatedly (persistence) | Correlation |
| 100150 | 9 | PII detected in model response | Alert |
| 100151 | 12 | Multiple PII leaks to same source | Correlation |
| 100190 | 3 | Request completed (audit trail) | Audit |
| 100191 | 7 | Upstream inference server error | Alert |
RAG Pipeline & MCP Agent Monitoring
19 rules · IDs 100120–100138 · Document operations, agent tool calls, sensitive access
| Rule | Level | Detection | Type |
|---|---|---|---|
| 100120 | 3 | Document operation (upload, embed) | Audit |
| 100121 | 3 | RAG query activity | Audit |
| 100122 | 3 | Vector database operation | Audit |
| 100123 | 7 | Document processing error | Alert |
| 100124 | 5 | Workspace configuration change | Alert |
| 100130 | 3 | MCP server started | Audit |
| 100131 | 8 | MCP server failure | Alert |
| 100132 | 5 | MCP tool invocation | Alert |
| 100133 | 7 | MCP write operation (file create/edit/move) | Alert |
| 100134 | 8 | MCP configuration accessed | Alert |
| 100135 | 10 | MCP accessed sensitive path or keyword | Alert |
| 100136 | 5 | Authentication event | Audit |
| 100137 | 7 | Backend error | Alert |
| 100138 | 5 | Backend warning | Alert |
Model Supply Chain Security
6 rules · IDs 100140–100145 · Pickle scanning, model integrity verification
| Rule | Level | Detection | Type |
|---|---|---|---|
| 100140 | 3 | Model scan initiated | Audit |
| 100141 | 12 | Dangerous pickle object detected | Alert |
| 100142 | 10 | Model security issues found | Alert |
| 100143 | 11 | Model failed security scan | Alert |
| 100144 | 3 | Model passed security scan | Audit |
| 100145 | 7 | Scanner tool not available | Alert |
03 Architecture
Each AI service emits structured JSON logs. Custom decoders parse these into fields that rules can match against, so the pipeline runs from service log to decoder to rule to alert.
Three custom decoders handle the three log sources. Each uses Wazuh's JSON plugin decoder to extract structured fields — event, layer, service, level — that rules then match with field-level conditions rather than raw string matching. This keeps rules portable across different infrastructure deployments.
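As a rough illustration of the kind of log line such a decoder would consume, the sketch below emits one JSON object per line carrying the four fields named above. The function name `emit_event`, the sample values (`jailbreak_blocked`, `regex`, `gateway`, `warning`), and the extra `timestamp` and `srcip` keys are assumptions for illustration, not the project's actual schema.

```python
import json
import sys
from datetime import datetime, timezone


def emit_event(event, layer, service, level, stream=sys.stdout, **extra):
    """Write one structured log record as a single JSON line and return it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,      # what happened (hypothetical value below)
        "layer": layer,      # which defense layer produced the event
        "service": service,  # which service wrote the log
        "level": level,      # log severity as reported by the service
    }
    record.update(extra)     # optional keys such as a source IP
    line = json.dumps(record, separators=(",", ":"))
    stream.write(line + "\n")
    return line


if __name__ == "__main__":
    emit_event("jailbreak_blocked", "regex", "gateway", "warning",
               srcip="203.0.113.7")
```

Writing one object per line keeps each record independently parseable, which is what lets rules test individual fields instead of scanning free text.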
04 Example Rules
Jailbreak Correlation
Escalates to level 12 when the same source triggers 3+ blocks within 5 minutes.
<rule id="100104" level="12" frequency="3" timeframe="300">
  <!-- Correlates on base rule 100100: 3 matches from one source IP within 300 seconds -->
  <if_matched_sid>100100</if_matched_sid>
  <same_source_ip />
  <description>Multiple jailbreak attempts from same source</description>
  <group>ai_security,jailbreak,correlation,</group>
</rule>
MCP Sensitive Path Access
Fires when an MCP agent tool accesses paths or keywords associated with secrets, credentials, or system configuration.
<rule id="100135" level="10">
  <!-- Matches backend-service events whose log text contains a sensitive path or keyword -->
  <decoded_as>json</decoded_as>
  <field name="service">backend</field>
  <match>/etc/|/var/|\.ssh|\.env|password|secret|token</match>
  <description>MCP accessed sensitive path or keyword</description>
  <group>mcp,sensitive,security,</group>
</rule>
Dangerous Pickle Detection
Raises a level 12 critical alert when a model file contains dangerous serialized objects, a well-known supply chain attack vector for ML models.
<rule id="100141" level="12">
  <!-- Requires both fields: a completed picklescan event with a "dangerous" result -->
  <decoded_as>json</decoded_as>
  <field name="event">picklescan_complete</field>
  <field name="result">dangerous</field>
  <description>DANGEROUS pickle detected in model file</description>
  <group>supply_chain,malware,critical,</group>
</rule>
05 Design Decisions
Severity Calibration
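Rule levels follow a deliberate escalation model. Audit events (level 3) capture baseline telemetry without noise. Operational events such as rate limits and configuration changes sit at levels 5–6, and most single-occurrence attack detections trigger mid-range alerts (7–10). Correlation rules that detect persistence, repeated evasion, or multi-event attack chains escalate to levels 11–12, as do a few high-confidence single events such as a deep-analysis block (100105, level 11) or a dangerous pickle object (100141, level 12). This limits alert fatigue while ensuring automated attacks are not missed.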
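One way to make the tiers concrete is a small lookup that buckets a rule level into the categories described above. This is a sketch only: the function name `tier_for_level`, the tier labels, and the exact cut-offs (in particular, treating levels 4 to 10 as a single "alert" tier) are assumptions, while the sample levels are taken from the rule tables in this document.

```python
def tier_for_level(level: int) -> str:
    """Map a rule level to a tier name. Cut-offs are illustrative assumptions."""
    if level <= 3:
        return "audit"       # baseline telemetry
    if level <= 10:
        return "alert"       # single-occurrence detections
    return "escalation"      # correlation and high-confidence detections


# Levels copied from the rule tables in this document.
RULE_LEVELS = {100190: 3, 100103: 6, 100100: 10, 100104: 12, 100141: 12}

if __name__ == "__main__":
    for rule_id, level in sorted(RULE_LEVELS.items()):
        print(rule_id, level, tier_for_level(level))
```

A mapping like this could drive notification routing, for example sending only the "escalation" tier to an on-call channel.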
Correlation Over Single-Event
Five of the 41 rules are correlation rules that fire only when a lower-level rule triggers multiple times from the same source within a time window. A single jailbreak attempt is level 10. Three from the same IP in five minutes is level 12. This reflects how real attacks behave — adversaries iterate, and iteration creates a detectable pattern.
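The idea behind these correlation rules can be modeled as a per-source sliding window. The sketch below is a conceptual stand-in, not the Wazuh engine: the class name `FrequencyCorrelator` is hypothetical, timestamps are assumed to arrive in non-decreasing order, and Wazuh's own counting semantics for `frequency` and `timeframe` may differ in detail.

```python
from collections import defaultdict, deque


class FrequencyCorrelator:
    """Flag a source once it produces `frequency` matches within `timeframe` seconds."""

    def __init__(self, frequency=3, timeframe=300):
        self.frequency = frequency
        self.timeframe = timeframe
        self._seen = defaultdict(deque)  # source IP -> timestamps of recent matches

    def observe(self, srcip, ts):
        """Record one base-rule match; return True if the threshold is met."""
        window = self._seen[srcip]
        window.append(ts)
        # Drop matches that have fallen out of the time window.
        while window and ts - window[0] > self.timeframe:
            window.popleft()
        return len(window) >= self.frequency


if __name__ == "__main__":
    correlator = FrequencyCorrelator()
    for ts in (0, 100, 200):
        print(ts, correlator.observe("203.0.113.7", ts))
```

Keying the window on the source IP means slow, distributed attempts stay at the single-event level, while one client iterating quickly is escalated.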
Field-Level Matching
Rules match on structured JSON fields (event, layer, service, result) rather than raw log strings. This makes rules portable — they work regardless of the infrastructure that produces the logs, as long as the log schema is consistent.
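The difference between field-level and raw-string matching can be shown with two log lines that carry the same data in different layouts. In this sketch the helper name `matches_fields` and the `service` value `scanner` are hypothetical; the `event` and `result` values come from rule 100141. The helper uses exact equality, whereas Wazuh's `<field>` element treats its value as a pattern, so this is an approximation of the idea.

```python
import json


def matches_fields(raw_line, required):
    """Return True if the JSON log line has every required field with the given value."""
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return False
    if not isinstance(record, dict):
        return False
    return all(record.get(key) == value for key, value in required.items())


REQUIRED = {"event": "picklescan_complete", "result": "dangerous"}

# Same content, different key order and whitespace.
COMPACT = '{"event":"picklescan_complete","result":"dangerous","service":"scanner"}'
SPACED = '{ "service": "scanner", "result": "dangerous", "event": "picklescan_complete" }'

if __name__ == "__main__":
    for line in (COMPACT, SPACED):
        raw_hit = '"result":"dangerous"' in line   # brittle substring check
        print(raw_hit, matches_fields(line, REQUIRED))
```

The substring check succeeds only on the compact line, while the field check succeeds on both, which is the portability property the paragraph describes.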
MCP as a Trust Boundary
MCP tool calls get disproportionate rule coverage (6 dedicated rules) relative to their log volume because MCP represents a trust boundary between the LLM and the filesystem/network. An MCP write operation (level 7) or sensitive path access (level 10) is categorically different from a chat query (level 3) — the rules reflect that risk differential.
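To see which tool-call arguments the sensitive-path pattern from rule 100135 would flag, the sketch below applies the same alternation using Python's `re` module. This is an approximation: Wazuh's `<match>` element uses its own matching syntax, so behavior may differ, and the helper name `is_sensitive` and the sample paths are hypothetical.

```python
import re

# Pattern copied from the <match> element of rule 100135.
SENSITIVE = re.compile(r"/etc/|/var/|\.ssh|\.env|password|secret|token")


def is_sensitive(text):
    """Return True if the text contains any sensitive path fragment or keyword."""
    return SENSITIVE.search(text) is not None


SAMPLES = [
    "/home/user/.ssh/id_rsa",        # SSH key directory
    "/app/.env",                     # environment file
    "/workspace/notes.md",           # ordinary document
    "/models/llama/tokenizer.json",  # contains the keyword "token"
]

if __name__ == "__main__":
    for path in SAMPLES:
        print(path, is_sensitive(path))
```

Under this reading, a model asset such as `tokenizer.json` also matches through the `token` keyword. Whether that is desirable depends on the deployment, and it may be worth confirming against real logs with `wazuh-logtest`.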
06 Detection Coverage
| Threat Category | Rules | Approach |
|---|---|---|
| Jailbreak / Prompt Injection | 100100, 100104, 100105, 100115 | Layer-specific blocking + persistence correlation |
| Evasion (Encoding) | 100110, 100111 | Base64/hex/rot13 detection + multi-attempt correlation |
| Prompt Extraction | 100112, 100113 | Extraction pattern matching + repeat correlation |
| Data Leakage (PII) | 100150, 100151 | Response scanning + leak correlation |
| MCP Agent Abuse | 100132–100135 | Tool call audit → write detection → sensitive path alerting |
| Model Supply Chain | 100141–100143 | Pickle scanning + integrity verification |
| RAG Pipeline | 100120–100124 | Document operations, query activity, config changes |
| Infrastructure | 100102, 100103, 100114, 100191 | Auth failures, rate limits, resource abuse, upstream errors |