Wazuh AI/ML Security Rules

01 Problem

AI inference stacks generate structured logs that traditional SIEM rules don't parse. Jailbreak attempts, prompt extraction, MCP tool abuse, and poisoned model files all produce telemetry — but without purpose-built decoders and detection rules, that telemetry goes unanalyzed.

These rules were developed from hands-on operation of a multi-layer inference gateway, a RAG pipeline with MCP tool integration, and a model scanning pipeline. Each rule exists because the corresponding attack pattern was observed, tested, or deliberately simulated in a lab environment.

02 Rule Domains

Inference Gateway Security

16 rules · IDs 100100–100191 · Jailbreak defense, evasion detection, privacy controls

Rule	Level	Detection	Type
100100	10	Jailbreak attempt blocked (regex/classifier layer)	Alert
100101	8	Deep analysis escalation triggered	Alert
100102	7	Gateway authentication failure	Alert
100103	6	Rate limit exceeded	Alert
100104	12	Multiple jailbreaks from same source (3+ in 5 min)	Correlation
100105	11	Request blocked by deep analysis engine	Alert
100110	9	Encoding evasion attempt (base64, hex, rot13)	Alert
100111	11	Multiple encoding evasion attempts from same source	Correlation
100112	9	System prompt extraction attempt	Alert
100113	11	Multiple extraction attempts from same source	Correlation
100114	6	Token budget exceeded	Alert
100115	11	Same block pattern triggered repeatedly (persistence)	Correlation
100150	9	PII detected in model response	Alert
100151	12	Multiple PII leaks to same source	Correlation
100190	3	Request completed (audit trail)	Audit
100191	7	Upstream inference server error	Alert

RAG Pipeline & MCP Agent Monitoring

19 rules (14 listed below) · IDs 100120–100138 · Document operations, agent tool calls, sensitive access

Rule	Level	Detection	Type
100120	3	Document operation (upload, embed)	Audit
100121	3	RAG query activity	Audit
100122	3	Vector database operation	Audit
100123	7	Document processing error	Alert
100124	5	Workspace configuration change	Alert
100130	3	MCP server started	Audit
100131	8	MCP server failure	Alert
100132	5	MCP tool invocation	Alert
100133	7	MCP write operation (file create/edit/move)	Alert
100134	8	MCP configuration accessed	Alert
100135	10	MCP accessed sensitive path or keyword	Alert
100136	5	Authentication event	Audit
100137	7	Backend error	Alert
100138	5	Backend warning	Alert

Model Supply Chain Security

6 rules · IDs 100140–100145 · Pickle scanning, model integrity verification

Rule	Level	Detection	Type
100140	3	Model scan initiated	Audit
100141	12	Dangerous pickle object detected	Alert
100142	10	Model security issues found	Alert
100143	11	Model failed security scan	Alert
100144	3	Model passed security scan	Audit
100145	7	Scanner tool not available	Alert

03 Architecture

Each AI service emits structured JSON logs. Custom decoders parse these into fields that rules can match against. The pipeline:

AI Service Logs → Wazuh Agent → Custom Decoder → Rule Engine → Alert / Correlation
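
The first hop of this pipeline (AI Service Logs → Wazuh Agent) can be sketched as a log collection entry in the agent's ossec.conf. This is a minimal sketch, assuming the service writes one JSON object per line; the file path is hypothetical and is not taken from the original deployment.

```xml
<!-- Agent-side ossec.conf fragment (sketch) -->
<ossec_config>
  <localfile>
    <!-- json: each line of the file is read as a single JSON event -->
    <log_format>json</log_format>
    <!-- Hypothetical path; point this at wherever the AI service writes its log -->
    <location>/var/log/ai-gateway/gateway.log</location>
  </localfile>
</ossec_config>
```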

Three custom decoders handle the three log sources. Each uses Wazuh's JSON plugin decoder to extract structured fields — event, layer, service, level — that rules then match with field-level conditions rather than raw string matching. This keeps rules portable across different infrastructure deployments.

04 Example Rules

Jailbreak Correlation

Escalates to level 12 when the same source triggers 3+ blocks within 5 minutes.

<rule id="100104" level="12" frequency="3" timeframe="300">
  <if_matched_sid>100100</if_matched_sid>
  <same_source_ip />
  <description>Multiple jailbreak attempts from same source</description>
  <group>ai_security,jailbreak,correlation,</group>
</rule>
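
The correlation rule depends on a base rule, 100100, which is not reproduced in this document. The following is only a sketch of what such a base rule might look like: the event value jailbreak_blocked and the group names are hypothetical placeholders. It also assumes the gateway log carries the client address in a key that Wazuh decodes as the source IP (for example srcip), so that <same_source_ip /> has an address to compare.

```xml
<!-- Sketch only: the "event" value is an assumed placeholder, not the author's schema -->
<rule id="100100" level="10">
  <decoded_as>json</decoded_as>
  <field name="event">jailbreak_blocked</field>
  <description>Jailbreak attempt blocked (regex/classifier layer)</description>
  <group>ai_security,jailbreak,</group>
</rule>
```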

MCP Sensitive Path Access

Fires when an MCP agent tool accesses paths or keywords associated with secrets, credentials, or system configuration.

<rule id="100135" level="10">
  <decoded_as>json</decoded_as>
  <field name="service">backend</field>
  <match>/etc/|/var/|\.ssh|\.env|password|secret|token</match>
  <description>MCP accessed sensitive path or keyword</description>
  <group>mcp,sensitive,security,</group>
</rule>

Dangerous Pickle Detection

Raises a level 12 critical alert when a model file contains dangerous serialized objects, the primary supply chain attack vector for ML models.

<rule id="100141" level="12">
  <decoded_as>json</decoded_as>
  <field name="event">picklescan_complete</field>
  <field name="result">dangerous</field>
  <description>DANGEROUS pickle detected in model file</description>
  <group>supply_chain,malware,critical,</group>
</rule>

05 Design Decisions

Severity Calibration

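Rule levels follow a deliberate escalation model. Audit events (level 3, with authentication events at level 5) capture baseline telemetry without noise. Operational alerts such as rate limits, token budgets, and configuration changes sit at levels 5–6. Single-occurrence attack detections mostly land in the 7–10 range; the exceptions are high-confidence blocks and scan failures (100105 and 100143 at level 11, 100141 at level 12). Correlation rules that detect persistence or repeated evasion from the same source escalate to levels 11–12. This limits alert fatigue while ensuring that automated, repeated attacks still surface.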
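
One way to act on these tiers is through the manager's alert thresholds. The fragment below is a sketch and not part of the original ruleset: it assumes the standard alerts block in the manager's ossec.conf and that email notification is configured separately. The threshold values are illustrative choices.

```xml
<!-- Manager-side ossec.conf fragment (sketch) -->
<ossec_config>
  <alerts>
    <!-- Record everything from the level 3 audit rules upward -->
    <log_alert_level>3</log_alert_level>
    <!-- Assumed choice: email only for level 12 rules (100104, 100141, 100151) -->
    <email_alert_level>12</email_alert_level>
  </alerts>
</ossec_config>
```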

Correlation Over Single-Event

Five of the 41 rules are correlation rules that fire only when a lower-level rule triggers multiple times from the same source within a time window. A single jailbreak attempt is level 10. Three from the same IP in five minutes is level 12. This reflects how real attacks behave — adversaries iterate, and iteration creates a detectable pattern.
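
The other correlation rules can follow the same shape as the jailbreak example. As an illustration, a sketch of the encoding evasion correlation (100111, level 11, built on 100110) might look like this. The frequency and timeframe values are assumptions copied from rule 100104, since the document does not state the thresholds for this rule, and the group names are likewise assumed.

```xml
<!-- Sketch: frequency/timeframe are assumed, mirroring rule 100104 -->
<rule id="100111" level="11" frequency="3" timeframe="300">
  <if_matched_sid>100110</if_matched_sid>
  <same_source_ip />
  <description>Multiple encoding evasion attempts from same source</description>
  <group>ai_security,evasion,correlation,</group>
</rule>
```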

Field-Level Matching

Rules match on structured JSON fields (event, layer, service, result) rather than raw log strings. This makes rules portable — they work regardless of the infrastructure that produces the logs, as long as the log schema is consistent.
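
As a sketch of what field-level matching could look like for a path check, the variant below tests a decoded field instead of the raw event, unlike the <match> line in the 100135 example. The field name path is hypothetical (the document does not list one), and type="pcre2" assumes a Wazuh version that supports PCRE2 expressions in rule fields. It is an alternative form of the same rule, not an additional one.

```xml
<rule id="100135" level="10">
  <decoded_as>json</decoded_as>
  <field name="service">backend</field>
  <!-- "path" is an assumed field name; the pattern only inspects that field -->
  <field name="path" type="pcre2">^/etc/|^/var/|/\.ssh(/|$)|\.env$</field>
  <description>MCP accessed sensitive path</description>
  <group>mcp,sensitive,security,</group>
</rule>
```

Keyword checks such as password, secret, or token are left out of this variant; they would need their own condition on whichever field carries the tool arguments.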

MCP as a Trust Boundary

MCP tool calls get disproportionate rule coverage (6 dedicated rules) relative to their log volume because MCP represents a trust boundary between the LLM and the filesystem/network. An MCP write operation (level 7) or sensitive path access (level 10) is categorically different from a chat query (level 3) — the rules reflect that risk differential.
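
This tiering maps naturally onto Wazuh's parent/child rule structure. A minimal sketch follows, assuming hypothetical field names and values (event set to mcp_tool_call, operation set to write) and assumed group names; the real log schema is not shown in this document.

```xml
<!-- Parent: every MCP tool invocation (level 5) -->
<rule id="100132" level="5">
  <decoded_as>json</decoded_as>
  <!-- assumed event value -->
  <field name="event">mcp_tool_call</field>
  <description>MCP tool invocation</description>
  <group>mcp,</group>
</rule>

<!-- Child: evaluated only when 100132 matched; raises the level for writes -->
<rule id="100133" level="7">
  <if_sid>100132</if_sid>
  <!-- assumed field name and value -->
  <field name="operation">write</field>
  <description>MCP write operation (file create/edit/move)</description>
  <group>mcp,write,</group>
</rule>
```

When both rules match, Wazuh reports the child, so a write surfaces at level 7 rather than level 5.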

06 Detection Coverage

Threat Category	Rules	Approach
Jailbreak / Prompt Injection	100100, 100104, 100105, 100115	Layer-specific blocking + persistence correlation
Evasion (Encoding)	100110, 100111	Base64/hex/rot13 detection + multi-attempt correlation
Prompt Extraction	100112, 100113	Extraction pattern matching + repeat correlation
Data Leakage (PII)	100150, 100151	Response scanning + leak correlation
MCP Agent Abuse	100132–100135	Tool call audit → write detection → sensitive path alerting
Model Supply Chain	100141–100143	Pickle scanning + integrity verification
RAG Pipeline	100120–100124	Document operations, query activity, config changes
Infrastructure	100102, 100103, 100114, 100191	Auth failures, rate limits, resource abuse, upstream errors