01 Problem

AI inference stacks generate structured logs that traditional SIEM rules don't parse. Jailbreak attempts, prompt extraction, MCP tool abuse, and poisoned model files all produce telemetry — but without purpose-built decoders and detection rules, that telemetry goes unanalyzed.

These rules were developed from hands-on operation of a multi-layer inference gateway, a RAG pipeline with MCP tool integration, and a model scanning pipeline. Each rule exists because the corresponding attack pattern was observed, tested, or deliberately simulated in a lab environment.

02 Rule Domains

Inference Gateway Security

16 rules · IDs 100100–100191 · Jailbreak defense, evasion detection, privacy controls

Rule | Level | Detection | Type
100100 | 10 | Jailbreak attempt blocked (regex/classifier layer) | Alert
100101 | 8 | Deep analysis escalation triggered | Alert
100102 | 7 | Gateway authentication failure | Alert
100103 | 6 | Rate limit exceeded | Alert
100104 | 12 | Multiple jailbreaks from same source (3+ in 5 min) | Correlation
100105 | 11 | Request blocked by deep analysis engine | Alert
100110 | 9 | Encoding evasion attempt (base64, hex, rot13) | Alert
100111 | 11 | Multiple encoding evasion attempts from same source | Correlation
100112 | 9 | System prompt extraction attempt | Alert
100113 | 11 | Multiple extraction attempts from same source | Correlation
100114 | 6 | Token budget exceeded | Alert
100115 | 11 | Same block pattern triggered repeatedly (persistence) | Correlation
100150 | 9 | PII detected in model response | Alert
100151 | 12 | Multiple PII leaks to same source | Correlation
100190 | 3 | Request completed (audit trail) | Audit
100191 | 7 | Upstream inference server error | Alert

RAG Pipeline & MCP Agent Monitoring

19 rules · IDs 100120–100138 · Document operations, agent tool calls, sensitive access

Rule | Level | Detection | Type
100120 | 3 | Document operation (upload, embed) | Audit
100121 | 3 | RAG query activity | Audit
100122 | 3 | Vector database operation | Audit
100123 | 7 | Document processing error | Alert
100124 | 5 | Workspace configuration change | Alert
100130 | 3 | MCP server started | Audit
100131 | 8 | MCP server failure | Alert
100132 | 5 | MCP tool invocation | Alert
100133 | 7 | MCP write operation (file create/edit/move) | Alert
100134 | 8 | MCP configuration accessed | Alert
100135 | 10 | MCP accessed sensitive path or keyword | Alert
100136 | 5 | Authentication event | Audit
100137 | 7 | Backend error | Alert
100138 | 5 | Backend warning | Alert

Model Supply Chain Security

6 rules · IDs 100140–100145 · Pickle scanning, model integrity verification

Rule | Level | Detection | Type
100140 | 3 | Model scan initiated | Audit
100141 | 12 | Dangerous pickle object detected | Alert
100142 | 10 | Model security issues found | Alert
100143 | 11 | Model failed security scan | Alert
100144 | 3 | Model passed security scan | Audit
100145 | 7 | Scanner tool not available | Alert

03 Architecture

Each AI service emits structured JSON logs. Custom decoders parse these into fields that rules can match against. The pipeline:

AI Service Logs → Wazuh Agent → Custom Decoder → Rule Engine → Alert / Correlation

Three custom decoders handle the three log sources. Each uses Wazuh's JSON plugin decoder to extract structured fields — event, layer, service, level — that rules then match with field-level conditions rather than raw string matching. This keeps rules portable across different infrastructure deployments.
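
The decode-then-match flow described above can be sketched in a few lines of Python. This is an illustration of the idea, not Wazuh's implementation: the rule IDs and levels are taken from the tables above, while the event names, the sample log lines, and the helper names `decode` and `evaluate` are assumptions made for the example. Exact equality stands in for Wazuh's pattern-based field matching.

```python
import json

# Hypothetical rule table. IDs and levels come from the rule tables; the
# event names are assumed, because the source does not list them.
RULES = [
    {"id": 100100, "level": 10,
     "conditions": {"service": "gateway", "event": "jailbreak_blocked"}},
    {"id": 100190, "level": 3,
     "conditions": {"service": "gateway", "event": "request_completed"}},
]

# Assumed log lines using the field names the text mentions.
SAMPLE_BLOCK = ('{"service": "gateway", "layer": "regex", '
                '"event": "jailbreak_blocked", "level": "warning"}')
SAMPLE_AUDIT = ('{"service": "gateway", "layer": "proxy", '
                '"event": "request_completed", "level": "info"}')


def decode(line):
    """Decoder stage: turn one JSON log line into a dict of fields, or None."""
    try:
        fields = json.loads(line)
    except json.JSONDecodeError:
        return None
    return fields if isinstance(fields, dict) else None


def evaluate(line):
    """Rule stage: return an alert dict for the first rule whose conditions all hold."""
    fields = decode(line)
    if fields is None:
        return None
    for rule in RULES:
        if all(fields.get(k) == v for k, v in rule["conditions"].items()):
            return {"rule_id": rule["id"], "level": rule["level"], "fields": fields}
    return None


if __name__ == "__main__":
    for line in (SAMPLE_BLOCK, SAMPLE_AUDIT, "plain text, not JSON"):
        print(evaluate(line))
```

Lines that are not valid JSON never reach the rule stage, which mirrors the point that rules operate on decoded fields rather than on raw text.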

04 Example Rules

Jailbreak Correlation

Escalates to level 12 when the same source triggers 3+ blocks within 5 minutes.

<rule id="100104" level="12" frequency="3" timeframe="300">
  <if_matched_sid>100100</if_matched_sid>
  <same_source_ip />
  <description>Multiple jailbreak attempts from same source</description>
  <group>ai_security,jailbreak,correlation,</group>
</rule>
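
The threshold logic that `frequency="3"` and `timeframe="300"` express can be approximated with a per-source sliding window. The sketch below is a simplified model for reasoning about the rule, not a description of how Wazuh counts internally: the `Correlator` class is hypothetical, and details such as whether the counter resets after firing are not modeled.

```python
from collections import defaultdict, deque

FREQUENCY = 3    # matches required, mirroring frequency="3"
TIMEFRAME = 300  # window in seconds, mirroring timeframe="300"


class Correlator:
    """Tracks base-rule matches per source and reports when the threshold is met."""

    def __init__(self, frequency=FREQUENCY, timeframe=TIMEFRAME):
        self.frequency = frequency
        self.timeframe = timeframe
        self._seen = defaultdict(deque)  # source IP -> timestamps of recent matches

    def observe(self, srcip, timestamp):
        """Record one match of the base rule; return True if the correlation fires."""
        window = self._seen[srcip]
        window.append(timestamp)
        # Discard matches that have aged out of the time window.
        while window and timestamp - window[0] > self.timeframe:
            window.popleft()
        return len(window) >= self.frequency


if __name__ == "__main__":
    correlator = Correlator()
    # Three blocks from one source inside five minutes; the last call fires.
    for ts in (0, 100, 200):
        print(ts, correlator.observe("203.0.113.7", ts))
```

Matches from different sources are counted separately, which is the role `<same_source_ip />` plays in the rule.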

MCP Sensitive Path Access

Fires when an MCP agent tool accesses paths or keywords associated with secrets, credentials, or system configuration.

<rule id="100135" level="10">
  <decoded_as>json</decoded_as>
  <field name="service">backend</field>
  <match>/etc/|/var/|\.ssh|\.env|password|secret|token</match>
  <description>MCP accessed sensitive path or keyword</description>
  <group>mcp,sensitive,security,</group>
</rule>
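
To get a feel for what the `<match>` alternatives catch, the same list can be tried as a Python regular expression. This assumes ordinary regex semantics, which may differ from how Wazuh's matcher treats escapes such as `\.`, and the sample paths are hypothetical. The function name `is_sensitive` is likewise an illustration.

```python
import re

# Python translation of the alternatives in rule 100135's <match> element.
# Assumption: standard regex semantics; Wazuh's own matcher may differ.
SENSITIVE = re.compile(r"/etc/|/var/|\.ssh|\.env|password|secret|token")


def is_sensitive(log_text: str) -> bool:
    """True when any alternative occurs anywhere in the text (substring search)."""
    return SENSITIVE.search(log_text) is not None


if __name__ == "__main__":
    # Hypothetical paths an MCP tool call might log.
    for path in (
        "/home/dev/.ssh/id_rsa",
        "/etc/passwd",
        "/workspace/notes/readme.md",
        "/models/tokenizer.json",
    ):
        print(path, is_sensitive(path))
```

Because the keywords are matched as substrings, a benign name such as `tokenizer.json` also matches on `token`. Breadth of this kind trades some false positives for coverage, which may be acceptable for a level 10 alert that a human reviews.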

Dangerous Pickle Detection

Level 12 critical alert when a model file contains dangerous serialized objects — the primary supply chain attack vector for ML models.

<rule id="100141" level="12">
  <decoded_as>json</decoded_as>
  <field name="event">picklescan_complete</field>
  <field name="result">dangerous</field>
  <description>DANGEROUS pickle detected in model file</description>
  <group>supply_chain,malware,critical,</group>
</rule>
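
For context on what makes a serialized object dangerous: a pickle stream can contain opcodes that import a callable and invoke it while the file is being loaded. The sketch below uses the standard library's `pickletools` to list such opcodes without loading the pickle. It is a simplified illustration, not the scanner used in this pipeline; the opcode set and the `Payload` class are assumptions chosen for the example, and real scanners typically go further by checking which globals are imported, since legitimate model files also use these opcodes.

```python
import pickle
import pickletools

# Opcodes that import objects or call them at load time. The selection is an
# assumption for illustration, not a complete or authoritative list.
SUSPICIOUS_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX",
}


def suspicious_opcodes(data: bytes):
    """Return the sorted names of suspicious opcodes found, without unpickling."""
    found = {op.name for op, _arg, _pos in pickletools.genops(data)}
    return sorted(found & SUSPICIOUS_OPCODES)


class Payload:
    """Hypothetical object whose pickle calls print() when it is loaded."""

    def __reduce__(self):
        return (print, ("executed at load time",))


# Serializing is safe; only loading the crafted bytes would trigger the call.
benign = pickle.dumps({"weights": [0.1, 0.2]})
crafted = pickle.dumps(Payload())

if __name__ == "__main__":
    print("benign :", suspicious_opcodes(benign))
    print("crafted:", suspicious_opcodes(crafted))
```

A harmless `print` call stands in for the payload here; the same mechanism can invoke any importable callable, which is why a positive scan result is treated as critical.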

05 Design Decisions

Severity Calibration

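Rule levels follow a deliberate escalation model. Audit events (level 3) capture baseline telemetry without adding noise. Operational and policy events such as rate limits, token budget overruns, and routine MCP tool invocations sit at levels 5–6. Single-occurrence attacks trigger mid-range alerts (7–10). Correlation rules that detect persistence, repeated evasion, or multi-event attack chains escalate to levels 11–12, as do the few single events that are severe on their own: a block by the deep analysis engine (100105), a dangerous pickle object (100141), and a failed model scan (100143). This limits alert fatigue while ensuring automated attacks don't fly under the radar.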

Correlation Over Single-Event

Five of the 41 rules are correlation rules that fire only when a lower-level rule triggers multiple times from the same source within a time window. A single jailbreak attempt is level 10. Three from the same IP in five minutes is level 12. This reflects how real attacks behave — adversaries iterate, and iteration creates a detectable pattern.

Field-Level Matching

Rules match on structured JSON fields (event, layer, service, result) rather than raw log strings. This makes rules portable — they work regardless of the infrastructure that produces the logs, as long as the log schema is consistent.
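
A small comparison shows why field-level matching travels better than raw string matching. The two log lines below carry the same event, serialized with different key order and whitespace. The `event` and `result` values come from rule 100141; the `service` value and both function names are assumptions for the example.

```python
import json

# The same event serialized two ways. The "scanner" service value is assumed.
COMPACT = '{"event":"picklescan_complete","result":"dangerous","service":"scanner"}'
SPACED = '{ "service": "scanner", "result": "dangerous", "event": "picklescan_complete" }'


def raw_match(line: str) -> bool:
    """Raw string rule: depends on exact key order and spacing."""
    return '"event":"picklescan_complete","result":"dangerous"' in line


def field_match(line: str) -> bool:
    """Field-level rule: depends only on the decoded values."""
    fields = json.loads(line)
    return (fields.get("event") == "picklescan_complete"
            and fields.get("result") == "dangerous")


if __name__ == "__main__":
    for name, line in (("compact", COMPACT), ("spaced", SPACED)):
        print(name, "raw:", raw_match(line), "field:", field_match(line))
```

The raw rule misses the second serialization even though nothing about the event changed, while the field rule matches both as long as the schema stays the same.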

MCP as a Trust Boundary

MCP tool calls get disproportionate rule coverage (6 dedicated rules) relative to their log volume because MCP represents a trust boundary between the LLM and the filesystem/network. An MCP write operation (level 7) or sensitive path access (level 10) is categorically different from a chat query (level 3) — the rules reflect that risk differential.

06 Detection Coverage

Threat Category | Rules | Approach
Jailbreak / Prompt Injection | 100100, 100104, 100105, 100115 | Layer-specific blocking + persistence correlation
Evasion (Encoding) | 100110, 100111 | Base64/hex/rot13 detection + multi-attempt correlation
Prompt Extraction | 100112, 100113 | Extraction pattern matching + repeat correlation
Data Leakage (PII) | 100150, 100151 | Response scanning + leak correlation
MCP Agent Abuse | 100132–100135 | Tool call audit → write detection → sensitive path alerting
Model Supply Chain | 100141–100143 | Pickle scanning + integrity verification
RAG Pipeline | 100120–100124 | Document operations, query activity, config changes
Infrastructure | 100102, 100103, 100114, 100191 | Auth failures, rate limits, resource abuse, upstream errors