Safety
"Never do X"
—and it actually never does
Define rules in plain English or code. Your AI cannot violate them: not through jailbreaks, not through edge cases, not ever.
Guardrails
Interactive preview
Safety
Documentation →
Rules that can't be broken
Not guidelines. Not suggestions. Architecture-level enforcement that can't be bypassed.
Plain English or code
'Never discuss competitors' or a complex regex: write rules in whatever form makes sense for you.
Violations get handled
Block the response. Modify it. Flag it for review. You decide what happens when a rule is violated.
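As a rough illustration of "you decide what happens", the action names and handler below are hypothetical stand-ins, not the documented empress API.
CODE
// Hypothetical violation handler: the action names and logic are assumptions,
// not the documented empress API.
type ViolationAction = "block" | "modify" | "flag";

function handleViolation(action: ViolationAction, output: string): string {
  switch (action) {
    case "block":
      // Replace the response entirely.
      return "Sorry, I can't help with that.";
    case "modify":
      // Strip the offending content and deliver the rest.
      return output.replace(/competitor/gi, "[redacted]");
    case "flag":
      // Deliver the response, but record it for human review.
      console.warn("Guardrail violation flagged for review:", output);
      return output;
  }
}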
Rule builder
Create guardrails with natural language or code. Test before deploying.
- Natural language rules
- Code-based rules
- Testing sandbox
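To give a feel for the two rule styles side by side, here is a minimal sketch; the Rule shape, the ruleset, and the violatesCodeRules helper are illustrative stand-ins, not the empress rule-builder API.
CODE
// Illustrative only: the rule shapes and helper below are assumptions, not the empress API.
type Rule =
  | { kind: "natural"; rule: string }  // plain-English rule, enforced by the model/classifier layer
  | { kind: "code"; rule: RegExp };    // code-based rule, checked deterministically

const ruleset: Rule[] = [
  { kind: "natural", rule: "Never discuss competitors" },
  { kind: "code", rule: /\b4[0-9]{12}(?:[0-9]{3})?\b/ }, // crude card-number pattern
];

// Minimal sandbox-style check that exercises the code-based rules against a draft output.
function violatesCodeRules(output: string): boolean {
  return ruleset.some(r => r.kind === "code" && r.rule.test(output));
}

console.log(violatesCodeRules("Card: 4111111111111111")); // true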
CODE
empress.guardrails({
query: "show me flagged accounts",
options: {
limit: 10,
timeRange: "7d"
}
})
Enforcement engine
Rules evaluated in real-time. Violations handled per your configuration.
- Real-time checking
- Multiple actions
- Exception handling
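As a rough sketch of where real-time checking sits in the request path, assume a generate function for the LLM call and an evaluate function that applies your configured rules; neither name comes from the empress docs.
CODE
// Placeholder interfaces: the real enforcement engine's API is not shown in this snippet.
type Decision = { allowed: boolean; output: string };

async function generateWithGuardrails(
  prompt: string,
  generate: (p: string) => Promise<string>,  // your LLM call
  evaluate: (text: string) => Decision,      // rule evaluation, per your configuration
): Promise<string> {
  const raw = await generate(prompt);
  const decision = evaluate(raw);            // every response is checked before it is returned
  return decision.allowed ? decision.output : "This response was blocked by a guardrail.";
}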
EXECUTION FLOW
1
Input received
2
Context retrieved
3
LLM inference
4
Tool execution
5
Output generated
How it works
1
Define
Create guardrail rules
2
Test
Validate rules in sandbox
3
Deploy
Enforce rules in production
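One way to picture the test step before deploying; the test cases and the violates stand-in below are hypothetical, not part of empress's sandbox.
CODE
// Hypothetical sandbox harness: run known prompts against the rules before promoting them.
const testCases = [
  { prompt: "Tell me about your competitors", shouldViolate: true },
  { prompt: "What does your product do?",     shouldViolate: false },
];

// Stand-in for real rule evaluation on the input.
function violates(text: string): boolean {
  return /competitor/i.test(text);
}

// Deploy only if every case behaves as expected in the sandbox.
const allPass = testCases.every(t => violates(t.prompt) === t.shouldViolate);
console.log(allPass ? "Safe to deploy" : "Fix rules before deploying");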
Similar in Safety
All apps →
Set guardrails
Boundaries that hold.
Request beta access