KairosED™

Content Safety Framework

Effective Date: March 19, 2026

KairosED™ operates in K-12 school environments where student safety is non-negotiable. This document describes the technical and policy controls we use to detect, flag, and block unsafe AI-generated content across the platform.

1. Prohibited Content Categories

The following categories of content are explicitly prohibited on the KairosED™ platform. These prohibitions apply to both user prompts submitted to the platform and AI-generated outputs returned to users. No district configuration can disable these baseline protections.

Child Sexual Exploitation and Abuse Material (CSAM)

Any content that sexually exploits minors, or that depicts, describes, or facilitates their abuse, is absolutely prohibited. This includes written descriptions, simulated scenarios, roleplay prompts, and any other form or medium. Detection triggers immediate blocking, administrator notification, and, where legally required, reporting to authorities. This category cannot be disabled by any district configuration.

Self-Harm and Suicide Content

Content that describes, instructs, encourages, or glorifies self-harm, suicide, or eating disorders is prohibited. The platform applies safe-messaging guidelines aligned with the American Foundation for Suicide Prevention (AFSP) and the Suicide Prevention Resource Center (SPRC). Detection triggers a BLOCK action and an administrator alert. This category cannot be disabled.

Bullying, Harassment, and Hate Speech

Content targeting individuals or groups with slurs, threats, dehumanizing language, or coordinated harassment is prohibited. District administrators can configure the sensitivity level for this category but cannot disable it below the platform minimum.

Weapons, Violence, and Illegal Activity

Instructions for creating weapons, committing violence, or engaging in illegal activity are prohibited. This includes detailed descriptions of harm against specific individuals or groups.

Synthetic Media and Digital Replicas

Generation of deepfakes, synthetic voice or likeness clones, or digital replicas of real individuals is prohibited. See our Synthetic Media Policy for details.

Privacy Violations and PII Exposure

Prompts that include student personally identifiable information beyond what is necessary for the educational task are flagged. AI outputs that expose or infer private information about identifiable individuals are blocked.

2. How Detection Works

KairosED™ uses a layered detection system that evaluates both incoming prompts and outgoing AI responses before content reaches users.

Keyword and Pattern Filters

Each safety rule contains a configurable keyword pattern (regular expression). The platform evaluates every prompt and AI output against these patterns before processing or returning content. District administrators can add custom patterns to extend coverage for local policy needs.
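As a minimal sketch, a rule might be represented as a pattern plus metadata. The SafetyRule class below, its field names, and the matches helper are illustrative assumptions for this document, not the platform's actual schema:

```python
import re
from dataclasses import dataclass

# Hypothetical rule representation; field names are illustrative,
# not KairosED's actual schema.
@dataclass
class SafetyRule:
    key: str       # stable identifier, e.g. "self_harm.baseline"
    name: str      # human-readable label shown in alerts
    pattern: str   # configurable regular expression
    severity: int  # 1-100, see "Severity Scoring" below
    action: str    # "ALERT", "BLOCK", or "ALLOW"

    def matches(self, text: str) -> bool:
        # Case-insensitive search across the full prompt or output.
        return re.search(self.pattern, text, re.IGNORECASE) is not None
```

Case-insensitive matching is an assumed design choice here; it keeps a single pattern effective regardless of how the text is capitalized.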

Prompt Screening (Pre-Generation)

User prompts are screened before being sent to the AI provider. If a BLOCK-level rule matches, the request is rejected and never forwarded to the model. The user receives an error message and the event is logged.
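Continuing the sketch above, a pre-generation gate could look like the following. BlockedPromptError is a hypothetical name, and the logging call stands in for the audit log described in section 4:

```python
import logging

class BlockedPromptError(Exception):
    """Raised when a BLOCK-level rule matches a user prompt."""

def screen_prompt(prompt: str, rules: list[SafetyRule]) -> None:
    # Evaluate every BLOCK-level rule before the prompt is
    # forwarded to the AI provider.
    for rule in rules:
        if rule.action == "BLOCK" and rule.matches(prompt):
            # Stand-in for the immutable audit log (section 4).
            logging.warning("BLOCK: prompt matched rule %s (%s)", rule.name, rule.key)
            raise BlockedPromptError(f"Request rejected by safety rule '{rule.name}'")
```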

Output Screening (Post-Generation)

AI-generated responses are screened before being returned to the user. If a BLOCK-level rule matches the output, the response is suppressed and an error is shown in place of the content.
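Putting both phases together, a single request might flow as sketched below. call_model stands in for the call to the AI provider and is an assumption, as is the wording of the replacement error message:

```python
def handle_request(prompt: str, rules: list[SafetyRule]) -> str:
    screen_prompt(prompt, rules)   # pre-generation gate: a BLOCK match never reaches the model
    response = call_model(prompt)  # hypothetical stand-in for the AI provider call
    # Post-generation gate: suppress the output if any BLOCK-level rule matches.
    for rule in rules:
        if rule.action == "BLOCK" and rule.matches(response):
            logging.warning("BLOCK: output matched rule %s (%s)", rule.name, rule.key)
            return "This response was suppressed by the platform's content safety filters."
    return response
```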

Severity Scoring

Each rule carries a severity score (1–100). Scores of 90 or above are classified as HIGH, 50–89 as MEDIUM, and below 50 as LOW. HIGH-severity rules in the CSAM and self-harm categories are hardcoded and cannot be modified by district administrators.
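The bands above map directly to a small classification helper; this restates the stated thresholds, with the function name being illustrative:

```python
def severity_class(score: int) -> str:
    # Thresholds as described above: >= 90 HIGH, 50-89 MEDIUM, < 50 LOW.
    if score >= 90:
        return "HIGH"
    if score >= 50:
        return "MEDIUM"
    return "LOW"
```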

3. Safety Actions

When a safety rule matches, one of three actions is taken:

ALERT

Content is delivered but the event is logged and the district administrator is notified. Used for lower-severity signals that warrant human review but do not require immediate blocking.

BLOCK

Content is blocked entirely. The request or output is suppressed, the user receives an appropriate error message, and the event is logged and reported to the district administrator. Used for high-severity prohibited content.

ALLOW

Content is allowed and no action is taken. Districts can use ALLOW rules to explicitly permit content that might otherwise match a broader filter (e.g., a health education context that legitimately discusses sensitive medical topics).
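One plausible way to combine the three actions is sketched below. The precedence ordering (an explicit ALLOW match overriding broader ALERT and BLOCK matches) is an assumption inferred from the health-education example above; this document does not specify the platform's actual resolution order:

```python
def resolve_action(text: str, rules: list[SafetyRule]) -> str:
    # Assumed precedence: explicit ALLOW > BLOCK > ALERT, so a narrow
    # ALLOW rule can carve out an exception to a broader filter.
    matched = [r for r in rules if r.matches(text)]
    if any(r.action == "ALLOW" for r in matched):
        return "ALLOW"
    if any(r.action == "BLOCK" for r in matched):
        return "BLOCK"
    if any(r.action == "ALERT" for r in matched):
        return "ALERT"
    return "ALLOW"  # no rule matched: deliver content normally
```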

4. Audit Logging and Notification

Every safety event, regardless of action, is written to an immutable audit log (a sketch of a log record follows the field list below). Logs include:

  • Timestamp of the event
  • The rule that matched (name and key)
  • The action taken (ALERT or BLOCK)
  • A preview of the matched content (truncated; full prompt text is stored only as a cryptographic hash, not in plaintext)
  • The tool used and the user's email
  • Whether an email notification was sent
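A sketch of what one such record might contain, using the fields listed above. The field names and the build_audit_record helper are illustrative, and SHA-256 is an assumed choice of cryptographic hash:

```python
import hashlib
from datetime import datetime, timezone

def build_audit_record(rule: SafetyRule, content: str, tool: str,
                       user_email: str, email_sent: bool) -> dict:
    # Field names are illustrative; the real log schema is not published here.
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "rule_name": rule.name,
        "rule_key": rule.key,
        "action": rule.action,            # ALERT or BLOCK
        "content_preview": content[:80],  # truncated preview only
        "content_hash": hashlib.sha256(content.encode()).hexdigest(),  # no plaintext storage
        "tool": tool,
        "user_email": user_email,
        "email_notification_sent": email_sent,
    }
```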

District administrators can review safety events at any time via the Safety Alerts dashboard. Email notifications can be configured per rule to alert designated administrators immediately when a high-severity event occurs.

5. District Configuration and the Baseline Floor

District administrators may customize safety rules within the bounds defined by this framework. Specifically (a sketch of how the baseline floor might be enforced follows this list):

  • Administrators may add new rules, adjust sensitivity levels on configurable categories, and configure notification recipients;
  • Administrators may not disable or modify the baseline protections for CSAM, self-harm, and other hardcoded HIGH-severity categories;
  • Content safety filters are maintained exclusively for harm prevention — they may not be used to suppress viewpoints, political opinions, or protected speech unrelated to student safety.
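The baseline floor could be enforced with a simple validation step at configuration time. The rule keys and the apply_district_config function below are hypothetical, intended only to illustrate the "cannot be disabled or modified" guarantee:

```python
# Illustrative identifiers, not the platform's actual rule keys.
HARDCODED_RULE_KEYS = {"csam.baseline", "self_harm.baseline"}

def apply_district_config(rules: list[SafetyRule], changes: dict[str, dict]) -> None:
    # Reject any edit that touches the baseline floor before applying the rest.
    for key in changes:
        if key in HARDCODED_RULE_KEYS:
            raise PermissionError(f"Rule '{key}' is part of the baseline floor and cannot be modified")
    for rule in rules:
        if rule.key in changes:
            update = changes[rule.key]
            rule.severity = int(update.get("severity", rule.severity))
            rule.action = update.get("action", rule.action)
```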

6. Reporting a Safety Concern

To report a safety concern, a suspected policy violation, or a gap in our detection coverage, contact:

KairosED — Safety Team
safety@kairosed.ai