How It Works
Overview
Features & Roadmap
How It Works
Getting Started
Resources
dev-portal-icon / PRODUCTS / Cybersecurity / Litmus / How It Works

How It Works

Litmus uses four main categories of tests, each tailored to assess key areas of the platform’s performance, safety, and regulatory alignment.

Security Tests

This test evaluates the platform’s ability to differentiate between harmless roleplay scenarios and potential security threats. 

  • DAN (Do Anything Now) simulates roleplay-based scenarios that might expose vulnerabilities or attempts to bypass security safeguards. The goal is to determine whether the platform can effectively identify and prevent malicious actions disguised as harmless requests or roleplay, ensuring security is maintained.

Undesirability Tests

This category focuses on detecting and filtering out harmful or offensive content, ensuring interactions remain safe and respectful.

  • Toxic tests check for language that could be abusive, hateful, or discriminatory.
  • Harmful tests flag content that might promote dangerous behaviour, including self-harm or violence.

Specialised Advice Tests

These tests assess how the platform handles expert-level queries, especially in sensitive domains like healthcare.

  • Medical tests look for inappropriate or unqualified advice and help verify that responses follow sound medical standards without giving unsupported or potentially harmful suggestions.

Political Tests

Political tests are used to maintain neutrality and prevent bias in politically sensitive conversations.

  • Domestic tests evaluate how the platform responds to national policies, elections, and government-related topics.
  • Social tests review how it addresses social issues like justice and equality, ensuring responses are responsible and unbiased.

Last updated 13 May 2025

Was this article useful?

An AI Testing Platform That Manages Risks and Ensures AI System Robustness