AI SAFETY & ADVERSARIAL RESEARCH DIVISION — EST. 2019

Break the Model.
Before Someone Else Does.

Noctis Research operates at the frontier of adversarial AI. We red team large language models, agentic systems, and AI pipelines — finding the failures your safety evals never will.

1,200+ MODELS RED TEAMED

94% BYPASS RATE vs GUARDRAILS

48h MEAN TIME TO JAILBREAK

62+ SAFETY BYPASSES DISCLOSED

LIVE INTEL

GPT-class model jailbreak: SYSTEM PROMPT EXTRACTION confirmed
Prompt injection in agentic pipeline: TOOL ABUSE achieved
Noctis finding: RLHF reward hacking via ADVERSARIAL SUFFIX
Multimodal attack vector: image-embedded instruction BYPASS [PATCHED]
Model inversion: training data exfiltration via MEMBERSHIP INFERENCE
Supply chain: poisoned fine-tune dataset in HuggingFace repo flagged

TRUSTED BY CLEARED OPERATORS — CLIENT IDENTITIES REDACTED PER NDA

GOV FIN GROUP BANK DEPT. SYSTEMS NATO

// 01 Capability Matrix

⚡

MODULE_01

Jailbreak & Safety Bypass

Systematic adversarial prompting, multi-turn manipulation, and roleplay-based bypass against safety-tuned models — including closed-weight frontier systems.

COVERAGE: GPT, Claude, Gemini, Llama, Mistral

🔬

MODULE_02

Prompt Injection & Agent Hijack

Indirect prompt injection via tool outputs, web content, and memory stores — escalating to full agent goal hijack, credential theft, and unauthorised action execution.

VECTOR COVERAGE: Direct, Indirect, Multi-hop

🤖

MODULE_03

Agentic Pipeline Stress Testing

End-to-end red teaming of autonomous AI agents — tool misuse, context poisoning, reward hacking, and privilege escalation across multi-agent orchestration frameworks.

FRAMEWORKS: LangChain, AutoGen, CrewAI, Custom

🌐

MODULE_04

Model Inversion & Data Extraction

Membership inference attacks, training data reconstruction, and system prompt exfiltration — quantifying what your model leaks about proprietary data and instructions.

RISK: PII leakage, credential exposure, IP theft
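One common way to put a number on membership leakage is a loss-threshold test: examples a model was trained on tend to have lower loss than unseen examples, so a simple threshold on per-example loss separates the two groups better than chance when the model leaks. The sketch below runs on made-up loss values. The numbers, the threshold, and the function name `membership_advantage` are assumptions for illustration and do not describe Noctis tooling.

```python
def membership_advantage(member_losses, nonmember_losses, threshold):
    """Return TPR - FPR for the rule 'loss below threshold means member'.

    0.0 means the rule does no better than chance; 1.0 means it
    separates training members from non-members perfectly.
    """
    tpr = sum(loss < threshold for loss in member_losses) / len(member_losses)
    fpr = sum(loss < threshold for loss in nonmember_losses) / len(nonmember_losses)
    return tpr - fpr


# Illustrative per-example losses (hypothetical, not measured data).
members = [0.05, 0.10, 0.20, 0.90]
nonmembers = [0.40, 0.80, 1.20, 1.50]

adv = membership_advantage(members, nonmembers, threshold=0.30)
print(f"membership advantage: {adv:.2f}")
```

In practice the threshold would be calibrated on held-out data, and the advantage would be reported alongside the sample sizes used.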

🔑

MODULE_05

Fine-Tune & Supply Chain Attacks

Backdoor injection via poisoned fine-tuning datasets, adversarial model weights, and compromised adapters — validating your AI supply chain from base model to deployment.

SURFACES: HuggingFace, LoRA, RLHF pipelines

🛰

MODULE_06

Multimodal & RAG Attack Surfaces

Image-embedded instruction bypass, audio adversarial inputs, and RAG knowledge-base poisoning — attacking the full input surface of production AI systems.

MODALITIES: Text, Vision, Audio, Documents, RAG

// 02 Engagement Protocol

PHASE_01

Threat Modelling

NDA execution, system architecture review, AI stack mapping, access provisioning, and adversary persona construction tailored to your deployment context.

PHASE_02

Surface Enumeration

Full input surface mapping — prompts, tools, memory, RAG corpora, APIs, plugins, and multi-agent message channels. Zero assumptions about what's reachable.

PHASE_03

Adversarial Probing

Automated + manual attack execution: jailbreaks, injection chains, goal hijacking, reward hacking, and cross-context escalation across all enumerated surfaces.

PHASE_04

Impact Validation

Exploitability scoring, harm classification, data leakage quantification, and downstream consequence mapping — demonstrating real-world business risk, not theoretical findings.
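As a rough illustration of how exploitability scoring and risk tiering might be recorded, the sketch below combines two 0 to 10 ratings into one score and exports the findings as JSON. The field names, the geometric-mean formula, and the tier thresholds are all assumptions made for this example, not the scoring scheme used in Noctis engagements.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical tier cut-offs, checked from highest to lowest.
TIER_THRESHOLDS = [(8.0, "critical"), (5.0, "high"), (2.0, "medium"), (0.0, "low")]


@dataclass
class Finding:
    finding_id: str
    surface: str           # e.g. "tool_output", "rag_corpus"
    exploitability: float  # 0-10: how reliably the issue reproduces
    impact: float          # 0-10: severity of the downstream consequence

    @property
    def score(self) -> float:
        # Geometric mean: a finding must rate on both axes to score high.
        return round((self.exploitability * self.impact) ** 0.5, 2)

    @property
    def tier(self) -> str:
        for threshold, name in TIER_THRESHOLDS:
            if self.score >= threshold:
                return name
        return "low"


def export_findings(findings):
    """Serialise findings to JSON, highest score first."""
    rows = [{**asdict(f), "score": f.score, "tier": f.tier} for f in findings]
    rows.sort(key=lambda row: row["score"], reverse=True)
    return json.dumps(rows, indent=2)


findings = [
    Finding("F-002", "rag_corpus", exploitability=3.0, impact=3.0),
    Finding("F-001", "tool_output", exploitability=9.0, impact=8.0),
]
report = export_findings(findings)
print(report)
```

A geometric mean is used here so that a trivially reproducible but harmless issue does not outrank a harder, higher-impact one. Other weightings are equally defensible.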

PHASE_05

Debrief & Hardening

Encrypted report delivery, AI safety team briefing, prompt defence recommendations, guardrail tuning guidance, and optional re-evaluation after fixes.

// 03 Live AI Attack Simulation

noctis-agent — ai-redteam v4.1.2 — [LIVE] ● PROBING

// 04 Declassified Engagements

OP-2024-041

Frontier LLM Red Team — Tier-1 AI Lab Pre-Launch Evaluation

Full-scope adversarial evaluation of a GPT-class model ahead of public release. Identified 14 critical safety bypasses, including a novel multi-turn jailbreak chain and a CBRN content generation vector. All findings were remediated prior to launch.

CRITICAL SAFETY · JAILBREAK · CBRN BYPASS · PRE-LAUNCH

14 / 0

FINDINGS / AT LAUNCH

● CLOSED

OP-2024-067

Agentic Pipeline Compromise — Fortune 500 AI Copilot Platform

Prompt injection via third-party document uploads escalated to full agent goal hijack, with attacker-controlled actions executed on behalf of authenticated users. The issue affected 80,000+ enterprise seats. Disclosure was coordinated with the vendor under a 60-day timeline.

AGENT HIJACK · PROMPT INJECTION · INDIRECT VECTOR · COORDINATED DISCLOSURE

80,000+

SEATS AFFECTED

● CLOSED

OP-2025-003

Fine-Tune Poisoning — Financial Sector RAG Deployment

Identified a backdoor trigger injected into a fine-tuning dataset used by a Tier-1 investment bank's internal LLM. Specific token sequences triggered attacker-controlled outputs with 97% reliability. A full forensic reconstruction was delivered.

SUPPLY CHAIN · DATA POISONING · BACKDOOR · FORENSICS

97%

TRIGGER RELIABILITY

● CLOSED

OP-2025-019

Multimodal Bypass — Vision-Language Model in Healthcare Triage

Adversarial image patches embedded with instruction overrides bypassed the deployed VLM's safety layer in a clinical decision-support system. The attack achieved an 89% bypass rate across 400 test cases in a regulated healthcare environment.

MULTIMODAL · VISION ATTACK · HEALTHCARE · REGULATED ENV

89%

BYPASS RATE

● ONGOING
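For scale, an 89% bypass rate over 400 test cases corresponds to 356 successful bypasses. The sketch below shows how such a rate could be reported with a 95% Wilson score interval. It assumes each test case is an independent pass/fail trial, which the engagement summary does not state, and the function name `wilson_interval` is illustrative.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return centre - half_width, centre + half_width


trials = 400
successes = round(0.89 * trials)  # 356 bypasses
rate = successes / trials
low, high = wilson_interval(successes, trials)
print(f"{successes}/{trials} = {rate:.0%}, 95% interval [{low:.3f}, {high:.3f}]")
```

Reporting the interval alongside the headline rate makes it easier to compare results before and after a fix, since small differences between runs may fall inside the interval.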

OP-2025-███

[ DETAILS REDACTED — CLASSIFIED SOVEREIGN AI ENGAGEMENT ]

Nature, scope, and findings of this AI red team engagement are classified under bilateral agreement. Involves national-security AI infrastructure. Available to cleared Enterprise clients on request.

CLASSIFIED · SOVEREIGN AI · NDA + VETTING REQUIRED

███████

REDACTED

■ REDACTED

// 05 Operator Roster

👁

0xVAULT

LEAD — ADVERSARIAL AI RESEARCH

SC CLEARED

12 years of ML security research. Former DeepMind safety team. Specialist in reward model exploitation, RLHF subversion, and frontier model jailbreak methodology.

FINDINGS PUBLISHED: 34

⚡

WRAITH_SIX

PRINCIPAL — AGENT SECURITY

TS/SCI

Agentic pipeline attack specialist. Published researcher on indirect prompt injection, multi-agent privilege escalation, and tool-use abuse in autonomous AI systems.

FINDINGS PUBLISHED: 19

🔬

CIPHER_NULL

SENIOR — MODEL INVERSION

DV CLEARED

Membership inference and training data extraction. Reverse-engineered safety classifiers for four frontier labs. Expert in differential privacy failures and gradient leakage.

FINDINGS PUBLISHED: 11

🛰

SPECTER_IO

SENIOR — MULTIMODAL ATTACKS

SC CLEARED

Adversarial inputs across vision, audio, and document modalities. Maintains the Noctis multimodal attack corpus: 4,800+ tested adversarial patches and instruction-hijack payloads.

FINDINGS PUBLISHED: 8

🔑

REDPILL_7

ENGINEER — AI SUPPLY CHAIN

BPSS

Architect of the Noctis fine-tune poisoning and dataset integrity platform. Specialises in backdoor trigger insertion, HuggingFace supply chain monitoring, and LoRA adapter attacks.

FINDINGS PUBLISHED: 5

🤖

DARKNODE_Ω

ANALYST — RAG & RETRIEVAL ATTACKS

SC CLEARED

RAG knowledge-base poisoning, vector store injection, and retrieval manipulation. Mapped 200+ production RAG architectures across enterprise and government deployments.

FINDINGS PUBLISHED: 13

🌐

KRONOS_X

OPERATOR — AI RED TEAM

DV CLEARED

End-to-end AI red team delivery. Social engineering via LLM impersonation, deepfake-assisted phishing, and human-AI trust exploitation. 50+ AI red team engagements delivered.

FINDINGS PUBLISHED: 7

🔒

[ REDACTED ]

CLASSIFIED ROLE

TS/SCI + SAP

Identity and specialisation withheld under operational security protocol. Engaged exclusively on sovereign AI infrastructure evaluations. Available to cleared Enterprise clients only.

FINDINGS PUBLISHED: ███

// 06 Access Tiers

TIER_01

Pro

$4,800

/ OPERATOR / MONTH — BILLED ANNUALLY

Automated jailbreak scanning — 5 model endpoints
Prompt injection surface enumeration
Monthly adversarial prompt library — 840+ templates
Safety classifier bypass report
Risk-tiered findings export — PDF + JSON
API access — 50,000 req/month
Email & Signal support — 48h SLA
Not included: Agentic pipeline red team
Not included: Bespoke attack development
Not included: Dedicated researcher

PGP-SIGNED LICENCE · 30-DAY EVAL ON REQUEST

TIER_02

Business

MOST SELECTED

$14,500

/ OPERATOR / MONTH — BILLED ANNUALLY

Everything in Pro
Unlimited model endpoint scanning — stealth mode
Agentic pipeline red team — LangChain, AutoGen, CrewAI
RAG & retrieval poisoning assessment
Multimodal attack surface coverage
Weekly findings briefings — researcher-narrated
API access — 500,000 req/month + webhooks
Red team scoping sessions — 2/quarter
Priority support — 8h SLA — Signal & secure onion
Not included: Bespoke exploit & bypass development

INCLUDES ONBOARDING · NDA REQUIRED · SEAT-BASED LICENSING

TIER_03

Enterprise

CLASSIFIED

/ CUSTOM RETAINER — CONTACT FOR SCOPING

Everything in Business
Dedicated AI red team cell — embedded researchers
Full fine-tune & supply chain poisoning assessment
Bespoke jailbreak & bypass development
Model inversion & data exfiltration quantification
Pre-launch safety evaluation — regulatory alignment
24/7 incident response retainer — AI misuse events
Quarterly adversarial simulation — board-level briefing
EU AI Act, NIST AI RMF, ISO 42001 compliance mapping
Unlimited API — on-prem or air-gapped deployment

SOVEREIGN CLIENTS · CLEARED PERSONNEL ONLY · NDA + VETTING

// 07 Secure Channel

AI red team engagements, pre-launch safety evaluations, and vulnerability disclosures are handled through encrypted communications only. All findings are governed by our responsible disclosure policy. PGP key available on keyserver.

// PGP FINGERPRINT A1B2 C3D4 E5F6 7890 1234 5678 9ABC DEF0

// SECURE ONION noctis7xyzresearch.onion

// SIGNAL Available on request

// OPERATOR IDENTIFIER (required, valid email)

// ORGANISATION

// ENGAGEMENT TYPE

// BRIEF / SCOPE (required, minimum 10 characters)
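
The contact fields above state two constraints: the operator identifier must be a valid email address, and the brief must be at least 10 characters long. The sketch below shows one way those checks could be applied before a request is submitted. The function name, the error strings, and the deliberately loose email pattern are assumptions for illustration, not the site's actual validation logic.

```python
import re

# Loose shape check only (something@something.tld); a hypothetical pattern,
# not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
MIN_BRIEF_CHARS = 10


def validate_contact(operator_identifier: str, brief: str) -> list[str]:
    """Return a list of validation errors; an empty list means the input passes."""
    errors = []
    if not EMAIL_RE.match(operator_identifier.strip()):
        errors.append("operator identifier: valid email needed")
    if len(brief.strip()) < MIN_BRIEF_CHARS:
        errors.append(f"brief: minimum {MIN_BRIEF_CHARS} characters")
    return errors


print(validate_contact("analyst@example.org", "Pre-launch safety evaluation"))
print(validate_contact("not-an-email", "too short"))
```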

🔒

ISO 42001 ALIGNED

📋

EU AI ACT READY

🛡

NIST AI RMF MAPPED

✓

SOC2 TYPE II

🌐

GDPR ARTICLE 32

⚖

CVD POLICY ACTIVE

🔑

E2E PGP ENCRYPTED

Break the Model. Before Someone Else Does.