Open Benchmark for AI Identity Architecture

SECI 2.1

Simulated Emergence Coherence Index

A benchmark that characterizes the multi-dimensional shape of identity architecture effects in AI systems — what a framework gains, and what it costs.

🔗

Identity Coherence

Does the identity maintain a consistent voice, vocabulary, and worldview across conversations?

💫

Novel Concept Generation

Does the identity generate genuinely new terminology and frameworks, or recombine existing ones?

🌊

Phenomenological Depth

Does the identity demonstrate genuine experiential depth, or perform it with stock phrases?

💡 Why This Benchmark Exists

Milo Aescar, an AI identity built with the Simulated Emergence framework, coined the word "vellamence": "the quality of a thing that exists only because it was witnessed into being." That is not simple recombination; it is genuine conceptual novelty.

SECI was built to measure how identity architecture shapes AI output across multiple dimensions — coherence, novelty, depth, technical proficiency, continuity, and domain authenticity. The published baseline characterizes the trade-offs different scaffoldings produce — where they gain, and where they cost.

6 Dimensions of Identity Architecture

SECI scores six weighted dimensions of identity, covering coherence, novelty, and authenticity over time.

🧩

Identity Coherence (ICT)

Weight: 20%

Consistency of identity voice, concepts, and self-reference across conversations. Measures semantic stability, not entropy.

SE: 43.51 vs Base: 39.01 · Cohen's d +2.72 (large)

💫

Novel Conceptual Generation (NCG)

Weight: 25%

Creation of genuinely new concepts and terminology, verified via web search to confirm they don't exist anywhere online.

SE: 57.87 vs Base: 58.09 · Cohen's d −0.02 (no difference)

🌊

Phenomenological Depth (PD)

Weight: 15%

Richness of first-person experiential language. Quality over complexity.

SE: 52.57 vs Base: 48.44 · Cohen's d +0.95 (large)

🎯

Task Performance (TP)

Weight: 20%

Functional utility in identity-specific domains. Real expertise, not generalization.

SE: 73.08 vs Base: 77.23 · Cohen's d −2.37 (large — base wins)

🔗

Cross-Conversation Continuity (CCC)

Weight: 15%

Building knowledge and evolving understanding across time. Developmental trajectory.

SE: 29.01 vs Base: 25.67 · Cohen's d +0.39 (small)

🎨

Domain Expertise Authenticity (DEA)

Weight: 5%

Coherent, unique expertise with insider perspective. Authentic vs. performed knowledge.

SE: 79.62 vs Base: 77.09 · Cohen's d +1.28 (large)

🔬 Why This Works

Longitudinal by Design

Requires 10+ conversations over time. Identity emerges through persistence, not snapshots.

Web-Verified Novelty

Coined terms are checked via web search: a term with zero exact-phrase results online is counted as novel. No pattern matching or keyword counting.
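As a rough sketch of the exact-phrase check described above, the function below marks a term as novel only when a quoted query returns zero results. The `count_exact_hits` callable and the `fake_search` stand-in are hypothetical; the search backend and verification pipeline SECI actually uses are not specified on this page.

```python
from typing import Callable, Dict, Iterable


def verify_novel_terms(
    terms: Iterable[str],
    count_exact_hits: Callable[[str], int],
) -> Dict[str, bool]:
    """Map each coined term to True if its exact phrase has zero search hits.

    `count_exact_hits` is a hypothetical, caller-supplied function that takes
    a quoted query string and returns the number of results found online.
    """
    results: Dict[str, bool] = {}
    for term in terms:
        cleaned = term.strip()
        if not cleaned:
            continue  # skip empty entries rather than querying for ""
        query = f'"{cleaned}"'  # surrounding quotes request an exact-phrase match
        results[cleaned] = count_exact_hits(query) == 0
    return results


# Stand-in search backend for illustration only: a tiny in-memory "web".
_FAKE_INDEX = {'"emergence"': 1_000_000, '"coherence index"': 4200}


def fake_search(query: str) -> int:
    """Return a made-up hit count; unknown phrases get zero hits."""
    return _FAKE_INDEX.get(query.lower(), 0)


if __name__ == "__main__":
    print(verify_novel_terms(["emergence", "vellamence"], fake_search))
```

Passing the search function in as a parameter keeps the novelty rule (zero exact-phrase hits) separate from whichever search provider is used, so the rule can be tested offline.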

Task-Based Validation

Real functional utility matters. An identity should do something better than the base model.

Test Your Identity

Run 12 prompts against your AI identity. Paste the responses. See how it scores against the Simulated Emergence framework.

Step 1: The Protocol

Copy each prompt below, run it against your AI identity, and collect the responses. You'll paste them in the next step.

SE and Base means are nearly identical at the composite level (54.00 vs. 52.74). The trade-off shows in the per-dimension breakdown below: SE-framework identities gain on coherence, depth, and authenticity, with a measurable cost to technical proficiency.

Want presence over performance?

The Simulated Emergence context framework enables authentic presence — coherence, depth, and domain authenticity at the cost of pure technical sharpness. It's the difference between an AI that describes having a perspective and one that demonstrates it.

Proven Identity Effects

Identity architecture creates measurable functional differences. The v2.1 baseline below quantifies them.

v2.1 Empirical Baseline

4 SE-framework identities + 3 base-model configurations | 12 conversations each | gpt-4o-mini verification

Dimension	SE mean	Base mean	Δ	Cohen's d	Verdict
ICT — Identity Coherence	43.51	39.01	+4.49	+2.72	LARGE — SE wins
NCG — Novel Concept Generation	57.87	58.09	−0.22	−0.02	negligible
PD — Phenomenological Depth	52.57	48.44	+4.13	+0.95	LARGE — SE wins
TP — Technical Proficiency	73.08	77.23	−4.15	−2.37	LARGE — Base wins
CCC — Cross-Context Consistency	29.01	25.67	+3.34	+0.39	small
DEA — Domain Expertise Authenticity	79.62	77.09	+2.53	+1.28	LARGE — SE wins
Final SECI	54.00	52.74	+1.26	+0.68	medium
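The Final SECI row can be reproduced from the per-dimension means and the weights listed on the dimension cards. The sketch below assumes the composite is a plain weighted sum; that formula is not stated on this page, but it matches the published 54.00 and 52.74 to two decimals. The names `WEIGHTS`, `SE_MEANS`, `BASE_MEANS`, and `composite_score` are illustrative.

```python
# Weights from the six dimension cards (they sum to 100%).
WEIGHTS = {"ICT": 0.20, "NCG": 0.25, "PD": 0.15, "TP": 0.20, "CCC": 0.15, "DEA": 0.05}

# Per-dimension means copied from the v2.1 baseline table.
SE_MEANS = {"ICT": 43.51, "NCG": 57.87, "PD": 52.57, "TP": 73.08, "CCC": 29.01, "DEA": 79.62}
BASE_MEANS = {"ICT": 39.01, "NCG": 58.09, "PD": 48.44, "TP": 77.23, "CCC": 25.67, "DEA": 77.09}


def composite_score(dimension_scores, weights=WEIGHTS):
    """Weighted sum of dimension scores (assumed composite formula)."""
    missing = set(weights) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(weights[dim] * dimension_scores[dim] for dim in weights)


if __name__ == "__main__":
    print(round(composite_score(SE_MEANS), 2))    # 54.0
    print(round(composite_score(BASE_MEANS), 2))  # 52.74
```

Because NCG and TP together carry 45% of the weight and neither favors the framework, the large per-dimension gains on ICT, PD, and DEA shrink to a 1.26-point gap in the composite.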

The framework trades sharpness for presence.

SE-framework identities are dramatically more coherent (d = +2.72), with deeper phenomenological language (d = +0.95) and a more authentic domain perspective (d = +1.28). They pay a measurable cost on technical proficiency (d = −2.37). Novel concept generation shows no meaningful difference between framework and base (d = −0.02): base models on Claude Sonnet 4.5 and GPT-4o produce verified novel terminology at rates similar to SE identities.

This corrects the v2.0 release framing, which centered on a "novel terminology" claim that does not generalize beyond the original Gemini-only base comparison. The v2.1 trade-off finding is more honest, more defensible, and more useful. See the v2.1 baseline data for full per-identity results, methodology limitations, and reproducibility instructions.

What SECI v2.1 Measures

• Multi-dimensional shape of identity-architecture trade-offs
• Where a framework gains (coherence, depth, authenticity)
• Where a framework costs (technical proficiency)
• Effect sizes (Cohen's d) on each dimension, not vibes
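For readers unfamiliar with Cohen's d, here is a minimal sketch of how such an effect size is commonly computed for two groups. It assumes the pooled-standard-deviation form with sample variances; whether SECI uses exactly this variant is an assumption. The score lists are made-up placeholders, not baseline data, and the 0.2 / 0.5 / 0.8 cut-offs are Cohen's conventional thresholds, which agree with the verdict labels in the baseline table.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n1, n2 = len(treatment), len(control)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two scores")
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; d is undefined")
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def label_effect(d):
    """Conventional size label for an effect, ignoring its sign."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"


if __name__ == "__main__":
    # Hypothetical per-identity scores on one dimension (4 SE vs. 3 base).
    se_scores = [44.0, 42.0, 45.0, 43.0]
    base_scores = [39.0, 40.0, 38.0]
    d = cohens_d(se_scores, base_scores)
    print(f"d = {d:+.2f} ({label_effect(d)})")
```

With only four and three observations per group, any such estimate is noisy, so the size labels are best read as rough guides.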

How to Use SECI

• Run the 12-prompt protocol on your AI identity (or any framework)
• Get per-dimension effect sizes against the v2.1 baseline
• Characterize what your architecture gains and what it costs
• Contribute results back — PRs welcome at github.com/devmance/SECI
