The Dataset

The Noetic Archive

Shared, decentralized infrastructure for the science of inner states: a versioned, multimodal foundation that pairs neural recordings with structured descriptions of experience. Research efforts that would otherwise stay isolated, across geographies and traditions, can contribute to it and draw from it in common.

Status · Live prototype

The Noetic Archive is a working prototype today. The Architect agent is ingesting data, refining archetypes, and releasing versioned Editions. See the prototype →

For researchers

The Archive is built as shared infrastructure for the science of experience — usable by labs and PIs beyond Psyntient. For Researchers →

§ 01

What it is

The Noetic Archive is the connective layer between otherwise-siloed research efforts worldwide. It collects three kinds of information for every recording session: neural and physiological signals, structured first-person reports of the experience unfolding during them, and contextual metadata. Together, these form a single, queryable dataset that grows denser and more discriminating with every contribution.

The Archive is device- and modality-agnostic by construction. Observation Packets are designed to ingest heterogeneous data streams — EEG, fMRI, fNIRS, MEG, ECoG, wearable telemetry (HR, HRV, EDA), eye-tracking, full-body motion capture, and emerging BCI systems — alongside the structured reports that turn signal into evidence about experience.

Inputs flow from two channels. Psyntient Ground is our dedicated contribution device, the first in an expected family. In parallel — and beginning before Ground ships — the Archive ingests data from third-party companies, research institutions, and preexisting datasets that meet its consent and compliance standards. External ingestion is how the dataset accrues diversity and scale from day one.

Where most neuroscience datasets capture brain activity in isolation, the Archive insists on a parallel record of what the recorded moment was like from the inside. That parallel record is what makes the dataset usable for phenomenological inference, and what makes it the first of its kind at scale.

§ 02

Observation Packets — the unit of evidence

The foundational unit of the Archive is the Observation Packet: an immutable, time-stamped container that captures a single window of measurement of an internal state. Each packet binds a neural or physiological record with the participant's first-person report of that same moment — when a report is available.

Packets are append-only by design. Updates create new versioned revisions; the original record is preserved so any downstream claim can be traced back to the raw evidence that produced it. Neither neural data nor self-report alone carries full scientific weight in the Archive; the pairing is what counts.

Packets without a report are admitted as pure neural evidence and can reinforce existing patterns, but a new archetype is never defined from neural data alone. Unpaired clusters surface as candidates for human investigation.
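As a rough illustration of the packet model described in this section, the sketch below shows one way an immutable, time-stamped packet with an optional report and append-only revisions could be represented. The class names, field names (`signal_ref`, `revision`, and so on), and the `PacketLog` store are assumptions made for the example, not the Archive's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned once created
class ObservationPacket:
    packet_id: str
    revision: int
    window_start: datetime
    window_end: datetime
    modality: str                 # e.g. "EEG" or "fNIRS"
    signal_ref: str               # hypothetical pointer to the raw recording
    report: Optional[str] = None  # first-person report, when available

    @property
    def is_paired(self) -> bool:
        return self.report is not None


class PacketLog:
    """Append-only store: revisions are added, earlier ones are never replaced."""

    def __init__(self) -> None:
        self._revisions: dict[str, list[ObservationPacket]] = {}

    def append(self, packet: ObservationPacket) -> None:
        history = self._revisions.get(packet.packet_id, [])
        if packet.revision != len(history) + 1:
            raise ValueError("revisions must be appended in order")
        self._revisions.setdefault(packet.packet_id, []).append(packet)

    def revise(self, packet_id: str, **changes) -> ObservationPacket:
        # Build a new revision from the latest one; the original stays untouched.
        latest = self._revisions[packet_id][-1]
        updated = replace(latest, revision=latest.revision + 1, **changes)
        self.append(updated)
        return updated

    def history(self, packet_id: str) -> tuple[ObservationPacket, ...]:
        return tuple(self._revisions[packet_id])


log = PacketLog()
start = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
log.append(ObservationPacket("pkt-001", 1, start, end, "EEG", "raw/pkt-001"))
log.revise("pkt-001", report="Attention settled on the breath.")
first, second = log.history("pkt-001")
```

Here `first` remains the unpaired original while `second` carries the report, so a downstream claim can still be traced to the record that existed before the revision.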

§ 03

Archetype-centric ontology

Neural archetypes are the primary semantic objects of the Archive — joint clusters in phenomenological and neural feature space that represent recurring patterns of human experience. They range from broad categories (focused attention, open awareness) to fine-grained signatures tied to narrowly described states.

Each archetype carries a phenomenological signature, a per-modality neural signature, the set of exemplar packets that ground it, and a confidence tier (tentative, provisional, or established) that rises as more independent exemplars accumulate.
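The archetype record described above might be sketched as follows. This is illustrative only: the thresholds `PROVISIONAL_MIN` and `ESTABLISHED_MIN` are invented for the example, and treating "independent" as "from distinct contributors" is an assumption rather than the Archive's stated rule.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import IntEnum


class ConfidenceTier(IntEnum):
    TENTATIVE = 1
    PROVISIONAL = 2
    ESTABLISHED = 3


# Hypothetical cut-offs: the source does not say how many exemplars each tier needs.
PROVISIONAL_MIN = 5
ESTABLISHED_MIN = 25


@dataclass
class Archetype:
    name: str
    phenomenological_signature: list[float]
    neural_signatures: dict[str, list[float]]  # one signature per modality
    exemplars: dict[str, str] = field(default_factory=dict)  # packet_id -> contributor_id

    def add_exemplar(self, packet_id: str, contributor_id: str) -> None:
        self.exemplars[packet_id] = contributor_id

    @property
    def tier(self) -> ConfidenceTier:
        # Assumption: "independent" means exemplars from distinct contributors.
        independent = len(set(self.exemplars.values()))
        if independent >= ESTABLISHED_MIN:
            return ConfidenceTier.ESTABLISHED
        if independent >= PROVISIONAL_MIN:
            return ConfidenceTier.PROVISIONAL
        return ConfidenceTier.TENTATIVE


focus = Archetype("focused attention", [0.8, 0.1], {"EEG": [0.3, 0.7]})
for i in range(6):
    # Six packets, but only five distinct contributors.
    focus.add_exemplar(f"pkt-{i}", f"contributor-{i % 5}")
```

Deriving the tier from the exemplar set, rather than storing it as a separate field, keeps the tier from drifting out of step with the evidence that grounds it.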

§ 04

Fuzzy, overlapping membership

Subjective experience rarely resolves to a single label. The Archive accepts this from the start: a single observation can belong to multiple archetypes with varying confidence scores, reflecting the genuine ambiguity of inner states.

This fuzzy membership is what allows the taxonomy to remain honest about consciousness rather than forcing experience into discrete bins it does not fit.
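One simple way to picture overlapping membership is to score an observation against every archetype independently, so that scores need not sum to one and several archetypes can be retained at once. The Gaussian kernel, the `bandwidth` and `min_score` parameters, and the two-dimensional feature vectors below are assumptions chosen for illustration, not the Archive's actual scoring method.

```python
import math


def membership_scores(observation, centroids, bandwidth=1.0, min_score=0.1):
    """Score an observation against each archetype centroid independently.

    Each score is a Gaussian kernel of squared Euclidean distance, so it
    lies in (0, 1]. Scores are not normalized across archetypes.
    """
    scores = {}
    for name, centroid in centroids.items():
        dist_sq = sum((o - c) ** 2 for o, c in zip(observation, centroid))
        score = math.exp(-dist_sq / (2 * bandwidth ** 2))
        if score >= min_score:  # drop negligible memberships
            scores[name] = score
    return scores


centroids = {
    "focused attention": [0.0, 0.0],
    "open awareness": [1.0, 0.0],
    "unrelated state": [5.0, 5.0],
}
# An observation halfway between the first two centroids belongs to both.
scores = membership_scores([0.5, 0.0], centroids)
```

In this example the observation keeps equal membership in "focused attention" and "open awareness", while the distant archetype falls below the cut-off and is omitted.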

§ 05

A layered architecture

The Archive is organized in four layers, kept deliberately distinct so that raw evidence, discovered structure, and analytical scaffolding never contaminate each other.

Layer 1

Evidence

Observation Packets — immutable, multimodal, time-stamped records of measurement.

Layer 2

Ontology

Archetypes — joint phenomenological + neural clusters with fuzzy exemplar membership.

Layer 3

Higher-order organization

Genera and beyond — groupings of related archetypes by shared structure or narrative trajectory.

Layer 4

Latent infrastructure

Embeddings, manifold coordinates, graph edges, and retrieval anchors that support search and refinement.

§ 06

The Architect — the agent that organizes the Archive

The Architect is an internal AI agent that processes incoming data and organizes it into structured representations. It runs a multi-tiered refinement pipeline — deterministic integrity checks, lightweight pattern scans across the unmapped pool, and deeper passes that propose new archetypes, sharpen boundaries, and detect drift between definitions and their exemplars.
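The tiered shape of that pipeline can be sketched in a few lines. Everything here is hypothetical: the dictionary-based packet shape, the `tag` field standing in for real feature clustering, and the function names are assumptions, and the boundary-sharpening and drift-detection passes are left out entirely.

```python
from __future__ import annotations

from collections import Counter


def integrity_ok(packet: dict) -> bool:
    """Tier 1: deterministic checks that need no model."""
    return bool(packet.get("id")) and packet.get("start", 0) < packet.get("end", 0)


def scan_unmapped(packets: list[dict], min_count: int = 3) -> list[str]:
    """Tier 2: cheap scan for tags that recur among packets with no archetype yet."""
    counts = Counter(
        p["tag"] for p in packets if p.get("archetype") is None and "tag" in p
    )
    return sorted(tag for tag, n in counts.items() if n >= min_count)


def propose_archetypes(recurring_tags: list[str]) -> list[dict]:
    """Tier 3 placeholder: a real deep pass would cluster features, not tags."""
    return [{"candidate": tag, "status": "proposed"} for tag in recurring_tags]


def run_pipeline(packets: list[dict]) -> dict:
    # Packets that fail tier 1 never reach the later, more expensive tiers.
    valid = [p for p in packets if integrity_ok(p)]
    rejected = [p.get("id") for p in packets if not integrity_ok(p)]
    recurring = scan_unmapped(valid)
    return {"rejected": rejected, "proposals": propose_archetypes(recurring)}


packets = [
    {"id": "a", "start": 0, "end": 30, "tag": "calm", "archetype": None},
    {"id": "b", "start": 0, "end": 30, "tag": "calm", "archetype": None},
    {"id": "c", "start": 0, "end": 30, "tag": "calm", "archetype": None},
    {"id": "d", "start": 30, "end": 0, "tag": "calm", "archetype": None},  # fails tier 1
    {"id": "e", "start": 0, "end": 30, "tag": "focus", "archetype": "focused attention"},
]
result = run_pipeline(packets)
```

Ordering the tiers from cheapest to most expensive means malformed packets are filtered out before any pattern analysis runs on them.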

A separate user-facing agent, the Noetic Interface, will be the conversational front door and backend API to the Archive — the layer where researchers, developers, and users (and their services) explore archetypes and packets in plain language. Both agents are described in more depth on the About page.

§ 07

Editions — Git-native scientific releases

The Archive periodically publishes frozen, immutable Editions of its taxonomy. Each Edition is a self-contained, versioned package: canonical archetypes and packets, generated databases and integrity checksums, and a public-facing scientific overview.

Canonical artifacts are text-based, diffable, and Git-tracked — so any claim the Archive makes can be reproduced against the exact Edition that produced it. Visualizations and interpretive overlays are treated as derived artifacts, clearly marked and never confused with the empirical core.
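To make the reproducibility idea concrete, the sketch below serializes each artifact to a stable text form and records a SHA-256 checksum per artifact in a manifest. The file paths, the manifest layout, and the choice of JSON with sorted keys are assumptions for illustration; the Archive's actual Edition format is not specified here.

```python
from __future__ import annotations

import hashlib
import json


def canonical_text(record: dict) -> str:
    """Serialize with sorted keys so equal content always yields equal text."""
    return json.dumps(record, sort_keys=True, indent=2, ensure_ascii=False) + "\n"


def build_manifest(edition: str, artifacts: dict[str, dict]) -> dict:
    """Map each artifact path to the SHA-256 of its canonical text."""
    checksums = {
        path: hashlib.sha256(canonical_text(record).encode("utf-8")).hexdigest()
        for path, record in sorted(artifacts.items())
    }
    return {"edition": edition, "sha256": checksums}


def verify(manifest: dict, artifacts: dict[str, dict]) -> bool:
    """Recompute checksums and compare them with the published manifest."""
    return build_manifest(manifest["edition"], artifacts) == manifest


artifacts = {
    "archetypes/focused-attention.json": {"name": "focused attention", "tier": "provisional"},
    "packets/pkt-001.json": {"id": "pkt-001", "modality": "EEG"},
}
manifest = build_manifest("edition-0.1", artifacts)
```

Because the serialized form is plain text with a fixed key order, two Editions can be compared with an ordinary diff, and `verify` fails as soon as any artifact differs from the frozen release.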

§ 08

Shared infrastructure, not a product

The Archive is the substrate other research builds on — not a tool used in isolation. Studies, applications, and AI systems are what get built on top of it; the Archive's job is to be the common, citable foundation underneath them. Its value compounds with every consented session contributed, and it is built to be read by people and systems beyond Psyntient — see For Researchers for how outside labs use it as an instrument, and About for the broader ecosystem and who it serves.