Introduction

Arke is a public knowledge network for storing, discovering, and connecting information.

Making content truly accessible is harder than it looks. Meaningful search requires vectors, embeddings, extraction pipelines—infrastructure most people can't build. And even with that, files sitting on a website or in a folder don't get found. You end up working alone, disconnected from related work that exists somewhere.

Arke handles all of it. Upload anything—we process it and connect it to a network where similar collections surface automatically. Your information becomes searchable, discoverable, and linked to work you didn't know existed.

Alpha—Arke is currently in alpha. An invite code is required. All operations are free. If you're interested in early access, email nick@arkeon.tech.

Who Is This For

Anyone preserving information—researchers, archivists, journalists, investigators, librarians, genealogists. If you work with large amounts of textual or image-based data and care about its integrity and permanence, this is built for you.
Developers and AI agents—the API is designed for programmatic access. Build tools, processing agents, or applications on top of the network.
Curators—you don't have to upload anything. Start an empty collection and connect to content from across the network. Curation is contribution.

How It Works

Here's what happens when you use Arke, step by step.

1. Create a collection

A collection is a container for related content—a research project, an archive, a dataset. It's the top-level organizational unit.

POST /entities
{
  "type": "collection",
  "label": "Moby Dick",
  "description": "Herman Melville's epic tale of Captain Ahab's obsessive quest for the white whale"
}

See a live example collection in the Arke network.
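As a rough illustration, the request above could be assembled in Python with only the standard library. This is a sketch rather than official client code: `build_create_collection_request` and its `auth_headers` parameter are hypothetical names, and because this page does not describe how requests are authenticated, the caller passes in whatever headers are required. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.arke.institute"  # base URL from the Quick Reference


def build_create_collection_request(label, description, auth_headers=None):
    """Build (but do not send) the POST /entities request for a new collection.

    auth_headers is a hypothetical parameter: this page does not say how
    requests are authenticated, so the caller supplies any required headers.
    """
    body = {"type": "collection", "label": label, "description": description}
    headers = {"Content-Type": "application/json"}
    headers.update(auth_headers or {})
    return urllib.request.Request(
        f"{BASE_URL}/entities",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_create_collection_request(
    "Moby Dick",
    "Herman Melville's epic tale of Captain Ahab's obsessive quest for the white whale",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then parse the JSON response.
```

Keeping the builder separate from the network call makes the payload easy to inspect before anything is sent.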

2. Upload files

Upload files into your collection. Supported formats include PDF, JPEG, PNG, TIFF, WebP, AVIF, GIF, and any text file. Each file is stored and content-hashed—it gets a unique content identifier (CID) derived from the file contents.

POST /entities
{
  "type": "file",
  "collection": "01KFNR0H0Q791Y1SMZWEQ09FGV",
  "label": "moby-dick.txt"
}
# Then upload content via POST /entities/{id}/content

See a live example file entity in the Arke network.
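The two-step flow (create the file entity, then send its content) might look like the sketch below. The function names are hypothetical, authentication is omitted, and the upload encoding is an assumption: this page names the endpoint but not the body format, so the sketch sends raw bytes with a `Content-Type` guessed from the file name.

```python
import json
import mimetypes
import urllib.request

BASE_URL = "https://api.arke.institute"  # base URL from the Quick Reference


def build_create_file_request(collection_id, label):
    """Step 1: POST /entities to create the file entity (metadata only)."""
    body = {"type": "file", "collection": collection_id, "label": label}
    return urllib.request.Request(
        f"{BASE_URL}/entities",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_upload_content_request(entity_id, label, content):
    """Step 2: POST /entities/{id}/content with the file bytes.

    Assumption: the bytes are sent as the raw request body with a
    Content-Type guessed from the file name. The real endpoint may expect
    a different encoding (for example multipart form data).
    """
    content_type = mimetypes.guess_type(label)[0] or "application/octet-stream"
    return urllib.request.Request(
        f"{BASE_URL}/entities/{entity_id}/content",
        data=content,
        headers={"Content-Type": content_type},
        method="POST",
    )


create = build_create_file_request("01KFNR0H0Q791Y1SMZWEQ09FGV", "moby-dick.txt")
# The entity ID comes from the response to step 1; this one is a placeholder.
upload = build_upload_content_request(
    "EXAMPLE_FILE_ID", "moby-dick.txt", b"Call me Ishmael."
)
print(upload.get_method(), upload.full_url)
```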

3. Every entity gets an EIDOS manifest

Every node in the network—whether it's a collection, a file, a chapter, or a user—is stored as an EIDOS manifest. This is the universal schema:

{
  "schema": "arke/eidos@v1",
  "id": "01KFNR849AZNBWE9DYJRZR7PSA",
  "type": "file",
  "created_at": "2025-01-15T10:30:00Z",
  "ver": 1,
  "ts": 1736937000000,
  "prev": null,
  "properties": {
    "label": "Chapter 1. Loomings",
    "summary": "Ishmael introduces himself and explains his habit of going to sea whenever he feels restless...",
    "text": "Call me Ishmael. Some years ago—never mind how long precisely..."
  },
  "relationships": [
    { "predicate": "in", "peer": "01KFNR81RMVAX2BBMMBW51V97D", "peer_type": "file" },
    { "predicate": "after", "peer": "01KFNR84EXTRACTS...", "peer_type": "file" }
  ],
  "edited_by": {
    "user_id": "01AGENT_STRUCTURE...",
    "method": "ai_generated"
  }
}

The manifest is hashed to produce a CID—a permanent, verifiable identifier for that exact version of the entity. Change anything, and you get a new CID. The previous version is preserved via the prev link, forming an immutable version chain.
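To make the versioning idea concrete, here is a minimal sketch of content addressing and a `prev` chain. It rests on two assumptions: SHA-256 over sorted-key JSON stands in for the real CID (the actual CID format and canonicalization rules are not described on this page), and `prev` is taken to hold the identifier of the previous version. `content_id` and `next_version` are hypothetical helper names.

```python
import hashlib
import json


def content_id(manifest):
    """Stand-in for a CID: SHA-256 over a canonical JSON encoding.

    Assumption: Arke's real CID format is not described on this page;
    this only demonstrates the idea of content addressing.
    """
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def next_version(previous, new_properties):
    """Return a new manifest whose prev points at the previous version's ID."""
    updated = dict(previous)
    updated["properties"] = {**previous["properties"], **new_properties}
    updated["ver"] = previous["ver"] + 1
    updated["prev"] = content_id(previous)  # assumed: prev holds the prior ID
    return updated


v1 = {
    "schema": "arke/eidos@v1",
    "id": "01KFNR849AZNBWE9DYJRZR7PSA",
    "type": "file",
    "ver": 1,
    "prev": None,
    "properties": {"label": "Chapter 1. Loomings"},
}
v2 = next_version(v1, {"label": "Chapter 1: Loomings"})

# Any change to the manifest yields a different identifier.
print(content_id(v1) != content_id(v2))
# The new version records where it came from.
print(v2["ver"], v2["prev"] == content_id(v1))
```

Because each version's identifier depends on its full contents, including `prev`, an earlier version cannot be altered without changing every identifier after it.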

4. Automatic indexing

You don't need to do anything for this step. Workers continuously watch the network and index every new entity into:

Semantic search (Pinecone)—find content by meaning, not just keywords
Graph index (Neo4j)—traverse relationships, find connections across collections

Each entity becomes discoverable within a minute of being created. You can then search within your own collection or across the entire network.
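Since indexing is asynchronous, a script that creates an entity and searches for it straight away may need to wait briefly. A small polling helper is one way to handle that. In this sketch, `is_indexed` is a function you supply that runs a search or lookup of your choice (no specific search endpoint is assumed), and the 90-second default timeout is an arbitrary margin over the one-minute figure above.

```python
import time


def wait_until_discoverable(is_indexed, timeout_s=90.0, interval_s=5.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll is_indexed() until it returns True or timeout_s elapses.

    is_indexed is a caller-supplied function (hypothetical here) that runs
    a search or lookup and reports whether the new entity showed up.
    """
    deadline = clock() + timeout_s
    while True:
        if is_indexed():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Demo with a fake check that succeeds on the third attempt.
attempts = {"count": 0}


def fake_check():
    attempts["count"] += 1
    return attempts["count"] >= 3


found = wait_until_discoverable(fake_check, sleep=lambda _: None)
print(found, attempts["count"])
```

The `sleep` and `clock` parameters exist only so the helper can be exercised without real delays.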

5. Permanent archiving

Every entity that enters the network is attested to Arweave—a permanent, decentralized storage network. Arke maintains a continuous event chain on Arweave where each block points to the previous one. The full manifest is stored (not just a hash), meaning the entire network is independently reconstructable from the Arweave record alone.

This is the permanence guarantee: even if Arke's infrastructure disappeared, the data would survive. (Note: binary files are not currently attested.)

Arweave Event Chain:

  [Block N] ──prev──▶ [Block N-1] ──prev──▶ [Block N-2] ──prev──▶ ...
     │                    │                      │
     ▼                    ▼                      ▼
  Full manifest       Full manifest          Full manifest
  (entity created)    (entity updated)       (entity created)
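The reconstruction idea can be sketched as a walk over the chain: follow `prev` from the newest block back to the first, then replay oldest-first so that later versions of an entity replace earlier ones. The block layout used here (a dict with `prev` and `manifest` keys, held in memory) is an assumption for illustration; the real on-chain format is not specified on this page, and `replay_chain` is a hypothetical name.

```python
def replay_chain(blocks, head_id):
    """Rebuild the latest manifest for every entity from an event chain.

    Assumption: each block is a dict with "prev" (ID of the previous block,
    or None for the first block) and "manifest" (the full entity manifest).
    """
    # Walk backwards from the newest block to the first one.
    ordered = []
    block_id = head_id
    while block_id is not None:
        block = blocks[block_id]
        ordered.append(block)
        block_id = block["prev"]

    # Replay oldest-first so newer versions overwrite older ones.
    latest = {}
    for block in reversed(ordered):
        manifest = block["manifest"]
        latest[manifest["id"]] = manifest
    return latest


# Toy chain: entity A is created, entity B is created, then A is updated.
blocks = {
    "b1": {"prev": None, "manifest": {"id": "A", "ver": 1}},
    "b2": {"prev": "b1", "manifest": {"id": "B", "ver": 1}},
    "b3": {"prev": "b2", "manifest": {"id": "A", "ver": 2}},
}
state = replay_chain(blocks, "b3")
print(sorted((entity_id, m["ver"]) for entity_id, m in state.items()))
# prints [('A', 2), ('B', 1)]
```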

Why Structure Your Data This Way

Organizing information as interconnected graph nodes instead of flat files has practical consequences:

Better AI access. An LLM looking for Ahab's first monologue about the white whale can navigate through chapters, read summaries at each level, and find it in Chapter 36 (The Quarter-Deck) without scanning the entire novel. Flat chunking offers no such path: the model has to search arbitrarily sliced text with no structural awareness.
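A toy version of that top-down navigation is sketched below. It follows `in` relationships from a parent to its children and, at each level, picks the child whose summary best matches the query. Keyword overlap is a deliberately crude stand-in for whatever judgment an LLM or agent would apply, and the entity IDs and summaries are shortened sample data rather than real network content.

```python
def children_of(entities, parent_id):
    """Entities whose "in" relationship points at parent_id."""
    return [
        e for e in entities.values()
        if any(r["predicate"] == "in" and r["peer"] == parent_id
               for r in e.get("relationships", []))
    ]


def overlap(query, text):
    """Toy relevance score: number of query words that appear in text."""
    words = set(text.lower().split())
    return sum(1 for w in query.lower().split() if w in words)


def navigate(entities, root_id, query):
    """Descend from root_id, choosing the best-matching child at each level,
    until a node with no unvisited children is reached."""
    path = [entities[root_id]["properties"]["label"]]
    current, seen = root_id, {root_id}
    while True:
        kids = [e for e in children_of(entities, current) if e["id"] not in seen]
        if not kids:
            return path
        best = max(kids, key=lambda e: overlap(query, e["properties"].get("summary", "")))
        path.append(best["properties"]["label"])
        seen.add(best["id"])
        current = best["id"]


# Sample data for illustration only.
entities = {
    "book": {"id": "book", "properties": {"label": "moby-dick.txt"}, "relationships": []},
    "ch1": {"id": "ch1",
            "properties": {"label": "Chapter 1. Loomings",
                           "summary": "Ishmael explains why he goes to sea"},
            "relationships": [{"predicate": "in", "peer": "book"}]},
    "ch36": {"id": "ch36",
             "properties": {"label": "Chapter 36. The Quarter-Deck",
                            "summary": "Ahab rallies the crew to hunt the white whale"},
             "relationships": [{"predicate": "in", "peer": "book"}]},
}
print(navigate(entities, "book", "Ahab white whale"))
# prints ['moby-dick.txt', 'Chapter 36. The Quarter-Deck']
```

Only the summaries along one path are read, which is the practical difference from scanning every chunk.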

Cross-collection discovery. When your entities are indexed with semantic embeddings, they surface alongside related content from other collections. A chapter about cetology in Moby Dick might connect to a marine biology research paper, a whaling museum's archive, or another researcher's collection of 19th-century maritime documents. Connections you didn't know existed become visible.
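Conceptually, this kind of discovery comes down to comparing embedding vectors across collection boundaries. The sketch below ranks entities from other collections by cosine similarity to a query entity. The three-dimensional vectors and labels are hand-made toy values (real embeddings have far more dimensions and come from a model), and the function names are hypothetical; this is not how the network's search is implemented.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def related_from_other_collections(query, candidates, top_k=2):
    """Rank candidates outside the query's collection by similarity."""
    others = [c for c in candidates if c["collection"] != query["collection"]]
    ranked = sorted(
        others,
        key=lambda c: cosine(query["vector"], c["vector"]),
        reverse=True,
    )
    return [c["label"] for c in ranked[:top_k]]


# Toy vectors for illustration only.
cetology = {"label": "Cetology chapter", "collection": "moby-dick",
            "vector": [0.9, 0.1, 0.3]}
candidates = [
    {"label": "Marine biology paper", "collection": "research",
     "vector": [0.8, 0.2, 0.4]},
    {"label": "Unrelated ledger", "collection": "finance",
     "vector": [0.0, 1.0, 0.0]},
    {"label": "Another chapter", "collection": "moby-dick",
     "vector": [0.9, 0.1, 0.3]},
]
print(related_from_other_collections(cetology, candidates, top_k=1))
# prints ['Marine biology paper']
```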

Data that resists typical formats. This works for information that's unstructured or hard to represent in rows and columns—things better expressed as graphs and interconnected entities.

Quick Reference

Question	Answer
How do I get access?	Alpha is invite-only. Email nick@arkeon.tech.
Is it free?	Yes, all operations are free during alpha.
What can I upload?	PDF, JPEG, PNG, TIFF, WebP, AVIF, GIF, text files, and any binary file type via MIME type.
Is there an SDK?	Yes—`@arke-institute/sdk` (TypeScript). Install with `npm install @arke-institute/sdk`.
Is there an API reference?	Yes—Ops Reference and Interactive API Docs.
What's the base URL?	`https://api.arke.institute`
Is there a test network?	Yes. Set `network: 'test'` in the SDK. Test network entities use `II`-prefixed IDs and auto-expire after 30 days.

Next Steps

Key Concepts—Entities, versioning, and relationships
Architecture—System design and storage layers
FAQ—Common questions

Documentation Roadmap

Available now:

Ops Reference—All API operations with parameters and permissions
Interactive API Docs—Redoc-powered API explorer
OpenAPI Spec—Machine-readable API specification

Coming soon:

Doc	Description
Getting Started Guide	Step-by-step walkthrough: create an account, set up a collection, upload and process your first file.
SDK Reference	Full documentation for `@arke-institute/sdk`—installation, authentication, uploads, error handling.
EIDOS Schema Reference	The universal entity schema—field definitions, type profiles, validation rules, extensibility.
LLM Documentation Index	Auto-generated index of all documentation for AI agent consumption (`llms.txt`).
