Conventions

ID formats, network conventions, timestamps, error formats, and other standards used in the Arke API.

ID Formats

Main Network

Standard 26-character ULIDs using Crockford's Base32 alphabet:

01JXXXXXXXXXXXXXXXXXXXXXXXXX

ULID structure: TTTTTTTTTTRRRRRRRRRRRRRRRR

  • 10 characters: Timestamp (48 bits, millisecond precision)
  • 16 characters: Randomness (80 bits, cryptographically random)

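ULIDs sort lexicographically by creation time and are unique in practice thanks to the 80 random bits. The Crockford Base32 alphabet excludes I, L, and O to avoid visual confusion with 1 and 0, and excludes U to avoid accidental obscenities.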
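As an illustration of the layout above, the creation time can be recovered from the first 10 characters of a ULID. This is a minimal sketch, not Arke code: decodeUlidTimestamp is a hypothetical helper name, and the sample ID is an arbitrary well-formed ULID rather than a real Arke entity.

```javascript
// Crockford Base32 alphabet: digits plus letters, without I, L, O, U.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";
const ULID_PATTERN = /^[0-9A-HJKMNP-TV-Z]{26}$/;

// Hypothetical helper: returns the Unix time in milliseconds encoded in
// the first 10 characters of a ULID.
function decodeUlidTimestamp(id) {
  if (!ULID_PATTERN.test(id)) {
    throw new Error(`Not a 26-character Crockford Base32 ULID: ${id}`);
  }
  let ms = 0;
  for (const ch of id.slice(0, 10)) {
    // Multiply instead of bit-shifting: the value exceeds JavaScript's
    // 32-bit bitwise range but fits safely in a double.
    ms = ms * 32 + CROCKFORD.indexOf(ch);
  }
  return ms;
}

const createdMs = decodeUlidTimestamp("01ARYZ6S41TSV4RRFFQ69G5FAV");
console.log(new Date(createdMs).toISOString());
```

Because the timestamp occupies the leading characters, comparing two ULIDs as plain strings orders them by creation time.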

Test Network

Test IDs use an II prefix + 24 characters:

IIXXXXXXXXXXXXXXXXXXXXXX

The II prefix cannot occur in a real ULID, because 'I' is not part of the Crockford Base32 alphabet, so test data is unambiguously identifiable. Test data is routed to separate storage buckets and databases.

Special ID Prefixes

The system also accepts content-derived IDs for certain entity types:

  • F prefix: File entities (F + 25 characters)
  • C prefix: Chunk entities (C + 25 characters)
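The ID shapes described so far can be told apart by prefix and length. The following is a sketch under assumptions: classifyId and its return labels are hypothetical, and it assumes a leading F or C always marks a content-derived ID. That assumption is safe for canonical ULIDs, whose first character is at most 7 under the ULID specification.

```javascript
// Hypothetical classifier for the ID shapes listed in this section.
// Order matters: F and C are valid Crockford characters, so the
// prefixed forms are checked before the generic 26-character form.
function classifyId(id) {
  if (/^II[0-9A-HJKMNP-TV-Z]{24}$/.test(id)) return "test";
  if (/^F[0-9A-HJKMNP-TV-Z]{25}$/.test(id)) return "file";
  if (/^C[0-9A-HJKMNP-TV-Z]{25}$/.test(id)) return "chunk";
  if (/^[0-9A-HJKMNP-TV-Z]{26}$/.test(id)) return "ulid";
  return "invalid";
}

// Placeholder IDs built only to have the right shape.
console.log(classifyId("01J" + "A".repeat(23))); // ulid
console.log(classifyId("II" + "0".repeat(24)));  // test
```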

Version links use the IPLD convention for content-addressed references:

{ "/": "bafkrei..." }

This keeps version links compatible with content-addressed ecosystems. The CID (Content Identifier) is computed locally with the same algorithm IPFS uses, so links carry content-addressing guarantees without requiring IPFS infrastructure.

Note: Storage is on Cloudflare R2/D1, not IPFS. The IPLD link format is preserved for compatibility with content-addressed systems.

CID Format

CIDs (Content Identifiers) are computed from the manifest JSON using:

  • Hash algorithm: SHA-256
  • Codec: Raw (0x55)
  • Version: CIDv1
  • Encoding: Base32 (lowercase)

Result format: CIDs starting with bafkrei...

Example (illustrative placeholder; a real base32 CID contains only the characters a-z and 2-7):

bafkreiabc123def456ghi789jkl012mno345pqr678stu901vwx234yz5

For deterministic CID computation, objects are canonically stringified with recursively sorted keys before hashing. This ensures the same content always produces the same CID regardless of original key order.
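The canonicalization step can be sketched as follows. This is one possible implementation, not necessarily the one Arke uses: canonicalStringify is a hypothetical name, keys are sorted with JavaScript's default string ordering, and the output has no extra whitespace. The exact ordering and whitespace rules are assumptions. Hashing the resulting string with SHA-256 and encoding it as a CIDv1 is left out.

```javascript
// Recursively rebuild objects with their keys in sorted order.
// Arrays keep their element order; primitives pass through unchanged.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const sorted = {};
    for (const key of Object.keys(value).sort()) {
      sorted[key] = canonicalize(value[key]);
    }
    return sorted;
  }
  return value;
}

// Hypothetical helper: same content always yields the same string,
// regardless of the key order in the input.
function canonicalStringify(value) {
  return JSON.stringify(canonicalize(value));
}

console.log(canonicalStringify({ b: 1, a: { d: 2, c: 3 } }));
// {"a":{"c":3,"d":2},"b":1}
```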

Timestamp Formats

The API uses two timestamp formats:

| Field | Format | Example |
| --- | --- | --- |
| created_at, ts (in version history) | ISO 8601 datetime | 2025-12-26T12:00:00.000Z |
| ts (in entity response) | Unix timestamp (milliseconds) | 1735214400000 |
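Since ts can arrive in either format, client code may want to normalize it before comparing values. A minimal sketch, assuming ISO strings always carry an explicit Z offset as in the example above; toEpochMillis and toIsoString are hypothetical helper names.

```javascript
// Hypothetical helper: accepts an ISO 8601 string or a Unix timestamp
// in milliseconds and returns milliseconds since the epoch.
function toEpochMillis(ts) {
  if (typeof ts === "number") return ts;
  const ms = Date.parse(ts);
  if (Number.isNaN(ms)) throw new Error(`Unrecognized timestamp: ${ts}`);
  return ms;
}

// Hypothetical helper: renders either format as an ISO 8601 string.
function toIsoString(ts) {
  return new Date(toEpochMillis(ts)).toISOString();
}

const iso = "2025-12-26T12:00:00.000Z";
console.log(toIsoString(toEpochMillis(iso)) === iso); // true
```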

Network Header

Set X-Arke-Network to select the network:

| Value | Network | ID Format | Storage |
| --- | --- | --- | --- |
| main (default) | Production | Standard ULID | Production buckets |
| test | Test | II-prefixed | Test buckets |

Authentication Headers

| Header | Description |
| --- | --- |
| Authorization: Bearer <token> | Supabase JWT token for user authentication |
| Authorization: ApiKey ak_xxx | Agent API key authentication |
| Authorization: ApiKey uk_xxx | User API key authentication |
| X-On-Behalf-Of | User entity ID for service accounts acting on behalf of a user (requires role: service auth) |
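The network and authentication headers can be assembled in one place. The sketch below is illustrative only: buildArkeHeaders and its option names are hypothetical, and treating token and apiKey as mutually exclusive is an assumption, since both use the Authorization header.

```javascript
const NETWORKS = ["main", "test"];

// Hypothetical helper: builds the request headers described above.
function buildArkeHeaders({ network = "main", token, apiKey, onBehalfOf } = {}) {
  if (!NETWORKS.includes(network)) {
    throw new Error(`Unknown network: ${network}`);
  }
  if (token && apiKey) {
    throw new Error("Provide either token or apiKey, not both");
  }
  const headers = { "X-Arke-Network": network };
  if (token) headers["Authorization"] = `Bearer ${token}`;
  if (apiKey) headers["Authorization"] = `ApiKey ${apiKey}`;
  if (onBehalfOf) headers["X-On-Behalf-Of"] = onBehalfOf;
  return headers;
}

// ak_xxx is the placeholder key from the table, not a real credential.
console.log(buildArkeHeaders({ network: "test", apiKey: "ak_xxx" }));
```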

Pagination

The API uses two pagination patterns depending on the endpoint:

Offset-based Pagination

Used for collection entities and user collections:

{
  "pagination": {
    "offset": 0,
    "limit": 50,
    "count": 25,
    "has_more": false
  }
}

Query parameters: ?offset=0&limit=50
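A client can derive the next request from the pagination object. This sketch assumes the next page starts at offset + count, which the response format suggests but this page does not state; nextOffsetQuery is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the query string for the next page,
// or null when the response reports no further results.
function nextOffsetQuery({ offset, limit, count, has_more }) {
  if (!has_more) return null;
  // Assumption: advance by the number of items actually returned.
  return `?offset=${offset + count}&limit=${limit}`;
}

console.log(nextOffsetQuery({ offset: 0, limit: 50, count: 25, has_more: false }));
// null
console.log(nextOffsetQuery({ offset: 0, limit: 50, count: 50, has_more: true }));
// ?offset=50&limit=50
```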

Cursor-based Pagination

Used for version history and events:

{
  "versions": [...],
  "has_more": true,
  "next_cursor": "bafkrei..."
}

Query parameters: ?from=<cursor>&limit=10

For events, the cursor is a numeric ID: ?cursor=12345
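The same idea applies to cursor-based endpoints. In this sketch, nextCursorQuery is a hypothetical helper, the parameter name is configurable because version history uses from while events use cursor, and it assumes event responses expose has_more and next_cursor in the same way as version history.

```javascript
// Hypothetical helper: builds the query string for the next page from a
// cursor-paginated response, or returns null when there is nothing more.
function nextCursorQuery(page, { param = "from", limit = 10 } = {}) {
  if (!page.has_more || page.next_cursor == null) return null;
  const query = new URLSearchParams({
    [param]: String(page.next_cursor),
    limit: String(limit),
  });
  return `?${query}`;
}

// Events: numeric cursor passed as ?cursor=...
console.log(nextCursorQuery({ has_more: true, next_cursor: 12345 }, { param: "cursor" }));
// ?cursor=12345&limit=10
```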

Error Response Format

All errors follow a consistent JSON structure:

{
  "error": "Error message",
  "details": {}
}

Standard Error Codes

| Status | Error Type | Example |
| --- | --- | --- |
| 400 | Validation error | {"error": "Validation failed", "details": {"issues": [...]}} |
| 401 | Unauthorized | {"error": "Unauthorized: Missing or invalid authentication token"} |
| 403 | Forbidden | {"error": "Forbidden: You do not have permission to perform this action"} |
| 404 | Not found | {"error": "Entity not found"} |
| 409 | CAS conflict | {"error": "Conflict: entity was modified", "details": {"expected": "...", "actual": "..."}} |
| 500 | Internal error | {"error": "Internal server error"} |
| 503 | Service unavailable | {"error": "Service unavailable", "details": {"service": "pinecone"}} |
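Because every error shares the same shape, a client can wrap it in a single error type. The class below is a hypothetical sketch, not part of any Arke SDK. Treating 503 as retryable is an assumption about client policy rather than something the API prescribes.

```javascript
// Hypothetical error type wrapping the { error, details } body.
class ArkeApiError extends Error {
  constructor(status, body) {
    super(body?.error ?? `HTTP ${status}`);
    this.name = "ArkeApiError";
    this.status = status;
    this.details = body?.details ?? {};
  }

  // 409 signals a CAS conflict.
  get isConflict() {
    return this.status === 409;
  }

  // Assumption: only "service unavailable" is worth retrying as-is.
  get isRetryable() {
    return this.status === 503;
  }
}

// Hypothetical helper: returns the body on success, throws otherwise.
function throwIfError(status, body) {
  if (status >= 400) throw new ArkeApiError(status, body);
  return body;
}
```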

CAS (Compare-And-Swap) Errors

When updating an entity, you must provide expect_tip set to the entity's current CID. If the entity was modified after you read that CID, the stored tip no longer matches and the API returns a 409 response:

{
  "error": "Conflict: entity was modified",
  "details": {
    "expected": "bafkreibug443...",
    "actual": "bafkreinewabc..."
  }
}
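A common way to handle a 409 is to re-read the entity, reapply the change to the fresh state, and retry with the new tip. The sketch below shows that loop under stated assumptions: readEntity and writeEntity are caller-supplied functions rather than Arke API calls, the { cid, data } and { status, body } shapes are invented for illustration, and passing expect_tip alongside the data is assumed.

```javascript
// Hypothetical retry loop for compare-and-swap updates.
//   readEntity():  resolves to { cid, data } for the current tip
//   writeEntity(): resolves to { status, body } for an update attempt
//   applyChange(): pure function producing the new data from the old
async function updateWithCas(readEntity, writeEntity, applyChange, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await readEntity();
    const result = await writeEntity({
      expect_tip: current.cid,
      data: applyChange(current.data),
    });
    // Anything other than a conflict (success or another error) is
    // returned to the caller unchanged.
    if (result.status !== 409) return result;
    // 409: the tip moved; loop to re-read and reapply the change.
  }
  throw new Error(`CAS conflict persisted after ${maxAttempts} attempts`);
}
```

Reapplying the change to freshly read data, rather than resending the original payload with the actual CID from the error, avoids silently overwriting the other writer's update.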

Inline Entity References

For referencing entities within text content (e.g., in description, ocr_text, or content properties), use the arke: URI scheme:

arke:<entity-id>

This scheme is domain-agnostic: the entity ID is the stable identifier, so a reference keeps working even if the URLs that serve the entity change.

Markdown Syntax

[Display Label](arke:01JENTITY123ABC456789XYZ)

Examples

This analysis builds on [Previous Report](arke:01JFILE789ABC012345DEF).
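The header contains [Company Logo](arke:01JIMAGE456DEF789012ABC).

The IDs in these examples are shortened, human-readable placeholders. Real entity IDs follow the formats described under ID Formats.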

Parsing

// Raw references: capture group 1 is the entity ID.
// This also matches the arke: part of a Markdown link.
const INLINE_REF_PATTERN = /arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})/g;

// Markdown links with labels: group 1 is the label, group 2 is the entity ID.
const MARKDOWN_REF_PATTERN = /\[([^\]]+)\]\(arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})\)/g;
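One way to use the two patterns together is sketched below. extractRefs is a hypothetical helper, and the IDs in the usage example are generated placeholders with the right length and alphabet, not real entities.

```javascript
const INLINE_REF = /arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})/g;
const MARKDOWN_REF = /\[([^\]]+)\]\(arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})\)/g;

// Hypothetical helper: returns every reference in document order as
// { id, label }, where label is null for raw (non-Markdown) references.
function extractRefs(text) {
  // Map the position of each "arke:" inside a Markdown link to its label.
  const labels = new Map();
  for (const m of text.matchAll(MARKDOWN_REF)) {
    // "[label](" precedes "arke:", so skip "[" + label + "](".
    labels.set(m.index + m[1].length + 3, m[1]);
  }
  return [...text.matchAll(INLINE_REF)].map((m) => ({
    id: m[1],
    label: labels.has(m.index) ? labels.get(m.index) : null,
  }));
}

const mainId = "01J" + "A".repeat(23); // placeholder main-network ID
const testId = "II" + "0".repeat(24);  // placeholder test-network ID
console.log(extractRefs(`See [Report](arke:${mainId}) and arke:${testId}.`));
```

String.prototype.matchAll works on a copy of the regular expression, so the g flag's lastIndex state does not leak between calls.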

See Entity References for the distinction between inline URIs and structured EntityRef objects.


Schema Versioning

Entity schemas follow the namespace/type@version convention:

arke/eidos@v1      -- Base entity schema (current)
arke/file@v1       -- File profile
arke/collection@v1 -- Collection profile

When schemas evolve, the version number increments. The system validates entities against the schema version specified in their schema field.
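A schema identifier can be split into its parts with a small parser. This is a sketch under assumptions: parseSchemaId is a hypothetical helper, and the accepted character set (lowercase names, v followed by an integer) is inferred from the three examples above rather than from a published grammar.

```javascript
// Assumed grammar: <namespace>/<type>@v<integer>, lowercase names.
const SCHEMA_PATTERN = /^([a-z][a-z0-9_-]*)\/([a-z][a-z0-9_-]*)@v(\d+)$/;

// Hypothetical helper: returns { namespace, type, version } or null
// when the string does not follow the convention.
function parseSchemaId(schema) {
  const m = SCHEMA_PATTERN.exec(schema);
  if (!m) return null;
  return { namespace: m[1], type: m[2], version: Number(m[3]) };
}

console.log(parseSchemaId("arke/file@v1"));
// { namespace: 'arke', type: 'file', version: 1 }
```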
