Conventions

ID formats, network conventions, timestamps, error formats, and other standards used in the Arke API.

ID Formats

Main Network

Standard 26-character ULIDs using Crockford's Base32 alphabet:

01JXXXXXXXXXXXXXXXXXXXXXXXXX

ULID structure: TTTTTTTTTTRRRRRRRRRRRRRRRR

10 characters: Timestamp (48 bits, millisecond precision)
16 characters: Randomness (80 bits, cryptographically random)

ULIDs sort lexicographically by creation time and are globally unique for practical purposes. The Crockford Base32 alphabet excludes I, L, and O to avoid visual confusion with 1 and 0, and U to avoid accidental obscenities.
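As an illustration of the layout above, the following minimal sketch decodes the 10-character timestamp portion of a ULID into Unix milliseconds. The function name `ulidTimestamp` is hypothetical and not part of the Arke API; the sample ULID is an arbitrary value whose timestamp part encodes 1469918176385.

```javascript
// Crockford Base32 alphabet (no I, L, O, U), in value order 0..31.
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Hypothetical helper: decode the 10-character timestamp prefix of a ULID
// into Unix milliseconds.
function ulidTimestamp(ulid) {
  if (typeof ulid !== "string" || ulid.length !== 26) {
    throw new Error("expected a 26-character ULID");
  }
  let ms = 0;
  for (const ch of ulid.slice(0, 10)) {
    const value = CROCKFORD.indexOf(ch);
    if (value === -1) {
      throw new Error(`invalid Crockford Base32 character: ${ch}`);
    }
    // Multiply instead of bit-shifting: the timestamp needs 48 bits, which
    // exceeds the 32-bit range of JavaScript's bitwise operators.
    ms = ms * 32 + value;
  }
  return ms;
}

const createdAt = new Date(ulidTimestamp("01ARYZ6S41TSV4RRFFQ69G5FAV"));
console.log(createdAt.toISOString());
```

Because the timestamp occupies the most significant characters, comparing two ULIDs as plain strings orders them by creation time.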

Test Network

Test IDs use an II prefix + 24 characters:

IIXXXXXXXXXXXXXXXXXXXXXX

The II prefix cannot occur in a real ULID, because 'I' is excluded from Crockford Base32, so test data is unambiguously identifiable. Test data is routed to separate storage buckets and databases.

Special ID Prefixes

The system also accepts content-derived IDs for certain entity types:

F prefix: File entities (F + 25 characters)
C prefix: Chunk entities (C + 25 characters)
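A client may want to tell these ID families apart before sending a request. The sketch below is one way to do that. `classifyId` and its return labels are hypothetical names, and it assumes that the characters after the II, F, and C prefixes use the Crockford alphabet, as the patterns in the Parsing section suggest. The check that a main-network ULID starts with 0 through 7 follows from the 48-bit timestamp and is stricter than those patterns.

```javascript
const B32 = "[0-9A-HJKMNP-TV-Z]"; // Crockford Base32 character class

// Order matters: F and C are valid Crockford characters, so the prefixed
// forms are checked before the generic ULID shape.
const ID_KINDS = [
  ["test", new RegExp(`^II${B32}{24}$`)],
  ["file", new RegExp(`^F${B32}{25}$`)],
  ["chunk", new RegExp(`^C${B32}{25}$`)],
  // A ULID timestamp is 48 bits, so its first character is at most 7.
  ["main", new RegExp(`^[0-7]${B32}{25}$`)],
];

// Hypothetical helper: the labels are illustrative, not values the API returns.
function classifyId(id) {
  const hit = ID_KINDS.find(([, pattern]) => pattern.test(id));
  return hit ? hit[0] : "invalid";
}

console.log(classifyId("01ARZ3NDEKTSV4RRFFQ69G5FAV"));
console.log(classifyId("II" + "0".repeat(24)));
```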

IPLD-Style Links

Version links use the IPLD convention for content-addressed references:

{ "/": "bafkrei..." }

The CID (Content Identifier) is computed locally using the same algorithm as IPFS, which gives content-addressing guarantees without requiring IPFS infrastructure.

Note: Storage is on Cloudflare R2/D1, not IPFS. The IPLD link format is preserved for compatibility with content-addressed systems.
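When reading responses, it can help to distinguish a link object from ordinary nested data. A minimal sketch follows; `cidFromLink` is a hypothetical helper name, and the check assumes a link is an object whose only key is "/".

```javascript
// Hypothetical helper: return the CID string from an IPLD-style link,
// or null if the value is not shaped like a link.
function cidFromLink(value) {
  const isPlainObject =
    typeof value === "object" && value !== null && !Array.isArray(value);
  if (!isPlainObject) return null;
  const keys = Object.keys(value);
  // A link has exactly one key, "/", whose value is the CID string.
  if (keys.length !== 1 || keys[0] !== "/" || typeof value["/"] !== "string") {
    return null;
  }
  return value["/"];
}

console.log(cidFromLink({ "/": "bafkrei..." }));
console.log(cidFromLink({ name: "not a link" }));
```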

CID Format

CIDs (Content Identifiers) are computed from the manifest JSON using:

Hash algorithm: SHA-256
Codec: Raw (0x55)
Version: CIDv1
Encoding: Base32 (lowercase)

Result format: CIDs starting with bafkrei...

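Example (illustrative placeholder; lowercase Base32 uses only a-z and 2-7, so this string is not a decodable CID):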

bafkreiabc123def456ghi789jkl012mno345pqr678stu901vwx234yz5

For deterministic CID computation, objects are canonically stringified with recursively sorted keys before hashing. This ensures the same content always produces the same CID regardless of original key order.
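The following sketch shows how the parameters above combine into a CID string in Node.js. It is not the service's implementation: `canonicalStringify`, `base32`, and `computeCid` are hypothetical names, and it assumes the canonical string is UTF-8 encoded and contains no whitespace between tokens, details the text above does not specify.

```javascript
const { createHash } = require("node:crypto");

// Serialize a plain JSON value with object keys sorted recursively.
function canonicalStringify(value) {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalStringify).join(",")}]`;
  }
  if (value !== null && typeof value === "object") {
    const members = Object.keys(value)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalStringify(value[key])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

const BASE32 = "abcdefghijklmnopqrstuvwxyz234567"; // RFC 4648, lowercase

// Unpadded Base32 encoding of a byte sequence.
function base32(bytes) {
  let bits = 0; // number of bits currently buffered
  let buffer = 0; // buffered bits, most significant first
  let out = "";
  for (const byte of bytes) {
    buffer = (buffer << 8) | byte;
    bits += 8;
    while (bits >= 5) {
      out += BASE32[(buffer >>> (bits - 5)) & 31];
      bits -= 5;
    }
    buffer &= (1 << bits) - 1; // keep only the leftover bits
  }
  if (bits > 0) out += BASE32[(buffer << (5 - bits)) & 31];
  return out;
}

function computeCid(manifest) {
  const digest = createHash("sha256")
    .update(canonicalStringify(manifest), "utf8") // assumption: UTF-8 input
    .digest();
  // CIDv1 (0x01), raw codec (0x55), sha2-256 (0x12), 32-byte digest (0x20).
  const bytes = Buffer.concat([Buffer.from([0x01, 0x55, 0x12, 0x20]), digest]);
  return "b" + base32(bytes); // "b" is the multibase prefix for lowercase Base32
}

// Key order does not affect the result.
console.log(computeCid({ b: 1, a: 2 }) === computeCid({ a: 2, b: 1 }));
```

The four header bytes are what produce the constant bafkrei prefix: every CID built this way shares them, and only the digest that follows differs.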

Timestamp Formats

The API uses two timestamp formats:

Field	Format	Example
`created_at`, `ts` (in version history)	ISO 8601 datetime	`2025-12-26T12:00:00.000Z`
`ts` (in entity response)	Unix timestamp (milliseconds)	`1766750400000`
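Since the same field name can carry either format, client code may want to normalize both to a single type. A minimal sketch, assuming values arrive either as a number of milliseconds or as an ISO 8601 string (`toDate` is a hypothetical helper):

```javascript
// Hypothetical helper: normalize either timestamp format to a Date.
function toDate(ts) {
  let date;
  if (typeof ts === "number") {
    date = new Date(ts); // Unix milliseconds
  } else if (typeof ts === "string") {
    date = new Date(ts); // ISO 8601 datetime
  } else {
    throw new TypeError("timestamp must be a number or a string");
  }
  if (Number.isNaN(date.getTime())) {
    throw new Error(`unrecognized timestamp: ${ts}`);
  }
  return date;
}

// Both calls describe the same instant.
console.log(toDate("2025-12-26T12:00:00.000Z").getTime());
console.log(toDate(1766750400000).toISOString());
```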

Network Header

Set X-Arke-Network to select the network:

Value	Network	ID Format	Storage
`main` (default)	Production	Standard ULID	Production buckets
`test`	Test	`II`-prefixed	Test buckets

Authentication Headers

Header	Description
`Authorization: Bearer <token>`	Supabase JWT token for user authentication
`Authorization: ApiKey ak_xxx`	Agent API key authentication
`Authorization: ApiKey uk_xxx`	User API key authentication
`X-On-Behalf-Of`	User entity ID for service accounts acting on behalf of a user (requires `role: service` auth)
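The network and authentication headers can be assembled in one place. The sketch below uses a hypothetical `buildHeaders` helper with illustrative option names; only the header names and value formats come from the tables above.

```javascript
// Hypothetical helper: build request headers from the conventions above.
function buildHeaders({ network = "main", bearerToken, apiKey, onBehalfOf } = {}) {
  if (network !== "main" && network !== "test") {
    throw new Error(`unknown network: ${network}`);
  }
  const headers = { "X-Arke-Network": network };
  if (bearerToken) {
    headers.Authorization = `Bearer ${bearerToken}`;
  } else if (apiKey) {
    headers.Authorization = `ApiKey ${apiKey}`; // ak_... or uk_... keys
  }
  if (onBehalfOf) {
    headers["X-On-Behalf-Of"] = onBehalfOf; // requires service-role auth
  }
  return headers;
}

console.log(buildHeaders({ network: "test", apiKey: "ak_xxx" }));
```

The returned object can be passed as the `headers` option of `fetch` or a similar HTTP client.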

Pagination

The API uses two pagination patterns depending on the endpoint:

Offset-based Pagination

Used for collection entities and user collections:

{
  "pagination": {
    "offset": 0,
    "limit": 50,
    "count": 25,
    "has_more": false
  }
}

Query parameters: ?offset=0&limit=50
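A client can walk all pages by advancing `offset` until `has_more` is false. The sketch below is transport-agnostic: `fetchPage` is a hypothetical callback that performs the request for a given offset and limit and resolves to the parsed body, and `itemsKey` names the response field holding the items, which varies by endpoint and is an assumption here.

```javascript
// Hypothetical helper: collect every item from an offset-paginated endpoint.
async function fetchAllOffset(fetchPage, itemsKey, limit = 50) {
  const items = [];
  let offset = 0;
  for (;;) {
    const body = await fetchPage(offset, limit);
    items.push(...body[itemsKey]);
    const { count, has_more: hasMore } = body.pagination;
    // Stop on the last page; the count check guards against an endless loop
    // if a page unexpectedly comes back empty.
    if (!hasMore || count === 0) break;
    offset += count;
  }
  return items;
}

// Usage with an in-memory stand-in for the HTTP call.
const all = Array.from({ length: 120 }, (_, i) => i);
const stubPage = async (offset, limit) => {
  const slice = all.slice(offset, offset + limit);
  return {
    entities: slice,
    pagination: {
      offset,
      limit,
      count: slice.length,
      has_more: offset + slice.length < all.length,
    },
  };
};
fetchAllOffset(stubPage, "entities").then((items) => console.log(items.length));
```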

Cursor-based Pagination

Used for version history and events:

{
  "versions": [...],
  "has_more": true,
  "next_cursor": "bafkrei..."
}

Query parameters: ?from=<cursor>&limit=10

For events, the cursor is a numeric ID: ?cursor=12345
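Cursor-based endpoints can be consumed lazily with an async generator. In this sketch, `fetchPage` is a hypothetical callback that sends the cursor in whichever query parameter the endpoint expects and resolves to the parsed body; the `versions` field name follows the version-history response above.

```javascript
// Hypothetical helper: yield versions one at a time, requesting further
// pages only when the consumer asks for more.
async function* iterateVersions(fetchPage, limit = 10) {
  let cursor = null; // no cursor on the first request
  do {
    const body = await fetchPage(cursor, limit);
    yield* body.versions;
    cursor = body.has_more ? body.next_cursor : null;
  } while (cursor != null);
}

// Usage with an in-memory stand-in for the HTTP call.
const pages = {
  start: { versions: ["v3", "v2"], has_more: true, next_cursor: "c1" },
  c1: { versions: ["v1"], has_more: false },
};
const stubPage = async (cursor) => pages[cursor ?? "start"];

(async () => {
  for await (const version of iterateVersions(stubPage)) {
    console.log(version);
  }
})();
```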

Error Response Format

All errors share one JSON structure. The `details` object is included only when there is additional context:

{
  "error": "Error message",
  "details": {}
}

Standard Error Codes

Status	Error Type	Example
400	Validation error	`{"error": "Validation failed", "details": {"issues": [...]}}`
401	Unauthorized	`{"error": "Unauthorized: Missing or invalid authentication token"}`
403	Forbidden	`{"error": "Forbidden: You do not have permission to perform this action"}`
404	Not found	`{"error": "Entity not found"}`
409	CAS conflict	`{"error": "Conflict: entity was modified", "details": {"expected": "...", "actual": "..."}}`
500	Internal error	`{"error": "Internal server error"}`
503	Service unavailable	`{"error": "Service unavailable", "details": {"service": "pinecone"}}`
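On the client side, the shared error shape can be mapped to a single exception type. The sketch below assumes a fetch-style response object with `ok`, `status`, and `json()`; `ArkeApiError` and `parseOrThrow` are hypothetical names, not part of any published SDK.

```javascript
// Hypothetical error type carrying the HTTP status and the "details" payload.
class ArkeApiError extends Error {
  constructor(status, body) {
    const message =
      body && typeof body.error === "string" ? body.error : `HTTP ${status}`;
    super(message);
    this.name = "ArkeApiError";
    this.status = status;
    this.details = (body && body.details) || {}; // "details" may be absent
  }

  get isConflict() {
    return this.status === 409;
  }
}

// Return the parsed body on success, or throw an ArkeApiError otherwise.
async function parseOrThrow(response) {
  // Tolerate non-JSON bodies (for example from an intermediary proxy).
  const body = await response.json().catch(() => null);
  if (!response.ok) throw new ArkeApiError(response.status, body);
  return body;
}

// Usage with a stand-in response object.
parseOrThrow({
  ok: false,
  status: 404,
  json: async () => ({ error: "Entity not found" }),
}).catch((err) => console.log(err.status, err.message));
```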

CAS (Compare-And-Swap) Errors

When updating an entity, you must provide expect_tip set to the CID of the version you last read. If the entity has been modified since then, the update is rejected with a 409 response:

{
  "error": "Conflict: entity was modified",
  "details": {
    "expected": "bafkreibug443...",
    "actual": "bafkreinewabc..."
  }
}
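A common way to handle a 409 is to re-read the entity, re-apply the change to the fresh data, and try again. The sketch below illustrates that loop under stated assumptions: `readEntity` and `writeEntity` are hypothetical callbacks wrapping the actual requests, and the `{ cid, data }` and `{ status, body }` shapes are illustrative.

```javascript
// Hypothetical helper: optimistic update with a bounded number of retries.
//   readEntity()                -> { cid, data }
//   writeEntity(data, expectTip) -> { status, body }
async function updateWithRetry(readEntity, writeEntity, change, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await readEntity();
    // Send the CID we just read as expect_tip.
    const result = await writeEntity(change(current.data), current.cid);
    if (result.status !== 409) return result;
    // 409: another writer got there first. Loop to re-read and re-apply the
    // change on top of the newer version instead of overwriting it.
  }
  throw new Error(`CAS conflict persisted after ${maxAttempts} attempts`);
}

// Usage with an in-memory entity in place of real requests.
let stored = { cid: "cid-1", data: { title: "Draft" } };
const read = async () => ({ cid: stored.cid, data: { ...stored.data } });
const write = async (data, expectTip) => {
  if (expectTip !== stored.cid) return { status: 409, body: {} };
  stored = { cid: "cid-2", data };
  return { status: 200, body: data };
};
updateWithRetry(read, write, (d) => ({ ...d, title: "Final" })).then((r) =>
  console.log(r.status)
);
```

Re-applying the change to freshly read data, rather than resending the original payload with the new CID, keeps the other writer's modifications intact.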

Inline Entity References

For referencing entities within text content (e.g., in description, ocr_text, or content properties), use the arke: URI scheme:

arke:<entity-id>

This scheme is domain-agnostic and future-proof: the entity ID is the stable identifier, not a URL that might change.

Markdown Syntax

[Display Label](arke:01JENTITY123ABC456789XYZ)

Examples

This analysis builds on [Previous Report](arke:01JFILE789ABC012345DEF).

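The header contains [Company Logo](arke:01JIMAGE456DEF789012ABC).

The IDs in these examples are shortened, readable placeholders. Real IDs are 26 characters long and use only the Crockford Base32 alphabet.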

Parsing

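// Raw references. Capture group 1 is the entity ID.
// Both patterns use the g flag, so they keep lastIndex state when used with
// exec() or test(); String.prototype.matchAll avoids that.
const INLINE_REF_PATTERN = /arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})/g;

// Markdown links with labels. Group 1 is the label, group 2 is the entity ID.
const MARKDOWN_REF_PATTERN = /\[([^\]]+)\]\(arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})\)/g;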
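One possible way to use the two patterns together is to collect every referenced ID once and attach a label where a Markdown link supplies one. `extractReferences` is a hypothetical helper; the patterns are repeated so the snippet runs on its own, and the sample text uses syntactically valid IDs.

```javascript
const INLINE_REF_PATTERN = /arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})/g;
const MARKDOWN_REF_PATTERN = /\[([^\]]+)\]\(arke:([0-9A-HJKMNP-TV-Z]{26}|II[0-9A-HJKMNP-TV-Z]{24}|[FC][0-9A-HJKMNP-TV-Z]{25})\)/g;

// Hypothetical helper: list each referenced entity once, in order of first
// appearance, with the Markdown label if one exists.
function extractReferences(text) {
  const labels = new Map();
  for (const [, label, id] of text.matchAll(MARKDOWN_REF_PATTERN)) {
    if (!labels.has(id)) labels.set(id, label);
  }
  // The inline pattern also matches the URI inside Markdown links, so this
  // pass sees every reference.
  const ids = new Set();
  for (const [, id] of text.matchAll(INLINE_REF_PATTERN)) {
    ids.add(id);
  }
  return [...ids].map((id) => ({ id, label: labels.get(id) ?? null }));
}

const text =
  "See [Previous Report](arke:01ARZ3NDEKTSV4RRFFQ69G5FAV) " +
  "and the raw test reference arke:II000000000000000000000000.";
console.log(extractReferences(text));
```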

See Entity References for the distinction between inline URIs and structured EntityRef objects.

Schema Versioning

Entity schemas follow the namespace/type@version convention:

arke/eidos@v1      -- Base entity schema (current)
arke/file@v1       -- File profile
arke/collection@v1 -- Collection profile

When schemas evolve, the version number increments. The system validates entities against the schema version specified in their schema field.
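A schema identifier can be split into its three parts with a small parser. This sketch assumes lowercase names and versions of the form v followed by an integer, which matches the examples above but is not stated as a rule; `parseSchemaId` is a hypothetical helper.

```javascript
// Hypothetical helper: split "namespace/type@version" into its parts.
function parseSchemaId(schema) {
  // Assumption: lowercase alphanumeric names (with - or _) and "v<integer>".
  const match = /^([a-z0-9_-]+)\/([a-z0-9_-]+)@v(\d+)$/.exec(schema);
  if (!match) throw new Error(`invalid schema identifier: ${schema}`);
  const [, namespace, type, version] = match;
  return { namespace, type, version: Number(version) };
}

console.log(parseSchemaId("arke/collection@v1"));
```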
