Limits and Quotas

Engram Limits

Per-Request Limits

| Resource | Limit | Notes |
| --- | --- | --- |
| Messages per append_messages call | 200 | Batch your messages in groups of 200 or fewer |
| Search results (limit param) | 50 max | Default: 10 |
| Messages per get_conversation (message_limit) | 500 max | Default: 100. Use pagination for longer conversations |
| Conversations per list_conversations (limit) | 100 max | Default: 20. Use offset for pagination |
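
The batching and pagination notes in the table can be sketched on the client side roughly as follows. This is a minimal sketch, not Engram's client library: `append_fn` and `list_fn` are hypothetical callables standing in for however you invoke `append_messages` and `list_conversations`. Only the caps (200 and 100) and the `limit`/`offset` parameter names come from the table.

```python
from typing import Callable, Iterator, Sequence

MAX_MESSAGES_PER_APPEND = 200  # per-call cap from the table above
MAX_LIST_LIMIT = 100           # list_conversations cap from the table above


def batches(items: Sequence, size: int = MAX_MESSAGES_PER_APPEND) -> Iterator[Sequence]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def append_all(messages: Sequence, append_fn: Callable[[Sequence], object]) -> int:
    """Send `messages` through `append_fn` (a hypothetical wrapper around
    append_messages) in batches of 200 or fewer. Returns the number of calls."""
    calls = 0
    for batch in batches(messages):
        append_fn(batch)
        calls += 1
    return calls


def list_all(list_fn: Callable[[int, int], Sequence],
             page_size: int = MAX_LIST_LIMIT) -> Iterator:
    """Walk every conversation using limit/offset pagination.

    `list_fn(limit, offset)` is a hypothetical wrapper around
    list_conversations that returns one page as a sequence.
    """
    offset = 0
    while True:
        page = list_fn(page_size, offset)
        yield from page
        if len(page) < page_size:  # a short page means there is nothing left
            break
        offset += page_size
```

With this sketch, appending 450 messages takes three calls (200, 200, and 50 messages), and each call counts as one Worker request.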

Data Limits

| Resource | Limit | Notes |
| --- | --- | --- |
| Message content size | No hard limit | Practical limit ~100KB per message. D1 TEXT fields have no defined max |
| Tags per conversation | No hard limit | Stored as JSON array. Keep reasonable for query performance |
| Metadata size | No hard limit | Stored as JSON object. Keep under a few KB |
| Title length | No hard limit | TEXT field |
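
Because none of these limits is enforced, a client may want to check sizes itself before sending. The sketch below is one way to do that. The 100KB figure comes from the table, while the 4,000-byte metadata threshold is only an assumed reading of "a few KB", and the function names are illustrative rather than part of Engram.

```python
import json

PRACTICAL_MESSAGE_BYTES = 100_000  # ~100KB practical limit from the table (not enforced)
METADATA_SOFT_BYTES = 4_000        # assumption: one reading of "a few KB"


def utf8_size(text: str) -> int:
    """Size of `text` in bytes once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def size_warnings(content: str, metadata: dict) -> list[str]:
    """Return human-readable warnings; an empty list means both sizes look fine."""
    warnings = []
    if utf8_size(content) > PRACTICAL_MESSAGE_BYTES:
        warnings.append("content exceeds the ~100KB practical limit")
    if utf8_size(json.dumps(metadata)) > METADATA_SOFT_BYTES:
        warnings.append("metadata is larger than a few KB")
    return warnings
```

Sizes are measured in bytes rather than characters, since non-ASCII text takes more than one byte per character in UTF-8.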

Chunking Parameters

| Parameter | Value | Description |
| --- | --- | --- |
| Window size | 5 messages | Number of messages per chunk |
| Stride | 3 messages | Step between consecutive chunk starts |
| Overlap | 2 messages | Messages shared between adjacent chunks |

These are fixed in the current release. Configurable chunking is on the roadmap.
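
To make the window and stride concrete, here is a minimal sketch of sliding-window chunking with these parameters. How Engram handles the tail of a conversation is not documented here, so the shorter final chunk below is an assumption, and `chunk_ranges` is an illustrative name.

```python
WINDOW = 5  # messages per chunk
STRIDE = 3  # step between consecutive chunk starts


def chunk_ranges(n_messages: int, window: int = WINDOW, stride: int = STRIDE):
    """Return (start, end) message-index pairs, end exclusive.

    Assumption: the last chunk may be shorter than `window` so that
    every message lands in at least one chunk.
    """
    ranges = []
    start = 0
    while start < n_messages:
        end = min(start + window, n_messages)
        ranges.append((start, end))
        if end == n_messages:  # reached the end of the conversation
            break
        start += stride
    return ranges


print(chunk_ranges(10))  # [(0, 5), (3, 8), (6, 10)]
```

Adjacent chunks such as (0, 5) and (3, 8) share messages 3 and 4, which is the 2-message overlap (window minus stride).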

Cloudflare Platform Limits

Engram runs on Cloudflare’s platform. These are the underlying platform limits.

Workers (Free Plan)

| Resource | Limit |
| --- | --- |
| Requests per day | 100,000 |
| CPU time per request | 10ms |
| Memory | 128MB |
| Script size | 1MB |

Workers (Paid — $5/month)

| Resource | Limit |
| --- | --- |
| Requests per month | 10 million included, $0.50/million after |
| CPU time per request | 30 seconds |
| Memory | 128MB |
| Script size | 10MB |

D1

| Resource | Free | Paid |
| --- | --- | --- |
| Storage | 5GB | 10GB+ |
| Rows read per day | 5 million | 50 billion |
| Rows written per day | 100,000 | 50 million |
| Databases | 10 | 50,000 |

Vectorize

| Resource | Limit |
| --- | --- |
| Vectors per index | 5,000,000 |
| Dimensions | Up to 1536 (Engram uses 768) |
| Metadata per vector | 10KB |
| Namespaces per index | 1,000 |
| Indexes per account | 100 |

Workers AI

| Resource | Limit |
| --- | --- |
| bge-base-en-v1.5 | Free, unlimited |
| Max input tokens | 512 tokens per text |
| Batch size | 100 texts per call |

Estimating Usage

Storage

Message content is gzip-compressed before storage, reducing space by roughly 3×. A typical message is ~3 KB raw and ~1 KB compressed. With chunk overhead, the estimate used here is roughly 35 KB per 100-message conversation.

At D1’s paid tier (10GB), with compression:

  • ~280,000 conversations with 100 messages each
  • ~28 million individual messages
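
The arithmetic behind these figures can be reproduced with the short sketch below. It assumes decimal units (1 GB = 10^9 bytes, 1 KB = 10^3 bytes) and takes the ~35 KB per 100-message conversation estimate at face value. The function name is illustrative.

```python
D1_PAID_BYTES = 10 * 10**9            # 10GB, decimal units (assumption)
BYTES_PER_CONVERSATION = 35 * 10**3   # ~35 KB per 100-message conversation
MESSAGES_PER_CONVERSATION = 100


def storage_capacity(total_bytes: int = D1_PAID_BYTES,
                     bytes_per_conversation: int = BYTES_PER_CONVERSATION,
                     messages_per_conversation: int = MESSAGES_PER_CONVERSATION):
    """Return (conversations, messages) that fit in `total_bytes`."""
    conversations = total_bytes // bytes_per_conversation
    return conversations, conversations * messages_per_conversation


conversations, messages = storage_capacity()
print(conversations)  # 285714, rounded down to ~280,000 above
print(messages)       # 28571400, i.e. ~28 million
```

Swap in your own per-conversation size to see how the estimate moves. It scales inversely with message size.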

Short messages (<128 bytes) are stored uncompressed — gzip headers would make them larger.
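
The threshold rule can be sketched as below. The gzip format adds a fixed 18 bytes of header and trailer, which is why very short payloads grow when compressed. How Engram records whether a given message was compressed is not described here, so the boolean flag returned alongside the payload is an assumption, as are the function names.

```python
import gzip

COMPRESS_THRESHOLD_BYTES = 128  # messages shorter than this are stored as-is


def encode_content(text: str) -> tuple[bool, bytes]:
    """Return (compressed?, payload) for a message body."""
    raw = text.encode("utf-8")
    if len(raw) < COMPRESS_THRESHOLD_BYTES:
        return False, raw          # too short for gzip to pay off
    return True, gzip.compress(raw)


def decode_content(compressed: bool, payload: bytes) -> str:
    """Invert encode_content using the stored flag (hypothetical layout)."""
    data = gzip.decompress(payload) if compressed else payload
    return data.decode("utf-8")
```

A round trip through `encode_content` and `decode_content` returns the original text in both the compressed and uncompressed cases.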

Vectors

Each chunk produces one 768-dimensional vector. With a window of 5 and stride of 3, a 100-message conversation produces roughly 33 chunks.

At Vectorize’s 5M vector limit:

  • ~150,000 conversations with 100 messages each
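
The chunk count and the resulting capacity follow from a short calculation, sketched below. It assumes a trailing partial window still produces a chunk, which is what gives 33 rather than 32 for 100 messages. The function name is illustrative.

```python
import math

VECTORIZE_MAX_VECTORS = 5_000_000  # vectors per index, from the Vectorize table


def chunk_count(n_messages: int, window: int = 5, stride: int = 3) -> int:
    """Number of chunks (one vector each) for a conversation.

    Assumption: a final partial window still counts as a chunk.
    """
    if n_messages <= 0:
        return 0
    if n_messages <= window:
        return 1
    return math.ceil((n_messages - window) / stride) + 1


per_conversation = chunk_count(100)
print(per_conversation)                           # 33
print(VECTORIZE_MAX_VECTORS // per_conversation)  # 151515, i.e. ~150,000
```

Longer conversations use vectors at roughly one per three messages, so the index fills at about a third of the message count.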

Requests

Each append_messages call is 1 Worker request, regardless of how many messages it carries (up to 200). Each search is also 1 request. At 100K requests/day (free tier), that’s roughly:

  • 100K append_messages calls or searches per day, combined
  • ~4,000 per hour
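
A rough budget calculation, assuming requests are split only between searches and `append_messages` calls, might look like the sketch below. The message figure is an upper bound that assumes every append call carries a full batch of 200. It ignores CPU time and the D1 rows-written limits listed earlier, which will usually bind first. The function name is illustrative.

```python
FREE_REQUESTS_PER_DAY = 100_000
MAX_MESSAGES_PER_APPEND = 200


def request_budget(requests_per_day: int = FREE_REQUESTS_PER_DAY,
                   searches_per_day: int = 0,
                   batch_size: int = MAX_MESSAGES_PER_APPEND) -> dict:
    """Split a daily request allowance between searches and append calls."""
    if not 1 <= batch_size <= MAX_MESSAGES_PER_APPEND:
        raise ValueError("batch_size must be between 1 and 200")
    append_calls = max(requests_per_day - searches_per_day, 0)
    return {
        "per_hour": requests_per_day // 24,          # average hourly allowance
        "append_calls": append_calls,                # requests left for appends
        "max_messages": append_calls * batch_size,   # upper bound, full batches
    }


print(request_budget()["per_hour"])  # 4166, the ~4,000 per hour above
```

For example, reserving 20,000 requests for search leaves 80,000 append calls per day.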

Rate Limiting

Rate limiting is not yet implemented in the Engram MVP. It is planned for Phase 2.
