ZenDB is an embedded document database for Motoko that leverages stable memory to store and query large datasets efficiently. It provides a familiar document-oriented API, similar to MongoDB, allowing developers to store nested records, create collections, define indexes and perform complex queries on their data.
- Indexes: Support for multi-field indexes to speed up complex queries
- Full Candid Integration: Native Candid support that lets you insert and retrieve any Motoko data type without writing custom serialization or type mappings
- Rich Query Language And Query Builder: Comprehensive set of operators including equality, range, and logical operations with an intuitive fluent API for building queries
- Query Execution Engine: A performance-optimized query planner that searches for the execution path minimizing in-memory data processing
- Pagination: Supports both offset and cursor pagination
- Partial Field Updates: Update any nested field without having to re-insert the entire document
- Schema Validation And Constraints: Add restrictions on what specific values can be stored in each collection
Important Note: Currently, no automatic indexes are created for your data. You'll need to create your own indexes when defining your collection and schema. If no index exists that can satisfy a query, ZenDB falls back to a full collection scan, which is likely to hit the instruction limit for a dataset with as few as ten thousand records.
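One way to catch a query that fell back to a full scan is to watch the instructions count that search() returns (covered later in this document). The following is an illustrative sketch, not part of the ZenDB API; the budget value and variable names are assumptions:

```motoko
import Debug "mo:base/Debug";

// Illustrative guard: if a query's instruction cost blows past a budget,
// it likely ran a full collection scan and needs a supporting index.
// INSTRUCTION_BUDGET is an arbitrary threshold, not a ZenDB constant;
// `users` and `query` are assumed to be defined as in the examples below.
let INSTRUCTION_BUDGET : Nat = 1_000_000_000;

let #ok(result) = users.search(query);
if (result.instructions > INSTRUCTION_BUDGET) {
    Debug.print("Query cost exceeded budget - consider adding an index");
};
```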
- Requires moc version 0.14.13 or higher: mops toolchain use moc 0.14.13
- Install with mops: mops add zendb

Canisters have a limit on how much stable memory they can use. This limit can be set in the dfx.json file in the root of your project. By default, the limit is set to 4GB, but for larger datasets, you can increase it up to 500GB.
This measurement is in pages, where each page is 64KiB. For a 200GB limit, the limit would be 200 * (1024 * 1024) / 64 = 3276800 pages.
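The arithmetic above generalizes to any limit; a tiny helper (illustrative Motoko, not part of the ZenDB API) makes the conversion explicit:

```motoko
// Convert a stable-memory limit in GiB to 64 KiB pages for --max-stable-pages.
// Illustrative helper, not part of the ZenDB API.
func pagesForGiB(gib : Nat) : Nat {
    gib * 1024 * 1024 / 64 // GiB -> KiB, then KiB -> pages of 64 KiB
};

assert pagesForGiB(200) == 3_276_800; // the 200GB example above
assert pagesForGiB(4) == 65_536;      // the 4GB default
```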
In addition to setting the stable memory limit, it's recommended to use legacy persistence instead of enhanced orthogonal persistence (EOP) to limit heap allocations. In my testing, legacy persistence performed about half the heap allocations and cost about a quarter of the cycles for stable-memory-heavy applications.
"canisters": {
"stress-test" : {
"type": "motoko",
"main": "main.mo",
"args": "--max-stable-pages 3276800 --legacy-persistence --force-gc --generational-gc",
"optimize": "O3"
}
},

import Principal "mo:base/Principal";
import Time "mo:base/Time";
import ZenDB "mo:zendb";
actor class Canister() = canister_reference {
    let canister_id = Principal.fromActor(canister_reference);
    stable var zendb = ZenDB.newStableStore(canister_id, null);
    let db = ZenDB.launchDefaultDB(zendb);
}

// Motoko type
type User = {
    id: Nat;
    name: Text;
    email: Text;
    profile: {
        age: ?Nat;
        location: Text;
        interests: [Text];
    };
    created_at: Int;
    updated_at: Int;
};
// corresponding schema type
let UsersSchema : ZenDB.Types.Schema = #Record([
    ("id", #Nat),
    ("name", #Text),
    ("email", #Text),
    ("profile", #Record([
        ("age", #Option(#Nat)),
        ("location", #Text),
        ("interests", #Array(#Text)),
    ])),
    ("created_at", #Int),
    ("updated_at", #Int),
]);
// serializer - boilerplate to convert between Motoko types and Candid blobs
let candify_users : ZenDB.Types.Candify<User> = {
    to_blob = func(user: User) : Blob { to_candid(user) };
    from_blob = func(blob: Blob) : ?User { from_candid(blob) };
};

// Create collection
let #ok(users) = db.createCollection("users", UsersSchema, candify_users, null);

// Create optimal indexes for your query patterns
let #ok(_) = users.createIndex("name_idx", [("name", #Ascending)], null);
let #ok(_) = users.createIndex(
    "location_created_at_idx",
    [
        ("profile.location", #Ascending),
        ("created_at", #Descending),
    ],
    null,
);

// Insert a document
let user : User = {
    id = 1;
    name = "Alice";
    email = "alice@example.com";
    profile = {
        age = ?35;
        location = "San Francisco";
        interests = ["coding", "hiking", "photography"];
    };
    created_at = Time.now();
    updated_at = Time.now();
};

let #ok(userId) = users.insert(user);
// Query with the QueryBuilder API
let #ok(queryResults) = users.search(
    ZenDB.QueryBuilder()
        .Where("profile.location", #eq(#Text("San Francisco")))
        .And("profile.age", #gte(#Nat(30)))
        .SortBy("created_at", #Descending)
        .Limit(10)
);

// Search returns a SearchResult with documents and instruction count
assert queryResults.documents == [(userId, user)];

// Update specific fields in a document
let #ok(_) = users.updateById(
    userId,
    [
        ("profile.location", #Text("New York")),
        ("profile.interests", #Array([#Text("coding"), #Text("reading")])),
    ],
);

let ?updatedUser1 = users.get(userId);
assert updatedUser1.profile.location == "New York";
assert updatedUser1.profile.interests == ["coding", "reading"];
// Update multiple fields, referencing the current value of a field
let currTime = Time.now();
let #ok(_) = users.update(
    ZenDB.QueryBuilder().Where("email", #eq(#Text("alice@example.com"))),
    [
        ("name", #uppercase(#currValue)),
        ("profile.age", #add(#currValue, #Nat(1))),
        ("updated_at", #Int(currTime)),
        ("email", #lowercase(
            #concatAll([
                #concat(#get("name"), #Text("-in-")),
                #replaceSubText(#get("profile.location"), " ", "-"),
                #Text("@example.com"),
            ])
        )),
    ],
);

let ?updatedUser2 = users.get(userId);
assert updatedUser2.name == "ALICE";
assert updatedUser2.profile.age == ?36;
assert updatedUser2.updated_at == currTime;
assert updatedUser2.email == "alice-in-new-york@example.com";
Monitor your collections to understand performance characteristics:
let stats = users.stats();

// Find active premium users who joined recently
let #ok(search_result) = users.search(
    ZenDB.QueryBuilder()
        .Where("status", #eq(#Text("active")))
        .And("account_type", #eq(#Text("premium")))
        .And("joined_date", #gte(#Int(oneWeekAgo)))
        .SortBy("activity_score", #Descending)
        .Limit(25)
);
let activeRecentPremiumUsers = search_result.documents;
// Find users matching any of several criteria
let #ok(search_result2) = users.search(
    ZenDB.QueryBuilder()
        .Where("role", #eq(#Text("admin")))
        .OrQuery(
            ZenDB.QueryBuilder()
                .Where("subscription_tier", #eq(#Text("enterprise")))
                .And("usage", #gte(#Nat(highUsageThreshold)))
        )
);

The repository includes several examples demonstrating ZenDB's capabilities:
- Simple Notes Dapp: Example Notes app, with simple CRUD operations
- ICP Txs explorer: ZenDB test app that indexes ICP transactions
- Flying Ninja: Dapp from the dfinity/examples repo ported to use ZenDB.
For more detailed examples and advanced usage, see the Complete Documentation.
ZenDB Documentation - Comprehensive guide covering:
- Getting Started: ZenDB instances, memory types, configuration
- Schema Definition: Type system, constraints, validation
- Collection Management: Creating collections, CRUD operations
- Advanced Querying: QueryBuilder API, logical grouping, operators
- Indexing: B-Tree indexes, composite indexes, Orchid encoding
- Performance: Query optimization, index selection strategies
- Monitoring: Collection statistics, memory usage
The documentation includes detailed examples, performance optimization tips, and best practices to help you get the most out of ZenDB.
ZenDB uses a sophisticated query planner to determine the most efficient indexes for each query. To get the best performance from ZenDB, create indexes that:
- Match your most common query patterns
- Include fields used in sorting operations
- Support your filtering operations (equality, range conditions)
For optimal query performance, order fields in composite indexes by priority:
- Equality filters - Fields with exact matches (#eq) come first
- Sort fields - Fields used for ordering results come second
- Range filters - Fields with range queries (#gt, #lt, #between) come last
This ordering is crucial because ZenDB stores composite indexes as concatenated keys in a B-tree structure. When equality filters come first, the query engine can combine these conditions into a single key prefix for efficient B-tree scanning. This allows the system to quickly narrow down to the smallest possible result set before applying range operations. If range fields were placed first, the index couldn't be fully utilized since range operations break the prefix matching pattern, forcing expensive full index scans.
Example query using ZenDB QueryBuilder:
let #ok(results) = users.search(
    ZenDB.QueryBuilder()
        .Where("age", #gt(#Nat(18)))         // Range filter
        .And("status", #eq(#Text("active"))) // Equality filter
        .SortBy("created_at", #Descending)   // Sort operation
);

Optimal index for this query:
let #ok(_) = users.createIndex(
    "status_date_age_idx",
    [
        ("status", #Ascending),      // Equality filter comes first
        ("created_at", #Descending), // Sort field comes second
        ("age", #Ascending),         // Range filter comes last
    ],
    null,
);

The search() method returns a Result<SearchResult<Record>, Text> where SearchResult contains:
- documents: An array of (DocumentId, Document) tuples - the document ID and the document itself
- instructions: The number of instructions used to execute the query
let #ok(search_result) = users.search(query);
let documents = search_result.documents;
let instructions_used = search_result.instructions;

This allows you to monitor query performance and access both the results and execution metrics.
- Limited array support: arrays can be stored in collections, but indexes cannot be created on array fields or on fields nested within an #Array, and operations on specific array elements are not supported
- No support for text-based full-text search or pattern matching within indexes
- Complex queries with many OR conditions may have suboptimal performance
- The query planner may not always select the optimal index for complex queries. It is recommended to analyze query performance and adjust indexes accordingly
- Schema updates and migrations are not yet supported. As a result, changing the schema of an existing collection requires creating a new collection and migrating the data manually
- Limit/Skip pagination can be inefficient and may hit the instruction limit if the result set is too large. It is recommended to create indexes that fully cover your queries where possible, to avoid this limitation
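For the pagination limitation in particular, a cursor-style page over an indexed sort field avoids the cost of Skip. Below is a sketch using the QueryBuilder operators shown earlier, where lastSeenCreatedAt is a hypothetical cursor value the caller saved from the last document of the previous page:

```motoko
// Cursor pagination sketch: rather than Skip(n), filter past the previous
// page's last sort key so the index scan starts directly at the cursor.
// `lastSeenCreatedAt` is a hypothetical value supplied by the caller.
let #ok(nextPage) = users.search(
    ZenDB.QueryBuilder()
        .Where("created_at", #lt(#Int(lastSeenCreatedAt)))
        .SortBy("created_at", #Descending)
        .Limit(25)
);
```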
- Multi-field compound indexes
- Powerful query language with logical operators
- Schema validation and Schema constraints
- Fully support Array fields and operations on them
- Multi-key array indexes - for indexing fields within arrays
- Backward compatible schema upgrades and versioning
- Aggregation functions (min, max, sum, avg, etc.)
- Support for migrations
- Full-text search capabilities by implementing an inverted text index
- Data certification of all documents, using the ic-certification Motoko library
- Dedicated database canister for use by clients in other languages (e.g. JavaScript, Rust)
- Database management tools to handle collection creation, index management, and data migrations
- Periodic backups to external canisters
- Database canister monitoring and analytics tools
Contributions are welcome! Please feel free to open an issue to report a bug or submit a pull request. For new features, please open an issue first to discuss whether they fit the project.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0) - see the LICENSE file for details.