Skip to content

Stores and storage

A store is a flat key/value namespace with byte-range reads — everything else (arrays, groups, chunks) is keys and bytes on top of it. Anywhere the API takes a store you can pass a directory path (making a LocalStore) or a store object.

LocalStore

A directory on disk; each chunk/metadata document is a file. Writes are atomic (temp file + rename), so concurrent readers never see partial chunks.

root = "storage_demo.zarr";
z = zarr.create(root, [10 10], "double", Path="x", ChunkShape=[5 5]);
z(:, :) = magic(10);
z2 = zarr.open(root, Path="x");
assert(isequal(z2(:, :), magic(10)))

MemoryStore

In-memory, ideal for tests and scratch work:

mem = zarr.stores.MemoryStore();
zm = zarr.create(mem, 5, "int32");
zm(:) = int32((1:5)');
assert(isequal(zm(2:3), int32([2; 3])))

ZipStore

A whole hierarchy inside one .zip file, compatible with zarr-python's ZipStore. Write mode accumulates entries in memory and writes the file on close() (zip entries can't be rewritten in place):

zw = zarr.stores.ZipStore("archive.zarr.zip", Mode="w");
zarr.create_group(zw, Attributes=struct(kind="archive"));
za = zarr.create(zw, [4 4], "double", Path="data");
za(:, :) = magic(4);
zw.close();

zr = zarr.stores.ZipStore("archive.zarr.zip");    % read-only
g = zarr.open(zr);
assert(isequal(g.item("data").read(), magic(4)))
zr.close();

HTTP(S)

HttpStore reads Zarr data from any web server, object-store HTTP endpoint, or CDN. When the server honors Range requests (S3, nginx, …), sharded arrays fetch only the byte ranges they need.

store = zarr.stores.HttpStore("https://example.com/data.zarr");
z = zarr.open(store, Path="temperature");   % direct opens always work
tile = z(1:512, 1:512);

HTTP servers cannot list keys, so browsing the hierarchy (children, tree) requires consolidated metadata (below) — direct opens by Path always work. Public S3 buckets work today via their HTTPS endpoints (https://<bucket>.s3.<region>.amazonaws.com/...).

Consolidated metadata

Opening a deep hierarchy normally costs one read per zarr.json. Fatal over HTTP. zarr.consolidate_metadata inlines every node's metadata into the root group's zarr.json (the same format zarr-python writes and reads), after which hierarchy operations are served from memory:

zarr.create(root, 3, "int8", Path="deep/nested/y");
zarr.consolidate_metadata(root);
gc = zarr.open(root);
node = gc.item("deep").item("nested").item("y");   % zero store reads
assert(isequal(node.shape, 3))

Consolidation is a snapshot — re-run it after adding or removing nodes.

Deleting nodes

zarr.delete_node(root, "deep");                    % recursive
assert(~gc.store.exists("deep/nested/y/zarr.json"))

Custom stores

Subclass zarr.stores.Store and implement get, set, erase, exists, list, and listDir; override getPartial/getSuffix with true ranged reads if the backend supports them (that is what makes sharded partial reads efficient). See +zarr/+stores/HttpStore.m for a compact example.