UltiHash Self-Hosted

Run lightning-fast storage on your infra

Kubernetes-native object storage for AI, analytics, and high-throughput use cases.

Claim your free 10 TiB license



1. Install UltiHash

Ideally on flash-based storage with Kubernetes

2. Upload your data

Built-in deduplication reduces storage costs

3. High-throughput read

Connect with a robust S3-compatible API

4. Connect your stack

GenAI, analytics, machine learning tools

Install with Helm chart






Why teams choose UltiHash

Built for performance

Optimized for high throughput on modern flash storage, and ideal for write-once, read-many applications.

S3-compatible API

Drop into existing pipelines without changing your stack.

See integrations
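As a rough illustration of what "S3-compatible" means in practice, the sketch below points a standard S3 client (boto3) at a self-hosted endpoint instead of AWS. The endpoint URL, credentials, and region shown are placeholders, not UltiHash defaults; check your own deployment for the real values.

```python
def s3_client_kwargs(endpoint_url, access_key, secret_key, region="us-east-1"):
    """Build the keyword arguments boto3 needs to talk to a non-AWS S3 endpoint.

    All values are supplied by the caller; nothing here is an UltiHash default.
    """
    return {
        "endpoint_url": endpoint_url,          # your cluster's S3 endpoint (assumed)
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,                 # placeholder region for request signing
    }


def make_client(**kwargs):
    """Create the S3 client. Requires the third-party boto3 package."""
    import boto3  # imported lazily so the helper above works without boto3
    return boto3.client("s3", **kwargs)


# Hypothetical endpoint and credentials, for illustration only.
kwargs = s3_client_kwargs("http://storage.example.internal:8080", "my-key", "my-secret")
```

With a client from `make_client(**kwargs)`, existing code that calls `put_object`, `get_object`, or `list_objects_v2` should only need the endpoint changed, assuming the operations it uses are among those the storage backend supports.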

Secure by design



Reed-Solomon erasure coding for data resiliency



Policy-based access control



Versioning + object locking



SOC 2 Type II certified



Fully GDPR-compliant

Kubernetes-native

Works with any Kubernetes engine and CSI driver for easy integration into your environment

Fully software-defined

No lock-in or proprietary hardware dependencies. Deploy on-prem or in a VPC.

Binary-level deduplication

Save storage on datasets that contain repeated byte sequences (e.g. in images, logs, model weights)
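To make the idea of byte-level deduplication concrete, here is a toy sketch that splits data into fixed-size chunks and stores each unique chunk once. This is a generic textbook technique used purely for illustration; it is not UltiHash's actual algorithm, and the 4-byte chunk size is deliberately tiny.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, chosen so the example is easy to follow


def dedup_store(blobs, chunk_size=CHUNK_SIZE):
    """Split each blob into fixed-size chunks and keep one copy per unique chunk."""
    store = {}    # chunk digest -> chunk bytes (stored once)
    recipes = []  # per blob: ordered list of digests needed to rebuild it
    for blob in blobs:
        recipe = []
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild a blob from its recipe of chunk digests."""
    return b"".join(store[d] for d in recipe)


# Two objects that share their first 8 bytes.
blobs = [b"AAAABBBBCCCC", b"AAAABBBBDDDD"]
store, recipes = dedup_store(blobs)

logical = sum(len(b) for b in blobs)             # bytes written by the user
physical = sum(len(c) for c in store.values())   # bytes actually kept
print(f"logical={logical} B, physical={physical} B")
```

In this toy case the shared chunks are stored once, so 24 logical bytes occupy 16 physical bytes. Real savings depend entirely on how much repetition a dataset contains.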

Support from UltiHash engineers

Get support

Detailed documentation

Written by and for storage admins + DevOps engineers

See full documentation

Feature request board

Give feedback

Optimized for the data workloads that matter

Retrieval-augmented generation (RAG)

Serve unstructured documents and source content with high throughput.

In modern RAG pipelines, vector embeddings are served from vector databases, but the underlying unstructured data (PDFs, web pages, Markdown files, knowledge base articles) still resides in object storage.

UltiHash accelerates RAG systems by enabling:

Fast retrieval of full documents and pre-tokenized files



High concurrency for microservice architectures serving real-time LLM agents



Optional deduplication of redundant chunks to reduce overall storage footprint





Works with Zilliz, LanceDB, Chroma, or custom RAG pipelines



Ideal for hybrid cloud setups where data locality matters



Avoids egress costs by colocating with compute or caching layer

Model training with media data pipelines

Handle high-volume media data pipelines, from ingestion to model training.

Object storage becomes a bottleneck when media teams process large volumes of files for training or production tasks (e.g. object detection, speech-to-text).

UltiHash improves:

Throughput when ingesting and retrieving multi-frame image sequences or video segments



Versioning and deduplication of near-identical files (e.g. annotated vs. raw media)



Concurrent access by labeling tools, preprocessors, and ML pipelines





Validated with SuperAnnotate and MLOps image pipelines



Optimized for flash and high IOPS workloads (batch reads, partial range requests)



Ideal for edge or VPC clusters processing data

Analytics + data lakehouse architectures

Boost performance for read-heavy workloads across modern query engines.

Data lakehouses often rely on object storage for Parquet, ORC, or Iceberg-managed datasets. While the S3 API is the standard interface, the storage behind it can become a throughput bottleneck.

UltiHash improves:

Read throughput for scan-heavy workloads (e.g. SQL queries over columnar files)



Time-to-insight for data scientists using Spark, Presto/Trino, or DuckDB



Cost efficiency through deduplication of similar datasets (e.g. intermediate aggregates)





Compatible with Apache Iceberg, Delta Lake, etc.



Ideal for hybrid on-prem/cloud environments running analytics engines



Zero vendor lock-in with support for S3 API commands

Model training: checkpoints, weights, datasets

Speed up training runs with faster data access and lower infra overhead.

Training large models or fine-tuning them often involves accessing massive amounts of input data (images, telemetry, tabular logs) and writing intermediate outputs.

UltiHash delivers:

High-speed reads for datasets loaded via PyTorch, TensorFlow...



Deduplicated storage of, for example, model weights, checkpoints, and optimizer states in structured or binary formats



Consistent performance when multiple training jobs hit storage simultaneously





Works well with DDP-style training setups (multi-GPU, multi-node)



Supports range-based reads and resumable downloads



Efficient on both centralized and distributed infrastructure
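Range-based reads and resumable downloads, mentioned in the list above, both rest on the standard HTTP `Range` header that S3-style `GetObject` requests accept. The sketch below only builds the header values; the function names are illustrative and the part size is an arbitrary choice.

```python
def byte_ranges(object_size, part_size):
    """Yield HTTP Range header values that together cover the whole object.

    Range ends are inclusive, so a 10-byte object read in 4-byte parts
    produces bytes=0-3, bytes=4-7, bytes=8-9.
    """
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1
        yield f"bytes={start}-{end}"


def resume_range(bytes_already_downloaded):
    """Range header value that asks for everything after what we already have."""
    return f"bytes={bytes_already_downloaded}-"


parts = list(byte_ranges(10, 4))
resume = resume_range(1024)
print(parts, resume)
```

Each value can be passed as the `Range` parameter of a `GetObject` call, which lets several workers fetch different parts of a large checkpoint in parallel or lets an interrupted download continue from its last byte.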

UltiHash runs everywhere your workloads do

Virtual private cloud

Run UltiHash in your VPC with full control over performance.

On-premises bare-metal clusters

Install UltiHash directly on your own hardware for maximum performance, data locality, and zero external dependencies.

On any CSI-compatible storage backend

Deploy UltiHash on top of any CSI-backed volume in Kubernetes and integrate it with your existing stack.

We help you get started.

Talk to an UltiHash engineer



Start free, scale when you're ready for production

Test license

up to 10 TiB

Start for free

Premium deployment

10 - 999 TiB

Get access



Enterprise

1 PiB +

Talk to us

Two ways to pay.
Switch at any time.



Pay-as-you-go

Our most flexible pricing option, based solely on the amount of data actually stored in your UltiHash storage cluster.

~$10.40

/ TiB / month

charged per GiB per hour

RECOMMENDED FOR



Unpredictable storage needs



Prototyping & early deployments



Seamless up/down scaling on demand



Subscription

Our lowest price per gigabyte, with commitment discounts and monthly or annual billing. Ideal for large-scale use cases.

$7.20

/ TiB / month

* when billed annually

RECOMMENDED FOR



Stable, predictable storage demand



Lower cost for long-term projects



Enterprise AI production use
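For a rough sense of how the two listed prices compare, the sketch below converts the pay-as-you-go figure into a per-GiB-per-hour rate and estimates a monthly bill for each plan. The 730-hour month is an assumption, the pay-as-you-go price is listed as approximate, and the subscription price applies to annual billing, so treat the output as an estimate rather than a quote.

```python
GIB_PER_TIB = 1024
HOURS_PER_MONTH = 730  # assumption: 365 * 24 / 12, the real billing month may differ

PAYG_USD_PER_TIB_MONTH = 10.40          # listed as "~$10.40", i.e. approximate
SUBSCRIPTION_USD_PER_TIB_MONTH = 7.20   # listed price when billed annually


def payg_rate_per_gib_hour():
    """Approximate pay-as-you-go rate in USD per GiB per hour."""
    return PAYG_USD_PER_TIB_MONTH / GIB_PER_TIB / HOURS_PER_MONTH


def payg_cost(gib_stored, hours):
    """Estimated pay-as-you-go cost for holding gib_stored GiB for a number of hours."""
    return gib_stored * hours * payg_rate_per_gib_hour()


def subscription_monthly_cost(tib_committed):
    """Estimated monthly subscription cost for a committed capacity in TiB."""
    return tib_committed * SUBSCRIPTION_USD_PER_TIB_MONTH


# Example: 50 TiB held for a full (assumed) month on each plan.
payg = payg_cost(50 * GIB_PER_TIB, HOURS_PER_MONTH)
subscription = subscription_monthly_cost(50)
print(f"pay-as-you-go: ${payg:.2f}, subscription: ${subscription:.2f}")
```

Under these assumptions, 50 TiB comes to about $520 per month on pay-as-you-go versus about $360 on an annual subscription. Pay-as-you-go can still cost less when capacity is only needed for part of the month.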

Join a linked ecosystem of leading infrastructure providers and AI platforms

UltiHash integrates with a growing ecosystem of infrastructure providers and AI platforms. Whether you’re running workloads on AWS, deploying in Kubernetes, or building pipelines with tools like PyTorch, Spark, or Iceberg, UltiHash fits right in. Our S3-compatible API and high-throughput architecture make it easy to plug into your existing stack and accelerate performance without reworking your setup.

Want to partner with us?
