UltiHash Self-Hosted

Run lightning-fast storage on your infra

Kubernetes-native object storage for AI, analytics, and high-throughput use cases.

1. Install UltiHash
Ideally on flash-based storage with Kubernetes

2. Upload your data
Built-in deduplication reduces storage costs

3. High-throughput read
Connect with robust S3-compatible API

4. Connect your stack
GenAI, analytics, machine learning tools

Install with Helm chart

Why teams choose UltiHash

Built for performance

Optimized for high throughput on modern flash architectures, ideal for 'write once, read many' applications

S3-compatible API

Drop into existing pipelines without changing your stack.
See integrations
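
As a rough sketch of what "drop into existing pipelines" can look like with boto3: an S3 client is usually pointed at an S3-compatible service by overriding `endpoint_url`. The endpoint address, credentials, region, bucket name, and the helper names `s3_client_kwargs` and `make_client` are placeholders invented for this illustration; they are not taken from UltiHash documentation.

```python
def s3_client_kwargs(endpoint_url, access_key, secret_key, region="us-east-1"):
    """Build keyword arguments for boto3.client("s3", ...).

    Hypothetical helper: every value passed in is a placeholder.
    """
    if not endpoint_url.startswith(("http://", "https://")):
        raise ValueError("endpoint_url must include the scheme (http:// or https://)")
    return {
        "endpoint_url": endpoint_url,          # where the S3-compatible service listens
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "region_name": region,                 # assumption: any region string is accepted
    }


def make_client(**kwargs):
    """Create the client; boto3 is imported lazily so the helper above has no dependency."""
    import boto3
    return boto3.client("s3", **kwargs)


# Example usage (requires boto3 and a reachable endpoint, so it is left commented out):
# s3 = make_client(**s3_client_kwargs("http://localhost:8080", "ACCESS_KEY", "SECRET_KEY"))
# s3.upload_file("local.bin", "my-bucket", "datasets/local.bin")
```

The rest of an existing pipeline can stay as it is, because only the client construction changes.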

Secure by design

Reed-Solomon erasure coding for data resiliency
Policy-based access control
Versioning + object locking
SOC 2 Type II certified
Fully GDPR-compliant
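
To make the resiliency trade-off concrete, here is a small sketch of the arithmetic behind a Reed-Solomon layout with k data shards and m parity shards: any k of the k + m shards are enough to rebuild the data, so up to m shards can be lost, at a raw-capacity cost of (k + m) / k. The shard counts and function names are illustrative assumptions, not UltiHash defaults.

```python
from fractions import Fraction


def erasure_profile(data_shards, parity_shards):
    """Summarize a Reed-Solomon layout with k data and m parity shards."""
    if data_shards < 1 or parity_shards < 0:
        raise ValueError("need at least one data shard and a non-negative parity count")
    total = data_shards + parity_shards
    return {
        "total_shards": total,
        "tolerated_losses": parity_shards,              # any m shards may be lost
        "raw_per_usable": Fraction(total, data_shards),  # raw bytes per usable byte
    }


def raw_capacity_needed(usable_tib, data_shards, parity_shards):
    """Raw capacity (TiB) required to expose `usable_tib` of usable space."""
    profile = erasure_profile(data_shards, parity_shards)
    return float(usable_tib * profile["raw_per_usable"])
```

For example, a hypothetical 4 + 2 layout tolerates two lost shards and needs 1.5 bytes of raw capacity per usable byte.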

Kubernetes-native

Works with any Kubernetes engine and CSI driver for easy integration into your environment

Fully software-defined

No lock-in or proprietary hardware dependencies. Deploy on-prem or in a VPC.

Binary-level deduplication

Save storage on datasets that contain repeated byte sequences (e.g. in images, logs, weights)
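
The general idea of byte-level deduplication can be sketched in a few lines: split data into chunks, fingerprint each chunk, and store each distinct chunk only once. This is a simplified stand-in that uses fixed-size chunks and SHA-256; it is not UltiHash's actual algorithm, and the function name and chunk size are assumptions made for the example.

```python
import hashlib


def dedup_stats(blobs, chunk_size=4096):
    """Return (logical_bytes, stored_bytes) when identical chunks are stored once."""
    seen = set()          # fingerprints of chunks already stored
    logical = 0           # bytes written by clients
    stored = 0            # bytes that actually need to be kept
    for blob in blobs:
        for start in range(0, len(blob), chunk_size):
            chunk = blob[start:start + chunk_size]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                stored += len(chunk)
    return logical, stored


# Two files that share an identical 4 KiB header but differ in their bodies.
header = b"A" * 4096
logical, stored = dedup_stats([header + b"B" * 4096, header + b"C" * 4096])
```

In this toy case the shared header is stored once, so 16,384 logical bytes need only 12,288 stored bytes.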

Support from UltiHash engineers

Get support

Detailed documentation

Written by and for storage admins + DevOps engineers
See full documentation

Feature request board

Give feedback

Optimized for the data workloads that matter

Retrieval-augmented generation (RAG)
Serve unstructured documents and source content with high throughput.
In modern RAG pipelines, vector embeddings are served from vector databases, while the underlying unstructured data (PDFs, web pages, Markdown files, knowledge base articles) still resides in object storage.

UltiHash accelerates RAG systems by enabling:
Fast retrieval of full documents and pre-tokenized files
High concurrency for microservice architectures serving real-time LLM agents
Optional deduplication of redundant chunks to reduce overall storage footprint
Works with Zilliz, LanceDB, Chroma, or custom RAG pipelines
Ideal for hybrid cloud setups where data locality matters
Avoids egress costs by colocating with compute or caching layer
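
The split described above (embeddings in a vector database, source documents in object storage) can be sketched as a two-step lookup: the vector search returns object keys, and the documents are then fetched concurrently. `fetch_documents`, `fake_store`, and the keys are hypothetical stand-ins; in a real pipeline the `get_object` callable would wrap an S3 GET against the storage cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_documents(keys, get_object, max_workers=8):
    """Fetch many objects concurrently, preserving the order of `keys`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the input keys.
        return list(pool.map(get_object, keys))


# In-memory stand-in for the object store, keyed by object name.
fake_store = {"docs/a.md": b"alpha", "docs/b.pdf": b"beta"}

# Pretend these keys came back from a vector search, best match first.
hits = ["docs/b.pdf", "docs/a.md"]
documents = fetch_documents(hits, fake_store.__getitem__)
```

Keeping the result order aligned with the search ranking means the caller can pass the documents straight into prompt construction.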

Model training with media data pipelines
Handle high-volume media data pipelines, from ingestion to model training.
Object storage becomes a bottleneck when media teams process large volumes of files for training or production tasks (e.g. object detection, speech-to-text).

UltiHash improves:
Throughput when ingesting and retrieving multi-frame image sequences or video segments
Versioning and deduplication of near-identical files (e.g. annotated vs. raw media)
Concurrent access by labeling tools, preprocessors, and ML pipelines
Validated with SuperAnnotate and MLOps image pipelines
Optimized for flash and high IOPS workloads (batch reads, partial range requests)
Ideal for edge or VPC clusters processing data

Analytics + data lakehouse architectures
Boost performance for read-heavy workloads across modern query engines.
Data lakehouses often rely on object storage for Parquet, ORC, or Iceberg-managed datasets. The S3 API is the standard interface, but the object store behind it is often a throughput bottleneck.

UltiHash improves:
Read throughput for scan-heavy workloads (e.g. SQL queries over columnar files)
Time-to-insight for data scientists using Spark, Presto/Trino, or DuckDB
Cost efficiency through deduplication of similar datasets (e.g. intermediate aggregates)
Compatible with Apache Iceberg, Delta Lake, etc.
Ideal for hybrid on-prem/cloud environments running analytics engines
Zero vendor lock-in with support for S3 API commands

Model training: checkpoints, weights, datasets
Speed up training runs with faster data access and lower infra overhead.
Training or fine-tuning large models often involves reading massive amounts of input data (images, telemetry, tabular logs) and writing intermediate outputs.

UltiHash delivers:
High-speed reads for datasets loaded via PyTorch, TensorFlow...
Deduplicated storage of, for example, model weights, checkpoints, and optimizer states in structured or binary formats
Consistent performance when multiple training jobs hit storage simultaneously
Works well with DDP-style training setups (multi-GPU, multi-node)
Supports range-based reads and resumable downloads
Efficient on both centralized and distributed infrastructure
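
Range-based reads and resumable downloads both come down to the standard HTTP `Range` header, which S3-style GET requests accept. The sketch below only builds the header values; the helper names are invented for this example, and how a given client passes the header (for instance the `Range` argument of boto3's `get_object`) should be checked against that client's documentation.

```python
def range_header(start, end=None):
    """Format an HTTP Range header value; `end` is inclusive, None means 'to end of object'."""
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    return f"bytes={start}-" if end is None else f"bytes={start}-{end}"


def split_ranges(total_size, part_size):
    """Yield inclusive (start, end) byte ranges that cover `total_size` bytes."""
    for start in range(0, total_size, part_size):
        yield start, min(start + part_size, total_size) - 1


def resume_header(bytes_already_downloaded):
    """Header for continuing an interrupted download from the first missing byte."""
    return range_header(bytes_already_downloaded)
```

A data loader could issue one request per range from `split_ranges` in parallel, while an interrupted download would restart from `resume_header(len(partial_file))`.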

UltiHash runs everywhere your workloads do

Virtual private cloud

Run UltiHash in your VPC with full control over performance.

On-premises bare-metal clusters

Install UltiHash directly on your own hardware for maximum performance, data locality, and zero external dependencies.

On any CSI-compatible storage backend

Deploy UltiHash on top of any CSI-backed volume in Kubernetes and integrate it with your existing stack.
We help you get started.
Talk to an UltiHash engineer

Start free, scale when you're ready for production

10 TiB free license

for testing + POCs

Subscription plans

flexible terms of 1, 12, 24, or 36 months

Pay-as-you-go

for elastic workloads

Test license: up to 10 TiB
Start for free

Premium deployment: 10 - 999 TiB
Get access

Enterprise: 1 PiB +
Talk to us

Two ways to pay. Switch at any time.

Pay-as-you-go
Our most flexible pricing option, based solely on the number of gigabytes actually stored in your UltiHash storage cluster.
~$10.40 / TiB / month, charged per GiB per hour
RECOMMENDED FOR
Unpredictable storage needs
Prototyping & early deployments
Seamless up/down scaling on demand

Subscription
Our lowest price per gigabyte with commitment discounts, available with monthly or annual billing - ideal for large-scale use cases.
$7.20 / TiB / month (when billed annually)
RECOMMENDED FOR
Stable, predictable storage demand
Lower cost for long-term projects
Enterprise AI production use
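
A back-of-the-envelope comparison of the two listed rates can be sketched as follows. The prices are the ones quoted above (the pay-as-you-go figure is marked approximate); the 730-hour month used to convert the monthly rate into a per-GiB-hour rate is an assumption, since the exact metering rule is not stated here.

```python
PAYG_USD_PER_TIB_MONTH = 10.40         # listed as approximate ("~$10.40")
SUBSCRIPTION_USD_PER_TIB_MONTH = 7.20  # listed price when billed annually
GIB_PER_TIB = 1024
HOURS_PER_MONTH = 730                  # assumption: average month length


def payg_cost(gib_hours):
    """Estimated pay-as-you-go cost in USD for a number of GiB-hours stored."""
    rate_per_gib_hour = PAYG_USD_PER_TIB_MONTH / GIB_PER_TIB / HOURS_PER_MONTH
    return gib_hours * rate_per_gib_hour


def subscription_cost(tib, months=1):
    """Estimated subscription cost in USD for `tib` TiB over `months` months."""
    return tib * SUBSCRIPTION_USD_PER_TIB_MONTH * months


# 50 TiB held for one full (730-hour) month under each model.
payg_month = payg_cost(50 * GIB_PER_TIB * HOURS_PER_MONTH)
subscription_month = subscription_cost(50)
```

Under these assumptions, 50 TiB held for a full month comes to about $520 pay-as-you-go versus $360 on an annually billed subscription; pay-as-you-go becomes cheaper only when the data is stored for a fraction of the month.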

Now also available on AWS

UltiHash Self-Hosted is now easier to deploy and manage via your AWS account, with flexible billing options including pay-as-you-go and long-term contracts (12, 24, or 36 months) that come with built-in discounts. All billing flows through AWS, so there is no extra invoicing and no additional vendor approval. You can deploy on AWS, on-premises, or in any VPC, and get up to 1 GB/s GET throughput per machine with full control over your data.

Join a linked ecosystem of leading infrastructure providers and AI platforms

UltiHash integrates with a growing ecosystem of infrastructure providers and AI platforms. Whether you’re running workloads on AWS, deploying in Kubernetes, or building pipelines with tools like PyTorch, Spark, or Iceberg, the S3-compatible API and high-throughput architecture let you plug UltiHash into your existing stack without reworking your setup.
Want to partner with us?

Built for developers - not sales demos

Detailed documentation for storage admins + DevOps engineers