NodeMind compresses float32 RAG indexes 48× online (32× offline, up to 100× on image embeddings) using our proprietary binary codec, and searches them 75× faster: no GPU, no vector database, no cloud bills.
1 GB of text documents becomes a ~10 GB float32 RAG index: that is the real cost of vector search at scale. NodeMind's binary codec crushes that 10 GB down to just 210 MB online (32× smaller offline). Same documents. Same BGE-M3 embeddings. Dramatically different storage.
Why does RAG expand 10×? Chunking 1 KB of text produces a 1024-dim float32 vector = 4,096 bytes (a 4× expansion on the raw text), and HNSW graph index structures add another 2–3× on top. Result: every 1 GB of documents becomes ~10 GB in a vector database, a figure confirmed by Elasticsearch, Pure Storage, and Milvus benchmarks. NodeMind then compresses that 10 GB RAG index a further 48× on text (up to 100× on image embeddings) using our patent-pending binary codec.
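The arithmetic behind that ~10× figure, as a quick sanity check (assuming 1 KB text chunks and a 2.5× HNSW overhead, the midpoint of the 2–3× range above):

```python
# Back-of-envelope check of the ~10x RAG expansion described above.
# Assumptions: 1 KB of text per chunk, 1024-dim float32 embeddings,
# and a 2.5x HNSW graph overhead (midpoint of the 2-3x range).
chunk_bytes = 1024               # 1 KB of raw text per chunk
embedding_bytes = 1024 * 4       # 1024 dims x 4 bytes (float32) = 4 KB
hnsw_overhead = 2.5              # graph structures on top of the raw vectors

expansion = embedding_bytes * hnsw_overhead / chunk_bytes
print(f"{expansion:.0f}x per chunk")                   # -> 10x
print(f"1 GB of text -> ~{expansion:.0f} GB of index")
```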
**Storage Comparison**

| Original Documents | RAG Index (float32 · ~10× expansion) | NodeMind Index (binary · 48× smaller online) | vs RAG | RAG Storage/mo (S3 Standard) | NodeMind Storage/mo (S3 Standard) | Managed Vector DB/mo (Pinecone pricing) | Annual Savings |
|---|---|---|---|---|---|---|---|
| 1 GB documents (~250K chunks) | 10 GB | 210 MB | 48× | $0.23 | $0.0048 | $25.00 | $300 |
| 10 GB documents (~2.5M chunks) | 100 GB | 2.1 GB | 48× | $2.30 | $0.048 | $250.00 | $3,000 |
| 100 GB documents (~25M chunks) | 1 TB | 21 GB | 48× | $23.00 | $0.48 | $2,500 | $30,000 |
| 1 TB documents (~250M chunks) | 10 TB | 210 GB | 48× | $230 | $4.80 | $25,000 | $300,000 |

**Search Performance** (same 1024-dim BGE-M3 embeddings on both sides)

| | RAG (float32) | NodeMind (binary) | vs RAG |
|---|---|---|---|
| Search method | Cosine similarity on float32 — O(N·D) multiply-accumulate | Hamming distance on 1024-bit integers — POPCNT only | 75× faster |
| GPU required | Yes — needed for fast cosine at scale | No — pure CPU, any machine | |
| RAM for 250M chunks | ~1 TB | ~10 GB | |
| Offline / portable | No — requires a live vector DB connection | Yes — download the zip, run anywhere, no cloud needed | |
Codec: NodeMind's compression is not standard binary quantization (which gives 32× at ~5% quality loss). Our patent-pending algorithm applies a spectral transform before binarization, achieving 48× online compression on text (32× on the offline downloadable bundle) and up to 100× on image embeddings, all with higher recall than vanilla quantization. No formula is disclosed. Costs use S3 Standard at $0.023/GB/mo and Pinecone managed vector DB at $2.50/GB/mo.
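A quick check of the cost columns from those stated rates (a minimal sketch; it assumes annual savings are the managed vector DB bill minus NodeMind's S3 bill, which the table does not define explicitly):

```python
# Reproduce the cost columns above from the stated rates.
S3_RATE = 0.023        # $/GB/mo, S3 Standard
PINECONE_RATE = 2.50   # $/GB/mo, managed vector DB

def costs(rag_gb: float, nodemind_gb: float):
    rag_s3 = rag_gb * S3_RATE            # RAG index stored on S3
    nm_s3 = nodemind_gb * S3_RATE        # NodeMind index stored on S3
    pinecone = rag_gb * PINECONE_RATE    # RAG index in a managed vector DB
    annual = (pinecone - nm_s3) * 12     # assumed savings definition
    return round(rag_s3, 4), round(nm_s3, 4), round(pinecone, 2), round(annual)

print(costs(10, 0.21))   # 1 GB of documents -> (0.23, 0.0048, 25.0, 300)
```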
Note on scale and overhead. Compression ratios scale with corpus size. On small datasets (< 10,000 chunks) compression measures closer to 31× online due to the fixed structural overhead of the 64 MIH sub-tables; at production scale (> 100,000 chunks) this overhead is amortised, recovering the full 48×. Likewise the 75× search speedup is observable above ~100,000 chunks — small documents hit the 1 ms latency floor on both indexes. The 32× offline figure refers to the portable index file; if the raw corpus text is optionally bundled in the same zip, the total bundle footprint is roughly 5× smaller than standard RAG. Image / audio / video ratios (up to 100×) are projections — not yet measured in production.
The NodeMind binary codec applies to any embedding — text today, with audio, image, and video compression coming next. The same algorithm, dramatically higher ratios for richer media.
* Text results (48× online / 32× offline) are measured on the live platform. Image / audio / video estimates are projected from the same algorithm applied to those modalities' float32 embeddings; the higher 100× image ratio reflects the larger native embedding dimensions of vision models.
Three stages: embedding, binary encoding with our proprietary codec, and Multi-Index Hashing search. No gradients. No GPU at search time. Pure integer arithmetic in the search path.
Each BGE-M3 float32 embedding (4,096 bytes) is transformed into a 1024-bit binary fingerprint (128 bytes) using our patent-pending algorithm. The codec is not standard quantization — it applies a spectral transform that preserves semantic neighborhood relationships far better than direct sign-binarization.
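For intuition, here is what that baseline looks like. NodeMind's spectral transform is not disclosed, so this sketch shows only vanilla sign-binarization, the 32× baseline the codec is compared against; the byte counts match the figures above:

```python
import numpy as np

def sign_binarize(embedding: np.ndarray) -> np.ndarray:
    """Vanilla sign-binarization: NOT NodeMind's spectral codec, just the
    standard baseline it improves on. Maps a 1024-dim float32 vector
    (4,096 bytes) to a 1024-bit fingerprint packed into 128 bytes, a 32x
    per-vector reduction."""
    bits = embedding > 0           # one bit per dimension: sign of each float
    return np.packbits(bits)       # 1024 bits -> 128 uint8 bytes

rng = np.random.default_rng(0)
vec = rng.standard_normal(1024).astype(np.float32)
fp = sign_binarize(vec)
print(vec.nbytes, "->", fp.nbytes, "bytes")   # 4096 -> 128
```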
The 1024-bit fingerprint is split into 64 sub-strings of 16 bits each. Each sub-string is stored in a separate hash table. At query time, exact matches per sub-table are merged — giving sub-linear Hamming nearest-neighbor search without any approximate structures.
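A minimal sketch of that lookup scheme (hypothetical code, not NodeMind's implementation): 64 hash tables keyed on 16-bit sub-strings, candidates gathered by exact sub-table matches and ranked by full Hamming distance.

```python
import numpy as np
from collections import defaultdict

M = 64  # sub-tables; 64 sub-strings x 16 bits = one 1024-bit fingerprint

class MIHIndex:
    """Toy Multi-Index Hashing over packed 128-byte fingerprints."""

    def __init__(self):
        self.tables = [defaultdict(list) for _ in range(M)]
        self.fingerprints = []            # id -> packed uint8[128] fingerprint

    def add(self, fp: np.ndarray) -> int:
        fid = len(self.fingerprints)
        self.fingerprints.append(fp)
        # View the 128 packed bytes as 64 16-bit keys, one per sub-table.
        for t, key in enumerate(fp.view(np.uint16)):
            self.tables[t][int(key)].append(fid)
        return fid

    def search(self, query: np.ndarray, k: int = 5) -> list[int]:
        # An exact match in any of the 64 sub-tables yields a candidate; by
        # the pigeonhole principle this finds every fingerprint within
        # Hamming distance 63 of the query. Candidates are then ranked by
        # full Hamming distance: XOR plus popcount, pure integer arithmetic.
        candidates = set()
        for t, key in enumerate(query.view(np.uint16)):
            candidates.update(self.tables[t].get(int(key), ()))
        def hamming(fid: int) -> int:
            return int(np.unpackbits(self.fingerprints[fid] ^ query).sum())
        return sorted(candidates, key=hamming)[:k]

# Usage: index 1,000 random fingerprints, then query with a near-duplicate.
rng = np.random.default_rng(0)
idx = MIHIndex()
fps = rng.integers(0, 256, size=(1000, 128), dtype=np.uint8)
for fp in fps:
    idx.add(fp)
noisy = fps[42].copy()
noisy[0] ^= 0b1            # flip one bit of fingerprint 42
print(idx.search(noisy))   # fingerprint 42 ranks first
```

Full MIH implementations also probe keys within a small Hamming radius per sub-table; the exact-match stage above is the one the description here relies on.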
NodeMind uses BGE-M3, the state-of-the-art multilingual embedding model with 1024 dimensions. Dense, sparse, and multi-vector representations are supported. The model is loaded once per worker — no repeated downloads.
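One way to reproduce that embedding step (a sketch using the public FlagEmbedding loader; NodeMind's worker code is not published, so only the model checkpoint name is taken from this document):

```python
# Sketch: load BGE-M3 once and encode chunks to 1024-dim dense vectors.
# "BAAI/bge-m3" is the standard public checkpoint, loaded once per worker.
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)

out = model.encode(
    ["NodeMind compresses RAG indexes 48x."],
    return_dense=True,          # 1024-dim dense vectors (fed to the codec)
    return_sparse=True,         # lexical weights (sparse representation)
    return_colbert_vecs=False,  # multi-vector output, unused in this sketch
)
print(out["dense_vecs"].shape)  # (1, 1024)
```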
After indexing, users download two zip files: the NodeMind binary index and a standard RAG float32 index. Both run completely offline using the included nodemind_local.py runner. No cloud subscription needed to query.
```
User uploads PDF
        │
        ▼
[ FastAPI — nodemind.space ]  ← nginx + SSL (Google Cloud VPS, 1 TB)
        │
        ▼  submit job
[ Community hardware: RTX 3080 + 128 GB RAM ]
  1. pdfplumber → chunks
  2. BGE-M3 → float32 embeddings (1024-dim)
  3. NodeMind binary codec → 1024-bit fingerprints (32× smaller per vector)
  4. MIH index: 64 sub-tables × 16-bit keys
  5. RAG index: float32 cosine (comparison baseline)
  6. Return nm_zip + rag_zip
        │
        ▼
[ VPS stores zips ]  ← auto-deleted after 24 hours
        │
        ▼
User downloads both — runs offline
```
NodeMind's core algorithms are covered by two Australian provisional patent applications filed in 2026 by Sai Kiran Bathula, an independent researcher in Coleambally, NSW.
No installation. No API key. Upload any PDF, TXT, or Markdown file at the live demo and get a portable binary index back in under 2 minutes.
NodeMind is built by a solo independent researcher. Reach out for licensing, enterprise integration, or research collaboration.
saikiranbathula1@gmail.com