---
linkTitle: SVS-VAMANA vector search
title: SVS-VAMANA Vector Search
aliases:
- /integrate/redisvl/user_guide/09_svs_vamana
weight: 09
---


In this notebook, we explore SVS-VAMANA (Scalable Vector Search with the Vamana graph algorithm), a graph-based vector search algorithm designed to work with vector compression to reduce memory usage. It pairs the Vamana graph with two compression techniques, LVQ and LeanVec, and is optimized for Intel hardware.

**How it works**

Vamana builds a single-layer proximity graph and prunes edges during construction based on tunable parameters, similar to HNSW but with a simpler structure. The compression methods apply per-vector normalization and scalar quantization, learning parameters directly from the data to enable fast, on-the-fly distance computations with a SIMD-optimized layout.
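
To make "per-vector scalar quantization" concrete, here is a deliberately simplified sketch in plain Python. It is not the LVQ implementation used by SVS: it skips the dataset-level steps (such as learning parameters from the data) and only shows how one vector can be mapped to small integer codes using its own value range. The function names are made up for this illustration.

```python
from typing import List, Tuple


def quantize(vector: List[float], bits: int = 4) -> Tuple[List[int], float, float]:
    """Scalar-quantize one vector using its own min/max range (illustrative only)."""
    levels = (1 << bits) - 1                      # 15 steps for 4 bits
    lo, hi = min(vector), max(vector)
    step = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((x - lo) / step) for x in vector]
    return codes, lo, step


def dequantize(codes: List[int], lo: float, step: float) -> List[float]:
    """Reconstruct approximate values from the integer codes."""
    return [lo + c * step for c in codes]


vec = [0.12, -0.40, 0.33, 0.05, -0.18, 0.27]
codes, lo, step = quantize(vec, bits=4)
approx = dequantize(codes, lo, step)
max_error = max(abs(a - b) for a, b in zip(vec, approx))

print(codes)
print(f"max reconstruction error: {max_error:.4f} (step size {step:.4f})")
```

Each value is stored in 4 bits instead of 32, at the cost of a reconstruction error of at most half a quantization step.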


**SVS-VAMANA offers:**
- **Fast approximate nearest neighbor search** using graph-based algorithms
- **Vector compression** (LVQ, LeanVec) with up to 87.5% memory savings
- **Dimensionality reduction** (optional, with LeanVec)
- **Automatic performance optimization** through CompressionAdvisor

**Use SVS-VAMANA when:**
- You have large datasets and memory is expensive
- You deploy in the cloud with memory-based pricing
- 90-95% recall is acceptable
- You work with high-dimensional vectors (>1024 dims), which pair well with LeanVec compression


**Table of Contents**

1. [Prerequisites](#Prerequisites)
2. [Quick Start with CompressionAdvisor](#Quick-Start-with-CompressionAdvisor)
3. [Creating an SVS-VAMANA Index](#Creating-an-SVS-VAMANA-Index)
4. [Loading Sample Data](#Loading-Sample-Data)
5. [Performing Vector Searches](#Performing-Vector-Searches)
6. [Runtime Parameters for Performance Tuning](#Runtime-Parameters-for-Performance-Tuning)
7. [Understanding Compression Types](#Understanding-Compression-Types)
8. [Hybrid Queries with SVS-VAMANA](#Hybrid-Queries-with-SVS-VAMANA)
9. [Performance Monitoring](#Performance-Monitoring)
10. [Manual Configuration (Advanced)](#Manual-Configuration-(Advanced))
11. [Best Practices and Tips](#Best-Practices-and-Tips)
12. [Cleanup](#Cleanup)

---

## Prerequisites

Before running this notebook, ensure you have:
1. Installed `redisvl` and have that environment active for this notebook
2. A running Redis Stack instance with:
   - Redis >= 8.2.0
   - RediSearch >= 2.8.10

For example, you can run a Redis 8.2 server locally with Docker:

```bash
docker run -d -p 6379:6379 redis:8.2
```

**Note:** SVS-VAMANA only supports FLOAT16 and FLOAT32 datatypes.
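
The notebook passes vectors to Redis as raw byte buffers (via `array_to_buffer`), so the datatype determines how many bytes each vector occupies. The following sketch uses only the standard library to show the difference between the two supported datatypes. It assumes little-endian layout and a toy 4-dimensional vector, and it is not a replacement for `array_to_buffer`.

```python
import struct

values = [0.25, -1.5, 3.0, 0.125]   # toy 4-dimensional vector

# "<" = little-endian; "f" = 32-bit float, "e" = 16-bit (half-precision) float
buf32 = struct.pack(f"<{len(values)}f", *values)
buf16 = struct.pack(f"<{len(values)}e", *values)

print(len(buf32), len(buf16))        # 4 bytes vs 2 bytes per dimension

# These particular values are exactly representable in float16,
# so the round trip is lossless here; arbitrary embeddings lose some precision.
roundtrip = list(struct.unpack(f"<{len(values)}e", buf16))
print(roundtrip)
```

A buffer whose byte length does not equal `dims` times the bytes per element of the index datatype will not match the index definition, so it is worth checking both before loading data.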


```python
# Import necessary modules
import numpy as np
from redisvl.index import SearchIndex
from redisvl.query import VectorQuery
from redisvl.utils import CompressionAdvisor
from redisvl.redis.utils import array_to_buffer

# Set random seed for reproducible results
np.random.seed(42)
```


```python
# Redis connection
REDIS_URL = "redis://localhost:6379"
```

## Quick Start with CompressionAdvisor

The easiest way to get started with SVS-VAMANA is the `CompressionAdvisor` utility, which recommends a configuration based on your vector dimensions and performance priority.


```python
# Get recommended configuration for common embedding dimensions
dims = 1024  # Common embedding dimensions (works reliably with SVS-VAMANA)

config = CompressionAdvisor.recommend(
    dims=dims,
    priority="balanced"  # Options: "memory", "speed", "balanced"
)

print("Recommended Configuration:")
for key, value in config.items():
    print(f"  {key}: {value}")

# Estimate memory savings
savings = CompressionAdvisor.estimate_memory_savings(
    config["compression"],
    dims,
    config.get("reduce")
)
print(f"\nEstimated Memory Savings: {savings}%")
```

    Recommended Configuration:
      algorithm: svs-vamana
      datatype: float16
      graph_max_degree: 64
      construction_window_size: 300
      compression: LeanVec4x8
      reduce: 512
      search_window_size: 30
    
    Estimated Memory Savings: 81.2%


## Creating an SVS-VAMANA Index

Let's create an index using the recommended configuration. We'll use a simple schema with text content and vector embeddings.


```python
# Create index schema with recommended SVS-VAMANA configuration
schema = {
    "index": {
        "name": "svs_demo",
        "prefix": "doc",
    },
    "fields": [
        {"name": "content", "type": "text"},
        {"name": "category", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": dims,
                **config,  # Use the recommended configuration
                "distance_metric": "cosine"
            }
        }
    ]
}

# Create the index
index = SearchIndex.from_dict(schema, redis_url=REDIS_URL)
index.create(overwrite=True)

print(f"✅ Created SVS-VAMANA index: {index.name}")
print(f"   Algorithm: {config['algorithm']}")
print(f"   Compression: {config['compression']}")
print(f"   Dimensions: {dims}")
if 'reduce' in config:
    print(f"   Reduced to: {config['reduce']} dimensions")
```

    ✅ Created SVS-VAMANA index: svs_demo
       Algorithm: svs-vamana
       Compression: LeanVec4x8
       Dimensions: 1024
       Reduced to: 512 dimensions


## Loading Sample Data

Let's create some sample documents with embeddings to demonstrate SVS-VAMANA search capabilities.


```python
# Generate sample data
sample_documents = [
    {"content": "Machine learning algorithms for data analysis", "category": "technology"},
    {"content": "Natural language processing and text understanding", "category": "technology"},
    {"content": "Computer vision and image recognition systems", "category": "technology"},
    {"content": "Delicious pasta recipes from Italy", "category": "food"},
    {"content": "Traditional French cooking techniques", "category": "food"},
    {"content": "Healthy meal planning and nutrition", "category": "food"},
    {"content": "Travel guide to European destinations", "category": "travel"},
    {"content": "Adventure hiking in mountain regions", "category": "travel"},
    {"content": "Cultural experiences in Asian cities", "category": "travel"},
    {"content": "Financial planning for retirement", "category": "finance"},
]

# Generate random embeddings for demonstration
# In practice, you would use a real embedding model
data_to_load = []

# Use reduced dimensions if LeanVec compression is applied
vector_dims = config.get("reduce", dims)
print(f"Creating vectors with {vector_dims} dimensions (reduced from {dims} if applicable)")

for i, doc in enumerate(sample_documents):
    # Create a random vector with some category-based clustering
    base_vector = np.random.random(vector_dims).astype(np.float32)
    
    # Add some category-based similarity (optional, for demo purposes)
    category_offset = hash(doc["category"]) % 100 / 1000.0
    base_vector[0] += category_offset
    
    # Convert to the datatype specified in config
    if config["datatype"] == "float16":
        base_vector = base_vector.astype(np.float16)
    
    data_to_load.append({
        "content": doc["content"],
        "category": doc["category"],
        "embedding": array_to_buffer(base_vector, dtype=config["datatype"])
    })

# Load data into the index
index.load(data_to_load)
print(f"✅ Loaded {len(data_to_load)} documents into the index")

# Wait a moment for indexing to complete
import time
time.sleep(2)

# Verify the data was loaded
info = index.info()
print(f"   Index now contains {info.get('num_docs', 0)} documents")
```

    Creating vectors with 512 dimensions (reduced from 1024 if applicable)
    ✅ Loaded 10 documents into the index
       Index now contains 0 documents


## Performing Vector Searches

Now let's perform some vector similarity searches using our SVS-VAMANA index.


```python
# Create a query vector (in practice, this would be an embedding of your query text)
# Important: Query vector must match the index datatype and dimensions
vector_dims = config.get("reduce", dims)
if config["datatype"] == "float16":
    query_vector = np.random.random(vector_dims).astype(np.float16)
else:
    query_vector = np.random.random(vector_dims).astype(np.float32)

# Perform a vector similarity search
query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=5
)

results = index.query(query)

print("🔍 Vector Search Results:")
print("=" * 50)
for i, result in enumerate(results, 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. [{result['category']}] {result['content']}")
    print(f"   Distance: {distance:.4f}" if isinstance(distance, (int, float)) else f"   Distance: {distance}")
    print()
```

    🔍 Vector Search Results:
    ==================================================


## Runtime Parameters for Performance Tuning

SVS-VAMANA supports runtime parameters that can be adjusted at query time without rebuilding the index. These parameters allow you to fine-tune the trade-off between search speed and accuracy.

**Available Runtime Parameters:**

- **`search_window_size`**: Controls the size of the search window during KNN search (higher = better recall, slower search)
- **`epsilon`**: Approximation factor for range queries (default: 0.01)
- **`use_search_history`**: Whether to use search buffer (OFF/ON/AUTO, default: AUTO)
- **`search_buffer_capacity`**: Tuning parameter for 2-level compression (default: search_window_size)

Let's see how these parameters affect search performance:


```python
# Example 1: Basic query with default parameters
basic_query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=5
)

print("🔍 Basic Query (default parameters):")
results = index.query(basic_query)
print(f"Found {len(results)} results\n")

# Example 2: Query with tuned runtime parameters for higher recall
tuned_query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=5,
    search_window_size=40,      # Larger window for better recall
    use_search_history='ON',    # Use search history
    search_buffer_capacity=50   # Larger buffer capacity
)

print("🎯 Tuned Query (higher recall parameters):")
results = index.query(tuned_query)
print(f"Found {len(results)} results")
print("\nNote: Higher search_window_size improves recall but may increase latency")
```
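
To judge whether a larger `search_window_size` is worth the extra latency, it helps to measure recall against exact results rather than just counting hits. The sketch below is a generic recall@k helper. It assumes you can obtain a ground-truth ranking some other way (for example from a brute-force scan), and the document IDs are placeholders.

```python
from typing import Hashable, Sequence


def recall_at_k(approx_ids: Sequence[Hashable], exact_ids: Sequence[Hashable], k: int) -> float:
    """Fraction of the exact top-k neighbours that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


# Placeholder IDs: `exact` is the ground truth, `approx` is what an ANN query returned
exact = ["doc:1", "doc:2", "doc:3", "doc:4", "doc:5"]
approx = ["doc:1", "doc:3", "doc:2", "doc:9", "doc:5"]
recall = recall_at_k(approx, exact, k=5)
print(f"recall@5 = {recall:.2f}")
```

Running the same comparison for several `search_window_size` values shows where recall stops improving for your data.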

### Runtime Parameters with Range Queries

Runtime parameters are also useful for range queries, where you want to find all vectors within a certain distance threshold:


```python
from redisvl.query import VectorRangeQuery

# Range query with runtime parameters
range_query = VectorRangeQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    distance_threshold=0.3,
    epsilon=0.05,               # Approximation factor
    search_window_size=30,      # Search window size
    use_search_history='AUTO'   # Automatic history management
)

results = index.query(range_query)
print(f"🎯 Range Query Results: Found {len(results)} vectors within distance threshold 0.3")
for i, result in enumerate(results[:3], 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. {result['content'][:50]}... (distance: {distance})")
```
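
For intuition about what a range query returns, here is an exact (brute-force) version in plain Python. It assumes cosine distance is defined as one minus cosine similarity and uses toy 2-dimensional vectors. The approximate query above trades some of this exactness for speed, with `epsilon` controlling how much.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity: 0 for the same direction, 2 for opposite directions."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / norm


def exact_range_search(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    threshold: float,
) -> List[Tuple[str, float]]:
    """Return (key, distance) pairs within the threshold, nearest first."""
    scored = [(key, cosine_distance(query, vec)) for key, vec in vectors.items()]
    return sorted((s for s in scored if s[1] <= threshold), key=lambda s: s[1])


toy_vectors = {
    "a": [1.0, 0.0],    # same direction as the query
    "b": [1.0, 1.0],    # 45 degrees away
    "c": [0.0, 1.0],    # orthogonal
    "d": [-1.0, 0.0],   # opposite
}
hits = exact_range_search([1.0, 0.0], toy_vectors, threshold=0.3)
for key, dist in hits:
    print(f"{key}: {dist:.4f}")
```

Only `a` and `b` fall within the 0.3 threshold; `c` and `d` are at distances 1 and 2.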

## Understanding Compression Types

SVS-VAMANA supports different compression algorithms that trade off between memory usage and search quality. Let's explore the available options.


```python
# Compare different compression priorities
print("Compression Recommendations for Different Priorities:")
print("=" * 60)

priorities = ["memory", "speed", "balanced"]
for priority in priorities:
    config = CompressionAdvisor.recommend(dims=dims, priority=priority)
    savings = CompressionAdvisor.estimate_memory_savings(
        config["compression"],
        dims,
        config.get("reduce")
    )
    
    print(f"\n{priority.upper()} Priority:")
    print(f"  Compression: {config['compression']}")
    print(f"  Datatype: {config['datatype']}")
    if "reduce" in config:
        print(f"  Dimensionality reduction: {dims} → {config['reduce']}")
    print(f"  Search window size: {config['search_window_size']}")
    print(f"  Memory savings: {savings}%")
```

    Compression Recommendations for Different Priorities:
    ============================================================
    
    MEMORY Priority:
      Compression: LeanVec4x8
      Datatype: float16
      Dimensionality reduction: 1024 → 512
      Search window size: 20
      Memory savings: 81.2%
    
    SPEED Priority:
      Compression: LeanVec4x8
      Datatype: float16
      Dimensionality reduction: 1024 → 256
      Search window size: 40
      Memory savings: 90.6%
    
    BALANCED Priority:
      Compression: LeanVec4x8
      Datatype: float16
      Dimensionality reduction: 1024 → 512
      Search window size: 30
      Memory savings: 81.2%


## Compression Types Explained

SVS-VAMANA offers several compression algorithms:

### LVQ (Locally-adaptive Vector Quantization)
- **LVQ4**: 4 bits per dimension (87.5% memory savings)
- **LVQ4x4**: 8 bits per dimension (75% memory savings)
- **LVQ4x8**: 12 bits per dimension (62.5% memory savings)
- **LVQ8**: 8 bits per dimension (75% memory savings)

### LeanVec (Compression + Dimensionality Reduction)
- **LeanVec4x8**: 12 bits per dimension + dimensionality reduction
- **LeanVec8x8**: 16 bits per dimension + dimensionality reduction

The CompressionAdvisor automatically chooses the best compression type based on your vector dimensions and priority.
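
The savings percentages quoted in this notebook are consistent with simple arithmetic: compare the bits stored per vector with a 32-bit float baseline, and for LeanVec count only the reduced dimensions. The sketch below reproduces that arithmetic. It is an approximation that ignores per-vector metadata and graph overhead, and it is not necessarily how `CompressionAdvisor.estimate_memory_savings` is implemented.

```python
from typing import Optional

# Bits stored per (possibly reduced) dimension, taken from the lists above
BITS_PER_DIM = {
    "LVQ4": 4, "LVQ4x4": 8, "LVQ4x8": 12, "LVQ8": 8,
    "LeanVec4x8": 12, "LeanVec8x8": 16,
}


def approx_savings_pct(
    compression: str,
    dims: int,
    reduce: Optional[int] = None,
    baseline_bits: int = 32,
) -> float:
    """Approximate memory savings relative to uncompressed float32 vectors."""
    stored_dims = reduce if reduce is not None else dims
    compressed_bits = BITS_PER_DIM[compression] * stored_dims
    return 100.0 * (1.0 - compressed_bits / (baseline_bits * dims))


for name, d, r in [
    ("LVQ4", 768, None),
    ("LVQ4x4", 768, None),
    ("LeanVec4x8", 1024, 512),
    ("LeanVec4x8", 1024, 256),
]:
    print(f"{name:<11} dims={d:<5} reduce={r!s:<5} -> {approx_savings_pct(name, d, r):.2f}%")
```

For example, LeanVec4x8 reducing 1024 dimensions to 512 stores 12 × 512 bits instead of 32 × 1024, which gives 81.25% savings and matches the 81.2% reported by the advisor.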


```python
# Demonstrate compression savings for different vector dimensions
test_dimensions = [384, 768, 1024, 1536, 3072]

print("Memory Savings by Vector Dimension:")
print("=" * 50)
print(f"{'Dims':<6} {'Compression':<12} {'Savings':<8} {'Strategy'}")
print("-" * 50)

for test_dims in test_dimensions:
    # Separate names keep this loop from overwriting `dims` and `config` used by other cells
    rec = CompressionAdvisor.recommend(dims=test_dims, priority="balanced")
    savings = CompressionAdvisor.estimate_memory_savings(
        rec["compression"],
        test_dims,
        rec.get("reduce")
    )

    strategy = "LeanVec" if test_dims >= 1024 else "LVQ"
    print(f"{test_dims:<6} {rec['compression']:<12} {savings:>6.1f}% {strategy}")
```

    Memory Savings by Vector Dimension:
    ==================================================
    Dims   Compression  Savings  Strategy
    --------------------------------------------------
    384    LVQ4x4         75.0% LVQ
    768    LVQ4x4         75.0% LVQ
    1024   LeanVec4x8     81.2% LeanVec
    1536   LeanVec4x8     81.2% LeanVec
    3072   LeanVec4x8     81.2% LeanVec


## Hybrid Queries with SVS-VAMANA

SVS-VAMANA can be combined with other Redis search capabilities for powerful hybrid queries that filter by metadata while performing vector similarity search.


```python
# Perform a hybrid search: vector similarity + category filter
hybrid_query = VectorQuery(
    vector=query_vector.tolist(),
    vector_field_name="embedding",
    return_fields=["content", "category"],
    num_results=3
)

# Add a filter to only search within "technology" category
hybrid_query.set_filter("@category:{technology}")

filtered_results = index.query(hybrid_query)

print("🔍 Hybrid Search Results (Technology category only):")
print("=" * 55)
for i, result in enumerate(filtered_results, 1):
    distance = result.get('vector_distance', 'N/A')
    print(f"{i}. [{result['category']}] {result['content']}")
    print(f"   Distance: {distance:.4f}" if isinstance(distance, (int, float)) else f"   Distance: {distance}")
    print()
```

    🔍 Hybrid Search Results (Technology category only):
    =======================================================


## Performance Monitoring

Let's examine the index statistics to understand the performance characteristics of our SVS-VAMANA index.


```python
# Get detailed index information
info = index.info()

print("📊 Index Statistics:")
print("=" * 30)
print(f"Documents: {info.get('num_docs', 0)}")

# Handle vector_index_sz_mb which might be a string
vector_size = info.get('vector_index_sz_mb', 0)
if isinstance(vector_size, str):
    try:
        vector_size = float(vector_size)
    except ValueError:
        vector_size = 0.0
print(f"Vector index size: {vector_size:.2f} MB")

# Handle total_indexing_time which might also be a string
indexing_time = info.get('total_indexing_time', 0)
if isinstance(indexing_time, str):
    try:
        indexing_time = float(indexing_time)
    except ValueError:
        indexing_time = 0.0
print(f"Total indexing time: {indexing_time:.2f} seconds")

# Calculate memory efficiency
if info.get('num_docs', 0) > 0 and vector_size > 0:
    mb_per_doc = vector_size / info.get('num_docs', 1)
    print(f"Memory per document: {mb_per_doc:.4f} MB")
    
    # Estimate for larger datasets
    for scale in [1000, 10000, 100000]:
        estimated_mb = mb_per_doc * scale
        print(f"Estimated size for {scale:,} docs: {estimated_mb:.1f} MB")
else:
    print("Memory efficiency calculation requires documents and vector index size > 0")
```

    📊 Index Statistics:
    ==============================
    Documents: 0
    Vector index size: 0.00 MB
    Total indexing time: 1.58 seconds
    Memory efficiency calculation requires documents and vector index size > 0


## Manual Configuration (Advanced)

If you want full control over SVS-VAMANA parameters, you can configure the algorithm manually instead of using `CompressionAdvisor`.


```python
# Example of manual SVS-VAMANA configuration
manual_schema = {
    "index": {
        "name": "svs_manual",
        "prefix": "manual",
    },
    "fields": [
        {"name": "content", "type": "text"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "dims": 768,
                "algorithm": "svs-vamana",
                "datatype": "float32",
                "distance_metric": "cosine",
                
                # Graph construction parameters
                "graph_max_degree": 64,           # Higher = better recall, more memory
                "construction_window_size": 300,  # Higher = better quality, slower build
                
                # Search parameters
                "search_window_size": 40,         # Higher = better recall, slower search
                
                # Compression settings
                "compression": "LVQ4x4",          # Choose compression type
                "training_threshold": 10000,      # Min vectors before compression training
            }
        }
    ]
}

print("Manual SVS-VAMANA Configuration:")
print("=" * 40)
vector_attrs = manual_schema["fields"][1]["attrs"]
for key, value in vector_attrs.items():
    if key != "dims":  # Skip dims as it's obvious
        print(f"  {key}: {value}")

# Calculate memory savings for this configuration
manual_savings = CompressionAdvisor.estimate_memory_savings(
    "LVQ4x4", 768, None
)
print(f"\nEstimated memory savings: {manual_savings}%")
```

    Manual SVS-VAMANA Configuration:
    ========================================
      algorithm: svs-vamana
      datatype: float32
      distance_metric: cosine
      graph_max_degree: 64
      construction_window_size: 300
      search_window_size: 40
      compression: LVQ4x4
      training_threshold: 10000
    
    Estimated memory savings: 75.0%
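
When writing attributes by hand, a small pre-flight check can catch typos before the index is created. The function below is a hypothetical helper, not part of `redisvl`. It encodes only what this notebook states (supported datatypes, the listed compression types, and dimensionality reduction being a LeanVec feature), plus the assumption that `reduce` should be smaller than `dims`.

```python
SUPPORTED_DATATYPES = {"float16", "float32"}
KNOWN_COMPRESSION = {"LVQ4", "LVQ4x4", "LVQ4x8", "LVQ8", "LeanVec4x8", "LeanVec8x8"}


def check_svs_attrs(attrs: dict) -> list:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if attrs.get("algorithm") != "svs-vamana":
        problems.append("algorithm should be 'svs-vamana'")
    if str(attrs.get("datatype", "")).lower() not in SUPPORTED_DATATYPES:
        problems.append("datatype must be float16 or float32")
    compression = attrs.get("compression")
    if compression is not None and compression not in KNOWN_COMPRESSION:
        problems.append(f"unknown compression type: {compression}")
    reduce = attrs.get("reduce")
    if reduce is not None:
        if not str(compression).startswith("LeanVec"):
            problems.append("reduce only applies to LeanVec compression")
        elif not 0 < reduce < attrs.get("dims", 0):
            # Assumption: the reduced dimension must be smaller than dims
            problems.append("reduce should be positive and smaller than dims")
    return problems


good = {"dims": 768, "algorithm": "svs-vamana", "datatype": "float32", "compression": "LVQ4x4"}
bad = {**good, "datatype": "float64", "reduce": 256}

print(check_svs_attrs(good))
print(check_svs_attrs(bad))
```

In the notebook, you could run `check_svs_attrs(manual_schema["fields"][1]["attrs"])` before calling `SearchIndex.from_dict`.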


## Best Practices and Tips

### When to Use SVS-VAMANA
- **Large datasets** (>10K vectors) where memory efficiency matters
- **High-dimensional vectors** (>512 dimensions) that benefit from compression
- **Applications** that can tolerate slight recall trade-offs for speed and memory savings

### Parameter Tuning Guidelines

**Index-time parameters** (set during index creation):
- **Start with CompressionAdvisor** recommendations for compression and datatype
- **Use LeanVec** for high-dimensional vectors (≥1024 dims)
- **Use LVQ** for lower-dimensional vectors (<1024 dims)
- **graph_max_degree**: Higher values improve recall but increase memory usage
- **construction_window_size**: Higher values improve index quality but slow down build time

**Runtime parameters** (adjustable at query time without rebuilding index):
- **search_window_size**: Start with 20, increase to 40-100 for higher recall
- **epsilon**: Use 0.01-0.05 for range queries (higher = faster but less accurate)
- **use_search_history**: Use 'AUTO' (default) or 'ON' for better recall
- **search_buffer_capacity**: Usually set equal to search_window_size

### Performance Considerations
- **Index build time** increases with higher construction_window_size
- **Search latency** increases with higher search_window_size (tunable at query time!)
- **Memory usage** decreases with more aggressive compression
- **Recall quality** may decrease with more aggressive compression or lower search_window_size
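
Because the latency impact of these parameters depends on your data and hardware, it is worth measuring rather than guessing. The following is a minimal timing sketch using only the standard library. The helper name is made up, and the workload is a stand-in so the example runs without Redis.

```python
import statistics
import time
from typing import Callable, List


def median_latency_ms(run_query: Callable[[], object], repeats: int = 20, warmup: int = 3) -> float:
    """Call `run_query` repeatedly and return the median wall-clock latency in milliseconds."""
    for _ in range(warmup):            # warm-up calls are not timed
        run_query()
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


# Stand-in workload; in the notebook you might pass `lambda: index.query(tuned_query)`
latency = median_latency_ms(lambda: sum(range(10_000)), repeats=10)
print(f"median latency: {latency:.3f} ms")
```

Timing the same query at several `search_window_size` values, alongside a recall measurement, gives a concrete picture of the speed and accuracy trade-off.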

## Cleanup

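Clean up the index created in this demo. The manual configuration above was only printed, never created, so `svs_demo` is the only index to delete.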


```python
# Clean up demo indices
try:
    index.delete()
    print("Cleaned up svs_demo index")
except Exception:  # avoid a bare except, which would also swallow KeyboardInterrupt
    print("- svs_demo index was already deleted or doesn't exist")
```

    Cleaned up svs_demo index