Files
herodb/examples/README.md
2025-08-23 05:04:37 +02:00

171 lines
5.4 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# HeroDB Tantivy Search Examples
This directory contains examples demonstrating HeroDB's full-text search capabilities powered by Tantivy.
## Tantivy Search Demo (Bash Script)
### Overview
The `tantivy_search_demo.sh` script provides a comprehensive demonstration of HeroDB's search functionality using Redis commands. It showcases various search scenarios including basic text search, filtering, sorting, geographic queries, and more.
### Prerequisites
1. **HeroDB Server**: The server must be running on port 6381
2. **Redis CLI**: The `redis-cli` tool must be installed and available in your PATH
### Running the Demo
#### Step 1: Start HeroDB Server
```bash
# From the project root directory
cargo run -- --port 6381
```
#### Step 2: Run the Demo (in a new terminal)
```bash
# From the project root directory
./examples/tantivy_search_demo.sh
```
### What the Demo Covers
The script demonstrates 15 different search scenarios:
1. **Index Creation** - Creating a search index with various field types
2. **Data Insertion** - Adding sample products to the index
3. **Basic Text Search** - Simple keyword searches
4. **Filtered Search** - Combining text search with category filters
5. **Numeric Range Search** - Finding products within price ranges
6. **Sorting Results** - Ordering results by different fields
7. **Limited Results** - Pagination and result limiting
8. **Complex Queries** - Multi-field searches with sorting
9. **Geographic Search** - Location-based queries
10. **Index Information** - Getting statistics about the search index
11. **Search Comparison** - Tantivy vs simple pattern matching
12. **Fuzzy Search** - Typo tolerance and approximate matching
13. **Phrase Search** - Exact phrase matching
14. **Boolean Queries** - AND, OR, NOT operators
15. **Cleanup** - Removing test data
### Sample Data
The demo uses a product catalog with the following fields:
- **title** (TEXT) - Product name with higher search weight
- **description** (TEXT) - Detailed product description
- **category** (TAG) - Comma-separated categories
- **price** (NUMERIC) - Product price for range queries
- **rating** (NUMERIC) - Customer rating for sorting
- **location** (GEO) - Geographic coordinates for location searches
### Key Redis Commands Demonstrated
#### Index Management
```bash
# Create search index
FT.CREATE product_catalog ON HASH PREFIX 1 product: SCHEMA title TEXT WEIGHT 2.0 SORTABLE description TEXT category TAG SEPARATOR , price NUMERIC SORTABLE rating NUMERIC SORTABLE location GEO
# Get index information
FT.INFO product_catalog
# Drop index
FT.DROPINDEX product_catalog
```
#### Search Queries
```bash
# Basic text search
FT.SEARCH product_catalog wireless
# Filtered search
FT.SEARCH product_catalog 'organic @category:{food}'
# Numeric range
FT.SEARCH product_catalog '@price:[50 150]'
# Sorted results
FT.SEARCH product_catalog '@category:{electronics}' SORTBY price ASC
# Geographic search
FT.SEARCH product_catalog '@location:[37.7749 -122.4194 50 km]'
# Boolean queries
FT.SEARCH product_catalog 'wireless AND audio'
FT.SEARCH product_catalog 'coffee OR tea'
# Phrase search
FT.SEARCH product_catalog '"noise canceling"'
```
### Interactive Features
The demo script includes:
- **Colored output** for better readability
- **Pause between steps** to review results
- **Error handling** with clear error messages
- **Automatic cleanup** of test data
- **Progress indicators** showing what each step demonstrates
### Troubleshooting
#### HeroDB Not Running
```
✗ HeroDB is not running on port 6381
Please start HeroDB with: cargo run -- --port 6381
```
**Solution**: Start the HeroDB server in a separate terminal.
#### Redis CLI Not Found
```
redis-cli: command not found
```
**Solution**: Install Redis tools or use an alternative Redis client.
#### Connection Refused
```
Could not connect to Redis at localhost:6381: Connection refused
```
**Solution**: Ensure HeroDB is running and listening on the correct port.
### Manual Testing
You can also run individual commands manually:
```bash
# Connect to HeroDB
redis-cli -h localhost -p 6381
# Create a simple index
FT.CREATE myindex ON HASH SCHEMA title TEXT description TEXT
# Add a document
HSET doc:1 title "Hello World" description "This is a test document"
# Search
FT.SEARCH myindex hello
```
### Performance Notes
- **Indexing**: Documents are indexed in real-time as they're added
- **Search Speed**: Full-text search is much faster than pattern matching on large datasets
- **Memory Usage**: Tantivy indexes are memory-efficient and disk-backed
- **Scalability**: Supports millions of documents with sub-second search times
### Advanced Features
The demo showcases advanced Tantivy features:
- **Relevance Scoring** - Results ranked by relevance
- **Fuzzy Matching** - Handles typos and approximate matches
- **Field Weighting** - Title field has higher search weight
- **Multi-field Search** - Search across multiple fields simultaneously
- **Geographic Queries** - Distance-based location searches
- **Numeric Ranges** - Efficient range queries on numeric fields
- **Tag Filtering** - Fast categorical filtering
### Next Steps
After running the demo, explore:
1. **Custom Schemas** - Define your own field types and configurations
2. **Large Datasets** - Test with thousands or millions of documents
3. **Real Applications** - Integrate search into your applications
4. **Performance Tuning** - Optimize for your specific use case
For more information, see the [search documentation](../herodb/docs/search.md).