Embedding Models
Configure which embedding model RAGaaS uses to understand your content.
Overview
Embedding models convert text into vectors that capture semantic meaning, so search can match on meaning rather than on exact keywords.
Example Matches
When using embedding models, these texts would be considered similar:
- "How do I cancel my subscription?" ≈ "What's the process for ending my membership?"
- "Getting database connection timeout" ≈ "Database connection failed: timeout error"
- "What's the pricing for enterprise plan?" ≈ "How much does it cost for large companies?"
This semantic matching helps find relevant content even when the exact words don't match.
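To make "similar" concrete, here is a minimal sketch of cosine similarity, one common way to compare embedding vectors. The four-dimensional vectors are made-up stand-ins for real embeddings, and the choice of cosine similarity is an assumption for illustration, not a statement about how RAGaaS scores matches.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (close to 1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up vectors; a real model would return hundreds or thousands of dimensions.
cancel_subscription = [0.9, 0.1, 0.0, 0.2]
end_membership = [0.8, 0.2, 0.1, 0.3]
enterprise_pricing = [0.1, 0.9, 0.7, 0.0]

similar = cosine_similarity(cancel_subscription, end_membership)
unrelated = cosine_similarity(cancel_subscription, enterprise_pricing)
print(f"similar pair:   {similar:.3f}")
print(f"unrelated pair: {unrelated:.3f}")
```

The two subscription questions point in nearly the same direction and score close to 1, while the pricing question scores much lower.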
How It Works
- During Ingestion:
  - Your content is split into chunks
  - Each chunk is converted to a vector
  - Vectors are stored in your vector database
- During Search:
  - Your search query is converted to a vector
  - Similar vectors are found
  - Most relevant matches are returned
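The two phases above can be sketched end to end in a few lines. Everything here is hypothetical: `toy_embed` is a word-count stand-in for a real embedding model (it only matches shared words, which a real model does not require), `InMemoryVectorStore` stands in for a vector database, and the eight-word chunk size is arbitrary.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, max_words: int = 8) -> list[str]:
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str) -> Counter:
    """Stand-in for an embedding model: a sparse vector of word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def ingest(self, document: str) -> None:
        # Ingestion: split into chunks, embed each chunk, store the vector.
        for chunk in chunk_text(document):
            self.rows.append((chunk, toy_embed(chunk)))

    def search(self, query: str, top_k: int = 1) -> list[tuple[float, str]]:
        # Search: embed the query, score every stored vector, return the best.
        query_vector = toy_embed(query)
        scored = [(cosine(query_vector, vec), chunk) for chunk, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


store = InMemoryVectorStore()
store.ingest(
    "To cancel your subscription open the billing page. "
    "Database connection timeouts usually mean the pool is exhausted."
)
results = store.search("database connection timeout")
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

A production system would replace the linear scan in `search` with an approximate nearest-neighbor index, but the sequence of steps is the same.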
In RAGaaS, embedding models are configured at the namespace level. You'll need to provide your own API key for the provider you choose (OpenAI, Cohere, or Jina).
Supported Models
OpenAI Models
text-embedding-3-small (Recommended)
- Dimensions: 1536
- Max Input: 8191 tokens
- Use Case: Best for most use cases
- Languages: Good for English and multilingual content
- Token Cost: $0.00002 / 1K tokens
text-embedding-3-large
- Dimensions: 3072
- Max Input: 8191 tokens
- Use Case: Highest accuracy needs
- Languages: Good for English and multilingual content
- Token Cost: $0.00013 / 1K tokens
text-embedding-ada-002 (Legacy)
- Not recommended for new projects
- Use text-embedding-3-small instead
Cohere Models
embed-english-v3.0
- Dimensions: 1024
- Max Input: 512 tokens
- Use Case: English content
- Token Cost: $0.00001 / 1K tokens
embed-multilingual-v3.0
- Dimensions: 1024
- Max Input: 512 tokens
- Use Case: 100+ languages
- Token Cost: $0.00001 / 1K tokens
embed-english-light-v3.0
- Dimensions: 384
- Max Input: 512 tokens
- Use Case: Cost-effective English
- Token Cost: $0.000005 / 1K tokens
embed-multilingual-light-v3.0
- Dimensions: 384
- Max Input: 512 tokens
- Use Case: Cost-effective multilingual
- Token Cost: $0.000005 / 1K tokens
Jina Models
jina-embeddings-v3
- Dimensions: 1024
- Max Input: 8192 tokens
- Use Case: General purpose, high performance
- Languages: Good for English and multilingual content
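When comparing models, it can help to turn the per-model figures into a rough estimate for your own corpus. The sketch below uses the dimensions and token prices listed in this section; the 4 bytes per value (float32) is an assumption, and real vector databases add index and metadata overhead on top. The function names and the corpus size are illustrative only.

```python
# Dimensions and prices copied from the model listings in this section.
# jina-embeddings-v3 is left out because no price is listed for it here.
MODELS = {
    "text-embedding-3-small": {"dimensions": 1536, "usd_per_1k_tokens": 0.00002},
    "text-embedding-3-large": {"dimensions": 3072, "usd_per_1k_tokens": 0.00013},
    "embed-english-v3.0": {"dimensions": 1024, "usd_per_1k_tokens": 0.00001},
    "embed-multilingual-v3.0": {"dimensions": 1024, "usd_per_1k_tokens": 0.00001},
    "embed-english-light-v3.0": {"dimensions": 384, "usd_per_1k_tokens": 0.000005},
    "embed-multilingual-light-v3.0": {"dimensions": 384, "usd_per_1k_tokens": 0.000005},
}


def embedding_cost_usd(model: str, total_tokens: int) -> float:
    """Estimated one-time cost of embedding total_tokens tokens."""
    return total_tokens / 1000 * MODELS[model]["usd_per_1k_tokens"]


def raw_vector_bytes(model: str, num_vectors: int, bytes_per_value: int = 4) -> int:
    """Raw size of the stored vectors, assuming float32 values and no index overhead."""
    return num_vectors * MODELS[model]["dimensions"] * bytes_per_value


# Hypothetical corpus: 10,000 chunks of about 500 tokens each.
num_chunks = 10_000
tokens_per_chunk = 500
for name in ("text-embedding-3-small", "text-embedding-3-large", "embed-english-light-v3.0"):
    cost = embedding_cost_usd(name, num_chunks * tokens_per_chunk)
    size_mb = raw_vector_bytes(name, num_chunks) / 1_000_000
    print(f"{name}: ${cost:.4f} to embed, about {size_mb:.1f} MB of raw vectors")
```

Higher-dimensional models cost more to store and search as well as to embed, which is worth weighing against any accuracy gain.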
Model Selection Guide
- General Purpose (Recommended)
  {
    "embeddingModelConfig": {
      "provider": "OPENAI",
      "model": "text-embedding-3-small",
      "apiKey": "your-openai-key"
    }
  }
- Multilingual Content
  {
    "embeddingModelConfig": {
      "provider": "COHERE",
      "model": "embed-multilingual-v3.0",
      "apiKey": "your-cohere-key"
    }
  }
- Cost-Sensitive
  {
    "embeddingModelConfig": {
      "provider": "COHERE",
      "model": "embed-english-light-v3.0",
      "apiKey": "your-cohere-key"
    }
  }
- High Performance
  {
    "embeddingModelConfig": {
      "provider": "JINA",
      "model": "jina-embeddings-v3",
      "apiKey": "your-jina-key"
    }
  }
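If you generate these configurations in code, a small helper can catch a provider and model mismatch before the payload is sent. This is a sketch only: the field names and model lists come from this page, `build_embedding_config` is a hypothetical helper, and how the resulting JSON is submitted to RAGaaS is not shown here.

```python
import json
import os

# Provider and model names as listed under Supported Models.
SUPPORTED_MODELS = {
    "OPENAI": {"text-embedding-3-small", "text-embedding-3-large", "text-embedding-ada-002"},
    "COHERE": {
        "embed-english-v3.0",
        "embed-multilingual-v3.0",
        "embed-english-light-v3.0",
        "embed-multilingual-light-v3.0",
    },
    "JINA": {"jina-embeddings-v3"},
}


def build_embedding_config(provider: str, model: str, api_key: str) -> dict:
    """Return an embeddingModelConfig payload, rejecting unknown combinations."""
    provider = provider.upper()
    if provider not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported provider: {provider}")
    if model not in SUPPORTED_MODELS[provider]:
        raise ValueError(f"{model} is not a {provider} model")
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "embeddingModelConfig": {
            "provider": provider,
            "model": model,
            "apiKey": api_key,
        }
    }


# Read the key from the environment rather than hard-coding it;
# COHERE_API_KEY is an illustrative variable name.
api_key = os.environ.get("COHERE_API_KEY", "your-cohere-key")
config = build_embedding_config("cohere", "embed-multilingual-v3.0", api_key)
print(json.dumps(config, indent=2))
```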
Common Issues
- Invalid API Key
  - Verify the key's format and permissions
  - Check that the key has not expired
  - Ensure the account has sufficient credit or quota
- Rate Limits
  - OpenAI: Tier-based rate limits
  - Cohere: 2,000 requests/minute
  - Jina: Tier-based rate limits
  - The system handles retries automatically
- Token Limits
  - OpenAI: 8,192 tokens max
  - Cohere: 512 tokens max
  - Jina: 8,192 tokens max
  - Content is automatically chunked to fit
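The retry policy RAGaaS applies for rate limits is not documented here. As an illustration of what automatic retrying typically involves, the sketch below shows exponential backoff with full jitter. `RateLimitError`, `call_with_backoff`, and `flaky_embed` are hypothetical names, and the delays are arbitrary defaults.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error standing in for a provider's HTTP 429 response."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing, jittered waits."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time up to the current ceiling so that
            # many clients do not retry at the same instant.
            sleep(random.uniform(0, ceiling))
    raise ValueError("max_attempts must be at least 1")


# Simulated embedding call that is rate limited twice, then succeeds.
attempts = {"count": 0}


def flaky_embed() -> list[float]:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return [0.1, 0.2, 0.3]


delays: list[float] = []  # record the waits instead of actually sleeping
vector = call_with_backoff(flaky_embed, sleep=delays.append)
print(f"succeeded after {attempts['count']} attempts, waits: {delays}")
```

Passing `sleep` as a parameter keeps the example fast to run and easy to test; in real use the default `time.sleep` applies.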