diff --git a/MODELS.md b/MODELS.md
new file mode 100644
index 0000000..9b788c2
--- /dev/null
+++ b/MODELS.md
@@ -0,0 +1,81 @@
+# Available watsonx.ai Models
+
+This document lists all foundation models currently available in watsonx.ai.
+
+## Chat & Text Generation Models
+
+| Model Label | Model ID | Provider | Parameters | Max Context |
+|------------|----------|----------|------------|-------------|
+| granite-3-8b-instruct | ibm/granite-3-8b-instruct | IBM | 8B | 8,192 |
+| granite-3.1-8b-instruct | ibm/granite-3.1-8b-instruct | IBM | 8B | 131,072 |
+| granite-4-h-small | ibm/granite-4-h-small | IBM | - | 131,072 |
+| llama-3-3-70b-instruct | meta-llama/llama-3-3-70b-instruct | Meta | 70B | 131,072 |
+| llama-guard-3-11b-vision | meta-llama/llama-guard-3-11b-vision | Meta | 11B | 131,072 |
+| mistral-large-2512 | mistral-large-2512 | Mistral AI | 675B (41B active) | - |
+| mistral-medium-2505 | mistralai/mistral-medium-2505 | Mistral AI | - | 131,072 |
+| mistral-small-3-1-24b-instruct-2503 | mistralai/mistral-small-3-1-24b-instruct-2503 | Mistral AI | 24B | 128,000 |
+| gpt-oss-120b | openai/gpt-oss-120b | OpenAI | 120B (5.1B active) | 131,072 |
+
+## Embedding Models
+
+| Model Label | Model ID | Provider | Dimensions | Max Tokens |
+|------------|----------|----------|------------|------------|
+| all-minilm6-v2 | sentence-transformers/all-minilm-l6-v2 | sentence-transformers | 384 | 512 |
+| slate-125m-english-rtrvr | ibm/slate-125m-english-rtrvr | IBM | - | - |
+| slate-30m-english-rtrvr | ibm/slate-30m-english-rtrvr | IBM | - | - |
+
+## Model Capabilities
+
+### IBM Granite Models
+- **granite-3-8b-instruct**: General-purpose instruction-following model
+- **granite-3.1-8b-instruct**: Enhanced version with extended context (131K tokens)
+- **granite-4-h-small**: Latest Granite 4 model with advanced capabilities
+
+### Meta Llama Models
+- **llama-3-3-70b-instruct**: Large instruction-tuned model with 70B parameters
+- **llama-guard-3-11b-vision**: Multimodal model with vision capabilities
+
+### Mistral Models
+- **mistral-large-2512**: State-of-the-art MoE model with 675B total parameters
+- **mistral-medium-2505**: Multimodal model with vision and extended context
+- **mistral-small-3-1-24b-instruct-2503**: Efficient 24B parameter model with vision
+
+### OpenAI Models
+- **gpt-oss-120b**: Open-weight model optimized for reasoning and agentic tasks
+
+## Supported Tasks
+
+All chat models support:
+- Question Answering
+- Summarization
+- Retrieval Augmented Generation (RAG)
+- Classification
+- Text Generation
+- Code Generation
+- Extraction
+- Translation
+- Function Calling
+
+## Usage Notes
+
+1. **Context Length**: Models with 131K+ context length support long-document processing
+2. **Vision Models**: Models with `image_chat` function support multimodal inputs
+3. **Function Calling**: Most models support OpenAI-compatible function calling
+4. **Multilingual**: Many models support multiple languages (see model specifications)
+
+## Model Selection Guide
+
+- **For general chat**: `ibm/granite-3-8b-instruct` or `meta-llama/llama-3-3-70b-instruct`
+- **For long context**: `ibm/granite-3.1-8b-instruct` or `mistralai/mistral-medium-2505`
+- **For vision tasks**: `meta-llama/llama-guard-3-11b-vision` or `mistralai/mistral-medium-2505`
+- **For embeddings**: `sentence-transformers/all-minilm-l6-v2` or `ibm/slate-125m-english-rtrvr`
+- **For reasoning**: `openai/gpt-oss-120b` or `mistral-large-2512`
+
+## API Version
+
+All models use API version: `2024-02-13`
+
+---
+
+*Last updated: 2026-02-23*
+*Source: watsonx.ai Foundation Model Specs API*
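
Reviewer note (not part of the patch): usage note 1 refers to models with "131K+ context length", which can be checked mechanically against the chat table. The following is a minimal Python sketch under stated assumptions. The `MAX_CONTEXT` mapping and the `models_with_min_context` helper are hypothetical names introduced for illustration; the values are copied from the table, with `None` standing in for the `-` entries. It does not call any watsonx.ai API.

```python
from typing import Optional

# Max context values copied from the "Chat & Text Generation Models" table.
# None stands for the "-" entries, where the table gives no value.
MAX_CONTEXT: dict[str, Optional[int]] = {
    "ibm/granite-3-8b-instruct": 8_192,
    "ibm/granite-3.1-8b-instruct": 131_072,
    "ibm/granite-4-h-small": 131_072,
    "meta-llama/llama-3-3-70b-instruct": 131_072,
    "meta-llama/llama-guard-3-11b-vision": 131_072,
    "mistral-large-2512": None,
    "mistralai/mistral-medium-2505": 131_072,
    "mistralai/mistral-small-3-1-24b-instruct-2503": 128_000,
    "openai/gpt-oss-120b": 131_072,
}


def models_with_min_context(min_tokens: int) -> list[str]:
    """Return model IDs whose listed max context is at least min_tokens.

    Models with no listed value (None) are excluded, since the table
    does not say whether they meet the threshold.
    """
    return [
        model_id
        for model_id, context in MAX_CONTEXT.items()
        if context is not None and context >= min_tokens
    ]


if __name__ == "__main__":
    for model_id in models_with_min_context(131_072):
        print(model_id)
```

With a threshold of 131,072 tokens, `mistralai/mistral-small-3-1-24b-instruct-2503` (128,000) is excluded, and `mistral-large-2512` is excluded because the table lists no max context for it.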