# MiniMax LLM
The `minimax_llm` module integrates MiniMax models into AutoRAG through MiniMax's OpenAI-compatible API.
## Supported Models
| Model | Context Window |
|---|---|
| MiniMax-M2.7 | 1,048,576 tokens |
| MiniMax-M2.7-highspeed | 1,048,576 tokens |
| MiniMax-M2.5 | 1,048,576 tokens |
| MiniMax-M2.5-highspeed | 204,800 tokens |
## Features
### Auto-truncate prompt
Prompts that exceed the model’s token limit are automatically truncated to prevent API errors.
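The truncation step can be sketched as follows. This is an illustrative helper, not the module's actual code; whitespace splitting stands in for the model's real tokenizer, which the library would use to count tokens accurately.

```python
def truncate_prompt(prompt: str, max_tokens: int) -> str:
    """Truncate a prompt to at most max_tokens tokens.

    Whitespace splitting is a stand-in for a real tokenizer (an assumption
    for illustration); a production version would encode/decode with the
    model's tokenizer instead.
    """
    tokens = prompt.split()
    if len(tokens) <= max_tokens:
        return prompt  # already within the limit, pass through unchanged
    return " ".join(tokens[:max_tokens])
```

With `truncate: True` (the default), a sketch like this runs before every API call, so oversized prompts are shortened instead of triggering a context-length error.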
### Temperature clamping
MiniMax models accept temperature values between 0 and 1. Values above 1.0 are automatically clamped to 1.0.
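The clamping rule described above is a one-liner. `clamp_temperature` is an illustrative name, not part of the module's public API:

```python
def clamp_temperature(temperature: float) -> float:
    """Clamp a sampling temperature to MiniMax's accepted range.

    Values above 1.0 are reduced to 1.0; in-range values pass through.
    """
    return min(temperature, 1.0)
```

This means a config that sweeps `temperature: [0.7, 1.5]` will effectively run with 0.7 and 1.0 against MiniMax models.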
### Think-tag stripping
MiniMax M2.5+ models may include `<think>...</think>` reasoning tags in their output. These are automatically stripped from the generated text.
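A regular expression is one way to do this stripping. The sketch below is an assumption about the approach, not the module's exact code, and it assumes the tags are well-formed and non-nested:

```python
import re

# DOTALL lets the pattern match reasoning that spans multiple lines;
# the non-greedy .*? stops at the first closing tag.
THINK_TAG_RE = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)

def strip_think_tags(text: str) -> str:
    """Remove <think>...</think> blocks and any trailing whitespace after them."""
    return THINK_TAG_RE.sub("", text)
```

For example, `strip_think_tags("<think>plan the answer</think>The capital is Paris.")` leaves only the visible answer text.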
## Module Parameters
- `llm`: The MiniMax model name, for example `MiniMax-M2.7` or `MiniMax-M2.5-highspeed`.
- `batch`: The batch size for API calls. Default is `16`.
- `truncate`: Whether to truncate input prompts to the model's max length. Default is `True`.
- `api_key`: The MiniMax API key. It can also be set via the `MINIMAX_API_KEY` environment variable.

All parameters from the OpenAI Chat Completion API are also supported, since MiniMax exposes an OpenAI-compatible endpoint.
## Example config.yaml
```yaml
modules:
  - module_type: minimax_llm
    llm: [MiniMax-M2.7, MiniMax-M2.5-highspeed]
    temperature: [0.1, 0.5]
    max_tokens: 512
    api_key: ${MINIMAX_API_KEY}
```