fix: reduce max_completion_tokens for Groq API to 8192 #1279

Open
yang1002378395-cmyk wants to merge 4 commits into khoj-ai:master from yang1002378395-cmyk:fix-groq-max-tokens
Conversation

@yang1002378395-cmyk
Contributor

Problem

The Groq API caps max_completion_tokens at 8192, but the code hardcoded the value to 16000, so requests to Groq were rejected with API errors.

Fixes #1236

Solution

  • Adds a MAX_COMPLETION_TOKENS_GROQ constant (8192)
  • Uses the Groq-specific limit when api_base_url points to api.groq.com
  • Applies the limit in both the sync (completion_with_backoff) and async (chat_completion_with_backoff) functions
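
The selection logic described above could look roughly like the sketch below. This is not the diff from this PR: the helper names `get_max_completion_tokens` and `_hostname` and the default constant name `MAX_COMPLETION_TOKENS` are hypothetical, and the exact host check used in the actual change may differ. Only `MAX_COMPLETION_TOKENS_GROQ`, the values 8192 and 16000, and the `api.groq.com` host come from the description.

```python
from typing import Optional
from urllib.parse import urlparse

# Values taken from the PR description; the default's name is an assumption.
MAX_COMPLETION_TOKENS = 16000
MAX_COMPLETION_TOKENS_GROQ = 8192


def _hostname(url: str) -> str:
    """Hypothetical helper: extract the lowercase hostname, with or without a scheme."""
    parsed = urlparse(url if "://" in url else f"//{url}")
    return (parsed.hostname or "").lower()


def get_max_completion_tokens(api_base_url: Optional[str]) -> int:
    """Hypothetical helper: pick the completion token limit for a given API base URL."""
    if api_base_url and _hostname(api_base_url) == "api.groq.com":
        return MAX_COMPLETION_TOKENS_GROQ
    return MAX_COMPLETION_TOKENS
```

Comparing the parsed hostname rather than doing a substring match avoids treating an unrelated URL that merely contains `api.groq.com` in its path as a Groq endpoint. Both the sync and async call sites could share one helper like this so the limits cannot drift apart.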

Testing

  • Python syntax check passed
  • The fix is minimal and follows existing code patterns (similar to how fields_to_exclude is handled for Groq in clean_response_schema)

阳虎 added 4 commits March 15, 2026 00:36
When ChatModel.friendly_name is None, __str__ returns None causing:
TypeError: __str__ returned non-string (type NoneType)

Fixed by falling back to name field when friendly_name is None.
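
A minimal sketch of the fallback described in this commit message, using a plain dataclass as a stand-in for the real model class. `ChatModelSketch` is a hypothetical name, and the actual `ChatModel` has more fields than shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatModelSketch:
    """Hypothetical stand-in for ChatModel with only the two relevant fields."""

    name: str
    friendly_name: Optional[str] = None

    def __str__(self) -> str:
        # Fall back to name when friendly_name is None, so __str__ always
        # returns a string. Note that `or` also falls back on an empty string.
        return self.friendly_name or self.name
```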
…loading model

When SearchModelConfig.ApiType.LOCAL is set with an embeddings_inference_endpoint,
Khoj was still downloading the model from HuggingFace instead of using the API.

Changes:
- Only load SentenceTransformer locally when ApiType.LOCAL and no endpoint configured
- Use OpenAI-compatible API for local endpoints (llama.cpp, vLLM, etc.)
- Handle None API key for local servers that don't require authentication
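
The decision described in the changes above can be sketched as two small functions. Everything here is illustrative: the `ApiType` enum is a simplified stand-in for `SearchModelConfig.ApiType` with assumed members, and `should_load_model_locally`, `resolve_api_key`, and the placeholder key string are hypothetical names, not code from the commit.

```python
from enum import Enum
from typing import Optional


class ApiType(Enum):
    """Simplified stand-in for SearchModelConfig.ApiType; members are assumptions."""

    LOCAL = "local"
    OPENAI = "openai"


def should_load_model_locally(api_type: ApiType, endpoint: Optional[str]) -> bool:
    """Load SentenceTransformer in-process only for LOCAL with no endpoint configured."""
    return api_type == ApiType.LOCAL and not endpoint


def resolve_api_key(api_key: Optional[str]) -> str:
    """Substitute a placeholder when a local server needs no authentication.

    Assumes the client in use expects a non-empty key string.
    """
    return api_key or "not-needed"
```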

Fixes khoj-ai#1253

- Add NEXT_PUBLIC_CSP_IMG_DOMAINS env var for build-time configuration
- Add KHOJ_CSP_IMG_DOMAINS env var for runtime hint
- Add CSPHeadersMiddleware to pass domains via X-Khoj-CSP-Img-Domains header
- Document new env var in docker-compose.yml

Fixes khoj-ai#1249

Groq API has a maximum of 8192 tokens for max_completion_tokens,
but the code was hardcoded to 16000, causing API errors.

This fix:
- Adds MAX_COMPLETION_TOKENS_GROQ constant (8192)
- Uses Groq-specific limit when api_base_url is api.groq.com
- Applied to both sync and async completion functions

Fixes khoj-ai#1236
Development

Successfully merging this pull request may close these issues.

BadRequestError: 400 when using Free Groq/OpenAI compatible APIs due to max_completion_tokens mismatch
