BenchMark Quizzes by MDFT Pro

Understand Azure OpenAI Token Limit Calculation

MDFT Pro, a well-known training agency, is optimizing their Azure OpenAI implementation for cost efficiency and performance in their student support system. Mark, the Technical Architect, is configuring token limits to ensure responses are appropriately sized for different types of student inquiries while managing costs effectively.

Mark needs to understand how token limits work so that he can balance detailed course information responses against operational efficiency. The system handles query types ranging from short factual questions to complex course recommendation requests, so the token limit must be set to preserve response quality while keeping costs under control.

The system is configured with the following settings:

Temperature: 1
Top probabilities: 0.5
Max response tokens: 100
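
As a rough illustration of how these settings might be expressed in a chat completions request body, here is a minimal sketch. It assumes the portal labels map to the request fields temperature, top_p, and max_tokens; the helper name build_chat_request and the sample question are hypothetical and not part of the original scenario. No request is actually sent.

```python
import json


def build_chat_request(question: str) -> dict:
    """Build a chat completions request body from the scenario's settings.

    Assumption: the three settings map to the request fields below.
    """
    return {
        "messages": [{"role": "user", "content": question}],
        "temperature": 1,   # "Temperature"
        "top_p": 0.5,       # "Top probabilities"
        "max_tokens": 100,  # "Max response tokens"
    }


# Hypothetical student question, used only to make the sketch runnable.
body = build_chat_request("Who is the founder of MDFT Pro?")
print(json.dumps(body, indent=2))
```

Newer model families may expect a differently named field for the response limit, so the field name is worth checking against the API version in use.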

When processing a student inquiry, the system returns this response:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The founder of MDFT Pro is Mark Farragher.",
        "role": "assistant"
      }
    }
  ],
  "created": 1679014554,
  "id": "chatcmpl-6usfny2yyjkbmESe36JdqQ6bDsc01",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 86,
    "prompt_tokens": 37,
    "total_tokens": 123
  }
}

Are the prompt_tokens included in the calculation of the Max response tokens limit?

Choose the correct answer from the options below.

Yes
No
Only in certain pricing tiers
Only when using system messages

Explanations for each answer:

Yes is incorrect. Prompt tokens (input) and max response tokens (output limit) are counted separately: the limit applies only to the completion output, not to the combined total.
No is correct. Max response tokens applies only to the completion (response) tokens generated by the model, not to the prompt tokens sent as input. The response above shows this: completion_tokens is 86, which is within the limit of 100, while total_tokens (37 prompt + 86 completion) is 123, which would exceed the limit if prompt tokens counted toward it. The finish_reason of "stop" also indicates that the model finished on its own rather than being cut off by the limit.
Only in certain pricing tiers is incorrect. The separation between prompt tokens and max response tokens is consistent across all Azure OpenAI pricing models and tiers.
Only when using system messages is incorrect. Token calculation works the same way whether or not the prompt includes system messages: max response tokens still limits only the output.
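
The reasoning in the explanations above can be checked against the usage block of the response. The following is a minimal sketch, not an official validation routine: check_usage is a hypothetical helper, the JSON is a trimmed copy of the response shown earlier, and the sketch assumes that a finish_reason of "length" is what signals a response cut off by the token limit.

```python
import json

MAX_RESPONSE_TOKENS = 100  # the "Max response tokens" setting from the scenario

# Trimmed copy of the response shown earlier (only the fields the check needs).
RESPONSE_JSON = """
{
  "choices": [{"finish_reason": "stop", "index": 0}],
  "usage": {"completion_tokens": 86, "prompt_tokens": 37, "total_tokens": 123}
}
"""


def check_usage(response: dict, max_response_tokens: int) -> dict:
    """Compare the usage counters in a response with the response token limit."""
    usage = response["usage"]
    return {
        # The limit is compared with the completion tokens only.
        "completion_within_limit": usage["completion_tokens"] <= max_response_tokens,
        # The combined total may exceed the limit because the prompt is not counted.
        "total_exceeds_limit": usage["total_tokens"] > max_response_tokens,
        # total_tokens is the sum of prompt and completion tokens.
        "total_is_sum": usage["prompt_tokens"] + usage["completion_tokens"]
        == usage["total_tokens"],
        # Assumption: "length" marks a response truncated by the limit.
        "truncated": response["choices"][0]["finish_reason"] == "length",
    }


result = check_usage(json.loads(RESPONSE_JSON), MAX_RESPONSE_TOKENS)
for name, value in result.items():
    print(f"{name}: {value}")
```

For this response the first three checks come out true and truncated is false, which matches the answer that the limit applies to completion tokens only.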

Learn more about Azure OpenAI token management:

Azure OpenAI Tokens
