ChatCompletion Endpoint

This guide breaks down the core functionality of the ChatCompletion endpoint.

ChatCompletion Endpoint Overview

Chat models process a series of messages to generate responses, suitable for both multi-turn conversations and single-turn tasks. The Konko SDK simplifies interaction with these models, ensuring a responsive and context-aware dialogue.

Setting Up

Ensure your environment is ready with Python 3.8 or later. For SDK installation, visit Setup Access & SDK.
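You can confirm your interpreter meets the requirement before installing:

```python
import sys

# The SDK requires Python 3.8 or later; fail fast with a clear message otherwise.
assert sys.version_info >= (3, 8), (
    f"Python 3.8+ required, found {sys.version.split()[0]}"
)
print("Python version OK:", sys.version.split()[0])
```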

Retrieving Model Metadata

Use the Konko Python SDK to access metadata for all models:

import konko

# Retrieve metadata for all available models
models = konko.models.list()
print(models)

The response includes model IDs and attributes indicating their capabilities:

SyncPage[Model](
  data=[
    Model(
      id='codellama/codellama-13b-instruct',
      created=1702453635,
      object='model',
      owned_by='konko',
      is_chat=True,
      is_text=False,
      creator='Meta',
      creator_logo_url='https://platform.konko.ai/model-creators/meta.svg',
      name='codellama/codellama-13b-instruct',
      description='WIP',
      category='Coding: Generalist',
      censorship='Censored',
      max_context_length=8192,
      license_name='Llama 2 Community License',
      license_url='https://ai.meta.com/llama/license/',
      public_endpoint_pricing='$0.0003 / 1k tokens',
      private_endpoint_pricing='$2 / hour',
      docs_url='https://docs.konko.ai/v0.5.0/docs/list-of-models',
      prompt_guide_url='https://docs.konko.ai/v0.5.0/docs/prompting#prompting-llama-2-chat-and-codellama-instruct'
    ),
    ...
    Model(
      id='phind/phind-codellama-34b-v2',
      created=1702496623,
      object='model',
      owned_by='konko',
      is_chat=False,
      is_text=True,
      creator='Phind',
      creator_logo_url='https://platform.konko.ai/model-creators/phind.webp',
      name='phind/phind-codellama-34b-v2',
      description='WIP',
      category='Coding: Generalist',
      censorship='Censored',
      max_context_length=16384,
      license_name='Llama 2 Community License',
      license_url='https://ai.meta.com/llama/license/',
      public_endpoint_pricing='$0.0008 / 1k tokens',
      private_endpoint_pricing='$2.8 / hour',
      docs_url='https://docs.konko.ai/v0.5.0/docs/list-of-models',
      prompt_guide_url='https://docs.konko.ai/v0.5.0/docs/prompting#prompting-phind-codellama-34b-v2'
    )
  ],
  object='list'
)

📘

To determine whether a model works with the ChatCompletion or Completion endpoint, check its is_chat attribute in the response above or refer to our Available Models.

Best Practice: Before calling the ChatCompletion endpoint, verify the model ID against the SDK response. This confirms the model supports chat and that you are using a current model ID.
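As a sketch of that check, you can filter the listing down to chat-capable models client-side. The helper below operates on plain dicts shaped like the metadata above; with the SDK's model objects you would read the is_chat attribute instead of a key:

```python
def chat_model_ids(models):
    """Return the IDs of models that support the ChatCompletion endpoint."""
    return [m["id"] for m in models if m.get("is_chat")]

# Sample entries mirroring the metadata shown above.
models = [
    {"id": "codellama/codellama-13b-instruct", "is_chat": True},
    {"id": "phind/phind-codellama-34b-v2", "is_chat": False},
]

print(chat_model_ids(models))  # ['codellama/codellama-13b-instruct']
```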

Using the SDK for ChatCompletion

A typical SDK call:

import konko

konko_completion = konko.chat.completions.create(
    model="nousresearch/nous-hermes-llama2-13b",
    messages=[
        {"role": "system", "content": "You are a summarizer"},
        {"role": "user", "content": '''Your prompt here...'''}
    ],
    temperature=0.1,
    max_tokens=300,
    n=1
)
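The call returns an OpenAI-style completion object, with the assistant's reply in the first choice's message. A minimal sketch of reading it, using a plain dict in place of the live response object (with the SDK you would access konko_completion.choices[0].message.content):

```python
# Hypothetical response shaped like a ChatCompletion result.
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Here is the summary..."},
        }
    ],
    "usage": {"prompt_tokens": 42, "completion_tokens": 12, "total_tokens": 54},
}

# Pull out the assistant's reply text from the first choice.
reply = response["choices"][0]["message"]["content"]
print(reply)
```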

Chat Model Guidelines

  • Input Format: Models take an array of message objects, each with a role ("system", "user", or "assistant") and content.
  • Conversation Flow: Start with a system message to set the model's role, followed by alternating user and assistant messages.
  • Include Context: Provide relevant conversation history in each request; the model does not remember past interactions.
  • System Messages: Optional but useful for guiding model behavior.
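Putting these guidelines together, a multi-turn exchange re-sends the full history on every call. Below is a sketch of assembling that message array as a plain list; the konko.chat.completions.create call itself is elided, and the helper name is illustrative:

```python
def build_messages(system_prompt, history, new_user_message):
    """Assemble the full message array the model needs on every call."""
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend(history)  # prior user/assistant turns, oldest first
    messages.append({"role": "user", "content": new_user_message})
    return messages

# Prior turns the model would otherwise not remember.
history = [
    {"role": "user", "content": "Summarize: the sky is blue."},
    {"role": "assistant", "content": "The sky is blue."},
]

messages = build_messages(
    "You are a summarizer", history, "Now translate that summary to French."
)
print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```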

For more details, see our API reference page.


