Completion Endpoint

This guide breaks down the core functionality of the Completion endpoint.

Completion Endpoint Overview

The Completion endpoint generates a response from a single input prompt, unlike ChatCompletion, which handles multi-turn conversations. It is ideal for tasks like content generation, summarization, and single-shot question answering.

Setting Up

Ensure your development environment is ready with Python 3.8 or later. Refer to Setup Access & SDK for SDK installation.
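If you have not installed the SDK yet, a typical setup looks like the following. These commands are a sketch; the package name, and especially the environment variable name, are assumptions here, so treat Setup Access & SDK as the authoritative source.

```shell
python3 --version           # confirm Python 3.8 or later
pip install konko           # install the Konko Python SDK
export KONKO_API_KEY="..."  # assumed env var for your API key; see Setup Access & SDK
```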

Retrieving Model Metadata

Use the Konko Python SDK to access metadata for all models:

import konko

# Retrieve metadata for all available models
models = konko.models.list()

The response includes model IDs and attributes indicating their capabilities:

SyncPage[Model](
  data=[
    Model(
      id='codellama/codellama-13b-instruct',
      created=1702453635,
      object='model',
      owned_by='konko',
      is_chat=True,
      is_text=False,
      creator='Meta',
      creator_logo_url='https://platform.konko.ai/model-creators/meta.svg',
      name='codellama/codellama-13b-instruct',
      description='WIP',
      category='Coding: Generalist',
      censorship='Censored',
      max_context_length=8192,
      license_name='Llama 2 Community License',
      license_url='https://ai.meta.com/llama/license/',
      public_endpoint_pricing='$0.0003 / 1k tokens',
      private_endpoint_pricing='$2 / hour',
      docs_url='https://docs.konko.ai/v0.5.0/docs/list-of-models',
      prompt_guide_url='https://docs.konko.ai/v0.5.0/docs/prompting#prompting-llama-2-chat-and-codellama-instruct'
    ),
    ...
    Model(
      id='phind/phind-codellama-34b-v2',
      created=1702496623,
      object='model',
      owned_by='konko',
      is_chat=False,
      is_text=True,
      creator='Phind',
      creator_logo_url='https://platform.konko.ai/model-creators/phind.webp',
      name='phind/phind-codellama-34b-v2',
      description='WIP',
      category='Coding: Generalist',
      censorship='Censored',
      max_context_length=16384,
      license_name='Llama 2 Community License',
      license_url='https://ai.meta.com/llama/license/',
      public_endpoint_pricing='$0.0008 / 1k tokens',
      private_endpoint_pricing='$2.8 / hour',
      docs_url='https://docs.konko.ai/v0.5.0/docs/list-of-models',
      prompt_guide_url='https://docs.konko.ai/v0.5.0/docs/prompting#prompting-phind-codellama-34b-v2'
    )
  ],
  object='list'
)

📘 To determine whether a model is compatible with the ChatCompletion or Completion endpoint, check the is_chat and is_text attributes in the response (is_text=True indicates Completion support; is_chat=True indicates ChatCompletion support), or refer to our Available Models.

Best Practice: Before using the Completion endpoint, verify the model ID from the SDK response. This ensures compatibility with the desired LLM and the use of the latest models.
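The check above can be sketched as a small filter over the metadata. The helper below is illustrative, not part of the SDK; it operates on plain dicts mirroring the is_text attribute from the response shown earlier (with the real SDK you would read m.is_text from konko.models.list().data instead).

```python
# A minimal sketch: select models that support the Completion endpoint.
# The attribute names (is_text, is_chat) come from the metadata response above.
def completion_models(models):
    """Return IDs of models usable with the Completion endpoint."""
    return [m["id"] for m in models if m["is_text"]]

# Example using the two models from the response shown above:
models = [
    {"id": "codellama/codellama-13b-instruct", "is_chat": True, "is_text": False},
    {"id": "phind/phind-codellama-34b-v2", "is_chat": False, "is_text": True},
]
print(completion_models(models))  # ['phind/phind-codellama-34b-v2']
```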

Using the SDK for Completion

A typical SDK call:

import konko

response = konko.completions.create(
    model="meta-llama/llama-2-13b",
    prompt="Summarize the Foundation by Isaac Asimov",
    max_tokens=500
)
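When choosing max_tokens, keep in mind that the prompt and the completion together must fit within the model's max_context_length (shown in the metadata above). The sketch below checks this with a rough 4-characters-per-token heuristic; this ratio is an assumption for illustration, not an exact tokenizer, so use the model's actual tokenizer for precise counts.

```python
# A minimal sketch: check that prompt + max_tokens fits a model's context window.
# The 4-chars-per-token ratio is a rough heuristic, not an exact tokenizer.
def fits_context(prompt, max_tokens, max_context_length):
    approx_prompt_tokens = len(prompt) // 4 + 1
    return approx_prompt_tokens + max_tokens <= max_context_length

# max_context_length=16384 is taken from the phind-codellama-34b-v2 metadata above.
print(fits_context("Summarize the Foundation by Isaac Asimov", 500, 16384))  # True
```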