Prompting Guidelines

Best practices for crafting prompts for various models.

This guide explains how to prompt each of the language models available through the Konko API. Below are instructions for each model, including example code snippets.

Models and Prompting Formats

Prompting Llama 2 Chat and CodeLlama Instruct


  • Konko Model IDs:
    meta-llama/llama-2-70b-chat
    meta-llama/llama-2-13b-chat
    codellama/codellama-70b-instruct
    codellama/codellama-34b-instruct
    codellama/codellama-13b-instruct
    codellama/codellama-7b-instruct

  • Endpoint: ChatCompletion

  • Usage: When using these models, the prompt is structured with specific tags (<s>, [INST], <<SYS>>) to define the conversation flow. However, Konko's ChatCompletion endpoint automatically formats these prompts based on user input. Users provide their inputs in a conversational manner without needing to manually include these tags.

    Under the Hood: Prompt Formatting (FYI)

    For the Meta-LLaMa Models, the underlying prompt structure is as follows:

    <s>[INST] <<SYS>>
    system_prompt
    <</SYS>>

    user_prompt_1 [/INST] assistant_response_1 </s>
    <s>[INST] user_prompt_2 [/INST]
    
    • <s> and </s> mark the start and end of a message.
    • [INST] and [/INST] denote instructional blocks.
    • <<SYS>> and <</SYS>> enclose the system message.
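
    For illustration only, here is a hypothetical sketch (not Konko's actual implementation) of how a message list could be folded into this template:

    def format_llama2_chat(messages):
        # Hypothetical sketch of the formatting Konko applies for you.
        system = ""
        if messages and messages[0]["role"] == "system":
            system = messages[0]["content"]
            messages = messages[1:]
        prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n"
        first_user = True
        for message in messages:
            if message["role"] == "user":
                if not first_user:
                    prompt += "<s>[INST] "  # each later turn opens a new block
                prompt += message["content"] + " [/INST]"
                first_user = False
            else:  # an assistant turn closes the block
                prompt += " " + message["content"] + " </s>"
        return prompt
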
  • Example Code:

    import konko
    
    konko_completion = konko.chat.completions.create(
        model="meta-llama/llama-2-70b-chat",
        messages=[
            {"role": "system", "content": "You are a summarizer"},
            {"role": "user", "content": "Your prompt here..."}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
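
    Assuming the SDK returns an OpenAI-style response object (worth verifying against your installed version), each generated choice can be read as follows:

    # n=2 requests two completions, so iterate over the choices.
    for choice in konko_completion.choices:
        print(choice.message.content)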
    

Prompting Mistral-Orca


  • Model ID:
    open-orca/mistral-7b-openorca

  • Endpoint: ChatCompletion

  • Usage: This model utilizes a conversational format. The ChatCompletion endpoint in Konko handles the required formatting, allowing users to input their messages directly.

    Under the Hood: Prompt Formatting (FYI)

    For Open-Orca/Mistral-7B-OpenOrca, the underlying prompt format is as follows:

    <|im_start|>system
    You are MistralOrca, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!
    <|im_end|>
    <|im_start|>user
    How are you?<|im_end|>
    <|im_start|>assistant
    I am doing well!<|im_end|>
    <|im_start|>user
    Please tell me about how mistral winds have attracted super-orcas.<|im_end|>
    <|im_start|>assistant
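
    For illustration only, a hypothetical sketch of how a message list maps onto this ChatML template (Konko performs this for you):

    def format_chatml(messages):
        # Hypothetical sketch of the ChatML assembly Konko applies for you.
        prompt = ""
        for message in messages:
            prompt += f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        # A trailing assistant header cues the model to generate its reply.
        return prompt + "<|im_start|>assistant\n"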
    
  • Example Code:

    import konko
    
    konko_completion = konko.chat.completions.create(
        model="open-orca/mistral-7b-openorca",
        messages=[
            {"role": "system", "content": "You are MistralOrca, a large language model trained by Alignment Lab AI. Write out your reasoning step-by-step to be sure you get the right answers!"},
            {"role": "user", "content": "How are you?"},
            {"role": "assistant", "content": "I am doing well!"},
            {"role": "user", "content": "Please tell me about how mistral winds have attracted super-orcas."}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
    

Prompting MistralAI


  • Model IDs:
    mistralai/mistral-7b-instruct-v0.1
    mistralai/mistral-7b-instruct-v0.2
    mistralai/mixtral-8x7b-instruct-v0.1

  • Endpoint: ChatCompletion

  • Usage: These models are designed for instruction-based interactions. The ChatCompletion endpoint in Konko applies the required instruction formatting, so users can supply plain conversational messages.

    Under the Hood: Prompt Formatting (FYI)

    The prompt structure for the Mistral and Mixtral instruct models incorporates special tokens to delineate instructions. Here's how it looks:

    system_prompt
    <s>[INST] user_prompt_1 [/INST]
    assistant_response</s>
    [INST] user_prompt_2 [/INST]
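
    As a hypothetical sketch only (Konko applies this formatting server-side), a message list could map onto this template like so:

    def format_mistral_instruct(messages):
        # Hypothetical sketch of the formatting Konko applies for you.
        system = ""
        pieces = []
        for message in messages:
            if message["role"] == "system":
                system = message["content"] + "\n"
            elif message["role"] == "user":
                pieces.append(f"[INST] {message['content']} [/INST]")
            else:  # assistant
                pieces.append(f"{message['content']}</s>")
        return system + "<s>" + "\n".join(pieces)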
    
  • Example Code:

    import konko
    
    konko_instruct = konko.chat.completions.create(
        model="mistralai/mistral-7b-instruct-v0.1",
        messages=[
            {"role": "system", "content": "You are Mistral-7B-Instruct, an advanced language model from MistralAI, focusing on understanding and generating responses based on specific instructions. Ensure clarity and accuracy in your responses."},
            {"role": "user", "content": "What is your favourite condiment?"},
            {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
            {"role": "user", "content": "Do you have mayonnaise recipes?"}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
    

Prompting NousResearch


  • Model IDs:
    nousresearch/nous-hermes-llama2-13b
    nousresearch/nous-hermes-llama-2-7b

  • Endpoint: ChatCompletion

  • Usage: The Nous-Hermes Llama 2 models, developed by NousResearch, are optimized for chat-based interactions that follow a specific prompt format. The ChatCompletion endpoint in Konko is specifically designed to handle this format, facilitating clear and structured dialogues.

    Under the Hood: Prompt Formatting (FYI)

    The prompt structure for NousResearch/Nous-Hermes-Llama2-13B follows the Alpaca format, which includes clear instruction and response sections. Here's an example of how it looks:

    system_prompt
    ### Instruction:
    user_prompt
    
    ### Response:
    <leave a newline blank for model to respond>
    
    or
    
    system_prompt
    ### Instruction:
    user_prompt
    
    ### Input:
    additional_context
    
    ### Response:
    <leave a newline blank for model to respond>
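
    For illustration, a hypothetical helper (the name and behavior are illustrative, not Konko's actual code) showing how a system prompt, instruction, and optional context could be assembled into this template:

    def format_alpaca(system_prompt, instruction, context=None):
        # Hypothetical sketch of the Alpaca-style assembly described above.
        prompt = f"{system_prompt}\n### Instruction:\n{instruction}\n"
        if context:
            prompt += f"\n### Input:\n{context}\n"
        return prompt + "\n### Response:\n"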
    
  • Example Code:

    import konko
    
    konko_hermes = konko.chat.completions.create(
        model="nousresearch/nous-hermes-llama2-13b",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "Can you explain quantum computing?\n### Input:\nIncluding its implications for cryptography."}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
    

Prompting NousResearch Mixtral


  • Model IDs:
    nousresearch/nous-hermes-2-mixtral-8x7b-dpo
    nousresearch/nous-hermes-2-mixtral-8x7b-sft
    nousresearch/nous-hermes-2-yi-34b

  • Endpoint: ChatCompletion

  • Usage: These Nous Hermes 2 models are available as SFT-only and SFT + DPO variants of Mixtral, plus a Yi-34B variant. They excel particularly in tasks requiring deep understanding and complex language generation, thanks to Supervised Fine-Tuning and Direct Preference Optimization. The ChatCompletion endpoint in Konko is specifically designed to handle their prompt format, facilitating clear and structured dialogues.

    Under the Hood: Prompt Formatting (FYI)

    For these models, the prompt format integrates a rich, structured system for multi-turn dialogues. This format, known as ChatML, allows for precise control over the model's output by specifying system instructions, user inputs, and assistant responses. Here's how the structure looks:

    <|im_start|>system
    system_message<|im_end|>
    <|im_start|>user
    user_message<|im_end|>
    <|im_start|>assistant
    assistant_message<|im_end|>
    
    
  • Example Code:

    import konko
    
    konko_hermes = konko.chat.completions.create(
        model="nousresearch/nous-hermes-2-mixtral-8x7b-dpo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "Can you explain quantum computing?\n### Input:\nIncluding its implications for cryptography."}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
    

Prompting Yi Model


  • Model ID:
    zero-one-ai/yi-34b-chat

  • Endpoint: ChatCompletion

  • Usage: Crafted for bilingual proficiency and trained on a 3T-token multilingual corpus, the Yi series models deliver strong language comprehension, reasoning, and reading ability across languages.

    Under the Hood: Prompt Formatting (FYI)

    For these models, the prompt format integrates a rich, structured system for multi-turn dialogues. This format, known as ChatML, allows for precise control over the model's output by specifying system instructions, user inputs, and assistant responses. Here's how the structure looks:

    <|im_start|>system
    system_message<|im_end|>
    <|im_start|>user
    user_message<|im_end|>
    <|im_start|>assistant
    assistant_message<|im_end|>
    
    
  • Example Code:

    import konko
    
    konko_yi = konko.chat.completions.create(
        model="zero-one-ai/yi-34b-chat",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "Can you explain quantum computing?\n### Input:\nIncluding its implications for cryptography."}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
    

Prompting nous-capybara-7b-v1p9


  • Model ID:
    nousresearch/nous-capybara-7b-v1p9

  • Endpoint: ChatCompletion

  • Usage: Nous-Capybara-7B-V1.9 is a state-of-the-art model for generating complex, multi-turn conversations and intricate summaries on advanced topics. It's highly efficient in conversational context understanding and response generation, especially in fields requiring detailed, nuanced dialogues.

    Under the Hood: Prompt Formatting (FYI)

    The Nous-Capybara-7B-V1.9 model uses a conversational prompt structure, optimized for generating detailed and contextually rich multi-turn dialogues. The structure typically follows this pattern:

    system_prompt
    <s>User: [User Prompt]</s>
    <s>Assistant: [Assistant Response]</s>
    <s>User: [Next User Prompt]</s>
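
    As an illustration only, a hypothetical sketch of how a message list could map onto this structure (Konko handles this for you; the trailing assistant opener is presumably appended server-side):

    def format_capybara(messages):
        # Hypothetical sketch of the conversational assembly shown above.
        roles = {"user": "User", "assistant": "Assistant"}
        prompt = ""
        for message in messages:
            if message["role"] == "system":
                prompt += message["content"] + "\n"
            else:
                prompt += f"<s>{roles[message['role']]}: {message['content']}</s>\n"
        return prompt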
    
  • Example Code:

    import konko
    
    konko_capybara = konko.chat.completions.create(
        model="nousresearch/nous-capybara-7b-v1p9",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "Can you explain quantum computing?\n### Input:\nIncluding its implications for cryptography."}
        ],
        temperature=0.1,
        max_tokens=300,
        n=2
    )
    

Prompting Llama 2 and CodeLlama


  • Konko Model IDs:
    codellama/codellama-34b
    codellama/codellama-34b-python
    meta-llama/llama-2-70b
    meta-llama/llama-2-13b

  • Endpoint: Completion

  • Usage: These models are versatile and do not require any specific formatting, making them user-friendly for a variety of tasks. Whether it's generating code, engaging in conversation, or providing information, these models can handle text inputs directly.

    Under the Hood: Prompt Formatting (FYI)

    These models operate effectively with straightforward text prompts. They interpret and respond to the content directly without the need for specialized formatting or structured instructions. Here's a basic idea of how to prompt them:

    "Your unstructured text prompt here..."
    

    This approach allows the models to process and respond to a wide range of queries, coding tasks, or conversational topics.

  • Example Code:

    import konko
    
    # Example for codellama/codellama-34b
    konko_completion_code34b = konko.completions.create(
        model="codellama/codellama-34b",
        prompt="Your code-related prompt here...",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
    
    # Example for meta-llama/llama-2-70b
    konko_completion_llama270b = konko.completions.create(
        model="meta-llama/llama-2-70b",
        prompt="Your general inquiry or discussion topic here...",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
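
    Assuming an OpenAI-style response object (worth verifying against your SDK version), Completion responses expose plain text rather than chat messages:

    # Read the generated text from the first (and here only) choice.
    print(konko_completion_code34b.choices[0].text)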
    

Prompting Phind-CodeLlama-34B-v2


  • Model ID: phind/phind-codellama-34b-v2

  • Endpoint: Completion

  • Usage: Phind/Phind-CodeLlama-34B-v2 is specifically tailored for code generation and assistance. It utilizes the Alpaca/Vicuna instruction format, which is effective in guiding the model to understand and execute programming-related tasks. The Completion endpoint in Konko allows users to input these structured prompts directly.

    How to Prompt the Model

    The model responds best to prompts that include a clear system instruction followed by a user message. This format aids in contextualizing the task for the model. Here’s an example of how to format your prompt:

    ### System Prompt
    You are an intelligent programming assistant.
    
    ### User Message
    Implement a linked list in C++
    
    ### Assistant
    ...
    

    This structured approach helps the model to understand the nature of the coding task and respond with appropriate code or guidance.

  • Example Code:

    import konko
    
    konko_completion_phind = konko.completions.create(
        model="phind/phind-codellama-34b-v2",
        prompt="### System Prompt\nYou are an intelligent programming assistant.\n\n### User Message\nImplement a linked list in C++\n\n### Assistant\n",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
    

Prompting Phind-CodeLlama-34B-Python-v1


  • Model ID: phind/phind-codellama-34b-python-v1

  • Endpoint: Completion

  • Usage: This model is specialized in Python coding tasks and is instruction-tuned but not chat-tuned. It's designed to understand and execute Python programming instructions effectively. The Completion endpoint in Konko supports this model by processing direct instruction-based prompts.

    How to Prompt the Model

    Unlike models that require chat markup or complex formats, Phind/Phind-CodeLlama-34B-Python-v1 works best with straightforward instructions. Simply state what you want the model to do and end your task description with a colon followed by a newline, as in the example below. This helps the model clearly identify and focus on the task at hand:

    ### Instruction:
    Write me a linked list implementation in Python:
    ### Response:
    

    This format is concise and directly communicates the programming task to the model, making it ideal for generating Python code solutions.

  • Example Code:

    import konko
    
    konko_completion_phind_python = konko.completions.create(
        model="phind/phind-codellama-34b-python-v1",
        prompt="### Instruction:\n{Write me a linked list implementation in Python:}\n### Response:\n",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
    
    

Prompting nsql-llama-2-7B


  • Model ID: numbersstation/nsql-llama-2-7b

  • Endpoint: Completion

  • Usage: NumbersStation/nsql-llama-2-7B is expertly tailored for text-to-SQL generation tasks. It translates natural language prompts into SQL queries, especially focusing on SELECT queries based on provided table schemas. This model is particularly efficient for generating SQL queries from structured natural language questions.

    How to Prompt the Model

    To use this model effectively, provide a detailed table schema followed by a natural language question that calls for a SQL SELECT query. The model performs best with prompts that clearly outline the database structure and the query requirement. Here are some examples of how to structure your prompts:

    Example 1:

    "CREATE TABLE stadium (...)
    
    -- Using valid SQLite, answer the following questions for the tables provided above.
    -- What is the maximum, the average, and the minimum capacity of stadiums ?
    SELECT"
    

    Example 2:

    "CREATE TABLE stadium (...)
    
    -- Using valid SQLite, answer the following questions for the tables provided above.
    -- How many stadiums in total?
    SELECT"
    

    Example 3:

    "CREATE TABLE work_orders (...)
    
    -- Using valid SQLite, answer the following questions for the tables provided above.
    -- How many work orders are open?
    SELECT"
    

    This format, which includes both the table schema and a specific SQL query question, guides the model to generate accurate and relevant SQL queries.
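
    For convenience, a hypothetical helper (names are illustrative) that assembles a schema and question into this shape:

    def build_nsql_prompt(schema, question):
        # Combine a schema and question into the text-to-SQL prompt shown above.
        return (
            f"{schema}\n\n"
            "-- Using valid SQLite, answer the following questions "
            "for the tables provided above.\n"
            f"-- {question}\n"
            "SELECT"
        )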

  • Example Code:

    import konko
    
    konko_completion_nsql = konko.completions.create(
        model="numbersstation/nsql-llama-2-7b",
        prompt="CREATE TABLE stadium (...) -- Your SQL question here... SELECT",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
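
    Because the prompt ends with "SELECT", prepend it to the completion text to recover the full query (assuming an OpenAI-style response object):

    # The model continues from "SELECT", so stitch the query back together.
    sql_query = "SELECT" + konko_completion_nsql.choices[0].text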
    

Prompting Mistral-7B-v0.1


  • Model ID: mistralai/mistral-7b-v0.1

  • Endpoint: Completion

  • Usage: The Mistral-7B-v0.1 model from mistralai is a versatile tool designed for a broad spectrum of text-based tasks. It doesn't require any specific formatting for the prompts, making it highly accessible for various applications, including information retrieval, creative writing, and question-answering.

    Prompting the Model

    This model operates efficiently with straightforward text prompts. You can simply input your query, instruction, or topic without needing to follow any special structure or formatting. The model's design allows it to interpret and generate responses based on the content of these unstructured prompts. Here’s an example of how to use it:

    "Your unstructured text prompt here..."
    

    This approach lets the model process and respond to a wide array of inquiries, allowing for flexibility in its applications.

  • Example Code:

    import konko
    
    konko_completion_mistral = konko.completions.create(
        model="mistralai/mistral-7b-v0.1",
        prompt="Your unstructured text prompt here...",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
    

Prompting Mixtral-8x7b-v0.1


  • Model ID: mistralai/mixtral-8x7b-v0.1

  • Endpoint: Completion

  • Usage: Mistral AI's Mixtral 8x7B, a 46.7B-parameter Sparse Mixture of Experts (SMoE) LLM, delivers the agility of smaller models with top-tier performance, outshining major counterparts like Llama 2 70B and GPT-3.5 in key benchmarks.

    Prompting the Model

    This model operates efficiently with straightforward text prompts. You can simply input your query, instruction, or topic without needing to follow any special structure or formatting. The model's design allows it to interpret and generate responses based on the content of these unstructured prompts. Here’s an example of how to use it:

    "Your unstructured text prompt here..."
    

    This approach lets the model process and respond to a wide array of inquiries, allowing for flexibility in its applications.

  • Example Code:

    import konko
    
    konko_completion_mistral = konko.completions.create(
        model="mistralai/mixtral-8x7b-v0.1",
        prompt="Your unstructured text prompt here...",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
    

Prompting SQLCoder


  • Model ID:
    defog/sqlcoder2

  • Endpoint: Completion

  • Usage: This model is designed for converting questions into SQL queries, typically requiring a specific format for the prompt. Although Konko's Completion endpoint allows users to send any format, models generally perform better when following a structured prompt template. The typical format for sqlcoder2 includes instructions, input, and a request for a SQL query response.

    Under the Hood: Prompt Formatting (FYI)

    The standard prompt format for defog/sqlcoder2 is:

    ### Instructions:
    Your task is to convert a question into a SQL query, given a database schema...
    
    ### Input:
    Generate a SQL query that answers the question "{question}"...
    
    ### Response:
    Here is the SQL query I have generated to answer the question "{question}":
    ```sql
    

    This format helps guide the model to understand the task and respond appropriately.
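
    As a hypothetical illustration, the template could be assembled like this (the "..." elisions stand for your full instructions and schema, which you should fill in):

    question = "How many stadiums are there in total?"  # illustrative question
    sqlcoder_prompt = (
        "### Instructions:\n"
        "Your task is to convert a question into a SQL query, given a database schema...\n\n"
        "### Input:\n"
        f'Generate a SQL query that answers the question "{question}"...\n\n'
        "### Response:\n"
        f'Here is the SQL query I have generated to answer the question "{question}":\n'
        "```sql\n"
    )
    # The completion then begins inside the ```sql fence; cut the returned
    # text at the closing fence to extract just the query.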

  • Example Code:

    import konko
    
    konko_completion = konko.completions.create(
        model="defog/sqlcoder2",
        prompt="Your structured prompt here...",
        temperature=0.1,
        max_tokens=300,
        n=1
    )
    

Prompting Yarn-Mistral


  • Model ID:
    NousResearch/Yarn-Mistral-7b-128k

  • Endpoint: Completion

  • Usage: This model is designed for general-purpose language tasks and doesn't require a specific prompt template. Users can input their prompts directly.

    Under the Hood: Prompt Formatting (FYI)

    • General Prompt Format:
      Simply input the desired prompt. For example:
      {prompt}
      This format allows for a wide range of queries and instructions without the need for specialized tagging or structure.
  • Example Code:

    import konko
    
    konko_completion = konko.completions.create(
        model="NousResearch/Yarn-Mistral-7b-128k",
        prompt="Your prompt here...",
        temperature=0.1,
        max_tokens=300,
        n=1
    )