> ## Documentation Index
> Fetch the complete documentation index at: https://docs.siliconstorm.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Usage Examples

## Single-turn conversation：

#### Single-modal model：

```json theme={null}
{
    "model": "DeepSeek-R1",
    "messages": [{
        "role": "user",
        "content": "You are a helpful assistant."
    }],
    "stream": false,
    "presence_penalty": 1.03,
    "frequency_penalty": 1.0,
    "repetition_penalty": 1.0,
    "temperature": 0.5,
    "top_p": 0.95,
    "top_k": 0,
    "seed": null,
    "stop": ["stop1", "stop2"],
    "stop_token_ids": [2, 13],
    "include_stop_str_in_output": false,
    "skip_special_tokens": true,
    "ignore_eos": false,
    "max_tokens": 20
}
```

#### Multimodal model：

Note: The value of the “image\_url” parameter should be modified according to the actual situation.

```json theme={null}
{
    "model": "DeepSeek-R1",
    "messages": [{
        "role": "user",
        "content": [
           {"type": "text", "text": "My name is Olivier and I"},
           {"type": "image_url", "image_url": "/xxxx/test.png"}
        ]
    }],
    "stream": false,
    "presence_penalty": 1.03,
    "frequency_penalty": 1.0,
    "repetition_penalty": 1.0,
    "temperature": 0.5,
    "top_p": 0.95,
    "top_k": 0,
    "seed": null,
    "stop": ["stop1", "stop2"],
    "stop_token_ids": [2, 13],
    "include_stop_str_in_output": false,
    "skip_special_tokens": true,
    "ignore_eos": false,
    "max_tokens": 20
}
```

## Multi-turn conversation

#### Request Example 1：

```json theme={null}
{
    "model": "DeepSeek-R1",
    "messages": [{
        "role": "system",
        "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."
        },
        {
        "role": "user",
        "content": "Hi, can you tell me the delivery date for my order? my order id is 12345."
        }
    ],
    "stream": false,
    "presence_penalty": 1.03,
    "frequency_penalty": 1.0,
    "repetition_penalty": 1.0,
    "temperature": 0.5,
    "top_p": 0.95,
    "top_k": 0,
    "seed": null,
    "stop": ["stop1", "stop2"],
    "stop_token_ids": [2],
    "ignore_eos": false,
    "max_tokens": 1024,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_delivery_date",
                "strict": true,
                "description": "Get the delivery date for a customer's order. Call this whenever you need to know the delivery date, for example when a customer asks 'Where is my package'",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "order_id": {
                            "type": "string",
                            "description": "The customer's order ID."
                        }
                    },
                    "required": [
                        "order_id"
                    ],
                    "additionalProperties": false
                }
            }
        }
    ],
   "tool_choice": "auto"
}
```

#### Request Example 2：

```json theme={null}
{
    "model": "DeepSeek-R1",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful customer support assistant. Use the supplied tools to assist the user."
        },
        {
            "role": "user",
            "content": "Hi, can you tell me the delivery date for my order? my order id is 12345."
        },
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "function": {
                        "arguments": "{\"order_id\": \"12345\"}",
                        "name": "get_delivery_date"
                    },
                    "id": "tool_call_8p2Nk",
                    "type": "function"
                }
            ]
        },
        {
            "role": "tool",
            "content": "the delivery date is 2024.09.10.",
            "tool_call_id": "tool_call_8p2Nk"
        }
    ],
    "stream": false,
    "repetition_penalty": 1.1,
    "temperature": 0.9,
    "top_p": 1,
    "max_tokens": 1024,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_delivery_date",
                "strict": true,
                "description": "Get the delivery date for a customer's order. Call this whenever you need to know the delivery date, for example when a customer asks 'Where is my package'",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "order_id": {
                            "type": "string",
                            "description": "The customer's order ID."
                        }
                    },
                    "required": [
                        "order_id"
                    ],
                    "additionalProperties": false
                }
            }
        }
    ],
    "tool_choice": "auto"
}
```

## Response Example：

Text Inference (“stream”=false)：

### Single-turn conversation：

```json theme={null}
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "DeepSeek-R1",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n\nHello there, how may I assist you today?"
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 9,
        "completion_tokens": 12,
        "total_tokens": 21
    },
    "prefill_time": 200,
    "decode_time_arr": [56, 28, 28]
}
```

### Multi-turn conversation：

#### Response Example 1：

```json theme={null}
{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1677652288,
    "model": "DeepSeek-R1",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {
                        "function": {
                            "arguments": "{\"order_id\": \"12345\"}",
                            "name": "get_delivery_date"
                        },
                        "id": "call_JwmTNF3O",
                        "type": "function"
                    }
                ]
            },
            "finish_reason": "tool_calls"
        }
    ],
    "usage": {
        "prompt_tokens": 226,
        "completion_tokens": 122,
        "total_tokens": 348
    },
    "prefill_time": 200,
    "decode_time_arr": [56, 28, 28]
}

```

#### Response Example 2：

```json theme={null}
{
    "id": "endpoint_common_25",
    "object": "chat.completion",
    "created": 1728959154,
    "model": "DeepSeek-R1",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "\n Your order with ID 12345 is scheduled for delivery on September 10th, 2024.",
                "tool_calls": null
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 265,
        "completion_tokens": 30,
        "total_tokens": 295
    },
    "prefill_time": 200,
    "decode_time_arr": [56, 28, 28]
}

```

### Streaming Inference：

#### Streaming Inference 1

(“stream”=true, using sse format return)：

```json theme={null}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"\t"},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"\t"},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"endpoint_common_8","object":"chat.completion.chunk","created":1729614610,"model":"DeepSeek-R1","usage":{"prompt_tokens":54,"completion_tokens":17,"total_tokens":71},"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":"stop"}]}

data: [DONE]
```

#### Streaming Inference 2

(“stream”=true, with configuration “fullTextEnabled”=true, using sse format return)：

```json theme={null}
data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello!"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can I"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can I assist"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can I assist you"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can I assist you today"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can I assist you today?"},"finish_reason":null}]}

data: {"id":"endpoint_common_11","object":"chat.completion.chunk","created":1730184192,"model":"DeepSeek-R1","full_text":"Hello! How can I assist you today?","usage":{"prompt_tokens":31,"completion_tokens":10,"total_tokens":41},"choices":[{"index":0,"delta":{"role":"assistant","content":"Hello! How can I assist you today?"},"finish_reason":"length"}]}

data: [DONE]
```

## Output Explanation

### Table 1

Explanation of text inference results

<table cellpadding="4" cellspacing="0" frame="border" border="1" rules="all" data-header="7">
  <thead align="left">
    <tr>
      <th align="left" colspan="5"><p>Parameter Name</p></th>
      <th align="left"><p>Type</p></th>
      <th align="left"><p>Description</p></th>
    </tr>
  </thead>

  <tbody>
    <tr>
      <td colspan="5"><p>id</p></td>
      <td><p>string</p></td>
      <td><p>Request ID.</p></td>
    </tr>

    <tr>
      <td colspan="5"><p>object</p></td>
      <td><p>string</p></td>
      <td><p>The type of the returned result, currently always "chat.completion".</p></td>
    </tr>

    <tr>
      <td colspan="5"><p>created</p></td>
      <td><p>integer</p></td>
      <td><p>Inference request timestamp, accurate to the second.</p></td>
    </tr>

    <tr>
      <td colspan="5"><p>model</p></td>
      <td><p>string</p></td>
      <td><p>The inference model used.</p></td>
    </tr>

    <tr>
      <td colspan="5"><p>choices</p></td>
      <td><p>list</p></td>
      <td><p>List of inference results.</p></td>
    </tr>

    <tr>
      <td rowspan="11"><p>-</p></td>
      <td colspan="4"><p>index</p></td>
      <td><p>integer</p></td>
      <td><p>Index of the choices message, currently only 0.</p></td>
    </tr>

    <tr>
      <td colspan="4"><p>message</p></td>
      <td><p>object</p></td>
      <td><p>Inference message.</p></td>
    </tr>

    <tr>
      <td rowspan="8"><p>-</p></td>
      <td colspan="3"><p>role</p></td>
      <td><p>string</p></td>
      <td><p>Role, currently always returns "assistant".</p></td>
    </tr>

    <tr>
      <td colspan="3"><p>content</p></td>
      <td><p>string</p></td>
      <td><p>Inference text result.</p></td>
    </tr>

    <tr>
      <td colspan="3"><p>tool\_calls</p></td>
      <td><p>list</p></td>
      <td><p>Model tool invocation output.</p></td>
    </tr>

    <tr>
      <td rowspan="5"><p>-</p></td>
      <td colspan="2"><p>function</p></td>
      <td><p>dict</p></td>
      <td><p>Function call description.</p></td>
    </tr>

    <tr>
      <td rowspan="2"><p>-</p></td>
      <td><p>arguments</p></td>
      <td><p>string</p></td>
      <td><p>Arguments for the function call, a JSON-formatted string.</p></td>
    </tr>

    <tr>
      <td><p>name</p></td>
      <td><p>string</p></td>
      <td><p>Name of the called function.</p></td>
    </tr>

    <tr>
      <td colspan="2"><p>id</p></td>
      <td><p>string</p></td>
      <td><p>ID of the model's tool invocation.</p></td>
    </tr>

    <tr>
      <td colspan="2"><p>type</p></td>
      <td><p>string</p></td>
      <td><p>Type of the tool, currently only supports "function".</p></td>
    </tr>

    <tr>
      <td colspan="4"><p>finish\_reason</p></td>
      <td><p>string</p></td>

      <td>
        <p>Reason for completion.</p>

        <ul>
          <li>
            stop:

            <ul>
              <li>Request was CANCELLED or STOPPED, user is unaware, response is discarded.</li>
              <li>Error occurred during execution, response output is empty, err\_msg is non-empty.</li>
              <li>Input validation exception, response output is empty, err\_msg is non-empty.</li>
              <li>Request ended normally upon encountering the eos (end of stream) symbol.</li>
            </ul>
          </li>

          <li>
            length:

            <ul>
              <li>Request ended due to reaching the maximum sequence length, response is the output of the last iteration.</li>
              <li>Request ended due to reaching the maximum output length (including request and model granularity), response is the output of the last iteration.</li>
            </ul>
          </li>

          <li>tool\_calls: Indicates that the model called a tool.</li>
        </ul>
      </td>
    </tr>

    <tr>
      <td colspan="5"><p>usage</p></td>
      <td><p>object</p></td>
      <td><p>Inference result statistics data.</p></td>
    </tr>

    <tr>
      <td rowspan="3"><p>-</p></td>
      <td colspan="4"><p>prompt\_tokens</p></td>
      <td><p>int</p></td>
      <td><p>Token length corresponding to the user input prompt text.</p></td>
    </tr>

    <tr>
      <td colspan="4"><p>completion\_tokens</p></td>
      <td><p>int</p></td>

      <td>
        <p>Number of tokens in the inference result. In the PD scenario, it counts the total token number of both P and D inference results. When the inference length limit for a request is set to maxIterTimes, the D node response's completion\_tokens count is maxIterTimes+1, i.e., it includes the first token from the P inference result.</p>
      </td>
    </tr>

    <tr>
      <td colspan="4"><p>total\_tokens</p></td>
      <td><p>int</p></td>
      <td><p>Total token count for the request and inference.</p></td>
    </tr>

    <tr>
      <td colspan="5"><p>prefill\_time</p></td>
      <td><p>float</p></td>
      <td><p>Time delay for the first token of the inference.</p></td>
    </tr>

    <tr>
      <td colspan="5"><p>decode\_time\_arr</p></td>
      <td><p>list</p></td>
      <td><p>Array of time delays during the inference decoding process.</p></td>
    </tr>
  </tbody>
</table>

### Table 2

Explanation of Streaming Inference Results

<table cellpadding="4" cellspacing="0" frame="border" border="1" rules="all" data-header="6">
  <thead align="left">
    <tr>
      <th align="left" colspan="4" valign="top"><p>Parameter Name</p></th>
      <th align="left" valign="top"><p>Type</p></th>
      <th align="left" valign="top"><p>Description</p></th>
    </tr>
  </thead>

  <tbody>
    <tr>
      <td colspan="4" valign="top"><p>data</p></td>
      <td valign="top"><p>object</p></td>
      <td valign="top"><p>The result of a single inference.</p></td>
    </tr>

    <tr>
      <td rowspan="15" valign="top"><p>-</p></td>
      <td colspan="3" valign="top"><p>id</p></td>
      <td valign="top"><p>string</p></td>
      <td valign="top"><p>Request ID.</p></td>
    </tr>

    <tr>
      <td colspan="3" valign="top"><p>object</p></td>
      <td valign="top"><p>string</p></td>
      <td valign="top"><p>Currently returns "chat.completion.chunk".</p></td>
    </tr>

    <tr>
      <td colspan="3" valign="top"><p>created</p></td>
      <td valign="top"><p>integer</p></td>
      <td valign="top"><p>Inference request timestamp, accurate to the second.</p></td>
    </tr>

    <tr>
      <td colspan="3" valign="top"><p>model</p></td>
      <td valign="top"><p>string</p></td>
      <td valign="top"><p>The inference model used.</p></td>
    </tr>

    <tr>
      <td colspan="3" valign="top"><p>full\_text</p></td>
      <td valign="top"><p>string</p></td>

      <td valign="top">
        <p>Full text result, only returned when the configuration item <span>“fullTextEnabled”</span> is set to true.</p>
      </td>
    </tr>

    <tr>
      <td colspan="3" valign="top"><p>usage</p></td>
      <td valign="top"><p>object</p></td>
      <td valign="top"><p>Inference result statistics.</p></td>
    </tr>

    <tr>
      <td rowspan="3" valign="top"><p>-</p></td>
      <td colspan="2" valign="top"><p>prompt\_tokens</p></td>
      <td valign="top"><p>int</p></td>
      <td valign="top"><p>The token length corresponding to the user-input prompt text.</p></td>
    </tr>

    <tr>
      <td colspan="2" valign="top"><p>completion\_tokens</p></td>
      <td valign="top"><p>int</p></td>

      <td valign="top">
        <p>The number of tokens in the inference result. In PD scenarios, it counts the total tokens of both P and D inference results. When the inference length limit for a request is set to the value of maxIterTimes, the D node response will have a completion\_tokens count of maxIterTimes + 1, which adds the first token of the P inference result.</p>
      </td>
    </tr>

    <tr>
      <td colspan="2" valign="top"><p>total\_tokens</p></td>
      <td valign="top"><p>int</p></td>
      <td valign="top"><p>The total number of tokens for the request and inference.</p></td>
    </tr>

    <tr>
      <td colspan="3" valign="top"><p>choices</p></td>
      <td valign="top"><p>list</p></td>
      <td valign="top"><p>Streaming inference results.</p></td>
    </tr>

    <tr>
      <td rowspan="5" valign="top"><p>-</p></td>
      <td colspan="2" valign="top"><p>index</p></td>
      <td valign="top"><p>integer</p></td>
      <td valign="top"><p>The choices message index, which can only be 0 currently.</p></td>
    </tr>

    <tr>
      <td colspan="2" valign="top"><p>delta</p></td>
      <td valign="top"><p>object</p></td>
      <td valign="top"><p>The inference return result, the last response is empty.</p></td>
    </tr>

    <tr>
      <td rowspan="2" valign="top"><p>-</p></td>
      <td valign="top"><p>role</p></td>
      <td valign="top"><p>string</p></td>
      <td valign="top"><p>The role, currently always returns "assistant".</p></td>
    </tr>

    <tr>
      <td valign="top"><p>content</p></td>
      <td valign="top"><p>string</p></td>
      <td valign="top"><p>The inference text result.</p></td>
    </tr>

    <tr>
      <td colspan="2" valign="top"><p>finish\_reason</p></td>
      <td valign="top"><p>string</p></td>

      <td valign="top">
        <p>Reason for completion, returned only in the final inference result.</p>

        <ul>
          <li>
            stop:

            <ul>
              <li>The request was CANCELLED or STOPPED, without user awareness, and the response is discarded.</li>
              <li>An error occurred during request execution, and the response is empty with a non-empty err\_msg.</li>
              <li>Input validation for the request failed, with the response being empty and err\_msg non-empty.</li>
              <li>The request ended normally due to encountering an EOS (End of Stream) symbol.</li>
            </ul>
          </li>

          <li>
            length:

            <ul>
              <li>The request ended because it reached the maximum sequence length, and the response is the output from the last iteration.</li>
              <li>The request ended because it reached the maximum output length (including both the request and model-level length), and the response is the output from the last iteration.</li>
            </ul>
          </li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>
