Supported model IDs include `openai/gpt-4`, `openai/gpt-4o`, and even `neuroa/m1-preview`. The endpoint mirrors the behavior of OpenAI's chat completions API.
## Request Body
Send a list of messages and specify the model you'd like to use. A minimal request example follows the table.

| Field | Type | Required | Description |
|---|---|---|---|
| `model` | string | ✅ Yes | Model ID to use (e.g. `openai/o1-pro`, `anthropic/claude-4-opus`) |
| `messages` | array | ✅ Yes | Ordered history of messages in the chat |
| `temperature` | number | ❌ No | Controls randomness: 0 (deterministic) to 2 (more random) |
| `stream` | boolean | ❌ No | Enables streaming if true |
| `max_tokens` | number | ❌ No | Max number of tokens to generate |
| `thinking` | ThinkingObject | ❌ No | Only supported by M1 Series models |
| `tools` | object | ❌ No | Tool-calling payload (supported by OpenAI, Gemini, and Anthropic models) |
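A minimal request sketch in Python using `requests`. The base URL `https://api.example.com/v1/chat/completions` and the `API_KEY` environment variable are assumptions for illustration, not part of this spec:

```python
import os
import requests

# Hypothetical endpoint -- substitute your deployment's base URL.
API_URL = "https://api.example.com/v1/chat/completions"

payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize SSE in one sentence."},
    ],
    "temperature": 0.7,
    "max_tokens": 256,
}

response = requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()
# Print the assistant's reply from the first choice.
print(response.json()["choices"][0]["message"]["content"])
```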
## `thinking` Object

An optional parameter supported only by M1 Series models; see the sketch after the table.
| Field | Type | Required | Description |
|---|---|---|---|
| `thinking` | boolean | ✅ Yes | Enables internal thinking-like behavior |
| `effort` | string | ✅ Yes | One of `"low"`, `"medium"`, or `"high"` |
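A sketch of a request payload that enables thinking on an M1 Series model. The shape is inferred from the table above; send it via the same POST shown earlier:

```python
payload = {
    "model": "neuroa/m1-preview",
    "messages": [{"role": "user", "content": "Prove that 17 is prime."}],
    "thinking": {
        "thinking": True,   # enables internal thinking-like behavior
        "effort": "high",   # one of "low", "medium", "high"
    },
}
```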
## Example Response
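A representative response shape, assuming it mirrors OpenAI's `chat.completion` object; all field values below are illustrative:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
```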
## Streaming Responses
When `stream: true` is set, the API responds with a Server-Sent Events (SSE) stream. Each chunk includes a delta until completion.
Sample stream response:
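The chunk shape below is illustrative and assumed to mirror OpenAI's `chat.completion.chunk` format:

```text
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" there!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```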
Each `data:` message carries a part of the response. The client should collect the `delta.content` pieces to reconstruct the final message, as in the sketch below.
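A minimal client sketch that reassembles the message from the stream. It uses `requests`; the endpoint URL and `API_KEY` environment variable are the same assumptions as in the earlier example:

```python
import json
import os
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # hypothetical endpoint

with requests.post(
    API_URL,
    headers={"Authorization": f"Bearer {os.environ['API_KEY']}"},
    json={
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Say hello."}],
        "stream": True,
    },
    stream=True,
    timeout=60,
) as resp:
    resp.raise_for_status()
    parts = []
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip blank separators between SSE events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content") or "")
    # Join the collected delta.content pieces into the final message.
    print("".join(parts))
```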
## Authentication
Use the Bearer token method in your `Authorization` header, shown here with a placeholder key:
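```text
Authorization: Bearer YOUR_API_KEY
```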
