Introduction
This guide will walk you through performing your first inference. The steps involve creating an account to get an API key, setting up your local environment, and running the code.
First, choose the platform you will be using. The instructions and code examples on this page will adapt to your selection.
- Inceptron
- Hugging Face
1. Create an Inceptron Account and Obtain Your API Key
- Visit the Inceptron website and sign up for a new account.
- After signing up, navigate to the account section of the dashboard to create an API key.
- Copy the API key and store it securely in an environment variable named INCEPTRON_API_KEY.
To set the environment variable in your terminal:
export INCEPTRON_API_KEY="your_api_key_here"
You can also use a .env file to store your API key securely in your project directory. Create a file named .env in your project's root directory (and make sure to add .env to your .gitignore file to prevent it from being committed). To load the .env file in your project, you can use libraries like python-dotenv for Python projects or dotenv for Node.js projects. Read more at Securing Your API Key.
INCEPTRON_API_KEY="your_api_key_here"
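If you are curious what such a library does under the hood, here is a minimal stdlib sketch of a .env loader. This is a simplified illustration only; the real python-dotenv `load_dotenv()` handles quoting, comments, and variable interpolation far more robustly and is what you should use in practice.

```python
import os

def load_env_file(path=".env"):
    """Minimal .env loader: reads KEY=value lines into os.environ.

    In real projects, prefer python-dotenv's load_dotenv() instead.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines, comments, and lines without an assignment
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # setdefault: an already-exported variable wins over the file
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```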
2. Set Up Your Development Environment
- Ensure you have Python or Node.js installed on your machine.
- Install the OpenAI SDK using pip for Python or npm for Node.js, or use cURL or another HTTP client.
1. Create a Hugging Face Account and Obtain Your API Key
- Visit the Hugging Face website and sign up for a new account.
- After signing up, navigate to the Access Tokens page in your settings.
- Create a new token (a "read" role is sufficient) and store it securely in an environment variable named HF_TOKEN.
Set the environment variable in your terminal:
export HF_TOKEN="your_hf_token_here"
You can also use a .env file to store your API key securely in your project directory. Create a file named .env in your project's root directory (and make sure to add .env to your .gitignore file to prevent it from being committed). To load the .env file in your project, you can use libraries like python-dotenv for Python projects or dotenv for Node.js projects. Read more at Securing Your API Key.
HF_TOKEN="your_hf_token_here"
2. Set Up Your Development Environment
- Ensure you have Python or Node.js installed on your machine.
- Install the OpenAI SDK using pip for Python or npm for Node.js, or use cURL or another HTTP client.
- Python
- JavaScript
Python Environment Setup
We recommend that you create a virtual environment to manage your project dependencies. We have found the best way to manage virtual environments is by using uv, which can be installed via pip:
pip install uv
or via the standalone installer one-liner found in the uv documentation.
Once uv is installed, create and activate a new virtual environment for your project:
uv venv
source .venv/bin/activate
Then, install the OpenAI SDK:
uv pip install openai
JavaScript Environment Setup
You can use nvm (Node Version Manager) to manage your Node.js versions. First, install nvm by following the instructions on the nvm GitHub repository.
After installing nvm, use it to install the latest LTS version of Node.js and set it as the default:
nvm install --lts
nvm use --lts
nvm alias default lts/*
Next, create a new directory for your project and navigate into it:
mkdir my-inceptron-project
cd my-inceptron-project
Initialize a new Node.js project:
npm init -y
Finally, install the OpenAI SDK:
npm install openai
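Note that the JavaScript examples in the next step use top-level await, which requires your project to be an ES module; the package.json generated by npm init -y defaults to CommonJS. One way to enable ES modules is to add this field to your package.json (a fragment, not a complete file):

```json
{
  "type": "module"
}
```

Alternatively, give your script an .mjs extension, which Node.js always treats as an ES module.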
3. Run Your First Inference
The following examples show how to call the Chat Completions API.
- Inceptron
- Hugging Face
Using OpenAI SDK
- Python
- JavaScript
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptron.io/v1",
    api_key=os.environ["INCEPTRON_API_KEY"],
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct",
    messages=[{"role": "user", "content": "How many moons are there in the Solar System?"}],
)

print(completion.choices[0].message.content)
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://api.inceptron.io/v1",
  apiKey: process.env.INCEPTRON_API_KEY,
});

const chatCompletion = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct",
  messages: [{ role: "user", content: "How many moons are there in the Solar System?" }],
});

console.log(chatCompletion.choices[0].message.content);
Using cURL
- cURL
curl https://api.inceptron.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $INCEPTRON_API_KEY" \
  -d '{
    "model": "meta-llama/Llama-3.3-70B-Instruct",
    "messages": [{"role": "user", "content": "How many moons are there in the Solar System?"}]
  }'
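The API responds with a JSON body in the standard Chat Completions format. As a quick sketch of pulling the assistant's reply out of that body with only the standard library (the sample response below is trimmed, and its content is illustrative, not a real API reply):

```python
import json

# Trimmed, illustrative example of a /v1/chat/completions response body
raw = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hundreds of moons are known."},
      "finish_reason": "stop"
    }
  ]
}
"""

body = json.loads(raw)
# The assistant's reply lives at choices[0].message.content
reply = body["choices"][0]["message"]["content"]
print(reply)
```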
Using huggingface_hub InferenceClient
This is the recommended client for interacting with the Hugging Face Router.
- Python
- JavaScript
import os
from huggingface_hub import InferenceClient

client = InferenceClient(token=os.environ["HF_TOKEN"])

completion = client.chat_completion(
    model="meta-llama/Llama-3.3-70B-Instruct:inceptron",
    messages=[{"role": "user", "content": "How many moons are there in the Solar System?"}],
)

print(completion.choices[0].message.content)
import { HfInference } from "@huggingface/inference";

const hf = new HfInference(process.env.HF_TOKEN);

const response = await hf.chatCompletion({
  model: "meta-llama/Llama-3.3-70B-Instruct:inceptron",
  messages: [{ role: "user", content: "How many moons are there in the Solar System?" }],
});

console.log(response.choices[0].message.content);
Using OpenAI SDK
You can also use the OpenAI SDK by pointing it to the Hugging Face endpoint.
- Python
- JavaScript
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ["HF_TOKEN"],
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-3.3-70B-Instruct:inceptron",
    messages=[{"role": "user", "content": "How many moons are there in the Solar System?"}],
)

print(completion.choices[0].message.content)
import { OpenAI } from "openai";

const client = new OpenAI({
  baseURL: "https://router.huggingface.co/v1",
  apiKey: process.env.HF_TOKEN,
});

const chatCompletion = await client.chat.completions.create({
  model: "meta-llama/Llama-3.3-70B-Instruct:inceptron",
  messages: [{ role: "user", content: "How many moons are there in the Solar System?" }],
});

console.log(chatCompletion.choices[0].message.content);
Using cURL
- cURL
curl https://router.huggingface.co/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HF_TOKEN" \
  -d '{
    "model": "meta-llama/Llama-3.3-70B-Instruct:inceptron",
    "messages": [{"role": "user", "content": "How many moons are there in the Solar System?"}]
  }'