GPT4All Node.js API

Native Node.js LLM bindings for all.

yarn add gpt4all@latest

npm install gpt4all@latest

pnpm install gpt4all@latest

Breaking changes in version 4!!

See Transition

API Examples

Chat Completion

Use a chat session to keep context between completions. This is useful for efficient back-and-forth conversations.

import { createCompletion, loadModel } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf", {
    verbose: true, // logs loaded model configuration
    device: "gpu", // defaults to 'cpu'
    nCtx: 2048, // the maximum context window size of the session
});

// initialize a chat session on the model. a model instance can have only one chat session at a time.
const chat = await model.createChatSession({
    // any completion options set here will be used as default for all completions in this chat session
    temperature: 0.8,
    // a custom systemPrompt can be set here. note that the template depends on the model.
    // if unset, the systemPrompt that comes with the model will be used.
    systemPrompt: "### System:\nYou are an advanced mathematician.\n\n",
});

// create a completion using a string as input
const res1 = await createCompletion(chat, "What is 1 + 1?");
console.debug(res1.choices[0].message);

// multiple messages can be input to the conversation at once.
// note that if the last message is not of role 'user', an empty message will be returned.
await createCompletion(chat, [
    {
        role: "user",
        content: "What is 2 + 2?",
    },
    {
        role: "assistant",
        content: "It's 5.",
    },
]);

const res3 = await createCompletion(chat, "Could you recalculate that?");
console.debug(res3.choices[0].message);

model.dispose();

Stateless usage

You can use the model without a chat session. This is useful for one-off completions.

import { createCompletion, loadModel } from "../src/gpt4all.js";

const model = await loadModel("orca-mini-3b-gguf2-q4_0.gguf");

// createCompletion methods can also be used on the model directly.
// context is not maintained between completions.
const res1 = await createCompletion(model, "What is 1 + 1?");
console.debug(res1.choices[0].message);

// a whole conversation can be input as well.
// note that if the last message is not of role 'user', an error will be thrown.
const res2 = await createCompletion(model, [
    {
        role: "user",
        content: "What is 2 + 2?",
    },
    {
        role: "assistant",
        content: "It's 5.",
    },
    {
        role: "user",
        content: "Could you recalculate that?",
    },
]);
console.debug(res2.choices[0].message);
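The comments above note that the last message must have the role 'user'. A minimal sketch of a guard that checks this before the list is passed to createCompletion; `endsWithUserMessage` is a hypothetical helper written for illustration and is not part of gpt4all.

```javascript
// Hypothetical helper (not part of gpt4all): returns true when the list
// is a non-empty array whose last message has the role 'user'.
function endsWithUserMessage(messages) {
    if (!Array.isArray(messages) || messages.length === 0) {
        return false;
    }
    return messages[messages.length - 1].role === "user";
}

const conversation = [
    { role: "user", content: "What is 2 + 2?" },
    { role: "assistant", content: "It's 5." },
];

if (!endsWithUserMessage(conversation)) {
    // Append a user turn so the list is safe to hand to createCompletion.
    conversation.push({ role: "user", content: "Could you recalculate that?" });
}
```

Checking up front keeps the failure close to where the message list is built, rather than surfacing later as an error or an empty reply.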

Embedding

import { loadModel, createEmbedding } from '../src/gpt4all.js'

const embedder = await loadModel("nomic-embed-text-v1.5.f16.gguf", { verbose: true, type: 'embedding'})

console.log(createEmbedding(embedder, "Maybe Minecraft was the friends we made along the way"));
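Embedding vectors are commonly compared with cosine similarity. The sketch below is a generic implementation over plain numeric vectors; it uses toy vectors instead of real model output and makes no assumption about the shape of the value returned by createEmbedding. `cosineSimilarity` is a hypothetical helper, not a gpt4all export.

```javascript
// Cosine similarity between two equal-length numeric vectors
// (plain arrays or Float32Array).
function cosineSimilarity(a, b) {
    if (a.length !== b.length) {
        throw new Error("Vectors must have the same length");
    }
    let dot = 0;
    let normA = 0;
    let normB = 0;
    for (let i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    if (normA === 0 || normB === 0) {
        return 0; // convention chosen in this sketch for zero vectors
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for real embeddings.
const v1 = new Float32Array([1, 0, 1]);
const v2 = new Float32Array([1, 0, 1]);
const v3 = new Float32Array([0, 1, 0]);
console.log(cosineSimilarity(v1, v2)); // identical direction, close to 1
console.log(cosineSimilarity(v1, v3)); // orthogonal vectors
```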

Streaming responses

import { loadModel, createCompletionStream } from "../src/gpt4all.js";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf", {
    device: "gpu",
});

process.stdout.write("Output: ");
const stream = createCompletionStream(model, "How are you?");
stream.tokens.on("data", (data) => {
    process.stdout.write(data);
});
// wait until the stream finishes; we cannot continue until it is done.
await stream.result;
process.stdout.write("\n");
model.dispose();

Async Generators

import { loadModel, createCompletionGenerator } from "../src/gpt4all.js";

const model = await loadModel("mistral-7b-openorca.gguf2.Q4_0.gguf");

process.stdout.write("Output: ");
const gen = createCompletionGenerator(
    model,
    "Redstone in Minecraft is Turing Complete. Let that sink in. (let it in!)"
);
for await (const chunk of gen) {
    process.stdout.write(chunk);
}

process.stdout.write("\n");
model.dispose();
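When the full text is needed rather than incremental output, the chunks from an async generator can be accumulated into one string. A minimal sketch, assuming each chunk is a string as in the loop above; `collectChunks` is a hypothetical helper and `fakeTokens` is a stand-in generator so the sketch runs without a model.

```javascript
// Hypothetical helper: concatenates every chunk yielded by an async iterable.
async function collectChunks(asyncIterable) {
    let text = "";
    for await (const chunk of asyncIterable) {
        text += chunk;
    }
    return text;
}

// Stand-in generator so the sketch runs without loading a model.
async function* fakeTokens() {
    yield "Hello";
    yield ", ";
    yield "world";
}

collectChunks(fakeTokens()).then((text) => console.log(text));
```

With a real model, the generator returned by createCompletionGenerator would take the place of `fakeTokens()`.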

Offline usage

Do this before going offline:

curl -L https://gpt4all.io/models/models3.json -o ./models3.json

import { createCompletion, loadModel } from 'gpt4all'

// make sure you have downloaded the models before going offline!
const model = await loadModel('mistral-7b-openorca.gguf2.Q4_0.gguf', {
    verbose: true,
    device: 'gpu',
    modelConfigFile: "./models3.json"
});

await createCompletion(model, 'What is 1 + 1?', { verbose: true })

model.dispose();
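Before disconnecting, it can help to confirm that the downloaded config file actually lists the model you plan to load. The sketch below works on an already parsed list and assumes each entry is an object with a `filename` property; that property name is an assumption about the file format, and `findModelEntry` is a hypothetical helper.

```javascript
// Hypothetical helper: finds the entry for a model file in a parsed model list.
// Assumes each entry is an object with a `filename` property.
function findModelEntry(modelList, filename) {
    return modelList.find((item) => item.filename === filename) ?? null;
}

// Inline stand-in for the parsed contents of ./models3.json.
const modelList = [
    { filename: "mistral-7b-openorca.gguf2.Q4_0.gguf" },
    { filename: "orca-mini-3b-gguf2-q4_0.gguf" },
];

const entry = findModelEntry(modelList, "mistral-7b-openorca.gguf2.Q4_0.gguf");
if (entry === null) {
    console.warn("Model is not listed in the local config file");
}
```

In practice the list would come from reading and parsing ./models3.json instead of the inline array.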

Develop

Build Instructions

binding.gyp is the compile configuration.
Tested on Ubuntu; everything seems to work fine.
Tested on Windows; everything works fine.
Sparsely tested on macOS.
The MinGW script can build the gpt4all-backend, and we left it there just in case. However, this package only works with DLLs built with MSVC.

Requirements

git
node.js >= 18.0.0
yarn
node-gyp
- all of its requirements.
(unix) gcc version 12
(win) msvc version 143
- Can be obtained with visual studio 2022 build tools
python 3
On Windows and Linux, building GPT4All requires the complete Vulkan SDK. You may download it from here: https://vulkan.lunarg.com/sdk/home
macOS users do not need Vulkan, as GPT4All will use Metal instead.
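The Node.js requirement can be checked from a script before starting a build. A minimal sketch: because the minimum is 18.0.0, comparing the major version is enough. `meetsMinimumMajor` is a hypothetical helper and is not part of the repository's scripts.

```javascript
// Hypothetical helper: true when a "major.minor.patch" version string
// (with or without a leading "v") has at least the given major version.
function meetsMinimumMajor(version, minimumMajor) {
    const major = Number.parseInt(version.replace(/^v/, "").split(".")[0], 10);
    return Number.isInteger(major) && major >= minimumMajor;
}

// process.versions.node holds the running Node.js version without a "v" prefix.
if (!meetsMinimumMajor(process.versions.node, 18)) {
    console.error("Node.js 18.0.0 or newer is required to build the bindings.");
}
```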

Build (from source)

git clone https://github.com/nomic-ai/gpt4all.git
cd gpt4all/gpt4all-bindings/typescript

The below shell commands assume the current working directory is typescript.
To Build and Rebuild:

node scripts/prebuild.js

The llama.cpp git submodule for gpt4all may be absent. If so, run the following in the parent directory of llama.cpp:

git submodule update --init --recursive

yarn build:backend

This builds platform-dependent dynamic libraries, which will be located in runtimes/(platform)/native.

Test

yarn test

Source Overview

src/

Extra functions to aid developer experience
Typings for the native Node.js addon
The JavaScript interface

test/

Simple unit tests for some of the exported functions.
More advanced AI testing is not handled.

spec/

Average look and feel of the api
Should work assuming a model and libraries are installed locally in working directory

index.cc

The bridge between Node.js and C. This is where the bindings live.

prompt.cc

Handling prompting and inference of models in a threadsafe, asynchronous way.

Known Issues

Why your model may be spewing bull 💩
- The downloaded model is broken (just reinstall or download from the official site)
Your model is hanging after a call to generate tokens.
- Is nPast set too high? This may cause your model to hang (observed 03/16/2024 on Linux Mint and Ubuntu 22.04)
Your GPU usage is still high after node.js exits.
- Make sure to call model.dispose()!!!
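One way to make sure dispose() always runs is to wrap the work in try/finally. A minimal sketch: `withModel` is a hypothetical helper, not a gpt4all export, and `fakeModel` is a stand-in object so the sketch runs without loading a real model.

```javascript
// Hypothetical helper: runs `work(model)` and always calls model.dispose(),
// even when `work` throws or rejects.
async function withModel(model, work) {
    try {
        return await work(model);
    } finally {
        model.dispose();
    }
}

// Stand-in object so the sketch runs without loading a real model.
const fakeModel = {
    disposed: false,
    dispose() {
        this.disposed = true;
    },
};

withModel(fakeModel, async () => "done").then((result) => {
    console.log(result, fakeModel.disposed);
});
```

With a real model, the value returned by loadModel would be passed in place of `fakeModel`, and the callback would contain the completion calls.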

Roadmap

This package has been stabilizing over the course of development, and breaking changes may happen until the API stabilizes. Here's the todo list:

Changes

This repository serves as the new bindings for nodejs users.

If you were a user of the previous bindings, they are outdated.
Version 4 includes the following breaking changes:
- createEmbedding & EmbeddingModel.embed() return an object, EmbeddingResult, instead of a Float32Array.
- Removed deprecated types ModelType and ModelFile
- Removed deprecated initiation of model by string path only
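Code written against the old return type can be bridged with a small adapter while migrating. The sketch below accepts either a bare Float32Array or a wrapping object; the `embeddings` property name is an assumption and should be checked against the package typings. `toEmbeddingVector` is a hypothetical helper.

```javascript
// Hypothetical migration helper: accepts either a bare Float32Array (old
// return shape) or an object wrapping the vector (new return shape).
// The `embeddings` property name is an assumption; check the package typings.
function toEmbeddingVector(result) {
    if (result instanceof Float32Array) {
        return result;
    }
    if (result && result.embeddings instanceof Float32Array) {
        return result.embeddings;
    }
    throw new TypeError("Unrecognized embedding result shape");
}

// Toy values standing in for real embedding output.
const oldStyle = new Float32Array([0.1, 0.2]);
const newStyle = { embeddings: new Float32Array([0.1, 0.2]) };
console.log(toEmbeddingVector(oldStyle).length, toEmbeddingVector(newStyle).length);
```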
