matteo-grella/llama2.go

llama2.go

This is a Go port of llama2.c.

Setup

  1. Download a model checkpoint, e.g. stories15M.bin, stories42M.bin, or stories110M.bin
  2. Download tokenizer.bin
  3. go install github.com/saracen/llama2.go/cmd/llama2go@latest
  4. Do things:
    ./llama2go --help
    llama2go: <checkpoint>
      -cpuprofile string
             write cpu profile to file
      -prompt string
             prompt
      -steps int
             max number of steps to run for, 0: use seq_len (default 256)
      -temperature float
             temperature for sampling (default 0.9)
    
    ./llama2go -prompt "Cute llamas are" -steps 38 --temperature 0 stories110M.bin
    <s>
    Cute llamas are two friends who love to play together. They have a special game that they play every day. They pretend to be superheroes and save the world.
    achieved tok/s: 43.268528
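
The run above passes a temperature of 0. In llama2.c, a temperature of zero means greedy decoding: the highest-scoring token is picked at every step, so the output is reproducible. Any other value scales the logits before sampling from the softmax distribution. The following is a minimal sketch of that convention, assuming this port behaves the same way; `sampleToken`, `argmax`, and `softmax` are hypothetical names for illustration, not functions taken from this repository.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// argmax returns the index of the largest logit (logits must be non-empty).
func argmax(logits []float32) int {
	best := 0
	for i, v := range logits {
		if v > logits[best] {
			best = i
		}
	}
	return best
}

// softmax converts x into probabilities in place (x must be non-empty).
// Subtracting the maximum first keeps the exponentials from overflowing.
func softmax(x []float32) {
	maxVal := x[0]
	for _, v := range x[1:] {
		if v > maxVal {
			maxVal = v
		}
	}
	var sum float32
	for i, v := range x {
		x[i] = float32(math.Exp(float64(v - maxVal)))
		sum += x[i]
	}
	for i := range x {
		x[i] /= sum
	}
}

// sampleToken is a hypothetical helper: greedy when temperature is 0,
// otherwise it samples from softmax(logits / temperature).
// rng is only used when temperature is non-zero.
func sampleToken(logits []float32, temperature float32, rng *rand.Rand) int {
	if temperature == 0 {
		return argmax(logits)
	}
	probs := make([]float32, len(logits))
	for i, v := range logits {
		probs[i] = v / temperature
	}
	softmax(probs)

	// Walk the cumulative distribution until it passes a uniform draw.
	r := rng.Float32()
	var cdf float32
	for i, p := range probs {
		cdf += p
		if r < cdf {
			return i
		}
	}
	return len(probs) - 1 // guard against rounding error
}

func main() {
	logits := []float32{1.0, 3.5, 0.2}
	rng := rand.New(rand.NewSource(1))

	fmt.Println("greedy:", sampleToken(logits, 0, rng))
	fmt.Println("sampled:", sampleToken(logits, 0.9, rng))
}
```

With temperature 0 the same prompt always yields the same continuation, which makes it the convenient setting for comparing tok/s between runs.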

Performance

| system | model | llama2.c | llama2.go |
|---|---|---|---|
| M1 Max, 10-Core, 32 GB | stories15M.bin | 676.392573 tok/s | 230.144629 tok/s |
| M1 Max, 10-Core, 32 GB | stories42M.bin | 267.295597 tok/s | 94.539509 tok/s |
| M1 Max, 10-Core, 32 GB | stories110M.bin | 100.671141 tok/s | 42.359789 tok/s |
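
To make the gap easier to read, the figures above can be turned into a relative throughput (llama2.go divided by llama2.c). The sketch below only does that arithmetic on the numbers from the table; the `result` type and `relativeSpeed` function are names made up for this example.

```go
package main

import "fmt"

// result holds one row of the table above; both speeds are in tokens per second.
type result struct {
	model  string
	cSpeed float64 // llama2.c
	goPort float64 // llama2.go
}

// relativeSpeed returns llama2.go throughput as a fraction of llama2.c throughput.
func relativeSpeed(r result) float64 {
	return r.goPort / r.cSpeed
}

// results copies the measurements from the table (M1 Max, 10-Core, 32 GB).
var results = []result{
	{"stories15M.bin", 676.392573, 230.144629},
	{"stories42M.bin", 267.295597, 94.539509},
	{"stories110M.bin", 100.671141, 42.359789},
}

func main() {
	for _, r := range results {
		fmt.Printf("%-16s %.0f%% of llama2.c throughput\n", r.model, 100*relativeSpeed(r))
	}
}
```

On these measurements the Go port reaches roughly 34%, 35%, and 42% of llama2.c's throughput for the 15M, 42M, and 110M models respectively, so the relative gap narrows as the model grows.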
