How to Run Llama 3 On Your PC or Mac

April 19, 2024 • 2 min read

Meta has unveiled Llama 3, the latest generation of its family of open-weights large language models. It is currently available in 8B and 70B parameter sizes, each in pre-trained and instruction-tuned variants.

The instruction-tuned variant is fine-tuned for chat/dialogue use cases.
The pre-trained variant is the base model.

N.B. The 8B-parameter model is about a 4GB download, while the 70B is about 40GB.
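
As a rough sanity check on those sizes, the sketch below estimates weight storage from parameter count and bits per weight. It assumes the quoted figures correspond to roughly 4-bit quantized weights, which is my assumption rather than something stated above. The function name is made up for illustration, and real files run somewhat larger because of metadata and layers kept at higher precision.

```python
def estimate_weights_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


# Compare an assumed 4-bit quantization with 16-bit weights for both sizes.
for name, params in [("8B", 8e9), ("70B", 70e9)]:
    q4 = estimate_weights_gb(params, 4)
    fp16 = estimate_weights_gb(params, 16)
    print(f"Llama 3 {name}: ~{q4:.0f} GB at 4 bits, ~{fp16:.0f} GB at 16 bits")
```

The 4-bit estimates land close to the sizes quoted above, which is why a quantized 8B model fits comfortably on most recent laptops while the 70B model needs far more memory.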

In this article, we'll walk through how you can run the models locally.

Option 1: Use Ollama

Platforms Supported: macOS, Ubuntu, Windows (preview)

Ollama is one of the easiest ways to run Llama 3 locally. Download the application from the Ollama website, then run the following command in your terminal.

ollama run llama3

This will download the Llama 3 8B instruct model and start an interactive session with it.

You can optionally provide a tag; if you don't, it defaults to latest. The tag identifies a specific version of the model. For example, you can run:

ollama run llama3:70b-text

ollama run llama3:70b-instruct
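
To make the tag behavior concrete, here is a small sketch of the defaulting rule described above. It is not Ollama's actual parsing code; `resolve_model_ref` is a hypothetical helper, and it only handles simple `name:tag` references.

```python
def resolve_model_ref(ref: str) -> tuple:
    """Split 'name[:tag]' into (name, tag), defaulting the tag to 'latest'.

    Illustrative only: assumes a simple reference with at most one colon.
    """
    name, sep, tag = ref.partition(":")
    return name, (tag if sep and tag else "latest")


print(resolve_model_ref("llama3"))
print(resolve_model_ref("llama3:70b-instruct"))
```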

🎉 Congrats, you can now access the model via your CLI. 👍🏾
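
Beyond the CLI, Ollama also serves a local HTTP API, which is handy if you want to call the model from a script. The sketch below uses only the Python standard library and assumes Ollama is running on its default local port (11434) with the llama3 model already pulled. The helper names are my own, not part of any Ollama client library.

```python
import json
import urllib.request

# Assumed default endpoint for a locally running Ollama instance.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "llama3") -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt and return the generated text (requires Ollama to be running)."""
    with urllib.request.urlopen(build_generate_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, calling generate("Why is the sky blue?") should return the model's reply as a single string. Setting "stream" to False trades incremental output for a simpler one-shot response.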

If you want a chatbot UI (like ChatGPT), you'll need to do a bit more work. One option is the Open WebUI project:

Download Open WebUI (formerly Ollama WebUI) from its GitHub repository. It's a feature-rich, friendly, self-hosted web UI. I recommend using Docker and setting up a container with the following command:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

After installation, you can access Open WebUI at http://localhost:3000. For other installation methods, see the README.
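
If the Docker command above looks opaque, the sketch below rebuilds it one argument at a time, with a comment on what each flag does. It only prints the command; the variable name is arbitrary, and the comments reflect standard Docker flag behavior rather than anything specific to Open WebUI's documentation.

```python
import shlex

docker_args = [
    "docker", "run",
    "-d",                                   # run detached, in the background
    "-p", "3000:8080",                      # map host port 3000 to container port 8080
    "--add-host=host.docker.internal:host-gateway",  # let the container reach the host, where Ollama runs
    "-v", "open-webui:/app/backend/data",   # named volume so app data outlives the container
    "--name", "open-webui",                 # container name
    "--restart", "always",                  # restart automatically, e.g. after a reboot
    "ghcr.io/open-webui/open-webui:main",   # image to run
]

# Join the arguments back into a single shell command line.
print(shlex.join(docker_args))
```

The port mapping is why the UI shows up at http://localhost:3000: change the left-hand number if port 3000 is already in use on your machine.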

N.B. You can also look into Chatbot Ollama or Chatbot UI.

Option 2: Use LM Studio

Platforms Supported: macOS, Ubuntu, Windows

LM Studio is made possible thanks to the llama.cpp project and supports Llama, MPT, and StarCoder models from Hugging Face in the GGUF format (the successor to ggml). Download the application from the LM Studio website and note the system requirements.

N.B. LM Studio has a built-in chat interface and other features.

Option 3: Use GPT4All

Platforms Supported: macOS, Ubuntu, Windows

Download the application from the GPT4All website and install it as you would any other application.

Update: Meta has published a series of YouTube tutorials on how to run Llama 3 on Mac, Linux and Windows.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
