Open Source Models

Open-weight models you can run, fine-tune and build on — often rivalling closed frontier models.

Open Source vs Open Weights: True open-source includes training code and data. “Open weights” (Llama, Mistral) releases model weights but not always training data. Both are valuable — check licenses before commercial use.

Models

Llama 3.3 (Meta)

Meta’s flagship open-weight family — 8B, 70B, 405B. The most widely deployed open model. Commercial use allowed.

Recommended

DeepSeek R1

Chinese open-weight reasoning model matching GPT-o1 quality at a fraction of the cost. MIT license.

Free Open Source

Mistral 7B / Large

French AI lab producing excellent open-weight models. Mistral 7B punches far above its size.

Recommended

Gemma 2 / 3 (Google)

Google’s lightweight open models — 2B, 7B, 27B. Designed for on-device and edge inference.

Phi-4 (Microsoft)

14B parameter model outperforming many larger ones. Excellent for reasoning and math. MIT license.

Falcon (TII)

UAE’s Technology Innovation Institute open-weight models — strong multilingual capabilities.

Qwen 2.5 (Alibaba)

Alibaba’s open-weight family with strong code and multilingual performance.

Command R (Cohere)

Optimised for enterprise RAG applications — excellent at retrieval and reasoning over documents.

Tools & Platforms

Ollama

The easiest way to run open models locally — one command to download and chat with Llama, Mistral, Gemma.

Recommended

Hugging Face

The central hub for open models — 500K+ models, transformers library, Inference API.

Recommended

vLLM

High-throughput inference server for open models — the standard for production open-model deployment.

llama.cpp

C++ inference engine for running quantised open models on CPU and consumer hardware.

LM Studio

Desktop GUI for downloading and running open models locally with a ChatGPT-like interface.

Together AI

Cloud inference API for open models — pay-per-token, fast, wide model selection.

Replicate

Run any open model via a simple REST API — no GPU provisioning required.

Text Generation WebUI

Oobabooga’s popular web interface for local LLM inference — extensive customisation.

Example Architecture: AI Agent with WhatsApp Control

A practical example of how to connect messaging, AI models, and automation into a working system. This pattern is commonly used for personal AI assistants, automated customer support, and agentic workflows.

Messaging Interface

WhatsApp / Discord / Telegram — User-facing chat interface. Users send messages, AI responds — familiar, accessible, no app install needed.

Alternatives:

WhatsApp — Most accessible globally
Discord — Best for communities, bots, voice
Telegram — Better bot API, privacy

Automation Layer

n8n / Zapier — Connects the messaging platform to your AI. Receives messages, triggers workflows, sends responses.

AI Layer

Ollama (local) or OpenAI / Anthropic API — The brain. Processes user requests, generates responses.

Example: Personal AI WhatsApp Assistant

User sends a message on WhatsApp
n8n receives it via Twilio or WhatsApp Business API
n8n sends the message to Ollama or OpenAI API
Response comes back and n8n sends it to WhatsApp

Getting Started Locally

Visual walkthrough — running Ollama locally

Install Ollama: curl -fsSL https://ollama.ai/install.sh | sh
Pull a model: ollama pull llama3.3 or ollama pull mistral
Chat in terminal: ollama run llama3.3
Use the API: Ollama runs a local REST API at http://localhost:11434 — OpenAI-compatible.
GUI option: Install LM Studio for a desktop ChatGPT-like interface with model management.