Beginner

The Hugging Face Model Hub

The Model Hub is the world's largest repository of pre-trained AI models — 2M+ models across every task, modality, and license. Think of it as npm for AI models.

Browsing Models

The hub at huggingface.co/models lets you filter by:

Task

Text generation, summarization, translation, question answering, text classification, image classification, object detection, image-to-text, text-to-image, audio classification, speech recognition, and dozens more.

Library

Transformers, Diffusers, GGUF, llama.cpp, spaCy, fastai, TensorFlow, JAX, and others. Filter to GGUF to find models that run in llama.cpp-based tools such as Ollama.

License

MIT, Apache 2.0, Llama Community License, Creative Commons, and custom licenses. Filter to 'mit' or 'apache-2.0' for the most permissive options; OpenRAIL licenses are open but add use-based restrictions.
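The same three filters can also be applied from Python. The following is a minimal sketch, assuming the huggingface_hub package is installed and that the Hub accepts tag strings such as text-generation, gguf, and license:apache-2.0 in the filter argument of HfApi.list_models. The helper names build_filters and find_models are hypothetical.

```python
from typing import Iterator, List, Optional


def build_filters(task: Optional[str] = None,
                  library: Optional[str] = None,
                  license_id: Optional[str] = None) -> List[str]:
    """Combine the task, library, and license filters into Hub tag strings."""
    tags: List[str] = []
    if task:
        tags.append(task)
    if library:
        tags.append(library)
    if license_id:
        # License tags are assumed to carry a "license:" prefix on the Hub.
        tags.append(f"license:{license_id}")
    return tags


def find_models(limit: int = 5, **filters: Optional[str]) -> Iterator[str]:
    """Yield IDs of models matching the filters (requires network access)."""
    # Imported lazily so the pure helper above works without the package.
    from huggingface_hub import HfApi

    for model in HfApi().list_models(filter=build_filters(**filters), limit=limit):
        yield model.id
```

Calling list(find_models(task="text-generation", library="gguf", license_id="apache-2.0")) should then return up to five matching model IDs, although the exact results depend on what is on the Hub at the time.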

Model Cards

Every model has a Model Card — a README.md that describes the model's intended use, training data, evaluation results, limitations, and how to use it. Quality varies: major releases (Llama 4, Gemma, Mistral) have detailed cards; smaller community models may have minimal documentation.

A good model card includes:

Intended use cases (and what the model is NOT designed for)
Training dataset provenance
Benchmark results (MMLU, HumanEval, MT-Bench, etc.)
Known limitations and biases
Usage code snippets
License and citation
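The checklist above can be turned into a rough automated check. This sketch scans a card's Markdown headings for each topic. The keyword lists are illustrative guesses at how authors title their sections, not a Hub convention, so a False result only means "look manually".

```python
import re
from typing import Dict

# Hypothetical keyword lists; real cards title their sections in many ways.
SECTION_KEYWORDS = {
    "intended use": ("intended use", "out-of-scope", "use cases"),
    "training data": ("training data", "dataset"),
    "benchmarks": ("evaluation", "benchmark", "results"),
    "limitations": ("limitation", "bias", "risks"),
    "usage": ("usage", "how to use", "quickstart", "getting started"),
    "license/citation": ("license", "citation"),
}


def card_coverage(readme_text: str) -> Dict[str, bool]:
    """Report which recommended topics appear in the card's Markdown headings."""
    headings = [
        heading.lower()
        for heading in re.findall(r"^#{1,6}\s+(.+)$", readme_text, flags=re.MULTILINE)
    ]
    return {
        topic: any(keyword in heading for heading in headings for keyword in keywords)
        for topic, keywords in SECTION_KEYWORDS.items()
    }
```

The README text can come from a downloaded README.md file. With huggingface_hub installed, ModelCard.load(repo_id).text is another way to obtain it.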

Downloading Models

Three ways to get a model onto your machine:

# Method 1: Load directly in Python (auto-downloads to cache)
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")

# Method 2: CLI download (run these two lines in a shell, not in Python)
pip install huggingface_hub
huggingface-cli download meta-llama/Llama-3.2-3B-Instruct

# Method 3: snapshot_download (download entire repo)
from huggingface_hub import snapshot_download
snapshot_download(repo_id="meta-llama/Llama-3.2-3B-Instruct", local_dir="./llama3")

Methods 1 and 2 download to a local cache (~/.cache/huggingface/hub/ by default). Set HF_HOME to redirect the cache to a drive with more space. Method 3 passes local_dir, so its files are written to ./llama3 instead.
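To see where a cached model ends up, the lookup can be sketched in plain Python. This assumes the resolution order HF_HUB_CACHE, then HF_HOME/hub, then XDG_CACHE_HOME/huggingface/hub, then ~/.cache/huggingface/hub, and the models--org--name folder naming used by the cache. The function names are hypothetical, and huggingface_hub remains the authority on the real paths.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_hub_cache(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the directory where Hub downloads are assumed to be cached."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    if env.get("XDG_CACHE_HOME"):
        return Path(env["XDG_CACHE_HOME"]).expanduser() / "huggingface" / "hub"
    return Path.home() / ".cache" / "huggingface" / "hub"


def cached_repo_dir(model_id: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the per-model cache folder, e.g. models--org--name."""
    return resolve_hub_cache(env) / ("models--" + model_id.replace("/", "--"))
```

Passing a dictionary instead of reading os.environ makes it easy to check what a planned HF_HOME change would do before exporting it.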

Gated Models

Some models — particularly from Meta (Llama), Google (Gemma), and Mistral — are gated. They require you to accept a license agreement before downloading. The process:

Go to the model's page on huggingface.co
Click "Access repository" and accept the license
Generate a Hugging Face access token (Settings → Access Tokens)
Set HF_TOKEN=your_token before running your code, or use huggingface-cli login

# Authenticate once
huggingface-cli login   # paste your token when prompted

# Or set as environment variable
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
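In a script, it helps to turn a gating failure into a readable message. The following is a sketch only, assuming a recent huggingface_hub release that exposes GatedRepoError under huggingface_hub.errors. The helpers read_hf_token and fetch_gated_file are hypothetical names.

```python
import os
from typing import Mapping, Optional


def read_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return HF_TOKEN if set, else None so a stored login token can be used."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def fetch_gated_file(repo_id: str, filename: str = "config.json") -> str:
    """Download one file from a possibly gated repo and return its local path."""
    # Imported lazily; these calls need huggingface_hub and network access.
    from huggingface_hub import hf_hub_download
    from huggingface_hub.errors import GatedRepoError

    try:
        # token=None lets the library fall back to the token saved at login.
        return hf_hub_download(repo_id, filename, token=read_hf_token())
    except GatedRepoError as err:
        raise RuntimeError(
            f"{repo_id} is gated: accept the license at "
            f"https://huggingface.co/{repo_id} and make sure a token is available."
        ) from err
```

Fetching a small file such as config.json first is a cheap way to confirm access before starting a multi-gigabyte download.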

Versioning and Revisions

The Hub uses Git under the hood: every model is a Git repository, with Git LFS for large files. Model versions are tracked as commits, branches, and tags. Pass revision (a branch name, tag, or commit hash) to pin a specific version:

# Pin to a specific commit hash for reproducibility
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    revision="e0bc86c23ce5aae1db576c8cca6f06f1f73af2db"
)
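A branch name such as main moves over time, so only a full commit hash pins a model exactly. This sketch shows one way to check that a revision string is a full 40-character hash and to look up the current commit of a repo. It assumes huggingface_hub is installed for the lookup, and both function names are hypothetical.

```python
import re

# A full Git commit hash is 40 lowercase hexadecimal characters.
_COMMIT_HASH = re.compile(r"[0-9a-f]{40}")


def is_pinned(revision: str) -> bool:
    """Return True only for a full commit hash, not a branch, tag, or short hash."""
    return _COMMIT_HASH.fullmatch(revision) is not None


def current_commit(repo_id: str) -> str:
    """Return the latest commit hash of a model repo (requires network access)."""
    # Imported lazily so is_pinned works without the package installed.
    from huggingface_hub import HfApi

    return HfApi().model_info(repo_id).sha
```

A typical workflow would be to call current_commit once, record the hash in a config file, and pass it as revision from then on.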

Understanding Model IDs

Model IDs follow the pattern organization/model-name:

Model ID	Org	Notes
meta-llama/Llama-3.2-3B-Instruct	Meta	Gated — requires license acceptance
google/gemma-2-9b-it	Google	Gated — requires license acceptance
mistralai/Mistral-7B-Instruct-v0.3	Mistral AI	Apache 2.0 license (check the model page for an access gate)
microsoft/phi-4	Microsoft	Open — MIT license
deepseek-ai/DeepSeek-R1-Distill-Qwen-7B	DeepSeek	Open — MIT license
sentence-transformers/all-MiniLM-L6-v2	sentence-transformers	Open — embedding model
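The organization/model-name pattern is simple enough to split in code. The sketch below also accepts an ID with no organization part, on the assumption that some older models are still addressable by a bare name such as gpt2. ModelRef and parse_model_id are hypothetical names.

```python
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    org: Optional[str]  # None for IDs without an organization part
    name: str

    @property
    def url(self) -> str:
        """Return the model's page on the Hub."""
        prefix = f"{self.org}/" if self.org else ""
        return f"https://huggingface.co/{prefix}{self.name}"


def parse_model_id(model_id: str) -> ModelRef:
    """Split 'organization/model-name' into its two parts."""
    parts = model_id.split("/")
    if len(parts) > 2 or not all(parts):
        raise ValueError(f"Not a valid model ID: {model_id!r}")
    if len(parts) == 1:
        return ModelRef(org=None, name=parts[0])
    return ModelRef(org=parts[0], name=parts[1])
```

For example, parse_model_id("microsoft/phi-4") yields org "microsoft" and name "phi-4", and its url property points at the model page.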

Checklist: Do You Understand This?

Can you find and download a model for a specific task on the Hub?
Do you understand what a Model Card is and what good ones contain?
Can you handle gated models — accepting the license and authenticating?
Do you know what a model ID like meta-llama/Llama-3.2-3B-Instruct means?
Can you pin a model to a specific revision for reproducibility?