The Hugging Face Model Hub
The Model Hub is the world's largest repository of pre-trained AI models — 2M+ models across every task, modality, and license. Think of it as npm for AI models.
Browsing Models
The Hub at huggingface.co/models lets you filter by task, library, language, and license, among other criteria.
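Filters such as task and library can also be carried in the page URL, which is handy for bookmarking or sharing a search. The sketch below builds such a URL with the standard library only. The query parameter names (`pipeline_tag`, `library`, `sort`) are assumptions based on what the site shows in the address bar when filters are applied, and `build_hub_search_url` is a hypothetical helper, not part of any Hugging Face library.

```python
from urllib.parse import urlencode

HUB_MODELS_URL = "https://huggingface.co/models"


def build_hub_search_url(task=None, library=None, sort=None):
    """Return a huggingface.co/models URL with the given filters applied.

    The parameter names are assumed from the site's address bar,
    not taken from a documented API.
    """
    params = {}
    if task:
        params["pipeline_tag"] = task
    if library:
        params["library"] = library
    if sort:
        params["sort"] = sort
    query = urlencode(params)
    return f"{HUB_MODELS_URL}?{query}" if query else HUB_MODELS_URL


url = build_hub_search_url(task="text-generation", library="transformers", sort="downloads")
print(url)
```

Opening the printed URL in a browser should show the filtered list, provided the site still accepts these parameter names.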
Model Cards
Every model has a Model Card — a README.md that describes the model's intended use, training data, evaluation results, limitations, and how to use it. Quality varies: major releases (Llama 4, Gemma, Mistral) have detailed cards; smaller community models may have minimal documentation.
A good model card includes:
- Intended use cases (and what the model is NOT designed for)
- Training dataset provenance
- Benchmark results (MMLU, HumanEval, MT-Bench, etc.)
- Known limitations and biases
- Usage code snippets
- License and citation
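One way to apply this checklist is a quick scan of a card's text for the expected topics. This is a rough sketch: `CARD_TOPICS`, its keyword lists, and `missing_topics` are hypothetical names chosen for illustration. Real cards use varied headings, so a keyword match is only a first-pass signal, not a quality score.

```python
# Hypothetical mapping from checklist topic to keywords that suggest it is covered.
CARD_TOPICS = {
    "intended use": ["intended use", "out-of-scope"],
    "training data": ["training data", "dataset"],
    "evaluation": ["evaluation", "benchmark"],
    "limitations": ["limitation", "bias"],
    "usage": ["usage", "how to use"],
    "license": ["license"],
}


def missing_topics(card_text):
    """Return the checklist topics that no keyword in the card text matches."""
    text = card_text.lower()
    return [
        topic
        for topic, keywords in CARD_TOPICS.items()
        if not any(keyword in text for keyword in keywords)
    ]


sample_card = """
# Example model
## Intended use
Chat assistants.
## Training data
A public web corpus.
## License
MIT
"""

print(missing_topics(sample_card))  # ['evaluation', 'limitations', 'usage']
```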
Downloading Models
Three ways to get a model onto your machine:
# Method 1: Load directly in Python (auto-downloads to cache)
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
# Method 2: CLI download (run the next two commands in a shell, not in Python)
pip install huggingface_hub
huggingface-cli download meta-llama/Llama-3.2-3B-Instruct
# Method 3: snapshot_download (download entire repo)
from huggingface_hub import snapshot_download
snapshot_download(repo_id="meta-llama/Llama-3.2-3B-Instruct", local_dir="./llama3")
Methods 1 and 2 download to a local cache (~/.cache/huggingface/hub/ by default). Method 3 as written places the files in ./llama3 because local_dir is set. Set HF_HOME to redirect the cache to a drive with more space.
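To see where a given model would land in the cache, the location can be approximated from the environment. The sketch below is a simplification under stated assumptions: it considers only `HF_HUB_CACHE` and `HF_HOME` (in that order), it assumes the `models--{org}--{name}` folder naming used inside the cache, and `expected_cache_dir` is a hypothetical helper rather than a library function.

```python
import os
from pathlib import Path


def expected_cache_dir(repo_id, env=None):
    """Guess the cache folder for a model repo from environment variables.

    Assumed precedence: HF_HUB_CACHE, then HF_HOME/hub, then the default
    ~/.cache/huggingface/hub. Other settings are ignored in this sketch.
    """
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        hub_cache = Path(env["HF_HUB_CACHE"])
    elif env.get("HF_HOME"):
        hub_cache = Path(env["HF_HOME"]) / "hub"
    else:
        hub_cache = Path.home() / ".cache" / "huggingface" / "hub"
    # Assumed naming: slashes in the repo ID become double dashes.
    return hub_cache / ("models--" + repo_id.replace("/", "--"))


print(expected_cache_dir("meta-llama/Llama-3.2-3B-Instruct", env={"HF_HOME": "/mnt/big/hf"}))
```

Passing `env` explicitly keeps the function easy to try out without touching the real environment; calling it with no `env` argument reads the current process environment instead.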
Gated Models
Some models — particularly from Meta (Llama), Google (Gemma), and Mistral — are gated. They require you to accept a license agreement before downloading. The process:
- Go to the model's page on huggingface.co
- Click "Access repository" and accept the license
- Generate a Hugging Face access token (Settings → Access Tokens)
- Set HF_TOKEN=your_token before running your code, or use huggingface-cli login
# Authenticate once
huggingface-cli login
# paste your token when prompted
# Or set as environment variable
export HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxx
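In a script, it can help to fail early with a clear message when no token is configured, rather than hitting an authorization error partway through a download. A minimal sketch, assuming the token is supplied through `HF_TOKEN` and that access tokens start with `hf_`; `require_hf_token` is a hypothetical helper. It does not detect a token saved by `huggingface-cli login`, so treat it as a check for the environment-variable route only.

```python
import os


def require_hf_token(env=None):
    """Return the token from HF_TOKEN, or raise with a readable message."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set; gated models will fail to download.")
    # Assumption: Hugging Face access tokens begin with "hf_".
    if not token.startswith("hf_"):
        raise RuntimeError("HF_TOKEN does not look like a Hugging Face access token.")
    return token


# Demonstration with a placeholder value instead of the real environment.
token = require_hf_token({"HF_TOKEN": "hf_example_placeholder"})
print(f"token configured ({len(token)} characters)")
```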
Versioning and Revisions
The Hub uses Git under the hood — every model is a Git repository using Git LFS for large files. Model versions are tracked as commits and branches. You can pin to a specific revision:
# Pin to a specific commit hash for reproducibility
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    revision="e0bc86c23ce5aae1db576c8cca6f06f1f73af2db"
)
Understanding Model IDs
Model IDs follow the pattern organization/model-name:
| Model ID | Org | Notes |
|---|---|---|
| meta-llama/Llama-3.2-3B-Instruct | Meta | Gated — requires license acceptance |
| google/gemma-3-9b-it | Google | Gated — gemma.google.com request |
| mistralai/Mistral-7B-Instruct-v0.3 | Mistral AI | Open — no gate |
| microsoft/phi-4 | Microsoft | Open — MIT license |
| deepseek-ai/DeepSeek-R1-Distill-Qwen-7B | DeepSeek | Open — MIT license |
| sentence-transformers/all-MiniLM-L6-v2 | sentence-transformers | Open — embedding model |
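The organization/model-name pattern makes model IDs easy to handle in code. The sketch below splits an ID into its two parts and builds a link to one file at a given revision. `parse_model_id` and `file_url` are hypothetical helpers, and the `/resolve/{revision}/{filename}` URL layout is an assumption based on how file links appear on the Hub. IDs without a slash are rejected for simplicity.

```python
def parse_model_id(model_id):
    """Split 'organization/model-name' into (organization, model_name)."""
    org, sep, name = model_id.partition("/")
    if not sep or not org or not name or "/" in name:
        raise ValueError(f"expected 'organization/model-name', got {model_id!r}")
    return org, name


def file_url(model_id, filename, revision="main"):
    """Build a link to one file in a model repo at a branch or commit hash."""
    org, name = parse_model_id(model_id)
    # Assumed URL layout; not taken from a documented API.
    return f"https://huggingface.co/{org}/{name}/resolve/{revision}/{filename}"


print(parse_model_id("microsoft/phi-4"))
print(file_url("microsoft/phi-4", "config.json"))
```

Passing a commit hash as `revision` instead of the default `main` points the link at a fixed snapshot of the repository, which mirrors the pinning idea from the previous section.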
Checklist: Do You Understand This?
- Can you find and download a model for a specific task on the Hub?
- Do you understand what a Model Card is and what good ones contain?
- Can you handle gated models — accepting the license and authenticating?
- Do you know what a model ID like meta-llama/Llama-3.2-3B-Instruct means?
- Can you pin a model to a specific revision for reproducibility?