# Containers
LocalAI supports Docker, Podman, and other OCI-compatible container engines. This guide covers the common aspects of running LocalAI in containers.
## Prerequisites
Before you begin, ensure you have a container engine installed:
- Install Docker (Mac, Windows, Linux)
- Install Podman (Linux, macOS, Windows WSL2)
## Quick Start
The fastest way to get started is with the CPU image:
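A minimal run command, assuming the standard `localai/localai:latest` CPU image (verify the current tag against the container images reference):

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
```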
This will:
- Start LocalAI (you’ll need to install models separately)
- Make the API available at `http://localhost:8080`
## Image Types
LocalAI provides several image types to suit different needs. These images work with both Docker and Podman.
### Standard Images
Standard images don’t include pre-configured models. Use these if you want to configure models manually.
#### CPU Image
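A typical pull-and-run sketch for the standard CPU image (tag assumed to be `latest`; check the container images reference):

```bash
docker pull localai/localai:latest
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest
```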
#### GPU Images
NVIDIA CUDA 13:
NVIDIA CUDA 12:
AMD GPU (ROCm):
Intel GPU:
Vulkan:
NVIDIA Jetson (L4T ARM64):
- CUDA 12 (for NVIDIA AGX Orin and similar platforms):
- CUDA 13 (for NVIDIA DGX Spark):
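The commands below sketch typical invocations for each accelerator. The image tags follow LocalAI's usual naming scheme but are assumptions here; verify them against the container images reference:

```bash
# NVIDIA CUDA 13
docker run -ti --name local-ai -p 8080:8080 --gpus all \
  localai/localai:latest-gpu-nvidia-cuda-13

# NVIDIA CUDA 12
docker run -ti --name local-ai -p 8080:8080 --gpus all \
  localai/localai:latest-gpu-nvidia-cuda-12

# AMD GPU (ROCm)
docker run -ti --name local-ai -p 8080:8080 \
  --device /dev/kfd --device /dev/dri \
  localai/localai:latest-gpu-hipblas

# Intel GPU
docker run -ti --name local-ai -p 8080:8080 \
  --device /dev/dri \
  localai/localai:latest-gpu-intel

# Vulkan
docker run -ti --name local-ai -p 8080:8080 \
  localai/localai:latest-gpu-vulkan

# NVIDIA Jetson (L4T ARM64), CUDA 12
docker run -ti --name local-ai -p 8080:8080 --runtime nvidia \
  localai/localai:latest-nvidia-l4t-arm64
```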
### All-in-One (AIO) Images
Recommended for beginners - These images come pre-configured with models and backends, ready to use immediately.
#### CPU Image
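A typical run command for the AIO CPU image (tag assumed; check the container images reference):

```bash
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
```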
#### GPU Images
NVIDIA CUDA 13:
NVIDIA CUDA 12:
AMD GPU (ROCm):
Intel GPU:
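Sketches for the AIO GPU variants; the tags follow LocalAI's usual `latest-aio-*` naming scheme but should be verified against the container images reference:

```bash
# NVIDIA CUDA 13
docker run -ti --name local-ai -p 8080:8080 --gpus all \
  localai/localai:latest-aio-gpu-nvidia-cuda-13

# NVIDIA CUDA 12
docker run -ti --name local-ai -p 8080:8080 --gpus all \
  localai/localai:latest-aio-gpu-nvidia-cuda-12

# AMD GPU (ROCm)
docker run -ti --name local-ai -p 8080:8080 \
  --device /dev/kfd --device /dev/dri \
  localai/localai:latest-aio-gpu-hipblas

# Intel GPU
docker run -ti --name local-ai -p 8080:8080 \
  --device /dev/dri \
  localai/localai:latest-aio-gpu-intel
```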
## Using Compose
For a more manageable setup, especially with persistent volumes, use Docker Compose or Podman Compose:
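A minimal `compose.yaml` sketch for the CPU AIO image (image tag and host paths are assumptions; adjust to your setup):

```yaml
services:
  local-ai:
    image: localai/localai:latest-aio-cpu
    ports:
      - "8080:8080"
    volumes:
      - ./models:/models
    restart: always
```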
### Using CDI (Container Device Interface) - Recommended for NVIDIA Container Toolkit 1.14+
The CDI approach is recommended for newer versions of the NVIDIA Container Toolkit (1.14 and later). It provides better compatibility and is the future-proof method:
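A compose sketch using the CDI driver, assuming a CUDA 12 AIO image and the default `nvidia.com/gpu=all` CDI device name (generated by `nvidia-ctk cdi generate`):

```yaml
services:
  local-ai:
    image: localai/localai:latest-aio-gpu-nvidia-cuda-12
    ports:
      - "8080:8080"
    volumes:
      - ./models:/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: cdi
              device_ids:
                - nvidia.com/gpu=all
```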
Save this as compose.yaml and run:
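```bash
docker compose up -d
# or, with Podman:
podman compose up -d
```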
### Using Legacy NVIDIA Driver - For Older NVIDIA Container Toolkit
If you are using an older version of the NVIDIA Container Toolkit (before 1.14), or need backward compatibility, use the legacy approach:
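A compose sketch using the legacy `nvidia` device driver (image tag and volume path are assumptions):

```yaml
services:
  local-ai:
    image: localai/localai:latest-aio-gpu-nvidia-cuda-12
    ports:
      - "8080:8080"
    volumes:
      - ./models:/models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```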
## Persistent Storage
The container exposes the following volumes:
| Volume | Description | CLI Flag | Environment Variable |
|---|---|---|---|
| `/models` | Model files used for inferencing | `--models-path` | `$LOCALAI_MODELS_PATH` |
| `/backends` | Custom backends for inferencing | `--backends-path` | `$LOCALAI_BACKENDS_PATH` |
| `/configuration` | Dynamic config files (`api_keys.json`, `external_backends.json`, `runtime_settings.json`) | `--localai-config-dir` | `$LOCALAI_CONFIG_DIR` |
| `/data` | Persistent data (collections, agent state, tasks, jobs) | `--data-path` | `$LOCALAI_DATA_PATH` |
To persist models and data, mount volumes:
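A bind-mount example matching the volumes in the table above (host paths and the `latest` tag are illustrative):

```bash
docker run -ti --name local-ai -p 8080:8080 \
  -v $PWD/models:/models \
  -v $PWD/configuration:/configuration \
  -v $PWD/data:/data \
  localai/localai:latest
```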
Or use named volumes:
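With named volumes, the container engine manages the storage location itself; a sketch with illustrative volume names:

```bash
docker volume create localai-models
docker volume create localai-data
docker run -ti --name local-ai -p 8080:8080 \
  -v localai-models:/models \
  -v localai-data:/data \
  localai/localai:latest
```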
## What’s Included in AIO Images
All-in-One images come pre-configured with:
- Text Generation: LLM models for chat and completion
- Image Generation: Stable Diffusion models
- Text to Speech: TTS models
- Speech to Text: Whisper models
- Embeddings: Vector embedding models
- Function Calling: Support for OpenAI-compatible function calling
The AIO images use OpenAI-compatible model names (such as `gpt-4` and `gpt-4-vision-preview`) but are backed by open-source models. See the container images documentation for the complete mapping.
## Next Steps
After installation:
- Access the WebUI at `http://localhost:8080`
- Check available models: `curl http://localhost:8080/v1/models`
- Install additional models
- Try out examples
## Troubleshooting
### Container won’t start
- Check that the container engine is running: `docker ps` or `podman ps`
- Check that port 8080 is available: `netstat -an | grep 8080` (Linux/Mac)
- View logs: `docker logs local-ai` or `podman logs local-ai`
### GPU not detected
- Ensure Docker has GPU access: `docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi`
- For Podman, see the Podman installation guide
- For NVIDIA: install the NVIDIA Container Toolkit
- For AMD: ensure the devices are accessible: `ls -la /dev/kfd /dev/dri`
### NVIDIA container fails to start with “Auto-detected mode as ‘legacy’” error
This error indicates a Docker/NVIDIA Container Toolkit configuration issue: the container runtime’s prestart hook fails before LocalAI starts. It is not a LocalAI code bug.
Solutions:
**Use CDI mode (recommended):** Update your `docker-compose.yaml` to use the CDI driver configuration:
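A sketch of the relevant compose fragment, assuming the default `nvidia.com/gpu=all` CDI device name; it replaces any legacy `driver: nvidia` reservation on the service:

```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: cdi
          device_ids:
            - nvidia.com/gpu=all
```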
**Upgrade the NVIDIA Container Toolkit:** Ensure you have version 1.14 or later, which has better CDI support.
**Check the NVIDIA Container Toolkit configuration:** Run `nvidia-container-cli --query-gpu` to verify that your installation works correctly outside of containers.

**Verify Docker GPU access:** Test with `docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi`
### Models not downloading
- Check internet connection
- Verify disk space: `df -h`
- Check container logs for errors: `docker logs local-ai` or `podman logs local-ai`
## See Also
- Container Images Reference - Complete image reference
- Install Models - Install and configure models
- GPU Acceleration - GPU setup and optimization
- Kubernetes Installation - Deploy on Kubernetes