Running Ollama with AMD ROCm on the MS-S1-MAX

How to set up Ollama with AMD ROCm acceleration on the MS-S1-MAX for local LLM inference.

The MS-S1-MAX is a compact workstation built around the AMD Ryzen AI Max+ 395 with integrated Radeon graphics. The steps below run Ollama in Docker with GPU acceleration on that hardware.

Prerequisites

  • MS-S1-MAX or similar AMD GPU system
  • Ubuntu 24.04 or later
  • Docker installed

Installation

1. Install ROCm

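# Install the ROCm kernel driver and runtime
# (assumes AMD's ROCm apt repository is already configured; package names vary by ROCm release)
sudo apt update
sudo apt install -y rocm-dkms

# Add user to render and video groups (takes effect at the next login)
sudo usermod -aG render,video $USER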
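
A group change only shows up in sessions started after it was made, so it is worth confirming before moving on. The sketch below is one way to do that; missing_gpu_groups is a hypothetical helper name, not part of ROCm or Ollama.

```shell
#!/usr/bin/env bash
# missing_gpu_groups is a hypothetical helper, not part of ROCm or Ollama.
# It takes a space-separated group list and prints each required group
# (render, video) that is absent from it.
missing_gpu_groups() {
  local have=" $1 "
  local g
  for g in render video; do
    case "$have" in
      *" $g "*) ;;          # group present, nothing to report
      *) echo "$g" ;;       # group missing
    esac
  done
}

# Groups of the current login session; no output means both are present.
missing_gpu_groups "$(id -nG)"
```

If this prints render or video right after running usermod, log out and back in and try again.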

2. Install Ollama

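# Pull the ROCm build of the Ollama image (AMD GPUs need the rocm tag)
docker pull ollama/ollama:rocm

# Run with GPU support: /dev/kfd is the ROCm compute interface, /dev/dri holds the render nodes
# The named volume keeps downloaded models when the container is re-created
docker run -d \
  --name ollama \
  --device /dev/kfd \
  --device /dev/dri \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  ollama/ollama:rocm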
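
The container can take a moment before it accepts requests. A minimal sketch for waiting on it, assuming curl is installed on the host: wait_for_ollama is a hypothetical helper that polls the /api/version endpoint on the port published above.

```shell
#!/usr/bin/env bash
# wait_for_ollama is a hypothetical helper. It polls the server until it
# answers or the attempt budget runs out.
#   $1: base URL (default http://localhost:11434)
#   $2: maximum number of attempts, one second apart (default 30)
wait_for_ollama() {
  local url="${1:-http://localhost:11434}"
  local tries="${2:-30}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    # -f turns HTTP errors into a non-zero exit; -s silences progress output.
    if curl -fs --max-time 2 "$url/api/version" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    if [ "$i" -le "$tries" ]; then
      sleep 1
    fi
  done
  return 1
}

# Example: wait_for_ollama && echo "Ollama is up"
```

If it times out, docker logs ollama is the first place to look.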

3. Pull a Model

docker exec -it ollama ollama pull llama3.2
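
Once the model is pulled, a quick request against Ollama's /api/generate endpoint confirms it loads and answers. This is a sketch only: generate_payload is a hypothetical helper, and it does no JSON escaping.

```shell
#!/usr/bin/env bash
# generate_payload is a hypothetical helper that prints the JSON body for
# Ollama's /api/generate endpoint. It does no escaping, so the model name
# and prompt must not contain double quotes or backslashes.
#   $1: model name
#   $2: prompt text
generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Example request against the container started earlier:
#   curl -s http://localhost:11434/api/generate \
#     -d "$(generate_payload llama3.2 'Why is the sky blue?')"
```

Setting stream to false returns one JSON object instead of a stream of chunks, which is easier to read in a terminal.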

Performance

Model          TPS (tokens/s)   Context
gpt-oss:20b    42               128k
gpt-oss:120b   25               128k
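
This post does not say how the numbers above were measured. One way to get a comparable figure on your own machine is from the eval_count and eval_duration fields (the latter in nanoseconds) that Ollama returns with a non-streaming /api/generate response. tokens_per_second below is a hypothetical helper, and the sample values are made up for illustration.

```shell
#!/usr/bin/env bash
# tokens_per_second is a hypothetical helper.
#   $1: eval_count (number of generated tokens)
#   $2: eval_duration in nanoseconds
tokens_per_second() {
  awk -v n="$1" -v ns="$2" 'BEGIN {
    if (ns <= 0) exit 1              # avoid dividing by zero
    printf "%.2f\n", n / (ns / 1e9)  # tokens divided by seconds
  }'
}

# Illustrative values: 420 tokens generated in 10 seconds.
tokens_per_second 420 10000000000
```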

Troubleshooting

GPU Not Detected

# Check ROCm is working
rocminfo

# Check render permissions
ls -la /dev/kfd /dev/dri
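
Reading the ls output by eye works, but a direct access test is quicker. In this sketch, check_device is a hypothetical helper, and /dev/dri/renderD128 is an assumption (the usual name of the first render node); use whichever node the listing above shows.

```shell
#!/usr/bin/env bash
# check_device is a hypothetical helper: it reports whether the current
# user can open a device node for reading and writing.
check_device() {
  if [ ! -e "$1" ]; then
    echo "$1: missing"
    return 1
  elif [ -r "$1" ] && [ -w "$1" ]; then
    echo "$1: ok"
  else
    echo "$1: no read/write access"
    return 1
  fi
}

# /dev/dri/renderD128 is an assumed node name; adjust it to your system.
for dev in /dev/kfd /dev/dri/renderD128; do
  check_device "$dev" || true
done
```

A "missing" result points at the driver install, while "no read/write access" points at group membership.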

Container Permission Issues

Make sure your user is in the render and video groups, then log out and back in (or reboot) so the new membership takes effect, and restart the container.


For more on AMD ROCm, see AMD’s ROCm documentation.