ollama-python-streaming

Community

Stream local Ollama LLM responses from Python.

Author: marcus
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Enables real-time streaming from a local Ollama LLM within Python applications using LiteLLM, facilitating interactive and responsive AI experiences.
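A minimal sketch of that pattern, assuming Ollama is serving on its default port (11434) and that a model tagged llama3 has been pulled; both the model name and host are placeholders:

    import litellm

    # Ask LiteLLM to stream a completion from a local Ollama server.
    # "ollama/llama3" and api_base are assumptions; substitute your own.
    response = litellm.completion(
        model="ollama/llama3",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
        api_base="http://localhost:11434",
        stream=True,
    )

    # Each chunk carries an incremental delta; print tokens as they arrive.
    for chunk in response:
        delta = chunk.choices[0].delta.content
        if delta:
            print(delta, end="", flush=True)
    print()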

Core Features & Use Cases

  • Async streaming with a robust client implementation
  • Retry logic with exponential backoff and error handling (both features are sketched after this list)
  • Thinking model configuration for low/medium/high levels
  • Production-ready patterns for local Ollama integration
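
A hedged sketch of how the async client and retry features might fit together; the model name, backoff schedule, and broad exception handling here are illustrative assumptions, not the Skill's exact implementation:

    import asyncio
    import litellm

    async def stream_with_retry(prompt: str, max_attempts: int = 3) -> str:
        """Stream a completion, retrying with exponential backoff on failure."""
        for attempt in range(max_attempts):
            try:
                response = await litellm.acompletion(
                    model="ollama/llama3",  # placeholder model name
                    messages=[{"role": "user", "content": prompt}],
                    api_base="http://localhost:11434",
                    stream=True,
                )
                parts = []
                async for chunk in response:
                    delta = chunk.choices[0].delta.content
                    if delta:
                        print(delta, end="", flush=True)
                        parts.append(delta)
                print()
                return "".join(parts)
            except Exception:  # narrow to litellm's exception classes in real code
                if attempt == max_attempts - 1:
                    raise
                await asyncio.sleep(2 ** attempt)  # back off 1s, 2s, 4s, ...
        return ""

    asyncio.run(stream_with_retry("Explain streaming in one sentence."))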

Quick Start

Install Ollama, start the server, install litellm, and run the minimal streaming example to watch tokens arrive live. A typical command sequence follows.
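
A plausible sequence, assuming Linux or macOS and the llama3 model tag (adjust both to taste):

  • Install Ollama: curl -fsSL https://ollama.com/install.sh | sh
  • Start the server: ollama serve
  • Pull a model: ollama pull llama3
  • Install the client library: pip install litellm

Then run the minimal example shown above and watch tokens print as they arrive.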

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: let Claude install it automatically. Copy and paste the text below into Claude Code.

Please help me install this Skill:
Name: ollama-python-streaming
Download link: https://github.com/marcus/marcus-skills/archive/main.zip#ollama-python-streaming

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.