
LlamaLink

Local LLM chat interface

Python
Delivery
Source-first
Browse code, README, and release notes on GitHub.
Primary lane
Python lane
The clearest adjacent context for this project inside the portfolio.
Freshness
Updated Apr 26, 2026
Latest release
No tag yet
README is the clearest project overview right now.

Preview

Using the generated project card as a clean fallback until a live capture is available.

LlamaLink card

Source at github.com/SysAdminDoc/LlamaLink.

README

Cached at build time, cleaned up for in-site reading, and linked back to the canonical GitHub source.

LlamaLink

A sleek GUI frontend for llama.cpp
Search, download, and chat with local LLMs in one app.



Features

Model Management

  • Browse and download GGUF models directly from HuggingFace
  • Sort by downloads, likes, date, or trending
  • View available quantizations (Q4_K_M, Q8_0, IQ3_S, etc.) with file sizes
  • Download with progress bar, speed display, ETA, and resume support
  • Recursive model folder scanning with automatic detection
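The quantization tags and file sizes shown in the model browser can be recovered from GGUF filenames and byte counts. A minimal sketch of that parsing, assuming typical HuggingFace naming conventions (the regex and helper names are illustrative, not LlamaLink's actual code):

```python
import re
from typing import Optional

# Common GGUF quantization tags, e.g. "llama-2-7b.Q4_K_M.gguf" -> "Q4_K_M".
# The pattern is a guess at typical naming; adjust for exotic quant names.
QUANT_RE = re.compile(r"(?i)\b(IQ\d+_\w+|Q\d+_\w+|Q\d+|F16|F32|BF16)\b")

def quant_from_filename(name: str) -> Optional[str]:
    """Return the quantization tag embedded in a GGUF filename, if any."""
    match = QUANT_RE.search(name)
    return match.group(1).upper() if match else None

def human_size(num_bytes: float) -> str:
    """Format a byte count the way a download list might display it."""
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} PB"
```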

Server Control

  • Launch and manage llama-server with full parameter control
  • Or connect to an already-running server (any OpenAI-compatible endpoint)
  • Auto-detect llama-server from PATH and common install locations
  • Context size, GPU layers, threads, flash attention, mlock toggles
  • Embedded server log viewer
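The launch parameters above map onto llama.cpp's command-line flags. A hedged sketch of how a frontend might assemble the argv list (flag names follow llama-server's documented CLI: `-m`, `-c`, `-ngl`, `-t`, `--flash-attn`, `--mlock`; the helper itself and its defaults are illustrative):

```python
from typing import List

def build_server_command(
    server_path: str,
    model_path: str,
    ctx_size: int = 4096,
    gpu_layers: int = 0,
    threads: int = 4,
    flash_attn: bool = False,
    mlock: bool = False,
    port: int = 8080,
) -> List[str]:
    """Assemble a llama-server argv list from GUI settings."""
    cmd = [
        server_path,
        "-m", model_path,
        "-c", str(ctx_size),      # context size
        "-ngl", str(gpu_layers),  # layers offloaded to GPU
        "-t", str(threads),
        "--port", str(port),
    ]
    if flash_attn:
        cmd.append("--flash-attn")
    if mlock:
        cmd.append("--mlock")
    return cmd
```

Passing this list to `subprocess.Popen` with stdout piped is also how an embedded log viewer would get its text.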

Chat Interface

  • Streaming responses with live token-by-token display
  • Markdown rendering: code blocks, inline code, bold, italic
  • Tokens/sec speed display during and after generation
  • System prompt support
  • Parameter presets: Default, Creative, Precise, Code, Roleplay
  • Adjustable temperature, top_p, top_k, repeat penalty, max tokens
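Token-by-token streaming works by reading the server's server-sent-events stream. A sketch of the parsing step only (no network), assuming the standard OpenAI chat-completions chunk format that llama-server emits, with deltas at `choices[0].delta.content`:

```python
import json
from typing import Iterable, Iterator

def iter_stream_tokens(lines: Iterable[str]) -> Iterator[str]:
    """Yield content deltas from OpenAI-style SSE lines.

    Each streamed line looks like 'data: {...}', and 'data: [DONE]'
    marks the end of generation.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip keep-alives and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]
```

In the GUI, each yielded fragment would be appended to the chat view, which is also where a tokens/sec counter can be updated.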

Chat History

  • Auto-saves conversations locally
  • Load, export (Markdown / JSON / Text), and delete past chats
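The three export formats amount to serializing the same message list three ways. A minimal sketch, assuming a simple role/content message schema (the schema and function are illustrative):

```python
import json
from typing import Dict, List

Message = Dict[str, str]  # e.g. {"role": "user", "content": "..."}

def export_chat(messages: List[Message], fmt: str) -> str:
    """Render a conversation as Markdown, JSON, or plain text."""
    if fmt == "json":
        return json.dumps(messages, indent=2)
    if fmt == "markdown":
        return "\n\n".join(
            f"**{m['role'].title()}:** {m['content']}" for m in messages
        )
    if fmt == "text":
        return "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    raise ValueError(f"unknown format: {fmt}")
```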

Design

  • Catppuccin Mocha dark theme throughout
  • Responsive split-panel layout
  • Window position and all settings persist between sessions

Installation

Download LlamaLink.exe from Releases and run it. No installation required.

From Source

git clone https://github.com/SysAdminDoc/LlamaLink.git
cd LlamaLink
python llamalink.py

Dependencies (PyQt6, requests) are auto-installed on first run.
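Auto-installing on first run is typically a try-import with a pip fallback. A hedged sketch of that pattern (not necessarily LlamaLink's exact code; it works here because PyQt6 and requests import under their pip names):

```python
import importlib
import subprocess
import sys

def ensure_packages(packages) -> None:
    """Import each package, installing it via pip on ImportError."""
    for name in packages:
        try:
            importlib.import_module(name)
        except ImportError:
            # Use the running interpreter's pip so the install lands
            # in the same environment the app is running from.
            subprocess.check_call([sys.executable, "-m", "pip", "install", name])
```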

Quick Start

  1. Download a model - Go to the "Download Models" tab, search for a model (e.g. llama, qwen, mistral), pick a quant, and download it
  2. Set server path - Browse to your llama-server.exe (auto-detected if on PATH)
  3. Select model - Your downloaded model appears automatically in the dropdown
  4. Start server - Click "Start Server" and wait for the "Running" indicator
  5. Chat - Switch to the Chat tab and start talking

Connecting to an Existing Server

Uncheck "Launch server", enter the URL (e.g. http://127.0.0.1:8080), and click Connect. Works with any OpenAI-compatible API endpoint.
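A reasonable liveness check for "any OpenAI-compatible endpoint" is a GET against `/v1/models`, the standard model-listing route. A sketch using only the standard library (the helper is illustrative):

```python
from urllib.error import URLError
from urllib.request import urlopen

def check_endpoint(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if base_url answers /v1/models with HTTP 200."""
    try:
        with urlopen(f"{base_url.rstrip('/')}/v1/models", timeout=timeout) as resp:
            return resp.getcode() == 200
    except (URLError, OSError, ValueError):
        return False
```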

Requirements

  • llama.cpp - Download from llama.cpp releases
  • Python 3.8+ (if running from source)
  • NVIDIA GPU recommended (auto-detected, CPU-only works too)
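NVIDIA auto-detection is commonly just a check for `nvidia-smi`. A hedged sketch of that approach, using its documented `-L` flag to list GPUs (not necessarily how LlamaLink detects hardware):

```python
import shutil
import subprocess

def detect_nvidia_gpu() -> bool:
    """Return True if nvidia-smi is present and reports at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        out = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=10
        )
        # `nvidia-smi -L` prints one "GPU N: ..." line per device.
        return out.returncode == 0 and "GPU" in out.stdout
    except (OSError, subprocess.TimeoutExpired):
        return False
```

If this returns False, the app would fall back to CPU-only defaults (zero GPU layers).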

HuggingFace Token

Public models work without authentication. For gated/private models, set the HF_TOKEN environment variable:

set HF_TOKEN=hf_your_token_here
python llamalink.py
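Inside the app, that environment variable simply becomes a Bearer header on HuggingFace requests. A minimal sketch (the helper and the User-Agent string are illustrative):

```python
import os
from typing import Dict

def hf_headers() -> Dict[str, str]:
    """Build request headers, attaching HF_TOKEN as a Bearer token if set.

    Public models need no auth; gated/private downloads get the header.
    """
    headers = {"User-Agent": "LlamaLink"}  # illustrative UA string
    token = os.environ.get("HF_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```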

Building from Source

pip install pyinstaller
pyinstaller llamalink.spec

The executable will be in dist/LlamaLink.exe.

License

MIT

Read on GitHub → github.com/SysAdminDoc/LlamaLink