R_Daneel_AI 🤖 | Developer Setup Guide

Developer Setup Guide

Welcome to the installation and deployment documentation for R_Daneel_AI. This guide walks you through setting up your local environment, installing the symbolic Prolog engine, spinning up the high-capacity vLLM inference backend, configuring credentials, and running the benchmarking suite.

💡

Hardware Requirements

To run the flagship Gemma 4 26B AWQ MoE model locally, we recommend a GPU with at least 24 GB of VRAM (such as an Nvidia RTX 3090, 4090, or A5000). The model footprint consumes ~16.6 GiB of memory, leaving the remainder for KV cache allocation.

1. Install Queen

The bot relies on the standalone Queen package (the compiled SWI-Prolog chess engine wrapper) to calculate legal moves, pinned pieces, checks, discoverys, and forced tactical wins.

First, ensure you have SWI-Prolog installed on your system:

Fedora / RedHat

sudo dnf install pl

Ubuntu / Debian

sudo apt-get install swi-prolog

Next, install the local python dependencies and install the cloned `Queen` package in editable mode:

Shell Command

pip install -r requirements.txt
pip install -e ./Queen

2. Boot the local vLLM Server

To avoid high inference latency, we serve the model locally using vLLM. Because Gemma 4 is a large Mixture of Experts (MoE) model, we apply structural parameter limits to prevent GPU out-of-memory errors:

--max-model-len 8192: Clamps the maximum context window to fit the GPU VRAM.
--max-num-batched-tokens 4096: Sets appropriate multimodal batched limits.
--enable-auto-tool-choice --tool-call-parser gemma4: Enables native tool calling.

Launch the server by executing the provided startup script:

start_vllm.sh

#!/bin/bash
export VLLM_DISABLE_FLASHINFER=1
export VLLM_USE_FLASHINFER_SAMPLER=0
export VLLM_ATTENTION_BACKEND=FLASH_ATTN
export HF_HUB_OFFLINE=1

# Serve Gemma 4 MoE on port 8000 with optimized parameters
vllm serve cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit \
    --port 8000 \
    --enable-auto-tool-choice \
    --tool-call-parser gemma4 \
    --max-num-batched-tokens 4096 \
    --max-model-len 8192

3. Configure Environment Variables

Create a .env file at the root of the project to hold your private credentials and configuration parameters. The file is listed in .gitignore to prevent accidental commits to public repositories.

.env File Structure

# Lichess Connection Credentials
LICHESS_API_TOKEN=your_lichess_bot_api_token_here
LICHESS_MY_USERNAME=your_personal_username_to_accept_challenges_from

# Local vLLM Model serving configuration
LICHESS_MODEL_NAME=cyankiwi/gemma-4-26B-A4B-it-AWQ-4bit

4. Run the Lichess Bot

Once the vLLM server is running and your .env file is configured, start the Lichess polling client:

Shell Command

python3 lichess_bot.py

The client will connect, check if your Lichess account is already designated as a BOT account, and prompt you for confirmation if an upgrade is needed. Challenges issued from your designated LICHESS_MY_USERNAME will be automatically accepted, starting a match.

⚠️

Irreversible Lichess BOT Upgrade

Upgrading a Lichess account to a BOT designation is irreversible. Once upgraded, that account can never play standard human matches and will be restricted to bot/API play. We strongly recommend creating a fresh, dedicated Lichess account for your bot rather than using your personal main account.

5. Run Benchmark Suites

You can test the bot's tactical accuracy and rating locally using the test utilities:

A. Local Tactical vision Elo Benchmark

Measures accuracy and computes an estimated tactical rating against 30 puzzles:

Shell Command

python3 test_puzzles.py

B. Pure LLM Baseline Comparison

Evaluates a pure LLM (without Queen or min-max constraints) to measure System 2 improvements:

Shell Command

python3 test_puzzles_baseline.py

C. Match Play Against Local Stockfish

Auto-runs standard matches (alternating colors) against a local Stockfish engine to determine win rate:

Shell Command

python3 play_stockfish.py

6. Replaying Games & Analyzing Thought Logs

R_Daneel_AI records every turn's FEN state, proposed moves, metrics, and full Markdown thoughts in structured JSON Lines files (logs/game_{game_id}.jsonl).

To inspect and debug the bot's System 2 logical reasoning step-by-step:

Open logs/visualizer.html in any browser.
Drag and drop your game's .jsonl file directly into the visualizer panel.
Use the navigation controls (Play/Pause, Next, Previous) to replay the board state and read the corresponding LLM Thoughts and updated Strategic Plans.

Troubleshooting Common Errors

⚠️

SWI-Prolog Shared Library Error (pyswip)

Error: Failed to find shared library 'libswipl.so'.

Solution: You must explicitly register the location of SWI-Prolog's binary libraries inside your environment variables before executing. Find the path containing libswipl.so (usually under /usr/lib/swi-prolog/lib/x86_64-linux/ or /usr/lib64/) and export it:

Environment Export

export LD_LIBRARY_PATH=/usr/lib/swi-prolog/lib/x86_64-linux:$LD_LIBRARY_PATH

🦖

Model Context Window Overflow (400 BadRequest)

Error: BadRequestError: 400 Context window exceeded limit.

Solution: In complex games with deep tool calls, the chat history can swell. The python coordinator automatically trims the `get_tactical_summary` dictionary keys and uses history block pruning. If you encounter errors, ensure the server was booted with --max-model-len 8192.

🌀

AWQ Token Repetition Loops

Problem: AWQ-quantized Mixture of Experts models running at temperature=0.0 sometimes enter infinite token repetition loops, failing to emit JSON tool calls.

Solution: We resolved this in lichess_bot.py by setting temperature=0.15 and frequency_penalty=0.15. Ensure these settings are active in your script.