R_Daneel_AI 🤖 | Neuro-Symbolic Chess Bot & Benchmark

R_Daneel_AI

A chess bot that uses a Large Language Model to strategize, with Queen acting as its sensory eyes. Powered by a local model served via vLLM for maximum inference speed, the bot pairs verbal planning with logical constraint validation to play chess through reasoning rather than brute-force calculation.

"Our goal is not to control the LLM, but to show direction when it is confused."

Traditional Engines Stockfish / Lc0

Opponent's Move

➔

Brute-force Tree Search

➔

Numeric Evaluation

➔

Bot's Move

Mimicry Engines Maia Chess

Opponent's Move

➔

Deep Policy Network

➔

Move Prediction

➔

Bot's Move

Our Neuro-Symbolic Bot R_Daneel_AI

Opponent's Move

➔

LLM (Initial View)

➔

Queen (Analysis)

➔

LLM (Planning)

➔

Queen (Simulations)

➔

LLM (Final Check)

➔

Bot's Move

Interactive Autopsy Playground

Select a scenario below to watch the bot execute System 2 logical analysis on the chessboard and review the actual metrics, strategy tags, and thoughts generated in real-time:

Tactical Scenario

Puzzle Rating Benchmarks

To evaluate the performance of our approach, we ran comprehensive benchmarks across Lichess chess puzzle datasets. Below you can switch between the standard range (600–1500 Elo) and the high-Elo range (1600–2600 Elo) to see the comparative results:

Evaluation Metric / Elo Tier	Pure LLM Baseline cyankiwi/gemma-4-26B	Queen-Only Baseline Queen Heuristics (Symbolic)	R_Daneel_AI (Ours) cyankiwi/gemma-4-26B + Queen
Overall Performance Metrics
Puzzles Solved	80 / 500	376 / 500	404 / 500
Accuracy / Pass Rate	16.0%	75.2%	80.8%
Puzzle Rating (Bisection Method)	673 Rating	1399 Rating	1478 Rating
Inference Retries / Logic Loops	0.24 avg retries	0.00 avg retries	0.38 avg retries
Average Move Latency	17.5s / move	0.54s / move	24.3s / move
Accuracy Breakdown by Elo Tier
600s Tier	10.0% (5/50)	98.0% (49/50)	96.0% (48/50)
700s Tier	16.0% (8/50)	96.0% (48/50)	96.0% (48/50)
800s Tier	10.0% (5/50)	94.0% (47/50)	98.0% (49/50)
900s Tier	20.0% (10/50)	82.0% (41/50)	84.0% (42/50)
1000s Tier	16.0% (8/50)	86.0% (43/50)	88.0% (44/50)
1100s Tier	16.0% (8/50)	70.0% (35/50)	82.0% (41/50)
1200s Tier	20.0% (10/50)	66.0% (33/50)	80.0% (40/50)
1300s Tier	24.0% (12/50)	52.0% (26/50)	68.0% (34/50)
1400s Tier	10.0% (5/50)	58.0% (29/50)	52.0% (26/50)
1500s Tier	18.0% (9/50)	50.0% (25/50)	64.0% (32/50)

Evaluation Metric / Elo Tier	Pure LLM Baseline cyankiwi/gemma-4-26B	Queen-Only Baseline Queen Heuristics (Symbolic)	R_Daneel_AI (Ours) cyankiwi/gemma-4-26B + Queen
Overall Performance Metrics
Puzzles Solved	101 / 500	170 / 500	177 / 500
Accuracy / Pass Rate	20.2%	34.0%	35.4%
Puzzle Rating (Bisection Method)	1741 Rating	1919 Rating	1936 Rating
Inference Retries / Logic Loops	0.22 avg retries	0.00 avg retries	1.40 avg retries
Average Move Latency	38.8s / move	0.46s / move	29.6s / move
Accuracy Breakdown by Elo Tier
1600s Tier	20.0% (10/50)	46.0% (23/50)	42.0% (21/50)
1700s Tier	26.0% (13/50)	50.0% (25/50)	50.0% (25/50)
1800s Tier	24.0% (12/50)	44.0% (22/50)	48.0% (24/50)
1900s Tier	12.0% (6/50)	38.0% (19/50)	44.0% (22/50)
2000s Tier	26.0% (13/50)	40.0% (20/50)	44.0% (22/50)
2100s Tier	26.0% (13/50)	22.0% (11/50)	30.0% (15/50)
2200s Tier	18.0% (9/50)	36.0% (18/50)	28.0% (14/50)
2300s Tier	8.0% (4/50)	26.0% (13/50)	22.0% (11/50)
2400s Tier	24.0% (12/50)	20.0% (10/50)	30.0% (15/50)
2500s Tier	18.0% (9/50)	18.0% (9/50)	16.0% (8/50)

Ready to Configure R_Daneel_AI?

Follow our comprehensive interactive setup guide to spin up the local vLLM server, hook in your Lichess bot account credentials, and run the benchmark suite.

View Setup Documentation