llama

8B parameters

Commercial OK

Reviewed May 2026

RefinedNeuro RN TR R1

RefinedNeuro RN TR R1 is an Apache-2.0 Llama-family 8B reasoning model distributed on Hugging Face and Ollama. It is included in the local sweep as a compact reasoning baseline, run on the same RTX 5080 rig as the other models in the sweep.

License: Apache-2.0 · Context: 8,192 tokens

Overview

Strengths

Apache-2.0 license
Llama-family tooling compatibility
Very low variance in the local TPS sweep

Weaknesses

Low public adoption signal at intake
Not tagged as a Turkish-specific model in the sweep
Quality benchmarks are still needed before ranking it by capability

Quantization variants

Each quantization trades model quality for file size and VRAM. Q4_K_M is the most popular starting point.

Quantization	File size	VRAM required
Q4_K_M	4.9 GB	7 GB
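
The single row above is enough for a quick sanity check on the arithmetic. The sketch below backs out the average bits per weight implied by the listed file size, and shows how much of the VRAM figure is left over for the KV cache and runtime buffers. It assumes exactly 8 billion parameters and decimal gigabytes (10^9 bytes), neither of which this page states, and the function names are illustrative.

```python
# Back-of-the-envelope check of the Q4_K_M row.
# Assumptions (not stated on the page): exactly 8e9 parameters, 1 GB = 1e9 bytes.

def implied_bits_per_weight(file_size_gb: float, params_billions: float) -> float:
    """Average bits stored per parameter, implied by the file size."""
    # GB * 8 gives gigabits; dividing by billions of parameters gives bits each.
    return file_size_gb * 8 / params_billions


def vram_headroom_gb(vram_required_gb: float, file_size_gb: float) -> float:
    """VRAM left after the weights, available for KV cache and runtime buffers."""
    return vram_required_gb - file_size_gb


bpw = implied_bits_per_weight(4.9, 8.0)
headroom = vram_headroom_gb(7.0, 4.9)
print(f"{bpw:.2f} bits/weight, {headroom:.1f} GB headroom")
```

Under those assumptions the row works out to roughly 4.9 bits per weight, with about 2.1 GB of the 7 GB budget left for everything other than the weights.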

Get the model

Ollama

One-line install

ollama run RefinedNeuro/RN_TR_R1:latest

HuggingFace

Original weights

huggingface.co/RefinedNeuro/RN_TR_R1

Source repository with the original weights; you need to quantize them yourself.

Benchmarks

Real measurements on real hardware. Numbers ship with the runner version, quant, and date.

2 runs on record

Hardware	Provenance	Quant	Ctx	Tokens / sec	TTFT	Date
NVIDIA GeForce RTX 5080	EditorialM	Q4_K_M	2K	133.6 tok/s	—	May 28, 2026
NVIDIA GeForce RTX 3080 16GB (Mobile)	EditorialM	Q4_K_M	4K	79.9 tok/s	361 ms	Jun 2, 2026
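
For readers who want to reproduce a figure like the ones above, here is a minimal sketch of how tokens per second can be derived from an Ollama response. It assumes the run goes through Ollama's local HTTP API, whose non-streaming response reports `eval_count` (generated tokens) and `eval_duration` (nanoseconds). This page does not say which runner or script produced its numbers, and the sample values below are made up for illustration.

```python
import statistics


def tokens_per_second(response: dict) -> float:
    """Generation speed from one response: tokens divided by seconds."""
    seconds = response["eval_duration"] / 1e9  # eval_duration is in nanoseconds
    return response["eval_count"] / seconds


def summarize(runs: list) -> tuple:
    """Mean and sample standard deviation of tok/s across several runs."""
    rates = [tokens_per_second(r) for r in runs]
    mean = statistics.mean(rates)
    # stdev needs at least two data points; report 0.0 for a single run.
    spread = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return mean, spread


# Illustrative values only, not measurements from this page.
sample_runs = [
    {"eval_count": 512, "eval_duration": 4_000_000_000},
    {"eval_count": 520, "eval_duration": 4_000_000_000},
]
mean_tps, spread_tps = summarize(sample_runs)
print(f"{mean_tps:.1f} tok/s (stdev {spread_tps:.2f})")
```

Reporting the spread alongside the mean is what makes a claim such as "very low variance" checkable by someone else.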


Hardware that runs this

Cards with enough VRAM for at least one quantization of RefinedNeuro RN TR R1.

NVIDIA B300 (Blackwell Ultra)
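
The filter behind a list like this is simple: keep any card whose VRAM meets the requirement of at least one quantization. A rough sketch follows, using the single Q4_K_M row from this page. The 16 GB figures for the two benchmarked cards are assumptions here (the mobile card's comes from its name in the benchmark table), and the 6 GB entry is a hypothetical card included only to show a rejection.

```python
# VRAM requirement per quantization, taken from the table on this page.
QUANT_VRAM_GB = {"Q4_K_M": 7.0}

# Card VRAM in GB. These values are assumptions for illustration.
CARDS_VRAM_GB = {
    "NVIDIA GeForce RTX 5080": 16.0,
    "NVIDIA GeForce RTX 3080 16GB (Mobile)": 16.0,
    "Hypothetical 6 GB card": 6.0,
}


def runnable_quants(vram_gb: float, quants: dict = QUANT_VRAM_GB) -> list:
    """Quantizations whose VRAM requirement fits within vram_gb."""
    return [name for name, need in quants.items() if vram_gb >= need]


def cards_that_fit(cards: dict = CARDS_VRAM_GB, quants: dict = QUANT_VRAM_GB) -> list:
    """Cards that can run at least one quantization."""
    return [card for card, vram in cards.items() if runnable_quants(vram, quants)]


fitting = cards_that_fit()
print(fitting)
```

This only compares headline VRAM numbers. It ignores memory already in use by the desktop or other processes, so treat a pass as necessary rather than sufficient.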


Frequently asked

What's the minimum VRAM to run RefinedNeuro RN TR R1?

7 GB of VRAM is enough to run RefinedNeuro RN TR R1 at the Q4_K_M quantization (file size 4.9 GB). Higher-quality quantizations need more.

Can I use RefinedNeuro RN TR R1 commercially?

Yes. RefinedNeuro RN TR R1 ships under the Apache-2.0 license, which permits commercial use. Always read the license text before deployment.

What's the context length of RefinedNeuro RN TR R1?

RefinedNeuro RN TR R1 supports a context window of 8,192 tokens (about 8K).

How do I install RefinedNeuro RN TR R1 with Ollama?

Run `ollama pull RefinedNeuro/RN_TR_R1:latest` to download, then `ollama run RefinedNeuro/RN_TR_R1:latest` to start a chat session. The default quantization is Q4_K_M.
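
Once the model is pulled, it can also be called programmatically. The sketch below builds a request for Ollama's local HTTP API using only the Python standard library. It assumes the Ollama server is listening on its default local address, `http://localhost:11434`, and the helper names are illustrative rather than part of any official client.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address
MODEL = "RefinedNeuro/RN_TR_R1:latest"


def build_request(prompt: str, model: str = MODEL, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.loads(resp.read().decode("utf-8"))
    return body["response"]
```

With the server running and the model pulled, `generate("Explain what a KV cache is in two sentences.")` should return the model's reply as a string. Setting `stream` to `False` keeps the example short at the cost of waiting for the whole reply.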

Source: huggingface.co/RefinedNeuro/RN_TR_R1

Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.
