
fast-math-qwen3-14b

By applying SFT and GRPO to difficult math problems, we enhanced DeepSeek-R1-Distill-Qwen-14B and developed Fast-Math-R1-14B, which achieves approximately 30% faster inference on average while maintaining accuracy. Following the same approach, we also trained and open-sourced Fast-Math-Qwen3-14B, an efficiency-optimized version of Qwen3-14B. Compared to Qwen3-14B, this model enables approximately 65% faster inference on average with minimal loss in performance. Technical details can be found in our GitHub repository. Note: this model likely inherits the ability to perform inference in TIR (tool-integrated reasoning) mode from the original model; however, all of our experiments were conducted in CoT (chain-of-thought) mode, and performance in TIR mode has not been evaluated.
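Since this card describes a model served through LocalAI, one way to query it is via LocalAI's OpenAI-compatible chat API. The sketch below builds a chat-completion payload for a math prompt; the `/v1/chat/completions` path is part of LocalAI's documented API, while the host/port, the served model name `fast-math-qwen3-14b`, and the sampling settings are assumptions for illustration.

```python
import json
import urllib.request

# Assumed default LocalAI address; adjust to your deployment.
BASE_URL = "http://localhost:8080"


def build_chat_request(prompt: str, model: str = "fast-math-qwen3-14b") -> dict:
    """Build an OpenAI-style chat-completion payload for a math prompt.

    The model name is an assumption about how this card's model is
    registered in LocalAI; check your gallery/installation output.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Assumed sampling setting, not taken from the model card.
        "temperature": 0.6,
    }


payload = build_chat_request("Solve: what is 12 * 34?")

# With a LocalAI server running, the payload can be POSTed like this:
# req = urllib.request.Request(
#     BASE_URL + "/v1/chat/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(json.dumps(payload, indent=2))
```

The request shape follows the OpenAI chat-completions convention that LocalAI mirrors, so the same payload works with any OpenAI-compatible client library by pointing its base URL at the LocalAI server.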

Repository: localai
License: apache-2.0
