A smaller Mixture of Experts (MoE) text-only LLM for efficient reasoning and math.
A Mixture of Experts (MoE) text-only reasoning LLM designed to fit within an 80GB GPU.
Robust Speech Recognition via Large-Scale Weak Supervision.