AI | 2026.04.23
Hardware-Software Co-optimization: How NousCoder-14B and Blackwell Are Redefining Training Latency