Thursday, 7 May 2026

New top story on Hacker News: ZAYA1-8B: An 8B MoE Model with 760M Active Params Matching DeepSeek-R1 on Math

9 points by steveharing1 | 2 comments on Hacker News.

