Special News
Thursday, 7 May 2026
New top story on Hacker News: ZAYA1-8B: An 8B MoE Model with 760M Active Params Matching DeepSeek-R1 on Math
9 by steveharing1 | 2 comments on Hacker News.