A heterogeneous hardware acceleration library focused on efficient KV cache transfer operators (H2D/D2H), designed for large model training and inference scenarios.
Last updated: 6 days ago
A heterogeneous hardware acceleration library focused on efficient KV cache transfer operators (H2D/D2H), designed for large model training and inference scenarios.
Last updated: 6 days ago
Library targeting Intel Architecture for specialized dense and sparse matrix operations, and deep learning primitives.
Last updated: 1 month ago