果冻甜甜的
Um..! 14 posts in total. Keep on posting.
2025
11-23
Reducing Activation Recomputation in Large Transformer Models
11-23
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
11-22
InstructCoder: Instruction Tuning Large Language Models for Code Editing
11-22
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
09-07
Introduction to Tokens
09-07
Streams and Events in PyTorch
08-17
Common Shell Commands on Ubuntu
08-17
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
08-17
Tensor Parallelism and GQA in Attention
06-20
PyTorch Shard