Search events for 'all'
A Holistic Approach to Automatic Mixed-Precision Code Generation and Tuning for Affine Programs
Main Conference When: Mon 4 Mar 2024 11:50 - 12:10 People: Jinchen Xu, Guanghui Song, Bei Zhou, Fei Li, Jiangwei Hao, Jie Zhao
… this optimization in a decoupled way by first generating all mixed-precision code … predicted without evaluating all code variants. Experimental results …
POSTER - P2Res: Pattern-Aware Sparse Communication for Scalable Recommendation Model Training
Main Conference When: Sun 3 Mar 2024 18:00 - 20:00 People: Jiaao He, Shengqi Chen, Jidong Zhai
… that replicates hot embedding vectors on all GPUs and stores the rest across hosts. Our parallel strategy guarantees consistency of all embedding vectors …
POSTER - StructMG: A Fast and Scalable Structured Multigrid
Main Conference When: Sun 3 Mar 2024 18:00 - 20:00 People: Yi Zong, Xinliang Wang, Haopeng Huang, Chensong Zhang, Xiaowen Xu, Jian Sun, Bowen Yan, Qin Wang, Sicong Li, Zhaohui Ding, Wei Xue
… achieves the fastest time-to-solutions in all cases with average speedups of 17.6x …
Practical Hardware Transactional vEB Trees
Main Conference When: Tue 5 Mar 2024 10:40 - 11:00 People: Mohammad Khalaji, Trevor Brown, Khuzaima Daudjee, Vitaly Aksenov
… Van Emde Boas (vEB) trees are sequential data structures optimized for extremely fast predecessor and successor queries. These queries are one of the most important incentives to use ordered sets or maps such as vEB trees. All …
Language-Agnostic Static Deadlock Detection for Futures
Main Conference When: Mon 4 Mar 2024 12:10 - 12:30 People: Stefan K. Muller
… of the dependency graph of the program but, as far as we are aware, all are specialized …
Memory Bounds for Bounded Queues
Main Conference When: Tue 5 Mar 2024 10:00 - 10:20 People: Nikita Koval, Anton Paramonov, Petr Kuznetsov, Vitaly Aksenov
… or all inserted elements are distinct. However, in the general case, we show …
Are Your Epochs Too Epic? Batch Free Can Be Harmful
Main Conference When: Mon 4 Mar 2024 10:40 - 11:00 People: Daewoo Kim, Trevor Brown, Ajay Singh
… , and 1.2-1.5× faster than not reclaiming at all, on a 192-thread four-socket …
Gallatin: A General-Purpose GPU Memory Manager
Main Conference When: Tue 5 Mar 2024 16:50 - 17:10 People: Hunter James McCoy, Prashant Pandey
… -of-the-art for range operations and is the fastest allocator for all graph …
CGO Keynote: Computing Systems for the Foundation Model Era
Keynotes When: Tue 5 Mar 2024 08:30 - 09:30 People: Kunle Olukotun
… Generative AI applications with their ability to produce natural language, computer code and images are transforming all aspects of society. These applications are powered by huge foundation models such as GPT-4, which are trained …