Loki: Low-Rank Keys for Efficient Sparse Attention
Prajwal Singhania, Siddharth Singh, Shwai He, Soheil Feizi, and Abhinav Bhatele
(To Appear) Advances in Neural Information Processing Systems 37 (NeurIPS), 2024
Abhimanyu Hans, John Kirchenbauer, Yuxin Wen, Neel Jain, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele, and Tom Goldstein
(To Appear) Advances in Neural Information Processing Systems 37 (NeurIPS), 2024
Siddharth Singh, Prajwal Singhania, Aditya Ranjan, John Kirchenbauer, Jonas Geiping, Yuxin Wen, Neel Jain, Abhimanyu Hans, Manli Shu, Aditya Tomar, Tom Goldstein, and Abhinav Bhatele
(To Appear) SC'24: Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis, 2024
Siddharth Singh, Olatunji Ruwase, Ammar Ahmad Awan, Samyam Rajbhandari, Yuxiong He, and Abhinav Bhatele
ICS '23: Proceedings of the 37th International Conference on Supercomputing, 2023
Siddharth Singh and Abhinav Bhatele
2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2023