Loki: Low-Rank Keys for Efficient Sparse Attention

Published in Advances in Neural Information Processing Systems 37 (NeurIPS), 2024

Recommended citation:

@inproceedings{NEURIPS2024_1e027da6,
  author    = {Singhania, Prajwal and Singh, Siddharth and He, Shwai and Feizi, Soheil and Bhatele, Abhinav},
  booktitle = {Advances in Neural Information Processing Systems},
  editor    = {A. Globerson and L. Mackey and D. Belgrave and A. Fan and U. Paquet and J. Tomczak and C. Zhang},
  pages     = {16692--16723},
  publisher = {Curran Associates, Inc.},
  title     = {Loki: Low-rank Keys for Efficient Sparse Attention},
  url       = {https://proceedings.neurips.cc/paper_files/paper/2024/file/1e027da6bec9ceb2ec37951ceeccae93-Paper-Conference.pdf},
  volume    = {37},
  year      = {2024}
}