Artificial Intelligence Seminar / Computer Science Speaking Skills Talk

Tuesday, April 20, 2021 – 12:00pm to 1:00pm


Virtual Presentation (ET) – Remote Access via Zoom


MISHA KHODAK, Ph.D. Student http://www.cs.cmu.edu/~mkhodak/

Factorized layers revisited: Compressing deep neural networks without playing the lottery

Machine learning models are rapidly growing in size, leading to increased training and deployment costs. While the most popular approach for training compressed models is trying to guess good “lottery tickets” or sparse subnetworks, we revisit the low-rank factorization approach, in which weight matrices are replaced by products of smaller matrices. We extend recent analyses of the optimization of deep networks to motivate simple initialization and regularization schemes for improving the training of these factorized layers. Empirically, these methods yield higher accuracies than popular pruning and lottery ticket approaches at the same compression level. We further demonstrate their usefulness in two settings beyond model compression: simplifying knowledge distillation and training Transformer-based architectures such as BERT. This is joint work with Neil Tenenholtz, Lester Mackey, and Nicolo Fusi.
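The core idea can be illustrated with a minimal sketch, not the authors' code: a dense weight matrix W of shape (m, n) is replaced by a product of two smaller matrices U (m × r) and V (r × n), and one simple way to initialize the factors (assumed here for illustration; the talk's specific initialization and regularization schemes may differ) is via a truncated SVD of a standard dense initialization.

```python
import numpy as np

# Replace a dense weight matrix W (m x n) with a rank-r factorization U @ V.
rng = np.random.default_rng(0)
m, n, r = 512, 512, 32

# Standard dense initialization, then a truncated SVD to build the factors.
W = rng.standard_normal((m, n)) / np.sqrt(n)
U_full, s, Vt = np.linalg.svd(W, full_matrices=False)
U = U_full[:, :r] * np.sqrt(s[:r])          # shape (m, r)
V = np.sqrt(s[:r])[:, None] * Vt[:r, :]     # shape (r, n)

# Parameter counts: m*n for the dense layer vs. r*(m+n) for the factors.
dense_params = m * n
factored_params = r * (m + n)
print(f"dense: {dense_params}, factorized: {factored_params}, "
      f"compression: {dense_params / factored_params:.1f}x")
```

At these sizes the factorized layer stores 8x fewer parameters while U @ V remains the best rank-32 approximation of the original initialization.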

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement.

Zoom Participation. See announcement.
