Efficient Hardware Scaling and Diminishing Returns in Large-Scale Training of Language Models

Jared Fernandez

PhD student at CMU LTI working on ML efficiency.
