Posts by Chris Forster
Conversational AI
Dec 05, 2019
Pretraining BERT with Layer-wise Adaptive Learning Rates
Training with larger batches is a straightforward way to scale training of deep neural networks to larger numbers of accelerators and reduce the training time....