GPT-Neo is an implementation of model- and data-parallel GPT-2-like and GPT-3-like models that uses Mesh TensorFlow for distributed support. The codebase is designed for TPUs. It should also work on GPUs, though we do not recommend this hardware configuration.
Sid Black
Leo Gao
Booz Allen Hamilton (United States)
Phil Wang
synapsesocial.com/papers/69d95e8dc7f0c3ae80a3d296 — DOI: https://doi.org/10.5281/zenodo.5297715