DeepNet: Scaling Transformers to 1,000 Layers | Synapse