inplace attention state, faster and less memory

2022-07-04 09:14:37 -04:00
parent aca617dc64
commit 6f617fe98f
4 changed files with 25 additions and 15 deletions
@@ -9,7 +9,7 @@
 This is a fast, minimal implementation of Boris Dayma's [DALL·E Mega](https://github.com/borisdayma/dalle-mini). It has been stripped down for inference and converted to PyTorch. The only third-party dependencies are numpy, requests, pillow and torch.

 It takes
- **35 seconds** to generate a 3x3 grid with a P100 in Colab
+- **32 seconds** to generate a 3x3 grid with a P100 in Colab
 - **16 seconds** to generate a 4x4 grid with an A100 on Replicate
 - **TBD** to generate a 4x4 grid with an H100 (@NVIDIA?)