On social media, people have been asking how the 3D piece was made, and Roque's shared some of the details of his process.
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt. This is done by training a model that takes as input a text ...