Synthetic shape world with ONNX encoders running fully in the browser.
This demo predicts the closest caption for an image using similarity search, not image generation. Try creating an image on the left, then run caption search.
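To make "similarity search" concrete, here is a minimal sketch of picking the closest caption by comparing embedding vectors. It assumes cosine similarity as the scoring function and uses hand-written toy vectors in place of real encoder outputs; the names `cosineSimilarity`, `closestCaption`, and `CaptionEntry` are hypothetical and not taken from the demo's code.

```typescript
// Hypothetical sketch: the demo's actual encoders, caption set, and scoring
// function are not shown in the source. Cosine similarity is an assumption.
type Embedding = number[];

interface CaptionEntry {
  caption: string;
  embedding: Embedding; // would come from a text encoder in a real setup
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) {
    throw new Error("Embedding dimensions must match");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // treat zero vectors as dissimilar
}

// Linear scan over all captions, keeping the highest-scoring one.
function closestCaption(
  imageEmbedding: Embedding,
  captions: CaptionEntry[],
): { caption: string; score: number } {
  if (captions.length === 0) {
    throw new Error("Caption list is empty");
  }
  let best = { caption: "", score: -Infinity };
  for (const entry of captions) {
    const score = cosineSimilarity(imageEmbedding, entry.embedding);
    if (score > best.score) {
      best = { caption: entry.caption, score };
    }
  }
  return best;
}

// Toy 3-dimensional vectors standing in for encoder outputs.
const captions: CaptionEntry[] = [
  { caption: "a red circle", embedding: [1, 0, 0] },
  { caption: "a blue square", embedding: [0, 1, 0] },
  { caption: "a green triangle", embedding: [0, 0, 1] },
];

const result = closestCaption([0.1, 0.9, 0.2], captions);
console.log(result.caption); // prints "a blue square"
```

A linear scan like this is enough for a small, fixed caption set; nothing is generated, the search only ranks captions that already exist.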