APPLICATION OF GENERATIVE DIFFUSION MODELS IN DIGITAL IMAGE CREATION
DOI:
https://doi.org/10.26906/SUNZ.2022.4.114
Keywords:
GAN, generative adversarial networks, artificial intelligence, non-equilibrium thermodynamics, diffusion models, digital art, ImageNet model, WordNet
Abstract
There has been a significant surge in the popularity of generative networks over the last year. With public releases of such advanced models as DALL-E, Stable Diffusion, or GPT-3, anyone with modest, run-of-the-mill hardware can dabble in machine learning [3]. Diffusion models, inspired by non-equilibrium thermodynamics, are a subcategory of likelihood-based models. They are known to offer reliably scalable, high-fidelity images while retaining a stationary training objective. These models generate samples by gradually removing noise from a signal, and their training objective can be expressed as a reweighted variational lower bound [2]. This class of models already holds the state-of-the-art [6] on CIFAR-10 [3], but still lags behind GANs on difficult generation datasets like LSUN and ImageNet. Nichol and Dhariwal [4] found that these models improve reliably with increased compute and can produce high-quality samples even on the difficult ImageNet 256×256 dataset using an upsampling stack. However, the FID of this model is still not competitive with BigGAN-deep [5], the current state-of-the-art on this dataset. Moreover, these models are capable of producing an unlimited number of unique, high-quality images, human-like speech, and realistic music, indistinguishable from human-made ones at first glance. Likelihood-based models might provide better performance than GANs, and diffusion models are a promising new category of such models. Disco Diffusion is a combination of CLIP and ImageNet models that can generate digital art based on text prompts. Numerous applications are possible for this model, such as the creation of video, animation, and image content. Several distinctions have to be considered when choosing Disco Diffusion over a GAN.
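The statement that diffusion models generate samples by gradually removing noise, with a training objective expressible as a reweighted variational lower bound, can be illustrated with a minimal NumPy sketch. It shows only the forward (noising) process that training relies on and the simplified noise-prediction loss commonly used as the reweighted bound. The function names, the linear noise schedule, the 1000-step length, and the 8×8 toy "image" are illustrative assumptions and are not taken from this article; no neural network is trained here.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances (assumed linear schedule, hypothetical defaults)."""
    return np.linspace(beta_start, beta_end, num_steps)

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    Returns the noisy sample and the noise that was added.
    """
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise

def simple_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise.

    A denoising network would be trained to minimise this quantity;
    here it is only evaluated, not optimised.
    """
    return float(np.mean((predicted_noise - true_noise) ** 2))

rng = np.random.default_rng(0)
betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)      # cumulative fraction of signal retained

x0 = rng.uniform(-1.0, 1.0, size=(8, 8))  # toy stand-in for an image
x_early, _ = forward_diffuse(x0, 10, alpha_bar, rng)       # still close to x0
x_late, noise = forward_diffuse(x0, 999, alpha_bar, rng)   # almost pure noise

# A placeholder "model" that predicts zeros, only to show how the loss is computed.
baseline_loss = simple_loss(np.zeros_like(noise), noise)
```

Sampling runs this process in reverse: starting from pure noise, a trained network repeatedly estimates the noise in the current sample and removes part of it, which is the gradual denoising the abstract describes.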
References
Prafulla Dhariwal, Alex Nichol – Diffusion Models Beat GANs on Image Synthesis. URL: https://arxiv.org/pdf/2105.05233.pdf
Sakib Shahriar – GAN Computers Generate Arts? A Survey on Visual Arts, Music, and Literary Text Generation using Generative Adversarial Network. URL: https://arxiv.org/ftp/arxiv/papers/2108/2108.03857.pdf
Ali Razavi, Aaron van den Oord, Oriol Vinyals – Generating Diverse High-Fidelity Images with VQ-VAE-2. URL: https://arxiv.org/pdf/1906.00446
Rewon Child – Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images. URL: https://arxiv.org/pdf/2011.10650
Matthew McAteer – CLIP Prompt Engineering for Generative Art. URL: https://matthewmcateer.me/blog/clip-promptengineering/
Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston – Neural Photo Editing with Introspective Adversarial Networks. URL: https://arxiv.org/pdf/1609.07093
OpenAI – Image GPT. URL: https://openai.com/blog/image-gpt/
ImageNet: About. URL: https://www.image-net.org/about.php
Google Trends. URL: https://trends.google.com/
Golovko G. V., Nikiforova K. M. Information systems use at Poltava national technical Yuri Kondratyuk University. Control, navigation and communication systems. 2018. Vol. 3. P. 103-105.