I. Executive Summary
Generative AI has been a major focus of interest for investors, entrepreneurs, and several technology giants in recent years. This landscape is in a constant state of evolution, making remarkable strides in a matter of weeks. Today, generative AI stands at the forefront of technological adoption, with exponential improvements and real-world applications reshaping industries.
- Language and Vision: Revolutionizing the Startup Landscape. Language and vision applications have emerged as dominant use cases, with natural language interfaces occupying the largest market share.
- Foundation Models: The Bedrock of Generative AI. An array of generative AI capabilities relies on robust foundation models that serve as the building blocks for innovation.
- Shaping the Future: Enhancing Accuracy and Exploring New Frontiers. The future of generative AI will see a relentless push to improve the accuracy and realism of generated content. As the technology progresses, experts envision entirely new realms of content creation, including immersive virtual worlds and interactive narratives. Potential applications also extend to personalized content generation, targeted advertising, and even automated scientific discovery.
- Bridging the Gap: Enterprise and B2B Transformations. Enterprise and B2B solutions have yet to catch up with their B2C counterparts in offerings and capabilities. However, these solutions hold immense promise, as they aim to synthesize vast amounts of information, revamp enterprise workflows, and boost productivity for knowledge workers.
- Overcoming Challenges: Paving the Way for Enterprise Adoption. The path to enterprise adoption of generative AI presents challenges such as bias and hallucination. Nonetheless, continuous advancements in large language models, improved AI operations, better tooling, and the availability of high-quality training data are bringing enterprise adoption of generative AI steadily closer.
In this paper, DVC gives a high-level background on generative AI, provides an update on the latest investment trends, and proposes a framework for analyzing the opportunities and companies in the category.
II. Introduction To Generative AI
Generative AI is a branch of AI that creates new data, such as images, text, signals, and music, based on existing data. A model tries to uncover the underlying features of the data and uses probability, optimization, and statistics to create new, unique observations that look as though they came from the original data.
Generative AI is different from other forms of AI because it creates new data rather than merely identifying, categorizing, or classifying existing data. To understand what generative modeling aims to achieve and why it matters, it is useful to compare it to its counterparts, discriminative AI and predictive AI. Discriminative AI tries to distinguish between certain kinds of input, such as answering “Is this image a cat or a dog?” Predictive AI uses historical data to forecast likely future events, such as “Is this dog likely to go left or right?”, whereas generative AI responds to prompts like “Draw a picture of a cat sitting next to a dog.”
Generative AI through the ages
One of the earliest examples of generative AI was the first computer-generated music, created in 1951 by Christopher Strachey at the University of Manchester.
In the 1960s and 1970s, researchers explored using computers for creative output, adapting the 19th-century Harmonograph device to generate computer-generated art. In the 1980s and 1990s, generative AI gained more attention with advancements in computing power and machine learning algorithms, leading to the development of probabilistic models like Hidden Markov Models and Boltzmann Machines. The “AARON” program, created by artist Harold Cohen, also emerged during this time, utilizing rules and algorithms to produce original artwork.
In the early 2000s, the field of generative AI began to expand rapidly, with the development of new techniques such as deep learning and neural networks. These techniques allowed researchers to create models that could generate more complex and realistic content, such as images and videos.
In 2014, Ian Goodfellow, then a Ph.D. student in machine learning at the Université de Montréal working under the supervision of Prof. Yoshua Bengio, one of the godfathers of AI, and Prof. Aaron Courville, introduced the concept of Generative Adversarial Networks (GANs). GANs are a type of neural network that can generate realistic content by training two models against each other – a generator that creates new content, and a discriminator that evaluates the content to determine whether it is real or fake. GANs have since become one of the most popular and widely used techniques in generative AI.
Another important advancement was the transformer architecture, introduced in Google’s landmark 2017 paper, which helped revolutionize the way language tasks are processed. It remains the bedrock of several state-of-the-art models such as GPT-4 and PaLM 2. Transformers are a type of neural network that learns to generate text by modeling the relationships between words in a sentence using an attention mechanism. Transformers have been used to generate high-quality language translations, as well as creative writing and poetry.
Today, generative AI is a rapidly evolving field that is being used to create new and exciting applications, such as generating realistic images, generating music and speech, and even generating new molecules for drug discovery. As computing power continues to increase and new techniques and algorithms are developed, the possibilities for generative AI are virtually limitless.
III. Technical Overview
Generative AI models use probability distributions to model the data they are trained on. These probability distributions can be simple, such as a Gaussian distribution, or more complex, such as a beta or gamma distribution. The model then learns the parameters of these distributions from the training data and can use them to generate new data points that are similar to the training data.
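As a minimal illustration of this fit-then-sample idea, the toy sketch below (our own example, not from the report) estimates the parameters of a Gaussian from synthetic "training data" and then draws new points from the fitted distribution:

```python
import random
import statistics

random.seed(0)  # for reproducibility

# Toy "training data": pretend these points came from an unknown source.
training_data = [random.gauss(5.0, 2.0) for _ in range(10_000)]

# Learn the parameters of a Gaussian distribution from the data.
mu = statistics.fmean(training_data)
sigma = statistics.stdev(training_data)

# Generate new data points that resemble the training data.
generated = [random.gauss(mu, sigma) for _ in range(5)]
print(mu, sigma, generated)
```

Real generative models learn far more complex, high-dimensional distributions, but the two phases are the same: fit the distribution, then sample from it.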
The vehicle that drives most current models is Deep Learning. Deep Learning is a subfield of machine learning that involves training artificial neural networks to learn from data and make predictions or decisions. VAEs (see below for detailed explanation) and GANs are two popular approaches for generative modeling that have been used to generate realistic images, music, and text. In both GANs and VAEs, the underlying mathematics involves optimization techniques, such as gradient descent, to learn the parameters of the model. Additionally, these models often use techniques such as regularization to prevent overfitting and improve the generalization performance of the model.
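The gradient descent mentioned above can be sketched in a few lines; this one-dimensional toy (for illustration only) minimizes a simple loss by repeatedly stepping against the gradient:

```python
# Toy gradient descent: find the x that minimizes f(x) = (x - 3)**2.
def grad(x):
    return 2.0 * (x - 3.0)  # analytic derivative of (x - 3)**2

x = 0.0    # initial guess for the parameter
lr = 0.1   # learning rate (step size)
for _ in range(200):
    x -= lr * grad(x)  # step in the direction that decreases the loss
print(x)  # converges to 3.0, the minimum of the loss
```

Training a neural network applies the same update, only to millions of parameters at once, with gradients computed by backpropagation.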
There are several different types of generative AI models, but they all share a common goal of generating new data that looks like it could have come from the original dataset. Here are some common techniques of modeling and generating data:
Variational Autoencoders (VAEs): A VAE is a type of deep neural network that is trained to encode input data into a lower-dimensional space, and then decode it back into the original space. By learning to compress the data into this lower-dimensional space, the VAE learns the underlying distribution of the data and can generate new samples that resemble the original data. The VAE is trained to minimize the difference between the generated data and the training data, using a measure called the reconstruction loss, typically combined with a regularization term (the KL divergence) that keeps the learned latent distribution close to a simple prior. VAEs are a somewhat older class of generative models and are increasingly being replaced by GAN-based approaches in many applications.
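The VAE training objective combines a reconstruction loss with a KL-divergence regularizer. The snippet below (an illustrative simplification, not a full VAE) computes a squared-error reconstruction loss plus the well-known closed-form KL divergence between a Gaussian latent distribution and a standard-normal prior:

```python
import math

def kl_to_standard_normal(mu, sigma):
    """Closed-form KL divergence KL( N(mu, sigma^2) || N(0, 1) )."""
    return 0.5 * (mu**2 + sigma**2 - math.log(sigma**2) - 1.0)

def vae_loss(x, x_reconstructed, mu, sigma):
    """VAE objective for one example: reconstruction error plus KL penalty."""
    reconstruction = (x - x_reconstructed) ** 2  # squared-error reconstruction
    return reconstruction + kl_to_standard_normal(mu, sigma)

# A latent code that matches the prior exactly incurs zero KL penalty.
print(kl_to_standard_normal(0.0, 1.0))  # 0.0
```

The KL term is what pushes the latent space toward a smooth, sampleable distribution; without it the model would collapse into an ordinary autoencoder.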
Generative Adversarial Networks (GANs): A GAN is a type of neural network that is composed of two parts: a generator and a discriminator. The generator is trained to generate new samples of data that look like they came from the original dataset, while the discriminator is trained to distinguish between real and fake samples. The two networks are trained together in a process called adversarial training, where the generator tries to fool the discriminator, and the discriminator tries to correctly identify the real data. Image-based generative AI models largely use a GAN framework.
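The adversarial training loop can be sketched with a deliberately tiny toy: a one-parameter generator that shifts Gaussian noise, and a logistic-regression discriminator, each updated with hand-derived gradients. This is an illustration of the alternating update scheme, not a production GAN:

```python
import math
import random

random.seed(0)

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

# Toy setup: real data ~ N(3, 0.5); the generator g(z) = theta + z shifts
# standard Gaussian noise, so theta is the mean of the generated samples.
theta = 0.0           # generator parameter
w, b = 0.0, 0.0       # discriminator D(x) = sigmoid(w*x + b)
lr, batch = 0.05, 32

for _ in range(1000):
    real = [random.gauss(3.0, 0.5) for _ in range(batch)]
    fake = [theta + random.gauss(0.0, 1.0) for _ in range(batch)]

    # Discriminator step: ascend log D(real) + log(1 - D(fake)).
    gw = (sum((1 - sigmoid(w * x + b)) * x for x in real)
          - sum(sigmoid(w * x + b) * x for x in fake)) / batch
    gb = (sum(1 - sigmoid(w * x + b) for x in real)
          - sum(sigmoid(w * x + b) for x in fake)) / batch
    w, b = w + lr * gw, b + lr * gb

    # Generator step: ascend log D(fake), i.e. try to fool the discriminator.
    fake = [theta + random.gauss(0.0, 1.0) for _ in range(batch)]
    gtheta = sum((1 - sigmoid(w * x + b)) * w for x in fake) / batch
    theta += lr * gtheta

print(theta)  # generated mean after adversarial training
```

As training proceeds, the generated mean drifts toward the real mean: the discriminator keeps finding the gap between real and fake, and the generator keeps closing it.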
Autoregressive Models: An autoregressive model is a type of time-series model that predicts the probability distribution of the next value in a sequence, based on the previous values. These models can be used to generate new sequences of data that are similar to the original sequence. Autoregressive models are typically used for sequential data, such as text or time series data, where the order of the data is important.
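A character-level bigram model is about the simplest possible autoregressive generator: each character is sampled conditioned only on the previous one. The toy below (our own example, on a made-up corpus) shows the predict-next-from-previous loop:

```python
import random
from collections import Counter, defaultdict

random.seed(0)

# Train a character-level bigram model: P(next char | current char).
corpus = "the quick brown fox jumps over the lazy dog. the dog sleeps."
counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    counts[current][nxt] += 1

def generate(start, length):
    """Autoregressively sample: each char depends only on the previous one."""
    out = [start]
    for _ in range(length - 1):
        chars, weights = zip(*counts[out[-1]].items())
        out.append(random.choices(chars, weights=weights)[0])
    return "".join(out)

print(generate("t", 30))
```

Large language models follow the same generate-one-token-at-a-time scheme, but condition on a long context window with a neural network rather than on a single previous character.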
Transformers: Transformers are a type of neural network architecture designed to handle sequential data, such as text, by allowing the model to attend to different parts of the input sequence as it generates the output sequence. Transformers use a self-attention mechanism to weigh the importance of each word in the input sequence for generating each word in the output sequence. Transformers have been used for various natural language processing tasks, including language modeling, machine translation, and text generation.
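The self-attention mechanism can be sketched in plain Python. For clarity, the toy below omits the learned query/key/value projection matrices and attends over the raw input vectors, which is a simplification of the real architecture:

```python
import math

def softmax(row):
    m = max(row)  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(X):
    """Scaled dot-product self-attention over a sequence X of d-dim vectors.
    Queries, keys, and values are the inputs themselves (projections omitted)."""
    d = len(X[0])
    # Score each (query, key) pair by their scaled dot product.
    scores = [[sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
               for k in X] for q in X]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    # Each output vector is a weighted average of all value vectors.
    out = [[sum(wt * X[t][j] for t, wt in enumerate(row)) for j in range(d)]
           for row in weights]
    return out, weights

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 3-token sequence, d = 2
out, weights = self_attention(X)
print(weights)
```

The attention weights are exactly the "importance of each word for generating each word" described above: row i shows how much token i draws on every token in the sequence.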
Diffusion networks: These are a type of probabilistic model that uses a sequence of steps (known as diffusion steps) to generate high-quality samples from complex data distributions, such as images or video. They do not rely on explicit modeling of the joint probability distribution or on predicting the next token in a sequence. Instead, a fixed forward process gradually adds noise to the data over a sequence of diffusion steps, and the model is trained, in an unsupervised manner, to reverse that process, recovering realistic samples from pure noise.
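The forward (noising) half of this process is simple to sketch. The toy below (illustrative only) applies the standard closed-form noising step, x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise, under a constant noise schedule; a diffusion model would then be trained to reverse it:

```python
import math
import random

random.seed(0)

def forward_diffusion(x0, betas):
    """Forward (noising) process: progressively mix the signal x0 with
    Gaussian noise according to the variance schedule `betas`."""
    samples = []
    alpha_bar = 1.0  # running product of (1 - beta)
    for beta in betas:
        alpha_bar *= (1.0 - beta)
        noise = random.gauss(0.0, 1.0)
        # Closed-form noised sample at this step:
        samples.append(math.sqrt(alpha_bar) * x0
                       + math.sqrt(1.0 - alpha_bar) * noise)
    return samples, alpha_bar

betas = [0.02] * 300  # constant noise schedule, for illustration
samples, alpha_bar = forward_diffusion(5.0, betas)
print(alpha_bar)  # near zero: by the last step the signal is almost all noise
```

Generation runs this chain backwards: starting from pure noise, a trained network removes a little noise at each step until a clean sample emerges.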
Other modeling methods include Markov Chain Monte Carlo (MCMC), Hamiltonian Monte Carlo (HMC), Variational Inference, Gibbs Sampling, etc. Transformer-based, GAN-based and diffusion network approaches are currently state of the art and are used extensively in the commercial world.
IV. Market Size, Opportunities And Growth
According to one report, the global generative AI market has experienced substantial growth, with revenue of USD 7.9 billion in 2021 and a compound annual growth rate (CAGR) of 34.3% expected from 2022 to 2030. The North America generative AI market accounted for the largest share, over 40%, in 2021, while the Asia-Pacific generative AI market is predicted to achieve a significant CAGR of approximately 36% from 2022 to 2030, indicating substantial growth potential in the region. The global generative AI market is expected to reach $42.6 billion in 2023, according to PitchBook.
Source: Acumen Research and Consulting
According to Grandview Research, the global market for generative AI can be categorized by component types, modeling approach, end-user, and region.
The component types can be divided into software and services, with the software segment holding the largest revenue share of 65.0% in 2021. The service segment is expected to witness the fastest growth rate of 35.5% during the forecast period.