In 1998, Jaime Carbonell and Jade Goldstein proposed maximal marginal relevance (MMR) as a way to balance the concerns of relevance and diversity. This measure is certainly useful in the context of search result diversity. But I think it also helps us think about creativity in an age of generative AI.
People are concerned about ChatGPT and other generative AI tools for a variety of reasons. They fear misinformation in the form of convincingly forged text, images, audio, and video. They fear how automation will disrupt the global job market, putting people out of work and rendering their skills obsolete. Some people even fear a Skynet-driven apocalypse.
Perhaps the biggest and most immediate concern is that generative AI threatens the livelihoods of creators and even creativity itself. Writers and artists in a variety of media fear and object to their art or style being copied or mimicked by generative AI models trained on their work. Creators whose livelihoods depend on payment for their work have good reason to push back on commodification. Some creators are even taking legal action to try to prevent generative AI from using their work as training data.
I am sympathetic to these creators, but I am also skeptical of approaches that seek to prohibit learning from creations, as opposed to copying them. All automation, going back to the industrial revolution, is a form of training machines based on the results of human effort. Automation has literally cheapened many forms of human labor, much of which we no longer think of as work that people should perform. Perhaps the potential impact of generative AI is qualitatively different, but I am cautious about negating a principle that has been key to the progress of humanity over centuries.
Carbonell and Goldstein defined the marginal relevance of a search result as a weighted combination of its relevance to the query and its distinctiveness from the results already selected. Their definition formalizes the intuition that duplicates and near-duplicates of relevant results are not useful for searchers.
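To make the definition concrete, here is a minimal sketch of greedy MMR selection. It scores each remaining candidate as lam * (similarity to the query) minus (1 - lam) * (its highest similarity to any result already chosen), and repeatedly picks the top scorer. The function names `mmr_select` and `jaccard`, the parameter name `lam`, the word-overlap similarity, and the toy documents are all assumptions made for illustration; they are not taken from the original paper, which leaves the choice of similarity measures open.

```python
from typing import Callable, List


def jaccard(a: str, b: str) -> float:
    """Toy similarity: overlap between the word sets of two strings."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def mmr_select(
    query: str,
    candidates: List[str],
    sim_query: Callable[[str, str], float],
    sim_doc: Callable[[str, str], float],
    lam: float = 0.5,
    k: int = 3,
) -> List[str]:
    """Greedily pick up to k candidates by maximal marginal relevance."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:

        def score(doc: str) -> float:
            # Redundancy is the closest match among results already chosen
            # (zero when nothing has been chosen yet).
            redundancy = max((sim_doc(doc, s) for s in selected), default=0.0)
            return lam * sim_query(doc, query) - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = "jaguar speed"
docs = ["jaguar top speed", "top speed jaguar", "jaguar habitat"]

# With lam=0.5 the near-duplicate is passed over for a less relevant
# but more distinct result.
balanced = mmr_select(query, docs, jaccard, jaccard, lam=0.5, k=2)

# With lam=1.0 only relevance counts, so the near-duplicate is kept.
relevance_only = mmr_select(query, docs, jaccard, jaccard, lam=1.0, k=2)
```

In this toy run, `balanced` skips the reworded duplicate in favor of the habitat result, while `relevance_only` returns both versions of the same answer. The weight `lam` is the dial between relevance and diversity.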
Can we apply the same principle to creativity? Clearly a literal copy of a work does not contribute any additional value, and therefore contributes no marginal creativity. Similarly, a highly derivative work does not contribute much value and thus demonstrates little marginal creativity.
Some creators might celebrate such a standard as demonstrating the value of human creativity as compared to what generative AI can produce. But I would not be so quick to jump to that conclusion. After all, many people resort to copying or to producing highly derivative work. Conversely, generative AI has shown itself capable of producing novel results that are not, at least in any obvious way, highly derivative of existing works.
Of course, all creative work is derivative in the sense that it builds on what has been created before it. I suspect that even Ecclesiastes was not the first to remark that “there is nothing new under the sun”. The question is not whether a new work is derivative, but rather how much novelty it adds to the totality of the work that came before it.
Consider the following thought experiment: if a generative AI model did not have access to a particular piece of training data, how much would that lack of access limit or otherwise affect its output? What if it did not have access to anything produced by a particular author or artist? This kind of sensitivity analysis might help us quantify the difference between copying data and learning from data, and arrive at a notion of marginal creativity. After all, if the output of a generative AI model depends so critically on that one input or author, we can reasonably conclude that the model is leaning so heavily on that input or author as to add minimal creativity of its own.
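A toy version of this sensitivity analysis can be sketched in a few lines, with heavy caveats. Here the "model" is nothing more than a word-frequency distribution over a tiny corpus, and sensitivity to an author is the total variation distance between the distribution learned from the full corpus and the one learned with that author's works removed. The names `train`, `total_variation`, and `author_sensitivity`, the choice of distance, and the corpus itself are hypothetical stand-ins; running this kind of ablation on a real generative model would be far more expensive and would need a much richer way of comparing outputs.

```python
from collections import Counter
from typing import Dict, List, Tuple

Work = Tuple[str, str]  # (author, text); a hypothetical record format


def train(corpus: List[Work]) -> Dict[str, float]:
    """Stand-in 'model': the word-frequency distribution of the corpus."""
    counts = Counter(word for _, text in corpus for word in text.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Distance between two word distributions, ranging from 0 to 1."""
    words = set(p) | set(q)
    return 0.5 * sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in words)


def author_sensitivity(corpus: List[Work], author: str) -> float:
    """How far the model's output shifts when one author is left out."""
    full = train(corpus)
    ablated = train([work for work in corpus if work[0] != author])
    return total_variation(full, ablated)


corpus = [
    ("ann", "the cat sat"),
    ("bob", "the cat sat"),
    ("cy", "purple zebras yodel"),
]

# Ann's work is duplicated by Bob's, so removing her changes little.
ann_score = author_sensitivity(corpus, "ann")

# Cy's work appears nowhere else, so removing it changes much more.
cy_score = author_sensitivity(corpus, "cy")
```

In this sketch the model leans twice as heavily on Cy as on Ann, because nothing else in the corpus can stand in for Cy's contribution. A high score is the signal the thought experiment looks for: output that depends critically on one source.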
I concede that measuring marginal creativity is even harder than measuring marginal relevance — which is hard enough! — and I do not expect this thought experiment to be easy to put into practice. Still, I suspect that we will have to follow this path to achieve a definition of creativity that offers a level playing field for humans and generative AI.
Meanwhile, I hope that my own derivative thoughts continue to be informative, insightful, and entertaining enough that I am not replaced by a generative AI anytime soon!