The essence of machine learning is learning from data. Since we can only learn from past data, machine learning uses our knowledge of the past to predict the future. Increasingly, that past is privately owned.
With machine learning emerging as the defining technology of our age, we are in the midst of a land-grab for the data that powers it. As companies like Google, Facebook, Amazon, and Uber have demonstrated, data — particularly data about human behavior — creates a robust and sustainable competitive advantage.
Our increasingly digital existence creates a massive historical record, but we are mostly privatizing that record by treating the data as private property, owned by the companies that provide us valuable services. At an individual level, the data represents our identity, interests, and tastes — sometimes seeming to know us better than we know ourselves. In aggregate, the data makes it possible for its owners to understand and target people at scale.
The success of the aforementioned companies shows that we are very willing to give up data ownership in exchange for the value derived from that data. We prefer watching ads to paying for services, and a new generation of “data natives” expects products and services to be personalized. We may have misgivings about our loss of privacy, but for the most part we accept companies that get right up to the creepy line but don’t cross it.
But privatizing the past has a downside. A handful of companies now have exclusive control over much of the data that defines our digital existence. By owning our past, they are in the strongest position to create and control our future. Even if they do so with the best of intentions, we should be thoughtful about yielding so much power to a handful of institutions whose interests aren’t always aligned with our own.
Machine learning is driving an extraordinary revolution, and we should continue to seek way to create a better world by learning from data. But if we allow others to own the data necessary to train models, we are outsourcing that revolution. At an individual level, it may be difficult to see any downside to trading data ownership for value. But at a collective level, we risk giving away our future.
If we are to democratize the benefits of machine learning, we cannot privatize our past.