The Future of AR is Understanding

Up to this point, AR has been about overlaying content on top of the world. You can place virtual masks on yourself, see directions on top of a real street or hallway, or have virtual characters dance in your room for your entertainment. This is about where current technology ends. In July of this year, computer vision researchers at the University of Washington debuted something particularly novel: a neural network that can generate realistic video clips of Obama from audio of his speech.

Depending on your viewpoint, this is either incredible or horrific. But regardless, it is the future.

What this technology demonstrates is the understanding and synthesis of virtual content into a scene. There are no overlays here: each pixel of the final video was composited by a complex network trained to match Obama himself. This is the future of AR: realtime synthesis and manipulation of the things you’re looking at. The future of AR is understanding.

Popular culture's representation of AR is stuck in the clichés of '80s cyberpunk: neon glowing overlays and a complete overstimulation of the senses. The recent concept film "Hyper-Reality" by Keiichi Matsuda is the perfect distillation of this idea.

Your shopping trip in 2025 as seen from the point of view of 1995.

However, this ad-laden future doesn't need to be our future, and I don't see any trends toward it. Ad-based overstimulation might have been a trend in the late '90s, but it has receded on the modern web with the development of ad blockers. At a fundamental level, humans don't want to be inundated with ads all the time, and (perhaps surprisingly) technology has adapted.

So what will AR look like in the future?

How about AR that allows doctors to look at a patient and see not only information, but a perfectly rendered view of precisely what is behind the layers of skin, using information from ultrasounds and other sensors? The view could be constructed in realtime, uniquely for every patient.

Pilots, drivers, and firefighters could have smoke and debris immediately cleared from their vision, revealing a clear path. This would not be a simple overlay or a "HUD" but the precise removal of obscuring objects from their view, revealing detailed reconstructions of what's behind.

Annoying billboards could simply be removed entirely, replaced with anything you desire.

This is AR with purpose, and it's not hard to let your brain run wild once you give machines the ability to understand. Futurists jump to neon-lit dystopias because it's difficult to see past the limitations of today's technology, but let's not be so gripped by narratives that we look past what actually might come to pass.

Obviously, this version of the future is not here yet. Even with the breakneck pace of machine learning, we're easily decades away. This points to an issue with AR in the present: it doesn’t have a purpose. It’s a technology fed on ambition and excitement for the future, but with little use until we have better machine understanding and ergonomic means for blending it with reality. It is not a matter of waiting around for someone to come up with the ‘killer’ AR app because, given the current state of the technology, there probably isn’t one.

However, this 'fishing' we’re doing in the present is useful precisely because the trajectory is so compelling. I do not mean this to be a cautionary article, but rather a bit of recalibration. Let’s not burn ourselves out so much on the current state of AR that we throw the baby out with the bathwater. We can already see hints of this in the VR community. Let’s play the long game. The technology will eventually be there with us.