Today’s post envisions the state of XR five years in the future, and how we might prototype advanced features like room-scale digitization with current hardware.
This latest generation of HMDs (Oculus Quest, Oculus Rift S, etc.) has swapped out external tracking equipment in favor of HMD-mounted camera arrays. This change is more than a cut-and-dried hardware replacement; simultaneous localization and mapping (SLAM) algorithms compute not only localization but also a local model (a map) of the world around us. Mapping is a powerful capability – it can be subtly useful, in ways like detecting the floor or computing one’s height (Oculus’ Guardian Setup app does precisely this). It can also be world-changing, as Apple devotees have surely imagined after catching a glimpse of RealityKit, Apple’s new framework for interacting with real-world objects in augmented apps. And it has the potential to grossly violate one’s personal privacy (interior scans of my home on the internet? Preferably not).
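To make the "subtly useful" case concrete, here's a minimal sketch of how a headset might recover the floor (and, from it, the wearer's height) from the sparse point cloud a SLAM system produces. The point cloud and headset height below are made-up inputs, and the simple RANSAC-style vote over candidate floor heights is just one common approach – not a claim about how Guardian Setup actually does it.

```python
import numpy as np

def estimate_floor_height(points, iterations=200, threshold=0.03, rng=None):
    """Estimate the floor's height from a gravity-aligned SLAM point cloud.

    points: (N, 3) array of mapped feature points, y-up (the IMU gives us
    gravity, so we only need the dominant horizontal plane near the bottom).
    Returns the estimated floor y-coordinate.
    """
    rng = rng or np.random.default_rng()
    ys = points[:, 1]
    best_y, best_inliers = ys.min(), 0
    for _ in range(iterations):
        candidate = rng.choice(ys)                        # hypothesize a floor height
        inliers = np.sum(np.abs(ys - candidate) < threshold)
        if inliers > best_inliers:
            best_y, best_inliers = candidate, inliers
    return best_y

# Toy example: a flat floor plus some clutter above it, headset worn at ~1.7 m.
rng = np.random.default_rng(0)
floor = np.column_stack([rng.uniform(-2, 2, 500),
                         rng.normal(0.0, 0.01, 500),
                         rng.uniform(-2, 2, 500)])
clutter = rng.uniform([-2, 0.3, -2], [2, 2.0, 2], (200, 3))
cloud = np.vstack([floor, clutter])

floor_y = estimate_floor_height(cloud, rng=rng)
headset_y = 1.70
print(f"user height ≈ {headset_y - floor_y:.2f} m")
```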
All of this is to say – in the not-so-distant future our HMDs will sense enough of our environment to faithfully reproduce it digitally. Floor-to-ceiling, with everything from desk chairs to decorative mugs, tea kettles to topiaries (I can only speak to my own interior decor) represented in detail and 100% upload- and share-able.
Ethical dilemmas aside (see next post), we here at FRL want to prototype this technology and work through some applications to better understand the implications.
Let’s talk prototype.
These HMDs make it purposely difficult to access the underlying camera resources (and understandably so), so we’ll need our own camera(s) for SLAM’ming. One exciting possibility is a 360-degree camera made by Ricoh. Paired with the right algorithm, this might be all the sensor hardware we need.
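Assuming the camera can present itself as a UVC webcam or serve its video over a network URL (the source below is a placeholder; the exact interface depends on whichever model we end up with), grabbing equirectangular frames on the receiving end might look roughly like this OpenCV sketch:

```python
import cv2

# Placeholder source: a UVC device index (e.g. 0) or a stream URL exposed by
# the camera. Both are assumptions about the eventual hardware.
SOURCE = "rtsp://192.168.0.42/live"

cap = cv2.VideoCapture(SOURCE)
if not cap.isOpened():
    raise RuntimeError(f"could not open 360-degree stream: {SOURCE}")

for _ in range(300):                  # grab ~10 seconds at 30 fps
    ok, frame = cap.read()            # one equirectangular frame (BGR)
    if not ok:
        break
    h, w = frame.shape[:2]
    # Hand the frame off to the SLAM/SfM stage described below.
    print(f"received {w}x{h} frame")

cap.release()
```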
Let’s talk pipeline.
HMDs like the Oculus Quest are resource-constrained and less than ideal for processing lots of data. A compute server would better fit the bill. Our pipeline would therefore look something like:
- Stream 360-degree video from a camera to a local server.
- The server runs SLAM and/or structure-from-motion (SfM) to produce a low-polygon 3D model with reasonable-resolution textures.
- The model is transmitted to and displayed on a VR headset.
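As a rough sketch of the server’s middle step, here is how an offline SfM pass could be driven from Python, using COLMAP’s command-line tools as a stand-in for a real-time SLAM system. All paths are placeholders, and note that equirectangular 360-degree frames would first need to be reprojected into perspective views – a step this sketch glosses over.

```python
import subprocess
from pathlib import Path

# Hypothetical layout on the compute server; both paths are placeholders.
FRAMES = Path("capture/frames")   # perspective views cut from the 360 stream
WORK = Path("capture/colmap")

def reconstruct(frames: Path, work: Path) -> Path:
    """Run a batch SfM pass over captured frames and return the sparse model dir."""
    work.mkdir(parents=True, exist_ok=True)
    db = work / "database.db"
    sparse = work / "sparse"
    sparse.mkdir(exist_ok=True)

    # Detect and match image features, then solve for camera poses + points.
    subprocess.run(["colmap", "feature_extractor",
                    "--database_path", str(db),
                    "--image_path", str(frames)], check=True)
    subprocess.run(["colmap", "exhaustive_matcher",
                    "--database_path", str(db)], check=True)
    subprocess.run(["colmap", "mapper",
                    "--database_path", str(db),
                    "--image_path", str(frames),
                    "--output_path", str(sparse)], check=True)

    # Dense reconstruction, meshing, decimation, and texturing would follow
    # here to get the low-polygon model we actually ship to the headset.
    return sparse

if __name__ == "__main__":
    print("sparse model written to", reconstruct(FRAMES, WORK))
```

A batch pass like this trades latency for simplicity; swapping in an incremental SLAM system is what would let the room “resolve” live, as described below.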
Tying this all back to the title of the piece, I imagine packaging this prototype within an app that visualizes the 360-degree capture as if you were waving a VR zipper through the air in front of you. You peer through the zipper opening, and slowly but surely a model of your room resolves. Finally, you step into a VR (or is it now AR?) model of your usual surroundings.
…’til next time!