On the current state of AR

Augmented reality is popping up everywhere! Of course, the wide range of what people understand AR to be is a big factor. Primarily you’ve got mobile apps that track your position and overlay extra information on your surroundings, or fun and educational stargazing tools. And let’s not forget Google Glass.

Every now and then, things get interesting: people want to see an additional, virtual 3D object in the context of a real scene. The most recent buzz was generated by the IKEA app accompanying their catalog. The very real catalog serves as an anchor point in your very real kitchen, living room or wherever; with the camera image as a background, you can select one of the pieces of furniture and have a look at it, neatly positioned on top of where the catalog is lying.

Trouble, however, is imminent. The first thing everyone will notice a few seconds into playing with tools like this is that there’s something odd about the depth: the augmented object is always rendered on top of everything. The camera, lacking any depth sensor, cannot provide the information needed to figure out which real objects are in front of the virtual one. Clever camera positioning, as seen in the trailer, hides this issue quite well.
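To make the missing piece concrete, here is a minimal sketch of the per-pixel occlusion test a renderer would need. All names and the toy frame are illustrative, and the point is precisely that `real_depth` is the input a plain camera cannot deliver:

```python
import numpy as np

def composite(camera_rgb, virtual_rgb, virtual_depth, real_depth):
    """Per-pixel occlusion test: keep a virtual fragment only where it is
    closer to the camera than the real scene. Without a real_depth map
    (i.e. without a depth sensor), this test is impossible and the
    virtual object always ends up on top."""
    virtual_wins = virtual_depth < real_depth            # boolean mask, HxW
    return np.where(virtual_wins[..., None], virtual_rgb, camera_rgb)

# Toy 1x2 frame: at the left pixel the real surface is nearer,
# at the right pixel the virtual object is nearer.
camera  = np.array([[[0.2, 0.2, 0.2], [0.2, 0.2, 0.2]]])
virtual = np.array([[[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]])
v_depth = np.array([[2.0, 1.0]])
r_depth = np.array([[1.0, 3.0]])
out = composite(camera, virtual, v_depth, r_depth)
# left pixel keeps the camera image, right pixel shows the virtual object
```

This is exactly the depth test a GPU performs between virtual objects; the AR problem is that nothing fills the depth buffer on behalf of reality.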

The human visual system is not so easily fooled by the presentation of lights and shadows, though. Augmented reality is all about getting two distinct worlds to behave coherently in one frame. The virtual object should behave as if it were real. Your very first impression of it is the way it looks, and the renderer responsible for this image has to deal with how light reflects off the object’s surface. The question is: how does the renderer know where the light - the real light - actually comes from?

Ignoring this question, as many AR renderers apparently do, instantly gives the fake object away. Search for a few AR apps and notice how the virtual objects almost always look like plastic. This is because they have been shaded - if at all - with some Phong-like model and fixed light sources. If someone comes into the room and dims the lights, the virtual object won’t react. You can hide this error by rendering a mostly ambiently lit object with some ambient occlusion and placing it into an equally ambiently lit room, as was done in the IKEA video. Other real-world lighting configurations are not so forgiving.
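The plastic look is easy to reproduce. Below is a minimal sketch of the pattern in question - a Blinn-Phong-style shade with a hard-coded light; all names are made up for illustration. Note that no measurement of the real scene enters the computation anywhere:

```python
import numpy as np

# Illustrative constants: the light is baked in, never measured.
FIXED_LIGHT_DIR = np.array([0.0, 1.0, 0.0])
FIXED_LIGHT_RGB = np.array([1.0, 1.0, 1.0])

def shade(normal, view_dir, albedo, shininess=32.0):
    """Blinn-Phong shading with a fixed light: diffuse + specular term."""
    n = normal / np.linalg.norm(normal)
    l = FIXED_LIGHT_DIR / np.linalg.norm(FIXED_LIGHT_DIR)
    diffuse = albedo * FIXED_LIGHT_RGB * max(np.dot(n, l), 0.0)
    h = (l + view_dir) / np.linalg.norm(l + view_dir)  # half vector
    specular = FIXED_LIGHT_RGB * max(np.dot(n, h), 0.0) ** shininess
    return diffuse + specular

# Dim the real lights, open the blinds, light a candle - this function
# returns exactly the same color, because the real room is not an input.
color = shade(np.array([0.0, 1.0, 0.0]),
              np.array([0.0, 1.0, 0.0]),
              np.array([0.5, 0.1, 0.1]))
```

The bug is not in the math; it is in the missing input. Any shading model fed a fixed, invented light will disagree with the camera image around it.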

Of course, not knowing where your light comes from isn’t much help if you want to cast a shadow somewhere. A whole bunch of AR applications “solve” this problem by simply drawing a shadow blob beneath the object, secretly hoping that the user doesn’t flip it on its head or place it on an edge or non-planar surface, where the shadow would float in mid-air.
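The blob trick is about as simple as it sounds. Here is a sketch of one plausible variant (the falloff and names are my own invention): splat a soft dark disc onto an assumed ground plane directly beneath the object, with no knowledge of the real geometry at all:

```python
import numpy as np

def shadow_blob_weight(px, py, center_x, center_y, radius):
    """Darkening factor in [0, 1] for a pixel on the *assumed* ground
    plane: 1 at the blob center, fading linearly to 0 at the radius.
    Purely cosmetic - no real surface geometry is consulted."""
    d = np.hypot(px - center_x, py - center_y)
    return max(0.0, 1.0 - d / radius)

# Darken the camera pixel by up to 60% under the object.
def apply_blob(camera_rgb, weight, strength=0.6):
    return camera_rgb * (1.0 - strength * weight)

w_center = shadow_blob_weight(0.0, 0.0, 0.0, 0.0, 10.0)   # fully shadowed
w_edge   = shadow_blob_weight(10.0, 0.0, 0.0, 0.0, 10.0)  # unshadowed
```

Because the blob lives on an imaginary flat plane, tipping the object over a table edge leaves the disc hugging that plane - the floating-shadow artifact described above.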

If you have an iOS device, I invite you to check out the Pointcloud browser: the app is initialized with a sweep across a surface that has enough texture to extract features for tracking. After that, you can experience first-hand how fixed local lighting and missing depth information will destroy your mixed-reality experience in seconds. Simply hold your hand in front of the camera or switch off the light in the room. (The app, by the way, is fantastic - you can write AR applications with small XML descriptors in no time!)

So what’s the point here? Tracking your object with some kind of anchor in the real scene is called geometric registration. This step is important, but not sufficient: you need to track the rest of reality with it. If we want a coherent appearance, we need to look just as closely at photometric registration.
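As a hint of what even a crude first step toward photometric registration could look like: estimate the overall brightness of the real scene from the camera frame itself and scale the virtual lighting with it, so that dimming the room at least dims the virtual object too. This is a toy sketch under my own assumptions - a real system would estimate light direction and color as well, e.g. via light probes or spherical harmonics:

```python
import numpy as np

def estimate_ambient_scale(frame_rgb, reference=0.5):
    """Map the mean luminance of a camera frame (values in [0, 1]) to a
    scale factor for the virtual light. Rec. 709 luma weights; the
    reference level and clipping range are arbitrary choices."""
    luminance = frame_rgb @ np.array([0.2126, 0.7152, 0.0722])
    return float(np.clip(luminance.mean() / reference, 0.0, 2.0))

# Two synthetic frames standing in for camera input.
bright_room = np.full((4, 4, 3), 0.8)
dark_room   = np.full((4, 4, 3), 0.1)
scale_bright = estimate_ambient_scale(bright_room)
scale_dark   = estimate_ambient_scale(dark_room)
# scale_bright > scale_dark: the virtual object now reacts, however
# coarsely, when someone switches off the light.
```

It ignores direction, color and shadows entirely, but unlike a fixed light source it is at least coupled to reality - which is the whole argument of this post.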