I recently had a conversation with a client regarding the use of immersive digital media in engineering and design. We discussed how it can enhance research, product development and training, and increase their overall impact. Over the last 8 years I have conducted a variety of experiments exploring immersive media, such as recording 360 video, drawing with Google Tilt Brush and mixing ambisonic audio. This article, the first of two posts exploring immersive media, provides an overview of terminology.
Traditional vs Immersive Media
The term traditional media usually refers to television, radio, newspapers and cinema. Content is presented to the audience in a passive manner, meaning there is little or no control over presentation or narrative. In contrast, immersive media offers interactivity and an enhanced sensory experience using advanced hardware and software, such as VR headsets or headphones designed to emulate spatial audio. Some technologies incorporate the simulation of touch and smell. Immersive experiences are designed to be consumed in a non-linear, participatory manner where choices and physical interaction affect the narrative and environment.
Below is an example of a 360 video uploaded to YouTube with a resolution of 8K. The original video was recorded with a high resolution camera. It is important to note that only a portion of the 7680 x 3840 pixels recorded by the 360 camera will be visible to the viewer at any given time (depending upon the Field Of View), which reduces the displayed resolution to something approximating full HD (1920 x 1080). If viewed on a desktop PC in full screen, you can use the mouse to direct the point of view by clicking and dragging in the desired direction.
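To make that drop in apparent resolution concrete, here is a minimal sketch that estimates how many of the recorded pixels sit inside the viewer's field of view at any moment. The 100 x 90 degree viewport is an assumed value chosen only for illustration, and the linear scaling ignores projection distortion, so treat the result as a rough order-of-magnitude figure rather than a precise one.

```python
# Rough estimate of the visible portion of an 8K equirectangular frame.
# The 100 x 90 degree field of view is an assumption for illustration;
# actual values vary by headset and player window.

SOURCE_WIDTH, SOURCE_HEIGHT = 7680, 3840   # full 8K equirectangular frame
H_FOV_DEG, V_FOV_DEG = 100, 90             # assumed horizontal / vertical FOV

# An equirectangular frame spans 360 degrees horizontally and 180 vertically,
# so a simple linear scaling gives the approximate visible slice.
visible_width = SOURCE_WIDTH * H_FOV_DEG / 360
visible_height = SOURCE_HEIGHT * V_FOV_DEG / 180

print(f"Visible region ~ {visible_width:.0f} x {visible_height:.0f} pixels")
# Visible region ~ 2133 x 1920 pixels: far closer to full HD than to 8K.
```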
Immersive experiences are designed to increase the sense of realism and there are many different formats available, each with its own characteristics and advantages. Common media formats are:
360 Video
360 video can be viewed in a Virtual Reality headset such as Meta Quest 3 and, when uploaded to platforms such as YouTube, is also available on a desktop or mobile device. The viewer interacts with the content within a VR headset by moving their head or on a desktop by ‘clicking and dragging’ to change the point of view using an input device such as a mouse. It is also possible to achieve similar interactions using the gyro technology on a mobile phone or a screen with touch capability. YouTube can display interactive 360 video in VR, on desktop and mobile.
Video is recorded with a camera utilising a series of wide angle lenses designed to capture the surrounding environment. The footage from each lens is stitched together using compatible software, which may be provided by the manufacturer, such as Insta360 Studio, or by a third party, such as Mistika VR.
360 video is usually recorded in the same equirectangular format as 360 photography. Current cameras record video at 6K to 8K or higher, which results in gigabytes of data per minute, with the Insta360 Titan recording 11K (10K in 3D). The challenges posed in producing 360 video, such as hiding microphones, lights and other equipment, have led to a decline in use during recent years in favour of 3D VR180 video. However, the format remains popular in real estate, tourism and journalism, where a view of the entire environment is important.
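As a small illustration of the equirectangular format mentioned above, the sketch below maps a viewing direction (yaw and pitch) to a pixel coordinate in an 8K equirectangular frame. The frame size and the example direction are assumptions for illustration; real stitching tools such as Insta360 Studio or Mistika VR additionally handle lens correction, blending and stabilisation on top of this basic projection.

```python
def equirectangular_pixel(yaw_deg, pitch_deg, width=7680, height=3840):
    """Map a viewing direction to a pixel in an equirectangular frame.

    yaw_deg:   rotation around the vertical axis, -180 to 180 (0 = straight ahead)
    pitch_deg: elevation, -90 (straight down) to 90 (straight up)
    """
    # Longitude maps linearly across the frame width, latitude across its height.
    u = (yaw_deg / 360.0 + 0.5) * width
    v = (0.5 - pitch_deg / 180.0) * height
    return int(u) % width, min(max(int(v), 0), height - 1)

# Example: a point 45 degrees to the right and 30 degrees above the horizon
print(equirectangular_pixel(45, 30))   # (4800, 1280)
```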
VR180 Video
VR180 covers half the horizontal viewing angle of 360 video, with just the front-facing 180 degrees available. It is designed to be consumed within a VR headset, viewed on a screen with active glasses or converted to anaglyph for viewing with red / cyan glasses. Whilst there are only 180 degrees of recorded content, most VR headsets have a viewing angle of around 90 degrees, which still provides a realistic sense of immersion.
Content is typically recorded using 2 wide angle lenses covering a 180 degree viewing angle. Both lenses face the same direction, with their centres separated by approximately the same distance as human eyes. When converted for use within a VR headset, the video provides realistic depth. HumanEyes Technologies released the Vuze XR in 2018, which had two 4K cameras that could be used in VR180 mode or 360 capture mode. A recent addition to the VR180 camera market is the CALF 3D VR180.
This format is used in vlogging and entertainment such as storytelling. However, as mentioned in the previous section, 360 video is still used when it is useful to see an environment in its entirety.
Virtual Reality (VR)
Virtual Reality experiences are designed to facilitate interaction where location, physicality and changes to the environment have meaningful consequences. They are usually viewed within a VR headset such as Vive XR Elite or Meta Quest 3 using controllers or hand tracking. However, platforms such as Spatial and Horizon Workrooms allow users access via a desktop environment as a ‘window’ to the virtual world. The user is able to shape the narrative and environment through their choices, which may involve changing the state or position of objects within a space. Many VR applications are created with software such as Unity or Unreal Engine.
Examples of immersive VR applications range from simple simulations of fairground games in Nvidia’s VR Funhouse and the production of 3D art using Google Tilt Brush to the complexity associated with piloting an aircraft in Flight Simulator. Other examples may be found on Meta’s App Store.
The term ‘Virtual Reality’ was first used by American computer scientist Jaron Lanier in the 1980s as a title for his research project. He is considered to be the ‘father of VR’ because of his groundbreaking work in the field.
Augmented Reality (AR)
Augmented Reality is technology that overlays visuals, data or audio onto the real world, enhancing the user’s perception of the environment. One example of this is Google Maps Live View, where the camera on a mobile phone is used to show a live view of the road ahead whilst superimposing directions and other visual guides. Another notable project is Glass, Google’s answer to Augmented Reality glasses. The project began in 2010, with the wearable tech available in 2014. It was discontinued in 2015 due to safety and privacy concerns, along with a lack of uptake in the healthcare sector (see this article for more information on the cancellation).
Mixed Reality (MR)
Mixed Reality is similar to Augmented Reality but allows users to interact with the layers or objects superimposed upon the environment around them. Meta Quest 3’s MR demo First Encounters is a great example of this. The surrounding environment is displayed on the headset in real time using front facing cameras, whilst objects are overlaid onto the display to create game elements that can be interacted with.
Extended Reality (XR)
This umbrella term incorporates VR, MR and AR, referring to these technologies and experiences collectively.
Ambisonic Audio
Ambisonics is an audio technology that uses hardware and software capable of rendering spatial audio in Virtual Reality, Augmented Reality and Mixed Reality. As few as 4 audio channels can be used to represent sound within a virtual space. As the viewer’s head changes direction, or objects emitting sound move within a space, the audio is adjusted in a realistic manner to reflect the effect of these movements on the perceived sound. It is also possible to experience ambisonic audio in a limited manner when viewing 360 video on a desktop PC or mobile device by moving the point of view. The use of 4 audio channels to simulate spatial sound is referred to as First Order. However, it is possible to use more channels to enhance the effect, in a similar way to the improvement of 7.1 surround sound over 5.1.
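As a minimal sketch of what those 4 First Order channels represent, the function below encodes a mono sample into the classic B-format channels (W, X, Y, Z) using the traditional FuMa weighting. It ignores distance, room acoustics and the decoding stage that a headset or speaker array would apply on playback.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode a mono sample into first-order ambisonic B-format (W, X, Y, Z).

    W carries the omnidirectional component (scaled by 1/sqrt(2) in the
    traditional FuMa convention); X, Y and Z carry the front/back,
    left/right and up/down directional components.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample / math.sqrt(2)
    x = sample * math.cos(az) * math.cos(el)
    y = sample * math.sin(az) * math.cos(el)
    z = sample * math.sin(el)
    return w, x, y, z

# A source directly to the listener's left (azimuth 90 degrees, elevation 0)
# ends up almost entirely in the Y channel.
print(encode_first_order(1.0, 90, 0))
```

On playback these channels are rotated to follow the listener’s head orientation and then decoded, either to a speaker layout or binaurally for headphones, which is what produces the head-tracked effect described above.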
For more information on ambisonics, see this excellent summary of ambisonic audio from Waves.com.
Olfactory
Olfaction, or the olfactory sense, is the sense of smell. There are devices capable of stimulating the olfactory sense as part of an immersive experience. One example is the Smell Engine, described as “a system for artificial odour synthesis in virtual environments”.
Gustatory
Gustatory perception refers to the sense of taste. It is possible to trick the human brain into thinking that food is being consumed using stimulation by computer controlled plates placed upon the tongue. In 2013 a digital lollipop was created by researchers at the National University of Singapore that stimulated sweet, sour, salty and bitter tastes.
Summary
Immersive digital media has the potential to elevate and enhance the process of storytelling, communicating research ideas, developing products and providing training. The last 10 years have seen rapid growth in hardware and software technologies at both professional and consumer levels, increasing the number of creators and the amount of immersive content. Despite these advances, many challenges remain, including the size, weight, cost and uptake of VR headsets, the cost and quality issues associated with 360 and VR180 cameras, and the technical complexities of generating spatial audio. There are positive signs too: the release of the Apple Vision Pro, camera releases from manufacturers such as Insta360 and continued support for immersive content in Adobe’s Creative Cloud.