Photogrammetry and Volumetric Capture

What is 3D content?
What is 3D storytelling content
that is not solely primitive objects or manufactured, man-made models?

What about organic-looking content,
like natural environments, trees, mountains, rocks, animals, and people?

That's where photogrammetry and volumetric capture come in.
They are two new ways to record, scan, or capture real-world information and place it into the virtual world as a 3D set of polygons that you can walk around and view from every angle.

Three methods for taking something from the real world and displaying it in 3D:

1. An animated 3D model
This is the traditional method.
The model could be modeled in Maya or Blender.
It could be captured through photogrammetry.
It could be animated by performing motion capture on an actor and retargeting that animation onto the model.
Model capture method:
Photogrammetry is many photos that are stitched together into one model.
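A minimal sketch of the core photogrammetry step, assuming OpenCV, two overlapping photos, and made-up camera intrinsics; real structure-from-motion pipelines chain this across hundreds of photos and then densify the result:

    # Two-view sparse reconstruction: the step photogrammetry repeats
    # across many photos. Filenames and the intrinsics K are assumed values.
    import cv2
    import numpy as np

    img1 = cv2.imread("photo_a.jpg", cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread("photo_b.jpg", cv2.IMREAD_GRAYSCALE)
    K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])

    # 1. Find matching feature points between the two photos.
    orb = cv2.ORB_create(5000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # 2. Recover how the camera moved between the two shots.
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

    # 3. Triangulate the matches into a sparse 3D point cloud.
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at origin
    P2 = K @ np.hstack([R, t])                         # second camera pose
    pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
    cloud = (pts4d[:3] / pts4d[3]).T                   # Nx3 points
    print(cloud.shape)

Meshing and texturing those points (later pipeline stages) produce the final model.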
 

2. Volumetric Capture
This method is one transform with a set of vertices (corners) displayed over time.
The vertices change per frame.
Volume capture method:
One RGB photograph & depth (via infrared) per frame.
A person becomes about 20,000 polygons per frame. 
Textures can drop to 1,000 as of 2014 (so there is a limit on how close viewers can get, or how large the model can scale).
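A minimal sketch of the per-frame step, assuming a depth map aligned to the RGB image and a pinhole camera model with assumed intrinsics:

    # Back-project one RGB-D frame into a colored point cloud: what a
    # volumetric rig produces each frame before meshing. The intrinsics
    # (fx, fy, cx, cy) are assumed values for a 640x480 depth sensor.
    import numpy as np

    def rgbd_to_points(depth, rgb, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
        """depth: HxW in meters, rgb: HxWx3. Returns (Nx3 points, Nx3 colors)."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth
        x = (u - cx) * z / fx    # pinhole back-projection
        y = (v - cy) * z / fy
        valid = z > 0            # infrared sensors return 0 where depth is unknown
        points = np.stack([x[valid], y[valid], z[valid]], axis=-1)
        return points, rgb[valid]

A surface-reconstruction pass (e.g. marching cubes or Poisson reconstruction) then turns each frame's points into the roughly 20,000-polygon mesh noted above.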


This is a linear experience viewed from any angle.
But it can become interactive (see below).

3. Computer Vision
This approach is still experimental right now.
R-CNN and pose estimation
can extract the skeleton, then make a model and its texture.
One open question: what is the texture on the side the cameras never saw?
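A minimal sketch of the skeleton-extraction step, assuming a recent torchvision with its pretrained Keypoint R-CNN (an R-CNN variant that performs 2D pose estimation); the input image name is hypothetical:

    # Extract a 2D skeleton from a photo with Keypoint R-CNN.
    import torch
    import torchvision

    model = torchvision.models.detection.keypointrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    img = torchvision.io.read_image("person.jpg").float() / 255.0  # CxHxW in [0, 1]
    with torch.no_grad():
        detections = model([img])[0]

    # 17 COCO keypoints per detected person (nose, shoulders, elbows,
    # wrists, hips, knees, ankles...), each as (x, y, visibility).
    skeleton = detections["keypoints"][0]
    print(skeleton.shape)  # torch.Size([17, 3])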
This can be coupled with procedural animation (sketched below).
Procedural animation means animating based on the structure of the skeleton. The machine intelligence knows the form, so it should know how the object, animal, or person should behave. The traditional method of animating, manually, makes sense but is labor-intensive; coding the animation is more efficient but has a higher barrier to entry.
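A toy sketch of procedural animation with a hypothetical two-joint arm: the joint angles come from a formula evaluated each frame, not from hand-keyed poses:

    # Toy procedural animation: code, not keyframes, drives the skeleton.
    # The joints, speed, and amplitude are hypothetical values.
    import math

    skeleton = {"shoulder": 0.0, "elbow": 0.0}  # joint angles in radians

    def swing_arm(t, speed=2.0, amplitude=0.6):
        """Drive an arm swing from a sine wave instead of keyframes."""
        skeleton["shoulder"] = amplitude * math.sin(speed * t)
        # The elbow follows the shoulder with a phase lag, like a real swing.
        skeleton["elbow"] = 0.5 * amplitude * math.sin(speed * t - 0.8)

    for frame in range(5):
        swing_arm(frame / 30.0)  # a 30 fps timeline
        print(frame, {k: round(v, 3) for k, v in skeleton.items()})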


How volumetric can become interactive:

Transition method between views.

Loop one view (one move / one sequence)

Allow a user to navigate to the next view (move / sequence)

Allow a user to navigate non-linearly - this requires that the transitions blend, tween, or use Temporal Coherence (a sketch of such a player follows below).
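A minimal sketch of that non-linear navigation, with looping clips and a tweened (linearly blended) transition; the clip names and per-frame vertex data are hypothetical:

    # Navigate between looping volumetric clips; during a transition the
    # outgoing and incoming vertex positions are linearly tweened.
    class VolumetricPlayer:
        def __init__(self, clips, blend_frames=10):
            self.clips = clips      # {"idle": [frame, ...], ...}; each frame
            self.current = "idle"   # is a flat list of vertex coordinates
            self.next = None
            self.frame = 0
            self.blend_frames = blend_frames
            self.blend_t = 0

        def go_to(self, clip_name):
            """The user picked a new sequence; start blending toward it."""
            self.next = clip_name
            self.blend_t = 0

        def tick(self):
            """Advance one frame; return the vertex data to render."""
            clip = self.clips[self.current]
            verts = clip[self.frame % len(clip)]      # loop the current view
            if self.next is not None:
                target = self.clips[self.next][0]     # incoming clip's start
                a = self.blend_t / self.blend_frames  # tween weight 0 -> 1
                verts = [(1 - a) * v0 + a * v1 for v0, v1 in zip(verts, target)]
                self.blend_t += 1
                if self.blend_t >= self.blend_frames:
                    self.current, self.next, self.frame = self.next, None, 0
            self.frame += 1
            return verts

    player = VolumetricPlayer({"idle": [[0.0, 1.0]], "walk": [[1.0, 2.0]]})
    player.go_to("walk")
    print(player.tick())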

Example of Temporal Coherence: if I am sitting in my chair moving my arms, my chest might be still for 10 seconds, so the same set of polygons on my chest can be reused for those 10 seconds - effectively pausing the timeline of frames in that one 3D region, which increases performance and decreases the polygon count.
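A minimal sketch of that optimization, assuming the mesh is already split into named regions and that a region whose vertices move less than a threshold can be reused:

    # Temporal coherence: reuse a region's polygons while it stays still.
    # Region names, the threshold, and the frame data are hypothetical.
    import numpy as np

    def coherent_frames(frames, threshold=0.001):
        """frames: list of {region: Nx3 vertex array}. Still regions are
        replaced by a reference to the cached copy, shrinking the data."""
        cache, out = {}, []
        for frame in frames:
            packed = {}
            for region, verts in frame.items():
                if region in cache and np.abs(verts - cache[region]).max() < threshold:
                    packed[region] = cache[region]  # still: reuse old polygons
                else:
                    cache[region] = verts           # moved: store fresh polygons
                    packed[region] = verts
            out.append(packed)
        return out

    # Ten frames where the chest is still but the arm moves: the chest mesh
    # is stored once and referenced nine times.
    chest = np.zeros((100, 3))
    frames = [{"chest": chest.copy(), "arm": np.full((50, 3), i * 0.1)}
              for i in range(10)]
    packed = coherent_frames(frames)
    print(packed[5]["chest"] is packed[0]["chest"])  # True: shared polygons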


Composing Full VR Environments

Concentric circles - with the action in the center.

Volumetric content in the center - constantly changing.

Photogrammetry surrounding it.

A 360 skybox sphere around everything - once something is about 10 meters away from you, you can no longer perceive stereo depth (the same way game development engines use level-of-detail techniques to "flatten" faraway objects, rendering one asset for both eyes in stereoscopic images).
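A minimal sketch of that distance test, with the 10 m cutoff and scene contents as illustrative values:

    # Beyond the cutoff, stereo disparity is imperceptible, so one rendered
    # asset can serve both eyes. Distances and names are illustrative.
    STEREO_CUTOFF_M = 10.0

    def render_passes(objects, cutoff=STEREO_CUTOFF_M):
        """objects: list of (name, distance_m) -> required render passes."""
        passes = {}
        for name, distance in objects:
            if distance < cutoff:
                passes[name] = ["left_eye", "right_eye"]  # true stereo parallax
            else:
                passes[name] = ["mono"]  # flattened: shared by both eyes
        return passes

    scene = [("volumetric_actor", 2.0), ("photogrammetry_rocks", 6.0),
             ("mountain_skybox", 500.0)]
    print(render_passes(scene))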

 


Companies in this field

Microsoft

Intel

4D Views

8i (its capture volume has a small bounding box, about 2 m tall by 1 m wide)