Structured-Light Depth Camera
Structured-light depth cameras project a known pattern (IR dot pattern, fringe, or binary code) onto the scene and infer depth from the pattern deformation observed by a camera offset from the projector. For coded structured light (e.g., Kinect v1), depth is computed via triangulation from the correspondence between projected and observed pattern features. For phase-shifting methods, multiple fringe patterns encode depth in the local phase. Primary challenges include occlusion along the projector-camera baseline, ambient-light interference, and errors at depth discontinuities.
Tags: triangulation · Gaussian · phase unwrap · CMOS_IR
Forward-Model Signal Chain
Each primitive represents a physical operation in the measurement process. Arrows show signal flow left to right.
S(pattern) → Π(triangulation) → D(g, η₁)
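The chain can be exercised numerically. A minimal sketch, assuming a fronto-parallel scene, illustrative focal-length and baseline values (not from the source), and additive Gaussian noise for the detection stage D(g, η₁):

```python
import numpy as np

# Illustrative calibration values (assumptions, not from the source)
f = 600.0   # focal length, pixels
B = 0.05    # projector-camera baseline, meters

def forward_model(depth, noise_sigma=0.5, rng=None):
    """S(pattern) -> Pi(triangulation) -> D(g, eta_1):
    map true depth to a noisy disparity measurement."""
    if rng is None:
        rng = np.random.default_rng(0)
    disparity = f * B / depth  # Pi: triangulation geometry
    # D: additive Gaussian noise on the measured disparity
    return disparity + rng.normal(0.0, noise_sigma, np.shape(depth))

depth_true = np.array([0.5, 1.0, 2.0])   # meters
disp_meas = forward_model(depth_true)
depth_est = f * B / disp_meas            # invert the geometry to recover depth
```

Inverting the same geometry on the noisy disparity recovers depth to within a few percent at these signal levels, which is the basic sensitivity structure the mismatch parameters below perturb.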
Benchmark Variants & Leaderboards
Structured Light
Standard Leaderboard (Top 10)
| # | Method | Score | PSNR (dB) | SSIM | Trust | Source |
|---|---|---|---|---|---|---|
| 🥇 | PhaseFormer | 0.802 | 34.25 | 0.963 | ✓ Certified | Fringe pattern transformer, 2024 |
| 🥈 | FPP-Net | 0.779 | 33.13 | 0.954 | ✓ Certified | Feng et al., Opt. Lasers Eng. 2019 |
| 🥉 | Gray Code | 0.591 | 25.72 | 0.824 | ✓ Certified | Inokuchi et al., Appl. Opt. 1984 |
| 4 | Phase Shifting | 0.564 | 24.87 | 0.798 | ✓ Certified | Srinivasan et al., Appl. Opt. 1984 |
Mismatch Parameters (3)
| Name | Symbol | Description | Nominal | Perturbed |
|---|---|---|---|---|
| baseline | Δb | Projector-camera baseline error (mm) | 0 | 0.5 |
| pattern_distortion | ΔP | Pattern distortion (%) | 0 | 1.0 |
| ambient_ir | I_amb | Ambient IR interference (%) | 0 | 3.0 |
Reconstruction Triad Diagnostics
The three diagnostic gates (G1, G2, G3) characterize how reconstruction quality degrades under different error sources. Each gate's bar shows the relative attribution of degradation to its error source.
Model: triangulation — Mismatch modes: occlusion, ambient light, specular reflection, pattern interference, depth shadow
Noise: gaussian — Typical SNR: 15.0–35.0 dB
Requires: projector-camera extrinsics, camera intrinsics, pattern calibration, lens-distortion model
Modality Deep Dive
Principle
Structured-light depth sensing projects a known pattern (stripes, dots, coded binary patterns) onto the scene and observes the pattern deformation with a camera from a different viewpoint. The displacement (disparity) of each pattern element between projected and observed positions encodes the surface depth via triangulation. Dense depth maps are obtained by identifying pattern correspondences across the scene.
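As a concrete instance of the triangulation step, depth follows from the disparity between where a pattern element was projected and where it is observed, z = f·B/disparity. A sketch with illustrative focal-length and baseline values (assumptions, not tied to any specific device):

```python
# Illustrative calibration values (assumptions, not from the source)
f = 580.0   # focal length, pixels
B = 0.075   # projector-camera baseline, meters

def depth_from_correspondence(x_proj, x_cam):
    """Depth from the disparity between the column where a pattern
    element was projected (x_proj) and the column where the offset
    camera observes it (x_cam)."""
    disparity = x_proj - x_cam   # pixels
    return f * B / disparity     # meters

# A dot projected at column 400 observed at column 370 -> 30 px disparity
print(depth_from_correspondence(400.0, 370.0))   # ~1.45 m
```

Repeating this per identified pattern element (and interpolating between them) yields the dense depth map described above.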
How to Build the System
Arrange a projector (DLP or laser dot projector) and camera with a known baseline separation (5-25 cm) and convergent geometry. Calibrate the projector-camera system (intrinsics and extrinsics) using a planar calibration target. For temporal coding (Gray code), project multiple patterns sequentially. For spatial coding (single-shot, e.g., the Apple Face ID dot projector), use a diffractive optical element to generate a unique dot pattern.
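For the temporal-coding route, the projected stripe sequence can be generated directly from the binary-reflected Gray code. A sketch (pattern width and bit count are illustrative):

```python
import numpy as np

def gray_code_patterns(width, n_bits):
    """Generate the n_bits binary stripe patterns for temporal Gray coding.
    Pattern k is the k-th bit (MSB first) of each column's Gray code, so
    adjacent columns differ in exactly one pattern."""
    cols = np.arange(width)
    gray = cols ^ (cols >> 1)  # binary-reflected Gray code of each column
    shifts = np.arange(n_bits - 1, -1, -1)[:, None]
    bits = (gray[None, :] >> shifts) & 1
    return bits.astype(np.uint8)   # shape (n_bits, width)

patterns = gray_code_patterns(1024, 10)   # 10 patterns cover 1024 columns
```

The single-bit-change property is what makes Gray coding robust at stripe boundaries: a misclassified pixel lands at most one column away.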
Common Reconstruction Algorithms
- Gray code + phase shifting (sequential multi-pattern decoding)
- Single-shot coded pattern matching (speckle or pseudo-random dot decoding)
- Phase unwrapping for sinusoidal fringe projection
- Stereo matching applied to textured scenes (active stereo)
- Deep-learning depth estimation from structured-light patterns
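The decoding stage of the first item, for example, maps each pixel's observed Gray-code bit sequence back to a projector column index before triangulation. A minimal sketch (pure Python, MSB-first bit order assumed):

```python
def gray_to_index(bits):
    """Decode a Gray-code bit sequence (MSB first), as thresholded from
    the sequentially projected patterns at one pixel, back to the
    projector column index."""
    index = 0
    b = 0
    for g in bits:
        b ^= g                       # binary bit = XOR of Gray bits so far
        index = (index << 1) | b
    return index

# Column 5 has Gray code 0b111; decoding the bits recovers 5
assert gray_to_index([1, 1, 1]) == 5
```

The recovered column index, together with the pixel's own column, gives the disparity that feeds the triangulation step.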
Common Mistakes
- Ambient light washing out the projected pattern, losing depth information
- Specular (shiny) surfaces reflecting the projector into the camera, causing erroneous depth
- Occlusion zones where the projector illuminates but the camera cannot see (shadowed regions)
- Insufficient projector resolution limiting the achievable depth precision
- Color/reflectance variations in the scene altering perceived pattern intensity
How to Avoid Mistakes
- Use NIR projector + camera with ambient-light rejection filter
- Apply polarization filtering or spray surfaces with matte coating for calibration
- Add a second camera or projector to reduce occlusion zones
- Use high-resolution projectors (1080p+) and fine patterns for sub-mm precision
- Use binary or phase-shifting patterns that are robust to reflectance variations
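The reflectance robustness in the last item can be made concrete: for N equally spaced phase shifts, both the ambient offset A and the reflectance factor B cancel in the arctangent phase estimate. A NumPy sketch with illustrative values:

```python
import numpy as np

def phase_from_shifts(images):
    """Recover fringe phase from N phase-shifted intensities
    I_k = A + B*cos(phi - 2*pi*k/N). Both A (ambient offset) and
    B (local reflectance) cancel in the arctangent ratio."""
    n = len(images)
    s = sum(I * np.sin(2 * np.pi * k / n) for k, I in enumerate(images))
    c = sum(I * np.cos(2 * np.pi * k / n) for k, I in enumerate(images))
    return np.arctan2(s, c)

# Same phase, very different offset A and reflectance B -> same estimate
phi = 1.2
for A, B in [(10.0, 1.0), (200.0, 50.0)]:
    imgs = [A + B * np.cos(phi - 2 * np.pi * k / 4) for k in range(4)]
    print(phase_from_shifts(imgs))   # ~1.2 in both cases
```

This is why 4-step (or higher-N) phase shifting tolerates scene albedo variations that would corrupt single-pattern intensity decoding.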
Forward-Model Mismatch Cases
- The widefield fallback applies spatial blur, but structured-light depth sensing projects known patterns and measures their deformation via triangulation — the depth-encoding pattern correspondence between projector and camera is absent
- Structured light extracts depth from disparity between projected and observed pattern positions (d = f*B/disparity) — the widefield blur produces no disparity information and cannot encode surface depth
How to Correct the Mismatch
- Use the structured-light operator that models pattern projection (Gray code, sinusoidal fringe, or speckle) and camera observation from a different viewpoint: depth is encoded in pattern deformation due to surface geometry
- Extract depth maps using pattern decoding (Gray code → correspondence → triangulation) or phase unwrapping (sinusoidal fringe → depth) with calibrated projector-camera geometry
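The fringe-projection path in the second bullet can be sketched with NumPy's one-dimensional unwrapper (a toy scanline with an assumed linear phase ramp; real systems unwrap in 2-D and then apply the calibrated phase-to-depth mapping):

```python
import numpy as np

# Simulated wrapped phase along one scanline: the true phase grows past
# 2*pi, but the arctangent step only returns values in (-pi, pi]
x = np.linspace(0.0, 1.0, 200)
true_phase = 6 * np.pi * x                   # three fringe periods
wrapped = np.angle(np.exp(1j * true_phase))  # wrap into (-pi, pi]

unwrapped = np.unwrap(wrapped)               # remove the 2*pi jumps
# depth is then proportional to the unwrapped phase via the
# projector-camera calibration
```

`np.unwrap` succeeds here because adjacent samples differ by less than π; coarse sampling or depth discontinuities break that assumption, which is one source of the discontinuity errors noted above.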
Experimental Setup
- Hardware: Intel RealSense D435i / Apple TrueDepth / Kinect v1
- Pattern: pseudorandom IR dot pattern / fringe projection
- Wavelength: 850 nm
- Working range: 0.2-10.0 m
- Resolution: 1280x720
- 1.0
- Frame rate: 30 fps
- 55
Key References
- Geng, J., "Structured-light 3D surface imaging: a tutorial," Advances in Optics and Photonics 3, 128-160 (2011)
Canonical Datasets
- Middlebury stereo benchmark
- ETH3D multi-view stereo benchmark