Physics World Model — Modality Catalog
3 imaging modalities with descriptions, experimental setups, and reconstruction guidance.
Integral Photography
Description
Integral photography (IP), originally proposed by Lippmann in 1908, captures a light field using a fly-eye lens array (matrix of small lenses) where each lenslet records a small elemental image from a slightly different perspective. The array of elemental images encodes 3D scene information, enabling computational refocusing, depth estimation, and autostereoscopic 3D display. Compared to microlens-based plenoptic cameras, IP typically uses larger lenslets with correspondingly more pixels per lens. Reconstruction includes depth-from-correspondence between elemental images and 3D focal stack computation.
Principle
Integral photography (also known as integral imaging) uses a 2-D array of elemental lenses to capture multi-perspective views of a 3-D scene simultaneously. Each elemental lens records a small perspective image, and the full set encodes the 4-D light field. Computational reconstruction produces 3-D images that can be viewed from different angles or refocused without glasses.
How to Build the System
Place a 2-D microlens or lenslet array (pitch 0.5-1 mm, ~50-200 elements per side) at one focal length from a high-resolution sensor. Each lenslet forms a separate elemental image. For display: show the integral image on a high-resolution display with a matched output lenslet array. Calibrate lenslet grid alignment, individual lens focal lengths, and vignetting correction. Use telecentric imaging for uniform magnification.
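The numbers above can be sanity-checked before committing to hardware. A minimal sketch, assuming illustrative values for the sensor pixel pitch, sensor width, and lenslet focal length (none of these are prescribed by this entry):

```python
# Hypothetical design check for an integral-photography capture rig.
# All numeric values are illustrative assumptions, not recommendations.
sensor_pixel_pitch_um = 3.45   # assumed sensor pixel pitch
sensor_width_px = 8192         # assumed sensor width in pixels
lenslet_pitch_um = 500.0       # 0.5 mm lenslet pitch (low end of the range above)
lenslet_focal_mm = 3.0         # assumed lenslet focal length

pixels_per_lenslet = lenslet_pitch_um / sensor_pixel_pitch_um
lenslets_across = sensor_width_px / pixels_per_lenslet
lenslet_f_number = lenslet_focal_mm * 1000.0 / lenslet_pitch_um

print(f"pixels per elemental image (1-D): {pixels_per_lenslet:.0f}")
print(f"lenslets across the sensor:       {lenslets_across:.0f}")
print(f"lenslet f-number:                 f/{lenslet_f_number:.1f}")
```

With these assumed numbers each elemental image spans roughly 145 pixels and about 56 lenslets fit across the sensor, consistent with the "larger lenslets, more pixels per lens" characterization in the description.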
Common Reconstruction Algorithms
- Computational refocusing via pixel rearrangement and summation (see the sketch after this list)
- Depth estimation from elemental image disparity analysis
- 3-D scene reconstruction from integral images
- Super-resolution integral imaging (combining multiple shifted captures)
- Deep-learning integral image reconstruction and view synthesis
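A minimal sketch of refocusing by pixel rearrangement and summation, assuming the elemental images have already been cropped out of the raw capture into a regular (rows, cols, height, width) array; real pipelines add per-lenslet calibration and sub-pixel shifts:

```python
import numpy as np

def refocus_integral(elemental, shift_per_lens):
    """Shift-and-sum refocusing over a grid of elemental images.

    elemental: (n_rows, n_cols, h, w) array of elemental images cut out of
        the raw integral photograph.
    shift_per_lens: per-lenslet disparity in pixels; sweeping this value
        sweeps the synthetic focal plane through the scene.
    """
    n_rows, n_cols, h, w = elemental.shape
    cy, cx = (n_rows - 1) / 2.0, (n_cols - 1) / 2.0
    accum = np.zeros((h, w), dtype=np.float64)
    for i in range(n_rows):
        for j in range(n_cols):
            dy = int(round((i - cy) * shift_per_lens))
            dx = int(round((j - cx) * shift_per_lens))
            accum += np.roll(elemental[i, j], (dy, dx), axis=(0, 1))
    return accum / (n_rows * n_cols)
```

Calling the function with a range of shift_per_lens values produces a focal stack: objects at the depth matching a given shift appear sharp while the rest blur out.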
Common Mistakes
- Lenslet array not properly aligned with the sensor pixel grid
- Insufficient number of elemental lenses for the desired depth range
- Crosstalk between adjacent elemental images due to lens aberrations
- Not correcting for vignetting variations across the lenslet array
- Pseudoscopic (depth-reversed) images if reconstruction is not properly handled
How to Avoid Mistakes
- Align lenslet array to sensor with precision jigs and verify with calibration patterns
- Design lenslet pitch and focal length for the required depth-of-field
- Use high-quality molded lenslets and baffles to minimize crosstalk
- Apply per-lenslet calibration including vignetting and distortion correction
- Use computational depth inversion to correct pseudoscopic effects
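For the last point, one widely used depth-inversion fix is to rotate every elemental image by 180 degrees about its own centre, which converts a pseudoscopic (depth-reversed) reconstruction into an orthoscopic one. A sketch under the same (rows, cols, h, w) layout as the refocusing example above:

```python
import numpy as np

def correct_pseudoscopic(elemental):
    """Rotate each elemental image by 180 degrees (flip both image axes).

    elemental: (n_rows, n_cols, h, w) array of elemental images. Flipping
    each elemental image about its own centre reverses the reconstructed
    depth ordering, correcting the pseudoscopic effect.
    """
    return np.ascontiguousarray(elemental[:, :, ::-1, ::-1])
```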
Forward-Model Mismatch Cases
- The widefield fallback produces a single-perspective blurred image, but integral imaging captures multiple sub-aperture views through a lenslet array — each elemental image sees the scene from a slightly different angle
- Without the lenslet-array angular encoding, depth information (parallax between views) is lost — computational refocusing and 3D reconstruction from the fallback output are impossible
How to Correct the Mismatch
- Use the integral imaging operator that models the lenslet array: each microlens captures a different angular perspective, encoding the 4D light field on the 2D sensor
- Reconstruct depth maps via disparity estimation between elemental images, and perform computational refocusing using pixel rearrangement and summation across sub-aperture views
Experimental Setup — Signal Chain
Experimental Setup — Details
Key References
- Lippmann, G., 'La photographie intégrale', C. R. Acad. Sci. Paris 146, 446 (1908)
- Park et al., 'Recent progress in 3D imaging systems', J. Opt. Soc. Am. A 26, 2538 (2009)
Canonical Datasets
- ETRI integral imaging test set
- Middlebury multi-view stereo (adapted)
Light Field Imaging
Description
Light field imaging captures the full 4D radiance function L(x,y,u,v) describing both spatial position (x,y) and angular direction (u,v) of light rays. A microlens array placed before the sensor captures multiple sub-aperture views simultaneously, enabling post-capture refocusing, depth estimation, and perspective shifts. Each microlens images the objective's exit pupil, trading spatial resolution for angular resolution. The 4D light field can be processed with shift-and-sum for refocusing, disparity estimation for depth, or epipolar-plane image (EPI) analysis. Primary challenges include the inherent spatial-angular resolution tradeoff and microlens aberrations.
Principle
Light-field imaging captures both the spatial position and direction of light rays in a scene, recording the 4-D light field in the two-plane parameterization L(u,v,s,t), where (u,v) parameterize the aperture (angular) plane and (s,t) the spatial (image) plane; this is equivalent to the L(x,y,u,v) form used in the description above. This enables computational refocusing, depth estimation, and novel viewpoint synthesis from a single capture. A microlens array placed in front of the sensor trades spatial resolution for angular resolution.
How to Build the System
Place a microlens array (MLA) at the image plane of the main lens, one microlens focal length in front of the image sensor. Each microlens then records the angular distribution of light arriving at its spatial position (Lytro-style plenoptic camera). Alternative: use a camera array (e.g., 4×4 or 8×8 synchronized cameras) for higher angular and spatial resolution. Calibrate MLA alignment, microlens pitch, and main lens parameters.
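A minimal decoding sketch, assuming an idealized capture in which the lenslet grid is square, exactly aligned with the pixel grid, and covers an integer number of pixels per microlens; real Lytro-style decoders also estimate grid rotation, offset, and hexagonal packing from a white calibration image:

```python
import numpy as np

def extract_subapertures(raw, ml_px):
    """Re-sort a raw plenoptic image into sub-aperture views.

    raw: 2-D raw sensor image (grayscale, demosaicked).
    ml_px: number of pixels behind each microlens along one axis.
    Returns a 4-D light field lf[u, v, y, x], where lf[u, v] is the
    sub-aperture view through position (u, v) of the main-lens aperture.
    """
    h, w = raw.shape
    ny, nx = h // ml_px, w // ml_px
    raw = raw[:ny * ml_px, :nx * ml_px]
    # (ny, ml_px, nx, ml_px) -> (ml_px, ml_px, ny, nx)
    return raw.reshape(ny, ml_px, nx, ml_px).transpose(1, 3, 0, 2)
```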
Common Reconstruction Algorithms
- Shift-and-sum refocusing (synthetic aperture)
- Depth estimation from disparity between sub-aperture images (see the sketch after this list)
- Fourier slice theorem for light-field refocusing
- Light-field super-resolution (recovering spatial resolution lost to MLA)
- Deep-learning view synthesis (light field reconstruction from sparse views)
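A brute-force sketch of disparity estimation between two horizontally adjacent sub-aperture views, using sum-of-absolute-differences block matching; practical methods use all views, sub-pixel refinement, and regularization to handle textureless regions:

```python
import numpy as np

def disparity_sad(left, right, max_disp=16, patch=7):
    """Block-matching disparity between two adjacent sub-aperture views.

    left, right: 2-D float images from neighbouring angular positions.
    Returns an integer disparity map; depth is proportional to the
    sub-aperture baseline divided by the disparity.
    """
    h, w = left.shape
    half = patch // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            ref = left[y - half:y + half + 1, x - half:x + half + 1]
            costs = [np.abs(ref - right[y - half:y + half + 1,
                                        x - d - half:x - d + half + 1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = np.argmin(costs)
    return disp
```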
Common Mistakes
- Microlens array misaligned with sensor pixels, causing vignetting and crosstalk
- Insufficient angular samples for accurate depth estimation in textureless regions
- Not calibrating MLA-to-sensor alignment, producing decoding artifacts
- Misjudging the spatial-angular resolution trade-off inherent in the plenoptic design
- Ignoring diffraction effects at the microlens apertures
How to Avoid Mistakes
- Precisely align MLA to sensor with sub-pixel accuracy; use calibration targets
- Increase camera array density or use coded-aperture techniques for more angular samples
- Calibrate using a white image and point-source images for precise microlens grid mapping
- Design the system with the desired spatial-angular trade-off explicitly computed
- Use microlens diameters larger than the diffraction limit (> 10× wavelength)
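The last two points can be checked numerically. A minimal sketch with assumed example values for wavelength, microlens pitch, focal length, and pixel pitch (none prescribed by this entry):

```python
# Illustrative diffraction check for a plenoptic design; all numbers are
# assumed example values, not recommendations.
wavelength_um = 0.55        # green light
ml_pitch_um = 100.0         # microlens pitch (aperture diameter)
ml_focal_um = 400.0         # microlens focal length
pixel_pitch_um = 5.0        # sensor pixel pitch

f_number = ml_focal_um / ml_pitch_um
airy_diameter_um = 2.44 * wavelength_um * f_number  # diffraction-limited spot

print(f"microlens f-number:         f/{f_number:.1f}")
print(f"Airy spot diameter:         {airy_diameter_um:.2f} um")
print(f"pixels per microlens (1-D): {ml_pitch_um / pixel_pitch_um:.0f}")
print(f"spot size / pixel pitch:    {airy_diameter_um / pixel_pitch_um:.2f}")
```

With these assumed numbers the diffraction-limited spot is about one pixel across, so the 20 angular samples behind each microlens are not washed out by diffraction; shrinking the microlens or pixel pitch much further would start to blur adjacent angular samples together.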
Forward-Model Mismatch Cases
- The widefield fallback produces a single (64,64) image, but a light field camera captures both spatial and angular information via a microlens array — the output encodes multiple sub-aperture views for computational refocusing
- Without the angular dimension (directions of light rays), depth estimation from parallax and computational refocusing are impossible — the widefield model captures only a single perspective
How to Correct the Mismatch
- Use the light field operator that models the microlens array: each microlens captures light from different angular directions, producing an (x, y, u, v) 4D light field on the 2D sensor
- Reconstruct depth maps from sub-aperture disparity, perform computational refocusing via shift-and-sum, or apply light-field super-resolution to trade angular for spatial resolution
Experimental Setup — Signal Chain
Experimental Setup — Details
Key References
- Levoy & Hanrahan, 'Light field rendering', SIGGRAPH 1996
- Ng et al., 'Light field photography with a hand-held plenoptic camera', Stanford Tech Report CTSR 2005-02
Canonical Datasets
- HCI 4D Light Field Benchmark
- Stanford Lego Gantry Archive
- INRIA Lytro Light Field Dataset
Panorama Multi-Focus Fusion
Description
Multi-focus panoramic fusion combines images captured at different focal planes and/or different spatial positions to produce an all-in-focus image with extended depth of field and wide field of view. Focus stacking selects the sharpest regions from each focal plane using local contrast measures, then blends them via Laplacian pyramid fusion or wavelet-based methods. Panoramic stitching aligns overlapping images using feature matching (SIFT/SURF) and blends seams. Primary challenges include parallax at scene edges and focus measure ambiguity in low-texture regions.
Principle
Panoramic multi-focus fusion captures multiple images of the same wide scene at different focal distances and combines them to produce a single all-in-focus panorama with extended depth of field. Image stitching aligns overlapping frames using feature matching and homography estimation, while focus fusion selects the sharpest pixels from each focal plane.
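A minimal sketch of the sharpest-pixel selection step, using the local mean of the squared Laplacian response as the focus measure; it assumes the focus stack is already registered and omits the pyramid blending used in practice to hide selection seams:

```python
import numpy as np
from scipy import ndimage

def fuse_focus_stack(stack, window=9):
    """Per-pixel sharpest-slice selection over a registered focus stack.

    stack: (n_slices, h, w) array of grayscale frames at different focus
        distances. The focus measure is the squared Laplacian response
        averaged over a local window; each output pixel comes from the
        slice where that measure is largest.
    """
    lap = np.stack([ndimage.laplace(s.astype(np.float64)) for s in stack])
    focus = ndimage.uniform_filter(lap ** 2, size=(1, window, window))
    best = np.argmax(focus, axis=0)          # (h, w) index of sharpest slice
    return np.take_along_axis(stack, best[None], axis=0)[0]
```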
How to Build the System
Mount the camera on a motorized panoramic head (nodal-point rotation). For each pan/tilt position, capture a focus stack (3-10 images at different focus distances). Use a medium aperture (f/5.6-f/8) for each frame. Stitch overlapping views (roughly 30% horizontal overlap) and fuse the focus stack within each view tile. Calibrate the panoramic head so it rotates about the lens entrance pupil to minimize parallax.
Common Reconstruction Algorithms
- Laplacian pyramid focus fusion (weighted blending by local contrast)
- SIFT/SURF feature matching + RANSAC homography estimation (see the sketch after this list)
- Multi-band blending (Burt-Adelson) for seamless stitching
- Exposure fusion (Mertens et al.) for HDR panoramas
- Deep-learning focus stacking (DFDF, DeepFocus)
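A minimal pairwise-alignment sketch for the feature-matching step, using OpenCV's SIFT detector, Lowe's ratio test, and RANSAC homography estimation; it assumes OpenCV >= 4.4 (where SIFT lives in the main module) and leaves out global bundle adjustment and blending:

```python
import cv2
import numpy as np

def pairwise_homography(img_a, img_b, ratio=0.75):
    """Estimate the homography mapping img_a onto img_b.

    img_a, img_b: overlapping grayscale tiles (uint8).
    Returns the 3x3 homography and the RANSAC inlier mask.
    """
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(img_a, None)
    kp_b, des_b = sift.detectAndCompute(img_b, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H, inliers
```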
Common Mistakes
- Parallax errors from rotation not centered on the lens entrance pupil
- Ghosting from moving objects between sequential captures
- Color inconsistency between overlapping tiles due to auto-exposure variation
- Incomplete focus coverage leaving blurry regions in the final panorama
- Stitching artifacts at seam lines visible in the final output
How to Avoid Mistakes
- Use a calibrated panoramic head; verify no-parallax point for the specific lens
- Mask out or blend moving objects; capture quickly or use simultaneous multi-camera rigs
- Lock exposure, white balance, and focus (manual mode) across all tiles
- Plan focus distances to cover the entire depth range of the scene
- Use multi-band blending and choose seam lines in textureless regions
Forward-Model Mismatch Cases
- The widefield fallback applies Gaussian blur to a single image, but panoramic imaging involves geometric projection (cylindrical, spherical, or equirectangular) of the scene onto a wide field of view — the projection geometry is absent
- Panorama multi-focus fusion requires modeling focus variation across the wide FOV and stitching multiple exposures — the widefield single-frame model cannot capture the spatially varying focus or overlap regions
How to Correct the Mismatch
- Use the panorama operator that models the geometric projection (cylindrical or spherical warping) and focus-dependent blur across the wide field of view (a warping sketch follows this list)
- Reconstruct using image stitching with homography estimation, exposure fusion, and spatially varying deblurring that account for the correct projection geometry
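A minimal sketch of the cylindrical pre-warp mentioned in the first point, assuming a pinhole camera with focal length f_px given in pixels and nearest-neighbour resampling; production pipelines use calibrated intrinsics and proper interpolation:

```python
import numpy as np

def cylindrical_warp(img, f_px):
    """Project a pinhole image onto a cylinder of radius f_px.

    img: (h, w) or (h, w, c) image; f_px: focal length in pixels.
    After this warp, a panorama captured by rotating about the entrance
    pupil can be stitched with (nearly) pure horizontal translations.
    """
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    theta = (xs - cx) / f_px                   # angle on the cylinder
    height = (ys - cy) / f_px                  # height on the cylinder
    x_src = f_px * np.tan(theta) + cx          # back-project to image plane
    y_src = f_px * height / np.cos(theta) + cy
    out = np.zeros_like(img)
    valid = (x_src >= 0) & (x_src < w) & (y_src >= 0) & (y_src < h)
    out[valid] = img[y_src[valid].astype(int), x_src[valid].astype(int)]
    return out
```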
Experimental Setup — Signal Chain
Experimental Setup — Details
Key References
- Burt & Adelson, 'The Laplacian Pyramid as a Compact Image Code', IEEE Trans. Commun. 31, 532-540 (1983)
Canonical Datasets
- Lytro multi-focus test set