Coded Aperture Compressive Temporal Imaging (CACTI)

CACTI captures multiple video frames in a single camera exposure by modulating the scene with a shifting binary mask during the integration period. Each temporal frame sees a different mask pattern, and the detector integrates all modulated frames into a single 2D measurement. The forward model is y = sum_t M_t * x_t + n, where x_t is the scene frame at time t, M_t is the mask at time t, * denotes element-wise multiplication, and n is additive noise. Typical compression ratios are 8-48 frames per snapshot. Reconstruction exploits temporal correlation via GAP-TV, PnP-FFDNet, or deep networks such as STFormer and EfficientSCI.
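The forward model above can be sketched in a few lines of NumPy. This is a minimal illustration, not the benchmark's reference operator: the function name `cacti_forward`, the `noise_sigma` parameter, and the use of independent random binary masks are all assumptions made for the example.

```python
import numpy as np

def cacti_forward(frames, masks, noise_sigma=0.0, rng=None):
    """Simulate one snapshot: y = sum_t M_t * x_t + n.

    frames, masks: arrays of shape (T, H, W).
    noise_sigma: standard deviation of additive Gaussian noise (assumed parameter).
    """
    frames = np.asarray(frames, dtype=float)
    masks = np.asarray(masks, dtype=float)
    if frames.shape != masks.shape:
        raise ValueError("frames and masks must share shape (T, H, W)")
    # Element-wise modulation of each frame, then integration over time.
    y = np.sum(masks * frames, axis=0)
    if noise_sigma > 0:
        rng = np.random.default_rng() if rng is None else rng
        y = y + rng.normal(0.0, noise_sigma, size=y.shape)
    return y

rng = np.random.default_rng(0)
T, H, W = 8, 64, 64
frames = rng.random((T, H, W))                       # stand-in video in [0, 1)
masks = (rng.random((T, H, W)) > 0.5).astype(float)  # illustrative binary masks
y = cacti_forward(frames, masks)
print(y.shape)  # (64, 64)
```

Eight frames of 64x64 pixels collapse into one 64x64 measurement, which is the compression the reconstruction algorithms have to undo.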

Forward Model

Coded Aperture Temporal

Noise Model

Gaussian

Default Solver

GAP-TV

Sensor

CMOS

Forward-Model Signal Chain

Each primitive represents a physical operation in the measurement process. Arrows show signal flow left to right.

Spec Notation

M(m_t) → Σ_t → D(g, η₄)

Benchmark Variants & Leaderboards

Standard Leaderboard (Top 10)

#	Method	Score	PSNR (dB)	SSIM	Trust	Source
🥇	HiSViT-9	0.876	38.24	0.978	✓ Certified	HiSViT (ECCV 2024)
🥈	EfficientSCI	0.867	37.71	0.976	✓ Certified	EfficientSCI (CVPR 2023)
🥉	ELP-Unfolding	0.826	35.54	0.968	✓ Certified	ELP-Unfolding (2022)
4	RevSCI	0.786	33.49	0.956	✓ Certified	RevSCI (TPAMI 2022)
5	BIRNAT	0.715	30.26	0.921	✓ Certified	BIRNAT (TPAMI 2021)
6	GAP-TV	0.630	26.02	0.892	✓ Certified	GAP-TV (ICIP 2016)
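For readers who want to sanity-check numbers in the PSNR column, a common way to compute the metric is sketched below. The leaderboard's exact protocol is not stated here, so the peak value of 1.0 and the per-frame averaging in `video_psnr` are assumptions, and both function names are illustrative.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, peak]."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

def video_psnr(reference, estimate, peak=1.0):
    """Mean of per-frame PSNR over a (B, H, W) stack (assumed averaging rule)."""
    return float(np.mean([psnr(r, e, peak) for r, e in zip(reference, estimate)]))

ref = np.zeros((4, 16, 16))
rec = ref + 0.1  # constant error of 0.1 gives an MSE of 0.01
print(round(video_psnr(ref, rec), 2))  # 20.0
```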

Mismatch Parameters (6)

Name	Symbol	Description	Nominal	Perturbed
mask_dx	Δx	Mask lateral shift (pixels)	0	0.5
mask_dy	Δy	Mask vertical shift (pixels)	0	0.3
mask_theta	θ	Mask rotation (rad)	0	0.1
clock_offset	Δt	Clock synchronization offset	0	0.05
duty_cycle	d	Shutter duty cycle	1.0	0.95
gain	g	Detector gain multiplier	1.0	1.02
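The effect of the perturbed values in the table can be explored with a small simulation. The sketch below models only the mask shift (Δx, Δy) and the detector gain; rotation, clock offset, and duty cycle are left out. The helper names `subpixel_shift` and `measure`, the circular Fourier-domain shift, and the random test data are assumptions for illustration.

```python
import numpy as np

# Nominal and perturbed values copied from the table (modeled subset only).
NOMINAL = {"mask_dx": 0.0, "mask_dy": 0.0, "gain": 1.0}
PERTURBED = {"mask_dx": 0.5, "mask_dy": 0.3, "gain": 1.02}

def subpixel_shift(img, dx, dy):
    """Circularly shift a 2D image by (dx, dy) pixels via the Fourier shift theorem."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel along rows
    fx = np.fft.fftfreq(w)[None, :]  # cycles per pixel along columns
    phase = np.exp(-2j * np.pi * (fx * dx + fy * dy))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * phase))

def measure(frames, masks, p):
    """Noise-free snapshot with shifted masks and a detector gain."""
    shifted = np.stack([subpixel_shift(m, p["mask_dx"], p["mask_dy"]) for m in masks])
    return p["gain"] * np.sum(shifted * frames, axis=0)

rng = np.random.default_rng(0)
frames = rng.random((8, 64, 64))
masks = (rng.random((8, 64, 64)) > 0.5).astype(float)

y_nom = measure(frames, masks, NOMINAL)
y_pert = measure(frames, masks, PERTURBED)
rel_err = np.linalg.norm(y_pert - y_nom) / np.linalg.norm(y_nom)
print(f"relative measurement error: {rel_err:.3f}")
```

Reconstructing `y_pert` with the nominal masks is what exposes the mismatch in practice; the relative measurement error is only a quick first indicator.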

Reconstruction Triad Diagnostics

The three diagnostic gates (G1, G2, G3) characterize how reconstruction quality degrades under different error sources. Each bar shows the relative attribution.

G1: Forward Model Accuracy. How well does the mathematical model match reality?

Model: coded aperture temporal. Mismatch modes: mask shift error, motion blur within frame, mask diffraction, nonuniform illumination.

G2: Noise Characterization. Is the noise model correctly specified?

Noise: Gaussian. Typical SNR: 20.0–40.0 dB
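To test a solver across the quoted SNR range, Gaussian noise can be scaled to a target SNR as sketched below. The SNR definition used here (mean signal power over noise variance, in dB) is an assumption, since the page does not define it, and `add_gaussian_noise` is an illustrative name.

```python
import numpy as np

def add_gaussian_noise(y, snr_db, rng=None):
    """Add zero-mean Gaussian noise so that mean(y^2) / noise variance matches snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return y + rng.normal(0.0, np.sqrt(noise_power), size=y.shape)

rng = np.random.default_rng(0)
y = rng.random((256, 256)) * 4.0  # stand-in snapshot
y_noisy = add_gaussian_noise(y, snr_db=30.0, rng=rng)

# Empirical check of the achieved SNR.
measured = 10.0 * np.log10(np.mean(y ** 2) / np.mean((y_noisy - y) ** 2))
print(f"measured SNR: {measured:.1f} dB")
```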

G3: Calibration Quality. Are instrument parameters accurately measured?

Requires: mask patterns, mask shift calibration, dark frame, temporal alignment

Modality Deep Dive

Principle

Coded Aperture Compressive Temporal Imaging (CACTI) compresses multiple high-speed video frames into a single sensor exposure by modulating the scene with a dynamic coded aperture (shifting mask) during the integration time. The sensor accumulates a coded sum of B consecutive frames, and computational algorithms recover all B frames from the single compressed measurement using video sparsity priors.

How to Build the System

Build a relay optical system with a physical translating mask or use a DMD as the coded aperture at an intermediate image plane. The mask shifts by one pixel per sub-frame interval during the camera integration time, effectively encoding B temporal frames. Use a standard camera at normal frame rate (e.g., 30 fps) to capture the compressed measurement. Calibrate the mask pattern and its motion precisely.
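The one-pixel-per-sub-frame translation can be simulated by sliding a window over a mask that is taller than the field of view. This is a sketch under assumptions: the function name `shifting_masks`, vertical translation, and the random binary pattern are illustrative choices, not details of a specific prototype.

```python
import numpy as np

def shifting_masks(big_mask, num_frames, height, step=1):
    """Return (num_frames, height, W) masks; frame t sees big_mask shifted by t*step rows."""
    needed = height + (num_frames - 1) * step
    if big_mask.shape[0] < needed:
        raise ValueError(f"big_mask needs at least {needed} rows")
    return np.stack([big_mask[t * step : t * step + height, :] for t in range(num_frames)])

rng = np.random.default_rng(0)
B, H, W = 8, 64, 64
big_mask = (rng.random((H + B - 1, W)) > 0.5).astype(float)  # physical mask, larger than the FOV
masks = shifting_masks(big_mask, num_frames=B, height=H)
print(masks.shape)  # (8, 64, 64)
```

Cropping from a larger pattern avoids the wrap-around that `np.roll` would introduce, which a physically translating mask does not have.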

Common Reconstruction Algorithms

GAP-TV (Generalized Alternating Projection with Total Variation)
DeSCI (Decompress Snapshot Compressive Imaging, GMM prior)
PnP-FFDNet (Plug-and-Play with FFDNet denoiser)
Deep unfolding: ELP-Unfolding
End-to-end trained networks: BIRNAT, RevSCI, EfficientSCI, STFormer, CST
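The data-consistency projection shared by GAP-type solvers can be sketched as follows. This is a minimal illustration, not the GAP-TV implementation behind the leaderboard entry: the `denoise` hook is a hypothetical placeholder where a TV or learned denoiser would go, and the example passes a simple clip to [0, 1] instead.

```python
import numpy as np

def forward(x, masks):
    """CACTI operator: modulate each frame and sum over time."""
    return np.sum(masks * x, axis=0)

def gap_reconstruct(y, masks, denoise=None, num_iters=20):
    """Alternate a prior step (denoise) with projection onto {x : forward(x) = y}."""
    norm = np.sum(masks ** 2, axis=0)  # diagonal of Phi Phi^T
    norm[norm == 0] = 1.0              # guard pixels that no mask samples
    x = masks * (y / norm)             # initial estimate from the adjoint
    for _ in range(num_iters):
        if denoise is not None:
            x = denoise(x)             # prior step (TV in GAP-TV)
        residual = y - forward(x, masks)
        x = x + masks * (residual / norm)  # Euclidean projection step
    return x

rng = np.random.default_rng(0)
truth = rng.random((8, 64, 64))
masks = (rng.random((8, 64, 64)) > 0.5).astype(float)
y = forward(truth, masks)

# Stand-in prior: clip to the valid intensity range (not a TV denoiser).
x_hat = gap_reconstruct(y, masks, denoise=lambda v: np.clip(v, 0.0, 1.0))
print(np.allclose(forward(x_hat, masks), y))  # True
```

Because Phi Phi^T is diagonal for this operator, the projection is an element-wise division, which is what keeps each GAP iteration cheap. Reconstruction quality then depends almost entirely on the prior plugged into `denoise`.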

Common Mistakes

Mask calibration error causing temporal frame misalignment in reconstruction
Compression ratio too high (too many sub-frames per snapshot) for the scene motion
Motion blur within individual sub-frame intervals when scene moves fast
Non-uniform mask illumination creating brightness gradients in recovered frames
Choosing masks with poor conditioning (high mutual coherence between rows)

How to Avoid Mistakes

Calibrate mask position precisely using a static known pattern before experiments
Match the compression ratio to scene complexity: keep B at or below about 8-10 for complex natural scenes; B of 24-48 is feasible only for simpler scenes
Ensure sub-frame exposure is short enough that intra-frame motion is negligible
Flatfield-correct the mask modulation using a uniform target calibration
Simulate reconstruction quality with candidate mask patterns before hardware fabrication
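Before running full reconstructions, candidate mask sets can be screened with cheap statistics. The sketch below computes two heuristic proxies, not the formal mutual coherence of the sensing matrix: the largest normalized correlation between any two temporal mask patterns, and the fraction of pixels that no mask ever samples. All function names are illustrative.

```python
import numpy as np

def mask_correlation(masks):
    """T x T matrix of normalized inner products between mask patterns."""
    flat = masks.reshape(masks.shape[0], -1).astype(float)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for all-zero masks
    unit = flat / norms
    return unit @ unit.T

def max_offdiag(corr):
    """Largest absolute correlation between two different patterns."""
    off = corr - np.diag(np.diag(corr))
    return float(np.max(np.abs(off)))

def unsampled_fraction(masks):
    """Fraction of pixels that are blocked in every sub-frame."""
    return float(np.mean(masks.sum(axis=0) == 0))

rng = np.random.default_rng(0)
masks = (rng.random((8, 64, 64)) > 0.5).astype(float)
print(f"max pairwise correlation: {max_offdiag(mask_correlation(masks)):.2f}")
print(f"unsampled pixel fraction: {unsampled_fraction(masks):.4f}")
```

Lower pairwise correlation and fewer unsampled pixels are better on these proxies, but a candidate should still be confirmed by simulated reconstructions on representative video.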

Forward-Model Mismatch Cases

The widefield fallback processes a single 2D (64,64) frame, but CACTI compresses B temporal frames into a single 2D coded snapshot using a shifting binary mask — the temporal dimension (64,64,B) is entirely lost
Without the time-varying coded exposure pattern, individual video frames cannot be separated from the compressed measurement — temporal super-resolution from the fallback is impossible

How to Correct the Mismatch

Use the CACTI operator that applies frame-wise binary masks and sums the coded frames: y = sum_b(M_b * x_b), compressing B frames into one measurement
Reconstruct the video sequence using PnP-SCI (plug-and-play with FastDVDnet), ELP-Unfolding, or GAP-TV, all of which model the temporal compression and recover B frames from the single snapshot

Experimental Setup

Instrument

Custom CACTI system (Duke / USTC prototype)

Coded Aperture

shifting binary mask on lithographic substrate

Frames Per Snapshot

Spatial Resolution

256x256

Compression Ratio

Equivalent Fps

1200

Detector

FLIR Point Grey Grasshopper3 CMOS

Reconstruction

GAP-TV / PnP-FFDNet / STFormer

Signal Chain Diagram

Experimental setup diagram for Coded Aperture Compressive Temporal Imaging (CACTI)

Key References

Llull et al., 'Coded aperture compressive temporal imaging', Optics Express 21, 10526 (2013)
Yuan, 'Generalized alternating projection based total variation minimization for compressive sensing' (GAP-TV), IEEE ICIP 2016
Wang et al., 'Spatial-Temporal Transformer for Video Snapshot Compressive Imaging (STFormer)', ECCV 2022

Canonical Datasets

Kobe, Runner, Drop, Traffic (grayscale SCI benchmarks)
DAVIS 2017 (adapted for SCI simulation)

Related Modalities

CASSI SPC-Block SPC-Kronecker Matrix
