Dynamic three-dimensional (3-D) reconstruction of small objects moving at high speed is fundamentally limited by the number of viewpoints that a fixed camera array can provide at any single time instant. When the camera count is insufficient, single-frame multi-view stereo produces incomplete or inaccurate geometry. This paper proposes a multi-frame temporal integration approach that overcomes this limitation by exploiting the rigid-body assumption: because a falling object maintains its shape across consecutive frames, images captured at different time instants can be combined into a single, viewpoint-enriched reconstruction. A three-layer circular array of 32 synchronized RGB cameras captures 1440 × 1080 images at 160 fps, and a free-fall-oriented algorithm automatically detects active frames, selects informative temporal windows, and feeds the accumulated multi-frame images into a structure-from-motion and multi-view stereo (SfM-MVS) pipeline, effectively multiplying the number of viewpoints without additional hardware. The algorithm simultaneously recovers the 6-DOF pose trajectory of each object from the SfM-estimated camera parameters. Progressive accumulation experiments on freely falling soybeans (approximately 9–10 mm in diameter) show that a single 32-camera frame already achieves an F-score exceeding 0.97 at a 0.5 mm threshold against an industrial structured-light scanner reference, and that accumulating additional temporal frames converges to a stable plateau, with both objects reaching an F-score of 0.984. Beyond approximately one to two accumulated frames, additional frames yield diminishing returns, indicating that a small number of temporal frames is sufficient for convergent sub-millimeter accuracy.
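The viewpoint-multiplication idea can be illustrated with a minimal sketch. It assumes a pinhole convention in which a fixed camera maps world points as x_cam = R_cam x_world + t_cam, and a rigid object at a given frame maps body points as x_world = R_obj x_body + p_obj. Under the rigid-body assumption, each (camera, frame) pair then behaves like a distinct virtual camera observing a static object. The function names and the pose convention below are assumptions made for illustration and are not taken from the paper, whose pipeline estimates these quantities with SfM rather than receiving them as input.

```python
import math

def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rot_z(angle):
    """Rotation about the z axis (used only to build the example pose)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def virtual_camera(R_cam, t_cam, R_obj, p_obj):
    """Re-express a fixed camera in the object's body frame.

    x_cam = R_cam (R_obj x_body + p_obj) + t_cam
          = (R_cam R_obj) x_body + (R_cam p_obj + t_cam)
    """
    R_v = mat_mul(R_cam, R_obj)
    Rp = mat_vec(R_cam, p_obj)
    t_v = [Rp[i] + t_cam[i] for i in range(3)]
    return R_v, t_v

def camera_center(R, t):
    """Camera center C = -R^T t for extrinsics (R, t)."""
    return [-c for c in mat_vec(transpose(R), t)]

def effective_viewpoints(n_cameras, n_frames):
    """Upper bound on distinct viewpoints after accumulating n_frames."""
    return n_cameras * n_frames

# Hypothetical example: one camera at world position (1, 0, 0), looking along
# the world axes, and an object that has rotated 90 degrees about z and
# dropped 0.5 units between two frames.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_cam = [-1.0, 0.0, 0.0]  # t = -R C with R = I and C = (1, 0, 0)

frame0 = virtual_camera(identity, t_cam, identity, [0.0, 0.0, 0.0])
frame1 = virtual_camera(identity, t_cam, rot_z(math.pi / 2), [0.0, 0.0, -0.5])

center0 = camera_center(*frame0)
center1 = camera_center(*frame1)
```

In this example the same physical camera appears at two different positions relative to the object, so a 32-camera array accumulated over three frames contributes up to `effective_viewpoints(32, 3)` views. The actual gain depends on how much the object rotates between frames, which is presumably why the described algorithm selects informative temporal windows rather than using every frame.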
Across 30 independent free-fall trials with three objects, the system achieves an overall mean error of 0.146 ± 0.033 mm and an overall F-score of 0.980 ± 0.006, corresponding to a mean relative error of approximately 1.6% on 8–10 mm targets. Fine surface features such as structural cracks are resolved at a fidelity sufficient for visual defect identification. These results establish rigid-body multi-frame temporal integration as an effective strategy for high-throughput, non-contact 3-D inspection of small objects in motion.
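For readers unfamiliar with the metrics, the following sketch shows one common way to compute a point-cloud F-score at a distance threshold, together with the relative-error arithmetic. It assumes the widely used definition in which precision is the fraction of reconstructed points within the threshold of the reference, recall is the converse, and the F-score is their harmonic mean. The abstract does not state which definition the authors use, so this is an assumption, as are the function names and the toy point sets. The 9 mm object size used for the relative error is likewise an assumed midpoint of the stated 8–10 mm range.

```python
import math

def nearest_distance(point, cloud):
    """Brute-force distance from one point to its nearest neighbour in cloud."""
    return min(math.dist(point, other) for other in cloud)

def f_score(reconstruction, reference, tau):
    """Harmonic mean of precision and recall at distance threshold tau."""
    precision = sum(
        nearest_distance(p, reference) <= tau for p in reconstruction
    ) / len(reconstruction)
    recall = sum(
        nearest_distance(q, reconstruction) <= tau for q in reference
    ) / len(reference)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def relative_error(mean_error_mm, size_mm):
    """Mean error expressed as a fraction of the object size."""
    return mean_error_mm / size_mm

# Toy data (not from the paper): four reference points on a line and a
# reconstruction in which one point is displaced by 0.9 mm.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
reconstruction = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1), (2.0, 0.0, 0.9), (3.0, 0.0, 0.1)]

score = f_score(reconstruction, reference, tau=0.5)

# Relative error for the reported 0.146 mm mean error on an assumed 9 mm target.
rel = relative_error(0.146, 9.0)
```

The brute-force nearest-neighbour search is quadratic in the number of points and is only suitable for small illustrative clouds; a spatial index such as a k-d tree would be the usual choice for scanner-scale data.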
Duan et al. studied this question.