March 3, 2026 · Open Access

Towards real-time alignment of 3D CT and 2D X-Ray with multi-stage CNNs

Key Points

LXPose achieves real-time registration of 3D CT and 2D X-Ray for image-guided interventions.
The two-stage CNN bypasses slow optimization, reducing inference time from several seconds to just 20 ms.
An automated landmark extraction strategy eliminates the need for manual annotations, enhancing workflow efficiency.
LXPose is trained with a projection loss for high accuracy, and extensive data augmentation is a first attempt to reduce the domain gap between synthetic training X-Rays and real testing data.

Abstract

2D X-Ray imaging is widely employed in image-guided interventions to provide real-time visualization of patient anatomy and interventional devices. However, because they lack depth information and soft-tissue contrast, intraoperative X-Rays are often complemented with high-resolution preoperative 3D Computed Tomography (CT) scans for intervention planning. Clinicians must then mentally register the 3D preoperative planning information onto the 2D visualization in order to navigate the patient's anatomy and interpret the position of surgical instruments. This procedure considerably increases cognitive workload, motivating the need for automated solutions. Traditionally, this registration task is formulated as estimating the position of the X-Ray source relative to the CT scan. State-of-the-art 3D/2D registration approaches are trained using synthetic CT-to-X-Ray projections, but these methods still require manual annotations and employ a time-consuming optimization that limits their deployment for live image guidance. In this paper, we propose LXPose (Live X-Ray Pose estimation), a fast multi-stage 3D/2D registration framework for real-time image guidance. Specifically, we introduce an efficient two-stage CNN that bypasses slow optimization for fast inference. Critically, we also eliminate the need for any manual annotations by introducing an automated strategy for landmark extraction. Finally, we train LXPose using a projection loss for high accuracy, and apply extensive data augmentation in a first attempt to reduce the domain gap between the synthetic training X-Rays and real testing data. We demonstrate LXPose on two datasets from different anatomical regions, where it yields results comparable to the state-of-the-art while reducing inference time by two orders of magnitude, from several seconds to 20 ms. Overall, our results demonstrate the potential of LXPose for real-time clinical deployment.
Our code is available at https://github.com/fedefacente/LXPose.
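
To make the idea of a projection loss concrete, the following is a minimal NumPy sketch and is not taken from the LXPose repository. It assumes an ideal pinhole model for the X-Ray source, a pose given as a rotation matrix and translation vector mapping CT coordinates into the source frame, and a loss defined as the mean 2D distance between landmarks projected with the predicted pose and with the reference pose. The function names (`project_points`, `projection_loss`) and all parameters are hypothetical and chosen only for illustration; the loss actually used by LXPose may differ.

```python
import numpy as np

def project_points(points_3d, rotation, translation, focal_length, principal_point):
    """Pinhole projection of Nx3 CT-space points onto the 2D detector plane.

    Assumption: an ideal pinhole model with the source at the origin of the
    camera frame and the detector perpendicular to the z axis.
    """
    cam = points_3d @ rotation.T + translation    # CT frame -> source frame
    uv = focal_length * cam[:, :2] / cam[:, 2:3]  # perspective divide
    return uv + principal_point

def projection_loss(points_3d, pose_pred, pose_true, focal_length, principal_point):
    """Mean 2D distance between landmarks projected with two poses.

    Each pose is a (rotation, translation) tuple. This is an illustrative
    definition, not necessarily the one used by the paper's authors.
    """
    uv_pred = project_points(points_3d, *pose_pred, focal_length, principal_point)
    uv_true = project_points(points_3d, *pose_true, focal_length, principal_point)
    return float(np.mean(np.linalg.norm(uv_pred - uv_true, axis=1)))

# Toy usage: two landmarks 1000 units in front of the source.
landmarks = np.array([[0.0, 0.0, 1000.0], [10.0, 0.0, 1000.0]])
identity = np.eye(3)
pose_true = (identity, np.zeros(3))
pose_pred = (identity, np.array([1.0, 0.0, 0.0]))  # 1 unit lateral pose error
loss = projection_loss(landmarks, pose_pred, pose_true,
                       focal_length=1000.0, principal_point=np.zeros(2))
print(loss)
```

Measuring the error on the detector plane rather than directly on the pose parameters ties the training signal to what is visible in the X-Ray image, which is one plausible reason such a loss can help accuracy.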
