We propose a two-stage Deep Reinforcement Learning (DRL) framework for the automatic arrangement of buildings in urban planning that targets two critical regulatory constraints: sunshine duration and inter-building distance. High-fidelity sunshine simulation is too slow to run inside a DRL training loop, so we first introduce a Convolutional Auto-Encoder (CAE) that serves as a fast surrogate for predicting the sunshine distribution across a building layout. Trained on data generated with Autodesk Revit, the CAE predicts in near real time (∼0.015 s per layout) with an average error of less than 3 min, which makes it practical to call at every step of the RL loop. We then formulate the sequential placement of buildings as a Markov Decision Process (MDP) and employ a Deep Q-Network (DQN) to learn a placement policy. The DQN agent interacts with a planning environment in which each action moves a building in one of a set of discrete directions, and the reward reflects satisfaction of the distance constraints and avoidance of the low-sunshine zones identified by the CAE. The integrated CAE-DQN system automates the generation of compliant building layouts. In our experiments, the method significantly outperforms the baseline approaches (Random Placement, Greedy Heuristic, and Genetic Algorithm) in success rate (100%), convergence speed, and layout compactness. The framework enforces the hard constraints and provides a practical, scalable tool for rapid prototype generation in the conceptual design phase of urban planning.
Lin et al. studied this question.
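To make the reward described in the abstract concrete, the following is a minimal sketch of one way such a signal could be computed for a single building on a grid. It is not the authors' implementation: the action names, the grid representation, the `low_sunshine` mask (standing in for the thresholded CAE output), and all reward weights are assumptions introduced here for illustration.

```python
from math import hypot

# Hypothetical action set: move the building one grid cell in a discrete direction.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def step(position, action, grid_size):
    """Apply a discrete move and clamp the result to the site grid."""
    dx, dy = ACTIONS[action]
    x = min(max(position[0] + dx, 0), grid_size[0] - 1)
    y = min(max(position[1] + dy, 0), grid_size[1] - 1)
    return (x, y)


def reward(position, placed, low_sunshine, min_distance,
           penalty_distance=-1.0, penalty_sunshine=-1.0, bonus=1.0):
    """Score a candidate position (all weights are assumed values).

    position      -- (x, y) cell of the building being placed
    placed        -- list of (x, y) cells of buildings already placed
    low_sunshine  -- 2D list of booleans, indexed [y][x]; True marks cells
                     that a sunshine surrogate flags as insufficient
    min_distance  -- required minimum centre-to-centre distance, in cells
    """
    total = 0.0
    # Distance constraint: penalise each placed building that is too close.
    for other in placed:
        if hypot(position[0] - other[0], position[1] - other[1]) < min_distance:
            total += penalty_distance
    # Sunshine constraint: penalise landing in a flagged low-sunshine cell.
    if low_sunshine[position[1]][position[0]]:
        total += penalty_sunshine
    # A fully compliant position earns the bonus; otherwise return the penalties.
    return bonus if total == 0.0 else total
```

In this sketch a DQN agent would call `step` to obtain the next position and `reward` to score it; penalties accumulate per violated constraint, so a position that breaks both constraints scores lower than one that breaks only one.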