Purpose
This paper presents a data refinement and enrichment workflow that integrates building performance guidelines with existing semi-structured floor layout datasets. The goal is to broaden the use of architectural datasets in data-driven methods for the built environment and to enable informative visualizations and large-scale analyses.

Design/methodology/approach
The Swiss dwellings dataset serves as the foundation of this study and undergoes Python-based data refinement, feature engineering and attribute extension. The modified attributes cover spatial zoning (categorical), proxy indicators for daylight metrics and view layers (numerical), noise level (numerical), acoustic comfort (categorical) and window orientations (categorical).

Findings
The study presents an efficient workflow for turning the textual content of building performance guidelines into structured tabular data suitable for machine learning. Visualizations of the structured floor layout data reveal new insights into the dataset. The Oriented Environmental Swiss Dwellings (O-ESD) dataset, the main product of this study, opens data-driven learning opportunities from existing floor layout datasets towards environmental design automation. O-ESD also offers human interpretability through structured micro-climatic visualizations.

Originality/value
There has been no previous effort in the field to upgrade existing architectural datasets in alignment with building performance guidelines in order to expand their applicability in data-driven approaches. The proposed workflow not only gives insights into data refinement applications in the field but also yields an environmentally enriched floor layout dataset. The resulting dataset, the workflow that produces it and example visualizations are publicly released.
Mostafavi et al. studied this question.
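To make the attribute extension step concrete, the following is a minimal Python sketch of how two of the categorical attributes named in the abstract (window orientation and acoustic comfort) might be derived from numerical inputs. It is not the authors' implementation. The field names `window_azimuth_deg` and `noise_db`, the eight compass labels, and the decibel thresholds are all hypothetical and chosen only for illustration; real thresholds would come from the building performance guidelines in use.

```python
from typing import Dict, List

# Hypothetical compass labels, clockwise from north in 45-degree sectors.
COMPASS = ["N", "NE", "E", "SE", "S", "SW", "W", "NW"]


def orientation_from_azimuth(azimuth_deg: float) -> str:
    """Bin an azimuth (degrees clockwise from north) into one of 8 labels."""
    # Shift by half a sector so that each label is centred on its direction.
    sector = int(((azimuth_deg % 360) + 22.5) // 45) % 8
    return COMPASS[sector]


def acoustic_comfort(noise_db: float,
                     quiet_max: float = 45.0,
                     moderate_max: float = 60.0) -> str:
    """Map a noise level to a comfort class.

    The thresholds are illustrative assumptions, not values taken from
    any specific guideline.
    """
    if noise_db <= quiet_max:
        return "quiet"
    if noise_db <= moderate_max:
        return "moderate"
    return "loud"


def enrich(rows: List[Dict]) -> List[Dict]:
    """Return copies of the rows with two derived categorical attributes."""
    return [
        {
            **row,
            "window_orientation": orientation_from_azimuth(row["window_azimuth_deg"]),
            "acoustic_comfort": acoustic_comfort(row["noise_db"]),
        }
        for row in rows
    ]


if __name__ == "__main__":
    # Two made-up rooms, used only to exercise the functions.
    sample = [
        {"room_id": "r1", "window_azimuth_deg": 200.0, "noise_db": 42.0},
        {"room_id": "r2", "window_azimuth_deg": 350.0, "noise_db": 63.5},
    ]
    for enriched_row in enrich(sample):
        print(enriched_row)
```

Keeping the rules as small pure functions means each guideline-derived threshold lives in one place and can be replaced without touching the rest of the pipeline; the same functions could be applied column-wise in a tabular library if the dataset is loaded that way.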