Traditional soft furnishing design suffers from low efficiency, difficulty in understanding style, and low user participation; the development of generative AI offers new ways to address these problems. This study builds a system in which users and AI collaborate to generate soft furnishing schemes. Using multi-modal condition fusion and a dynamic style constraint method, the system converts users' fuzzy preferences into quantitative conditions and generates personalized soft furnishing schemes that conform to a specified style. It extracts features with the CLIP text and image encoders, uses them to control generation in the latent space of a Stable Diffusion architecture, and adds a style contrastive loss to improve generation quality. We also propose a three-level verification scheme that combines quantitative metrics such as ArtFID and FCS (Furnishing CLIP Score) with supplementary human evaluation, in order to objectively quantify how well the generated results match the target style. Experiments use FurniSynth, a data set we built ourselves that covers 10 mainstream decoration styles. Compared with existing methods, the proposed method achieves the best style consistency, with an ArtFID score of 19.3, an FCS semantic matching score of 0.83, and a subjective designer rating of 4.6. Ablation experiments further show that overall performance improves step by step as each component is added in turn. Technically, this study overcomes the difficulty traditional models have in controlling style consistency; for industry, it lowers design costs and improves efficiency for small and medium-sized enterprises; and for society, it promotes the integration of personalized consumption with smart home aesthetic design.
Lijun Zhang studied this question.
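The abstract names two quantities without defining them: FCS (Furnishing CLIP Score) and a style contrastive loss. The sketch below illustrates one plausible reading of each and is not the authors' implementation. It assumes that FCS behaves like a CLIP-score-style cosine similarity between an image embedding and a style-text embedding, and that the style loss has an InfoNCE-like form. The function names, the clamp at zero, and the temperature value of 0.07 are all assumptions made for illustration. The embeddings are toy vectors standing in for real CLIP features.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_style_score(image_emb, text_emb):
    """Assumed CLIP-score-style match in [0, 1]: cosine similarity
    between an image embedding and a style-text embedding, clamped at 0.
    The paper's exact FCS definition is not given in the abstract."""
    return max(0.0, cosine_similarity(image_emb, text_emb))


def style_contrastive_loss(anchor, positive, negatives, temperature=0.07):
    """Assumed InfoNCE-style loss: pulls the anchor toward the embedding
    of its target style and pushes it away from other styles.
    The temperature of 0.07 is an illustrative choice."""
    pos = cosine_similarity(anchor, positive) / temperature
    logits = [pos] + [cosine_similarity(anchor, n) / temperature
                      for n in negatives]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - pos


# Toy embeddings standing in for CLIP features (hypothetical values).
generated = [0.9, 0.1, 0.0]
target_style = [1.0, 0.0, 0.0]
other_styles = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print("style score:", clip_style_score(generated, target_style))
print("contrastive loss:",
      style_contrastive_loss(generated, target_style, other_styles))
```

Under these assumptions, a generated image whose embedding lies close to its target style gets a score near 1 and a loss near 0, while pairing it with the wrong style raises the loss sharply. In a real pipeline the vectors would come from the CLIP image and text encoders rather than being written by hand.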