With the rapid development of mobile Internet services, the traditional App interface design process increasingly struggles to meet the market's demand for efficiency and personalisation: it depends heavily on designers' manual labour, has long iteration cycles, and incurs high trial-and-error costs. Advances in multimodal AI-generated content (AIGC) technologies offer considerable potential to change this. This study develops a framework for App interface visual generation and interactive experience optimisation based on multimodal AIGC, enabling end-to-end automated generation and continuous optimisation from natural language descriptions to high-fidelity interactive prototypes. The framework integrates diffusion models, large language models (LLMs), and visual-language models (VLMs). First, an LLM parses the user's vague natural language requirements and sketch input and converts them into structured design instructions. These instructions then drive a diffusion model that generates a high-fidelity visual interface strictly following the design system specification, while parallel LLM reasoning generates the corresponding layout structure and interaction logic. Finally, a prototype building block assembles an interactive dynamic prototype, and a VLM-based closed loop of automated evaluation and A/B testing continuously optimises the experience. Comparative experiments and user studies were carried out to verify the effectiveness of the framework. Experimental results show that the framework improves average task-level page generation efficiency by 37% to 45% across specific interface design tasks. Separately, for the end-to-end first-draft turnaround cycle, the framework reduces production time from the traditional 2–3 days to less than 30 min, a substantial compression of the initial design cycle. 
In terms of generative quality, heuristic evaluation showed that the framework's output scored significantly higher than the baseline model on visual consistency and layout plausibility. In terms of user experience, 150 valid user questionnaires gave the solutions generated and optimised with this framework a System Usability Scale (SUS) score of 82.5, which falls within the "good" to "excellent" range, and satisfaction increased by 28% compared with the initial solution. These results show that the study provides an integrated intelligent framework for App interface design that combines multimodal generation with continuous optimisation, balancing design efficiency against quality and offering a feasible path to improving user interaction experiences through a data-driven iterative mechanism.
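For reference, a System Usability Scale score such as the one reported above is conventionally computed from ten five-point Likert items using the standard SUS scoring rule. The minimal sketch below illustrates that rule only. The function names `sus_score` and `mean_sus` and the sample responses are illustrative assumptions; the study's questionnaire data and its exact aggregation procedure are not given here.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one respondent's 10 SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each response must be between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:
            # Items 1, 3, 5, 7, 9 are positively worded: contribution = answer - 1
            total += answer - 1
        else:
            # Items 2, 4, 6, 8, 10 are negatively worded: contribution = 5 - answer
            total += 5 - answer
    # The summed contributions (0-40) are scaled to 0-100.
    return total * 2.5


def mean_sus(all_responses: Sequence[Sequence[int]]) -> float:
    """Average the per-respondent SUS scores (assumed aggregation)."""
    if not all_responses:
        raise ValueError("at least one respondent is required")
    return sum(sus_score(r) for r in all_responses) / len(all_responses)


# Hypothetical respondent, not data from the study.
example = [4, 2, 5, 1, 4, 2, 4, 1, 4, 2]
print(sus_score(example))
```

Under this rule, the hypothetical respondent above contributes 33 raw points, which scales to 82.5; a study-level figure would be the mean of such per-respondent scores over all valid questionnaires.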
Bin Yu (Mon,) studied this question.