High-quality training data are essential for the performance and generalization of artificial intelligence systems, particularly in dynamic environments such as adaptive stream processing for disaster response. However, constructing large and representative datasets remains costly and time-consuming, especially in domains where real data are scarce or difficult to obtain. Large Language Models (LLMs) provide powerful capabilities for synthetic text generation, yet the quality of generated data strongly depends on the design of input prompts. Prompt engineering is therefore critical, but it remains largely manual and difficult to scale, particularly in black-box settings where model internals are inaccessible. This work introduces EVOLMD-MO, a multi-objective evolutionary framework for automated prompt learning aimed at generating high-quality synthetic text datasets using black-box LLMs. The proposed approach formulates prompt optimization as a multi-objective search problem in which candidate prompts evolve through genetic operators guided by two complementary objectives: semantic fidelity to reference data and generative diversity of the produced samples. To support scalable optimization, the framework integrates a modular multi-agent architecture that decouples prompt evolution, LLM interaction, and evaluation mechanisms. The evolutionary process is implemented using the NSGA-II algorithm, enabling the discovery of diverse Pareto-optimal prompts that balance semantic preservation and diversity. Experimental evaluation using large-scale disaster-related social media data demonstrates that the proposed approach consistently improves prompt quality across generations while maintaining a stable trade-off between fidelity and diversity. Compared with a single-objective baseline, EVOLMD-MO explores a significantly broader semantic search space and produces more diverse yet semantically coherent synthetic datasets. These results indicate that multi-objective evolutionary prompt learning constitutes a promising strategy for black-box LLM-driven data generation, with potential applicability to adaptive data analytics and real-time decision-support systems in highly dynamic environments, pending broader validation across domains and models.
Building similarity graph...
Analyzing shared references across papers
Loading...
Pastrián et al. (Wed,) studied this question.
synapsesocial.com/papers/69d895d86c1944d70ce06f39 — DOI: https://doi.org/10.3390/app16083623
Diego Pastrián
Diego Portales University
Nicolás Hidalgo
Diego Portales University
Víctor Reyes
Diego Portales University
Applied Sciences
Universitat de València
Universitat Politècnica de València
Diego Portales University
Building similarity graph...
Analyzing shared references across papers
Loading...
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: