Using pre-trained vision-language models like CLIP with federated prompt tuning has shown great potential in federated learning (FL), offering significant benefits in computation, communication, and privacy over existing frameworks. However, existing research overlooks the internal mechanisms underlying federated prompt tuning and follows the traditional context-unaware tuning mechanism. Our experiments, on the other hand, demonstrate that federated prompting is a data-efficient but data-sensitive paradigm, and therefore the samples involved in the prompt tuning process are of significant importance. To address this issue, we propose Context-aware Federated Prompt Tuning (CaFPT), which uses information theory to facilitate the retrieval process by conditioning on the examples capable of activating the most pertinent knowledge inside the pre-trained models. Moving in this direction steers the behavior of pre-trained neurons precisely and improves performance on the local task. Informative vectors are built by pruning clients' training data based on their V-usable information. We show that these vectors can be updated and combined through operations like FedAvg, and that the resulting model's behavior is steered accordingly on multiple clients' tasks. Extensive experiments demonstrate that informative vectors offer promising robustness, making them a simple yet effective way to enhance the performance of federated prompting.
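The two operations named in the abstract, pruning by V-usable information and combining vectors with FedAvg, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-sample usable information is measured as pointwise V-information, PVI(x -> y) = log2 p(y | x) - log2 p(y | null input), that pruning keeps the highest-scoring samples, and that each client's informative vector is a flat list of floats. The function names, the `keep_ratio` parameter, and the sample tuple layout are hypothetical.

```python
import math


def pointwise_v_information(p_with_input, p_without_input):
    """PVI in bits: log2 p(y | x) - log2 p(y | null input).

    Both arguments are probabilities that a model assigns to the true label,
    once given the input and once given an empty input (assumed precomputed).
    """
    return math.log2(p_with_input) - math.log2(p_without_input)


def prune_by_pvi(samples, keep_ratio):
    """Return the ids of the top `keep_ratio` fraction of samples by PVI.

    Each sample is a tuple (sample_id, p_with_input, p_without_input).
    Keeping the highest-PVI samples is an assumption of this sketch.
    """
    ranked = sorted(
        samples,
        key=lambda s: pointwise_v_information(s[1], s[2]),
        reverse=True,
    )
    n_keep = max(1, math.ceil(keep_ratio * len(ranked)))
    return [s[0] for s in ranked[:n_keep]]


def fedavg(client_vectors, client_sizes):
    """FedAvg: average client vectors weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_vectors[0])
    return [
        sum(vec[i] * n for vec, n in zip(client_vectors, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical usage: prune one client's data, then aggregate two clients.
local_samples = [("a", 0.9, 0.5), ("b", 0.5, 0.5), ("c", 0.2, 0.5), ("d", 0.7, 0.5)]
kept_ids = prune_by_pvi(local_samples, keep_ratio=0.5)
global_vector = fedavg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3])
```

In this sketch only the aggregated vector, not the raw samples, would leave a client, which matches the communication and privacy motivation stated above.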
Guo et al. (Wed,) studied this question.