We study online resource allocation among N interacting modules over a horizon of T rounds. Unlike standard online optimization, the cost of allocating resources to a module is endogenous: it depends on the entire allocation vector through an interaction matrix W that encodes pairwise cooperation and competition. We analyze three allocation paradigms: (I) uniform allocation (cost-ignorant), (II) gated allocation (cost-estimating), and (III) competitive allocation via multiplicative weights update (cost-revealing). Our main results establish a strict separation: uniform allocation incurs Omega(T) regret, gated allocation achieves O(T to the power of 2/3), while competitive allocation achieves the optimal O(square root of T log N). This gap is attributable to the competitive mechanism's ability to exploit endogenous cost information revealed through interaction feedback. We further show that the topology of W governs the computation-regret trade-off. 我们研究了 T 个周期内 N 个相互作用模块之间的在线资源分配问题。与标准的在线优化不同,本研究中的成本是“内生”的:它通过一个编码模块间生克关系的交互矩阵 W,取决于整体分配向量。 我们分析了三种范式:(I)均匀分配(成本忽略型);(II)门控分配(成本估算型);(III)通过乘法权重更新实现的竞争性分配(成本揭示型)。主要结果显示了显著的性能代差:均匀分配产生 Omega(T) 悔界,门控分配达到 O(T 的 2/3 次方),而竞争性分配达到了最优的 O(根号 T log N)。这一差距源于竞争机制能够利用交互反馈揭示出的内生信息。我们进一步证明了交互拓扑 W 决定了计算与悔界之间的权衡。
Rui Chai (Tue,) studied this question.