September 3, 2024Open Access

Differentially Private Kernel Density Estimation

Key Points

Key points are not available for this paper at this time.

Abstract

We introduce a refined differentially private (DP) data structure for kernel density estimation (KDE), offering not only improved privacy-utility tradeoff but also better efficiency over prior results. Specifically, we study the mathematical problem: given a similarity function f (or DP KDE) and a private dataset X Rᵈ, our goal is to preprocess X so that for any query yᵈ, we approximate ₗ ₗ f (x, y) in a differentially private fashion. The best previous algorithm for f (x, y) =\| x - y \|₁ is the node-contaminated balanced binary tree by Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024. Their algorithm requires O (nd) space and time for preprocessing with n=|X|. For any query point, the query time is d n, with an error guarantee of (1+) -approximation and ^-1 ^-0. 5 d^1. 5 R ^1. 5 n. In this paper, we improve the best previous result Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024 in three aspects: - We reduce query time by a factor of ^-1 n. - We improve the approximation ratio from to 1. - We reduce the error dependence by a factor of ^-0. 5. From a technical perspective, our method of constructing the search tree differs from previous work Backurs, Lin, Mahabadi, Silwal, and Tarnawski, ICLR 2024. In prior work, for each query, the answer is split into ^-1 n numbers, each derived from the summation of n values in interval tree countings. In contrast, we construct the tree differently, splitting the answer into n numbers, where each is a smart combination of two distance values, two counting values, and y itself. We believe our tree structure may be of independent interest.

Read Full Paperexternally

Mark Helpful

Bookmark

Relay

View Full Paper