Let $(X, Y)$ be a pair of random variables such that $X = (X_1, \ldots, X_J)$, and let $f$ be a function that depends on the joint distribution of $(X, Y)$. A variety of parametric and nonparametric models for $f$ are discussed in relation to flexibility, dimensionality, and interpretability. It is then supposed that each $X_j \in [0, 1]$, that $Y$ is real valued with mean $\mu$ and finite variance, and that $f$ is the regression function of $Y$ on $X$. Let $f^*$, of the form $f^*(x_1, \ldots, x_J) = \mu + f^*_1(x_1) + \cdots + f^*_J(x_J)$, be chosen subject to the constraints $E f^*_j = 0$ for $1 \le j \le J$ to minimize $E[(f(X) - f^*(X))^2]$. Then $f^*$ is the closest additive approximation to $f$, and $f^* = f$ if $f$ itself is additive. Spline estimates of $f^*_j$ and its derivatives are considered, based on a random sample from the distribution of $(X, Y)$. Under a common smoothness assumption on $f^*_j$, $1 \le j \le J$, and some mild auxiliary assumptions, these estimates achieve the same (optimal) rate of convergence for general $J$ as they do for $J = 1$.
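As a rough illustration of the additive fit described above, the following minimal sketch estimates a constant plus one mean-zero spline component per coordinate by ordinary least squares. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `spline_basis` and `fit_additive`, the truncated-power cubic basis, the equally spaced interior knots, and the simulated data.

```python
import numpy as np

def spline_basis(x, knots, degree=3):
    """Truncated-power spline basis on [0, 1], without the constant column."""
    x = np.asarray(x, dtype=float)
    cols = [x ** d for d in range(1, degree + 1)]
    cols += [np.clip(x - k, 0.0, None) ** degree for k in knots]
    return np.column_stack(cols)

def fit_additive(X, y, n_knots=4, degree=3):
    """Least-squares fit of y ~ mu + sum_j f_j(x_j), each f_j a centered spline."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, J = X.shape
    # Equally spaced interior knots (an illustrative choice).
    knots = np.linspace(0.0, 1.0, n_knots + 2)[1:-1]
    blocks, means = [], []
    for j in range(J):
        B = spline_basis(X[:, j], knots, degree)
        m = B.mean(axis=0)
        blocks.append(B - m)   # centering gives each component sample mean zero
        means.append(m)
    design = np.hstack(blocks)
    # With centered columns, the least-squares intercept is the sample mean of y.
    mu_hat = y.mean()
    coef, *_ = np.linalg.lstsq(design, y - mu_hat, rcond=None)
    coefs = np.split(coef, J)

    def component(j, x):
        """Estimated j-th additive component evaluated at the points x."""
        return (spline_basis(x, knots, degree) - means[j]) @ coefs[j]

    def predict(X_new):
        X_new = np.asarray(X_new, dtype=float)
        return mu_hat + sum(component(j, X_new[:, j]) for j in range(J))

    return mu_hat, component, predict

# Simulated example with an additive regression function on [0, 1]^2.
rng = np.random.default_rng(0)
X = rng.uniform(size=(2000, 2))
f_true = 2.0 + np.sin(2 * np.pi * X[:, 0]) + (X[:, 1] - 0.5) ** 2
y = f_true + 0.1 * rng.normal(size=2000)

mu_hat, component, predict = fit_additive(X, y)
rmse = np.sqrt(np.mean((predict(X) - f_true) ** 2))
print("estimated constant:", mu_hat)
print("RMSE against the true regression function:", rmse)
```

Centering each basis block mirrors the constraint that every component has mean zero, here imposed on the sample rather than on the population. The number of coefficients grows linearly in $J$, which is the sense in which an additive model sidesteps the dimensionality of a fully general $f$.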
Charles J. Stone studied this question.