Abstract
In modern software systems, Web services are the primary means of provisioning remote resources. The service ecosystem exhibits significant duplication, with multiple services fulfilling essentially the same functional requirements. Because such services can often be used interchangeably, developers face the challenge of choosing the most suitable service for the task at hand. However, extant approaches for selecting services focus solely on a service's performance characteristics (so-called QoS, whose properties include latency, reliability, and availability) and often neglect important utility characteristics (e.g., data accuracy, correctness, and coverage). As a consequence, selected services may exhibit high performance while delivering information that is inaccurate or outdated. This article addresses this problem by introducing Quality of Information (QoI), a quality metric that measures data-related service performance. To measure QoI accurately and effectively, we classify aspects of service QoI into those (e.g., data freshness) that can be measured automatically by comparing service outputs and those (e.g., accuracy) that require manual effort to label the ground truth for a given test input. To measure QoI accurately while reducing this costly manual effort, we formulate the input selection problem: select a small set of test inputs whose QoI most closely approximates the QoI obtained with a large input set. Inspired by input sample selection methods for testing machine learning (ML) algorithms, we adapt these methods and evaluate their applicability on two datasets of service invocation results, observing a noteworthy performance variance between ML testing and Web service testing. The insights and challenges we identify in measuring QoI highlight the need for further investigation.
Li et al. (Thu,) studied this question.
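The input selection problem stated in the abstract can be illustrated with a minimal sketch. It assumes that QoI is measured as accuracy over binary correctness labels, and it uses plain random sampling as a baseline selector; this is not the selection method evaluated in the article. The names `qoi_accuracy` and `estimate_qoi`, the labeling budget, and the synthetic labels are all hypothetical.

```python
import random


def qoi_accuracy(labels):
    """QoI as accuracy: the fraction of inputs whose service output was correct."""
    return sum(labels) / len(labels)


def estimate_qoi(labels, budget, seed=0):
    """Approximate the full-set QoI from `budget` randomly selected inputs.

    Only the selected inputs would need manual ground-truth labeling.
    """
    rng = random.Random(seed)
    selected = rng.sample(range(len(labels)), budget)
    return sum(labels[i] for i in selected) / budget


# Synthetic stand-in for labeled invocation results (assumption):
# 1 = service output matched the ground truth, 0 = it did not.
data_rng = random.Random(42)
full_labels = [1 if data_rng.random() < 0.8 else 0 for _ in range(10_000)]

full_qoi = qoi_accuracy(full_labels)               # requires labeling every input
approx_qoi = estimate_qoi(full_labels, budget=200)  # requires labeling 200 inputs
error = abs(full_qoi - approx_qoi)

print(f"full-set QoI:     {full_qoi:.3f}")
print(f"200-input QoI:    {approx_qoi:.3f}")
print(f"estimation error: {error:.3f}")
```

The quantity `error` is what an input selection method would aim to minimize for a fixed labeling budget. A more informed selector would replace the `rng.sample` call while leaving the rest of the comparison unchanged.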