Telegram, like WhatsApp and Signal, has become widely used thanks to its hybrid design, which combines instant private messaging with public channels and makes it an effective tool for quickly broadcasting content to a wide audience. This article presents TGEconomicDataset, a new dataset containing more than 2.9 million messages from the most popular Russian-language Telegram channels in the field of economics, together with synthetically generated, labeled mixtures of these channels. The mixtures are designed to model authorship-change scenarios for testing methods of continuous authentication, a problem of particular interest because organizations and companies need to rely on data posted on social media. The dataset is enriched with price quotes for key financial instruments (gold futures, the USD/RUB currency pair, Brent oil, the dollar index (DXY), and Bitcoin (BTC)), synchronized with the message timestamps. A detailed joint analysis of the collected data is provided. Alongside the dataset, we publish the scripts used to collect the data, integrate the financial indicators, and generate the synthetic mixtures for the continuous authentication task, so that the research can be fully reproduced.
Luneva et al. studied this question.
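To make the two construction steps described above more concrete, the following is a minimal sketch, not the authors' published scripts. It assumes that "synchronized with the message timestamps" means attaching the most recent quote at or before each message, and that an authorship-change mixture is one channel's messages followed by another's from a chosen position onward. The function names `attach_latest_quote` and `make_mixture`, the tuple layouts, and the 0/1 labeling are all hypothetical choices made for illustration.

```python
import bisect
from datetime import datetime


def attach_latest_quote(messages, quotes):
    """Attach to each message the latest quote at or before its timestamp.

    Assumed layouts (hypothetical, not the dataset's actual schema):
      messages: list of (timestamp, channel, text)
      quotes:   list of (timestamp, price), sorted by timestamp
    Messages that precede the first quote get a price of None.
    """
    quote_times = [t for t, _ in quotes]
    enriched = []
    for ts, channel, text in messages:
        # Index of the last quote whose timestamp is <= the message timestamp.
        idx = bisect.bisect_right(quote_times, ts) - 1
        price = quotes[idx][1] if idx >= 0 else None
        enriched.append((ts, channel, text, price))
    return enriched


def make_mixture(stream_a, stream_b, change_index):
    """Build a labeled mixture with a single authorship change.

    The first `change_index` items come from stream_a (label 0); the rest
    come from stream_b starting at the same position (label 1).
    This is one assumed mixing scheme among many possible ones.
    """
    head = stream_a[:change_index]
    tail = stream_b[change_index:]
    labels = [0] * len(head) + [1] * len(tail)
    return head + tail, labels


if __name__ == "__main__":
    # Toy data, invented purely for the example.
    quotes = [
        (datetime(2024, 1, 1, 10, 0), 90.0),
        (datetime(2024, 1, 1, 11, 0), 91.5),
    ]
    messages = [
        (datetime(2024, 1, 1, 9, 30), "chan_a", "early message"),
        (datetime(2024, 1, 1, 10, 15), "chan_a", "mid message"),
        (datetime(2024, 1, 1, 11, 0), "chan_b", "late message"),
    ]
    for row in attach_latest_quote(messages, quotes):
        print(row)

    mixture, labels = make_mixture(["a0", "a1", "a2", "a3"],
                                   ["b0", "b1", "b2", "b3"], 2)
    print(mixture, labels)
```

A backward-looking match is used in this sketch so that a message is never paired with a quote published after it. Whether the dataset itself uses backward, forward, or nearest matching is not stated in the text above.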