Illegal gambling websites use advanced technology to evade regulation, which poses cybersecurity challenges. To address this, we propose a machine learning method that identifies these sites and analyzes user behavior on them. The method extracts key data from post messages captured in a real-world network environment and generates word vectors using Word2Vec combined with TF-IDF. A Stacked Denoising Autoencoder (SDAE) then reduces the dimensionality of these vectors and extracts features from them. Next, Agglomerative Clustering, improved through a combination of distance caching and heap optimization, performs an initial clustering that groups websites sharing the same post template into the same cluster. Within each website cluster, a secondary clustering step then integrates multiple algorithms and combines their outputs by voting through a consensus function based on cosine similarity, so that users' different operational behaviors fall into separate clusters. Results show improved detection of illegal gambling sites and improved classification of user activities, offering new insights for combating these sites.
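The initial clustering step can be illustrated with a minimal sketch of agglomerative clustering that uses a distance cache and a min-heap. The abstract does not state the linkage criterion, the distance measure, or the stopping rule, so average linkage, cosine distance, and a distance threshold are assumptions made here for illustration. The names `cosine_distance` and `agglomerative_cluster` are hypothetical, and the input vectors stand in for the SDAE-compressed features.

```python
import heapq
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agglomerative_cluster(vectors, threshold):
    """Average-linkage clustering with a distance cache and a min-heap.

    Assumed stopping rule: merging ends once the closest pair of
    clusters is farther apart than `threshold`.
    Returns a sorted list of clusters, each a sorted list of indices.
    """
    n = len(vectors)
    members = {i: [i] for i in range(n)}  # active cluster id -> point indices
    cache = {}  # (smaller id, larger id) -> cached inter-cluster distance
    heap = []   # candidate merges as (distance, id_a, id_b)
    for i in range(n):
        for j in range(i + 1, n):
            d = cosine_distance(vectors[i], vectors[j])
            cache[(i, j)] = d
            heap.append((d, i, j))
    heapq.heapify(heap)
    next_id = n  # ids are never reused, so stale heap entries are detectable

    while heap:
        d, a, b = heapq.heappop(heap)
        if a not in members or b not in members:
            continue  # stale entry: one side was already merged away
        if d > threshold:
            break  # the closest remaining pair is too far apart
        size_a, size_b = len(members[a]), len(members[b])
        merged = members.pop(a) + members.pop(b)
        del cache[(a, b)]
        for c in members:
            # Reuse cached distances instead of recomputing from raw
            # vectors (Lance-Williams update for average linkage).
            d_ac = cache.pop((min(a, c), max(a, c)))
            d_bc = cache.pop((min(b, c), max(b, c)))
            d_new = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
            cache[(c, next_id)] = d_new
            heapq.heappush(heap, (d_new, c, next_id))
        members[next_id] = merged
        next_id += 1

    return sorted(sorted(m) for m in members.values())


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    print(agglomerative_cluster(toy, threshold=0.2))
```

The cache means that each merge only needs the two stored distances per remaining cluster, and the heap returns the closest pair without a full scan. Stale heap entries are skipped lazily rather than removed eagerly, a common trade-off that keeps the code simple at the cost of a larger heap.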
Zhimin Zhang
Qingdao University
Dezhi Han
Shanghai Ship and Shipping Research Institute
Songyang Wu
Ministry of Public Security of the People's Republic of China
Computer Science and Information Systems
Shanghai Maritime University
Zhang et al. studied this question.
synapsesocial.com/papers/68c1bd4854b1d3bfb60eed7e — DOI: https://doi.org/10.2298/csis240930019z