Recommender Systems: Learning Collaborative Filtering Similarity Measures Using Siamese Networks
| dc.contributor.author | Bobadilla, Jesús | |
| dc.contributor.author | Gutierrez, Abraham | |
| dc.date | 2026-02-28 | |
| dc.date.accessioned | 2026-03-09T13:13:20Z | |
| dc.date.available | 2026-03-09T13:13:20Z | |
| dc.identifier.citation | J. Bobadilla, A. Gutierrez. Recommender Systems: Learning Collaborative Filtering Similarity Measures using Siamese Networks, International Journal of Interactive Multimedia and Artificial Intelligence, vol. 9, no. 6, pp. 21-27, 2026, http://doi.org/10.9781/ijimai.2025.03.006 | es_ES |
| dc.identifier.uri | https://reunir.unir.net/handle/123456789/19143 | |
| dc.description.abstract | Improving current similarity measures in collaborative filtering Recommender Systems is relevant, since it contributes to different applications, such as obtaining better big data representations of users and items, implementing dynamic browsers able to navigate through data, and explaining recommendation results. Currently, there are many statistically based similarity measures, some of them tailored to the extraordinarily sparse collaborative filtering scenario. Nevertheless, the hypothesis of the paper is that neural networks can learn similarity measures that improve on existing ones. To accomplish the task, typical neural models cannot be used, and it is necessary to focus on the similarity learning area, in which the goal is to make the model learn a similarity function able to measure how similar two objects are. Siamese networks adequately implement the similarity learning concept, and we have adapted them to collaborative filtering particularities. The results in different scenarios show significant improvements compared to the state of the art. | es_ES |
| dc.language.iso | eng | es_ES |
| dc.publisher | UNIR | es_ES |
| dc.relation.uri | https://www.ijimai.org/index.php/ijimai/article/view/865 | es_ES |
| dc.rights | openAccess | es_ES |
| dc.subject | collaborative filtering | es_ES |
| dc.subject | neural networks | es_ES |
| dc.subject | One-Hot Encoding | es_ES |
| dc.subject | Recommender System | es_ES |
| dc.subject | Siamese Networks | es_ES |
| dc.subject | Similarity Measure | es_ES |
| dc.title | Recommender Systems: Learning Collaborative Filtering Similarity Measures Using Siamese Networks | es_ES |
| dc.type | article | es_ES |
| reunir.tag | ~IJIMAI | es_ES |
| dc.identifier.doi | http://doi.org/10.9781/ijimai.2025.03.006 |

Jesús Bobadilla, Abraham Gutierrez *
Universidad Politécnica de Madrid, Dpto. Sistemas Informáticos, Madrid (Spain)
* Corresponding author: jesus.bobadilla@upm.es (J. Bobadilla), abraham.gutierrez@upm.es (A. Gutierrez)
Received 31 January 2024 | Accepted 1 March 2025 | Published 21 March 2025

I. Introduction

Recommender Systems (RS) [1] is the Artificial Intelligence area focused on personalization. RS recommend products or services to users. Notable commercial RS include Spotify, TripAdvisor, Netflix, and TikTok. To accomplish their task, RS can use the text and images of the items (products or services), so they could recommend a Sci-Fi film based on the similarity between its synopsis and the synopses of other films the user liked; this is content-based filtering. There are other filtering strategies, such as demographic filtering [2], which recommends to an active user the products consumed by users of the same age, sex, nationality, etc. Social filtering is based on followed, followers, and trusted information [3]. Context-based filtering usually makes use of geographical information [4], such as GPS coordinates. The most accurate filtering strategy is Collaborative Filtering (CF) [5]. CF makes use of datasets that contain all the interactions between users and items; typically, they hold the explicit votes that users cast on items, or the implicit interactions between users and items, such as songs listened to, movies watched, products bought, etc. The most accurate RS combine several filtering strategies using ensemble architectures.

The research in this paper is focused on CF RS, so we will act on datasets containing ratings assigned by users to items. This information can be stored in a two-dimensional matrix where each row represents a user, each column represents an item, and each value represents an explicit vote or an implicit rating. Since users can only vote on or consume a tiny proportion of the available items, CF matrices are extraordinarily sparse [6], usually around 98% sparsity. This is relevant in this paper, since we will try to design a neural model capable of measuring the similarity between users, where each user is represented by a sparse vector of ratings. Accurately measuring similarities between sparse vectors is much more difficult than between dense vectors.

The first CF approaches made use of the K-Nearest Neighbors (KNN) algorithm [7]. It directly implements the CF concept: 1) to find the neighbors of the active user, 2) based on the set of neighbors, to predict the ratings of those items not voted by the active user, and 3) to recommend the N highest predictions. The key to improving KNN accuracy is to design a suitable similarity measure between profile vectors and use it to find the neighbors of the active user. The better the similarity measure, the higher the accuracy. Currently, recommendations are made using machine learning matrix factorization and deep learning models such as DeepMF [8] and Neural Collaborative Filtering [9]; they largely improve accuracy compared to KNN, and their performance is better, since once the model has been trained, predictions are processed very fast. Beyond accuracy, there are many objectives in RS, such as novelty [10], diversity [11], trust [12], recommendation explanation [13], big data analysis [14], and information browser design [15]. Most of them can take advantage of improved similarity measures to find similar users or similar items. Some CF similarity measures have been borrowed from the statistical field (Pearson correlation, cosine, sine, Jaccard, MSD, etc.), whereas some others have been heuristically
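
The three KNN steps listed in the introduction (find the neighbors of the active user, predict ratings for the items that user has not voted, recommend the N highest predictions) can be sketched in a few lines of Python. This is only an illustration with classical similarity measures, not the Siamese model proposed in the paper. The toy `RATINGS` data, the function names, the restriction of cosine and MSD to co-rated items, the 1 to 5 rating scale, and the similarity-weighted average used for prediction are all assumptions made for the example.

```python
import math

# Hypothetical toy data: user -> {item: rating on a 1..5 scale}.
# Unrated items are simply absent, mirroring the sparsity of CF matrices.
RATINGS = {
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 4, "i2": 2, "i3": 5, "i4": 4},
    "u3": {"i1": 1, "i2": 5, "i4": 2},
    "u4": {"i3": 4, "i4": 5, "i5": 3},
}


def cosine(a, b):
    """Cosine similarity computed only over co-rated items (an assumption)."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def jaccard(a, b):
    """Proportion of co-rated items over all items rated by either user."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def msd_similarity(a, b, min_rating=1, max_rating=5):
    """1 minus the mean squared difference of normalized co-rated ratings."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    span = max_rating - min_rating
    msd = sum(((a[i] - b[i]) / span) ** 2 for i in common) / len(common)
    return 1.0 - msd


def neighbors(ratings, active, k, sim):
    """Step 1: the k users most similar to the active user."""
    scored = [(sim(ratings[active], r), u)
              for u, r in ratings.items() if u != active]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(u, s) for s, u in scored[:k] if s > 0]


def predict(ratings, active, k, sim):
    """Step 2: similarity-weighted average rating for each unvoted item."""
    seen = ratings[active]
    num, den = {}, {}
    for user, s in neighbors(ratings, active, k, sim):
        for item, rating in ratings[user].items():
            if item not in seen:
                num[item] = num.get(item, 0.0) + s * rating
                den[item] = den.get(item, 0.0) + s
    return {item: num[item] / den[item] for item in num}


def recommend(ratings, active, k, n, sim):
    """Step 3: the n items with the highest predicted rating."""
    preds = predict(ratings, active, k, sim)
    return sorted(preds, key=lambda item: (-preds[item], item))[:n]


if __name__ == "__main__":
    print(neighbors(RATINGS, "u1", k=2, sim=msd_similarity))
    print(recommend(RATINGS, "u1", k=2, n=2, sim=msd_similarity))
```

In this toy data, `u4` shares a single co-rated item with `u1` and still receives the maximum MSD similarity, which illustrates why measuring similarity between sparse rating vectors is harder than between dense ones. Swapping `sim=msd_similarity` for `cosine` or `jaccard` changes only the neighbor selection, so any other similarity function with the same two-argument shape could be plugged in.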





