
TemanWisata/recsys-dev


Recommender System Development

This project focuses on building and evaluating recommender systems that generate personalized suggestions for users. It covers the full pipeline of recommendation model development, including:

  • Data preparation – transforming interaction logs (users, items, timestamps, ratings, etc.) into a structured dataset.
  • Model training – applying classical algorithms (e.g., Matrix Factorization, kNN) and modern deep learning–based methods (e.g., Transformers like SASRec).
  • Evaluation – assessing models with metrics such as MAP@k and Serendipity.
  • Deployment & usage – generating top-k recommendations for users while supporting filtering, ranking, and interpretability.
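The final step, producing top-k recommendations with filtering, can be sketched in plain Python. The function name `top_k_recommendations`, the score dictionary, and the item ids below are illustrative assumptions and are not taken from this repository.

```python
def top_k_recommendations(scores, seen_items, k=10):
    """Return the k highest-scoring items the user has not interacted with yet.

    scores: dict mapping item id -> model score (higher is better).
    seen_items: set of item ids already present in the user's history.
    """
    # Filtering step: drop items the user has already seen.
    candidates = [(item, s) for item, s in scores.items() if item not in seen_items]
    # Ranking step: descending score, ties broken by item id for determinism.
    candidates.sort(key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in candidates[:k]]


# Hypothetical model scores for one user.
scores = {101: 0.9, 102: 0.8, 103: 0.7, 104: 0.4}
recs = top_k_recommendations(scores, seen_items={101}, k=2)
```

Here item 101 is removed because the user has already visited it, so the two best remaining items are returned.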

Model Architecture

SASRec & BERT4Rec

Both models take user sequences as input, each holding a user's past interactions in order. Item embeddings from these sequences are fed to transformer blocks whose main components are multi-head self-attention and a feed-forward network. After one or more stacked attention blocks, the resulting latent representation of the user sequence is used to predict the target items.

SASRec

SASRec is a transformer-based sequential model with a unidirectional attention mechanism and a "Shifted Sequence" training objective. The latent representation of the user sequence is used to predict the next item at every sequence position, and each prediction is based only on the items that precede it.
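A minimal sketch of the two ideas above, written in plain Python without any deep learning framework. The helper names `shifted_sequence_pairs` and `causal_mask` and the item ids are assumptions made for illustration, not code from this repository.

```python
def shifted_sequence_pairs(sequence):
    """Inputs are all items but the last; targets are the same items shifted left by one."""
    return sequence[:-1], sequence[1:]


def causal_mask(length):
    """mask[i][j] is True when position i may attend to position j, i.e. j <= i."""
    return [[j <= i for j in range(length)] for i in range(length)]


history = [10, 20, 30, 40]  # hypothetical item ids in interaction order
inputs, targets = shifted_sequence_pairs(history)
mask = causal_mask(len(inputs))
```

With this history, the model would see `[10, 20, 30]` and be trained to predict `[20, 30, 40]`, while the lower-triangular mask keeps each position from looking at later items.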

BERT4Rec

BERT4Rec is a transformer-based sequential model with a bidirectional attention mechanism and an "Item Masking" training objective (the same idea as masked language modeling, MLM). The latent representation of the user sequence is used to predict the masked items.
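The item masking objective can be illustrated with a short sketch. `MASK_TOKEN`, `mask_items`, and the masking probability are assumptions chosen for the example; the source does not state which values this project uses.

```python
import random

MASK_TOKEN = 0  # assumed id reserved for the mask token


def mask_items(sequence, mask_prob=0.15, seed=None):
    """Randomly replace items with MASK_TOKEN and record what was hidden.

    Returns (masked_sequence, targets), where targets holds the original item
    at masked positions and None everywhere else.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for item in sequence:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            targets.append(item)  # the model must recover this item
        else:
            masked.append(item)
            targets.append(None)  # no loss is computed at this position
    return masked, targets


history = [10, 20, 30, 40, 50]  # hypothetical item ids
masked, targets = mask_items(history, mask_prob=0.3, seed=42)
```

Because attention is bidirectional, the model may use items on both sides of a masked position when predicting it.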

| Difference | SASRec | BERT4Rec |
| --- | --- | --- |
| Training objective | Shifted sequence target | Item masking target |
| Attention | Uni-directional | Bi-directional |
| Transformer block | Check the details below | Check the details below |
| Loss in original paper | Binary cross-entropy (BCE) with 1 negative per positive | Cross-entropy (Softmax) on full items catalog |
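To make the two losses in the last row concrete, here is a small sketch for a single prediction step, using only the standard library. The function names and the example scores are assumptions for illustration; real training code would operate on batched tensors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_one_negative(pos_score, neg_score):
    """BCE with one sampled negative: -log(sigmoid(pos)) - log(1 - sigmoid(neg))."""
    return -math.log(sigmoid(pos_score)) - math.log(1.0 - sigmoid(neg_score))


def softmax_cross_entropy(scores, target_index):
    """Cross-entropy over the full catalog: -log(softmax(scores)[target_index])."""
    max_score = max(scores)  # subtract the max for numerical stability
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[target_index]


# Hypothetical scores for a catalog of four items; item 0 is the true next item.
catalog_scores = [2.0, 0.5, -1.0, 0.0]
bce = bce_one_negative(pos_score=catalog_scores[0], neg_score=catalog_scores[2])
ce = softmax_cross_entropy(catalog_scores, target_index=0)
```

The BCE variant looks at one positive and one sampled negative per step, whereas the softmax variant compares the positive against every item in the catalog.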

Transformer Layers

[Figure: SASRec transformer layers]

[Figure: BERT4Rec transformer layers]

Dataset

The Indonesia Tourism Destination Dataset covers tourist attractions in five major Indonesian cities: Jakarta, Yogyakarta, Semarang, Bandung, and Surabaya. It was used in the Bangkit Academy 2021 capstone project GetLoc and consists of four files:

  • tourism_with_id.csv: contains information on roughly 400 tourist attractions in the five cities
  • user.csv: contains dummy user data for building user-based recommendation features
  • tourism_rating.csv: contains three columns (user, place, and rating) and is the basis for rating-based recommendations
  • package_tourism.csv: contains recommendations for nearby places based on time, cost, and rating
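As a rough sketch of the data preparation step, the ratings file can be grouped into one interaction history per user. The column names `User_Id`, `Place_Id`, and `Place_Ratings`, the sample rows, and the function `load_user_histories` are assumptions for illustration. Row order is kept as a stand-in for interaction order, which is also an assumption, since the file is described as having only user, place, and rating columns.

```python
import csv
import io
from collections import defaultdict

# Tiny in-memory stand-in for tourism_rating.csv; column names and rows are assumed.
SAMPLE_CSV = """User_Id,Place_Id,Place_Ratings
1,179,3
1,344,2
2,5,5
1,373,3
"""


def load_user_histories(csv_file):
    """Group (place, rating) pairs by user, preserving the order of the rows."""
    histories = defaultdict(list)
    for row in csv.DictReader(csv_file):
        user = int(row["User_Id"])
        histories[user].append((int(row["Place_Id"]), float(row["Place_Ratings"])))
    return dict(histories)


histories = load_user_histories(io.StringIO(SAMPLE_CSV))
```

To read the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle, for example `open("tourism_rating.csv", newline="")`, after checking the actual column names.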

Metrics Evaluation

Mean Average Precision at k (MAP@k): Average Precision (AP) measures the precision of a recommendation list while taking the order of the items into account. MAP@k is the mean of AP over all users, computed on each user's top k recommendations.
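A minimal MAP@k sketch follows. Normalization conventions differ between libraries; this version divides each user's AP by min(k, number of relevant items), which is an assumption and may not match the evaluation code used for the results in this README. It also assumes each recommendation list has no duplicate items.

```python
def average_precision_at_k(recommended, relevant, k=10):
    """AP@k for one user: precision at each hit position, averaged over the hits possible."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / min(k, len(relevant))


def map_at_k(recommendations, ground_truth, k=10):
    """Mean of AP@k over all users that appear in ground_truth."""
    users = list(ground_truth)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(recommendations.get(u, []), ground_truth[u], k)
        for u in users
    )
    return total / len(users)


# Hypothetical recommendations and held-out interactions for two users.
recs = {"u1": [1, 2, 3], "u2": [4, 5, 6]}
truth = {"u1": [1, 3], "u2": [7]}
score = map_at_k(recs, truth, k=3)
```

For user `u1` the hits at ranks 1 and 3 give AP = (1/1 + 2/3) / 2, user `u2` has no hits, and the mean of the two is the MAP@3.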

Serendipity: the average relevance of the recommended items, weighted by their dissimilarity from the user's past interactions.
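The definition above leaves the dissimilarity measure open, so the sketch below takes it as a parameter. The function `serendipity_at_k`, the use of 1 minus the maximum similarity to the history, the Jaccard similarity on tags, and the tag data are all assumptions for illustration; the library used to produce the reported numbers may define serendipity differently.

```python
def serendipity_at_k(recommended, relevant, history, similarity, k=10):
    """Mean over the top k of relevance * (1 - max similarity to past items)."""
    relevant = set(relevant)
    total = 0.0
    for item in recommended[:k]:
        if item not in relevant:
            continue  # irrelevant items contribute nothing
        max_sim = max((similarity(item, past) for past in history), default=0.0)
        total += 1.0 - max_sim
    return total / k


# Hypothetical category tags per item, used only to define a similarity.
TAGS = {"a": {"nature"}, "b": {"nature"}, "c": {"museum"}, "d": {"museum", "nature"}}


def jaccard(x, y):
    union = TAGS[x] | TAGS[y]
    return len(TAGS[x] & TAGS[y]) / len(union) if union else 0.0


score = serendipity_at_k(
    recommended=["b", "c", "d"], relevant=["c", "d"], history=["a"],
    similarity=jaccard, k=3,
)
```

In this example item `b` is not relevant, item `c` is relevant and unlike anything in the history, and item `d` is relevant but half-similar to the history, so it counts for less.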

Indonesia Tourism Destination Dataset results

| Model | MAP@10 | Serendipity@10 |
| --- | --- | --- |
| Popular (Base Model) | 0.006990 | 0.000007 |
| EASE (Base Model) | 0.005978 | 0.000296 |
| BERT4Rec Softmax | 0.006615 | 0.000337 |
| SASRec Softmax | 0.008706 | 0.000334 |
| SASRec BCE | 0.007420 | 0.000211 |
| SASRec GBCE | 0.007404 | 0.000261 |

Metrics Visualization

[Figure: metrics visualization]

Results:

  • MAP@10 (Mean Average Precision): SASRec Softmax achieves the highest MAP@10 score (0.008706), so it is the most effective at ranking relevant recommendations near the top of the list. Users are therefore more likely to find items they prefer among the top 10.
  • Serendipity: SASRec Softmax has the second-highest Serendipity@10 score (0.000334), just behind BERT4Rec Softmax (0.000337). It can recommend items that are both relevant and novel, not just similar to what users have already interacted with, which helps users discover new and interesting destinations.

Based on these evaluation metrics, we chose SASRec with Softmax loss as the recommendation model.

Paper

SASRec

BERT4Rec

Requirement

Setup Environment for Development

1. Install Dependencies

uv sync

2. Install Pre-Commit

uv run pre-commit install

Or you can use the make command:

make initial-setup
