This project focuses on building and evaluating recommender systems that generate personalized suggestions for users. It covers the full pipeline of recommendation model development, including:
- Data preparation – transforming interaction logs (users, items, timestamps, ratings, etc.) into a structured dataset.
- Model training – applying classical algorithms (e.g., Matrix Factorization, kNN) and modern deep learning–based methods (e.g., Transformers like SASRec).
- Evaluation – assessing models with metrics such as MAP@k and Serendipity.
- Deployment & usage – generating top-k recommendations for users while supporting filtering, ranking, and interpretability.
Both models take as input a user sequence containing the user's previous interaction history. Item embeddings from this sequence are fed through transformer blocks, whose main components are multi-head self-attention and a position-wise feed-forward network. After one or more stacked attention blocks, the resulting latent representation of the user sequence is used to predict target items.
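The attention step above can be sketched as a minimal single-head self-attention over item embeddings in NumPy (a simplified illustration: the learned query/key/value projections and the feed-forward sublayer of a real transformer block are omitted):

```python
import numpy as np

def self_attention(X, causal=True):
    """Single-head self-attention over a (seq_len, d) matrix of item embeddings.
    With causal=True (SASRec-style), position i may only attend to positions j <= i."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)  # (seq_len, seq_len) attention scores
    if causal:
        # Mask out future positions for unidirectional attention.
        future = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    # Row-wise softmax (numerically stabilized).
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ X, weights
```

With `causal=False` every position attends to the whole sequence, which corresponds to BERT4Rec's bidirectional setting.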
SASRec is a transformer-based sequential model with a unidirectional attention mechanism and a "Shifted Sequence" training objective. The latent representation of the user sequence is used to predict the next item at every sequence position, where each prediction is based only on the preceding items.
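The "Shifted Sequence" objective amounts to pairing each prefix of the sequence with the item that follows it, which can be illustrated as:

```python
def shifted_targets(seq):
    """SASRec's "Shifted Sequence" objective: at position t the model, having
    seen seq[:t+1], must predict the next item seq[t+1]. Inputs drop the last
    item; targets drop the first."""
    return seq[:-1], seq[1:]

inputs, targets = shifted_targets([3, 7, 2, 9])
# inputs  = [3, 7, 2]  (prefix visible to the model at each position)
# targets = [7, 2, 9]  (next item to predict at each position)
```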
BERT4Rec is a transformer-based sequential model with a bidirectional attention mechanism and an "Item Masking" (i.e., MLM) training objective. The latent representation of the user sequence is used to predict the masked items.
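Item masking can be sketched as follows (a minimal illustration; the masking probability and the reserved mask id are assumptions, and production code also applies the random-replacement tricks from the MLM literature):

```python
import random

MASK = 0  # assumed reserved id for the [mask] token; real item ids start at 1

def mask_items(seq, p=0.2, rng=None):
    """BERT4Rec's "Item Masking" (MLM) objective: replace a random subset of
    items with a mask token; the model must recover the originals using both
    left and right context via bidirectional attention."""
    rng = rng or random.Random()
    masked, labels = [], []
    for item in seq:
        if rng.random() < p:
            masked.append(MASK)
            labels.append(item)   # this position is scored against the true item
        else:
            masked.append(item)
            labels.append(None)   # this position contributes no loss
    return masked, labels
```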
| Difference | SASRec | BERT4Rec |
|---|---|---|
| Training objective | Shifted sequence target | Item masking target |
| Attention | Uni-directional | Bi-directional |
| Transformer block | Check the details below | Check the details below |
| Loss in original paper | Binary cross-entropy (BCE) with 1 negative per positive | Cross-entropy (Softmax) on full items catalog |
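The two loss setups in the last row can be compared on a single position's logits (a minimal NumPy sketch; real implementations operate on batches and use numerically stable library ops):

```python
import numpy as np

def bce_one_negative(logits, pos, neg):
    """BCE over one positive and one sampled negative item (SASRec paper)."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    return -(np.log(sigmoid(logits[pos])) + np.log(1.0 - sigmoid(logits[neg])))

def full_softmax_ce(logits, pos):
    """Cross-entropy over the full item catalog (BERT4Rec paper)."""
    z = logits - logits.max()  # stabilize the softmax
    return -(z[pos] - np.log(np.exp(z).sum()))
```

The BCE variant only ever touches two item scores per position, while the softmax variant normalizes over the whole catalog, which is more expensive but gives a stronger training signal.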
The Indonesia Tourism Destination Dataset contains information on tourist attractions in five major Indonesian cities: Jakarta, Yogyakarta, Semarang, Bandung, and Surabaya. The dataset was used in GetLoc, a Bangkit Academy 2021 capstone project. It consists of four files:
- `tourism_with_id.csv`: information on ~400 tourist attractions in the five major cities
- `user.csv`: dummy user data for building user-based recommendation features
- `tourism_rating.csv`: three columns (user, place, rating) used to build the rating-based recommender
- `package_tourism.csv`: recommendations for nearby places based on time, cost, and rating
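For the sequential models, the rating log has to be grouped into per-user item sequences. A minimal sketch, using hypothetical in-memory rows shaped like `tourism_rating.csv` (the real column names and a relevance threshold of 3 are assumptions):

```python
from collections import defaultdict

# Hypothetical (user, place, rating) rows mimicking tourism_rating.csv.
rows = [
    (1, 101, 4), (1, 205, 5), (2, 101, 3), (1, 307, 2), (2, 412, 5),
]

def build_sequences(rows, min_rating=3):
    """Group the rating log into per-user item sequences, keeping only
    interactions at or above the assumed relevance threshold."""
    seqs = defaultdict(list)
    for user, place, rating in rows:
        if rating >= min_rating:
            seqs[user].append(place)
    return dict(seqs)
```

A real pipeline would also sort each user's interactions by timestamp before building the sequence.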
Mean Average Precision at k (MAP@k): Average Precision (AP) measures the precision of recommendations while taking their order into account, and MAP@k is the mean of AP over all users.
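The metric can be sketched as follows (one common AP@k formulation; the normalization by `min(|relevant|, k)` is an assumption, as conventions vary between libraries):

```python
def average_precision_at_k(recommended, relevant, k=10):
    """AP@k: precision at each rank where a relevant item appears,
    averaged over min(|relevant|, k)."""
    relevant = set(relevant)
    hits, score = 0, 0.0
    for rank, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / rank
    return score / min(len(relevant), k) if relevant else 0.0

def map_at_k(recs_per_user, rels_per_user, k=10):
    """MAP@k: mean AP@k across all users."""
    aps = [average_precision_at_k(r, g, k)
           for r, g in zip(recs_per_user, rels_per_user)]
    return sum(aps) / len(aps)
```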
Serendipity is measured as the average relevance of recommended items, weighted by their dissimilarity from the user's past interactions.
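That definition can be sketched as follows (a simplified illustration: the per-item similarity values are assumed to be precomputed in `history_sim`; a real pipeline would derive them from item features or co-occurrence statistics):

```python
def serendipity_at_k(recommended, relevant, history_sim, k=10):
    """Average relevance of the top-k items, each weighted by its
    dissimilarity (1 - similarity) to the user's past interactions.
    history_sim maps item -> similarity in [0, 1] to the user's history."""
    relevant = set(relevant)
    top = recommended[:k]
    scores = [(item in relevant) * (1.0 - history_sim.get(item, 0.0))
              for item in top]
    return sum(scores) / len(top)
```

A relevant but familiar item (similarity near 1) contributes almost nothing, while a relevant and unfamiliar item contributes fully, which is why the scores in the table below are so small.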
| Model | MAP@10 | Serendipity@10 |
|---|---|---|
| Popular (Base Model) | 0.006990 | 0.000007 |
| Ease (Base Model) | 0.005978 | 0.000296 |
| BERT4Rec Softmax | 0.006615 | 0.000337 |
| SASRec Softmax | 0.008706 | 0.000334 |
| SASRec BCE | 0.007420 | 0.000211 |
| SASRec GBCE | 0.007404 | 0.000261 |
Results:
- MAP@10 (Mean Average Precision): SASRec Softmax achieves the highest MAP@10 score (0.008706), meaning it is the most effective at ranking relevant recommendations at the top of the list for users. This indicates users are more likely to find items they prefer among the top 10.
- Serendipity: SASRec Softmax also has one of the highest Serendipity@10 scores (0.000334), showing it can recommend items that are both relevant and novel (not just similar to what users have already interacted with). This helps users discover new and interesting destinations.
Based on evaluation metrics, we choose SASRec with Softmax as the recommendation model.
- SASRec original paper: Self-Attentive Sequential Recommendation
- Turning Dross Into Gold Loss: is BERT4Rec really better than SASRec?
- gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling
- BERT4Rec original paper: BERT4Rec: Sequential Recommendation with Bidirectional Encoder Representations from Transformer
- Comparison of BERT4Rec implementations: A Systematic Review and Replicability Study of BERT4Rec for Sequential Recommendation
- uv
- Python v3.12.*
- Indonesia Tourism Destination Dataset
```
uv sync
uv run pre-commit install
```

Or you can use the make command:

```
make initial-setup
```

