Jill-Jênn Vie

Researcher at Inria

% Using Ratings & Posters\newline for Anime & Manga Recommendations \vspace{2mm}
% \alert{Jill-Jênn Vie}¹³ \and Florian Yger² \and Ryan Lahfa³ \and Basile \nolinebreak Clement³ \and Kévin Cocchi³ \and Thomas Chalumeau³ \and Hisashi Kashima¹\textsuperscript4
% ¹ RIKEN Center for Advanced Intelligence Project (Tokyo)\newline ² Université Paris-Dauphine (France)\newline ³ Mangaki (Paris, France)\newline \textsuperscript4 Kyoto University
---
theme: Frankfurt
section-titles: false
header-includes:
    - \usepackage{tikz}
    - \usepackage{array}
    - \usepackage{icomma}
    - \usepackage{multicol,booktabs}
    - \def\R{\mathcal{R}}
handout: true
---

Mangaki

Mangaki

Mangaki, recommendations of anime/manga

Rate anime/manga and receive recommendations

2,000 users, 10,000 anime/manga, 350,000 ratings

Mangaki

Build a profile

Mangaki prioritizes your watchlist

Browse the rankings: top works

Why nonprofit?

Everything is open source: \alert{\texttt{github.com/mangaki}} (Python, Vue.js)

Awards: Microsoft Prize (2014), Japan Foundation (2016)

Browse the rankings: precious pearls

RIKEN Center for Advanced Intelligence Project

Authors

\begin{minipage}{2.6cm}\centering Jill-Jênn Vie\end{minipage}\begin{minipage}[c]{2.6cm}\centering Florian Yger\end{minipage}\begin{minipage}[c]{2.5cm}\centering Ryan Lahfa\end{minipage}\begin{minipage}[c]{2.6cm}Hisashi Kashima\end{minipage}

Outline

1. Usual algorithms for recommender systems

2. Our method

3. Experiments

Recommender Systems

Recommender Systems

Recommender Systems

Problem

Example

\begin{tabular}{ccccc} & \includegraphics[height=2.5cm]{figures/1.jpg} & \includegraphics[height=2.5cm]{figures/2.jpg} & \includegraphics[height=2.5cm]{figures/3.jpg} & \includegraphics[height=2.5cm]{figures/4.jpg}\\
Sacha & ? & 5 & 2 & ?\\
Ondine & 4 & 1 & ? & 5\\
Pierre & 3 & 3 & 1 & 4\\
Joëlle & 5 & ? & 2 & ? \end{tabular}

Recommender Systems

Problem

Example

\begin{tabular}{ccccc} & \includegraphics[height=2.5cm]{figures/1.jpg} & \includegraphics[height=2.5cm]{figures/2.jpg} & \includegraphics[height=2.5cm]{figures/3.jpg} & \includegraphics[height=2.5cm]{figures/4.jpg}\\
Sacha & \alert{3} & 5 & 2 & \alert{2}\\
Ondine & 4 & 1 & \alert{4} & 5\\
Pierre & 3 & 3 & 1 & 4\\
Joëlle & 5 & \alert{2} & 2 & \alert{5} \end{tabular}

Usual techniques

Content-based

\hfill (work features: directors, genre, etc.)

Collaborative filtering

\hfill (solely based on ratings)

Hybrid recommender systems

\hfill (combine those two)

Example: $K$-Nearest Neighbors

\includegraphics{figures/ratings1.pdf}
\includegraphics{figures/knn.pdf}

Example: $K$-Nearest Neighbors

\includegraphics{figures/ratings2.pdf}
\includegraphics{figures/sim.pdf}
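A minimal sketch of user-based $K$-nearest neighbors on the toy rating table from the problem statement. The cosine similarity over co-rated works and the similarity-weighted average are illustrative assumptions; the slides do not say which similarity or aggregation Mangaki uses.

```python
from math import sqrt

# Toy ratings from the example table; None marks an unknown rating.
ratings = {
    "Sacha":  [None, 5, 2, None],
    "Ondine": [4, 1, None, 5],
    "Pierre": [3, 3, 1, 4],
    "Joëlle": [5, None, 2, None],
}


def cosine_similarity(a, b):
    """Cosine similarity restricted to the works both users rated."""
    common = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not common:
        return 0.0
    dot = sum(x * y for x, y in common)
    norm_a = sqrt(sum(x * x for x, _ in common))
    norm_b = sqrt(sum(y * y for _, y in common))
    return dot / (norm_a * norm_b)


def predict_knn(user, work, k=2):
    """Similarity-weighted average over the k most similar users who rated `work`."""
    neighbors = [
        (cosine_similarity(ratings[user], ratings[other]), ratings[other][work])
        for other in ratings
        if other != user and ratings[other][work] is not None
    ]
    neighbors.sort(reverse=True)
    top = neighbors[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None
    return sum(sim * r for sim, r in top) / total


# Predict Sacha's rating for the first work (index 0).
prediction = predict_knn("Sacha", 0)
```

With so few co-rated works, the similarities are close to 1 for everyone, which is one reason real systems usually center ratings or require a minimum overlap before trusting a neighbor.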

Matrix factorization $\rightarrow$ reduce dimension to generalize

\vspace{-7mm}

\(R = \left(\begin{array}{c} \R_1\\ \R_2\\ \vdots\\ \R_n \end{array}\right) = \raisebox{-1cm}{\begin{tikzpicture} \draw (0,0) rectangle (2.5,2); \end{tikzpicture}} = \raisebox{-1cm}{\begin{tikzpicture} \draw (0,0) rectangle ++(1,2); \draw node at (0.5,1) {$C$}; \draw (1.1,1) rectangle ++(2.5,1); \draw node at (2.35,1.5) {$P$}; \end{tikzpicture}}\)

\(\text{$R$: 2k users $\times$ 15k works} \iff \left\{\begin{array}{l} \text{$C$: 2k users $\times$ \alert{20 profiles}}\\ \text{$P$: \alert{20 profiles} $\times$ 15k works}\\ \end{array}\right.\)

$\R_\text{Bob}$ is a linear combination of profiles $P_1$, $P_2$, etc.

\pause

Interpreting Key Profiles

\begin{tabular}{@{}lccc@{}} If $P$ & $P_1$: adventure & $P_2$: romance & $P_3$: plot twist\\
And $C_u$ & $0.2$ & $-0.5$ & $0.6$ \end{tabular}

$\Rightarrow$ $u$ \alert{somewhat likes} adventure, \alert{hates} romance, \alert{loves} plot twists.
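As a worked example of how these coefficients turn into a prediction, the sketch below takes the dot product of $C_u$ with the profile values of a single work. The work's profile values are made up for illustration; only $C_u$ comes from the table.

```python
# Coefficients of user u on each profile, from the table above.
C_u = [0.2, -0.5, 0.6]

# Hypothetical profile values for one work: how strongly it expresses
# adventure, romance and plot twists (invented for this example).
P_work = [1.0, 0.0, 2.0]

# The predicted rating is the dot product of the two vectors.
predicted = sum(c * p for c, p in zip(C_u, P_work))
```

A work heavy on plot twists and free of romance gets a positive score for this user, mostly driven by the $0.6$ coefficient.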

Weighted Alternating Least Squares (Zhou, 2008)

$R$ ratings, \alert{$U$} user features, \alert{$V$} work features.

\[R = \alert{UV^T} \qquad \Rightarrow \qquad r_{ij} \simeq \hat{r}_{ij}^{ALS} \triangleq \alert{U_i} \cdot \alert{V_j}.\]

Objective function to minimize

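$U, V \mapsto \sum_{i, j \textnormal{ known}}~(r_{ij} - U_i \cdot V_j)^2 + \lambda \left(\sum_i N_i {\lVert U_i \rVert}^2 + \sum_j M_j {\lVert V_j \rVert}^2\right)$ where $N_i$ is the number of works rated by user $i$ and $M_j$ is the number of users who rated work $j$.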

Algorithm

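Until convergence (about 10 iterations), alternate two steps: fix $V$ and solve for every $U_i$ by regularized least squares, then fix $U$ and solve for every $V_j$.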
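The alternating updates can be sketched in a few lines. To keep the example dependency-free it assumes rank-1 factors (one number per user and per work), so each least-squares step has a scalar closed form; the toy ratings, `lam` and the random initialization are illustrative choices, not Mangaki's settings.

```python
import random

# Known ratings as (user, work, rating) triples, taken from the toy table
# (users 0-3 are Sacha, Ondine, Pierre, Joëlle).
known = [
    (0, 1, 5), (0, 2, 2),
    (1, 0, 4), (1, 1, 1), (1, 3, 5),
    (2, 0, 3), (2, 1, 3), (2, 2, 1), (2, 3, 4),
    (3, 0, 5), (3, 2, 2),
]
n_users, n_works, lam = 4, 4, 0.1

by_user = {i: [(j, r) for u, j, r in known if u == i] for i in range(n_users)}
by_work = {j: [(i, r) for i, w, r in known if w == j] for j in range(n_works)}


def objective(U, V):
    """Squared error on known ratings plus the weighted L2 penalty."""
    error = sum((r - U[i] * V[j]) ** 2 for i, j, r in known)
    penalty = sum(len(by_user[i]) * U[i] ** 2 for i in range(n_users))
    penalty += sum(len(by_work[j]) * V[j] ** 2 for j in range(n_works))
    return error + lam * penalty


def als_rank_one(n_iter=10, seed=0):
    rng = random.Random(seed)
    U = [rng.uniform(0.5, 1.5) for _ in range(n_users)]
    V = [rng.uniform(0.5, 1.5) for _ in range(n_works)]
    history = [objective(U, V)]
    for _ in range(n_iter):
        # Fix V: each U_i has a closed-form ridge solution.
        for i in range(n_users):
            num = sum(r * V[j] for j, r in by_user[i])
            den = sum(V[j] ** 2 for j, _ in by_user[i]) + lam * len(by_user[i])
            U[i] = num / den
        # Fix U: the same closed form gives each V_j.
        for j in range(n_works):
            num = sum(r * U[i] for i, r in by_work[j])
            den = sum(U[i] ** 2 for i, _ in by_work[j]) + lam * len(by_work[j])
            V[j] = num / den
        history.append(objective(U, V))
    return U, V, history


U, V, history = als_rank_one()
```

Each half-step minimizes the objective exactly with the other factor held fixed, so `history` never increases. With more than one latent dimension the scalar division becomes a small linear solve per user and per work.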

Visualizing the first two components of anime $V_j$

\alert{Closer} points mean similar taste

\vspace{-1cm}

Find your taste by plotting the first two components of $U_i$

You will \alert{like} anime that are \alert{in your direction}

\vspace{-1cm}

Drawback with collaborative filtering

Issue: Item Cold-Start

No way to distinguish between unrated works.

\pause

But we have posters!

Our method

Our method

Illustration2Vec (Saito and Matsui, 2015)

\centering

LASSO for sparse linear regression

$T$ matrix of 15000 works $\times$ 502 tags ($t_{jk}$: tag $k$ appears in item $j$)

\pause

Least Absolute Shrinkage and Selection Operator (LASSO)

\[P_i \mapsto \frac1{2 N_i} {\lVert \R_i - P_i T^T \rVert}_2^2 + \alpha \alert{ {\lVert P_i \rVert}_1}.\]

\noindent where $N_i$ is the number of items rated by user $i$.
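A minimal sketch of this objective for a single user, solved by cyclic coordinate descent with soft-thresholding. This is a standard LASSO solver, swapped in here for illustration; the slides do not say which solver is used. The tag matrix, the ratings and both `alpha` values are hypothetical, and only the rows of $T$ for works the user rated are passed in.

```python
def soft_threshold(x, threshold):
    """Shrink x toward zero; values within the threshold become exactly 0."""
    if x > threshold:
        return x - threshold
    if x < -threshold:
        return x + threshold
    return 0.0


def lasso_user(tags, user_ratings, alpha, n_iter=100):
    """Minimize 1/(2N) * ||r - T p||^2 + alpha * ||p||_1 by coordinate descent.

    tags: N tag vectors (one per rated work); user_ratings: the N ratings.
    """
    n, n_tags = len(tags), len(tags[0])
    p = [0.0] * n_tags
    for _ in range(n_iter):
        for k in range(n_tags):
            # Correlation of tag k with the residual computed without tag k.
            rho = sum(
                tags[j][k] * (user_ratings[j]
                              - sum(p[l] * tags[j][l] for l in range(n_tags) if l != k))
                for j in range(n)
            ) / n
            z = sum(tags[j][k] ** 2 for j in range(n)) / n
            p[k] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return p


# Hypothetical tags [action, romance, comedy] for four works rated by one user.
T = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]]
r = [2.0, 2.0, -2.0, -2.0]

sparse_profile = lasso_user(T, r, alpha=0.1)   # comedy weight is driven to 0
empty_profile = lasso_user(T, r, alpha=10.0)   # penalty too strong: all zeros
```

The zero weight on the uninformative tag is what makes the profile readable as an explanation: only tags that actually help predict the user's ratings keep a nonzero coefficient.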

\pause

Interpretation and explanation of user preferences

Combine models

Which model should we choose between ALS and LASSO?

Answer

Both!

Methods

boosting, bagging, model stacking, blending.

Idea

find $\alert<2>{\alpha\only<2>{_j}}$ s.t. $\hat{r}_{ij} \triangleq \alert<2>{\alpha\only<2>{_j}} \hat{r}_{ij}^{ALS} + (1 - \alert<2>{\alpha\only<2>{_j}}) \hat{r}_{ij}^{LASSO}.$

Examples of $\alpha_j$

\centering \includegraphics{figures/curve1.pdf}
Mimics ALS \(\hat{r}_{ij} \triangleq \alert{1} \hat{r}_{ij}^{ALS} + \alert{0} \hat{r}_{ij}^{LASSO}.\)

Examples of $\alpha_j$

\centering \includegraphics{figures/curve2.pdf}
Mimics LASSO \(\hat{r}_{ij} \triangleq \alert{0} \hat{r}_{ij}^{ALS} + \alert{1} \hat{r}_{ij}^{LASSO}.\)

Examples of $\alpha_j$

\centering \includegraphics{figures/curve3.pdf} \(\hat{r}_{ij}^{BALSE} = \begin{cases} \hat{r}_{ij}^{ALS} & \text{if item $j$ was rated at least $\gamma$ times}\\ \hat{r}_{ij}^{LASSO} & \text{otherwise} \end{cases}\) But we can’t: \alert{Not differentiable!}

Examples of $\alpha_j$

\centering \includegraphics{figures/curve4.pdf} \(\hat{r}_{ij}^{BALSE} = \alert{\sigma(\beta(R_j - \gamma))} \hat{r}_{ij}^{ALS} + \left(1 - \alert{\sigma(\beta(R_j - \gamma))}\right) \hat{r}_{ij}^{LASSO}\) where $R_j$ is the number of ratings of item $j$; $\beta$ and $\gamma$ are learned by stochastic gradient descent.

We call this gate the \alert{Steins;Gate}.
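A minimal sketch of the gate, assuming the two base predictions are already available. The values of `beta` and `gamma` below are hand-picked for illustration, whereas in the model they are learned; `n_ratings` plays the role of $R_j$.

```python
from math import exp


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)
    return z / (1.0 + z)


def blend(r_als, r_lasso, n_ratings, beta, gamma):
    """Gate between ALS and LASSO predictions based on the item's rating count."""
    gate = sigmoid(beta * (n_ratings - gamma))
    return gate * r_als + (1.0 - gate) * r_lasso


# Hypothetical values: beta and gamma would be learned, not hand-picked.
beta, gamma = 1.0, 3.0
cold = blend(r_als=4.0, r_lasso=1.0, n_ratings=0, beta=beta, gamma=gamma)
at_threshold = blend(r_als=4.0, r_lasso=1.0, n_ratings=3, beta=beta, gamma=gamma)
popular = blend(r_als=4.0, r_lasso=1.0, n_ratings=50, beta=beta, gamma=gamma)
```

An unrated item leans on the LASSO prediction, a heavily rated one on ALS, and an item with exactly `gamma` ratings gets the average of the two. A larger `beta` makes the transition sharper, approaching the hard threshold while staying differentiable.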

Blended Alternate Least Squares with Explanation

\centering

Experiments

Experiments

Dataset: Mangaki

Evaluation: Root Mean Squared Error (RMSE)

If we predict $\hat{r}_{ij}$ for each of the $n$ user-work pairs $(i, j)$ in the test set,
while the true rating is $r_{ij}$:

\[RMSE(\hat{r}, r) = \sqrt{\frac1n \sum_{i, j} (\hat{r}_{ij} - r_{ij})^2}.\]
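The formula translates directly into code. The predictions and true ratings below are made up to show the call, not taken from the Mangaki dataset.

```python
from math import sqrt


def rmse(predicted, truth):
    """Root mean squared error over paired lists of predictions and true ratings."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two non-empty lists of equal length")
    return sqrt(sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth))


# Hypothetical predictions and true ratings for three test pairs.
score = rmse([3.0, 5.0, 1.0], [3.0, 3.0, 2.0])
```

Because errors are squared before averaging, one prediction that is off by 2 weighs as much as four predictions that are off by 1.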

Cross-validation

Different sets of items:

Results

\centering

Summing up

We presented BALSE, a model that:

to \alert{improve} the recommendations, and \alert{explain} them.

\pause

Further work

Coming soon: Watching assistant

Thank you! \hfill jj@mangaki.fr

\centering

Try it: \alert{https://mangaki.fr} \hfill Twitter: \alert{@MangakiFR}

\raggedright

Read the article

\small Using Posters to Recommend Anime and Mangas in a Cold-Start Scenario

\normalsize \alert{github.com/mangaki/balse} (PDF on arXiv, front page of Hacker News)

Results of Mangaki Data Challenge: \alert{research.mangaki.fr}

  1. Ronnie Wang (Microsoft Suzhou, China)
  2. Kento Nozawa (Tsukuba University, Japan)
  3. Jo Takano (Kobe University, Japan)