SVD for Recommendation

《A Guide to Singular Value Decomposition for Collaborative Filtering》

A user's rating of a movie can be connected through features: does the user prefer action or sci-fi, is the movie a comedy or a biopic? Build two matrices: one holding each user's affinity for each feature, the other holding the degree to which each movie exhibits each feature. The product of the two matrices should be as close as possible to the original rating matrix, i.e. the objective E below should be as small as possible.

$$E=\frac 12 \sum_{i=1}^n \sum_{j=1}^m I_{ij}\big(V_{ij}-p(U_i,M_j)\big)^2 + \frac {k_u}{2}\sum_{i=1}^n \|U_i\|^2 + \frac {k_m}{2} \sum_{j=1}^m \|M_j\|^2$$

Here n is the number of users and m the number of items. $I_{ij}$ indicates whether user i has rated item j: only rated entries need to be close, unrated entries are ignored. $V_{ij}$ is the observed rating in the training data, and $p(U_i,M_j)$ is our prediction of user i's rating for item j, obtained as the dot product of the two vectors. The last two terms guard against overfitting, and the factor 1/2 on each term is there only to make the derivatives cleaner.

Solve with gradient descent; the algorithm is:

1. Set the starting values of the matrices U, M.
2. Repeat:
   (a) Compute the gradients $\Delta_U$ and $\Delta_M$.
   (b) Set $U \leftarrow U-\mu\Delta_U$, $M \leftarrow M-\mu\Delta_M$.
   until the validation RMSE starts to increase.

where the gradients (written as descent directions, i.e. negative gradients of E) are:

$$-\frac{\partial E}{\partial U_i} = \sum_{j=1}^m I_{ij}\big(V_{ij}-p(U_i,M_j)\big)M_j - k_uU_i ,\quad i=1,\ldots,n$$

$$-\frac{\partial E}{\partial M_j} = \sum_{i=1}^n I_{ij}\big(V_{ij}-p(U_i,M_j)\big)U_i - k_mM_j ,\quad j=1,\ldots,m$$
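The batch algorithm can be sketched in NumPy as follows; the feature dimension d, learning rate mu, regularization constants, and epoch count are illustrative choices, not values from the paper:

```python
import numpy as np

def batch_mf(V, I, d=8, mu=0.02, k_u=0.01, k_m=0.01, epochs=1500, seed=0):
    """Learn U (n x d) and M (m x d) so that U @ M.T approximates V on the
    observed entries (I[i, j] == 1), by batch gradient descent on E."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    U = rng.normal(scale=0.1, size=(n, d))
    M = rng.normal(scale=0.1, size=(m, d))
    for _ in range(epochs):
        R = I * (V - U @ M.T)         # residuals, zeroed on unobserved entries
        grad_U = -R @ M + k_u * U     # dE/dU (descent form of the equations above)
        grad_M = -R.T @ U + k_m * M   # dE/dM
        U -= mu * grad_U
        M -= mu * grad_M
    return U, M
```

Each epoch touches the whole rating matrix at once, which is exactly what makes this the batch variant.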

The above is called batch learning, because every update works on the entire rating matrix. Incomplete incremental learning instead uses a per-user objective:

$$E_i = \frac 12 \sum_{j=1}^m I_{ij}\big(V_{ij}-p(U_i,M_j)\big)^2 + \frac {k_u}{2}\|U_i\|^2 + \frac {k_m}{2} \sum_{j=1}^m I_{ij}\|M_j\|^2$$

with gradients:

$$-\frac{\partial E_i}{\partial U_i} = \sum_{j=1}^m I_{ij}\big(V_{ij}-p(U_i,M_j)\big)M_j - k_uU_i ,\quad i=1,\ldots,n$$

$$-\frac{\partial E_i}{\partial M_j} = I_{ij}\big(V_{ij}-p(U_i,M_j)\big)U_i - k_mI_{ij}M_j = I_{ij}\big[\big(V_{ij}-p(U_i,M_j)\big)U_i - k_mM_j\big] ,\quad j=1,\ldots,m$$

Fully incremental learning updates after every single rating, with the per-rating objective:

$$E_{ij} = \frac 12 \big(V_{ij}-p(U_i,M_j)\big)^2 + \frac {k_u}{2}\|U_i\|^2 + \frac {k_m}{2}\|M_j\|^2$$

Some SVD-style algorithms additionally model a bias for each user and each item. A bias is an individual offset: suppose both a and b think movie X is good, but a rates conservatively and gives a good movie 3 points, while b rates generously and gives it 4. Modeling these per-user and per-item biases makes the prediction more accurate than the algorithm above. Previously the rating was predicted directly as $U_i^T M_j$ (user features times item features); now the biases are added in:

$$p(U_i,M_j,\alpha_i,\beta_j) = a+U_i^TM_j+\alpha_i+\beta_j$$

Here a is the mean of all ratings, $\alpha_i$ is the bias of user i, and $\beta_j$ is the bias of item j; the matrix product term has the same meaning as before. The objective and the derivatives with respect to the biases (the derivatives for the matrices are unchanged):

$$E_{ij} = \frac 12 \big(V_{ij}-p(U_i,M_j,\alpha_i,\beta_j)\big)^2 + \frac {k_u}{2} \|U_i\|^2 + \frac {k_m}{2} \|M_j\|^2 + \frac {k_b}{2} \big(\alpha_i^2 + \beta_j^2\big)$$

$$-\frac{\partial E_{ij}}{\partial \alpha_i} = \big(V_{ij}-p(U_i,M_j,\alpha_i,\beta_j)\big) - k_b\alpha_i$$

$$-\frac{\partial E_{ij}}{\partial \beta_j} = \big(V_{ij}-p(U_i,M_j,\alpha_i,\beta_j)\big) - k_b\beta_j$$
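A single fully incremental step with biases might look like the sketch below; the function name and hyperparameter values are my own, and the updates follow the negative-gradient expressions in this section:

```python
import numpy as np

def sgd_step(U, M, alpha, beta, a, i, j, v, mu=0.02, k_u=0.02, k_m=0.02, k_b=0.02):
    """One fully incremental update for a single observed rating v = V[i, j],
    using the biased prediction p = a + U_i . M_j + alpha_i + beta_j."""
    p = a + U[i] @ M[j] + alpha[i] + beta[j]
    e = v - p                           # error on this one rating
    Ui = U[i].copy()                    # keep the pre-update U_i for M_j's update
    U[i] += mu * (e * M[j] - k_u * U[i])
    M[j] += mu * (e * Ui - k_m * M[j])
    alpha[i] += mu * (e - k_b * alpha[i])
    beta[j] += mu * (e - k_b * beta[j])
    return e
```

Iterating this step over all observed ratings, epoch after epoch, drives the per-rating objective $E_{ij}$ down without ever forming the full matrices.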

Evaluation uses the root mean squared error over the test entries, where $J_{ij}$ indicates whether entry (i, j) is in the test set, A holds the actual ratings, and P the predictions:

$$\mathrm{RMSE}(P,A) = \sqrt{\frac {\sum_{i=1}^n \sum_{j=1}^m J_{ij}\big(A_{ij}-P_{ij}\big)^2}{\sum_{i=1}^n \sum_{j=1}^m J_{ij}}}$$
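A minimal helper matching this definition (the function name is illustrative):

```python
import numpy as np

def masked_rmse(P, A, J):
    """RMSE between predictions P and actual ratings A, restricted to the
    entries where J[i, j] == 1 (the test-set ratings)."""
    J = np.asarray(J).astype(bool)
    return np.sqrt(np.mean((np.asarray(A)[J] - np.asarray(P)[J]) ** 2))
```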

Matrix-Factorization-Based Recommendation

Essentially the same as above.

Further Reading

Matrix-factorization-based recommendation: a simple introduction
SVD in recommender systems: application and implementation (C++)
A brief discussion of matrix factorization in recommender systems
Nonnegative matrix factorization (NMF)
An overview of recommendation algorithms
Ele.me's recommender system: from 0 to 1
A hybrid deep-learning collaborative filtering model for recommender systems
The essence of recommender systems and NetEase Yanxuan's practice