L2 regularization
$$
\frac{1}{N} \sum_{n=1}^N \log\left(1+\exp(-y_n W^T X_n)\right) + \lambda \left\| W \right\|_2^2
$$
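The objective above is the L2-regularized logistic regression loss. As a sanity check it can be evaluated directly; below is a minimal NumPy sketch (the function name `l2_logistic_loss` and the use of `np.logaddexp` for numerical stability are my own choices, not from the notes).

```python
import numpy as np

def l2_logistic_loss(W, X, y, lam):
    """(1/N) * sum_n log(1 + exp(-y_n * W^T x_n)) + lam * ||W||_2^2.

    X: (N, d) data matrix, y: (N,) labels in {-1, +1},
    W: (d,) weight vector, lam: regularization strength lambda.
    """
    margins = y * (X @ W)                              # y_n * W^T x_n for each sample
    data_term = np.mean(np.logaddexp(0.0, -margins))   # stable log(1 + exp(-margin))
    return data_term + lam * np.dot(W, W)              # plus lambda * ||W||_2^2
```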
### kernel logistic regression
Logistic regression with L2 regularization:
$$
\min_W \ \frac{\lambda}{N} W^T W + \frac{1}{N} \sum_{n=1}^N \log\left(1+\exp(-y_n W^T X_n)\right)
$$

The optimal $W$ can be written as a linear combination of the training points,

$$
W = \sum_{n=1}^N \beta_n X_n ,
$$

and substituting this in (replacing the inner products $X_n^T X_m$ with a kernel $K(X_n, X_m)$) gives the objective in $\beta$:

$$
\min_{\beta} \ \frac{\lambda}{N} \sum_{n=1}^N \sum_{m=1}^N \beta_n \beta_m K(X_n, X_m) + \frac{1}{N} \sum_{n=1}^N \log\left(1+\exp\left(-y_n \sum_{m=1}^N \beta_m K(X_m, X_n)\right)\right)
$$
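The kernelized objective only needs the Gram matrix, so it is easy to write down in code; a minimal NumPy sketch follows (the helper names `rbf_kernel` and `kernel_logistic_loss`, and the RBF kernel itself, are illustrative assumptions, not part of the notes).

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) Gram matrix; one possible kernel choice."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * d2)

def kernel_logistic_loss(beta, K, y, lam):
    """(lam/N) * beta^T K beta
       + (1/N) * sum_n log(1 + exp(-y_n * sum_m beta_m K(x_m, x_n))).

    K: (N, N) Gram matrix with K[n, m] = K(x_n, x_m); y: (N,) labels in {-1, +1}.
    """
    N = len(y)
    scores = K @ beta                                  # sum_m beta_m K(x_m, x_n) for each n
    reg = (lam / N) * (beta @ K @ beta)                # quadratic regularizer in beta
    data = np.mean(np.logaddexp(0.0, -y * scores))     # stable log(1 + exp(-y * score))
    return reg + data
```

Any unconstrained optimizer can then minimize this function over $\beta$.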
Why is the optimal $W$ a linear combination of the $X_n$?
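A sketch of the standard answer, written out here for completeness: decompose $W$ into a part inside the span of the data and an orthogonal part,

$$
W = W_{\parallel} + W_{\perp}, \qquad W_{\parallel} \in \operatorname{span}\{X_1,\dots,X_N\}, \qquad W_{\perp}^T X_n = 0 \ \text{for all } n .
$$

Every $W^T X_n = W_{\parallel}^T X_n$, so the log-loss term does not depend on $W_{\perp}$, while

$$
W^T W = W_{\parallel}^T W_{\parallel} + W_{\perp}^T W_{\perp} \ \ge \ W_{\parallel}^T W_{\parallel} .
$$

Hence any nonzero $W_{\perp}$ only increases the regularizer, and the minimizer must satisfy $W = \sum_{n=1}^N \beta_n X_n$ (the representer-theorem argument).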
### Solving
Coordinate descent, among other methods.
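As one possible illustration of a coordinate-style solver for the $\beta$ objective, here is a hedged sketch that takes a single gradient step along one coordinate at a time (the function name `coordinate_descent_klr`, the fixed step size, and the epoch count are assumptions, not from the notes; an exact per-coordinate line search or plain gradient descent would also work).

```python
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

def coordinate_descent_klr(K, y, lam, n_epochs=50, lr=0.1):
    """Coordinate-wise gradient steps on the beta objective (a sketch only):
    sweep over coordinates m, nudging beta_m along its partial derivative.

    K: (N, N) symmetric Gram matrix, y: (N,) labels in {-1, +1}.
    """
    N = len(y)
    beta = np.zeros(N)
    for _ in range(n_epochs):
        for m in range(N):
            scores = K @ beta              # recomputed each step; O(N^2), fine for a sketch
            p = expit(-y * scores)         # sigmoid(-y_n * score_n)
            # partial derivative of the objective with respect to beta_m
            grad_m = (2.0 * lam / N) * (K[m] @ beta) + np.mean(-y * p * K[:, m])
            beta[m] -= lr * grad_m         # single gradient step on coordinate m
    return beta
```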