# Boosting

## Understanding Boosting

Boosting assumes an additive model: $$F\_M(x) = \sum\_{m=1}^M \beta\_m f\_m(x;\gamma\_m)$$\
Optimization objective: $$\min\_{\beta\_m,\gamma\_m} \sum\_{i=1}^N L(y\_i, \sum\_{m=1}^M \beta\_m f\_m(x\_i;\gamma\_m))$$\
Optimizing all $$M$$ terms jointly is hard, so boosting fits them greedily, one stage at a time:\
1\. initialise $$F\_0(x)=0$$\
2\. for m=1 to M

* compute $$(\beta\_m,\gamma\_m) = \arg \min\_{\beta,\gamma} \sum\_{i=1}^N L(y\_i, F\_{m-1}(x\_i) + \beta f(x\_i;\gamma))$$
* set $$F\_m(x) = F\_{m-1}(x) + \beta\_m f\_m(x;\gamma\_m)$$
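
The stagewise loop above can be made concrete with squared-error loss and 1-D decision stumps as base learners, where the greedy minimization at each stage reduces to fitting the current residuals. This is a minimal illustrative sketch under those assumptions; the toy data and helper names are made up:

```python
# Forward stagewise additive modeling: squared-error loss, 1-D decision stumps.
# At each stage, the best (beta, gamma) for squared loss is found by
# least-squares fitting a stump to the residuals of F_{m-1}.

def fit_stump(x, r):
    """Best piecewise-constant split minimizing squared error to residuals r."""
    best = None
    n = len(x)
    for t in sorted(set(x)):
        left = [r[i] for i in range(n) if x[i] < t]
        right = [r[i] for i in range(n) if x[i] >= t]
        cl = sum(left) / len(left) if left else 0.0
        cr = sum(right) / len(right)
        sse = sum((v - cl) ** 2 for v in left) + sum((v - cr) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, t, cl, cr)
    _, t, cl, cr = best
    return lambda z: cl if z < t else cr

def boost(x, y, M=30):
    F = [0.0] * len(x)                              # 1. initialise F_0(x) = 0
    for m in range(M):                              # 2. for m = 1..M
        r = [yi - Fi for yi, Fi in zip(y, F)]       # residuals of F_{m-1}
        f = fit_stump(x, r)                         # greedy arg-min step
        F = [Fi + f(xi) for Fi, xi in zip(F, x)]    # F_m = F_{m-1} + f_m
    return F

x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0.0, 0.1, 0.3, 2.0, 2.1, 2.2, 0.9, 1.0, 1.1, 3.0]
F = boost(x, y, M=30)
mse = sum((yi - Fi) ** 2 for yi, Fi in zip(y, F)) / len(y)
```

For squared loss the scale $$\beta\_m$$ is absorbed into the stump's leaf values, which is why no explicit line search appears.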

## AdaBoost

The full algorithm is in《统计学习方法》p. 138; a few details worth noting:

* Classification error: at iteration m, the error computed is only that of the current base classifier, **not the weighted cumulative error of the first m classifiers**.
* Stopping criterion: only the manually specified M (the number of base classifiers); the loop runs to M even if the accuracy is already high at some step $$n \lt M$$.

![](https://2270971654-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M7DcNFhVrwIk3Tks_pB%2Fsync%2F85ec9606a5d844b8eebb5fda1b3f19ab0f0e927d.png?generation=1589383939193183\&alt=media)

Figure from <http://rob.schapire.net/papers/explaining-adaboost.pdf>
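
The notes above can be sketched end to end: a minimal discrete AdaBoost with decision stumps, run on the 10-point example from《统计学习方法》p. 138. The exhaustive stump search and variable names are my own:

```python
import math

# Discrete AdaBoost with 1-D decision stumps.
x = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]     # toy data from《统计学习方法》p. 138
N = len(x)

def best_stump(w):
    """Search all thresholds and both polarities for the lowest weighted error."""
    best = None
    for t in [i + 0.5 for i in range(-1, 10)]:
        for s in (1, -1):                   # s = +1: predict +1 when x < t
            pred = [s if xi < t else -s for xi in x]
            err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
            if best is None or err < best[0]:
                best = (err, pred)
    return best

w = [1.0 / N] * N                           # initial uniform sample weights
F = [0.0] * N
for m in range(10):
    e_m, pred = best_stump(w)               # error of the CURRENT stump only,
                                            # not the cumulative error of F_m
    beta = 0.5 * math.log((1 - e_m) / e_m)
    F = [Fi + beta * p for Fi, p in zip(F, pred)]
    # weight update: multiply in exp(-y * beta * f(x)), then renormalize
    w = [wi * math.exp(-yi * beta * p) for wi, yi, p in zip(w, y, pred)]
    z = sum(w)
    w = [wi / z for wi in w]

train_err = sum(1 for Fi, yi in zip(F, y)
                if (1 if Fi >= 0 else -1) != yi) / N
```

No single stump can get below 30% error on this data, yet the combined classifier reaches zero training error after a few rounds; the loop still runs to the fixed M regardless, echoing the stopping-criterion note above.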

## Explaining AdaBoost

The objective function is $$E=\sum\_{n=1}^N \exp(-y\_n F\_m(X\_n))$$, where N is the number of samples.

> Why is the loss exponential rather than 0-1?\
> The exponential loss is a convex upper bound of the 0-1 loss, which brings some nice properties, such as fast optimization.\
> However, it is sensitive to outliers and not robust: a single noisy point can pull the decision boundary far away. (This is why LogitBoost came later.)
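
Both claims in the answer, the upper bound and the outlier sensitivity, are easy to verify numerically on the margin $$m = y F(x)$$ (a quick sanity-check sketch, not from the text):

```python
import math

# 0-1 loss vs. exponential loss as functions of the margin m = y * F(x)
def zero_one(m):
    return 1.0 if m <= 0 else 0.0

def exp_loss(m):
    return math.exp(-m)

margins = [i / 10 for i in range(-50, 51)]
# exp(-m) >= 1[m <= 0] everywhere, so the exponential loss upper-bounds 0-1 loss;
# for very negative margins it blows up exponentially instead of saturating at 1
is_upper_bound = all(exp_loss(m) >= zero_one(m) for m in margins)
outlier_penalty = exp_loss(-5)   # a badly misclassified point costs e^5, not just 1
```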

The exponential loss is minimized sequentially over $$F\_m(x) = F\_{m-1}(x) + \beta\_m f\_m(x)$$, with $$F\_{m-1}$$ held fixed at stage m,\
so

$$
\begin{align}
E &= \sum\_{n=1}^N \exp(-y\_n F\_m(X\_n)) \\
&=\sum\_{n=1}^N \exp(-y\_n F\_{m-1}(X\_n) - y\_n \beta\_m f\_m(X\_n) ) \\
&=\sum\_{n=1}^N w\_n^m \exp(-  y\_n \beta\_m f\_m(X\_n) ) \\
&= e^{- \beta\_m} \sum\_{n \in T\_m} w\_n^m + e^{\beta\_m} \sum\_{n \in M\_m} w\_n^m  \\
&= e^{- \beta\_m} \sum\_{n \in T\_m} w\_n^m + e^{- \beta\_m} \sum\_{n \in M\_m} w\_n^m - e^{- \beta\_m} \sum\_{n \in M\_m} w\_n^m + e^{\beta\_m} \sum\_{n \in M\_m} w\_n^m  \\
&= e^{- \beta\_m} \sum\_{n=1}^N w\_n^m + (e^{\beta\_m} - e^{- \beta\_m}) \sum\_{n=1}^N w\_n^m I(f\_m(X\_n) \neq y\_n) \\
\end{align}
$$

where $$T\_m$$ and $$M\_m$$ are the sets of samples that $$f\_m$$ classifies correctly and incorrectly, and

$$
w\_n^m = \exp(-y\_n F\_{m-1}(X\_n) ) \\
y\_n f\_m(X\_n) = 1 - 2I(f\_m(X\_n) \neq y\_n) \quad \text{i.e. } 1 \text{ if classified correctly, } -1 \text{ if misclassified}
$$

E is then minimized by solving $$\frac {\partial E}{\partial \beta\_m} = 0$$:

$$
\frac {\partial E}{\partial \beta\_m} = 0 \\
\Rightarrow \beta\_m = \frac 12 \ln \frac {1-e\_m}{e\_m} \\
e\_m = \frac {\sum\_{n=1}^N w\_n^m I(f\_m(X\_n) \neq y\_n)}{\sum\_{n=1}^N w\_n^m} \qquad \text{the weighted error rate} \\
w\_n^m = \exp(-y\_n F\_{m-1}(X\_n) ) = \prod\_{k=1}^{m-1} \exp(-y\_n \beta\_k f\_k(X\_n) ) \qquad \text{matching the weight update formula: each round multiplies in one more factor} \\
$$
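
The closed form for $$\beta\_m$$ can be sanity-checked numerically: with the weights normalized to sum to 1, the loss becomes $$E(\beta) = e^{-\beta}(1-e\_m) + e^{\beta} e\_m$$, and a brute-force search over $$\beta$$ should land on $$\frac 12 \ln \frac{1-e\_m}{e\_m}$$ (illustrative sketch, with an arbitrary example value of $$e\_m$$):

```python
import math

# Exponential loss at stage m with weights normalized to sum to 1:
# mass (1 - e_m) is classified correctly (contributes e^{-beta}),
# mass e_m is misclassified (contributes e^{+beta}).
def E(beta, e_m):
    return math.exp(-beta) * (1 - e_m) + math.exp(beta) * e_m

e_m = 0.3                                      # example weighted error rate
closed_form = 0.5 * math.log((1 - e_m) / e_m)  # beta_m = (1/2) ln((1 - e_m) / e_m)
grid = [i / 1000 for i in range(-3000, 3001)]  # brute-force search over beta
numeric = min(grid, key=lambda b: E(b, e_m))
```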

[AdaBoost算法](http://www.wengweitao.com/adaboostsuan-fa.html)

### AdaBoost Variants

**Real AdaBoost**\
AdaBoost requires the base classifier's output to be in $$\{-1,1\}$$. If this restriction is dropped and the loss function is changed accordingly to $$E = \sum\_{n=1}^N \exp(-y\_n (F\_{m-1}(x\_n)+f\_m(x\_n)))$$, the result is Real AdaBoost. (The base classifier $$f\_m(x)$$ no longer carries an explicit weight.)\
![](https://2270971654-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M7DcNFhVrwIk3Tks_pB%2Fsync%2F23ee9a89cfeca793ef0123637e6d52eb97ea0697.jpg?generation=1589383942744036\&alt=media)

**Gentle AdaBoost**\
Optimizing the Real AdaBoost loss function with Newton steps instead yields Gentle AdaBoost.\
![](https://2270971654-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M7DcNFhVrwIk3Tks_pB%2Fsync%2F5f6dfbfa3e3ca89e67ec93fbd26e5f6d869ece5a.jpg?generation=1589383939768787\&alt=media)
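
A minimal sketch of the Gentle AdaBoost round structure (following Friedman, Hastie and Tibshirani: each Newton step fits $$f\_m$$ by weighted least squares of y on x, then updates F and the weights). The 1-D linear base learner and toy data are my own choices:

```python
import math

# Gentle AdaBoost: fit f_m by WEIGHTED least squares of y on x with weights w,
# then F <- F + f_m and w <- w * exp(-y * f_m(x)), renormalized.
x = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
y = [-1, -1, -1, 1, 1, 1]
N = len(x)

w = [1.0 / N] * N
F = [0.0] * N
for m in range(20):
    # weighted least-squares line fit f(x) = a + b*x of y on x
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    b = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y)) \
        / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    a = ybar - b * xbar
    f = [a + b * xi for xi in x]                    # Newton-step base learner
    F = [Fi + fi for Fi, fi in zip(F, f)]
    w = [wi * math.exp(-yi * fi) for wi, yi, fi in zip(w, y, f)]
    z = sum(w)
    w = [wi / z for wi in w]

mean_exp_loss = sum(math.exp(-yi * Fi) for yi, Fi in zip(y, F)) / N
```

Because $$f\_m$$ is a regression of $$y \in \{-1,1\}$$, its outputs stay bounded, which is the "gentle" part: no single round can take an extreme step.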

All of these AdaBoost variants use the exponential loss.\
**The problem with exponential loss**: misclassified outliers receive very high weights, so robustness to noise is poor.\
![](https://2270971654-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M7DcNFhVrwIk3Tks_pB%2Fsync%2F558dcbe3dc3591eefa1d74852fec98005b0a8fef.png?generation=1589383941707523\&alt=media)

### LogitBoost

![](https://2270971654-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M7DcNFhVrwIk3Tks_pB%2Fsync%2Fdf9d61227d3e7ced7673f389b361a01715e228f6.jpg?generation=1589383940725336\&alt=media)

Since the loss is logistic, what gets optimized is the negative log-likelihood: $$l(y\_n,F\_m(x\_n)) = \log(1+e^{-2 y\_n F\_m(x\_n)})$$. The iterations also use Newton steps.
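
A minimal sketch of the LogitBoost Newton step under this loss (following Friedman, Hastie and Tibshirani: Newton weights $$p(1-p)$$, working response $$(y^*-p)/(p(1-p))$$ with $$y^* \in \{0,1\}$$, base learner fitted by weighted least squares). The 1-D linear base learner, toy data, and clamping constants are my own choices:

```python
import math

# LogitBoost (two-class): F is half the log-odds, p(x) = 1 / (1 + exp(-2F(x))).
x = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
y = [-1, -1, -1, -1, 1, 1, 1, 1]           # labels in {-1, +1}
ystar = [(yi + 1) / 2 for yi in y]         # {0, 1} labels for the working response
N = len(x)

def log_loss(F):
    return sum(math.log(1 + math.exp(-2 * yi * Fi)) for yi, Fi in zip(y, F))

F = [0.0] * N
losses = [log_loss(F)]
for m in range(10):
    p = [1 / (1 + math.exp(-2 * Fi)) for Fi in F]
    w = [max(pi * (1 - pi), 1e-6) for pi in p]            # Newton weights, clamped
    z = [max(-4.0, min(4.0, (yi - pi) / wi))              # working response, clamped
         for yi, pi, wi in zip(ystar, p, w)]
    # weighted least-squares line fit: f(x) = a + b*x minimizing sum w*(z - a - b*x)^2
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    b = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z)) \
        / sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    a = zbar - b * xbar
    F = [Fi + 0.5 * (a + b * xi) for Fi, xi in zip(F, x)]  # half-step Newton update
    losses.append(log_loss(F))
```

The clamps on the weights and working response are the standard guard against near-certain points (p close to 0 or 1), where the Newton response would otherwise blow up.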

## Error Analysis

Bagging: bootstrap samples share (approximately) the same distribution, and $$E\[\frac{\sum\_{i=1}^n x\_i}{n}]=E\[x\_i]$$, so a bootstrap model's bias is about the same as a single model's. The bootstrap models are correlated with one another but not identical, so **aggregating** them reduces the variance to some degree.

Boosting: minimizes the loss function sequentially, so the bias decreases step by step.
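
The bagging claim above, roughly unchanged bias but reduced variance, can be illustrated with a toy simulation where each "model" is simply the mean of a bootstrap resample (an illustrative sketch; the dataset and constants are arbitrary):

```python
import random
import statistics

random.seed(0)

def bootstrap_model(data):
    """Stand-in for one model: the mean of one bootstrap resample of the data."""
    resample = [random.choice(data) for _ in data]
    return statistics.mean(resample)

true_mean = 5.0
single, bagged = [], []
for _ in range(500):
    data = [random.gauss(true_mean, 2.0) for _ in range(30)]
    single.append(bootstrap_model(data))                  # one bootstrap model
    bagged.append(statistics.mean(
        [bootstrap_model(data) for _ in range(25)]))      # aggregate of 25 models

bias_single = abs(statistics.mean(single) - true_mean)
bias_bagged = abs(statistics.mean(bagged) - true_mean)
var_single = statistics.variance(single)
var_bagged = statistics.variance(bagged)
```

Averaging removes the resampling noise of individual bootstrap models, but not the variance they share from using the same underlying dataset, which is why the reduction is partial rather than the full 1/B factor of independent models.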

### Further Reading

[AdaBoost procedure and mathematical proof](http://blog.csdn.net/lampqiu/article/details/39338277)

