CNN Series: Object Detection
R-CNN
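R-CNN is the foundational work on object detection based on convolutional neural networks. Its core idea is to select multiple regions from each image, feed each region as a separate sample into a convolutional neural network to extract features, and finally use a classifier to classify the region and a regressor to obtain an accurate bounding box.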
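Selective search for bounding boxes
About 2000 regions are selected from each image. Each region is passed through the convolutional network separately, a per-class SVM classifier makes the decision, and the region's position is refined by regression.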
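To make the "regression on position" step concrete, here is a minimal sketch of how regression targets for a box can be computed and applied. It assumes the center/size parametrization commonly associated with R-CNN (offsets scaled by the proposal size, log-ratios for width and height). The function names and the example boxes are hypothetical and only for illustration.

```python
import math


def bbox_regression_targets(proposal, ground_truth):
    """Compute (tx, ty, tw, th) that map a proposal onto a ground-truth box.

    Both boxes are given as (center_x, center_y, width, height).
    """
    px, py, pw, ph = proposal
    gx, gy, gw, gh = ground_truth
    tx = (gx - px) / pw          # horizontal shift, relative to proposal width
    ty = (gy - py) / ph          # vertical shift, relative to proposal height
    tw = math.log(gw / pw)       # log scale change in width
    th = math.log(gh / ph)       # log scale change in height
    return tx, ty, tw, th


def apply_regression(proposal, targets):
    """Invert the mapping: move and resize the proposal by the given targets."""
    px, py, pw, ph = proposal
    tx, ty, tw, th = targets
    return (px + pw * tx, py + ph * ty, pw * math.exp(tw), ph * math.exp(th))


# Hypothetical boxes, for illustration only.
proposal = (50.0, 50.0, 20.0, 40.0)
ground_truth = (55.0, 48.0, 30.0, 40.0)

targets = bbox_regression_targets(proposal, ground_truth)
refined = apply_regression(proposal, targets)
```

In a real detector the targets would be predicted by a learned regressor from the region's CNN features; here they are computed directly from the ground truth, so `refined` simply recovers `ground_truth`.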
Magenta is a research project exploring the role of machine learning in the process of creating art and music. Primarily this involves developing new deep learning and reinforcement learning algorithms for generating songs, images, drawings, and other materials. But it’s also an exploration in building smart tools and interfaces that allow artists and musicians to extend (not replace!) their processes using these models. Magenta was started by some researchers and engineers from the Google Brain team, but many others have contributed significantly to the project. We use TensorFlow and release our models and tools in open source on this GitHub. If you’d like to learn more about Magenta, check out our blog, where we post technical details. You can also join our discussion group.
This is the home for our Python TensorFlow library. To use our models in the browser with TensorFlow.js, head to the Magenta.js repository.
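Tree-based models do not depend on feature scaling; non-tree-based models do
When two features differ greatly in order of magnitude, the larger-scale feature dominates distance computations, so a difference that should be minor becomes very large. This strongly affects KNN and linear models.
Gradient descent behaves badly without proper scaling. For this reason, neural networks resemble linear models in their need for feature preprocessing.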
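The effect of feature magnitude on distance-based models can be seen in a small sketch. The data below is made up (column 0 stands for an age in years, column 1 for an income), and `nearest_neighbor` is a hypothetical helper rather than a library function. Before scaling, the income column decides which row is "closest"; after standardizing each column, age matters as well.

```python
import numpy as np

# Hypothetical data: column 0 is age in years, column 1 is yearly income.
X = np.array([
    [25.0, 50_000.0],
    [60.0, 50_200.0],
    [26.0, 51_000.0],
    [45.0, 60_000.0],
])


def nearest_neighbor(data, query_index):
    """Return the index of the row closest (Euclidean) to row `query_index`."""
    distances = np.linalg.norm(data - data[query_index], axis=1)
    distances[query_index] = np.inf  # exclude the query row itself
    return int(np.argmin(distances))


# On raw features, the income difference dominates the distance.
raw_neighbor = nearest_neighbor(X, 0)

# Standardize each column to zero mean and unit variance, then search again.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
scaled_neighbor = nearest_neighbor(X_scaled, 0)
```

With this data, the raw search picks the 60-year-old (row 1) as the neighbor of the 25-year-old, while the scaled search picks the 26-year-old (row 2).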
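Standardization does not change the shape of the distribution
After a MinMaxScaling or StandardScaling transform, the features have roughly equal influence on non-tree models.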
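As a rough illustration of the two transforms, the sketch below uses scikit-learn's `MinMaxScaler` and `StandardScaler` (assumed here to be what the notes mean by MinMaxScaling and StandardScaling) on a made-up matrix whose two columns differ by three orders of magnitude.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Made-up data: the second column is about 1000 times larger than the first.
X = np.array([
    [1.0, 1000.0],
    [2.0, 3000.0],
    [3.0, 5000.0],
])

# Rescale each column linearly to the range [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

# Shift and rescale each column to zero mean and unit variance.
X_standard = StandardScaler().fit_transform(X)
```

Both transforms are linear per column, so the relative spacing of the values within a column is preserved; only the location and scale change. In this example the two columns end up numerically identical after either transform, which is why they then carry comparable weight in a distance-based or linear model.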
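Outliers
Outliers can appear in the feature values X as well as in the target values y, and they affect the model.
We can clip feature values between a chosen lower and upper bound, for example between the 1st and 99th percentiles.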
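The percentile clipping described above can be sketched with NumPy. The data is synthetic (standard normal samples with three injected outliers), and the 1st/99th percentile bounds are taken from the text; other bounds may suit other data.

```python
import numpy as np

# Synthetic feature: standard normal samples plus a few injected outliers.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=1000)
x[:3] = [50.0, -40.0, 75.0]

# Compute the 1st and 99th percentiles and clip everything outside them.
lower, upper = np.percentile(x, [1, 99])
x_clipped = np.clip(x, lower, upper)
```

Values inside the bounds are left unchanged, while the injected outliers are pulled back to `lower` or `upper`. When a model is trained this way, the bounds would normally be computed on the training data and reused for validation and test data.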
Common machine learning methods, including hyperparameter tuning, single models, ensemble learning, and more.
from sklearn.model_selection import train_test_split
from sklearn.model_selection import StratifiedShuffleSplit
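The excerpt stops at these two imports, so the following is only a guess at typical usage, not the original post's code. It is a minimal sketch on a made-up, imbalanced dataset (15 samples of class 0, 5 of class 1) showing a single hold-out split with `train_test_split` and repeated stratified splits with `StratifiedShuffleSplit`.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit, train_test_split

# Made-up dataset: 20 samples, 2 features, classes in a 3:1 ratio.
X = np.arange(40).reshape(20, 2)
y = np.array([0] * 15 + [1] * 5)

# Single hold-out split; stratify=y keeps the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Several random splits, each preserving the class ratio.
splitter = StratifiedShuffleSplit(n_splits=3, test_size=0.2, random_state=0)
test_class_counts = []
for train_idx, test_idx in splitter.split(X, y):
    # Count how many samples of each class landed in this test fold.
    test_class_counts.append(np.bincount(y[test_idx]).tolist())
```

With 20 samples and `test_size=0.2`, each test set holds 4 samples, and stratification places 3 of class 0 and 1 of class 1 in it.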