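Imbalanced Classification (1) - Overview: the imblearn/imbalanced-learn library [provides many resampling techniques commonly used on datasets with strong between-class imbalance] [under-sampling, over-sampling (SMOTE)]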


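I. Introduction to the imblearn/imbalanced-learn library

imblearn/imbalanced-learn is a Python package that provides a number of resampling techniques commonly used on datasets showing strong between-class imbalance. It is compatible with scikit-learn and is part of the scikit-learn-contrib projects.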

1. Installing the imblearn/imbalanced-learn library
pip install imblearn
pip install imbalanced-learn
pip install -U imbalanced-learn
conda install -c conda-forge imbalanced-learn
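2. Using the imblearn/imbalanced-learn library

Most classification algorithms only perform optimally when the number of samples in each class is roughly the same.

Highly skewed datasets, in which the minority is heavily outnumbered by one or more classes, have proven to be a challenge while at the same time becoming more and more common.

One way of addressing this issue is to resample the dataset so as to offset the imbalance, in the hope of arriving at a more robust and fair decision boundary than would otherwise be obtained.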

II. Resampling

Re-sampling techniques are divided into four categories:

  • Under-sampling the majority class(es).
  • Over-sampling the minority class.
  • Combining over- and under-sampling.
  • Create ensemble balanced sets.
1. Under-sampling
  • Random majority under-sampling with replacement
  • Extraction of majority-minority Tomek links [1]
  • Under-sampling with Cluster Centroids
  • NearMiss-(1 & 2 & 3) [2]
  • Condensed Nearest Neighbour [3]
  • One-Sided Selection [4]
  • Neighbourhood Cleaning Rule [5]
  • Edited Nearest Neighbours [6]
  • Instance Hardness Threshold [7]
  • Repeated Edited Nearest Neighbours [14]
  • AllKNN [14]
2. Over-sampling

  • Random minority over-sampling with replacement
  • SMOTE - Synthetic Minority Over-sampling Technique [8]
  • SMOTENC - SMOTE for Nominal and Continuous [8]
  • bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2 [9]
  • SVM SMOTE - Support Vectors SMOTE [10]
  • ADASYN - Adaptive synthetic sampling approach for imbalanced learning [15]
  • KMeans-SMOTE [17]

3. Over-sampling followed by under-sampling

  • SMOTE + Tomek links [12]
  • SMOTE + ENN [11]

4. Ensemble classifiers using samplers internally

  • Easy Ensemble classifier [13]
  • Balanced Random Forest [16]
  • Balanced Bagging
  • RUSBoost [18]

5. Mini-batch resampling for Keras and TensorFlow




Original article: https://54852.com/langs/757473.html
