
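imblearn/imbalanced-learn is a Python package that provides a number of re-sampling techniques commonly used on datasets that show strong between-class imbalance. It is compatible with scikit-learn and is part of the scikit-learn-contrib projects.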
1. Installing imblearn/imbalanced-learn
pip install imblearn
pip install imbalanced-learn
pip install -U imbalanced-learn
conda install -c conda-forge imbalanced-learn
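2. Using imblearn/imbalanced-learn
Most classification algorithms perform at their best only when each class has roughly the same number of samples.
Highly skewed datasets, in which the minority class is heavily outnumbered by one or more other classes, have proven to be a challenge while at the same time becoming more and more common.
One way to address this problem is to re-sample the dataset so as to offset the imbalance, in the hope of arriving at a more robust and fair decision boundary than would otherwise be obtained.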
II. Re-sampling
Re-sampling techniques are divided into four categories:
- Under-sampling the majority class(es).
- Over-sampling the minority class.
- Combining over- and under-sampling.
- Create ensemble balanced sets.
Under-sampling methods:
- Random majority under-sampling with replacement
- Extraction of majority-minority Tomek links [1]
- Under-sampling with Cluster Centroids
- NearMiss-(1 & 2 & 3) [2]
- Condensed Nearest Neighbour [3]
- One-Sided Selection [4]
- Neighbourhood Cleaning Rule [5]
- Edited Nearest Neighbours [6]
- Instance Hardness Threshold [7]
- Repeated Edited Nearest Neighbours [14]
- AllKNN [14]
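To make one of the under-sampling methods concrete, here is a small standard-library sketch of Tomek link removal: two samples form a Tomek link when they are each other's nearest neighbour and carry different labels, and the majority member of each link is dropped. The function names and the toy data are hypothetical; this illustrates the idea and is not the library's implementation.

```python
import math


def find_tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbours
    with different class labels."""
    def nearest(i):
        return min((j for j in range(len(X)) if j != i),
                   key=lambda j: math.dist(X[i], X[j]))

    nn = [nearest(i) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def remove_majority_in_links(X, y, majority_label):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in find_tomek_links(X, y)
            for k in pair if y[k] == majority_label}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


# Toy data: class 0 is the majority; the point (2.0, 0.0) sits right next to
# a minority point and therefore forms a Tomek link with it.
X = [(-0.5, 0.0), (0.0, 0.0), (0.3, 0.0), (2.0, 0.0),
     (2.2, 0.0), (5.0, 0.0), (5.3, 0.0)]
y = [0, 0, 0, 0, 1, 1, 1]

links = find_tomek_links(X, y)
X_clean, y_clean = remove_majority_in_links(X, y, majority_label=0)
print(links, y_clean)
```

The brute-force neighbour search is quadratic in the number of samples, which is fine for a sketch but not for real datasets.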
Over-sampling methods:
- Random minority over-sampling with replacement
- SMOTE - Synthetic Minority Over-sampling Technique [8]
- SMOTENC - SMOTE for Nominal and Continuous [8]
- bSMOTE(1 & 2) - Borderline SMOTE of types 1 and 2 [9]
- SVM SMOTE - Support Vectors SMOTE [10]
- ADASYN - Adaptive synthetic sampling approach for imbalanced learning [15]
- KMeans-SMOTE [17]
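The core idea shared by the SMOTE family can be sketched in a few lines of standard-library Python: pick a minority sample, pick one of its k nearest minority neighbours, and create a new point somewhere on the segment between them. The function name smote_sketch, its parameters, and the toy points are hypothetical, and the code is a simplified illustration rather than the library's algorithm.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between minority
    samples and their k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the chosen sample (itself excluded).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b)
                               for b, n in zip(base, neighbour)))
    return synthetic


minority = [(1.0, 1.0), (1.2, 1.1), (0.9, 1.3), (1.4, 0.8), (1.1, 0.9)]
new_points = smote_sketch(minority, n_new=10)
print(len(new_points), new_points[0])
```

Because every synthetic point lies between two existing minority samples, the new data stays inside the region the minority class already occupies.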
Combinations of over- and under-sampling:
- SMOTE + Tomek links [12]
- SMOTE + ENN [11]
Ensemble methods:
- Easy Ensemble classifier [13]
- Balanced Random Forest [16]
- Balanced Bagging
- RUSBoost [18]
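The ensemble methods build on the idea of training several learners, each on its own balanced subset of the data. A rough standard-library sketch of the subset-building step is shown below; the function name balanced_subsets and the toy labels are hypothetical, and the library's classifiers do considerably more than this.

```python
import random
from collections import Counter


def balanced_subsets(y, n_subsets, seed=0):
    """Return n_subsets lists of sample indices, each holding the same number
    of samples per class (the size of the smallest class)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)
    n_min = min(len(indices) for indices in by_class.values())

    subsets = []
    for _ in range(n_subsets):
        chosen = []
        for indices in by_class.values():
            # Random under-sampling without replacement within each class.
            chosen.extend(rng.sample(indices, n_min))
        subsets.append(sorted(chosen))
    return subsets


# 20 majority samples (label 0) and 4 minority samples (label 1).
y = [0] * 20 + [1] * 4
subsets = balanced_subsets(y, n_subsets=3)
for s in subsets:
    print(s, Counter(y[i] for i in s))
```

Each subset would then be used to fit one base learner, and the predictions of all learners are combined, for example by voting.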