KMFusionNet: An Alternating Tree-Estimator Boosting Framework for Imbalanced Binary Classification

Shazib Sheikh; Ethan Debnath; Zulkarnain Saurav; Kamruzzaman Mithu; Swakkhar Shatabda

doi:10.25163/data.5110764

Data Modeling

Mathematical and Computational Data Modeling


Volume 5 Number 1 2024


RESEARCH ARTICLE (Open Access)



Shazib Sheikh¹*, Ethan Debnath², Zulkarnain Saurav³, Kamruzzaman Mithu³, Swakkhar Shatabda⁴


Data Modeling 5 (1) 1-8 https://doi.org/10.25163/data.5110764

Submitted: 29 September 2024 Revised: 05 December 2024 Published: 16 December 2024

Abstract

Class imbalance remains one of the more stubborn and frequently underestimated problems in applied machine learning, particularly in domains where the minority class is precisely the one that matters most, such as medical diagnosis, fraud detection, and fault prediction. Conventional classification algorithms tend to optimize for aggregate accuracy, so minority class instances can be misclassified at little cost to the overall metric. This study introduces KMFusionNet, a hybrid adaptive boosting framework that alternates between two complementary tree-based weak learners, the C4.5 Decision Tree and the Extra Tree classifier, within a modified AdaBoost architecture, augmented by an early stopping criterion governed by a stagnation window. The model was evaluated against six established methods (AdaBoost, RUSBoost, SMOTEBoost, EUSBoost, DataBoost, and Easy Ensemble) across 12 imbalanced datasets drawn from the KEEL repository, with imbalance ratios ranging from 1.87 to 41.03. Performance was measured using the area under the receiver operating characteristic curve (auROC), with each experiment repeated across 10 independent runs under 5-fold cross-validation. KMFusionNet achieved the highest or joint-highest auROC on 11 of 12 benchmark datasets, with particularly pronounced gains at higher imbalance ratios. Because both base learners are single trees, training is expected to cost less than approaches that use Random Forest or SVM as base learners, although training time was not measured in this study. These findings indicate that combining lightweight, structurally diverse tree classifiers within a boosting mechanism can meaningfully improve minority class discrimination without the overhead of more complex ensembles.

Keywords: class imbalance; adaptive boosting; ensemble learning; decision tree; extra tree classifier

1. Introduction

Classification is, at its core, a deceptively simple task: assign an unseen instance to one of several predefined categories based on patterns learned from labeled training data. In carefully controlled settings — balanced benchmarks with clean features — most modern algorithms perform reasonably well. But real-world data rarely cooperates. Datasets in domains such as credit fraud, rare disease detection, network intrusion, and mechanical fault prediction are almost always skewed, sometimes dramatically so: the class of primary interest may represent fewer than one in forty instances (Farid et al., 2016; Farid et al., 2014; Farid et al., 2013). Under these conditions, standard classifiers tend to do the safe thing — predict the majority class almost exclusively — and still report impressive overall accuracy. The minority class, which often carries the most practical significance, is quietly ignored.

Two broad categories of solutions have emerged in response to this problem: internal methods, which modify the learning algorithm itself to be less sensitive to distributional skew, and external methods, which adjust the dataset prior to training. Among external approaches, sampling-based techniques are perhaps the most widely adopted. Under-sampling methods reduce majority class instances either randomly or through heuristic selection — approaches such as the neighborhood cleaning rule (Laurikkala, 2001), near miss (Mani & Zhang, 2003), and one-sided selection (Kubat & Matwin, 1997) are well-established examples. Over-sampling works in the opposite direction, generating additional minority instances; SMOTE (Chawla et al., 2002) and AdaSyn (He & Garcia, 2009) remain among the most cited techniques in this space. Neither approach is without cost: under-sampling risks discarding genuinely informative majority-class examples, while over-sampling can lead to overfitting when synthetic minority instances too closely cluster around existing ones (Sun et al., 2015).

The challenge is compounded further when features interact in complex ways. Traditional classifiers — k-nearest neighbors, decision trees, support vector machines, random forests (Farid et al., 2014; Cortes & Vapnik, 1995; Liaw & Wiener, 2002) — optimize for global accuracy, meaning they are structurally blind to the asymmetric cost of misclassifying minority instances. Cost-sensitive learning attempts to address this by assigning differential penalties to errors across classes, but calibrating appropriate misclassification weights is non-trivial and often requires domain-specific knowledge that may not be available in practice.

Ensemble methods — and boosting in particular — have attracted considerable attention as an alternative path. The core idea in boosting is to adaptively reweight training instances, forcing successive weak learners to concentrate on the examples that earlier learners found most difficult (Freund et al., 1996). This adaptive focus turns out to be surprisingly well-suited to imbalanced problems, even though boosting was not originally designed with class imbalance in mind. Building on this observation, a number of hybrid approaches have been developed: RUSBoost integrates random under-sampling with AdaBoost (Seiffert et al., 2010); SMOTEBoost couples AdaBoost with SMOTE-based over-sampling (Chawla et al., 2003); EUSBoost incorporates evolutionary under-sampling (Galar et al., 2013); DataBoost-IM generates synthetic difficult instances during training (Guo & Viktor, 2004); and Easy Ensemble creates multiple balanced subsets via repeated random under-sampling (Liu et al., 2009). Each of these approaches extends AdaBoost in a different direction, and each carries its own limitations.

One aspect of boosting design that has received comparatively less attention is the choice of base learner itself — and more specifically, whether using a single fixed weak learner throughout training is actually optimal. De Souza and Matwin (2011) explored the use of multiple alternating estimators within AdaBoost, finding that the diversity introduced by varying the base classifier could benefit ensemble performance. However, their implementation included computationally intensive models such as Random Forests, SVMs, and neural networks, making the overall framework substantially more expensive to train.

This paper proposes KMFusionNet, a hybrid boosting framework that addresses this gap directly. Rather than relying on a single weak estimator, KMFusionNet alternates between two structurally complementary tree-based classifiers — the C4.5 Decision Tree (Quinlan, 1996) and the Extra Tree (Geurts et al., 2006) — within an AdaBoost-style architecture. The alternating strategy is intended to exploit the well-documented instability of tree-based learners: because small perturbations to training data can produce substantially different tree structures, combining two tree algorithms that partition feature space differently should increase hypothesis diversity without requiring computationally expensive models. The framework is further stabilized through an early stopping criterion based on validation auROC (Bühlmann & Yu, 2003; Jiang, 2004), which prevents unnecessary estimator accumulation and guards against overfitting.

The proposed method was benchmarked against AdaBoost, RUSBoost, SMOTEBoost, EUSBoost, DataBoost, and Easy Ensemble across 12 imbalanced datasets from the KEEL repository (Alcalá-Fdez et al., 2011), spanning imbalance ratios from 1.87 to 41.03. Evaluation relied on auROC, a threshold-independent metric that is particularly informative under class imbalance (He & Garcia, 2009). Experimental results indicate that KMFusionNet achieves superior or competitive auROC performance on most benchmark datasets — with the most pronounced advantages at higher imbalance ratios — while remaining computationally more tractable than approaches that incorporate complex base learners.

The remainder of this article is organized as follows. Section 2 describes the design rationale, base classifiers, and algorithm of KMFusionNet, together with the datasets, experimental setup, and evaluation protocol. Section 3 presents the comparative results, discusses them in the broader context of the literature, and notes the study's limitations. Section 4 concludes with a summary and directions for future work.

2. Methods

2.1 Algorithm Design Rationale

The design of KMFusionNet proceeds from a relatively simple observation: nearly all existing boosting-based approaches to imbalanced classification — RUSBoost (Seiffert et al., 2010), SMOTEBoost (Chawla et al., 2003), EUSBoost (Galar et al., 2013), DataBoost (Guo & Viktor, 2004) — modify the data distribution around a fixed base learner rather than varying the learner itself. This is a reasonable approach, but it means that the diversity of the ensemble is driven entirely by reweighting, and the structural diversity that can arise from using different classifiers is left unexploited. De Souza and Matwin (2011) demonstrated that alternating among multiple estimators in AdaBoost can improve performance, but at substantial computational cost. KMFusionNet addresses this by restricting the alternating strategy to two lightweight, structurally diverse tree-based classifiers, yielding ensemble diversity without requiring computationally intensive models.

2.2 Base Classifiers

Two tree-based classifiers serve as the weak learners in KMFusionNet.

Decision Tree (C4.5). The C4.5 algorithm (Quinlan, 1996) constructs classification trees by selecting splits that maximize an entropy-based criterion, namely information gain normalized as the gain ratio, computed over the training set. Given training data S = {s₁, s₂, s₃, ...} where each instance sᵢ is represented by a feature vector Xᵢ = {x₁,ᵢ, x₂,ᵢ, x₃,ᵢ, ...}, each node selects the feature that most effectively partitions instances into distinct class subsets. C4.5 trees are deterministic given the same training data, but their structure is highly sensitive to changes in that data, including the perturbations introduced by AdaBoost's instance reweighting. This instability makes them well-suited to boosting.
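The entropy-based split criterion can be illustrated with a small sketch. This is not the C4.5 implementation used in the study: it computes plain, unweighted information gain for a single numeric threshold, and the function names and toy data are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting a numeric feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty separates nothing
    n = len(labels)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - children

# Hypothetical toy feature: the minority class (1) sits at the high end.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 0, 0, 1, 1]

clean_gain = information_gain(values, labels, 4.5)   # separates the classes exactly
poor_gain = information_gain(values, labels, 1.5)    # isolates one majority instance
print(round(clean_gain, 4), round(poor_gain, 4))
```

A threshold that separates the two classes exactly recovers the full entropy of the parent node, whereas a poorly placed threshold recovers only a small part of it.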

Extra Tree Classifier. The Extra Tree classifier (Geurts et al., 2006) introduces additional randomization at the split-selection stage. Rather than searching exhaustively for the optimal split on each feature, the Extra Tree algorithm randomly generates a set of candidate split thresholds for a randomly selected subset of features, and selects the best among them. When the number of randomly selected features is set to one, the Extra Tree behaves similarly to a purely random tree. This additional stochasticity means that, even when trained on the same weighted dataset, Extra Tree and Decision Tree classifiers tend to partition the feature space differently, producing structurally diverse hypotheses — which is precisely what is needed to achieve boosting-level ensemble diversity (Figure 2).
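The randomized split selection can be sketched as follows. This is a simplified illustration rather than the procedure of Geurts et al. (2006) in full: it draws one uniform threshold for each of k randomly chosen features and scores candidates with Gini impurity, which is an assumption made here for brevity. The names `gini_after_split` and `random_split` and the toy rows are hypothetical.

```python
import random

def gini_after_split(column, labels, threshold):
    """Size-weighted Gini impurity of the two children (lower is better)."""
    total = 0.0
    left = [y for x, y in zip(column, labels) if x <= threshold]
    right = [y for x, y in zip(column, labels) if x > threshold]
    for side in (left, right):
        if side:
            p = sum(side) / len(side)  # fraction of class 1 on this side
            total += len(side) / len(labels) * 2 * p * (1 - p)
    return total

def random_split(rows, labels, k, rng):
    """Draw one random threshold for each of k random features; keep the best."""
    best = None
    for j in rng.sample(range(len(rows[0])), k):
        column = [row[j] for row in rows]
        threshold = rng.uniform(min(column), max(column))
        score = gini_after_split(column, labels, threshold)
        if best is None or score < best[0]:
            best = (score, j, threshold)
    return best  # (impurity, feature index, threshold)

# Hypothetical data: feature 0 is informative, feature 1 is noise.
rows = [[1.0, 7.0], [2.0, 3.0], [3.0, 9.0], [4.0, 1.0], [5.0, 8.0], [6.0, 2.0]]
labels = [0, 0, 0, 0, 1, 1]
impurity, feature, threshold = random_split(rows, labels, k=2, rng=random.Random(0))
print(impurity, feature, threshold)
```

With k = 1 the choice of feature and threshold is entirely random, which corresponds to the purely random tree mentioned above.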

2.3 Alternating Estimator Strategy

In standard AdaBoost (Freund et al., 1996), a single weak learner is applied at each iteration t = 1, 2, ..., T. In KMFusionNet, the learner alternates between the C4.5 Decision Tree and the Extra Tree classifier at successive iterations. Specifically, if t is odd the Decision Tree is used; if t is even the Extra Tree is used. Both classifiers are applied to the same reweighted training distribution Dₜ, as produced by the standard AdaBoost weight update mechanism.

After each iteration, the weak learner's error rate εₜ is computed on the weighted training set. A weak learner is retained only if εₜ < 0.5 — the standard AdaBoost weak classifier condition (Freund et al., 1996). If a learner fails this condition, it is discarded and the iteration counter is incremented without updating the ensemble weights. This automatic discard mechanism means that poorly performing learners are eliminated regardless of which estimator type they represent, preserving the integrity of the boosting update.
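A single boosting step can be worked through numerically. The sketch below assumes labels and predictions encoded as −1 and +1 and uses the standard AdaBoost formulas listed in Section 2.5; the function name `boosting_step` and the small guard against a zero error rate are assumptions, not details taken from the study.

```python
import math

def boosting_step(weights, y_true, y_pred):
    """One AdaBoost update for labels in {-1, +1}.

    Returns (epsilon, alpha, new_weights). If epsilon >= 0.5 the learner is
    discarded: alpha is None and the weights are returned unchanged.
    """
    eps = sum(w for w, y, h in zip(weights, y_true, y_pred) if y != h)
    if eps >= 0.5:
        return eps, None, weights
    safe_eps = max(eps, 1e-10)  # assumed guard: avoids division by zero for a perfect learner
    alpha = 0.5 * math.log((1 - safe_eps) / safe_eps)
    raw = [w * math.exp(-alpha * y * h) for w, y, h in zip(weights, y_true, y_pred)]
    z = sum(raw)
    return eps, alpha, [r / z for r in raw]

y_true = [+1, +1, -1, -1, -1]
y_pred = [+1, -1, -1, -1, -1]   # one minority (+1) instance is missed
eps, alpha, new_w = boosting_step([0.2] * 5, y_true, y_pred)
print(eps, alpha, new_w)

# A learner that is wrong on every instance fails the weak-learner condition.
_, rejected_alpha, _ = boosting_step([0.2] * 5, y_true, [-y for y in y_true])
```

In this example the error rate is 0.2, the learner weight is ln 2, and after normalization the single misclassified minority instance carries half of the total weight, so the next learner is pushed to classify it correctly.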

2.4 Early Stopping Criterion

To prevent the ensemble from growing unnecessarily large, KMFusionNet incorporates an early stopping criterion based on a stagnation window W (Bühlmann & Yu, 2003; Yao et al., 2007). In this study, W was set to 10 based on standard practice from the early stopping literature (Jiang, 2004). At each iteration, the current ensemble is evaluated on a held-out validation set (comprising 5% of the total dataset, separated before cross-validation; see Section 2.6) using auROC as the performance criterion. The best ensemble configuration observed so far is retained. If no improvement in validation auROC is observed over W = 10 consecutive iterations, training terminates and the best-recorded ensemble is returned as the final model. This procedure follows the early stopping framework as described by Bühlmann and Yu (2003) and analyzed by Jiang (2004).
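The stagnation-window rule can be sketched independently of the boosting loop. The helper below takes a sequence of validation auROC values, one per iteration, and reports which iteration would be kept and where training would stop. The function name and the example scores are hypothetical, and a window of 3 is used only to keep the example short (the study uses W = 10).

```python
def best_iteration(val_scores, window=10):
    """Return (best_index, stop_index) for a sequence of validation scores.

    Training stops once `window` consecutive iterations fail to beat the
    best score seen so far; the ensemble at best_index is the one returned.
    """
    best_score, best_index, stagnation = float("-inf"), None, 0
    for t, score in enumerate(val_scores):
        if score > best_score:
            best_score, best_index, stagnation = score, t, 0
        else:
            stagnation += 1
            if stagnation >= window:
                return best_index, t
    return best_index, len(val_scores) - 1  # ran out of iterations first

scores = [0.70, 0.74, 0.78, 0.77, 0.78, 0.76, 0.79, 0.79, 0.78, 0.77]
short = best_iteration(scores, window=3)
long = best_iteration(scores, window=10)
print(short, long)
```

With the short window, training stops before the later improvement to 0.79 is reached, which illustrates why the sensitivity of results to W is worth examining.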

2.5 Full Algorithm (KMFusionNet)

The complete KMFusionNet procedure is as follows:

Input: Training set S with class labels, validation set V (5% holdout), stagnation window W = 10, maximum iterations T_max.

1. Initialize uniform sample weights D₁(i) = 1/n for all instances i = 1, ..., n.

2. Set best_auROC = 0, stagnation_count = 0, best_ensemble = ∅.

3. For t = 1, 2, ..., T_max:

a. Select learner hₜ: if t is odd, use C4.5 Decision Tree; if even, use Extra Tree.

b. Train hₜ on the weighted training set (S, Dₜ).

c. Compute weighted error εₜ = Σ Dₜ(i) · 𝟙[hₜ(xᵢ) ≠ yᵢ].

d. If εₜ ≥ 0.5, discard hₜ and continue.

e. Compute learner weight αₜ = 0.5 · ln((1 − εₜ)/εₜ).

f. Update instance weights: Dₜ₊₁(i) ∝ Dₜ(i) · exp(−αₜ · yᵢ · hₜ(xᵢ)), with labels and predictions encoded as yᵢ, hₜ(xᵢ) ∈ {−1, +1}, then normalize so that the weights sum to 1.

g. Add hₜ with weight αₜ to ensemble.

h. Evaluate ensemble on validation set V; record auROC_t.

i. If auROC_t > best_auROC: update best_auROC = auROC_t, best_ensemble = current ensemble, stagnation_count = 0.

j. Else: stagnation_count += 1. If stagnation_count ≥ W, terminate.

4. Return best_ensemble.

Algorithm 1: KMFusionNet

Input: Imbalanced dataset D, validation set (X_val, Y_val), stagnation window size W, maximum iterations T_max
Output: An ensemble model H

Set t ← 0, score_best ← 0, nonImproving ← 0
Initialize sample weights: wᵢ ← 1/|D| for each xᵢ ∈ D
while t < T_max do
⠀⠀Increment t ← t + 1
⠀⠀Select estimator tree type T (Decision Tree if t is odd, Extra Tree if t is even)
⠀⠀Train weak learner h^(t) of type T on weighted dataset D
⠀⠀Compute error rate: error(h^(t)) = Σ wᵢ · 𝟙[h^(t)(xᵢ) ≠ yᵢ]
⠀⠀if error(h^(t)) ≥ 0.5 then
⠀⠀⠀⠀continue (discard h^(t) and move to the next iteration)
⠀⠀end if
⠀⠀Compute learner weight: αₜ = ½ · logₑ((1 − error(h^(t))) / error(h^(t)))
⠀⠀for each xᵢ ∈ D do
⠀⠀⠀⠀Update weights: wᵢ ← wᵢ · exp(−αₜ · yᵢ · h^(t)(xᵢ))
⠀⠀end for
⠀⠀Normalize weights so that Σ wᵢ = 1
⠀⠀Update meta-classifier over the retained learners: Hₜ = sign(Σᵢ₌₁ᵗ αᵢ · h^(i))
⠀⠀Evaluate: score = auROCscore(Hₜ, X_val, Y_val)
⠀⠀if score > score_best then
⠀⠀⠀⠀score_best ← score
⠀⠀⠀⠀H_best ← Hₜ
⠀⠀⠀⠀nonImproving ← 0
⠀⠀else
⠀⠀⠀⠀nonImproving ← nonImproving + 1
⠀⠀⠀⠀if nonImproving ≥ W then
⠀⠀⠀⠀⠀⠀break
⠀⠀⠀⠀end if
⠀⠀end if
end while
H ← H_best
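The procedure of Section 2.5 can be sketched end to end in plain Python. To keep the example self-contained, one-level decision stumps stand in for the two base learners: an exhaustive-threshold stump replaces the C4.5 tree and a random-threshold stump replaces the Extra Tree. These stand-ins, the function names, and the toy grid data are all assumptions made for illustration, so this is not the authors' implementation.

```python
import math
import random

def predict(stump, row):
    """A stump is (feature index, threshold, sign); labels are -1 or +1."""
    j, thr, sign = stump
    return sign if row[j] > thr else -sign

def weighted_error(stump, X, y, w):
    return sum(wi for xi, yi, wi in zip(X, y, w) if predict(stump, xi) != yi)

def fit_greedy(X, y, w, rng):
    """Stand-in for the C4.5 tree: exhaustive search over midpoint thresholds."""
    cands = []
    for j in range(len(X[0])):
        vals = sorted({row[j] for row in X})
        cands += [(j, (a + b) / 2, s) for a, b in zip(vals, vals[1:]) for s in (1, -1)]
    return min(cands, key=lambda c: weighted_error(c, X, y, w))

def fit_random(X, y, w, rng):
    """Stand-in for the Extra Tree: one random threshold per feature."""
    cands = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        thr = rng.uniform(min(col), max(col))
        cands += [(j, thr, 1), (j, thr, -1)]
    return min(cands, key=lambda c: weighted_error(c, X, y, w))

def auroc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == -1]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def fit_alternating(X, y, X_val, y_val, window=10, t_max=50, seed=0):
    rng = random.Random(seed)
    w = [1.0 / len(X)] * len(X)
    ensemble, best, best_auc, stagnation = [], [], 0.0, 0
    for t in range(1, t_max + 1):
        fit = fit_greedy if t % 2 == 1 else fit_random   # odd: greedy, even: random
        h = fit(X, y, w, rng)
        eps = weighted_error(h, X, y, w)
        if eps >= 0.5:
            continue                                     # discard the weak learner
        eps = max(eps, 1e-10)                            # assumed guard against log(1/0)
        alpha = 0.5 * math.log((1 - eps) / eps)
        w = [wi * math.exp(-alpha * yi * predict(h, xi)) for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
        ensemble.append((alpha, h))
        val_scores = [sum(a * predict(s, r) for a, s in ensemble) for r in X_val]
        auc = auroc(val_scores, y_val)
        if auc > best_auc:
            best, best_auc, stagnation = list(ensemble), auc, 0
        else:
            stagnation += 1
            if stagnation >= window:
                break
    return best, best_auc

# Hypothetical toy data: the minority class (+1) occupies one corner of the unit square.
X = [[i / 19, j / 19] for i in range(20) for j in range(20)]
y = [1 if a + b > 1.5 else -1 for a, b in X]
X_val = [[(i + 0.3) / 10, (j + 0.3) / 10] for i in range(10) for j in range(10)]
y_val = [1 if a + b > 1.5 else -1 for a, b in X_val]

best, best_auc = fit_alternating(X, y, X_val, y_val)
print(len(best), round(best_auc, 3))
```

The validation score is computed from the real-valued weighted vote rather than from its sign, since auROC needs a ranking of instances. Swapping the stumps for full tree learners would leave the structure of the loop unchanged.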

2.6 Experimental Setup

Datasets. Twelve benchmark datasets were selected from the KEEL dataset repository (Alcalá-Fdez et al., 2011) to cover a wide range of imbalance conditions. The selected datasets span imbalance ratios (IR) from 1.87 (pima) to 41.03 (yeast6), with instance counts ranging from 214 to 2,308 and feature dimensionalities from 5 to 19 (Table I). This range was deliberately chosen to assess KMFusionNet's robustness across mild, moderate, and severe imbalance scenarios.
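For orientation, the imbalance ratio can be converted into the share of instances belonging to the minority class. The sketch assumes the conventional definition of IR as the majority-class count divided by the minority-class count, which the text above does not state explicitly; the function names and the example counts are hypothetical.

```python
def imbalance_ratio(n_majority, n_minority):
    """IR under the assumed definition: majority count over minority count."""
    return n_majority / n_minority

def minority_share(ir):
    """Fraction of all instances that belong to the minority class."""
    return 1.0 / (1.0 + ir)

example_ir = imbalance_ratio(187, 100)   # hypothetical class counts
for ir in (1.87, 41.03):                 # lowest and highest IR in the benchmark suite
    print(f"IR = {ir}: minority share = {minority_share(ir):.1%}")
```

Under this definition the mildest dataset has a minority share of roughly 35%, while the most skewed has a share of roughly 2.4%.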

Data partitioning. For each dataset, 5% of instances were separated as a validation set before any training was performed. This validation set was used exclusively for the early stopping criterion described in Section 2.4 and was not accessible during training or cross-validation. The remaining 95% of each dataset was subjected to 5-fold cross-validation to generate training and test splits for performance evaluation.
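The partitioning scheme can be sketched with index lists alone. The example assumes a simple shuffled, non-stratified split, since the text does not say whether stratification was used; the function name and the seed are hypothetical, and 214 is the smallest instance count reported for the benchmark suite.

```python
import random

def holdout_then_kfold(n, val_fraction=0.05, k=5, seed=0):
    """Yield (train_idx, test_idx, val_idx) for each of k folds.

    The validation indices are set aside first and never enter any fold.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_val = max(1, round(n * val_fraction))
    val_idx, rest = idx[:n_val], idx[n_val:]
    folds = [rest[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i in range(k):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, folds[i], val_idx

splits = list(holdout_then_kfold(214))
for train_idx, test_idx, val_idx in splits:
    print(len(train_idx), len(test_idx), len(val_idx))
```

With highly skewed data, a non-stratified 5% holdout can contain very few minority instances, so a stratified variant may be preferable in practice.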

Evaluation protocol. Classification performance was evaluated using the area under the receiver operating characteristic curve (auROC), consistent with established practice in imbalanced learning benchmarks (He & Garcia, 2009; Sun et al., 2009). The ROC curve plots the True Positive Rate (TPR = TP/(TP+FN)) against the False Positive Rate (FPR = FP/(FP+TN)) across all decision thresholds, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively. auROC values range from 0 (perfect misclassification) to 1 (perfect classification), with 0.5 indicating chance-level performance. auROC is particularly suited to imbalanced evaluation because it is invariant to class distribution and decision threshold selection.
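The metric can be illustrated with a short sketch. It computes auROC through its rank interpretation (the probability that a randomly chosen positive instance is scored above a randomly chosen negative one) and one point on the ROC curve from the TPR and FPR definitions above. The function names, scores, and labels are hypothetical.

```python
def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def roc_point(scores, labels, threshold):
    """(FPR, TPR) when instances scoring at or above `threshold` are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return fp / (fp + tn), tp / (tp + fn)

labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]                      # 8 majority, 2 minority
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.5, 0.6, 0.8, 0.7, 0.9]

ranked = auroc(scores, labels)            # informative scores
constant = auroc([0.0] * 10, labels)      # always predicts the majority class
fpr, tpr = roc_point(scores, labels, 0.65)
print(ranked, constant, fpr, tpr)
```

A classifier that assigns the same score to every instance would reach 80% accuracy on these labels yet obtains an auROC of 0.5, which is the behavior that motivates using auROC under imbalance.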

Each experiment was repeated across 10 independent runs with different random seeds, and mean auROC values are reported. Competitor implementations (AdaBoost, RUSBoost, SMOTEBoost, EUSBoost, DataBoost, Easy Ensemble) were sourced from the KEEL software framework (Alcalá-Fdez et al., 2011), with C4.5 used as the base learner for all competitors to ensure a fair comparison. For KMFusionNet's second set of comparisons (Table III), the AdaBoost framework was applied with individual single base learners — Decision Tree, Extra Tree, Random Forest, and SVM (Cortes & Vapnik, 1995; Liaw & Wiener, 2002) — to isolate the contribution of the alternating estimator strategy from the boosting mechanism itself.

3. Results and Discussion

3.1 Comparative Performance Against State-of-the-Art Boosting Methods

The central question motivating KMFusionNet was whether alternating between two complementary tree-based classifiers within a boosting framework could produce more reliable minority class discrimination than existing sampling-integrated or single-estimator approaches. Table II presents the mean auROC scores for KMFusionNet and six established competitors — AdaBoost, RUSBoost, SMOTEBoost, EUSBoost, DataBoost, and Easy Ensemble — across all 12 benchmark datasets.

The short answer is that it largely can, though the picture is more nuanced than a simple win across the board. On 11 of 12 datasets, KMFusionNet achieved the highest or joint-highest mean auROC. The gains were most pronounced at higher imbalance ratios. On glass6 (IR = 6.38), KMFusionNet reached 0.99 compared to AdaBoost's 0.84 — a difference of 0.15 auROC points that is practically substantial. On yeast4 (IR = 28.1), KMFusionNet achieved 0.93 versus AdaBoost's 0.62, and versus RUSBoost's 0.88. On yeast5 (IR = 32.72) and yeast6 (IR = 41.03), it reached 0.99 and 0.96, respectively, while AdaBoost managed only 0.82 and 0.75. These datasets, representing the most severely skewed distributions in the benchmark suite, are precisely the conditions under which minority class recognition is both most difficult and most consequential (Table II).

The one exception is the pima dataset (IR = 1.87), where RUSBoost substantially outperformed all other methods with an auROC of 0.83, while KMFusionNet returned 0.73, only modestly ahead of AdaBoost (0.68). This finding is worth pausing on. The pima dataset has the lowest imbalance ratio in the benchmark suite, meaning the imbalance is relatively mild: the minority class constitutes approximately 35% of instances rather than 2–3%. Under such conditions, the adaptive reweighting mechanism at the heart of KMFusionNet has less differentiation to work with, and the structural diversity of the alternating estimators may not compensate effectively for the advantages that random under-sampling offers at near-balanced distributions. This suggests a potential boundary condition for KMFusionNet's design: the framework may be most advantageous when IR is meaningfully elevated (roughly above 5, based on the observed results) and may offer diminishing returns as distributions approach balance.

On the moderately imbalanced datasets — newthyroid1 and newthyroid2 (IR = 5.14), segment0 (IR = 6.02) — KMFusionNet performed competitively, matching or narrowly leading competitors. On segment0, KMFusionNet, SMOTEBoost, RUSBoost, and DataBoost all achieved 0.99, suggesting a ceiling effect for this particular dataset rather than any differential advantage.

3.2 Effect of Alternating Estimators Versus Single Base Learners

To better isolate the contribution of the alternating estimator strategy — as distinct from the AdaBoost mechanism itself — a second experiment compared KMFusionNet against AdaBoost configured with four single base learners: Decision Tree, Extra Tree, Random Forest, and SVM. Results are reported in Table III.

The pattern that emerges is relatively consistent: KMFusionNet achieved competitive or superior auROC on most datasets, often leading over the single-estimator variants. On glass6, KMFusionNet reached 0.99 while the single Decision Tree reached only 0.84 and Random Forest 0.93. On yeast4, KMFusionNet achieved 0.91, compared to 0.62 for Decision Tree and 0.80 for Random Forest. These are meaningful differences, not marginal ones.

Interestingly, Extra Tree as a single base learner was often a strong competitor — reaching 0.97 on glass-0-1-2-3 and newthyroid2, for instance — which is not entirely surprising given its well-documented tendency to produce diverse tree structures (Geurts et al., 2006). What the results suggest is that the combination of Decision Tree and Extra Tree within KMFusionNet captures the strengths of both: the determinism and feature-splitting precision of C4.5 (Quinlan, 1996), and the stochastic subspace exploration of the Extra Tree. The ROC curve analyses for selected high-imbalance datasets (Figure 3) — including glass5, yeast6, yeast5, yeast4, yeast-2 vs 4, and segment0 — further illustrate KMFusionNet's stronger discriminative boundary across classification thresholds.

On the computational efficiency dimension, it is worth noting that Decision Tree and Extra Tree classifiers require substantially less training time than SVM (Cortes & Vapnik, 1995) or Random Forest (Liaw & Wiener, 2002). While this paper does not report wall-clock timing comparisons — a limitation that future work should address — the structural simplicity of the base learners suggests that KMFusionNet's computational overhead scales more favorably with dataset size than approaches using complex base classifiers (De Souza & Matwin, 2011). This is consistent with the motivating rationale for restricting the alternating strategy to two lightweight tree algorithms.

3.3 Interpreting the Ensemble Diversity Mechanism

A natural question to ask is: why does alternating between two tree types actually help? The intuitive explanation, illustrated in Figure 2, is that Decision Tree and Extra Tree partitions tend to explore different regions of the feature space on the same weighted training data. The deterministic greedy splitting of C4.5 tends to produce compact, high-confidence regions around easily separable instances, while the Extra Tree's randomized threshold selection casts a wider, less precise net that captures structurally different subspace boundaries. When these two types of hypothesis are combined through boosting's weighted voting mechanism, the ensemble achieves a degree of decision-boundary coverage that neither classifier achieves alone.

This is, of course, a post-hoc explanation rather than a formally verified mechanistic claim — the paper does not include a direct empirical analysis of decision boundary diversity or subspace overlap between the two classifiers. That represents a notable gap, and one that future work should address, ideally through ablation studies that compare full KMFusionNet against variants that fix one estimator or randomize the alternating order. Nevertheless, the consistent empirical advantage observed across diverse dataset characteristics offers meaningful support for the general hypothesis that structural diversity among weak learners improves boosting performance under high imbalance.

3.4 Limitations

Several limitations deserve acknowledgment. First, statistical significance testing — such as a Wilcoxon signed-rank or Friedman test across datasets, as recommended for multi-classifier comparisons — was not performed. Some observed differences, particularly on moderately imbalanced datasets where margins are 0.01–0.02 auROC, may fall within normal variance. Second, only auROC was reported as the evaluation metric; G-mean, F1-score for the minority class, and Matthews Correlation Coefficient are commonly expected in the imbalanced learning literature (He & Garcia, 2009; Sun et al., 2009) and would provide a more complete performance picture. Third, the stagnation window parameter W = 10 was adopted from prior work rather than tuned empirically, and its sensitivity on different imbalance profiles remains unknown. Fourth, neither training time nor scalability to high-dimensional or large-scale datasets was assessed. Addressing these gaps is a priority for future work.
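The additional metrics named above can all be derived from a confusion matrix. The sketch below is illustrative only: the function name and the confusion-matrix counts are hypothetical and are not results from this study.

```python
import math

def imbalance_metrics(tp, fn, fp, tn):
    """G-mean, minority-class F1, and Matthews Correlation Coefficient."""
    tpr = tp / (tp + fn)                      # sensitivity on the minority class
    tnr = tn / (tn + fp)                      # specificity on the majority class
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * tpr / (precision + tpr) if precision + tpr else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"g_mean": math.sqrt(tpr * tnr), "f1": f1, "mcc": mcc}

# Hypothetical confusion matrix: 10 minority and 390 majority instances.
useful = imbalance_metrics(tp=8, fn=2, fp=10, tn=380)
majority_only = imbalance_metrics(tp=0, fn=10, fp=0, tn=390)
print(useful)
print(majority_only)
```

A classifier that predicts only the majority class scores zero on all three metrics despite 97.5% accuracy on these counts, which is why they complement auROC in imbalanced evaluations.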

4. Conclusion

Imbalanced classification is not, at its core, a solved problem — despite several decades of active work and an extensive catalog of proposed solutions. The challenge is persistent precisely because minority classes tend to be underrepresented not just in volume but in structural regularity, making them harder for both single classifiers and standard ensembles to learn reliably. This paper proposed KMFusionNet, a hybrid adaptive boosting framework that attempts to address this challenge through a different mechanism than most prior approaches: rather than modifying the data distribution, it introduces structural diversity into the boosting process itself by alternating between two complementary tree-based weak learners — C4.5 Decision Tree and Extra Tree — stabilized by a validation-guided early stopping criterion.

Across 12 benchmark datasets with imbalance ratios ranging from 1.87 to 41.03, KMFusionNet demonstrated consistently strong performance, achieving the highest or joint-highest auROC on 11 of 12 datasets when compared against established boosting and ensemble methods. The advantages were most pronounced under severe imbalance, suggesting that the alternating estimator strategy becomes increasingly valuable as class distributions grow more skewed. The computational cost is expected to be lower than that of approaches relying on complex base learners, which would be a practically relevant consideration for real-world deployment, although training time was not measured in this study.

Future work will aim to address the limitations identified above — particularly the absence of statistical significance testing, additional evaluation metrics, and computational benchmarking — and to extend KMFusionNet to multi-class imbalanced settings and high-dimensional feature spaces.

References

Alcalá-Fdez, J., Fernandez, A., Luengo, J., Derrac, J., García, S., Sánchez, L., & Herrera, F. (2011). KEEL data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework. Journal of Multiple-Valued Logic and Soft Computing, 17(2–3), 255–287.

Bühlmann, P., & Yu, B. (2003). Boosting with the L2 loss: Regression and classification. Journal of the American Statistical Association, 98(462), 324–339.

Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic minority over-sampling technique. Journal of Artificial Intelligence Research, 16, 321–357.

Chawla, N. V., Lazarevic, A., Hall, L. O., & Bowyer, K. W. (2003). SMOTEBoost: Improving prediction of the minority class in boosting. 7th European Conference on Principles and Practice of Knowledge Discovery in Databases, 107–109.

Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297.

De Souza, É., & Matwin, S. (2011). Extending AdaBoost to iteratively vary its base classifiers. Advances in Artificial Intelligence, 384–389.

Farid, D. M., Al-Mamun, M. A., Manderick, B., & Nowe, A. (2016). An adaptive rule-based classifier for mining big biological data. Expert Systems with Applications, 64, 305–316.

Farid, D. M., Zhang, L., Rahman, C. M., Hossain, M., & Strachan, R. (2014). Hybrid decision tree and naïve Bayes classifiers for multi-class classification tasks. Expert Systems with Applications, 41(4), 1937–1946.

Farid, D. M., Zhang, L., Hossain, A., Rahman, C. M., Strachan, R., Sexton, G., & Dahal, K. (2013). An adaptive ensemble classifier for mining concept drifting data streams. Expert Systems with Applications, 40(15), 5895–5906.

Freund, Y., Schapire, R. E., et al. (1996). Experiments with a new boosting algorithm. International Conference on Machine Learning, 96, 148–156.

Galar, M., Fernández, A., Barrenechea, E., & Herrera, F. (2013). EUSBoost: Enhancing ensembles for highly imbalanced data-sets by evolutionary undersampling. Pattern Recognition, 46(12), 3460–3471.

Geurts, P., Ernst, D., & Wehenkel, L. (2006). Extremely randomized trees. Machine Learning, 63(1), 3–42.

Guo, H., & Viktor, H. L. (2004). Learning from imbalanced data sets with boosting and data generation: The DataBoost-IM approach. ACM SIGKDD Explorations Newsletter, 6(1), 30–39.

He, H., & Garcia, E. A. (2009). Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering, 21(9), 1263–1284.

Jiang, W. (2004). Process consistency for AdaBoost. Annals of Statistics, 13–29.

Kubat, M., & Matwin, S. (1997). Addressing the curse of imbalanced training sets: One-sided selection. International Conference on Machine Learning, 97, 179–186.

Laurikkala, J. (2001). Improving identification of difficult small classes by balancing class distribution. Artificial Intelligence in Medicine, 63–66.

Liaw, A., & Wiener, M. (2002). Classification and regression by random forest. R News, 2(3), 18–22.

Liu, X.-Y., Wu, J., & Zhou, Z.-H. (2009). Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 39(2), 539–550.

Mani, I., & Zhang, I. (2003). kNN approach to unbalanced data distributions: A case study involving information extraction. Proceedings of Workshop on Learning from Imbalanced Datasets, 126.

Quinlan, J. R. (1996). Improved use of continuous attributes in C4.5. Journal of Artificial Intelligence Research, 4, 77–90.

Rayhan, F., Ahmed, S., Shatabda, S., Farid, D. M., Mousavian, Z., Dehzangi, A., & Rahman, M. S. (2017). iDTI-ESBoost: Identification of drug target interaction using evolutionary and structural features with boosting. arXiv preprint arXiv:1707.00994.

Seiffert, C., Khoshgoftaar, T. M., Van Hulse, J., & Napolitano, A. (2010). RUSBoost: A hybrid approach to alleviating class imbalance. IEEE Transactions on Systems, Man, and Cybernetics—Part A: Systems and Humans, 40(1), 185–197.

Sun, Y., Wong, A. K. C., & Kamel, M. S. (2009). Classification of imbalanced data: A review. International Journal of Pattern Recognition and Artificial Intelligence, 23(4), 687–719.

Sun, Z., Song, Q., Zhu, X., Sun, H., Xu, B., & Zhou, Y. (2015). A novel ensemble method for classifying imbalanced data. Pattern Recognition, 48(5), 1623–1637.

Yao, Y., Rosasco, L., & Caponnetto, A. (2007). On early stopping in gradient descent learning. Constructive Approximation, 26(2), 289–315.

Yen, S.-J., & Lee, Y.-S. (2009). Cluster-based under-sampling approaches for imbalanced data distributions. Expert Systems with Applications, 36(3), 5718–5727.
