Ensemble classifiers are a bit like the Delphi method, in that they combine multiple models (or experts) to arrive at a model that offers better predictive performance than any single model would (Dalkey & Helmer, 1963; Acharya, 2019). In the simplest case, the base classifiers are trained independently, in parallel, and their predictions are combined by majority vote, much as a Delphi panel pools expert opinion. A variety of individual classifiers can be used, including logistic regression, nearest neighbor methods, decision trees, Bayesian methods, or discriminant analysis. According to Dietterich (2002), ensemble classification overcomes three major problems: statistical, computational, and representational. The statistical problem arises when the hypothesis space is too large for the available training data, so that several hypotheses fit the data equally well yet only one is chosen. The computational problem arises when the learning algorithm cannot guarantee finding the best hypothesis. The representational problem arises when the hypothesis space does not contain any good approximation of the target.
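To make the majority-vote idea concrete, here is a minimal Python sketch. The labels and the 70% accuracy figure are made up for illustration, and the accuracy calculation assumes a binary problem in which the classifiers make independent errors, which real classifiers rarely do exactly.

```python
from collections import Counter
from math import comb


def majority_vote(predictions):
    """Return the label predicted by the most classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def majority_accuracy(n_classifiers, p_correct):
    """Probability that a majority of n independent binary classifiers,
    each correct with probability p_correct, is right (n should be odd)."""
    k_min = n_classifiers // 2 + 1
    return sum(
        comb(n_classifiers, k)
        * p_correct ** k
        * (1 - p_correct) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


# Three hypothetical classifiers vote on one example.
print(majority_vote(["spam", "ham", "spam"]))  # spam

# Three independent classifiers that are each right 70% of the time.
print(round(majority_accuracy(3, 0.7), 3))  # 0.784
```

Under these assumptions the committee of three is right about 78% of the time, compared with 70% for any single member, and the gain grows as more independent members are added.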
Ensemble methods include bagging, boosting, and stacking. Bagging is considered a parallel or independent method. Boosting is a sequential or dependent method, and stacking is dependent in the sense that a meta-learner is trained on the outputs of the base classifiers. Parallel methods are used when independence between the base classifiers is advantageous, for example to reduce error by averaging; sequential methods are used when dependence between the classifiers is advantageous, for example to give more weight to previously mislabeled examples or to convert weak learners into a strong one (Smolyakov, 2017).
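The contrast between parallel and sequential methods can be sketched in plain Python. The example below is an illustration only: it uses one-dimensional threshold "stumps" as weak learners, a ten-point toy dataset invented for the purpose, and an AdaBoost-style weight update as one common form of boosting. All function names are hypothetical, and stacking is not shown.

```python
import math
import random
from collections import Counter


def stump_predict(stump, x):
    """A stump (threshold, polarity, error) predicts polarity if x >= threshold."""
    threshold, polarity = stump[0], stump[1]
    return polarity if x >= threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return the (threshold, polarity, weighted_error) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict((threshold, polarity), x) != y)
            if best is None or err < best[2]:
                best = (threshold, polarity, err)
    return best


def bagging(xs, ys, n_models, rng):
    """Parallel: each stump is fit on its own bootstrap sample, independently."""
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit_stump([xs[i] for i in idx],
                                [ys[i] for i in idx],
                                [1.0 / n] * n))
    return models


def bagging_predict(models, x):
    """Combine the independent stumps by majority vote."""
    return Counter(stump_predict(m, x) for m in models).most_common(1)[0][0]


def adaboost(xs, ys, n_rounds):
    """Sequential: each round up-weights the examples the previous stump got wrong."""
    n = len(xs)
    weights = [1.0 / n] * n
    models = []
    for _ in range(n_rounds):
        stump = fit_stump(xs, ys, weights)
        err = max(stump[2], 1e-10)  # avoid division by zero
        if err >= 0.5:              # no better than chance, stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        models.append((alpha, stump))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return models


def adaboost_predict(models, x):
    """Weighted vote of the stumps, using each stump's alpha as its weight."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in models)
    return 1 if score >= 0 else -1


# Toy data: labels are +1 / -1 and cannot be separated by a single threshold.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

bagged = bagging(xs, ys, n_models=5, rng=random.Random(0))
boosted = adaboost(xs, ys, n_rounds=3)
print([bagging_predict(bagged, x) for x in xs])
print([adaboost_predict(boosted, x) for x in xs])
```

The structural difference is in the loops: the bagging loop never looks at earlier models, so its iterations could run in parallel, whereas each boosting round depends on the weights left behind by the round before it.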
Random forests are themselves an ensemble method: they build multiple decision trees and aggregate the results, much like bagging (Liberman, 2017). Each tree trains on a different sample of the data and a different subset of the features, both randomly selected. Errors, variance in particular, are mitigated by the low correlation between the trees. Again, as with other ensemble classifiers and even Delphi method decision-making, learners operating as a committee should outperform any of the individual learners.
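As a rough illustration of the two ideas in this paragraph, the sketch below draws the random rows and features one tree might see, and evaluates the standard formula for the variance of an average of identically distributed, equally correlated predictions. The function names and the numbers are hypothetical. The sketch samples features once per tree for simplicity, whereas common random forest implementations draw a fresh feature subset at each split.

```python
import random


def draw_tree_inputs(n_rows, n_features, max_features, rng):
    """Pick the rows and features for one tree.
    Rows are drawn with replacement (a bootstrap sample);
    features are drawn without replacement."""
    row_idx = [rng.randrange(n_rows) for _ in range(n_rows)]
    feature_idx = sorted(rng.sample(range(n_features), max_features))
    return row_idx, feature_idx


def averaged_variance(sigma2, rho, n_trees):
    """Variance of the mean of n_trees predictions, each with variance sigma2
    and pairwise correlation rho: rho * sigma2 + (1 - rho) * sigma2 / n_trees."""
    return rho * sigma2 + (1 - rho) * sigma2 / n_trees


rng = random.Random(42)
rows, features = draw_tree_inputs(n_rows=8, n_features=5, max_features=2, rng=rng)
print(rows, features)

# The less correlated the trees, the more averaging reduces variance.
for rho in (0.9, 0.5, 0.1):
    print(rho, averaged_variance(sigma2=1.0, rho=rho, n_trees=100))
```

With highly correlated trees, adding more of them barely helps, because the first term of the formula does not shrink with the number of trees. This is why the random selection of rows and features matters.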
Acharya, T. (2019). Advanced ensemble classifiers. Retrieved from https://towardsdatascience.com/advanced-ensemble-classifiers-8d7372e74e40
Connolly, T. & Begg, C. (2015). Database Systems: A Practical Approach to Design, Implementation, and Management (6th ed.). London, UK: Pearson.
Dalkey, N., & Helmer, O. (1963). An experimental application of the Delphi method to the use of experts. Management Science, 9(3), 458-467.
Dietterich, T. G. (2000). Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems (pp. 1-15). Springer Berlin Heidelberg.
Dietterich, T. G. (2002). Ensemble learning. In M. A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks (2nd ed., pp. 405-408). Cambridge, MA: The MIT Press.
Liberman, N. (2017). Decision trees and random forests. Retrieved from https://towardsdatascience.com/decision-trees-and-random-forests-df0c3123f991
Smolyakov, V. (2017). Ensemble learning to improve machine learning results. Retrieved from https://blog.statsbot.co/ensemble-learning-d1dcd548e936
Tembhurkar, M. P., Tugnayat, R. M., & Nagdive, A. S. (2014). Overview on data mining schemes to design business intelligence framework for mobile technology. International Journal of Advanced Research in Computer Science, 5(8).
Most content also appears on my LinkedIn page.