John L Jerz Website II Copyright (c) 2015

The Generalization Paradox of Ensembles (Elder, 2003)

John F. Elder IV

http://datamininglab.com/media/pdfs/Paradox_JCGS.pdf

GDF - generalized degrees of freedom

p.853 a method for improving accuracy more powerful than tailoring the algorithm has been discovered: bundling models into ensembles.

p.855 Building an ensemble consists of two steps: (1) constructing varied models, and (2) combining their estimates.
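The two steps quoted above can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: it assumes bootstrap resampling as the source of variety and a plain average as the combiner, and the names fit_line, build_ensemble, and ensemble_predict are hypothetical helpers invented for this example. The data are synthetic.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns a prediction function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    # Guard against a degenerate resample in which every x is identical.
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x

def build_ensemble(xs, ys, n_models=25, seed=0):
    """Step 1: construct varied models (assumption: one per bootstrap resample)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def ensemble_predict(models, x):
    """Step 2: combine the component estimates (assumption: simple average)."""
    return mean(m(x) for m in models)

# Synthetic data: y = 2 + 3x plus Gaussian noise.
data_rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0, 0.5) for x in xs]

models = build_ensemble(xs, ys)
pred = ensemble_predict(models, 2.0)
```

Here every component uses the same algorithm on different resamples; swapping fit_line for a list of different fitting functions would be another way to obtain variety.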

p.857 One criticism of ensembles is that interpretation of the model is now even less possible... are ensembles truly complex? They appear so: but do they act so?

p.862 Bundling competing models into ensembles almost always improves generalization - and using different algorithms is an effective way to obtain the requisite diversity of components. Ensembles appear to increase complexity, as they have many more parameters than their components; so, their ability to generalize better seems to violate the preference for simplicity embodied by "Occam’s razor." Yet, if we employ GDF - an empirical measure of the flexibility of a modeling process - to measure complexity, we find that ensembles can be simpler than their components. We argue that when complexity is thereby more properly measured, Occam’s razor is restored.
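One way to make "an empirical measure of the flexibility of a modeling process" concrete is a perturbation estimate: add small random noise to the training responses, refit, and see how strongly each fitted value follows the noise added to its own response; the summed sensitivities give the estimate. The sketch below assumes that approach (in the spirit of Ye's generalized degrees of freedom); the function names, the noise scale tau, and the replicate count are illustrative choices, not values from the paper. The fitters are deliberately simple, so the example shows only how the measurement works and is not meant to reproduce the paper's results.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Constant model: every fitted value is the mean response."""
    m = mean(ys)
    return [m for _ in xs]

def fit_line(xs, ys):
    """Least-squares line; returns fitted values at the training xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return [my + b * (x - mx) for x in xs]

def fit_average(xs, ys):
    """Two-member ensemble: average of the constant and line fits."""
    return [(a + b) / 2 for a, b in zip(fit_mean(xs, ys), fit_line(xs, ys))]

def slope(u, v):
    """Least-squares slope of v regressed on u."""
    mu, mv = mean(u), mean(v)
    return (sum((a - mu) * (b - mv) for a, b in zip(u, v))
            / sum((a - mu) ** 2 for a in u))

def estimate_gdf(fit, xs, ys, n_reps=200, tau=0.25, seed=0):
    """Sum over observations of d(fitted_i)/d(y_i), estimated by perturbation."""
    rng = random.Random(seed)
    n = len(ys)
    deltas, fits = [], []
    for _ in range(n_reps):
        d = [rng.gauss(0, tau) for _ in range(n)]      # noise added to responses
        deltas.append(d)
        fits.append(fit(xs, [y + di for y, di in zip(ys, d)]))
    # For each observation, regress its fitted value on its own perturbation.
    return sum(slope([d[i] for d in deltas], [f[i] for f in fits])
               for i in range(n))

# Synthetic data: y = 1 + 2x plus Gaussian noise.
data_rng = random.Random(2)
xs = [i / 5 for i in range(30)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0, 0.3) for x in xs]

gdf_mean = estimate_gdf(fit_mean, xs, ys)
gdf_line = estimate_gdf(fit_line, xs, ys)
gdf_avg = estimate_gdf(fit_average, xs, ys)
```

For fitters like these, whose fitted values are linear in the responses, the estimate should land near the ordinary parameter count (about 1 for the constant model, about 2 for the line), which is a useful sanity check before applying the same measurement to more flexible modeling processes.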