Mondrian Forests: Efficient Online Random Forests

Lakshminarayanan, Balaji; Roy, Daniel M.; Teh, Yee Whye

arXiv:1406.2673v1 (stat)

[Submitted on 10 Jun 2014 (this version), latest version 16 Feb 2015 (v2)]

Abstract: Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as Breiman's random forest and extremely randomized trees) operate on batches of training data. Online methods are now in greater demand. Existing online random forests, however, require more training data than their batch counterparts to achieve comparable predictive performance. In this work, we use Mondrian processes (Roy and Teh, 2009) to construct ensembles of random decision trees, which we call Mondrian forests. Mondrian forests can be grown in an incremental/online fashion and, remarkably, the distribution of online Mondrian forests is the same as that of batch Mondrian forests. Mondrian forests achieve predictive performance comparable with existing online random forests and periodically re-trained batch random forests, while being more than an order of magnitude faster, thus representing a better computation vs. accuracy tradeoff.
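
As an illustration of the Mondrian process the abstract refers to, here is a minimal sketch in Python. It assumes the standard generative description of the process on a fixed axis-aligned box: the waiting time to the next cut is exponential with rate equal to the sum of the box's side lengths, the cut dimension is chosen with probability proportional to side length, and the cut location is uniform along that side. The names `sample_mondrian`, `leaf_boxes`, and `budget` are hypothetical and chosen for this example only. This is not the paper's forest algorithm, which additionally restricts cuts to the extent of the training data and supports online updates.

```python
import random


def sample_mondrian(lower, upper, budget, rng, t=0.0):
    """Sample a Mondrian partition of the box [lower, upper] as a nested dict.

    budget is the lifetime after which no further cuts are made;
    t is the time at which the current box was created.
    """
    sides = [u - l for l, u in zip(lower, upper)]
    rate = sum(sides)  # cut rate is the "linear dimension" of the box
    leaf = {"leaf": True, "lower": list(lower), "upper": list(upper)}
    if rate <= 0.0:
        return leaf  # degenerate box, nothing left to cut

    # Exponential waiting time until the next cut of this box.
    split_time = t + rng.expovariate(rate)
    if split_time > budget:
        return leaf  # the cut would arrive after the lifetime ends

    # Choose the cut dimension proportional to side length,
    # then the cut location uniformly along that side.
    dim = rng.choices(range(len(sides)), weights=sides)[0]
    loc = rng.uniform(lower[dim], upper[dim])

    left_upper = list(upper)
    left_upper[dim] = loc
    right_lower = list(lower)
    right_lower[dim] = loc

    return {
        "leaf": False,
        "dim": dim,
        "loc": loc,
        "time": split_time,
        "left": sample_mondrian(lower, left_upper, budget, rng, split_time),
        "right": sample_mondrian(right_lower, upper, budget, rng, split_time),
    }


def leaf_boxes(node):
    """Return the (lower, upper) corners of every leaf box in the partition."""
    if node["leaf"]:
        return [(node["lower"], node["upper"])]
    return leaf_boxes(node["left"]) + leaf_boxes(node["right"])


if __name__ == "__main__":
    rng = random.Random(0)
    tree = sample_mondrian([0.0, 0.0], [1.0, 1.0], budget=2.0, rng=rng)
    for lo, hi in leaf_boxes(tree):
        print(lo, hi)
```

The leaf boxes tile the original box, and a larger `budget` yields a finer partition on average. Recording the split time at each node is what makes it possible to refine a sampled partition later, which is the property an online construction relies on.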

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1406.2673 [stat.ML]
	(or arXiv:1406.2673v1 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1406.2673

Submission history

From: Balaji Lakshminarayanan
[v1] Tue, 10 Jun 2014 19:34:51 UTC (435 KB)
[v2] Mon, 16 Feb 2015 14:57:52 UTC (594 KB)
