Fast Support Vector Machines Using Parallel Adaptive Shrinking on Distributed Systems

Narasimhan, Jeyanthi; Vishnu, Abhinav; Holder, Lawrence; Hoisie, Adolfy

arXiv:1406.5161 (cs)

[Submitted on 19 Jun 2014]

Abstract: Support Vector Machines (SVMs), a popular machine learning technique, have been applied to supervised learning in a wide range of domains such as science, finance, and social networks. Whether health-care professionals are identifying high-risk patients or school districts are identifying high-school students likely to enroll in college, SVMs can play a major role for social good. This paper undertakes the challenge of designing a scalable parallel SVM training algorithm for large-scale systems, including commodity multi-core machines, tightly connected supercomputers, and cloud computing systems. Intuitive techniques for improving time and space complexity are proposed, including adaptive elimination of samples for faster convergence and a sparse format representation. For sample elimination, several heuristics are proposed that range from earliest-possible to lazy elimination of non-contributing samples. In the cases where an early sample elimination might result in a false positive, low-overhead mechanisms for reconstructing key data structures are proposed. The algorithm and heuristics are implemented and evaluated on various publicly available datasets. Empirical evaluation shows a speedup of up to 26x over the sequential baseline on some datasets when evaluated on multiple compute nodes, and improvements in execution time of up to 30-60% over our parallel baseline on a number of other datasets.
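The abstract does not spell out the elimination heuristics, so the following is only a minimal sequential sketch of the general idea of shrinking, assuming the well-known rule documented for LIBSVM rather than the paper's own heuristics. The function name `shrink_candidates` and its parameters are hypothetical; `grad` is assumed to hold the gradient of the SVM dual objective at the current `alpha`.

```python
def shrink_candidates(alpha, y, grad, C, tol=1e-12):
    """Return indices a LIBSVM-style shrinking rule would drop.

    Assumption: this is the textbook rule, not the paper's heuristics.
    alpha: dual variables in [0, C]; y: labels in {+1, -1};
    grad: gradient of the dual objective at alpha.
    """
    n = len(alpha)

    def in_up(i):   # alpha[i] may still move in the +y[i] direction
        return (y[i] == 1 and alpha[i] < C - tol) or (y[i] == -1 and alpha[i] > tol)

    def in_low(i):  # alpha[i] may still move in the -y[i] direction
        return (y[i] == 1 and alpha[i] > tol) or (y[i] == -1 and alpha[i] < C - tol)

    score = [-y[i] * grad[i] for i in range(n)]
    # Extremes that define the current maximal violating pair.
    m = max((score[i] for i in range(n) if in_up(i)), default=float("-inf"))
    M = min((score[i] for i in range(n) if in_low(i)), default=float("inf"))

    dropped = []
    for i in range(n):
        bounded = alpha[i] <= tol or alpha[i] >= C - tol
        if not bounded:
            continue  # free variables always stay in the working set
        # A bounded sample that cannot join a violating pair is a candidate.
        if (in_low(i) and score[i] > m) or (in_up(i) and score[i] < M):
            dropped.append(i)
    return dropped


# Two free samples (0, 1) and two samples stuck at alpha = 0 (2, 3).
print(shrink_candidates(
    alpha=[0.5, 0.5, 0.0, 0.0],
    y=[1, -1, 1, -1],
    grad=[-0.1, 0.1, 2.0, 3.0],
    C=1.0,
))
```

A dropped sample may later turn out to matter (the false positive the abstract mentions), in which case its gradient entry has to be recomputed before it rejoins the active set; that reconstruction step is not shown here.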

Comments:	10 pages, 9 figures, 3 tables
Subjects:	Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
Cite as:	arXiv:1406.5161 [cs.DC]
	(or arXiv:1406.5161v1 [cs.DC] for this version)
	https://doi.org/10.48550/arXiv.1406.5161

Submission history

From: Jeyanthi Salem Narasimhan
[v1] Thu, 19 Jun 2014 19:22:28 UTC (251 KB)
