A list of the algorithms that Mahout has implemented or is currently implementing
Classification
(SGD)
Support Vector Machines (SVM) (open)
(open)
(open, but might help)
(integrated)
(open, GSOC2010)
(integrated)
(awaiting patch commit)
Hidden Markov Models (HMM) (MAHOUT-627, MAHOUT-396, MAHOUT-734) - Training is done in Map-Reduce
Clustering
(integrated)
(integrated)
(integrated)
Expectation Maximization (EM)
(integrated)
()
(integrated)
(integrated)
(integrated)
(integrated)
(integrated)
Pattern Mining
(Also known as Frequent Itemset mining)
Regression
(open)
Dimension reduction
(available since 0.3)
(PCA and dimensionality reduction workflow is now integrated with SSVD)
Principal Components Analysis (PCA) (open)
(open)
(GDA) (open)
Evolutionary Algorithms
-
NOTE: Watchmaker support has been removed as of 0.7.
see also:
Here you will find information, examples, use cases, and other material related to Evolutionary Algorithms.
Introductions and Tutorials:
Examples:
Recommenders / Collaborative Filtering
Mahout contains both simple non-distributed recommender implementations and distributed Hadoop-based recommenders.
- (integrated)
- (integrated)
- (integrated)
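To make the collaborative filtering idea concrete, here is a minimal non-distributed sketch in plain Python. It is not Mahout's API: the function `recommend`, the `prefs` data, and the co-occurrence scoring are all assumptions chosen for illustration. It scores items a user has not seen by how often they appear alongside items the user already has.

```python
from collections import defaultdict


def recommend(preferences, user, top_n=3):
    """Rank items the user has not seen by co-occurrence with items they have."""
    # Count how often each ordered pair of items appears in the same user's history.
    cooccurrence = defaultdict(int)
    for items in preferences.values():
        for a in items:
            for b in items:
                if a != b:
                    cooccurrence[(a, b)] += 1

    # Sum the co-occurrence counts that link a seen item to an unseen one.
    seen = preferences[user]
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in seen and b not in seen:
            scores[b] += count

    # Highest score first; ties are broken by item name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical preference data: user -> set of items they interacted with.
prefs = {
    "alice": {"book", "film"},
    "bob": {"book", "film", "game"},
    "carol": {"film", "game", "song"},
}
suggestions = recommend(prefs, "alice")
```

Everything here runs in memory on one machine. A distributed recommender would have to split the pair counting across workers, which this sketch does not attempt.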
Vector Similarity
Mahout contains implementations that compare one or more vectors with another set of vectors. This is useful, for instance, when calculating the pairwise similarity between all documents (or a subset of documents) in a corpus.
- RowSimilarityJob: builds an inverted index and then computes distances between items that have co-occurrences. This is a fully distributed calculation.
- VectorDistanceJob: does a map-side join between a set of "seed" vectors and all of the input vectors.
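The two approaches can be sketched on a single machine in plain Python. This is an illustration of the ideas only, not the Mahout jobs themselves: the function names `row_similarity` and `seed_distances`, the dict-based sparse vectors, and the choice of cosine as the measure are all assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as {column: value} dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def row_similarity(rows):
    """Compare only rows that share at least one non-zero column."""
    # Inverted index: column -> ids of the rows that have a value in it.
    index = defaultdict(set)
    for row_id, vec in rows.items():
        for col in vec:
            index[col].add(row_id)

    # Candidate pairs are rows that co-occur in some column.
    pairs = set()
    for row_ids in index.values():
        pairs.update(combinations(sorted(row_ids), 2))

    return {(i, j): cosine(rows[i], rows[j]) for i, j in pairs}


def seed_distances(seeds, rows):
    """Cosine distance from every seed vector to every input vector."""
    return {
        (s, r): 1.0 - cosine(seed_vec, row_vec)
        for s, seed_vec in seeds.items()
        for r, row_vec in rows.items()
    }


# Hypothetical documents as sparse term vectors.
docs = {
    "d1": {"hadoop": 1.0, "mahout": 1.0},
    "d2": {"mahout": 1.0, "cluster": 1.0},
    "d3": {"recipe": 1.0},
}
similar = row_similarity(docs)
distances = seed_distances({"s1": {"mahout": 1.0}}, docs)
```

The inverted index is what keeps the first function from comparing every row with every other row: `d3` shares no term with the other documents, so it never appears in a candidate pair. The second function has no such pruning, since it compares each seed against every input vector.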
Other
This article is reposted from 拖鞋崽's 51CTO blog. Original link: http://blog.51cto.com/1992mrwang/1337941