7.9. Mahout 0.9.0

  • MAHOUT-1464 Cooccurrence Analysis on Spark

  • MAHOUT-1578 Optimizations in matrix serialization

  • MAHOUT-1572 blockify() to detect (naively) the data sparsity in the loaded data

  • MAHOUT-1571 Functional Views are not serialized as dense/sparse correctly

  • MAHOUT-1566 (Experimental) Regular ALS factorizer with conversion tests, optimizer enhancements and bug fixes

  • MAHOUT-1537 Minor fixes to spark-shell

  • MAHOUT-1529 Finalize abstraction of distributed logical plans from backend operations

  • MAHOUT-1489 Interactive Scala & Spark Bindings Shell & Script processor

  • MAHOUT-1346 Spark Bindings

  • MAHOUT-1555 Exception thrown when a test example has the label not present in training examples

  • MAHOUT-1446 Create an intro for matrix factorization

  • MAHOUT-1480 Clean up website on 20 newsgroups

  • MAHOUT-1561 cluster-syntheticcontrol.sh not running locally with MAHOUT_LOCAL=true

  • MAHOUT-1558 Clean up classify-wiki.sh and add in a binary classification problem

  • MAHOUT-1560 Last batch is not filled correctly in MultithreadedBatchItemSimilarities

  • MAHOUT-1554 Provide more comprehensive classification statistics

  • MAHOUT-1548 Fix broken links in quickstart webpage

  • MAHOUT-1542 Tutorial for playing with Mahout's Spark shell

  • MAHOUT-1533 Remove Frequent Pattern Mining

  • MAHOUT-1532 Add solve() function to the Scala DSL

  • MAHOUT-1530 Custom prompt and welcome message for the Spark Shell

  • MAHOUT-1527 Fix wikipedia classifier example

  • MAHOUT-1526 Ant file in examples

  • MAHOUT-1523 Remove @author tags in sparkbindings

  • MAHOUT-1521 lucene2seq - Error trying to load data from stored field (when non-indexed)

  • MAHOUT-1520 Fix links in Mahout website documentation

  • MAHOUT-1519 Remove StandardThetaTrainer

  • MAHOUT-1517 Remove casts to int in ALSWRFactorizer

  • MAHOUT-1513 Deprecate Canopy Clustering

  • MAHOUT-1511 Renaming core to mrlegacy

  • MAHOUT-1510 Goodbye MapReduce

  • MAHOUT-1509 Invalid URL in link from "quick start/basics" page

  • MAHOUT-1508 Performance problems with sparse matrices

  • MAHOUT-1505 structure of clusterdump's JSON output

  • MAHOUT-1504 Enable/fix thetaSummer job in TrainNaiveBayesJob

  • MAHOUT-1503 TestNaiveBayesDriver fails in sequential mode

  • MAHOUT-1502 Update Naive Bayes Webpage to Current Implementation

  • MAHOUT-1501 ClusterOutputPostProcessorDriver has private default constructor

  • MAHOUT-1498 DistributedCache.setCacheFiles in DictionaryVectorizer overwrites jars pushed using oozie

  • MAHOUT-1497 mahout resplit not producing split files

  • MAHOUT-1496 Create a website describing the distributed ALS recommender

  • MAHOUT-1491 Spectral KMeans Clustering doesn't clean its /tmp dir and fails when seeing it again

  • MAHOUT-1488 DisplaySpectralKMeans fails: examples/output/clusteredPoints/part-m-00000 does not exist

  • MAHOUT-1483 Organize links in web site navigation bar

  • MAHOUT-1482 Rework quickstart website

  • MAHOUT-1476 Cleanup website on Hidden Markov Models

  • MAHOUT-1475 Cleanup website on Naive Bayes

  • MAHOUT-1472 Cleanup website on fuzzy kmeans

  • MAHOUT-1471 Cleanup website for Canopy clustering

  • MAHOUT-1468 Creating a new page for StreamingKMeans documentation on Mahout website

  • MAHOUT-1467 ClusterClassifier readPolicy leaks file handles

  • MAHOUT-1466 Cluster visualization fails to execute

  • MAHOUT-1465 Clean up README

  • MAHOUT-1463 Modify OnlineSummarizers to use the TDigest dependency from Maven Central

  • MAHOUT-1460 Remove reference to Dirichlet in ClusterIterator

  • MAHOUT-1459 Move Hadoop related code out of CanopyClusterer

  • MAHOUT-1458 Remove KMeansConfigKeys and FuzzyKMeansConfigKeys

  • MAHOUT-1457 Move EigenSeedGenerator into spectral kmeans package

  • MAHOUT-1455 Forkcount config causes JVM crashes during build

  • MAHOUT-1451 Cleaning up the examples for clustering on the website

  • MAHOUT-1450 Cleaning up clustering documentation on Mahout website

  • MAHOUT-1449 Update the Known Issues in Random Forests Page

  • MAHOUT-1448 In Random Forest, the training does not support multiple input files. The input dataset must be one single file.

  • MAHOUT-1447 ImplicitFeedbackAlternatingLeastSquaresSolver tests and features

  • MAHOUT-1445 Create an intro for item based recommender

  • MAHOUT-1440 Add option to set the RNG seed for initial cluster generation in Kmeans/fKmeans

  • MAHOUT-1438 "quickstart" tutorial for building a simple recommender

  • MAHOUT-1434 Dead links on the web site

  • MAHOUT-1433 Make SVDRecommender look at all unknown items of a user per default

  • MAHOUT-1429 Parallelize YtransposeY in ImplicitFeedbackAlternatingLeastSquaresSolver

  • MAHOUT-1428 Recommending already consumed items

  • MAHOUT-1425 SGD classifier example with bank marketing dataset

  • MAHOUT-1420 Add solr-recommender to examples

  • MAHOUT-1419 Random decision forest is excessively slow on numeric features

  • MAHOUT-1417 Random decision forest implementation fails in Hadoop 2

  • MAHOUT-1416 Make access of DecisionForest.read(dataInput) less restricted

  • MAHOUT-1415 Clone method on sparse matrices fails if there is an empty row which has not been set explicitly

  • MAHOUT-1413 Rework Algorithms page

  • MAHOUT-1388 Add command line support and logging for MLP

  • MAHOUT-1385 Caching Encoders don't cache

  • MAHOUT-1356 Ensure unit tests fail fast when writing outside mvn target directory

  • MAHOUT-1329 Mahout for Hadoop 2

  • MAHOUT-1310 Mahout support for Windows