Release notes:

* General:
* Added new custom matrix package implementation. The MTJ-based implementation is still the default and the two interoperate, though sticking to one implementation is generally more efficient.
* Added new Graph package containing several graph algorithms.
* Common:
* New custom matrix package implementation in gov.sandia.cognition.math.matrix.custom. Contains both sparse and dense implementations of Vector and Matrix. It is optimized for certain use cases around sparse matrices and for dynamically switching between sparse and dense.
* Added default implementations to scalar function interfaces, making them easier to use as lambdas.
* Improved interoperability between matrix/vector implementations through abstract class implementations.
* Added method to get vector and matrix factories from those objects.
* Added methods to create uniform or Gaussian random vectors and matrices.
* Added method to check that multiplication dimensions match for matrices.
* Added method to count non-zeros in a vector.
* Added methods to get the max and min value from a VectorSpace, including implementations on vectors.
* Added primitive ArrayList implementations: DoubleArrayList, IntArrayList.
* CollectionUtil: Added collection equality checkers.
* Added equals and hashCode implementations to DefaultKeyValuePair.
* Indexer and DefaultIndexer: Added a clear method.
* KDTree: Added a method to find items within a given radius.
* Learning:
* Changed the Gamma distribution sampling algorithm to greatly improve performance. This also improves performance of Beta and Dirichlet distribution sampling.
* Added DBSCAN clustering implementation.
* Added mini-batch k-means clustering implementation.
* Improved performance of k-means and partitional clustering.
* Added normalized centroid cluster creator, within-cluster divergence, and random cluster initializer.
* Added implementation of the Burrows Delta algorithm.
* Added out-of-bag stopping criterion for bagging and refactored it for IVoting.
* Improved memory use of IVoting by removing a redundant allocation.
* Added several conjugate gradient matrix solvers and matrix-vector solvers, including preconditioned versions.
* Added multi-partite valence algorithm.
* Added hard sigmoid and hard tanh activation functions.
* Text:
* Added valence spreading implementation.
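The hard sigmoid and hard tanh mentioned above are piecewise-linear approximations of their smooth counterparts that are cheap to compute and bounded. A minimal sketch follows; the slope and clipping bounds use common conventions and are assumptions, not necessarily the Foundry's exact parameterization:

```java
// Sketch of hard sigmoid and hard tanh as piecewise-linear activations.
// The slope (0.2) and clipping bounds are common conventions and are
// assumptions here; the Foundry's exact parameterization may differ.
public class HardActivations {

    // hardSigmoid(x) = clamp(0.2 * x + 0.5, 0, 1)
    public static double hardSigmoid(double x) {
        return Math.max(0.0, Math.min(1.0, 0.2 * x + 0.5));
    }

    // hardTanh(x) = clamp(x, -1, 1)
    public static double hardTanh(double x) {
        return Math.max(-1.0, Math.min(1.0, x));
    }

    public static void main(String[] args) {
        System.out.println(hardSigmoid(0.0)); // 0.5, matching the smooth sigmoid at 0
        System.out.println(hardTanh(3.0));    // saturates at 1.0
    }
}
```

Because both functions are linear inside the clipping region, their gradients are constant there, which is part of their appeal for fast neural network training.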

Release notes:

* General:
* Updated to Java 1.8.
* Common:
* Added callback-based transform methods to Vector that pass the index and value.
* Learning:
* Fixed issue in RandomSubVectorThresholdLearner where feature selection was ignored if no sampling was done.
* Fixed mean and variance computation in StudentTConfidence.
* Added hyperbolic tangent function.

Release notes:

* General:
* Upgraded MTJ to 1.0.3.
* Common:
* Added package for hash function computation, including Eva, FNV-1a, MD5, Murmur2, Prime, SHA1, and SHA2.
* Added callback-based forEach implementations to Vector and InfiniteVector, which can be faster for iterating through some vector types.
* Optimized DenseVector by removing a layer of indirection.
* Added method to compute a set of percentiles in UnivariateStatisticsUtil and fixed an issue with percentile interpolation.
* Added utility class for enumerating combinations.
* Adjusted ScalarMap implementation hierarchy.
* Added method for copying a map to VectorFactory and moved createVectorCapacity up from SparseVectorFactory.
* Added method for creating a square identity matrix to MatrixFactory.
* Added Random implementation that uses a cached set of values.
* Learning:
* Implemented feature hashing.
* Added factory for random forests.
* Implemented uniform distribution over integer values.
* Added Chi-squared similarity.
* Added KL divergence.
* Added general conditional probability distribution.
* Added interfaces for Regression, UnivariateRegression, and MultivariateRegression.
* Fixed null pointer exception that could happen in k-means with an empty cluster.
* Fixed name of the maxClusters property on AgglomerativeClusterer (was called maxMinDistance).
* Text:
* Improvements to the LDA Gibbs sampler.

Release notes:

* General:
* Updated MTJ to version 1.0.2 and netlib-java to 1.1.2.
* Updated XStream to version 1.4.8.
* Common:
* Fixed issue in VectorUnionIterator.
* Learning:
* Added Alternating Least Squares (ALS) Factorization Machine training implementation.
* Fixed performance issue in the Factorization Machine where the linear component was not making use of sparsity.
* Added utility function to sigmoid unit.

Release notes:

* General:
* Now requires Java 1.7 or higher.
* Improved compatibility with Java 1.8 functions by removing the CloneableSerializable requirement from many function-style interfaces.
* Common Core:
* Improved iteration speed over sparse MTJ vectors.
* Added utility methods for more stable log(1 + x), exp(x) - 1, log(1 - exp(x)), and log(1 + exp(x)) to LogMath.
* Added method for creating a partial permutation to Permutation.
* Added methods for computing standard deviation to UnivariateStatisticsUtil.
* Added increment, decrement, and list view methods to Vector and Matrix.
* Added shorter get and set aliases for the Vector and Matrix getElement and setElement methods.
* Added dot as an alias of dotProduct in VectorSpace.
* Added utility methods for divideByNorm2 to VectorUtil.
* Learning:
* Added a learner for a Factorization Machine using SGD.
* Added an iterative reporter for validation set performance.
* Added new methods to statistical distribution classes to allow for faster sampling without boxing, in batches, or without creating extra memory.
* Made generics for performance evaluators more permissive.
* Changed ParameterGradientEvaluator to no longer require the input, output, and gradient types to be the same. This allows more sensible gradient definitions for scalar functions.
* Added parameter to enforce a minimum size in a leaf node for decision tree learning. It is configured through the splitting function.
* Added ability to filter which dimensions to use in the random subspace and variance tree node splitter.
* Added ReLU, leaky ReLU, and soft plus activation functions for neural networks.
* Added IntegerDistribution interface for distributions over natural numbers.
* Added a method to get the mean of a numeric distribution without boxing.
* Fixed an issue in DefaultDataDistribution that caused the total to be off when a value was set to less than or equal to 0.
* Added property for rate to GammaDistribution.
* Added method to get the standard deviation from a UnivariateGaussian.
* Added clone operations for decision tree classes.
* Fixed issue in TukeyKramerConfidence interval computation.
* Fixed serialization issue with SMO output.
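The stable log(1 + exp(x)) mentioned in the LogMath bullet above matters because the naive `Math.log(1 + Math.exp(x))` overflows to infinity for large x. A minimal sketch of the standard trick follows; the method name is illustrative, not the actual LogMath API:

```java
// Sketch of a numerically stable log(1 + exp(x)), in the spirit of the
// LogMath utilities described above. The method name here is a made-up
// illustration, not the Foundry's actual API.
public class LogMathSketch {

    // Naive Math.log(1 + Math.exp(x)) overflows for large x; branch on
    // the sign so the exponential's argument is never positive.
    public static double log1PlusExp(double x) {
        if (x > 0.0) {
            // log(1 + e^x) = x + log(1 + e^{-x}); e^{-x} <= 1 here.
            return x + Math.log1p(Math.exp(-x));
        }
        // For x <= 0, e^x <= 1, so log1p is accurate directly.
        return Math.log1p(Math.exp(x));
    }

    public static void main(String[] args) {
        System.out.println(log1PlusExp(0.0));    // ln(2) ~ 0.6931
        System.out.println(log1PlusExp(1000.0)); // 1000.0; naive version gives Infinity
    }
}
```

The same branching idea underlies stable implementations of log(1 - exp(x)) and the other LogMath helpers listed above.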

When you are implementing algorithms for research or to build a system, you may often find yourself referring to specific papers, articles, book chapters, web pages, etc. about an algorithm or some aspect of its implementation. Often these references can help provide more background on the algorithm, how it works, or explain some of the notation used in the code. To help document these dependencies and give attribution to where algorithms or implementation approaches come from, we came up with a simple Java annotation, @PublicationReference. This annotation makes it easy to trace from an implementation class back to reference materials that may provide more insight and background.

As an annotation, the @PublicationReference acts similarly to a reference in a paper by referring to some published article. The annotation has the following fields:

* **author** (required): The authors of the article. Typically, each author name is a separate String in an array.
* **title** (required): The title of the article.
* **type** (required): The type of article. For example: Journal, Conference, Book, WebPage, … See PublicationType.
* **year** (required): The year of publication.
* **publication**: The title of the larger publication containing the article. May be a journal, book, conference proceedings, etc.
* **pages**: The range of pages for the article. Typically an array of two values like `{10, 15}`.
* **url**: A URL where the publication can be found. This is not required but strongly recommended to help others find the information.
* **notes**: Any other notes regarding the reference.

Here is an example annotation for the Online Passive-Aggressive Perceptron algorithm:

```java
@PublicationReference(
    author={"Koby Crammer", "Ofer Dekel", "Joseph Keshet",
        "Shai Shalev-Shwartz", "Yoram Singer"},
    title="Online Passive-Aggressive Algorithms",
    type=PublicationType.Journal,
    year=2006,
    publication="Journal of Machine Learning Research",
    pages={551, 585},
    url="http://jmlr.org/papers/volume7/crammer06a/crammer06a.pdf")
```

As you can see, the reference is similar to a simplified BibTeX entry. We most often use the annotation on classes, but it can also be used on methods, fields, or wherever annotations are allowed. We also provide a multiple-value container, the @PublicationReferences annotation, to combine multiple @PublicationReference annotations. The Foundry also has other reference annotations, such as @ModelingApproximation and @SoftwareReference, though we find the publication reference the most useful.

Using the annotations allows a good amount of information about a reference to be kept in a structured form. This makes it easy to do things like generate a simple bibliography for your code, have references show up in the JavaDoc, or use reflection to create a set of references for a specific algorithm configuration. We have found these references extremely handy when someone asks a question about the code or an algorithm, or when it is time to write a paper: the reference, often with a URL, is right there in the code. In this way, @PublicationReference has helped standardize how we link code to relevant reference material, and we wanted to highlight it in the hope that others may find it useful as well.
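The reflection use case above can be sketched with a toy stand-in annotation. The @Reference annotation below is a made-up, minimal substitute for the Foundry's @PublicationReference (only a few fields, different name), just to show how a runtime-retained annotation yields a bibliography via reflection:

```java
import java.lang.annotation.*;

// A made-up, minimal stand-in for @PublicationReference, purely to
// illustrate pulling a bibliography out of annotated classes.
public class BibliographySketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
    @interface Reference {
        String[] author();
        String title();
        int year();
    }

    @Reference(
        author = {"Koby Crammer", "Ofer Dekel", "Joseph Keshet",
            "Shai Shalev-Shwartz", "Yoram Singer"},
        title = "Online Passive-Aggressive Algorithms",
        year = 2006)
    static class OnlinePassiveAggressivePerceptron {}

    // Format the reference on a class into a one-line citation string.
    static String citation(Class<?> c) {
        Reference r = c.getAnnotation(Reference.class);
        return String.join(", ", r.author())
            + ". \"" + r.title() + "\" (" + r.year() + ")";
    }

    public static void main(String[] args) {
        System.out.println(citation(OnlinePassiveAggressivePerceptron.class));
    }
}
```

The key ingredient is RetentionPolicy.RUNTIME, which keeps the annotation available to getAnnotation at run time; scanning a package of algorithm classes this way produces a simple bibliography.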

Here are the release notes:

Release 3.3.3 (2013-05-20):

* General:
* Made code able to compile under both Java 1.6 and 1.7. This required removing some potentially unsafe methods that used varargs with generics.
* Upgraded XStream dependency to 1.4.4.
* Improved support for regression algorithms in learning.
* Added general-purpose adapters to make it easier to compose learning algorithms and adapt their input or output.
* Common Core:
* Added isSparse, toArray, dotDivide, and dotDivideEquals methods for Vector and Matrix.
* Added scaledPlus, scaledPlusEquals, scaledMinus, and scaledMinusEquals to Ring (and thus Vector and Matrix) for potentially faster such operations.
* Fixed issue where matrix and dense vector equals was not checking for equal dimensionality.
* Added transform, transformEquals, transformNonZeros, and transformNonZerosEquals to Vector.
* Made LogNumber into a signed version of a log number and moved the prior unsigned implementation into UnsignedLogNumber.
* Added EuclideanRing interface that provides methods for times, timesEquals, divide, and divideEquals. Also added Field interface that provides methods for inverse and inverseEquals. These interfaces are now implemented by the appropriate number classes such as ComplexNumber, MutableInteger, MutableLong, MutableDouble, LogNumber, and UnsignedLogNumber.
* Added interface for Indexer and DefaultIndexer implementation for creating a zero-based indexing of values.
* Added interfaces for MatrixFactoryContainer and DivergenceFunctionContainer.
* Added ReversibleEvaluator, which various identity functions implement, as well as a new utility class ForwardReverseEvaluatorPair to create a reversible evaluator from a pair of other evaluators.
* Added method to create an ArrayList from a pair of values in CollectionUtil.
* ArgumentChecker now properly throws assertion errors for NaN values. Also added checks for long types.
* Fixed handling of Infinity in subtraction for LogMath.
* Fixed issue with angle method that would cause a NaN if cosine had a rounding error.
* Added new createMatrix methods to MatrixFactory that initialize the Matrix with a given value.
* Added copy, reverse, and isEmpty methods for several array types to ArrayUtil.
* Added utility methods for creating a HashMap, LinkedHashMap, HashSet, or LinkedHashSet with an expected size to CollectionUtil.
* Added getFirst and getLast methods for List types to CollectionUtil.
* Removed some calls to System.out and Exception.printStackTrace.
* Common Data:
* Added create method for IdentityDataConverter.
* ReversibleDataConverter is now an extension of ReversibleEvaluator.
* Learning Core:
* Added general learner transformation capability to make it easier to adapt and compose algorithms. InputOutputTransformedBatchLearner provides this capability for supervised learning algorithms by composing together a triplet. CompositeBatchLearnerPair does it for a pair of algorithms.
* Added constant and identity learners.
* Added Chebyshev, Identity, and Minkowski distance metrics.
* Added methods to DatasetUtil to get the output values for a dataset and to compute the sum of weights.
* Made generics more permissive for supervised cost functions.
* Added ClusterDistanceEvaluator for taking a clustering that encodes the distance from an input value to all clusters and returning the result as a vector.
* Fixed potential round-off issue in decision tree splitter.
* Added random subspace technique, implemented in RandomSubspace.
* Separated functionality from LinearFunction into IdentityScalarFunction. LinearFunction by default is the same, but has parameters that can change the slope and offset of the function.
* Default squashing function for GeneralizedLinearModel and DifferentiableGeneralizedLinearModel is now a linear function instead of an atan function.
* Added a weighted estimator for the Poisson distribution.
* Added Regressor interface for evaluators that are the output of (single-output) regression learning algorithms. Existing such evaluators have been updated to implement this interface.
* Added support for regression ensembles, including additive and averaging ensembles with and without weights. Added a learner for regression bagging in BaggingRegressionLearner.
* Added a simple univariate regression class in UnivariateLinearRegression.
* MultivariateDecorrelator now is a VectorInputEvaluator and VectorOutputEvaluator.
* Added bias term to PrimalEstimatedSubGradient.
* Text Core:
* Fixed issue with the start position for tokens from LetterNumberTokenizer being off by one, except for the first one.

Release 3.3.2 (2011-11-07):

* Common Core:
* Added checkedAdd and checkedMultiply functions to MathUtil, providing a means for performing integer addition and multiplication with explicit checking for overflow and underflow, throwing an ArithmeticException if they occur. Java fails silently in integer overflow and underflow situations.
* Added explicit integer overflow checks to DenseMatrix. The underlying MTJ library stores a dense matrix as a single one-dimensional array indexed by int, which in Java is 32-bit. When creating a matrix with numRows rows and numColumns columns, if numRows * numColumns is more than 2^31 - 1, a silent integer overflow would occur, resulting in later ArrayIndexOutOfBoundsExceptions when attempting to access matrix elements that didn't get allocated.
* Added new methods to the DiagonalMatrix interface for multiplying diagonal matrices together and for inverting a DiagonalMatrix.
* Optimized operations on diagonal matrices in DiagonalMatrixMTJ.
* Added checks to the norm method in AbstractVectorSpace and DefaultInfiniteVector for a power set to NaN, throwing an ArithmeticException if encountered.
* Learning Core:
* Optimized matrix multiplies in LogisticRegression to avoid creating dense matrices unnecessarily and to reduce computation time using the improved DiagonalMatrix interfaces.
* Added regularization and explicit bias estimation to MultivariateLinearRegression.
* Added ConvexReceiverOperatingCharacteristic, which computes the convex hull of the ROC curve.
* Fixed rare corner-case bug in ReceiverOperatingCharacteristic and added optional trapezoidal AUC computation.
* Cleaned up constant in MultivariateCumulativeDistributionFunction and added publication references.
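The overflow scenario described above (numRows * numColumns exceeding 2^31 - 1) can be sketched with a checked multiply in the spirit of the MathUtil addition; the class and method names below are illustrative, not the actual Foundry API:

```java
// Sketch of an overflow-checked integer multiply, in the spirit of the
// MathUtil.checkedMultiply described above. Names here are illustrative.
public class OverflowCheckSketch {

    // Multiply two ints, throwing instead of silently wrapping around.
    public static int checkedMultiply(int a, int b) {
        long result = (long) a * (long) b; // 64-bit product cannot overflow
        if (result > Integer.MAX_VALUE || result < Integer.MIN_VALUE) {
            throw new ArithmeticException("integer overflow: " + a + " * " + b);
        }
        return (int) result;
    }

    public static void main(String[] args) {
        // A 50000 x 50000 dense matrix needs 2.5 billion entries, which
        // exceeds 2^31 - 1; plain int multiply silently wraps to a wrong value:
        System.out.println(50000 * 50000);
        System.out.println(checkedMultiply(5, 7)); // prints 35
        try {
            checkedMultiply(50000, 50000);
        } catch (ArithmeticException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Since Java 8 the standard library offers Math.multiplyExact and Math.addExact with the same throwing behavior; the manual version above shows what such a check does internally.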