Foundry bits: @PublicationReference Annotation

May 28th, 2014

The Foundry has lots of interesting bits of code that can be helpful for doing algorithmic work in Java (or, more broadly, in JVM languages). Let's dig into one such example…

When you are implementing algorithms for research or to build a system, you may often find yourself referring to specific papers, articles, book chapters, web pages, etc. about an algorithm or some aspect of its implementation. Often these references can help provide more background on the algorithm, how it works, or explain some of the notation used in the code. To help document these dependencies and give attribution to where algorithms or implementation approaches come from, we came up with a simple Java annotation, @PublicationReference. This annotation makes it easy to trace from an implementation class back to reference materials that may provide more insight and background.

As an annotation, the @PublicationReference acts similarly to a reference in a paper by referring to some published article. The annotation has the following fields:

  • author (required): The authors of the article. Typically, each author name is a separate String in an array.
  • title (required): The title of the article.
  • type (required): The type of article. For example: Journal, Conference, Book, WebPage, … See PublicationType.
  • year (required): The year of publication.
  • publication: The title of the larger publication for the article. May be a journal, book, conference proceedings, etc.
  • pages: The range of pages for the article. Typically an array of two values like {10, 15}.
  • url: A URL where the publication can be found. This is not required but strongly recommended to help others find the information.
  • notes: Any other notes regarding the reference.

Here is an example annotation for the Online Passive-Aggressive Perceptron algorithm:

@PublicationReference(
    author={"Koby Crammer", "Ofer Dekel", "Joseph Keshet", 
        "Shai Shalev-Shwartz", "Yoram Singer"},
    title="Online Passive-Aggressive Algorithms",
    type=PublicationType.Journal,
    year=2006,
    publication="Journal of Machine Learning Research",
    pages={551, 585},
    url="http://jmlr.org/papers/volume7/crammer06a/crammer06a.pdf")

As you can see, the reference is similar to a simplified BibTeX entry. We most often use the annotation on classes, but it can also be used on methods, fields, or wherever annotations are allowed. We also provide a multiple-value container, the @PublicationReferences annotation, to combine multiple @PublicationReference annotations. The Foundry also has other reference annotations such as @ModelingApproximation and @SoftwareReference, though we find the publication reference to be the most useful.

Using the annotations keeps a good amount of information about a reference in a nice, structured form. This makes it easy to do things like generate a simple bibliography for your code, have references show up in the JavaDoc, or use reflection to create a set of references for a specific algorithm configuration. We have found these references extremely handy when someone asks a question about the code or an algorithm, or when it is time to write a paper: the reference, often with the URL, is right there in the code. In this way, the @PublicationReference has helped standardize the way we link code to relevant reference material, and we wanted to highlight it in the hope that others may find it useful as well.
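
To illustrate the reflection idea, here is a minimal sketch of building bibliography lines from annotated classes. It does not use the Foundry itself: the `Ref` annotation is a hypothetical stand-in that mirrors only a few of the fields listed above, and `PassiveAggressiveSketch` and `Bibliography` are made-up names. The stand-in declares runtime retention, which reflection needs in order to see the annotation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for @PublicationReference so the sketch compiles
// without the Foundry jar; only a subset of the fields is mirrored.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD, ElementType.FIELD})
@interface Ref {
    String[] author();
    String title();
    int year();
    String publication() default "";
    String url() default "";
}

// Hypothetical algorithm class that carries a reference.
@Ref(
    author = {"Koby Crammer", "Ofer Dekel", "Joseph Keshet",
        "Shai Shalev-Shwartz", "Yoram Singer"},
    title = "Online Passive-Aggressive Algorithms",
    year = 2006,
    publication = "Journal of Machine Learning Research",
    url = "http://jmlr.org/papers/volume7/crammer06a/crammer06a.pdf")
class PassiveAggressiveSketch {
}

final class Bibliography {
    private Bibliography() {
    }

    /** Formats one reference as a single bibliography line. */
    static String format(Ref ref) {
        StringBuilder sb = new StringBuilder();
        sb.append(String.join(", ", ref.author()));
        sb.append(". \"").append(ref.title()).append(".\" ");
        if (!ref.publication().isEmpty()) {
            sb.append(ref.publication()).append(", ");
        }
        sb.append(ref.year()).append('.');
        if (!ref.url().isEmpty()) {
            sb.append(' ').append(ref.url());
        }
        return sb.toString();
    }

    /** Collects formatted references from the classes that carry one. */
    static List<String> collect(Class<?>... classes) {
        List<String> entries = new ArrayList<>();
        for (Class<?> c : classes) {
            Ref ref = c.getAnnotation(Ref.class);
            if (ref != null) {
                entries.add(format(ref));
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        for (String entry : collect(PassiveAggressiveSketch.class)) {
            System.out.println(entry);
        }
    }
}
```

Passing the classes that make up a particular algorithm configuration to `collect` would give the set of references for that configuration; classes without an annotation are simply skipped.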

Categories: Bits Tags:

Cognitive Foundry now on GitHub

January 23rd, 2014

By popular demand, I’ve converted the development repository to Git and have made it available on GitHub. This includes migrating the source, issues, and wiki as well as setting up the historical releases. The old Trac site has thus been taken offline. We are hoping that this change will make it easier for people to find, use, and contribute to the Foundry in the future.

Categories: News Tags:

Cognitive Foundry 3.3.3 Released

May 20th, 2013

Version 3.3.3 of the Cognitive Foundry is now available for download.

Here are the release notes:

Release 3.3.3 (2013-05-20):
  * General:
    * Made code able to compile under both Java 1.6 and 1.7. This required
      removing some potentially unsafe methods that used varargs with generics.
    * Upgraded XStream dependency to 1.4.4.
    * Improved support for regression algorithms in learning.
    * Added general-purpose adapters to make it easier to compose learning
      algorithms and adapt their input or output.
  * Common Core:
    * Added isSparse, toArray, dotDivide, and dotDivideEquals methods for 
      Vector and Matrix.
    * Added scaledPlus, scaledPlusEquals, scaledMinus, and scaledMinusEquals to
      Ring (and thus Vector and Matrix) for potentially faster such operations.
    * Fixed issue where matrix and dense vector equals was not checking for 
      equal dimensionality.
    * Added transform, transformEquals, transformNonZeros, and 
      transformNonZerosEquals to Vector.
    * Made LogNumber into a signed version of a log number and moved the prior
      unsigned implementation into UnsignedLogNumber.
    * Added EuclideanRing interface that provides methods for times, 
      timesEquals, divide, and divideEquals. Also added Field interface that 
      provides methods for inverse and inverseEquals. These interfaces are now 
      implemented by the appropriate number classes such as ComplexNumber, 
      MutableInteger, MutableLong, MutableDouble, LogNumber, and 
      UnsignedLogNumber.
    * Added interface for Indexer and DefaultIndexer implementation for
      creating a zero-based indexing of values.
    * Added interfaces for MatrixFactoryContainer and 
      DivergenceFunctionContainer.
    * Added ReversibleEvaluator, which various identity functions implement as 
      well as a new utility class ForwardReverseEvaluatorPair to create a 
      reversible evaluator from a pair of other evaluators.
    * Added method to create an ArrayList from a pair of values in
      CollectionUtil.
    * ArgumentChecker now properly throws assertion errors for NaN values.
      Also added checks for long types.
    * Fixed handling of Infinity in subtraction for LogMath.
    * Fixed issue with angle method that would cause a NaN if cosine had a
      rounding error.
    * Added new createMatrix methods to MatrixFactory that initialize the 
      Matrix with the given value.
    * Added copy, reverse, and isEmpty methods for several array types to 
      ArrayUtil.
    * Added utility methods for creating a HashMap, LinkedHashMap, HashSet, or
      LinkedHashSet with an expected size to CollectionUtil.
    * Added getFirst and getLast methods for List types to CollectionUtil.
    * Removed some calls to System.out and Exception.printStackTrace.
  * Common Data:
    * Added create method for IdentityDataConverter.
    * ReversibleDataConverter now is an extension of ReversibleEvaluator.
  * Learning Core:
    * Added general learner transformation capability to make it easier to adapt
      and compose algorithms. InputOutputTransformedBatchLearner provides this
      capability for supervised learning algorithms by composing together a
      triplet. CompositeBatchLearnerPair does it for a pair of algorithms.
    * Added constant and identity learners.
    * Added Chebyshev, Identity, and Minkowski distance metrics.
    * Added methods to DatasetUtil to get the output values for a dataset and
      to compute the sum of weights.
    * Made generics more permissive for supervised cost functions.
    * Added ClusterDistanceEvaluator for taking a clustering that encodes the 
      distance from an input value to all clusters and returns the result as a
      vector.
    * Fixed potential round-off issue in decision tree splitter.
    * Added random subspace technique, implemented in RandomSubspace.
    * Separated functionality from LinearFunction into IdentityScalarFunction.
      LinearFunction by default is the same, but has parameters that can change
      the slope and offset of the function.
    * Default squashing function for GeneralizedLinearModel and 
      DifferentiableGeneralizedLinearModel is now a linear function instead of 
      an atan function.
    * Added a weighted estimator for the Poisson distribution.
    * Added Regressor interface for evaluators that are the output of 
      (single-output) regression learning algorithms. Existing such evaluators
      have been updated to implement this interface.
    * Added support for regression ensembles including additive and averaging
      ensembles with and without weights. Added a learner for regression bagging
      in BaggingRegressionLearner.
    * Added a simple univariate regression class in UnivariateLinearRegression.
    * MultivariateDecorrelator now is a VectorInputEvaluator and
      VectorOutputEvaluator.
    * Added bias term to PrimalEstimatedSubGradient.
  * Text Core:
    * Fixed issue with the start position for tokens from LetterNumberTokenizer
      being off by one except for the first one.
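
For readers unfamiliar with two of the distance metrics named in the Learning Core notes, here is a small sketch of the Minkowski and Chebyshev distances on plain double arrays. This is not the Foundry's implementation or API; `DistanceSketch` and its method names are made up for illustration. The Minkowski distance of order p is the p-th root of the sum of |x_i - y_i|^p, and the Chebyshev distance is its limit as p grows, the largest absolute coordinate difference.

```java
final class DistanceSketch {
    private DistanceSketch() {
    }

    /** Minkowski distance of order p (p >= 1) between x and y. */
    static double minkowski(double[] x, double[] y, double p) {
        checkSameLength(x, y);
        if (p < 1.0) {
            throw new IllegalArgumentException("p must be >= 1");
        }
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += Math.pow(Math.abs(x[i] - y[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }

    /** Chebyshev distance: the largest absolute coordinate difference. */
    static double chebyshev(double[] x, double[] y) {
        checkSameLength(x, y);
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(x[i] - y[i]));
        }
        return max;
    }

    private static void checkSameLength(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("dimensions must match");
        }
    }

    public static void main(String[] args) {
        double[] a = {0.0, 0.0};
        double[] b = {3.0, 4.0};
        System.out.println(minkowski(a, b, 1.0)); // Manhattan distance
        System.out.println(minkowski(a, b, 2.0)); // Euclidean distance
        System.out.println(chebyshev(a, b));
    }
}
```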

Categories: Releases Tags:

Mercurial Repository for Cognitive Foundry Source

November 1st, 2012

I’ve added links to the main site for the release and development source repositories that we’ve been using for open source development of the Cognitive Foundry. You can browse the sources online or make a local clone using Mercurial as described in the source page. The release repository contains the source for the latest release, which at this time is still 3.3.2. The development repository has the latest-and-greatest code, though it may not be as stable as the release version.

Categories: News Tags:

Cognitive Foundry 3.3.2 Released

November 7th, 2011

Version 3.3.2 of the Cognitive Foundry is now available for download.

Release 3.3.2 (2011-11-07):
  * Common Core:
    * Added checkedAdd and checkedMultiply functions to MathUtil, providing a
      means for conducting Integer addition and multiplication with explicit
      checking for overflow and underflow, and throwing an ArithmeticException
      if they occur.  Java fails silently in integer over(under)flow situations.
    * Added explicit integer overflow checks to DenseMatrix.  The underlying MTJ
      library stores a dense matrix as a single-dimensional array indexed by
      integers, which in Java are 32-bit.  When creating a matrix with numRows
      rows and numColumns columns, if numRows * numColumns is more than
      2^31 - 1, a silent integer overflow would occur, resulting in later
      ArrayIndexOutOfBoundsExceptions when attempting to access matrix elements
      that didn't get allocated.
    * Added new methods to DiagonalMatrix interface for multiplying diagonal
      matrices together and for inverting a DiagonalMatrix.
    * Optimized operations on diagonal matrices in DiagonalMatrixMTJ.
    * Added checks to norm method in AbstractVectorSpace and DefaultInfiniteVector
      for power set to NaN, throwing an ArithmeticException if encountered.
  * Learning Core:
    * Optimized matrix multiplies in LogisticRegression to avoid creating dense
      matrices unnecessarily and to reduce computation time using improved
      DiagonalMatrix interfaces.
    * Added regularization and explicit bias estimation to
      MultivariateLinearRegression.
    * Added ConvexReceiverOperatingCharacteristic, which computes the convex
      hull of the ROC.
    * Fixed rare corner-case bug in ReceiverOperatingCharacteristic and added
      optional trapezoidal AUC computation.
    * Cleaned up constant in MultivariateCumulativeDistributionFunction and
      added publication references.
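
The overflow checks described in the Common Core notes can be sketched as follows. This is an illustration under assumptions, not the code in MathUtil or DenseMatrix: the class name `CheckedMathSketch` is hypothetical, and the approach shown (widen to long, then range-check) is just one common way to detect int overflow. On Java 8 and later, `Math.addExact` and `Math.multiplyExact` offer the same behavior in the standard library. The `main` method shows the matrix-sizing case, where a plain int product silently wraps to a negative value.

```java
final class CheckedMathSketch {
    private CheckedMathSketch() {
    }

    /** Adds two ints, throwing ArithmeticException on overflow or underflow. */
    static int checkedAdd(int a, int b) {
        return narrow((long) a + (long) b, "addition");
    }

    /** Multiplies two ints, throwing ArithmeticException on overflow or underflow. */
    static int checkedMultiply(int a, int b) {
        return narrow((long) a * (long) b, "multiplication");
    }

    // The long result of two int operands cannot itself overflow, so a
    // simple range check is enough before narrowing back to int.
    private static int narrow(long value, String operation) {
        if (value > Integer.MAX_VALUE || value < Integer.MIN_VALUE) {
            throw new ArithmeticException(
                "int overflow in " + operation + ": " + value);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        int numRows = 50_000;
        int numColumns = 50_000;
        // Plain int arithmetic wraps around without any error.
        System.out.println("unchecked: " + (numRows * numColumns));
        try {
            checkedMultiply(numRows, numColumns);
        } catch (ArithmeticException e) {
            System.out.println("checked: " + e.getMessage());
        }
    }
}
```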

Categories: Releases Tags:

Cognitive Foundry 3.3.1 Released

October 6th, 2011

Version 3.3.1 of the Cognitive Foundry is released. Go and download it now. Here is the list of changes:

Release 3.3.1 (2011-10-06):

  * Common Core:
    * Added NumericMap interface, which provides a mapping of keys to numeric
      values.
    * Added ScalarMap interface, which extends NumericMap to provide a mapping
      of objects to scalar values represented as doubles.
    * Added AbstractScalarMap and AbstractMutableDoubleMap to provide abstract,
      partial implementations of the ScalarMap interface.
    * Added VectorSpace interface, where a VectorSpace is a type of Ring that
      you can perform Vector-like operations on such as norm, distances, etc.
    * Added AbstractVectorSpace, which provides an abstract, partial
      implementation of the VectorSpace interface.
    * Updated Vector, AbstractVector, VectorEntry to build on new VectorSpace
      interface and AbstractVectorSpace class.
    * Added InfiniteVector interface, which has a potentially infinite number
      of indices, but contains only a countable number in any given instance.
    * Added DefaultInfiniteVector, an implementation of the InfiniteVector
      interface backed by a LinkedHashMap.
    * Rewrote FiniteCapacityBuffer from the ground up, now with backing from a
      fixed-size array to minimize memory allocation.
    * Renamed IntegerCollection to IntegerSpan.
  * Learning Core:
    * Updated ReceiverOperatingCharacteristic to improve calculation.
    * Added PriorWeightedNodeLearner interface, which provides for configuring the
      prior weights on the learning algorithm that searches for a decision
      function inside a decision tree.
    * Updated AbstractDecisionTreeNode to fix off-by-one error in computing a
      node's depth.
    * Updated CategorizationTreeLearner to add ability to specify class priors
      for decision tree algorithm.
    * Updated VectorThresholdInformationGainLearner to add class priors to
      information gain calculation.
    * Updated SequentialMinimalOptimization to improve speed.
    * Added LinearBasisRegression, which uses a basis function to generate
      vectors before performing a LinearRegression.
    * Added MultivariateLinearRegression, which performs multivariate regression;
      does not explicitly estimate a bias term or perform regularization.
    * Added LinearDiscriminantWithBias, which provides a LinearDiscriminant with
      an additional bias term that gets added to the output of the dot product.
    * Updated LinearRegression and LogisticRegression to provide for bias term
      estimation and use of L2 regularization.
    * Renamed SquashedMatrixMultiplyVectorFunction to GeneralizedLinearModel.
    * Renamed DifferentiableSquashedMatrixMultiplyVectorFunction to
      DifferentiableGeneralizedLinearModel.
    * Renamed MatrixMultiplyVectorFunction to MultivariateDiscriminant.
    * Added MultivariateDiscriminantWithBias, which provides a multivariate
      discriminant with a bias term.
    * Renamed DataHistogram to DataDistribution.
    * Renamed AbstractDataHistogram to AbstractDataDistribution.
    * Added DefaultDataDistribution, a default implementation of the
      DataDistribution interface that uses a backing map.
    * Added LogisticDistribution, an implementation of the scalar logistic
      distribution.
    * Updated MultivariateGaussian to provide for incremental estimation of
      covariance-matrix inverse without a single matrix inversion.
    * Removed DecoupledVectorFunction.
    * Removed DecoupledVectorLinearRegression.
    * Removed PointMassDistribution.
    * Removed MapBasedDataHistogram.
    * Removed MapBasedPointDistribution.
    * Removed MapBasedSortedDataHistogram.
    * Removed AbstractBayseianRegression.
    * Additional general reworking and clean up of distribution code,
      impacting classes in gov.sandia.cognition.statistics.distribution
      package.
  * Text Core:
    * Renamed LatentDirichetAllocationVectorGibbsSampler to
      LatentDirichletAllocationVectorGibbsSampler to fix misspelling.
    * Added ParallelLatentDirichletAllocationVectorGibbsSampler, a parallelized
      version of Latent Dirichlet Allocation.
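
The FiniteCapacityBuffer note in the Common Core section mentions backing the buffer with a fixed-size array to minimize memory allocation. A minimal sketch of that general idea is a ring buffer, shown below. This is not the Foundry class: `RingBufferSketch` is a hypothetical name, it stores doubles rather than generic elements, and it assumes that adding to a full buffer evicts the oldest element.

```java
final class RingBufferSketch {
    private final double[] data; // allocated once, never resized
    private int head;            // index of the oldest element
    private int size;            // number of stored elements

    RingBufferSketch(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.data = new double[capacity];
    }

    /** Appends a value, overwriting the oldest one when the buffer is full. */
    void addLast(double value) {
        if (size < data.length) {
            data[(head + size) % data.length] = value;
            size++;
        } else {
            data[head] = value;
            head = (head + 1) % data.length;
        }
    }

    /** Returns the i-th element, where 0 is the oldest still stored. */
    double get(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index: " + i);
        }
        return data[(head + i) % data.length];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        RingBufferSketch buffer = new RingBufferSketch(3);
        for (double v = 1.0; v <= 4.0; v++) {
            buffer.addLast(v);
        }
        // After four adds to a capacity-3 buffer, the first value is gone.
        for (int i = 0; i < buffer.size(); i++) {
            System.out.println(buffer.get(i));
        }
    }
}
```

Because the array is allocated once in the constructor, adding elements never triggers further allocation, which is the benefit the release note points to.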

We’ll try to get it up in Maven central soon.

Categories: Releases Tags:

Cognitive Foundry Now Available via Maven and Ivy

September 8th, 2011

The current version of the Cognitive Foundry (3.3.0) is now available in the Maven central repository. Thus, if you use Maven or Ivy as part of your build system, you can easily add the Foundry to your Java projects and get all the goodness of dependency management. All six primary jars (Common Core, Common Data, Learning Core, Text Core, Framework Core, and Framework Learning) are available, so you can pick and choose the parts you want to use. Future versions of the Foundry will be posted to Maven central as well.

If you use Maven, then you can add the following dependencies to your pom.xml file for the various parts of the Foundry you want to use, or include all of them:

<dependencies>
  <dependency>
    <groupId>gov.sandia.foundry</groupId>
    <artifactId>gov-sandia-cognition-common-core</artifactId>
    <version>3.3.0</version>
  </dependency>
  <dependency>
    <groupId>gov.sandia.foundry</groupId>
    <artifactId>gov-sandia-cognition-common-data</artifactId>
    <version>3.3.0</version>
  </dependency>
  <dependency>
    <groupId>gov.sandia.foundry</groupId>
    <artifactId>gov-sandia-cognition-learning-core</artifactId>
    <version>3.3.0</version>
  </dependency>
  <dependency>
    <groupId>gov.sandia.foundry</groupId>
    <artifactId>gov-sandia-cognition-text-core</artifactId>
    <version>3.3.0</version>
  </dependency>
  <dependency>
    <groupId>gov.sandia.foundry</groupId>
    <artifactId>gov-sandia-cognition-framework-core</artifactId>
    <version>3.3.0</version>
  </dependency>
  <dependency>
    <groupId>gov.sandia.foundry</groupId>
    <artifactId>gov-sandia-cognition-framework-learning</artifactId>
    <version>3.3.0</version>
  </dependency>
</dependencies>

If you use Ivy, you can add dependencies using the following declarations in your ivy.xml file:

<dependencies>
    <dependency org="gov.sandia.foundry" name="gov-sandia-cognition-common-core"        rev="3.3.0"/>
    <dependency org="gov.sandia.foundry" name="gov-sandia-cognition-common-data"        rev="3.3.0"/>
    <dependency org="gov.sandia.foundry" name="gov-sandia-cognition-learning-core"      rev="3.3.0"/>
    <dependency org="gov.sandia.foundry" name="gov-sandia-cognition-text-core"          rev="3.3.0"/>
    <dependency org="gov.sandia.foundry" name="gov-sandia-cognition-framework-core"     rev="3.3.0"/>
    <dependency org="gov.sandia.foundry" name="gov-sandia-cognition-framework-learning" rev="3.3.0"/>
</dependencies>

Unless you have changed your Ivy resolvers, you should be able to pick these up just by adding the above.

Let us know if you have any questions. Thanks to Andrew for the suggestion.

Categories: News Tags:

Forums added

June 6th, 2011

I set up some forums for this site. Please make use of them to ask questions, provide answers, and share information about the Cognitive Foundry.

Categories: News Tags:

Welcome to cognitivefoundry.org

June 5th, 2011

Welcome to cognitivefoundry.org, the community site for the Cognitive Foundry. The Cognitive Foundry was created by the Cognitive Systems group at Sandia National Laboratories to be a software platform for building intelligent systems. Started in 2006, it was open sourced in 2010 under a BSD-style license. It is primarily written in Java and has a heavy emphasis on machine learning algorithms.

This site was created to help provide information about the Foundry and to foster the community of Foundry users.

Categories: News Tags: