maxent_toolbox

Maximum Entropy toolbox for MATLAB


maxent.trainModel

Description

Trains a maximum entropy model on empirical data. The function will automatically choose, based on the number of dimensions in the input distribution, which of two modes of operation to use:

  • For a small number (default ≤ 25) of input dimensions it will compute an exact solution and return a normalized probability distribution.
  • For a large number (default > 25) of input dimensions it will compute an approximate solution using Markov chain Monte Carlo (MCMC) methods and return a non-normalized distribution. This distribution can later be normalized using other functions in the toolbox such as wangLandau.
  • If the input model is an independent model, the function will return a normalized probability distribution regardless of the input dimension.

The user can force either an exhaustive solution or an MCMC solution by setting the optional force_exhaustive or force_mcmc argument.
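
The default selection rule described above can be sketched in a few lines of plain MATLAB. This is only an illustration of the documented behaviour, not the toolbox's own code: the variable names are hypothetical, and the cutoff of 25 is the documented default.

```matlab
% Illustrative only: mirrors the documented default rule, not toolbox code.
exhaustive_limit = 25;        % documented default cutoff (assumed fixed here)
ncells_list = [10 25 40];     % example input dimensions
is_independent = false;       % set to true for an independent model

modes = cell(size(ncells_list));
for k = 1:numel(ncells_list)
    if is_independent
        % independent models are normalized regardless of dimension
        modes{k} = 'normalized (independent model)';
    elseif ncells_list(k) <= exhaustive_limit
        modes{k} = 'normalized (exact solution)';
    else
        modes{k} = 'non-normalized (MCMC solution)';
    end
    fprintf('%d dimensions: %s\n', ncells_list(k), modes{k});
end
```

With the values above, the first two inputs fall under the exact solver and the third falls under MCMC.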

Usage

[model_out, bConverged] = maxent.trainModel(model,samples)
[model_out, bConverged] = maxent.trainModel(model,samples,Name,Value,...)

Arguments

Mandatory arguments

  • model - Maximum entropy model as returned by the createModel function.
  • samples - Set of samples to train the model on, in the format (ncells x nsamples). If the input dimensionality is small enough for an exhaustive computation, a Boltzmann distribution (in the same format as returned by createModel) may be supplied instead of a raster. In that case the target marginals are computed exactly from the distribution.

Optional arguments (in the form of name,value pairs)

  • threshold - convergence threshold in units of standard deviations in the marginals. This standard deviation is estimated using the number of samples the marginal was computed from, which means that larger datasets will be assigned tighter (actual) thresholds. For small groups of inputs (where the model can be built in an exhaustive manner) specifying a threshold of zero will set the threshold to the quantization error of the marginals, which is equal to (1/(nsamples*2)).
  • silent - suppress all printed output.
  • max_steps - maximum number of training steps. Training stops when this limit is reached, even if the model has not converged.
  • use_acceleration - true/false to use accelerated gradient descent (enabled by default).
  • force_exhaustive - set to true to force an exhaustive numerical solution. This requires storing all 2^ncells states of the distribution in memory, which grows exponentially with the number of cells, so it is not recommended for inputs of more than 30 cells.
  • force_mcmc - set to true to force an MCMC solver even when a better option is available.
  • savefile - file in which the training state is saved periodically. If the file already exists, training will try to resume from it.
  • save_delay - delay between saves (in seconds).
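
Two of the quantities mentioned in the list above are easy to work out by hand. The sketch below computes the quantization error used when threshold is zero, and a rough memory estimate for an exhaustive solution. The memory figure assumes one double-precision value (8 bytes) per state, which is an assumption made for illustration and not a statement about how the toolbox stores the distribution. The sample count is likewise an arbitrary example.

```matlab
% Quantization error of the marginals, 1/(nsamples*2), for an example dataset
nsamples = 10000;                          % arbitrary example size
quantization_error = 1 / (nsamples * 2);   % 5e-05
fprintf('quantization error: %g\n', quantization_error);

% Rough memory needed to hold all 2^ncells states.
% ASSUMPTION: one 8-byte double per state; the real footprint may differ.
ncells = [20 25 30];
bytes = 2.^ncells * 8;
gib = bytes / 2^30;                        % [0.0078125 0.25 8]
fprintf('%d cells: %g GiB\n', [ncells; gib]);
```

Under that assumption, each additional cell doubles the footprint, so 30 cells already require about 8 GiB for a single copy of the distribution.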

Output

  • model_out - Maximum entropy model describing the distribution of inputs.
  • bConverged - Returns true if the learning process converged, or false if it terminated because the specified maximum number of iterations was reached.

Example usage

    % create a pairwise ME model for 30 cells
    model = maxent.createModel(30,'pairwise');

    % train it on a sample set (samples is a 30 x nsamples raster)
    % to a threshold of two standard deviations
    model = maxent.trainModel(model,samples,'threshold',2);
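
The second output, bConverged, can be used to detect a run that stopped at the step limit. The following is a minimal sketch that uses only the arguments documented on this page. The wrapper function trainPairwiseChecked, the warning identifier, and the value 500 for max_steps are hypothetical and chosen purely for illustration.

```matlab
function model_out = trainPairwiseChecked(samples)
% trainPairwiseChecked  Hypothetical wrapper, not part of the toolbox.
%   samples: raster in the documented (ncells x nsamples) format.

ncells = size(samples, 1);
model = maxent.createModel(ncells, 'pairwise');

% Cap the number of steps so training always terminates.
% 500 is an arbitrary illustrative value, not a recommended setting.
[model_out, bConverged] = maxent.trainModel(model, samples, ...
    'threshold', 2, 'max_steps', 500);

% bConverged is false when training stopped because max_steps was reached
if ~bConverged
    warning('trainPairwiseChecked:notConverged', ...
        'Training stopped at max_steps before reaching the threshold.');
end
end
```

Saved as trainPairwiseChecked.m on the MATLAB path, it would be called as model = trainPairwiseChecked(samples).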