At its heart FACTORIE is a toolkit for graphical models. All its specific applications, including their data representation, inference and learning methods are built on a small set of common graphical model primitives.
Introduction to Graphical Models
Graphical models are a formalism in which a graph denotes the conditional dependence structure between random variables. The formalism is the marriage between probability theory and graph theory. It provides an elegant framework that combines uncertainty (probabilities) and logical structure (independence constraints) such that complex joint probability distributions over multiple variables that would have otherwise been intractable to represent or manipulate can instead be represented compactly and often manipulated efficiently. Since graphical models can straightforwardly express so many different probabilistic models, they have become a lingua-franca for statistics, machine learning, and data mining.
In graphical models, variables are depicted by the nodes a graph, drawn as circles, and dependencies among variables are depicted by edges, drawn either as directed (with arrows), or undirected (without arrows).
There are two main types of graphical models.
*Directed graphical models (also known as Bayesian networks) represent a joint distribution over random variables by a product of normalized conditional probability distributions (one for each “child” variable, conditioned on the other “parent” variables that are connected by an incoming directed edge). They are convenient generative models when variable values can be generated by an ordered iteration of the graph nodes from “parent” variables to “child” variables.
*Undirected graphical models (also known as Markov random fields) represent a joint distribution over random variables by a product of unnormalized non-negative values (one value for each clique in the graph). They are convenient models for data in which it is not intuitive to impose an ordering on the variables’ generative process. They can represent different patterns of independence constraints than directed models can, and vice versa—neither one is strictly more expressive than the other.
*Factor graphs are a generalization of both directed and undirected graphical models, capable of representing both.
When drawn, factor graphs depict variables as circular nodes in the graph (just as in directed and undirected graphical models), however rather than having edges that connect variables directly to each other, edges instead connect variables to factors (which are drawn as black squares). In other words, variables are connected to other variables only through factors. The variables connected to a factor are called the “neighbors” of the factor. A factor graph is a bipartite graph connecting variables and factors.
Factor graphs represent a joint distribution over random variables by a product of (normalized or unnormalized) non-negative values—one value for each factor in the graph. The factors can be understood as “compatibility functions” that take as input the values of the variables which they neighbor, and outputing a “compatibility score” such that higher scores indicate the combination of values is more likely, and lower scores indicate the combination of values is less likely. A score of 0 indicates the combination is impossible.
Directed graphical models can be represented by having one factor for each (child) variable, where the factor is also connected to the other (parent) variables on whose values we must condition when generating the child’s value. The factor’s scores are normalized probabilities.
Undirected graphical models can be represented with one factor per clique (or other arbitary subsets of clique members). The factors’ scores are unnormalized non-negative values, and thus in order to obtain a normalized joint probability distribution over all the variables in the model, we must normalize the product of factor scores by the proper normalization value (called the “partition function”).
In practice, most implementations of graphical models use the log of the score values so that they can be summed rather than multiplied, and a larger dynamic range of scores can be represented with limited floating-point precision in hardware. Naturally these log-scores can then vary from negative infinity (indicating an impossible combination of values in the neighbors) to positive infinity (indicating an obligatory combination of values in the neighbors). Since the use of log-scores is so common (and used in our implementation) in the remainder of this documentation we will now simply use the term “score” to refer to log-scores.
Factor graphs in FACTORIE
The FACTORIE library defines Scala classes for representing the elements of a factor graph. Usually the names of our Scala classes and methods are simply the standard names used in English machine-learning or statistics vocabulary.
In this section we further describe factor graphs while introducing
FACTORIE class and method names indicated in
mono-space font. Class
(and trait) names are capitalized. Method names begin with a
We provide here a high-level introduction to FACTORIE’s identifiers. Detailed explanations with code examples are given in subsequent chapters.
FACTORIE modular design philosophy
FACTORIE is explicitly designed to support independent, non-intertwined definitions of
- data representations with variables,
- models (value preferences, distributions, dependencies) with factors,
- inference methods, and
- parameter estimation.
This separation provides FACTORIE users with great flexibilty to mix-and-match different choices in each of these four dimensions. For example, the data representation for some task may be written just once, while different choices of dependencies are explored for this task—there is no need to re-write the data representation for each experiment with a different model. Similarly, inference methods may be selected separately from the model definition.
Representing data with variables and values
The data constituting any machine learning task are represented by
variables. A variable can be understood as a container for a value.
A variable may also have certain relations to other variables (for
example, by being part of a sequence of variables corresponding to a
sequence of part-of-speech tags). FACTORIE has many classes for
representing different types of variables. The root of this class
hierarchy is the abstract trait
Assignment represents a mapping from variables to values, and can
be treated as a function. The value of a variable
v1 in assignment
a2 can be obtained by
Note that an assignment stores a single value for a variable, not a probability distribution over values. (Of course FACTORIE has extensive facilities for representing distributions over values, but these are not intrinsic to a variable definition or assignment—rather they are the purview of inference, because different inference methods may choose to use different distribution representations for the same variable.)
There are multiple concrete sub-classes of
MutableAssignment can be modified. A
TargetAssignment returns the
gold-standard (e.g. human-labeler-provided) value for a
variable when such a labeled value is available. An
a compact efficient representation for the value assignment of a
single variable (similarly
Assignment2..4 also exist). A
HashMapAssignment is an arbitrarily-sized mutable assignment with
storage based on a HashMap.
Since we want to efficiently support the common case in which many
variables have relatively static value or unique assignments (because
they are observed, or because they are being sampled in a single MCMC
inference process) FACTORIE also has the
There may exist multiple instances of the other assignment classes,
each with its own values; however, there is only one
GlobalAssignment. For efficiency the
GlobalAssignment values are
not stored in the
GlobalAssignment, but rather in the corresponding
variable instances themselves.
All variables have a
value method that returns the variable’s
currently assigned global value. In other words for variable
GlobalAssignment(v1) is equivalent to
v1.value. Henceforth when
we refer to “a variable’s value” we mean “a variable’s
All variables also have methods for testing the equality (
!==) of their global values. Thus
v1 === v2 will
return true if variables
v2 have the same values, and is a
simple shorthand for
v1.value == v2.value. (Traditional equality
equals) on variables always tests the equality of the
system identity of the two variables, not their values.)
If some non-global assignment
a2 does not contain a value for
then looking up the value with
a2(v1) may have different behavior in
different assignment subclasses. For example, a
will throw an error, but an
simply return the variable’s globally-assigned value.
Many types of variables
FACTORIE has different subclasses of
Var for holding values of
different types. There are a large number of traits and classes for
variables. The following naming convensions make their interpretation
easier. All abstract traits and classes end in
Var, while concrete
classes end in
Variable. Almost all classes ending in
have an abstract
Var counterpart that does not necessarily specify
the mechanism by which the variable stores its value in memory.
The type of the value is named immediately before the
Variable suffix (for example,
IntegerVariable has values of type
Int). Further modifiers may appear as prefixes.
All variables also have a member type
Value indicating the Scala
type returned by its
value method. In some variables the
type is bound, but not assigned. In other cases it is assigned.
Some variables have mutable value (inheriting from the trait
MutableVar). Almost all classes ending in
Variable are mutable.
Mutable variable values can be set with the
:= method. For example,
v1 has integer values, we can set the value of
v1 to 3 by
The following is a selection of FACTORIE’s most widely-used variable classes.
- has value with Scala type Int.
- has value with Scala type Double.
- has value of type Tensor, which is defined in the FACTORIE linear algebra package
cc.factorie.la. This variable class makes no restrictions on the dimensionality of the tensor, nor the lengths of its dimensions.
- has value of type Tensor1, which is a one-dimensional Tensor (also traditionally called a “vector”). In addition each
VectorVariableis associated with a
DiscreteDomain(further described below) whose size matches the length of the variable’s vector value.
DiscreteVariable extends VectorVar
- has a value among N possible values, each of type
DiscreteValue, and each associated with an integer 0 through N-1. This
Tensor1and can also be interpreted as a “one-hot” vector with value 1.0 in one position and 0.0 everywhere else. Given a
DiscreteValue dv1its integer value can be obtained by
dv1.intValue. The length of the vector (in other words, the value of N) can be obtained by
CategoricalVariable[A] extends DiscreteVar
- has value among N possible values, each of type
CategoricalValue[A](which inherits from
DiscreteValue), each associated with an integer 0 through N-1, and also associated with a “category” (often of type String). These variables are often used for representing class labels and words, when a mapping from String category names to integer indices is desirable for efficiency (such as indexing into an array of parameters). Given a
CategoricalValue[String] cv1its integer value can be obtained by
cv1.intValueand its categorical (String) value can be obtained by
cv1.categoryValue. Its value may be set with an integer:
cv1 := 2or set by category string:
cv1 := "sports". (The mapping between Strings and integers is stored in a
CategoricalDomain, which is described below.)
CategoricalVectorVariable[A] extends VectorVar
- has value of type Tensor1, which is a one-dimensional Tensor. In addition each
CategoricalVectorVariableis associated with a
CategoricalDomain[A], which stores a mapping between values of type A (e.g.
String) and integers. Thus each position in the vector is associated with a category. This variable type is useful for storing bag-of-words counts, for example.
BooleanVariable extends CategoricalVar[Boolean]
- has one of two possible values, each of type
BooleanValue(which inherits from CategoricalValue[Boolean]), one of which is associated with integer 0 and boolean value false, the other of which is associated with integer value 1 and boolean value true. Given a
BooleanValue bv1its integer value can be obtained by
bv1.intValueand its boolean value can be obtained by
MassesVariable extends TensorVar
- has value
Masses, which are Tensors constrained to contain non-negative values.
Massesare useful as the parameters of Dirichlet distributions.
ProportionsVariable extends MassesVar
- has value
Proportions, which are
Massesconstrained to sum to 1.0.
Proportionsare useful as the parameters of discrete or multinomial distributions.
RealVariable extends VectorVar
- has a single real scalar value, stored in an object of type
RealValue(which inherits from Tensor1). This variable is similar to
DoubleValuein that it stores a scalar value, however since its value type inherits from Tensor1, it can be used in dot products.
All of the above variable classes have constructors in which their
initial value may be set. For example,
new IntegerVariable(3) will
create a new variable whose initial value is 3.
Some of the above variable types have specializations for the case in
which human-labeled gold-standard target values are known. These
specializations inherit from the
LabeledVar trait which provides a
targetValue that returns this gold-standard target value.
The target value is stored separately and may be different from the
variable’s current global assignment value. For example, if
i1 is a
LabeledIntegerVariable we can determine if the variable is currently
assigned to its gold-standard target value by evaluating the boolean
i1.value == i1.targetValue. (The method
is a convenient short-hand for this test.) In almost all cases the
target value is initialized to be the value provided in the variable
LabeledVar sub-classes include:
LabeledDiscreteVariable LabeledCategoricalVariable[A] LabeledBooleanVariable LabeledStringVariable LabeledIntegerValue LabeledDoubleValue LabeledRealValue
All the above variable types are common in existing graphical models. However, FACTORIE also has random variables for representing less traditional values. Although these may seem like peculiar value types for a graphical model, they nonetheless can be scored by a factor, and are often useful in FACTORIE programs.
- has value with Scala type String.
- has value of type
Seq[A], that is, a sequence of objects of type
ChainVariable[A] extends SeqVar[A]
- has value of type
Seq[A], but the elements
Amust inherit from
SpanVariable[A] extends SeqVar[A]
- has value of type
Seq[A], and is a subsequence of a
- has value of type
Set[A], that is, an unordered set of objects of type
- has value of type A. In other words, it is a variable whose value is a pointer to a Scala object.
- has value of type
Tuple[A,B], that is a pair of objects: a “source” of type
Aand a “destination” of type
ArrowVariable[A,B] extends EdgeVar[A,B]
EdgeVariablehas value of type
Tuple[A,B], but only the “destination” is mutable, while the “source” is immutable.
Some (but not all) FACTORIE variables have a corresponding domain
object that contains information about the set of values the variable
may take on. Such variables inherit from
domain method that returns the variable’s domain.
DiscreteVariable subclasses have a corresponding
DiscreteDomain whose main role is to provide its
of possible values the variable can take on, i.e. from 0 to N-1.
CategoricalVariable[A] subclasses have a corresponding
CategoricalDomain[A] (inheriting from
provides not only its
size but also a one-to-one mapping between the
categories and the integers 0 to N-1. For example, if
cd1 is a
cd1.category(3) returns the String
corresponding to index 3 in the domain.
the integer index corresponding to the category value
"apple" is not yet in the domain, it will be automatically added, assigned the
next integer value, and the size of the domain will be increased by
one (assuming that
cd1 has not previously been frozen).
FACTORIE also provides the ability to undo a collection of changes to variable values. (This is especially useful in Metropolis-Hastings inference, in which a proposed change to variable values is made, but may be rejected, requiring a reversion to previous values.)
Diff instance represents a change in value to a single variable.
It has methods
redo, as well as the
variable method which
returns the variable whose value was changed.
DiffList stores an ordered list of
:= method for assigning the value of a
MutableVar was described above.
An alternative method for changing the value of a
set, which, in addition to the new value, also takes a
an implicit argument. The
set method will then automatically
Diff object and append it to the given
Expressing preferences with factors and models
A collection of variables and their values represents a possible state of the world. To express a degree of preference for one possible world over another—or a probability distribution over possible worlds—we need a mechanism for scoring the different worlds, such as a collection of factors in a factor graph.
In FACTORIE, factors are represented by instances of the trait
Factors have certain basic methods. The
numVariables method returns the number of variables neighboring the
factor. The list of neighboring variables themselves is returned by
currentScore calculates the factor’s
score for the neighboring variables’ values in the current global
assignment. You can obtain the score under alternative assignments by
a1 is some
Subclasses of the
Factor trait include
Factor4, which have one, two, three or four
neighboring variables respectively. (FACTORIE does not currently
support factors with more than four neighbors; if necessary these
could be added in the future, however, thus far we have not found them
necessary because (a) the use of
other composite-valued variables handle cases in which many values are
necessary, or (b) the use of a “var-args”
The only abstract method in
Factor1 (and its other numbered
score, which takes as arguments the values of its
neighbors and returns a
Factor subclasses define a
statistics method that takes as
arguments the values of its neighbors and returns the sufficient
statistics of the factor. (Often the sufficient statistics will
simply be the combination of neighbor values, but this method provides
users with an opportunity to instead perform various useful
transformations from values to sufficient statistics.) In these cases
score method is then defined to be the result of the
statisticsScore method, whose only argument is the statistics, thus
ensuring that the factor score can be calculated using only the
For example, in the class
DotFactor2 the abstract method
statistics must return sufficient statistics of type
statisticsScore method is then defined to be the dot-product of these
sufficient statistics with another
Tensor of scoring parameters, which
is returned by the abstract method
DotFactorWithStatistics2 class inherits from
requires that both its neighbors inherit from
TensorVar. It then
statistics method to return the outer-product of the
tensor values of its two neighbors.
The naming convention is that classes having the suffix
statistics method for the user. Most other classes have
statistics method that must be defined by the user.
In many situations we want to model “relational data” in which the variables and factors appear in repeated patterns of relations to each other.
For example, in a hidden Markov model (HMM) there is a sequence of observations (random variables with observed values) and a corresponding sequence of hidden states (random variables whose value is the identity of the state of a finite-state machine that was responsible for generating that observation). Each hidden state variable has “next” and “previous” relations in the chain of hidden states. Each also has a corresponding observation variable (the one that generated it, and occurs in the same sequence position).
The standard HMM is “time invariant” (sometimes called “stationary”), meaning that the hidden state-transition probabilities and the observation-from-state generation probabilities do not depend on their position in the sequence. Thus, although each transition needs its own factor in the factor graph (because each factor has different neighboring variables), each of these factors can share the same scoring function.
In FACTORIE factors that share the same sufficient statistics
function, score function, and other similar attributes are said to
belong the same “family” of factors. The trait
Family defines a
Factor inner class such that its instances rely on methods in the
Family class for various shared functionality.
For example, the
DotFamily provides a
weights method for accessing
the dot-product parameters shared by all member factors.
Analogous to the numbered
Factor subclasses there are numbered
Family4, each defining an
inner factor classes with the corresponding number of neighbors.
Template is a subclass of
Family that is also able to
automatically construct its factors on the fly in response to requests
to “unroll” part of the graphical model. A
method takes as input a variable, and uses its knowledge about the
“pattern of relational data” to create and return the
factors that neighbor the given variable.
Note that for factors with more than one neighbor this requires traversing some
relational pattern to find the other variables that are also
neighbors of these factors. This knowledge is flexibly encoded in the
implementation of numbered “unroll” methods. For example
unroll method in terms of two abstract methods:
unroll1 method takes as input a
variable with type matching the first neighbor of the factor, and is
responsible for creating and returning a (possibly empty) collection of
templated factors that touch this variable—finding the second
neighbor in each case by traversing some relational structure among
unroll methods may seem cumbersome,
but in most cases these method definitions are extremely succinct, and
moreover the ability to specify the relational pattern through
arbitrary Turing-complete code provides a great deal of flexibility.
Nonetheless, in the future, we plan to implement a simpler template
language on top of this unroll framework.
Family there are numerous specialized subclasses of
Template. For example a
DotTemplateWithStatistics3 is a template
for a factor with three neighboring variables, each of which must
TensorVar; the template defines a factor scoring
function by the dot-product between the outer-product of the three
variables and the template’s
weights. The only abstract methods are
For the sake of flexible modularity, in FACTORIE, a “model” does not
include the data, or inference method, or learning mechanism—a model
is purely a source of factors. The key method of the
Model class is
factors which takes as input a collection of variables, and returns
a collection of
Factors that neighbor those variables. (This
definition allows efficient implementation of large relational models
in which factors are created on-the-fly in response to specific
requests instead of needing to be “unrolled” entirely in advance.)
Model trait also provides alternative
factors methods for
obtaining the factors that neighbor an individual variable, a
DiffList. There are also methods providing a filtered
collection of factors, such as
factorsOfFamilyClass. For example, the later may be used to obtain
only those factors that are members of the
DotFamily class, and thus
conducive to gradient ascent parameter estimation.
Model also has convenience methods that return the total score of
all factors neighboring the method arguments. For example, its
assignmentScore method returns the sum of the factors’ scores
according to the given assignment. Similarly
currentScore does so
for the global assignment.
There are several concrete subclasses of
factors come entirely from a collection of
Templates. By contrast,
ItemizedModel stores a set of factors created beforehand. A
CombinedModel returns the union of the sets of factors returned by
multiple submodels. There are also specialized models, such as the
ChainModel for linear-chain conditional random fields, with
specialized inference and learning algorithms.
Parameters and Weights
The situation in which templated factor scores are calculated by dot-products is so common (and so relevant to our typical parameter estimation procedures) that FACTORIE provides special support for this case.
In a templated model the parameters are a set of weights tensors, one
per factor template. The
Weights trait is a
TensorVar that holds
the parameters of one family. For example,
DotFamily has a
weights method that returns the
Weights used to calculate its
WeightsSet is then a collection of
Weights, typically holding
all the parameters of a model (across multiple families) based on dot-products.
To declare that an object has a
WeightsSet holding parameters we use
Parameters trait, which defines a
parameters method returning
WeightsSet. (Thus this trait should not be thought of as
“being” parameters, but “holding” parameters.) Its typical use-case
Model with Parameters, declaring that the object is a source of
factors, and also has a
WeightsSet available through its
During parameter estimation one typically needs additional tensors
having the same structure as the parameters, but that hold other
quantities, such as gradients or expectations. For example the
blankSparseMap method on
WeightsSet will create a “blank”
WeightsMap, which is a collection of tensors
having the same structure as the
WeightsSet that created it. This
WeightsMap can then be filled with a gradient through some
computation on training data. It is called a “map” because the
Weights objects are used as “keys” to
obtain the gradient tensor corresponding to that
In summary, both
WeightsMap make available a
WeightsSet the tensors are stored inside
TensorVars) themselves. In
the tensors are stored in a map internal to the
WeightsMap; they are
accessible by lookup using the corresponding
Weights as keys.
The abstract superclass of both
TensorSet. Although a
TensorSet is not a single tensor but a
collection of tensors, it has a variety of useful methods allowing it
to be treated somewhat like a single tensor. These methods include
twoNorm for calculating norms,
+= for incremental
dot for the sum of dot-product of all constituent tensors,
different to determine if the element-wise distance between two
WeightsSets is larger than a threshold.
Searching for solutions with inference
Once we have a way to represent a possible world (variables, with their possible values) and a way to represent preferences over different possible worlds (factors provided by a model) we can address the problem of searching for the most preferred possible world (MAP inference) or finding the probabilities of subsets of all possible worlds (marginal inference).
In FACTORIE terminology, inference is a process that takes as input a
list of variables and a
Model and produces as output a
contains a collection of
Marginals and which also has a
marginalization constant, accessible as
Marginal represents a joint distribution over a subset of the
variables. For example, a
DiscreteMarginal2 specifies two
DiscreteVars, and a
Proportions2 which contains their marginal
RealGaussianMarginal1 represents a univariate
Gaussian distribution with a specified mean and variance. Naturally a
marginal that results from maximum aposteriori inference will simply
have a spike with all probability on the maximum-scoring values. The
most widely used marginals are those over a single variable,
Marginal1 because marginals over multiple variables
more typically represented by
FactorMarginal is marginal associated with a
however, that for technical reasons it does not inherit from the
Marginal trait). These are mainly used for learning, providing a
tensor of sufficient statistics which are used to calculate parameter
Its marginal distribution may cover all the neighboring variables of the
factor, or, in the case in which inference varied some but not all of
the neighbors, the marginal distribution will be over a subset of the
neighboring variables of the factor. Its
returns the expected sufficient statistics of the factor, which always
covers all the neighboring variables of the factor; this is useful for
There is infrastructure for automatically creating a
Marginals that cover a subset of the neighbors of a
For example if we want a
FactorMarginal for a
Factor2 in which one
of the factor’s two neighbors is varying but the other is treated as
constant, we can use a
DiscreteMargina1 to construct a
DiscreteMarginal1Factor2 whose sufficient statistics will cover both
A request for inference on a model results in a
Summary which is a
When inference is run the result is a
Summary which is a container
for (a) the collection of variables that varied during inference, (b)
a collection of the corresponding single-variable
Marginal1s, (c) a
FactorMarginals for the factors used during inference,
and (d) a normalization constant,
logZ. Using a
Summary you can
look up a
Marginal1 given a variable, and can look up a
FactorMarginal given a
Several specialized implementations of
Summary are provided. For
MAPSummary puts all of its probability on one
of the variables’ values. Because it also contains a collection of
FactorMarginals, it can also be used for (max-margin)
parameter estimation. Other examples include
BPSummary for holding
the intermediate and final states of belief propagation, and
IndependentDiscreteSummary for holding an IID distribution over
The primary interface for making inference requests is in the method
infer in the
Infer trait. This method takes two or three
arguments: (1) the collection of non-constant variables whose values
should be varied and whose marginal distributions should be
calculated; (2) the
Model whose factors will score these different
assignments; and optionally, (3) a
Summary which may indicate
additional variables that should be marginalized (and the functional
form of that marginalization), or which may alternatively provide
information about the functional form to be used for the marginals of
the first argument variables.
Maximize trait is a subtype of
Infer for object that implement
maximum aposteriori (MAP) inference. It adds a method
infer and then assigns the variables to their maximizing
Generally it is easy to implement your own inferencer and make it
conform to this API (which then immediately enables its use for
parameter estimation and other tasks). Typically much of the existing
Marginals will be helpful in your
FACTORIE comes with a large collection of already-implement marginal
and MAP inference procedures, whose names start with
There are many variants of belief propagation (BP) for both
marginalization and maximization on graphical models of discrete
variables. These include
InferByBPChain, an efficient
implementation of forward-backward in linear-chain structured models;
InferByBPTree for exact inference tree-shaped graphical models;
InferByBPLoopy for approximate BP inference by on graphical models
InferByBPLoopyTreewise for more robust approximate BP
inference with tree-reparameterization [Wainwright, Jaakkola, Wilsky,
Other variational inference methods exist as well, including
InferByMeanField, an implementation of marginal inference by mean
field for discrete variables;
MaximizeByMPLP, which uses dual
coordinate ascent on the LP relaxation for MAP inference.
FACTORIE also provides infrastructure for Monte Carlo inference. This
includes a generic
Sampler as well as its special-case subclass, a
ProposalSampler which selects among a collection of
objects (each of which provides a score and a
DiffList for undoing
and redoing the proposed changes). Examples of
MHSampler (Metropolis-Hastings), and
IteratedConditionalModes which is the maximization analogue of Gibbs
These underlying sampling mechanisms are wrapped into objects that
Infer interface, providing
Finally, Expectation-Maximization (EM) and dual decomposition are
implemented outside of the
Infer API, by the
There is currently limited support for inference in that directly leverages the special case of directed graphical models. We have preliminarily implemented a system of storing and selecting among a collection of “recipes” for handling different patterns of random variables and factor types (similar to BUGS), but only a few such “recipes” have been implemented so far. One of those cases, however, provides an extremely efficient collapsed Gibbs sampler for latent Dirichlet allocation (LDA)—the FACTORIE implementation of LDA is actually faster than the widely-used MALLET implementation. See the Section on “Topic Modeling” below.
FACTORIE does not currently provide automated methods of selecting the best inference method given a model and collection of variables. This is an area of future work.
(To be provided from Luke’s text.)
In additional to its general support for factor graphs, FACTORIE provides pre-build models and other infrastructure for particular application areas.
This package is not yet implemented, but will be in the future.