This page describes with the help of an example how to implement a new light-weight expression type in Eigen. This consists of three parts: the expression type itself, a traits class containing compile-time information about the expression, and the evaluator class which is used to evaluate the expression to a matrix.
TO DO: Write a page explaining the design, with details on vectorization etc., and refer to that page here.
\eigenAutoToc
A circulant matrix is a matrix where each column is the same as the column to the left, except that it is cyclically shifted downwards. For example, here is a 4-by-4 circulant matrix:
A circulant matrix is uniquely determined by its first column. We wish to write a function makeCirculant
which, given the first column, returns an expression representing the circulant matrix.
For simplicity, we restrict the makeCirculant
function to dense matrices. It may make sense to also allow arrays, or sparse matrices, but we will not do so here. We also do not want to support vectorization.
We will present the file implementing the makeCirculant
function part by part. We start by including the appropriate header files and forward declaring the expression class, which we will call Circulant
. The makeCirculant
function will return an object of this type. The class Circulant
is in fact a class template; the template argument ArgType
refers to the type of the vector passed to the makeCirculant
function.
For every expression class X
, there should be a traits class Traits<X>
in the Eigen::internal
namespace containing information about X
known as compile time.
As explained in The setting, we designed the Circulant
expression class to refer to dense matrices. The entries of the circulant matrix have the same type as the entries of the vector passed to the makeCirculant
function. The type used to index the entries is also the same. Again for simplicity, we will only return column-major matrices. Finally, the circulant matrix is a square matrix (number of rows equals number of columns), and the number of rows equals the number of rows of the column vector passed to the makeCirculant
function. If this is a dynamic-size vector, then the size of the circulant matrix is not known at compile-time.
This leads to the following code:
The next step is to define the expression class itself. In our case, we want to inherit from MatrixBase
in order to expose the interface for dense matrices. In the constructor, we check that we are passed a column vector (see Assertions) and we store the vector from which we are going to build the circulant matrix in the member variable m_arg
. Finally, the expression class should compute the size of the corresponding circulant matrix. As explained above, this is a square matrix with as many columns as the vector used to construct the matrix.
TO DO: What about the Nested
typedef? It seems to be necessary; is this only temporary?
The last big fragment implements the evaluator for the Circulant
expression. The evaluator computes the entries of the circulant matrix; this is done in the .coeff() member function. The entries are computed by finding the corresponding entry of the vector from which the circulant matrix is constructed. Getting this entry may actually be non-trivial when the circulant matrix is constructed from a vector which is given by a complicated expression, so we use the evaluator which corresponds to the vector.
The CoeffReadCost
constant records the cost of computing an entry of the circulant matrix; we ignore the index computation and say that this is the same as the cost of computing an entry of the vector from which the circulant matrix is constructed.
In the constructor, we save the evaluator for the column vector which defined the circulant matrix. We also save the size of that vector; remember that we can query an expression object to find the size but not the evaluator.
After all this, the makeCirculant
function is very simple. It simply creates an expression object and returns it.
Finally, a short main
function that shows how the makeCirculant
function can be called.
If all the fragments are combined, the following output is produced, showing that the program works as expected: