newmatnl.h
Go to the documentation of this file.
1 
6 
7 // Copyright (C) 1993,4,5: R B Davies
8 
9 #ifndef NEWMATNL_LIB
10 #define NEWMATNL_LIB 0
11 
12 #include "newmat.h"
13 
14 #ifdef use_namespace
15 namespace NEWMAT {
16 #endif
17 
18 
19 
20 /*
21 
22 This is a beginning of a series of classes for non-linear optimisation.
23 
24 At present there are two classes. FindMaximum2 is the basic optimisation
25 strategy when one is doing an optimisation where one has first
26 derivatives and estimates of the second derivatives. Class
27 NonLinearLeastSquares is derived from FindMaximum2. This provides the
28 functions that calculate function values and derivatives.
29 
30 A third class is now added. This is for doing maximum-likelihood when
31 you have first derviatives and something like the Fisher Information
32 matrix (eg the variance covariance matrix of the first derivatives or
33 minus the second derivatives - this matrix is assumed to be positive
34 definite).
35 
36 
37 
38  class FindMaximum2
39 
40 Suppose T is the ColumnVector of parameters, F(T) the function we want
41 to maximise, D(T) the ColumnVector of derivatives of F with respect to
42 T, and S(T) the matrix of second derivatives.
43 
44 Then the basic iteration is given a value of T, update it to
45 
46  T - S.i() * D
47 
48 where .i() denotes inverse.
49 
50 If F was quadratic this would give exactly the right answer (except it
51 might get a minimum rather than a maximum). Since F is not usually
52 quadratic, the simple procedure would be to recalculate S and D with the
53 new value of T and keep iterating until the process converges. This is
54 known as the method of conjugate gradients.
55 
56 In practice, this method may not converge. FindMaximum2 considers an
57 iteration of the form
58 
59  T - x * S.i() * D
60 
61 where x is a number. It tries x = 1 and uses the values of F and its
62 slope with respect to x at x = 0 and x = 1 to fit a cubic in x. It then
63 choses x to maximise the resulting function. This gives our new value of
64 T. The program checks that the value of F is getting better and carries
65 out a variety of strategies if it is not.
66 
67 The program also has a second strategy. If the successive values of T
68 seem to be lying along a curve - eg we are following along a curved
69 ridge, the program will try to fit this ridge and project along it. This
70 does not work at present and is commented out.
71 
72 FindMaximum2 has three virtual functions which need to be over-ridden by
73 a derived class.
74 
75  void Value(const ColumnVector& T, bool wg, Real& f, bool& oorg);
76 
77 T is the column vector of parameters. The function returns the value of
78 the function to f, but may instead set oorg to true if the parameter
79 values are not valid. If wg is true it may also calculate and store the
80 second derivative information.
81 
82  bool NextPoint(ColumnVector& H, Real& d);
83 
84 Using the value of T provided in the previous call of Value, find the
85 conjugate gradients adjustment to T, that is - S.i() * D. Also return
86 
87  d = D.t() * S.i() * D.
88 
89 NextPoint should return true if it considers that the process has
90 converged (d very small) and false otherwise. The previous call of Value
91 will have set wg to true, so that S will be available.
92 
93  Real LastDerivative(const ColumnVector& H);
94 
95 Return the scalar product of H and the vector of derivatives at the last
96 value of T.
97 
98 The function Fit is the function that calls the iteration.
99 
100  void Fit(ColumnVector&, int);
101 
102 The arguments are the trial parameter values as a ColumnVector and the
103 maximum number of iterations. The program calls a DataException if the
104 initial parameters are not valid and a ConvergenceException if the
105 process fails to converge.
106 
107 
108  class NonLinearLeastSquares
109 
110 This class is derived from FindMaximum2 and carries out a non-linear
111 least squares fit. It uses a QR decomposition to carry out the
112 operations required by FindMaximum2.
113 
114 A prototype class R1_Col_I_D is provided. The user needs to derive a
115 class from this which includes functions the predicted value of each
116 observation its derivatives. An object from this class has to be
117 provided to class NonLinearLeastSquares.
118 
119 Suppose we observe n normal random variables with the same unknown
120 variance and such the i-th one has expected value given by f(i,P)
121 where P is a column vector of unknown parameters and f is a known
122 function. We wish to estimate P.
123 
124 First derive a class from R1_Col_I_D and override Real operator()(int i)
125 to give the value of the function f in terms of i and the ColumnVector
126 para defined in class R1_CoL_I_D. Also override ReturnMatrix
127 Derivatives() to give the derivates of f at para and the value of i
128 used in the preceeding call to operator(). Return the result as a
129 RowVector. Construct an object from this class. Suppose in what follows
130 it is called pred.
131 
132 Now constuct a NonLinearLeastSquaresObject accessing pred and optionally
133 an iteration limit and an accuracy critierion.
134 
135  NonLinearLeastSquares NLLS(pred, 1000, 0.0001);
136 
137 The accuracy critierion should be somewhat less than one and 0.0001 is
138 about the smallest sensible value.
139 
140 Define a ColumnVector P containing a guess at the value of the unknown
141 parameter, and a ColumnVector Y containing the unknown data. Call
142 
143  NLLS.Fit(Y,P);
144 
145 If the process converges, P will contain the estimates of the unknown
146 parameters. If it does not converge an exception will be generated.
147 
148 The following member functions can be called after you have done a fit.
149 
150 Real ResidualVariance() const;
151 
152 The estimate of the variance of the observations.
153 
154 void GetResiduals(ColumnVector& Z) const;
155 
156 The residuals of the individual observations.
157 
158 void GetStandardErrors(ColumnVector&);
159 
160 The standard errors of the observations.
161 
162 void GetCorrelations(SymmetricMatrix&);
163 
164 The correlations of the observations.
165 
166 void GetHatDiagonal(DiagonalMatrix&) const;
167 
168 Forms a diagonal matrix of values between 0 and 1. If the i-th value is
169 larger than, say 0.2, then the i-th data value could have an undue
170 influence on your estimates.
171 
172 
173 */
174 
176 {
177  virtual void Value(const ColumnVector&, bool, Real&, bool&) = 0;
178  virtual bool NextPoint(ColumnVector&, Real&) = 0;
179  virtual Real LastDerivative(const ColumnVector&) = 0;
180 public:
181  void Fit(ColumnVector&, int);
182  virtual ~FindMaximum2() {} // to keep gnu happy
183 };
184 
186 {
187  // The prototype for a Real function of a ColumnVector and an
188  // integer.
189  // You need to derive your function from this one and put in your
190  // function for operator() and Derivatives() at least.
191  // You may also want to set up a constructor to enter in additional
192  // parameter values (that will not vary during the solve).
193 
194 protected:
195  ColumnVector para; // Current x value
196 
197 public:
198  virtual bool IsValid() { return true; }
199  // is the current x value OK
200  virtual Real operator()(int i) = 0; // i-th function value at current para
201  virtual void Set(const ColumnVector& X) { para = X; }
202  // set current para
203  bool IsValid(const ColumnVector& X)
204  { Set(X); return IsValid(); }
205  // set para, check OK
206  Real operator()(int i, const ColumnVector& X)
207  { Set(X); return operator()(i); }
208  // set para, return value
209  virtual ReturnMatrix Derivatives() = 0;
210  // return derivatives as RowVector
211  virtual ~R1_Col_I_D() {} // to keep gnu happy
212 };
213 
214 
216 {
217  // these replace the corresponding functions in FindMaximum2
218  void Value(const ColumnVector&, bool, Real&, bool&);
219  bool NextPoint(ColumnVector&, Real&);
220  Real LastDerivative(const ColumnVector&);
221 
222  Matrix X; // the things we need to do the
223  ColumnVector Y; // QR triangularisation
224  UpperTriangularMatrix U; // see the write-up in newmata.txt
226  Real errorvar, criterion;
227  int n_obs, n_param;
232  R1_Col_I_D& Pred; // Reference to predictor object
233  int Lim; // maximum number of iterations
234 
235 public:
236  NonLinearLeastSquares(R1_Col_I_D& pred, int lim=1000, Real crit=0.0001)
237  : criterion(crit), Pred(pred), Lim(lim) {}
238  void Fit(const ColumnVector&, ColumnVector&);
239  Real ResidualVariance() const { return errorvar; }
240  void GetResiduals(ColumnVector& Z) const { Z = Y; }
241  void GetStandardErrors(ColumnVector&);
242  void GetCorrelations(SymmetricMatrix&);
243  void GetHatDiagonal(DiagonalMatrix&) const;
244 
245 private:
246  void MakeCovariance();
247 };
248 
249 
250 // The next class is the prototype class for calculating the
251 // log-likelihood.
252 // I assume first derivatives are available and something like the
253 // Fisher Information or variance/covariance matrix of the first
254 // derivatives or minus the matrix of second derivatives is
255 // available. This matrix must be positive definite.
256 
257 class LL_D_FI
258 {
259 protected:
260  ColumnVector para; // current parameter values
261  bool wg; // true if FI matrix wanted
262 
263 public:
264  virtual void Set(const ColumnVector& X) { para = X; }
265  // set parameter values
266  virtual void WG(bool wgx) { wg = wgx; }
267  // set wg
268 
269  virtual bool IsValid() { return true; }
270  // return true is para is OK
271  bool IsValid(const ColumnVector& X, bool wgx=true)
272  { Set(X); WG(wgx); return IsValid(); }
273 
274  virtual Real LogLikelihood() = 0; // return the loglikelihhod
275  Real LogLikelihood(const ColumnVector& X, bool wgx=true)
276  { Set(X); WG(wgx); return LogLikelihood(); }
277 
278  virtual ReturnMatrix Derivatives() = 0;
279  // column vector of derivatives
280  virtual ReturnMatrix FI() = 0; // Fisher Information matrix
281  virtual ~LL_D_FI() {} // to keep gnu happy
282 };
283 
284 // This is the class for doing the maximum likelihood estimation
285 
286 class MLE_D_FI : public FindMaximum2
287 {
288  // these replace the corresponding functions in FindMaximum2
289  void Value(const ColumnVector&, bool, Real&, bool&);
290  bool NextPoint(ColumnVector&, Real&);
291  Real LastDerivative(const ColumnVector&);
292 
293  // the things we need for the analysis
294  LL_D_FI& LL; // reference to log-likelihood
295  int Lim; // maximum number of iterations
296  Real Criterion; // convergence criterion
297  ColumnVector Derivs; // for the derivatives
298  LowerTriangularMatrix LT; // Cholesky decomposition of FI
301 
302 public:
303  MLE_D_FI(LL_D_FI& ll, int lim=1000, Real criterion=0.0001)
304  : LL(ll), Lim(lim), Criterion(criterion) {}
305  void Fit(ColumnVector& Parameters);
306  void GetStandardErrors(ColumnVector&);
307  void GetCorrelations(SymmetricMatrix&);
308 
309 private:
310  void MakeCovariance();
311 };
312 
313 
314 #ifdef use_namespace
315 }
316 #endif
317 
318 
319 
320 #endif
321 
322 // body file: newmatnl.cpp
323 
324 
325 
327 
virtual ~LL_D_FI()
Definition: newmatnl.h:281
ColumnVector para
Definition: newmatnl.h:195
Real LogLikelihood(const ColumnVector &X, bool wgx=true)
Definition: newmatnl.h:275
LowerTriangularMatrix LT
Definition: newmatnl.h:298
DiagonalMatrix SE
Definition: newmatnl.h:300
LL_D_FI & LL
Definition: newmatnl.h:294
SymmetricMatrix Covariance
Definition: newmatnl.h:230
int Lim
Definition: newmatnl.h:295
virtual ~FindMaximum2()
Definition: newmatnl.h:182
virtual void Set(const ColumnVector &X)
Definition: newmatnl.h:264
Real ResidualVariance() const
Definition: newmatnl.h:239
SymmetricMatrix Covariance
Definition: newmatnl.h:299
double Real
Definition: include.h:307
DiagonalMatrix SE
Definition: newmatnl.h:231
virtual void Set(const ColumnVector &X)
Definition: newmatnl.h:201
MLE_D_FI(LL_D_FI &ll, int lim=1000, Real criterion=0.0001)
Definition: newmatnl.h:303
ColumnVector M
Definition: newmatnl.h:225
bool IsValid(const ColumnVector &X, bool wgx=true)
Definition: newmatnl.h:271
Upper triangular matrix.
Definition: newmat.h:799
R1_Col_I_D & Pred
Definition: newmatnl.h:232
bool IsValid(const ColumnVector &X)
Definition: newmatnl.h:203
void GetResiduals(ColumnVector &Z) const
Definition: newmatnl.h:240
bool wg
Definition: newmatnl.h:261
virtual void WG(bool wgx)
Definition: newmatnl.h:266
Real operator()(int i, const ColumnVector &X)
Definition: newmatnl.h:206
The usual rectangular matrix.
Definition: newmat.h:625
UpperTriangularMatrix U
Definition: newmatnl.h:224
NonLinearLeastSquares(R1_Col_I_D &pred, int lim=1000, Real crit=0.0001)
Definition: newmatnl.h:236
ColumnVector Derivs
Definition: newmatnl.h:297
virtual bool IsValid()
Definition: newmatnl.h:269
Diagonal matrix.
Definition: newmat.h:896
virtual bool IsValid()
Definition: newmatnl.h:198
Lower triangular matrix.
Definition: newmat.h:848
const ColumnVector * DataPointer
Definition: newmatnl.h:228
Real Criterion
Definition: newmatnl.h:296
Row vector.
Definition: newmat.h:953
virtual ~R1_Col_I_D()
Definition: newmatnl.h:211
Column vector.
Definition: newmat.h:1008
ColumnVector para
Definition: newmatnl.h:260
ColumnVector Y
Definition: newmatnl.h:223
Symmetric matrix.
Definition: newmat.h:753


kni
Author(s): Martin Günther
autogenerated on Fri Jun 7 2019 22:06:45