Amesos_Merikos.h
// This file is out of date.  It has not been refactored to use Amesos_Status.

/*
Task list:
  Amesos_Merikos.h
  Amesos_Merikos.cpp
     Partition the matrix - store L as L^T?
     Build tree
     Initial data redistribution
     Change row and column ownership (pass them up to the parent)
  Amesos_Component_Solver.h
  Amesos_BTF.h
  Amesos_BTF.cpp



  Communications issues/challenges:
  **  Redistributing the original matrix to the arrowhead form that we need, options:
      1)  Two matrices:  L^T and U
      2)  One matrix:  U | L^T
      3)  Intermediate "fat" matrix - the way that I did Scalapack

  **  Adding the Schur complements SC_0 and SC_1 to the original
      trailing matrix A_2, owned by the parent
      1)  Precompute the final size and shape of A_2 + SC_0 + SC_1
      2)  Perform A_2 + SC_0 + SC_1 in an empty matrix of size n by n
          and extract the non-empty rows (a rough sketch follows this block).
      CHALLENGES:
      A)  Only process 0/1 knows the size and map for SC_0/SC_1
      B)  It would be nice to allow SC_0 and SC_1 to be sent as soon as
          they are available
      C)  It would be nice to have just one copy of the matrix on the
          parent.  Hence, it would be nice to know the shape of
          A_2 + SC_0 + SC_1 in advance.
      D)  An import would do the trick provided that we know both maps
          in advance.  But, neither map is available to us in advance.
          The original map (which has SC_0 on process 0 and SC_1 on
          process 1) is not known to the parent in advance.
      QUESTION:
      Should the maps be in some global address space or should they be
      in a local address space?
      I'd like to keep them in the global address space as long as possible,
      but we can't do the import of A_2 + SC_0 + SC_1 in a global
      address space because that would require a map that changes at each
      level of the tree.
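
      A rough sketch of idea (2) above in Epetra terms.  The names SC0, SC0Map
      and A2Plus are placeholders (not part of this file); A2Plus is an empty,
      not-yet-FillComplete'd Epetra_CrsMatrix built on the parent's row map:

          Epetra_Export Exporter( SC0Map, A2Plus.RowMap() );  // source map, target map
          A2Plus.Export( SC0, Exporter, Add );                // colliding entries are summed
          // ... likewise for SC_1 and for the rows of A_2, then:
          A2Plus.FillComplete();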

  **  Redistributing the right hand side vector, b
      If we create a map that reflects the post-pivoting reality, assigning
      each row of U and each column of L to the process that owns the diagonal
      entry, we can redistribute the right hand side vector, b, to the
      processes where the values of b will first be used, in a single,
      efficient import operation.
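
      A minimal sketch of that single import.  RowOwnerMap stands for the
      post-pivoting map described above and is not defined in this file:

          Epetra_Import Importer( RowOwnerMap, b.Map() );      // target map, source map
          Epetra_MultiVector bRedist( RowOwnerMap, b.NumVectors() );
          bRedist.Import( b, Importer, Insert );               // one communication step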


Observations:
1)  Although this algorithm is recursive, a non-recursive implementation
    might be cleaner.  If it is done recursively, it should be done in place,
    i.e. any data movement of the matrix itself should have been done in
    advance.
2)  There are two choices for the basic paradigm for parallelism.  Consider
    a two level bisection of the matrix, yielding seven tasks or diagonal
    blocks:  T0, T1, T01, T2, T3, T23 and T0123.  In both paradigms,
    T0, T1, T2 and T3 would each
    be handled by a single process.  Up until now, we have been assuming
    that T01 would be handled by processes 0 and/or 1 while T23 would be
    handled by processes 2 and/or 3.  The other option is to arrange the
    tasks/diagonal blocks as follows:  T0, T1, T2, T3, T01, T23, T0123 and
    treat the last three blocks:  T01, T23 and T0123 as a single block to be
    handled by all four processes.  This second paradigm includes an
    additional synchronization, but may allow a better partitioning of
    the remaining matrix because the resulting Schur complement is now
    known.  This improved partitioning will also improve the refactorization
    (i.e. pivotless factorization).  The second paradigm may also allow for
    better load balancing.  For example, after using recursive minimum
    degree bisection (or any other scheme) to partition the matrix, one could
    run a peephole optimization pass that would look for individual blocks
    that could be moved from the largest partition to a smaller one.  Finally,
    if it is clear that a given partition is going to be the slowest, it might
    make sense to shift some rows/columns off of it into the splitter just
    for load balancing.
3)  It seems possible that Merikos would be cleaner code if rows
    which are shared by multiple processes were split such that each row
    resides entirely within a given process.

4)  Support for pivotless refactorization is important.
5)  There is no mention of the required row and column permutations.
6)  Amesos_Merikos only needs to support the Amesos_Component interface if
    it will call itself recursively on the subblocks.
7)  Perhaps Amesos_Component.h should be an added interface.  Instead
    of replacing Amesos_BaseSolver.h, maybe it should add functionality
    to it.
*/
// @HEADER
// ***********************************************************************
//
//                Amesos: Direct Sparse Solver Package
//                 Copyright (2004) Sandia Corporation
//
// Under terms of Contract DE-AC04-94AL85000, there is a non-exclusive
// license for use of this work by or on behalf of the U.S. Government.
//
// This library is free software; you can redistribute it and/or modify
// it under the terms of the GNU Lesser General Public License as
// published by the Free Software Foundation; either version 2.1 of the
// License, or (at your option) any later version.
//
// This library is distributed in the hope that it will be useful, but
// WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
// Lesser General Public License for more details.
//
// You should have received a copy of the GNU Lesser General Public
// License along with this library; if not, write to the Free Software
// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301
// USA
// Questions? Contact Michael A. Heroux (maherou@sandia.gov)
//
// ***********************************************************************
// @HEADER

#ifndef _AMESOS_MERIKOS_H_
#define _AMESOS_MERIKOS_H_

#include "Amesos_ConfigDefs.h"
#include "Amesos_BaseSolver.h"
#include "Epetra_LinearProblem.h"
#include "Epetra_Time.h"
#ifdef EPETRA_MPI
#include "Epetra_MpiComm.h"
#else
#include "Epetra_Comm.h"
#endif
#include "Epetra_CrsGraph.h"

// Forward declarations for classes referred to only through pointers below.
class Epetra_Map;
class Epetra_CrsMatrix;

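//
//  Typical use follows the Amesos_BaseSolver interface.  A minimal sketch
//  (A, x and b are Epetra objects supplied by the caller; the parameters
//  Amesos_Merikos accepts are not documented in this file):
//
//    Epetra_LinearProblem Problem( &A, &x, &b );
//    Amesos_Merikos Solver( Problem );
//    Teuchos::ParameterList Params;
//    Solver.SetParameters( Params );
//    if( Solver.SymbolicFactorization() == 0 &&
//        Solver.NumericFactorization()  == 0 )
//      Solver.Solve();
//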
//! Amesos_Merikos:  A parallel divide and conquer solver.
class Amesos_Merikos: public Amesos_BaseSolver {

public:

  //! Amesos_Merikos Constructor.
  Amesos_Merikos(const Epetra_LinearProblem& LinearProblem );

  //! Amesos_Merikos Destructor.
  ~Amesos_Merikos(void);

  //! Performs SymbolicFactorization on the matrix A.
  int SymbolicFactorization() ;

  //! Performs NumericFactorization on the matrix A.
  int NumericFactorization() ;

  //! Solves L X = B.
  int LSolve();

  //! Solves U X = B.
  int USolve();

  //! Solves A X = B.
  int Solve();

  //! Get a pointer to the Problem.
  const Epetra_LinearProblem *GetProblem() const { return(Problem_); };

  //! Returns true if MERIKOS can handle this matrix shape.
  bool MatrixShapeOK() const ;

  //! SetUseTranspose() controls whether to compute AX=B or A^T X = B.
  int SetUseTranspose(bool UseTranspose);
  //! Returns the current UseTranspose setting.
  bool UseTranspose() const {return(UseTranspose_);};

  //! Returns the Epetra_Comm communicator associated with this matrix.
  const Epetra_Comm & Comm() const {return(GetProblem()->GetOperator()->Comm());};

  //! Updates internal variables.
  int SetParameters( Teuchos::ParameterList &ParameterList ) ;

  //! Returns the number of symbolic factorizations performed by this object.
  int NumSymbolicFact() const { return( NumSymbolicFact_ ); }

  //! Returns the number of numeric factorizations performed by this object.
  int NumNumericFact() const { return( NumNumericFact_ ); }

  //! Returns the number of solves performed by this object.
  int NumSolve() const { return( NumSolve_ ); }

  //! Print timing information.
  void PrintTiming();

  //! Print information about the factorization and solution phases.
  void PrintStatus();

 protected:

  const Epetra_LinearProblem * Problem_;

  Epetra_CrsMatrix *L;    // Lower triangular factor
  Epetra_CrsMatrix *U;    // Upper triangular factor

  bool UseTranspose_;        // returned by UseTranspose()

  int NumSymbolicFact_;      // counters returned by NumSymbolicFact(),
  int NumNumericFact_;       // NumNumericFact() and NumSolve()
  int NumSolve_;
  //  Some timing internals, copied from MUMPS.
  double ConTime_;                        // time to convert to MERIKOS format
  double SymTime_;                        // time for symbolic factorization
  double NumTime_;                        // time for numeric factorization
  double SolTime_;                        // time for solution
  double VecTime_;                        // time to redistribute vectors
  double MatTime_;                        // time to redistribute matrix

  Epetra_Time * Time_;

  //
  //  These allow us to use the ScaLAPACK-based Merikos code.
  //
  Epetra_Map *ScaLAPACK1DMap_ ;          // Points to a 1D Map which matches a ScaLAPACK 1D
                                         //   blocked (not block cyclic) distribution
  Epetra_CrsMatrix *ScaLAPACK1DMatrix_ ; // Points to a matrix in a ScaLAPACK 1D
                                         //   blocked (not block cyclic) distribution
  Epetra_Map *VectorMap_ ;               // Points to a Map for the vectors X and B
  std::vector<double> DenseA_;           // The data in a ScaLAPACK 1D blocked format
  std::vector<int> Ipiv_ ;               // ScaLAPACK pivot information
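
  //  One possible way to build the 1D blocked (not block cyclic) map above;
  //  a sketch only, assuming NumRows and an Epetra_Comm named Comm are in scope:
  //
  //    int NumProcs  = Comm.NumProc();
  //    int MyPID     = Comm.MyPID();
  //    int BlockSize = ( NumRows + NumProcs - 1 ) / NumProcs;         // rows per process
  //    int MyFirst   = MyPID * BlockSize;
  //    int MyRows    = std::max( 0, std::min( BlockSize, NumRows - MyFirst ) );
  //    Epetra_Map Map1D( NumRows, MyRows, 0, Comm );                  // contiguous blocks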

  //
  //  Control of the data distribution
  //
  bool TwoD_distribution_;  // True if 2D data distribution is used
  int grid_nb_;             // Row and column blocking factor (only used in 2D distribution)
  int mypcol_;              // Process column in the ScaLAPACK 2D grid
  int myprow_;              // Process row in the ScaLAPACK 2D grid
  Epetra_CrsMatrix* FatOut_;

  //
  //  Blocking factors (for both 1D and 2D data distributions)
  //
  int nb_;
  int lda_;

  int RedistributeA();
  int ConvertToScalapack();
  int PerformNumericFactorization();

};  // End of class Amesos_Merikos
#endif /* _AMESOS_MERIKOS_H_ */