Applied Mathematical Sciences
Volume 166
Editors
S.S. Antman J.E. Marsden L. Sirovich

Advisors
J.K. Hale P. Holmes J. Keener
J. Keller B.J. Matkowsky A. Mielke
C.S. Peskin K.R. Sreenivasan

For further volumes:
http://www.springer.com/series/34

Pavel B. Bochev • Max D. Gunzburger

Least-Squares Finite Element Methods


Pavel B. Bochev
Sandia National Laboratories
Applied Mathematics and Applications
MS 1320, P.O. Box 5800
Albuquerque NM 87185-1320
USA
pbboche@sandia.gov

Max D. Gunzburger
Florida State University
Department of Scientific Computing
400 Dirac Science Library
Tallahassee FL 32306-4120
USA
gunzburg@fsu.edu

Editors:
S.S. Antman
Department of Mathematics
and
Institute for Physical Science
and Technology
University of Maryland
College Park
MD 20742-4015
USA
ssa@math.umd.edu

J.E. Marsden
Control and Dynamical
Systems, 107-81
California Institute
of Technology
Pasadena, CA 91125
USA
marsden@cds.caltech.edu

L. Sirovich
Laboratory of Applied Mathematics
Department of Biomathematical Sciences
Mount Sinai School of Medicine
New York, NY 10029-6574
USA
Lawrence.Sirovich@mssm.edu

ISBN 978-0-387-30888-3
e-ISBN 978-0-387-68922-7
DOI 10.1007/978-0-387-68922-7

Library of Congress Control Number: 2008943966
Mathematics Subject Classification (2000): 65N30, 65N35, 65N12, 65N21, 65M60, 65M70, 65M12
© Springer Science+Business Media, LLC 2009

All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York,
NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in
connection with any form of information storage and retrieval, electronic adaptation, computer software,
or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are
not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject
to proprietary rights.
Printed on acid-free paper
springer.com

To my mother and Biliana
To Janet


Preface

Since their emergence in the early 1950s, finite element methods have become one
of the most versatile and powerful methodologies for the approximate numerical
solution of partial differential equations. At the time of their inception, finite element methods were viewed primarily as a tool for solving problems in structural
analysis. However, it did not take long to discover that finite element methods could
be applied with equal success to problems in other engineering and scientific fields.
Today, finite element methods are also in common use, and indeed are often the
method of choice, for incompressible fluid flow, heat transfer, electromagnetics, and
advection-diffusion-reaction problems, just to name a few. Given the early connection between finite element methods and problems engendered by energy minimization principles, it is not surprising that the first mathematical analyses of finite element methods were given in the environment of the classical Rayleigh–Ritz setting.
Yet again, using the fertile soil provided by functional analysis in Hilbert spaces, it
did not take long for the rigorous analysis of finite element methods to be extended
to many other settings. Today, finite element methods are unsurpassed with respect
to their level of theoretical maturity.
A finite element method is first and foremost a quasi-projection scheme.1 This
truly fundamental property establishes a link with sophisticated mathematical structures and has a tremendous impact on the algebraic problems generated by the
method. The paradigm that describes and defines the key properties of what we
refer to as a quasi-projection is the marriage of two ingredients: a variational principle and a closed subspace. Approximate solutions are then characterized as quasi-projections of the exact (weak) solutions onto the closed subspace. Finite element
approximations are an example of this rule. Indeed, a finite element method and
its properties are completely determined by specifying the variational principle and

1 A projection in a Hilbert space is, of course, defined with respect to an inner product and a
closed subspace. Finite element methods do not always involve a true projection because they are
not always based on inner products; however, as we show, finite element approximations do, in
general, possess many of the important properties of projections.


the approximation subspace.2 The cornerstone of the great success of finite element
methods is the remarkable fact that the combination of these two ingredients proves
to be exceptionally well suited for the numerical solution of partial differential equations.
From a mathematical viewpoint, this success is rooted in the existence of strong
intrinsic links between differential equations and variational principles; in fact, it is
often the case that the latter serve as the primary mathematical model. On the other
hand, the practical appeal of finite element methods and their wide acceptance in the
engineering community results from the choice of approximating subspaces. These
spaces are spanned by locally supported, piecewise polynomial functions defined
over a subdivision of a domain into simple geometrical subdomains referred to as
finite elements. Such spaces are not only simple to use, but allow for the almost
automatic generation of high-order methods3 with respect to arbitrary unstructured
subdivisions, a trait that, if not outright impossible, is not easy to accomplish with
other schemes. The popularity of finite element methods is also largely due to the
small support of standard finite element basis functions. This property implies that
the resulting algebraic problems involve banded and sparse matrices.
Although the choice of approximation space has, especially from a practical point
of view, a tremendous influence on the attributes of the resulting discretized systems,
all fundamental properties of finite element methods are ultimately governed by the
variational principles from which they are defined. It is, therefore, a fortuitous coincidence that the variational foundations of the first finite element methods were
provided by unconstrained, quadratic energy minimization principles, a setting that
turned out to be, by far, the most attractive one for finite element methods. Such
principles search for a point (in a suitable function space) that is the unconstrained
minimizer of a convex quadratic functional and give rise to true inner-product projections in Hilbert spaces. This property, exemplified by the classical Rayleigh–Ritz
principle, results in distinct, unique, and very desirable computational and analytic
advantages for the resulting finite element methods. Most notably, true inner product
projections allow for a wide and nonrestrictive choice for the approximating spaces
and lead to symmetric and positive definite algebraic systems.
The connection between finite element methods and Rayleigh–Ritz principles
was not immediately recognized as the fundamental cause for the remarkable success of the first finite element methods. As a result, some of the early attempts
directed at extending finite element methods beyond the class of problems whose
solutions can be characterized as global minimizers of convex quadratic functionals
2 There exists a fundamental philosophical difference between finite difference and finite element
methods. The former substitutes difference operators for differential operators and leads to approximations defined with respect to a discrete lattice, i.e., the fundamental discretization step is
to approximate operators. Finite element methods, on the other hand, do not deal directly with
the differential operators; instead, discretization is effected using alternative weak formulations of
these equations posed over finite-dimensional function spaces. This leads to functional approximation as opposed to operator approximation. Finite volume methods occupy a middle ground as
they exhibit features shared by both finite element and finite difference methods.
3 This is due to the possibility of using internal degrees of freedom that enrich the finite element
space inside each element.


led to disappointing results and a reluctance (by some) to apply the methods outside
the field of structural analysis. However, with the rapid development of the mathematical theory of finite element methods and the rapid accumulation of a wealth
of practical experience in applying the methods, there came an understanding of the reasons behind these early setbacks. It was realized that differential equations not associated with unconstrained minimization principles lead to
two other classes of variational principles with strikingly different properties. In
one of them, the quasi-projection is induced by a search for a stationary point of
an indefinite functional; constrained minimization problems provide an example of
this class. The other type offers even less mathematical structure and defines the
quasi-projection by a formal residual orthogonalization process. In both cases, the
variational principle does not lead to a true inner-product projection; this causes the
associated finite element methods to operate in a much less favorable setting as compared to those based on Rayleigh–Ritz principles. The computation of stationary
points that is the paradigm of mixed-Galerkin methods demands strict compatibility
conditions on the discrete spaces, if stable and accurate approximations are desired.
Likewise, the formal residual orthogonalization process that provides the template
for Galerkin methods may require onerous stability conditions on the approximation
spaces. In both cases, one is confronted by the problem of solving indefinite and/or
non-symmetric algebraic equations.
Not surprisingly, there have been many efforts aimed at developing finite element
methods that, for problems not connected to unconstrained energy principles, share
some, if not all, of the attractive mathematical and algorithmic properties of the
Rayleigh–Ritz setting. Broadly speaking, there are two different ways to approach
the construction of better projections or quasi-projections than those afforded by the
original problem. The first one improves the quasi-projection by penalization or regularization of the original variational principle. In this approach, the principal role
of the naturally occurring variational principle4 is retained, but the principle itself is
modified so as to behave more like a true inner-product projection.5 Galerkin least-squares, stabilized Galerkin, penalized Lagrangian methods, artificial diffusion, and
upwind Petrov–Galerkin methods are all examples of finite element methods resulting from this approach.
A second approach that leads to better quasi-projections, indeed to true projections, is to abandon completely the naturally occurring variational principle and
to devise a Rayleigh–Ritz-like environment by formulating an artificial, externally
defined energy-type principle. This energy principle most often takes the form of
residual minimization in a suitable Hilbert space; thus, there arises the commonly
used adjective least-squares to denote the resulting finite element methods. Residual
4 “Naturally occurring” variational principles include Galerkin principles that are based on residual
orthogonalization and may or may not have any physical interpretation.
5 Often the same effect can be achieved by posing the original variational principle on a modified
finite element space. One example is the SUPG method, which can be viewed as resulting either from the use of a modified test space or from a modified variational principle. Another example is finite
element spaces “enriched” by bubble functions. In many cases, enriched spaces lead to exactly the
same formulations as modified Galerkin principles posed on conventional finite element spaces.


minimization is as universal as residual orthogonalization so that, in principle, least-squares finite element methods have the same wide range of applicability as do
Galerkin finite element methods, i.e., they are both applicable to virtually any partial differential equation problem. However, residual minimization differs fundamentally from other variational settings, including modified and formal Galerkin
principles. Unlike the case for other settings, least-squares finite element methods
are consistently capable of recovering almost all of the advantages of the Rayleigh–
Ritz setting over a wide range of problems and, with some additional effort, they
can often create a completely analogous variational environment for finite element
methods. This is what makes least-squares finite element methods stand apart from
the rest of the methods in the finite element universe.
Mostly over the last ten to fifteen years, efforts focused on least-squares finite
element methods have achieved tremendous success in making the methods viable
alternatives to existing schemes. There is a wealth of theoretical and practical experience with the methods and they are steadily gaining a solid reputation and popularity among researchers and practitioners for providing robust, efficient, and practical
computational tools. Nevertheless, compared with established finite element techniques, such as mixed-Galerkin methods, the theory for least-squares finite element
methods remains much less unified and is not always well understood. Because such
methods are based on inner-product projections, they tend to be exceptionally robust
and stable. As a result, one is often tempted to forego analyses and proceed with
the seemingly most natural and simple least-squares formulations. This sometimes
leads to unsatisfactory results and methods that cannot fully exploit the advantages
of the least-squares approach.
This book is motivated by the premise that there exists a real need to put least-squares finite element methods on a common, mathematically sound foundation and
also to discuss important implementation issues that are critical to their success in
practice. It is intended to give both the researcher and the practitioner a guide to
the theory and practice of least-squares finite element methods, their strengths and
weaknesses, caveats to be followed, and established successes.
The factors that set least-squares finite element methods apart from other finite
element methods also call for a different approach in the algorithmic development
of these methods, if truly reliable, efficient, and accurate schemes are desired. Most
notably, reliance of a least-squares method on externally defined variational principles makes the choice of this principle the single most important step in the design of
a method. Unlike traditional finite element methods for which the variational principle is almost always dictated by the given problem, least-squares finite element
methods dictate the variational principle and then fit the problem into the principle. Thus, the flexibility afforded by the freedom to choose the variational principle
places the principal responsibility for the success of the method on the “fitting” process. There are two, often opposing, forces that affect this process. One is the desire
to keep the method as simple as possible so as to develop a scheme that is easy to
implement. The other is the need to adhere to the basic premises of a Rayleigh–Ritzlike framework, namely, to work with projections defined by inner products that are
equivalent to the natural inner products in suitable Hilbert spaces. The interaction


between these two competing forces is, in our opinion, the key to understanding
least-squares finite element methods; we make it the central theme of this book.
Indeed, we show over and over again that the choices made in reconciling ease of
implementation with adherence to the Rayleigh–Ritz framework affect all aspects of
least-squares finite element methods, from condition numbers of the algebraic problems and their efficient preconditioning to the existence of quasioptimal asymptotic
error estimates.

An overview of the book
Throughout the book, careful attention is paid to not only the rigorous analysis of
least-squares finite element methods, but also to practical issues that arise in their
implementation.
For those who wish to gain a full understanding of the mathematical analyses
connected with least-squares finite element methods, space limitations necessitate
the requirement of certain prerequisites. In particular, a basic familiarity with functional analysis in Hilbert spaces and with the theory and implementation of standard
finite element methods is assumed at the level of, e.g., [250] and [76, 123, 188], respectively. More advanced background material on functional analysis, partial differential equations, and finite element methods is contained within the book, especially in the appendices. Those who merely wish to learn about least-squares finite
element algorithms and their implementation can still use the book by focusing on
those sections that consider these topics. In this case, space limitations necessitate
familiarity with the algorithmic and implementation aspects of standard finite element methods.
The book is organized into parts, each containing several chapters. In Part I of
the book, we provide a necessarily brief review of the finite element universe. It
is not meant to give a comprehensive treatment, but rather to provide a context for
understanding where least-squares finite element methods fit into that universe. In
Chapter 1, “classical” finite element methods, i.e., those based on Rayleigh–Ritz,
mixed-Galerkin, and Galerkin variational principles, are discussed. Then, in Chapter 2, a discussion is provided about several of the attempts that have been made
to try to recover some of the advantageous features of finite element methods in the
Rayleigh–Ritz setting through the definition of modifications to the “classical” principles. Along the way, the strengths and weaknesses of the specific finite element
methods encountered are pointed out. At that point, one is ready for the discussion
provided in the latter part of Chapter 2 about why and how least-squares finite element methods are meant to improve on the three “classical” approaches and their
modifications.
In Part II, we provide the theoretical core of the book that is repeatedly referred
to in the rest of the book. Chapter 3 is devoted to an abstract theory of least-squares
finite element methods for systems of elliptic partial differential equations. The abstract framework can itself be divided into three classes of least-squares finite element methods that are characterized by discrete least-squares principles having


different derivations and properties. The first is defined by simply restricting the
minimization of a continuous least-squares functional for the partial differential
equations to a finite element space. The other two classes additionally involve a
discretization of the least-squares functional itself and differ from each other by
the type of equivalence relations that exist (or do not exist) between least-squares
functionals and the norms used to measure the size of the solution and the data
for the problem. All the possible scenarios emanating from this classification of
least-squares finite element methods are analyzed. For a specific problem, one can
usually define several different realizations of least-squares finite element methods.
Most of these can be viewed as particular applications of one or another of the different scenarios covered by the abstract framework. In addition, several
basic ingredients needed to define and analyze least-squares finite element methods
but which transcend the specific context to which those methods are applied are
also discussed. In Chapter 4, the Agmon–Douglis–Nirenberg theory for elliptic partial differential equations is exploited to provide “automated” mechanisms for using
solution and data spaces and norms for the definition of least-squares variational
principles.
The abstract frameworks developed and analyzed in Part II can be used to define
and analyze many least-squares finite element methods for a number of specific
settings. In these cases, one need only show how the setting fits into the abstract
frameworks and that the hypotheses invoked in those frameworks hold. Thus, there
is no need to repeat over and over again proofs specialized to each setting. Using this
approach, Part III of the book is devoted to the application of the abstract framework
developed in Part II to concrete elliptic partial differential equations.6 In Chapters
5 and 6, least-squares finite element methods for scalar and vector elliptic partial
differential equations are considered, respectively, and, in Chapter 7, least-squares
finite element methods for the Stokes equations are considered.
In Part IV, least-squares finite element methods in contexts that do not fit into the
abstract theory of Part II, but nevertheless rely on some aspects of that theory, are
examined. In Chapter 8, the nonlinear stationary Navier–Stokes equations are considered. In addition to the intrinsic interest engendered by the Navier–Stokes equations, this setting provides an example of how least-squares finite element methods can be used for nonlinear problems. In Chapters 9 and 10, parabolic and hyperbolic (time-dependent) partial differential equations are considered, respectively.
The application of least-squares finite element methods to optimization and control
problems for systems governed by elliptic partial differential equations is treated in
Chapter 11.
Then, in Chapter 12, other settings to which least-squares finite element methods
have been applied, but that are not discussed in Chapters 5 through 11, are briefly
considered.7 In that chapter, some variations on the least-squares finite element
6 There are a few settings considered in Part III that do not fit into the abstract frameworks of Part
II; these are treated directly on a case-by-case basis.
7 Providing a full treatment of all the topics discussed in Chapter 12 would have easily doubled
the length of the book. However, we believe these topics should be included in the book, even in a
somewhat cursory manner, not only because they are important, but because they serve to further illustrate the very significant progress that has recently been made in algorithmic, theoretical, and application aspects of least-squares finite element methods.


methods are also briefly discussed; these are methods that have a least-squares character or that use least-squares principles in different ways or for different purposes
than those discussed in the rest of the book. Included in Chapter 12 are discussions of
boundary condition treatments, LL∗ least-squares methods, mimetic reformulations
of least-squares methods, least-squares collocation methods, restricted least-squares
methods, optimization-based least-squares methods, advection–diffusion–reaction
problems, higher-order problems, div–grad–curl systems, domain decomposition
least-squares methods, multi-physics problems, problems with singular solutions, Trefftz least-squares methods, a posteriori error estimation and mesh refinement,
least-squares wavelet methods, and meshless least-squares methods.
In the four Appendices, results from functional analysis, partial differential equations, and finite element theory that are used in various places in the rest of the book
are provided. In particular, we define a consistent and unified notational system that
does not vary from setting to setting, even though much of the source material used
for the book is not consistent in this regard. This not only greatly reduces the notational definitions introduced, but, more important, facilitates making connections
between the different settings treated in the book. We also include, in addition to the
expected index, a list of often used acronyms and a glossary of often used notations.

Acknowledgements
Achi Dosanjh, our editor at Springer, has shown great patience and support throughout the process of getting this book to print. It has been a pleasure to work with her.
Much of this book and much of our own work on least-squares finite element
methods would not have come to pass without the help of our collaborators who
have worked with us on this subject, especially students and many colleagues with
whom we had pleasant and productive interactions. Thus, a great deal of thanks is
owed to Dana Bedivan, John Burkardt, Zhiqiang Cai, Yanzhao Cao, Ching-Lung
Chang, Jungmin Choi, Christopher Cox, Jennifer Deang, Raytcho Lazarov, Hyung-Chun Lee, Hugh MacMillan, Tom Manteuffel, Steve McCormick, Roy Nicolaides,
and Panayot Vassilevski. Both authors especially acknowledge George Fix who was
instrumental to their progress in the early stages of their careers and who, directly in
the case of Max Gunzburger and indirectly in the case of Pavel Bochev, introduced
them to the world of least-squares finite element methods.
This book is dedicated, by Pavel Bochev, to his wife, Biliana, for her unyielding
support, kindness and exceptional patience, and for nurturing his resolve in those
moments when the sheer magnitude of this endeavor made its success seem distant
and unattainable. A special debt of gratitude is owed by Pavel to his mother Dora
for believing in his dreams and making the many sacrifices that helped turn these
dreams into reality.
This book is also dedicated, by Max Gunzburger, to his wife, Janet Peterson, not
so much for the real support she provided throughout the long and arduous path he


took to the completion of the book, but for the constant and enduring patience she
has shown, support and help she has provided, sacrifices she has made, and love she
has given that have been instrumental to his success, not only in his professional
life, but, more important, in his personal life.
Pavel Bochev and Max Gunzburger
Albuquerque and Tallahassee
November, 2008

Contents

Part I Survey of Variational Principles and Associated Finite Element Methods

1 Classical Variational Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
   1.1 Variational Methods for Operator Equations . . . . . . . . . . . . . . . . . . 4
   1.2 A Taxonomy of Classical Variational Formulations . . . . . . . . . . . . . 8
       1.2.1 Weakly Coercive Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 8
       1.2.2 Strongly Coercive Problems . . . . . . . . . . . . . . . . . . . . . . . . . 9
       1.2.3 Mixed Variational Problems . . . . . . . . . . . . . . . . . . . . . . . . . 10
       1.2.4 Relations Between Variational Problems and Optimization Problems . . . 12
   1.3 Approximation of Solutions of Variational Problems . . . . . . . . . . . . 15
       1.3.1 Weakly and Strongly Coercive Variational Problems . . . . . . . 15
       1.3.2 Mixed Variational Problems . . . . . . . . . . . . . . . . . . . . . . . . . 18
   1.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
       1.4.1 The Poisson Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
       1.4.2 The Equations of Linear Elasticity . . . . . . . . . . . . . . . . . . . . 25
       1.4.3 The Stokes Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
       1.4.4 The Helmholtz Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
       1.4.5 A Scalar Linear Advection-Diffusion-Reaction Equation . . . . 30
       1.4.6 The Navier–Stokes Equations . . . . . . . . . . . . . . . . . . . . . . . . 30
   1.5 A Comparative Summary of Classical Finite Element Methods . . . . 31

2 Alternative Variational Formulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
   2.1 Modified Variational Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
       2.1.1 Enhanced and Stabilized Methods for Weakly Coercive Problems . . . 36
       2.1.2 Stabilized Methods for Strongly Coercive Problems . . . . . . . 46
   2.2 Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
       2.2.1 A Straightforward Least-Squares Finite Element Method . . . 51
       2.2.2 Practical Least-Squares Finite Element Methods . . . . . . . . . . 53
       2.2.3 Norm-Equivalence Versus Practicality . . . . . . . . . . . . . . . . . . 58
       2.2.4 Some Questions and Answers . . . . . . . . . . . . . . . . . . . . . . . . 60
   2.3 Putting Things in Perspective and What to Expect from the Book . . 62

Part II Abstract Theory of Least-Squares Finite Element Methods
3 Mathematical Foundations of Least-Squares Finite Element Methods . . . 69
   3.1 Least-Squares Principles for Linear Operator Equations in Hilbert Spaces . . . 70
       3.1.1 Problems with Zero Nullity . . . . . . . . . . . . . . . . . . . . . . . . . . 71
       3.1.2 Problems with Positive Nullity . . . . . . . . . . . . . . . . . . . . . . . 73
   3.2 Application to Partial Differential Equations . . . . . . . . . . . . . . . . . . 75
       3.2.1 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
       3.2.2 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . 77
   3.3 General Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . 80
       3.3.1 Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
       3.3.2 The Need for Continuous Least-Squares Principles . . . . . . . . 84
   3.4 Binding Discrete Least-Squares Principles to Partial Differential Equations . . . 85
       3.4.1 Transformations from Continuous to Discrete Least-Squares Principles . . . 86
   3.5 Taxonomy of Conforming Discrete Least-Squares Principles and their Analysis . . . 90
       3.5.1 Compliant Discrete Least-Squares Principles . . . . . . . . . . . . . 92
       3.5.2 Norm-Equivalent Discrete Least-Squares Principles . . . . . . . 94
       3.5.3 Quasi-Norm-Equivalent Discrete Least-Squares Principles . . . 96
       3.5.4 Summary Review of Discrete Least-Squares Principles . . . . . 100

4 The Agmon–Douglis–Nirenberg Setting for Least-Squares Finite Element Methods . . . 103
   4.1 Transformations to First-Order Systems . . . . . . . . . . . . . . . . . . . . . . 105
   4.2 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
       4.2.1 Homogeneous Elliptic Systems . . . . . . . . . . . . . . . . . . . . . . . 107
       4.2.2 Non-Homogeneous Elliptic Systems . . . . . . . . . . . . . . . . . . . 107
   4.3 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . . . . 108
       4.3.1 Homogeneous Elliptic Systems . . . . . . . . . . . . . . . . . . . . . . . 108
       4.3.2 Non-Homogeneous Elliptic Systems . . . . . . . . . . . . . . . . . . . 110
   4.4 Least-Squares Finite Element Methods for Homogeneous Elliptic Systems . . . 112
   4.5 Least-Squares Finite Element Methods for Non-Homogeneous Elliptic Systems . . . 114
       4.5.1 Quasi-Norm-Equivalent Discrete Least-Squares Principles . . . 114
       4.5.2 Norm-Equivalent Discrete Least-Squares Principles . . . . . . . 124
   4.6 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129


Part III Least-Squares Finite Element Methods for Elliptic Problems
5 Scalar Elliptic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
   5.1 Applications of Scalar Poisson Equations . . . . . . . . . . . . . . . . . . . . . 135
   5.2 Least-Squares Finite Element Methods for the Second-Order Poisson Equation . . . 137
       5.2.1 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . 138
       5.2.2 Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . 139
   5.3 First-Order System Reformulations . . . . . . . . . . . . . . . . . . . . . . . . . 140
       5.3.1 The Div–Grad System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
       5.3.2 The Extended Div–Grad System . . . . . . . . . . . . . . . . . . . . . . 145
       5.3.3 Application Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
   5.4 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
       5.4.1 Energy Balances in the Agmon–Douglis–Nirenberg Setting . . . 148
       5.4.2 Energy Balances in the Vector-Operator Setting . . . . . . . . . . 152
   5.5 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . . . . 159
   5.6 Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
       5.6.1 The Div–Grad System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
       5.6.2 The Extended Div–Grad System . . . . . . . . . . . . . . . . . . . . . . 169
   5.7 Error Analyses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
       5.7.1 Error Estimates in Solution Space Norms . . . . . . . . . . . . . . . 171
       5.7.2 L2(Ω) Error Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
   5.8 Connections Between Compatible Least-Squares and Standard Finite Element Methods . . . 176
       5.8.1 The Compatible Least-Squares Finite Element Method with a Reaction Term . . . 177
       5.8.2 The Compatible Least-Squares Finite Element Method Without a Reaction Term . . . 181
   5.9 Practicality Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
       5.9.1 Practical Rewards of Compatibility . . . . . . . . . . . . . . . . . . . . 184
       5.9.2 Compatible Least-Squares Finite Element Methods on Non-Affine Grids . . . 190
       5.9.3 Advantages and Disadvantages of Extended Systems . . . . . . 192
   5.10 A Summary of Conclusions and Recommendations . . . . . . . . . . . . . 194

6 Vector Elliptic Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
   6.1 Applications of Vector Elliptic Equations . . . . . . . . . . . . . . . . . . . . . 200
   6.2 Reformulation of Vector Elliptic Problems . . . . . . . . . . . . . . . . . . . . 201
       6.2.1 Div–Curl Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
       6.2.2 Curl–Curl Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
   6.3 Least-Squares Finite Element Methods for Div–Curl Systems . . . . . 206
       6.3.1 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
       6.3.2 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . 209
       6.3.3 Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . 211
       6.3.4 Analysis of Conforming Least-Squares Finite Element Methods . . . 214
       6.3.5 Analysis of Non-Conforming Least-Squares Finite Element Methods . . . 216
   6.4 Least-Squares Finite Element Methods for Curl–Curl Systems . . . . . 221
       6.4.1 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
       6.4.2 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . 224
       6.4.3 Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . 225
       6.4.4 Error Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
   6.5 Practicality Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
       6.5.1 Solution of Algebraic Equations . . . . . . . . . . . . . . . . . . . . . . 232
       6.5.2 Implementation of Non-Conforming Methods . . . . . . . . . . . . 234
   6.6 A Summary of Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

7 The Stokes Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
   7.1 First-Order System Formulations of the Stokes Equations . . . . . . . . 238
       7.1.1 The Velocity–Vorticity–Pressure System . . . . . . . . . . . . . . . . 239
       7.1.2 The Velocity–Stress–Pressure System . . . . . . . . . . . . . . . . . . 242
       7.1.3 The Velocity Gradient–Velocity–Pressure System . . . . . . . . . 243
   7.2 Energy Balances in the Agmon–Douglis–Nirenberg Setting . . . . . . . 246
       7.2.1 The Velocity–Vorticity–Pressure System . . . . . . . . . . . . . . . . 247
       7.2.2 The Velocity–Stress–Pressure System . . . . . . . . . . . . . . . . . . 250
       7.2.3 The Velocity Gradient–Velocity–Pressure System . . . . . . . . . 251
   7.3 Continuous Least-Squares Principles in the Agmon–Douglis–Nirenberg Setting . . . 253
       7.3.1 The Velocity–Vorticity–Pressure System . . . . . . . . . . . . . . . . 253
       7.3.2 The Velocity–Stress–Pressure System . . . . . . . . . . . . . . . . . . 256
       7.3.3 The Velocity Gradient–Velocity–Pressure System . . . . . . . . . 256
   7.4 Discrete Least-Squares Principles in the Agmon–Douglis–Nirenberg Setting . . . 257
       7.4.1 The Velocity–Vorticity–Pressure System . . . . . . . . . . . . . . . . 258
       7.4.2 The Velocity–Stress–Pressure System . . . . . . . . . . . . . . . . . . 260
       7.4.3 The Velocity Gradient–Velocity–Pressure System . . . . . . . . . 260
   7.5 Error Estimates in the Agmon–Douglis–Nirenberg Setting . . . . . . . . 261
       7.5.1 The Velocity–Vorticity–Pressure System . . . . . . . . . . . . . . . . 261
       7.5.2 The Velocity–Stress–Pressure System . . . . . . . . . . . . . . . . . . 263
       7.5.3 The Velocity Gradient–Velocity–Pressure System . . . . . . . . . 264
   7.6 Practicality Issues in the Agmon–Douglis–Nirenberg Setting . . . . . . 264
       7.6.1 Solution of the Discrete Equations . . . . . . . . . . . . . . . . . . . . . 265
       7.6.2 Issues Related to Non-Homogeneous Elliptic Systems . . . . . 266
       7.6.3 Mass Conservation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
       7.6.4 The Zero Mean Pressure Constraint . . . . . . . . . . . . . . . . . . . . 274
   7.7 Least-Squares Finite Element Methods in the Vector-Operator Setting . . . 277
       7.7.1 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
       7.7.2 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . 281
       7.7.3 Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . 281
       7.7.4 Stability of Discrete Least-Squares Principles . . . . . . . . . . . . 284
       7.7.5 Conservation of Mass and Strong Compatibility . . . . . . . . . . 287
       7.7.6 Error Estimates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
       7.7.7 Connection Between Discrete Least-Squares Principles and Mixed-Galerkin Methods . . . 302
       7.7.8 Practicality Issues in the Vector Operator Setting . . . . . . . . . 304
   7.8 A Summary of Conclusions and Recommendations . . . . . . . . . . . . . 306

Part IV Least-Squares Finite Element Methods for Other Settings
8 The Navier–Stokes Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
   8.1 First-Order System Formulations of the Navier–Stokes Equations . . 313
   8.2 Least-Squares Principles for the Navier–Stokes Equations . . . . . . . . 314
       8.2.1 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . 315
       8.2.2 Discrete Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . 316
   8.3 Analysis of Least-Squares Finite Element Methods . . . . . . . . . . . . . 317
       8.3.1 Quotation of Background Results . . . . . . . . . . . . . . . . . . . . . 318
       8.3.2 Compliant Discrete Least-Squares Principles for the Velocity–Vorticity–Pressure System . . . 321
       8.3.3 Norm-Equivalent Discrete Least-Squares Principles for the Velocity–Vorticity–Pressure System . . . 329
       8.3.4 Compliant Discrete Least-Squares Principles for the Velocity Gradient–Velocity–Pressure System . . . 340
       8.3.5 A Norm-Equivalent Discrete Least-Squares Principle for the Velocity Gradient–Velocity–Pressure System . . . 344
   8.4 Practicality Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
       8.4.1 Solution of the Nonlinear Equations . . . . . . . . . . . . . . . . . . . 348
       8.4.2 Implementation of Norm-Equivalent Methods . . . . . . . . . . . . 351
       8.4.3 The Utility of Discrete Negative Norm Least-Squares Finite Element Methods . . . 354
       8.4.4 Advantages and Disadvantages of Extended Systems . . . . . . 359
   8.5 A Summary of Conclusions and Recommendations . . . . . . . . . . . . . 364

9 Parabolic Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 367
   9.1 The Generalized Heat Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
       9.1.1 Backward-Euler Least-Squares Finite Element Methods . . . . 369
       9.1.2 Second-Order Time Accurate Least-Squares Finite Element Methods . . . 382
       9.1.3 Comparison of Finite-Difference Least-Squares Finite Element Methods . . . 389
       9.1.4 Space–Time Least-Squares Principles . . . . . . . . . . . . . . . . . . 391
       9.1.5 Practical Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
   9.2 The Time-Dependent Stokes Equations . . . . . . . . . . . . . . . . . . . . . . 396


10 Hyperbolic Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . 403
10.1 Model Conservation Law Problems . . . . . . . . . . . . . . . . . . . . . . . . . . 404
10.2 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
10.2.1 Energy Balances in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . 407
10.2.2 Energy Balances in Banach Spaces . . . . . . . . . . . . . . . . . . . . 409
10.3 Continuous Least-Squares Principles . . . . . . . . . . . . . . . . . . . . . . . . . 410
10.3.1 Extension to Time-Dependent Conservation Laws . . . . . . . . 412
10.4 Least-Squares Finite Element Methods in a Hilbert Space Setting . 413
10.4.1 Conforming Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
10.4.2 Non-Conforming Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 414
10.5 Residual Minimization Methods in a Banach Space Setting . . . . . . 416
10.5.1 An L1 (Ω ) Minimization Method . . . . . . . . . . . . . . . . . . . . . . 416
10.5.2 Regularized L1 (Ω ) Minimization Method . . . . . . . . . . . . . . 418
10.6 Least-Squares Finite Element Methods Based on Adaptively
Weighted L2 (Ω ) Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
10.6.1 An Iteratively Re-Weighted Least-Squares Finite
Element Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419
10.6.2 A Feedback Least-Squares Finite Element Method . . . . . . . 420
10.7 Practicality Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422
10.7.1 Approximation of Smooth Solutions . . . . . . . . . . . . . . . . . . . 422
10.7.2 Approximation of Discontinuous Solutions . . . . . . . . . . . . . 423
10.8 A Summary of Conclusions and Recommendations . . . . . . . . . . . . . 427

11 Control and Optimization Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429
11.1 Quadratic Optimization and Control Problems in Hilbert Spaces
with Linear Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 431
11.1.1 Existence of Optimal States and Controls . . . . . . . . . . . . . . . 432
11.1.2 Least-Squares Formulation of the Constraint Equation . . . . 435
11.2 Solution via Lagrange Multipliers of the Optimal Control
Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
11.2.1 Galerkin Finite Element Methods for the Optimality
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
11.2.2 Least-Squares Finite Element Methods for the Optimality
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 442
11.3 Methods Based on Direct Penalization by the Least-Squares
Functional . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447
11.3.1 Discretization of the Perturbed Optimality System . . . . . . . 450
11.3.2 Discretization of the Eliminated System . . . . . . . . . . . . . . . . 453
11.4 Methods Based on Constraining by the Least-Squares Functional . 455
11.4.1 Discretization of the Optimality System . . . . . . . . . . . . . . . . 457
11.4.2 Discretize-Then-Eliminate Approach for the Perturbed
Optimality System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
11.4.3 Eliminate-Then-Discretize Approach for the Perturbed
Optimality System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 459
11.5 Relative Merits of the Different Approaches . . . . . . . . . . . . . . . . . . . 460
11.6 Example: Optimization Problems for the Stokes Equations . . . . . . . 461


11.6.1 The Optimization Problems and Galerkin Finite Element
Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 463
11.6.2 Least-Squares Finite Element Methods for the Constraint
Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
11.6.3 Least-Squares Finite Element Methods for the Optimality
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 468
11.6.4 Constraining by the Least-Squares Functional
for the Constraint Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 471
12 Variations on Least-Squares Finite Element Methods . . . . . . . . . . . . . . 475
12.1 Weak Enforcement of Boundary Conditions . . . . . . . . . . . . . . . . . . . 475
12.2 LL* Finite Element Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
12.3 Mimetic Reformulation of Least-Squares Finite Element Methods . 483
12.4 Collocation Least-Squares Finite Element Methods . . . . . . . . . . . . . 488
12.5 Restricted Least-Squares Finite Element Methods . . . . . . . . . . . . . . 490
12.6 Optimization-Based Least-Squares Finite Element Methods . . . . . . 492
12.7 Least-Squares Finite Element Methods
for Advection–Diffusion–Reaction Problems . . . . . . . . . . . . . . . . . . 494
12.8 Least-Squares Finite Element Methods for Higher-Order
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 503
12.9 Least-Squares Finite Element Methods for Div–Grad–Curl
Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 505
12.10 Domain Decomposition Least-Squares Finite Element Methods . . . 507
12.11 Least-Squares Finite Element Methods for Multi-Physics
Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 513
12.12 Least-Squares Finite Element Methods for Problems
with Singular Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 517
12.13 Trefftz Least-Squares Finite Element Methods . . . . . . . . . . . . . . . . 521
12.14 A Posteriori Error Estimation and Adaptive Mesh Refinement . . . . 523
12.15 Least-Squares Wavelet Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 526
12.16 Meshless Least-Squares Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528

Part V Supplementary Material
A Analysis Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
A.1 General Notations and Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533
A.2 Function Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535
A.2.1 The Sobolev Spaces H^s(Ω) . . . . . . . . . . . . . . . . . . . . . . . . . 536
A.2.2 Spaces Related to the Gradient, Curl, and Divergence
Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 540
A.3 Properties of Function Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 547
A.3.1 Embeddings of C(Ω ) ∩ D(Ω ) . . . . . . . . . . . . . . . . . . . . . . . . . 547
A.3.2 Poincaré–Friedrichs Inequalities . . . . . . . . . . . . . . . . . . . . . . 548
A.3.3 Hodge Decompositions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 550
A.3.4 Trace Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551


B Compatible Finite Element Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 553
B.1 Formal Definition and Properties of Finite Element Spaces . . . . . . . 554
B.2 Finite Element Approximation of the De Rham Complex . . . . . . . . 557
B.2.1 Examples of Compatible Finite Element Spaces . . . . . . . . . 559
B.2.2 Approximation of C(Ω ) ∩ D(Ω ) . . . . . . . . . . . . . . . . . . . . . . 567
B.2.3 Exact Sequences of Finite Element Spaces . . . . . . . . . . . . . . 569
B.3 Properties of Compatible Finite Element Spaces . . . . . . . . . . . . . . . . 571
B.3.1 Discrete Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
B.3.2 Discrete Poincaré–Friedrichs Inequalities . . . . . . . . . . . . . . . 576
B.3.3 Discrete Hodge Decompositions . . . . . . . . . . . . . . . . . . . . . . 577
B.3.4 Inverse Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 580
B.4 Norm Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 581
B.4.1 Quasi-Norm-Equivalent Approximations . . . . . . . . . . . . . . . 581
B.4.2 Norm-Equivalent Approximations . . . . . . . . . . . . . . . . . . . . . 582

C Linear Operator Equations in Hilbert Spaces . . . . . . . . . . . . . . . . . . . . . 585
C.1 Auxiliary Operator Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
C.2 Energy Balances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589

D The Agmon–Douglis–Nirenberg Theory and Verifying its Assumptions . . . 593
D.1 The Agmon–Douglis–Nirenberg Theory . . . . . . . . . . . . . . . . . . . . . . 593
D.2 Verifying the Assumptions of the Agmon–Douglis–Nirenberg Theory . . . 597
D.2.1 Div–Grad Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 598
D.2.2 Div–Grad–Curl Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 602
D.2.3 Div–Curl Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
D.2.4 The Velocity–Vorticity–Pressure Formulation
of the Stokes System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
D.2.5 The Velocity–Stress–Pressure Formulation of the Stokes
System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 622

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 641
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647

Chapter 1

Classical Variational Methods

In order to provide a context for least-squares finite element methods (LSFEMs)
and a means for judging their effectiveness, we briefly review, in this chapter and
the next, other finite element approaches. In this chapter, we focus on what we refer
to as “classical” finite element methods, a designation that is admittedly somewhat
arbitrary, but that is nonetheless generally accepted for the methods we include in
the so-named class.
Finite element methods belong to the class of variational methods. As such, they
are quasi-projections1 whose properties are intrinsically connected to an underlying
variational formulation. Thus, it is instructive to establish, in an abstract setting, taxonomies of variational formulations that provide the basis for defining variational
methods, including finite element methods. We begin the chapter with a review of
variational formulations for abstract linear operator equations. We distinguish between weakly and strongly coercive problems and also subdivide the formulations
into those that are related to an optimization problem and those that are not. The
taxonomy of variational formulations induces a like taxonomy of the associated
variational methods, i.e., the quasi-projections that serve to define approximations
of solutions of the variational equations. We explain how the properties of the variational formulations affect those of the associated variational methods. We apply
our taxonomy to examine finite element methods for several canonical examples of
partial differential equation (PDE) problems. We use these examples to show the
influence that the underlying variational setting has on each discretization method
and to explain the reasons that led to interest in the use of alternative variational
formulations, including least-squares principles.
Because the aim of this chapter is to explain and fix important notions that surface
regularly throughout the book, we keep the discussion at a fairly informal level.

1 We explain what we mean by a quasi-projection in Remark 1.1.


1.1 Variational Methods for Operator Equations
Given Banach spaces X and Y , a bounded linear operator2 Q from X into Y , and an
element f ∈ Y , consider the abstract, linear operator equation
find u ∈ X   such that   Qu = f   in Y .                    (1.1)

Typically, computational solutions of (1.1) rely on a process, called discretization,
that converts this problem into a parameterized family3
Q^h \vec{u}^h = \vec{f}^h                    (1.2)

of systems of linear algebraic equations for the unknown vector \vec{u}^h.4 The main goal in the design of a discretization process is to obtain parametrized families (1.2) of discrete problems whose solution sequence {\vec{u}^h}_{h>0} converges, in some sense as h → 0, to an exact solution u ∈ X of (1.1).
For variational methods, the conversion of (1.1) into a discrete problem (1.2) begins with the reformulation of (1.1) into a suitable variational equation. An abstract
variational equation or variational formulation is defined in terms of two Hilbert
spaces U and W , a continuous bilinear form Q(·, ·) : U × W → R, and a bounded linear functional F(·) : W → R, and is given by

find u ∈ U such that Q(u, w) = F(w)      ∀ w ∈ W .     (1.3)

The space U in which we seek the solution u is referred to as the trial space and the
space W is referred to as the test space. Variational equations such as (1.3) are often
called weak formulations or generalized formulations5 of operator equations such
as (1.1).
An operator equation such as (1.1) may be reformulated into several different
variational equations of the type (1.3); the specifics of which particular variational
equation one uses, i.e., the relationships between the spaces X,Y and U,W and between the operator Q and the bilinear form Q(·, ·), usually depend on the nature of
the operator equation and practicality issues. In Section 1.4, we provide examples
of the reformulation process.
² For the class of problems of interest to us, i.e., PDEs, Q is a partial differential operator.
³ Usually, h is related to the number of degrees of freedom used in the definition of an approximate solution. In the finite element context, it is some measure of the size of the grid used to define the finite element spaces.
⁴ The connection between the components of the vector ~uh and the solution u of (1.1) depends on the specific discretization process used. For finite difference methods, the components of ~uh are associated with approximations of u on a set of grid points. For finite element methods, those components are the coefficients of an expansion of an approximate solution uh in terms of a basis for a finite-dimensional approximating subspace. For finite volume methods, these components are usually some averaged quantities related to u and associated with the edges, faces, or cells in a grid.
⁵ The origin of these terminologies is that, in the PDE setting, variational equations often have solutions that are not smooth enough to be pointwise solutions of the corresponding PDE.


On the other hand, the continuous bilinear form Q(·, ·) implies the existence of an operator⁶,⁷ Q̃ : U → W ∗ through the relation

Q(u, w) = ⟨Q̃u, w⟩W ∗ ,W      ∀ u ∈ U, w ∈ W .

Similarly, the bounded linear functional F(·) implies the existence of an element f̃ ∈ W ∗ through the relation

F(w) = ⟨ f̃ , w⟩W ∗ ,W      ∀ w ∈ W .

Then, (1.3) is equivalent to the problem

find u ∈ U such that Q̃u = f̃ in W ∗ .     (1.4)

Because the pairs of spaces X,Y and U,W ∗ are not necessarily the same, it is clear that, in general, Q̃ ≠ Q and f̃ ≠ f .⁸
One of the key reasons for the popularity of variational methods in general and
finite element methods in particular is the ease with which variational equations
such as (1.3) can be used to obtain parameterized discrete systems such as (1.2). The
discretization process is simple to describe. A discrete variational equation consists
of the same bilinear form and functional9 as in (1.3), but restricted to two families
of finite-dimensional subspaces10 U h ⊂ U and W h ⊂ W . Thus, we obtain the family
of discrete weak formulations
find uh ∈ U h such that Q(uh , wh ) = F(wh )      ∀ wh ∈ W h .     (1.5)

To connect (1.5) with the parameterized linear algebraic system (1.2), we choose bases {φ_j^h}_{j=1}^N and {ψ_i^h}_{i=1}^M for U h and W h , respectively. We then write uh = ∑_{j=1}^N ~u_j^h φ_j^h and set Q_{ij}^h = Q(φ_j^h , ψ_i^h ) and ~f_i^h = F(ψ_i^h ). Then, ~uh ∈ R^N , Qh ∈ R^{M×N} , ~f h ∈ R^M , and, for every h > 0, (1.5) is equivalent to the linear system (1.2).
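As a concrete illustration of this assembly process, consider the simplest possible setting: −u′′ = f on (0, 1) with u(0) = u(1) = 0, discretized with piecewise-linear “hat” functions on a uniform grid, so that U h = W h . A minimal sketch in Python (the function name and the one-point quadrature for F(ψ_i^h ) are illustrative choices only) is:

    import numpy as np

    def assemble_poisson_1d(n_elements, f=lambda x: 1.0):
        # Hat functions phi_j on a uniform grid of (0, 1); Q^h_ij = int phi_j' phi_i' dx.
        h = 1.0 / n_elements
        n = n_elements - 1                      # number of interior nodes / basis functions
        Q = np.zeros((n, n))
        F = np.zeros(n)
        for j in range(n):
            Q[j, j] = 2.0 / h
            if j + 1 < n:
                Q[j, j + 1] = Q[j + 1, j] = -1.0 / h
            F[j] = f((j + 1) * h) * h           # one-point quadrature for int f phi_j dx
        return Q, F

    Q, F = assemble_poisson_1d(16)
    u_coeffs = np.linalg.solve(Q, F)            # the coefficients ~u^h in the hat-function basis

For f ≡ 1 the load integrals happen to be computed exactly, and the resulting nodal values coincide (up to round-off) with those of the exact solution u(x) = x(1 − x)/2, a convenient check on the assembly.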
⁶ Throughout, W ∗ denotes the dual space of a Hilbert space W , i.e., W ∗ is the space of all bounded linear functionals on W . Then, W ∗ itself is a Hilbert space. Also, ⟨·, ·⟩W ∗ ,W denotes the duality pairing between elements of W and its dual space W ∗ . The inner product on W is denoted by (·, ·)W .
⁷ We adopt the notation that, e.g., Q̃ denotes the operator induced by the bilinear form appearing in a weak formulation of a PDE that is defined in terms of the operator Q.
⁸ For example, (1.1) could represent a PDE for u ∈ X that holds pointwise in a domain so that Q involves classical derivatives. On the other hand, Q̃ could involve generalized derivatives and then u ∈ U is viewed as a generalized solution of (1.1), i.e., one that does not satisfy (1.1) in a pointwise manner. Thus, although formally (1.1) and (1.4) may look alike, the operators Q̃ and Q have different domains and ranges.
⁹ It is possible to define discretizations based on bilinear forms and functionals that are different from those in (1.3), e.g., mesh-dependent bilinear forms arising from the use of quadrature rules to approximate integrals. However, for brevity, we do not discuss such discretizations here.
¹⁰ Whenever the inclusion U h ⊂ U holds, the resulting discrete approximations are referred to as being conforming. Nonconforming approximations for which U h ⊄ U are also used in practice. Unless explicitly noted, in this book, we confine our presentation to conforming approximations.


Remark 1.1 Because W h ⊂ W , (1.3) implies that
Q(u, wh ) = F(wh )      ∀ wh ∈ W h ⊂ W .

Together with (1.5), this relation implies that

Q(uh , wh ) = Q(u, wh )      ∀ wh ∈ W h .     (1.6)

Thus, if u ∈ U and uh ∈ U h are solutions of (1.3) and (1.5), respectively, then (1.6)
has the appearance of defining a projection uh ∈ U h ⊂ U of u ∈ U. This is what we
mean when we say that finite element methods define quasi-projections. In fact, in
general, (1.6) does not define a true projection; indeed that is the case only if the
bilinear form Q(·, ·) defines an inner product, a situation for which a necessary but
not sufficient condition is W = U. Note that (1.6) can be rewritten as
Q(u − uh , wh ) = 0      ∀ wh ∈ W h .     (1.7)

This relation is often referred to as a Galerkin orthogonality relation for the error u − uh ; of course, it defines a true orthogonality relation only if the bilinear form Q(·, ·) defines an inner product.  □
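The quasi-projection property can also be observed in a small finite-dimensional experiment (a sketch with made-up data and Euclidean coefficient spaces): restrict a symmetric positive definite system on R⁴ to a two-dimensional subspace and check (1.7) directly.

    import numpy as np

    rng = np.random.default_rng(2)
    M = rng.standard_normal((4, 4))
    Q = M @ M.T + 4.0 * np.eye(4)               # SPD, so Q(.,.) is an inner product
    f = rng.standard_normal(4)
    u = np.linalg.solve(Q, f)                   # the "exact" solution of (1.3)

    P = rng.standard_normal((4, 2))             # columns span the subspace U^h = W^h
    uh = P @ np.linalg.solve(P.T @ Q @ P, P.T @ f)   # the Galerkin solution (1.5)

    print(np.allclose(P.T @ Q @ (u - uh), 0.0)) # Galerkin orthogonality (1.7) holds

Because Q here is symmetric positive definite, this is the Rayleigh–Ritz situation of footnote 23: uh is the true Q-orthogonal projection of u onto the subspace.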
Finite element methods are distinguished from other variational methods by the
manner in which one defines the approximation spaces U h and W h . Construction
of a finite element space begins with partitioning the problem domain into finite
elements.11 A locally supported basis is then obtained by mapping polynomial
functions defined on a standard reference element to every element in the domain
partition. For further discussion of finite element spaces, see Appendix B.
An important special class of variational problems is provided by mixed variational problems. In this setting, we have two Hilbert spaces V and S, two continuous
bilinear forms

a(·, ·) : V × V → R      and      b(·, ·) : V × S → R ,     (1.8)

two bounded linear functionals D(·) : V → R and G(·) : S → R, and the mixed variational problem

find {v, p} ∈ V × S such that
    a(v, r) + b(r, p) = D(r)   ∀ r ∈ V ,
    b(v, s) = G(s)   ∀ s ∈ S .     (1.9)

¹¹ In two dimensions, triangles or quadrilaterals are used; in three dimensions, tetrahedrons, rectangular parallelepipeds, or hexahedrons are among the finite elements commonly employed. Although the explicit use of finite elements as building blocks for the approximation spaces is still the prevailing way of developing and using finite element methods, finite element methods have a much broader scope and encompass any quasi-projection scheme that uses approximation spaces with locally supported bases. This notion includes, e.g., meshless methods for which a traditional subdivision of a domain into finite elements is not explicitly used in the definition of the approximation space.


Using the duality pairings between the spaces V and S and their dual spaces V ∗ and S∗ , respectively, the continuous bilinear forms in (1.8) give rise to the bounded linear operators Ã : V → V ∗ , B̃ : V → S∗ , and B̃∗ : S → V ∗ defined by

a(v, r) = ⟨Ãv, r⟩V ∗ ,V      ∀ v, r ∈ V

and

b(v, s) = ⟨B̃v, s⟩S∗ ,S = ⟨B̃∗ s, v⟩V ∗ ,V      ∀ v ∈ V, s ∈ S ,

respectively. The bounded linear functionals D(·) and G(·) respectively give rise to the functions d̃ ∈ V ∗ and g̃ ∈ S∗ defined by

D(r) = ⟨d̃, r⟩V ∗ ,V   ∀ r ∈ V      and      G(s) = ⟨g̃, s⟩S∗ ,S   ∀ s ∈ S .

Then, (1.9) is equivalent to

Ãv + B̃∗ p = d̃   in V ∗
B̃v = g̃   in S∗ .     (1.10)

Remark 1.2 Analogous to the relation between (1.1) and (1.4), (1.10) or, equivalently, (1.9), can be viewed as a weak or generalized formulation of a system of
PDEs of the form

Av + B∗ p = d
Bv = g ,     (1.11)

where these equations are thought of as holding pointwise.  □

By setting U = W = V × S, u = {v, p}, w = {r, s}, F({r, s}) = D(r) + G(s), and

Q({v, p}, {r, s}) = a(v, r) + b(r, p) + b(v, s) ,     (1.12)

(1.9) assumes the form (1.3) and, by setting

Q̃ = ( Ã  B̃∗ )      and      f̃ = ( d̃ )
    ( B̃  0  )                   ( g̃ ) ,     (1.13)

(1.10) takes the form (1.4).
The parameterized linear system (1.2) also has a special structure in the case of
mixed variational problems. To define this system, we choose conforming subspaces
V h ⊂ V and Sh ⊂ S and restrict (1.9) to these subspaces, i.e., we solve the problem:
find {vh , ph } ∈ V h × Sh such that
    a(vh , rh ) + b(rh , ph ) = D(rh )   ∀ rh ∈ V h ,
    b(vh , sh ) = G(sh )   ∀ sh ∈ Sh .     (1.14)

If {ξ_j^h}_{j=1}^N and {η_j^h}_{j=1}^M denote bases for V h and Sh , respectively, then (1.14) is equivalent to the linear system


( Ah  (Bh )T ) ( ~vh )     ( ~dh )
( Bh    0    ) ( ~ph )  =  ( ~gh ) ,     (1.15)

where vh = ∑_{j=1}^N ~v_j^h ξ_j^h and ph = ∑_{j=1}^M ~p_j^h η_j^h and (·)T denotes the transpose matrix. In (1.15), we have that

A_{ij}^h = a(ξ_j^h , ξ_i^h )   for i, j = 1, . . . , N ,
B_{ij}^h = b(ξ_j^h , η_i^h )   for i = 1, . . . , M, j = 1, . . . , N ,
~d_i^h = D(ξ_i^h )   for i = 1, . . . , N ,   and   ~g_i^h = G(η_i^h )   for i = 1, . . . , M .

Thus, the coefficient matrix Qh in the parameterized linear system (1.2) is given by

Qh = ( Ah  (Bh )T )
     ( Bh    0    ) .
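A schematic sketch of this block structure (with small made-up matrices, purely for illustration) shows how the indefinite coefficient matrix of (1.15) is built and solved:

    import numpy as np

    Ah = np.array([[2.0, -1.0],
                   [-1.0, 2.0]])                # hypothetical N x N matrix arising from a(.,.)
    Bh = np.array([[1.0, 1.0]])                 # hypothetical M x N matrix arising from b(.,.)
    dh = np.array([1.0, 0.0])
    gh = np.array([0.0])

    M_, N_ = Bh.shape
    Qh = np.block([[Ah, Bh.T],
                   [Bh, np.zeros((M_, M_))]])   # the 2 x 2 block matrix of (1.15)
    sol = np.linalg.solve(Qh, np.concatenate([dh, gh]))
    vh, ph = sol[:N_], sol[N_:]                 # coefficient vectors ~v^h and ~p^h

Note that Qh inherits symmetry from Ah but, because of the zero diagonal block, it is never positive definite; this is the indefiniteness of mixed systems discussed in Section 1.3.2.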

1.2 A Taxonomy of Classical Variational Formulations
We now discuss the basic classes of variational formulations upon which classical
finite element methods are based and the conditions that ensure the well-posedness
of the weak problems (1.3) and (1.9). We then explore the connections between the
variational formulations of Section 1.1 and optimization problems.

1.2.1 Weakly Coercive Problems
A bilinear form Q(·, ·) : U × W → R is called continuous if

|Q(u, w)| ≤ α ∥u∥U ∥w∥W      ∀ u ∈ U, w ∈ W ,     (1.16)

where α < ∞, and is called weakly coercive if¹²

inf_{u∈U} sup_{w∈W} Q(u, w)/(∥u∥U ∥w∥W ) ≥ β     (1.17)

and

¹² We should more precisely write (1.17) and similar expressions as

inf_{u∈U, u≠0} sup_{w∈W, w≠0} Q(u, w)/(∥u∥U ∥w∥W ) ≥ β .

For the sake of economy of notation, we omit the obvious necessity to have u ≠ 0, and so on.

inf_{w∈W} sup_{u∈U} Q(u, w)/(∥u∥U ∥w∥W ) > 0 ,     (1.18)

where β > 0.13 The following theorem, often referred to as the Nečas theorem (see
[280,281]), shows that (1.16)–(1.18) are sufficient14 to guarantee that the variational
problem (1.3) is well posed.
Theorem 1.3 Let F(·) denote a bounded linear functional on W and let Q(·, ·) denote a continuous and weakly coercive bilinear form on U × W , i.e., (1.16)–(1.18)
hold. Then, the abstract variational problem (1.3) has a unique solution u ∈ U.
Moreover, that solution satisfies
∥u∥U ≤ (1/β) ∥F∥W ∗

so that it depends continuously on the data.¹⁵  □

1.2.2 Strongly Coercive Problems
If W ≠ U, the most one can expect from a bilinear form is weak coercivity. However,
if W = U, i.e., if the test space W and the trial space U are the same, a stronger notion
of coercivity can be defined. Note that if W = U, then (1.16) becomes
|Q(u; v)| ≤ α ∥u∥U ∥v∥U      ∀ u, v ∈ U .     (1.19)

¹³ It is clear why (1.17) and (1.18) are referred to as inf–sup conditions. An equivalent way to express, e.g., (1.17) is

sup_{w∈W} Q(u, w)/∥w∥W ≥ β ∥u∥U      ∀ u ∈ U .

Because of its equivalence to (1.17), we also refer to conditions of this type as inf–sup conditions.
¹⁴ The necessity of these conditions can also be demonstrated.
¹⁵ It is instructive to consider an example in finite-dimensional spaces for which U = RN , W = W ∗ = RM , Q̃ ∈ RM×N , ~f ∈ RM , Q(~u; ~w) = ~wT Q̃~u, and F(~w) = ~wT ~f . With these identifications, one can then verify that the abstract problem (1.3) is equivalent to the linear system Q̃~u = ~f . In this case, the weak coercivity conditions (1.17) and (1.18) imply that Q̃ has full row and column ranks, i.e., it is a square invertible matrix.
Note the distinction between this example, where the problem-defining operator is finite dimensional, and a parameterized linear system such as (1.2). In the former case, the given problem is a linear system with a coefficient matrix having fixed dimensions. In the case of (1.2), the dimensions of the coefficient matrix grow as the parameter h tends to zero; we have emphasized this point by attaching the superscript h to the matrix Qh in (1.2). This distinction between the coefficient matrices is important because, in the case of (1.2), it is not enough to know that the coefficient matrix is invertible for any fixed value of h; one also needs to know about the uniformity of the invertibility, e.g., the uniformity of a bound on the inverse matrix as h → 0. Of course, this last point does not apply to the example discussed in the previous paragraph.
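The finite-dimensional example of footnote 15 can also be probed numerically. In the sketch below (illustrative only, with the Euclidean norm standing in for both the U and W norms), the inf–sup constant of Q(~u; ~w) = ~wT Q̃~u is simply the smallest singular value of Q̃, so weak coercivity is precisely invertibility of the matrix:

    import numpy as np

    rng = np.random.default_rng(0)
    Qtilde = rng.standard_normal((4, 4))        # a generic square matrix

    # inf over u of sup over w of (w^T Q u)/(|u| |w|) equals the smallest singular value of Q.
    beta = np.linalg.svd(Qtilde, compute_uv=False).min()
    print("inf-sup constant:", beta, "invertible:", beta > 0)

Applied, with the appropriate norm scalings, to the matrices Qh of (1.2) on a sequence of grids, the same computation is a practical way to detect a discrete inf–sup constant that degenerates as h → 0.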


A bilinear form Q(·, ·) : U × U → R is called strongly coercive or U-elliptic if, for some β > 0,

Q(u; u) ≥ β ∥u∥U²      ∀ u ∈ U .     (1.20)
It is easy to verify that (1.20) implies the weak coercivity conditions (1.17) and
(1.18), i.e., strong coercivity implies weak coercivity; the converse is not true.
The classical Lax–Milgram lemma shows that strongly coercive problems are well
posed.16
Corollary 1.4 Let F(·) denote a bounded linear functional on U and Q(·, ·) denote
a continuous, strongly coercive bilinear form on U ×U, i.e., (1.19) and (1.20) hold.
Then, the abstract variational problem (1.3) has a unique solution. Moreover, that
solution satisfies
∥u∥U ≤ (1/β) ∥F∥U ∗

so that it depends continuously on the data.¹⁷  □

1.2.3 Mixed Variational Problems
Mixed variational problems provide an example of weakly coercive problems and
therefore they can be analyzed using Theorem 1.3.18 However, due to their special
structure, it is advantageous to state the hypotheses needed to guarantee the wellposedness of problems such as (1.9) in terms of the constituent bilinear forms and
linear functionals. To this end, we need to use the subspace
Z = {v ∈ V | b(v, s) = 0 ∀ s ∈ S} ⊂ V     (1.21)

consisting of those elements v ∈ V that belong to the null space of the operator B̃.
The following generalization of the Brezzi theorem (see [83,87,183]) establishes
hypotheses for which the mixed variational problem (1.9) is well posed.
Theorem 1.5 Given the Hilbert spaces V and S, the bounded linear functionals
D(·) and G(·) on V and S, respectively, and the bilinear forms a(·, ·) and b(·, ·) on
¹⁶ Obviously, because strong coercivity implies weak coercivity, Corollary 1.4 is simply a special case of Theorem 1.3. Here, we state it as a separate result because of its historical importance.
¹⁷ We again visit the finite-dimensional setting. Let U = W = W ∗ = RN , Q̃ ∈ RN×N , ~f ∈ RN , Q(~u; ~w) = ~wT Q̃~u, and F(~w) = ~wT ~f . A simple argument shows that the strong coercivity of Q(·, ·) is equivalent to Q̃ being a real matrix with a positive definite symmetric part. A real, positive definite matrix is always invertible. This is the algebraic version of the statement that strong coercivity implies weak coercivity. On the other hand, because a matrix does not have to be positive definite to be invertible, it is clear that strong coercivity is a sufficient but not necessary condition for guaranteeing the results of Theorem 1.3.
¹⁸ Indeed, if a(·, ·) and b(·, ·) satisfy the hypotheses of Theorem 1.5 below, then it can be shown that the compound form Q(·, ·) defined in (1.12) satisfies (1.16)–(1.18).
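The algebraic characterization in footnote 17 is easy to verify directly; the following sketch (a made-up matrix, for illustration only) checks strong coercivity by examining the smallest eigenvalue of the symmetric part:

    import numpy as np

    Qtilde = np.array([[2.0, 1.0],
                       [-1.0, 2.0]])            # nonsymmetric, yet u^T Q u = 2|u|^2 for all u
    sym_part = 0.5 * (Qtilde + Qtilde.T)
    beta = np.linalg.eigvalsh(sym_part).min()   # coercivity constant in the Euclidean norm
    print("strongly coercive:", beta > 0, "beta =", beta)

The skew-symmetric part of Q̃ contributes nothing to ~uT Q̃~u, which is why only the symmetric part enters the coercivity question.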


V × V and V × S, respectively. Assume that the bilinear forms a(·, ·) and b(·, ·) are
continuous, i.e., for some constants αa < ∞ and αb < ∞,
a(v, r) ≤ αa ∥v∥V ∥r∥V      ∀ v, r ∈ V     (1.22)

and

b(v, s) ≤ αb ∥v∥V ∥s∥S      ∀ v ∈ V, s ∈ S .     (1.23)

Assume further that a(·, ·) is weakly coercive on the null space of B̃, i.e.,

inf_{v∈Z} sup_{r∈Z} a(v, r)/(∥v∥V ∥r∥V ) ≥ βa      and      inf_{r∈Z} sup_{v∈Z} a(v, r)/(∥v∥V ∥r∥V ) > 0 ,     (1.24)

where βa > 0, and that b(·, ·) satisfies the inf–sup condition

inf_{s∈S} sup_{v∈V} b(v, s)/(∥v∥V ∥s∥S ) ≥ βb ,     (1.25)

where βb > 0. Then, the mixed variational problem (1.9) has a unique solution {v, p} ∈ V × S and that solution satisfies

∥v∥V + ∥p∥S ≤ C (∥D∥V ∗ + ∥G∥S∗ )

so that it depends continuously on the data.¹⁹  □

Null space methods
The problem (1.9) can, in principle, be “simplified” by restricting it to an affine
space on which the second equation is satisfied. For the sake of simplicity, we assume that G(·) = 0 in (1.9)20 so that now we have that b(v, s) = 0 for all s ∈ S, i.e.,
v ∈ Z. We then use, in the first equation in (1.9), test functions r ∈ Z ⊂ V so that
b(r, p) = 0. Then, (1.9) is reduced to the problem
¹⁹ Considering again the finite-dimensional setting, let V = V ∗ = RN , S = S∗ = RM , Ã ∈ RN×N , B̃ ∈ RM×N , ~v ∈ RN , ~p ∈ RM , a(~v,~r) = ~rT Ã~v, b(~v,~s) = ~sT B̃~v, D(~r) = ~rT ~d, and G(~s) = ~sT ~g. It is not difficult to show that (1.22)–(1.25) imply that the matrix B̃ has full row rank M and that the matrix Ã is invertible on the null space of B̃, i.e., that the problem Z T ÃZ~q = Z T ~d has a unique solution ~q ∈ RN−M for any ~d ∈ RN , where Z ∈ RN×(N−M) is a matrix whose columns form a basis for N(B̃), where N(·) denotes the null space. These conditions on the matrices Ã and B̃ in turn guarantee that the matrix Q̃ ∈ R(M+N)×(M+N) defined in (1.13) is invertible. Because clearly that matrix is not positive definite, we see that in this case Q̃ satisfies the weak coercivity conditions (1.17) and (1.18).
²⁰ The more general case of G(·) ≠ 0 can, in principle, be reduced to this case by subtracting from v any particular solution v̂ ∈ V satisfying the equation b(v̂, s) = G(s) for all s ∈ S.


find v ∈ Z ⊂ V such that a(v, r) = D(r)      ∀ r ∈ Z .     (1.26)

Note that (1.26) is of the form (1.3) with U = W = Z, Q(·, ·) = a(·, ·), and F(·) =
D(·).
Under the same hypotheses as those of Theorem 1.5, the problem (1.26) is a
weakly coercive variational problem; indeed, (1.22) implies that the bilinear form
a(·, ·) is continuous and (1.24) is exactly the definition of weak coercivity whenever
that form is restricted to act on Z × Z. Then, by Theorem 1.3, we have that (1.26)
has a unique solution v ∈ Z, and, because Z ⊂ V and V ∗ ⊂ Z ∗ , that solution satisfies ∥v∥V ≤ (1/βa ) ∥D∥V ∗ .
If one wishes to also determine the variable p in (1.9), one need only solve, after determining v ∈ Z from (1.26), the problem:

find p ∈ S such that b(r, p) = D(r) − a(v, r)      ∀ r ∈ Z ⊥ ⊂ V .     (1.27)

The continuity condition (1.23) and the inf–sup condition (1.25) not only guarantee that Z ⊥ is well defined, but also that the problem (1.27) has a unique solution p ∈ S and that solution satisfies ∥p∥S ≤ C ∥D∥V ∗ ; see, e.g., [183], for details.²¹
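In the finite-dimensional language of footnotes 19 and 21, the null space method amounts to a handful of matrix operations. The sketch below (made-up matrices, G(·) = 0, Euclidean norms, for illustration only) builds a basis Z for the null space of B from the singular value decomposition, solves the reduced system, and recovers the constrained variable:

    import numpy as np

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])             # symmetric positive definite
    B = np.array([[1.0, 1.0, 1.0]])             # constraint matrix with full row rank
    d = np.array([1.0, 2.0, 3.0])

    _, _, Vt = np.linalg.svd(B)
    Z = Vt[B.shape[0]:].T                        # columns span the null space of B

    q = np.linalg.solve(Z.T @ A @ Z, Z.T @ d)    # reduced problem Z^T A Z q = Z^T d
    v = Z @ q                                    # satisfies B v = 0 up to round-off

The multiplier p can then be recovered from the small system described in footnote 21.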

1.2.4 Relations Between Variational Problems and Optimization Problems
If we make further assumptions on the bilinear form Q(·, ·) of the problem (1.3) and
the bilinear form a(·, ·) of the problem (1.9), we can relate those two problems to
optimization problems.

Unconstrained optimization problems (the Rayleigh–Ritz setting)
We first consider the classical Rayleigh–Ritz setting in which strongly coercive variational problems can be associated with the minimization of energy functionals.
Proposition 1.6 Assume that the hypotheses of Corollary 1.4 hold. Assume further that the bilinear form Q(·, ·) is symmetric, i.e., that Q(u, w) = Q(w, u) for all
u, w ∈ U. Then, the problem (1.3) is equivalent22 to the unconstrained optimization
problem
²¹ With G(·) = 0, in the finite-dimensional setting we have been following, we have that (1.26) has the form Z T ÃZ~q = Z T ~d for ~q ∈ RN−M . The hypotheses of Theorem 1.5 imply that the matrix Z T ÃZ ∈ R(N−M)×(N−M) is invertible. Then, after solving for ~q, ~v ∈ RN satisfying (1.9) is found by setting ~v = Z~q. If the matrix Z⊥ ∈ RN×M has columns that form a basis for Z ⊥ , then the problem (1.27) for ~p ∈ RM takes the form Z⊥T B̃T ~p = Z⊥T ~d − Z⊥T Ã~v; the hypotheses of Theorem 1.5 imply that the matrix Z⊥T B̃T ∈ RM×M is invertible.
²² Equivalence of two problems means that solutions of one problem solve the other and conversely.

find u ∈ U that minimizes J(u; F) = (1/2) Q(u; u) − F(u) .     (1.28)  □

In fact, (1.3) is the first-order necessary condition that solutions u ∈ U of the unconstrained minimization problem (1.28) are required to satisfy. In the Rayleigh-Ritz setting, the operator Q̃ associated with the bilinear form Q(·, ·) is symmetric and positive definite. Note also that, in this setting, the discrete problem (1.5) is equivalent to the discrete optimization problem
find uh ∈ U h that minimizes J(uh ; F) = (1/2) Q(uh ; uh ) − F(uh ) .

It is important to note that in the Rayleigh-Ritz setting, the bilinear form Q(·, ·)
defines an equivalent inner product23 on the Hilbert space U and the functional
J(·; 0) defines an equivalent norm24 on U. These observations follow from the symmetry of Q(·, ·) and the relation
β ∥u∥U² ≤ 2 J(u; 0) = Q(u; u) ≤ α ∥u∥U²
which in turn follows easily from (1.19) and (1.20).
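A quick numerical sanity check of this equivalence (a sketch with a randomly generated symmetric positive definite matrix standing in for the strongly coercive symmetric form) confirms that the solution of the linear system minimizes the energy functional and that perturbing it can only increase the energy:

    import numpy as np

    rng = np.random.default_rng(1)
    M = rng.standard_normal((5, 5))
    Q = M @ M.T + 5.0 * np.eye(5)               # symmetric positive definite
    f = rng.standard_normal(5)

    J = lambda u: 0.5 * u @ Q @ u - f @ u       # the energy functional J(u; F)
    u_star = np.linalg.solve(Q, f)              # solution of the variational problem

    for _ in range(3):
        w = rng.standard_normal(5)
        assert J(u_star + 0.1 * w) >= J(u_star) # perturbations raise the energy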

Constrained optimization problems
We now consider the relation between mixed variational principles and constrained
optimization problems.
Proposition 1.7 Assume that the hypotheses of Theorem 1.5 hold. Assume further
that the bilinear form a(·, ·) is symmetric, nonnegative, and strongly coercive on the
null space of B̃, i.e., that
a(v, r) = a(r, v)   and   a(v, v) ≥ 0   for all v, r ∈ V     (1.29)

and

a(v, v) ≥ βa ∥v∥V²   for all v ∈ Z ,     (1.30)

where βa > 0. Then, the problem (1.9) is equivalent to the constrained optimization
problem
find v ∈ V that minimizes J(v; D) = (1/2) a(v, v) − D(v)
subject to b(v, s) = G(s)   ∀ s ∈ S .     (1.31)  □

²³ Thus, in the Rayleigh-Ritz setting, (1.6) implies that the approximate solution uh ∈ U h is a true projection of u ∈ U with respect to the inner product Q(·, ·). Similarly, in that case, (1.7) implies that the error u − uh is truly orthogonal to all elements of the approximating space W h = U h , again with respect to the inner product Q(·, ·). In the more general setting of Section 1.1, (1.6) does not define a true projection and (1.7) does not define a true orthogonality relation.
²⁴ The norm induced by the inner product Q(·, ·) is often referred to as the energy norm.


In fact, if we enforce the constraint in (1.31) by introducing the Lagrange multiplier
p ∈ S and the Lagrangian functional
L(v, p; D, G) = J(v; D) − b(v, p) + G(p) ,
then the equations in (1.9) are the first-order necessary conditions that saddle points
{v, p} ∈ V × S of L(·, ·; ·, ·) are required to satisfy. In the setting of Proposition 1.7,
the operator Ã associated with the bilinear form a(·, ·) is symmetric and positive semi-definite; it is also positive definite with respect to the null space of the operator B̃ associated with the bilinear form b(·, ·). However, even in this case, the composite operator Q̃ defined in (1.13), although symmetric, is indefinite so that the associated
bilinear form Q(·, ·) remains weakly coercive. This, of course, is the nature of saddle
point problems. Note also that in the setting of constrained optimization problems,
the discrete problem (1.14) is equivalent to the discrete constrained optimization
problem
find vh ∈ V h that minimizes J(vh ; D) = (1/2) a(vh , vh ) − D(vh )
subject to b(vh , sh ) = G(sh )   ∀ sh ∈ Sh .

A second approach for handling the constraint in the problem (1.31) is to restrict
the minimization problem to an affine space on which the constraint is satisfied,
i.e., to use the null space method described in Section 1.2.3. With G(·) = 0, the hypotheses of Proposition 1.7 imply that (1.26) are the first-order necessary conditions for v ∈ Z to solve the reduced, unconstrained optimization problem

find v ∈ Z that minimizes J(v; D) = (1/2) a(v, v) − D(v)     (1.32)

that is equivalent to the constrained optimization problem (1.31). Under the same hypotheses as those of Proposition 1.7, the problem (1.26) is now a strongly coercive variational problem; this should be contrasted to the situation for Lagrange multiplier methods for which the same hypotheses lead to the weakly coercive variational problem involving the bilinear form (1.12). The unconstrained optimization problem (1.32) is of the Rayleigh-Ritz form (1.28) and therefore the former may be discretized in the same manner as the latter. One chooses a conforming subspace Z̃h ⊂ Z and then restricts the problem (1.32) to that subspace, i.e., one solves the discrete unconstrained optimization problem

find vh ∈ Z̃h ⊂ Z ⊂ V that minimizes J(vh ; D) = (1/2) a(vh , vh ) − D(vh ) .
The first-order necessary conditions corresponding to this problem are given by
(1.45).


1.3 Approximation of Solutions of Variational Problems
Problems such as (1.3) are generally referred to as Galerkin25 formulations of the
operator equation (1.1); finite element methods based on (1.5) or, equivalently, (1.2),
are referred to as Galerkin finite element methods. As seen in Sections 1.2.1 and
1.2.2, Galerkin formulations can involve either a strongly or weakly coercive bilinear form Q(·, ·). The Rayleigh–Ritz setting refers to the special case for which this
bilinear form is not only strongly coercive, but is also symmetric. Mixed-Galerkin
formulations of the operator equation (1.11) refer to the system (1.9); finite element
methods based on (1.14) or, equivalently, (1.15), are therefore known as mixed-Galerkin finite element methods or, more simply, as mixed finite element methods.
There is less need to distinguish between problems of the form (1.9) that are related
to a constrained optimization problem and those that are not because in both cases
one is led to weakly coercive problems.
Recall that discretized variational problems are restrictions of variational problems to finite-dimensional subspaces so that their well-posedness is subject to similar conditions as were imposed for the parent problems. In this section, we apply
this observation to state sufficient stability conditions, first for discrete problems
spawned from weakly and strongly coercive variational problems, and then for those
spawned from mixed variational problems. It is important to note that, in some cases,
a condition imposed on the discretized problem is automatically inherited from the
corresponding condition imposed on the parent problem and, in other cases, it is not.

1.3.1 Weakly and Strongly Coercive Variational Problems
For the discretization of weakly coercive problems, we have the following theorem
that is sometimes referred to as Babuška’s theorem [19, 20].
Theorem 1.8 Let the spaces U and W , the bilinear form Q(·, ·), and the functional
F(·) satisfy the hypotheses of Theorem 1.3. Let U h ⊂ U and W h ⊂ W be closed
subspaces.26 Then, Q(·, ·) is continuous on U h ×W h . Assume also that the bilinear
form Q(·, ·) satisfies the discrete inf–sup conditions
inf_{uh ∈U h} sup_{wh ∈W h} Q(uh , wh )/(∥uh ∥U ∥wh ∥W ) ≥ β h     (1.33)

and

inf_{wh ∈W h} sup_{uh ∈U h} Q(uh , wh )/(∥uh ∥U ∥wh ∥W ) > 0 ,     (1.34)

²⁵ Strictly speaking, the term “Galerkin formulation” refers to variational problems for which the test and the trial spaces coincide. In the general case when U ≠ W , one speaks of “Petrov–Galerkin variational formulations.” For brevity, we use the former terminology in all cases.
²⁶ Note that it is not required for the subspaces to be finite dimensional.


where β h ≥ β̂ > 0. Then, for every h > 0, the discretized problem (1.5) has a unique solution uh ∈ U h . Moreover, that solution satisfies the stability estimate

∥uh ∥U ≤ (1/β̂) ∥F∥W ∗

and, if u ∈ U denotes the unique solution of (1.3), the optimal error estimate²⁷,²⁸

∥u − uh ∥U ≤ (1 + α/β̂) inf_{vh ∈U h} ∥u − vh ∥U .     (1.35)  □
In Section 1.1 it is seen that the discretized problem (1.5) is equivalent to a parameterized family of linear systems (1.2). From this theorem, it follows that for
every h > 0, the matrix Qh in (1.2) is square and nonsingular. This is basically all
that can be inferred about the discretized equations that result from weakly coercive
variational formulations.
For the discretization of strongly coercive problems, we have the following result
that is sometimes referred to as Céa’s lemma [82, 108, 123].
Theorem 1.9 Let the space U, the bilinear form Q(·, ·), and the linear functional
F(·) satisfy the hypotheses of Corollary 1.4. Let U h ⊂ U be a closed subspace. Then,
Q(·, ·) is continuous on U h ×U h and satisfies
Q(uh , uh ) ≥ β ∥uh ∥U²      ∀ uh ∈ U h     (1.36)

and, for every h > 0, the discretized problem (1.5) has a unique solution uh ∈ U h .
Moreover, that solution satisfies the stability estimate
∥uh ∥U ≤ (1/β) ∥F∥U ∗

and, if u ∈ U denotes the unique solution of (1.3), the optimal error estimate

∥u − uh ∥U ≤ (α/β) inf_{wh ∈U h} ∥u − wh ∥U .     (1.37)  □

Remark 1.10 Remarkably, if, in addition, the bilinear form Q(·, ·) is symmetric,
e.g., in the Rayleigh-Ritz setting of unconstrained optimization problems, the error
estimate (1.37) can be improved to

²⁷ An optimal error estimate is one for which the error in a finite-dimensional approximation is bounded from above by a constant multiple of the error of the best approximation out of the same finite-dimensional subspace. An approximate solution that satisfies an optimal error estimate is referred to as being optimally accurate.
²⁸ It is important to keep in mind that optimality of an error estimate of a numerical method is not sufficient to guarantee convergence of the numerical solution uh to the exact solution u. The latter requires an additional approximability property of the discrete spaces. See Appendix B for further details.

∥u − uh ∥U ≤ √(α/β) inf_{vh ∈U h} ∥u − vh ∥U ;     (1.38)

see, e.g., [123]. That this is indeed an improvement over (1.37) follows from the fact that necessarily α ≥ β .  □
This theorem also asserts that, for every h > 0, the matrix Qh in (1.2) is square
and nonsingular. However, the discrete coercivity (1.36) implies that, in addition
to these properties, strongly coercive variational formulations give rise to positive
definite linear systems. Furthermore, in the Rayleigh–Ritz setting, these systems are
also symmetric.
A cursory inspection of Theorems 1.8 and 1.9 points out an important difference
between the conditions that guarantee the stability of discretizations of weakly and
strongly coercive variational problems. In both cases, the continuity property of
the bilinear form Q(·, ·) with respect to conforming discrete subspaces is inherited
from the continuity property that holds on the parent spaces. Likewise, the strong
coercivity property (1.36) with respect to U h ⊂ U automatically follows from the
strong coercivity property (1.20) with respect to U. Thus,
for strongly coercive variational problems, the conformity of the approximating subspace U h ⊂ U is all that is needed to guarantee the stability of the
discrete variational problem and the optimal accuracy of the approximate
solution, i.e., conformity of the spaces is necessary and sufficient for their
compatibility.
However, this is not true for weakly coercive variational problems;
the inclusions U h ⊂ U and W h ⊂ W are not by themselves sufficient to guarantee that the discrete weak coercivity conditions (1.33) and (1.34) hold, even
if the conditions (1.17) and (1.18) are known to hold, i.e., conformity of the
spaces is necessary but not sufficient for their compatibility.
Thus, in Theorem 1.8, the discrete weak coercivity conditions must be stated as
additional hypotheses on the bilinear form Q(·, ·) and the discrete subspaces U h
and W h .
Let us examine these issues in more detail. Assume first that Q(·, ·) is continuous
on U × W . Let uh and wh denote arbitrary elements belonging to the conforming
discrete subspaces U h ⊂ U and W h ⊂ W , respectively. Obviously, uh ∈ U and wh ∈
W as well and (1.16) holds for uh and wh considered as elements of the parent
spaces. Because uh and wh are arbitrary, we conclude that (1.16) holds on U h ×W h .
The same argument can be used to show that if Q(·, ·) is strongly coercive on U ×U,
then (1.36) holds on any conforming subspace U h ⊂ U.
To understand why the weak coercivity properties with respect to the discrete
spaces are not inherited from the analogous property on the parent spaces, note that
an equivalent way to express (1.17) is that, given u ∈ U, there exists wu ∈ W such
that
Q(u, wu ) ≥ β ∥u∥U ∥wu ∥W .     (1.39)


Thus, given uh ∈ U h ⊂ U, we obviously have that uh ∈ U and so (1.39) implies that
there exists a wuh ∈ W such that
Q(uh , wuh ) ≥ β ∥uh ∥U ∥wuh ∥W .
However, the function wuh is not guaranteed to belong to the discrete subspace
W h . Unfortunately, it is this last property that is required for (1.33) to hold; see
Figure 1.1.

Fig. 1.1 The strong coercivity property is inherited on the subspace U h (left). The weak coercivity
property is not inherited by the subspaces U h and W h (right) because the function wuh whose
existence follows from (1.17) is not guaranteed to belong to W h .

1.3.2 Mixed Variational Problems
In the case of mixed variational problems, the well-posedness of (1.14) is subject
to the conditions (1.24) and (1.25) but restricted to the subspaces V h ⊂ V and Sh ⊂
S. As is the case in Section 1.3.1 for weakly coercive variational problems, the
conformity of the subspaces V h and Sh is not sufficient for the discrete versions of
(1.24) and (1.25) to follow directly from the corresponding conditions on the parent
spaces. For this reason, the discrete versions of these conditions have to be stated as
assumptions in the following generalization of a result found in, e.g., [83, 87, 183].
Theorem 1.11 Let the spaces V and S, the linear functionals D(·) and G(·), and the
bilinear forms a(·, ·) and b(·, ·) satisfy the hypotheses of Theorem 1.5. Let V h ⊂ V
and Sh ⊂ S denote conforming subspaces. Then, a(·, ·) and b(·, ·) are continuous on
V h ×V h and V h × Sh , respectively. Define Z h ⊂ V h by
Z h = {vh ∈ V h | b(vh , sh ) = 0 ∀ sh ∈ Sh }     (1.40)

and assume that a(·, ·) is weakly coercive on Z h , i.e.,

inf_{vh ∈Z h} sup_{rh ∈Z h} a(vh , rh )/(∥vh ∥V ∥rh ∥V ) ≥ βah      and      inf_{rh ∈Z h} sup_{vh ∈Z h} a(vh , rh )/(∥vh ∥V ∥rh ∥V ) > 0 ,     (1.41)

where βah ≥ β̂a > 0. Assume that b(·, ·) verifies the LBB²⁹ or discrete inf–sup condition

inf_{sh ∈Sh} sup_{vh ∈V h} b(vh , sh )/(∥vh ∥V ∥sh ∥S ) ≥ βbh ,     (1.42)

where βbh ≥ β̂b > 0. Then, for every h > 0, the discrete problem (1.14) has a unique solution {vh , ph } ∈ V h × Sh . Moreover, for some C > 0 whose value is independent of h, that solution satisfies the stability estimate

∥vh ∥V + ∥ph ∥S ≤ C (∥D∥V ∗ + ∥G∥S∗ )

and, if {v, p} ∈ V × S denotes the solution of (1.9), the optimal error estimate

∥v − vh ∥V + ∥p − ph ∥S ≤ C ( inf_{rh ∈V h} ∥v − rh ∥V + inf_{sh ∈Sh} ∥p − sh ∥S ) .     (1.43)  □

The mixed discrete problem (1.14) is equivalent to the parameterized family of
linear systems (1.15). Similarly to Theorems 1.8 and 1.9, Theorem 1.11 also asserts
that, for every h > 0, the 2 × 2 block matrix in this system is square and invertible.
If, in addition, the parent mixed variational formulation is related to a constrained
minimization problem, this matrix is also symmetric. However, in contrast with the
strongly coercive setting, the matrix in (1.15) is always indefinite.
It is instructive to examine (1.42) more closely. Just as is the case for (1.17)–
(1.18) and (1.33)–(1.34), the conformity of the subspaces V h ⊂ V and Sh ⊂ S is
not sufficient to guarantee that the inf–sup condition (1.25) automatically implies
that the discrete inf–sup condition (1.42) holds; thus, the latter must be explicitly
assumed in Theorem 1.11; see Figure 1.2.
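In the purely algebraic caricature in which the V and S norms are replaced by Euclidean norms, the discrete inf–sup constant of (1.42) is the smallest singular value of the matrix Bh from (1.15). The sketch below (a hypothetical matrix, for illustration only) computes it; for a stable family of pairs {V h , Sh }, the corresponding quantity measured in the correct norms must remain bounded away from zero as h → 0:

    import numpy as np

    def infsup_constant(Bh):
        # inf over s of sup over v of (s^T B v)/(|v| |s|) = smallest singular value of B (M <= N).
        return np.linalg.svd(Bh, compute_uv=False).min()

    Bh = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 1.0]])            # hypothetical M x N constraint matrix
    print("discrete inf-sup constant:", infsup_constant(Bh))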
Clearly, the weak coercivity conditions in (1.24) for the bilinear form a(·, ·) with
respect to the space Z do not immediately imply the like conditions in (1.41) with
respect to the space Z h . But there is more to this particular issue. Suppose we replace
the weak coercivity assumptions in (1.24) with the strong coercivity assumption
(1.30) and then replace, in Theorem 1.11, the discrete weak coercivity assumptions
in (1.41) by the discrete strong coercivity assumption
a(vh , vh ) ≥ βah ∥vh ∥V²      ∀ vh ∈ Z h ,     (1.44)

²⁹ The terminology “LBB” originates from the facts that this condition was first explicitly discussed in the finite element setting for saddle point problems by Brezzi [83] and that it is a special case of the general weak-coercivity condition (1.17) first discussed for finite element methods by Babuška [19] and that, in the continuous setting of the Stokes equation, this condition was first proved to hold by Ladyzhenskaya; see [253].


Fig. 1.2 Conditions for the well-posedness of discrete saddle-point problems. Coercivity on the null space (left) and the discrete inf–sup condition (right) must be imposed independently on the discrete spaces because 1) in general Z h ⊄ Z and 2) the function wph whose existence follows from the discrete inf–sup condition is not guaranteed to belong to V h .

where βah ≥ β̂a > 0. Strong coercivity is inherited on subspaces; thus, the inclusion of (1.44) as a hypothesis for Theorem 1.11 might seem redundant in light of (1.30). However, although the latter condition does imply that the bilinear form a(·, ·) is coercive on any subspace of Z, it is generally the case that Z h ⊄ Z. As a result, even
for the case for which the bilinear form a(·, ·) is strongly coercive on the null space
Z, e.g., for constrained optimization problems, (1.44) must also be imposed as a
hypothesis on the discrete space Z h in Theorem 1.11; see Figure 1.2.

Null space methods
We are now in a position to discuss two approaches for using null space methods
to “simplify” the discretization of (1.9). One can first reduce to the null space, i.e.,
derive the reduced problem (1.26), then choose a subspace Z̃h ⊂ Z, and then restrict
(1.26) to that subspace to obtain the discrete problem
find vh ∈ Z̃h ⊂ Z ⊂ V such that a(vh , rh ) = D(rh )      ∀ rh ∈ Z̃h .     (1.45)

Alternately, one can discretize (1.9) using the subspaces V h ⊂ V and Sh ⊂ S to obtain
(1.14) and then reduce the latter (discrete) problem using the discrete null space
Z h ⊂ V h defined in (1.40). The discrete problems resulting from the two approaches
are not the same. For one thing, we cannot choose Z̃h = Z h because we want Z̃h ⊂ Z but, in general, Z h ⊄ Z. For another, in (1.45), one directly chooses a subspace Z̃h of the null space Z without any need to introduce the intermediate spaces V h and Sh .
The distinction between the reduce-then-discretize and discretize-then-reduce approaches becomes even more evident when we consider the corresponding parameterized linear systems. If {ξ̃_j^h}_{j=1}^{Ñz} denotes a basis for Z̃h ⊂ Z, then the solution vh ∈ Z̃h of (1.45) can be expressed as vh = ∑_{j=1}^{Ñz} ~z_j^h ξ̃_j^h and then (1.45) is equivalent to the linear system

Ãh ~zh = d̃h ,     (1.46)

where ~zh ∈ R^{Ñz} and where Ãh ∈ R^{Ñz ×Ñz} and d̃h ∈ R^{Ñz} are respectively given by

Ã_{ij}^h = a(ξ̃_j^h ; ξ̃_i^h )      and      d̃_i^h = D(ξ̃_i^h )      for i, j = 1, . . . , Ñz .

Now, let Zh denote a matrix whose columns form a basis for N(Bh ). For the
sake of simplicity, we assume that (1.42) is satisfied so that Bh has full row rank,
dim(Z h ) = N − M, and Zh ∈ RN×(N−M) . Then, from (1.15), we easily obtain that
(Zh )T Ah ~vh = (Zh )T ~dh .     (1.47)

Furthermore, the vector of coefficients ~vh ∈ RN (with respect to a chosen basis for
V h ) corresponding to any function vh ∈ Z h can be expressed in the form ~vh = Zh~qh
for some ~qh ∈ RN−M . Then, it follows from (1.47) that
Âh ~qh = d̂h ,     (1.48)

where

Âh = (Zh )T Ah Zh      and      d̂h = (Zh )T ~dh .

Clearly, the systems (1.46) and (1.48) are not the same.
It would seem that, in the case for which the bilinear form a(·, ·) is strongly coercive on Z, finding approximate solutions of constrained optimization problems by
solving the reduced problem (1.45) is preferable to solving (1.14). The application
of the Lagrange multiplier rule results in the discrete mixed-Galerkin formulation
(1.14) that is necessarily weakly coercive.³⁰ However, because, in general, the discrete null space Z h ⊄ Z, it is clear that it cannot be used in (1.45). Thus, Z̃h must be constructed separately, even if a stable pair {V h , Sh } for (1.14) is already available. On the other hand, if (1.30) holds and if one has available a subspace Z̃h ⊂ Z, we have that the bilinear form a(·, ·) is strongly coercive with respect to Z̃h so that the problem (1.45) has a unique solution, that solution depends continuously on the data, and satisfies the error estimate

∥v − vh ∥V ≤ C inf_{rh ∈Z̃h} ∥v − rh ∥V ,

where v denotes the solution of (1.26). Thus, the reduced formulation has the advantage of offering a strongly coercive setting for discrete variational methods for
the constrained optimization problem (1.31). Unfortunately, in practice, it is often
not an easy matter to construct subspaces of the space Z defined by (1.21).

³⁰ An additional advantage of (1.45) over (1.14) is that the former does not involve ph , i.e., it involves fewer degrees of freedom.


1.4 Examples
We now provide examples of variational problems that form the basis for classical
finite element methods. We begin with the classification of these problems according
to the taxonomy given in Section 1.2, and then infer the properties of the associated
finite element methods from the discussions given in Section 1.3.
We first consider some PDEs related to optimization problems, starting from
the optimization problems themselves and working our way to the equivalent PDE
formulations. The optimization problems are turned into weak formulations of the
related PDEs by the application of standard techniques from the calculus of variations.
For PDEs such as conservation laws and advection-diffusion-reaction problems
that are not related to optimization problems, we cannot use the calculus of variations to obtain variational formulations. Instead, variational methods can be based
on Galerkin principles that are derived directly from the PDEs. The paradigm of a
Galerkin principle is formal residual “orthogonalization.” This process can, in principle, be applied to any PDE, even if there is no underlying optimization problem.
Because of their universality, Galerkin principles have provided a natural and popular choice for extending finite element methods beyond PDE problems associated
with optimization problems. On the other hand, if such an optimization problem
exists, the associated optimality system can be recovered through the use of the
Galerkin paradigm.
In Sections 1.4.1 and 1.4.2 we respectively consider the Poisson equation and the
equations of linear elasticity; both of these problems can be associated with unconstrained optimization problems. In Section 1.4.3, we consider the Stokes equations
that can be associated with a constrained optimization problem. In Section 1.4.4 we
consider the Helmholtz equation and in Section 1.4.5 a linear advection-diffusion-reaction equation; neither of these problems can be related to an optimization
problem. Finally, in Section 1.4.6, we consider the nonlinear example of the Navier–
Stokes equations which also cannot be associated with an optimization problem.

1.4.1 The Poisson Equation
We consider the Dirichlet and Kelvin principles which give rise to two different
variational formulations for the Poisson equation. The Dirichlet and Kelvin principles provide examples of unconstrained and constrained, respectively, optimization
problems; the former leads to a symmetric, strongly coercive, Rayleigh–Ritz setting for the Poisson problem and the latter leads to a symmetric, weakly coercive,
mixed-Galerkin formulation for the same problem.
We assume that the domain Ω ⊂ Rd has boundary ∂ Ω which is made up of two
disjoint, nonempty parts Γ and Γ ∗ .


The Dirichlet principle
Given³¹ f ∈ (HΓ1 (Ω ))∗ , consider the functional

JD (φ ; f ) = (1/2) ∫Ω |∇φ |² dΩ − ∫Ω f φ dΩ     (1.49)

and the Dirichlet principle

min_{φ ∈ HΓ1 (Ω )} JD (φ ; f ) .     (1.50)

Clearly, (1.50) defines an unconstrained optimization setting.
The Euler–Lagrange condition corresponding to this principle is given by

∫Ω ∇φ · ∇ψ dΩ = ∫Ω f ψ dΩ      ∀ ψ ∈ HΓ1 (Ω ) .     (1.51)
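For completeness, we note that (1.51) is just the standard first-variation computation from the calculus of variations (a routine step spelled out here, not an addition to the theory): setting the derivative of JD along an arbitrary direction ψ to zero gives

0 = (d/dε) JD (φ + εψ ; f ) |ε=0 = ∫Ω ∇φ · ∇ψ dΩ − ∫Ω f ψ dΩ      ∀ ψ ∈ HΓ1 (Ω ) ,

which is precisely (1.51).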

Letting U = W = HΓ1 (Ω ) and
Q(φ , ψ) = ∫Ω ∇φ · ∇ψ dΩ      and      F(ψ) = ∫Ω f ψ dΩ      ∀ φ , ψ ∈ HΓ1 (Ω ) ,

we see that (1.51) is of the form (1.3). Furthermore, it is well known [82, 123] that,
in this case, the bilinear form Q(·, ·) is symmetric and strongly coercive on HΓ1 (Ω )×
HΓ1 (Ω ) so that we find ourselves in the Rayleigh–Ritz unconstrained optimization
setting discussed in Section 1.2.4. Furthermore, the error estimate (1.38) applies to
solutions of the finite element discrete problem (1.5) corresponding to (1.51) for any
conforming finite element approximation space U h ⊂ HΓ1 (Ω ).
Equivalent PDE. Formally, by integration by parts, (1.51) is equivalent to the Poisson problem

−∆ φ = f in Ω ,      φ = 0 on Γ ,      and      ∂φ/∂n = 0 on Γ ∗ .     (1.52)
Thus, (1.51) provides a symmetric, strongly coercive weak formulation for the Poisson problem (1.52).

The Kelvin principle
Consider the functional

JK (v) = (1/2) ∫Ω |v|² dΩ

³¹ HΓ1 (Ω ) is the space {ψ ∈ H 1 (Ω ) | ψ = 0 on Γ }. When Γ = ∂ Ω this space is denoted