Which mathematical definitions should be formalised in Lean?

Question

The question.

Which mathematical objects would you like to see formally defined in the Lean Theorem Prover?

Examples.

In the current stable version of the Lean Theorem Prover, topological groups have been done, schemes have been done, Noetherian rings got done last month, Noetherian schemes have not yet been done (but are probably not going to be too difficult, if anyone is interested in trying), but complex manifolds have not yet been done. In fact I think we are nearer to perfectoid spaces than complex manifolds -- maybe because algebra is closer to the axioms than analysis. But actually we also have Lebesgue measure (it's differentiability we're not too strong at), and today we got modular forms. There is a sort of an indication of where we are.

What else should we be doing? What should we work on next?

Some background.

The Lean theorem prover is a computer program which can check mathematical proofs which are written in a sufficiently formal mathematical language. You can read my personal thoughts on why I believe this sort of thing is timely and important for the pure mathematics community. Other formal proof verification software exists (Coq, Isabelle, Mizar...). I am very ignorant when it comes to other theorem provers and feel like I would like to see a comparison of where they all are.

Over the last year I have become increasingly interested in Lean's mathematics library, because it contains a bunch of what I as a number theorist regard as "normal mathematics". No issues with constructivism, the axiom of choice, quotients by equivalence relations, the law of the excluded middle or anything. My impression that most mathematicians are not particularly knowledgeable about what can actually be done now with computer proof checkers, and perhaps many have no interest. These paragraphs are an attempt to give an update to the community.

Let's start by getting one thing straight -- formalising deep mathematical proofs is extremely hard. For example, it would cost tens of millions of dollars at least, i.e. many many person-years, to formalize and maintain (a proof is a computer program, and computer programs needs maintaining!) a complete proof of Fermat's Last Theorem in a theorem prover. It would certainly be theoretically possible, but it is not currently clear to me whether any funding bodies are interested in that sort of project.

But formalising deep mathematical objects is really possible nowadays. I formalised the definition of a scheme earlier this year. But here's the funny thing. 15 months ago I had never heard of the Lean Theorem Prover, and I had never used anything like a theorem prover in my life. Then in July 2017 I watched a live stream (thank you Newton Institute!) of Tom Hales' talk in Cambridge, and in particular I saw his answer to Tobias Nipkow's question 48 minutes in. And here we are now, just over a year later, with me half way through perfectoid spaces, integrating Lean into my first year undergraduate teaching, and two of my starting second year Imperial College undergraduate students, Chris Hughes and Kenny Lau, both much better than me at it. The links are to their first year undergraduate projects, one a complete formal proof of Sylow's theorems and the other an almost finished formalization of the local Langlands conjectures for abelian algebraic groups over a p-adic field. It's all open source, we are writing the new Bourbaki in our spare time and I cannot see it stopping. I know many people don't care about Bourbaki, and I know it's not a perfect analogy, but I do care about Bourbaki. I want to know which chapters should get written next, because writing them is something I find really good fun.

But why write Bourbaki in a computer language? Well whether you care or not, I think it's going to happen. Because it's there. Whether it happens in Lean or one of the other systems -- time will tell. Tom Hales' formal abstracts project plans to formalise the statements of new theorems (in Lean) as they come out -- look at his blog to read more about his project. But to formalise the statements of hard theorems you have to formalise the definitions first. Mathematics is built on rigorous definitions. Computers are now capable of understanding many more mathematical definitions than they have ever been told, and I believe that this is mostly because the mathematical community, myself included, just didn't ever realise or care that it was happening. If you're a mathematician, I challenge you to formalise your best theorem in a theorem prover and send it to Tom Hales! If you need hints about how to do that in Lean, come and ask us at the Lean Zulip chat. And if if it turns out that you can't do it because you are missing some definitions, you can put them down here as answers to this big list question.

We are a small but growing community at the Lean prover Zulip chat and I am asking for direction.

korrok · Answer

One of the outstanding problems in finite element analysis is how to come up with good sets of basis functions to discretize mixed problems.
By mixed problems I mean partial differential equations that can be derived through minimization of some convex functional, but with a constraint.
A classic example is the Stokes problem, which describes the flow of very viscous fluids: find a velocity field $u$ in a nice domain $Omega$ that minimizes the free energy dissipation
$$J(u) = int_Omegaleft(munabla u : nabla u - fcdot uright)dx,$$
where $mu$ is the viscosity and $f$ the body forces on the fluid, with the added constraint of incompressibility:
$$nablacdot u = 0.$$
This can be expressed instead through finding a critical point of the Lagrangian
$$L(u, p) = J(u) - int_Omega pnablacdot u, dx$$
where $p$ is the pressure field.
For unconstrained problems, we can triangulate the domain and take the solution $u$ to be a polynomial of pretty much any degree in each triangle and everything converges nicely as you refine the mesh.
Constrained problems like Stokes are much harder!
The discretization for the velocity and pressure fields have to be carefully chosen in order to get a non-singular finite-dimensional linear system.
The most common choice is to take the velocity to be piecewise quadratic and the pressure to be piecewise linear in each triangle; this is called the Taylor-Hood element.
But there are many other elements with various tradeoffs.
The proof that a given element pair works -- that it satisfies a discrete Ladyzhenskaya-Babuska-Brezzi condition -- is usually confusing and arduous.
It would be cool if these proofs could be formalized.
It would be publication-worthy if the techniques used could enable the discovery of new element pairs.
More generally, one could try to formalize finite element exterior calculus.

Dominic van der Zypen · Answer

Why formalize complex objects like topological groups when so much fun can be had with natural, simple objects like undirected graphs? There's also tons of open questions involving these simple beings. Erdös loved them. So I suggest to have a look at simple undirected graphs.

Answered by Dominic van der Zypen on December 21, 2021

Daniel Gerigk · Answer

I would like to see Jacob Lurie's books and papers formalized, beginning with his book Higher Topos Theory.

Jonny Evans · Answer

Formalised definitions of Kuranishi structures, polyfolds, d-manifolds and/or implicit atlas might be useful, along with the associated construction of virtual fundamental class or chain. These are all structures that exist on various moduli spaces of pseudoholomorphic curves in symplectic geometry, and allow us to extract invariants.

I would guess that a proof that these structures actually exist on moduli spaces is probably too hard for most current theorem assistants, but already a formal definition might help to clarify various things: the definitions are extremely subtle, and the experts often exchange lengthy papers which illuminate or query different aspects of the definitions.

You should probably do manifolds first, though, if you haven't already...

Timothy Chow · Answer

This might be slightly controversial, but I think that one thing that would be valuable to formalize in Lean is the definition of a set.

Working mathematicians are used to formalizing things in terms of sets.  Currently, I believe that requiring working mathematicians to swallow type theory before they can even get started with a theorem prover is a significant barrier to entry.  It's just a hunch, but I think you'll get more buy-in from the mathematician in the street if you let them work set-theoretically.

Although I know that type-theoretic proof assistants have certain practical advantages, some people, such as John Harrison, have advocated a return to set theory.

Craig Gidney · Answer

I would like unsolved problems and conjectures to be formalized. That is to say, the library would provide the statements-to-be-proved and the user would be responsible for filling in the details.

The reason I think this is interesting is because it creates a very simple bar for declaring a theorem proven: make my computer print "yes" when I patch your proof into the definitions provided by the community. If the person-claiming-a-proof were to provide their own definitions, then I have to carefully review them for subtle holes (i.e. I have the problem that they gave me a valid proof that makes the computer say "yes", but did it prove what they claim it proved?).

A good simple case is the Collatz conjecture. One could imagine a person claiming a computer existence proof that there is a cycle beside 4->2->1->4, but upon close investigation it would be realized that the second cycle the proof was describing was (-1) -> (-2) -> (-1) or 0 -> 0 -> 0 because the theorem-to-be-proven was accidentally defined using integers or non-negative naturals instead of the positive naturals.

For more complicated cases, there are often many of these kinds of subtleties. For example, when defining the complexity class BQP, you cannot say that a quantum gate's coefficients are just real numbers. You have to limit the coefficients to computable real numbers. Otherwise you can solve the halting problem by directly encoding the uncomputable answer into one of your coefficients and performing increasingly detailed phase estimation to extract the encoded bits. In fact, you probably want efficiently computable real numbers since otherwise you cap the "difficulty" of any problem to the difficulty of the decoding process.

Sh.M1972 · Answer

After finite structures, finitely presented groups are the most natural mathematical beings understandable for computers. Many infinite groups have finite presentations and so they can be formalized in any Theorem Prover. Many decision problems in group theory (and low-dimensional topology) can be stated in terms of finitely presented groups (word problem for example). So, f.p groups should be formalized for such programs.

Answered by Sh.M1972 on December 21, 2021

Philippe Gaucher · Answer

Sorry for asking a question here but it is too long for a comment. Or maybe if you tell me it's wrong, I'll delete it and I could post it as a regular question in MathOverflow instead.

What is the difference between Lean and for example Coq ?

You write: "No issues with constructivism, the axiom of choice, quotients by equivalence relations, the law of the excluded middle or anything." Fine. Because a proof assistant like Coq is a nightmare for normal mathematician like me. Honestly I tried to learn Coq. I spent four months some years ago to learn the language and to do all basic exercices I found on the Internet. It's really complicated, not intuitive at all, although quite recreational (watching the goals evolving by trying tactics is a little bit like a video game), and useless for mathematics, because restricted to the intuitionistic part mostly. Of course there is the HoTT extension, but I was been told that there were still a lot of non-trivial mathematical problems to put HoTT inside Coq. And I don't think that HoTT can be a good foundations for mathematics. On the contrary, I think that HoTT will be very useful for people studying type theory.

For example, can you implement easily the theory of locally
  presentable categories in Lean. It's nice to see that you can prove
  the Yoneda lemma in Lean, but frankly it's quite a very basic lemma.
  What about the enriched Yoneda lemma for example ?

Joe Corneli · Answer

The other answers given here so far are interesting and colourful, but I'm proposing a strategic answer, drawing on the wisdom of (pre-existing)                                                    crowds.  Here are several lists of core articles on Wikipedia, increasingly diffuse, but with more technical details in the longer lists.

53 core articles (many but not all of these are already in Lean)
300 core articles (including those above)
481 core articles (including those above -- and, incidentally, including sheaves! -- this is what I'd suggest as the first-draft checklist for Lean, even if some of them get a WONTFIX status)

If the corresponding definitions and theorems were all in place, then I think various  "wish list" items could be pursued more straightforwardly.  Furthermore, many of these are topics that undergraduates could tackle, which would tend to increase the size of the pool of trained people for tackling more mathematically difficult topics later.

none · Answer

Maybe attack some of these, and get Lean added to the page's prover list:

http://www.cs.ru.nl/~freek/100/

(it says it was last updated a few weeks ago, so it is still maintained).

Daniel Satanove · Answer

These theorem provers are founded on type theory, and as well with the popularity of homotopy type theory, it would be nice to see some of type theory itself formalized to get a sort of type theory in type theory. Then formalize some of the metatheory of various type theories like normalization, type checking and inhabitation (or the undecidability thereof), and perhaps some categorical logic (which I personally have found a somewhat nebulous subject, and seeing it formalized would be helpful).

Answered by Daniel Satanove on December 21, 2021

user47305 · Answer

As much a question as a suggestion, but now that a scheme is defined, is there a plan to move forward on EGA-style algebraic geometry? Is it feasible that someone (or many someones) could just plow through the Stacks project tag by tag, or are there some basic difficulties?

Answered by user47305 on December 21, 2021

S. Carnahan · Answer

I'm not sure how much interest in theorem proving software one can get among the physically inclined, but I would be interested in seeing a formalization of BPS states.  They are representations of the supersymmetry algebra (a finite dimensional extension of the Poincare Lie algebra), that have nontrivial odd part in the annihilator.  This property makes them robust against small changes in theoretical parameters, and hence good for computing invariants of manifolds (among other things).

They became popular in theoretical physics in the mid 1990s, around the same time as the word "brane", and physicists close to my area of research use them a lot to count black holes and construct infinite dimensional Lie algebras.  However, I am not particularly fluent in this language.

Warning: It is quite conceivable that a bare formalized definition won't do anything for us until we find a way to formalize some supersymmetric theories whose BPS states can be identified.  However, some toy models are supposed to be "not too bad for mathematicians".

Manuel Bärenz · Answer

Graphical calculus in monoidal categories with extra structure.

A lot of proofs in category theory are done graphically. There are even graphical proof assistants like Globular. Lean needs to make a connection to these eventually, so we can give rigorous graphical proofs.

This is not just about the library, but also additional tooling is required that e.g. transforms certain Lean quantities into diagrams in some format, and vice versa.

Thomas Kojar · Answer

Here are some definitions from analysis and probability:

Many measure theory concepts follow clear axiomatic rules. One quick search gave Formalization of Measure Theory.
Many commonly used functional spaces such as $Sobolev$ and $Hddot{o}lder$ would be interesting to see eventually.  A lot of calculus definitions and results seem to have been formalized already see

Formalization of the Integral Calculus in the PVS Theorem Prover and
Formalization of Real Analysis: A Survey of Proof Assistants and Libraries ,

I am guessing one can start from them.

Manuel Bärenz · Answer

Basic differential topology, basic Morse theory
Category theory and standard theorems (Yoneda lemma, adjoint functor theorem, etc.)
Simplicial sets
Basic homotopy theory
Some models of $(infty,n)$-categories, and their equivalences

It would be amazing if we could formalise some version of the cobordism hypothesis, which connects Morse theory and $(infty,n)$-category theory. At least the statement should be feasible.

Neil Strickland · Answer

Here is a list of things that I would like to see formalised, slightly modified from the list that I was using at Schloss Dagstuhl Seminar 18341.  All of these things should be trivial for the experts.  However, I would like to see formalisations with extensive explanation and annotation, written from the point of view of mathematicians who might want to use Lean, not from the point of view of developers of proof assistants.

Prove that $2+2=4$.  Key points: basic boilerplate at the top of the file, basic grammar of stating and proving, how to interact with the proof assistant.
Prove that there are infinitely many primes.  Key points: more basic grammar, how to search and refer to existing libraries, how to write a proof by contradiction, how to write a proof that constructs a witness to an existential statement.
Set up the basic facts about the group of units of a commutative ring, and prove that $mathbb{Z}^times={1,-1}$.  Key points: the basic framework for dealing with algebraic structures such as rings and groups.  The basic framework for dealing with substructures defined by a condition such as invertibility.  The framework for dealing with chains of algebraic identities.  Where to look in the library for facts about $mathbb{Z}$.  How to refer to the standard ring structure of $mathbb{Z}$.
Set  up the theory the ideal of nilpotent elements in a commutative ring $R$.  Prove that $text{Nil}(R)$ is an ideal and that $text{Nil}(mathbb{Z}/4)={0,2}$ and that $text{Nil}(R/text{Nil}(R))={0}$.  Key points: the basic framework for quotient structures.  The framework for dealing with $mathbb{Z}/n$ (which might be defined as a quotient ring, or as the set ${0,dotsc,n-1}subsetmathbb{N}$).  The framework for dealing with ring homomorphisms, including the projection $mathbb{Z}tomathbb{Z}/n$.  Indexed sums in general, and the binomial expansion.
Given a finite set $A$ (with decidable equality), construct the poset of partitions of $A$, as an instance of some well-structured theory of finite posets.  Prove that if $|A|=3$ then $|text{Part}(A)|=5$. Key points: the basic framework for dealing with finiteness and decidability.

I have been working on this myself, doing Coq+sseflect, Isabelle+HOL-algebra and Lean in parallel, but so far I have spent less time on Lean than with the other two, so I have only looked at tasks 1 and 2 in Lean.  I spent about a week on this directly after Dagstuhl but then had to turn to other things; I hope to get back to it in a week or two.  Of course all the real mathematical issues are simple, but apparently trivial points about the software framework can easily absorb enormous amounts of time. If anyone wants to send me Lean code then I would be grateful.

I will also say that the main issue in promoting the use of Lean is not the presence or absence of formalisations of your favourite bits of mathematics.

It is not possible to get a usable installation of Lean + library under Windows without following instructions in Kevin's blog post, which is hard to find unless a well-informed person directs you there.  If you get anything wrong, then the error messages are not helpful.  You first have to install MSYS + various packages, totalling nearly 1GB.  As far as I can tell this could be avoided by restructuring a couple of files in the Lean C++ code to use the appropriate Windows system calls rather than starting an external process.  That would also let you install Lean properly in C:Program Files, rather than having to use a path without spaces.  The lead developer has told me that he does not see it as a priority to fix any of this.  All this is slightly ironic given that the project is funded by Microsoft.
Although there is a lot that is very good about the existing documentation, it is clearly not written from the point of view of a mathematician who is interested in using Lean to help check proof of new things.

UPDATE:
There has been some recent activity aiming to improve the installation process.
You can read about it at https://github.com/leanprover/mathlib/blob/master/docs/elan.md.  (I haven't tried it myself.)

jmc · Answer

I would like to see classical representation theory and Lie theory formalised:

Representations of finite groups
Classification of semisimple Lie algebras (over $bar{k}$) and their finite-dimensional representations.

Yemon Choi · Answer

Something which shouldn't require completeness: the Tsirelson norm on the space of finitely supported real-valued sequences. (Well, more precisely, the norm defined in Figiel and Johnson's paper http://www.numdam.org/item?id=CM_1974__29_2_179_0 ).

Answered by Yemon Choi on December 21, 2021

Yemon Choi · Answer

This may be non-interesting or way off-base, but: I would love to see some way of formalizing the notion of "the category of real Banach spaces and bounded linear maps, in a world without assuming the Hahn-Banach theorem". The point is that some short exact sequences seem to split in the "usual category" of Banach spaces, without splitting in the category mentioned above, and this affects whether various Banach spaces are isomorphic or not.

It might then be possible to apply a properly homological point of view to questions in functional analysis that normally get stated loosely as "yes, well we know X happens, but that needs Axiom of Choice".

Ryan O'Donnell · Answer

I propose formalizing the definition of a Turing Machine, and hence decidable/computable languages.

(That said, a paper by Asperti and Ricciotti has already described doing it for the Matita Theorem Prover.)

Mark Wildon · Answer

Maybe a formalization of homological algebra would be interesting? There are many objects that appear to depend on a choice of an infinite projective (or injective) resolution, but in fact need only finitely many terms, and are independent of which resolution is chosen.

In particular, I suggest formalizing the $mathrm{Ext}$ and $mathrm{Tor}$ functors for modules over a Noetherian ring.

François Brunault · Answer

I would really like to see motives formalized. Although motives were in the beginning more a "philosophy" (dating back at least to Grothendieck), the situation has evolved and thanks to Beilinson's insights and the work of Voevodsky (among others), we now have working definitions for the category of mixed motives (technically this is a triangulated category: it still remains to define a $t$-structure on this category in order to get an abelian category of mixed motives). Of course, the philosophy is still there and the informal point of view is both necessary and very motivating, but the point is that things have become much more precise. As an application or motivation for this formalization, I think of the conjectures about the transcendence of periods (in the sense of Kontsevich and Zagier). These conjectures naturally fit in and extend to the framework of motives, where one tries to describe all the algebraic relations between periods (or regulators) associated to one or several given motives.

I feel that I should try to give a flavour of what motives look like. Very roughly, the motive associated to an algebraic variety $X$ is an abstract version of the cohomology group $H^i(X)$. Motives are universal in the sense that applying various realization functors we recover the usual cohomologies. Smooth proper varieties give rise to pure motives, while general varieties give rise to mixed motives (basically, non-trivial extensions of pure motives). Voevodsky's published work deals with algebraic varieties over a field, but we now have definitions for (triangulated) mixed motives over a pretty much arbitrary base scheme, and with arbitrary coefficient rings.

Broadly speaking, the theory of motives has an heavy formalism but this is precisely where computers may be helpful. The actual construction relies in particular on homological and homotopical algebra, so I guess one would need to formalize this first, which could be a lot of (interesting!) work. Of course, it would also build on the formalization of schemes.

I should add that although the theory of motives is generally considered as difficult, formalizing them could help in making it more accessible. Namely, it is the kind of theory which has gone very far, but the layman struggles to even find the definitions. The conjectures in transcendence number theory that I mentioned are very appealing, but it's hard to make them precise, in the sense of finding and understanding the relevant literature. Of course, the experts (not me) know how to formulate things or at least have a clear view, but somehow I feel that the access to this knowledge is difficult (there are a number of excellent references on motives though). I hope that having a formalization would encourage interested people to actually learn more about motives.

Benjamin Steinberg · Answer

I would like to see that Statement of the Classification of Finite Simple Groups formalized. And then I would like to see the proof formalized. Many people use the theorem as a black box. The first generation human proof is too big for people who are not working on the second generation and third generation proofs to completely understand and it would be a huge service to the mathematical community to formalize this, as was done for the Odd Order Theorem.

Answered by Benjamin Steinberg on December 21, 2021

Deane Yang · Answer

Basics of differential geometry:

Definitions of manifold, tangent and tensor bundles
Basic theorems:

Frobenius theorem
Poincare lemma
Implicit function theorem

Basics of Riemannian geometry

Denis Chaperon de Lauzières · Answer

Here are ideas that might be the type that you are looking for, some with more analytic flavor: formalize the statements of

(1) The Mostow Rigidity Theorem.

(2) The existence of Brownian motion.

(3) The Weyl law for the asymptotic behavior of the number of eigenvalues of the Laplace operator on a compact Riemannian manifold.

beginner · Answer

I think it would be great to have the Multiradial Representation of Inter-universal Teichmüller Theory, as in this:

https://www.youtube.com/watch?v=6HCz1tFqIcs

Kevin Buzzard · Answer

I would like to see a formal definition of the Langlands Group of the rational numbers, with a future eye on formalising the statement of the global Langlands reciprocity conjecture rigorously in the number field case.

I am currently unclear on whether there is a way to rigorously formalise the statement of the full Global Langlands Reciprocity Conjecture, or whether it is just supposed to be a "philosophy" which implies functoriality. I wish I understood more of Jim Arthur's paper on the subject.

Which mathematical definitions should be formalised in Lean?

28 Answers

Add your own answers!

Ask a Question