set theory

Why Infinity is Strange

Infinite Sets I think I finally understand Cantor’s proof about non-countable (or non-enumerable) sets (I’ve been slowly working through Boolos and Jeffrey’s Computability and Logic; I find their notation and general approach to be quite hard to follow, so it’s been a struggle). This particular result has to do with infinity, which is a conceptually difficult topic to grasp, even when considered non-mathematically. Part of what Cantor discovered is that infinite sets can have different cardinalities (where cardinality here corresponds to the number of items in a given set). For example, the cardinality of the natural (or counting) numbers $\mathbb{N}$ is much smaller than that of the real numbers $\mathbb{R}$ (we won’t talk directly about the real number, but instead about the power set of the natural numbers $\mathcal{P}(\mathbb{N})$, which has the same cardinality as $\mathbb{R}$). More generally, Cantor showed that infinity is not a singular concept and that there are in fact many different types of infinity (which, at first glance, is a bewildering proposition). Let’s start with some definitions. Somewhat informally, we will call an enumerable set as any set that can be aligned to a subset of the natural counting numbers (e.g., the set of chairs in my kitchen, of which there are 4). A countably infinite set, then, is a set that can be aligned to all of the natural counting numbers. For example, let’s imagine we have a simple formal language $\mathcal{L} = { a^{*} }$; we can argue that the string set associated with this language is countably infinite by doing the following alignment (such that the number of $a$s in each string is mapped uniquely to each associated counting number): $$\lambda \to 0, a \to 1, aa \to 2, aaa \to 3, aaaa \to 4, …$$ More formally, you can define an infinite set as a function from values in that set to the natural numbers, e.g., $$f : \mathcal{L} \to \mathbb{N}$$ To use some terminology that will become useful later, we will say that for an infinite set to be countably infinite, there must exist some bijective relation, or one-to-one mapping, between the given set and the natural numbers (i.e., for each item $x$ in the domain of $f$, which in this case would would be our infinite set, there should exist a single item in the co-domain $f(x) \in \mathbb{N}$ and vice versa). With these ideas, let’s now prove a relatively simple, yet puzzling, theorem about countably infinite sets. Theorem 1 For any two countably infinite sets $A$ and $B$ (for convenience, we will assume that $A$ and $B$ are disjoint, i.e., $A \cap B = \emptyset$), the union of $A$ and $B$, $A \cup B$, is also countably infinite. Proof (rough sketch following the discussion here) Given that $A$ and $B$ are countably infinite, we know that there exist two bijective functions $f : A \to \mathbb{N}$ and $f’ : B \to \mathbb{N}$. To prove the theorem, we need to come up with a new bijective function $g : (A \cup B) \to \mathbb{N}$. Below defines such a function: $$ g(x) = \begin{cases} 2f(x) & \text{if }x \in A, \\ 2f’(x)+1 & \text {if }x \in B \end{cases} $$ which simply assigns items in $A$ to the even numbers (of which there are a countably infinite amount) and items in $B$ to odd numbers (of which there also are a countably infinite amount) □ Now why is this result puzzling? Well, typically when we combine two of anything, the result is larger than the individual pieces we combined. For example, if I combine the set of chairs in my kitchen with the set of chairs in your kitchen (assuming that you have chairs), the resulting set is the sum of our two sets and hence is larger. Countable infinity doesn’t work this way; when you combine two countably infinite sets, the resulting set has exactly the same cardinality. In the 17th century, Galileo had made several observations about infinity, even without the set-theoretic machinery that was developed by Cantor much later in the 19th century. What he noticed is that if you can create a one-to-one mapping between square numbers (i.e., the numbers that are the product of other natural numbers, e.g., $0,1,4,9,16,…$) and the natural numbers (this is similar in spirit to what we did in Theorem 1). More technically, he made the observation that the function $ f : n \to n^{2} $ is bijective for all $n \in \mathbb{N}$. This led him to say the following about infinity in his book Two New Sciences: This is one of the difficulties which arise when we attempt, with our finite minds, to discuss the infinite, assigning to it those properties which we give to the finite and limited; but this I think is wrong, for we cannot speak of infinite quantities as being the one greater or less than or equal to another. [He then goes on to discuss his result on square numbers]. As a mathematical matter, it turns out that his last point about how we cannot speak of infinite quantities as being smaller or greater than one another is not quite right. Even though countably infinite sets are all equal to one another, there are infinities with larger cardinalities. This is the result that we turn to next. Cantor’s Theorem Cantor’s Theorem is about the relative cardinality of sets and their power sets, as mentioned at the beginning. More specifically, the theorem states that the cardinality of a given set $A$ is strictly less than its power set (i.e., the set of all sets constructed from $A$, plus the empty set $\emptyset$). Below states this more formally (where $|\cdot|$ denotes the cardinality of a set, and $\mathcal{P}(\cdot)$ denotes the power set): Cantor’s Theorem $$\text{For any set } A, |A| \lt | \mathcal{P}(A)|$$ The Finite case: Notice that we didn’t specify whether the set $A$ is finite or infinite, meaning that this result applies to both. For finite sets, this is straightforward to prove; if you attempt to construct a power set from a given set of cardinality $n$, you will notice that the cardinality of the resulting power set will always be $2^{n}$ (it’s encouraged that you try to do this yourself). For example, imagining the set $A = \{ a , b\}$ of size 2, the power set is the following: Example 1 $$\mathcal{P}(A) = \{ \{ a \} , \{ b \} , \{ a , b \} , \emptyset \} $$ (note the existence of the empty set $\emptyset$, which is always a subset of every set). Clearly $n \lt 2^{n}$ is always true for all $n \in \mathbb{N}$ (the full proof requires a bit more work, but it relates to what comes next related to proving this for infinite sets). The Infinite case: The more surprising part of Cantor’s theorem is that this applies to infinite sets. We will specifically try to prove the following claim: Claim 1 $$| \mathbb{N} | \lt | \mathcal{P}(\mathbb{N}) |$$ In keeping with our more general theme, what this claim says is that the cardinality of $\mathbb{N}$, which we know to be countably infinite, is smaller than something else (hence, there is an infinity larger than $| \mathbb{N} |$!). To demonstrate this, we will do the following: first, we will start with a definition of a set that is undeniably in $\mathcal{P}(\mathbb{N})$, then show that if we try to assign it to a specific natural number (as is required to make the mapping bijective), we will run into a contradiction. Let’s imagine that we were to map the natural numbers to its power set. What we would have are individual mappings such as those shown below: $$ 0 \to \{ 0, 1 \} \\ 1 \to \{ 3, 4 \} \\ 2 \to \{ 5, 6 \} \\ 3 \to \{ 3, 5 \} \\ … $$ Notice that some of the sets on the right side contain the numbers on the left to which they are indexed (i.e., for $0$ and $3$, these numbers are included in their corresponding sets $\{ 0,1 \}$ and $\{ 3, 5\}$), whereas the others do not. Based on this, let’s make the following definition: Definition Let’s define the function $d : \mathbb{N} \to \mathcal{P}(\mathbb{N})$, and the set D of all numbers that are not included in their corresponding set: $$ D = \bigg\{ x \in \mathbb{N} \mid x \notin d(x) \bigg\} $$ Using again the example mapping above, $D$ would include the numbers $1$ and $2$ (i.e., $\{ 1, 2\} \subseteq D$). Since $D$ is a set of positive numbers, it clearly must appear in the power set of $\mathbb{N}$, or $\mathcal{P}(\mathbb{N})$. That is, there should exist some mapping: $$ j \to D $$ This definition, however, leads to a contradiction that makes it impossible for $d$ to be bijective (thus making Claim 1 true). To organize this a bit, we will first prove that $d$ is not bijective in the following lemma, then prove Claim 1 as a separate theorem: Lemma 1 The function $d$ defined above is not bijective/one-to-one. Proof (sketch) We will prove this in relation to the set $D$ defined above. Let’s assume that $d$ is bijective. Let’s then imagine that we find some set $S_{j} \in \mathcal{P}(\mathbb{N})$ that is identical to $D$ and, since $d$ is bijective, maps backwards to the number $j$, i.e., $S_{j} = D$ and $d(j) = S_{j}$. By definition of $D$, the following must be true: $$j \in D \text{ iff } j \notin S_{j}$$ If we assume the equivalence $S_{j} = D$, however, we get the following contradiction by substituting $D$ for $S_{j}$ in the statement above: $$j \in S_{j} \text{ iff } j \notin S_{j}$$ This shows that It is not possible to assign a given number $j$ to $D$ (which again, is undeniably in $\mathcal{P}(\mathbb{N})$). Therefore, $d$ cannot be bijective. □ To be honest, this result still boggles my mind a bit. What this shows is that if we try to construct a bijective relationship between $\mathbb{N}$ and $\mathcal{P}(\mathbb{N})$, we will find a hole in the form of a missing set such as $S_{j}$ that is excluded in our mapping. As Boolos and Jeffrey remark, even if we try to repair this hole by trying to add $S_{j}$, this will only have the effect of creating a new hole. With this, we can prove the main result related to Claim 1: Theorem 2 Claim 1 is true, i.e., $| \mathbb{N} | \lt | \mathcal{P}(\mathbb{N}) |$ Proof (rough sketch) Let’s assume that this claim isn’t true, and that $\mathbb{N}$ and $\mathcal{P}(\mathbb{N})$ have the same cardinality. Since we already know that $\mathbb{N}$ is countably infinite, this would mean that $\mathcal{P}(\mathbb{N})$ is also countably infinite, and hence should have a bijective/one-to-one mapping to $\mathbb{N}$. Through Lemma 1, we know that such a bijection is not possible since it is possible to come up with a set in the power set that can’t be indexed with some $j \in \mathbb{N}$. Given the result above, we still need to exclude the possibility that $\mathcal{P}(\mathbb{N})$ is smaller than $\mathbb{N}$. To do this, it suffices to show that some strict subset of $\mathcal{P}(\mathbb{N})$ is greater or equal to $\mathbb{N}$. Consider the subset $D’ = \{ \{ x \} \mid x \in \mathbb{N} \}$. Clearly $D’$ is a strict subset of $\mathcal{P}(\mathbb{N})$, and clearly it can have a one-to-one mapping to $\mathbb{N}$ (i.e., by indexing each $x \in \mathbb{N}$ with its singleton set which will give it the same size) □ A natural question to ask at this point is whether there is an infinity that has a greater cardinality than $\mathcal{P}(\mathbb{N})$ (perhaps this is the largest infinity). Using similar arguments, we can prove that $| \mathcal{P}(\mathbb{N}) | \lt |\mathcal{P}(\mathcal{P}(\mathbb{N})) |$, which in turn is less than $ | \mathcal{P}(\mathcal{P}(\mathcal{P}(\mathbb{N}))) |$, which is less than $ | \mathcal{P}(\mathcal{P}(\mathcal{P}(\mathcal{P}(\mathbb{N})))) |$, which is less than….