Modules over a ring #
These notes are a slightly edited version of an introductory talk I gave at the Australian Mathematical Sciences Student Conference at ANU Canberra in July 2013.
In undergraduate mathematics, we are introduced to the concept of a vector space over a field.
A vector space is a set of vectors satisfying a group of intuituive axioms which are abstracted from the rules obeyed by tuples of real or complex numbers.
More formally, a vector space \(V\) over a field \(F\) is a set equipped with the binary operations of addition \[ V\times V\to V:(v,w)\mapsto v+w \] and scalar multiplication \[F\times V\to V:(\lambda,v)\mapsto \lambda v\] satisfying a bunch of familiar-looking axioms (which I won’t write down).
Classification of vector spaces #
As pure mathematicians, having just introduced a new type of algebraic structure, we ask the following natural question:
What are the possible behaviours of this new structure? That is, given a field \(F\), how many different vector spaces are there over this field?
This answer is, of course, provided by the notion of dimension:
Theorem (Classification of vector spaces) #
Given a field \(F\), there is precisely one vector space, denoted \(F^n\), for each integer \(n\geq 1\) (for a specific notion of precision called isomorphism). \(n\) is called the dimension of \(F^n\).
Generalising #
Again, as inquisitive pure mathematicians, having solved the previous question we are naturally led to a new question:
What happens to the definition of a vector space if we allow \(F\) to be a more general algebraic structure than a field? Can we still hope to classify these new objects?
Rings #
We now need to think about generalising the notion of a field (a set which is closed under multiplication and addition with distinct additive and multiplicative identities in which every element has an additive and multiplicative inverse). If we relax the assumption that every element has a multiplicative inverse, we arrive at the definition of a ring.
- All fields are in particular rings. e.g. \(\mathbb{R},\mathbb{C},\mathbb{F}_3\), etc.
- The ring of integers \(\mathbb{Z}\)
- The ring of polynomials over a field \(F[ x]\)
- The ring of integers modulo 4 \(\mathbb{Z}/4\mathbb{Z}\)
- The ring of all \(n\times n\) matrices \(M_n(\mathbb{R})\) with real entries
Modules #
If we keep all the previous vector space axioms, but allow for a general ring \(R\) instead of a field, we arrive at the definition of an \(R\) -module.
More formally, a module \(M\) over a ring \(R\) is a set equipped with the binary operations of addition \[ M\times M\to M:(m,n)\mapsto m+n \] and an \(R\) -action \[ R\times M\to M:(r,m)\mapsto rm \] satisfying a bunch of familiar-looking axioms (which I won’t write down, and which are almost identical to the vector space axioms).
So if \(F\) is a field, an \(F\) -module is just an \(F\) -vector space.
The more general question of classifying \(R\) -modules for a given ring \(R\) is essentially what drives a large proportion of modern algebra – this is an area known as representation theory.
Some research focusses on coarser classification for certain families of rings, while other research focusses on complete classification for specific rings (this is what I do).
In fact, for almost all rings, a complete classification of all modules is pretty much impossible, and we have to settle for classifying the atomic building block objects from which all modules are made.
These are known as the irreducible or simple modules for this ring, and there is a well-defined way (composition series) of building general modules from these simple pieces (similar to how we build general groups from simple groups).
Modules over the integers #
A particularly tractable non-field example is given by modules over the integers \(\mathbb{Z}\).
In fact, \(\mathbb{Z}\) -modules are exactly the same as another familiar algebraic object, namely abelian groups. This is because the scalar multiplication, when the scalars come from \(\mathbb{Z}\), i.e. the \(\mathbb{Z}\) -action, corresponds precisely with iterated addition in the group.
If you’re interested in the next most complicated case for classifying modules, read about the classification of modules over principal ideal domains (of which \(\mathbb{Z}\) and fields are special cases).
Homomorphisms #
In order to have any meaningful further discussion about modules, we need the notion of a module homomorphism.
After being introduced to vector spaces, you are then introduced to linear transformations, which are maps \(T:V\to W\) between vector spaces which preserve or respect the linear structure of the spaces (these requirements can be encoded in a list of axioms).
A fundamental result is that linear transformations can be encoded as matrices.
Another is that the rank (or dimension of the image) of a linear transformation \(T:V\to W\) and its nullity (the dimension of its kernel) add to give the dimension of \(V\) (the rank-nullity theorem).
Module homomorphisms #
A homomorphism \(\phi:M\to N\) between two $R$-modules \(M\) and \(N\) is just a map which respects the $R$-module structure of both modules.
So if \(F\) is a field, an $F$-module homomorphism is just a linear transformation.
It turns out that homomorphisms are extremely important in
- precisely describing the notion of equivalence of modules in our classification problem
- determining the atomic building blocks of all possible modules for a given ring
The symmetric group #
In my research, I’m interested in modules over a special family of rings, called the group rings} of the {\em symmetric groups.
The symmetric group \(\mathfrak{S}_n\) is just the group of all permutations of the numbers \(\{1,2,\ldots,n\}\). So for example, the symmetric group \(\mathfrak{S}_3\) consists of the six permutations
\begin{array}{ccccccccccc} 1&2&3&&1&2&3&&1&2&3\\ 1&2&3&&1&3&2&&2&1&3\\ &&&&&&&&&&\\ 1&2&3&&1&2&3&&1&2&3\\ 2&3&1&&3&1&2&&3&2&1 \end{array}
Let’s call these \(\sigma_1,\sigma_2,\ldots,\sigma_6\).
The symmetric group ring #
We can make a ring from this group by considering all possible products of formal linear combinations of \(\sigma_1,\ldots,\sigma_6\) with coefficients from some other ring \(R\). For example if \(R=\mathbb{C}\) then elements like \[ \sigma_1+3\sigma_3-(2+i)\sigma_4+(\sqrt{3}-2i)\sigma_6 \] belong to the group ring \(\mathbb{C}\mathfrak{S}_3\), and we multiply by expanding products of these linear combinations as usual and then multiplying permutations:
\begin{array}{ccc} 1&2&3\\ 1&3&2 \end{array}
×
\begin{array}{ccc} 1&2&3\\ 2&1&3 \end{array}
=
\begin{array}{ccc} 1&2&3\\ 1&3&2\\ 2&3&1\\ \end{array}
=
\begin{array}{ccc} 1&2&3\\ 2&3&1 \end{array}
\[ \sigma_2\times \sigma_3=\sigma_4. \]
The symmetric group ring #
Question: study modules over these group rings.
Answer: for some choices of \(R\) (like \(\mathbb{C}\)), the classification has been known for decades. For others, this is still a significant open problem.
Combinatorics of modules over the symmetric group ring #
It turns out that the classification of \(\mathbb{C}\mathfrak{S}_n\) -modules can be given combinatorially, using objects called partitions and tableaux.
A partition \(\lambda=(\lambda_1,\lambda_2,\dots)\) is a weakly decreasing sequence of non-negative integers.
For example, \((3)\) , \((2,1)\) and \((1,1,1)\) are the three partitions of 3. We usually represent them as Young diagrams.
There is one irreducible \(\mathbb{C}\mathfrak{S}_n\) -module for each partition of \(n\). This means that all \(\mathbb{C}\mathfrak{S}_n\) -modules can be built from these irreducible modules in some sense.
Rather than just giving a bijection in this way, the combinatorics of partitions actually allow us to encode the internal structure of the modules as well.
(Partitions are playing the role that the non-negative integers played for fields; the internal structure of vector spaces is very simple so a combinatorial description isn’t needed).
For a partition \(\lambda\) with \(n\) boxes, a \(\lambda\) -tableau is a filling of the boxes with the numbers \(1,2,\ldots, n\). The set of all tableaux of shape \(\lambda\) is denoted \(X^\lambda\).
The tableau is standard if the numbers increase down columns and along rows.
For each vector space \(F^n\), the integer \(n\) was sufficient to completely describe the space (assuming we are using the standard basis).
For each irreducible \(\mathbb{C}\mathfrak{S}_n\) -module \(S^\lambda\) (remember there is one for each partition \(\lambda\) of \(n\)), it turns out:
- The number of elements in a basis for the module (its dimension) is given by the number of standard $λ$-tableaux
- There is a natural way to index a basis by the tableaux
- There are simple formulae describing the action of \(\mathbb{C}\mathfrak{S}_n\) on this basis in terms of purely combinatorial information
Modules and voting models #
Given a partition \(\lambda\), define an equivalence relation \(\sim\) on the set of \(\lambda\) -tableaux by \[ \mathtt{s}\sim \mathtt{t}\quad\mbox{if the rows of $\mathtt{s}$ and $\mathtt{t}$ are pairwise identical as sets} \]
We denote equivalence classes under this relation, called tabloids, by removing the vertical lines within rows.
As a final application, let’s see how to use the language of \(\mathbb{C}\mathfrak{S}_n\) -modules and tableau combinatorics to discuss modelling elections.
Let \(X^\lambda\) denote the set of all tabloids of a given shape \(\lambda\).
Suppose there are \(n\) candidates in an election and voters are asked to return a complete list of candidates ranked in order of preference, from most to least favoured. (This system is called compulsory preferential or full preferential voting, and is used in federal lower-house elections in Australia).
Then each voter is really being asked to choose a tabloid from the set \(X^{\overbrace{(1,1,\ldots,1)}^{n\;\; \text{ones}}}\).
For example if there are three candidates, each voter must choose one of the six tabloids from \(X^{(1,1,1)}\).
As another example, suppose voters are asked to return only their most favourite candidate, and leave the remaining \(n-1\) candidates unranked. (This system is called first-past-the-post, and is used for example in parliamentary elections in the UK and presidential elections in the USA).
Then each voter is really being asked to choose a tabloid from the set \(X^{(1,n-1)}\).
In general, the situation is more complicated than this.
For example, in NSW lower-house elections, which are conducted under optional preferential voting, voters may rank as many or as few candidates as they wish.
This means voters can return a tabloid from \(X^{(1,1,\ldots,1)}\), or \(X^{(1,n-1)}\), or any set in between.
In the nascent field of algebraic voting theory, the assumption is usually made that voters are choosing from a single set \(X^\lambda\) of tabloids.
It turns out that such positional voting procedures can be viewed as \(\mathbb{C}\mathfrak{S}_n\) -module homorphisms between two \(\mathbb{C}\mathfrak{S}_n\) -modules \(M^\lambda\) and \(M^{(1,n-1)}\).
This then allows us to use representation theory and our understanding of the simple \(\mathbb{C}\mathfrak{S}_n\) -modules and the decomposition of \(M^\lambda\) into the simple modules we saw earlier to make meaningful statements about elections and voting procedures.
Indeed, the above ideas give a very elegant proof of the famous statement “rather than reflecting the views of the voters, it is entirely possible for an election outcome to more accurately reflect the choice of an election procedure.