Elliptic Curve Associativity

Associativity of Elliptic curve group law

This page was made in December 2022 to explain the associativity of the elliptic curve group law. This follows a standard low brow approach, such as given in Miles Reid's book on an undergraduate course in algebraic geometry.

I use embedded Desmos on this page.

The group law on an elliptic curve

I was going to write this page without a definition of the group law, since the students this is aimed at already know. But for completeness I will add this in.

An elliptic curve is a non-singular cubic, with a given "zero" point. It will often be given in the form

y² = x³ + Ax + B

(go to this desmos)

But it can be given in other ways. On this page, we won't worry too much about this, since putting the curve into Weierstrass form or whatever, would be a topic for a different discussion.

Here is another example of an elliptic curve:

(go to this desmos)

Pick a zero

You have to pick a point to be the zero of the group, and also this doesn't matter too much where it is, unless you want to put in Weiestrass form: If you have any abelian group stucture on your curve E, that is, a map from E x E to E with a zero O, and the usual abelian group law, +, then we can take any point X, and define a new group law, o, and set P o Q = P + Q - X.
Then X o P = X + P - X = P for all P on E, and P o (2X-P) = P + 2X - P -X = X, so 2X - P is the inverse of P. Finally, (A o B) o C = (A + B - X) + C - X = A+B+C-2X = A o (B o C). So the group is associative assuming E,+ is. The group E,o is abelian since E,+ is. So let's not worry about what point zero is, and let's not worry about whether E is in Weierstrass form. (This is probably the sort of thing that should be left as an exercise, but students who want to do the exercise can do it anyway without being told to do it as an exercise.)
Recently I have been telling my second year geometry students that a Euclidean space doesn't come with a zero. This is a bit similar; in fact, an elliptic curve does come with a specified zero, but a cubic curve doesn't. When we specify the zero, we turn a non-singular cubic into an elliptic curve. So saying that where the zero is doesn't matter is possibly false; anyhow, pick a point on your cubic, and you have an elliptic curve; the only problem will arise if there is no point, and this is not so much a problem if you know there is no point; the difficulty would be if you don't know where any point is, or, if you know there are some points, but you don't know where any are. But let's ignore this discussion, it's not relevant for proving associativity.

Note that the zero for the curve in the standard Weierstrass form is generally taken to be the "point at infinity". Many students find this quite scary; it's nothing to be afraid of really, but makes more sense if you have an understanding of the projective plane.

The group law

A group law is a way of taking two points, A, B and spitting out a third, called A+B. The point of this is that from a couple of points on the curve you can get a whole load more, which helps if you want to know about all possible solutions, or you have one solution, but it's not the solution you want for some reason, or if you want to do elliptic curve cryptography, or you like groups, or whatever.
The recipe for finding A+B is as follows: Find a point, which we call -(A+B) which is the point where the straight line L through A and B meets E at a third point. Then A+B is the third point on E and the line through O and -(A+B). You need to do some work to check all this makes sense, e.g., that there are three points on the intersection of E and L when counted in the "correct" way.
In the following example, O is the point at infinity, (0:1:0), which we can think of as being a point where all vertical lines meet.

(go to this desmos)

In the following example, O is the point (0,0).

(go to this desmos)

Associativity

Associativity means that (A+B)+C = A+(B+C).
If we take for granted than -X is well defined, and -(-X)=X, which we'll leave as an exercise, then this is equivalent to showing that -((A+B)+C) = -(A+(B+C)), I.e., you follow the rules to construct these points, which a priori will be different, and show you end up with the same points.
The constructions of both -((A+B)+C) and -(A+(B+C)) are shown below in the sections with the corresponding titles.

Families of curves

The method we use to show that the points -((A+B)+C) and -(A+(B+C)) are the same is to show that if two cubics pass through 8 common points, then there is a unique 9th common point they pass through.
A key point is to show that the dimension of the family of curves through 8 points (in "general position", meaning "so that this result is true") is one dimensional. So it might be useful to think about the dimension of various families of curves.

Lines through a point

For example, the family of lines through a point is "one dimensional". Here is a picture.

(go to this desmos)

Being one dimensional means that there is a bijection between the family of lines through the point (which is the red dot in the above picture), and a "line". Here a line means a projective line, but you can just take it as a usual line with a point at infinity. The parameter could be the slope of the line, or the y-intercept, or the x-intercept, or some other parameterisation. The important thing is that just one number determines the line through this point, and all lines through this point are determined by a choice of the parameter. In the picture, there is a moving line shown, shown as a parameter t varys, and there is also a selection of possible lines from the family shown.

Parabolas through a point

What about parabolas through a point? In this case we need two paramaters to determine a choice of parabola through a single point:

(go to this desmos)

You could think of the parameters as the x and y intercept, or the two x intercepts, or the coordinates at the point where the derivative is zero, or any other pair of numbers which determines the curve. These parameters will not always work, e.g., if the chosen point is on the x-axis then you can't specify two other interpepts. So different choices of parameter may be used for different sitations.

Parabolas through two points

If you take two points and ask for all the parabolas that pass through both of them, you get a one dimensional family.

(go to this desmos)

And if you specified three points, and asked for a parabola though them, in general you would get a unique solution (forgetting about cases such as all points on a single line). This is similar to how specifying three points on a circle is enough to determine the circle (which is taught in an elementary ruler and compass geometry course). A unique solution corresponds to a single point in the parameter space, and a single point is considered to have dimension zero.
A general parabola has the form y = Ax² + Bx + C, so the set of all parabolas is three dimensional, since there are three choices of parameter here, A, B, C. For each point (A,B,C) in three space, there is a corresponding parabola. Specifying that a point (x,y) is on a parabola is a linear condition on the values A, B, C. Each point will give a condition. From linear algebra, we expect that each additional point specified to be on the parabola will reduce the dimension of the set of all the parabolas through the points by 1.

Other quadratics

You could also consider other families of quadratic curves, otherwise known as conics, through various numbers of points. E.g., what are all the ellipses through 3 points?

Cubic curves

For the case of cubic curves, the general cubic in the xy plane is given by an equation of the form

a₁x³ + a₂x²y + a₃x² + a₄xy² + a₅xy + a₆x + a₇y³ + a₈y² + a₉y + a₁₀ =0

So every time you specify a point has to be on a cubic, you would generally expect the dimension of the family of cubics to go down by 1.
Although there are 10 parameters in the above family, the space of cubics is only 9 dimensional, because scaling all the coordinates gives the same curve. So, the parameters are considered as a set of ratios:

( a₁: a₂ : a₃: a₄: a₅: a₆: a₇: a₈: a₉: a₁₀ )

Or, you could scale one of these numbers to be 1, which will be possible unless it is zero. Considering as ratios doesn't leave anything out. This is 9 dimensional projective space. Projective space can be considered as collections of sets of ratio.

The construction of -((A+B)+C):

(go to this desmos)

The construction of -(A+(B+C)):

(go to this desmos)

Here are both constructions on the same graph, where we can see that in this example -((A+B)+C) = -(A+(B+C)). But we need to show that this always happens.

(go to this desmos)

The points on the curves shown in this picture are A, B, C, B+C, -(B+C), A+B, -(B+C), and the two points -((A+B)+C), -(A+(B+C)), which we want to show are equal. There is also another point, at infinity (0:1:0). This is the point where the vertical lines meet. We could change to a different zero of the group law to avoid this, but this is the standard zero, so I'll leave this as it is, though might consider adding another example at some point.
In general the points O,A,B,C,(A+B),(B+C), -(A+B), and -(B+C), are all different points. If they are not, then there is a special case to deal with, but the special cases are easier than the general case, so I won't bother talking about them.
I have coloured the sets of lines in two colours in the above figure. There are the following lines, given by the zeros of linear functions g₁, g₂ and g₃ and h₁, h₂ and h₃ :

g₁(x,y)=0 line through A, B, -(A+B)
g₂(x,y)=0 line through B+C, -(B+C), O
g₃(x,y)=0 line through A+B, C, -((A+B)+C)

The union of the set of black lines is the set of zeros of

g₁g₂g₃(x,y)=0

h₁(x,y)=0 line through C, B, -(C+B)
h₂(x,y)=0 line through B+A, -(B+A), O
h₃(x,y)=0 line through C+B, A, -(A+(B+C))

The union of the set of green lines is the set of zeros of

h₁h₂h₃(x,y)=0

The elliptic curve itself (red curve) is given by a cubic equation

f(x,y)=0

In this example, f(x,y)=-y² + x³ + ax + b
For a point P=(x,y), let f(P)=f(x,y). Note that f(P)=0 for all the points P= A, B, C, B+C, -(B+C), A+B, -(B+C), -((A+B)+C), -(A+(B+C)) and O. Actually, this is not quite true, since we have not defined f(O). To do this, we work instead in projective coordinates, (X : Y : Z), with the relation x=X/Z and y=Y/Z. Then the curve is defined by a cubic F(X,Y,Z)=0, which is an equation of the form

F(X,Y,Z)= a₁X³ + a₂X²Y + a₃X²Z + a₄XY² + a₅XYZ + a₆XZ² + a₇Y³ + a₈Y²Z + a₉YZ² + a₁₀Z³

and we set f(x,y)=F(x,y,1). The condition for O=(0:1:0) to be on the cubic is obtained by setting F(0,1,0)=0, giving that the coefficient of Y³ is zero: a₇=0.
The eight points A, B, -(A+B), B+C, -(B+C), O, A+B and C lie on all of the following three cubics (by which I mean set of zeros of a equation which has terms being maximal degree 3):

f(x,y)=0
g(x,y):=g₁g₂g₃(x,y)=0
h(x,y):=h₁h₂h₃(x,y)=0

Now assume that the eight points are in general position (meaning not special cases where the statements I want to be true don't hold; in fact using the phrase "general position" also means most cases are not special cases).
Every point required to be on a cubic reduces the dimension of the space of cubics, so since the original space is 9 dimensional, requiring 8 points to be on the cubic results in a 1 dimensional space. That is, the space can be described by one parameter.
Since for these 8 points, g(P)=h(P)=0, then also for any linear combination ag + bh, for real numbers a, b (or in some other feild if you want) we have ag(P) + bh(P)=0, so this gives a one dimensional space of curves through these points; you can use a parameter t if you want, and then the family of curves is given by g(x,y) + th(x,y)=0, together with the curve given by h(x,y)=0, corresponding to t=infinity. However, to avoid having to resort to infinity, we instead parameterise by (1-t)g(x,y) + th(x,y)=0, and when t=1 or 0, we get the cubics given by the sets of three lines. This is shown in the following picture:

(go to this desmos)

Since the space of cubic curves through these 8 points is one dimensional (projectively), and since f and g do not define the same curve, all cubic curves in this family have the form (1-t)g(x,y) + th(x,y) (this comes from (projective) linear algebra and results about spanning sets). In particular, for some t, which seems to be around t=2.25, we have f(x,y)=(1-t)g(x,y) + th(x,y).
This means that if g(P)=0 and h(P)=0, then also f(P)=0. Let Q be the point where the lines g₃(x,y)=0 and h₃(x,y)=0 intersect. In the general case, Q is different from the 8 points already mentioned. And since g(Q)=h(Q)=0, then also f(Q)=0, so Q is a point on the cubic curve f(x,y)=0, our original elliptic curve. Now we should mention Bezout's theorem, which implies that two cubics can only intersect in 9 points. And so the points Q and -((A+B)+C), both being different from the eight points A,B,...,O, and both being on f(x,y)=0 and g(x,y)=0, must be the same point, i.e., Q=-((A+B)+C). Similarly, by considering the intersection of h(x,y)=0 and f(x,y)=0, we get Q = -(A+(B+C)), and so -((A+B)+C) = -((A+B)+C), and so the group law of the elliptic curve is associative.
Note that Desmos does not seem to plot vertical lines when they should apear as a factor of an implicit equation in x and y, so the above picture won't show the vertical lines when t=0 or t=1. A correct plot would show them. There may be spurious vertical lines for other values of t.