The reason for the conventions used by modulus notation

elementary-number-theory, modular-arithmetic, notation, number-theory

Apologies in advance for what I anticipate will be a very dumb question.

To give some background:

I am a software architect who has been programming since the age of 8 and professionally for the last 15 years. I never had any formal university-level training; everything I know is self-taught. As such, there are unorthodox gaps in my knowledge; for example, I have in-depth knowledge of subjects like set theory or statistics (given that I primarily work on business software), but when it comes to something like calculus, the extent of my knowledge is that it exists and that it’s used in advanced graphics.

However, now my career may take a turn to where I need an in-depth understanding of cryptography (on a mathematical / theoretical level). So, here I am in my 30s, going back to the roots and learning abstract, theoretical math & comp sci. I’ve hired a tutor for the purpose, who is an undergrad student in a comp sci program that’s known for being very theory-focused.

The question:

Recently, he started teaching me modular arithmetic notation. Being a software engineer, I am obviously deeply familiar with the modulus operator; in the world of programming we use the % operator for the purpose, e.g. 3 % 5 = 3.

However, I was told that in mathematics, the notation is to arbitrarily add mod X to the end of an equation/problem block. So, everything is written as usual and there is a note at the end of the equation/problem specifying “under what modulus” everything in the problem is.

This makes no sense to me whatsoever, and when I asked my tutor he said that he asked the same question in his class and the professor replied that “it’s just the way it’s done”. I understand that conventions can be unique, but this to me feels like a very radical departure from the way math is usually written down, and because of that I feel like there has to be an underlying reason for it that I am not seeing. I am hoping that someone much more knowledgeable than me can help clarify my few questions and help it all make sense.

  1. Math is typically written left to right, with parentheses and/or other symbols defining blocks/scopes. For example, the body of a square root can include massive formulas, but the scope of the square root is still visually defined. The same goes for parenthetical blocks and global operations applied to an entire block.

Operations also follow in sequence, each acting on a value and taking an argument. For example, to get the sum of 4 numbers, we would write (2+5+7+8) (three individual operations), not (2,5,7,8 +) (apply this operator to all numbers in the set).

But with the mod operator, it seems like it’s an arbitrarily-placed footnote at the end of a block, which on top of everything contains extremely vital information. There is no purpose to reading whatever formula is inside the block without first knowing “under what modulus” it is, so how does this work out in academia with page-long formulas?

  2. How exactly does scoping work?

From my understanding, all of the below is syntactically legal:

3x = 15y mod 5

(2x + 8y) - 12z mod 5

(2x + 8y)(12z - 5x) mod 5

What happens if my problem is using a different modulus/base for different parts of the problem?

Would this be legal? (3x + 4y mod 5) - (8z - 2a mod 8)?

What if I have nested clauses? E.g. (3x + (2a - 2b mod 7)^2 + 4y mod 5) - 17z mod 3?

Thanks so much in advance to whoever can help me make sense of this system!

Best Answer

In programming terminology, the symbol $\bmod$ is "overloaded" in math to mean two different things: the modulus operator, and the "congruent mod $r$" relation.

The operator, written $a \bmod r$, is the equivalent of your % operator. You can think of it as a function taking two integers and returning an integer: %(int a, int r) -> int.

The relation, written $a \equiv b \pmod r$, is effectively a predicate: it is a statement with a true/false truth value, like a function returning bool. So you can think of it as cong(int a, int b, int r) -> bool. The connection between them is that cong(a, b, r) := ((a - b) % r == 0). Or ignoring what % might do for negative numbers, cong(a, b, r) := (a % r == b % r).
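A minimal sketch of the two meanings in Python (the function names `mod` and `cong` are just illustrative labels, not standard terminology; note that Python's % already returns a nonnegative result for a positive modulus, so the negative-number caveat doesn't bite here):

```python
def mod(a: int, r: int) -> int:
    """The operator: a mod r, the remainder of a divided by r."""
    return a % r

def cong(a: int, b: int, r: int) -> bool:
    """The relation: a ≡ b (mod r), true iff r divides (a - b)."""
    return (a - b) % r == 0

print(mod(3, 5))        # 3, matching the % operator from programming
print(cong(17, 2, 5))   # True: 17 - 2 = 15 is divisible by 5
print(cong(17, 3, 5))   # False: 17 - 3 = 14 is not divisible by 5
```

The key point: `mod` is a function returning a number, while `cong` is a predicate returning a truth value; the "mod 5" at the end of a congruence belongs to the relation, not to either side of it.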

Now, it may be that an extended passage of a math paper will be "working mod $r$". Formally, this means that all operations are not intended as operations on the integers $\mathbb{Z}$, but on the ring $\mathbb{Z}_r$ of integers mod $r$. Without getting into abstract algebra, you can think of it as roughly "the notation $a = b$ is now redefined as cong(a, b, r)". Two numbers that are congruent mod $r$ are now considered to be the same number; they compare as equal.

It's true that the "scope" may not be specified explicitly, and the author may expect you to understand from context which ring we are working in. Usually, once you know enough abstract algebra to be able to read the paper at all, this does not lead to any confusion. A paper is after all written for a human mathematician to read, not for a compiler to parse.

However, if we say something like "Let $a,b,c \in \mathbb{Z}_r$", this means that a,b,c are "declared", not as integers, but as objects of a class for which the == operator is overloaded as cong(a, b, r). And so a following expression like $a + b = c$ means cong(a+b, c, r) == true, where you can think of the + operator also having been overloaded to return an object of the $\mathbb{Z}_r$ class.
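To make the overloading picture concrete, here is a hypothetical `ModInt` class (an illustration of the idea, not a standard library type) in which `==` and `+` are overloaded exactly as described:

```python
class ModInt:
    """An element of Z_r: equality means congruence mod r."""

    def __init__(self, value: int, r: int):
        self.r = r
        self.value = value % r  # store the canonical representative in [0, r)

    def __eq__(self, other: object) -> bool:
        # == is overloaded to mean "congruent mod r"
        return isinstance(other, ModInt) and self.r == other.r \
            and self.value == other.value

    def __add__(self, other: "ModInt") -> "ModInt":
        # + is overloaded to return another element of Z_r
        return ModInt(self.value + other.value, self.r)

    def __repr__(self) -> str:
        return f"{self.value} (mod {self.r})"

# "Let a, b, c ∈ Z_5":
a, b, c = ModInt(3, 5), ModInt(4, 5), ModInt(12, 5)
print(a + b == c)   # True: 3 + 4 = 7 ≡ 12 ≡ 2 (mod 5)
```

Once `a`, `b`, `c` are "declared" this way, the statement $a + b = c$ needs no trailing "(mod 5)": the modulus is baked into the objects themselves, which is exactly what a mathematician means by "working mod $r$".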