I have an unusual programming problem and the math side of it has me stumped. It's probably a simple answer but math isn't my strongest area.
I've generated a unique string of 7 characters which are each randomly selected from these possibilities: ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789
for example A6HJ92B
and I need to convert it to a unique number value. When converted, no two versions of this random string can be the name number.
I could just generate a number rather than including letters in the original id, but of course that means I have to increase the length of my string, and it's possible that the user of my application may want to type this string, as it identifies his "session" in an application, so I want to keep it short.
So my idea was to build a table like this:
A : 1,
B : 2,
C : 3,
D : 4,
E : 5,
F : 6,
G : 7,
H : 8,
... you get the idea ...
5 : 31,
6 : 32,
7 : 33,
8 : 34,
9 : 35
And then I'd add all of the numbers up…
A6HJ92B
:
A : 1
6 : 32
H : 8
J : 10
9 : 35
2 : 28
B : 2
1+32+8+10+35+28+2 = 116
…but I realized this is a flawed idea because many possible strings will "collide" or equal the same number. I need each unique string to equal a unique number.
So even if I multiplied each character's value (1*32*8*10*35*28*2 = 5,017,600
), I'm thinking there might be possible collisions there too.
Is there a way to calculate this in a way that eliminates collisions? If the collisions cant be eliminated, what methods can I use to minimize them?
Best Answer
You have $35$ characters, so you can effectively convert the string to an integer in base $35$. It would be better to start counting from $0$ instead of $1$ to get values from $0$ to $34$:
Now for a given string $a_6a_5a_4\ldots a_1a_0$, your function can be $$a_635^6 + a_535^5 + \ldots + a_135 + a_0.$$ Your example of
A6HJ92B
would correspond to: $$0\cdot35^6 + 31\cdot35^5 + 7\cdot35^4 + 9\cdot35^3 + 34\cdot35^2 + 27\cdot35+1 = 1\ 639\ 110\ 971.$$ Is there a reason you don't include the digit0
? It would seem more natural to include it (thus counting in base 36) and to order the symbols0123456789ABC...Z
.