Algorithm verification: Get all the combinations of possible words

algorithmscombinatoricspythonrecursive algorithmssolution-verification

I wanted to know if the algorithm that i wrotte just below in python is correct.

My goal is to find an algorithm that print/find all the possible combinaison of words that can be done using the character from character '!' (decimal value = 33) to character '~' (decimal value = 126) in the asccii table:

ascii table

Here the code using recursion in python:

byteWord = bytearray(b'\x20')  # Hex = '\x21' & Dec = '33' & Char = '!'

cntVerif = 0 

def comb_fct(bytes_arr, cnt: int):
    global cntVerif

    if len(bytes_arr) > 3:
        print(f'{cntVerif+1}:TEST END')
        sys.exit()

    if bytes_arr[cnt] == 126:
        if cnt == len(bytes_arr) or len(bytes_arr) == 1:
            bytes_arr.insert(0, 32)
        bytes_arr[cnt] = 32
        cnt += 1
        cntVerif += 1
        print(f'{cntVerif}:if bytes_arr[cnt] == 126: \n\tbytes_arr = {bytes_arr}')
        comb_fct(bytes_arr, cnt)

    if cnt == -1 or cnt == len(bytes_arr)-1:
        bytes_arr[cnt] = bytes_arr[cnt] + 1
        cntVerif += 1
        print(f'{cntVerif}:if cnt==-1: \n\tbytes_arr = {bytes_arr}')
        comb_fct(bytes_arr, cnt=-1)  # index = -1 means last index

    bytes_arr[cnt] = bytes_arr[cnt] + 1
    cntVerif += 1
    print(f'{cntVerif}:None if: \n\tbytes_arr={bytes_arr}')
    comb_fct(bytes_arr, cnt+1)


comb_fct(byteWord, -1)

Pseudo-code:

byteWord = bytearray(b'\x20')  # create a byte with value 20 in hex basis
cntVerif = 0  


def comb_fct(bytes_arr, cnt: int):
 
    if length(bytes_arr) > 3: 
        print(f'{cntVerif+1}:TEST END')
        stop

    if bytes_arr[cnt] == 126:
        if cnt == length(bytes_arr) or length(bytes_arr) == 1:
            bytes_arr.insert(index=0, value=32)  # Insert at the top of the byte array an other byte at index 0 (the begining) with value 32
        bytes_arr[cnt] = 32
        cnt += 1
        cntVerif += 1 
        print(f'{cntVerif}:if bytes_arr[cnt] == 126: \n\tbytes_arr = {bytes_arr}')
        comb_fct(bytes_arr, cnt)

    if cnt == -1 or cnt == len(bytes_arr)-1:
        bytes_arr[cnt] = bytes_arr[cnt] + 1
        cntVerif += 1
        print(f'{cntVerif}:if cnt==-1: \n\tbytes_arr = {bytes_arr}')
        comb_fct(bytes_arr, cnt=-1)  # index = -1 means last index

    bytes_arr[cnt] = bytes_arr[cnt] + 1
    cntVerif += 1  
    print(f'{cntVerif}:None if: \n\tbytes_arr={bytes_arr}')
    brute_force_fct(bytes_arr, cnt+1)


comb_fct(byteWord, -1)

Thank your for your help because python allow just a limited number of recursion (996 on my computer) so for exemple i can't verify if my algorithm give all the word of length 3 that can be realised with the range of character describe upper.
And more generally i would like to know if my algorithm is correct genrally (the logic).

Of course if anyone has a better idea to writte this algorithm (a faster algorithm for exemple). I will be happy to read it.

Best Answer

EDIT: a full efficient python code (see the accepted answer): https://stackoverflow.com/questions/72762403/algorithm-verification-get-all-the-combinaison-of-possible-word

EXPLANATION:

To solve this problem i think that i ve found a more elegant and faster way to do it without using recursive algorithm.
Complexity:
I think too that it is the time and space optimal solution.
As it is in time: O(n) with n the total number of possible combinaison that can be very very high. And theorically O(1) in space complexity. Concerning the space complexity because of the python language characteristics my code ,from a practical point of view, creates a lot of bytearray. This can be corrected with light modification. But for a better code check the solution posted by @ricci here that i marked as the accepted answer.
Mathematical principle used:
I am using the fact that it exists a bijection between all the number in decimal basis and the number in base 94.
It is obvious that each number in base 94 can be written using a special sequance of unique character as the one in the range [30, 126] (in decimal value) in the ascii code.
Exemple of base conversion:
https://www.rapidtables.com/convert/number/decimal-to-hex.html
The operator '//' is the quotient operator and the operator '%' is the modulo operator.

I will be happy if anyone can confirm that my solution is correct. :-)

ALGORITHM

VERSION 1:
If you are NOT interested by getting all the sequence of words starting by '!'.
For exemple in lenght 2, you are NOT interested by the words of the form '!!'...'!A' '!B' ... etc ...'!R'...'!~' (as in our base '!' is equivalent to zero).

# Get all ascii relevant character in a list
asciiList = []
for c in (chr(i) for i in range(33, 127)):
    asciiList.append(c)
print(f'ascii List: \n{asciiList} \nlist length: {len(asciiList)}')


def base10_to_base94_fct(int_to_convert: int) -> str:
    sol_str = ''
    loop_condition = True

    while loop_condition is True:
        quo = int_to_convert // 94
        mod = int_to_convert % 94
        sol_str = asciiList[mod] + sol_str
        int_to_convert = quo
        if quo == 0:
            loop_condition = False

    return sol_str


# test = base10_to_base94_fct(94**2-1)
# print(f'TEST result: {test}')


def comb_fct(word_length: int) -> None:
    max_iter = 94**word_length

    cnt = 1
    while cnt < max_iter:
        str_tmp = base10_to_base94_fct(cnt)
        cnt += 1
        print(f'{cnt}: Current word check:{str_tmp}')


# Test
comb_fct(3)

VERSION 2:
If you are interested by getting all the sequence of words starting by '!'.
For exemple in lenght 2, you are interested by the words of the form '!!'...'!A' '!B' ... etc ...'!R'...'!~' (as in our base '!' is equivalent to zero).

# Get all ascii relevant character in a list
asciiList = []
for c in (chr(i) for i in range(33, 127)):
    asciiList.append(c)
print(f'The word should contain only the character in the following ascii List: \n{asciiList} \nlist length: {len(asciiList)}')


def base10_to_base94_fct(int_to_convert: int, str_length: int) -> bytearray:
    sol_str = bytearray(b'\x21') * str_length
    digit_nbr = str_length-1
    loop_condition = True

    while loop_condition is True:
        quo = int_to_convert // 94
        mod = int_to_convert % 94
        sol_str[digit_nbr] = 33 + mod
        digit_nbr -= 1
        int_to_convert = quo
        if digit_nbr == -1:
            loop_condition = False

    return sol_str


def comb_fct(max_word_length: int) -> None:
    max_iter_abs = (94/93) * (94**max_word_length-1)  # sum of a geometric series: 94 + 94^2 + 94^3 + 94^4 + ... + 94^N
    max_iter_rel = 94
    word_length = 1

    cnt_rel = 0  # rel = relative
    cnt_abs = 0  # abs = absolute
    while cnt_rel < max_iter_rel**word_length and cnt_abs < max_iter_abs:
        str_tmp = base10_to_base94_fct(cnt_rel, word_length)
        print(f'{cnt_abs}:Current word test:{str_tmp}.')
        print(f'cnt_rel = {cnt_rel} and cnt_abs={cnt_abs}')
        if str_tmp == bytearray(b'\x7e') * word_length:
            word_length += 1
            cnt_rel = 0
            continue
        cnt_rel += 1
        cnt_abs += 1

comb_fct(2)  # Test
Related Question