It appears that your MATLAB code has 64 separate inputs, and that fixed-point conversion is selecting anywhere from 3 to 7 bits per input. If I guess a median of 5 bits per input, that accounts for 320 pins. You will most likely need to stream these inputs in one at a time, either consuming them as they arrive or storing them in a RAM, as your algorithm requires.
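As a rough illustration of streaming the inputs in one at a time, here is a minimal sketch. The function name `stream_in_sketch`, its ports (`sample`, `valid`, `rd_addr`), and the use of plain doubles are all assumptions of mine, not taken from your code. Whether the persistent buffer ends up as a RAM or as registers depends on your HDL Coder settings.

```matlab
% Usage: push 64 samples, one per call, then read one back.
for k = 1:64
    [~, frame_done] = stream_in_sketch(k, true, 1);
end
[rd_data, ~] = stream_in_sketch(0, false, 10);   % reads stored element 10

function [rd_data, frame_done] = stream_in_sketch(sample, valid, rd_addr)
% Hypothetical sketch: accept ONE input element per call instead of 64 at once.
% The 64-element buffer lives inside the design, so only a single
% element-wide input port is needed at the top level.
persistent buf wr_idx
if isempty(buf)
    buf = zeros(1, 64);   % assumed depth: one slot per original input
    wr_idx = 0;
end

rd_data = buf(rd_addr);   % read before write, like a simple RAM port
frame_done = false;

if valid
    wr_idx = wr_idx + 1;
    buf(wr_idx) = sample;
    if wr_idx == 64
        frame_done = true;   % all 64 elements have arrived
        wr_idx = 0;          % wrap around for the next frame
    end
end
end
```

If your algorithm can consume each element as it arrives, the buffer can be dropped entirely and only the counter kept.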
In addition, you are generating two output vectors of 24 * 64 bits each, which is 1536 pins per output. Once again, you need to serialize the outputs and stream the data out rather than requesting such large arrays. In the worst case, you can use a RAM to buffer the outputs and start streaming them out once enough data has been calculated.
I don't know the internals of your algorithm, but you do. You may need no buffering, line/column buffers, or full I/O matrix buffering; it depends on your algorithm and implementation. In any case, you should be able to drop the input pin requirement to what is needed for a single input element. The two outputs should likewise be reducible to a pair of 64-bit streaming outputs. If pins are really restricted, you should be able to multiplex the two outputs onto one port and reduce the 128 bits to 64.
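For the output multiplexing, a minimal sketch might look like the following. The name `mux_out_sketch` and the `sel` flag are hypothetical. The sketch also assumes that each pair of values `a`, `b` is held steady for two consecutive calls, so that both can be sent over the single shared port.

```matlab
% Usage: the same pair (a, b) is presented on two consecutive calls.
[d1, s1] = mux_out_sketch(11, 22);   % first call sends a
[d2, s2] = mux_out_sketch(11, 22);   % second call sends b

function [dout, sel] = mux_out_sketch(a, b)
% Hypothetical sketch: time-multiplex two output streams onto one port.
% sel tells the receiver which stream the current dout belongs to
% (false -> a, true -> b).
persistent phase
if isempty(phase)
    phase = false;
end

if phase
    dout = b;
else
    dout = a;
end
sel = phase;
phase = ~phase;   % alternate streams on every call
end
```

The cost is that each output stream only gets every other cycle, so this is only worthwhile if the pin count matters more than throughput.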