MATLAB: Is cputime a linear function of the number of for-loops

cputimeMATLAB and Simulink Student Suite

Can anyone help solve the following problems? Thank you in advance.

I tried two ways to measure my CPU timing: 1) using the Profiler; 2) using the cputime function.

When I used the Profiler, I realized that this cpu time was a linear function of the number of for-loops; but when I used the cputime function, this was not a linear but a exponential. Why were they different? And why the timing of using the Profiler was larger than that of the cputime function? Can anyone help me?

Best Answer

This is a complicated question, in the sense that you need to understand the mathematics FIRST. And I think you do not understand that from your question. But then you need to understand things about how your computer works. And that can get more complicated.

The mathematics is simple, in a sense.

If I have one loop, from 1 to n, and if each pass through the loop consumes k CPU cycles, then the loop requires n*k cycles. We might say this is a process of order n*k, writing that using big O notation, thus O(n*k).

A nested for loop, so one loop inside a second loop, would then use O(k*n1*n2) cpu cycles, where n1 and n2 are the lengths of the respective loops. Assuming both loops are the same lengths, then we could say the process now is O(k*n^2).

So a triply nested for loop, uses O(k*n^3). You should follow the pattern. If I have loops that are nested to a depth of D, then the process will use O(k*n^D) cycles.

Therefore, If each loop does nothing inside the loop but start another nested loop until we get to the innermost loop, AND each loop is the same length as the others, then the process is indeed exponentially growing, where that exponential growth is in the form of powers of n. So at its simplest, you would see exponential growth behavior, AS A FUNCTION OF THE NUMBER OF NESTED LOOPS AND THE LENGTH OF THE LOOP.

Essentially, it takes a computer 10 times as long to compute 10 times as much stuff. Computers are linear in that respect. Thus 1 core computes numbers linearly, one number at a time. Except that it does not really perfectly work quite that way. Why not? Because CPUs now have multiple cores. And SOME of the time, MATLAB can farm out a process to multiple cores. So my computer has 8 real cores. So on some operations where it can do that efficiently, on SOME larger problems, MATLAB automatically can create up to 8 parallel threads, thus doing my computation 8 times as fast. (Hyperthreaded cores are a mirage though.) And then there is pipelining, to speed things up just a bit on long computations.

Don't forget though that MATLAB is not the only thing that is running on your computer. If you try to measure the time required by some operation, and you are surfing the web or answering your mail while it runs? That is a bad idea, since now you are stealing CPU cycles, AWAY from the MATLAB process. Just moving and clicking the mouse can steal cycles. And your computer is always doing something in the background. Sometimes it may be refreshing a web page. Sometimes doing an automatic backup on your comuter, or running an anti-virus application. Is your mail tool running in the background? Mine is all the time. When new mail arrives, it downloads it from the mail server, then a little banner pops up to tell me I have mail. Those are CPU cycles. All sorts of crap gets done without you thinking about it. So measuring the time required simply by counting the number of milli-seconds between two points in your code is at best a rough approximation.

Worse, do you think that measuring the time using a tool like the profile tool is the SAME as the time required without profiling being done? WRONG. In order to count the operations, your computer needs to do a little extra work between every operation! Of course it takes longer when you are running the profiler! You are telling your computer to count things at the same time! So it must take more time. The goal is that the extra time to use the profile tool is not significant. What you care about when profiling code is the relative times for each part of your code, not so much the actual overall number.

Does this mean you can measure the time of a small something, do it twice, and it will take your computer twice as long to finish? Probably not, since there is more to worry about. That small something takes so little time to execute, that it is done before you could measure the time used. cputime (or the tic and toc functions) is in fact a bad way to measure time, because it is trying to measure the real time consumed between two points. Instead, you need to run the computation many times, then divide the total time by the number of evaluations. This gives you a better measure of the time taken.

As well, the very first time you call a function, MATLAB does something extra - caching the function in memory so that the NEXT time it is called and every future call will require less work. This "warm up" cost can be significant.

The point of all this is, you need to understand the mathematics. But also how your computer works. In general, use tools like timeit to measure the time for a specific process. timeit warms the code up, then computes a measure like the median of the times over multiple calls. This will be less sensitive than the mean, in case your computer suddenly got busy doing something else mid-process.

Sadly, I've left a lot out, and oversimplified a lot in the above explanations. There are other subtle factors to consider, and if you look VERY closely at the times required, measured as accurately as possible, things can get very interesting.

Best Answer

Related Solutions

MATLAB: Hyperthreading question

MATLAB: Tic toc in subfunctions

Related Question