# How Quickly Can the Matrix Be Solved?

The solution was finally made to run last week. Today the question is how fast the thing runs. My feel for the answer to this question has to do with the context in which I first asked it.

From 1994 to 2000 the fastest desktop computers I worked with maxed out at 400 MHz. That was true for PCs and for the more expensive DEC VAX and Alpha machines we were running. By contrast, the laptop I’m working on uses an Intel Core i7-4510U CPU with a base speed of 2.00 GHz. That would be five times faster on clock speed alone. However, that number is the base speed of each core, and depending on how many cores are running the speed can theoretically be boosted to as much as 3.1 GHz. The Windows system page indicates that the fastest core may be running at 2.6 GHz.

After that we can consider that each core in a current CPU is likely to be further optimized in terms of its internal architecture, the number of clock cycles it takes to perform each operation, the amount of available cache memory, the speed at which data can be moved in and out of RAM, and other considerations. The current chip and OS are also 64-bit while the older ones were only 32-bit, though that consideration may not have much effect on the speed of calculations.

I don’t know how to evaluate the overhead incurred by the operating system itself, but running JavaScript in a browser at least carries the overhead of the browser.

Next we can consider the number of times the matrix code had to be run for every model iteration. The matrix had to be run at least once to update the current temperature of each piece in the furnace during each time step. The number of pieces in a furnace could range from, say, 8 to 144 depending on the application, so let’s pick 50 for giggles. This number doesn’t really matter because the bulk of the work is done when trying to predict what happens in the future.

Control systems based on model-predictive simulation work by starting from current conditions and calculating what will happen in the future. If the results obtained by the simulation match the desired results then the control settings need not be changed. However, if the predicted results do not match the desired results the program has to figure out what control inputs to change. Without going into detail, the furnace Level 2 systems I wrote would march the pieces out of the furnace at the current average pace of movement through the furnace, figure out what the discharge average and differential temperatures would be, compare the predicted result with the desired result, and adjust the control settings. The differential temperature is the difference between the temperatures of the hottest and coldest nodes in a particular piece, and it had to be below a specified threshold.
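To make the two discharge quantities concrete, here is a minimal sketch of how they might be computed from a piece's node temperatures. The function names, the tolerance parameter, and the sample numbers are all made up for illustration; this is not the code from the Level 2 systems.

```javascript
// Hypothetical helper: summarize one piece's predicted discharge state
// from the temperatures of its nodes.
function dischargeSummary(nodeTemps) {
  const sum = nodeTemps.reduce((acc, t) => acc + t, 0);
  const average = sum / nodeTemps.length;
  // Differential: hottest node minus coldest node in the piece.
  const differential = Math.max(...nodeTemps) - Math.min(...nodeTemps);
  return { average, differential };
}

// Hypothetical check: the average must be near its target and the
// differential must be at or below its threshold.
function meetsTargets(summary, targetAverage, tolerance, maxDifferential) {
  return (
    Math.abs(summary.average - targetAverage) <= tolerance &&
    summary.differential <= maxDifferential
  );
}

// Illustrative numbers only.
const summary = dischargeSummary([1180, 1200, 1220]);
console.log(summary, meetsTargets(summary, 1200, 5, 50));
```

If the check fails, the control settings would need adjusting and the prediction would be run again.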

The control settings adjusted by the Level 2 system were the temperature setpoints for each zone in the furnace. A furnace might have anywhere from one to six (or more) independently controlled zones. It was up to the Level 1 systems to do the detailed work necessary to make this happen. They used PLCs to control gas flows, fuel-air ratios, pressures, movements, cooling, opening and closing of doors, operation of exhaust flues, safety systems, and so on. They were a whole world of their own and I was always thoroughly impressed with the people who built those systems.

For this example I’ve assumed that there are three effective zones of control in the furnace, and the predicted temperature of at least one piece has to be calculated for each zone. The “critical” piece in each zone was the one that was logically identified to be the hardest to heat. Pieces could be a bit over temperature or more thoroughly soaked (lower than the target differential temperature), but if they were too cold or not sufficiently soaked they could break a mill stand, which would slow the operation down. As the pieces are being marched out of the furnace the prediction code assumes that the Level 1 system will move the furnace temperatures to their setpoints for each zone if they are not already there. I ran successive iterations of the prediction code until I either got the discharge temperatures to within a few degrees of their targets or until the number of iterations got up to five. Greater accuracy could be achieved during later time steps but the model could never be allowed to run longer than its designated time step.
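The stop-when-close-or-after-five-tries loop can be sketched roughly as follows. The five-iteration cap comes from the description above; the tolerance value, the correction rule (add the error to the setpoint), and the toy prediction function are assumptions standing in for the real model, which worked differently in its details.

```javascript
const MAX_ITERATIONS = 5; // cap described in the text
const TOLERANCE_DEG = 3;  // "a few degrees"; the exact value is an assumption

// predict(setpoint) is a stand-in for marching a critical piece out of
// the furnace and returning its predicted discharge temperature.
function tuneSetpoint(predict, initialSetpoint, targetTemp) {
  let setpoint = initialSetpoint;
  let predicted = NaN;
  let iterations = 0;
  while (iterations < MAX_ITERATIONS) {
    predicted = predict(setpoint);
    iterations += 1;
    const error = targetTemp - predicted;
    if (Math.abs(error) <= TOLERANCE_DEG) break; // close enough, stop early
    setpoint += error; // hypothetical correction rule, for illustration only
  }
  // If the cap was hit, setpoint holds the last correction and predicted
  // holds the result from the setpoint before that correction.
  return { setpoint, predicted, iterations };
}

// Toy model: discharge temperature is 90% of the zone setpoint.
const toyPredict = (setpoint) => 0.9 * setpoint;
console.log(tuneSetpoint(toyPredict, 1250, 1200));
```

Capping the loop keeps the worst-case run time bounded, which matters because the whole calculation has to fit inside one time step.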

Next I considered the pace of movement through the furnace. If the furnace was running slowly it might take two hours for a piece to get through the furnace, and that would mean 180 40-second time steps. So, at 3 pieces times 5 iterations times 180 steps, we have 2700 matrix calculations. More are certainly possible.
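For anyone who wants to check the arithmetic, it works out like this. The variable names are just labels for the numbers given above.

```javascript
// Worst-case count of matrix calculations for the prediction work.
const transitSeconds = 2 * 60 * 60; // two hours to get through the furnace
const stepSeconds = 40;             // length of one time step
const stepsPerTransit = transitSeconds / stepSeconds;

const criticalPieces = 3; // one critical piece per zone
const maxIterations = 5;  // cap on prediction iterations

const predictionSolves = criticalPieces * maxIterations * stepsPerTransit;
console.log(stepsPerTransit, predictionSolves); // prints 180 2700
```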

I’ve set the example up to run 2750 iterations, which would be enough to update 50 pieces plus predict three as described. The code runs when the page is first loaded and again whenever the button is clicked. I noticed that clicking the button to repeat the test yields notably better results, so there must be some overhead associated with the initial load of the page.
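A timing harness along these lines is one way to set up such a test. It is a sketch rather than the code behind the page: `standInSolve` is a placeholder workload, not the matrix solver, and the `rerunButton` element id is made up.

```javascript
const RUNS = 2750; // 50 piece updates plus 2700 prediction solves

// Prefer the high-resolution timer when the environment provides one.
const now =
  typeof performance !== "undefined" && typeof performance.now === "function"
    ? () => performance.now()
    : () => Date.now();

// Calls work() the given number of times and returns elapsed seconds.
function timeRuns(work, runs) {
  const start = now();
  for (let i = 0; i < runs; i++) work(i);
  return (now() - start) / 1000;
}

// Placeholder workload (an assumption): sums a small array of values.
function standInSolve() {
  let total = 0;
  for (let n = 0; n < 100; n++) total += Math.sqrt(n);
  return total;
}

function report() {
  const seconds = timeRuns(standInSolve, RUNS);
  console.log("Elapsed: " + seconds.toFixed(3) + " s");
  return seconds;
}

report(); // first run, as on page load

// In a browser, rerun on a button click (hypothetical element id).
if (typeof document !== "undefined") {
  const button = document.getElementById("rerunButton");
  if (button) button.addEventListener("click", report);
}
```

One plausible contributor to the faster reruns is that JavaScript engines compile frequently executed code as it runs, so a second pass can start from already-optimized code, though I have not measured how much of the difference that accounts for here.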

I would have expected the code to take 5-10 seconds to run, but on my laptop, with the file served from local disk, the initial result was 0.08 seconds and about 0.05 seconds when the button is clicked to rerun. Measured against the rerun time, this represents a performance improvement of at least a factor of 100, which is kind of mind-blowing to me.

Served over a fast connection from the website I get results of about 0.085 seconds on load and 0.052 seconds on rerun.

On my iPhone 5S I get 0.352 seconds on load and 0.230 seconds on rerun. That may be even more mind-blowing.

Tomorrow I’ll try this on a few other systems to see how it feels. What results are you getting on your setup?
