Calculus… or Statistics?

Most roads in mathematics instruction seem to lead to calculus before any but the most basic statistics. Many statistical techniques require calculus to derive, but it isn’t usually necessary, short of advanced or novel applications, to know calculus when applying those techniques. What’s more important is understanding how to apply them and when they are appropriate, which of course could be said of any tool.

I went through the standard progression, ending in calculus and differential equations, and learned a reasonable amount of statistics only later when it came up in my work. As an engineer I used a certain amount of calculus, but most people don’t. I wish I’d had a better grounding in statistics earlier.

Both tools provide important insights and serve to quantify phenomena but, in my mind, calculus is generally used to provide point values (if X, then Y) while statistics are geared toward calculating probabilities (if X, then range of Ys). The difference is that some of the Xs in probabilistic systems are themselves variant (more properly, if range of Xs, then range of Ys).

Calculus might be used to answer questions like:

  • What cross-sectional area is needed to support a specified weight under tension?
  • How will temperature change when a given amount of heat is added?
  • What dimensions of a cylindrical can will require the minimum use of sheet metal to enclose the maximum amount of volume?

Statistics might be used to answer questions like:

  • What are the chances that a specified number of events will be completed in a specified amount of time?
  • What is the probability that a part will experience problems that will require a machine to shut down, over a given period of time?
  • What is the expected amount of downtime for a production process?
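As a rough sketch of how the last two questions might be approached, suppose failures arrive at a constant average rate (an exponential failure model, which is my assumption here and not something every machine obeys). The function names, the failure rate, and the repair time below are all hypothetical, chosen only for illustration.

```python
import math

def prob_at_least_one_failure(failure_rate_per_hour, hours):
    """Chance of one or more failures in the period, assuming a constant failure rate."""
    return 1.0 - math.exp(-failure_rate_per_hour * hours)

def expected_downtime_hours(failure_rate_per_hour, hours, mean_repair_hours):
    """Approximate expected downtime: expected number of failures times mean repair time.

    This ignores the fact that a machine cannot fail again while it is down,
    which is reasonable only when downtime is a small share of the period.
    """
    return failure_rate_per_hour * hours * mean_repair_hours

# Assumed figures: one failure per 1,000 hours on average, 4 hours per repair.
p_fail = prob_at_least_one_failure(0.001, 1000.0)
downtime = expected_downtime_hours(0.001, 1000.0, 4.0)
print(f"P(at least one failure) = {p_fail:.3f}, expected downtime = {downtime:.1f} h")
```

Note that neither answer is a guarantee about what will happen in any one period; each describes what to expect over many such periods.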

It’s easy to see how all of these questions factor into making decisions.

The ideas of calculus and statistics can get conflated as they are actually used. Thinking about the first calculus example, we might use the calculation to design an elevator to lift a certain amount of weight. If we make the cable just thick enough to hold the maximum weight, it might fail if the elevator is overloaded or the cable is worn or defective. Elevators are therefore designed with numerous safety factors, including multiple cables, each of which is strong enough to hold more than the expected weight of the elevator and its load, as well as other mechanisms. The safety factor recognizes the possibility that things can go wrong, even if the probabilities are not calculated explicitly. As a result, cable-driven elevators are about the safest form of passenger transport known.

The third calculus example is a direct form of optimization; the can example is often used in beginning courses. Different kinds of optimizations can be performed using statistics.
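For readers who haven't seen the can example, here is a minimal sketch of the usual textbook version: fix the volume V, write the surface area of a closed cylinder as S(r) = 2πr² + 2V/r, and set dS/dr = 4πr − 2V/r² to zero, which gives r = (V/2π)^(1/3) and a height equal to twice the radius. The 355 cm³ volume is just an assumed example, and the function names are mine.

```python
import math

def surface_area(radius, volume):
    """Sheet metal needed for a closed cylinder of the given volume and radius."""
    return 2 * math.pi * radius ** 2 + 2 * volume / radius

def optimal_can(volume):
    """Radius and height minimizing surface area, from setting dS/dr = 0."""
    radius = (volume / (2 * math.pi)) ** (1 / 3)
    height = volume / (math.pi * radius ** 2)
    return radius, height

# 355 cubic centimeters is an assumed example volume.
r, h = optimal_can(355.0)
print(f"radius = {r:.2f} cm, height = {h:.2f} cm, area = {surface_area(r, 355.0):.1f} cm^2")
```

Every input is fixed, so the calculation is done once and yields a single answer.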

Calculations where none of the inputs vary need only be performed once. If the inputs do vary according to known distributions (this is different from purposefully varying system parameters to test different design cases), then the calculations have to be run many times to generate a range of results.
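To illustrate the difference, here is a small sketch using the cable idea: stress is load divided by cross-sectional area. With a fixed load the calculation runs once. With a load drawn from a distribution it is repeated many times and produces a spread of results. The normal distribution, the load figures, and the cable area are all assumptions made up for the example.

```python
import random
import statistics

def cable_stress(load_n, area_m2):
    """Tensile stress in pascals for a load in newtons on an area in square meters."""
    return load_n / area_m2

def simulate_stress(n_trials, mean_load_n, sd_load_n, area_m2, seed=0):
    """Repeat the stress calculation with the load drawn from an assumed normal distribution."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [
        cable_stress(rng.gauss(mean_load_n, sd_load_n), area_m2)
        for _ in range(n_trials)
    ]

# Fixed input: one calculation, one answer.
fixed = cable_stress(10_000.0, 1e-4)

# Varying input: many calculations, a range of answers.
results = simulate_stress(10_000, 10_000.0, 1_000.0, 1e-4)
low, high = min(results), max(results)
mean = statistics.mean(results)
print(f"fixed = {fixed:.3e} Pa; simulated range = {low:.3e} to {high:.3e} Pa")
```

The simulated average lands close to the fixed answer, but the spread around it is the part that a single calculation cannot show.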

Inputs to calculus problems are all fixed rates and quantities, while inputs to statistical problems are themselves probabilities. At scale, however, the results of the different types of analyses converge. The molecules of water in a steam turbine can be very hot or cold on a statistical basis, but on average the steam behaves uniformly, because the individual variations are small relative to the size of the system. The randomness of individual molecules doesn’t matter, and calculus is therefore used to analyze that behavior. If a system is “chunkier”, as in maintenance actions performed on a group of machines over a limited period of time, each statistically variable event stands a larger chance of affecting the behavior of the system as a whole, so statistical methods are used.
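The effect of scale can be seen in a toy experiment: average a number of independent random values and watch how much that average wanders from one trial to the next. For independent values the relative spread of the average shrinks roughly as one over the square root of the count. The uniform distribution, the counts, and the function name here are arbitrary choices for illustration, not a model of steam or of any real maintenance process.

```python
import random
import statistics

def relative_spread_of_average(n_items, n_trials=200, seed=0):
    """Standard deviation of the average of n_items random values, relative to its mean."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    averages = [
        statistics.fmean(rng.uniform(0.0, 2.0) for _ in range(n_items))
        for _ in range(n_trials)
    ]
    return statistics.stdev(averages) / statistics.fmean(averages)

chunky = relative_spread_of_average(10)    # a few large, variable events
smooth = relative_spread_of_average(1000)  # many small, variable events
print(f"10 items: {chunky:.3f}   1000 items: {smooth:.3f}")
```

With only ten items the average moves around noticeably between trials, while with a thousand it barely moves, which is why the large system can be treated as if its inputs were fixed.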

Finally, it’s always important to understand the assumptions and results of each specific analysis. For example, the machine maintenance problems I worked on generated outcomes according to a rather naïve set of rules. The human managers of the system being modeled were known to be able to reorder events to provide better outcomes in many cases. We therefore had to understand that the results we were getting, even though themselves variable, were likely to be somewhat pessimistic. They provided an envelope of worst-case outcomes.

It is also important to understand the methods themselves. It is all too easy to make errors if one does not grasp the subtleties in the techniques of either calculus or statistics.
