End-of-Life for Old Computers

Many of the Level 2 systems I wrote were for new mills but a few were replacements for older systems. Replacements were sometimes sought simply because the newer versions were more efficient and saved more fuel, but sometimes there was a rush to replace an older system because repair parts could no longer be found. This surely applied to many different types of hardware but in this case I’m thinking specifically of DEC PDP-11/73 hardware running a flavor of the RSX-11 OS. I remember systems that looked like this Q-bus rack shown here.

Each system had at least a CPU board, an I/O board, and some kind of graphics board. There was probably a drive controller board as well and there may have been others. As working examples of these boards became increasingly scarce the demand for them at surviving installations went up. In order to circulate them quickly it was still possible in the mid-90s to arrange with a commercial airline to walk to a departure gate, hand a board off to a flight attendant, and have a customer pick it up at the arrival gate in the destination city. Try to imagine that happening today. (Shipment of parcels is now far cheaper and faster than it was then, of course.)

The boards were sometimes sent to needy customers as an act of goodwill but in other cases there may have been a charge. The airlines were certainly paid. From a pure economics viewpoint one can see that price would go up as supply decreased. Whether you measure expense in terms of money or effort, the last available boards demonstrated economic theory exactly as expected.

As an aside, this paper provides an interesting and concise history of control technologies. I was in the field as a myriad of specialized computers were giving way to PC-based platforms. I specifically worked with DEC Vax and Alpha machines running VMS and led our company’s transition to PCs running Windows NT.


Three Plates of Spaghetti

Spaghetti is great for a meal but not so much for code.  I’ve had to deal with three major plates of spaghetti code in my career.  Here’s how they went and what I learned.

I got to write all my code from scratch for the first few years of my working life, which was nice.  The one brush I had with anything that looked like spaghetti was quickly dispatched.  I had to adapt my predecessor’s code for one or two jobs when I first started writing model-predictive control systems for steel mills, but that only took a week or so in each case.  They were similar to several existing systems so adapting them was simpler.  The truth was that I had chosen to discard his code the first day I was there and build my own from scratch.

Several years later, though, I took custody of a control system that had been adapted and hacked over by several different people over the course of a decade.  I got to mentor the team that wrote the replacement but I was the high-level language guy that got to keep the old one running.

Our parent company’s equipment was originally run by Programmable Logic Controllers (PLCs) but it was cheaper to use a PC and some simple serial I/O devices.  The person charged with writing the original PC software took a shortcut — he mimicked the PLC ladder logic in C++ code, though on an orderly, object-oriented basis.  It was clever and quick and it worked, but those who edited the code after him did it differently.  The first person added code in a consciously high-level way, as C++ that looked like nice C++.  The person or people who worked on it after that seemed to add patches any old way they wanted.  The mods I made were always in the style of the affected location.  It was hard enough to make fixes and mods without breaking anything; imposing order in a major way was a lower priority on a system that was due for replacement.

It was actually a pretty nice system.  It allocated memory for the controls for each furnace after reading a configuration file.  That file had 250 possible configuration settings and the international language file had another 700 entries.  The innards of the thing, though, were difficult to tease apart and doing so took more than a year.  Requests for fixes and mods came in at irregular intervals and it took a while for the internal structure to become clear to me.  It was ultimately replaced by the new version and I made sure that team included a framework for documentation and configuration management from the beginning.  Today I would do this in an even more aggressive and organized fashion.

I encountered another mess at my next job.  In this case the PC software had two components, a driver unit that handled the communications and much of the configuration and a UI unit that allowed the user to configure and use interface screens.  The UI unit was just a hair buggy but it was licensed from a third-party developer and we could neither see nor modify its code.  That was a problem before I was there, while I was there, and apparently long after I was there.  The driver section was just entropic in general.  I could always bang on it to get it to do what it needed to do (it took a few late nights) but I never intuited any real structure.  One of my colleagues who worked on it full time ended up getting a better grip and ultimately added a whole new structure to it.  He did some nice work.

I also had to write installers using InstallShield but, at least for that version, there seemed to be two different paradigms for doing things.  The installer was implemented in one when I started but the other one seemed much more straightforward and I eventually converted to it.  We also discovered that neither the PC software nor InstallShield dealt with Windows ME very well, so we simply chose not to support it.  It was unlikely anyone was going to try to run control software on that OS anyway.  I learned a lot more about how the Windows OS worked, that’s for sure.

The final tangle I encountered was a simulation of aircraft maintenance that had existed in various forms since the 1960s.  It was still supported by an engineer who was close to 80 years old and others who had worked on it over the years (the owners of the company) were all in their late 50s.  The owners had become managers and didn’t get into the code much but when they and the senior engineer did work on it they tended to just patch it.

This code was a problem for a bunch of reasons, some of which were legitimate.  The model itself was modified over the years to cover many different operations.  In particular it had code and variables related to operations on aircraft carriers (usually involving constraints of landing and service locations).  The analyses we were running did not consider carrier ops explicitly and the code had never been consciously reworked to remove the relevant references.  Indeed, after so many years the requirements and core modeling assumptions had changed on numerous occasions (sometimes for very good reasons) so the code naturally reflected all of that drift.

The people who did most of the work on the code were primarily concerned with the analyses they were performing and the code itself was a means and not an end.  They were all smart and capable but not always motivated by the latest ideas in design, consistency, modularity, and configuration management.  The code was so intertwined that changes often had unexpected side effects.  Testing was not exceptionally thorough and never had been.

Another issue was that the code was written in a rather obscure language called GPSS.  It is a discrete-event simulation language whose low-level structure reads much like assembly language, even though far more is going on behind the scenes.  If you want to create really organized code in this language you have to work at it.  It’s very easy to let entropy creep in, especially for practitioners looking at it through more contemporary eyes.  It didn’t help either that this particular code was the most complex that had ever been written in the language.  It was big and had a lot of moving parts.

Substantially restructuring the code or replacing it entirely was going to be difficult for both financial and political reasons.  The financial limitation was that the team was funded by incremental yearly contracts (with the occasional bump for a one-off analysis) and there was never time to do much more than what was needed.  The first political limitation was that the software team employees were all subcontractors and the prime contractor was always angling for more involvement.  If the software was ever rewritten from scratch the prime might be able to ease the sub right out of the picture.  The other political limitation was that a replacement would have to demonstrate its credibility anew with the customer, even though it would likely be a superior product.

To the good, the team in charge of the software was fully aware of all these issues and did what they could over time to hammer things into shape.  Old bits were slowly and carefully excised and various operations and structures were refactored to make them more consistent and approachable.  A series of regression tests was developed over time in the form of multiple input configurations that started simply and progressively tested more and more capabilities of the code.  The biggest improvement was the creation of a wrapper framework written in C# that managed all the inputs and outputs of the model, which were many and came in many different formats.  That wrapper had problems of its own but they were minor in comparison.

The system was subject to a formal configuration management process and most of the documentation of the configuration, structure, and operation of the system was updated fairly regularly.  The structure of the code itself was not externally documented but everything else was.

So what is the takeaway from all this?  First, there are at least some legitimate reasons for spaghetti code to exist.  It would be better if code never got into that state but once it gets there it can be difficult to replace.  Much of modern software development practice has to do with managing complexity (everything from test-driven development to source code control and build automation systems to functional programming) and that goes a long way to preventing this kind of pastafication.  Most of the rest is experience and good management.

If you are confronted with a plate of spaghetti try to do the following:

  • Be careful at first.  Don’t try to make radical changes; you don’t yet know what you might break.
  • Read any existing documentation in any form (e.g., user manuals, configuration guides, maintenance logs, developer notes, data descriptions, source code control system notes, and so on).
  • Learn what you can from any people that may still be available.
  • Document whatever you can.
  • Try to identify the underlying structure or structures.
  • If any configuration, input, or output processes can be automated then make it happen.  Try to remove ways to make mistakes.
  • Add various kinds of error checking and validation if they are lacking.
  • Make sure you document your own efforts to figure things out.  If you approach things systematically you are likely to receive more support and understanding.
  • Make sure you secure access to the tools needed to build and manage the system.  That can be a challenge in itself, especially on older systems.
  • Try to discern the underlying principles at work in the system.  What is it for?  How can it be made more consistent?
  • If you’re holding the fort while a replacement is being built, make sure you share your findings with the team doing that work.
  • Refactor slowly but try to impose some order over time.

I’m sure there are more good ideas.  What suggestions would you add?


Listen, Listen, Listen

I’ve heard it said that you can learn from anyone, but you can only do that if you actually listen to them. If they aren’t talking, ask. Even if you choose not to take anyone’s suggestions, they’re likely to be happy that you consulted them.

I was talking to some of the floor guys at a ball mill in Kansas City one day as I was admiring one of their machines. Our furnace heated twenty-foot-long round billets of two to six inches in diameter, which were then dropped between two shaped rollers that pressed the bars directly into a collection of balls that dropped straight down from the mechanism and rolled away to the cooling tumbler. (The orange-red balls were always beautiful to watch but they made an incredible amount of noise; that was by far the loudest industrial facility I was ever in.)

The bars rolled out of the side of the furnace lengthwise onto a pair of indexing chains. The chains had hooks at intervals which allowed them to pull each bar sideways to where it could drop down into the shaping mechanism. The millwright I was talking to told me the indexing chains were new. The guys on the floor used to pull the bars into place with long hooks but they all thought that process was ridiculous and should be automated. The engineers didn’t listen to them.

Those same engineers, however, ended up doing that job themselves for a few weeks during a strike and finally saw the inefficiency of it for themselves. Then—and only then—did they automate the process.

It’s a little thing, but I think it speaks to a larger issue, which is that it’s all too easy to not listen to people when you don’t think you have to. I’ve seen situations where the hourly workers and even some professionals are all too happy to come in, do their piece by rote, and go home without thinking about anything too much. I also understand that relations between labor and management can be contentious. That’s fine, but there are still plenty of people who have observations that might be of use. Seeking the ideas out and discussing them keeps people involved in the process.

Most technical people, particularly those in software if they’re any good, are always interested in finding better ways to do things. Getting their feedback and incorporating their ideas (or letting them do so on their own), where possible, is one of the most important ways to enhance job satisfaction and morale in a company. Let’s face it, you can’t always pay more, and paying more doesn’t always produce results anyway. You need to take advantage of every piece of creativity and desire your people have. People want to feel that they are being listened to and actively developed.

A recent job posting illustrated these points beautifully by emphasizing that every developer got access to a nice suite of development tools and, here’s the kicker, a full subscription to Safari Books Online, which is an online repository that gives access to a huge collection of computing and technical books. All the industry classics are there along with multiple volumes teaching every language, tool, and technique you could ever want. (I’ve maintained my own full subscription for a few years now.) This may be standard procedure in some places but it isn’t in most. Providing a supportive environment generates a lot of enthusiasm and that particular posting drew an exceptional number of responses.

Employees, in the end, do have to show up and provide value as they are directed, but you can make them a lot happier and more productive by being involved in what they’re doing. Most of them want to do a great job and learn and be recognized. Don’t wait for them to come to you, go ask. And when they talk, listen to them.

An even more important group of people to listen to are your customers, though this is better understood by most people.  Sometimes the customer has already identified the problem and sometimes you have to help them figure out what the problem is.  How you listen depends on their understanding of the problem and the nature of the solution you’re offering.  If you have a canned product or service you might be the expert and the listening and negotiating will be about the terms of applying your solution.  If your solution is more adaptive or unique then you’ll need to listen and negotiate even more carefully to ensure your proposals address their needs.

How well you solve their problems, or even if you get to solve them (or continue to solve them), will depend on how well you solicit information from them and listen to the answers.


What Do You Just “See”?

One of my cousins is a totally extroverted, outgoing “superconnector”. She is all people, all the time. She’s been everything from a restaurant hostess to a greeter and mascot’s assistant at Pittsburgh Pirates games to a small businesswoman with multiple product lines. When asked to provide a starting list of contacts for Mary Kay she quickly gave them 500 names. She had a car in four months when she really got going.

During a recent conversation she observed, and mind you she said this at about five hundred miles an hour, that when she’s talking with somebody she knows “what the other person’s going to say and what I’m going to say and what they’re going to say next and what I’m going to say next and what they’re going to say after that and what I’m going to say after that about three levels out.” She kept right on going but my takeaway was that conversations and people are what she just “sees”.

While I’ve gotten a lot smoother and more aware over the years I’m never going to see the world and people like that. Everybody has a specialty, though, and mine is looking under the hood to see how all the pieces fit together. I don’t know if it’s because I played with every building toy known to Man when I was young, but once I understand how the pieces of anything work I can just “see” how they can fit together and what you can do with them. This is true of systems I’m analyzing, programming tools, data schemes, or anything else. Having seen so many systems I have a definite knack for breaking them down and figuring out what needs to be captured and changed and implemented. If technique, feel, and experience don’t get me where I need to go (I’ve heard it said that an expert is someone who knows enough to look at something and go, “that’s odd”), my willingness to wallow around in a subject until I “see” what I’m looking for will get me the rest of the way there.

I’ve also figured out how to talk to and work with all different kinds of people to make that happen, as customers and colleagues, as mentees and bosses, and as strangers and friends. I got better at this first in professional situations. A supervisor put me in the lead with clients at an early job because he thought I had “customer savvy”. I’m thinking, “Me?” I was just doing what came naturally, trying to figure out what they needed, what the system we were planning needed, and what I needed to help them.

Many professional situations are structured so you know what the relationships are supposed to be. As long as everyone is pleasant and sticks to the program things usually go well in the short run. In the longer run it’s a good idea to establish a basic framework for communication. The Project Management Body of Knowledge suggests when formal plans are necessary while frameworks like Scrum define details about those plans and Six Sigma gives advice on handling problems. Formal systems like that are good guides and you can take a lot from them. They help you “see” those things more quickly if you don’t have the experience or if they don’t come naturally. Nothing, however, can take the place of thinking about what you want to do ahead of time and being conscious of how you want to do it. Whether short or long run, it’s helpful to establish ground rules where possible, even if informally. Make sure everyone else does the same.

I’ve seen a lot of meetings and projects go well and go poorly over the years and I’ve learned from all of it. What motivates me today is less the technical stuff than making sure all the participants in a process are respected and get what they need. I love the technical part; it’s still what I naturally “see”, after all. It’s a measure of the passion I’ve developed for taking care of people that I often “see” that even more.


An Effective Framework for Verification, Validation, and Accreditation

I recently encountered a formal methodology for conducting VV&A efforts that I think is worthy of your consideration. Briefly, Verification shows whether a system works as specified, Validation shows whether the specification addresses the correct problem, and Accreditation shows whether the system is accepted for its intended use.

The method is based on a specification (MIL-STD-3022) created for the Navy by a senior analyst charged with overseeing such efforts. I was part of a large team that completed an 18-month project to achieve full accreditation for a planning tool the Navy was adopting to manage its entire fleet of F-18-series aircraft. I expect the tool will be adapted to manage other types of aircraft as well, and perhaps other types of equipment.

The details of the actual specification are less important here; it took a team of people a long time to tease apart the details and hammer the process and documents into shape to the satisfaction of the project’s advisers. I am instead providing a brief outline of how the process works, particularly the V&V part. The accreditation part is merely a wrapper around the V&V process: it states at the beginning what the review process is going to be, and at the end it shows whether the V&V was carried out correctly and whether accreditation is supported. I’m convinced that the framework is effective but I’m not convinced that there is any one way to go about it or that every jot and tittle of this formal specification needs to be followed. Plenty of software systems are successfully verified, validated, deployed, and used every day and they clearly don’t all use this method. Take the best features of every framework you find, including this one.

The basic framework is this:

  1. Write an Accreditation Plan to describe the steps that will be taken to support an accreditation of the model.
  2. Write a V&V Plan to describe the steps that will be taken to conduct the Verification and Validation of the model.
  3. Perform the Verification and Validation steps.
  4. Write a V&V Report describing the results of the V&V process. This includes a V&V Recommendation and a description of Lessons Learned.
  5. Write an Accreditation Report describing the results of the Accreditation analysis (which itself is a review of the V&V Report). This includes an Accreditation Recommendation and a description of Lessons Learned.
  6. Make an Accreditation Decision that accepts or fails to accept the system for the intended use, or accepts the system with limitations.

This process assumes that a host of artifacts have been produced and can be reviewed including:

  • Intended Use Statement
  • Conceptual Model (Description of System to be Simulated in the defined framework, but this can be generalized to be the description of any as-is system or process that is going to be addressed by a new implementation)
  • Statement of Requirements and Acceptability Criteria
  • Statement of Assumptions, Capabilities, Limitations, and Risks and Impacts
  • Input and Output Data Artifacts
  • Design Document
  • System Implementation
  • Configuration Management History
  • Test History and Results
  • Customer/SME Assessment

The following evaluations are carried out as part of the V&V process. These steps are meant to assess an implementation’s credibility in terms of capability, accuracy, and usability. Verify that:

  • Requirements map to Specific Intended Use Statement (Requirements Traceability Matrix – RTM)
  • Requirements map to Conceptual Model and vice versa (Requirements to Conceptual Model Map – RCMM)
  • Implementation items map to Requirements
  • Implementation items make logical sense
  • Sources of data are authoritative
  • Data are correct (input and output)
  • Data are correctly formatted (input and output)
  • Data can be traced through processing
  • User Interface items support all required behaviors
  • Outputs conform to specification
  • Outputs are accepted as authoritative
  • User operations and transformations are accepted as logical and appropriate
  • Full configuration management history is available
  • Functional test results are available and show all problems corrected
  • Quantitative test results are available and show all problems corrected
  • System operates correctly in target environment
  • All documentation items are complete and accepted

The VV&A process can be applied to an existing system but it’s usually better if the process is in place from the beginning of an implementation so the development team(s) can work with the VV&A team(s) with the review framework in mind. It’s an effective form of project governance and quality assurance.


This table is from the Naval specification we worked from.

This paper provides excellent background and insight into how and why this methodology was developed. As of this writing the full pdf could be accessed in the Google cache, but doing so may be problematic.


The Greatest Field Improvisation Ever

When I was working on a furnace control system in Thailand I witnessed one of the coolest feats of engineering ever. The field service guy needed to measure the flow of gasses and a manometer was nowhere to be found.

A manometer measures pressure differences using columns of fluid. In a U manometer the difference between the heights of the two columns is proportional to the pressure difference and inversely proportional to the density of the fluid in the “U”. If you want to measure very small pressure differences, which you often want to do in gas flow applications, you want a less dense fluid. A column of water at atmospheric pressure against a vacuum can be pushed up almost 34 feet, or about 406 inches of water. A column of mercury, which is much more dense, can only be pushed up about 30 inches under the same conditions.

U Manometer

Another way to measure smaller pressure differences is to stretch the manometer out and measure the height change horizontally over a shallow incline. Devices with this arrangement are called inclined manometers. You can buy these things commercially and there are electronic versions as well, but none of that was available when my colleague needed it.

Inclined Manometer

Here’s where the clever bit comes in. He ended up mounting a piece of clear plastic hose on the wide face of a two-foot length of two-by-four. It was held in place on each end by an industrial staple. A piece of molding kept the working section straight, with one end mounted about an inch-and-a-half higher than the other over a run of 20 inches. He used some quick trigonometry to convert elevation distance to linear distance along the incline and marked the board at intervals of tenths of an inch of water. The marks were far enough apart that hundredths of an inch could easily be estimated. He poured a liquid into the tube that he could see clearly.

Coffee.

…which he assumed had the same density as water. In a scene right out of Zen and the Art of Motorcycle Maintenance the device worked perfectly and my colleague had the combustion system straightened out in no time.
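
For the curious, here is a minimal sketch of the conversion he would have worked out, assuming the twenty inches was measured along the incline and that the coffee had roughly the density of water. The geometry comes straight from the description above; everything else is my own reconstruction.

    # Geometry of the improvised inclined manometer described above.
    rise_in = 1.5   # elevation difference between the ends, inches
    run_in = 20.0   # length of the working section along the incline, inches

    # Vertical gain per inch of travel along the incline.
    sin_theta = rise_in / run_in   # about 0.075

    # A reading of L inches along the incline corresponds to a column height
    # of L * sin(theta) inches of the working fluid.  Treating the coffee as
    # having the density of water, that is directly inches of water column.
    def inches_of_water(distance_along_incline_in):
        return distance_along_incline_in * sin_theta

    # Spacing of the tick marks for each 0.1 inch of water:
    mark_spacing = 0.1 / sin_theta   # about 1.33 inches apart on the board
    print(f"marks every {mark_spacing:.2f} in along the incline")

    # Convert a sample reading to SI pressure (1 inch of water is ~249.1 Pa).
    print(f"3.5 in along incline = {inches_of_water(3.5):.3f} inH2O "
          f"= {inches_of_water(3.5) * 249.1:.1f} Pa")

The marks come out about an inch and a third apart, which squares with being able to estimate hundredths of an inch of water by eye.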

When he finished the job he stuffed his creation in his suitcase and took it back to the office. The company was seeking ISO 9000 certification at the time and the big joke was whether we were going to send it out for calibration every six months as the standard called for. That obviously never happened but I would love to have seen the technicians’ faces when they opened that package!


How Not to Miss Things in a Discovery Process

I’ve been part of a lot of discovery efforts and have found a few ways to increase the chances of identifying all the relevant factors. I plan to discuss these only informally. Volumes of ink and electrons have been spilled describing formal methods like UML, Model-Driven Engineering, and Business Process Model and Notation, but I’m not talking about implementing systems based on what is discovered, only making sure the discovery process itself is thorough.

That said, discovery begins with an idea of what you’re trying to accomplish. You have to discover the factors relevant to your end goals. When I worked on training simulators for nuclear power plants the main rule guiding discovery was whether an element affected the readings on the operator panels or was affected by the controls on the operator panels. Sample ports, bypasses, and elements used only for maintenance were not included.

The goal of the card game Set is to identify groups of three cards from those that are turned face up in the playing area. Each card has a figure with four characteristics: shape (diamond, oval, or squiggle), color (red, green, or purple), fill (open, hashed, or solid), and count (one, two, or three items). A valid set is one where, for each characteristic, the three cards are either all the same or all different. The following three cards would be a valid set:

  • one purple open oval
  • one purple hashed diamond
  • one purple solid squiggle

This set is valid because the three shapes are different, the three fills are different, the three colors are the same, and the three counts are the same. When scanning the cards you might try to just stare and hope a set jumps out at you (like the one-, two-, and three-count red open ovals), but if that doesn’t happen quickly you would take a more systematic approach, right? And how would you do that?

You might look at all the cards of each color, check to see if there are any groups of three that have the same or different number of shapes (a one, a two, and a three or three threes), then see if any of those groupings are still valid after looking at the shapes themselves (all squiggles or one of each), then see if the fill patterns work (one of each or all the same). If you don’t find anything you might start with a different characteristic. On the next pass you’d start by looking at all the cards with three shapes, then look at the shapes, then the colors, and then the fills. On the third pass you might try looking at every pair of cards, figuring out what a valid third card would have to be, and then see if any such card is available.
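
For the programmers in the audience, the validity rule and a brute-force version of that systematic search are compact enough to express directly. This is just an illustrative sketch of my own, not code from any real implementation of the game.

    from itertools import combinations

    # Each card is a tuple of four characteristics: (count, color, fill, shape).

    def is_valid_set(a, b, c):
        # For every characteristic the three values must be all the same or
        # all different; two-of-a-kind in any characteristic breaks the set.
        return all(len({x, y, z}) in (1, 3) for x, y, z in zip(a, b, c))

    def find_sets(face_up):
        # Systematic search: examine every trio of face-up cards.
        return [trio for trio in combinations(face_up, 3) if is_valid_set(*trio)]

    # The example trio from the text (all purple, all one-count, different
    # fills, different shapes) plus one extra card.
    cards = [(1, "purple", "open",   "oval"),
             (1, "purple", "hashed", "diamond"),
             (1, "purple", "solid",  "squiggle"),
             (2, "red",    "open",   "oval")]
    print(is_valid_set(*cards[:3]))   # True
    print(len(find_sets(cards)))      # 1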

The point is to be sure you examine the characteristics of the cards in a systematic way. With a little practice you get faster. The person who taught me the game was downright scary at it, which inspired me to buy a version for my phone. I got faster all right, but the game was too much fun so I had to delete it!

Performing process discovery in the wild is in some ways easier than finding valid sets in the card game. In the game you have to consider four criteria in combination, but in a discovery process you can mostly consider each element of the system in isolation. You just need to be thorough about each one. To be systematic, let’s look at all the element types one by one.

Entities: Systems may process one type of entity or multiple types. How you define them is up to you. In the border station simulations I worked on the only moving entities were border-crossers in general. The fact that they were private passenger vehicles (cars, etc.), commercial vehicles (trucks, etc.), buses, and pedestrians was important, but those differences were treated as characteristics of entities, not different types of entities. A manufacturing system might treat an assembly that becomes the final product differently than the parts that get added to it. Returning to the example of border stations we might have chosen to model the movement and presence of the officers explicitly, and they could reasonably have been considered a separate kind of entity. The distinctions here are subtle, and such classifications can be difficult and may not be that important in the end. However you define them, make sure you find them all.

Entities can also be messages, transactions, permissions, reports, requests, orders, notifications, and almost anything else. Anything that moves in a system can be thought of as an entity.

The entity characteristics you must identify are those qualities that affect their processing. For example, in a border crossing the processes are affected by the mode of conveyance and the citizenship of the border-crossers. A very important property of entities is the time and rate of arrivals.

Stations: These are the places where processes and operations are carried out. There may be multiple stations that all carry out the same operations at the same time (e.g., a grocery store might have a dozen checkout lines, all of which work at least roughly the same way). Groups of similar stations might be called facilities, plazas, process blocks, or something else. They may also be referred to as a process, but the concept is that stations do operations in parallel (many things at the same time), while processes do things more in series (one thing after another).

Once you’ve identified a process, identifying the stations is relatively straightforward.  In a physical system you might identify the stations first, which could lead to identifying the process.  The most important characteristic of stations is how many of them there are in each process (12 lanes in the grocery checkout process).  After that you should determine whether the stations have different rules (15 items or less, self checkout, or general) or process different types of entities.  Process time(s) can also be very important.

Processes: Processes are individual operations or series of operations carried out at individual stations. It’s easy to miss some of these. Ask everything you can about the operations that go on in each station for each type of entity. Find out what types of decisions are made and what the range of outcomes might be and why. Find out how many operations occur, and what steps have to be handled separately and which, if any, can be combined or abstracted. If it makes sense, find out how long things take. Determine the distribution of times and results, along with averages, minimums, and maximums. Last but not least, find out what conditions must apply before the operations can be carried out. What is needed? Staff, supplies, utilities, logical conditions?

As an example I once analyzed a system that made decisions about whether to underwrite companies for disability insurance. The entities were documents describing the company staff, and those were collated into files, one per employee. Groups of files were shuffled from area to area where different processes were performed. Each process might be composed of multiple sub-operations. Sometimes those had to be considered separately and sometimes they didn’t. Multiple people worked in each area, each carrying out the same process and sub-operations on different files in parallel. In this scheme the people, or at least their desks, could each be thought of as stations.

The way to be thorough when identifying processes in physical systems is to not be bashful about asking what things are.  If you see a door or counter or machine or painted area on the ground, ask what it’s for, and what happens there.  The worst that can happen is to be told it doesn’t matter.  If you have a guide it’s always good to listen to all the information they offer, and a good guide may not miss a detail you need.  Still, you shouldn’t assume the guide will always know what you need.  If you ask a lot of questions about things you may jog the guide’s memory, or at least get them to think differently about what you need.  It’s a cooperative venture.  Be polite and respectful, describe what you’re thinking, and work together.  The guide wants you to understand the system so you can solve his or her problem.

Paths: Paths define the routes by which entities move from process to process. If the system involves the movement of real objects then the paths are defined in space. They might be roads, sidewalks, hallways, conveyors, or something else. Paths may also be logical, as in the case of, say, data records getting shuffled to different queues and processes in an IT system.

Buffers: Buffers are intended to hold entities while they’re waiting to be processed. Physical paths may act like buffers, since entities could wait in line on the path to be processed. The most important characteristic of a buffer is its capacity, especially if there are limits to it. The minimum time to traverse a physical buffer might be of importance, but this might not apply to a logical buffer. They often behave like one or another kind of queue, but there are exceptions. A parking lot would be a kind of buffer but the rules for exiting the lot could be implemented in different ways. The residence times could be generated according to a random distribution or could be defined by the time it takes for one or more pedestrian occupants to be processed separately.

Resources: Resources are logical conditions that must apply in order for other elements of the system to work. With respect to stations, resources might include staff, supplies, permissions (including those based on schedules), and so on.
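
To make this checklist concrete, here is one way the element types above might be written down during discovery. This is a hypothetical note-taking schema of my own and the names and fields are purely illustrative; the point is that each element type carries a small, predictable set of questions.

    from dataclasses import dataclass, field

    # Illustrative note-taking structures for the element types above.

    @dataclass
    class EntityType:
        name: str                  # e.g., "private vehicle", "insurance file"
        attributes: list           # characteristics that affect processing
        arrival_pattern: str       # timing and rate of arrivals

    @dataclass
    class Station:
        name: str
        count: int                 # parallel copies (e.g., 12 checkout lanes)
        rules: str = ""            # "15 items or less", entity types served

    @dataclass
    class Process:
        name: str
        station: str               # where it happens
        steps: list = field(default_factory=list)
        time_estimate: str = ""    # distribution, min/avg/max
        preconditions: list = field(default_factory=list)  # resources needed

    @dataclass
    class Buffer:
        name: str
        capacity: int = 0          # 0 meaning effectively unlimited
        discipline: str = "FIFO"   # queueing/exit rule, if known

    # During a site visit the notes accumulate into simple lists that can be
    # reviewed with the customer before anything gets modeled or built.
    checkout = Station("checkout lane", count=12, rules="3 express, 2 self-serve")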

Scope: The scope, reach, or boundary of the system might be the trickiest thing to understand.  In my opinion it’s better to define the scope too widely than too narrowly, at least on the first pass.  You will review your findings before doing anything with them, hopefully with experts or customers, so you can always eliminate things you don’t need.  If you miss something entirely you’ll either have to go back and look again or you’ll simply fail to capture something important.

One of the biggest disappointments of my career came because I didn’t think widely enough when I was trying to diagnose a system problem. I assumed that furnace combustion gasses just flowed to the exhaust under their own pressure, so when they weren’t flowing properly I only looked at the feed piping, control valves, and instrumentation. I didn’t realize that the particular type of furnace I was dealing with actually had a big ol’ fan mounted in the base of the exhaust stack. The fan was needed because the exhaust had to be pulled out through the bottom of the furnace rather than just jetting out the top like it did in most other furnaces, and in this case the fan wasn’t running. The Level 1 guys would have recognized the problem immediately since they had to do the controls for it, and the field service tech who was sent figured it out right away. I was alone on the site, and as a Level 2 engineer I mostly didn’t have to think about anything that wasn’t going on inside the furnace proper. I knew what some of the exterior stuff was, but I usually had my hands full and didn’t know everything.

I also didn’t think to ask, which is really the greater sin. At the very least I should have called back to the office to ask, “What am I missing?” There was some back and forth, but I know I didn’t ask the right questions and wasn’t systematic enough. There were a lot of ways I could have figured it out, and I kick myself to this day whenever I think about it. Once I thought about what happened in some detail I resolved not to make that type of mistake again. Do us both a favor and learn from my experience.

Most business systems take inputs from a variety of different sources, process them in some way through a number of steps, and produce some sort of output. While the details may change I think you will find this approach useful. If you have any experiences that could add to this general framework I’d love to hear about them.


Building Tools

Most generally, a tool is a means of accomplishing some end. If one uses the right tool for the job it is a more efficient and effective means than other alternatives. Naturally we’re always trying to use the optimal tool.

If a repetitive series of operations can be identified then a tool can be built to carry out the steps automatically. Building a tool makes sense when the effort needed to create it is less than the effort that would otherwise be spent doing the operations by hand. The more often the operations need to be carried out, or the greater the possibility of making mistakes, the more justification there is for the tool, and the better the tool should be.

A tool can also hide a lot of details from its users, which could be a good or bad thing. It’s good in that it allows less skilled practitioners to carry out complex or repetitive operations quickly and with a high degree of success. It’s bad when the details and knowledge that are obscured lead to problems and, even worse, lost opportunities.

I’ve created tools and worked with teams that did so, and always found those efforts to be the most satisfying in software development. Every software program is a tool to some degree; it serves as a means to an end and in most cases is meant to do something repetitively. That said, when the software system is an end product I don’t see it the same way. Extremely general products like word processors, spreadsheets, and programming languages are also tools, but here I am talking about systems with more targeted applications. I am most interested in tools that allow me to build the end products more quickly. In short, I am interested in tools that build software or components of software systems. These can take many forms.

Baby CAD program: The first “tool” I ever built was a little vector graphics editor for a computer graphics class in college. We started learning from the ground up: addressing individual pixels; drawing lines, curves, circles and ellipses, and text; combining elements into polygons, shapes, and larger entities; performing 2D and 3D transforms like translation, scaling, shearing, and rotating; clipping; filling polygons; and hidden line and surface removal. Along the way we built little programs that allowed users to define simple and compound elements and then manipulate them. They had menus, keyboard commands, read and saved files, printed images, and so on.

Interactive System to Help Students Solve Engineering Problems: I worked on this system for one of my professors for two semesters. It was based on the graphics program described above but was intended to walk mechanical engineering students through the process of solving certain classes of engineering problems. It had a “natural language” interface (artificial intelligence was a big deal at that time but as a practical matter all the system did was look for keywords in the text the students entered). The problem was presented to the student in the manner found in textbooks: a block of text gave the parameters and described the solution desired and a diagram was provided for further explanation and clarity. It contained an equation editor (which did clever things like create integral symbols out of multiple ASCII characters). The student was supposed to type natural language commands that would be used to carry out steps in the following order: provide a frame of reference for the problem by defining axes, control surfaces or control volumes, and so on; apply the appropriate engineering equation(s); make assumptions about which terms do not apply; solve the equation(s) for the desired answer; assign values to the relevant terms of the equation(s); and calculate the final answer. The system could save problems to files and retrieve and present them in a modular way, test to see that each step was carried out correctly and in the proper order, give feedback when the wrong thing was done, and provide help if the user was stuck. The pieces were pretty primitive, as you can imagine, but they were all more or less there.

Automated Player Character Record Sheets: At the risk of publicizing my intensely nerdy side… I had several different computers when I was in the Army and amused myself in my spare time by building tools to manage the information associated with D&D player characters. The tools stored information about all of a character’s abilities, skills, and belongings. Over time this work evolved into a text-based windowing system, which was itself a tool, or at least a framework that could be leveraged. It was originally all keyboard-driven but I eventually incorporated mouse controls as well. I remember being amazed that it seemed to require a lot of data to define the location, content, and format of the various data entry fields and controls, but there was no more or less than what was needed. Every such system I’ve seen since has worked the same way.

Steam Table Properties Calculator

This was actually a combination of two tools, one for generating curves for the properties of saturated water and steam at different temperatures or pressures and one for curve fitting. The thermodynamic properties of water are described in steam tables with entries listed every few psi or degrees F (or other units of pressure and temperature). In school we always had to do linear interpolations of in-between values by hand, and in an industrial setting that just won’t do. It might be ok for design if you aren’t doing too many iterations but over time it starts to get painful. That method can also be surprisingly inaccurate in certain situations (e.g., at very low partial pressure of water vapor). In general it’s better if you can enter a single input and get a single output and be done with it.
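
For reference, the by-hand step being replaced was nothing more than a linear interpolation between adjacent table rows. The numbers in this sketch are placeholders, not real steam table entries.

    def interpolate(x, x_lo, x_hi, y_lo, y_hi):
        """Linear interpolation between two adjacent table entries."""
        return y_lo + (y_hi - y_lo) * (x - x_lo) / (x_hi - x_lo)

    # Hypothetical usage: the two rows bracketing the temperature of interest
    # would be read from the printed table; the values here are placeholders.
    t_lo, prop_lo = 150.0, 1.10
    t_hi, prop_hi = 160.0, 1.14
    print(interpolate(154.0, t_lo, t_hi, prop_lo, prop_hi))

Doing that once is trivial; doing it over and over across many design iterations is what made a single-input, single-output calculator worth building.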

Curve-Fitting Tool

I got the curve-fitting idea from an article that got passed around the office while I was working at my first engineering job, where some of us had to do thermodynamic calculations regularly. The article described a curve-fitting technique that either came with one of the TI-50-series calculators or was implemented on one. It was based on an equation that was the sum of a constant, a linear term, a square term and similar terms up to a power of seven, an inverse term, a natural logarithm term, and a square root term. Without going into the gritty details (I’ll save those for a separate article), the tool allowed the user to enter a series of input and output values (X and Y values) along a section of the curve to be fit, and then solved a series of simultaneous equations which yielded coefficients for each of the terms. The plot would then be graphed, with the input points clearly marked. If the curve looked smooth then the fit could be accepted but if the curve wiggled between the input points, especially near one end, then the process had to be repeated with a different set of inputs and possibly different terms included or omitted. A little bit of software legerdemain allowed formulas with power terms to be written efficiently but I always felt that some of the formulations were computationally expensive. That said, they did work, and they were good enough for my intended uses. (The link below leads to several online calculators that do the same thing. These capabilities are no longer a novelty, as they were when I needed them.) By the time I stopped working on the tool I had added some extra tricks to try to smooth the inputs and outputs and had also made the program generate curve fits for the first three derivatives of the curve fit equations. That got a little hairy with some of the combinations of more complex terms.
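
I’ll save the gritty details for that separate article, but the general idea is easy to sketch with modern tools. This is not the original formulation or code; it is the same concept expressed with numpy’s least-squares solver, using a selectable set of basis terms and made-up sample points.

    import numpy as np

    # Basis terms of the same general kind as the old curve-fitting equation:
    # a constant, powers of x, an inverse, a natural log, and a square root.
    BASIS = {
        "const":  lambda x: np.ones_like(x),
        "x":      lambda x: x,
        "x^2":    lambda x: x**2,
        "x^3":    lambda x: x**3,
        "1/x":    lambda x: 1.0 / x,
        "ln x":   lambda x: np.log(x),
        "sqrt x": lambda x: np.sqrt(x),
    }

    def fit(x, y, terms):
        # Build one column per selected term and solve for the coefficients.
        A = np.column_stack([BASIS[t](x) for t in terms])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        return dict(zip(terms, coeffs))

    def evaluate(coeffs, x):
        x = np.asarray(x, dtype=float)
        return sum(c * BASIS[t](x) for t, c in coeffs.items())

    # Made-up sample points standing in for values read off a property curve.
    x = np.array([1.0, 2.0, 3.0, 5.0, 8.0, 12.0])
    y = np.array([2.1, 2.9, 3.4, 4.1, 4.9, 5.6])
    c = fit(x, y, ["const", "x", "ln x", "sqrt x"])
    print(evaluate(c, [4.0, 10.0]))   # check in-between values for wiggles

If the evaluated curve wiggles between the sample points, especially near one end, refit with different points or a different combination of terms, exactly as described above.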

Pressure Drop In Pipe Calculator

Early in my time at the Westinghouse Nuclear Simulator Division I wrote a tool to calculate the pressure drops in runs of pipe. The pressure drop was dependent on the properties of the fluid (mostly density and viscosity), the length of the pipes, the diameter of the pipes, the bends and diameter changes in the pipes, the equipment installed in that run of pipe (valves, orifice plates, and so on), and the relative roughness of the pipes. The basic equation was a variation I got from a colleague and the characteristics of most types of equipment were taken from Crane Technical Paper 410 (also see here), a famous industry standard. The interface was a simple text-based one, but it got the job done and was more than accurate enough for my uses.
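
The exact equation was my colleague’s variation and I won’t try to reproduce it here. As a stand-in, here is a generic sketch of the same kind of calculation: a Darcy-Weisbach pressure drop with an explicit (Swamee-Jain) friction factor approximation and lumped K factors for the fittings. The K values in the example are illustrative, not quoted from Crane TP-410.

    import math

    def friction_factor(reynolds, rel_roughness):
        if reynolds < 2300.0:      # laminar flow
            return 64.0 / reynolds
        # Swamee-Jain approximation to the Colebrook equation (turbulent flow).
        return 0.25 / math.log10(rel_roughness / 3.7 + 5.74 / reynolds**0.9) ** 2

    def pressure_drop(density, viscosity, velocity, length, diameter,
                      roughness=4.5e-5, k_fittings=0.0):
        # SI units: kg/m3, Pa*s, m/s, m, m; result in Pa.
        re = density * velocity * diameter / viscosity
        f = friction_factor(re, roughness / diameter)
        velocity_head = density * velocity**2 / 2.0
        # Straight-pipe loss plus fitting losses, both scaled by velocity head.
        return (f * length / diameter + k_fittings) * velocity_head

    # Hypothetical case: water at room temperature in 50 m of 100 mm pipe at
    # 2 m/s with two elbows and an open gate valve (illustrative K values).
    dp = pressure_drop(density=998.0, viscosity=1.0e-3, velocity=2.0,
                       length=50.0, diameter=0.10, k_fittings=2 * 0.9 + 0.2)
    print(f"{dp / 1000.0:.1f} kPa")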

Continuous Simulation Test Tool

The simulators at Westinghouse were complex affairs with multiple CPUs, shared memory, multiple connected computers, massive amounts of I/O, and industrial tape and disk drives. They were expensive and were in use for one purpose or another almost around the clock. You could test individual programs in their somewhat native environment to a limited extent during the day, but doing so was slow and tedious. Modelers took turns exercising sole control of those systems only on the 11 pm – 7 am shift. I found that I needed to do a lot more testing, and I needed to do it a lot more quickly. I therefore found it necessary to write an entire continuous simulation test framework for a PC. Doing so also meant that I had to dummy in the behavior of variables and systems that were modified by other models. I learned a lot in the process and the framework I devised served me well for almost ten years.
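
The framework itself is long gone, so what follows is only my reconstruction of the core idea: a fixed-timestep loop that advances one model’s equations while the variables normally supplied by other models are dummied in as simple functions of time. The model and numbers below are invented for illustration.

    # Reconstruction of the idea, not the original framework.
    DT = 0.1        # integration timestep, seconds
    T_END = 60.0    # length of the test run, seconds

    def dummy_inlet_temperature(t):
        # Stand-in for a value another model would normally supply:
        # a step change at t = 10 s.
        return 550.0 if t < 10.0 else 580.0

    def run():
        t = 0.0
        temperature = 550.0     # state variable owned by the model under test
        time_constant = 8.0     # first-order lag, seconds
        history = []
        while t < T_END:
            inlet = dummy_inlet_temperature(t)    # dummied "other model" input
            # Advance the first-order lag with explicit Euler integration.
            temperature += (inlet - temperature) / time_constant * DT
            history.append((round(t, 1), round(temperature, 2)))
            t += DT
        return history

    if __name__ == "__main__":
        for t, temp in run()[::50]:   # print every 5 simulated seconds
            print(t, temp)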

Automated Fluid Model Documentation and Coefficient Generator

The project management process used by Westinghouse was solid overall (it was even rated highly in a formal audit conducted by consultants from HP while I was there) but that doesn’t mean there weren’t problems. One monkey wrench in the system caused me to have to rewrite a particularly long document on several occasions. After about the third time I wrote a program that allowed me to enter information about all of the components of the system to be modeled, and the tool then generated text with appropriate sections, equations, variable definitions, introductory blurbs, and so on. It also calculated the values of all of the constant coefficients that were to be used in the model (in the equations defined) and formatted them in tables where appropriate. I briefly toyed with extending the system to automate the generation of model code, but the contract ended before I got very far.

Automatic Matrix Solution Code Generator

While working at Bricmont I ended up doing a lot of things by hand over and over. The control systems I built all did the same things in principle but the details of each application were just different enough that generalizing and automating their creation did not seem to make sense. If I had stayed there longer I might have changed my mind. I did leverage previous work by adapting it to new situations instead of building each system from scratch. That way I was at least able to identify and implement improvements during each project. There was one exception, however. I was able to automate the generation of the matrix solution code for each project. In general the block of code was always the same; there were only variances in the number and configuration of nodes and those could be handled by parameters. That said, the matrix calculations probably chewed up 80% of the CPU time required on some systems, so streamlining those bits of code represented the greatest possible opportunity to improve the system’s efficiency. To that end I employed extreme loop unrolling: writing out all of the explicit calculations carried out by the tightly looped matrix solution code with the array indices expressed as constants. In that way you get rid of all the calculations having to do with incrementing loop counters and calculating indirect addresses. The method saved around 30% of execution time in this application, but at the cost of requiring many, many more lines of code. The solution to a 49×7 symmetric banded matrix expanded to 50,000 lines of code. The host system was way more constrained by calculation speed than it was by any kind of memory, so this was a good trade-off. The code was generated automatically by inserting write statements after each line of the matrix calculation that performed any kind of multiplication or summation. The purpose of the inserted lines was to write the operations carried out in the line above in the desired target language (C++, FORTRAN, or Pascal/Delphi at that time), with any array indices written out as constants. Run the matrix code once, it writes out the loop-unrolled code in the language of choice, done.
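
The trick is easier to show than to describe. Here is a toy version of it in Python (for brevity; the original generator looked nothing like this): run the solution loops once but, instead of doing the arithmetic, emit each innermost operation as a target-language statement with the loop indices baked in as constants. Plain Gaussian elimination stands in for the real banded solver.

    # Toy code generator: emits loop-unrolled C++ for the forward-elimination
    # phase of Gaussian elimination on an n x n system.
    def emit_unrolled_elimination(n):
        lines = []
        for k in range(n):                  # pivot row
            for i in range(k + 1, n):       # rows below the pivot
                lines.append(f"f = a[{i}][{k}] / a[{k}][{k}];")
                for j in range(k, n):
                    lines.append(f"a[{i}][{j}] -= f * a[{k}][{j}];")
                lines.append(f"b[{i}] -= f * b[{k}];")
        return "\n".join(lines)

    # For n = 3 this yields a dozen straight-line statements with no loop
    # counters or index arithmetic left to evaluate at run time; for a large
    # banded system the same idea balloons into tens of thousands of lines.
    print(emit_unrolled_elimination(3))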

Modular Simulation of Medical Offices

This tool allowed a user to define the floorplan, equipment, procedures, employees, patients, communications, and administrative activities of medical offices. The main cheats were that the location, size, and shape of rooms in the floorplan were defined by entering vertex coordinates by hand instead of using a graphical tool, and that I also did not show employees and patients moving from one place to another. I simply had them teleport and included a time delay. Beyond that the system provided a wide variety of outputs and served the desired analytical needs fairly well. If work on the project had continued there were numerous improvements that could have been made.

BorderWizard / SimFronteras / CanSim

These were tools used to build simulations of land border facilities for the United States, Canada, and Mexico. The layouts were defined by a CAD-like network of paths and processing stations. Arrival volumes and rates, process times, and diversion percentages were included based on data collected from field visits and automated records. The models were typically run for a simulated week and might process up to 100,000 entities in that time. Simulation runs could take up to an hour. Unlike most of the other tools on this list they were developed over a period of years by a large team of programmers and analysts. They were also end products used to support the analyses that were that company’s real products, rather than internal tools used to create some other product or component.

Pedestrian Modeling Tool(s)

This family of tools was developed by a large team over multiple contracts to define environments, facilities, processes, and the movements of goal-directed entities through them. The tools were sometimes used to test the effects of changing facility layouts or the configuration of steps in a defined process, but were most often used to model evacuation events. The evacuation models incorporated a wide variety of challenge effects the model occupants had to react to.

Budget Planning Tools: I created a number of spreadsheet tools over time, to do things like calculate the output characteristics of combined material flows, test and condition inputs to other models and processes, and size processing systems. The most modular of these, however, was the set of tools I created to manage the employees and contracts I supported as a program manager for four Naval aviation task orders. I had to manage the billing for labor and expenses and track all activities very closely, in order to prevent recurrence of the problems I had to overcome when I inherited the position.

Flight Schedule Generator

When spreadsheet tools proved insufficient for the task I spent a few days writing a program to read in files of historical flight records and write out randomly-generated flight schedules of defined calendar durations and flying hour rates that had the proper distributions of daily flight frequencies, mission types, flight durations, and departure times. The tool could also generate flight schedules with the desired output characteristics by hand-entering notional input data in the correct proportions.
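
The sketch below is a simplified reconstruction of the approach, not the original tool; the record fields and mission codes are made up for illustration. The idea is to build empirical distributions from the historical records and then sample from them to write out a notional schedule.

    import random
    from collections import Counter

    def empirical_sampler(values):
        # Sample observed values in proportion to how often they occurred.
        counts = Counter(values)
        items, weights = zip(*counts.items())
        return lambda: random.choices(items, weights=weights, k=1)[0]

    def generate_schedule(history, days):
        # history: records parsed from old flight logs (fields are illustrative).
        per_day  = empirical_sampler([r["flights_that_day"] for r in history])
        mission  = empirical_sampler([r["mission"] for r in history])
        duration = empirical_sampler([r["duration_hrs"] for r in history])
        depart   = empirical_sampler([r["depart_hr"] for r in history])
        schedule = []
        for day in range(1, days + 1):
            for _ in range(per_day()):
                schedule.append((day, depart(), mission(), duration()))
        return schedule

    # Hypothetical usage with a tiny made-up history.
    history = [
        {"flights_that_day": 4, "mission": "training", "duration_hrs": 1.5, "depart_hr": 9},
        {"flights_that_day": 6, "mission": "training", "duration_hrs": 2.0, "depart_hr": 13},
        {"flights_that_day": 6, "mission": "check",    "duration_hrs": 1.8, "depart_hr": 15},
    ]
    for row in generate_schedule(history, days=2):
        print(row)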

Conclusion

I’ve always been fond of modular tools, toys, and games that can be configured and used in a variety of ways. They can be adapted to many different situations and generate a variety of outputs. When it comes to building and employing tools I have every type of experience related to defining what they need to do, managing their construction and use, understanding how they should be tested and modified, and knowing when and whether they make sense to build at all.


Calculus… or Statistics?

Most roads in mathematics instruction seem to lead to calculus before any but the most basic statistics. Many statistical techniques require calculus to derive, but it isn’t usually necessary, short of advanced or novel applications, to know calculus when applying those techniques. What’s more important is understanding how to apply them and when they are appropriate, which of course could be said of any tool.

I went through the standard progression, ending in calculus and differential equations, and learned a reasonable amount of statistics only later when it came up in my work. As an engineer I used a certain amount of calculus, but most people don’t. I wish I’d had a better grounding in statistics earlier.

Both tools provide important insights and serve to quantify phenomena but, in my mind, calculus is generally used to provide point values (if X, then Y) while statistics are geared toward calculating probabilities (if X, then range of Ys). The difference is that some of the Xs in probabilistic systems are themselves variant (more properly, if range of Xs, then range of Ys).

Calculus might be used to answer questions like:

  • What cross-sectional area is needed to support a specified weight under tension?
  • How will temperature change when a given amount of heat is added?
  • What dimensions of a cylindrical can will enclose a given volume with the minimum amount of sheet metal?

Statistics might be used to answer questions like:

  • What are the chances that a specified number of events will be completed in a specified amount of time?
  • What is the probability that a part will experience problems that will require a machine to shut down, over a given period of time?
  • What is the expected amount of downtime for a production process?

It’s easy to see how all of these questions factor into making decisions.

The ideas of calculus and statistics can get conflated as they are actually used. Thinking about the first calculus example, we might use the calculation to design an elevator cable to lift a certain amount of weight. If we make the cable thick enough to barely hold the maximum weight it might fail if the elevator is overloaded or the cable is worn or defective. The cables are therefore designed with numerous safety factors, including the presence of multiple cables, each of which is strong enough to hold more than the expected weight of the elevator and its load (as well as other mechanisms). The safety factor recognizes the possibility that things can go wrong, even if the possibilities are not calculated explicitly. As a result, cable-driven elevators are about the safest form of passenger transport known.

The third calculus example is a direct form of optimization; the can example is often used in beginning courses. Different kinds of optimizations can be performed using statistics.

Calculations where none of the inputs vary need only be performed once. If the inputs do vary according to known distributions (this is different than purposefully varying system parameters to test different design cases), then the calculations would have to be run many times to generate a range of results.
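
When the inputs do vary, “running the calculations many times” is just a Monte Carlo loop. Here is a minimal sketch, with made-up distributions, of estimating a range of monthly downtimes along the lines of the third statistics question above.

    import random
    import statistics

    def one_month_downtime():
        # Made-up model: each day carries a 10% chance of a failure, and each
        # failure costs a lognormally distributed number of hours to repair.
        failures = sum(random.random() < 0.10 for _ in range(30))
        return sum(random.lognormvariate(1.0, 0.5) for _ in range(failures))

    runs = sorted(one_month_downtime() for _ in range(10_000))
    print(f"mean   {statistics.mean(runs):5.1f} h")
    print(f"median {runs[len(runs) // 2]:5.1f} h")
    print(f"5th-95th percentile {runs[len(runs) // 20]:.1f} to {runs[-len(runs) // 20]:.1f} h")

The answer comes out as a range rather than a single number, which is exactly the kind of output the decision-maker needs to see.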

Inputs to calculus problems are all fixed rates and quantities, while inputs to statistical problems are themselves probability distributions. At scale, however, the results of the different types of analyses converge. The molecules of water in a steam turbine can be very hot or cold on a statistical basis, but on average the steam behaves uniformly, because the individual variations are small relative to the size of the system. The randomness of individual molecules doesn’t matter and calculus is therefore used to analyze that behavior. If a system is “chunkier”, as in maintenance actions performed on a group of machines over a limited period of time, each statistically-variable event stands a larger chance of affecting the behavior of the system as a whole, so statistical methods are used.

Finally, it’s always important to understand the assumptions and results of each specific analysis. For example, the machine maintenance problems I worked on generated outcomes according to a rather naïve set of rules. The human managers of the system being modeled were known to be able to reorder events to provide better outcomes in many cases. We therefore had to understand that the results we were getting, even though themselves variable, were likely to be somewhat pessimistic. They provided an envelope of worst-case outcomes.

It is also important to understand the methods themselves. It is all too easy to make errors if one does not grasp the subtleties in the techniques of either calculus or statistics.


Be Honest When Things Go Wrong

One of the important lessons I learned at my first engineering job was to be honest and open at all times. This was never illustrated more clearly than when a refiner disc flew apart, tore through the pressurized refiner casing, took out the feeder mechanisms, and took chunks out of nearby floors and walls. Everyone was relieved to find out that no one was near the refiner when it happened; the debris surely would have killed anyone it struck. I’ve been at plants when failures like that have happened, and it is as bad as it seems, so it’s always a relief to find nobody was hurt.

This twin disc refiner was meant to grind wood chips or rough pulp into finer pulp. It had a rotating disc sandwiched between two static discs. The static discs could move in and out to adjust the gaps between themselves and the center disc. The surface of each disc was covered in refiner plates that were sharply grooved to shred fibers off the feed material as it moved from the shaft where it was fed in to the outer casing and refiner discharge. The five-foot disc spinning at 1800 rpm stored a lot of energy.

The refiner plates were bolted to the faces of the discs and if one of them detached it would kick around and chew things up but I don’t think it would shred through the refiner casing. The failure of the rotating disc itself was a different story.

We arrived at work on the morning of the accident to be briefed on the event and plot a course of action. There was some discussion of what caused the failure. Was it a flaw in manufacturing? Was it excessive vibration from the fact that the refiners in that bay were mounted on a floor that itself was suspended above a river and thus not very stable? Was it something that could have been looked into during the previous plate change, if the mill technicians had recognized the terrible condition of the older plates? None of us knew. Whatever the cause, the main thing I remember is that the senior managers made sure that the company put out a notice of the failure to everyone. I’m sure that they quickly produced replacement materials and had them shipped to the site so repairs could be completed as quickly as possible. The machines were still under warranty so the company covered all of the costs.

I don’t know if they ever figured out what the problem was, and I don’t know if anything like it has happened since, but I know it was addressed openly, honestly, and quickly. There can be no other way. Problems can’t be solved or prevented if information is hidden.  Customers won’t deal with companies or workers who aren’t forthright.
