TWSL Series 10: Permutations and Traceability

Today I gave this webinar for the Tom Woods School of Life Lunch and Learn Series. The slides are here.

Posted in Tools and methods | Tagged , , , , | Leave a comment

TWSL Series 09: Testing and Acceptance: Verification, Validation, and Acceptance (VV&A)

Today I gave this webinar for the Tom Woods School of Life Lunch and Learn Series. The slides are here.

Risk Analysis and Management

I don’t strongly feature the subject of risk analysis when I talk about my framework, but that doesn’t mean it isn’t accounted for or present. I usually build up my framework like this:

Risks and Impacts appears with Assumptions and Capabilities in the first outline. It isn’t presented as its own phase because this work really should be happening through every phase in many different forms, and my formulation is meant to provide practitioners with the improved situational awareness that comes from understanding how the phases are truly different. That said, these concepts are called out early because the sooner they are considered, the better.

The practices of project management, Lean Six Sigma, and business analysis all have a lot to say about risk, but the BABOK only obliquely addresses the costing side of things. One thing the project management oeuvre considers, for example, is a way to compare risks by multiplying the cost of the unanticipated or unwanted outcome by the percentage chance it will occur. The BABOK includes this idea, but not for evaluating which projects to pursue. Outside of that, however, the practices are mostly similar. (So, should we make the next version of the BABOK even thicker by expanding this section with a lot more detail, especially given the IIBA’s posture of never being prescriptive?)
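The cost-times-probability comparison is easy to sketch in Python. The risk names and figures below are invented purely for illustration, and `expected_cost` is a hypothetical helper name, not something taken from the BABOK or any project management standard.

```python
# Hypothetical risks: (name, probability of occurring, cost if it occurs).
# All names and numbers are made up for illustration.
risks = [
    ("Key supplier misses delivery", 0.20, 50_000),
    ("Server room floods", 0.01, 400_000),
    ("Lead developer leaves", 0.10, 80_000),
]

def expected_cost(probability, cost):
    """Expected cost of a risk: chance of occurrence times cost of the outcome."""
    return probability * cost

# Rank risks from highest to lowest expected cost.
ranked_risks = sorted(risks, key=lambda r: expected_cost(r[1], r[2]), reverse=True)

for name, probability, cost in ranked_risks:
    print(f"{name}: {expected_cost(probability, cost):,.0f}")
```

Note that the flood has by far the largest raw cost but the smallest expected cost, which is the kind of insight this comparison is meant to surface.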

The first step in dealing with risks is identifying them. Some will be known, some will be unknown, and some will be of indeterminate or variable severity. More information is always preferred, and one way to identify risks is to follow the techniques I recently discussed for the proactive parts of root cause analysis (e.g., FMEA and generally being thorough when examining all parts of existing and new systems and the environment in which the engagement is conducted). Leverage your organization’s policies, lessons learned, and people for all possible insights, and research common occurrences and practices in your region and in your industry. Your organization’s insurers may have further insights to offer.

Risks come in many forms. They can be based on things occurring once, multiple times, or not at all. Events and consequences may map one-to-one, one-to-many, and many-to-one, so be thorough.

Once risks are identified they should be maintained and tracked in a risk register. It should include information along the lines of the example in the BABOK.

  • Risk Event or Condition: description of the potential situation that may have to be addressed
  • Consequence: what happens if the event occurs or situation arises
  • Probability: how likely the situation is to arise (percentage or something like high / medium / low)
  • Impact: the cost or effect if the situation arises (cost, time, materials, people, contract (scope & quality), reputation, or legal, stated explicitly or as high / medium / low)
  • Risk Level: rough amalgam of probability and impact
  • Risk Modification Plan: how the occurrence should be handled (see below in this article)
  • Risk Owner: name and contact information of party in charge of managing the situation
  • Residual Probability: as above but residual
  • Residual Impact: as above but residual
  • Residual Risk Level: as above but residual
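A register entry with these fields might be sketched as a small Python data structure. The field names, the 1-to-3 scoring of low / medium / high, and the thresholds in `risk_level` are all assumptions chosen for illustration; the BABOK leaves the "rough amalgam" of probability and impact open, and the sample entry is hypothetical.

```python
from dataclasses import dataclass

# Ordinal scores for the high / medium / low scale (an assumed scheme).
SCORE = {"low": 1, "medium": 2, "high": 3}

def risk_level(probability: str, impact: str) -> str:
    """Combine probability and impact ratings into a rough overall level."""
    product = SCORE[probability] * SCORE[impact]
    if product >= 6:
        return "high"
    if product >= 3:
        return "medium"
    return "low"

@dataclass
class RiskEntry:
    event: str
    consequence: str
    probability: str
    impact: str
    modification_plan: str
    owner: str
    residual_probability: str
    residual_impact: str

    @property
    def level(self) -> str:
        return risk_level(self.probability, self.impact)

    @property
    def residual_level(self) -> str:
        return risk_level(self.residual_probability, self.residual_impact)

# A hypothetical entry; the residual values describe the risk after the plan is applied.
entry = RiskEntry(
    event="Primary data center loses power",
    consequence="Order processing stops until power is restored",
    probability="medium",
    impact="high",
    modification_plan="Mitigate: install a backup generator",
    owner="Facilities manager (hypothetical)",
    residual_probability="medium",
    residual_impact="low",
)
print(entry.level, "->", entry.residual_level)
```

Deriving the two risk levels from the other fields, rather than storing them, keeps the register consistent when a probability or impact rating is revised.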

There are five classic ways to manage risk.

  • Avoid: The risk is either entirely prevented or plans are changed so the risk cannot possibly occur (at least in a way that will affect the plans).
  • Transfer: The impact of the risk is moved to or shared with a third party (e.g., an insurance company, but could also involve other kinds of teaming).
  • Mitigate: Steps are taken to reduce the probability of the situation arising or, if it does arise, to reduce the effects or impact of the situation.
  • Accept: Deal with risks as they occur, or do nothing at all.
  • Increase: Not all risks are negative. Some are positive, and in those cases it may be best to load up on more risk in hopes of a big payoff.

Risks should be reviewed and plans updated at intervals. Some risks are reasonably well understood and quantifiable through actuarial analysis performed on voluminous historical data, known weather patterns in combination with geography, prevailing conditions in industry and economy, and so on, but others are less predictable. The bottom line is to prepare for the expected and to expect the unexpected.

Sequence Diagrams

Imagine a network with the following representative elements. Many external and internal workstations, mobile devices, and pieces of equipment exchange information with an internal system of servers and capabilities.

Now imagine that some of the internal servers provide a number of services where sending them a message results in them returning a package of information that can be made use of, while simultaneously causing some kind of (desirable) internal side-effect, such as a new or modified entry in a data table. A small, stateless service of this kind is called a microservice, and microservices are usually bundled to provide groups of small, related functions that cause a related set of desirable side-effects.

Now imagine a group of external applications that make a series of calls to a given server, to evoke a series of related actions, in a system like that depicted in this block architecture diagram.

Given this description, now ask what is known about the order in which these operations usually take place.

The answer shouldn’t be, “not much,” even if you had published descriptions of the information and format of the messages sent and the answers received.

The answer should be, “nothing.”

There are a lot of ways the order of operations could be listed. They could be written out as a series of instructions in a document or a use case. The order could be left undocumented so the various capabilities could be used as seems appropriate for each application. Or, customized ad hoc representations could be made up as shown in the next figures. The first shows a routing diagram that implies timing, while the second shows an order of operations overlaid on a block architecture diagram.

However, there is a fairly standard way these can be depicted, and that is by creating a sequence diagram, an example of which is shown next. These grew out of the Unified Modeling Language (UML) practice that developed within systems engineering. It is often, but not exclusively, used in computer science, and I refer to it in various of my presentations.

These diagrams show operations in time from top to bottom, and show the different components that house operations from left to right. Each object is represented by a lifeline, which is shown as a dashed line that descends from an object box, examples of which are shown in gray along the top. An “X” on an object’s lifeline, placed just below the last activation box, indicates that the object goes out of existence (or is otherwise deallocated or destroyed). Activation boxes are shown in brown (note that any colors can be used for any elements), and show the span of time during which the item is acting with respect to the process depicted. They extend vertically along the lifeline. The top of the box shows when the object becomes active, and the bottom of the box shows when the object becomes inactive. If an object is called upon to act more than once, multiple activation boxes can be placed along a single lifeline.

Control is passed to different objects by passing messages. In most cases, as in microservice architectures, messages are actually passed using some kind of communication channel, but many means of passing control are possible. For example, a function call may have parameters and may result in values being returned, but they may both be in a single unit of code that doesn’t involve any kind of “over-the-wire” communications.

Note that two kinds of messages can be passed. Synchronous messages are shown with filled-in arrows (like those in the figure above), and represent operations where the sender may not carry out any operations until a return message is received (or a local time-out procedure is potentially invoked). Return messages are shown as dashed lines and with open arrows. Messages can all contain information of various kinds and in any quantity. The simplest messages will indicate requests and acknowledgements with no other information. Asynchronous messages are shown with open arrows and indicate that the sender may perform other operations while waiting for a reply (or that a reply may not even be expected or needed).
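The information a sequence diagram carries (who sends what to whom, in what order, and with which kind of message) can also be recorded as plain data. The Python sketch below is one assumed way of doing that; the participant names, the `Message` class, and the text arrows are hypothetical and are not UML notation.

```python
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    receiver: str
    label: str
    kind: str  # "sync", "async", or "return"

# Plain-text stand-ins for the arrow styles (an arbitrary choice for this sketch).
ARROWS = {"sync": "->>", "async": "->", "return": "-->"}

def render(messages):
    """Return one text line per message, in the order the messages occur."""
    return [f"{m.sender} {ARROWS[m.kind]} {m.receiver}: {m.label}" for m in messages]

# Hypothetical exchange between an external app and two internal services.
trace = [
    Message("App", "OrderService", "create order", "sync"),
    Message("OrderService", "AuditLog", "record event", "async"),
    Message("OrderService", "App", "order id", "return"),
]

for line in render(trace):
    print(line)
```

Reading the list from first to last corresponds to reading a diagram from top to bottom: the app waits on its synchronous request, while the service fires the asynchronous audit message without waiting for a reply.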

The BABOK shows only the most basic elements of a sequence diagram, but many more can be used. There are methods for depicting logical branches, self messages, recursion, and more, and you are invited to research other sources for more information.

TWSL Series 08: Data: How to Get It and How to Use It

Today I gave this webinar for the Tom Woods School of Life Lunch and Learn Series. The slides are here.

Acceptance and Evaluation Criteria

Acceptance criteria and evaluation criteria can almost be considered as two separate considerations, but they are presented together in the BABOK and I can see why. The BABOK states that acceptance criteria are the basis for determining whether a solution is acceptable to stakeholders based on identified requirements, and the evaluation criteria are about how multiple possible solutions are compared to pick the best one. So both are about solutions, but one (acceptance criteria) is about meeting requirements and the other (evaluation criteria) is about potentially exceeding them (if one meets requirements by “more” than another).

Another way to think about evaluation criteria is to note that defined requirements of value may not exist. That is, an organization may identify requirements and develop solutions that meet those requirements, but the organization may or may not have, say, a specific cost a project needs to stay under or a specific benefit a solution needs to realize. Instead, in such a situation, the costs and benefits are compared and an entrepreneurial judgment is made about whether or not to proceed.

Still another complication is when potential solution approaches are sufficiently different that the participants need to create some kind of hybrid scoring system that allows comparison of different mixes of features that cannot be compared directly. On the one hand, many criteria can be translated into monetary terms (abstractly, at least), but on the other hand it doesn’t always occur to people to do so, and attempts at it might not be all that accurate.
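One common form of hybrid scoring system is a weighted score, sketched here in Python. The criteria, weights, options, and 1-to-5 scores are all invented for illustration; in practice the participants would have to agree on them.

```python
# Hypothetical criteria weights (summing to 1.0) and 1-5 scores per option,
# where a higher score is better on every criterion (including cost).
weights = {"cost": 0.40, "usability": 0.35, "maintainability": 0.25}

options = {
    "Build in-house": {"cost": 2, "usability": 4, "maintainability": 5},
    "Buy off the shelf": {"cost": 4, "usability": 3, "maintainability": 2},
}

def weighted_score(scores, weights):
    """Sum each criterion's score multiplied by that criterion's weight."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)

totals = {name: weighted_score(scores, weights) for name, scores in options.items()}
best = max(totals, key=totals.get)

for name, total in totals.items():
    print(f"{name}: {total:.2f}")
print("Preferred option:", best)
```

The result is only as good as the weights, so it is worth checking how much they would have to shift before the preferred option changes.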

Functional requirements are more often amenable to discrete evaluation with yes/no or go/no-go answers than are non-functional requirements. For example, you might specify that “a process must run to completion within 2.5 seconds,” but how do you judge whether a solution or element is more or less elegant or modular? Some non-functional requirements can be objectively evaluated, as a measure of robustness might be expressed as “the system must achieve 95% operational uptime,” but it may be that being able to express a requirement in that way inherently transforms it into a functional requirement. A review of the non-functional requirements listed here shows that some can be directly evaluated and some cannot readily be, so I will leave this idea as something to reflect upon.
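The two quantified examples above lend themselves to automated yes/no checks. The Python sketch below assumes a stand-in `sample_process` and made-up uptime figures; a real check would time the actual process and draw on real monitoring data.

```python
import time

MAX_SECONDS = 2.5   # "a process must run to completion within 2.5 seconds"
MIN_UPTIME = 0.95   # "the system must achieve 95% operational uptime"

def meets_time_limit(process, limit=MAX_SECONDS):
    """Run the process once and report whether it finished within the limit."""
    start = time.perf_counter()
    process()
    return time.perf_counter() - start <= limit

def meets_uptime(hours_up, hours_total, minimum=MIN_UPTIME):
    """Report whether the observed uptime fraction meets the minimum."""
    return hours_up / hours_total >= minimum

def sample_process():
    # Stand-in for the real process under test (hypothetical).
    sum(range(1000))

print(meets_time_limit(sample_process))
print(meets_uptime(hours_up=700, hours_total=720))
```

No comparable function can be written for "elegant" or "modular" without first agreeing on a measurable proxy, which is the difficulty the paragraph above points to.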

My framework emphasizes iteration and communication within and between phases. Communication and iteration within phases, especially in the first three, involves having analysts learn from customers, document their findings, submit them to the customers for review, and modify the documented findings until the customers agree that the analysts have everything right. This means that a form of acceptance is embedded into each phase when customers accept the definition(s) of the intended use, the findings of discovery and data collection activities, and the expression of requirements. After that the emphasis shifts to evaluation criteria.

The requirements traceability matrix links all elements of the process or product under investigation to elements in the previous and subsequent phases in both directions. This ensures that all work is targeted to items that address the intended use, that all requirements are addressed in the design, that all design elements are implemented, and that all implementations are tested. The linking encompasses acceptance criteria in a way and is a form of validation (whether the developed solution addresses the identified problem or use), but the requirements are really where the acceptance criteria are explicitly baked in.
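The two-way linking can be checked mechanically. In this Python sketch the phase links are stored as dictionaries with made-up item IDs, and the helper names are hypothetical; it reports items with nothing downstream and items nothing upstream points to.

```python
# Hypothetical links between adjacent phases, keyed by item ID.
requirement_to_design = {
    "REQ-1": ["DES-1"],
    "REQ-2": ["DES-2", "DES-3"],
    "REQ-3": [],
}
design_to_test = {
    "DES-1": ["TEST-1"],
    "DES-2": ["TEST-2"],
    "DES-3": [],
    "DES-4": ["TEST-3"],
}

def unlinked_forward(links):
    """Items with nothing downstream (e.g., a requirement with no design element)."""
    return sorted(item for item, targets in links.items() if not targets)

def unlinked_backward(upstream_links, downstream_items):
    """Downstream items that no upstream item points to."""
    referenced = {t for targets in upstream_links.values() for t in targets}
    return sorted(set(downstream_items) - referenced)

print("Requirements with no design:", unlinked_forward(requirement_to_design))
print("Design elements with no test:", unlinked_forward(design_to_test))
print("Design elements with no requirement:",
      unlinked_backward(requirement_to_design, design_to_test))
```

A design element with no requirement behind it (DES-4 here) is as much a finding as a requirement with no design: it is work that may not address the intended use.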

A good requirement will usually be written in a way that includes how it will be judged to have been met. After that there are endless ways of testing all elements of the solution, and these most often (but not always) involve forms of verification (whether the developed solution functions as intended). Certain kinds of implementation may involve numerous verification tests (like automated unit tests for software code) that aren’t written by the analysts and customers, but that will vary with the situation.

If this article seems to wander through a bunch of different concepts in no particular order, let me return to the original definition from the BABOK. Acceptance criteria drive yes/no decisions that can often but not always be automated. Evaluation criteria allow comparison of potential solutions either directly or indirectly.

Root Cause Analysis

Root cause analysis is the process of digging into a problem in order to find out what actually caused it. In simple systems or situations this process is pretty straightforward, but of course we often work in situations that are far more complex. When that happens we need a more robust way to figure out what’s going on and address it.

The BABOK advances the following general procedure for doing this work. Sharp-eyed readers may note similarities to the steps in my framework, and also to the scientific method.

  • Problem statement definition: This identifies the symptoms of the problem, or at least the effects in some way.
  • Data Collection: This is where we gather all the information we can about the system in which the problem occurred.
  • Cause Identification: This identifies candidates for the ultimate cause of the problem, potentially from many possibilities.
  • Action Identification: This describes the corrective actions to take to correct the problem, and ideally do some things to prevent its recurrence.

The BABOK lists two main methods of performing root cause analysis.

Fishbone Diagram

The fishbone diagram, also referred to as an Ishikawa diagram, is a graphical method of structuring potential contributing factors.

The diagram can even be drawn as a Mind Map like this:

Each form of the diagram allows investigators to identify primary, secondary, and even tertiary contributing factors, as shown in the upper version. The primary branches of the diagram can take on any labels, and there can be any number of them, but it used to be common to label the main branches as the Six Ms as follows: Measurement, Methods, Mother Nature, Machine, Materials, and Man. The preferred way of doing this now is to use the generic terms People and Environment for Man and Mother Nature, respectively. This heuristic is simply a way to inspire investigators to consider a wide range of contributing factors.

The Five Whys

Another common method for digging into problems is to keep asking “Why?” until you drill through enough clues and connections to find the root cause of a problem. (Now why do I suddenly have the lyrics for “Dem Bones” running through my head?) The point of this method is to not be satisfied with answers that are, per H.L. Mencken, “clear, simple, and wrong.”

I embarrassingly failed to apply this technique when I was trying to diagnose a problem with a steel reheat furnace in Thailand, as I describe near the bottom of this article. The overall process I describe there tends to go wide, where the Five Whys technique is meant to go deep. However, if you don’t identify the right thread to start with, no amount of pulling on the ones you do identify is likely to lead you to the right source. Therefore I recommend a combination of the two approaches (wide and deep).

I will further opine that most people are not inclined to follow long chains of logic. Of those that can do so, many can only do it in certain (professional and interpersonal) contexts. As you develop this skill, don’t be bashful about getting other people to help. Ask them, “What am I missing?” “What do you know that I don’t know?” If you get enough people with enough different points of view thinking and communicating, you’re more likely to find what you’re looking for. You will also build trust and cooperation, and learn how to dig into problems in new ways.

Other Contexts for Root Cause Analysis

Remembering that business analysis can be used to build, modify, and improve environments as well as processes and products, I want to share some things I learned during my Six Sigma training.

Top Five Reasons for Project Failure

  1. No stakeholder buy-in: If important participants are not invested in the outcome and won’t take the time and energy to contribute to the effort, you are likely to be short on resources and insight needed for success.
  2. No champion involvement: This is similar to the situation above, except this usually involves a senior manager starving the effort of resources or otherwise blocking progress. I’ve heard of executives running entire teams to make people happy, but with no intention of letting them actually change anything.
  3. No root causation: If you don’t identify the right problem, you are unlikely to actually solve it.
  4. Scope Creep: This involves agreeing to include too many extras in a project, resulting in not having sufficient resources to do the intended work or solve the intended problem completely.
  5. Poor Team Dynamics: If the members of the team do not communicate, cooperate, or support each other, the team is unlikely to realize much success. This is the biggest killer of projects.

Start General, Then Get Specific

I’m not going into detail on this one. This outlined approach was taken from my abbreviated study notes. It is a restatement of things I’ve written and said here and elsewhere.

Open – Generate Maximum Suspects
    Macro problem
    Micro problem statement
    Brainstorm
    Cause-and-effect diagram
Narrow – Clarify, Remove Duplicates, Narrow
    multi-vote to narrow list, not a decision-making tool
Close – Test Hypotheses
    1. Do nothing
    2. I said so
    3. Basic data collection
    4. Scatter Analysis or Regression
    5. Design of Experiments

Reactive vs. Proactive

Root cause analysis is usually contemplated in terms of figuring out what went wrong after it happens. In other words, the typical approach is reactive. By contrast, a proactive approach is also possible, and attempts to examine all aspects of any design to prevent, reduce, mitigate, or recover from failures before they happen. One method of doing this is called FMEA, for Failure Mode and Effects Analysis. You should be aware of this, though it isn’t covered in the BABOK. I invite you to check out the Wikipedia article for a brief overview. Note that this method is in keeping with my oft-repeated admonition to be thorough and examine problems from every possible angle.
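A common FMEA convention, which comes from general FMEA practice and not from the BABOK or my framework, rates each failure mode from 1 to 10 for severity, likelihood of occurrence, and difficulty of detection, then multiplies the three into a risk priority number (RPN). The Python sketch below uses invented failure modes and ratings.

```python
# Hypothetical failure modes rated 1-10 for severity, occurrence, and
# detection difficulty (10 = hardest to detect). All values are invented.
failure_modes = [
    {"mode": "Sensor drifts out of calibration", "severity": 6, "occurrence": 5, "detection": 7},
    {"mode": "Operator skips checklist step", "severity": 8, "occurrence": 3, "detection": 4},
    {"mode": "Backup job silently fails", "severity": 9, "occurrence": 2, "detection": 9},
]

def rpn(failure_mode):
    """Risk priority number: severity x occurrence x detection."""
    return failure_mode["severity"] * failure_mode["occurrence"] * failure_mode["detection"]

# Address the highest-RPN failure modes first.
by_priority = sorted(failure_modes, key=rpn, reverse=True)

for fm in by_priority:
    print(f"{rpn(fm):4d}  {fm['mode']}")
```

The ranking is a prompt for discussion, not a verdict; many practitioners also review any failure mode with a very high severity regardless of its RPN.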

TWSL Series 07: Discovery and Data Collection

Today I gave this webinar for the Tom Woods School of Life Lunch and Learn Series. The slides are here.

Business Rules Analysis

Business rules are procedures, decisions, and parameters that govern business operations. They are always under control of the business. (This is true even when imposed by external regulations. The business controls some aspects of how those rules are followed.)

Business rules analysis involves defining, validating, and organizing business rules. Business rules can describe the actions taken, the decisions made, and the criteria that govern those decisions.

Like requirements, business rules must be expressed in a succinct, declarative, unambiguous way that allows them to be understood, implemented, and executed in a way that can readily be judged. They should be expressed using appropriate terms of art (see discussion of a glossary here). Rules should be expressed separately from how they will be enforced. For example, a rule might state that any order reaching sixty days from fulfillment without having been paid will result in a message being sent to the account holder. That describes the rule, but doesn’t say anything about the mechanisms that might be instituted to complete the various actions needed to follow it. There are many ways to express business rules, and multiple rules can be chained together.

In the discussion we had in our weekly Tampa IIBA study group, we identified some of the following bases for establishing business rules. We identified only a small handful when we defined them as generally as shown. Can you think of any more that would be equally general?

  • Follow defined procedures: The actions taken in the organization can be defined by business rules. In terms of the flowchart shown above, think about the overall shape of the process, the nature of processes and decisions in each box, and the nature of the entities that move within and through the process.
  • Sort based on immediate characteristic: Different actions may be taken based on defined but unchanging qualities (e.g., location, customer type, product, job function, etc.) of every entity. Such rules tend to be written as If A do X, Else If B do Y, Else do Z.
  • Sort based on changing characteristic: Different actions may be taken based on defined but changing qualities (often based on time) of every entity. These items are often held someplace, and are periodically checked to see if they’ve changed in a way that requires action, e.g., If Less Than 30 days do nothing, but when 30 days have passed change the status to overdue and send a message to the customer. Another example might involve quantities, e.g., If shelf quantity goes below three, Then order five more items.
  • Changeable parameters governing the above: In programming it’s usually a good idea to create named variables and constants for defined parameters rather than hard-coding them. This way, if the value is used in many places, it only needs to be changed once. Moreover, the name of the parameter gives insight into its potential meaning. For a business rule, an organization might change the time at which a payment is considered to be overdue from 30 days to 45 days. This might affect many decisions or only one, but it is defined and managed in a clear, modular way.
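The last three rule types in this list can be sketched in Python as follows. This is only an illustration under stated assumptions: the customer types, desk names, and function names are hypothetical, while the 30-day, three-item, and five-item values come from the examples in the list.

```python
from datetime import date

# Changeable parameters, named rather than hard-coded.
OVERDUE_AFTER_DAYS = 30
REORDER_BELOW = 3
REORDER_QUANTITY = 5

def route_by_customer_type(customer_type):
    """Sort on an unchanging characteristic: If A do X, Else If B do Y, Else do Z."""
    if customer_type == "wholesale":
        return "wholesale desk"
    elif customer_type == "retail":
        return "retail desk"
    else:
        return "general queue"

def check_invoice(issued, today, overdue_after=OVERDUE_AFTER_DAYS):
    """Sort on a changing characteristic: the age of an unpaid invoice."""
    if (today - issued).days < overdue_after:
        return []  # do nothing yet
    return ["set status to overdue", "send message to customer"]

def check_shelf(quantity):
    """Return how many items to order for the current shelf quantity."""
    if quantity < REORDER_BELOW:
        return REORDER_QUANTITY
    return 0

print(route_by_customer_type("retail"))
print(check_invoice(date(2024, 1, 1), date(2024, 1, 31)))
print(check_shelf(2))
```

Changing the overdue threshold from 30 to 45 days means editing one named value (or passing `overdue_after=45`), and every decision that depends on it follows along.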

For those of you nerds (and I ask your forgiveness if my JavaScript is a bit rusty), here is how the rules for the upper right corner of the business process might be expressed in code. This captures the actions taken, the decisions made, and the parameters governing the decisions.

TWSL Series 06: Approach Contexts for Potential Solutions

Today I gave this webinar for the Tom Woods School of Life Lunch and Learn Series. The slides are here.
