Data Modeling

From the BABOK:

A data model describes the entities, classes or data objects relevant to a domain, the attributes that are used to describe them, and the relationships among them to provide a common set of semantics for analysis and implementation.

I’ve written about data in many contexts, but I usually start by pointing out that data is identified through two processes: discovery, which identifies the nouns and verbs of a process (the BABOK refers to these as entities), and data collection, which describes the adjectives and adverbs of a process (the BABOK refers to these as attributes). The BABOK further describes relationships or associations between entities and attributes (entities-attributes, entities-entities, attributes-attributes). Finally, this information is often represented in the form of diagrams.
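To make the vocabulary concrete, here is a minimal Python sketch of entities, attributes, and one relationship. The Customer and Order entities and all of their field names are hypothetical, chosen only for illustration; they do not come from the BABOK or from any particular engagement.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    # Entity (a noun). Its fields are the attributes that describe it.
    order_id: int
    quantity: int
    unit_price: float

    def total(self) -> float:
        # A derived value computed from two attributes
        return self.quantity * self.unit_price


@dataclass
class Customer:
    # Entity with its own attributes
    customer_id: int
    name: str
    # Entity-entity relationship: one customer places many orders
    orders: List[Order] = field(default_factory=list)


customer = Customer(customer_id=1, name="Acme Corp")
customer.orders.append(Order(order_id=100, quantity=3, unit_price=2.50))
first_order_total = customer.orders[0].total()
```

The verb "places" shows up only implicitly, as the list that links a customer to its orders, which is often how verbs surface once a process is written down as data.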

Different types of data models are generated during different phases of an engagement (per my six-phase, iterative framework).

The conceptual data model is created during the conceptual model phase (oooh, there’s a shock!). This shows how the business thinks of its data, and these diagrams are produced as the result of the discovery and data collection processes mentioned above. This work may be folded into other phases if the engagement is meant to build something new, as opposed to modifying (or simulating) something that already exists.

The logical data model is typically developed during the requirements and design phases. It extends or abstracts the conceptual data model, describing the relationships and the normalization rules that govern and help ensure the integrity of the data representation.
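As a rough illustration of how a logical model's relationships and integrity rules can be expressed, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table and column names are hypothetical. The customer's name lives in one table only (a simple normalization choice), and a foreign key stops an order from pointing at a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on per connection
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        placed_on   TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO customer VALUES (1, 'Acme Corp')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, '2024-01-15')")


def try_orphan_order(connection):
    """Attempt to insert an order for a customer that does not exist."""
    try:
        connection.execute(
            "INSERT INTO customer_order VALUES (101, 999, '2024-01-16')"
        )
        return True
    except sqlite3.IntegrityError:
        # The foreign key rule rejected the orphan row
        return False


orphan_accepted = try_orphan_order(conn)
order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
```

The rule is stated once, in the model, rather than being re-checked by every piece of code that writes an order.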

The physical data model is defined during the implementation phase. This shows how the data is physically and logically arranged in memory, file structures, databases, and so on.
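One way to picture a physical arrangement is a fixed-size binary record, where every attribute has a defined type, size, and byte offset. The sketch below uses Python's standard struct module. The record layout and field names are assumptions made up for this example and do not describe any real system.

```python
import struct

# Hypothetical record layout, little-endian with no padding:
#   uint32 entity_id | float64 value | uint8 status | 16-byte name
RECORD_FORMAT = "<IdB16s"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)  # 4 + 8 + 1 + 16 bytes


def pack_record(entity_id, value, status, name):
    """Serialize one record into its fixed-size byte layout."""
    # struct pads the name with null bytes up to 16 bytes
    return struct.pack(RECORD_FORMAT, entity_id, value, status, name.encode("ascii"))


def unpack_record(raw):
    """Read a record back from its byte layout."""
    entity_id, value, status, name = struct.unpack(RECORD_FORMAT, raw)
    # Strip the null padding that fills out the fixed-width name field
    return entity_id, value, status, name.rstrip(b"\x00").decode("ascii")


raw = pack_record(7, 1.5, 1, "furnace")
record = unpack_record(raw)
```

The same entity and attributes from the conceptual and logical models are still present; the physical model adds the decisions about byte order, field widths, and position.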

There are many ways to list and describe data in diagrams.

This diagram shows the nouns and some implied verbs of a system, sometimes using slightly different verbiage.

Here is a representation of the attributes associated with each identified entity.

Here is a simple representation of the physical location of data in an implemented system.

The header listing below shows a detailed description of the shared memory area from the diagram above.

Here is a more complicated and explicit representation of data in a database.

Image linked from a paper on, may be subject to copyright.

The BABOK describes two specific types of diagrams, an Entity-Relationship Diagram using Crow’s Foot notation, and a Class Diagram from UML. I recommend researching these two types of diagrams as questions about them may arise on the CBAP exam and other exams.
