Graph Project: Auto-Sizing Works — The Easy Part

Today I got the main parts of the auto-sizing going again, though you can see it’s for the simple case when there is never more than one axis on a side (the support is there, I just haven’t finished the implementation for that case) and where major ticks always fall on the end of each axis. Handling major ticks that don’t fall on the end of axes is probably the most complex case.

I appear to have a one-pixel overlap with the main graph label and the x-axis in the high position, but you can see that the bottom margin is set based on the larger of the overflows from the y-axes on the left and right sides.

So far, so good.

Posted in Tools and methods | Tagged , | Leave a comment

Graph Project: Toward Getting Auto-Sizing Working Again

Since I reworked the axes as separate objects I’ve been setting the plotting area and label locations by hand, as a hack/placeholder until I got all the pieces working again. I’ve still got a bit to go, but for now I’ve fixed it so the labels are placed automatically, spaced beyond the tick value labels by a specified buffer, with a further buffer accounted for outside of that. I use the information compiled to calculate the total distance in pixels, perpendicular to the axis, that the axis elements and buffers require. The information about how far the tick value labels extend past the plot area in each direction (left/right or top/bottom) is also tracked correctly, as you can tell by how the information is used to draw bounding boxes (in purple).

I’ve decided that rather than relying on buffers defined from the outer edges of the applicable drawing region, I will instead rely on buffers defined around all outer edges of the axes themselves. I have a buffer defined outside of each element in the perpendicular direction going away from the axis, but I see I need to add them for each end along the axis as well.

The current code also only calculates the overflow space off each end of an axis if major ticks fall right at the end of an axis, as they do in the accompanying figure. If a tick does not fall directly at the end of an axis the code then has to figure out how far away from the end it’ll be, which would not only reduce the overflow space, possibly to zero, but would also affect how the elements need to be placed. The issue is that the distance from the final major tick to the end of an axis is proportional to the length of the axis, but the placement of the axes might depend on the amount of overflow. The spacing calculations may therefore have to be performed iteratively, a process made even more complex by the possibility that multiple axes may be present. That should be a treat to figure out…

Once all of these items are implemented I can get to work on making all items place themselves from outside in and define the plot area which remains (and also determine if not enough space remains). I also have to make sure I account for whether the labels are drawn or not in the calculation of pixels required perpendicular to the axes. So far I’ve just assumed that they’re going to be drawn.
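To make the perpendicular-distance bookkeeping concrete, here is a minimal Python sketch. It is not the project's code: the class name, every field name, and the sample pixel values are assumptions chosen for illustration. It sums the elements from the axis outward and skips the axis label and its buffer when the label is not drawn.

```python
from dataclasses import dataclass


@dataclass
class AxisExtent:
    """Pixel sizes of one axis's elements, measured perpendicular to the axis.

    All names here are hypothetical; they only mirror the elements described
    in the text (tick, tick value labels, axis label, and buffers).
    """
    tick_length: int         # length of the outward part of a tick
    tick_label_buffer: int   # gap between the tick end and the tick value labels
    tick_label_size: int     # largest tick value label, perpendicular to the axis
    axis_label_buffer: int   # gap between the tick value labels and the axis label
    axis_label_size: int     # axis label height, including descenders
    outer_buffer: int        # gap outside the outermost element
    draw_axis_label: bool = True

    def perpendicular_pixels(self) -> int:
        """Total pixels required perpendicular to the axis."""
        total = self.tick_length + self.tick_label_buffer + self.tick_label_size
        if self.draw_axis_label:
            # The label and its buffer only take up room when the label is drawn.
            total += self.axis_label_buffer + self.axis_label_size
        return total + self.outer_buffer


with_label = AxisExtent(6, 4, 12, 5, 14, 3)
without_label = AxisExtent(6, 4, 12, 5, 14, 3, draw_axis_label=False)
print(with_label.perpendicular_pixels(), without_label.perpendicular_pixels())
```

Keeping the flag inside the same object that holds the sizes means the space calculation and the drawing decision cannot drift apart.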

The current code assumes that x-axis labels (the main ones, not those for each tick value) will be printed in their normal, horizontal orientation while the y-axis labels will be displayed rotated counter-clockwise by 90 degrees whether the axis is displayed on the left (low) or right (high) side of the graph. If I decide to change that rotation to 90 degrees clockwise, either by default or by user parameter, I’ll have to make a few more tweaks still. The placement of those labels requires that the letters’ height and descenders be accumulated in the proper order. If I allow rotation in the other direction I’ll have to reverse things and add more switches.

I need to tweak the placement of the bounding rectangle by a pixel or two here and there to ensure I’m summing up the perpendicular distances exactly right. At the moment it looks like I’m about two pixels too narrow. I’ll review that as part of tomorrow’s work.

Finally, I think I may also shorten the tick lines drawn across the plotting area by a pixel, so they don’t overlap the bounding line on the far side of the plot, as is seen here along the bottom edge.


Graph Project: Labels Rotated Properly In Every Location

Today’s work was to get the tick value labels to handle rotation correctly, which they now do whichever side the labels are on. I also realized that I had cleverly defined rotations for the x-axis to go in the opposite direction from those on the y-axis, which is, how shall I say, not actually so clever. I therefore changed it so positive rotation always means clockwise (all labels in the figure are rotated positive 30 degrees) and negative always means counter-clockwise. That should make sense to users.


Graph Project: Axes On Other Side Of Plot

Here I’ve modified the code to not only accept extra axes, but also to be able to place them on either side of the plot area. This is reflected in the location of the tick and axis labels relative to the axis locations (in both the x and y directions) and the meaning of the “in” vs. “out” setting of the tick marks, all of which are presently set to “out” rather than “both” as they have been.

Next I’ll review how the rotations work from the other side. That should be a bit more involved.


Graph Project: Multiple Axes

Now that the axis objects have been split out from the main graph object it’s a simple matter to add in as many as you need, as seen in the image. This example shows parallel axes stacked side to side, but it would be easy to imagine plots being stacked lengthwise, as is often seen in dual plots of stock price and trading volume by time period.

Note that the second axis in either direction would usually be drawn adjacent to the other side of the plot area, which would also require that the elements be drawn in the other direction from the base location of the axis. That will involve more parameters and code, which I believe will be tomorrow’s project.

Also, every decision you make has unexpected consequences. For example, I initially assumed that the axes would always meet in the lower left corner of the graph and for some reason decided that tick marks in that corner should not be drawn over the axes in that corner. This looks OK when the axes do, in fact, meet in that corner, but not so good when they don’t. One case where that may happen is where an axis intersects a perpendicular axis at an arbitrary location and another is when an axis floats away from the plot area as shown here. In that case it just looks stupid.

Here are a few things that need to happen:

  • Ticks specified to cross an axis should be drawn that way no matter where they fall.
  • The spacing and buffers within and around each axis element have to be reconsidered.
  • Once the axis elements are reworked the ability to click on the related area should be added, as a way to kick off user modification processes.
  • The labels are drawn relative to the location of the axis, but the axis itself may need to be able to be drawn in a different spot, and the parameters needed to keep track of all that need to be added.
    • The location of axis endpoints should be specified as it is currently.
    • Tick labels should always be drawn close to that location.
    • The axis label should be drawn just outside the tick labels.
    • The axis itself should be drawn based on where it intersects an axis perpendicular to it, by value for that axis and not by pixel.
    • It’s possible that ticks should be drawn both close to the tick labels and on an axis displayed in the middle of a plot area. Maybe.
    • Buffer spaces should be defined between all elements: inner tick to axis (if that axis is floating and not on or directly adjacent to the plot area), axis or tick to tick label, tick label to axis label, axis label to outer buffer. That should provide enough information to allow all axes to space themselves relative to the specified edge of the graph object, the plot area, and other axes in the same direction.
  • If multiple axes are to be drawn, then certain limits may need to be placed on them:
    • Floating axes should not be able to be longer than the one(s) adjacent to the plot area.
    • If automatic sizing is to be attempted (I had it, but for the time being I’ve lost it), the end buffer space requirements of the axis that takes up the most pixels should govern the required end buffer space for all parallel axes.
    • In theory, no more than one axis in either direction should be able to be drawn somewhere in the middle of the plot area, away from its labels.
    • Similarly, only one axis in either direction should be able to generate background lines on the plot area, and that should probably be the first one listed, one directly adjacent to the plot area, and probably the primary axis on the left side or the bottom.
    • It may be possible to stack axes along their length instead of side to side as shown. Consider a display that shows stock prices and ranges by day in a large, upper section and the volume traded by day in a smaller, lower section, as described above. On the other hand, it would be just as simple to do the same thing using two separate graphs, or some new type of object using polymorphism and inheritance.
    • At some point, if things get too crazy, it may not be possible to automatically place all elements. Some user intervention may be needed.

I decided I’d keep working on things and updated the plotting functions for multiple axes. This involves passing not only the values through to the plotting routines but the relevant axes as well. Functions to determine the pixel location along each axis are called separately.

Of course, you now can’t tell which graph is associated with each axis, but that’s easy enough to address.


Graph Project: Splitting Axes Into Separate Objects

One of the problems with developing from a quick hack, as I have here, is that you sometimes have to take a step back and do some serious re-plumbing. In this case that means breaking things apart and defining axes as separate objects. That’s the way I actually did it the first time I executed this project. This has the benefit of allowing me to easily specify any number of axes and also cuts the amount of code and unique variable names almost in half from what I had. It took some doing over the weekend and this morning but I’m well on my way.

Not everything is quite as automated as it was but I’ll get back to that point over the next couple of days. In the meantime I’ve defined a plot area which is part of the graph object itself and of each of the defined axes. I also defined a TextSizer object, one instance of which is defined as part of the graph object and references to which are passed to each axis object. That allows all entities to refer to the same instantiation. For now the TextSizer doesn’t do any more than the original custom sizing function(s) did, but a mechanism is in place that can be expanded and generalized as needed, so it can potentially handle a wider variety of fonts and pixel sizes. It may be a good idea to store a reference to the plot area in the same way–or it might not. It’s the difference between storing four local coordinate values and going through the extra reference indirections to access the ones stored with the graph object (rather than the individual axes). My feeling is that saving the time for the extra reference indirections is worth giving up the extra storage for each axis object (even if they all contain the same information), but that’s just a gut feeling. I could go the other way if I decide that the repetitive storage hurts my soul more than giving up a few clock cycles. Given the magnitude of the effects of the decision, which approaches zero, I think I’m going to stop thinking about it. I only report it because it may be worth describing how I subconsciously consider such issues at all times.
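The arrangement can be pictured with a small Python sketch. Only the TextSizer name comes from the text above; the Graph and Axis classes, their method names, and the fixed character size are assumptions made for illustration. Every axis shares one TextSizer by reference but keeps its own copy of the four plot-area coordinates.

```python
class TextSizer:
    """Placeholder text measurer; the fixed character size is an assumption."""

    def __init__(self, char_width=7, char_height=12):
        self.char_width = char_width
        self.char_height = char_height

    def width(self, text):
        # Crude estimate: every character is assumed to be the same width.
        return len(text) * self.char_width


class Axis:
    def __init__(self, name, text_sizer, plot_area):
        self.name = name
        self.text_sizer = text_sizer       # shared reference, not a copy
        self.plot_area = tuple(plot_area)  # local copy of the four coordinates


class Graph:
    def __init__(self, plot_area):
        self.plot_area = tuple(plot_area)  # (left, top, right, bottom) in pixels
        self.text_sizer = TextSizer()      # the single shared instance
        self.axes = []

    def add_axis(self, name):
        axis = Axis(name, self.text_sizer, self.plot_area)
        self.axes.append(axis)
        return axis


graph = Graph((50, 20, 350, 220))
x_axis = graph.add_axis("x")
y_axis = graph.add_axis("y")
print(x_axis.text_sizer is y_axis.text_sizer)
```

The trade-off is the one described above: the local copies avoid an extra indirection on every lookup, at the cost of having to refresh each axis if the plot area ever changes.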

Finally, I added a few more complications to govern the display of the plotting area. I’ll go into some of the edge cases these brought up in the days ahead. They mainly have to do with the kind of lines that are drawn around the boundary of the plot area. It’s also clear that a designer would have to be careful not to specify that tick lines be drawn across the plot area based on potentially incompatible locations from multiple axes in one direction (say, if two or more y-axes were specified with independent scales).

In the meantime, the figure below shows that I’m able to generate more or less the same display elements I could before, minus the actual function plot, so I know I haven’t lost anything. I left all the old code in place so one of the first things I’ll do going forward is comment out all the code and declarations that have been obviated, to ensure I don’t still have any hidden dependencies.


Graph Project: Automatically Generating Axis Labels

Up until now the graph object has required that the parameters governing the generation of axis tick labels be specified explicitly. However, it would be nice for the graph to be able to generate reasonable value labels on its own given the high and low range of data it’s supposed to plot. I decided that a) I wanted to make this happen and b) I’m not overly mad about the way I’ve seen it done by other software.

I have a few more scenarios to test (I haven’t looked at ranges that cross zero, i.e., that have both negative and positive values) but I did come up with a method that should generate roundish values across somewhere between six and eleven cycles.

I began with the observation that the size of the chosen interval should be based on the span of values (the difference between the highest and lowest values in a data set) and not on the magnitude of the values. Basically, I calculate the span, divide it by six, and then massage that value until it looks like a fairly rounded value of the appropriate magnitude. The code listing is shown farther down.

I wrote some code and tried a few range values but realized I needed to be more systematic and exhaustive, so I created an Excel worksheet that tested combinations of base values from 10^-6 to 10^6 and ranges from 10^-8 to 10^8, with some randomization thrown in for the first significant digit. It implemented the code in spreadsheet form and listed the expected results. You can see the patterns in the image, starting from column T and moving right.

As you can see, some combinations of base and range yield only two or three values for the range, while other combinations yield up to a dozen (and other testing has indicated that more may be possible). I therefore found it necessary to add extra checks that shrink the interval if there are too few values and enlarge it if there are too many. That said, I also found that the code behaves just a little bit differently than the spreadsheet does. The code gives better results; I think the problem in the spreadsheet is in columns K and L, which correspond to lines 7 to 15 in the code snippet. Even so, the adjustments are occasionally still needed.
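As a rough illustration of the idea (not the project's own listing), here is a minimal Python sketch. The function names, the 1-2-5 set of rounded multipliers, and the default limits of six and eleven cycles are assumptions: it divides the span by six, snaps the result up to a rounded value of the right magnitude, and then steps the interval down or up if the cycle count falls outside the limits.

```python
import math


def cycle_count(low, high, step):
    """Number of whole intervals needed to cover low..high on multiples of step."""
    return math.ceil(high / step) - math.floor(low / step)


def nice_interval(low, high, target=6, min_cycles=6, max_cycles=11):
    """Pick a rounded tick interval based on the span, not the magnitude."""
    span = high - low
    if span <= 0:
        raise ValueError("high must be greater than low")
    raw = span / target
    exponent = math.floor(math.log10(raw))
    # Candidate intervals: 1, 2, 5 times a power of ten, one decade either side.
    candidates = [m * 10.0 ** e
                  for e in (exponent - 1, exponent, exponent + 1)
                  for m in (1, 2, 5)]
    # Start from the smallest rounded value that is at least the raw interval.
    index = next(i for i, c in enumerate(candidates) if c >= raw)
    # Too few cycles: shrink the interval.
    while index > 0 and cycle_count(low, high, candidates[index]) < min_cycles:
        index -= 1
    # Too many cycles: enlarge the interval.
    while (index < len(candidates) - 1
           and cycle_count(low, high, candidates[index]) > max_cycles):
        index += 1
    return candidates[index]


def tick_values(low, high, step):
    """Tick values on multiples of step that bracket low..high."""
    first = math.floor(low / step)
    # Multiply rather than accumulate, to limit floating-point drift.
    return [(first + i) * step for i in range(cycle_count(low, high, step) + 1)]


print(nice_interval(0, 100), nice_interval(12.3, 18.9))
```

Generating each tick as an integer multiple of the interval, instead of adding the interval repeatedly, keeps rounding error from accumulating along the axis, although individual values can still be inexact.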

So far I’ve just written the code to generate the range values but I have not yet extended this to draw the values generated. That’s going to be interesting because the calculations generate unexpected results when some of the values cannot be represented exactly. Who can tell what’s going to happen when you think you’re supposed to get 0.002999, 0.003009, and 0.000001 but you actually get 0.0029990000000000004, 0.0030080000000000016, and 0.0000010000000000000002? Another issue is that combinations of very large and very small numbers (e.g., 4,700,000,000.0000007) cannot be represented; the least significant digits get truncated entirely.

Annoyances like this came up when I wrote code to generate graphs in the early 90s and they come up now, but that’s part of the game, isn’t it? Computers do what they do and you have to work around that. The formatting routines for displaying the tick values may or may not take care of these issues so we’ll see how it goes. If they don’t, then I’m going to add some extra manipulations.
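One common workaround, sketched here in Python as an assumption rather than as what the project's formatting routines actually do, is to format each tick value to a fixed number of significant digits so the representation noise never reaches the label. The function name and the ten-digit default are hypothetical.

```python
def format_tick(value, significant_digits=10):
    """Format a tick value, dropping binary floating-point noise.

    The 'g' presentation type rounds to the requested number of significant
    digits and strips trailing zeros. Very small or very large values come
    out in scientific notation, which a real label routine would need to
    handle separately.
    """
    return format(value, f".{significant_digits}g")


print(format_tick(0.1 + 0.2))                # '0.3'
print(format_tick(0.0029990000000000004))    # '0.002999'

# A large value with a tiny fractional part cannot keep that fraction:
# near 4.7 billion, adjacent doubles are 2**-20 (about 9.5e-7) apart.
print(4_700_000_000.0000007 - 4_700_000_000)
```

Formatting only hides the noise at display time; comparisons made on the underlying values (for example, deciding whether a tick falls exactly on the end of an axis) still need a tolerance.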

I’ll be testing this going forward and describe any further modifications I identify.


Completing the Logarithmic Scale Implementation

Today I was able to finish the implementation of logarithmic scales on both axes. Even better, I added the ability to specify a major interval for each axis separately. This value is used to determine the number of cycles and thus major ticks for linear axes; the number of intervals is determined automatically for logarithmic axes. A happy side effect of this was that axes can now display an arbitrary number of cycles in linear or logarithmic mode, as shown in the figure below. I also rearranged the code to be more modular and generalized.

Posted in Software | Tagged , | Leave a comment

Rules for Drawing Graph Elements, Especially Axes

Since I don’t have time to work through much code today I thought I’d take a step back and define the rules I’ve implemented–or will implement–that govern the construction of the different elements of the graph object I’ve been developing. The process of adding new capabilities always exposes assumptions you’ve made in previous code and shows where you want to be able to break things up to make them more modular and flexible. Nowhere is this more important than in the construction of the axis elements and their associated labels.

The basic outline of the process so far is to determine as much information about the size of elements as possible so they can be placed onto the canvas with the correct amount of room. I’ve done this from the outer edges of the canvas working in, so an important intermediate result is the determination of the area available to draw the actual plot(s) (i.e., the central area where the actual data is shown graphically).

By default, I’ve assumed that the entire canvas will be taken up by the graph object’s components but as I think about it, it might be a good idea to make the outer boundaries completely arbitrary. I can think of a number of reasons for this, one of which would be the ability to place a stripped-down, real-time scrolling chart at a user-specified location on a larger animation of some process being monitored (and/or simulated). Could you place another DOM element (canvas) over an existing display? Sure, but being able to simply save an area and plot a graph in an arbitrary location on a larger canvas probably gives more flexibility. This generates two more changes to all the code so far written: 1) the x- and y-coordinates of each edge have to be specified for the graph object’s area, rather than assuming that the coordinates correspond to the edge of the host canvas, and 2) all internal locations have to be calculated from an offset x and y origin, rather than assuming a default origin of 0,0. That change alone will touch a good proportion of the code so far written.
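The two changes can be sketched in a few lines of Python. The class and method names are hypothetical: the graph's area carries its own edge coordinates, and every internal location is converted through the offset origin instead of assuming 0,0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraphRegion:
    """Edges of the graph object's area on the host canvas, in pixels."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self):
        return self.right - self.left

    @property
    def height(self):
        return self.bottom - self.top

    def to_canvas(self, local_x, local_y):
        """Convert a location relative to the region's origin to canvas pixels."""
        return (self.left + local_x, self.top + local_y)


region = GraphRegion(left=200, top=150, right=520, bottom=390)
print(region.width, region.height, region.to_canvas(10, 20))
```

If all drawing code goes through a conversion like to_canvas, moving the graph to another spot on a larger canvas only means changing the region's edges.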

I still have to account for a few things, all of which were already included in my earlier To Do list. These include legends, labels with line wraps, axes on either end of the plotting area, the inclusion of multiple x-axes or multiple y-axes, and the ability to reverse the direction of any axis.

The big news from my most recent work is defining rules for how axes are defined and drawn. It was obviously easier to implement linear axes first, but the process of adding the ability to draw axes with a logarithmic scale proved illuminating. The initial set of rules I came up with for linear scales was the following:

  • I defined the end points of each axis as coordinates in pixels, using an x,y pair for each end of the axis.
  • I assumed that any axis must be vertical or horizontal. In theory this would save me from having to specify one of the four coordinate values (e.g., the y-values for both ends of the x-axis would be the same), but I chose not to do that, for clarity.
  • I defined separate pairs of inner and outer high and low values for each axis. The inner high and low values are intended to reflect the highest and lowest values actually included in the data or plot (when these can be determined automatically and when it isn’t acceptable to allow plot points outside of the graph bounds). The outer high and low values are intended to coincide with the beginning and end of the axis. The theory there is that there should or at least could be buffer space at either end of the data set. When the inner range can be known the outer range could then be determined automatically in a way that would allow for a reasonably sized spatial buffer. The automated parts of this mechanism have not yet been implemented. So far the ranges have been set manually.
  • Each axis has been divided by an integer number of major ticks, not counting the major tick defined to fall at the start of the axis. That is, if the number of major ticks is defined as four, then there will be five ticks defining four (evenly spaced) intervals along the axis. So far I’ve chosen values for the outer range of values and the number of ticks that generally yield even values at each major tick mark, and I’ve done this manually. It is also possible to define an origin value and an increment and let the number and location of tick marks and values be determined from that. In that case the final tick might not coincide with the far end of the axis; the number of cycles could end in a fraction. I’ve had it in mind to support axis labels that begin and end at arbitrary values, but have not implemented any such support.
  • Each increment of major tick values can be further subdivided by a specified number of minor ticks. Choosing zero or a positive number of minor ticks defines that number of intervals plus one. The minor ticks are then spaced equally down the length of the axis, between each pair of major ticks. The original implementation calculated the total number of tick marks (major and minor) as major * (minor + 1), and then the ticks were all drawn in order. After implementing the logarithmic scale drawing process I see it would be more consistent to draw the minor ticks as a sub-process of drawing each major tick. That would allow the relevant pieces of code to be constructed in a more consistent manner and probably allow greater flexibility in dealing with partial major tick intervals.
  • Tick value labels are only drawn in association with major ticks. (Note that the ticks do not actually have to be drawn, and the graph area lines may similarly be drawn or not drawn.)
  • The number of pixels the first and last major tick values take up beyond the location of the tick itself is calculated for each axis, in the direction of that axis. The number of pixels perpendicular to the axis is also calculated for the longest label. This information may be required to ensure that proper buffer space is provided so all such values will be displayed in full. If either of the end labels does not fall exactly at the end of the axis, then further adjustments would have to be made. I have not implemented calculations that deal with any aspect of partial ranges for axes with linear scales.
  • The tick value labels are placed so the center line of the text (running along the length of the text evenly between the top and bottom) intersects the end of the tick plus an offset defined by a parameter. So far the spacing from top to bottom has not considered the height of any possible descenders, because only numeric labels have been supported so far. The code may be modified to consider the presence of descenders on such labels in the future.
  • The dimensions of the text label for each axis have a bearing on the spacing required to display all elements. So far only the height of the characters is considered, along with a buffer between the label and the outer edge of the tick value labels (perpendicular to the associated axis). Provision for multi-line text labels has not yet been implemented. Provision for inserting line wraps for very long text labels (or possible adjustment of the wrap location by the user), based on the amount of space available (parallel to the associated axis), has not yet been implemented. Flags governing whether or not to display the text label for each axis have been implemented, but not all functionality associated with drawing or not drawing the label and adjusting the space required for the elements has been implemented. So far I have ensured that text labels are always defined and drawn.
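The tick rules for linear scales can be sketched as follows. This is a minimal Python illustration with a hypothetical function name and parameters, not the project's code. It follows the revised approach of drawing the minor ticks as a sub-process of each major tick, and it only covers the simple case where the last major tick falls on the end of the axis.

```python
def linear_tick_pixels(start_px, end_px, major_intervals, minor_ticks):
    """Return (major, minor) tick positions in pixels along one axis.

    major_intervals: number of major intervals; yields major_intervals + 1 ticks
    minor_ticks: minor ticks between each pair of major ticks,
                 giving minor_ticks + 1 sub-intervals
    """
    major_step = (end_px - start_px) / major_intervals
    minor_step = major_step / (minor_ticks + 1)
    majors, minors = [], []
    for i in range(major_intervals + 1):
        major_px = start_px + i * major_step
        majors.append(major_px)
        # Minor ticks belong to the interval that starts at this major tick;
        # the final major tick sits on the end of the axis and starts none.
        if i < major_intervals:
            for k in range(1, minor_ticks + 1):
                minors.append(major_px + k * minor_step)
    return majors, minors


majors, minors = linear_tick_pixels(50, 450, major_intervals=4, minor_ticks=1)
print(majors)
print(minors)
```

Not counting the tick at the start of the axis, the total comes to major * (minor + 1) ticks, the same count the original implementation used, but the nested loop gives a natural place to stop early when a final interval is only partial.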

Many of the rules for drawing axes with logarithmic scales are the same but the necessary differences are these:

  • The lower and upper bounds of the range are defined manually.
  • The major intervals are defined by a multiplier which is successively applied. The multiplier is referred to as the Base. For example, if the lower bound is set as 1.0 with a base of 5, the major interval values would be 1.0, 5.0, 25.0, 125.0, 625.0, 3125.0, and so on. A lower bound of 3.3 with a base of 10 would yield 3.3, 33, 330, 3300, and so on. A lower bound of 1.0 with a base of 4.5 would yield 1.0, 4.5, 20.25, 91.125, 410.0625, and so on. The lower bound and base can be any value and do not have to be whole numbers. The lower bound will always define the location of a major tick.
  • The upper bound may fall anywhere in a major interval. The calculations for how this affects the overflow pixels beyond the far end of the axis have been implemented in this case, though the example I’ve actually plotted ends on an even boundary.
  • The number of minor ticks is determined by the base value minus two, rounded down. That is, a base of 10 would yield 8 minor ticks (2-9), a base of 5.7 would yield 3 (2-4), a base of 3 would yield 1, and a base of 2 would yield none.
  • The number of cycles is given by the logarithm of the maximum value divided by the minimum value, which is then divided by the log of the base value. For example, a range of 0.018 to 1800 yields a multiple of 100,000, or 10 to the power of 5. The natural log of 100,000 (11.5129) divided by the natural log of 10 (2.30259) also works out to 5. This math works for any base value.
  • Once the pixel length of the axis is determined the number of pixels per cycle can be determined.
  • Once the pixels per cycle is determined the location of each minor tick within a cycle can be determined in pixels. The fraction of the cycle length for each tick is given by the natural log of the tick number plus 1 (counting from 1 to the number of ticks) divided by the natural log of the base value. When each major tick is drawn the information then exists to be able to draw all of the minor ticks at once (as a sub-process of drawing the major ticks as described above). Care must be taken to ensure minor ticks are not drawn past the end of the axis for partial cycles; that code has not yet been implemented.
  • Plotting values on a log scale turns out to be more difficult than for a linear scale. Start by taking the natural log of the quotient of the value to be plotted divided by the minimum value, then divide that by the natural log of the base value. Then divide that by the number of cycles. Then you can apply that fraction to the pixel length of the axis. (Note that I still have a small issue to work out about how to count the end pixel.)
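The logarithmic rules above translate fairly directly into code. The following Python sketch uses hypothetical function names and ignores the open question of how to count the end pixel; it covers the cycle count, the minor tick count and fractions, and the value-to-pixel mapping.

```python
import math


def log_cycle_count(min_value, max_value, base):
    """Number of cycles (possibly fractional) between the bounds."""
    return math.log(max_value / min_value) / math.log(base)


def minor_tick_count(base):
    """Base minus two, rounded down: 10 -> 8, 5.7 -> 3, 3 -> 1, 2 -> 0."""
    return max(0, math.floor(base - 2))


def minor_tick_fractions(base):
    """Position of each minor tick as a fraction of one cycle's length."""
    return [math.log(k + 1) / math.log(base)
            for k in range(1, minor_tick_count(base) + 1)]


def value_to_pixel(value, min_value, max_value, base, start_px, end_px):
    """Map a data value to a pixel position along a logarithmic axis."""
    cycles = log_cycle_count(min_value, max_value, base)
    # Cycles from the lower bound to the value, as a fraction of all cycles.
    fraction = math.log(value / min_value) / math.log(base) / cycles
    return start_px + fraction * (end_px - start_px)


print(log_cycle_count(0.018, 1800, 10))
print(minor_tick_fractions(10))
print(value_to_pixel(18, 0.018, 1800, 10, 0, 500))
```

For the example range of 0.018 to 1800 with a base of 10, a value of 18 lies three cycles out of five along the axis, so it lands about 300 pixels along a 500-pixel axis. Note that the base cancels out of the final fraction, which reduces to log(value / min) / log(max / min); the base only matters for where the ticks go.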

First Implementation of Logarithmic Scale On One Axis

I was able to get the x-axis drawn out in a logarithmic scale, get the plot drawn on the same scale, and do it in a somewhat modular fashion. It needs a few pixel-level tweaks still and to be replumbed to separate the relevant sections of code, but conceptually the hard problems are solved.

In this mode you can clearly see the different sections of the plot in different colors.
