
Calculating dimensions of words/phrases programmatically (theory)

March 24, 2011

Introduction

A project I'm working on right now involves working very closely with typefaces: specifically, trying to determine how large (dimensions/area) a word set in a specific font is. An interesting part about this is that the font size itself isn't really of interest; what matters is a general ratio of width to height for the word or phrase. But that's a tangent.

I'm going to go over the approach we (a coworker and I) took in determining the dimensions of a phrase. But I need to be more specific as to what we're looking for. In a phrase such as "juice", we're interested in determining the width/height (in whichever unit you'd like; pixels, points, etc.) of the entire phrase. This means that the dimensions encompass every pixel in the phrase, and that the height of a character like "j" is calculated relative to the baseline of the word. For example, a character such as "j", which may have a height of 100 pixels, would adjust the total height of the word "juice" based on how far below the baseline its descender falls and how far above the baseline its ascender lies.

Font Background

For our project, which I'll cover in a different post, we're using an AFM (Adobe Font Metrics) file. This is a file that contains metric information for every supported character in a font. In our case, the font is Helvetica Neue. Opening up the AFM file shows a bunch of data in the following format:

C 75 ; WX 741 ; N K ; B 64 0 749 714 ;
C 76 ; WX 611 ; N L ; B 64 0 583 714 ;
C 77 ; WX 926 ; N M ; B 65 0 861 714 ;
C 78 ; WX 741 ; N N ; B 62 0 678 714 ;

According to Adobe, this information is broken down as follows:

1. The character code
2. Width of the character (aka. wx below)
3. The character's name (eg. K, L, M, N)
4. The x-coordinate of the left-most pixel (aka. x1 below; can be a negative value)
5. The y-coordinate of the bottom-most pixel (aka. y1 below; can be a negative value)
6. The x-coordinate of the right-most pixel (aka. x2 below; should be a positive value; may depend on weird characters/fonts though)
7. The y-coordinate of the upper-most pixel (aka. y2 below; can be a negative value)

Some notes for the above metrics are as follows:

2. (Character width, wx) This width includes character padding, but not character spacing. Padding is defined by the font itself for each character; how far apart the characters are then spaced from one another is managed separately by word processors, browsers, etc., as character-spacing. Some characters have padding (eg. the letter e; the number 4), while some will in fact seem to have "negative character padding" (eg. the capital letter Y in this font will "bleed" beyond its wx width/value, such that adjacent characters, if the character-spacing were set to 0, would have traces of the character in their "cell").

I imagine other font-file formats are laid out slightly differently, but this seems to be the way AFM files are formatted. Additionally, applications such as word processors and browsers also add line-spacing to text. For our calculations, we want the "best fit" for a word, meaning that when we calculate the height for a word/phrase, we do not include any vertical buffer.

To repeat, the character widths (wx) are meant to represent how much "space" in a phrase the character itself should be allotted; any extra width that bleeds over should bleed over into the adjacent characters. Character-spacing can then be applied to control that (and should be when calculating word dimensions), but we can leave that until the end.

Worth noting again is that the x and y coordinates (excluding x2) can have negative values. Underscores are great examples of this, as they often rest slightly below where most other characters lie, and their left-most pixels are meant to "bleed" over into the adjacent (left) cell/character.
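To make the field layout concrete, here is a minimal Python sketch that parses one of the metric lines shown above. The CharMetrics class and the parse_afm_line function are names made up for this illustration (they are not part of any library), and the sketch assumes integer values laid out exactly as in the sample lines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharMetrics:
    """Hypothetical container for one AFM character-metric line."""
    code: int   # C: character code
    wx: int     # WX: width allotted to the character
    name: str   # N: character name
    x1: int     # B: left edge of the bounding box
    y1: int     # B: bottom edge of the bounding box
    x2: int     # B: right edge of the bounding box
    y2: int     # B: top edge of the bounding box


def parse_afm_line(line: str) -> CharMetrics:
    """Parse a 'C .. ; WX .. ; N .. ; B .. .. .. .. ;' line (integers assumed)."""
    fields = {}
    for part in line.split(";"):
        tokens = part.split()
        if tokens:
            # The first token is the key (C, WX, N, B); the rest are its values.
            fields[tokens[0]] = tokens[1:]
    x1, y1, x2, y2 = (int(value) for value in fields["B"])
    return CharMetrics(
        code=int(fields["C"][0]),
        wx=int(fields["WX"][0]),
        name=fields["N"][0],
        x1=x1, y1=y1, x2=x2, y2=y2,
    )


k = parse_afm_line("C 75 ; WX 741 ; N K ; B 64 0 749 714 ;")
# Allotted width, and how far K's right edge extends past it.
print(k.wx, k.x2 - k.wx)  # prints: 741 8
```

In this sample, the right edge of K (749) sits 8 units beyond its allotted width (741), which is the kind of bleed described above.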

Calculating Dimensions

So now that we have the details of the file, how do we use it to get the dimensions of a word/phrase? We need to do it character by character. It took a lot of white-boarding and diagramming, but we've figured out the following equations. All height and descender calculations are done relative to the baseline, such that the height calculated always includes the spacing either up or down to the phrase's baseline.

Some characters dip below the baseline, so to determine the height of a character, we use the following formula:

characterHeight = max(0, y2) - min(0, y1)

To determine how far the descender dips below the baseline (characters with descenders include j, y, and the underscore), we use the following:

descender = min(0, y1)
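As a rough sketch, the two formulas above translate directly into Python. The word_height function is an extension that the formulas don't spell out: it assumes the height of a whole word runs from the highest top edge to the lowest descender among its characters, which matches the "juice" description in the introduction. The descending-glyph and underscore-like values below are invented for illustration; only the K values come from the sample AFM lines.

```python
def character_height(y1, y2):
    """Height of one character, measured relative to the baseline."""
    return max(0, y2) - min(0, y1)


def descender(y1):
    """How far a character dips below the baseline (0 or a negative number)."""
    return min(0, y1)


def word_height(boxes):
    """Assumed word-level height: highest top edge minus lowest descender.

    boxes is an iterable of (y1, y2) pairs, one per character.
    """
    boxes = list(boxes)
    top = max(max(0, y2) for _, y2 in boxes)
    bottom = min(min(0, y1) for y1, _ in boxes)
    return top - bottom


print(character_height(0, 714))      # K from the sample: 714
print(character_height(-200, 500))   # invented descending glyph: 700
print(character_height(-125, -75))   # invented underscore-like glyph: 125
print(descender(-200))               # -200
print(word_height([(0, 714), (-200, 500)]))  # 914
```

The underscore-like case shows why the height formula uses max(0, y2): a glyph that sits entirely below the baseline still has its height measured up to the baseline.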

And finally (well sort of), to determine the width of a character, we use the formula:

characterWidth = max(0, x2) - min(0, x1)

This is where it gets a little complicated though. The above characterWidth formula calculates the entire width of a character including pixel-bleed-over. So this is the raw width of a character, and this would be the width of a word/phrase if it were one character in length. If the word were two characters in length, the following would be used:

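firstCharWidth = wx - min(x1, 0)
lastCharWidth = x2
wordWidth = firstCharWidth + lastCharWidth

Here, wx and x1 are taken from the first character and x2 from the last; subtracting min(x1, 0) widens the first character by any pixels that bleed to the left of its cell, just as in the single-character formula.

And finally, if we were calculating the width of a word/phrase three or more characters long, we would use the following:

middleCharWidth = wx
wordWidth = firstCharWidth + lastCharWidth + the sum of middleCharWidth over each remaining (middle) character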

In this final calculation, adjacent characters would be given the leniency to bleed over as they were intended to.
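The three cases can be folded into a single function. The sketch below is one possible reading rather than our actual code: it treats any left-hand bleed of the first character as extra width (wx - min(0, x1)) so that the result agrees with the single-character formula, and it takes the last character's x2 as-is. The word_width name and the (wx, x1, x2) tuple layout are assumptions made for the example; the numbers are the K, L, M, N values from the sample AFM lines.

```python
def word_width(chars):
    """Width of a word, ignoring character-spacing.

    chars is a list of (wx, x1, x2) tuples, one per character, in order.
    """
    if not chars:
        return 0
    if len(chars) == 1:
        # Raw width of a single character, including any bleed.
        _, x1, x2 = chars[0]
        return max(0, x2) - min(0, x1)
    first_wx, first_x1, _ = chars[0]
    _, _, last_x2 = chars[-1]
    # The first character keeps its full cell plus any bleed to the left.
    first = first_wx - min(0, first_x1)
    # The last character only contributes up to its right-most pixel.
    last = last_x2
    # Middle characters contribute exactly their allotted width.
    middle = sum(wx for wx, _, _ in chars[1:-1])
    return first + last + middle


K = (741, 64, 749)
L = (611, 64, 583)
M = (926, 65, 861)
N = (741, 62, 678)

print(word_width([K]))            # 749
print(word_width([K, L]))         # 741 + 583 = 1324
print(word_width([K, L, M, N]))   # 741 + 678 + 611 + 926 = 2956
```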

Noteworthy (character-spacing)

The above calculations wouldn't be very useful if character spacing weren't taken into consideration. The word, if drawn out or generated, would have each character drawn directly next to one another, with certain characters bleeding over each other. The formula for character spacing, however, could itself be fairly complicated.

For example, if a word had 12 characters in it, 11 gaps of character-spacing would have to be calculated and added to the total width. This is a pretty straightforward calculation, with the exception that character-spacing is based on font sizes. Since we're not interested in hard dimensions (eg. we're using ratios rather than widths and heights), the calculation would not produce a fixed number for us.
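If the spacing can be expressed in the same units as the AFM metrics, the adjustment itself is small. The sketch below is hypothetical throughout: the function names are made up, the spacing and width values are invented, and the conversion assumes the AFM numbers are expressed in 1/1000 of the font size, which is worth checking against your own font file before relying on it.

```python
def spacing_in_font_units(spacing, font_size, units_per_em=1000):
    """Convert a spacing given at a concrete font size into font units.

    Assumes the metrics use units_per_em units per font size
    (1000 is an assumption, not something read from the file).
    """
    return spacing * units_per_em / font_size


def spaced_width(glyph_width, char_count, spacing):
    """Add one gap of spacing between each pair of adjacent characters."""
    gaps = max(char_count - 1, 0)
    return glyph_width + gaps * spacing


def aspect_ratio(width, height):
    """Width-to-height ratio, independent of the final font size."""
    return width / height


# A 12-character word gets 11 gaps (invented width and spacing values).
print(spaced_width(10000, 12, 50))    # 10550

# 2 units of spacing at font size 20, under the 1000-units assumption.
print(spacing_in_font_units(2, 20))   # 100.0

# Ratio for a word 2956 units wide and 714 units tall, with no spacing.
print(aspect_ratio(spaced_width(2956, 4, 0), 714))
```

Because both the glyph width and the converted spacing are in font units, the resulting ratio stays the same at any font size, as long as the spacing scales with the font.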

Conclusion

While the equations above aren't the most complex, they were fairly tricky to derive based solely on a raw AFM file. I definitely look at typography implementations in applications differently now considering the calculations involved, especially when dealing with variable font sizes.

In our case, we're being forced to calculate ratios rather than hard dimensions for a word/phrase since this word will be used in many different font sizes, and therefore fixed numbers wouldn't be very useful as they'd require scaling too often. It's for that reason that pre-calculating character-spacing can be tricky, but depending on your use, you may be able to simply throw those numbers in and get solid dimensions.

Looking over this post, it feels pretty long-winded, but alas, it's more or less for my own posterity; I wanted to have a written document outlining the logic behind the programmatic mess that has come from all this.