Font Subsetting - shrink down font files to speed up page loads

Fonts are often among the largest resources on a page after images, and can have a big impact on Cumulative Layout Shift (CLS) when their metrics differ from those of the fallback system font. Font subsetting allows us to radically shrink font file sizes, speed up initial page loads, and improve our page speed scores.


With modern web design, it's not uncommon to have 4-6 font files loaded on each page: regular, bold, and italic versions for each of a couple of different typefaces. While compressed font formats like woff2 are now widely supported, this can still add 350-400 KB of weight to an uncached page. This impacts page speed scores across the board, from Largest Contentful Paint (LCP) and First Contentful Paint (FCP) through to the overall page score. While techniques like self-hosting and cache headers can speed up delivery a little, ideally we want to serve smaller files without compromising the design vision of the site. Enter font subsetting!

What is Font Subsetting?

[Image: Glyph Map]

Internally, each binary font file is essentially a giant table. It maps each Unicode code point to the font’s representation of that character (its glyph). Where the font has no glyph for a given character, the cell is simply empty.

Font files will typically support a wide variety of languages within the same file. If we're only going to be using some of these languages, then we have an opportunity to shrink the files and deliver a faster experience for our users. Within the fonts we are using, numerous cells are taken up by glyphs which never appear on the site, such as those for Cyrillic and other non-Latin scripts. What we can do is effectively “purge” these binary files to leave only a subset of characters, focused on the Latin set (the letters and numbers used in English, plus the accented characters used in Spanish, French, German, etc). Depending on the font files we're using, and the level of language support they have, stripping out the other characters can reduce the size of the font files by around 60% (~60 KB → ~23 KB on some typical Google web fonts).
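As a rough illustration of the idea, the table can be modelled as a dictionary from code point to glyph. This is a toy sketch, not how font binaries are actually laid out: the glyph names, the `LATIN_RANGES` list, and the `subset_table` helper are all assumptions made up for the example, and real tools keep a longer list of ranges for "Latin".

```python
# Toy model of a font's character table: Unicode code point -> glyph name.
# The glyph names are invented for this sketch.
toy_font = {
    0x0041: "A",
    0x0065: "e",
    0x00DF: "germandbls",    # sharp s, used in German
    0x00E9: "eacute",        # e-acute, used in French
    0x0416: "zhe_cyrillic",  # Cyrillic capital Zhe
    0x044F: "ya_cyrillic",   # Cyrillic small Ya
}

# Assumed ranges for this sketch: Basic Latin and Latin-1 Supplement only.
LATIN_RANGES = [(0x0000, 0x007F), (0x0080, 0x00FF)]


def subset_table(table, ranges):
    """Return a new table keeping only code points inside the given ranges."""
    return {
        code_point: glyph
        for code_point, glyph in table.items()
        if any(low <= code_point <= high for low, high in ranges)
    }


latin_only = subset_table(toy_font, LATIN_RANGES)
print([f"U+{cp:04X}" for cp in sorted(latin_only)])
# ['U+0041', 'U+0065', 'U+00DF', 'U+00E9']
```

The two Cyrillic entries are dropped, while the accented Latin characters survive. In a real font each removed entry also takes its outline data with it, which is where the file size saving comes from.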

What happens if someone tries to use a character which was removed?

If the browser encounters a character which does not exist in the current font family, it falls back to the next font family declared in our CSS font stack for that character. This can lead to an unpleasant mismatch of fonts within a single word (some letters with a different height, density, etc). For this reason, even if a site only serves English-speaking audiences, we still keep the whole Latin set rather than a stricter subset based on UK & Ireland usage. This avoids words like café rendering the é in a different font. Incidentally, this is how emojis are typically rendered on sites: the main font files do not generally contain emoji glyphs, so the browser falls all the way through to the first font which can render these characters (typically a system font, which is also why some emojis look different on iOS vs Android vs Windows machines).
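The fallback behaviour described above can be sketched as a per-character lookup through an ordered font stack. This is a deliberate simplification of what browsers really do, and the family names and character sets below are hypothetical.

```python
def resolve_fonts(text, font_stack):
    """For each character, pick the first font in the stack that supports it.

    font_stack is an ordered list of (family_name, supported_characters).
    Returns a list of (character, family_name), with None if nothing matches.
    """
    resolved = []
    for ch in text:
        family = next(
            (name for name, supported in font_stack if ch in supported),
            None,
        )
        resolved.append((ch, family))
    return resolved


LOWERCASE = set("abcdefghijklmnopqrstuvwxyz")

# Hypothetical stack: a web font subset too aggressively (no e-acute),
# followed by a system font that still has it.
stack = [
    ("WebFont-subset", LOWERCASE),
    ("SystemSans", LOWERCASE | {"\u00e9"}),
]

for ch, family in resolve_fonts("caf\u00e9", stack):
    print(f"U+{ord(ch):04X}", family)
```

Here the first three letters resolve to the web font and only the final é drops through to the system font, which is exactly the mixed-font word we want to avoid by keeping the full Latin set.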

How is the subsetting done?

Manually! Each font file can be subset using a tool called Glyphhanger. Glyphhanger has a number of options for generating font subsets: extracting only the glyphs for a particular character set, or even extracting only the characters which appear on a remote URL! The most basic example is to generate a subset which contains just Latin characters.

$ npm install -g glyphhanger

$ glyphhanger --LATIN --subset=fonts/*.woff --formats=woff2
Subsetting Roboto-Regular.woff to Roboto-Regular-subset.woff2 (was 65.7 KB, now 15.8 KB)

The above command will subset all woff files inside the fonts/ directory, creating a subset file for each font found which contains just the Latin characters. The formats option allows us to specify the output format(s) we want. In this case, we're asking glyphhanger not only to subset our woff files, but also to save the output in the more compressed woff2 format.
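If we'd rather script this step as part of a build, one option is to call the fonttools `pyftsubset` command directly, which (to the best of our knowledge) is what glyphhanger itself relies on for the actual subsetting. The sketch below only builds the command for each font. The helper names and the simplified `U+0000-00FF` range are assumptions for illustration, not glyphhanger's exact Latin definition, and writing woff2 requires the brotli module to be installed alongside fonttools.

```python
from pathlib import Path

# Assumed range for this sketch: Basic Latin plus Latin-1 Supplement.
LATIN_UNICODES = "U+0000-00FF"


def subset_output_path(font_path, fmt="woff2"):
    """Mirror the naming in the log above: Name.woff -> Name-subset.woff2."""
    path = Path(font_path)
    return path.with_name(f"{path.stem}-subset.{fmt}")


def build_subset_command(font_path, unicodes=LATIN_UNICODES, fmt="woff2"):
    """Build a pyftsubset invocation for a single font file."""
    return [
        "pyftsubset",
        str(font_path),
        f"--unicodes={unicodes}",
        f"--flavor={fmt}",
        f"--output-file={subset_output_path(font_path, fmt)}",
    ]


def build_directory_commands(directory, pattern="*.woff"):
    """Build one command per matching font file in a directory."""
    return [build_subset_command(p) for p in sorted(Path(directory).glob(pattern))]


print(" ".join(build_subset_command("Roboto-Regular.woff")))
```

Each returned list can be passed to `subprocess.run(command, check=True)` once fonttools is installed.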

Additional optimisations

If you have a font file where you know you'll only ever use a small number of characters (maybe a specific font for a sports scoreboard style), then making use of the whitelist option can make for a huge reduction.

$ glyphhanger --whitelist="01234567890-:" --subset=Sports-Font.ttf --formats=woff2
Subsetting Sports-Font.ttf to Sports-Font-subset.woff2 (was 304.25 KB, now 3.85 KB)
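A whitelist is really just a set of code points, so repeated characters (like the second 0 in the example above) make no difference. As a small sketch, the hypothetical helper below converts a whitelist string into the compact `U+XXXX` range notation commonly used to describe font coverage, which is handy for checking exactly what will be kept.

```python
def whitelist_to_unicode_range(chars):
    """Convert a whitelist string to sorted, de-duplicated U+XXXX ranges."""
    code_points = sorted({ord(ch) for ch in chars})

    # Collapse runs of consecutive code points into [low, high] pairs.
    ranges = []
    for cp in code_points:
        if ranges and cp == ranges[-1][1] + 1:
            ranges[-1][1] = cp
        else:
            ranges.append([cp, cp])

    return ",".join(
        f"U+{low:04X}" if low == high else f"U+{low:04X}-{high:04X}"
        for low, high in ranges
    )


print(whitelist_to_unicode_range("01234567890-:"))
# U+002D,U+0030-003A
```

The scoreboard whitelist boils down to just twelve code points: the hyphen, the ten digits, and the colon.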

In the above example, you'll notice that we haven't limited our subsetting to woff font files. Many older sites may still be carrying older, less efficient file formats like ttf. With support for woff2 being widespread, this is a great opportunity to really optimise the font stack on the site, moving from ttf to woff2 as the primary font format served.


[Image: network tab for font loading, before (top) and after (bottom) subsetting]

The top half of this image is the network tab for font loading on an article page of a popular news site. There are a number of font variants being served for different parts of the design. The bottom half shows the result for the same article after subsetting the fonts to just the Latin characters. In this instance, the size of the transferred font files on a cold cache has dropped from ~400 KB to ~140 KB, which is a drop of ~65%.
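The size reductions quoted in this article are easy to sanity-check. A minimal sketch, using a hypothetical `percent_saved` helper and the before/after figures from the examples above:

```python
def percent_saved(before_kb, after_kb):
    """Percentage reduction in size going from before_kb to after_kb."""
    return (before_kb - after_kb) / before_kb * 100


# Figures taken from the examples in this article (sizes in KB).
examples = {
    "news site fonts (cold cache)": (400, 140),
    "Roboto-Regular (Latin subset)": (65.7, 15.8),
    "Sports-Font (whitelist)": (304.25, 3.85),
}

for label, (before, after) in examples.items():
    print(f"{label}: {round(percent_saved(before, after))}% smaller")
```

This gives roughly 65%, 76%, and 99% respectively, which shows how much further a tight whitelist can go compared with a general Latin subset.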

This file size drop led, in this particular case, to an LCP score improvement of close to 20%, an FCP improvement of 10%, and an improvement in the overall page speed score in the same range. If only a limited number of languages are in use on a particular site, then font subsetting can be a really effective way of quickly improving the site's page speed and, ultimately, its Google ranking!

One caveat here is that some font licences do not permit modification of the source files in any way, even for subsetting. So check that the licence for your font permits this type of modification before proceeding!
