The Sad State of Lazy Loading: A bit of a rant

Last updated on October 14, 2019

Every now and again I spend some time on some of my other sites, working on new features and performance. One of the things that comes up occasionally, especially on my photography site with its tons of images, is lazy loading the images.

Now don’t get me wrong, lazy loading images makes a lot of sense. I have pages where I might have 2 or 3 MB of large, high quality JPEGs on the page. As a photographer, small, heavily compressed images with lots of artifacts aren’t something that I find attractive, or what I want to show off.

Problem 1: The web developer has to do it.

The problem here is that like so many things on the web, the burden of doing this has been put on the web developer.

Okay, sure, there are scripts available from all over the place, under just about any license that you could imagine. And it’s not like the JS needs to be all that big. Moreover, you can certainly make an argument that it should be the web developer doing the implementation, as some people might want, or depend on, the standard behavior of downloading everything.

That said, I’m going to make exactly the opposite argument here, well kind of.

As usual, it seems to me that the web browser makers have pushed the problem off onto the much larger number of developers. By this I mean companies like Google and Mozilla have a lot smarter developers working for them than most of us doing front end web work. Moreover, the browser has a lot more information about what’s going on than it exposes to the JavaScript engine.

For example, one might want to change the lazy load behavior based on the performance of the user’s connection. One example of this might be increasing the lead time for users with slower connections so they’re less likely to see the placeholder images. The browser can make estimates of available bandwidth based on the time and size of the pages it’s downloading; to do the same in JavaScript would require connecting and downloading resources to test the performance…
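To make the lead time idea concrete, here is a minimal sketch of mapping a bandwidth estimate to how far ahead of the viewport loading should start. The function names, thresholds, and pixel distances are all made up for illustration, and the sketch takes the bandwidth estimate as an input rather than saying where it comes from.

```javascript
// Hypothetical helper: pick how far ahead of the viewport (in pixels)
// to start loading an image, given an estimated downlink speed in Mbit/s.
// The thresholds and distances are illustrative assumptions, not measurements.
function leadDistancePx(downlinkMbps) {
  if (!Number.isFinite(downlinkMbps) || downlinkMbps <= 0) {
    return 1000; // unknown speed: be conservative and start early
  }
  if (downlinkMbps < 1) return 2000; // slow link: start very early
  if (downlinkMbps < 5) return 1000; // middling link
  return 300;                        // fast link: load just before it scrolls in
}

// Format the distance as a margin string (vertical, horizontal) of the kind
// accepted by the rootMargin option of IntersectionObserver.
function rootMarginFor(downlinkMbps) {
  return `${leadDistancePx(downlinkMbps)}px 0px`;
}
```

A script could pass the resulting string as the rootMargin option when it creates its IntersectionObserver, so slower connections get a larger head start.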

Moreover, implementing lazy loading at the browser level means that web sites aren’t required to push even more content down to the browser. Okay sure, my current lazy load solution, the script at least, is only 1.37 KB after running through Closure, and gzipped that drops to 693 bytes, which honestly is the smallest lazy loading script I’ve found to date. But the script itself isn’t the only thing that has to be transferred to the client.

Which brings me to problem 2.

Problem 2: It makes the HTML a mess.

Doing all of this in JavaScript means making further compromises with the document’s HTML, and this can have some far-reaching ramifications.

So normally we’d embed an image in a page using something akin to the following bit of code.

<img alt="Some image." src="someimage.jpg" 
    srcset="someimage-large.jpg 1000w, 
            someimage-medium.jpg 500w, 
            someimage-small.jpg 250w" 
    sizes="(max-width: 1000px) 100vw, 700px">

Admittedly this is a bit more complex than what many people might be used to because of the HTML5 responsive image stuff. (I’ll also admit, I’m a bit shaky on the sizes part of that.) However, I’m including this because it’s something I find to be an important feature, and something that’s really useful for saving bandwidth on responsive sites when used with mobile devices with smaller than desktop screens.

In any event, the idea here is that the browser is given not just the image sources as it would have in the past, but a list of alternative sources that it can choose from based on screen size or resolution (e.g. retina scaling).

Under normal operations, when a browser is parsing the HTML for a page and it encounters an img tag, it requests the image from the server and downloads it (assuming it’s not already in the browser’s cache). In the case of responsive images, it figures out which of the alternatives it should download, if any, and requests that.

The problem here is that this is all automatically done by the browser in the process of parsing the page. You can’t, at least not with JavaScript and at least not to my knowledge, stop the browser from doing this.

When you’re lazy loading an image, this is a problem. You don’t want the browser to go and download the full resolution image. Either you don’t want the browser to download anything, or in my opinion, you want it to download a small, low-resolution placeholder that indicates that there’s an image there but that it hasn’t loaded yet.

The way most scripts, if not every one (certainly every script I’ve seen), get around this problem is to not use the standard src and srcset attributes. Instead they use data-src and data-srcset^[1].

This gets around the browser downloading the large images automatically, and even allows you to specify a small placeholder image too.

Instead of the above code snippet you might now have something like this:

<img alt="Some image." 
    src="someimage-placeholder.jpg" 
    data-src="someimage.jpg"
    data-srcset="someimage-large.jpg 1000w, 
            someimage-medium.jpg 500w, 
            someimage-small.jpg 250w" 
    sizes="(max-width: 1000px) 100vw, 700px">

Adding the data- prefix to src and srcset stops the browser from downloading a large image. And since we still need a valid img tag (which requires a src attribute), the src attribute can be used for a small placeholder.
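The other half of the trick is the script that later moves the data- values back onto the real attributes. A minimal sketch of that step might look like the following; promoteLazyImage is a name made up for illustration and isn’t taken from any particular lazy load script.

```javascript
// Copy the deferred data-* values onto the real attributes so the browser
// starts fetching the full image. `img` is any object with the standard
// getAttribute/setAttribute/removeAttribute methods, such as an <img> element.
function promoteLazyImage(img) {
  // srcset is handled before src; this ordering is a choice made for this
  // sketch, not a requirement stated in the text.
  for (const name of ["srcset", "src"]) {
    const deferred = img.getAttribute("data-" + name);
    if (deferred !== null) {
      img.setAttribute(name, deferred);
      img.removeAttribute("data-" + name);
    }
  }
}
```

In a page, something like this would typically be called once the image gets close to the viewport, for example from an IntersectionObserver callback.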

This isn’t a huge issue if you’re hand-writing all of your HTML code, or your CMS was designed around the idea that the img tag it outputs to the client isn’t the image that it’s supposed to ultimately be. However, this whole idea breaks down when your CMS is designed around standard HTML. Though to be completely honest, there’s no reason that anyone should have to kludge around like this.

So take, for example, WordPress. The post editor is designed around the idea of working in standard HTML; even the new block editor doesn’t, nor should it, work on the idea of producing data-src and data-srcset instead of proper src and srcset attributes. Though as I said, it shouldn’t be doing that anyway; the most appropriate thing for the CMS to do is store the proper (i.e., the high resolution) image src in the img tag in the document/post.

To work around this, you need to filter the output server side so that you can convert the intelligible HTML into the kludgy mess that’s needed to feed these lazy load scripts.

Of course, filtering HTML like this comes with another big caveat. HTML is not a regular language.

If you read any of the vast majority of guides around for, say, web programming in PHP, there’s invariably a lot of use of regex to manipulate HTML. Presumably, the thinking is that the HTML output is a string, and regex is the ultimate Swiss Army knife of string processing, so it must be the right fit. The problem is, HTML isn’t something that can be processed on the whole this way; it’s not a regular language.
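A small illustration of how this goes wrong, written in JavaScript for brevity (the same pattern behaves the same way in PHP’s regex functions). The pattern and the sample tag are invented for the example.

```javascript
// A naive pattern of the kind often used to "find img tags".
const naiveImgPattern = /<img[^>]*>/;

// A valid tag whose alt text happens to contain ">".
const html = '<img alt="width > height" src="photo.jpg">';

// The match stops at the ">" inside the alt text, so the src attribute
// is never seen by whatever code processes the match.
const match = html.match(naiveImgPattern)[0];
console.log(match);
```

The pattern can be patched to cope with quoted values, but each patch tends to uncover another case (single quotes, comments, script blocks), which is the fragility being described.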

The proper way to manipulate HTML is to use a DOM manipulation tool, one that understands the structure of HTML and can pick out elements and manipulate them. If you’re working in PHP, the built-in DOM tools can be used, or you can use one of the myriad of libraries out there that provide alternative (often more jQuery-like) interfaces to do the same thing.

Moving from regex to a proper tool of course adds more code to your codebase. If the library is pure PHP, for example, then there’s going to be a performance impact compared to one that’s implemented in native code. Moreover, this means a larger surface for potential security problems (though I’m sure I’m the only one that even thinks about that). Of course, most people, at least those working in PHP, probably don’t know or bother to do this, instead opting for regex solutions that are fragile and finicky.

The real problem here is that now we’re adding even more overhead to the server just to support a kludgy workaround for something that’s supposed to reduce bandwidth and make the client experience better.

HTML and browser support is what we should have.

As far as I’m concerned this is all a huge mess of a kludge and it’s putting the brunt of the work in the wrong place. As a web developer, to implement lazy loading, you need to find or write some JavaScript on the front end, and write a backend post processor that translates clean HTML img tags into something that doesn’t work out of the box without the JS.

Oh, yeah, JavaScript.

This whole thing requires that users have JavaScript enabled or that the server side processing code goes a step further and emits the original img tag in a noscript block so that people who don’t have JS enabled don’t end up just seeing a low-res or other placeholder image.
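To show what that extra server-side step amounts to, here is a rough sketch of the transformation, assuming the img attributes have already been pulled out by a real HTML parser. It is written in JavaScript for illustration; lazyMarkup, imgTag, escapeAttr, and the placeholder argument are all hypothetical names.

```javascript
// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value).replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Serialize an attribute map (as a DOM parser might hand it over) to an <img> tag.
function imgTag(attrs) {
  const parts = Object.entries(attrs).map(([k, v]) => `${k}="${escapeAttr(v)}"`);
  return `<img ${parts.join(" ")}>`;
}

// Build the lazy variant of the tag plus a <noscript> copy of the original.
// `placeholderSrc` is a hypothetical low-resolution stand-in image.
function lazyMarkup(attrs, placeholderSrc) {
  const lazy = {};
  for (const [name, value] of Object.entries(attrs)) {
    if (name === "src") {
      lazy["src"] = placeholderSrc; // keep the tag valid with a small image
      lazy["data-src"] = value;     // park the real source for the script
    } else if (name === "srcset") {
      lazy["data-srcset"] = value;
    } else {
      lazy[name] = value;           // alt, sizes, and so on pass through
    }
  }
  return imgTag(lazy) + "<noscript>" + imgTag(attrs) + "</noscript>";
}
```

Note that a visitor without JavaScript would get both the placeholder and the noscript image from markup like this, so a real implementation would also need some way to hide the placeholder in that case.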

On the other hand, there are the browsers; the place where this ought to be handled, not by every developer.

To start with, Google, Mozilla, Apple, Microsoft, Opera, etc. all employ software engineers that are almost certainly a lot smarter than most of us doing web development work (at least front-end web dev work). There’s a much higher chance that they’re going to be able to write a performant, clean, and secure solution than the rest of us.

Moreover, it optimizes developer resources. By this I mean you have a handful of developers implementing a feature that lots of developers can easily use, instead of a lot of developers using their time to implement a feature that only works on their site.

To put some perspective on this: while the JavaScript I used was readily available, integrating it with my WordPress theme, writing the backend code, testing and debugging the resulting output, deploying it, and then fixing the inevitable bugs that made it through my internal testing took me the better part of 8 hours (I didn’t say I was the fastest or best programmer).

Currently there are about 1.8 billion sites on the internet. If 5% of them were to implement this same solution, and each took the same time I did to implement and test it, you’re talking about 720 million man-hours (roughly 82,000 years of round-the-clock work, or about 346,000 man-years at 2,080 working hours per year) going into doing this. On the other hand, if Google et al. spent 40 man-hours each implementing this, even if it took a man-year to get a standard done, that’s still vastly less than all of us doing it ourselves.
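The arithmetic behind that comparison can be checked with a few lines. The site count, adoption rate, and hours come from the text; the 2,080-hour working year and the count of five browser vendors (the five named earlier) are assumptions added for the sketch.

```javascript
const sites = 1.8e9;          // sites on the internet (figure from the text)
const adoptionPercent = 5;    // share of sites implementing lazy loading themselves
const hoursPerSite = 8;       // the author's own implementation time

const totalHours = (sites * adoptionPercent / 100) * hoursPerSite;

// Two ways to express that in years. The 2,080-hour working year
// (40 hours x 52 weeks) is an assumption, not a figure from the text.
const HOURS_PER_WORK_YEAR = 2080;
const calendarYears = totalHours / (24 * 365);
const workingYears = totalHours / HOURS_PER_WORK_YEAR;

// Browser side: an assumed five vendors at 40 hours each, plus one
// working year to get a standard done.
const vendorHours = 5 * 40 + HOURS_PER_WORK_YEAR;

console.log(totalHours);                           // 720000000
console.log(Math.round(calendarYears));            // 82192
console.log(Math.round(workingYears));             // 346154
console.log(Math.round(totalHours / vendorHours)); // 315789
```

Under these assumptions the everyone-does-it-themselves route costs on the order of three hundred thousand times more effort than doing it once in the browsers.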

Beyond that it would potentially mean that more sites are implementing lazy loading. Here too there’s value. More and more people work from mobile devices, which in most places ultimately have some pretty significant restrictions on their network bandwidth (and overages often cost a lot).

One of the articles on my other site is about choosing what camera gear to take when going on a cruise to Alaska. Without lazy loading, the initial page load totals 1.4 MB due to the images on the page. With lazy loading, the page is only 700 KB (still big, and dominated by a large header image), but roughly half the size.

With lazy loading, if you leave after reading the first couple of paragraphs, you’ve used less than 1 MB of your bandwidth and mine. On the other hand, without lazy loading, you end up downloading the whole 1.4 MB of data.

Now this is just one page on one site. If that page sees 1000 views in a month, that’s roughly 1.4 GB of bandwidth used. And there’s a broader consideration on page weight. Most of us in the western world can be blasé about how large a web page is because bandwidth is comparatively cheap for us. For those in the developing world, the cost of a web page in terms of their annual income is actually significant^[2].
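As a quick check of the monthly numbers, here is the same calculation spelled out, treating the page sizes as 1,400 KB and 700 KB and using decimal units (both rounding assumptions). The lazy figure is a floor, since it only counts visitors who never scroll far enough to trigger more images.

```javascript
const views = 1000;        // monthly views (figure from the text)
const fullPageKB = 1400;   // initial load without lazy loading (1.4 MB)
const lazyPageKB = 700;    // initial load with lazy loading
const KB_PER_GB = 1e6;     // decimal units assumed

const withoutLazyGB = (views * fullPageKB) / KB_PER_GB;
const withLazyFloorGB = (views * lazyPageKB) / KB_PER_GB;

console.log(withoutLazyGB);   // 1.4
console.log(withLazyFloorGB); // 0.7
```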

Proposing a solution

What’s frustrating to me is that this is something that could have easily been addressed when the HTML5 specification was being developed (after all, it included the responsive image stuff). But it wasn’t, so we’re still stuck with this mess.

With that said, I do want to propose a solution. Simply having browsers load all images lazily is probably not a good idea. It would probably work most of the time, but I’m sure there are a bunch of sites out there where it would break some functionality that they depend on.

Therefore I’m going to propose a new header and a pair of new attributes.

The new header would be something like:
<meta lazy-load="on|off" type="%mime_type%">
or
<link lazy-load="on|off" type="%mime_type%">

The intent is to indicate to the browser with a single tag that the specified MIME types, or if “*” was specified, all media (images, video, and audio), can be loaded lazily on that page. (Though I admit I’m not sure whether a meta or link tag is more semantically appropriate.)

I’m suggesting both on and off as possible flags as it’s reasonable to see a point in the future where browsers lazy load media by default, and a developer may want to turn that off for some reason.

The second aspect is a new attribute for media tags, lazy.

This attribute could either be specified without a value (the equivalent of boolean true), such as lazy, in which case it would indicate that the marked media element can be loaded lazily, or with a value such as lazy="no", which would indicate that the media element may not be loaded lazily.

In either case, the lazy attribute would override the document or browser level behavior for the specific element. E.g., if the document is set to lazy load, but you need to ensure that a specific image always loads, then marking it with lazy="no" would cause it to be loaded in the initial parsing, not on demand.
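The precedence described here (element attribute first, then the document-level header, then whatever the browser does by default) can be sketched as a small function. This only models the proposal: shouldLoadLazily is a made-up name, MIME type matching for the header is left out, and treating any value other than "no" as "lazy" is an assumption, since the proposal only spells out the bare attribute and lazy="no".

```javascript
// Decide whether one media element should load lazily under the proposal.
//   elementLazy:    undefined (attribute absent), "" (bare `lazy`), or a value such as "no"
//   documentLazy:   undefined (no header present), "on", or "off"
//   browserDefault: what the browser does when nothing is specified
function shouldLoadLazily(elementLazy, documentLazy, browserDefault = false) {
  // The element-level attribute wins over everything else.
  if (elementLazy !== undefined) {
    // Assumption: only "no" opts out; any other value counts as lazy.
    return elementLazy.toLowerCase() !== "no";
  }
  // Next comes the document-level header.
  if (documentLazy === "on") return true;
  if (documentLazy === "off") return false;
  // Otherwise fall back to the browser's own default.
  return browserDefault;
}
```

For example, shouldLoadLazily("no", "on") is false, matching the case where a document opts in but one image must always load up front.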

[1] The HTML standard permits attributes prefixed with data- as valid HTML. See Using data attributes | MDN.
[2] Compare the cost of a site as % of GNI for my site (cult-of-tech.net) versus blog.codinghorror.com.