This is a guest post by Warren Gaebel, an author at our sister blog, Monitor.Us.
Zero is the best possible metric! If something takes zero time, it is impossible to improve on its performance. Since end-user performance is judged by how long the user sits twiddling his thumbs waiting for a web page to magically appear, everything the computer does during this time is a performance issue, and everything done outside this time is not.
Eliminating Performance Problems
Issue #1 – Combining & Inlining: Some components can be combined into one download to reduce connecting, handshaking, and round-trip times.
An automated build process can combine files into one download stream. For example, client-side scripts, style sheets, and small images can be moved into the .html file. The developers’ copy can remain separated into individual files while the production copy is combined into one file.

Issue #2 – Compression: Much compressible material travels through the Internet uncompressed, even though compression is known to eliminate up to 90% of this traffic. Because compression can yield a substantial performance improvement, it should be automatic for everything (unless it would increase the size, of course).

Apache’s mod_deflate module can be configured to automatically compress .html, .css, .js, and other files, but its compression happens repeatedly (i.e., every time the page is requested). Worse yet, it happens during that all-important user-waiting time.

Apache’s compression is essential for dynamically-created content, but there’s a better way of handling static content. A build-time process can ensure that all static material is compressed once, at build time. Apache and other servers can then serve the compressed version without stopping to recompress it. The build-time process compresses once, when the user is not waiting.
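As a sketch of the idea, here is what a build-time precompression step might look like in a hypothetical Python build script (the asset list and file layout are illustrative assumptions, not part of the article):

```python
import gzip
from pathlib import Path
from typing import Optional

def precompress(path: Path) -> Optional[Path]:
    """Write a .gz sibling of a static asset at build time.

    Returns the compressed file's path, or None when compression
    would not shrink the file (in which case nothing is written).
    """
    data = path.read_bytes()
    compressed = gzip.compress(data, compresslevel=9)
    if len(compressed) >= len(data):
        return None  # compression would increase the size; serve the original
    out = Path(str(path) + ".gz")
    out.write_bytes(compressed)
    return out
```

Because this runs once at build time, the server never spends user-waiting time recompressing the same bytes.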
Of course, this approach would require a small internal change to Apache, but it’s a simple one. When it receives a request for x.html, the server should automatically look in the same directory for x.html.gz, x.html.zip, x.html.7z, and other compressed versions, and use the smallest one the browser is willing to accept. The same approach can be used for .js and .css files.
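The lookup logic itself is tiny. A minimal sketch, in Python for illustration (the encoding-to-suffix table is an assumption; a real server would also handle quality values in Accept-Encoding):

```python
from pathlib import Path

# Assumed mapping from Accept-Encoding tokens to build-time file suffixes.
ENCODINGS = {"gzip": ".gz", "br": ".br"}

def pick_variant(path: Path, accept_encoding: str):
    """Return (file, encoding) for the smallest precompressed variant
    the client accepts; falls back to the original file."""
    accepted = {tok.split(";")[0].strip() for tok in accept_encoding.split(",")}
    best, best_enc = path, None
    for enc, suffix in ENCODINGS.items():
        candidate = Path(str(path) + suffix)
        if (enc in accepted and candidate.exists()
                and candidate.stat().st_size < best.stat().st_size):
            best, best_enc = candidate, enc
    return best, best_enc
```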
Issue #3 – Blocking: Scripts and other components block parsing, downloading, and execution.
If the .html were to identify a priority level for each component, the build-time process could create the code for the appropriate download method. Needed-now components must be downloaded before the page can be considered interactive; the automated build process can inline them. Needed-soon components will be needed within a second or two after the page becomes interactive; the build process can create code to download them after onLoad fires. Maybe-needed-later components may be needed later on the same page; the build process can create code to download them after the needed-soon components are taken care of. Finally, the looking-ahead components are our best guess of what the user will need on the following page; the build process can create code to download them after everything else is taken care of.
The best part of this process is that the web developer need not write any special code. Every component can be specified by an html tag. The only difference would be to add a priority attribute to that tag. Examples:
<style ... priority="1">  <!-- needed now -->
<script ... priority="2"> <!-- needed soon after interactivity -->
<img ... priority="3">    <!-- needed later on this web page -->
<img ... priority="4">    <!-- cache for use on future pages -->
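To show how a build tool could act on this hypothetical priority attribute, here is a small Python sketch that scans a page and groups resources into the four download classes (the attribute itself is the article’s proposal, not an existing standard):

```python
from collections import defaultdict
from html.parser import HTMLParser

class PriorityCollector(HTMLParser):
    """Collects resource URLs from tags carrying a priority attribute."""
    def __init__(self):
        super().__init__()
        self.buckets = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "priority" in a:
            url = a.get("src") or a.get("href")
            if url:
                # Integer part picks the priority class;
                # decimals order downloads within the class.
                p = float(a["priority"])
                self.buckets[int(p)].append((p, url))

def download_plan(html: str):
    """Return {priority class: [urls in order]} for a page."""
    collector = PriorityCollector()
    collector.feed(html)
    return {k: [u for _, u in sorted(v)]
            for k, v in sorted(collector.buckets.items())}
```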
Using decimals instead of integers gives us the ability to set priority levels within each of the four priority classes. Another adaptation would allow us to specify whether the downloads should be serial or parallel.

Issue #4 – Unnecessary Downloading: Style sheets and scripts are downloaded, then never used on the page. Some, whether used or not, are downloaded multiple times on a single page.

A build-time process can parse the document and download only what the document references (plus whatever those components depend upon). It would make sure that each required component is downloaded once and only once, and that unnecessary components are not downloaded at all.
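The once-and-only-once part is straightforward to sketch. Assuming the same kind of Python build script as above, a scanner can record each referenced resource the first time it appears and ignore duplicates:

```python
from html.parser import HTMLParser

class ResourceScanner(HTMLParser):
    """Records each script/style/image reference once, in first-seen order."""
    def __init__(self):
        super().__init__()
        self.resources = []
        self._seen = set()

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("script", "img"):
            url = a.get("src")
        elif tag == "link":
            url = a.get("href")
        else:
            url = None
        if url and url not in self._seen:
            self._seen.add(url)
            self.resources.append(url)
```

The build process would then emit download code only for `scanner.resources`, so duplicated tags cost nothing extra on the wire.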
Issue #5 – Image Minification: Many (perhaps most) images that traverse the Internet are larger than they need to be. Their resolution is higher than what will be displayed. Their colour depth is greater than necessary. They contain needless meta-data.
The web page specifies how wide and how tall the image is supposed to be, so an automated build process can resample the image down to exactly the right size. While it’s at it, it can reduce the colour depth and strip out the meta-data. This not only saves on network transmission time, but it also sets the browser free from its resizing task.
The build-time process could also create clickable thumbnails so small images can be displayed on the page and large images can be viewed with a mouse click. The larger image can be downloaded after the rest of the page is fully loaded.
A build-time process can handle this task quite nicely. The developers can continue working with the original source files while the server serves the minified versions.
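As an illustration, a build step could resample and strip images with an imaging library. This sketch assumes the Pillow library is available at build time; re-encoding from raw pixels discards the original file’s metadata as a side effect:

```python
from io import BytesIO
from PIL import Image  # assumes Pillow is installed in the build environment

def minify_image(data: bytes, display_size: tuple) -> bytes:
    """Resample an image down to the size the page will actually display."""
    img = Image.open(BytesIO(data))
    img = img.convert("RGB")      # normalize mode (also drops alpha/palette)
    img.thumbnail(display_size)   # shrinks in place; never upscales
    out = BytesIO()
    # Re-encoding from pixels means the source's metadata is not copied over.
    img.save(out, format="JPEG", quality=85, optimize=True)
    return out.getvalue()
```

Developers keep the full-resolution originals; only the minified copies are deployed.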
Issue #7 – Broken Links & Redirects: Broken links and redirects create needless back-and-forth traffic on an already-slow Internet. Apart from the additional wait time, they annoy the user.
An automated build process can verify all links. If the process knows that a resource has been moved to a different URL, it can automatically make the change. If not, it can warn the developer about the problem. It won’t guarantee that there will never be any broken links because links may become broken later, but it helps. [I think this will impact user satisfaction more than performance.]
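A minimal sketch of that verification step, assuming the build process keeps a map of known moves and a list of resources it knows exist (both hypothetical here):

```python
# Hypothetical build-time data: known moves and known-good URLs.
MOVED = {"/old/about.html": "/about/"}
VALID = {"/about/", "/index.html", "/contact/"}

def check_link(url: str):
    """Return (fixed_url, warning).

    Rewrites links known to have moved; warns about links the build
    process cannot account for; passes known-good links through.
    """
    if url in MOVED:
        return MOVED[url], None
    if url in VALID:
        return url, None
    return url, f"possible broken link: {url}"
```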
Issue #8 – URLs for Directories: If a URL for a directory is not terminated by a slash, the server will tell the client to add the slash, then the client will have to start the request all over again, this time with a slash at the end. It’s just silly!
This trivial little matter can be taken care of by an automated build process. Developers don’t have to think about it any more, and URLs don’t have to be rewritten to compensate for it.
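The fix is a one-liner in a build script: if a URL’s path names a directory in the document root, append the slash before the page ever ships. A sketch (the docroot layout is assumed):

```python
from pathlib import Path

def normalize_directory_url(url_path: str, docroot: Path) -> str:
    """Append the trailing slash at build time when a URL names a
    directory, sparing the server a redirect round trip."""
    if not url_path.endswith("/") and (docroot / url_path.lstrip("/")).is_dir():
        return url_path + "/"
    return url_path
```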
Issue #9 – Connection Reuse: Establishing a connection costs time. Establishing a new connection when an existing one could be reused is wasteful.
An automated build process can make sure that connections are reused as much as possible, freeing the developer from this concern.
An automated build process offers other, non-performance, benefits, too.
Benefit #1 – Browser-Specific Code: One or two unruly browsers ignore standards, and all browsers do things their own way where the standards are silent. Either way, you may have to write code multiple times, once for each browser. Yuck!

An automated build process can create one copy of the page for each browser, with all the other-browser code stripped out. The server-side script can inspect the user-agent string to determine which file to send.

Benefit #2 – Including Common Elements: If some elements occur on every web page in the website, the build process can add them automatically. Headers, footers, navigation systems, and standard layouts can all be handled this way, which means the work is done once, not once per page.

Benefit #3 – Organization of Source Code: Many web projects organize their components into a hierarchical directory structure that suits the user, his browser, and how cookies are used. Developers can be forced to think in those terms, but development is more efficient when code and resources are encapsulated in the way developers think.
An automated build process would allow both organizations. The development team could structure their code and components in a manner that makes sense to them, then the build process would place post-processing copies into the production structure while leaving the developers with the originals. A simple manifest file could identify the source and destination for every file.
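Such a manifest can be as simple as a source-to-destination mapping that the build script walks. A sketch, with an invented two-entry manifest for illustration:

```python
import shutil
from pathlib import Path

# Hypothetical manifest: developer-friendly layout -> production layout.
MANIFEST = {
    "src/pages/home.html": "htdocs/index.html",
    "src/styles/main.css": "htdocs/css/site.css",
}

def run_build(manifest: dict, root: Path) -> None:
    """Copy each source file to its production destination,
    creating destination directories as needed."""
    for src, dest in manifest.items():
        target = root / dest
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(root / src, target)
```

In practice the copy step is where the earlier transformations (combining, compressing, minifying) would hook in, since the originals are left untouched.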
Benefit #4 – Catch Some Programming Errors: Developers are human. They make mistakes and create their own performance issues. Examples: serving components from a domain instead of from an IP address, using SSL where it’s not needed, using multiple URLs for the same resource, setting inappropriate cache expiry dates, putting components on the wrong server, using a URL that was moved elsewhere, etc.
An automated build process can correct a large number of programming errors. It would need a list of the application’s resources, including where they are located, whether they are served from a domain or an IP address, whether they require SSL, and how long they should be cached. Any time a resource changes, modify the list rather than modifying every web page.
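A sketch of that resource list, assuming the build process resolves every reference through a central registry (the entries and fields here are invented for illustration):

```python
# Hypothetical per-resource registry the build process consults.
# Updating one entry here fixes every page that references the resource.
REGISTRY = {
    "logo.png": {"host": "203.0.113.5", "ssl": False, "max_age": 31536000},
    "login.js": {"host": "203.0.113.5", "ssl": True,  "max_age": 3600},
}

def resolve(name: str) -> str:
    """Build the canonical URL for a resource from its registry entry."""
    entry = REGISTRY[name]
    scheme = "https" if entry["ssl"] else "http"
    return f"{scheme}://{entry['host']}/{name}"
```

Pages never hard-code a URL, an SSL choice, or a cache lifetime; they name the resource, and the build process fills in the rest consistently.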
Any code that executes while the user waits is a performance issue. In some cases, making this code execute faster is not the right approach. The right approach is to execute the code at build time, a time when the user is not waiting.
The above shows that a few of our most serious performance issues can be eliminated by moving processing to build time.