Librarians and other information managers are familiar with the Internet Archive, the open digital library and website archive of over 391 billion web pages.
The Internet Archive was developed in recognition of the Web as a record of historical and cultural significance. While content may be king, the sheer scope and volume of content now being generated inspired another archive, one dedicated to performance and other technical aspects of the web. Called the HTTP Archive, it’s a repository of web performance information such as size of pages, failed requests and underlying technologies.
The HTTP Archive is an open source project maintained by a core group of developers and contributors. Two people in the group have ties to LAC Group: John Fox, the company CMO (Chief Marketing Officer) and David Fox, who helps to ensure the performance of LAC Group’s websites. Both were involved in the HTTP Archive’s 2019 “state of the web” report called the Web Almanac.
The Web Almanac is a comprehensive report covering four categories:
- Page content
- User experience
- Content publishing
- Content distribution
While the Web Almanac is primarily the domain of developers and web architects, it’s full of helpful stats for content managers, marketers and publishers. After all, if links are broken, an important page takes too long to load or the user experience is inadequate, your content will be inaccessible or likely abandoned.
Following is a summary of the four sections and some key stats, with a link to the entire Almanac at the end.
Growing use of images and videos on web pages
Media formats like images and videos have become an important type of content and a valuable part of a user’s web experience. Today, virtually every web page depends on images and videos.
Because of file size and other technical details, media volume is a burden that can be measured in two significant ways:
- Network overhead, especially in cellular or slow network environments like coffee shops. Slow transfers of visual content give the perception of a slow web page.
- The financial costs, which are a burden to the user and can be significant for the growing number of people whose internet access is confined to mobile devices and possibly limited monthly data plans. Also consider the disparity between developing and developed countries: In Madagascar, loading a single web page at the 90th percentile equals 2.6% of the daily gross income. In comparison, it’s only 0.3% in Germany.
If your content is media-rich and you have significant traffic from mobile users and developing countries, these are factors to consider. Included in the Almanac is a fascinating website on page and site costs named What Does My Site Cost? Using mobile network data from the International Telecommunication Union (ITU) and measured in US dollars, you can enter a web page or site URL and test the result (link included at the end of this article).
The LAC Group website, lac-group.com, weighs 0.49 MB. The highest cost to view our site is in Canada, where a mobile user will “spend” six cents. The highest percentage of daily gross income is 0.32%, in Mauritania.
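The arithmetic behind a figure like “2.6% of the daily gross income” is simple: multiply the page weight by the price of mobile data, then divide by daily income. The Python sketch below illustrates the idea. The function name and the sample price and income are hypothetical placeholders, not ITU data, and this is not necessarily how What Does My Site Cost? computes its results.

```python
def page_view_cost(page_weight_mb, price_per_mb_usd, daily_income_usd):
    """Return (cost in USD, cost as a percentage of daily gross income)."""
    if daily_income_usd <= 0:
        raise ValueError("daily_income_usd must be positive")
    cost_usd = page_weight_mb * price_per_mb_usd
    share_percent = cost_usd / daily_income_usd * 100
    return cost_usd, share_percent


# Hypothetical inputs: a 0.49 MB page, data priced at $0.10 per MB,
# and a daily gross income of $20. None of these are real ITU figures.
cost, share = page_view_cost(0.49, 0.10, 20.0)
print(f"Cost per view: ${cost:.3f} ({share:.2f}% of daily income)")
```

The same page becomes several times more “expensive” in relative terms when the daily income is lower, even if the price of data stays the same.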
Growing use of third parties on websites
Third parties, another growing category, get their own chapter in this section. They can be anything from ads and analytics to marketing tools and social network integrations.
Notable third party web stats in 2019:
- Nearly 94% of desktop pages include at least one third-party resource.
- 76% of pages issue a request to an analytics domain.
- The most active 10% of pages issue well over one hundred third-party requests.
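To see where your own pages stand against these figures, you can count the requests that go to a different site than the page itself. The Python sketch below is a simplified illustration, not the Almanac's methodology. The function names and URLs are hypothetical, and the "last two hostname labels" rule is a naive assumption that breaks for suffixes such as co.uk; a real tool would consult the Public Suffix List.

```python
from urllib.parse import urlsplit


def site_of(url):
    """Naive 'site' key: the last two labels of the hostname (an assumption)."""
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def count_third_party(page_url, request_urls):
    """Count requests whose site differs from the page's own site."""
    first_party = site_of(page_url)
    return sum(1 for url in request_urls if site_of(url) != first_party)


# Hypothetical request list for an imaginary page.
requests_made = [
    "https://www.example.com/styles.css",
    "https://cdn.example.com/logo.png",
    "https://analytics.example.net/collect",
    "https://ads.example.org/banner.js",
]
third_party = count_third_party("https://www.example.com/", requests_made)
print(f"{third_party} of {len(requests_made)} requests are third-party")
```

In practice the request list could come from a browser's network panel exported as a file.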
Performance has become an important benchmark because faster page loads improve conversion rates. When performance is poor, users convert less often and have been observed to “rage click” in frustration, something all of us can likely relate to. The report covers three metrics for load time, including First Contentful Paint (FCP), which is the time users spend waiting for a page to display something useful on the screen, like an image or text.
Also covered under User Experience are chapters on security, accessibility and the mobile web. The key takeaway for security is the growing adoption of Transport Layer Security (TLS), the protocol that puts the ‘S’ in HTTPS and allows secure and private browsing. Google is highly supportive of secure sites, even those that don’t handle sensitive data or communications, and can be punitive toward unsecured sites. For site visitors, a secure site is another measure of trustworthiness.
As mentioned already, mobile access has expanded and now accounts for 59% of all searches and 58.7% of all web traffic. This transforms mobility from “good to consider” to “must consider” status for website developers and managers.
Another web growth area is online retailing. Nearly 10% of home pages in the Web Almanac were found to be on an ecommerce platform like Shopify, which delivers a set of software or services for easy creation and operation of online stores.
Content management systems (CMS) have an even larger footprint than ecommerce platforms: more than 40% of web pages are powered by a CMS platform like WordPress, Joomla or Drupal. CMS platforms simplify the process of creating, managing and publishing web content. These platforms operate in a growing “ecosystem” of components like hosting providers, extension developers and website development agencies.
The final section of the 2019 Web Almanac is dedicated to content distribution, which isn’t about feeds or social media sharing. It’s about the underlying technologies and processes that improve web performance by reducing the amount of data transmitted to clients and increasing efficiency of available bandwidth.
The chapters in this section cover compression, caching, content delivery networks (CDNs, geographically distributed servers that work together to speed up delivery), page weight, resource hints (not hints for people but programming “hints” that tell the browser which resources will be needed) and HTTP/2. Here we will summarize just two: caching and page weight.
Cache web content early and often
For caching, a technique that enables the reuse of previously downloaded content, the Almanac offers three recommendations:
- Cache as much as you can, understanding whether a response is static or dynamic. Dynamically generated content requires more careful consideration.
- Cache for as long as you can, depending on the sensitivity of the content.
- Cache as close as possible to your end users, which removes latency to reduce download times.
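In practice, these recommendations are usually expressed through the HTTP Cache-Control response header. The Python sketch below shows one way the three rules might map to header values. The function name, its parameters and the chosen lifetimes are illustrative assumptions, not settings recommended by the Almanac; only the directives themselves (no-store, no-cache, public, max-age, immutable) are standard.

```python
def cache_control_for(is_static, is_sensitive=False, fingerprinted=False):
    """Pick a Cache-Control header value for a response (illustrative policy)."""
    if is_sensitive:
        # Sensitive responses: do not store them in any cache.
        return "no-store"
    if not is_static:
        # Dynamic content: may be stored, but must be revalidated before reuse.
        return "no-cache"
    if fingerprinted:
        # Static file with a content hash in its name: safe to cache for a year.
        return "public, max-age=31536000, immutable"
    # Other static files: one day, in browsers and shared caches such as a CDN.
    return "public, max-age=86400"


print(cache_control_for(is_static=True, fingerprinted=True))
print(cache_control_for(is_static=False))
print(cache_control_for(is_static=False, is_sensitive=True))
```

The "public" directive is what lets shared caches, such as a CDN node near the user, keep a copy, which addresses the third recommendation.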
Keep web pages as slim as possible
Regarding page weight, and building on the growth of media and third parties on websites, a commonly held belief in the United States, Canada and other developed countries is that page weight no longer matters thanks to high-speed internet and sophisticated devices. The Web Almanac wants to debunk that myth. How much page weight matters depends heavily on the nature of the website and your audience, as already discussed in terms of mobile devices and people in developing countries.
The Web Almanac offers this sound advice on page weight:
“You should care about page bloat in terms of how it affects all your users, especially mobile-only users who deal with bandwidth constraints or data limits.”
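A rough calculation shows why bandwidth constraints matter. The Python sketch below adds up the resources on a page and estimates the best-case transfer time at two connection speeds. The resource sizes, the speeds and the function names are hypothetical, and the estimate ignores latency and protocol overhead, so real load times would be longer.

```python
def page_weight_bytes(resource_sizes):
    """Total page weight: the sum of all resource sizes in bytes."""
    return sum(resource_sizes.values())


def transfer_seconds(total_bytes, bandwidth_mbps):
    """Best-case transfer time, ignoring latency and protocol overhead."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth_mbps must be positive")
    return total_bytes * 8 / (bandwidth_mbps * 1_000_000)


# Hypothetical page: sizes in bytes per resource type.
resources = {"html": 30_000, "css": 70_000, "js": 400_000, "images": 1_000_000}
total = page_weight_bytes(resources)

# Hypothetical connection speeds in megabits per second.
for label, mbps in [("slow mobile", 1.5), ("fast broadband", 50)]:
    seconds = transfer_seconds(total, mbps)
    print(f"{label}: {total / 1_000_000:.1f} MB in about {seconds:.2f} s")
```

With these assumed numbers, the same 1.5 MB page takes roughly 8 seconds on the slow connection and about a quarter of a second on the fast one.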
The 2019 Web Almanac is a helpful resource and reference for anyone who creates and manages web content or the underlying technologies and processes of the websites that host it.