
Insertable streams for MediaStreamTrack

Insertable streams for MediaStreamTrack is part of the capabilities project and is currently in development. This post will be updated as the implementation progresses.

Background

In the context of the Media Capture and Streams API, the MediaStreamTrack interface represents a single media track within a stream; typically, these are audio or video tracks, but other track types may exist. MediaStream objects consist of zero or more MediaStreamTrack objects, representing various audio or video tracks. Each MediaStreamTrack may have one or more channels. A channel represents the smallest unit of a media stream, such as an audio signal associated with a given speaker, like left or right in a stereo audio track.

What is insertable streams for MediaStreamTrack?

The core idea behind insertable streams for MediaStreamTrack is to expose the content of a MediaStreamTrack as a collection of streams (as defined by the WHATWG Streams API). These streams can be manipulated to introduce new components.

Granting developers access to the video (or audio) stream directly allows them to apply modifications directly to the stream. In contrast, realizing the same video manipulation task with traditional methods requires developers to use intermediaries such as <canvas> elements. (For details of this type of process, see, for example, video + canvas = magic.)

Use cases

Use cases for insertable streams for MediaStreamTrack include, but are not limited to:

- Video conferencing gadgets like "funny hats" or virtual backgrounds.
- Voice processing like software vocoders.

Current status

Step                                        Status
1. Create explainer                         Complete
2. Create initial draft of specification    In progress
3. Gather feedback & iterate on design      In progress
4. Origin trial                             In progress
5. Launch                                   Not started

How to use insertable streams for MediaStreamTrack

Enabling support during the origin trial phase

Starting in Chrome 90, insertable streams for MediaStreamTrack is available as part of the WebCodecs origin trial in Chrome. The origin trial is expected to end in Chrome 91 (July 14, 2021). If necessary, a separate origin trial will continue for insertable streams for MediaStreamTrack.

Origin trials allow you to try new features and give feedback on their usability, practicality, and effectiveness to the web standards community. For more information, see the Origin Trials Guide for Web Developers. To sign up for this or another origin trial, visit the registration page.

Register for the origin trial

1. Request a token for your origin.
2. Add the token to your pages. There are two ways to do that:
   - Add an origin-trial <meta> tag to the head of each page. For example, this may look something like:

         <meta http-equiv="origin-trial" content="TOKEN_GOES_HERE">

   - If you can configure your server, you can also add the token using an Origin-Trial HTTP header. The resulting response header should look something like:

         Origin-Trial: TOKEN_GOES_HERE

Enabling via chrome://flags

To experiment with insertable streams for MediaStreamTrack locally, without an origin trial token, enable the #enable-experimental-web-platform-features flag in chrome://flags.

Feature detection

You can feature-detect insertable streams for MediaStreamTrack support as follows.

    if ('MediaStreamTrackProcessor' in window && 'MediaStreamTrackGenerator' in window) {
      // Insertable streams for `MediaStreamTrack` is supported.
    }

Core concepts

Insertable streams for MediaStreamTrack builds on concepts previously proposed by WebCodecs and conceptually splits the MediaStreamTrack into two components:

- The MediaStreamTrackProcessor, which consumes a MediaStreamTrack object's source and generates a stream of media frames, specifically VideoFrame or AudioFrame objects.
  You can think of this as a track sink that is capable of exposing the unencoded frames from the track as a ReadableStream. It also exposes a control channel for signals going in the opposite direction.
- The MediaStreamTrackGenerator, which consumes a stream of media frames and exposes a MediaStreamTrack interface. It can be provided to any sink, just like a track from getUserMedia(). It takes media frames as input. In addition, it provides access to control signals that are generated by the sink.

The MediaStreamTrackProcessor

A MediaStreamTrackProcessor object exposes two properties:

- readable: Allows reading the frames from the MediaStreamTrack. If the track is a video track, chunks read from readable will be VideoFrame objects. If the track is an audio track, chunks read from readable will be AudioFrame objects.
- writableControl: Allows sending control signals to the track. Control signals are objects of type MediaStreamTrackSignal.

The MediaStreamTrackGenerator

A MediaStreamTrackGenerator object likewise exposes two properties:

- writable: A WritableStream that allows writing media frames to the MediaStreamTrackGenerator, which is itself a MediaStreamTrack. If the kind attribute is "audio", the stream accepts AudioFrame objects and fails with any other type. If kind is "video", the stream accepts VideoFrame objects and fails with any other type. When a frame is written to writable, the frame's close() method is automatically invoked, so that its media resources are no longer accessible from JavaScript.
- readableControl: A ReadableStream that allows reading control signals sent from any sinks connected to the MediaStreamTrackGenerator. Control signals are objects of type MediaStreamTrackSignal.

In the MediaStream model, apart from media, which flows from sources to sinks, there are also control signals that flow in the opposite direction (i.e., from sinks to sources via the track).
A MediaStreamTrackProcessor is a sink, and it allows sending control signals to its track and source via its writableControl property. A MediaStreamTrackGenerator is a track for which a custom source can be implemented by writing media frames to its writable field. Such a source can receive control signals sent by sinks via its readableControl property.

Bringing it all together

The core idea is to create a processing chain as follows:

    Platform Track → Processor → Transform → Generator → Platform Sinks

For a barcode scanner application, this chain would look as in the code sample below.

    // getUserMedia() lives on navigator.mediaDevices.
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    const videoTrack = stream.getVideoTracks()[0];

    const trackProcessor = new MediaStreamTrackProcessor({ track: videoTrack });
    const trackGenerator = new MediaStreamTrackGenerator({ kind: 'video' });

    const transformer = new TransformStream({
      async transform(videoFrame, controller) {
        // detectBarcodes() and highlightBarcodes() are application-defined
        // helpers that are not shown here.
        const barcodes = await detectBarcodes(videoFrame);
        const newFrame = highlightBarcodes(videoFrame, barcodes);
        // Release the original frame once the new one has been created.
        videoFrame.close();
        controller.enqueue(newFrame);
      },
    });

    trackProcessor.readable.pipeThrough(transformer).pipeTo(trackGenerator.writable);
    trackGenerator.readableControl.pipeTo(trackProcessor.writableControl);

This article barely scratches the surface of what is possible, and going into the details is way beyond the scope of this publication. For more examples, see the extended video processing demo and the audio processing demo respectively. You can find the source code for both demos on GitHub.

Demo

You can see the QR code scanner demo from the section above in action on a desktop or mobile browser. Hold a QR code in front of the camera and the app will detect it and highlight it. You can see the application's source code on Glitch.

Security and Privacy considerations

The security of this API relies on existing mechanisms in the web platform.
As data is exposed using the VideoFrame and AudioFrame interfaces, the rules of those interfaces to deal with origin-tainted data apply. For example, data from cross-origin resources cannot be accessed due to existing restrictions on accessing such resources (e.g., it is not possible to access the pixels of a cross-origin image or video element). In addition, access to media data from cameras, microphones, or screens is subject to user authorization. The media data this API exposes is already available through other APIs. In addition to the media data, this API exposes some control signals such as requests for new frames. These signals are intended as hints and do not pose a significant security risk.

Feedback

The Chromium team wants to hear about your experiences with insertable streams for MediaStreamTrack.

Tell us about the API design

Is there something about the API that does not work like you expected? Or are there missing methods or properties that you need to implement your idea? Do you have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue.

Report a problem with the implementation

Did you find a bug with Chromium's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>MediaStream in the Components box. Glitch works great for sharing quick and easy repros.

Show support for the API

Are you planning to use insertable streams for MediaStreamTrack? Your public support helps the Chromium team prioritize features and shows other browser vendors how critical it is to support them.

Send a tweet to @ChromiumDev using the hashtag #InsertableStreams and let us know where and how you are using it.
Helpful links

- Spec draft
- Explainer
- ChromeStatus
- Chromium bug
- TAG review
- GitHub repo

Acknowledgements

The insertable streams for MediaStreamTrack spec was written by Harald Alvestrand and Guido Urdaneta. This article was reviewed by Harald Alvestrand, Joe Medley, Ben Wagner, Huib Kleinhout, and François Beaufort. Hero image by Chris Montgomery on Unsplash.

Using asynchronous web APIs from WebAssembly

The I/O APIs on the web are asynchronous, but they're synchronous in most system languages. When compiling code to WebAssembly, you need to bridge one kind of API to the other, and this bridge is Asyncify. In this post, you'll learn when and how to use Asyncify and how it works under the hood.

I/O in system languages

I'll start with a simple example in C. Say you want to read the user's name from a file, and greet them with a "Hello, (username)!" message:

    #include <stdio.h>

    int main() {
        FILE *stream = fopen("name.txt", "r");
        if (stream == NULL) {
            perror("name.txt");
            return 1;
        }
        char name[20+1];
        // Read at most 20 bytes, leaving room for the terminating NUL.
        size_t len = fread(name, 1, 20, stream);
        name[len] = '\0';
        fclose(stream);
        printf("Hello, %s!\n", name);
        return 0;
    }

While the example doesn't do much, it already demonstrates something you'll find in an application of any size: it reads some inputs from the external world, processes them internally and writes outputs back to the external world. All such interaction with the outside world happens via a few functions commonly called input-output functions, also shortened to I/O.

To read the name from C, you need at least two crucial I/O calls: fopen, to open the file, and fread, to read data from it. Once you retrieve the data, you can use another I/O function, printf, to print the result to the console.

Those functions look quite simple at first glance and you don't have to think twice about the machinery involved to read or write data. However, depending on the environment, there can be quite a lot going on inside:

- If the input file is located on a local drive, the application needs to perform a series of memory and disk accesses to locate the file, check permissions, open it for reading, and then read block by block until the requested number of bytes is retrieved. This can be pretty slow, depending on the speed of your disk and the requested size.
- Or, the input file might be located on a mounted network location, in which case the network stack will be involved too, increasing the complexity, latency and number of potential retries for each operation.
- Finally, even printf is not guaranteed to print things to the console and might be redirected to a file or a network location, in which case it would have to go via the same steps above.

Long story short, I/O can be slow and you can't predict how long a particular call will take by a quick glance at the code. While that operation is running, your whole application will appear frozen and unresponsive to the user.

This is not limited to C or C++ either. Most system languages present all the I/O in the form of synchronous APIs. For example, if you translate the example to Rust, the API might look simpler, but the same principles apply. You just make a call and synchronously wait for it to return the result, while it performs all the expensive operations and eventually returns the result in a single invocation:

    fn main() {
        // read_to_string() returns a Result, so unwrap it before printing.
        let s = std::fs::read_to_string("name.txt").expect("failed to read name.txt");
        println!("Hello, {}!", s);
    }

But what happens when you try to compile any of those samples to WebAssembly and translate them to the web? Or, to provide a specific example, what could a "file read" operation translate to? It would need to read data from some storage.

Asynchronous model of the web

The web has a variety of different storage options you could map to, such as in-memory storage (JS objects), localStorage, IndexedDB, server-side storage, and a new File System Access API. However, only two of those APIs (the in-memory storage and localStorage) can be used synchronously, and both are the most limiting options in what you can store and for how long. All the other options provide only asynchronous APIs.

This is one of the core properties of executing code on the web: any time-consuming operation, which includes any I/O, has to be asynchronous.
The reason is that the web is historically single-threaded, and any user code that touches the UI has to run on the same thread as the UI. It has to compete for CPU time with other important tasks like layout, rendering and event handling. You wouldn't want a piece of JavaScript or WebAssembly to be able to start a "file read" operation and block everything else (the entire tab, or, in the past, the entire browser) for anywhere from milliseconds to a few seconds, until it's over.

Instead, code is only allowed to schedule an I/O operation together with a callback to be executed once it's finished. Such callbacks are executed as part of the browser's event loop. I won't be going into details here, but if you're interested in learning how the event loop works under the hood, check out Tasks, microtasks, queues and schedules, which explains this topic in depth.

The short version is that the browser runs all the pieces of code in a sort of infinite loop, taking them from the queue one by one. When some event is triggered, the browser queues the corresponding handler, and on the next loop iteration it's taken out of the queue and executed. This mechanism allows simulating concurrency and running lots of parallel operations while using only a single thread.

The important thing to remember about this mechanism is that, while your custom JavaScript (or WebAssembly) code executes, the event loop is blocked and, while it is, there is no way to react to any external handlers, events, I/O, etc. The only way to get the I/O results back is to register a callback, finish executing your code, and give control back to the browser so that it can keep processing any pending tasks. Once I/O is finished, your handler will become one of those tasks and will get executed.
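The blocking behavior described above can be observed with a few lines of JavaScript. The following is a minimal sketch rather than part of the original samples; the `order` array and the 20 ms busy-wait are illustrative assumptions chosen only to make the ordering visible.

```javascript
// Records the order in which the pieces of code actually run.
const order = [];

// Schedule a callback with a 0 ms delay. It is queued as a task and can
// only run once the currently executing code has finished.
setTimeout(() => order.push('timer callback'), 0);

// Busy-wait for about 20 ms. The event loop is blocked during this time,
// so the timer callback cannot run even though its delay has elapsed.
const start = Date.now();
while (Date.now() - start < 20) {
  // Intentionally empty: this simulates long-running synchronous work.
}

order.push('synchronous code finished');

// At this point only the synchronous entry has been recorded; the timer
// callback is appended later, after control returns to the event loop.
console.log(order);
```

Even though the timer was requested first and with no delay, its callback has to wait until the synchronous code gives control back.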
For example, if you wanted to rewrite the samples above in modern JavaScript and decided to read a name from a remote URL, you would use the Fetch API and async-await syntax:

    async function main() {
      let response = await fetch("name.txt");
      let name = await response.text();
      console.log("Hello, %s!", name);
    }

Even though it looks synchronous, under the hood each await is essentially syntax sugar for callbacks:

    function main() {
      return fetch("name.txt")
        .then(response => response.text())
        .then(name => console.log("Hello, %s!", name));
    }

In this de-sugared example, which is a bit clearer, a request is started and responses are subscribed to with the first callback. Once the browser receives the initial response (just the HTTP headers), it asynchronously invokes this callback. The callback starts reading the body as text using response.text(), and subscribes to the result with another callback. Finally, once fetch has retrieved all the contents, it invokes the last callback, which prints "Hello, (username)!" to the console.

Thanks to the asynchronous nature of those steps, the original function can return control to the browser as soon as the I/O has been scheduled, and leave the entire UI responsive and available for other tasks, including rendering, scrolling and so on, while the I/O is executing in the background.

As a final example, even simple APIs like "sleep", which makes an application wait a specified number of seconds, are also a form of an I/O operation:

    #include <stdio.h>
    #include <unistd.h>
    // ...
    printf("A\n");
    sleep(1);
    printf("B\n");

Sure, you could translate it in a very straightforward manner that would block the current thread until the time expires:

    console.log("A");
    for (let start = Date.now(); Date.now() - start < 1000;);
    console.log("B");

In fact, that's exactly what Emscripten does in its default implementation of "sleep", but that's very inefficient, will block the entire UI and won't allow any other events to be handled meanwhile.
Generally, don't do that in production code.

Instead, a more idiomatic version of "sleep" in JavaScript would involve calling setTimeout(), and subscribing with a handler:

    console.log("A");
    setTimeout(() => {
      console.log("B");
    }, 1000);

What's common to all these examples and APIs? In each case, the idiomatic code in the original systems language uses a blocking API for the I/O, whereas an equivalent example for the web uses an asynchronous API instead. When compiling to the web, you need to somehow transform between those two execution models, and WebAssembly has no built-in ability to do so just yet.

Bridging the gap with Asyncify

This is where Asyncify comes in. Asyncify is a compile-time feature supported by Emscripten that allows pausing the entire program and asynchronously resuming it later.

Usage in C / C++ with Emscripten

If you wanted to use Asyncify to implement an asynchronous sleep for the last example, you could do it like this:

    #include <stdio.h>
    #include <emscripten.h>

    EM_JS(void, async_sleep, (int seconds), {
      Asyncify.handleSleep(wakeUp => {
        setTimeout(wakeUp, seconds * 1000);
      });
    });
    // …
    puts("A");
    async_sleep(1);
    puts("B");

EM_JS is a macro that allows defining JavaScript snippets as if they were C functions. Inside, use the function Asyncify.handleSleep(), which tells Emscripten to suspend the program and provides a wakeUp() handler that should be called once the asynchronous operation has finished. In the example above, the handler is passed to setTimeout(), but it could be used in any other context that accepts callbacks. Finally, you can call async_sleep() anywhere you want, just like regular sleep() or any other synchronous API.

When compiling such code, you need to tell Emscripten to activate the Asyncify feature. Do that by passing -s ASYNCIFY as well as -s ASYNCIFY_IMPORTS=[func1, func2] with an array-like list of functions that might be asynchronous.

    emcc -O2 \
        -s ASYNCIFY \
        -s ASYNCIFY_IMPORTS=[async_sleep] \
        ...
This lets Emscripten know that any calls to those functions might require saving and restoring the state, so the compiler will inject supporting code around such calls.

Now, when you execute this code in the browser you'll see a seamless output log like you'd expect, with B coming a short delay after A.

    A
    B

You can return values from Asyncify functions too. What you need to do is return the result of handleSleep(), and pass the result to the wakeUp() callback. For example, if, instead of reading from a file, you want to fetch a number from a remote resource, you can use a snippet like the one below to issue a request, suspend the C code, and resume once the response body is retrieved, all done seamlessly as if the call were synchronous.

    EM_JS(int, get_answer, (), {
      return Asyncify.handleSleep(wakeUp => {
        fetch("answer.txt")
          .then(response => response.text())
          .then(text => wakeUp(Number(text)));
      });
    });

    puts("Getting answer...");
    int answer = get_answer();
    printf("Answer is %d\n", answer);

In fact, for Promise-based APIs like fetch(), you can even combine Asyncify with JavaScript's async-await feature instead of using the callback-based API. For that, instead of Asyncify.handleSleep(), call Asyncify.handleAsync(). Then, instead of having to schedule a wakeUp() callback, you can pass an async JavaScript function and use await and return inside, making code look even more natural and synchronous, while not losing any of the benefits of the asynchronous I/O.

    EM_JS(int, get_answer, (), {
      return Asyncify.handleAsync(async () => {
        let response = await fetch("answer.txt");
        let text = await response.text();
        return Number(text);
      });
    });

    int answer = get_answer();

Awaiting complex values

But this example still limits you only to numbers. What if you want to implement the original example, where I tried to get a user's name from a file as a string? Well, you can do that too!
Emscripten provides a feature called Embind that allows you to handle conversions between JavaScript and C++ values. It has support for Asyncify as well, so you can call await() on external Promises and it will act just like await in async-await JavaScript code:

    #include <string>
    #include <emscripten/val.h>
    using emscripten::val;
    // …
    val fetch = val::global("fetch");
    val response = fetch(std::string("answer.txt")).await();
    val text = response.call<val>("text").await();
    auto answer = text.as<std::string>();

When using this method, you don't even need to pass ASYNCIFY_IMPORTS as a compile flag, as it's already included by default.

Okay, so this all works great in Emscripten. What about other toolchains and languages?

Usage from other languages

Say that you have a similar synchronous call somewhere in your Rust code that you want to map to an async API on the web. Turns out, you can do that too!

First, you need to define such a function as a regular import via an extern block (or your chosen language's syntax for foreign functions).

    extern "C" {
        fn get_answer() -> i32;
    }

    println!("Getting answer...");
    // Calling a foreign function requires an `unsafe` block.
    let answer = unsafe { get_answer() };
    println!("Answer is {}", answer);

And compile your code to WebAssembly:

    cargo build --target wasm32-unknown-unknown

Now you need to instrument the WebAssembly file with code for storing/restoring the stack. For C / C++, Emscripten would do this for us, but it's not used here, so the process is a bit more manual.

Luckily, the Asyncify transform itself is completely toolchain-agnostic. It can transform arbitrary WebAssembly files, no matter which compiler produced them. The transform is provided separately as part of the wasm-opt optimiser from the Binaryen toolchain and can be invoked like this:

    wasm-opt -O2 --asyncify \
        --pass-arg=asyncify-imports@env.get_answer \
        [...]

Pass --asyncify to enable the transform, and then use --pass-arg=… to provide a comma-separated list of asynchronous functions, where the program state should be suspended and later resumed.
All that's left is to provide supporting runtime code that will actually do that: suspend and resume WebAssembly code. Again, in the C / C++ case this would be included by Emscripten, but now you need custom JavaScript glue code that would handle arbitrary WebAssembly files. We've created a library just for that.

You can find it on GitHub at https://github.com/GoogleChromeLabs/asyncify or on npm under the name asyncify-wasm.

It simulates a standard WebAssembly instantiation API, but under its own namespace. The only difference is that, under a regular WebAssembly API you can only provide synchronous functions as imports, while under the Asyncify wrapper, you can provide asynchronous imports as well:

    const { instance } = await Asyncify.instantiateStreaming(fetch('app.wasm'), {
      env: {
        async get_answer() {
          let response = await fetch("answer.txt");
          let text = await response.text();
          return Number(text);
        }
      }
    });
    …
    await instance.exports.main();

Once you try to call such an asynchronous function (like get_answer() in the example above) from the WebAssembly side, the library will detect the returned Promise, suspend and save the state of the WebAssembly application, subscribe to the promise completion, and later, once it's resolved, seamlessly restore the call stack and state and continue execution as if nothing had happened.

Since any function in the module might make an asynchronous call, all the exports become potentially asynchronous too, so they get wrapped as well. You might have noticed in the example above that you need to await the result of instance.exports.main() to know when the execution is truly finished.

How does this all work under the hood?

When Asyncify detects a call to one of the ASYNCIFY_IMPORTS functions, it starts an asynchronous operation, saves the entire state of the application, including the call stack and any temporary locals, and later, when that operation is finished, restores all the memory and call stack and resumes from the same place and with the same state as if the program had never stopped.

This is quite similar to the async-await feature in JavaScript that I showed earlier, but, unlike the JavaScript one, it doesn't require any special syntax or runtime support from the language, and instead works by transforming plain synchronous functions at compile time.

When compiling the asynchronous sleep example shown earlier:

    puts("A");
    async_sleep(1);
    puts("B");

Asyncify takes this code and transforms it into roughly the following (pseudo-code, the real transformation is more involved than this):

    if (mode == NORMAL_EXECUTION) {
        puts("A");
        async_sleep(1);
        saveLocals();
        mode = UNWINDING;
        return;
    }
    if (mode == REWINDING) {
        restoreLocals();
        mode = NORMAL_EXECUTION;
    }
    puts("B");

Initially mode is set to NORMAL_EXECUTION. Correspondingly, the first time such transformed code is executed, only the part leading up to async_sleep() will get evaluated. As soon as the asynchronous operation is scheduled, Asyncify saves all the locals, and unwinds the stack by returning from each function all the way to the top, this way giving control back to the browser event loop.

Then, once async_sleep() resolves, Asyncify support code will change mode to REWINDING, and call the function again. This time, the "normal execution" branch is skipped, since it already did the job last time and I want to avoid printing "A" twice, and instead it comes straight to the "rewinding" branch. Once it's reached, it restores all the stored locals, changes mode back to "normal" and continues the execution as if the code were never stopped in the first place.
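To make the unwind and rewind steps more tangible, here is a small JavaScript simulation of the pseudo-code above. It is only a sketch under simplifying assumptions: real Asyncify rewrites WebAssembly functions and stores the saved state in linear memory, whereas this version uses plain variables. The names `program`, `asyncSleep`, `resumeProgram`, `savedLocals` and `pendingWakeUp` are hypothetical, and the asynchronous operation is completed by hand instead of by a timer so that the flow stays deterministic.

```javascript
const NORMAL_EXECUTION = 0;
const UNWINDING = 1;
const REWINDING = 2;

let mode = NORMAL_EXECUTION;
let savedLocals = null;   // stands in for the state Asyncify saves
let pendingWakeUp = null; // hypothetical: how the "async operation" resumes us
const output = [];

// Hypothetical async import: remembers how to resume the program later.
// In a browser this could be `setTimeout(resume, 1000)` instead.
function asyncSleep(resume) {
  pendingWakeUp = resume;
}

// The "transformed" program, following the structure of the pseudo-code.
function program() {
  let second; // a local that must survive the pause
  if (mode === NORMAL_EXECUTION) {
    second = 'B';
    output.push('A');
    asyncSleep(resumeProgram);
    savedLocals = { second }; // saveLocals()
    mode = UNWINDING;
    return;                   // unwind: give control back to the caller
  }
  if (mode === REWINDING) {
    ({ second } = savedLocals); // restoreLocals()
    mode = NORMAL_EXECUTION;
  }
  output.push(second);
}

// Support code: runs once the asynchronous operation has finished.
function resumeProgram() {
  mode = REWINDING;
  program(); // call the function again; it skips straight to the rewind branch
}

program();       // runs up to the async call, then unwinds
pendingWakeUp(); // simulate the asynchronous operation completing
console.log(output.join(' ')); // A B
```

Note that "A" is recorded only once: the second call to `program()` enters in rewinding mode, restores the saved local and continues after the point where the first call stopped.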
Transformation costs

Unfortunately, the Asyncify transform isn't completely free, since it has to inject quite a bit of supporting code for storing and restoring all those locals, navigating the call stack under different modes and so on. It tries to modify only functions marked as asynchronous on the command line, as well as any of their potential callers, but the code size overhead might still add up to approximately 50% before compression.

This isn't ideal, but in many cases acceptable when the alternative is not having the functionality altogether or having to make significant rewrites to the original code.

Make sure to always enable optimizations for the final builds to avoid it going even higher. You can also check Asyncify-specific optimization options to reduce the overhead by limiting transforms only to specified functions and/or only direct function calls. There is also a minor cost to runtime performance, but it's limited to the async calls themselves. However, compared to the cost of the actual work, it's usually negligible.

Real-world demos

Now that you've looked at the simple examples, I'll move on to more complicated scenarios.

As mentioned at the beginning of the article, one of the storage options on the web is an asynchronous File System Access API. It provides access to a real host filesystem from a web application.

On the other hand, there is a de-facto standard called WASI for WebAssembly I/O in the console and on the server side. It was designed as a compilation target for system languages, and exposes all sorts of file system and other operations in a traditional synchronous form.

What if you could map one to another? Then you could compile any application in any source language with any toolchain supporting the WASI target, and run it in a sandbox on the web, while still allowing it to operate on real user files! With Asyncify, you can do just that.
In this demo, I've compiled the Rust coreutils crate with a few minor patches to WASI, passed it through the Asyncify transform and implemented asynchronous bindings from WASI to the File System Access API on the JavaScript side. Once combined with the Xterm.js terminal component, this provides a realistic shell running in the browser tab and operating on real user files, just like an actual terminal.

Check it out live at https://wasi.rreverser.com/.

Asyncify use cases are not limited just to timers and filesystems, either. You can go further and use more niche APIs on the web.

For example, also with the help of Asyncify, it's possible to map libusb (probably the most popular native library for working with USB devices) to the WebUSB API, which gives asynchronous access to such devices on the web. Once mapped and compiled, I got standard libusb tests and examples to run against chosen devices right in the sandbox of a web page.

It's probably a story for another blog post though.

Those examples demonstrate just how powerful Asyncify can be for bridging the gap and porting all sorts of applications to the web, allowing you to gain cross-platform access, sandboxing, and better security, all without losing functionality.

Customize the window controls overlay of your PWA's title bar

If you remember my article Make your PWA feel more like an app, you may recall how I mentioned customizing the title bar of your app as a strategy for creating a more app-like experience. Here is an example of what this can look like, showing the macOS Podcasts app.

Now you may be tempted to object by saying that Podcasts is a platform-specific macOS app that does not run in a browser and therefore can do what it wants without having to play by the browser's rules. True, but the good news is that the Window Controls Overlay feature, which is the topic of this very article, will soon let you create similar user interfaces for your PWA.

Window Controls Overlay components

Window Controls Overlay consists of four sub-features:

- The "window-controls-overlay" value for the "display_override" field in the web app manifest.
- The CSS environment variables titlebar-area-x, titlebar-area-y, titlebar-area-width, and titlebar-area-height.
- The standardization of the previously proprietary CSS property -webkit-app-region as the app-region property to define draggable regions in web content.
- A mechanism to query for and work around the window controls region via the windowControlsOverlay member of window.navigator.

What is Window Controls Overlay

The title bar area refers to the space to the left or right of the window controls (that is, the buttons to minimize, maximize, close, etc.) and often contains the title of the application. Window Controls Overlay lets progressive web applications (PWAs) provide a more app-like feel by swapping the existing full-width title bar for a small overlay containing the window controls. This allows developers to place custom content in what was previously the browser-controlled title bar area.

Current status

Step                                        Status
1. Create explainer                         Complete
2. Create initial draft of specification    Not started
3. Gather feedback & iterate on design      In progress
4. Origin trial                             Not started
5. Launch                                   Not started

Enabling via chrome://flags

To experiment with Window Controls Overlay locally, without an origin trial token, enable the #enable-desktop-pwas-window-controls-overlay flag in chrome://flags.

Enabling support during the origin trial phase

Starting in Chrome 92, Window Controls Overlay will be available as an origin trial in Chrome. The origin trial is expected to end in Chrome 94 (expected in July 2021).

Origin trials allow you to try new features and give feedback on their usability, practicality, and effectiveness to the web standards community. For more information, see the Origin Trials Guide for Web Developers. To sign up for this or another origin trial, visit the registration page.

Register for the origin trial

1. Request a token for your origin.
2. Add the token to your pages. There are two ways to do that:
   - Add an origin-trial <meta> tag to the head of each page. For example, this may look something like:

         <meta http-equiv="origin-trial" content="TOKEN_GOES_HERE">

   - If you can configure your server, you can also add the token using an Origin-Trial HTTP header. The resulting response header should look something like:

         Origin-Trial: TOKEN_GOES_HERE

How to use Window Controls Overlay

Adding window-controls-overlay to the Web App Manifest

A progressive web app can opt in to the window controls overlay by adding "window-controls-overlay" as the primary "display_override" member in the web app manifest:

    {
      "display_override": ["window-controls-overlay"]
    }

The window controls overlay will be visible only when all of the following conditions are satisfied:

- The app is not opened in the browser, but in a separate PWA window.
- The manifest includes "display_override": ["window-controls-overlay"]. (Other values are allowed thereafter.)
- The PWA is running on a desktop operating system.
- The current origin matches the origin for which the PWA was installed.
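Whether all of these conditions currently hold can also be checked from script. The helper below is a minimal sketch, assuming the `visible` property that the Window Controls Overlay explainer describes on `navigator.windowControlsOverlay`; the function name `isOverlayVisible` and its `nav` parameter are hypothetical and only there so the logic can be exercised outside a PWA window.

```javascript
// Returns true only if the Window Controls Overlay API exists and reports
// that the overlay is currently shown.
// `nav` defaults to the global navigator but can be replaced for testing.
function isOverlayVisible(nav = globalThis.navigator) {
  // The API is missing in unsupported browsers or when the feature is off.
  if (!nav || !('windowControlsOverlay' in nav)) {
    return false;
  }
  // Assumption: `visible` is a boolean exposed by the overlay object.
  return nav.windowControlsOverlay.visible === true;
}

// Example with stand-in objects instead of a real PWA window:
console.log(isOverlayVisible({ windowControlsOverlay: { visible: true } })); // true
console.log(isOverlayVisible({})); // false
```

In a regular browser tab, or on a platform without the feature, the helper simply returns false, so it can guard any title bar specific layout code.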
The result of this is an empty title bar area with the regular window controls on the left or the right, depending on the operating system. Moving content into the title bar # Now that there is space in the title bar, you can move something there. For this article, I have built a Chuck Norris jokes PWA. A useful feature for this app may be a search for words in jokes. Fun fact: Chuck Norris has installed this PWA on his iPhone and has let me know he loves the push notifications he receives whenever a new joke is submitted. The HTML for the search feature looks like this: <div class="search"> <img src="chuck-norris.png" alt="Chuck Norris" width="32" height="32" /> <label> <input type="search" /> Search words in jokes </label> </div> To move this div up into the title bar, some CSS is needed: .search { /* Make sure the `div` stays there, even when scrolling. */ position: fixed; /** * Gradient, because why not. Endless opportunities. * The gradient ends in maroon, which happens to be the app's * `<meta name="theme-color" content="maroon">`. */ background-image: linear-gradient(90deg, #131313, 33%, maroon); /* Use the environment variable for the left anchoring with a fallback. */ left: env(titlebar-area-x, 0); /* Use the environment variable for the top anchoring with a fallback. */ top: env(titlebar-area-y, 0); /* Use the environment variable for setting the width with a fallback. */ width: env(titlebar-area-width, 100%); /* Use the environment variable for setting the height with a fallback. */ height: env(titlebar-area-height, 33px); } You can see the effect of this code in the screenshot below. The title bar is fully responsive. When you resize the PWA window, the title bar reacts as if it were composed of regular HTML content, which, in fact, it is. Determining which parts of the title bar are draggable # While the screenshot above suggests that you are done, you are not done quite yet. 
The PWA window is no longer draggable (apart from a very small area), since the window controls buttons are no drag areas, and the rest of the title bar consists of the search widget. This can be fixed by leveraging the app-region CSS property with a value of drag. In the concrete case, it is fine to make everything besides the input element draggable. /* The entire search `div` is draggable… */ .search { -webkit-app-region: drag; app-region: drag; } /* …except for the `input`. */ input { -webkit-app-region: no-drag; app-region: no-drag; } For now, app-region has not been standardized yet, so the plan is to continue using the prefixed -webkit-app-region until app-region is standardized. Currently, only -webkit-app-region is supported in the browser. With this CSS in place, the user can drag the app window as usual by dragging the div, the img, or the label. Only the input element is interactive so the search query can be entered. Feature detection # Support for Window Controls Overlay can be detected by testing for the existence of windowControlsOverlay: if ('windowControlsOverlay' in navigator) { // Window Controls Overlay is supported. } Querying the window controls region with windowControlsOverlay # The code so far has only one problem: on some platforms the window controls are on the right, on others they are on the left. To make matters worse, the "three dots" Chrome menu will change position, too, based on the platform. This means that the linear gradient background image needs to be dynamically adapted to run from #131313→maroon or maroon→#131313→maroon, so that it blends in with the title bar's maroon background color that is determined by <meta name="theme-color" content="maroon">. This can be achieved by querying the getBoundingClientRect() API on the navigator.windowControlsOverlay property. 
if ('windowControlsOverlay' in navigator) { const { x } = navigator.windowControlsOverlay.getBoundingClientRect(); // Window controls are on the right (like on Windows). // Chrome menu is left of the window controls. // [ windowControlsOverlay___________________ […] [_] [■] [X] ] if (x === 0) { div.classList.add('search-controls-right'); } // Window controls are on the left (like on macOS). // Chrome menu is right of the window controls overlay. // [ [X] [_] [■] ___________________windowControlsOverlay [⋮] ] else { div.classList.add('search-controls-left'); } } else { // When running in a non-supporting browser tab. div.classList.add('search-controls-right'); } Rather than having the background image in the .search class CSS rules directly (as before), the modified code now uses two classes that the code above sets dynamically. /* For macOS: */ .search-controls-left { background-image: linear-gradient(90deg, maroon, 45%, #131313, 90%, maroon); } /* For Windows: */ .search-controls-right { background-image: linear-gradient(90deg, #131313, 33%, maroon); } Determining if the window controls overlay is visible # The window controls overlay will not be visible in the title bar area in all circumstances. While it will naturally not be there on browsers that do not support the Window Controls Overlay feature, it will also not be there when the PWA in question runs in a tab. To detect this situation, you can query the visible property of the windowControlsOverlay: if (navigator.windowControlsOverlay.visible) { // The window controls overlay is visible in the title bar area. } The window controls overlay visibility is not to be confused with the visibility in the CSS sense of whatever HTML content you place next to the window controls overlay. Even if you set display: none on the div placed into the window controls overlay, the visible property of the window controls overlay would still report true. 
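Note that reading navigator.windowControlsOverlay.visible directly throws in browsers that do not expose windowControlsOverlay at all. The following is a minimal sketch that combines the feature detection shown earlier with the visible check. The helper name isOverlayVisible and its nav parameter are assumptions made for illustration; they are not part of the API.

```javascript
// Hypothetical helper (not part of the Window Controls Overlay API).
// Returns true only when the API exists AND the overlay is currently shown.
// `nav` defaults to the global `navigator`; passing it in makes the
// function easy to exercise outside of a browser.
function isOverlayVisible(nav = globalThis.navigator) {
  return Boolean(
    nav &&
      'windowControlsOverlay' in nav &&
      nav.windowControlsOverlay.visible,
  );
}

// Example usage in a page (left as a comment because it needs a DOM):
// document.querySelector('.search').hidden = !isOverlayVisible();
```

Funnelling both checks through one function means calling code never has to remember to feature-detect before touching visible.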
Being notified of geometry changes # Querying the window controls overlay area with getBoundingClientRect() can suffice for one-off things like setting the correct background image based on where the window controls are, but in other cases, more fine-grained control is necessary. For example, a possible use case could be to adapt the window controls overlay based on the available space and to add a joke right in the window control overlay when there is enough space. You can be notified of geometry changes by subscribing to navigator.windowControlsOverlay.ongeometrychange or by setting up an event listener for the geometrychange event. This event will only fire when the window controls overlay is visible, that is, when navigator.windowControlsOverlay.visible is true. Since this event fires frequently (comparable to how a scroll listener fires), I always recommend you use a debounce function so the event does not fire too often. const debounce = (func, wait) => { let timeout; return function executedFunction(...args) { const later = () => { clearTimeout(timeout); func(...args); }; clearTimeout(timeout); timeout = setTimeout(later, wait); }; }; if ('windowControlsOverlay' in navigator) { navigator.windowControlsOverlay.ongeometrychange = debounce((e) => { span.hidden = e.boundingRect.width < 800; }, 250); } Rather than assigning a function to ongeometrychange, you can also add an event listener to windowControlsOverlay like below. You can read up on the difference between the two on MDN. navigator.windowControlsOverlay.addEventListener( 'geometrychange', debounce((e) => { span.hidden = e.boundingRect.width < 800; }, 250), ); Compatibility when running in a tab and on non-supporting browsers # There are two possible cases to consider: The case where an app is running in a browser that does support Window Controls Overlay, but where the app is used in a browser tab. The case where an app is running in a browser that does not support Window Controls Overlay. 
In both cases, by default the HTML the developer has determined to be placed in the window controls overlay will display inline like regular HTML content and the env() variables' fallback values will kick in for the positioning. On supporting browsers, you can also decide to not display the HTML designated for the window controls overlay by checking the overlay's visible property, and if it reports false, then hiding that HTML content. As a reminder, non-supporting browsers will either not consider the "display_override" web app manifest property at all, or not recognize the "window-controls-overlay" and thus use the next possible value according to the fallback chain, for example, "standalone". Demo # I have created a demo that you can play with in different supporting and non-supporting browsers and in the installed and non-installed state. For the actual Window Controls Overlay experience, you need to install the app and set the flag. You can see two screenshots of what to expect below. The source code of the app is available on Glitch. The search feature in the window controls overlay is fully functional: Security considerations # The Chromium team has designed and implemented the Window Controls Overlay API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics. Spoofing # Giving sites partial control of the title bar leaves room for developers to spoof content in what was previously a trusted, browser-controlled region. Currently, in Chromium browsers, standalone mode includes a title bar which on initial launch displays the title of the webpage on the left, and the origin of the page on the right (followed by the "settings and more" button and the window controls). After a few seconds, the origin text disappears. If the browser is set to a right-to-left (RTL) language, this layout is flipped such that the origin text is on the left. 
This opens the window controls overlay to spoof the origin if there is insufficient padding between the origin and the right edge of the overlay. For example, the origin "evil.ltd" could be appended with a trusted site "google.com", leading users to believe that the source is trustworthy. The plan is to keep this origin text so that users know what the origin of the app is and can ensure that it matches their expectations. For RTL configured browsers, there must be enough padding to the right of the origin text to prevent a malicious website from appending the unsafe origin with a trusted origin. Fingerprinting # Enabling the window controls overlay and draggable regions do not pose considerable privacy concerns other than feature detection. However, due to differing sizes and positions of the window controls buttons across operating systems, the JavaScript API for navigator.windowControlsOverlay.getBoundingClientRect() will return a DOMRect whose position and dimensions will reveal information about the operating system upon which the browser is running. Currently, developers can already discover the OS from the user agent string, but due to fingerprinting concerns, there is discussion about freezing the UA string and unifying OS versions. There is an ongoing effort with the community to understand how frequently the size of the window controls overlay changes across platforms, as the current assumption is that these are fairly stable across OS versions and thus would not be useful for observing minor OS versions. Although this is a potential fingerprinting issue, it only applies to installed PWAs that use the custom title bar feature and does not apply to general browser usage. Additionally, the navigator.windowControlsOverlay API will not be available to iframes embedded inside of a PWA. 
Navigation # Navigating to a different origin within the PWA will cause it to fall back to the normal standalone title bar, even if it meets the above criteria and is launched with the window controls overlay. This is to accommodate the black bar that appears on navigation to a different origin. After navigating back to the original origin, the window controls overlay will be used again. Feedback # The Chromium team wants to hear about your experiences with the Window Controls Overlay API. Tell us about the API design # Is there something about the API that doesn't work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue. Report a problem with the implementation # Did you find a bug with Chromium's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter UI>Browser>WebAppInstalls in the Components box. Glitch works great for sharing quick and easy repros. Show support for the API # Are you planning to use the Window Controls Overlay API? Your public support helps the Chromium team to prioritize features and shows other browser vendors how critical it is to support them. Send a Tweet to @ChromiumDev with the #WindowControlsOverlay hashtag and let us know where and how you're using it. Helpful links # Explainer Chromium bug Chrome Platform Status entry TAG review Microsoft Edge's related docs Acknowledgements # Window Controls Overlay was implemented and specified by Amanda Baker from the Microsoft Edge team. This article was reviewed by Joe Medley and Kenneth Rohde Christiansen. Hero image by Sigmund on Unsplash.

Keeping third-party scripts under control

Third-party scripts, or "tags", can be a source of performance problems on your site, and therefore a target for optimization. However, before you start optimizing the tags you have added, make sure that you are not optimizing tags you don't even need. This article shows you how to assess requests for new tags, and manage and review existing ones. When discussing third-party tags, the conversation often quickly moves to performance problems, losing sight of what the "core" role of these tags is. They provide a wide range of useful functionality, making the web more dynamic, interactive, and interconnected. However, third-party tags can be added by different teams across the organization and are often forgotten about over time. People move on, contracts expire, or the results are delivered, but the teams never get back in touch to have the scripts removed. In the article Improving Third-Party Web Performance, the web team at The Telegraph removed old tags where they could not identify the requester, deciding that if the tags were missed the responsible party would get in touch. However, no one ever did. Before you start to think about third-party tag script execution, or which tags can be deferred, lazy-loaded or preconnected from a technical lens, there's an opportunity to govern which tags are added to a site/page from an organizational point of view. A common theme with websites that are slowed down by vast numbers of third-party tags is that this part of the website is not owned by a single person or team, and therefore falls between the cracks. There's nothing more frustrating than optimizing your website, being happy with the performance in a staging environment, only for the speed to regress in production because of tags that are being added. Implementing a "vetting process" for third-party tags can help prevent this, by building a workflow that creates cross-functional accountability and responsibility for these tags.
The manner in which you vet third-party tags depends solely on the organization, its structure and its current processes. It could be as basic as having a single team that governs tags and acts as the gatekeeper for analysing them before they are added. Or it could be more advanced and formal, for example by providing a form that teams use to submit requests for a tag. This might ask for context in terms of why it needs to be on the website, for how long it should be present, and what benefit it would bring to the business. Tag governance process # However you choose to vet tags within your organization, the following stages should be considered as part of the lifecycle of a tag. Compliance # Before any tag is added onto a page, check that it has been thoroughly vetted by a legal team to ensure it passes all compliance requirements for it to be present. This might include checking that the tag is compliant with the EU General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). This is critical: if there is any doubt at this step, it needs to be addressed before assessing the tag from a performance point of view. Required # The second step is to question whether a specific tag is needed on the page. Consider the following discussion points: Is the tag actively being used? If not, can it be removed? If the tag is loading sitewide, is this necessary? For example, if you're analysing an A/B testing suite and are currently only testing on landing pages, can you drop the tag only on this page type? Can you add further logic to this and detect whether there is a live A/B test? If so, allow the tag to be added, but if not, ensure that it is not present. Ownership # Having a clear person or team as the owner of a tag helps to proactively keep track of tags. Usually this would be whoever added the tag. Having an assignee next to the tag ensures that future reviews and audits can revisit whether the tag is still required.
Purpose # The fourth step begins to create cross-functional accountability and responsibility by ensuring people understand why the tag is added to the page. It's important for there to be a cross-functional understanding of what each tag is bringing to the website, and why it is being used. For example, if the tag is recording user session actions to allow personalization, do all teams know why this should be present? Furthermore, have there been any commercial vs performance trade-off discussions? If there is a tag that is deemed "required" because it brings in revenue, has there been an analysis of the potential revenue lost through speed regression? Review # The fifth, final and arguably most important step is to ensure tags are being reviewed on a regular basis. How often (e.g. weekly, monthly, quarterly) should depend on the size of the website, the number of tags that are on the site, and their turnaround time. This should be treated in the same manner as optimizing other website assets (JS, CSS, images, etc.) and proactively checked on a regular basis. Failure to review could lead to a "bloated" tag manager, which slows down the pages. It can be a complex task to get back to being performant without regressing the required functionality on the page. The vetting process should leave you with a final list of tags which are classified as needed for a specific page. At this stage, you can then delve into technical optimization approaches. This also opens up the opportunity to define the number of tags in this final list within a performance budget, which can be monitored within Lighthouse CI and incorporated into performance-specific goal setting. For example: If we stick to <5 tags on our Landing Pages along with our own optimized JS, we're confident the Total Blocking Time (TBT) can hit 'good' in the Core Web Vitals.
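The governance stages above (required, ownership, review, and a tag budget per page type) can be kept honest with a small script that runs against whatever tag inventory the vetting process produces. The following is only a sketch under stated assumptions: the inventory shape, the field names (name, owner, pageTypes, reviewBy), and the auditTags function are hypothetical and not taken from Lighthouse CI or any tag manager.

```javascript
// Hypothetical tag inventory. All field names are illustrative assumptions.
const tagInventory = [
  { name: 'analytics', owner: 'data-team', pageTypes: ['landing', 'product'], reviewBy: '2021-09-01' },
  { name: 'ab-testing', owner: 'growth-team', pageTypes: ['landing'], reviewBy: '2021-06-01' },
  { name: 'legacy-pixel', owner: null, pageTypes: ['landing'], reviewBy: '2021-12-01' },
];

// Audits the tags present on one page type against a budget.
// `maxTags` is exclusive, mirroring the "<5 tags" example in the text.
// `today` is an ISO date string (YYYY-MM-DD), so plain string comparison
// orders dates correctly.
function auditTags(inventory, pageType, maxTags, today) {
  const onPage = inventory.filter((tag) => tag.pageTypes.includes(pageType));
  return {
    count: onPage.length,
    withinBudget: onPage.length < maxTags,
    missingOwner: onPage.filter((tag) => !tag.owner).map((tag) => tag.name),
    reviewOverdue: onPage
      .filter((tag) => tag.reviewBy < today)
      .map((tag) => tag.name),
  };
}

const report = auditTags(tagInventory, 'landing', 5, '2021-07-01');
console.log(report);
```

A report like this could be checked in a scheduled job or CI step, so that tags without an owner or with an overdue review are raised before they are forgotten.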

Fill OTP forms within cross-origin iframes with WebOTP API

SMS OTPs (one-time passwords) are commonly used to verify phone numbers, for example as a second step in authentication, or to verify payments on the web. However, switching between the browser and the SMS app to copy-paste or manually enter the OTP makes it easy to make mistakes and adds friction to the user experience. The WebOTP API gives websites the ability to programmatically obtain the one-time password from an SMS message and enter it automatically in the form for the user with just one tap, without switching apps. The SMS is specially formatted and bound to the origin, which also reduces the chances of phishing websites stealing the OTP. One use case that had not yet been supported in WebOTP was targeting an origin inside an iframe. This is typically used for payment confirmation, especially with 3D Secure. With a common format that supports cross-origin iframes in place, the WebOTP API now delivers OTPs bound to nested origins starting in Chrome 91. How WebOTP API works # WebOTP API itself is simple enough: … const otp = await navigator.credentials.get({ otp: { transport:['sms'] } }); … The SMS message must follow the origin-bound one-time codes format. Your OTP is: 123456. @web-otp.glitch.me #123456 Notice that the last line contains the origin to be bound to, preceded by @, followed by the OTP, preceded by #. When the text message arrives, an info bar pops up and prompts the user to verify their phone number. After the user clicks the Verify button, the browser automatically forwards the OTP to the site and resolves the navigator.credentials.get() call. The website can then extract the OTP and complete the verification process. Learn the basics of using WebOTP at Verify phone numbers on the web with the WebOTP API. Cross-origin iframes use cases # Entering an OTP in a form within a cross-origin iframe is common in payment scenarios. Some credit card issuers require an additional verification step to check the payer's authenticity.
This is called 3D Secure and the form is typically exposed within an iframe on the same page, as if it were a part of the payment flow. For example: A user visits shop.example to purchase a pair of shoes with a credit card. After the user enters the credit card number, the integrated payment provider shows a form from bank.example within an iframe asking the user to verify their phone number for fast checkout. bank.example sends an SMS that contains an OTP to the user so that they can enter it to verify their identity. How to use WebOTP API from a cross-origin iframe # To use WebOTP API from within a cross-origin iframe, you need to do two things: Annotate both the top-frame origin and the iframe origin in the SMS text message. Configure permissions policy to allow the cross-origin iframe to receive the OTP from the user directly. WebOTP API within an iframe in action. You can try the demo yourself at https://web-otp-iframe-demo.stackblitz.io. Annotate bound-origins to the SMS text message # When WebOTP API is called from within an iframe, the SMS text message must include the top-frame origin preceded by @, followed by the OTP preceded by #, followed by the iframe origin preceded by @. @shop.example #123456 @bank.example Configure Permissions Policy # To use WebOTP in a cross-origin iframe, the embedder must grant access to this API via the otp-credentials permissions policy to avoid unintended behavior. In general there are two ways to achieve this goal: via HTTP Header: Permissions-Policy: otp-credentials=(self "https://bank.example") via iframe allow attribute: <iframe src="https://bank.example/…" allow="otp-credentials"></iframe> See more examples on how to specify a permissions policy. Caveats # Nesting levels # At the moment Chrome only supports WebOTP API calls from cross-origin iframes that have no more than one unique origin in their ancestor chain.
In the following scenarios: a.com -> b.com a.com -> b.com -> b.com a.com -> a.com -> b.com a.com -> b.com -> c.com using WebOTP in b.com is supported but using it in c.com is not. Note that the following scenario is also not supported because of lack of demand and UX complexities. a.com -> b.com -> a.com (calls WebOTP API) Interoperability # While browser engines other than Chromium do not implement the WebOTP API, Safari shares the same SMS format with its input[autocomplete="one-time-code"] support. In Safari, as soon as an SMS that contains an origin-bound one-time code format arrives with the matched origin, the keyboard suggests to enter the OTP to the input field. As of April 2021, Safari supports iframe with a unique SMS format using %. However, as the spec discussion concluded to go with @ instead, we hope the implementation of supported SMS format will converge. Feedback # Your feedback is invaluable in making WebOTP API better, so go on and try it out and let us know what you think. Resources # Verify phone numbers on the web with the Web OTP API SMS OTP form best practices WebOTP API Origin-bound one-time codes delivered via SMS Photo by rupixen.com on Unsplash
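As a recap of the SMS format described under "Annotate bound-origins to the SMS text message", here is a minimal sketch of a server-side helper that builds the message text. The function name formatOtpSms and its parameter names (otp, topHost, iframeHost, text) are assumptions made for illustration; only the layout of the last line comes from the article.

```javascript
// Hypothetical helper that builds an origin-bound OTP message.
// - Without `iframeHost`, the last line is:  @<topHost> #<otp>
// - With `iframeHost`, the last line is:     @<topHost> #<otp> @<iframeHost>
// `text` is the free-form, human-readable part of the message.
function formatOtpSms({ otp, topHost, iframeHost, text = `Your OTP is: ${otp}.` }) {
  const lastLine = iframeHost
    ? `@${topHost} #${otp} @${iframeHost}`
    : `@${topHost} #${otp}`;
  return `${text}\n\n${lastLine}`;
}

// Message for the shop.example / bank.example scenario.
console.log(
  formatOtpSms({ otp: '123456', topHost: 'shop.example', iframeHost: 'bank.example' }),
);
```

Generating the human-readable code and the machine-readable code from the same otp value avoids the two drifting apart.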

Breaking down barriers using the DataTransfer API

You might have heard about the DataTransfer API before, which is part of the HTML5 Drag and Drop API and Clipboard events. It can be used to transfer data between source and receiving targets. This API is ready to use in all modern desktop browsers. The drag-drop and copy-paste interactions are often used for interactions within a page, transferring simple text from A to B. But what is oftentimes overlooked is the ability to use these same interactions to go beyond the browser window. Both the browser's built-in drag-and-drop and the copy-paste interactions can communicate with other (web) applications, not tied to any origin. The API has support for providing multiple data entries that can have different behaviors based on where data is transferred to. Your web application can send and receive the transferred data when listening to incoming events. This capability can change the way we think about sharing and interoperability in web applications on desktop. Transferring data between applications doesn't need to rely on tightly coupled integrations anymore. Instead you can give the user full control to transfer their data to wherever they would like. An example of interactions that are possible with the DataTransfer API. Transferring data # To get started with transferring data, you'll need to implement drag-drop or copy-paste. The examples below show drag-drop interactions, but the process for copy-paste is similar. If you are unfamiliar with the Drag and Drop API, there's a great article explaining HTML5 Drag and Drop that details the ins and outs. By providing MIME-type keyed data, you are able to freely interact with external applications. Most WYSIWYG editors, text editors, and browsers respond to the "primitive" MIME types used in the example below.
document.querySelector('#dragSource').addEventListener('dragstart', (event) => { event.dataTransfer.setData('text/plain', 'Foo bar'); event.dataTransfer.setData('text/html', '<h1>Foo bar</h1>'); event.dataTransfer.setData('text/uri-list', 'https://example.com'); }); Receiving the data transfer works almost the same as providing it. Listen to the receiving events (drop, or paste) and read the keys. When dragging over an element, the browser only has access to the type keys of the data. The data itself can only be accessed after a drop. document.querySelector('#dropTarget').addEventListener('dragover', (event) => { console.log(event.dataTransfer.types); // Accept the drag-drop transfer. event.preventDefault(); }); document.querySelector('#dropTarget').addEventListener('drop', (event) => { // Log all the transferred data items to the console. for (let type of event.dataTransfer.types) { console.log({ type, data: event.dataTransfer.getData(type) }); } event.preventDefault(); }); Three MIME-types are widely supported across applications: text/html: Renders the HTML payload in contentEditable elements and rich text (WYSIWYG) editors like Google Docs, Microsoft Word, and others. text/plain: Sets the value of input elements, content of code editors, and the fallback from text/html. text/uri-list: Navigates to the URL when dropping on the URL bar or browser page. A URL shortcut will be created when dropping on a directory or the desktop. The widespread adoption of text/html by WYSIWYG editors makes it very useful. Like in HTML documents, you can embed resources by using Data URLs or publicly accessible URLs. This works well with exporting visuals (for example from a canvas) to editors like Google Docs. const redPixel = ''; const html = '<img src="' + redPixel + '" width="100" height="100" alt="" />'; event.dataTransfer.setData('text/html', html); Transfer using copy and paste # For posterity, using the DataTransfer API with copy-paste interactions looks like the following. 
Notice that the dataTransfer property is named clipboardData for clipboard events. // Listen to copy-paste events on the document. document.addEventListener('copy', (event) => { const copySource = document.querySelector('#copySource'); // Only copy when the `activeElement` (i.e., focused element) is, // or is within, the `copySource` element. if (copySource.contains(document.activeElement)) { event.clipboardData.setData('text/plain', 'Foo bar'); event.preventDefault(); } }); document.addEventListener('paste', (event) => { const pasteTarget = document.querySelector('#pasteTarget'); if (pasteTarget.contains(document.activeElement)) { const data = event.clipboardData.getData('text/plain'); console.log(data); } }); Custom data formats # You are not limited to the primitive MIME types, but can use any key to identify the transferred data. This can be useful for cross-browser interactions within your application. As shown below, you can transfer more complex data using the JSON.stringify() and JSON.parse() functions. document.querySelector('#dragSource').addEventListener('dragstart', (event) => { const data = { foo: 'bar' }; event.dataTransfer.setData('my-custom-type', JSON.stringify(data)); }); document.querySelector('#dropTarget').addEventListener('dragover', (event) => { // Only allow dropping when our custom data is available. if (event.dataTransfer.types.includes('my-custom-type')) { event.preventDefault(); } }); document.querySelector('#dropTarget').addEventListener('drop', (event) => { if (event.dataTransfer.types.includes('my-custom-type')) { event.preventDefault(); const dataString = event.dataTransfer.getData('my-custom-type'); const data = JSON.parse(dataString); console.log(data); } }); Connecting the web # While custom formats are great for communication between applications you have in your control, it also limits the user when transferring data to applications that aren't using your format. 
If you want to connect with third-party applications across the web, you need a universal data format. The JSON-LD (Linked Data) standard is a great candidate for this. It is lightweight and easy to read from and write to in JavaScript. Schema.org contains many predefined types that can be used, and custom schema definitions are an option as well. const data = { '@context': 'https://schema.org', '@type': 'ImageObject', contentLocation: 'Venice, Italy', contentUrl: 'venice.jpg', datePublished: '2010-08-08', description: 'I took this picture during our honey moon.', name: 'Canal in Venice', }; event.dataTransfer.setData('application/ld+json', JSON.stringify(data)); When using the Schema.org types, you can start with the generic Thing type, or use something closer to your use case like Event, Person, MediaObject, Place, or even highly-specific types like MedicalEntity if need be. When you use TypeScript, you can use the interface definitions from the schema-dts type definitions. By transmitting and receiving JSON-LD data, you will support a more connected and open web. With applications speaking the same language, you can create deep integrations with external applications. There's no need for complicated API integrations; all the information that's needed is included in the transferred data. Think of all the possibilities for transferring data between any (web) application with no restrictions: sharing events from a calendar to your favorite ToDo app, attaching virtual files to emails, sharing contacts. That would be great, right? This starts with you! 🙌 Concerns # While the DataTransfer API is available today, there are some things to be aware of before integrating. Browser compatibility # Desktop browsers all have great support for the technique described above, while mobile devices do not. 
The technique has been tested on all major browsers (Chrome, Edge, Firefox, Safari) and operating systems (Android, Chrome OS, iOS, macOS, Ubuntu Linux, and Windows), but unfortunately Android and iOS didn't pass the test. While browsers continue to develop, for now the technique is limited to desktop browsers only. Discoverability # Drag-drop and copy-paste are system-level interactions when working on a desktop computer, with roots back to the first GUIs about 40 years ago. Think for example about how many times you have used these interactions for organizing files. On the web, this is not very common yet. You will need to educate users about this new interaction, and come up with UX patterns to make this recognizable, especially for people whose experience with computers so far has been confined to mobile devices. Accessibility # Drag-drop is not a very accessible interaction, but the DataTransfer API works with copy-paste, too. Make sure you listen to copy-paste events! It doesn't take much extra work, and your users will be grateful to you for adding it. Security and privacy # There are some security and privacy considerations you should be aware of when using the technique. Clipboard data is available to other applications on the user's device. Web applications you are dragging over have access to the type keys, not the data. The data only becomes available on drop or paste. The received data should be treated like any other user input; sanitize and validate before using. Getting started with the Transmat helper library # Are you excited about using the DataTransfer API in your application? Consider taking a look at the Transmat library on GitHub. This open-source library aligns browser differences, provides JSON-LD utilities, contains an observer to respond to transfer events for highlighting drop-areas, and lets you integrate the data transfer operations among existing drag and drop implementations. 
import { Transmat, TransmatObserver, addListeners } from 'transmat'; /* Send data on drag/copy. */ addListeners(myElement, 'transmit', (event) => { const transmat = new Transmat(event); transmat.setData({ 'text/plain': 'Foobar', 'application/json': { foo: 'bar' }, }); }); /* Receive data on drop/paste. */ addListeners(myElement, 'receive', (event) => { const transmat = new Transmat(event); if (transmat.hasType('application/json') && transmat.accept()) { const data = JSON.parse(transmat.getData('application/json')); } }); /* Observe transfer events and highlight drop areas. */ const obs = new TransmatObserver((entries) => { for (const entry of entries) { const transmat = new Transmat(entry.event); if (transmat.hasType('application/json')) { entry.target.classList.toggle('drag-over', entry.isTarget); entry.target.classList.toggle('drag-active', entry.isActive); } } }); obs.observe(myElement); Acknowledgements # Hero image by Luba Ertel on Unsplash.
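The security section above recommends treating received data like any other user input. As a minimal sketch of that advice, the hypothetical helper below (`parseJsonLd` and its list of accepted types are assumptions made for illustration, not part of the DataTransfer API or of Transmat) parses a JSON-LD string, checks the `@context` and `@type`, and copies only the string fields it expects.

```javascript
// Hypothetical helper: validate a JSON-LD payload before using it.
// The accepted types and field names are assumptions for this sketch.
const ACCEPTED_TYPES = new Set(['ImageObject']);
const EXPECTED_FIELDS = [
  'name',
  'description',
  'contentUrl',
  'contentLocation',
  'datePublished',
];

function parseJsonLd(raw) {
  if (typeof raw !== 'string' || raw.length === 0) return null;

  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // Not valid JSON.
  }

  // Only plain objects with the expected context and type are accepted.
  if (data === null || typeof data !== 'object' || Array.isArray(data)) return null;
  if (data['@context'] !== 'https://schema.org') return null;
  if (!ACCEPTED_TYPES.has(data['@type'])) return null;

  // Copy only the string fields we expect; everything else is dropped.
  const result = { '@type': data['@type'] };
  for (const field of EXPECTED_FIELDS) {
    if (typeof data[field] === 'string') result[field] = data[field];
  }
  return result;
}
```

In a drop or paste handler, the string passed in would typically come from `event.dataTransfer.getData('application/ld+json')`. A `null` result means the payload should be ignored.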

Mainline Menswear implements PWA and sees a 55% conversion rate uplift

Mainline is an online clothing retailer that offers the biggest designer brand names in fashion. The UK-based company entrusts its team of in-house experts, blended strategically with key partners, to provide a frictionless shopping experience for all. With market presence in over 100 countries via seven custom-built territorial websites and an app, Mainline will continue to ensure the ecommerce offering is rivalling the competition. Challenge # Mainline Menswear's goal was to complement the current mobile optimized website with progressive features that would adhere to their 'mobile first' vision, focusing on mobile-friendly design and functionality with a growing smartphone market in mind. Solution # The objective was to build and launch a PWA that complemented the original mobile friendly version of the Mainline Menswear website, and then compare the stats to their hybrid mobile app, which is currently available on Android and iOS. Once the app launched and was being used by a small section of Mainline Menswear users, they were able to determine the difference in key stats between PWA, app, and Web. The approach Mainline took when converting their website to a PWA was to make sure that the framework they selected for their website (Nuxt.js, utilizing Vue.js) would be future-proof and enable them to take advantage of fast moving web technology. Results # 139% More pages per session in PWA vs. web. 161% Longer session durations in PWA vs. web. 10% Lower bounce rate in PWA vs. web 12.5% Higher average order value in PWA vs. web 55% Higher conversion rate in PWA vs. web. 243% Higher revenue per session in PWA vs. web. Technical deep dive # Mainline Menswear is using the Nuxt.js framework to bundle and render their site, which is a single page application (SPA). Generating a service worker file # For generating the service worker, Mainline Menswear added configuration through a custom implementation of the nuxt/pwa Workbox module. 
The reason they forked the nuxt/pwa module was to allow the team to add more customizations to the service worker file that they weren't able to or had issues with when using the standard version. One such optimization was around the offline functionality of the site like, for example, serving a default offline page and gathering analytics while offline. Anatomy of the web application manifest # The team generated a manifest with icons for different mobile app icon sizes and other web app details like name, description and theme_color: { "name": "Mainline Menswear", "short_name": "MMW", "description": "Shop mens designer clothes with Mainline Menswear. Famous brands including Hugo Boss, Adidas, and Emporio Armani.", "icons": [ { "src": "/_nuxt/icons/icon_512.c2336e.png", "sizes": "512x512", "type": "image/png" } ], "theme_color": "#107cbb" } The web app, once installed, can be launched from the home screen without the browser getting in the way. This is achieved by adding the display parameter in the web application manifest file: { "display": "standalone" } Last but not the least, the company is now able to easily track how many users are visiting their web app from the home screen by simply appending a utm_source parameter in the start_url field of the manifest: { "start_url": "/?utm_source=pwa" } See Add a web app manifest for a more in-depth explanation of all the web application manifest fields. Runtime caching for faster navigations # Caching for web apps is a must for page speed optimization and for providing a better user experience for returning users. For caching on the web, there are quite a few different approaches. The team is using a mix of the HTTP cache and the Cache API for caching assets on the client side. The Cache API gives Mainline Menswear finer control over the cached assets, allowing them to apply complex strategies to each file type. 
While all this sounds complicated and hard to set up and maintain, Workbox provides them with an easy way of declaring such complex strategies and eases the pain of maintenance. Caching CSS and JS # For CSS and JS files, the team chose to cache them and serve them over the cache using the StaleWhileRevalidate Workbox strategy. This strategy allows them to serve all Nuxt CSS and JS files fast, which significantly increases their site's performance. At the same time, the files are being updated in the background to the latest version for the next visit: /* sw.js */ workbox.routing.registerRoute( /\/_nuxt\/.*(?:js|css)$/, new workbox.strategies.StaleWhileRevalidate({ cacheName: 'css_js', }), 'GET', ); Caching Google fonts # The strategy for caching Google Fonts depends on two file types: The stylesheet that contains the @font-face declarations. The underlying font files (requested within the stylesheet mentioned above). // Cache the Google Fonts stylesheets with a stale-while-revalidate strategy. workbox.routing.registerRoute( /https:\/\/fonts\.googleapis\.com\/*/, new workbox.strategies.StaleWhileRevalidate({ cacheName: 'google_fonts_stylesheets', }), 'GET', ); // Cache the underlying font files with a cache-first strategy for 1 year. workbox.routing.registerRoute( /https:\/\/fonts\.gstatic\.com\/*/, new workbox.strategies.CacheFirst({ cacheName: 'google_fonts_webfonts', plugins: [ new workbox.cacheableResponse.CacheableResponsePlugin({ statuses: [0, 200], }), new workbox.expiration.ExpirationPlugin({ maxAgeSeconds: 60 * 60 * 24 * 365, // 1 year maxEntries: 30, }), ], }), 'GET', ); A full example of the common Google Fonts strategy can be found in the Workbox Docs. Caching images # For images, Mainline Menswear decided to go with two strategies. The first strategy applies to all images coming from their CDN, which are usually product images. Their pages are image-heavy so they are conscious of not taking too much of their users' device storage. 
So through Workbox, they added a strategy that caches only the images coming from their CDN, with a maximum of 60 images enforced by the ExpirationPlugin. When a 61st (newest) image is requested, it replaces the 1st (oldest) image, so that no more than 60 product images are cached at any point in time. workbox.routing.registerRoute( ({ url, request }) => url.origin === 'https://mainline-menswear-res.cloudinary.com' && request.destination === 'image', new workbox.strategies.StaleWhileRevalidate({ cacheName: 'product_images', plugins: [ new workbox.expiration.ExpirationPlugin({ /* Only cache 60 images. */ maxEntries: 60, purgeOnQuotaError: true, }), ], }), ); The second image strategy handles the rest of the images being requested by the origin. These images tend to be very few and small across the whole origin, but to be on the safe side, the number of these cached images is also limited to 60. workbox.routing.registerRoute( /\.(?:png|gif|jpg|jpeg|svg|webp)$/, new workbox.strategies.StaleWhileRevalidate({ cacheName: 'images', plugins: [ new workbox.expiration.ExpirationPlugin({ /* Only cache 60 images. */ maxEntries: 60, purgeOnQuotaError: true, }), ], }), ); Objective: Even though the caching strategy is exactly the same as the previous one, splitting images into two caches (product_images and images) allows for more flexible updates to the strategies or caches. Providing offline functionality # The offline page is precached right after the service worker is installed and activated. They do this by creating a list of all offline dependencies: the offline HTML file and an offline SVG icon.
const OFFLINE_HTML = '/offline/offline.html'; const PRECACHE = [ { url: OFFLINE_HTML, revision: '70f044fda3e9647a98f084763ae2c32a' }, { url: '/offline/offline.svg', revision: 'efe016c546d7ba9f20aefc0afa9fc74a' }, ]; The precache list is then fed into Workbox which takes care of all the heavy lifting of adding the URLs to the cache, checking for any revision mismatch, updating, and serving the precached files with a CacheFirst strategy. workbox.precaching.precacheAndRoute(PRECACHE); Handling offline navigations # Once the service worker activates and the offline page is precached, it is then used to respond to offline navigation requests by the user. While Mainline Menswear's web app is an SPA, the offline page shows only after the page reloads, the user closes and reopens the browser tab, or when the web app is launched from the home screen while offline. To achieve this, Mainline Menswear provided a fallback to failed NavigationRoute requests with the precached offline page: const htmlHandler = new workbox.strategies.NetworkOnly(); const navigationRoute = new workbox.routing.NavigationRoute(({ event }) => { const request = event.request; // A NavigationRoute matches navigation requests in the browser, i.e. requests for HTML return htmlHandler.handle({ event, request }).catch(() => caches.match(OFFLINE_HTML, { ignoreSearch: true })); }); workbox.routing.registerRoute(navigationRoute); Demo # Offline page example as seen on www.mainlinemenswear.co.uk. Reporting successful installs # Apart from the home screen launch tracking (with "start_url": "/?utm_source=pwa" in the web application manifest), the web app also reports successful app installs by listening to the appinstalled event on window: window.addEventListener('appinstalled', (evt) => { ga('send', 'event', 'Install', 'Success'); }); Adding PWA capabilities to your website will further enhance your customers experience of shopping with you, and will be quicker to market than a [platform-specific] app. 
Andy Hoyle, Head of Development Conclusion # To learn more about progressive web apps and how to build them, head to the Progressive Web Apps section on web.dev. To read more Progressive Web Apps case studies, browse to the case studies section.
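The image caching section above describes a cache capped at 60 entries, where the oldest entry is dropped when a new one arrives. The sketch below illustrates that eviction idea only, with a plain in-memory `Map`. The `CappedCache` class is hypothetical and is not how Workbox implements its ExpirationPlugin internally.

```javascript
// Illustrative only: a tiny in-memory cache capped at `maxEntries`.
// This is an assumption-laden sketch, not Workbox's implementation.
class CappedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // A Map iterates in insertion order.
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    this.entries.delete(key);
    this.entries.set(key, value);
    // Evict from the front (least recently used) until within the cap.
    while (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  has(key) {
    return this.entries.has(key);
  }
}
```

With `new CappedCache(60)`, storing a 61st image URL would evict whichever entry was used least recently, which mirrors the behavior the article describes for the `product_images` cache.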

Building split text animations

In this post I want to share thinking on ways to solve split text animations and interactions for the web that are minimal, accessible, and work across browsers. Try the demo. If you prefer video, a YouTube version of this post is available. Overview # Split text animations can be amazing. We'll be barely scratching the surface of animation potential in this post, but it does provide a foundation to build upon. The goal is to animate progressively. The text should be readable by default, with the animation built on top. Split text motion effects can get extravagant and potentially disruptive, so we will only manipulate HTML, or apply motion styles, if the user is OK with motion. Here's a general overview of the workflow and results: Prepare reduced motion conditional variables for CSS and JS. Prepare split text utilities in JavaScript. Orchestrate the conditionals and utilities on page load. Write CSS transitions and animations for letters and words (the rad part!). Here's a preview of the conditional results we're going for: If a user prefers reduced motion, we leave the HTML document alone and do no animation. If motion is OK, we go ahead and chop it up into pieces. Here's a preview of the HTML after JavaScript has split the text by letter. Preparing motion conditionals # The conveniently available @media (prefers-reduced-motion: reduce) media query will be used from CSS and JavaScript in this project. This media query is our primary conditional for deciding to split text or not. The CSS media query will be used to withhold transitions and animations, while the JavaScript media query will be used to withhold the HTML manipulation. Question: What else should be used to withhold split text animations?
Preparing the CSS conditional # I used PostCSS to enable the syntax of Media Queries Level 5, where I can store a media query boolean into a variable: @custom-media --motionOK (prefers-reduced-motion: no-preference); Preparing the JS conditional # In JavaScript, the browser provides a way to check media queries. I used destructuring to extract and rename the boolean result from the media query check: const {matches:motionOK} = window.matchMedia( '(prefers-reduced-motion: no-preference)' ) I can then test for motionOK, and only change the document if the user has not requested to reduce motion. if (motionOK) { /* document split manipulations */ } I can check the same value by using PostCSS to enable the @nest syntax from Nesting Draft 1. This allows me to store all the logic about the animation and its style requirements for the parent and children in one place: [letter-animation] { @media (--motionOK) { /* animation styles */ } } With the PostCSS custom media query and a JavaScript boolean, we're ready to conditionally upgrade the effect. That rolls us into the next section where I break down the JavaScript for transforming strings into elements. Splitting Text # Text letters, words, lines, etc., cannot be individually animated with CSS or JS. To achieve the effect, we need boxes. If we want to animate each letter, then each letter needs to be an element. If we want to animate each word, then each word needs to be an element. There are two steps: create JavaScript utility functions for splitting strings into elements, then orchestrate the usage of these utilities. In this demo I'll be splitting the text from JavaScript on the DOM of the page. If you're in a framework or on the server, you could split the text into elements from there, but do so respectfully. Splitting letters utility function # A fun place to start is with a function which takes a string and returns each letter in an array.
export const byLetter = text => [...text].map(span) The spread syntax from ES6 really helped make that a swift task. Splitting words utility function # Similar to splitting letters, this function takes a string and returns each word in an array. export const byWord = text => text.split(' ').map(span) The split() method on JavaScript strings allows us to specify which characters to slice at. I passed an empty space, indicating a split between words. Making boxes utility function # The effect requires boxes for each letter, and we see in those functions, that map() is being called with a span() function. Here is the span() function. const span = (text, index) => { const node = document.createElement('span') node.textContent = text node.style.setProperty('--index', index) return node } It's crucial to note that a custom property called --index is being set with the array position. Having the boxes for the letter animations is great, but having an index to use in CSS is a seemingly small addition with a large impact. Most notable in this large impact is staggering. We'll be able to use --index as a way of offsetting animations for a staggered look. Utilities conclusion # The splitting.js module in completion: const span = (text, index) => { const node = document.createElement('span') node.textContent = text node.style.setProperty('--index', index) return node } export const byLetter = text => [...text].map(span) export const byWord = text => text.split(' ').map(span) Next is importing and using these byLetter() and byWord() functions. Split orchestration # With the splitting utilities ready to use, putting it all together means: Finding which elements to split Splitting them and replacing text with HTML After that, CSS takes over and will animate the elements / boxes. Finding Elements # I chose to use attributes and values to store information about the desired animation and how to split the text. I liked putting these declarative options into the HTML. 
The attribute split-by is used from JavaScript, to find elements and create boxes for either letters or words. The attribute letter-animation or word-animation is used from CSS, to target element children and apply transforms and animations. Here's a sample of HTML that demonstrates the two attributes: <h1 split-by="letter" letter-animation="breath">animated letters</h1> <h1 split-by="word" word-animation="trampoline">hover the words</h1> Finding elements from JavaScript # I used the CSS selector syntax for attribute presence to gather the list of elements which want their text split: const splitTargets = document.querySelectorAll('[split-by]') Finding elements from CSS # I also used the attribute presence selector in CSS to give all letter animations the same base styles. Later, we'll use the attribute value to add more specific styles to achieve an effect. [letter-animation] { @media (--motionOK) { /* animation styles */ } } Splitting text in place # For each of the split targets we find in JavaScript, we'll split their text based on the value of the attribute and map each string to a <span>.
We can then replace the text of the element with the boxes we made: splitTargets.forEach(node => { const type = node.getAttribute('split-by') let nodes = null if (type === 'letter') { nodes = byLetter(node.innerText) } else if (type === 'word') { nodes = byWord(node.innerText) } if (nodes) { node.firstChild.replaceWith(...nodes) } }) Orchestration conclusion # index.js in completion: import {byLetter, byWord} from './splitting.js' const {matches:motionOK} = window.matchMedia( '(prefers-reduced-motion: no-preference)' ) if (motionOK) { const splitTargets = document.querySelectorAll('[split-by]') splitTargets.forEach(node => { const type = node.getAttribute('split-by') let nodes = null if (type === 'letter') nodes = byLetter(node.innerText) else if (type === 'word') nodes = byWord(node.innerText) if (nodes) node.firstChild.replaceWith(...nodes) }) } The JavaScript could be read in the following English: Import some helper utility functions. Check if motion is ok for this user, if not do nothing. For each element that wants to be split. Split them based on how they want to be split. Replace text with elements. Splitting animations and transitions # The above splitting document manipulation has just unlocked a multitude of potential animations and effects with CSS or JavaScript. There are a few links at the bottom of this article to help inspire your splitting potential. Time to show what you can do with this! I'll share 4 CSS driven animations and transitions. 🤓 Split letters # As a foundation for the split letter effects, I found the following CSS to be helpful. I put all transitions and animations behind the motion media query and then give each new child letter span a display property plus a style for what to do with white spaces: [letter-animation] > span { display: inline-block; white-space: break-spaces; } The white spaces style is important so that the spans which are only a space, aren't collapsed by the layout engine. Now onto the stateful fun stuff. 
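To see what the splitting step produces without needing a DOM, here is a small sketch that mirrors byLetter() and byWord() but returns plain objects instead of `<span>` elements. The names `lettersOf`, `wordsOf`, and `staggerDelayMs`, and the 100ms step, are assumptions for illustration. The delay helper only shows the kind of offset a staggered animation could derive from the index.

```javascript
// DOM-free sketch: describe the boxes the split utilities would create.
// Each box records its text and its array position (the --index value).
const describeBox = (text, index) => ({ text, index });

// Spread splits the string into letters; spaces become their own boxes.
const lettersOf = text => [...text].map(describeBox);

// Splitting on a single space yields one box per word; spaces are dropped.
const wordsOf = text => text.split(' ').map(describeBox);

// Hypothetical helper: a per-box delay in milliseconds, growing with index.
const staggerDelayMs = (index, stepMs = 100) => index * stepMs;

const letters = lettersOf('hi yo');
const words = wordsOf('hi yo');

console.log(letters.map(box => box.text));
console.log(words.map(box => `${box.text} starts after ${staggerDelayMs(box.index)}ms`));
```

Note that splitting by letter keeps the space as its own box, which is why the white-space style above matters, while splitting by word discards the spaces entirely.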
Transition split letters example # This example applies CSS transitions to the split text effect. With transitions we need states for the engine to animate between, and I chose three states: no hover, hover in sentence, hover on a letter. When the user hovers the sentence, aka the container, I scale back all the children as if the user pushed them further away. Then, as the user hovers a letter, I bring it forward. @media (--motionOK) { [letter-animation="hover"] { &:hover > span { transform: scale(.75); } & > span { transition: transform .3s ease; cursor: pointer; &:hover { transform: scale(1.25); } } } } Animate split letters example # This example uses a predefined @keyframes animation to infinitely animate each letter, and leverages the inline custom property index to create a stagger effect. @media (--motionOK) { [letter-animation="breath"] > span { animation: breath 1200ms ease calc(var(--index) * 100 * 1ms) infinite alternate; } } @keyframes breath { from { animation-timing-function: ease-out; } to { transform: translateY(-5px) scale(1.25); text-shadow: 0 0 25px var(--glow-color); animation-timing-function: ease-in-out; } } Objective: CSS calc() resolves to a <time> value here because the unitless numbers are multiplied by 1ms. The 1ms is used strategically to convert the otherwise unitless number into a value of <time> for the animation. Split words # Flexbox worked as a container type for me here in these examples, nicely leveraging the ch unit as a healthy gap length. [word-animation] { display: inline-flex; flex-wrap: wrap; gap: 1ch; } Flexbox devtools showing the gap between words Transition split words example # In this transition example I use hover again. As the effect initially hides the content until hover, I ensured that the interaction and styles were only applied if the device had the capability to hover.
@media (hover) { [word-animation="hover"] { overflow: hidden; overflow: clip; & > span { transition: transform .3s ease; cursor: pointer; &:not(:hover) { transform: translateY(50%); } } } } Animate split words example # In this animation example I use CSS @keyframes again to create a staggered infinite animation on a regular paragraph of text. [word-animation="trampoline"] > span { display: inline-block; transform: translateY(100%); animation: trampoline 3s ease calc(var(--index) * 150 * 1ms) infinite alternate; } @keyframes trampoline { 0% { transform: translateY(100%); animation-timing-function: ease-out; } 50% { transform: translateY(0); animation-timing-function: ease-in; } } Conclusion # Now that you know how I did it, how would you?! 🙂 Let's diversify our approaches and learn all the ways to build on the web. Create a Codepen or host your own demo, tweet me with it, and I'll add it to the Community remixes section below. Source GUI Challenges source on GitHub Splitting text Codepen starter More demos and inspiration Splitting text Codepen collection Splitting.js Community remixes # <text-hover> web component by gnehcwu on CodeSandbox

Take the 2021 scroll survey to help improve scrolling on the web

We've been analyzing the results of the 2019 Mozilla Developer Network Web DNA Report for action items and follow up strategies to improve the top reported issues. A small sub-group on the Chrome team identified a group of issues related to scrolling, scroll-snap, and touch-action. Those issues were researched, evaluated, and synthesized into a new, follow-up survey with more precise questions about issues related to these topics. 2021 Scroll Survey # We believe that there are many ways to improve scrolling on the web for developers, designers, and users alike. We've created a survey which we hope is respectful of your time. It should take no more than 10 minutes to complete. The results will help browser vendors and standards groups understand how to make scrolling better. You can take part in the survey at 2021 Web Scrolling Survey.

How Zalando reduced performance feedback time from 1 day to 15 minutes with Lighthouse CI

This case study was authored by Jeremy Colin and Jan Brockmeyer from the Zalando web infrastructure team. With more than 35 million active customers, Zalando is Europe's leading online fashion platform. In this post we explain why we started to use Lighthouse CI, the ease of implementation, and the advantages to our team. At Zalando, we know the relationship between website performance and revenue. In the past, we tested how artificially increasing the loading time on Catalog pages affected bounce rates, conversion rates, and revenue per user. The results were clear. A 100-millisecond page load time improvement led to increased engagement with lower bounce rate and a 0.7% uplift in revenue per session. 100ms Page load time improvement 0.7% Increased revenue per session Company buy-in does not always translate to performance # Despite the strong performance buy-in inside the company, if performance is not set as a product delivery criterion it can easily slip away. When we were redesigning the Zalando website in 2020 we focused on delivering new features while maintaining excellent user experience and applying a facelift to the website with custom fonts and more vibrant colors. However, when the redesigned website and app were ready for release, early adopter metrics revealed that the new version was slower. First Contentful Paint was up to 53% slower, and our measured Time to Interactive was up to 59% slower. The web at Zalando # The Zalando website is created by a core team developing a framework, with over 15 feature teams contributing frontend microservices. While supporting the new release, we also transitioned part of our website to a more centralized architecture. The previous architecture, called Mosaic, included a way to measure page performance with in-house metrics. However, it was difficult to compare performance metrics prior to rolling out to real users as we lacked internal lab performance monitoring tools.
Despite deploying every day, there was a feedback loop of around one day for developers working on performance improvements. Web Vitals and Lighthouse to the rescue # We were not entirely satisfied with our in-house metrics as they did not adapt well to our new setup. More importantly, they were not centered on customer experience. We switched to Core Web Vitals as they provided a condensed, yet comprehensive and user-centric set of metrics. In order to improve the performance before the release, we needed to create a proper lab environment. This provided reproducible measurements, in addition to testing conditions representing our 90th percentile of field data. Now, engineers working on performance improvements knew where to focus their efforts to make the biggest impact. We were already using Lighthouse audit reports locally. So our first iteration was to develop a service based on the Lighthouse node module, where changes could be tested from our staging environment. This gave us a reliable performance feedback loop of around one hour, which enabled us to bring the performance on par and save our release! Giving performance feedback to developers on pull requests # We did not want to stop there, as we wanted to take the opportunity to be not only reactive towards performance but also proactive. Making the jump from the Lighthouse node module to a Lighthouse CI (LHCI) server was not too difficult. We opted for the self-hosted solution in order to give us a better integration with our existing company services. Our LHCI server application gets built as a Docker image, which is then deployed to our Kubernetes cluster together with a PostgreSQL database, and reports to our GitHub. Our framework was already providing some performance feedback to developers: component bundle sizes were being compared to threshold values on every commit. Now we are able to report Lighthouse metrics as GitHub status checks.
These cause the CI pipeline to fail if they do not meet the performance thresholds, with a link to the detailed Lighthouse reports as shown in the following images. Lighthouse CI GitHub status checks make it easy for developers to understand the regression and address it before it reaches production. Lighthouse CI detailed commit report compared to the main branch. Extending the performance coverage # We started with a very pragmatic approach. Currently Lighthouse only runs on two of our most important pages: the home page and product detail page. Fortunately, Lighthouse CI makes it easy to extend the run configurations. Feature teams working on specific pages of our website are able to set up their matching URL pattern and assertions. With this in place, we are pretty confident that our performance coverage will increase. We are now much more confident when building larger releases, and developers can enjoy a much shorter feedback loop on the performance of their code.
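The per-page setup described above, with a matching URL pattern and assertions for each feature team, could look roughly like the following Lighthouse CI configuration. This is a hypothetical sketch, not Zalando's actual configuration: the URLs, thresholds, server address, and environment variable name are all placeholders chosen for illustration.

```javascript
// lighthouserc.js: a hypothetical sketch. All URLs, thresholds, and the
// token variable name below are placeholder assumptions.
module.exports = {
  ci: {
    collect: {
      // Pages to audit, for example a home page and a product detail page.
      url: [
        'https://staging.example.com/',
        'https://staging.example.com/product/123',
      ],
      numberOfRuns: 3,
    },
    assert: {
      // Each entry applies its assertions only to URLs matching the pattern.
      assertMatrix: [
        {
          matchingUrlPattern: '^https://staging\\.example\\.com/$',
          assertions: {
            'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
            'cumulative-layout-shift': ['error', { maxNumericValue: 0.1 }],
          },
        },
        {
          matchingUrlPattern: '.*/product/.*',
          assertions: {
            'first-contentful-paint': ['error', { maxNumericValue: 2000 }],
            'cumulative-layout-shift': ['warn', { maxNumericValue: 0.1 }],
          },
        },
      ],
    },
    upload: {
      // Send results to a self-hosted LHCI server.
      target: 'lhci',
      serverBaseUrl: 'https://lhci.example.com',
      token: process.env.LHCI_BUILD_TOKEN,
    },
  },
};
```

An assertion at the `error` level fails the run when the threshold is exceeded, while `warn` only reports it, which gives teams a way to introduce new budgets gradually.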

Evolving the CLS metric

We (the Chrome Speed Metrics Team) recently outlined our initial research into options for making the CLS metric more fair to pages that are open for a long time. We've received a lot of very helpful feedback and after completing the large-scale analysis, we've finalized the change we plan to make to the metric: maximum session window with 1 second gap, capped at 5 seconds. Read on for the details! How did we evaluate the options? # We reviewed all the feedback received from the developer community and took it into account. We also implemented the top options in Chrome and did a large-scale analysis of the metrics over millions of web pages. We checked what types of sites each option improved, and how the options compared, especially looking into the sites which were scored differently by different options. Overall, we found that: All the options reduced the correlation between time spent on page and layout shift score. None of the options resulted in a worse score for any page. So there is no need to be concerned that this change will worsen the scores for your site. Decision points # Why a session window? # In our earlier post, we covered a few different windowing strategies for grouping together layout shifts while ensuring the score doesn't grow unbounded. The feedback we received from developers overwhelmingly favored the session window strategy because it groups the layout shifts together most intuitively. To review session windows, here's an example: In the example above, many layout shifts occur over time as the user views the page. Each is represented by a blue bar. You'll notice above that the blue bars have different heights; those represent the score of each individual layout shift. A session window starts with the first layout shift and continues to expand until there is a gap with no layout shifts. When the next layout shift occurs, a new session window starts. 
Since there are three gaps with no layout shifts, there are three session windows in the example. Similar to the current definition of CLS, the scores of each shift are added up, so that each window's score is the sum of its individual layout shifts. Based on the initial research, we chose a 1 second gap between session windows, and that gap worked well in our large-scale analysis. So the "Session Gap" shown in the example above is 1 second. Why the maximum session window? # We narrowed the summarization strategies down to two options in our initial research: The average score of all the session windows, for very large session windows (uncapped windows with 5 second gaps between them). The maximum score of all the session windows, for smaller session windows (capped at 5 seconds, with 1 second gaps between them). After the initial research, we added each metric to Chrome so that we could do a large-scale analysis over millions of URLs. In the large-scale analysis, we found a lot of URLs with layout shift patterns like this: On the bottom right, you can see there is only a single, tiny layout shift in Session Window 2, giving it a very low score. That means that the average score is pretty low. But what if the developer fixes that tiny layout shift? Then the score is calculated just on Session Window 1, which means that the page's score nearly doubles. It would be really confusing and discouraging to developers to improve their layout shifts only to find that the score got worse. And removing this small layout shift is obviously slightly better for the user experience, so it shouldn't worsen the score. Because of this problem with averages, we decided to move forward with the smaller, capped, maximum windows. So in the example above, Session Window 2 would be ignored and only the sum of the layout shifts in Session Window 1 would be reported. Why 5 seconds? 
# We evaluated multiple window sizes and found two things: For short windows, slower page loads and slower responses to user interactions could break layout shifts into multiple windows and improve the score. We wanted to keep the window large enough so it doesn't reward slowdowns! There are some pages with a continual stream of small layout shifts. For example, a sports score page that shifts a bit with each score update. These shifts are annoying, but they don't get more annoying as time passes. So we wanted to ensure that the window was capped for these types of layout shifts. With these two things in mind, comparing a variety of window sizes on many real-world web pages, we concluded that 5 seconds would be a good limit to the window size. How will this affect my page's CLS score? # Since this update caps the CLS of a page, no page will have a worse score as a result of this change. And based on our analysis, 55% of origins will not see a change in CLS at all at the 75th percentile. This is because their pages either do not currently have any layout shifts or the shifts they do have are already confined to a single session window. The rest of the origins will see improved scores at the 75th percentile with this change. Most will only see a slight improvement, but about 3% will see their scores improve from having a "needs improvement" or "poor" rating to having a "good" rating. These pages tend to use infinite scrollers or have many slow UI updates, as described in our earlier post. How can I try it out? # We'll be updating our tools to use the new metric definition soon! Until then, you can try out the updated version of CLS on any site using the example JavaScript implementations or the fork of the Web Vitals extension. Thanks to everyone who took the time to read the previous post and give their feedback!
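The windowing rule described in this post (maximum session window, 1 second gap, capped at 5 seconds) can be sketched in a few lines of JavaScript. This is a minimal illustration, not the official implementation: the function name `maxSessionWindowCLS`, the `{startTime, value}` input shape, and the exact boundary handling (`>=`) are assumptions made for the example. It also assumes the caller has already dropped shifts that should not count toward CLS.

```javascript
// Sketch only: maximum session window, 1 second gap, capped at 5 seconds.
// `shifts` is a hypothetical array of {startTime, value} objects
// (startTime in milliseconds), sorted by startTime.
function maxSessionWindowCLS(shifts, gapMs = 1000, capMs = 5000) {
  let maxScore = 0;
  let windowScore = 0;
  let windowStart = 0;
  let previousShiftTime = -Infinity;

  for (const {startTime, value} of shifts) {
    // A new window starts after a gap with no shifts, or once the
    // current window has reached its maximum length.
    const gapExceeded = startTime - previousShiftTime >= gapMs;
    const capExceeded = startTime - windowStart >= capMs;
    if (gapExceeded || capExceeded) {
      windowStart = startTime;
      windowScore = 0;
    }
    // Each window's score is the sum of its individual layout shifts.
    windowScore += value;
    previousShiftTime = startTime;
    // The reported value is the largest window score seen so far.
    maxScore = Math.max(maxScore, windowScore);
  }
  return maxScore;
}
```

With shifts at 0 ms (0.25), 500 ms (0.5), and 3000 ms (0.125), the first two share a window and the third starts a new one, so the function returns 0.75.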

Debug Web Vitals in the field

Google currently provides two categories of tools to measure and debug Web Vitals: Lab tools: Tools such as Lighthouse, where your page is loaded in a simulated environment that can mimic various conditions (for example, a slow network and a low-end mobile device). Field tools: Tools such as Chrome User Experience Report (CrUX), which is based on aggregate, real-user data from Chrome. (Note that the field data reported by tools such as PageSpeed Insights and Search Console is sourced from CrUX data.) While field tools offer more accurate data—data which actually represents what real users experience—lab tools are often better at helping you identify and fix issues. CrUX data is more representative of your page's real performance, but knowing your CrUX scores is unlikely to help you figure out how to improve your performance. Lighthouse, on the other hand, will identify issues and make specific suggestions for how to improve. However, Lighthouse will only make suggestions for performance issues it discovers at page load time. It does not detect issues that only manifest as a result of user interaction such as scrolling or clicking buttons on the page. This raises an important question: how can you capture debug information for the Web Vitals metric data from real users in the field? This post will explain in detail what APIs you can use to collect additional debugging information for each of the current Core Web Vitals metrics and give you ideas for how to capture this data in your existing analytics tool. APIs for attribution and debugging # CLS # Of all the Core Web Vitals metrics, CLS is perhaps the one for which collecting debug information in the field is the most important. CLS is measured throughout the entire lifespan of the page, so the way a user interacts with the page—how far they scroll, what they click on, and so on—can have a significant impact on whether there are layout shifts and which elements are shifting. 
Consider the following report from PageSpeed Insights for the URL: web.dev/measure The values reported for CLS from the lab (Lighthouse) and from the field (CrUX data) are quite different, and this makes sense if you consider that the web.dev/measure page has a lot of interactive content that is not being used when tested in Lighthouse. But even if you understand that user interaction affects field data, you still need to know what elements on the page are shifting to result in a score of 0.45 at the 75th percentile. The LayoutShiftAttribution interface makes that possible. Get layout shift attribution # The LayoutShiftAttribution interface is exposed on each layout-shift entry that the Layout Instability API emits. For a detailed explanation of both of these interfaces, see Debug layout shifts. For the purposes of this post, the main thing you need to know is that, as a developer, you are able to observe every layout shift that happens on the page as well as what elements are shifting. Here's some example code that logs each layout shift as well as the elements that shifted: new PerformanceObserver((list) => { for (const {value, startTime, sources} of list.getEntries()) { // Log the shift amount and other entry info. console.log('Layout shift:', {value, startTime}); if (sources) { for (const {node, currentRect, previousRect} of sources) { // Log the elements that shifted. console.log(' Shift source:', node, {currentRect, previousRect}); } } } }).observe({type: 'layout-shift', buffered: true}); It's probably not practical to measure and send data to your analytics tool for every single layout shift that occurs; however, by monitoring all shifts, you can keep track of the worst shifts and just report information about those. The goal isn't to identify and fix every single layout shift that occurs for every user; the goal is to identify the shifts that affect the largest number of users and thus contribute the most to your page's CLS at the 75th percentile.
Also, you don't need to compute the largest source element every time there's a shift; you only need to do so when you're ready to send the CLS value to your analytics tool. The following code takes a list of layout-shift entries that have contributed to CLS and returns the largest source element from the largest shift: function getCLSDebugTarget(entries) { const largestShift = entries.reduce((a, b) => { return a && a.value > b.value ? a : b; }); if (largestShift && largestShift.sources) { const largestSource = largestShift.sources.reduce((a, b) => { return a.node && a.previousRect.width * a.previousRect.height > b.previousRect.width * b.previousRect.height ? a : b; }); if (largestSource) { return largestSource.node; } } } Once you've identified the largest element contributing to the largest shift, you can report that to your analytics tool. The element contributing the most to CLS for a given page will likely vary from user to user, but if you aggregate those elements across all users, you'll be able to generate a list of the shifting elements that affect the greatest number of users. Once you've identified and fixed the root cause of the shifts for those elements, your analytics code will start reporting smaller shifts as the "worst" shifts for your pages. Eventually, all reported shifts will be small enough that your pages are well within the "good" threshold of 0.1! Some other metadata that may be useful to capture along with the largest shift source element is: The time of the largest shift The URL path at the time of the largest shift (for sites that dynamically update the URL, such as Single Page Applications). LCP # To debug LCP in the field, the primary information you need is which particular element was the largest element (the LCP candidate element) for that particular page load. Note that it's entirely possible—in fact, it's quite common—that the LCP candidate element will be different from user to user, even for the exact same page.
This can happen for several reasons: User devices have different screen resolutions, which results in different page layouts and thus different elements being visible within the viewport. Users don't always load pages scrolled to the very top. Oftentimes links will contain fragment identifiers or even text fragments, which means it's possible for your pages to be loaded and displayed at any scroll position on the page. Content may be personalized for the current user, so the LCP candidate element could vary wildly from user to user. This means you cannot make assumptions about which element or set of elements will be the most common LCP candidate element for a particular page. You have to measure it based on real-user behavior. Identify the LCP candidate element # To determine the LCP candidate element in JavaScript you can use the Largest Contentful Paint API, the same API you use to determine the LCP time value. Given a list of largest-contentful-paint entries, you can determine the current LCP candidate element by looking at the last entry: function getLCPDebugTarget(entries) { const lastEntry = entries[entries.length - 1]; return lastEntry.element; } Caution: As explained in the LCP metric documentation, the LCP candidate element can change through the page load, so more work is required to identify the "final" LCP candidate element. The easiest way to identify and measure the "final" LCP candidate element is to use the web-vitals JavaScript library, as shown in the example below. Once you know the LCP candidate element, you can send it to your analytics tool along with the metric value. As with CLS, this will help you identify which elements are most important to optimize first. Some other metadata that may be useful to capture along with the LCP candidate element: The image source URL (if the LCP candidate element is an image). The text font family (if the LCP candidate element is text and the page uses web fonts). 
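As a rough sketch, the LCP candidate element and the extra metadata listed above could be gathered in one place. The helper below is hypothetical (the name `getLCPDebugInfo` and the shape of its return value are assumptions), and it relies only on the `element`, `url`, and `startTime` properties of `largest-contentful-paint` entries. Note that `getComputedStyle()` reports the declared font-family list, which is not necessarily the font that was actually rendered.

```javascript
// Sketch only: summarize the last (most recent) LCP candidate entry.
// `getFontFamily` is an injectable lookup so the helper can be used
// outside a browser; by default it reads the element's computed style.
function getLCPDebugInfo(entries, getFontFamily = defaultGetFontFamily) {
  const lastEntry = entries[entries.length - 1];
  if (!lastEntry) return null;

  const {element, url, startTime} = lastEntry;
  return {
    // `element` can be null if the node was removed from the DOM.
    element: element || null,
    // `url` is an empty string when the candidate is not an image.
    imageUrl: url || null,
    // Only look up the font for non-image (text) candidates.
    fontFamily: !url && element ? getFontFamily(element) : null,
    startTime,
  };
}

function defaultGetFontFamily(element) {
  return getComputedStyle(element).fontFamily;
}
```

In a page, the `entries` argument would come from a `PerformanceObserver` observing the `largest-contentful-paint` type, or from the `entries` list that the web-vitals library passes to its callbacks.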
FID # To debug FID in the field, it's important to remember that FID measures only the delay portion of the overall first input event latency. That means that what the user interacted with is not really as important as what else was happening on the main thread at the time they interacted. For example, many JavaScript applications that support server-side rendering (SSR) will deliver static HTML that can be rendered to the screen before it's interactive to user input—that is, before the JavaScript required to make the content interactive has finished loading. For these types of applications, it can be very important to know whether the first input occurred before or after hydration. If it turns out that many people are attempting to interact with the page before hydration completes, consider rendering your pages in a disabled or loading state rather than in a state that looks interactive. If your application framework exposes the hydration timestamp, you can easily compare that with the timestamp of the first-input entry to determine whether the first input happened before or after hydration. If your framework doesn't expose that timestamp, or doesn't use hydration at all, another useful signal may be whether input occurred before or after JavaScript finished loading. The DOMContentLoaded event fires after the page's HTML has completely loaded and parsed, which includes waiting for any synchronous, deferred, or module scripts (including all statically imported modules) to load. So you can use the timing of that event and compare it to when FID occurred. 
The following code takes a first-input entry and returns true if the first input occurred before the DOMContentLoaded event started firing: function wasFIDBeforeDCL(fidEntry) { const navEntry = performance.getEntriesByType('navigation')[0]; return navEntry && fidEntry.startTime < navEntry.domContentLoadedEventStart; } If your page uses async scripts or dynamic import() to load JavaScript, the DOMContentLoaded event may not be a useful signal. Instead, you can consider using the load event or—if there's a particular script you know takes a while to execute—you can use the Resource Timing entry for that script. Identify the FID target element # Another potentially useful debug signal is the element that was interacted with. While the interaction with the element itself does not contribute to FID (remember FID is just the delay portion of the total event latency), knowing which elements your users are interacting with may be useful in determining how best to improve FID. For example, if the vast majority of your users' first interactions are with a particular element, consider inlining the JavaScript code needed for that element in the HTML, and lazy loading the rest. To get the element associated with the first input event, you can reference the first-input entry's target property: function getFIDDebugTarget(entries) { return entries[0].target; } Some other metadata that may be useful to capture along with the FID target element: The type of event (such as mousedown, keydown, pointerdown). Any relevant long task attribution data for the long task that occurred at the same time as the first input (useful if the page loads third-party scripts). Usage with the web-vitals JavaScript library # The sections above offer some suggestions for additional debug info to include in the data you send to your analytics tool.
Each of the examples includes some code that uses one or more performance entries associated with a particular Web Vitals metric and returns a DOM element that can be used to help debug issues affecting that metric. These examples are designed to work well with the web-vitals JavaScript library, which exposes the list of performance entries on the Metric object passed to each callback function. If you combine the examples listed above with the web-vitals metric functions, the end result will look something like this: import {getLCP, getFID, getCLS} from 'web-vitals'; function getSelector(node, maxLen = 100) { let sel = ''; try { while (node && node.nodeType !== 9) { const part = node.id ? '#' + node.id : node.nodeName.toLowerCase() + ( (node.className && node.className.length) ? '.' + Array.from(node.classList.values()).join('.') : ''); if (sel.length + part.length > maxLen - 1) return sel || part; sel = sel ? part + '>' + sel : part; if (node.id) break; node = node.parentNode; } } catch (err) { // Do nothing... } return sel; } function getLargestLayoutShiftEntry(entries) { return entries.reduce((a, b) => a && a.value > b.value ? a : b); } function getLargestLayoutShiftSource(sources) { return sources.reduce((a, b) => { return a.node && a.previousRect.width * a.previousRect.height > b.previousRect.width * b.previousRect.height ? a : b; }); } function wasFIDBeforeDCL(fidEntry) { const navEntry = performance.getEntriesByType('navigation')[0]; return navEntry && fidEntry.startTime < navEntry.domContentLoadedEventStart; } function getDebugInfo(name, entries = []) { // In some cases there won't be any entries (e.g. if CLS is 0, // or for LCP after a bfcache restore), so we have to check first. 
if (entries.length) { if (name === 'LCP') { const lastEntry = entries[entries.length - 1]; return { debug_target: getSelector(lastEntry.element), event_time: lastEntry.startTime, }; } else if (name === 'FID') { const firstEntry = entries[0]; return { debug_target: getSelector(firstEntry.target), debug_event: firstEntry.name, debug_timing: wasFIDBeforeDCL(firstEntry) ? 'pre_dcl' : 'post_dcl', event_time: firstEntry.startTime, }; } else if (name === 'CLS') { const largestEntry = getLargestLayoutShiftEntry(entries); if (largestEntry && largestEntry.sources) { const largestSource = getLargestLayoutShiftSource(largestEntry.sources); if (largestSource) { return { debug_target: getSelector(largestSource.node), event_time: largestEntry.startTime, }; } } } } // Return default/empty params in case there are no entries. return { debug_target: '(not set)', }; } function sendToAnalytics({name, value, entries}) { navigator.sendBeacon('/analytics', JSON.stringify({ name, value, ...getDebugInfo(name, entries) })); } getLCP(sendToAnalytics); getFID(sendToAnalytics); getCLS(sendToAnalytics); The specific format required to send the data will vary by analytics tool, but the above code should be sufficient to get the data needed, regardless of the format requirements. The code above also includes a getSelector() function (not mentioned in previous sections), which takes a DOM node and returns a CSS selector representing that node and its place in the DOM. It also takes an optional maximum length parameter (defaulting to 100 characters) in the event that your analytics provider has length restrictions on the data you send it. Report and visualize the data # Once you've started collecting debug information along with your Web Vitals metrics, the next step is aggregating the data across all your users to start looking for patterns and trends.
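One way to picture that aggregation step is the sketch below, which counts how often each debug target is reported for a given metric. It assumes, purely for illustration, that the payloads sent by sendToAnalytics() have been collected into an array of objects with `name` and `debug_target` fields; the function name `rankDebugTargets` is hypothetical, and a real analytics tool would normally do this grouping for you.

```javascript
// Sketch only: rank debug targets for one metric by how many reports
// mention them. `reports` is a hypothetical array of the objects sent
// by sendToAnalytics(), e.g. {name: 'CLS', value: 0.2, debug_target: '...'}.
function rankDebugTargets(reports, metricName) {
  const counts = new Map();
  for (const {name, debug_target} of reports) {
    if (name !== metricName) continue;
    counts.set(debug_target, (counts.get(debug_target) || 0) + 1);
  }
  // Most frequently reported targets first.
  return [...counts.entries()]
    .map(([debug_target, count]) => ({debug_target, count}))
    .sort((a, b) => b.count - a.count);
}
```

The targets at the top of the resulting list are the ones affecting the most page loads, which makes them reasonable candidates to investigate first.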
As mentioned above, you don't necessarily need to address every single issue your users are encountering, you want to address—especially at first—the issues that are affecting the largest number of users, which should also be the issues that have the largest negative impact on your Core Web Vitals scores. The Web Vitals Report tool # If you're using the Web Vitals Report tool, it's been recently updated to support reporting on a single debug dimension for each of the Core Web Vitals metrics. Here's a screenshot from the Web Vitals Report debug info section, showing data for the Web Vitals Report tool website itself: Using the data above, you can see that whatever is causing the section.Intro element to shift is contributing the most to CLS on this page, so identifying and fixing the cause of that shift will yield the greatest improvement to the score. Summary # Hopefully this post has helped outline the specific ways you can use the existing performance APIs to get debug information for each of the Core Web Vitals metrics based on real-user interactions in the field. While it's focused on the Core Web Vitals, the concepts also apply to debugging any performance metric that's measurable in JavaScript. If you're just getting started measuring performance, and you're already a Google Analytics user, the Web Vitals Report tool may be a good place to start because it already supports reporting debug information for each of the Core Web Vitals metrics. If you're an analytics vendor and you're looking to improve your products and provide more debugging information to your users, consider some of the techniques described here but don't limit yourself to just the ideas presented here. This post is intended to be generally applicable to all analytics tools; however, individual analytics tools likely can (and should) capture and report even more debug information. 
Lastly, if you feel there are gaps in your ability to debug these metrics due to missing features or information in the APIs themselves, send your feedback to web-vitals-feedback@googlegroups.com.

Best practices for cookie notices

This article discusses how cookie notices can affect performance, performance measurement, and user experience. Performance # Cookie notices can have a significant impact on page performance because they are typically loaded early in the page load process, are shown to all users, and can potentially influence the loading of ads and other page content. Here's how cookie notices can impact Web Vitals metrics: Largest Contentful Paint (LCP): Most cookie consent notices are fairly small and therefore typically don't contain a page's LCP element. However, this can happen—particularly on mobile devices. On mobile devices, a cookie notice typically takes up a larger portion of the screen. This usually occurs when a cookie notice contains a large block of text (text blocks can be LCP elements too). First Input Delay (FID): Generally speaking, your cookie consent solution in and of itself should have a minimal impact on FID—cookie consent requires little JavaScript execution. However, the technologies that these cookies enable—namely advertising and tracking scripts—may have a significant impact on page interactivity. Delaying these scripts until cookie acceptance can serve as a technique to decrease First Input Delay (FID). Cumulative Layout Shift (CLS): Cookie consent notices are a very common source of layout shifts. Generally speaking, you can expect a cookie notice from third-party providers to have a greater impact on performance than a cookie notice that you build yourself. This is not a problem unique to cookie notices—but rather the nature of third-party scripts in general. Best practices # The best practices in this section focus on third-party cookie notices. Some, but not all, of these best practices will also be applicable to first-party cookie notices. Load cookie notice scripts asynchronously # Cookie notice scripts should be loaded asynchronously. To do this, add the async attribute to the script tag.
<script src="https://cookie-notice.com/script.js" async></script> Scripts that are not asynchronous block the browser parser. This delays page load and LCP. For more information, see Efficiently load third-party JavaScript. If you must use synchronous scripts (for example, some cookie notices rely on synchronous scripts to implement cookie blocking) you should make sure that this request loads as quickly as possible. One way to do this is to use resource hints. Load cookie notice scripts directly # Cookie notice scripts should be loaded "directly" by placing the script tag in the HTML of the main document—rather than loaded by a tag manager or other script. Using a tag manager or secondary script to inject the cookie notice script delays the loading of the cookie notice script: it obscures the script from the browser's lookahead parser and it prevents the script from loading before JavaScript execution. Establish an early connection with the cookie notice origin # All sites that load their cookie notice scripts from a third-party location should use either the dns-prefetch or preconnect resource hints to help establish an early connection with the origin that hosts cookie notice resources. For more information, see Establish network connections early to improve perceived page speed. <link rel="preconnect" href="https://cdn.cookie-notice.com/"> It is common for cookie notices to load resources from multiple origins— for example, loading resources from both www.cookie-notice.com and cdn.cookie-notice.com. Separate origins require separate connections and therefore separate resource hints. Preload cookie notices as appropriate # Some sites would benefit from using the preload resource hint to load their cookie notice script. The preload resource hint informs the browser to initiate an early request for the specified resource.
<link rel="preload" href="https://www.cookie-notice.com/cookie-script.js" as="script"> preload is most powerful when its usage is limited to fetching a couple of key resources per page. Thus, the usefulness of preloading the cookie notice script will vary depending on the situation. Be aware of performance tradeoffs when styling cookie notices # Customizing the look and feel of a third-party cookie notice may incur additional performance costs. For example, third-party cookie notices aren't always able to reuse the same resources (for example, web fonts) that are used elsewhere on the page. In addition, third-party cookie notices tend to load styling at the end of long request chains. To avoid any surprises, be aware of how your cookie notice loads and applies styling and related resources. Avoid layout shifts # These are some of the most common layout shift issues associated with cookie notices: Top-of-screen cookie notices: Top-of-screen cookie notices are a very common source of layout shift. If a cookie notice is inserted into the DOM after the surrounding page has already rendered, it will push the page elements below it further down the page. This type of layout shift can be eliminated by reserving space in the DOM for the consent notice. If this is not a feasible solution (for example, if the dimensions of your cookie notice vary by geography), consider using a sticky footer or modal to display the cookie notice. Because both of these alternative approaches display the cookie notice as an "overlay" on top of the rest of the page, the cookie notice should not cause content shifts when it loads. Animations: Many cookie notices use animations—for example, "sliding in" a cookie notice is a common design pattern. Depending on how these effects are implemented, they can cause layout shifts. For more information, see Debugging layout shifts. Fonts: Late-loading fonts can block rendering and/or cause layout shifts. This phenomenon is more apparent on slow connections.
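The sticky footer approach mentioned above can be sketched in JavaScript as follows. This is an illustrative assumption rather than a recommended implementation: the helper name `showNoticeAsOverlay`, the style values, and the idea of applying them from script (instead of a stylesheet) are all hypothetical. The point is only that a fixed-position element is taken out of normal document flow, so adding it late should not push other page content around.

```javascript
// Sketch only: styles that pin a notice to the bottom of the viewport.
// position: fixed removes the element from normal flow, so inserting it
// does not move the surrounding page content.
const OVERLAY_NOTICE_STYLES = {
  position: 'fixed',
  left: '0',
  right: '0',
  bottom: '0',
  zIndex: '1000', // Assumed value; pick one that suits your page.
};

// Applies the overlay styles to an element-like object with a `style`
// property and returns it for convenience.
function showNoticeAsOverlay(noticeElement) {
  Object.assign(noticeElement.style, OVERLAY_NOTICE_STYLES);
  return noticeElement;
}
```

In a page, this might be called with the notice's container element before it is appended to the document. A third-party notice may manage its own positioning, so check how your provider applies styles before overriding them.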
Advanced loading optimizations # These techniques take more work to implement but can further optimize the loading of cookie notice scripts: Caching and serving third-party cookie notice scripts from your own servers can improve the delivery speed of these resources. Using service workers can allow you more control over the fetching and caching of third-party scripts such as cookie notice scripts. Performance measurement # Cookie notices can impact performance measurements. This section discusses some of these implications and techniques for mitigating them. Real User Monitoring (RUM) # Some analytics and RUM tools use cookies to collect performance data. In the event that a user declines usage of cookies these tools cannot capture performance data. Sites should be aware of this phenomenon; it is also worthwhile to understand the mechanisms that your RUM tooling uses to collect its data. However, for the typical site this discrepancy probably isn't a cause for alarm given the direction and magnitude of the data skew. Cookie usage is not a technical requirement for performance measurement. The web-vitals JavaScript library is an example of a library that does not use cookies. Depending on how your site uses cookies to collect performance data (that is, whether the cookies contain personal information), as well as the legislation in question, the use of cookies for performance measurement might not be subject to the same legislative requirements as some of the cookies used on your site for other purposes—for example, advertising cookies. Some sites choose to break out performance cookies as a separate category of cookies when asking for user consent. Synthetic monitoring # Without custom configuration, most synthetic tools (such as Lighthouse and WebPageTest) will only measure the experience of a first-time user who has not responded to a cookie consent notice. 
However, not only do variations in cache state (for example, an initial visit versus a repeat visit) need to be considered when collecting performance data, but also variations in cookie acceptance state—accepted, rejected, or unresponded. The following sections discuss WebPageTest and Lighthouse settings that can be helpful for incorporating cookie notices into performance measurement workflows. However, cookies and cookie notices are just one of many factors that can be difficult to perfectly simulate in lab environments. For this reason, it is important to make RUM data the cornerstone of your performance benchmarking, rather than synthetic tooling. Testing cookie notices with WebPageTest # Scripting # You can use scripting to have a WebPageTest "click" the cookie consent banner while collecting a trace. Add a script by going to the Script tab. The script below navigates to the URL to be tested and then clicks the DOM element with the id cookieButton. Caution: WebPageTest scripts are tab-delimited. combineSteps navigate %URL% clickAndWait id=cookieButton When using this script be aware that: combineSteps tells WebPageTest to "combine" the results of the scripting steps that follow into a single set of traces and measurements. Running this script without combineSteps can also be useful—separate traces make it easy to see whether resources were loaded before or after cookie acceptance. %URL% is a WebPageTest convention that refers to the URL that is being tested. clickAndWait tells WebPageTest to click on the element indicated by attribute=value and wait for the subsequent browser activity to complete. It follows the format clickAndWait attribute=Value. If you've configured this script correctly, the screenshot taken by WebPageTest should not show a cookie notice (the cookie notice has been accepted). For more information on WebPageTest scripting, check out WebPageTest documentation. 
Set cookies # To run WebPageTest with a cookie set, go to the Advanced tab and add the cookie header to the Custom headers field: Change the test location # To change the test location used by WebPageTest, click the Test Location dropdown located on the Advanced Testing tab. Testing cookie notices with Lighthouse # Setting cookies on a Lighthouse run can serve as a mechanism for getting a page into a particular state for testing by Lighthouse. Lighthouse's cookie behavior varies slightly by context (DevTools, CLI, or PageSpeed Insights). DevTools # Cookies are not cleared when Lighthouse is run from DevTools. However, other types of storage are cleared by default. This behavior can be changed by using the Clear Storage option in the Lighthouse settings panel. CLI # Running Lighthouse from the CLI uses a fresh Chrome instance, so no cookies are set by default. To run Lighthouse from the CLI with a particular cookie set, use the following command: lighthouse <url> --extra-headers "{\"Cookie\":\"cookie1=abc; cookie2=def; _id=foo\"}" For more information on setting custom request headers in Lighthouse CLI, see Running Lighthouse on Authenticated Pages. PageSpeed Insights # Running Lighthouse from PageSpeed Insights uses a fresh Chrome instance and does not set any cookies. PageSpeed Insights cannot be configured to set particular cookies. User experience # The user experience (UX) of different cookie consent notices is primarily the result of two decisions: the location of the cookie notice within the page and the extent to which the user can customize a site's use of cookies. This section discusses potential approaches to these two decisions. Caution: Cookie notice UX is often heavily influenced by legislation which can vary widely by geography. Thus, some of the design patterns discussed in this section may not be relevant to your particular situation. This article should not be considered a substitute for legal advice.
When considering potential designs for your cookie notice, here are some things to think about: UX: Is this a good user experience? How will this particular design affect existing page elements and user flows? Business: What is your site's cookie strategy? What are your goals for the cookie notice? Legal: Does this comply with legal requirements? Engineering: How much work would this be to implement and maintain? How difficult would it be to change? Placement # Cookie notices can be displayed as a header, inline element, or footer. They can also be displayed on top of page content using a modal or served as an interstitial. Header, footer, and inline cookie notices # Cookie notices are commonly placed in the header or footer. Of these two options, the footer placement is generally preferable because it is unobtrusive, does not compete for attention with banner ads or notifications, and typically does not cause CLS. In addition, it is a common place for placing privacy policies and terms of use. Although inline cookie notices are an option, they can be difficult to integrate into existing user interfaces, and therefore are uncommon. Modals # Modals are cookie consent notices that are displayed on top of page content. Modals can look and perform quite differently depending on their size. Smaller, partial-screen modals can be a good alternative for sites that are struggling to implement cookie notices in a way that doesn't cause layout shifts. On the other hand, large modals that obscure the majority of page content should be used carefully. In particular, smaller sites may find that users bounce rather than accept the cookie notice of an unfamiliar site with obscured content. Although they are not necessarily synonymous concepts, if you are considering using a full-screen cookie consent modal, you should be aware of legislation regarding cookie walls. Large modals can be considered a type of interstitial. 
Google Search does not penalize the usage of interstitials when they are used to comply with legal regulations such as in the case of cookie banners. However, the usage of interstitials in other contexts, particularly if they are intrusive or create a poor user experience, may be penalized. Configurability # Cookie notice interfaces give users varying levels of control over which cookies they accept. No configurability # These notice-style cookie banners do not present users with direct UX controls for opting out of cookies. Instead, they typically include a link to the site's cookie policy which may provide users with information about managing cookies using their web browser. These notices typically include a "Dismiss" button, an "Accept" button, or both. Some configurability # These cookie notices give the user the option of declining cookies but do not support more granular controls. This approach to cookie notices is less common. Full configurability # These cookie notices provide users with more fine-grained controls for configuring the cookie usage that they accept. UX: Controls for configuring cookie usage are most commonly displayed using a separate modal that is launched when the user responds to the initial cookie consent notice. However, if space permits, some sites will display these controls inline within the initial cookie consent notice. Granularity: The most common approach to cookie configurability is to allow users to opt in to cookies by cookie "category". Examples of common cookie categories include functional, targeting, and social media cookies. However, some sites will go a step further and allow users to opt in on a per-cookie basis. Alternatively, another way of providing users with more specific controls is to break down cookie categories like "advertising" into specific use cases, for example allowing users to separately opt in to "basic ads" and "personalized ads".

What is Federated Learning of Cohorts (FLoC)?

Summary # FLoC provides a privacy-preserving mechanism for interest-based ad selection. As a user moves around the web, their browser uses the FLoC algorithm to work out its "interest cohort", which will be the same for thousands of browsers with a similar recent browsing history. The browser recalculates its cohort periodically, on the user's device, without sharing individual browsing data with the browser vendor or anyone else. FLoC is now in origin trial in Chrome. Find out more: How to take part in the FLoC origin trial. During the current FLoC origin trial, a page visit will only be included in the browser's FLoC computation for one of two reasons: The FLoC API (document.interestCohort()) is used on the page. Chrome detects that the page loads ads or ads-related resources. For other clustering algorithms, the trial may experiment with different inclusion criteria: that's part of the origin trial experiment process. Advertisers (sites that pay for advertisements) can include code on their own websites in order to gather and provide cohort data to their adtech platforms (companies that provide software and tools to deliver advertising). For example, an adtech platform might learn from an online shoe store that browsers from cohorts 1101 and 1354 seem interested in the store's hiking gear. From other advertisers, the adtech platform learns about other interests of those cohorts. Subsequently, the ad platform can use this data to select relevant ads (such as an ad for hiking boots from the shoe store) when a browser from one of those cohorts requests a page from a site that displays ads, such as a news website. The Privacy Sandbox is a series of proposals to satisfy third-party use cases without third-party cookies or other tracking mechanisms. See Digging into the Privacy Sandbox for an overview of all the proposals. This proposal needs your feedback. If you have comments, please create an issue on the FLoC Explainer repository. 
If you have feedback on Chrome's experiment with this proposal, please post a reply on the Intent to Experiment. Why do we need FLoC? # Many businesses rely on advertising to drive traffic to their sites, and many publisher websites fund content by selling advertising inventory. People generally prefer to see ads that are relevant and useful to them, and relevant ads also bring more business to advertisers and more revenue to the websites that host them. In other words, ad space is more valuable when it displays relevant ads. Thus, selecting relevant ads increases revenue for ad-supported websites. That, in turn, means that relevant ads help fund content creation that benefits users. However, many people are concerned about the privacy implications of tailored advertising, which currently relies on techniques such as tracking cookies and device fingerprinting which are used to track individual browsing behavior. The FLoC proposal aims to allow more effective ad selection without compromising privacy. What can FLoC be used for? # Show ads to people whose browsers belong to a cohort that has been observed to frequently visit an advertiser's site or shows interest in relevant topics. Use machine learning models to predict the probability a user will convert based on their cohort, in order to inform ad auction bidding behavior. Recommend content to users. For example, suppose a news site observes that their sports podcast page has become especially popular with visitors from cohorts 1234 and 7. They can recommend that content to other visitors from those cohorts. How does FLoC work? # The example below describes the different roles in selecting an ad using FLoC. 
The advertiser (a company that pays for advertising) in this example is an online shoe retailer: shoestore.example The publisher (a site that sells ad space) in the example is a news site: dailynews.example The adtech platform (which provides software and tools to deliver advertising) is: adnetwork.example In this example we've called the users Yoshi and Alex. Initially their browsers both belong to the same cohort, 1354. We've called the users here Yoshi and Alex, but this is only for the purpose of the example. Names and individual identities are not revealed to the advertiser, publisher, or adtech platform with FLoC. Don't think of a cohort as a collection of people. Instead, think of a cohort as a grouping of browsing activity. 1. FLoC service # The FLoC service used by the browser creates a mathematical model with thousands of "cohorts", each of which will correspond to thousands of web browsers with similar recent browsing histories. More about how this works below. Each cohort is given a number. 2. Browser # From the FLoC service, Yoshi's browser gets data describing the FLoC model. Yoshi's browser works out its cohort by using the FLoC model's algorithm to calculate which cohort corresponds most closely to its own browsing history. In this example, that will be the cohort 1354. Note that Yoshi's browser does not share any data with the FLoC service. In the same way, Alex's browser calculates its cohort ID. Alex's browsing history is different from Yoshi's, but similar enough that their browsers both belong to cohort 1354. 3. Advertiser: shoestore.example # Yoshi visits shoestore.example. The site asks Yoshi's browser for its cohort: 1354. Yoshi looks at hiking boots. The site records that a browser from cohort 1354 showed interest in hiking boots. The site later records additional interest in its products from cohort 1354, as well as from other cohorts. 
The site periodically aggregates and shares information about cohorts and product interests with its adtech platform adnetwork.example. Now it's Alex's turn. 4. Publisher: dailynews.example # Alex visits dailynews.example. The site asks Alex's browser for its cohort. The site then makes a request for an ad to its adtech platform, adnetwork.example, including Alex's browser's cohort: 1354. 5. Adtech platform: adnetwork.example # adnetwork.example can select an ad suitable for Alex by combining the data it has from the publisher dailynews.example and the advertiser shoestore.example: Alex's browser's cohort (1354) provided by dailynews.example. Data about cohorts and product interests from shoestore.example: "Browsers from cohort 1354 might be interested in hiking boots." adnetwork.example selects an ad appropriate to Alex: an ad for hiking boots on shoestore.example. dailynews.example displays the ad 🥾. Current approaches for ad selection rely on techniques such as tracking cookies and device fingerprinting, which are used by third parties such as advertisers to track individual browsing behavior. With FLoC, the browser does not share its browsing history with the FLoC service or anyone else. The browser, on the user's device, works out which cohort it belongs to. The user's browsing history never leaves the device. Who runs the back-end service that creates the FLoC model? # Every browser vendor will need to make their own choice of how to group browsers into cohorts. Chrome is running its own FLoC service; other browsers might choose to implement FLoC with a different clustering approach, and would run their own service to do so. How does the FLoC service enable the browser to work out its cohort? # The FLoC service used by the browser creates a multi-dimensional mathematical representation of all potential web browsing histories. We'll call this model "cohort space". The service divides up this space into thousands of segments. 
Each segment represents a cluster of thousands of similar browsing histories. These groupings aren't based on knowing any actual browsing histories; they're simply based on picking random centers in "cohort space" or cutting up the space with random lines. Each segment is given a cohort number. The web browser gets this data describing "cohort space" from its FLoC service. As a user moves around the web, their browser uses an algorithm to periodically calculate the region in "cohort space" that corresponds most closely to its own browsing history. The FLoC service divides up "cohort space" into thousands of segments (only a few are shown here). At no point in this process is the user's browsing history shared with the FLoC service, or any third party. The browser's cohort is calculated by the browser, on the user's device. No user data is acquired or stored by the FLoC service. Can a browser's cohort change? # Yes! A browser's cohort definitely can change! You probably don't visit the same websites every week, and your browser's cohort will reflect that. A cohort represents a cluster of browsing activity, not a collection of people. The activity characteristics of a cohort are generally consistent over time, and cohorts are useful for ad selection because they group similar recent browsing behavior. Individual people's browsers will float in and out of a cohort as their browsing behavior changes. Initially, we expect the browser to recalculate its cohort every seven days. In the example above, both Yoshi and Alex's browser's cohort is 1354. In the future, Yoshi's browser and Alex's browser may move to a different cohort if their interests change. In the example below, Yoshi's browser moves to cohort 1101 and Alex's browser moves to cohort 1378. Other people's browsers will move into and out of cohorts as their browsing interests change. Yoshi's and Alex's browser cohort may change if their interests change. 
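The idea of cutting up "cohort space" with random lines, described above, can be made concrete with a small sketch. This is only an illustration under stated assumptions, not Chrome's actual FLoC algorithm: the helper names (`mulberry32`, `makeHyperplanes`, `cohortFor`), the three-dimensional "history vector", and the seed are all hypothetical. Each random line (a hyperplane in more dimensions) contributes one bit depending on which side of it the browser's vector falls, and the bits together name a segment.

```javascript
// Hypothetical illustration only: NOT Chrome's FLoC implementation.

// Small seeded PRNG so the "random lines" are reproducible for every browser
// that receives the same model data.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Build `count` random hyperplanes, each a vector of weights in [-1, 1).
function makeHyperplanes(count, dimensions, seed) {
  const random = mulberry32(seed);
  return Array.from({ length: count }, () =>
    Array.from({ length: dimensions }, () => random() * 2 - 1)
  );
}

// Compute a segment number locally: one bit per hyperplane, set when the
// history vector lies on the non-negative side of that hyperplane.
function cohortFor(historyVector, hyperplanes) {
  let id = 0;
  for (const plane of hyperplanes) {
    const dot = plane.reduce((sum, w, i) => sum + w * historyVector[i], 0);
    id = id * 2 + (dot >= 0 ? 1 : 0);
  }
  return id;
}

// Usage: 4 hyperplanes split the space into at most 16 segments.
const planes = makeHyperplanes(4, 3, 42);
console.log(cohortFor([1, 0, 2], planes));
```

Note that the calculation needs only the shared model data (here, the hyperplanes) and the local vector, which mirrors the point that the browsing history itself never has to leave the device.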
A cohort defines a grouping of browsing activity, not a group of people. Browsers will move in and out of a cohort as their activity changes. How does the browser work out its cohort? # As described above, the user's browser gets data from its FLoC service that describes the mathematical model for cohorts: a multi-dimensional space that represents the browsing activity of all users. The browser then uses an algorithm to work out which region of this "cohort space" (that is, which cohort) most closely matches its own recent browsing behavior. How does FLoC work out the right size of cohort? # There will be thousands of browsers in each cohort. A smaller cohort size might be more useful for personalizing ads, but is less likely to stop user tracking—and vice versa. A mechanism for assigning browsers to cohorts needs to make a trade off between privacy and utility. The Privacy Sandbox uses k-anonymity to allow a user to "hide in a crowd". A cohort is k-anonymous if it is shared by at least k users. The higher the k number, the more privacy-preserving the cohort. Can FLoC be used to group people based on sensitive categories? # The clustering algorithm used to construct the FLoC cohort model is designed to evaluate whether a cohort may be correlated with sensitive categories, without learning why a category is sensitive. Cohorts that might reveal sensitive categories such as race, sexuality, or medical history will be blocked. In other words, when working out its cohort, a browser will only be choosing between cohorts that won't reveal sensitive categories. Is FLoC just another way of categorizing people online? # With FLoC, a user's browser will belong to one of thousands of cohorts, along with thousands of other users' browsers. Unlike with third-party cookies and other targeting mechanisms, FLoC only reveals the cohort a user's browser is in, and not an individual user ID. It does not enable others to distinguish an individual within a cohort. 
In addition, the information about browsing activity that is used to work out a browser's cohort is kept local on the browser or device, and is not uploaded elsewhere. The browser may further leverage other anonymization methods, such as differential privacy. Do websites have to participate and share information? # Websites will have the ability to opt in or out of FLoC, so sites about sensitive topics will be able to prevent visits to their site from being included in the FLoC calculation. As additional protection, analysis by the FLoC service will evaluate whether a cohort may reveal sensitive information about users without learning why that cohort is sensitive. If a cohort might represent a greater-than-typical number of people who visit sites in a sensitive category, that entire cohort is removed. Negative financial status and mental health are among the sensitive categories covered by this analysis. Websites can exclude a page from the FLoC calculation by setting a Permissions-Policy header interest-cohort=() for that page. For pages that haven't been excluded, a page visit will be included in the browser's FLoC calculation if document.interestCohort() is used on the page. During the current FLoC origin trial, a page will also be included in the calculation if Chrome detects that the page loads ads or ads-related resources. (Ad Tagging in Chromium explains how Chrome's ad detection mechanism works.) Pages served from private IP addresses, such as intranet pages, won't be part of the FLoC computation. As a web developer how can I try out FLoC? 
# The FLoC API is very simple: just a single method that returns a promise that resolves to an object providing the cohort id and version: const { id, version } = await document.interestCohort(); console.log('FLoC ID:', id); console.log('FLoC version:', version); The cohort data made available looks like this: { id: "14159", version: "chrome.1.0" } The version value enables sites using FLoC to know which browser and which FLoC model the cohort ID refers to. As described below, the promise returned by document.interestCohort() will reject for any frame that is not allowed the interest-cohort permission. The FLoC API is available in Chrome 89 and above, but if you are not taking part in the origin trial, you will need to set flags and run Chrome from the command line. Run Chromium with flags explains how to do this for different operating systems. Start Chrome with the following flags: --enable-blink-features=InterestCohortAPI --enable-features="FederatedLearningOfCohorts:update_interval/10s/minimum_history_domain_size_required/1,FlocIdSortingLshBasedComputation,InterestCohortFeaturePolicy" Make sure third-party cookies are not blocked and that no ad blocker is running. View the demo at floc.glitch.me. How to take part in the FLoC origin trial explains how to try out FLoC in both first- and third-party contexts. How can websites opt out of the FLoC computation? # The interest-cohort permissions policy enables a site to declare that it does not want to be included in the user's list of sites for cohort calculation. The policy will be allow by default. The promise returned by document.interestCohort() will reject for any frame that is not allowed interest-cohort permission. If the main frame does not have the interest-cohort permission, then the page visit will not be included in the interest cohort calculation. 
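Because the API may be missing entirely (outside the origin trial or in other browsers) and the promise rejects when the interest-cohort permission is not allowed, calling code probably wants to guard both cases. The following is a minimal sketch; the wrapper name `getInterestCohort` and its `doc` parameter are assumptions made for illustration, and only `document.interestCohort()` itself comes from the API described above.

```javascript
// Hypothetical helper: resolves to { id, version }, or to null when FLoC is
// unavailable or the frame is not allowed the interest-cohort permission.
// `doc` defaults to the page's document; it is a parameter only so the
// function can be exercised outside a browser.
async function getInterestCohort(doc = globalThis.document) {
  // Feature detection: the method is only exposed where FLoC is enabled.
  if (!doc || typeof doc.interestCohort !== 'function') {
    return null;
  }
  try {
    const { id, version } = await doc.interestCohort();
    return { id, version };
  } catch (error) {
    // The promise rejects, for example, when the permission is not allowed.
    return null;
  }
}

// Usage in a page:
// getInterestCohort().then((cohort) => {
//   if (cohort) console.log('FLoC ID:', cohort.id, 'version:', cohort.version);
// });
```

Returning null in both failure cases keeps the calling code to a single check; a site that needs to tell "unsupported" apart from "not permitted" would have to surface the error instead.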
For example, a site can opt out of all FLoC cohort calculation by sending the following HTTP response header: Permissions-Policy: interest-cohort=() How can I make suggestions or provide feedback? # If you have comments on the API, please create an issue on the FLoC Explainer repository. Find out more # FLoC demo How to take part in the FLoC origin trial Digging in to the Privacy Sandbox FLoC Explainer Evaluation of cohort Algorithms for the FLoC API Photo by Rhys Kentish on Unsplash.

Lowe's website is among fastest performing e-commerce websites

This post was authored by Ashish Choudhury, Dinakar Chandolu, Abhimanyu Raibahadur, and Dhilipvenkatesh Uvarajan from Lowe's. Lowe's is a nearly $90B home improvement retailer that operates about 2,200 stores and employs more than 300,000 associates. By building an automated testing and monitoring system that prevents performance regressions from deploying to production, Lowe's Site Speed Team was able to improve its website performance, ranking among the top retail sites. Problem # The Site Speed Team's goal is to make the Lowe's site one of the fastest e-commerce sites in terms of page load performance. Before they built their automated testing and monitoring system, Lowe's website developers were unable to measure performance automatically in pre-production environments. Existing tools only conducted tests in the production environment. As a result, inferior builds slipped into production, creating a poor user experience. These inferior builds would remain in production until they were detected by the Site Speed Team and reverted by the author. Solution # The Site Speed Team used open source tools to build an automated performance testing and monitoring system for pre-production environments. The system measures the performance of every pull request (PR) and gates the PR from shipping to production if it does not meet the Site Speed Team's performance budget and metric criteria. The system also measures SEO and ADA compliance. Impact # From a sample of 1 team over 16 weeks deploying 102 builds, the automated performance testing and monitoring system prevented 32 builds with subpar performance from going into production. Where it used to take the Site Speed Team three to five days to inform developers that they had shipped performance regressions into production, the system now automatically informs developers of performance problems five minutes after submitting a pull request in a pre-production environment. 
Code quality is improving over time, as measured by the fact that fewer pull requests are being flagged for performance regressions. The Site Speed Team is also gradually tightening governance budgets to continuously improve site quality. In general, having clear ownership of problematic code has shifted the engineering culture. Instead of begrudging reactive corrections because it was never clear who actually introduced the problems, the team can make proactive optimizations with ownership of problematic code being objectively attributable. Implementation # The heart of the Site Speed Governance (SSG) app is Lighthouse CI. The SSG app uses Lighthouse to validate and audit the page performance of every pull request. The SSG app causes a build to fail if the Site Speed Team's defined performance budget and metric targets are not reached. It enforces not only load performance but also SEO, PWA, and accessibility. It can report status immediately to authors, reviewers, and SRE teams. It can also be configured to bypass the checks when exceptions are needed. Automated Speed Governance (ASG) process flow # Spinnaker # Start point. A developer merges their code into a pre-production environment. Deploy the pre-production environment with CDN assets. Check for the successful deployment. Run a Docker container to start building the ASG application or send a notification (in the event of deployment failure). Jenkins and Lighthouse # Build the ASG application with Jenkins. Run a custom Docker container that has Chrome and Lighthouse installed. Pull lighthouserc.json from the SSG app and run lhci autorun --collect-url=https://example.com. Jenkins and SSG app # Extract assertion-results.json from lhci and compare it to predefined budgets in budgets.json. Save the output as a text file and upload it to Nexus for future comparisons. Compare the current assertion-results.json to the last successful build (downloaded from Nexus) and save it as a text file. 
Build an HTML email with the success or failure information. Send the email to the relevant distribution lists with Jenkins.
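The budget comparison step in the flow above can be sketched roughly as follows. This is not Lowe's actual SSG code, and the data shapes are assumptions: `metrics` and `budgets` are hypothetical plain objects mapping a metric name to a number, not the real format of assertion-results.json or budgets.json.

```javascript
// Hypothetical sketch of a budget gate. Both arguments are assumed to be
// plain objects of the form { metricName: number }.
function checkBudgets(metrics, budgets) {
  const failures = [];
  for (const [name, limit] of Object.entries(budgets)) {
    const actual = metrics[name];
    if (actual === undefined) {
      // A budgeted metric that was never measured is treated as a failure.
      failures.push({ name, limit, actual: null, reason: 'missing' });
    } else if (actual > limit) {
      failures.push({ name, limit, actual, reason: 'over budget' });
    }
  }
  return { passed: failures.length === 0, failures };
}

// Usage with made-up numbers (milliseconds and bytes):
const result = checkBudgets(
  { 'largest-contentful-paint': 2900, 'total-byte-weight': 1500000 },
  { 'largest-contentful-paint': 2500, 'total-byte-weight': 1600000 }
);
console.log(result.passed ? 'Build may proceed' : 'Build blocked', result.failures);
```

In a CI job, a failing result would typically be turned into a non-zero exit code so that the pipeline stops the build, and the `failures` list gives the email step something concrete to report.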

Compat2021: Eliminating five top compatibility pain points on the web

Google is working with other browser vendors and industry partners to fix the top five browser compatibility pain points for web developers. The areas of focus are CSS Flexbox, CSS Grid, position: sticky, aspect-ratio, and CSS transforms. Check out How you can contribute and follow along to learn how to get involved. Background # Compatibility on the web has always been a big challenge for developers. In the last couple of years, Google and other partners, including Mozilla and Microsoft, have set out to learn more about the top pain points for web developers, to drive our work and prioritization to make the situation better. This project is connected to Google's Developer Satisfaction (DevSAT) work, and it started on a larger scale with the creation of the MDN DNA (Developer Needs Assessment) surveys in 2019 and 2020, and a deep-dive research effort presented in the MDN Browser Compatibility Report 2020. Additional research has been done in various channels, such as the State of CSS and State of JS surveys. The goal in 2021 is to eliminate browser compatibility problems in five key focus areas so developers can confidently build on them as reliable foundations. This effort is called #Compat2021. Choosing what to focus on # While there are browser compatibility issues in basically all of the web platform, the focus of this project is on a small number of the most problematic areas which can be made significantly better, thus removing them as top issues for developers. The compatibility project uses multiple criteria influencing which areas to prioritize, and some are: Feature usage. For example, Flexbox is used in 75% of all page views, and adoption is growing strongly in HTTP Archive. Number of bugs (in Chromium, Gecko, WebKit), and for Chromium, how many stars those bugs have. Survey results: MDN DNA surveys MDN Browser Compatibility Report State of CSS most known and used features Test results from web-platform-tests. For example, Flexbox on wpt.fyi. 
Can I use's most-searched-for features. The five top focus areas in 2021 # In 2020, Chromium started work addressing the top areas outlined in Improving Chromium's browser compatibility in 2020. In 2021, we are beginning a dedicated effort to go even further. Google and Microsoft are working together on addressing top issues in Chromium, along with Igalia. Igalia, who are regular contributors to Chromium and WebKit, and maintainers of the official WebKit port for embedded devices, have been very supportive and engaged in these compatibility efforts, and will be helping tackle and track the identified issues. Here are the areas we have committed to fixing in 2021. CSS Flexbox # CSS Flexbox is widely used on the web and there are still some major challenges for developers. For example, both Chromium and WebKit have had issues with auto-height flex containers leading to incorrectly sized images. Alireza Mahmoudi. Igalia's Flexbox Cats blog post dives deeper into these issues with many more examples. Why it is prioritized # Surveys: Top issue in MDN Browser Compatibility Report, most known and used in State of CSS Tests: 85% pass in all browsers Usage: 75% of page views, growing strongly in HTTP Archive CSS Grid # CSS Grid is a core building block for modern web layouts, replacing many older techniques and workarounds. As adoption is growing, it needs to be rock solid, so that differences between browsers are never a reason to avoid it. One area that's lacking is the ability to animate grid layouts, supported in Gecko but not Chromium or WebKit. When supported, effects like this are made possible: Animated chess demo by Chen Hui Jing. Why it is prioritized # Surveys: Runner-up in MDN Browser Compatibility Report, well known but less often used in State of CSS Tests: 75% pass in all browsers Usage: 8% and growing steadily, slight growth in HTTP Archive While a newer feature like subgrid is important for developers, it isn't a part of this specific effort.
To follow along, see Subgrid compat on MDN. CSS position: sticky # Sticky positioning allows content to stick to the edge of the viewport and is commonly used for headers that are always visible at the top of the viewport. While supported in all browsers, there are common use cases where it doesn't work as intended. For example, sticky table headers aren't supported in Chromium, and although now supported behind a flag, the results are inconsistent across browsers: Check out the sticky table headers demo by Rob Flack. Why it is prioritized # Surveys: Highly known/used in State of CSS and was brought up multiple times in MDN Browser Compatibility Report Tests: 66% pass in all browsers Usage: 8% CSS aspect-ratio property # The new aspect-ratio CSS property makes it easy to maintain a consistent width-to-height ratio for elements, removing the need for the well-known padding-top hack: Using padding-top .container { width: 100%; padding-top: 56.25%; } Using aspect-ratio .container { width: 100%; aspect-ratio: 16 / 9; } Because it is such a common use case this is expected to become widely used, and we want to make sure it's solid in all common scenarios and across browsers. Why it is prioritized # Surveys: Already well known but not yet widely used in State of CSS Tests: 27% pass in all browsers Usage: 3% and expected to grow CSS transforms # CSS transforms have been supported in all browsers for many years and are widely used on the web. However, there still remain many areas where they don't work the same across browsers, notably with animations and 3D transforms. For example, a card flip effect can be very inconsistent across browsers: Card flip effect in Chromium (left), Gecko (middle) and WebKit (right). Demo by David Baron from bug comment. 
Why it is prioritized # Surveys: Very well known and used in State of CSS Tests: 55% pass in all browsers Usage: 80% How you can contribute and follow along # Follow and share any updates we post on @ChromiumDev or the public mailing list, Compat 2021. Make sure bugs exist, or file them for issues you have been experiencing, and if there's anything missing, reach out through the above channels. There will be regular updates about the progress here on web.dev and you can also follow the progress for each focus area in the Compat 2021 Dashboard. We hope this concerted effort among browser vendors to improve reliability and interoperability will help you go build amazing things on the web!

Vodafone: A 31% improvement in LCP increased sales by 8%

Vodafone is a leading telecommunications company in Europe and Africa operating fixed and mobile networks in 21 countries and partnering with mobile networks in 48 more. By running an A/B test on a landing page (where version A was optimized for Web Vitals and had a 31% better LCP score in the field than version B), Vodafone determined that optimizing for Web Vitals generated 8% more sales. A 31% improvement in LCP led to: +8% increase in total sales, +15% uplift in the lead to visit rate, and +11% uplift in the cart to visit rate. Highlighting the opportunity # Vodafone knew that faster websites generally correlate to improved business metrics and were interested in optimizing their Web Vitals scores as a potential strategy for increasing sales, but they needed to determine exactly what kind of ROI they would get. Both versions were visually and functionally identical. The approach they used # A/B test # The traffic for the A/B test came from different paid media channels, including display, iOS/Android, search, and social. 50% of the traffic was sent to the optimized landing page (version A), and 50% was sent to the baseline page (version B). Version A and version B both got around 100K clicks and 34K visits per day. As mentioned before, the only difference between version A and version B was that version A was optimized for Web Vitals. There were no functional or visual differences between the two versions other than that. Vodafone used the PerformanceObserver API to measure LCP on real user sessions and sent the field data to their analytics provider.
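Measuring LCP in the field with PerformanceObserver, as described above, might look roughly like the following. This is a sketch, not Vodafone's actual code: the function name `observeLcp`, the `report` callback, and the injectable `Observer` parameter are assumptions for illustration. The `largest-contentful-paint` entry type and the `buffered` option are part of the browser API.

```javascript
// Hypothetical sketch: track the latest LCP candidate and report it once.
// `Observer` defaults to the global PerformanceObserver; it is a parameter
// only so the function can be exercised outside a browser.
function observeLcp(report, Observer = globalThis.PerformanceObserver) {
  if (typeof Observer !== 'function') return null;
  const supported = Observer.supportedEntryTypes || [];
  if (!supported.includes('largest-contentful-paint')) return null;

  let latest = null;
  const observer = new Observer((list) => {
    const entries = list.getEntries();
    if (entries.length > 0) {
      // Later entries supersede earlier candidates, so keep the last one.
      latest = entries[entries.length - 1].startTime;
    }
  });
  // `buffered: true` also delivers entries recorded before observation began.
  observer.observe({ type: 'largest-contentful-paint', buffered: true });

  return {
    // Call when the page is being hidden to send the final value.
    flush() {
      observer.disconnect();
      if (latest !== null) report(latest);
    },
  };
}
```

In a page, `flush()` would typically be called from a `visibilitychange` listener when the document becomes hidden, with `report` forwarding the value to whatever analytics endpoint the site uses.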
Optimizations # Vodafone made the following changes on the optimized page (version A): Moved the rendering logic for a widget from client-side to server-side, which resulted in less render-blocking JavaScript Server-side rendered critical HTML Optimized images, including resizing the hero image, optimizing SVG images, using media queries to avoid loading images that weren't yet visible in the viewport, and optimizing PNG images Overall business results # An 8% increase in sales The following table shows the values for DOMContentLoaded ("DCL") and LCP that Vodafone observed on version A ("Optimized Page") and version B ("Default Page"). Note that DCL actually increased 15%. The absolute values related to business metrics have been redacted. Davide Grossi, Head of Digital Marketing, Business Check out the Scale on web case studies page for more success stories.

Building a Settings component

In this post I want to share thinking on building a Settings component for the web that is responsive, supports multiple device inputs, and works across browsers. Try the demo. Demo If you prefer video, or want a UI/UX preview of what we're building, here's a shorter walkthrough on YouTube: Overview # I've broken out the aspects of this component into the following sections: Layouts Color Custom range input Custom checkbox input Accessibility considerations JavaScript Gotchas! The CSS snippets below assume PostCSS with PostCSS Preset Env. Intent is to practice early and often with syntax in early drafts or experimentally available in browsers. Or as the plugin likes to say, "Use tomorrow's CSS today". Layouts # This is the first GUI Challenge demo to be all CSS Grid! Here's each grid highlighted with the Chrome DevTools for grid: To highlight your grid layouts: Open Chrome DevTools with cmd+opt+i or ctrl+alt+i. Select the Layout tab next to the Styles tab. Under the Grid layouts section, check on all the layouts. Change the colors of all layouts. Just for gap # The most common layout: foo { display: grid; gap: var(--something); } I call this layout "just for gap" because it only uses grid to add gaps between blocks. Five layouts use this strategy, here's all of them displayed: The fieldset element, which contains each input group (.fieldset-item), is using gap: 1px to create the hairline borders between elements. No tricky border solution! Filled gap .grid { display: grid; gap: 1px; background: var(--bg-surface-1); & > .fieldset-item { background: var(--bg-surface-2); } } Border trick .grid { display: grid; & > .fieldset-item { background: var(--bg-surface-2); &:not(:last-child) { border-bottom: 1px solid var(--bg-surface-1); } } } Natural grid wrapping # The most complex layout ended up being the macro layout, the logical layout system between <main> and <form>. 
Centering wrapping content # Flexbox and grid both provide abilities to align-items or align-content, and when dealing with wrapping elements, content layout alignments will distribute space amongst the children as a group. main { display: grid; gap: var(--space-xl); place-content: center; } The main element is using place-content: center alignment shorthand so that the children are centered vertically and horizontally in both one and two column layouts. Watch in the above video how the "content" stays centered, even though wrapping has occurred. Repeat auto-fit minmax # The <form> uses an adaptive grid layout for each section. This layout switches from one to two columns based on available space. form { display: grid; gap: var(--space-xl) var(--space-xxl); grid-template-columns: repeat(auto-fit, minmax(min(10ch, 100%), 35ch)); align-items: flex-start; max-width: 89vw; } This grid has a different value for row-gap (--space-xl) than column-gap (--space-xxl) to put that custom touch on the responsive layout. When the columns stack, we want a large gap, but not as large as if we're on a wide screen. The grid-template-columns property uses 3 CSS functions: repeat(), minmax() and min(). Una Kravets has a great layout blog post about this, calling it RAM. There's 3 special additions in our layout, if you compare it to Una's: We pass an extra min() function. We specify align-items: flex-start. There's a max-width: 89vw style. The extra min() function is well described by Evan Minto on their blog in the post Intrinsically Responsive CSS Grid with minmax() and min(). I recommend giving that a read. The flex-start alignment correction is to remove the default stretching effect, so that the children of this layout don't need to have equal heights, they can have natural, intrinsic heights. The YouTube video has a quick breakdown of this alignment addition. max-width: 89vw is worth a small breakdown in this post. 
Let me show you the layout with and without the style applied: What's happening? When max-width is specified, it's providing context, explicit sizing or definite sizing for the auto-fit layout algorithm to know how many repetitions it can fit into the space. While it seems obvious that the space is "full width", per the CSS grid spec, a definite size or max-size must be provided. I've provided a max-size. So, why 89vw? Because "it worked" for my layout. A couple of other Chrome folks and I are investigating why a more reasonable value, like 100vw, isn't sufficient, and if this is in fact a bug. Spacing # A majority of the harmony of this layout is from a limited palette of spacing, 7 to be exact. :root { --space-xxs: .25rem; --space-xs: .5rem; --space-sm: 1rem; --space-md: 1.5rem; --space-lg: 2rem; --space-xl: 3rem; --space-xxl: 6rem; } Usage of these flows really nicely with grid, CSS @nest, and level 5 syntax of @media. Here's an example, the full set of <main> layout styles. main { display: grid; gap: var(--space-xl); place-content: center; padding: var(--space-sm); @media (width >= 540px) { & { padding: var(--space-lg); } } @media (width >= 800px) { & { padding: var(--space-xl); } } } A grid with centered content, moderately padded by default (like on mobile). But as more viewport space becomes available, it spreads out by increasing padding. 2021 CSS is looking pretty good! Remember the earlier layout, "just for gap"? Here's a more complete version of how they look in this component: header { display: grid; gap: var(--space-xxs); } section { display: grid; gap: var(--space-md); } Color # A controlled use of color helped this design stand out as expressive yet minimal. I do it like this: :root { --surface1: lch(10 0 0); --surface2: lch(15 0 0); --surface3: lch(20 0 0); --surface4: lch(25 0 0); --text1: lch(95 0 0); --text2: lch(75 0 0); } Key Term: The PostCSS lab() and lch() plugin is part of PostCSS Preset Env, and will output rgb() colors.
I name my surface and text colors with numbers as opposed to names like surface-dark and surface-darker because in a media query, I'll be flipping them, and light and dark won't be meaningful. I flip them in a preference media query like this: :root { ... @media (prefers-color-scheme: light) { & { --surface1: lch(90 0 0); --surface2: lch(100 0 0); --surface3: lch(98 0 0); --surface4: lch(85 0 0); --text1: lch(20 0 0); --text2: lch(40 0 0); } } } Key Term: The PostCSS @nest plugin is part of PostCSS Preset Env, and will expand selectors to a syntax browsers support today. It's important to get a quick glimpse at the overall picture and strategy before we dive into color syntax details. But, since I've gotten a bit ahead of myself, let me back up a bit. LCH? # Without getting too deep into color theory land, LCH is a human oriented syntax that caters to how we perceive color, not how we measure color with math (like 255). This gives it a distinct advantage as humans can write it more easily and other humans will be in tune with these adjustments. CSS Podcast For today, in this demo, let's focus on the syntax and the values I'm flipping to make light and dark. Let's look at 1 surface and 1 text color: :root { --surface1: lch(10 0 0); --text1: lch(95 0 0); @media (prefers-color-scheme: light) { & { --surface1: lch(90 0 0); --text1: lch(40 0 0); } } } --surface1: lch(10 0 0) translates to 10% lightness, 0 chroma and 0 hue: a very dark colorless gray. Then, in the media query for light mode, the lightness is flipped to 90% with --surface1: lch(90 0 0);. And that's the gist of the strategy. Start by just changing lightness between the 2 themes, maintaining the contrast ratios the design calls for or what can maintain accessibility. The bonus with lch() here is that lightness is human oriented, and we can feel good about a % change to it, that it will be perceptually and consistently that % different. hsl() for example is not as reliable.
There's more to learn about color spaces and lch() if you're interested. It's coming! CSS right now cannot access these colors at all. Let me repeat: We have no access to one third of the colors in most modern monitors. And these are not just any colors, but the most vivid colors the screen can display. Our websites are washed out because monitor hardware evolved faster than CSS specs and browser implementations. Lea Verou Adaptive form controls with color-scheme # Many browsers ship dark theme controls, currently Safari and Chromium, but you have to specify in CSS or HTML that your design uses them. The above is demonstrating the effect of the property from the Styles panel of DevTools. The demo uses the HTML tag, which in my opinion is generally a better location: <meta name="color-scheme" content="dark light"> Learn all about it in this color-scheme article by Thomas Steiner. There's a lot more to gain than dark checkbox inputs! CSS accent-color # There's been recent activity around accent-color on form elements, being a single CSS style that can change the tint color used in the browsers input element. Read more about it here on GitHub. I've included it in my styles for this component. As browsers support it, my checkboxes will be more on theme with the pink and purple color pops. input[type="checkbox"] { accent-color: var(--brand); } Color pops with fixed gradients and focus-within # Color pops most when it's used sparingly, and one of the ways I like to achieve that is through colorful UI interactions. There are many layers of UI feedback and interaction in the above video, which help give personality to the interaction by: Highlighting context. Providing UI feedback of "how full" the value is in the range. Providing UI feedback that a field is accepting input. 
To provide feedback when an element is being interacted with, CSS is using the :focus-within pseudo class to change the appearance of various elements, let's break down the .fieldset-item, it's super interesting: .fieldset-item { ... &:focus-within { background: var(--surface2); & svg { fill: white; } & picture { clip-path: circle(50%); background: var(--brand-bg-gradient) fixed; } } } When one of the children of this element has focus-within: The .fieldset-item background is assigned a higher contrast surface color. The nested svg is filled white for higher contrast. The nested <picture> clip-path expands to a full circle and the background is filled with the bright fixed gradient. Custom range # Given the following HTML input element, I'll show you how I customized its appearance: <input type="range"> There's 3 parts to this element we need to customize: Range element / container Track Thumb Range element styles # input[type="range"] { /* style setting variables */ --track-height: .5ex; --track-fill: 0%; --thumb-size: 3ex; --thumb-offset: -1.25ex; --thumb-highlight-size: 0px; appearance: none; /* clear styles, make way for mine */ display: block; inline-size: 100%; /* fill container */ margin: 1ex 0; /* ensure thumb isn't colliding with sibling content */ background: transparent; /* bg is in the track */ outline-offset: 5px; /* focus styles have space */ } The first few lines of CSS are the custom parts of the styles, and I hope that clearly labeling them helps. The rest of the styles are mostly reset styles, to provide a consistent foundation for building the tricky parts of the component. 
Track styles # input[type="range"]::-webkit-slider-runnable-track { appearance: none; /* clear styles, make way for mine */ block-size: var(--track-height); border-radius: 5ex; background: /* hard stop gradient: - half transparent (where colorful fill will be) - half dark track fill - 1st background image is on top */ linear-gradient( to right, transparent var(--track-fill), var(--surface1) 0% ), /* colorful fill effect, behind track surface fill */ var(--brand-bg-gradient) fixed; } The trick to this is "revealing" the vibrant fill color. This is done with the hard stop gradient on top. The gradient is transparent up to the fill percentage, and after that uses the unfilled track surface color. Behind that unfilled surface is a full width color, waiting for transparency to reveal it. Track fill style # My design does require JavaScript in order to maintain the fill style. There are CSS only strategies but they require the thumb element to be the same height as the track, and I wasn't able to find a harmony within those limits. /* grab sliders on page */ const sliders = document.querySelectorAll('input[type="range"]') /* take a slider element, return a percentage string for use in CSS */ const rangeToPercent = slider => { const max = slider.getAttribute('max') || 10; const percent = slider.value / max * 100; return `${parseInt(percent)}%`; }; /* on page load, set the fill amount */ sliders.forEach(slider => { slider.style.setProperty('--track-fill', rangeToPercent(slider)); /* when a slider changes, update the fill prop */ slider.addEventListener('input', e => { e.target.style.setProperty('--track-fill', rangeToPercent(e.target)); }) }) I think this makes for a nice visual upgrade. The slider works great without JavaScript, the --track-fill prop is not required, it simply will not have a fill style if not present. If JavaScript is available, populate the custom property while also observing any user changes, syncing the custom property with the value.
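The helper above divides by `max` only, so a slider with a non-zero `min` would report a skewed fill. As a hedged variation (not the demo's code), here is a minimal sketch of the same calculation that also accounts for `min`. The name `trackFillPercent` and the fallbacks of 0 and 100 (the HTML defaults for a range input) are assumptions made for illustration.

```javascript
// Hypothetical helper, not part of the original demo.
// Accepts anything exposing value/min/max: an <input type="range"> or a plain object.
function trackFillPercent({ value, min, max }) {
  // Fall back to the HTML defaults for range inputs when the attributes are absent.
  const lo = min === '' || min == null ? 0 : Number(min);
  const hi = max === '' || max == null ? 100 : Number(max);
  const ratio = (Number(value) - lo) / (hi - lo);
  // Guard against zero-width or inverted ranges and non-numeric values.
  if (!(hi > lo) || !Number.isFinite(ratio)) return '0%';
  // Clamp to [0, 1] so the gradient stop is always a valid percentage.
  const clamped = Math.min(1, Math.max(0, ratio));
  return `${Math.round(clamped * 100)}%`;
}
```

In the page it would slot into the same place as the original helper, for example `slider.style.setProperty('--track-fill', trackFillPercent(slider))`.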
Here's a great post on CSS-Tricks by Ana Tudor that demonstrates a CSS only solution for track fill. I also found this range element very inspiring. Thumb styles # input[type="range"]::-webkit-slider-thumb { appearance: none; /* clear styles, make way for mine */ cursor: ew-resize; /* cursor style to support drag direction */ border: 3px solid var(--surface3); block-size: var(--thumb-size); inline-size: var(--thumb-size); margin-top: var(--thumb-offset); border-radius: 50%; background: var(--brand-bg-gradient) fixed; } The majority of these styles are to make a nice circle. Again you see the fixed background gradient there that unifies the dynamic colors of the thumbs, tracks and associated SVG elements. I separated the styles for the interaction to help isolate the box-shadow technique being used for the hover highlight: @custom-media --motionOK (prefers-reduced-motion: no-preference); ::-webkit-slider-thumb { … /* shadow spread is initially 0 */ box-shadow: 0 0 0 var(--thumb-highlight-size) var(--thumb-highlight-color); /* if motion is OK, transition the box-shadow change */ @media (--motionOK) { & { transition: box-shadow .1s ease; } } /* on hover/active state of parent, increase size prop */ @nest input[type="range"]:is(:hover,:active) & { --thumb-highlight-size: 10px; } } Key Term: @custom-media is a Level 5 spec addition handled by PostCSS Custom Media, part of PostCSS Preset Env. The goal was an easy to manage and animated visual highlight for user feedback. By using a box shadow I can avoid triggering layout with the effect. I do this by creating a shadow that's not blurred and matches the circular shape of the thumb element. Then I change and transition its spread size on hover.
If only the highlight effect was so easy on checkboxes… Cross browser selectors # I found I needed these -webkit- and -moz- selectors to achieve cross browser consistency: input[type="range"] { &::-webkit-slider-runnable-track {} &::-moz-range-track {} &::-webkit-slider-thumb {} &::-moz-range-thumb {} } Gotchas! Josh Comeau outlines why the above examples don't simply use a comma between selectors for cross browser styling, see the Twitter thread for more information. Custom Checkbox # Given the following HTML input element, I'll show you how I customized its appearance: <input type="checkbox"> There's 3 parts to this element we need to customize: Checkbox element Associated labels Highlight effect Checkbox element # input[type="checkbox"] { inline-size: var(--space-sm); /* increase width */ block-size: var(--space-sm); /* increase height */ outline-offset: 5px; /* focus style enhancement */ accent-color: var(--brand); /* tint the input */ position: relative; /* prepare for an absolute pseudo element */ transform-style: preserve-3d; /* create a 3d z-space stacking context */ margin: 0; cursor: pointer; } The transform-style and position styles prepare for the pseudo-element we will introduce later to style the highlight. Otherwise, it's mostly minor opinionated style stuff from me. I like the cursor to be pointer, I like outline offsets, default checkboxes are too tiny, and if accent-color is supported, bring these checkboxes into the brand color scheme. Checkbox labels # It's important to provide labels for checkboxes for 2 reasons. The first is to represent what the checkbox value is used for, to answer "on or off for what?" Second is for UX, web users have become accustomed to interacting with checkboxes via their associated labels. 
input <input type="checkbox" id="text-notifications" name="text-notifications" > label <label for="text-notifications"> <h3>Text Messages</h3> <small>Get notified about all text messages sent to your device</small> </label> On your label, put a for attribute that points to a checkbox by ID: <label for="text-notifications">. On your checkbox, double up both the name and id to ensure it's found with varying tools and tech, like a mouse or screenreader: <input type="checkbox" id="text-notifications" name="text-notifications">. :hover, :active and more come for free with the connection, increasing the ways your form can be interacted with. Checkbox highlight # I want to keep my interfaces consistent, and the slider element has a nice thumb highlight that I'd like to use with the checkbox. The thumb was able to use box-shadow and its spread property to scale a shadow up and down. However, that effect doesn't work here because our checkboxes are, and should be, square. I was able to achieve the same visual effect with a pseudo element, and an unfortunate amount of tricky CSS: @custom-media --motionOK (prefers-reduced-motion: no-preference); input[type="checkbox"]::before { --thumb-scale: .01; /* initial scale of highlight */ --thumb-highlight-size: var(--space-xl); content: ""; inline-size: var(--thumb-highlight-size); block-size: var(--thumb-highlight-size); clip-path: circle(50%); /* circle shape */ position: absolute; /* this is why position relative on parent */ top: 50%; /* pop and plop technique (https://web.dev/centering-in-css/#5.-pop-and-plop) */ left: 50%; background: var(--thumb-highlight-color); transform-origin: center center; /* goal is a centered scaling circle */ transform: /* order here matters!!
*/ translateX(-50%) /* counter balances left: 50% */ translateY(-50%) /* counter balances top: 50% */ translateZ(-1px) /* PUTS IT BEHIND THE CHECKBOX */ scale(var(--thumb-scale)) /* value we toggle for animation */ ; will-change: transform; @media (--motionOK) { /* transition only if motion is OK */ & { transition: transform .2s ease; } } } /* on hover, set scale custom property to "in" state */ input[type="checkbox"]:hover::before { --thumb-scale: 1; } Creating a circle pseudo-element is straightforward work, but placing it behind the element it's attached to was harder. Here's before and after I fixed it: It's definitely a micro interaction, but important to me to keep the visual consistency. The animation scaling technique is the same as we've been using in other places. We set a custom property to a new value and let CSS transition it based on motion preferences. The key feature here is translateZ(-1px). The parent created a 3D space and this pseudo-element child tapped into it by placing itself slightly back in z-space. Accessibility # The YouTube video does a great demonstration of the mouse, keyboard and screenreader interactions for this settings component. I'll call out some of the details here. HTML Element Choices # <form> <header> <fieldset> <picture> <label> <input> Each of these holds hints and tips to the user's browsing tool. Some elements provide interaction hints, some connect interactivity, and some help shape the accessibility tree that a screenreader navigates. HTML Attributes # We can hide elements that are not needed by screenreaders, in this case the icon next to the slider: <picture aria-hidden="true"> The above video demonstrates the screenreader flow on Mac OS. Notice how input focus moves straight from one slider to the next. This is because we've hidden the icon that may have been a stop on the way to the next slider. Without this attribute, a user would need to stop, listen and move past the picture which they may not be able to see.
Gotchas! Be sure to test screenreader interactions across browsers. The original demo included <label> in the list of elements with aria-hidden="true", but it has since been removed after a Twitter conversation revealed cross browser differences. The SVG is a bunch of math, let's add a <title> element for a free mouse hover title and a human readable comment about what the math is creating: <svg viewBox="0 0 24 24"> <title>A note icon</title> <path d="M12 3v10.55c-.59-.34-1.27-.55-2-.55-2.21 0-4 1.79-4 4s1.79 4 4 4 4-1.79 4-4V7h4V3h-6z"/> </svg> Other than that, we've used enough clearly marked HTML that the form tests really well across mouse, keyboard, video game controllers and screenreaders. JavaScript # I've already covered how the track fill color was being managed from JavaScript, so let's look at the <form> related JavaScript now: const form = document.querySelector('form'); form.addEventListener('input', event => { const formData = Object.fromEntries(new FormData(form)); console.table(formData); }) Every time the form is interacted with and changed, the console logs the form as an object into a table for easy review before submitting to a server. Conclusion # Now that you know how I did it, how would you?! This makes for some fun component architecture! Who's going to make the 1st version with slots in their favorite framework? 🙂 Let's diversify our approaches and learn all the ways to build on the web. Create a demo, tweet me links, and I'll add it to the Community remixes section below! Community remixes # @tomayac with their style regarding the hover area for the checkbox labels! This version has no hover gap between elements: demo and source.
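One caveat about the `Object.fromEntries(new FormData(form))` pattern from the JavaScript section: when several controls share a `name`, later entries overwrite earlier ones. Here is a minimal sketch of an alternative, assuming you would rather collect repeated names into arrays. `entriesToObject` is a hypothetical helper and not part of the demo.

```javascript
// Hypothetical helper: fold [name, value] pairs into a plain object,
// collecting repeated names into arrays instead of overwriting them.
function entriesToObject(entries) {
  const result = {};
  for (const [name, value] of entries) {
    if (!Object.prototype.hasOwnProperty.call(result, name)) {
      result[name] = value; // first occurrence: keep the bare value
    } else if (Array.isArray(result[name])) {
      result[name].push(value); // third or later occurrence
    } else {
      result[name] = [result[name], value]; // second occurrence: upgrade to an array
    }
  }
  return result;
}
```

Because a `FormData` object iterates as `[name, value]` pairs, it can be passed in directly, for example `console.table(entriesToObject(new FormData(form)))`.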

Mitigate cross-site scripting (XSS) with a strict Content Security Policy (CSP)

Why should you deploy a strict Content Security Policy (CSP)? # Cross-site scripting (XSS)—the ability to inject malicious scripts into a web application—has been one of the biggest web security vulnerabilities for over a decade. Content Security Policy (CSP) is an added layer of security that helps to mitigate XSS. Configuring a CSP involves adding the Content-Security-Policy HTTP header to a web page and setting values to control what resources the user agent is allowed to load for that page. This article explains how to use a CSP based on nonces or hashes to mitigate XSS instead of the commonly used host-allowlist-based CSPs which often leave the page exposed to XSS as they can be bypassed in most configurations. Key Term: A nonce is a random number used only once that can be used to mark a <script> tag as trusted. Key Term: A hash function is a mathematical function that converts an input value into a compressed numerical value—a hash. A hash (such as SHA-256) can be used to mark an inline <script> tag as trusted. A Content Security Policy based on nonces or hashes is often called a strict CSP. When an application uses a strict CSP, attackers who find HTML injection flaws will generally not be able to use them to force the browser to execute malicious scripts in the context of the vulnerable document. This is because strict CSP only permits hashed scripts or scripts with the correct nonce value generated on the server, so attackers cannot execute the script without knowing the correct nonce for a given response. To protect your site from XSS, make sure to sanitize user input and use CSP as an extra security layer. CSP is a defense-in-depth technique that can prevent the execution of malicious scripts, but it's not a substitute for avoiding (and promptly fixing) XSS bugs. Why a strict CSP is recommended over allowlist CSPs # If your site already has a CSP that looks like this: script-src www.googleapis.com, it may not be effective against cross-site scripting! 
This type of CSP is called an allowlist CSP and it has a couple of downsides: It requires a lot of customization. It can be bypassed in most configurations. This makes allowlist CSPs generally ineffective at preventing attackers from exploiting XSS. That's why it's recommended to use a strict CSP based on cryptographic nonces or hashes, which avoids the pitfalls outlined above. Allowlist CSP Doesn't effectively protect your site. ❌ Must be highly customized. 😓 Strict CSP Effectively protects your site. ✅ Always has the same structure. 😌 What is a strict Content Security Policy? # A strict Content Security Policy has the following structure and is enabled by setting one of the following HTTP response headers: Nonce-based strict CSP Content-Security-Policy: script-src 'nonce-{RANDOM}' 'strict-dynamic'; object-src 'none'; base-uri 'none'; Hash-based strict CSP Content-Security-Policy: script-src 'sha256-{HASHED_INLINE_SCRIPT}' 'strict-dynamic'; object-src 'none'; base-uri 'none'; This is the most stripped-down version of a strict CSP. You'll need to tweak it to make it effective across browsers. See add a fallback to support Safari and older browsers for details. The following properties make a CSP like the one above "strict" and hence secure: Uses nonces 'nonce-{RANDOM}' or hashes 'sha256-{HASHED_INLINE_SCRIPT}' to indicate which <script> tags are trusted by the site's developer and should be allowed to execute in the user's browser. Sets 'strict-dynamic' to reduce the effort of deploying a nonce- or hash-based CSP by automatically allowing the execution of scripts that are created by an already trusted script. This also unblocks the use of most third party JavaScript libraries and widgets. Not based on URL allowlists and therefore doesn't suffer from common CSP bypasses. Blocks untrusted inline scripts like inline event handlers or javascript: URIs. Restricts object-src to disable dangerous plugins such as Flash. 
Restricts base-uri to block the injection of <base> tags. This prevents attackers from changing the locations of scripts loaded from relative URLs. Another advantage of a strict CSP is that the CSP always has the same structure and doesn't need to be customized for your application. Adopting a strict CSP # To adopt a strict CSP, you need to: Decide if your application should set a nonce- or hash-based CSP. Copy the CSP from the What is a strict Content Security Policy section and set it as a response header across your application. Refactor HTML templates and client-side code to remove patterns that are incompatible with CSP. Add fallbacks to support Safari and older browsers. Deploy your CSP. You can use Lighthouse (v7.3.0 and above with flag --preset=experimental) Best Practices audit throughout this process to check whether your site has a CSP, and whether it's strict enough to be effective against XSS. Step 1: Decide if you need a nonce- or hash-based CSP # There are two types of strict CSPs, nonce- and hash-based. Here's how they work: Nonce-based CSP: You generate a random number at runtime, include it in your CSP, and associate it with every script tag in your page. An attacker can't include and run a malicious script in your page, because they would need to guess the correct random number for that script. This only works if the number is not guessable and newly generated at runtime for every response. Hash-based CSP: The hash of every inline script tag is added to the CSP. Note that each script has a different hash. An attacker can't include and run a malicious script in your page, because the hash of that script would need to be present in your CSP. Criteria for choosing a strict CSP approach: Criteria for choosing a strict CSP approach Nonce-based CSP For HTML pages rendered on the server where you can create a new random token (nonce) for every response. Hash-based CSP For HTML pages served statically or those that need to be cached. 
For example, single-page web applications built with frameworks such as Angular, React or others, that are statically served without server-side rendering. Step 2: Set a strict CSP and prepare your scripts # When setting a CSP, you have a few options: Report-only mode (Content-Security-Policy-Report-Only) or enforcement mode (Content-Security-Policy). In report-only, the CSP won't block resources yet—nothing will break—but you'll be able to see errors and receive reports for what would have been blocked. Locally, when you're in the process of setting a CSP, this doesn't really matter, because both modes will show you the errors in the browser console. If anything, enforcement mode will make it even easier for you to see blocked resources and tweak your CSP, since your page will look broken. Report-only mode becomes most useful later in the process (see Step 5). Header or HTML <meta> tag. For local development, a <meta> tag may be more convenient for tweaking your CSP and quickly seeing how it affects your site. However: Later on, when deploying your CSP in production, it is recommended to set it as an HTTP header. If you want to set your CSP in report-only mode, you'll need to set it as a header—CSP meta tags don't support report-only mode. Set the following Content-Security-Policy HTTP response header in your application: Content-Security-Policy: script-src 'nonce-{RANDOM}' 'strict-dynamic'; object-src 'none'; base-uri 'none'; Caution: Replace the {RANDOM} placeholder with a random nonce that is regenerated on every server response. Generate a nonce for CSP # A nonce is a random number used only once per page load. A nonce-based CSP can only mitigate XSS if the nonce value is not guessable by an attacker. 
A nonce for CSP needs to be: A cryptographically strong random value (ideally 128+ bits in length) Newly generated for every response Base64 encoded Here are some examples on how to add a CSP nonce in server-side frameworks: Django (python) Express (JavaScript): const express = require('express'); const crypto = require('crypto'); const app = express(); app.get('/', function(request, response) { // Generate a new random nonce value for every response. const nonce = crypto.randomBytes(16).toString("base64"); // Set the strict nonce-based CSP response header const csp = `script-src 'nonce-${nonce}' 'strict-dynamic' https:; object-src 'none'; base-uri 'none';`; response.set("Content-Security-Policy", csp); // Every <script> tag in your application should set the `nonce` attribute to this value. // `template` stands for the name of your view. response.render(template, { nonce: nonce }); }); Add a nonce attribute to <script> elements # With a nonce-based CSP, every <script> element must have a nonce attribute which matches the random nonce value specified in the CSP header (all scripts can have the same nonce). The first step is to add these attributes to all scripts: Blocked by CSP <script src="/path/to/script.js"></script> <script>foo()</script> CSP will block these scripts, because they don't have nonce attributes. Allowed by CSP <script nonce="${NONCE}" src="/path/to/script.js"></script> <script nonce="${NONCE}">foo()</script> CSP will allow the execution of these scripts if ${NONCE} is replaced with a value matching the nonce in the CSP response header. Note that some browsers will hide the nonce attribute when inspecting the page source. Gotchas! With 'strict-dynamic' in your CSP, you'll only have to add nonces to <script> tags that are present in the initial HTML response. 'strict-dynamic' allows the execution of scripts dynamically added to the page, as long as they were loaded by a safe, already-trusted script (see the specification).
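The nonce requirements above (at least 128 bits of randomness, regenerated for every response, base64-encoded) can also be sketched without a framework. This is a minimal illustration using Node's built-in `crypto` module. The helper names `generateNonce`, `strictCspHeader`, and `scriptTag` are hypothetical, and the sketch assumes script URLs are trusted, developer-supplied values.

```javascript
const crypto = require('crypto');

// 16 random bytes = 128 bits, base64-encoded (24 characters).
function generateNonce() {
  return crypto.randomBytes(16).toString('base64');
}

// Build the nonce-based strict CSP shown earlier for a given nonce.
function strictCspHeader(nonce) {
  return `script-src 'nonce-${nonce}' 'strict-dynamic'; object-src 'none'; base-uri 'none';`;
}

// Render a sourced <script> tag carrying the nonce.
// Assumption: `src` is a trusted constant, so no escaping is performed here.
function scriptTag(nonce, src) {
  return `<script nonce="${nonce}" src="${src}"></script>`;
}

// Per response: one fresh nonce, used in both the header and the markup.
const nonce = generateNonce();
const header = strictCspHeader(nonce);
const tag = scriptTag(nonce, '/path/to/script.js');
```

Keeping nonce generation in one function makes it harder to accidentally reuse a value across responses, which would defeat the policy.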
For a hash-based CSP, set the following Content-Security-Policy HTTP response header in your application: Content-Security-Policy: script-src 'sha256-{HASHED_INLINE_SCRIPT}' 'strict-dynamic'; object-src 'none'; base-uri 'none'; For several inline scripts, the syntax is as follows: 'sha256-{HASHED_INLINE_SCRIPT_1}' 'sha256-{HASHED_INLINE_SCRIPT_2}'. Caution: The {HASHED_INLINE_SCRIPT} placeholder must be replaced with a base64-encoded SHA-256 hash of an inline script that can be used to load other scripts (see next section). You can calculate SHA hashes of static inline <script> blocks with this tool. An alternative is to inspect the CSP violation warnings in Chrome's developer console, which contain hashes of blocked scripts, and add these hashes to the policy as 'sha256-…'. A script injected by an attacker will be blocked by the browser as only the hashed inline script and any scripts dynamically added by it will be allowed to execute by the browser. Load sourced scripts dynamically # All scripts that are externally sourced need to be loaded dynamically via an inline script, because CSP hashes are supported across browsers only for inline scripts (hashes for sourced scripts are not well-supported across browsers). Blocked by CSP <script src="https://example.org/foo.js"></script> <script src="https://example.org/bar.js"></script> CSP will block these scripts since only inline-scripts can be hashed. Allowed by CSP <script> var scripts = [ 'https://example.org/foo.js', 'https://example.org/bar.js']; scripts.forEach(function(scriptUrl) { var s = document.createElement('script'); s.src = scriptUrl; s.async = false; // to preserve execution order document.head.appendChild(s); }); </script> To allow execution of this script, the hash of the inline script must be calculated and added to the CSP response header, replacing the {HASHED_INLINE_SCRIPT} placeholder. To reduce the number of hashes, you can optionally merge all inline scripts into a single script.
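Computing the hash described above takes only a few lines with Node's built-in `crypto` module. This is a minimal sketch, assuming the exact text of each inline script (everything between the opening and closing `<script>` tags, whitespace included) is available as a string at build time. `cspHashSource` and `hashBasedCsp` are hypothetical helper names.

```javascript
const crypto = require('crypto');

// Return the CSP source expression for one inline script: 'sha256-<base64 digest>'.
function cspHashSource(inlineScriptText) {
  const digest = crypto
    .createHash('sha256')
    .update(inlineScriptText, 'utf8')
    .digest('base64');
  return `'sha256-${digest}'`;
}

// Build a hash-based strict CSP covering one or more inline scripts.
function hashBasedCsp(inlineScripts) {
  const hashes = inlineScripts.map(cspHashSource).join(' ');
  return `script-src ${hashes} 'strict-dynamic'; object-src 'none'; base-uri 'none';`;
}
```

Because the digest covers every character, regenerating the header as part of the build (rather than pasting hashes by hand) avoids policies that silently go stale when a loader script is reformatted.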
To see this in action, check out the example and examine the code. Gotchas! When calculating a CSP hash for inline scripts, whitespace characters between the opening and closing <script> tags matter. You can calculate CSP hashes for inline scripts using this tool. async = false and script loading: async = false isn't blocking in this case, but use this with care. In the code snippet above, s.async = false is added to ensure that foo executes before bar (even if bar loads first). In this snippet, s.async = false does not block the parser while the scripts load; that's because the scripts are added dynamically. The parser will only stop as the scripts are being executed, just like it would behave for async scripts. However, with this snippet, keep in mind: One/both scripts may execute before the document has finished downloading. If you want the document to be ready by the time the scripts execute, you need to wait for the DOMContentLoaded event before you append the scripts. If this causes a performance issue (because the scripts don't start downloading early enough), you can use preload tags earlier in the page. defer = true won't do anything. If you need that behaviour, you'll have to manually run the script at the time you want to run it. Step 3: Refactor HTML templates and client-side code to remove patterns incompatible with CSP # Inline event handlers (such as onclick="…", onerror="…") and JavaScript URIs (<a href="javascript:…">) can be used to run scripts. This means that an attacker who finds an XSS bug could inject this kind of HTML and execute malicious JavaScript. A nonce- or hash-based CSP disallows the use of such markup. If your site makes use of any of the patterns described above, you'll need to refactor them into safer alternatives. If you enabled CSP in the previous step, you'll be able to see CSP violations in the console every time CSP blocks an incompatible pattern.
In most cases, the fix is straightforward:

To refactor inline event handlers, rewrite them to be added from a JavaScript block #

Blocked by CSP

    <span onclick="doThings();">A thing.</span>

CSP will block inline event handlers.

Allowed by CSP

    <span id="things">A thing.</span>
    <script nonce="${nonce}">
      document.getElementById('things')
          .addEventListener('click', doThings);
    </script>

CSP will allow event handlers that are registered via JavaScript.

For javascript: URIs, you can use a similar pattern #

Blocked by CSP

    <a href="javascript:linkClicked()">foo</a>

CSP will block javascript: URIs.

Allowed by CSP

    <a id="foo">foo</a>
    <script nonce="${nonce}">
      document.getElementById('foo')
          .addEventListener('click', linkClicked);
    </script>

CSP will allow event handlers that are registered via JavaScript.

Use of eval() in JavaScript #

If your application uses eval() to convert JSON string serializations into JS objects, you should refactor such instances to JSON.parse(), which is also faster. If you cannot remove all uses of eval(), you can still set a strict nonce-based CSP, but you will have to use the 'unsafe-eval' CSP keyword, which will make your policy slightly less secure.

You can find these and more examples of such refactoring in this strict CSP Codelab.

Step 4: Add fallbacks to support Safari and older browsers #

CSP is supported by all major browsers, but you'll need two fallbacks:

Using 'strict-dynamic' requires adding https: as a fallback for Safari, the only major browser without support for 'strict-dynamic'. By doing so: All browsers that support 'strict-dynamic' will ignore the https: fallback, so this won't reduce the strength of the policy. In Safari, externally sourced scripts will be allowed to load only if they come from an HTTPS origin. This is less secure than a strict CSP (it's a fallback), but it would still prevent certain common XSS causes like injections of javascript: URIs, because 'unsafe-inline' is not present or is ignored in the presence of a hash or nonce.
To ensure compatibility with very old browser versions (4+ years), you can add 'unsafe-inline' as a fallback. All recent browsers will ignore 'unsafe-inline' if a CSP nonce or hash is present.

    Content-Security-Policy: script-src 'nonce-{random}' 'strict-dynamic' https: 'unsafe-inline'; object-src 'none'; base-uri 'none';

https: and unsafe-inline don't make your policy less safe because they will be ignored by browsers which support strict-dynamic.

Step 5: Deploy your CSP #

After confirming that no legitimate scripts are being blocked by CSP in your local development environment, you can proceed with deploying your CSP to your (staging, then) production environment:

(Optional) Deploy your CSP in report-only mode using the Content-Security-Policy-Report-Only header. Learn more about the Reporting API. Report-only mode is handy to test a potentially breaking change like a new CSP in production, before actually enforcing CSP restrictions. In report-only mode, your CSP does not affect the behavior of your application (nothing will actually break). But the browser will still generate console errors and violation reports when patterns incompatible with CSP are encountered (so you can see what would have broken for your end-users).

Once you're confident that your CSP won't induce breakage for your end-users, deploy your CSP using the Content-Security-Policy response header. Only once you've completed this step will CSP begin to protect your application from XSS. Setting your CSP via an HTTP header server-side is more secure than setting it as a <meta> tag; use a header if you can.

Gotchas! Make sure that the CSP you're using is "strict" by checking it with the CSP Evaluator or Lighthouse. This is very important, as even small changes to a policy can significantly reduce its security.

Caution: When enabling CSP for production traffic, you may see some noise in the CSP violation reports due to browser extensions and malware.
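To make the deployment steps more concrete, here is a minimal Node.js sketch that generates a fresh nonce per response, builds the policy with the https: and 'unsafe-inline' fallbacks, and switches between report-only and enforcing mode. The buildCspHeader and handleRequest names, the reportOnly option, and the port are assumptions made for illustration; a real application would normally set the header through its web framework.

```javascript
// Node.js sketch: per-response nonce and a switch between report-only and enforcement.
const crypto = require('crypto');
const http = require('http');

// buildCspHeader is a hypothetical helper. It returns the header name, the
// header value, and the nonce to place on every trusted <script> element.
function buildCspHeader({ reportOnly }) {
  // A new, unpredictable nonce must be generated for every response.
  const nonce = crypto.randomBytes(16).toString('base64');
  const name = reportOnly
    ? 'Content-Security-Policy-Report-Only'
    : 'Content-Security-Policy';
  const value =
    `script-src 'nonce-${nonce}' 'strict-dynamic' https: 'unsafe-inline'; ` +
    `object-src 'none'; base-uri 'none';`;
  return { name, value, nonce };
}

// Hypothetical request handler: starts in report-only mode so nothing breaks.
function handleRequest(req, res) {
  const csp = buildCspHeader({ reportOnly: true });
  res.setHeader(csp.name, csp.value);
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.end(
    `<!doctype html><script nonce="${csp.nonce}">console.log('allowed');</script>`
  );
}

// To try it locally (port chosen arbitrarily):
// http.createServer(handleRequest).listen(8080);
```

Once the violation reports look clean, changing reportOnly to false is the only edit needed to move from observing to enforcing.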
Limitations # Generally speaking, a strict CSP provides a strong added layer of security that helps to mitigate XSS. In most cases, CSP reduces the attack surface significantly (dangerous patterns like javascript: URIs are completely turned off). However, based on the type of CSP you're using (nonces, hashes, with or without 'strict-dynamic'), there are cases where CSP doesn't protect: If you nonce a script, but there's an injection directly into the body or into the src parameter of that <script> element. If there are injections into the locations of dynamically created scripts (document.createElement('script')), including into any library functions which create script DOM nodes based on the value of their arguments. This includes some common APIs such as jQuery's .html(), as well as .get() and .post() in jQuery < 3.0. If there are template injections in old AngularJS applications. An attacker who can inject an AngularJS template can use it to execute arbitrary JavaScript. If the policy contains 'unsafe-eval', injections into eval(), setTimeout() and a few other rarely used APIs. Developers and security engineers should pay particular attention to such patterns during code reviews and security audits. You can find more details on the cases described above in this CSP presentation. Trusted Types complements strict CSP very well and can efficiently protect against some of the limitations listed above. Learn more about how to use Trusted Types at web.dev. Further reading # CSP Is Dead, Long Live CSP! On the Insecurity of Whitelists and the Future of Content Security Policy CSP Evaluator LocoMoco Conference: Content Security Policy - A successful mess between hardening and mitigation Google I/O talk: Securing Web Apps with Modern Platform Features

Debug layout shifts

The first part of this article discusses tooling for debugging layout shifts, while the second part discusses the thought process to use when identifying the cause of a layout shift. Tooling # Layout Instability API # The Layout Instability API is the browser mechanism for measuring and reporting layout shifts. All tools for debugging layout shifts, including DevTools, are ultimately built upon the Layout Instability API. However, using the Layout Instability API directly is a powerful debugging tool due to its flexibility. The Layout Instability API is only supported by Chromium browsers. At the current time there is no way to measure or debug layout shifts in non-Chromium browsers. Usage # The same code snippet that measures Cumulative Layout Shift (CLS) can also serve to debug layout shifts. The snippet below logs information about layout shifts to the console. Inspecting this log will provide you information about when, where, and how a layout shift occurred. let cls = 0; new PerformanceObserver((entryList) => { for (const entry of entryList.getEntries()) { if (!entry.hadRecentInput) { cls += entry.value; console.log('Current CLS value:', cls, entry); } } }).observe({type: 'layout-shift', buffered: true}); When running this script be aware that: The buffered: true option indicates that the PerformanceObserver should check the browser's performance entry buffer for performance entries that were created before the observer's initialization. As a result, the PerformanceObserver will report layout shifts that happened both before and after it was initialized. Keep this in mind when inspecting the console logs. An initial glut of layout shifts can reflect a reporting backlog, rather than the sudden occurrence of numerous layout shifts. To avoid impacting performance, the PerformanceObserver waits until the main thread is idle to report on layout shifts. 
As a result, depending on how busy the main thread is, there may be a slight delay between when a layout shift occurs and when it is logged in the console.

This script ignores layout shifts that occurred within 500 ms of user input and therefore do not count towards CLS.

Information about layout shifts is reported using a combination of two APIs: the LayoutShift and LayoutShiftAttribution interfaces. Each of these interfaces is explained in more detail in the following sections.

LayoutShift #

Each layout shift is reported using the LayoutShift interface. The contents of an entry look like this:

    duration: 0
    entryType: "layout-shift"
    hadRecentInput: false
    lastInputTime: 0
    name: ""
    sources: (3) [LayoutShiftAttribution, LayoutShiftAttribution, LayoutShiftAttribution]
    startTime: 11317.934999999125
    value: 0.17508567530168798

The entry above indicates a layout shift during which three DOM elements changed position. The layout shift score of this particular layout shift was 0.175.

These are the properties of a LayoutShift instance that are most relevant to debugging layout shifts:

Property: Description

sources: The sources property lists the DOM elements that moved during the layout shift. This array can contain up to five sources. In the event that there are more than five elements impacted by the layout shift, the five largest (as measured by impact on layout stability) sources of layout shift are reported. This information is reported using the LayoutShiftAttribution interface (explained in more detail below).

value: The value property reports the layout shift score for a particular layout shift.

hadRecentInput: The hadRecentInput property indicates whether a layout shift occurred within 500 milliseconds of user input.

startTime: The startTime property indicates when a layout shift occurred. startTime is indicated in milliseconds and is measured relative to the time that the page load was initiated.

duration: The duration property will always be set to 0. This property is inherited from the PerformanceEntry interface (the LayoutShift interface extends the PerformanceEntry interface). However, the concept of duration does not apply to layout shift events, so it is set to 0. For information on the PerformanceEntry interface, refer to the spec.

The Web Vitals Extension can log layout shift info to the console. To enable this feature, go to Options > Console Logging.

LayoutShiftAttribution #

The LayoutShiftAttribution interface describes a single shift of a single DOM element. If multiple elements shift during a layout shift, the sources property contains multiple entries.

For example, the JSON below corresponds to a layout shift with one source: the downward shift of the <div id='banner'> DOM element from y: 76 to y: 246.

    // ...
    "sources": [
      {
        "node": "div#banner",
        "previousRect": {
          "x": 311, "y": 76, "width": 4, "height": 18,
          "top": 76, "right": 315, "bottom": 94, "left": 311
        },
        "currentRect": {
          "x": 311, "y": 246, "width": 4, "height": 18,
          "top": 246, "right": 315, "bottom": 264, "left": 311
        }
      }
    ]

The node property identifies the HTML element that shifted. Hovering on this property in DevTools highlights the corresponding page element.

The previousRect and currentRect properties report the size and position of the node.

The x and y coordinates report the x-coordinate and y-coordinate respectively of the top-left corner of the element.

The width and height properties report the width and height respectively of the element.

The top, right, bottom, and left properties report the x or y coordinate values corresponding to the given edge of the element. In other words, the value of top is equal to y; the value of bottom is equal to y+height.

If all properties of previousRect are set to 0, this means that the element has shifted into view. If all properties of currentRect are set to 0, this means that the element has shifted out of view.
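The rules above for reading previousRect and currentRect can be sketched as a small helper that summarizes one source. The describeShift and isEmptyRect names and the shape of the returned object are hypothetical; only the previousRect and currentRect fields come from the LayoutShiftAttribution interface.

```javascript
// True when x, y, width and height are all 0, which the API uses to signal
// that the element was not in view.
function isEmptyRect(rect) {
  return rect.x === 0 && rect.y === 0 && rect.width === 0 && rect.height === 0;
}

// Summarizes a single LayoutShiftAttribution-like object.
// describeShift and its return shape are hypothetical, for illustration only.
function describeShift(source) {
  const prev = source.previousRect;
  const curr = source.currentRect;
  if (isEmptyRect(prev)) return { kind: 'entered' }; // shifted into view
  if (isEmptyRect(curr)) return { kind: 'exited' };  // shifted out of view
  return {
    kind: 'moved',
    dx: curr.x - prev.x,                // positive means moved right
    dy: curr.y - prev.y,                // positive means moved down
    dWidth: curr.width - prev.width,
    dHeight: curr.height - prev.height,
  };
}

// The div#banner example from the text: same size, moved down the page.
const banner = {
  previousRect: { x: 311, y: 76, width: 4, height: 18 },
  currentRect: { x: 311, y: 246, width: 4, height: 18 },
};
console.log(describeShift(banner));
```

In a PerformanceObserver callback you could map entry.sources through a helper like this to log the direction and distance of each shift instead of inspecting the rects by hand.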
One of the most important things to understand when interpreting these outputs is that elements listed as sources are the elements that shifted during the layout shift. However, it's possible that these elements are only indirectly related to the "root cause" of layout instability. Here are a few examples. Example #1 This layout shift would be reported with one source: element B. However, the root cause of this layout shift is the change in size of element A. Example #2 The layout shift in this example would be reported with two sources: element A and element B. The root cause of this layout shift is the change in position of element A. Example #3 The layout shift in this example would be reported with one source: element B. Changing the position of element B resulted in this layout shift. Example #4 Although element B changes size, there is no layout shift in this example. Check out a demo of how DOM changes are reported by the Layout Instability API. DevTools # Performance panel # The Experience pane of the DevTools Performance panel displays all layout shifts that occur during a given performance trace—even if they occur within 500 ms of a user interaction and therefore don't count towards CLS. Hovering over a particular layout shift in the Experience panel highlights the affected DOM element. To view more information about the layout shift, click on the layout shift, then open the Summary drawer. Changes to the element's dimensions are listed using the format [width, height]; changes to the element's position are listed using the format [x,y]. The Had recent input property indicates whether a layout shift occurred within 500 ms of a user interaction. For information on the duration of a layout shift, open the Event Log tab. The duration of a layout shift can also be approximated by looking in the Experience pane for the length of the red layout shift rectangle. The duration of a layout shift has no impact on its layout shift score. 
For more information on using the Performance panel, refer to Performance Analysis Reference. Highlight layout shift regions # Highlighting layout shift regions can be a helpful technique for getting a quick, at-a-glance feel for the location and timing of the layout shifts occurring on a page. To enable Layout Shift Regions in DevTools, go to Settings > More Tools > Rendering > Layout Shift Regions then refresh the page that you wish to debug. Areas of layout shift will be briefly highlighted in purple. Thought process for identifying the cause of layout shifts # You can use the steps below to identify the cause of layout shifts regardless of when or how the layout shift occurs. These steps can be supplemented with running Lighthouse—however, keep in mind that Lighthouse can only identify layout shifts that occurred during the initial page load. In addition, Lighthouse also can only provide suggestions for some causes of layout shifts—for example, image elements that do not have explicit width and height. Identifying the cause of a layout shift # Layout shifts can be caused by the following events: Changes to the position of a DOM element Changes to the dimensions of a DOM element Insertion or removal of a DOM element Animations that trigger layout In particular, the DOM element immediately preceding the shifted element is the element most likely to be involved in "causing" layout shift. Thus, when investigating why a layout shift occurred consider: Did the position or dimensions of the preceding element change? Was a DOM element inserted or removed before the shifted element? Was the position of the shifted element explicitly changed? If the preceding element did not cause the layout shift, continue your search by considering other preceding and nearby elements. In addition, the direction and distance of a layout shift can provide hints about root cause. 
For example, a large downward shift often indicates the insertion of a DOM element, whereas a 1 px or 2 px layout shift often indicates the application of conflicting CSS styles or the loading and application of a web font.

These are some of the specific behaviors that most frequently cause layout shift events:

Changes to the position of an element (that aren't due to the movement of another element) #

This type of change is often a result of: Stylesheets that are loaded late or overwrite previously declared styles. Animation and transition effects.

Changes to the dimensions of an element #

This type of change is often a result of: Stylesheets that are loaded late or overwrite previously declared styles. Images and iframes without width and height attributes that load after their "slot" has been rendered. Text blocks without width or height attributes that swap fonts after the text has been rendered.

The insertion or removal of DOM elements #

This is often the result of: Insertion of ads and other third-party embeds. Insertion of banners, alerts, and modals. Infinite scroll and other UX patterns that load additional content above existing content.

Animations that trigger layout #

Some animation effects can trigger layout. A common example of this is when DOM elements are 'animated' by incrementing properties like top or left rather than using CSS's transform property. Read How to create high-performance CSS animations for more information.

Reproducing layout shifts #

You can't fix layout shifts that you can't reproduce. One of the simplest, yet most effective things you can do to get a better sense of your site's layout stability is to take 5-10 minutes to interact with your site with the goal of triggering layout shifts. Keep the console open while doing this and use the Layout Instability API to report on layout shifts. For hard-to-locate layout shifts, consider repeating this exercise with different devices and connection speeds.
In particular, using a slower connection speed can make it easier to identify layout shifts. In addition, you can use a debugger statement to make it easier to step through layout shifts.

    // Running CLS total; declared here so the snippet works on its own.
    let cls = 0;
    new PerformanceObserver((entryList) => {
      for (const entry of entryList.getEntries()) {
        if (!entry.hadRecentInput) {
          cls += entry.value;
          debugger; // pauses here when DevTools is open
          console.log('Current CLS value:', cls, entry);
        }
      }
    }).observe({type: 'layout-shift', buffered: true});

Lastly, for layout issues that aren't reproducible in development, consider using the Layout Instability API in conjunction with your front-end logging tool of choice to collect more information on these issues. Check out the example code for how to track the largest shifted element on a page.
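The linked example code is not reproduced here, but the idea can be sketched as follows. This sketch assumes that "largest shifted element" means the largest-area source of the highest-scoring layout shift, which is one reasonable reading rather than the only one. The rectArea, largestSource, and updateLargestShift names are hypothetical.

```javascript
// Area of a rect-like object ({width, height}) in CSS pixels squared.
const rectArea = (rect) => rect.width * rect.height;

// Returns the source whose element covers the largest area before or after
// the shift, or null when there are no sources.
function largestSource(sources) {
  let best = null;
  let bestArea = -1;
  for (const source of sources) {
    const area = Math.max(rectArea(source.previousRect), rectArea(source.currentRect));
    if (area > bestArea) {
      best = source;
      bestArea = area;
    }
  }
  return best;
}

// Returns the updated "largest shift so far" record. Shifts that follow
// recent user input are ignored, as in the CLS snippets above.
function updateLargestShift(current, entry) {
  if (entry.hadRecentInput || entry.value <= current.value) return current;
  const source = largestSource(entry.sources || []);
  return { value: entry.value, node: source ? source.node : null };
}

let largestShift = { value: 0, node: null };

// Only start observing where layout-shift entries are actually supported.
if (typeof PerformanceObserver !== 'undefined' &&
    (PerformanceObserver.supportedEntryTypes || []).includes('layout-shift')) {
  new PerformanceObserver((entryList) => {
    for (const entry of entryList.getEntries()) {
      largestShift = updateLargestShift(largestShift, entry);
    }
    console.log('Largest layout shift so far:', largestShift);
  }).observe({ type: 'layout-shift', buffered: true });
}
```

In production you would send largestShift to your logging endpoint (for example when the page is hidden) rather than printing it to the console.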

How Wix improved website performance by evolving their infrastructure

Alon leads the core backend team at Wix. Thanks to leveraging industry standards, cloud providers, and CDN capabilities, combined with a major rewrite of our website runtime, the percentage of Wix sites reaching good 75th percentile scores on all Core Web Vitals (CWV) metrics more than tripled year over year, according to data from CrUX and HTTPArchive. Wix adopted a performance-oriented culture, and further improvements will continue rolling out to users. As we focus on performance KPIs, we expect to see the number of sites passing CWV thresholds grow. Overview # The world of performance is beautifully complex, with many variables and intricacies. Research shows that site speed has a direct impact on conversion rates and revenues for businesses. In recent years, the industry has put more emphasis on performance visibility and making the web faster. Starting in May 2021, page experience signals will be included in Google Search ranking. The unique challenge at Wix is supporting millions of sites, some of which were built many years ago and have not been updated since. We have various tools and articles to assist our users on what they can do to analyze and improve the performance of their sites. Wix is a managed environment and not everything is in the hands of the user. Sharing common infrastructures presents many challenges for all these sites, but also opens opportunities for major enhancements across the board, i.e. leveraging the economies of scale. In this post I will focus on enhancements done around serving the initial HTML, which initiates the page loading process. Speaking in a common language # One of the core difficulties with performance is finding a common terminology to discuss different aspects of the user experience, while considering both the technical and perceived performance. 
Using a well-defined, common language within the organization enabled us to easily discuss and categorize the different technical parts and trade-offs, clarified our performance reports and tremendously helped to understand what aspects we should focus on improving first. We adjusted all our monitoring and internal discussions to include industry standard metrics such as Web Vitals, which include: Core Web Vitals Site complexity and performance scores # It's pretty easy to create a site that loads instantly so long as you make it very simple using only HTML and serve it via a CDN. However, the reality is that sites are getting more and more complex and sophisticated, operating more like applications rather than documents, and supporting functionalities such as blogs, e-commerce solutions, custom code, etc. Wix offers a very large variety of templates, enabling its users to easily build a site with many business capabilities. Those additional features often come with some performance costs. The journey # In the beginning, there was HTML # Every time a webpage is loaded, it always starts with an initial request to a URL in order to retrieve the HTML document. This HTML response triggers all the additional client requests and browser logic to run and render your site. This is the most important part of the page loading, because nothing happens until the beginning of the response arrives (known as TTFB - time to first byte). WebPageTest First View The past: client-side rendering (CSR) # When operating large scale systems, you always have trade-offs you need to consider, such as performance, reliability and costs. Up to a few years ago, Wix used client-side rendering (CSR), in which the actual HTML content was generated via JavaScript on the client side (i.e. in the browser) allowing us to support a high scale of sites without having huge backend operational costs. CSR enabled us to use a common HTML document, which was essentially empty. 
All it did was trigger the download of the required code and data which was then used to generate the full HTML on the client device.

Today: server-side rendering (SSR) #

A few years ago we transitioned to server-side rendering (SSR), as that was beneficial both to SEO and performance, improving initial page visibility times and ensuring better indexing for search engines that do not have full support for running JavaScript. This approach improved the visibility experience, especially on slower devices/connections, and opened the door for further performance optimizations. However, it also meant that for each web page request, a unique HTML response was generated on the fly, which is far from optimal, especially for sites with a large number of views.

Introducing caching in multiple locations #

The HTML for each site was mostly static, but it had a few caveats:

It changes frequently: each time a user edits their site or changes site data, such as the inventory of the website's store.

It had certain data and cookies that were visitor-specific, meaning two people visiting the same site would see somewhat different HTML. This supported product features such as remembering what items a visitor put in the cart, or the chat the visitor started with the business earlier.

Not all pages are cacheable. For example, a page with custom user code that displays the current time as part of the document is not eligible for caching.

Initially, we took the relatively safe approach of caching the HTML without visitor data, and then only modified specific parts of the HTML response on the fly for each visitor, for each cache hit.

In-house CDN solution #

We did this by deploying an in-house solution: Using Varnish HTTP Cache for proxying and caching, Kafka for invalidation messages, and a Scala/Netty-based service which proxies these HTML responses, but mutates the HTML and adds visitor-specific data and cookies to the cached response.
This solution enabled us to deploy these slim components in many more geographic locations and multiple cloud provider regions, which are spread across the world. In 2019, we introduced over 15 new regions, and gradually enabled caching for over 90% of our page views that were eligible for caching. Serving sites from additional locations reduced the network latency between the clients and the servers serving the HTML response, by bringing the content closer to the website's visitors. We also started caching certain read-only API responses by using the same solution and invalidating the cache on any change to the site content. For example, the list of blog posts on the site is cached and invalidated when a post is published/modified. Reducing complexities # Implementing caching improved performance substantially, mostly on the TTFB and FCP phases, and improved our reliability by serving the content from a location closer to the end user. However, the need to modify the HTML for each response introduced an unnecessary complexity that, if removed, presented an opportunity for further performance improvements. Browser caching (and preparations for CDNs) # ~ 13% HTML requests served directly from the browser cache, saving much bandwidth and reducing loading times for repeat views The next step was to actually remove this visitor-specific data from the HTML entirely, and retrieve it from a separate endpoint, called by the client for this purpose, after the HTML has arrived. We carefully migrated this data and cookies to a new endpoint, which is called on each page load, but returns a slim JSON, which is required only for the hydration process, to reach full page interactivity. This allowed us to enable browser caching of the HTML, which means that browsers now save the HTML response for repeating visits, and only call the server to validate that the content hasn't changed. 
This is done using an HTTP ETag, which is basically an identifier assigned to a specific version of an HTML resource. If the content is still the same, a 304 Not Modified response is sent by our servers to the client, without a body.

WebPageTest Repeat View

In addition, this change means that our HTML is no longer visitor-specific and contains no cookies. In other words, it can basically be cached anywhere, opening the door to using CDN providers that have much better geo presence in hundreds of locations around the world.

DNS, SSL and HTTP/2 #

With caching enabled, wait times were reduced and other important parts of the initial connection became more significant. Enhancing our networking infrastructure and monitoring enabled us to improve our DNS, connection, and SSL times. HTTP/2 was enabled for all user domains, reducing both the number of connections required and the overhead that comes with each new connection. This was a relatively easy change to deploy, and it let us take advantage of the performance and resilience benefits that come with HTTP/2.

Brotli compression (vs. gzip) #

21 - 25% Reduction of median file transfer size

Traditionally, all our files were compressed using gzip compression, which is the most prevalent HTML compression option on the web. This compression format was initially implemented almost 30 years ago!

Brotli Compression Level Estimator

The newer Brotli compression introduces compression improvements with almost no trade-offs, and is slowly becoming more popular, as described in the yearly Web Almanac Compression chapter. It has been supported by all the major browsers for a while. We enabled Brotli support on our nginx proxies at the edge, for all clients that support it. Moving to Brotli compression reduced our median file transfer sizes by 21% to 25%, resulting in reduced bandwidth usage and improved loading times.
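As a rough way to see the gzip versus Brotli difference on your own HTML, the Node.js sketch below compresses the same input with both algorithms and picks an encoding based on the client's Accept-Encoding header. It is an illustration using Node's built-in zlib module, not a description of Wix's nginx setup. The compressedSizes and pickEncoding helpers and the chosen compression levels are assumptions, and the savings you measure will depend on your content.

```javascript
// Node.js sketch: compare gzip and Brotli output sizes for the same payload.
const zlib = require('zlib');

// Returns byte sizes for the original, gzip and Brotli versions of a string.
// compressedSizes is a hypothetical helper; both algorithms run at their
// maximum level here, which servers compressing on the fly may not use.
function compressedSizes(text) {
  const input = Buffer.from(text, 'utf8');
  return {
    original: input.length,
    gzip: zlib.gzipSync(input, { level: 9 }).length,
    brotli: zlib.brotliCompressSync(input, {
      params: { [zlib.constants.BROTLI_PARAM_QUALITY]: 11 },
    }).length,
  };
}

// Picks 'br' when the client accepts it, then 'gzip', else 'identity'.
// Encodings listed with q=0 are treated as not accepted.
function pickEncoding(acceptEncoding = '') {
  const accepted = new Set();
  for (const part of acceptEncoding.split(',')) {
    const [name, ...params] = part.trim().toLowerCase().split(';');
    const q = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
    if (name && (!q || parseFloat(q.slice(2)) > 0)) accepted.add(name.trim());
  }
  if (accepted.has('br')) return 'br';
  if (accepted.has('gzip')) return 'gzip';
  return 'identity';
}

// Hypothetical, highly repetitive sample document.
const sampleHtml = '<ul>' + '<li class="item">Hello, world</li>'.repeat(500) + '</ul>';
console.log(compressedSizes(sampleHtml));
console.log(pickEncoding('gzip, deflate, br'));
```

Running compressedSizes against a few representative pages gives a quick estimate of the transfer savings before changing any server configuration.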
Median Response Sizes Content delivery networks (CDNs) # Dynamic CDN selection # At Wix, we have always used CDNs to serve all the JavaScript code and images on user websites. Recently, we integrated with a solution by our DNS provider, to automatically select the best performing CDN according to the client's network and origin. This enables us to serve the static files from the best location for each visitor, and avoid availability issues on a certain CDN. Coming soon… user domains served by CDNs # The final piece of the puzzle is serving the last, and most critical part, through a CDN: the HTML from the user domain. As described above, we created our own in-house solution to cache and serve the site-specific HTML and API results. Maintaining this solution in so many new regions also has its operational costs, and adding new locations becomes a process we need to manage and continually optimize. We are currently integrating with various CDN providers to support serving the entire Wix site directly from CDN locations to improve the distribution of our servers across the globe and thus further improve response times. This is a challenge due to the large amount of domains we serve, which require SSL termination at the edge. Integrating with CDNs brings Wix websites closer than ever to the customer and comes with more improvements to the loading experience, including newer technologies such as HTTP/3 without added effort on our side. A few words on performance monitoring # If you run a Wix site, you're probably wondering how this translates to your Wix site performance results, and how we compare against other website platforms. Most of the work done above has been deployed in the past year, and some is still being rolled out. The Web Almanac by HTTPArchive recently published the 2020 edition which includes an excellent chapter on CMS user experience. Keep in mind that many of the numbers described in this article are from the middle of 2020. 
We look forward to seeing the updated report in 2021, and are actively monitoring CrUX reports for our sites as well as our internal performance metrics. We are committed to continuously improving loading times and providing our users with a platform where they can build sites as they imagine, without compromising on performance.

LCP, Speed Index and FCP for a mobile site over time

DebugBear recently released a very interesting Website Builder Performance Review, which touches on some of the areas I mentioned above and examines the performance of very simple sites built on each platform. This site was built almost two years ago and has not been modified since, but the platform is continually improving, and the site's performance along with it, as can be seen by viewing its data over the past year and a half.

Conclusion #

We hope our experience inspires you to adopt a performance-oriented culture at your organisation and that the details above are helpful and applicable to your platform or site. To sum up: Pick a set of metrics that you can track consistently using tools endorsed by the industry. We recommend Core Web Vitals. Leverage browser caching and CDNs. Migrate to HTTP/2 (or HTTP/3 if possible). Use Brotli compression.

Thanks for reading our story. We invite you to ask questions, share ideas on Twitter and GitHub, and join the web performance conversation on your favorite channels.

So, what does your recent Wix site performance look like? #

How CLS optimizations increased Yahoo! JAPAN News's page views per session by 15%

Yahoo! JAPAN is one of the largest media companies in Japan, providing over 79 billion page views per month. Their news platform, Yahoo! JAPAN News has more than 22 billion page views per month and an engineering team dedicated to improving the user experience. By continuously monitoring Core Web Vitals (CWV), they correlated the site's improved Cumulative Layout Shift (CLS) score with a 15% increase in page views per session and 13% increase in session duration. 0.2 CLS improvement 15.1% More page views per session 13.3% Longer session duration Cumulative Layout Shift measures how visually stable a website is—it helps quantify how often users experience unexpected layout shifts. Page content moving around unexpectedly often causes accidental clicks, disorientation on the page, and ultimately user frustration. Frustrated users tend not to stick around for long. To keep users happy, the page layout should stay stable through the entire lifecycle of the user journey. For Yahoo! JAPAN News this improvement had a significant positive impact on business critical engagement metrics. For technical details on how they improved the CLS, read the Yahoo! JAPAN News engineering team's post. Identifying the issue # Monitoring Core Web Vitals, including CLS, is crucial in catching issues and identifying where they're coming from. At Yahoo! JAPAN News, Search Console provided a great overview of groups of pages with performance issues and Lighthouse helped identify per-page opportunities to improve page experience. Using these tools, they discovered that the article detail page had poor CLS. It's important to keep in mind the cumulative part of the Cumulative Layout Shift—the score is captured through the entire page lifecycle. In the real-world, the score can include shifts that happen as a result of user interactions such as scrolling a page or tapping a button. To collect CLS scores from the field data, the team integrated web-vitals JavaScript library reporting. 
As a part of performance monitoring strategy, they're also working on building an internal tool with Lighthouse CI to continuously audit performance across businesses in the company. The team used Chrome DevTools to identify which elements were making layout shifts on the page. Layout Shift Regions in DevTools visualizes elements that contribute to CLS by highlighting them with a blue rectangle whenever a layout shift happens. They figured out that a layout shift occurred after the hero image at the top of the article was loaded for the first view. In the example above, when the image finishes loading, the text gets pushed down (the position change is indicated with the red line). Improving CLS for images # For fixed-size images, layout shifts can be prevented by specifying the width and height attributes in the img element and using the CSS aspect-ratio property available in modern browsers. However, Yahoo! JAPAN News needed to support not only modern browsers, but also browsers installed in relatively old operating systems such as iOS 9. They used Aspect Ratio Boxes—a method which uses markup to reserve the space on the page before the image is loaded. This method requires knowing the aspect ratio of the image in advance, which they were able to get from the backend API. Results # The number of URLs with poor performance in Search Console decreased by 98% and CLS in lab data decreased from about 0.2 to 0. More importantly, there were several correlated improvements in business metrics. Search Console doesn't reflect improvements in real-time. When Yahoo! JAPAN News compared user engagement metrics before and after CLS optimization, they saw multiple improvements: 15.1% More page views per session 13.3% Longer session duration 1.72%* Lower bounce rate (*percentage points) By improving CLS and other Core Web Vitals metrics, Yahoo! JAPAN News also got the "Fast page" label in the context menu of Chrome Android. 
Layout shifts frustrate users and discourage them from reading more pages, but you can reduce them by using the appropriate tools, identifying issues, and applying best practices. Improving CLS is a chance to improve your business. For more information, read the Yahoo! JAPAN engineering team's post.

JavaScript: What is the meaning of this?

JavaScript's this is the butt of many jokes, and that's because, well, it's pretty complicated. However, I've seen developers do much-more-complicated and domain-specific things to avoid dealing with this this. If you're unsure about this, hopefully this will help. This is my this guide. I'm going to start with the most specific situation, and end with the least-specific. This article is kinda like a big if (…) … else if () … else if (…) …, so you can go straight to the first section that matches the code you're looking at. If the function is defined as an arrow function Otherwise, if the function/class is called with new Otherwise, if the function has a 'bound' this value Otherwise, if this is set at call-time Otherwise, if the function is called via a parent object (parent.func()) Otherwise, if the function or parent scope is in strict mode Otherwise If the function is defined as an arrow function: # const arrowFunction = () => { console.log(this); }; In this case, the value of this is always the same as this in the parent scope: const outerThis = this; const arrowFunction = () => { // Always logs `true`: console.log(this === outerThis); }; Arrow functions are great because the inner value of this can't be changed, it's always the same as the outer this. 
Other examples # With arrow functions, the value of this can't be changed with bind: // Logs `true` - bound `this` value is ignored: arrowFunction.bind({foo: 'bar'})(); With arrow functions, the value of this can't be changed with call or apply: // Logs `true` - called `this` value is ignored: arrowFunction.call({foo: 'bar'}); // Logs `true` - applied `this` value is ignored: arrowFunction.apply({foo: 'bar'}); With arrow functions, the value of this can't be changed by calling the function as a member of another object: const obj = {arrowFunction}; // Logs `true` - parent object is ignored: obj.arrowFunction(); With arrow functions, the value of this can't be changed by calling the function as a constructor: // TypeError: arrowFunction is not a constructor new arrowFunction(); 'Bound' instance methods # With instance methods, if you want to ensure this always refers to the class instance, the best way is to use arrow functions and class fields: class Whatever { someMethod = () => { // Always the instance of Whatever: console.log(this); }; } This pattern is really useful when using instance methods as event listeners in components (such as React components, or web components). The above might feel like it's breaking the "this will be the same as this in the parent scope" rule, but it starts to make sense if you think of class fields as syntactic sugar for setting things in the constructor: class Whatever { someMethod = (() => { const outerThis = this; return () => { // Always logs `true`: console.log(this === outerThis); }; })(); } // …is roughly equivalent to: class Whatever { constructor() { const outerThis = this; this.someMethod = () => { // Always logs `true`: console.log(this === outerThis); }; } } Alternative patterns involve binding an existing function in the constructor, or assigning the function in the constructor.
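As a minimal sketch of the first alternative, binding an existing prototype method in the constructor could look like the following. The class name `Counter` and its members are hypothetical and chosen only for illustration.

```javascript
// Hypothetical example class; not from the original article.
class Counter {
  constructor() {
    this.count = 0;
    // Shadow the prototype method with a bound copy on the instance,
    // so `this` is the instance no matter how the method is called.
    this.increment = this.increment.bind(this);
  }

  increment() {
    this.count += 1;
    return this.count;
  }
}

const counter = new Counter();
// Detach the method from its parent object, as an event listener would.
const { increment } = counter;
increment();
increment();
console.log(counter.count); // 2
```

Without the `bind` call in the constructor, the detached `increment()` call would throw, because class bodies are always in strict mode and `this` would be `undefined`.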
If you can't use class fields for some reason, assigning functions in the constructor is a reasonable alternative: class Whatever { constructor() { this.someMethod = () => { // … }; } } Otherwise, if the function/class is called with new: # new Whatever(); The above will call Whatever (or its constructor function if it's a class) with this set to the result of Object.create(Whatever.prototype). class MyClass { constructor() { console.log( this.constructor === Object.create(MyClass.prototype).constructor, ); } } // Logs `true`: new MyClass(); The same is true for older-style constructors: function MyClass() { console.log( this.constructor === Object.create(MyClass.prototype).constructor, ); } // Logs `true`: new MyClass(); Other examples # When called with new, the value of this can't be changed with bind: const BoundMyClass = MyClass.bind({foo: 'bar'}); // Logs `true` - bound `this` value is ignored: new BoundMyClass(); When called with new, the value of this can't be changed by calling the function as a member of another object: const obj = {MyClass}; // Logs `true` - parent object is ignored: new obj.MyClass(); Otherwise, if the function has a 'bound' this value: # function someFunction() { return this; } const boundObject = {hello: 'world'}; const boundFunction = someFunction.bind(boundObject); Whenever boundFunction is called, its this value will be the object passed to bind (boundObject). // Logs `false`: console.log(someFunction() === boundObject); // Logs `true`: console.log(boundFunction() === boundObject); Warning: Avoid using bind to bind a function to its outer this. Instead, use arrow functions, as they make this clear from the function declaration, rather than something that happens later in the code. Don't use bind to set this to some value unrelated to the parent object; it's usually unexpected and it's why this gets such a bad reputation. Consider passing the value as an argument instead; it's more explicit, and works with arrow functions. 
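To illustrate the advice in the warning, here is a small sketch that contrasts a bound `this` with passing the value as an argument. The names `settings`, `formatWithThis`, and `formatWithArg` are hypothetical.

```javascript
// Hypothetical helpers for illustration; not from the original article.
const settings = { prefix: '[app]' };

// Relies on a bound `this`: the dependency is invisible at the call site.
function formatWithThis(message) {
  return `${this.prefix} ${message}`;
}
const boundFormat = formatWithThis.bind(settings);

// Takes the value as an argument: explicit, and works as an arrow function.
const formatWithArg = (config, message) => `${config.prefix} ${message}`;

console.log(boundFormat('hello')); // "[app] hello"
console.log(formatWithArg(settings, 'hello')); // "[app] hello"
```

Both calls produce the same string, but only the second one shows at the call site which object the function depends on.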
Other examples # When calling a bound function, the value of this can't be changed with call or apply: // Logs `true` - called `this` value is ignored: console.log(boundFunction.call({foo: 'bar'}) === boundObject); // Logs `true` - applied `this` value is ignored: console.log(boundFunction.apply({foo: 'bar'}) === boundObject); When calling a bound function, the value of this can't be changed by calling the function as a member of another object: const obj = {boundFunction}; // Logs `true` - parent object is ignored: console.log(obj.boundFunction() === boundObject); Otherwise, if this is set at call-time: # function someFunction() { return this; } const someObject = {hello: 'world'}; // Logs `true`: console.log(someFunction.call(someObject) === someObject); // Logs `true`: console.log(someFunction.apply(someObject) === someObject); The value of this is the object passed to call/apply. Warning: Don't use call/apply to set this to some value unrelated to the parent object; it's usually unexpected and it's why this gets such a bad reputation. Consider passing the value as an argument instead; it's more explicit, and works with arrow functions. Unfortunately this is set to some other value by things like DOM event listeners, and using it can result in difficult-to-understand code: Don't element.addEventListener('click', function (event) { // Logs `element`, since the DOM spec sets `this` to // the element the handler is attached to. console.log(this); }); I avoid using this in cases like above, and instead: Do element.addEventListener('click', (event) => { // Ideally, grab it from a parent scope: console.log(element); // But if you can't do that, get it from the event object: console.log(event.currentTarget); }); Otherwise, if the function is called via a parent object (parent.func()): # const obj = { someMethod() { return this; }, }; // Logs `true`: console.log(obj.someMethod() === obj); In this case the function is called as a member of obj, so this will be obj. 
This happens at call-time, so the link is broken if the function is called without its parent object, or with a different parent object: const {someMethod} = obj; // Logs `false`: console.log(someMethod() === obj); const anotherObj = {someMethod}; // Logs `false`: console.log(anotherObj.someMethod() === obj); // Logs `true`: console.log(anotherObj.someMethod() === anotherObj); someMethod() === obj is false because someMethod isn't called as a member of obj. You might have encountered this gotcha when trying something like this: const $ = document.querySelector; // TypeError: Illegal invocation const el = $('.some-element'); This breaks because the implementation of querySelector looks at its own this value and expects it to be a DOM node of sorts, and the above breaks that connection. To achieve the above correctly: const $ = document.querySelector.bind(document); // Or: const $ = (...args) => document.querySelector(...args); Fun fact: Not all APIs use this internally. Console methods like console.log were changed to avoid this references, so log doesn't need to be bound to console. Warning: Don't transplant a function onto an object just to set this to some value unrelated to the parent object; it's usually unexpected and it's why this gets such a bad reputation. Consider passing the value as an argument instead; it's more explicit, and works with arrow functions. Otherwise, if the function or parent scope is in strict mode: # function someFunction() { 'use strict'; return this; } // Logs `true`: console.log(someFunction() === undefined); In this case, the value of this is undefined. 'use strict' isn't needed in the function if the parent scope is in strict mode (and all modules are in strict mode). Warning: Don't rely on this. I mean, there are easier ways to get an undefined value 😀. Otherwise: # function someFunction() { return this; } // Logs `true`: console.log(someFunction() === globalThis); In this case, the value of this is the same as globalThis. 
Most folks (including me) call globalThis the global object, but this isn't 100% technically correct. Here's Mathias Bynens with the details, including why it's called globalThis rather than simply global. Warning: Avoid using this to reference the global object (yes, I'm still calling it that). Instead, use globalThis, which is much more explicit. Phew! # And that's it! That's everything I know about this. Any questions? Something I've missed? Feel free to tweet at me. Thanks to Mathias Bynens, Ingvar Stepanyan, and Thomas Steiner for reviewing.

Agrofy: A 70% improvement in LCP correlated to a 76% reduction in load abandonment

Agrofy is an online marketplace for Latin America's agribusiness market. They match up buyers and sellers of farm machines, land, equipment, and financial services. In Q3 2020 a 4-person development team at Agrofy spent a month optimizing their website because they hypothesized that improved performance would lead to reduced bounce rates. They specifically focused on improving LCP, which is one of the Core Web Vitals. These performance optimizations led to a 70% improvement in LCP, which correlated to a 76% reduction in load abandonment (from 3.8% to 0.9%). 70% Lower LCP 76% Lower load abandonment Problem # While studying their business metrics, a development team at Agrofy noticed that their bounce rates seemed higher than industry benchmarks. Technical debt was also increasing in the website codebase. Solution # The Agrofy team pitched their executives and got buy-in to: Migrate from an older, deprecated framework to a newer, actively supported one. Optimize the load performance of the new codebase. The migration took 2 months. Aside from the 4-person development team mentioned earlier, this migration also involved product and UX specialists and a software architect. The optimization project took the 4-person development team 1 month. They focused on LCP, CLS (another Core Web Vitals metric), and FCP. Specific optimizations included: Lazy loading all non-visible elements with the Intersection Observer API. Delivering static resources faster with a content delivery network. Lazy loading images with loading="lazy". Server-side rendering of critical rendering path content. Preloading and preconnecting critical resources to minimize handshake times. Using real user monitoring (RUM) tools to identify which product detail pages were experiencing lots of layout shifts and then make adjustments to the codebase's architecture. Check out the Agrofy engineering blog post for more technical details. 
After enabling the new codebase on 20% of traffic, they launched the new site to all visitors in early September 2020. Results # The development team's optimizations led to measurable improvements in many different metrics: LCP improved 70%. CLS improved 72%. Blocking JS requests were reduced by 100% and blocking CSS requests by 80%. Long tasks were reduced by 72%. First CPU Idle improved 25%. Over the same time frame, real user monitoring data (also known as field data) showed that the load abandonment rate on product detail pages dropped 76%, from 3.8% to 0.9%. (Figure: Load abandonment rate trend on product detail page.)
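The 76% figure is a relative reduction, not a difference in percentage points. As a quick check of the arithmetic, using a hypothetical helper named `relativeReduction`:

```javascript
// Relative reduction between two rates, expressed as a percentage.
function relativeReduction(before, after) {
  return ((before - after) / before) * 100;
}

// Load abandonment rates quoted in the case study (in percent).
const reduction = relativeReduction(3.8, 0.9);
console.log(`${Math.round(reduction)}% lower`); // "76% lower"
```

In absolute terms, the same change is a drop of 2.9 percentage points.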

Tabbed application mode for PWAs

Tabbed application mode is part of the capabilities project and is currently in development. This post will be updated as the implementation progresses. Tabbed application mode is an early-stage exploration of the Chrome team. It is not ready for production yet. In the world of computing, the desktop metaphor is an interface metaphor that is a set of unifying concepts used by graphical user interfaces (GUI) to help users interact more easily with the computer. In keeping with the desktop metaphor, GUI tabs are modeled after traditional card tabs inserted in books, paper files, or card indexes. A tabbed document interface (TDI) or tab is a graphical control element that allows multiple documents or panels to be contained within a single window, using tabs as a navigational widget for switching between sets of documents. Progressive Web Apps can run in various display modes determined by the display property in the Web App Manifest. Examples are fullscreen, standalone, minimal-ui, and browser. These display modes follow a well-defined fallback chain ("fullscreen" → "standalone" → "minimal-ui" → "browser"). If a browser does not support a given mode, it falls back to the next display mode in the chain. Via the "display_override" property, developers can specify their own fallback chain if they need to. What is tabbed application mode # Something that has been missing from the platform so far is a way to let PWA developers offer their users a tabbed document interface, for example, to enable editing different files in the same PWA window. Tabbed application mode closes this gap. This feature is about having a standalone app window with multiple tabs (containing separate documents inside the app scope) inside it. It is not to be confused with the existing "display": "browser", which has a separate meaning (specifically, that the app is opened in a regular browser tab). 
Suggested use cases for tabbed application mode # Examples of sites that may use tabbed application mode include: Productivity apps that let the user edit more than one document (or file) at the same time. Communication apps that let the user have conversations in different rooms per tab. Reading apps that open article links in new in-app tabs. Differences to developer-built tabs # Having documents in separate browser tabs comes with resource isolation for free, which is not possible using the web today. Developer-built tabs would not scale acceptably to hundreds of tabs like browser tabs do. Developer-built tabs could also not be dragged out of the window to split into a separate application window, or be dragged back in to combine them back into a single window. Browser affordances such as navigation history, "Copy this page URL", "Cast this tab" or "Open this page in a web browser" would be applied to the developer-built tabbed interface page, but not the currently selected document page. Differences to "display": "browser" # The current "display": "browser" already has a specific meaning: Opens the web application using the platform-specific convention for opening hyperlinks in the user agent (e.g., in a browser tab or a new window). While browsers can do whatever they want regarding UI, it would clearly be a pretty big subversion of developer expectations if "display": "browser" suddenly meant "run in a separate application-specific window with no browser affordances, but a tabbed document interface". Setting "display": "browser" is effectively the way you opt out of being put into an application window. Current status # Step Status 1. Create explainer In progress 2. Create initial draft of specification Not started 3. Gather feedback & iterate on design In progress 4. Origin trial Not started 5. 
Launch Not started Using tabbed application mode # To use tabbed application mode, developers need to opt their apps in by setting a specific "display_override" mode value in the Web App Manifest. { … "display": "standalone", "display_override": ["tabbed"], … } Warning: The details of the potential display_override property's value (currently "tabbed") are not final. While you can try tabbed mode behind a flag, it will blindly apply to all sites and does not currently care about the manifest. When you set "display_override": ["tabbed"], it will just be treated the same as "display": "browser". Trying tabbed application mode # You can try tabbed application mode on Chrome OS devices running Chrome 83 and up today: Set the #enable-desktop-pwas-tab-strip flag. Install any web app that runs in standalone mode, for example, Excalidraw. Pin the app icon to the shelf, right click the icon, and select "New tabbed window" from the context menu. Open the app and interact with the tab strip. The video below shows the current iteration of the feature in action. There is no need to make any changes to the Web App Manifest for this to work. Feedback # The Chrome team wants to hear about your experiences with tabbed application mode. Tell us about the API design # Is there something about tabbed application mode that does not work like you expected? Comment on the Web App Manifest Issue that we have created. Report a problem with the implementation # Did you find a bug with Chrome's implementation? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter UI>Browser>WebAppInstalls in the Components box. Glitch works great for sharing quick and easy reproduction cases. Show support for the API # Are you planning to use tabbed application mode? Your public support helps the Chrome team prioritize features and shows other browser vendors how critical it is to support them. 
Send a tweet to @ChromiumDev using the hashtag #TabbedApplicationMode and let us know where and how you are using it. Useful links # Web App Manifest spec issue Chromium bug Blink Component: UI>Browser>WebAppInstalls Acknowledgements # Tabbed application mode was explored by Matt Giuca. The experimental implementation in Chrome was the work of Alan Cutter. This article was reviewed by Joe Medley. Hero image by Till Niermann on Wikimedia Commons.

Preparing for the display modes of tomorrow

A Web App Manifest is a JSON file that tells the browser about your Progressive Web App and how it should behave when installed on the user's desktop or mobile device. Via the display property, you can customize what browser UI is shown when your app is launched. For example, you can hide the address bar and browser chrome. Games can even be made to launch full screen. As a quick recap, below are the display modes that are specified at the time this article was written. Property Use fullscreen Opens the web application without any browser UI and takes up the entirety of the available display area. standalone Opens the web app to look and feel like a standalone app. The app runs in its own window, separate from the browser, and hides standard browser UI elements like the URL bar. minimal-ui This mode is similar to standalone, but provides the user a minimal set of UI elements for controlling navigation (such as back and reload). browser A standard browser experience. These display modes follow a well-defined fallback chain ("fullscreen" → "standalone" → "minimal-ui" → "browser"). If a browser does not support a given mode, it falls back to the next display mode in the chain. Shortcomings of the display property # The problem with this hard-wired fallback chain approach is threefold: A developer cannot request "minimal-ui" without being forced back into the "browser" display mode in case "minimal-ui" is not supported by a given browser. Developers have no way of handling cross-browser differences, like if the browser includes or excludes a back button in the window for "standalone" mode. The current behavior makes it impossible to introduce new display modes in a backward compatible way, since explorations like tabbed application mode do not have a natural place in the fallback chain. The display_override property # These problems are solved by the display_override property, which the browser considers before the display property. 
Its value is a sequence of strings that are considered in-order, and the first supported display mode is applied. If none are supported, the browser falls back to evaluating the display field. The display_override property is meant to solve special corner cases. In almost all circumstances the regular display property is what developers should reach for. In the example below, the display mode fallback chain would be as follows. (The details of "window-controls-overlay" are out-of-scope for this article.) "window-controls-overlay" (First, look at display_override.) "minimal-ui" "standalone" (When display_override is exhausted, evaluate display.) "minimal-ui" (Finally, use the display fallback chain.) "browser" { "display_override": ["window-controls-overlay", "minimal-ui"], "display": "standalone" } The browser will not consider display_override unless display is also present. To remain backward compatible, any future display mode will only be acceptable as a value of display_override, but not display. Browsers that do not support display_override fall back to the display property and ignore display_override as an unknown Web App Manifest property. The display_override property is defined independently from its potential values. Browser compatibility # The display_override property is supported as of Chromium 89. Other browsers support the display property, which caters to the majority of display mode use cases. Useful links # Explainer Intent to Ship thread Chromium bug Chrome Status entry Manifest Incubations repository Acknowledgments # The display_override property was formalized by Daniel Murphy.
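To make the lookup order described in this article concrete, here is a rough sketch of how a display mode could be resolved from a manifest. The function `resolveDisplayMode` and its `supportedModes` parameter are hypothetical; browsers perform this resolution internally and do not expose such an API. Falling back to "browser" when display is absent is also an assumption of this sketch.

```javascript
// The display fallback chain described in the article.
const DISPLAY_FALLBACK_CHAIN = ['fullscreen', 'standalone', 'minimal-ui', 'browser'];

// Hypothetical helper: `supportedModes` stands in for whatever display
// modes a given browser implements.
function resolveDisplayMode(manifest, supportedModes) {
  const supported = new Set(supportedModes);

  // `display_override` is only considered when `display` is present.
  // Assumption: without `display`, use the default "browser" mode.
  if (manifest.display === undefined) return 'browser';

  // First, look at `display_override`: the first supported entry wins.
  for (const mode of manifest.display_override ?? []) {
    if (supported.has(mode)) return mode;
  }

  // Then evaluate `display` and continue down the fallback chain.
  const start = DISPLAY_FALLBACK_CHAIN.indexOf(manifest.display);
  const chain = start === -1 ? [] : DISPLAY_FALLBACK_CHAIN.slice(start);
  for (const mode of chain) {
    if (supported.has(mode)) return mode;
  }
  return 'browser';
}

const manifest = {
  display_override: ['window-controls-overlay', 'minimal-ui'],
  display: 'standalone',
};

console.log(resolveDisplayMode(manifest, ['standalone', 'minimal-ui', 'browser'])); // "minimal-ui"
console.log(resolveDisplayMode(manifest, ['standalone', 'browser'])); // "standalone"
console.log(resolveDisplayMode(manifest, ['browser'])); // "browser"
```

The three calls walk through the chain from the example manifest: an override entry is used when supported, then display, then the remaining display fallback chain.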

Streams—The definitive guide

The Streams API allows you to programmatically access streams of data received over the network or created by whatever means locally and process them with JavaScript. Streaming involves breaking down a resource that you want to receive, send, or transform into small chunks, and then processing these chunks bit by bit. While streaming is something browsers do anyway when receiving assets like HTML or videos to be shown on webpages, this capability has never been available to JavaScript before fetch with streams was introduced in 2015. Streaming was technically possible with XMLHttpRequest, but it really was not pretty. Previously, if you wanted to process a resource of some kind (be it a video, or a text file, etc.), you would have to download the entire file, wait for it to be deserialized into a suitable format, and then process it. With streams being available to JavaScript, this all changes. You can now process raw data with JavaScript progressively as soon as it is available on the client, without needing to generate a buffer, string, or blob. This unlocks a number of use cases, some of which I list below: Video effects: piping a readable video stream through a transform stream that applies effects in real time. Data (de)compression: piping a file stream through a transform stream that selectively (de)compresses it. Image decoding: piping an HTTP response stream through a transform stream that decodes bytes into bitmap data, and then through another transform stream that translates bitmaps into PNGs. If installed inside the fetch handler of a service worker, this allows you to transparently polyfill new image formats like AVIF. Core concepts # Before I go into details on the various types of streams, let me introduce some core concepts. Chunks # A chunk is a single piece of data that is written to or read from a stream. It can be of any type; streams can even contain chunks of different types. 
Most of the time, a chunk will not be the most atomic unit of data for a given stream. For example, a byte stream might contain chunks consisting of 16 KiB Uint8Array units, instead of single bytes. Readable streams # A readable stream represents a source of data from which you can read. In other words, data comes out of a readable stream. Concretely, a readable stream is an instance of the ReadableStream class. Writable streams # A writable stream represents a destination for data into which you can write. In other words, data goes into a writable stream. Concretely, a writable stream is an instance of the WritableStream class. Transform streams # A transform stream consists of a pair of streams: a writable stream, known as its writable side, and a readable stream, known as its readable side. A real-world metaphor for this would be a simultaneous interpreter who translates from one language to another on the fly. In a manner specific to the transform stream, writing to the writable side results in new data being made available for reading from the readable side. Concretely, any object with a writable property and a readable property can serve as a transform stream. However, the standard TransformStream class makes it easier to create such a pair that is properly entangled. Pipe chains # Streams are primarily used by piping them to each other. A readable stream can be piped directly to a writable stream, using the readable stream's pipeTo() method, or it can be piped through one or more transform streams first, using the readable stream's pipeThrough() method. A set of streams piped together in this way is referred to as a pipe chain. Backpressure # Once a pipe chain is constructed, it will propagate signals regarding how fast chunks should flow through it. If any step in the chain cannot yet accept chunks, it propagates a signal backwards through the pipe chain, until eventually the original source is told to stop producing chunks so fast.
This process of normalizing flow is called backpressure. Teeing # A readable stream can be teed (named after the shape of an uppercase 'T') using its tee() method. This will lock the stream, that is, make it no longer directly usable; however, it will create two new streams, called branches, which can be consumed independently. Teeing is also important because streams cannot be rewound or restarted; more about this later. (Figure: A pipe chain.) The mechanics of a readable stream # A readable stream is a data source represented in JavaScript by a ReadableStream object that flows from an underlying source. The ReadableStream() constructor creates and returns a readable stream object from the given handlers. There are two types of underlying source: Push sources constantly push data at you when you have accessed them, and it is up to you to start, pause, or cancel access to the stream. Examples include live video streams, server-sent events, or WebSockets. Pull sources require you to explicitly request data from them once you are connected. Examples include HTTP operations via fetch() or XMLHttpRequest calls. Stream data is read sequentially in small pieces called chunks. The chunks placed in a stream are said to be enqueued. This means they are waiting in a queue ready to be read. An internal queue keeps track of the chunks that have not yet been read. A queuing strategy is an object that determines how a stream should signal backpressure based on the state of its internal queue. The queuing strategy assigns a size to each chunk, and compares the total size of all chunks in the queue to a specified number, known as the high water mark. The chunks inside the stream are read by a reader. This reader retrieves the data one chunk at a time, allowing you to do whatever kind of operation you want to do on it. The reader plus the other processing code that goes along with it is called a consumer. The next construct in this context is called a controller.
Each readable stream has an associated controller that, as the name suggests, allows you to control the stream. Only one reader can read a stream at a time; when a reader is created and starts reading a stream (that is, becomes an active reader), it is locked to it. If you want another reader to take over reading your stream, you typically need to release the first reader before you do anything else (although you can tee streams). Creating a readable stream # You create a readable stream by calling its constructor ReadableStream(). The constructor has an optional argument underlyingSource, which represents an object with methods and properties that define how the constructed stream instance will behave. The underlyingSource # This can use the following optional, developer-defined methods: start(controller): Called immediately when the object is constructed. The method can access the stream source, and do anything else required to set up the stream functionality. If this process is to be done asynchronously, the method can return a promise to signal success or failure. The controller parameter passed to this method is a ReadableStreamDefaultController. pull(controller): Can be used to control the stream as more chunks are fetched. It is called repeatedly as long as the stream's internal queue of chunks is not full, up until the queue reaches its high water mark. If the result of calling pull() is a promise, pull() will not be called again until said promise fulfills. If the promise rejects, the stream will become errored. cancel(reason): Called when the stream consumer cancels the stream. const readableStream = new ReadableStream({ start(controller) { /* … */ }, pull(controller) { /* … */ }, cancel(reason) { /* … */ }, }); The ReadableStreamDefaultController supports the following methods: ReadableStreamDefaultController.close() closes the associated stream. ReadableStreamDefaultController.enqueue() enqueues a given chunk in the associated stream. 
ReadableStreamDefaultController.error() causes any future interactions with the associated stream to error. /* … */ start(controller) { controller.enqueue('The first chunk!'); }, /* … */ The queuingStrategy # The second, likewise optional, argument of the ReadableStream() constructor is queuingStrategy. It is an object that optionally defines a queuing strategy for the stream, which takes two parameters: highWaterMark: A non-negative number indicating the high water mark of the stream using this queuing strategy. size(chunk): A function that computes and returns the finite non-negative size of the given chunk value. The result is used to determine backpressure, manifesting via the appropriate ReadableStreamDefaultController.desiredSize property. It also governs when the underlying source's pull() method is called. const readableStream = new ReadableStream({ /* … */ }, { highWaterMark: 10, size(chunk) { return chunk.length; }, }, ); You could define your own custom queuingStrategy, or use an instance of ByteLengthQueuingStrategy or CountQueuingStrategy for this object's value. If no queuingStrategy is supplied, the default used is the same as a CountQueuingStrategy with a highWaterMark of 1. The getReader() and read() methods # To read from a readable stream, you need a reader, which will be a ReadableStreamDefaultReader. The getReader() method of the ReadableStream interface creates a reader and locks the stream to it. While the stream is locked, no other reader can be acquired until this one is released. The read() method of the ReadableStreamDefaultReader interface returns a promise providing access to the next chunk in the stream's internal queue. It fulfills or rejects with a result depending on the state of the stream. The different possibilities are as follows: If a chunk is available, the promise will be fulfilled with an object of the form { value: chunk, done: false }. 
If the stream becomes closed, the promise will be fulfilled with an object of the form { value: undefined, done: true }. If the stream becomes errored, the promise will be rejected with the relevant error. const reader = readableStream.getReader(); while (true) { const { done, value } = await reader.read(); if (done) { console.log('The stream is done.'); break; } console.log('Just read a chunk:', value); } The locked property # You can check if a readable stream is locked by accessing its ReadableStream.locked property. const locked = readableStream.locked; console.log(`The stream is ${locked ? 'indeed' : 'not'} locked.`); Readable stream code samples # The code sample below shows all the steps in action. You first create a ReadableStream that in its underlyingSource argument (that is, the TimestampSource class) defines a start() method. This method tells the stream's controller to enqueue() a timestamp every second during ten seconds. Finally, it tells the controller to close() the stream. You consume this stream by creating a reader via the getReader() method and calling read() until the stream is done. class TimestampSource { #interval start(controller) { this.#interval = setInterval(() => { const string = new Date().toLocaleTimeString(); // Add the string to the stream. controller.enqueue(string); console.log(`Enqueued ${string}`); }, 1_000); setTimeout(() => { clearInterval(this.#interval); // Close the stream after 10s. controller.close(); }, 10_000); } cancel() { // This is called if the reader cancels. clearInterval(this.#interval); } } const stream = new ReadableStream(new TimestampSource()); async function concatStringStream(stream) { let result = ''; const reader = stream.getReader(); while (true) { // The `read()` method returns a promise that // resolves when a value has been received. const { done, value } = await reader.read(); // Result objects contain two properties: // `done` - `true` if the stream has already given you all its data. 
// `value` - Some data. Always `undefined` when `done` is `true`. if (done) return result; result += value; console.log(`Read ${result.length} characters so far`); console.log(`Most recently read chunk: ${value}`); } } concatStringStream(stream).then((result) => console.log('Stream complete', result)); Asynchronous iteration # Checking upon each read() loop iteration if the stream is done may not be the most convenient API. Luckily there will soon be a better way to do this: asynchronous iteration. for await (const chunk of stream) { console.log(chunk); } Caution: Asynchronous iteration is not yet implemented in any browser. A workaround to use asynchronous iteration today is to implement the behavior with a helper function. This allows you to use the feature in your code as shown in the snippet below. function streamAsyncIterator(stream) { // Get a lock on the stream: const reader = stream.getReader(); return { next() { // Stream reads already resolve with {done, value}, so // we can just call read: return reader.read(); }, return() { // Release the lock if the iterator terminates. reader.releaseLock(); return {}; }, // for-await calls this on whatever it's passed, so // iterators tend to return themselves. [Symbol.asyncIterator]() { return this; }, }; } async function example() { const response = await fetch(url); for await (const chunk of streamAsyncIterator(response.body)) { console.log(chunk); } } Teeing a readable stream # The tee() method of the ReadableStream interface tees the current readable stream, returning a two-element array containing the two resulting branches as new ReadableStream instances. This allows two readers to read a stream simultaneously. You might do this, for example, in a service worker if you want to fetch a response from the server and stream it to the browser, but also stream it to the service worker cache. Since a response body cannot be consumed more than once, you need two copies to do this. 
To cancel the stream, you then need to cancel both resulting branches. Teeing a stream will generally lock it for the duration, preventing other readers from locking it. const readableStream = new ReadableStream({ start(controller) { // Called by constructor. console.log('[start]'); controller.enqueue('a'); controller.enqueue('b'); controller.enqueue('c'); }, pull(controller) { // Called when the controller's queue is empty. console.log('[pull]'); controller.enqueue('d'); controller.close(); }, cancel(reason) { // Called when the stream is canceled. console.log('[cancel]', reason); }, }); // Create two `ReadableStream`s. const [streamA, streamB] = readableStream.tee(); // Read streamA iteratively one by one. Typically, you // would not do it this way, but you certainly can. const readerA = streamA.getReader(); console.log('[A]', await readerA.read()); //=> {value: "a", done: false} console.log('[A]', await readerA.read()); //=> {value: "b", done: false} console.log('[A]', await readerA.read()); //=> {value: "c", done: false} console.log('[A]', await readerA.read()); //=> {value: "d", done: false} console.log('[A]', await readerA.read()); //=> {value: undefined, done: true} // Read streamB in a loop. This is the more common way // to read data from the stream. const readerB = streamB.getReader(); while (true) { const result = await readerB.read(); if (result.done) break; console.log('[B]', result); } Readable byte streams # For streams representing bytes, an extended version of the readable stream is provided to handle bytes efficiently, in particular by minimizing copies. Byte streams allow for bring-your-own-buffer (BYOB) readers to be acquired. The default implementation can give a range of different outputs such as strings or array buffers in the case of WebSockets, whereas byte streams guarantee byte output. In addition, BYOB readers have stability benefits.
This is because if a buffer detaches, it can guarantee that one does not write into the same buffer twice, hence avoiding race conditions. BYOB readers can reduce the number of times the browser needs to run garbage collection, because it can reuse buffers. Creating a readable byte stream # You can create a readable byte stream by passing an additional type parameter to the ReadableStream() constructor. new ReadableStream({ type: 'bytes' }); The underlyingSource # The underlying source of a readable byte stream is given a ReadableByteStreamController to manipulate. Its ReadableByteStreamController.enqueue() method takes a chunk argument whose value is an ArrayBufferView. The property ReadableByteStreamController.byobRequest returns the current BYOB pull request, or null if there is none. Finally, the ReadableByteStreamController.desiredSize property returns the desired size to fill the controlled stream's internal queue. The queuingStrategy # The second, likewise optional, argument of the ReadableStream() constructor is queuingStrategy. It is an object that optionally defines a queuing strategy for the stream, which takes one parameter: highWaterMark: A non-negative number of bytes indicating the high water mark of the stream using this queuing strategy. This is used to determine backpressure, manifesting via the appropriate ReadableByteStreamController.desiredSize property. It also governs when the underlying source's pull() method is called. Unlike queuing strategies for other stream types, a queuing strategy for a readable byte stream does not have a size(chunk) function. The size of each chunk is always determined by its byteLength property. If no queuingStrategy is supplied, the default used is one with a highWaterMark of 0. The getReader() and read() methods # You can then get access to a ReadableStreamBYOBReader by setting the mode parameter accordingly: ReadableStream.getReader({ mode: "byob" }). 
This allows for more precise control over buffer allocation in order to avoid copies. To read from the byte stream, you need to call ReadableStreamBYOBReader.read(view), where view is an ArrayBufferView. Readable byte stream code sample # const reader = readableStream.getReader({ mode: "byob" }); let startingAB = new ArrayBuffer(1_024); const buffer = await readInto(startingAB); console.log("The first 1024 bytes, or less:", buffer); async function readInto(buffer) { let offset = 0; while (offset < buffer.byteLength) { const { value: view, done } = await reader.read(new Uint8Array(buffer, offset, buffer.byteLength - offset)); buffer = view.buffer; if (done) { break; } offset += view.byteLength; } return buffer; } The following function returns readable byte streams that allow for efficient zero-copy reading of a randomly generated array. Instead of using a predetermined chunk size of 1,024, it attempts to fill the developer-supplied buffer, allowing for full control. const DEFAULT_CHUNK_SIZE = 1_024; function makeReadableByteStream() { return new ReadableStream({ type: 'bytes', pull(controller) { // Even when the consumer is using the default reader, // the auto-allocation feature allocates a buffer and // passes it to us via `byobRequest`. const view = controller.byobRequest.view; // Fill the view in place with random values. crypto.getRandomValues(view); controller.byobRequest.respond(view.byteLength); }, autoAllocateChunkSize: DEFAULT_CHUNK_SIZE, }); } The mechanics of a writable stream # A writable stream is a destination into which you can write data, represented in JavaScript by a WritableStream object. This serves as an abstraction over the top of an underlying sink, a lower-level I/O sink into which raw data is written. The data is written to the stream via a writer, one chunk at a time. A chunk can take a multitude of forms, just like the chunks in a reader. You can use whatever code you like to produce the chunks ready for writing; the writer plus the associated code is called a producer.
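To make the vocabulary concrete, here is a minimal sketch of a producer writing into a sink. The names `makeCollectingSink` and `produce` are hypothetical and only serve this illustration, and the sketch assumes an environment where `WritableStream` is a global (modern browsers, or Node.js 18 and later). The writer methods it uses are explained in detail in the sections that follow.

```javascript
// Hypothetical underlying sink that simply remembers every chunk it is given.
function makeCollectingSink() {
  const received = [];
  const stream = new WritableStream({
    write(chunk) {
      received.push(chunk);
    },
  });
  return { stream, received };
}

// The producer: a writer plus the code that generates the chunks.
async function produce(stream, chunks) {
  const writer = stream.getWriter();
  for (const chunk of chunks) {
    // Respect backpressure before handing over the next chunk.
    await writer.ready;
    await writer.write(chunk);
  }
  await writer.close();
}

const { stream, received } = makeCollectingSink();
produce(stream, ['one', 'two', 'three']).then(() => {
  // Logs the chunks in the order the sink saw them.
  console.log(received);
});
```

The sink here only stores chunks in memory; a real sink would typically forward them to a file, a socket, or the DOM.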
When a writer is created and starts writing to a stream (an active writer), it is said to be locked to it. Only one writer can write to a writable stream at one time. If you want another writer to start writing to your stream, you typically need to release it, before you then attach another writer to it. An internal queue keeps track of the chunks that have been written to the stream but not yet been processed by the underlying sink. A queuing strategy is an object that determines how a stream should signal backpressure based on the state of its internal queue. The queuing strategy assigns a size to each chunk, and compares the total size of all chunks in the queue to a specified number, known as the high water mark. The final construct is called a controller. Each writable stream has an associated controller that allows you to control the stream (for example, to abort it). Creating a writable stream # The WritableStream interface of the Streams API provides a standard abstraction for writing streaming data to a destination, known as a sink. This object comes with built-in backpressure and queuing. You create a writable stream by calling its constructor WritableStream(). It has an optional underlyingSink parameter, which represents an object with methods and properties that define how the constructed stream instance will behave. The underlyingSink # The underlyingSink can include the following optional, developer-defined methods. The controller parameter passed to some of the methods is a WritableStreamDefaultController. start(controller): This method is called immediately when the object is constructed. The contents of this method should aim to get access to the underlying sink. If this process is to be done asynchronously, it can return a promise to signal success or failure. write(chunk, controller): This method will be called when a new chunk of data (specified in the chunk parameter) is ready to be written to the underlying sink. 
It can return a promise to signal success or failure of the write operation. This method will be called only after previous writes have succeeded, and never after the stream is closed or aborted. close(controller): This method will be called if the app signals that it has finished writing chunks to the stream. The contents should do whatever is necessary to finalize writes to the underlying sink, and release access to it. If this process is asynchronous, it can return a promise to signal success or failure. This method will be called only after all queued-up writes have succeeded. abort(reason): This method will be called if the app signals that it wishes to abruptly close the stream and put it in an errored state. It can clean up any held resources, much like close(), but abort() will be called even if writes are queued up. Those chunks will be thrown away. If this process is asynchronous, it can return a promise to signal success or failure. The reason parameter contains a DOMString describing why the stream was aborted. const writableStream = new WritableStream({ start(controller) { /* … */ }, write(chunk, controller) { /* … */ }, close(controller) { /* … */ }, abort(reason) { /* … */ }, }); The WritableStreamDefaultController interface of the Streams API represents a controller allowing control of a WritableStream's state during set up, as more chunks are submitted for writing, or at the end of writing. When constructing a WritableStream, the underlying sink is given a corresponding WritableStreamDefaultController instance to manipulate. The WritableStreamDefaultController has only one method: WritableStreamDefaultController.error(), which causes any future interactions with the associated stream to error. /* … */ write(chunk, controller) { try { // Try to do something dangerous with `chunk`. 
} catch (error) { controller.error(error.message); } }, /* … */ The queuingStrategy # The second, likewise optional, argument of the WritableStream() constructor is queuingStrategy. It is an object that optionally defines a queuing strategy for the stream, which takes two parameters: highWaterMark: A non-negative number indicating the high water mark of the stream using this queuing strategy. size(chunk): A function that computes and returns the finite non-negative size of the given chunk value. The result is used to determine backpressure, manifesting via the appropriate WritableStreamDefaultWriter.desiredSize property. You could define your own custom queuingStrategy, or use an instance of ByteLengthQueuingStrategy or CountQueuingStrategy for this object value. If no queuingStrategy is supplied, the default used is the same as a CountQueuingStrategy with a highWaterMark of 1. The getWriter() and write() methods # To write to a writable stream, you need a writer, which will be a WritableStreamDefaultWriter. The getWriter() method of the WritableStream interface returns a new instance of WritableStreamDefaultWriter and locks the stream to that instance. While the stream is locked, no other writer can be acquired until the current one is released. The write() method of the WritableStreamDefaultWriter interface writes a passed chunk of data to a WritableStream and its underlying sink, then returns a promise that resolves to indicate the success or failure of the write operation. Note that what "success" means is up to the underlying sink; it might indicate that the chunk has been accepted, and not necessarily that it is safely saved to its ultimate destination. const writer = writableStream.getWriter(); const resultPromise = writer.write('The first chunk!'); The locked property # You can check if a writable stream is locked by accessing its WritableStream.locked property. const locked = writableStream.locked; console.log(`The stream is ${locked ? 
'indeed' : 'not'} locked.`); Writable stream code sample # The code sample below shows all steps in action. const writableStream = new WritableStream({ start(controller) { console.log('[start]'); }, async write(chunk, controller) { console.log('[write]', chunk); // Wait for next write. await new Promise((resolve) => setTimeout(() => { document.body.textContent += chunk; resolve(); }, 1_000)); }, close(controller) { console.log('[close]'); }, abort(reason) { console.log('[abort]', reason); }, }); const writer = writableStream.getWriter(); const start = Date.now(); for (const char of 'abcdefghijklmnopqrstuvwxyz') { // Wait to add to the write queue. await writer.ready; console.log('[ready]', Date.now() - start, 'ms'); // The Promise is resolved after the write finishes. writer.write(char); } await writer.close(); Piping a readable stream to a writable stream # A readable stream can be piped to a writable stream through the readable stream's pipeTo() method. ReadableStream.pipeTo() pipes the current ReadableStream to a given WritableStream and returns a promise that fulfills when the piping process completes successfully, or rejects if any errors were encountered. const readableStream = new ReadableStream({ start(controller) { // Called by constructor. console.log('[start readable]'); controller.enqueue('a'); controller.enqueue('b'); controller.enqueue('c'); }, pull(controller) { // Called when controller's queue is empty. console.log('[pull]'); controller.enqueue('d'); controller.close(); }, cancel(reason) { // Called when the stream is canceled. console.log('[cancel]', reason); }, }); const writableStream = new WritableStream({ start(controller) { // Called by constructor console.log('[start writable]'); }, async write(chunk, controller) { // Called upon writer.write() console.log('[write]', chunk); // Wait for next write.
await new Promise((resolve) => setTimeout(() => { document.body.textContent += chunk; resolve(); }, 1_000)); }, close(controller) { console.log('[close]'); }, abort(reason) { console.log('[abort]', reason); }, }); await readableStream.pipeTo(writableStream); console.log('[finished]'); Creating a transform stream # The TransformStream interface of the Streams API represents a set of transformable data. You create a transform stream by calling its constructor TransformStream(), which creates and returns a transform stream object from the given handlers. The TransformStream() constructor accepts as its first argument an optional JavaScript object representing the transformer. Such objects can contain any of the following methods: The transformer # start(controller): This method is called immediately when the object is constructed. Typically this is used to enqueue prefix chunks, using controller.enqueue(). Those chunks will be read from the readable side but do not depend on any writes to the writable side. If this initial process is asynchronous, for example because it takes some effort to acquire the prefix chunks, the function can return a promise to signal success or failure; a rejected promise will error the stream. Any thrown exceptions will be re-thrown by the TransformStream() constructor. transform(chunk, controller): This method is called when a new chunk originally written to the writable side is ready to be transformed. The stream implementation guarantees that this function will be called only after previous transforms have succeeded, and never before start() has completed or after flush() has been called. This function performs the actual transformation work of the transform stream. It can enqueue the results using controller.enqueue(). This permits a single chunk written to the writable side to result in zero or multiple chunks on the readable side, depending on how many times controller.enqueue() is called. 
If the process of transforming is asynchronous, this function can return a promise to signal success or failure of the transformation. A rejected promise will error both the readable and writable sides of the transform stream. If no transform() method is supplied, the identity transform is used, which enqueues chunks unchanged from the writable side to the readable side. flush(controller): This method is called after all chunks written to the writable side have been transformed by successfully passing through transform(), and the writable side is about to be closed. Typically this is used to enqueue suffix chunks to the readable side, before that too becomes closed. If the flushing process is asynchronous, the function can return a promise to signal success or failure; the result will be communicated to the caller of stream.writable.write(). Additionally, a rejected promise will error both the readable and writable sides of the stream. Throwing an exception is treated the same as returning a rejected promise. const transformStream = new TransformStream({ start(controller) { /* … */ }, transform(chunk, controller) { /* … */ }, flush(controller) { /* … */ }, }); The writableStrategy and readableStrategy queueing strategies # The second and third optional parameters of the TransformStream() constructor are optional writableStrategy and readableStrategy queueing strategies. They are defined as outlined in the readable and the writable stream sections respectively. Transform stream code sample # The following code sample shows a simple transform stream in action. // Note that `TextEncoderStream` and `TextDecoderStream` exist now. // This example shows how you would have done it before. 
const textEncoderStream = new TransformStream({ transform(chunk, controller) { console.log('[transform]', chunk); controller.enqueue(new TextEncoder().encode(chunk)); }, flush(controller) { console.log('[flush]'); controller.terminate(); }, }); (async () => { const readStream = textEncoderStream.readable; const writeStream = textEncoderStream.writable; const writer = writeStream.getWriter(); for (const char of 'abc') { writer.write(char); } writer.close(); const reader = readStream.getReader(); for (let result = await reader.read(); !result.done; result = await reader.read()) { console.log('[value]', result.value); } })(); Piping a readable stream through a transform stream # The pipeThrough() method of the ReadableStream interface provides a chainable way of piping the current stream through a transform stream or any other writable/readable pair. Piping a stream will generally lock it for the duration of the pipe, preventing other readers from locking it. const transformStream = new TransformStream({ transform(chunk, controller) { console.log('[transform]', chunk); controller.enqueue(new TextEncoder().encode(chunk)); }, flush(controller) { console.log('[flush]'); controller.terminate(); }, }); const readableStream = new ReadableStream({ start(controller) { // Called by the constructor. console.log('[start]'); controller.enqueue('a'); controller.enqueue('b'); controller.enqueue('c'); }, pull(controller) { // Called when the controller's queue is empty. console.log('[pull]'); controller.enqueue('d'); controller.close(); // or controller.error(); }, cancel(reason) { // Called when the stream is canceled. console.log('[cancel]', reason); }, }); (async () => { const reader = readableStream.pipeThrough(transformStream).getReader(); for (let result = await reader.read(); !result.done; result = await reader.read()) { console.log('[value]', result.value); } })(); The next code sample (a bit contrived) shows how you could implement a "shouting" version of fetch() that uppercases all
text by consuming the returned response promise as a stream and uppercasing chunk by chunk. The advantage of this approach is that you do not need to wait for the whole document to be downloaded, which can make a huge difference when dealing with large files. function upperCaseStream() { return new TransformStream({ transform(chunk, controller) { controller.enqueue(chunk.toUpperCase()); }, }); } function appendToDOMStream(el) { return new WritableStream({ write(chunk) { el.append(chunk); } }); } fetch('./lorem-ipsum.txt').then((response) => response.body .pipeThrough(new TextDecoderStream()) .pipeThrough(upperCaseStream()) .pipeTo(appendToDOMStream(document.body)) ); Browser support and polyfill # Support for the Streams API in browsers varies. Be sure to check Can I use for detailed compatibility data. Note that some browsers only have partial implementations of certain features, so be sure to check the data thoroughly. The good news is that there is a reference implementation available and a polyfill targeted at production use. Gotchas! If possible, load the polyfill conditionally and only if the built-in feature is not available. Demo # The demo below shows readable, writable, and transform streams in action. It also includes examples of pipeThrough() and pipeTo() pipe chains, and also demonstrates tee(). You can optionally run the demo in its own window or view the source code. Useful streams available in the browser # There are a number of useful streams built right into the browser. You can easily create a ReadableStream from a blob. The Blob interface's stream() method returns a ReadableStream which upon reading returns the data contained within the blob. Also recall that a File object is a specific kind of a Blob, and can be used in any context that a blob can. 
const readableStream = new Blob(['hello world'], { type: 'text/plain' }).stream(); The streaming variants of TextDecoder.decode() and TextEncoder.encode() are called TextDecoderStream and TextEncoderStream respectively. const response = await fetch('https://streams.spec.whatwg.org/'); const decodedStream = response.body.pipeThrough(new TextDecoderStream()); Compressing or decompressing a file is easy with the CompressionStream and DecompressionStream transform streams respectively. The code sample below shows how you can download the Streams spec, compress (gzip) it right in the browser, and write the compressed file directly to disk. const response = await fetch('https://streams.spec.whatwg.org/'); const readableStream = response.body; const compressedStream = readableStream.pipeThrough(new CompressionStream('gzip')); const fileHandle = await showSaveFilePicker(); const writableStream = await fileHandle.createWritable(); compressedStream.pipeTo(writableStream); The File System Access API's FileSystemWritableFileStream and the experimental fetch() request streams are examples of writable streams in the wild. The Serial API makes heavy use of both readable and writable streams. // Prompt user to select any serial port. const port = await navigator.serial.requestPort(); // Wait for the serial port to open. await port.open({ baudRate: 9_600 }); const reader = port.readable.getReader(); // Listen to data coming from the serial device. while (true) { const { value, done } = await reader.read(); if (done) { // Allow the serial port to be closed later. reader.releaseLock(); break; } // value is a Uint8Array. console.log(value); } // Write to the serial port. const writer = port.writable.getWriter(); const data = new Uint8Array([104, 101, 108, 108, 111]); // hello await writer.write(data); // Allow the serial port to be closed later. writer.releaseLock(); Finally, the WebSocketStream API integrates streams with the WebSocket API. 
const wss = new WebSocketStream(WSS_URL); const { readable, writable } = await wss.connection; const reader = readable.getReader(); const writer = writable.getWriter(); while (true) { const { value, done } = await reader.read(); if (done) { break; } const result = await process(value); await writer.write(result); } Useful resources # Streams specification Accompanying demos Streams polyfill 2016—the year of web streams Async iterators and generators Stream Visualizer Acknowledgements # This article was reviewed by Jake Archibald, François Beaufort, Sam Dutton, Mattias Buelens, Surma, Joe Medley, and Adam Rice. Jake Archibald's blog posts have helped me a lot in understanding streams. Some of the code samples are inspired by GitHub user @bellbind's explorations and parts of the prose build heavily on the MDN Web Docs on Streams. The Streams Standard's authors have done a tremendous job on writing this spec. Hero image by Ryan Lara on Unsplash.

Building a Tabs component

In this post I want to share thinking on building a Tabs component for the web that is responsive, supports multiple device inputs, and works across browsers. Try the demo. Demo If you prefer video, here's a YouTube version of this post: Overview # Tabs are a common component of design systems but can take many shapes and forms. First there were desktop tabs built on the <frame> element, and now we have buttery mobile components that animate content based on physics properties. They're all trying to do the same thing: save space. Today, the essence of a tabs user experience is a button navigation area that toggles the visibility of content in a display frame. Many different content areas share the same space, but are conditionally presented based on the button selected in the navigation. Web Tactics # All in all I found this component pretty straightforward to build, thanks to a few critical web platform features: scroll-snap-points for elegant swipe and keyboard interactions with appropriate scroll stop positions; deep links via URL hashes for browser-handled in-page scroll anchoring and sharing support; screen reader support with <a> and id="#hash" element markup; prefers-reduced-motion for enabling crossfade transitions and instant in-page scrolling; and the in-draft @scroll-timeline web feature for dynamically underlining and color changing the selected tab. The HTML # Fundamentally, the UX here is: click a link, have the URL represent the nested page state, and then see the content area update as the browser scrolls to the matching element. There are some structural content members in there: links and :targets. We need a list of links, which a <nav> is great for, and a list of <article> elements, which a <section> is great for. Each link hash will match a section, letting the browser scroll things via anchoring. A link button is clicked, sliding in focused content For example, clicking a link automatically focuses the :target article in Chrome 89, no JS required.
The user can then scroll the article content with their input device as always. It's complementary content, as indicated in the markup. I used the following markup to organize the tabs: <snap-tabs> <header> <nav> <a></a> <a></a> <a></a> <a></a> </nav> </header> <section> <article></article> <article></article> <article></article> <article></article> </section> </snap-tabs> I can establish connections between the <a> and <article> elements with href and id properties like this: <snap-tabs> <header> <nav> <a href="#responsive"></a> <a href="#accessible"></a> <a href="#overscroll"></a> <a href="#more"></a> </nav> </header> <section> <article id="responsive"></article> <article id="accessible"></article> <article id="overscroll"></article> <article id="more"></article> </section> </snap-tabs> I next filled the articles with mixed amounts of lorem, and the links with a mixed length and image set of titles. With content to work with, we can begin layout. Scrolling layouts # There are 3 different types of scroll areas in this component: The navigation (pink) is horizontally scrollable. The content area (blue) is horizontally scrollable. Each article item (green) is vertically scrollable. There are 2 different types of elements involved with scrolling: A window: a box with defined dimensions that has the overflow property style. An oversized surface: in this layout, it's the list containers: nav links, section articles, and article contents. <snap-tabs> layout # The top level layout I chose was flex (Flexbox). I set the direction to column, so the header and section are vertically ordered. This is our first scroll window, and it hides everything with overflow hidden. The header and section will employ overscroll soon, as individual zones.
HTML <snap-tabs> <header></header> <section></section> </snap-tabs> CSS snap-tabs { display: flex; flex-direction: column; /* establish primary containing box */ overflow: hidden; position: relative; & > section { /* be pushy about consuming all space */ block-size: 100%; } & > header { /* defend against <section> needing 100% */ flex-shrink: 0; /* fixes cross-browser quirks */ min-block-size: fit-content; } } Pointing back to the colorful 3-scroll diagram: <header> is now prepared to be the (pink) scroll container. <section> is prepared to be the (blue) scroll container. The frames I've highlighted below with VisBug help us see the windows the scroll containers have created. Tabs <header> layout # The next layout is nearly the same: I use flex to create vertical ordering. HTML <snap-tabs> <header> <nav></nav> <span class="snap-indicator"></span> </header> <section></section> </snap-tabs> CSS header { display: flex; flex-direction: column; } The .snap-indicator should travel horizontally with the group of links, and this header layout helps set that stage. No absolutely positioned elements here! Next, the scroll styles. It turns out that we can share the scroll styles between our 2 horizontal scroll areas (header and section), so I made a utility class, .scroll-snap-x. .scroll-snap-x { /* let the browser decide if x is ok to scroll and show bars on, y hidden */ overflow: auto hidden; /* prevent scroll chaining on x scroll */ overscroll-behavior-x: contain; /* scrolling should snap children on x */ scroll-snap-type: x mandatory; @media (hover: none) { scrollbar-width: none; &::-webkit-scrollbar { width: 0; height: 0; } } } Each needs overflow on the x axis, scroll containment to trap overscroll, hidden scrollbars for touch devices, and lastly scroll-snap for locking content presentation areas. Our keyboard tab order is accessible and any interactions guide focus naturally. Scroll snap containers also get a nice carousel style interaction from their keyboard.
Tabs header <nav> layout # The nav links need to be laid out in a line, with no line breaks, vertically centered, and each link item should snap to the scroll-snap container. Swift work for 2021 CSS! HTML <nav> <a></a> <a></a> <a></a> <a></a> </nav> CSS nav { display: flex; & a { scroll-snap-align: start; display: inline-flex; align-items: center; white-space: nowrap; } } Each link styles and sizes itself, so the nav layout only needs to specify direction and flow. Unique widths on nav items makes the transition between tabs fun as the indicator adjusts its width to the new target. Depending on how many elements are in here, the browser will render a scrollbar or not. Tabs <section> layout # This section is a flex item and needs to be the dominant consumer of space. It also needs to create columns for the articles to be placed into. Again, swift work for CSS 2021! The block-size: 100% stretches this element to fill the parent as much as possible, then for its own layout, it creates a series of columns that are 100% the width of the parent. Percentages work great here because we've written strong constraints on the parent. HTML <section> <article></article> <article></article> <article></article> <article></article> </section> CSS section { block-size: 100%; display: grid; grid-auto-flow: column; grid-auto-columns: 100%; } It's as if we're saying "expand vertically as much as possible, in a pushy way" (remember the header we set to flex-shrink: 0: it is a defense against this expansion push), which sets the row height for a set of full height columns. The auto-flow style tells the grid to always lay children out in a horizontal line, no wrapping, exactly what we want; to overflow the parent window. I find these difficult to wrap my head around sometimes! This section element is fitting into a box, but also created a set of boxes. I hope the visuals and explanations are helping. 
Tabs <article> layout # The user should be able to scroll the article content, and the scrollbars should only show up if there is overflow. These article elements are in a neat position. They are simultaneously a scroll parent and a scroll child. The browser is really handling some tricky touch, mouse, and keyboard interactions for us here. HTML <article> <h2></h2> <p></p> <p></p> <h2></h2> <p></p> <p></p> ... </article> CSS article { scroll-snap-align: start; overflow-y: auto; overscroll-behavior-y: contain; } I chose to have the articles snap within their parent scroller. I really like how the navigation link items and the article elements snap to the inline-start of their respective scroll containers. It looks and feels like a harmonious relationship. The article is a grid child, and its size is predetermined to be the viewport area we want to provide scroll UX. This means I don't need any height or width styles here, I just need to define how it overflows. I set overflow-y to auto, and then also trap the scroll interactions with the handy overscroll-behavior property. 3 scroll areas recap # Below I've chosen in my system settings to "always show scrollbars". I think it's doubly important for the layout to work with this setting turned on, as it is for me to review the layout and the scroll orchestration. I think seeing the scrollbar gutter in this component helps clearly show where the scroll areas are, the direction they support, and how they interact with each other. Consider how each of these scroll window frames is also a flex or grid parent to a layout. DevTools can help us visualize this: The scroll layouts are complete: snapping, deep linkable, and keyboard accessible. Strong foundation for UX enhancements, style and delight. Feature highlight # Scroll snapped children maintain their locked position during resize. This means JavaScript won't need to bring anything into view on device rotate or browser resize.
Try it out in Chromium DevTools Device Mode by selecting any mode other than Responsive, and then resizing the device frame. Notice the element stays in view and locked with its content. This has been available since Chromium updated their implementation to match the spec. Here's a blog post about it. Animation # The goal of the animation work here is to clearly link interactions with UI feedback. This helps guide or assist the user through to their (hopefully) seamless discovery of all the content. I'll be adding motion with purpose and conditionally. Users can now specify their motion preferences in their operating system, and I thoroughly enjoy responding to their preferences in my interfaces. I'll be linking a tab underline with the article scroll position. Snapping isn't only pretty alignment, it's also anchoring the start and end of an animation. This keeps the <nav>, which acts like a mini-map, connected to the content. We'll be checking the user's motion preference from both CSS and JS. There's a few great places to be considerate! Scroll behavior # There's an opportunity to enhance the motion behavior of both :target and element.scrollIntoView(). By default, it's instant. The browser just sets the scroll position. Well, what if we want to transition to that scroll position, instead of blink there? @media (prefers-reduced-motion: no-preference) { .scroll-snap-x { scroll-behavior: smooth; } } Since we're introducing motion here, and motion that the user doesn't control (like scrolling), we only apply this style if the user has no preference in their operating system around reduced motion. This way, we only introduce scroll motion for folks who are OK with it. Tabs indicator # The purpose of this animation is to help associate the indicator with the state of the content. I decided to color crossfade border-bottom styles for users who prefer reduced motion, and a scroll linked sliding + color fade animation for users who are OK with motion. 
In Chromium DevTools, I can toggle the preference and demonstrate the 2 different transition styles. I had a ton of fun building this. @media (prefers-reduced-motion: reduce) { snap-tabs > header a { border-block-end: var(--indicator-size) solid hsl(var(--accent) / 0%); transition: color .7s ease, border-color .5s ease; &:is(:target,:active,[active]) { color: var(--text-active-color); border-block-end-color: hsl(var(--accent)); } } snap-tabs .snap-indicator { visibility: hidden; } } I hide the .snap-indicator when the user prefers reduced motion since I don't need it anymore. Then I replace it with border-block-end styles and a transition. Also notice in the tabs interaction that the active nav item not only has a brand underline highlight, but its text color is also darker. The active element has higher text color contrast and a bright underline accent. Just a few extra lines of CSS will make someone feel seen (in the sense that we're thoughtfully respecting their motion preferences). I love it. @scroll-timeline # In the above section I showed you how I handle the reduced motion crossfade styles, and in this section I'll show you how I linked the indicator and a scroll area together. This is some fun experimental stuff up next. I hope you're as excited as me. const { matches:motionOK } = window.matchMedia( '(prefers-reduced-motion: no-preference)' ); I first check the user's motion preference from JavaScript. If the result of this is false, meaning the user prefers reduced motion, then we'll not run any of the scroll linking motion effects. if (motionOK) { // motion based animation code } At the time of writing this, there is no browser support for @scroll-timeline. It's a draft spec with only experimental implementations. It has a polyfill though, which I use in this demo. ScrollTimeline # While CSS and JavaScript can both create scroll timelines, I opted into JavaScript so I could use live element measurements in the animation.
const sectionScrollTimeline = new ScrollTimeline({ scrollSource: tabsection, // snap-tabs > section orientation: 'inline', // scroll in the direction letters flow fill: 'both', // bi-directional linking }); I want 1 thing to follow another's scroll position, and by creating a ScrollTimeline I define the driver of the scroll link, the scrollSource. Normally an animation on the web runs against a global time frame tick, but with a custom sectionScrollTimeline in memory, I can change all that. tabindicator.animate({ transform: ..., width: ..., }, { duration: 1000, fill: 'both', timeline: sectionScrollTimeline, } ); Before I get into the keyframes of the animation, I think it's important to point out the follower of the scrolling, tabindicator, will be animated based on a custom timeline, our section's scroll. This completes the linkage, but is missing the final ingredient, stateful points to animate between, also known as keyframes. Dynamic keyframes # There's a really powerful pure declarative CSS way to animate with @scroll-timeline, but the animation I chose to do was too dynamic. There's no way to transition between auto width, and there's no way to dynamically create a number of keyframes based on children length. JavaScript knows how to get that information though, so we'll iterate over the children ourselves and grab the computed values at runtime: tabindicator.animate({ transform: [...tabnavitems].map(({offsetLeft}) => `translateX(${offsetLeft}px)`), width: [...tabnavitems].map(({offsetWidth}) => `${offsetWidth}px`) }, { duration: 1000, fill: 'both', timeline: sectionScrollTimeline, } ); For each tabnavitem, destructure the offsetLeft position and return a string that uses it as a translateX value. This creates 4 transform keyframes for the animation. The same is done for width, each is asked what its dynamic width is and then it's used as a keyframe value. 
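The mapping described above can be tried outside the browser. This is only a sketch: `buildIndicatorKeyframes` is a hypothetical helper name that is not part of the demo, and the plain objects stand in for real `<a>` elements, with illustrative measurements that will differ with fonts and layout.

```javascript
// Hypothetical helper (not from the demo): turns measured nav items into the
// keyframe arrays that element.animate() accepts. Anything exposing
// offsetLeft and offsetWidth works, so plain objects can stand in for <a> elements.
const buildIndicatorKeyframes = (items) => ({
  transform: [...items].map(({offsetLeft}) => `translateX(${offsetLeft}px)`),
  width: [...items].map(({offsetWidth}) => `${offsetWidth}px`),
});

// Illustrative measurements only; real values depend on fonts and layout.
const sampleNavItems = [
  {offsetLeft: 0, offsetWidth: 121},
  {offsetLeft: 121, offsetWidth: 117},
  {offsetLeft: 238, offsetWidth: 226},
  {offsetLeft: 464, offsetWidth: 67},
];

const keyframes = buildIndicatorKeyframes(sampleNavItems);
console.log(keyframes.transform); // one translateX() string per nav item
console.log(keyframes.width);     // one pixel width string per nav item
```

In the browser, the returned object could be passed as the first argument to `tabindicator.animate()` in place of the inline object literal.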
Here's example output, based on my fonts and browser preferences: TranslateX Keyframes: [...tabnavitems].map(({offsetLeft}) => `translateX(${offsetLeft}px)`) // results in 4 array items, which represent 4 keyframe states // ["translateX(0px)", "translateX(121px)", "translateX(238px)", "translateX(464px)"] Width Keyframes: [...tabnavitems].map(({offsetWidth}) => `${offsetWidth}px`) // results in 4 array items, which represent 4 keyframe states // ["121px", "117px", "226px", "67px"] To summarize the strategy, the tab indicator will now animate across 4 keyframes depending on the scroll snap position of the section scroller. The snap points create clear delineation between our keyframes and really add to the synchronized feel of the animation. The user drives the animation with their interaction, seeing the width and position of the indicator change from one section to the next, tracking perfectly with scroll. You may not have noticed, but I'm very proud of the transition of color as the highlighted navigation item becomes selected. The unselected lighter grey appears even more pushed back when the highlighted item has more contrast. It's common to transition color for text, like on hover and when selected, but it's next-level to transition that color on scroll, synchronized with the underline indicator. Here's how I did it: tabnavitems.forEach(navitem => { navitem.animate({ color: [...tabnavitems].map(item => item === navitem ? `var(--text-active-color)` : `var(--text-color)`) }, { duration: 1000, fill: 'both', timeline: sectionScrollTimeline, } ); }); Each tab nav link needs this new color animation, tracking the same scroll timeline as the underline indicator. I use the same timeline as before: since its role is to emit a tick on scroll, we can use that tick in any type of animation we want. As I did before, I create 4 keyframes in the loop, and return colors. [...tabnavitems].map(item => item === navitem ?
`var(--text-active-color)` : `var(--text-color)`) // results in 4 array items, which represent 4 keyframe states // [ "var(--text-active-color)", "var(--text-color)", "var(--text-color)", "var(--text-color)", ] The keyframe with the color var(--text-active-color) highlights the link, and it's otherwise a standard text color. The nested loop there makes it relatively straightforward, as the outer loop is each nav item, and the inner loop is each navitem's personal keyframes. I check if the outer loop element is the same as the inner loop one, and use that to know when it's selected. I had a lot of fun writing this. So much. Even more JavaScript enhancements # It's worth a reminder that the core of what I'm showing you here works without JavaScript. With that said, let's see how we can enhance it when JS is available. Deep links # Deep links are more of a mobile term, but I think the intent of the deep link is met here with tabs in that you can share a URL directly to a tab's contents. The browser will in-page navigate to the ID that is matched in the URL hash. I found this onload handler made the effect across platforms. window.onload = () => { if (location.hash) { tabsection.scrollLeft = document .querySelector(location.hash) .offsetLeft; } } Scroll end synchronization # Our users aren't always clicking or using a keyboard, sometimes they're just free scrolling, as they should be able to. When the section scroller stops scrolling, wherever it lands needs to be matched in the top navigation bar. Here's how I wait for scroll end: tabsection.addEventListener('scroll', () => { clearTimeout(tabsection.scrollEndTimer); tabsection.scrollEndTimer = setTimeout(determineActiveTabSection, 100); }); Whenever the sections are being scrolled, clear the section timeout if there, and start a new one. When sections stop being scrolled, don't clear the timeout, and fire 100ms after resting. When it fires, call function that seeks to figure out where the user stopped. 
const determineActiveTabSection = () => { const i = Math.round(tabsection.scrollLeft / tabsection.clientWidth); const matchingNavItem = tabnavitems[i]; matchingNavItem && setActiveTab(matchingNavItem); }; Assuming the scroll snapped, dividing the current scroll position by the width of the scroll area should result in an integer and not a decimal; Math.round() absorbs any sub-pixel remainder so the index lookup doesn't miss. I then try to grab a navitem from our cache via this calculated index, and if it finds something, I send the match to be set active. const setActiveTab = tabbtn => { tabnav .querySelector(':scope a[active]') ?.removeAttribute('active'); tabbtn.setAttribute('active', ''); tabbtn.scrollIntoView(); }; Setting the active tab starts by clearing any currently active tab (the optional chaining skips this step when no link has the active attribute yet), then giving the incoming nav item the active state attribute. The call to scrollIntoView() has a fun interaction with CSS that is worth noting. .scroll-snap-x { overflow: auto hidden; overscroll-behavior-x: contain; scroll-snap-type: x mandatory; @media (prefers-reduced-motion: no-preference) { scroll-behavior: smooth; } } In the horizontal scroll snap utility CSS, we've nested a media query which applies smooth scrolling if the user is motion tolerant. JavaScript can freely make calls to scroll elements into view, and CSS can manage the UX declaratively. Quite the delightful little match they make sometimes. Conclusion # Now that you know how I did it, how would you?! This makes for some fun component architecture! Who's going to make the 1st version with slots in their favorite framework? 🙂 Let's diversify our approaches and learn all the ways to build on the web. Create a Glitch, tweet me your version, and I'll add it to the Community remixes section below. Community remixes # @devnook, @rob_dodson, & @DasSurma with Web Components: article

Accessing hardware devices on the web

The goal of this guide is to help you pick the best API to communicate with a hardware device (e.g. webcam, microphone, etc.) on the web. By "best" I mean it gives you everything you need with the shortest amount of work. In other words, you know the general use case you want to solve (e.g. accessing video) but you don't know what API to use or wonder if there's another way to achieve it. One problem that I commonly see web developers fall into is jumping into low-level APIs without learning about the higher-level APIs that are easier to implement and provide a better UX. Therefore, this guide starts by recommending higher-level APIs first, but also mentions lower-level APIs in case you have determined that the higher-level API doesn't meet your needs. 🕹 Receive input events from this device # Try listening for Keyboard and Pointer events. If this device is a game controller, use the Gamepad API to know which buttons are being pressed and which axes moved. If none of these options work for you, a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 📸 Access audio and video from this device # Use MediaDevices.getUserMedia() to get live audio and video streams from this device and learn about capturing audio and video. You can also control the camera's pan, tilt, and zoom, and other camera settings such as brightness and contrast, and even take still images. Web Audio can be used to add effects to audio, create audio visualizations, or apply spatial effects (such as panning). Check out how to profile the performance of Web Audio apps in Chrome as well. If none of these options work for you, a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 🖨 Print to this device # Use window.print() to open a browser dialog that lets the user pick this device as a destination to print the current document. 
If this doesn't work for you, a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 🔐 Authenticate with this device # Use WebAuthn to create a strong, attested, and origin-scoped public-key credential with this hardware security device to authenticate users. It supports the use of Bluetooth, NFC, and USB-roaming U2F or FIDO2 authenticators —also known as security keys— as well as a platform authenticator, which lets users authenticate with their fingerprints or screen locks. Check out Build your first WebAuthn app. If this device is another type of hardware security device (e.g. a cryptocurrency wallet), a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 🗄 Access files on this device # Use the File System Access API to read from and save changes directly to files and folders on the user's device. If not available, use the File API to ask the user to select local files from a browser dialog and then read the contents of those files. If none of these options work for you, a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 🧲 Access sensors on this device # Use the Generic Sensor API to read raw sensor values from motion sensors (e.g. accelerometer or gyroscope) and environmental sensors (e.g. ambient light, magnetometer). If not available, use the DeviceMotion and DeviceOrientation events to get access to the built-in accelerometer, gyroscope, and compass in mobile devices. If it doesn't work for you, a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 🛰 Access GPS coordinates on this device # Use the Geolocation API to get the latitude and longitude of the user's current position on this device. If it doesn't work for you, a low-level API may be the solution. 
Check out Discover how to communicate with your device to start your journey. 🔋 Check the battery on this device # Use the Battery API to get host information about the battery charge level and be notified when the battery level or charging status change. If it doesn't work for you, a low-level API may be the solution. Check out Discover how to communicate with your device to start your journey. 📞 Communicate with this device over the network # In the local network, use the Remote Playback API to broadcast audio and/or video on a remote playback device (e.g. a smart TV or a wireless speaker) or use the Presentation API to render a web page on a second screen (e.g. a secondary display connected with an HDMI cable or a smart TV connected wirelessly). If this device exposes a web server, use the Fetch API and/or WebSockets to fetch some data from this device by hitting appropriate endpoints. While TCP and UDP sockets are not available on the web, check out WebTransport to handle interactive, bidirectional, and multiplexed network connections. Note that WebRTC can also be used to communicate data in real-time with other browsers using a peer-to-peer protocol. 🧱 Discover how to communicate with your device # The decision of what low-level API you should use is determined by the nature of your physical connection to the device. If it is wireless, check out Web NFC for very short-range wireless connections and Web Bluetooth for nearby wireless devices. With Web NFC, read and write to this device when it's in close proximity to the user's device (usually 5–10 cm, 2–4 inches). Tools like NFC TagInfo by NXP allow you to browse the content of this device for reverse-engineering purposes. With Web Bluetooth, connect to this device over a Bluetooth Low Energy connection. It should be pretty easy to communicate with when it uses standardized Bluetooth GATT services (such as the battery service) as their behavior is well-documented. 
If not, at this point, you either have to find some hardware documentation for this device or reverse-engineer it. You can use external tools like nRF Connect for Mobile and built-in browser tools such as the internal page about://bluetooth-internals in Chromium-based browsers for that. Check out Reverse-Engineering a Bluetooth Lightbulb from Uri Shaked. Note that Bluetooth devices may also speak the HID or serial protocols. If wired, check out these APIs in this specific order: With WebHID, understanding HID reports and report descriptors through collections is key to your comprehension of this device. This can be challenging without vendor documentation for this device. Tools like Wireshark can help you reverse-engineer it. With Web Serial, without vendor documentation for this device and what commands this device supports, it's hard but still possible with lucky guessing. Reverse-engineering this device can be done by capturing raw USB traffic with tools like Wireshark. You can also use the Serial Terminal web app to experiment with this device if it uses a human-readable protocol. With WebUSB, without clear documentation for this device and what USB commands this device supports, it's hard but still possible with lucky guessing. Watch Exploring WebUSB and its exciting potential from Suz Hinton. You can also reverse-engineer this device by capturing raw USB traffic and inspecting USB descriptors with external tools like Wireshark and built-in browser tools such as the internal page about://usb-internals in Chromium-based browsers. Acknowledgements # Thanks to Reilly Grant, Thomas Steiner, and Kayce Basques for reviewing this article. Photo by Darya Tryfanava on Unsplash.

Requesting performance isolation with the Origin-Agent-Cluster header

Origin-Agent-Cluster is a new HTTP response header that instructs the browser to prevent synchronous scripting access between same-site cross-origin pages. Browsers may also use Origin-Agent-Cluster as a hint that your origin should get its own, separate resources, such as a dedicated process. Browser compatibility # Currently the Origin-Agent-Cluster header is only implemented in Chrome 88 onward. It was designed in close collaboration with representatives from Mozilla Firefox who have marked it as worth prototyping, and has a preliminary positive reception from representatives of WebKit, the browser engine used by Safari. But in the meantime, there's no problem with deploying the Origin-Agent-Cluster header to all your users today. Browsers which don't understand it will just ignore it. And, since pages in origin-keyed agent clusters can actually do fewer things than site-keyed ones (the default), there's no interoperability issue to be worried about. Why browsers can't automatically segregate same-site origins # The web is built on the same-origin policy, which is a security feature that restricts how documents and scripts can interact with resources from another origin. For example, a page hosted at https://a.example is at a different origin from one at https://b.example, or one at https://sub.a.example. Behind the scenes, browsers use the separation that origins provide in different ways. In the old days, even though separate origins would not be able to access each other's data, they would still share resources like operating system threads, processes, and memory allocation. This meant that if one tab was slow, it would slow down all the other tabs. Or if one tab used too much memory, it would crash the entire browser. These days browsers are more sophisticated, and try to separate different origins into different processes. 
How exactly this works varies per browser: most browsers have some level of separation between tabs, but different iframes inside a single tab might share a process. And because processes come with some memory overhead, they use heuristics to avoid spawning too many: for example, Firefox has a user-configurable process limit, and Chrome varies its behavior between desktop (where memory is more plentiful) and mobile (where it is scarce). These heuristics are not perfect. And they suffer from an important limitation: because there are exceptions to the same-origin policy which allow subdomains like https://sub.a.example and https://a.example to talk to each other, browsers cannot automatically segregate subdomains from each other. The technical distinction here is that the browser cannot automatically segregate pages which are same-site to each other, even if they are cross-origin. The most common cases of same-site cross-origin pages happen with subdomains, but see the article Understanding "same-site" and "same-origin" for more. This default behavior is called "site-keyed agent clusters": that is, the browser groups pages based on their site. The new Origin-Agent-Cluster header asks the browser to change this default behavior for a given page, putting it into an origin-keyed agent cluster, so that it is grouped only with other pages that have the exact same origin. In particular, same-site cross-origin pages will be excluded from the agent cluster. This opt-in separation allows browsers to give these new origin-keyed agent clusters their own dedicated resources, which are not combined with those of other origins. For example, such pages could get their own process, or be scheduled on separate threads. By adding the Origin-Agent-Cluster header to your page, you're indicating to the browser that the page would benefit from such dedicated resources. However, in order to perform the separation, and get these benefits, the browser needs to disable some legacy features. 
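To make the same-site versus same-origin distinction concrete, here is a rough sketch using the standard URL API. The helper names (`isSameOrigin`, `naiveSite`, `isSameSite`, `isSameSiteCrossOrigin`) are made up for illustration, and the site calculation is a deliberate simplification: it assumes the registrable domain is the last two host labels, whereas browsers consult the Public Suffix List.

```javascript
// Same-origin: scheme, host, and port all match. URL.origin encodes exactly that.
const isSameOrigin = (a, b) => new URL(a).origin === new URL(b).origin;

// Simplified "site": scheme plus the last two host labels.
// ASSUMPTION: real browsers use the Public Suffix List to find the registrable
// domain; this shortcut gives wrong answers for hosts such as foo.co.uk.
const naiveSite = (url) => {
  const {protocol, hostname} = new URL(url);
  return `${protocol}//${hostname.split('.').slice(-2).join('.')}`;
};

const isSameSite = (a, b) => naiveSite(a) === naiveSite(b);

// Pages that are grouped together by default (site-keyed), but that the
// Origin-Agent-Cluster header asks the browser to keep apart.
const isSameSiteCrossOrigin = (a, b) => isSameSite(a, b) && !isSameOrigin(a, b);

console.log(isSameSiteCrossOrigin('https://a.example', 'https://sub.a.example')); // true
console.log(isSameSiteCrossOrigin('https://a.example', 'https://b.example'));     // false: different sites
console.log(isSameSiteCrossOrigin('https://a.example/x', 'https://a.example/y')); // false: same origin
```

The first pair is the case this article cares about: the two pages cannot be separated automatically, because legacy features still let them reach each other.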
What origin-keyed pages cannot do # When your page is in an origin-keyed agent cluster, you give up some abilities to talk to same-site cross-origin pages that were previously available. In particular: You can no longer set document.domain. This is a legacy feature that normally allows same-site cross-origin pages to synchronously access each other's DOM, but in origin-keyed agent clusters, it is disabled. You can no longer send WebAssembly.Module objects to other same-site cross-origin pages via postMessage(). (Chrome-only) You can no longer send SharedArrayBuffer or WebAssembly.Memory objects to other same-site cross-origin pages. Caution: Chrome is the only browser that allows sending SharedArrayBuffer and WebAssembly.Memory objects to same-site cross-origin pages. Other browsers, and future versions of Chrome, will prevent sending these objects across the origin boundary regardless of whether the agent cluster is origin-keyed or site-keyed. When to use origin-keyed agent clusters # The origins that most benefit from the Origin-Agent-Cluster header are those that: Perform best with their own dedicated resources when possible. Examples include performance-intensive games, video conferencing sites, or multimedia creation apps. Contain resource-intensive iframes that are different-origin, but same-site. For example, if https://mail.example.com embeds https://chat.example.com iframes, origin-keying https://mail.example.com/ ensures that the code written by the chat team cannot accidentally interfere with code written by the mail team, and can hint to the browser to give them separate processes to schedule them independently and decrease their performance impact on each other. Expect to be embedded on different-origin, same-site pages, but know themselves to be resource-intensive.
For example, if https://customerservicewidget.example.com expects to use lots of resources for video chat, and will be embedded on various origins throughout https://*.example.com, the team maintaining that widget could use the Origin-Agent-Cluster header to try to decrease their performance impact on embedders. Additionally, you'll also need to make sure you're OK disabling the above-discussed rarely-used cross-origin communication features, and that your site is using HTTPS. But in the end, these are just guidelines. Whether origin-keyed agent clusters will help your site or not is ultimately best determined through measurements. In particular, you'll want to measure your Web Vitals, and potentially your memory usage, to see what impact origin-keying has. (Memory usage in particular is a potential concern, as increasing the number of processes in play can cause more per-process memory overhead.) You shouldn't just roll out origin-keying and hope for the best. How is this related to cross-origin isolation? # Origin-keying of agent clusters via the Origin-Agent-Cluster header is related to, but separate from, cross-origin isolation via the Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy headers. Any site which makes itself cross-origin isolated will also disable the same same-site cross-origin communications features as when using the Origin-Agent-Cluster header. However, the Origin-Agent-Cluster header can still be useful on top of cross-origin isolation, as an additional hint to the browser to modify its resource allocation heuristics. So you should still consider applying the Origin-Agent-Cluster header, and measuring the results, even on pages that are already cross-origin isolated. How to use the Origin-Agent-Cluster header # To use the Origin-Agent-Cluster header, configure your web server to send the following HTTP response header: Origin-Agent-Cluster: ?1 The value of ?1 is the structured header syntax for a boolean true value. 
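As a minimal sketch of sending the header from a Node.js server, assuming the built-in http module and a hypothetical `handleRequest` function (the routes and port are placeholders), the header can be set once before any routing so that every response carries it, error pages included:

```javascript
// Hypothetical request handler for Node's built-in http module.
// The header is set before branching on the URL, so both the normal page
// and the 404 page are sent with it.
function handleRequest(req, res) {
  // "?1" is the structured header syntax for a boolean true value.
  res.setHeader('Origin-Agent-Cluster', '?1');

  if (req.url === '/') {
    res.statusCode = 200;
    res.setHeader('Content-Type', 'text/html; charset=utf-8');
    res.end('<!doctype html><title>Home</title><p>Hello</p>');
    return;
  }

  res.statusCode = 404;
  res.setHeader('Content-Type', 'text/plain; charset=utf-8');
  res.end('Not found');
}

// To serve it (assumed port; intentionally not started in this sketch):
// require('node:http').createServer(handleRequest).listen(8080);
```

Other servers and frameworks have their own way to attach a header to all responses; the point of the sketch is only that the header is added in one place rather than per page.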
It's important to send this header on all responses from your origin, not just some pages. Otherwise, you can get inconsistent results, where the browser "remembers" seeing an origin-keying request and so it origin-keys even on pages that don't ask for it. Or the reverse: if the first page a user visits doesn't have the header, then the browser will remember that your origin does not want to be origin-keyed, and will ignore the header on subsequent pages. Caution: Don't forget to send the header on error pages, like your 404 page! The reason for this "memory" is to ensure consistency of keying for an origin. If some pages on an origin were origin-keyed, while others weren't, then you could have two same-origin pages which were put into different agent clusters, and thus weren't allowed to talk to each other. This would be very strange, both for web developers and for the internals of the browser. So, the specification for Origin-Agent-Cluster instead ignores the header if it's inconsistent with what it's previously seen for a given origin. In Chrome, this will result in a console warning. This consistency is scoped to a browsing context group, which is a group of tabs, windows, or iframes which can all reach each other via mechanisms like window.opener, frames[0], or window.parent. This means that, once an origin's origin- or site-keying has been settled (by the browser either seeing, or not seeing, the header), changing it requires opening up an entirely new tab, not connected to the old one in any way. These details can be important for testing the Origin-Agent-Cluster header. When first adding it to your site, just reloading the page will not work; you'll need to close the tab and open a new one. To check whether the Origin-Agent-Cluster header is applied, use the JavaScript window.originAgentCluster property. 
This will be true in cases where the header (or other mechanisms, like cross-origin isolation) caused origin-keying; false when it did not; and undefined in browsers that don't implement the Origin-Agent-Cluster header. Logging this data to your analytics platform can provide a valuable check that you've configured your server correctly. Finally, note that the Origin-Agent-Cluster header will only work on secure contexts, i.e. on HTTPS pages or on http://localhost. Non-localhost HTTP pages do not support origin-keyed agent clusters. Origin-keying is not a security feature # While using an origin-keyed agent cluster does isolate your origin from synchronous access from same-site cross-origin pages, it does not give the protection of security-related headers like Cross-Origin-Resource-Policy and Cross-Origin-Opener-Policy. In particular, it is not a reliable protection against side channel attacks like Spectre. This might be a bit surprising, because origin-keying can sometimes cause your origin to get its own process, and separate processes are an important defence against side-channel attacks. But remember that the Origin-Agent-Cluster header is only a hint in that regard. The browser is under no obligation to give your origin a separate process, and it might not do so for a variety of reasons: A browser might not implement the technology to do so. For example, currently Safari and Firefox can put separate tabs into their own processes, but cannot yet do so for iframes. The browser might decide it's not worth the overhead of a separate process. For example, on low-memory Android devices, or in Android WebView, Chrome uses as few processes as possible. The browser might want to respect the request that the Origin-Agent-Cluster header indicates, but it could do so using different isolation technology than processes. For example, Chrome is exploring using threads instead of processes for this sort of performance isolation. 
The user, or code running on a different site, might have already navigated to a site-keyed page on your origin, causing the consistency guarantee to kick in and the Origin-Agent-Cluster header to be ignored entirely. For these reasons, it's important not to think of origin-keyed agent clusters as a security feature. Instead, it's a way of helping the browser prioritize resource allocation, by hinting that your origin would benefit from dedicated resources (and that you're willing to give up certain features in exchange). Feedback # The Chrome team would love to hear from you if you're using, or considering using, the Origin-Agent-Cluster header. Your public interest and support helps us prioritize features and show other browser vendors how important they are. Tweet at @ChromiumDev and let Chrome DevRel know your thoughts and experiences. If you have more questions about the specification, or the details of how the feature works, you can file an issue on the HTML Standard GitHub repository. And if you encounter any issues with Chrome's implementation, you can file a bug at new.crbug.com with the Components field set to Internals>Sandbox>SiteIsolation. Learn more # To learn more about origin-keyed agent clusters, you can dive into the details at these links: Demo and demo source Explainer Specification Tracking bugs: Chrome, Firefox, Safari
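As a recap of the verification step described under "How to use the Origin-Agent-Cluster header", the three possible values of window.originAgentCluster can be turned into a label for logging. This is only a minimal sketch: the helper name describeAgentClusterKeying and the label strings are hypothetical, and how you send the result to your analytics platform is left out.

```javascript
// Hypothetical helper (not part of any browser API): maps the value of
// window.originAgentCluster to a label that is convenient to log.
function describeAgentClusterKeying(originAgentCluster) {
  if (originAgentCluster === true) return 'origin-keyed';
  if (originAgentCluster === false) return 'site-keyed';
  // undefined: the browser does not implement the Origin-Agent-Cluster header.
  return 'unsupported';
}

// In a page, globalThis is the window object. Outside a browser the property
// is simply undefined, so this reports "unsupported".
const keying = describeAgentClusterKeying(globalThis.originAgentCluster);
console.log(`Agent cluster keying: ${keying}`);
```

Remember that a plain reload may not change the result; the keying decision is remembered for the whole browsing context group.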

New aspect-ratio CSS property supported in Chromium, Safari Technology Preview, and Firefox Nightly

Summary: Maintaining a consistent width-to-height ratio, called an aspect ratio, is critical in responsive web design and for preventing cumulative layout shift. Now, there's a more straightforward way to do this with the new aspect-ratio property launching in Chromium 88, Firefox 87, and Safari Technology Preview 118. Aspect ratio # Aspect ratio is most commonly expressed as two integers separated by a colon, in the form width:height, or x:y. The most common aspect ratios for photography are 4:3 and 3:2, while video, and more recent consumer cameras, tend to have a 16:9 aspect ratio. Two images with the same aspect ratio. One is 634 x 951px while the other is 200 x 300px. Both have a 2:3 aspect ratio. With the advent of responsive design, maintaining aspect ratio has become increasingly important for web developers, especially as image dimensions differ and element sizes shift based on available space. Some examples of where maintaining aspect ratio becomes important are: Creating responsive iframes, where they are 100% of a parent's width, and the height should remain a specific viewport ratio Creating intrinsic placeholder containers for images, videos, and embeds to prevent re-layout when the items load and take up space Creating uniform, responsive space for interactive data visualizations or SVG animations Creating uniform, responsive space for multi-element components such as cards or calendar dates Creating uniform, responsive space for multiple images of varying dimension (can be used alongside object-fit) Object-fit # Defining an aspect ratio helps us with sizing media in a responsive context. Another tool in this bucket is the object-fit property, which enables users to describe how an object (such as an image) within a block should fill that block: Showcasing various object-fit values. See demo on Codepen. The initial and fill values re-adjust the image to fill the space.
In our example, this causes the image to be squished and blurry, as it re-adjusts pixels. Not ideal. object-fit: cover uses the image's smallest dimension to fill the space and crops the image to fit into it based on this dimension. It "zooms in" at its lowest boundary. object-fit: contain ensures that the entire image is always visible, and so the opposite of cover, where it takes the size of the largest boundary (in our example above this is width), and resizes the image to maintain its intrinsic aspect ratio while fitting into the space. The object-fit: none case shows the image cropped in its center (default object position) at its natural size. object-fit: cover tends to work in most situations to ensure a nice uniform interface when dealing with images of varying dimensions, however, you lose information this way (the image is cropped at its longest edges). If these details are important (for example, when working with a flat lay of beauty products), cropping important content is not acceptable. So the ideal scenario would be responsive images of varying sizes that fit the UI space without cropping. The old hack: maintaining aspect ratio with padding-top # Using padding-top to set a 1:1 aspect ratio on post preview images within a carousel. In order to make these more responsive, we can use aspect ratio. This allows for us to set a specific ratio size and base the rest of the media on an individual axis (height or width). A currently well-accepted cross-browser solution for maintaining aspect ratio based on an image's width is known as the "Padding-Top Hack". This solution requires a parent container and an absolutely placed child container. One would then calculate the aspect ratio as a percentage to set as the padding-top. 
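The percentage calculation just described can be sketched as a small helper. This is an illustrative sketch only: the function name paddingTopPercent is hypothetical, and rounding to two decimal places is an assumption made to keep the output readable.

```javascript
// Hypothetical helper: converts a width:height aspect ratio into the
// padding-top percentage used by the "Padding-Top Hack".
// padding-top percentages resolve against the container's width, so the
// value is height / width expressed as a percentage.
function paddingTopPercent(width, height) {
  if (!(width > 0) || !(height > 0)) {
    throw new RangeError('width and height must be positive numbers');
  }
  // Round to two decimal places (an assumption; CSS accepts more precision).
  return Number(((height / width) * 100).toFixed(2));
}

console.log(`padding-top: ${paddingTopPercent(16, 9)}%`); // padding-top: 56.25%
```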
For example: 1:1 aspect ratio = 1 / 1 = 1 = padding-top: 100% 4:3 aspect ratio = 3 / 4 = 0.75 = padding-top: 75% 3:2 aspect ratio = 2 / 3 = 0.66666 = padding-top: 66.67% 16:9 aspect ratio = 9 / 16 = 0.5625 = padding-top: 56.25% Now that we have identified the aspect ratio value, we can apply that to our parent container. Consider the following example: <div class="container"> <img class="media" src="..." alt="..."> </div> We could then write the following CSS: .container { position: relative; width: 100%; padding-top: 56.25%; /* 16:9 Aspect Ratio */ } .media { position: absolute; top: 0; } Maintaining aspect ratio with aspect-ratio # Using aspect-ratio to set a 1:1 aspect ratio on post preview images within a carousel. Unfortunately, calculating these padding-top values is not very intuitive, and requires some additional overhead and positioning. With the new intrinsic aspect-ratio CSS property, the language for maintaining aspect ratios is much clearer. With the same markup, we can replace: padding-top: 56.25% with aspect-ratio: 16 / 9, setting aspect-ratio to a specified ratio of width / height. Using padding-top .container { width: 100%; padding-top: 56.25%; } Using aspect-ratio .container { width: 100%; aspect-ratio: 16 / 9; } Using aspect-ratio instead of padding-top is much clearer, and does not overload the padding property to do something outside of its usual scope. This new property also adds the ability to set aspect ratio to auto, where "replaced elements with an intrinsic aspect ratio use that aspect ratio; otherwise the box has no preferred aspect ratio." If both auto and a <ratio> are specified together, the preferred aspect ratio is the specified ratio of width divided by height unless it is a replaced element with an intrinsic aspect ratio, in which case that aspect ratio is used instead. Example: consistency in a grid # This works really well with CSS layout mechanisms like CSS Grid and Flexbox as well.
Consider a list with children that you want to maintain a 1:1 aspect ratio, such as a grid of sponsor icons: <ul class="sponsor-grid"> <li class="sponsor"> <img src="..." alt="..."/> </li> <li class="sponsor"> <img src="..." alt="..."/> </li> </ul> .sponsor-grid { display: grid; grid-template-columns: repeat(auto-fill, minmax(120px, 1fr)); } .sponsor img { aspect-ratio: 1 / 1; width: 100%; object-fit: contain; } Images in a grid with their parent element at various aspect ratio dimensions. See demo on Codepen. Example: preventing layout shift # Another great feature of aspect-ratio is that it can create placeholder space to prevent Cumulative Layout Shift and deliver better Web Vitals. In this first example, loading an asset from an API such as Unsplash creates a layout shift when the media is finished loading. Using aspect-ratio, on the other hand, creates a placeholder to prevent this layout shift: img { width: 100%; aspect-ratio: 8 / 6; } Video with a set aspect ratio is set on a loaded asset. This video is recorded with an emulated 3G network. See demo on Codepen. Bonus tip: image attributes for aspect ratio # Another way to set an image's aspect ratio is through image attributes. If you know the dimensions of the image ahead of time, it is a best practice to set these dimensions as its width and height. For our example above, knowing the dimensions are 800px by 600px, the image markup would look like: <img src="image.jpg" alt="..." width="800" height="600">. If the image sent has the same aspect ratio, but not necessarily those exact pixel values, we could still use image attribute values to set the ratio, combined with a style of width: 100% so that the image takes up the proper space. All together that would look like: <!-- Markup --> <img src="image.jpg" alt="..." 
width="8" height="6"> /* CSS */ img { width: 100%; } In the end, the effect is the same as setting the aspect-ratio on the image via CSS, and cumulative layout shift is avoided (see demo on Codepen). Conclusion # With the new aspect-ratio CSS property, launching across multiple modern browsers, maintaining proper aspect ratios in your media and layout containers gets a little bit more straightforward. Photos by Amy Shamblen and Lionel Gustave via Unsplash.

WebRTC is now a W3C and IETF standard

Defining a web standard is a lengthy process that ensures usefulness, consistency and compatibility across browsers. Today the W3C and IETF mark the completion of perhaps one of the most important standards during the pandemic: WebRTC. Check out the Real-time communication with WebRTC codelab for a hands-on walkthrough of implementing WebRTC. History # WebRTC is a platform giving browsers, mobile apps, and desktop apps real-time communication capabilities, typically used for video calling. The platform consists of a comprehensive set of technologies and standards. Google initiated the idea to create WebRTC in 2009, as an alternative to Adobe Flash and desktop applications that couldn't run in the browser. The previous generation of browser-based products was built on top of licensed proprietary technology. Various products were built with this technology, including Hangouts. Google then acquired the companies it had been licensing the technology from and made it available as the open source WebRTC project. This codebase is integrated in Chrome and used by the majority of applications using WebRTC. Together with other browser vendors and industry leaders such as Mozilla, Microsoft, Cisco, and Ericsson, the standardization of WebRTC was kicked off in both the W3C and IETF. In 2013, Mozilla and Google demonstrated video calling between their browsers. Through the evolution of the standard, many architectural discussions led to implementation differences across browsers and challenged compatibility and interoperability. Most of these disagreements were ultimately settled as the standard was finalized in recent years. The WebRTC specification is now accompanied by a full set of platform tests and tools to address compatibility, and browsers have largely adapted their implementations accordingly.
This brings an end to a challenging period where web developers had to continuously adapt their services to different browser implementations and specification changes. Architecture and functionality # The RTCPeerConnection API is the central part of the WebRTC specification. RTCPeerConnection deals with connecting two applications on different endpoints to communicate using a peer-to-peer protocol. The RTCPeerConnection API interacts closely with getUserMedia for accessing camera and microphone, and getDisplayMedia for capturing screen content. WebRTC allows you to send and receive streams that include audio and/or video content, as well as arbitrary binary data through the DataChannel. The media functionality for processing, encoding, and decoding audio and video provides the core of any WebRTC implementation. WebRTC supports various audio codecs, with Opus being the most used and versatile. WebRTC implementations are required to support both Google's free-to-use VP8 video codec and H.264 for processing video. WebRTC connections are always encrypted, which is achieved through two existing protocols: DTLS and SRTP. WebRTC leans heavily on existing standards and technologies, from video codecs (VP8, H.264), network traversal (ICE), transport (RTP, SCTP), to media description protocols (SDP). This is tied together in over 50 RFCs. Use cases: when it's a matter of milliseconds # WebRTC is widely used in time-critical applications such as remote surgery, system monitoring, remote control of autonomous cars, and voice or video calls built on UDP, where buffering is not possible. Nearly all browser-based video calling services from companies such as Google, Facebook, Cisco, RingCentral, and Jitsi use WebRTC. Google Stadia and NVIDIA GeForce NOW use WebRTC to get the stream of gameplay from the cloud to the web browser without perceivable delay.
Pandemic puts focus on video calling performance # Over the past year, WebRTC has seen a 100X increase of usage in Chrome due to increased video calling from within the browser. Recognizing that video calling has become a fundamental part of many people's lives during the pandemic, browser vendors have begun to optimize the technologies that video calling depends on. This was particularly important as resource demanding large meetings and video effects in video meetings became more common when employees and students started to work and study from home. In the past year Chrome has become up to 30% more battery friendly for video calling, with more optimizations to come for heavy usage scenarios. Mozilla, Apple, and Microsoft all have made significant improvements in their implementation of WebRTC through the pandemic, in particular in making sure they adhere to the now formalized standard. The future of WebRTC # While WebRTC is now completed as a W3C standard, improvements continue. The new video codec AV1 which saves up to 50% of bandwidth is becoming available in WebRTC and web browsers. Continued improvements in the open source code base are expected to further reduce delay and improve the quality of video that can be streamed. WebRTC NV gathers the initiative to create supplementary APIs to enable new use cases. These consist of extensions to existing APIs to give more control over existing functionality such as Scalable Video Coding as well as APIs that give access to lower-level components. The latter gives more flexibility to web developers to innovate by integrating high-performance custom WebAssembly components. With emerging 5G networks and demand for more interactive services, we're expecting to see a continued increase of services building on top of WebRTC in the year to come.

Best practices for carousels

A carousel is a UX component that displays content in slideshow-like manner. Carousels can "autoplay" or be navigated manually by users. Although carousels can be used elsewhere, they are most frequently used to display images, products, and promotions on homepages. This article discusses performance and UX best practices for carousels. Performance # A well-implemented carousel, in and of itself, should have very minimal or no impact on performance. However, carousels often contain large media assets. Large assets can impact performance regardless of whether they are displayed in a carousel or elsewhere. LCP (Largest Contentful Paint) Large, above-the-fold carousels often contain the page's LCP element, and therefore can have a significant impact on LCP. In these scenarios, optimizing the carousel may significantly improve LCP. For an in-depth explanation of how LCP measurement works on pages containing carousels, refer to the LCP measurement for carousels section. FID (First Input Delay) Carousels have minimal JavaScript requirements and therefore should not impact page interactivity. If you discover that your site's carousel has long-running scripts, you should consider replacing your carousel tooling. CLS (Cumulative Layout Shift) A surprising number of carousels use janky, non-composited animations that can contribute to CLS. On pages with autoplaying carousels, this has the potential to cause infinite CLS. This type of CLS typically isn't apparent to the human eye, which makes the issue easy to overlook. To avoid this issue, avoid using non-composited animations in your carousel (for example, during slide transitions). Performance best practices # Load carousel content using HTML # Carousel content should be loaded via the page's HTML markup so that it is discoverable by the browser early in the page load process. Using JavaScript to initiate the loading of carousel content is probably the single biggest performance mistake to avoid when using carousels. 
This delays image loading and can negatively impact LCP. Do <div class="slides"> <img src="https://example.com/cat1.jpg"> <img src="https://example.com/cat2.jpg"> <img src="https://example.com/cat3.jpg"> </div> Don't const slides = document.querySelector(".slides"); const newSlide = document.createElement("img"); newSlide.src = "https://example.com/cat1.jpg"; slides.appendChild(newSlide); For advanced carousel optimization, consider loading the first slide statically, then progressively enhancing it to include navigation controls and additional content. This technique is most applicable to environments where you have a user's prolonged attention, which gives the additional content time to load. In environments like home pages, where users may only stick around for a second or two, only loading a single image may be similarly effective. Avoid layout shifts # Chrome 88-90 shipped a variety of bug fixes related to how layout shifts are calculated. Many of these bug fixes are relevant to carousels. As a result of these fixes, sites should expect to see lower carousel-related layout shift scores in later versions of Chrome. Slide transitions and navigation controls are the two most common sources of layout shifts in carousels: Slide transitions: Layout shifts that occur during slide transitions are usually caused by updating the layout-inducing properties of DOM elements. Examples of some of these properties include: left, top, width, and marginTop. To avoid layout shifts, instead use the CSS transform property to transition these elements. This demo shows how to use transform to build a basic carousel. Navigation controls: Moving or adding/removing carousel navigation controls from the DOM can cause layout shifts depending on how these changes are implemented. Carousels that exhibit this behavior typically do so in response to user hover.
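As a rough illustration of the transform-based approach mentioned in the list above, the sketch below computes the transform for a given slide instead of changing left or width. It assumes all slides sit side by side inside a single track element that is exactly as wide as the carousel viewport; the helper name trackTransformFor and the "track" class are hypothetical.

```javascript
// Hypothetical helper: returns the CSS transform that brings slide `index`
// into view. Percentages in translateX() are relative to the transformed
// element's own width, so this assumes the track is as wide as one slide
// (the carousel viewport) and the slides overflow it horizontally.
function trackTransformFor(index) {
  return `translateX(${-index * 100}%)`;
}

// Browser usage (assumes an element with the hypothetical class "track"):
//   const track = document.querySelector('.track');
//   track.style.transition = 'transform 300ms ease';
//   track.style.transform = trackTransformFor(2);
console.log(trackTransformFor(2)); // translateX(-200%)
```

Because only transform changes, the slides' start positions in the layout stay the same, which is why this kind of transition does not register as a layout shift.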
These are some of the common points of confusion regarding CLS measurement for carousels: Autoplay carousels: Slide transitions are the most common source of carousel-related layout shifts. In a non-autoplay carousel these layout shifts typically occur within 500ms of a user interaction and therefore do not count towards Cumulative Layout Shift (CLS). However, for autoplay carousels, these layout shifts can not only count towards CLS but also repeat indefinitely. Thus, it is particularly important to verify that an autoplay carousel is not a source of layout shifts. Scrolling: Some carousels allow users to use scrolling to navigate through carousel slides. If an element's start position changes but its scroll offset (that is, scrollLeft or scrollTop) changes by the same amount (but in the opposite direction) this is not considered a layout shift provided that they occur in the same frame. For more information on layout shifts, see Debug layout shifts. Use modern technology # Many sites use third-party JavaScript libraries to implement carousels. If you currently use older carousel tooling, you may be able to improve performance by switching to newer tooling. Newer tools tend to use more efficient APIs and are less likely to require additional dependencies like jQuery. However, depending on the type of carousel you are building, you may not need JavaScript at all. The new Scroll Snap API makes it possible to implement carousel-like transitions using only HTML and CSS. Here are some resources on using scroll-snap that you may find helpful: Building a Stories component (web.dev) Next-generation web styling: scroll snap (web.dev) CSS-Only Carousel (CSS Tricks) How to Make a CSS-Only Carousel (CSS Tricks) Optimize carousel content # Carousels often contain some of a site's largest images, so it can be worth your time to make sure that these images are fully optimized.
Choosing the right image format and compression level, using an image CDN, and using srcset to serve multiple image versions are all techniques that can reduce the transfer size of images. Performance measurement # This section discusses LCP measurement as it relates to carousels. Although carousels are treated no differently than any other UX element during LCP calculation, the mechanics of calculating LCP for autoplaying carousels is a common point of confusion. LCP measurement for carousels # These are the key points to understanding how LCP calculation works for carousels: LCP considers page elements as they are painted to the frame. New candidates for the LCP element are no longer considered once the user interacts (taps, scrolls, or keypresses) with the page. Thus, any slide in an autoplaying carousel has the potential to be the final LCP element—whereas in a static carousel only the first slide would be a potential LCP candidate. If two equally sized images are rendered, the first image will be considered the LCP element. The LCP element is only updated when the LCP candidate is larger than the current LCP element. Thus, if all carousel elements are equally sized, the LCP element should be the first image that is displayed. When evaluating LCP candidates, LCP considers the "visible size or the intrinsic size, whichever is smaller." Thus, if an autoplaying carousel displays images at a consistent size, but contains images of varying intrinsic sizes that are smaller than the display size, the LCP element may change as new slides are displayed. In this case, if all images are displayed at the same size, the image with the largest intrinsic size will be considered the LCP element. To keep LCP low, you should ensure that all items in an autoplaying carousel are the same intrinsic size. Changes to LCP calculation for carousels in Chrome 88 # As of Chrome 88, images that are later removed from the DOM are considered as potential largest contentful paints. 
Prior to Chrome 88, these images were excluded from consideration. For sites that use autoplaying carousels, this definition change will have either a neutral or a positive impact on LCP scores. This change was made in response to the observation that many sites implement carousel transitions by removing the previously displayed image from the DOM tree. Prior to Chrome 88, each time that a new slide was presented, the removal of the previous element would trigger an LCP update. This change only affects autoplaying carousels: by definition, potential largest contentful paints can only occur before a user first interacts with the page. Other considerations # This section discusses UX and product best practices that you should keep in mind when implementing carousels. Carousels should advance your business goals and present content in a way that is easy to navigate and read. Navigation best practices # Provide prominent navigation controls # Carousel navigation controls should be easy to click and highly visible. This is something that is rarely done well: most carousels have navigation controls that are both small and subtle. Keep in mind that a single color or style of navigation control will rarely work in all situations. For example, an arrow that is clearly visible against a dark background might be difficult to see against a light background. Indicate navigation progress # Carousel navigation controls should provide context about the total number of slides and the user's progress through them. This information makes it easier for the user to navigate to a particular slide and understand which content has already been viewed. In some situations, providing a preview of upcoming content, whether an excerpt of the next slide or a list of thumbnails, can also be helpful and increase engagement. Support mobile gestures # On mobile, swipe gestures should be supported in addition to traditional navigation controls (such as on screen buttons).
Provide alternate navigation paths # Because it's unlikely that most users will engage with all carousel content, the content that carousel slides link to should be accessible from other navigation paths. Readability best practices # Don't use autoplay # The use of autoplay creates two almost paradoxical problems: on-screen animations tend to distract users and move the eyes away from more important content; simultaneously, because users often associate animations with ads, they will ignore carousels that autoplay. Thus, it's rare that autoplay is a good choice. If content is important, not using autoplay will maximize its exposure; if carousel content is not important, then the use of autoplay will detract from more important content. In addition, autoplaying carousels can be difficult to read (and annoying, too). People read at different speeds, so it's rare that a carousel consistently transitions at the "right" time for different users. Ideally, slide navigation should be user-directed via navigation controls. If you must use autoplay, autoplay should be disabled on user hover. In addition, the slide transition rate should take slide content into account: the more text that a slide contains, the longer it should be displayed on screen. Keep text and images separate # Carousel text content is often "baked into" the corresponding image file, rather than displayed separately using HTML markup. This approach is bad for accessibility, localization, and compression rates. It also encourages a one-size-fits-all approach to asset creation. However, the same image and text formatting is rarely equally readable across desktop and mobile formats. Be concise # You only have a fraction of a second to catch a user's attention. Short, to-the-point copy will increase the odds that your message gets across. Product best practices # Carousels work well in situations where using additional vertical space to display additional content is not an option.
Carousels on product pages are often a good example of this use case. However, carousels are not always used effectively. Carousels, particularly if they contain promotions or advance automatically, are easily mistaken for advertisements by users. Users tend to ignore advertisements—a phenomenon known as banner blindness. Carousels are often used to placate multiple departments and avoid making decisions about business priorities. As a result, carousels can easily turn into a dumping ground for ineffective content. Test your assumptions # The business impact of carousels, particularly those on homepages, should be evaluated and tested. Carousel clickthrough rates can help you determine whether a carousel and its content is effective. Be relevant # Carousels work best when they contain interesting and relevant content that is presented with a clear context. If content wouldn't engage a user outside of a carousel—placing it in a carousel won't make it perform any better. If you must use a carousel, prioritize content and ensure that each slide is sufficiently relevant that a user would want to click through to the subsequent slide.

When to use HTTPS for local development

Also see: How to use HTTPS for local development. In this post, statements about localhost are valid for 127.0.0.1 and [::1] as well, since they both describe the local computer address, also called "loopback address". Also, to keep things simple, the port number isn't specified. So when you see http://localhost, read it as http://localhost:{PORT} or http://127.0.0.1:{PORT}. Summary # When developing locally, use http://localhost by default. Service Workers, Web Authentication API, and more will work. However, in the following cases, you'll need HTTPS for local development: Setting Secure cookies in a consistent way across browsers Debugging mixed-content issues Using HTTP/2 and later Using third-party libraries or APIs that require HTTPS Using a custom hostname When to use HTTPS for local development. If you need HTTPS for one of the above use cases, check out How to use HTTPS for local development. ✨ This is all you need to know. If you're interested in more details keep reading! Why your development site should behave securely # To avoid running into unexpected issues, you want your local development site to behave as much as possible like your production website. So, if your production website uses HTTPS, you want your local development site to behave like an HTTPS site. Warning: If your production website doesn't use HTTPS, make it a priority. Use http://localhost by default # Browsers treat http://localhost in a special way: although it's HTTP, it mostly behaves like an HTTPS site. On http://localhost, Service Workers, Sensor APIs, Authentication APIs, Payments, and other features that require certain security guarantees are supported and behave exactly like on an HTTPS site. When to use HTTPS for local development # You may encounter special cases where http://localhost doesn't behave like an HTTPS site, or you may simply want to use a custom site name that's not http://localhost.
You need to use HTTPS for local development in the following cases: You need to set a cookie locally that is Secure, or SameSite=None, or has the __Host- prefix. Secure cookies are set only on HTTPS, and not every browser sets them on http://localhost. And because SameSite=None and __Host- also require the cookie to be Secure, setting such cookies on your local development site requires HTTPS as well. Gotchas! When it comes to setting Secure cookies locally, not all browsers behave in the same way! For example, Chrome and Safari don't set Secure cookies on localhost, but Firefox does. In Chrome, this is considered a bug. You need to locally debug an issue that occurs only on an HTTPS website and not on an HTTP site, not even http://localhost, such as a mixed-content issue. You need to locally test or reproduce a behaviour specific to HTTP/2 or newer, for example loading performance on HTTP/2 or newer. Insecure HTTP/2 or newer is not supported, not even on localhost. You need to locally test third-party libraries or APIs that require HTTPS (for example OAuth). You're not using localhost, but a custom hostname for local development, for example mysite.example. Typically, this means you've overridden your local hosts file. In this case, Chrome, Edge, Safari, and Firefox by default do not consider mysite.example to be secure, even though it's a local site. So it won't behave like an HTTPS site. Other cases! This is not an exhaustive list, but if you encounter a case that's not listed here, you'll know: things will break on http://localhost, or it won't quite behave like your production site. 🙃 In all of these cases, you need to use HTTPS for local development. How to use HTTPS for local development # If you need to use HTTPS for local development, head over to How to use HTTPS for local development.
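To make the cookie case listed above more concrete, here is a minimal sketch of a helper that builds a Set-Cookie header value and refuses combinations that browsers reject. The function name buildSetCookie, its options object, and the cookie names and values are hypothetical and only for illustration; the rules it encodes are that SameSite=None and the __Host- prefix both require Secure, and that __Host- cookies must also use Path=/.

```javascript
// Hypothetical helper (not from the original post): builds a Set-Cookie
// header value and enforces the Secure-related rules described above.
function buildSetCookie(name, value, {secure = false, sameSite, path} = {}) {
  const usesHostPrefix = name.startsWith('__Host-');

  // SameSite=None and the __Host- prefix both require the Secure attribute.
  if ((sameSite === 'None' || usesHostPrefix) && !secure) {
    throw new Error(`Cookie "${name}" must be Secure`);
  }
  // __Host- cookies must additionally be scoped to Path=/.
  if (usesHostPrefix && path !== '/') {
    throw new Error(`Cookie "${name}" must use Path=/`);
  }

  const parts = [`${name}=${value}`];
  if (path) parts.push(`Path=${path}`);
  if (secure) parts.push('Secure');
  if (sameSite) parts.push(`SameSite=${sameSite}`);
  return parts.join('; ');
}

// Example: a cookie that needs HTTPS locally in browsers that don't set
// Secure cookies on http://localhost.
const exampleHeader = buildSetCookie('__Host-session', 'abc123', {
  secure: true,
  sameSite: 'None',
  path: '/',
});
console.log(exampleHeader);
```

A development server would send this string in a Set-Cookie response header. Whether the browser then stores the cookie on http://localhost depends on the browser, which is why HTTPS is the consistent option here.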
Tips if you're using a custom hostname # If you're using a custom hostname, for example, editing your hosts file: Don't use a bare hostname like mysite because if there's a top-level domain (TLD) that happens to have the same name (mysite), you'll run into issues. And it's not that unlikely: in 2020, there are over 1,500 TLDs, and the list is growing. coffee, museum, travel, and many large company names (maybe even the company you're working at!) are TLDs. See the full list here. Only use domains that are yours, or that are reserved for this purpose. If you don't have a domain of your own, you can use either test or localhost (mysite.localhost). test doesn't have special treatment in browsers, but localhost does: Chrome and Edge support http://<name>.localhost out of the box, and it will behave securely when localhost does. Try it out: run any site on localhost and access http://<whatever name you like>.localhost:<your port> in Chrome or Edge. This may soon be possible in Firefox and Safari as well. The reason you can do this (have subdomains like mysite.localhost) is because localhost is not just a hostname: it's also a full TLD, like com. Learn more # Secure contexts localhost as a secure context localhost as a secure context in Chrome With many thanks for contributions and feedback to all reviewers—especially Ryan Sleevi, Filippo Valsorda, Milica Mihajlija, Rowan Merewood and Jake Archibald. 🙌 Hero image by @moses_lee on Unsplash, edited.

Introducing Open Web Docs

High-quality documentation for web platform technologies is a critically important component of our shared, open digital infrastructure. Today, I'm excited to publicly introduce Open Web Docs, a collective project between Google, Microsoft, Mozilla, Coil, W3C, Samsung, and Igalia. It is designed to support a community of technical writers around strategic creation and long-term maintenance of web platform technology documentation that is open and inclusive for all. This is not a new docs platform: Open Web Docs is instead working closely with existing platforms, and its current priority is contributions to MDN Web Docs. It was created to ensure the long-term health of web platform documentation on de facto standard resources, independently of any single vendor or organization. Through full-time staff, community management, and Open Web Docs' network of partner organizations, it enables these resources to better maintain and sustain documentation of core web platform technologies. Rather than create new documentation sites, Open Web Docs is committed to improving existing platforms through our contributions. Head over to Open Web Docs and the launch post and FAQ to learn more!

How to use HTTPS for local development

Most of the time, http://localhost does what you need: in browsers, it mostly behaves like HTTPS 🔒. That's why some APIs that won't work on a deployed HTTP site will work on http://localhost. What this means is that you need to use HTTPS locally only in special cases (see When to use HTTPS for local development), like custom hostnames or Secure cookies across browsers. Keep reading if that's you! In this post, statements about localhost are valid for 127.0.0.1 and [::1] as well, since they both describe the local computer address, also called "loopback address". Also, to keep things simple, the port number isn't specified. So when you see http://localhost, read it as http://localhost:{PORT} or http://127.0.0.1:{PORT}. If your production website uses HTTPS, you want your local development site to behave like an HTTPS site (if your production website doesn't use HTTPS, make it a priority to switch to HTTPS). Most of the time, you can trust http://localhost to behave like an HTTPS site. But in some cases, you need to run your site locally with HTTPS. Let's take a look at how to do this. ⏩ Are you looking for quick instructions, or have you been here before? Skip to the Cheatsheet. Running your site locally with HTTPS using mkcert (recommended) # To use HTTPS with your local development site and access https://localhost or https://mysite.example (custom hostname), you need a TLS certificate. But browsers won't consider just any certificate valid: your certificate needs to be signed by an entity that is trusted by your browser, called a trusted certificate authority (CA). What you need to do is create a certificate and sign it with a CA that is trusted locally by your device and browser. mkcert is a tool that helps you do this in a few commands. Here's how it works: If you open your locally running site in your browser using HTTPS, your browser will check the certificate of your local development server.
Upon seeing that the certificate has been signed by the mkcert-generated certificate authority, the browser checks whether it's registered as a trusted certificate authority. mkcert is listed as a trusted authority, so your browser trusts the certificate and creates an HTTPS connection. A diagram of how mkcert works. mkcert (and similar tools) provide several benefits: mkcert is specialized in creating certificates that are compliant with what browsers consider valid certificates. It stays updated to match requirements and best practices. This is why you won't have to run mkcert commands with complex configurations or arguments to generate the right certificates! mkcert is a cross-platform tool. Anyone on your team can use it. mkcert is the tool we recommend for creating a TLS certificate for local development. You can check out other options too. Many operating systems may include libraries to produce certificates, such as openssl. Unlike mkcert and similar tools, such libraries may not consistently produce correct certificates, may require complex commands to be run, and are not necessarily cross-platform. Gotchas! The mkcert we're interested in in this post is this one, not this one. Caution # Never export or share the file rootCA-key.pem mkcert creates automatically when you run mkcert -install. An attacker getting hold of this file can create on-path attacks for any site you may be visiting. They could intercept secure requests from your machine to any site—your bank, healthcare provider, or social networks. If you need to know where rootCA-key.pem is located to make sure it's safe, run mkcert -CAROOT. Only use mkcert for development purposes—and by extension, never ask end-users to run mkcert commands. Development teams: all members of your team should install and run mkcert separately (not store and share the CA and certificate). Setup # Install mkcert (only once). Follow the instructions for installing mkcert on your operating system. 
For example, on macOS: brew install mkcert brew install nss # if you use Firefox Add mkcert to your local root CAs. In your terminal, run the following command: mkcert -install This generates a local certificate authority (CA). Your mkcert-generated local CA is only trusted locally, on your device. Generate a certificate for your site, signed by mkcert. In your terminal, navigate to your site's root directory or whichever directory you'd like the certificates to be located at. Then, run: mkcert localhost If you're using a custom hostname like mysite.example, run: mkcert mysite.example The command above does two things: Generates a certificate for the hostname you've specified Lets mkcert (that you've added as a local CA in Step 2) sign this certificate. Now, your certificate is ready and signed by a certificate authority your browser trusts locally. You're almost done, but your server doesn't know about your certificate yet! Configure your server. You now need to tell your server to use HTTPS (since development servers tend to use HTTP by default) and to use the TLS certificate you've just created. How to do this exactly depends on your server. A few examples: 👩🏻‍💻 With node: server.js (replace {PATH/TO/CERTIFICATE...} and {PORT}): const https = require('https'); const fs = require('fs'); const options = { key: fs.readFileSync('{PATH/TO/CERTIFICATE-KEY-FILENAME}.pem'), cert: fs.readFileSync('{PATH/TO/CERTIFICATE-FILENAME}.pem'), }; https .createServer(options, function (req, res) { // server code }) .listen({PORT}); 👩🏻‍💻 With http-server: Start your server as follows (replace {PATH/TO/CERTIFICATE...}): http-server -S -C {PATH/TO/CERTIFICATE-FILENAME}.pem -K {PATH/TO/CERTIFICATE-KEY-FILENAME}.pem -S runs your server with HTTPS, while -C sets the certificate and -K sets the key. 
👩🏻‍💻 With a React development server: Edit your package.json as follows, and replace {PATH/TO/CERTIFICATE...}: "scripts": { "start": "HTTPS=true SSL_CRT_FILE={PATH/TO/CERTIFICATE-FILENAME}.pem SSL_KEY_FILE={PATH/TO/CERTIFICATE-KEY-FILENAME}.pem react-scripts start" } For example, if you've created a certificate for localhost that is located in your site's root directory as follows: |-- my-react-app |-- package.json |-- localhost.pem |-- localhost-key.pem |--... Then your start script should look like this: "scripts": { "start": "HTTPS=true SSL_CRT_FILE=localhost.pem SSL_KEY_FILE=localhost-key.pem react-scripts start" } 👩🏻‍💻 Other examples: Angular development server, Python. ✨ You're done! Open https://localhost or https://mysite.example in your browser: you're running your site locally with HTTPS. You won't see any browser warnings, because your browser trusts mkcert as a local certificate authority. Your server may use a different port for HTTPS. Using mkcert: cheatsheet # To run your local development site with HTTPS: Set up mkcert. If you haven't yet, install mkcert, for example on macOS: brew install mkcert Check install mkcert for Windows and Linux instructions. Then, create a local certificate authority: mkcert -install Create a trusted certificate. mkcert {YOUR HOSTNAME e.g. localhost or mysite.example} This creates a valid certificate (that will be signed by mkcert automatically). Configure your development server to use HTTPS and the certificate you've created in Step 2. ✨ You're done! You can now access https://{YOUR HOSTNAME} in your browser, without warnings. Do this only for development purposes and never export or share the file rootCA-key.pem (if you need to know where this file is located to make sure it's safe, run mkcert -CAROOT). Running your site locally with HTTPS: other options # Self-signed certificate # You may also decide to not use a local certificate authority like mkcert, and instead sign your certificate yourself.
Beware of a few pitfalls with this approach: Browsers don't trust you as a certificate authority and they'll show warnings you'll need to bypass manually. In Chrome, you may use the flag #allow-insecure-localhost to bypass this warning automatically on localhost. It feels a bit hacky, because it is. This is unsafe if you're working in an insecure network. Self-signed certificates won't behave in exactly the same way as trusted certificates. It's not necessarily easier or faster than using a local CA like mkcert. If you're not using this technique in a browser context, you may need to disable certificate verification for your server. Omitting to re-enable it in production would be dangerous. The warnings browsers show when a self-signed certificate is used. If you don't specify any certificate, React's and Vue's development server HTTPS options create a self-signed certificate under the hood. This is quick, but you'll get browser warnings and encounter other pitfalls related to self-signed certificates that are listed above. Luckily you can use frontend frameworks' built-in HTTPS option and specify a locally trusted certificate created via mkcert or similar. See how to do this in the mkcert with React example. If you open your locally running site in your browser using HTTPS, your browser will check the certificate of your local development server. When it sees that the certificate has been signed by yourself, it checks whether you're registered as a trusted certificate authority. Because you're not, your browser can't trust the certificate; it displays a warning telling you that your connection is not secure. You may proceed at your own risk—if you do, an HTTPS connection will be created. Why browsers don't trust self-signed certificates. Certificate signed by a regular certificate authority # You may also find techniques based on having an actual certificate authority—not a local one—sign your certificate. 
A few things to keep in mind if you're considering using these techniques: You'll have more setup work to do than when using a local CA technique like mkcert. You need to use a domain name that you control and that is valid. This means that you can't use actual certificate authorities for: localhost and other domain names that are reserved, such as example or test. Any domain name that you don't control. Top-level domains that are not valid. See the list of valid top-level domains. Reverse proxy # Another option to access a locally running site with HTTPS is to use a reverse proxy such as ngrok. A few points to consider: Anyone can access your local development site once you share with them a URL created with a reverse proxy. This can be very handy when demoing your project to clients! Or this can be a downside, if your project is sensitive. You may need to consider pricing. New security measures in browsers may affect the way these tools work. Flag (not recommended) # If you're using a custom hostname like mysite.example, you can use a flag in Chrome to forcefully consider mysite.example secure. Avoid doing this, because: You would need to be 100% sure that mysite.example always resolves to a local address, otherwise you could leak production credentials. You won't be able to debug across browsers with this trick 🙀. With many thanks for contributions and feedback to all reviewers and contributors—especially Ryan Sleevi, Filippo Valsorda, Milica Mihajlija and Rowan Merewood. 🙌 Hero image background by @anandu on Unsplash, edited.

Feedback wanted: The road to a better layout shift metric for long-lived pages

Cumulative Layout Shift (CLS) is a metric that measures the visual stability of a web page. The metric is called cumulative layout shift because the score of every individual shift is summed throughout the lifespan of the page. While all layout shifts are poor user experiences, they do add up more on pages that are open longer. That's why the Chrome Speed Metrics Team set out to improve the CLS metric to be more neutral to the time spent on a page. It's important that the metric focuses on user experience through the full page lifetime, as we've found that users often have negative experiences after load, while scrolling or navigating through pages. But we've heard concerns about how this impacts long-lived pages—pages which the user generally has open for a long time. There are several different types of pages which tend to stay open longer; some of the most common are social media apps with infinite scroll and single-page applications. An internal analysis of long-lived pages with high CLS scores found that most problems were caused by the following patterns: Infinite scrollers shifting content as the user scrolls. Input handlers taking longer than 500 ms to update the UI in response to a user interaction, without any kind of placeholder or skeleton pattern. While we encourage developers to improve those user experiences, we're also working towards improving the metric and looking for feedback on possible approaches. How would we decide if a new metric is better? # Before diving into metric design, we wanted to ensure that we evaluated our ideas on a wide variety of real-world web pages and use cases. To start, we designed a small user study. First, we recorded videos and Chrome traces of 34 user journeys through various websites. In selecting the user journeys, we aimed for a few things: A variety of different types of sites, such as news and shopping sites. 
A variety of user journeys, such as initial page load, scrolling, single-page app navigations, and user interactions. A variety of both number and intensity of individual layout shifts on the sites. Few negative experiences on the sites apart from layout shifts. We asked 41 of our colleagues to watch two videos at a time, rating which of the pair was better in terms of layout shift. From these ratings, we created an idealized ranking order of the sites. The results of the user ranking confirmed our suspicions that our colleagues, like most users, are really frustrated by layout shifts after load, especially during scrolling and single-page app navigations. We saw that some sites have much better user experiences during these activities than others. Since we recorded Chrome traces along with the videos, we had all the details of the individual layout shifts in each user journey. We used those to compute metric values for each idea for each user journey. This allowed us to see how each metric variant ranked the user journeys, and how different each was from the ideal ranking. What metric ideas did we test? # Windowing strategies # Often pages have multiple layout shifts bunched closely together, because elements can shift multiple times as new content comes in piece by piece. This prompted us to try out techniques for grouping shifts together. To accomplish that, we looked at three windowing approaches: Tumbling windows Sliding windows Session windows In each of these examples, the page has layout shifts of varying severity over time. Each blue bar represents a single layout shift, and the length represents the score of that shift. The images illustrate the ways different windowing strategies group the layout shifts over time. Tumbling windows # The simplest approach is just to break the page into windows of equal-sized chunks. These are called tumbling windows. 
You'll notice above that the fourth bar really looks like it should be grouped into the second tumbling window, but because the windows are all a fixed size it is in the first window instead. If there are slight differences in timing of loads or user interactions on the page, the same layout shifts might fall on different sides of the tumbling window boundaries. Sliding windows # An approach that lets us see more possible groupings of the same length is to continuously update the potential window over time. The image above shows one sliding window at a time, but we could look at all possible sliding windows or a subset of them to create a metric. Session windows # If we wanted to focus on identifying areas of the page with bursts of layout shifts, we could start each window at a shift, and keep growing it until we encountered a gap of a given size between layout shifts. This approach groups the layout shifts together, and ignores most of the non-shifting user experience. One potential problem is that if there are no gaps in the layout shifts, a metric based on session windows could grow unbounded just like the current CLS metric. So we tried this out with a maximum window size as well. Window sizes # The metric might give very different results depending on how big the windows actually are, so we tried multiple different window sizes: Each shift as its own window (no windows) 100 ms 300 ms 1 second 5 seconds Summarization # We tried out many ways to summarize the different windows. Percentiles # We looked at the maximum window value, as well as the 95th percentile, 75th percentile, and median. Average # We looked at the mean window value. Budgets # We wondered if maybe there was some minimum layout shift score that users wouldn't notice, and we could just count layout shifts over that "budget" in the score. So for various potential "budget" values, we looked at the percentage of shifts over budget, and the total shift score over budget. 
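The session-window grouping and the maximum and mean summaries described above can be sketched roughly as follows. This is an illustrative sketch, not Chrome's implementation: the function names, the {time, value} shape of a shift, and the choice to start a new window when the gap or the window duration reaches the limit are all assumptions made for this example.

```javascript
// Assumed input: an array of {time, value} objects sorted by time, where
// `time` is milliseconds since page start and `value` is the score of one
// layout shift. All names here are hypothetical.
function sessionWindows(shifts, gapMs = 1000, maxDurationMs = Infinity) {
  const windows = [];
  let current = null;
  for (const shift of shifts) {
    const startsNewWindow =
      current === null ||
      // A quiet gap since the previous shift closes the window.
      shift.time - current.lastTime >= gapMs ||
      // An optional cap keeps a window from growing without bound.
      shift.time - current.startTime >= maxDurationMs;
    if (startsNewWindow) {
      current = {startTime: shift.time, lastTime: shift.time, score: 0};
      windows.push(current);
    }
    current.lastTime = shift.time;
    current.score += shift.value;
  }
  return windows;
}

// Summaries over the windows: the maximum and the mean window score.
function maxWindowScore(windows) {
  return windows.reduce((max, w) => Math.max(max, w.score), 0);
}

function meanWindowScore(windows) {
  if (windows.length === 0) return 0;
  const total = windows.reduce((sum, w) => sum + w.score, 0);
  return total / windows.length;
}

// Example: two bursts of shifts separated by more than one second.
const exampleShifts = [
  {time: 0, value: 0.25},
  {time: 400, value: 0.25},
  {time: 3000, value: 0.125},
  {time: 3500, value: 0.5},
];
const exampleWindows = sessionWindows(exampleShifts, 1000, 5000);
console.log(maxWindowScore(exampleWindows), meanWindowScore(exampleWindows));
```

Passing Infinity as the cap gives the uncapped variant, in which a page with no gaps between shifts produces one long window.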
Other strategies # We also looked at many strategies that didn't involve windows, like the total layout shift divided by time on page, and the average of the worst N individual shifts. The initial results # Overall, we tested 145 different metric definitions based on permutations of the above ideas. For each metric, we ranked all the user journeys by their score on the metric, and then ranked the metrics by how close they were to the ideal ranking. To get a baseline, we also ranked all the sites by their current CLS score. CLS placed 32nd, tied with 13 other strategies, so it was better than most permutations of the strategies above. To ensure the results were meaningful, we also added in three random orderings. As expected, the random orderings did worse than every strategy tested. To understand if we might be overfitting for the data set, after our analysis we recorded some new layout shift videos and traces, manually ranked those, and saw that the metric rankings were very similar for the new data set and the original one. A few different strategies stood out in the rankings. Best strategies # When we ranked the strategies, we found that three types of strategies topped the list. Each had roughly the same performance, so we plan to move forward with a deeper analysis on all three. We'd also like to hear developer feedback to understand if there are factors outside of user experience we should be considering when deciding between them. (See below for how to give feedback.) High percentiles of long windows # A few windowing strategies worked well with long window sizes: 1 second sliding windows Session windows capped at 5 seconds with 1 second gap Session windows uncapped with 1 second gap These all ranked really well at both the 95th percentile and the maximum. But for such large window sizes, we were concerned about using the 95th percentile—often we were looking at only 4-6 windows, and taking the 95th percentile of that is a lot of interpolation. 
It's unclear what the interpolation is doing in terms of the metric value. The maximum value is a lot clearer, so we decided to move forward with checking the maximum. Average of session windows with long gaps # Averaging the scores of all uncapped session windows with 5 second gaps between them performed really well. This strategy has a few interesting characteristics: If the page doesn't have gaps between layout shifts, it ends up being one long session window with the exact same score as the current CLS. This metric didn't take idle time into account directly; it only looked at the shifts that happened on the page, and not at points in time when the page was not shifting. High percentiles of short windows # The maximum 300 ms sliding window ranked very highly, as did the 95th percentile. For the shorter window size, there is less percentile interpolation than for larger window sizes, but we were also concerned about "repeat" sliding windows: if a set of layout shifts occurs over two frames, there are multiple 300 ms windows that include them. Taking the maximum is much clearer and simpler than taking the 95th percentile. So again we decided to move forward with checking the maximum. Strategies that didn't work out # Strategies that tried to look at the "average" experience of time spent both without layout shifts and with layout shifts did very poorly. None of the median or 75th percentile summaries of any windowing strategy ranked the sites well. Neither did the sum of layout shifts over time. We evaluated a number of different "budgets" for acceptable layout shifts: Percent of layout shifts above some budget. For various budgets, these all ranked quite poorly. Average layout shift above some excess. Most variations on this strategy did poorly, but average excess over a long session with a large gap did almost as well as the average of session windows with long gaps. We decided to move forward with only the latter because it is simpler.
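As a rough illustration of the "maximum sliding window" idea discussed above, the sketch below finds the largest total shift score that fits in any window of a given length. It assumes each layout shift is a point event of the form {time, value}, sorted by time. The function name and data shape are hypothetical, and this is not the implementation used in Chrome.

```javascript
// Assumed input: {time, value} objects sorted by time (milliseconds).
// Returns the largest sum of shift values that fits in any window shorter
// than `windowMs`. Hypothetical name, for illustration only.
function maxSlidingWindowScore(shifts, windowMs = 300) {
  let best = 0;
  let sum = 0;
  let start = 0;
  for (let end = 0; end < shifts.length; end++) {
    sum += shifts[end].value;
    // Drop shifts that are too old to share a window with the newest one.
    while (shifts[end].time - shifts[start].time >= windowMs) {
      sum -= shifts[start].value;
      start++;
    }
    best = Math.max(best, sum);
  }
  return best;
}

// Example: the second and third shifts are 250 ms apart, so they share a window.
console.log(
  maxSlidingWindowScore([
    {time: 0, value: 0.25},
    {time: 100, value: 0.25},
    {time: 350, value: 0.5},
    {time: 1000, value: 0.125},
  ])
);
```

Because only the best window is reported, the overlapping "repeat" windows mentioned above do not inflate the result the way they could for a percentile.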
Next steps # Larger-scale analysis # We've implemented the top strategies listed above in Chrome, so that we can get data on real-world usage for a much larger set of websites. We plan to use a similar approach of ranking sites based on their metric scores to do the larger-scale analysis: Rank all the sites by CLS, and by each new metric candidate. Which sites are ranked most differently by CLS and each candidate? Do we find anything unexpected when we look at these sites? What are the largest differences between the new metric candidates? Do any of the differences stand out as advantages or disadvantages of a specific candidate? Repeat the above analysis, but bucketing by time spent on each page load. Do we see an expected improvement for long-lived page loads with acceptable layout shift? Do we see any unexpected results for short-lived pages? Feedback on our approach # We'd love to get feedback from web developers on these approaches. Some things to keep in mind while considering the new approaches: What's not changing # We do want to clarify that a lot of things will not be changing with a new approach: None of our metric ideas change the way layout shift scores for individual frames are calculated, only the way we summarize multiple frames. This means that the JavaScript API for layout shifts will stay the same, and the underlying events in Chrome traces that developer tools use will also stay the same, so layout shift rects in tools like WebPageTest and Chrome DevTools will continue to work the same way. We'll continue to work hard on making the metrics easy for developers to adopt, including them in the web-vitals library, documenting on web.dev, and reporting them in our developer tooling like Lighthouse. Trade-offs between metrics # One of the top strategies summarizes the layout shift windows as an average, and the rest report the maximum window. 
For pages which are open a very long time, the average will likely report a more representative value, but in general it will likely be easier for developers to act on a single window—they can log when it occurred, the elements that shifted, and so on. We'd love feedback on which is more important to developers. Do you find sliding or session windows easier to understand? Are the differences important to you? How to give feedback # You can try out the new layout shift metrics on any site using our example JavaScript implementations or our fork of the Core Web Vitals extension. Please email feedback to our web-vitals-feedback Google group, with "[Layout Shift Metrics]" in the subject line. We're really looking forward to hearing what you think!

Building a sidenav component

In this post I want to share with you how I prototyped a Sidenav component for the web that is responsive, stateful, supports keyboard navigation, works with and without JavaScript, and works across browsers. Try the demo. If you prefer video, here's a YouTube version of this post: Overview # It's tough building a responsive navigation system. Some users will be on a keyboard, some will have powerful desktops, and some will visit from a small mobile device. Everyone visiting should be able to open and close the menu. Web Tactics # In this component exploration I had the joy of combining a few critical web platform features: CSS :target CSS grid CSS transforms CSS Media Queries for viewport and user preference JS for focus UX enhancements My solution has one sidebar and toggles only when at a "mobile" viewport of 540px or less. 540px will be our breakpoint for switching between the mobile interactive layout and the static desktop layout. CSS :target pseudo-class # One <a> link sets the url hash to #sidenav-open and the other to empty (''). Lastly, an element has the id to match the hash: <a href="#sidenav-open" id="sidenav-button" title="Open Menu" aria-label="Open Menu"> <a href="#" id="sidenav-close" title="Close Menu" aria-label="Close Menu"></a> <aside id="sidenav-open"> … </aside> Clicking each of these links changes the hash state of our page URL, then with a pseudo-class I show and hide the sidenav: @media (max-width: 540px) { #sidenav-open { visibility: hidden; } #sidenav-open:target { visibility: visible; } } CSS Grid # In the past, I only used absolute or fixed position sidenav layouts and components. Grid though, with its grid-area syntax, lets us assign multiple elements to the same row or column. Stacks # The primary layout element #sidenav-container is a grid that creates 1 row and 2 columns, 1 of each are named stack. 
When space is constrained, CSS assigns all of the <main> element's children to the same grid name, placing all elements into the same space, creating a stack. #sidenav-container { display: grid; grid: [stack] 1fr / min-content [stack] 1fr; min-height: 100vh; } @media (max-width: 540px) { #sidenav-container > * { grid-area: stack; } } Menu backdrop # The <aside> is the animating element that contains the side navigation. It has 2 children: the navigation container <nav> named [nav] and a backdrop <a> named [escape], which is used to close the menu. #sidenav-open { display: grid; grid-template-columns: [nav] 2fr [escape] 1fr; } Adjust 2fr & 1fr to find the ratio you like for the menu overlay and its negative space close button. CSS 3D transforms & transitions # Our layout is now stacked at a mobile viewport size. Until I add some new styles, it's overlaying our article by default. Here's some UX I'm shooting for in this next section: Animate open and close Only animate with motion if the user is OK with that Animate visibility so keyboard focus doesn't enter the offscreen element As I begin to implement motion animations, I want to start with accessibility top of mind. Accessible motion # Not everyone will want a slide out motion experience. In our solution this preference is applied by adjusting a --duration CSS variable inside a media query. This media query value represents a user's operating system preference for motion (if available). #sidenav-open { --duration: .6s; } @media (prefers-reduced-motion: reduce) { #sidenav-open { --duration: 1ms; } } A demo of the interaction with and without duration applied. Now when our sidenav is sliding open and closed, if a user prefers reduced motion, I instantly move the element into view, maintaining state without motion. Transition, transform, translate # Sidenav out (default) # To set the default state of our sidenav on mobile to an offscreen state, I position the element with transform: translateX(-110vw). 
Note, I added another 10vw to the typical offscreen code of -100vw, to ensure the box-shadow of the sidenav doesn't peek into the main viewport when it's hidden. @media (max-width: 540px) { #sidenav-open { visibility: hidden; transform: translateX(-110vw); will-change: transform; transition: transform var(--duration) var(--easeOutExpo), visibility 0s linear var(--duration); } } Sidenav in # When the #sidenav-open element matches as :target, set the translateX() position to homebase 0, and watch as CSS slides the element from its "out" position of -110vw to its "in" position of 0 over var(--duration) when the URL hash is changed. @media (max-width: 540px) { #sidenav-open:target { visibility: visible; transform: translateX(0); transition: transform var(--duration) var(--easeOutExpo); } } Transition visibility # The goal now is to hide the menu from screen readers when it's out, so systems don't put focus into an offscreen menu. I accomplish this by setting a visibility transition when the :target changes. When going in, don't transition visibility; be visible right away so I can see the element slide in and accept focus. When going out, transition visibility but delay it, so it flips to hidden at the end of the transition out. Accessibility UX enhancements # Links # This solution relies on changing the URL in order for the state to be managed. Naturally, the <a> element should be used here, and it gets some nice accessibility features for free. Let's adorn our interactive elements with labels clearly articulating intent. <a href="#" id="sidenav-close" title="Close Menu" aria-label="Close Menu"></a> <a href="#sidenav-open" id="sidenav-button" class="hamburger" title="Open Menu" aria-label="Open Menu"> <svg>...</svg> </a> A demo of the voiceover and keyboard interaction UX. Now our primary interaction buttons clearly state their intent for both mouse and keyboard.
:is(:hover, :focus) # This handy CSS functional pseudo-selector lets us swiftly be inclusive with our hover styles by sharing them with focus as well. .hamburger:is(:hover, :focus) svg > line { stroke: hsl(var(--brandHSL)); } Sprinkle on JavaScript # Press escape to close # The Escape key on your keyboard should close the menu right? Let's wire that up. const sidenav = document.querySelector('#sidenav-open'); sidenav.addEventListener('keyup', event => { if (event.code === 'Escape') document.location.hash = ''; }); Focus UX # The next snippet helps us put focus on the open and close buttons after they open or close. I want to make toggling easy. sidenav.addEventListener('transitionend', e => { const isOpen = document.location.hash === '#sidenav-open'; isOpen ? document.querySelector('#sidenav-close').focus() : document.querySelector('#sidenav-button').focus(); }) When the sidenav opens, focus the close button. When the sidenav closes, focus the open button. I do this by calling focus() on the element in JavaScript. Conclusion # Now that you know how I did it, how would you?! This makes for some fun component architecture! Who's going to make the 1st version with slots? 🙂 Let's diversify our approaches and learn all the ways to build on the web. Create a Glitch, tweet me your version, and I'll add it to the Community remixes section below. Community remixes # @_developit with custom elements: demo & code @mayeedwin1 with HTML/CSS/JS: demo & code @a_nurella with a Glitch Remix: demo & code @EvroMalarkey with HTML/CSS/JS: demo & code

Deprecating Excalidraw Electron in favor of the web version

Excalidraw is a virtual collaborative whiteboard that lets you easily sketch diagrams that feel hand-drawn. This article was cross-posted to and first appeared on the Excalidraw blog. On the Excalidraw project, we have decided to deprecate Excalidraw Desktop, an Electron wrapper for Excalidraw, in favor of the web version that you can—and always could—find at excalidraw.com. After a careful analysis, we have decided that Progressive Web App (PWA) is the future we want to build upon. Read on to learn why. How Excalidraw Desktop came into being # Soon after @vjeux created the initial version of Excalidraw in January 2020 and blogged about it, he proposed the following in Issue #561: Would be great to wrap Excalidraw within Electron (or equivalent) and publish it as a [platform-specific] application to the various app stores. The immediate reaction by @voluntadpear was to suggest: What about making it a PWA instead? Android currently supports adding them to the Play Store as Trusted Web Activities and hopefully iOS will do the same soon. On Desktop, Chrome lets you download a desktop shortcut to a PWA. The decision that @vjeux took in the end was simple: We should do both :) While work on converting the version of Excalidraw into a PWA was started by @voluntadpear and later others, @lipis independently went ahead and created a separate repo for Excalidraw Desktop. To this day, the initial goal set by @vjeux, that is, to submit Excalidraw to the various app stores, has not been reached yet. Honestly, no one has even started the submission process to any of the stores. But why is that? Before I answer, let's look at Electron, the platform. What is Electron? # The unique selling point of Electron is that it allows you to "build cross-platform desktop apps with JavaScript, HTML, and CSS". Apps built with Electron are "compatible with Mac, Windows, and Linux", that is, "Electron apps build and run on three platforms". 
According to the homepage, the hard parts that Electron makes easy are automatic updates, system-level menus and notifications, crash reporting, debugging and profiling, and Windows installers. Turns out, some of the promised features need a detailed look at the small print. For example, automatic updates "are [currently] only [supported] on macOS and Windows. There is no built-in support for auto-updater on Linux, so it is recommended to use the distribution's package manager to update your app". Developers can create system-level menus by calling Menu.setApplicationMenu(menu). On Windows and Linux, the menu will be set as each window's top menu, while on macOS there are many system-defined standard menus, like the Services menu. To make one's menus a standard menu, developers should set their menu's role accordingly, and Electron will recognize them and make them become standard menus. This means that a lot of menu-related code will use the following platform check: const isMac = process.platform === 'darwin'. Windows installers can be made with windows-installer. The README of the project highlights that "for a production app you need to sign your application. Internet Explorer's SmartScreen filter will block your app from being downloaded, and many anti-virus vendors will consider your app as malware unless you obtain a valid cert" [sic]. Looking at just these three examples, it is clear that Electron is far from "write once, run everywhere". Distributing an app on app stores requires code signing, a security technology for certifying app ownership. Packaging an app requires using tools like electron-forge and thinking about where to host packages for app updates. It gets complex relatively quickly, especially when the objective truly is cross platform support. I want to note that it is absolutely possible to create stunning Electron apps with enough effort and dedication. For Excalidraw Desktop, we were not there. 
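To make the platform check concrete, here is a minimal sketch of how menu code tends to branch on process.platform. The buildMenuTemplate helper and the chosen menu entries are hypothetical and are not taken from Excalidraw Desktop; the roles used (appMenu, editMenu, close, quit) are standard Electron menu roles.

```javascript
// Hypothetical helper (not part of Electron): returns a menu template
// that differs per platform, mirroring the isMac check quoted above.
function buildMenuTemplate(platform) {
  const isMac = platform === 'darwin';
  return [
    // On macOS the first menu is the application menu; the 'appMenu'
    // role asks Electron to provide the standard one.
    ...(isMac ? [{ role: 'appMenu' }] : []),
    {
      label: 'File',
      // macOS apps usually close the window here; others quit the app.
      submenu: [isMac ? { role: 'close' } : { role: 'quit' }],
    },
    { role: 'editMenu' },
  ];
}

// In an Electron main process the template would be applied like this:
//   const { Menu } = require('electron');
//   const template = buildMenuTemplate(process.platform);
//   Menu.setApplicationMenu(Menu.buildFromTemplate(template));

const macTemplate = buildMenuTemplate('darwin');
const winTemplate = buildMenuTemplate('win32');
console.log(macTemplate.length, winTemplate.length);
```

Keeping the template in a plain function means the platform branching can be checked without launching Electron, but every such branch is still code that has to be written and tested per platform.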
Where Excalidraw Desktop left off # Excalidraw Desktop so far is basically the Excalidraw web app bundled as an .asar file with an added About Excalidraw window. The look and feel of the application is almost identical to the web version. Excalidraw Desktop is almost indistinguishable from the web version The About Excalidraw menu providing insights into the versions On macOS, there is now a system-level menu at the top of the application, but since none of the menu actions, apart from Close Window and About Excalidraw, are hooked up to anything, the menu is, in its current state, pretty useless. Meanwhile, all actions can of course be performed via the regular Excalidraw toolbars and the context menu. The menu bar of Excalidraw Desktop on macOS We use electron-builder, which supports file type associations. By double-clicking an .excalidraw file, ideally the Excalidraw Desktop app should open. The relevant excerpt of our electron-builder.json file looks like this: { "fileAssociations": [ { "ext": "excalidraw", "name": "Excalidraw", "description": "Excalidraw file", "role": "Editor", "mimeType": "application/json" } ] } Unfortunately, in practice, this does not always work as intended, since, depending on the installation type (for the current user, for all users), apps on Windows 10 do not have the rights to associate a file type to themselves. These shortcomings and the pending work to make the experience truly app-like on all platforms (which, again, with enough effort is possible) were a strong argument for us to reconsider our investment in Excalidraw Desktop. The way bigger argument for us, though, was that we foresee that for our use case, we do not need all the features Electron offers. The grown and still growing set of capabilities of the web serves us equally well, if not better. How the web serves us today and in the future # Even in 2020, jQuery is still incredibly popular.
For many developers it has become a habit to use it, despite the fact that today they might not need jQuery. There is a similar resource for Electron, aptly called You Might Not Need Electron. Let me outline why we think we do not need Electron. Installable Progressive Web App # Excalidraw today is an installable Progressive Web App with a service worker and a Web App Manifest. It caches all its resources in two caches, one for fonts and font-related CSS, and one for everything else. Excalidraw's cache contents This means the application is fully offline-capable and can run without a network connection. Chromium-based browsers on both desktop and mobile prompt the user to install the app. You can see the installation prompt in the screenshot below. The Excalidraw install dialog in Chrome Excalidraw is configured to run as a standalone application, so when you install it, you get an app that runs in its own window. It is fully integrated in the operating system's multitasking UI and gets its own app icon on the home screen, Dock, or task bar; depending on the platform where you install it. The Excalidraw PWA in a standalone window The Excalidraw icon on the macOS Dock File system access # Excalidraw uses browser-fs-access for accessing the file system of the operating system. On supporting browsers, this allows for a true open→edit→save workflow and actual over-saving and "save as", with a transparent fallback for other browsers. You can learn more about this feature in my blog post Reading and writing files and directories with the browser-fs-access library. Drag and drop support # Files can be dragged and dropped onto the Excalidraw window just as in platform-specific applications. On a browser that supports the File System Access API, a dropped file can be immediately edited and the modifications be saved to the original file. This is so intuitive that you sometimes forget that you are dealing with a web app. 
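As a rough sketch of how a drop handler might obtain editable file handles, assuming a browser that exposes DataTransferItem.getAsFileSystemHandle() from the File System Access API: the getDroppedFiles helper and the "canvas" element id below are hypothetical and are not Excalidraw's actual code. In other browsers the sketch falls back to plain, read-only File objects.

```javascript
// Hypothetical helper: resolves to file handles (or File objects as a
// fallback) for everything that was dropped.
async function getDroppedFiles(event) {
  event.preventDefault();
  // Start every lookup synchronously: the item list may no longer be
  // readable once the handler has yielded at an `await`.
  const pending = [...event.dataTransfer.items]
    .filter((item) => item.kind === 'file')
    .map((item) =>
      typeof item.getAsFileSystemHandle === 'function'
        ? item.getAsFileSystemHandle() // handle that can later be written to
        : Promise.resolve(item.getAsFile()) // fallback: read-only File
    );
  return Promise.all(pending);
}

// Browser wiring (assumes an element with id "canvas" exists):
//   const target = document.querySelector('#canvas');
//   target.addEventListener('dragover', (e) => e.preventDefault());
//   target.addEventListener('drop', async (e) => {
//     const [first] = await getDroppedFiles(e);
//     if (first && first.kind === 'file') {
//       const file = await first.getFile(); // read the dropped file
//     }
//   });
```

With a handle in hand, changes can be written back to the original file; with only a File object, saving has to go through a download or a "save as" style flow instead.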
Clipboard access # Excalidraw works well with the operating system's clipboard. Entire Excalidraw drawings or also just individual objects can be copied and pasted in image/png and image/svg+xml formats, allowing for an easy integration with other platform-specific tools like Inkscape or web-based tools like SVGOMG. The Excalidraw context menu offering clipboard actions File handling # Excalidraw already supports the experimental File Handling API, which means .excalidraw files can be double-clicked in the operating system's file manager and open directly in the Excalidraw app, since Excalidraw registers as a file handler for .excalidraw files in the operating system. Declarative link capturing # Excalidraw drawings can be shared by link. Here is an example. In the future, if people have Excalidraw installed as a PWA, such links will not open in a browser tab, but launch a new standalone window. Pending implementation, this will work thanks to declarative link capturing, an, at the time of writing, bleeding-edge proposal for a new web platform feature. Conclusion # The web has come a long way, with more and more features landing in browsers that only a couple of years or even months ago were unthinkable on the web and exclusive to platform-specific applications. Excalidraw is at the forefront of what is possible in the browser, all while acknowledging that not all browsers on all platforms support each feature we use. By betting on a progressive enhancement strategy, we enjoy the latest and greatest wherever possible, but without leaving anyone behind. Best viewed in any browser. Electron has served us well, but in 2020 and beyond, we can live without it. Oh, and for that objective of @vjeux: since the Android Play Store now accepts PWAs in a container format called Trusted Web Activity and since the Microsoft Store supports PWAs, too, you can expect Excalidraw in these stores in the not too distant future. 
Meanwhile, you can always use and install Excalidraw in and from the browser. Acknowledgements # This article was reviewed by @lipis, @dwelle, and Joe Medley.

Centering in CSS

Centering in CSS is a notorious challenge, fraught with jokes and mockery. 2020 CSS is all grown up and now we can laugh at those jokes honestly, not through clenched teeth. If you prefer video, here's a YouTube version of this post: The challenge # There are different types of centering, arising from differing use cases, the number of things to center, and so on. In order to demonstrate a rationale behind "a winning" centering technique, I created The Resilience Ringer. It's a series of stress tests for each centering strategy to balance within and you to observe their performance. At the end, I reveal the highest scoring technique, as well as a "most valuable." Hopefully you walk away with new centering techniques and solutions. The Resilience Ringer # The Resilience Ringer is a representation of my beliefs that a centering strategy should be resilient to international layouts, variable sized viewports, text edits and dynamic content. These tenets helped shape the following resilience tests for the centering techniques to endure: Squished: centering should be able to handle changes to width Squashed: centering should be able to handle changes to height Duplicate: centering should be dynamic to number of items Edit: centering should be dynamic to length and language of content Flow: centering should be document direction and writing mode agnostic The winning solution should demonstrate its resilience by keeping contents in center while being squished, squashed, duplicated, edited, and swapped to various language modes and directions. Trustworthy and resilient center, a safe center.
Legend # I've provided some visual color hinting to help you keep some meta information in context: A pink border indicates ownership of centering styles The grey box is the background on the container which seeks to have centered items Each child has a white background color so you can see any effects the centering technique has on child box sizes (if any) The 5 Contestants # 5 centering techniques enter the Resilience Ringer, only one will receive the Resilience Crown 👸. 1. Content Center # VisBug Squish: great! Squash: great! Duplicate: great! Edit: great! Flow: great! It's going to be hard to beat the conciseness of display: grid and the place-content shorthand. Since it centers and justifies children collectively, it's a solid centering technique for groups of elements meant to be read. .content-center { display: grid; place-content: center; gap: 1ch; } Pros Content is centered even under constrained space and overflow Centering edits and maintenance are all in one spot Gap guarantees equal spacing amongst n children Grid creates rows by default Cons The widest child (max-content) sets the width for all the rest. This will be discussed more in Gentle Flex. Great for macro layouts containing paragraphs and headlines, prototypes, or generally things that need legible centering. place-content is not exclusive to display: grid. display: flex can also benefit from the place-content and place-items shorthands. 2. Gentle Flex # Squish: great! Squash: great! Duplicate: great! Edit: great! Flow: great! Gentle Flex is a truer centering-only strategy. It's soft and gentle, because unlike place-content: center, no children's box sizes are changed during the centering. As gently as possible, all items are stacked, centered, and spaced.
.gentle-flex { display: flex; flex-direction: column; align-items: center; justify-content: center; gap: 1ch; } Pros Only handles alignment, direction, and distribution Edits and maintenance are all in one spot Gap guarantees equal spacing amongst n children Cons Most lines of code Great for both macro and micro layouts. Key Term: Macro layouts are like states or territories of a country: very high-level, large coverage zones. The zones created by macro layouts tend to contain more layouts. The less surface the layout covers, the less of a macro layout it becomes. As a layout covers less surface area or contains less layouts, it becomes more of a micro layout. 3. Autobot # Squish: great Squash: great Duplicate: fine Edit: great Flow: great The container is set to flex with no alignment styles, while the direct children are styled with auto margins. There's something nostalgic and wonderful about margin: auto working on all sides of the element. .autobot { display: flex; } .autobot > * { margin: auto; } Pros Fun trick Quick and dirty Cons Awkward results when overflowing Reliance on distribution instead of gap means layouts can occur with children touching sides Being "pushed" into position doesn't seem optimal and can result in a change to the child's box size Great for centering icons or pseudo-elements. 4. Fluffy Center # Squish: bad Squash: bad Duplicate: bad Edit: great! Flow: great! (so long as you use logical properties) Contestant "fluffy center" is by far our tastiest sounding contender, and is the only centering technique that's entirely element/child owned. See our solo inner pink border!? .fluffy-center { padding: 10ch; } Pros Protects content Atomic Every test is secretly containing this centering strategy Word space is gap Cons Illusion of not being useful There's a clash between the container and the items, naturally since each are being very firm about their sizing Great for word or phrase-centric centering, tags, pills, buttons, chips, and more. 5. 
Pop & Plop # Squish: okay Squash: okay Duplicate: bad Edit: fine Flow: fine This "pops" because the absolute positioning pops the element out of normal flow. The "plop" part of the name comes from when I find it most useful: plopping it on top of other stuff. It's a classic and handy overlay centering technique that's flexible and dynamic to content size. Sometimes you just need to plop UI on top of other UI. Pros Useful Reliable When you need it, it’s invaluable Cons Code with negative percentage values Requires position: relative to force a containing block Early and awkward line breaks There can be only 1 per containing block without additional effort Great for modals, toasts and messages, stacks and depth effects, popovers. The winner # If I was on an island and could only have 1 centering technique, it would be… [drum roll] Gentle Flex 🎉 .gentle-flex { display: flex; flex-direction: column; align-items: center; justify-content: center; gap: 1ch; } You can always find it in my stylesheets because it's useful for both macro and micro layouts. It's an all-around reliable solution with results that match my expectations. Also, because I'm an intrinsic sizing junkie, I tend to graduate into this solution. True, it's a lot to type out, but the benefits it provides outweigh the extra code. MVP # Fluffy Center .fluffy-center { padding: 2ch; } Fluffy Center is so micro that it's easy to overlook as a centering technique, but it's a staple of my centering strategies. It's so atomic that sometimes I forget I'm using it. Conclusion # What types of things break your centering strategies? What other challenges could be added to the resilience ringer? I considered translation and an auto-height switch on the container… what else!? Now that you know how I did it, how would you?! Let's diversify our approaches and learn all the ways to build on the web. Follow the codelab with this post to create your own centering example, just like the ones in this post.
Tweet me your version, and I'll add it to the Community remixes section below. Community remixes # CSS Tricks with a blog post

Love your cache ❤️

This post is a companion to the Love your cache video, part of the Extended Content at Chrome Dev Summit 2020. Be sure to check out the video: When users load your site a second time, their browser will use resources inside its HTTP cache to help make that load faster. But the standards for caching on the web date back to 1999, and they're defined pretty broadly—determining whether a file, like CSS or an image, might be fetched again from the network versus loaded from your cache is a bit of an inexact science. In this post, I'll talk through a sensible, modern default for caching—one that actually does no caching at all. But that's just the default, and it's of course more nuanced than just "turning it off". Read on! Something to remember when building your site is that performance metrics like Core Web Vitals include all loads, not just the 1st load. Yet, a lot of Google's guidance focuses on optimizing the first load (which is definitely important to bring users in!), and Lighthouse only tests your site on an empty cache. Goals # When a site is loaded for the 2nd time, you have two goals: Ensure that your users get the most up-to-date version available—if you've changed something, it should be reflected quickly Do #1 while fetching as little from the network as possible In the broadest sense, you only want to send the smallest change to your clients when they load your site again. And structuring your site to ensure the most efficient distribution of any change is challenging (more on that below, and in the video). Having said that, you also have other knobs when you consider caching—perhaps you've decided to let a user's browser HTTP cache hold onto your site for a long time so that no network requests are required to serve it at all. Or you've constructed a service worker that will serve a site entirely offline before checking if it's up-to-date. 
This is an extreme option, which is valid—and used for many offline-first app-like web experiences—but the web doesn't need to be at a cache-only extreme, or even a completely network-only extreme. Background # As web developers, we're all accustomed to the idea of having a "stale cache". But we know, almost instinctively, the tools available to solve this: do a "hard refresh", or open an incognito window, or use some combination of your browser's developer tools to clear a site's data. Regular users out there on the internet don't have that same luxury. So while we have some core goals of ensuring our users have a great time with their 2nd load, it's also really important to make sure they don't have a bad time or get stuck. (Check out the video if you'd like to hear me talk about how we nearly got the web.dev/live site stuck!) For a bit of background, a really common reason for "stale cache" is actually the 1999-era default for caching. It relies on the Last-Modified header: Assets generated at different times (in gray) will be cached for different times, so a 2nd load can get a combination of cached and fresh assets Every file you load is kept for an additional 10% of its current lifetime, as your browser sees it. For example, if index.html was created a month ago, it'll be cached by your browser for about another three days. This was a well-intentioned idea back in the day, but given the tightly integrated nature of today's websites this default behavior means it's possible to get into a state where a user has files designed for different releases of your website (e.g., the JS from Tuesday's release, and the CSS from Friday's release), all because those files were not updated at exactly the same time. The well-lit path # A modern default for caching is to actually do no caching at all, and use CDNs to bring your content close to your users. Every time a user loads your site, they'll go to the network to see whether it's up-to-date. 
This request will have low latency, as it'll be provided by a CDN geographically close to each end user. You can configure your web host to respond to web requests with this header: Cache-Control: max-age=0,must-revalidate,public This basically says: the file is valid for no time at all, and you must validate it from the network before you can use it again (otherwise it's only "suggested"). Instead of max-age=0,must-revalidate, you could also specify no-cache: this is equivalent. However, no-cache is a confusing name, because it could be interpreted as "never cache this file", even though that's not the case. For some heavy reading, see Cache-Control on MDN. This validation process is relatively cheap in terms of bytes transferred (if a large image file hasn't changed, your browser will receive a small 304 response), but it costs latency as a user must still go to the network to find out. And this is the primary downside of this approach. It can work really well for folks on fast connections in the 1st world, and where your CDN of choice has great coverage, but not for those folks who might be on slower mobile connections or using poor infrastructure. Regardless, this is a modern approach that is the default on a popular CDN, Netlify, but can be configured on nearly any CDN. For Firebase Hosting, you can include this header in the hosting section of your firebase.json file: "headers": [ // Be sure to put this last, to not override other headers { "source": "**", "headers": [ { "key": "Cache-Control", "value": "max-age=0,must-revalidate,public" } ] } ] So while I still suggest this as a sensible default, it's only that: the default! Read on to find out how to step in and upgrade the defaults. Fingerprinted URLs # By including a hash of the file's content in the name of assets, images, and so on served on your site, you can ensure that these files will always have unique content; this will result in files named sitecode.af12de.js for example.
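To illustrate the idea, here is a minimal Node.js sketch that derives a fingerprinted name from a file's content. The fingerprintName helper, the choice of SHA-256, and the six-character hash length are assumptions made for this example; in a real project a bundler would normally produce these names for you.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hypothetical helper: inserts a short content hash before the extension,
// e.g. "sitecode.js" becomes "sitecode.<hash>.js".
function fingerprintName(fileName, content, hashLength = 6) {
  const hash = crypto
    .createHash('sha256')
    .update(content)
    .digest('hex')
    .slice(0, hashLength);
  const { name, ext } = path.parse(fileName);
  return `${name}.${hash}${ext}`;
}

// The same content always maps to the same name; changed content gets a
// new name, so a cached copy under the old URL can never go stale.
const v1 = fingerprintName('sitecode.js', 'console.log("v1");');
const v2 = fingerprintName('sitecode.js', 'console.log("v2");');
console.log(v1, v2);
```

Because the URL changes whenever the content does, the browser never needs to ask whether a fingerprinted file is still current.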
When your server responds to requests for these files, you can safely instruct your end-user's browsers to cache them for a long time by configuring them with this header: Cache-Control: max-age=31536000,immutable This value is a year, in seconds. And according to the spec, this is effectively equal to "forever". Importantly, don't generate these hashes by hand—it's too much manual work! You can use tools like Webpack, Rollup and so on to help you out with this. Be sure to read more about them on Tooling Report. Remember that it's not just JavaScript that can benefit from fingerprinted URLs; assets like icons, CSS and other immutable data files can also be named this way. (And be sure to watch the video above to learn a bit more about code splitting, which lets you ship less code whenever your site changes.) We include the keyword immutable in the Cache-Control recommendation above. Without this keyword, our long Cache-Control is only considered to be a suggestion, and some browsers will still ignore it and go to the server. (In 2017, Chrome changed its behavior, so it always acts as if the immutable keyword is on anyway—so right now, it's only needed for Safari and Firefox). Regardless of how your site approaches caching, these sorts of fingerprinted files are incredibly valuable to any site you might build. Most sites just aren't changing on every release. Of course, we can't rename our 'friendly', user-facing pages this way: renaming your index.html file to index.abcd12.html—that's infeasible, you can't tell users to go to a new URL every time they load your site! These 'friendly' URLs can't be renamed and cached in this way, which leads me on to a possible middle ground. The middle ground # There's obviously room for a middle ground when it comes to caching. I've presented two extreme options; cache never, or cache forever. And there will be a number of files which you might like to cache for a time, such as the "friendly" URLs I mentioned above. 
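The two extremes presented so far can be sketched as a single function that picks a Cache-Control value per URL. The cacheControlFor helper and the fingerprint pattern are assumptions for illustration: the pattern only recognizes names shaped like sitecode.af12de.js and would need adjusting to match whatever your build tool emits.

```javascript
// Matches a dot, six or more hex characters, a dot, and an extension at
// the end of the path. This is a heuristic, not a standard format.
const FINGERPRINT = /\.[0-9a-f]{6,}\.[a-z0-9]+$/i;

// Hypothetical helper: chooses between "cache forever" and "always
// revalidate" using the header values quoted in this post.
function cacheControlFor(urlPath) {
  if (FINGERPRINT.test(urlPath)) {
    // Content can never change under this URL, so cache for a year.
    return 'max-age=31536000,immutable';
  }
  // Everything else falls back to the sensible default.
  return 'max-age=0,must-revalidate,public';
}

console.log(cacheControlFor('/sitecode.af12de.js'));
console.log(cacheControlFor('/index.html'));
```

Anything in the middle ground would become a third branch with its own, shorter max-age.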
If you do want to cache these "friendly" URLs and their HTML, it's worth considering what dependencies they include, how they may be cached, and how caching their URLs for a time might affect you. Let's look at a HTML page which includes an image like this: <img src="/images/foo.jpeg" loading="lazy" /> If you update or change your site by deleting or changing this lazy-loaded image, users who view a cached version of your HTML might get an incorrect or missing image—because they've still cached the original /images/foo.jpeg when they revisit your site. If you're careful, this might not affect you. But broadly it's important to remember that your site—when cached by your end users—no longer just exists on your servers. Rather, it may exist in pieces inside the caches of your end user's browsers. In general, most guides out there on caching will talk about this kind of setting—do you want to cache for an hour, several hours, and so on. To set this kind of cache up, use a header like this (which caches for 3600 seconds, or one hour): Cache-Control: max-age=3600,immutable,public One last point. If you're creating timely content which typically might only be accessed by users once—like news articles!—my opinion is that these should never be cached, and you should use our sensible default above. I think we often overestimate the value of caching over a user's desire to always see the latest and greatest content, such as a critical update on a news story or current event. 
Non-HTML options # Aside from HTML, some other options for files that live in the middle ground include: In general, look for assets that don't affect others For example: avoid CSS, as it causes changes in how your HTML is rendered Large images that are used as part of timely articles Your users probably aren't going to visit any single article more than a handful of times, so don't cache photos or hero images forever and waste storage An asset which represents something that itself has lifetime JSON data about the weather might only be published every hour, so you can cache the previous result for an hour—it won't change in your window Builds of an open-source project might be rate-limited, so cache a build status image until it's possible that the status might change Summary # When users load your site a second time, you've already had a vote of confidence—they want to come back and get more of what you're offering. At this point, it's not always just about bringing that load time down, and you have a bunch of options available to you to ensure that your browser does only the work it needs to deliver both a fast and an up-to-date experience. Caching is not a new concept on the web, but perhaps it needs a sensible default—consider using one and strongly opting-in to better caching strategies when you need them. Thanks for reading! See also # For a general guide on the HTTP cache, check out Prevent unnecessary network requests with the HTTP Cache.

Publish, ship, and install modern JavaScript for faster applications

Over 90% of browsers are capable of running modern JavaScript, but the prevalence of legacy JavaScript remains one of the biggest contributors to performance problems on the web today. EStimator.dev is a simple web-based tool that calculates the size and performance improvement a site could achieve by delivering modern JavaScript syntax. The web today is limited by legacy JavaScript, and no single optimization will improve performance as much as writing, publishing, and shipping your web page or package using ES2017 syntax. Modern JavaScript # Modern JavaScript is not characterized as code written in a specific ECMAScript specification version, but rather in syntax that is supported by all modern browsers. Modern web browsers like Chrome, Edge, Firefox, and Safari make up more than 90% of the browser market, and different browsers that rely on the same underlying rendering engines make up an additional 5%. This means that 95% of global web traffic comes from browsers that support the most widely used JavaScript language features from the past 10 years, including: Classes (ES2015) Arrow functions (ES2015) Generators (ES2015) Block scoping (ES2015) Destructuring (ES2015) Rest and spread parameters (ES2015) Object shorthand (ES2015) Async/await (ES2017) Features in newer versions of the language specification generally have less consistent support across modern browsers. For example, many ES2020 and ES2021 features are only supported in 70% of the browser market—still the majority of browsers, but not enough that it's safe to rely on those features directly. This means that although "modern" JavaScript is a moving target, ES2017 has the widest range of browser compatibility while including most of the commonly used modern syntax features. In other words, ES2017 is the closest to modern syntax today. Legacy JavaScript # Legacy JavaScript is code that specifically avoids using all the above language features. 
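For reference, here is a small, made-up example written entirely in the modern syntax listed above. The Inventory class and its data are hypothetical and exist only to show the features side by side; a legacy build of the same code would rewrite the class, the generator, and the async function into longer ES5 equivalents plus helper code.

```javascript
// Class (ES2015) with a generator and an async method (ES2017).
class Inventory {
  constructor(...items) { // rest parameters
    this.items = items;
  }

  // Generator: yields each item's name.
  *names() {
    for (const { name } of this.items) { // block scoping + destructuring
      yield name;
    }
  }

  // Async/await: sums prices once every lookup has resolved.
  async total() {
    const prices = await Promise.all(
      this.items.map(({ price }) => Promise.resolve(price)) // arrow function
    );
    return prices.reduce((sum, price) => sum + price, 0);
  }
}

const name = 'pen';
const price = 2;
// Object shorthand: { name, price } instead of { name: name, price: price }.
const inventory = new Inventory({ name, price }, { name: 'ink', price: 5 });

inventory.total().then((total) => {
  console.log([...inventory.names()], total); // spread syntax
});
```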
Most developers write their source code using modern syntax, but compile everything to legacy syntax for increased browser support. Compiling to legacy syntax does increase browser support, however the effect is often smaller than we realize. In many cases the support increases from around 95% to 98% while incurring a significant cost: Legacy JavaScript is typically around 20% larger and slower than equivalent modern code. Tooling deficiencies and misconfiguration often widen this gap even further. Installed libraries account for as much as 90% of typical production JavaScript code. Library code incurs an even higher legacy JavaScript overhead due to polyfill and helper duplication that could be avoided by publishing modern code. Modern JavaScript on npm # Recently, Node.js has standardized an "exports" field to define entry points for a package: { "exports": "./index.js" } Modules referenced by the "exports" field imply a Node version of at least 12.8, which supports ES2019. This means that any module referenced using the "exports" field can be written in modern JavaScript. Package consumers must assume modules with an "exports" field contain modern code and transpile if necessary. Modern-only # If you want to publish a package with modern code and leave it up to the consumer to handle transpiling it when they use it as a dependency—use only the "exports" field. { "name": "foo", "exports": "./modern.js" } Caution: This approach is not recommended. In a perfect world, every developer would have already configured their build system to transpile all dependencies (node_modules) to their required syntax. However, this is not currently the case, and publishing your package using only modern syntax would prevent its usage in applications that would be accessed through legacy browsers. Modern with legacy fallback # Use the "exports" field along with "main" in order to publish your package using modern code but also include an ES5 + CommonJS fallback for legacy browsers. 
{ "name": "foo", "exports": "./modern.js", "main": "./legacy.cjs" } Modern with legacy fallback and ESM bundler optimizations # In addition to defining a fallback CommonJS entrypoint, the "module" field can be used to point to a similar legacy fallback bundle, but one that uses JavaScript module syntax (import and export). { "name": "foo", "exports": "./modern.js", "main": "./legacy.cjs", "module": "./module.js" } Many bundlers, such as webpack and Rollup, rely on this field to take advantage of module features and enable tree shaking. This is still a legacy bundle that does not contain any modern code aside from import/export syntax, so use this approach to ship modern code with a legacy fallback that is still optimized for bundling. Modern JavaScript in applications # Third-party dependencies make up the vast majority of typical production JavaScript code in web applications. While npm dependencies have historically been published as legacy ES5 syntax, this is no longer a safe assumption and risks dependency updates breaking browser support in your application. With an increasing number of npm packages moving to modern JavaScript, it's important to ensure that the build tooling is set up to handle them. There's a good chance some of the npm packages you depend on are already using modern language features. There are a number of options available to use modern code from npm without breaking your application in older browsers, but the general idea is to have the build system transpile dependencies to the same syntax target as your source code. webpack # As of webpack 5, it is now possible to configure what syntax webpack will use when generating code for bundles and modules. This doesn't transpile your code or dependencies, it only affects the "glue" code generated by webpack. 
To specify the browser support target, add a browserslist configuration to your project, or do it directly in your webpack configuration: module.exports = { target: ['web', 'es2017'], }; It is also possible to configure webpack to generate optimized bundles that omit unnecessary wrapper functions when targeting a modern ES Modules environment. This also configures webpack to load code-split bundles using <script type="module">. module.exports = { target: ['web', 'es2017'], output: { module: true, }, experiments: { outputModule: true, }, }; There are a number of webpack plugins available that make it possible to compile and ship modern JavaScript while still supporting legacy browsers, such as Optimize Plugin and BabelEsmPlugin. Optimize Plugin # Optimize Plugin is a webpack plugin that transforms final bundled code from modern to legacy JavaScript instead of each individual source file. It's a self-contained setup that allows your webpack configuration to assume everything is modern JavaScript with no special branching for multiple outputs or syntaxes. Since Optimize Plugin operates on bundles instead of individual modules, it processes your application's code and your dependencies equally. This makes it safe to use modern JavaScript dependencies from npm, because their code will be bundled and transpiled to the correct syntax. It can also be faster than traditional solutions involving two compilation steps, while still generating separate bundles for modern and legacy browsers. The two sets of bundles are designed to be loaded using the module/nomodule pattern. // webpack.config.js const OptimizePlugin = require('optimize-plugin'); module.exports = { // ... plugins: [new OptimizePlugin()], }; Optimize Plugin can be faster and more efficient than custom webpack configurations, which typically bundle modern and legacy code separately. 
It also handles running Babel for you, and minifies bundles using Terser with separate optimal settings for the modern and legacy outputs. Finally, polyfills needed by the generated legacy bundles are extracted into a dedicated script so they are never duplicated or unnecessarily loaded in newer browsers. BabelEsmPlugin # BabelEsmPlugin is a webpack plugin that works along with @babel/preset-env to generate modern versions of existing bundles to ship less transpiled code to modern browsers. It is the most popular off-the-shelf solution for module/nomodule, used by Next.js and Preact CLI. // webpack.config.js const BabelEsmPlugin = require('babel-esm-plugin'); module.exports = { //... module: { rules: [ // your existing babel-loader configuration: { test: /\.js$/, exclude: /node_modules/, use: { loader: 'babel-loader', options: { presets: ['@babel/preset-env'], }, }, }, ], }, plugins: [new BabelEsmPlugin()], }; BabelEsmPlugin supports a wide array of webpack configurations, because it runs two largely separate builds of your application. Compiling twice can take a little bit of extra time for large applications, however this technique allows BabelEsmPlugin to integrate seamlessly into existing webpack configurations and makes it one of the most convenient options available. Configure babel-loader to transpile node_modules # If you are using babel-loader without one of the previous two plugins, there's an important step required in order to consume modern JavaScript npm modules. Defining two separate babel-loader configurations makes it possible to automatically compile modern language features found in node_modules to ES2017, while still transpiling your own first-party code with the Babel plugins and presets defined in your project's configuration. This doesn't generate modern and legacy bundles for a module/nomodule setup, but it does make it possible to install and use npm packages that contain modern JavaScript without breaking older browsers. 
webpack-plugin-modern-npm uses this technique to compile npm dependencies that have an "exports" field in their package.json, since these may contain modern syntax: // webpack.config.js const ModernNpmPlugin = require('webpack-plugin-modern-npm'); module.exports = { plugins: [ // auto-transpile modern stuff found in node_modules new ModernNpmPlugin(), ], }; Alternatively, you can implement the technique manually in your webpack configuration by checking for an "exports" field in the package.json of modules as they are resolved. Omitting caching for brevity, a custom implementation might look like this: // webpack.config.js module.exports = { module: { rules: [ // Transpile for your own first-party code: { test: /\.js$/i, loader: 'babel-loader', exclude: /node_modules/, }, // Transpile modern dependencies: { test: /\.js$/i, include(file) { let dir = file.match(/^.*[/\\]node_modules[/\\](@.*?[/\\])?.*?[/\\]/); try { return dir && !!require(dir[0] + 'package.json').exports; } catch (e) {} }, use: { loader: 'babel-loader', options: { babelrc: false, configFile: false, presets: ['@babel/preset-env'], }, }, }, ], }, }; When using this approach, you'll need to ensure modern syntax is supported by your minifier. Both Terser and uglify-es have an option to specify {ecma: 2017} in order to preserve and in some cases generate ES2017 syntax during compression and formatting. Rollup # Rollup has built-in support for generating multiple sets of bundles as part of a single build, and generates modern code by default. As a result, Rollup can be configured to generate modern and legacy bundles with the official plugins you're likely already using. @rollup/plugin-babel # If you use Rollup, the getBabelOutputPlugin() method (provided by Rollup's official Babel plugin) transforms the code in generated bundles rather than individual source modules. Rollup has built-in support for generating multiple sets of bundles as part of a single build, each with their own plugins. 
You can use this to produce different bundles for modern and legacy by passing each through a different Babel output plugin configuration: // rollup.config.js import {getBabelOutputPlugin} from '@rollup/plugin-babel'; export default { input: 'src/index.js', output: [ // modern bundles: { format: 'es', plugins: [ getBabelOutputPlugin({ presets: [ [ '@babel/preset-env', { targets: {esmodules: true}, bugfixes: true, loose: true, }, ], ], }), ], }, // legacy (ES5) bundles: { format: 'amd', entryFileNames: '[name].legacy.js', chunkFileNames: '[name]-[hash].legacy.js', plugins: [ getBabelOutputPlugin({ presets: ['@babel/preset-env'], }), ], }, ], }; Additional build tools # Rollup and webpack are highly configurable, which generally means each project must update its configuration to enable modern JavaScript syntax in dependencies. There are also higher-level build tools that favor convention and defaults over configuration, like Parcel, Snowpack, Vite and WMR. Most of these tools assume npm dependencies may contain modern syntax, and will transpile them to the appropriate syntax level(s) when building for production. In addition to dedicated plugins for webpack and Rollup, modern JavaScript bundles with legacy fallbacks can be added to any project using devolution. Devolution is a standalone tool that transforms the output from a build system to produce legacy JavaScript variants, allowing bundling and transformations to assume a modern output target. Conclusion # EStimator.dev was built to provide an easy way to assess how much of an impact it can make to switch to modern-capable JavaScript code for the majority of your users. Today, ES2017 is the closest to modern syntax and tools such as npm, Babel, webpack, and Rollup have made it possible to configure your build system and write your packages using this syntax. This post covers several approaches, and you should use the easiest option that works for your use case.

Cross-browser paint worklets and Houdini.how

CSS Houdini is an umbrella term that describes a series of low-level browser APIs that give developers much more control and power over the styles they write. Houdini enables more semantic CSS with the Typed Object Model. Developers can define advanced CSS custom properties with syntax, default values, and inheritance through the Properties and Values API. It also introduces paint, layout, and animation worklets, which open up a world of possibilities by making it easier for authors to hook into the styling and layout process of the browser's rendering engine. Understanding Houdini worklets # Houdini worklets are browser instructions that run off the main thread and can be called when needed. Worklets enable you to write modular CSS to accomplish specific tasks, and require a single line of JavaScript to import and register. Much like service workers for CSS style, Houdini worklets are registered to your application, and once registered can be used in your CSS by name. The workflow has three steps: write the worklet file, register the worklet module (CSS.paintWorklet.addModule(workletURL)), and use the worklet (background: paint(confetti)). Implementing your own features with the CSS Painting API # The CSS Painting API is an example of such a worklet (the Paint worklet), and enables developers to define canvas-like custom painting functions that can be used directly in CSS as backgrounds, borders, masks, and more. There is a whole world of possibilities for how you can use CSS Paint in your own user interfaces. For example, instead of waiting for a browser to implement an angled borders feature, you can write your own Paint worklet, or use an existing published worklet. Then, rather than using border-radius, apply this worklet to borders and clipping. The example above uses the same paint worklet with different arguments (see code below) to accomplish this result. Demo on Glitch.
.angled { --corner-radius: 15 0 0 0; --paint-color: #6200ee; --stroke-weight: 0; /* Mask every angled button with fill mode */ -webkit-mask: paint(angled-corners, filled); } .outline { --stroke-weight: 1; /* Paint outline */ border-image: paint(angled-corners, outlined) 0 fill !important; } The CSS Painting API is currently one of the best-supported Houdini APIs, its spec being a W3C candidate recommendation. It is currently enabled in all Chromium-based browsers, partially supported in Safari, and is under consideration for Firefox. But even without full browser support, you can still get creative with the Houdini Paint API and see your styles work across all modern browsers with the CSS Paint Polyfill. And to showcase a few unique implementations, as well as to provide a resource and worklet library, my team built houdini.how. Houdini.how # Screenshot from the Houdini.how homepage. Houdini.how is a library and reference for Houdini worklets and resources. It provides everything you need to know about CSS Houdini: browser support, an overview of its various APIs, usage information, additional resources, and live paint worklet samples. Each sample on Houdini.how is backed by the CSS Paint API, meaning they each work on all modern browsers. Give it a whirl! Using Houdini # Houdini worklets must either be run via a server locally, or on HTTPS in production. In order to work with a Houdini worklet, you will need to either install it locally or use a content delivery network (CDN) like unpkg to serve the files. You will then need to register the worklet locally. There are a few ways to include the Houdini.how showcase worklets in your own web projects. They can either be used via a CDN in a prototyping capacity, or you can manage the worklets on your own using npm modules. Either way, you'll want to also include the CSS Paint Polyfill to ensure they are cross-browser compatible. 1.
Prototyping with a CDN # When registering from unpkg, you can link directly to the worklet.js file without needing to locally install the worklet. Unpkg will resolve to the worklet.js as the main script, or you can specify it yourself. Unpkg will not cause CORS issues, as it is served over HTTPS. CSS.paintWorklet.addModule("https://unpkg.com/<package-name>"); Note that this does not register the custom properties for syntax and fallback values. Instead, they each have fallback values embedded into the worklet. To optionally register the custom properties, include the properties.js file as well. <script src="https://unpkg.com/<package-name>/properties.js"></script> To include the CSS Paint Polyfill with unpkg: <script src="https://unpkg.com/css-paint-polyfill"></script> 2. Managing worklets via NPM # Install your worklet from npm: npm install <package-name> npm install css-paint-polyfill Importing this package does not automatically inject the paint worklet. To install the worklet, you'll need to generate a URL that resolves to the package's worklet.js, and register that. You do so with: CSS.paintWorklet.addModule(..file-path/worklet.js) The npm package name and link can be found on each worklet card. You will also need to include the CSS Paint Polyfill via script or import it directly, as you would with a framework or bundler. Here is an example of how to use Houdini with the paint polyfill in modern bundlers: import 'css-paint-polyfill'; import '<package-name>/properties.js'; // optionally register properties import workletURL from 'url:<package-name>/worklet.js'; CSS.paintWorklet.addModule(workletURL); For more specific instruction per-bundler, check out the usage page on Houdini.how. Contribute # Now that you've played around with some Houdini samples, it's your turn to contribute your own! Houdini.how does not host any worklets itself, and instead showcases the work of the community. 
If you have a worklet or resource you would like to submit, check out the github repo with contribution guidelines. We'd love to see what you come up with!
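If you are starting a worklet of your own, the file you write in the first step of the workflow might look roughly like the sketch below. The `diagonal-stripes` name, the `--stripe-color` property, and the fallback color are all assumptions made up for this example; only `registerPaint()`, `inputProperties`, and the `paint(ctx, size, props)` signature come from the CSS Painting API.

```javascript
// worklet.js: a hypothetical "diagonal-stripes" paint worklet.
class DiagonalStripes {
  // Custom properties this painter reads from the element's style.
  static get inputProperties() {
    return ['--stripe-color'];
  }

  // ctx is a canvas-like 2D context, size is the area to paint, and
  // props exposes the computed values of the inputProperties.
  paint(ctx, size, props) {
    // Fall back to a default color when the property is not set.
    const color = String(props.get('--stripe-color')).trim() || '#6200ee';
    const gap = 12; // distance between stripes, in pixels

    ctx.strokeStyle = color;
    ctx.lineWidth = 2;

    // Draw 45-degree lines, starting left of the box so it is fully covered.
    for (let x = -size.height; x < size.width; x += gap) {
      ctx.beginPath();
      ctx.moveTo(x, 0);
      ctx.lineTo(x + size.height, size.height);
      ctx.stroke();
    }
  }
}

// registerPaint() only exists inside the paint worklet global scope,
// so guard the call to keep the file loadable elsewhere (e.g. in tests).
if (typeof registerPaint === 'function') {
  registerPaint('diagonal-stripes', DiagonalStripes);
}
```

Once a file like this is registered with CSS.paintWorklet.addModule(), it could be used from CSS as background: paint(diagonal-stripes), with --stripe-color set on the element.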

Extending Workbox

In this article, we're going to take a quick tour of some ways of extending Workbox. By the end, you'll be writing your own strategies and plugins, and hopefully sharing them with the world. If you're more of a visual person, you can watch a recording of a Chrome Dev Summit talk covering the same material: What's Workbox? # At its core, Workbox is a set of libraries to help with common service worker caching scenarios. And when we've written about Workbox in the past, the emphasis has been on "common" scenarios. For most developers, the caching strategies that Workbox already provides will handle your caching needs. The built-in strategies include stale-while-revalidate, where a cached response is used to respond to a request immediately, while the cache is also updated so that it's fresh the next time around. They also include network-first, falling back to the cache when the network is unavailable, and a few more. Custom strategies # But what if you wanted to go beyond those common caching scenarios? Let's cover writing your own custom caching strategies. Workbox v6 offers a new Strategy base class that sits in front of lower-level APIs, like Fetch and Cache Storage. You can extend the Strategy base class, and then implement your own logic in the _handle() method. Handle simultaneous, duplicate requests with DedupeNetworkFirst # For instance, imagine that you want to implement a strategy that can handle multiple, simultaneous requests for the same URL by deduplicating them. A copy of the response is then used to fulfill all of the in-flight requests, saving bandwidth that would otherwise be wasted. 
Here's the code you can use to implement that, by extending the NetworkFirst strategy (which itself extends the Strategy base): // See https://developers.google.com/web/tools/workbox/guides/using-bundlers import {NetworkFirst} from 'workbox-strategies'; class DedupeNetworkFirst extends NetworkFirst { constructor(options) { super(options); // This maps inflight requests to response promises. this._requests = new Map(); } // _handle is the standard entry point for our logic. async _handle(request, handler) { let responsePromise = this._requests.get(request.url); if (responsePromise) { // If there's already an inflight request, return a copy // of the eventual response. const response = await responsePromise; return response.clone(); } else { // If there isn't already an inflight request, then use // the _handle() method of NetworkFirst to kick one off. responsePromise = super._handle(request, handler); this._requests.set(request.url, responsePromise); try { const response = await responsePromise; return response.clone(); } finally { // Make sure to clean up after a batch of inflight // requests are fulfilled! this._requests.delete(request.url); } } } } This code assumes that all requests for the same URL can be satisfied with the same response, which won't always be the case if cookies or session state information comes into play. Create a race between the cache and network with CacheNetworkRace # Here's another example of a custom strategy—one that's a twist on stale-while-revalidate, where both the network and cache are checked at the same time, with a race to see which will return a response first. // See https://developers.google.com/web/tools/workbox/guides/using-bundlers import {Strategy} from 'workbox-strategies'; // Instead of extending an existing strategy, // this extends the generic Strategy base class. class CacheNetworkRace extends Strategy { // _handle is the standard entry point for our logic. 
_handle(request, handler) { // handler is an instance of the StrategyHandler class, // and exposes helper methods for interacting with the // cache and network. const fetchDone = handler.fetchAndCachePut(request); const matchDone = handler.cacheMatch(request); // The actual response generation logic relies on a "race" // between the network and cache promises. return new Promise((resolve, reject) => { fetchDone.then(resolve); matchDone.then((response) => response && resolve(response)); // Promise.allSettled() is implemented in recent browsers. Promise.allSettled([fetchDone, matchDone]).then(results => { if (results[0].status === 'rejected' && !results[1].value) { reject(results[0].reason); } }); }); } } StrategyHandler: the recommended approach for creating custom strategies # Although it's not required, it's strongly recommended that when interacting with the network or cache, you use the instance of the StrategyHandler class that's passed to your _handle() method. It's the second parameter, called handler in the example code. This StrategyHandler instance will automatically pick up the cache name you've configured for the strategy, and calling its methods will invoke the expected plugin lifecycle callbacks that we'll describe soon. A StrategyHandler instance supports the following methods: Method Purpose fetch Calls fetch(), invokes lifecycle events. cachePut Calls cache.put() on the configured cache, invokes lifecycle events. cacheMatch Calls cache.match() on the configured cache, invokes lifecycle events. fetchAndCachePut Calls fetch() and then cache.put() on the configured cache, invokes lifecycle events. Drop-in support for routing # Writing a Workbox strategy class is a great way to package up response logic in a reusable, and shareable, form. But once you've written one, how do you use it within your larger Workbox service worker?
That's the best part—you can drop any of these strategies directly into your existing Workbox routing rules, just like any of the "official" strategies. // See https://developers.google.com/web/tools/workbox/guides/using-bundlers import {ExpirationPlugin} from 'workbox-expiration'; import {registerRoute} from 'workbox-routing'; // DedupeNetworkFirst can be defined inline, or imported. registerRoute( ({url}) => url.pathname.startsWith('/api'), // DedupeNetworkFirst supports the standard strategy // configuration options, like cacheName and plugins. new DedupeNetworkFirst({ cacheName: 'my-cache', plugins: [ new ExpirationPlugin({...}), ] }) ); A properly written strategy should automatically work with all plugins as well. This applies to the standard plugins that Workbox provides, like the one that handles cache expiration. But you're not limited to using the standard set of plugins! Another great way to extend Workbox is to write your own reusable plugins. Custom plugins # Taking a step back, what is a Workbox plugin, and why would you write your own? A plugin doesn't fundamentally change the order of network and cache operations performed by a strategy. Instead, it allows you to add in extra code that will be run at critical points in the lifetime of a request, like when a network request fails, or when a cached response is about to be returned to the page. Lifecycle event overview # Here's an overview of all the events that a plugin could listen to. Technical details about implementing callbacks for these events is in the Workbox documentation. Lifecycle Event Purpose cacheWillUpdate Change response before it's written to cache. cacheDidUpdate Do something following a cache write. cacheKeyWillBeUsed Override the cache key used for reads or writes. cachedResponseWillBeUsed Change response read from cache before it's used. requestWillFetch Change request before it's sent to the network. fetchDidFail Do something when a network request fails. 
fetchDidSucceed Do something when a network request succeeds. handlerWillStart Take note of when a handler starts up. handlerWillRespond Take note of when a handler is about to respond. handlerDidRespond Take note of when a handler finishes responding. handlerDidComplete Take note of when a handler has run all its code. handlerDidError Provide a fallback response if a handler throws an error. When writing your own plugin, you'll only implement callbacks for the limited number of events that match your purpose—there's no need to add in callbacks for all of the possible events. Additionally, it's up to you whether you implement your plugin as an Object with properties that match the lifecycle event names, or as a class that exposes methods with those names. Lifecycle events example: FallbackOnErrorPlugin # For instance, here's a custom plugin class that implements callback methods for two events: fetchDidSucceed, and handlerDidError. class FallbackOnErrorPlugin { constructor(fallbackURL) { // Pass in a URL that you know is cached. this.fallbackURL = fallbackURL; } fetchDidSucceed({response}) { // If the network request returned a 2xx response, // just use it as-is. if (response.ok) { return response; }; // Otherwise, throw an error to trigger handlerDidError. throw new Error(`Error response (${response.status})`); } // Invoked whenever the strategy throws an error during handling. handlerDidError() { // This will match the cached URL regardless of whether // there's any query parameters, i.e. those added // by Workbox precaching. return caches.match(this.fallbackURL, { ignoreSearch: true, }); } } This plugin class provides a "fallback" whenever a strategy would otherwise generate an error response. It can be added to any strategy class, and if running that strategy does not result in a 2xx OK response, it will use a backup response from the cache instead. Custom strategy or custom plugin? 
# Now that you know more about custom strategies and plugins, you might be wondering which one to write for a given use case. A good rule of thumb is to sketch out a diagram of your desired request and response flow, taking into account the network and cache interactions. Then, compare that to the diagrams of the built-in strategies. If your diagram has a set of connections that's fundamentally different, that's a sign that a custom strategy is the best solution. Conversely, if your diagram ends up looking mostly like a standard strategy but with a few extra pieces of logic injected at key points, then you should probably write a custom plugin. Takeaways # Whichever approach to customizing Workbox you go with, I hope this article has inspired you to write your own strategies and plugins, and then release them on npm, tagged with workbox-strategy or workbox-plugin. Using those tags, you can search npm for strategies and plugins that have already been released. Go out there and extend Workbox, and then share what you build!
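As noted earlier, a plugin can also be a plain object whose property names match the lifecycle events. The sketch below illustrates that form; the `onlyCacheOkResponses` name and its caching policy are assumptions invented for this example, not something shipped with Workbox.

```javascript
// A hypothetical plugin written as a plain object instead of a class.
const onlyCacheOkResponses = {
  // Runs before a response is written to the cache.
  // Returning null tells Workbox not to cache this response.
  cacheWillUpdate: async ({response}) => {
    return response && response.status === 200 ? response : null;
  },

  // Runs after a cache write; here it only logs what was updated.
  cacheDidUpdate: async ({cacheName, request}) => {
    console.log(`Updated ${cacheName} for ${request.url}`);
  },
};
```

An object like this would be passed in the plugins array of a strategy, in the same place as the ExpirationPlugin in the routing example above.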

Announcing Squoosh v2

Squoosh is an image compression app our team built and debuted at Chrome Dev Summit 2018. We built it to make it easy to experiment with different image codecs, and to showcase the capabilities of the modern web. Today, we are releasing a major update to the app with support for more codecs, a new design, and a new way to use Squoosh on your command line called Squoosh CLI. New codecs support # We now support OxiPNG, MozJPEG, WebP, and AVIF, in addition to codecs natively supported in your browser. Adding new codecs was again made possible by WebAssembly. By compiling a codec encoder and decoder as a WebAssembly module, users can access and experiment with newer codecs even if their preferred browser does not support them. Launching a command line Squoosh! # Ever since the original launch in 2018, a common user request has been to interact with Squoosh programmatically without a UI. We felt a bit conflicted about this path since our app was a UI on top of command-line-based codec tools. However, we do understand the desire to interact with the whole package of codecs instead of multiple tools. Squoosh CLI does just that. You can install the beta version of the Squoosh CLI by running npm i -g @squoosh/cli or run it directly using npx @squoosh/cli [parameters]. The Squoosh CLI is written in Node and makes use of the exact same WebAssembly modules the PWA uses. Through extensive use of workers, all images are decoded, processed and encoded in parallel. We also use Rollup to bundle everything into one JavaScript file to make sure installation via npx is quick and seamless. The CLI also offers auto compression, where it tries to reduce the quality of an image as much as possible without degrading the visual fidelity (using the Butteraugli metric). With the Squoosh CLI you can compress the images in your web app to multiple formats and use the <picture> element to let the browser choose the best version.
We also plan to build plugins for Webpack, Rollup, and other build tools to make image compression an automatic part of your build process. Build process change from Webpack to Rollup # The same team that built Squoosh has spent a significant amount of time looking at build tooling this year for Tooling Report, and decided to switch our build process from Webpack to Rollup. The project was initially started with Webpack because we wanted to try it as a team, and at the time in 2018 Webpack was the only tool that gave us enough control to set up the project the way we wanted. Over time, we found that Rollup's easy plugin system and simplicity with ESM made it a natural choice for this project. Updated UI design # We've also updated the UI design of the app, featuring blobs as a visual element. It is a little pun on how we treat data in our code. Squoosh passes image data around as a blob, so it felt natural to include some blobs in the design (get it?). Color usage was refined as well, so that color is not just an accent but also a way to distinguish and reinforce which image is in context for the options. All in all, the homepage is a bit more vibrant and the tool itself is a bit more clear and concise. What's next? # We plan to keep working on Squoosh. As new image formats are released, we want our users to have a place where they can play with the codecs without heavy lifting. We also hope to expand use of Squoosh CLI and integrate it more into the build process of a web application. Squoosh has always been open source but we've never focused on growing the community. In 2021, we plan to expand our contributor base and have a better onboarding process to the project. Do you have any ideas for Squoosh? Please let us know on our issue tracker. The team is headed for an extended winter vacation but we promise to get back to you in the new year.

SMS OTP form best practices

Asking a user to provide the OTP (one time password) delivered via SMS is a common way to confirm a user's phone number. There are a few use cases for SMS OTP: Two-factor authentication. In addition to username and password, SMS OTP can be used as a strong signal that the account is owned by the person who received the SMS OTP. Phone number verification. Some services use a phone number as the user's primary identifier. In such services, users can enter their phone number and the OTP received via SMS to prove their identity. Sometimes it's combined with a PIN to constitute a two-factor authentication. Account recovery. When a user loses access to their account, there needs to be a way to recover it. Sending an email to their registered email address or an SMS OTP to their phone number are common account recovery methods. Payment confirmation. In payment systems, some banks or credit card issuers request additional authentication from the payer for security reasons. SMS OTP is commonly used for that purpose. This post explains best practices to build an SMS OTP form for the above use cases. Caution: While this post discusses SMS OTP form best practices, be aware that SMS OTP is not the most secure method of authentication by itself because phone numbers can be recycled and sometimes hijacked. And the concept of OTP itself is not phishing resistant. If you are looking for better security, consider using WebAuthn. Learn more about it from the talk "What's new in sign-up & sign-in" at the Chrome Dev Summit 2019 and build a reauthentication experience using a biometric sensor with the "Build your first WebAuthn app" codelab. Checklist # To provide the best user experience with the SMS OTP, follow these steps: Use the <input> element with: type="text" inputmode="numeric" autocomplete="one-time-code" Use @BOUND_DOMAIN #OTP_CODE as the last line of the OTP SMS message. Use the WebOTP API.
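On the sending side, the second checklist item could be handled by a small server-side helper along these lines. This is only a sketch: the `formatOtpSms` name, its parameters, the default message text, and the blank line before the last line are all assumptions made for illustration.

```javascript
// Hypothetical helper: builds an SMS body whose last line follows the
// "@BOUND_DOMAIN #OTP_CODE" convention from the checklist.
function formatOtpSms(domain, code, text = `Your OTP is ${code}`) {
  // The bound line is "@domain #code", so neither part may contain
  // whitespace or be empty.
  if (!domain || /\s/.test(domain)) {
    throw new TypeError('domain must be a non-empty string without whitespace');
  }
  if (!code || /\s/.test(code)) {
    throw new TypeError('code must be a non-empty string without whitespace');
  }
  // Human-readable text first, then the bound line as the last line.
  return `${text}\n\n@${domain} #${code}`;
}
```

Calling formatOtpSms('web-otp.glitch.me', '123456') produces a message that ends with the line @web-otp.glitch.me #123456.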
Use the <input> element # Using a form with an <input> element is the most important best practice you can follow because it works in all browsers. Even if other suggestions from this post don't work in some browser, the user will still be able to enter and submit the OTP manually. <form action="/verify-otp" method="POST"> <input type="text" inputmode="numeric" autocomplete="one-time-code" pattern="\d{6}" required> </form> The following are a few ideas to ensure an input field gets the best out of browser functionality. type="text" # Since OTPs are usually five or six digit numbers, using type="number" for an input field might seem intuitive because it changes the mobile keyboard to numbers only. This is not recommended because the browser expects an input field to be a countable number rather than a sequence of multiple numbers, which can cause unexpected behavior. Using type="number" causes up and down buttons to be displayed beside the input field; pressing these buttons increments or decrements the number and may remove preceding zeros. Use type="text" instead. This won't turn the mobile keyboard into numbers only, but that is fine because the next tip for using inputmode="numeric" does that job. inputmode="numeric" # Use inputmode="numeric" to change the mobile keyboard to numbers only. Some websites use type="tel" for OTP input fields since it also turns the mobile keyboard to numbers only (including * and #) when focused. This hack was used in the past when inputmode="numeric" wasn't widely supported. Since Firefox started supporting inputmode="numeric", there's no need to use the semantically incorrect type="tel" hack. autocomplete="one-time-code" # autocomplete attribute lets developers specify what permission the browser has to provide autocomplete assistance and informs the browser about the type of information expected in the field. 
With autocomplete="one-time-code" whenever a user receives an SMS message while a form is open, the operating system will parse the OTP in the SMS heuristically and the keyboard will suggest the OTP for the user to enter. It works only on Safari 12 and later on iOS, iPadOS, and macOS, but we strongly recommend using it, because it is an easy way to improve the SMS OTP experience on those platforms. autocomplete="one-time-code" improves the user experience, but there's more you can do by ensuring that the SMS message complies with the origin-bound message format. Optional attributes: pattern specifies the format that the entered OTP must match. Use regular expressions to specify the matching pattern, for example, \d{6} constrains the OTP to a six digit string. Learn more about the pattern attribute in Use JavaScript for more complex real-time validation (https://developers.google.com/web/fundamentals/design-and-ux/input/forms#use_javascript_for_more_complex_real-time_validation). required indicates that a field is required. For more general form best practices, Sam Dutton's Sign-in form best practices is a great starting point. Format the SMS text # Enhance the user experience of entering an OTP by aligning with the origin-bound one-time codes delivered via SMS specification. The format rule is simple: Finish the SMS message with the receiver domain preceded by @ and the OTP preceded by #. For example: Your OTP is 123456 @web-otp.glitch.me #123456 Using a standard format for OTP messages makes extraction of codes from them easier and more reliable. Associating OTP codes with websites makes it harder to trick users into providing a code to malicious sites. The precise rules are: The message begins with (optional) human-readable text that contains a four to ten character alphanumeric string with at least one number, leaving the last line for the URL and the OTP. The domain part of the URL of the website that invoked the API must be preceded by @.
The URL must contain a pound sign ("#") followed by the OTP. Make sure the number of characters doesn't exceed 140 in total. To learn more about Chrome-specific rules, read the Format the SMS message section of the WebOTP API post. Using this format provides a couple of benefits: The OTP will be bound to the domain. If the user is on a domain other than the one specified in the SMS message, the OTP suggestion won't appear. This also mitigates the risk of phishing attacks and potential account hijacks. Browsers will be able to reliably extract the OTP without depending on mysterious and flaky heuristics. When a website uses autocomplete="one-time-code", Safari with iOS 14 or later will suggest the OTP following the above rules. If the user is on a desktop with macOS Big Sur with the same iCloud account set up as on iOS, the OTP received on the iOS device will be available on the desktop Safari as well. To learn more about other benefits and nuances of the availability on Apple platforms, read Enhance SMS-delivered code security with domain-bound codes. This SMS message format also benefits browsers other than Safari. Chrome, Opera, and Vivaldi on Android also support the origin-bound one-time codes rule with the WebOTP API, though not through autocomplete="one-time-code". Use the WebOTP API # The WebOTP API provides access to the OTP received in an SMS message. When the website calls navigator.credentials.get() with the otp type (OTPCredential) where transport includes sms, it waits for an SMS that complies with the origin-bound one-time codes format to be delivered and for the user to grant access to it. Once the OTP is passed to JavaScript, the website can use it in a form or POST it directly to the server. Caution: The WebOTP API requires a secure origin (HTTPS). navigator.credentials.get({ otp: {transport:['sms']} }) .then(otp => input.value = otp.code); WebOTP API in action.
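To make the origin-bound message format from the Format the SMS text section more concrete, here is a minimal sketch of building and reading the final `@domain #code` line. The helper names `buildOtpSms`, `parseOtpSms`, and `otpMatchesHost` are hypothetical, and the parser is a simplification for illustration: it does not implement the specification's full grammar and is not how any browser actually extracts codes.

```javascript
// Hypothetical helper: append the origin-bound last line to the human-readable text.
function buildOtpSms(humanText, domain, code) {
  return `${humanText}\n@${domain} #${code}`;
}

// Hypothetical helper: read the domain and code from the last line of a message.
// Returns null when the last line is not of the form "@domain #code".
function parseOtpSms(message) {
  const lines = message.trim().split(/\r?\n/);
  const lastLine = lines[lines.length - 1];
  const match = /^@(\S+) #(\S+)$/.exec(lastLine);
  return match ? { domain: match[1], code: match[2] } : null;
}

// Hypothetical helper: only accept the code when it is bound to the expected host.
function otpMatchesHost(message, hostname) {
  const parsed = parseOtpSms(message);
  return parsed !== null && parsed.domain === hostname;
}

const sms = buildOtpSms('Your OTP is 123456', 'web-otp.glitch.me', '123456');
console.log(parseOtpSms(sms));
console.log(otpMatchesHost(sms, 'example.com'));
```

A server-side sender could use something like `buildOtpSms` when composing messages, while the domain check illustrates why a code bound to one site is not suggested on another.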
Learn how to use the WebOTP API in detail in Verify phone numbers on the web with the WebOTP API or copy and paste the following snippet. (Make sure the <form> element has an action and method attribute properly set.) // Feature detection if ('OTPCredential' in window) { window.addEventListener('DOMContentLoaded', e => { const input = document.querySelector('input[autocomplete="one-time-code"]'); if (!input) return; // Cancel the WebOTP API if the form is submitted manually. const ac = new AbortController(); const form = input.closest('form'); if (form) { form.addEventListener('submit', e => { // Cancel the WebOTP API. ac.abort(); }); } // Invoke the WebOTP API navigator.credentials.get({ otp: { transport:['sms'] }, signal: ac.signal }).then(otp => { input.value = otp.code; // Automatically submit the form when an OTP is obtained. if (form) form.submit(); }).catch(err => { console.log(err); }); }); } Photo by Jason Leung on Unsplash.

Sign-up form best practices

If users ever need to log in to your site, then good sign-up form design is critical. This is especially true for people on poor connections, on mobile, in a hurry, or under stress. Poorly designed sign-up forms get high bounce rates. Each bounce could mean a lost and disgruntled user—not just a missed sign-up opportunity. Try it! If you would prefer to learn these best practices with a hands-on tutorial, check out the Sign-up form best practices codelab. Here is an example of a very simple sign-up form that demonstrates all of the best practices: Caution: This post is about form best practices. It does not explain how to implement sign-up via a third-party identity provider (federated login) or show how to build backend services to authenticate users, store credentials, and manage accounts. Integrating Google Sign-In into your web app explains how to add federated login to your sign-up options. 12 best practices for user account, authorization and password management outlines core back-end principles for managing user accounts. Checklist # Avoid sign-in if you can. Make it obvious how to create an account. Make it obvious how to access account details. Cut form clutter. Consider session length. Help password managers securely suggest and store passwords. Don't allow compromised passwords. Do allow password pasting. Never store or transmit passwords in plain text. Don't force password updates. Make it simple to change or reset passwords. Enable federated login. Make account switching simple. Consider offering multi-factor authentication. Take care with usernames. Test in the field as well as the lab. Test on a range of browsers, devices and platforms. Avoid sign-in if you can # Before you implement a sign-up form and ask users to create an account on your site, consider whether you really need to. Wherever possible you should avoid gating features behind login. The best sign-up form is no sign-up form! 
By asking a user to create an account, you come between them and what they're trying to achieve. You're asking a favor, and asking the user to trust you with personal data. Every password and item of data you store carries privacy and security "data debt", becoming a cost and liability for your site. If the main reason you ask users to create an account is to save information between navigations or browsing sessions, consider using client-side storage instead. For shopping sites, forcing users to create an account to make a purchase is cited as a major reason for shopping cart abandonment. You should make guest checkout the default. Make sign-in obvious # Make it obvious how to create an account on your site, for example with a Login or Sign in button at the top right of the page. Avoid using an ambiguous icon or vague wording ("Get on board!", "Join us") and don't hide login in a navigational menu. The usability expert Steve Krug summed up this approach to website usability: Don't make me think! If you need to convince others on your web team, use analytics to show the impact of different options. Make sign-in obvious. An icon may be ambiguous, but a Sign in button or link is obvious. You may be wondering whether to add a button (or link) to create an account and another one for existing users to sign in. Many popular sites now simply display a single Sign in button. When the user taps or clicks on that, they also get a link to create an account if necessary. That's a common pattern now, and your users are likely to understand it, but you can use interaction analytics to monitor whether or not a single button works best. The Gmail sign-in page has a link to create an account. Sign in link and a Create an account button. Make sure to link accounts for users who sign up via an identity provider such as Google and who also sign up using email and password. 
That's easy to do if you can access a user's email address from the profile data from the identity provider, and match the two accounts. The code below shows how to access email data for a Google Sign-in user. // auth2 is initialized with gapi.auth2.init() if (auth2.isSignedIn.get()) { var profile = auth2.currentUser.get().getBasicProfile(); console.log(`Email: ${profile.getEmail()}`); } Once a user has signed in, make it clear how to access account details. In particular, make it obvious how to change or reset passwords. Cut form clutter # In the sign-up flow, your job is to minimize complexity and keep the user focused. Cut the clutter. This is not the time for distractions and temptations! Don't distract users from completing sign-up. On sign-up, ask for as little as possible. Collect additional user data (such as name and address) only when you need to, and when the user sees a clear benefit from providing that data. Bear in mind that every item of data you communicate and store incurs cost and liability. Don't double up your inputs just to make sure users get their contact details right. That slows down form completion and doesn't make sense if form fields are autofilled. Instead, send a confirmation code to the user once they've entered their contact details, then continue with account creation once they respond. This is a common sign-up pattern: users are used to it. You may want to consider password-free sign-in by sending users a code every time they sign in on a new device or browser. Sites such as Slack and Medium use a version of this. Password-free sign-in on medium.com. As with federated login, this has the added benefit that you don't have to manage user passwords. Consider session length # Whatever approach you take to user identity, you'll need to make a careful decision about session length: how long the user remains logged in, and what might cause you to log them out. 
Consider whether your users are on mobile or desktop, and whether they share devices. You can get around some of the issues of shared devices by enforcing re-authentication for sensitive features, for example when a purchase is made or an account updated. You can find out more about ways to implement re-authentication from the codelab Your First WebAuthn App. Help password managers securely suggest and store passwords # You can help third-party and built-in browser password managers suggest and store passwords, so that users don't need to choose, remember or type passwords themselves. Password managers work well in modern browsers, synchronizing accounts across devices, across platform-specific and web apps, and for new devices. This makes it extremely important to code sign-up forms correctly, in particular to use the correct autocomplete values. For sign-up forms use autocomplete="new-password" for new passwords, and add correct autocomplete values to other form fields wherever possible, such as autocomplete="email" and autocomplete="tel". You can also help password managers by using different name and id values in sign-up and sign-in forms, for the form element itself, as well as any input, select and textarea elements. You should also use the appropriate type attribute to provide the right keyboard on mobile and enable basic built-in validation by the browser. You can find out more from Payment and address form best practices. Sign-in form best practices has lots more tips on how to improve form design, layout and accessibility, and how to code forms correctly in order to take advantage of built-in browser features. Ensure users enter secure passwords # Enabling password managers to suggest passwords is the best option, and you should encourage users to accept the strong passwords suggested by browsers and third-party password managers.
However, many users want to enter their own passwords, so you need to implement rules for password strength. The US National Institute of Standards and Technology explains how to avoid insecure passwords. Warning: Sign-up forms on some sites have password validation rules that don't allow the strong passwords generated by browser and third-party password managers. Make sure your site doesn't do this, since it interrupts form completion, annoys users, and requires users to make up their own passwords, which may be less secure than those generated by password managers. Don't allow compromised passwords # Whatever rules you choose for passwords, you should never allow passwords that have been exposed in security breaches. Once a user has entered a password, you need to check that it's not a password that's already been compromised. The site Have I Been Pwned provides an API for password checking, or you can run this as a service yourself. Google's Password Manager also allows you to check if any of your existing passwords have been compromised. If you do reject the password that a user proposes, tell them specifically why it was rejected. Show problems inline and explain how to fix them, as soon as the user has entered a value—not after they've submitted the sign-up form and had to wait for a response from your server. Be clear why a password is rejected. Don't prohibit password pasting # Some sites don't allow text to be pasted into password inputs. Disallowing password pasting annoys users, encourages passwords that are memorable (and therefore may be easier to compromise) and, according to organizations such as the UK National Cyber Security Centre, may actually reduce security. Users only become aware that pasting is disallowed after they try to paste their password, so disallowing password pasting doesn't avoid clipboard vulnerabilities. 
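As an illustration of the compromised-password check described above, here is a minimal Node.js sketch of the range lookup offered by the Have I Been Pwned Pwned Passwords API (`https://api.pwnedpasswords.com/range/<prefix>`). Only the first five characters of the password's SHA-1 hash are sent; the response lists matching hash suffixes with counts, one `SUFFIX:COUNT` pair per line. The function names `toRangeQuery` and `countInRangeResponse` are hypothetical, and the network request is left out so that the sketch stays self-contained.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: split the uppercase SHA-1 hex digest of a password into
// the 5-character prefix that is sent to the API and the 35-character suffix
// that never leaves your server.
function toRangeQuery(password) {
  const sha1 = createHash('sha1').update(password, 'utf8').digest('hex').toUpperCase();
  return { prefix: sha1.slice(0, 5), suffix: sha1.slice(5) };
}

// Hypothetical helper: look for the suffix in a range response body and return
// how many times the password appeared in breaches (0 if it is not listed).
function countInRangeResponse(body, suffix) {
  for (const line of body.split(/\r?\n/)) {
    const [candidate, count] = line.trim().split(':');
    if (candidate === suffix) return Number(count);
  }
  return 0;
}

const { prefix, suffix } = toRangeQuery('password');
console.log(`Request: https://api.pwnedpasswords.com/range/${prefix}`);
// After fetching that URL, pass the response text to countInRangeResponse(text, suffix)
// and reject the password if the result is greater than zero.
```

Because only the hash prefix is transmitted, the service never sees the full password or its full hash.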
Never store or transmit passwords in plain text # Make sure to salt and hash passwords—and don't try to invent your own hashing algorithm! Don't force password updates # Don't force users to update their passwords arbitrarily. Forcing password updates can be costly for IT departments, annoying to users, and doesn't have much impact on security. It's also likely to encourage people to use insecure memorable passwords, or to keep a physical record of passwords. Rather than force password updates, you should monitor for unusual account activity and warn users. If possible you should also monitor for passwords that become compromised because of data breaches. You should also provide your users with access to their account login history, showing them where and when a login happened. Gmail account activity page. Make it simple to change or reset passwords # Make it obvious to users where and how to update their account password. On some sites, it's surprisingly difficult. You should, of course, also make it simple for users to reset their password if they forget it. The Open Web Application Security Project provides detailed guidance on how to handle lost passwords. To keep your business and your users safe, it's especially important to help users change their password if they discover that it's been compromised. To make this easier, you should add a /.well-known/change-password URL to your site that redirects to your password management page. This enables password managers to navigate your users directly to the page where they can change their password for your site. This feature is now implemented in Safari, Chrome, and is coming to other browsers. Help users change passwords easily by adding a well-known URL for changing passwords explains how to implement this. You should also make it simple for users to delete their account if that's what they want. 
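To make the earlier "salt and hash" advice concrete, here is a minimal sketch using the scrypt key derivation function built into Node.js. The function names `hashPassword` and `verifyPassword`, the 16-byte salt, the 64-byte key length, and the `salt:hash` storage format are all assumptions chosen for illustration; a production system should follow current guidance for parameters and would typically rely on a well-reviewed library.

```javascript
const { randomBytes, scryptSync, timingSafeEqual } = require('node:crypto');

// Hypothetical helper: derive a hash from the password with a fresh random salt.
// The salt is stored next to the hash so that the password can be verified later.
function hashPassword(password) {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Hypothetical helper: re-derive the hash with the stored salt and compare
// the two values in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = scryptSync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return timingSafeEqual(actual, expected);
}

const record = hashPassword('correct horse battery staple');
console.log(verifyPassword('correct horse battery staple', record));
console.log(verifyPassword('wrong password', record));
```

Because every call generates a new salt, two users with the same password end up with different stored values.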
Offer login via third-party identity providers # Many users prefer to log in to websites using an email address and password sign-up form. However, you should also enable users to log in via a third party identity provider, also known as federated login. WordPress login page, with Google and Apple login options. This approach has several advantages. For users who create an account using federated login, you don't need to ask for, communicate, or store passwords. You may also be able to access additional verified profile information from federated login, such as an email address—which means the user doesn't have to enter that data and you don't need to do the verification yourself. Federated login can also make it much easier for users when they get a new device. Integrating Google Sign-In into your web app explains how to add federated login to your sign-up options. Many other identity platforms are available. "First day experience" when you get a new device is increasingly important. Users expect to log in from multiple devices including their phone, laptop, desktop, tablet, TV, or from a car. If your sign-up and sign-in forms aren't seamless, this is a moment where you risk losing users, or at least losing contact with them until they get set up again. You need to make it as quick and easy as possible for users on new devices to get up and running on your site. This is another area where federated login can help. Make account switching simple # Many users share devices and swap between accounts using the same browser. Whether users access federated login or not, you should make account switching simple. Account switching on Gmail. Consider offering multi-factor authentication # Multi-factor authentication means ensuring that users provide authentication in more than one way. 
For example, as well as requiring the user to set a password, you might also enforce verification using a one-time-passcode sent by email or an SMS text message, or by using an app-based one-time code, security key or fingerprint sensor. SMS OTP best practices and Enabling Strong Authentication with WebAuthn explain how to implement multi-factor authentication. You should certainly offer (or enforce) multi-factor authentication if your site handles personal or sensitive information. Take care with usernames # Don't insist on a username unless (or until) you need one. Enable users to sign up and sign in with only an email address (or telephone number) and password—or federated login if they prefer. Don't force them to choose and remember a username. If your site does require usernames, don't impose unreasonable rules on them, and don't stop users from updating their username. On your backend you should generate a unique ID for every user account, not an identifier based on personal data such as username. Also make sure to use autocomplete="username" for usernames. Caution: As with personal names, ensure that usernames aren't restricted to characters from the Latin alphabet. Payment and address form best practices explains how and why to validate using Unicode letter matching. Test on a range of devices, platforms, browsers and versions # Test sign-up forms on the platforms most common for your users. Form element functionality may vary, and differences in viewport size can cause layout problems. BrowserStack enables free testing for open source projects on a range of devices and browsers. Implement analytics and Real User Monitoring # You need field data as well as lab data to understand how users experience your sign-up forms. Analytics and Real User Monitoring (RUM) provide data for the actual experience of your users, such as how long sign-up pages take to load, which UI components users interact with (or not) and how long it takes users to complete sign-up. 
Page analytics: page views, bounce rates and exits for every page in your sign-up flow. Interaction analytics: goal funnels and events indicate where users abandon the sign-up flow and what proportion of users click buttons, links, and other components of your sign-up pages. Website performance: user-centric metrics can tell you if your sign-up flow is slow to load or visually unstable. Small changes can make a big difference to completion rates for sign-up forms. Analytics and RUM enable you to optimize and prioritize changes, and monitor your site for problems that aren't exposed by local testing. Keep learning # Sign-in form best practices Payment and address form best practices Create Amazing Forms Best Practices For Mobile Form Design More capable form controls Creating Accessible Forms Streamlining the Sign-up Flow Using Credential Management API Verify phone numbers on the web with the WebOTP API Photo by @ecowarriorprincess on Unsplash.

Payment and address form best practices

Well-designed forms help users and increase conversion rates. One small fix can make a big difference! Try it! If you prefer to learn these best practices with a hands-on tutorial, check out the two codelabs for this post: Payment form best practices codelab Address form best practices codelab Here is an example of a simple payment form that demonstrates all of the best practices: Here is an example of a simple address form that demonstrates all of the best practices: Checklist # Use meaningful HTML elements: <form>, <input>, <label>, and <button>. Label each form field with a <label>. Use HTML element attributes to access built-in browser features, in particular type and autocomplete with appropriate values. Avoid using type="number" for numbers that aren't meant to be incremented, such as payment card numbers. Use type="text" and inputmode="numeric" instead. If an appropriate autocomplete value is available for an input, select, or textarea, you should use it. To help browsers autofill forms, give input name and id attributes stable values that don't change between page loads or website deployments. Disable submit buttons once they've been tapped or clicked. Validate data during entry—not just on form submission. Make guest checkout the default and make account creation simple once checkout is complete. Show progress through the checkout process in clear steps with clear calls to action. Limit potential checkout exit points by removing clutter and distractions. Show full order details at checkout and make order adjustments easy. Don't ask for data you don't need. Ask for names with a single input unless you have a good reason not to. Don't enforce Latin-only characters for names and usernames. Allow for a variety of address formats. Consider using a single textarea for address. Use autocomplete for billing address. Internationalize and localize where necessary. Consider avoiding postal code address lookup. Use appropriate payment card autocomplete values. 
Use a single input for payment card numbers. Avoid using custom elements if they break the autofill experience. Test in the field as well as the lab: page analytics, interaction analytics, and real-user performance measurement. Test on a range of browsers, devices, and platforms. This article is about frontend best practices for address and payment forms. It does not explain how to implement transactions on your site. To find out more about adding payment functionality to your website, see Web Payments. Use meaningful HTML # Use the elements and attributes built for the job: <form>, <input>, <label>, and <button> type, autocomplete, and inputmode These enable built-in browser functionality, improve accessibility, and add meaning to your markup. Use HTML elements as intended # Put your form in a <form> # You might be tempted not to bother wrapping your <input> elements in a <form>, and to handle data submission purely with JavaScript. Don't do it! An HTML <form> gives you access to a powerful set of built-in features across all modern browsers, and can help make your site accessible to screen readers and other assistive devices. A <form> also makes it simpler to build basic functionality for older browsers with limited JavaScript support, and to enable form submission even if there's a glitch with your code—and for the small number of users who actually disable JavaScript. If you have more than one page component for user input, make sure to put each in its own <form> element. For example, if you have search and sign-up on the same page, put each in its own <form>. Use <label> to label elements # To label an <input>, <select>, or <textarea>, use a <label>. Associate a label with an input by giving the label's for attribute the same value as the input's id. <label for="address-line1">Address line 1</label> <input id="address-line1" …> Use a single label for a single input: don't try to label multiple inputs with only one label. 
This works best for browsers, and best for screenreaders. A tap or click on a label moves focus to the input it's associated with, and screenreaders announce label text when the label or the label's input gets focus. Caution: Don't use placeholders on their own instead of labels. Once you start entering text in an input, the placeholder is hidden, so it can be easy to forget what the input is for. The same is true if you use the placeholder to show the correct format for values such as dates. This can be especially problematic for users on phones, particularly if they're distracted or feeling stressed! Make buttons helpful # Use <button> for buttons! You can also use <input type="submit">, but don't use a div or some other random element acting as a button. Button elements provide accessible behaviour, built-in form submission functionality, and can easily be styled. Give each form submit button a value that says what it does. For each step towards checkout, use a descriptive call-to-action that shows progress and makes the next step obvious. For example, label the submit button on your delivery address form Proceed to Payment rather than Continue or Save. Consider disabling a submit button once the user has tapped or clicked it—especially when the user is making a payment or placing an order. Many users click buttons repeatedly, even if they're working fine. That can mess up checkout and add to server load. On the other hand, don't disable a submit button waiting on complete and valid user input. For example, don't just leave a Save Address button disabled because something is missing or invalid. That doesn't help the user—they may continue to tap or click the button and assume that it's broken. Instead, if users attempt to submit a form with invalid data, explain to them what's gone wrong and what to do to fix it. 
This is particularly important on mobile, where data entry is more difficult and missing or invalid form data may not be visible on the user's screen by the time they attempt to submit a form. Make the most of HTML attributes # Make it easy for users to enter data # Use the appropriate input type attribute to provide the right keyboard on mobile and enable basic built-in validation by the browser. For example, use type="email" for email addresses and type="tel" for phone numbers. Keyboards appropriate for email and telephone. Warning: Using type="number" adds an up/down arrow to increment numbers, which makes no sense for data such as telephone, payment card or account numbers. For numbers like these, set type="text" (or leave off the attribute, since text is the default). For telephone numbers, use type="tel" to get the appropriate keyboard on mobile. For other numbers use inputmode="numeric" to get a numeric keyboard on mobile. Some sites still use type="tel" for payment card numbers to ensure that mobile users get the right keyboard. However, inputmode is very widely supported now, so you shouldn't have to do that—but do check your users' browsers. For dates, try to avoid using custom select elements. They break the autofill experience if not properly implemented and don't work on older browsers. For numbers such as birth year, consider using an input element rather than a select, since entering digits manually can be easier and less error prone than selecting from a long drop-down list—especially on mobile. Use inputmode="numeric" to ensure the right keyboard on mobile and add validation and format hints with text or a placeholder to make sure the user enters data in the appropriate format. The datalist element enables a user to select from a list of available options and provides matching suggestions as the user enters text. Try out datalist for text, range and color inputs at simpl.info/datalist. 
For birth year input, you can compare a select with an input and datalist at datalist-select.glitch.me. Use autocomplete to improve accessibility and help users avoid re-entering data # Using appropriate autocomplete values enables browsers to help users by securely storing data and autofilling input, select, and textarea values. This is particularly important on mobile, and crucial for avoiding high form abandonment rates. Autocomplete also provides multiple accessibility benefits. If an appropriate autocomplete value is available for a form field, you should use it. MDN web docs has a full list of values and explanations of how to use them correctly. As well as using appropriate autocomplete values, help browsers autofill forms by giving form fields name and id attributes stable values that don't change between page loads or website deployments. By default, set the billing address to be the same as the delivery address. Reduce visual clutter by providing a link to edit the billing address (or use summary and details elements) rather than displaying the billing address in a form. Add a link to review billing. Use appropriate autocomplete values for the billing address, just as you do for shipping address, so the user doesn't have to enter data more than once. Add a prefix word to autocomplete attributes if you have different values for inputs with the same name in different sections. <input autocomplete="shipping address-line-1" ...> ... <input autocomplete="billing address-line-1" ...> Help users enter the right data # Try to avoid "telling off" customers because they "did something wrong". Instead, help users complete forms more quickly and easily by helping them fix problems as they happen. Through the checkout process, customers are trying to give your company money for a product or service—your job is to assist them, not to punish them! You can add constraint attributes to form elements to specify acceptable values, including min, max, and pattern. 
The validity state of the element is set automatically depending on whether the element's value is valid, as are the :valid and :invalid CSS pseudo-classes which can be used to style elements with valid or invalid values. For example, the following HTML specifies input for a birth year between 1900 and 2020. Using type="number" constrains input values to numbers only, within the range specified by min and max. If you attempt to enter a number outside the range, the input will be set to have an invalid state. The following example uses pattern="[\d ]{10,30}" to ensure a valid payment card number, while allowing spaces: Modern browsers also do basic validation for inputs with type email or url. Add the multiple attribute to an input with type="email" to enable built-in validation for multiple comma-separated email addresses in a single input. On form submission, browsers automatically set focus on fields with problematic or missing required values. No JavaScript required! Basic built-in validation by the browser. Validate inline and provide feedback to the user as they enter data, rather than providing a list of errors when they click the submit button. If you need to validate data on your server after form submission, list all problems that are found and clearly highlight all form fields with invalid values, as well as displaying a message inline next to each problematic field explaining what needs to be fixed. Check server logs and analytics data for common errors—you may need to redesign your form. You should also use JavaScript to do more robust validation while users are entering data and on form submission. Use the Constraint Validation API (which is widely supported) to add custom validation using built-in browser UI to set focus and display prompts. Find out more in Use JavaScript for more complex real-time validation. 
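As a rough illustration of the payment card rule above, the same check can be mirrored in JavaScript, for example when re-validating a value after submission. This is a minimal sketch rather than part of the original guidance: the names `isPlausibleCardNumber` and `normalizeCardNumber` are hypothetical, and the regular expression reuses the pattern="[\d ]{10,30}" value from the text, anchored at both ends because the pattern attribute always matches the whole value.

```javascript
// Mirrors pattern="[\d ]{10,30}": 10 to 30 characters, digits or spaces only.
// The HTML pattern attribute matches the entire value, hence the ^ and $ anchors.
const CARD_PATTERN = /^[\d ]{10,30}$/;

// Hypothetical helper: true if the raw input has the expected shape.
function isPlausibleCardNumber(input) {
  return CARD_PATTERN.test(input);
}

// Hypothetical helper: removes the spaces users typed, ready for processing.
function normalizeCardNumber(input) {
  return input.replace(/ /g, '');
}
```

A check like this only confirms the shape of the value (digits and spaces within a length range); it does not confirm that the card number is real, so server-side verification is still needed.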
Warning: Even with client-side validation and data input constraints, you must still ensure that your back-end securely handles input and output of data from users. Never trust user input: it could be malicious. Find out more in OWASP Input Validation Cheat Sheet. Help users avoid missing required data # Use the required attribute on inputs for mandatory values. When a form is submitted modern browsers automatically prompt and set focus on required fields with missing data, and you can use the :required pseudo-class to highlight required fields. No JavaScript required! Add an asterisk to the label for every required field, and add a note at the start of the form to explain what the asterisk means. Simplify checkout # Mind the mobile commerce gap! # Imagine that your users have a fatigue budget. Use it up, and your users will leave. You need to reduce friction and maintain focus, especially on mobile. Many sites get more traffic on mobile but more conversions on desktop—a phenomenon known as the mobile commerce gap. Customers may simply prefer to complete a purchase on desktop, but lower mobile conversion rates are also a result of poor user experience. Your job is to minimize lost conversions on mobile and maximize conversions on desktop. Research has shown that there's a huge opportunity to provide a better mobile form experience. Most of all, users are more likely to abandon forms that look long, that are complex, and without a sense of direction. This is especially true when users are on smaller screens, distracted, or in a rush. Ask for as little data as possible. Make guest checkout the default # For an online store, the simplest way to reduce form friction is to make guest checkout the default. Don't force users to create an account before making a purchase. Not allowing guest checkout is cited as a major reason for shopping cart abandonment. From baymard.com/checkout-usability You can offer account sign-up after checkout. 
At that point, you already have most of the data you need to set up an account, so account creation should be quick and easy for the user. Gotchas! If you do offer sign-up after checkout, make sure that the purchase the user just made is linked to their newly created account! Show checkout progress # You can make your checkout process feel less complex by showing progress and making it clear what needs to be done next. The video below shows how UK retailer johnlewis.com achieves this. Show checkout progress. You need to maintain momentum! For each step towards payment, use page headings and descriptive button values that make it clear what needs to be done now, and what checkout step is next. Give form buttons meaningful names that show what's next. Use the enterkeyhint attribute on form inputs to set the mobile keyboard enter key label. For example, use enterkeyhint="previous" and enterkeyhint="next" within a multi-page form, enterkeyhint="done" for the final input in the form, and enterkeyhint="search" for a search input. Enter key buttons on Android: 'next' and 'done'. The enterkeyhint attribute is supported on Android and iOS. You can find out more from the enterkeyhint explainer. Make it easy for users to go back and forth within the checkout process, to easily adjust their order, even when they're at the final payment step. Show full details of the order, not just a limited summary. Enable users to easily adjust item quantities from the payment page. Your priority at checkout is to avoid interrupting progress towards conversion. Remove distractions # Limit potential exit points by removing visual clutter and distractions such as product promotions. Many successful retailers even remove navigation and search from checkout. Search, navigation and other distractions removed for checkout. Keep the journey focused. This is not the time to tempt users to do something else! Don't distract customers from completing their purchase. 
For returning users you can simplify the checkout flow even more, by hiding data they don't need to see. For example: display the delivery address in plain text (not in a form) and allow users to change it via a link. Hide data customers don't need to see. Make it easy to enter name and address # Only ask for the data you need # Before you start coding your name and address forms, make sure to understand what data is required. Don't ask for data you don't need! The simplest way to reduce form complexity is to remove unnecessary fields. That's also good for customer privacy and can reduce back-end data cost and liability. Use a single name input # Allow your users to enter their name using a single input, unless you have a good reason for separately storing given names, family names, honorifics, or other name parts. Using a single name input makes forms less complex, enables cut-and-paste, and makes autofill simpler. In particular, unless you have good reason not to, don't bother adding a separate input for a prefix or title (like Mrs, Dr or Lord). Users can type that in with their name if they want to. Also, honorific-prefix autocomplete currently doesn't work in most browsers, and so adding a field for name prefix or title will break the address form autofill experience for most users. Enable name autofill # Use name for a full name: <input autocomplete="name" ...> If you really do have a good reason to split out name parts, make sure to use appropriate autocomplete values: honorific-prefix, given-name, nickname, additional-name-initial, additional-name, family-name, honorific-suffix. Allow international names # You might want to validate your name inputs, or restrict the characters allowed for name data. However, you need to be as unrestrictive as possible with alphabets. It's rude to be told your name is "invalid"! For validation, avoid using regular expressions that only match Latin characters.
Latin-only excludes users with names or addresses that include characters that aren't in the Latin alphabet. Allow Unicode letter matching instead—and ensure your back-end supports Unicode securely as both input and output. Unicode in regular expressions is well supported by modern browsers. Don't <!-- Names with non-Latin characters (such as Françoise or Jörg) are 'invalid'. --> <input pattern="[\w \-]+" ...> Do <!-- Accepts Unicode letters. --> <input pattern="[\p{L} \-]+" ...> Unicode letter matching compared to Latin-only letter matching. You can find out more about internationalization and localization below, but make sure your forms work for names in all regions where you have users. For example, for Japanese names you should consider having a field for phonetic names. This helps customer support staff say the customer's name on the phone. Allow for a variety of address formats # When you're designing an address form, bear in mind the bewildering variety of address formats, even within a single country. Be careful not to make assumptions about "normal" addresses. (Take a look at UK Address Oddities! if you're not convinced!) Make address forms flexible # Don't force users to try to squeeze their address into form fields that don't fit. For example, don't insist on a house number and street name in separate inputs, since many addresses don't use that format, and incomplete data can break browser autofill. Be especially careful with required address fields. For example, addresses in large cities in the UK do not have a county, but many sites still force users to enter one. Using two flexible address lines can work well enough for a variety of address formats. 
<input autocomplete="address-line1" id="address-line1" ...> <input autocomplete="address-line2" id="address-line2" ...> Add labels to match: <label for="address-line1"> Address line 1 (or company name) </label> <input autocomplete="address-line1" id="address-line1" ...> <label for="address-line2"> Address line 2 (optional) </label> <input autocomplete="address-line2" id="address-line2" ...> You can try this out by remixing and editing the demo embedded below. Caution: Research shows that Address line 2 may be problematic for users. Bear this in mind when designing address forms: you should consider alternatives such as using a single textarea (see below) or other options. Consider using a single textarea for address # The most flexible option for addresses is to provide a single textarea. The textarea approach fits any address format, and it's great for cutting and pasting, but bear in mind that it may not fit your data requirements, and users may miss out on autofill if they previously only used forms with address-line1 and address-line2. For a textarea, use street-address as the autocomplete value. Here is an example of a form that demonstrates the use of a single textarea for address: Internationalize and localize your address forms # It's especially important for address forms to consider internationalization and localization, depending on where your users are located. Be aware that the naming of address parts varies as well as address formats, even within the same language. ZIP code: US. Postal code: Canada. Postcode: UK. Eircode: Ireland. PIN: India. It can be irritating or puzzling to be presented with a form that doesn't fit your address or that doesn't use the words you expect. Customizing address forms for multiple locales may be necessary for your site, but using techniques to maximize form flexibility (as described above) may be adequate.
If you don't localize your address forms, make sure to understand the key priorities to cope with a range of address formats: Avoid being over-specific about address parts, such as insisting on a street name or house number. Where possible avoid making fields required. For example, addresses in many countries don't have a postal code, and rural addresses may not have a street or road name. Use inclusive naming: 'Country/region' not 'Country'; 'ZIP/postal code' not 'ZIP'. Keep it flexible! The simple address form example above can be adapted to work 'well enough' for many locales. Consider avoiding postal code address lookup # Some websites use a service to look up addresses based on postal code or ZIP. This may be sensible for some use cases, but you should be aware of the potential downsides. Postal code address suggestion doesn't work for all countries—and in some regions, post codes can include a large number of potential addresses. ZIP or postal codes may include a lot of addresses! It's difficult for users to select from a long list of addresses—especially on mobile if they're rushed or stressed. It can be easier and less error prone to let users take advantage of autofill, and enter their complete address filled with a single tap or click. A single name input enables one-tap (one-click) address entry. Simplify payment forms # Payment forms are the single most critical part of the checkout process. Poor payment form design is a common cause of shopping cart abandonment. The devil's in the details: small glitches can tip users towards abandoning a purchase, particularly on mobile. Your job is to design forms to make it as easy as possible for users to enter data. 
Help users avoid re-entering payment data # Make sure to add appropriate autocomplete values in payment card forms, including the payment card number, name on the card, and the expiry month and year: cc-number cc-name cc-exp-month cc-exp-year This enables browsers to help users by securely storing payment card details and correctly entering form data. Without autocomplete, users may be more likely to keep a physical record of payment card details, or store payment card data insecurely on their device. Caution: Don't add a selector for payment card type, since this can always be inferred from the payment card number. Avoid using custom elements for payment card dates # If not properly designed, custom elements can interrupt payment flow by breaking autofill, and won't work on older browsers. If all other payment card details are available from autocomplete but a user is forced to find their physical payment card to look up an expiry date because autofill didn't work for a custom element, you're likely to lose a sale. Consider using standard HTML elements instead, and style them accordingly. Autocomplete filled all the fields—except the expiry date! Use a single input for payment card and phone numbers # For payment card and phone numbers use a single input: don't split the number into parts. That makes it easier for users to enter data, makes validation simpler, and enables browsers to autofill. Consider doing the same for other numeric data such as PIN and bank codes. Don't use multiple inputs for a credit card number. Validate carefully # You should validate data entry both in realtime and before form submission. One way to do this is by adding a pattern attribute to a payment card input. If the user attempts to submit the payment form with an invalid value, the browser displays a warning message and sets focus on the input. No JavaScript required! 
However, your pattern regular expression must be flexible enough to handle the range of payment card number lengths: from 14 digits (or possibly less) to 20 (or more). You can find out more about payment card number structuring from LDAPwiki. Allow users to include spaces when they're entering a new payment card number, since this is how numbers are displayed on physical cards. That's friendlier to the user (you won't have to tell them "they did something wrong"), less likely to interrupt conversion flow, and it's straightforward to remove spaces in numbers before processing. You may want to use a one-time passcode for identity or payment verification. However, asking users to manually enter a code or copy it from an email or an SMS text is error-prone and a source of friction. Learn about better ways to enable one-time passcodes in SMS OTP form best practices. Test on a range of devices, platforms, browsers and versions # It's particularly important to test address and payment forms on the platforms most common for your users, since form element functionality and appearance may vary, and differences in viewport size can lead to problematic positioning. BrowserStack enables free testing for open source projects on a range of devices and browsers. The same page on iPhone 7 and iPhone 11. Reduce padding for smaller mobile viewports to ensure the Complete payment button isn't hidden. Implement analytics and RUM # Testing usability and performance locally can be helpful, but you need real-world data to properly understand how users experience your payment and address forms. For that you need analytics and Real User Monitoring—data for the experience of actual users, such as how long checkout pages take to load or how long payment takes to complete: Page analytics: page views, bounce rates and exits for every page with a form. 
Interaction analytics: goal funnels and events indicate where users abandon your checkout flow and what actions they take when interacting with your forms. Website performance: user-centric metrics can tell you if your checkout pages are slow to load and, if so, what the cause is. Page analytics, interaction analytics, and real user performance measurement become especially valuable when combined with server logs, conversion data, and A/B testing, enabling you to answer questions such as whether discount codes increase revenue, or whether a change in form layout improves conversions. That, in turn, gives you a solid basis for prioritizing effort, making changes, and rewarding success. Keep learning # Sign-in form best practices. Sign-up form best practices. Verify phone numbers on the web with the WebOTP API. Create Amazing Forms. Best Practices For Mobile Form Design. More capable form controls. Creating Accessible Forms. Streamlining the Sign-up Flow Using Credential Management API. Frank's Compulsive Guide to Postal Addresses provides useful links and extensive guidance for address formats in over 200 countries. Countries Lists has a tool for downloading country codes and names in multiple languages, in multiple formats. Photo by @rupixen on Unsplash.

Automating audits with AutoWebPerf

What is AutoWebPerf (AWP)? # AutoWebPerf (AWP) is a modular tool that enables automatic gathering of performance data from multiple sources. Currently there are many tools available to measure website performance for different scopes (lab and field), such as Chrome UX Report, PageSpeed Insights, or WebPageTest. AWP offers integration with various audit tools with a simple setup so you can continuously monitor your site's performance in one place. The release of Web Vitals guidance means that close and active monitoring of web pages is becoming increasingly important. The engineers behind this tool have been doing performance audits for years and they created AWP to automate a manual, recurring, and time-consuming part of their daily activities. Today, AWP has reached a level of maturity where it's ready to be shared broadly so anyone can benefit from the automation it brings. The tool is accessible on the AutoWebPerf public repository on GitHub. What is AWP for? # Although several tools and APIs are available to monitor the performance of web pages, most of them expose data measured at a specific time. To adequately monitor a website and maintain good performance of key pages, it's recommended to continuously take measurements of Core Web Vitals over time and observe trends. AWP makes that easier by providing an engine and pre-built API integrations which can be programmatically configured to automate recurrent queries to various performance monitoring APIs. For example, with AWP, you can set a daily test on your home page to capture the field data from CrUX API and lab data from a Lighthouse report from PageSpeed Insights. This data can be written and stored over time, for example, in Google Sheets and then visualised in the Data Studio dashboard. AWP automates the heavy-lifting part of the entire process, making it a great solution to follow lab and field trends over time. (See Visualising audit results in Data Studio below for more details.)
Architecture overview # AWP is a modular library with three different types of modules: the engine, connector modules, and gatherer modules. The engine takes a list of tests from a connector (for example, from a local CSV file), runs performance audits through selected gatherers (such as PageSpeed Insights), and writes results to the output connector (for example, Google Sheets). AWP comes with a number of pre-implemented gatherers and connectors. Pre-implemented gatherers: CrUX API, CrUX BigQuery, PageSpeed Insights API, WebPageTest API. Pre-implemented connectors: Google Sheets, JSON, CSV. Automating audits with AWP # AWP automates the performance audits via your preferred audit platforms such as PageSpeed Insights, WebPageTest, or CrUX API. AWP offers the flexibility to choose where to load the list of tests, and where to write the results to. For example, you can run audits for a list of tests stored in a Google Sheet, and write the results to a CSV file, with the command below: PSI_APIKEY=<YOUR_KEY> SHEETS_APIKEY=<YOUR_KEY> ./awp run sheets:<SheetID> csv:output.csv Recurring audits # You can run recurring audits in daily, weekly, or monthly frequency. For example, you can run daily audits for a list of tests defined in a local JSON like below: { "tests": [ { "label": "web.dev", "url": "https://web.dev", "gatherer": "psi" } ] } The command below reads the list of audit tests from the local JSON file, runs audits on a local machine, then outputs results to a local CSV file: PSI_APIKEY=<YOUR_KEY> ./awp run json:tests.json csv:output.csv To run audits every day as a background service continuously, you can use the command below instead: PSI_APIKEY=<YOUR_KEY> ./awp continue json:tests.json csv:output.csv Alternatively, you can set up the crontab in a Unix-like environment to run AWP as a daily cron job: 0 0 * * * PSI_APIKEY=<YOUR_KEY> ./awp run json:tests.json csv:output.csv You can find more ways to automate daily audits and result collection in the AWP GitHub repository.
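If the list of pages to audit is maintained elsewhere, a file in the shape of the tests.json example above could be generated rather than written by hand. The following is a minimal sketch that assumes only the three fields shown in that example (label, url, gatherer); the function name `buildTests` is hypothetical and is not part of AWP.

```javascript
// Hypothetical helper (not part of AWP): builds an object in the same shape
// as the tests.json example above, with one test entry per URL.
function buildTests(urls, gatherer = 'psi') {
  return {
    tests: urls.map((url) => ({
      // Use the hostname as the label, as in the "web.dev" example.
      label: new URL(url).hostname,
      url,
      gatherer,
    })),
  };
}

// Serialize with indentation so the resulting file stays readable.
const json = JSON.stringify(buildTests(['https://web.dev']), null, 2);
console.log(json);
```

The printed JSON can then be saved as tests.json and passed to the awp commands shown above.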
Visualising audit results in Data Studio # Along with continuously measuring Core Web Vitals, it is important to be able to evaluate the trends and discover potential regressions with real user metrics (RUM) or the Chrome UX Report (CrUX) data collected by AWP. Note that Chrome UX Report (CrUX) is a 28-day moving aggregation, so it is recommended to also use your own RUM data along with CrUX to spot regressions sooner. Data Studio is a free visualization tool that you can easily load performance metrics into and draw trends as charts. For example, the time series charts below show Core Web Vitals based on Chrome UX Report data. One of the charts shows increasing Cumulative Layout Shift in recent weeks, which means regressions in the layout stability for certain pages. In this scenario, you would want to prioritize the efforts to analyze the underlying issues of these pages. To simplify the end-to-end process from data collection to visualization, you can run AWP with a list of URLs to automatically export results to Google Sheets with the following command: PSI_APIKEY=<YOUR_KEY> SHEETS_APIKEY=<YOUR_KEY> ./awp run sheets:<SheetID> csv:output.csv After collecting daily metrics in a spreadsheet, you can create a Data Studio dashboard that loads the data directly from the spreadsheet, and plots the trends into a time series chart. Check out Google Spreadsheets API Connector for detailed steps about how to set up AWP with spreadsheets as a data source to visualize on Data Studio. What's next? # AWP provides a simple and integrated way to minimize the efforts to set up a continuous monitoring pipeline to measure Core Web Vitals and other performance metrics. For now, AWP covers the most common use cases and will continue to provide more features to address other use cases in the future. Learn more in the AutoWebPerf repository.

Workers overview

This overview explains how web workers and service workers can improve the performance of your website, and when to use a web worker versus a service worker. Check out the rest of this series for specific patterns of window and service worker communication. How workers can improve your website # The browser uses a single thread (the main thread) to run all the JavaScript in a web page, as well as to perform tasks like rendering the page and performing garbage collection. Running excessive JavaScript code can block the main thread, delaying the browser from performing these tasks and leading to a poor user experience. In iOS/Android application development, a common pattern to ensure that the app's main thread remains free to respond to user events is to offload operations to additional threads. In fact, in the latest versions of Android, blocking the main thread for too long leads to an app crash. On the web, JavaScript was designed around the concept of a single thread, and lacks capabilities needed to implement a multithreading model like the one apps have, like shared memory. Despite these limitations, a similar pattern can be achieved in the web by using workers to run scripts in background threads, allowing them to perform tasks without interfering with the main thread. Workers are an entire JavaScript scope running on a separate thread, without any shared memory. In this post you'll learn about two different types of workers (web workers and service workers), their similarities and differences, and the most common patterns for using them in production websites. Web workers and service workers # Similarities # Web workers and service workers are two types of workers available to websites. They have some things in common: Both run in a secondary thread, allowing JavaScript code to execute without blocking the main thread and the user interface. 
They don't have access to the Window and Document objects, so they can't interact with the DOM directly, and they have limited access to browser APIs. Differences # One might think that most things that can be delegated to a web worker can be done in a service worker and vice versa, but there are important differences between them: Unlike web workers, service workers allow you to intercept network requests (via the fetch event) and to listen for Push API events in the background (via the push event). A page can spawn multiple web workers, but a single service worker controls all the active tabs under the scope it was registered with. The lifespan of the web worker is tightly coupled to the tab it belongs to, while the service worker's lifecycle is independent of it. For that reason, closing the tab where a web worker is running will terminate it, while a service worker can continue running in the background, even when the site doesn't have any active tabs open. For relatively short bits of work like sending a message, the browser is unlikely to terminate a service worker when there are no active tabs, but if the task takes too long the browser will terminate the service worker, because otherwise it would be a risk to the user's privacy and battery. APIs like Background Fetch can let you avoid the service worker's termination. Use cases # The differences between both types of workers suggest in which situations one might want to use one or the other: Use cases for web workers are more commonly related to offloading work (like heavy computations) to a secondary thread, to avoid blocking the UI. Example: the team that built the videogame PROXX wanted to leave the main thread as free as possible to take care of user input and animations. To achieve that, they used web workers to run the game logic and state maintenance on a separate thread.
Service worker tasks are generally more related to acting as a network proxy, handling background tasks, and things like caching and offline. Example: In a podcast PWA, one might want to allow users to download complete episodes to listen to them while offline. A service worker and, in particular, the Background Fetch API can be used to that end. That way, if the user closes the tab while the episode is downloading, the task doesn't have to be interrupted. The UI is updated to indicate the progress of a download (left). Thanks to service workers, the operation can continue running when all tabs have been closed (right). Tools and libraries # Window and worker communication can be implemented by using different lower level APIs. Fortunately, there are libraries that abstract this process, taking care of the most common use cases. In this section, we'll cover two of them that take care of window to web workers and service workers respectively: Comlink and Workbox. Comlink # Comlink is a small (1.6k) RPC library that takes care of many underlying details when building websites that use Web Workers. It has been used in websites like PROXX and Squoosh. A summary of its motivations and code samples can be found here. Workbox # Workbox is a popular library to build websites that use service workers. It packages a set of best practices around things like caching, offline, background synchronization, etc. The workbox-window module provides a convenient way to exchange messages between the service worker and the page. Next steps # The rest of this series focuses on patterns for window and service worker communication: Imperative caching guide: Calling a service worker from the page to cache resources in advance (e.g. in prefetching scenarios). Broadcast updates: Calling the page from the service worker to inform about important updates (e.g. a new version of the website is available). Two-way communication: Delegating a task to a service worker (e.g.
a heavy download), and keeping the page informed on the progress. For patterns of window and web worker communication check out: Use web workers to run JavaScript off the browser's main thread.
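The web worker use case described in this overview (moving a heavy computation off the main thread) can be sketched as follows. This is only an illustration under stated assumptions: `sumPrimesBelow` and `sumPrimesInWorker` are hypothetical names, summing primes stands in for any expensive task, and creating the worker from a Blob URL is one convenient option for a single-file example that a site's Content Security Policy may not allow. A separate worker script file works just as well.

```javascript
// The heavy work, written as a plain function so it can run on any thread.
// Summing primes is just a stand-in for an expensive computation.
function sumPrimesBelow(limit) {
  let sum = 0;
  for (let n = 2; n < limit; n++) {
    let isPrime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) {
        isPrime = false;
        break;
      }
    }
    if (isPrime) sum += n;
  }
  return sum;
}

// Browser-only: runs sumPrimesBelow in a web worker built from a Blob URL,
// so the main thread stays free for user input and rendering.
function sumPrimesInWorker(limit) {
  const source = `${sumPrimesBelow.toString()}
self.onmessage = (event) => self.postMessage(sumPrimesBelow(event.data));`;
  const blobUrl = URL.createObjectURL(
    new Blob([source], { type: 'text/javascript' })
  );
  const worker = new Worker(blobUrl);
  // Stop the worker and release the Blob URL once the job is finished.
  const cleanup = () => {
    worker.terminate();
    URL.revokeObjectURL(blobUrl);
  };
  return new Promise((resolve, reject) => {
    worker.onmessage = (event) => {
      cleanup();
      resolve(event.data);
    };
    worker.onerror = (error) => {
      cleanup();
      reject(error);
    };
    worker.postMessage(limit);
  });
}
```

On a page, calling `sumPrimesInWorker(1000000)` returns a promise that resolves with the result while the main thread stays responsive. Libraries such as Comlink, mentioned above, hide most of this message-passing plumbing.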

PWA users are 2.5x more likely to purchase Gravit Designer PRO

+24% PWA users have 24% more active sessions than all other platforms +31% PWA accounts for 31% more repeat users than all other platforms 2.5x PWA users are 2.5x more likely to purchase Gravit Designer PRO Reza is a product manager at Corel. Corel Corporation's Gravit Designer is a powerful vector design tool. With roots as a startup, Gravit Designer joined Corel's extensive product portfolio in 2018, and serves tens of thousands of daily active users demanding rich, affordable, and accessible vector illustration software. Corel builds a host of creative and productivity software including CorelDRAW, Corel PHOTO-PAINT, Corel Painter, Parallels, and more. Gravit Designer's target audience is creators of all stripes - from students learning about vector illustration to seasoned designers looking for a fully-functional solution. Corel has always wanted to meet designers and creatives where they are, on their platform of choice, and Gravit Designer allows us to deliver powerful vector illustration tools via the web. Progressive web apps (PWAs) are of particular interest to Gravit Designer and Corel's Online Graphics initiatives, as they help bridge the gap between web apps and traditional desktop applications. Progressive web apps are quickly becoming the preferred way to deliver desktop experiences for traditional web apps. Chrome OS and the Play Store also present a great opportunity to Corel by offering secure in-app payments, PWA support for bringing the web app experience to the desktop in a seamless manner (local font and file system access are particularly relevant for us), and most importantly, greater visibility to our web apps for more users. Students and educators can install the Chrome OS version of Gravit Designer with ease, and enjoy the same powerful features regardless of platform. Engineering challenges # There are a great many engineering challenges with supporting multiple platforms, particularly web and desktop. 
In our case, we take great care when deciding to support a new platform, as our app began its life on the web. When supporting desktop platforms, we typically have to wrap our application in a supporting container, which brings its own set of challenges depending on the host platform. Our users want an experience that carries over seamlessly from one platform to another. This is vital to many of our customers who might switch from web, to desktop, to Chromebooks, and back to web in the course of a design. Furthermore, our users want their work to travel with them, unencumbered by their situation. Whether on-the-go, offline, or connected to the internet, they want their documents accessible in the Gravit Cloud, for example. At Corel, we have decades of experience porting software to many platforms and navigating the challenges therein. There is a balancing act in ensuring proper performance, feature parity, and platform-specific UI support. Gravit Designer is no stranger to these challenges. Gravit Designer's desktop PWA # With some platforms, the answer will be wrapping a web app in a platform-specific container application for the foreseeable future (e.g. Electron). However, with PWAs and Chrome OS we can start to deliver on the promise of a web app ported to a desktop experience with minimal disruption. For Gravit Designer, our team could see the growing value of PWAs, and made great strides to support it as an enabling technology going forward. The potential of several major platforms supporting PWA (namely Chrome OS, iOS, Windows, and more) could usher in a new era of cross-platform support for certain applications. Since Chrome was the clear leader in browsers among our users, and provided the best user experience for PWA, we decided to investigate the work involved in building out a PWA version of Gravit Designer. The team began by first creating a proof-of-concept to understand the effort required. 
Next came the development effort associated with local font and local file system support. In the end, we had to stage our support for local fonts. Once improvements were made to file loading times, installation, and performance, we felt more comfortable moving past the proof-of-concept phase and targeting PWA support for a major release. Impact # Since launching our desktop PWA, we've seen a steady increase in installations, and we're excited by the prospect of releasing the PWA version with enhanced platform-specific features for Chrome OS and other platforms. In fact, the standard PWA version of Gravit Designer now leads downloads from the Microsoft Store and Linux installations, so we're looking forward to even more growth. Key figures # 18% of total Chrome OS users have installed our PWA (PWA installs across all operating systems account for ~5% of our total). PWA users are 24% more active than all other install types (more sessions per user). PWA accounts for 31% more repeat users than all other platforms. PWA users are 2.5x more likely to purchase Gravit Designer PRO. PWA makes up about 5% of all new user accounts, and growing. Summary # The growth of PWA installations in general, past other more established platforms, points to a future where we could offer a satisfying desktop experience without the need for platform-specific wrappers on a multitude of platforms. Our work with Google on PWAs and Chrome OS is vital to this aim, as more and more features are supported.

Clipchamp's video editor PWA installs see a 97% monthly growth

97% Monthly growth in PWA installations 2.3x Performance improvement 9% Higher retention in PWA users Clipchamp is the in-browser online video editor that empowers anyone to tell stories worth sharing through video. Around the world, over 12 million creators use Clipchamp to easily edit videos. We offer simple solutions for making videos, from intuitive tools like crop and trim, to practical features like our screen recorder, and even a meme maker. Who uses Clipchamp? # Our users (or everyday editors as we call them) are diverse. No expertise is necessary to be a video editor with Clipchamp. Specifically, we're currently noticing sales, support training, and product marketing teams using our webcam and screen recorder for quick explainer content with added text and GIFs to make it engaging. We're also observing a lot of small businesses edit and post social videos while on the move. What challenges do they face? # We recognise that video editing can be intimidating at first. The assumption is that it's hard, probably due to previous frustrating experiences with complex editing software. In contrast, Clipchamp focuses on ease and simplicity, providing support with text overlays, stock video and music, templates, and more. We find most everyday editors aren't wanting to create motion picture masterpieces. We talk to our users a lot and are continually reminded that they're busy and just want to get their story out to the world as quickly and easily as possible, so this is a focus for us. Developing a Clipchamp PWA # At Clipchamp, we're all about empowering people to tell their stories through video. To live up to this vision, we soon realised that allowing our users to use their own footage when putting together a video project is important. That insight put the pressure on Clipchamp's engineering team to come up with a technology that can efficiently process Gigabyte-scale media files in a web application. 
Having network bandwidth constraints in mind, we were quick to rule out a traditional cloud-based solution. Uploading large media files from a retail internet connection would invariably introduce massive waiting times before editing could even begin, effectively resulting in a poor user experience. That made us switch to a fully in-browser solution, where all the "heavy lifting" of video processing is done locally using hardware resources available on the end user's device. We strategically bet on the Chrome browser and, by extension, the Chrome OS platform to help us overcome the inevitable challenges of building an in-browser video creation platform. Video processing is enormously resource hungry, affecting computer and storage resources alike. We started out building the first version of Clipchamp on top of Google's (Portable) Native Client (PNaCl). While eventually phased out, PNaCl was a great confirmation for our team that web apps can be fast and low latency, while still running on end user hardware. When later switching to WebAssembly, we were glad to see Chrome taking the lead in incorporating post-MVP features such as bulk memory operations, threading, and most recently: fixed-width vector operations. The latter has been hotly anticipated by our engineering team, offering us the ability to optimize our video processing stack to take advantage of SIMD operations, prevalent on contemporary CPUs. Taking advantage of Chrome's WebAssembly SIMD support, we were able to speed up some particularly demanding workloads such as 4K video decoding and video encoding. With little prior experience and in less than a month of effort for one of our engineers, we managed to improve performance by 2.3x. While still limited to a Chrome origin trial, we were already able to roll out these SIMD enhancements to the majority of our users. 
While our users run wildly different hardware setups, we were able to confirm a matching performance uplift in production without seeing any detrimental effects in failure rates. More recently, we integrated the emerging WebCodecs API, currently available under another Chrome origin trial. Using this new capability, we will be able to further improve performance of video decoding on low-spec hardware as found in many popular Chromebooks. With a PWA created, it's important to encourage its adoption. As with many web apps, we've focused on ease of access which includes things like social logins including Google, quickly getting the user into a place where they can edit video and then making it easy to export the video. Additionally, we promoted our PWA install prompts in the toolbar and as a pop-up notice in our menu navigation. Results # Our installable Chrome PWA has been doing really well. We've been so pleased to see 9% higher retention with PWA users than with our standard desktop users. Installation of the PWA has been massive, increasing at a rate of 97% a month since we launched five months ago. And, as mentioned before, the WebAssembly SIMD enhancements improved performance 2.3x. Future # We're pleasantly surprised by the engagement and uptake of our PWA. We think Clipchamp user retention benefited because the PWA is installed and easier to get to. We also noted the PWA performs better for the editor, which makes it more compelling and keeps people coming back. Looking to the future, we're excited about the opportunity Chrome OS provides for even more users to get more done with less fuss. Specifically, we're excited about some of the convenience integrations with the local OS when working with files. We think this will help speed up workflows for our busy everyday editors, and that's one of our highest priorities.

Disable mouse acceleration to provide a better FPS gaming experience

Accelerated movement is an ergonomic feature when using a mouse or trackpad to move the pointer on screen. It allows precise movement by moving slowly while also allowing the pointer to cross the entire screen with a quick short motion. Specifically, for the same physical distance that you move the mouse, the pointer on screen travels further if the distance was traveled faster. Operating systems enable mouse acceleration by default. For some first-person perspective games, commonly first-person shooters (FPS), raw mouse input data is used to control camera rotation without an acceleration adjustment. The same physical motion, slow or fast, results in the same rotation. This results in a better gaming experience and higher accuracy according to professional gamers. Pointer motion control in Windows 10 settings. Starting in Chrome 88, web apps can switch back and forth between accelerated and non-accelerated mouse movement data thanks to the updated Pointer Lock API. Web-based gaming platforms such as Google Stadia and Nvidia GeForce Now already use these new capabilities to please FPS gamers. Using the API # Request a pointer lock # A pointer lock is the canonical term for when a desktop application hides the pointer icon and interprets mouse motion for something else, e.g. looking around in a 3D world. The movementX and movementY attributes from the mousemove document events tell you how much the mouse pointer moved since the last move event. However, those are not updated when the pointer moves outside of the web page. document.addEventListener("mousemove", (event) => { console.log(`movementX: ${event.movementX} movementY: ${event.movementY}`); }); Capturing the mouse pointer (or requesting a pointer lock) allows you to not worry about the pointer moving outside anymore. This is especially useful for immersive web games. When the pointer is locked, all mouse events go to the target element of the pointer lock.
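As a rough illustration of how a game might consume these values, the sketch below turns movementX and movementY deltas into camera yaw and pitch. The helper name applyMouseLook, the sensitivity constant, and the pitch clamp are assumptions made for this example; they are not part of the Pointer Lock API.

```javascript
// Hypothetical helper, not part of any API: converts mouse deltas into
// camera angles expressed in radians.
const SENSITIVITY = 0.002; // assumed radians per unit of reported movement
const MAX_PITCH = Math.PI / 2; // stop the camera from flipping over

function applyMouseLook(camera, movementX, movementY) {
  // Horizontal motion turns the camera left or right.
  const yaw = camera.yaw + movementX * SENSITIVITY;
  // Moving the mouse up reports a negative movementY, so subtract it
  // to make the camera look up.
  const unclampedPitch = camera.pitch - movementY * SENSITIVITY;
  const pitch = Math.max(-MAX_PITCH, Math.min(MAX_PITCH, unclampedPitch));
  return { yaw, pitch };
}

// In a page, this could be wired to the mousemove listener shown above:
// let camera = { yaw: 0, pitch: 0 };
// document.addEventListener("mousemove", (event) => {
//   camera = applyMouseLook(camera, event.movementX, event.movementY);
// });
```

Because the rotation is a fixed multiple of the reported delta, any acceleration applied by the operating system shows up directly in the camera motion, which is why raw input matters for this use case.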
Call requestPointerLock() on the target element to request a pointer lock, and listen to pointerlockchange and pointerlockerror events to monitor pointer lock changes. const myTargetElement = document.body; // Call this function to request a pointer lock. function requestPointerLock() { myTargetElement.requestPointerLock(); } document.addEventListener("pointerlockchange", () => { if (document.pointerLockElement) { console.log(`pointer is locked on ${document.pointerLockElement}`); } else { console.log("pointer is unlocked"); } }); document.addEventListener("pointerlockerror", () => { console.log("pointer lock error"); }); Disable mouse acceleration # Call requestPointerLock() with { unadjustedMovement: true } to disable OS-level adjustment for mouse acceleration, and access raw mouse input. This way, mouse movement data from mousemove events won't include mouse acceleration when the pointer is locked. Use the new returned promise from requestPointerLock() to know if the request was successful. function requestPointerLockWithUnadjustedMovement() { const promise = myTargetElement.requestPointerLock({ unadjustedMovement: true, }); if (!promise) { console.log("disabling mouse acceleration is not supported"); return; } return promise .then(() => console.log("pointer is locked")) .catch((error) => { if (error.name === "NotSupportedError") { // Some platforms may not support unadjusted movement. // You can request again a regular pointer lock. return myTargetElement.requestPointerLock(); } }); } It is possible to toggle between accelerated and non-accelerated mouse movement data without releasing the pointer lock. Simply request the pointer lock again with the desired option. If that request fails, the original lock will remain intact and the returned promise will reject. No pointer lock events will fire for a failed change request. Browser support # The Pointer Lock API is well supported across browsers. However Chromium-based browsers (e.g. Chrome, Edge, etc.) 
are the only ones to support disabling OS-level adjustment for mouse acceleration as of October 2020. See MDN's Browser compatibility table for updates. Operating system support # Disabling OS-level adjustment for mouse acceleration is supported on Chrome OS, macOS Catalina 10.15.1, and Windows. Linux will follow. Sample # You can play with the Pointer Lock API by running the sample on Glitch. Be sure to check out the source code. Helpful links # Explainer Specification PR GitHub repository ChromeStatus entry Chrome tracking bug Intent to ship Mozilla's position WebKit's position Acknowledgements # Thanks to James Hollyer, Thomas Steiner, Joe Medley, Kayce Basques, and Vincent Scheib for their reviews of this article.

Building a Stories component

In this post I want to share my thinking on building a Stories component for the web that is responsive, supports keyboard navigation, and works across browsers. If you would prefer a hands-on demonstration of building this Stories component yourself, check out the Stories component codelab. A YouTube version of this post is also available. Overview # Two popular examples of the Stories UX are Snapchat Stories and Instagram Stories (not to mention fleets). In general UX terms, Stories are usually a mobile-only, tap-centric pattern for navigating multiple subscriptions. For example, on Instagram, users open a friend's story and go through the pictures in it. They generally do this for many friends at a time. By tapping on the right side of the device, a user skips ahead to that friend's next story. By swiping right, a user skips ahead to a different friend. A Story component is fairly similar to a carousel, but allows navigating a multi-dimensional array as opposed to a single-dimensional array. It's as if there's a carousel inside each carousel. 🤯 1st carousel of friends 2nd "stacked" carousel of stories 👍 List in a list, aka: a multi-dimensional array Picking the right tools for the job # All in all I found this component pretty straightforward to build, thanks to a few critical web platform features. Let's cover them! CSS Grid # Our layout turned out to be no tall order for CSS Grid as it's equipped with some powerful ways to wrangle content.
Friends layout # Our primary .stories component wrapper is a mobile-first horizontal scrollview: .stories { inline-size: 100vw; block-size: 100vh; display: grid; grid: 1fr / auto-flow 100%; gap: 1ch; overflow-x: auto; scroll-snap-type: x mandatory; overscroll-behavior: contain; touch-action: pan-x; } /* desktop constraint */ @media (hover: hover) and (min-width: 480px) { .stories { max-inline-size: 480px; max-block-size: 848px; } } Using Chrome DevTools' Device Mode to highlight the columns created by Grid. Let's break down that grid layout: We explicitly fill the viewport on mobile with 100vh and 100vw and constrain the size on desktop. / separates our row and column templates. auto-flow translates to grid-auto-flow: column. The auto-flow template is 100%, which in this case is whatever the scroll window width is. Note that the location of the / separator relative to auto-flow is important. If auto-flow came before / it would be shorthand for grid-auto-flow: row. On a mobile phone, think of this like the row size being the viewport height and each column being the viewport width. Continuing with the Snapchat Stories and Instagram Stories example, each column will be a friend's story. We want friends' stories to continue outside of the viewport so we have somewhere to scroll to. Grid will make however many columns it needs to lay out your HTML for each friend story, creating a dynamic and responsive scrolling container for us. Grid enabled us to centralize the whole effect. Stacking # For each friend we need their stories in a pagination-ready state. In preparation for animation and other fun patterns, I chose a stack. When I say stack, I mean like you're looking down on a sandwich, not like you're looking from the side. With CSS grid, we can define a single-cell grid (i.e.
a square), where the rows and columns share an alias ([story]), and then each child gets assigned to that aliased single-cell space: .user { display: grid; grid: [story] 1fr / [story] 1fr; scroll-snap-align: start; scroll-snap-stop: always; } .story { grid-area: story; background-size: cover; … } This puts our HTML in control of the stacking order and also keeps all elements in flow. Notice how we didn't need to do anything with absolute positioning or z-index and we didn't need to box correct with height: 100% or width: 100%. The parent grid already defined the size of the story picture viewport, so none of these story components needed to be told to fill it! CSS Scroll Snap Points # The CSS Scroll Snap Points spec makes it a cinch to lock elements into the viewport on scroll. Before these CSS properties existed, you had to use JavaScript, and it was… tricky, to say the least. Check out Introducing CSS Scroll Snap Points by Sarah Drasner for a great breakdown of how to use them. Horizontal scrolling without and with scroll-snap-points styles. Without it, users can free scroll as normal. With it, the browser rests gently on each item. parent .stories { display: grid; grid: 1fr / auto-flow 100%; gap: 1ch; overflow-x: auto; scroll-snap-type: x mandatory; overscroll-behavior: contain; touch-action: pan-x; } Parent with overscroll defines snap behavior. child .user { display: grid; grid: [story] 1fr / [story] 1fr; scroll-snap-align: start; scroll-snap-stop: always; } Children opt into being a snap target. I chose Scroll Snap Points for a few reasons: Free accessibility. The Scroll Snap Points spec states that pressing the Left Arrow and Right Arrow keys should move through the snap points by default. A growing spec. The Scroll Snap Points spec is getting new features and improvements all the time, which means that my Stories component will probably only get better from here on out. Ease of implementation.
Scroll Snap Points are practically built for the touch-centric horizontal-pagination use case. Free platform-style inertia. Every platform will scroll and rest in its style, as opposed to normalized inertia which can have an uncanny scrolling and resting style. Cross-browser compatibility # We tested on Opera, Firefox, Safari, and Chrome, plus Android and iOS. Here's a brief rundown of the web features where we found differences in capabilities and support. Success: All of the features chosen were supported and none were buggy. We did though have some CSS not apply, so some platforms are currently missing out on UX optimizations. I did enjoy not needing to manage these features and feel confident that they'll eventually reach other browsers and platforms. scroll-snap-stop # Carousels were one of the major UX use cases that prompted the creation of the CSS Scroll Snap Points spec. Unlike Stories, a carousel doesn't always need to stop on each image after a user interacts with it. It might be fine or encouraged to quickly cycle through the carousel. Stories, on the other hand, are best navigated one-by-one, and that's exactly what scroll-snap-stop provides. .user { scroll-snap-align: start; scroll-snap-stop: always; } At the time of writing this post, scroll-snap-stop is only supported on Chromium-based browsers. Check out Browser compatibility for updates. It's not a blocker, though. It just means that on unsupported browsers users can accidentally skip a friend. So users will just have to be more careful, or we'll need to write JavaScript to ensure that a skipped friend isn't marked as viewed. Read more in the spec if you're interested. overscroll-behavior # Have you ever been scrolling through a modal when all of a sudden you start scrolling the content behind the modal? overscroll-behavior lets the developer trap that scroll and never let it leave. It's nice for all sorts of occasions. 
My Stories component uses it to prevent additional swipes and scrolling gestures from leaving the component. .stories { overflow-x: auto; overscroll-behavior: contain; } Safari and Opera were the 2 browsers that didn't support this, and that's totally OK. Those users will get an overscroll experience like they're used to and may never notice this enhancement. I'm personally a big fan and like including it as part of nearly every overscroll feature I implement. It's a harmless addition that can only lead to improved UX. scrollIntoView({behavior: 'smooth'}) # When a user taps or clicks and has reached the end of a friend's set of stories, it's time to move to the next friend in the scroll snap point set. With JavaScript, we were able to reference the next friend and request for it to be scrolled into view. The support for the basics of this are great; every browser scrolled it into view. But, not every browser did it 'smooth'. This just means it's scrolled into view instead of snapped. element.scrollIntoView({ behavior: 'smooth' }) Safari was the only browser not to support behavior: 'smooth' here. Check out Browser compatibility for updates. Hands-on # Now that you know how I did it, how would you?! Let's diversify our approaches and learn all the ways to build on the web. Create a Glitch, tweet me your version, and I'll add it to the Community remixes section below. Community remixes # @geoffrich_ with Svelte: demo & code @GauteMeekOlsen with Vue: demo + code @AnaestheticsApp with Lit: demo & code

JD.ID improves their mobile conversion rate by 53% with caching strategies, installation, and push notifications

JD.ID is an e-commerce platform in Indonesia providing delivery services for a wide range of products including electronic devices, household appliances, clothing, fashion accessories, and sports products. Currently operating across more than 350 Indonesian cities, JD.ID wanted to expand its online presence further by focusing on performance and a strong network-independent experience for their Progressive Web App (PWA). With this enhanced experience, JD.ID was able to increase its overall mobile conversion rate (mCVR) by 53%, its mCVR for installed users by 200%, and its daily active users by 26%, putting it on course to becoming the most popular and trusted e-commerce company in the country. Highlighting the opportunity # To overcome the unstable mobile networks in Indonesia due to the vast number of operators, JD.ID was looking for a solution that would keep its website and user experience performing at all times, as well as solve any local caching issues. It saw huge acquisition potential from users that had visited its website but not downloaded the iOS/Android app. To capture this opportunity it used PWA best practices to help build an app-like UX on its website to enhance engagement, with a focus on network resilience for dependability. The approach # Caching strategies # To mitigate network issues and improve user experience, the JD.ID team used Workbox to ensure its PWA performed well even when the user was offline or on a bad network. Workbox made it easier to execute their PWA caching strategy, which consisted of 3 parts: Network first, falling back to cache: This strategy aims to get a response from the network first. Once a response is received, it passes it to the browser and saves it to a cache. If the network request fails, the last cached response will be used. JD.ID applied this strategy to the homepage to ensure that users can access the homepage even if they're offline. 
Cache first, falling back to network: This strategy checks the cache for a response first and uses it if available. If not, the JD.ID website goes to the network, caches the response, and then passes it to the browser. When the service worker gets installed, it will have the static resources of the homepage, offline fallback page (explained below), category page, product page, shopping cart, and settlement page cached into the user's cache in advance. When the user routes to any of these pages, this caching strategy ensures the browser gets the static resource files from the cache directly, improving the loading speed of these critical pages. Network only: This strategy forces the response to come from the network only. JD.ID uses this strategy for the shopping cart and settlement page because those pages require very high data accuracy. Workbox also enables JD.ID to configure routing rules, the default duration of request timeouts, the number of responses that can be stored in the cache, and the duration of how long responses should be cached. Offline fallback page # The JD.ID team created an offline fallback page to provide users with a consistent experience and enhance the branding for the website. They also added a web app manifest which enables users to easily install the web app on their mobile device. Push notifications # Additionally, for further re-engagement, JD.ID implemented push notifications with Firebase Cloud Messaging for Web, applying them specifically during product sale promotional events. Overall business results # Overall mobile conversion rate (mCVR) improved 53% mCVR for users who installed the JD.ID PWA improved 200% Fengxian Liu, Web Engineering Manager, JD.ID Check out the Scale on web case studies page for more success stories from India and Southeast Asia.

Schemeful Same-Site

This article is part of a series on the SameSite cookie attribute changes: SameSite cookies explained SameSite cookies recipes Schemeful Same-Site Schemeful Same-Site modifies the definition of a (web)site from just the registrable domain to the scheme + registrable domain. You can find more details and examples in Understanding "same-site" and "same-origin". Key Term: This means that the insecure HTTP version of a site, for example, http://website.example, and the secure HTTPS version of that site, https://website.example, are now considered cross-site to each other. The good news is: if your website is already fully upgraded to HTTPS then you don't need to worry about anything. Nothing will change for you. If you haven't fully upgraded your website yet then this should be the priority. However, if there are cases where your site visitors will go between HTTP and HTTPS then some of those common scenarios and the associated SameSite cookie behavior are outlined below. Warning: The long-term plan is to phase out support for third-party cookies entirely, replacing them with privacy preserving alternatives. Setting SameSite=None; Secure on a cookie to allow it to be sent across schemes should only be considered a temporary solution in the migration towards full HTTPS. You can enable these changes for testing in both Chrome and Firefox. From Chrome 86, enable chrome://flags/#schemeful-same-site. Track progress on the Chrome Status page. From Firefox 79, set network.cookie.sameSite.schemeful to true via about:config. Track progress via the Bugzilla issue. One of the main reasons for the change to SameSite=Lax as the default for cookies was to protect against Cross-Site Request Forgery (CSRF). However, insecure HTTP traffic still presents an opportunity for network attackers to tamper with cookies that will then be used on the secure HTTPS version of the site. Creating this additional cross-site boundary between schemes provides further defense against these attacks. 
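To make the change in definition concrete, here is a minimal sketch that compares the older scheme-less check with the schemeful one. The function names are hypothetical, and registrableDomain is a deliberately simplified assumption: browsers consult the Public Suffix List, whereas this stand-in just keeps the last two host labels. The sketch also ignores the WebSocket scheme handling covered in the FAQ.

```javascript
// Simplified stand-in for a Public Suffix List lookup (assumption):
// treats the last two host labels as the registrable domain, which is
// wrong for suffixes such as "co.uk".
function registrableDomain(hostname) {
  return hostname.split(".").slice(-2).join(".");
}

// Older definition: only the registrable domain has to match.
function isSchemelessSameSite(a, b) {
  return (
    registrableDomain(new URL(a).hostname) ===
    registrableDomain(new URL(b).hostname)
  );
}

// Schemeful definition: the scheme and the registrable domain must match.
// Note: ws:// and wss:// are not mapped to http:// and https:// here.
function isSchemefulSameSite(a, b) {
  return (
    new URL(a).protocol === new URL(b).protocol && isSchemelessSameSite(a, b)
  );
}

console.log(
  isSchemelessSameSite("http://website.example", "https://website.example")
); // true
console.log(
  isSchemefulSameSite("http://website.example", "https://website.example")
); // false
console.log(
  isSchemefulSameSite(
    "https://www.website.example",
    "https://shop.website.example"
  )
); // true
```

The second call shows the behavior change: the HTTP and HTTPS versions of one site were same-site under the older definition and are cross-site under the schemeful one.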
Common cross-scheme scenarios # Key Term: In the examples below where the URLs all have the same registrable domain, e.g. site.example, but different schemes, for example, http://site.example vs. https://site.example, they are referred to as cross-scheme to each other. Navigation # Navigating between cross-scheme versions of a website (for example, linking from http://site.example to https://site.example) would previously allow SameSite=Strict cookies to be sent. This is now treated as a cross-site navigation which means SameSite=Strict cookies will be blocked. HTTP → HTTPS HTTPS → HTTP SameSite=Strict ⛔ Blocked ⛔ Blocked SameSite=Lax ✓ Allowed ✓ Allowed SameSite=None;Secure ✓ Allowed ⛔ Blocked Loading subresources # Warning: All major browsers block active mixed content such as scripts or iframes. Additionally, browsers including Chrome and Firefox are working toward upgrading or blocking passive mixed content. Any changes you make here should only be considered a temporary fix while you work to upgrade to full HTTPS. Examples of subresources include images, iframes, and network requests made with XHR or Fetch. Loading a cross-scheme subresource on a page would previously allow SameSite=Strict or SameSite=Lax cookies to be sent or set. Now this is treated the same way as any other third-party or cross-site subresource which means that any SameSite=Strict or SameSite=Lax cookies will be blocked. Additionally, even if the browser does allow resources from insecure schemes to be loaded on a secure page, all cookies will be blocked on these requests as third-party or cross-site cookies require Secure. HTTP → HTTPS HTTPS → HTTP SameSite=Strict ⛔ Blocked ⛔ Blocked SameSite=Lax ⛔ Blocked ⛔ Blocked SameSite=None;Secure ✓ Allowed ⛔ Blocked POSTing a form # Posting between cross-scheme versions of a website would previously allow cookies set with SameSite=Lax or SameSite=Strict to be sent. Now this is treated as a cross-site POST—only SameSite=None cookies can be sent. 
You may encounter this scenario on sites that present the insecure version by default, but upgrade users to the secure version on submission of the sign-in or check-out form. As with subresources, if the request is going from a secure, e.g. HTTPS, to an insecure, e.g. HTTP, context then all cookies will be blocked on these requests as third-party or cross-site cookies require Secure. Warning: The best solution here is to ensure both the form page and destination are on a secure connection such as HTTPS. This is especially important if the user is entering any sensitive information into the form. HTTP → HTTPS HTTPS → HTTP SameSite=Strict ⛔ Blocked ⛔ Blocked SameSite=Lax ⛔ Blocked ⛔ Blocked SameSite=None;Secure ✓ Allowed ⛔ Blocked How can I test my site? # Developer tooling and messaging are available in Chrome and Firefox. From Chrome 86, the Issues tab in DevTools will include Schemeful Same-Site issues. You may see the following issues highlighted for your site. Navigation issues: "Migrate entirely to HTTPS to continue having cookies sent on same-site requests": A warning that the cookie will be blocked in a future version of Chrome. "Migrate entirely to HTTPS to have cookies sent on same-site requests": A warning that the cookie has been blocked. Subresource loading issues: "Migrate entirely to HTTPS to continue having cookies sent to same-site subresources" or "Migrate entirely to HTTPS to continue allowing cookies to be set by same-site subresources": Warnings that the cookie will be blocked in a future version of Chrome. "Migrate entirely to HTTPS to have cookies sent to same-site subresources" or "Migrate entirely to HTTPS to allow cookies to be set by same-site subresources": Warnings that the cookie has been blocked. The latter warning can also appear when POSTing a form. More detail is available in Testing and Debugging Tips for Schemeful Same-Site.
From Firefox 79, with network.cookie.sameSite.schemeful set to true via about:config the console will display messages for Schemeful Same-Site issues. You may see the following on your site: "Cookie cookie_name will be soon treated as cross-site cookie against http://site.example/ because the scheme does not match." "Cookie cookie_name has been treated as cross-site against http://site.example/ because the scheme does not match." FAQ # My site is already fully available on HTTPS, why am I seeing issues in my browser's DevTools? # It's possible that some of your links and subresources still point to insecure URLs. One way to fix this issue is to use HTTP Strict-Transport-Security (HSTS) and the includeSubDomains directive. With HSTS + includeSubDomains even if one of your pages accidentally includes an insecure link the browser will automatically use the secure version instead. What if I can't upgrade to HTTPS? # While we strongly recommend that you upgrade your site entirely to HTTPS to protect your users, if you're unable to do so yourself we suggest speaking with your hosting provider to see if they can offer that option. If you self-host, then Let's Encrypt provides a number of tools to install and configure a certificate. You can also investigate moving your site behind a CDN or other proxy that can provide the HTTPS connection. If that's still not possible then try relaxing the SameSite protection on affected cookies. In cases where only SameSite=Strict cookies are being blocked you can lower the protection to Lax. In cases where both Strict and Lax cookies are being blocked and your cookies are being sent to (or set from) a secure URL you can lower the protections to None. This workaround will fail if the URL you're sending cookies to (or setting them from) is insecure. This is because SameSite=None requires the Secure attribute on cookies which means those cookies may not be sent or set over an insecure connection.
In this case you will be unable to access that cookie until your site is upgraded to HTTPS. Remember, this is only temporary as eventually third-party cookies will be phased out entirely. How does this affect my cookies if I haven't specified a SameSite attribute? # Cookies without a SameSite attribute are treated as if they specified SameSite=Lax and the same cross-scheme behavior applies to these cookies as well. Note that the temporary exception to unsafe methods still applies, see the Lax + POST mitigation in the Chromium SameSite FAQ for more information. How are WebSockets affected? # WebSocket connections will still be considered same-site if they're the same secureness as the page. Same-site: wss:// connection from https:// ws:// connection from http:// Cross-site: wss:// connection from http:// ws:// connection from https:// Photo by Julissa Capdevilla on Unsplash
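The WebSocket rule in the FAQ above can be sketched as a small check. This is only an illustration: the helper names are hypothetical, and the function compares the "secureness" of the two schemes only, assuming the registrable domains have already been found to match.

```javascript
// Hypothetical helpers for illustration; not part of any browser API.
// Treats https: and wss: as secure, everything else as insecure.
function isSecureScheme(url) {
  const protocol = new URL(url).protocol;
  return protocol === 'https:' || protocol === 'wss:';
}

// Returns true when the page and the WebSocket have the same secureness.
// Assumes the registrable domain comparison is done elsewhere.
function hasSameSecureness(pageUrl, socketUrl) {
  return isSecureScheme(pageUrl) === isSecureScheme(socketUrl);
}

console.log(hasSameSecureness('https://site.example/', 'wss://site.example/chat')); // true
console.log(hasSameSecureness('http://site.example/', 'ws://site.example/chat'));   // true
console.log(hasSameSecureness('http://site.example/', 'wss://site.example/chat'));  // false
console.log(hasSameSecureness('https://site.example/', 'ws://site.example/chat'));  // false
```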

Browser-level lazy-loading for CMSs

My goal with this post is to persuade CMS platform developers and contributors (i.e. the people who develop CMS cores) that now is the time to implement support for the browser-level image lazy-loading feature. I'll also share recommendations on how to ensure high-quality user experiences and enable customization by other developers while implementing lazy-loading. These guidelines come from our experience adding support to WordPress as well as helping Joomla, Drupal, and TYPO3 implement the feature. Regardless of whether you're a CMS platform developer or a CMS user (i.e. a person who builds websites with a CMS), you can use this post to learn more about the benefits of browser-level lazy-loading in your CMS. Check out the Next steps section for suggestions on how you can encourage your CMS platform to implement lazy-loading. Background # Over the past year, lazy-loading images and iframes using the loading attribute has become part of the WHATWG HTML Standard and seen growing adoption by various browsers. These milestones however only lay the groundwork for a faster and more resource-saving web. It is now on the distributed web ecosystem to make use of the loading attribute. Content management systems power about 60% of websites, so these platforms play a vital role in bringing adoption of modern browser features to the web. With a few popular open-source CMSs such as WordPress, Joomla, and TYPO3 having already implemented support for the loading attribute on images, let's have a look at their approaches and the takeaways which are relevant for adopting the feature in other CMS platforms as well. Lazy-loading media is a key web performance feature that sites should benefit from at a large scale, which is why adopting it at the CMS core level is recommended. The case for implementing lazy-loading now # Standardization # Adoption of non-standardized browser features in CMSs facilitates widespread testing and can surface potential areas of improvement. 
However, the general consensus across CMSs is that, as long as a browser feature is not standardized, it should preferably be implemented in the form of an extension or plugin for the respective platform. Only once standardized can a feature be considered for adoption in the platform core. Success: Browser-level lazy-loading is now part of the WHATWG HTML Standard for both img and iframe elements. Browser support # Browser support of the feature is a similar concern: The majority of CMS users should be able to benefit from the feature. If there is a considerable percentage of browsers where the feature is not yet supported, the feature has to ensure that it at least has no adverse effect for those. Success: Browser-level lazy-loading is widely supported by browsers and the loading attribute is simply ignored by those browsers that have not adopted it yet. Distance-from-viewport thresholds # A common concern with lazy-loading implementations is that they in principle increase the likelihood that an image will not be loaded once it becomes visible in the user's viewport because the loading cycle starts at a later stage. Contrary to previous JavaScript-based solutions, browsers approach this conservatively and furthermore can fine-tune their approach based on real-world user experience data, minimizing the impact, so browser-level lazy-loading should be safe to adopt by CMS platforms. Success: Experiments using Chrome on Android indicated that on 4G networks, 97.5% of below-the-fold lazy-loaded images were fully loaded within 10ms of becoming visible, compared to 97.6% for non lazy-loaded images. In other words, there was virtually no difference (0.1%) in the user experience of eagerly-loaded images and lazy-loaded images. 
User experience recommendations # Require dimension attributes on elements # In order to avoid layout shifts, it has been a long-standing recommendation that embedded content such as images or iframes should always include the dimension attributes width and height, so that the browser can infer the aspect ratio of those elements before actually loading them. This recommendation is relevant regardless of whether an element is being lazy-loaded or not. However, due to the 0.1% greater likelihood of an image not being fully loaded once in the viewport it becomes slightly more applicable with lazy-loading in place. CMSs should preferably provide dimension attributes on all images and iframes. If this is not possible for every such element, they are recommended to skip lazy-loading images which do not provide both of these attributes. Caution: If the CMS is unable to provide width and height attributes on images and iframes on a large scale, you will have to weigh the trade-offs between saving additional network resources and a slightly higher chance for layout shifts to decide whether lazy-loading is worth it. Avoid lazy-loading above-the-fold elements # At the moment CMSs are recommended to only add loading="lazy" attributes to images and iframes which are positioned below-the-fold, to avoid a slight delay in the Largest Contentful Paint metric. However it has to be acknowledged that it's complex to assess the position of an element relative to the viewport before the rendering process. This applies especially if the CMS uses an automated approach for adding loading attributes, but even based on manual intervention several factors such as the different viewport sizes and aspect ratios have to be considered. Fortunately, the impact of marking above-the-fold elements with loading="lazy" is fairly small, with a regression of <1% at the 75th and 99th percentiles compared to eagerly-loaded elements. 
Depending on the capabilities and audience of the CMS, try to define reasonable estimates for whether an image or iframe is likely to be in the initial viewport, for example never lazy-loading elements in a header template. In addition, offer either a UI or API which allows modifying the existence of the loading attribute on elements. Avoid a JavaScript fallback # While JavaScript can be used to provide lazy-loading to browsers which do not (yet) support the loading attribute, such mechanisms always rely on initially removing the src attribute of an image or iframe, which causes a delay for the browsers that do support the attribute. In addition, rolling out such a JavaScript-based solution in the frontends of a large-scale CMS increases the surface area for potential issues, which is part of why no major CMS had adopted lazy-loading in its core prior to the standardized browser feature. Caution: Avoid providing a JavaScript-based fallback in the CMS. With growing adoption of the loading attribute and no adverse effects on browser versions that do not support it yet, it is safer to not provide the feature to those browsers and instead encourage updating to a newer browser version. Technical recommendations # Enable lazy-loading by default # The overall recommendation for CMSs implementing browser-level lazy-loading is to enable it by default, i.e. loading="lazy" should be added to images and iframes, preferably only for those elements that include dimension attributes. Having the feature enabled by default will result in greater network resource savings than if it had to be enabled manually, for example on a per-image basis. If possible, loading="lazy" should only be added to elements which likely appear below-the-fold. 
If this requirement is too complex to implement for a CMS, it is preferable to provide the attribute globally rather than omit it, since on most websites the amount of page content outside of the initial viewport is far greater than the initially visible content. In other words, the resource-saving wins from using the loading attribute are greater than the LCP wins from omitting it. Allow per-element modifications # While loading="lazy" should be added to images and iframes by default, it is crucial to allow omitting the attribute on certain images, for example to optimize for LCP. If the audience of the CMS is on average considered more tech-savvy, this could be a UI control exposed for every image and iframe that allows opting out of lazy-loading for that element. Alternatively or in addition, an API could be exposed to third-party developers so that they can make similar changes through code. WordPress, for example, allows skipping the loading attribute either for an entire HTML tag or context or for a specific HTML element in the content. Caution: If an element should not be lazy-loaded, require or encourage skipping the loading attribute entirely. While using loading="eager" is a supported alternative, this would tell the browser explicitly to always load the image right away, which would prevent potential benefits if browsers implemented further mechanisms and heuristics to automatically decide which elements to lazy-load. Retrofit existing content # At a high level, there are two approaches for adding the loading attribute to HTML elements in a CMS: either add the attribute from within the content editor in the backend, persistently saving it in the database, or add the attribute on the fly when rendering content from the database in the frontend. CMSs are recommended to add the attribute on the fly when rendering, in order to bring the lazy-loading benefits to any existing content as well.
If the attribute could solely be added through the editor, only new or recently modified pieces of content would receive the benefits, drastically reducing the CMS's impact on saving network resources. Furthermore, adding the attribute on the fly will easily allow for future modifications, should the capabilities of browser-level lazy-loading be further expanded. The on-the-fly logic should, however, check for an existing loading attribute on an element and let that attribute take precedence. This way, the CMS or an extension for it could also implement the editor-driven approach without causing a conflict with duplicate attributes. Optimize server-side performance # When adding the loading attribute to content on the fly using (for example) a server-side middleware, speed is a consideration. Depending on the CMS, the attribute could be added either via DOM traversal or regular expressions, with the latter being recommended for performance. Use of regular expressions should be kept to a minimum, for example a single regex which collects all img and iframe tags in the content including their attributes and then adds the loading attribute to each tag string as applicable. WordPress, for example, goes as far as using a single general regular expression to perform various on-the-fly operations on certain elements, of which adding loading="lazy" is just one. This form of optimization is another reason why adopting lazy-loading in a CMS's core is recommended over an extension: it allows for better server-side performance optimization. Next steps # See if there is an existing feature request ticket to add support for the feature in your CMS, or open a new one if there is none yet. Use references to this post as needed to support your proposal.
Tweet me (felixarntz@) for questions or comments, or to get your CMS listed on this page if support for browser-level lazy-loading has been added. If you encounter other challenges, I am also curious to learn more about them to hopefully find a solution. If you're a CMS platform developer, study how other CMSs have implemented lazy-loading: WordPress Core Joomla TYPO3 You can use the learnings from your research and the technical recommendations from this post to start contributing code to your CMS, for example in form of a patch or pull-request. Hero photo by Colin Watts on Unsplash.
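As a rough illustration of the technical recommendations in this post (add the attribute on the fly, let an existing loading attribute take precedence, and skip elements without both dimension attributes), here is a minimal sketch in JavaScript. The function name addLazyLoading is hypothetical and this is not how WordPress or any other CMS implements the feature. It assumes reasonably well-formed markup: a ">" character inside an attribute value, or attribute-like text inside a value, would confuse this simple pattern.

```javascript
// Hypothetical helper for illustration; not taken from any CMS.
// One regex pass collects every <img> and <iframe> tag with its attributes,
// then each tag string is rewritten only where applicable.
function addLazyLoading(html) {
  return html.replace(/<(img|iframe)\b([^>]*)>/gi, (tag, name, attrs) => {
    // An existing loading attribute takes precedence.
    if (/\sloading\s*=/i.test(attrs)) return tag;

    // Skip elements that lack either dimension attribute.
    if (!/\swidth\s*=/i.test(attrs) || !/\sheight\s*=/i.test(attrs)) return tag;

    // Preserve a trailing "/" on self-closing tags such as <img ... />.
    const selfClosing = /\/\s*$/.test(attrs);
    const cleaned = attrs.replace(/\s*\/\s*$/, '');
    return `<${name}${cleaned} loading="lazy"${selfClosing ? ' /' : ''}>`;
  });
}

const content =
  '<img src="a.jpg" width="100" height="50">' +
  '<img src="b.jpg">' +
  '<iframe src="c.html" width="1" height="1" loading="eager"></iframe>';
console.log(addLazyLoading(content));
```

Only the first image in the sample content is modified: the second lacks dimension attributes and the iframe already carries its own loading attribute. A per-element opt-out, as recommended above, could be layered on top by letting callers pass a predicate that decides whether a given tag string should be skipped.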

Better JS scheduling with isInputPending()

Loading fast is hard. Sites that leverage JS to render their content currently have to make a trade-off between load performance and input responsiveness: either perform all the work needed for display all at once (better load performance, worse input responsiveness), or chunk the work into smaller tasks in order to remain responsive to input and paint (worse load performance, better input responsiveness). To eliminate the need to make this trade-off, Facebook proposed and implemented the isInputPending() API in Chromium in order to improve responsiveness without yielding. Based on origin trial feedback, we've made a number of updates to the API, and are happy to announce that the API is now shipping by default in Chromium 87! Browser compatibility # isInputPending() is shipping in Chromium-based browsers starting in version 87. No other browser has signaled an intent to ship the API. Background # For the full background, check out our Facebook Engineering blog post, Faster input events with Facebook's first browser API contribution. Most work in today's JS ecosystem gets done on a single thread: the main thread. This provides a robust execution model to developers, but the user experience (responsiveness in particular) can suffer drastically if script executes for a long time. If the page is doing a lot of work while an input event is fired, for instance, the page won't handle the click input event until after that work completes. The current best practice is to deal with this issue by breaking the JavaScript up into smaller blocks. While the page is loading, the page can run a bit of JavaScript, and then yield and pass control back to the browser. The browser can then check its input event queue and see whether there is anything it needs to tell the page about. Then the browser can go back to running the JavaScript blocks as they get added. This helps, but it can cause other issues. 
Each time the page yields control back to the browser, it takes some time for the browser to check its input event queue, process events, and pick up the next JavaScript block. While the browser responds to events quicker, the overall loading time of the page gets slowed down. And if we yield too often, the page loads too slowly. If we yield less often, it takes longer for the browser to respond to user events, and people get frustrated. Not fun. At Facebook, we wanted to see what things would look like if we came up with a new approach for loading that would eliminate this frustrating trade-off. We reached out to our friends at Chrome about this, and came up with the proposal for isInputPending(). The isInputPending() API is the first to use the concept of interrupts for user inputs on the web, and allows for JavaScript to be able to check for input without yielding to the browser. Since there was interest in the API, we partnered with our colleagues at Chrome to implement and ship the feature in Chromium. With help from the Chrome engineers, we got the patches landed behind an origin trial (which is a way for Chrome to test changes and get feedback from developers before fully releasing an API). We've now taken feedback from the origin trial and from the other members of the W3C Web Performance Working Group and implemented changes to the API. Example: a yieldier scheduler # Suppose that you've got a bunch of display-blocking work to do to load your page, for example generating markup from components, factoring out primes, or just drawing a cool loading spinner. Each one of these is broken into a discrete work item. Using the scheduler pattern, let's sketch out how we might process our work in a hypothetical processWorkQueue() function: const DEADLINE = performance.now() + QUANTUM; while (workQueue.length > 0) { if (performance.now() >= DEADLINE) { // Yield the event loop if we're out of time. 
setTimeout(processWorkQueue); return; } let job = workQueue.shift(); job.execute(); } By invoking processWorkQueue() later in a new macrotask via setTimeout(), we give the browser the ability to remain somewhat responsive to input (it can run event handlers before work resumes) while still managing to run relatively uninterrupted. Though, we might get descheduled for a long time by other work that wants control of the event loop, or get up to an extra QUANTUM milliseconds of event latency. A good value for QUANTUM (under the RAIL model) is <50ms, depending on the type of work being done. This value is primarily what dictates the tradeoff between throughput and latency. This is okay, but can we do better? Absolutely! const DEADLINE = performance.now() + QUANTUM; while (workQueue.length > 0) { if (navigator.scheduling.isInputPending() || performance.now() >= DEADLINE) { // Yield if we have to handle an input event, or we're out of time. setTimeout(processWorkQueue); return; } let job = workQueue.shift(); job.execute(); } By introducing a call to navigator.scheduling.isInputPending(), we're able to respond to input quicker while still ensuring that our display-blocking work executes uninterrupted otherwise. If we're not interested in handling anything other than input (e.g. painting) until work is complete, we can handily increase the length of QUANTUM as well. By default, "continuous" events are not returned from isInputPending(). These include mousemove, pointermove, and others. If you're interested in yielding for these as well, no problem. By providing a dictionary to isInputPending() with includeContinuous set to true, we're good to go: const DEADLINE = performance.now() + QUANTUM; const options = { includeContinuous: true }; while (workQueue.length > 0) { if (navigator.scheduling.isInputPending(options) || performance.now() >= DEADLINE) { // Yield if we have to handle an input event (any of them!), or we're out of time. 
setTimeout(processWorkQueue); return; } let job = workQueue.shift(); job.execute(); } That's it! Frameworks like React are building isInputPending() support into their core scheduling libraries using similar logic. Hopefully, this will lead developers who use these frameworks to be able to benefit from isInputPending() behind the scenes without significant rewrites. Yielding isn't always bad # It's worth noting that yielding less isn't the right solution for every use case. There are many reasons to return control to the browser other than to process input events, such as to perform rendering and execute other scripts on the page. There exist cases where the browser isn't able to properly attribute pending input events. In particular, setting complex clips and masks for cross-origin iframes may report false negatives (i.e. isInputPending() may unexpectedly return false when targeting these frames). Be sure that you're yielding often enough if your site does require interactions with stylized subframes. Be mindful of other pages that share an event loop, as well. On platforms such as Chrome for Android, it's quite common for multiple origins to share an event loop. isInputPending() will never return true if input is dispatched to a cross-origin frame, and thus backgrounded pages may interfere with the responsiveness of foreground pages. You may wish to reduce, postpone, or yield more often when doing work in the background using the Page Visibility API. We encourage you to use isInputPending() with discretion. If there isn't user-blocking work to be done, then be kind to others on the event loop by yielding more frequently. Long tasks can be harmful. Feedback # Leave feedback on the spec in the is-input-pending repository. Contact @acomminos (one of the spec authors) on Twitter. Conclusion # We're excited that isInputPending() is launching, and that developers are able to start using it today. 
This API is the first time that Facebook has built a new web API and taken it from idea incubation to standards proposal to actually shipping in a browser. We'd like to thank everyone who helped us get to this point, and give a special shoutout to everyone at Chrome who helped us flesh out this idea and get it shipped! Hero photo by Will H McMahan on Unsplash.
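Since isInputPending() is only available in Chromium-based browsers, a scheduler like the one in the example above may want to feature-detect it and fall back to deadline-only yielding elsewhere. The following is a minimal sketch under that assumption. QUANTUM_MS, shouldYield(), and the shape of the job objects are illustrative names, not part of the API.

```javascript
// Illustrative time slice; the post suggests a value under 50ms.
const QUANTUM_MS = 50;

// True only where navigator.scheduling.isInputPending() exists.
function isInputPendingSupported() {
  return typeof navigator !== 'undefined' &&
    navigator.scheduling !== undefined &&
    typeof navigator.scheduling.isInputPending === 'function';
}

// Yield when the deadline has passed, or when the browser reports pending input.
// Without isInputPending(), this degrades to the deadline-only scheduler.
function shouldYield(deadline, options) {
  if (performance.now() >= deadline) return true;
  return isInputPendingSupported() &&
    navigator.scheduling.isInputPending(options);
}

// workQueue is an array of objects with an execute() method (an assumption).
function processWorkQueue(workQueue) {
  const deadline = performance.now() + QUANTUM_MS;
  while (workQueue.length > 0) {
    if (shouldYield(deadline, { includeContinuous: false })) {
      // Resume in a new macrotask so the browser can run event handlers first.
      setTimeout(() => processWorkQueue(workQueue));
      return;
    }
    workQueue.shift().execute();
  }
}
```

Keeping the check in one function also gives a single place to shorten the quantum or yield unconditionally when the page is in the background.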

Rakuten 24’s investment in PWA increases user retention by 450%

Rakuten 24 is an online store provided by Rakuten, one of the largest e-commerce companies in Japan. It provides a wide selection of everyday items including grocery, medicine, healthcare, kitchen utensils, and more. The team's main goal over the last year was to improve mobile customer retention and re-engagement. By making their web app installable, they saw a 450% jump in visitor retention rate as compared to the previous mobile web flow over a 1-month timeframe. Highlighting the opportunity # In their efforts to gain market share and improve user experience, Rakuten 24 identified the following areas of opportunities: As a relatively new service, Rakuten 24 was not in a position to invest the time and cost in developing a platform-specific app both for iOS and Android and were seeking an alternative, efficient way to fill this gap. As Rakuten-Ichiba (Rakuten's e-commerce marketplace) is the biggest service in Japan, many people think Rakuten 24 is a seller in Rakuten-Ichiba. As a result, they acknowledged the need to invest in brand awareness and drive more user retention. The tools they used # Installability # To capture the two opportunities identified above, Rakuten 24 decided to build Progressive Web App (PWA) features on an incremental basis, starting with installability. Implementing installability resulted in increased traffic, visitor retention, sales per customer, and conversions. beforeinstallprompt # To gain more flexibility and control over their install dialogue's behaviour, the team implemented their own install prompt using the beforeinstallprompt event. In doing so, they were able to detect if the app was already installed on Android or iOS and provide a more meaningful and relevant experience to their users. Custom installation instructions # For users who weren't able to install the PWA from the banner, they created a custom guide (linked from the banner) with instructions on how to install the PWA manually on both Android and iOS devices. 
Workbox for service workers # The Rakuten 24 team used Workbox (the workbox-webpack-plugin to be precise) to ensure their PWA worked well even when the user was offline or on a bad network. Workbox's APIs for controlling the cache worked significantly better than Rakuten 24's previous in-house script. Moreover, with workbox-webpack-plugin (and Babel), the team was able to automate the process of supporting a wider range of browsers. To further build network resilience, they implemented a cache-first strategy for their CSS and JS assets, and used stale-while-revalidate for images that don't change frequently. Overall business results # Other ways the business improved with installability # Brand Awareness: Since users can directly access Rakuten 24 from their home screen, it helped both users and Rakuten separate Rakuten 24 from Rakuten-Ichiba. Efficiency: Rakuten 24 was able to drive these results without spending significant time and money building platform-specific apps for iOS and Android. Masashi Watanabe, General Manager, Group Marketing Department, Rakuten Inc. Previously the concept of installability was known as add to homescreen (A2HS). Check out the Scale on web case studies page for more success stories from India and Asia.

Using the Event Conversion Measurement API

The Conversion Measurement API will be renamed to Attribution Reporting API and offer more features. If you're experimenting with (Conversion Measurement API) in Chrome 91 and below, read this post to find more details, use cases and instructions for how to use the API. If you're interested in the next iteration of this API (Attribution Reporting), which will be available for experimentation in Chrome (origin trial), join the mailing list for updates on available experiments. The Event Conversion Measurement API measures when an ad click leads to a conversion, without using cross-site identifiers. Here, you'll find must-dos and tips to use this API locally or as an experiment for your end users. Demo # If you're considering using the API, see the demo and the corresponding code for a simple end-to-end implementation example. Browser support # The Event Conversion Measurement API is supported: As an origin trial, from Chrome 86 beta until Chrome 91 (April 2021). Origin trials enable the API for all visitors of a given origin. You need to register your origin for the origin trial in order to try the API with end users. Or by turning on flags, in Chrome 86 and later. Flags enable the API on a single user's browser. Flags are useful when developing locally. See details about the Chrome versions where the API is active on the Chrome feature entry. Experiment with end users # Experiment with the API, with end users # To test the API with end users, you'll need to: Design your experiment. Set it up. Run it. Design your experiment # Defining your goal will help you outline your plan for your experiment. If your goal is to understand the API mechanics, run your experiment as follows: Track conversions. See how you can assign different values to conversion events. Look at the conversion reports you're receiving. If your goal is to see how well the API satisfies basic use cases, run your experiment as follows: Track conversions. 
Look at the aggregate count of conversions you're receiving. Recover the corrected count of conversions. See how in Recover the corrected conversion count. Optionally, if you want to try something more advanced: tweak the noise correction script. For example, try different groupings to see what sizes are necessary for the noise to be negligible. Compare the corrected count of conversions with source-of-truth data (cookie-based conversion data). Set up your experiment # Register for the origin trial # Registering for an origin trial is the first step to activate the API for end users. Upon registering for an origin trial, you have two choices to make: what type of tokens you need, and how the API usage should be controlled. Token type: If you're planning to use the API directly on your own origin(s), register your origin(s) for a regular origin trial. If you're planning on using the API as a third-party—for example if you need to use the API in a script you wrote that is executed on origins you don't own—you may be eligible to register your origin for a third-party origin trial. This is convenient if you need to test at scale across different sites. API usage control: Origin trial features shouldn't exceed a small percentage of global page loads, because they're ephemeral. Because of this, sites that have registered for origin trials typically need to selectively enable API usage for small portions of their users. You can do this yourself, or let Chrome do this for you. In the dropdown How is (third-party) usage controlled?: Select Standard limit to activate the API for all end users on origins where a token is present. Pick this if you don't need to A/B Test (with/without the experiment) or if you want to selectively enable API usage for small portions of your users yourself. Select Exclude a subset of users to let Chrome selectively activate the API on a small subset of users on origins where a token is present. 
This consistently diverts a user into an experiment group across sites to avoid the usage limit. Pick this if you don't want to worry about implementing throttling for your API usage. Gotchas! If you pick Exclude a subset of users, the API won't be enabled for all users, even for origins that are registered for origin trials. This is the intended behaviour. Add your origin trial tokens # Once your origin trial tokens are created, add them where relevant. Adapt your code # If you've picked Exclude a subset of users, use client-side feature detection alongside the origin trial to check whether the API can be used. Run your experiment # You're now ready to run your experiment. (Optional) Recover the corrected conversion count # Even though the conversion data is noised, the reporting endpoint can recover the true count of reports that have a specific conversion value. See how in this noise corrector example script. User privacy isn't impacted by this technique, because you can't determine whether a specific event's conversion data was noised. But this gives you the correct conversion count at an aggregated level. Develop locally # A few tips when developing locally with the conversion measurement API. Set up your browser for local development # Use Chrome version 86 or later. You can check what version of Chrome you're using by typing chrome://version in the URL bar. To activate the feature locally (for example if you're developing on localhost), enable flags. Go to flags by typing chrome://flags in Chrome's URL bar. Turn on the two flags #enable-experimental-web-platform-features and #conversion-measurement-api. Disable third-party cookie blocking. In the long term, dedicated browser settings will be available to allow/block the API. Until then, third-party cookie blocking is used as the signal that users don't want to share data about their conversions—and hence that this API should be disabled. Don't use Incognito or Guest mode. 
The API is disabled on these profiles. Some ad-blocking browser extensions may block some of the API's functionality (e.g. script names containing ad). Deactivate ad-blocking extensions on the pages where you need to test the API, or create a fresh user profile without extensions. Debug # You can see the conversion reports the browser has scheduled to send at chrome://conversion-internals/ > Pending Reports. Reports are sent at scheduled times, but for debugging purposes you may want to get the reports immediately. To receive all of the scheduled reports now, click Send All Reports in chrome://conversion-internals/ > Pending Reports. To always receive reports immediately without having to click this button, enable the flag chrome://flags/#conversion-measurement-debug-mode. Test your origin trial token(s) # If you've chosen Exclude a subset of users in the dropdown How is usage controlled? when you've registered your token(s), the API is only enabled for a subset of Chrome users. You may not be part of this group. To test your origin trial tokens, enforce that your browser behave as if it was in the selected Chrome group by enabling the flag #conversion-measurement-api. Share your feedback # If you're experimenting with the API, your feedback is key in order to improve the API and support more use cases—please share it! Further reading # Origin trials developer guide Getting started with Chrome's origin trials What are third-party origin trials? With many thanks to Jxck and John Delaney for their feedback on this article. Hero image by William Warby / @wawarby on Unsplash, edited.
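To make the idea behind "Recover the corrected conversion count" more concrete, here is a minimal sketch of one way such a correction can work. It is not the linked noise corrector script. It assumes a specific noise model that this post does not spell out: with probability noiseRate, a report's conversion value is replaced by a value drawn uniformly at random from all possible values. The function name and parameters are hypothetical, and the real noise rate and number of possible values should be taken from the API documentation.

```javascript
// Assumed noise model (an assumption, see above): with probability noiseRate
// a report's conversion value is replaced by a uniformly random value.
//
// observedCounts[i] is the number of received reports with conversion value i.
// Under this model: observed_i = (1 - p) * true_i + p * total / numValues,
// so true_i = (observed_i - p * total / numValues) / (1 - p).
function correctConversionCounts(observedCounts, noiseRate) {
  const numValues = observedCounts.length;
  const total = observedCounts.reduce((sum, count) => sum + count, 0);
  const expectedNoisePerValue = (noiseRate * total) / numValues;
  return observedCounts.map(
    (observed) => (observed - expectedNoisePerValue) / (1 - noiseRate)
  );
}

// Two possible values and an exaggerated 50% noise rate, for easy arithmetic.
console.log(correctConversionCounts([75, 25], 0.5)); // [ 100, 0 ]
```

The result is an estimate at the aggregate level only. With few reports the corrected counts can be fractional or slightly negative, which is one reason to experiment with different groupings.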

Back/forward cache

Back/forward cache (or bfcache) is a browser optimization that enables instant back and forward navigation. It significantly improves the browsing experience for users, especially those with slower networks or devices. As web developers, it's critical to understand how to optimize your pages for bfcache across all browsers, so your users can reap the benefits. Browser compatibility # bfcache has been supported in both Firefox and Safari for many years, across desktop and mobile. Starting in version 86, Chrome has enabled bfcache for cross-site navigations on Android for a small percentage of users. In Chrome 87, bfcache support will be rolled out to all Android users for cross-site navigation, with the intent to support same-site navigation as well in the near future. bfcache basics # bfcache is an in-memory cache that stores a complete snapshot of a page (including the JavaScript heap) as the user is navigating away. With the entire page in memory, the browser can quickly and easily restore it if the user decides to return. How many times have you visited a website and clicked a link to go to another page, only to realize it's not what you wanted and click the back button? In that moment, bfcache can make a big difference in how fast the previous page loads: Without bfcache enabled: A new request is initiated to load the previous page, and, depending on how well that page has been optimized for repeat visits, the browser might have to re-download, re-parse, and re-execute some (or all) of the resources it just downloaded. With bfcache enabled: Loading the previous page is essentially instant, because the entire page can be restored from memory, without having to go to the network at all. Check out this video of bfcache in action to understand the speedup it can bring to navigations: In the video above, the example with bfcache is quite a bit faster than the example without it.
bfcache not only speeds up navigation, it also reduces data usage, since resources do not have to be downloaded again. Chrome usage data shows that 1 in 10 navigations on desktop and 1 in 5 on mobile are either back or forward. With bfcache enabled, browsers could eliminate the data transfer and time spent loading for billions of web pages every single day! How the "cache" works # The "cache" used by bfcache is different from the HTTP cache (which is also useful in speeding up repeat navigations). The bfcache is a snapshot of the entire page in memory (including the JavaScript heap), whereas the HTTP cache contains only the responses for previously made requests. Since it's quite rare that all requests required to load a page can be fulfilled from the HTTP cache, repeat visits using bfcache restores are always faster than even the most well-optimized non-bfcache navigations. Creating a snapshot of a page in memory, however, involves some complexity in terms of how best to preserve in-progress code. For example, how do you handle setTimeout() calls where the timeout is reached while the page is in the bfcache? The answer is that browsers pause running any pending timers or unresolved promises—essentially all pending tasks in the JavaScript task queues—and resume processing tasks when (or if) the page is restored from the bfcache. In some cases this is fairly low-risk (for example, timeouts or promises), but in other cases it might lead to very confusing or unexpected behavior. For example, if the browser pauses a task that's required as part of an IndexedDB transaction, it can affect other open tabs in the same origin (since the same IndexedDB databases can be accessed by multiple tabs simultaneously). As a result, browsers will generally not attempt to cache pages in the middle of an IndexedDB transaction or using APIs that might affect other pages. 
For more details on how various API usage affects a page's bfcache eligibility, see Optimize your pages for bfcache below. APIs to observe bfcache # While bfcache is an optimization that browsers do automatically, it's still important for developers to know when it's happening so they can optimize their pages for it and adjust any metrics or performance measurement accordingly. The primary events used to observe bfcache are the page transition events—pageshow and pagehide—which have been around as long as bfcache has and are supported in pretty much all browsers in use today. The newer Page Lifecycle events—freeze and resume—are also dispatched when pages go in or out of the bfcache, as well as in some other situations. For example when a background tab gets frozen to minimize CPU usage. Note, the Page Lifecycle events are currently only supported in Chromium-based browsers. Observe when a page is restored from bfcache # The pageshow event fires right after the load event when the page is initially loading and any time the page is restored from bfcache. The pageshow event has a persisted property which will be true if the page was restored from bfcache (and false if not). You can use the persisted property to distinguish regular page loads from bfcache restores. For example: window.addEventListener('pageshow', function(event) { if (event.persisted) { console.log('This page was restored from the bfcache.'); } else { console.log('This page was loaded normally.'); } }); In browsers that support the Page Lifecycle API, the resume event will also fire when pages are restored from bfcache (immediately before the pageshow event), though it will also fire when a user revisits a frozen background tab. If you want to restore a page's state after it's frozen (which includes pages in the bfcache), you can use the resume event, but if you want to measure your site's bfcache hit rate, you'd need to use the pageshow event. In some cases, you might need to use both. 
See Implications for performance and analytics for more details on bfcache measurement best practices. Observe when a page is entering bfcache # The pagehide event is the counterpart to the pageshow event. The pageshow event fires when a page is either loaded normally or restored from the bfcache. The pagehide event fires when the page is either unloaded normally or when the browser attempts to put it into the bfcache. The pagehide event also has a persisted property, and if it's false then you can be confident a page is not about to enter the bfcache. However, if the persisted property is true, it doesn't guarantee that a page will be cached. It means that the browser intends to cache the page, but there may be factors that make it impossible to cache. window.addEventListener('pagehide', function(event) { if (event.persisted === true) { console.log('This page *might* be entering the bfcache.'); } else { console.log('This page will unload normally and be discarded.'); } }); Similarly, the freeze event will fire immediately after the pagehide event (if the event's persisted property is true), but again that only means the browser intends to cache the page. It may still have to discard it for a number of reasons explained below. Optimize your pages for bfcache # Not all pages get stored in bfcache, and even when a page does get stored there, it won't stay there indefinitely. It's critical that developers understand what makes pages eligible (and ineligible) for bfcache to maximize their cache-hit rates. The following sections outline the best practices to make it as likely as possible that the browser can cache your pages. Never use the unload event # The most important way to optimize for bfcache in all browsers is to never use the unload event. Ever! The unload event is problematic for browsers because it predates bfcache and many pages on the internet operate under the (reasonable) assumption that a page will not continue to exist after the unload event has fired. 
This presents a challenge because many of those pages were also built with the assumption that the unload event would fire any time a user is navigating away, which is no longer true (and hasn't been true for a long time). So browsers are faced with a dilemma: they have to choose between something that can improve the user experience and the risk of breaking the page. Firefox has chosen to make pages ineligible for bfcache if they add an unload listener, which is less risky but also disqualifies a lot of pages. Safari will attempt to cache some pages with an unload event listener, but to reduce potential breakage it will not run the unload event when a user is navigating away. Since 65% of pages in Chrome register an unload event listener, to be able to cache as many pages as possible, Chrome chose to align implementation with Safari. Instead of using the unload event, use the pagehide event. The pagehide event fires in all cases where the unload event currently fires, and it also fires when a page is put in the bfcache. In fact, Lighthouse v6.2.0 has added a no-unload-listeners audit, which will warn developers if any JavaScript on their pages (including that from third-party libraries) adds an unload event listener. Warning: Never add an unload event listener! Use the pagehide event instead. Adding an unload event listener will make your site slower in Firefox, and the code won't even run most of the time in Chrome and Safari. Only add beforeunload listeners conditionally # The beforeunload event will not make your pages ineligible for bfcache in Chrome or Safari, but it will make them ineligible in Firefox, so avoid using it unless absolutely necessary. Unlike the unload event, however, there are legitimate uses for beforeunload, for example when you want to warn the user that they have unsaved changes they'll lose if they leave the page.
In this case, it's recommended that you only add beforeunload listeners when a user has unsaved changes and then remove them immediately after the unsaved changes are saved. Don't window.addEventListener('beforeunload', (event) => { if (pageHasUnsavedChanges()) { event.preventDefault(); return event.returnValue = 'Are you sure you want to exit?'; } }); The code above adds a beforeunload listener unconditionally. Do function beforeUnloadListener(event) { event.preventDefault(); return event.returnValue = 'Are you sure you want to exit?'; }; // A function that invokes a callback when the page has unsaved changes. onPageHasUnsavedChanges(() => { window.addEventListener('beforeunload', beforeUnloadListener); }); // A function that invokes a callback when the page's unsaved changes are resolved. onAllChangesSaved(() => { window.removeEventListener('beforeunload', beforeUnloadListener); }); The code above only adds the beforeunload listener when it's needed (and removes it when it's not). Avoid window.opener references # In some browsers (including Chromium-based browsers) if a page was opened using window.open() or (in Chromium-based browsers prior to version 88) from a link with target=_blank—without specifying rel="noopener"—then the opening page will have a reference to the window object of the opened page. In addition to being a security risk, a page with a non-null window.opener reference cannot safely be put into the bfcache because that could break any pages attempting to access it. As a result, it's best to avoid creating window.opener references by using rel="noopener" whenever possible. If your site requires opening a window and controlling it through window.postMessage() or directly referencing the window object, neither the opened window nor the opener will be eligible for bfcache. 
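When the opened window does not need to be controlled afterwards, one way to avoid the reference from script is to pass the noopener feature to window.open(). The following is a minimal sketch, not part of the original article: openWithoutOpener is a hypothetical helper name, and the win parameter exists only so the function can be exercised with a stub outside a browser.

```javascript
// Hypothetical helper: opens a URL in a new tab without creating a
// window.opener reference in the opened page.
// `win` defaults to the global object (window in a browser).
function openWithoutOpener(url, win = globalThis) {
  // The 'noopener' window feature asks the browser not to expose the
  // opening page through window.opener. With this feature set,
  // window.open() returns null, so the new window cannot be scripted.
  return win.open(url, '_blank', 'noopener');
}
```

In a page this would be called as openWithoutOpener('https://example.com/') (a placeholder URL). For plain links, rel="noopener" on the anchor achieves the same result without script.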
Always close open connections before the user navigates away # As mentioned above, when a page is put into the bfcache all scheduled JavaScript tasks are paused and then resumed when the page is taken out of the cache. If these scheduled JavaScript tasks are only accessing DOM APIs—or other APIs isolated to just the current page—then pausing these tasks while the page is not visible to the user is not going to cause any problems. However, if these tasks are connected to APIs that are also accessible from other pages in the same origin (for example: IndexedDB, Web Locks, WebSockets, etc.) this can be problematic because pausing these tasks may prevent code in other tabs from running. As a result, most browsers will not attempt to put a page in bfcache in the following scenarios: Pages with an unfinished IndexedDB transaction Pages with in-progress fetch() or XMLHttpRequest Pages with an open WebSocket or WebRTC connection If your page is using any of these APIs, it's best to always close connections and remove or disconnect observers during the pagehide or freeze event. That will allow the browser to safely cache the page without the risk of it affecting other open tabs. Then, if the page is restored from the bfcache, you can re-open or re-connect to those APIs (in the pageshow or resume event). Using the APIs listed above does not disqualify a page from being stored in bfcache, as long as they are not actively in use before the user navigates away. However, there are APIs (Embedded Plugins, Workers, Broadcast Channel, and several others) where usage currently does disqualify a page from being cached. While Chrome is intentionally being conservative in its initial release of bfcache, the long-term goal is to make bfcache work with as many APIs as possible. 
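The close-on-pagehide and reconnect-on-pageshow advice above can be sketched roughly as follows. This is an illustration under assumptions rather than a definitive implementation: manageConnectionForBfcache is a hypothetical helper, target is expected to be window in a real page, and connect is an assumed app-specific function that returns an object with a close() method (for example, a WebSocket).

```javascript
// Hypothetical helper: keeps a connection open only while the page is active.
function manageConnectionForBfcache(target, connect) {
  let connection = connect();

  target.addEventListener('pagehide', () => {
    // Close before the page is (possibly) put into the bfcache so that
    // paused tasks cannot affect other tabs.
    if (connection) {
      connection.close();
      connection = null;
    }
  });

  target.addEventListener('pageshow', (event) => {
    // Reconnect only after a bfcache restore. On a normal load the
    // connection created above already exists.
    if (event.persisted && !connection) {
      connection = connect();
    }
  });

  // Returns a getter so callers can always reach the current connection.
  return () => connection;
}
```

In a page, a call might look like manageConnectionForBfcache(window, () => new WebSocket('wss://example.com/socket')), where the URL is a placeholder. The same pattern could use the freeze and resume events in browsers that support the Page Lifecycle API.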
Test to ensure your pages are cacheable # While there's no way to determine whether a page was put into the cache as it's unloading, it is possible to assert that a back or forward navigation did restore a page from the cache. Currently, in Chrome, a page can remain in the bfcache for up to three minutes, which should be enough time to run a test (using a tool like Puppeteer or WebDriver) to ensure that the persisted property of a pageshow event is true after navigating away from a page and then clicking the back button. Note that, while under normal conditions a page should remain in the cache for long enough to run a test, it can be evicted silently at any time (for example, if the system is under memory pressure). A failing test doesn't necessarily mean your pages are not cacheable, so you need to configure your test or build failure criteria accordingly. Gotchas! In Chrome, bfcache is currently only enabled on mobile. To test bfcache on desktop you need to enable the #back-forward-cache flag. Ways to opt out of bfcache # If you do not want a page to be stored in the bfcache you can ensure it's not cached by setting the Cache-Control header on the top-level page response to no-store: Cache-Control: no-store All other caching directives (including no-cache or even no-store on a subframe) will not affect a page's eligibility for bfcache. While this method is effective and works across browsers, it has other caching and performance implications that may be undesirable. To address that, there's a proposal to add a more explicit opt-out mechanism, including a mechanism to clear the bfcache if needed (for example, when a user logs out of a website on a shared device). Also, in Chrome, user-level opt-out is currently possible via the #back-forward-cache flag, as well as an enterprise policy-based opt-out.
Caution: Given the significantly better user experience that bfcache delivers, it is not recommended to opt-out unless absolutely necessary for privacy reasons, for example if a user logs out of a website on a shared device. How bfcache affects analytics and performance measurement # If you track visits to your site with an analytics tool, you will likely notice a decrease in the total number of pageviews reported as Chrome continues to enable bfcache for more users. In fact, you're likely already underreporting pageviews from other browsers that implement bfcache since most of the popular analytics libraries do not track bfcache restores as new pageviews. If you don't want your pageview counts to go down due to Chrome enabling bfcache, you can report bfcache restores as pageviews (recommended) by listening to the pageshow event and checking the persisted property. The following example shows how to do this with Google Analytics; the logic should be similar for other analytics tools: // Send a pageview when the page is first loaded. gtag('event', 'page_view') window.addEventListener('pageshow', function(event) { if (event.persisted === true) { // Send another pageview if the page is restored from bfcache. gtag('event', 'page_view') } }); Performance measurement # bfcache can also negatively affect performance metrics collected in the field, specifically metrics that measure page load times. Since bfcache navigations restore an existing page rather than initiate a new page load, the total number of page loads collected will decrease when bfcache is enabled. What's critical, though, is that the page loads being replaced by bfcache restores would likely have been some of the fastest page loads in your dataset. This is because back and forward navigations, by definition, are repeat visits, and repeat page loads are generally faster than page loads from first time visitors (due to HTTP caching, as mentioned earlier). 
The result is fewer fast page loads in your dataset, which will likely skew the distribution slower, despite the fact that the performance experienced by the user has probably improved! There are a few ways to deal with this issue. One is to annotate all page load metrics with their respective navigation type: navigate, reload, back_forward, or prerender. This will allow you to continue to monitor your performance within these navigation types, even if the overall distribution skews negative. This approach is recommended for non-user-centric page load metrics like Time to First Byte (TTFB). For user-centric metrics like the Core Web Vitals, a better option is to report a value that more accurately represents what the user experiences. Caution: The back_forward navigation type in the Navigation Timing API is not to be confused with bfcache restores. The Navigation Timing API only annotates page loads, whereas bfcache restores are re-using a page loaded from a previous navigation. Impact on Core Web Vitals # Core Web Vitals measure the user's experience of a web page across a variety of dimensions (loading speed, interactivity, visual stability), and since users experience bfcache restores as faster navigations than traditional page loads, it's important that the Core Web Vitals metrics reflect this. After all, a user doesn't care whether or not bfcache was enabled; they just care that the navigation was fast! Tools like the Chrome User Experience Report that collect and report on the Core Web Vitals metrics treat bfcache restores as separate page visits in their dataset. And while there aren't (yet) dedicated web performance APIs for measuring these metrics after bfcache restores, their values can be approximated using existing web APIs. For Largest Contentful Paint (LCP), you can use the delta between the pageshow event's timestamp and the timestamp of the next painted frame (since all elements in the frame will be painted at the same time).
Note that in the case of a bfcache restore, LCP and FCP will be the same. For First Input Delay (FID), you can re-add the event listeners (the same ones used by the FID polyfill) in the pageshow event, and report FID as the delay of the first input after the bfcache restore. For Cumulative Layout Shift (CLS), you can keep using your existing Performance Observer; all you have to do is reset the current CLS value to 0. For more details on how bfcache affects each metric, refer to the individual Core Web Vitals metric guide pages. And for a specific example of how to implement bfcache versions of these metrics in code, refer to the PR adding them to the web-vitals JS library. As of v1, the web-vitals JavaScript library supports bfcache restores in the metrics it reports. Developers using v1 or greater should not need to update their code. Additional Resources # Firefox Caching (bfcache in Firefox) Page Cache (bfcache in Safari) Back/forward cache: web exposed behavior (bfcache differences across browsers) bfcache tester (test how different APIs and events affect bfcache in browsers)

Feedback wanted: CORS for private networks (RFC1918)

CORS-RFC1918 has been renamed to Private Network Access for clarity. An update to this post is published at developer.chrome.com blog. Malicious websites making requests to devices and servers hosted on a private network have long been a threat. Attackers may, for example, change a wireless router's configuration to enable Man-in-the-Middle attacks. CORS-RFC1918 is a proposal to block such requests by default on the browser and require internal devices to opt-in to requests from the public internet. To understand how this change impacts the web ecosystem, the Chrome team is looking for feedback from developers who build servers for private networks. What's wrong with the status quo? # Many web servers run within a private network—wireless routers, printers, intranet websites, enterprise services, and Internet of Things (IoT) devices are only part of them. They might seem to be in a safer environment than the ones exposed to the public but those servers can be abused by attackers using a web page as a proxy. For example, malicious websites can embed a URL that, when simply viewed by the victim (on a JavaScript-enabled browser), attempts to change the DNS server settings on the victim's home broadband router. This type of attack is called "Drive-By Pharming" and it happened in 2014. More than 300,000 vulnerable wireless routers were exploited by having their DNS settings changed and allowing attackers to redirect users to malicious servers. CORS-RFC1918 # To mitigate the threat of similar attacks, the web community is bringing CORS-RFC1918—Cross Origin Resource Sharing (CORS) specialized for private networks defined in RFC1918. Browsers that implement CORS check with target resources whether they are okay being loaded from a different origin. This is accomplished either with extra headers inline describing the access or by using a mechanism called preflight requests, depending on the complexity. Read Cross Origin Resource Sharing to learn more. 
With CORS-RFC1918 the browser will block loading resources over the private network by default except ones that are explicitly allowed by the server using CORS and through HTTPS. The website making requests to those resources will need to send CORS headers and the server will need to explicitly state that it accepts the cross-origin request by responding with corresponding CORS headers. (The exact CORS headers are still under development.) Developers of such devices or servers will be requested to do two things: Make sure the website making requests to a private network is served over HTTPS. Set up the server support for CORS-RFC1918 and respond with expected HTTP headers. What kinds of requests are affected? # Affected requests include: Requests from the public network to a private network Requests from a private network to a local network Requests from the public network to a local network A private network A destination that resolves to the private address space defined in Section 3 of RFC1918 in IPv4, an IPv4-mapped IPv6 address where the mapped IPv4 address is itself private, or an IPv6 address outside the ::1/128, 2000::/3 and ff00::/8 subnets. A local network A destination that resolves to the "loopback" space ( defined in section of RFC1122 of IPv4, the "link-local" space ( defined in RFC3927 of IPv4, the "Unique Local Address" prefix (fc00::/7) defined in Section 3 of RFC4193 of IPv6, or the "link-local" prefix (fe80::/10) defined in section 2.5.6 of RFC4291 of IPv6. A public network All others. Relationship between public, private, local networks in CORS-RFC1918. Chrome's plans to enable CORS-RFC1918 # Chrome is bringing CORS-RFC1918 in two steps: Step 1: Requests to private network resources will be allowed only from HTTPS web pages # Chrome 87 adds a flag that mandates public websites making requests to private network resources to be on HTTPS. You can go to chrome://flags#block-insecure-private-network-requests to enable it. 
With this flag turned on, any requests to a private network resource from an HTTP website will be blocked. Starting from Chrome 88, CORS-RFC1918 errors will be reported as CORS policy errors in the Console. In the Network panel of Chrome DevTools you can enable the Blocked Requests checkbox to focus in on blocked requests. CORS-RFC1918 errors will also be reported as CORS errors in the Network panel. In Chrome 87, CORS-RFC1918 errors are only reported in the DevTools Console as ERR_INSECURE_PRIVATE_NETWORK_REQUEST instead. You can try it out yourself using this test website. Step 2: Sending preflight requests with a special header # In the future, whenever a public website is trying to fetch resources from a private or a local network, Chrome will send a preflight request before the actual request. The request will include an Access-Control-Request-Private-Network: true header in addition to other CORS request headers. Among other things, these headers identify the origin making the request, allowing for fine-grained access control. The server can respond with an Access-Control-Allow-Private-Network: true header to explicitly indicate that it grants access to the resource. These headers are still under development and may change in the future. No action is currently required. Feedback wanted # If you are hosting a website within a private network that expects requests from public networks, the Chrome team is interested in your feedback and use cases. There are two things you can do to help: Go to chrome://flags#block-insecure-private-network-requests, turn on the flag and see if your website sends requests to the private network resource as expected. If you encounter any issues or have feedback, file an issue at crbug.com and set the component to Blink>SecurityFeature>CORS>RFC1918.
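For developers who want to experiment ahead of time, the preflight exchange described in Step 2 could be sketched on the server side roughly as follows. This is only an illustration under assumptions: the header names are the draft ones quoted above and may change, buildPrivateNetworkPreflightResponse and allowedOrigins are hypothetical names, request header keys are assumed to be lowercased (as Node.js does), and regular CORS handling is deliberately left out.

```javascript
// Hypothetical helper: decides how a private-network server might answer
// a preflight (OPTIONS) request. Not part of any library.
function buildPrivateNetworkPreflightResponse(requestHeaders, allowedOrigins) {
  const origin = requestHeaders['origin'];
  const asksForPrivateNetwork =
    requestHeaders['access-control-request-private-network'] === 'true';

  // Simplification: anything that is not a private-network preflight from
  // an explicitly allowed origin is refused.
  if (!asksForPrivateNetwork || !allowedOrigins.includes(origin)) {
    return { status: 403, headers: {} };
  }

  // Explicitly grant access to the requesting origin.
  return {
    status: 204,
    headers: {
      'Access-Control-Allow-Origin': origin,
      'Access-Control-Allow-Private-Network': 'true',
    },
  };
}
```

Keeping the decision in a pure function makes it easy to plug into whichever HTTP server the device already runs, and to update if the final header names differ.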
Example feedback # Our wireless router serves an admin website for the same private network but through HTTP. If HTTPS is required for websites that embed the admin website, it will be mixed content. Should we enable HTTPS on the admin website in a closed network? This is exactly the type of feedback Chrome is looking for. Please file an issue with your concrete use case at crbug.com. Chrome would love to hear from you. Hero image by Stephen Philips on Unsplash.

Play the Chrome dino game with your gamepad

Chrome's offline page easter egg is one of the worst-kept secrets in history ([citation needed], but claim made for the dramatic effect). If you press the space key or, on mobile devices, tap the dinosaur, the offline page becomes a playable arcade game. You might be aware that you do not actually have to go offline when you feel like playing: in Chrome, you can just navigate to chrome://dino, or, for the geek in you, browse to chrome://network-error/-106. But did you know that there are currently 270 million Chrome dino games played every month? Another fact that arguably is more useful to know and that you might not be aware of is that in arcade mode you can play the game with a gamepad. Gamepad support was added roughly one year ago as of the time of this writing in a commit by Reilly Grant. As you can see, the game, just like the rest of the Chromium project, is fully open source. In this post, I want to show you how to use the Gamepad API. Using the Gamepad API # The Gamepad API has been around for a long time. This post disregards all the legacy features and vendor prefixes. Feature detection and browser support # The Gamepad API has universally great browser support across both desktop and mobile. You can detect if the Gamepad API is supported using the snippet below: if ('getGamepads' in navigator) { // The API is supported! } How the browser represents a gamepad # The browser represents gamepads as Gamepad objects. A Gamepad has the following fields: id: An identification string for the gamepad. This string identifies the brand or style of connected gamepad device. index: The index of the gamepad in the navigator. connected: Indicates whether the gamepad is still connected to the system. timestamp: The last time the data for this gamepad was updated. mapping: The button and axes mapping in use for this device. Currently the only mapping is "standard". axes: An array of values for all axes of the gamepad, linearly normalized to the range of -1.0–1.0. 
buttons: An array of button states for all buttons of the gamepad. Note that buttons can be digital (pressed or not pressed) or analog (for example, 78% pressed). This is why buttons are reported as GamepadButton objects, with the following attributes: pressed: The pressed state of the button (true if the button is currently pressed, and false if it is not pressed). touched: The touched state of the button. If the button is capable of detecting touch, this property is true if the button is currently being touched, and false otherwise. value: For buttons that have an analog sensor, this property represents the amount by which the button has been pressed, linearly normalized to the range of 0.0–1.0. One additional thing that you might encounter, depending on your browser and the gamepad you have, is a vibrationActuator property. This field is currently implemented in Chrome and earmarked for merging into the Gamepad Extensions spec. The schematic overview below, taken straight from the spec, shows the mapping and the arrangement of the buttons and axes on a generic gamepad. Being notified when a gamepad gets connected # To learn when a gamepad is connected, listen for the gamepadconnected event that triggers on the window object. When the user connects a gamepad, which can either happen via USB or via Bluetooth, a GamepadEvent is fired that has the gamepad's details in an aptly named gamepad property. Below, you can see an example from an Xbox 360 controller that I had lying around (yes, I am into retro gaming).
window.addEventListener('gamepadconnected', (event) => {
  console.log('✅ 🎮 A gamepad was connected:', event.gamepad);
  /*
    gamepad: Gamepad
    axes: (4) [0, 0, 0, 0]
    buttons: (17) [GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton]
    connected: true
    id: "Xbox 360 Controller (STANDARD GAMEPAD Vendor: 045e Product: 028e)"
    index: 0
    mapping: "standard"
    timestamp: 6563054.284999998
    vibrationActuator: GamepadHapticActuator {type: "dual-rumble"}
  */
});

Being notified when a gamepad gets disconnected #

Being notified of gamepad disconnects happens analogously to the way connections are detected. This time the app listens for the gamepaddisconnected event. Note how in the example below connected is now false when I unplug the Xbox 360 controller.

window.addEventListener('gamepaddisconnected', (event) => {
  console.log('❌ 🎮 A gamepad was disconnected:', event.gamepad);
  /*
    gamepad: Gamepad
    axes: (4) [0, 0, 0, 0]
    buttons: (17) [GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton, GamepadButton]
    connected: false
    id: "Xbox 360 Controller (STANDARD GAMEPAD Vendor: 045e Product: 028e)"
    index: 0
    mapping: "standard"
    timestamp: 6563054.284999998
    vibrationActuator: null
  */
});

The gamepad in your game loop #

Getting a hold of a gamepad starts with a call to navigator.getGamepads(), which returns a GamepadList object with Gamepad items. The GamepadList object in Chrome always has a fixed length of four items. If fewer than four gamepads are connected, some items are null.
Always be sure to check all items of the GamepadList and be aware that gamepads "remember" their slot and may not always be present at the first available slot.

// When no gamepads are connected:
navigator.getGamepads();
// GamepadList {0: null, 1: null, 2: null, 3: null, length: 4}

If one or several gamepads are connected, but navigator.getGamepads() still reports null items, you may need to "wake" each gamepad by pressing any of its buttons. You can then poll the gamepad states in your game loop as shown below.

const pollGamepad = () => {
  // Always call `navigator.getGamepads()` inside of
  // the game loop, not outside.
  const gamepads = navigator.getGamepads();
  for (const gamepad of gamepads) {
    // Disregard empty slots.
    if (!gamepad) {
      continue;
    }
    // Process the gamepad state.
    console.log(gamepad);
  }
  // Call yourself upon the next animation frame.
  // (Typically this happens 60 times per second.)
  window.requestAnimationFrame(pollGamepad);
};
// Kick off the initial game loop iteration.
pollGamepad();

Gotchas! Do not store a lasting reference to the GamepadList result outside of the game loop, since the method returns a static snapshot, not a live object. Call navigator.getGamepads() each time anew in your game loop.

Making use of the vibration actuator #

The vibrationActuator property returns a GamepadHapticActuator object, which corresponds to a configuration of motors or other actuators that can apply a force for the purposes of haptic feedback. Haptic effects can be played by calling Gamepad.vibrationActuator.playEffect(). The only currently valid effect type is 'dual-rumble'. Dual-rumble describes a haptic configuration with an eccentric rotating mass vibration motor in each handle of a standard gamepad. In this configuration, either motor is capable of vibrating the whole gamepad. The two masses are unequal so that the effects of each can be combined to create more complex haptic effects.
Dual-rumble effects are defined by four parameters: duration: Sets the duration of the vibration effect in milliseconds. startDelay: Sets the duration of the delay until the vibration is started, in milliseconds. strongMagnitude and weakMagnitude: Set the vibration intensity levels for the heavier and lighter eccentric rotating mass motors, normalized to the range 0.0–1.0.

// This assumes a `Gamepad` as the value of the `gamepad` variable.
const vibrate = (gamepad, delay = 0, duration = 100, weak = 1.0, strong = 1.0) => {
  // Bail out if the actuator is missing or null (for example, after
  // the gamepad was disconnected).
  if (!gamepad.vibrationActuator) {
    return;
  }
  gamepad.vibrationActuator.playEffect('dual-rumble', {
    // Start delay in ms.
    startDelay: delay,
    // Duration in ms.
    duration: duration,
    // The magnitude of the weak actuator (between 0 and 1).
    weakMagnitude: weak,
    // The magnitude of the strong actuator (between 0 and 1).
    strongMagnitude: strong,
  });
};

Integration with Permissions Policy #

The Gamepad API spec defines a policy-controlled feature identified by the string "gamepad". Its default allowlist is "self". A document's permissions policy determines whether any content in that document is allowed to access navigator.getGamepads(). If disabled in any document, no content in the document will be allowed to use navigator.getGamepads(), nor will the gamepadconnected and gamepaddisconnected events fire.

<iframe src="index.html" allow="gamepad"></iframe>

Demo #

A simple gamepad tester demo is embedded below. The source code is available on Glitch. Try the demo by connecting a gamepad via USB or Bluetooth and pressing any of its buttons or moving any of its axes.

Bonus: play Chrome dino on web.dev #

You can play Chrome dino with your gamepad on this very site. The source code is available on GitHub. Check out the gamepad polling implementation in trex-runner.js and note how it is emulating key presses.
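Emulating key presses from polled gamepad state comes down to detecting when a button changes between two polls. The following is a minimal sketch of that idea, not the code from trex-runner.js: createButtonToKeyEmulator and the onKey callback are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: turns polled button states into key-style events.
// `onKey` is called with ('keydown' | 'keyup', buttonIndex).
function createButtonToKeyEmulator(onKey) {
  // Remembers the pressed state of each button from the previous poll.
  const previous = new Map();

  // Call this once per game loop iteration with a Gamepad-like object.
  return function update(gamepad) {
    gamepad.buttons.forEach((button, index) => {
      const wasPressed = previous.get(index) === true;
      if (button.pressed && !wasPressed) {
        // The button went down since the last poll.
        onKey('keydown', index);
      } else if (!button.pressed && wasPressed) {
        // The button was released since the last poll.
        onKey('keyup', index);
      }
      previous.set(index, button.pressed);
    });
  };
}
```

Inside a polling loop, the returned update function would be called for each non-null gamepad from navigator.getGamepads(); a held button produces a single keydown rather than one per frame.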
For the Chrome dino gamepad demo to work, I ripped the Chrome dino game out of the core Chromium project (updating an earlier effort by Arnelle Ballane), placed it on a standalone site, extended the existing gamepad API implementation by adding ducking and vibration effects, and created a full screen mode. Mehul Satardekar contributed a dark mode implementation. Happy gaming! Useful links # Gamepad API spec Gamepad API extensions spec GitHub repository Acknowledgements # This article was reviewed by François Beaufort and Joe Medley. The Gamepad API spec is currently edited by Steve Agoston, James Hollyer, and Matt Reynolds. The former spec editors are Brandon Jones, Scott Graham, and Ted Mielczarek. The Gamepad Extensions spec is edited by Brandon Jones. Hero image by Laura Torrent Puig.

Measuring offline usage

This article shows you how to track offline usage of your site to help you make a case for why your site needs a better offline mode. It also explains pitfalls and problems to avoid when implementing offline usage analytics. The pitfalls of the online and offline browser events # The obvious solution for tracking offline usage is to create event listeners for the online and offline events (which many browsers support) and to put your analytics tracking logic in those listeners. Unfortunately, there are several problems and limitations with this approach: In general tracking every network connection status event might be excessive, and is counter-productive in a privacy-centric world where as little data as possible should be collected. Additionally the online and offline events can fire for just a split second of network loss, which a user probably wouldn't even see or notice. The analytics tracking of offline activity would never reach the analytics server because the user is… well, offline. Tracking a timestamp locally when a user goes offline and sending the offline activity to the analytics server when the user goes back online depends on the user revisiting your site. If the user drops off your site due to a lack of an offline mode and never revisits, you have no way to track that. The ability to track offline drop-offs is critical data for building a case about why your site needs a better offline mode. The online event is not very reliable as it only knows about network access, not internet access. Therefore a user might still be offline, and sending the tracking ping can still fail. Even if the user still stays on the current page while being offline, none of the other analytics events (e.g. scroll events, clicks, etc.) are tracked either, which might be the more relevant and useful information. Being offline in itself is also not too meaningful in general. As a website developer it may be more important to know what kinds of resources failed to load. 
This is especially relevant in the context of SPAs, where a dropped network connection might not lead to a browser offline error page (which users understand) but more likely to random dynamic parts of the page failing silently. You can still use this solution to gain a basic understanding of offline usage, but the many drawbacks and limitations need to be considered carefully. A better approach: the service worker # The solution that enables offline mode turns out to be the better solution for tracking offline usage. The basic idea is to store analytics pings into IndexedDB as long as the user is offline, and just resend them when the user goes online again. For Google Analytics this is already available off-the-shelf through a Workbox module, but keep in mind that hits sent more than four hours deferred may not be processed. In its simplest form, it can be activated within a Workbox-based service worker with these two lines: import * as googleAnalytics from 'workbox-google-analytics'; googleAnalytics.initialize(); This tracks all existing events and pageview pings while being offline, but you wouldn't know that they happened offline (as they are just replayed as-is). For this you can manipulate tracking requests with Workbox by adding an offline flag to the analytics ping, using a custom dimension (cd1 in the code sample below): import * as googleAnalytics from 'workbox-google-analytics'; googleAnalytics.initialize({ parameterOverrides: { cd1: 'offline', }, }); What if the user drops out of the page due to being offline, before an internet connection comes back? Even though this normally puts the service worker to sleep (i.e. it's unable to send the data when the connection comes back), the Workbox Google Analytics module uses the Background Sync API, which sends the analytics data later when the connection comes back, even if the user closes the tab or browser. 
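The store-and-resend idea described above can be sketched without any library. The following is a minimal illustration only: it keeps hits in memory rather than in IndexedDB, and the class name OfflineHitQueue, the drain() method, and the deferredMs field are hypothetical names chosen for this sketch, not part of Workbox. It buffers hits with a timestamp, drops hits that were deferred for more than four hours (mirroring the processing limit mentioned above), and tags the remaining hits with the cd1 custom dimension.

```javascript
// Hypothetical sketch: an in-memory stand-in for an IndexedDB-backed queue.
// Four hours, matching the deferral limit mentioned in the text.
const MAX_DEFER_MS = 4 * 60 * 60 * 1000;

class OfflineHitQueue {
  constructor() {
    this.hits = [];
  }

  // Buffer a hit together with the time at which it was recorded.
  enqueue(payload, now = Date.now()) {
    this.hits.push({ payload, queuedAt: now });
  }

  // Empty the queue and return the hits that are still worth sending.
  // Hits deferred longer than MAX_DEFER_MS are discarded; the rest are
  // tagged as offline and annotated with how long they were deferred.
  drain(now = Date.now()) {
    const pending = this.hits;
    this.hits = [];
    return pending
      .filter(({ queuedAt }) => now - queuedAt <= MAX_DEFER_MS)
      .map(({ payload, queuedAt }) => ({
        ...payload,
        cd1: 'offline',
        deferredMs: now - queuedAt,
      }));
  }
}
```

A caller would invoke drain() once connectivity returns and send each returned hit to its analytics endpoint. In a real service worker the queue would need to persist across restarts, which is what the IndexedDB storage and Background Sync provide.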
There is still a drawback: while this makes existing tracking offline-capable, you would most likely not see much relevant data coming in until you implement a basic offline mode. Users would still drop off your site quickly when the connection breaks away. But now you can at least measure and quantify this, by comparing average session length and user engagement for users with the offline dimension applied versus your regular users. SPAs and lazy loading # If users visiting a page built as a multi-page website go offline and try to navigate, the browser's default offline page shows up, helping users understand what is happening. However, pages built as single-page applications work differently. The user stays on the same page, and new content is loaded dynamically through AJAX without any browser navigation. Users do not see the browser error page when going offline. Instead, the dynamic parts of the page render with errors, go into undefined states, or just stop being dynamic. Similar effects can happen within multi-page websites due to lazy loading. For example, maybe the initial load happened online, but the user went offline before scrolling. All lazy loaded content below the fold will silently fail and be missing. As these cases are really irritating to users, it makes sense to track them. Service workers are the perfect spot to catch network errors, and eventually track them using analytics. With Workbox, a global catch handler can be configured to inform the page about failed requests by sending a message event: import { setCatchHandler } from 'workbox-routing'; setCatchHandler(({ event }) => { // https://developer.mozilla.org/en-US/docs/Web/API/Client/postMessage event.waitUntil(async function () { // Exit early if we don't have access to the client. // Eg, if it's cross-origin. if (!event.clientId) return; // Get the client. const client = await clients.get(event.clientId); // Exit early if we don't get the client. // Eg, if it closed. 
if (!client) return; // Send a message to the client. client.postMessage({ action: "network_fail", url: event.request.url, destination: event.request.destination }); return Response.error(); }()); }); Rather than listening to all failed requests, another way is to catch errors on specific routes only. As an example, if we want to report errors happening on routes to /products/* only, we can register a dedicated route that matches those URLs with a regular expression and report failures from its handler. import { registerRoute } from 'workbox-routing'; import { NetworkOnly } from 'workbox-strategies'; const networkOnly = new NetworkOnly(); registerRoute( new RegExp('https:\/\/example\.com\/products\/.+'), async (params) => { try { // Attempt a network request. return await networkOnly.handle(params); } catch (error) { // If it fails, report the error. const event = params.event; if (!event.clientId) return; const client = await clients.get(event.clientId); if (!client) return; client.postMessage({ action: "network_fail", url: event.request.url, destination: "products" }); return Response.error(); } } ); As a final step, the page needs to listen to the message event, and send out the analytics ping. Again, make sure to buffer analytics requests that happen offline within the service worker. As described before, initialize the workbox-google-analytics plugin for built-in Google Analytics support. The following example uses Google Analytics, but can be applied in the same way for other analytics vendors. if ("serviceWorker" in navigator) { // ... SW registration here // track offline error events navigator.serviceWorker.addEventListener("message", event => { // Use `typeof` so that a missing `gtag` does not throw a ReferenceError. 
if (typeof gtag === "function" && event.data && event.data.action === "network_fail") { gtag("event", "network_fail", { event_category: event.data.destination, // event_label: event.data.url, // value: event.data.value }); } }); } This will track failed resource loads in Google Analytics, where they can be analyzed with reporting. 
The derived insight can be used to improve service worker caching and error handling in general, to make the page more robust and reliable under unstable network conditions. Next steps # This article showed different ways of tracking offline usage with their advantages and shortcomings. While this can help to quantify how many of your users go offline and run into problems as a result, it's still just a start. As long as your website does not offer a well-built offline mode, you obviously won't see much offline usage in analytics. We recommend getting the full tracking in place, and then extending your offline capabilities in iterations with an eye on tracking numbers. Start with a simple offline error page first. With Workbox this is trivial to do, and it should be considered a UX best practice similar to custom 404 pages anyway. Then work your way towards more advanced offline fallbacks and finally towards real offline content. Make sure you advertise and explain this to your users well, and you will see increasing usage. After all, everyone goes offline every once in a while. Check out How to report metrics and build a performance culture and Fixing website speed cross-functionally for tips on persuading cross-functional stakeholders to invest more in your website. Although those posts are focused on performance, they should help you get general ideas about how to engage stakeholders. Hero photo by JC Gellidon on Unsplash.

NDTV achieved a 55% improvement in LCP by optimizing for Core Web Vitals

NDTV is one of India's leading news stations and websites. By following the Web Vitals program, they improved one of their most important user metrics, Largest Contentful Paint (LCP), by 55% in just a month. This was correlated with a 50% reduction in bounce rates. NDTV made other product changes while they optimized for Web Vitals so it is not possible to conclusively say that optimizing for Web Vitals was the only cause of the bounce rate reduction. 55% Improvement in LCP 50% Reduction in bounce rates Highlighting the opportunity # With close to 200M unique users every month, it was critical for NDTV to optimize for quality of user experience. Although their engagement rates were well over industry average and the highest amongst their peers, the NDTV team still saw room for improvement and decided to invest in Web Vitals along with other product changes to further improve their engagement rates. The approach they used # With the help of tools like PageSpeed Insights, web.dev/measure, and WebPageTest the NDTV team analyzed potential improvement areas on the site. These clearly defined optimization ideas helped them re-prioritize high-impact tasks and achieve immediate results in the improvement of Core Web Vitals. Optimizations included: Prioritizing the largest content block by delaying third-party requests, including ad calls for below-the-fold ad slots, and social network embeds, which are also below-the-fold. Increasing the caching of static content from a few minutes to 30 days. Using font-display to display text sooner while fonts are downloaded. Using vector graphics for icons instead of TrueType Fonts (TTF). Lazy loading JavaScript and CSS: loading the page with the minimum possible JS and CSS and then lazy loading the remaining JS and CSS on page scroll. Preconnecting to origins delivering critical assets. Impact # Web Vitals equipped the team with metric-driven signals to expedite the process of improving user experience. 
Chrome User Experience Report field data). After the optimization project, it was down to 1.6 seconds. They also reduced their Cumulative Layout Shift (CLS) score to 0.05. Other metrics on WebPageTest like "First Byte Time" and "Effective use of CDN" improved to an A grade. When optimizing your site, remember that it's important to not think of your metric scores as single values, but rather a distribution of field data values from real users. You'll want to make sure that the distribution overall is improving. See Web Performance: Leveraging The Metrics That Most Affect UX for more information. Return on investment # Despite the complexity and depth of ndtv.com, the site was already achieving decent FID and CLS scores, thanks to the team's longstanding focus on performance and UX best practices. To further improve their user experience, the team focused on LCP and managed to meet the threshold within a few weeks of kicking off their optimization work. Overall business results # 55% improvement in LCP as a result of optimizing for Core Web Vitals. Kawaljit Singh Bedi, Chief Technology and Product Officer, NDTV Group Check out the Scale on web case studies page for more success stories from India and Southeast Asia.

Let web applications be file handlers

The File Handling API is part of the capabilities project and is currently in development. This post will be updated as the implementation progresses. Now that web apps are capable of reading and writing files, the next logical step is to let developers declare these very web apps as file handlers for the files their apps can create and process. The File Handling API allows you to do exactly this. After registering a text editor app as a file handler, you can right-click a .txt file on macOS and select "Get Info" to then instruct the OS that it should always open .txt files with this app as default. Suggested use cases for the File Handling API # Examples of sites that may use this API include: Office applications like text editors, spreadsheet apps, and slideshow creators. Graphics editors and drawing tools. Video game level editor tools. Current status # Step Status 1. Create explainer Complete 2. Create initial draft of specification Not started 3. Gather feedback & iterate on design In progress 4. Origin trial Not started 5. Launch Not started How to use the File Handling API # Enabling via chrome://flags # To experiment with the File Handling API locally, without an origin trial token, enable the #file-handling-api flag in chrome://flags. Progressive enhancement # The File Handling API per se cannot be polyfilled. The functionality of opening files with a web app, however, can be achieved through two other means: The Web Share Target API lets developers specify their app as a share target so files can be opened from the operating system's share sheet. The File System Access API can be integrated with file drag and drop, so developers can handle dropped files in the already opened app. Feature detection # To check if the File Handling API is supported, use: if ('launchQueue' in window) { // The File Handling API is supported. 
} The declarative part of the File Handling API # As a first step, web apps need to declaratively describe in their Web App Manifest what kind of files they can handle. The File Handling API extends Web App Manifest with a new property called "file_handlers" that accepts an array of, well, file handlers. A file handler is an object with two properties: An "action" property that points to a URL within the scope of the app as its value. An "accept" property with an object of MIME-types as keys and lists of file extensions as their values. The example below, showing only the relevant excerpt of the Web App Manifest, should make it clearer: { … "file_handlers": [ { "action": "/open-csv", "accept": { "text/csv": [".csv"] } }, { "action": "/open-svg", "accept": { "image/svg+xml": ".svg" } }, { "action": "/open-graf", "accept": { "application/vnd.grafr.graph": [".grafr", ".graf"], "application/vnd.alternative-graph-app.graph": ".graph" } } ], … } This is for a hypothetical application that handles comma-separated value (.csv) files at /open-csv, scalable vector graphics (.svg) files at /open-svg, and a made-up Grafr file format with any of .grafr, .graf, or .graph as the extension at /open-graf. For this declaration to have any effect, the application must be installed. You can learn more in an article series on this very site on making your app installable. The imperative part of the File Handling API # Now that the app has declared what files it can handle at which in-scope URL in theory, it needs to imperatively do something with incoming files in practice. This is where the launchQueue comes into play. To access launched files, a site needs to specify a consumer for the window.launchQueue object. Launches are queued until they are handled by the specified consumer, which is invoked exactly once for each launch. In this manner, every launch is handled, regardless of when the consumer was specified. 
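The queueing behavior described above can be modeled in a few lines. This is a hypothetical stand-in written for illustration, not Chrome's implementation: the class name LaunchQueueModel and its enqueue() method are assumptions, and only setConsumer() mirrors the shape of the real window.launchQueue method. It shows why a launch that arrives before the consumer is registered is not lost.

```javascript
// Hypothetical model of the queueing semantics; not Chrome's implementation.
class LaunchQueueModel {
  constructor() {
    this.pending = [];
    this.consumer = null;
  }

  // Called by the (imaginary) platform when the app is launched with files.
  enqueue(launchParams) {
    if (this.consumer) {
      // A consumer is already registered: deliver immediately.
      this.consumer(launchParams);
    } else {
      // No consumer yet: hold on to the launch until one is set.
      this.pending.push(launchParams);
    }
  }

  // Mirrors the shape of `launchQueue.setConsumer()`.
  setConsumer(consumer) {
    this.consumer = consumer;
    // Flush launches that arrived before the consumer was set,
    // invoking the consumer once per queued launch, in arrival order.
    const queued = this.pending;
    this.pending = [];
    for (const launchParams of queued) {
      consumer(launchParams);
    }
  }
}
```

In this model, a launch enqueued before setConsumer() is delivered as soon as the consumer is set, and later launches are delivered right away.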
if ('launchQueue' in window) { launchQueue.setConsumer((launchParams) => { // Nothing to do when the queue is empty. if (!launchParams.files.length) { return; } for (const fileHandle of launchParams.files) { // Handle the file. } }); } DevTools support # There is no DevTools support at the time of this writing, but I have filed a feature request for support to be added. Demo # I have added file handling support to Excalidraw, a cartoon-style drawing app. When you create a file with it and store it somewhere on your file system, you can open the file via a double click, or a right click and then select "Excalidraw" in the context menu. You can check out the implementation in the source code. Security and permissions # The Chrome team has designed and implemented the File Handling API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics. File-related challenges # There is a large category of attack vectors that are opened by allowing websites access to files. These are outlined in the article on the File System Access API. The additional security-pertinent capability that the File Handling API provides over the File System Access API is the ability to grant access to certain files through the operating system's built-in UI, as opposed to through a file picker shown by a web application. Any restrictions as to the files and folders that can be opened via the picker will also be applied to the files and folders opened via the operating system. There is still a risk that users may unintentionally grant a web application access to a file by opening it. However, it is generally understood that opening a file allows the application it is opened with to read and/or manipulate that file. 
Therefore, a user's explicit choice to open a file in an installed application, such as via an "Open with…" context menu, can be read as a sufficient signal of trust in the application. Default handler challenges # The exception to this is when there are no applications on the host system for a given file type. In this case, some host operating systems may automatically promote the newly registered handler to the default handler for that file type, silently and without any intervention by the user. This would mean if the user double clicks a file of that type, it would automatically open in the registered web app. On such host operating systems, when the user agent determines that there is no existing default handler for the file type, an explicit permission prompt might be necessary to avoid accidentally sending the contents of a file to a web application without the user's consent. User control # The spec states that browsers should not register every site that can handle files as a file handler. Instead, file handling registration should be gated behind installation and never happen without explicit user confirmation, especially if a site is to become the default handler. Rather than hijacking existing extensions like .json that the user probably already has a default handler registered for, sites should consider crafting their own extensions. Transparency # All operating systems allow users to change the present file associations. This is outside the scope of the browser. Feedback # The Chrome team wants to hear about your experiences with the File Handling API. Tell us about the API design # Is there something about the API that doesn't work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue. 
Report a problem with the implementation # Did you find a bug with Chrome's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter UI>Browser>WebAppInstalls>FileHandling in the Components box. Glitch works great for sharing quick and easy repros. Show support for the API # Are you planning to use the File Handling API? Your public support helps the Chrome team to prioritize features and shows other browser vendors how critical it is to support them. Share how you plan to use it on the WICG Discourse thread. Send a tweet to @ChromiumDev using the hashtag #FileHandling and let us know where and how you are using it. Helpful links # Public explainer File Handling API demo | File Handling API demo source Chromium tracking bug ChromeStatus.com entry Blink Component: UI>Browser>WebAppInstalls>FileHandling Wanna go deeper # TAG Review Mozilla Standards Position Acknowledgements # The File Handling API was specified by Eric Willigers, Jay Harris, and Raymes Khoury. This article was reviewed by Joe Medley.

Signed Exchanges (SXGs)

A signed exchange (SXG) is a delivery mechanism that makes it possible to authenticate the origin of a resource independently of how it was delivered. This decoupling advances a variety of use cases such as privacy-preserving prefetching, offline internet experiences, and serving from third-party caches. Additionally, implementing SXGs can improve Largest Contentful Paint (LCP) for some sites. This article provides a comprehensive overview of SXGs: how they work, use cases, and tooling. Browser compatibility # SXGs are supported by Chromium-based browsers (starting with versions: Chrome 73, Edge 79, and Opera 64). Overview # Signed Exchanges (SXGs) allow a site to cryptographically sign a request/response pair (an "HTTP exchange") in a way that makes it possible for the browser to verify the origin and integrity of the content independently of how the content was distributed. As a result, the browser can display the URL of the origin site in the address bar, rather than the URL of the server that delivered the content. The broader implication of SXGs is that they make content portable: content delivered via a SXG can be easily distributed by third parties while maintaining full assurance and attribution of its origin. Historically, the only way for a site to use a third-party to distribute its content while maintaining attribution has been for the site to share its SSL certificates with the distributor. This has security drawbacks; moreover, it is a far stretch from making content truly portable. In the long-term, truly portable content can be utilized to achieve use cases like fully offline experiences. In the immediate-term, the primary use case of SXGs is the delivery of faster user experiences by providing content in an easily cacheable format. Specifically, Google Search will cache and sometimes prefetch SXGs. For sites that receive a large portion of their traffic from Google Search, SXGs can be an important tool for delivering faster page loads to users. 
The SXG format # An SXG is encapsulated in a binary-encoded file that has two primary components: an HTTP exchange and a signature. The HTTP exchange consists of a request URL, content negotiation information, and an HTTP response. Here's an example of a decoded SXG file: format version: 1b3 request: method: GET uri: https://example.org/ headers: response: status: 200 headers: Cache-Control: max-age=604800 Digest: mi-sha256-03=kcwVP6aOwYmA/j9JbUU0GbuiZdnjaBVB/1ag6miNUMY= Expires: Mon, 24 Aug 2020 16:08:24 GMT Content-Type: text/html; charset=UTF-8 Content-Encoding: mi-sha256-03 Date: Mon, 17 Aug 2020 16:08:24 GMT Vary: Accept-Encoding signature: label;cert-sha256=*ViFgi0WfQ+NotPJf8PBo2T5dEuZ13NdZefPybXq/HhE=*; cert-url="https://test.web.app/ViFgi0WfQ-NotPJf8PBo2T5dEuZ13NdZefPybXq_HhE"; date=1597680503;expires=1598285303;integrity="digest/mi-sha256-03";sig=*MEUCIQD5VqojZ1ujXXQaBt1CPKgJxuJTvFlIGLgkyNkC6d7LdAIgQUQ8lC4eaoxBjcVNKLrbS9kRMoCHKG67MweqNXy6wJg=*; validity-url="https://example.org/webpkg/validity" header integrity: sha256-Gl9bFHnNvHppKsv+bFEZwlYbbJ4vyf4MnaMMvTitTGQ= The exchange has a valid signature. payload [1256 bytes]: <!doctype html> <html> <head> <title>SXG example</title> <meta charset="utf-8" /> <meta http-equiv="Content-type" content="text/html; charset=utf-8" /> <style type="text/css"> body { background-color: #f0f0f2; margin: 0; padding: 0; } </style> </head> <body> <div> <h1>Hello</h1> </div> </body> </html> The expires parameter in the signature indicates a SXG's expiration date. A SXG may be valid for at most 7 days. If the expiration date of an SXG is more than 7 days in the future, the browser will reject it. Find more information on the signature header in the Signed HTTP Exchanges spec. Web Packaging # SXGs are a part of the broader Web Packaging spec proposal family. In addition to SXGs, the other major component of the Web Packaging spec is Web Bundles ("bundled HTTP exchanges"). 
Web Bundles are a collection of HTTP resources and the metadata necessary to interpret the bundle. The relationship between SXGs and Web Bundles is a common point of confusion. SXGs and Web Bundles are two distinct technologies that don't depend on each other—Web Bundles can be used with both signed and unsigned exchanges. The common goal advanced by both SXGs and Web Bundles is the creation of a "web packaging" format that allows sites to be shared in their entirety for offline consumption. SXGs are the first part of the Web Packaging spec that Chromium-based browsers will implement. Loading SXGs # Initially, the primary use case of SXGs will likely be as a delivery mechanism for a page's main document. For this use case, a SXG could be referenced using the <link> or <a> tags, as well as the Link header. Like other resources, a SXG can be loaded by entering its URL in the browser's address bar. <a href="https://example.com/article.html.sxg"> <link rel="prefetch" as="document" href="https://example.com/article.html.sxg"> SXGs can also be used to deliver subresources. For more information, refer to Signed Exchange subresource substitution. Serving SXGs # Content negotiation # Content negotiation is a mechanism for serving different representations of the same resource at the same URL depending on the capabilities and preferences of a client—for example, serving the gzip version of a resource to some clients but the Brotli version to others. Content negotiation makes it possible to serve both SXG and non-SXG representations of the same content depending on a browser's capabilities. Web browsers use the Accept request header to communicate the MIME types they support. If a browser supports SXGs, the MIME type application/signed-exchange will automatically be included in this list of values. 
For example, this is the Accept header sent by Chrome 84: accept: text/html, application/xhtml+xml, application/xml;q=0.9, image/webp,image/apng, */*;q=0.8, application/signed-exchange;v=b3;q=0.9 The application/signed-exchange;v=b3;q=0.9 portion of this string informs the web server that Chrome supports SXGs, specifically version b3. The last part q=0.9 indicates the q-value. The q-value expresses a browser's relative preference for a particular format using a decimal scale from 0 to 1, with 1 representing the highest priority. When a q-value is not supplied for a format, 1 is the implied value. Best practices # Servers should serve SXGs when the Accept header indicates that the q-value for application/signed-exchange is greater than or equal to the q-value for text/html. In practice, this means that an origin server will serve SXGs to crawlers, but not browsers. SXGs can deliver superior performance when used with caching or prefetching. However, for content that is loaded directly from the origin server without the benefit of these optimizations, text/html delivers better performance than SXGs. Serving content as SXG allows crawlers and other intermediaries to cache SXGs for faster delivery to users. The following regular expression can be used to match the Accept header of requests that should be served as SXG: Accept: /(^|,)\s*application\/signed-exchange\s*;\s*v=[[:alnum:]_-]+\s*(,|$)/ Note that the subexpression (,|$) matches headers where the q-value for SXG has been omitted; this omission implies a q-value of 1 for SXG. Although an Accept header could theoretically contain the substring q=1, in practice browsers don't explicitly list a format's q-value when it has the default value of 1. Debugging SXGs with Chrome DevTools # Signed Exchanges can be identified by looking for signed-exchange in the Type column of the Network panel in Chrome DevTools. The Preview tab provides more information about the contents of a SXG. 
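The q-value rule from the best practices above (serve an SXG when the q-value for application/signed-exchange is greater than or equal to the q-value for text/html) can also be expressed by parsing the Accept header directly. The sketch below is one possible reading of that rule, not the approach the article recommends: the function names qValue and shouldServeSxg are hypothetical, and the parsing is deliberately simplified (it ignores wildcards such as */* and treats a missing type as q=0).

```javascript
// Return the q-value of `mimeType` in an Accept header.
// A listed type without an explicit q parameter defaults to 1;
// a type that is not listed at all is treated as 0 (a simplification).
function qValue(accept, mimeType) {
  for (const part of accept.split(',')) {
    const [type, ...params] = part.split(';').map((s) => s.trim());
    if (type.toLowerCase() !== mimeType) continue;
    const q = params.find((p) => /^q=/i.test(p));
    return q ? parseFloat(q.slice(2)) : 1;
  }
  return 0;
}

// Serve an SXG only if the client lists it and prefers it at least
// as much as text/html.
function shouldServeSxg(accept) {
  const sxg = qValue(accept, 'application/signed-exchange');
  return sxg > 0 && sxg >= qValue(accept, 'text/html');
}
```

With the Chrome 84 header shown earlier, text/html has an implied q-value of 1 and SXG has 0.9, so this sketch would serve text/html. A header that lists both types without q-values would be served the SXG.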
To see a SXG firsthand, visit this demo in one of the browsers that support SXG. Use cases # SXGs can be used to deliver content directly from an origin server to a user, but this would largely defeat the purpose of SXGs. Rather, the intended use and benefits of SXGs are primarily achieved when the SXGs generated by an origin server are cached and served to users by an intermediary. Although this section primarily discusses the caching and serving of SXGs by Google Search, it is a technique that is applicable to any site that wishes to provide its outlinks with a faster user experience or greater resiliency to limited network access. This not only includes search engines and social media platforms, but also information portals that serve content for offline consumption. Google Search # Google Search uses SXGs to provide users with a faster page load experience for pages loaded from the search results page. Sites that receive significant traffic from Google Search can potentially see significant performance improvements by serving content as SXG. Google Search will now crawl, cache, and prefetch SXGs when applicable. Google and other search engines sometimes prefetch content that the user is likely to visit, for example, the page corresponding to the first search result. SXGs are particularly well suited to prefetching because of their privacy benefits over non-SXG formats. There is a certain amount of user information inherent to all network requests regardless of how or why they were made: this includes information like IP address, the presence or absence of cookies, and the value of headers like Accept-Language. This information is "disclosed" to the destination server when a request is made. Because SXGs are prefetched from a cache, rather than the origin server, a user's interest in a site will only be disclosed to the origin server once the user navigates to the site, rather than at the time of prefetching. 
In addition, content prefetched via SXG does not set cookies or access localStorage unless the content is loaded by the user. Furthermore, this reveals no new user information to the SXG referrer. The use of SXGs for prefetching is an example of the concept of privacy-preserving prefetching. Crawling # The Accept header sent by the Google Search crawler expresses an equal preference for text/html and application/signed-exchange. As described in the previous section, sites that wish to use SXGs should serve them when the Accept header of a request expresses an equal or greater preference for SXGs over text/html. In practice, only crawlers will express a preference for SXGs over text/html. Indexing # The SXG and non-SXG representations of a page are not ranked or indexed separately by Google Search. SXG is ultimately a delivery mechanism—it does not change the underlying content. Given this, it would not make sense for Google Search to separately index or rank the same content delivered in different ways. Web Vitals # For sites that receive a significant portion of their traffic from Google Search, SXGs can be used to improve Web Vitals—namely LCP. Cached and prefetched SXGs can be delivered to users incredibly quickly and this yields a faster LCP. Although SXGs can be a powerful tool, they work best when combined with other performance optimizations such as use of CDNs and reduction of render-blocking subresources. AMP # AMP content can be delivered using SXG. SXG allows AMP content to be prefetched and displayed using its canonical URL, rather than its AMP URL. All of the concepts described in this document still apply to the AMP use case, however, AMP has its own separate tooling for generating SXGs. Learn how to serve AMP using signed exchanges on amp.dev. Tooling # This section discusses the tooling options and technical requirements of SXGs. 
At a high level, implementing SXGs consists of generating the SXG corresponding to a given URL and then serving that SXG to users. To generate a SXG you will need a certificate that can sign SXGs. Certificates # Certificates associate an entity with a public key. Signing a SXG with a certificate allows the content to be associated with the entity. Production use of SXGs requires a certificate that supports the CanSignHttpExchanges extension. Per spec, certificates with this extension must have a validity period no longer than 90 days and require that the requesting domain have a DNS CAA record configured. This page lists the certificate authorities that can issue this type of certificate. Certificates for SXGs are only available through a commercial certificate authority. Web Packager # Web Packager is an open-source, Go-based tool that is the de facto tooling for generating ("packaging") signed exchanges. You can use it to manually create SXGs, or as a server that automatically creates and serves SXGs. Web Packager is currently in alpha. Web Packager CLI # The Web Packager CLI generates a SXG corresponding to a given URL. webpackager \ --private_key=private.key \ --cert_url=https://example.com/certificate.cbor \ --url=https://example.com Once the SXG file has been generated, upload it to your server and serve it with the application/signed-exchange;v=b3 MIME type. In addition, you will need to serve the SXG certificate as application/cert-chain+cbor. Web Packager Server # The Web Packager server, webpkgserver, acts as a reverse proxy for serving SXGs. Given a URL, webpkgserver will fetch the URL's contents, package them as an SXG, and serve the SXG in response. For instructions on setting up the Web Packager server, see How to set up signed exchanges using Web Packager. In production, webpkgserver should not use a public endpoint. Instead, the frontend web server should forward SXG requests to webpkgserver.
These recommendations contain more information on running webpkgserver behind a frontend edge server. Other tooling # This section lists tooling alternatives to Web Packager. In addition to these options, you can also choose to build your own SXG generator. NGINX SXG Module The NGINX SXG module generates and serves SXGs. Sites that already use NGINX should consider using this module over Web Packager Server. The NGINX SXG module only works with CanSignHttpExchanges certificates. Setup instructions can be found here. libsxg libsxg is a minimal, C-based library for generating SXGs. libsxg can be used to build an SXG generator that integrates into other pluggable servers. The NGINX SXG module is built on top of libsxg. gen-signedexchange gen-signedexchange is a tool provided by the webpackage specification as a reference implementation of generating SXGs. Due to its limited feature set, gen-signedexchange is useful for trying out SXGs, but impractical for larger-scale and production use. Conclusion # Signed Exchanges are a delivery mechanism that make it possible to verify the origin and validity of a resource independently of how the resource was delivered. As a result, SXGs can be distributed by third-parties while maintaining full publisher attribution. Further reading # Draft spec for Signed HTTP Exchanges Web Packaging explainers Get started with signed exchanges on Google Search How to set up Signed Exchanges using Web Packager Demo of Signed Exchanges
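The serving rule from the Crawling section (serve an SXG only when the request's Accept header expresses an equal or greater preference for SXGs than for text/html) can be sketched in JavaScript. This is a simplified illustration, not code from Web Packager or any server module: the function names parseAccept and shouldServeSxg are hypothetical, the parser only looks at q-values and ignores other media-type parameters, and the header strings in the usage lines are illustrative examples rather than the exact headers any particular crawler or browser sends.

```javascript
// Parse an Accept header into a Map of media type -> q-value (default 1).
// Simplified: parameters other than q (such as v=b3) are ignored.
function parseAccept(header) {
  const prefs = new Map();
  for (const part of header.split(',')) {
    const [type, ...params] = part.split(';').map((s) => s.trim());
    if (!type) continue;
    let q = 1;
    for (const param of params) {
      const [key, value] = param.split('=').map((s) => s.trim());
      if (key === 'q' && !Number.isNaN(Number(value))) q = Number(value);
    }
    prefs.set(type.toLowerCase(), q);
  }
  return prefs;
}

// Hypothetical helper: true when SXG is acceptable and preferred at least
// as much as text/html.
function shouldServeSxg(acceptHeader) {
  const prefs = parseAccept(acceptHeader);
  const sxg = prefs.get('application/signed-exchange');
  if (sxg === undefined || sxg === 0) return false;
  // Fall back to wildcard entries when text/html is not listed explicitly.
  const html = prefs.get('text/html') ?? prefs.get('text/*') ?? prefs.get('*/*') ?? 0;
  return sxg >= html;
}

// Equal preference: serve the SXG.
console.log(shouldServeSxg('text/html,application/signed-exchange;v=b3'));
// SXG preferred less than HTML: serve the regular HTML response.
console.log(shouldServeSxg('text/html,application/signed-exchange;v=b3;q=0.9'));
```

A frontend server could run a check like this before forwarding a request to an SXG generator, and fall back to the normal HTML response otherwise.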

min(), max(), and clamp(): three logical CSS functions to use today

With responsive design evolving and becoming increasingly nuanced, CSS itself is constantly evolving and providing authors increased control. The min(), max(), and clamp() functions, now supported in all modern browsers, are among the latest tools in making authoring websites and apps more dynamic and responsive. When it comes to flexible and fluid typography, controlled element resizing, and maintaining proper spacing, min(), max(), and clamp() can help. Background # The math functions, calc(), min(), max(), and clamp() allow mathematical expressions with addition (+), subtraction (-), multiplication (*), and division (/) to be used as component values (see CSS Values And Units Level 4). Safari was the first to ship the complete set of functions in April 2019, with Chromium following later that year in version 79. This year, with Firefox 75 shipping, we now have browser parity for min(), max(), and clamp() in all evergreen browsers. Caniuse support table. Usage # See Demo on Codepen. You can use min(), max(), and clamp() on the right hand side of any CSS expression where it would make sense. For min() and max(), you provide an argument list of values, and the browser determines which one is either the smallest or largest, respectively. For example, in the case of min(1rem, 50%, 10vw), the browser calculates which of these relative units is the smallest, and uses that value as the actual value. See Demo on Codepen. The max() function selects the largest value from a list of comma-separated expressions. See Demo on Codepen. To use clamp(), enter three values: a minimum value, ideal value (from which to calculate), and maximum value. Any of these functions can be used anywhere a <length>, <frequency>, <angle>, <time>, <percentage>, <number>, or <integer> is allowed. You can use these on their own (e.g. font-size: max(0.5vw, 50%, 2rem)), in conjunction with calc() (e.g. font-size: max(calc(0.5vw - 1em), 2rem)), or composed (e.g. font-size: max(min(0.5vw, 1em), 2rem)).
When using a calculation inside of a min(), max(), or clamp() function, you can remove the call to calc(). For example, writing font-size: max(calc(0.5vw - 1em), 2rem) would be the same as font-size: max(0.5vw - 1em, 2rem). To recap: min(<value-list>): selects the smallest (most negative) value from a list of comma-separated expressions max(<value-list>): selects the largest (most positive) value from a list of comma-separated expressions clamp(<min>, <ideal>, <max>): clamps a value between an upper and lower bound, based on a set ideal value Let's take a look at some examples. The perfect width # According to The Elements of Typographic Style by Robert Bringhurst, "anything from 45 to 75 characters is widely regarded as a satisfactory length of line for a single-column page set in a serifed text face in a text size." To ensure that your text blocks are not narrower than 45 characters or wider than 75 characters, use clamp() and the ch unit (the advance width of the "0" character): p { width: clamp(45ch, 50%, 75ch); } This allows the browser to determine the width of the paragraph. It will set the width to 50%, unless 50% is smaller than 45ch, at which point 45ch will be selected, and vice versa if 50% is wider than 75ch, at which point 75ch will be selected. In this demo, the card itself is getting clamped: See Demo on Codepen. You could break this up with just the min() or max() function. If you want the element to always be at 50% width, and not exceed 75ch in width (e.g. on larger screens), write: width: min(75ch, 50%);. This essentially sets a "max" size by using the min() function. By the same token, you can ensure a minimum size for legible text using the max() function. This would look like: width: max(45ch, 50%);. Here, the browser selects whichever is larger, 45ch or 50%, meaning the element will be at least 45ch wide.
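The selection logic described in this section is small enough to model directly. The JavaScript sketch below mirrors how clamp(MIN, VAL, MAX) resolves, which the CSS specification defines as max(MIN, min(VAL, MAX)). The helper names cssClamp and paragraphWidth are hypothetical, all lengths are assumed to be already resolved to pixels, and the value of 8px for 1ch is chosen purely for illustration.

```javascript
// Model of CSS clamp(): equivalent to max(MIN, min(VAL, MAX)).
function cssClamp(min, ideal, max) {
  return Math.max(min, Math.min(ideal, max));
}

// Hypothetical helper: resolves `width: clamp(45ch, 50%, 75ch)` in pixels.
// `chPx` (the pixel width of one ch) is an assumed input; a real browser
// derives it from the element's font.
function paragraphWidth(containerPx, chPx) {
  return cssClamp(45 * chPx, 0.5 * containerPx, 75 * chPx);
}

console.log(paragraphWidth(400, 8));  // 50% is 200px, below the 45ch floor of 360px
console.log(paragraphWidth(1000, 8)); // 50% is 500px, inside the allowed range
console.log(paragraphWidth(2000, 8)); // 50% is 1000px, above the 75ch ceiling of 600px
```

Dropping either bound gives the one-sided variants from the text: Math.min(75 * chPx, 0.5 * containerPx) models width: min(75ch, 50%), and Math.max(45 * chPx, 0.5 * containerPx) models width: max(45ch, 50%).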
Padding management # Using the same concept as above, where the min() function can set a "max" value and max() sets a "min" value, you can use max() to set a minimum padding size. This example comes from CSS Tricks, where reader Caluã de Lacerda Pataca shared this idea: The idea is to enable an element to have additional padding at larger screen sizes, but maintain a minimum padding at smaller screen sizes, particularly on the inline padding. To achieve this, use calc() and subtract the minimum padding from either side: calc((100vw - var(--contentWidth)) / 2), or use max: max(2rem, 50vw - var(--contentWidth) / 2). All together it looks like: footer { padding: var(--blockPadding) max(2rem, 50vw - var(--contentWidth) / 2); } Setting a minimum padding for a component using the max() function. See Demo on Codepen. Fluid typography # In order to enable fluid typography, Mike Riethmuller popularized a technique that uses the calc() function to set a minimum font size, maximum font size, and allow for scaling from the min to the max. See Demo on Codepen. With clamp(), you can write this more clearly. Rather than requiring a complex string, the browser can do the work for you. Set the minimum acceptable font size (for example, 1.5rem for a title), maximum size (e.g. 3rem), and ideal size of 5vw. Now, we get typography that scales with the viewport width of the page until it reaches the limiting minimum and maximum values, in a much more succinct line of code: p { font-size: clamp(1.5rem, 5vw, 3rem); } Warning: Limiting how large text can get with max() or clamp() can cause a WCAG failure under 1.4.4 Resize text (AA), because a user may be unable to scale the text to 200% of its original size. Be certain to test the results with zoom. Conclusion # The CSS math functions, min(), max(), and clamp() are very powerful, well supported, and could be just what you're looking for to help you build responsive UIs.
For more resources, check out: CSS Values and Units on MDN; CSS Values and Units Level 4 Spec; CSS Tricks article on Inner-Element Width; min(), max(), clamp() Overview by Ahmad Shadeed. Cover image from @yer_a_wizard on Unsplash.

Video processing with WebCodecs

Modern web technologies provide ample ways to work with video. Media Stream API, Media Recording API, Media Source API, and WebRTC API add up to a rich tool set for recording, transferring, and playing video streams. While solving certain high-level tasks, these APIs don't let web programmers work with individual components of a video stream such as frames and unmuxed chunks of encoded video or audio. To get low-level access to these basic components, developers have been using WebAssembly to bring video and audio codecs into the browser. But given that modern browsers already ship with a variety of codecs (which are often accelerated by hardware), repackaging them as WebAssembly seems like a waste of human and computer resources. WebCodecs API eliminates this inefficiency by giving programmers a way to use media components that are already present in the browser. Specifically: Video and audio decoders Video and audio encoders Raw video frames Image decoders The WebCodecs API is useful for web applications that require full control over the way media content is processed, such as video editors, video conferencing, video streaming, etc. Current status # Step Status 1. Create explainer Complete 2. Create initial draft of specification Complete 3. Gather feedback & iterate on design In Progress 4. Origin trial In Progress 5. Launch Not started Video processing workflow # Frames are the centerpiece in video processing. Thus in WebCodecs most classes either consume or produce frames. Video encoders convert frames into encoded chunks. Video decoders do the opposite. Track readers turn video tracks into a sequence of frames. By design all these transformations happen asynchronously. WebCodecs API tries to keep the web responsive by keeping the heavy lifting of video processing off the main thread. Currently in WebCodecs the only way to show a frame on the page is to convert it into an ImageBitmap and either draw the bitmap on a canvas or convert it into a WebGLTexture. 
WebCodecs in action # Encoding # It all starts with a VideoFrame. There are two ways to convert existing pictures into VideoFrame objects. The first is to create a frame directly from an ImageBitmap. Just call the VideoFrame() constructor and give it a bitmap and a presentation timestamp. let cnv = document.createElement('canvas'); // draw something on the canvas …let bitmap = await createImageBitmap(cnv); let frame_from_bitmap = new VideoFrame(bitmap, { timestamp: 0 }); The path from ImageBitmap to the network or to storage. The second is to use VideoTrackReader to set a function that will be called each time a new frame appears in a MediaStreamTrack. This is useful when you need to capture a video stream from a camera or the screen. let frames_from_stream = []; let stream = await navigator.mediaDevices.getUserMedia({ … }); let vtr = new VideoTrackReader(stream.getVideoTracks()[0]); vtr.start((frame) => { frames_from_stream.push(frame); }); The path from MediaStreamTrack to the network or to storage. No matter where they are coming from, frames can be encoded into EncodedVideoChunk objects with a VideoEncoder. Before encoding, VideoEncoder needs to be given two JavaScript objects: Init dictionary with two functions for handling encoded chunks and errors. These functions are developer-defined and can't be changed after they're passed to the VideoEncoder constructor. Encoder configuration object, which contains parameters for the output video stream. You can change these parameters later by calling configure(). const init = { output: handleChunk, error: (e) => { console.log(e.message); } }; let config = { codec: 'vp8', width: 640, height: 480, bitrate: 8_000_000, // 8 Mbps framerate: 30, }; let encoder = new VideoEncoder(init); encoder.configure(config); After the encoder has been set up, it's ready to start accepting frames. 
When frames are coming from a media stream, the callback given to VideoTrackReader.start() will pump frames into the encoder, periodically inserting keyframes and checking that the encoder is not overwhelmed with incoming frames. Both configure() and encode() return immediately without waiting for the actual work to complete. It allows several frames to queue for encoding at the same time. But it makes error reporting somewhat cumbersome. Errors are reported either by immediately throwing exceptions or by calling the error() callback. Some errors are easy to detect immediately, others become evident only during encoding. If encoding completes successfully the output() callback is called with a new encoded chunk as an argument. Another important detail here is that encode() consumes the frame, if the frame is needed later (for example, to encode with another encoder) it needs to be duplicated by calling clone(). let frame_counter = 0; let pending_outputs = 0; let vtr = new VideoTrackReader(stream.getVideoTracks()[0]); vtr.start((frame) => { if (pending_outputs > 30) { // Too many frames in flight, encoder is overwhelmed // let's drop this frame. return; } frame_counter++; pending_outputs++; const insert_keyframe = (frame_counter % 150) == 0; encoder.encode(frame, { keyFrame: insert_keyframe }); }); Finally it's time to finish encoding code by writing a function that handles chunks of encoded video as they come out of the encoder. Usually this function would be sending data chunks over the network or muxing them into a media container for storage. 
function handleChunk(chunk) { let data = new Uint8Array(chunk.data); // actual bytes of encoded data let timestamp = chunk.timestamp; // media time in microseconds let is_key = chunk.type == 'key'; // can also be 'delta' pending_outputs--; fetch(`/upload_chunk?timestamp=${timestamp}&type=${chunk.type}`, { method: 'POST', headers: { 'Content-Type': 'application/octet-stream' }, body: data }); } If at some point you need to make sure that all pending encoding requests have been completed, you can call flush() and wait for its promise. await encoder.flush(); Decoding # Setting up a VideoDecoder is similar to what's been done for the VideoEncoder: two functions are passed when the decoder is created, and codec parameters are given to configure(). The set of codec parameters can vary from codec to codec; for example, for H264 you currently need to specify a binary blob with AVCC extradata. const init = { output: handleFrame, error: (e) => { console.log(e.message); } }; const config = { codec: 'vp8', codedWidth: 640, codedHeight: 480 }; let decoder = new VideoDecoder(init); decoder.configure(config); Once the decoder is initialized, you can start feeding it with EncodedVideoChunk objects. Creating a chunk just takes a BufferSource of data and a frame timestamp in microseconds. Any chunks emitted by the encoder are ready for the decoder as is, although it's hard to imagine a real-world use case for decoding newly encoded chunks (except for the demo below). All of the things said above about the asynchronous nature of the encoder's methods are equally true for decoders. let responses = await downloadVideoChunksFromServer(timestamp); for (let i = 0; i < responses.length; i++) { let chunk = new EncodedVideoChunk({ timestamp: responses[i].timestamp, data: new Uint8Array ( responses[i].body ) }); decoder.decode(chunk); } await decoder.flush(); The path from the network or storage to an ImageBitmap. Now it's time to show how a freshly decoded frame can be shown on the page.
It's better to make sure that the decoder output callback (handleFrame()) quickly returns. In the example below, it only adds a frame to the queue of frames ready for rendering. Rendering happens separately, and consists of three steps: Converting the VideoFrame into an ImageBitmap. Waiting for the right time to show the frame. Drawing the image on the canvas. Once a frame is no longer needed, call destroy() to release underlying memory before the garbage collector gets to it, this will reduce the average amount of memory used by the web application. let cnv = document.getElementById('canvas_to_render'); let ctx = cnv.getContext('2d', { alpha: false }); let ready_frames = []; let underflow = true; let time_base = 0; function handleFrame(frame) { ready_frames.push(frame); if (underflow) setTimeout(render_frame, 0); } function delay(time_ms) { return new Promise((resolve) => { setTimeout(resolve, time_ms); }); } function calculateTimeTillNextFrame(timestamp) { if (time_base == 0) time_base = performance.now(); let media_time = performance.now() - time_base; return Math.max(0, (timestamp / 1000) - media_time); } async function render_frame() { if (ready_frames.length == 0) { underflow = true; return; } let frame = ready_frames.shift(); underflow = false; let bitmap = await frame.createImageBitmap(); // Based on the frame's timestamp calculate how much of real time waiting // is needed before showing the next frame. let time_till_next_frame = calculateTimeTillNextFrame(frame.timestamp); await delay(time_till_next_frame); ctx.drawImage(bitmap, 0, 0); // Immediately schedule rendering of the next frame setTimeout(render_frame, 0); frame.destroy(); } Demo # The demo below shows two canvases, the first one is animated at the refresh rate of your display, the second one shows a sequence of frames captured by VideoTrackReader at 30 FPS, encoded and decoded using WebCodecs API. 
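The frame pacing arithmetic inside calculateTimeTillNextFrame() above is easy to get wrong because frame timestamps are in microseconds while performance.now() reports milliseconds. The sketch below isolates that calculation as a pure function so it can be checked without a decoder or a clock. The name timeTillNextFrame and the explicit timeBaseMs and nowMs parameters are assumptions introduced here for testability; they are not part of the WebCodecs API.

```javascript
// Pure version of the pacing arithmetic from the rendering example.
// Frame timestamps are in microseconds; wall-clock values are in milliseconds.
function timeTillNextFrame(timestampUs, timeBaseMs, nowMs) {
  const mediaTimeMs = nowMs - timeBaseMs; // playback time elapsed so far
  // Never return a negative delay: a late frame is shown immediately.
  return Math.max(0, timestampUs / 1000 - mediaTimeMs);
}

// A frame stamped at 100 ms of media time, 40 ms after playback started,
// still has to wait before being drawn.
console.log(timeTillNextFrame(100_000, 5_000, 5_040));

// The same frame arriving 150 ms after playback started is already late.
console.log(timeTillNextFrame(100_000, 5_000, 5_150));
```

Passing the clock values in as arguments, rather than reading performance.now() inside the function, is a design choice made only to keep the sketch deterministic.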
Feature detection # To check for WebCodecs support: if ('VideoEncoder' in window) { // WebCodecs API is supported. } Using the WebCodecs API # Enabling via a command line flag # To experiment with the WebCodecs API locally on all desktop platforms, without an origin trial token, start Chrome with a command line flag: --enable-blink-features=WebCodecs Enabling support during the origin trial phase # The WebCodecs API is available on all desktop platforms (Chrome OS, Linux, macOS, and Windows) as an origin trial in Chrome 86. The origin trial is expected to end just before Chrome 88 moves to stable in February 2021. The API can also be enabled using a flag. Origin trials allow you to try new features and give feedback on their usability, practicality, and effectiveness to the web standards community. For more information, see the Origin Trials Guide for Web Developers. To sign up for this or another origin trial, visit the registration page. Register for the origin trial # Request a token for your origin. Add the token to your pages. There are two ways to do that: Add an origin-trial <meta> tag to the head of each page. For example, this may look something like: <meta http-equiv="origin-trial" content="TOKEN_GOES_HERE"> If you can configure your server, you can also add the token using an Origin-Trial HTTP header. The resulting response header should look something like: Origin-Trial: TOKEN_GOES_HERE Feedback # The Chrome team wants to hear about your experiences with the WebCodecs API. Tell us about the API design # Is there something about the API that doesn't work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue. Report a problem with the implementation # Did you find a bug with Chrome's implementation? Or is the implementation different from the spec?
File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>Media>WebCodecs in the Components box. Glitch works great for sharing quick and easy repros. Show support for the API # Are you planning to use the WebCodecs API? Your public support helps the Chrome team to prioritize features and shows other browser vendors how critical it is to support them. Send emails to media-dev@chromium.org or send a tweet to @ChromiumDev using the hashtag #WebCodecs and let us know where and how you're using it. Hero image by Denise Jans on Unsplash.

How focusing on web performance improved Tokopedia's click-through rate by 35%

Tokopedia is one of the largest e-commerce companies in Indonesia. With 2.7M+ nationwide merchant networks, 18M+ product listings, and 50M+ monthly visitors, the web team knew that investment in web performance was essential. By building a performance-first culture, they achieved a 35% increase in click-through rates (CTR) and an 8% increase in conversions (CVR). 35% increase in CTR; 8% increase in CVR; 4 sec improvement in TTI. Highlighting the opportunity # The web team talked to their leadership team about the importance of investing in web performance to improve user experience and engagement, and also showed the impact of performance using advanced patterns and APIs. Check out web.dev's Build a performance culture collection for tips on how to persuade your cross-functional stakeholders to focus on website performance. The approach they used # JavaScript and resource optimization # JavaScript is a common cause of performance issues. The team took several steps to minimize this: code splitting and optimizing for above-the-fold content; adaptive loading, e.g. only loading high-quality images for devices on fast networks and using lower-quality images for devices on slow networks; and lazy-loading below-the-fold images. Homepage optimization # The team used Svelte to build a lite version of the homepage for first-time visitors, ensuring a fast website experience. This version also used a service worker to cache the non-lite assets in the background. Performance budgeting and monitoring # The team used Lighthouse and other tools to improve the quality of web pages, along with the Server-Timing header, the PageSpeed Insights (PSI) API, and Chrome User Experience Report data to monitor field and lab metrics. Dendi Sunardi, Engineering Manager, Web Platform, Tokopedia. Check out the Scale on web case studies page for more success stories from India and Southeast Asia.

Logical layout enhancements with flow-relative shorthands

Since Chromium 69 (September 3rd 2018), logical properties and values have helped developers maintain control of their international layouts through logical, rather than physical, direction and dimension styles. In Chromium 87, shorthands and offsets have shipped to make these logical properties and values a bit easier to write. This catches Chromium up to Firefox, which has had support for the shorthands since 66. Safari has them ready in their tech preview. Document flow # If you're already familiar with logical properties, inline and block axes, and don't want a refresher, you can skip ahead. Otherwise, here's a short refresher. In English, letters and words flow left to right while paragraphs are stacked top to bottom. In traditional Chinese, letters and words are top to bottom while paragraphs are stacked right to left. In just these 2 cases, if we write CSS that puts "margin top" on a paragraph, we're only appropriately spacing 1 language style. If the page is translated into traditional Chinese from English, the margin may well not make sense in the new vertical writing mode. Therefore the physical side of the box isn't very useful internationally. Thus begins the process of supporting multiple languages; learning about physical versus logical sides of the box model. Key Term: A logical property is one that references a side, corner or axis of the box model in context of the applicable language direction. It's akin to referencing someone's strong arm, rather than assuming it's their right arm. "Right" is a physical arm reference, "strong" is a logical arm reference, contextual to the individual. Have you ever inspected the p element in Chrome DevTools? If so, you might have noticed that the default User Agent styles are not physical, but logical. 
p { margin-block-start: 1em; margin-block-end: 1em; margin-inline-start: 0px; margin-inline-end: 0px; } CSS from Chromium's User Agent Stylesheet The margin is not on the top or bottom like an English reader might believe. It's block-start and block-end! These logical properties are akin to an English reader's top and bottom, but also akin to a Japanese reader as right and left. Written once, works everywhere. Normal flow is when the webpage is part of this multi-directionality intentionally. When page content updates according to document direction changes, the layout and its elements are considered in flow. Read more about "in" and "out" of flow on MDN or in the CSS Display Module spec. While logical properties are not required to be in flow, they do much of the heavy lifting for you as directionality changes. Flow implies direction, which letters, words and content need to travel along. This leads us to block and inline logical directions. Block direction is the direction that new content blocks follow, like asking yourself, "where to put the next paragraph?". You might think of it as a "content block", or "block of text". Every language arranges their blocks and orders them along their respective block-axis. block-start is the side a paragraph is first placed, while block-end is the side new paragraphs flow towards. Key Term: The block direction is defined by the writing-mode property. For example, horizontal-tb (the initial value) has a vertical block axis that flows top-to-bottom (tb). Other values have an horizontal block axis, which can flow left-to-right (like in vertical-lr) or right-to-left (like in vertical-rl). In traditional Japanese handwriting, for example, block direction flows right to left: Inline direction is the direction that letters and words go. Consider the direction your arm and hand travel when you write; they are traveling along the inline-axis. 
inline-start is the side where you start writing, while inline-end is the side where writing ends or wraps. The above video, the inline-axis is top to bottom, but in this next video the inline-axis flows right to left. Key Term: The inline direction is defined by both writing-mode and direction. For example, it flows left-to-right with horizontal-tb and ltr, right-to-left with horizontal-tb and rtl, top-to-bottom with vertical-lr and ltr, and bottom-to-top with vertical-rl and rtl. Being flow-relative means that the styles written for one language will be contextual and appropriately applied into other languages. Content will flow relative to the language it's being delivered for. New shorthands # Some of the following shorthands are not new features for the browser, rather, easier ways to write styles by taking advantage of being able to set values on both block or inline edges at once. The inset-* logical properties do bring new abilities, as there were no longhand ways to specify absolute positions with logical properties before it. Insets and shorthands flow (hehe) together so well though, I'm going to tell you about all of the new logical properties features landing in Chromium 87 at once. Margin shorthands # No new abilities shipped, but some super handy shorthands did: margin-block and margin-inline. Caution: If the above items do not have space between them, then margin-block shorthand is not supported in your browser. Longhand margin-block-start: 2ch; margin-block-end: 2ch; New shorthand margin-block: 2ch; /* or */ margin-block: 2ch 2ch; There is no shorthand for "top and bottom" or "left and right"… until now! You probably reference all 4 sides using the shorthand of margin: 10px;, and now you can easily reference 2 complimentary sides by using the logical property shorthand. 
Longhand margin-inline-start: 4ch; margin-inline-end: 2ch; New shorthand margin-inline: 4ch 2ch; Padding shorthands # No new abilities shipped, but more super handy shorthands did: padding-block and padding-inline. Longhand padding-block-start: 2ch; padding-block-end: 2ch; New shorthand padding-block: 2ch; /* or */ padding-block: 2ch 2ch; And the inline complimentary set of shorthands: Longhand padding-inline-start: 4ch; padding-inline-end: 2ch; New shorthand padding-inline: 4ch 2ch; Inset and shorthands # The physical properties top, right, bottom and left can all be written as values for the inset property. Any value of position can benefit from setting sides with inset. .cover { position: absolute; top: 0; right: 0; bottom: 0; left: 0; inset: 0; } Physical longhand position: absolute; top: 1px; right: 2px; bottom: 3px; left: 4px; New physical shorthand position: absolute; inset: 1px 2px 3px 4px; That should look immediately convenient! Inset is shorthand for the physical sides, and it works just like margin and padding. New features # As exciting as the physical sides shorthand is, there's even more from the logical features brought by additional inset shorthands. These shorthands bring developer authoring convenience (they're shorter to type) but also increase the potential reach for the layout because they're flow-relative. Physical longhand position: absolute; top: 10px; bottom: 10px; Logical shorthand position: absolute; inset-block: 10px; Physical longhand position: absolute; left: 10px; right: 20px; Logical shorthand position: absolute; inset-inline: 10px 20px; Further reading and a full list of inset shorthand and longhand is available on MDN. Border shorthands # Border, plus its nested color, style, and width properties have all got new logical shorthands as well. 
Physical longhand border-top-color: hotpink; border-bottom-color: hotpink; Logical shorthand border-block-color: hotpink; /* or */ border-block-color: hotpink hotpink; Physical longhand border-left-style: dashed; border-right-style: dashed; Logical shorthand border-inline-style: dashed; /* or */ border-inline-style: dashed dashed; Physical longhand border-left-width: 1px; border-right-width: 1px; Logical shorthand border-inline-width: 1px; /* or */ border-inline-width: 1px 1px; Further reading and a full list of border shorthand and longhand is available on MDN. Logical property <figure> example # Let's put it all together in a small example. Logical properties can layout an image with a caption to handle different writing and document directions. Or try it! You don't have to do much to make a card internationally responsive with a <figure> and a few logical properties. If you're curious how all this internationally considerate CSS works together, I hope this is a small meaningful introduction. Polyfilling and cross-browser support # The Cascade or build tools are viable options to have old and new browsers alike, properly spaced with updated logical properties. For Cascade fallbacks, follow a physical property with a logical one and the browser will use the "last" property it found during style resolution. p { /* for unsupporting browsers */ margin-top: 1ch; margin-bottom: 2ch; /* for supporting browsers to use */ /* and unsupporting browsers to ignore and go 🤷‍♂️ */ margin-block: 1ch 2ch; } That's not quite a full solution for everyone though. 
Here's a handwritten fallback that leverages the :lang() pseudo-selector to target specific languages, adjusts their physical spacing appropriately, then at the end offers the logical spacing for supporting browsers: /* physical side styles */ p { margin-top: 1ch; margin-bottom: 2ch; } /* adjusted physical side styles per language */ :lang(ja) { p { /* zero out styles not useful for traditional Japanese */ margin-top: 0; margin-bottom: 0; /* add appropriate styles for traditional Japanese */ margin-right: 1ch; margin-left: 2ch; } } /* add selectors and adjust for all supported languages */ :lang(he) {…} :lang(mn) {…} /* Logical Sides */ /* Then, for supporting browsers to use */ /* and unsupporting browsers to ignore #TheCascade */ p { /* remove any potential physical cruft.. */ margin: 0; /* explicitly set logical value */ margin-block: 1ch 2ch; } You could also use @supports to determine whether or not to provide physical property fallbacks: p { margin-top: 1ch; margin-bottom: 2ch; } @supports (margin-block: 0) { p { margin-block: 1ch 2ch; } } Sass, PostCSS, Emotion and others have automated bundler and/or build time offerings that have a wide array of fallbacks or solutions. Check out each one to see which matches your toolchain and overall site strategy. What's next # More of CSS will offer logical properties, it's not done yet! There's one big missing set of shorthands though, and a resolution is still pending in this GitHub issue. There is a temporary solution in a draft. What if you want to style all logical sides of a box with a shorthand? Physical shorthand margin: 1px 2px 3px 4px; margin: 1px 2px; margin: 2px; Logical shorthand margin: logical 1px 2px 3px 4px; margin: logical 1px 2px; margin: logical 2px; The current draft proposal would mean you have to write logical in every shorthand in order to get the logical equivalent applied, which doesn't sound very DRY to some. 
There are other proposals to change it at the block or page level, but that could leak logical uses into styles still assuming physical sides. html { flow-mode: physical; /* or */ flow-mode: logical; /* now all margin/padding/etc references are logical */ } /* hopefully no 3rd/1st party code is hard coded to top/left/etc ..? */ It's a tough one! Cast your vote, voice your opinion, we want to hear from you. Want to learn or study logical properties more? Here's a detailed reference, along with guides and examples, on MDN 🤓 Feedback # To propose changes to the CSS syntax of flow-relative shorthands, first check the existing issues on the csswg-drafts repository. If none of the existing issues match your proposal, create a new issue. To report bugs on Chromium's implementation of flow-relative shorthands, first check the existing issues on Chromium Bug Tracker. If none of the existing issues match your bug, create a new issue.

How ZDF created a video PWA with offline and dark mode

When broadcaster ZDF was considering redesigning their frontend technology stack, they decided to take a closer look at Progressive Web Apps for their streaming site ZDFmediathek. Development agency Cellular took on the challenge to build a web-based experience that is on par with ZDF's platform-specific iOS and Android apps. The PWA offers installability, offline video playback, transition animations, and a dark mode. Adding a service worker # A key feature of a PWA is offline support. For ZDF most of the heavy lifting is done by Workbox, a set of libraries and Node modules that make it easy to support different caching strategies. The ZDF PWA is built with TypeScript and React, so it uses the Workbox library already built into create-react-app to precache static assets. This lets the application focus on making the dynamic content available offline, in this case the videos and their metadata. The basic idea is quite simple: fetch the video and store it as a blob in IndexedDB. Then during playback, listen for online/offline events, and switch to the downloaded version when the device goes offline. Unfortunately things turned out to be a little more complex. One of the project requirements was to use the official ZDF web player which doesn't provide any offline support. The player takes a content ID as input, talks to the ZDF API, and plays back the associated video. This is where one of the web's most powerful features comes to the rescue: service workers. The service worker can intercept the various requests done by the player and respond with the data from IndexedDB. This transparently adds offline capabilities without having to change a single line of the player's code. Since offline videos tend to be quite large, a big question is how many of them can actually be stored on a device. With the help of the StorageManager API the app can estimate the available space and inform the user when there is insufficient space before even starting the download. 
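As a rough sketch of the storage check described above (not ZDF's actual code): navigator.storage.estimate() resolves to an object with usage and quota values in bytes. The helper names fitsInQuota and canDownload, and the optional safety margin, are assumptions made for illustration.

```javascript
// Hypothetical helper: does a download of `bytes` fit into the remaining
// quota, while keeping an (assumed) safety margin free?
function fitsInQuota(estimate, bytes, safetyMargin = 0) {
  const available = (estimate.quota ?? 0) - (estimate.usage ?? 0);
  return bytes + safetyMargin <= available;
}

// Hypothetical wrapper around the StorageManager API. The `storage`
// parameter is injectable so the logic can be exercised outside a browser.
async function canDownload(bytes, storage = globalThis.navigator?.storage) {
  if (!storage || typeof storage.estimate !== 'function') {
    // API unavailable: the caller cannot know in advance.
    return undefined;
  }
  return fitsInQuota(await storage.estimate(), bytes);
}
```

A caller could await canDownload(videoSizeInBytes) before starting the fetch, warn the user on false, and treat undefined as "unknown".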
Unfortunately Safari isn't on the list of browsers implementing this API and at the time of writing there wasn't much up-to-date information about how other browsers applied quotas. Therefore, the team wrote a small utility to test the behavior on various devices. By now a comprehensive article exists that sums up all the details. Adding a custom install prompt # The ZDF PWA offers a custom in-app installation flow and prompts users to install the app as soon as they want to download their first video. This is a good point in time to prompt for install because the user has expressed a clear intention to use the app offline. Custom install prompt being triggered when downloading a video for offline consumption. Building an offline page to access downloads # When the device is not connected to the internet and the user navigates to a page that is not available in offline mode, a special page is shown instead that lists all videos that have previously been downloaded or (in case no content has been downloaded yet) a short explanation of the offline feature. Offline page showing all content available for watching offline. Using frame loading rate for adaptive features # To offer a rich user experience the ZDF PWA includes some subtle transitions that happen when the user scrolls or navigates. On low-end devices such animations usually have the opposite effect and make the app feel sluggish and less responsive if they don't run at 60 frames per second. To take this into account the app measures the actual frame rate via requestAnimationFrame() while the application loads and disables all animations when the value drops below a certain threshold. 
const frameRate = new Promise(resolve => { let lastTick; const samples = []; function measure() { const tick = Date.now(); if (lastTick) samples.push(tick - lastTick); lastTick = tick; if (samples.length < 20) requestAnimationFrame(measure); else { const avg = samples.reduce((a, b) => a + b) / samples.length; const fps = 1000 / avg; resolve(fps); } } measure(); }); Even if this measurement provides only a rough indication of the device's performance and varies on each load, it was still a good basis for decision-making. It's worth mentioning that depending on the use case there are other techniques for adaptive loading that developers can implement. One great advantage of this approach is that it is available on all platforms. Dark mode # A popular feature for modern mobile experiences is dark mode. Especially when watching videos in low ambient light many people prefer a dimmed UI. The ZDF PWA not only provides a switch that allows users to toggle between a light and a dark theme, it also reacts to changes of the OS-wide color preferences. This way the app will automatically change its appearance on devices that have set up a schedule to change the theme based on the time of day. Results # The new progressive web app was silently launched as a public beta in March 2020 and has received a lot of positive feedback since then. While the beta phase continues, the PWA still runs under its own temporary domain. Even though the PWA wasn't publicly promoted there is a steadily growing number of users. Many of these are from the Microsoft Store which allows Windows 10 users to discover PWAs and install them like platform-specific apps. What's next? # ZDF plans to continue adding features to their PWA, including login for personalization, cross-device and platform viewing, and push notifications.

Handling range requests in a service worker

Some HTTP requests contain a Range: header, indicating that only a portion of the full resource should be returned. They're commonly used for streaming audio or video content to allow smaller chunks of media to be loaded on demand, instead of requesting the entirety of the remote file all at once. A service worker is JavaScript code that sits in between your web app and the network, potentially intercepting outgoing network requests and generating responses for them. Historically, range requests and service workers haven't played nicely together. It's been necessary to take special steps to avoid bad outcomes in your service worker. Fortunately, this is starting to change. In browsers exhibiting the correct behavior, range requests will "just work" when passing through a service worker. What's the issue? # Consider a service worker with the following fetch event listener, which takes every incoming request and passes it to the network: self.addEventListener('fetch', (event) => { // The Range: header will not pass through in // browsers that behave incorrectly. event.respondWith(fetch(event.request)); }); This sort of trivial fetch event listener should normally be avoided; it's used here for illustrative purposes. In browsers with the incorrect behavior, if event.request included a Range: header, that header would be silently dropped. The request that was received by the remote server would not include Range: at all. This would not necessarily "break" anything, since a server is technically allowed to return the full response body, with a 200 status code, even when a Range: header is present in the original request. But it would result in more data being transferred than is strictly needed from the perspective of the browser. Developers who were aware of this behavior could work around it by explicitly checking for the presence of a Range: header, and not calling event.respondWith() if one is present. 
By doing this, the service worker effectively removes itself from the response generation picture, and the default browser networking logic, which knows how to preserve range requests, is used instead. self.addEventListener('fetch', (event) => { // Return without calling event.respondWith() // if this is a range request. if (event.request.headers.has('range')) { return; } event.respondWith(fetch(event.request)); }); It's safe to say that most developers were not aware of the need to do this, though. And it wasn't clear why that should be required. Ultimately, this limitation was due to browsers needing to catch up to changes in the underlying specification, which added support for this functionality. What's been fixed? # Browsers that behave correctly preserve the Range: header when event.request is passed to fetch(). This means the service worker code in my initial example will allow the remote server to see the Range: header, if it was set by the browser: self.addEventListener('fetch', (event) => { // The Range: header will pass through in browsers // that behave correctly. event.respondWith(fetch(event.request)); }); The server now gets a chance to properly handle the range request and return a partial response with a 206 status code. Which browsers behave correctly? # Recent versions of Safari have the correct functionality. Chrome and Edge, starting with version 87, behave correctly as well. As of October 2020, Firefox has not yet fixed this behavior, so you may still need to account for it while deploying your service worker's code to production. Checking the "Include range header in network request" row of the Web Platform Tests dashboard is the best way to confirm whether or not a given browser has corrected this behavior. What about serving range requests from the cache? # Service workers can do much more than just pass a request through to the network. A common use case is to add resources, like audio and video files, to a local cache. 
A service worker can then fulfill requests from that cache, bypassing the network entirely. All browsers, including Firefox, support inspecting a request inside a fetch handler, checking for the presence of the Range: header, and then locally fulfilling the request with a 206 response that comes from a cache. The service worker code to properly parse the Range: header and return only the appropriate segment of the complete cached response is not trivial, though. Fortunately, developers who want some help can turn to Workbox, which is a set of libraries that simplifies common service worker use cases. The workbox-range-request module implements all the logic necessary to serve partial responses directly from the cache. A full recipe for this use case can be found in the Workbox documentation. The hero image on this post is by Natalie Rhea Riggs on Unsplash.
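To give a feel for what serving a range request from the cache involves, here is a minimal sketch. It is not the workbox-range-request implementation: parseRange and partialResponse are hypothetical helpers, only single-range bytes= headers are handled, and the cached response is assumed to hold the complete resource.

```javascript
// Hypothetical helper: parse a single-range "bytes=" header against a
// resource of `size` bytes. Returns inclusive offsets, or null when the
// header is malformed or cannot be satisfied.
function parseRange(header, size) {
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match || (match[1] === '' && match[2] === '')) return null;
  let start;
  let end;
  if (match[1] === '') {
    // Suffix form "bytes=-N": the last N bytes of the resource.
    const suffix = Number(match[2]);
    if (suffix === 0) return null;
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = Number(match[1]);
    end = match[2] === '' ? size - 1 : Math.min(Number(match[2]), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start, end };
}

// Hypothetical helper: turn a full cached Response into a 206 response.
async function partialResponse(request, cachedResponse) {
  const blob = await cachedResponse.blob();
  const range = parseRange(request.headers.get('range') || '', blob.size);
  if (!range) {
    return new Response('', {
      status: 416,
      statusText: 'Range Not Satisfiable',
      headers: { 'Content-Range': `bytes */${blob.size}` },
    });
  }
  const body = blob.slice(range.start, range.end + 1);
  return new Response(body, {
    status: 206,
    statusText: 'Partial Content',
    headers: {
      'Content-Type':
        cachedResponse.headers.get('Content-Type') || 'application/octet-stream',
      'Content-Range': `bytes ${range.start}-${range.end}/${blob.size}`,
      'Content-Length': String(body.size),
    },
  });
}
```

Inside a fetch handler, partialResponse(event.request, cached) would only be called when the request carries a Range: header and a matching entry was found in the cache.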

A more private way to measure ad conversions, the Event Conversion Measurement API

The Conversion Measurement API will be renamed to Attribution Reporting API and offer more features. If you're experimenting with the Conversion Measurement API in Chrome 91 and below, read this post to find more details, use cases and instructions for how to use the API. If you're interested in the next iteration of this API (Attribution Reporting), which will be available for experimentation in Chrome (origin trial), join the mailing list for updates on available experiments. In order to measure the effectiveness of ad campaigns, advertisers and publishers need to know when an ad click or view leads to a conversion, such as a purchase or sign-up. Historically, this has been done with third-party cookies. Now, the Event Conversion Measurement API enables the correlation of an event on a publisher's website with a subsequent conversion on an advertiser site without involving mechanisms that can be used to recognize a user across sites. This proposal needs your feedback! If you have comments, please create an issue in the API proposal's repository. This API is part of the Privacy Sandbox, a series of proposals to satisfy third-party use cases without third-party cookies or other cross-site tracking mechanisms. See Digging into the Privacy Sandbox for an overview of all the proposals. Glossary # Adtech platforms: companies that provide software and tools to enable brands or agencies to target, deliver, and analyze their digital advertising. Advertisers: companies paying for advertising. Publishers: companies that display ads on their websites. Click-through conversion: conversion that is attributed to an ad click. View-through conversion: conversion that is attributed to an ad impression (if the user doesn't interact with the ad, then later converts). 
Who needs to know about this API: adtech platforms, advertisers, and publishers # Adtech platforms such as demand-side platforms are likely to be interested in using this API to support functionality that currently relies on third-party cookies. If you're working on conversion measurement systems: try out the demo, experiment with the API, and share your feedback. Advertisers and publishers relying on custom code for advertising or conversion measurement may similarly be interested in using this API to replace existing techniques. Advertisers and publishers relying on adtech platforms for advertising or conversion measurement don't need to use the API directly, but the rationale for this API may be of interest, particularly if you are working with adtech platforms that may integrate the API. API overview # Why is this needed? # Today, ad conversion measurement often relies on third-party cookies. But browsers are restricting access to these. Chrome plans on phasing out support for third-party cookies and offers ways for users to block them if they choose. Safari blocks third-party cookies, Firefox blocks known third-party tracking cookies, and Edge offers tracking prevention. Third-party cookies are becoming a legacy solution. New purpose-built APIs, like this one, are emerging to address in a privacy-preserving way the use cases that third-party cookies solved. How does the Event Conversion Measurement API compare to third-party cookies? It's purpose-built to measure conversions, unlike cookies. This in turn can enable browsers to apply more enhanced privacy protections. It's more private: it makes it difficult to recognize a user across two different top-level sites, for example to link publisher-side and advertiser-side user profiles. See how in How this API preserves user privacy. A first iteration # This API is at an early experimental stage. What's available as an origin trial is the first iteration of the API. 
Things may change substantially in future iterations. Only clicks # This iteration of the API only supports click-through conversion measurement, but view-through conversion measurement is under public incubation. How it works # This API can be used with two types of links (<a> elements) used for advertising: Links in a first-party context, such as ads on a social network or a search engine results page; Links in a third-party iframe, such as on a publisher site that uses a third-party adtech provider. With this API, such outbound links can be configured with attributes that are specific to ad conversions: Custom data to attach to an ad click on the publisher's side, for example a click ID or campaign ID. The website for which a conversion is expected for this ad. The reporting endpoint that should be notified of successful conversions. The cut-off date and time for when conversions can no longer be counted for this ad. When the user clicks an ad, the browser, on the user's local device, records this event, alongside conversion configuration and click data specified by Conversion Measurement attributes on the <a> element. Later on, the user may visit the advertiser's website and perform an action that the advertiser or their adtech provider categorizes as a conversion. If this happens, the ad click and the conversion event are matched by the user's browser. The browser finally schedules a conversion report to be sent to the endpoint specified in the <a> element's attributes. This report includes data about the ad click that led to this conversion, as well as data about the conversion. If several conversions are registered for a given ad click, as many corresponding reports are scheduled to be sent (up to a maximum of three per ad click). Reports are sent after a delay: days or sometimes weeks after conversion (see why in Reports timing). Browser support and similar APIs # Browser support # The Event Conversion Measurement API can be supported: As an origin trial. 
Origin trials enable the API for all visitors of a given origin. You need to register your origin for the origin trial in order to try the API with end users. See Using the conversion measurement API for details about the origin trial. By turning on flags, in Chrome 86 and later. Flags enable the API on a single user's browser. Flags are useful when developing locally. See details on the current status on the Chrome feature entry. Standardization # This API is being designed in the open, in the Web Platform Incubator Community Group (WICG). It's available for experimentation in Chrome. Similar APIs # WebKit, the web browser engine used by Safari, has a proposal with similar goals, the Private Click Measurement. It's being worked on within the Privacy Community Group (PrivacyCG). How this API preserves user privacy # With this API, conversions can be measured while protecting users' privacy: users can't be recognized across sites. This is made possible by data limits, noising of conversion data, and report timing mechanisms. Let's take a closer look at how these mechanisms work, and what they mean in practice. Data limits # In the following, click-time or view-time data is data available to adtech.example when the ad is served to the user and then clicked or viewed. Data from when a conversion happened is conversion-time data. Let's look at a publisher news.example and an advertiser shoes.example. Third-party scripts from the adtech platform adtech.example are present on the publisher site news.example to include ads for the advertiser shoes.example. shoes.example includes adtech.example scripts as well, to detect conversions. How much can adtech.example learn about web users? With third-party cookies # adtech.example relies on a third-party cookie used as a unique cross-site identifier to recognize a user across sites. In addition, adtech.example can access both detailed click- or view-time data and detailed conversion-time data, and link them. 
As a result, adtech.example can track the behavior of a single user across sites, between an ad view, click, and conversion. Because adtech.example is likely present on a large number of publisher and advertiser sites (not just news.example and shoes.example), a user's behavior can be tracked across the web. With the Event Conversion Measurement API # "Ad ID" on the cookies diagram and "Click ID" are both identifiers that enable mapping to detailed data. On this diagram, it's called "Click ID" because only click-through conversion measurement is supported. adtech.example can't use a cross-site identifier and hence can't recognize a user across sites. A 64-bit identifier can be attached to an ad click. Only 3 bits of conversion data can be attached to the conversion event. 3 bits can fit an integer value from 0 to 7. This is not much data, but enough that advertisers can learn how to make good decisions about where to spend their advertising budget in the future (for example by training data models). The click data and conversion data are never exposed to a JavaScript environment in the same context. Without an alternative to third-party cookies # Without an alternative to third-party cookies such as the Event Conversion Measurement API, conversions can't be attributed: if adtech.example is present on both the publisher's and advertiser's site, it may access click-time or conversion-time data but it can't link them at all. In this case, user privacy is preserved but advertisers can't optimize their ad spend. This is why an alternative like the Event Conversion Measurement API is needed. Noising of conversion data # The 3 bits gathered at conversion time are noised. For example, in Chrome's implementation, data noising works as follows: 5% of the time, the API reports a random 3-bit value instead of the actual conversion data. This protects users from privacy attacks. 
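The noising rule can be modeled in a few lines. This is only an illustrative sketch of the behavior described above, not Chrome's code: the function name noisedConversionData is hypothetical, and the random source is passed in as a parameter so the logic can be checked deterministically.

```javascript
// Taken from the description above: 5% of reports carry a random value.
const NOISE_RATE = 0.05;

// Hypothetical model of the noising step. `random` must return a number
// in [0, 1), like Math.random.
function noisedConversionData(actualValue, random = Math.random) {
  if (!Number.isInteger(actualValue) || actualValue < 0 || actualValue > 7) {
    throw new RangeError('conversion data must be an integer from 0 to 7');
  }
  if (random() < NOISE_RATE) {
    // Replace the real value with a uniformly random 3-bit value (0 to 7).
    return Math.floor(random() * 8);
  }
  return actualValue;
}
```

Note that the random replacement can happen to equal the actual value, so slightly more than 95% of reports carry the true conversion data.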
An actor trying to misuse the data from several conversions to create an identifier won't have full confidence in the data they receive—making these types of attacks more complicated. Note that it's possible to recover the true conversion count. Summing up click data and conversion data: Data Size Example Click data (impressiondata attribute) 64 bits An ad ID or click ID Conversion data 3 bits, noised An integer from 0 to 7 that can map to a conversion type: signup, complete checkout, etc. Report timing # If several conversions are registered for a given ad click, a corresponding report is sent for each conversion, up to a maximum of three per click. To prevent conversion time from being used to get more information from the conversion side and hence hinder users' privacy, this API specifies that conversion reports aren't sent immediately after a conversion happens. After the initial ad click, a schedule of reporting windows associated with this click begins. Each reporting window has a deadline, and conversions registered before that deadline will be sent at the end of that window. Reports may not be exactly sent at these scheduled dates and times: if the browser isn't running when a report is scheduled to be sent, the report is sent at browser startup—which could be days or weeks after the scheduled time. After expiry (click time + impressionexpiry), no conversion is counted—impressionexpiry is the cut-off date and time for when conversions can no longer be counted for this ad. In Chrome, report scheduling works as follows: impressionexpiry Depending on conversion time, a conversion report is sent (if the browser is open)... Number of reporting windows 30 days, the default and maximum value 2 days after the ad was clicked or 7 days after ad click or impressionexpiry = 30 days after ad click. 3 impressionexpiry is between 7 and 30 days 2 days after ad click or 7 days after ad click or impressionexpiry after ad click. 
3 impressionexpiry is between 2 and 7 days 2 days after ad click or impressionexpiry after ad click. 2 impressionexpiry is under 2 days 2 days after ad click 1 See Sending Scheduled Reports for more details on timing. Example # Try out the demo ⚡️ and see the corresponding code. Here's how the API records and reports a conversion. Note that this is how a click-to-convert flow would work with the current API. Future iterations of this API may be different. Ad click (steps 1 to 5) # An <a> ad element is loaded on a publisher site by adtech.example within an iframe. The adtech platform developers have configured the <a> element with conversion measurement attributes: <a id="ad" impressiondata="200400600" conversiondestination="https://advertiser.example" reportingorigin="https://adtech.example" impressionexpiry="864000000" href="https://advertiser.example/shoes07" > <img src="/images/shoe.jpg" alt="shoe" /> </a> This code specifies the following: Attribute Default value, maximum, minimum Example impressiondata (required): a 64-bit identifier to attach to an ad click. (no default) A dynamically generated click ID such as a 64-bit integer: 200400600 conversiondestination (required): the eTLD+1 where a conversion is expected for this ad. (no default) https://advertiser.example. If the conversiondestination is https://advertiser.example, conversions on both https://advertiser.example and https://shop.advertiser.example will be attributed. The same happens if the conversiondestination is https://shop.advertiser.example: conversions on both https://advertiser.example and https://shop.advertiser.example will be attributed. impressionexpiry (optional): in milliseconds, the cutoff time for when conversions can be attributed to this ad. 2592000000 = 30 days (in milliseconds). Maximum: 30 days (in milliseconds). Minimum: 2 days (in milliseconds). Ten days after click: 864000000 reportingorigin (optional): the destination for reporting confirmed conversions. 
Top-level origin of the page where the link element is added. https://adtech.example href: the intended destination of the ad click. / https://advertiser.example/shoes07 Some notes about the example: You will find the term "impression" used in the attributes of the API or in the API proposal, even though only clicks are supported for now. Names may be updated in future iterations of the API. The ad doesn't have to be in an iframe, but this is what this example is based on. Gotchas! Flows based on navigating via window.open or window.location won't be eligible for attribution. When the user taps or clicks the ad, they navigate to the advertiser's site. Once the navigation is committed, the browser stores an object that includes impressiondata, conversiondestination, reportingorigin, and impressionexpiry: { "impression-data": "200400600", "conversion-destination": "https://advertiser.example", "reporting-origin": "https://adtech.example", "impression-expiry": 864000000 } Conversion and report scheduling (steps 6 to 9) # Either directly after clicking the ad, or later on—for example, on the next day—the user visits advertiser.example, browses sports shoes, finds a pair they want to purchase, and proceeds to checkout. advertiser.example has included a pixel on the checkout page: <img height="1" width="1" src="https://adtech.example/conversion?model=shoe07&type=checkout&…" /> adtech.example receives this request, and decides that it qualifies as a conversion. They now need to request the browser to record a conversion. adtech.example compresses all of the conversion data into 3 bits—an integer between 0 and 7, for example they might map a Checkout action to a conversion value of 2. 
adtech.example then sends a specific register-conversion redirect to the browser: const conversionValues = { signup: 1, checkout: 2, }; app.get('/conversion', (req, res) => { const conversionData = conversionValues[req.query.type]; res.redirect( 302, `/.well-known/register-conversion?conversion-data=${conversionData}`, ); }); .well-known URLs are special URLs. They make it easy for software tools and servers to discover commonly-needed information or resources for a site, for example, on what page a user can change their password. Here, .well-known is only used so that the browser recognizes this as a special conversion request. This request is actually cancelled internally by the browser. The browser receives this request. Upon detecting .well-known/register-conversion, the browser: Looks up all ad clicks in storage that match this conversiondestination (because it's receiving this conversion on a URL that has been registered as a conversiondestination URL when the user clicked the ad). It finds the ad click that happened on the publisher's site one day before. Registers a conversion for this ad click. Several ad clicks can match a conversion: the user may have clicked an ad for shoes.example on both news.example and weather.example. In this case, several conversions are registered. Now, the browser knows that it needs to inform the adtech server of this conversion. More specifically, the browser must inform the reportingorigin that is specified in both the <a> element and in the pixel request (adtech.example). To do so, the browser schedules to send a conversion report, a blob of data containing the click data (from the publisher's site) and the conversion data (from the advertiser's). For this example, the user converted one day after click. So the report is scheduled to be sent on the next day, at the two-day-after-click mark if the browser is running. 
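The scheduling in this example can be reproduced with a small model of the reporting windows from the Report timing section. This is a sketch under stated assumptions rather than Chrome's implementation: reportingDeadlines and scheduledReportOffset are hypothetical names, all times are offsets in milliseconds from the ad click, and an impressionexpiry of exactly 7 days is assumed to produce two windows.

```javascript
const DAY = 24 * 60 * 60 * 1000;

// Hypothetical model: window deadlines at 2 days, 7 days, and expiry.
// Expiry is capped at the 30-day maximum; the last window never closes
// earlier than 2 days after the click.
function reportingDeadlines(impressionExpiry = 30 * DAY) {
  const cutoff = Math.min(impressionExpiry, 30 * DAY);
  const early = [2 * DAY, 7 * DAY].filter((deadline) => deadline < cutoff);
  return [...early, Math.max(cutoff, 2 * DAY)];
}

// Hypothetical model: when would the report for a conversion be sent?
// Returns null if the conversion happens after expiry and is not counted.
function scheduledReportOffset(conversionOffset, impressionExpiry = 30 * DAY) {
  const cutoff = Math.min(impressionExpiry, 30 * DAY);
  if (conversionOffset < 0 || conversionOffset > cutoff) return null;
  const deadline = reportingDeadlines(impressionExpiry).find(
    (d) => conversionOffset <= d,
  );
  return deadline ?? null;
}

// The article's example: impressionexpiry of ten days, conversion one day
// after the click, so the report is due at the two-day mark.
const due = scheduledReportOffset(1 * DAY, 864000000);
```

The actual send time can be later than the returned offset, since reports are only sent while the browser is running.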
Sending the report (steps 10 and 11) # Once the scheduled time to send the report is reached, the browser sends the conversion report: it sends an HTTP POST to the reporting origin that was specified in the <a> element (adtech.example). For example: https://adtech.example/.well-known/register-conversion?impression-data=200400600&conversion-data=2&credit=100 Included as parameters are: The data associated with the original ad click (impression-data). The data associated with a conversion, potentially noised. The conversion credit attributed to the click. This API follows a last-click attribution model: the most recent matching ad click is given a credit of 100, all other matching ad clicks are given a credit of 0. As the adtech server receives this request, it can pull the impression-data and conversion-data from it, i.e. the conversion report: {"impression-data": "200400600", "conversion-data": 2, "credit": 100} Subsequent conversions and expiry # Later on, the user may convert again, for example by purchasing a tennis racket on advertiser.example to go alongside their shoes. A similar flow takes place: The adtech server sends a conversion request to the browser. The browser matches this conversion with the ad click, schedules a report, and sends it to the adtech server later on. After impressionexpiry, conversions for this ad click stop being counted and the ad click is deleted from browser storage. Use cases # What is currently supported # Measure click-through conversions: determine which ad clicks lead to conversions, and access coarse information about the conversion. Gather data to optimize ad selection, for example by training machine learning models. What is not supported in this iteration # The following features aren't supported, but may be in future iterations of this API, or in Aggregate reports: View-through conversion measurement. Multiple reporting endpoints. Web conversions that started in an iOS/Android app. 
Conversion lift measurement / incrementality: measurement of causal differences in conversion behavior, by measuring the difference between a test group that saw an ad and a control group that didn't. Attribution models that are not last-click. Use cases that require larger amounts of information about the conversion event. For example, granular purchase values or product categories. Before these features and more can be supported, more privacy protections (noise, fewer bits, or other limitations) must be added to the API. Discussion of additional possible features takes place in the open, in the Issues of the API proposal repository. Is your use case missing? Do you have feedback on the API? Share it. What else may change in future iterations # This API is at an early, experimental stage. In future iterations, this API may undergo substantial changes including but not limited to the ones listed below. Its goal is to measure conversions while preserving user privacy, and any change that would help better address this use case will be made. API and attribute naming may evolve. Click data and conversion data may not require encoding. The 3-bit limit for conversion data may be increased or decreased. More features may be added, and more privacy protections (noise / fewer bits / other limitations) if needed to support these new features. To follow and participate in discussions on new features, watch the proposal's GitHub repository and submit ideas. Try it out # Demo # Try out the demo. Make sure to follow the "Before you start" instructions. Tweet @maudnals or @ChromiumDev for any question about the demo! Experiment with the API # If you're planning to experiment with the API (locally or with end users), see Using the conversion measurement API. Share your feedback # Your feedback is crucial, so that new conversion measurement APIs can support your use cases and provide a good developer experience. To report a bug on the Chrome implementation, open a bug. 
To share feedback and discuss use cases on the Chrome API, create a new issue or engage in existing ones on the API proposal repository. Similarly, you can discuss the WebKit/Safari API and its use cases on the API proposal repository. To discuss advertising use cases and exchange views with industry experts: join the Improving Web Advertising Business Group. Join the Privacy Community Group for discussions around the WebKit/Safari API. Keep an eye out # As developer feedback and use cases are gathered, the Event Conversion Measurement API will evolve over time. Watch the proposal's GitHub repository. Follow along the evolution of the Aggregate Conversion Measurement API that will complement this API. With many thanks for contributions and feedback to all reviewers—especially Charlie Harrison, John Delaney, Michael Kleber and Kayce Basques. Hero image by William Warby / @wawarby on Unsplash, edited.

Control camera pan, tilt, and zoom

Room-scale video conferencing solutions deploy cameras with pan, tilt, and zoom (PTZ) capabilities so that software can point the camera at meeting participants. Starting in Chrome 87, the pan, tilt, and zoom features on cameras are available to websites using media track constraints in MediaDevices.getUserMedia() and MediaStreamTrack.applyConstraints(). Using the API # Feature detection # Feature detection for hardware is different from what you're probably used to. The presence of "pan", "tilt", and "zoom" constraint names in navigator.mediaDevices.getSupportedConstraints() tells you that the browser supports the API to control camera PTZ, but not whether the camera hardware supports it. As of Chrome 87, controlling camera PTZ is supported on desktop, while Android still supports zoom only. const supports = navigator.mediaDevices.getSupportedConstraints(); if (supports.pan && supports.tilt && supports.zoom) { // Browser supports camera PTZ. } Request camera PTZ access # A website is allowed to control camera PTZ only if the user has explicitly granted the camera with PTZ permission through a prompt. To request camera PTZ access, call navigator.mediaDevices.getUserMedia() with the PTZ constraints as shown below. This will prompt the user to grant both regular camera and camera with PTZ permissions. Camera PTZ user prompt. The returned promise will resolve with a MediaStream object used to show the camera video stream to the user. If the camera does not support PTZ, the user will get a regular camera prompt. try { // User is prompted to grant both camera and PTZ access in a single call. // If camera doesn't support PTZ, it falls back to a regular camera prompt. const stream = await navigator.mediaDevices.getUserMedia({ // Website asks to control camera PTZ as well without altering the // current pan, tilt, and zoom settings. video: { pan: true, tilt: true, zoom: true } }); // Show camera video stream to user. 
document.querySelector("video").srcObject = stream; } catch (error) { // User denies prompt or matching media is not available. console.log(error); } A previously-granted camera permission, specifically one without PTZ access, does not automatically gain PTZ access if it becomes available. This is true even when the camera itself supports PTZ. The permission must be requested again. Fortunately, you can use the Permissions API to query and monitor the status of PTZ permission. try { const panTiltZoomPermissionStatus = await navigator.permissions.query({ name: "camera", panTiltZoom: true }); if (panTiltZoomPermissionStatus.state == "granted") { // User has granted access to the website to control camera PTZ. } panTiltZoomPermissionStatus.addEventListener("change", () => { // User has changed PTZ permission status. }); } catch (error) { console.log(error); } To know whether a Chromium-based browser supports PTZ for a camera, go to the internal about://media-internals page and check out the "Pan-Tilt-Zoom" column in the "Video Capture" tab; "pan tilt" and "zoom" respectively mean the camera supports the "PanTilt (Absolute)" and "Zoom (Absolute)" UVC controls. The "PanTilt (Relative)" and "Zoom (Relative)" UVC controls are not supported in Chromium-based browsers. Internal page to debug PTZ camera support. Control camera PTZ # Manipulate camera PTZ capabilities and settings using the preview MediaStreamTrack from the stream object obtained earlier. MediaStreamTrack.getCapabilities() returns a dictionary with the supported capabilities and the ranges or allowed values. Correspondingly, MediaStreamTrack.getSettings() returns the current settings. Pan, tilt, and zoom capabilities and settings are available only if supported by the camera and the user has granted PTZ permission to the camera. Controlling Camera PTZ. Call videoTrack.applyConstraints() with the appropriate PTZ advanced constraints to control camera pan, tilt, and zoom as shown in the example below. 
The returned promise will resolve if successful. Otherwise it will reject if any of the following is true: the camera with PTZ permission has not been granted. the camera hardware does not support the PTZ constraint. the page is not visible to the user. Use the Page Visibility API to detect page visibility changes. // Get video track capabilities and settings. const [videoTrack] = stream.getVideoTracks(); const capabilities = videoTrack.getCapabilities(); const settings = videoTrack.getSettings(); // Let the user control the camera pan motion if the camera supports it // and PTZ access is granted. if ("pan" in settings) { const input = document.querySelector("input[type=range]"); input.min = capabilities.pan.min; input.max = capabilities.pan.max; input.step = capabilities.pan.step; input.value = settings.pan; input.addEventListener("input", async () => { await videoTrack.applyConstraints({ advanced: [{ pan: input.value }] }); }); } if ("tilt" in settings) { // similar for tilt... } if ("zoom" in settings) { // similar for zoom... } It is also possible to configure camera pan, tilt, and zoom by calling navigator.mediaDevices.getUserMedia() with some camera PTZ ideal constraint values. This is handy when camera PTZ capabilities are known in advance. Note that mandatory constraints (min, max, exact) are not allowed here. const stream = await navigator.mediaDevices.getUserMedia({ // Website asks to reset known camera pan. video: { pan: 0, deviceId: { exact: "myCameraDeviceId" } } }); Playground # You can play with the API by running the demo on Glitch. Be sure to check out the source code. Tip: If you don't have a camera that supports PTZ, you can run Chrome with the switch --use-fake-device-for-media-stream to simulate one on your machine. Enjoy! Security Considerations # The spec authors have designed and implemented this API using core principles, including user control, transparency, and ergonomics.
The ability to use this API is primarily gated by the same permission model as the Media Capture and Streams API. In response to a user prompt, the website is allowed to control camera PTZ only when the page is visible to the user. Helpful links # PTZ Explainer Specification draft GitHub repository ChromeStatus entry Chrome tracking bug Acknowledgements # This article was reviewed by Joe Medley and Thomas Steiner. Thanks to Rijubrata Bhaumik and Eero Häkkinen at Intel for their work on the spec and the implementation. Hero image by Christina @ wocintechchat.com on Unsplash.

What are third-party origin trials?

Origin trials are a way to test a new or experimental web platform feature. Origin trials are usually only available on a first-party basis: they only work for a single registered origin. If a developer wants to test an experimental feature on other origins where their content is embedded, those origins all need to be registered for the origin trial, each with a unique trial token. This is not a scalable approach for testing scripts that are embedded across a number of sites. Third-party origin trials make it possible for providers of embedded content to try out a new feature across multiple sites. Third-party origin trials don't make sense for all features. Chrome will only make the third-party origin trial option available for features where embedding code on third-party sites is a common use case. Getting started with Chrome's origin trials provides more general information about how to participate in Chrome origin trials. If you participate in an origin trial as a third-party provider, it will be your responsibility to notify and set expectations with any partners or customers whose sites you intend to include in the origin trial. Experimental features may cause unexpected issues and browser vendors may not be able to provide troubleshooting support. Supporting third-party origin trials allows for broader participation, but also increases the potential for overuse or abuse of experimental features, so a "trusted tester" approach is more appropriate. The greater reach of third-party origin trials requires additional scrutiny and additional responsibility for web developers that participate as third-party providers. Requests to enable a third-party origin trial may be reviewed in order to avoid problematic third-party scripts affecting multiple sites. The Origin Trials Developer Guide explains the approval process. Check Chrome Platform Status for updates on progress with third-party origin trials. 
How to register for a third-party origin trial # Select a trial from the list of active trials. On the trial's registration page, enable the option to request a third-party token, if available. Select one of the choices for restricting usage for a third-party token: Standard Limit: This is the usual limit of 0.5% of Chrome page loads. User Subset: A small percentage of Chrome users will always be excluded from the trial, even when a valid third-party token is provided. The exclusion percentage varies (or might not apply) for each trial, but is typically less than 5%. Click the Register button to submit your request. Your third-party token will be issued immediately, unless further review of the request is required. (Depending on the trial, token requests may require review.) If review is required, you'll be notified by email when the review is complete and your third-party token is ready. Registration page for the Conversion Measurement trial. How to provide feedback # If you're registering for a third-party origin trial and have feedback to share on the process or ideas on how we can improve it, please create an issue on the Origin Trials GitHub code repo. Find out more # Getting started with Chrome's origin trials Origin Trials Guide for Web Developers Chrome Platform Status Photo by Louis Reed on Unsplash.

Declarative Shadow DOM

Declarative Shadow DOM is a proposed web platform feature that the Chrome team is looking for feedback on. Try it out using the experimental flag or polyfill. Shadow DOM is one of the three Web Components standards, rounded out by HTML templates and Custom Elements. Shadow DOM provides a way to scope CSS styles to a specific DOM subtree and isolate that subtree from the rest of the document. The <slot> element gives us a way to control where the children of a Custom Element should be inserted within its Shadow Tree. These features combined enable a system for building self-contained, reusable components that integrate seamlessly into existing applications just like a built-in HTML element. Until now, the only way to use Shadow DOM was to construct a shadow root using JavaScript: const host = document.getElementById('host'); const shadowRoot = host.attachShadow({mode: 'open'}); shadowRoot.innerHTML = '<h1>Hello Shadow DOM</h1>'; An imperative API like this works fine for client-side rendering: the same JavaScript modules that define our Custom Elements also create their Shadow Roots and set their content. However, many web applications need to render content server-side or to static HTML at build time. This can be an important part of delivering a reasonable experience to visitors who may not be capable of running JavaScript. The justifications for Server-Side Rendering (SSR) vary from project to project. Some websites must provide fully functional server-rendered HTML in order to meet accessibility guidelines, others choose to deliver a baseline no-JavaScript experience as a way to guarantee good performance on slow connections or devices. Historically, it has been difficult to use Shadow DOM in combination with Server-Side Rendering because there was no built-in way to express Shadow Roots in the server-generated HTML. There are also performance implications when attaching Shadow Roots to DOM elements that have already been rendered without them. 
This can cause layout shifting after the page has loaded, or temporarily show a flash of unstyled content ("FOUC") while loading the Shadow Root's stylesheets. Declarative Shadow DOM (DSD) removes this limitation, bringing Shadow DOM to the server. Building a Declarative Shadow Root # A Declarative Shadow Root is a <template> element with a shadowroot attribute: <host-element> <template shadowroot="open"> <slot></slot> </template> <h2>Light content</h2> </host-element> A template element with the shadowroot attribute is detected by the HTML parser and immediately applied as the shadow root of its parent element. Loading the pure HTML markup from the above sample results in the following DOM tree: <host-element> #shadow-root (open) <slot> ↳ <h2>Light content</h2> </slot> </host-element> This code sample is following the Chrome DevTools Elements panel's conventions for displaying Shadow DOM content. For example, the ↳ character represents slotted Light DOM content. This gives us the benefits of Shadow DOM's encapsulation and slot projection in static HTML. No JavaScript is needed to produce the entire tree, including the Shadow Root. Serialization # In addition to introducing the new <template> syntax for creating shadow roots and attaching them to elements, Declarative Shadow DOM also includes a new API for getting the HTML contents of an element. The new getInnerHTML() method works like .innerHTML, but provides an option to control whether shadow roots should be included in the returned HTML: const html = element.getInnerHTML({includeShadowRoots: true}); `<host-element> <template shadowroot="open"><slot></slot></template> <h2>Light content</h2> </host-element>`; Passing the includeShadowRoots:true option serializes the entire subtree of an element, including its shadow roots. The included shadow roots are serialized using the <template shadowroot> syntax.
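Because getInnerHTML() is new and not available in every browser, serialization code may want to fall back to innerHTML where the method is missing. The following is a minimal sketch under that assumption; `serializeSubtree` is a hypothetical helper name, and in the fallback branch shadow roots are simply left out of the output.

```javascript
// Serializes an element's contents, including shadow roots when the
// browser exposes getInnerHTML(); otherwise falls back to innerHTML.
function serializeSubtree(element) {
  if (typeof element.getInnerHTML === 'function') {
    return element.getInnerHTML({ includeShadowRoots: true });
  }
  return element.innerHTML;
}
```

In a page this could be called as `serializeSubtree(document.body)`. Callers that need to know which branch ran can perform the same `typeof` check themselves.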
In order to preserve encapsulation semantics, any closed shadow roots within an element will not be serialized by default. To include closed shadow roots in the serialized HTML, an array of references to those shadow roots can be passed via a new closedRoots option: const html = element.getInnerHTML({ includeShadowRoots: true, closedRoots: [shadowRoot1, shadowRoot2, ...] }); When serializing the HTML within an element, any closed shadow roots that are present in the closedRoots array will be serialized using the same template syntax as open shadow roots: <host-element> <template shadowroot="closed"> <slot></slot> </template> <h2>Light content</h2> </host-element> Serialized closed shadow roots are indicated by a shadowroot attribute with a value of closed. Component hydration # Declarative Shadow DOM can be used on its own as a way to encapsulate styles or customize child placement, but it's most powerful when used with Custom Elements. Components built using Custom Elements get automatically upgraded from static HTML. With the introduction of Declarative Shadow DOM, it's now possible for a Custom Element to have a shadow root before it gets upgraded. A Custom Element being upgraded from HTML that includes a Declarative Shadow Root will already have that shadow root attached. This means the element will have a shadowRoot property already available when it is instantiated, without your code explicitly creating one. It's best to check this.shadowRoot for any existing shadow root in your element's constructor. If there is already a value, the HTML for this component included a Declarative Shadow Root. If the value is null, there was no Declarative Shadow Root present in the HTML or the browser doesn't support Declarative Shadow DOM. 
<menu-toggle> <template shadowroot="open"> <button> <slot></slot> </button> </template> Open Menu </menu-toggle> <script> class MenuToggle extends HTMLElement { constructor() { super(); // Detect whether we have SSR content already: if (this.shadowRoot) { // A Declarative Shadow Root exists! // wire up event listeners, references, etc.: const button = this.shadowRoot.firstElementChild; button.addEventListener('click', toggle); } else { // A Declarative Shadow Root doesn't exist. // Create a new shadow root and populate it: const shadow = this.attachShadow({mode: 'open'}); shadow.innerHTML = `<button><slot></slot></button>`; shadow.firstChild.addEventListener('click', toggle); } } } customElements.define('menu-toggle', MenuToggle); </script> Custom Elements have been around for a while, and until now there was no reason to check for an existing shadow root before creating one using attachShadow(). Declarative Shadow DOM includes a small change that allows existing components to work despite this: calling the attachShadow() method on an element with an existing Declarative Shadow Root will not throw an error. Instead, the Declarative Shadow Root is emptied and returned. This allows older components not built for Declarative Shadow DOM to continue working, since declarative roots are preserved until an imperative replacement is created. For newly-created Custom Elements, a new ElementInternals.shadowRoot property provides an explicit way to get a reference to an element's existing Declarative Shadow Root, both open and closed. This can be used to check for and use any Declarative Shadow Root, while still falling back to attachShadow() in cases where one was not provided. class MenuToggle extends HTMLElement { constructor() { super(); const internals = this.attachInternals(); // check for a Declarative Shadow Root: let shadow = internals.shadowRoot; if (!shadow) { // there wasn't one.
create a new Shadow Root: shadow = this.attachShadow({mode: 'open'}); shadow.innerHTML = `<button><slot></slot></button>`; } // in either case, wire up our event listener: shadow.firstChild.addEventListener('click', toggle); } } customElements.define('menu-toggle', MenuToggle); One shadow per root # A Declarative Shadow Root is only associated with its parent element. This means shadow roots are always colocated with their associated element. This design decision ensures shadow roots are streamable like the rest of an HTML document. It's also convenient for authoring and generation, since adding a shadow root to an element does not require maintaining a registry of existing shadow roots. The tradeoff of associating shadow roots with their parent element is that it is not possible for multiple elements to be initialized from the same Declarative Shadow Root <template>. However, this is unlikely to matter in most cases where Declarative Shadow DOM is used, since the contents of each shadow root are seldom identical. While server-rendered HTML often contains repeated element structures, their content generally differs–slight variations in text, attributes, etc. Because the contents of a serialized Declarative Shadow Root are entirely static, upgrading multiple elements from a single Declarative Shadow Root would only work if the elements happened to be identical. Finally, the impact of repeated similar shadow roots on network transfer size is relatively small due to the effects of compression. In the future, it might be possible to revisit shared shadow roots. If the DOM gains support for built-in templating, Declarative Shadow Roots could be treated as templates that are instantiated in order to construct the shadow root for a given element. The current Declarative Shadow DOM design allows for this possibility to exist in the future by limiting shadow root association to a single element. 
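Since every shadow root sits inside its own host element, server code can emit each component independently, with no shared registry to maintain. The sketch below shows one way that might look with plain JavaScript string templating; `renderMenuToggle` and `escapeHtml` are hypothetical helpers, not part of any API described in this article.

```javascript
// Escapes text so it can be placed safely inside HTML element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Each call emits a host element with its own colocated shadow root.
function renderMenuToggle(label) {
  return (
    '<menu-toggle>' +
    '<template shadowroot="open"><button><slot></slot></button></template>' +
    escapeHtml(label) +
    '</menu-toggle>'
  );
}

const html = ['Open Menu', 'Close <all>'].map(renderMenuToggle).join('\n');
console.log(html);
```

The template markup is repeated for every element, which is the tradeoff discussed above: only the light DOM content differs between the two hosts, and the repeated part is the kind of redundancy that compression handles well.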
Timing is everything # Associating Declarative Shadow Roots directly with their parent element simplifies the process of upgrading and attaching them to that element. Declarative Shadow Roots are detected during HTML parsing, and attached immediately when their closing </template> tag is encountered. <div id="el"> <script> el.shadowRoot; // null </script> <template shadowroot="open"> <!-- shadow realm --> </template> <script> el.shadowRoot; // ShadowRoot </script> </div> Prior to being attached, the contents of a <template> element with the shadowroot attribute are an inert Document Fragment and are not accessible via the .content property like a standard template. This security measure prevents JavaScript from being able to obtain a reference to closed shadow roots. As a result, the contents of a Declarative Shadow Root are not rendered until its closing </template> tag is parsed. <div> <template id="shadow" shadowroot="open"> shadow realm <script> shadow.content; // null </script> </template> </div> Parser-only # Declarative Shadow DOM is a feature of the HTML parser. This means that a Declarative Shadow Root will only be parsed and attached for <template> tags with a shadowroot attribute that are present during HTML parsing. In other words, Declarative Shadow Roots can be constructed during initial HTML parsing: <some-element> <template shadowroot="open"> shadow root content for some-element </template> </some-element> Setting the shadowroot attribute of a <template> element does nothing, and the template remains an ordinary template element: const div = document.createElement('div'); const template = document.createElement('template'); template.setAttribute('shadowroot', 'open'); // this does nothing div.appendChild(template); div.shadowRoot; // null To avoid some important security considerations, Declarative Shadow Roots also can't be created using fragment parsing APIs like innerHTML or insertAdjacentHTML(). 
The only way to parse HTML with Declarative Shadow Roots applied is to pass a new includeShadowRoots option to DOMParser: <script> const html = ` <div> <template shadowroot="open"></template> </div> `; const div = document.createElement('div'); div.innerHTML = html; // No shadow root here const fragment = new DOMParser().parseFromString(html, 'text/html', { includeShadowRoots: true }); // Shadow root here </script> Server-rendering with style # Inline and external stylesheets are fully supported inside Declarative Shadow Roots using the standard <style> and <link> tags: <nineties-button> <template shadowroot="open"> <style> button { color: seagreen; } </style> <link rel="stylesheet" href="/comicsans.css" /> <button> <slot></slot> </button> </template> I'm Blue </nineties-button> Styles specified this way are also highly optimized: if the same stylesheet is present in multiple Declarative Shadow Roots, it is only loaded and parsed once. The browser uses a single backing CSSStyleSheet that is shared by all of the shadow roots, eliminating duplicate memory overhead. Constructable Stylesheets are not supported in Declarative Shadow DOM. This is because there is currently no way to serialize constructable stylesheets in HTML, and no way to refer to them when populating adoptedStyleSheets. Feature detection and browser support # Declarative Shadow DOM is available in Chrome 90 and Edge 91. It can also be enabled using the Experimental Web Platform Features flag in Chrome 85. Navigate to chrome://flags/#enable-experimental-web-platform-features to find that setting. As a new web platform API, Declarative Shadow DOM does not yet have widespread support across all browsers. 
Browser support can be detected by checking for the existence of a shadowRoot property on the prototype of HTMLTemplateElement: function supportsDeclarativeShadowDOM() { return HTMLTemplateElement.prototype.hasOwnProperty('shadowRoot'); } Polyfill # Building a simplified polyfill for Declarative Shadow DOM is relatively straightforward, since a polyfill doesn't need to perfectly replicate the timing semantics or parser-only characteristics that a browser implementation concerns itself with. To polyfill Declarative Shadow DOM, we can scan the DOM to find all <template shadowroot> elements, then convert them to attached Shadow Roots on their parent element. This process can be done once the document is ready, or triggered by more specific events like Custom Element lifecycles. document.querySelectorAll('template[shadowroot]').forEach(template => { const mode = template.getAttribute('shadowroot'); const shadowRoot = template.parentNode.attachShadow({ mode }); shadowRoot.appendChild(template.content); template.remove(); }); Further Reading # Explainer with alternatives and performance analysis Chromestatus for Declarative Shadow DOM Intent to Prototype

How Goibibo's PWA improved conversions by 60%

Goibibo is India's leading online travel booking portal. By building a full-featured and reliable Progressive Web App that matched the capabilities of their iOS and Android apps, Goibibo achieved a 60% increase in conversions (compared to their previous web flow). 60% Increase in conversions 20% Increase in logged-in users Highlighting the opportunity # In their journey to improve user experience, Goibibo noticed a few trends: With users either already shifted or quickly shifting to mobile, their initial strategy towards mobile web was to build a lightweight and functional application. This worked, with search-to-details-page conversions equalizing on web and iOS/Android, but the iOS/Android apps won in all further steps of the conversion funnel. There were significant drop-offs at the payment stage of the PWA compared to their iOS/Android apps. This was when they decided to invest in their PWA with the goal of letting users experience the same UX on their PWA as on their iOS/Android apps. They also noticed nearly 20% of their users were starting a session on the web and converting on the app. This reinforced their belief that a chunk of users will go untapped without an aligned PWA and iOS/Android app strategy. The tools they used # Contact Picker API # Contact Picker API to enable PWA users to fill in forms on behalf of others hassle-free. WebOTP # WebOTP (One-Time Password) API to reduce sign-in friction on their PWA. Web Share API # Web Share API to make it easier to share links, text, or files around hotel details, train availability, and so on. Push notifications # Web push notifications to retarget bounced users with relevant updates like flight fare alerts and other customized content. How new web capabilities improved Goibibo's funnel # Overall business results # Iterations to PWA interfaces resulted in a 60% jump in conversion rate (compared to the previous mobile web flow) and delighted users.
New web capabilities improved UX and caused a 20% increase in logged-in users (who convert 6x more). Rithish Saralaya, VP Engineering, Goibibo Check out the Scale on web case studies page for more success stories from India and Southeast Asia.

Detached window memory leaks

What's a memory leak in JavaScript? # A memory leak is an unintentional increase in the amount of memory used by an application over time. In JavaScript, memory leaks happen when objects are no longer needed, but are still referenced by functions or other objects. These references prevent the unneeded objects from being reclaimed by the garbage collector. The job of the garbage collector is to identify and reclaim objects that are no longer reachable from the application. This works even when objects reference themselves, or cyclically reference each other–once there are no remaining references through which an application could access a group of objects, it can be garbage collected. let A = {}; console.log(A); // local variable reference let B = {A}; // B.A is a second reference to A A = null; // unset local variable reference console.log(B.A); // A can still be referenced by B B.A = null; // unset B's reference to A // No references to A are left. It can be garbage collected. A particularly tricky class of memory leak occurs when an application references objects that have their own lifecycle, like DOM elements or popup windows. It's possible for these types of objects to become unused without the application knowing, which means application code may have the only remaining references to an object that could otherwise be garbage collected. What's a detached window? # In the following example, a slideshow viewer application includes buttons to open and close a presenter notes popup. Imagine a user clicks Show Notes, then closes the popup window directly instead of clicking the Hide Notes button–the notesWindow variable still holds a reference to the popup that could be accessed, even though the popup is no longer in use. 
<button id="show">Show Notes</button> <button id="hide">Hide Notes</button> <script type="module"> let notesWindow; document.getElementById('show').onclick = () => { notesWindow = window.open('/presenter-notes.html'); }; document.getElementById('hide').onclick = () => { if (notesWindow) notesWindow.close(); }; </script> This is an example of a detached window. The popup window was closed, but our code has a reference to it that prevents the browser from being able to destroy it and reclaim that memory. When a page calls window.open() to create a new browser window or tab, a Window object is returned that represents the window or tab. Even after such a window has been closed or the user has navigated it away, the Window object returned from window.open() can still be used to access information about it. This is one type of detached window: because JavaScript code can still potentially access properties on the closed Window object, it must be kept in memory. If the window included a lot of JavaScript objects or iframes, that memory can't be reclaimed until there are no remaining JavaScript references to the window's properties. The same issue can also occur when using <iframe> elements. Iframes behave like nested windows that contain documents, and their contentWindow property provides access to the contained Window object, much like the value returned by window.open(). JavaScript code can keep a reference to an iframe's contentWindow or contentDocument even if the iframe is removed from the DOM or its URL changes, which prevents the document from being garbage collected since its properties can still be accessed. In cases where a reference to the document within a window or iframe is retained from JavaScript, that document will be kept in-memory even if the containing window or iframe navigates to a new URL. 
This can be particularly troublesome when the JavaScript holding that reference doesn't detect that the window/frame has navigated to a new URL, since it doesn't know when it becomes the last reference keeping a document in memory. How detached windows cause memory leaks # When working with windows and iframes on the same domain as the primary page, it's common to listen for events or access properties across document boundaries. For example, let's revisit a variation on the presentation viewer example from the beginning of this guide. The viewer opens a second window for displaying speaker notes. The speaker notes window listens for click events as its cue to move to the next slide. If the user closes this notes window, the JavaScript running in the original parent window still has full access to the speaker notes document: <button id="notes">Show Presenter Notes</button> <script type="module"> let notesWindow; function showNotes() { notesWindow = window.open('/presenter-notes.html'); notesWindow.document.addEventListener('click', nextSlide); } document.getElementById('notes').onclick = showNotes; let slide = 1; function nextSlide() { slide += 1; notesWindow.document.title = `Slide ${slide}`; } document.body.onclick = nextSlide; </script> Imagine we close the browser window created by showNotes() above. There's no event handler listening to detect that the window has been closed, so nothing is informing our code that it should clean up any references to the document. The nextSlide() function is still "live" because it is bound as a click handler in our main page, and the fact that nextSlide contains a reference to notesWindow means the window is still referenced and can't be garbage collected. See Solution: communicate over postMessage to learn how to fix this particular memory leak.
There are a number of other scenarios where references are accidentally retained that prevent detached windows from being eligible for garbage collection: Event handlers can be registered on an iframe's initial document prior to the frame navigating to its intended URL, resulting in accidental references to the document and the iframe persisting after other references have been cleaned up. A memory-heavy document loaded in a window or iframe can be accidentally kept in-memory long after navigating to a new URL. This is often caused by the parent page retaining references to the document in order to allow for listener removal. When passing a JavaScript object to another window or iframe, the object's prototype chain includes references to the environment it was created in, including the window that created it. This means it's just as important to avoid holding references to objects from other windows as it is to avoid holding references to the windows themselves. index.html: <script> let currentFiles; function load(files) { // this retains the popup: currentFiles = files; } window.open('upload.html'); </script> upload.html: <input type="file" id="file" /> <script> file.onchange = () => { // a popup reaches the page that opened it through `opener`: opener.load(file.files); }; </script> Detecting memory leaks caused by detached windows # Tracking down memory leaks can be tricky. It is often difficult to construct isolated reproductions of these issues, particularly when multiple documents or windows are involved. To make things more complicated, inspecting potential leaked references can end up creating additional references that prevent the inspected objects from being garbage collected. To that end, it's useful to start with tools that specifically avoid introducing this possibility. A great place to start debugging memory problems is to take a heap snapshot. This provides a point-in-time view into the memory currently used by an application - all the objects that have been created but not yet garbage-collected.
Heap snapshots contain useful information about objects, including their size and a list of the variables and closures that reference them. To record a heap snapshot, head over to the Memory tab in Chrome DevTools and select Heap Snapshot in the list of available profiling types. Once the recording has finished, the Summary view shows current objects in-memory, grouped by constructor. Analyzing heap dumps can be a daunting task, and it can be quite difficult to find the right information as part of debugging. To help with this, Chromium engineers yossik@ and peledni@ developed a standalone Heap Cleaner tool that can help highlight a specific node like a detached window. Running Heap Cleaner on a trace removes other unnecessary information from the retention graph, which makes the trace cleaner and much easier to read. Measure memory programmatically # Heap snapshots provide a high level of detail and are excellent for figuring out where leaks occur, but taking a heap snapshot is a manual process. Another way to check for memory leaks is to obtain the currently used JavaScript heap size from the performance.memory API. The performance.memory API only provides information about the JavaScript heap size, which means it doesn't include memory used by the popup's document and resources. To get the full picture, we'd need to use the new performance.measureUserAgentSpecificMemory() API currently being trialled in Chrome. Solutions for avoiding detached window leaks # The two most common cases where detached windows cause memory leaks are when the parent document retains references to a closed popup or removed iframe, and when unexpected navigation of a window or iframe results in event handlers never being unregistered. Example: Closing a popup # The unset references, monitor and dispose, and WeakRef solutions are all based off of this example.
In the following example, two buttons are used to open and close a popup window. In order for the Close Popup button to work, a reference to the opened popup window is stored in a variable: <button id="open">Open Popup</button> <button id="close">Close Popup</button> <script> let popup; open.onclick = () => { popup = window.open('/login.html'); }; close.onclick = () => { popup.close(); }; </script> At first glance, it seems like the above code avoids common pitfalls: no references to the popup's document are retained, and no event handlers are registered on the popup window. However, once the Open Popup button is clicked the popup variable now references the opened window, and that variable is accessible from the scope of the Close Popup button click handler. Unless popup is reassigned or the click handler removed, that handler's enclosed reference to popup means it can't be garbage-collected. Solution: Unset references # Variables that reference another window or its document cause it to be retained in memory. Since objects in JavaScript are always references, assigning a new value to variables removes their reference to the original object. To "unset" references to an object, we can reassign those variables to the value null. Applying this to the previous popup example, we can modify the close button handler to make it "unset" its reference to the popup window: let popup; open.onclick = () => { popup = window.open('/login.html'); }; close.onclick = () => { popup.close(); popup = null; }; This helps, but reveals a further problem specific to windows created using open(): what if the user closes the window instead of clicking our custom close button? Further still, what if the user starts browsing to other websites in the window we opened? While it originally seemed sufficient to unset the popup reference when clicking our close button, there is still a memory leak when users don't use that particular button to close the window. 
Solving this requires detecting these cases in order to unset lingering references when they occur. Solution: Monitor and dispose # In many situations, the JavaScript responsible for opening windows or creating frames does not have exclusive control over their lifecycle. Popups can be closed by the user, or navigation to a new document can cause the document previously contained by a window or frame to become detached. In both cases, the browser fires a pagehide event to signal that the document is being unloaded. Caution: There's another event called unload, which is similar to pagehide but is considered harmful and should be avoided as much as possible. See Legacy lifecycle APIs to avoid: the unload event for details. The pagehide event can be used to detect closed windows and navigation away from the current document. However, there is one important caveat: all newly-created windows and iframes contain an empty document, then asynchronously navigate to the given URL if provided. As a result, an initial pagehide event is fired shortly after creating the window or frame, just before the target document has loaded. Since our reference cleanup code needs to run when the target document is unloaded, we need to ignore this first pagehide event. There are a number of techniques for doing so, the simplest of which is to ignore pagehide events originating from the initial document's about:blank URL. Here's how it would look in our popup example: let popup; open.onclick = () => { popup = window.open('/login.html'); // listen for the popup being closed/exited: popup.addEventListener('pagehide', () => { // ignore initial event fired on "about:blank": if (!popup.location.host) return; // remove our reference to the popup window: popup = null; }); }; It's important to note that this technique only works for windows and frames that have the same effective origin as the parent page where our code is running.
When loading content from a different origin, both location.host and the pagehide event are unavailable for security reasons. While it's generally best to avoid keeping references to other origins, in the rare cases where this is required it is possible to monitor the window.closed or frame.isConnected properties. When these properties change to indicate a closed window or removed iframe, it's a good idea to unset any references to it. let popup = window.open('https://example.com'); let timer = setInterval(() => { if (popup.closed) { popup = null; clearInterval(timer); } }, 1000); Solution: Use WeakRef # WeakRef is a new feature of the JavaScript language, available in desktop Firefox since version 79 and Chromium-based browsers since version 84. Since it's not yet widely-supported, this solution is better suited to tracking down and debugging issues rather than fixing them for production. JavaScript recently gained support for a new way to reference objects that allows garbage collection to take place, called WeakRef. A WeakRef created for an object is not a direct reference, but rather a separate object that provides a special .deref() method that returns a reference to the object as long as it has not been garbage-collected. With WeakRef, it is possible to access the current value of a window or document while still allowing it to be garbage collected. Instead of retaining a reference to the window that must be manually unset in response to events like pagehide or properties like window.closed, access to the window is obtained as-needed. When the window is closed, it can be garbage collected, causing the .deref() method to begin returning undefined. 
<button id="open">Open Popup</button> <button id="close">Close Popup</button> <script> let popup; open.onclick = () => { popup = new WeakRef(window.open('/login.html')); }; close.onclick = () => { const win = popup.deref(); if (win) win.close(); }; </script> One interesting detail to consider when using WeakRef to access windows or documents is that the reference generally remains available for a short period of time after the window is closed or iframe removed. This is because WeakRef continues returning a value until its associated object has been garbage-collected, which happens asynchronously in JavaScript and generally during idle time. Thankfully, when checking for detached windows in the Chrome DevTools Memory panel, taking a heap snapshot actually triggers garbage collection and disposes the weakly-referenced window. It's also possible to check that an object referenced via WeakRef has been disposed from JavaScript, either by detecting when deref() returns undefined or using the new FinalizationRegistry API: let popup = new WeakRef(window.open('/login.html')); // Polling deref(): let timer = setInterval(() => { if (popup.deref() === undefined) { console.log('popup was garbage-collected'); clearInterval(timer); } }, 20); // FinalizationRegistry API: let finalizers = new FinalizationRegistry(() => { console.log('popup was garbage-collected'); }); finalizers.register(popup.deref()); Solution: Communicate over postMessage # Detecting when windows are closed or navigation unloads a document gives us a way to remove handlers and unset references so that detached windows can be garbage collected. However, these changes are specific fixes for what can sometimes be a more fundamental concern: direct coupling between pages. A more holistic alternative approach is available that avoids stale references between windows and documents: establishing separation by limiting cross-document communication to postMessage(). 
Thinking back to our original presenter notes example, functions like nextSlide() updated the notes window directly by referencing it and manipulating its content. Instead, the primary page could pass the necessary information to the notes window asynchronously and indirectly over postMessage(). let updateNotes; function showNotes() { // keep the popup reference in a closure to prevent outside references: let win = window.open('/presenter-view.html'); win.addEventListener('pagehide', () => { if (!win || !win.location.host) return; // ignore initial "about:blank" win = null; }); // other functions must interact with the popup through this API: updateNotes = (data) => { if (!win) return; win.postMessage(data, location.origin); }; // listen for messages from the notes window: addEventListener('message', (event) => { if (event.source !== win) return; if (event.data[0] === 'nextSlide') nextSlide(); }); } let slide = 1; function nextSlide() { slide += 1; // if the popup is open, tell it to update without referencing it: if (updateNotes) { updateNotes(['setSlide', slide]); } } document.body.onclick = nextSlide; While this still requires the windows to reference each other, neither retains a reference to the current document from another window. A message-passing approach also encourages designs where window references are held in a single place, meaning only a single reference needs to be unset when windows are closed or navigate away. In the above example, only showNotes() retains a reference to the notes window, and it uses the pagehide event to ensure that reference is cleaned up. Solution: Avoid references using noopener # In cases where a popup window is opened that your page doesn't need to communicate with or control, you may be able to avoid ever obtaining a reference to the window. This is particularly useful when creating windows or iframes that will load content from another site. 
For these cases, window.open() accepts a "noopener" option that works just like the rel="noopener" attribute for HTML links: window.open('https://example.com/share', null, 'noopener'); The "noopener" option causes window.open() to return null, making it impossible to accidentally store a reference to the popup. It also prevents the popup window from getting a reference to its parent window, since the window.opener property will be null. Feedback # Hopefully some of the suggestions in this article help with finding and fixing memory leaks. If you have another technique for debugging detached windows or this article helped uncover leaks in your app, I'd love to know! You can find me on Twitter @_developit.

Content delivery networks (CDNs)

Content delivery networks (CDNs) improve site performance by using a distributed network of servers to deliver resources to users. Because CDNs reduce server load, they reduce server costs and are well-suited to handling traffic spikes. This article discusses how CDNs work and provides platform-agnostic guidance on choosing, configuring, and optimizing a CDN setup. Overview # A content delivery network consists of a network of servers that are optimized for quickly delivering content to users. Although CDNs are arguably best known for serving cached content, CDNs can also improve the delivery of uncacheable content. Generally speaking, the more of your site delivered by your CDN, the better. At a high-level, the performance benefits of CDNs stem from a handful of principles: CDN servers are located closer to users than origin servers and therefore have a shorter round-trip time (RTT) latency; networking optimizations allow CDNs to deliver content more quickly than if the content was loaded "directly" from the origin server; lastly, CDN caches eliminate the need for a request to travel to the origin server. Key Term: Origin server refers to the server that a CDN retrieves content from. Resource delivery # Although it may seem non-intuitive, using a CDN to deliver resources (even uncacheable ones) will typically be faster than having the user load the resource "directly" from your servers. When a CDN is used to deliver resources from the origin, a new connection is established between the client and a nearby CDN server. The remainder of the journey (in other words, the data transfer between the CDN server and origin) occurs over the CDN's network - which often includes existing, persistent connections with the origin. 
The benefits of this are twofold: terminating the new connection as close to the user as possible eliminates unnecessary connection setup costs (establishing a new connection is expensive and requires multiple roundtrips); using a pre-warmed connection allows data to be immediately transferred at the maximum possible throughput. Some CDNs improve upon this even further by routing traffic to the origin through multiple CDN servers spread across the Internet. Connections between CDN servers occur over reliable and highly optimized routes, rather than routes determined by the Border Gateway Protocol (BGP). Although BGP is the internet's de facto routing protocol, its routing decisions are not always performance-oriented. Therefore, BGP-determined routes are likely to be less performant than the finely-tuned routes between CDN servers. Caching # Caching resources on a CDN's servers eliminates the need for a request to travel all the way to the origin in order to be served. As a result, the resource is delivered more quickly; this also reduces the load on the origin server. Adding resources to the cache # The most commonly used method of populating CDN caches is to have the CDN "pull" resources as they are needed - this is known as "origin pull". The first time that a particular resource is requested from the cache the CDN will request it from the origin server and cache the response. In this manner, the contents of the cache are built-up over time as additional uncached resources are requested. Removing resources from the cache # CDNs use cache eviction to periodically remove not-so-useful resources from the cache. In addition, site owners can use purging to explicitly remove resources. Cache eviction Caches have a finite storage capacity. When a cache nears its capacity, it makes room for new resources by removing resources that haven't been accessed recently, or which take up a lot of space. This process is known as cache eviction. 
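To make the idea of eviction concrete, here is a minimal sketch of a least-recently-used (LRU) cache. It is only an illustration: `LruCache` and `maxEntries` are hypothetical names, the cache is bounded by entry count rather than bytes, and real CDNs use their own (often more elaborate) policies that also weigh resource size.

```javascript
// Minimal LRU cache sketch. `LruCache` and `maxEntries` are illustrative names,
// not part of any CDN's API.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    // A Map iterates in insertion order, so its first key is always
    // the least recently used entry.
    this.entries = new Map();
  }

  get(url) {
    if (!this.entries.has(url)) return undefined; // cache miss
    const value = this.entries.get(url);
    // Re-insert the entry to mark it as most recently used.
    this.entries.delete(url);
    this.entries.set(url, value);
    return value;
  }

  set(url, value) {
    if (this.entries.has(url)) this.entries.delete(url);
    this.entries.set(url, value);
    if (this.entries.size > this.maxEntries) {
      // Over capacity: evict the least recently used entry.
      const oldest = this.entries.keys().next().value;
      this.entries.delete(oldest);
    }
  }
}

const cache = new LruCache(2);
cache.set('/a.css', 'A');
cache.set('/b.js', 'B');
cache.get('/a.css');      // '/a.css' becomes the most recently used entry
cache.set('/c.png', 'C'); // over capacity, so '/b.js' is evicted
```

After the last call, a lookup for `/b.js` is a miss and would have to be fetched from the origin again, while `/a.css` and `/c.png` are still served from the cache.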
A resource being evicted from one cache does not necessarily mean that it has been evicted from all caches in a CDN network. Purging Purging (also known as "cache invalidation") is a mechanism for removing a resource from a CDN's caches without having to wait for it to expire or be evicted. It is typically executed via an API. Purging is critical in situations where content needs to be retracted (for example, correcting typos, pricing errors, or incorrect news articles). On top of that, it can also play a crucial role in a site's caching strategy. If a CDN supports near instant purging, purging can be used as a mechanism for managing the caching of dynamic content: cache dynamic content using a long TTL, then purge the resource whenever it is updated. In this way, it is possible to maximize the caching duration of a dynamic resource, despite not knowing in advance when the resource will change. This technique is sometimes referred to as "hold-till-told caching". When purging is used at scale it is typically used in conjunction with a concept known as "cache tags" or "surrogate cache keys". This mechanism allows site owners to associate one or more additional identifiers (sometimes referred to as "tags") with a cached resource. These tags can then be used to carry out highly granular purging. For example, you might add a "footer" tag to all resources (for example, /about, /blog) that contain your site footer. When the footer is updated, instruct your CDN to purge all resources associated with the "footer" tag. Cacheable resources # If and how a resource should be cached depends on whether it is public or private; static or dynamic. Private and public resources # Private Resources Private resources contain data intended for a single user and therefore should not be cached by a CDN. Private resources are indicated by the Cache-Control: private header. Public Resources Public resources do not contain user-specific information and therefore are cacheable by a CDN. 
A resource may be considered cacheable by a CDN if it does not have a Cache-Control: no-store or Cache-Control: private header. The length of time that a public resource can be cached depends on how frequently the asset changes. Dynamic and static content # Dynamic content Dynamic content is content that changes frequently. An API response and a store homepage are examples of this content type. However, the fact that this content changes frequently doesn't necessarily preclude it from being cached. During periods of heavy traffic, caching these responses for very short periods of time (for example, 5 seconds) can significantly reduce the load on the origin server, while having minimal impact on data freshness. Static content Static content changes infrequently, if ever. Images, videos, and versioned libraries are typically examples of this content type. Because static content does not change, it should be cached with a long Time to Live (TTL) - for example, 6 months or 1 year. Choosing a CDN # Performance is typically a top consideration when choosing a CDN. However, the other features that a CDN offers (for example, security and analytics features), as well as a CDN's pricing, support, and onboarding are all important to consider when choosing a CDN. Performance # At a high-level, a CDN's performance strategy can be thought of in terms of the tradeoff between minimizing latency and maximizing cache hit ratio. CDNs with many points of presence (PoPs) can deliver lower latency but may experience lower cache hit ratios as a result of traffic being split across more caches. Conversely, CDNs with fewer PoPs may be located geographically further from users, but can achieve higher cache hit ratios. As a result of this tradeoff, some CDNs use a tiered approach to caching: PoPs located close to users (also known as "edge caches") are supplemented with central PoPs that have higher cache hit ratios. 
When an edge cache can't find a resource, it will look to a central PoP for the resource. This approach trades slightly greater latency for a higher likelihood that the resource can be served from a CDN cache - though not necessarily an edge cache. The tradeoff between minimizing latency and maximizing cache hit ratio is a spectrum. No particular approach is universally better; however, depending on the nature of your site and its user base, you may find that one of these approaches delivers significantly better performance than the other. It's also worth noting that CDN performance can vary significantly depending on geography, time of day, and even current events. Although it's always a good idea to do your own research on a CDN's performance, it can be difficult to predict the exact performance you'll get from a CDN. Additional features # CDNs typically offer a wide variety of features in addition to their core CDN offering. Commonly offered features include: load balancing, image optimization, video streaming, edge computing, and security products. How to set up and configure a CDN # Ideally you should use a CDN to serve your entire site. At a high-level, the setup process for this consists of signing up with a CDN provider, then updating your CNAME DNS record to point at the CDN provider. For example, the CNAME record for www.example.com might point to example.my-cdn.com. As a result of this DNS change, traffic to your site will be routed through the CDN. If using a CDN to serve all resources is not an option, you can configure a CDN to only serve a subset of resources - for example, only static resources. You can do this by creating a separate CNAME record that will only be used for resources that should be served by the CDN. For example, you might create a static.example.com CNAME record that points to example.my-cdn.com. You would also need to rewrite the URLs of resources being served by the CDN to point to the static.example.com subdomain that you created.
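As a rough sketch of what that URL rewriting could look like, the helper below points static resources at the static.example.com subdomain and leaves everything else on the main host. The `toCdnUrl` function and the list of file extensions are assumptions made for this example; in practice this step usually lives in your build tooling or templates.

```javascript
// Hypothetical helper: `toCdnUrl` and STATIC_EXTENSIONS are illustrative,
// not part of any CDN's tooling.
const SITE_ORIGIN = 'https://www.example.com';
const STATIC_HOST = 'static.example.com';
const STATIC_EXTENSIONS = ['.css', '.js', '.png', '.jpg', '.svg', '.woff2'];

function toCdnUrl(resourceUrl) {
  // Resolve relative URLs against the site's own origin.
  const url = new URL(resourceUrl, SITE_ORIGIN);
  const isStatic = STATIC_EXTENSIONS.some((ext) => url.pathname.endsWith(ext));
  // Only static resources are moved to the CDN-backed subdomain.
  if (isStatic) url.hostname = STATIC_HOST;
  return url.href;
}

console.log(toCdnUrl('/images/logo.png')); // https://static.example.com/images/logo.png
console.log(toCdnUrl('/api/cart'));        // https://www.example.com/api/cart
```

Matching on file extension keeps the sketch short; a real setup might instead match on a path prefix such as a dedicated assets directory.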
Although your CDN will be set up at this point, there will likely be inefficiencies in your configuration. The next two sections of this article will explain how to get the most out of your CDN by increasing cache hit ratio and enabling performance features. Improving cache hit ratio # An effective CDN setup will serve as many resources as possible from the cache. This is commonly measured by cache hit ratio (CHR). Cache hit ratio is defined as the number of cache hits divided by the number of total requests during a given time interval. A freshly initialized cache will have a CHR of 0 but this increases as the cache is populated with resources. A CHR of 90% is a good goal for most sites. Your CDN provider should supply you with analytics and reporting regarding your CHR. When optimizing CHR, the first thing to verify is that all cacheable resources are being cached and cached for the correct length of time. This is a simple assessment that should be undertaken by all sites. The next level of CHR optimization, broadly speaking, is to fine tune your CDN settings to make sure that logically equivalent server responses aren't being cached separately. This is a common inefficiency that occurs due to the impact of factors like query params, cookies, and request headers on caching. Initial audit # Most CDNs will provide cache analytics. In addition, tools like WebPageTest and Lighthouse can also be used to quickly verify that all of a page's static resources are being cached for the correct length of time. This is accomplished by checking the HTTP Cache headers of each resource. Caching a resource using the maximum appropriate Time To Live (TTL) will avoid unnecessary origin fetches in the future and therefore increase CHR. 
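If you want to sanity-check the cache hit ratio reported by your CDN, the definition above (cache hits divided by total requests) is simple to compute yourself. The sketch below assumes you have a list of per-request cache statuses recorded as the strings 'HIT' and 'MISS'; that format is an assumption for this example, since CDNs expose this information in different ways.

```javascript
// Computes cache hit ratio (CHR) from a list of per-request cache statuses.
// The 'HIT' / 'MISS' strings are an assumed log format for illustration.
function cacheHitRatio(cacheStatuses) {
  // A freshly initialized cache that has served no requests has a CHR of 0.
  if (cacheStatuses.length === 0) return 0;
  const hits = cacheStatuses.filter((status) => status === 'HIT').length;
  return hits / cacheStatuses.length;
}

const sample = ['HIT', 'HIT', 'MISS', 'HIT', 'HIT', 'HIT', 'HIT', 'HIT', 'HIT', 'HIT'];
const ratio = cacheHitRatio(sample); // 9 hits out of 10 requests
console.log(`CHR: ${(ratio * 100).toFixed(0)}%`);
```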
At a minimum, one of these headers typically needs to be set in order for a resource to be cached by a CDN: Cache-Control: max-age= Cache-Control: s-maxage= Expires In addition, although it does not impact if or how a resource is cached by a CDN, it is good practice to also set the Cache-Control: immutable directive. Cache-Control: immutable indicates that a resource "will not be updated during its freshness lifetime". As a result, the browser will not revalidate the resource when serving it from the browser cache, thereby eliminating an unnecessary server request. Unfortunately, this directive is only supported by Firefox and Safari - it is not supported by Chromium-based browsers. This issue tracks Chromium support for Cache-Control: immutable. Starring this issue can help encourage support for this feature. For a more detailed explanation of HTTP caching, refer to Prevent unnecessary network requests with the HTTP Cache. Fine tuning # A slightly simplified explanation of how CDN caches work is that the URL of a resource is used as the key for caching and retrieving the resource from the cache. In practice, this is still overwhelmingly true, but is complicated slightly by the impact of things like request headers and query params. As a result, rewriting request URLs is an important technique for both maximizing CHR and ensuring that the correct content is served to users. A properly configured CDN instance strikes the correct balance between overly granular caching (which hurts CHR) and insufficiently granular caching (which results in incorrect responses being served to users). Query params # By default, CDNs take query params into consideration when caching a resource. However, small adjustments to query param handling can have a significant impact on CHR. For example: Unnecessary query params By default, a CDN would cache example.com/blog and example.com/blog?referral_id=2zjk separately even though they are likely the same underlying resource.
This is fixed by adjusting a CDN's configuration to ignore the referral_id query param. Query param order A CDN will cache example.com/blog?id=123&query=dogs separately from example.com/blog?query=dogs&id=123. For most sites, query param order does not matter, so configuring the CDN to sort the query params (thereby normalizing the URL used to cache the server response) will increase CHR. Vary # The Vary response header informs caches that the server response corresponding to a particular URL can vary depending on the headers set on the request (for example, the Accept-Language or Accept-Encoding request headers). As a result, it instructs a CDN to cache these responses separately. The Vary header is not widely supported by CDNs and may result in an otherwise cacheable resource not being served from a cache. Although the Vary header can be a useful tool, inappropriate usage hurts CHR. In addition, if you do use Vary, normalizing request headers will help improve CHR. For example, without normalization the request headers Accept-Language: en-US and Accept-Language: en-US,en;q=0.9 would result in two separate cache entries, even though their contents would likely be identical. Cookies # Cookies are set on requests via the Cookie header; they are set on responses via the Set-Cookie header. Unnecessary use of the Set-Cookie header should be avoided given that caches will typically not cache server responses containing this header. Performance features # This section discusses performance features that are commonly offered by CDNs as part of their core product offering. Many sites forget to enable these features, thereby losing out on easy performance wins. Compression # All text-based responses should be compressed with either gzip or Brotli. If you have the choice, choose Brotli over gzip. Brotli is a newer compression algorithm, and compared to gzip, it can achieve higher compression ratios.
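One way to see the difference for your own content is to compress the same text with both algorithms and compare the output sizes. The sketch below assumes a Node.js environment and uses its built-in zlib module; the sample text is made up for this example, and the actual savings depend heavily on the content being compressed.

```javascript
// Compare gzip and Brotli output sizes for the same text (Node.js, built-in zlib).
const zlib = require('zlib');

// Made-up sample input; real pages will compress differently.
const text = 'Content delivery networks improve site performance. '.repeat(200);
const input = Buffer.from(text, 'utf8');

// gzip at its highest compression level.
const gzipped = zlib.gzipSync(input, { level: 9 });

// Brotli at its highest quality, tuned for text.
const brotli = zlib.brotliCompressSync(input, {
  params: {
    [zlib.constants.BROTLI_PARAM_QUALITY]: 11,
    [zlib.constants.BROTLI_PARAM_MODE]: zlib.constants.BROTLI_MODE_TEXT,
  },
});

console.log(`original:  ${input.length} bytes`);
console.log(`gzip-9:    ${gzipped.length} bytes`);
console.log(`brotli-11: ${brotli.length} bytes`);
```

Highly repetitive input like this sample compresses far better than typical pages, so treat the numbers as a way to compare the two algorithms rather than as a prediction for your site.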
There are two types of CDN support for Brotli compression: "Brotli from origin" and "automatic Brotli compression". Brotli from origin # Brotli from origin is when a CDN serves resources that were Brotli-compressed by the origin. Although this may seem like a feature that all CDNs should be able to support out of the box, it requires that a CDN be able to cache multiple versions (in other words, gzip-compressed and Brotli-compressed versions) of the resource corresponding to a given URL. Automatic Brotli compression # Automatic Brotli compression is when resources are Brotli compressed by the CDN. CDNs can compress both cacheable and non-cacheable resources. The first time that a resource is requested it is served using "good enough" compression - for example, Brotli-5. This type of compression is applicable to both cacheable and non-cacheable resources. Meanwhile, if a resource is cacheable, the CDN will use offline processing to compress the resource at a more powerful but far slower compression level - for example, Brotli-11. Once this compression completes, the more compressed version will be cached and used for subsequent requests. Compression best practices # Sites that want to maximize performance should apply Brotli compression at both their origin server and CDN. Brotli compression at the origin minimizes the transfer size of resources that can't be served from the cache. To prevent delays in serving requests, the origin should compress dynamic resources using a fairly conservative compression level - for example, Brotli-4; static resources can be compressed using Brotli-11. If an origin does not support Brotli, gzip-6 can be used to compress dynamic resources; gzip-9 can be used to compress static resources. TLS 1.3 # TLS 1.3 is the newest version of Transport Layer Security (TLS), the cryptographic protocol used by HTTPS. TLS 1.3 provides better privacy and performance compared to TLS 1.2. TLS 1.3 shortens the TLS handshake from two roundtrips to one. 
For connections using HTTP/1 or HTTP/2, shortening the TLS handshake to one roundtrip effectively reduces connection setup time by 33%. HTTP/2 and HTTP/3 # HTTP/2 and HTTP/3 both provide performance benefits over HTTP/1. Of the two, HTTP/3 offers greater potential performance benefits. HTTP/3 isn't fully standardized yet, but it will be widely supported once this occurs. HTTP/2 # If your CDN hasn't already enabled HTTP/2 by default, you should consider turning it on. HTTP/2 provides multiple performance benefits over HTTP/1 and is supported by all major browsers. Performance features of HTTP/2 include: multiplexing, stream prioritization, server push, and header compression. Multiplexing Multiplexing is arguably the most important feature of HTTP/2. Multiplexing enables a single TCP connection to serve multiple request-response pairs at the same time. This eliminates the overhead of unnecessary connection setups; given that the number of connections that a browser can have open at a given time is limited, this also has the implication that the browser is now able to request more of a page's resources in parallel. Multiplexing theoretically removes the need for HTTP/1 optimizations like concatenation and sprite sheets - however, in practice, these techniques will remain relevant given that larger files compress better. Stream prioritization Multiplexing enables multiple concurrent streams; stream prioritization provides an interface for communicating relative priority of each of these streams. This helps the server to send the most important resources first - even if they weren't requested first. Stream prioritization is expressed by the browser via a dependency tree and is merely a statement of preference: in other words, the server is not obligated to meet (or even consider) the priorities supplied by the browser. Stream prioritization becomes more effective when more of a site is served through a CDN. 
CDN implementations of HTTP/2 resource prioritization vary wildly. To identify whether your CDN fully and properly supports HTTP/2 resource prioritization, check out Is HTTP/2 Fast Yet?. Although switching your CDN instance to HTTP/2 is largely a matter of flipping a switch, it's important to thoroughly test this change before enabling it in production. HTTP/1 and HTTP/2 use the same conventions for request and response headers - but HTTP/2 is far less forgiving when these conventions aren't adhered to. As a result, non-spec practices like including non-ASCII or uppercase characters in headers may begin causing errors once HTTP/2 is enabled. If this occurs, a browser's attempts to download the resource will fail. The failed download attempt will be visible in the "Network" tab of DevTools. In addition, the error message "ERR_HTTP2_PROTOCOL_ERROR" will be displayed in the console. HTTP/3 # HTTP/3 is the successor to HTTP/2. As of September 2020, all major browsers have experimental support for HTTP/3 and some CDNs support it. Performance is the primary benefit of HTTP/3 over HTTP/2. Specifically, HTTP/3 eliminates head-of-line blocking at the connection level and reduces connection setup time. Elimination of head-of-line blocking HTTP/2 introduced multiplexing, a feature that allows a single connection to be used to transmit multiple streams of data simultaneously. However, with HTTP/2, a single dropped packet blocks all streams on a connection (a phenomenon known as head-of-line blocking). With HTTP/3, a dropped packet only blocks a single stream. This improvement is largely the result of HTTP/3 using UDP (HTTP/3 uses UDP via QUIC) rather than TCP. This makes HTTP/3 particularly useful for data transfer over congested or lossy networks. Reduced connection setup time HTTP/3 uses TLS 1.3 and therefore shares its performance benefits: establishing a new connection only requires a single round-trip and resuming an existing connection does not require any roundtrips.
HTTP/3 will have the biggest impact on users on poor network connections: not only because HTTP/3 handles packet loss better than its predecessors, but also because the absolute time savings resulting from a 0-RTT or 1-RTT connection setup will be greater on networks with high latency. Image optimization # CDN image optimization services typically focus on image optimizations that can be applied automatically in order to reduce image transfer size. For example: stripping EXIF data, applying lossless compression, and converting images to newer file formats (for example, WebP). Images make up ~50% of the transfer bytes on the median web page, so optimizing images can significantly reduce page size. Minification # Minification removes unnecessary characters from JavaScript, CSS, and HTML. It's preferable to do minification at the origin server, rather than the CDN. Site owners have more context about the code to be minified and therefore can often use more aggressive minification techniques than those employed by CDNs. However, if minifying code at the origin is not an option, minification by the CDN is a good alternative. Conclusion # Use a CDN: CDNs deliver resources quickly, reduce load on the origin server, and are helpful for dealing with traffic spikes. Cache content as aggressively as possible: Both static and dynamic content can and should be cached - albeit for varying durations. Periodically audit your site to make sure that you are optimally caching content. Enable CDN performance features: Features like Brotli, TLS 1.3, HTTP/2, and HTTP/3 further improve performance.
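As a rough illustration of the query param advice in the caching section (ignoring irrelevant params and sorting the rest), here is a minimal sketch of how a cache key could be normalized. The normalizeCacheKey function and the IGNORED_PARAMS list are hypothetical names used only for this example; real CDNs expose this behavior through their own configuration rather than through code like this.

```javascript
// Hypothetical list of query params that do not change the response.
// A real CDN would expose this as a configuration setting.
const IGNORED_PARAMS = ['referral_id'];

// Hypothetical helper: returns a normalized URL to use as the cache key.
function normalizeCacheKey(rawUrl) {
  const url = new URL(rawUrl);
  // Drop params that should not produce separate cache entries.
  for (const name of IGNORED_PARAMS) {
    url.searchParams.delete(name);
  }
  // Sort the remaining params so that their order no longer matters.
  url.searchParams.sort();
  return url.toString();
}

const a = normalizeCacheKey('https://example.com/blog?id=123&query=dogs');
const b = normalizeCacheKey('https://example.com/blog?query=dogs&id=123&referral_id=abc');
console.log(a === b); // true: both requests map to the same cache entry
```

With this kind of normalization, the two URLs from the query param order example resolve to a single cache entry, which is what raises the cache hit ratio.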

How Mercado Libre optimized for Web Vitals (TBT/FID)

Mercado Libre is the largest e-commerce and payments ecosystem in Latin America. It is present in 18 countries and is a market leader in Brazil, Mexico, and Argentina (based on unique visitors and pageviews). Web performance has been a focus for the company for a long time, but they recently formed a team to monitor performance and apply optimizations across different parts of the site. This article summarizes the work done by Guille Paz, Pablo Carminatti, and Oleh Burkhay from Mercado Libre's frontend architecture team to optimize one of the Core Web Vitals: First Input Delay (FID) and its lab proxy, Total Blocking Time (TBT). 90% Reduction in Max Potential FID in Lighthouse 9% More users perceiving FID as "Fast" in CrUX Long tasks, First Input Delay, and Total Blocking Time # Running expensive JavaScript code can lead to long tasks, which are those that run for more than 50ms in the browser's main thread. FID (First Input Delay) measures the time from when a user first interacts with a page (e.g. when they click on a link) to the time when the browser is actually able to begin processing event handlers in response to that interaction. A site that executes expensive JavaScript code will likely have several long tasks, which will end up negatively impacting FID. To provide a good user experience, sites should strive to have a First Input Delay of less than 100 milliseconds: While Mercado Libre's site was performing well in most sections, they found in the Chrome User Experience Report that product detail pages had a poor FID. Based on that information, they decided to focus their efforts on improving the interactivity for product pages in the site. These pages allow the user to perform complex interactions, so the goal was interactivity optimization, without interfering with valuable functionality. Measure interactivity of product detail pages # FID requires a real user and thus cannot be measured in the lab. 
However, the Total Blocking Time (TBT) metric is lab-measurable, correlates well with FID in the field, and also captures issues that affect interactivity. In the following trace, for example, while the total time spent running tasks on the main thread is 560 ms, only 345 ms of that time is considered total blocking time (the sum of the portions of each task that exceeds 50ms): Mercado Libre took TBT as a proxy metric in the lab, in order to measure and improve the interactivity of product detail pages in the real world. Here's the general approach they took: Use WebPageTest to determine exactly which scripts were keeping the main thread busy on a real device. Use Lighthouse to determine the impact of the changes in Max Potential First Input Delay (Max Potential FID). During this project Mercado Libre used Max Potential FID in Lighthouse because that was the tool's main metric for measuring interactivity at that time. Lighthouse now recommends using Total Blocking Time instead. Use WebPageTest to visualize long tasks # WebPageTest (WPT) is a web performance tool that allows you to run tests on real devices in different locations around the world. Mercado Libre used WPT to reproduce the experience of their users by choosing a device type and location similar to real users. Specifically, they chose a Moto 4G device and Dulles, Virginia, because they wanted to approximate the experience of Mercado Libre users in Mexico. By observing the main thread view of WPT, Mercado Libre found that there were several consecutive long tasks blocking the main thread for 2 seconds: Analyzing the corresponding waterfall they found that a considerable part of those two seconds came from their analytics module. The main bundle size of the application was large (950KB) and took a long time to parse, compile, and execute. 
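To make the TBT definition used above concrete, the following sketch sums the portion of each task that exceeds 50 ms. The totalBlockingTime function is a hypothetical helper, and the individual task durations are assumed values chosen so that they add up to the 560 ms and 345 ms figures quoted earlier; they are not taken from the actual trace. The sketch also ignores the measurement window that lab tools apply when reporting TBT.

```javascript
// Tasks longer than this are considered long tasks.
const LONG_TASK_THRESHOLD_MS = 50;

// Hypothetical helper: sums the blocking portion (time beyond 50 ms) of each task.
function totalBlockingTime(taskDurationsMs) {
  return taskDurationsMs.reduce(
    (sum, duration) => sum + Math.max(0, duration - LONG_TASK_THRESHOLD_MS),
    0
  );
}

// Assumed task durations in milliseconds (illustrative only).
const tasks = [250, 90, 35, 30, 155];

const totalMainThreadTime = tasks.reduce((sum, duration) => sum + duration, 0);
console.log(totalMainThreadTime);      // 560
console.log(totalBlockingTime(tasks)); // 345
```

The two short tasks (35 ms and 30 ms) contribute nothing, which is why splitting long tasks into smaller ones lowers TBT even when the total amount of work stays the same.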
Use Lighthouse to determine Max Potential FID # Lighthouse doesn't allow you to choose between different devices and locations, but it's a very useful tool for diagnosing sites and obtaining performance recommendations. When running Lighthouse on product detail pages, Mercado Libre found that the Max Potential FID was the only metric marked in red, with a value of 1710ms. Based on this, Mercado Libre set a goal to improve their Max Potential FID score in a laboratory tool like Lighthouse and WebPageTest, under the assumption that these improvements would affect their real users, and therefore, show up in real user monitoring tools like the Chrome User Experience Report. Optimize long tasks # First iteration # Based on the main thread trace, Mercado Libre set the goal of optimizing the two modules that were running expensive code. They started optimizing the performance of the internal tracking module. This module contained a CPU-heavy task that wasn't critical for the module to work, and therefore could be safely removed. This led to a 2% reduction in JavaScript for the whole site. After that they started to work on improving the general bundle size: Mercado Libre used webpack-bundle-analyzer to detect opportunities for optimization: Initially they were requiring the full Lodash module. This was replaced with a per-method require to load only a subset of Lodash instead of the whole library, and used in conjunction with lodash-webpack-plugin to shrink Lodash even further. They also applied the following Babel optimizations: Using @babel/plugin-transform-runtime to reuse Babel's helpers throughout the code, and reduce the size of the bundle considerably. Using babel-plugin-search-and-replace to replace tokens at build time, in order to remove a large configuration file inside the main bundle. Adding babel-plugin-transform-react-remove-prop-types to save some extra bytes by removing the prop types. 
As a result of these optimizations, the bundle size was reduced by approximately 16%. Measure impact # The changes lowered Mercado Libre's consecutive long tasks from two seconds to one second. In the WebPageTest comparison, the top waterfall shows a long bar in the Page is Interactive row between seconds 3 and 5. In the bottom waterfall, the bar has been broken into smaller pieces, occupying the main thread for shorter periods of time. Lighthouse showed a 57% reduction in Max Potential First Input Delay. Second iteration # The team continued digging into long tasks in order to find subsequent improvements. The WebPageTest trace (the Browser Main Thread row and the Page is Interactive row) clearly shows that this main thread activity is blocking interactivity. Based on that information they decided to implement the following changes: Continue reducing the main bundle size to optimize compile and parse time (e.g. by removing duplicate dependencies throughout the different modules). Apply code splitting at component level, to divide JavaScript in smaller chunks and allow for smarter loading of the different components. Defer component hydration to allow for a smarter use of the main thread. This technique is commonly referred to as partial hydration. Measure impact # The resulting WebPageTest trace showed even smaller chunks of JS execution, and their Max Potential FID time in Lighthouse was reduced by an additional 60%. Visualize progress for real users # While laboratory testing tools like WebPageTest and Lighthouse are great for iterating on solutions during development, the true goal is to improve the experience for real users. The Chrome User Experience Report provides user experience metrics for how real-world Chrome users experience popular destinations on the web. The data from the report can be obtained from BigQuery, PageSpeed Insights, or the CrUX API.
The CrUX dashboard is an easy way to visualize the progress of core metrics. Next steps # Web performance is never a finished task, and Mercado Libre understands the value these optimizations bring to their users. While they continue applying several optimizations across the site, including prefetching in product listing pages, image optimizations, and others, they continue adding improvements to product listing pages to reduce Total Blocking Time (TBT), and by proxy FID, even more. These optimizations include: Iterating on the code splitting solution. Improving the execution of third-party scripts. Continuing improvements in asset bundling at the bundler level (webpack). Mercado Libre has a holistic view of performance, so while they continue optimizing interactivity in the site, they have also started assessing opportunities for improvement on the other two current Core Web Vitals: LCP (Largest Contentful Paint) and CLS (Cumulative Layout Shift).

Connecting to uncommon HID devices

Success: The WebHID API, part of the capabilities project, launched in Chrome 89. There is a long tail of human interface devices (HIDs), such as alternative keyboards or exotic gamepads, that are too new, too old, or too uncommon to be accessible by systems' device drivers. The WebHID API solves this by providing a way to implement device-specific logic in JavaScript. Suggested use cases # A HID device takes input from or provides output to humans. Examples of devices include keyboards, pointing devices (mice, touchscreens, etc.), and gamepads. The HID protocol makes it possible to access these devices on desktop computers using operating system drivers. The web platform supports HID devices by relying on these drivers. The inability to access uncommon HID devices is particularly painful when it comes to alternative auxiliary keyboards (e.g. Elgato Stream Deck, Jabra headsets, X-keys) and exotic gamepad support. Gamepads designed for desktop often use HID for gamepad inputs (buttons, joysticks, triggers) and outputs (LEDs, rumble). Unfortunately, gamepad inputs and outputs are not well standardized and web browsers often require custom logic for specific devices. This is unsustainable and results in poor support for the long tail of older and uncommon devices. It also causes the browser to depend on quirks in the behavior of specific devices. Current status # Step Status 1. Create explainer Complete 2. Create initial draft of specification Complete 3. Gather feedback & iterate on design Complete 4. Origin trial Complete 5. Launch Complete Terminology # HID consists of two fundamental concepts: reports and report descriptors. Reports are the data that is exchanged between a device and a software client. The report descriptor describes the format and meaning of data that the device supports. A HID (Human Interface Device) is a type of device that takes input from or provides output to humans. 
It also refers to the HID protocol, a standard for bi-directional communication between a host and a device that is designed to simplify the installation procedure. The HID protocol was originally developed for USB devices, but has since been implemented over many other protocols, including Bluetooth. Applications and HID devices exchange binary data through three report types: Report type Description Input report Data that is sent from the device to the application (e.g. a button is pressed.) Output report Data that is sent from the application to the device (e.g. a request to turn on the keyboard backlight.) Feature report Data that may be sent in either direction. The format is device-specific. A report descriptor describes the binary format of reports supported by the device. Its structure is hierarchical and can group reports together as distinct collections within the top-level collection. The format of the descriptor is defined by the HID specification. A HID usage is a numeric value referring to a standardized input or output. Usage values allow a device to describe the intended use of the device and the purpose of each field in its reports. For example, one is defined for the left button of a mouse. Usages are also organized into usage pages, which provide an indication of the high-level category of the device or report. Using the WebHID API # Feature detection # To check if the WebHID API is supported, use: if ("hid" in navigator) { // The WebHID API is supported. } Open a HID connection # The WebHID API is asynchronous by design to prevent the website UI from blocking when awaiting input. This is important because HID data can be received at any time, requiring a way to listen to it. To open a HID connection, first access a HIDDevice object. 
For this, you can either prompt the user to select a device by calling navigator.hid.requestDevice(), or pick one from navigator.hid.getDevices() which returns a list of devices the website has been granted access to previously. The navigator.hid.requestDevice() function takes a mandatory object that defines filters. Those are used to match any device connected with a USB vendor identifier (vendorId), a USB product identifier (productId), a usage page value (usagePage), and a usage value (usage). You can get those from the USB ID Repository and the HID usage tables document. The multiple HIDDevice objects returned by this function represent multiple HID interfaces on the same physical device. // Filter on devices with the Nintendo Switch Joy-Con USB Vendor/Product IDs. const filters = [ { vendorId: 0x057e, // Nintendo Co., Ltd productId: 0x2006 // Joy-Con Left }, { vendorId: 0x057e, // Nintendo Co., Ltd productId: 0x2007 // Joy-Con Right } ]; // Prompt user to select a Joy-Con device. const [device] = await navigator.hid.requestDevice({ filters }); // Get all devices the user has previously granted the website access to. const devices = await navigator.hid.getDevices(); User prompt for selecting a Nintendo Switch Joy-Con. A HIDDevice object contains USB vendor and product identifiers for device identification. Its collections attribute is initialized with a hierarchical description of the device's report formats. for (let collection of device.collections) { // A HID collection includes usage, usage page, reports, and subcollections. 
console.log(`Usage: ${collection.usage}`); console.log(`Usage page: ${collection.usagePage}`); for (let inputReport of collection.inputReports) { console.log(`Input report: ${inputReport.reportId}`); // Loop through inputReport.items } for (let outputReport of collection.outputReports) { console.log(`Output report: ${outputReport.reportId}`); // Loop through outputReport.items } for (let featureReport of collection.featureReports) { console.log(`Feature report: ${featureReport.reportId}`); // Loop through featureReport.items } // Loop through subcollections with collection.children } The HIDDevice devices are by default returned in a "closed" state and must be opened by calling open() before data can be sent or received. // Wait for the HID connection to open before sending/receiving data. await device.open(); Receive input reports # Once the HID connection has been established, you can handle incoming input reports by listening to the "inputreport" events from the device. Those events contain the HID data as a DataView object (data), the HID device it belongs to (device), and the 8-bit report ID associated with the input report (reportId). Nintendo Switch Joy-Con devices. Continuing with the previous example, the code below shows you how to detect which button the user has pressed on a Joy-Con Right device so that you can hopefully try it at home. device.addEventListener("inputreport", event => { const { data, device, reportId } = event; // Handle only the Joy-Con Right device and a specific report ID. // Bail out if either the product ID or the report ID does not match. if (device.productId !== 0x2007 || reportId !== 0x3f) return; const value = data.getUint8(0); if (value === 0) return; const someButtons = { 1: "A", 2: "X", 4: "B", 8: "Y" }; console.log(`User pressed button ${someButtons[value]}.`); }); Send output reports # To send an output report to a HID device, pass the 8-bit report ID associated with the output report (reportId) and bytes as a BufferSource (data) to device.sendReport().
The returned promise resolves once the report has been sent. If the HID device does not use report IDs, set reportId to 0. The example below applies to a Joy-Con device and shows you how to make it rumble with output reports. // First, send a command to enable vibration. // Magical bytes come from https://github.com/mzyy94/joycon-toolweb const enableVibrationData = [1, 0, 1, 64, 64, 0, 1, 64, 64, 0x48, 0x01]; await device.sendReport(0x01, new Uint8Array(enableVibrationData)); // Then, send a command to make the Joy-Con device rumble. // Actual bytes are available in the sample below. const rumbleData = [ /* ... */ ]; await device.sendReport(0x10, new Uint8Array(rumbleData)); Send and receive feature reports # Feature reports are the only type of HID data reports that can travel in both directions. They allow HID devices and applications to exchange non standardized HID data. Unlike input and output reports, feature reports are not received or sent by the application on a regular basis. Laptop keyboard To send a feature report to a HID device, pass the 8-bit report ID associated with the feature report (reportId) and bytes as a BufferSource (data) to device.sendFeatureReport(). The returned promise resolves once the report has been sent. If the HID device does not use report IDs, set reportId to 0. The example below illustrates the use of feature reports by showing you how to request an Apple keyboard backlight device, open it, and make it blink. const waitFor = duration => new Promise(r => setTimeout(r, duration)); // Prompt user to select an Apple Keyboard Backlight device. const [device] = await navigator.hid.requestDevice({ filters: [{ vendorId: 0x05ac, usage: 0x0f, usagePage: 0xff00 }] }); // Wait for the HID connection to open. await device.open(); // Blink! 
const reportId = 1; for (let i = 0; i < 10; i++) { // Turn off await device.sendFeatureReport(reportId, Uint32Array.from([0, 0])); await waitFor(100); // Turn on await device.sendFeatureReport(reportId, Uint32Array.from([512, 0])); await waitFor(100); } To receive a feature report from a HID device, pass the 8-bit report ID associated with the feature report (reportId) to device.receiveFeatureReport(). The returned promise resolves with a DataView object that contains the contents of the feature report. If the HID device does not use report IDs, set reportId to 0. // Request feature report. const dataView = await device.receiveFeatureReport(/* reportId= */ 1); // Read feature report contents with dataView.getInt8(), getUint8(), etc... Listen to connection and disconnection # When the website has been granted permission to access a HID device, it can actively receive connection and disconnection events by listening to "connect" and "disconnect" events. navigator.hid.addEventListener("connect", event => { // Automatically open event.device or warn user a device is available. }); navigator.hid.addEventListener("disconnect", event => { // Remove |event.device| from the UI. }); Dev Tips # Debugging HID in Chrome is easy with the internal page, chrome://device-log where you can see all HID and USB device related events in one single place. Internal page in Chrome to debug HID. Browser support # The WebHID API is available on all desktop platforms (Chrome OS, Linux, macOS, and Windows) in Chrome 89. Demos # Some WebHID demos are listed at web.dev/hid-examples. Go have a look! Security and privacy # The spec authors have designed and implemented the WebHID API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics. The ability to use this API is primarily gated by a permission model that grants access to only a single HID device at a time. 
In response to a user prompt, the user must take active steps to select a particular HID device. To understand the security tradeoffs, check out the Security and Privacy Considerations section of the WebHID spec. On top of this, Chrome inspects the usage of each top-level collection and if a top-level collection has a protected usage (e.g. generic keyboard, mouse), then a website won't be able to send and receive any reports defined in that collection. The full list of protected usages is publicly available. Note that security-sensitive HID devices (such as FIDO HID devices used for stronger authentication) are also blocked in Chrome. See the USB blocklist and HID blocklist files. Feedback # The Chrome team would love to hear about your thoughts and experiences with the WebHID API. Tell us about the API design # Is there something about the API that doesn't work as expected? Or are there missing methods or properties that you need to implement your idea? File a spec issue on the WebHID API GitHub repo or add your thoughts to an existing issue. Report a problem with the implementation # Did you find a bug with Chrome's implementation? Or is the implementation different from the spec? File a bug at https://new.crbug.com. Be sure to include as much detail as you can, provide simple instructions for reproducing the bug, and have Components set to Blink>HID. Glitch works great for sharing quick and easy repros. Show support # Are you planning to use the WebHID API? Your public support helps the Chrome team prioritize features and shows other browser vendors how critical it is to support them. Send a tweet to @ChromiumDev using the hashtag #WebHID and let us know where and how you're using it. Helpful links # Specification Tracking bug ChromeStatus.com entry Blink Component: Blink>HID Acknowledgements # Thanks to Matt Reynolds and Joe Medley for their reviews of this article. 
Red and blue Nintendo Switch photo by Sara Kurfeß, and black and silver laptop computer photo by Athul Cyriac Ajay on Unsplash.
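Returning to the input report example in the "Receive input reports" section: the button values used there (1, 2, 4, 8) look like bit flags, so a lookup by exact value would miss the case where several buttons are held at once. The sketch below decodes such a byte into a list of button names. It assumes that the values really are combinable bit flags, which the article does not state, and pressedButtons is a hypothetical helper name.

```javascript
// Button flags copied from the someButtons map in the input report example.
// Assumption: each value is a single bit that can be combined with the others.
const BUTTON_FLAGS = { 1: 'A', 2: 'X', 4: 'B', 8: 'Y' };

// Hypothetical helper: returns the names of all buttons whose bit is set.
function pressedButtons(value) {
  return Object.entries(BUTTON_FLAGS)
    .filter(([flag]) => (value & Number(flag)) !== 0)
    .map(([, name]) => name);
}

console.log(pressedButtons(4)); // [ 'B' ]
console.log(pressedButtons(5)); // [ 'A', 'B' ]
```

Inside an "inputreport" listener, the value passed in would be the result of data.getUint8(0), as in the original example.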

Managing several displays with the Multi-Screen Window Placement API

The Multi-Screen Window Placement API is part of the capabilities project and is currently in development. This post will be updated as the implementation progresses. The Multi-Screen Window Placement API allows you to enumerate the displays connected to your machine and to place windows on specific screens. Suggested use cases # Examples of sites that may use this API include: Multi-window graphics editors à la Gimp can place various editing tools in accurately positioned windows. Virtual trading desks can show market trends in multiple windows any of which can be viewed in fullscreen mode. Slideshow apps can show speaker notes on the internal primary screen and the presentation on an external projector. Current status # Step Status 1. Create explainer Complete 2. Create initial draft of specification Complete 3. Gather feedback & iterate on design In progress 4. Origin trial In progress 5. Launch Not started How to use the Multi-Screen Window Placement API # Enabling via chrome://flags # To experiment with the Multi-Screen Window Placement API locally, without an origin trial token, enable the #enable-experimental-web-platform-features flag in chrome://flags. Enabling support during the origin trial phase # Starting in Chrome 86, the Multi-Screen Window Placement API will be available as an origin trial in Chrome. The origin trial is expected to end in Chrome 88 (February 24, 2021). Origin trials allow you to try new features and give feedback on their usability, practicality, and effectiveness to the web standards community. For more information, see the Origin Trials Guide for Web Developers. To sign up for this or another origin trial, visit the registration page. Register for the origin trial # Request a token for your origin. Add the token to your pages. There are two ways to do that: Add an origin-trial <meta> tag to the head of each page. 
For example, this may look something like: <meta http-equiv="origin-trial" content="TOKEN_GOES_HERE"> If you can configure your server, you can also add the token using an Origin-Trial HTTP header. The resulting response header should look something like: Origin-Trial: TOKEN_GOES_HERE The problem # The time-tested approach to controlling windows, Window.open(), is unfortunately unaware of additional screens. While some aspects of this API seem a little archaic, such as its windowFeatures DOMString parameter, it has nevertheless served us well over the years. To specify a window's position, you can pass the coordinates as left and top (or screenX and screenY respectively) and pass the desired size as width and height (or innerWidth and innerHeight respectively). For example, to open a 400×300 window at 50 pixels from the left and 50 pixels from the top, this is the code that you could use: const popup = window.open( "https://example.com/", "My Popup", "left=50,top=50,width=400,height=300" ); You can get information about the current screen by looking at the window.screen property, which returns a Screen object. This is the output on my MacBook Air 13″: window.screen; /* Output from my MacBook Air 13″: availHeight: 975 availLeft: 0 availTop: 23 availWidth: 1680 colorDepth: 30 height: 1050 id: "" internal: false left: 0 orientation: ScreenOrientation {angle: 0, type: "landscape-primary", onchange: null} pixelDepth: 30 primary: false scaleFactor: 2 top: 0 touchSupport: false width: 1680 */ Like most people working in tech, I have had to adapt myself to the new work reality and set up my personal home office. Mine looks like the photo below (if you are interested, you can read the full details about my setup). The iPad next to my MacBook Air is connected to the laptop via Sidecar, so whenever I need to, I can quickly turn the iPad into a second screen. A multi-screen setup.
If I want to take advantage of the bigger screen, I can put the popup from the code sample above on to the second screen. I do it like this: popup.moveTo(2500, 50); This is a rough guess, since there is no way to know the dimensions of the second screen. The info from window.screen only covers the built-in screen, but not the iPad screen. The reported width of the built-in screen was 1680 pixels, so moving to 2500 pixels might work to shift the window over to the iPad, since I happen to know that it is located on the right of my MacBook Air. How can I do this in the general case? Turns out, there is a better way than guessing. That way is the Multi-Screen Window Placement API. Feature detection # To check if the Multi-Screen Window Placement API is supported, use: if ("getScreens" in window) { // The Multi-Screen Window Placement API is supported. } The window-placement permission # Before I can use the Multi-Screen Window Placement API, I must ask the user for permission to do so. The new window-placement permission can be queried with the Permissions API like so: let granted = false; try { const { state } = await navigator.permissions.query({ name: "window-placement" }); granted = state === "granted"; } catch { // Nothing. } The browser can choose to show the permission prompt dynamically upon the first attempt to use any of the methods of the new API. Read on to learn more. The isMultiScreen() method # To use the Multi-Screen Window Placement API, I will first call the Window.isMultiScreen() method. It returns a promise that resolves with either true or false, depending on whether one or multiple screens are currently connected to the machine. For my setup, it returns true. await window.isMultiScreen(); // Returns `true` or `false`. The getScreens() method # Now that I know that the current setup is multi-screen, I can obtain more information about the second screen using Window.getScreens().
It returns a promise that resolves with an array of Screen objects. On my MacBook Air 13 with a connected iPad, this returns an array of two Screen objects: await window.getScreens(); /* Output from my MacBook Air 13″ with the iPad attached: Screen 1 (built-in display): availHeight: 975 availLeft: 0 availTop: 23 availWidth: 1680 colorDepth: 30 height: 1050 id: "0" internal: true left: 0 orientation: null pixelDepth: 30 primary: true scaleFactor: 2 top: 0 touchSupport: false width: 1680 Screen 2 (iPad): availHeight: 1001 availLeft: 1680 availTop: 23 availWidth: 1366 colorDepth: 24 height: 1024 id: "1" internal: false left: 1680 orientation: null pixelDepth: 24 primary: false scaleFactor: 2 top: 0 touchSupport: false width: 1366 */ Note how the value of left for the iPad starts at 1680, which is exactly the width of the built-in display. This allows me to determine exactly how the screens are arranged logically (next to each other, on top of each other, etc.). There is also data now for each screen to show whether it is an internal one and whether it is a primary one. Note that the built-in screen is not necessarily the primary screen. Both also have an id, which, if persisted across browser sessions, allows for window arrangements to be restored. The screenschange event # The only thing missing now is a way to detect when my screen setup changes. A new event, screenschange, does exactly that: it fires whenever the screen constellation is modified. (Notice that "screens" is plural in the event name.) It also fires when the resolution of one of the connected screens changes or when a new or an existing screen is (physically or virtually in the case of Sidecar) plugged in or unplugged. Note that you need to look up the new screen details asynchronously, the screenschange event itself does not provide this data. This may change in the future. For now you can look up the screen details by calling window.getScreens() as shown below. 
window.addEventListener('screenschange', async (event) => { console.log('I am there, but mostly useless', event); const details = await window.getScreens(); }); New fullscreen options # Until now, you could request that elements be displayed in fullscreen mode via the aptly named requestFullscreen() method. The method takes an options parameter where you can pass FullscreenOptions. So far, its only property has been navigationUI. The Multi-Screen Window Placement API adds a new screen property that allows you to determine which screen to start the fullscreen view on. For example, if you want to make the primary screen fullscreen: try { const primaryScreen = (await getScreens()).filter((screen) => screen.primary)[0]; await document.body.requestFullscreen({ screen: primaryScreen }); } catch (err) { console.error(err.name, err.message); } Polyfill # It is not possible to polyfill the Multi-Screen Window Placement API, but you can shim its shape so you can code exclusively against the new API: if (!("getScreens" in window)) { // Returning a one-element array with the current screen, // noting that there might be more. window.getScreens = async () => [window.screen]; // Returning `false`, noting that this might be a lie. window.isMultiScreen = async () => false; } In non-supporting browsers, the other aspects of the API (the onscreenschange event and the screen property of FullscreenOptions) would simply never fire or be silently ignored, respectively. Demo # If you are anything like me, you keep a close eye on the development of the various cryptocurrencies. (In reality I very much do not, but, for the sake of this article, just assume I do.) To keep track of the cryptocurrencies that I own, I have developed a web app that allows me to watch the markets in all life situations, such as from the comfort of my bed, where I have a decent single-screen setup. Relaxing and watching the markets. This being about crypto, the markets can get hectic at any time.
Should this happen, I can quickly move over to my desk where I have a multi-screen setup. I can click on any currency's window and quickly see the full details in a fullscreen view on the opposite screen. Below is a recent photo of me taken during the last YCY bloodbath. It caught me completely off-guard and left me with my hands on my face. Panicky, witnessing the YCY bloodbath. You can play with the demo embedded below, or see its source code on glitch. Security and permissions # The Chrome team has designed and implemented the Multi-Screen Window Placement API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics. The Multi-Screen Window Placement API exposes new information about the screens connected to a device, increasing the fingerprinting surface of users, especially those with multiple screens consistently connected to their devices. As one mitigation of this privacy concern, the exposed screen properties are limited to the minimum needed for common placement use cases. User permission is required for sites to get multi-screen information and place windows on other screens. User control # The user is in full control of the exposure of their setup. They can accept or decline the permission prompt, and revoke a previously granted permission via the site information feature in the browser. Transparency # The fact whether the permission to use the Multi-Screen Window Placement API has been granted is exposed in the browser's site information and is also queryable via the Permissions API. Permission persistence # The browser persists permission grants. The permission can be revoked via the browser's site information. Feedback # The Chrome team wants to hear about your experiences with the Multi-Screen Window Placement API. Tell us about the API design # Is there something about the API that does not work like you expected? 
Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue. Report a problem with the implementation # Did you find a bug with Chrome's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>WindowDialog in the Components box. Glitch works great for sharing quick and easy repros. Show support for the API # Are you planning to use the Multi-Screen Window Placement API? Your public support helps the Chrome team to prioritize features and shows other browser vendors how critical it is to support them. Share how you plan to use it on the WICG Discourse thread. Send a tweet to @ChromiumDev using the hashtag #WindowPlacement and let us know where and how you are using it. Ask other browser vendors to implement the API. Helpful links # Spec draft Public explainer Multi-Screen Window Placement API demo | Multi-Screen Window Placement API demo source Chromium tracking bug ChromeStatus.com entry Blink Component: Blink>WindowDialog Wanna go deeper # TAG Review Intent to Experiment Acknowledgements # The Multi-Screen Window Placement API spec was edited by Victor Costan and Joshua Bell. The API was implemented by Mike Wasserman. This article was reviewed by Joe Medley, François Beaufort, and Kayce Basques. Thanks to Laura Torrent Puig for the photos.
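As a closing sketch, the screen data returned by getScreens() can replace the guessed popup.moveTo(2500, 50) from earlier in this article. This is only an illustration under stated assumptions: pickOtherScreen and centeredFeatures are hypothetical helper names that are not part of the API, and the screen objects only need the primary, availLeft, availTop, availWidth, and availHeight fields that appear in the sample output above.

```javascript
// Hypothetical helper: choose a screen other than the primary one,
// falling back to the first screen when only one is connected.
function pickOtherScreen(screens) {
  return screens.find((screen) => !screen.primary) ?? screens[0];
}

// Hypothetical helper: build a windowFeatures string that centers a
// popup of the given size within the available area of `screen`.
function centeredFeatures(screen, width, height) {
  const left = screen.availLeft + Math.round((screen.availWidth - width) / 2);
  const top = screen.availTop + Math.round((screen.availHeight - height) / 2);
  return `left=${left},top=${top},width=${width},height=${height}`;
}

// Possible usage in a supporting browser, once permission is granted:
// const screens = await window.getScreens();
// const features = centeredFeatures(pickOtherScreen(screens), 400, 300);
// window.open("https://example.com/", "popup", features);
```

Keeping the arithmetic in plain functions means it can be exercised without a second screen attached. With the shim from the Polyfill section, the same code should simply fall back to the current screen.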

Apply effects to images with CSS's mask-image property

When you clip an element using the clip-path property the clipped area becomes invisible. If instead you want to make part of the image opaque or apply some other effect to it, then you need to use masking. This post explains how to use the mask-image property in CSS, which lets you specify an image to use as a mask layer. This gives you three options. You can use an image file as your mask, an SVG, or a gradient. Browser compatibility # Most browsers only have partial support for the standard CSS masking property. You will need to use the -webkit- prefix in addition to the standard property in order to achieve the best browser compatibility. See Can I use CSS Masks? for full browser support information. While browser support using the prefixed property is good, when using masking to make text on top of an image visible take care of what will happen if masking is unavailable. It may be worth using feature queries to detect support for mask-image or -webkit-mask-image and providing a readable fallback before adding your masked version. @supports(-webkit-mask-image: url(#mask)) or (mask-image: url(#mask)) { /* code that requires mask-image here. */ } Masking with an image # The mask-image property works in a similar way to the background-image property. Use a url() value to pass in an image. Your mask image needs to have a transparent or semi-transparent area. A fully transparent area will cause the part of the image under that area to be invisible. Using an area which is semi-transparent however will allow some of the original image to show through. You can see the difference in the Glitch below. The first image is the original image of balloons with no mask. The second image has a mask applied which has a white star on a fully transparent background. The third image has a white star on a background with a gradient transparency. In this example I am also using the mask-size property with a value of cover. This property works in the same way as background-size. 
You can use the keywords cover and contain or you can give the background a size using any valid length unit, or a percentage. You can also repeat your mask just as you might repeat a background image, in order to use a small image as a repeating pattern. Masking with SVG # Rather than using an image file as the mask, you could use SVG. There are a couple of ways this can be achieved. The first is to have a <mask> element inside the SVG and reference the ID of that element in the mask-image property. <svg width="0" height="0" viewBox="0 0 400 300"> <defs> <mask id="mask"> <rect fill="#000000" x="0" y="0" width="400" height="300"></rect> <circle fill="#FFFFFF" cx="150" cy="150" r="100" /> <circle fill="#FFFFFF" cx="50" cy="50" r="150" /> </mask> </defs> </svg> <div class="container"> <img src="balloons.jpg" alt="Balloons"> </div> .container img { height: 100%; width: 100%; object-fit: cover; -webkit-mask-image: url(#mask); mask-image: url(#mask); } The advantage of this approach is that the mask could be applied to any HTML element, not just an image. Unfortunately Firefox is the only browser that supports this approach. All is not lost however, as for the most common scenario of masking an image, we can include the image in the SVG. Masking with a gradient # Using a CSS gradient as your mask is an elegant way of achieving a masked area without needing to go to the trouble of creating an image or SVG. A simple linear gradient used as a mask could ensure that the bottom part of an image will not be too dark underneath a caption, for example. You can use any of the supported gradient types, and get as creative as you like. This next example uses a radial gradient to create a circular mask to illuminate behind the caption. Using multiple masks # As with background images you can specify multiple mask sources, combining them to get the effect that you want. This is particularly useful if you want to use a pattern generated with CSS gradients as your mask. 
These typically will use multiple background images and so can be translated easily into a mask. As an example, I found a nice checkerboard pattern in this article. The code, using background images, looks like this: background-image: linear-gradient(45deg, #ccc 25%, transparent 25%), linear-gradient(-45deg, #ccc 25%, transparent 25%), linear-gradient(45deg, transparent 75%, #ccc 75%), linear-gradient(-45deg, transparent 75%, #ccc 75%); background-size:20px 20px; background-position: 0 0, 0 10px, 10px -10px, -10px 0px; To turn this, or any other pattern designed for background images, into a mask, you will need to replace the background-* properties with the relevant mask properties, including the -webkit prefixed ones. -webkit-mask-image: linear-gradient(45deg, #000000 25%, rgba(0,0,0,0.2) 25%), linear-gradient(-45deg, #000000 25%, rgba(0,0,0,0.2) 25%), linear-gradient(45deg, rgba(0,0,0,0.2) 75%, #000000 75%), linear-gradient(-45deg, rgba(0,0,0,0.2) 75%, #000000 75%); -webkit-mask-size:20px 20px; -webkit-mask-position: 0 0, 0 10px, 10px -10px, -10px 0px; There are some really nice effects to be made by applying gradient patterns to images. Try remixing the Glitch and testing out some other variations. Along with clipping, CSS masks are a way to add interest to images and other HTML elements without needing to use a graphics application. Photo by Julio Rionaldo on Unsplash.
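If you convert patterns like the checkerboard often, the property renaming can be scripted. The sketch below is an assumption-laden illustration rather than an established tool: backgroundToMask is a hypothetical helper that copies each background-* declaration with a mask counterpart into both the -webkit- prefixed and the standard mask property. It does not touch the values, so colors still need adjusting by hand as in the example above, because only the alpha of the mask matters.

```javascript
// Background sub-properties that have a direct mask-* counterpart.
const MASKABLE = ["image", "size", "position", "repeat", "origin", "clip"];

// Hypothetical helper: map background-* declarations to mask declarations,
// emitting both the -webkit- prefixed and the standard property names.
function backgroundToMask(declarations) {
  const mask = {};
  for (const [property, value] of Object.entries(declarations)) {
    const suffix = property.replace(/^background-/, "");
    if (suffix === property || !MASKABLE.includes(suffix)) {
      continue; // Not a background-* property with a mask equivalent.
    }
    mask[`-webkit-mask-${suffix}`] = value;
    mask[`mask-${suffix}`] = value;
  }
  return mask;
}
```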

Create interesting image shapes with CSS's clip-path property

Elements on web pages are all defined inside a rectangular box. However that doesn't mean that we have to make everything look like a box. You can use the CSS clip-path property to clip away parts of an image or other element, to create interesting effects. In the example above, the balloon image is square (source). Using clip-path and the basic shape value of circle() the additional sky around the balloon is clipped away leaving a circular image on the page. As the image is a link you can see something else about the clip-path property. Only the visible area of the image can be clicked on, as events do not fire on the hidden parts of the image. Clipping can be applied to any HTML element, not just images. There are a few different ways to create a clip-path, in this post we will take a look at them. Browser compatibility # Other than the box values as explained later in the post, the various values of clip-path demonstrated have excellent browser support. For legacy browsers a fallback may be to allow the browser to ignore the clip-path property and show the unclipped image. If this is a problem you could test for clip-path in a feature query and offer an alternate layout for unsupporting browsers. @supports(clip-path: circle(45%)) { /* code that requires clip-path here. */ } Basic shapes # The clip-path property can take a number of values. The value used in the initial example was circle(). This is one of the basic shape values, which are defined in the CSS Shapes specification. This means that you can clip an area, and also use the same value for shape-outside to cause text to wrap around that shape. Note that CSS Shapes can only be applied to floated elements. The clip-path property does not require the element to be floated. The full list of basic shapes is: inset() circle() ellipse() polygon() inset() # The inset() value insets the clipped area from the edge of the element, and can be passed values for the top, right, bottom, and left edges. 
A border-radius can also be added to curve the corners of the clipped area, by using the round keyword. In my example I have two boxes both with a class of .box. The first box has no clipping, the second is clipped using inset() values. circle() # As you have seen, the circle() value creates a circular clipped area. The first value is a length or a percentage and is the radius of the circle. A second optional value allows you to set the center of the circle. In the example below I am using keyword values to set my clipped circle top right. You could also use lengths or percentages. Watch out for flat edges! # Be aware with all of these values that the shape will be clipped by the margin box on the element. If you create a circle on an image, and that shape would extend outside of the natural size of the image, you will get a flat edge. The image used earlier now has circle(50%) applied. As the image is not square, we hit the margin box at the top and bottom and the circle is clipped. ellipse() # An ellipse is essentially a squashed circle, and so acts very much like circle() but accepts a radius for x and a radius for y, plus the value for the center of the ellipse. polygon() # The polygon() value can help you create fairly complex shapes, defining as many points as you need, by setting the coordinates of each point. To help you create polygons and see what is possible check out Clippy, a clip-path generator, then copy and paste the code into your own project. Shapes from box values # Also defined in CSS Shapes are shapes from box values. These relate to the CSS Box Model -- the content box, padding box, border box, and margin box with keyword values of content-box, border-box, padding-box, and margin-box. These values can be used alone, or alongside a basic shape to define the reference box used by the shape. For example, the following would clip the shape to the edge of the content. 
.box { clip-path: content-box; } In this example the circle would use the content-box as the reference box rather than the margin-box (which is the default). .box { clip-path: circle(45%) content-box; } Currently browsers do not support the use of box values for the clip-path property. They are supported for shape-outside however. Using an SVG element # For more control over your clipped area than is possible with basic shapes, use an SVG clipPath element. Then reference that ID, using url() as the value for clip-path. Animating the clipped area # CSS transitions and animations can be applied to the clip-path to create some interesting effects. In this next example I am animating a circle on hover by transitioning between two circles with a different radius value. There are lots of creative ways in which animation can be used with clipping. Animating with clip-path on CSS Tricks runs through some ideas. Photo by Matthew Henry on Burst.
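As an alternative to writing polygon() coordinates by hand, the points can be computed. The following is a minimal sketch, assuming you want a regular shape inscribed in the element's reference box; regularPolygon is a hypothetical helper name, and the percentages are rounded to one decimal place.

```javascript
// Hypothetical helper: build a polygon() value for a regular shape with
// `sides` corners, inscribed in the element's reference box.
function regularPolygon(sides) {
  if (!Number.isInteger(sides) || sides < 3) {
    throw new RangeError("A polygon needs at least 3 sides.");
  }
  const points = [];
  for (let i = 0; i < sides; i++) {
    // Start at the top center and walk clockwise around the circle.
    const angle = (2 * Math.PI * i) / sides - Math.PI / 2;
    const x = 50 + 50 * Math.cos(angle);
    const y = 50 + 50 * Math.sin(angle);
    points.push(`${x.toFixed(1)}% ${y.toFixed(1)}%`);
  }
  return `polygon(${points.join(", ")})`;
}

// Possible usage in a page:
// document.querySelector(".box").style.clipPath = regularPolygon(6);
```

Because the points are percentages, the shape scales with the element, and on a non-square element it stretches rather than staying regular.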

Introducing <model-viewer> 1.1

3D models are more relevant than ever. Retailers bring in-store shopping experiences to customers' homes. Museums are making 3D models of their artifacts available to everyone on the web. Unfortunately, it can be difficult to add a 3D model to a website in a way that provides a great user experience without a deep knowledge of 3D technologies or resorting to hosting 3D content on a third-party site. The <model-viewer> web component, introduced in early 2019, seeks to make putting 3D models on the web as easy as writing a few lines of HTML. Since then, the team has been working to address feedback and requests from the community. The culmination of that work was <model-viewer> version 1.0, released earlier this year. We're now announcing the release of <model-viewer> 1.1. You can read the release notes on GitHub. What's new since last year? # Version 1.1 includes built-in support for augmented reality (AR) on the web, improvements to speed and fidelity, and other frequently-requested features. Augmented reality # Viewing a 3D model on a blank canvas is great, but being able to view it in your space is even better. For an entirely-within-the-browser 3D and AR experience, Chrome on Android supports augmented reality using WebXR. When the <model-viewer> AR capability is ready, you'll be able to use it by adding an ar attribute to the <model-viewer> tag. Other attributes allow you to customize the WebXR AR experience, as shown in the WebXR sample on modelviewer.dev. The code sample below shows what this might look like. <model-viewer src="Chair.glb" ar ar-scale="auto" camera-controls alt="A 3D model of an office chair."> </model-viewer> It looks something like the embedded video shown under this heading. Camera controls # <model-viewer> now gives full control over the view's virtual camera (the perspective of the viewer). This includes the camera target, orbit (position relative to the model), and field of view.
You can also enable auto-rotation and set limits on user interaction (e.g. maximum and minimum fields of view). Annotations # You can also annotate your models using HTML and CSS. This capability is often used to "attach" labels to parts of the model in a way that moves with the model as it's manipulated. The annotations are customizable, including their appearance and the extent to which they're hidden by the model. Annotations also work in AR. <style> button{ display: block; width: 6px; height: 6px; border-radius: 3px; border: 3px solid blue; background-color: blue; box-sizing: border-box; } #annotation{ background-color: #dddddd; position: absolute; transform: translate(10px, 10px); border-radius: 10px; padding: 10px; } </style> <model-viewer src="https://modelviewer.dev/assets/ShopifyModels/ToyTrain.glb" alt="A 3D model of a Toy Train" camera-controls> <button slot="hotspot-hand" data-position="-0.023 0.0594 0.0714" data-normal="-0.3792 0.0004 0.9253"> <div id="annotation">Whistle</div> </button> </model-viewer> A space suit with an annotation. See the annotations documentation page for more information. Editor # Version 1.1 introduces and hosts a <model-viewer> "editing" tool, which enables you to quickly preview your model, try out different <model-viewer> configurations (e.g. exposure and shadow softness), generate a poster image, and interactively get coordinates for annotations. Rendering and performance improvements # Rendering fidelity is greatly improved, especially for high dynamic range (HDR) environments. <model-viewer> now also uses a direct render path when only one <model-viewer> element is in the viewport, which increases performance (especially on Firefox). Lastly, dynamically scaling the render resolution improved frame rate dramatically. The example below shows off some of these recent improvements.
<model-viewer camera-controls skybox-image="spruit_sunrise_1k_HDR.hdr" alt="A 3D model of a well-worn helmet" src="DamagedHelmet.glb"></model-viewer> A 3D model of a well-worn helmet. Stability # With <model-viewer> reaching its first major version, API stability is a priority, so breaking changes will be avoided until version 2.0 is released. What's next? # <model-viewer> version 1.0 includes the most-requested capabilities, but the team is not done yet. More features will be added, as will improvements in performance, stability, documentation, and tooling. If you have suggestions, file an issue on GitHub; PRs are always welcome. You can stay connected by following <model-viewer> on Twitter and checking out the community chat on Spectrum.

Custom bullets with CSS ::marker

Thanks to Igalia, sponsored by Bloomberg, we can finally put our hacks away for styling lists. See! View Source Thanks to CSS ::marker we can change the content and some of the styles of bullets and numbers. Browser compatibility # ::marker is supported in Firefox for desktop and Android, desktop Safari and iOS Safari (but only the color and font-* properties, see Bug 204163), and Chromium-based desktop and Android browsers. See MDN's Browser compatibility table for updates. Pseudo-elements # Consider the following essential HTML unordered list: <ul> <li>Lorem ipsum dolor sit amet consectetur adipisicing elit</li> <li>Dolores quaerat illo totam porro</li> <li>Quidem aliquid perferendis voluptates</li> <li>Ipsa adipisci fugit assumenda dicta voluptates nihil reprehenderit consequatur alias facilis rem</li> <li>Fuga</li> </ul> Which results in the following unsurprising rendering: The dot at the beginning of each <li> item is free! The browser is drawing and creating a generated marker box for you. Today we're excited to talk about the ::marker pseudo-element, which gives you the ability to style the bullet element that browsers create for you. Key Term: A pseudo-element represents an element in the document other than those which exist in the document tree. For example, you can select the first line of a paragraph using the pseudo-element p::first-line, even though there is no HTML element wrapping that line of text. Creating a marker # The ::marker pseudo-element marker box is automatically generated inside every list item element, preceding the actual contents and the ::before pseudo-element. li::before { content: "::before"; background: lightgray; border-radius: 1ch; padding-inline: 1ch; margin-inline-end: 1ch; } Typically, list items are <li> HTML elements, but other elements can also become list items with display: list-item.
<dl> <dt>Lorem</dt> <dd>Lorem ipsum dolor sit amet consectetur adipisicing elit</dd> <dd>Dolores quaerat illo totam porro</dd> <dt>Ipsum</dt> <dd>Quidem aliquid perferendis voluptates</dd> </dl> dd { display: list-item; list-style-type: "🤯"; padding-inline-start: 1ch; } Styling a marker # Until ::marker, lists could be styled using list-style-type and list-style-image to change the list item symbol with 1 line of CSS: li { list-style-image: url(/right-arrow.svg); /* OR */ list-style-type: '👉'; padding-inline-start: 1ch; } That's handy but we need more. What about changing the color, size, spacing, etc!? That's where ::marker comes to the rescue. It allows individual and global targeting of these pseudo-elements from CSS: li::marker { color: hotpink; } li:first-child::marker { font-size: 5rem; } Caution: If the above list does not have pink bullets, then ::marker is not supported in your browser. The list-style-type property gives very limited styling possibilities. The ::marker pseudo-element means that you can target the marker itself and apply styles directly to it. This allows for far more control. That said, you can't use every CSS property on a ::marker. The list of which properties are allowed and not allowed are clearly indicated in the spec. If you try something interesting with this pseudo-element and it doesn't work, the list below is your guide into what can and can't be done with CSS: Allowed CSS ::marker Properties # animation-* transition-* color direction font-* content unicode-bidi white-space Changing the contents of a ::marker is done with content as opposed to list-style-type. In this next example the first item is styled using list-style-type and the second with ::marker. The properties in the first case apply to the entire list item, not just the marker, which means that the text is animating as well as the marker. When using ::marker we can target just the marker box and not the text. 
Also, note how the disallowed background property has no effect. List Styles li:nth-child(1) { list-style-type: '?'; font-size: 2rem; background: hsl(200 20% 88%); animation: color-change 3s ease-in-out infinite; } Mixed results between the marker and the list item Marker Styles li:nth-child(2)::marker { content: '!'; font-size: 2rem; background: hsl(200 20% 88%); animation: color-change 3s ease-in-out infinite; } Focused results between marker and list item Gotchas! In Chromium, white-space only works for inside positioned markers. For outside positioned markers, the style adjuster always forces white-space: pre in order to preserve the trailing space. Changing the content of a marker # Here are some of the ways you could style your markers. Changing all list items li { list-style-type: "😍"; } /* OR */ li::marker { content: "😍"; } Changing just one list item li:last-child::marker { content: "😍"; } Changing a list item to SVG li::marker { content: url(/heart.svg); content: url(#heart); content: url("data:image/svg+xml;charset=UTF-8,<svg xmlns='http://www.w3.org/2000/svg' version='1.1' height='24' width='24'><path d='M12 21.35l-1.45-1.32C5.4 15.36 2 12.28 2 8.5 2 5.42 4.42 3 7.5 3c1.74 0 3.41.81 4.5 2.09C13.09 3.81 14.76 3 16.5 3 19.58 3 22 5.42 22 8.5c0 3.78-3.4 6.86-8.55 11.54L12 21.35z' fill='none' stroke='hotpink' stroke-width='3'/></svg>"); } Changing numbered lists What about an <ol> though? The marker on an ordered list item is a number and not a bullet by default. In CSS these are called Counters, and they're quite powerful. They even have properties to set and reset where the number starts and ends, or switching them to roman numerals. Can we style that? Yep, and we can even use the marker content value to build our own numbering presentation. li::marker { content: counter(list-item) "› "; color: hotpink; } Debugging # Chrome DevTools is ready to help you inspect, debug and modify the styles applying to ::marker pseudo elements. 
Future Pseudo-element styling # You can find out more about ::marker from: CSS Lists, Markers and Counters from Smashing Magazine Counting With CSS Counters and CSS Grid from CSS-Tricks Using CSS Counters from MDN It's great to get access to something which has been hard to style. You might wish that you could style other automatically generated elements. You might be frustrated with <details> or the search input autocomplete indicator, things that are not implemented in the same way across browsers. One way to share what you need is by creating a want at https://webwewant.fyi.

Help users change passwords easily by adding a well-known URL for changing passwords

Set a redirect from /.well-known/change-password to the change password page of your website. This will enable password managers to navigate your users directly to that page. Introduction # As you may know, passwords are not the best way to manage accounts. Luckily, there are emerging technologies such as WebAuthn and techniques such as one-time passwords that are helping us get closer to a world without passwords. However, these technologies are still being developed and things won't change rapidly. Many developers will still need to deal with passwords for at least the next few years. While we wait for the emerging technologies and techniques to become commonplace, we can at least make passwords easier to use. A good way to do this is to provide better support for password managers. How password managers help # Password managers can be built into browsers or provided as third-party apps. They can help users in various ways: Autofill the password for the correct input field: Some browsers can find the correct input heuristically even if the website is not optimized for this purpose. Web developers can help password managers by correctly annotating HTML input tags. Prevent phishing: Because password managers remember where the password was recorded, the password can be autofilled only at appropriate URLs, and not at phishing websites. Generate strong and unique passwords: Because strong and unique passwords are directly generated and stored by the password manager, users don't have to remember a single character of the password. Generating and autofilling passwords using a password manager have already served the web well, but considering their lifecycle, updating the passwords whenever it's required is as important as generating and autofilling. 
To properly leverage that, password managers are adding a new feature: Detect vulnerable passwords and suggest updating them: Password managers can detect passwords that are reused, analyze the entropy and weakness of them, and even detect potentially leaked passwords or ones that are known to be unsafe from sources such as Have I Been Pwned. A password manager can warn users about problematic passwords, but there's a lot of friction in asking users to navigate from the homepage to a change password page, on top of going through the actual process of changing the password (which varies from site to site). It would be much easier if password managers could navigate the user directly to the change-password URL. This is where a well-known URL for changing passwords becomes useful. By reserving a well-known URL path that redirects the user to the change password page, the website can easily redirect users to the right place to change their passwords. Set up "a well-known URL for changing passwords" # .well-known/change-password is proposed as a well-known URL for changing passwords. All you have to do is to configure your server to redirect requests for .well-known/change-password to the change password URL of your website. For example, let's say your website is https://example.com and the change password URL is https://example.com/settings/password. You'll just need to set your server to redirect a request for https://example.com/.well-known/change-password to https://example.com/settings/password. That's it. For the redirection, use the HTTP status code 302 Found, 303 See Other or 307 Temporary Redirect. Alternatively you can serve HTML at your .well-known/change-password URL with a <meta> tag using an http-equiv="refresh". <meta http-equiv="refresh" content="0;url=https://example.com/settings/password"> Revisit your change password page HTML # The goal of this feature is to help the user's password lifecycle be more fluid. 
You can do two things to empower the user to update their password without friction: If your change-password form needs the current password, add autocomplete="current-password" to the <input> tag to help the password manager autofill it. For the new password field (in many cases it's two fields to ensure that the user has entered the new password correctly), add autocomplete="new-password" to the <input> tag to help the password manager suggest a generated password. Learn more at Sign-in form best practices. How it is used in the real world # Examples # Thanks to Apple Safari's implementation, /.well-known/change-password has already been available on some major websites for a while: Google, GitHub, Facebook, Twitter, and WordPress. Try them yourself and do the same for yours! Browser compatibility # A well-known URL for changing passwords has been supported in Safari since 2019. Chrome's password manager is starting to support it from version 86 onwards (which is scheduled for Stable release in late October 2020) and other Chromium-based browsers may follow. Firefox considers it worth implementing, but has not signalled that it plans to do so as of August 2020. Chrome's password manager behavior # Let's have a look at how Chrome's password manager treats vulnerable passwords. Chrome's password manager is able to check for leaked passwords. By navigating to chrome://settings/passwords users can run Check passwords against stored passwords, and see a list of passwords that are recommended for update. (Figure: Check passwords functionality.) By clicking the Change password button next to a password that is recommended to be updated, the browser will: Open the website's change password page if /.well-known/change-password is set up correctly. Open the website's homepage if /.well-known/change-password is not set up and Google doesn't know the fallback. 200 OK even if /.well-known/change-password doesn't exist?
Password managers try to determine if a website supports a well-known URL for changing passwords by sending a request to /.well-known/change-password before actually forwarding a user to this URL. If the request returns 404 Not Found it is obvious that the URL is not available, but a 200 OK response doesn't necessarily mean that the URL is available, because there are a few edge cases: A server-side-rendering website displays "Not found" when there is no content but with 200 OK. A server-side-rendering website responds with 200 OK when there is no content after redirecting to the "Not found" page. A single page app responds with the shell with 200 OK and renders the "Not found" page on the client side when there is no content. For these edge cases users will be forwarded to a "Not Found" page and that will be a source of confusion. That's why there's a proposed standard mechanism to determine whether the server is configured to respond with 404 Not Found when there is genuinely no content, by requesting a random page. Actually, the URL is also reserved: /.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200. Chrome for example uses this URL path to determine whether it can expect a proper change password URL from /.well-known/change-password in advance. When you are deploying /.well-known/change-password, make sure that your server returns 404 Not Found for any non-existent content. Feedback # If you have any feedback on the specification, please file an issue in the spec repository. Resources # A Well-Known URL for Changing Passwords; Detecting the reliability of HTTP status codes; Sign-in form best practices. Photo by Matthew Brodeur on Unsplash
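The server behavior described above can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not a definitive implementation: the `resolveWellKnown` and `handleRequest` names are hypothetical, and `/settings/password` is simply the example change password path used earlier. A real site would express the same rules in its own server or framework configuration.

```javascript
// Hypothetical change password page, taken from the example.com illustration.
const CHANGE_PASSWORD_PATH = '/settings/password';

// Hypothetical helper: maps a request path to the response a server could send.
function resolveWellKnown(pathname) {
  if (pathname === '/.well-known/change-password') {
    // 302 Found, 303 See Other, or 307 Temporary Redirect are all acceptable.
    return { status: 302, headers: { Location: CHANGE_PASSWORD_PATH } };
  }
  if (pathname === CHANGE_PASSWORD_PATH) {
    // The actual change password page exists.
    return { status: 200, headers: {} };
  }
  // Everything else, including
  // /.well-known/resource-that-should-not-exist-whose-status-code-should-not-be-200,
  // gets a genuine 404 so that password managers can trust the status codes.
  return { status: 404, headers: {} };
}

// Hypothetical adapter for a Node.js-style (request, response) handler.
function handleRequest(req, res) {
  const { pathname } = new URL(req.url, 'http://localhost');
  const { status, headers } = resolveWellKnown(pathname);
  res.writeHead(status, headers);
  res.end();
}
```

To try it, `handleRequest` could be passed to `http.createServer()` from Node's `node:http` module. Keeping the decision in a pure function makes it easy to check that unknown paths really produce 404 Not Found rather than a 200 OK "Not found" page.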

Use advanced typography with local fonts

The Local Font Access API is part of the capabilities project and is currently in development. This post will be updated as the implementation progresses. Web safe fonts # If you have been doing web development long enough, you may remember the so-called web safe fonts. These fonts are known to be available on nearly all instances of the most used operating systems (namely Windows, macOS, the most common Linux distributions, Android, and iOS). In the early 2000s, Microsoft even spearheaded an initiative called TrueType core fonts for the Web that provided these fonts for free download with the objective that "whenever you visit a Web site that specifies them, you'll see pages exactly as the site designer intended". Yes, this included sites set in Comic Sans MS. Here is what a classic web safe font stack (with the ultimate fallback of whatever sans-serif font) might look like: body { font-family: Helvetica, Arial, sans-serif; } Web fonts # The days when web safe fonts really mattered are long gone. Today, we have web fonts, some of which are even variable fonts that we can tweak further by changing the values for the various exposed axes. You can use web fonts by declaring an @font-face block at the start of the CSS, which specifies the font file(s) to download: @font-face { font-family: 'FlamboyantSansSerif'; src: url('flamboyant.woff2'); } After this, you can then use the custom web font by specifying the font-family, as normal: body { font-family: 'FlamboyantSansSerif'; } Local fonts as fingerprint vector # Most web fonts come from, well, the web. An interesting fact, though, is that the src property in the @font-face declaration, apart from the url() function, also accepts a local() function. This allows custom fonts to be loaded (surprise!) locally.
If the user happens to have FlamboyantSansSerif installed on their operating system, the local copy will be used rather than it being downloaded: @font-face { font-family: 'FlamboyantSansSerif'; src: local('FlamboyantSansSerif'), url('flamboyant.woff2'); } This approach provides a nice fallback mechanism that potentially saves bandwidth. On the Internet, unfortunately, we cannot have nice things. The problem with the local() function is that it can be abused for browser fingerprinting. Turns out, the list of fonts a user has installed can be pretty identifying. A lot of companies have their own corporate fonts that are installed on employees' laptops. For example, Google has a corporate font called Google Sans. An attacker can try to determine what company someone works for by testing for the existence of a large number of known corporate fonts like Google Sans. The attacker would attempt rendering text set in these fonts on a canvas and measure the glyphs. If the glyphs match the known shape of the corporate font, the attacker has a hit. If the glyphs do not match, the attacker knows that a default replacement font was used since the corporate font was not installed. For full details on this and other browser fingerprinting attacks, read the survey paper by Laperdrix et al. Company fonts apart, even just the list of installed fonts can be identifying. The situation with this attack vector has become so bad that recently the WebKit team decided to "only include [in the list of available fonts] web fonts and fonts that come with the operating system, but not locally user-installed fonts". (And here I am, with an article on granting access to local fonts.) The Local Font Access API # The beginning of this article may have put you in a negative mood. Can we really not have nice things? Fret not. We think we can, and maybe everything is not hopeless. But first, let me answer a question that you might be asking yourself.
Why do we need the Local Font Access API when there are web fonts? # Professional-quality design and graphics tools have historically been difficult to deliver on the web. One stumbling block has been an inability to access and use the full variety of professionally constructed and hinted fonts that designers have locally installed. Web fonts enable some publishing use-cases, but fail to enable programmatic access to the vector glyph shapes and font tables used by rasterizers to render the glyph outlines. There is likewise no way to access a web font's binary data. Design tools need access to font bytes to do their own OpenType layout implementation and allow design tools to hook in at lower levels, for actions such as performing vector filters or transforms on the glyph shapes. Developers may have legacy font stacks for their applications that they are bringing to the web. To use these stacks, they usually require direct access to font data, something web fonts do not provide. Some fonts may not be licensed for delivery over the web. For example, Linotype has a license for some fonts that only includes desktop use. The Local Font Access API is an attempt at solving these challenges. It consists of two parts: A font enumeration API, which allows users to grant access to the full set of available system fonts. From each enumeration result, the ability to request low-level (byte-oriented) SFNT container access that includes the full font data. Current status # Step Status 1. Create explainer Complete 2. Create initial draft of specification In progress 3. Gather feedback & iterate on design In progress 4. Origin trial In progress 5. Launch Not started How to use the Local Font Access API # Enabling via chrome://flags # To experiment with the Local Font Access API locally, enable the #font-access flag in chrome://flags. Enabling support during the origin trial phase # Starting in Chrome 87, the Local Font Access API will be available as an origin trial in Chrome. 
The origin trial is expected to end in Chrome 89 (April 7, 2021). Origin trials allow you to try new features and give feedback on their usability, practicality, and effectiveness to the web standards community. For more information, see the Origin Trials Guide for Web Developers. To sign up for this or another origin trial, visit the registration page. Register for the origin trial # Request a token for your origin. Add the token to your pages. There are two ways to do that: Add an origin-trial <meta> tag to the head of each page. For example, this may look something like: <meta http-equiv="origin-trial" content="TOKEN_GOES_HERE"> If you can configure your server, you can also add the token using an Origin-Trial HTTP header. The resulting response header should look something like: Origin-Trial: TOKEN_GOES_HERE Feature detection # To check if the Local Font Access API is supported, use: if ('fonts' in navigator) { // The Local Font Access API is supported } Asking for permission # Access to a user's local fonts is gated behind the "local-fonts" permission, which you can request with navigator.permissions.request(). // Ask for permission to use the API try { const status = await navigator.permissions.request({ name: 'local-fonts', }); if (status.state !== 'granted') { throw new Error('Permission to access local fonts not granted.'); } } catch (err) { // A `TypeError` indicates the 'local-fonts' // permission is not yet implemented, so // only `throw` if this is _not_ the problem. if (err.name !== 'TypeError') { throw err; } } Enumerating local fonts # Once the permission has been granted, you can then, from the FontManager interface that is exposed on navigator.fonts, call query() to ask the browser for the locally installed fonts, which it will display in a picker for the user to select all or a subset from to be shared with the page. This results in an array that you can loop over. 
Each font is represented as a FontMetadata object with the properties family (for example, "Comic Sans MS"), fullName (for example, "Comic Sans MS"), and postscriptName (for example, "ComicSansMS"). // Query for all available fonts and log metadata. try { const pickedFonts = await navigator.fonts.query(); for (const metadata of pickedFonts) { console.log(metadata.postscriptName); console.log(metadata.fullName); console.log(metadata.family); } } catch (err) { console.error(err.name, err.message); } Accessing SFNT data # Full SFNT access is available via the blob() method of the FontMetadata object. SFNT is a font file format which can contain other fonts, such as PostScript, TrueType, OpenType, Web Open Font Format (WOFF) fonts and others. try { const pickedFonts = await navigator.fonts.query(); for (const metadata of pickedFonts) { // We're only interested in a particular font. if (metadata.family !== 'Comic Sans MS') { continue; } // `blob()` returns a Blob containing valid and complete // SFNT-wrapped font data. const sfnt = await metadata.blob(); const sfntVersion = new TextDecoder().decode( // Slice out only the bytes we need: the first 4 bytes are the SFNT // version info. // Spec: https://docs.microsoft.com/en-us/typography/opentype/spec/otff#organization-of-an-opentype-font await sfnt.slice(0, 4).arrayBuffer(), ); let outlineFormat = 'UNKNOWN'; switch (sfntVersion) { case '\x00\x01\x00\x00': case 'true': case 'typ1': outlineFormat = 'truetype'; break; case 'OTTO': outlineFormat = 'cff'; break; } console.log('Outline format:', outlineFormat); } } catch (err) { console.error(err.name, err.message); } Demo # You can see the Local Font Access API in action in the demo below. Be sure to also check out the source code. The demo showcases a custom element called <font-select> that implements a local font picker. Privacy considerations # The "local-fonts" permission appears to provide a highly fingerprintable surface. 
However, browsers are free to return anything they like. For example, anonymity-focused browsers may choose to only provide a set of default fonts built into the browser. Similarly, browsers are not required to provide table data exactly as it appears on disk. Wherever possible, the Local Font Access API is designed to only expose exactly the information needed to enable the mentioned use cases. System APIs may produce a list of installed fonts not in a random or a sorted order, but in the order of font installation. Returning exactly the list of installed fonts given by such a system API can expose additional data that may be used for fingerprinting, and use cases we want to enable are not assisted by retaining this ordering. As a result, this API requires that the returned data be sorted before being returned. Security and permissions # The Chrome team has designed and implemented the Local Font Access API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics. User control # Access to a user's fonts is fully under their control and will not be allowed unless the "local-fonts" permission, as listed in the permission registry, is granted. Transparency # Whether a site has been granted access to the user's local fonts will be visible in the site information sheet. Permission persistence # The "local-fonts" permission will be persisted between page reloads. It can be revoked via the site information sheet. Feedback # The Chrome team wants to hear about your experiences with the Local Font Access API. Tell us about the API design # Is there something about the API that does not work like you expected? Or are there missing methods or properties that you need to implement your idea? Have a question or comment on the security model? File a spec issue on the corresponding GitHub repo, or add your thoughts to an existing issue. 
Report a problem with the implementation # Did you find a bug with Chrome's implementation? Or is the implementation different from the spec? File a bug at new.crbug.com. Be sure to include as much detail as you can, simple instructions for reproducing, and enter Blink>Storage>FontAccess in the Components box. Glitch works great for sharing quick and easy repros. Show support for the API # Are you planning to use the Local Font Access API? Your public support helps the Chrome team to prioritize features and shows other browser vendors how critical it is to support them. Send a tweet to @ChromiumDev using the hashtag #LocalFontAccess and let us know where and how you're using it. Helpful links # Explainer Spec draft Chromium bug for font enumeration Chromium bug for font table access ChromeStatus entry GitHub repo TAG review Mozilla standards position Origin Trial Acknowledgements # The Local Font Access API spec was edited by Emil A. Eklund, Alex Russell, Joshua Bell, and Olivier Yiptong. This article was reviewed by Joe Medley, Dominik Röttsches, and Olivier Yiptong. Hero image by Brett Jordan on Unsplash.

ARIA: poison or antidote?

What is ARIA? # ARIA lets web authors create an alternative reality, seen only by screen readers 🤥 Sometimes it's necessary to expand on the truth or even downright "lie" to screen readers about what's happening in web content. For example, "focus is really over here!" or "this is really a slider!". It's like adding magical sticky notes on top of tools and widgets on your workbench. These magical sticky notes make everyone believe what's written on them. Whenever a magical sticky note exists, it either overrides our belief about what each tool is, or something about the tool. Example: "this thing over here is a glue gun!". Even though it's still actually an empty blue box sitting there on the workbench, the magical sticky note will make us see it is a glue gun. We can also add, "and it is 30% full!". The screen reader will now report that there is a 30% full glue gun there. The web equivalent to this is to take a plain box element (a div) with an image inside of it, and use ARIA to say it's a slider at value 30 out of 100. What isn't ARIA? # ARIA does not affect the appearance of a web page, or the behavior for a mouse or keyboard user. Only users of assistive technologies will notice any difference from ARIA. Web developers can add any arbitrary ARIA without affecting users that aren't running an assistive technology. You read it right: ARIA doesn't actually do anything to keyboard focus or tab order. That's all done in HTML, sometimes tweaked with bits of JavaScript. How does ARIA work? # Browsers are asked by a screen reader or other assistive technology for information about each element. When ARIA is present on an element, the browser takes in the information and changes what it tells the screen reader about that element. Why ARIA? # Why would we ever want to lie to our users!? Let's say the local web store doesn't sell all the widgets we need. But, we are MacGyver, dammit. We can just invent our own widgets from other widgets! 
FWIW, MacGyver's seven most used things are Swiss Army knives, gum, shoe strings, matches, paper clips, birthday candles, and duct tape. He uses them to make bombs and other things that aren't just lying around. This is pretty similar to a web author who needs to make a menu bar. Menu bars are so useful you would think they would be part of HTML, but they aren't. Oh well! You didn't think authors would be happy with links and buttons, did you? So the author will cobble one together using their favorite tools: divs, images, style, click handlers, keypress handlers, spit, and ARIA. Sometimes, rather than using ARIA to the max, we just use it as an enhancement. It can be useful to sprinkle a little ARIA on some HTML that already basically works. For example, we might want a form control to point to an error message alert that relates to some invalid input. Or we might want to indicate that a textbox is for searching. These little tweaks can make ordinary websites more usable with a screen reader. Menu bar example # Supporting mouse clicker people # Let's make a menu bar together. We show a bunch of items in generic box elements called divs. Any time our user clicks on a div, it executes the corresponding command. Cool, it works for mouse clicker people! Next we make it look pretty. We use CSS, i.e. styles, lining things up nicely and putting visual outlines around them. We make it look enough like other menu bars that sighties intuitively know that it's a menu bar and how to use it. Our menu bar even uses a different background color on any item that the mouse is over, giving the user some helpful visual feedback. Some menu items are parents. They spawn child submenus. Whenever the user hovers on one of these we start an animation that slides out the child submenu. This, of course, is all pretty inaccessible, as is the usual case for many things on the web, largely because the HTML standards wizards didn't add everything a web author needs.
And even if they did, web authors would always want to invent their own special version anyway. Making our menu bar keyboard accessible # As a first step toward accessibility, let's add keyboard accessibility. This part only uses HTML, and not ARIA. Remember that ARIA does not affect core aspects such as appearance, mouse, or keyboard for users without assistive technologies. Just like a web page can respond to the mouse, it can also respond to the keyboard. Our JavaScript will listen to all keystrokes that occur and decide if the keypress is useful. If not, it throws it back to the page like a fish that's too small to eat. Our rules are something like: If the user presses an arrow key, let's look at our own internal menu bar blueprints and decide what the new active menu item should be. We will clear any current highlights and highlight the new menu item so the sighted user visually knows where they are. The web page should then call event.preventDefault() to prevent the browser from performing the usual action (scrolling the page, in this case). If the user presses the Enter key, we can treat it just like a click, and perform the appropriate action (or even open another menu). If the user presses a key that should do something else, don't eat that! Throw it back to the page as nature intended. For example, our menu bar doesn't need the Tab key, so throw it back! This is hard to get right, and authors often mess it up. For example, the menu bar needs arrow keys, but not Alt+Arrow or Command+Arrow. Those are shortcuts for moving to the previous/next page in the web history of your browser tab. If the author isn't careful, the menu bar will eat those. This kind of bug happens a lot, and we haven't even started with ARIA yet! Screen reader access to our menu bar # Our menu bar was created with duct tape and divs. As a result, a screen reader has no idea what any of it is. The background color for the active item is just a color. 
The menu item divs are just plain objects with no particular meaning. Consequently, a user of our menu bar doesn't get any instructions about what keys to press or what item they're on. But that's no fair! The menu bar acts just fine for the sighted user. ARIA to the rescue. ARIA lets us pretend to the screen reader that focus is in a menu bar. If the author does everything right, our custom menu bar will look to the screen reader just like a menu bar in a desktop application. Our first, ahem, ARIA lie, is to use the aria-activedescendant attribute, and set it to the ID of the currently active menuitem, being careful to update it whenever it changes. For example, aria-activedescendant="settings-menuitem". This little white lie causes the screen reader to consider our ARIA active item as the focus, which is read aloud or shown on a Braille display. By using aria-activedescendant to point from the focused menu bar to a specific menu item, the screen reader now knows where the user has moved, but nothing else about the object. What is this div thing anyway? That's where the role attribute comes in. We use role="menubar" on the containing element for the entire thing, then we use role="menu" on groups of items, and role="menuitem" on … drumroll … the individual menu items. And what if the menuitem can lead to a child menu? The user needs to know that, right? For a sighted user, there might be a little picture of a triangle at the end of the menu, but the screen reader doesn't know how to automatically read images, at least at this point. We can add aria-expanded="false" on each expandable menuitem to indicate that 1) there is something that can be expanded, and 2) it currently is not expanded. As an added touch the author should put role="none" on the img triangle to indicate it's for prettification purposes only.
This prevents the screen reader from saying anything about the image that would be redundant at best and possibly annoying. Dealing with bugs # Keyboard bugs (HTML!) # Although keyboard access is a part of core HTML, authors mess it up all the time, either because they don't use keyboard navigation all that much, or because there is much nuance to get right. Examples of bugs: A checkbox uses spacebar to toggle, but the author forgot to call preventDefault(). Now the spacebar will both toggle the checkbox and page down, which is the default browser behavior for spacebar. An ARIA modal dialog wants to trap tab navigation inside of it, and the author forgets to specifically allow Control+Tab through to the browser. Now, Control+Tab just navigates within their dialog, and doesn't switch tabs in the browser as it should. Ugh. An author creates a selection list, and implements up/down, but does not implement home/end/pageup/pagedown or first letter navigation. Authors should follow known patterns. Check out the Resources section for more information. For pure keyboard access issues, it's useful to also try without a screen reader, or with virtual browser mode off. Screen readers are not generally necessary to discover keyboard bugs, and keyboard access is actually implemented with HTML, not ARIA. After all, ARIA doesn't affect basic stuff like the keyboard or mouse behavior, it only lies to the screen reader about what's in the web page, what's currently focused, etc. Keyboard bugs are almost always a bug in the web content, specifically in their HTML and JavaScript, not in ARIA. ARIA bugs: why are there so many? # There are many, many places where authors can get ARIA wrong, and each will lead to either complete breakage or subtle differences. The subtle ones are probably worse, because the author won't catch most of them before publishing. After all, unless the author is an experienced screen reader user, something is going to go wrong in the ARIA. 
In our menu bar example, the author could think the "option" role was to be used when "menuitem" was correct. They could forget to use aria-expanded, forget to set and clear aria-activedescendant at the right times, or forget to have a menu bar containing the other menus. And what about menu item counts? Usually menu items are presented by screen readers with something like "item 3 of 5" so that the user knows where they are. This is generally counted automatically by the browser, but in some cases, and in some browser - screen reader combinations, the wrong numbers might be computed, and the author would need to override these numbers with aria-posinset and aria-setsize. And this is just menu bars. Think of how many kinds of widgets there are. Glance at the ARIA spec or authoring practices if you like. For each pattern, there are a dozen ways ARIA could be misused. ARIA relies on authors to know what they're doing. What could possibly go wrong, given that most authors are not screen reader users? In other words, it is 100 percent necessary for actual screen reader users to try ARIA widgets before they're considered shippable. There's too much nuance. Ideally everything would be tried with several different browser-screen reader combinations, because of the numerous implementation quirks, in addition to a few incomplete implementations. Summary # In summary, ARIA magic can be used to override or add to anything and everything that the HTML says. It can be used to do little fine changes to the accessibility presentation, or to create an entire experience. This is why ARIA is both incredibly powerful and yet dangerous in the hands of our friendly local web authors who don't generally use screen readers themselves. ARIA is just a dumb truth override markup layer. When a screen reader asks what's happening, if ARIA exists, they get the ARIA version of the truth instead of the real underlying truth. 
Addendum 1: Additional Resources # Hybrid reference with keyboard info and code examples # W3C's ARIA Authoring Practices: this documents the important keyboard navigation characteristics of each example and provides working JS/CSS/ARIA code. The examples are focused on what works today, and do not cover mobile. Addendum 2: What is ARIA most used for? # Because ARIA can replace or supplement small or large truths, it is generally useful for saying stuff that the screen reader cares about. Here are some common uses of ARIA. Special widgets that don't exist in HTML, like a menu bar, autocomplete, tree, or spreadsheet. Widgets that exist in HTML, but the author invented their own anyway, possibly because they needed to tweak the behavior or appearance of the normal widget. For example, an HTML <input type="range"> element is basically a slider, but authors want to make it look different. For most things, CSS can be used, but for input type="range", CSS is awkward. An author can make their own slider, and use role="slider" on it with aria-valuenow to say what the current value is. Live regions tell screen readers "in this area of the page, anything that changes is worth telling the user about." Landmarks (HTML has equivalents now). These are somewhat like headings, in that they help screen reader users find what they want faster. However, they're different in that they contain the entire related area. Like, "this container is the main area of the page" and "this container over here is a navigation panel". Addendum 3: What's an Accessibility API? # An accessibility API is how a screen reader or other AT knows what's in the page and what's happening right now. Examples include MSAA, IA2, and UIA. And that's just Windows! There are two parts to an accessibility API: A "tree" of objects that represents a container hierarchy. These are like Russian nesting dolls, but each doll can contain multiple other dolls.
For example, a document can contain a bunch of paragraphs, and a paragraph can have text, images, links, boldface, etc. Each item in the object tree can have properties like a role (what am I?), a name/label, a user-entered value, a description, as well as boolean states like focusable, focused, required, checked. ARIA can override any of these properties. A series of events that occur describing changes to the tree, like "focus is now over here!". The screen reader uses the events to tell the user what has just happened. When important HTML or ARIA markup changes, an event is fired to tell the screen reader that something changed. Usually authors just use HTML, which maps nicely to these accessibility APIs. When HTML is not enough, ARIA is used and the browser overrides the HTML semantics before sending the object tree or events to the screen reader.
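The keyboard rules from the menu bar example can be sketched as a pure function that decides whether the menu bar should handle a key or throw it back to the page. This is a minimal sketch under stated assumptions: the `handleMenuKey` name, the `{ index, handled, activate }` return shape, and the wrap-around at either end of the menu bar are illustrative choices, not part of ARIA or of any library.

```javascript
// Hypothetical helper. `event` is a KeyboardEvent-like object, `index` is the
// currently active menu item, and `count` is the number of items in the bar.
function handleMenuKey(event, index, count) {
  // Alt+Arrow and Command+Arrow are browser shortcuts (history navigation),
  // so any key pressed with these modifiers is thrown back to the page.
  // Control is treated the same way here as a cautious assumption.
  if (event.altKey || event.metaKey || event.ctrlKey) {
    return { index, handled: false, activate: false };
  }
  switch (event.key) {
    case 'ArrowRight':
      // Move to the next item, wrapping from the last item to the first.
      return { index: (index + 1) % count, handled: true, activate: false };
    case 'ArrowLeft':
      // Move to the previous item, wrapping from the first item to the last.
      return { index: (index - 1 + count) % count, handled: true, activate: false };
    case 'Enter':
      // Treated just like a click on the active item.
      return { index, handled: true, activate: true };
    default:
      // Tab and every other key are not ours to eat.
      return { index, handled: false, activate: false };
  }
}
```

In a keydown listener, the page would call event.preventDefault() only when `handled` is true, then move the visual highlight and set aria-activedescendant on the menu bar element to the ID of the new active item. Keeping the decision separate from the DOM makes the "don't eat that key" cases easy to check without a browser.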

Debugging memory leaks in WebAssembly using Emscripten

Squoosh.app is a PWA that illustrates just how much different image codecs and settings can improve image file size without significantly affecting quality. However, it's also a technical demo showcasing how you can take libraries written in C++ or Rust and bring them to the web. Being able to port code from existing ecosystems is incredibly valuable, but there are some key differences between those static languages and JavaScript. One of those is in their different approaches to memory management. While JavaScript is fairly forgiving in cleaning up after itself, such static languages are definitely not. You need to explicitly ask for newly allocated memory and you really need to make sure you give it back afterwards, and never use it again. If that doesn't happen, you get leaks… and it actually happens fairly regularly. Let's take a look at how you can debug those memory leaks and, even better, how you can design your code to avoid them next time. Suspicious pattern # Recently, while starting to work on Squoosh, I couldn't help but notice an interesting pattern in C++ codec wrappers.
Let's take a look at an ImageQuant wrapper as an example (reduced to show only object creation and deallocation parts): liq_attr* attr; liq_image* image; liq_result* res; uint8_t* result; RawImage quantize(std::string rawimage, int image_width, int image_height, int num_colors, float dithering) { const uint8_t* image_buffer = (uint8_t*)rawimage.c_str(); int size = image_width * image_height; attr = liq_attr_create(); image = liq_image_create_rgba(attr, image_buffer, image_width, image_height, 0); liq_set_max_colors(attr, num_colors); liq_image_quantize(image, attr, &res); liq_set_dithering_level(res, dithering); uint8_t* image8bit = (uint8_t*)malloc(size); result = (uint8_t*)malloc(size * 4); // … free(image8bit); liq_result_destroy(res); liq_image_destroy(image); liq_attr_destroy(attr); return { val(typed_memory_view(image_width * image_height * 4, result)), image_width, image_height }; } void free_result() { free(result); } JavaScript (well, TypeScript): export async function process(data: ImageData, opts: QuantizeOptions) { if (!emscriptenModule) { emscriptenModule = initEmscriptenModule(imagequant, wasmUrl); } const module = await emscriptenModule; const result = module.quantize(/* … */); module.free_result(); return new ImageData( new Uint8ClampedArray(result.view), result.width, result.height ); } Do you spot an issue? Hint: it's use-after-free, but in JavaScript! In Emscripten, typed_memory_view returns a JavaScript Uint8Array backed by the WebAssembly (Wasm) memory buffer, with byteOffset and byteLength set to the given pointer and length. The main point is that this is a TypedArray view into a WebAssembly memory buffer, rather than a JavaScript-owned copy of the data. When we call free_result from JavaScript, it, in turn, calls a standard C function free to mark this memory as available for any future allocations, which means the data that our Uint8Array view points to, can be overwritten with arbitrary data by any future call into Wasm. 
Or, some implementation of free might even decide to zero-fill the freed memory immediately. The free that Emscripten uses doesn't do that, but we are relying on an implementation detail here that cannot be guaranteed. Or, even if the memory behind the pointer gets preserved, new allocation might need to grow the WebAssembly memory. When WebAssembly.Memory is grown either via JavaScript API, or corresponding memory.grow instruction, it invalidates the existing ArrayBuffer and, transitively, any views backed by it. Let me use the DevTools (or Node.js) console to demonstrate this behavior: > memory = new WebAssembly.Memory({ initial: 1 }) Memory {} > view = new Uint8Array(memory.buffer, 42, 10) Uint8Array(10) [0, 0, 0, 0, 0, 0, 0, 0, 0, 0] // ^ all good, we got a 10 bytes long view at address 42 > view.buffer ArrayBuffer(65536) {} // ^ its buffer is the same as the one used for WebAssembly memory // (the size of the buffer is 1 WebAssembly "page" == 64KB) > memory.grow(1) 1 // ^ let's say we grow Wasm memory by +1 page to fit some new data > view Uint8Array [] // ^ our original view is no longer valid and looks empty! > view.buffer ArrayBuffer(0) {} // ^ its buffer got invalidated as well and turned into an empty one Finally, even if we don't explicitly call into Wasm again between free_result and new Uint8ClampedArray, at some point we might add multithreading support to our codecs. In that case it could be a completely different thread that overwrites the data just before we manage to clone it. Looking for memory bugs # Just in case, I've decided to go further and check if this code exhibits any issues in practice. This seems like a perfect opportunity to try out the new(ish) Emscripten sanitizers support that was added last year and presented in our WebAssembly talk at the Chrome Dev Summit: Sanitizers are special tools that instrument the code with auto-generated checks during compilation, which can then help catch common bugs during runtime. 
Since they introduce runtime overhead, they're primarily used during development, although in some critical applications they're sometimes enabled on a [subset of] production environments as well. In this case, we're interested in the AddressSanitizer, which can detect various pointer- and memory-related issues. To use it, we need to recompile our codec with -fsanitize=address: emcc \ --bind \ ${OPTIMIZE} \ --closure 1 \ -s ALLOW_MEMORY_GROWTH=1 \ -s MODULARIZE=1 \ -s 'EXPORT_NAME="imagequant"' \ -I node_modules/libimagequant \ -o ./imagequant.js \ --std=c++11 \ imagequant.cpp \ -fsanitize=address \ node_modules/libimagequant/libimagequant.a This will automatically enable pointer safety checks, but we also want to find potential memory leaks. Since we're using ImageQuant as a library rather than a program, there is no "exit point" at which Emscripten could automatically validate that all memory has been freed. Instead, for such cases the LeakSanitizer (included in the AddressSanitizer) provides the functions __lsan_do_leak_check and __lsan_do_recoverable_leak_check, which can be manually invoked whenever we expect all memory to be freed and want to validate that assumption. __lsan_do_leak_check is meant to be used at the end of a running application, when you want to abort the process in case any leaks are detected, while __lsan_do_recoverable_leak_check is more suitable for library use-cases like ours, when you want to print leaks to the console, but keep the application running regardless. Let's expose that second helper via Embind so that we can call it from JavaScript at any time: #include <sanitizer/lsan_interface.h> // … void free_result() { free(result); } EMSCRIPTEN_BINDINGS(my_module) { function("zx_quantize", &zx_quantize); function("version", &version); function("free_result", &free_result); function("doLeakCheck", &__lsan_do_recoverable_leak_check); } And invoke it from the JavaScript side once we're done with the image. 
Doing this from the JavaScript side, rather than the C++ one, helps to ensure that all the scopes have been exited and all the temporary C++ objects were freed by the time we run those checks: // … const result = opts.zx ? module.zx_quantize(data.data, data.width, data.height, opts.dither) : module.quantize(data.data, data.width, data.height, opts.maxNumColors, opts.dither); module.free_result(); module.doLeakCheck(); return new ImageData( new Uint8ClampedArray(result.view), result.width, result.height ); } This gives us a report like the following in the console: Uh-oh, there are some small leaks, but the stacktrace is not very helpful as all the function names are mangled. Let's recompile with a basic debugging info to preserve them: emcc \ --bind \ ${OPTIMIZE} \ --closure 1 \ -s ALLOW_MEMORY_GROWTH=1 \ -s MODULARIZE=1 \ -s 'EXPORT_NAME="imagequant"' \ -I node_modules/libimagequant \ -o ./imagequant.js \ --std=c++11 \ imagequant.cpp \ -fsanitize=address \ -g2 \ node_modules/libimagequant/libimagequant.a This looks much better: Some parts of the stacktrace still look obscure as they point to Emscripten internals, but we can tell that the leak is coming from a RawImage conversion to "wire type" (to a JavaScript value) by Embind. Indeed, when we look at the code, we can see that we return RawImage C++ instances to JavaScript, but we never free them on either side. As a reminder, currently there is no garbage collection integration between JavaScript and WebAssembly, although one is being developed. Instead, you have to manually free any memory and call destructors from the JavaScript side once you're done with the object. For Embind specifically, the official docs suggest to call a .delete() method on exposed C++ classes: JavaScript code must explicitly delete any C++ object handles it has received, or the Emscripten heap will grow indefinitely. 
var x = new Module.MyClass; x.method(); x.delete(); Indeed, when we do that in JavaScript for our class: // … const result = opts.zx ? module.zx_quantize(data.data, data.width, data.height, opts.dither) : module.quantize(data.data, data.width, data.height, opts.maxNumColors, opts.dither); module.free_result(); result.delete(); module.doLeakCheck(); return new ImageData( new Uint8ClampedArray(result.view), result.width, result.height ); } The leak goes away as expected. Discovering more issues with sanitizers # Building other Squoosh codecs with sanitizers reveals both similar as well as some new issues. For example, I've got this error in MozJPEG bindings: Here, it's not a leak, but us writing to a memory outside of the allocated boundaries 😱 Digging into the code of MozJPEG, we find that the problem here is that jpeg_mem_dest—the function that we use to allocate a memory destination for JPEG—reuses existing values of outbuffer and outsize when they're non-zero: if (*outbuffer == NULL || *outsize == 0) { /* Allocate initial buffer */ dest->newbuffer = *outbuffer = (unsigned char *) malloc(OUTPUT_BUF_SIZE); if (dest->newbuffer == NULL) ERREXIT1(cinfo, JERR_OUT_OF_MEMORY, 10); *outsize = OUTPUT_BUF_SIZE; } However, we invoke it without initialising either of those variables, which means MozJPEG writes the result into a potentially random memory address that happened to be stored in those variables at the time of the call! uint8_t* output; unsigned long size; // … jpeg_mem_dest(&cinfo, &output, &size); Zero-initialising both variables before the invocation solves this issue, and now the code reaches a memory leak check instead. Luckily, the check passes successfully, indicating that we don't have any leaks in this codec. Issues with shared state # …Or do we? We know that our codec bindings store some of the state as well as results in global static variables, and MozJPEG has some particularly complicated structures. 
uint8_t* last_result; struct jpeg_compress_struct cinfo; val encode(std::string image_in, int image_width, int image_height, MozJpegOptions opts) { // … } What if some of those get lazily initialized on the first run, and then improperly reused on future runs? Then a single call with a sanitizer would not report them as problematic. Let's try and process the image a couple of times by randomly clicking at different quality levels in the UI. Indeed, now we get the following report: 262,144 bytes—looks like the whole sample image is leaked from jpeg_finish_compress! After checking out the docs and the official examples, it turns out that jpeg_finish_compress doesn't free the memory allocated by our earlier jpeg_mem_dest call—it only frees the compression structure, even though that compression structure already knows about our memory destination… Sigh. We can fix this by freeing the data manually in the free_result function: void free_result() { /* This is an important step since it will release a good deal of memory. */ free(last_result); jpeg_destroy_compress(&cinfo); } I could keep hunting those memory bugs one by one, but I think by now it's clear enough that the current approach to memory management leads to some nasty systematic issues. Some of them can be caught by the sanitizer right away. Others require intricate tricks to be caught. Finally, there are issues like in the beginning of the post that, as we can see from the logs, aren't caught by the sanitizer at all. The reason is that the actual mis-use happens on the JavaScript side, into which the sanitizer has no visibility. Those issues will reveal themselves only in production or after seemingly unrelated changes to the code in the future. Building a safe wrapper # Let's take a couple of steps back, and instead fix all of these problems by restructuring the code in a safer way. 
I'll use the ImageQuant wrapper as an example again, but similar refactoring rules apply to all the codecs, as well as other similar codebases. First of all, let's fix the use-after-free issue from the beginning of the post. For that, we need to clone the data from the WebAssembly-backed view before marking it as free on the JavaScript side: // … const result = /* … */; const imgData = new ImageData( new Uint8ClampedArray(result.view), result.width, result.height ); module.free_result(); result.delete(); module.doLeakCheck(); return imgData; } Now, let's make sure that we don't share any state in global variables between invocations. This will both fix some of the issues we've already seen and make it easier to use our codecs in a multithreaded environment in the future. To do that, we refactor the C++ wrapper to make sure that each call to the function manages its own data using local variables.
Then, we can change the signature of our free_result function to accept the pointer back: RawImage quantize(std::string rawimage, int image_width, int image_height, int num_colors, float dithering) { const uint8_t* image_buffer = (uint8_t*)rawimage.c_str(); int size = image_width * image_height; liq_attr* attr = liq_attr_create(); liq_image* image = liq_image_create_rgba(attr, image_buffer, image_width, image_height, 0); liq_set_max_colors(attr, num_colors); liq_result* res = nullptr; liq_image_quantize(image, attr, &res); liq_set_dithering_level(res, dithering); uint8_t* image8bit = (uint8_t*)malloc(size); uint8_t* result = (uint8_t*)malloc(size * 4); // … } void free_result(uint8_t *result) { free(result); } But, since we're already using Embind in Emscripten to interact with JavaScript, we might as well make the API even safer by hiding C++ memory management details altogether! For that, let's move the new Uint8ClampedArray(…) part from JavaScript to the C++ side with Embind. Then, we can use it to clone the data into the JavaScript memory even before returning from the function. The RawImage class is no longer needed, and quantize now returns a val: thread_local const val Uint8ClampedArray = val::global("Uint8ClampedArray"); val quantize(/* … */) { // … val js_result = Uint8ClampedArray.new_(typed_memory_view( image_width * image_height * 4, result )); free(result); return js_result; } In this case we can return Uint8ClampedArray, because JavaScript already knows the width and height of the resulting image.
If this wasn't the case, then we could return an ImageData instance instead, which is functionally equivalent to our previous RawImage wrapper, but is a standard JavaScript-owned object: // … val js_result = Uint8ClampedArray.new_(typed_memory_view( image_width * image_height * 4, result )); free(result); return ImageData.new_(js_result, image_width, image_height); } Note how, with a single change, we both ensure that the resulting byte array is owned by JavaScript and not backed by WebAssembly memory, and get rid of the previously leaked RawImage wrapper too. Now JavaScript doesn't have to worry about freeing data at all anymore, and can use the result like any other garbage-collected object. Because result is now a plain Uint8ClampedArray, the dimensions come from the input image: // … const result = /* … */; // module.doLeakCheck(); return new ImageData(result, data.width, data.height); } This also means we no longer need a custom free_result binding or the RawImage class binding on the C++ side: EMSCRIPTEN_BINDINGS(my_module) { function("quantize", &quantize); function("zx_quantize", &zx_quantize); function("version", &version); } All in all, our wrapper code became both cleaner and safer at the same time. After this I went through some further minor improvements to the code of the ImageQuant wrapper and replicated similar memory management fixes for other codecs. If you're interested in more details, you can see the resulting PR here: Memory fixes for C++ codecs. Takeaways # What lessons can we learn and share from this refactoring that could be applied to other codebases?
Don't use memory views backed by WebAssembly—no matter which language it's built from—beyond a single invocation. You can't rely on them surviving any longer than that, and you won't be able to catch these bugs by conventional means, so if you need to store the data for later, copy it to the JavaScript side and store it there. If possible, use a safe memory management language or, at least, safe type wrappers, instead of operating on raw pointers directly. This won't save you from bugs on the JavaScript ↔ WebAssembly boundary, but at least it will reduce the surface for bugs self-contained by the static language code. No matter which language you use, run code with sanitizers during development—they can help to catch not only problems in the static language code, but also some issues across the JavaScript ↔ WebAssembly boundary, such as forgetting to call .delete() or passing in invalid pointers from the JavaScript side. If possible, avoid exposing unmanaged data and objects from WebAssembly to JavaScript altogether. JavaScript is a garbage-collected language, and manual memory management is not common in it. This can be considered an abstraction leak of the memory model of the language your WebAssembly was built from, and incorrect management is easy to overlook in a JavaScript codebase. This might be obvious, but, like in any other codebase, avoid storing mutable state in global variables. You don't want to debug issues with its reuse across various invocations or even threads, so it's best to keep it as self-contained as possible.
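The first takeaway can be sketched without compiling any codec. The snippet below is only an illustration under stated assumptions: a bare WebAssembly.Memory stands in for a codec's heap, and the offset 42 and the four sample bytes are made up. It shows that a copy taken with slice() survives a memory growth that empties the original view.

```javascript
// Hypothetical stand-in for a codec's linear memory: one 64 KiB page.
const memory = new WebAssembly.Memory({ initial: 1 });

// Pretend the codec wrote a 4-byte result at offset 42 (made-up values).
const view = new Uint8Array(memory.buffer, 42, 4);
view.set([1, 2, 3, 4]);

// Copy the bytes into JavaScript-owned memory before anything else gets
// a chance to free, overwrite, or grow the Wasm memory.
// slice() allocates a fresh ArrayBuffer that is independent of memory.buffer.
const copy = view.slice();

// Growing the memory detaches the old ArrayBuffer and every view on it.
memory.grow(1);

console.log(view.length);      // 0: the Wasm-backed view is now empty
console.log(Array.from(copy)); // [ 1, 2, 3, 4 ]: the copy is unaffected
```

Wrapping the view in new Uint8ClampedArray(view), as the wrapper code in this post does, copies the data in the same way; what matters is that the copy happens before any further call that could free or grow the memory.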

Read from and write to a serial port

Success: The Web Serial API, part of the capabilities project, launched in Chrome 89. What is the Web Serial API? # A serial port is a bidirectional communication interface that allows sending and receiving data byte by byte. The Web Serial API provides a way for websites to read from and write to a serial device with JavaScript. Serial devices are connected either through a serial port on the user's system or through removable USB and Bluetooth devices that emulate a serial port. In other words, the Web Serial API bridges the web and the physical world by allowing websites to communicate with serial devices, such as microcontrollers and 3D printers. This API is also a great companion to WebUSB as operating systems require applications to communicate with some serial ports using their higher-level serial API rather than the low-level USB API. Suggested use cases # In the educational, hobbyist, and industrial sectors, users connect peripheral devices to their computers. These devices are often controlled by microcontrollers via a serial connection used by custom software. Some custom software to control these devices is built with web technology: Arduino Create Betaflight Configurator Espruino Web IDE Microsoft MakeCode In some cases, websites communicate with the device through an agent application that users installed manually. In others, the application is delivered in a packaged application through a framework such as Electron. And in others, the user is required to perform an additional step such as copying a compiled application to the device via a USB flash drive. In all these cases, the user experience will be improved by providing direct communication between the website and the device that it is controlling. Current status # Step Status 1. Create explainer Complete 2. Create initial draft of specification Complete 3. Gather feedback & iterate on design Complete 4. Origin trial Complete 5. 
Launch Complete Using the Web Serial API # Feature detection # To check if the Web Serial API is supported, use: if ("serial" in navigator) { // The Web Serial API is supported. } Open a serial port # The Web Serial API is asynchronous by design. This prevents the website UI from blocking when awaiting input, which is important because serial data can be received at any time, requiring a way to listen to it. To open a serial port, first access a SerialPort object. For this, you can either prompt the user to select a single serial port by calling navigator.serial.requestPort(), or pick one from navigator.serial.getPorts() which returns a list of serial ports the website has been granted access to previously. // Prompt user to select any serial port. const port = await navigator.serial.requestPort(); // Get all serial ports the user has previously granted the website access to. const ports = await navigator.serial.getPorts(); The navigator.serial.requestPort() function takes an optional object literal that defines filters. Those are used to match any serial device connected over USB with a mandatory USB vendor (usbVendorId) and optional USB product identifiers (usbProductId). // Filter on devices with the Arduino Uno USB Vendor/Product IDs. const filters = [ { usbVendorId: 0x2341, usbProductId: 0x0043 }, { usbVendorId: 0x2341, usbProductId: 0x0001 } ]; // Prompt user to select an Arduino Uno device. const port = await navigator.serial.requestPort({ filters }); const { usbProductId, usbVendorId } = port.getInfo(); User prompt for selecting a BBC micro:bit Calling requestPort() prompts the user to select a device and returns a SerialPort object. Once you have a SerialPort object, calling port.open() with the desired baud rate will open the serial port. The baudRate dictionary member specifies how fast data is sent over a serial line. It is expressed in units of bits-per-second (bps). 
Check your device's documentation for the correct value as all the data you send and receive will be gibberish if this is specified incorrectly. For some USB and Bluetooth devices that emulate a serial port this value may be safely set to any value as it is ignored by the emulation. // Prompt user to select any serial port. const port = await navigator.serial.requestPort(); // Wait for the serial port to open. await port.open({ baudRate: 9600 }); You can also specify any of the options below when opening a serial port. These options are optional and have convenient default values. dataBits: The number of data bits per frame (either 7 or 8). stopBits: The number of stop bits at the end of a frame (either 1 or 2). parity: The parity mode (either "none", "even" or "odd"). bufferSize: The size of the read and write buffers that should be created (must be less than 16MB). flowControl: The flow control mode (either "none" or "hardware"). Read from a serial port # Input and output streams in the Web Serial API are handled by the Streams API. If streams are new to you, check out Streams API concepts. This article barely scratches the surface of streams and stream handling. After the serial port connection is established, the readable and writable properties from the SerialPort object return a ReadableStream and a WritableStream. Those will be used to receive data from and send data to the serial device. Both use Uint8Array instances for data transfer. When new data arrives from the serial device, port.readable.getReader().read() returns two properties asynchronously: the value and a done boolean. If done is true, the serial port has been closed or there is no more data coming in. Calling port.readable.getReader() creates a reader and locks readable to it. While readable is locked, the serial port can't be closed. const reader = port.readable.getReader(); // Listen to data coming from the serial device. 
while (true) { const { value, done } = await reader.read(); if (done) { // Allow the serial port to be closed later. reader.releaseLock(); break; } // value is a Uint8Array. console.log(value); } Some non-fatal serial port read errors can happen under some conditions such as buffer overflow, framing errors, or parity errors. Those are thrown as exceptions and can be caught by adding another loop on top of the previous one that checks port.readable. This works because as long as the errors are non-fatal, a new ReadableStream is created automatically. If a fatal error occurs, such as the serial device being removed, then port.readable becomes null. while (port.readable) { const reader = port.readable.getReader(); try { while (true) { const { value, done } = await reader.read(); if (done) { // Allow the serial port to be closed later. reader.releaseLock(); break; } if (value) { console.log(value); } } } catch (error) { // TODO: Handle non-fatal read error. } } If the serial device sends text back, you can pipe port.readable through a TextDecoderStream as shown below. A TextDecoderStream is a transform stream that grabs all Uint8Array chunks and converts them to strings. const textDecoder = new TextDecoderStream(); const readableStreamClosed = port.readable.pipeTo(textDecoder.writable); const reader = textDecoder.readable.getReader(); // Listen to data coming from the serial device. while (true) { const { value, done } = await reader.read(); if (done) { // Allow the serial port to be closed later. reader.releaseLock(); break; } // value is a string. console.log(value); } Write to a serial port # To send data to a serial device, pass data to port.writable.getWriter().write(). Calling releaseLock() on port.writable.getWriter() is required for the serial port to be closed later. const writer = port.writable.getWriter(); const data = new Uint8Array([104, 101, 108, 108, 111]); // hello await writer.write(data); // Allow the serial port to be closed later. 
writer.releaseLock(); Send text to the device through a TextEncoderStream piped to port.writable as shown below. const textEncoder = new TextEncoderStream(); const writableStreamClosed = textEncoder.readable.pipeTo(port.writable); const writer = textEncoder.writable.getWriter(); await writer.write("hello"); Close a serial port # port.close() closes the serial port if its readable and writable members are unlocked, meaning releaseLock() has been called for their respective reader and writer. await port.close(); However, when continuously reading data from a serial device using a loop, port.readable will always be locked until it encounters an error. In this case, calling reader.cancel() will force reader.read() to resolve immediately with { value: undefined, done: true } and therefore allowing the loop to call reader.releaseLock(). // Without transform streams. let keepReading = true; let reader; async function readUntilClosed() { while (port.readable && keepReading) { reader = port.readable.getReader(); try { while (true) { const { value, done } = await reader.read(); if (done) { // reader.cancel() has been called. break; } // value is a Uint8Array. console.log(value); } } catch (error) { // Handle error... } finally { // Allow the serial port to be closed later. reader.releaseLock(); } } await port.close(); } const closedPromise = readUntilClosed(); document.querySelector('button').addEventListener('click', async () => { // User clicked a button to close the serial port. keepReading = false; // Force reader.read() to resolve immediately and subsequently // call reader.releaseLock() in the loop example above. reader.cancel(); await closedPromise; }); Closing a serial port is more complicated when using transform streams (like TextDecoderStream and TextEncoderStream). Call reader.cancel() as before. Then call writer.close() and port.close(). This propagates errors through the transform streams to the underlying serial port. 
Because error propagation doesn't happen immediately, you need to use the readableStreamClosed and writableStreamClosed promises created earlier to detect when port.readable and port.writable have been unlocked. Cancelling the reader causes the stream to be aborted; this is why you must catch and ignore the resulting error. // With transform streams. const textDecoder = new TextDecoderStream(); const readableStreamClosed = port.readable.pipeTo(textDecoder.writable); const reader = textDecoder.readable.getReader(); // Listen to data coming from the serial device. while (true) { const { value, done } = await reader.read(); if (done) { reader.releaseLock(); break; } // value is a string. console.log(value); } const textEncoder = new TextEncoderStream(); const writableStreamClosed = textEncoder.readable.pipeTo(port.writable); const writer = textEncoder.writable.getWriter(); reader.cancel(); await readableStreamClosed.catch(() => { /* Ignore the error */ }); writer.close(); await writableStreamClosed; await port.close(); Listen to connection and disconnection # If a serial port is provided by a USB device then that device may be connected or disconnected from the system. When the website has been granted permission to access a serial port, it should monitor the connect and disconnect events. navigator.serial.addEventListener("connect", (event) => { // TODO: Automatically open event.target or warn user a port is available. }); navigator.serial.addEventListener("disconnect", (event) => { // TODO: Remove |event.target| from the UI. // If the serial port was opened, a stream error would be observed as well. }); Prior to Chrome 89 the connect and disconnect events fired a custom SerialConnectionEvent object with the affected SerialPort interface available as the port attribute. You may want to use event.port || event.target to handle the transition. Handle signals # After establishing the serial port connection, you can explicitly query and set signals exposed by the serial port for device detection and flow control.
These signals are defined as boolean values. For example, some devices such as Arduino will enter a programming mode if the Data Terminal Ready (DTR) signal is toggled. Setting output signals and getting input signals are respectively done by calling port.setSignals() and port.getSignals(). See usage examples below. // Turn off Serial Break signal. await port.setSignals({ break: false }); // Turn on Data Terminal Ready (DTR) signal. await port.setSignals({ dataTerminalReady: true }); // Turn off Request To Send (RTS) signal. await port.setSignals({ requestToSend: false }); const signals = await port.getSignals(); console.log(`Clear To Send: ${signals.clearToSend}`); console.log(`Data Carrier Detect: ${signals.dataCarrierDetect}`); console.log(`Data Set Ready: ${signals.dataSetReady}`); console.log(`Ring Indicator: ${signals.ringIndicator}`); Transforming streams # When you receive data from the serial device, you won't necessarily get all of the data at once. It may be arbitrarily chunked. For more information, see Streams API concepts. To deal with this, you can use some built-in transform streams such as TextDecoderStream or create your own transform stream which allows you to parse the incoming stream and return parsed data. The transform stream sits between the serial device and the read loop that is consuming the stream. It can apply an arbitrary transform before the data is consumed. Think of it like an assembly line: as a widget comes down the line, each step in the line modifies the widget, so that by the time it gets to its final destination, it's a fully functioning widget. World War II Castle Bromwich Aeroplane Factory For example, consider how to create a transform stream class that consumes a stream and chunks it based on line breaks. Its transform() method is called every time new data is received by the stream. It can either enqueue the data or save it for later. 
The flush() method is called when the stream is closed, and it handles any data that hasn't been processed yet. To use the transform stream class, you need to pipe an incoming stream through it. In the third code example under Read from a serial port, the original input stream was only piped through a TextDecoderStream, so we need to call pipeThrough() to pipe it through our new LineBreakTransformer. class LineBreakTransformer { constructor() { // A container for holding stream data until a new line. this.chunks = ""; } transform(chunk, controller) { // Append new chunks to existing chunks. this.chunks += chunk; // For each line break in chunks, send the parsed lines out. const lines = this.chunks.split("\r\n"); this.chunks = lines.pop(); lines.forEach((line) => controller.enqueue(line)); } flush(controller) { // When the stream is closed, flush any remaining chunks out. controller.enqueue(this.chunks); } } const textDecoder = new TextDecoderStream(); const readableStreamClosed = port.readable.pipeTo(textDecoder.writable); const reader = textDecoder.readable .pipeThrough(new TransformStream(new LineBreakTransformer())) .getReader(); For debugging serial device communication issues, use the tee() method of port.readable to split the streams going to or from the serial device. The two streams created can be consumed independently and this allows you to print one to the console for inspection. const [appReadable, devReadable] = port.readable.tee(); // You may want to update UI with incoming data from appReadable // and log incoming data in JS console for inspection from devReadable. Dev Tips # Debugging the Web Serial API in Chrome is easy with the internal page chrome://device-log, where you can see all serial device related events in one single place. Internal page in Chrome for debugging the Web Serial API. Codelab # In the Google Developer codelab, you'll use the Web Serial API to interact with a BBC micro:bit board to show images on its 5x5 LED matrix.
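The LineBreakTransformer from the Transforming streams section can be tried without any hardware attached. The sketch below assumes a runtime with the Streams API available globally (a recent Chrome or Node.js 18+). The splitIntoLines helper and the sample sensor strings are hypothetical and stand in for arbitrarily chunked text arriving from a serial device.

```javascript
// Same transformer as in the Transforming streams section.
class LineBreakTransformer {
  constructor() {
    // Holds stream data until a line break is seen.
    this.chunks = "";
  }
  transform(chunk, controller) {
    this.chunks += chunk;
    const lines = this.chunks.split("\r\n");
    // The last element is an incomplete line (or ""); keep it for later.
    this.chunks = lines.pop();
    lines.forEach((line) => controller.enqueue(line));
  }
  flush(controller) {
    // Emit whatever is left when the stream closes.
    controller.enqueue(this.chunks);
  }
}

// Hypothetical helper: pushes string chunks through the transformer
// and collects everything that comes out the other end.
async function splitIntoLines(chunks) {
  const source = new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk);
      controller.close();
    },
  });
  const reader = source
    .pipeThrough(new TransformStream(new LineBreakTransformer()))
    .getReader();
  const lines = [];
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    lines.push(value);
  }
  reader.releaseLock();
  return lines;
}

// Made-up data, deliberately split in the middle of lines.
splitIntoLines(["temp: 2", "1\r\nhum", "idity: 40\r\n"]).then((lines) => {
  console.log(lines); // [ 'temp: 21', 'humidity: 40', '' ]
});
```

The trailing empty string comes from flush(), which always enqueues the remaining buffer. A consumer that does not want it could skip empty values in the read loop.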
Browser support # The Web Serial API is available on all desktop platforms (Chrome OS, Linux, macOS, and Windows) in Chrome 89. Polyfill # On Android, support for USB-based serial ports is possible using the WebUSB API and the Serial API polyfill. This polyfill is limited to hardware and platforms where the device is accessible via the WebUSB API because it has not been claimed by a built-in device driver. Security and privacy # The spec authors have designed and implemented the Web Serial API using the core principles defined in Controlling Access to Powerful Web Platform Features, including user control, transparency, and ergonomics. The ability to use this API is primarily gated by a permission model that grants access to only a single serial device at a time. In response to a user prompt, the user must take active steps to select a particular serial device. To understand the security tradeoffs, check out the security and privacy sections of the Web Serial API Explainer. Feedback # The Chrome team would love to hear about your thoughts and experiences with the Web Serial API. Tell us about the API design # Is there something about the API that doesn't work as expected? Or are there missing methods or properties that you need to implement your idea? File a spec issue on the Web Serial API GitHub repo or add your thoughts to an existing issue. Report a problem with the implementation # Did you find a bug with Chrome's implementation? Or is the implementation different from the spec? File a bug at https://new.crbug.com. Be sure to include as much detail as you can, provide simple instructions for reproducing the bug, and have Components set to Blink>Serial. Glitch works great for sharing quick and easy repros. Show support # Are you planning to use the Web Serial API? Your public support helps the Chrome team prioritize features and shows other browser vendors how critical it is to support them. 
Send a tweet to @ChromiumDev using the hashtag #SerialAPI and let us know where and how you're using it. Helpful links # Specification Tracking bug ChromeStatus.com entry Blink Component: Blink>Serial Demos # Serial Terminal Espruino Web IDE Acknowledgements # Thanks to Reilly Grant and Joe Medley for their reviews of this article. Aeroplane factory photo by Birmingham Museums Trust on Unsplash.

`content-visibility`: the new CSS property that boosts your rendering performance

The content-visibility property, launching in Chromium 85, might be one of the most impactful new CSS properties for improving page load performance. content-visibility enables the user agent to skip an element's rendering work, including layout and painting, until it is needed. Because rendering is skipped, if a large portion of your content is off-screen, leveraging the content-visibility property makes the initial user load much faster. It also allows for faster interactions with the on-screen content. Pretty neat. In our article demo, applying content-visibility: auto to chunked content areas gives a 7x rendering performance boost on initial load. Read on to learn more. Browser support # content-visibility relies on primitives within the CSS Containment Spec. While content-visibility is only supported in Chromium 85 for now (and deemed "worth prototyping" for Firefox), the Containment Spec is supported in most modern browsers. CSS Containment # The key and overarching goal of CSS containment is to enable rendering performance improvements of web content by providing predictable isolation of a DOM subtree from the rest of the page. Basically, a developer can tell a browser what parts of the page are encapsulated as a set of content, allowing the browser to reason about the content without needing to consider state outside of the subtree. Knowing which bits of content (subtrees) contain isolated content means the browser can make optimization decisions for page rendering. There are four types of CSS containment, each a potential value for the contain CSS property, which can be combined together in a space-separated list of values: size: Size containment on an element ensures that the element's box can be laid out without needing to examine its descendants. This means we can potentially skip layout of the descendants if all we need is the size of the element.
layout: Layout containment means that the descendants do not affect the external layout of other boxes on the page. This allows us to potentially skip layout of the descendants if all we want to do is lay out other boxes. style: Style containment ensures that properties which can have effects on more than just its descendants don't escape the element (e.g. counters). This allows us to potentially skip style computation for the descendants if all we want is to compute styles on other elements. paint: Paint containment ensures that the descendants of the containing box don't display outside its bounds. Nothing can visibly overflow the element, and if an element is off-screen or otherwise not visible, its descendants will also not be visible. This allows us to potentially skip painting the descendants if the element is offscreen. Skipping rendering work with content-visibility # It may be hard to figure out which containment values to use, since browser optimizations may only kick in when an appropriate set is specified. You can play around with the values to see what works best, or you can use another CSS property called content-visibility to apply the needed containment automatically. content-visibility ensures that you get the largest performance gains the browser can provide with minimal effort from you as a developer. The content-visibility property accepts several values, but auto is the one that provides immediate performance improvements. An element that has content-visibility: auto gains layout, style and paint containment. If the element is off-screen (and not otherwise relevant to the user—relevant elements would be the ones that have focus or selection in their subtree), it also gains size containment (and it stops painting and hit-testing its contents). What does this mean? In short, if the element is off-screen its descendants are not rendered. The browser determines the size of the element without considering any of its contents, and it stops there. 
Most of the rendering work, such as styling and layout of the element's subtree, is skipped. As the element approaches the viewport, the browser no longer adds the size containment and starts painting and hit-testing the element's content. This enables the rendering work to be done just in time to be seen by the user. A note on accessibility # One of the features of content-visibility: auto is that the off-screen content remains available in the document object model and, therefore, the accessibility tree (unlike with visibility: hidden). This means that content can be searched for on the page, and navigated to, without waiting for it to load or sacrificing rendering performance. The flip-side of this, however, is that landmark elements with style features such as display: none or visibility: hidden will also appear in the accessibility tree when off-screen, since the browser will not render these styles until they enter the viewport. To prevent these from being visible in the accessibility tree, potentially causing clutter, be sure to also add aria-hidden="true". Caution: In Chromium 85-89, off-screen children within content-visibility: auto were marked as invisible. In particular, headings and landmark roles were not exposed to accessibility tools. In Chromium 90 this was updated so that they are exposed. Example: a travel blog # In this example, we baseline our travel blog on the right, and apply content-visibility: auto to chunked areas on the left. The results show rendering times going from 232ms to 30ms on initial page load. A travel blog typically contains a set of stories with a few pictures, and some descriptive text. Here is what happens in a typical browser when it navigates to a travel blog: A part of the page is downloaded from the network, along with any needed resources. The browser styles and lays out all of the contents of the page, without considering if the content is visible to the user.
The browser goes back to step 1 until all of the page and resources are downloaded. In step 2, the browser processes all of the contents looking for things that may have changed. It updates the style and layout of any new elements, along with the elements that may have shifted as a result of new updates. This is rendering work. This takes time. An example of a travel blog. See Demo on Codepen Now consider what happens if you put content-visibility: auto on each of the individual stories in the blog. The general loop is the same: the browser downloads and renders chunks of the page. However, the difference is in the amount of work that it does in step 2. With content-visibility, it will style and lay out all of the contents that are currently visible to the user (they are on-screen). However, when processing a story that is fully off-screen, the browser will skip the rendering work and only style and lay out the element box itself. The performance of loading this page would be as if it contained full on-screen stories and empty boxes for each of the off-screen stories. This performs much better, with an expected reduction of 50% or more in the rendering cost of loading. In our example, we see a boost from a 232ms rendering time to a 30ms rendering time. That's a 7x performance boost. What is the work that you need to do in order to reap these benefits? First, we chunk the content into sections: Example of chunking content into sections with the story class applied, to receive content-visibility: auto. See Demo on Codepen Then, we apply the following style rule to the sections: .story { content-visibility: auto; contain-intrinsic-size: 1000px; /* Explained in the next section. */ } Note that as content moves in and out of visibility, it will start and stop being rendered as needed. However, this does not mean that the browser will have to render and re-render the same content over and over again, since the rendering work is saved when possible.
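The .story rule above can also be built and injected from script, which may be handy when the placeholder size is only known at runtime. The following is a minimal sketch, assuming a browser that exposes CSS.supports(); buildStoryRule() and applyStoryRule() are hypothetical helper names, not part of any API.

```javascript
// Hypothetical helper: builds the same rule as the stylesheet example,
// with the placeholder size supplied by the caller (in CSS pixels).
function buildStoryRule(selector, estimatedSizePx) {
  return `${selector} { content-visibility: auto; contain-intrinsic-size: ${estimatedSizePx}px; }`;
}

// Hypothetical helper: injects the rule only when the browser reports
// support for content-visibility. Returns true if the rule was added.
function applyStoryRule(selector, estimatedSizePx) {
  const supported =
    typeof CSS !== 'undefined' &&
    typeof CSS.supports === 'function' &&
    CSS.supports('content-visibility', 'auto');
  if (!supported || typeof document === 'undefined') {
    return false;
  }
  const style = document.createElement('style');
  style.textContent = buildStoryRule(selector, estimatedSizePx);
  document.head.appendChild(style);
  return true;
}
```

For example, applyStoryRule('.story', 1000) would reproduce the rule shown above in supporting browsers and leave the page untouched elsewhere.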
Specifying the natural size of an element with contain-intrinsic-size # In order to realize the potential benefits of content-visibility, the browser needs to apply size containment to ensure that the rendering results of contents do not affect the size of the element in any way. This means that the element will lay out as if it was empty. If the element does not have a height specified in a regular block layout, then it will be of 0 height. This might not be ideal, since the size of the scrollbar will shift, being reliant on each story having a non-zero height. Thankfully, CSS provides another property, contain-intrinsic-size, which effectively specifies the natural size of the element if the element is affected by size containment. In our example, we are setting it to 1000px as an estimate for the height and width of the sections. This means it will lay out as if it had a single child of "intrinsic-size" dimensions, ensuring that your unsized divs still occupy space. contain-intrinsic-size acts as a placeholder size in lieu of rendered content. Hiding content with content-visibility: hidden # What if you want to keep the content unrendered regardless of whether or not it is on-screen, while leveraging the benefits of cached rendering state? Enter: content-visibility: hidden. The content-visibility: hidden property gives you all of the same benefits of unrendered content and cached rendering state as content-visibility: auto does off-screen. However, unlike with auto, it does not automatically start to render on-screen. This gives you more control, allowing you to hide an element's contents and later unhide them quickly. Compare it to other common ways of hiding element's contents: display: none: hides the element and destroys its rendering state. This means unhiding the element is as expensive as rendering a new element with the same contents. visibility: hidden: hides the element and keeps its rendering state. 
This doesn't truly remove the element from the document, as it (and its subtree) still takes up geometric space on the page and can still be clicked on. It also updates the rendering state any time it is needed, even when hidden. content-visibility: hidden, on the other hand, hides the element while preserving its rendering state, so, if there are any changes that need to happen, they only happen when the element is shown again (i.e. the content-visibility: hidden property is removed). Some great use cases for content-visibility: hidden are when implementing advanced virtual scrollers, and measuring layout. Conclusion # content-visibility and the CSS Containment Spec mean some exciting performance boosts are coming right to your CSS file. For more information on these properties, check out: The CSS Containment Spec MDN Docs on CSS Containment CSSWG Drafts
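To make the content-visibility: hidden pattern from the previous section concrete, here is a minimal sketch of toggling a panel from script. setPanelHidden() is a hypothetical helper; it relies only on the element's style.contentVisibility property, and the element passed in is assumed to come from your own page.

```javascript
// Hypothetical helper: hides or reveals an element's contents while letting
// the browser keep its cached rendering state.
function setPanelHidden(element, hidden) {
  // An empty string clears the inline declaration, so the element falls
  // back to whatever the stylesheet specifies.
  element.style.contentVisibility = hidden ? 'hidden' : '';
  return element.style.contentVisibility;
}
```

A tab switcher, for instance, might call setPanelHidden(panel, true) on every inactive panel and setPanelHidden(panel, false) on the one being shown.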

Unblocking clipboard access

Over the past few years, browsers have used document.execCommand() for clipboard interactions. Though widely supported, this method of cutting and pasting came at a cost: clipboard access was synchronous, and could only read and write to the DOM. That's fine for small bits of text, but there are many cases where blocking the page for clipboard transfer is a poor experience. Time consuming sanitization or image decoding might be needed before content can be safely pasted. The browser may need to load or inline linked resources from a pasted document. That would block the page while waiting on the disk or network. Imagine adding permissions into the mix, requiring that the browser block the page while requesting clipboard access. At the same time, the permissions put in place around document.execCommand() for clipboard interaction are loosely defined and vary between browsers. The Async Clipboard API addresses these issues, providing a well-defined permissions model that doesn't block the page. Safari recently announced support for it in version 13.1. With that, major browsers have a basic level of support in place. As of this writing, Firefox only supports text; and image support is limited to PNGs in some browsers. If you're interested in using the API, consult a browser support table before proceeding. The Async Clipboard API is limited to handling text and images. Chrome 84 introduces an experimental feature that allows the clipboard to handle any arbitrary data type. Copy: writing data to the clipboard # writeText() # To copy text to the clipboard call writeText(). 
Since this API is asynchronous, the writeText() function returns a Promise that resolves or rejects depending on whether the passed text is copied successfully: async function copyPageUrl() { try { await navigator.clipboard.writeText(location.href); console.log('Page URL copied to clipboard'); } catch (err) { console.error('Failed to copy: ', err); } } write() # Actually, writeText() is just a convenience method for the generic write() method, which also lets you copy images to the clipboard. Like writeText(), it is asynchronous and returns a Promise. To write an image to the clipboard, you need the image as a blob. One way to do this is by requesting the image from a server using fetch(), then calling blob() on the response. Requesting an image from the server may not be desirable or possible for a variety of reasons. Fortunately, you can also draw the image to a canvas and call the canvas' toBlob() method. Next, pass an array of ClipboardItem objects as a parameter to the write() method. Currently you can only pass one image at a time, but we hope to add support for multiple images in the future. ClipboardItem takes an object with the MIME type of the image as the key and the blob as the value. For Blob objects obtained from fetch() or canvas.toBlob(), the blob.type property automatically contains the correct MIME type for an image. try { const imgURL = '/images/generic/file.png'; const data = await fetch(imgURL); const blob = await data.blob(); await navigator.clipboard.write([ new ClipboardItem({ [blob.type]: blob }) ]); console.log('Image copied.'); } catch (err) { console.error(err.name, err.message); } The copy event # In the case where a user initiates a clipboard copy, non-textual data is provided as a Blob for you. The copy event includes a clipboardData property with the items already in the right format, eliminating the need to manually create a Blob. 
Call preventDefault() to prevent the default behavior in favor of your own logic, then copy contents to the clipboard. What's not covered in this example is how to fall back to earlier APIs when the Clipboard API isn't supported. I'll cover that under Feature detection, later in this article. document.addEventListener('copy', async (e) => { e.preventDefault(); try { let clipboardItems = []; for (const item of e.clipboardData.items) { if (!item.type.startsWith('image/')) { continue; } clipboardItems.push( new ClipboardItem({ [item.type]: item, }) ); await navigator.clipboard.write(clipboardItems); console.log('Image copied.'); } } catch (err) { console.error(err.name, err.message); } }); Paste: reading data from clipboard # readText() # To read text from the clipboard, call navigator.clipboard.readText() and wait for the returned Promise to resolve: async function getClipboardContents() { try { const text = await navigator.clipboard.readText(); console.log('Pasted content: ', text); } catch (err) { console.error('Failed to read clipboard contents: ', err); } } read() # The navigator.clipboard.read() method is also asynchronous and returns a Promise. To read an image from the clipboard, obtain a list of ClipboardItem objects, then iterate over them. Each ClipboardItem can hold its contents in different types, so you'll need to iterate over the list of types, again using a for...of loop. For each type, call the getType() method with the current type as an argument to obtain the corresponding Blob. As before, this code is not tied to images, and will work with other future file types. 
async function getClipboardContents() { try { const clipboardItems = await navigator.clipboard.read(); for (const clipboardItem of clipboardItems) { for (const type of clipboardItem.types) { const blob = await clipboardItem.getType(type); console.log(URL.createObjectURL(blob)); } } } catch (err) { console.error(err.name, err.message); } } The paste event # As noted before, there are plans to introduce events to work with the Clipboard API, but for now you can use the existing paste event. It works nicely with the new asynchronous methods for reading clipboard text. As with the copy event, don't forget to call preventDefault(). document.addEventListener('paste', async (e) => { e.preventDefault(); const text = await navigator.clipboard.readText(); console.log('Pasted text: ', text); }); As with the copy event, falling back to earlier APIs when the Clipboard API isn't supported will be covered under Feature detection. Handling multiple file types # Most implementations put multiple data formats on the clipboard for a single cut or copy operation. There are two reasons for this: as an app developer, you have no way of knowing the capabilities of the app that a user wants to copy text or images to, and many applications support pasting structured data as plain text. This is presented to users with an Edit menu item with a name such as Paste and match style or Paste without formatting. The following example shows how to do this. This example uses fetch() to obtain image data, but it could also come from a <canvas> or the File System Access API. async function copy() { // fetch() resolves to a Response, so read its body as a Blob first. const response = await fetch('kitten.png'); const image = await response.blob(); const text = new Blob(['Cute sleeping kitten'], {type: 'text/plain'}); const item = new ClipboardItem({ 'text/plain': text, 'image/png': image }); await navigator.clipboard.write([item]); } Security and permissions # Clipboard access has always presented a security concern for browsers.
Without proper permissions, a page could silently copy all manner of malicious content to a user's clipboard that would produce catastrophic results when pasted. Imagine a web page that silently copies rm -rf / or a decompression bomb image to your clipboard. Giving web pages unfettered read access to the clipboard is even more troublesome. Users routinely copy sensitive information like passwords and personal details to the clipboard, which could then be read by any page without the user's knowledge. As with many new APIs, the Clipboard API is only supported for pages served over HTTPS. To help prevent abuse, clipboard access is only allowed when a page is the active tab. Pages in active tabs can write to the clipboard without requesting permission, but reading from the clipboard always requires permission. Permissions for copy and paste have been added to the Permissions API. The clipboard-write permission is granted automatically to pages when they are the active tab. The clipboard-read permission must be requested, which you can do by trying to read data from the clipboard. The code below shows the latter: const queryOpts = { name: 'clipboard-read', allowWithoutGesture: false }; const permissionStatus = await navigator.permissions.query(queryOpts); // Will be 'granted', 'denied' or 'prompt': console.log(permissionStatus.state); // Listen for changes to the permission state permissionStatus.onchange = () => { console.log(permissionStatus.state); }; You can also control whether a user gesture is required to invoke cutting or pasting using the allowWithoutGesture option. The default for this value varies by browser, so you should always include it. Here's where the asynchronous nature of the Clipboard API really comes in handy: attempting to read or write clipboard data automatically prompts the user for permission if it hasn't already been granted. 
Since the API is promise-based, this is completely transparent, and a user denying clipboard permission causes the promise to reject so the page can respond appropriately. Because Chrome only allows clipboard access when a page is the active tab, you'll find that some of the examples here don't run if pasted directly into DevTools, since DevTools itself is the active tab. There's a trick: defer clipboard access using setTimeout(), then quickly click inside the page to focus it before the functions are called: setTimeout(async () => { const text = await navigator.clipboard.readText(); console.log(text); }, 2000); Permissions policy integration # To use the API in iframes, you need to enable it with Permissions Policy, which defines a mechanism that allows for selectively enabling and disabling various browser features and APIs. Concretely, you need to pass either or both of clipboard-read or clipboard-write, depending on the needs of your app. <iframe src="index.html" allow="clipboard-read; clipboard-write" > </iframe> Feature detection # To use the Async Clipboard API while supporting all browsers, test for navigator.clipboard and fall back to earlier methods. For example, here's how you might implement pasting to include other browsers. document.addEventListener('paste', async (e) => { e.preventDefault(); let text; if (navigator.clipboard) { text = await navigator.clipboard.readText(); } else { text = e.clipboardData.getData('text/plain'); } console.log('Got pasted text: ', text); }); That's not the whole story. Before the Async Clipboard API, there were a mix of different copy and paste implementations across web browsers. In most browsers, the browser's own copy and paste can be triggered using document.execCommand('copy') and document.execCommand('paste'). 
If the text to be copied is a string not present in the DOM, it must be injected into the DOM and selected: button.addEventListener('click', (e) => { const input = document.createElement('input'); document.body.appendChild(input); input.value = text; input.focus(); input.select(); // execCommand() returns a boolean: false means the command failed. const result = document.execCommand('copy'); document.body.removeChild(input); if (!result) { console.error('Failed to copy text.'); } }); In Internet Explorer, you can also access the clipboard through window.clipboardData. If accessed within a user gesture such as a click event (part of asking permission responsibly), no permissions prompt is shown. Demos # You can play with the Async Clipboard API in the demos below or directly on Glitch. The first example demonstrates moving text on and off the clipboard. To try the API with images, use this demo. Recall that only PNGs are supported and only in a few browsers. Next Steps # Chrome is actively working on expanding the Asynchronous Clipboard API with simplified events aligned with the Drag and Drop API. Because of potential risks, Chrome is treading carefully. To stay up to date on Chrome's progress, watch this article and our blog for updates. For now, support for the Clipboard API is available in a number of browsers. Happy copying and pasting! Related links # MDN Acknowledgements # The Asynchronous Clipboard API was implemented by Darwin Huang and Gary Kačmarčík. Darwin also provided the demo. Thanks to Kyarik and again Gary Kačmarčík for reviewing parts of this article. Hero image by Markus Winkler on Unsplash.
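Pulling together the feature-detection and fallback ideas from this article, a copy helper might look like the following sketch. copyText() is a hypothetical name; the navigator and document objects are passed in as parameters (an assumption made purely so the function can be exercised outside a browser), and the fallback path must still run inside a user gesture such as a click handler.

```javascript
// Hypothetical helper: prefers the Async Clipboard API and falls back to
// document.execCommand('copy') only when navigator.clipboard is unavailable.
// Resolves to a label naming the path that was used.
async function copyText(text, nav = globalThis.navigator, doc = globalThis.document) {
  if (nav && nav.clipboard && typeof nav.clipboard.writeText === 'function') {
    // Rejects (for example, on a denied permission) rather than falling back.
    await nav.clipboard.writeText(text);
    return 'async-clipboard';
  }
  if (!doc) {
    throw new Error('No clipboard mechanism available.');
  }
  // Legacy path: the text has to live in a selected DOM node.
  const input = doc.createElement('textarea');
  input.value = text;
  doc.body.appendChild(input);
  input.select();
  const ok = doc.execCommand('copy'); // boolean result
  doc.body.removeChild(input);
  if (!ok) {
    throw new Error('Failed to copy text.');
  }
  return 'execCommand';
}
```

Returning a label for the path taken is a design choice for this sketch only; it makes it easy to log how often the legacy fallback is still being hit.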

Web on Android

The Android platform has been around for more than ten years, and since its early days it has had great support for the Web. It shipped with WebView, a component that allows developers to use the web inside their own Android Apps. More than that, Android allows developers to bring their own browser engine into the platform, fostering competition and innovation. Developers can include the web in their Android applications in many ways. WebView is frequently used to render ads, as a layout component used along with Android UI elements, or for packaging HTML 5 games. Custom Tabs allows developers to build in-app browsers and provide a seamless navigation experience to third-party web content, and Trusted Web Activity allows developers to use their Progressive Web Apps (PWAs) in Android apps, which can be downloaded from the Play Store. Android WebView # WebView gives developers access to modern HTML, CSS, and JavaScript inside their Android apps, and allows content to be shipped inside the APK or hosted on the internet. It's one of Android's most flexible and powerful components, which can be used for most of the use-cases where web content is included in an Android app. From powering ad services like AdMob to building and shipping complete HTML5 games that use modern APIs such as WebGL. But, when used to create an in-app-browser or including a PWA in an Android application, WebView lacks the security, features, and capabilities of the web platform. The in-app browser challenge # Over time, more and more developers have built browser experiences incorporating third-party content into their Android application, with the goal of creating a more seamless experience for their users when navigating to third-party websites. Those experiences became known as in-app browsers. WebView has extensive support for the modern web tech stack and supports many modern web APIs, like WebGL. But WebView is primarily a web UI toolkit. 
It is not meant to - and does not - support all features of the web platform. When an API already has an OS-level alternative, like Web Bluetooth, or it requires browser UI to be implemented, like push notifications, it may not be supported. As the web platform evolves and adds more features that were only available to Android apps, this gap will become even larger. As app developers don't control which features are used when opening third-party content, it makes WebView a poor choice for in-app browsers or opening Progressive Web Apps. Even if WebView implemented support for all web platform features, developers would still need to write code and implement their own UI for functionality like permissions or push notifications, making it hard to achieve consistency for users. Another option available to developers is embedding a browser engine in their application. Besides leading to increased application size, this approach is both complex and time-consuming. Custom Tabs as a solution for in-app browsers # Custom Tabs was introduced in Chrome 45 and allows developers to use a tab from the user's default browser as part of their application. Custom Tabs was originally launched by Chrome, and was therefore known as "Chrome Custom Tabs". Today it's an Android API and most popular browsers support Custom Tabs, including Chrome, Firefox, Edge, and Samsung Internet, so it's more appropriate to just call it "Custom Tabs". Custom Tabs helps developers seamlessly integrate web content into their app experience. They also allow developers to customise the activity in which web content is shown by allowing them to customize the toolbar color, action buttons, transition animation, and more. They also offer features that were previously unavailable when using WebView or embedding a browser engine. 
Since the in-app browser is powered by the user's browser, Custom Tabs shares storage with the browser, so users don't need to re-login to their favourite websites every time one of their installed apps starts an In-App browsing session. Unlike WebViews, Custom Tabs supports all web platform features and APIs that are supported by the browser powering it. Open Progressive Web Apps using Trusted Web Activity # Progressive Web Apps bring many behaviors and capabilities that were once only available to platform-specific apps to the web. With the introduction of app-like behaviour, the desire from developers to re-use those experiences on Android increased, and developers started asking for ways to integrate PWAs into their apps. Custom Tabs has support for all modern web capabilities and APIs but, since it was primarily designed to open third-party content, it has a toolbar on the top that tells the users which URL they are visiting, as well as the lock icon indicating whether the site is secure. When opening an app's own experience, the toolbar prevents the application from feeling like it is integrated with the operating system. Trusted Web Activities was introduced in Chrome 72 and allows developers to use their PWA inside an Android app. Its protocol is similar to the Custom Tabs protocol, but introduces APIs that allow developers to verify (through Digital Asset Links) that they control both the Android app and the URL being opened and remove the URL bar when both are true. They also introduced APIs for creating splash screens when opening the PWA or delegating web notifications to be handled by Android code. More features like support for Play Billing are coming soon. Since URLs opened in Trusted Web Activities are expected to be PWAs and have a set of behaviors and performance characteristics, Trusted Web Activities introduces quality criteria for PWAs being opened inside it. 
Limitations of the current solutions # Developer feedback showed a need for the platform compatibility of Custom Tabs combined with the flexibility of WebView so they could, for instance, access the DOM or inject JavaScript, into their in-app browsers. Custom Tabs is effectively a tab rendered by the user's browser, with a custom UI or with no UI at all. This means that the browser needs to honour the user's expectations around privacy and security towards the browser, making some of those features impossible. The Web on Android team at Google is looking into alternatives and experimenting with solutions to solve those use-cases. Stay tuned for details! Summary # WebView is useful when an application needs HTML, CSS, and JavaScript inside their Android app, but doesn't use more advanced features and capabilities available on the modern web such as Push Notifications, Web Bluetooth and others. It is not recommended when opening content that has been designed for the modern web platform, as it may not be displayed in the way the developer intended. WebView is not recommended for creating in-app browsers. On the other hand displaying first-party web content is an area where WebViews really shine. Trusted Web Activity should be used when the developers want to render their own Progressive Web App in fullscreen inside their Android application. It can be used as the only activity in the app or used along with other Android activities. Custom Tabs is the recommended way for opening third-party content that is designed for the web platform, also known as in-app browsers.

Referer and Referrer-Policy best practices

Summary #

Unexpected cross-origin information leakage hinders web users' privacy. A protective referrer policy can help.
Consider setting a referrer policy of strict-origin-when-cross-origin. It retains much of the referrer's usefulness, while mitigating the risk of leaking data across origins.
Don't use referrers for Cross-Site Request Forgery (CSRF) protection. Use CSRF tokens instead, and other headers as an extra layer of security.

Before we start:
If you're unsure of the difference between "site" and "origin", check out Understanding "same-site" and "same-origin".
The Referer header is missing an R, due to an original misspelling in the spec. The Referrer-Policy header and referrer in JavaScript and the DOM are spelled correctly.

Referer and Referrer-Policy 101 #

HTTP requests may include the optional Referer header, which indicates the origin or web page URL the request was made from. The Referrer-Policy header defines what data is made available in the Referer header.

In the example below, the Referer header includes the complete URL of the page on site-one from which the request was made.

The Referer header might be present in different types of requests:
Navigation requests, when a user clicks a link.
Subresource requests, when a browser requests images, iframes, scripts, and other resources that a page needs.

For navigations and iframes, this data can also be accessed via JavaScript using document.referrer.

The Referer value can be insightful. For example, an analytics service might use the value to determine that 50% of the visitors on site-two.example came from social-network.example.

But when the full URL, including the path and query string, is sent in the Referer across origins, this can harm privacy and pose security risks as well. Take a look at these URLs:

URLs #1 to #5 contain private information, some of it identifying or sensitive. Leaking these silently across origins can compromise web users' privacy.

URL #6 is a capability URL. You don't want it to fall into the hands of anyone other than the intended user. If this were to happen, a malicious actor could hijack this user's account.

To restrict what referrer data is made available for requests from your site, you can set a referrer policy.

What policies are available and how do they differ? #

You can select one of eight policies. Depending on the policy, the data available from the Referer header (and document.referrer) can be:
No data (no Referer header is present)
Only the origin
The full URL: origin, path, and query string

Some policies are designed to behave differently depending on the context: cross-origin or same-origin request, security (whether the request destination is as secure as the origin), or both. This is useful for limiting the amount of information shared across origins or to less secure origins, while maintaining the richness of the referrer within your own site.

Here is an overview showing how referrer policies restrict the URL data available from the Referer header and document.referrer:

MDN provides a full list of policies and behavior examples.

Things to note:
All policies that take the scheme (HTTPS vs. HTTP) into account (strict-origin, no-referrer-when-downgrade and strict-origin-when-cross-origin) treat requests from an HTTP origin to another HTTP origin the same way as requests from an HTTPS origin to another HTTPS origin, even though HTTP is less secure. That's because for these policies, what matters is whether a security downgrade takes place, i.e. whether the request can expose data from an encrypted origin to an unencrypted one. An HTTP → HTTP request is unencrypted all along, so there is no downgrade. HTTPS → HTTP requests, by contrast, are a downgrade.
If a request is same-origin, this means that the scheme (HTTPS or HTTP) is the same; hence there is no security downgrade.

Default referrer policies in browsers #

As of July 2020

If no referrer policy is set, the browser's default policy will be used.
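To illustrate why the fallback to a browser default matters, here is a minimal sketch that models how a policy shapes the Referer value and what happens when no explicit policy is set. It is a simplified model, not a browser's actual implementation: `computeReferrer` and `effectiveReferrer` are hypothetical helper names, the example page URL is made up, and a "downgrade" is approximated as HTTPS to non-HTTPS (the specification uses the broader notion of potentially trustworthy URLs).

```javascript
// Simplified model of the eight referrer policies.
// Returns the Referer value as a string, or null when no header is sent.
// Assumption: "downgrade" means an HTTPS page requesting a non-HTTPS URL.
function computeReferrer(policy, fromUrl, toUrl) {
  const from = new URL(fromUrl);
  const to = new URL(toUrl);

  // The "full URL" form never includes the fragment or credentials.
  const full = new URL(from.href);
  full.hash = '';
  full.username = '';
  full.password = '';
  const fullUrl = full.href;
  const originOnly = from.origin + '/';

  const sameOrigin = from.origin === to.origin;
  const downgrade = from.protocol === 'https:' && to.protocol !== 'https:';

  switch (policy) {
    case 'no-referrer':
      return null;
    case 'unsafe-url':
      return fullUrl;
    case 'origin':
      return originOnly;
    case 'same-origin':
      return sameOrigin ? fullUrl : null;
    case 'origin-when-cross-origin':
      return sameOrigin ? fullUrl : originOnly;
    case 'strict-origin':
      return downgrade ? null : originOnly;
    case 'no-referrer-when-downgrade':
      return downgrade ? null : fullUrl;
    case 'strict-origin-when-cross-origin':
      if (sameOrigin) return fullUrl;
      return downgrade ? null : originOnly;
    default:
      throw new Error(`Unknown referrer policy: ${policy}`);
  }
}

// Hypothetical helper: use the browser default when no policy is set.
function effectiveReferrer(explicitPolicy, browserDefault, fromUrl, toUrl) {
  return computeReferrer(explicitPolicy ?? browserDefault, fromUrl, toUrl);
}

// Made-up page on site-one requesting a resource from site-two.
const page = 'https://site-one.example/stuff/detail?tag=red';
const target = 'https://site-two.example/';

// Same page, no explicit policy, two different browser defaults:
console.log(effectiveReferrer(undefined, 'no-referrer-when-downgrade', page, target));
// → https://site-one.example/stuff/detail?tag=red
console.log(effectiveReferrer(undefined, 'strict-origin-when-cross-origin', page, target));
// → https://site-one.example/
```

In this model, the same cross-origin request exposes either the full URL or only the origin depending solely on which default the browser applies, which is the unpredictability an explicit policy avoids.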
Browser: Default Referrer-Policy / Behavior
Chrome: Planning to switch to strict-origin-when-cross-origin in version 85 (previously no-referrer-when-downgrade)
Firefox: strict-origin-when-cross-origin (see closed bug); strict-origin-when-cross-origin in Private Browsing and for trackers
Edge: no-referrer-when-downgrade; experimenting with strict-origin-when-cross-origin
Safari: Similar to strict-origin-when-cross-origin. See Preventing Tracking Prevention Tracking for details.

Setting your referrer policy: best practices #

Objective: Explicitly set a privacy-enhancing policy, such as strict-origin-when-cross-origin (or stricter).

There are different ways to set referrer policies for your site:
As an HTTP header
Within your HTML
From JavaScript on a per-request basis

You can set different policies for different pages, requests or elements.

The HTTP header and the meta element are both page-level. The precedence order when determining an element's effective policy is:
1. Element-level policy
2. Page-level policy
3. Browser default

Example:

index.html:
<meta name="referrer" content="strict-origin-when-cross-origin" />
<img src="..." referrerpolicy="no-referrer-when-downgrade" />

The image will be requested with a no-referrer-when-downgrade policy, while all other subresource requests from this page will follow the strict-origin-when-cross-origin policy.

How to see the referrer policy? #

securityheaders.com is handy for determining the policy a specific site or page is using.

You can also use the developer tools of Chrome, Edge, or Firefox to see the referrer policy used for a specific request. At the time of this writing, Safari doesn't show the Referrer-Policy header but does show the Referer that was sent.

Figure: Network panel with a request selected.

Which policy should you set for your website? #

Summary: Explicitly set a privacy-enhancing policy such as strict-origin-when-cross-origin (or stricter).

Why "explicitly"? #

If no referrer policy is set, the browser's default policy will be used. In fact, websites often defer to the browser's default. But this is not ideal, because:
Browser default policies are either no-referrer-when-downgrade, strict-origin-when-cross-origin, or stricter, depending on the browser and mode (private/incognito), so your website won't behave predictably across browsers.
Browsers are adopting stricter defaults such as strict-origin-when-cross-origin and mechanisms such as referrer trimming for cross-origin requests. Explicitly opting into a privacy-enhancing policy before browser defaults change gives you control and helps you run tests as you see fit.

Why strict-origin-when-cross-origin (or stricter)? #

You need a policy that is secure, privacy-enhancing, and useful. What "useful" means depends on what you want from the referrer:
Secure: if your website uses HTTPS (if not, make it a priority), you don't want your website's URLs to leak in non-HTTPS requests. Since anyone on the network can see these, this would expose your users to person-in-the-middle attacks. The policies no-referrer-when-downgrade, strict-origin-when-cross-origin, no-referrer and strict-origin solve this problem.
Privacy-enhancing: for a cross-origin request, no-referrer-when-downgrade shares the full URL, which is not privacy-enhancing. strict-origin-when-cross-origin and strict-origin only share the origin, and no-referrer shares nothing at all. This leaves you with strict-origin-when-cross-origin, strict-origin, and no-referrer as privacy-enhancing options.
Useful: no-referrer and strict-origin never share the full URL, even for same-origin requests, so if you need this, strict-origin-when-cross-origin is a better option.

All of this means that stric