Planet Igalia WebKit

January 19, 2026

Igalia WebKit Team

WebKit Igalia Periodical #53

Update on what happened in WebKit from December 26 to January 19.

We're back! The first periodical of 2026 brings you performance optimizations, improvements to the memory footprint calculation, new APIs, the removal of the legacy Qt5 WPE backend, and as always, progress on JSC's Temporal implementation.

Cross-Port 🐱

The memory footprint calculation mechanism has been unified across the GTK, JSC, and WPE ports. As a result, the expensive /proc/self/smaps is no longer used, and the WPE port now reads /proc/self/statm with an additional cache to avoid frequent file reads.
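The caching idea can be illustrated with a minimal Python sketch. This is not WebKit's code: the class name, the one-second cache lifetime, and the choice of the "resident" field are assumptions made purely for illustration. The only fact relied upon is the statm layout, whose fields are counted in pages.

```python
import mmap
import time

PAGE_SIZE = mmap.PAGESIZE  # statm reports sizes in pages


def _read_statm():
    # Linux-only default reader; tests can inject a different one.
    with open("/proc/self/statm") as f:
        return f.read()


class CachedFootprint:
    """Hypothetical helper: re-reads statm only when the cache is stale."""

    def __init__(self, read_text=_read_statm, max_age_s=1.0, clock=time.monotonic):
        self._read_text = read_text
        self._max_age_s = max_age_s  # assumed cache lifetime
        self._clock = clock
        self._cached_bytes = None
        self._cached_at = None
        self.reads = 0  # number of real reads, kept for illustration

    def resident_bytes(self):
        now = self._clock()
        if self._cached_at is None or now - self._cached_at >= self._max_age_s:
            # Fields: size resident shared text lib data dt
            fields = self._read_text().split()
            self._cached_bytes = int(fields[1]) * PAGE_SIZE
            self._cached_at = now
            self.reads += 1
        return self._cached_bytes
```

Calling resident_bytes() repeatedly within the cache lifetime returns the stored value without touching the file again, which is the effect the change above aims for.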

Added a new webkit_context_menu_get_position() function to the API that allows obtaining the pointer coordinates, relative to the web view origin, at the moment when a context menu was triggered.

Additionally, the behaviour of context menus has been made more consistent between the GTK and WPE ports, and handling of GAction objects attached to menu items has been rewritten and improved with the goal of better supporting context menus in the WPE port.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

In JavaScriptCore's implementation of Temporal, fixed a bug in Temporal.PlainTime.from that read options in the wrong order, which caused a test262 test to fail.

In JavaScriptCore's implementation of Temporal, fixed several bugs in PlainYearMonth methods and enabled all PlainYearMonth tests that don't depend on the Intl object. This completes the implementation of Temporal PlainYearMonth objects in JSC.

Graphics 🖼️

In WebKit's Skia graphics backend, fixed GrDirectContext management for GPU resources. Operations on GPU-backed resources must use the context that created them, not the current thread's context. The fix stores GrDirectContext at creation time for NativeImage and uses surface->recordingContext()->asDirectContext() for SkSurface, correcting multiple call sites that previously used the shared display's context incorrectly.

Damage propagation has been added to the recently added non-composited mode in WPE.

In WebKit's Skia graphics backend for GTK/WPE, added canvas 2D operation recording for GPU-accelerated rendering. Instead of executing drawing commands immediately, operations are recorded into an SkPicture and replayed in batch when the canvas contents are needed. This reduces GPU state change overhead for workloads with many small drawing operations and improves MotionMark Canvas Lines performance on embedded devices with low-end tiled GPUs.

WPE WebKit 📟

Due to Qt5 not receiving maintenance since mid-2025, the WPE Qt5 binding that used the legacy libwpe API has been removed from the tree. The Qt6 binding remains part of the source tree and is a better alternative: it allows using supported Qt versions and is built atop the new WPEPlatform API, making it a future-proof option. The WPE Qt API may be enabled when configuring the build with CMake, using the ENABLE_WPE_QT_API option.

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

The WPEScreenSyncObserver class has been improved to support multiple callbacks. Instead of a single callback set with wpe_screen_sync_observer_set_callback(), clients of the API can now use wpe_screen_sync_observer_add_callback() and wpe_screen_sync_observer_remove_callback(). The observer will be paused automatically when there are no callbacks attached to it.
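The add/remove behaviour with automatic pausing can be modelled in a few lines of Python. This is a toy model only: the class and method names are hypothetical and do not mirror the signatures of the real C functions, which are not described in detail here.

```python
class SyncObserver:
    """Toy model of an observer that only runs while callbacks are attached."""

    def __init__(self):
        self._callbacks = {}
        self._next_id = 1
        self.running = False

    def add_callback(self, callback):
        callback_id = self._next_id
        self._next_id += 1
        self._callbacks[callback_id] = callback
        self.running = True  # the first callback (re)starts the observer
        return callback_id

    def remove_callback(self, callback_id):
        self._callbacks.pop(callback_id, None)
        if not self._callbacks:
            self.running = False  # nothing left to notify: pause

    def sync(self):
        """Simulates one screen refresh; notifies only while running."""
        if self.running:
            for callback in list(self._callbacks.values()):
                callback()
```

Returning an identifier from add_callback() is a design choice of this sketch, so that the same function can be registered more than once and removed individually.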

That’s all for this week!

by Igalia WebKit Team at January 19, 2026 07:25 PM

December 25, 2025

Igalia WebKit Team

WebKit Igalia Periodical #52

Update on what happened in WebKit in the week from December 16 to December 25.

Right during the holiday season 🎄, the last WIP installment of the year comes packed with new releases, a couple of functions added to the public API, cleanups, better timer handling, and improvements to MathML and WebXR support.

Cross-Port 🐱

Landed support for font-size: math. Now math-depth can automatically control the font size inside of <math> blocks, making scripts and nested content smaller to improve readability and presentation.

Two new functions have been added to the public API:

  • webkit_context_menu_item_get_gaction_target() to obtain the GVariant associated with a context menu item created from a GAction.

  • webkit_context_menu_item_get_title() may be used to obtain the title of a context menu item.

Improved timers, by making some of them use the timerfd API. This reduces timer “lateness” (the amount of time elapsed between the configured trigger time and the effective one), which in turn improves the perceived smoothness of animations thanks to steadier frame delivery timings. Systems where the timerfd_create and timerfd_settime functions are not available will continue working as before.

On the WebXR front, support was added for XR_TRACKABLE_TYPE_DEPTH_ANDROID through the XR_ANDROID_trackables extension, which allows reporting depth information for elements that take part in hit testing.

Graphics 🖼️

Landed a change that implements non-composited page rendering in the WPE port. This new mode is disabled by default, and may be activated by disabling the AcceleratedCompositing runtime preference. In that case, frames are rendered using a simplified code path that does not involve the internal WebKit compositor, which may offer better performance in some specific cases on constrained embedded devices.

Since version 2.10.2, the FreeType library can be built with direct support for loading fonts in the WOFF2 format. Until now, the WPE and GTK WebKit ports used libwoff2 in an intermediate step to convert those fonts on-the-fly before handing them to FreeType for rendering. The CMake build system will now detect when FreeType supports WOFF2 directly and skip the conversion step. This way, in systems which provide a suitable version of FreeType, libwoff2 will no longer be needed.

WPE WebKit 📟

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

The legacy libwpe-based API can now be disabled at build time, by toggling the ENABLE_WPE_LEGACY_API CMake option. This allows removal of unneeded code when an application is exclusively using the new WPEPlatform API.

WPE Android 🤖

Adaptation of WPE WebKit targeting the Android operating system.

AHardwareBuffer is now supported as backing for accelerated graphics surfaces that can be shared across processes. This is the last piece of the puzzle to use WPEPlatform on Android without involving expensive operations to copy rendered frames back-and-forth between GPU and system memory.

Releases 📦️

WebKitGTK 2.50.4 and WPE WebKit 2.50.4 have been released. These stable releases include a number of important patches for security issues, and we urge users and distributors to update to this release if they have not yet done so. An accompanying security advisory, WSA-2025-0010, has been published (GTK, WPE).

Development releases of WebKitGTK 2.51.4 and WPE WebKit 2.51.4 are available as well, and may be used to preview upcoming features. As usual, bug reports are welcome in Bugzilla.

Community & Events 🤝

Paweł Lampe has published a blog post that discusses various pre-rendering techniques useful in the context of using WPE on embedded devices.

That’s all for this week!

by Igalia WebKit Team at December 25, 2025 06:26 PM

December 19, 2025

Pawel Lampe

WPE performance considerations: pre-rendering

This article is a continuation of the series on WPE performance considerations. While the previous article touched upon fairly low-level aspects of the DOM tree overhead, this one focuses on more high-level problems related to managing the application’s workload over time. Similarly to before, the considerations and conclusions made in this blog post are strongly related to web applications in the context of embedded devices, and hence the techniques presented should be used with extra care (and benchmarking) if one would like to apply those on desktop-class devices.

The workload

Typical web applications on embedded devices have their workloads distributed over time in various ways. In practice, however, the workload distributions can usually be fitted into one of the following categories:

  1. Idle applications with occasional updates - the applications that present static content and are updated very rarely. As an example, one can think of a dashboard that presents static content and switches the page every, say, 60 seconds, such as a departures/arrivals dashboard at an airport.
  2. Idle applications with frequent updates - the applications that present static content yet are updated frequently (or occasionally present some dynamic content, such as animations). In that case, one can imagine a similar airport departures/arrivals dashboard, yet with animated page scrolling happening quite frequently.
  3. Active applications with occasional updates - the applications that present some dynamic content (animations, multimedia, etc.), yet with major updates happening very rarely. An example one can think of in this case is an application playing video along with presenting some metadata about it, and switching between other videos every few minutes.
  4. Active applications with frequent updates - the applications that present some dynamic content and change the surroundings quite often. In this case, one can think of a stock market dashboard continuously animating the charts and updating the presented real-time statistics very frequently.

Such workloads can be well demonstrated on charts plotting the browser’s CPU usage over time:

Typical web application workloads.

As long as the peak workload (due to updates) is small, no negative effects are perceived by the end user. However, when the peak workload is significant, some negative effects may become noticeable.

In case of applications from groups (1) and (2) mentioned above, a significant peak workload may not be a problem at all. As long as there are no continuous visual changes and no interaction is allowed during updates, the end-user is unable to notice that the browser was not responsive or missed some frames for some period of time. In such cases, the application designer does not need to worry much about the workload.

In other cases, especially the ones involving applications from groups (3) and (4) mentioned above, the significant peak workload may lead to visual stuttering, as any processing making the browser busy for longer than 16.6 milliseconds will lead to lost frames. In such cases, the workload has to be managed in a way that the peaks are reduced either by optimizing them or distributing them over time.

First step: optimization

The first step to addressing the peak workload is usually optimization. The modern web platform offers a wide variety of tools to optimize all the stages of web application processing done by the browser. The usual optimization process is a 2-step cycle that starts with measuring the bottlenecks and continues with fixing them. In the process, the usual improvements involve:

  • using CSS containment,
  • using shadow DOM,
  • promoting certain parts of the DOM to layers and manipulating them with transforms,
  • parallelizing the work with workers/worklets,
  • using the visibility CSS property to separate painting from layout,
  • optimizing the application itself (JavaScript code, the structure of the DOM, the architecture of the application),
  • etc.

Second step: pre-rendering

Unfortunately, in practice, it’s not uncommon that even very well optimized applications still have too much of a peak workload for the constrained embedded devices they’re used on. In such cases, the last resort solution is pre-rendering. As long as it’s possible from the application business-logic perspective, having at least some web page content pre-rendered is very helpful in situations when workload has to be managed, as pre-rendering allows the web application designer to choose the precise moment when the content should actually be rendered and how it should be done. With that, it’s possible to establish a proper trade-off between reduction in peak workload and the amount of extra memory used for storing the pre-rendered contents.

Pre-rendering techniques

Nowadays, the web platform provides at least a few widely adopted APIs that give the application the means to perform various kinds of pre-rendering. Also, due to the ways the browsers are implemented, some APIs can be purposely misused to provide pre-rendering techniques not necessarily supported by the specification. However, in the pursuit of good trade-offs, all the possibilities should be taken into account.

Before jumping into particular pre-rendering techniques, it’s necessary to emphasize that the pre-rendering term used in this article refers to the actual rendering being done earlier than it’s visually presented. In that sense, the resource is rasterized to some intermediate form when desired and then just composited by the browser engine’s compositor later.

Pre-rendering offline

The most basic (and limited at the same time) pre-rendering technique is one that involves rendering offline i.e. before the browser even starts. In that case, the first limitation is that the content to be rendered must be known beforehand. If that’s the case, the rendering can be done in any way, and the result may be captured as e.g. raster or vector image (depending on the desired trade-off). However, the other problem is that such a rendering is usually out of the given web application scope and thus requires extra effort. Moreover, depending on the situation, the amount of extra memory used, the longer web application startup (due to loading the pre-rendered resources), and the processing power required to composite a given resource, it may not always be trivial to obtain the desired gains.

Pre-rendering using canvas

The first group of actual pre-rendering techniques happening during web application runtime is related to Canvas and OffscreenCanvas. Those APIs are really useful as they offer great flexibility in terms of usage and are usually very performant. However, in this case, the natural downside is the lack of support for rendering the DOM inside the canvas. Moreover, canvas has very limited support for painting text, unlike the DOM, where CSS has a significant amount of features related to it. Interestingly, there’s an ongoing proposal called HTML-in-Canvas that could resolve those limitations to some degree. In fact, Blink has a functioning prototype of it already. However, it may take a while before the spec is mature and widely adopted by other browser engines.

When it comes to actual usage of canvas APIs for pre-rendering, the possibilities are numerous, and there are even more of them when combined with processing using workers. The most popular ones are as follows:

  • rendering to an invisible canvas and showing it later,
  • rendering to a canvas detached from the DOM and attaching it later,
  • rendering to an invisible/detached canvas and producing an image out of it to be shown later,
  • rendering to an offscreen canvas and producing an image out of it to be shown later.

When combined with workers, some of the above techniques may be used in the worker threads with the rendered artifacts transferred to the main thread for presentation purposes. In that case, one must be careful with the transfer itself, as some objects may get serialized, which is very costly. To avoid that, it’s recommended to use transferable objects and always perform proper benchmarking to make sure the transfer does not involve serialization in the particular case.

While the use of canvas APIs is usually very straightforward, one must be aware of two extra caveats.

First of all, in the case of many techniques mentioned above, there is no guarantee that the browser will perform actual rasterization at the given point in time. To ensure the rasterization is triggered, it’s usually necessary to enforce it using e.g. a dummy readback (getImageData()).

Finally, one should be aware that the usage of canvas comes with some overhead. Therefore, creating many canvases, or creating them often, may lead to performance problems that could outweigh the gains from the pre-rendering itself.

Pre-rendering using eventually-invisible layers

The second group of pre-rendering techniques happening during web application runtime is limited to the DOM rendering and comes out of a combination of purposeful spec misuse and tricking the browser engine into rasterizing on demand. As one can imagine, this group of techniques is very much browser-engine-specific. Therefore, it should always be backed by proper benchmarking of all the use cases on the target browsers and target hardware.

In principle, all the techniques of this kind consist of 3 parts:

  1. Forcing the content to be pre-rendered onto a separate layer backed by an actual buffer internally in the browser,
  2. Tricking the browser’s compositor into thinking that the layer needs to be rasterized right away,
  3. Ensuring the layer won’t be composited eventually.

When all the elements are combined together, the browser engine will allocate an internal buffer (e.g. texture) to back the given DOM fragment, it will process that fragment (style recalc, layout), and rasterize it right away. It will do so as it will not have enough information to allow delaying the rasterization of the layer (as e.g. in case of display: none). Then, when the compositing time comes, the layer will turn out to be invisible in practice due to e.g. being occluded, clipped, etc. This way, the rasterization will happen right away, but the results will remain invisible until a later time when the layer is made visible.

In practice, the following approaches can be used to trigger the above behavior:

  • for (1), the CSS properties such as will-change: transform, z-index, position: fixed, overflow: hidden etc. can be used depending on the browser engine,
  • for (2) and (3), the CSS properties such as opacity: 0, overflow: hidden, contain: strict etc. can be utilized, again, depending on the browser engine.

The scrolling trick

While the above CSS properties allow for various combinations, in case of WPE WebKit in the context of embedded devices (tested on NXP i.MX8M Plus), the combination that has proven to yield the best performance benefits is a simple approach involving overflow: hidden and scrolling. An example of such an approach is explained below.

Suppose the goal of the application is to update a big table with numbers once every N frames — like in the following demo: random-numbers-bursting-in-table.html?cs=20&rs=20&if=59

Bursting numbers demo.

With the number of idle frames (if) set to 59, the idea is that the application does nothing significant for the 59 frames, and then every 60th frame it updates all the numbers in the table.

As one can imagine, on constrained embedded devices, such an approach leads to a very heavy workload during every 60th frame and hence to lost frames and an unstable application FPS.

As long as the numbers are available earlier than every 60th frame, the above application is a perfect example where pre-rendering could be used to reduce the peak workload.

To simulate that, 3 variants of the approach involving the scrolling trick were prepared for comparison with the above.

In the above demos, the idea is that each cell with a number becomes a scrollable container with 2 numbers actually — one above the other. In that case, because overflow: hidden is set, only one of the numbers is visible while the other is hidden — depending on the current scrolling:

Scrolling trick explained.

With such a setup, it’s possible to update the invisible numbers during idle frames without the user noticing. Due to how WPE WebKit accelerates the scrolling, changing the invisible numbers, in practice, triggers the layout and rendering right away. Moreover, the actual rasterization to the buffer backing the scrollable container happens immediately (depending on the tiling settings), and hence the high cost of layout and text rasterization can be distributed. When the time comes, and all the numbers need to be updated, the scrollable containers can be just scrolled, which in that case turns out to be ~2 times faster than updating all the numbers in place.

To better understand the above effect, it’s recommended to compare the mark views from sysprof traces of the random-numbers-bursting-in-table.html?cs=10&rs=10&if=11 and random-numbers-bursting-in-table-prerendered-1.html?cs=10&rs=10&if=11 demos:

Sysprof from basic demo.



Sysprof from pre-rendering demo.

While the first sysprof trace shows very little processing during the 11 idle frames and a big chunk of processing (21 ms) every 12th frame, the second sysprof trace shows how the load gets distributed. In that case, the amount of work during the 11 idle frames is much bigger (yet manageable), but at the same time, the formerly big chunk of processing every 12th frame is almost halved (to 11 ms). Therefore, the overall frame rate in the application is much better.
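To relate the 21 ms and 11 ms figures to the 16.6 ms frame budget mentioned earlier, here is a small Python sketch of the arithmetic. It assumes a 60 Hz display and, as a simplification, that the burst is the only work competing for the frame; the function names are made up for illustration.

```python
import math


def frame_budget_ms(refresh_hz=60.0):
    """Time available to produce one frame at the given refresh rate."""
    return 1000.0 / refresh_hz


def frames_dropped(work_ms, refresh_hz=60.0):
    """Frames missed when a single burst of work takes work_ms."""
    budget = frame_budget_ms(refresh_hz)
    return max(0, math.ceil(work_ms / budget) - 1)


for work_ms in (21.0, 11.0):
    print(f"{work_ms} ms of work -> {frames_dropped(work_ms)} dropped frame(s)")
# prints:
# 21.0 ms of work -> 1 dropped frame(s)
# 11.0 ms of work -> 0 dropped frame(s)
```

Under those assumptions, the 21 ms burst overruns the budget and costs one frame, while the 11 ms burst fits within a single frame.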

Results

Despite the above improvement speaking for itself, it’s worth summarizing the improvement with the benchmarking results of the above demos obtained from the NXP i.MX8M Plus and presenting the application’s average frames per second (FPS):

Benchmarking results.

Clearly, the positive impact of pre-rendering can be substantial depending on the conditions. In practice, when the rendered DOM fragment is more complex, a trick such as the one above can yield even better results. However, due to how tiling works, the effect can be diminished if the content to be pre-rendered spans multiple tiles. In that case, the browser may defer rasterization until the tiles are actually needed. Therefore, the above needs to be used with care and always with proper benchmarking.

Conclusions

As demonstrated in the above sections, when it comes to pre-rendering the contents to distribute the web application workload over time, the web platform gives both the official APIs to do it, as well as unofficial means through purposeful misuse of APIs and exploitation of browser engine implementations. While this article hasn’t covered all the possibilities available, the above should serve as a good initial read with some easy-to-try solutions that may yield surprisingly good results. However, as some of the ideas mentioned above are very much browser-engine-specific, they should be used with extra care and with the limitations (lack of portability) in mind.

As the web platform constantly evolves, the pool of pre-rendering techniques and tricks should keep evolving as well. Also, as more and more web applications are used on embedded devices, more pressure should be put on the specification, which should yield more APIs targeting the low-end devices in the future. With that in mind, readers are encouraged to stay up-to-date with the latest specification and perhaps even to get involved if they have interesting use cases that would justify introducing new APIs.

December 19, 2025 12:00 AM

December 15, 2025

Igalia WebKit Team

WebKit Igalia Periodical #51

Update on what happened in WebKit in the week from December 8 to December 15.

In this end-of-year special we have a new GMallocString helper that makes management of malloc-based strings more efficient, development releases, and a handful of advancements on JSC's implementation of Temporal, in particular the PlainYearMonth class.

Cross-Port 🐱

Added a GMallocString class to WTF to adopt UTF-8 C strings and make them first-class WebKit citizens efficiently (no copies). Applied it in GStreamer code together with other improvements by using CStringView. Fixed two other string management bugs.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

Releases 📦️

Development releases of WebKitGTK 2.51.3 and WPE WebKit 2.51.3 are now available. These include a number of API additions and new features, and are intended to allow interested parties to test those in advance, prior to the next stable release series. As usual, bug reports are welcome in Bugzilla.

That’s all for this week!

by Igalia WebKit Team at December 15, 2025 07:58 PM

December 08, 2025

Igalia WebKit Team

WebKit Igalia Periodical #50

Update on what happened in WebKit in the week from December 1 to December 8.

In this edition of the periodical we have further advancements on the Temporal implementation, support for the Vivante super-tiled format, and an adaptation of the DMA-BUF formats code to the Android port.

Cross-Port 🐱

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

Implemented the toString, toJSON, and toLocaleString methods for PlainYearMonth objects in JavaScriptCore's implementation of Temporal.

Graphics 🖼️

BitmapTexture and TextureMapper were prepared to handle textures where the logical size (e.g. 100×100) differs from the allocated size (e.g. 128×128) due to alignment requirements. This made it possible to add support for using memory-mapped GPU buffers in the Vivante super-tiled format available on i.MX platforms. Set WEBKIT_SKIA_USE_VIVANTE_SUPER_TILED_TILE_TEXTURES=1 to activate it at runtime.
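The relation between logical and allocated size is the usual round-up-to-alignment computation, sketched below in Python. The 64-pixel alignment is an assumption chosen because it reproduces the 100×100 to 128×128 example; the actual requirement used by WebKit is not stated here, and the function names are illustrative.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def allocated_size(width, height, alignment=64):
    # alignment=64 is an assumed value, not taken from the WebKit sources.
    return align_up(width, alignment), align_up(height, alignment)


logical = (100, 100)
allocated = allocated_size(*logical)
print(allocated)  # (128, 128)

# Fraction of the allocated texture actually covered by content; a consumer
# has to restrict sampling to this part to avoid showing the padding.
coverage = (logical[0] / allocated[0], logical[1] / allocated[1])
print(coverage)  # (0.78125, 0.78125)
```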

WPE WebKit 📟

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

The WPEBufferDMABufFormats class has been renamed to WPEBufferFormats, as it can be used in situations where mechanisms other than DMA-BUF may be used for buffer sharing—on Android targets AHardwareBuffer is used instead, for example. The naming change involved also WPEBufferFormatsBuilder (renamed from WPEBufferDMABufFormatsBuilder), and methods and signals in other classes that use these types. Other than the renames, there is no change in functionality.

That’s all for this week!

by Igalia WebKit Team at December 08, 2025 08:26 PM

December 05, 2025

Enrique Ocaña

Meow: Process log text files as if you could make cat speak

Some years ago I mentioned some command line tools I used to analyze and find useful information in GStreamer logs. I’ve been using them consistently throughout all these years, but some weeks ago I thought about unifying them in a single tool that could provide more flexibility in the mid term, and also as an excuse to unrust my Rust knowledge a bit. That’s how I wrote Meow, a tool to make cat speak (that is, to provide meaningful information).

The idea is that you can cat a file through meow and apply the filters, like this:

cat /tmp/log.txt | meow appsinknewsample n:V0 n:video ht: \
ft:-0:00:21.466607596 's:#([A-Za-z][A-Za-z]*/)*#'

which means “select those lines that contain appsinknewsample (with case insensitive matching), but don’t contain V0 nor video (that is, by exclusion, only those that contain audio, probably because we’ve analyzed both and realized that we should focus on audio for our specific problem), highlight the different thread ids, only show those lines with a timestamp lower than 21.46 sec, and change strings like Source/WebCore/platform/graphics/gstreamer/mse/AppendPipeline.cpp to become just AppendPipeline.cpp”, to get an output as shown in this terminal screenshot:

Screenshot of a terminal output showing multiple log lines. Some of them have the word

Cool, isn’t it? After all, I’m convinced that the answer to any GStreamer bug is always hidden in the logs (or will be, as soon as I add “just a couple of log lines more, bro” 🤭).

Currently, meow supports this set of manipulation commands:

  • Word filter and highlighting by regular expression (fc:REGEX, or just REGEX): Every expression will highlight its matched words in a different color.
  • Filtering without highlighting (fn:REGEX): Same as fc:, but without highlighting the matched string. This is useful for those times when you want to match lines that have two expressions (E1, E2) but the highlighting would pollute the line too much. In those cases you can use a regex such as E1.*E2 and then highlight the subexpressions manually later with an h: rule.
  • Negative filter (n:REGEX): Selects only the lines that don’t match the regex filter. No highlighting.
  • Highlight with no filter (h:REGEX): Doesn’t discard any line, just highlights the specified regex.
  • Substitution (s:/REGEX/REPLACE): Replaces one pattern with another. Any other delimiter character can be used instead of /, if that’s more convenient to the user (for instance, using # when dealing with expressions to manipulate paths).
  • Time filter (ft:TIME-TIME): Assuming the lines start with a GStreamer log timestamp, this filter selects only the lines between the target start and end time. Any of the time arguments (or both) can be omitted, but the - delimiter must be present. Specifying multiple time filters will generate matches that fit on any of the time ranges, but overlapping ranges can trigger undefined behaviour.
  • Highlight threads (ht:): Assuming a GStreamer log, where the thread id appears as the third word in the line, highlights each thread in a different color.

The REGEX pattern is a regular expression. All the matches are case insensitive. When used for substitutions, capture groups can be defined as (?<CAPTURE_NAME>REGEX).

The REPLACEment string is the text that the REGEX will be replaced by when doing substitutions. Text captured by a named capture group can be referred to by ${CAPTURE_NAME}.

The TIME pattern can be any sequence of digits, : or . characters. Typically, it will be a GStreamer timestamp (e.g. 0:01:10.881123150), but it can actually be any other numerical sequence. Times are compared lexicographically, so it’s important that all of them have the same string length.
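The reason for the same-length requirement is easy to see with plain string comparison. The following Python sketch is not meow's code: the function names are invented, and treating the range bounds as inclusive is an assumption.

```python
def in_range(timestamp, start="", end=""):
    """Plain string comparison; an empty bound means 'unbounded'."""
    return (not start or timestamp >= start) and (not end or timestamp <= end)


def keep(timestamp, ranges):
    """A line is kept when its timestamp falls in any of the ranges."""
    return any(in_range(timestamp, start, end) for start, end in ranges)


# Same-length timestamps compare as expected:
print("0:00:09.5" < "0:00:10.0")  # True
# Different lengths do not: "9" sorts after "1", so 9.5 s looks later than 10 s.
print("0:00:9.5" < "0:00:10.0")   # False
```

GStreamer timestamps have a fixed width, so this only becomes a concern with hand-written bounds or other log formats.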

The filtering algorithm has a custom set of priorities for operations, so that they get executed in an intuitive order. For instance, a sequence of filter matching expressions (fc:, fn:) will have the same priority (that is, any of them will let a text line pass if it matches, not forbidding any of the lines already allowed by sibling expressions), while a negative filter will only be applied on the results left by the sequence of filters before it. Substitutions will be applied at their specific position (not before or after), and will therefore modify the line in a way that can alter the matching of subsequent filters. In general, the user doesn’t have to worry about any of this, because the rules are designed to generate the result that you would expect.
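One possible reading of these priorities, limited to positive and negative filters, is sketched below in Python. It is an interpretation of the description above and not meow's actual implementation; the rule representation and the sample lines are hypothetical.

```python
import re


def apply_rules(lines, rules):
    """rules: ordered (kind, regex) pairs, kind being 'filter' or 'negative'."""
    # Consecutive positive filters are grouped so they act as alternatives.
    stages = []
    for kind, pattern in rules:
        regex = re.compile(pattern, re.IGNORECASE)
        if kind == "filter" and stages and stages[-1][0] == "filter":
            stages[-1][1].append(regex)
        else:
            stages.append((kind, [regex]))

    for kind, regexes in stages:
        if kind == "filter":
            # Any sibling filter lets the line pass.
            lines = [l for l in lines if any(r.search(l) for r in regexes)]
        else:
            # A negative filter only prunes what earlier stages left.
            lines = [l for l in lines if not any(r.search(l) for r in regexes)]
    return lines


log = ["audio appsink ok", "video appsink ok", "audio queue full", "idle"]
rules = [("filter", "appsink"), ("filter", "queue"), ("negative", "video")]
print(apply_rules(log, rules))  # ['audio appsink ok', 'audio queue full']
```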

Now some practical examples:

Example 1: Select lines with the word “one”, or the word “orange”, or a number, highlighting each pattern in a different color except the number, which will have no color:

$ cat file.txt | meow one fc:orange 'fn:[0-9][0-9]*'
000 one small orange
005 one big orange

Example 2: Assuming a listing of picture filenames, select the filenames ending in neither “jpg” nor “jpeg”, and insert “.bak” before the extension, which is preserved at the end:

$ cat list.txt | meow 'n:jpe?g' \
   's:#^(?<f>[^.]*)(?<e>[.].*)$#${f}.bak${e}'
train.bak.png
sunset.bak.gif
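
For readers more familiar with Python, the same exclusion and substitution can be approximated with the re module. Note the syntax differences: Python writes named groups as (?P<name>...) and refers to them in the replacement as \g<name>. The input list is made up for illustration, and, as in the command above, the exclusion pattern matches anywhere in the line.

```python
import re

EXCLUDE = re.compile(r"jpe?g", re.IGNORECASE)
PATTERN = re.compile(r"^(?P<f>[^.]*)(?P<e>[.].*)$", re.IGNORECASE)


def rename(names):
    """Drop names matching EXCLUDE, then insert '.bak' before the extension."""
    kept = [name for name in names if not EXCLUDE.search(name)]
    return [PATTERN.sub(r"\g<f>.bak\g<e>", name) for name in kept]


print(rename(["train.png", "beach.jpg", "sunset.gif", "portrait.JPEG"]))
# ['train.bak.png', 'sunset.bak.gif']
```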

Example 3: Only print the log lines with times between 0:00:24.787450146 and 0:00:24.790741865 or those at 0:00:30.492576587 or after, and highlight every thread in a different color:

$ cat log.txt | meow ft:0:00:24.787450146-0:00:24.790741865 \
  ft:0:00:30.492576587- ht:
0:00:24.787450146 739 0x1ee2320 DEBUG …
0:00:24.790382735 739 0x1f01598 INFO …
0:00:24.790741865 739 0x1ee2320 DEBUG …
0:00:30.492576587 739 0x1f01598 DEBUG …
0:00:31.938743646 739 0x1f01598 ERROR …

This is only the beginning. I have great ideas for this new tool (as time allows), such as support for parentheses (so the expressions can be grouped), or call stack indentation on logs generated by tracers, in a similar way to what Alicia’s gst-log-indent-tracers tool does. I might also predefine some common expressions to use in regular expressions, such as the ones to match paths (so that the user doesn’t have to think about them and reinvent the wheel every time). Anyway, these are only ideas. Only time and hyperfocus slots will tell…

For now, you can find the source code on my GitHub. Meow!

by eocanha at December 05, 2025 11:16 AM

December 02, 2025

Igalia WebKit Team

WebKit Igalia Periodical #49

Update on what happened in WebKit in the week from November 24 to December 1.

The main highlights for this week are the completion of `PlainMonthDay` in Temporal, moving networking access for GstWebRTC to the WebProcess, and Xbox Cloud Gaming now working in the GTK and WPE ports.

Cross-Port 🐱

Multimedia 🎥

GStreamer-based multimedia support for WebKit, including (but not limited to) playback, capture, WebAudio, WebCodecs, and WebRTC.

Xbox Cloud Gaming is now usable in WebKitGTK and WPE with the GstWebRTC backend. To make it work, we had to fix the handling of non-spec-compliant ICE candidates and add a WebRTC quirk forcing max-bundle in PeerConnections. Happy cloud gaming!

Support for remote inbound RTP statistics was improved in 303671@main. We now properly report the framesPerSecond and totalDecodeTime metrics, which the Xbox Cloud Gaming service uses to show live stats about the connection and video decoder performance in an overlay.

The GstWebRTC backend now relies on librice for its ICE. The Sans-IO architecture of librice allows us to keep the WebProcess sandboxed and to route WebRTC-related UDP and (eventually) TCP packets using the NetworkProcess. This work landed in 303623@main. The GNOME SDK should also soon ship librice.

Support for seeking in looping videos was fixed in 303539@main.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

Implemented the valueOf and toPlainDate methods for PlainMonthDay objects. This completes the implementation of Temporal PlainMonthDay objects in JSC!

WebKitGTK 🖥️

The GTK port has gained support for interpreting touch input as pointer events. This matches the behaviour of other browsers by following the corresponding specifications.

WPE WebKit 📟

Fixed an issue that prevented WPE from processing further input events after receiving a secondary mouse button press.

Fixed an issue that caused right mouse button clicks to prevent processing of further pointer events.

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

We landed a patch to add a new signal in WPEDisplay to notify when the connection to the native display has been lost.

Infrastructure 🏗️

Modernized the CMake modules used to find libtasn1, libsecret, libxkbcommon, libhyphen, and Enchant libraries.

Note that this work removed the support for building against Enchant 1.x, and only version 2 will be supported. The first stable release to require Enchant 2.x will be 2.52.0 due in March 2026. Major Linux and BSD distributions have included Enchant 2 packages for years, and therefore this change is not expected to cause any trouble. The Enchant library is used by the GTK port for spell checking.

Community & Events 🤝

We have published an article detailing our work making MathML interoperable across browser engines! It has live demonstrations and feature tables with our progress on WebKit support.

We have published new blog posts highlighting the most important changes in both WPE WebKit and WebKitGTK 2.50. Enjoy!

That’s all for this week!

by Igalia WebKit Team at December 02, 2025 02:15 PM

November 27, 2025

WPE WebKit Blog

Highlights of the WPE WebKit 2.50 release series

This fall, the WPE WebKit team has released the 2.50 series of the Web engine after six months of hard work. Let’s have a deeper look at some of the most interesting changes in this release series!

Improved rendering performance

For this series, the threaded rendering implementation has been switched to use the Skia API. What has changed is the way we record the painting commands for each layer. Previously we used WebCore’s built-in mechanism (DisplayList) which is not thread-safe, and led to obscure rendering issues in release builds and/or sporadic assertions in debug builds when replaying the display lists in threads other than the main one. The DisplayList usage was replaced with SkPictureRecorder, Skia’s built-in facility, that provides similar functionality but in a thread-safe manner. Using the Skia API, we can leverage multithreading in a reliable way to replay recorded drawing commands in different worker threads, improving rendering performance.

An experimental hybrid rendering mode has also been added. In this mode, WPE WebKit will attempt to use GPU worker threads for rendering but, if these are busy, CPU worker threads will be used whenever possible. This rendering mode is still under investigation, as it is unclear whether the improvements are substantial enough to justify the extra complexity.

Damage propagation to the system compositor, which was added during the 2.48 cycle but remained disabled by default, has now been enabled. The system compositor may now leverage the damage information for further optimization.

Vertical writing-mode rendering has also received improvements for this release series.

Changes in Multimedia support

When available in the system, WebKit can now leverage the XDG desktop portal for accessing capture devices (like cameras) so that no specific sandbox exception is required. This provides secure access to capture devices in browser applications that use WPE WebKit.

Managed Media Source support has been enabled. This potentially improves multimedia playback, for example in mobile devices, by allowing the user agent to react to changes in memory and CPU availability.

Transcoding is now using the GStreamer built-in uritranscodebin element instead of GstTranscoder, which improves stability of the media recording that needs transcoding.

SVT-AV1 encoder support has been added to the media backend.

WebXR support

The WebXR implementation had been stagnating since it was first introduced, and had a number of shortcomings. This was removed in favor of a new implementation, also built using OpenXR, that better adapts to the multiprocess architecture of WebKit.

This feature is considered experimental in 2.50, and while it is complete enough to load and display a number of immersive experiences, a number of improvements and optional features continue to be actively developed. Therefore, WebXR support needs to be enabled at build time with the ENABLE_WEBXR=ON CMake option.

Android support

Support for Android targets has been greatly improved. It is now possible to build WPE WebKit without the need for additional patches when using the libwpe-based WPEBackend-android. This was achieved by incorporating changes that make WebKit use more appropriate defaults (like disabling MediaSession) or using platform-specific features (like ASharedMemory and AHardwareBuffer) when targeting Android.

The WebKit logging system has gained support for the Android logd service. This is particularly useful for both WebKit and application developers, as it allows logging channels to be configured at runtime in any WPE WebKit build. For example, the following commands may be used before launching an application to debug WebGL setup and multimedia playback errors:

adb shell setprop log.tag.WPEWebKit VERBOSE   # Global logging filter
adb shell setprop debug.WPEWebKit.log 'WebGL,Media=error'  # Channels
adb logcat -s WPEWebKit                   # Follow log messages

There is an ongoing effort to enable the WPEPlatform API on Android, and while it builds now, rendering is not yet working.

Web Platform support

As usual, changes in this area are extensive as WebKit constantly adopts, improves, and supports new Web Platform features. However, some interesting additions in this release cycle include:

API changes

WPEPlatform

Work continues on the new WPEPlatform API, which is still shipped as a preview feature in the 2.50 series and needs to be explicitly enabled at build time with the ENABLE_WPE_PLATFORM=ON CMake option. The API may still change, and applications developed using WPEPlatform are likely to need changes with future WPE WebKit releases. But not for long: the current goal is to have it ready and enabled by default for the upcoming 2.52 series.

One of the main changes is that WPEPlatform now gets built into libWPEWebKit. The rationale for this change is avoiding shipping two copies of shared code from the Web Template Framework (WTF), which saves both disk and memory space usage. The wpe-platform-2.0 pkg-config module is still shipped, which allows application developers to know whether WPEPlatform support has been built into WPE WebKit.

The abstract base class WPEScreenSyncObserver has been introduced, and allows platform implementations to notify on display synchronization, allowing WebKit to better pace rendering.

WPEPlatform has gained support for controllers like gamepads and joysticks through the new WPEGamepadManager and WPEGamepad classes. When building with the ENABLE_MANETTE=ON CMake option a built-in implementation based on libmanette is used by default, if a custom one is not specified.

WPEPlatform now includes a new WPEBufferAndroid class, used to represent graphics buffers backed by AHardwareBuffer. These buffers support being imported into an EGLImage using wpe_buffer_import_to_egl_image().

As part of the work to improve Android support, the buffer rendering and release fences have been moved from WPEBufferDMABuf to the base class, WPEBuffer. This is leveraged by WPEBufferAndroid, and should be helpful if more buffer types are introduced in the future.

Other additions include clipboard support, Interaction Media Features, and an accessibility implementation using ATK.

What’s new for WebKit developers?

WebKit now supports sending tracing marks and counters to Sysprof. Marks indicate when certain events occur and their duration; while counters track variables over time. Together, these allow developers to find performance bottlenecks and monitor internal WebKit performance metrics like frame rates, memory usage, and more. This integration enables developers to analyze the performance of applications, including data for WebKit alongside system-level metrics, in a unified view. For more details see this article, which also details how Sysprof was improved to handle the massive amounts of data produced by WebKit.

Finally, GCC 12.2 is now the minimum required version to build WPE WebKit. Increasing the minimum compiler version allows us to remove obsolete code and focus on improving code quality, while taking advantage of new C++ and compiler features.

Looking forward to 2.52

The 2.52 release series will bring even more improvements, and we expect it to be released during the spring of 2026. Until then!

November 27, 2025 12:00 AM

November 24, 2025

Igalia WebKit Team

WebKit Igalia Periodical #48

Update on what happened in WebKit in the week from November 17 to November 24.

In this week's rendition, the WebView snapshot API was enabled on the WPE port, the Temporal and Trusted Types implementations made further progress, and WebKitGTK and WPE WebKit 2.50.2 were released.

Cross-Port 🐱

A WebKitImage-based implementation of WebView snapshot landed this week, enabling this feature on WPE when it was previously only available in GTK. This means you can now use webkit_web_view_get_snapshot (and webkit_web_view_get_snapshot_finish) to get a WebKitImage-representation of your screenshot.

WebKitImage implements the GLoadableIcon interface (as well as GIcon's), so you can get a PNG-encoded image using g_loadable_icon_load.

Removed an incorrect early return in Trusted Types DOM attribute handling to align with spec changes.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

In JavaScriptCore's implementation of Temporal, implemented the with method for PlainMonthDay objects.

In JavaScriptCore's implementation of Temporal, implemented the from and equals methods for PlainMonthDay objects.

Releases 📦️

WebKitGTK 2.50.2 and WPE WebKit 2.50.2 have been released.

These stable releases include a number of patches for security issues, and as such a new security advisory, WSA-2025-0008, has been issued (GTK, WPE).

It is recommended to apply an additional patch that fixes building when the JavaScriptCore “CLoop” interpreter is enabled, which is typical for architectures where JIT compilation is unsupported. Releases after 2.50.2 will include it and manual patching will no longer be needed.

That’s all for this week!

by Igalia WebKit Team at November 24, 2025 08:12 PM

November 17, 2025

Igalia WebKit Team

WebKit Igalia Periodical #47

Update on what happened in WebKit in the week from November 10 to November 17.

This week's update is composed of a new CStringView internal API, more MathML progress with the implementation of the "scriptlevel" attribute, the removal of the Flatpak-based SDK, and the maintenance update of WPEBackend-fdo.

Cross-Port 🐱

Implemented the MathML scriptlevel attribute using math-depth.

Finished implementing CStringView, which is a wrapper around UTF8 C strings. It allows you to recover the string without making any copies and perform string operations safely by taking into account the encoding at compile time.

Releases 📦️

WPEBackend-fdo 1.16.1 has been released. This is a maintenance update which adds compatibility with newer Mesa versions.

Infrastructure 🏗️

Most of the Flatpak-based SDK was removed. Developers are warmly encouraged to use the new SDK for their contributions to the Linux ports; this SDK has been successfully deployed on the EWS and post-commit bots.

That’s all for this week!

by Igalia WebKit Team at November 17, 2025 09:22 PM

November 10, 2025

Igalia WebKit Team

WebKit Igalia Periodical #46

Update on what happened in WebKit in the week from November 3 to November 10.

This week brought a hodgepodge of fixes in Temporal and multimedia, a small addition to the public API in preparation for future work, plus advances in WebExtensions, WebXR, and Android support.

Cross-Port 🐱

The platform-independent part of the WebXR Hit Test Module has been implemented. The rest, including the FakeXRDevice mock implementation used for testing will be done later.

On the WebExtensions front, parts of the WebExtensionCallbackHandler code have been rewritten to use more C++ constructs and helper functions, in preparation to share more code among the different WebKit ports.

A new WebKitImage utility class landed this week. This image abstraction is one of the steps towards delivering a new improved API for page favicons, and it is also expected to be useful for the WebExtensions work, and to enable the webkit_web_view_get_snapshot() API for the WPE port.

Multimedia 🎥

GStreamer-based multimedia support for WebKit, including (but not limited to) playback, capture, WebAudio, WebCodecs, and WebRTC.

Videos with BT2100-PQ colorspace are now tone-mapped to SDR in WebKit's compositor, ensuring colours do not appear washed out.

Lots of deadlock fixes this week, one among many in the MediaStream GStreamer source element.

Video frame rendering to WebGL was fixed. Another pending improvement is GPU-to-GPU texture copies, which might be coming soon.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

JavaScriptCore's implementation of Temporal received a number of improvements this week:

  • Fixed a bug that would cause wrong results when adding a duration with a very large microseconds or nanoseconds value to a PlainTime.

  • Fixed a rounding bug of Instant values.

  • Fixed a bug that resulted in incorrect printing of certain Instant values before the Epoch.

  • Fixed a bug that resulted in wrong results instead of exceptions when a date addition operation would result in an out-of-range date.

WPE WebKit 📟

WPE Android 🤖

Adaptation of WPE WebKit targeting the Android operating system.

One of the last pieces needed to have the WPEPlatform API working on Android has been merged: a custom platform EGL display implementation, and enabling the default display as fallback.

Community & Events 🤝

The dates for the next Web Engines Hackfest have been announced: it will take place from Monday, June 15th to Wednesday, June 17th. As has been the case in recent years, it will be possible to attend both on-site and remotely, for those who cannot travel to A Coruña.

The video recording for Adrian Pérez's “WPE Android 🤖 State of the Bot” talk from this year's edition of the WebKit Contributors' Meeting has been published. This was an update on what the Igalia WebKit team has done during the last year to improve WPE WebKit on Android, and what is coming up next.

That’s all for this week!

by Igalia WebKit Team at November 10, 2025 11:04 PM

November 03, 2025

Igalia WebKit Team

WebKit Igalia Periodical #45

Update on what happened in WebKit in the week from October 27 to November 3.

A calmer week this time! This week we have the GTK and WPE ports implementing the RunLoopObserver infrastructure, which enables more sophisticated scheduling in WebKit Linux ports, as well as more information in webkit://gpu. On the Trusted Types front, the timing of check was changed to align with spec changes.

Cross-Port 🐱

Implemented the RunLoopObserver infrastructure for GTK and WPE ports, a critical piece of technology previously exclusive to Apple ports that enables sophisticated scheduling features like OpportunisticTaskScheduler for optimal garbage collection timing.

The implementation refactored the GLib run loop to notify clients about activity-state transitions (BeforeWaiting, Entry, Exit, AfterWaiting), then moved from timer-based to observer-based layer flushing for more precise control over rendering updates. Finally, support was added for cross-thread scheduling of RunLoopObservers, allowing the ThreadedCompositor to use them and enabling deterministic composition notifications across thread boundaries.

Changed timing of Trusted Types checks within DOM attribute handling to align with spec changes.

Graphics 🖼️

The webkit://gpu page now shows more information like the list of preferred buffer formats, the list of supported buffer formats, threaded rendering information, number of MSAA samples, view size, and toplevel state.

It is also now possible to make the page refresh automatically every given number of seconds by passing a ?refresh=<seconds> parameter in the URL.

That’s all for this week!

by Igalia WebKit Team at November 03, 2025 07:16 PM

October 28, 2025

Igalia WebKit Team

WebKit Igalia Periodical #44

Update on what happened in WebKit in the week from October 21 to October 28.

This week has again seen a spike in activity related to WebXR and graphics performance improvements. Additionally, we got in some MathML additions, a fix for hue interpolation, a fix for WebDriver screenshots, development releases, and a blog post about memory profiling.

Cross-Port 🐱

Support for WebXR Layers has seen the very first changes needed to have them working on WebKit. This is expected to take time to complete, but should bring improvements in performance, rendering quality, latency, and power consumption down the road.

Work has started on the WebXR Hit Test Module, which will allow WebXR experiences to check for real world surfaces. The JavaScript API bindings were added, followed by an initial XRRay implementation. More work is needed to actually provide data from device sensors.

Now that the WebXR implementation used for the GTK and WPE ports is closer to the Cocoa ones, it was possible to unify the code used to handle opaque buffers.

Implemented the text-transform: math-auto CSS property, which replaces the legacy mathvariant system and is used to make identifiers italic in MathML Core.

Implemented the math-depth CSS extension from MathML Core.

Graphics 🖼️

The hue interpolation method for gradients has been fixed. This is expected to be part of the upcoming 2.50.2 stable release.

Usage of Multi-Sample Antialiasing (MSAA) has been enabled when using GPU rendering, and then further changed to use dynamic MSAA to improve performance.

Paths that contain a single arc, oval, or line have been changed to use a specialized code path, resulting in improved performance.

WebGL content rendering will be handled by a new isolated process (dubbed “GPU Process”) by default. This is the first step towards moving more graphics processing out of the process that handles processing Web content (the “Web Process”), which will result in increased resilience against buggy graphics drivers and certain kinds of malicious content.

The internal webkit://gpu page has been improved to also display information about the graphics configuration used in the rendering process.

WPE WebKit 📟

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

The new WPE Platform, when using Skia (the default), now takes WebDriver screenshots in the UI Process, using the final assembled frame that was sent to the system compositor. This fixes the issues of some operations like 3D CSS animations that were not correctly captured in screenshots.

Releases 📦️

The first development releases for the current development cycle have been published: WebKitGTK 2.51.1 and WPE WebKit 2.51.1. These are intended to let third parties test upcoming features and improvements and as such bug reports for those are particularly welcome in Bugzilla. We are particularly interested in reports related to WebGL, now that it is handled in an isolated process.

Community & Events 🤝

Paweł Lampe has published a blog post that discusses GTK/WPE WebKit memory profiling using industry-standard tools and a built-in "Malloc Heap Breakdown" WebKit feature.

That’s all for this week!

by Igalia WebKit Team at October 28, 2025 02:31 PM

October 24, 2025

Pawel Lampe

Tracking WebKit's memory allocations with Malloc Heap Breakdown

One of the main constraints that embedded platforms impose on the browsers is a very limited memory. Combined with the fact that embedded web applications tend to run actively for days, weeks, or even longer, it’s not hard to imagine how important the proper memory management within the browser engine is in such use cases. In fact, WebKit and WPE in particular receive numerous memory-related fixes and improvements every year. Before making any changes, however, the areas to fix/improve need to be narrowed down first. Like any C++ application, WebKit memory can be profiled using a variety of industry-standard tools. Although such well-known tools are really useful in the majority of use cases, they have their limits that manifest themselves when applied on production-grade embedded systems in conjunction with long-running web applications. In such cases, a very useful tool is a debug-only feature of WebKit itself called malloc heap breakdown, which this article describes.

Industry-standard memory profilers #

When it comes to profiling the memory of applications on Linux systems, the two tools that stand out are Massif (Valgrind) and Heaptrack.

Massif (Valgrind) #

Massif is a heap profiler that comes as part of the Valgrind suite. As its documentation states:

It measures how much heap memory your program uses. This includes both the useful space, and the extra bytes allocated for book-keeping and alignment purposes. It can also measure the size of your program’s stack(s), although it does not do so by default.

Using Massif with WebKit is very straightforward and boils down to a single command:

Malloc=1 valgrind --tool=massif --trace-children=yes WebKitBuild/GTK/Debug/bin/MiniBrowser '<URL>'
  • The Malloc=1 environment variable set above is necessary to instruct WebKit to enable debug heaps that use the system malloc allocator.

Given some results are generated, the memory usage over time can be visualized using massif-visualizer utility. An example of such a visualization is presented in the image below:

[Image: memory usage over time in massif-visualizer]

While Massif has been widely adopted and used for many years now, from the very beginning, it suffered from a few significant downsides.

First of all, the way Massif instruments the profiled application introduces significant overhead that may slow down the application up to 2 orders of magnitude. In some cases, such overhead makes it simply unusable.

The other important problem is that Massif is snapshot-based, and hence, the level of detail is not ideal.

Heaptrack #

Heaptrack is a modern heap profiler developed as part of KDE. The below is its description from the git repository:

Heaptrack traces all memory allocations and annotates these events with stack traces. Dedicated analysis tools then allow you to interpret the heap memory profile to:

  • find hotspots that need to be optimized to reduce the memory footprint of your application
  • find memory leaks, i.e. locations that allocate memory which is never deallocated
  • find allocation hotspots, i.e. code locations that trigger a lot of memory allocation calls
  • find temporary allocations, which are allocations that are directly followed by their deallocation

At first glance, Heaptrack resembles Massif. However, a closer look at the architecture and features shows that it’s much more than the latter. While it’s fair to say it’s a bit similar, in fact, it is a significant progression.

Usage of Heaptrack to profile WebKit is also very simple. At the moment of writing, the most suitable way to use it is to attach to a certain running WebKit process using the following command:

heaptrack -p <PID>

while WebKit itself needs to be run with the system malloc, just as in the Massif case:

WEBKIT_DISABLE_SANDBOX_THIS_IS_DANGEROUS=1 Malloc=1 WebKitBuild/GTK/Debug/bin/MiniBrowser '<URL>'
  • If profiling of e.g. web content process startup is essential, it’s then recommended also to use WEBKIT2_PAUSE_WEB_PROCESS_ON_LAUNCH=1, which adds 30s delay to the process startup.

When the profiling session is done, the analysis of the recordings is done using:

heaptrack --analyze <RECORDING>

The utility opened with the above, shows various things, such as the memory consumption over time:

[Image: Heaptrack chart of memory consumption over time]

flame graphs of memory allocations with respect to certain functions in the code:

[Image: Heaptrack flame graph of memory allocations]

etc.

As Heaptrack records every allocation and deallocation, the data it gathers is very precise and full of details, especially when accompanied by stack traces arranged into flame graphs. Also, as Heaptrack does instrumentation differently than e.g. Massif, it’s usually much faster in the sense that it slows down the profiled application only up to 1 order of magnitude.

Shortcomings on embedded systems #

Although memory profilers such as the ones above are great for everyday use, they have limitations on embedded platforms:

  • they significantly slow down the profiled application — especially on low-end devices,
  • they effectively cannot be run for a longer period of time such as days or weeks, due to memory consumption,
  • they are not always provided in the images — and hence require additional setup,
  • they may not be buildable out of the box on certain architectures — thus requiring extra patching.

While the above limitations are not always a problem, usually at least one of them is. What’s worse, usually at least one of the limitations turns into a blocking problem. For example, if the target device is very short on memory, it may be basically impossible to run anything extra beyond the browser. Another example is when the slowdown caused by the profiler changes the application’s behavior, so that a problem that originally reproduced 100% of the time no longer reproduces.

Malloc heap breakdown in WebKit #

Profiling the memory of WebKit while addressing the above problems points towards a solution that does not involve any extra tools, i.e. instrumenting WebKit itself. Normally, adding such an instrumentation to the C++ application means a lot of work. Fortunately, in the case of WebKit, all that work is already done and can be easily enabled by using the Malloc heap breakdown.

In a nutshell, Malloc heap breakdown is a debug-only feature that enables memory allocation tracking within WebKit itself. Since it’s built into WebKit, it’s very lightweight and very easy to build, as it’s just about setting the ENABLE_MALLOC_HEAP_BREAKDOWN build option. Internally, when the feature is enabled, WebKit switches to using debug heaps that use system malloc along with the malloc zone API to mark objects of certain classes as belonging to different heap zones and thus allowing one to track the allocation sizes of such zones.

As the malloc zone API is specific to BSD-like OSes, the actual implementations (and usages) in WebKit have to be considered separately for Apple and non-Apple ports.

Malloc heap breakdown on Apple ports #

Malloc heap breakdown was originally designed only with Apple ports in mind, with the reason being twofold:

  1. The malloc zone API is provided virtually by all platforms that Apple ports integrate with.
  2. macOS platforms provide a great utility called footprint that allows one to inspect per-zone memory statistics for a given process.

Given the above, usage of malloc heap breakdown with Apple ports is very smooth and as simple as building WebKit with the ENABLE_MALLOC_HEAP_BREAKDOWN build option and running on macOS while using the footprint utility:

Footprint is a macOS specific tool that allows the developer to check memory usage across regions.

For more details, one should refer to the official documentation page.

Malloc heap breakdown on non-Apple ports #

Since all of the non-Apple WebKit ports are mostly being built and run on non-BSD-like systems, it’s safe to assume the malloc zone API is not offered to such ports by the system itself. Because of the above, for many years, malloc heap breakdown was only available for Apple ports.

Fortunately, with the changes introduced in 2025, such as: 294667@main (+ fix 294848@main), 301702@main, and improvements such as: 294848@main, 299555@main, 301695@main, 301709@main, 301712@main, 301839@main, 301861@main, the malloc heap breakdown integrates also with non-Apple ports and is stable as of main@a235408c2b4eb12216d519e996f70828b9a45e19.

The idea behind the integration for non-Apple ports is to provide a simple WebKit-internal library that provides a fake <malloc/malloc.h> header along with simple implementation that provides malloc_zone_*() function implementations as proxy calls to malloc(), calloc(), realloc() etc. along with a tracking mechanism that keeps references to memory chunks. Such an approach gathers all the information needed to be reported later on.
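The bookkeeping side of this idea can be modeled in a few lines of Python. This is only a toy model for illustration: the real library is C++ and forwards to malloc() and friends, whereas here integer chunk ids stand in for pointers. The class and method names are hypothetical and do not exist in WebKit.

```python
import threading
from collections import defaultdict

class ZoneTracker:
    """Toy model: remembers every live chunk per zone (hypothetical names)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chunks = defaultdict(dict)  # zone name -> {chunk id: size}
        self._next_id = 1

    def malloc(self, zone, size):
        # Every allocation takes the lock and records the chunk.
        with self._lock:
            chunk_id = self._next_id
            self._next_id += 1
            self._chunks[zone][chunk_id] = size
            return chunk_id

    def free(self, zone, chunk_id):
        with self._lock:
            self._chunks[zone].pop(chunk_id, None)

    def report(self):
        """Return {zone: (number of live chunks, total live bytes)}."""
        with self._lock:
            return {zone: (len(chunks), sum(chunks.values()))
                    for zone, chunks in self._chunks.items()}
```

Keeping a record per live chunk is what makes it possible to report both a chunk count and a byte total for each zone.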

At the moment of writing, the above allows 2 methods of reporting the memory usage statistics periodically:

  • printing to standard output,
  • reporting to sysprof as counters.

Periodic reporting to standard output

By default, when WebKit is built with ENABLE_MALLOC_HEAP_BREAKDOWN, the heap breakdown is printed to the standard output every few seconds for each process. That can be tweaked by setting WEBKIT_MALLOC_HEAP_BREAKDOWN_LOG_INTERVAL=<SECONDS> environment variable.

The results have a structure similar to the one below:

402339 MHB: | PID | "Zone name" | #chunks | #bytes | {
402339 "ExecutableMemoryHandle" 2 32
402339 "AssemblerData" 1 192
402339 "VectorBuffer" 37 16184
402339 "StringImpl" 103 5146
402339 "WeakPtrImplBase" 17 272
402339 "HashTable" 37 9408
402339 "Vector" 1 16
402339 "EmbeddedFixedVector" 1 32
402339 "BloomFilter" 2 65536
402339 "CStringBuffer" 3 86
402339 "Default Zone" 0 0
402339 } MHB: grand total bytes allocated: 9690
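A report with this structure is easy to parse with a small script. The Python sketch below assumes that the row layout is exactly PID, quoted zone name, chunk count, and byte count, as in the sample above. The function names are made up for this illustration.

```python
import re

# One data row: PID "Zone name" #chunks #bytes
ROW = re.compile(r'^(?P<pid>\d+) "(?P<zone>[^"]+)" (?P<chunks>\d+) (?P<bytes>\d+)$')

def parse_report(text):
    """Return {zone name: (chunks, bytes)} for the rows found in text."""
    zones = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header and footer lines do not match and are skipped
            zones[match["zone"]] = (int(match["chunks"]), int(match["bytes"]))
    return zones

def largest_zones(zones, count=3):
    """Return the names of the zones holding the most bytes, largest first."""
    return sorted(zones, key=lambda name: zones[name][1], reverse=True)[:count]
```

Sorting zones by byte count is a quick first step when looking for the zone responsible for unexpected memory growth.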

Given the allocation statistics per-zone, it’s easy to narrow down the unusual usage patterns manually. The example of a successful investigation is presented in the image below:

[Image: example of a successful investigation]

Moreover, the data presented can be processed either manually or using scripts to create memory usage charts that span as long as the application lifetime so e.g. hours (20+ like below), days, or even longer:

TODO.
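
As an illustration of such script-based processing, below is a minimal Python sketch that parses the per-zone lines into a dictionary and ranks zones by allocated bytes. It assumes the line layout shown in the sample output above (PID, quoted zone name, number of chunks, number of bytes); the names parse_mhb and top_zones are hypothetical helpers, not part of WebKit.

```python
import re
from collections import defaultdict

# One data line looks like: 402339 "StringImpl" 103 5146
# (layout assumed from the sample output above).
ZONE_LINE = re.compile(r'^(\d+)\s+"([^"]+)"\s+(\d+)\s+(\d+)\s*$')


def parse_mhb(lines):
    """Return {pid: {zone_name: (chunks, nbytes)}}, keeping the latest values seen."""
    report = defaultdict(dict)
    for line in lines:
        match = ZONE_LINE.match(line.strip())
        if match is None:
            continue  # header, footer, or unrelated log output
        pid, zone, chunks, nbytes = match.groups()
        report[int(pid)][zone] = (int(chunks), int(nbytes))
    return dict(report)


def top_zones(zones, count=3):
    """Return (zone_name, nbytes) pairs sorted by allocated bytes, largest first."""
    ranked = sorted(zones.items(), key=lambda item: item[1][1], reverse=True)
    return [(name, nbytes) for name, (_chunks, nbytes) in ranked[:count]]


# A few lines copied from the sample output above.
sample = [
    '402339 MHB: | PID | "Zone name" | #chunks | #bytes | {',
    '402339 "VectorBuffer" 37 16184',
    '402339 "StringImpl" 103 5146',
    '402339 "BloomFilter" 2 65536',
    '402339 "Default Zone" 0 0',
]

report = parse_mhb(sample)
for name, nbytes in top_zones(report[402339]):
    print(f"{name}: {nbytes} bytes")
```

Feeding successive reports through parse_mhb and storing the results with a timestamp would be one way to obtain the data points for a chart.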

Periodic reporting to sysprof

The other reporting mechanism currently supported is reporting periodically to sysprof as counters. In short, sysprof is a modern system-wide profiling tool that already integrates with WebKit very well when it comes to non-Apple ports.

The condition for malloc heap breakdown reporting to sysprof is that the WebKit browser needs to be profiled e.g. using:

sysprof-cli -f -- <BROWSER_COMMAND>

and sysprof has to be as recent a version as possible.

With the above, the memory usage statistics can then be inspected using the sysprof utility and look like the image below:

TODO.

In the case of sysprof, memory statistics are just a minor addition to other powerful features that were well described in this blog post from Georges.

Caveats #

While malloc heap breakdown is very useful in some use cases — especially on embedded systems — there are a few problems with it.

First of all, compilation with -DENABLE_MALLOC_HEAP_BREAKDOWN=ON is not guarded by any continuous integration bots; therefore, compilation issues are to be expected on the latest WebKit main. Fortunately, fixing the problems is usually straightforward. For a reference on what usually causes compilation problems, one should refer to 299555@main, which contains a full variety of fixes.

The second problem is that malloc heap breakdown uses WebKit’s debug heaps, and hence the memory usage patterns may be different just because system malloc is used.

The third and final problem is that malloc heap breakdown integration for non-Apple ports introduces some overhead, as allocations need to lock/unlock a mutex, and as the statistics are stored in memory as well.

Opportunities #

Although malloc heap breakdown can be considered fairly constrained, in the case of non-Apple ports, it gives some additional possibilities that are worth mentioning.

Because a custom library is used to track allocations on non-Apple ports (as mentioned at the beginning of the Malloc heap breakdown on non-Apple ports section), it’s very easy to add more sophisticated tracking/debugging/reporting capabilities. The only file that requires changes in such a case is: Source/WTF/wtf/malloc_heap_breakdown/main.cpp.

Some examples of custom modifications include:

  • adding different reporting mechanisms — e.g. writing to a file, or to some other tool,
  • reporting memory usage with more details — e.g. reporting the per-memory-chunk statistics,
  • dumping raw memory bytes — e.g. when some allocations are suspicious,
  • altering memory in-place — e.g. to simulate memory corruption.

Summary #

While the presented malloc heap breakdown mechanism is a rather poor approximation of what industry standard tools offer, the main benefit of it is that it’s built into WebKit, and that in some rare use-cases (especially on embedded platforms), it’s the only way to perform any reasonable profiling.

In general, as a rule of thumb, it’s not recommended to use malloc heap breakdown unless all other methods have failed. In that sense, it should be considered a last resort approach. With that in mind, malloc heap breakdown can be seen as a nice mechanism complementing other tools in the toolbox.

October 24, 2025 12:00 AM

October 20, 2025

Igalia WebKit Team

WebKit Igalia Periodical #43

Update on what happened in WebKit in the week from October 13 to October 20.

This week was calmer than the previous week but we still had some meaningful updates. We had a Selenium update, improvements to how tile sizes are calculated, and a new Igalian in the list of WebKit committers!

Cross-Port 🐱

Selenium's relative locators are now supported after commit 301445@main. Before, finding elements with locate_with(By.TAG_NAME, "input").above({By.ID: "password"}) could lead to "Unsupported locator strategy" errors.

Graphics 🖼️

A patch landed to compute the layers tile size, using a different strategy depending on whether GPU rendering is enabled, which improved the performance for both GPU and CPU rendering modes.

Community & Events 🤝

Our coworker Philip Chimento gained WebKit committer status!

That’s all for this week!

by Igalia WebKit Team at October 20, 2025 08:35 PM

October 13, 2025

Igalia WebKit Team

WebKit Igalia Periodical #42

Update on what happened in WebKit in the week from October 6 to October 13.

Another week with many updates in Temporal; the automated testing infrastructure is now running WebXR API tests; and WebKitGTK gets a fix for the janky Inspector resize while it drops support for libsoup 2. Last but not least, there are fresh releases of both the WPE and GTK ports including a security fix.

Cross-Port 🐱

Multimedia 🎥

GStreamer-based multimedia support for WebKit, including (but not limited to) playback, capture, WebAudio, WebCodecs, and WebRTC.

When using libwebrtc, support has been added to register MDNS addresses of local networks as ICE candidates, to avoid exposing private addresses.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

JavaScriptCore's implementation of Temporal received a flurry of improvements:

  • Implemented the toString, toJSON, and toLocaleString methods for the PlainMonthDay type.

  • Brought the implementation of the round method on TemporalDuration objects up to spec. This is the last in the series of patches that refactor TemporalDuration methods to use the InternalDuration type, enabling mathematically precise computations on time durations.

  • Implemented basic support for the PlainMonthDay type, without most methods yet.

  • Brought the implementations of the since and until functions on Temporal PlainDate objects up to spec, improving the precision of computations.

WebKitGTK 🖥️

WebKitGTK will no longer support using libsoup 2 for networking starting with version 2.52.0, due in March 2026. An article on the website has more details and migration tips for application developers.

Fixed a bug that made the width and height of the docked Web Inspector window jitter while dragging the resizer.

Releases 📦️

WebKitGTK 2.50.1 and WPE WebKit 2.50.1 have been released. These include a number of small fixes, improved text rendering performance, and a fix for audio playback on Instagram.

A security advisory, WSA-2025-0007 (GTK, WPE), covers one security issue fixed in these releases. As usual, we recommend users and distributors to keep their WPE WebKit and WebKitGTK packages updated.

Infrastructure 🏗️

Updated the API test runner to run monado-service without standard input using XRT_NO_STDIN=TRUE, which allows the WPE and GTK bots to start validating the WebXR API.

Submitted a change that allows relaxing the DMA-BUF requirement when creating an OpenGL display in the OpenXRCoordinator, so that bots can run API tests in headless environments that don't have that extension.

That’s all for this week!

by Igalia WebKit Team at October 13, 2025 08:02 PM

October 06, 2025

Igalia WebKit Team

WebKit Igalia Periodical #41

Update on what happened in WebKit in the week from September 29 to October 6.

Another exciting week full of updates! This time we have a number of fixes for MathML, Content Security Policy, and Trusted Types alignment; a public API for WebKitWebExtension has finally been added; and the enumeration of speaker devices has been fixed. In addition to that, there's ongoing work to improve compatibility for broken AAC audio streams in MSE, a performance improvement to text rendering with Skia was merged, and multi-plane DMA-BUF handling in WPE was fixed. Last but not least, the 2026 edition of the Web Engines Hackfest has been announced! It will take place from June 15th to the 17th.

Cross-Port 🐱

Fixed rendering for unknown elements in MathML.

Fixed incorrect parsing of malformed require-trusted-types-for CSP directive.

Aligned reporting of Trusted Types violations with the spec in case of multiple Content-Security-Policy headers.

Aligned Trusted Types event handler namespace checks with an update to the specification.

Fixed some incorrect handling of null or undefined policy values in Trusted Types.

On the WebExtensions front, the WebKitWebExtension API has finally been added, after porting some more code from Objective-C to C++.

Improved alignment with MathML Core by making mfenced, semantics and maction render like an mrow, ignoring the subscriptshift/superscriptshift legacy attributes and cleaning the User-Agent stylesheet to more closely match the spec.

Multimedia 🎥

GStreamer-based multimedia support for WebKit, including (but not limited to) playback, capture, WebAudio, WebCodecs, and WebRTC.

Speaker device enumeration has been fixed to properly enumerate ALSA PCM devices, while improving audio output device handling in general.

Improved compatibility for broken AAC audio streams in MSE is currently in review.

JavaScriptCore 🐟

The built-in JavaScript/ECMAScript engine for WebKit, also known as JSC or SquirrelFish.

In JavaScriptCore's implementation of Temporal, improved the precision of addition and subtraction on Durations.

In JavaScriptCore's implementation of Temporal, improved the precision of calculations with the total() function on Durations. This was joint work with Philip Chimento.

In JavaScriptCore's implementation of Temporal, continued refactoring addition for Durations to be closer to the spec.

Graphics 🖼️

Landed a patch to build a SkTextBlob when recording DrawGlyphs operations for the GlyphDisplayListCache, which shows a significant improvement in MotionMark “design” test when using GPU rendering.

WPE WebKit 📟

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

Improved wpe_buffer_import_to_pixels() to work correctly on non-linear and multi-plane DMA-BUF buffers by taking into account their modifiers when mapping the buffers.

Community & Events 🤝

The 2026 edition of the Web Engines Hackfest has been announced, and it will take place from June 15th to the 17th.

That’s all for this week!

by Igalia WebKit Team at October 06, 2025 08:20 PM

September 29, 2025

Igalia WebKit Team

WebKit Igalia Periodical #40

Update on what happened in WebKit in the week from September 22 to September 29.

Lots of news this week! We've got a performance improvement in the Vector implementation, a fix that makes an SVG attribute work similarly to HTML, and further advancements on WebExtension support. We also saw an update to WPE Android, WebXR support in WPE Android, test infrastructure that can now run WebXR tests, and a rather comprehensive blog post about the performance considerations of WPE WebKit with regards to the DOM tree.

Cross-Port 🐱

Vector copy performance was improved across the board, and especially for MSE use cases.

Fixed SVG <a> rel attribute to work the same as HTML <a>'s.

Work on WebExtension support continues with more Objective-C converted to C++, which allows all WebKit ports to reuse the same utility code.

Added handling of the visibilityState value for inline WebXR sessions.

Graphics 🖼️

WPE now supports importing pixels from non-linear DMABuf formats since commit 300687@main. This will help the work to make WPE take screenshots from the UIProcess (WIP) instead of from the WebProcess, so they match better what's actually shown on the screen.

Added support for the WebXR passthroughFullyObscured rendering hint when using the OpenXR backend.

WPE WebKit 📟

WPE Platform API 🧩

New, modern platform API that supersedes usage of libwpe and WPE backends.

The build system will now compile WPEPlatform with warning-as-errors in developer builds. This helps catch potential programming errors earlier.

WPE Android 🤖

Adaptation of WPE WebKit targeting the Android operating system.

WPE-Android is being updated to use WPE WebKit 2.50.0. As usual, the ready-to-use packages will arrive in a few days to the Maven Central repository.

Added support to run WebXR content on Android, by using AHardwareBuffer to share graphics buffers between the main process and the content rendering process. This required coordination to make the WPE-Android runtime glue expose the current JavaVM and Activity in a way that WebKit could then use to initialize the OpenXR platform bindings.

Community & Events 🤝

Paweł Lampe has published in his blog the first post in a series about different aspects of Web engines that affect performance, with a focus on WPE WebKit and interesting comparisons between desktop-class hardware and embedded devices. This first article analyzes how “idle” nodes in the DOM tree render measurable effects on performance (pun intended).

Infrastructure 🏗️

The test infrastructure can now run API tests that need WebXR support, by using a dummy OpenXR compositor provided by the Monado runtime, along with the first tests and an additional one that make use of this.

That’s all for this week!

by Igalia WebKit Team at September 29, 2025 08:34 PM

September 26, 2025

Pawel Lampe

WPE performance considerations: DOM tree

Designing performant web applications is not trivial in general. Nowadays, as many companies decide to use the web platform on embedded devices, the problem becomes even more complicated. Typical embedded devices are orders of magnitude slower than desktop-class ones. Moreover, the proportion between CPU and GPU power is commonly different as well. This usually results in unexpected performance bottlenecks when web applications designed with desktop-class devices in mind are executed in embedded environments.

In order to help web developers approach the difficulties that using the web platform on embedded devices may bring, this blog post initiates a series of articles covering various performance-related aspects in the context of WPE WebKit usage on embedded devices. The coverage in general will include:

  • introducing the demo web applications dedicated to showcasing use cases of a given aspect,
  • benchmarking and profiling the WPE WebKit performance using the above demos,
  • discussing the causes for the performance measured,
  • inferring some general pieces of advice and rules of thumb based on the results.

This article, in particular, discusses the overhead of nodes in the DOM tree when it comes to layouting. It does that primarily by investigating the impact of idle nodes that introduce the least overhead and hence may serve as a lower bound for any general considerations. With the data presented in this article, it should be clear how the DOM tree size/depth scales in the case of embedded devices.

DOM tree #

Historically, the DOM trees emerging from the usual web page designs were rather limited in size and fairly shallow. This was the case as there were no reasons for them to be excessively large unless the web page itself had a very complex UI. Nowadays, not only are the DOM trees much bigger and deeper, but they also tend to contain idle nodes that artificially increase the size/depth of the tree. The idle nodes are the nodes in the DOM that are active yet do not contribute to any visual effects. Such nodes are usually a side effect of using various frameworks and approaches that conceptualize components or services as nodes, which then participate in various kinds of processing utilizing JavaScript. Other than idle nodes, the DOM trees are usually bigger and deeper nowadays, as there are simply more possibilities that emerged with the introduction of modern APIs such as Shadow DOM, Anchor positioning, Popover, and the like.

In the context of web platform usage on embedded devices, the natural consequence of the above is that web designers require more knowledge on how the particular browser performance scales with the DOM tree size and shape. Before considering embedded devices, however, it’s worth taking a brief look at how various web engines scale on desktop with the DOM tree growing in depth.

Desktop considerations #

To measure the impact of the DOM tree depth on the performance, the random-number-changing-in-the-tree.html?vr=0&ms=1&dv=0&ns=0 demo can be used to perform a series of experiments with different parameters.

In short, the above demo measures the average duration of a benchmark function run, where the run does the following:

  • changes the text of a single DOM element to a random number,
  • forces a full tree layout.

Moreover, the demo allows one to set 0 or more parent idle nodes for the node holding text, so that the layout must consider those idle nodes as well.

The parameters used in the URL above mean the following:

  • vr=0 — the results are reported to the console. Alternatively (vr=1), at the end of benchmarking (~23 seconds), the result appears on the web page itself.
  • ms=1 — the results are reported in “milliseconds per run”. Alternatively (ms=0), “runs per second” are reported instead.
  • dv=0 — the idle nodes are using <span> tag. Alternatively, (dv=1) <div> tag is used instead.
  • ns=N — the N idle nodes are added.
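
To run a series of experiments, the demo URLs for different values of ns can be generated with a few lines of Python. This is only a sketch: the demo_url helper and its keyword names are hypothetical, and only the vr, ms, dv, and ns parameters come from the description above.

```python
from urllib.parse import urlencode


def demo_url(base, idle_nodes, *, on_page=False, milliseconds=True, use_div=False):
    """Build a demo URL from the four query parameters described above."""
    params = {
        "vr": int(on_page),       # 0: report to the console, 1: report on the page
        "ms": int(milliseconds),  # 1: milliseconds per run, 0: runs per second
        "dv": int(use_div),       # 0: <span> idle nodes, 1: <div> idle nodes
        "ns": idle_nodes,         # number of extra idle nodes
    }
    return f"{base}?{urlencode(params)}"


# One URL per experiment; the list of node counts is an arbitrary example.
urls = [demo_url("random-number-changing-in-the-tree.html", n) for n in (0, 10, 100)]
for url in urls:
    print(url)
```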

The idea behind the experiment is to check how much overhead is added as the number of extra idle nodes (ns=N) in the DOM tree increases. Since the browsers used in the experiments cannot be compared fairly for various reasons, the results are presented in relative terms for each browser separately rather than as concrete numbers in milliseconds. That is, the benchmarking result for ns=0 serves as a baseline, and the other results are expressed relative to that baseline, where, e.g., 300% means 3 times the baseline duration.
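
The normalization can be sketched in Python as follows. The snippet expresses each result as a percentage of the ns=0 baseline, so that 3 times the baseline duration maps to 300%. The durations used here are made-up placeholder values chosen only to make the arithmetic easy to follow; they are not measurements from any browser.

```python
def relative_to_baseline(durations_ms):
    """Map {idle_node_count: duration_ms} to percentages of the ns=0 duration."""
    baseline = durations_ms[0]
    return {nodes: 100.0 * duration / baseline
            for nodes, duration in durations_ms.items()}


# Placeholder durations (NOT real measurements), keyed by the ns parameter.
placeholder_durations = {0: 2.0, 10: 3.0, 100: 6.0}
relative = relative_to_baseline(placeholder_durations)
for nodes, percent in relative.items():
    print(f"ns={nodes}: {percent:.0f}% of baseline")
```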

The results for a few mainstream browsers/browser engines (WebKit GTK MiniBrowser [09.09.2025], Chromium 140.0.7339.127, and Firefox 142.0) and a few experimental ones (Servo [04.07.2024] and Ladybird [30.06.2024]) are presented in the image below:

Idle nodes overhead on mainstream browsers.

As the results show, trends among all the browsers are very close to linear. It means that the overhead is very easy to assess, as usually N times more idle nodes will result in N times the overhead. Moreover, up until 100-200 extra idle nodes in the tree, the overhead trends are very similar in all the browsers except for experimental Ladybird. That in turn means that even for big web applications, it’s safe to assume the overhead among the browsers will be very much the same. Finally, past the 200 extra idle nodes threshold, the overhead across browsers diverges. It’s very likely due to the fact that the browsers are not optimizing such cases as a result of a lack of real-world use cases.

All in all, the conclusion is that on desktop, only very large / specific web applications should be cautious about the overhead of nodes, as modern web browsers/engines are very well optimized for handling substantial amounts of nodes in the DOM.

Embedded device considerations #

When it comes to the embedded devices, the above conclusions are no longer applicable. To demonstrate that, a minimal browser utilizing WPE WebKit is used to run the demo from the previous section both on desktop and NXP i.MX8M Plus platforms. The latter is a popular choice for embedded applications as it has quite an interesting set of features while still having strong specifications, which may be compared to those of Raspberry Pi 5. The results are presented in the image below:

Idle nodes overhead compared between desktop and embedded devices.

This time, the Y axis presents the duration (in milliseconds) of a single benchmark run, and hence makes it very easy to reason about overhead. As the results show, in the case of the desktop, 100 extra idle nodes in the DOM introduce barely noticeable overhead. On the other hand, on an embedded platform, even without any extra idle nodes, the time to change and layout the text is already taking around 0.6 ms. With 10 extra idle nodes, this duration increases to 0.75 ms — thus yielding 0.15 ms overhead. With 100 extra idle nodes, such overhead grows to 1.3 ms.

One may question whether 1.3 ms is much, but considering an application that, e.g., renders at 60 FPS, the time at the application’s disposal for each frame is at most 16.67 ms, and 1.3 ms is ~8% of that, which is considerable. Similarly, for the application to be perceived as responsive, the input-to-output latency should usually be under 20 ms. Again, 1.3 ms is a significant overhead for such a scenario.
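
The budget arithmetic above can be reproduced with a short Python sketch. The 1.3 ms overhead, the 60 FPS target, and the 20 ms latency threshold are taken from the text; the helper name share_of_budget is just an illustrative choice.

```python
def share_of_budget(overhead_ms: float, budget_ms: float) -> float:
    """Return the fraction of a time budget consumed by the given overhead."""
    return overhead_ms / budget_ms


FRAME_BUDGET_MS = 1000.0 / 60  # ~16.67 ms per frame at 60 FPS
LATENCY_BUDGET_MS = 20.0       # input-to-output latency threshold
OVERHEAD_MS = 1.3              # overhead of 100 extra idle nodes

frame_share = share_of_budget(OVERHEAD_MS, FRAME_BUDGET_MS)
latency_share = share_of_budget(OVERHEAD_MS, LATENCY_BUDGET_MS)
print(f"frame budget: {frame_share:.1%}, latency budget: {latency_share:.1%}")
```

This works out to roughly 7.8% of the frame budget and 6.5% of the latency budget.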

Given the above, it’s safe to state that 20 extra idle nodes should be considered the safe maximum for embedded devices in general. In the case of low-end embedded devices, i.e. ones comparable to Raspberry Pi 1 and 2, the maximum should be even lower, but proper benchmarking is required to come up with concrete numbers.

Inline vs block #

While the previous subsection demonstrated that on embedded devices, adding extra idle nodes as parents must usually be done in a responsible way, it’s worth examining if there are nuances that need to be considered as well.

The first matter that one may wonder about is whether there’s any difference between the overhead of idle nodes being inlines (display: inline) or blocks (display: block). The intuition here may be that, as idle nodes have no visual impact on anything, the overhead should be similar.

To verify the above, the demo from the Desktop considerations section can be used, with the dv parameter controlling whether extra idle nodes should be blocks (1, <div>) or inlines (0, <span>). The results from such experiments, again executed on NXP i.MX8M Plus, are presented in the image below:

Comparison of overhead of idle nodes being inline or block elements.

While in the safe range of 0-20 extra idle nodes the results are very much similar, it’s evident that in general, the idle nodes of block type are actually introducing more overhead.

The reason for the above is that, for layout purposes, the handling of inline and block elements is very different. The inline elements sharing the same line can be thought of as being flattened within a so-called line box tree. The block elements, on the other hand, have to be represented in a tree.

To show the above visually, it’s interesting to compare sysprof flamegraphs of WPE WebProcess from the scenarios comprising 20 idle nodes and using either <span> or <div> for idle nodes:

idle <span> nodes:
Sysprof flamegraph of WPE WebProcess layouting inline elements.
idle <div> nodes:
Sysprof flamegraph of WPE WebProcess layouting block elements.

The first flamegraph proves that there’s no clear dependency between the call stack and the number of idle nodes. The second one, on the other hand, shows exactly the opposite — each of the extra idle nodes is visible as adding extra calls. Moreover, each of the extra idle block nodes adds some overhead thus making the flamegraph have a pyramidal shape.

Whitespaces #

Another nuance worth exploring is the overhead of text nodes created because of whitespaces.

When the DOM tree is created from the HTML, usually a lot of text nodes are created just because of whitespaces. It’s because the HTML usually looks like:

<span>
<span>
(...)
</span>
</span>

rather than:

<span><span>(...)</span></span>

which makes sense from the readability point of view. From the performance point of view, however, more text nodes naturally mean more overhead. When such redundant text nodes are combined with idle nodes, the net outcome may be that with each extra idle node, some overhead will be added.
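
The difference between the two snippets can be made concrete by counting the text runs each of them produces. The sketch below uses Python's html.parser, which is only a tokenizer and not a browser DOM builder, so it should be read as an approximation of what a web engine does; the TextNodeCounter class and count_text_nodes helper are illustrative names.

```python
from html.parser import HTMLParser


class TextNodeCounter(HTMLParser):
    """Count text runs between tags, and how many are whitespace-only."""

    def __init__(self):
        super().__init__()
        self.text_nodes = 0
        self.whitespace_only = 0

    def handle_data(self, data):
        # Called once per run of character data found between two tags.
        self.text_nodes += 1
        if data.strip() == "":
            self.whitespace_only += 1


def count_text_nodes(markup):
    """Return (total text runs, whitespace-only text runs) for the markup."""
    counter = TextNodeCounter()
    counter.feed(markup)
    counter.close()
    return counter.text_nodes, counter.whitespace_only


# The two snippets from the text above.
formatted = "<span>\n<span>\n(...)\n</span>\n</span>"
compact = "<span><span>(...)</span></span>"

print("formatted:", count_text_nodes(formatted))
print("compact:  ", count_text_nodes(compact))
```

In the formatted variant, the line breaks around the inner <span> become two extra whitespace-only text runs, whereas the compact variant has only the single text run holding the actual content.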

To verify the above hypothesis, a similar demo can be used alongside the previous one to perform a series of experiments comparing the approaches with and without redundant whitespaces: random-number-changing-in-the-tree-w-whitespaces.html?vr=0&ms=1&dv=0&ns=0. The only difference between the demos is that the w-whitespaces one creates the DOM tree with artificial whitespaces, as if it had been written in a formatted document. The comparison results from the experiments run on NXP i.MX8M Plus are presented in the image below:

Overhead of redundant whitespace nodes.

As the numbers suggest, the overhead of redundant text nodes is rather small on a per-idle-node basis. However, as the number of idle nodes scales, so does the overhead. Around 100 extra idle nodes, the overhead is noticeable already. Therefore, a natural conclusion is that the redundant text nodes should rather be avoided — especially as the number of nodes in the tree becomes significant.

Parents vs siblings #

The last topic that deserves a closer look is whether adding idle nodes as siblings is better than adding them as parent nodes. In theory, having extra nodes added as siblings should be better, as the layout engine will still have to consider them, yet it won’t mark them with a dirty flag and hence it won’t have to lay them out.

As in other cases, the above can be examined using a series of experiments run on NXP i.MX8M Plus using the demo from the Desktop considerations section and comparing against either the random-number-changing-before-siblings.html?vr=0&ms=1&dv=0&ns=0 or the random-number-changing-after-siblings.html?vr=0&ms=1&dv=0&ns=0 demo. As both of those yield similar results, either of them can be used. The results of the comparison are depicted in the image below:

Overhead of idle nodes added as parents vs as siblings.

The experiment results corroborate the theoretical considerations made above — idle nodes added as siblings indeed introduce less layout overhead. The savings are not very large from a single idle node perspective, but once scaled enough, they are beneficial enough to justify DOM tree re-organization (if possible).
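
To visualize the two tree shapes being compared, the Python sketch below generates markup with idle nodes added either as parents or as trailing siblings of the node holding the text, and measures the resulting nesting depth. The markup is a hypothetical approximation: the actual demos may build their trees differently, and the idle_parents, idle_siblings, and max_depth helpers are illustrative names.

```python
from html.parser import HTMLParser


def idle_parents(n, tag="span", text="0"):
    """Wrap the text-holding node in n idle parent nodes."""
    return f"<{tag}>" * n + f'<{tag} id="target">{text}</{tag}>' + f"</{tag}>" * n


def idle_siblings(n, tag="span", text="0"):
    """Place n empty idle nodes after the text-holding node."""
    return f'<{tag} id="target">{text}</{tag}>' + f"<{tag}></{tag}>" * n


class DepthMeter(HTMLParser):
    """Track the deepest element nesting level seen while parsing."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.deepest = 0

    def handle_starttag(self, tag, attrs):
        self.depth += 1
        self.deepest = max(self.deepest, self.depth)

    def handle_endtag(self, tag):
        self.depth -= 1


def max_depth(markup):
    meter = DepthMeter()
    meter.feed(markup)
    meter.close()
    return meter.deepest


print("parents: ", max_depth(idle_parents(20)))
print("siblings:", max_depth(idle_siblings(20)))
```

With 20 idle nodes, the parent variant nests the text node 21 levels deep, while the sibling variant keeps every node at depth 1, which matches the intuition that a change to the text node affects fewer ancestors.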

Conclusions #

The above experiments mostly emphasized the idle nodes; however, the results can be extrapolated to regular nodes in the DOM tree. With that in mind, the overall conclusion to the experiments done in the former sections is that DOM tree size and shape have a measurable impact on web application performance on embedded devices. Therefore, web developers should try to optimize it as early as possible and follow the general rules of thumb that can be derived from this article:

  1. Nodes are not free, so they should always be added with extra care.
  2. Idle nodes should be limited to ~20 on mid-end and ~10 on low-end embedded devices.
  3. Idle nodes should be inline elements, not block ones.
  4. Redundant whitespaces should be avoided — especially with idle nodes.
  5. Nodes (especially idle ones) should be added as siblings.

Although the above serves as great guidance, for better results, it’s recommended to do proper browser benchmarking on a given target embedded device, as long as it’s feasible.

Also, following the above set of rules is not recommended on desktop-class devices, as in that case, it can be considered a premature optimization. Unless the particular web application yields an exceptionally large DOM tree, the gains won’t be worth the time spent optimizing.

September 26, 2025 12:00 AM