Planet Igalia WebKit

July 12, 2024

Georges Stavracas

Profiling a web engine

One topic that interests me endlessly is profiling. I’ve covered this topic many times in this blog, but not enough to risk sounding like a broken record yet. So here we are again!

Not everyone may know this but GNOME has its own browser, Web (a.k.a. Epiphany, or Ephy for the intimates). It’s a fairly old project, descendant of Galeon. It uses the GTK port of WebKit as its web engine.

The recent announcement that WebKit on Linux (both WebKitGTK and WPE WebKit) switched to Skia for rendering brought with it a renewed interest in measuring the performance of WebKit.

And that was only natural; prior to that, WebKit on Linux was using Cairo, which is entirely CPU-based, whereas Skia had both CPU and GPU-based rendering easily available. The CPU renderer mostly matches Cairo in terms of performance and resource usage. Thus one of the big promises of switching to Skia was better hardware utilization and better overall performance by switching to the GPU renderer.

A Note About Cairo

Even though nowadays we often talk about Cairo as a legacy piece of software, there’s no denying that Cairo is really good at what it does. Cairo can and often is extremely fast at 2D rendering on the CPU, specially for small images with simple rendering. Cairo has received optimizations and improvements for this specific use case for almost 20 years, and it is definitely not a low bar to beat.

I think it’s important to keep this in mind because, as tempting as it may sound, simply switching to use GPU rendering doesn’t necessarily imply better performance.

Guesswork is a No-No

Optimizations should always be a byproduct of excellent profiling. Categorically speaking, meaningful optimizations are a consequence of instrumenting the code so much that the bottlenecks become obvious.

I think the most important and practical lesson I’ve learned is: when I’m guessing what are the performance issues of my code, I will be wrong pretty much 100% of the time. The only reliable way to optimize anything is to have hard data about the behavior of the app.

I mean, so many people – myself included – were convinced that GNOME Software was slow due to Flatpak that nobody thought about looking at app icons loading.

Enter the Profiler

Thanks to the fantastic work of Søren Sandmann, Christian Hergert, et al, we have a fantastic modern system profiler: Sysprof.

Sysprof offers a variety of instruments to profile the system. The most basic one uses perf to gather stack traces of the processes that are running. Sysprof also supports time marks, which allow plotting specific events and timings in a timeline. Sysprof also offers extra instrumentation for more specific metrics, such as network usage, graphics, storage, and more.

  • Screenshot of Sysprof's callgraph view
  • Screenshot of Sysprof's flamegraphs view
  • Screenshot of Sysprof's mark chart view
  • Screenshot of Sysprof's waterfall view

All these metrics are super valuable when profiling any app, but they’re particularly useful for profiling WebKit.

One challenging aspect of WebKit is that, well, it’s not exactly a small project. A WebKit build can easily take 30~50min. You need a fairly beefy machine to even be able to build a debug build of WebKit. The debug symbols can take hundreds of megabytes. This makes WebKit particularly challenging to profile.

Another problem is that Sysprof marks require integration code. Apps have to purposefully link against, and use, libsysprof-capture to send these marks to Sysprof.

Integrating with Sysprof

As a first step, Adrian brought the libsysprof-capture code into the WebKit tree. As libsysprof-capture is a static library with minimal dependencies, this was relatively easy. We’re probably going to eventually remove the in-tree copy and switch to host system libsysprof-capture, but having it in-tree was enough to kickstart the whole process.

Originally I started sprinkling Sysprof code all around the WebKit codebase, and to some degree, it worked. But eventually I learned that WebKit has its own macro-based tracing mechanism that is only ever implemented for Apple builds.

Looking at it, it didn’t seem impossible to implement these macros using Sysprof, and that’s what I’ve been doing for the past few weeks. The review was lengthy but behold, WebKit now reports Sysprof marks!

Screenshot of Sysprof with WebKit marks highlighted

Right now these marks cover a variety of JavaScript events, layout and rendering events, and web page resources. This all came for free from integrating with the preexisting tracing mechanism!

This gives us a decent understanding of how the Web process behaves. It’s not yet complete enough, but it’s a good start. I think the most interesting data to me is correlating frame timings across the whole stack, from the kernel driver to the compositor to GTK to WebKit’s UI process to WebKit’s Web process, and back:

Screenshot of Sysprof with lots of compositor and GTK and WebKit marks

But as interesting as it may be, oftentimes the fun part of profiling is being surprised by the data you collect.

For example, in WebKit, one specific, seemingly innocuous, completely bland method is in the top 3 of the callgraph chart:

Screenshot of Sysprof showing the callgraph view with an interesting result highlighted

Why is WebCore::FloatRect::contains so high in the profiling? That’s what I’m investigating right now. Who guessed this specific method would be there? Nobody, as far as I know.

Once this is out in a stable release, anyone will be able to grab a copy of GNOME Web, and run it with Sysprof, and help find out any performance issues that only reproduce in particular combinations of hardware.

Next Plans

To me this already is a game changer for WebKit, but of course we can do more. Besides the rectangular surprise, and one particular slowdown that comes from GTK loading Vulkan on startup, no other big obvious data point popped up. Specially in the marks, I think their coverage is still fairly small compared to what it could have been.

We need more data.

Some ideas that are floating right now:

  • Track individual frames and correlate them with Sysprof marks
  • Measure top-to-bottom-to-top latency
  • Measure input delay
  • Integrate with multimedia frames

Perhaps this will allow us to make WebKit the prime web engine for Linux, with top-tier performance, excellent system integration, and more. Maybe we can even redesign the whole rendering architecture of WebKit on Linux to be more GPU friendly now. I can dream high, can’t I? 🙂

In any case, I think we have a promising and exciting time ahead for WebKit on Linux!

by Georges Stavracas at July 12, 2024 12:42 PM

May 23, 2024

Patrick Griffis

Introducing the WebKit Container SDK

Developing WebKitGTK and WPE has always had challenges such as the amount of dependencies or it’s fairly complex C++ codebase which not all compiler versions handle well. To help with this we’ve made a new SDK to make it easier.

Current Solutions

There have always been multiple ways to build WebKit and its dependencies on your host however this was never a great developer experience. Only very specific hosts could be “supported”, you often had to build a large number of dependencies, and the end result wasn’t very reproducable for others.

The current solution used by default is a Flatpak based one. This was a big improvement for ease of use and excellent for reproducablity but it introduced many challenges doing development work. As it has a strict sandbox and provides read-only runtimes it was difficult to use complex tooling/IDEs or develop third party libraries in it.

The new SDK tries to take a middle ground between those two alternatives, isolating itself from the host to be somewhat reproducable, yet being a mutable environment to be flexible enough for a wide range of tools and workflows.

The WebKit Container SDK

At the core it is an Ubuntu OCI image with all of the dependencies and tooling needed to work on WebKit. On top of this we added some scripts to run/manage these containers with podman and aid in developing inside of the container. It’s intention is to be as simple as possible and not change traditional development workflows.

You can find the SDK and follow the quickstart guide on our GitHub: https://github.com/Igalia/webkit-container-sdk

The main requirements is that this only works on Linux with podman 4.0+ installed. For example Ubuntu 23.10+.

In the most simple case, once you clone https://github.com/Igalia/webkit-container-sdk.git, using the SDK can be a few commands:

source /your/path/to/webkit-container-sdk/register-sdk-on-host.sh
wkdev-create --create-home
wkdev-enter

From there you can use WebKit’s build scripts (./Tools/Scripts/build-webkit --gtk) or CMake. As mentioned before it is an Ubuntu installation so you can easily install your favorite tools directly like VSCode. We even provide a wkdev-setup-vscode script to automate that.

Advanced Usage

Disposibility

A workflow that some developers may not be familiar with is making use of entirely disposable development environments. Since these are isolated containers you can easily make two. This allows you to do work in parallel that would interfere with eachother while not worrying about it as well as being able to get back to a known good state easily:

wkdev-create --name=playground1
wkdev-create --name=playground2

podman rm playground1 # You would stop first if running.
wkdev-enter --name=playground2

Working on Dependencies

An important part of WebKit development is working on the dependencies of WebKit rather than itself, either for debugging or for new features. This can be difficult or error-prone with previous solutions. In order to make this easier we use a project called JHBuild which isn’t new but works well with containers and is a simple solution to work on our core dependencies.

Here is an example workflow working on GLib:

wkdev-create --name=glib
wkdev-enter --name=glib

# This will clone glib main, build, and install it for us. 
jhbuild build glib

# At this point you could simply test if a bug was fixed in a different versin of glib.
# We can also modify and debug glib directly. All of the projects are cloned into ~/checkout.
cd ~/checkout/glib

# Modify the source however you wish then install your new version.
jhbuild make

Remember that containers are isoated from each other so you can even have two terminals open with different builds of glib. This can also be used to test projects like Epiphany against your build of WebKit if you install it into the JHBUILD_PREFIX.

To Be Continued

In the next blog post I’ll document how to use VSCode inside of the SDK for debugging and development.

May 23, 2024 04:00 AM

May 21, 2024

WPE WebKit Blog

Update on the Layer Based SVG Engine (LBSE) in WebKit

This blog entry gives an update on what we at Igalia have done on upstreaming and development of LBSE in WebKit in the last seven months. For an explanation of what LBSE is and how it is related to WPE, see this previous entry as a refresher.

Thanks to generous funding by Wix, which extensively uses SVG in their products and has a broad knowledge of the SVG peculiarities across browser engines, LBSE has made great progress in the past seven months. During this period, several advanced SVG painting features were implemented (e.g. clip-paths, masks, gradients, patterns), along with important performance improvements that expanded the new engine’s capabilities and stability. All this was possible thanks to Wix’s decision to address their problems by funding upstream work at the core of WebKit, instead of accepting the status-quo and implementing case-by-case fixes and workarounds on their end. As a result, WebKit-based browsers now benefit from the results of this fruitful collaboration, which we’ll try to explain in more detail in this blog post.

Project kick off and WebKit Contributors Meeting

In October 2023 we started the project mostly by thinking about the design and roadmap. We also did some general SVG bug fixing. For example, visual overflow computation for SVG renderers was corrected, which fixed quite a few SVG pixel tests. Various visual bugs were also fixed, such as unnecessary repainting when viewBox is used on <svg> elements, and incorrect clipping for outermost <svg> elements

Also in the same month, we attended the WebKit Contributors Meeting, where we presented a talk on the LBSE (slides are available here). The feedback on LBSE at the meeting was very positive. Giving the talk early on in the process actually helped us since we needed to have a good design in place.

Supporting SVG resources

The main focus in October 2023 was introducing the SVG resource concept, as already outlined in the WebKit Contributors Meeting talk. Thus, we started with adding a base SVG resource class: RenderSVGResourceContainer . This class was kept as simple as possible, with no support for resource invalidation or repainting logic. The main task of RenderSVGResourceContainer is to take care of registration so that the resource can be looked up by its clients.

For the first SVG resource to implement, we chose the SVG <clip-path> element, so we landed RenderSVGResourceClipper. To comply with the specification, the RenderSVGResourceClipper implementation produces 1-bit masks and uses a special rendering mode:

  • fill-opacity/stroke-opacity/opacity set to 1
  • masker/filter not applied when rendering the children
  • fill set to solid black and stroke set to none

The initial implementation did not use caching of ImageBuffers for clipping, but relied on Porter-Duff DestinationIn/SourceOver compositing operations to achieve the same effect, but faster. By integrating RenderSVGResourceClipper properly into RenderLayer, it aligned SVG clipping with HTML/CSS clipping.

Finally, the implementation prefers a pure clipping solution internally, as in legacy rendering, but for more complicated clip-paths (for example when the clip-path involves text content), a fallback to a mask is done.

Resource invalidation handling

After introducing the first SVG resource (RenderSVGResourceClipper), we noticed some issues with handling invalidations for it, such as adding to clip-path contents. In the legacy engine, invalidations have been handled through layouting. This caused various problems: for one, it could cause calling the setNeedsLayout method from within layout, which meant the invalidation chain depended on the DOM order.

In November 2023, an implementation landed that avoided using layout for resource invalidation. Instead, on dynamic updates, the style system is used to determine the appropriate action:

  • For non resources, changes that cause renderer geometry changes, like changing the x value of a <rect> element, still require a relayout. For visual changes not affecting geometry, like changing the fill color of a <rect>, a repaint action is enough.
  • For resources, the resource is invalidated/updated with the change and any of its clients are repainted using the new resource.

Support for masks

With improved support for SVG resource invalidation, in late November 2023 we were ready to upstream support for the next SVG resource, RenderSVGResourceMasker.

Like the support for clip-path, RenderSVGResourceMasker started out without caching image buffers and relied on creating temporary image buffers at rendering time. Mask content invalidations/changes were supported out of the box since we had improved resource invalidation handling (see above).

Support for gradients

In early January 2024, support for SVG gradients was upstreamed. Gradients are a kind of SVG resource that is a bit different to the previously implemented clipping paths and masks because it is a paint server, so a helper class for that called SVGPaintServerHandling and a base class RenderSVGResourcePaintServer were introduced. The main difference is in invalidation: paint servers simply need a repaint of all its clients on invalidation, whereas clipping paths/masks may need to do more work; i.e., masks underlying image buffers need to be updated before its clients can be repainted.

Support for patterns and markers

By the end of January 2024, support for SVG patterns was upstreamed. In the first implementation, no image buffer caching was implemented in order to keep things clean and simple. This implementation is different from the legacy implementation because the pattern contents are being rendered through pattern content layers (see RenderLayer::paintSVGResourceLayer). To make this work, RenderSVGResourcePattern has to set up the graphics context matrix correctly before calling paintSVGResourceLayer.

Around that same time, we implemented the next to last SVG resource on our TODO list, namely SVG markers.

SVG filters

In February 2024, Apple started work on supporting SVG filters in LBSE. A first iteration managed to fix a lot of the official SVG filter tests, but it turned out a filter regression had to be fixed first. Moreover, the initial work uncovered issues with the HTML/CSS filters implementation that need to be fixed in general. Finally, another reason why this support takes more time than some other features is that there is a strong requirement to make the support efficient in both memory usage and overall (re)painting speed. Still, the early results are very promising!

Cycle detection

It is quite easy in SVG to cause direct circular references:

<svg>
    <defs>
        <pattern id="p" xlink:href="#p" />
    </defs>
    ...
</svg>

It is also possible to cause indirect circular references:

<svg>
    <defs>
        <mask id="z" />
            <rect mask="url(#z)" />
        </mask>
    </defs>
    <ellipse mask="url(#z)" />
</svg>

The legacy engine solved this in an ad-hoc way in various places in the engine; it tried to break cycles before rendering, but still needed cycle protections in various places, since the solution was never unified or complete.

In February 2024 we provided a unified solution for LBSE by introducing SVGVisitedRendererTracking; see this commit for more. In the new approach, we don’t attempt to remove cycles, but detect them everywhere upon usage and stop processing in well-defined ways, all centralized in SVGVisitedRendererTracking.

Nested mask/pattern slowness

In April 2024, we addressed the slowness problems with nested masks/patterns. As an example, consider this for nested masks:

<svg>
    <defs>
        <mask id="z" />
            <rect id="1" mask="url(#y)" />
            <rect id="2" mask="url(#y)" />
            ...
        </mask>
        <mask id="y" />
            <rect id="1" mask="url(#x)" />
            <rect id="2" mask="url(#x)" />
            ...
        </mask>
        ...
    </defs>
    <ellipse mask="url(#z)" />
</svg>

For this example, the complexity can be increased at will by adding more masks and contents per mask.

The solution was twofold:

  • For masks, we realized bounding box calculations for a mask were not affected by masks used in the mask contents, so we could cut off bounding box calculations for nested masks.
  • For both masks and patterns, we added caching of image buffers per resource client so nested masks/patterns that are already encountered can reuse the image buffer cache.

See optimizations here for nested masks and patterns.

Next steps

For the short and mid-term, the plan is to make LBSE at least as good as legacy in regards to test coverage; i.e., all tests that pass in legacy should pass in LBSE. We have made a lot of progress over the last seven months just because of the amount of SVG resources that were implemented, but for example ,we will need to have SVG filters in place to pass this goal.

Another goal is to make sure LBSE passes all security requirements, as failing that would be a blocker to replacing the current engine. Fortunately, we are already taking this into account in several ways, such as adopting a lot of good smart pointer practices.

Finally, a big goal will be for LBSE to perform well on certain benchmarks like MotionMark, since WebKit has a golden rule to never ship a performance regression. So far there has not been an explicit focus on performance, and we know there are likely optimizations possible in RenderLayer usage, both in reducing the number of RenderLayer objects we create in certain situations as well as a possible reduction in complexity of RenderLayer for LBSE usage.

All in all, we are very pleased with the results and the progress we made in the last seven months. We at Igalia look forward to finishing the work to get the new engine in a shippable state in the near future!

May 21, 2024 12:00 AM

May 20, 2024

Frédéric Wang

Time travel debugging of WebKit with rr

Introduction

rr is a debugging tool for Linux that was originally developed by Mozilla for Firefox. It has long been adopted by Igalia and other web platform developers for Chromium and WebKit too. Back in 2019, there were breakout sessions on this topic at the Web Engines Hackfest and BlinkOn.

For WebKitGTK, the Flatpak SDK provides a copy of rr, but recently I was unable to use the instructions on trac.webkit.org. Fortunately, my colleague Adrián Pérez suggested using a direct build without flatpak or the bubblewrap sandbox, and that indeed solved my problem. I thought it might be interesting to share this information with others, so I decided to write this blog post.

Disclaimer: The build instructions below may be imperfect, will likely become outdated, and are in any case not a replacement for the official ones for WebKitGTK development. Use them at your own risk!

CMake configuration

The approach that worked for me was thus to perform a direct build from my system. I came up with the following configuration step:

cmake -S. -BWebKitBuild/Release \
   -DCMAKE_BUILD_TYPE=Release \
   -DCMAKE_INSTALL_PREFIX=$HOME/WebKit/WebKitBuild/install/Release \
   -GNinja -DPORT=GTK -DENABLE_BUBBLEWRAP_SANDBOX=OFF \
   -DDEVELOPER_MODE=ON -DDEVELOPER_MODE_FATAL_WARNINGS=OFF \
   -DENABLE_TOOLS=ON -DENABLE_LAYOUT_TESTS=ON

where:

  • The -B option specifies the build directory, which is traditionnaly called WebKitBuild/ for the WebKit project.
  • CMAKE_BUILD_TYPE specifies the build type, e.g. optimized release builds (Release, corresponding to --release for the offical script) or debug builds with assertions (Debug, corresponding to --debug) 1.
  • CMAKE_INSTALL_PREFIX specifies the installation directory, which I place inside WebKitBuild/install/ 2.
  • The -G option specifies the build system generator. I used Ninja, which is the default for the offical script too.
  • -DPORT=GTK is for building WebKitGTK. I haven’t tested rr with other Linux ports.
  • -DENABLE_BUBBLEWRAP_SANDBOX=OFF was suggested by Adrián. The bubblewrap sandbox probably does not make sense without flatpak, so it should be safe to disable it anyway.
  • I extracted the other -D flags from the official script, trying to stay as close as possible to what it provides for WebKit development (being able to run layout tests, building the Tools/, ignoring fatal warnings, etc).

Needless to say, the advantage of using flatpak is that it automatically downloads and install all the required dependencies. But if you do your own build, you need to figure out what they are and perform the setup manually. Generally, this is straightforward using your distribution’s package manager, but there can be some tricky exceptions 3.

While we are still at the configuration step, I believe it’s worth sharing two more tricks for WebKit developers:

  • You can use -DENABLE_SANITIZERS=address to produce Asan builds or builds with other sanitizers.
  • You can use -DCMAKE_CXX_FLAGS="-DENABLE_TREE_DEBUGGING" in release builds if you want to get access to the tree debugging functions (ShowRenderTree and the like). This flag is turned on by default for debug builds.

Building and running WebKit

Once the configure step is successful, you can build and install WebKit using the following CMake command 2.

cmake --build WebKitBuild/Release --target install

When that operation completes, you should be able to run MiniBrowser with the following command:

LD_LIBRARY_PATH=WebKitBuild/install/Release/lib ./WebKitBuild/Release/bin/MiniBrowser

For WebKitTestRunner, some extra environment variables are necessary 2:

TEST_RUNNER_INJECTED_BUNDLE_FILENAME=$HOME/WebKit/WebKitBuild/Release/lib/libTestRunnerInjectedBundle.so LD_LIBRARY_PATH=WebKitBuild/install/Release/lib ./WebKitBuild/Release/bin/WebKitTestRunner filename.html

You can also use the official scripts, Tools/Script/run-minibrowser and Tools/Script/run-webkit-tests. They expect some particular paths, but a quick workaround is to use a symbolic link:

ln -s $HOME/WebKit/WebKitBuild $HOME/WebKit/WebKitBuild/GTK

Using rr for WebKit debugging

rr is generally easily installable from your distribution’s package manager. However, as stated on the project wiki page:

Support for the latest hardware and kernel features may require building rr from Github master.

Indeed, using the source has always worked best for me to avoid mysterious execution failures when starting the recording 4.

If you are not familiar with rr, I strongly invite you to take a look at the overview on the project home page or at some of the references I mentioned in the introduction. In any case, the first step is to record a trace by passing the program and arguments to rr. For example, to record a trace for MiniBrowser:

LD_LIBRARY_PATH=WebKitBuild/install/Debug/lib rr ./WebKitBuild/Debug/bin/MiniBrowser https://www.igalia.com/

After the program exits, you can replay the recorded trace as many times as you want. For hard-to-reproduce bugs (e.g. non-deterministic issues or involving a lot of manual steps), that means you only need to be able to record and reproduce the bug once and then can just focus on debugging. You can even turn off your machine after hours of exhausting debugging, then continue the effort later when you have more time and energy! The trace is played in a deterministic way, always using the same timing and pointer addresses. You can use most gdb commands (to run the program, interrupt it, and inspect data), but the real power comes from new commands to perform reverse execution!

Before coming to that, let’s explain how to handle programs with multiple processes, which is the case for WebKit and modern browsers in general. After you recorded a trace, you can display the pids of all recorded processes using the rr ps command. For example, we can see in the following output that the MiniBrowser process (pid 24103) actually forked three child processes, including the Network Process (pid 24113) and the Web Process (24116):

PID     PPID    EXIT    CMD
24103   --      0       ./WebKitBuild/Debug/bin/MiniBrowser https://www.igalia.com/
24113   24103   -9      ./WebKitBuild/Debug/bin/WebKitNetworkProcess 7 12
24115   24103   1       (forked without exec)
24116   24103   -9      ./WebKitBuild/Debug/bin/WebKitWebProcess 15 15

Here is a small debugging session similar to the single-process example from Chromium Chronicle #13 5. We use the option -p 24116 to attach the debugger to the Web Process and -e to start debugging from where it exited:

rr replay -p 24116 -e
(rr) break RenderFlexibleBox::layoutBlock
(rr) rc # Run back to the last layout call
Thread 2 hit Breakpoint 1, WebCore::RenderFlexibleBox::layoutBlock (this=0x7f66699cc400, relayoutChildren=false) at /home/fred/src-obj/WebKit/Source/WebCore/rendering/RenderFlexibleBox.cpp:420
(rr) # Inspect anything you want here. To find the previous Layout call on this object:
(rr) cond 1 this == 0x7f66699cc400
(rr) rc
Thread 2 hit Breakpoint 1, WebCore::RenderFlexibleBox::layoutBlock (this=0x7f66699cc400, relayoutChildren=false) at /home/fred/src-obj/WebKit/Source/WebCore/rendering/RenderFlexibleBox.cpp:420
420     {
(rr) delete 1
(rr) watch -l m_style.m_nonInheritedFlags.effectiveDisplay # Or find the last time the effective display was changed
Thread 4 hit Hardware watchpoint 2: -location m_style.m_nonInheritedFlags.effectiveDisplay

Old value = 16
New value = 0
0x00007f6685234f39 in WebCore::RenderStyle::RenderStyle (this=0x7f66699cc4a8) at /home/fred/src-obj/WebKit/Source/WebCore/rendering/style/RenderStyle.cpp:176
176     RenderStyle::RenderStyle(RenderStyle&&) = default;

rc is an abbreviation for reverse-continue and continues execution backward. Similarly, you can use reverse-next, reverse-step and reverse-finish commands, or their abbreviations. Notice that the watchpoint change is naturally reversed compared to normal execution: the old value (sixteen) is the one after intialization, while the new value (zero) is the one before initialization!

Restarting playback from a known point in time

rr also has a concept of “event” and associates a number to each event it records. They can be obtained by the when command, or printed to the standard output using the -M option. To elaborate a bit more, suppose you add the following printf in RenderFlexibleBox::layoutBlock:

@@ -423,6 +423,8 @@ void RenderFlexibleBox::layoutBlock(bool relayoutChildren, LayoutUnit)
     if (!relayoutChildren && simplifiedLayout())
         return;

+    printf("this=%p\n", this);
+

After building, recording and replaying again, the output should look like this:

$ rr -M replay -p 70285 -e # replay with the new PID of the web process.
...
[rr 70285 57408]this=0x7f742203fa00
[rr 70285 57423]this=0x7f742203fc80
[rr 70285 57425]this=0x7f7422040200
...

Each printed output is now annotated with two numbers in bracket: a PID and an event number. So in order to restart from when an interesting output happened (let’s say [rr 70285 57425]this=0x7f7422040200), you can now execute run 57425 from the debugging session, or equivalently:

rr replay -p 70285 -g 57425

Older traces and parallel debugging

Another interesting thing to know is that traces are stored in ~/.local/share/rr/ and you can always specify an older trace to the rr command e.g. rr ps ~/.local/share/rr/MiniBrowser-0. Be aware that the executable image must not change, but you can use rr pack to be able to run old traces after a rebuild, or even to copy traces to another machine.

To be honest, most the time I’m just using the latest trace. However, one thing I’ve sometimes found useful is what I would call the “parallel debugging” technique. Basically, I’m recording one trace for a testcase that exhibits the bug and another one for a very similar testcase (e.g. with one CSS property difference) that behaves correctly. Then I replay the two traces side by side, comparing them to understand where the issue comes from and what can be done to fix it.

The usage documentation also provides further tips, but this should be enough to get you started with time travel debugging in WebKit!

  1. RelWithDebInfo build type (which yields an optimized release build with debug symbols) might also be interesting to consider in some situations, e.g. debugging bugs that reproduce in release builds but not in debug builds. 

  2. Using an installation directory might not be necessary, but without that, I had trouble making the whole thing work properly (wrong libraries loaded or libraries not found).  2 3

  3. In my case, I chose the easiest path to disable some features, namely -DUSE_JPEGXL=OFF, -DUSE_LIBBACKTRACE=OFF, and -DUSE_GSTREAMER_TRANSCODER=OFF

  4. Incidentally, you are likely to get an error saying that perf_event_paranoid is required to be at most 1, which you can force using sudo sysctl kernel.perf_event_paranoid=1

  5. The equivalent example would probably have been to watch for the previous style change with watch -l m_style, but this was exceeding my hardware watchpoint limit, so I narrowed it down to a smaller observation scope. 

May 20, 2024 10:00 PM

April 16, 2024

Philippe Normand

From WebKit/GStreamer to rust-av, a journey on our stack’s layers

In this post I’ll try to document the journey starting from a WebKit issue and ending up improving third-party projects that WebKitGTK and WPEWebKit depend on.

I’ve been working on WebKit’s GStreamer backends for a while. Usually some new feature needed on WebKit side would trigger work …

by Philippe Normand at April 16, 2024 08:15 PM

April 08, 2024

WPE WebKit Blog

WPE WebKit 2.44 highlights

The WPE team released WPE WebKit 2.44 a few weeks ago. Let’s have a look at the most significant changes to the port for this release cycle.

DisplayLink is a WebCore feature that improves resource utilization and improves synchronization with vertical screen retrace. 2.44 adds an implementation of this feature for the WPE port that improves rendering performance.

Improved hardware-acceleration video decoding and rendering

When WebKit is using GStreamer 1.24 or newer, video playback can use the new support for DRM modifiers in the DMA-BUF sink. This improves video decoding and rendering, as it allows for zero-copy negotiation with the video decoders.

WebCodec API supported

WPE now supports the WebCodecs API, which allows web developers low-level access to video frames and audio chunks, a feature of importance for multimedia applications that need finer grain control over what gets played on the browser.

Other noteworthy changes

  • Support for the JPEG2000 image format has been removed. WebKit was the only major engine still supporting the format, which these days is rarely used. As a consequence, OpenJPEG is no longer a dependency. JPEG2000 should not be confused with JPEG-XL, which is still supported.
  • Support usage of libbacktrace, enabled by default at build time. This library provides quality stacktraces that can help developers and deployers more efficiently debug crashes in WPE-powered browsers.
  • Many memory and stability improvements, particularly on the multimedia backends.

For a complete list of changes, please check the releases page.

April 08, 2024 12:00 AM

March 20, 2024

Jani Hautakangas

Bringing WebKit back to Android.

It’s been quite a while since the last blog post about WPE-Android, but that doesn’t mean WPE-Android hasn’t been in development. The focus has been on stabilizing the runtime and implementing the most crucial core features to make it easier to integrate new APIs into WPEView.

Main building blocks #

WPE-Android has three main building blocks:

  • Cerbero
    • Cross-platform build aggregator that is used to build WPE WebKit and all of its dependencies to Android.
  • WPEBackend-Android
    • Implements Android specific graphics buffer support and buffer sharing between WebKit UIProcess and WebProcess.
  • WPEView
    • Allows displaying web content in activity layout using WPEWebKit.

WPE-Android high-level design

What’s new #

The list of all work completed so far would be quite long, as there have been no official releases or public announcements, with the exception of the last blog post. Since that update, most of the efforts have been focused on ‘under the hood’ improvements, including enhancing stability, adding support for some core WebKit features, and making general improvements to the development infrastructure.

Here is a list of new features worth mentioning:

  • Based on WPE WebKit 2.42.1, the project was branched for the 2.42.x series. Future work will continue in the main branch for the next release.
  • Dropped support for 32-bit platforms (x86 and armv7). Only arm64-v8a and x86_64 are supported
  • Integration to Android main loop so that WPE WebKit GLib main loop is driven by Android main loop
  • Process-Swap On Navigation aka PSON
  • Added ASharedMemory support to WebKit SharedMemory
  • Hardware-accelerated multimedia playback
  • Fullscreen support
  • Cookies management
  • ANGLE based WebGL
  • Cross process fence insertion for composition synchronization with Surface Flinger
  • WebDriver support
  • GitHub Actions build bots
  • GitHub Actions WebDriver test bots

Demos #

WPEWebkit powered web view (WPEView) #

Demo uses WPE-Android MiniBrowser sample application to show basic web page loading and touch-based scrolling, usage of a simple cookie manager to clear the page’s date usage, and finally loads the popular “Aquarium sample” to show a smooth (60FPS) WebGL animation running thanks to HW acceleration support.

WebDriver #

Demo shows how to run WebDriver test with with emulator. Detailed instructions how to run WebDriver test can be found in README.md

Test requests a website through the Selenium remote webdriver. It then replaces the default behavior of window.alert on the requested page by injecting and executing a JavaScript snippet. After loading the page, it performs a click() action on the element that calls the alert. This results in the text ‘cheese’ being displayed right below the ‘click me’ link.

Test contains three building blocks:

  • WPE WebDriver running on emulator
  • HTTP server serving test.html web page
  • Selenium python test script executing the test

What’s next #

We have ambitious plans for the coming months regarding the development of WPE Android, focusing mainly on implementing additional features, stabilization, and performance improvements.

As for implementing new features, now that the integration with the WPE WebKit runtime has reached a more robust state, it’s time to start adding more of the APIs that are still missing in WPEView compared to other webviews on Android and to enable other web-facing features supported by WebKit. This effort, along with adding support for features like HTTP/2 and the remote Web Inspector, will be a major focus.

As for stabilization and performance, having WebDriver support will be very helpful as it will enable us to identify and fix issues promptly and thus help make WPE Android more stable and feature-complete. We would also like to focus on conformance testing compared to other web views on Android, which should help us prioritize our efforts.

The broader goal for this year is to develop WPE-Android into an easy-to-use platform for third-party projects, offering a compelling alternative to other webviews on the Android platform. We extend many thanks to the NLNet Foundation, whose support for WPE Android through a grant from the NGI Zero Core fund will be instrumental in helping us achieve this goal!

Try it yourself #

WPE-Android is still considered a prototype, and it’s a bit difficult to build. However, if you want to try it out, you can follow the instructions in the README.md file. Additionally, you can use the project’s issue tracker to report problems or suggest new features.

March 20, 2024 12:00 AM