Planet Igalia

February 22, 2021

Eric Meyer

Week One

The first week on the job at Igalia was… it was good, y’all.  Upon formally joining the Support Team, I got myself oriented, built a series of tests-slash-demos that will be making their way into some forthcoming posts and videos, and forked a copy of the Mozilla Developer Network (MDN) so I can start making edits and pushing them to the public site.  In fact, the first of those edits landed Sunday night!  And there was the usual setting up of accounts and figuring out of internal processes and all that stuff.

A series of tests of the CSS logical property ‘border-block’, illustrating its uses.

To be perfectly honest, a lot of my first-week momentum was provided by the rest of the Support Team, and by the expectations set during the interview process.  You see, at one point in the past I had a position like this, and I had problems meeting expectations.  This was partly due to my inexperience working in that sort of setting, but also partly due to a lack of clear communication about expectations.  Which I know because I thought I was doing well at meeting them, and was then told otherwise in evaluations.

So when I was first talking with the folks at Igalia, I shared that experience.  Even though I knew Igalia has a different approach to management and evaluation, I told them repeatedly, “If I take this job, I want you to point me in a direction.”  They’ve done exactly that, and it’s been great.  Special thanks to Brian Kardell in this regard.

I’m already looking forward to what we’re going to do with the demos I built and am still refining, and to making more MDN edits, including some upgrades to code examples.  And I’ll have more to say about MDN editing soon.  Stay tuned!

by Eric Meyer at February 22, 2021 09:32 PM

February 15, 2021

Eric Meyer

First Day at Igalia

Today is my first day as a full-time employee at Igalia, where I’ll be doing a whole lot of things I love to do: document and explain web standards at MDN and other places, participate in standards work at the W3C, take on some webmaster duties, and play a part in planning Igalia’s strategy with respect to advancing the web.  And likely other things!

I’ll be honest, this is a pretty big change for me.  I haven’t worked for anyone other than myself since 2003.  But the last time I did work for someone else, it was for Netscape (slash AOL slash Time Warner) as a Standards Evangelist, a role I very much enjoyed.  In many ways, I’m taking that role back up at Igalia, in a company whose values and structure are much more in line with my own.  I’m really looking forward to finding out what we can do together.

If the name Igalia doesn’t ring any bells, don’t worry: nobody outside the field has heard of them, and most people inside the field haven’t either.  So, remember when CSS Grid came to browsers back in 2017?  Igalia did the implementation that landed in Safari and Chromium.  They’ve done a lot of other things besides that—some of which I’ll be helping to spread the word about—but it’s the thing that web folks will be most likely to recognize.

This being my first day and all, I’m still deep in the setting up of logins and filling out of forms and general orienting of oneself to a new team and set of opportunities to make a positive difference, so there isn’t much more to say besides I’m stoked and planning to say more a little further down the road.  For now, onward!

by Eric Meyer at February 15, 2021 05:23 PM

February 13, 2021

Eleni Maria Stea

About VK_EXT_sample_locations

More than a year ago, I had worked on the implementation of VK_EXT_sample_locations extension for anv, the Intel Vulkan driver of mesa3D, as part of my work for Igalia. The implementation had been reviewed (see acknowledgments) at the time, but as the conformance tests that were available back then had to be improved and that … Continue reading About VK_EXT_sample_locations

by hikiko at February 13, 2021 09:47 PM

February 09, 2021

Andrés Gómez

Replaying 3D traces with piglit


If you don’t know what traces-based rendering regression testing is, read the appendix before continuing.


The Mesa community has witnessed an explosion of interest in Continuous Integration in the last two years.

In addition to checking that the project builds properly, integrating the testing of its functional correctness has become a priority. User space graphics drivers are exercised by a wide variety of tests and test suites. One of these kinds is traces-based rendering regression testing.

The public effort to add this kind of test to Mesa’s CI started with this mail from Alexandros Frantzis.

At some point, we had support for replaying OpenGL, Vulkan and D3D11 traces using apitrace, RenderDoc and GFXReconstruct with the in-tree tool tracie. However, it was a very custom solution tailored to the needs of Mesa, so I proposed to move this codebase and integrate it into the piglit test suite. It was a natural step forward.

This is how replayer was born into piglit.

replayer

The first step to test a trace is, actually, obtaining a trace. I won’t go into the details of how to create one from scratch; the process is well documented for each of the tools listed above. However, the Mesa community has been collecting publicly distributable traces for a while and placing them in traces-db, whose CI copies them to Freedesktop.org’s MinIO instance.

To make things simple, once we have built and installed piglit, if we would like to test an OpenGL trace created with apitrace, we can download it from there with:

$ replayer.py download \
 	 --download-url https://minio-packet.freedesktop.org/mesa-tracie-public/ \
 	 --db-path ./traces-db \
 	 --force-download \
 	 glxgears/glxgears-2.trace

The parameters are self-explanatory. The downloaded trace will now exist at ./traces-db/glxgears/glxgears-2.trace.

The next step will be to dump an image from the trace. Since it is a .trace file, we will need to have apitrace installed on the system. If we do not specify the call(s) from which to dump the image(s), we will just get the last frame of the trace:

$ replayer.py dump ./traces-db/glxgears/glxgears-2.trace

The dumped PNG image will be at ./results/glxgears-2.trace-0000001413.png. Notice that the number suffix is the snapshot id from the trace.
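
If we need an image from a different point in the trace, we can name the call(s) to dump instead of relying on the default last frame. A minimal sketch, assuming the dump subcommand accepts a --calls option (check replayer.py dump --help for the actual spelling):

$ replayer.py dump --calls 1384 ./traces-db/glxgears/glxgears-2.trace

This should produce a PNG named after that call number, e.g. ./results/glxgears-2.trace-0000001384.png.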

Dumping from a trace may produce a range of different possible images. One example is when the trace makes use of uninitialized values, leading to undefined behavior.

However, since the original aim was performing pre-merge rendering regression testing in Mesa’s CI, the idea is that replaying any of the provided traces should be quick and the dumped image should be consistent. In other words, if we dump the same frame of a trace several times with the same GFX stack, the image will always be the same.

With this precondition, we can test whether 2 different images are the same just by hashing their contents. replayer can obtain the hash for the generated dumped image:

$ replayer.py checksum ./results/glxgears-2.trace-0000001413.png 
f8eba0fec6e3e0af9cb09844bc73bdc8

Now, if we build a different commit of Mesa, we can check the image generated at this new point against the previously generated reference image. If everything goes well, we will see something like:

$ replayer.py compare trace \
 	 --download-url https://minio-packet.freedesktop.org/mesa-tracie-public/ \
 	 --device-name gl-vmware-llvmpipe \
 	 --db-path ./traces-db \
 	 --keep-image \
 	 glxgears/glxgears-2.trace f8eba0fec6e3e0af9cb09844bc73bdc8
[dump_trace_images] Info: Dumping trace ./traces-db/glxgears/glxgears-2.trace...
[dump_trace_images] Running: apitrace dump --calls=frame ./traces-db/glxgears/glxgears-2.trace
// process.name = "/usr/bin/glxgears"
1384 glXSwapBuffers(dpy = 0x56060e921f80, drawable = 31457282)

1413 glXSwapBuffers(dpy = 0x56060e921f80, drawable = 31457282)

error: drawable failed to resize: expected 1515x843, got 300x300
[dump_trace_images] Running: eglretrace --headless --snapshot=1413 --snapshot-prefix=./results/trace/gl-vmware-llvmpipe/glxgears/glxgears-2.trace- ./blog-traces-db/glxgears/glxgears-2.trace
Wrote ./results/trace/gl-vmware-llvmpipe/glxgears/glxgears-2.trace-0000001413.png

OK
[check_image]
    actual: f8eba0fec6e3e0af9cb09844bc73bdc8
  expected: f8eba0fec6e3e0af9cb09844bc73bdc8
[check_image] Images match for:
  glxgears/glxgears-2.trace

PIGLIT: {"images": [{"image_desc": "glxgears/glxgears-2.trace", "image_ref": "f8eba0fec6e3e0af9cb09844bc73bdc8.png", "image_render": "./results/trace/gl-vmware-llvmpipe/glxgears/glxgears-2.trace-0000001413-f8eba0fec6e3e0af9cb09844bc73bdc8.png"}], "result": "pass"}

replayer’s compare subcommand is the one that emits the piglit-formatted test output.

Putting everything together

We can make the whole process much simpler by passing replayer a YAML test list file. For example:

$ cat testing-traces.yml
traces-db:
  download-url: https://minio-packet.freedesktop.org/mesa-tracie-public/

traces:
  - path: gputest/triangle.trace
    expectations:
      - device: gl-vmware-llvmpipe
        checksum: c8848dec77ee0c55292417f54c0a1a49
  - path: glxgears/glxgears-2.trace
    expectations:
      - device: gl-vmware-llvmpipe
        checksum: f53ac20e17da91c0359c31f2fa3f401e
$ replayer.py compare yaml \
 	 --device-name gl-vmware-llvmpipe \
 	 --yaml-file testing-traces.yml 
[check_image] Downloading file gputest/triangle.trace took 5s.
[dump_trace_images] Info: Dumping trace ./replayer-db/gputest/triangle.trace...
[dump_trace_images] Running: apitrace dump --calls=frame ./replayer-db/gputest/triangle.trace
// process.name = "/home/anholt/GpuTest_Linux_x64_0.7.0/GpuTest"
397 glXSwapBuffers(dpy = 0x7f0ad0005a90, drawable = 56623106)

510 glXSwapBuffers(dpy = 0x7f0ad0005a90, drawable = 56623106)


/home/anholt/GpuTest_Linux_x64_0.7.0/GpuTest
[dump_trace_images] Running: eglretrace --headless --snapshot=510 --snapshot-prefix=./results/trace/gl-vmware-llvmpipe/gputest/triangle.trace- ./replayer-db/gputest/triangle.trace
Wrote ./results/trace/gl-vmware-llvmpipe/gputest/triangle.trace-0000000510.png

OK
[check_image]
    actual: c8848dec77ee0c55292417f54c0a1a49
  expected: c8848dec77ee0c55292417f54c0a1a49
[check_image] Images match for:
  gputest/triangle.trace

[check_image] Downloading file glxgears/glxgears-2.trace took 5s.
[dump_trace_images] Info: Dumping trace ./replayer-db/glxgears/glxgears-2.trace...
[dump_trace_images] Running: apitrace dump --calls=frame ./replayer-db/glxgears/glxgears-2.trace
// process.name = "/usr/bin/glxgears"
1384 glXSwapBuffers(dpy = 0x56060e921f80, drawable = 31457282)

1413 glXSwapBuffers(dpy = 0x56060e921f80, drawable = 31457282)


/usr/bin/glxgears
error: drawable failed to resize: expected 1515x843, got 300x300
[dump_trace_images] Running: eglretrace --headless --snapshot=1413 --snapshot-prefix=./results/trace/gl-vmware-llvmpipe/glxgears/glxgears-2.trace- ./replayer-db/glxgears/glxgears-2.trace
Wrote ./results/trace/gl-vmware-llvmpipe/glxgears/glxgears-2.trace-0000001413.png

OK
[check_image]
    actual: f8eba0fec6e3e0af9cb09844bc73bdc8
  expected: f8eba0fec6e3e0af9cb09844bc73bdc8
[check_image] Images match for:
  glxgears/glxgears-2.trace

replayer also features the query subcommand, which is just a helper to read the YAML files with the test configuration.

Testing the other kinds of supported 3D traces doesn’t change much from what’s shown here. Just make sure to have the needed tools installed: RenderDoc, GFXReconstruct, the VK_LAYER_LUNARG_screenshot layer, Wine and DXVK. A good reference for building, installing and configuring these tools is Mesa’s GL and VK test container build scripts.

replayer also accepts several configuration options to tweak its behavior and to tell it where to find the actual tracing tools needed for replaying the different types of traces. Make sure to check the replay section in piglit’s configuration example file.
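
For illustration only, that section of piglit.conf could look roughly like the sketch below; the key names here are hypothetical and only meant to show the idea, so take the real ones from piglit.conf.example:

[replay]
; Hypothetical key names, for illustration only -- check piglit.conf.example
; for the actual options recognized by the replay profile.
apitrace_bin=/usr/local/bin/apitrace
eglretrace_bin=/usr/local/bin/eglretrace
renderdoc_bin=/usr/local/bin/renderdoccmd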

replayer’s README.md file is also a good read for further information.

piglit

replayer is a test runner in a similar fashion to shader_runner or glslparsertest. What we are missing now is how it integrates into piglit, so we can do piglit runs which produce piglit-formatted results.

This is done through the replay test profile.

This profile needs a couple of configuration values. The easiest way is to just set the PIGLIT_REPLAY_DESCRIPTION_FILE and PIGLIT_REPLAY_DEVICE_NAME env variables. They are self-explanatory, but make sure to check the documentation for this and other configuration options for this profile.

The following example features a run similar to the one done above invoking replayer directly, but with piglit integration, providing formatted results:

$ PIGLIT_REPLAY_DESCRIPTION_FILE=testing-traces.yml PIGLIT_REPLAY_DEVICE_NAME=gl-vmware-llvmpipe piglit run replay -n replay-example replay-results
[2/2] pass: 2   
Thank you for running Piglit!
Results have been written to replay-results

We can create a summary based on the results:

# piglit summary console replay-results/
trace/gl-vmware-llvmpipe/glxgears/glxgears-2.trace: pass
trace/gl-vmware-llvmpipe/gputest/triangle.trace: pass
summary:
       name: replay-example
       ----  --------------
       pass:              2
       fail:              0
      crash:              0
       skip:              0
    timeout:              0
       warn:              0
 incomplete:              0
 dmesg-warn:              0
 dmesg-fail:              0
    changes:              0
      fixes:              0
regressions:              0
      total:              2
       time:       00:00:00

Creating an HTML summary may also be interesting, especially when finding failures!
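
For instance, something along these lines should generate a browsable report (check piglit summary --help for the exact options):

$ piglit summary html ./replay-summary replay-results/

Opening ./replay-summary/index.html in a browser then shows the per-test results; for failing traces it includes the reference and rendered images side by side.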

Wishlist

  • Through different backends, replayer supports running apitrace, RenderDoc and GFXReconstruct traces. We may want to support other tracing tools in the future. The dummy backend used for functional testing is a good starting point when writing a new backend.
  • The solution chosen for checking whether we detect a rendering regression depends on having consistent results, as said before. It’d be great if we could add a secondary testing method for when the expected rendered image is variable. Off the top of my head, using exclusion masks could be a quick single-run solution when we know which specific areas in a rendered scenario are the ones fluctuating. For more complex variations, a multi-run based solution seems to be the best option. EzBench has a great statistical approach for this!
  • The current syntax of the YAML test list files implies running the compare subcommand with the default behavior of checking against the last frame of the tested trace. This means first figuring out which call number corresponds to the last frame. It would be great to support providing the call numbers directly in the YAML files, to be able to test more than just the last frame and, additionally, cut down the time taken to run the test.
  • The generated HTML summary allows us to see the reference and generated images side by side when a test fails. It’d be great to also have some easy way of checking their differences. Using Rembrandt.js could be a possible solution.

Thanks a lot to the whole Mesa community for helping with the creation of this tool. Alexandros Frantzis, Rohan Garg and Tomeu Vizoso did a lot of the initial development of the in-tree tracie tool, and Dylan Baker was very patient reviewing my patches for the piglit integration.

Finally, thanks to Igalia for allowing me to work on this.


Appendix

In 3D computer graphics we say “traces”, for short, to refer to the files generated by 3D API capturing tools, which store not only the calls to the specific 3D API but also the internal state of the 3D program during the capturing process: shaders, textures, buffers, etc.

Being able to “record” the execution of a 3D program is very useful. Usually, it allows us to replay the execution without needing the original program from which we generated the trace, it also allows in-depth analysis for debugging and performance optimization, it’s a very good solution for sharing with other developers, and, in some cases, it allows us to check how the replay behaves with different GPUs.

In this post, however, I focus on a specific usage: rendering regression testing.

When doing a regression test, we compare a specific metric obtained by replaying the trace with a specific version of the GFX software stack against the same metric obtained from a different version of the GFX stack. If the value of the metric changes, we have found a regression (or an improvement!).

To make things simpler, we would like to check changes happening just in one of the many elements of the software stack. The most relevant component is the user space driver. In particular, I care about the Mesa drivers and the GNU/Linux stack.

Mainly, there are two kinds of regression testing we can do with a trace: performance or rendering regression testing. When doing a performance one, the checked metric(s) are usually in terms of speed or memory usage. In the case of the rendering ones, we compare the rendered output at one (or many) points during the trace replay. This output, a bitmap image, is the metric that we compare between two different versions of the Mesa driver. If the images differ, we may have found a regression (artifacts, improper colors, etc.), or an enhancement, if the reference image is the one featuring any of these problems.

by tanty at February 09, 2021 08:47 AM

February 03, 2021

Eleni Maria Stea

Build WebKit/WPE on Linux/X11

There’s a lot of documentation online about building WebKit/WPE on Linux. But as most instructions target embedded platform developers, the focus is on building WebKit with Wayland using the flatpak-sdk to automate and speed up the building process. As the steps I’ve followed to build it on my X11 system and run the WebKit/WPE … Continue reading Build WebKit/WPE on Linux/X11

by hikiko at February 03, 2021 09:41 PM

Brian Kardell

TAG 2021


The W3C is in the middle of a big, and arguably very difficult, election for the W3C Technical Architecture Group (aka TAG). The TAG is one of two small bodies within the W3C which are elected by the membership. If you're unfamiliar with this, I wrote this exceptionally brief Primer for Busy People. I'd like to tell you why I think this is a big election, why it is complex, what I (personally) think about it, and what you can do if you agree.

The current W3C TAG election is both important and complex for several reasons. It's big because 4 of the 6 elected seats are up for election and two exceptional members (Alice Boxhall from Google and David Baron from Mozilla) are unable to run for reelection. It's big because there are nine candidates and each brings different things to the table (and statements don't always capture enough). It's complex because of the voting system and participation.

Let me share thoughts on what I think would be a good result, and then I'll explain why that's hard to achieve and what I think we need to avoid.

A good result...

I believe the best result involves 3 candidates for sure (listed alphabetically): Lea Verou, Sangwhan Moon and Theresa O’Connor.

Let's start with the incumbents. I fully support re-electing both Theresa O’Connor and Sangwhan Moon and cannot imagine reasons not to. Both have a history in standards, have served well on TAG, have a diversity of knowledge, and are very reasonable and able to work well. Re-electing some good incumbents with these qualities is a practical advantage to the TAG as well as they are already well immersed. These are easy choices.

Lea Verou is another easy choice for me. Lea brings a really diverse background, set of perspectives and skills to the table. She's worked for the W3C, she's a great communicator to developers (this is definitely a great skill in TAG whose outreach is important), she's worked with small teams, produced a number of popular libraries and helped drive some interesting standards. The OpenJS Foundation was pleased to nominate her, but Frontiers and several others were also supportive. Lea also deserves "high marks".

These 3 are also a happily diverse group.

This leaves one seat. There are 3 other candidates who I think would be good, for different reasons: Martin Thompson, Amy Guy and Jeffrey Yaskin. Each of them will bring something different to the table and if I am really honest, it is a hard choice. I wish we could seat all 3 of them, but we can't. At least in this election (that is, they can run again).

For brevity, I will not even attempt to make all the cases here, but I encourage you to read their statements and ask friends. Truth be told, I have a strong sense that "any mix of these 6 could be fine" and different mixes optimize for slightly different things. Also, as I will explain, there are some things that seem slightly more important to me than who I recommend is third best vs fourth or fifth...

TLDR; Turn out the vote

If you find yourself in agreement with me, more or less, I would suggest placing the 3 I mentioned above (Lea, Tess, Sangwhan) in at least your top 4 places, then picking a fourth from my other list and putting them in whatever order you like.

Tess, Lea and Sangwhan

I think there are many possible great slates for TAG in 2021, but they all involve Lea, Tess and Sangwhan. Please help support them and place them among your top 4 votes.

If you're a W3C member, your AC Representative votes for you -- tell them. Make sure they vote - the best result definitely depends on higher than average turnout. Surprisingly, about 75% of the membership doesn't normally vote. These elections are among the rare times when there are "votes" in which all members have an equal voice: a tiny 1-person member org has exactly the same voting power as every mega company.

If you're not a W3C member, you don't have a way to vote directly but you can publicly state your support and tag in member orgs or reach out to people you know who work for member orgs. Historically this has definitely helped - let's keep W3C working well!


STV, Turnout and Negative Preference

The W3C's election system uses STV. STV stands for "single transferable vote". Single is the operative word: while you express 9 preferences in this election, only one of those will actually be counted to help someone win a seat. The counting system is (I think) rather complex, but the higher up on your list someone appears, the more likely it is to be the vote that counts. Each vote that is counted counts as 1 vote in that candidate's favor.

Let me stress why this matters: STV optimizes for choosing diversity of opinion with demonstrably critical support. A passionate group supporting an 'issue' candidate will all place their candidate as the #1 choice - those are guaranteed to be counted.

Historically only about 100 of the W3C's 438 member organizations actually turn out in a good election. Let's imagine turnout is even lower this time and it's only 70. This means that if a candidate reaches 18 votes (a little over 4% of membership) they have a seat, no matter how the rest of us vote - even if everyone else had an actively negative preference for them.

Non-participation is an issue for all voting systems, but it seems like STV can really amplify outcomes which are undesirable here. The only solution to this problem is to increase turnout. Increasing turnout raises the quota bar and helps ensure that this doesn't happen.

Regardless of how you feel about any of the candidates, please help turn out the vote. The W3C works best when as many members vote as possible!

February 03, 2021 05:00 PM

Alejandro Piñeiro

v3dv status update 2021-02-03

So some months have passed since our last update, when we announced that v3dv became Vulkan 1.0 conformant. The main reason for not publishing more posts is that we saw the 1.0 checkpoint as a good moment to hold off on adding big new features, and focus on improving the codebase (refactors, clean-ups, etc.) and the already existing features. For the latter we did a lot of work on performance. That alone would deserve a specific blog post, so in this one I will summarize the other stuff we did.

New features

Even if we didn’t focus on adding new features, we were still able to add some:

  • The following optional 1.0 features were enabled: logicOp, alphaToOne, independentBlend, drawIndirectFirstInstance, and shaderStorageImageExtendedFormats.
  • Added support for timestamp queries.
  • Added implementations of the VK_KHR_maintenance1, VK_EXT_private_data, and VK_KHR_display extensions.
  • Added support for Wayland WSI.

Here I would like to highlight that we started to get feature contributions from outside the initial core of developers that created the driver. VK_KHR_display was submitted by Steven Houston, and Wayland WSI support was submitted by Ella-0. Thanks a lot for it, really appreciated! We hope this will begin a trend of having more things implemented by the rpi/mesa community as a whole.

Bugfixing and vulkan tools

Even though the driver is now conformant, we were still testing it with several demos and applications, and provided fixes. As an example, we got Sascha Willems’ oit (Order Independent Transparency) demo working:

Sascha Willems’ oit demo on the rpi4

Among the applications that we were testing, we can highlight RenderDoc and GFXReconstruct. The former is a frame-capture based graphics debugger and the latter is a tool that allows capturing and replaying several frames. Both tools are heavily used when debugging and testing Vulkan applications. We tested that they work on the rpi4 (fixing some bugs while doing it), and also started to use them to help/guide the performance work we are doing.

Fosdem 2021

If you are interested in an overview of the development of the driver during the last year, we are going to present “Overview of the Open Source Vulkan Driver for Raspberry Pi 4” at FOSDEM this weekend (presentation details here).

Previous updates

Just in case you missed any of the updates of the vulkan driver so far:

Vulkan raspberry pi first triangle
Vulkan update now with added source code
v3dv status update 2020-07-01
V3DV Vulkan driver update: VkQuake1-3 now working
v3dv status update 2020-07-31
v3dv status update 2020-09-07
Vulkan update: we’re conformant!

by infapi00 at February 03, 2021 11:13 AM

January 27, 2021

Manuel Rego

:focus-visible in WebKit - January 2021

Let’s do a small introduction as this is a kind of special post.

As you might already know, last summer Igalia launched the Open Prioritization experiment, and :focus-visible in WebKit was the winner according to the pledges that the different projects got. It has now moved into the funds collection stage; so far we’ve reached 80% of the goal and Igalia has already started to work on this. If you are interested and want to help sponsor this work, please visit the project page at Open Collective.

In our regular client projects at Igalia, we provide periodic progress reports about the status of tasks and next plans. This blog post is a kind of monthly report, but this time the project has many customers, so this looks like a better format to share information about the status of things. Thank you all for supporting us in this development! 🙏

Understanding :focus-visible

Disclaimer: This is not a blog post explaining how :focus-visible works or the implications it has; you can read other articles if you’re looking for that.

First things first, my initial thought was that :focus-visible was a pseudo-class which would match an element when the browser natively shows a focus indicator (focus ring, outline) when an element of a page is focused. And that’s more or less what the spec says in its first sentence:

The :focus-visible pseudo-class applies while an element matches the :focus pseudo-class and the user agent determines via heuristics that the focus should be made evident on the element.

The key part here is that the native behavior doesn’t show the focus indicator on purpose in some situations when the :focus pseudo-class matches, mainly because usability studies indicate that showing it in all cases is not what the user expects and wants. Before :focus-visible, web authors had no way to apply the same criteria and style the focus indicator only when it’s going to be shown natively, while still keeping the website accessible.

Apart from that, the spec has a set of heuristics that, despite being non-normative, all implementations seem to be following. Summarizing them briefly, they’d be something like:

  • If you use the mouse to focus an element it won’t match :focus-visible.
  • If you use the keyboard it’ll match :focus-visible.
  • Elements that support keyboard input (like <input> or contenteditable) always match :focus-visible.
  • When a script focuses a new element, whether it matches :focus-visible depends on the previous active element.

This is just a quick & dirty summary, please read the spec for all the details. There have been years of research around these topics (how focus should work or not on the different use cases, what are the users and accessibility needs, how websites are managing focus, etc.) and these heuristics are somehow the result of all that work.

:focus-visible in the default UA style sheet

At this point it looks like we can more or less understand what :focus-visible is about. So let’s start playing with it. The definition seems very clear, but testing things in the current implementations (Chromium and Firefox) you might find some unexpected situations.

Let’s use a very basic example:

<style>
  :focus-visible { background: lime; }
</style>
<div tabindex="0">Focus me.</div>

If you focus the <div> with a mouse click, :focus-visible doesn’t match per spec, so in this case the background doesn’t become green (if you use the keyboard to focus it, it will match :focus-visible and the background will be green).  This works the same in Chromium and Firefox, but Chromium (despite the element not matching :focus-visible) shows a focus indicator.  Somehow the first spec definition is already not working as expected in Chromium… The issue here is that Chromium still uses :focus { outline: auto; } in the default UA style sheet, and the element matches :focus after the mouse click, which is why it’s showing a focus indicator while not matching :focus-visible.

Actually this was already in the spec, but Chromium is not following it yet:

User agents should also use :focus-visible to specify the default focus style, so that authors using :focus-visible will not also need to disable the default :focus style.

There was already a related CSSWG issue on the topic, as the spec currently suggests the following code:

:focus:not(:focus-visible) {
  outline: 0;
}

This works as a kind of workaround for this issue, but if the default UA style sheet uses :focus-visible that won’t be needed.
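
As a rough sketch of the idea (not the literal style sheet of any engine), a UA default based on :focus-visible would look something like this, showing the indicator only when the heuristics decide it should be visible:

:focus-visible {
  outline: auto;
}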

Anyway, I’ve reported the Chromium bug and created WPT tests; during the tests review Emilio Cobos realized that this needed a change in the HTML spec and he wrote a PR to update it. After some discussion with Alice Boxhall, the HTML change and the tests were approved and merged. I was even brave enough to write a patch to change this in Chromium, which is still under review and needs some extra work.

The tests

WebKit is the third browser engine adding support for :focus-visible, so the first natural step was to import the WPT tests related to the feature. This looked like something simple, but it ended up needing some work to improve the tests.

There was a bunch of :focus-visible tests already in the WPT suite, but they needed some love:

  • Some parts of the spec were not covered by tests, so I added some new tests.
  • Some tests were passing in WebKit even though there was no implementation yet, so I modified them to fail if there’s no :focus-visible support.

Then I imported the tests into WebKit and discovered a bug related to the focus event and the :focus pseudo-class: :focus was not matching inside the focus event handler. This is probably not important for web authors, but the :focus-visible tests were relying on it. Actually this had been fixed in Chromium more than 5 years ago, so first I moved the Chromium internal test to WPT and used it to fix the problem in WebKit.

Once the tests were imported in WebKit, the problem was that a bunch of them were timing out on Mac platforms. After investigating the issue, I realized that it’s because the focus event is not dispatched when you click on a button on Mac. Digging deeper I found this old bug with lots of discussion on the topic; it looks like this is done to keep alignment with the native platform behavior, and also in order to avoid showing a focus indicator. Even Firefox has the same behavior on Mac. However, Chromium always dispatches the event independently of the platform. This means that some of the tests don’t work automatically on Mac, as they wait for a focus event that is never dispatched. Maybe once :focus-visible is implemented, the possibility of modifying this behavior could be rediscussed, though it might not be possible anyway. In any case, the WebKitGTK port, the one I’m using for the development, does trigger the focus event in this case; and I’ve also changed the WPE port to do the same (maybe the Windows port will follow too).

One more thing about the tests: lots of these :focus-visible tests use testdriver.js to simulate user actions. For example, for clicking an element they use test_driver.click(element); however, that simple instruction was causing some kind of memory leak in Chromium when running the tests. The actual Chromium bug hasn’t been fixed yet, but I landed some workarounds that prevent the issue in these tests (waiting for the promise to be resolved before marking the test as done).
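
The workaround looks roughly like this (a sketch, not the literal test code; element here stands for whatever node the test is exercising):

promise_test(async () => {
  // Awaiting the promise returned by testdriver.js, instead of firing the
  // click and immediately finishing the test, is what avoids the leak.
  await test_driver.click(element);
  // ...assertions about :focus / :focus-visible would go here...
}, 'focus-visible testdriver sketch');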

Status of WPT :focus-visible tests in the different browsers

To close the tests part, you can check the status in wpt.fyi; most of them are passing in all implementations, which is great, but there are some interoperability issues that we’ll review next.

Interop issues

As I mentioned the wpt.fyi website helps to easily identify the interop issues between the different implementations.

  • :focus-visible on the default UA style sheet: This has already been commented on above, but it is the reason why Chromium fails the focus-visible-018.html test. Firefox fails focus-visible-017.html because the default UA style sheet mentions outline: auto, but Firefox uses a dotted outline.

  • :focus-visible on <select> element: There’s a Firefox failure on focus-visible-002.html because it doesn’t match :focus-visible when you click a <select> element. I opened a CSSWG issue to discuss this, and initially thought the agreement was that Firefox’s behavior is the right one. So I did a patch to change Chromium’s behavior and update the tests, but during the review I was pointed to a Chromium bug about this topic that was closed as WONTFIX; the reason is that when you click a <select> element you can type letters to select an option from the keyboard. Right now the discussion has been reopened and we’ll need to wait for the final resolution on the topic to see which is the right implementation.

  • Keyboard interaction once an element is focused: This is tested by focus-visible-007.html. The example here is that you click an element to focus it; initially the element doesn’t match :focus-visible, but then you use the keyboard (for example you type a letter). In that situation Chromium will start matching :focus-visible while Firefox won’t. The spec is quite explicit on the topic, so it looks like a Firefox bug:

    If the user interacts with the page via the keyboard, the currently focused element should match :focus-visible (i.e. keyboard usage may change whether this pseudo-class matches even if it doesn’t affect :focus).

  • Programmatic focus and :focus-visible: What should happen with :focus-visible when the website uses element.focus() from a script to move the focus? The spec has some heuristics that depend on whether the active element before focus() is called was matching (or not) :focus-visible. But I’ve opened a CSSWG issue to discuss what should happen when there’s no active element. The discussion is still ongoing, and depending on the outcome there might be changes in the current implementations. Right now there are some subtle differences between Chromium and Firefox here.
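
A tiny sketch of that last scenario (hypothetical markup, just to illustrate the heuristic):

<button id="trigger">Open</button>
<div id="target" tabindex="-1">Focused from script.</div>
<script>
  // Per the heuristics, whether #target matches :focus-visible after this
  // call depends on the previously focused element: if #trigger was
  // activated with the keyboard (and so matched :focus-visible), the new
  // target should match too; after a mouse click it generally shouldn't.
  trigger.addEventListener('click', () => target.focus());
</script>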

:-webkit-direct-focus?

You probably don’t know what that is, but it’s somehow related to :focus-visible, so I believe it’s worth mentioning here.

WebKit is the browser with the best support for the :focus pseudo-class behavior in Shadow DOM (see the WPT test results). The issue here is that the ShadowRoot should match :focus if one of its descendants is focused, so if you have an <input> element in the Shadow Tree and you focus it, you’ll have 2 elements matching :focus: the ShadowRoot and the <input>.

<style>
  #host {  padding: 1em; background: lightgrey; }
  #host:focus { background: lime; }
</style>
<div id="host"></div>
<script>
  shadowRoot = host.attachShadow(
    {mode: 'open', delegatesFocus: true});
  shadowRoot.innerHTML =
    '<input value="Focus me">';
</script>

In Chromium if you use delegatesFocus=true in element.attachShadow(), and you have an example like the one described above, you’ll get two focus indicators, one in the ShadowRoot and one in the <input>. Firefox doesn’t match :focus in the ShadowRoot so the issue is not present there.

WebKit matches :focus independently of the delegatesFocus value (which is the right behavior per spec), so it’d be even more common to end up with two focus indicators. To avoid that, WebKit introduced the :-webkit-direct-focus pseudo-class, which is not web exposed, but is used in the default UA style sheet to avoid this bad effect of having a focus indicator on the ShadowRoot.

I believe the :focus-visible spec should describe that behavior regarding ShadowRoot, so that it doesn’t match in those situations. That way WebKit could get rid of :-webkit-direct-focus and use :focus-visible once it’s implemented. I’ve reported a CSSWG issue to discuss this topic.

WIP implementation

So far I haven’t talked about the implementation at all, but the reason is that all the previous work is required in order to do a proper implementation, with good quality, that is interoperable between the different browsers. :focus-visible is a new feature, and despite all the interop mess regarding how focus works in the different browsers and platforms, we should aim to have a :focus-visible implementation that is as interoperable as possible.

Despite all this related work, I’ve also found some time to work on a patch. It’s still not ready to be sent upstream, but it’s already doing some things and passing some of the WPT tests. Of course several things are still missing, but next you can see a quick screen recording of :focus-visible working in WebKit.

:focus-visible example running on WebKitGTK MiniBrowser

Some numbers

I know this is not really relevant, but it helps to get a grasp on what has been happening during this month:

  • 3 CSSWG issues reported.
  • 13 PRs merged in WPT.
  • 5 patches landed in WebKit.
  • 4 patches landed in Chromium.
  • And many discussions with different people; special thanks to Alice and Emilio, who have been really helpful.

Next steps

The plan for February is to try to find an agreement on the CSSWG issues, close them, and update the WPT tests accordingly. Maybe this work could even include landing some patches on the current implementations. And of course, focus (pun intended) the effort on the implementation of :focus-visible in WebKit.

I hope this blog post helps you to better understand the work that goes on behind the scenes when a web platform feature is implemented, especially if you want to do it in a way that ensures browser interoperability and reduces web authors’ pain.

If you enjoyed this project update, stay tuned as there will be more in the future.

January 27, 2021 11:00 PM

January 24, 2021

Brian Kardell

Optimizing The W3C Sandwiches


In which I muse about a boring, but, I think important topic of flawed W3C structures and process, in a hopefully not-boring way...

Imagine that there is a trade organization which holds a marathon meeting event for its 500 members every year. It's a long meeting, and so during it, they serve sandwiches. A break, with food and peers, it seems, is really conducive to productivity. So, while it's a little non-obvious, while small and with no special 'powers' - the sandwiches are actually really important. The trade organization realizes this and decides to formalize a way to make sure we always have good sandwiches. A few weeks before, they send out an email asking people:

Please tell us an ingredient which you could provide enough of for every sandwich for the next 3 years. Your ingredient may or may not be used, we are just looking for ingredient nomination statements expressing what your ingredient brings to the table. If your ingredient is chosen, we will tell you whether that is a 1, 2 or 3 year commitment.

Even if everyone doesn't really understand how inter-connectedly important the sandwiches are, they aren't entirely apathetic about them either: When it comes time to eat, and everyone's hungry after long meetings, there will definitely be opinions about the sandwiches.

BUT... At the same time, there will incredibly be very few ingredient nominations. Why? So many reasons!

Even if a person really had some vague notion that "a lot of my favorite sandwiches had pickles", their company is small, and it doesn't actually deal with pickles. After a moment or three's worth of thought about "which pickles?" and "do other people even like pickles?" and "how would I ask my boss for money to offer 1000 pickles... maybe" - this is quickly abandoned as not really worth the time. Each person receiving this email knows there are 499 other organizations, and that what we need are really just 4 ingredients! Someone else - with the money and food connections - will nominate anyways.

And so, that's what happens... Several of them are from some grocery or food service industry. Their businesses totally get the importance of the sandwiches, as well as what people largely "like", and they are structured in a way where this basically isn't a problem for them. Several of them might even discuss among themselves to make sure they're bringing complementary things... They'll nominate things like "lettuce" or "tomatoes", or maybe "cheese". Maybe some others don't discuss, but they do a lot of sandwich-adjacent work and so they still nominate some pretty general condiments that could plausibly be used on a lot of good sandwiches... maybe they offer "mustard" or "mayonnaise". But, then there's also people with other food connections who offer other things. In the end, we wind up with eggs, Vegemite, peanut butter, some gourmet cactus jelly, caviar, and butter. Then, of course - there's someone who offers some not-even-food item.

Now, with the exception of the last one - there's not really something inherently wrong with any of those ingredients - but I think we could at least agree that making a sandwich from random combinations of them is likely to produce a lot of suboptimal sandwiches.

Vote for pickles!

Everyone should have a say here, and we need to account for multiple tastes - so the way the organization goes about deciding on which sandwich everyone will have is by voting for ingredients.

Now, again -- not due to apathy about the actual sandwiches, or the results of the meeting that they will affect - only about 1/5th of people will actually vote for ingredients. Why? Again, "this isn't worth my time, surely someone else can figure out sandwiches" plays in heavily. Perhaps you could even say that the lack of participation is an indirect recognition of what a dysfunctional way this is to make a sandwich.

I mean, who really has time to read the ingredient nomination statements, and also to filter out what might be just bull? Who wants to take the time to learn about the dietary benefits of whey, or read the back and forth debates extolling the health virtues of mayonnaise over mustard - or read some people ranting about why the tomato lobby is colluding with the spinach people, putting honest bean farmers out of business - and inevitable back and forth. Who has time to get entangled in the debate about how maybe the caviar people shouldn't even have been invited in the first place?

Still, the organization believes that members should make the decision, and, it seems there are a few ways we could ask this question. So, in early years it was asked like this:

Here are 12 ingredients, everybody pick 4, and we'll use some of the same ingredients from last year, plus the top 4 vote getters here, to make everyone sandwiches!

The initial sandwich reviews seemed pretty good, actually. Some people even praised the first sandwiches' 'distinguished tastes'. But, slowly, a lot of people began admitting that really, they actually preferred something a little more boring, or less healthy, but had been kind of afraid to admit it.

Also, it was really kind of a crap shoot. There was really nothing preventing us from getting a real mishmash of random ingredients that didn't go together at all. Well, except for the fact that, in the end, most people were casting 4 votes, representing all of the ingredients of an actual sandwich. Nobody on purpose probably picked the combination of Vegemite, mustard, butter and caviar - even if they really liked some of those. This meant that there were at least some odds that lots of "common sandwich" votes were somewhat compatible.

But really... that still involved a lot of luck, and after a few years, people began to kind of resent the sandwiches. The quality of productivity, collaboration and relationships at the event suffered.

How about a nice club sandwich?

But then, some people realized... Hey... Maybe a better way would be to try to talk to lots of people and put together a proposal for a good sandwich. The whole sandwich. Let's call them the "Whole Sandwich People". Like, can we get enough people to vote the same way, by just saying:

Hi. We've taken the time to help make sure we had good ingredients, and looking at everything that is available and trying to balance lots of things, we're recommending this nice club sandwich - <here's why>. If that sounds good, order the club.

Well, that question is a lot easier to answer, so at least some additional people say "yes, please, that sounds good". The simple volume of several new people voting in the same direction was enough to generally make sure that those sandwiches won.

Cool! The Whole Sandwich People also cared about variety and nutrition and all of the other things and so began actively working to make sure that good, compatible, but not commonly offered ingredients got nominated. And, we got some more interesting sandwiches because of it.

The Exclusive Club Sandwich

In the end, not everyone was happy though. Some kinds of ingredients just weren't getting picked anymore. People began asking: How can our trade organization possibly represent the tastes of the people who always offer us caviar if we never pick caviar!? Your club sandwich is an exclusive club, it's biased toward the food service people!

The Whole Sandwich People suggested that this seemed like a weird take. In fact, they pointed to the fact that sandwiches had gotten more diverse, interesting and nutritious by discussing the whole sandwich. Sure, a lot of the ingredients were coming from food service companies, but that's just practical realities about other parts of the dysfunctional system.

And, yes, the Whole Sandwich People admitted, it's true, none of our sandwiches have included caviar. But, where is the bug? Couldn't the caviar people work with others to find a way to offer an acceptable Whole Sandwich with caviar? It might be plausible - it's not even that we necessarily hate caviar! The truth is, we just don't know how to make a good sandwich with it, and all of our sandwich experiences with it so far are bad. Or, if you can't offer a whole sandwich with caviar that people will order willingly.... Maybe just find something to offer other than caviar.

This, it seems, was unacceptable. It was seen as unfair. The Whole Sandwich People held too much sway over sandwich determination. So, in an attempt to improve the fairness of things, the trade organization changed the question to:

Here are 9 ingredients, everybody rank them in order of preference, and then we will use a complex system whereby one of the things you ranked will be counted, and we'll use the 4 winners to make everyone sandwiches

This amplifies the likelihood that things like caviar wind up in our sandwiches. That's literally the feature.

Why? Because there is no single ingredient that makes a club sandwich. While everyone ranks the widely approved ingredients, they inevitably have to be in some order, and only one of them will count. Meanwhile, the small group of people who are really passionate about a less common ingredient, and who are maybe even a little irked because they are never picked, just put that as #1 - which definitely counts.

Have we optimized the sandwich? Because I thought that was the goal. I think, no.

The TAG/AB Sandwiches

All of this is an analogy for how the W3C TAG and AB are elected, and I've laid it out in an attempt to explain why I think this is a really silly way to do it, and why I think it should change: The "best sandwich" isn't created by talking about ingredients in isolation - you have to talk about the sandwich.

Yes, I get that sandwiches are a bit of a strained analogy. TAG isn't exactly a sandwich.... Maybe "a bunch of people going to a restaurant and attempting to pick a shared menu of 4 items based on whatever random stuff is available" is a better one. Or maybe picking a DND party is a better one still...

But, the fuzzy point is mostly the same with any of them: "Good" and "bad" are, except at real extremes not really judge-able in isolation. Sure, there are some "not-actually-even-food" items we can universally say "no thanks" to, but that's also rare, and not really how it works.

What matters in counting is "which one is best" and there is almost never a clear answer to that question without considering the whole. What matters to really knowing what people want, is also asking them in such a way that they can actually participate meaningfully.

How big of a problem is this, really?

After all, it's been a few elections now since we changed the question, and while there were some early "surprises", things seem to be generally going alright. We just had a really big TAG election, for example, and people seem generally pretty happy with the results of the actual election part. I know I am. So... Maybe this is much ado about nothing? Is it really worth our time to discuss?

I think, yes. Here's why: Good results in recent elections have been a combination of factors that won't always hold and that we shouldn't have to count on. They are as much despite the process as anything (as argued below).

Plus, it's just silly.

How to address this...?

Well, that's the big question. But, one thing seems obvious to me: We need to stop insisting that we can't talk about the sandwich and instead focus on how we talk about the sandwich all the way through the process.

We Whole Sandwich People have learned that even if the way the question is asked has been changed, it's still possible to help elect whole sandwiches; it's just much harder and requires an astonishing amount more coordination, trust and ultimately still some luck.

The point is that all of the things that we need to happen are already happening, but the process actually fights them. That's broken.

I think we need to move to a more open model, but one that moves impractical noise out of the general discussion for people for whom that is mostly annoying and/or confusing. I think we need to stop being "procedurally secretive" about nominations. I think we need to enlist a willing group of people who can help cooperatively search for candidates and make sure that we're getting ingredients that are good by a number of metrics. I think this group should work to seek something like consensus on a menu of a few possible, "well-balanced meals" and provide recommendations that allow ACs to work with that by default or dig into à la carte options and more details only if they really want to.

Further, I think that rotating terms only complicates this. It's very, very simple to re-affirm seats every time - and this really lets us talk about the whole thing at once.

We don't really even have to change a single rule to accomplish any of this in practice, but currently the rules fight it at every step and make it an extraordinarily difficult exercise. We should fix that. I intend to open some issues with some more specific proposals soon, but I'd love to hear anyone's thoughts or work collaboratively with anyone to make those good proposals.

January 24, 2021 05:00 AM

January 20, 2021

Sergio Villar

Flexbox Cats (a.k.a fixing images in flexbox)

In my previous post I discussed my most recent contributions to the flexbox code in WebKit, mainly targeted at reducing the number of interoperability issues among the most popular browsers. The ultimate goal was of course to make the life of web developers easier. It got quite some attention (I loved Alan Stearns’ description of the post) so I decided to write another one, this time focused on the changes I recently landed in WebKit (Safari’s engine) to improve the handling of elements with aspect ratio inside flexbox, a.k.a. make images work inside flexbox. Some of them have already been released in Safari Technology Preview 118, so it’s now possible to help test them and provide early feedback.

(BTW if you wonder about the blog post title I couldn’t resist the temptation of writing “Flexbox Cats” which sounded really great after the previous “Flexbox Gaps”. After all, image support was added to the Web just to post pictures of 🐱, wasn’t it?)

Same as I did before, I think it’d be useful to review some of the more relevant changes with examples, so you could have one of those inspiring a-ha moments when you realize that the issue you just couldn’t figure out was actually a problem in the implementation.

What was done

Images as flex items in column flows

Web engines are in charge of taking an element tree and its accompanying CSS and creating a box tree from them. All of this relies on formatting contexts. Each formatting context has specific ideas about how layout behaves. Both flex and grid, for example, created new, interesting formatting contexts which allow them to size their children by shrinking and/or stretching them. But how all this works can vary. While there is “general” box code that is consulted by each formatting context, there are also special cases which require specialized overrides. Replaced elements (images, for example) should work a little differently in flex and grid containers. Consider this:

.flexbox {
    display: flex;
    flex-direction: column;
    height: 500px;
    justify-content: flex-start;
    align-items: flex-start;
}

.flexbox > * {
    flex: 1;
    min-width: 0;
    min-height: 0;
}


<div class="flexbox">
      <img src="cat1.jpg>
</div>

Ideally, the aspect ratio of the replaced element (the image, in the example) would be preserved as the flex context calculated its size in the relevant direction (column is the block direction/vertical in western writing modes, for example)… But in WebKit, it wasn’t. It is now.

Black and white cat by pixabay

Images as flex items in row flows

This second issue is kind of the specular twin of the previous one. The same issue that existed for block sizes was also there for inline sizes: override inline sizes were not used to compute the block sizes of items with an aspect ratio (again, the intrinsic inline size was used), and thus the aspect ratio of the image (of replaced elements in general) was not preserved at all. An example of this issue:

.flexbox {
  display: flex;
  flex-direction: row;
  width: 500px;
  justify-content: flex-start;
  align-items: flex-start;
}
.flexbox > * {
  flex: 1;
  min-width: 0;
  min-height: 0;
}


<div class="flexbox">
    <img src="cat2.jpg">
</div>

Gray Cat by Gabriel Criçan

Images as flex items in auto-height flex containers

The two fixes above allowed us to “easily” fix this one, because we can now rely on the computations done by the replaced elements code to compute sizes for items with aspect ratio even if they’re inside special formatting contexts such as grid or flex. This fix was precisely about delegating that computation to the replaced elements code instead of duplicating all the aspect-ratio machinery in the flexbox code. This fix apparently has the potential to be a game changer:
This is a key bug to fix so that Authors can use Flexbox as intended. At the moment, no one can use Flexbox in a DOM structure where images are flex children.
Jen Simmons in bug 209983

Also don’t miss the opportunity to check this visually appealing demo by Jen, which should work as expected now. For those of you without a WebKit-based browser, I’ve recorded a screencast so you can compare (all circles should be round).
Left: old WebKit. Right: new WebKit (tested using WebKitGtk)
Apart from the screencast, I’m also showcasing the issue with some actual code.

.flexbox {
    width: 500px;
    display: flex;
}
.flexbox > * {
    min-width: 0;
}


<div class="flexbox">  
  <img style="flex: auto;" src="cat3.jpg">
</div>

Tabby Cat by Bekka Mongeau

Flexbox additional cases for definite sizes

This was likely the trickiest one. I remember having nightmares with all the definite/indefinite stuff back when I was implementing grid layout with other Igalia colleagues. The whole definite/indefinite size concept, although sensible and relatively easy to understand, is actually a huge challenge for web engines, which were not really designed with it in mind. Laying out web content traditionally means taking a width as input to produce a height as output. However, formatting contexts like grid or flex make the whole picture much more complicated.
This particular issue was not a malfunction but something that was not implemented. Essentially the flex specs define some cases where indefinite sizes should be considered as definite although the general rule considers them indefinite. For example, if a single-line flex container has a definite cross size we could assume that flex items have a definite size in the cross axis which is indeed equal to the flex container inner cross size.
In the following example the flex item, the image, has height:auto (by default) which is an indefinite size. However the flex container has a definite height (a fixed 300px). This means that when laying out the image, we could assume that its height is definite and equal to the height of the container. Having a definite height then allows you to properly compute the width using an aspect ratio.

.flexbox {
    display: flex;
    width: 0;
    height: 300px;
}


<div class="flexbox">
  <img src="cat4.png">
</div>

White and Black Cat With Blue Eyes by Thomas Svensson

Aspect ratio computations and box-sizing

A very common oversight in layout code. When dealing with layout bugs we (browser engineers) usually forget about box-sizing, because the standard box model is the truth and the whole truth and the sole truth in our minds. Jokes aside, in this case the aspect ratio was applied to the border box (content + border + padding) instead of to the content box as it should be. The result was distorted images, because border and padding were altering the aspect ratio computations.

.flexbox {
  display: flex;
}
.flexbox > * {
  border-top: 150px solid blue;
  border-left: 30px solid orange;
  height: 300px;
  box-sizing: border-box;
}


<div class=flexbox>
  <img src="cat5.png"/>
</div>

Grayscale Photo of Long Fur Cat by Skyler Ewin

Conclusions

I mentioned this in the previous post but I’ll do it again here: having the web platform test suite has been an absolute game changer for web browser engineers. The tests have helped us in many ways, from easily allowing us to verify our implementations to acting as a safety net against potential regressions we might add while fixing issues in the engines. We no longer have to manually test stuff in different browsers to check how other developers have interpreted the specs. We now have the test, period.
In this case, I’ve been using them in a different way. They have served me both as a guide, directing my efforts to reduce the flexbox interoperability issues and also as a nice metric to measure the progress of the task. Talking about metrics, this work made WebKit based browsers pass an additional 64 test cases from the WPT test suite, a very nice step forward for interoperability.
I’m attaching a screenshot with the current status of images as flex items from the WPT point of view. Each html file on the left column is a test, and each test performs multiple checks. For example the image-as-flexitem-* ones run 19 different checks (use cases) each. Each column shows how many tests each browser successfully runs. A quarter ago Safari’s (WebKit’s) figures for most of them were 11/19 or 13/19, but now the latest Tech Preview passes all of them. Not bad, huh?
image-as-flexitem-* flexbox tests in WPT as of 2021/01/20

Acknowledgements

Again many thanks to the different awesome folks at Apple, Google and my beloved Igalia that helped me with very insightful reviews and strong support at all levels.
Also I am thankful to all the photographers from whom I borrowed their nice cat pictures (including the Brown and Black Cat on top by pixabay).

by svillar at January 20, 2021 09:45 AM

January 08, 2021

Samuel Iglesias

Vkrunner RPM packages available

VkRunner is a Vulkan shader tester based on Piglit’s shader_runner (I already talked about it in my blog). This tool is very helpful for creating simple Vulkan tests without writing hundreds of lines of code. In the Graphics Team at Igalia, we use it extensively to help us with open-source driver development in Mesa, such as the V3D and Turnip drivers.

As a hobby project for last Christmas holiday season, I wrote the .spec file for VkRunner and uploaded it to Fedora’s Copr and OpenSUSE Build Service (OBS) for generating the respective RPM packages.

This is the first time I have created a package, and thanks to the documentation on how to create RPM packages, the process was simpler than I initially thought. If I find the time to read the Debian New Maintainers’ Guide, I will create a DEB package as well.

Anyway, if you have installed Fedora or OpenSUSE on your computer and you want to try VkRunner, just follow these steps:

Fedora

  • Fedora:
$ sudo dnf copr enable samuelig/vkrunner
$ sudo dnf install vkrunner

OpenSUSE logo

  • OpenSUSE / SLE:
$ sudo zypper addrepo https://download.opensuse.org/repositories/home:samuelig/openSUSE_Leap_15.2/home:samuelig.repo
$ sudo zypper refresh
$ sudo zypper install vkrunner
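
Once installed, running a test is just a matter of pointing the binary at a shader_test file (the path below is only a placeholder; see the VkRunner README for the full option list):

$ vkrunner path/to/example.shader_test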

Enjoy it!

January 08, 2021 07:42 AM

January 01, 2021

Eric Meyer

Highlighting Accessible Twitter Content

For my 2020 holiday break, I decided to get more serious about supporting the use of alternative text on Twitter.  I try to be rigorous about adding descriptive text to my images, GIFs, and videos, but I want to be more conscientious about not spreading inaccessible content through my retweets.

The thing is, Twitter doesn’t make it obvious whether someone else’s content has been described, and the way it structures (if I can reasonably use that word) its content makes it annoyingly difficult to conduct element or accessibility-property inspections.  So, in keeping with the design principles that underlie both the Web and CSS, I decided to take matters into my own hands.  Which is to say, I wrote a user stylesheet.

I started out by calling out things that lacked useful alt text.  It went something like this:

div[aria-label="Image"]::before {
	content: "WARNING: no useful ALT text";
	background: yellow;
	border: 0.5em solid red;
	border-radius: 1em;
	box-shadow: 0 0 0.5em black;
	…

…and so on, layering on some sizing, font stuff, and positioning to hopefully place the text where it would be visible.  This failed to satisfy for two reasons:

  1. Because of the way Twitter nests its dozens of utility-classed divs, and the styles repeatedly applied thereby, many images were tall (but cut off) or wide (ditto) in ways that pulled the positioned generated text out of the visible frame shown on the site.  There wasn’t an easily found, predictable way to address the element I wanted to use as a positioning context.  So that was a problem.
  2. Almost every image in my feed had a big red and yellow WARNING on it, which quickly depressed me.

What I realized was that rather than calling out the failures, I needed to highlight the successes.  So I commented out the Big Red Angry Text approach above and got a lot simpler.

div[aria-label="Image"] {
	filter: grayscale(1) contrast(0.5);
}
div[aria-label="Image"]:hover {
	filter: none;
}

A screenshot of the author’s Twitter timeline, showing three tweets.  The middle tweet is text only.  The top tweet has a blurred-out image which is grayscale and of reduced contrast, indicating it has no useful alternative text.  The bottom tweet is also blurred, but in full color and contrast, indicating it has been given useful alternative text by the person who posted it.
Three consecutive tweets from my timeline on Friday, January 1st, 2021.  The blurring of the images in the top and bottom tweets is an effect of the Data Saver preference, not my CSS.

Just that.  De-emphasize the images that have the default alt text by way of their enclosing divs, and remove that effect on hover so I can see the image as intended if I so choose.  I like this approach better because it de-emphasizes images that aren’t properly described, while those which are described get a visual pop.  They stand out as lush islands in a flat sea.

In case you’ve been wondering why I’m selecting divs instead of img and video elements, it’s because I use the Data Saver setting on Twitter, which requires me to click on an image or video to load it.  (You can set it via Settings > Accessibility, display and languages > Data usage > Data saver.  It’s also what’s blurring the images in the screenshot shown here.)  I enable this setting to reduce network load, but also to give me an extra layer of protection when disturbing images and videos circulate.  I generally follow people who are careful about not sharing disturbing content, but I sometimes go wandering outside my main timeline, and never know what I’ll find out there.

After some source digging, I discovered a decent way to select non-described videos, which I combined with the existing image styles:

div[aria-label="Image"],
div[aria-label="Embedded video"] {
	filter: grayscale(1) contrast(0.5);
}
div[aria-label="Image"]:hover,
div[aria-label="Embedded video"]:hover {
	filter: none;
}

The fun part is, Twitter’s architecture spits out nested divs with that same ARIA label for videos, which I imagine could be annoying to people using screen readers.  Besides that, it also has the effect of applying the filter twice, which means videos that haven’t been described get their contrast double-reduced!  And their grayscale double-enforced!  Fun.

What I didn’t expect was that when I start playing a video, it loses the grayscale and contrast reduction effects even when not being hovered, which makes the second rule above a little over-written.  I don’t see the DOM structure changing a whole lot when the video loads and plays, so either videos are being treated differently for filter purposes, or I’m missing something in the DOM that’s invalidating the selector matching.  I might poke at it over time to find a fix, or I may just let it go.  The user experience isn’t too far off what I wanted anyway.

There is a gap in my coverage, which is GIFs pulled from Twitter’s GIF pool.  These have default alt text other than Image, which makes selecting for them next to impossible.  Just a few examples pulled from Firefox’s Accessibility panel when I searched the GIF panel for “this is a test”:

testing GIF
This Is ATest Fool GIF
Corona Test GIF by euronews
Test Fail GIF
Corona Virus GIF by guardian
Its ATest Josh Subdquist GIF
Corona Stay Home GIF by INTO ACTION
Is This A Test GIF
Stressed Out Community GIF
A1b2c3 GIF

I assume these are Giphy titles or something like that.  In nearly every case, they’re insufficient, if not misleading or outright useless.  I looked for markers in the DOM to be able to catch these, but didn’t find anything that was obviously useful.

I did think briefly about filtering for any aria-label that contains the string GIF ([aria-label*="GIF"]), but that would improperly catch images and videos that have been described but happen to have the string GIF inside them somewhere.  This might be a relatively rare occurrence, but I’m loth to gray out media that someone went to the effort of describing.  I may change my mind about this, but for now, I’m accepting that GIFs which appear in full color are probably not described, particularly when containing common memes, and will try to be careful.
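
For the record, the rule I considered (but am not actually applying) would have been a one-line extension of the existing selectors:

div[aria-label*="GIF"] {
	filter: grayscale(1) contrast(0.5);
}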

I apply the styles I settled on in Firefox using Stylus, which is also available for Chrome, and they’re working pretty well for me.  I wish I could figure out a way to apply them in mobile contexts, but that’s a (much bigger) problem for another day.

I’m not the first to tread this ground, nor do I expect to be the last, sadly.  For a deeper dive into all the details of Twitter accessibility and the pitfalls that can occur, please read Adrian Roselli’s excellent article Improving Your Tweet Accessibility from just over two years ago.  And if you want to apply accessibility-aid CSS to your own Twitter experience but can’t or won’t use Stylus, Adrian has a bookmarklet that injects Twitter alt text all set up and ready to go—you can use it as-is, or replace the CSS in his bookmarklet with mine above or your own if you want to take a different approach.

So that’s how I’m upping my awareness of accessible content on Twitter in 2021.  I’d love to hear what y’all are using to improve your own experiences, or links to tools and resources on this same topic.  If you have any of that, please drop the links in a comment below, so that everyone who reads this can benefit.  Thanks!

by Eric Meyer at January 01, 2021 09:06 PM

December 22, 2020

Manuel Rego

2020 Recap

2020 is not a great year to do any kind of recap, but there have been some positive things happening at Igalia during this year. Below you can find a highlight of some of these things, in no particular order.

CSS Working Group A Coruña F2F

The year couldn’t have started better: in January Igalia hosted a CSS Working Group face-to-face meeting in our office in A Coruña (Galicia, Spain). Igalia has experience arranging other events in our office, but this was the first time that the CSSWG came here. It was an amazing week and I believe everyone enjoyed the visit to this corner of the world. 🌍

Brian Kardell from Igalia was talking to everybody about Container Queries. This is one of the features that web authors have been asking for for a long time, and Brian was trying to push the topic forward and find some kind of solution (even if not 100% feature complete). During that week there were discussions about the relationship with other topics like Resize Observer or CSS Containment, and new ideas appeared too. Brian posted a blog post after the event, explaining some of those ideas. Later my colleague Javi Fernández worked on an experiment that Brian mentioned in a recent post. The good news is that all these conversations managed to bring this topic back to life, and last November Google announced that they have started working on a Container Queries prototype in Chromium.

During the meeting Jen Simmons (at Mozilla at that time, now at Apple) presented some topics from Mozilla, including a detailed proposal for Masonry Layout based on Grid; this is something authors have also shown interest in, and Firefox already has a prototype implementation behind a runtime flag.

Apart from the three days full of meetings and interesting discussions, some of the CSSWG members participated in a local meetup giving 4 nice talks:

Finally, I remember some corridor conversations about the Mozilla layoffs that had happened just a few days before the event, but nobody could expect what was going to happen during the summer. It looks like 2020 has been a bad year for Mozilla in general and Servo in particular. 😢

Open Prioritization

This summer Igalia launched the Open Prioritization campaign, where we proposed a list of topics to be implemented on the different browser engines, and people supported them with different pledges; I wrote a blog post about it by that time.

Open Prioritization: :focus-visible in Safari/WebKit: $30.8K pledged out of $35K.

This was a cool experiment, and it looks like a successful one, as :focus-visible in WebKit/Safari has been the winner. Igalia is currently collecting funds through Open Collective in order to start the implementation of :focus-visible in WebKit, you still have time to support it if you’re interested. If everything goes fine this should happen during the first quarter of 2021. 🚀

Igalia Chats

This actually started in late 2019, but it has been ongoing during the whole of 2020. Brian Kardell has been recording a podcast series about the web platform and some of its features with different people from the industry. They have been getting more and more popular, and Brian was even asked to record one of these for the last BlinkOn edition.

So far 8 episodes of around an hour in length have been published, with 13 different guests. More to come in 2021! If you are curious and want to know more, you can find them on the Igalia website or in your favourite podcasting platform.

Igalia contributions

This is not a comprehensive list but just some highlights of what Igalia has been doing in 2020 around CSS:

We’re working on a demo about these features, that we’ll be publishing next year.

In February Chromium published the requirements to become an API owner. Due to my involvement in the Blink project since the fork from WebKit back in 2013, I was nominated and became a Blink API Owner last March. 🥳

Yoav Weiss on the BlinkOn 13 Keynote announcing me as API owner

The API owners meet on a weekly basis to review the intent threads and discuss them; it’s an amazing learning experience to be part of this group. In my case, when reviewing intents I usually pay attention to things related to interoperability, like the status of the spec, test suites and other implementations. In addition, I have the support of all my awesome colleagues at Igalia who help me to play this role. Thank you all!

2021 and beyond…

Igalia keeps growing and a bunch of amazing folks will join us soon; in particular, Delan Azabani and Felipe Erias are already starting these days as part of the Web Platform team.

Open Prioritization should see its first successful project, as :focus-visible funding advances and it gets implemented in WebKit. We hope this can lead to new similar experiments in the future.

And I’m sure many other cool things will happen at Igalia next year, stay tuned!

December 22, 2020 11:00 PM

Brian Kardell

2020: The Good Parts

Note from the author...

My posts frequently (like this one) have a 'theme' and tend to use a number of images for visual flourish. Personally, I like it that way, I find it more engaging and I prefer for people to read it that way. However, for users on a metered or slow connection, downloading unnecessary images is, well, unnecessary, potentially costly and kind of rude. Just to be polite to my users, I offer the ability for you to opt out of 'optional' images if the total size of viewing the page would exceed a budget I have currently defined as 200k...


Each year, Igalians take a moment and look back on the year and assess what we've accomplished. Last year I wrote a wrap up for 2019 and hinted at being excited about some things in 2020 - I'd like to do the same this year.

Even in a "normal" year, making a list of things that you've actually accomplished can be a good exercise for your mental health. I do it once a month, in fact. It's easy to lose sight of anything beyond what you're thinking of in the moment and feel overwhelmed by the sheer volume. If I can be honest with you, since it's just between us, heading into this exercise always fills me with a sense of dread. It always seems like now is the time when you have to come to grips with how little you actually accomplished this month. But, my experience is always quite the opposite: The simple act of even starting to create a list of things you actually did can give you a whole new perspective. Sometimes, usually maybe, I don't even finish the list because I hit a point where I say "Wow, actually, that's quite a lot" and feel quite a bit better.

But, 2020 is, of course, not a "normal year". It's more than fair to expect less of ourselves. So, when I sat down to write about what we accomplished, I was faced with this familiar sinking feeling -- and I had precisely the same reaction: Wow! We did a lot. So, let me share some highlights of Igalia's 2020: The Good Parts.

All the browsers

At Igalia, we are significant contributors to all of the browser engines (and several of the browsers that sit atop them too). There are a lot of ways you can look at just how much we do, and none of them are perfect, but commits are one easy, if fuzzy, measure of how much we did in the community compared to others. So, how much less did we do this year than last? Quite the opposite, actually!

Igalia is again the #2 contributor to Chromium (Microsoft is coming up fast though). We are also again the #2 contributor to WebKit. Last year we raised some eyebrows by announcing that we had 11% of the total commits. This year: 15.5%! We are also up one place, to #6 contributor in the mozilla-central repository, and up three places, to #4, in Servo! Keep in mind that #1 in all of these are the project owners (Google, Apple and Mozilla respectively).

We were huge contributors everywhere, but look at this: 15.5% of all WebKit Contributions in 2020!!

We worked on so many web features!

Some of the things we worked on are big or exciting things that everyone can appreciate and I want to highlight a little more here, but the list of features where we worked on at least one (sometimes two) implementations would be prohibitively long! Here is a very partial list of ones we worked on that I won't be highlighting.

  • Lazy loading
  • stale-while-revalidate
  • referrer-policy
  • Fixing bugs with XHR/fetch
  • Interop/Improvements to ResizeObserver/IntersectionObserver
  • Custom Properties performance
  • Text wrapping, line breaking and whitespace
  • Trailing ideograph spaces
  • Default aspect ratio from HTML Attributes
  • scroll snap
  • scroll-behavior
  • overscroll-behavior
  • scrollend event
  • Gamepad
  • PointerLock
  • list-style-type: <string>
  • ::marker
  • Logical inset/border/padding/margin

A few web feature highlights...

Here are just a few things that I'd like to say a little more about...

Container Queries

I am exceptionally pleased that we have been pivotal in moving the ball in conversations on container queries. Not only did our outreach and discussions last year change the temperature of the room, but we got a start on two proposals and actually had CSS Working Group discussion on both. I'm also really pleased that Igalia built a functional prototype for further review and discussion of our switch proposal, and that we've been collaborating with Google and Miriam Suzanne, who have picked up where David Baron's proposal left off.

unicorns walking around in paradise
It's like we just found not one, but two mythical unicorns

I expect 2021 to be an exciting year of developments in this space where we get a lot more answers sorted out.

MathML

Two years ago, MathML fell into a strange place in web history and had an uncertain future in browsers. Igalia has led the charge in righting all of this. Along with the MathML Refresh Community Group, peer implementers and help from various standards groups, we now have MathML-Core - a well defined spec, with tests that define the most meaningful parts of MathML and their relation to the web platform as interoperability targets. We've made a ton of progress in aligning support and describing how things fit, and invested a lot of time this year upstreaming work in Chromium. Some additional work remains for next year pending Layout NG work at Google, but it's looking better and better, and most of it is shipping behind the experimental web platform features flag today. We also helped create and advocate for a new W3C charter for math.
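
To give just a taste, here is about the simplest kind of markup MathML-Core covers - a fraction with a square root - which is the sort of thing that renders natively once that experimental flag is enabled:

<math>
  <mfrac>
    <mrow><mn>1</mn><mo>+</mo><msqrt><mn>5</mn></msqrt></mrow>
    <mn>2</mn>
  </mfrac>
</math>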

But let me share why I'm especially proud of it...

A professor in front of a chalkboard full of math with space scenes super-imposed

Because math is text, and a phenomenally important kind of text. The importance of being able to render formulae is really highlighted during a pandemic, when researchers of all kinds need to share information and students are learning from home. I'm super proud to be a part of this single action that I believe really is a leap in helping the Web realize its potential for these societally important issues.

SVG/CSS Alignment

At the end of last year's post I hinted about something we were working on. The practical upshots that people will more immediately relate to will be the abilities to do 3D transforms in SVG and hardware accelerate SVG.

These are long-requested enhancements, but they also come with historical baggage, so it's been difficult for browsers to invest. It's a great illustration of why Igalia is great for the ecosystem. This work is getting investment priority because Igalia are the maintainers of WPE WebKit, the official WebKit port for embedded devices.

Software on embedded devices has a unique combination of qualities: lots of controls and displays want to be SVG-based, but they also have to deal with typically low-end hardware, which usually still has a GPU. Thus, this problem is a few orders of magnitude more critical for those devices than it is elsewhere. However, our work will ultimately fund improvements for all WebKit browsers, which also incentivizes others to follow!

OffscreenCanvas

One thing we haven't talked about yet, but I can't wait to, is OffscreenCanvas. Apple originally pioneered the <canvas> element and it's super cool and useful for a lot of stuff. Unfortunately, it is historically tied to the DOM, did its work on the main thread and couldn't be used in workers. This is terrible because many of the use cases it is really great for are really intense. This is a bad situation - luckily, we're working on it! Chris Lord has been working on OffscreenCanvas in WebKit and it looks great so far - everything except the text portions is done and I've been using it with great results.

OffscreenCanvas can be used in workers, and you can 'snap off' and transfer the context from a DOM rendered canvas to a worker too. So great! And guess why we're investing in it? You guessed it: Embedded.
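
A rough sketch of that pattern from the main thread (the worker.js file name is just a placeholder; the worker receives the OffscreenCanvas in its message handler and does its drawing there):

// Detach rendering from a DOM canvas and hand it to a worker
const canvas = document.querySelector('canvas');
const offscreen = canvas.transferControlToOffscreen();
const worker = new Worker('worker.js');
// OffscreenCanvas is transferable, so it goes in the transfer list
worker.postMessage({ canvas: offscreen }, [offscreen]);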

WebXR

I mean, this is kinda huge right? Igalia is investing to help move XR forward in WebKit - part of this is generalized for WebKit, and I think that is kind of awesome. Still early days and there's a lot to do, but this is pretty exciting to see developing and I'm proud that Igalia is helping make it happen!

Important JavaScript stuff!

We pushed and shipped public/private instance and static fields in JavaScriptCore (WebKit). Private methods are ongoing. We've really improved coordination with WebKit at large this year, and we're constantly improving the 32-bit JSC too. We're working on WebAssembly and numerous emerging TC39 specifications we're excited about: module blocks, decorators, bundling, Realms, decimal, records and tuples, Temporal, and lots of things for ECMA 402 (Internationalization) too!
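
If the names are fuzzy, here's a quick sketch of what those field flavors look like (private methods, still in progress, use the same # convention):

class Counter {
  count = 0;              // public instance field
  #step = 1;              // private instance field, only accessible inside the class
  static created = 0;     // public static field
  static #registry = [];  // private static field

  increment() {
    this.count += this.#step;
  }
}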

Web Related, non-feature things

There's a lot of other things that we accomplished this year at Igalia which are pretty exciting too!

  • Open Prioritization! This year we ran a pilot experiment called "Open Prioritization" to start some big and complex discussions and attempt to find ways to give more people a voice in the prioritization of work on the commons. We partnered with Open Collective and while I can't say the first experiment was flawless, we learned a lot and are moving forward with a project picked and funded by the larger community as well as establishing a collective to continue to do this!

  • Our new podcast! This year we also launched a podcast. We've had great discussions on complex topics and had amazing guests, including one of the creators of CSS Håkon Wium Lie, people from several browser vendors past and present, people who helped drive the two historically special embeddable forms in HTML (MathML and SVG), and some developer and web friends. It's available via all of your favorite podcasting services, a playlist on our YouTube channel and on our website

  • ipfs This year we also began working with Protocol Labs to improve some things around protocol handers - those are great for the web at large and it's interesting and exciting to see what is happening with things like IPFS!

  • Joined MDN PAB This year Igalia also joined the MDN Product Advisory Board, and we're looking forward to helping ensure that the vital resource that is MDN remains healthy!

  • WPE You might know that Igalia are the maintainers of a few of the official WebKit ports, and one of them is for embedded systems. I'm really pleased with all of the things that this has allowed us to help drive for WebKit and the larger Web Platform. However, embedded "browsers" were kind of a new topic to me when I began my work here, and it's somewhat different than the sorts of challenges I am used to. With embedded systems you typically build the OS specifically for the device. Sharing the same web tech commons is phenomenal, but for many developers like myself, questions about embedded were difficult to explore on my own as someone who rarely compiles a browser, much less an operating system! I'm really pleased with the progress we've made on this, making wpewebkit.org more friendly, informative and relevant to people who might not already be experts at this, including making easy step-wise options available for people to explore. Someone with no experience can download a Raspbian-based OS with WPE WebKit on it and flash it right onto a Raspberry Pi just to explore. For a lot of pet projects, you can do a lot with that too. That's not super representative of a good embedded system in terms of performance and things, but it is very easy and it's maintained by us, so it's pretty up to date. A short step away, if you're pretty comfortable with a Linux shell and ssh, you can get a minimal install optimized for the Raspberry Pi 3 that you can flash right onto your Pi and that runs a Weston Wayland compositor. Finally, if you already kind of know what you're doing, we maintain Yocto recipes for developers to more easily build and maintain their real systems.

  • Vulkan! driver - You might know that Igalia does all kinds of stuff beyond just the Web; we work on all of the things that power the web too, and kind of all the way down - so we have lots of areas of specialization. I think it's really cool that we partnered with Raspberry Pi to create a Vulkan driver for the Mesa graphics driver stack for the latest generation of Raspberry Pi, achieving conformance in less than 1 year and passing over 100k tests from Khronos' Conformance Test Suite since our initial announcement of drawing the first triangle!

Looking forward...???

So, what exciting things can we look forward to in 2021? Definitely, advancing all of the things above - and that's exciting enough. It's hard to know what to be most excited for, but I'm personally really looking forward to watching Open Prioritization grow and to getting a really good idea of, and very concrete progress on, the Container Queries issues. We've also got our eyes on some new things we'll be starting to talk about in the next year, so stay tuned on those too.

One that I'd like to mention, however, is tabs. Over the past year, Igalia has begun to get involved with efforts like OpenUI, and I've been talking to developers and peers at Igalia about tabs. I had some thoughts and ideas that I posted earlier this year. Just recently some actual work and collaboration has been done - getting a number of us with similar ideas together to sort out a useful custom element that we can test out, as well as working in OpenUI on aligning all of the stars we'll need to align as we attempt to approach something like standard tabs. It is very early days here, but we've gone from a vague blog post to some actual involvement and we're getting behind the idea - which is pretty exciting, and I can't wait to say more concrete things here!

December 22, 2020 05:00 AM

December 21, 2020

Oriol Brufau

CSS ::marker pseudo-element in Chromium

Introduction

Did you know that CSS makes it possible to style list markers?

In the past, if you wanted to customize the bullets or numbers in a list, you would probably have to hide the native markers with list-style: none, and then add fake markers with ::before.
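
That classic workaround looked more or less like this:

ul {
  list-style: none;
}
li::before {
  content: "★ ";
  color: green;
}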

However, now you can just use the ::marker pseudo-element in order to style the native markers directly!

If you are not familiar with it, I suggest reading these articles first:

In this post, I will explain the deep technical details of how I implemented ::marker in Chromium.

Thanks to Bloomberg for sponsoring Igalia to do it!

Implementing list-style-type: <string>

Before starting to work on ::marker itself, I decided to add support for string values in list-style-type. It seemed like quite a useful feature for authors, and Firefox already shipped it in 2015. Also, it’s like a more limited version of content in ::marker, so it was a good introduction.

It was relatively straightforward to implement. I did it in a single patch, https://crrev.com/704563, which landed in Chromium 79. Then I also ported it into WebKit; it’s been available since Safari Technology Preview 115.

<ol style="list-style-type: '★ '">
  <li>Lorem</li>
  <li>Ipsum</li>
</ol>

screenshot

Parsing and computation

The interesting thing to mention is that list-style-type had been implemented with a keyword template, so its value would be internally stored using an enum, and it would benefit from the parser fast path for keywords. I didn’t want to lose that, so I followed the same approach as for display, which also used to be a single-keyword property, until Houdini extended its syntax with layout(<ident>).

Basically, I treated list-style-type as a partial keyword property. This means that it keeps the parser fast path for keyword values, but in case of failure it falls back to the normal parser, where I accepted a string value.

When a string is provided, the internal list-style-type value is set to a special EListStyleType::kString enum value, and the string is stored in an extra ListStyleStringValue field.

Layout

From a layout point of view, I had to modify both LayoutNG and legacy code. LayoutNG is a new layout engine for Chromium that has been designed for the needs of modern scalable web applications. It was released in Chrome 77 for block and inline layout, but some CSS features like multi-column haven’t been implemented in LayoutNG yet, so they force Chromium to use the old legacy engine.

It was mostly a matter of tweaking LayoutNGListItem (for LayoutNG) and LayoutListMarker (for legacy) in order to retrieve the string from ListStyleStringValue when the ListStyleType was EListStyleType::kString, and making sure to update the marker when ListStyleStringValue changed.

Also, string values are somewhat special because they don’t have a suffix, unlike numeric values, which are suffixed with a dot and space (like 1. ), or symbolic values, which get just a trailing space.

It’s noteworthy that until this point, markers didn’t have to care about mixed bidi. But now you can have things like list-style-type: "aال", that is: U+0061 a, U+0627 ا, U+0644 ل. Note that ا is written before ل, but since they are Arabic characters, ا appears on the right.

This is relevant because the marker is supposed to be isolated from the text in the list item, so in LayoutNG I had to set unicode-bidi: isolate to inside markers. It wasn’t necessary for outside markers since they are implemented as inline-blocks, which are already isolated.

In legacy layout, markers don’t actually have their text as a child, it’s just a paint-time effect. As such, no bidi reordering happens, and aال doesn’t render correctly:

<li style="list-style: 'aال - ' inside">text</li>

LayoutNG: screenshot vs. legacy: screenshot

At that point I decided to leave it this way, but this kind of problem in legacy layout would keep haunting me while implementing ::marker. Keep reading to know the bloody details!

::marker parsing and computation

Here I started working on the actual ::marker pseudo-element. As a first step, in https://crrev.com/709390 I recognized ::marker as a valid selector (behind a flag), added a usage counter, and defined a new PseudoId::kPseudoIdMarker to identify it in the node tree.

It’s important to note that list markers were still anonymous boxes, there was no actual ::marker pseudo-element, so kPseudoIdMarker wasn’t actually used yet.

Something that needs to be taken into account when using ::marker is that the layout model for outside positioning is not fully defined. Therefore, in order to prevent authors from relying on implementation-defined behaviors that may change in the future, the CSSWG decided to restrict which properties can actually be used on ::marker.

I implemented this restriction in https://crrev.com/710995, using a ValidPropertyFilter just like it was done for ::first-letter and ::cue. But note this was later refactored, and now whether a property applies to ::marker or not is specified in the property definition in css_properties.json5.

At this point, ::marker only allowed:

  • All font properties
  • Custom properties
  • color
  • content
  • direction
  • text-combine-upright
  • unicode-bidi
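
In practice this means that, in a rule like the following, the color declaration is kept while the background one is simply dropped, since it is not on the list:

::marker {
  color: green;       /* allowed */
  background: yellow; /* not allowed on ::marker, so it is dropped */
}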

Using ::marker styles

At this point, ::marker was a valid selector, but list markers weren’t using ::marker styles. So in https://crrev.com/711883 I just took these styles and assigned them to the markers.

This simple patch was the real deal, making Chromium’s implementation of ::marker match WebKit’s one, which shipped in 2017. When enabling the runtime flag, you could style markers:

<style>
::marker {
  color: green;
  font-weight: bold;
}
</style>
<ol>
  <li>Lorem</li>
  <li>Ipsum</li>
</ol>

screenshot

This landed in Chromium 80. So, how come I didn’t ship ::marker until 86?

The answer is that, while the basic functionality was working fine, I wanted to provide a full and solid implementation. And it was not yet the case, since content was not working, and markers were still anonymous boxes that just happened to get assigned the styles for ::marker pseudo-elements, but there were no actual ::marker pseudo-elements.

Support content in LayoutNG

Adding support for the content property was relatively easy in LayoutNG, since I could reuse the existing logic for ::before and ::after.

Roughly it was a matter of ignoring list-style-type and list-style-image in non-normal cases, and using the LayoutObject of the ContentData as the children. This was not possible in legacy, since LayoutListMarker can’t have children.

It may be worth it to summarize the different LayoutObject classes for list markers:

  • LayoutListMarker, based on LayoutBox, for legacy markers.
  • LayoutNGListMarker, based on LayoutNGBlockFlowMixin<LayoutBlockFlow>, for LayoutNG markers with an outside position.
  • LayoutNGInsideListMarker, based on LayoutInline, for LayoutNG markers with an inside position.

It’s important to note that non-normal markers were actual pseudo-elements, their LayoutNGListMarker or LayoutNGInsideListMarker were no longer anonymous, they had an originating PseudoElement in the node tree.

This means that I had to add logic for attaching, detaching and rebuilding kPseudoIdMarker pseudo-elements, add LayoutObjectFactory::CreateListMarker(), and make LayoutTreeBuilderTraversal and the Node::PseudoAware* methods aware of ::marker.

Most of it was done in https://crrev.com/718609.

Another problem that I had to address was that, until this point, both content: normal and content: none were considered to be synonymous, and were internally stored as nullptr.

However, unlike in ::before and ::after, normal and none have different behaviors in ::marker: the former decides the contents from the list-style properties, the latter prevents the ::marker from generating boxes.

Therefore, in https://crrev.com/732549 I implemented content: none as a NoneContentData, and replaced the HasContent() helper function with the more specific ContentBehavesAsNormal() and ContentPreventsBoxGeneration().
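
In other words, these two declarations now behave differently: the first keeps the marker generated from the list-style properties, while the second prevents the marker box from being generated at all (the class names are just for illustration):

li.numbered::marker { content: normal; } /* still shows 1. 2. 3. … */
li.bare::marker     { content: none; }   /* no marker box is generated */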

Default styles

According to the spec, markers needed to get assigned these styles in UA origin:

unicode-bidi: isolate;
font-variant-numeric: tabular-nums;

At this point, the ComputedStyle for a marker could be created in different ways:

  • If there was some ::marker selector, by running the cascade normally.
  • Otherwise, LayoutListMarker or LayoutNGListItem would create the style from scratch.

First, in https://crrev.com/720875 I made all StyleResolver::PseudoStyleForElementInternal, LayoutListMarker::ListItemStyleDidChange and LayoutNGListItem::UpdateMarker set these UA rules.

Then in https://crrev.com/725913 I made it so that markers would always run the cascade, unifying the logic in PseudoStyleForElementInternal. But this way of injecting the UA styles was a bit hacky and problematic.

So finally, in https://crrev.com/779284 I implemented it in the proper way, using a real UA stylesheet. However, I took care of preventing that from triggering SetHasPseudoElementStyle, which would have defeated some optimizations.

Interestingly, these UA styles use a ::marker selector, but they also affect nested ::before::marker and ::after::marker pseudo-elements. That’s because I took advantage of a bug in the style resolver, so that I wouldn’t have to implement the nested ::marker selectors. The bug is disabled for non-UA styles.

LayoutNGListItem::UpdateMarker also had some style tweaks that I moved into the style adjuster instead of to the UA sheet, because the exact styles depend on the marker:

  • Outside markers get display: inline-block, because they must be block containers.
  • Outside markers get white-space: pre, to prevent their trailing space from being trimmed.
  • Inside markers get some margins, depending on list-style-type.

I did that in https://crrev.com/725091 and https://crrev.com/728176.

Some fun: 99.9% performance regression

An implication of my work on the default marker styles was that the StyleType() became kPseudoIdMarker instead of kPseudoIdNone.

This made LayoutObject::PropagateStyleToAnonymousChildren() do more work, causing the flexbox_with_list_item perf test to worsen by 99.9%!

Performance graph

I fixed it in https://crrev.com/722421 by returning early for markers with content: normal, which didn’t need that work anyways.

Once I completed the ::marker implementation, I tried reverting the fix, and then the test only worsened by 2-3%. So I guess the big regression was caused by the interaction of multiple factors, and the other factors were later fixed or avoided.

Developer tools

It was important for me to expose ::marker in the devtools just like a ::before or ::after. Not just because I thought it would be beneficial for authors, but also because it helped me a lot when implementing ::marker.

So first I made the Styles panel expose the ::marker styles when inspecting the originating list item (https://crrev.com/724094).

Devtools ::marker styles

And then I made ::marker pseudo-elements inspectable in the Elements panel (https://crrev.com/724122 and https://crrev.com/c/1934220).

Devtools ::marker tree

However, note this only worked for actual ::marker pseudo-elements.

LayoutNG markers as real pseudo-elements

As previously stated, only non-normal markers were internally implemented as actual pseudo-elements; markers with content: normal were just anonymous boxes.

So normal markers wouldn’t appear in devtools, and would yield incorrect values in getComputedStyle:

getComputedStyle(listItem, "::marker").width; // "auto"

According to CSSOM that’s supposed to be the used width in pixels, but since there was no actual ::marker pseudo-element, it would just return the computed value: auto.

So in https://crrev.com/731964 I implemented LayoutNG normal markers as real pseudo-elements. It’s a big patch, though mostly that’s because I had to update several test expectations.

Another advantage was that non-normal markers benefited from the much vaster test coverage for normal ones. For example, some accessibility code was expecting markers to be anonymous; I noticed this thanks to existing tests with normal markers. Without this change I might have missed that non-normal ones weren’t handled properly.

And a nice side-effect that I wasn’t expecting was that the flexbox_with_list_item perf test improved by 30-40%. Nice!

It’s worth noting that until this point, pseudo-elements could only be originated by an element. However, ::before and ::after pseudo-elements can have display: list-item and thus have a nested marker.

Due to the lack of support for ::before::marker and ::after::marker selectors, I could previously assume that nested markers would have the initial content: normal, and thus be anonymous. But this was no longer the case, so in https://crrev.com/730531 I added support for nested pseudo-elements. However, the style resolver is still not able to handle them properly, so nested selectors don’t work.
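
For example, markup like this gives the ::before its own nested marker; styling that nested marker with a ::before::marker selector is precisely what still doesn’t work:

<style>
.item::before {
  content: "generated item";
  display: list-item; /* the ::before now gets its own nested ::marker */
}
</style>
<div class="item"></div>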

A consequence of implementing LayoutNG markers as pseudo-elements was that they became independent, they were no longer created and destroyed by LayoutNGListItem. But the common logic for LayoutNGListMarker and LayoutNGInsideListMarker was still in LayoutNGListItem, so this made it difficult to keep the marker information in sync. Therefore, in https://crrev.com/735167 I moved the common logic into a new ListMarker class, and each LayoutNG marker class would own a ListMarker instance.

I also renamed LayoutNGListMarker to LayoutNGOutsideListMarker, since the old name was misleading.

Legacy markers as real pseudo-elements

Since I had already added the changes needed to implement all LayoutNG markers as pseudo-elements, I thought that doing the same for legacy markers would be easier.

But I was wrong! The thing is that legacy layout already had some bugs affecting markers, but they would only be triggered when dynamically updating the styles of the list item. But there aren’t many tests that do that, so they went unnoticed… until I tried my patch, which surfaced these issues in the initial layout, making some tests fail.

So first I had to fix bugs 1048672, 1049633, and 1051114.

Then there was also bug 1051685, involving selections or drag-and-drop with pseudo-elements like ::before or ::after. So turning markers into pseudo-elements made them have the same problem, causing a test failure.

I could finally land my patch in https://crrev.com/745012, which also improved performance like in LayoutNG.

Animations & transitions

While I was still working on ::marker, the CSSWG decided to expand the list of allowed properties in order to include animations and transitions. I did so in https://crrev.com/753752.

The tricky part was that only allowed properties could be animated. For example,

<style>
@keyframes anim {
  from { color: #c0c; background: #0cc }
  to   { color: #0cc; background: #c0c }
}
::marker {
  animation: anim 1s infinite alternate;
}
</style>
<ol><li>Non-animated text</li></ol>

Animated ::marker

Only the color of the marker is animated, not the background.

counter(list-item) inside <ol>

::before and ::after pseudo-elements already had the bug that, when referencing the list-item counter inside an <ol>, they would produce the wrong number, usually 1 unit greater.

Of course, ::marker inherited the same problem. And this was breaking one of the important use-cases, which is being able to customize the marker text with content.

For example,

<style>
::marker { content: "[" counter(list-item) "] " }
</style>
<ol>
  <li>foo</li>
  <li>bar</li>
  <li>baz</li>
</ol>

would start counting from 2 instead of 1:

::marker counter bug

Luckily, WebKit had already fixed this problem, so I could copy their solution. Unluckily, they mixed it with a big irrelevant refactoring, so I had to spend some time understanding which part was the actual fix. I ported it into Chromium in https://crrev.com/783493.

Support content in legacy

The only missing thing to do was adding support for content in legacy layout. The problem was that LayoutListMarker can’t have children, so it’s not possible to just insert the layout object produced by the ContentData.

Then, my idea was replacing LayoutListMarker with two new classes:

  • LayoutOutsideListMarker, for markers with outside positioning.
  • LayoutInsideListMarker, for markers with inside positioning.

and they could share the ListMarker logic with LayoutNG markers.

However, when I started working on this, something big happened: the COVID-19 pandemic.

SARS-CoV-2

And Google decided to skip Chromium 82 due to the situation, which is relevant because, in order to be able to merge patches easily, they wanted to avoid big refactorings.

And a big refactoring is precisely what I needed! So I had to wait until Chromium 83 reached stable.

Also, Google engineers were not convinced by my proposal, because it would imply that legacy markers would use more memory and would be slower, since they would have children even with content: normal.

So I changed my strategy as follows:

  • Keep LayoutListMarker for normal markers.
  • Add LayoutOutsideListMarker for non-normal outside markers.
  • Add LayoutInsideListMarker for non-normal inside markers.

This was done in this chain of CLs: 2109697, 2109771, 2110630, 2246514, 2252244, 2252041, 2252189, 2246573, 2258732.

::marker enabled by default

Finally the ::marker implementation was complete!

To summarize, list markers ended up implemented among 5 different layout classes:

  • LayoutListMarker
    • Used for normal markers in legacy layout.
    • Based on LayoutBox.
    • Can’t have children, doesn’t use ListMarker.
  • LayoutOutsideListMarker
    • Used for outside markers in legacy layout.
    • Based on LayoutBlockFlow, i.e. it’s a block container.
    • Has children, uses ListMarker to keep them updated.
  • LayoutInsideListMarker
    • Used for inside markers in legacy layout.
    • Based on LayoutInline, i.e. it’s an inline box.
    • Has children, uses ListMarker to keep them updated.
  • LayoutNGOutsideListMarker
    • Used for outside markers in LayoutNG.
    • Based on LayoutNGBlockFlowMixin<LayoutBlockFlow>, i.e. it’s a block container.
    • Has children, uses ListMarker to keep them updated.
  • LayoutNGInsideListMarker
    • Used for inside markers in LayoutNG.
    • Based on LayoutInline, i.e. it’s an inline box.
    • Has children, uses ListMarker to keep them updated.

So at this point I just landed https://crrev.com/786976 to enable ::marker by default. This happened in Chromium 86.0.4198.0.

Allowing more properties

After shipping ::marker, I continued doing small tweaks in order to align the behavior with more recent CSSWG resolutions.

The first one was that, if you set text-transform on a list item or ancestor, the ::marker shouldn’t inherit it by default. For example,

<ol style="list-style-type: lower-alpha; text-transform: uppercase">
  <li>foo</li>
</ol>

should have a lowercase a, not A:

::marker text-transform

Therefore, in https://crrev.com/791815 I added text-transform: none to the ::marker UA rules, but also allowed authors to specify another value if they want to.

Then, the CSSWG also resolved that ::marker should allow inherited properties that apply to text which don’t depend on box geometry. And other properties, unless whitelisted, shouldn’t affect markers, even when inherited from an ancestor.

Therefore, I added support for some text and text decoration properties, and also for line-height. On the other hand, I blocked inheritance of text-indent and text-align.

That was done in CLs 791815, 2382750, 2388384, 2391242, 2396125, 2438413.

The outcome was that, in Chromium, ::marker accepts these properties:

  • Animation properties: animation-delay, animation-direction, animation-duration, animation-fill-mode, animation-iteration-count, animation-name, animation-play-state, animation-timeline, animation-timing-function

  • Transition properties: transition-delay, transition-duration, transition-property, transition-timing-function

  • Font properties: font-family, font-kerning, font-optical-sizing, font-size, font-size-adjust, font-stretch, font-style, font-variant-ligatures, font-variant-caps, font-variant-east-asian, font-variant-numeric, font-weight, font-feature-settings, font-variation-settings

  • Text properties: hyphens, letter-spacing, line-break, overflow-wrap, tab-size, text-transform, white-space, word-break, word-spacing

  • Text decoration properties: text-decoration-skip-ink, text-shadow, -webkit-text-emphasis-color, -webkit-text-emphasis-position, -webkit-text-emphasis-style

  • Writing mode properties: direction, text-combine-upright, unicode-bidi

  • Others: color, content, line-height

However, note that they may end up not having the desired effect in some cases:

  • The style adjuster forces white-space: pre in outside markers, so you can only customize white-space in inside ones.

  • text-combine-upright doesn’t work in pseudo-elements (bug 1060007). So setting it will only affect the computed style, and will also force legacy layout, but it won’t turn the marker text upright.

  • In legacy layout, the marker has no actual contents. So text properties, text decoration properties, unicode-bidi and line-height don’t work.

And this is the default UA stylesheet for markers:

::marker {
  unicode-bidi: isolate;
  font-variant-numeric: tabular-nums;
  text-transform: none;
  text-indent: 0 !important;
  text-align: start !important;
  text-align-last: start !important;
}

The final change, in https://crrev.com/837424, was the removal of the CSSMarkerPseudoElement runtime flag. Since 89.0.4358.0, it’s no longer possible to disable ::marker.

Overview

Implementing ::marker needed more than 100 patches in total, several refactorings, some existing bug fixes, and various CSSWG resolutions.

I also added lots of new WPT tests, in addition to the existing ones created by Apple and Mozilla. For every patch that had an observable improved behavior, I tried to cover it with a test. Most of them are in https://wpt.fyi/results/css/css-pseudo?q=marker, though some are in css-lists, and others are Chromium-internal since they test non-standard behavior.

Note my work didn’t include the ::before::marker and ::after::marker selectors, which haven’t been implemented in WebKit or Firefox either. What remains to be done is making the selector parser handle nested pseudo-elements properly.

Also, I kept the disclosure triangle of a <summary> as a ::-webkit-details-marker, but since Chromium 89 it’s a ::marker as expected, thanks to Kent Tamura.

by Oriol Brufau at December 21, 2020 09:00 PM

December 03, 2020

Alberto Garcia

Subcluster allocation for qcow2 images

In previous blog posts I talked about QEMU’s qcow2 file format and how to make it faster. This post gives an overview of how the data is structured inside the image and how that affects performance, and this presentation at KVM Forum 2017 goes further into the topic.

This time I will talk about a new extension to the qcow2 format that seeks to improve its performance and reduce its memory requirements.

Let’s start by describing the problem.

Limitations of qcow2

One of the most important parameters when creating a new qcow2 image is the cluster size. Much like a filesystem’s block size, the qcow2 cluster size indicates the minimum unit of allocation. One difference, however, is that while filesystems tend to use small blocks (4 KB is a common size in ext4, ntfs or hfs+), the standard qcow2 cluster size is 64 KB. This adds some overhead because QEMU always needs to write complete clusters, so it often ends up doing copy-on-write and writing more data to the qcow2 image than what the virtual machine requested. This gets worse if the image has a backing file, because then QEMU needs to copy data from there, so a write request not only becomes larger but it also involves additional read requests from the backing file(s).

Because of that qcow2 images with larger cluster sizes tend to:

  • grow faster, wasting more disk space and duplicating data.
  • increase the amount of necessary I/O during cluster allocation,
    reducing the allocation performance.

Unfortunately, reducing the cluster size is in general not an option because it also has an impact on the amount of metadata used internally by qcow2 (reference counts, guest-to-host cluster mapping). Decreasing the cluster size increases the number of clusters and the amount of necessary metadata. This has a direct negative impact on I/O performance, which can be mitigated by caching it in RAM, therefore increasing the memory requirements (the aforementioned post covers this in more detail).

Subcluster allocation

The problems described in the previous section are well-known consequences of the design of the qcow2 format and they have been discussed over the years.

I have been working on a way to improve the situation and the work is now finished and available in QEMU 5.2 as a new extension to the qcow2 format called extended L2 entries.

The so-called L2 tables are used to map guest addresses to data clusters. With extended L2 entries we can store more information about the status of each data cluster, and this allows us to have allocation at the subcluster level.

The basic idea is that data clusters are now divided into 32 subclusters of the same size, and each one of them can be allocated separately. This allows combining the benefits of larger cluster sizes (less metadata and RAM requirements) with the benefits of smaller units of allocation (less copy-on-write, smaller images). If the subcluster size matches the block size of the filesystem used inside the virtual machine then we can eliminate the need for copy-on-write entirely.

So with subcluster allocation we get:

  • Sixteen times less metadata per unit of allocation, greatly reducing the amount of necessary L2 cache.
  • Much faster I/O during allocation when the image has a backing file, up to 10-15 times more I/O operations per second for the same cluster size in my tests (see chart below).
  • Smaller images and less duplication of data.

This figure shows the average number of I/O operations per second that I get with 4KB random write requests to an empty 40GB image with a fully populated backing file.

I/O performance comparison between traditional and extended qcow2 images

Things to take into account:

  • The performance improvements described earlier happen during allocation. Writing to already allocated (sub)clusters won’t be any faster.
  • If the image does not have a backing file, chances are that the allocation performance is equally fast, with or without extended L2 entries. This depends on the filesystem, so it should be tested before enabling this feature (but note that the other benefits mentioned above still apply).
  • Images with extended L2 entries are sparse, that is, they have holes and because of that their apparent size will be larger than the actual disk usage.
  • It is not recommended to enable this feature in compressed images, as compressed clusters cannot take advantage of any of the benefits.
  • Images with extended L2 entries cannot be read with older versions of QEMU.

How to use this?

Extended L2 entries are available starting from QEMU 5.2. Due to the nature of the changes it is unlikely that this feature will be backported to an earlier version of QEMU.

In order to test this you simply need to create an image with extended_l2=on, and you also probably want to use a larger cluster size (the default is 64 KB, remember that every cluster has 32 subclusters). Here is an example:

$ qemu-img create -f qcow2 -o extended_l2=on,cluster_size=128k img.qcow2 1T

And that’s all you need to do. Once the image is created all allocations will happen at the subcluster level.

More information

This work was presented at the 2020 edition of the KVM Forum. Here is the video recording of the presentation, where I cover all this in more detail:

You can also find the slides here.

Acknowledgments

This work has been possible thanks to Outscale, who have been sponsoring Igalia and my work in QEMU.

Igalia and Outscale

And thanks of course to the rest of the QEMU development team for their feedback and help with this!

by berto at December 03, 2020 06:15 PM

November 29, 2020

Philippe Normand

Catching up on WebKit GStreamer WebAudio backends maintenance

Over the past few months the WebKit development team has been working on modernizing support for the WebAudio specification. This post highlights some of the changes that were recently merged, focusing on the GStreamer ports.

My fellow WebKit colleague, Chris Dumez, has been very active lately, updating the WebAudio implementation for the mac ports in order to comply with the latest changes of the specification. His contributions have been documented in the Safari Technology Preview release notes for version 113, version 114, version 115 and version 116. This is great for the WebKit project! Since the initial implementation landed around 2011, there wasn’t much activity and over the years our implementation started lagging behind other web engines in terms of features and spec compliance. So, many thanks Chris, I think you’re making a lot of WebAudio web developers very happy these days :)

The flip side of the coin is that some of these changes broke the GStreamer backends. As Chris is focusing mostly on the Apple ports, a few bugs slipped in; they were noticed by the CI test bots and dutifully gardened by our bot sheriffs. Those backends were upstreamed in 2012 and since then I didn't devote much time to their maintenance, aside from casual bug-fixing.

One of the WebAudio features recently supported by WebKit is the Audio Worklet interface which allows applications to perform audio processing in a dedicated thread, thus relieving some pressure off the main thread and ensuring a glitch-free WebAudio rendering. I added support for this feature in r268579. Folks eager to test this can try the GTK nightly MiniBrowser with the demos:

$ wget https://trac.webkit.org/export/270226/webkit/trunk/Tools/Scripts/webkit-flatpak-run-nightly
$ chmod +x webkit-flatpak-run-nightly
$ python3 webkit-flatpak-run-nightly --gtk MiniBrowser https://googlechromelabs.github.io/web-audio-samples/audio-worklet/

For many years our AudioFileReader implementation was limited to mono and stereo audio layouts. This limitation was lifted in r269104, allowing for processing of up to 5.1 surround audio files in the AudioBufferSourceNode.

Our AudioDestination, used for audio playback, was only able to render stereo. It is now able to probe the GStreamer platform audio sink for the maximum number of channels it can handle, since r268727. Support for AudioContext getOutputTimestamp was hooked up in the GStreamer backend in r266109.

The WebAudio spec has a MediaStreamAudioDestinationNode for MediaStreams, allowing applications to feed audio samples from the WebAudio pipeline into outgoing WebRTC streams. Since r269827 the GStreamer ports now support this feature as well! Similarly, incoming WebRTC streams or capture devices can stream their audio samples to a WebAudio pipeline; this has been supported for a couple of years already, contributed by my colleague Thibault Saunier.
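From the web-facing side this is all standard WebAudio/WebRTC API, nothing backend-specific; a minimal sketch of routing a WebAudio graph into an outgoing WebRTC track looks something like this:

const audioCtx = new AudioContext();
const source = audioCtx.createOscillator();            // any WebAudio source node
const dest = audioCtx.createMediaStreamDestination();  // MediaStreamAudioDestinationNode
source.connect(dest);
source.start();

// dest.stream is a MediaStream whose audio track can be fed to WebRTC.
const pc = new RTCPeerConnection();
const [track] = dest.stream.getAudioTracks();
pc.addTrack(track, dest.stream);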

Our GStreamer FFTFrame implementation was broken for a few weeks, while Chris was landing various improvements for the platform-agnostic and mac-specific implementations. I finally fixed it in r267471.

This is only the tip of the iceberg. A few more patches were merged, including some security-related bug-fixes. As the Web Platform keeps growing, supporting more and more multimedia-related use-cases, we, at the Igalia Multimedia team, are committed to maintain our position as GStreamer experts in the WebKit community.

by Philippe Normand at November 29, 2020 12:45 PM

November 26, 2020

Víctor Jáquez

Notes on using Emacs (LSP/ccls) for WebKit

I used to regard myself as an austere programmer in terms of tooling: Emacs —with a plain configuration— and grep. This approach forces you to understand all the elements involved in a project.

Some time ago I had to code in Rust, so I needed to learn the language as fast as possible. I looked for packages in MELPA that could help me to be productive quickly. Obviously, I installed rust-mode, but I also found racer for auto-completion. I tried it out. It was messy to set up and unstable, but it helped me to code while learning. When I felt comfortable with the base code, I uninstalled it.

This year I returned to work on WebKit. The last time I contributed to it was around five years ago, but now in a different area (still in the multimedia stack). WebKit is huge, and because of C++, I found gtags rather limited. Out of curiosity I looked for something similar to racer but for C++. And I spent a while digging on it.

The solution consists in the integration of three MELPA packages:

  • lsp-mode: a client for Language Server Protocol for Emacs.
  • company-mode: a text completion framework.
  • ccls: a C/C++ language server. In addition, emacs-ccls adds more functionality to lsp-mode.

(I known, there’s a simpler alternative to lsp-mode, but I haven’t tried it yet).

First we might explain what’s LSP. It stands for Language Server Protocol, defined with JSON-RPC messages, between the editor and the language server. It was orginally developed by Microsoft for Visual Studio, which purpose is to support auto-completion, finding symbol’s definition, to show early error markers, etc., inside the editor. Therefore, lsp-mode is an Emacs mode that communicates with different language servers in LSP and operates in Emacs accordingly.

In order to support the auto-completion use case, lsp-mode uses company-mode. This Emacs mode is capable of creating a floating context menu where the editing cursor is placed.

The third part of the puzzle is, of course, the language server. There are language servers for different programming languages. For C & C++ there are two servers: clangd and ccls. The former uses the Clang compiler; the latter can use either Clang, GCC or MSVC. Throughout this text ccls will be used, for reasons explained later. In between, emacs-ccls leverages and extends the support of ccls in lsp-mode, though it's not mandatory.

In short, the basic .emacs configuration, using use-package, would have these lines:


(use-package company
  :diminish
  :config (global-company-mode 1))

(use-package lsp-mode
  :diminish "L"
  :init (setq lsp-keymap-prefix "C-l"
              lsp-enable-file-watchers nil
              lsp-enable-on-type-formatting nil
              lsp-enable-snippet nil)
  :hook (c-mode-common . lsp-deferred)
  :commands (lsp lsp-deferred))

(use-package ccls
  :init (setq ccls-sem-highlight-method 'font-lock)
  :hook ((c-mode c++-mode objc-mode) . (lambda () (require 'ccls) (lsp-deferred))))

The snippet first configures company-mode. It is enabled globally because, normally, it is a nice feature to have, even in non-coding buffers, such as this very one, for writing a blog post in markdown format. Diminish mode hides or abbreviates the mode description in Emacs' mode line.

Later comes lsp-mode. It's big and aims to do a lot of things, so basically we have to tell it to disable certain features, such as the file watcher, something not viable in massive projects such as WebKit; as I don't use snippets (generic text templates), I also disable them; and finally, lsp-mode tries to format the code as you type, and I don't know how the code style is figured out, but in my experience it's always detected wrong, so I disabled it too. Finally, lsp-mode is launched when a buffer uses c-mode-common, shared by c++-mode too. lsp-mode is launched deferred, meaning it won't start up until the buffer is visible; this is important since we might want to delay ccls session creation until the buffer's .dir-locals.el file is processed, where it is configured for the specific project.

And lastly, the ccls configuration, hooked so that it runs when c-mode or c++-mode are loaded, again in a deferred fashion (already explained).

It’s important to understand how ccls works in order to integrate it in our workflow of a specific project, since it might need to be configured using Emacs’ per-directory local variales.

We are living in a post-Makefile world (almost); proof of that is ccls, which, instead of a makefile, uses a compilation database: a record of the compile options used to build the files in a project. It's commonly described in JSON, generated automatically by build systems such as Meson or CMake, and later consumed by ninja or ccls to execute the compilation. Bear in mind that ccls uses a cache, which can eat a couple of gigabytes of disk.
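For reference, a compilation database (usually a file named compile_commands.json) is just a JSON array of entries like the following; the paths and flags here are made up for illustration:

[
  {
    "directory": "/app/webkit/WebKitBuild/Release",
    "command": "c++ -std=c++17 -I../../Source -c ../../Source/WebCore/Example.cpp -o Example.o",
    "file": "../../Source/WebCore/Example.cpp"
  }
]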

Now, let’s review the concrete details of using these features with WebKit. Let me assume that WebKit local repository is cloned in ~/WebKit.

As you may know, the cool way to compile WebKit is with flatpak. Flatpak adds an indirection in the compilation process, since it’s done in an isolated environment, above the native system. As a consequence, ccls has to be the one inside the Flatpak environment. In ~/.local/bin/webkit-ccls:

#!/bin/sh
set -eu
cd $HOME/WebKit/
exec Tools/Scripts/webkit-flatpak -c ccls "$@"

Basically the script calls ccls inside flatpak, which is available in the SDK. And this is why we use ccls instead of clang, since clang is not provided.

By default ccls assumes the compilation database is in the project’s root directory, but in our case, it’s not, thus it is required to configure the database directory for our WebKit setup. For it, as we already said, a .dir-locals.el file is used.


((c-mode
  (indent-tabs-mode . nil)
  (c-basic-offset . 4))
 (c++-mode
  (indent-tabs-mode . nil)
  (c-basic-offset . 4))
 (java-mode
  (indent-tabs-mode . nil)
  (c-basic-offset . 4))
 (change-log-mode
  (indent-tabs-mode . nil))
 (nil
  (fill-column . 100)
  (ccls-executable . "/home/vjaquez/.local/bin/webkit-ccls")
  (ccls-initialization-options . (:compilationDatabaseDirectory "/app/webkit/WebKitBuild/Release"
                                  :cache (:directory ".ccls-cache")))
  (compile-command . "build-webkit --gtk --debug")))

As you can notice, ccls-executable is defined here, though it's not a safe local variable. So is ccls-initialization-options, which is a safe local variable. It is important to notice that the compilation database directory is a path inside flatpak, and that it always uses the Release path. I don't understand why, but the Debug path didn't work for me. This means that WebKit should be compiled as Release frequently, even if we only use the Debug type for coding (as you may see in my compile-command).

Update: Now we can explain why it’s important to configure lsp-mode as deferred: to avoid connections to ccls before processing the .dir-locals.el file.

And that’s all. Now I have early programming errors detection, auto-completion, and so on. I hope you find these notes helpful.

Update: Sadly, because of the flatpak indirection, finding symbols' definitions won't work, because the file paths stored in the ccls cache are relative to flatpak's file system. For that I still rely on global and its Emacs mode.

by vjaquez at November 26, 2020 04:20 PM

November 22, 2020

Eleni Maria Stea

FOSSCOMM 2020, and a status update on EXT_external_objects(_fd) extensions [en, gr]

FOSSCOMM (Free and Open Source Software Communities Meeting) is a Greek conference aiming to promote the use of FOSS in Greece and to bring FOSS enthusiasts together. It is organized entirely by volunteers and universities and takes place in a different city each year. This year it was virtual as Greece is under lockdown, and … Continue reading FOSSCOMM 2020, and a status update on EXT_external_objects(_fd) extensions [en, gr]

by hikiko at November 22, 2020 06:11 PM

November 20, 2020

Paulo Matos

A tour of the for..of implementation for 32bits JSC

We look at the implementation of the for-of intrinsic in 32bit JSC (JavaScriptCore).

More…

by Paulo Matos at November 20, 2020 02:00 PM

Maksim Sisov

Chrome/Chromium on Wayland: The Waylandification project.

It has been a long time since I wrote my last blog post and since I wrote about something that I and my colleagues at Igalia have been working on for the past 4 years. I have been postponing writing this post, waiting until something big happened. Well, something big just happened…

If you already know what Ozone is, then I am happy to tell you that Chromium for Linux includes Ozone by default now and can be enabled with runtime command line flags. If you are interested in trying Chrome/Chromium with native Wayland support, you are encouraged to download Google Chrome for developers and try Ozone/Wayland by running the browser with the following command line flags – '--enable-features=UseOzonePlatform --ozone-platform=wayland'.
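For example, assuming the developer build installs a binary called google-chrome-unstable (the exact name depends on your distribution and channel), the full invocation would look like this:

$ google-chrome-unstable --enable-features=UseOzonePlatform --ozone-platform=wayland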

If you don’t know what Ozone is, here’s a brief explanation, before I talk about the history, status and design of this effort.

What is Ozone?

The very first thing that one may think about when they hear “Ozone” is the gas or a thin layer of the Earth’s atmosphere. Well… it is partly correct. In the case of Chromium, it is a platform abstraction layer.

I will not go into many details, but here is the description of that layer from Chromium’s documentation about Ozone –
“Ozone is a platform abstraction layer beneath Aura, Chromium’s platform independent windowing system, that is used for low level input and graphics. Once complete, the abstraction will support underlying systems ranging from embedded SoC targets to new X11-alternative window systems on Linux such as Wayland or Mir to bring up Aura Chromium by providing an implementation of the platform interface.”.
If you are interested in more details, you are welcome to read the project’s documentation at https://chromium.googlesource.com.

The Summary of the Design of Ozone/Wayland

It has been a long time since Antonio Gomes started to work on this project. It started as a research project for our customer – Renesas Electronics, and was based on a former abstraction project with another clever name, “mus+ash” (pronounced “mustache”, you can read more about that here – Chromium, ozone, wayland and beyond).

Since that time, the project has been moved to downstream and back to upstream (because of some unknowns related to the “mus+ash”) and the design of Ozone integration has also been changed.

Currently, the Aura/Platform classes are injected into the Browser process and communicate directly with the underlying Ozone platforms including Wayland. In the browser process, Wayland creates a connection with a Wayland compositor, while in the GPU process, it only draws pixels into the created DMABUFs and neither receives events nor creates surfaces.

Migrating Away From X11-only Legacy Backend.

It is worth mentioning that Igalia has been working on both Ozone/X11 and Ozone/Wayland.

Since June 2020, we have been working on switching Ozone for Linux from being a compile-time choice to one that can be selected at runtime. At the moment, one can try Ozone by running Chrome downloaded from the development channel with the '--enable-features=UseOzonePlatform --ozone-platform=wayland/x11' runtime flags.

That approach allows us to gather a bigger audience of users who are willing to test the Ozone capabilities, and also to achieve better feature parity between non-Ozone/X11 and Ozone/X11/Wayland.

That is, most of the features and code paths are shared between the two implementations, and the paths that are not compatible with Ozone are being refactored at the moment.

Once all the incompatible paths are refactored (just a few of them remain) and all the available test suites are enabled on the Linux/Ozone bot, we will start what is known as a "finch trial". This allows Ozone to be enabled by default for some percentage of users (about 10%). If the finch trial goes well, the percentage of users will gradually grow to 100%, and we will start removing the old non-Ozone/X11 implementation.

Wayland + Tab Drag

If you’ve been trying it out, you might have already noticed that Ozone/Wayland does not support the Tab Drag feature well enough. The problem is the lack of the protocol for this feature.

At the moment, my colleague Nick Diego is working on the definition of the protocol for tab drag and implementation of that in Chromium.

Unfortunately, Ozone will fall back to x11/xwayland for compositors that do not support the aforementioned protocol. However, as more and more compositors gain support for it, Chrome will whitelist those compositors.

I won't go into the details of that effort here in this blog post, but rather just leave a link to the design document that Nick has created – Tab Dragging on Ozone/Wayland.

Summary

This blog post was a rather brief summary of the design, features, and status of the project. Thank you for reading it. Hopefully, when we start a finch trial, I will write another blog post telling you how it goes. Bye.

by msisov at November 20, 2020 11:28 AM

Brian Kardell

Open Prioritization First Experiment Wrap Up

Earlier this year, Igalia launched an experiment called "Open Prioritization" which, effectively, lets us pull money together to prioritize work on a feature in web browsers. In this piece I'll talk about the outcome, lessons learned along the way and next steps.

Our Open Prioritization experiment was a pretty big idea. On the surface, it seems to be asking a pretty simple question: "Could we crowdfund some development work on browsers?" However, it was quite a bit more involved in its goals, because there is a lot more hiding behind that question than meets the eye, and most of it is kind of difficult to talk about in purely theoretical ways. I'll get into all of that in a minute, but let's start with the results...

One project advances: :focus-visible

We began the experiment with six possible things we would try to crowdfund, and I'm pleased to say that one will advance: :focus-visible in WebKit.

We are working with Open Collective on next steps, as it involves some decision making on how we manage future experiments and a bigger idea too. However, very soon this will shift from a pledged collective, which just asked "would you financially support this project if it were to be offered?", to a proper way to collect funds for it. If you pledged, you will be contacted when it's ready and asked to fulfill your pledge, with information on how. We will also write a post when that happens, as it's likely that at least some people will not come back and fulfill their pledge.

As soon as this begins and enough funds are available, it will enter our developers' work queue and, as staff frees up, they will shift to begin work on implementing this in WebKit!

We did it! We at Igalia would like to say a giant "thank you" for all of those who helped support these efforts in improving our commons.

Retrospective

Let's talk about some of those bigger ideas this experiment was aiming to look at, and lessons we learned along the way, in retrospect...

  • Resources are finite. Prioritization is hard. No matter how big the budget, resources are still finite and work has to be prioritized. Even with only a few choices on the table to choose from, it's not necessarily easy or plain what to choose because there isn't a "right" answer.

  • There are reasonable arguments for prioritizing different things. The two finalists both had strong arguments from different angles, but even a step back - at least some people chose to pledge to something else. Some thought that supporting SVG path syntax in CSS was the best choice. They only pledged to that one. Many of these were final implementations, but this wasn't. Some people thought that advancing new things that no one else seems to be advancing was the way to go. Others supported it because they thought that it was really important to boost ones that help Mozilla. There just weren't enough people either seeing or agreeing with that weighting of things.

  • Cost is a factor. It's not an exclusive factor - the cheapest option by far (SVG Path in CSS/Mozilla) was eliminated earlier. There are other reasons :focus-visible made some giant leaps too, but - at the end of the day - the bar was also just lower. The second place project never actually managed to pull ahead, despite having more actual pledged dollars at one point.

  • Investing with uncertainty is especially hard. Just last week, Twitter exploded with excitement that Google was going to prototype some high level stuff with Container Queries. Fundamental to Google's intent is CSS containment in a single direction. CSS does not currently define containment in a single direction, but it does define the containment module where it would be defined. Containment was, in part, trying to lay some potential groundwork here. When we launched the project, I wrote about this: WebKit doesn't currently support the containment that is defined already and is a necessary prerequisite of any proposal involving that approach. The trouble is: We don't know if that will be the approach, and supporting it is a big task. Building a high level solution on the magic in our switch proposal, for example, doesn't require containment at all. Adding general containment support was the most expensive project on our list, by far. In fact, we could have done a couple of them for that price. This makes the value proposition of that work very speculative. Despite being potentially critically valuable for the single biggest/longest ask in CSS history - that project didn't make the finals when we put it to the public either.

  • Some things are difficult to predict. Going into this, I didn't know what to expect. A single viral tweet and a mass of developers pitching in $1 or $2 could, in theory, have funded any of these in hours. While I didn't expect that, I did kind of expect some amount of funds in the end would be of that sort. Interestingly, that didn't happen. At all. Despite lots of efforts trying to get lots of people to pledge very small dollar amounts, even asking specifically and making it possible to do with a tweet - very, very few did (literally 1 person on the winning project pledged less than five dollars). The most popular pledge was $20, with about a quarter of the pledges being over $50, and going up from there.

  • Matching funds are a really big deal. You can definitely see why fundraisers stress this. For the duration of this experiment, we saw periods of little actual movement, despite lots of tweets about it, likes and blog posts. There were a few giant leaps, and they all involved offers of matching dollars. Igalia ourselves, The A11Y Project and AMPHTML all had some offer of matching dollars that really seemed to inspire a lot more participation. The bigger the matching dollars available, the bigger the participation was.

  • Communication is hard. These might not have been the most ideal projects, in some respects. This last bullet is complicated enough that I'll give it its own section.

Lessons learned: Communication challenges

While I am tremendously happy that inert and :focus-visible were our finalists and both did very well, I am biased. I helped advocate for and specify these two features before I came to Igalia, working with some Googlers who also did the initial implementations. I also advocated for them to be included in the list of projects we offered. However, I failed to anticipate that the very reasons I did both of these would present challenges for the experiment, so I'd like to talk about that a bit...

Unfortunately a confluence of things led to a lot of chatter and blog posts which were effectively saying something along the lines of "Developers shouldn't have to foot the bill because Apple doesn't care about accessibility and refuses to implement something. They don't care, and this is proof - they are the last ones to not implement", and I wound up having a lot of conversations trying to correct the various misunderstandings here. That's not everyone else's fault, it's mine. I should have taken more time to communicate these things clearly, but for the record, nothing about this is really correct, so let me take the time to add the clarity for posterity...

  • On last implementations The second implementations only recently began or completed in Firefox, and one of those was also by Igalia. It seems really unfortunate and not exactly fair to suggest that being a few weeks/months behind, and especially when that came from outside help, is really an indictment. It's not. As an example, in the winning project, Chromium shipped this by default in October 2020. Firefox is right now pending a default release. Keep in mind that vendors don't have perfect insight into what is happening in other browsers, and even if they did reallocating resources isn't a thing that is done on a whim: Different browsers have different people with different skills and availability at any given point in time.

  • On refusal to implement This is 100% incorrect. I want to really stress this: Every item on our list comes from the list of things that are 'wants' from vendors themselves that need prioritization and are among the things they will be considering taking up next. If not funded here, it will definitely still get done - it's just impossible to say when really, and whatever priority they give it, they can't give to something else. This experiment gives us a more definite timeframe and frees them to spend that on implementing something else.

  • On web developers shouldn't have to foot the bill. Well, if you mean contributing dollars directly in crowdfunding in order to get the feature, we absolutely don't (see above bullet). However, generally speaking, this was in fact part of the conversation we wanted to start. Make no mistake: You are paying today, indirectly - and the actual investment back into the commons is inefficient and non-guaranteed. It's wonderful that 3 organizations have seemed to foot the bill for decades, but starting a conversation about that is definitely part of the goal here.

  • On "Apple doesn't care about accessibility" This one makes me really sad, not only because I know it isn't true and it seems easy to show otherwise, but also because there are some really great people from Apple like James Craig who absolutely not only care very deeply but often help lead on important things.

  • On "it's wrong to crowdfund accessibility features" Unfortunately, it seems the very things that drew me to work on these in the first place wound up working against us a little: Both inert and :focus-visible are interesting because they are "core features" of the platform that are useful to everyone. However, they are designed to sit at an intersection where they happily have really out-sized impact for accessibility. There are good polyfills for both of these which work well and somewhat reduce the degree of 'urgency'. I really thought that this made for a nice combination of interests/pains that might lead to good partnerships of investment where, yes, I imagined that perhaps some organizations interested in advancing the accessibility end of things, and who have historically contributed their labors, might see value in contributing to the flame more directly. Perhaps this wasn't as wise or obviously great as I imagined.

Wrapping up

All in all, in the end - despite some rocky communications - we are really encouraged by this first experiment. Thank you to everyone who pledged, boosted, blogged about the effort, etc. We're really looking forward to taking this much further next year, and we'd like to begin by asking you to share which specific projects you'd be interested in seeing or supporting in the future. Hit us up on @briankardell or @igalia!

November 20, 2020 05:00 AM

November 14, 2020

Eleni Maria Stea

A hack to display the Vulkan CTS tests output

Vulkan conformance tests for graphics drivers save their output images inside an XML file called TestResults.qpa. As binary outputs aren’t allowed, these output images (that would be saved as PNG otherwise) are encoded to text using Base64 and the result is printed between <Image></Image> XML tags. This is a problem sometimes, as external tools are … Continue reading A hack to display the Vulkan CTS tests output

by hikiko at November 14, 2020 03:20 AM

November 13, 2020

Alexander Dunaev

HiDPI support in Chromium for Wayland

It all started with this bug. The description sounded humble and harmless: the browser ignored some command line flag on Wayland. A screenshot was attached where it was clearly seen that Chromium (version 72 at that time, 2019 spring) did not respect the screen density and looked blurry on a HiDPI screen.

HiDPI literally means small pixels. It is hard to tell now what was the first HiDPI screen, but I assume that their wide recognition came around 2010 with Apple’s Retina displays. Ultra HD had been standardised in 2012, defining the minimum resolution and aspect ratio for what today is known informally as 4K—and 4K screens for laptops have pixels that are small enough to call it HiDPI. This Chromium issue, dated 2012, says that the Linux port lacks support for HiDPI while the Mac version has it already. On the other hand, HiDPI on Windows was tricky even in 2014.

‘That should be easy. Apparently it’s upscaled from low resolution. Wayland allows setting scale for the back buffers, likely you’ll have to add a single call somewhere in the window initialisation’, a colleague said.

Like many stories that begin this way, this turned out to be wrong. It was not so easy. Setting the buffer scale did the right thing indeed, but it was absolutely not enough. It turned out that support for HiDPI screens was entirely missing in our implementation of the Wayland client. On my way to the solution, I have found that scaling support in Wayland is non-trivial and sometimes confusing. Since I finished this work, I have been asked a few times about what happens there, so I thought that writing it all down in a post would be useful.

Background

Modern desktop environments usually allow configuring the scale of the display at global system level. This allows all standard controls and window decorations to be sized proportionally. For applications that use those standard controls, this is a happy end: everything will be scaled automatically. Those which prefer doing everything themselves have to get the current scale from the environment and adjust rendering.  Chromium does exactly that: inside it has a so-called device scale factor. This factor is applied equally to all sizes, locations, and when rendering images and fonts. No code has to bother ever. It works within this scaled coordinate system, known as device independent pixels, or DIP. The device scale factor can take fractional values like 1.5, but, because it is applied at the stage of rendering, the result looks nice. The system scale is used as default device scale factor, and the user can override it using the command line flag named --force-device-scale-factor. However, this is the very flag which did not work in the bug mentioned in the beginning of this story.
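For the record, using that flag is as simple as this (the binary name here is just illustrative):

$ chromium --force-device-scale-factor=1.5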

Note that for X11 the 'natural' scale is still the physical pixels.  Despite having the system-wide scale, the system talks to the application in pixels, not in DIP.  It is the application that is responsible for handling the scale properly. If it does not, it will look perfectly sharp, but its details will be perhaps too small for the naked eye.

However, Wayland does it a bit differently. The system scale there is respected by the compositor when pasting buffers rendered by clients. So, if some application has no idea about the system scale and renders itself normally, the compositor will upscale it.  This is what originally happened to Chromium: it simply drew itself at 100%, and that image was then stretched by the system compositor. Remember that the Wayland way is giving a buffer to each application and then compositing the screen from those buffers, so this approach of upscaling buffers rendered by applications is natural. The picture below shows what that looks like. The screenshot is taken on a HiDPI display, so in order to see the difference better, you may want to see the full version (click the picture to open).

What Chromium looked like when it did not set its back buffer scale

Firefox (left) vs. Chromium (right)

How do Wayland clients support HiDPI then?

Level 1. Basic support

Each physical output device is represented at the Wayland level by an object named output. This object has a special integer property named buffer scale that tells literally how many physical pixels are used to represent the single logical pixel. The application’s back buffer has that property too. If scales do not match, Wayland will simply scale the raster image, thus emulating the ‘normal DPI’ device for the application that is not aware of any buffer scales.

The first thing the window is supposed to do is to check the buffer scale of the output that it currently resides on, and to set the same value to its back buffer scale. This will basically make the application use all available physical pixels: as the scales of the buffer and the output are the same, Wayland will not re-scale the image.
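In plain libwayland-client terms this boils down to something like the following minimal sketch (not Chromium code; the helper name and bookkeeping are made up for illustration):

#include <wayland-client.h>

/* Tell the compositor that our buffer already contains 'scale' physical
 * pixels per logical pixel, so it must not upscale it again.
 * wl_surface_set_buffer_scale() needs the surface bound with interface
 * version 3 or later. */
static void apply_output_scale(struct wl_surface *surface, int32_t scale)
{
    wl_surface_set_buffer_scale(surface, scale);
    wl_surface_commit(surface);
}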

Back buffer scale is set but rendering is not aware of that

Chromium now renders sharp image but all details are half their normal size

The next thing is fixing the rendering so it would scale things to the right size.  Using the output buffer scale as default is a good choice: the result will be ‘normal size’.  For Chromium, this means simply setting the device scale factor to the output buffer scale.

Now Chromium looks right

All set now

The final bit is slightly trickier.  Wayland sends UI events in DIP, but expects the client to send surface bounds in physical pixels. That means that if we implement something like interactive resize of the window, we will also have to do some math to convert the units properly.

This is enough for the basic support.  The application will work well on a modern laptop with 4K display.  But what if more than a single display is connected, and they have different pixel density?

Level 2. Multiple displays

If there are several output devices present in the system, each one may have its own scale. This makes things more complicated, so a few improvements are needed.

First, the window wants to know that it has been moved to another device.  When that happens, the window will ask for the new buffer scale and update itself.

Second, there may be implementation-specific issues. For example, some Wayland servers initially put the new sub-surface (which is used for menus) onto the default output, even if its parent surface has been moved to another output.  This may cause weird changes of their scale during their initialisation.  In Chromium, we just made it so the sub-surface always takes its scale from the parent.

Level 3? Fractional scaling?

Not really. Fractional scaling is basically ‘non-even’ scales like 125%. The entire feature had been somewhat controversial when it had been announced, because of how rendering in Wayland is performed. Here, non-even scale inevitably uses raster operations which make the image blurry. However, all that is transparent to the applications. Nothing new has been introduced at the level of Wayland protocols.

Conclusion

Although this task was not as simple as we thought, in the end it turned out to be not too hard. Check the output scale, set the back buffer scale, scale the rendering, translate pixels to DIP and vice versa in certain points. Pretty straightforward, and if you are trying to do something related, I hope this post helps you.

The issue is that there are many implementations of Wayland servers out there, not all of them are consistent, and some of them have bugs. It is worth testing the solution on a few distinct Linux distributions and looking for discrepancies in behaviour.

Anyway, Chromium with native Wayland support has recently reached beta—and it supports HiDPI! There may be bugs too, but the basic support should work well. Try it, and let us know if something is not right.

Note: the Wayland support is so far experimental. To try it, you would need to launch chrome via the command line with two flags:
--enable-features=UseOzonePlatform
--ozone-platform=wayland

by Alex at November 13, 2020 10:10 AM

November 08, 2020

Eleni Maria Stea

[OpenGL and Vulkan Interoperability on Linux] Part 10: Reusing a Vulkan stencil buffer from OpenGL

This is 10th post on OpenGL and Vulkan interoperability with EXT_external_objects and EXT_external_objects_fd. We’ll see the last use case I’ve written for Piglit to test the extensions implementation on various mesa drivers as part of my work for Igalia. In this test a stencil buffer is allocated and filled with a pattern by Vulkan and … Continue reading [OpenGL and Vulkan Interoperability on Linux] Part 10: Reusing a Vulkan stencil buffer from OpenGL

by hikiko at November 08, 2020 10:07 PM

November 05, 2020

Iago Toral

V3DV + Zink

During my presentation at the X Developers Conference I stated that we had been mostly using the Khronos Vulkan Conformance Test suite (aka Vulkan CTS) to validate our Vulkan driver for Raspberry Pi 4 (aka V3DV). While the CTS is an invaluable resource for driver testing and validation, it doesn’t exactly compare to actual real world applications, and so, I made the point that we should try to do more real world testing for the driver after completing initial Vulkan 1.0 support.

To be fair, we had been doing a little bit of this already when I worked on getting the Vulkan ports of all 3 Quake game classics to work with V3DV, which allowed us to identify and fix a few driver bugs during development. The good thing about these games is that we could get the source code and compile them natively for ARM platforms, so testing and debugging was very convenient.

Unfortunately, there are not a plethora of Vulkan applications and games like these that we can easily test and debug on a Raspberry Pi as of today, which posed a problem. One way to work around this limitation that was suggested after my presentation at XDC was to use Zink, the OpenGL to Vulkan layer in Mesa. Using Zink, we can take existing OpenGL applications that are currently available for Raspberry Pi and use them to test our Vulkan implementation a bit more thoroughly, expanding our options for testing while we wait for the Vulkan ecosystem on Raspberry Pi 4 to grow.

So last week I decided to get hands on with that. Zink requires a few things from the underlying Vulkan implementation depending on the OpenGL version targeted. Currently, Zink only targets desktop OpenGL versions, so that limits us to OpenGL 2.1, which is the maximum version of desktop OpenGL that Raspberry Pi 4 can support (we support up to OpenGL ES 3.1 though). For that desktop OpenGL version, Zink required a few optional Vulkan 1.0 features that we were missing in V3DV, namely:

  • Logic operations.
  • Alpha to one.
  • VK_KHR_maintenance1.

The first two were trivial: they were already implemented and we only had to expose them in the driver. Notably, when I was testing these features with the relevant CTS tests I found a bug in the alpha to one tests, so I proposed a fix to Khronos which is currently in review.

I also noticed that Zink implicitly requires support for timestamp queries, so I implemented that in V3DV and then also wrote a patch for Zink to handle this requirement better.

Finally, Zink doesn’t use Vulkan swapchains, instead it creates presentable images directly, which was problematic for us because our platform needs to handle allocations for presentable images specially, so a patch for Zink was also required to address this.

As of the writing of this post, all this work has been merged in Mesa and it enables Zink to run OpenGL 2.1 applications over V3DV on Raspberry Pi 4. Here are a few screenshots of Quake3 taken with the native OpenGL driver (V3D), with the native Vulkan driver (V3DV) and with Zink (over V3DV). There is a significant performance hit with Zink at present, although that is probably not too unexpected at this stage, but otherwise it seems to be rendering correctly, which is what we were really interested to see:


Quake3 Vulkan renderer (V3DV)

Quake3 OpenGL renderer (V3D)

Quake3 OpenGL renderer (Zink + V3DV)

Note: you’ll notice that the Vulkan screenshot is darker than the OpenGL versions. As I reported in another post, that is a feature of the Vulkan port of Quake3 and is unrelated to the driver.

Going forward, we expect to use Zink to test more applications and hopefully identify driver bugs that help us make V3DV better.
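If you want to experiment with Zink on a Mesa build that includes it, the driver can usually be selected at runtime with an environment variable; the exact mechanism may vary between Mesa versions, and the application here is just an example:

$ MESA_LOADER_DRIVER_OVERRIDE=zink glxgears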

by Iago Toral at November 05, 2020 10:14 AM

Brian Kardell

All Them Switches: Responsive Elements and More

In this post I'll talk about developments along the way to a 'responsive elements' proposal (aka container queries/element queries use cases) that I talked about earlier this year, a brief detour along the way, and finally, ask for your input on both...

I've been talking a lot this year about the web ecosystem as a commons, its health, and why I believe that diversifying investment in it is both important and productive1,2,3,4,5. Collectively, at Igalia, we believe this and we choose to invest in the commons ourselves too. We try to apply our expertise toward helping solve hard problems that have been long stuck, trying to listen to developers and do things that we believe can help further what we think are valuable causes. I'd like to tell you the story of one of those efforts, which became two - and to enlist your input.

Advancing the Container Queries Cause

As you may recall, back in February I posted an article explaining that we had been working on this problem a bunch, and sharing our thoughts and progress and just letting people know that something is happening... People are listening, and trying. I also shared that our discussions also prompted David Baron's work toward another possible path.

We wanted to present these together, so by late April we both made informal proposals to the CSS working group of what we'd like to explore. Ours was to begin with a switch() function in CSS, focused on slotting into the architecture of CSS in a way that allows us to solve currently impossible problems. If we can show that this works and make progress in all of the engines, the problem of sugaring an even higher level expression becomes possible, and we deliver useful value fast too.

Neither the CSS Working Group nor anyone involved in any of the proposals is arguing that these are an either/or choice here. We are pursuing options and answering questions, together. We all believe that working this problem from both ends has definite value in both the short and long term, and we are mutually supportive. We are also excited by Miriam Suzanne's recent work. They are almost certainly complementary and may even wind up helping each other with different cases.

Shortly after we presented our idea, Igalia also said that we would be willing to invest time to try to do some prototyping and implementation exploration and report back.

Demos and Updates

My colleague Javi Fernández agreed to tackle initial implementation investigations with some down time he had. Initially, he made some really nice progress pretty quickly, coming up with a strategy, writing some low-level tests and getting them passing. But then the world got very... you know... hectic.

However, I'm really happy to announce today that we have recently completed enough to share, and to say we'll be able to take this experience back to report to the CSSWG pretty soon.

The following demos represent research and development. Implementation is limited, not yet standard, and was done for the purposes of investigation, discussion, and answering questions necessary for implementers. It is, nevertheless, real, functioning code.

A running demo in a build of Chromium of an image grid component designed independently from page layout, which uses the proposed CSS switch() function to declaratively, responsively change the grid-template-columns that it uses based on the size available to it.

Cool, right? Here's a short "lightning talk" style presentation on it with some more demos too (note the bit of jank you see is dropped frames from my recording, rendering is very fluid as they are in the version embedded above)...

So - I think this is pretty exciting... What do you think? Here are answers to some questions I know people have

FAQ
Why a function and not a MQ or pseudo?
My post from Feb and the proposal explain that this is not an "instead of", but rather a "simpler and powerful step in breaking down the problem, which is luckily also a very useful step on its own". The things we want ultimately and tend to discuss are full of several hard problems, not just one. It's very hard to discuss numerous hypotheticals all at once, especially if they represent enormous amounts of work to imagine how they slot together in existing CSS architecture. That's not to say we shouldn't still try that too, it's just that the path for one is more definite and known. Our proposal, we believe, neatly slots out a few of the hardest problems in CSS and provides an achievable answer we can make fast progress on in all engines, lessening the number of open questions. This could allow us both to take on higher level sugar next, and to fill that gap in user-land until we do. Breaking down problems like this is probably a thing you have done on your own engineering projects. It makes sense.
Why is it called inline available size?
The short answer is because that is accurately what it actually represents internally and there are good arguments for it I'll save for a more detailed post if this carries on, but don't get hung up on that, we haven't begun bikeshedding details of how you write the function and it will change. In fact, these demos use a slightly different format than our proposal because it was easier to parse. Don't get hung up on that either.
Where can you use switch?
You can use anything anywhere, but it will only be valid and meaningful in certain places. The function will only provide an available-inline-size value to switch in places that the CSS WG agrees to list. Sensibly, what you can say is that these will never include things that could create circularities because they are used in determining the available size. So, you won't be able to change the display, or the font, with a switch() that depends on available-inline-size, but anything that changes properties of a layout module or paint is probably fair game. CSS could make other switchable values available for other properties though.
Why doesn't it use min-width/max-width style like media queries?
Media Queries Level 4 supports this kind of range syntax, which is what these examples use; we just wanted to show that you could. You could just as easily use min-width/max-width style here!

Bonus Round: Switching gears...

Shortly after we made our switch proposal, my friend Jon Neal opened a github issue based on some twitter conversations. For the next week or two this thread was very busy with lots of function proposals that looked vaguely "switch-like". In fact, a number of them were also using the word "switch". From these, there are 3 ideas which seem interesting and look somewhat similar, but have some (very) important differences in characteristics, challenges and abilities. They are described below.

nth-value()

This proposal is a function which lets a variable represent an index into a list of possible values. Its use would look like this:

.foo {
  color: 
    nth-value(
      var(--x); 
      red; 
      blue; 
      yellow
    );
}

cond()

This proposal is a function which allows you to pass pairs of math-like conditions and value associations, as well as a default value. The conditions are evaluated from left to right and the value following the first condition to be true is used, or the default if none are. Its use would look like this:

.foo {
  margin-left: 
    cond(
      (50vw < 400px) 2em, 
      (50vw < 800px) 1em, 
      0px
    );
}

switch()

This (our) proposal is a function which works like cond() above, but can provide contextual information only available at appropriate stages in the lifecycle. In the case of layout properties, it would have the same sorts of information available to it as a layout worklet, thus allowing you to do a lot of the things people want to do with "container queries" as in this example below (available-inline-size is the contextual value provided during layout). Its use would look like this:

/* (proposed syntax, to be bikeshed much.. note the demos use a less flexible/different/easier to implement syntax for now ) */ 
.foo {
  grid-template-columns: 
    switch(
      (available-inline-size > 1024px) 1fr 4fr 1fr;
      (available-inline-size > 400px) 2fr 1fr;
      (available-inline-size > 100px) 1fr;
      default 1fr;
    );
}

As similar as these may seem, almost everything about them is concretely different. Each is parsed and follows very different paths around what can be resolved and when, as well as what you can do with them. nth-value(), as Mozilla's Emilio Cobos suggested, should be extremely easy to implement because it reuses much of the existing infrastructure for CSS math functions. In fact, he went ahead and implemented it in Mozilla's code base to illustrate.

While things were too hectic to advance our own proposal for a while earlier this year, we did have enough time to look into that and indeed, the nth-value() proposal was fairly simple to implement in Chromium too! In a very short time, without very sustained investment, we were able to create a complete patch that we could submit.

While nth-value() doesn't help advance the container queries use cases, we agree that it looks like a comparatively easy win for developers, and it might be worth having too.

So, we put it to you: Is it?

We would love your feedback on both of these things - are they things that you would like to see standards bodies and implementers pursue? We certainly are willing to implement a similar prototype for WebKit if necessary if developers are interested and it is standardized. Let us know what you think via @igalia or @briankardell!

November 05, 2020 05:00 AM

October 31, 2020

Eleni Maria Stea

[OpenGL and Vulkan Interoperability on Linux] Part 9: Reusing a Vulkan z buffer from OpenGL

In this 9th post on OpenGL and Vulkan interoperability on Linux with EXT_external_objects and EXT_external_objects_fd we are going to see another extensions use case where a Vulkan depth buffer is used to render a pattern with OpenGL. Like every other example use case described in these posts, it was implemented for Piglit as part of … Continue reading [OpenGL and Vulkan Interoperability on Linux] Part 9: Reusing a Vulkan z buffer from OpenGL

by hikiko at October 31, 2020 02:10 PM

October 30, 2020

Jacobo Aragunde

Event management in X11 Chromium

This is a follow-up to my previous post, where I was trying to fix bug #1042864 in Chromium: key strokes happening on native dialogs, like open and save dialogs, were not reported to the screen reader.

After learning how accessibility tools (ATs) register listeners for key events, I found out the problem was not actually there; I had to investigate how events arrive from the X11 server to the browser, and how they are forwarded to the ATs.

Not this kind of event

Events arrive from the X server

If you are running Chromium on Linux with the X11 backend (most likely, as it is the default), the Chromium browser process receives key press events from the X server. Then it finds out whether the target of those events is one of its browser windows, and sends them to the proper Window object to be processed.

These are the classes involved in the first part of this process:

The interface PlatformEventSource represents an undetermined source of events coming from the platform, and a PlatformEventDispatcher is any object in the browser capable of managing those events, dispatching them to the actual webpage or UI element. These two classes are related: the PlatformEventSource keeps a list of dispatchers it will forward the event to, if they can manage it (CanDispatchEvent).

The X11EventSource class implements PlatformEventSource; it has the code managing the events coming from an X11 server, in particular. It additionally keeps a list of XEventDispatcher objects, which is a class to manage X11 Event objects independently, but it’s not an implementation of PlatformEventDispatcher.

The X11Window class is the central piece, implementing both the PlatformEventDispatcher and the XEventDispatcher interfaces, in addition to extending the XWindow class. It has all the means required to find out if it can dispatch an event, and to do it.

The main event processing loop looks like this:

  1. An event arrives at X11EventSource.

  2. X11EventSource loops through its list of XEventDispatcher objects and calls CheckCanDispatchNextPlatformEvent for each of them.

  3. The X11Window implementing that function checks if the XWindow ID of the event target matches the ID of the XWindow represented by that object, and saves the XEvent object if so.

  4. X11EventSource calls DispatchEvent as implemented by its parent class PlatformEventSource.

  5. The PlatformEventSource loops through its list of PlatformEventDispatchers and calls CanDispatchEvent on each one of them.

  6. The X11Window object, which had previously run CheckCanDispatchNextPlatformEvent, just verifies if the XEvent object was saved then, and considers that a confirmation that it can dispatch the event.

  7. When one of the dispatchers answers positively, it receives the event for processing in a call to DispatchEvent, which is implemented at X11Window.

  8. If it is a keyboard event, X11Window takes the steps required to send it to any ATs listening to it, which had been previously registered via ATK.

  9. When X11Window ends processing the event, it returns POST_DISPATCH_STOP_PROPAGATION, telling PlatformEventSource to stop looping through the rest of the dispatchers.

A sequence diagram summarizing this process can be found in the original post.

Events leave to the ATs

As explained in the previous post, ATs can register callbacks for key press events, which ultimately call AtkUtilClass::add_key_event_listener. AtkUtilClass is a struct of function pointers; the actual implementation is provided by Chromium in the AtkUtilAuraLinux class, which keeps a list of those callbacks.

When an X11Window object encounters an event that is targeting its own X Window, and it is a keyboard event, it calls X11ExtensionDelegate::OnAtkEvent(), which is actually implemented by the class DesktopWindowTreeHostLinux; it ultimately hands the event to the AtkUtilAuraLinux class and runs HandleAtkEvent(). It will loop through, and run, any listeners that may have been registered.

Native dialogs are different

Native dialogs are stand-alone windows in the X server, different from the browser window that called them, and the browser process doesn't wrap them in an X11Window object. It is considered unnecessary, because the windows for native dialogs talk to the X server and receive events from it directly.

They do belong to the browser process, though, which means that the browser will still receive events targeting the dialog windows. They will go through all the steps mentioned above only to be eventually dismissed, because there is no X11Window object in the browser matching the ID of the events' target window.

Another consequence of dialog windows belonging to the browser process is that the AtkUtilClass struct points to Chromium's own implementation, and here comes the problem… The dialog is expected to manage its own events through GTK+ code, including the GTK+ implementation of AtkUtilClass, but Chromium overrode it. The key press listeners that ATs registered are kept in Chromium code, so the dialog cannot notify them.

Finally, fixing the problem

Chromium does receive the keyboard events targeted at the dialog windows, but it does nothing with them because the target of those events is not a browser window. It does give us, though, a leg up towards building a solution.

To fix the problem, I made Chromium X Windows manage the keyboard events addressed to the native dialogs in addition to their own. For that, I took advantage of the “transient” property, which indicates a dependency of one window on another: the dialog window had been set as transient for the browser window. In my first approach, I modified X11Window::CheckCanDispatchNextPlatformEvent() to verify if the target of the event was a transient window of the browser X Window, and in that case it would hand the event to X11ExtensionDelegate to be sent to ATs, following the code path previously explained. Processing stopped at that point; otherwise, the browser window would have received key presses directed to the dialog.

The approach had one performance problem: I was calling the X server to check that property, for every keystroke, and that call implied using synchronous IPC. This was unacceptable! But it could be worked around: we could also notify the corresponding internal X11Window object about the existence of this transient window when the dialog is created. This implies no IPC at all; we just store one new property in the X11Window object that can be checked locally when keyboard events are processed.
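
To see why the caching matters, here is a minimal Xlib sketch in C (not the actual Chromium code; the helper names are made up for illustration). The first variant asks the X server for the WM_TRANSIENT_FOR hint on every keystroke, which costs a synchronous round trip; the second remembers the dialog window once, when it is created, and afterwards only checks a local variable.

    /* Hypothetical sketch of the two approaches; not Chromium code. */
    #include <X11/Xlib.h>

    /* Slow path: one synchronous request/reply with the X server per key press. */
    static int is_transient_of_browser_slow(Display *dpy, Window event_target, Window browser)
    {
        Window parent = None;
        if (XGetTransientForHint(dpy, event_target, &parent))
            return parent == browser;
        return 0;
    }

    /* Fast path: remember the transient window when the dialog is created... */
    static Window transient_for_browser = None;

    static void on_dialog_created(Window dialog)
    {
        transient_for_browser = dialog;
    }

    /* ...and key press handling becomes a local comparison, with no IPC at all. */
    static int is_transient_of_browser_fast(Window event_target)
    {
        return event_target == transient_for_browser;
    }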

This is a link to the review process of the patch, if you are interested in its history. To sum up, in the final solution:

  1. Chromium creates the native dialog and calls XWindow::SetTransientWindow, setting that property in the corresponding browser X Window.

  2. When Chromium receives a keyboard event, it is captured by the X11Window object whose transient window property has been set before.

  3. X11ExtensionDelegate::OnAtkEvent() is called for that event, then no more processing of this event happens in Chromium.

  4. The native dialog code will also receive the event and manage the keystroke accordingly.

I hope you enjoyed this trip through Chromium event processing code. If you want to use the diagrams in this post, you may find their Dia source files in this link. Happy hacking!

by Jacobo Aragunde Pérez at October 30, 2020 05:00 PM

    October 29, 2020

    Claudio Saavedra

    Thu 2020/Oct/29

    In this line of work, we all stumble at least once upon a problem that turns out to be extremely elusive and very tricky to narrow down and solve. If we're lucky, we might have everything at our disposal to diagnose the problem, but sometimes that's not the case – and in embedded development it's often not the case. Add to the mix proprietary drivers, lack of debugging symbols, a bug that's very hard to reproduce under a controlled environment, and weeks in partial confinement due to a pandemic, and what you have is better described as a very long lucid nightmare. Thankfully, even the worst of nightmares end when morning comes, even if sometimes morning might be several days away. And when the fix to the problem is in an unimaginable place, the story is definitely one worth telling.

    The problem

    It all started with one of Igalia's customers deploying a WPE WebKit-based browser in their embedded devices. Their CI infrastructure had detected a problem triggered when the browser was tasked with creating a new webview (in layman's terms, you can imagine that to be the same as opening a new tab in your browser). Occasionally, this view would never load, causing ongoing tests to fail. For some reason, the test failure had a reproducibility of ~75% in the CI environment, but during manual testing it would occur with less than 1% probability. For reasons that are beyond the scope of this post, the CI infrastructure was not reachable in a way that would allow access to running processes in order to diagnose the problem more easily. So with only logs at hand and less than a 1-in-100 chance of reproducing the bug myself, I set out to debug this problem locally.

    Diagnosis

    The first thing that became evident was that, whenever this bug occurred, the WebKit feature known as web extension (an application-specific loadable module that is used to allow the program to have access to the internals of a web page, as well as to enable customizable communication with the process where the page contents are loaded – the web process) wouldn't work. The browser would wait forever for the web extension to load, and since that wouldn't happen, the expected page wouldn't load. The first place to look into, then, is the web process, to try to understand what is preventing the web extension from loading. Enter our good friend GDB, with less than spectacular results thanks to stripped libraries.

    #0  0x7500ab9c in poll () from target:/lib/libc.so.6
    #1  0x73c08c0c in ?? () from target:/usr/lib/libEGL.so.1
    #2  0x73c08d2c in ?? () from target:/usr/lib/libEGL.so.1
    #3  0x73c08e0c in ?? () from target:/usr/lib/libEGL.so.1
    #4  0x73bold6a8 in ?? () from target:/usr/lib/libEGL.so.1
    #5  0x75f84208 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #6  0x75fa0b7e in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #7  0x7561eda2 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #8  0x755a176a in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #9  0x753cd842 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #10 0x75451660 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #11 0x75452882 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #12 0x75452fa8 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #13 0x76b1de62 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #14 0x76b5a970 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #15 0x74bee44c in g_main_context_dispatch () from target:/usr/lib/libglib-2.0.so.0
    #16 0x74bee808 in ?? () from target:/usr/lib/libglib-2.0.so.0
    #17 0x74beeba8 in g_main_loop_run () from target:/usr/lib/libglib-2.0.so.0
    #18 0x76b5b11c in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #19 0x75622338 in ?? () from target:/usr/lib/libWPEWebKit-1.0.so.2
    #20 0x74f59b58 in __libc_start_main () from target:/lib/libc.so.6
    #21 0x0045d8d0 in _start ()

    From all the threads in the web process, after much tinkering around, it slowly became clear that one of the places to look into was that poll() call. I will spare you the details of what the other threads were doing; suffice to say that whenever the browser hit the bug, there was a similar stacktrace in one thread, going through libEGL to a call to poll() at the top of the stack, that would never return. Unfortunately, a stripped EGL driver coming from a proprietary graphics vendor was a bit of a showstopper, as was the inability to have proper debugging symbols running inside the device (did you know that a non-stripped WebKit library binary with debugging symbols can easily get GDB and your device out of memory?). The best one could do to improve that was to use the gcore feature in GDB and extract a core from the device for post-mortem analysis. But for some reason, such a stacktrace wouldn't give anything interesting below the poll() call to understand what's being polled here. Did I say this was tricky?

    What polls?

    Because WebKit is a multiprocess web engine, having system calls that signal, read, and write on sockets communicating with other processes is an everyday thing. Not knowing what a poll() call is doing, and who it is trying to listen to, is not very good. Because the call is happening under the EGL library, one can presume that it's graphics related, but there are still different possibilities, so trying to find out what this call is polling is a good idea.

    A trick I learned while debugging this is that, in the absence of debugging symbols that would give a straightforward look into variables and parameters, one can examine the CPU registers and try to figure out from them what the parameters to function calls are. Let's do that with poll(). First, its signature.

    int poll(struct pollfd *fds, nfds_t nfds, int timeout);

    Now, let's examine the registers.

    (gdb) f 0
    #0  0x7500ab9c in poll () from target:/lib/libc.so.6
    (gdb) info registers
    r0             0x7ea55e58	2124766808
    r1             0x1	1
    r2             0x64	100
    r3             0x0	0
    r4             0x0	0

    Registers r0, r1, and r2 contain poll()'s three parameters. Because r1 is 1, we know that there is only one file descriptor being polled. fds is then a pointer to an array with a single element. Where is that first element? Well, right there, in the memory pointed to directly by r0. What does struct pollfd look like?

    struct pollfd {
      int   fd;         /* file descriptor */
      short events;     /* requested events */
      short revents;    /* returned events */
    };

    What we are interested in here is the contents of fd, the file descriptor that is being polled. Memory alignment is again on our side: we don't need any pointer arithmetic here. We can directly inspect the memory that r0 points to and find out what the value of fd is.

    (gdb) print *0x7ea55e58
    $3 = 8

    So we now know that the EGL library is polling the file descriptor with an identifier of 8. But where is this file descriptor coming from? What is on the other end? The /proc file system can be helpful here.

    # pidof WPEWebProcess
    1944 1196
    # ls -lh /proc/1944/fd/8
    lrwx------    1 x x      64 Oct 22 13:59 /proc/1944/fd/8 -> socket:[32166]
    

    So we have a socket. What else can we find out about it? Turns out, not much without the unix_diag kernel module, which was not available in our device. But we are slowly getting closer. Time to call another good friend.

    Where GDB fails, printf() triumphs

    Something I have learned from many years working with a project as large as WebKit is that debugging symbols can be very difficult to work with. To begin with, it takes ages to build WebKit with them. When cross-compiling, it's even worse. And then, very often the target device doesn't even have enough memory to load the symbols when debugging. So they can be pretty useless. It's then that just using fprintf() and logging useful information can simplify things. Since we know that it's at some point during initialization of the web process that we end up stuck, and we also know that we're polling a file descriptor, let's find some early calls in the code of the web process and add some fprintf() calls with a bit of information, especially in those that might have something to do with EGL. What can we find out now?

    Oct 19 10:13:27.700335 WPEWebProcess[92]: Starting
    Oct 19 10:13:27.720575 WPEWebProcess[92]: Initializing WebProcess platform.
    Oct 19 10:13:27.727850 WPEWebProcess[92]: wpe_loader_init() done.
    Oct 19 10:13:27.729054 WPEWebProcess[92]: Initializing PlatformDisplayLibWPE (hostFD: 8).
    Oct 19 10:13:27.730166 WPEWebProcess[92]: egl backend created.
    Oct 19 10:13:27.741556 WPEWebProcess[92]: got native display.
    Oct 19 10:13:27.742565 WPEWebProcess[92]: initializeEGLDisplay() starting.

    Two interesting findings from the fprintf()-powered logging here: first, it seems that file descriptor 8 is one known to libwpe (the general-purpose library that powers the WPE WebKit port). Second, that the last EGL API call right before the web process hangs on poll() is a call to eglInitialize(). fprintf(), thanks for your service.

    Number 8

    We now know that file descriptor 8 is coming from WPE and is not internal to the EGL library. libwpe gets this file descriptor from the UI process, as one of the many creation parameters that are passed via IPC to the nascent process in order to initialize it. It turns out that this file descriptor in particular, the so-called host client file descriptor, is the one that the freedesktop backend of libwpe, from here onwards WPEBackend-fdo, creates when a new client is set to connect to its Wayland display. In a nutshell, in the presence of a new client, a Wayland display is supposed to create a pair of connected sockets, create a new client on the Display side, give it one of the file descriptors, and pass the other one to the client process. Because this will be useful later on, let's see how that is currently implemented in WPEBackend-fdo.

        int pair[2];
        if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, pair) < 0)
            return -1;
    
        int clientFd = dup(pair[1]);
        close(pair[1]);
    
        wl_client_create(m_display, pair[0]);
    

    The file descriptor we are tracking down is the client file descriptor, clientFd. So we now know what's going on in this socket: Wayland-specific communication. Let's enable Wayland debugging next, by running all the relevant processes with WAYLAND_DEBUG=1. We'll get back to that code fragment later on.

    A Heisenbug is a Heisenbug is a Heisenbug

    It turns out that enabling Wayland debugging output for a few processes is enough to alter the state of the system in such a way that the bug does not happen at all when doing manual testing. Thankfully the CI's reproducibility is much higher, so after waiting overnight for the CI to run continuously until it hit the bug, we have logs. What do the logs say?

    WPEWebProcess[41]: initializeEGLDisplay() starting.
      -> wl_display@1.get_registry(new id wl_registry@2)
      -> wl_display@1.sync(new id wl_callback@3)
    

    So the EGL library is trying to fetch the Wayland registry and it's doing a wl_display_sync() call afterwards, which will block until the server responds. That's where the blocking poll() call comes from. So, it turns out, the problem is not necessarily on this end of the Wayland socket, but perhaps on the other side, that is, in the so-called UI process (the main browser process). Why is the Wayland display not replying?
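
    As an aside, the blocking pattern is easy to picture with the public wayland-client API. The following is only a rough C sketch of what the driver appears to be doing (its real code is proprietary, and the function name here is made up): the roundtrip sends a wl_display.sync request and then poll()s the socket until the compositor answers, which is exactly the poll() call sitting at the top of the stuck thread's stack.

    /* Rough approximation of the client-side calls seen in the WAYLAND_DEBUG log. */
    #include <wayland-client.h>

    static int fetch_registry(int wayland_fd)
    {
        struct wl_display *display = wl_display_connect_to_fd(wayland_fd);
        if (!display)
            return -1;

        struct wl_registry *registry = wl_display_get_registry(display);

        /* Blocks, polling the display fd, until the sync callback comes back. */
        int ret = wl_display_roundtrip(display);

        wl_registry_destroy(registry);
        wl_display_disconnect(display);
        return ret;
    }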

    The loop

    Something that is worth mentioning before we move on is how the WPEBackend-fdo Wayland display integrates with the system. This display is a nested display, with each web view a client, while it is itself a client of the system's Wayland display. This can be a bit confusing if you're not very familiar with how Wayland works, but fortunately there is good documentation about Wayland elsewhere.

    The way that the Wayland display in the UI process of a WPE WebKit browser is integrated with the rest of the program, when it uses WPEBackend-fdo, is through the GLib main event loop. Wayland itself has an event loop implementation for servers, but for a GLib-powered application it can be useful to use GLib's and integrate Wayland's event processing with the different stages of the GLib main loop. That is precisely how WPEBackend-fdo handles its clients' events. As discussed earlier, when a new client is created, a pair of connected sockets is created and one end is given to Wayland to control communication with the client. GSourceFunc functions are used to integrate Wayland with the application main loop. In these functions, we make sure that whenever there are pending messages to be sent to clients, those are sent, and that whenever any of the client sockets has pending data to be read, Wayland reads from it and dispatches whatever events are necessary in response to the incoming data. And here is where things start getting really strange, because after doing a bit of fprintf()-powered debugging inside the Wayland-GSourceFuncs functions, it became clear that the Wayland events from the clients were never dispatched, because the dispatch() GSourceFunc was not being called, as if there was nothing coming from any Wayland client. But how is that possible, if we already know that the web process client is actually trying to get the Wayland registry?

    To move forward, one needs to understand how the GLib main loop works, in particular with Unix file descriptor sources. A very brief summary of this is that, during an iteration of the main loop, GLib will poll file descriptors to see if there are any interesting events to be reported back to their respective sources, in which case the sources will decide whether to trigger the dispatch() phase. A simple source might decide in its dispatch() method to directly read or write from/to the file descriptor; a Wayland display source (as in our case) will call wl_event_loop_dispatch() to do this for us. However, if the source doesn't find any interesting events, or if the source decides that it doesn't want to handle them, the dispatch() invocation will not happen. More on the GLib main event loop in its API documentation.
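
    To make that flow a bit more concrete, here is a minimal, hypothetical file-descriptor GSource written in C. It is not the actual WPEBackend-fdo or cog source (those do more work), but it follows the same prepare()/check()/dispatch() pattern and shows where the GPollFD and its revents come into play.

    #include <glib.h>

    typedef struct {
        GSource source;
        GPollFD pfd;   /* fd to watch; the main loop fills pfd.revents after poll() */
    } FdSource;

    static gboolean fd_prepare(GSource *source, gint *timeout)
    {
        *timeout = -1;   /* nothing to do until poll() reports activity */
        return FALSE;
    }

    static gboolean fd_check(GSource *source)
    {
        FdSource *self = (FdSource *) source;
        /* Dispatch only if poll() reported something for our fd. This relies
         * on revents having been updated correctly by the main loop. */
        return !!self->pfd.revents;
    }

    static gboolean fd_dispatch(GSource *source, GSourceFunc callback, gpointer user_data)
    {
        /* Read from the fd here or, for a Wayland display source,
         * call wl_event_loop_dispatch() to do it for us. */
        return G_SOURCE_CONTINUE;
    }

    static GSourceFuncs fd_source_funcs = { fd_prepare, fd_check, fd_dispatch, NULL };

    static GSource *fd_source_new(int fd)
    {
        FdSource *self = (FdSource *) g_source_new(&fd_source_funcs, sizeof(FdSource));
        self->pfd.fd = fd;
        self->pfd.events = G_IO_IN;
        g_source_add_poll((GSource *) self, &self->pfd);
        return (GSource *) self;
    }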

    So it seems that for some reason the dispatch() method is not being called. Does that mean that there are no interesting events to read from? Let's find out.

    System call tracing

    Here we resort to another helpful tool, strace. With strace we can try to figure out what is happening when the main loop polls file descriptors. The strace output is huge (because it easily takes over a hundred attempts to reproduce this), but we already know some of the calls that involve file descriptors from the code we looked at above, when the client is created. So we can use those calls as a starting point when searching through the several MBs of logs. Fast-forward to the relevant logs.

    socketpair(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0, [128, 130]) = 0
    dup(130)               = 131
    close(130)             = 0
    fcntl64(128, F_DUPFD_CLOEXEC, 0) = 130
    epoll_ctl(34, EPOLL_CTL_ADD, 130, {EPOLLIN, {u32=1639599928, u64=1639599928}}) = 0

    What we see there is, first, WPEBackend-fdo creating a new socket pair (128, 130) and then, when file descriptor 130 is passed to wl_client_create() to create a new client, Wayland adds that file descriptor to its epoll() instance for monitoring clients, which is referred to by file descriptor 34. This way, whenever there are events in file descriptor 130, we will hear about them in file descriptor 34.
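
    For readers less familiar with epoll(), this is roughly the mechanism, sketched in C with the raw system calls (the descriptor values and the simplified setup are illustrative; the real sequence, including the extra duplication, is the one shown by strace above):

    /* Illustrative sketch, not the actual libwayland code. */
    #include <sys/epoll.h>
    #include <sys/socket.h>

    static int watch_new_client(int epoll_fd)
    {
        int pair[2];
        socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, pair);  /* e.g. 128 and 130 */

        struct epoll_event ev = { .events = EPOLLIN, .data.fd = pair[0] };
        epoll_ctl(epoll_fd, EPOLL_CTL_ADD, pair[0], &ev);          /* epoll_fd is e.g. 34 */

        /* From now on, data written by the peer on the other end of the pair makes
         * epoll_fd readable, and epoll_wait(epoll_fd, ...) reports which client
         * socket is ready. */
        return pair[1];   /* the end handed over to the client process */
    }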

    So what we would expect to see next is that, after the web process is spawned, when a Wayland client is created using the passed file descriptor and the EGL driver requests the Wayland registry from the display, there should be a POLLIN event coming in file descriptor 34 and, if the dispatch() call for the source was called, an epoll_wait() call on it, as that is what wl_event_loop_dispatch() would do when called from the source's dispatch() method. But what do we have instead?

    poll([{fd=30, events=POLLIN}, {fd=34, events=POLLIN}, {fd=59, events=POLLIN}, {fd=110, events=POLLIN}, {fd=114, events=POLLIN}, {fd=132, events=POLLIN}], 6, 0) = 1 ([{fd=34, revents=POLLIN}])
    recvmsg(30, {msg_namelen=0}, MSG_DONTWAIT|MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
    

    strace can be a bit cryptic, so let's explain those two function calls. The first one is a poll on a series of file descriptors (including 30 and 34) for POLLIN events. The return value of that call tells us that there is a POLLIN event in file descriptor 34 (the Wayland display epoll() instance for clients). But unintuitively, the call right after is trying to read a message from socket 30 instead, which we know doesn't have any pending data at the moment, and consequently returns an error value with an errno of EAGAIN (Resource temporarily unavailable).

    Why is the GLib main loop triggering a read from 30 instead of 34? And who is 30?

    We can answer the latter question first. Breaking on a running UI process instance at the right time shows who is reading from file descriptor 30:

    #1  0x70ae1394 in wl_os_recvmsg_cloexec (sockfd=30, msg=msg@entry=0x700fea54, flags=flags@entry=64)
    #2  0x70adf644 in wl_connection_read (connection=0x6f70b7e8)
    #3  0x70ade70c in read_events (display=0x6f709c90)
    #4  wl_display_read_events (display=0x6f709c90)
    #5  0x70277d98 in pwl_source_check (source=0x6f71cb80)
    #6  0x743f2140 in g_main_context_check (context=context@entry=0x2111978, max_priority=, fds=fds@entry=0x6165f718, n_fds=n_fds@entry=4)
    #7  0x743f277c in g_main_context_iterate (context=0x2111978, block=block@entry=1, dispatch=dispatch@entry=1, self=)
    #8  0x743f2ba8 in g_main_loop_run (loop=0x20ece40)
    #9  0x00537b38 in ?? ()

    So it's also Wayland, but on a different level. This is the Wayland client source (remember that the browser is also a Wayland client?), which is installed by cog (a thin browser layer on top of WPE WebKit that makes writing browsers easier) to process, among other things, input events coming from the parent Wayland display. Looking at the cog code, we can see that the wl_display_read_events() call happens only if GLib reports that there is a G_IO_IN (POLLIN) event in its file descriptor, but we already know that this is not the case, as per the strace output. So at this point we know that there are two things here that are not right:

    1. A FD source with a G_IO_IN condition is not being dispatched.
    2. A FD source without a G_IO_IN condition is being dispatched.

    Someone here is not telling the truth, and as a result the main loop is dispatching the wrong sources.

    The loop (part II)

    It is at this point that it would be a good idea to look at what exactly the GLib main loop is doing internally in each of its stages and how it tracks the sources and file descriptors that are polled and that need to be processed. Fortunately, debugging symbols for GLib are very small, so debugging this step by step inside the device is rather easy.

    Let's look at how the main loop decides which sources to dispatch, since for some reason it's dispatching the wrong ones. Dispatching happens in the g_main_dispatch() method. This method goes over a list of pending source dispatches and, after a few checks and setting the stage, the dispatch method for the source gets called. How is a source set as having a pending dispatch? This happens in g_main_context_check(), where the main loop checks the results of the polling done in this iteration and runs the check() method for sources that are not ready yet, so that they can decide whether they are ready to be dispatched or not. Breaking into the Wayland display source, I know that the check() method is called. How does this method decide whether to be dispatched or not?

        [](GSource* base) -> gboolean
        {
            auto& source = *reinterpret_cast<EventSource*>(base); // the backend's GSource wrapper; exact type name lost in the original markup
            return !!source.pfd.revents;
        },

    In this lambda function we're returning TRUE or FALSE depending on whether the revents field in the GPollFD structure has been filled during the polling stage of this iteration of the loop. A return value of TRUE indicates to the main loop that we want our source to be dispatched. From the strace output, we know that there is a POLLIN (or G_IO_IN) condition, but we also know that the main loop is not dispatching it. So let's look at what's in this GPollFD structure.

    For this, let's go back to g_main_context_check() and inspect the array of GPollFD structures that it received when called. What do we find?

    (gdb) print *fds
    $35 = {fd = 30, events = 1, revents = 0}
    (gdb) print *(fds+1)
    $36 = {fd = 34, events = 1, revents = 1}

    That's the result of the poll() call! So far so good. Now, the method is supposed to update the polling records it keeps and uses when calling each of the sources' check() functions. What do these records hold?

    (gdb) print *pollrec->fd
    $45 = {fd = 19, events = 1, revents = 0}
    (gdb) print *(pollrec->next->fd)
    $47 = {fd = 30, events = 25, revents = 1}
    (gdb) print *(pollrec->next->next->fd)
    $49 = {fd = 34, events = 25, revents = 0}

    We're not interested in the first record quite yet, but clearly there's something odd here. The polling records are showing a different value in the revents fields for both 30 and 34. Are these records updated correctly? Let's look at the algorithm that is doing this update, because it will be relevant later on.

      pollrec = context->poll_records;
      i = 0;
      while (pollrec && i < n_fds)
        {
          while (pollrec && pollrec->fd->fd == fds[i].fd)
            {
              if (pollrec->priority <= max_priority)
                {
                  pollrec->fd->revents =
                    fds[i].revents & (pollrec->fd->events | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
                }
              pollrec = pollrec->next;
            }

          i++;
        }

    In simple words, what this algorithm does is traverse the polling records and the GPollFD array simultaneously, updating the polling records' revents with the results of polling. From reading how the pollrec linked list is built internally, it's possible to see that it's purposely sorted by increasing file descriptor identifier value. So the first item in the list will have the record for the lowest file descriptor identifier, and so on. The GPollFD array is also built in this way, allowing for a nice optimization: if more than one polling record – that is, more than one polling source – needs to poll the same file descriptor, this can be done at once. This is why this otherwise O(n^2) nested loop can actually be reduced to linear time.

    One thing stands out here though: the linked list is only advanced when we find a match. Does this mean that we always have a match between the polling records and the file descriptors that have just been polled? To answer that question we need to check how the array of GPollFD structures is filled. This is done in g_main_context_query(), as we hinted before. I'll spare you the details, and just focus on what seems relevant here: when is a poll record not used to fill a GPollFD?

      n_poll = 0;
      lastpollrec = NULL;
      for (pollrec = context->poll_records; pollrec; pollrec = pollrec->next)
        {
          if (pollrec->priority > max_priority)
            continue;
      ...
    

    Interesting! If a polling record belongs to a source whose priority is lower than the maximum priority that the current iteration is going to process, the polling record is skipped. Why is this?

    In simple terms, this happens because each iteration of the main loop finds out the highest priority among the sources that are ready in the prepare() stage, before polling, and then only those file descriptor sources with at least such a priority are polled. The idea behind this is to make sure that high-priority sources are processed first, and that no file descriptor sources with lower priority are polled in vain, as they shouldn't be dispatched in the current iteration.

    GDB tells me that the maximum priority in this iteration is -60. From an earlier GDB output, we also know that there's a source for file descriptor 19 with a priority of 0.

    (gdb) print *pollrec
    $44 = {fd = 0x7369c8, prev = 0x0, next = 0x6f701560, priority = 0}
    (gdb) print *pollrec->fd
    $45 = {fd = 19, events = 1, revents = 0}

    Since 19 is lower than 30 and 34, we know that this record comes before theirs in the linked list (and, as it happens, it's the first one in the list too). But we also know that, because its priority is 0, it is too low for the record to be added to the file descriptor array to be polled. Let's look at the loop again.

      pollrec = context->poll_records;
      i = 0;
      while (pollrec && i < n_fds)
        {
          while (pollrec && pollrec->fd->fd == fds[i].fd)
            {
              if (pollrec->priority <= max_priority)
                {
                  pollrec->fd->revents =
                    fds[i].revents & (pollrec->fd->events | G_IO_ERR | G_IO_HUP | G_IO_NVAL);
                }
              pollrec = pollrec->next;
            }

          i++;
        }

    The first polling record was skipped when the GPollFD array was filled, so the condition pollrec && pollrec->fd->fd == fds[i].fd is never going to be satisfied, because 19 is not in the array. The innermost while() is not entered, and as such the pollrec list pointer never moves forward to the next record. So no polling record is updated here, even though we have updated revents information from the polling results.

    What happens next should be easy to see. The check() methods for all polled sources are called with outdated revents. In the case of the source for file descriptor 30, we wrongly tell it there's a G_IO_IN condition, so it asks the main loop to dispatch it, triggering a wl_connection_read() call on a socket with no incoming data. For the source with file descriptor 34, we tell it that there's no incoming data and its dispatch() method is not invoked, even though on the other side of the socket we have a client waiting for data to come and blocking in the meantime. This explains what we see in the strace output above. If the source with file descriptor 19 continues to be ready and its priority remains unchanged, then this situation repeats in every further iteration of the main loop, leading to a hang in the web process, which waits forever for the UI process to read its socket pipe.

    The bug – explained

    I have been using GLib for a very long time, and I have only fixed a couple of minor bugs in it over the years. Very few actually, which is why it was very difficult for me to come to accept that I had found a bug in one of the most reliable and complex parts of the library. Impostor syndrome is a thing and it really gets in the way.

    But in a nutshell, the bug in the GLib main loop is that the very clever linear update of the polling records is missing something very important: it should first skip to the first matching polling record before attempting to update its revents. Without this, in the presence of a file descriptor source with the lowest file descriptor identifier and also a lower priority than the cutoff priority in the current main loop iteration, the revents in the polling records are not updated, and therefore the wrong sources can be dispatched. The simplest patch to avoid this would look as follows.

       i = 0;
       while (pollrec && i < n_fds)
         {
    +      while (pollrec && pollrec->fd->fd != fds[i].fd)
    +        pollrec = pollrec->next;
    +
           while (pollrec && pollrec->fd->fd == fds[i].fd)
             {
               if (pollrec->priority <= max_priority)

    Once we find the first matching record, let's update all consecutive records that also match and need an update, then let's skip to the next record, rinse and repeat. With this two-line patch, the web process was finally unlocked, the EGL display initialized properly, the web extension and the web page were loaded, CI tests started passing again, and this exhausted developer could finally put his mind to rest.
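
    The effect is easy to reproduce outside of GLib with a stand-alone simulation. The following little C program is not GLib code, just a model of the same data structures and of both versions of the update loop, using the file descriptors seen in the GDB session (the priorities of the two Wayland records are made up, only their relation to the -60 cutoff matters, and the G_IO_ERR | G_IO_HUP | G_IO_NVAL part of the mask is left out for brevity):

    #include <stdio.h>

    typedef struct { int fd; short events; short revents; } PollFD;
    typedef struct Rec { PollFD *fd; int priority; struct Rec *next; } Rec;

    static void update(Rec *pollrec, PollFD *fds, int n_fds, int max_priority, int fixed)
    {
        int i = 0;
        while (pollrec && i < n_fds)
        {
            if (fixed)   /* the two-line fix: skip records that were not polled */
                while (pollrec && pollrec->fd->fd != fds[i].fd)
                    pollrec = pollrec->next;

            while (pollrec && pollrec->fd->fd == fds[i].fd)
            {
                if (pollrec->priority <= max_priority)
                    pollrec->fd->revents = fds[i].revents & pollrec->fd->events;
                pollrec = pollrec->next;
            }
            i++;
        }
    }

    int main(void)
    {
        /* fd 19 has priority 0, too low to be polled when max_priority is -60,
         * but its record still sits first in the list. */
        PollFD p19 = { 19, 1, 0 }, p30 = { 30, 1, 0 }, p34 = { 34, 1, 0 };
        Rec r34 = { &p34, -80, NULL }, r30 = { &p30, -80, &r34 }, r19 = { &p19, 0, &r30 };
        PollFD polled[] = { { 30, 1, 0 }, { 34, 1, 1 } };   /* this iteration's poll() results */

        for (int fixed = 0; fixed <= 1; fixed++) {
            p30.revents = 1; p34.revents = 0;   /* stale values from an earlier iteration */
            update(&r19, polled, 2, -60, fixed);
            printf("%s: fd 30 revents=%d, fd 34 revents=%d\n",
                   fixed ? "fixed" : "buggy", p30.revents, p34.revents);
        }
        return 0;
    }

    The buggy version leaves both records untouched (fd 30 still claims to be readable and fd 34 still claims to be idle), while the fixed version updates them to match what poll() actually reported.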

    A complete patch, including improvements to the code comments around this fascinating part of GLib and also a minimal test case reproducing the bug, has already been reviewed by the GLib maintainers and merged to both the stable and development branches. I expect that at least some GLib sources will start being called in a different (but correct) order from now on, so keep an eye on your GLib sources. :-)

    Standing on the shoulders of giants

    At this point I should acknowledge that without the support from my colleagues in the WebKit team at Igalia, getting to the bottom of this problem would probably have been much harder, and perhaps my sanity would have been at stake. I want to thank Adrián and Žan for their input on Wayland and debugging techniques, and for allowing me to bounce ideas and findings back and forth as I went deeper into this rabbit hole, helping me to step out of dead ends, reminding me to use tools outside of my everyday box, and, ultimately, to be brave enough to doubt GLib's correctness, something that much more often than not I take for granted.

    Thanks also to Philip and Sebastian for their feedback and prompt code review!

    October 29, 2020 01:10 PM