Planet Igalia Chromium

November 20, 2020

Maksim Sisov

Chrome/Chromium on Wayland: The Waylandification project.

It has been a long time since I wrote my last blog post, and since I wrote about the work that my colleagues at Igalia and I have been doing for the past 4 years. I have been postponing this post, waiting until something big happened. Well, something big just happened…

If you already know what Ozone is, then I am happy to tell you that Chromium for Linux now includes Ozone by default, and it can be enabled with runtime command line flags. If you are interested in trying Chrome/Chromium with native Wayland support, you are encouraged to download Google Chrome for developers and try Ozone/Wayland by running the browser with the following command line flags: --enable-features=UseOzonePlatform --ozone-platform=wayland.

If you don’t know what Ozone is, here’s a brief explanation, before I talk about the history, status and design of this effort.

What is Ozone?

The very first thing that one may think about when they hear “Ozone” is the gas or a thin layer of the Earth’s atmosphere. Well… it is partly correct. In the case of Chromium, it is a platform abstraction layer.

I will not go into many details, but here is the description of that layer from Chromium’s documentation about Ozone –
“Ozone is a platform abstraction layer beneath Aura, Chromium’s platform independent windowing system, that is used for low level input and graphics. Once complete, the abstraction will support underlying systems ranging from embedded SoC targets to new X11-alternative window systems on Linux such as Wayland or Mir to bring up Aura Chromium by providing an implementation of the platform interface.”.
If you are interested in more details, you are welcome to read the project’s documentation at https://chromium.googlesource.com.

The Summary of the Design of Ozone/Wayland

It has been a long time since Antonio Gomes started to work on this project. It started as a research project for our customer – Renesas Electronics, and was based on a former abstraction project with another clever name, “mus+ash” (pronounced “mustache”, you can read more about that here – Chromium, ozone, wayland and beyond).

Since then, the project has moved downstream and back upstream again (because of some unknowns related to “mus+ash”), and the design of the Ozone integration has also changed.

Currently, the Aura/Platform classes are injected into the browser process and communicate directly with the underlying Ozone platforms, including Wayland. In the browser process, the Wayland backend creates a connection with a Wayland compositor, while in the GPU process it only draws pixels into the created DMABUFs; it neither receives events nor creates surfaces there.
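
To give a feeling for what such a platform abstraction looks like, here is a deliberately simplified, hypothetical sketch of the browser/GPU split described above. These are not Chromium’s actual class names or signatures (the real entry point, ui::OzonePlatform, is far richer); the sketch only illustrates the idea that windowing and input live in the browser process while the GPU process only deals with buffers.

    // Illustrative only: a hypothetical, stripped-down "platform" interface in
    // the spirit of Ozone. Chromium's real ui::OzonePlatform is much richer,
    // and these are not its actual class or method names.
    #include <memory>

    class SketchPlatformWindow {
     public:
      virtual ~SketchPlatformWindow() = default;
      virtual void Show() = 0;  // windowing lives on the browser-process side
    };

    class SketchSurfaceFactory {
     public:
      virtual ~SketchSurfaceFactory() = default;
      // GPU-process side: only produces and presents buffers (e.g. DMABUFs).
      virtual void PresentBuffer(int dmabuf_fd) = 0;
    };

    class SketchOzonePlatform {
     public:
      virtual ~SketchOzonePlatform() = default;
      // Browser process: talks to the display server (e.g. opens the Wayland
      // connection) and delivers input events to the created windows.
      virtual std::unique_ptr<SketchPlatformWindow> CreateWindow() = 0;
      // GPU process: no input, no window management, just drawing surfaces.
      virtual SketchSurfaceFactory* GetSurfaceFactory() = 0;
    };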

Migrating Away From the Legacy X11-only Backend

It is worth mentioning that Igalia has been working on both Ozone/X11 and Ozone/Wayland.

Since June 2020, we have been working on making the choice of Ozone on Linux a runtime decision rather than a compile-time one. At the moment, one can try Ozone by running Chrome downloaded from the development channel with the --enable-features=UseOzonePlatform --ozone-platform=wayland/x11 runtime flags.

That approach allows us not only to reach a bigger audience of users willing to test the Ozone capabilities, but also to achieve better feature parity between non-Ozone/X11 and Ozone/X11/Wayland.

That is, most of the features and code paths are shared between the two implementations, and the paths that are not compatible with Ozone are being refactored at the moment.

Once all the incompatible paths are refactored (just a few of them remain) and all the available test suites are enabled on the Linux/Ozone bot, we will start what is known as a “finch trial”. This allows Ozone to be enabled by default for some percentage of users (about 10%). If the finch trial goes well, the percentage of users will be gradually increased to 100%, and we will start removing the old non-Ozone/X11 implementation.

Wayland + Tab Drag

If you’ve been trying it out, you might have already noticed that Ozone/Wayland does not support the Tab Drag feature well enough yet. The problem is the lack of a Wayland protocol for it.

At the moment, my colleague Nick Diego is working on defining a protocol for tab dragging and implementing it in Chromium.

Unfortunately, Ozone will fall back to X11/XWayland for compositors that do not support the aforementioned protocol. However, as more and more compositors gain support for it, Chrome will whitelist them.

I will not go into the details of that effort in this blog post, but rather leave a link to the design document that Nick has created – Tab Dragging on Ozone/Wayland.

Summary

This blog post has been a rather brief summary of the design, features, and status of the project. Thank you for reading it. Hopefully, when we start the finch trial, I will write another post telling you how it goes. Bye.

by msisov at November 20, 2020 11:28 AM

November 13, 2020

Alexander Dunaev

HiDPI support in Chromium for Wayland

It all started with this bug. The description sounded humble and harmless: the browser ignored some command line flag on Wayland. A screenshot was attached which clearly showed that Chromium (version 72 at that time, spring 2019) did not respect the screen density and looked blurry on a HiDPI screen.

HiDPI literally means small pixels. It is hard to tell now what the first HiDPI screen was, but I assume that their wide recognition came around 2010 with Apple’s Retina displays. Ultra HD was standardised in 2012, defining the minimum resolution and aspect ratio for what today is known informally as 4K, and 4K screens for laptops have pixels that are small enough to call them HiDPI. This Chromium issue, dated 2012, says that the Linux port lacks support for HiDPI while the Mac version already has it. On the other hand, HiDPI on Windows was tricky even in 2014.

‘That should be easy. Apparently it’s upscaled from low resolution. Wayland allows setting scale for the back buffers, likely you’ll have to add a single call somewhere in the window initialisation’, a colleague said.

Like many stories that begin this way, this one turned out to be wrong. It was not so easy. Setting the buffer scale did the right thing indeed, but it was absolutely not enough. It turned out that support for HiDPI screens was entirely missing in our implementation of the Wayland client. On my way to the solution, I found that scaling support in Wayland is non-trivial and sometimes confusing. Since I finished this work, I have been asked a few times about what happens there, so I thought that writing it all down in a post would be useful.

Background

Modern desktop environments usually allow configuring the scale of the display at the global system level. This allows all standard controls and window decorations to be sized proportionally. For applications that use those standard controls, this is a happy ending: everything will be scaled automatically. Those which prefer doing everything themselves have to get the current scale from the environment and adjust their rendering. Chromium does exactly that: internally it has a so-called device scale factor. This factor is applied equally to all sizes and locations, and when rendering images and fonts. No other code ever has to bother; everything works within this scaled coordinate system, known as device independent pixels, or DIP. The device scale factor can take fractional values like 1.5, but, because it is applied at the stage of rendering, the result looks nice. The system scale is used as the default device scale factor, and the user can override it using the command line flag named --force-device-scale-factor. However, this is the very flag which did not work in the bug mentioned at the beginning of this story.

Note that for X11 the ‘natural’ scale is still physical pixels. Despite having the system-wide scale, the system talks to the application in pixels, not in DIP. It is the application that is responsible for handling the scale properly. If it does not, it will look perfectly sharp, but its details will perhaps be too small for the naked eye.

However, Wayland does it a bit differently. The system scale there is respected by the compositor when pasting buffers rendered by clients. So, if some application has no idea about the system scale and renders itself normally, the compositor will upscale it.  This is what originally happened to Chromium: it simply drew itself at 100%, and that image was then stretched by the system compositor. Remember that the Wayland way is giving a buffer to each application and then compositing the screen from those buffers, so this approach of upscaling buffers rendered by applications is natural. The picture below shows what that looks like. The screenshot is taken on a HiDPI display, so in order to see the difference better, you may want to see the full version (click the picture to open).

What Chromium looked like when it did not set its back buffer scale

Firefox (left) vs. Chromium (right)

How do Wayland clients support HiDPI then?

Level 1. Basic support

Each physical output device is represented at the Wayland level by an object named output. This object has a special integer property named buffer scale that tells literally how many physical pixels are used to represent a single logical pixel. The application’s back buffer has that property too. If the scales do not match, Wayland will simply scale the raster image, thus emulating a ‘normal DPI’ device for an application that is not aware of buffer scales.

The first thing the window is supposed to do is to check the buffer scale of the output that it currently resides on, and to set the same value as its back buffer scale. This basically makes the application use all available physical pixels: as the scales of the buffer and the output are the same, Wayland will not re-scale the image.
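
As a minimal sketch of this first step, using the core Wayland client API from wayland-client.h (and assuming wl_output was bound at version 2 or higher and wl_compositor at version 3 or higher): the client listens for the output’s scale event and then advertises the same value as the buffer scale of its surface. The SetupHiDpi helper and the global variable are made up for the example; registry binding, roundtrips and multi-output handling are omitted.

    #include <wayland-client.h>

    static int32_t g_output_scale = 1;

    // wl_output's "scale" event reports how many physical pixels correspond to
    // one logical pixel on that output.
    static void OnOutputScale(void* data, wl_output* output, int32_t factor) {
      g_output_scale = factor;
    }

    // The remaining wl_output events are required by the listener but unused here.
    static void OnOutputGeometry(void*, wl_output*, int32_t, int32_t, int32_t,
                                 int32_t, int32_t, const char*, const char*,
                                 int32_t) {}
    static void OnOutputMode(void*, wl_output*, uint32_t, int32_t, int32_t,
                             int32_t) {}
    static void OnOutputDone(void*, wl_output*) {}

    static const wl_output_listener kOutputListener = {
        OnOutputGeometry, OnOutputMode, OnOutputDone, OnOutputScale};

    void SetupHiDpi(wl_surface* surface, wl_output* output) {
      wl_output_add_listener(output, &kOutputListener, nullptr);
      // ...after a round trip, so the scale event has been delivered:
      wl_surface_set_buffer_scale(surface, g_output_scale);
      // The attached buffer must now be g_output_scale times larger in each
      // dimension than the surface's logical size.
      wl_surface_commit(surface);
    }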

Back buffer scale is set but rendering is not aware of that

Chromium now renders a sharp image, but all details are half their normal size

The next thing is fixing the rendering so that it scales things to the right size. Using the output buffer scale as the default is a good choice: the result will be ‘normal size’. For Chromium, this means simply setting the device scale factor to the output buffer scale.

Now Chromium looks right

All set now

The final bit is slightly trickier.  Wayland sends UI events in DIP, but expects the client to send surface bounds in physical pixels. That means that if we implement something like interactive resize of the window, we will also have to do some math to convert the units properly.
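
For illustration, here is a minimal sketch of that conversion, assuming an integer buffer scale; the struct and function names are made up for this example.

    // Convert between device independent pixels (DIP) and physical pixels,
    // given the integer scale of the output the window is currently on.
    struct SizeDip { int width; int height; };
    struct SizePx  { int width; int height; };

    SizePx DipToPx(const SizeDip& dip, int scale) {
      return {dip.width * scale, dip.height * scale};
    }

    SizeDip PxToDip(const SizePx& px, int scale) {
      // Integer division is safe here because pixel sizes are produced as
      // multiples of the scale in the first place.
      return {px.width / scale, px.height / scale};
    }

For example, with a scale of 2, a 640×480 DIP window corresponds to a 1280×960 pixel buffer.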

This is enough for the basic support. The application will work well on a modern laptop with a 4K display. But what if more than a single display is connected, and they have different pixel densities?

Level 2. Multiple displays

If there are several output devices present in the system, each one may have its own scale. This makes things more complicated, so a few improvements are needed.

First, the window wants to know that it has been moved to another device.  When that happens, the window will ask for the new buffer scale and update itself.
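
As an illustration of how that can be done with the core Wayland client API: a wl_surface emits enter and leave events that tell the client which wl_output(s) it is currently shown on, and the client can update its scale from there. This is a hedged sketch; LookupOutputScale and UpdateDeviceScaleFactor are assumed application-side helpers, not part of any real API, and multi-output bookkeeping is omitted.

    #include <wayland-client.h>

    int32_t LookupOutputScale(wl_output* output);   // assumed app-side helper
    void UpdateDeviceScaleFactor(int32_t scale);    // assumed app-side helper

    static void OnSurfaceEnter(void* data, wl_surface* surface,
                               wl_output* output) {
      // The surface is now (also) shown on `output`: adopt its scale.
      const int32_t scale = LookupOutputScale(output);
      wl_surface_set_buffer_scale(surface, scale);
      UpdateDeviceScaleFactor(scale);  // re-render at the new scale
    }

    static void OnSurfaceLeave(void* data, wl_surface* surface,
                               wl_output* output) {
      // A surface may span several outputs; a real client would keep a set of
      // the outputs it is on and pick, say, the largest scale among them.
    }

    static const wl_surface_listener kSurfaceListener = {OnSurfaceEnter,
                                                         OnSurfaceLeave};

    void TrackSurfaceOutputs(wl_surface* surface) {
      wl_surface_add_listener(surface, &kSurfaceListener, nullptr);
    }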

Second, there may be implementation-specific issues. For example, some Wayland servers initially put a new sub-surface (which is used for menus) onto the default output, even if its parent surface has been moved to another output. This may cause weird changes of scale during initialisation. In Chromium, we just made it so that the sub-surface always takes its scale from its parent.

Level 3? Fractional scaling?

Not really. Fractional scaling basically means ‘non-even’ scales like 125%. The entire feature was somewhat controversial when it was announced, because of how rendering in Wayland is performed: a non-even scale inevitably involves raster operations that make the image blurry. However, all of that is transparent to the applications. Nothing new has been introduced at the level of the Wayland protocols.

Conclusion

Although this task was not as simple as we thought, in the end it turned out to be not too hard. Check the output scale, set the back buffer scale, scale the rendering, and translate pixels to DIP and vice versa at certain points. Pretty straightforward, and if you are trying to do something related, I hope this post helps you.

The issue is that there are many implementations of Wayland servers out there, not all of them are consistent, and some of them have bugs. It is worth testing the solution on a few distinct Linux distributions and looking for discrepancies in behaviour.

Anyway, Chromium with native Wayland support has recently reached beta—and it supports HiDPI! There may be bugs too, but the basic support should work well. Try it, and let us know if something is not right.

Note: the Wayland support is so far experimental. To try it, you would need to launch chrome via the command line with two flags:
--enable-features=UseOzonePlatform
--ozone-platform=wayland

by adunaev at November 13, 2020 10:10 AM

October 30, 2020

Jacobo Aragunde

Event management in X11 Chromium

This is a follow-up of my previous post, where I was trying to fix the bug #1042864 in Chromium: key strokes happening on native dialogs, like open and save dialogs, were not reported to the screen reader.

After learning how accessibility tools (ATs) register listeners for key events, I found out the problem was not actually there; I had to investigate how events arrive from the X11 server to the browser, and how they are forwarded to the ATs.

Not this kind of event

Events arrive from the X server

If you are running Chromium on Linux with the X11 backend (most likely, as it is the default), the Chromium browser process receives key press events from the X server. Then, it finds out whether the target of those events is one of its browser windows, and sends them to the proper Window object to be processed.

These are the classes involved in the first part of this process:

The interface PlatformEventSource represents an undetermined source of events coming from the platform, and a PlatformEventDispatcher is any object in the browser capable of managing those events, dispatching them to the actual web page or UI element. These two classes are related: the PlatformEventSource keeps a list of dispatchers it will forward the event to, if they can manage it (CanDispatchEvent).

The X11EventSource class implements PlatformEventSource; in particular, it contains the code managing the events coming from an X11 server. It additionally keeps a list of XEventDispatcher objects; XEventDispatcher is a class to manage X11 Event objects independently, but it is not an implementation of PlatformEventDispatcher.

The X11Window class is the central piece, implementing both the PlatformEventDispatcher and the XEventDispatcher interfaces, in addition to the XWindow class. It has all the means required to find out if it can dispatch an event, and do it.

The main event processing loop looks like this:

1. An event arrives at X11EventSource.

2. X11EventSource loops through its list of XEventDispatchers and calls CheckCanDispatchNextPlatformEvent on each of them.

3. The X11Window implementing that function checks whether the XWindow ID of the event target matches the ID of the XWindow represented by that object, and saves the XEvent object if it does.

4. X11EventSource calls DispatchEvent as implemented by its parent class, PlatformEventSource.

5. The PlatformEventSource loops through its list of PlatformEventDispatchers and calls CanDispatchEvent on each one of them.

6. The X11Window object, which had previously run CheckCanDispatchNextPlatformEvent, simply verifies whether the XEvent object was saved then, and considers that a confirmation that it can dispatch the event.

7. When one of the dispatchers answers positively, it receives the event for processing in a call to DispatchEvent, which is implemented in X11Window.

8. If it is a keyboard event, X11Window takes the steps required to send it to any ATs listening for it, which had previously been registered via ATK.

9. When X11Window finishes processing the event, it returns POST_DISPATCH_STOP_PROPAGATION, telling PlatformEventSource to stop looping through the rest of the dispatchers.

This is a sequence diagram summarizing the process (a simplified code sketch of the same source/dispatcher pattern follows below):
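
The following is a deliberately simplified, hypothetical rendering of that pattern; Chromium’s real PlatformEventSource, PlatformEventDispatcher, X11EventSource and XEventDispatcher interfaces are richer, and the names and signatures here are not the actual ones.

    #include <vector>

    struct Event {
      unsigned long target_window_id;  // e.g. the XWindow the event is aimed at
      bool is_key_event;
    };

    class Dispatcher {
     public:
      virtual ~Dispatcher() = default;
      virtual bool CanDispatchEvent(const Event& event) = 0;
      // Returns true to stop propagation to the remaining dispatchers
      // (POST_DISPATCH_STOP_PROPAGATION in Chromium's terms).
      virtual bool DispatchEvent(const Event& event) = 0;
    };

    class EventSource {
     public:
      void AddDispatcher(Dispatcher* dispatcher) {
        dispatchers_.push_back(dispatcher);
      }

      // Called for every event received from the display server.
      void OnEventFromServer(const Event& event) {
        for (Dispatcher* dispatcher : dispatchers_) {
          if (!dispatcher->CanDispatchEvent(event))
            continue;
          if (dispatcher->DispatchEvent(event))
            break;  // the event was handled; stop looping
        }
      }

     private:
      std::vector<Dispatcher*> dispatchers_;
    };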

Events leave to the ATs

As explained in the previous post, ATs can register callbacks for key press events, which ultimately call AtkUtilClass::add_key_event_listener. AtkUtilClass is a struct of function pointers; the actual implementation is provided by Chromium in the AtkUtilAuraLinux class, which keeps a list of those callbacks.

When an X11Window object encounters a keyboard event targeting its own X Window, it calls X11ExtensionDelegate::OnAtkEvent(), which is actually implemented by the DesktopWindowTreeHostLinux class; it ultimately hands the event to the AtkUtilAuraLinux class and runs HandleAtkEvent(), which will loop through, and run, any listeners that may have been registered.

Native dialogs are different

Native dialogs are stand-alone windows in the X server, different from the browser window that called them, and the browser process does not wrap them in an X11Window object. That is considered unnecessary, because the windows for native dialogs talk to the X server and receive events from it directly.

They do belong to the browser process, though, which means that the browser will still receive events targeting the dialog windows. Those events will go through all the steps mentioned above only to eventually be dismissed, because there is no X11Window object in the browser matching the ID of the event’s target window.

Another consequence of dialog windows belonging to the browser process is that the AtkUtilClass struct points to Chromium’s own implementation, and here comes the problem… The dialog is expected to manage its own events through GTK+ code, including the GTK+ implementation of AtkUtilClass, but Chromium overrode it. The key press listeners that ATs registered are kept in Chromium code, so the dialog cannot notify them.

Finally, fixing the problem

Chromium does receive the keyboard events addressed to the dialog windows, but it does nothing with them because the target of those events is not a browser window. That gives us, though, a leg up towards building a solution.

To fix the problem, I made Chromium X Windows manage the keyboard events addressed to the native dialogs in addition to their own. For that, I took advantage of the “transient” property, which indicates a dependency of one window on another: the dialog window had been set as transient for the browser window. In my first approach, I modified X11Window::CheckCanDispatchNextPlatformEvent() to verify whether the target of the event was a transient window of the browser X Window; in that case it would hand the event to X11ExtensionDelegate to be sent to ATs, following the code path previously explained. It stopped processing at that point, otherwise the browser window would have received key presses directed to the dialog.

The approach had one performance problem: I was calling the X server to check that property for every keystroke, and that call implied using synchronous IPC. This was unacceptable! But it could be worked around: we could also notify the corresponding internal X11Window object about the existence of this transient window when the dialog is created. This implies no IPC at all; we just store one new property in the X11Window object, which can be checked locally when keyboard events are processed.
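
A hypothetical sketch of that final approach (this is not the actual Chromium patch; the class and method names below are invented for illustration): the window remembers the X11 id of its transient dialog when the dialog is created, so event dispatch can match against it locally, without any synchronous round trip to the X server.

    #include <cstdint>

    using XWindowId = uint32_t;

    class SketchX11Window {
     public:
      // Called when the native dialog is created and marked transient for us.
      void SetTransientWindow(XWindowId dialog_id) {
        transient_window_ = dialog_id;
      }

      // No round trip to the X server: the check is a local comparison.
      bool CanDispatchEvent(XWindowId event_target) const {
        return event_target == own_window_ || event_target == transient_window_;
      }

      void DispatchKeyEvent(XWindowId event_target, const void* key_event) {
        NotifyAssistiveTechnologies(key_event);  // e.g. via an ATK bridge
        if (event_target == transient_window_)
          return;  // the dialog handles the keystroke itself; stop here
        // ...normal browser-window key handling would continue here...
      }

     private:
      void NotifyAssistiveTechnologies(const void* key_event) { /* ... */ }

      XWindowId own_window_ = 0;
      XWindowId transient_window_ = 0;
    };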

This is a link to the review process of the patch, if you are interested in its history. To sum up, in the final solution:

1. Chromium creates the native dialog and calls XWindow::SetTransientWindow, setting that property in the corresponding browser X Window.

2. When Chromium receives a keyboard event, it is captured by the X11Window object whose transient window property has been set before.

3. X11ExtensionDelegate::OnAtkEvent() is called for that event; no further processing of the event happens in Chromium.

4. The native dialog code will also receive the event and manage the keystroke accordingly.

I hope you enjoyed this trip through Chromium event processing code. If you want to use the diagrams in this post, you may find their Dia source files in this link. Happy hacking!

by Jacobo Aragunde Pérez at October 30, 2020 05:00 PM

August 13, 2020

Javier Fernández

Improving CSS Custom Properties performance

Chrome 84 reached the stable channel a few weeks ago, and there are already several great posts describing the many important additions, interesting new features, security fixes and improvements in privacy policies ([1], [2], [3], [4]) it contains. However, there is a change that I worked on in this release which might have passed unnoticed by most, but which I think is very valuable: a change regarding CSS Custom Properties (variables) performance.

The design of CSS, in general, takes great care in considering how features are designed with respect to making it possible for them to perform well. However, implementations may not perform as well as they could, and it takes a considerable amount of time to understand how authors use the features and which cases are the most relevant for them.

CSS Custom Properties are an interesting example to look at here: they are a wonderful feature that provides a lot of advantages for web authors. For a whole lot of cases, all of the implementations of CSS Custom Properties perform well enough that most people won’t notice. However, we at Igalia have been analyzing several use cases and looking at some reports about their performance in different implementations.

Let’s consider a fairly straightforward example in which an author sets a single property in a toggleable class in the body, and then uses that property several times deeper in the tree to change the foreground color of some text.

    
   .red { --prop: red; }
   .green { --prop: green; }

   [example markup omitted: a tree of elements, some of which use var(--prop) for their text color]

Only about 20% of the elements actually use this property, 5 elements deep into the tree, and only to change the foreground color.

To evaluate Chromium’s performance in a case like this, we can define a new perf test, using the perf tools the Chromium project has available for browser engineers. In this case, we want a huge tree so that we can better evaluate the impact of the different optimizations.

    
    .green { --prop: green; }
    .red { --prop: red; }

    [test markup omitted: a much larger tree built along the same lines]

These are the results obtained running the test in Chrome 83:

avg        median     stdev     min        max
163.74 ms  163.79 ms  3.69 ms   158.59 ms  163.74 ms

I admit that it’s difficult to evaluate the results, especially considering the number of nodes of such a huge DOM tree. Let’s compare the results of the same test on Firefox, using different numbers of nodes.

Nodes              50K        20K       10K       5K        1K       500
Chrome 83          163.74 ms  55.05 ms  25.12 ms  14.18 ms  2.74 ms  1.50 ms
FF 78              28.35 ms   12.05 ms  6.10 ms   3.50 ms   1.15 ms  0.55 ms
Ratio (FF/Chrome)  1/6        1/5       1/4       1/4       1/2      1/3

As I commented before, the data are more accurate when the DOM tree has a lot of nodes; in any case, the difference is quite clear and shows that there is plenty of room for improvement. WebKit-based browsers have results more similar to Chromium’s as well.

Performance tests like the one above can be added to browsers for tracking improvements and regressions over time, so we’ve added that to Chromium’s tree (r763335): we’d like to see it get faster over time, and definitely cannot afford regressions (see the Chrome Performance Dashboard and the ChangeStyleCustomPropertyDeclaration test for details).

So… What can we do?

In Chrome 83 and lower, whenever the custom property declaration changed, the new declaration would be inherited by the whole tree. This inheritance implied executing the whole CSS cascade and recalculating the styles of all the nodes in the entire tree, since, with this approach, any node may be affected.

Chrome had already implemented an optimization in the CSS cascade implementation for regular CSS properties that don’t depend on any other property to resolve their value. This subset of CSS properties is defined as Independent Properties in the Chromium codebase. The optimization mentioned before affects how the inheritance mechanism is implemented for these independent properties: whenever one of them changes, instead of recalculating the styles of the inherited properties, children can just copy the parent’s whole computed style. Blink’s style engine has a component known as the Matched Properties Cache, responsible for deciding when it is possible to avoid the style resolution of an element and instead perform an efficient copy of the matched computed style. I’ll get back to this concept in the last part of this post.
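
As a rough illustration of that inheritance shortcut, here is a hypothetical sketch (not Blink’s actual style engine; the types and the RunFullCascade helper are invented, and a real child would still merge its own declarations): a node whose computed style does not reference any custom property can simply reuse its parent’s computed style instead of re-running the cascade.

    #include <memory>
    #include <vector>

    struct ComputedStyle {
      bool references_custom_properties = false;
      // ...resolved property values would live here...
    };

    struct Node {
      std::shared_ptr<const ComputedStyle> style;
      std::vector<Node*> children;
    };

    // Stands in for the expensive path: running the full cascade for a node.
    std::shared_ptr<const ComputedStyle> RunFullCascade(Node& node);

    void RecalcAfterCustomPropertyChange(
        Node& node, std::shared_ptr<const ComputedStyle> parent_style) {
      if (node.style && !node.style->references_custom_properties) {
        // Cheap path: the node cannot observe the changed variable, so inherit
        // by sharing the parent's computed style (ignoring, for brevity, that
        // a real child would still merge its own declarations).
        node.style = parent_style;
      } else {
        // Expensive path: the node uses var(...), so it must be resolved again.
        node.style = RunFullCascade(node);
      }
      for (Node* child : node.children)
        RecalcAfterCustomPropertyChange(*child, node.style);
    }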

In the case of CSS Custom Properties, we could apply a similar approach as a good first step. We can consider that nodes with computed styles that don’t have references to custom property declarations shouldn’t be affected by the new declaration, and we can implement the inheritance directly by copying the parent’s computed style. The patch with the optimization I’ve implemented in r765278 initially landed in Chrome 84.0.4137.0.

Let’s look at the result of this one action in the Chrome Performance Dashboard:

That’s a really good improvement!

However, it’s also just a first step. It’s clear that Chrome still has a wide margin for improvement in this case, as does any WebKit-based browser – Firefox is still, impressively, markedly faster, as described in the bug report filed to track this issue. The following table shows the results of the different browsers together; even when disabling the multi-thread capabilities of Firefox’s Stylo engine (STYLO_THREAD=1), FF is much faster than Chrome with the optimization applied.

          Chrome 83   Chrome 84   FF 78      FF 78 th=1
avg       163.74 ms   117.37 ms   28.35 ms   38.25 ms
median    163.79 ms   117.52 ms   28.50 ms   38.50 ms
stdev     3.69 ms     1.98 ms     0.93 ms    1.86 ms
min       158.59 ms   113.66 ms   26.00 ms   35.00 ms
max       163.74 ms   120.87 ms   30.00 ms   41.00 ms

Before continuing, I want to get back to the Matched Properties Cache (MPC) concept, since it has an important role in these style optimizations. This cache is not a new concept in Chrome’s engine; as a matter of fact, it is also used in WebKit, since it was implemented long ago, before the fork that created the new Blink engine. However, Google has been working a lot on this area in recent years, and some of the most recent changes in the MPC have had an important impact on style resolution performance. As a result of this work, elements with independent and non-independent properties using CSS Variables might produce cache hits in the MPC. The results on the Performance Dashboard show a considerable improvement in the mentioned ChangeStyleCustomPropertyDeclaration test (avg: 108.06 ms).

Additionally, there are several other cases where the use of CSS Variables has a considerable impact on performance, compared with using regular CSS properties. Obviously, resolving CSS Variables has a cost, so it’s clear that we could apply additional optimizations that reduce the impact of variable resolution, especially for handling specific style changes that might not affect a substantial portion of the DOM tree. I’ve been experimenting with the MPC to explore the idea of an independent CSS Custom Properties cache; nodes with variables referencing the same custom property would produce cache hits in the MPC, even though other properties don’t match. The preliminary approach I’ve been implementing consists of a new matching function, specific to custom properties, and a mechanism to transfer/copy the property’s data to avoid resolving the variable again, since the property’s declaration hasn’t changed. We would need to apply the CSS cascade again, but at least we could save the cost of the variable resolution.

Of course, at the end of the day, improving performance has costs and challenges – and it’s hard to keep performance even once you get it. But if we really want performant CSS Custom Properties, this means that we have to decide to prioritize this work. Currently there is reluctance to explore the concept of a new Custom Properties-specific cache – the challenge is big and the risks are not non-existent; cache invalidation can get complicated. The point is that we have to understand that we aren’t all going to agree on what is important enough to warrant attention, or how much investment, or when. Web authors must convince vendors that these use cases are worth optimizing for and that the cost and risks of such complex challenges should be assumed by them.

This work has been sponsored by Bloomberg, which I consider one of the most important contributors to the Web Platform. Over several years, the vision of this company and its sense of responsibility as a consumer of the platform have led to many important contributions that we all enjoy now. Although CSS Grid Layout might be the most remarkable one, there are many others not that big, like this work on CSS Custom Properties, or several other new features of the CSS Text specification. This is a perfect example of a company that tries to change priorities and adapt the web platform to its needs and to the use cases it considers most aligned with its business strategy.

I understand that not every user of the web platform can make this kind of investment. This is why I believe that initiatives like Open Prioritization could help to move the web platform in a positive direction, by providing a way for us to move past a lot of these conversations and focus on the needs that some web authors and users of the platform consider more important, or higher priority. Improving performance for CSS Custom Properties isn’t currently one of the projects we’ve listed, but perhaps it would be an interesting one we might try in the future if we are successful with these. If you haven’t already, have a look and see if there is something there that is interesting to you or your company – pledges of any size are good – ten thousand $1 donations are every bit as good as ten $1000 donations. Together, we can make a difference, and we all benefit.

Also, we would love to hear about your ideas. Is improving CSS Custom Properties performance important to you? What else is? Share your comments with us on Twitter, either with me (@lajava77) or our developer advocate Brian Kardell (@briankardell), or email me at jfernandez@igalia.com. I’d be glad to answer any questions about the Open Prioritization experiment.

by jfernandez at August 13, 2020 06:16 PM

July 27, 2020

Jacobo Aragunde

The trip of a key press event in Chromium accessibility

It’s amazing to think about how much computing goes into something as simple as a keystroke, which we just take for granted. Recently, I was fixing a bug related to accessibility key events, and to do this I first had to understand the complex trip that these events take when they arrive at the browser – from the X server until they reach the accessibility system.

Let me start from the beginning. I’m working on the accessibility of the Chromium browser on Linux. The bug was #1042864: key strokes happening on native dialogs, like open and save dialogs, were not reported to the screen reader. The issue also affects Electron-based software; one important example is Visual Studio Code.

A cake with many layers

Linux accessibility is composed of many layers, and this fact stands out when working on a complex piece of software like Chromium, which comes with its own UI toolkit.

Currently, the most straightforward way to build accessible applications for the Linux desktop is using and implementing the hooks provided by the ATK (Accessibility Toolkit) API. GTK+ widgets already do so, so applications using them will be accessible by default; more complex software that implements custom widgets or its own toolkit will need to provide the code for the ATK entry points and emit the corresponding events. That’s the case for Chromium!

The screen reader, or any assistive technology (AT), has to get information and listen for events happening in any software running on the system, to transform them into something meaningful for its users, like speech or braille.

AT-SPI is the glue between these two ends: it runs at system level, observing what’s going on with applications and keeping a global registry of the accessible objects; it receives updates from applications, which translate local ATK objects into global AT-SPI objects; and it receives queries from ATs and then pings them when events of interest happen. It uses D-Bus for inter-process communication (IPC).

You can learn more about accessibility, in general and also in the web platform, in this great interview with my colleague Martin Robinson.

The trip of a keypress event

Let’s say we are building an AT in Python, making use of the pyatspi2 library. We want to listen for keypress events, so we run registerKeystrokeListener to register a callback function.

Pyatspi2 actually wraps AT-SPI’s function atspi_register_keystroke_listener, which eventually calls the remote method RegisterKeystrokeListener via D-Bus. The actual D-Bus remote calls happen in dbind.c.

We have jumped from our AT to the AT-SPI service, via IPC. The DeviceEventController interface provides the remote method mentioned above, and the actual code implementing it is in impl_register_keystroke_listener. Then, the function is added to a list of listeners for key events, in spi_controller_register_device_listener; the listeners in that list will be notified when an event happens, in spi_controller_notify_keylisteners.

AT-SPI will sit there, waiting for the events from applications to arrive. They will come over D-Bus, as they originate in different processes: the entry point for any D-Bus message in the AT-SPI core is handle_dec_method_from_idle. One of the operations, NotifyListenersSync, will run impl_notify_listeners_sync, which will later call the function spi_controller_notify_keylisteners we just mentioned and run all the registered listeners.

Who will call the remote method NotifyListenersSync? Applications will have to do it, if they want to be accessible. They could implement this themselves, but they are most likely using a wrapper library. In the case of GTK+, there is at-spi2-atk, which bridges ATK signals with the at-spi2-core D-Bus interfaces so applications don’t have to know them.

The bridge sports its own callback, named spi_atk_bridge_key_listener, which eventually calls the NotifyListenersSync method via D-Bus, at Accessibility_DeviceEventController_NotifyListenersSync. The callback was registered as an ATK key event listener in spi_atk_register_event_listeners: the function atk_add_key_event_listener, which is part of the ATK API, registers the key event listener, and it internally makes use of the AtkUtil struct as defined in atkutil.h.

AtkUtil functions are not implemented by the ATK library; instead, toolkits must provide them. GTK+ does so in _gtk_accessibility_override_atk_util; in the case of Chromium, the functions that populate the AtkUtil struct are defined in atk_util_auralinux_class_init. The particular function that registers the key event listener in Chromium is AtkUtilAuraLinuxAddKeyEventListener: the listener gets added to a list, whose members will later be called when an ATK key event is processed in Chromium, at HandleAtkKeyEvent.

Are we there yet?

There are more pieces of software involved in this issue: Chromium will receive key press events from the X server, involving another IPC connection and API layer, and there’s all the browser code managing them, where the solution was actually implemented. I will cover those parts, and the actual solution, in a future post.

Meanwhile, I hope you enjoyed reading this, and happy hacking!

by Jacobo Aragunde Pérez at July 27, 2020 03:01 PM