Planet Igalia

April 22, 2019

Thibault Saunier

GStreamer Editing Services OpenTimelineIO support

OpenTimelineIO is an open-source API and interchange format for editorial timeline information; it essentially provides a form of interoperability between different post-production video editing tools. It is developed by Pixar, and several other studios are contributing to the project, allowing it to evolve quickly.

We at Igalia recently landed support for the GStreamer Editing Services (GES) serialization format in OpenTimelineIO, making it possible to convert GES timelines to any format supported by the library. This is extremely useful for integrating GES into existing post-production workflows, as it allows projects in any format supported by OpenTimelineIO to be used in the GStreamer Editing Services, and vice versa.

On top of that, we are building a GESFormatter that allows us to transparently handle any file format supported by OpenTimelineIO. In practice, it will be possible to use cuts produced by other video editing tools in any project using GES, for instance Pitivi:

At Igalia we aim to make GStreamer ready for use in existing video post-production pipelines, and this work is one step in that direction. We are working on additional GES features to fill the remaining gaps, for instance nested timeline support and framerate-based timestamps. Once implemented, these features will improve compatibility with video editing projects created in other NLE software through OpenTimelineIO. Stay tuned for more information!

by thiblahute at April 22, 2019 03:21 PM

April 08, 2019

Philippe Normand

Introducing WPEQt, a WPE API for Qt5

WPEQt provides a QML plugin implementing an API very similar to the QWebView API. This blog post explains the rationale behind this new project, aimed at QtWebKit users.

Qt5 already provides multiple WebView APIs, one based on QtWebKit (deprecated) and one based on QtWebEngine (aka Chromium). WPEQt aims to provide a viable alternative to the former. QtWebKit is being retired and by now lags far behind upstream WebKit in terms of features and security fixes. WPEQt can also be considered an alternative to QtWebEngine, but bear in mind that the underlying Chromium web engine doesn’t support the same HTML5 features as WebKit.

WPEQt is included in WPEWebKit, starting from the 2.24 series. Bugs should be reported in WebKit’s Bugzilla. WPEQt’s code is published under the same licenses as WPEWebKit, the LGPL2 and BSD.

At Igalia we have compared WPEQt and QtWebKit using the BrowserBench tests. The JetStream 1.1 results show that WPEQt completes all the tests twice as fast as QtWebKit. The Speedometer benchmark doesn’t even finish, due to a crash in the QtWebKit DFG JIT. Although memory consumption looks similar in both engines, the upstream WPEQt engine is well maintained and includes security bug-fixes. Another advantage of WPEQt over QtWebKit is that its multimedia support is much stronger, with specs such as MSE, EME and media-capabilities being covered. WebRTC support is coming along as well!

So, to everybody still stuck with QtWebKit in their apps and not yet ready (or reluctant) to migrate to QtWebEngine: please have a look at WPEQt! The remainder of this post explains how to build and test it.

Building WPEQt

For the time being, WPEQt only targets Linux platforms using graphics drivers compatible with wayland-egl. Therefore, the end-user Qt application has to use the wayland-egl Qt QPA plugin. Under certain circumstances the EGLFS QPA might also work, YMMV.

Using a SVN/git WebKit snapshot

If you have a SVN/git development checkout of upstream WebKit, then you can build WPEQt with the following commands on a Linux desktop platform:

$ Tools/wpe/install-dependencies
$ Tools/Scripts/webkit-flatpak --wpe --wpe-extension=qt update
$ Tools/Scripts/build-webkit --wpe --cmakeargs="-DENABLE_WPE_QT=ON"

The first command will install the main WPE host build dependencies. The second command will setup the remaining build dependencies (including Qt5) using Flatpak. The third command will build WPEWebKit along with WPEQt.

Using the WPEWebKit 2.24 source tarball

This procedure is already documented in the WPE Wiki page. The only change required is the new CMake option for WPEQt, which needs to be explicitly enabled as follows:

$ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_WPE_QT=ON -GNinja

Then, invoke ninja, as documented in the Wiki.

Using Yocto

At Igalia we’re maintaining a Yocto overlay for WPE (and WebKitGTK). It was tested for the rocko, sumo and thud Yocto releases. The target platform we tested so far is the Zodiac RDU2 board, which is based on the Freescale i.MX6 QuadPlus SoC. The backend we used is WPEBackend-fdo which fits very naturally in the Mesa open-source graphics environment, inside Weston 5. The underlying graphics driver is etnaviv. In addition to this platform, WPEQt should also run on Raspberry Pi (with the WPEBackend-rdk or -fdo). Please let us know how it goes!

To enable WPEQt in meta-webkit, the qtwpe option needs to be enabled in the wpewebkit recipe:

PACKAGECONFIG_append_pn-wpewebkit = " qtwpe"

The resulting OS image can also include WPEQt’s sample browser application:

IMAGE_INSTALL_append = " wpewebkit-qtwpe-qml-plugin qt-wpe-simple-browser"

Then, on device, the sample application can be executed either in Weston:

$ qt-wpe-simple-browser -platform wayland-egl https://wpewebkit.org

Or with the EGLFS QPA:

$ # stop weston
$ qt-wpe-simple-browser -platform eglfs https://wpewebkit.org

Using WPEQt in your application

A sample MiniBrowser application is included in WebKit, in the Tools/MiniBrowser/wpe/qt directory. If you have a desktop build of WPEQt you can launch it with the following command:

$ Tools/Scripts/run-qt-wpe-minibrowser -platform wayland <url>

Here’s the QML code used for the WPEQt MiniBrowser. As you can see it’s fairly straightforward!

import QtQuick 2.11
import QtQuick.Window 2.11
import org.wpewebkit.qtwpe 1.0

Window {
    id: main_window
    visible: true
    width: 1280
    height: 720
    title: qsTr("Hello WPE!")

    WPEView {
        url: initialUrl
        focus: true
        anchors.fill: parent
        onTitleChanged: {
            main_window.title = title;
        }
    }
}

As explained in this blog post, WPEQt is a simple alternative to QtWebKit. Migrating existing applications should be straightforward because the API provided by WPEQt is very similar to the QWebView API. We look forward to hearing your feedback or inquiries on the webkit-wpe mailing list and you are welcome to file bugs in Bugzilla.

I couldn’t close this post without acknowledging the support of my company Igalia and of Zodiac; many thanks to them!

by Philippe Normand at April 08, 2019 10:20 AM

March 28, 2019

Jacobo Aragunde

The Chromium startup process

I’ve been investigating the process of Chromium startup, the classes involved and the calls exchanged between them. This is a summary of my findings!

There are several implementations of a browser living inside Chromium source code, known as “shells”. Chrome is the main one, of course, but there are other implementations like the content_shell, a minimal browser designed to exercise the content API; the app_shell, a minimal container for Chrome Apps, and several others.

To investigate the differences between the shells, we can start by checking the binary entry point and finding out how it evolves. This is a sequence diagram that starts from the content_shell main() function:

content_shell and app_shell sequence diagram

It creates two objects, ShellMainDelegate and ContentMainParams, then hands control to ContentMain() as implemented in the content module.

Chrome’s main is very similar: it also spawns a couple of objects and then hands control to ContentMain(), following exactly the same code path from that point onward:

Chrome init sequence diagram

If we took a look at app_shell, it would be very similar, and it’s probably the same for other shells. So where’s the magic? What’s the difference between the many shells in Chromium? The key is the implementation of that first object created in the main() function:

ContentMainDelegate class diagram

Those *MainDelegate objects created in main() are implementations of ContentMainDelegate. This delegate gets control at key moments of the initialization process, so the shells can customize what happens. Two important events are the calls to CreateContentBrowserClient and CreateContentRendererClient, which enable the shells to customize the behavior of the browser and renderer processes.

ContentBrowserClient class diagram

The diagram above shows how the ContentMainDelegate implementations provided by the different shells each instantiate their own implementation of ContentBrowserClient. This class runs in the UI thread and is able to customize the browser logic: its API can enable or disable certain parameters (e.g. AllowGpuLaunchRetryOnIOThread), provide delegates for certain objects (e.g. GetWebContentsViewDelegate), etc. A remarkable responsibility of ContentBrowserClient is providing an implementation of BrowserMainParts, which runs code at certain stages of the initialization.
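In simplified form, this delegation can be sketched with Python stand-ins (these mirror the shape of the C++ classes, but they are illustrative stand-ins, not the real Chromium API):

```python
class ContentBrowserClient:
    """Base hooks the content layer calls into during browser startup."""
    def get_web_contents_view_delegate(self):
        return None  # default: no customization


class ContentMainDelegate:
    """Each shell implements this to plug its own clients into ContentMain()."""
    def create_content_browser_client(self):
        return ContentBrowserClient()


class ShellContentBrowserClient(ContentBrowserClient):
    """content_shell's browser-side customization."""
    def get_web_contents_view_delegate(self):
        return "shell view delegate"  # stand-in for a real delegate object


class ShellMainDelegate(ContentMainDelegate):
    """content_shell's delegate returns its own browser client."""
    def create_content_browser_client(self):
        return ShellContentBrowserClient()


def content_main(delegate):
    # The content layer only sees the abstract interface; the concrete
    # shell behavior comes from whichever delegate main() passed in.
    browser_client = delegate.create_content_browser_client()
    return browser_client.get_web_contents_view_delegate()


print(content_main(ShellMainDelegate()))  # -> shell view delegate
```

Swapping ShellMainDelegate for another delegate is all it takes to get a different shell; the rest of the startup path is shared.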

There is a parallel hierarchy of ContentRendererClient classes, which works analogously to what we’ve just seen for ContentBrowserClient:

ContentRendererClient class diagram

The specific case of extensions::ShellContentRendererClient is interesting because it contains the details to set up the extension API:

ShellContentRendererClient class diagram

The purpose of both ExtensionsClient and ExtensionsRendererClient is to set up the extensions system. The difference lies in the knowledge of the renderer process: only methods that require that knowledge should live in ExtensionsRendererClient; the rest belong in ExtensionsClient, which already has a much bigger API.
The specific implementation of ShellExtensionsRendererClient is very simple, but it owns an instance of extensions::Dispatcher, an important class that sets up extension features on demand whenever necessary.

The investigation may continue in different directions, and I’ll try to share more reports like this one. Finally, here are the source files for the diagrams and a shared document containing the same information as this report, where any comments, corrections and updates are welcome!

by Jacobo Aragunde Pérez at March 28, 2019 05:03 PM

March 27, 2019

Michael Catanzaro

Epiphany 3.32 and WebKitGTK 2.24

I’m very pleased to (belatedly) announce the release of Epiphany 3.32 and WebKitGTK 2.24. This Epiphany release contains far more changes than usual, while WebKitGTK continues to improve steadily as well. There are a lot of new features to discuss, so let’s dive in.

Dazzling New Address Bar

The most noticeable change is the new address bar, based on libdazzle’s DzlSuggestionEntry. Christian put a lot of effort into designing this search bar to work for both Builder and Epiphany, and Jan-Michael helped integrate it into Epiphany. The result is much nicer than we had before:

The address bar is a central component of the user interface, and this clean design is important to provide a quality user experience. It should also leave a much better first impression than we had before.

Redesigned Tabs Menu

Epiphany 3.24 first added a tab menu at the end of the tab bar. This isn’t very useful if you have only a few tabs open, but if you have a huge number of tabs then it helps navigate through them. Previously, this menu only showed the page titles of the tabs. For 3.32, Adrien has converted this menu to a nice popover, including favicons, volume indicators, and close buttons. These enhancements were primarily aimed at making the browser easier to use on mobile devices, where there is no tab bar, but they’re a nice improvement for desktop users, too.

(On mobile, the tab rows are much larger, to make touch selection easier.)

Touchpad Gestures

Epiphany now supports touchpad gestures. Jan-Michael first added a three-finger swipe to Epiphany, for navigating back and forward. Then Alexander (Exalm) decided to go and rewrite it, pushing the implementation down into WebKit to share as much code as possible with Safari. The end result is a two-finger swipe. This was much more involved than I expected as it required converting a bunch of Apple-specific Objective C++ code into cross-platform C++, but the end result was worth the effort:

Applications that depend on WebKitGTK 2.24 may opt-in to these gestures using webkit_settings_set_enable_back_forward_navigation_gestures().

Alexander also added pinch zoom.

Variable Fonts

Carlos Garcia decided to devote some attention to WebKit’s FreeType font backend, and the result speaks for itself:

Emoji 🦇

WebKit’s FreeType backend has supported emoji for some time, but there were a couple problems:

  • Most emoji combinations were not supported, so while characters like 🧟 (zombie) would work just fine, characters like 🧟‍♂️ (man zombie) and 🧟‍♀️ (woman zombie) were broken. Carlos fixed this. (Technically, only emoji combinations using a certain character code were broken, but that was most of them.)
  • There was no code to prefer emoji fonts for rendering emoji, meaning emoji would almost always be displayed in non-ideal fonts, usually DejaVu, resulting in a black and white glyph rather than color. Carlos fixed this, too. This seems to work properly in Firefox on some websites but not others, and it’s currently WONTFIXed in Chrome. It’s good to see WebKit ahead of the game, for once. Note that you’ll see color on this page regardless of your browser, because WordPress replaces the emoji characters with images, but I believe only WebKit can handle the characters themselves. You can test your browser here.
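These “emoji combinations” are Zero Width Joiner (ZWJ) sequences: several code points glued together with U+200D that a capable font renders as a single glyph. For example:

```python
ZOMBIE = "\U0001F9DF"   # zombie emoji
ZWJ = "\u200D"          # Zero Width Joiner
MALE_SIGN = "\u2642"    # male sign
VS16 = "\uFE0F"         # variation selector-16: request emoji presentation

# "man zombie" is zombie + ZWJ + male sign + VS16
man_zombie = ZOMBIE + ZWJ + MALE_SIGN + VS16

# Four code points, but a font with ZWJ-sequence support draws one glyph;
# without that support, the pieces get rendered separately, which is the
# breakage described above.
print(len(man_zombie))  # -> 4
```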

Improved Adaptive Mode

Adrien has continued to improve adaptive mode, first introduced in 3.30, to ensure Epiphany works well on mobile devices. 3.32 is the first release to depend on libhandy. Adrien has converted various portions of the UI to use libhandy widgets.

Reader Mode

Jan-Michael’s reader mode has been available since 3.30, but new to 3.32 are many style improvements and new preferences to choose between dark and light theme, and between sans and serif font, thanks to Adrian (who is, confusingly, not Adrien). The default, sans on light background, still looks the best to me, but if you like serif fonts or dark backgrounds, now you can have them.

JPEG 2000

Wait, JPEG 2000? The obscure image standard not supported by Chrome or Firefox? Why would we add support for this? Simple: websites are using it. A certain piece of popular server-side software is serving JPEG 2000 images in place of normal JPEGs and even in place of PNG images to browsers with Safari-style user agents. (The software in question doesn’t even bother to change the file extensions. We’ve found far too many images in the wild ending in .png that are actually JPEG 2000.) Since this software is used on a fairly large number of websites, and our user agent is too fragile to change, we decided to support JPEG 2000 in order to make these websites work properly. So Carlos has implemented JPEG 2000 support, using the OpenJPEG library.
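Mislabeled files like these can only be caught by sniffing the payload rather than trusting the extension; a toy magic-byte check (a hypothetical helper, not WebKit’s actual sniffer) illustrates the idea:

```python
# Leading magic bytes, per the PNG spec and the JPEG 2000 JP2 signature box.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"


def sniff_image(data: bytes) -> str:
    """Identify an image payload by its leading bytes, ignoring the
    file extension entirely (illustrative helper, not WebKit code)."""
    if data.startswith(PNG_SIGNATURE):
        return "png"
    if data.startswith(JP2_SIGNATURE):
        return "jpeg2000"
    return "unknown"


# A ".png" file served by the software described above may really be JP2:
print(sniff_image(JP2_SIGNATURE + b"\x00" * 16))  # -> jpeg2000
```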

This isn’t a happy event for the web, because WebKit is only as secure as its least-secure dependency, and adding new obscure image formats is not a step in the right direction. But in this case, it is necessary.

Mouse Gestures

Experimental mouse gesture support is now available, thanks to Jan-Michael, if you’re willing to use the command line to enable it:

$ gsettings set org.gnome.Epiphany.web:/org/gnome/epiphany/web/ enable-mouse-gestures true

With this, I find myself closing tabs by dragging the mouse down and then to the right. Down and back up will reload the tab. Straight to the left is Back, straight to the right is Forward. Straight down will open a new tab. I had originally hoped to use the right mouse button for this, as in Opera, but it turns out there is a difference in context menu behavior: whereas Windows apps normally pop up the context menu on button release, GTK apps open the menu on button press. That means the context menu would appear at the start of every mouse gesture, and that is certainly no good, so we’ve opted to use the middle mouse button instead. We aren’t sure whether this is a good state of things, and need your feedback to decide where to go with this feature.
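The mapping above amounts to collapsing the drag into a sequence of cardinal moves and looking up an action. A toy classifier (purely illustrative, not Epiphany’s implementation; the threshold is an arbitrary choice) could look like this:

```python
def directions(points, threshold=10):
    """Collapse a pointer trail into its sequence of cardinal moves,
    merging consecutive moves in the same direction."""
    moves = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dx) >= abs(dy):
            d = "right" if dx > threshold else "left" if dx < -threshold else None
        else:
            d = "down" if dy > threshold else "up" if dy < -threshold else None
        if d and (not moves or moves[-1] != d):
            moves.append(d)
    return tuple(moves)


# The gesture table described in the post (hypothetical encoding).
ACTIONS = {
    ("left",): "back",
    ("right",): "forward",
    ("down",): "new tab",
    ("down", "right"): "close tab",
    ("down", "up"): "reload",
}

print(ACTIONS[directions([(0, 0), (0, 40), (40, 40)])])  # -> close tab
```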

Improved Fullscreen Mode

A cool side benefit of using libdazzle is that the header bar is now available in fullscreen mode by pressing the mouse towards the top of the screen. There’s even a nice animation to show the header bar sliding up to the top of the screen, so you know it’s there (animation disabled for fullscreen video).

The New Tab Button

Some users were disconcerted that the new tab button would jump from the end of the tab bar (when multiple tabs are open) back up to the end of the header bar (when there is only one tab open). Now this button will remain in one place: the header bar. Since it will no longer appear in the tab bar, Jan-Michael has moved it back to the start of the header bar, where it was from 3.12 through 3.22, rather than the end. This is mostly arbitrary, but makes for a somewhat more balanced layout.

The history of the new tab button is rather fun: when the new tab button was first added in 3.8, it was added at the end of the header bar, but moved to the start in 3.12 to be more consistent with gedit, then moved back to the end in 3.24 to reduce the distance it would need to move to reach the tab bar. So we’ve come full circle here, twice. Only time will tell if this nomadic button will finally be able to stay put.

New Icon

Yes, most GNOME applications have a new icon in 3.32, so Epiphany is not special here. But I just can’t resist the urge to show it off. Thanks, Jakub!

And More…

It’s impossible to mention all the improvements in 3.32 in a single blog post, but I want to squeeze a few more in.

Alexander (Exalm) landed several improvements to Epiphany’s theme, especially the incognito mode theme, which needed work to look good with the new Adwaita in 3.32.

Jan-Michael added an animation for completed downloads, so we don’t need to annoyingly pop open the download popover anymore to let you know that your download has completed.

Carlos Garcia added support for automation mode. This means Epiphany can now be used for running automated tests with WebDriver (e.g. with Selenium). Using the new automation mode, I’ve upstreamed support for running tests with Epiphany to the Web Platform Tests (WPT) project, the test suite used by web engine developers to test standards conformance.

Carlos also reworked the implementation of script dialogs so that they are now modal only to their associated web view, not modal to the entire application. This means you can just close the browser tab if a particular website is abusing script dialogs in a problematic way, e.g. by continuously opening new dialogs.

Patrick has improved the directory layout Epiphany uses to store data on disk to avoid storing non-configuration data under ~/.config, and reworked the internals of the password manager to mitigate Spectre-related concerns. He also implemented Happy Eyeballs support in GLib, so Epiphany will now fall back to an IPv4 connection if IPv6 is available but broken.
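Happy Eyeballs (RFC 8305) staggers connection attempts so IPv6 gets a head start but a working IPv4 path still wins quickly when IPv6 is broken. A simplified sketch of the idea, with pluggable connect callables standing in for real sockets (not GLib’s implementation):

```python
import concurrent.futures
import time


def happy_eyeballs(connectors, delay=0.25):
    """Start each connect attempt `delay` seconds after the previous one
    and return the first success; IPv6 goes first, so IPv4 only wins when
    IPv6 is slow or broken. Simplified sketch of RFC 8305, not GLib code."""
    with concurrent.futures.ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(lambda c=connect, wait=i * delay:
                        (time.sleep(wait), c())[1])
            for i, connect in enumerate(connectors)
        ]
        for future in concurrent.futures.as_completed(futures):
            try:
                return future.result()
            except OSError:
                continue  # that family failed; keep waiting on the others
    raise OSError("all connection attempts failed")


# Hypothetical connectors simulating a broken IPv6 path:
def broken_ipv6():
    raise OSError("network unreachable")


def working_ipv4():
    return "connected over IPv4"


print(happy_eyeballs([broken_ipv6, working_ipv4]))  # -> connected over IPv4
```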

Now Contains 100% Less Punctuation!

Did you notice any + signs missing in this blog? Following GTK+’s rename to GTK, WebKitGTK+ has been renamed to WebKitGTK. You’re welcome.

Extra Credit

Although Epiphany 3.32 has been the work of many developers, as you’ve seen, I want to give special credit to Epiphany’s newest maintainer, Jan-Michael. He has closed a considerable number of bugs, landed too many improvements to mention here, and has been a tremendous help. Thank you!

Now, onward to 3.34!

by Michael Catanzaro at March 27, 2019 12:41 PM

March 19, 2019

Michael Catanzaro

Epiphany Technology Preview Upgrade Requires Manual Intervention

Jan-Michael has recently changed Epiphany Technology Preview to use a separate app ID. Instead of org.gnome.Epiphany, it will now be org.gnome.Epiphany.Devel, to avoid clashing with your system version of Epiphany. You can now have separate desktop icons for both system Epiphany and Epiphany Technology Preview at the same time.

Because flatpak doesn’t provide any way to rename an app ID, this means it’s the end of the road for previous installations of Epiphany Technology Preview. Manual intervention is required to upgrade. Fortunately, this is a one-time hurdle, and it is not hard:

$ flatpak uninstall org.gnome.Epiphany

Uninstall the old Epiphany…

$ flatpak install gnome-apps-nightly org.gnome.Epiphany.Devel org.gnome.Epiphany.Devel.Debug

…install the new one, assuming that your remote is named gnome-apps-nightly (the name used locally may differ), and that you also want to install debuginfo to make it possible to debug it…

$ mv ~/.var/app/org.gnome.Epiphany ~/.var/app/org.gnome.Epiphany.Devel

…and move your personal data from the old app to the new one.

Then don’t forget to make it your default web browser under System Settings -> Details -> Default Applications. Thanks for testing Epiphany Technology Preview!

by Michael Catanzaro at March 19, 2019 06:39 PM

February 27, 2019

Maksim Sisov

Review of Igalia’s Chromium team’s activities (2018/H2).

This is the first of our semiyearly reports, giving an overview of our Chromium team’s activities and accomplishments, with a focus on the second half of 2018.

Contributions to the Chromium mainline repository:

  • Ozone/Wayland support in Chromium browser.

Igalia has been working on the Ozone/Wayland implementation for the Chromium browser, sponsored by Renesas, since the end of 2016. In the beginning, the plan was to extend the so-called mus service (mojo ui service, which had been intended to be used only by ChromeOS) to support external window mode, in which each top-level window, including menus and popups, is backed by its own native accelerated widget. The results of that work can be found in our previous blog posts: Chromium, ozone, wayland and beyond, Chromium Mus/Ozone update (H1/2017): wayland, x11 and Chromium with Ozone/Wayland: BlinkOn9, dmabuf and more refactorings….

The project initially ran in a downstream GitHub repository, and its design was based on the mus service.

In the end, after many discussions with our colleagues from Google, we moved away from mus and integrated the platform directly into the aura layer. The patches in the downstream repository were refactored and merged into the Chromium mainline repository.

Currently, our Igalians Maksim Sisov and Antonio Gomes have ownership of Ozone/Wayland in the Chromium mainline repository and continue to maintain it. The downstream repository is still rebased on a weekly basis and now contains only a few patches under testing.

A meta bug for Ozone/Wayland support exists and is constantly updated.

  • Maintenance of the upstream meta-browser recipe.

Igalia has also been contributing to the upstream Yocto layer called meta-browser. We constantly update the recipe, which allows Chromium with native Wayland support to be built for embedded devices. Currently, the recipe is based on the latest Chromium Linux stable channel and uses Chromium version 72.0.3626.109. To provide a good user experience, we backport Ozone/Wayland patches that are not included in the source code of the stable channel and test them on Raspberry Pi 3 and Renesas R-Car M3.

  • Web Application Manager for Automotive Grade Linux (AGL).

Automotive Grade Linux is an operating system for embedded devices targeted at the automotive industry. It is more than an operating system, though: it brings together automakers, suppliers and technology companies to accelerate the development and adoption of a fully open software stack for the connected car.

At some point, the AGL community decided that they needed a Web Application Manager capable of running web applications with the same user experience as native applications, which can attract web developers to design and create applications for the automotive industry.

Igalia has been happy to help and developed a Web Runtime based on the recently released Web Application Manager, initially targeted at WebOS OSE, with some guidance and support from LGE engineers.

The recent work was demoed at CES 2019 in Las Vegas, where Chromium M68, integrated with the Web Runtime, was showcased running HTML5 applications with the same degree of integration and security as native apps.

At the time of writing, the Web Application Manager has been integrated into the Grumpy Guppy branch and is available to web application developers.

  • Servicification effort in Chromium browser.

The Chromium code base is moving towards a service-oriented model to produce reusable components and reduce code duplication.

Our Chromium team at Igalia has been taking part in that effort, helping Google engineers to achieve that goal. Our contributions are spread around the Chromium codebase and include patches to

  • network stack (including //services/network and //net) and,
  • the identity service (//services/identity and //components/signin/core/browser).

The total number of patches is about 650 between April 8, 2018 and February 21, 2019.

At the time of writing this blog post, Igalia has contributed to the Chromium mainline repository by servicifying the network and identity services, which are included in the canary, dev, beta and stable channels for desktop (Windows, MacOS and Linux) and ChromeOS platforms.

  • General contributions to the Chromium browser.

Igalia has also been doing general contributions to the Chromium mainline repository and the Blink engine.

To name a few, we contributed memory pressure support to //cc (the Chromium compositor) and support for resuming/suspending active Blink tasks in the content layer. We have also been contributing fixes and changes following web platform specs, such as Origin-Signed HTTP Exchanges (for WebPackage loading) and CSS Grid support (“[css-grid] Issue with abspos element whose containing block is the grid container” and “[css-grid] The grid is by itself causing its grid container to overflow”).

Also, we implemented new API operations for the webview tag to enable or disable spatial navigation inside the webview contents independently from the global settings, and to check its state. They are available in Chromium since version 71.

More changes and fixes can be found on chromium-review.

Our contributions amount to about 640 patches over the past year, making us the 3rd largest contributor after the chromium.org and google.com organizations (71927 + 13735 patches), opera.com (777 patches) and samsung.com (652 patches).

  • Contributions to downstream forks of Chromium, such as the ones in EndlessOS, WebOS OSE, or the Brave browser:

Igalia has also been helping downstream forks of Chromium to develop their products. For example, we have been helping Endless Mobile with the maintenance of the Chromium browser for the different versions of Endless OS for Intel and ARM. We have been taking care of the periodic rebases of the adaptations made to Chromium, following the updates of the stable channel by Google.

Also, we take part in the development of the Brave browser. Our contributions include on/offline installer and update features integrated into the Omaha (Windows) and Sparkle (MacOS) frameworks. We also enabled multi-channel releases for the Brave browser, with stable, beta, dev and nightly channels for Windows, MacOS and Linux. In addition, we worked on the customized search engine provider feature, native/web UI, theming, branding, Widevine, brave scheme support, and more.

Our contributions can also be found in LGE’s WebOS OSE. For example, we have been participating in the periodic rebases and adaptations made to Chromium, among other activities.

  • Committers and ownership of components in the Chromium browser:

We appreciate that the contributions of our Chromium team are valued in the Chromium community. During the past half year, Igalia gained ownership of three components:

  • third_party/blink/renderer/modules/navigatorcontentutils/ owned by Gyuyoung Kim (gkim@igalia.com).
  • ui/ozone/common/linux/ owned by Maksim Sisov (msisov@igalia.com).
  • ui/ozone/platform/wayland/ owned by Maksim Sisov and Antonio Gomes (tonikitoo@igalia.com).

We also aim for all our team members to become committers on the Chromium project.

During the past half year, two of our members, Jose Dapena Paz and Mario Sanchez Prada, gained committership.

  • Events attended and talks given:

Our Chromium team has always aimed to have as much visibility in the open-source community as possible.

For the past half a year, we attended the following conferences:

  • the Web Engines Hackfest 2018 and spoke about “The pathway to Chromium on Wayland” (by Antonio Gomes (tonikitoo@igalia.com) and Julie Jeongeun Kim (jkim@igalia.com))
  • the W3C HTML5 Conference 2018 and gave a talk about “The pathway to Chromium on Wayland” (by Julie Jeongeun Kim (jkim@igalia.com)).

Besides the events mentioned above, and since there was no H1/2018 report about our team’s activities, it is worth mentioning for the sake of completeness the AGL AMM and AGL F2F meetings in Dresden and Yokohama, and other events where we presented our projects.

  • Other contributions:

We have also been writing various blog posts about icecc and ccache usage with Chromium.

Recently, we published a new blog post about enabling cross-compilation for Windows from a Linux/Mac host. The support was already in the Chromium repository but only worked for Google employees; we added the remaining bits to make it available for everyone.

by msisov at February 27, 2019 01:22 PM

February 26, 2019

Frédéric Wang

Review of Igalia's Web Platform activities (H2 2018)

This blog post reviews Igalia’s activity around the Web Platform, focusing on the second semester of 2018.

Projects

MathML

During 2018 we continued discussions with Google and people interested in math layout about implementing MathML in Chromium. The project was finally launched early this year and we are making encouraging progress. Stay tuned for more details!

JavaScript

As mentioned in the previous report, Igalia has proposed and developed the specification for BigInt, enabling math on arbitrary-sized integers in JavaScript. We’ve continued to land patches for BigInt support in SpiderMonkey and JSC. For the latter, you can watch this video demonstrating the current support. Currently, both implementations are behind a preference flag, but we hope to enable them by default once we are done polishing them. We also added support for BigInt to several Node.js APIs (e.g. fs.Stat or process.hrtime.bigint).

Regarding “object-oriented” features, we submitted patches for private and public instance field support to JSC, and they are pending review. At the same time, we are working on private methods for V8.

We contributed other nice features to V8, such as a spec change for template strings and the iterator protocol, support for Object.fromEntries and Symbol.prototype.description, and miscellaneous optimizations.

At TC39, we maintained or developed many proposals (BigInt, class fields, private methods, decorators, …) and led the ECMAScript Internationalization effort. Additionally, at the WebAssembly Working Group, we edited the WebAssembly JS and Web API specifications and an early version of the WebAssembly/ES Module integration specification.

Last but not least, we contributed various conformance tests to test262 and Web Platform Tests to ensure interoperability between the various features mentioned above (BigInt, class fields, private methods…). In Node.js, we worked on the new Web Platform Tests driver with update automation and continued porting and fixing more Web Platform Tests in Node.js core.

Outside of core, we implemented the initial JavaScript API for llnode, a Node.js/V8 plugin for the LLDB debugger.

Accessibility

Igalia has continued its involvement in accessibility work at the W3C.

We are also collaborating with Google to implement ATK support in Chromium. This work will make it possible for users of the Orca screen reader to use Chrome/Chromium as their browser. During H2 we began implementing the foundational accessibility support. During H1 2019 we will continue this work. It is our hope that sufficient progress will be made during H2 2019 for users to begin using Chrome with Orca.

Web Platform Predictability

On Web Platform Predictability, we’ve continued our collaboration with AMP to fix bugs and implement new features in WebKit. You can read a review of the work done in 2018 in a post on the AMP blog.

We have worked on many interoperability issues related to editing and selection, thanks to financial support from Bloomberg. For example, when deleting the last cell of a table, some browsers keep an empty table while others delete the whole table. The latter can be problematic: if users press backspace continuously to delete a long line, they can accidentally end up deleting the whole table. This was fixed in Chromium and WebKit.

Another issue is that style is lost when transforming some text into list items. When running execCommand() with insertOrderedList/insertUnorderedList on some styled paragraph, the new list item loses the original text’s style. This behavior is not interoperable and we have proposed a fix so that Firefox, Edge, Safari and Chrome behave the same for this operation. We landed a patch for Chromium. After discussion with Apple, it was decided not to implement this change in Safari as it would break some iOS rich text editor apps, mismatching the required platform behavior.

We have also been working on CSS Grid interoperability. We imported Web Platform Tests into WebKit (cf. bugs 191515 and 191369), while at the same time completing missing features and fixing bugs so that browsers using WebKit are interoperable, passing 100% of the Grid test suite. For details, see 191358, 189582, 189698, 191881, 191938, 170175, 191473 and 191963. Last but not least, we exported more than 100 internal browser tests to the Web Platform test suite.

CSS

Bloomberg is supporting our work to develop new CSS features. One of the exciting new features we’ve been working on is CSS Containment. The goal is to improve the rendering performance of web pages by isolating a subtree from the rest of the document. You can read the details in Manuel Rego’s blog post.
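In CSS terms, containment is opted into per element via the contain property; a minimal sketch (the class name is illustrative):

```css
/* Hint that layout and paint inside .widget cannot affect the rest of
   the page, letting the engine skip work outside this subtree. */
.widget {
  contain: layout paint;
}
```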

Regarding CSS Grid Layout, we’ve continued our maintenance duties, triaging bugs in the Chromium and WebKit bug trackers and fixing the most severe ones. One change with impact on end users was related to how percentage row tracks and gaps work in grid containers with indefinite size; the latest spec resolution was implemented in both Chromium and WebKit. We are finishing level 1 of the specification, addressing some missing or incomplete features. First, we’ve been working on the new baseline alignment algorithm (cf. CSS WG issues 1039, 1365 and 1409) and fixed related issues in Chromium and WebKit. Similarly, we’ve worked on the content alignment logic (see CSS WG issue 2557) and resolved a bug in Chromium. The new baseline alignment algorithm caused an important performance regression for certain resizing use cases, so we fixed it with a performance optimization that landed in Chromium.

We have also worked on various topics related to CSS Text 3. We’ve fixed several bugs to increase the pass rate of the Web Platform test suite in Chromium, such as bugs 854624, 900727 and 768363. We are also working on a new CSS value, ‘break-spaces’, for the ‘white-space’ property. For details, see the CSS WG discussions: issue 2465 and the related pull request. We implemented this new value in Chromium under a CSSText3BreakSpaces flag, and we are currently porting the implementation to Chromium’s new layout engine, LayoutNG. We plan to implement this feature in WebKit during the second semester.
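Usage of the new value would look like the following sketch (selector name is illustrative):

```css
/* Unlike 'pre-wrap', 'break-spaces' preserves trailing spaces and
   allows line breaking after each of them, which matters for
   editor-like UIs where users type runs of spaces. */
.editor {
  white-space: break-spaces;
}
```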

Multimedia

  • WebRTC: The libwebrtc branch is now upstreamed in WebKit and has been tested with popular servers.
  • Media Source Extensions: WebM MSE support is upstreamed in WebKit.
  • We implemented basic support for <video> and <audio> elements in Servo.

Other activities

Web Engines Hackfest 2018

Last October, we organized the Web Engines Hackfest at our A Coruña office. It was a great event with about 70 attendees from all the web engines; thank you to all the participants! As usual, you can find more information on the event wiki, including links to slides and videos of the talks.

TPAC 2018

Again in October, but this time in Lyon (France), 12 people from Igalia attended TPAC and participated in discussions across several of the meetings. Igalia had a booth there showcasing several demos of our latest developments running on top of WPE (a WebKit port for embedded devices). Finally, Manuel Rego gave a talk at the W3C Developers Meetup about how to contribute to CSS.

This.Javascript: State of Browsers

In December, we also joined other browser developers in the online This.JavaScript: State of Browsers event organized by ThisDot, where we talked specifically about the current work in WebKit.

New Igalians

We are excited to announce that new Igalians are joining us to continue our Web platform effort:

  • Cathie Chen, a Chinese engineer with about 10 years of experience working on browsers. Among other contributions to Chromium, she worked on the new LayoutNG code and added support for list markers.

  • Caio Lima, a Brazilian developer who recently graduated from the Federal University of Bahia. He participated in our coding experience program and notably worked on BigInt support in JSC.

  • Oriol Brufau, a recent graduate in math from Barcelona who is also involved in the CSS WG and the development of various browser engines. He participated in our coding experience program and implemented CSS Logical Properties and Values in WebKit and Chromium.

Coding Experience Programs

Last fall, Sven Sauleau joined our coding experience program and started to work on various BigInt/WebAssembly improvements in V8.

Conclusion

We are thrilled with the web platform achievements we made last semester and we look forward to more work on the web platform in 2019!

February 26, 2019 11:00 PM

Miguel A. Gómez

Hole punching in WPE

As you may (or may not) know, WPE (and WebKitGtk+ if the proper flags are enabled) uses OpenGL textures to render the video frames during playback.

In order to do this, WPE creates a playbin and uses a custom bin as its videosink. This bin is composed of some GStreamer-GL components together with an appsink. The GL components ensure that the video frames are uploaded to OpenGL textures, while the appsink allows the player to get a signal when a new frame arrives. When this signal is emitted, the player gets the frame as a texture from the appsink and sends it to the accelerated compositor to be composed with the rest of the layers of the page.

This process is quite fast thanks to the hardware-accelerated drawing, and as the video frames are just another composited layer, they can be transformed and animated: the video can be scaled, rotated, moved around the page, etc.

But there are some platforms where this approach is not viable, maybe because there’s no OpenGL support, or it’s too slow, or maybe the platform has some kind of fast path to take the decoded frames to the display. For these cases, the typical solution is to draw a transparent rectangle on the browser, in the position where the video should be, and then use some platform-dependent way to put the video frames in a display plane below the browser, so they are visible through the transparent rectangle. This approach is called hole punching, as it refers to punching a hole in the browser to be able to see the video.

At Igalia we think that supporting this feature is interesting, and following our philosophy of collaborating upstream as much as possible, we have added two hole punching approaches to the WPE upstream trunk: GStreamer hole punch and external hole punch.

GStreamer hole punch

The idea behind this implementation is to use the existing GStreamer-based MediaPlayer to perform the media playback, but replace the appsink (and maybe other GStreamer elements) with a platform-dependent video sink that is in charge of putting the video frames on the display. This can be enabled with the -DUSE_GSTREAMER_HOLEPUNCH flag.

Of course, the current implementation is not complete, because the platform-dependent bits need to be added to have the full functionality. What it currently does is use a fakevideosink, so the video frames are not shown, and draw the transparent rectangle in the position where the video should be. If you enable the feature and play a video, you’ll see the transparent rectangle and you’ll be able to hear the video sound (and even use the video controls, as they work), but nothing else will happen.

In order to have the full functionality, there are a couple of places in the code that need to be modified to create the appropriate platform-dependent elements. Both are inside MediaPlayerPrivateGStreamerBase.cpp: the createHolePunchVideoSink() and setRectangleToVideoSink() functions.

GstElement* MediaPlayerPrivateGStreamerBase::createHolePunchVideoSink()
{
    // Here goes the platform-dependent code to create the videoSink. As a default
    // we use a fakevideosink so nothing is drawn to the page.
    GstElement* videoSink = gst_element_factory_make("fakevideosink", nullptr);

    return videoSink;
}

static void setRectangleToVideoSink(GstElement* videoSink, const IntRect& rect)
{
    // Here goes the platform-dependent code to set on the videoSink the size
    // and position of the video rendering window. Mark them unused as default.
    UNUSED_PARAM(videoSink);
    UNUSED_PARAM(rect);
}

The first one, createHolePunchVideoSink(), needs to be modified to create the appropriate video sink for the platform. This video sink needs some method that allows setting the position where the video frames are to be displayed and the size they should have. And this is where setRectangleToVideoSink() comes into play: whenever the transparent rectangle is painted by the browser, it tells the video sink to render the frames in the appropriate position, and it does so through this function. So you need to modify setRectangleToVideoSink() to set the size and position on the video sink in whatever way the platform requires.

And that’s all. Once those changes are made the feature is complete, and the video should be placed exactly where the transparent rectangle is.

Something to take into account is that the size and position of the video rectangle are defined by the CSS values of the video element. The rectangle won’t be adjusted to fit the aspect ratio of the video, as that must be done by the platform video sink.

Also, the video element allows some animations to be performed: it can be translated and scaled, and it will properly notify the video sink about the animated changes. But, of course, it doesn’t support rotation or 3D transformations (as the normal video playback does). Take into account that there might be a small desynchronization between the transparent rectangle and the video frames size and position, due to the asynchronicity of some function calls.

Playing a video with GStreamer hole punch enabled.

External hole punch

Unlike the previous feature, this one doesn’t rely on GStreamer at all to perform the media playback. Instead, it just paints the transparent rectangle and lets playback be handled entirely by an external player.

Of course, there’s still the matter of how to synchronize the transparent rectangle position with the external player. There are two ways to do this:

  • Implement a new WebKit MediaPlayerPrivate class that would communicate with the external player (through sockets, the injected bundle or some other way). WPE would use it to tell the platform media player what to play and where to render the result. This is the most complex solution and completely platform-dependent, but it would allow the browser to play content from any page without any changes. Precisely because it is completely platform-dependent, though, it is not a valid approach for upstream.
  • Use JavaScript to communicate with the native player, telling it what to play and where, while WPE just paints the transparent rectangle. The drawback is that we need control of the page in order to add the JavaScript code that drives the native player; on the other hand, it allows implementing a generic approach in WPE to paint the transparent rectangle. This is the option that was implemented upstream.

So, how can this feature be used? It’s enabled with the -DUSE_EXTERNAL_HOLEPUNCH flag, and what it does is add a new dummy MediaPlayer to WPE that’s selected to play content of type video/holepunch. This dummy MediaPlayer draws the transparent rectangle on the page, according to the CSS values defined, and does nothing else. It’s up to the page owners to add the JavaScript code required to initiate the playback with the native player and position its output in the appropriate place under the transparent rectangle. To be a bit more specific, the dummy player draws the transparent rectangle once the type has been set to video/holepunch and load() is called on the player. If you have any doubt about how to make this work, take a look at the video-player-holepunch-external.html test inside the ManualTests/wpe directory.

This implementation doesn’t really support animating the size and position of the video… or rather, it does, as the transparent rectangle will be properly animated, but you would need to animate the native player’s output as well, and keeping the rectangle area and the video output in sync is going to be a challenging task.

As a last detail, controls can be enabled with this hole punch implementation, but they are useless: as WPE doesn’t know anything about the media playback that’s happening, the video element controls can’t drive it, so it’s better to keep them disabled.

Using both implementations together

You may be wondering: is it possible to use both implementations at the same time? Indeed it is! You may be using the GStreamer hole punch to perform media playback with some custom GStreamer elements. At some point you may find a video that is not supported by GStreamer; you can then set the type of the video element to video/holepunch and start the playback with the native player. And once that video is finished, you can go back to using the GStreamer MediaPlayer.

Availability

Both hole punch features will be available on the upcoming stable 2.24 release (and, of course, on 2.25 development releases). I hope they are useful for you!

by magomez at February 26, 2019 10:25 AM

February 25, 2019

Andrés Gómez

Review of Igalia’s Graphics activities (2018)

This is the first report about Igalia’s activities around Computer Graphics, specifically 3D graphics and, in particular, the Mesa3D Graphics Library (Mesa), focusing on the year 2018.

GL_ARB_gl_spirv and GL_ARB_spirv_extensions

GL_ARB_gl_spirv is an OpenGL extension whose purpose is to enable an OpenGL program to consume SPIR-V shaders. In the case of GL_ARB_spirv_extensions, it provides a mechanism by which an OpenGL implementation would be able to announce which particular SPIR-V extensions it supports, which is a nice complement to GL_ARB_gl_spirv.

As both extensions, GL_ARB_gl_spirv and GL_ARB_spirv_extensions, are core functionality in OpenGL 4.6, the drivers need to provide them in order to be compliant with that version.

Although Igalia picked up the already started implementation of these extensions in Mesa back in 2017, 2018 was the year in which we put a great deal of work into providing the push needed to get all the remaining bits in place. Much of this effort provides general support to all the drivers under the Mesa umbrella but, in particular, Igalia implemented the backend code for Intel‘s i965 driver (gen7+). Assuming that the review process for the remaining patches goes without important bumps, the whole implementation is expected to land in Mesa during the beginning of 2019.

Throughout the year, Alejandro Piñeiro gave status updates of the ongoing work through his talks at FOSDEM and XDC 2018. This is a video of the latter:

ETC2/EAC

The ETC and EAC formats are lossy compressed texture formats used mostly in embedded devices. OpenGL implementations of the versions 4.3 and upwards, and OpenGL/ES implementations of the versions 3.0 and upwards must support them in order to be conformant with the standard.

Most modern GPUs are able to work directly with the ETC2/EAC formats. Implementations for older GPUs that don’t have that support but want to be conformant with the latest versions of the specs need to provide that functionality through the software parts of the driver.

During 2018, Igalia implemented the missing bits to support GL_OES_copy_image in Intel’s i965 for gen7+, while gen8+ already complied through its HW support. As we were writing this entry, the work finally landed.

VK_KHR_16bit_storage

Igalia finished the work to provide support for the Vulkan extension VK_KHR_16bit_storage into Intel’s Anvil driver.

This extension allows the use of 16-bit types (half floats, 16-bit ints, and 16-bit uints) in push constant blocks and buffers (shader storage buffer objects). This feature can help to reduce the memory bandwidth for uniform and storage buffer data accessed from the shaders and/or optimize push constant space, of which only a few bytes are available, making it a precious shader resource.

shaderInt16

Igalia added Vulkan’s optional feature shaderInt16 to Intel’s Anvil driver. This new functionality provides the means to operate with 16-bit integers inside a shader which, ideally, would lead to better performance when you don’t need a full 32-bit range. However, not all HW platforms may have native support, still needing to run in 32-bit and, hence, not benefiting from this feature. Such is the case for operations associated with integer division in the case of Intel platforms.

shaderInt16 complements the functionality provided by the VK_KHR_16bit_storage extension.

SPV_KHR_8bit_storage and VK_KHR_8bit_storage

SPV_KHR_8bit_storage is a SPIR-V extension that complements the VK_KHR_8bit_storage Vulkan extension to allow the use of 8-bit types in uniform and storage buffers, and push constant blocks. Similarly to the VK_KHR_16bit_storage extension, this feature can help to reduce the needed memory bandwidth.

Igalia implemented its support into Intel’s Anvil driver.

VK_KHR_shader_float16_int8

Igalia implemented the support for VK_KHR_shader_float16_int8 into Intel’s Anvil driver. This is an extension that enables Vulkan to consume SPIR-V shaders that use Float16 and Int8 types in arithmetic operations. It extends the functionality included with VK_KHR_16bit_storage and VK_KHR_8bit_storage.

In theory, applications that do not need the range and precision of regular 32-bit floating point and integers, can use these new types to improve performance. Additionally, its implementation is mostly API agnostic, so most of the work we did should also help to have a proper mediump implementation for GLSL ES shaders in the future.

The review process for the implementation is still ongoing and is on its way to land in Mesa.

VK_KHR_shader_float_controls

VK_KHR_shader_float_controls is a Vulkan extension which allows applications to query and override the implementation’s default floating point behavior for rounding modes, denormals, signed zero and infinity.

Igalia has coded its support into Intel’s Anvil driver and it is currently under review before being merged into Mesa.

VkRunner

VkRunner is a Vulkan shader tester based on shader_runner in Piglit. Its goal is to make it feasible to write test scripts as similar as possible to Piglit’s shader_test format.

Igalia initially created VkRunner as a tool to get more test coverage during the implementation of GL_ARB_gl_spirv. It soon became clear that it was useful well beyond that specific extension, as a generic way of testing SPIR-V shaders.

Since then, VkRunner has been enabled as an external dependency to run new tests added to the Piglit and VK-GL-CTS suites.

Neil Roberts introduced VkRunner at XDC 2018. This is his talk:

freedreno

During 2018, Igalia has also started contributing to the freedreno Mesa driver for Qualcomm GPUs. Among the work done, we have tackled multiple bugs identified through the usual testing suites used in the graphic drivers development: Piglit and VK-GL-CTS.

Khronos Conformance

The Khronos conformance program is intended to ensure that products that implement Khronos standards (such as OpenGL or Vulkan drivers) do what they are supposed to do and they do it consistently across implementations from the same or different vendors.

This is achieved by producing an extensive test suite, the Conformance Test Suite (VK-GL-CTS or CTS for short), which aims to verify that the semantics of the standard are properly implemented by as many vendors as possible.

In 2018, Igalia has continued its work ensuring that the Intel Mesa drivers for both Vulkan and OpenGL are conformant. This work included reviewing and testing patches submitted for inclusion in VK-GL-CTS and continuously checking that the drivers passed the tests. When failures were encountered we provided patches to correct the problem either in the tests or in the drivers, depending on the outcome of our analysis or, even, brought a discussion forward when the source of the problem was incomplete, ambiguous or incorrect spec language.

The most important result out of this significant dedication has been successfully passing conformance applications.

OpenGL 4.6

Igalia helped make Intel’s i965 driver conformant with OpenGL 4.6 from day zero. This was a significant achievement since, besides Intel Mesa, only nVIDIA managed to do this.

Igalia specifically contributed to achieve the OpenGL 4.6 milestone providing the GL_ARB_gl_spirv implementation.

Vulkan 1.1

Igalia also helped make Intel’s Anvil driver conformant with Vulkan 1.1 from day zero.

Igalia specifically contributed to achieve the Vulkan 1.1 milestone providing the VK_KHR_16bit_storage implementation.

Mesa Releases

Igalia continued the work it was already carrying out in Mesa’s Release Team throughout 2018. This effort involved a continuous dedication to tracking the general status of Mesa against the usual test suites and benchmarks, but also reacting quickly to detected regressions, especially by coordinating with the Mesa developers and the distribution packagers.

The work was obviously visible by releasing multiple bugfix releases as well as doing the branching and creating a feature release.

CI

Continuous Integration is a must in any serious software project. In the case of API implementations it is even critical, since there are many important variables that need to be controlled to avoid regressions and track progress when adding new features: agnostic tests that can be used by different implementations, different OS platforms, CPU architectures and, of course, different GPU architectures and generations.

Igalia has kept up a sustained effort to keep the Mesa (and Piglit) CI integrations in good health, with an eye on reported regressions so as to act on them immediately. This has been a key tool for our work around Mesa releases, and the experience allowed us to push the initial proposal for a new CI integration when the FreeDesktop projects decided to start their migration to GitLab.

This work, along with the one done with the Mesa releases, led to a shared presentation given by Juan Antonio Suárez during XDC 2018. This is the video of the talk:

XDC 2018

2018 was the year that saw A Coruña hosting the X.Org Developer’s Conference (XDC) and Igalia as Platinum Sponsor.

The conference was organized by GPUL (Galician Linux User and Developer Group) together with University of A Coruña, Igalia and, of course, the X.Org Foundation.

Since A Coruña is the town where the company originated and where we have our headquarters, Igalia had a key role in the organization, which benefited greatly from our vast experience running events. Moreover, several Igalians joined the conference crew and, as mentioned above, we delivered talks about GL_ARB_gl_spirv, VkRunner, and Mesa releases and CI testing.

The feedback from the attendees was very rewarding and we believe the conference was a great event. Here you can see the Closing Session speech given by Samuel Iglesias:

Other activities

Conferences

As usual, Igalia was present in many graphics related conferences during the year:

New Igalians in the team

Igalia’s graphics team kept growing. Two new developers joined us in 2018:

  • Hyunjun Ko is an experienced Igalian with a strong background in multimedia, specifically GStreamer and Intel’s VAAPI. He is now contributing his impressive expertise to our Graphics team.
  • Arcady Goldmints-Orlov is the latest addition to the team. His previous expertise as a graphics developer working with nVIDIA GPUs fits perfectly with the kind of work we are currently pushing at Igalia.

Conclusion

Thank you for reading this blog post and we look forward to more work on graphics in 2019!

Igalia

by tanty at February 25, 2019 02:50 PM

February 18, 2019

Neil Roberts

VkRunner at FOSDEM

I attended FOSDEM again this year thanks to funding from Igalia. This time I gave a talk about VkRunner in the graphics dev room. It’s now available on Igalia’s YouTube channel below:

I thought this might be a good opportunity to give a small status update of what has happened since my last blog post nearly a year ago.

Test suite integration

The biggest news is that VkRunner is now integrated into Khronos’ Vulkan CTS test suite and Mesa’s Piglit test suite. This means that if you work on a feature or a bugfix in your Vulkan driver and you want to make sure it doesn’t get regressed, it’s now really easy to add a VkRunner test for it and have it collected in one of these test suites. For Piglit all that is needed is to give the test script a .vk_shader_test extension and drop it anywhere under the tests/vulkan folder and it will automatically be picked up by the Piglit framework. As an added bonus, these tests are also run automatically on Intel’s CI system, so if your test is related to i965 in Mesa you can be sure it will not be regressed.

On the Khronos CTS side the integration is currently a little less simple. Along with help from Samuel Iglesias, we have merged a branch into master that lays the groundwork for adding VkRunner tests. Currently there are only proof-of-concept tests to show how the tests could work. Adding more tests still requires tweaking the C++ code so it’s not quite as simple as we might hope.

API

When VkRunner is built, it now also builds a static library containing a public API. This can be used to integrate VkRunner into a larger test suite. Indeed, the Khronos CTS integration takes advantage of this to execute the tests using the VkDevice created by the test suite itself. This also means it can execute multiple tests quickly without having to fork an external process.

The API is intended to be very high-level and is as close as possible to just having simple functions that ask VkRunner to execute a test script and return an enum reporting whether the test succeeded or not. There is an example of its usage in the README.

Precompiled shader scripts

One of the concerns raised when integrating VkRunner into CTS is that it’s not ideal to have to run glslang as an external process in order to compile the shaders in the scripts to SPIR-V. To work around this, I added the ability to have scripts with binary shaders. In this case the 32-bit integer numbers of the compiled SPIR-V are just listed in ASCII in the shader test instead of the GLSL source. Of course writing this by hand would be a pain, so the VkRunner repo includes a Python script to precompile a bunch of shaders in a batch. This can be really useful to run the tests on an embedded device where installing glslang isn’t practical.

However, in the end for the CTS integration we took a different approach. The CTS suite already has a mechanism to precompile all of the shaders for all tests. We wanted to take advantage of this also when compiling the shaders from VkRunner tests. To make this work, Samuel added some functions to the VkRunner API to query the GLSL in a VkRunner shader script and then replace them with binary equivalents. That way the CTS suite can use these functions to replace the shaders with its cached compiled versions.

UBOs, SSBOs and compute shaders

One of the biggest missing features mentioned in my last post was UBO and SSBO support. This has now been fixed, with full support for setting values in UBOs and SSBOs and also for probing the results of writing to SSBOs. Probing SSBOs is particularly useful alongside another added feature: compute shaders. Thanks to this we can run our shaders as compute shaders to calculate some results into an SSBO and probe the buffer to see whether it worked correctly. Here is an example script to show how that might look:

[compute shader]
#version 450

/* UBO input containing an array of vec3s */
layout(binding = 0) uniform inputs {
        vec3 input_values[4];
};

/* A matrix to apply to these values. This is stored in a push
 * constant. */
layout(push_constant) uniform transforms {
        mat3 transform;
};

/* An SSBO to store the results */
layout(binding = 1) buffer outputs {
        vec3 output_values[];
};

void
main()
{
        uint i = gl_WorkGroupID.x;

        /* Transform one of the inputs */
        output_values[i] = transform * input_values[i];
}

[test]
# Set some input values in the UBO
ubo 0 subdata vec3 0 \
  3 4 5 \
  1 2 3 \
  1.2 3.4 5.6 \
  42 11 9

# Create the SSBO
ssbo 1 1024

# Store a matrix uniform to swap the x and y
# components of the inputs
push mat3 0 \
  0 1 0 \
  1 0 0 \
  0 0 1

# Run the compute shader with one instance
# for each input
compute 4 1 1

# Check that we got the expected results in the SSBO
probe ssbo vec3 1 0 ~= \
  4 3 5 \
  2 1 3 \
  3.4 1.2 5.6 \
  11 42 9

Extensions in the requirements section

The requirements section can now contain the name of any extension. If this is done then VkRunner will check for the availability of the extension when creating the device and enable it. Otherwise it will report that the test was skipped. A lot of the Vulkan extensions also add an extended features struct to be used when creating the device. These features can also be queried and enabled for extensions that VkRunner knows about simply by listing the name of the feature in that struct. For example, if shaderFloat16 is listed in the requirements section, VkRunner will check for the VK_KHR_shader_float16_int8 extension and the shaderFloat16 feature within its extended features struct. This makes it really easy to test optional features.
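A requirements section using this mechanism might look like the following sketch (the extension and feature names are the ones mentioned above):

```
[require]
VK_KHR_shader_float16_int8
shaderFloat16
```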

Cross-platform support

I spent a fair bit of time making sure VkRunner works on Windows including compiling with Visual Studio. The build files have been converted to CMake which makes building on Windows even easier. It also compiles for Android thanks to patches from Jaebaek Seo. The repo contains Android build files to build the library and the vkrunner executable. This can be run directly on a device using adb.

User interface

There is a branch containing the beginnings of a user interface for editing VkRunner scripts. It presents an editor widget via GTK and continuously runs the test script in the background as you are editing it. It then displays the results in an image and reports any errors in a text field. The test is run in a separate process so that if it crashes it doesn’t bring down the user interface. I’m not sure whether it makes sense to merge this branch into master, but in the meantime it can be a convenient way to fiddle with a test when it fails and it’s not obvious why.

And more…

Lots of other work has been going on in the background. The best way to get more details on what VkRunner can do is to take a look at the README, which has been kept up to date as the source of documentation for writing scripts.

by nroberts at February 18, 2019 05:23 PM

February 17, 2019

Eleni Maria Stea

i965: Improved support for the ETC/EAC formats on Intel Gen 7 and previous GPUs

This post is about a recent contribution I made to the i965 Mesa driver to improve the emulation of the ETC/EAC texture formats on Intel Gen 7 and older GPUs, as part of my work for Igalia‘s graphics team. Demo: The video mostly shows the behavior of some GL calls and operations with … Continue reading i965: Improved support for the ETC/EAC formats on Intel Gen 7 and previous GPUs

by hikiko at February 17, 2019 04:45 PM

February 13, 2019

Víctor Jáquez

Generating a GStreamer-1.14 bundle for TravisCI with Ubuntu/Trusty

To have continuous integration in your multimedia project hosted on GitHub with TravisCI, you may want to compile and run tests with a recent version of GStreamer. Nonetheless, TravisCI mainly offers Ubuntu Trusty as one of the possible distributions to deploy in their CI, and that distribution packages GStreamer 1.2, which might be a bit old for your project’s requirements.

A solution for this problem is to provide TravisCI with your own GStreamer bundle, built with the version you want to compile and test your project against, in this case 1.14. This blog post is the recipe I followed to generate that GStreamer bundle with GstGL support.

There are three main issues:

  1. The packaged libglib version is too old (hopefully we will not hit an ABI breakage while running the CI).
  2. The packaged ffmpeg version is too old
  3. As we want to compile GStreamer using gst-build, we need a recent version of meson, which requires Python 3.5, not available in Trusty.

schroot

Old habits die hard: I have used schroot for handling chroot environments without complaint, as it handles the bind mounting of /proc, /sys and all that repetitive stuff that seals the isolation of the chrooted environment.

The debootstrap variant I use is buildd because it installs the build-essential package.

$ sudo mkdir /srv/chroot/gst-trusty64
$ sudo debootstrap --arch=amd64 --variant=buildd trusty ./gst-trusty64/ http://archive.ubuntu.com/ubuntu
$ sudo vim /etc/schroot/chroot.d

This is the schroot configuration I will use. Please, adapt it to your need.

[gst]
description=Ubuntu Trusty 64-bit for GStreamer
directory=/srv/chroot/gst-trusty64
type=directory
users=vjaquez
root-users=vjaquez
profile=default
setup.fstab=default/vjaquez-home.fstab

I am overriding the default fstab file with a custom one where the home directory of the vjaquez user points to a clean directory.

$ mkdir -p ~/home-chroot/gst
$ sudo vim /etc/schroot/default/vjaquez-home.fstab
# fstab: static file system information for chroots.
# Note that the mount point will be prefixed by the chroot path
# (CHROOT_PATH)
#
# <file system>  <mount point>   <type>  <options>       <dump>  <pass>
/proc           /proc           none    rw,bind         0       0
/sys            /sys            none    rw,bind         0       0
/dev            /dev            none    rw,bind         0       0
/dev/pts        /dev/pts        none    rw,bind         0       0
/home           /home           none    rw,bind         0       0
/home/vjaquez/home-chroot/gst   /home/vjaquez   none    rw,bind 0       0
/tmp            /tmp            none    rw,bind         0       0

configure chroot environment

We will get into the chroot environment as the super user in order to add the required packages. For that purpose we add the universe repository to apt.

  • libglib requires: autotools-dev gnome-pkg-tools libtool libffi-dev libelf-dev libpcre3-dev desktop-file-utils libselinux1-dev libgamin-dev dbus dbus-x11 shared-mime-info libxml2-utils
  • Python requires: libssl-dev libreadline-dev libsqlite3-dev
  • GStreamer requires: bison flex yasm python3-pip libasound2-dev libbz2-dev libcap-dev libdrm-dev libegl1-mesa-dev libfaad-dev libgl1-mesa-dev libgles2-mesa-dev libgmp-dev libgsl0-dev libjpeg-dev libmms-dev libmpg123-dev libogg-dev libopus-dev liborc-0.4-dev libpango1.0-dev libpng-dev libpulse-dev librtmp-dev libtheora-dev libtwolame-dev libvorbis-dev libvpx-dev libwebp-dev pkg-config unzip zlib1g-dev
  • And for general setup: language-pack-en ccache git curl
$ schroot --user root --chroot gst
(gst)# sed -i "s/main$/main universe/g" /etc/apt/sources.list
(gst)# apt update
(gst)# apt upgrade
(gst)# apt --no-install-recommends --no-install-suggests install \
autotools-dev gnome-pkg-tools libtool libffi-dev libelf-dev \
libpcre3-dev desktop-file-utils libselinux1-dev libgamin-dev dbus \
dbus-x11 shared-mime-info libxml2-utils \
libssl-dev libreadline-dev libsqlite3-dev \
language-pack-en ccache git curl bison flex yasm python3-pip \
libasound2-dev libbz2-dev libcap-dev libdrm-dev libegl1-mesa-dev \
libfaad-dev libgl1-mesa-dev libgles2-mesa-dev libgmp-dev libgsl0-dev \
libjpeg-dev libmms-dev libmpg123-dev libogg-dev libopus-dev \
liborc-0.4-dev libpango1.0-dev libpng-dev libpulse-dev librtmp-dev \
libtheora-dev libtwolame-dev libvorbis-dev libvpx-dev libwebp-dev \
pkg-config unzip zlib1g-dev

Finally we create our installation prefix, in this case /opt/gst to avoid contaminating /usr/local, and log out as root.

(gst)# mkdir -p /opt/gst
(gst)# chown vjaquez /opt/gst
(gst)# exit

compile ffmpeg 3.2

Now, let’s log in again, but as the unprivileged user, to build the bundle, starting with ffmpeg. Notice that we are using ccache and building out of source.

$ schroot --chroot gst
(gst)$ git clone https://git.ffmpeg.org/ffmpeg.git ffmpeg
(gst)$ cd ffmpeg
(gst)$ git checkout -b work n3.2.12
(gst)$ mkdir build
(gst)$ cd build
(gst)$ ../configure --disable-static --enable-shared \
--disable-programs --enable-pic --disable-doc --prefix=/opt/gst 
(gst)$ PATH=/usr/lib/ccache/:${PATH} make -j8 install

compile glib 2.48

(gst)$ cd ~
(gst)$ git clone https://gitlab.gnome.org/GNOME/glib.git
(gst)$ cd glib
(gst)$ git checkout -b work origin/glib-2-48
(gst)$ mkdir mybuild
(gst)$ cd mybuild
(gst)$ ../autogen.sh --prefix=/opt/gst
(gst)$ PATH=/usr/lib/ccache/:${PATH} make -j8 install

install Python 3.5

Pyenv is a project that allows the automation of installing and executing, in the user home directory, multiple versions of Python.

(gst)$ curl -L https://github.com/pyenv/pyenv-installer/raw/master/bin/pyenv-installer | bash
(gst)$ ~/.pyenv/bin/pyenv install 3.5.0

Install meson 0.50

We will install the latest available version of meson in the user home directory; that is why PATH is extended and exported.

(gst)$ cd ~
(gst)$ ~/.pyenv/versions/3.5.0/bin/pip3 install --user meson
(gst)$ export PATH=${HOME}/.local/bin:${PATH}

build GStreamer 1.14

PKG_CONFIG_PATH is exported to expose the compiled versions of ffmpeg and glib. Notice that --libdir=lib is passed so that the libraries land in /opt/gst/lib, avoiding the dispersion of pkg-config files across multiarch directories.

(gst)$ cd ~/
(gst)$ export PKG_CONFIG_PATH=/opt/gst/lib/pkgconfig/
(gst)$ git clone https://gitlab.freedesktop.org/gstreamer/gst-build.git
(gst)$ cd gst-build
(gst)$ git checkout -b work origin/1.14
(gst)$ meson -Denable_python=false \
-Ddisable_gst_libav=false -Ddisable_gst_plugins_ugly=true \
-Ddisable_gst_plugins_bad=false -Ddisable_gst_devtools=true \
-Ddisable_gst_editing_services=true -Ddisable_rtsp_server=true \
-Ddisable_gst_omx=true -Ddisable_gstreamer_vaapi=true \
-Ddisable_gstreamer_sharp=true -Ddisable_introspection=true \
--prefix=/opt/gst build --libdir=lib
(gst)$ ninja -C build install

test!

(gst)$ cd ~/
(gst)$ LD_LIBRARY_PATH=/opt/gst/lib \
GST_PLUGIN_SYSTEM_PATH=/opt/gst/lib/gstreamer-1.0/ \
/opt/gst/bin/gst-inspect-1.0

And the list of available elements shall be shown.

archive the bundle

(gst)$ cd ~/
(gst)$ tar zpcvf gstreamer-1.14-x86_64-linux-gnu.tar.gz -C /opt ./gst

update your .travis.yml

These are the packages you should add in order to run this generated GStreamer bundle:

  • libasound2-plugins
  • libfaad2
  • libfftw3-single3
  • libjack-jackd2-0
  • libmms0
  • libmpg123-0
  • libopus0
  • liborc-0.4-0
  • libpulsedsp
  • libsamplerate0
  • libspeexdsp1
  • libtdb1
  • libtheora0
  • libtwolame0
  • libwayland-egl1-mesa
  • libwebp5
  • libwebrtc-audio-processing-0
  • liborc-0.4-dev
  • pulseaudio
  • pulseaudio-utils

And these are the before_install and before_script targets:

      before_install:
        - curl -L http://server.example/gstreamer-1.14-x86_64-linux-gnu.tar.gz | tar xz
        - sed -i "s;prefix=/opt/gst;prefix=$PWD/gst;g" $PWD/gst/lib/pkgconfig/*.pc
        - export PKG_CONFIG_PATH=$PWD/gst/lib/pkgconfig
        - export GST_PLUGIN_SYSTEM_PATH=$PWD/gst/lib/gstreamer-1.0
        - export GST_PLUGIN_SCANNER=$PWD/gst/libexec/gstreamer-1.0/gst-plugin-scanner
        - export PATH=$PATH:$PWD/gst/bin
        - export LD_LIBRARY_PATH=$PWD/gst/lib:$LD_LIBRARY_PATH

      before_script:
        - pulseaudio --start
        - gst-inspect-1.0 | grep Total

by vjaquez at February 13, 2019 07:43 PM

February 11, 2019

Víctor Jáquez

Review of Igalia’s Multimedia Activities (2018/H2)

This is the first semiyearly report about Igalia’s activities around multimedia, covering the second half of 2018.

A great part of this report was covered in Phil’s talk surveying multimedia development in WebKitGTK and WPE:

WebKit Media Source Extensions (MSE)

MSE is a specification that allows JavaScript to generate media streams for playback in Web browsers that support HTML5 video and audio.
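As a rough illustration (not taken from the report), a page using MSE feeds media segments to a <video> element through a MediaSource object; the URL and codec string below are placeholder assumptions:

```
// Browser-side sketch of the MSE API; 'segment.webm' and the codec
// string are placeholders, not real resources.
const video = document.querySelector('video');
const mediaSource = new MediaSource();
video.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', async () => {
  const buffer = mediaSource.addSourceBuffer('video/webm; codecs="vp9,opus"');
  const response = await fetch('segment.webm');
  buffer.appendBuffer(await response.arrayBuffer());
  // Close the stream once the single segment has been appended.
  buffer.addEventListener('updateend', () => mediaSource.endOfStream());
});
```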

Last semester we upstreamed support for the WebM format in WebKitGTK, with the related patches in GStreamer, particularly in the qtdemux and matroskademux elements.

WebKit Encrypted Media Extensions (EME)

EME is a specification for enabling playback of encrypted content in Web browsers that support HTML5 video.

In a downstream project for WPE WebKit we managed to have almost full test coverage in the YoutubeTV 2018 test suite.

We merged upstream, in WebKit and GStreamer, most of our contributions that are legal to publish; for example, making demuxers aware of encrypted content and making them send protection events with the initialization data and the encrypted caps, in order to later select the decryption key.

We started to coordinate the upstreaming process of a new implementation of the CDM (Content Decryption Module) abstraction, and there will be further changes in that abstraction.

Lightning talk about the EME implementation in WPE/WebKitGTK at the GStreamer Conference 2018.

WebKit WebRTC

WebRTC consists of several interrelated APIs and real-time protocols that enable Web applications and sites to capture audio, or A/V streams, and exchange them between browsers without requiring an intermediary.

We added GStreamer interfaces to LibWebRTC, using it for the network part while using GStreamer for media capture and processing. All of that was upstreamed in 2018 H2.

Thibault described thoroughly the tasks done for this achievement.

Talk about WebRTC implementation in WPE/WebKitGTK in WebEngines hackfest 2018.

Servo/media

Servo is a browser engine written in Rust designed for high parallelization and high GPU usage.

We added basic support for <video> and <audio> media elements in Servo. Later on, we added the GstreamerGL bindings for Rust in gstreamer-rs to render GL textures from the GStreamer pipeline in Servo.

Lightning talk at the GStreamer Conference 2018.

GstWPE

Taking an idea from the GStreamer Conference, we developed a GStreamer source element that wraps WPE. With this source element, it is possible to blend a web page and video in a single video stream; that is, the output of a Web browser (i.e., a rendered web page) is used as a video source of a GStreamer pipeline: GstWPE. The element is already merged in the gst-plugins-bad repository.
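As a sketch of the idea, such a source element can feed a rendered page into a regular GStreamer pipeline; the exact pipeline below is illustrative (an assumption, not taken from the post) and requires gst-plugins-bad built with the wpe plugin:

```
# Render a web page with wpesrc and display it as a video stream.
gst-launch-1.0 wpesrc location="https://gstreamer.freedesktop.org" \
    ! queue ! glimagesink
```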

Talk about GstWPE in FOSDEM 2019

Demo #1

Demo #2

GStreamer VA-API and gst-MSDK

Last, but not least, we continued helping with the maintenance of GStreamer-VAAPI and gst-msdk, with code review and the ongoing migration of the internal library to GObject.

Other activities

The second half of 2018 was also intense in terms of conferences and hackfests for the team:


Thanks for bearing with us through this blog post and for keeping our work on your radar.

by vjaquez at February 11, 2019 12:52 PM

Manuel Rego

Summary of a week in Lyon for TPAC 2018

Last October, Igalia participated in TPAC 2018 with 12 people. I believe it was the biggest presence of Igalians at this event ever, probably because of its proximity for many of us (it happened in Lyon), but also a reflection of our increasing presence in the web platform ecosystem.

Apart from TPAC itself, Igalia also participated in the W3C Developers Meetup that happened the very same week, where I gave a talk about how to contribute to CSS (more about that later).

Igalia booth at TPAC

In the Igalia booth we were showcasing some of our latest developments with different demos running on embedded devices, in which you could find our more recent work around the web platform (WebRTC, MSE, CSS Grid Layout, CSS Box Alignment, MathML, etc.). These demos were using WPE, a WebKit port optimized for low-end platforms, developed by Igalia.

Igalia booth at TPAC 2018

Be part of CSS evolution

As I mentioned in the introduction, I gave a talk in the W3C Developers Meetup. My talk was called “Be part of CSS evolution” and it tried to explain how the CSS Working Group works and also how anyone can have a direct impact on the development of CSS specifications by raising issues, providing feedback, explaining use cases, etc.

The slides of the talk can be found on this blog and the video has been recently published in Vimeo.

Video of my talk “Be part of CSS evolution”

MathML

Most of the people that came to our booth asked about this topic, it’s clear there are a lot of people interested in MathML. As you might already know Igalia has been looking for funding to implement MathML in Chromium and during TPAC we got the confirmation that NISO will be sponsoring an important part of this work.

At TPAC there were several concerns about the future of MathML, and a TAG review was requested just after the conference. Igalia has been in conversations with many people since TPAC: TAG members, Google engineers and folks interested in reviving the MathML specification. Last month a new MathML Refresh Community Group was created and the TAG review was closed with a positive answer regarding the future of MathML.

On top of the spec work, the Chromium implementation is in progress and more news about the status of things will be released soon at mathml.igalia.com. If your company would like to support MathML, please don’t hesitate to contact us. Stay tuned!
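For reference, this is the kind of markup such an implementation has to render; a minimal fraction example (not taken from the post, just standard MathML elements):

```
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <!-- Renders the fraction x / (y + 1) -->
  <mfrac>
    <mi>x</mi>
    <mrow><mi>y</mi><mo>+</mo><mn>1</mn></mrow>
  </mfrac>
</math>
```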

Other

In my case I was attending the CSS Working Group and Houdini Task Force meetings; it’s always a pleasure to share a room with so many brilliant people working hard on defining the future of CSS. Several people were quite interested in the work Igalia has recently been doing around the CSS Containment specification (more info in my previous post). It seems this spec has some potential to become relevant to web rendering performance.

Apart from that, there were a bunch of interesting breakout sessions on the technical plenary day. I’d like to highlight the one given by fantasai and Marcos Cáceres about Spec Editing Best Practices, it was really interesting to understand how both write specs to make things easier for people reading them.

“Spec Editing Best Practices” notes by fantasai

Last but not least, despite spending most of the day at TPAC, we found some time to enjoy Lyon during dinners at night; it looks like a nice city.

February 11, 2019 08:00 AM

January 30, 2019

Samuel Iglesias

VkRunner is integrated into VK-GL-CTS and piglit

One of the greatest features of piglit was the easy development of OpenGL tests based on GLSL shaders plus some simple commands, through the shader_runner command. I even wrote about it.

However, the Vulkan ecosystem was missing a tool like that for SPIR-V shader tests… until last year!

Vulkan Logo

VkRunner is a tool written by Neil Roberts, heavily inspired by shader_runner. VkRunner was the result of Igalia’s work to enable the ARB_gl_spirv extension for Intel’s i965 driver on Mesa, where there was a need to test the driver code against a good number of shaders to be sure it was fine.

VkRunner uses a script language to define the requirements needed to run the test, such as the needed extensions and features, the shaders to be run and a series of commands to run them. It then parses everything and executes the equivalent Vulkan commands under the hood, like shader_runner did for OpenGL in piglit.

This is an example of what a VkRunner script looks like:

[compute shader]
#version 450

layout(std140, push_constant) uniform push_constants {
        float in_value;
};

layout(std140, binding = 0) buffer ssbo {
        float out_value;
};

void
main()
{
        out_value = sqrt(in_value);
}

[test]
# Allocate an ssbo big enough for a float at binding 0
ssbo 0 4

# Set the push constant as an input value
uniform float 0 4

compute 1 1 1

# Probe that we got the expected value
tolerance 0.00006% 0.00006% 0.00006% 0.00006%
probe ssbo float 0 0 ~= 2

The end of 2018 was great for VkRunner! First, the tool was integrated into piglit so we can now use it in this amazing open-source testing suite for 3D graphics drivers. Soon after, it was integrated into Khronos Group’s Vulkan and OpenGL Conformance Test Suite (see commit), which will help contributors to easily write SPIR-V tests on Vulkan.

If you want to learn more about VkRunner, apart from browsing the repository, Neil wrote a nice blog post explaining the tool’s basics, gave a lightning talk at XDC 2018 (slides) in A Coruña and now he is going to give a talk in the graphics devroom at FOSDEM 2019! You can follow his FOSDEM talk on Saturday via live-stream (or watch the recording afterwards) in case you are not going to FOSDEM this year :-)

Igalia

FOSDEM 2019

January 30, 2019 07:00 AM

January 29, 2019

Mario Sanchez Prada

Working on the Chromium Servicification Project

It’s been a few months already since I (re)joined Igalia as part of its Chromium team and I couldn’t be happier about it: right from the very first day, I felt perfectly integrated with the team and quickly started making my way through the -fully upstream- project that would keep me busy during the following months: the Chromium Servicification Project.

But what is this “Chromium servicification project“? Well, according to the Wiktionary the word “servicification” means, applied to computing, “the migration from monolithic legacy applications to service-based components and solutions”, which is exactly what this project is about: as described in the Chromium servicification project’s website, the whole purpose behind this idea is “to migrate the code base to a more modular, service-oriented architecture”, in order to “produce reusable and decoupled components while also reducing duplication”.

Doing so would not only make Chromium a more manageable project from a source-code point of view and create better and more stable interfaces for embedding Chromium from different projects, but should also enable teams to experiment with new features by combining these services in different ways, as well as to ship different products based on Chromium without having to bundle the whole world just to provide a particular set of features.

For instance, as Camille Lamy put it in the talk delivered (slides here) during the latest Web Engines Hackfest, “it might be interesting long term that the user only downloads the bits of the app they need so, for instance, if you have a very low-end phone, support for VR is probably not very useful for you”. This is of course not the current status of things yet (right now everything is bundled into a big executable), but it’s still a good way to visualise where this idea of moving to a services-oriented architecture should take us in the long run.

With this in mind, the idea behind this project would be to work on the migration of the different parts of Chromium depending on those components that are being converted into services, which would be part of a “foundation” base layer providing the core services that any application, framework or runtime build on top of chromium would need.

As you can imagine, the whole idea of refactoring an enormous code base like Chromium’s is daunting and a lot of work, especially considering that currently ongoing efforts can’t simply be stopped just to perform this migration, and that is where our focus currently lies: we integrate with different teams from the Chromium project working on the migration of those components into services, and we make sure that the clients of their old APIs move away from them and use the new services’ APIs instead, while keeping everything running normally in the meantime.

At the beginning, we started working on the migration to the Network Service (which allows to run Chromium’s network stack even without a browser) and managed to get it shipped in Chromium Beta by early October already, which was a pretty big deal as far as I understand. In my particular case, that stage was a very short ride since such migration was nearly done by the time I joined Igalia, but still something worth mentioning due to the impact it had in the project, for extra context.

After that, our team started working on the migration of the Identity service, where the main idea is to encapsulate the functionality of accessing the user’s identities right through this service, so that one day this logic can be run outside of the browser process. One interesting bit about this migration is that this particular functionality (largely implemented inside the sign-in component) has historically been located quite high up in the stack, and yet it’s now being pushed all the way down into that “foundation” base layer, as a core service. That’s probably one of the factors contributing to making this migration quite complicated, but everyone involved is being very dedicated and has been very helpful so far, so I’m confident we’ll get there in a reasonable time frame.

If you’re curious enough, though, you can check this status report for the Identity service, where you can see the evolution of this particular migration, along with the impact our team had since we started working on this part, back on early October. There are more reports and more information in the mailing list for the Identity service, so feel free to check it out and/or subscribe there if you like.

One clarification is needed, though: for now, the scope of these migrations is focused on using the public C++ APIs that such services expose (see //services/<service_name>/public/cpp), but in the long run the idea is that those services will also provide Mojo interfaces. That will enable using their functionality regardless of whether you’re running those services as part of the browser’s process or inside their own separate processes, which will then allow the flexibility that Chromium needs to run smoothly and safely in different kinds of environments, from the least constrained ones to others with a less favourable set of resources at their disposal.
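Mojo interfaces are declared in .mojom IDL files; a hypothetical sketch of what such a declaration looks like (module, interface and method names are invented for illustration, not real Chromium code) could be:

```
// Hypothetical .mojom sketch: a service interface callable from any
// process, in-browser or out-of-process.
module identity.mojom;

interface IdentityAccessor {
  // Returns the identifier of the primary signed-in account, if any.
  GetPrimaryAccountId() => (string? account_id);
};
```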

And this is it for now, I think. I was really looking forward to writing a status update about what I’ve been up to in the past months and here it is, even though it’s not the shortest of all reports.

One last thing, though: as usual, I’m going to FOSDEM this year as well, along with a bunch of colleagues & friends from Igalia, so please feel free to drop me/us a line if you want to chat and/or hangout, either to talk about work-related matters or anything else really.

And, of course, I’d also be more than happy to talk about any of the open job positions at Igalia, should you consider applying. There are quite a few of them available at the moment for all kinds of things (most of them available for remote work): from more technical roles such as graphics, compilers, multimedia, JavaScript engines, browsers (WebKit, Chromium, Web Platform) or systems administration (this one not available for remote work, though), to other less “hands-on” types of roles like developer advocate, sales engineer or project manager, so it’s possible there’s something interesting for you if you’re considering joining such a special company like this one.

See you in FOSDEM!

by mario at January 29, 2019 06:35 PM

Samuel Iglesias

Improving my emacs setup

I need to start this post mentioning the reason I improved my Emacs setup after so many years with no change in it. Funny enough, I need to say thanks to Visual Studio Code :-D

Last month, I came across the blog post “10 years of love for Emacs undone by a week of VSCode”. As I have been using Emacs for almost a decade, I wondered if that could be true… so I tried Visual Studio Code!

During my testing period, I found that IntelliSense worked like a charm; I loved the “peek declaration” feature; it has a huge number of extensions that provide almost anything you want; and it is well supported on GNU/Linux, including GDB and terminal support. This was the first Microsoft product in years that I considered worth using every day :-O

Screenshot of Visual Studio Code

This experience made me think about what had been missing from my Emacs setup without my knowing it. I realized that I missed having multiple cursors, a good source code tag system, a modern theme (yes, why not?), markdown support and, if possible, an integrated way to check Pull Requests on GitHub and Merge Requests on GitLab. I found a way to have everything in Emacs except the GitLab Merge Requests integration, due to a failed installation of the gitlab package. Now I am a much happier Emacs user than one month ago, and I need to say thanks to Visual Studio Code :-P

In case you want to test it, I have pushed my emacs.d/ config to Github. Be aware this is not the final version… I plan to improve it in the future.

Screenshot of Emacs

January 29, 2019 07:00 AM

January 27, 2019

Adrián Pérez

Web Engines Hackfest 2018 → FOSDEM 2019

The last quarter of 2018 was a quite hectic one, and every time I had some spare time after the Web Engines Hackfest the prospect of sitting down to write some thoughts about it seemed dreadful. Christmas went by already —two full weeks of holidays, practically without touching a computer— and suddenly I found myself booking tickets to this year's FOSDEM and it just hit me: it is about time to get back to blogging!

FOSDEM

There is not much that I would want to add about FOSDEM, an event which I have attended a number of times before (and some others about which I have not even blogged). This is an event I always look forward to, and the one single reason that keeps me coming back is recharging my batteries.

This may seem contradictory because the event includes hundreds of talks and workshops tucked in just two days. Don't get me wrong, the event is physically tiresome, but there are always tons of new and exciting topics to learn about and many Free/Libre Software communities being represented, which means that there is a contagious vibe of enthusiasm. This makes me go back home with the will to do more.

Last but not least, FOSDEM is one of these rare events in which I get to meet many people who are dear to me — in some cases spontaneously, even without knowing we all would be attending. See you in Brussels!

Web Engines Hackfest

Like on previous years, the Web Engines Hackfest has been hosted by Igalia, in the lovely city of A Coruña. Every year the number of participants has been increasing, and we hit the mark of 70 people in the 2018 edition.

Are We GTK+4 Yet?

This time I was looking forward to figuring out how to bring WebKitGTK+ into the future, and in particular to GTK+4. We had a productive discussion with Benjamin Otte which helped a great deal to understand how the GTK+ scene graph works, and how to approach the migration to the new version of the toolkit in an incremental way. And he happens to be a fan of Factorio, too!

In its current incarnation the WebKitWebView widget needs to use Cairo as the final step to draw its contents, because that is how widgets work, while widgets in GTK+4 populate nodes of a scene graph with the contents they need to display. The “good” news is that it is possible to populate a render node using a Cairo surface, which will allow us to keep the current painting code. While it would be more optimal to avoid Cairo altogether and let WebKit paint using the GPU on textures that the scene graph would consume directly, I expect this to make the initial bringup more approachable, and allow building WebKitGTK+ both for GTK+3 and GTK+4 from the same code base. There will be room for improvements, but at least we expect performance to be on par with the current WebKitGTK+ releases running on GTK+3.
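A rough sketch of that intermediate approach (gtk_snapshot_append_cairo() is real GTK+4 API; the painting helper is an invented placeholder, and this is not code from the migration):

```
/* Populate a GTK+4 render node with a Cairo surface, so the existing
 * Cairo painting path can be kept during the initial bringup. */
static void
web_view_snapshot (GtkWidget *widget, GtkSnapshot *snapshot)
{
  graphene_rect_t bounds;
  graphene_rect_init (&bounds, 0, 0,
                      gtk_widget_get_width (widget),
                      gtk_widget_get_height (widget));

  cairo_t *cr = gtk_snapshot_append_cairo (snapshot, &bounds);
  /* web_view_paint (widget, cr);  -- hypothetical existing Cairo path */
  cairo_destroy (cr);
}
```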

An ideal future: paint Web content in the GPU, feed textures to GTK+.

While not needing to modify our existing rendering pipeline should help, and probably just having the WebKitWebView display something on GTK+4 should not take that much effort, the migration will still be a major undertaking involving significant changes, like switching input event handling to use GtkEventController, and it will not precisely be a walk in the park.

As of this writing, we are not (yet) actively working on supporting GTK+4, but rest assured that it will eventually happen. There are other ideas we have on the table to provide a working Web content rendering widget for GTK+4, but that will be a topic for another day.

The MSE Rush

At some point people decided that it would be a good idea to allow Web content to play videos, and thus the <video> and <audio> tags were born. All was good and swell until people wanted playback to adapt to different types of network connections and multiple screen resolutions (phones, tablets, cathode ray tubes, cinema projectors...). The “solution” is to serve video and audio in multiple small chunks of varying qualities, which are then chosen, downloaded, and stitched together while the content is being played back. Sci-fi? No: Media Source Extensions.

A few days before the hackfest it came to our attention that a popular video site stopped working with WebKitGTK+ and WPE WebKit. The culprit: The site started requiring MSE in some cases, without supporting a fallback anymore, our MSE implementation was disabled by default, and when enabled it showed a number of bugs which made it hardly possible to watch an entire video in one go.

During the Web Engines Hackfest a few of us worked tirelessly, sometimes into the wee hours, to make MSE work well. We managed to crank out no less than two WebKitGTK+ releases (and one for WPE WebKit) which fixed most of the rough edges, making it possible to have MSE enabled and working.

And What Else?

To be completely honest, shipping the releases with a working MSE implementation made the hackfest pass in a blur and I cannot remember much else other than having a great time meeting everybody, and having many fascinating conversations — often around a table sharing good food. And that is already good motivation to attend again next year 😉

by aperez (adrian@perezdecastro.org) at January 27, 2019 09:45 PM

Eleni Maria Stea

Hair simulation with a mass-spring system (punk’s not dead!)

Hair rendering and simulation can be challenging, especially in real-time. There are many sophisticated algorithms for it (based on particle systems, hair mesh simulation, mass-spring systems and more) that can give very good results. But in this post, I will try to explain a simple and somehow hacky approach I followed in my first attempt to … Continue reading Hair simulation with a mass-spring system (punk’s not dead!)

by hikiko at January 27, 2019 08:34 PM

January 21, 2019

Henrique Ferreiro

Cross compiling Chromium for Windows

Since October 2017, Chromium for Windows can be built from Linux machines.

January 21, 2019 11:30 PM

About

Summary Free Software advocate. Background in functional programming. Eager to learn new technologies.

January 21, 2019 10:30 PM

January 17, 2019

Gyuyoung Kim

The story of the webOS Chromium contribution over the past year

In this article, I share how I started webOS Chromium upstream, what webOS patches were contributed by LG Electronics, and how I’ve contributed to Chromium.

First, let’s briefly describe the history of the webOS. WebOS was created by Palm, Inc. Palm Inc. was acquired by HP in 2010 and HP made the platform open source, so it then became open webOS. In January 2014 the operating system was sold to LG Electronics. LG Electronics has been shipping the webOS for their TV and signage products since then, and has also been spreading the webOS to more of their products.

The webOS uses Chromium to run web applications, so Chromium is a very important component in the webOS. Like other Chromium embedders, the webOS also carries many downstream patches. LG Electronics has therefore tried to contribute its own downstream patches to the Chromium open source project, both to reduce the effort of catching up with the latest Chromium version and to improve the quality of the downstream patches. As one of LG Electronics’ contractors for the last one and a half years, I have been working on webOS Chromium contributions since September 2017. So, let’s explain the process of contributing:

1. The Corporate CLA

The Chromium project only accepts patches after the contributor signs a Contributor License Agreement (CLA). There are two kinds of CLA, one for individual contributors and one for corporate ones. If a company signs the corporate CLA, then its individual contributors are exempt from signing an individual CLA; however, they must use their corporate email address and join the Google group which was created when the corporate CLA was signed. LG Electronics signed the corporate CLA and was added to the AUTHORS file.

  • Corporate Contributor License Agreement (Link)
  • Individual Contributor License Agreement (Link)

    2. List upstreamable patches in webOS

    After finishing the registration of the corporate CLA, I started to list the upstreamable webOS patches. It seemed to me that the patches fell into two categories: new features for the webOS, and bug fixes. In the case of new features, the patches were mainly meant to improve performance or to make LG products like TVs and signage use less memory. I tried to list the upstreamable patches among those. The criteria was that a patch either improved performance or brought some benefit on the desktop; I thought this would allow owners to accept and merge a patch into the mainline more easily.

    3. What patches have been merged to Chromium mainline?

    Before uploading webOS patches, I merged patches to replace deprecated WTF or base utilities (WTF::RefPtr, base::MakeUnique) with their C++ standard counterparts. I thought that it would be good to show that LG Electronics had started to contribute to Chromium. After replacing all of them, I could start to contribute webOS patches in earnest. I’ve since merged webOS patches to reduce memory usage, release more used memory under OOM, add a new content API to suspend/resume DOM operation, and so on. Below is the list of the main patches I successfully merged:

    1. New content API to suspend/resume DOM operation
    2. Release more used memory under an out-of-memory situation
      • Note: According to the Chromium performance bot, the patches to reduce the memory usage in RenderThreadImpl::ClearMemory could reduce the memory usage by up to 2MB in the background.
      • Note: An OnMemoryPressure listener was added to the compositor layers through these patches, so the compositor has been releasing more used memory under OOM through the OOM handler. In my opinion, this is a very good contribution from the webOS.
    3. Introduce new command line switches for embedded devices

    4. Trace LG Electronics contribution stats

    As more webOS patches were merged to Chromium mainline, I thought that it would be good to run a tool tracking all LG Electronics Chromium contributions so that LG Electronics’ Chromium contribution efforts are well documented. To do this I set up the LG Electronics Chromium contribution stats using the GitStats tool. The tool has been generating the stats every day.

    I was happy to work on the webOS upstream project over the past year. It was challenging work, because a downstream patch should show some benefit in Chromium mainline. I’m sure that LG Electronics will keep contributing good webOS patches and I hope they become a good partner as well as a contributor in Chromium.

    by gyuyoung at January 17, 2019 09:12 AM

    January 16, 2019

    Víctor Jáquez

    Rust bindings for GStreamerGL: Memoirs

    Rust is a great programming language, and the community around it is just amazing. Those are the ingredients for the craft of useful software tools, just like Servo, an experimental browser engine designed for task isolation and high parallelization.

    Both projects, Rust and Servo, are funded by Mozilla.

    Thanks to Mozilla and Igalia I have the opportunity to work on Servo, adding HTML5 multimedia features to it.

    First, with the help of Fernando Jiménez, we finished what my colleague Philippe Normand and Sebastian Dröge (one of my programming heroes) started: a media player in Rust designed to be integrated in Servo. This media player lives in its own crate: servo/media along with the WebAudio engine. A crate, in Rust jargon, is like a library. This crate is (very ad-hocly) designed to be multimedia framework agnostic, but the only backend right now is for GStreamer. Later we integrated it into Servo adding an initial support for audio and video tags.

    Currently, servo/media passes, through an IPC channel, the array with the whole frame to render in Servo. This implies, at least, one copy of the frame in memory, and we would like to avoid it.

    For painting and compositing the web content, Servo uses WebRender, a crate designed to use the GPU intensively. Thus, if instead of raw frame data we pass OpenGL textures to WebRender, the performance could be enhanced notably.

    Luckily, GStreamer already supports the uploading, downloading, painting and composition of video frames as OpenGL textures with the OpenGL plugin and its OpenGL Integration library. Even more, with plugins such as GStreamer-VAAPI, Gst-OMX (OpenMAX), and others, it’s possible to process video without using the main CPU or its mapped memory in different platforms.

    But from what’s available in GStreamer to what’s available in Rust there’s a distance. Nonetheless, Sebastian has put a lot of effort into the Rust bindings for GStreamer, both for applications and plugins. Sadly, GStreamer’s OpenGL Integration library (GstGL for short) wasn’t available at that time. So I rolled up my sleeves and got to work on the bindings.

    These are the stories of that work.

    As GStreamer shares with GTK+ the GObject framework and its introspection mechanism, both projects have collaborated on the required infrastructure to support Rust bindings. Thanks to all the GNOME folks who are working on the intercommunication between Rust and GObject. The quest has been long and complex, since Rust doesn’t map all the object oriented concepts, and GObject, being a set of practices and software helpers to do object oriented programming with C, is not used homogeneously.

    The Rubicon that eases the generation of Rust bindings for GObject-based projects is GIR, a tool, written in Rust, that reads gir files, along with metadata in toml, and outputs two types of bindings: sys and api.

    Rust can call external functions through FFI (foreign function interface), which is just a declaration of a C function with Rust types. But these functions are considered unsafe. The sys bindings are just the exported C functions of the library, organized by the library’s namespace.

    The next step is to create a safe and rustified API: the api bindings.
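    As a minimal illustration of this sys/api split (using libc’s abs as a stand-in, since the real gstreamer-gl-sys bindings export GStreamer functions instead):

```rust
// Illustrative only: the two binding layers that GIR generates, shown with
// a libc function as a stand-in for a real GStreamer symbol.

// "sys"-style layer: a raw FFI declaration, unsafe to call.
mod sys {
    extern "C" {
        pub fn abs(input: i32) -> i32; // provided by the C library
    }
}

// "api"-style layer: a safe, rustified wrapper over the sys layer.
pub fn absolute(input: i32) -> i32 {
    unsafe { sys::abs(input) }
}

fn main() {
    println!("{}", absolute(-42)); // prints 42
}
```

    The unsafety is confined to the wrapper, so users of the api crate never need to write unsafe code themselves.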

    As we said, GObject libraries are not quite homogeneous, and even when following the introspection annotations, there will be cases where GIR won’t be able to generate the correct bindings. For that reason GIR is constantly evolving, looking for a common way to solve the corner cases that exist in every GObject project. For example, these are my patches in order to generate the GstGL bindings.

    The tasks done were the following. For this document we assume that the reader has a functional Rust setup and knows the basic concepts.

    Clone and build gir

    $ cd ~/ws
    $ git clone https://github.com/gtk-rs/gir.git
    $ cd gir
    $ cargo build --release
    

    The reason to build gir in release mode is that, otherwise, it would be very slow.

    For sys bindings.

    These kinds of bindings are normally straightforward (and unsafe) since they only map the C API to Rust via the FFI mechanism.

    $ cd ~/ws
    $ git clone https://gitlab.freedesktop.org/gstreamer/gstreamer-rs-sys.git
    $ cd gstreamer-rs-sys
    $ cp /usr/share/gir-1.0/GstGL-1.0.gir gir-files/
    
    1. Verify that the gir file is more or less correct
      1. If there is something strange, we should fix the code that generated it.
      2. If that is not possible, the last resort is to fix the gir file directly, which is just XML, not manually but through a script using xmlstarlet. See fix.sh in gtk-rs as an example.
    2. Create the toml file with the metadata required to create the bindings. In other words, this file contains the exceptions, rules and options used by the tool to generate the bindings. See Gir_GstGL.toml in gstreamer-rs-sys as an example. The documentation of the toml file is in gir’s README.md file.
    $ ~/ws/gir/target/release/gir -c Gir_GstGL.toml
    

    This command will generate, as specified in the toml file (target_path), a crate in the directory named gstreamer-gl-sys.

    Api bindings.

    This type of bindings may require more manual work, since their purpose is to offer a rustified API of the library, with all its syntactic sugar, semantics, and so on. But in general terms, the process is similar:

    $ cd ~/ws
    $ git clone https://gitlab.freedesktop.org/gstreamer/gstreamer-sys.git
    $ cd gstreamer-sys
    $ cp /usr/share/gir-1.0/GstGL-1.0.gir gir-files/
    

    Again, it might be necessary to apply fixes to the gir file through a fix.sh script using xmlstarlet.

    And again, crafting the toml file might take a lot of time, by trial and error, cleaning and tidying up the API. See Gir_GstGL.toml in gstreamer-rs as an example.

    $ ~/ws/gir/target/release/gir -c Gir_GstGL.toml
    

    A good way to test your bindings is by crafting a test application which shows how to use the API. Personally, I devoted a ton of time to the test application for GstGL, but it was worth it. It made me aware of a missing part in Glutin, the crate used for GL applications in Rust: a way to get the EGLDisplay in use. So I also worked on that and sent a pull request that was recently merged. The sweets of free software development.

    Nowadays I’m integrating GstGL API in servo/media and later, Servo!

    by vjaquez at January 16, 2019 07:42 PM

    January 14, 2019

    Andrés Gómez

    matrix-send me a notification!

    When you are working in the console of an Un*x system you always have the possibility of using some kind of notification system to warn you when a task has completed. Quite typically, that would involve an email that could arrive to your box’ local inbox or, if you have a mail agent properly configured, to some other inbox on the Internet.

    With the arrival of Instant Messaging systems you could somehow move from the good old email notification to some other fancy service. That has been my preferred method for quite a while, since I understand email as a “non-instant” messaging system. Basically, I do not want to get instant notifications when a mail arrives. Add to that the hassle of setting some kind of filter criteria to get notifications only for specific mail rules, and the not yet universally supported IMAP4 push method, instead of polling for newly arrived mail …

    Anyway, long story short, for some time now we have been using [matrix] as our Instant Messaging service at Igalia so, why not get notifications there when a task is completed?

    Yes, you have guessed correctly, that’s possible and, actually, it’s very easy to set up, especially with the help of matrix-send.

    First, you need an account that will send you the notification(s). Ideally, that would be a bot user, but it could be any account. Then, you have to get an access token for such user so you can interact with the matrix server from the command line as if it were any other ordinary matrix client. Finally, you need to create a chat room between that user and your own in order to keep the communication going. All this is explained in matrix’ client-server API documentation but, to make things easier, it would go as follows:

    $ curl -XPOST -d '{"user":"<matrix-user>", "password":"<password>", "type":"m.login.password"}' "https://<matrix-server>/_matrix/client/r0/login"
    {
        "access_token": "<access-token>",
        "device_id": "<device-id>",
        "home_server": "<home-server>",
        "user_id": "@<matrix-user>:<home-server>"
    }

    This will give you the needed access-token.

    Now, from your regular matrix client, invite the bot user to a conversation in a new room. Check in the configuration of the new room for its internal ID. It would be something like
    !<internal-id>:<home-server>.

    Then, accept such invitation from the command line:

    $ curl -XPOST -d '{}' "https://<matrix-server>/_matrix/client/r0/rooms/%21<internal-room-id>:<home-server>/join?access_token=<access-token>"
    {
        "room_id": "!<internal-room-id>:<home-server>"
    }

    All that is left is to configure matrix-send and start using it. Mind you, I’ve done a small addition that has not been merged yet, so I would just clone from my fork.

    The configuration file would look like this:

    $ cat ~/.config/matrix-send/config.ini
    [DEFAULT]
    endpoint=https://<matrix-server>/_matrix/
    access_token=<access-token>
    channel_id=!<internal-room-id>:<home-server>
    msgtype=m.text

    The interesting addition from my own is the msgtype field. By default, in matrix-send its value is m.notice which, depending on the configuration, quite typically won’t trigger a notification in your matrix client.

    All that is left is to make matrix-send executable and test it:

    $ chmod +x <path-to-matrix-send>/matrix-send.py
    $ <path-to-matrix-send>/matrix-send.py "Hello World!"
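    Under the hood, matrix-send just talks to the same client-server API we used above with curl. A minimal Python sketch of that interaction could look like this (the endpoint, room id and token are placeholders, and the helper names are mine, not matrix-send’s):

```python
# A minimal sketch of sending a matrix notification via the r0
# client-server API, using only the standard library.
import json
import urllib.parse
import urllib.request

def build_send_request(endpoint, room_id, access_token, text,
                       msgtype="m.text"):
    """Build the URL and JSON body for posting a m.room.message event."""
    url = ("%s/client/r0/rooms/%s/send/m.room.message?access_token=%s"
           % (endpoint.rstrip("/"),
              urllib.parse.quote(room_id, safe=""),
              urllib.parse.quote(access_token)))
    body = json.dumps({"msgtype": msgtype, "body": text}).encode("utf-8")
    return url, body

def send_notification(endpoint, room_id, access_token, text):
    url, body = build_send_request(endpoint, room_id, access_token, text)
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # contains the event_id on success

# Placeholders only; fill in your own server, room and token:
url, body = build_send_request("https://matrix.example.org/_matrix",
                               "!abc123:example.org", "<access-token>",
                               "Hello World!")
print(url)
```

    Note the room id is percent-encoded in the URL, which is why the curl examples above use %21 for the leading “!”.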

    by tanty at January 14, 2019 10:05 PM

    January 11, 2019

    Manuel Rego

    An introduction to CSS Containment

    Igalia has been recently working on the implementation of css-contain in Chromium by providing some fixes and optimizations based on this standard. This is a brief blog post trying to give an introduction to the spec, explain the status of things, the work done during past year, and some plans for the future.

    What’s css-contain?

    The main goal of CSS Containment standard is to improve the rendering performance of web pages, allowing the isolation of a subtree from the rest of the document. This specification only introduces one new CSS property called contain with different possible values. Browser engines can use that information to implement optimizations and avoid doing extra work when they know which subtrees are independent of the rest of the page.

    Let’s explain what this is about and why it can bring performance improvements to complex websites. Imagine that you have a big HTML page which generates a complex DOM tree, but you know that some parts of that page are totally independent of the rest, and that the content in those parts is modified at some point.

    Browser engines usually try to avoid doing more work than needed and use some heuristics to avoid spending more time than required. However there are lots of corner cases and complex situations in which the browser needs to actually recompute the whole webpage. To improve these scenarios the author has to identify which parts (subtrees) of their website are independent and isolate them from the rest of the page thanks to the contain property. Then, when there are changes in some of those subtrees, the rendering engine will be able to avoid doing any work outside of the subtree boundaries.

    Not everything is for free: when you use contain there are some restrictions that will affect those elements, so that the browser is totally certain it can apply optimizations without causing any breakage (e.g. you need to manually set the size of the element if you want to use size containment).

    The CSS Containment specification defines four values for the contain property, one for each type of containment:

    • layout: The internal layout of the element is totally isolated from the rest of the page, it’s not affected by anything outside and its contents cannot have any effect on the ancestors.
    • paint: Descendants of the element cannot be displayed outside its bounds, nothing will overflow this element (or if it does it won’t be visible).
    • size: The size of the element can be computed without checking its children, the element dimensions are independent of its contents.
    • style: The effects of counters and quotes cannot escape this element, so they are isolated from the rest of the page.
      Note that regarding style containment there is an ongoing discussion on the CSS Working Group about how useful it is (due to the narrowed scope of counters and quotes).

    You can combine the different types of containment as you wish, but the spec also provides two extra values that are a kind of “shorthand” for the other four:

    • content: Which is equivalent to contain: layout paint style.
    • strict: This is the same as having all four types of containment, so it’s equivalent to contain: layout paint size style.

    Example

    Let’s show an example of how CSS Containment can help to improve the performance of a webpage.

    Imagine a page with lots of elements, in this case 10,000 elements like this:

      <div class="item">
        <div>Lorem ipsum...</div>
      </div>

    And that it modifies the content of one of the inner DIVs through the textContent attribute.

    If you don’t use css-contain, even when the change is on a single element, Chromium spends a lot of time on layout because it traverses the whole DOM tree (which in this case is big as it has 10,000 elements).

    CSS Containment Example DOM Tree CSS Containment Example DOM Tree

    Here is where the contain property comes to the rescue. In this example the DIV item has a fixed size, and the contents we’re changing in the inner DIV will never overflow it. So we can apply contain: strict to the item; that way the browser won’t need to visit the rest of the nodes when something changes inside an item: it can stop checking things on that element and avoid going outside.

    Notice that if the content overflows the item it would get clipped; also, if we don’t set a fixed size for the item it’ll be rendered as an empty box, so nothing would be visible (actually in this example the borders would be present, but they would be the only visible thing).
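    The styles assumed for the item in this example could look as follows (the class name comes from the markup above; the dimensions are illustrative, not taken from the original test page):

```css
/* Illustrative styles: contain: strict includes size containment, so the
   item needs an explicit size or it would render as an empty box. */
.item {
  contain: strict;
  width: 200px;   /* made-up dimensions */
  height: 60px;
  border: 1px solid #ccc;
}
```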

    CSS Containment Example CSS Containment Example

    Despite how simple each of the items is in this example, we’re getting a big improvement by using CSS Containment: layout time goes down from ~4ms to ~0.04ms, which is a huge difference. Imagine what would happen if the DOM tree had very complex structures and contents but only a small part of the page got modified; if you can isolate that from the rest of the page you could get similar benefits.

    State of the art

    This is not a new spec: Chrome 52 shipped the initial support back in July 2016, but during the last year there has been some extra development related to it, and that’s what I want to highlight in this blog post.

    First of all, many specification issues have been fixed, and some of them imply changes on the implementations; most of this work has been carried out by Florian Rivoal in collaboration with the CSS Working Group.

    Not only that but on the tests side Gérard Talbot has completed the test suite in the web-platform-tests (WPT) repository, which is really important to fix bugs on the implementations and ensure interoperability.

    In my case I’ve been working on the Chromium implementation fixing several bugs and interoperability issues and getting it up to date according to the last specification changes. I took advantage of the WPT test suite to do this work and also contributed back a bunch of tests there. I also imported Firefox tests into Chromium to improve interop (even did a small Firefox patch as part of this work).

    Last, it’s worth noticing that Firefox has been actively working on the implementation of css-contain during the last year (you can test it by enabling the runtime flag layout.css.contain.enabled). Hopefully that will bring a second browser engine shipping the spec in the future.

    Wrap-up

    CSS Containment is a nice and simple specification that can be useful to improve web rendering performance in many different use cases. It’s true that currently it’s only supported by Chromium (remember that Firefox is working on it too) and that more improvements and optimizations can be implemented on top of it; still, it seems to have huge potential.

    Igalia logo Bloomberg logo
    Igalia and Bloomberg working together to build a better web

    One more time all the work from Igalia related to css-contain has been sponsored by Bloomberg as part of our ongoing collaboration.

    Bloomberg has some complex UIs that are taking advantage of css-contain to improve the rendering performance, in future blog posts we’ll talk about some of these cases and the optimizations that have been implemented on the rendering engine to improve them.

    January 11, 2019 08:00 AM

    January 10, 2019

    Diego Pino

    The eXpress Data Path

    In the previous article I briefly introduced XDP (eXpress Data Path) and eBPF, the multipurpose in-kernel virtual machine. On the XDP side, I focused only on the motivations behind this new technology, the reasons why rearchitecting the Linux kernel networking layer to enable faster packet processing. However, I didn’t get much into the details on how XDP works. In this new blog post I try to go deeper into XDP.

    XDP: A fast path for packet processing

    The design of XDP has its roots in a DDoS attack mitigation solution presented by Cloudflare at Netdev 1.1. Cloudflare relies heavily on iptables, which according to their own metrics is able to handle 1 Mpps on a decent server (Source: Why we use the Linux kernel’s TCP stack). In the event of a DDoS attack, the amount of spoofed traffic can be up to 3 Mpps. Under those circumstances, a Linux box starts to be flooded by IRQ interrupts until it becomes unusable.

    Because Cloudflare wanted to keep the convenience of using iptables (and the rest of the kernel’s network stack), they couldn’t go with a solution that takes full control of the hardware, such as DPDK. Their solution consisted of implementing what they called a “partial kernel bypass”. Some queues of the NIC are still attached to the kernel while others are attached to a user-space program that decides whether a packet should be dropped or not. By dropping packets at the lowest point of the stack, the amount of traffic that reaches the kernel’s networking subsystem gets significantly reduced.

    Cloudflare’s solution used the Netmap toolkit to implement its partial kernel bypass (Source: Single Rx queue kernel bypass with Netmap). However, this idea could be generalized by adding a checkpoint in the Linux kernel network stack, preferably as soon as a packet is received in the NIC. This checkpoint should pass a packet to a user-space program that will decide what to do with it: drop it or let it continue through the normal path.

    Luckily, Linux already features a mechanism that allows user-space code execution within the kernel: the eBPF VM. So the solution seemed obvious.

    Linux network stack with XDP
    Linux network stack with XDP

    Packet operations

    Every network function, no matter how complex it is, consists of a series of basic operations:

    • Firewall: read incoming packets, compare them to a table of rules and execute an action: forward or drop.
    • NAT: read incoming packets, modify headers and forward packet.
    • Tunnelling: read incoming packets, create a new packet, embed the original packet into the new one and forward it.

    XDP passes packets to our eBPF program, which decides what to do with them. We can read or modify them if we need to. We can also access helper functions to parse packets, compute checksums, and perform other tasks at no cost (avoiding system call penalties). And thanks to eBPF Maps we have access to complex data structures for persistent data storage, like tables. We are also able to decide what to do with a packet: are we going to drop it? Forward it? To control a packet’s processing logic, XDP provides a set of predefined actions:

    • XDP_PASS: pass the packet to the normal network stack.
    • XDP_DROP: very fast drop.
    • XDP_TX: forward, or bounce the packet back out of the same interface.
    • XDP_REDIRECT: redirects the packet to another NIC or CPU.
    • XDP_ABORTED: indicates eBPF program error.

    XDP_PASS, XDP_TX and XDP_REDIRECT are specific cases of a forwarding action, whereas XDP_ABORTED is actually treated as a packet drop.

    Let’s take a look at one example that uses most of these elements to program a simple network function.

    Example: An IPv6 packet filter

    The canonical example when introducing XDP is a DDoS filter. What such a network function does is drop packets if they come from a suspicious origin. In my case, I’m going with something even simpler: a function that filters out all traffic except IPv6.

    The advantage of this simpler function is that we don’t need to manage a list of suspicious addresses. Our program will simply examine the ethertype value of a packet and let it continue through the network stack or drop it depending on whether it is an IPv6 packet or not.

    SEC("prog")
    int xdp_ipv6_filter_program(struct xdp_md *ctx)
    {
        void *data_end = (void *)(long)ctx->data_end;
        void *data     = (void *)(long)ctx->data;
        struct ethhdr *eth = data;
        u16 eth_type = 0;
    
        if (!(parse_eth(eth, data_end, &eth_type))) {
            bpf_debug("Debug: Cannot parse L2\n");
            return XDP_PASS;
        }
    
        bpf_debug("Debug: eth_type:0x%x\n", ntohs(eth_type));
        if (eth_type == ntohs(0x86dd)) {
            return XDP_PASS;
        } else {
            return XDP_DROP;
        }
    }
    

    The function xdp_ipv6_filter_program is our main program. We define a new section in the binary called prog. This serves as a hook between our program and XDP. Whenever XDP receives a packet, our code will be executed.

    ctx represents a context, a struct which contains all the data necessary to access a packet. Our program calls parse_eth to fetch the ethertype, then checks whether its value is 0x86dd (the IPv6 ethertype); in that case the packet passes. Otherwise the packet is dropped. In addition, all the ethertype values are printed for debugging purposes.
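    For illustration, the same classification logic can be reproduced in user space with a few lines of Python (this is not XDP code, just the same ethertype check applied to raw frame bytes):

```python
# User-space illustration of the check the eBPF program performs: read the
# 2-byte ethertype at offset 12 of an Ethernet frame (after the destination
# and source MAC addresses) and decide whether to pass (IPv6) or drop.
import struct

ETH_P_IPV6 = 0x86dd

def classify(frame: bytes) -> str:
    if len(frame) < 14:          # too short to contain an Ethernet header
        return "XDP_PASS"        # like the example: pass unparseable frames
    (eth_type,) = struct.unpack_from("!H", frame, 12)  # network byte order
    return "XDP_PASS" if eth_type == ETH_P_IPV6 else "XDP_DROP"

ipv6_frame = b"\x00" * 12 + b"\x86\xdd" + b"\x00" * 40
ipv4_frame = b"\x00" * 12 + b"\x08\x00" + b"\x00" * 40
print(classify(ipv6_frame))  # XDP_PASS
print(classify(ipv4_frame))  # XDP_DROP
```

    The bounds check before reading the header mirrors the comparison against data_end in parse_eth, which is exactly what the eBPF verifier insists on.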

    bpf_debug is in fact a macro defined as:

    #define bpf_debug(fmt, ...)                          \
        ({                                               \
            char ____fmt[] = fmt;                        \
            bpf_trace_printk(____fmt, sizeof(____fmt),   \
                ##__VA_ARGS__);                          \
        })
    

    It uses the function bpf_trace_printk under the hood, a function which prints out messages in /sys/kernel/debug/tracing/trace_pipe.

    The function parse_eth takes a packet’s beginning and end and parses its content.

    static __always_inline
    bool parse_eth(struct ethhdr *eth, void *data_end, u16 *eth_type)
    {
        u64 offset;
    
        offset = sizeof(*eth);
        if ((void *)eth + offset > data_end)
            return false;
        *eth_type = eth->h_proto;
        return true;
    }
    

    Running external code in the kernel involves certain risks. For instance, an infinite loop may freeze the kernel, or a program may access an unrestricted area of memory. To avoid these potential hazards, a verifier is run when the eBPF code is loaded. The verifier walks all possible code paths, checking that our program doesn’t access out-of-range memory and that there are no out-of-bounds jumps. The verifier also ensures the program terminates in finite time.

    The snippets above make up our eBPF program. Now we just need to compile it (full source code is available at: xdp_ipv6_filter).

    $ make
    

    Which generates xdp_ipv6_filter.o, the eBPF object file.

    Now we’re going to load this object file into a network interface. There are two ways to do that:

    • Write a user-space program that loads the object file and attaches it to a network interface.
    • Use iproute2 to load the object file to an interface.

    For this example, I’m going to use the latter method.

    Currently there’s a limited number of network interfaces that support XDP (ixgbe, i40e, mlx5, veth, tap, tun, virtio_net and others), although the list is growing. Some of these network interfaces support XDP at the driver level. That means the XDP hook is implemented at the lowest point in the networking layer, just when the NIC receives a packet in the Rx ring. In other cases, the XDP hook is implemented at a higher point in the network stack. The former method offers better performance, although the latter makes XDP available for any network interface.

    Luckily, veth interfaces are supported by XDP. I’m going to create a veth pair and attach the eBPF program to one of its ends. Remember that a veth always comes in pairs: it’s like a virtual patch cable connecting two interfaces. Whatever is transmitted on one of the ends arrives at the other end and vice versa.

    $ sudo ip link add dev veth0 type veth peer name veth1
    $ sudo ip link set up dev veth0
    $ sudo ip link set up dev veth1
    

    Now I attach the eBPF program to veth1:

    $ sudo ip link set dev veth1 xdp object xdp_ipv6_filter.o
    

    You may have noticed I called the section for the eBPF program “prog”. That’s the name of the section iproute2 expects to find, and naming the section differently will result in an error.

    If the program was successfully loaded I should see an xdp flag in the veth1 interface:

    $ sudo ip link sh veth1
    8: veth1@veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 xdp qdisc noqueue state UP mode DEFAULT group default qlen 1000
        link/ether 32:05:fc:9a:d8:75 brd ff:ff:ff:ff:ff:ff
        prog/xdp id 32 tag bdb81fb6a5cf3154 jited
    

    To verify my program works as expected, I’m going to push a mix of IPv4 and IPv6 packets to veth0 (ipv4-and-ipv6-data.pcap). My sample has a total of 20 packets (10 IPv4 and 10 IPv6). Before doing that, though, I’m going to launch tcpdump on veth1, set to capture exactly 10 IPv6 packets.

    $ sudo tcpdump "ip6" -i veth1 -w captured.pcap -c 10
    tcpdump: listening on veth1, link-type EN10MB (Ethernet), capture size 262144 bytes
    

    Send packets to veth0:

    $ sudo tcpreplay -i veth0 ipv4-and-ipv6-data.pcap
    

    The filtered packets arrived at the other end. The tcpdump program terminates since all the expected packets were received.

    10 packets captured
    10 packets received by filter
    0 packets dropped by kernel
    

    We can also print out /sys/kernel/debug/tracing/trace_pipe to check the ethertype values logged:

    $ sudo cat /sys/kernel/debug/tracing/trace_pipe
    tcpreplay-4496  [003] ..s1 15472.046835: 0: Debug: eth_type:0x86dd
    tcpreplay-4496  [003] ..s1 15472.046847: 0: Debug: eth_type:0x86dd
    tcpreplay-4496  [003] ..s1 15472.046855: 0: Debug: eth_type:0x86dd
    tcpreplay-4496  [003] ..s1 15472.046862: 0: Debug: eth_type:0x86dd
    tcpreplay-4496  [003] ..s1 15472.046869: 0: Debug: eth_type:0x86dd
    tcpreplay-4496  [003] ..s1 15472.046878: 0: Debug: eth_type:0x800
    tcpreplay-4496  [003] ..s1 15472.046885: 0: Debug: eth_type:0x800
    tcpreplay-4496  [003] ..s1 15472.046892: 0: Debug: eth_type:0x800
    tcpreplay-4496  [003] ..s1 15472.046903: 0: Debug: eth_type:0x800
    tcpreplay-4496  [003] ..s1 15472.046911: 0: Debug: eth_type:0x800
    ...
    

    XDP: The future of in-kernel packet processing?

    XDP started as a fast path for certain use cases, especially ones that can result in an early packet drop (like a DDoS attack prevention solution). However, since a network function is nothing but a combination of basic primitives (reads, writes, forwarding, dropping…), all of them available via XDP/eBPF, it is possible to use XDP for more than packet dropping. It could be used, in fact, to implement any network function.

    So what started as a fast path is gradually becoming the normal path. We’re now seeing tools such as iptables get rewritten in XDP/eBPF while keeping their user-level interfaces intact. The enormous performance gains of this new approach make the effort worth it. And since the hunger for more performance never ends, it seems reasonable to think that any other tool that can possibly be written in XDP/eBPF will follow a similar fate.

    iptables vs nftables vs bpfilter

    Source: Why is the kernel community replacing iptables with BPF?

    Summary

    In this article I took a closer look at XDP. I explained the motivations that led to its design. Through a simple example, I showed how XDP and eBPF work together to perform fast packet processing inside the kernel. XDP provides hook points within the kernel’s network stack. An eBPF program attached to an XDP hook can perform operations on a packet (modify its headers, drop it, forward it, etc).

    XDP offers high-performance packet processing while maintaining interoperability with the rest of the networking subsystem, an advantage over full kernel-bypass solutions. I didn’t get much into the internals of XDP and how it interacts with other parts of the networking subsystem, though. I encourage checking the first two links in the recommended readings section for a deeper understanding of XDP internals.

    In the next article, the last in the series, I will cover the new AF_XDP socket address family and the implementation of a Snabb bridge for this new interface.

    Recommended readings:

    January 10, 2019 10:00 AM

    January 08, 2019

    Carlos García Campos

    Epiphany automation mode

    Last week I finally found some time to add an automation mode to Epiphany, which allows running automated tests using WebDriver. It’s important to note that the automation mode is not meant to be used by users or applications to control the browser remotely, but only by WebDriver automated tests. For that reason, the automation mode is incompatible with a primary user profile. There are a few other things affected by the automation mode:

    • There’s no persistency. A private profile is created in tmp and only ephemeral web contexts are used.
    • URL entry is not editable, since users are not expected to interact with the browser.
    • An info bar is shown to notify the user that the browser is being controlled by automation.
    • The window decoration is orange to make it even clearer that the browser is running in automation mode.

    So, how can I write tests to be run in Epiphany? First, you need to install a recent enough Selenium. For now, only the Python API is supported. Selenium doesn’t have an Epiphany driver, but the WebKitGTK driver can be used with any WebKitGTK+ based browser by providing the browser information as part of the session capabilities.

    from selenium import webdriver
    
    options = webdriver.WebKitGTKOptions()
    options.binary_location = 'epiphany'
    options.add_argument('--automation-mode')
    options.set_capability('browserName', 'Epiphany')
    options.set_capability('version', '3.31.4')
    
    ephy = webdriver.WebKitGTK(options=options, desired_capabilities={})
    ephy.get('http://www.webkitgtk.org')
    ephy.quit()
    

    This is a very simple example that just opens Epiphany in automation mode, loads http://www.webkitgtk.org and closes Epiphany. A few comments about the example:

    • Version 3.31.4 will be the first one including the automation mode.
    • The desired_capabilities parameter shouldn’t be needed, but there’s a bug in Selenium that has only been fixed very recently.
    • WebKitGTKOptions.set_capability was added in Selenium 3.14; if you have an older version you can use the following snippet instead:
    from selenium import webdriver
    
    options = webdriver.WebKitGTKOptions()
    options.binary_location = 'epiphany'
    options.add_argument('--automation-mode')
    capabilities = options.to_capabilities()
    capabilities['browserName'] = 'Epiphany'
    capabilities['version'] = '3.31.4'
    
    ephy = webdriver.WebKitGTK(desired_capabilities=capabilities)
    ephy.get('http://www.webkitgtk.org')
    ephy.quit()
    

    To simplify the driver instantiation you can create your own Epiphany driver derived from the WebKitGTK one:

    from selenium import webdriver
    
    class Epiphany(webdriver.WebKitGTK):
        def __init__(self):
            options = webdriver.WebKitGTKOptions()
            options.binary_location = 'epiphany'
            options.add_argument('--automation-mode')
            options.set_capability('browserName', 'Epiphany')
            options.set_capability('version', '3.31.4')
    
            webdriver.WebKitGTK.__init__(self, options=options, desired_capabilities={})
    
    ephy = Epiphany()
    ephy.get('http://www.webkitgtk.org')
    ephy.quit()
    

    The same for Selenium < 3.14:

    from selenium import webdriver
    
    class Epiphany(webdriver.WebKitGTK):
        def __init__(self):
            options = webdriver.WebKitGTKOptions()
            options.binary_location = 'epiphany'
            options.add_argument('--automation-mode')
            capabilities = options.to_capabilities()
            capabilities['browserName'] = 'Epiphany'
            capabilities['version'] = '3.31.4'
    
            webdriver.WebKitGTK.__init__(self, desired_capabilities=capabilities)
    
    ephy = Epiphany()
    ephy.get('http://www.webkitgtk.org')
    ephy.quit()
    

    by carlos garcia campos at January 08, 2019 05:22 PM

    January 07, 2019

    Diego Pino

    A brief introduction to XDP and eBPF

    In a previous post I explained how to build a kernel with XDP (eXpress Data Path) support. Having that feature enabled is mandatory in order to use XDP, a new Linux kernel component that greatly improves packet processing performance.

    In recent years, we have seen a rise of programming toolkits and techniques aimed at overcoming the limitations of the Linux kernel when it comes to high-performance packet processing. One of the most popular techniques is kernel bypass, which means skipping the kernel’s networking layer and doing all packet processing from user-space. Kernel bypass also involves managing the NIC from user-space, in other words, relying on a user-space driver to handle the NIC.

    By giving full control of the NIC to a user-space program, we reduce the overhead introduced by the kernel (context switching, networking layer processing, interrupts, etc), which is relevant enough when working at speeds of 10Gbps or higher. Kernel bypass, plus a combination of other features (batch packet processing) and performance tuning adjustments (NUMA awareness, CPU isolation, etc), forms the basis of high-performance user-space networking. Perhaps the poster child of this new approach to packet processing is Intel’s DPDK (Data Plane Development Kit), although other well-known toolkits and techniques are Cisco’s VPP (Vector Packet Processing), Netmap and, of course, Snabb.

    The disadvantages of user-space networking are several:

    • An OS’s kernel is an abstraction layer for hardware resources. Since user-space programs need to manage their resources directly, they also need to manage their hardware. That often means writing their own drivers.
    • As kernel-space is completely skipped, so is all the networking functionality the kernel provides. User-space programs need to reimplement functionality that might otherwise be provided by the kernel or the OS.
    • Programs work as sandboxes, which severely limits their ability to interact, and be integrated, with other parts of the OS.

    Essentially, user-space networking achieves high-speed performance by moving packet processing out of the kernel’s realm into user-space. XDP does in fact the opposite: it moves user-space networking programs (filters, mappers, routing, etc) into the kernel’s realm. XDP allows us to execute our network function as soon as a packet hits the NIC, before it starts moving up into the kernel’s networking subsystem, which results in a significant increase of packet-processing speed. But how does the kernel make it possible for a user to execute their programs within the kernel’s realm? Before answering this question we need to take a look at BPF.

    BPF and eBPF

    Despite its somewhat misleading name, BPF (Berkeley Packet Filter) is in fact a virtual machine model. This VM was originally designed for packet filtering, hence its name.

    One of the most prominent users of BPF is the tool tcpdump. When capturing packets with tcpdump, a user can define a packet-filtering expression: only packets that match that expression will actually be captured. For instance, the expression “tcp dst port 80” captures all TCP packets whose destination port equals 80. This expression can be compiled down to BPF bytecode.

    $ sudo tcpdump -d "tcp dst port 80"
    (000) ldh      [12]
    (001) jeq      #0x86dd          jt 2    jf 6
    (002) ldb      [20]
    (003) jeq      #0x6             jt 4    jf 15
    (004) ldh      [56]
    (005) jeq      #0x50            jt 14   jf 15
    (006) jeq      #0x800           jt 7    jf 15
    (007) ldb      [23]
    (008) jeq      #0x6             jt 9    jf 15
    (009) ldh      [20]
    (010) jset     #0x1fff          jt 15   jf 11
    (011) ldxb     4*([14]&0xf)
    (012) ldh      [x + 16]
    (013) jeq      #0x50            jt 14   jf 15
    (014) ret      #262144
    (015) ret      #0
    

    Basically what the program above does is:

    • Instruction (000): loads the packet’s offset 12, as a 16-bit word, into the accumulator. Offset 12 represents a packet’s ethertype.
    • Instruction (001): compares the value of the accumulator to 0x86dd, which is the ethertype value for IPv6. If the result is true, the program counter jumps to instruction (002), if not it jumps to (006).
    • Instruction (006): compares the value of the accumulator to 0x800 (the ethertype value for IPv4). If true, jump to (007); if not, to (015).

    And so forth, until the packet-filtering program returns a result. This result is generally a boolean. Returning a non-zero value (instruction (014)) means the packet matched, whereas returning a zero value (instruction (015)) means it didn’t.
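    Translated to plain Python, the control flow of that bytecode looks roughly like this. It’s a sketch of the filter’s logic, not of the BPF VM itself, and for brevity it ignores the fragmentation check of instruction (010):

    ```python
    import struct

    def tcp_dst_port_80(pkt: bytes) -> bool:
        """Approximate the 'tcp dst port 80' cBPF filter shown above."""
        (eth_type,) = struct.unpack_from("!H", pkt, 12)      # (000): load ethertype
        if eth_type == 0x86DD:                               # (001): IPv6?
            if pkt[20] != 0x06:                              # (002)-(003): next header TCP?
                return False
            (dst_port,) = struct.unpack_from("!H", pkt, 56)  # (004): 14 (eth) + 40 (IPv6) + 2
            return dst_port == 80                            # (005)
        if eth_type == 0x0800:                               # (006): IPv4?
            if pkt[23] != 0x06:                              # (007)-(008): protocol TCP?
                return False
            ihl = (pkt[14] & 0x0F) * 4                       # (011): IP header length in bytes
            (dst_port,) = struct.unpack_from("!H", pkt, 14 + ihl + 2)  # (012)
            return dst_port == 80                            # (013)
        return False                                         # (015): no match

    # A hand-crafted IPv4 TCP packet with destination port 80
    ipv4_http = (b"\x00" * 12 + b"\x08\x00"             # Ethernet header, ethertype IPv4
                 + b"\x45" + b"\x00" * 8 + b"\x06"      # IPv4 header up to protocol=TCP
                 + b"\x00" * 10                         # rest of the IPv4 header
                 + b"\x00\x00" + struct.pack("!H", 80)  # TCP source/destination ports
                 + b"\x00" * 16)
    print(tcp_dst_port_80(ipv4_http))  # True
    ```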

    The BPF VM and its bytecode was introduced by Steve McCanne and Van Jacobson in late 1992, in their paper The BSD Packet Filter: A New Architecture for User-level Packet Capture, and it was presented for the first time at Usenix Conference Winter ‘93.

    Since BPF is a VM, it defines an environment where programs are executed. Besides a bytecode, it also defines a packet-based memory model (load instructions are implicitly done on the packet being processed), registers (A and X; Accumulator and Index register), a scratch memory store and an implicit Program Counter. Interestingly, BPF’s bytecode was modeled after the Motorola 6502 ISA. As Steve McCanne recalls in his Sharkfest ‘11 keynote, he was familiar with 6502 assembly from his junior high-school days programming on an Apple II, and that influenced him when he designed the BPF bytecode.

    The Linux kernel has featured BPF support since v2.5, mainly added by Jay Schulist. There were no major changes in the BPF code until 2011, when Eric Dumazet turned the BPF interpreter into a JIT (source: A JIT for packet filters). Instead of interpreting BPF bytecode, the kernel was now able to translate BPF programs directly to a target architecture: x86, ARM, MIPS, etc.

    Later on, in 2014, Alexei Starovoitov introduced a new BPF JIT. This new JIT was actually a new architecture based on BPF, known as eBPF. Both VMs co-existed for some time, but nowadays packet filtering is implemented on top of eBPF. In fact, a lot of documentation now refers to eBPF as BPF, while the classic BPF is known as cBPF.

    eBPF extends the classic BPF virtual machine in several ways:

    • Takes advantage of modern 64-bit architectures. eBPF uses 64-bit registers and increases the number of available registers from 2 (Accumulator and X register) to 10. eBPF also extends the number of opcodes (BPF_MOV, BPF_JNE, BPF_CALL…).
    • Decoupled from the networking subsystem. BPF was bound to a packet-based data model. Since it was used for packet filtering, its code lived within the networking subsystem. However, the eBPF VM is no longer bound to a data model and it can be used for any purpose. It’s now possible to attach an eBPF program to a tracepoint or to a kprobe. This opens up the door of eBPF to instrumentation, performance analysis and many more uses within other kernel subsystems. The eBPF code now lives at its own path: kernel/bpf.
    • Global data stores called Maps. Maps are key-value stores that allow the exchange of data between user-space and kernel-space. eBPF provides several types of Maps.
    • Helper functions, such as packet rewriting, checksum calculation or packet cloning. Unlike user-space programming, these functions get executed inside the kernel. In addition, it’s possible to execute system calls from eBPF programs.
    • Tail-calls. eBPF programs are limited to 4096 bytes. The tail-call feature allows an eBPF program to pass control to a new eBPF program, overcoming this limitation (up to 32 programs can be chained).

    eBPF: an example

    The Linux kernel sources include several eBPF examples. They’re available at samples/bpf/. To compile these examples simply type:

    $ sudo make samples/bpf/
    

    Instead of coding a new eBPF example, I’m going to reuse one of the samples available in samples/bpf/. I will go through some parts of the code and explain how it works. The example I chose was the tracex4 program.

    Generally, all the examples at samples/bpf/ consist of 2 files, in this case tracex4_kern.c and tracex4_user.c.

    We then need to compile tracex4_kern.c to eBPF bytecode. At this moment, gcc lacks a backend for eBPF. Luckily, clang can emit eBPF bytecode. The Makefile uses clang to compile tracex4_kern.c into an object file.

    I commented earlier that one of the most interesting features of eBPF is Maps. Maps are key/value stores that allow exchanging data between user-space and kernel-space programs. tracex4_kern defines one map:

    struct pair {
        u64 val;
        u64 ip;
    };  
    
    struct bpf_map_def SEC("maps") my_map = {
        .type = BPF_MAP_TYPE_HASH,
        .key_size = sizeof(long),
        .value_size = sizeof(struct pair),
        .max_entries = 1000000,
    };
    

    BPF_MAP_TYPE_HASH is one of the many Map types offered by eBPF. In this case, it’s simply a hash. You may also have noticed the SEC("maps") declaration. SEC is a macro used to create a new section in the binary. The tracex4_kern example actually defines two more sections:

    SEC("kprobe/kmem_cache_free")
    int bpf_prog1(struct pt_regs *ctx)
    {   
        long ptr = PT_REGS_PARM2(ctx);
    
        bpf_map_delete_elem(&my_map, &ptr); 
        return 0;
    }
        
    SEC("kretprobe/kmem_cache_alloc_node") 
    int bpf_prog2(struct pt_regs *ctx)
    {
        long ptr = PT_REGS_RC(ctx);
        long ip = 0;
    
        // get ip address of kmem_cache_alloc_node() caller
        BPF_KRETPROBE_READ_RET_IP(ip, ctx);
    
        struct pair v = {
            .val = bpf_ktime_get_ns(),
            .ip = ip,
        };
        
        bpf_map_update_elem(&my_map, &ptr, &v, BPF_ANY);
        return 0;
    }   
    

    These two functions will allow us to delete an entry from a map (kprobe/kmem_cache_free) and to add a new entry to a map (kretprobe/kmem_cache_alloc_node). All the function calls in capital letters are actually macros defined at bpf_helpers.h.
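    Conceptually, the bookkeeping these two probes implement could be written in Python like this. It’s a sketch of the semantics only, with hypothetical handler names, not a real eBPF API:

    ```python
    import time

    # my_map: object pointer -> (allocation timestamp, caller instruction pointer)
    my_map = {}

    def on_kmem_cache_alloc_node(ptr, caller_ip):
        """Mirror of bpf_prog2: record when and where the object was allocated."""
        my_map[ptr] = (time.monotonic(), caller_ip)

    def on_kmem_cache_free(ptr):
        """Mirror of bpf_prog1: forget objects once they are freed."""
        my_map.pop(ptr, None)

    on_kmem_cache_alloc_node(0xffff8d6430f60a00, 0xffffffff9891ad90)
    on_kmem_cache_free(0xffff8d6430f60a00)
    print(len(my_map))  # 0: the object was allocated and then freed
    ```

    Only objects that were allocated but never freed survive in the map, which is exactly what lets the user-space side print “old” objects.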

    If I dump the sections of the object file, I should be able to see these new sections defined:

    $ objdump -h tracex4_kern.o
    
    tracex4_kern.o:     file format elf64-little
    
    Sections:
    Idx Name          Size      VMA               LMA               File off  Algn
      0 .text         00000000  0000000000000000  0000000000000000  00000040  2**2
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
      1 kprobe/kmem_cache_free 00000048  0000000000000000  0000000000000000  00000040  2**3
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      2 kretprobe/kmem_cache_alloc_node 000000c0  0000000000000000  0000000000000000  00000088  2**3
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      3 maps          0000001c  0000000000000000  0000000000000000  00000148  2**2
                      CONTENTS, ALLOC, LOAD, DATA
      4 license       00000004  0000000000000000  0000000000000000  00000164  2**0
                      CONTENTS, ALLOC, LOAD, DATA
      5 version       00000004  0000000000000000  0000000000000000  00000168  2**2
                      CONTENTS, ALLOC, LOAD, DATA
      6 .eh_frame     00000050  0000000000000000  0000000000000000  00000170  2**3
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
    

    Then there is tracex4_user.c, the main program. Basically what the program does is listen to kmem_cache_alloc_node events. When such an event happens, the corresponding eBPF code is executed. The code stores the IP attribute of an object into a map, which is printed in a loop by the main program. Example:

    $ sudo ./tracex4
    obj 0xffff8d6430f60a00 is  2sec old was allocated at ip ffffffff9891ad90
    obj 0xffff8d6062ca5e00 is 23sec old was allocated at ip ffffffff98090e8f
    obj 0xffff8d5f80161780 is  6sec old was allocated at ip ffffffff98090e8f
    

    How are the user-space program and the eBPF program connected? On initialization, tracex4_user.c loads the tracex4_kern.o object file using the load_bpf_file function.

    int main(int ac, char **argv)
    {
        struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
        char filename[256];
        int i;
    
        snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
    
        if (setrlimit(RLIMIT_MEMLOCK, &r)) {
            perror("setrlimit(RLIMIT_MEMLOCK, RLIM_INFINITY)");
            return 1;
        }
    
        if (load_bpf_file(filename)) {
            printf("%s", bpf_log_buf);
            return 1;
        }
    
        for (i = 0; ; i++) {
            print_old_objects(map_fd[1]);
            sleep(1);
        }
    
        return 0;
    }
    

    When load_bpf_file is executed, the probes defined in the eBPF file are added to /sys/kernel/debug/tracing/kprobe_events. We’re now listening to those events and our program can do something when they happen.

    $ sudo cat /sys/kernel/debug/tracing/kprobe_events
    p:kprobes/kmem_cache_free kmem_cache_free
    r:kprobes/kmem_cache_alloc_node kmem_cache_alloc_node
    

    All the other programs in samples/bpf/ follow a similar structure. There are always two files:

    • XXX_kern.c: the eBPF program.
    • XXX_user.c: the main program.

    The eBPF program defines Maps and functions hooked to a binary section. When the kernel emits a certain type of event (a tracepoint, for instance) our hooks will be executed. Maps are used to exchange data between the kernel program and the user-space program.

    Wrapping up

    In this article I have covered BPF and eBPF from a high-level view. I’m aware there are a lot of resources and plenty of information about eBPF nowadays, but I felt I needed to explain it in my own words. Please check out the list of recommended readings for further information.

    In the next article I will cover XDP and its relation to eBPF.

    Recommended readings:

    January 07, 2019 08:00 AM