Planet Igalia

August 12, 2019

Jacobo Aragunde

Spatial navigation operations for the webview tag

The modern web is a space meant to be browsed with a mouse or a touchscreen, but think of a smart TV, set-top box, game console or budget phone. What happens when your target hardware doesn’t have those devices and provides only directional input instead?

TV, remote, phone

If you are in control of the contents, you can certainly design your UI around directional input, providing custom focus management based on key events and properly highlighting the focused item. But that doesn’t work when you confront external content from the millions of websites out there, not even if you target a small subset of them, like the most popular social networks and search engines.

Virtual cursors are one solution. They certainly address the problem and provide full functionality, but they are somewhat inconvenient: traversing the screen can take too long, and precision can be an issue.

Tab navigation is always available and it certainly works well in sites designed for that, like forms, but it’s quite limited; it can also show unexpected behavior when the DOM tree doesn’t match the visual placement of the contents.

Spatial navigation is a feature designed to address this problem. It will:

  • respond to directional input, taking into account the position of the elements on the screen
  • handle scrolling when we reach the limits of the viewport or of a scrollable element, in a way that content can be easily read (e.g. no abrupt jumps to the next focusable element)
  • provide a special highlight for the focused element.

You can see it in action in this BlinkOn lightning talk by Junho Seo, one of the main contributors. You can also test it yourself with the --enable-spatial-navigation flag in your Chrome/Chromium browser.

A non-trivial algorithm makes sure that input matches user expectations. That video shows a couple of corner cases that demonstrate this isn’t easy.

When creating a web UI for a device that uses cursor navigation, you usually have to deal with the two kinds of content mentioned before: content created specifically for the device, which already takes cursor input into account in its design, and external content from websites that don’t follow those principles. It would be interesting to be able to isolate external content and provide spatial navigation just for it.

The webview extension tag is one way to provide isolation for an external site. To support this use case, we implemented new API operations for the tag to enable or disable spatial navigation inside the webview contents independently from the global settings, and to check its state. They have been available for a while, since Chromium version 71.
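A hedged sketch of how an embedder can use these operations (the helper function names are made up for illustration; the webview methods themselves are the ones the tag’s API exposes):

```javascript
// Enable spatial navigation only inside the guest contents, leaving
// the embedder's own UI untouched.
function enableSpatialNav(webview) {
  webview.setSpatialNavigationEnabled(true);
}

// Read the state back. The value is cached in the browser process,
// so answering this query needs no round trip to the guest renderer.
function logSpatialNavState(webview) {
  webview.isSpatialNavigationEnabled((enabled) => {
    console.log("spatial navigation enabled:", enabled);
  });
}
```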

There are some interesting details in the implementation of this operation. One is that it requires some IPC between different browser processes, because webview contents run in a different renderer (the guest), which must be notified of the changes in spatial navigation status made by its “parent” (the host). There is no need for IPC in the other direction, because we cache the value in the browser process for the implementation of isSpatialNavigationEnabled to use.

Another interesting detail was, precisely, related to that cache: it made our test flaky, because sometimes the calls to isSpatialNavigationEnabled happened before the call to setSpatialNavigationEnabled had completed its IPC communication. My first attempt to fix this added an ACK reply via IPC to confirm the cache value, but it was considered overkill for a situation that would never trigger in a real-world scenario; it was even very hard to reproduce from the test! It was finally fixed with some workarounds in the test code.

A more general solution is proposed for all kinds of iframes in the CSS WG draft of the spatial navigation feature. I hope my previous work makes the implementation for the iframe tag straightforward.

by Jacobo Aragunde Pérez at August 12, 2019 06:00 PM

August 05, 2019

Philippe Normand

Review of the Igalia Multimedia team Activities (2019/H1)

This blog post takes a look back at the various Multimedia-related tasks the Igalia Multimedia team was involved in during the first half of 2019.

GStreamer Editing Services

Thibault added support for OpenTimelineIO, an open format for editorial timeline information. Having many editorial timeline formats supported through OpenTimelineIO reduces vendor lock-in in the tools used by video artists in post-production studios. For instance, a movie project edited in Final Cut Pro can now be easily imported into the Pitivi free and open-source video editor. This accomplishment was made possible by implementing an OpenTimelineIO GES adapter and formatter, upstreamed by Thibault in OpenTimelineIO and GES respectively.

Another important feature for non-linear video editors is nested timeline support. It allows teams to split big editing projects into smaller chunks that can later be assembled into the final product. Another use case is pre-filling the timeline with boilerplate scenes, so that an initial version of the movie can be assembled before all the teams involved in the project have provided their final content. To support this, Thibault implemented a GES demuxer which transparently enables playback support for GES files (through file://path/to/file.xges URIs) in any GStreamer-based media player.
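To try this out (the path below is a placeholder), a GES project can now be opened like any other media file, for example with the gst-play tool that ships with GStreamer:

```shell
# Play a GStreamer Editing Services project directly,
# thanks to the new GES demuxer
gst-play-1.0 file:///path/to/project.xges
```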

As if this wasn’t impressive enough yet, Thibault greatly improved the GES unit tests, fixing a lot of memory leaks and race conditions and generally improving the reliability of the test suite. This is very important because the GitLab continuous integration now executes the test harness for every submitted merge request.

For more information about this, curious readers can dive into Thibault’s blog post. Thibault was also invited to talk about these ongoing efforts at SIGGRAPH, during the OpenTimelineIO BOF.

Finally, Thibault is mentoring Swayamjeet Swain as part of the GSoC program; the project is about adding nested timeline support to Pitivi.

GStreamer VA-API

Víctor performed a good number of code reviews for GStreamer-VAAPI contributors. He has also started investigating GStreamer-VAAPI bugs specific to the AMDGPU Gallium driver, in order to improve the support of AMD hardware in multimedia applications.

As part of the ongoing GStreamer community efforts to improve continuous integration (CI) in the project, we purchased Intel- and AMD-powered devices aimed at running validation tests. Running CI on real end-user hardware will help ensure regressions remain under control.

Servo and GStreamer-rs

As part of our ongoing involvement in Mozilla’s Servo project, Víctor has enabled zero-copy video rendering support in the GStreamer-based servo-media crate, along with bug fixes in Glutin and, finally, in Servo itself.

A prior requirement for this major milestone was to enable gst-gl in the GStreamer Rust bindings; after around 20 patches were merged in the repository, we are pleased to announce that the Linux and Android platforms are supported. The Windows and macOS platforms will also be supported soon.

The following screencast shows how hardware-accelerated video rendering performs on Víctor’s laptop:

WebKit’s Media Source Extensions

As part of our ongoing collaboration with various device manufacturers, Alicia and Enrique validated the YouTube TV MSE 2019 test suite in WPEWebKit-based products.

Alicia developed a new GstValidate plugin to improve GStreamer pipeline testing, called validateflow. Make sure to read her blog post about it!

In her quest to further improve MSE support, especially seeking, Alicia rewrote the GStreamer source element we use in WebKit for MSE playback. The code review is ongoing in Bugzilla and on track for inclusion in WPEWebKit and WebKitGTK 2.26. This new design of the MSE source element requires the playbin3 GStreamer element and at least GStreamer 1.16. Another feature we plan to work on in the near future is multi-track support; stay tuned!

WebKit WebRTC

We announced LibWebRTC support for WPEWebKit and WebKitGTK one year ago. Since then, Thibault has been implementing new features and fixing bugs in the backend. Bridging between the WebAudio and WebRTC backend was implemented, allowing tight integration of WebAudio and WebRTC web apps. More WebKit WebRTC layout tests were unskipped in the buildbots, allowing better tracking of regressions.

Thibault also fixed various performance issues on Raspberry Pi platforms during AppRTC video calls. As part of this effort he upstreamed a GstUVCH264DeviceProvider in GStreamer, allowing applications to use the already-encoded H.264 streams from webcams that provide them, thus removing the need for applications to encode raw video streams.

Additionally, Thibault upstreamed a device provider for ALSA in GStreamer, allowing applications to probe for microphones and speakers supported through the ALSA kernel driver.
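With both device providers in place, applications (or the stock GStreamer tools) can enumerate those devices. For instance, the gst-device-monitor tool shipped with GStreamer can list them by device class:

```shell
# List camera and audio devices exposed by the device providers,
# including UVC H.264 webcams and ALSA microphones/speakers
gst-device-monitor-1.0 Video/Source Audio/Source Audio/Sink
```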

Finally, Thibault tweaked the GStreamer encoders used by the WebKit LibWebRTC backend, in order to match the behavior of the Apple implementation, and also fixed a few performance and rendering issues along the way.

WebKit GStreamer Multimedia maintenance

The whole team is always keeping an eye on WebKit’s Bugzilla, watching out for multimedia-related bugs reported by community members. Charlie recently fixed a few annoying volume bugs, along with an issue related to YouTube.

I rewrote the WebKitWebSrc GStreamer source element we use in WebKit to download HTTP(S) media resources and feed the data to the playback pipeline. The new element is based on the GStreamer pushsrc base class, instead of appsrc. A few seek-related issues were fixed along the way, but unfortunately some regressions also slipped in; those should all be fixed by now and shipped in WPEWebKit/WebKitGTK 2.26.

An initial version of the MediaCapabilities backend was also upstreamed in WebKit. It allows web apps (such as YouTube TV) to probe the user agent for media decoding and encoding capabilities. The GStreamer backend relies on the GStreamer plugin registry to provide accurate information about the supported codecs and containers. H.264 AVC1 profile and level information is also probed.
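A hedged sketch of how a web app can query this API (the helper name and the parameter values are illustrative examples, not taken from the WebKit patches):

```javascript
// Build a configuration dictionary for the Media Capabilities API:
// a "media-source" query describing the video stream to probe.
function buildDecodingQuery(contentType, width, height, bitrate, framerate) {
  return {
    type: "media-source",
    video: { contentType, width, height, bitrate, framerate },
  };
}

// In a browser, a page can then ask the user agent whether playback
// would be supported, smooth and power-efficient.
async function canSmoothlyDecode(query) {
  const info = await navigator.mediaCapabilities.decodingInfo(query);
  return info.supported && info.smooth && info.powerEfficient;
}
```

A 1080p H.264 probe would then look like `canSmoothlyDecode(buildDecodingQuery('video/mp4; codecs="avc1.640028"', 1920, 1080, 8000000, 30))`.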


Well, this isn’t directly related to multimedia, but I finally announced the initial release of WPEQt. Any feedback is welcome; QtWebKit has been phased out, and if anyone out there relies on it for web apps embedded in native QML apps, now is the time to try WPEQt!

by Philippe Normand at August 05, 2019 01:30 PM

August 01, 2019

Julie Kim

Giving a talk at Automotive Linux Summit 2019

Last year, I worked on Chromium and the Web Application Manager on the AGL platform with some fellow Igalian colleagues. Our goal was to support Chromium with the upstream Wayland port and the Web Application Manager. Although we faced a lot of challenges, such as porting them to the AGL framework, handling SMACK in Chromium’s multiprocess architecture, supporting the Wayland IVI protocol in Chromium and so on, we completed our goal and showed a demo at CES 2019.

This year, I gave talks about our work on AGL at several events, such as the AGL All Member Meeting, BlinkOn 10 in Canada and the Automotive Linux Summit 2019.

For the Automotive Linux Summit 2019 in Tokyo last week, my colleague Lorenzo and I attended the event and managed our Igalia booth.

We showed demos of Chromium and the Web Application Manager on AGL on several reference boards, such as the Renesas M3 board, the Intel Minnowboard and the RPi3. I was very pleased to meet many people and share our work with them.

I gave a talk titled “HTML5 apps on the AGL platform with the Web Application Manager”.

As you might know, Igalia has worked on the Chromium Wayland port for several years. We tried various approaches and finally upstreamed our changes last year. Chromium on AGL has been implemented based on that work, with support for the IVI extension.
LG open-sourced the Web Application Manager used in their products, and Igalia implemented a Web Runtime solution for AGL on top of it. My colleague Jacobo published a post about ‘Introducing the Chromium-based web runtime for the AGL platform‘. You can find more information there.

I also participated in the AGL Developer Panel in ALS.

We generally talked about the current status and the plans for the next step. Thanks to Walt Minor, Automotive Grade Linux Development Manager at The Linux Foundation, everything went well and I believe all of us enjoyed the time.

I think AGL is one of the fastest-growing open source projects, and it shows continuous improvement. When I looked around the demo booths, I could see many companies trying various interesting ideas on top of AGL.

Lastly, I’d like to say thanks a lot to everyone who organized this event, came by our Igalia booth, or listened to my talk.

Automotive Grade Linux Logo

by jkim at August 01, 2019 06:31 AM

July 30, 2019

Manuel Rego

Talking about CSS Containment at CSSconf EU 2019

Back in January I wrote a blog post introducing CSS Containment, where I tried to explain the specification and the main motivation and use cases behind it. I’ve been working on the css-contain implementation in Chromium during the last months, as part of the ongoing collaboration between Igalia and Bloomberg. I believe this is not a very well-known topic yet, and I had the chance to speak about it at the impressive CSSconf EU.

First of all, it was a really great surprise to get my talk proposal accepted; Igalia has been attending CSSconf EU regularly, but we had never spoken there before. It’s one of the biggest CSS events, and the setup they prepare (together with JSConf) is overwhelming; I’m very grateful to be part of the speakers lineup. The organization was also very good and really paid attention to every detail.

CSSconf EU 2019 stage

My talk’s title was Improving Website Performance with CSS Containment. During the talk I gave an introduction to the spec, and explained the contain property and the implications of the different types of containment. On top of that, I showed a few examples of how the performance of a website can be improved thanks to css-contain. The video of my talk is published on YouTube, and you can also get the slides from this blog.
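As a hedged illustration of the kind of usage discussed in the talk (the selectors here are made-up examples, not taken from the slides), the contain property lets authors promise the engine that a subtree is independent, so layout and paint work can be skipped outside it:

```css
/* Each card is an independent widget: layout and paint inside it
   cannot affect the rest of the page, so the engine can skip work. */
.news-card {
  contain: layout paint;
}

/* The strongest containment, including size containment: the element
   is laid out as if it were empty, enabling bigger optimizations. */
.offscreen-section {
  contain: strict;
}
```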

After the presentation I got some positive feedback from the audience; it seems that, despite being a somewhat deep topic, people liked it. Thank you very much!

Last but not least, just to close this blog post I’d like to say thanks one more time to Bloomberg. They have been supporting Igalia’s work on CSS Containment implementation, which is helping to improve performance of their applications.

Igalia logo Bloomberg logo
Igalia and Bloomberg working together to build a better web

July 30, 2019 10:00 PM

July 09, 2019

Brian Kardell

Moving Math Forward, Thoughts and Experiments

In this piece, I explain details on thoughts on experiments and conversations around moving Math forward on the Web.

For some context, my team is actively working on implementing MathML in Chromium (the engine that Chrome, Edge, Samsung, Brave, UC Browser and lots of others share), more specifically MathML Core. MathML has long had implementations in Safari and Firefox, and has an interesting history that leaves it in a very unique spot.

MathML Core defines the comparatively small, but well-defined and mostly interoperable, subset of MathML, with an aim of resolving that uniqueness. It provides details about how MathML actually fits into the platform’s layers in well-defined ways, explaining its existing magic in terms of the platform, or exposing that magic where it is new. As such, it indirectly aims to provide a basis for moving many things forward.

A thought experiment

Recently, conversations in the web platform at large have begun about moving HTML forward and polyfilling elements. Since we are actively working on a 'core' ourselves, we thought this would be an interesting discussion and experiment. We asked: "What if MathML Core was largely 'done' and you would import other support to move forward?" Those things could potentially be shipped by a browser, but there's no guarantee they ever would. However, they could also be loaded as polyfills, in standard ways that avoid many of the problems that traditional polyfills create.

To explain this, we took some elements that aren't in MathML Core but are specified elsewhere as an interesting stand-in. One such example is an element called mfenced. mfenced is just a "more convenient way" of writing some math that has a lot of 'fences' (parens, pipes, braces, and so on). The original spec even said: "It is strictly equivalent to this expanded, longer form".
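To make that equivalence concrete, here is a hedged sketch (simplified; the full expansion in the spec also sets attributes on the generated operators):

```xml
<!-- The convenient mfenced form for (x, y) -->
<math>
  <mfenced open="(" close=")" separators=",">
    <mi>x</mi>
    <mi>y</mi>
  </mfenced>
</math>

<!-- The "strictly equivalent" expanded, longer form -->
<math>
  <mrow>
    <mo>(</mo>
    <mi>x</mi>
    <mo>,</mo>
    <mi>y</mi>
    <mo>)</mo>
  </mrow>
</math>
```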

Today, 'polyfills' exist for mfenced and, if your browser doesn't support it, they work by destroying your tree and replacing it with the equivalent version, at some undefined point in time.

We experimentally changed very little in our implementation: it now considers any MathML 'unknown element' to be on the whitelist of elements that can have a Shadow DOM. We used a simple stand-in based on Mutation Observer to allow any of those to be defined without a dash, and wrote a polyfill for mfenced which simply expands it into Shadow DOM.
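As a hedged sketch of the core logic such a polyfill needs (expandMfenced is a made-up helper for illustration, not the code we shipped), the expansion itself is a simple tree rewrite, expressed here over plain {tag, text} objects so the idea is visible outside a browser; a real polyfill would build the equivalent DOM nodes inside the element's ShadowRoot:

```javascript
// Expand the children of an <mfenced> into the equivalent <mrow>
// content: open fence, children interleaved with separators, close
// fence. Defaults follow the MathML spec: open "(", close ")",
// separators ",". An empty separators attribute means no separators;
// the last separator repeats if there are more gaps than separators.
function expandMfenced(children, open = "(", close = ")", separators = ",") {
  const seps = separators.split("").filter((c) => c.trim() !== "");
  const out = [{ tag: "mo", text: open }];
  children.forEach((child, i) => {
    if (i > 0 && seps.length > 0) {
      out.push({ tag: "mo", text: seps[Math.min(i - 1, seps.length - 1)] });
    }
    out.push(child);
  });
  out.push({ tag: "mo", text: close });
  return out;
}
```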

For purposes of discussion, we illustrated how we could, in theory, use MathML Core, import some new vocabulary (potentially a future proposal, or just some slang) and easily expand it into some Shadow DOM, leaving you with the tree that you authored and all of the things just fitting together nicely and predictably. Below is a screenshot from our actual implementation experiment.

screenshot of element and shadow dom in experimental browser build

We think that these sorts of experiments are useful and informative, and help us ask better questions about how we are fitting into the platform itself: Have we answered the necessary questions? How does the introduction of a ShadowRoot impact rendering and namespacing, for example? Could developers really 'plug in' to all the necessary bits? Could they properly layer atop? To test this, we also created elements that expand shorthands like TeX and ASCII Math. In the end, can we ultimately arrive at a great and stable core that allows us to move math forward, in ways similar to Knuth's TeX book/engine, but properly "of the platform"? We think the answer is yes.

While we're not entirely sure where larger conversations about how HTML will move forward will go, and we aren't claiming that this implementation is anything more than a quick experiment for discussion and testing, the MathML community itself is very excited that such talks are even happening in HTML and very interested in being a part of that conversation, and sorting out how we can move forward together as One Platform™, with less 'specialness'.

July 09, 2019 04:00 AM

One Platform™

Let's talk about making SVG and MathML less special.

As I mentioned in Harold Crick and the Web Platform, the HTML specification contains a section on "Other Embedded Content" which specifically mentions SVG and MathML. In that piece I explained their unique histories - how they wound up being "special" and how I felt this leaves them in a kind of unique place in the platform that deserves some similarly "special" attention in order to resolve.

What I mean by this, in part, is that we need to think about how to make them less special, so that we are reasoning about things as One Platform™ as evenly as possible. I'd like to illustrate and talk about two simple examples:

What is this element?

The HTML parser parses these things specially. While this is a kind of advantage we don't expect to be repeated, it actually came at the cost of historically disadvantaging the actual elements we have to work with. They lacked basic/general interfaces that are otherwise common to every other element in HTML. There's really no great reason for this, and we should unweird it: it's not hard. Last year, this work began by providing a mixin that could be used in SVG, and we'd like to do the same for MathML, which lacked any kind of IDL at all.

What does this mean in practice? Well, it means that if you grab a reference to a MathML element, for example, and try to set its .style property, what happens? It blows up. Why? Because if you ask it what its .constructor is, the answer is just Element. But you can style it with CSS just fine. Common things like data-* attributes and tab indexes just don't work, and trying to access any of over 100 common API properties or methods that Just Work anywhere else will throw, or not function. All of this is unnecessarily weird, for no great reason.
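A hedged browser-console sketch of that weirdness (the function name is made up; the exact behavior varies by engine and version):

```javascript
// In an engine without a proper MathML IDL, a MathML element reports
// plain Element as its constructor and lacks HTMLElement conveniences
// such as .style or .dataset, even though CSS still applies to it.
function describeMathMLSupport(doc) {
  const mi = doc.createElementNS("http://www.w3.org/1998/Math/MathML", "mi");
  return {
    constructorName: mi.constructor.name, // just "Element" in the bad case
    hasStyle: "style" in mi,
    hasDataset: "dataset" in mi,
  };
}
```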

This is a really basic thing and it's not at all hard to do technically. It seems unlikely that the parser will change, and everyone agrees that MathML is a reasonable thing to send to a browser (whether they believe that it is necessary that the browser itself renders that or not), so I am not sure why this should be controversial. This is a useful normalization of basic, low level platform things. It feels like everyone should support this.

Generally, let's make them both as unspecial as possible - which leads to the other thing...

How do we move forward?

HTML has had, and continues to have, a whole lot of thought and effort put into defining "how to move forward". Figuring out the underpinnings of both the necessary platform features and a process that enables the Web to really work and flourish is a big effort. A wide range of efforts over the last several years, like Custom Elements, Shadow DOM, Houdini, AOM, modules (and types of modules), ways to share stylesheets, Import Maps and built-in modules, are all in some ways proposing elements of possible solutions to the basic question "How do we move forward?" A lot of this takes a long time, and advancing frequently requires compromises in the interim.

Recently, new conversations have begun attempting to assemble these into a larger possible vision, and to discuss resolving the compromises made along the way, to really answer this question.

However, SVG and MathML currently exist at least partially outside many of those years of conversations and thought. It feels important to at least stop and consider the question: How do these move forward? And, if that doesn't look like the rest of the platform, why not? Figuring out how to align with HTML's "forward plans" seems very valuable.

I don't have concrete answers here, but I'd like to start making these things part of the conversation. For example...

Polyfilling Elements, Custom Elements and Built-ins

One of the more interesting recent conversations is around moving forward and polyfilling elements. It is, of course, very early. This issue and related efforts are intended to open the conversation by offering some larger picture that we can talk about. One interesting exercise, I think, would be to think about how this might relate to SVG and MathML. The topics of Custom Elements and Shadow DOM, for example, are two that have thus far historically been outside of SVG and MathML.

One way of looking at it, offered in the thread, is that we might say that HTML itself is largely (not completely) 'done'. Not standards, just the core of HTML itself. What you could do then is import other support, and that support could also potentially be shipped natively by the browser, or polyfilled.

But, how does this apply to SVG and MathML? Is there real use in asking these questions? We think so.

While we're not entirely sure where larger conversations about how HTML will move forward will go, as part of our efforts to implement MathML Core in Chromium (the engine Chrome, Edge, Samsung, Brave, UC Browser and lots of apps share), we are asking lots of questions about how things fit into the platform more naturally. In this, we've recently done some interesting experimenting with some of these concepts in order to ask potentially useful questions. We believe these are both interesting 'at large', and actually very useful in helpng us make sure we're thinking about all of the right and necessary things.

If you're interested, you can read more on some experiments (with implementation) and discussions around them regarding Shadow DOM, Import Maps and potentially prefix-less custom elements for MathML and how we feel this is both compelling and helpful.

While the MathML Community Group is aware that these questions are far from settled, they were universally excited to see such talks begin in HTML and very interested in making sure that they're thinking about those things too: Being a part of that conversation, and sorting out how we can move forward together as One Platform™, with less 'specialness'.

If you think this is interesting or useful, please share your support and let someone know.

July 09, 2019 04:00 AM

June 30, 2019

Eleni Maria Stea

Depth-aware upsampling experiments (Part 6: A complete approach to upsample the half-resolution render target of a SSAO implementation)

This is the final post of the series where I explain the ideas tried in order to improve the upsampling of the half-resolution SSAO render target of the VKDF sponza demo that was written by Iago Toral. In the previous posts, I performed experiments to explore different upsampling ideas and I explained the logic behind …

by hikiko at June 30, 2019 08:05 AM

Depth-aware upsampling experiments (Part 5: Sample classification tweaks to improve the SSAO upsampling on surfaces)

This is another post of the series where I explain the ideas I try in order to improve the upsampling of the half-resolution SSAO render target of the VKDF sponza demo that was written by Iago Toral. In a previous post (3.2), I had classified the sample neighborhoods in surface neighborhoods and neighborhoods that contain …

by hikiko at June 30, 2019 08:04 AM

June 28, 2019

Michael Catanzaro

On Version Numbers

I’m excited to announce that Epiphany Tech Preview has reached version 3.33.3-33, as computed by git describe. That is 33 commits after 3.33.3:

Epiphany about dialog displaying the version number

I’m afraid 3.33.4 will arrive long before we make it to 3.33.3-333, so this is probably the last cool version number Epiphany will ever have.

I might be guilty of using an empty commit to claim the -33 commit.

I might also apologize for wasting your time with a useless blog post, except this was rather fun. I await the controversy of your choice in the comments.

by Michael Catanzaro at June 28, 2019 04:07 PM

June 26, 2019

Jacobo Aragunde

Introducing the Chromium-based web runtime for the AGL platform

Igalia has been working with AGL (Automotive Grade Linux) to provide a web application runtime to their platform, based on Chromium. We delivered the first phase of this project back in January, and it’s been available since the Flounder 6.0.5 and Guppy 7.0.0 releases. This is a summary of how it came to be, and the next steps into the future.

Igalia stand showing the AGL web runtime

Web applications as first-class citizens

The idea of web applications as first-class citizens is not new in AGL. The early versions of the stack, based on Tizen 3.0, already implemented this idea via the Crosswalk runtime. It provided extended JS APIs for developers to access system services inside a web application runtime.

The abandonment of the Tizen 3.0 effort by its main developers gave AGL the chance to start fresh, redefining its architecture to become what it is today, but the idea of web applications was still there. The current AGL architecture still supports web apps, and the system APIs are available through WebSockets, so both native and web applications can use them. But, until now, there wasn’t a piece of software to run them other than a general-purpose browser.

Leveraging existing projects

One of the strengths of open source is speeding up time-to-market by allowing code reuse. We looked into the webOS OSE platform, developed by LGe with contributions from Igalia. It provides the Web Application Manager (WAM) component, capable of booting up a Chromium-based runtime with a low footprint and managing the application life cycle. By reusing it, we were able to deliver a web application runtime more quickly.

Our contribution back to the webOS OSE project is the use of the new Wayland backend, independently developed by Igalia and available upstream in new Chromium versions. It was backported to the Chromium version used by WebOS OSE, and hopefully it will be part of future releases of the platform based on newer versions of Chromium.

Development process

My colleague Julie made a great presentation of the AGL web runtime in the last AGL All Member Meeting in March. I’m reusing her work for this overview.


The version of Chromium provided by the webOS OSE platform makes use of an updated version of Intel’s Ozone-Wayland backend. That project has been abandoned for a while, and it was never upstreamed due to deep architecture differences. In the last couple of years, Igalia has implemented a new backend independently, using the lessons learned from integrating and maintaining Ozone-Wayland in previous projects, and following the current (and future) architecture of the video pipeline.

Compared process structure of Wayland implementations

As we mentioned before, we backported the new implementation of the Wayland backend to WebOS OSE, and added IVI-shell integration patches on top of it.

AGL life cycle

The WAM launcher process was modified to integrate with the AGL life cycle callbacks and events. In particular, it registers event callbacks for HomeScreen, WindowManager and ILMControl notifications; it activates the WebApp window when it gets Event_TapShortcut; and it manages WebApp states on Event_Active/Event_Inactive. LGe, also a member of AGL, provided the initial work based on the Eel release, which we ported over to Flounder and kept evolving and maintaining.

WAM Launcher process integration diagram

AGL security model

In the AGL security model, access to system services is controlled with SMACK labels. A process with a certain label can only access a subset of the system API. The installation manifest for AGL applications and services defines the relation between labels and services.

Access to system APIs in AGL happens through WebSockets, so from the point of view of the browser it’s just a network request. Since WAM reuses the Chromium process model, the network interaction for any running webapp is actually performed by the browser process. The problem is that there is only one browser process in the system; it runs in the background, and its labels aren’t related to those of the running applications. As a solution, we configure web applications to channel their networking through a proxy, and the WAM launcher creates a proxy process with the proper SMACK label for every webapp.

Proxy process integration diagram

The current implementation is based on the Tinyproxy project, but we still have to review this model and find a more efficient solution.

Next steps

We are removing the Qt dependencies from WAM, replacing them with standard C++ or the Boost library. This work is interesting for AGL, to be able to create a web-only version of the platform, and also for webOS OSE, to make the platform lighter, so we will be contributing it back. In this regard, the final goal is that AGL doesn’t require any patches on top of webOS OSE to be able to use WAM.

Also, in line with creating a web-only AGL, we will provide new demo apps so that the Qt dependency isn’t needed anywhere in the platform.

And, after the AGL face-to-face meeting that Igalia hosted in Coruña, we will be integrating more subsystems available in the platform, and reworking the integration with the security framework to be more robust and efficient. You can follow our progress in the AGL Jira with the label WebAppMgr.

Try it now!

The web application runtime is available in the Flounder (starting in 6.0.5) and Guppy releases, and it will be in Halibut (current master). It can be tested by adding the agl-html5-framework feature to the agl setup script. I have a bunch of test web applications available in the wam-demo-applications repository, but there will be official demos in AGL soon.

Youtube running as an AGL application

This project has been made possible by the Linux Foundation, through a sponsorship from Automotive Grade Linux. Thanks!

AGL logo


by Jacobo Aragunde Pérez at June 26, 2019 05:00 PM

Andy Wingo

fibs, lies, and benchmarks

Friends, consider the recursive Fibonacci function, expressed most lovelily in Haskell:

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

Computing elements of the Fibonacci sequence ("Fibonacci numbers") is a common microbenchmark. Microbenchmarks are like Suzuki exercises for learning the violin: not written to be good tunes (good programs), but rather to help you improve a skill.

The fib microbenchmark teaches language implementors to improve recursive function call performance.

I'm writing this article because after adding native code generation to Guile, I wanted to check how Guile was doing relative to other language implementations. The results are mixed. We can start with the most favorable of the comparisons: Guile present versus Guile of the past.

I collected these numbers on my i7-7500U CPU @ 2.70GHz 2-core laptop, with no particular performance tuning, running each benchmark 10 times, waiting 2 seconds between measurements. The bar value indicates the median elapsed time, and above each bar is an overlayed histogram of all results for that scenario. Note that the y axis is on a log scale. The 2.9.3* version corresponds to unreleased Guile from git.

Good news: Guile has been getting significantly faster over time! Over decades, true, but I'm pleased.

where are we? static edition

How good are Guile's numbers on an absolute level? It's hard to say, because there's no absolute performance oracle out there. However, there are relative performance oracles, so we can compare against some other language implementations.

First up would be the industrial C compilers, GCC and LLVM. We can throw in a few more "static" language implementations as well: compilers that completely translate to machine code ahead-of-time, with no type feedback, and a minimal run-time.

Here we see that GCC is doing best on this benchmark, completing in an impressive 0.304 seconds. It's interesting that the result differs so much from clang. I had a look at the disassembly for GCC and I see:

    push   %r12
    mov    %rdi,%rax
    push   %rbp
    mov    %rdi,%rbp
    push   %rbx
    cmp    $0x1,%rdi
    jle    finish
    mov    %rdi,%rbx
    xor    %r12d,%r12d
    lea    -0x1(%rbx),%rdi
    sub    $0x2,%rbx
    callq  fib
    add    %rax,%r12
    cmp    $0x1,%rbx
    jg     again
    and    $0x1,%ebp
    lea    0x0(%rbp,%r12,1),%rax
    pop    %rbx
    pop    %rbp
    pop    %r12

It's not quite straightforward; what's the loop there for? It turns out that GCC inlines one of the recursive calls to fib. The microbenchmark is no longer measuring call performance, because GCC managed to reduce the number of calls. If I had to guess, I would say this optimization doesn't have a wide applicability and is just to game benchmarks. In that case, well played, GCC, well played.

LLVM's compiler (clang) looks more like what we'd expect:

   push   %r14
   push   %rbx
   push   %rax
   mov    %rdi,%rbx
   cmp    $0x2,%rdi
   jge    recurse
   mov    %rbx,%rax
   add    $0x8,%rsp
   pop    %rbx
   pop    %r14
   lea    -0x1(%rbx),%rdi
   callq  fib
   mov    %rax,%r14
   add    $0xfffffffffffffffe,%rbx
   mov    %rbx,%rdi
   callq  fib
   add    %r14,%rax
   add    $0x8,%rsp
   pop    %rbx
   pop    %r14

I bolded the two recursive calls.

Incidentally, the fib as implemented by GCC and LLVM isn't quite the same program as Guile's version. If the result gets too big, GCC and LLVM will overflow, whereas in Guile we overflow into a bignum. Also in C, it's possible to "smash the stack" if you recurse too much; compilers and run-times attempt to mitigate this danger but it's not completely gone. In Guile you can recurse however much you want. Finally in Guile you can interrupt the process if you like; the compiled code is instrumented with safe-points that can be used to run profiling hooks, debugging, and so on. Needless to say, this is not part of C's mission.

Some of these additional features can be implemented with no significant performance cost (e.g., via guard pages). But it's fair to expect that they have some amount of overhead. More on that later.

The other compilers are OCaml's ocamlopt, coming in with a very respectable result; Go, also doing well; and V8 WebAssembly via Node. As you know, you can compile C to WebAssembly, and then V8 will compile that to machine code. In practice it's just as static as any other compiler, but the generated assembly is a bit more involved:

    jmp    fib

    push   %rbp
    mov    %rsp,%rbp
    pushq  $0xa
    push   %rsi
    sub    $0x10,%rsp
    mov    %rsi,%rbx
    mov    0x2f(%rbx),%rdx
    mov    %rax,-0x18(%rbp)
    cmp    %rsp,(%rdx)
    jae    stack_check
    cmp    $0x2,%eax
    jl     return_n
    lea    -0x2(%rax),%edx
    mov    %rbx,%rsi
    mov    %rax,%r10
    mov    %rdx,%rax
    mov    %r10,%rdx
    callq  fib_tramp
    mov    -0x18(%rbp),%rbx
    sub    $0x1,%ebx
    mov    %rax,-0x20(%rbp)
    mov    -0x10(%rbp),%rsi
    mov    %rax,%r10
    mov    %rbx,%rax
    mov    %r10,%rbx
    callq  fib_tramp
    mov    -0x20(%rbp),%rbx
    add    %ebx,%eax
    mov    %rbp,%rsp
    pop    %rbp
    jmp    return
    callq  WasmStackGuard
    mov    -0x10(%rbp),%rbx
    mov    -0x18(%rbp),%rax
    jmp    post_stack_check

Apparently fib compiles to a function of two arguments, the first passed in rsi, and the second in rax. (V8 uses a custom calling convention for its compiled WebAssembly.) The first synthesized argument is a handle onto run-time data structures for the current thread or isolate, and in the function prelude there's a check to see that the function has enough stack. V8 uses these stack checks also to handle interrupts, for when a web page is stuck in JavaScript.

Otherwise, it's a more or less normal function, with a bit more register/stack traffic than would be strictly needed, but pretty good.

do optimizations matter?

You've heard of Moore's Law -- though it doesn't apply any more, it roughly translated into hardware doubling in speed every 18 months. (Yes, I know it wasn't precisely that.) There is a corresponding rule of thumb for compiler land, Proebsting's Law: compiler optimizations make software twice as fast every 18 years. Zow!

The previous results with GCC and LLVM were with optimizations enabled (-O3). One way to measure Proebsting's Law would be to compare the results with -O0. Obviously in this case the program is small and we aren't expecting much work out of the optimizer, but it's interesting to see anyway:

Answer: optimizations don't matter much for this benchmark. This investigation does give a good baseline for compilers from high-level languages, like Guile: in the absence of clever trickery like the recursive inlining thing GCC does, and in the absence of industrial-strength instruction selection, what's a good baseline target for a compiler? Here we see for this benchmark that it's somewhere between 420 and 620 milliseconds or so. Go gets there, and OCaml does even better.

how is time being spent, anyway?

Might we expect V8/WebAssembly to get there soon enough, or is the stack check that costly? How much time does one stack check take anyway? For that we'd have to determine the number of recursive calls for a given invocation.

Friends, it's not entirely clear to me why this is, but I instrumented a copy of fib, and I found that the number of calls in fib(n) was a more or less constant factor of the result of calling fib. That ratio converges to twice the golden ratio, which means that since fib(n+1) ~= φ * fib(n), then the number of calls in fib(n) is approximately 2 * fib(n+1). I scratched my head for a bit as to why this is and I gave up; the Lord works in mysterious ways.

Anyway for fib(40), that means that there are around 3.31e8 calls, absent GCC shenanigans. So that would indicate that each call for clang takes around 1.27 ns, which at turbo-boost speeds on this machine is 4.44 cycles. At maximum throughput (4 IPC), that would indicate 17.8 instructions per call, and indeed on the n > 2 path I count 17 instructions.

For WebAssembly I calculate 2.25 nanoseconds per call, or 7.9 cycles, or 31.5 (fused) instructions at max IPC. And indeed, counting the extra jumps in the trampoline, I get 33 instructions on the recursive path. I count 4 instructions for the stack check itself, one to save the current isolate, and two to shuffle the current isolate into place for the recursive calls. But, compared to clang, V8 puts 6 words on the stack per call, as opposed to only 4 for LLVM. I think with better interprocedural register allocation for the isolate (i.e. reserving a register for it), V8 could get a nice boost for call-heavy workloads.

where are we? dynamic edition

Guile doesn't aim to replace C; it's different. It has garbage collection, an integrated debugger, and a compiler that's available at run-time, and it is dynamically typed. It's perhaps more fair to compare to languages that have some of these characteristics, so I ran these tests on versions of recursive fib written in a number of languages. Note that all of the numbers in this post include start-up time.

Here, the ocamlc line is the same as before, but using the bytecode compiler instead of the native compiler. It's a bit of an odd thing to include but it performs so well I just had to include it.

I think the real takeaway here is that Chez Scheme has fantastic performance. I have not been able to see the disassembly -- does it do the trick like GCC does? -- but the numbers are great, and I can see why Racket decided to rebase its implementation on top of it.

Interestingly, as far as I understand, Chez implements stack checks in the straightforward way (an inline test-and-branch), not with a guard page, and instead of using the stack check as a generic ability to interrupt a computation in a timely manner as V8 does, Chez emits a separate interrupt check. I would like to be able to see Chez's disassembly but haven't gotten around to figuring out how yet.

Since I originally published this article, I added a LuaJIT entry as well. As you can see, LuaJIT performs as well as Chez in this benchmark.

Haskell's call performance is surprisingly bad here, beaten even by OCaml's bytecode compiler; is this the cost of laziness, or just a lacuna of the implementation? I do not know. I do have this mental image that Haskell's compiler is good, but apparently if that's the standard, so is Guile :)

Finally, in this comparison section, I was not surprised by cpython's relatively poor performance; we know cpython is not fast. I think though that it just goes to show how little these microbenchmarks are worth when it comes to user experience; like many of you I use plenty of Python programs in my daily work and don't find them slow at all. Think of micro-benchmarks like x-ray diffraction; they can reveal the hidden substructure of DNA but they say nothing at all about the organism.

where to now?

Perhaps you noted that in the last graph, the Guile and Chez lines were labelled "(lexical)". That's because instead of running this program:

(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

They were running this, instead:

(define (fib n)
  (define (fib* n)
    (if (< n 2)
        n
        (+ (fib* (- n 1)) (fib* (- n 2)))))
  (fib* n))

The thing is, historically, Scheme programs have treated top-level definitions as being mutable. This is because you don't know the extent of the top-level scope -- there could always be someone else who comes and adds a new definition of fib, effectively mutating the existing definition in place.

This practice has its uses. It's useful to be able to go in to a long-running system and change a definition to fix a bug or add a feature. It's also a useful way of developing programs, to incrementally build the program bit by bit.

But, I would say that as someone who has written and maintained a lot of Scheme code, it's not a normal occurrence to mutate a top-level binding on purpose, and it has a significant performance impact. If the compiler knows the target of a call, that unlocks a number of important optimizations: type check elision on the callee, more optimal closure representation, smaller stack frames, possible contification (turning calls into jumps), argument and return value count elision, representation specialization, and so on.

This overhead is especially egregious for calls inside modules. Scheme-the-language only gained modules relatively recently -- relative to the history of scheme -- and one of the aspects of modules is precisely to allow reasoning about top-level module-level bindings. This is why running Chez Scheme with the --program option is generally faster than --script (which I used for all of these tests): it opts in to the "newer" specification of what a top-level binding is.

In Guile we would probably like to move towards a more static way of treating top-level bindings, at least those within a single compilation unit. But we haven't done so yet. It's probably the most important single optimization we can make over the near term, though.

As an aside, it seems that LuaJIT also shows a similar performance differential for local function fib(n) versus just plain function fib(n).

It's true though that even absent lexical optimizations, top-level calls can be made more efficient in Guile. I am not sure if we can reach Chez with the current setup of having a template JIT, because we need two return addresses: one virtual (for bytecode) and one "native" (for JIT code). Register allocation is also something to improve but it turns out to not be so important for fib, as there are few live values and they need to spill for the recursive call. But, we can avoid some of the indirection on the call, probably using an inline cache associated with the callee; Chez has had this optimization since 1984!

what guile learned from fib

This exercise has been useful to speed up Guile's procedure calls, as you can see for the difference between the latest Guile 2.9.2 release and what hasn't been released yet (2.9.3).

To decide what improvements to make, I extracted the assembly that Guile generated for fib to a standalone file, and tweaked it in a number of ways to determine what the potential impact of different scenarios was. Some of the detritus from this investigation is here.

There were three big performance improvements. One was to avoid eagerly initializing the slots in a function's stack frame; this took a surprising amount of run-time. Fortunately the rest of the toolchain like the local variable inspector was already ready for this change.

Another thing that became clear from this investigation was that our stack frames were too large; there was too much memory traffic. I was able to improve this in the lexical-call case by adding an optimization to elide useless closure bindings. Usually in Guile when you call a procedure, you pass the callee as the 0th parameter, then the arguments. This is so the procedure has access to its closure. For some "well-known" procedures -- procedures whose callers can be enumerated -- we optimize to pass a specialized representation of the closure instead ("closure optimization"). But for well-known procedures with no free variables, there's no closure, so we were just passing a throwaway value (#f). An unhappy combination of Guile's current calling convention being stack-based and a strange outcome from the slot allocator meant that frames were a couple words too big. Changing to allow a custom calling convention in this case sped up fib considerably.

Finally, and also significantly, Guile's JIT code generation used to manually handle calls and returns via manual stack management and indirect jumps, instead of using the platform calling convention and the C stack. This is to allow unlimited stack growth. However, it turns out that the indirect jumps at return sites were stalling the pipeline. Instead we switched to use call/return but keep our manual stack management; this allows the CPU to use its return address stack to predict return targets, speeding up code.

et voilà

Well, long article! Thanks for reading. There's more to do but I need to hit the publish button and pop this off my stack. Until next time, happy hacking!

by Andy Wingo at June 26, 2019 10:34 AM

June 22, 2019

Eleni Maria Stea

Depth-aware upsampling experiments (Part 4: Improving the nearest depth where we detect discontinuities)

This is another post of the series where I explain some ideas I tried in order to improve the upscaling of the half-resolution SSAO render target of the VKDF sponza demo that was written by Iago Toral. In the previous post, I had classified the sample neighborhoods in surface neighborhoods and neighborhoods that contain depth … Continue reading Depth-aware upsampling experiments (Part 4: Improving the nearest depth where we detect discontinuities)

by hikiko at June 22, 2019 09:44 AM

June 18, 2019

Manuel Rego

Speaking at CSS Day 2019

This year I got invited to speak at CSS Day, this is an amazing event that every year brings to Amsterdam great speakers from all around the globe to talk about cutting edge topics related to CSS.

Pictures of speakers and MC at CSS Day 2019 Speakers and MC at CSS Day 2019

The conference happened in the beautiful Compagnietheater. Kudos to the organization as they were really kind and supportive during the whole event. Thanks for giving me the opportunity to speak there.

Compagnietheater in Amsterdam Compagnietheater in Amsterdam

For this event I prepared a totally new talk, focused on explaining what it takes to implement something like CSS Grid Layout in a browser engine. I took an idea from Jen Simmons and implemented a new property grid-skip-areas during the presentation, this was useful to explain the different things that happen during the whole process. Video of the talk is available on YouTube, and the slides are available if you are interested too; however, note that some of them won’t work on your browser unless you built the linked patches.

The feedback after the talk was really good, everyone seemed to like it (despite the fact that I showed lots of slides with C++ code) and find it useful to understand what’s going on behind the scenes. Thank you very much for your kind words! 😊

Somehow with this talk I was explaining the kind of things we do at Igalia and how we can help to fill the gaps left by browser vendors in the evolution of the Web Platform. Igalia has now a solid position inside the Web community, which makes us an excellent partner in order to improve the platform aligned with your specific needs. If you want to know more don’t hesitate to read our website or contact us directly.

I hope you enjoy it! 😀

June 18, 2019 10:00 PM

Samuel Iglesias

My latest VK-GL-CTS contributions

Even if you are not a gamer, odds are that you already heard about Vulkan graphics and compute API that provides high-efficency, cross-platform access to modern GPUs. This API is designed by the Khronos Group and it is supported by a new set of drivers specifically designed to implement the different functions and features defined by the spec (at the time of writing this post, it is version 1.1).


In order to guarantee that the drivers work according to the spec, they need to pass a conformance test suite that ensures they do what is expected from them. VK-GL-CTS is the name of the conformance test suite used to certify conformance of both the Vulkan and OpenGL APIs and… it is open-source!


As part of my daily job at Igalia, I contribute to VK-GL-CTS by fixing bugs, improving existing tests, and even writing new tests for a variety of extensions. In this post I am going to describe some of the work I have been doing in the last few months.


VK_EXT_host_query_reset

This extension gives you the opportunity to reset queries outside a command buffer, which is a fast way of doing it once your application has finished reading a query’s data. All you need is to call the vkResetQueryPoolEXT() function. Several Vulkan drivers on GNU/Linux already support this extension (NVIDIA, and the open-source drivers AMDVLK, RADV and ANV), and probably more on other platforms.

I have implemented tests for all the different queries: occlusion queries, pipeline timestamp queries and statistics queries. Transform feedback stream queries tests landed a bit later.


VK_EXT_discard_rectangles provides a way to define rectangles in framebuffer-space coordinates that discard rasterization of all points, lines and triangles that fall inside (exclusive mode) or outside (inclusive mode) of their area. You can regard this feature as something similar to scissor testing but it operates orthogonally to the existing scissor test functionality.

It is easier to understand with an example. Imagine that you want to do the following in your application: clear the color attachment to red, then draw a green quad covering the whole attachment, but define a discard rectangle that restricts the rasterization of the quad to the rectangle’s area.

For that, you define the discard rectangles at pipeline creation time for example (it is possible to define them dynamically too); as we want to restrict the rasterization of the quad to the area defined by the discard rectangle, then we set its mode to VK_DISCARD_RECTANGLE_MODE_INCLUSIVE_EXT.

VK_EXT_discard_rectangles inclusive mode example

If we want to discard the rasterization of the green quad inside the area defined by the discard rectangle, then we set VK_DISCARD_RECTANGLE_MODE_EXCLUSIVE_EXT mode at pipeline creation time and that’s all. Here you have the output for this case:

VK_EXT_discard_rectangles exclusive mode example

You are not limited to defining just one discard rectangle: drivers supporting this extension should support a minimum of 4 discard rectangles, but some drivers may support more. As this feature works orthogonally to other tests like the scissor test, you can do fancy things in your app :-)

The tests I developed for VK_EXT_discard_rectangles extension are already available in VK-GL-CTS repo. If you want to test them on an open-source driver, right now only RADV has implemented this extension.


VK_EXT_pipeline_creation_feedback is another example of a useful feature for application developers, especially game developers. This extension gives a way to know, at pipeline creation time, whether the pipeline hit the provided pipeline cache, the time consumed to create it, or even which shader stages hit the cache. This feature gives feedback about pipeline creation that can help to improve the pipeline caches that are shipped to users, with the final goal of reducing load times.

Tests for VK_EXT_pipeline_creation_feedback extension have made their way into VK-GL-CTS repo. Good news for the ones using open-source drivers: both RADV and ANV have implemented the support for this extension!


Since I started working in the Graphics team at Igalia, I have been contributing code to Mesa drivers for both OpenGL and Vulkan, adding new tests to Piglit, and improving VkRunner, among other contributions.

Now I am contributing to increase VK-GL-CTS coverage by developing new tests for extensions and fixing existing tests, among other things. This work also involves developing patches for the Vulkan Validation Layers, fixes for glslang, and more things to come. In summary, I am enjoying a lot doing contributions to the open-source ecosystem created by the Khronos Group as part of my daily work!

Note: if you are a student and you want to start contributing to open-source projects, don’t miss our Igalia Coding Experience program (more info on our website).


June 18, 2019 06:45 AM

June 14, 2019

Michael Catanzaro

An OpenJPEG Surprise

My previous blog post seems to have resolved most concerns about my requests for Ubuntu stable release updates, but I again received rather a lot of criticism for the choice to make WebKit depend on OpenJPEG, even though my previous post explained clearly why there are not any good alternatives.

I was surprised to receive a pointer to ffmpeg, which has its own JPEG 2000 decoder that I did not know about. However, we can immediately dismiss this option due to legal problems with depending on ffmpeg. I also received a pointer to a resurrected libjasper, which is interesting, but since libjasper was removed from Ubuntu, its status is not currently any better than OpenJPEG’s.

But there is some good news! I have looked through Ubuntu’s security review of the OpenJPEG code and found some surprising results. Half the reported issues affect the library’s companion tools, not the library itself. And the other half of the issues affect the libmj2 library, a component of OpenJPEG that is not built by Ubuntu and not used by WebKit. So while these are real security issues that raise concerns about the quality of the OpenJPEG codebase, none of them actually affect OpenJPEG as used by WebKit. Yay!

The remaining concern is that huge input sizes might cause problems within the library that we don’t yet know about. We don’t know because OpenJPEG’s fuzzer discards huge images instead of testing them. Ubuntu’s security team thinks there’s a good chance that fixing the fuzzer could uncover currently-unknown multiplication overflow issues, for instance, a class of vulnerability that OpenJPEG has clearly had trouble with in the past. It would be good to see improvement on this front. I don’t think this qualifies as a security vulnerability, but it is certainly a security problem that would facilitate discovering currently-unknown vulnerabilities if fixed.

Still, on the whole, the situation is not anywhere near as bad as I’d thought. Let’s hope OpenJPEG can be included in Ubuntu main sooner rather than later!

by Michael Catanzaro at June 14, 2019 02:43 PM

June 10, 2019

Javier Fernández

A new terminal-style line breaking with CSS Text

The CSS Text 3 specification defines a module for text manipulation and covers, among a few other features, the line breaking behavior of the browser, including white space handling. I’ve been working lately on some new features and bug fixes for this specification, and I’d like to introduce in this post the latest one we made available to Web Platform users. This is yet another contribution that came out of the collaboration between Igalia and Bloomberg, which has been held for several years now and has produced many important new features for the Web, like CSS Grid Layout.

The feature

I guess everybody knows the white-space CSS property, which allows web authors to control two main aspects of the rendering of a text line: collapsing and wrapping. A new value break-spaces has been added to the ones available for this property, which allows web authors to emulate a terminal-like line breaking behavior. This new value operates basically like pre-wrap, but with two key differences:

  • any sequence of preserved white space characters takes up space, even at the end of the line.
  • a preserved white space sequence can be wrapped at any character, moving the rest of the sequence, intact, to the line below.

What does this new behavior actually mean? I’ll try to explain it with a few examples. Let’s start with a simple but quite illustrative demo which tries to emulate a meteorology monitoring system that shows relevant changes over time, where the gaps between subsequent changes must be preserved:

#terminal {
  font: 20px/1 monospace;
  width: 340px;
  height: 5ch;
  background: black;
  color: green;
  overflow: hidden;
  white-space: break-spaces;
  word-break: break-all;
}


Another interesting use case for this feature could be a logging system which should preserve the text formatting of the logged information, considering different window sizes. The following demo tries to describe such a scenario:

body { width: 1300px; }
#logging {
  font: 20px/1 monospace;
  background: black;
  color: green;
  animation: resize 7s infinite alternate;
  white-space: break-spaces;
  word-break: break-all;
}
@keyframes resize {
  0% { width: 25%; }
  100% { width: 100%; }
}

Hash: 5a2a3d23f88174970ed8
Version: webpack 3.12.0
Time: 22209ms
Asset Size Chunks Chunk Names
pages/widgets/index.51838abe9967a9e0b5ff.js 1.17 kB 10 [emitted] pages/widgets/index
img/icomoon.7f1da5a.svg 5.38 kB [emitted]
fonts/icomoon.2d429d6.ttf 2.41 kB [emitted]
img/fontawesome-webfont.912ec66.svg 444 kB [emitted] [big]
fonts/fontawesome-webfont.b06871f.ttf 166 kB [emitted]
img/mobile.8891a7c.png 39.6 kB [emitted]
img/play_button.6b15900.png 14.8 kB [emitted]
img/keyword-back.f95e10a.jpg 43.4 kB [emitted]
. . .

Use cases

In the demos shown before there are several cases that I think are worth analyzing in detail.

A breaking opportunity exists after any white space character

The main purpose of this feature is to preserve the white space sequences length even when it has to be wrapped into multiple lines. The following example tries to describe this basic use case:

.container {
  font: 20px/1 monospace;
  width: 5ch;
  white-space: break-spaces;
  border: 1px solid;
}


The example above shows how the white space sequence with a length of 15 characters is preserved and wrapped along 3 different lines.

Single leading white space

Before the addition of the break-spaces value this scenario was only possible at the beginning of the line. In any other case, the trailing white spaces were either collapsed or hung, hence the next line couldn’t start with a sequence of white spaces. Let’s consider the following example:

.container {
  font: 20px/1 monospace;
  width: 3ch;
  white-space: break-spaces;
  border: 1px solid;
}


Like when using pre-wrap, the single leading space is preserved. Since break-spaces allows breaking opportunities after any white space character, we break after the first leading white space (” |XX XX”). The second line can be broken after the first preserved white space, creating another leading white space in the next line (” |XX | XX”).

However, let’s now consider a case without such a single leading white space.

.container {
  font: 20px/1 monospace;
  width: 3ch;
  white-space: break-spaces;
  border: 1px solid;
}


Again, it’s not allowed to break before the first space, but in this case there isn’t any previous breaking opportunity, so the first space after the word XX should overflow (“XXX | XX”); the next white space character will be moved down to the next line as a preserved leading space.

Breaking before the first white space

I mentioned before that the spec states clearly that the break-spaces feature allows breaking opportunities only after white space characters. However, it’d be possible to break the line just before the first white space character after a word if the feature is used in combination with other line breaking CSS properties, like word-break or overflow-wrap (and other properties too).

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  border: 1px solid;
}


The two white spaces between the words are preserved due to the break-spaces feature, but the first space after the XXXX word would overflow. Hence, the overflow-wrap: break-word feature is applied to prevent the line from overflowing and to introduce an additional breaking opportunity just before the first space after the word. This behavior causes the trailing spaces to be moved down as a leading white space sequence in the next line.

We would get the same rendering if word-break: break-all were used instead of overflow-wrap (or even in combination with it), but this is actually an incorrect behavior, which has corresponding bug reports in WebKit (197277) and Blink (952254), according to the discussion in the CSS WG (see issue #3701).

Consider previous breaking opportunities

In the previous example I described a combination of line breaking features that would allow breaking before the first space after a word. However, this should be avoided if there are previous breaking opportunities. The following example is one of the possible scenarios where this may happen:

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  border: 1px solid;
}


In this case, we could break after the second word (“XX X| X”), since overflow-wrap: break-word would allow that in order to keep the following white space from overflowing the line. However, white-space: break-spaces only allows breaking opportunities after a space character; hence, we shouldn’t break there if there are valid earlier opportunities, like, in this case, the space after the first word (“XX |X X”).

This preference for previous breaking opportunities over breaking the word, honoring the overflow-wrap property, is also part of the behavior defined for the white-space: pre-wrap feature; although in that case there is no need to deal with the issue of breaking before the first space after a word, since trailing spaces will just hang. The following example uses just pre-wrap to show how previous opportunities are selected to avoid overflowing or breaking a word (unless explicitly requested with the word-break property).

.container {
  font: 20px/1 monospace;
  width: 2ch;
  white-space: pre-wrap;
  border: 1px solid;
}


When break-all comes into play, it enables breaking opportunities that are not available otherwise (we can break a word at any letter), which can be used to prevent the line from overflowing; hence, the overflow-wrap property doesn’t take any effect. The existence of previous opportunities is not considered now, since break-all mandates producing the longest line possible.

This new white-space: break-spaces feature implies a different behavior when used in combination with break-all. Even though word-break: break-all normally overrides the preference for previous opportunities, this may not be the case for the scenario of breaking before the first space after a word. Let’s consider the same example, but now using the word-break: break-all feature:

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  word-break: break-all;
  border: 1px solid;
}


The example above shows that using word-break: break-all doesn’t produce any effect. It’s debatable whether the use of break-all should force the selection of the breaking opportunity that produces the longest line, as happened in the pre-wrap case described before. However, the spec states clearly that break-spaces should only allow breaking opportunities after white space characters. Hence, I considered that breaking before the first space should only happen if there is no other choice.

As a matter of fact, when break-all is specified we shouldn’t consider only previous white spaces in order to avoid breaking before the first white space after a word; the break-all feature indeed creates additional breaking opportunities, since it allows breaking the word at any character. Since break-all is intended to produce the longest line possible, such a new breaking opportunity should be chosen over any previous white space. See the following test case for a clearer idea of this scenario:

.container {
  font: 20px/1 monospace;
  width: 4ch;
  white-space: break-spaces;
  overflow-wrap: break-word;
  word-break: break-all;
  border: 1px solid;
}


Bear in mind that the expected rendering in the above example may not be obtained if your browser’s version is still affected by bugs 197277 (Safari/WebKit) and 952254 (Chrome/Blink). In that case, the word is broken despite the opportunity at the previous white space, and breaking after the ‘XX’ word, just before the white space, is avoided as well.

There is an exception to the rule of avoiding breaking before the first white space after a word when there are previous opportunities, and it’s precisely the behavior the line-break: anywhere feature would provide. As I said, all these assumptions were not, in my opinion, clearly defined in the current spec, so I filed an issue for the CSS WG so that we can clarify when it’s allowed to break before the first space.

Current status and support

The intent-to-ship request for Chrome has been approved recently, so I’m confident the feature will be enabled by default in Chrome 76. However, it’s possible to try the feature in older versions by enabling the Experimental Web Platform Features flag. More details in the corresponding Chrome Status entry. I want to highlight that I also implemented the feature for LayoutNG, the new layout engine that Chrome will eventually ship; this achievement is very important to ensure the stability of the feature in future versions of Chrome.

In the case of Safari, the patch with the implementation of the feature landed in WebKit’s trunk in r244036, but since Apple doesn’t announce publicly when a new release of Safari will happen or which features it’ll ship, it’s hard to guess when the break-spaces feature will be available to web authors using that browser. Meanwhile, it’s possible to try the feature in Safari Technology Preview 80.

Finally, while I haven’t seen any signs of active development in Firefox, some of the Mozilla developers working on this area of the Gecko engine have shown public support for the feature.

The following table summarizes the support of the break-spaces feature in the 3 main browsers:

              Chrome   Safari    Firefox
Experimental  M73      STP 80    Public support
Ship          M76      Unknown   Unknown

Web Platform Tests

At Igalia we believe that the Web Platform Tests project is a key piece to ensure the compatibility and interoperability of any development on the Web Platform. That’s why a substantial part of my work to implement this relatively small feature was the definition of enough tests to cover the new functionality and basic use cases of the feature.

  • white-space
  • overflow-wrap
  • word-break

Implementation in several web engines

During the implementation of a browser feature, even a small one like this, it’s quite usual to find bugs and interoperability issues. Even though this may slow down the implementation of the feature, it’s also a source of additional Web Platform tests and may contribute to the robustness of the feature itself and the related CSS properties and values. That’s why I decided to implement the feature in parallel for the WebKit (Safari) and Blink (Chrome) engines, which I think helped to ensure interoperability and code maturity. This approach also helped me to get a deeper understanding of the line breaking logic and its design and implementation in different web engines.

I think it’s worth mentioning some of these code architectural differences, to get a better understanding of the work and challenges this feature required until it reached web author’s browser.

Chrome/Blink engine

Let’s start with Chrome/Blink, which was especially challenging due to the fact that Blink is implementing a new layout engine (LayoutNG). The implementation for the legacy layout engine was the first step, since it ensures the feature will arrive earlier, even behind an experimental runtime flag.

The legacy layout relies on the BreakingContext class to implement the line breaking logic for the inline layout operations. Its main characteristic is that it handles the white space breaking opportunities on its own, instead of using the TextBreakIterator (based on the ICU libraries) as it does for determining breaking opportunities between letters and/or symbols. This design implies too much complexity for even small changes like this, especially because it is very sensitive in terms of performance impact. In the following diagram I try to show a simplified view of the classes involved and the interactions implemented by this line breaking logic.

The LayoutNG line breaking logic is based on a new concept of fragments, mainly handled by the NGLineBreaker class. This new design simplifies the line breaking logic considerably; it’s highly optimized and adapted to get the most out of the TextBreakIterator classes and the ICU features. I tried to show a simplified view of this new design with the following diagram:

In order to describe the work done to implement the feature for this web engine, I’ll list the main bugs and patches landed during this time: CR#956465, CR#952254, CR#944063, CR#900727, CR#767634, CR#922437

Safari/WebKit engine

Although it becomes less true as time passes, WebKit and Blink still share some of the layout logic from the ages prior to the fork. Blink engineers have applied important changes to the inline layout logic, both code refactoring and optimizations, yet there are common design patterns that made it relatively easy to port to WebKit the patches that implemented the feature in Blink’s legacy layout. In WebKit, the line breaking logic is also implemented by the BreakingContext class with a similar architecture, as described, in a quite simplified way, in the class diagram above (though it uses different class names for the render/layout objects).

However, for the Mac and iOS platforms Safari supports a different code path for the line breaking logic, implemented in the SimpleLineLayout class. This class provides a different design for the line breaking logic and, similar to what Blink implements in LayoutNG, is based on a concept of text fragments. It also relies as much as possible on the TextBreakIterator, instead of implementing complex rules to handle white spaces and breaking opportunities. The following diagrams show this alternate design to implement the line breaking process.

This SimpleLineLayout code path is not supported by other WebKit ports (like WebKitGTK+ or WPE) and it’s not available either when using some CSS Text features or specific fonts. There are other limitations to using this SimpleLineLayout code path, which may lead to rendering the text using the BreakingContext class instead.

Again, this is the list of bugs that were solved to implement the feature for the WebKit engine: WK#197277, WK#196169, WK#196353, WK#195361, WK#177327, WK#197278


I hope that at this point these 2 facts are clear:

  • The white-space: break-spaces feature is a very simple but powerful feature that provides a new line breaking behavior, inspired by the behavior of Unix terminals.
  • Although it’s a simple feature on paper (in the spec), it implies a considerable amount of work before it reaches the browser and becomes available to web authors.

In this post I tried to explain in a simple way the main purpose of this new feature, and also some interesting corner cases and combinations with other line breaking features. The demos I used showed 2 different use cases of this feature, but there are many more. I’m sure the creativity of web authors will push the feature to its limits; by then, I’ll be happy to answer doubts about the spec or the implementation in the web engines, and of course fix the bugs that may appear once the feature is more widely used.

Igalia logo
Bloomberg logo

Igalia and Bloomberg working together to build a better web

Finally, I want to thank Bloomberg for supporting the work to implement this feature. It’s another example of how non-browser vendors can influence the Web Platform and contribute actual features that will eventually be available to web authors. This is the kind of vision that we need if we want to keep a healthy, open and independent Web Platform.

by jfernandez at June 10, 2019 08:11 PM

June 06, 2019

Eleni Maria Stea

Depth-aware upsampling experiments (Part 3.2: Improving the upsampling using normals to classify the samples)

This post is again about improving the upsampling of the half-resolution SSAO render target used in the VKDF sponza demo that was written by Iago Toral. I am going to explain how I used information from the normals to understand if the samples of each 2×2 neighborhood we check during the upsampling belong to the … Continue reading Depth-aware upsampling experiments (Part 3.2: Improving the upsampling using normals to classify the samples)

by hikiko at June 06, 2019 08:15 PM

June 05, 2019

Eleni Maria Stea

Depth-aware upsampling experiments (Part 3.1: Improving the upsampling using depths to classify the samples)

In my previous posts of these series I analyzed the basic idea behind the depth-aware upsampling techniques. In the first post [1], I implemented the nearest depth sampling algorithm [3] from NVIDIA and in the second one [2], I compared some methods that are improving the quality of the z-buffer downsampled data that I use … Continue reading Depth-aware upsampling experiments (Part 3.1: Improving the upsampling using depths to classify the samples)

by hikiko at June 05, 2019 07:41 PM

Depth-aware upsampling experiments (Part 2: Improving the Z-buffer downsampling)

In the previous post of these series, I tried to explain the nearest depth algorithm [1] that I used to improve Iago Toral‘s SSAO upscaling in the sponza demo of VKDF. Although the nearest depth was improving the ambient occlusion in higher resolutions the results were not very good, so I decided to try more … Continue reading Depth-aware upsampling experiments (Part 2: Improving the Z-buffer downsampling)

by hikiko at June 05, 2019 01:37 PM

June 04, 2019

Eleni Maria Stea

Some additions to vkrunner

A new option has been added to Vkrunner (the Vulkan shader testing tool written by Neil Roberts) to allow selecting the Vulkan device for each shader test. When the device id is not set, the default GPU is used. IDs start from 1 to match the convention of the VK-GL-CTS … Continue reading Some additions to vkrunner

by hikiko at June 04, 2019 12:30 PM

June 03, 2019

Andy Wingo

pictie, my c++-to-webassembly workbench

Hello, interwebs! Today I'd like to share a little skunkworks project with y'all: Pictie, a workbench for WebAssembly C++ integration on the web.

JavaScript disabled, no pictie demo. See the pictie web page for more information.

wtf just happened????!?

So! If everything went well, above you have some colors and a prompt that accepts Javascript expressions to evaluate. If the result of evaluating a JS expression is a painter, we paint it onto a canvas.

But allow me to back up a bit. These days everyone is talking about WebAssembly, and I think with good reason: just as many of the world's programs run on JavaScript today, tomorrow much of it will also be in languages compiled to WebAssembly. JavaScript isn't going anywhere, of course; it's around for the long term. It's the "also" aspect of WebAssembly that's interesting, that it appears to be a computing substrate that is compatible with JS and which can extend the range of the kinds of programs that can be written for the web.

And yet, it's early days. What are programs of the future going to look like? What elements of the web platform will be needed when we have systems composed of WebAssembly components combined with JavaScript components, combined with the browser? Is it all going to work? Are there missing pieces? What's the status of the toolchain? What's the developer experience? What's the user experience?

When you look at the current set of applications targeting WebAssembly in the browser, mostly it's games. While compelling, games don't provide a whole lot of insight into the shape of the future web platform, inasmuch as there doesn't have to be much JavaScript interaction when you have an already-working C++ game compiled to WebAssembly. (Indeed, as for the incidental interactions with JS that are currently necessary -- bouncing through JS in order to call WebGL -- people are actively working on removing that overhead, so that WebAssembly can call platform facilities (WebGL, etc) directly. But I digress!)

For WebAssembly to really succeed in the browser, there should also be incremental stories -- what does it look like when you start to add WebAssembly modules to a system that is currently written mostly in JavaScript?

To find out the answers to these questions and to evaluate potential platform modifications, I needed a small, standalone test case. So... I wrote one? It seemed like a good idea at the time.

pictie is a test bed

Pictie is a simple, standalone C++ graphics package implementing an algebra of painters. It was created not to be a great graphics package but rather to be a test-bed for compiling C++ libraries to WebAssembly. You can read more about it on its github page.

Structurally, pictie is a modern C++ library with a functional-style interface, smart pointers, reference types, lambdas, and all the rest. We use emscripten to compile it to WebAssembly; you can see more information on how that's done in the repository, or check the README.

Pictie is inspired by Peter Henderson's "Functional Geometry" (1982, 2002). "Functional Geometry" inspired the Picture language from the well-known Structure and Interpretation of Computer Programs computer science textbook.

prototype in action

So far it's been surprising how much stuff just works. There's still lots to do, but just getting a C++ library on the web is pretty easy! I advise you to take a look to see the details.

If you are thinking of dipping your toe into the WebAssembly water, maybe take a look also at Pictie when you're doing your back-of-the-envelope calculations. You can use it or a prototype like it to determine the effects of different compilation options on compile time, load time, throughput, and network traffic. You can check if the different binding strategies are appropriate for your C++ idioms; Pictie currently uses embind (source), but I would like to compare to WebIDL as well. You might also use it if you're considering what shape your C++ library should have to have a minimal overhead in a WebAssembly context.

I use Pictie as a test-bed when working on the web platform: on the weakref proposal, which adds finalization and leak detection, and on the binding layers around Emscripten. Eventually I'll be able to use it in other contexts as well, with the WebIDL bindings proposal, typed objects, and GC.

prototype the web forward

As the browser and adjacent environments have come to dominate programming in practice, we lost a bit of the delightful variety from computing. JS is a great language, but it shouldn't be the only medium for programs. WebAssembly is part of this future world, waiting in potentia, where applications for the web can be written in any of a number of languages. But, this future world will only arrive if it "works" -- if all of the various pieces, from standards to browsers to toolchains to virtual machines, only if all of these pieces fit together in some kind of sensible way. Now is the early phase of annealing, when the platform as a whole is actively searching for its new low-entropy state. We're going to need a lot of prototypes to get from here to there. In that spirit, may your prototypes be numerous and soon replaced. Happy annealing!

by Andy Wingo at June 03, 2019 10:10 AM

May 28, 2019

Eleni Maria Stea

Depth-aware upsampling experiments (Part 1: Nearest depth)

This post is about different depth aware techniques I tried in order to improve the upsampling of the low resolution Screen Space Ambient Occlusion (SSAO) texture of a VKDF demo. VKDF is a library and collection of Vulkan demos, written by Iago Toral. In one of his demos (the sponza), Iago implemented SSAO among many … Continue reading Depth-aware upsampling experiments (Part 1: Nearest depth)

by hikiko at May 28, 2019 08:42 PM

May 24, 2019

Andy Wingo

lightening run-time code generation

The upcoming Guile 3 release will have just-in-time native code generation. Finally, amirite? There's lots that I'd like to share about that and I need to start somewhere, so this article is about one piece of it: Lightening, a library to generate machine code.

on lightning

Lightening is a fork of GNU Lightning, adapted to suit the needs of Guile. In fact at first we chose to use GNU Lightning directly, "vendored" into the Guile source repository via the git subtree mechanism. (I see that in the meantime, git gained a kind of a subtree command; one day I will have to figure out what it's for.)

GNU Lightning has lots of things going for it. It has support for many architectures, even things like Itanium that I don't really care about but which a couple Guile users use. It abstracts the differences between e.g. x86 and ARMv7 behind a common API, so that in Guile I don't need to duplicate the JIT for each back-end. Such an abstraction can have a slight performance penalty, because maybe it missed the opportunity to generate optimal code, but this is acceptable to me: I was more concerned about the maintenance burden, and GNU Lightning seemed to solve that nicely.

GNU Lightning also has fantastic documentation. It's written in C and not C++, which is the right thing for Guile at this time, and it's also released under the LGPL, which is Guile's license. As it's a GNU project there's a good chance that GNU Guile's needs might be taken into account if any changes need be made.

I mentally associated Paolo Bonzini with the project, who I knew was a good no-nonsense hacker, as he used Lightning for a smalltalk implementation; and I knew also that Matthew Flatt used Lightning in Racket. Then I looked in the source code to see architecture support and was pleasantly surprised to see MIPS, POWER, and so on, so I went with GNU Lightning for Guile in our 2.9.1 release last October.

on lightening the lightning

When I chose GNU Lightning, I had in mind that it was a very simple library to cheaply write machine code into buffers. (Incidentally, if you have never worked with this stuff, I remember a time when I was pleasantly surprised to realize that an assembler could be a library and not just a program that processes text. A CPU interprets machine code. Machine code is just bytes, and you can just write C (or Scheme, or whatever) functions that write bytes into buffers, and pass those buffers off to the CPU. Now you know!)

Anyway indeed GNU Lightning 1.4 or so was that very simple library that I had in my head. I needed simple because I would need to debug any problems that came up, and I didn't want to add more complexity to the C side of Guile -- eventually I should be migrating this code over to Scheme anyway. And, of course, simple can mean fast, and I needed fast code generation.

However, GNU Lightning has a new release series, the 2.x series. This series is, in a way, a rewrite of the old version. On the plus side, this new series adds all of the weird architectures that I was pleasantly surprised to see. The old 1.4 didn't even have much x86-64 support, much less AArch64.

This new GNU Lightning 2.x series fundamentally changes the way the library works: instead of having a jit_ldr_f function that directly emits code to load a float from memory into a floating-point register, the jit_ldr_f function now creates a node in a graph. Before code is emitted, that graph is optimized, some register allocation happens around call sites and for temporary values, dead code is elided, and so on, then the graph is traversed and code emitted.

Unfortunately this wasn't really what I was looking for. The optimizations were a bit opaque to me and I just wanted something simple. Building the graph took more time than just emitting bytes into a buffer, and it takes more memory as well. When I found bugs, I couldn't tell whether they were related to my usage or in the library itself.

In the end, the node structure wasn't paying its way for me. But I couldn't just go back to the 1.4 series that I remembered -- it didn't have the architecture support that I needed. Faced with the choice between changing GNU Lightning 2.x in ways that went counter to its upstream direction, switching libraries, or refactoring GNU Lightning to be something that I needed, I chose the latter.

in which our protagonist cannot help himself

Friends, I regret to admit: I named the new thing "Lightening". True, it is a lightened Lightning, yes, but I am aware that it's horribly confusing. Pronounced like almost the same, visually almost identical -- I am a bad person. Oh well!!

I ported some of the existing GNU Lightning backends over to Lightening: ia32, x86-64, ARMv7, and AArch64. I deleted the backends for Itanium, HPPA, Alpha, and SPARC; they have no Debian ports and there is no situation in which I can afford to do QA on them. I would gladly accept contributions for PPC64, MIPS, RISC-V, and maybe S/390. At this point I reckon it takes around 20 hours to port an additional backend from GNU Lightning to Lightening.

Incidentally, if you need a code generation library, consider your choices wisely. It is likely that Lightening is not right for you. If you can afford platform-specific code and you need C, Lua's DynASM is probably the right thing for you. If you are in C++, copy the assemblers from a JavaScript engine -- C++ offers much more type safety, capabilities for optimization, and ergonomics.

But if you can only afford one emitter of JIT code for all architectures, you need simple C, you don't need register allocation, you want a simple library to just include in your source code, and you are good with the LGPL, then Lightening could be a thing for you. Check the gitlab page for info on how to test Lightening and how to include it into your project.

giving it a spin

Yesterday's Guile 2.9.2 release includes Lightening, so you can give it a spin. The switch to Lightening allowed us to lower our JIT optimization threshold by a factor of 50, letting us generate fast code sooner. If you try it out, let #guile on freenode know how it went. In any case, happy hacking!

by Andy Wingo at May 24, 2019 08:44 AM

May 23, 2019

Andy Wingo

bigint shipping in firefox!

I am delighted to share with folks the results of a project I have been helping out on for the last few months: implementation of "BigInt" in Firefox, which is finally shipping in Firefox 68 (beta).

what's a bigint?

BigInts are a new kind of JavaScript primitive value, like numbers or strings. A BigInt is a true integer: it can take on the value of any finite integer (subject to some arbitrarily large implementation-defined limits, such as the amount of memory in your machine). This contrasts with JavaScript number values, which have the well-known property of only being able to precisely represent integers between -2^53 and 2^53.

BigInts are written like "normal" integers, but with an n suffix:

var a = 1n;
var b = a + 42n;
b << 64n
// result: 793209995169510719488n
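The precision cliff mentioned above is easy to demonstrate:

```javascript
// Past 2^53, adjacent integers collapse to the same Number value...
console.log(2 ** 53 === 2 ** 53 + 1);       // true
// ...while BigInt arithmetic stays exact at any magnitude.
console.log(2n ** 53n === 2n ** 53n + 1n);  // false
```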

With the bigint proposal, the usual mathematical operations (+, -, *, /, %, <<, >>, **, and the comparison operators) are extended to operate on bigint values. As a new kind of primitive value, bigint values have their own typeof:

typeof 1n
// result: 'bigint'

Besides allowing for more kinds of math to be easily and efficiently expressed, BigInt also allows for better interoperability with systems that use 64-bit numbers, such as "inodes" in file systems, WebAssembly i64 values, high-precision timers, and so on.
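For example, BigInt.asUintN and BigInt.asIntN truncate a value to a given bit width, mirroring 64-bit wrap-around semantics (this snippet is mine, not from the proposal text):

```javascript
const u64max = 2n ** 64n - 1n;  // 18446744073709551615n

// Unsigned 64-bit arithmetic wraps: one past the maximum is 0.
console.log(BigInt.asUintN(64, u64max + 1n));  // 0n

// The same all-ones bit pattern, read as a signed 64-bit value, is -1.
console.log(BigInt.asIntN(64, u64max));        // -1n
```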

You can read more about the BigInt feature over on MDN, as usual. You might also like this short article on BigInt basics that V8 engineer Mathias Bynens wrote when Chrome shipped support for BigInt last year. There is an accompanying language implementation article as well, for those of y'all that enjoy the nitties and the gritties.

can i ship it?

To try out BigInt in Firefox, simply download a copy of Firefox Beta. This version of Firefox will be fully released to the public in a few weeks, on July 9th. If you're reading this in the future, I'm talking about Firefox 68.

BigInt is also shipping already in V8 and Chrome, and my colleague Caio Lima has a project in progress to implement it in JavaScriptCore / WebKit / Safari. Depending on your target audience, BigInt might be deployable already!


I must mention that my role in the BigInt work was relatively small; my Igalia colleague Robin Templeton did the bulk of the BigInt implementation work in Firefox, so large ups to them. Hearty thanks also to Mozilla's Jan de Mooij and Jeff Walden for their patient and detailed code reviews.

Thanks as well to the V8 engineers for their open source implementation of BigInt fundamental algorithms, as we used many of them in Firefox.

Finally, I need to make one big thank-you, and I hope that you will join me in expressing it. The road to ship anything in a web browser is long; besides the "simple matter of programming" that it is to implement a feature, you need a specification with buy-in from implementors and web standards people, you need a good working relationship with a browser vendor, you need willing technical reviewers, you need to follow up on the inevitable security bugs that any browser change causes, and all of this takes time. It's all predicated on having the backing of an organization that's foresighted enough to invest in this kind of long-term, high-reward platform engineering.

In that regard I think all people that work on the web platform should send a big shout-out to Tech at Bloomberg for making BigInt possible by underwriting all of Igalia's work in this area. Thank you, Bloomberg, and happy hacking!

by Andy Wingo at May 23, 2019 12:13 PM

May 21, 2019

Adrián Pérez

WPE WebKit 2.24

While WPE WebKit 2.24 has now been out for a couple of months, it includes over a year of development effort since our first official release, which means there is plenty to talk about. Let's dive in!

API & ABI Stability

The public API for WPE WebKit has been essentially unchanged since the 2.22.x releases, and we consider it now stable and its version has been set to 1.0. The pkg-config modules for the main components have been updated accordingly and are now named wpe-1.0 (for libwpe), wpebackend-fdo-1.0 (the FDO backend), and wpe-webkit-1.0 (WPE WebKit itself).

Our plan for the foreseeable future is to keep the WPE WebKit API backwards-compatible in the upcoming feature releases. On the other hand, the ABI might change, but will be kept compatible if possible, on a best-effort basis.

Both API and ABI are always guaranteed to remain compatible inside the same stable release series, and we are trying to follow a strict “no regressions allowed” policy for patch releases. We have added a page in the Web site which summarizes the WPE WebKit release schedule and this API/ABI stability guarantee.

This should allow distributors to always ship the latest available point release in a stable series. This is something we always strongly recommend, because point releases almost always include fixes for security vulnerabilities.


Web engines are security-critical software components, on which users rely every day for visualizing and manipulating sensitive information like personal data, medical records, or banking information—to name a few. Having regular releases means that we are able to publish periodical security advisories detailing the vulnerabilities fixed by them.

As WPE WebKit and WebKitGTK share a number of components with each other, advisories and the releases containing the corresponding fixes are published in sync, typically on the same day.

The team takes security seriously, and we are always happy to receive notice of security bugs. We ask reporters to act responsibly and observe the WebKit security policy for guidance.

Content Filtering

This new feature provides access to the WebKit internal content filtering engine, also used by Safari content blockers. The implementation is quite interesting: filtering rule sets are written as JSON documents, which are parsed and compiled to a compact bytecode representation that a tiny virtual machine executes for every resource load. This way, deciding whether a resource load should be blocked adds very little overhead, at the cost of a (potentially) slow initial compilation. To give you an idea: converting the popular EasyList rules to JSON results in a ~15 MiB file that can take up to three seconds to compile on the ARM processors typically used in embedded devices.
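
As an illustration of the rule-set format (a minimal sketch in WebKit's content-blocker JSON syntax; the URL pattern and CSS selector here are made up for the example), a rule set that blocks loads from a hypothetical ad server and hides banner elements everywhere could look like this:

```json
[
    {
        "trigger": { "url-filter": "ads\\.example\\.com" },
        "action": { "type": "block" }
    },
    {
        "trigger": { "url-filter": ".*" },
        "action": { "type": "css-display-none", "selector": ".ad-banner" }
    }
]
```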

In order to penalize application startup as little as possible, the new APIs are fully asynchronous and compilation is offloaded to a worker thread. On top of that, compiled rule sets are cached on disk to be reused across different runs of the same application (see WebKitUserContentFilterStore for details). Last but not least, the compiled bytecode is mapped in memory and shared among all the processes which need it: a browser with many tabs open will use practically the same amount of memory for content filtering as one with a single Web page loaded. The implementation is shared by the GTK and WPE WebKit ports.

I had been interested in implementing support for content filtering even before the WPE WebKit port existed, with the goal of replacing the ad blocker in GNOME Web. Some of the code had been lying around in a branch since the 2016 edition of the Web Engines Hackfest; it moved from my old laptop to the current one, and I worked on it on-and-off while the different patches needed to make it work slowly landed in the WebKit repository—one of the patches went through as many as seventeen revisions! At the moment I am still working on replacing the ad blocker in Web—in my free time—which I expect will be ready for GNOME 3.34.

It's All Text!

No matter how much the Web has evolved over the years, almost every Web site out there still needs textual content. This is one department where 2.24.x shines: text rendering.

Carlos García has been our typography hero during the development cycle: he single-handedly implemented support for variable fonts (demo), fixed our support for composite emoji (like 🧟‍♀️, composed of the glyphs “woman” and “zombie”), and improved the typeface selection algorithm to prefer coloured fonts for emoji. Additionally, many other subtle issues have been fixed, and the latest two patch releases include important fixes for text rendering.

Tip: WPE WebKit uses locally installed fonts as fallback. You may want to install at least one coloured font like Twemoji, which will ensure emoji glyphs can always be displayed.

API Ergonomics

GLib 2.44 added a nifty feature back in 2015: automatic cleanup of variables when they go out of scope using g_auto, g_autofree, and g_autoptr.

We have added the needed bits in the headers to allow their usage with the types from the WPE WebKit API. This enables developers to write code less likely to introduce accidental memory leaks because they do not need to remember freeing resources manually:

WebKitWebView* create_view (void)
{
    g_autoptr(WebKitWebContext) ctx = webkit_web_context_new ();

    /*
     * Configure "ctx" to your liking here. At the end of the scope (this
     * function), a g_object_unref(ctx) call will be automatically done.
     */
    return webkit_web_view_new_with_context (ctx);
}
Note that this does not change the API (nor the ABI). You will need to build your applications with GCC or Clang to make use of this feature.

“Featurism” and “Embeddability”

Look at that, I just coined two new “technobabble” terms!

There are many other improvements which are shipping right now in WPE WebKit. The following list highlights the main user and developer visible features that can be found in the 2.24.x versions:

  • A new GObject based API for JavaScriptCore.
  • A fairly complete WebDriver implementation. There is a patch for supporting WPE WebKit in Selenium pending integration. Feel free to vote 👍 for it to be merged.
  • WPEQt, which provides an idiomatic API similar to that of QWebView and allows embedding WPE WebKit as a widget in Qt/QML applications.
  • Support for the JPEG2000 image format. Michael Catanzaro has written about the reasoning behind this in his write-up about WebKitGTK 2.24.
  • Allow configuring the background of the WebKitWebView widget. Translucent backgrounds work as expected, which allows for novel applications like overlaying Web content on top of video streams.
  • An opt-in 16bpp rendering mode, which can be faster in some cases—remember to measure and profile in your target hardware! For now this only works with the RGB565 pixel format, which is the most common one used in embedded devices where 24bpp and 32bpp modes are not available.
  • Support for hole-punching using external media players. Note that at the moment there is no public API for this and you will need to patch the WPE WebKit code to plug your playback engine.
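
As a side note on the 16bpp mode mentioned above: RGB565 packs each pixel into 16 bits (5 bits of red, 6 of green, 5 of blue). A quick sketch of the packing from 8-bit channels, for illustration only:

```python
# Pack 8-bit RGB channels into a 16-bit RGB565 pixel by truncating the
# low-order bits of each channel (illustrative sketch).
def rgb888_to_rgb565(r, g, b):
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

white = rgb888_to_rgb565(255, 255, 255)   # 0xFFFF
pure_green = rgb888_to_rgb565(0, 255, 0)  # 0x07E0 (green keeps 6 bits)
```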

Despite all the improvements and features, the main focus of WPE WebKit is still to provide an embeddable Web engine. Fear not: new features are either opt-in (e.g. 16bpp rendering), disabled by default and adding no overhead unless enabled (WebDriver, background color), or have no measurable impact at all (g_autoptr). Not to mention that many features can even be disabled at build time, resulting in smaller binaries and a smaller runtime footprint—but that would be a topic for another day.

by aperez at May 21, 2019 07:00 PM

Eleni Maria Stea

A simple pixel shader viewer

In a previous post, I wrote about Vkrunner, and how I used it to play with fragment shaders. While I was writing the shaders for it, I had to save them, generate a PPM image and display it to see the changes. This render to image/display repetition gave me the idea to write a minimal … Continue reading A simple pixel shader viewer

by hikiko at May 21, 2019 05:52 AM

May 14, 2019

Javier Muñoz

Cephalocon Barcelona 2019

Next week I will attend Cephalocon 2019. It will take place on 19 and 20 May in Barcelona.

I will deliver a talk, under the sponsorship of my company Igalia, about Ceph Object Storage and the RGW/S3 service layer.

In this talk, I will share my experience contributing new features and bugfixes upstream that were developed through open projects in the community.

I will also review some of the contributions from Jewel to Nautilus and its impact from the product/service perspective for users and companies investing in upstream development.

Cephalocon 2019 is our second international conference and it aims to bring together more than 800 technologists and adopters from across the globe to showcase the history and future of Ceph, demonstrate real-world applications and highlight vendor solutions.

Attendee registration is still open. You can find more information about the event and how to register on the official event page. The complete schedule is also available.

See you there!

Update 2019/05/25

by Javier at May 14, 2019 10:00 PM

Alicia Boya

validateflow: A new tool to test GStreamer pipelines

It has been a while since GstValidate first became available. GstValidate has made it easier to write integration tests that check that playback and transcoding work as expected while executing actions (like seeking, changing subtitle tracks, etc.); testing at a high level rather than checking exact, fine-grained data flow.

As GStreamer is applied to an ever wider variety of use cases, testing often becomes cumbersome for those cases that resemble less typical playback. On one hand there is the C testing framework intended for unit tests, which is admittedly low level. Even when using something like GstHarness, checking that an element outputs the correct buffers and events requires a lot of manual coding. On the other hand, gst-validate has so far focused mostly on assets that can be played with a typical playbin, requiring extra effort and coding for the less straightforward cases.

This has historically left many specific test cases in that middle ground without an effective way to be tested. validateflow attempts to fill this gap by allowing gst-validate to test that custom pipelines, when driven in a certain way, produce the expected results.

validateflow itself is a GstValidate plugin that monitors the buffers and events flowing through a given pad and records them in a log file. The first time a test is run, this log becomes the expectation log. Further executions of the test still create a new log file, but this time it’s compared against the expectation log. Any difference is reported as an error. The user can rebaseline the tests by removing the expectation log file and running the test again. This is very similar to how many web browser tests work (e.g. Web Platform Tests).
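
This first-run/baseline workflow can be sketched in a few lines of Python (illustrative only; this is not validateflow's actual code, and the file name is made up):

```python
# Sketch of the expectation-log pattern: the first run creates the
# expectation, later runs are compared against it, and deleting the
# expectation file rebaselines the test.
import tempfile
from pathlib import Path

def check_log(expectation_file, actual):
    if not expectation_file.exists():
        expectation_file.write_text(actual)  # first run becomes the expectation
        return "baselined"
    if expectation_file.read_text() != actual:
        return "error"                       # any difference is an error
    return "passed"

with tempfile.TemporaryDirectory() as d:
    exp = Path(d) / "expected.log"
    first = check_log(exp, "event caps: video/x-h264")    # creates expectation
    second = check_log(exp, "event caps: video/x-h264")   # matches
    third = check_log(exp, "event segment: format=TIME")  # differs
```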

How to get it

validateflow has been landed recently on the development versions of GStreamer. Before 1.16 is released you’ll be able to use it by checking out the latest master branches of GStreamer subprojects, preferably with something like gst-build.

Make sure to update gst-devtools too. Then update gst-integration-testsuites by running the following command, which will update the repo and fetch the media files. Otherwise you will get errors.

gst-validate-launcher --sync -L

Writing tests

The usual way to use validateflow is through pipelines.json, a file parsed by the validate test suite (the one run by default by gst-validate-launcher) where all the necessary elements of a validateflow tests can be placed together.

For instance:

{
    "pipeline": "appsrc ! qtdemux ! fakesink async=false",
    "config": [
        "%(validateflow)s, pad=fakesink0:sink, record-buffers=false"
    ],
    "scenarios": [
        {
            "name": "default",
            "actions": [
                "description, seek=false, handles-states=false",
                "appsrc-push, target-element-name=appsrc0, file-name=\"%(medias)s/fragments/car-20120827-85.mp4/init.mp4\"",
                "appsrc-push, target-element-name=appsrc0, file-name=\"%(medias)s/fragments/car-20120827-85.mp4/media1.mp4\"",
                "checkpoint, text=\"A moov with a different edit list is now pushed\"",
                "appsrc-push, target-element-name=appsrc0, file-name=\"%(medias)s/fragments/car-20120827-86.mp4/init.mp4\"",
                "appsrc-push, target-element-name=appsrc0, file-name=\"%(medias)s/fragments/car-20120827-86.mp4/media2.mp4\""
            ]
        }
    ]
}

These are:

  • pipeline: A string with the same syntax as gst-launch describing the pipeline to use. Python string interpolation can be used to get the path to the medias directory where audio and video assets are placed in the gst-integration-testsuites repo by writing %(medias)s. It can also be used to get a video or audio sink that can be muted, with %(videosink)s or %(audiosink)s.
  • config: A validate configuration file. Among other things that can be set, here validateflow overrides are defined, one per line, with %(validateflow)s, which expands to validateflow, plus some options defining where the logs will be written (which depends on the test name). Each override monitors one pad. The settings here define which pad, and what will be recorded.
  • scenarios: Usually a single scenario is provided. A series of actions performed in order on the pipeline. These are normal GstValidate scenarios, but new actions have been added, e.g. for controlling appsrc elements (so that you can push chunks of data in several steps instead of relying on a filesrc pushing a whole file and be done with it).
Running tests

    The tests defined in pipelines.json are automatically run by default when running gst-validate-launcher, since they are part of the default test suite.

    You can get the list of all the pipelines.json tests like this:

    gst-validate-launcher -L | grep launch_pipeline

    You can use these test names to run specific tests. The -v flag is useful to see the actions as they are executed. --gdb runs the test inside the GNU debugger.

    gst-validate-launcher -v validate.launch_pipeline.qtdemux_change_edit_list.default

    In the command line above, validate. defines the name of the test suite Python file, testsuites/ The rest, launch_pipeline.qtdemux_change_edit_list.default, is actually a regex: here the . just happens to match a literal period, but it would match any character (it would be more correct, albeit also more inconvenient, to use \. instead). You can use this feature to run several related tests, for instance:

    $ gst-validate-launcher -m 'validate.launch_pipeline\.appsrc_.*'
    Setting up GstValidate default tests
    [3 / 3]  validate.launch_pipeline.appsrc_preroll_test.single_push: Passed
               Total time spent: 0:00:00.369149 seconds
               Passed: 3
               Failed: 0
               Total: 3

    Expectation files are stored in a directory named flow-expectations, e.g.:


    The actual output log (which is compared to the expectations) is stored as a log file, e.g.:


    Here is how a validateflow log looks:

    event stream-start: GstEventStreamStart, flags=(GstStreamFlags)GST_STREAM_FLAG_NONE, group-id=(uint)1;
    event caps: video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)2.1, profile=(string)main, codec_data=(buffer)014d4015ffe10016674d4015d901b1fe4e1000003e90000bb800f162e48001000468eb8f20, width=(int)426, height=(int)240, pixel-aspect-ratio=(fraction)1/1;
    event segment: format=TIME, start=0:00:00.000000000, offset=0:00:00.000000000, stop=none, time=0:00:00.000000000, base=0:00:00.000000000, position=0:00:00.000000000
    event tag: GstTagList-stream, taglist=(taglist)"taglist\,\ video-codec\=\(string\)\"H.264\\\ /\\\ AVC\"\;";
    event tag: GstTagList-global, taglist=(taglist)"taglist\,\ datetime\=\(datetime\)2012-08-27T01:00:50Z\,\ container-format\=\(string\)\"ISO\\\ fMP4\"\;";
    event tag: GstTagList-stream, taglist=(taglist)"taglist\,\ video-codec\=\(string\)\"H.264\\\ /\\\ AVC\"\;";
    event caps: video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)2.1, profile=(string)main, codec_data=(buffer)014d4015ffe10016674d4015d901b1fe4e1000003e90000bb800f162e48001000468eb8f20, width=(int)426, height=(int)240, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)24000/1001;
    CHECKPOINT: A moov with a different edit list is now pushed
    event caps: video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)3, profile=(string)main, codec_data=(buffer)014d401effe10016674d401ee8805017fcb0800001f480005dc0078b168901000468ebaf20, width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1;
    event segment: format=TIME, start=0:00:00.041711111, offset=0:00:00.000000000, stop=none, time=0:00:00.000000000, base=0:00:00.000000000, position=0:00:00.041711111
    event tag: GstTagList-stream, taglist=(taglist)"taglist\,\ video-codec\=\(string\)\"H.264\\\ /\\\ AVC\"\;";
    event tag: GstTagList-stream, taglist=(taglist)"taglist\,\ video-codec\=\(string\)\"H.264\\\ /\\\ AVC\"\;";
    event caps: video/x-h264, stream-format=(string)avc, alignment=(string)au, level=(string)3, profile=(string)main, codec_data=(buffer)014d401effe10016674d401ee8805017fcb0800001f480005dc0078b168901000468ebaf20, width=(int)640, height=(int)360, pixel-aspect-ratio=(fraction)1/1, framerate=(fraction)24000/1001;

    Prerolling and appsrc

    By default scenarios don’t start executing actions until the pipeline is playing. Also by default sinks require a preroll for that to occur (that is, a buffer must reach the sink before the state transition to playing is completed).

    This poses a problem for scenarios using appsrc, as no action will be executed until a buffer reaches the sink, but a buffer can only be pushed as the result of an appsrc-push action, creating a chicken and egg problem.

    For many cases that don’t require playback we can solve this simply by disabling prerolling altogether, setting async=false in the sinks.

    For cases where prerolling is desired (like playback), handles-states=true should be set in the scenario description. This makes the scenario actions run without having to wait for a state change. appsrc-push will notice the pipeline is in a state where buffers can’t flow, enqueue the buffer without waiting for it to be pushed, and let the next action run immediately. Then the set-state action can be used to set the state of the pipeline to playing, which will let the appsrc emit the buffer.

    description, seek=false, handles-states=true
    appsrc-push, target-element-name=appsrc0, file-name="raw_h264.0.mp4"
    set-state, state=playing
    appsrc-eos, target-element-name=appsrc0


    The documentation of validateflow, explaining its usage in more detail, can be found here:

    by aboya at May 14, 2019 02:13 PM

    May 06, 2019

    Eleni Maria Stea

    Vkrunner allows specifying the required Vulkan version

    The required Vulkan implementation version for a Vkrunner shader test can now be specified in its [require] section. Tests that are targeting Vulkan versions that aren’t supported by the device driver will be skipped. As a reminder, Vkrunner is a Vulkan shader testing tool similar to Piglit that was written by … Continue reading Vkrunner allows specifying the required Vulkan version

    by hikiko at May 06, 2019 07:11 PM

    Having fun with Vkrunner!

    Vkrunner is a Vulkan shader testing tool similar to Piglit, written by Neil Roberts. It is mostly used by graphics drivers developers, and was also part of the official Khronos conformance tests suite repository (VK-GL-CTS) for some time [1]. There are already posts [2] about its use but they are all written from a driver … Continue reading Having fun with Vkrunner!

    by hikiko at May 06, 2019 07:10 PM

    April 29, 2019

    Asumu Takikawa

    WebAssembly in Redex

    Recently I’ve been studying the semantics of WebAssembly (wasm). If you’ve never heard of WebAssembly, it’s a new web programming language that’s planned to be a low-level analogue to JavaScript that’s supported by multiple browsers. Wasm could be used as a compilation target for a variety of new frontend languages for web programming.

    (see also Lin Clark’s illustrated blog series on WebAssembly for a nice introduction)

    As a cross-browser effort, the language comes with a detailed independent specification. A very nice aspect of the WebAssembly spec is that the language’s semantics are specified precisely with a small-step operational semantics (it’s even a reduction semantics in the style of Felleisen and Hieb).

    A condensed version of the wasm semantics was presented in a 2017 PLDI paper written by Andreas Haas and co-authors.

    (the current semantics of the full language is available at the WebAssembly spec website)

    In this blog post, I’m going to share some work I’ve done to make an executable version of the operational semantics from the 2017 paper. In the process, I’ll try to explain a few of the interesting design choices of wasm via the semantics.

    A good thing about operational semantics is that they are relatively easy to understand and can be easily converted into an executable form. You can think of them as basically an interpreter for your programming language that is specified precisely using mathematical relations defined over a formal grammar.

    Specifying a language in this way is helpful for proving various properties about your language. It’s also an implementation independent way to understand how programs will evaluate, which you can use to validate a production implementation.

    Because of the way that operational semantics resemble interpreters, you can construct executable operational semantics using domain specific modeling languages designed for that purpose (Redex, K, etc).

    (Note: wasm already comes with a reference interpreter written in OCaml, but it’s not using a modeling language in the sense described here)

    The modeling language I’ll use in this blog post is Redex, a DSL hosted in Racket that is designed for reduction semantics.

    Why Make an Executable Formal Semantics?

    If you’re not familiar with semantics modeling tools, you might be wondering what it even means to write an executable model. The basic process is to take the formal grammars and rules presented in an operational semantics and transcribe them as programs written in a specialized modeling language. The resulting interpreters can be used to run examples in the modeled language.

    The advantage to the executable model is that you can apply software engineering principles like randomized testing on your language semantics to see if there are bugs in your specification.

    Another benefit that you get is visualization, which can be helpful for understanding how specific programs execute.

    For example, here’s a screenshot showing a trace visualization for executing a factorial program in wasm:

    step-through of a wasm program
    Stepping through a function call in Redex

    This is a custom visualization that I made leveraging Redex’s built-in traces function, which lets you visualize every step of the reduction process. Here it shows a trace starting with a function call to an implementation of factorial. Each box shows a visual representation of the instructions for that machine state. The arrows show the order of reduction and which rule was used to produce the next state.

    In the next section I’m going to go over some background about semantics and modeling in Redex that’s needed to understand how the wasm model works (you can skip this if you are already familiar with operational semantics and Redex).

    Redex & Reduction Semantics Basics

    As mentioned above, the basic idea of reduction semantics is to define a kind of formal interpreter for your programming language. In order to define this interpreter, you need to define (1) your language as a grammar, and (2) what the states for the machine that runs your programs looks like.

    In a simple language, the machine states might just be the program expressions in your language. But when you start to deal with more complicated features like memory and side effects, your machine may need additional components (like a store for representing memory, a representation of stack frames, and so on).

    In Redex, language definitions are made using the define-language form. It defines a grammar for your language using a straightforward pattern-based language that looks like BNF.

    A simple grammar might look like the following:

    (define-language simple-language
      ;; values
      (v ::= number)
      ;; expressions
      (e ::= v
             (add e e)
             (if e e e))
      ;; evaluation contexts
      (E ::= hole
             (add E e)
             (add v E)
             (if E e e)))

    The define-language form takes sets of non-terminals (e.g., e or v) paired with productions for terms in the language (e.g., (add e e)) and creates a language description bound to the given name (i.e., simple-language).

    This is a small and simple language with only conditionals, numbers, and basic addition. It accepts terms like (term 5), (term (add (add 1 2) 3)), (term (if 0 (add 1 3) (if 1 1 2))), and so on.

    (term is a special Redex form for constructing language terms)

    In order to actually evaluate programs, we will need to define a reduction relation, which describes how machine states reduce to other states. In this simple language, states are just expressions.

    Here’s a very simple reduction relation:

    (define simple-red-v0
      (reduction-relation simple-language
        ;; machine states are just expressions "e"
        #:domain e
        (--> (add v_1 v_2)               ; pattern match
             ,(+ (term v_1) (term v_2))  ; result of reduction
             add)                        ; this part is just the name of the rule
        ;; two rules for conditionals, true and false
        (--> (if 1 e_t e_f) e_t if-true) ; the _ is used to name pattern occurrences, like e_t
        (--> (if 0 e_t e_f) e_f if-false)))

    You can read this reduction-relation form as describing a relation on the states e of the language, where the --> clauses define specific rules.

    Each --> describes how to reduce terms that match a pattern, like (add v_1 v_2), into a result term. The , allows an escape from the Redex DSL into full Racket to do arbitrary computations, like arithmetic in this case.

    With this reduction relation in hand, we can evaluate examples in the language using the test-->> unit test form, which will recursively apply the reduction relation until it’s done:

    (test-->> simple-red-v0 (term (add 3 4)) (term 7))
    (test-->> simple-red-v0 (term (if 0 1 2)) (term 2))

    These tests will succeed, showing that the first term evaluates to the second in each case.

    Unfortunately, this isn’t quite enough to evaluate more complex terms:

    > (test-->> simple-red-v0 (term (add (if 0 (add 3 4) (add 18 2)) 5)) (term 25))
    FAILED ::2666
    expected: 25
      actual: '(add (if 0 (add 3 4) (add 18 2)) 5)

    This is because the reduction relation above is defined for every form in the language, but it doesn’t tell you how to reduce inside nested expressions. For example, there is no explicit rule for evaluating an if nested inside an add.

    Of course, writing rules for all these nested combinations doesn’t make sense. That’s where evaluation contexts come in. An evaluation context like E (see the grammar from earlier) defines where reduction can happen. Anywhere that a hole exists in an evaluation context is a reducible spot.

    So for example, (add 3 hole) and (add hole (if 0 1 2)) are valid evaluation contexts, following the productions in the grammar. On the other hand, (if (add 1 -1) hole 5) is not a valid evaluation context, because you can’t evaluate the “then” branch of a conditional before evaluating the condition.
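
    To make the role of contexts concrete, here is a hedged Python sketch of the same little language (not Redex code): reduction only ever happens at positions the contexts allow ((add E e), (add v E), (if E e e)), never inside the branches of an unevaluated conditional.

```python
# Small-step interpreter for the add/if language; terms are nested tuples
# like ("add", 3, 4) and ("if", 0, t, e), values are plain integers.
def is_value(t):
    return isinstance(t, int)

def step(t):
    """Perform one small-step reduction on a non-value term."""
    op = t[0]
    if op == "add":
        _, l, r = t
        if not is_value(l):
            return ("add", step(l), r)      # reduce inside (add E e)
        if not is_value(r):
            return ("add", l, step(r))      # reduce inside (add v E)
        return l + r
    if op == "if":
        _, c, th, el = t
        if not is_value(c):
            return ("if", step(c), th, el)  # reduce inside (if E e e)
        return th if c == 1 else el
    raise ValueError(f"stuck term: {t!r}")

def evaluate(t):
    while not is_value(t):
        t = step(t)
    return t

result = evaluate(("add", ("if", 0, ("add", 3, 4), ("add", 18, 2)), 5))  # 25
```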

    You can combine evaluation contexts with an in-hole form to do context-based matching in reduction relations. Adding that changes the above reduction relation to this one:

    (define simple-red-v1
      (reduction-relation simple-language
        #:domain e
        ;; note the in-hole that's added
        (--> (in-hole E (add v_1 v_2))
             (in-hole E ,(+ (term v_1) (term v_2)))
             add)
        (--> (in-hole E (if 1 e_t e_f))
             (in-hole E e_t)
             if-true)
        (--> (in-hole E (if 0 e_t e_f))
             (in-hole E e_f)
             if-false)))

    With this new relation, the test from before will succeed:

    (test-->> simple-red-v1 (term (add (if 0 (add 3 4) (add 18 2)) 5)) (term 25))

    This is because the in-hole pattern will match an evaluation context whose hole is substituted with a term that will be reduced.

    To make that a bit more concrete, the test above would execute a pattern match like this:

    > (redex-match
       simple-language
       (in-hole E (if 0 e_1 e_2))
       (term (add (if 0 (add 3 4) (add 18 2)) 5)))
    (list
     (match
      (list
       (bind 'E '(add hole 5))
       (bind 'e_1 '(add 3 4))
       (bind 'e_2 '(add 18 2)))))

    This interaction shows that the evaluation context pattern E gets matched to an add term with a hole. Inside that hole is an if expression, which is where the reduction will happen.

    Evaluation contexts are powerful in that you can describe all of these nested computations with straightforward rules that only mention the inner term that is being reduced. They are also useful for describing more complicated language features, such as mutation and state.

    With that brief tour of Redex, the next section will give a high-level overview of a model of wasm in Redex.

    WebAssembly Design via Redex

    WebAssembly is an interesting language because its design goals are specifically oriented towards web programming. That means, for example, that programs should be compact so that web browsers don’t have to download and process large scripts.

    Security is of course also a major concern on the web, so the language is designed with isolation in mind to ensure that programs cannot access unnecessary state or interfere with the runtime system’s data.

    The desire for “a compact program representation” (Haas et al.) led to a stack-based design, in contrast to the simple nested expression language I gave as an example earlier.

    This means that operations take values from the stack, rather than nesting them in a tree-like structure. For example, instead of an expression like (add 3 4), wasm would use something like (i32.const 3) (i32.const 4) add. This sequence of instructions pushes two constants onto the stack, and then performs an addition with values popped from the stack (and then pushes the result).
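
    A toy stack machine makes this concrete (a sketch for illustration, not wasm's actual semantics; the instruction encoding as tuples is made up):

```python
# Execute a flat instruction sequence against a value stack: constants are
# pushed, "add" pops two values and pushes their sum.
def run(instructions):
    stack = []
    for ins in instructions:
        if ins[0] == "i32.const":
            stack.append(ins[1])
        elif ins[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    return stack

result = run([("i32.const", 3), ("i32.const", 4), ("add",)])  # leaves [7]
```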

    Wasm is also statically typed with a very simple type system. The validation provided by the type system ensures that you can statically know the stack layout at any point in the program, which means that you can’t have programs that access out-of-bounds positions in the stack (or the wrong registers, if you compile stack accesses to register accesses).
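
    One way to see why the stack layout is statically known: each instruction has a fixed stack effect, so a single pass over a straight-line sequence can compute the depth at every program point and reject underflows. A sketch with a hypothetical opcode table (this is not wasm's actual validation algorithm):

```python
# Hypothetical (pops, pushes) effects for a few opcodes.
EFFECTS = {"i32.const": (0, 1), "add": (2, 1), "drop": (1, 0)}

def stack_depth(instructions, depth=0):
    """Return the final stack depth, rejecting any out-of-bounds access."""
    for op in instructions:
        pops, pushes = EFFECTS[op]
        if depth < pops:
            raise TypeError(f"stack underflow at {op}")
        depth = depth - pops + pushes
    return depth

depth = stack_depth(["i32.const", "i32.const", "add"])  # 1
```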

    By understanding the semantics, either in math or in model code, you can see how these security concerns are addressed operationally. For example, later in this section I’ll explain how the operational semantics demonstrates wasm’s memory isolation.

    To provide a starting point for explaining the Redex model, here’s a screenshot of the grammar of wasm from Haas et al.’s paper:

    wasm grammar
    Grammar from Haas et al., CC-BY 4.0

    This formal grammar can be transcribed straightforwardly as a BNF-style language definition, as explained in the previous section. The grammar in Redex looks something like this:

    (define-language wasm-lang
      ;; basic types
      (t   ::= t-i t-f)
      (t-f ::= f32 f64)
      (t-i ::= i32 i64)
      ;; function types
      (tf  ::= (-> (t ...) (t ...)))
      ;; instructions (excerpted)
      (e-no-v ::= drop                      ; drop stack value
                  select                    ; select 1 of 2 values
                  (block tf e*)             ; control block
                  (loop tf e*)              ; looping
                  (if tf e* else e*)        ; conditional
                  (br i)                    ; branch
                  (br-if i)                 ; conditional branch
                  (call i)                  ; call (function by index)
                  (call cl)                 ; call (a closure)
                  return                    ; return from function
                  (get-local i)             ; get local variable
                  (set-local i)             ; set local variable
                  (label n {e*} e*)         ; branch target
                  (local n {i (v ...)} e*)  ; function body instruction
                  ...)                      ; and so on
      ;; instructions including constant values
      (e    ::= e-no-v
                (const t c))
      (c    ::= number)
      ;; sequences of instructions
      (e*   ::= ϵ
                (e e*))
      ;; various kinds of indices
      ((i j l k m n a o) integer)
      ;; modules and various other forms omitted
      )

    This shows just a subset of the grammar, but you can see there’s a close correspondence to the math. The main differences are just in the surface syntax, such as how expressions are ordered or nested.

    Using this syntax, the addition instructions from before would inhabit the e* non-terminal for a sequence of instructions. Following the grammar, it would be written as ((const i32 3) ((const i32 4) (add ε))). This instruction sequence is represented with a nested list (where ε is the null list) rather than a flat sequence to make it easier to manipulate in Redex.
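
    The nested encoding can be produced mechanically from a flat instruction list by consing from the right (a sketch; the instructions are shown here as plain strings):

```python
# Build the nested (e e*) representation of a sequence, where "ε" stands
# for the empty sequence.
def to_seq(instructions):
    seq = "ε"
    for ins in reversed(instructions):
        seq = (ins, seq)
    return seq

seq = to_seq(["(const i32 3)", "(const i32 4)", "add"])
# ("(const i32 3)", ("(const i32 4)", ("add", "ε")))
```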

    We also need to define evaluation contexts for this language. This is thankfully pretty simple:

    (E ::= hole
           (v E)
           ((label n (e*) E) e*))

    What these contexts mean is that you either look for a matching sequence of instructions that comes after a (possibly empty) nested prefix of values v to reduce, or you look to reduce inside a label form (or some combination of these two patterns).

    Unlike the simple language from earlier, wasm has side effects, memory, modules, and other features. Because of this complexity, the machine states of wasm have to include a store (the non-terminal s), a call frame (F), a sequence of instructions (e*), and an index for the current module instance (i).

    As a result, the reduction relation becomes more complicated. The structure of the relation looks like this:

    (define wasm->
      (reduction-relation
       wasm-lang ;; the language definition (its name is elided here)
       ;; the machine states
       #:domain (s F e* i)
       ;; a subset of reduction rules shown below
       ;; rule for binary operations, like add or sub
       (++> ;; this pattern matches two consts on the stack
            (in-hole E ((const t c_1) ((const t c_2) ((binop t) e*))))
            ;; the two consts are consumed, and replaced with a result const
            (in-hole E ((const t c) e*))
            ;; do-binop is a helper function for executing the operations
            (where c (do-binop binop t c_1 c_2)))
       ;; rule for just dropping a stack value
       (++> (in-hole E (v (drop e*))) ; consumes one v = (const t c)
            (in-hole E e*))           ; return remaining instructions e*
       ;; more rules would go here
       ;; shorthand reduction definitions
       with
       [(--> (s F x i)
             (s F y i))
        (++> x y)]))

    Again, we have the #:domain keyword indicating what the machine states are. Then there are two rules for binary operations and the drop instruction (other rules omitted for now).

    The rules use a shorthand form ++> (defined at the bottom) that matches only the instruction sequence and ignores the store, call frame, and so on. This is just used to simplify the rules so they look like the ones in the paper.
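    To see what the binop rule does operationally, here is a hedged Python analogue of a single reduction step on the nested instruction sequence (this is an illustration, not the Redex model; the OPS table stands in for the do-binop metafunction):

```python
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}

def step_binop(seq):
    """Reduce ((const t c1) ((const t c2) ((binop t) rest))) to
    ((const t result) rest), or return None if the pattern does not
    match (i.e., the rule does not fire)."""
    try:
        (tag1, t1, c1), ((tag2, t2, c2), ((op, t3), rest)) = seq
    except (TypeError, ValueError):
        return None
    if tag1 == tag2 == "const" and t1 == t2 == t3 and op in OPS:
        return (("const", t1, OPS[op](c1, c2)), rest)
    return None

stack = (("const", "i32", 4), (("const", "i32", 3), (("add", "i32"), ())))
# step_binop(stack) == (("const", "i32", 7), ())
```

    As in the Redex rule, the two consts are consumed and replaced with a single result const, and a failed pattern match simply means the rule does not apply.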

    For comparison, here’s a screenshot of the full reduction semantics figure from the wasm paper:

    wasm reduction rules
    Reduction relation from Haas et al., CC-BY 4.0

    You can look at the 3rd and 12th rules in that figure and compare them to the two in the code excerpt above. You can see that there’s a close correspondence. In this fashion, you can transcribe all the rules from math to code, though you also have to write a significant amount of helper code to implement the side conditions in the rules.

    As I promised earlier, by inspecting the semantics, you can see how the language isolates the memory of wasm scripts that are executing. The store term s, whose grammar I omitted earlier, is defined like this:

      ;; stores: contains modules, tables, memories
      (s       ::= {(inst modinst ...)
                    (tab tabinst ...)
                    (mem meminst ...)})
      ;; modules with a list of closures cl, global values v,
      ;; table index and memory index
      (modinst ::= {(func cl ...) (glob v ...)}
                   ;; tab and mem are optional, hence all these cases
                   {(func cl ...) (glob v ...) (tab i)}
                   {(func cl ...) (glob v ...) (mem i)}
                   {(func cl ...) (glob v ...) (tab i) (mem i)})
      ;; tables are lists of closures
      (tabinst ::= (cl ...))
      ;; memories are lists of bytes
      (meminst ::= (b ...))

    The store is a structure containing some module, table, and memory instances. These are basically several kinds of global state that the instructions need to reference to accomplish various non-local operations, such as memory access, global variable access, function calls, and so on.

    Each module instance contains a list of functions (stored as closures cl), a list of global variables, and optionally indices specifying tables and memories associated with the module.

    The functions and global variables have an obvious purpose: they allow function calls to fetch the code of the function being invoked, and they allow global variables to be read and written.

    The table index inside a module allows dynamic dispatch to functions stored in a shared table of closures. This lets you use a function-pointer-like dispatch pattern without the dangers of pointers into memory.

    Finally, each module may declare an associated memory via an index, which means different modules can share access to the same memory if desired.

    This index number indexes into the list of memory instances (mem meminst ...), each of which is just represented as a list of bytes (b ...). The memory can be used for loads and stores to represent structured data or for other uses of memory.
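    As a rough illustration, the store's shape can be written out as plain Python data (the class and field names here are mine, chosen to mirror the grammar, and not part of the model):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ModInst:
    funcs: list                   # closures cl ...
    globs: list                   # global values v ...
    tab: Optional[int] = None     # optional index into the store's tables
    mem: Optional[int] = None     # optional index into the store's memories

@dataclass
class Store:
    insts: list = field(default_factory=list)  # module instances (inst ...)
    tabs: list = field(default_factory=list)   # tables: lists of closures
    mems: list = field(default_factory=list)   # memories: lists of bytes

# two module instances sharing memory 0, as the text describes
s = Store(insts=[ModInst([], [], mem=0), ModInst([], [], mem=0)],
          mems=[[0] * 16])
```

    Note how sharing a memory is just a matter of two modules holding the same index, while code and globals stay in their own sections.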

    The Redex code for the memory load and store rules looks like this:

       ;; reductions for operating on memory
       (--> ;; one stack argument for address in memory, k
            (s F (in-hole E ((const i32 k) ((load t a o) e*))) i)
            ;; reinterpret bytes as the appropriately typed data
            (s F (in-hole E ((const t (const-reinterpret t (b ...))) e*)) i)
            ;; helper function fetches bytes from appropriate memory in s
            (where (b ...) (store-mem s i ,(+ (term k) (term o)) (sizeof t))))
       (--> ;; stack arguments for address k and new value c
            (s F (in-hole E ((const i32 k) ((const t c) ((store t a o) e*)))) i)
            ;; installs a new store s_new with the memory changed
            (s_new F (in-hole E e*) i)
            ;; the size in bytes for the given type
            (where n (sizeof t))
            ;; helper function modifies the memory in s, creating s_new
            (where s_new (store-mem= s i ,(+ (term k) (term o)) n (bits n t c))))

    These rules are a little more complicated and rely on various helper functions. The helper functions are not too complicated, however, and basically just do the appropriate indexing into the store data structure.

    One key difference from the earlier rules is the use of --> without any shorthands. This allows the rules to reference the store s to access parts of the global machine state. In this case, it’s the memory part of the store that’s needed.

    From this, you can see that wasm code only ever touches the appropriate memory instance that’s associated with the module that the code is in. More specifically, that’s because the store lookup is done using the current module index i. No other memories can be accessed via this lookup since this index is fixed for a given function definition.

    All memory accesses are also bounds-checked to avoid access to arbitrary regions of memory that might interfere with the runtime system and result in security vulnerabilities. The bounds checking is done inside the store-mem and store-mem= helper functions, which will fail to match in the where clauses if the index k is out of bounds.
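    Here is a sketch in plain Python of what the load rule and its store-mem helper amount to: fetch sizeof(t) bytes at address k + o from a memory represented as a list of bytes, bounds-check the access, and reinterpret the bytes as a typed value. Returning None models a where clause that fails to match. (The names and type tables are mine, not the model's.)

```python
import struct

SIZEOF = {"i32": 4, "i64": 8, "f32": 4, "f64": 8}
FMT = {"i32": "<i", "i64": "<q", "f32": "<f", "f64": "<d"}  # little-endian

def load(mem, t, k, o):
    """Bounds-checked load of a t-typed value at address k + o."""
    addr, size = k + o, SIZEOF[t]
    if addr < 0 or addr + size > len(mem):
        return None  # out of bounds: the rule does not fire
    return struct.unpack(FMT[t], bytes(mem[addr:addr + size]))[0]

mem = [0x2A, 0x00, 0x00, 0x00, 0, 0, 0, 0]  # an 8-byte memory
# load(mem, "i32", 0, 0) == 42
# load(mem, "i32", 8, 0) is None (out of bounds)
```

    The important point carried over from the semantics is that the only memory ever consulted is the one the module's index selects, and an out-of-range address can never reach outside that byte list.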

    You can get an idea of how wasm is designed with web security in mind from this particular rule and how the store is designed. If you look at other rules, you can also see how global variables and local variables are kept separate from general memory access as well, which prevents memory access from interfering with variables and vice versa.

    In addition, code (e.g., function definitions, indirectly accessed functions in tables) is not stored in the memory either, which prevents arbitrary code execution or modification. You can see this from how the function closures are stored in separate function and table sections in modules, as explained above.

    Where to go from here

    The last section gave an overview of some interesting parts of the wasm semantics, with a focus on understanding some of the isolation guarantees that the design provides.

    This is just a small taste of the full semantics, which you can understand more comprehensively by reading the paper or the execution part of the language specification. The paper’s a very interesting read, with lots of attention paid to explaining design rationales.

    You can also try examples out with the basic Redex model that I’ve built, though with the caveat that it’s not quite complete. For example, I didn’t implement the type system and there are some rough edges around the numeric operations.

    There are also interesting ways in which the model could be extended.

    For one, it could cover the full semantics that’s described in the spec instead of the small version in the paper. If you combined that with a parser for wasm’s surface syntax, you could run real wasm programs that a web browser would understand and trace through the reductions.

    Wasm’s reference interpreter also comes with a lot of tests. With a proper parser, we could feed in all the tests to make sure the model is accurate (I’m sure there are bugs in my model).

    It would then be interesting to do random testing to discover if there are discrepancies between the specification-based semantics encoded in the executable model and implementations in various web browsers.

    If anyone’s interested in exploring any of that, you can check out the code I’ve written so far on github:

    Appendix: More Redex Discussion

    This section is an appendix of more in-depth discussion about using Redex specifically, if you’re interested in some of the nitty-gritty details.

    There were several challenges involved in actually modeling wasm in Redex. The first challenge I bumped into is that the wasm model’s reduction rules operate on sequences of values, sometimes even producing an empty sequence as a result.

    This actually makes encoding into Redex a little tricky, as Redex assumes that evaluation contexts are plugged in with a single datum and not a splicing sequence of them. The solution to this, which you can see in the rules shown above, is to explicitly match over the “remaining” sequence of expressions in the instruction stack and handle it explicitly (either discarding it or appending to it).

    Then evaluation contexts can just be plugged with a sequence of instructions or values.

    This requires a few cosmetic modifications to the grammar and rules when compared to the paper. For example, the wasm paper’s evaluation contexts simply define a basic context as (v ... hole e ...) in which the hole could be plugged with a general e ....

    Instead, the Redex model uses a context like (v E) where E can be a hole or another nesting level. A hole is then plugged with an e* (defined as a list) representing both the thing to plug into the hole in the original rule and the “remaining” e ... expressions.

    The rules also need to explicitly handle cons-ing and appending of expression sequences. In a sense, this is just making explicit what was implicit in the ... and juxtaposition in the paper’s rules.

    Another challenge was the indexed evaluation contexts of the paper. The paper’s contexts are indexed by a nesting depth, which constrains the kinds of contexts that a rule can apply to. In Redex it’s not possible to add such data-indexed constraints into grammars, so you end up having to apply extra side conditions in the reduction rules where such indexed contexts are used.

    For example, this shows in the rule for branching from inside of a control construct. The evaluation context E below is indexed by a nesting depth k in the paper’s rules:

    (==> ((label n {e*_0} (in-hole E ((br j) e*_1))) e*_2)
         (e*-append (in-hole E_v e*_0) e*_2)
         (where j (label-depth E))
         (where (E_outer E_v) (v-split E n)))

    In the code above, by contrast, the nesting depth is checked with the label-depth metafunction.
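    As a sketch of what label-depth computes, here is a plain Python version over a tuple encoding of the model's context grammar E ::= hole | (v E) | ((label n (e*) E) e*) (the encoding is mine, purely for illustration):

```python
def label_depth(E):
    """Count how many label forms wrap the hole of a context."""
    if E == "hole":
        return 0
    head, rest = E
    if isinstance(head, tuple) and head and head[0] == "label":
        # ((label n (e*) E_inner) e*): one more label around the hole
        return 1 + label_depth(head[3])
    # (v E_inner): a value prefix adds no label depth
    return label_depth(rest)

ctx = (("label", 0, (), (("const", "i32", 1), "hole")), ())
# label_depth(ctx) == 1
```

    The side condition (where j (label-depth E)) then plays the role of the paper's index k on the context: the rule only fires when the branch target j matches the measured nesting depth.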

    Finally, the reduction relation in wasm has a rule for operating on the local form (for function invocations), that doesn’t follow the structure of typical evaluation context based reduction.

    Specifically, the rule states that when a model state reduces as:

    s; v*; e* →_i s'; v'*; e'*,

    then you can reduce under a function’s local expression as follows:

    s; v_0*; local_n {i; v*} e* →_j s'; v_0*; local_n {i; v'*} e'*.

    That is, you can reduce under a local if you swap out the call frame and are able to reduce under the swapped out call frame.
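    Schematically, the frame-swapping shape of this rule can be sketched in plain Python (step_inner stands in for the whole reduction relation; all names here are mine, not the model's):

```python
def step_local(state, step_inner):
    """Step a (local n {i F1} body) instruction by recursively stepping
    its body under the inner frame F1, keeping the outer frame F0.
    Returns None if the body cannot step."""
    s, F0, (n, i, F1, body) = state
    inner = step_inner((s, F1, body, i))   # reduce under the swapped frame
    if inner is None:
        return None
    s2, F2, body2, _i = inner
    return (s2, F0, (n, i, F2, body2))

# a toy inner stepper that just increments a numeric body
def bump(st):
    s, F, body, i = st
    return (s, F, body + 1, i) if body < 3 else None

# step_local(("store", "outer", (0, 7, "inner", 0)), bump)
#   == ("store", "outer", (0, 7, "inner", 1))
```

    The outer frame F0 is untouched throughout; only the store and the local's own frame and body are threaded through the recursive step.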

    I think this can’t be expressed using a normal evaluation context, but it’s still possible to express in Redex as a recursive call to the reduction relation being defined. Basically, you have to call apply-reduction-relation on the right-hand side of the rule if you encounter a local form.

    To make debugging easier, you also need to use apply-reduction-relation/tag-with-names and the computed-name form for the rule name to make sure the traces form shows the right rule names.

    Here’s what the reduction rule for the local case looks like:

    ;; specifies how to reduce inside a local/frame instruction via a
    ;; recursive use of the reduction relation
    (--> (s_0 F_0 (in-hole E ((local n {i F_1} e*_0) e*_2)) j)
         (s_1 F_0 (in-hole E ((local n {i F_2} e*_1) e*_2)) j)
         ;; apply --> recursively
         (where any_rec
                ,(apply-reduction-relation/tag-with-names
                  wasm-> (term (s_0 F_1 e*_0 i))))
         ;; only apply this rule if this reduction was valid
         (side-condition (not (null? (term any_rec))))
         ;; the relation should be deterministic, so just take the first
         (where (string_tag (s_1 F_2 e*_1 i)) ,(first (term any_rec)))
         (computed-name (term string_tag)))

    by Asumu Takikawa at April 29, 2019 08:12 PM

    April 26, 2019

    Samuel Iglesias

    Igalia coding experience open positions

    The Igalia Coding Experience is a grant program which provides students with their first exposure to the professional world, working hand in hand with Igalia programmers and learning from them. The program is aimed at students with a background in Computer Science, Information Technology, or Free Software development.

    This program is a great opportunity for students willing to improve their technical skills by working in the field, learn how to contribute to open-source projects, and work together with the engineers of Igalia, a worker-owned company that has been rocking the Free Software world for more than 18 years!


    We are looking for candidates that are passionate about Free Software philosophy and willing to work on Free Software projects. If you have already contributed to any Free Software project related to our areas of specialization… that’s great! But don’t worry if you have not yet, we encourage you to apply as well!

    The conditions of the program are the following:

    • You will be mentored by one Igalian that is an expert in the respective field, so you are not going to be alone.
    • You will need to spend 450 hours working on the tasks agreed with your mentor, but you are free to distribute them along the year as fits you best. Students usually distribute them over 3 months working full-time, 6 months part-time, or even a whole year working 10 hours per week!
    • You are not going to do it for free. We will compensate you with 6500€ for all your work :)

    This year we are offering Coding Experience positions in 6 different areas:

    • Implementation of web standards. The student will become familiar with, and contribute to, the implementation of W3C standards in open source web engines.

    • WebKit, one of the most important open source web rendering engines. The student will have the opportunity to help maintain and/or contribute to the development of new features.

    • Chromium, a well-known browser rendering engine. The student will work on specific features development and/or bug-fixing. Additional tasks may include maintenance of our internal buildbots, and creation of Chromium/Wayland packages for distribution.

    • Compilers, with a focus on WebAssembly and JavaScript implementations. The student will contribute to JS engines like V8 or JSC, working on new language features, optimizations or ports.

    • Multimedia and GStreamer, the leading open source multimedia framework. The student will help develop the Video Editing stack in GStreamer (namely GES and NLE). This work will include adding new features in any part of GStreamer, GStreamer Editing Services or in the Pitivi video editor, as well as fixing bugs in any of those components.

    • Open-source graphics stack. The student will work in the development of specific features in Mesa or in improving any of the open-source testing suites (VkRunner, piglit) used in the Mesa community. Candidates who would like to propose topics of interest to work on, please include them in the cover letter.

    The last area is the one I have been working on for more than 5 years in the Graphics team at Igalia, and I am thrilled we can offer such a position this year :-)

    You can find more information about the Igalia Coding Experience program on the website… don’t forget to apply for it! Happy hacking!

    April 26, 2019 06:30 AM

    April 22, 2019

    Thibault Saunier

    GStreamer Editing Services OpenTimelineIO support


    OpenTimelineIO is an Open Source API and interchange format for editorial timeline information; it basically allows some form of interoperability between the different post-production video editing tools. It is being developed by Pixar, and several other studios are contributing to the project, allowing it to evolve quickly.

    We, at Igalia, recently landed support for the GStreamer Editing Services (GES) serialization format in OpenTimelineIO, making it possible to convert GES timelines to any format supported by the library. This is extremely useful for integrating GES into existing post-production workflows, as it allows projects in any format supported by OpenTimelineIO to be used in the GStreamer Editing Services and vice versa.

    On top of that we are building a GESFormatter that allows us to transparently handle any file format supported by OpenTimelineIO. In practice it will be possible to use cuts produced by other video editing tools in any project using GES, for instance Pitivi.

    At Igalia we are aiming at making GStreamer ready to be used in existing video post-production pipelines, and this work is one step in that direction. We are working on additional features in GES to fill the gaps toward that goal; for instance, we are now implementing nested timeline support and framerate-based timestamps in GES. Once implemented, those features will enhance compatibility with video editing projects created in other NLE software through OpenTimelineIO. Stay tuned for more information!

    by thiblahute at April 22, 2019 03:21 PM