Planet Igalia

May 16, 2022

Alejandro Piñeiro

v3dv status update 2022-05-16

We haven’t posted updates on the work done on the V3DV driver since we announced the driver becoming Vulkan 1.1 Conformant.

But after reaching that milestone, we’ve been very busy working on more improvements, so let’s summarize the work done since then.

Multisync support

As mentioned in past posts, for the Vulkan driver we tried to focus as much as possible on the userspace part. So we tried to re-use the already existing kernel interface that we had for V3D, used by the OpenGL driver, without modifying/extending it.

This worked fine in general, except for synchronization. The V3D kernel interface only supported one synchronization object per submission. This didn’t map properly to Vulkan synchronization, which is more detailed and complex, and allows defining several semaphores/fences. We initially handled the situation with workarounds, and left some optional features as unsupported.

After our 1.1 conformance work, our colleague Melissa Wen started to work on adding support for multiple semaphores on the V3D kernel side. Then she also implemented the changes on V3DV to use this new feature. If you want more technical info, she wrote a very detailed explanation on her blog (part1 and part2).

For now the driver has two codepaths that are used depending on whether the kernel supports this new feature or not. That also means that, depending on the kernel, the V3DV driver could expose a slightly different set of supported features.

More common code – Migration to the common synchronization framework

For a while, Mesa developers have been making a great effort to refactor and move common functionality to a single place, so it can be used by all drivers, reducing the amount of code each driver needs to maintain.

During these months we have been porting V3DV to some of that infrastructure, from small bits (common VkShaderModule to NIR code), to a really big one: the common synchronization framework.

As mentioned, the Vulkan synchronization model is really detailed and powerful. But that also means it is complex. V3DV support for Vulkan synchronization included heavy use of threads. For example, V3DV needed to rely on a CPU wait (polling with threads) to implement vkCmdWaitEvents, as the GPU lacked a mechanism for this.

This was common to several drivers. So at some point there were multiple versions of complex synchronization code, one per driver. But, some months ago, Jason Ekstrand refactored Anvil support and collaborated with other driver developers to create a common framework. Obviously each driver would have their own needs, but the framework provides enough hooks for that.

After some gitlab and IRC chats, Jason provided a Merge Request with the port of V3DV to this new common framework, which we iterated on and tested through the review process.

Also, with this port we got timeline semaphore support for free. Thanks to this change, we got ~1.2k fewer total lines of code (and have more features!).

Again, we want to thank Jason Ekstrand for all his help.

Support for more extensions:

Since 1.1 conformance was announced, the following extensions were implemented and exposed:

  • VK_EXT_debug_utils
  • VK_KHR_timeline_semaphore
  • VK_KHR_create_renderpass2
  • VK_EXT_4444_formats
  • VK_KHR_driver_properties
  • VK_KHR_16bit_storage and VK_KHR_8bit_storage
  • VK_KHR_imageless_framebuffer
  • VK_KHR_depth_stencil_resolve
  • VK_EXT_image_drm_format_modifier
  • VK_EXT_line_rasterization
  • VK_EXT_inline_uniform_block
  • VK_EXT_separate_stencil_usage
  • VK_KHR_separate_depth_stencil_layouts
  • VK_KHR_pipeline_executable_properties
  • VK_KHR_shader_float_controls
  • VK_KHR_spirv_1_4

If you want more details about VK_KHR_pipeline_executable_properties, Iago recently wrote a blog post about it (here).

Android support

Android support for V3DV was added thanks to the work of Roman Stratiienko, who implemented this and submitted Mesa patches. We also want to thank the Android RPi team, and the Lineage RPi maintainer (Konsta) who also created and tested an initial version of that support, which was used as the baseline for the code that Roman submitted. I didn’t test it myself (it’s in my personal TO-DO list), but LineageOS images for the RPi4 are already available.

Performance

In addition to new functionality, we have also been working on improving performance. Most of the focus was on the V3D shader compiler, as improvements to it would be shared among the OpenGL and Vulkan drivers.

But one of the features specific to the Vulkan driver (pending to be ported to OpenGL), is that we have implemented double buffer mode, only available if MSAA is not enabled. This mode would split the tile buffer size in half, so the driver could start processing the next tile while the current one is being stored in memory.

In theory this could improve performance by reducing tile store overhead, so it would be more beneficial when vertex/geometry shaders aren’t too expensive. However, it comes at the cost of reducing tile size, which also causes some overhead on its own.

Testing shows that this helps in some cases (e.g. the Vulkan Quake ports) but hurts in others (e.g. Unreal Engine 4), so for the time being we don’t enable this by default. It can be enabled selectively by adding V3D_DEBUG=db to the environment variables. The idea for the future would be to implement a heuristic that would decide when to activate this mode.

FOSDEM 2022

If you are interested in watching an overview of the improvements and changes to the driver during the last year, we made a presentation at FOSDEM 2022:
“v3dv: Status Update for Open Source Vulkan Driver for Raspberry Pi 4”

by infapi00 at May 16, 2022 09:48 AM

May 10, 2022

Melissa Wen

Multiple syncobjs support for V3D(V) (Part 2)

In the previous post, I described how we enable multiple syncobjs capabilities in the V3D kernel driver. Now I will tell you what was changed on the userspace side, where we reworked the V3DV sync mechanisms to use Vulkan multiple wait and signal semaphores directly. This change represents greater adherence to the Vulkan submission framework.

I was not used to Vulkan concepts and the V3DV driver. Fortunately, I counted on the guidance of Igalia’s Graphics team, mainly Iago Toral (thanks!), to understand the Vulkan Graphics Pipeline, sync scopes, and submission order. With that guidance, we changed the original V3DV implementation for vkQueueSubmit and all related functions to allow direct mapping of multiple semaphores from V3DV to the V3D-kernel interface.

Disclaimer: Here’s a brief and probably inaccurate background, which we’ll go into more detail later on.

In Vulkan, GPU work submissions are described as command buffers. These command buffers, with GPU jobs, are grouped in a command buffer submission batch, specified by VkSubmitInfo, and submitted to a queue for execution. vkQueueSubmit is the command called to submit command buffers to a queue. Besides command buffers, VkSubmitInfo also specifies semaphores to wait on before starting the batch execution and semaphores to signal when all command buffers in the batch are complete. Moreover, a fence in vkQueueSubmit can be signaled when all command buffer batches have completed execution.

From this sequence, we can see some implicit ordering guarantees. Submission order defines the start order of execution between command buffers; in other words, it is determined by the order in which pSubmits appear in vkQueueSubmit and pCommandBuffers appear in VkSubmitInfo. However, we don’t have any completion guarantees for jobs submitted to different GPU queues, which means they may overlap and complete out of order. Of course, jobs submitted to the same GPU engine follow start and finish order. In signal operation order, a fence is ordered after all semaphore signal operations. In addition to implicit sync, we also have some explicit sync resources, such as semaphores, fences, and events.

Considering these implicit and explicit sync mechanisms, we reworked the V3DV implementation of queue submissions to better use multiple syncobjs capabilities from the kernel. In this merge request, you can find this work: v3dv: add support to multiple wait and signal semaphores. In this blog post, we run through each scope of change of this merge request for a V3D driver-guided description of the multisync support implementation.

Groundwork and basic code clean-up:

As the original V3D-kernel interface allowed only one semaphore, V3DV resorted to booleans to “translate” multiple semaphores into one. Consequently, if a command buffer batch had at least one semaphore, it needed to wait for all submitted jobs to complete before starting its execution. So, instead of just a boolean, we created and changed structs that store semaphore information to accept the actual list of wait semaphores.

Expose multisync kernel interface to the driver:

In the two commits below, we basically updated the DRM V3D interface from the one defined in the kernel and verified whether the multisync capability is available for use.

Handle multiple semaphores for all GPU job types:

At this point, we were only changing the submission design to consider multiple wait semaphores. Before supporting multisync, V3DV was waiting for the last job submitted to be signaled when at least one wait semaphore was defined, even when serialization wasn’t required. V3DV handles GPU jobs according to the GPU queue in which they are submitted:

  • Control List (CL) for binning and rendering
  • Texture Formatting Unit (TFU)
  • Compute Shader Dispatch (CSD)

Therefore, we changed their submission setup to make jobs submitted to any GPU queue able to handle more than one wait semaphore.

These commits created all mechanisms to set arrays of wait and signal semaphores for GPU job submissions:

  • Checking the conditions to define the wait_stage.
  • Wrapping them in a multisync extension.
  • According to the kernel interface (described in the previous blog post), configure the generic extension as a multisync extension.

Finally, we extended the ability of GPU jobs to handle multiple signal semaphores, but at this point, no GPU job is actually in charge of signaling them. With this in place, we could rework part of the code that tracks CPU and GPU job completions by verifying the GPU status and threads spawned by Event jobs.

Rework the QueueWaitIdle mechanism to track the syncobj of the last job submitted in each queue:

As we had only single in/out syncobj interfaces for semaphores, we used a single last_job_sync to synchronize job dependencies of the previous submission. Although the DRM scheduler guarantees the order of starting to execute a job in the same queue in the kernel space, the order of completion isn’t predictable. On the other hand, we still needed to use syncobjs to follow job completion since we have event threads on the CPU side. Therefore, a more accurate implementation requires last_job syncobjs to track when each engine (CL, TFU, and CSD) is idle. We also needed to keep the driver working on previous versions of the v3d kernel-driver with single semaphores, so we kept tracking ANY last_job_sync to preserve the previous implementation.

Rework synchronization and submission design to let the jobs handle wait and signal semaphores:

With multiple semaphores support, the conditions for waiting and signaling semaphores changed according to the particularities of each GPU job (CL, CSD, TFU) and CPU job restrictions (Events, CSD indirect, etc.). In this sense, we redesigned V3DV semaphores handling and job submissions for command buffer batches in vkQueueSubmit.

We carefully scrutinized possible scenarios for submitting command buffer batches before changing the original implementation. It resulted in three more commits:

We keep track of whether we have submitted a job to each GPU queue (CSD, TFU, CL) and a CPU job for each command buffer. We use syncobjs to track the last job submitted to each GPU queue and a flag that indicates if this represents the beginning of a command buffer.

The first GPU job submitted to a GPU queue in a command buffer should wait on wait semaphores. The first CPU job submitted in a command buffer should call v3dv_QueueWaitIdle() to do the waiting and ignore semaphores (because it is waiting for everything).

If the job is not the first but has the serialize flag set, it should wait on the completion of the last jobs submitted to any GPU queue before running. In practice, it means using syncobjs to track the last job submitted by queue and adding these syncobjs as job dependencies of this serialized job.

If this job is the last job of a command buffer batch, it may be used to signal semaphores if this command buffer batch has only one type of GPU job (because we have guarantees of execution ordering). Otherwise, we emit a no-op job just to signal semaphores. It waits on the completion of all last jobs submitted to any GPU queue and then signals semaphores. Note: We changed this approach to correctly deal with ordering changes caused by event threads at some point. Whenever we have an event job in the command buffer, we cannot use the last job in the last command buffer assumption. We have to wait for all event threads to complete before signaling.

After submitting all command buffers, we emit a no-op job to wait on the completion of all last jobs by queue and signal the fence. Note: at some point, we changed this approach to correctly deal with ordering changes caused by event threads, as mentioned before.
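The wait and signal rules above can be sketched as a small decision helper. This is only an illustrative model under stated assumptions: the names submit_state, collect_wait_deps, record_submission and last_job_can_signal are hypothetical and do not come from V3DV, syncobjs are reduced to plain integer handles, and CPU jobs and event threads are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the rules in the text; not V3DV's actual code. */
enum gpu_queue { QUEUE_CL, QUEUE_CSD, QUEUE_TFU, QUEUE_COUNT };

struct submit_state {
    bool     submitted[QUEUE_COUNT];     /* job already sent to this queue? */
    uint32_t last_job_sync[QUEUE_COUNT]; /* syncobj of last job, 0 if none  */
};

/* Fill deps[] with the syncobjs a new GPU job on queue q must wait on. */
size_t collect_wait_deps(const struct submit_state *st, enum gpu_queue q,
                         bool serialize,
                         const uint32_t *wait_sems, size_t wait_count,
                         uint32_t *deps, size_t cap)
{
    size_t n = 0;

    assert(st != NULL && deps != NULL);
    if (!st->submitted[q]) {
        /* First job on this queue: wait on the wait semaphores. */
        for (size_t i = 0; i < wait_count && n < cap; i++)
            deps[n++] = wait_sems[i];
    } else if (serialize) {
        /* Serialized job: depend on the last job of every queue. */
        for (int i = 0; i < QUEUE_COUNT && n < cap; i++) {
            if (st->last_job_sync[i] != 0)
                deps[n++] = st->last_job_sync[i];
        }
    }
    return n;
}

/* Record a submission and the syncobj that tracks its completion. */
void record_submission(struct submit_state *st, enum gpu_queue q,
                       uint32_t out_sync)
{
    assert(st != NULL);
    st->submitted[q] = true;
    st->last_job_sync[q] = out_sync;
}

/* The last job may signal semaphores itself only when a single type of
 * GPU job was used; otherwise a no-op job would be emitted instead. */
bool last_job_can_signal(const struct submit_state *st)
{
    int used = 0;

    assert(st != NULL);
    for (int i = 0; i < QUEUE_COUNT; i++)
        used += st->submitted[i] ? 1 : 0;
    return used == 1;
}
```

A caller would invoke collect_wait_deps() before each job and record_submission() after it, then check last_job_can_signal() at the end of the batch to choose between reusing the last job and emitting a no-op job.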

Final considerations

With many changes and many rounds of reviews, the patchset was merged. After more validations and code review, we polished and fixed the implementation together with external contributions:

Also, multisync capabilities enabled us to add new features to V3DV and switch the driver to the common synchronization and submission framework:

  • v3dv: expose support for semaphore imports

    This was waiting for multisync support in the v3d kernel, which is already available. Exposing this feature however enabled a few more CTS tests that exposed pre-existing bugs in the user-space driver so we fix those here before exposing the feature.

  • v3dv: Switch to the common submit framework

    This should give you emulated timeline semaphores for free and kernel-assisted sharable timeline semaphores for cheap once you have the kernel interface wired in.

We used a set of games to ensure no performance regression in the new implementation. For this, we used GFXReconstruct to capture Vulkan API calls when playing those games. Then, we compared results with and without multisync caps in the kernelspace and also enabling multisync on v3dv. We didn’t observe any compromise in performance, and we saw improvements when replaying scenes of the vkQuake game.

May 10, 2022 09:00 AM

Multiple syncobjs support for V3D(V) (Part 1)

As you may already know, we at Igalia have been working on several improvements to the 3D rendering drivers of the Broadcom VideoCore GPU, found in Raspberry Pi 4 devices. One of our recent works focused on improving the V3D(V) drivers’ adherence to the Vulkan submission and synchronization framework. We had to cross various layers from the Linux Graphics stack to add support for multiple syncobjs to V3D(V), from the Linux/DRM kernel to the Vulkan driver. We have delivered bug fixes, a generic gate to extend job submission interfaces, and a more direct sync mapping of the Vulkan framework. These changes did not impact the performance of the tested games and brought greater precision to the synchronization mechanisms. Ultimately, support for multiple syncobjs opened the door to new features and other improvements to the V3DV submission framework.

DRM Syncobjs

But, first, what are DRM sync objs?

* DRM synchronization objects (syncobj, see struct &drm_syncobj) provide a
* container for a synchronization primitive which can be used by userspace
* to explicitly synchronize GPU commands, can be shared between userspace
* processes, and can be shared between different DRM drivers.
* Their primary use-case is to implement Vulkan fences and semaphores.
[...]
* At it's core, a syncobj is simply a wrapper around a pointer to a struct
* &dma_fence which may be NULL.

And Jason Ekstrand well-summarized dma_fence features in a talk at the Linux Plumbers Conference 2021:

A struct that represents a (potentially future) event:

  • Has a boolean “signaled” state
  • Has a bunch of useful utility helpers/concepts, such as refcount, callback wait mechanisms, etc.

Provides two guarantees:

  • One-shot: once signaled, it will be signaled forever
  • Finite-time: once exposed, it is guaranteed to signal in a reasonable amount of time
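The one-shot guarantee can be illustrated with a toy userspace model. This is a minimal sketch, not the kernel's struct dma_fence: the type toy_fence and the toy_* functions are hypothetical names, and the finite-time guarantee is a property of the producer that this model does not try to express.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a boolean state plus one optional callback. */
struct toy_fence {
    bool signaled;
    void (*callback)(void *data); /* run once, when the fence signals */
    void *callback_data;
};

void toy_fence_init(struct toy_fence *f)
{
    assert(f != NULL);
    f->signaled = false;
    f->callback = NULL;
    f->callback_data = NULL;
}

void toy_fence_on_signal(struct toy_fence *f, void (*cb)(void *), void *data)
{
    assert(f != NULL);
    f->callback = cb;
    f->callback_data = data;
}

/* One-shot: the first call signals and runs the callback; later calls
 * leave the fence signaled and return false without doing anything. */
bool toy_fence_signal(struct toy_fence *f)
{
    assert(f != NULL);
    if (f->signaled)
        return false;
    f->signaled = true;
    if (f->callback != NULL)
        f->callback(f->callback_data);
    return true;
}

/* Example callback: counts how many times it was invoked. */
void toy_count_cb(void *data)
{
    int *counter = data;
    (*counter)++;
}
```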

What does multiple semaphores support mean for Raspberry Pi 4 GPU drivers?

For our main purpose, the multiple syncobjs support means that V3DV can submit jobs with more than one wait and signal semaphore. In the kernel space, wait semaphores become explicit job dependencies to wait on before executing the job. Signal semaphores (or post dependencies), in turn, work as fences to be signaled when the job completes its execution, unlocking following jobs that depend on its completion.

The multisync support development comprised many decision-making points and steps, summarized as follows:

  • added to the v3d kernel-driver capabilities to handle multiple syncobj;
  • exposed multisync capabilities to the userspace through a generic extension;
  • reworked synchronization mechanisms of the V3DV driver to benefit from this feature;
  • enabled simulator to work with multiple semaphores; and
  • tested on Vulkan games to verify the correctness and possible performance enhancements.

We decided to refactor parts of the V3D(V) submission design in kernel-space and userspace during this development. We improved job scheduling on V3D-kernel and the V3DV job submission design. We also delivered more accurate synchronizing mechanisms and further updates in the Broadcom Vulkan driver running on Raspberry Pi 4. Here we summarize the changes in the kernel space, describing the previous state of the driver, the decisions taken, side improvements, and fixes.

From single to multiple binary in/out syncobjs:

Initially, V3D was very limited in the number of syncobjs per job submission. V3D job interfaces (CL, CSD, and TFU) only supported one syncobj (in_sync) to be added as an execution dependency and one syncobj (out_sync) to be signaled when a submission completes. The only exception was CL submission, which accepts two in_syncs (one for the binner job and another for the render job), but that didn’t change the limited options.

Meanwhile in the userspace, the V3DV driver followed alternative paths to meet Vulkan’s synchronization and submission framework. It needed to handle multiple wait and signal semaphores, but the V3D kernel-driver interface only accepts one in_sync and one out_sync. In short, V3DV had to fit multiple semaphores into one when submitting every GPU job.

Generic ioctl extension

The first decision was how to extend the V3D interface to accept multiple in and out syncobjs. We could extend each ioctl with two entries of syncobj arrays and two entries for their counters. We could create new ioctls with multiple in/out syncobj. But after examining other drivers’ solutions to extend their submission interfaces, we decided to extend V3D ioctls (v3d_cl_submit_ioctl, v3d_csd_submit_ioctl, v3d_tfu_submit_ioctl) by a generic ioctl extension.

I found a curious commit message when I was examining how other developers handled the issue in the past:

Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Mar 22 09:23:22 2019 +0000

    drm/i915: Introduce the i915_user_extension_method
    
    An idea for extending uABI inspired by Vulkan's extension chains.
    Instead of expanding the data struct for each ioctl every time we need
    to add a new feature, define an extension chain instead. As we add
    optional interfaces to control the ioctl, we define a new extension
    struct that can be linked into the ioctl data only when required by the
    user. The key advantage being able to ignore large control structs for
    optional interfaces/extensions, while being able to process them in a
    consistent manner.
    
    In comparison to other extensible ioctls, the key difference is the
    use of a linked chain of extension structs vs an array of tagged
    pointers. For example,
    
    struct drm_amdgpu_cs_chunk {
    	__u32		chunk_id;
        __u32		length_dw;
        __u64		chunk_data;
    };
[...]

So, inspired by amdgpu_cs_chunk and i915_user_extension, we opted to extend the V3D interface through a generic interface. After applying some suggestions from Iago Toral (Igalia) and Daniel Vetter, we reached the following struct:

struct drm_v3d_extension {
	__u64 next;
	__u32 id;
#define DRM_V3D_EXT_ID_MULTI_SYNC		0x01
	__u32 flags; /* mbz */
};

This generic extension has an id to identify the feature/extension we are adding to an ioctl (that maps the related struct type), a pointer to the next extension, and flags (if needed). Whenever we need to extend the V3D interface again for another specific feature, we subclass this generic extension into the specific one instead of extending ioctls indefinitely.
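To make the chaining concrete, here is a minimal userspace sketch of walking such a chain by id. The struct mirrors the one above, with <stdint.h> types standing in for __u64 and __u32; the helpers chain_extension and find_extension are hypothetical names for illustration only. A real kernel implementation would copy each node from userspace memory instead of dereferencing the pointer directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01

/* Same fields as the struct in the text, using <stdint.h> types. */
struct drm_v3d_extension {
    uint64_t next;  /* address of the next extension, 0 ends the chain */
    uint32_t id;
    uint32_t flags; /* mbz */
};

/* Hypothetical helper: link ext to the extension that follows it. */
void chain_extension(struct drm_v3d_extension *ext,
                     struct drm_v3d_extension *next)
{
    assert(ext != NULL);
    ext->next = (uint64_t)(uintptr_t)next;
}

/* Hypothetical helper: return the first extension with the given id,
 * or NULL when the chain does not contain one. */
const struct drm_v3d_extension *find_extension(uint64_t head, uint32_t id)
{
    while (head != 0) {
        const struct drm_v3d_extension *ext =
            (const struct drm_v3d_extension *)(uintptr_t)head;
        if (ext->id == id)
            return ext;
        head = ext->next;
    }
    return NULL;
}
```

Storing the link as a 64-bit integer rather than a C pointer keeps the struct layout identical for 32-bit and 64-bit userspace.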

Multisync extension

For the multiple syncobjs extension, we define a multi_sync extension struct that subclasses the generic extension struct. It has arrays of in and out syncobjs, the respective number of elements in each of them, and a wait_stage value used in CL submissions to determine which job needs to wait for syncobjs before running.

struct drm_v3d_multi_sync {
	struct drm_v3d_extension base;
	/* Array of wait and signal semaphores */
	__u64 in_syncs;
	__u64 out_syncs;

	/* Number of entries */
	__u32 in_sync_count;
	__u32 out_sync_count;

	/* set the stage (v3d_queue) to sync */
	__u32 wait_stage;

	__u32 pad; /* mbz */
};

And if a multisync extension is defined, the V3D driver ignores the previous interface of single in/out syncobjs.
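As a rough illustration of how userspace might fill in this extension before a submission, consider the sketch below. It redeclares the two structs with <stdint.h> types so that it compiles on its own; multi_sync_init is a hypothetical helper, not a V3DV function. The element type of the in_syncs and out_syncs arrays is not shown in this post, so the sketch treats both arrays as opaque pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DRM_V3D_EXT_ID_MULTI_SYNC 0x01

struct drm_v3d_extension {
    uint64_t next;
    uint32_t id;
    uint32_t flags; /* mbz */
};

struct drm_v3d_multi_sync {
    struct drm_v3d_extension base;
    uint64_t in_syncs;       /* address of the wait semaphore array   */
    uint64_t out_syncs;      /* address of the signal semaphore array */
    uint32_t in_sync_count;
    uint32_t out_sync_count;
    uint32_t wait_stage;     /* stage (v3d_queue) that waits */
    uint32_t pad;            /* mbz */
};

/* Hypothetical helper: zero the struct so every must-be-zero field is
 * respected, then tag it as a multisync extension and fill the arrays. */
void multi_sync_init(struct drm_v3d_multi_sync *ms,
                     const void *in_syncs, uint32_t in_count,
                     const void *out_syncs, uint32_t out_count,
                     uint32_t wait_stage)
{
    assert(ms != NULL);
    memset(ms, 0, sizeof(*ms));
    ms->base.id = DRM_V3D_EXT_ID_MULTI_SYNC;
    ms->in_syncs = (uint64_t)(uintptr_t)in_syncs;
    ms->out_syncs = (uint64_t)(uintptr_t)out_syncs;
    ms->in_sync_count = in_count;
    ms->out_sync_count = out_count;
    ms->wait_stage = wait_stage;
}
```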

Once we had the interface to support multiple in/out syncobjs, v3d kernel-driver needed to handle it. As V3D uses the DRM scheduler for job executions, changing from single syncobj to multiples is quite straightforward. V3D copies from userspace the in syncobjs and uses drm_syncobj_find_fence() + drm_sched_job_add_dependency() to add all in_syncs (wait semaphores) as job dependencies, i.e. syncobjs to be checked by the scheduler before running the job. On CL submissions, we have the bin and render jobs, so V3D follows the value of wait_stage to determine which job depends on those in_syncs to start its execution.

When V3D defines the last job in a submission, it replaces dma_fence of out_syncs with the done_fence from this last job. It uses drm_syncobj_find() + drm_syncobj_replace_fence() to do that. Therefore, when a job completes its execution and signals done_fence, all out_syncs are signaled too.
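Why all out_syncs are signaled together follows from the syncobj being a wrapper around a fence pointer. The toy model below sketches that idea in plain C; model_fence, model_syncobj and the model_* functions are hypothetical names and do not reproduce the kernel's drm_syncobj API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a syncobj is just a pointer to a fence, possibly NULL. */
struct model_fence {
    bool signaled;
};

struct model_syncobj {
    struct model_fence *fence;
};

/* Point every out syncobj at the last job's done fence. */
void model_attach_done_fence(struct model_syncobj *out_syncs, size_t count,
                             struct model_fence *done_fence)
{
    assert(out_syncs != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        out_syncs[i].fence = done_fence;
}

/* A syncobj counts as signaled once the fence it wraps has signaled. */
bool model_syncobj_signaled(const struct model_syncobj *so)
{
    assert(so != NULL);
    return so->fence != NULL && so->fence->signaled;
}
```

Because every out syncobj shares the same fence, a single signal on the done fence is observed through all of them at once.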

Other improvements to v3d kernel driver

This work also made possible some improvements in the original implementation. Following Iago’s suggestions, we refactored the job’s initialization code to allocate memory and initialize a job in one go. With this, we started to clean up resources more cohesively, clearly distinguishing cleanups in case of failure from job completion. We also fixed the resource cleanup when a job is aborted before the DRM scheduler arms it - at that point, drm_sched_job_arm() had recently been introduced to job initialization. Finally, we prepared the semaphore interface to implement timeline syncobjs in the future.

Going Up

The patchset that adds multiple syncobjs support and improvements to V3D is available here and comprises four patches:

  • drm/v3d: decouple adding job dependencies steps from job init
  • drm/v3d: alloc and init job in one shot
  • drm/v3d: add generic ioctl extension
  • drm/v3d: add multiple syncobjs support

After extending the V3D kernel interface to accept multiple syncobjs, we worked on V3DV to benefit from V3D multisync capabilities. In the next post, I will describe a little of this work.

May 10, 2022 08:00 AM

May 09, 2022

Iago Toral

VK_KHR_pipeline_executable_properties

Sometimes you want to go and inspect details of the shaders that are used with specific draw calls in a frame. With RenderDoc this is really easy if the driver implements VK_KHR_pipeline_executable_properties. This extension allows applications to query the driver about various aspects of the executable code generated for a Vulkan pipeline.

I implemented this extension for V3DV, the Vulkan driver for Raspberry Pi 4, last week (it is currently in the review process) because I was tired of jumping through hoops to get the info I needed when looking at traces. For V3DV we expose the NIR and QPU assembly code as well as various other stats, some of which are quite relevant to performance, such as spill or thread counts.


Some shader statistics

Final NIR code

QPU assembly

by Iago Toral at May 09, 2022 10:38 AM

May 02, 2022

Víctor Jáquez

From gst-build to local-projects

Two years ago I wrote a blog post about using gst-build inside of WebKit SDK flatpak. Well, all that has changed. That’s the true upstream spirit.

There were two main reasons for the change:

  1. Since the switch to the GStreamer mono repository, gst-build has been deprecated. The mechanism in WebKit was added, basically, to allow using upstream GStreamer, so keeping the gst-build directory just polluted the conceptual framework.
  2. By using gst-build one could override almost any other package in WebKit SDK. For example, for developing gamepad handling in WPE I added libmanette as a GStreamer subproject, to link a modified version of the library rather than the one in flatpak. But that approach added unneeded conceptual depth to the tree.

In order to simplify these operations, by taking advantage of Meson’s subproject support directly, gst-build handling was removed and a new mechanism was put in place: Local Dependencies. With local dependencies, you can add or override almost any dependency, while flattening the tree layout, by placing GStreamer and any other library at the same level. Of course, in order to add dependencies, they must be built with Meson.

For example, to override libsoup and GStreamer, just clone both repositories under Tools/flatpak/local-projects/subprojects, and declare them in the WEBKIT_LOCAL_DEPS environment variable:


$ export WEBKIT_SDK_LOCAL_DEPS=libsoup,gstreamer-full
$ export WEBKIT_SDK_LOCAL_DEPS_OPTIONS="-Dgstreamer-full:introspection=disabled -Dgst-plugins-good:soup=disabled"
$ build-webkit --wpe

by vjaquez at May 02, 2022 11:11 AM

Brian Kardell

Slightly Random Interface Thoughts

Not the normal fare for my blog, but my mind makes all kinds of weird connections after work, I suppose attempting to synthesize things. Occasionally it leads me down a road I think is interesting. Today it had me thinking about interfaces.

When I was very young, everything (TVs, radios, stereos, etc) had analog controls - mainly 2 kinds, switches and knobs. Almost everything was controlled and tuned by big physical knobs. Some of my grandparents' things were probably 25 years old by then, and that's how they worked. While "new stuff" at the time certainly had differences, there was a definite "sameness" to it. They were mainly switches and dial knobs. We were tweaking, more or less, the same old things, for a pretty long time.

That makes sense on a whole lot of levels - it has benefits. You know it works. Your users understand how to use it. It's proven.

Of course, there were improvements or experiments at the edges: Minor evolutions of varying degrees. Maybe they moved the size or shape of knobs, or gave you additional knobs for different kinds of fine-tuning.

But then, sometimes something was interestingly new. The first takes are almost always not great, but they inspire other ideas. Ideas from all over start to mix and smash together and, ultimately, we get a kind of whole new "species".

My 4k smart TV is about as different from my grandmother's TV as you can imagine. It's a few speciations removed.

Human Machine Interfaces

Really, that's the case with pretty much everything we use to interface with machines, whether it is a physical interface, or a digital one. Nothing stays entirely the same. Comparing Windows 3.1 programs to their counterparts today, for example, would show you lots of variation and evolution in controls. Last year we wrote up some documentation surrounding research while working on "tabs" and if you breeze through it a bit, you can see a bit of discussion showing a fair bit of evolution and cross influencing over the years for just that one control.

But, what connected in my mind was something else entirely: "Game controllers". I say "game controllers" because historically, again, there is a lot of cross-pollination of ideas - these things often wind up being used for far more than just games. Indeed, in many of these devices they are your primary interface to a whole immersive operating system. That's basically your entire means of interacting with Igalia's new Wolvic Browser, for example, which is geared toward XR environments like this. It's interesting how we've smashed ideas from lots of different things together here and it's got me to thinking about the changes I've seen along the way.

Brief recollections

Way back, before my time, people started making games on computers (PDPs) with SpaceWar! That's just a neat thing you can learn a bunch about here if you want and see in action.

You can see though that our ideas were very rough - maybe you could repurpose keys on a keyboard or, as they did in the video - wire on/off buttons for everything: a button to turn left, a button to turn right, a button to fire, a button to thrust, etc.

By the time I was a kid we'd popularly moved on to Pong and pong-like games which had a physical knob, initially on the device itself. Later we'd separate those into physical corded paddles and add a button. Ohh, neat.

Very quickly came many takes on a joystick with a button (or two). Again, some tried more radical takes, like ColecoVision, which had a short stick with a fat head, two side triggers at the top of a whole number pad!

But, really, "a joystick that you 'grab' with one hand and a button or two" became the dominant paradigm. They had a definite "sameness" to the Atari 2600 model.

And for the next several years most things just tweaked this a little bit. It was applied in arcades and on home computers and "game systems", but those ideas also wound up being applied to things like controlling heavy machinery.

Then, suddenly in the mid 1980s the original Nintendo was introduced here in the US. It said "Joystick? Hell no. They don't bring us joy." Instead, it introduced the d-pad, and two primary buttons, and two 'special' non-gameplay buttons arranged in the center. This was kind of the first big change in a while.

And then came some tweaking on the edges... A little bit later and we got the Super Nintendo which gave us 4 gameplay buttons instead of two.

Then came something more radical. The 64 which came with a Frankenstein dpad + tiny thumb operated joystick and nine game play buttons, as well as radically changing the physical shape into some kind of monstrosity introducing "handles". Holy crap, what a monstrosity.

Then in 1997 we got the Playstation 1 controller: A d-pad, two thumbsticks with better shape and placement and basically 8 buttons - and... better handles.

What struck me was...

Wow - 1997 was a quarter of a century ago! A bit like my grandmother's radio, while there are certainly differences, I feel like there is a definite 'sameness' to controls for game systems today. They have many other kinds of advances - haptics, pitch and roll stuff, a touch pad inspired by other innovations, and now in the ps5 some resistance stuff... But really, they are very similar at their core in terms of how you interact with them, and these have all been bolted on at the edges. Anyone who played PS1 games could pretty easily jump into a PS5 controller. And there's a lot of good to that. It clearly works well.

But then, suddenly, all of these XR devices did something new and different... Just like all of the other examples, they're clearly very inspired by the controllers that came before them, but because they really focus on the sensors as a primary mechanism rather than a secondary one, they've basically split the controller in half.

A few of my favorite games blend them rather nicely, allowing you to use the left thumbstick to walk around as you would in any game, but control your immediate body with movement. I have to say, this feels kind of more natural to me in a lot of ways - a nicer blend.

I have to wonder what kind of other uses we'll put these kinds of advancements to - how the ideas will mix together. Very possibly XR will continue trying to free you up even more, maybe these controls won't even last long there. But, they're clearly on to something, I think - and I can easily imagine these being great ways to do all sorts of things that we're still using the current "normal" gamepads for - and a lot more.

I'm excited to see what happens next.

May 02, 2022 04:00 AM

April 26, 2022

Eric Meyer

Flexibly Centering an Element with Side-Aligned Content

In a recent side project that I hope will become public fairly soon, I needed to center a left-aligned list of links inside the sides of the viewport, but also line-wrap in cases where the lines got too long (as in mobile). There are a few ways to do this, but I came up with one that was new to me. Here’s how it works.

First, let’s have a list.  Pretend each list item contains a link so that I don’t have to add in all the extra markup.

<ol>
	<li>Foreword</li>
	<li>Chapter 1: The Day I Was Born</li>
	<li>Chapter 2: Childhood</li>
	<li>Chapter 3: Teachers I Admired</li>
	<li>Chapter 4: Teenage Dreaming</li>
	<li>Chapter 5: Look Out World</li>
	<li>Chapter 6: The World Strikes Back</li>
	<li>Chapter 7: Righting My Ship</li>
	<li>Chapter 8: In Hindsight</li>
	<li>Afterword</li>
</ol>

Great. Now I want it to be centered in the viewport, without centering the text. In other words, the text should all be left-aligned, but the element containing them should be as centered as possible.

One way to do this is to wrap the <ol> element in another element like a <div> and then use flexbox:

div.toc {
	display: flex;
	justify-content: center;
}

That makes sense if you want to also vertically center the list (with align-items: center) and if you’re already going to be wrapping the list with something that should be flexed, but neither really applied in this case, and I didn’t want to add a wrapper element that had no other purpose except centering. It’s 2022, there ought to be another way, right? Right. And this is it:

ol {
	max-inline-size: max-content;
	margin-inline: auto;
}

I also could have used width there in place of max-inline-size since this is in English, so the inline axis is horizontal, but as Jeremy pointed out, it’s a weird clash to have a physical property (width) and a logical property (margin-inline) working together. So here, I’m going all-logical, which is probably better for the ongoing work of retraining myself to instinctively think in logical directions anyway.

Thanks to max-inline-size: max-content, the list can’t get any wider (more correctly: any longer along the inline axis) than the longest list item. If the container is wider than that, then margin-inline: auto means the ol element’s box will be centered in the container, as happens with any block box where the width is set to a specific amount, there’s leftover space in the container, and the side margins of the box are set to auto. This is as if I’d pre-calculated the maximum content size to be (say) 434 pixels wide and then declared max-inline-size: 434px.

The great thing here is that I don’t have to do that pre-calculation, which would be very fragile in any case. I can just use max-content instead. And then, if the container ever gets too small to fit the longest bit of content, because the ol was set to max-inline-size instead of just straight inline-size, it can fill out the container as block boxes usually do, and the content inside it can wrap to multiple lines.
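As a rough illustration of the arithmetic described above, here is a small Python sketch of how the box ends up sized and centered. The function name and the simplified model (no padding, borders, or list indentation) are assumptions made for this example; it is not how a browser engine computes layout. The 434 figure is the example max-content size mentioned earlier.

```python
def centered_box(container: float, max_content: float) -> tuple[float, float]:
    """Model a block box with max-inline-size: max-content and margin-inline: auto.

    Returns (used inline size, size of each auto margin).
    Simplified: ignores padding, borders, and writing mode.
    """
    # The box fills its container as block boxes do, capped at max-content.
    used = min(container, max_content)
    # Any leftover space is split evenly between the two auto margins.
    margin = (container - used) / 2
    return used, margin


# Wide container: the box stays at its max-content size and is centered.
print(centered_box(1000, 434))
# Narrow container: the box fills the container and its content wraps.
print(centered_box(320, 434))
```

In the narrow case the margins collapse to zero, which matches the behavior described above: the list simply fills the container and the text wraps.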

Perhaps it’s not the most common of layout needs, but if you find yourself wanting a lightweight way to center the box of an element with side-aligned content, maybe this will work for you.

What’s nice about this is that it’s one of those simple things that was difficult-to-impossible for so long, with hacks and workarounds needed to make it work at all, and now it… just works.  No extra markup, not even any calc()-ing, just a couple of lines that say exactly what they do, and are what you want them to do.  It’s a nice little example of the quiet revolution that’s been happening in CSS of late.  Hard things are becoming easy, and more than easy, simple.  Simple in the sense of “direct and not complex”, not in the sense of “obvious and basic”.  There’s a sense of growing maturity in the language, and I’m really happy to see it.


by Eric Meyer at April 26, 2022 09:31 PM

April 21, 2022

Qiuyi Zhang (Joyee)

Fixing snapshot support of class fields in V8

Up until V8 10.0, the class field initializers had been

April 21, 2022 03:28 AM

April 19, 2022

Manuel Rego

Web Engines Hackfest 2022

Once again Igalia is organizing the Web Engines Hackfest. This year the event is going to be hybrid. Though most things will happen on-site, online participation in some part of the event is going to be possible too.

Regarding dates, the hackfest will take place on June 13 & 14 in A Coruña. If you’re interested in participating, you can find more information and the registration form at the event website: https://webengineshackfest.org/2022/.

What’s the Web Engines Hackfest?

This event started a long way back. The first edition happened in 2009 when 12 folks visited the Igalia offices in A Coruña and spent a whole week there working on the WebKitGTK port. At that time, the project was in its early stages and lots of work was needed, so those joint weeks were very productive to move things forward, discuss plans and implement features.

As the event grew and more people got interested, in 2014 it was renamed to Web Engines Hackfest and started to welcome people working on different web engines. This brought the opportunity for engineers of the different browsers to come together for a few days and discuss different features.

The hackfest has continued to grow and these days we welcome anyone that is somehow involved in the web platform. In this year’s event there will be people from different parts of the web platform community, from implementors and spec editors, to people interested in some particular feature.

This event has an unconference format. People attending are the ones defining the topics, and work together in breakout sessions to discuss them. They could be issues on a particular browser, generic purpose features, new ideas, even sometimes tooling demos. In addition, we always arrange a few talks as part of the hackfest. But the most important part of the event is being together with very different folks and having the chance to discuss a variety of topics with them. There are not lots of places where people from different companies and browsers join together to discuss topics. The idea of the hackfest is to provide a venue for that to happen.

2022 edition

This year we’re hosting the event in a new place, as Igalia’s office is no longer big enough to host all the people that will be attending the event. The venue is called Palexco and it’s close to the city center and just by the seaside (with views of the port). It’s a great place with lots of spaces and big rooms, so we’ll be very comfortable there. Note that we’ll have childcare service for the ones that might need it.

New venue: Palexco (picture by Jose Luis Cernadas Iglesias)

The event is going to be 2 days this time, the 13th and 14th of June. Hopefully the weather will be great at that time of the year, and the folks visiting A Coruña should be able to really enjoy the trip. There are going to be lots of daylight hours too: sunrise is going to be around 7am and sunset past 10pm.

The registration form is still open. So far we’ve got a good amount of people registered from different companies like: Arm, Deno Land, Fission, Google, Igalia, KaiOS, Mozilla, Protocol Labs, Red Hat and Salesforce.

Arm, Google and Igalia will be sponsoring the 2022 edition, and we’re really thankful for their support! If your company is also interested in sponsoring the hackfest, please contact us at hackfest@webengineshackfest.org.

Apart from that, there are going to be some talks that will be live streamed during the event. We have a Call For Papers with a deadline at the end of this month. Talks can be on-site or remote, so if you’re interested in giving one, please fill in the form.

We know we’re in complex times and not everyone can attend onsite this year. We’re sorry about that, and we hope you all can make it in future editions.

Looking forward to the Web Engines Hackfest 2022!

April 19, 2022 10:00 PM

April 11, 2022

Byungwoo Lee

April 10, 2022

Clayton Craft

-h --help -help help --? -? ????

Scenario: Congratulations, you won the lottery! You can barely believe your eyes as you stand there holding the winning ticket! It's amazing - so many feelings rush over you as you realize that some of your dreams are within reach now! You run over, nay, you float over to the lottery office to collect your winnings in pure excitement. You push open the doors to the building, scamper up to the front desk, present your ticket to the clerk, and the exchange goes something like this:

You: Hi! I won! Here's my ticket! Where do I collect my winnings?

Clerk: Hello. I understand you would like to collect your winnings, but I'm afraid I cannot let you do that unless you ask me in a very specific way.

You: .....

Clerk: Perhaps try something like "May I ..., please?"

You: May I have my winnings, please?

Clerk: Hello. I understand you would like to collect your winnings, but I'm afraid I cannot let you do that unless you ask me in a very specific way.

You: May I collect my winnings, please?

Clerk: Congrats on winning! Here you go!

Of course this would never happen in real life, right? There's no possible situation where the above interaction would make any sense in any way.

$ podman -h
Error: pflag: help requested
See 'podman --help'

Ya... Ok. I'm picking on podman[1] above, but it's pervasive in many, many command line tools. There are innumerable ways to ask a tool for help, and this blog's title has the most common ways I've seen, though I'm quite sure there are more. Anyway, the point of this post is to talk a little about the various ways to ask for help on the command line and quickly go over pitfalls.

-h / --help

Ah, the POSIX short/long help options. These are classics. Any competent, POSIX-compliant argument parser will handle them just fine. There are command argument parsers in many, many languages that are (or claim to be) compliant. A myriad of tools use these options, so there's a good chance your users are familiar with using them to ask for help. In my humble opinion, these are the best options to support because of how pervasive support is for them. In other words, many users have been trained with plentiful tools over considerable time, and have built these into their muscle memory. There's a reason why emergency phone numbers don't change arbitrarily every time some operator wants to "disrupt" the scene, thinking they know better. When it comes to asking for help, you probably want your users to get what they need quickly so they can use your tool.

[Edit 2021-04-12] Ok, I was wrong about long options being a POSIX thing, I guess they're a GNU thing.

-help

This one might save you 1 keystroke over --help, but it breaks any attempt to support short option chaining. For example, tar -xjf becomes impossible to parse correctly if the tool expects long option names to be preceded by a single dash. Did the user mean -x -j -f? Or some option called xjf ?
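To make the ambiguity concrete, here is a minimal Python sketch of short option chaining. The helper name is made up for this illustration, and it deliberately ignores options that take a value (like the archive name after -f); it is not how tar or any particular parser is implemented.

```python
def expand_short_options(arg: str) -> list[str]:
    """Expand a chained short-option argument, e.g. '-xjf' -> ['-x', '-j', '-f'].

    Long options ('--help'), single short options ('-h') and plain words
    are returned unchanged. Simplified: no option takes a value here.
    """
    if arg.startswith("--") or not arg.startswith("-") or len(arg) < 3:
        return [arg]
    # Every character after the single dash is treated as its own flag.
    return ["-" + ch for ch in arg[1:]]


print(expand_short_options("-xjf"))   # the chained flags tar users expect
print(expand_short_options("-help"))  # the same rule turns this into -h -e -l -p
```

A parser that applies this rule cannot also treat -help as one long option without special-casing it, which is the conflict described above.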

Honestly, in practice, I've seen many tools that support -help also allow --help and -h for those who have the muscle memory reflex for those, so it's not nearly as problematic.

help

Some folks like to treat "help" as a command/verb on the command line. Some examples might include:

$ go help build

$ podman help run

Or more dramatically:

This pattern is cumbersome to deal with in practice, especially in tools that use subcommands. Instead of typing foo bar -h to get help, you have to move the cursor between foo and bar to insert help: foo help bar in order to get help about the bar subcommand. Then, once you presumably know how to use it, up-arrow, remove help from between the tool and subcommand, move to the end of the line, and continue on.

"Help" is commonly used in speech as an interjection, "Help!", and as a noun, "I need help with ____." It's also used as a verb, e.g., "Can you help me with ____?" However, I feel using it as a verb in command line tools that use the command/subcommand structure is awkward at best, as demonstrated above. It's also a verb in the following sentence: "Are you going to help me?!" Which is exactly what I feel like shouting every time I am forced to deal with tools that insist on using this pattern.

--? / -?

???? I have no idea where these came from, but my guess is that they are migrants from the wild west Windows-land, where I assume the shell won't try to expand ? into anything. Using these options will cause problems for anyone using common shells like bash, zsh, others. Don't do it.

asking for help

One final bit to end with: As in the case of podman above, if you know your user is asking for help, show them the damn help. It serves no one to chide them for not guessing the specific way your app wants them to ask for help. Better yet, support a more "common" way to allow users to ask for help if your app doesn't already. /rant
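One way to act on that advice, sketched in Python: check for every common spelling of "help" up front and print the usage text for any of them. The tool name foo, its usage string, and the helper names are hypothetical; a real tool would also need to decide whether a bare "help" word could be a legitimate argument.

```python
# Spellings taken from the title of this post.
HELP_SPELLINGS = {"-h", "--help", "-help", "help", "-?", "--?"}


def wants_help(argv: list[str]) -> bool:
    """Return True if any argument is a common way of asking for help."""
    return any(arg in HELP_SPELLINGS for arg in argv)


def main(argv: list[str]) -> int:
    if wants_help(argv):
        # Show the help instead of scolding the user about the spelling.
        print("usage: foo [-h] <command> [args...]")
        return 0
    print("running:", " ".join(argv))
    return 0


main(["bar", "-h"])     # prints the usage text
main(["help", "bar"])   # also prints the usage text
```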

  1. Handling -h aside, podman is a really great alternative to docker. I highly recommend it, for many technical and non-technical reasons!

April 10, 2022 12:00 AM

April 07, 2022

Manuel Rego

:focus-visible is shipping in Safari/WebKit

This is the final report about the work Igalia has been doing to add support for :focus-visible in WebKit. As you probably already know this work is part of the Open Prioritization campaign by Igalia that has been funded by different people and organizations. Big thanks to all of you! If you’re curious and want to know all the details you can find the previous reports on this blog.

The main highlight for this blog post is that :focus-visible has been enabled by default in WebKit (r286783). 🚀 This change was included in Safari Technology Preview 138, with its own post on the official WebKit blog, and finally reached a stable release in Safari 15.4. It’s also included in WebKitGTK 2.36 and WPE WebKit 2.36.

Open Prioritization

Let’s start from the beginning. My colleague Brian Kardell had an idea to find more diverse ways to sponsor the development of the web platform, and after some internal discussion that idea materialized into what we call Open Prioritization. In summer 2020 Igalia announced Open Prioritization, which initially had six different features on the list:

  • CSS lab() colors in Firefox
  • :focus-visible in WebKit/Safari
  • HTML inert in WebKit/Safari
  • Selector list arguments for :not() in Chrome
  • CSS Containment support in WebKit/Safari
  • CSS d (SVG path) support in Firefox

By that time I wrote a blog post about this effort and CSS Containment in WebKit proposal and my colleagues did the same for the rest of the contenders:

After some months :focus-visible was the winner. By the end of 2020 we launched the Open Prioritization Collective to collect funds and we started our work on the implementation side.

Last year at TPAC, Eric Meyer gave an awesome talk called Adventures in Collective Implementation, explaining the Open Prioritization effort and the ideas behind it. This presentation also explains why there’s room for external investments (like this one) in the web platform, and that all open source projects (in particular the web browser engines) always have to make decisions regarding priorities. Investing on them will help to influence those priorities and speed up the development of features you’re interested in.

It’s been quite a while since we started all this, but now :focus-visible is supported in WebKit/Safari, so we can consider that the first Open Prioritization experiment has been successful. When :focus-visible was first enabled by default in Safari Technology Preview early this year, there were lots of misunderstandings about how the development of this feature was funded. Happily Eric wrote a great blog post on the matter, explaining all the details and going over some of the ideas from his TPAC talk.

:focus-visible is shipping in WebKit, how did that happen?

In November last year, I gave a talk at CSS Conf Armenia about the status of things regarding :focus-visible implementation in WebKit. In that presentation I explained some of the open issues and why :focus-visible was not enabled by default yet in WebKit.

The main issue was that Apple was not convinced about not showing a focus indicator (focus ring) when clicking on a focusable element (like a <div tabindex="0">). However this is one of the main goals of :focus-visible itself: avoiding a focus indicator in such situations. As Chromium and Firefox were already doing it, and aiming for better interoperability between the different implementations, Apple finally accepted this behavioral change in WebKit.

Then Antti Koivisto reviewed the implementation, suggesting a few changes and spotting some issues (thanks for that). Those things were fixed and the feature was enabled by default in the codebase last December. As usual, once a feature is enabled some more issues appear, and they were fixed too, including even a generic issue regarding accesskey on focusable elements, which required adding support for testing accesskey in WebKit Web Platform Tests (WPT).

As part of all this work since my previous blog post we landed 9 more patches on WebKit, making a total of 36 patches for the whole feature, together with a few new WPT tests.

Buttons and :focus-visible on Safari

This topic has been mentioned in my previous posts and also in my talk. Buttons (and other form controls) are not mouse focusable in Safari (both in macOS and iOS), this means that when you click a button on Safari, the button is not focused. This behavior has the goal to match Apple platform conventions, where the focus doesn’t move when you click a button. However Safari implementation differs from the platform one, as the focus gets actually lost when you click on such elements. There are some very old issues in WebKit bugtracker about the topic (see #22261 from 2008 or #112968 from 2013 for example).

There’s a kind of coincidence related to this. Before :focus-visible existed, buttons were never showing a focus indicator in Safari after mouse click, as they are not mouse focusable. This was different in other browsers where a focus ring was shown when clicking on buttons. So while :focus-visible fixed this issue for other browsers, it didn’t change the default behavior for buttons in Safari.

However with :focus-visible implementation we introduced a problem somehow related to this. Imagine a page that has an element and when you click it, the page moves the focus via script (using HTMLElement.focus()) to a different element. Should the new focused element show a focus indicator? Or in other words, should it match :focus-visible?

The answer varies depending on whether the element clicked is or not mouse focusable:

  1. If you click on a focusable element and the focus gets moved via script to a different element, the newly focused element does NOT show a focus indicator and thus it does NOT match :focus-visible.
  2. If you click on a NON focusable element and the focus gets moved via script to a different element, the newly focused element shows a focus indicator and thus it matches :focus-visible.

All implementations agree on this, and Chromium and Firefox have been shipping this behavior for more than a year without known issues so far. But a problem appeared on Safari, because unlike in the rest of the browsers, buttons are not mouse focusable there. So when you click a button in Safari, you go to point 2) above, and end up showing a focus indicator in the newly focused element. Web authors don’t want to show a focus indicator in those situations, and that’s something that :focus-visible is fixing through point 1) in the rest of the browsers, but not in Safari (see bug #236782 for details).

We landed a workaround to fix this problem in Safari, that somehow adds an exception for buttons to follow point 1) even if they are not mouse focusable. Anyway this doesn’t look like the solution for the long term, and looking into making buttons mouse focusable on Safari might be the way to go in the future. That will also help to solve other interop issues.
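The two rules and the Safari workaround described above can be summarized in a small Python model. This is only a sketch of the decision as described in this post; the function and parameter names are invented for illustration and do not correspond to WebKit code.

```python
def script_focus_matches_focus_visible(
    clicked_is_mouse_focusable: bool,
    clicked_is_button: bool = False,
) -> bool:
    """Model: the user clicks an element and script then calls focus() on another.

    Returns True if the newly focused element matches :focus-visible.
    """
    # Point 1: clicking a mouse focusable element means no focus indicator.
    if clicked_is_mouse_focusable:
        return False
    # Safari workaround: buttons follow point 1 even though they are
    # not mouse focusable there.
    if clicked_is_button:
        return False
    # Point 2: clicking a non focusable element means a focus indicator.
    return True


# A <div tabindex="0"> was clicked: no focus ring on the new element.
print(script_focus_matches_focus_visible(True))
# A plain non focusable element was clicked: focus ring shown.
print(script_focus_matches_focus_visible(False))
# A button clicked in Safari, with the workaround applied: no focus ring.
print(script_focus_matches_focus_visible(False, clicked_is_button=True))
```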

And now what?

The feature is complete and shipped, but as usual there are some other things that could be done as next steps:

  • The :focus-visible specification is kind of vague and has no normative text about when to show a focus indicator and when not to. This was done on purpose to advance in this area and have the flexibility to adapt to user needs. Anyway, now that all 3 major web engines agree on the implementation, maybe there could be a chance to define this in some spec. We tried to write a PR for the HTML spec when we started the work on this feature; at that time it was closed, and probably it was not the right time anyway. But maybe something like that could be taken up again at some point in the future.
  • WebKit Web Inspector (Dev Tools) doesn’t allow you to force :focus-visible yet. We sent a patch for forcing :focus-within first but some UI refactoring is needed; once that’s done, adding support for :focus-visible too should be straightforward.
  • Coming back to the topic of buttons not being mouse focusable in Safari. The web platform provides a way to make elements not keyboard focusable via tabindex="-1". Why not provide a way to mark an element as not mouse focusable? Maybe there could be a proposal for a new HTML attribute that allows making elements not mouse focusable; that way websites could mimic Apple platform conventions. There are nice use cases for this, for example when you’re editing an input and then you click on some button to show some contextual information: with something like this you could avoid losing the focus from the input and carry on with your editing.

Wrap-up

So yeah, more than a year after Igalia started working on :focus-visible in WebKit, we can now consider this work completed. We can call the first Open Prioritization experiment a success, and we can celebrate together with all the people that have supported us during this achievement. 🎉

Thank you very much to all the people that sponsored this work. And also to all the people that helped reviewing patches, reporting bugs, discussing things, etc. during all this time. Without all your support we wouldn’t have been able to make this happen. 🙏

Last but not least, we’d like to highlight how this work has helped the web platform as a whole. Now the major web browser engines have shipped :focus-visible and are using it in the default UA stylesheet. This makes tweaking the focus indicator on websites easier than ever.

April 07, 2022 10:00 PM

April 06, 2022

Qiuyi Zhang (Joyee)

Uncaught exceptions in Node.js

In this post, I’ll jot down some notes that I took when refactoring the uncaught exception handling routines in Node.js. Hopefully it

April 06, 2022 07:47 AM

On deps/v8 in Node.js

I recently ran into a V8 test failure that only showed up in the V8 fork of Node.js but not in the upstream. Here I’ll write down my

April 06, 2022 07:47 AM

March 30, 2022

Samuel Iglesias

Igalia Coding Experience, GSoC, Outreachy, EVoC

Do you want to start a career in open-source? Do you want to learn amazing skills while getting paid? Keep reading!

Igalia Coding Experience

Igalia has a grant program that gives students with a background in Computer Science, Information Technology and Free Software their first exposure to the professional world, working hand in hand with Igalia programmers and learning with them. It is called Igalia Coding Experience.

While this experience is open for everyone, Igalia expressly invites women (both cis and trans), trans men, and genderqueer people to apply. The Coding Experience program gives preference to applications coming from underrepresented groups in our industry.

You can apply to any of the offered grants this year: Web Standards, WebKit, Chromium, Compilers and Graphics.

In the case of Graphics, the student will have the opportunity to deal with the Linux DRM subsystem. Specifically, the student will improve the test coverage of DRM drivers through IGT, a testing framework designed for this purpose. This includes learning how to contribute to Linux kernel/DRM, interacting with the DRI-devel community, understanding DRM core functionality, and increasing the test coverage of the IGT tool.

The conditions of our Coding Experience program are:

  • Mentorship by one of the Igalia’s outstanding open source contributors in the field.
  • It is remote-friendly. Students can participate in it wherever they live.
  • Hours: 450h
  • Compensation: 6,500€
  • Usual timetables:
    • 3 months full-time
    • 6 months part-time

The submission period goes from March 16th until April 30th. Students will be selected in May. We will work with the student to arrange a suitable starting date during 2022, from June onwards, and finishing on a date to be agreed that suits their schedule.

Google Summer of Code (GSoC)

The popular Google Summer of Code is another option for students. This year, the X.Org Foundation participates as an Open Source organization. We have some proposed ideas but you can propose any project idea as well.

Timeline for proposals is from April 4th to April 19th. However, you should contact us before in order to discuss your ideas with potential mentors.

GSoC gives a stipend to students too (from 1,500 to 6,000 USD depending on the size of the project and your location). The hours needed to complete the project vary from 175 to 350 hours, depending on the size of the project as well.

Of course, this is a remote-friendly program, so any student in the world can participate in it.

Outreachy

Outreachy is another internship program for applicants from around the world who face under-representation, systemic bias or discrimination in the technology industry of their country. Outreachy supports diversity in free and open source software!

Outreachy internships are remote, paid ($7,000), and last three months. Outreachy internships run from May to August and December to March. Applications open in January and August.

The projects listed cover many areas of the open-source software stack: from kernel to distributions work. Please check current proposals to find anything that is interesting for you!

X.Org Endless Vacation of Code (EVoC)

X.Org Foundation voted in 2008 to initiate a program known as the X.Org Endless Vacation of Code (EVoC) program, in order to give more flexibility to students: an EVoC mentorship can be initiated at any time during the calendar year, the Board can fund as many of these mentorships as it sees fit.

Like the other programs, EVoC is remote-friendly as well. The stipend goes as follows: an initial payment of 500 USD and two further payments of 2,250 USD upon completion of project milestones. EVoC does not set limits in hours, but there are some requirements and steps to do before applying. Please read X.Org Endless Vacation of Code website to learn more.

Conclusion

As you see, there are many ways to enter the Open Source community. Although I focused on programs related to the open source graphics stack, there are many others.

With all of these possibilities (and many more, including internships at companies), I hope that you can apply and that the experience will encourage you to start a career in the open-source community.

Happy hacking!

March 30, 2022 09:15 AM

Brian Kardell

UA gotta be kidding

The UA String... It's a super weird, complex string that browsers send to servers, and is mostly dealt with behind the scenes. How big a deal could it be, really? I mean... It's a string. Well, pull up a chair.

I am increasingly dealing with an ever larger number of things which involve very complex discussions, interrelationships of money, history, new standards, maybe even laws that are ultimately, somehow, about... A string. It's kind of wild to think about.

If you're interested in listening instead, I recently did an Igalia Chats podcast on this topic as well with fellow Igalians Eric Meyer and Alex Dunayev.

To understand any of this, a little background is helpful.

How did it get so complicated?

HTTP's first RFC 1945 was 1996. Section 10.15 defined the User Agent header as a tokenized string which it said wasn't required, but you should send it. Its intent was for

statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations

Seems reasonable enough, and early browsers did that exactly as expected.

So we got things like NCSA_Mosaic/2.0 (Windows 3.1), and we could count how many of our users were using that (statistical purposes).

But the web was new and there were lots of browsers popping up. Netscape came along, phenomenally well funded, intending to be a "Mosaic killer". They sent Mozilla/1.0 (Win3.1). Their IPO was the thing that really made the broad public sit up and take notice of the web. It wasn't long before they had largely been declared the winners, impossible to unseat.

However, about this same time, Microsoft licensed the Mosaic source through NCSA's partner (called Spyglass) and created the initial IE in late 1995. It sent Microsoft Internet Explorer/1.0 (Windows 3.1). Interestingly, Apple too got into the race with a browser called Cyberdog released in Feb 1996. It sent a similarly simple string like Cyberdog/2.0 (Macintosh; 68k).

While we say things were taking off fast, it's worth mentioning that most people didn't have access to a computer at all. Among those that did, only a small number of them were really capable systems with graphical UIs. So text-based browsers, like the line mode browser from CERN, which could be used in university systems, for example, really helped expand the people exposed to the bigger idea of the web. It sent a simple string like W3CLineMode/5.4.0 libwww/5.4.0.

So far, so good.

But just then, the interwebs were really starting to hit a tipping point. Netscape quickly became the Chrome of their day (more, really): Super well funded, wanting to be first, and occasionally even just making shit up and shipping it. And, as a result, they had a hella good browser (for the first time). This created a runaway market share.

Oh hai! Are UA Netscape Browser?

Now, if you were a web master in those days, the gaps and bugs between the runaway top browser and the others were kind of frustrating to manage. Netscape was really good in comparison to others. It supported frames and lots of interesting things. So, web masters just began creating two websites: A really nice one, with all the bells and whistles, and a much simpler plain one that had all of the content, but worked fine even in text-based browsers... Or just blocking others and telling them to get a real browser. And they did this via the UA string.

Not too long after this became common, many other browsers (like IE and Cyberdog) did implement framesets and started getting a lot better… But it didn't matter.

It didn't matter because people had already placed them in the "less good/doesn't support framesets and other fancy features" column. And they weren't rushing out to change it. Even if they wanted to, we all have other things to do, so it would take a long while before it would be changed everywhere.

If web masters wouldn't change, end-users wouldn't adopt. If users don't adopt, why would your organization even try to fund and compete? Perhaps you can see the chicken and egg problem that Microsoft faced at this critical stage...

And so, they lied.

IE began sending Mozilla/1.22 (compatible; MSIE 2.0; Windows 3.1).

Note that in the product token, which was intended to identify the product, they knocked on the door and identified themselves as "Mozilla". Note also that they did identify themselves as MSIE in there elsewhere.

Why? Well, it's complicated.

For one, they needed to get the content. Secondly though, they needed a way to take credit, and build on it. Finally though - intentionally or not: If you start to win, the tables can turn. Web masters might send good stuff to MSIE and something less to everyone else. So, effectively, they deployed a clever workaround that cheated the particular parsing that was employed at that time (because that's what the spec said it should do) to achieve detection. It was the thing that was in their control.
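To make the trick concrete, here is a minimal sketch of the kind of check a web master of that era might have written. The function names and the "fancy site" decision are hypothetical illustrations, not taken from any real server; only the three UA strings come from the text above.

```javascript
// Hypothetical server-side sniffing: look only at the first product token.
function firstProductToken(ua) {
  // The first whitespace-delimited token, e.g. "Mozilla/1.22".
  const [token] = ua.trim().split(/\s+/);
  const [name, version = ""] = token.split("/");
  return { name, version };
}

// Assumed policy: anything that calls itself "Mozilla" gets the nice site.
function getsTheFancySite(ua) {
  return firstProductToken(ua).name === "Mozilla";
}

const netscape = "Mozilla/1.0 (Win3.1)";
const honestIE = "Microsoft Internet Explorer/1.0 (Windows 3.1)";
const lyingIE = "Mozilla/1.22 (compatible; MSIE 2.0; Windows 3.1)";

console.log(getsTheFancySite(netscape)); // true
console.log(getsTheFancySite(honestIE)); // false
console.log(getsTheFancySite(lyingIE));  // true

// The real identity is still recoverable from the comment part.
console.log(/\bMSIE (\d+(?:\.\d+)*)/.exec(lyingIE)[1]); // "2.0"
```

The lie passes the product-token check, while anyone who cares to look inside the parentheses can still find MSIE.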

Wash, rinse, repeat (and fork)...

So, basically, this just keeps happening. Every time a browser comes along it's this problem all over again. We have to figure out a new lie that will fall through all of the right cracks in how people are currently parsing/using the UA strings. And we've got all the same pressures.

By the time you get to the release of Chrome 1.0 in 2008 it is sending something like Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.39 Safari/525.19.

Yikes. What is that Frankenstein thing?
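One way to see how many identities are stacked up in that string is to pull out its product tokens. This is only a rough sketch: the helper name is made up, and the parsing is deliberately simplistic (it just drops parenthesised comments and splits on whitespace).

```javascript
// Hypothetical helper: list the product names a UA string claims.
function productTokens(ua) {
  // Drop parenthesised comments, then keep every name/version pair.
  const withoutComments = ua.replace(/\([^)]*\)/g, " ");
  return withoutComments
    .trim()
    .split(/\s+/)
    .filter((token) => token.includes("/"))
    .map((token) => token.slice(0, token.indexOf("/")));
}

const chrome1 =
  "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 " +
  "(KHTML, like Gecko) Chrome/1.0.154.39 Safari/525.19";

console.log(productTokens(chrome1));
// [ 'Mozilla', 'AppleWebKit', 'Chrome', 'Safari' ]

// Naive substring sniffers for other browsers all match, too.
console.log(chrome1.includes("Safari")); // true
console.log(chrome1.includes("Gecko"));  // true
```

Four product names and a "like Gecko" for good measure, so that whichever substring a site happens to look for, it finds it.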

But wait! There's more!

As flawed and weird as that is, it's just the beginning of the problem, because as I say this string is useful in ways that are sometimes at odds. Perhaps unintentionally, we've also created a system of adversarial advances.

Knowing stuff about the browser does let you do useful things. But the decisioning powers available to you are mostly debatable, weird, and incomplete: You are reasoning about a thing stuck in time, which can become problematic. And so on the other end, we have to cheat.

That doesn't prevent people from wanting to know the answers to those questions or to do ever more seemingly useful things. "Useful things" can mean even something as simple as product planning and testing, as I say, even for browsers.

This goes wrong in so many ways. For example, until not long ago, everything in the world counted Samsung Internet as "Chrome". However, that's not great for Samsung, and it's not necessarily great for all websites either. It is very much not Chrome, it is Chromium-based. Its support matrix and qualities are not the same in ways that do sometimes matter, at least in the moment. The follow-on effects and ripples of that are just huge - from web masters routing content, sites making project choices, which polyfills to send, or even whether users have the inkling to want to try it - all of this is based on our perceptions of those stats.

But, it turns out that if you actually count them right - wow yes - Samsung Internet is the third most popular mobile browser worldwide, and by a good margin too! And also, a lot of stuff should have let them in the door as totally capable all along, and they should've gotten a good experience with the right polyfills too.

Even trying to keep track of all of this is gnarly, so we've built up whole industries to collect data, make sense of it and allow people to do "useful stuff" in ways that shield them from all of that. For example, if you use a popular CMS with things that let you say "if it's an iPad", or even just summarizes your stats in far more understandable ways like that - it's probably consulting one of these massive databases. Things like "whatismybrowser.com" which claims to have information about over 150 million unique UA strings in the wild.

Almost always, these systems involve mapping the UA string (including its lies) to "the facts, as we know them". These are used, often, not just for routing whole pages, but to deliver workarounds for specific devices, for example.
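The Samsung Internet miscount is easy to reproduce with a toy classifier. The sketch below is an assumption-laden illustration: the rule tables and the function are hypothetical, and the UA string is only representative of what Samsung Internet sends (it carries a SamsungBrowser/ token alongside the Chrome/ one).

```javascript
// Hypothetical rule tables: the first matching pattern wins.
const NAIVE_RULES = [
  ["Chrome", /Chrome\//],
  ["Safari", /Safari\//],
];
// Same rules, but with the more specific token checked first.
const ORDERED_RULES = [["Samsung Internet", /SamsungBrowser\//], ...NAIVE_RULES];

function classify(ua, rules) {
  for (const [name, pattern] of rules) {
    if (pattern.test(ua)) return name;
  }
  return "Unknown";
}

// Representative (not authoritative) Samsung Internet UA string.
const samsung =
  "Mozilla/5.0 (Linux; Android 10; SAMSUNG SM-G973F) AppleWebKit/537.36 " +
  "(KHTML, like Gecko) SamsungBrowser/16.0 Chrome/92.0.4515.166 Mobile Safari/537.36";

console.log(classify(samsung, NAIVE_RULES));   // "Chrome"
console.log(classify(samsung, ORDERED_RULES)); // "Samsung Internet"
```

Every stat, routing decision, or polyfill choice built on the first table silently files a different browser under "Chrome".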

God of the UA Gaps

As you can imagine, it's just gotten harder and harder to slide through the right holes. So now we have a kind of a new problem...

What happens when you have a lie that works for 95% of sites, but fails on, say, a few Alexa top 1k sites, or important properties you or your partners own?

Well, you lie differently to those ones.

That's right, there are many levels of lies. Your browser will send different UA strings to some domains, straight up spoofing another browser entirely.

Why? Because it has to. It's basically impossible to slip through all the cracks, and that's the only way to make things work for users that's in the browser's control.
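In spirit, a per-domain override is little more than a lookup table consulted before each request. The following sketch is purely illustrative: the browser name, the domains, and the override strings are invented, and real browsers ship and maintain such lists in their own formats.

```javascript
// All names and strings here are hypothetical.
const DEFAULT_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0";
const UA_OVERRIDES = new Map([
  ["video.example", "Mozilla/5.0 (X11; Linux x86_64) SomeOtherBrowser/9.0"],
]);

function userAgentFor(url) {
  const labels = new URL(url).hostname.split(".");
  // Try the full host first, then each parent domain (but not the bare TLD).
  for (let i = 0; i < labels.length - 1; i++) {
    const candidate = labels.slice(i).join(".");
    if (UA_OVERRIDES.has(candidate)) return UA_OVERRIDES.get(candidate);
  }
  return DEFAULT_UA;
}

console.log(userAgentFor("https://www.video.example/watch") === DEFAULT_UA); // false
console.log(userAgentFor("https://blog.example/") === DEFAULT_UA);           // true
```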

What if the lie isn't enough? Well, you special case another kind of lie. Maybe you force that domain into quirks mode. You have to, because while the problem is on the site, that doesn't matter to regular users - they'll blame your "crappy browser". Worse still, if you're unlucky enough to be a newbie working on a brand new site in that domain, surprise! It doesn't work like almost anything else for some reason you can't explain! So, you try to find a way, another kind of workaround... and on and on it goes.

Privacy side effects

Of course, a side effect of all of this is that ultimately all of those simple variants in the UA and work that goes into those giant databases mean that we could know an awful lot about you, by default. So that's not great.

WebKit led on privacy by getting rid of most third-party cookies way, way back. Mozilla followed. Now only Chrome still allows them, and they're trying to figure out how to follow too.

But, back in 2017, WebKit also froze the UA string. And, since then we've been working to sort out a path that strikes all of the right balances. We do an experiment, and something breaks. We talk about doing another experiment and some people get very cross. There are, after all, businesses built on the status quo.

Lots of things happening in standards (and chromium) surround trying to wrestle all of this into a manageable place. Efforts like UA reduction or Client Hints and many others are trying to find a way.
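As a rough idea of what the Client Hints direction looks like from script, here is a sketch that prefers the structured navigator.userAgentData.brands list where it exists and falls back to the raw string otherwise. The function name and the returned shape are made up for illustration, and the navigator-like objects are mocks so the snippet runs outside a browser.

```javascript
// Hypothetical helper; `nav` is a navigator-like object.
function describeBrowser(nav) {
  const data = nav.userAgentData;
  if (data && Array.isArray(data.brands)) {
    return {
      source: "client-hints",
      brands: data.brands.map((b) => `${b.brand}/${b.version}`),
    };
  }
  // No structured data: all we have is the (possibly lying) string.
  return { source: "ua-string", raw: nav.userAgent };
}

// Mock objects standing in for window.navigator.
const withHints = {
  userAgent: "Mozilla/5.0 (frozen)",
  userAgentData: { brands: [{ brand: "Chromium", version: "100" }] },
};
const withoutHints = { userAgent: "Mozilla/5.0 (frozen)" };

console.log(describeBrowser(withHints).source);    // "client-hints"
console.log(describeBrowser(withoutHints).source); // "ua-string"
```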

Obviously, it isn't easy.

Y2UA

Because of all of this complexity, there's even some worry that as browser versions hit triple digits (which once seemed it would take generations), some things could get tripped up in important ways.

There are several articles which discuss the various plans to deal with that - and, how this might involve (we hope, temporarily) some more lies.
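The kind of breakage people worry about is easy to illustrate. The two functions below are hypothetical site code, and the UA strings are just representative: one parser quietly assumes a two-digit major version, the other does not.

```javascript
// Hypothetical, buggy: assumes the major version is exactly two digits.
function majorVersionBuggy(ua) {
  const match = /Chrome\/(\d\d)/.exec(ua);
  return match ? Number(match[1]) : null;
}

// Reads all digits up to the first dot.
function majorVersion(ua) {
  const match = /Chrome\/(\d+)\./.exec(ua);
  return match ? Number(match[1]) : null;
}

const chrome99 =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36";
const chrome100 =
  "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36";

console.log(majorVersionBuggy(chrome99));  // 99
console.log(majorVersionBuggy(chrome100)); // 10
console.log(majorVersion(chrome100));      // 100
```

A site that gates features on "version at least 40" would suddenly treat version 100 as version 10 and send it down the legacy path.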

Virtual (Reality) Lies

An interesting part of this is that occasionally we spawn a whole new paradigm - like mobile devices, dual screen, foldables - or XR.

The XR space is really dominated by new standalone devices that run Android and have a default Chromium browser with, realistically, no competition. Like, none. Not just engine choice: no actively developed browser choice at all. This is always the case in new paradigms, it seems, until it isn't.

As you might know, Igalia is changing that with our new Wolvic browser. Unfortunately, a lot of really interesting things fall into this same old trap - the "enter vr" button is only presented if it is what was previously the only real choice, and everything else was considered mobile or desktop. I'm not sure if it is them, or a service or library reasoning about it that way, but that's what happens.

So guess what? We selectively have to lie.
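For what it's worth, a site can decide whether to show that button without consulting the UA string at all, by asking the WebXR API directly. This is only a sketch: the function name is invented, and the navigator-like objects below are mocks so that it runs outside a browser; navigator.xr.isSessionSupported("immersive-vr") itself is the standard WebXR call.

```javascript
// Hypothetical helper; `nav` is a navigator-like object.
async function shouldShowEnterVR(nav) {
  if (!nav.xr || typeof nav.xr.isSessionSupported !== "function") return false;
  try {
    return await nav.xr.isSessionSupported("immersive-vr");
  } catch {
    // Treat any failure (e.g. blocked by policy) as "not available".
    return false;
  }
}

// Mocks: a headset browser nobody has heard of, and a plain desktop browser.
const unknownHeadsetBrowser = {
  userAgent: "SomethingNew/1.0",
  xr: { isSessionSupported: async (mode) => mode === "immersive-vr" },
};
const desktopBrowser = { userAgent: "Mozilla/5.0" };

shouldShowEnterVR(unknownHeadsetBrowser).then((show) => console.log(show)); // true
shouldShowEnterVR(desktopBrowser).then((show) => console.log(show));        // false
```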

It's hard to overstate just how complex and intertwined this all is and what astounding amounts of money across the industry have been spent adversarially on ... a string.

March 30, 2022 04:00 AM

March 28, 2022

Joseph Griego

implementing ShadowRealm in WebKit

March 28, 2022

Igalia has been working in collaboration with Salesforce on advancing the ShadowRealm proposal through the TC39 process, and part of that work is actually getting the feature implemented in the Javascript engines and their embedders (browsers, nodejs, deno, etc.)

Since joining the compilers group at Igalia, I’ve been working (with some wonderful peers) to advance the implementation of ShadowRealms in JavaScriptCore (the Javascript engine used by WebKit, hereafter, ‘JSC’) and also integrating this functionality with WebKit proper.

You can read about some of the work done so far in the blog post Hanging in the Shadow Realm with JavaScriptCore by Phillip Mates, who implemented ShadowRealm support in JSC.

what is a ShadowRealm, anyways?

To explain what a ShadowRealm is, let’s start by explaining what a realm is more broadly:

“Realm” is from the Javascript spec, and is used to describe part of the environment a script executes in. For instance, different windows, frames, iframes, workers, and more all get their own realm to run code in.

Each realm also comes with an associated “global object”: this is where top-level identifiers are stored as properties. For example, on a typical webpage, your Javascript runs in a realm whose global object exposes window, Promise, Event, and more. (The global object is always accessible as the name globalThis.)

The usual isolation between these is informed by the mother of all browser security design principles, the same-origin policy: briefly, resources loaded from one domain (“origin”) shouldn’t normally be able to access resources from another; in the context of realms, this usually means that code running in one realm shouldn’t be able to directly access the objects associated with code running in another.
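If you want to poke at realms without a browser, Node's built-in vm module is a convenient stand-in (an assumption of this sketch: a vm context is a separate realm, but it is not a browser realm and not a security boundary). Each realm gets its own copies of the built-ins:

```javascript
const vm = require("node:vm");

// An array created inside a fresh realm...
const otherRealmArray = vm.runInNewContext("[1, 2, 3]");

// ...is still an array,
console.log(Array.isArray(otherRealmArray)); // true
// ...but its prototype chain leads to the *other* realm's Array.
console.log(otherRealmArray instanceof Array); // false

// Every new realm has its own Array constructor.
const OtherArray = vm.runInNewContext("Array");
console.log(OtherArray === Array); // false
```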

ShadowRealms are a new sandboxing primitive being added to the Javascript language, which allow Javascript code to create new realms that have similar isolation properties; these script-created realms are unique and disconnected from other realms the browser (or other host, like, node or deno) creates.

Any realm may create a new ShadowRealm:

const r = new ShadowRealm();

It’s useful to have a name for a realm that does this, we’ll steal from the proposal and call such a realm the “incubating realm” of its respective shadow realm.

cross-realm boundary enforcement

Part of the design of ShadowRealms is to provide a level of isolation around a ShadowRealm similar to what is provided today between other realms in the browser, as required by the same-origin policy. That is, code in the incubating realm (and, by extension, all other realms) should not be able to affect the content of the global object of a ShadowRealm and vice versa, except by using the ShadowRealm object directly (by calling myShadowRealm.evaluate or myShadowRealm.importValue).

This requires fairly careful scrutiny of what we allow to pass between a ShadowRealm and its incubating realm. For instance, we basically cannot allow objects to pass between them at all, since if you obtain an object o from another realm, you can play nasty tricks by abusing prototype objects to get the Function constructor from the other realm.

const o = /* obtained via magical means */

// we can play games to obtain the constructor from o's prototype, which is bad enough on its own ...
const P = Object.getPrototypeOf(o).constructor;
// we can get the constructor _of that constructor_ which will be Function, but from the wrong realm!
const Funktion = Object.getPrototypeOf(P).constructor;
// now we can do basically whatever we please to o's global object, since
// constructing a new Function with a string gives us a function with that source text.
const farawayGlobalObject = (new Funktion("return globalThis;"))();

farawayGlobalObject.Array.prototype.slice = /* insert evil here */;

Note that this game (getting at the Function constructor from the other realm via any prototype chain) works in either direction, too! (it can be used to access the ShadowRealm from the incubating as well as accessing the incubating realm from the ShadowRealm)

We want to prevent leaks of this nature, since they allow action-at-a-distance not controlled by the normal ShadowRealm interface. This is important if we want modifications to the global objects of either realm to be only performed by code deliberately loaded into that realm.
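The same game can be played for real using Node's vm module as a stand-in for "another realm" (an assumption of this sketch; vm contexts hand objects across freely, which is exactly what ShadowRealms must not do). The tampered property is a harmless marker standing in for the evil.

```javascript
const vm = require("node:vm");

const context = vm.createContext({});
// "Magical means": the other realm simply hands us one of its objects.
const o = vm.runInContext("({})", context);

const P = Object.getPrototypeOf(o).constructor;        // the other realm's Object
const Funktion = Object.getPrototypeOf(P).constructor; // the other realm's Function
const farawayGlobalObject = new Funktion("return globalThis;")();

// Action at a distance: mark the other realm's Array.prototype.
farawayGlobalObject.Array.prototype.tampered = true;

console.log(Funktion === Function);                   // false
console.log(vm.runInContext("[].tampered", context)); // true
console.log([].tampered);                             // undefined
```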

venturing into the WebCore weeds

As he describes in the post linked above, Phillip implemented ShadowRealms and the ShadowRealm interface in JSC, but we left a host hook [1] in to handle module loads and to allow the host to customize the ShadowRealm global object:

ShadowRealmObject*
ShadowRealmObject::create(VM& vm, Structure* structure, JSGlobalObject* globalObject) {
  ShadowRealmObject* object = /* ... */ ;
  /* ... */
  object->m_globalObject.set(
    vm, object,
    globalObject->globalObjectMethodTable()
    ->deriveShadowRealmGlobalObject(globalObject)
    // \__________________________________________/
    //       provided by the engine host
  );
  /* ... */
  return object;
}

When using JSC alone, deriveShadowRealmGlobalObject does little more than make a plain old new JSC::JSGlobalObject for the ShadowRealm to use. However, on WebKit, we needed to make sure it could create a JSGlobalObject that could perform module loads for the web page, and is otherwise customized to WebKit’s requirements, and that’s what we’ll describe here.

detour: wrappers for free

Central to WebKit’s use of JSC is that certain objects associated with a webpage all get associated “wrapper objects”: these are instances of the type JSC::JSObject whose job it is to translate Javascript calls to a method of the wrapper object into calls to the C++ method that implements it.

For example, in WebCore, we have an Element class which is responsible for modelling an HTML element in your web page—however, it cannot be used directly by Javascript code: that interaction is controlled by its wrapper object, which is an instance of JSElement (which is a subclass, ultimately, of JSObject)

Most wrapper classes in WebKit are, in fact, generated! Most web standards specify what Javascript objects should be available using a special language just for this purpose called WebIDL (IDL = Interface description language). For example, the WebIDL for TextEncoder looks like:

[
    Exposed=*,
] interface TextEncoder {
    constructor();

    readonly attribute DOMString encoding;

    [NewObject] Uint8Array encode(optional USVString input = "");
    TextEncoderEncodeIntoResult encodeInto(USVString source, [AllowShared] Uint8Array destination);
};

This is used during the WebKit build to produce the wrapper class, JSTextEncoder, which looks something like this: (though I am omitting a lot of boilerplate)

class JSTextEncoder : public JSDOMWrapper<TextEncoder> {
public:
    using Base = JSDOMWrapper<TextEncoder>;
  /* snip */
    static TextEncoder* toWrapped(JSC::VM&, JSC::JSValue);
  /* snip */
};

Here, the class JSDOMWrapper<TextEncoder> provides the most basic possible kind of wrapper object: the wrapper holds a reference to a TextEncoder and generated code in JSTextEncoder.cpp instructs the JS engine how to dispatch to it:

/* Hash table for prototype */

static const HashTableValue JSTextEncoderPrototypeTableValues[] = {
  { "constructor",
    static_cast<unsigned>(JSC::PropertyAttribute::DontEnum),
    NoIntrinsic,
    { (intptr_t)static_cast<PropertySlot::GetValueFunc>(jsTextEncoderConstructor),
      (intptr_t) static_cast<PutPropertySlot::PutValueFunc>(0) } },
  { "encoding",   /* snip */ },
  { "encode",     /* snip */ },
  { "encodeInto", /* snip */ },
};

JSC_DEFINE_CUSTOM_GETTER(jsTextEncoderConstructor, (JSGlobalObject* lexicalGlobalObject,
                                                    EncodedJSValue thisValue,
                                                    PropertyName))
{ /* dispatch code goes here */ }

/* much more generated code goes here, using the above */

Usually, we don’t care much about the details here; that’s why the code is generated! The relevant information is typically that calling e.g. encoder.encode from Javascript should result in a call, in C++, to the encode method on TextEncoder.

There’s also a variety of attributes we can put on IDL declarations. Some of them change the meaning of those declarations, for instance by specifying which kinds of realms they should be available in; others affect WebKit-specific aspects of the declaration and, notably, give us more control over the code generation we just described.

ShadowRealm global objects

To make sure that ShadowRealms behave appropriately in WebKit, we need to make sure that we can create a JSGlobalObject that also cooperates with the wrapping machinery in WebCore; the typical way to do this is to make the wrapper object for the realm global object an instance of WebCore::JSDOMGlobalObject: this both provides functionality to ensure that the wrappers used in that realm can be tracked and also that they are distinct from wrappers used in other realms.
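The "tracked and distinct per realm" idea can be modelled in a few lines of plain Javascript. To be clear, this is a toy analogy and not WebKit's implementation: ToyRealm, wrap, and the fake encoder are all hypothetical names, and the real machinery lives in C++.

```javascript
// Toy model: each realm keeps its own cache from implementation object to wrapper.
class ToyRealm {
  constructor(name) {
    this.name = name;
    this.wrappers = new WeakMap();
  }

  wrap(impl) {
    let wrapper = this.wrappers.get(impl);
    if (!wrapper) {
      // The wrapper only forwards calls to the implementation object.
      wrapper = { realm: this.name, encode: (text) => impl.encode(text) };
      this.wrappers.set(impl, wrapper);
    }
    return wrapper;
  }
}

// Stand-in for the unwrapped C++ object.
const encoderImpl = { encode: (text) => Array.from(text, (c) => c.charCodeAt(0)) };

const incubating = new ToyRealm("incubating");
const shadow = new ToyRealm("shadow");

console.log(incubating.wrap(encoderImpl) === incubating.wrap(encoderImpl)); // true
console.log(incubating.wrap(encoderImpl) === shadow.wrap(encoderImpl));     // false
console.log(shadow.wrap(encoderImpl).encode("hi"));                         // [ 104, 105 ]
```

Within one realm the same underlying object always yields the same wrapper; across realms the wrappers are different objects, so nothing belonging to one realm leaks into the other by way of a shared wrapper.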

For ShadowRealms we need to make sure that our new ShadowRealm global object is wrapped as a subclass of JSDOMGlobalObject; we can do this pretty directly with WebKit IDL attributes:

[
    Exposed=ShadowRealm,
    JSLegacyParent=JSShadowRealmGlobalScopeBase,
    Global=ShadowRealm,
    LegacyNoInterfaceObject,
] interface ShadowRealmGlobalScope {
    /* snip */
};

These have the meaning:

  • Exposed=ShadowRealm + LegacyNoInterfaceObject: these two together don’t make much difference: Exposed=ShadowRealm tells us that the interface should be available in ShadowRealms; LegacyNoInterfaceObject tells us that there shouldn’t actually be a globalThis.ShadowRealmGlobalScope available anywhere; so, there is, in fact, nothing really to expose… but, because this is the global object for ShadowRealms, any members on it will be available on globalThis.

  • JSLegacyParent=JSShadowRealmGlobalScopeBase tells WebKit’s code generation that the wrapper class for this interface should use our custom JSShadowRealmGlobalScopeBase (which we have yet to write) as the base class.

  • Global=ShadowRealm tells other people reading this IDL file that this interface is the global object for ShadowRealms.

Now we just need two more things: the implementation of the unwrapped ShadowRealmGlobalScope, and the implementation of our wrapper class, JSShadowRealmGlobalScopeBase.

the unwrapped global scope

We can start with the unwrapped, global object, since it ends up being simpler: the main thing we need from a ShadowRealm global object is just to be able to find our way back to the incubating realm—it turns out a convenient way to do this is to just make a new type and have it keep its incubating realm around:

class ShadowRealmGlobalScope : public RefCounted<ShadowRealmGlobalScope> {
  /* ... snip  ... */
private:
  // a (weak) pointer to the JSDOMGlobalObject that created this ShadowRealm
  JSC::Weak<JSDOMGlobalObject> m_incubatingWrapper;

  // the module loader from our incubating realm
  ScriptModuleLoader* m_parentLoader { nullptr };

  // a pointer to the JSDOMGlobalObject that wraps this realm (it's unique!)
  JSC::Weak<JSShadowRealmGlobalScopeBase> m_wrapper;

  // a separate module loader for this realm to use
  std::unique_ptr<ScriptModuleLoader> m_moduleLoader;
};

Astute readers will note that the ShadowRealmGlobalScope does not, in fact, strongly retain its incubating realm (it only holds a weak pointer); this is because the scope is itself retained by the ShadowRealmObject from above! Having the ShadowRealm global scope retain its incubating realm would form a loop of retaining pointers and therefore leak memory! Since these are WTF::RefCounted<...>, there’s no garbage collector to help out, either; we really need to avoid the reference cycle.

We can, however, get away with a weak pointer: if the incubating global object became unreachable, the only ways back into the shadow realm would be code running in the incubating realm or its event loop, and neither should be possible at that point. So the weak pointer will always be valid when we need it.

the wrapper global object

Let’s go ahead and add the wrapper class now:

class JSShadowRealmGlobalScopeBase : public JSDOMGlobalObject { /* snip */ }

… and, since we get to pick our base class, we can pick JSDOMGlobalObject instead of JSObject, how convenient! This has the effect of implicitly making other parts of the engine treat our new global object as a separate realm that requires its own wrapper objects. This doesn’t come for free, though, we have several virtual methods on JSDOMGlobalObject we are obliged to implement. Thankfully, we have another JSDOMGlobalObject around we can happily delegate to! For example:

// a shared utility to retrieve the incubating realm's global object
const JSDOMGlobalObject* JSShadowRealmGlobalScopeBase::incubatingRealm() const
{
  auto incubatingWrapper = m_wrapped->m_incubatingWrapper.get();
  ASSERT(incubatingWrapper);
  return incubatingWrapper;
}

// discharge one of our obligations by delegating to `incubatingRealm()`
//
// (this method is static; we get `this` as JSGlobalObject*, annoyingly, but
// the downcast should always succeed)
bool JSShadowRealmGlobalScopeBase::supportsRichSourceInfo(const JSGlobalObject* object)
{
  auto incubating = jsCast<const JSShadowRealmGlobalScopeBase*>(object)->incubatingRealm();
  return incubating->globalObjectMethodTable()->supportsRichSourceInfo(incubating);
}

Finally we need only to add branches in some (admittedly awkward [2]) parts of JSDOMGlobalObject for our new ShadowRealm global object, for example:

static ScriptModuleLoader* scriptModuleLoader(JSDOMGlobalObject* globalObject)
{
  /* snip */
  if (globalObject->inherits<JSShadowRealmGlobalScopeBase>(vm))
    return &jsCast<const JSShadowRealmGlobalScopeBase*>(globalObject)->wrapped().moduleLoader();
  /* snip */
}

the grand finale … almost

Now we can actually implement deriveShadowRealmGlobalObject, right? Well, not quite. It turns out <iframe> acts rather differently when the page it contains has the same origin as the parent page—in that case, their global objects are actually reachable from one another! (This came as an unpleasant surprise to me at the time …)

This won’t do for us—it breaks the invariant we described above. There’s nothing to prevent a child <iframe> from creating a new ShadowRealm and allowing it to escape to the parent frame; then the ShadowRealm can outlive its incubating realm’s global object :(

We can solve the problem by actually walking up the hierarchy of frames until we either hit the top or find one with a different origin, and use the top-most global object with the same origin, which re-establishes our invariant, since there now really should be no way for the ShadowRealm object to escape :)

JSC::JSGlobalObject* JSDOMGlobalObject::deriveShadowRealmGlobalObject(JSC::JSGlobalObject* globalObject)
{
  auto& vm = globalObject->vm();

  auto domGlobalObject = jsCast<JSDOMGlobalObject*>(globalObject);
  auto context = domGlobalObject->scriptExecutionContext();
  if (is<Document>(context)) {
    // Same-origin iframes present a difficult circumstance because the
    // ShadowRealm global object cannot retain the incubating realm's
    // global object (that would be a refcount loop); but, same-origin
    // iframes can create objects that outlive their global object.
    //
    // Our solution is to walk up the parent tree of documents as far as
    // possible while still staying in the same origin to ensure we don't
    // allow the ShadowRealm to fetch modules masquerading as the wrong
    // origin while avoiding any lifetime issues (since the topmost document
    // with a given wrapper world should outlive other objects in that
    // world)
    auto document = &downcast<Document>(*context);
    auto const& originalOrigin = document->securityOrigin();
    auto& originalWorld = domGlobalObject->world();

    while (!document->isTopDocument()) {
      auto candidateDocument = document->parentDocument();

      if (!candidateDocument->securityOrigin().isSameOriginDomain(originalOrigin))
        break;

      document = candidateDocument;
      domGlobalObject = candidateDocument->frame()->script().globalObject(originalWorld);
    }
  }
  /* snip */
  auto scope = ShadowRealmGlobalScope::create(domGlobalObject, scriptModuleLoader(domGlobalObject));
  /* snip */
}
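The walk itself is simple enough to model outside WebKit. The sketch below is a simplified stand-in written in plain Javascript: frames are bare objects with origin and parent fields (both hypothetical), and origins are compared as plain strings rather than with isSameOriginDomain.

```javascript
// Toy model of the frame walk; not WebKit code.
function topmostSameOriginFrame(frame) {
  const originalOrigin = frame.origin;
  let current = frame;
  // Climb while there is a parent and it shares the starting origin.
  while (current.parent !== null) {
    if (current.parent.origin !== originalOrigin) break;
    current = current.parent;
  }
  return current;
}

// top (a.example) > ad (b.example) > inner (b.example) > leaf (b.example)
const topFrame = { name: "top", origin: "https://a.example", parent: null };
const adFrame = { name: "ad", origin: "https://b.example", parent: topFrame };
const innerFrame = { name: "inner", origin: "https://b.example", parent: adFrame };
const leafFrame = { name: "leaf", origin: "https://b.example", parent: innerFrame };

console.log(topmostSameOriginFrame(leafFrame).name); // "ad"
console.log(topmostSameOriginFrame(topFrame).name);  // "top"
```

A ShadowRealm created in the leaf frame would be anchored to the "ad" frame's global object, the outermost one that same-origin code could reach.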

a brief note on debugging

Of course, none of the above went as smoothly as I make it sound; I ended up encountering many crashes and inscrutable error messages as I fumbled my way around WebKit internals. After printf debugging, a classic technique to interactively explore program state when in unfamiliar territory is the iconic ASSERT(false). WebKit even provides a marginally more convenient macro for this purpose, CRASH(), which proved invaluable.

Simply run your test case in a debugger and set a breakpoint on WTFCrash and you will have a convenient gdb prompt; I find it to be a fun, slightly more powerful flavor of printf-debugging :)

the road ahead

Now, we have a working ShadowRealm available in the browser!

If you’re interested in trying them out for yourself, you can find them in the latest Safari Technology Preview release!
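A quick way to check whether the engine you are running has them is a small feature-detecting snippet like this one. It is a sketch: tryShadowRealm and the shape of its result are invented for illustration, and the behaviour it probes (primitives cross the boundary, functions arrive wrapped, plain objects are refused with a TypeError) is what the proposal describes at the time of writing.

```javascript
// Hypothetical helper: probe ShadowRealm support and its boundary rules.
function tryShadowRealm() {
  if (typeof ShadowRealm !== "function") return { supported: false };

  const realm = new ShadowRealm();
  const sum = realm.evaluate("1 + 2");           // primitives may cross
  const double = realm.evaluate("(x) => x * 2"); // functions arrive wrapped

  let objectBlocked = false;
  try {
    realm.evaluate("({})");                      // objects may not cross
  } catch (error) {
    objectBlocked = error instanceof TypeError;
  }

  return { supported: true, sum, doubled: double(21), objectBlocked };
}

console.log(tryShadowRealm());
```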

However this is only part of the work for this project, because it is also planned to add certain Web interfaces to ShadowRealm contexts, and more testing coverage is needed.

exposing web interfaces

Since ShadowRealms are actually part of the Javascript standard and not a Web standard, we need to be careful about this work; ShadowRealms are supposed to be a sandbox, so it wouldn’t do much good if scripts you load into a shadow realm start mucking around with the markup on your web site!

So the interfaces that are planned to be exposed are strictly those that expose some extra computational facility to Javascript, but do not really have an effect outside of the script where they are invoked. For example, TextEncoder is quite likely to be exposed, Document is not.

A patch adding several of these APIs to ShadowRealm contexts has already landed, but probably won’t appear in Safari until after ShadowRealms do.

never enough testing

ShadowRealms are already unit tested in both the existing WebKit implementation and in test262, the test suite accompanying the Javascript standard; however, more tests are needed in WPT, the web test suite, for the correctness of the module loading support and newly exposed interfaces. Some work here is underway and should be finished in the coming few weeks.

notes


  1. “host” here refers to whatever piece of software is running Javascript with JSC; usually the host is a web browser, but it doesn’t have to be. For our purposes, “host hook” is a function that the Javascript engine cannot provide: it requires the host to cooperate in some way.

  2. The awkwardness here is that scriptModuleLoader is not actually part of the interface of JSDOMGlobalObject, but probably should be; however, we have now arrived at the delicate argument over whether or not patches like this should minimize the changes or clean up ugliness everywhere they find it: you can even see this in the code review of this patch if you look closely.

by Joseph Griego at March 28, 2022 12:00 AM

Clayton Craft

Never miss completion of a long-running command again!

This is a really short, simple thing I use to alert me when a long-running shell command/script, like building (some) containers or compiling the kernel, is done. It effectively allows me to switch context in the meantime and pick up where I left off when the long-running dependency is finished.

There are two versions of this: one triggers the shell bell after the command/script has completed, and the other uses notify-send to trigger a desktop notification. I prefer the shell bell approach most of the time, since it works nicely with my tmux setup, highlighting the window where it was triggered. It also works if there's no graphical notification daemon running.

alert () {
        "$@"; tput bel
}

And the notify-send version:

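alert () {
        # "$*" joins the command and its arguments into a single notification
        # body; notify-send accepts only a summary and one body argument
        "$@"; notify-send "ding!" "$*"
}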

These can be used by adding the function to your shell's rc script (e.g. ~/.zshrc for zsh or ~/.bashrc for bash). It may need to be adjusted for shells that use a different syntax for defining user functions.

To use it, simply run the function and pass the script+args to it, for example: $ alert make -j1 foo or whatever.

March 28, 2022 12:00 AM

March 21, 2022

Joseph Griego

hello, world

This blog is where I will put some blog-things. Enjoy this ascii-art of a cat, for now:
                zzz
___________      \
\           \ _/\___/\________
  \            = - . - =       \
  \           \                |
    \           \      _______ /
    \           \____(_______/__
      \___________\_-_-__________]

by Joseph Griego at March 21, 2022 12:00 AM

March 17, 2022

Samuel Iglesias

Igalia work within the GNU/Linux graphics stack in 2021

We had a busy 2021 within the GNU/Linux graphics stack at Igalia.

Would you like to know what we have done last year? Keep reading!

Open Source Raspberry Pi GPU (VideoCore) drivers

Raspberry Pi 4, model B

Last year both the OpenGL and the Vulkan drivers received a lot of love. For example, we implemented several optimizations, such as improvements in the v3dv pipeline cache. In this blog post, Alejandro Piñeiro presents how we improved the v3dv pipeline cache times by reducing the two cache lookups done previously to only one, and shows some numbers on both a synthetic test (modified CTS test), and some games.

We also made performance improvements to the v3d compilers for OpenGL and Vulkan. Iago Toral explains our work on optimizing the backend compiler with techniques such as improving memory lookup efficiency, reducing instruction counts, instruction packing, and uniform handling, among others. There are some numbers that show framerate improvements from ~6% to ~62% on different games / demos.

Framerate improvement after optimization (in %). Taken from Iago’s blog post.

Of course, there was work related to feature implementation. This blog post from Iago lists some Vulkan extensions implemented in the v3dv driver in 2021. Although not all the implemented extensions are listed there, you can see the driver is quickly catching up in its Vulkan extension support.

My colleague Juan A. Suárez implemented performance counters in the v3d driver (an OpenGL driver) which required modifications in the kernel and in the Mesa driver. More info in his blog post.

There was more work in other areas done in 2021 too, like the improved support for RenderDoc and GFXReconstruct. And not to forget the kernel contributions to the DRM driver done by Melissa Wen, who not only worked on developing features for it, but also reviewed all the patches that came from the community.

However, the biggest milestone for the v3dv driver was becoming Vulkan 1.1 conformant in the last quarter of 2021. That was just one year after becoming Vulkan 1.0 conformant. As you can imagine, that implied a lot of work implementing features, fixing bugs and, of course, improving the driver in many different ways. Great job folks!

If you want to know more about all the work done on these drivers during 2021, there is an awesome talk from my colleague Alejandro Piñeiro at FOSDEM 2022: “v3dv: Status Update for Open Source Vulkan Driver for Raspberry Pi 4”, and another one from my colleague Iago Toral at XDC 2021: “Raspberry Pi Vulkan driver update”. Below you can find the video recordings of both talks.

FOSDEM 2022 talk: “v3dv: Status Update for Open Source Vulkan Driver for Raspberry Pi 4”

XDC 2021 talk: “Raspberry Pi Vulkan driver update”

Open Source Qualcomm Adreno GPU drivers

Photo of the Qualcomm® Robotics RB3 Platform embedded board that I use for Turnip development.

There were also several achievements by Igalians on both the Freedreno and Turnip drivers. These are reverse-engineered open-source drivers for Qualcomm Adreno GPUs: Freedreno for OpenGL and Turnip for Vulkan.

Starting 2021, my colleague Danylo Piliaiev helped with implementing the missing bits in Freedreno for supporting OpenGL 3.3 on Adreno 6xx GPUs. His blog post explained his work, such as implementing ARB_blend_func_extended, ARB_shader_stencil_export and fixing a variety of CTS test failures.

Related to this, my colleague Guilherme G. Piccoli worked on porting a recent kernel to one of the boards we use for Freedreno development: the Inforce 6640. He did an awesome job getting a 5.14 kernel booting on that embedded board. If you want to know more, please read the blog post he wrote explaining all the issues he found and how he fixed them!

Picture of the Inforce 6640 board that Guilherme used for his development. Image from his blog post.

However, the biggest chunk of work was done in the Turnip driver. We have implemented a long list of Vulkan extensions: VK_KHR_buffer_device_address, VK_KHR_depth_stencil_resolve, VK_EXT_image_view_min_lod, VK_KHR_spirv_1_4, VK_EXT_descriptor_indexing, VK_KHR_timeline_semaphore, VK_KHR_16bit_storage, VK_KHR_shader_float16, VK_KHR_uniform_buffer_standard_layout, VK_EXT_extended_dynamic_state, VK_KHR_pipeline_executable_properties, VK_VALVE_mutable_descriptor_type, VK_KHR_vulkan_memory_model and many others. Danylo Piliaiev and Hyunjun Ko are terrific developers!

But not all our work was related to feature development. For example, I implemented the Low-Resolution Z-buffer (LRZ) hardware optimization, while Danylo fixed a long list of rendering bugs that happened in real-world applications (blog post 1, blog post 2) such as D3D games run on Vulkan (thanks to DXVK and VKD3D) and instrumented the backend compiler to dump register values, among many other fixes and optimizations.

However, the biggest achievement was getting Vulkan 1.1 conformance for Turnip. Danylo wrote a blog post mentioning all the work we did to achieve that this year.

If you want to know more, don’t miss this FOSDEM 2022 talk given by my colleague Hyunjun Ko called “The status of turnip driver development. What happened in 2021 and will happen in 2022 for turnip.”. Video below.

FOSDEM 2022 talk: “The status of turnip driver development. What happened in 2021 and will happen in 2022 for turnip.”

Vulkan contributions

Our graphics work doesn’t cover only driver development; we also participate in the Khronos Group as Vulkan Conformance Test Suite developers and even as spec contributors.

My colleague Ricardo Garcia is a very productive developer. He worked on implementing tests for Vulkan Ray Tracing extensions (read his blog post about ray tracing for more info about this big Vulkan feature), implemented tests for a long list of Vulkan extensions like VK_KHR_present_id and VK_KHR_present_wait, VK_EXT_multi_draw (watch his talk at XDC 2021), VK_EXT_border_color_swizzle (watch his talk at FOSDEM 2022) among many others. In many of these extensions, he contributed to their respective specifications in a significant way (just search for his name in the Vulkan spec!).

XDC 2021 talk: “Quick Overview of VK_EXT_multi_draw”

FOSDEM 2022 talk: “Fun with border colors in Vulkan. An overview of the story behind VK_EXT_border_color_swizzle”

Similarly, I participated modestly in this effort by developing tests for some extensions like VK_EXT_image_view_min_lod (blog post). Of course, both Ricardo and I implemented many new CTS tests by adding coverage to existing ones, fixed lots of bugs in existing tests and reported dozens of driver issues to the respective Mesa developers.

Not only that, both Ricardo and I appeared as Vulkan 1.3 spec contributors.

Vulkan 1.3

Another interesting piece of work we started in 2021 is Vulkan Video support in GStreamer. My colleague Víctor Jaquez presented the Vulkan Video extension at XDC 2021 and soon after he started working on Vulkan Video’s H.264 decoder support. You can find more information in his blog post, or by watching his XDC 2021 talk below:

XDC 2021 talk: “Video decoding in Vulkan: VK_KHR_video_queue/decode APIs”

Before I leave this section, don’t forget to take a look at Ricardo’s blog post on the debugPrintfEXT feature. If you are a graphics developer, you will find this feature very interesting for debugging issues in your applications!

Along those lines, Danylo presented at XDC 2021 a talk about dissecting and fixing Vulkan rendering issues in drivers with RenderDoc. Very useful for driver developers! Watch the talk below:

XDC 2021 talk: “Dissecting Vulkan rendering issues in drivers with RenderDoc”

To finalize this blog post, remember that you now have vkrunner (the Vulkan shader tester created by Igalia) available for RPM-based GNU/Linux distributions. In case you are working with embedded systems, maybe my blog post about cross-compiling with icecream will help to speed up your builds.

This is just a summary of the highlights of our work last year. I’m sorry if I have missed more work from my colleagues.

March 17, 2022 12:00 PM

March 16, 2022

Brian Kardell

A case for CSS-Like Languages

For many years now, it seems that hardly a week goes by where I don't wind up thinking about the same topic while reading threads. Occasionally, I bring it up in private conversations, and recently it seems some others are starting to discuss something around the edges, so I thought that I should probably write a post...

The first "S" in CSS ("Style") governs a lot of its design in both theory and practice. Much about it, ultimately, is designed toward, and limited by, constraints around potentially fast-changing visual style. As my colleague Eric Meyer cleverly noted:

[W]eb browsers are actually 60fps+ rendering environments. They’re First-Person Scrollers.

What's interesting, though, is how natural it seems to want to write things "in CSS" that don't fit into those neat little constraints. CSS is literally full of concepts which could be deployed toward other problems: Separation of concerns, sheets, media queries, selectors, pseudos, rules, ua-stylesheets, properties, functions, computed values and the complex and automatic application of all of those things.

It's been an almost regular occurrence that people desire to somehow deploy those concepts toward ends that are not, strictly speaking, about potentially fast-changing visual style.

Some of many examples

Over the years, this has taken many shapes. Sometimes we try to rationalize about how it is style, sometimes we try to shoehorn a solution. Occasionally we have even had proposals and experiments that attempted to bring some of those same concepts to bear on a different problem. Before CSS, Action Sheets proposed that actions and styles should be considered. Simple Tree Transformation Sheets offered ways to transform the DOM (that would be mentioned in Håkon's thesis on CSS itself). Shortly after, there was an attempt to add behavioral extensions to CSS - Microsoft even implemented some stuff. Another take on this introduced XBL for creating a 'binding' that could be applied via CSS. Public Web Components discussion began when trying to decide what to do with XBL, and they initially included a similar concept to bind in CSS via decorators.

A completely different angle of this was CSS Speech which reasoned that this was simply "aural styles".

Except, of course, in each of those cases the particular needs and constraints are a bit different. They shouldn't change at 60fps, in fact. Things that are kind of verboten or impossible in CSS today might be totally solvable and fine, if only they weren't somehow shoehorned into CSS proper.

CSS...ish

Several years ago, discussions like these led Elika Etemad (aka Fantasai) of the CSS Working Group to make a suggestion to Tab Atkins (now a prolific CSS editor) which yielded a sketch for something called Cascading Attribute Sheets. As they say, it's not the first such take, it's just a nicely linkable, well-informed and dated illustration of a proposal to create a CSS-like language which repurposes major concepts and parts of the architecture toward other aims. As Tab noted in his CAS post, internally, browsers do this to an extent already.

In 2012, I was on a very similar page. I was doing things at my company which did exactly this. My partner and I went about trying to decouple what we could in order to share this with Tab and others in order to hopefully participate in the discussion. We began creating a version of this called Bess on GitHub, but it was incomplete and full of some bad internal ideas. It did, ultimately, lead us to share something far more limited (HitchJS) and allowed us to begin a much bigger discussion about what made this and a whole lot of other useful things (like polyfilling something in CSS) way too hard. I even used this to create a kind of polyfill for Tab's CAS Proposal. I don't think it's particularly a great proposal as it stands - but there's clearly something there.

These discussions also led to the establishment of a new joint task force between members of the W3C Technical Architecture Group and the CSS Working Group: Houdini. In the very first meeting of this task force, the group agreed that making it possible for us to repurpose architectural aspects in order to explore "CSS-like languages" was ideal.

Now...

A lot has happened since that time, but realistically, we really haven't had a lot of time to talk about or pursue the stuff that would better enable us to explore CSS-like languages.

I think that's a real shame because we continue to have problems and ideas where at least advancing discussions on this would seem very useful.

Cascading Spicy Stuff

Consider our <spicy-sections> work in OpenUI, for example. It seems very natural to use the basic language and paradigms of CSS to express this. We're not entirely sure about some things, so we're waiting to see what pans out with the CSS Toggles.

This is also (I think naturally) shaping up larger conversations and ideas about whether we could just have "state machines" in CSS, and how we can share state and so on.

However, at the same time, it is also very unclear whether something which changes semantics and interactions at fixed points really belongs in CSS and the 60fps profile itself.

I pretty much agree with Mia here

There probably are things that work just fine in CSS - but at some point, we've entered something of an uncanny valley, and things get harder. We can't know where things should develop and live without a larger conversation.

I guess what I am trying to say, in the end, is that I love all of the conversations that are suddenly happening, and I'd love it even more if we spent some time thinking about how we might draw these lines and explore CSS-like solutions.

March 16, 2022 04:00 AM

March 14, 2022

Eric Meyer

When or If

The CSSWG (CSS Working Group) is currently debating what to name a conditional structure, and it’s kind of fascinating.  There are a lot of strong opinions, and I’m not sure how many of them are weakly held.

Boiled down to the bare bones, the idea is to take the conditional structures CSS already has, like @supports and @media, and allow more generic conditionals that combine and enhance what those structures make possible.  To pick a basic example, this:

@supports (display: grid) {
	@media (min-width: 33em) {
		…
	}
}

…would become something like this:

@conditional supports(display: grid) and media(min-width: 33em) {
	…
}

This would also be extended to allow for alternates, something like:

@conditional supports(display: grid) and media(min-width: 33em) {
	…
} @otherwise {
	…
}

Except nobody wants to have to type @conditional and @otherwise, so the WG went in search of shorter names.

The Sass-savvy among you are probably jumping up and down right now, shouting “We have that! We have that already! Just call them @if and @else and finally get on our level!”  And yes, you do have that already: Sass uses exactly those keywords.  There are some minor syntactic differences (Sass doesn’t require parentheses around the conditional tests, for example) and it’s not clear whether CSS would allow testing of variable values the way Sass does, but they’re very similar.

And that’s a problem, because if CSS starts using @if and @else, there is the potential for syntactic train wrecks.  If you’re writing with Sass, how will it tell the difference between its @if and the CSS @if?  Will you be forever barred from using CSS conditionals in Sass, if that’s what goes into CSS?  Or will Sass be forced to rename those conditionals to something else, in order to avoid clashing — and if so, how much upheaval will that create for Sass authors?

The current proposal, as I write this, is to use @when and @else in CSS Actual.  Thus, something like:

@when supports(display: grid) and media(min-width: 33em) {
	…
} @else {
	…
}

Even though there is overlap with @else, apparently starting the overall structure with @when would allow Sass to tell the difference.  So that would sidestep clashing with Sass.

But should the CSS WG even care that a third-party code base’s syntax gets trampled on by CSS syntax?  I imagine Sass authors would say, “Uh, hell yeah they should”, but does that outweigh the potential learning hurdle of all the non-Sass authors, both now and over the next few decades, learning that @when doesn’t actually have temporal meaning and is just an alias for the more recognizable if statement?

Because while it’s true that some programming languages have a when conditional structure (kOS being the one I’ve used most recently), they usually also have an if structure, and the two sometimes mean different things.  There is a view held by some that using the label when when we really mean if is a mistake, one that will stand out as a weird choice and a design blunder, 10 years hence, and will create a cognitive snag in the process of learning CSS.  Others hold the view that when is a relatively common programming term, it’s sometimes synonymous with if, every language has quirks that new learners need to learn, and it’s worth avoiding a clash with tools and authors that already exist.

If you ask me, both views are true, and that’s the real problem.  I imagine most of the participants in the discussion, even if their strong opinions are strongly held, can at least see where the other view is rooted, and sympathize with it.  And it’s very likely the case that even if Sass and other tools didn’t exist, the WG would still be having the same debate, because both terms work in context.  I suspect if would have won by now, but who knows?  Maybe not.  There have been longer debates over less fundamental concepts over the years.

A lot of my professional life has been spent explaining CSS to people new to it, so that may be why I personally lean toward @if over @when.  It’s a bit easier to explain, it looks more familiar to anyone who’s done programming at just about any level, and semantically it makes a bit more sense to me.  It’s also true that I come from a place of not having to worry about Sass changing on me, because I’ve basically never used it (or any other CSS pre-processor, for that matter) and I don’t have to do the heavy lifting of rewriting Sass to deal with this.  So, easy for me to say!

That said, I have an instinctive distrust of arguments by majority.  Yes, the number of Sass developers who’d have to adapt Sass to @if in CSS Actual is vanishingly small compared to the population of current and future CSS authors, and the number of Sass authors is likely much smaller than the number of total CSS authors.  That doesn’t automatically mean they should be discounted. It’s good to keep CSS as future-proof as possible, but it should also be kept as present-proof as possible.

The rub comes in with “as possible”, though.  This isn’t a situation where all things are possible. Something’s going to give, and there will be a group of people ill-served by the result.  Will it be Sass authors?  Future CSS learners?  Another group?  Everyone?  We’ll see!


by Eric Meyer at March 14, 2022 03:57 PM

March 03, 2022

Philip Chimento

A screenshot of calendar software showing a visual difference between one calendar event spanning 24 hours, and a second all-day event the next day.

Via Zach Holman’s blog post I found an interesting Twitter discussion that kicked off with these questions:

A couple of tough questions for all of you:
1. Is the date 2022-06-01 equal to the time 2022-06-01 12:00:00?
2. Is the date 2022-06-01 between the time 2022-06-01 12:00:00 and the time 2022-12-31 12:00:00?
3. Is the time 2022-06-01 12:00:00 after the date 2022-06-01?

I’ve been involved for two years and counting[1] in the design of Temporal, an enhancement for the JavaScript language that adds modern facilities for handling dates and times. One of the principles of Temporal that was established long before I got involved, is that we should use different objects to represent different concepts. For example, if you want to represent a calendar date that’s not associated with any specific time of day, you use a class that doesn’t require you to make up a bogus time of day.[2] Each class has a definition for equality, comparison, and other operations that are appropriate to the concept it represents, and you get to specify which one is appropriate for your use case by your choice of which one you use. In other, more jargony, words, Temporal offers different data types with different semantics.[3]

For me these questions all boil down to, when we consider a textual representation like 2022-06-01, what concept does it represent? I would say that each of these strings can represent more than one concept, and to get a good answer, you need to specify which concept you are talking about.

So, my answers to the three questions are “it depends”, “no but maybe yes”, and “it depends.” I’ll walk through why I think this, and how I would solve it with Temporal, for each question.

You can follow along or try out your own answers by going to the Temporal documentation page, and opening your browser console. That will give you an environment where you can try these examples and experiment for yourself.

Question 1

Is the date 2022-06-01 equal to the time 2022-06-01 12:00:00?

As I mentioned above, Temporal has different data types with different semantics. In the case of this question, what the question refers to as a “time” we call a “date-time” in Temporal[4], and the “date” is still a date. The specific types we’d use are PlainDateTime and PlainDate, respectively. PlainDate is a calendar date that doesn’t have a time associated with it: a single square on a wall calendar. PlainDateTime is a calendar date with a wall-clock time. In both cases, “plain” refers to not having a time zone attached, so we know we’re not dealing with any 23-hour or 25-hour or even more unusual day lengths.

The reason I say that the answer depends, is that you simply can’t say whether a date is equal to a date-time. They are two different concepts, so the answer is not well-defined. If you want to do that, you have to convert one to the other so that you either compare two dates, or two date-times, each with their accompanying definition of equality.

You do this in Temporal by choosing the type of object to create, PlainDate or PlainDateTime, and the resulting object’s equals() method will do the right thing:

> Temporal.PlainDate.from('2022-06-01').equals('2022-06-01 12:00:00')
true
> Temporal.PlainDateTime.from('2022-06-01').equals('2022-06-01 12:00:00')
false

I think either of PlainDate or PlainDateTime semantics could be valid based on your application, so it seems important that both are within reach of the programmer. I will say that I don’t expect PlainDateTime will get used very often in practice.[5] But I can think of a use case for either one of these:

  • If you have a list of PlainDateTime events to present to a user, and you want to filter them by date. Let’s say we have data from a pedometer, where we care about what local time it was in the user’s time zone when they got their exercise, and the user has asked to see all the exercise they got yesterday. In this case I’d use date semantics: convert the PlainDateTime data to PlainDate data.
  • On the other hand, if the 2022-06-01 input comes from a date picker widget where the user could have input a time but didn’t, then we might decide that it makes sense to default the time of day to midnight, and therefore use date-time semantics.
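
To make the two semantics concrete without needing Temporal, here is a minimal sketch in plain JavaScript. The helpers parseFields, dateKey, dateTimeKey, equalAsDates and equalAsDateTimes are hypothetical names made up for this illustration, not Temporal APIs, and the sketch assumes well-formed strings in the two formats used in the questions. It only shows the idea: date semantics throw away the time of day, while date-time semantics fill in midnight when the time is missing.

```javascript
// Hypothetical helpers for illustration only; none of this is Temporal API.

// Split "YYYY-MM-DD" or "YYYY-MM-DD hh:mm[:ss]" into string fields.
// A missing time of day defaults to midnight (an assumption of this sketch).
function parseFields(text) {
  const pattern = /^(\d{4})-(\d{2})-(\d{2})(?:[ T](\d{2}):(\d{2})(?::(\d{2}))?)?$/;
  const match = pattern.exec(text);
  if (!match) throw new RangeError(`unsupported format: ${text}`);
  const [, year, month, day, hour = '00', minute = '00', second = '00'] = match;
  return { year, month, day, hour, minute, second };
}

// Date semantics: keep only the calendar date.
function dateKey(text) {
  const { year, month, day } = parseFields(text);
  return `${year}-${month}-${day}`;
}

// Date-time semantics: keep the date and the (possibly defaulted) time.
function dateTimeKey(text) {
  const { year, month, day, hour, minute, second } = parseFields(text);
  return `${year}-${month}-${day}T${hour}:${minute}:${second}`;
}

const equalAsDates = (a, b) => dateKey(a) === dateKey(b);
const equalAsDateTimes = (a, b) => dateTimeKey(a) === dateTimeKey(b);

console.log(equalAsDates('2022-06-01', '2022-06-01 12:00:00'));     // true
console.log(equalAsDateTimes('2022-06-01', '2022-06-01 12:00:00')); // false
```

The point of the sketch is that the caller has to pick one of the two functions, which is roughly the choice Temporal asks you to make by picking PlainDate or PlainDateTime.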

Question 2

Is the date 2022-06-01 between the time 2022-06-01 12:00:00 and the time 2022-12-31 12:00:00?

I think the answer to this one is more unambiguously a no. If we use date-time semantics (in Temporal, PlainDateTime.compare()) the date implicitly converts to midnight on that day, so it comes before both of the date-times. If we use date semantics (PlainDate.compare()), 2022-06-01 and 2022-06-01 12:00:00 are equal as we determined in Question 1, so I wouldn’t say it’s “between” the two date-times.

> Temporal.PlainDateTime.compare('2022-06-01', '2022-06-01 12:00:00')
-1
> Temporal.PlainDateTime.compare('2022-06-01', '2022-12-31 12:00:00')
-1
> Temporal.PlainDate.compare('2022-06-01', '2022-06-01 12:00:00')
0
> Temporal.PlainDate.compare('2022-06-01', '2022-12-31 12:00:00')
-1

(Why these numbers?[6] The compare methods return −1, 0, or 1, according to the convention used by Array.prototype.sort, so that you can do things like arr.sort(Temporal.PlainDate.compare). 0 means the arguments are equal and −1 means the first comes before the second.)
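
As a rough illustration of that convention, here is a sketch of a comparator that can be passed straight to Array.prototype.sort. The function compareByDate is a hypothetical stand-in written for this example, not Temporal.PlainDate.compare. It assumes ISO-style strings with four-digit years, which sort correctly as plain text, and it applies date semantics by looking only at the first ten characters.

```javascript
// Hypothetical comparator for illustration; it follows the -1/0/1 convention
// but is NOT Temporal.PlainDate.compare.
function compareByDate(a, b) {
  // Keep only "YYYY-MM-DD" and ignore any time of day (date semantics).
  const left = a.slice(0, 10);
  const right = b.slice(0, 10);
  if (left < right) return -1; // a comes before b
  if (left > right) return 1;  // a comes after b
  return 0;                    // same calendar date
}

const events = ['2022-12-31 12:00:00', '2022-06-01', '2021-01-15 08:30:00'];

// Copy before sorting so the original array is left untouched.
const sorted = [...events].sort(compareByDate);
console.log(sorted); // earliest calendar date first

console.log(compareByDate('2022-06-01', '2022-06-01 12:00:00')); // 0
console.log(compareByDate('2022-06-01', '2022-12-31 12:00:00')); // -1
```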

But maybe the answer still depends a little bit on what your definition of “between” is. If it means the date-times form a closed interval instead of an open interval, and we are using date semantics, then the answer is yes.[7]
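
The open versus closed distinction can be sketched as a small helper built on top of any comparator that follows the −1/0/1 convention. Everything here (isBetween, byDate, byDateTime, atMidnight and the inclusive option) is a hypothetical illustration in plain JavaScript rather than anything Temporal provides, and it again assumes ISO-style strings with four-digit years.

```javascript
// Hypothetical helper; `compare` is any comparator returning -1, 0, or 1.
function isBetween(value, start, end, compare, { inclusive = false } = {}) {
  const versusStart = compare(value, start);
  const versusEnd = compare(value, end);
  return inclusive
    ? versusStart >= 0 && versusEnd <= 0 // closed interval [start, end]
    : versusStart > 0 && versusEnd < 0;  // open interval (start, end)
}

const compareStrings = (a, b) => (a < b ? -1 : a > b ? 1 : 0);

// Date semantics: compare only the "YYYY-MM-DD" prefix.
const byDate = (a, b) => compareStrings(a.slice(0, 10), b.slice(0, 10));

// Date-time semantics: a bare date is assumed to mean midnight.
const atMidnight = (text) => (text.length === 10 ? `${text} 00:00:00` : text);
const byDateTime = (a, b) => compareStrings(atMidnight(a), atMidnight(b));

const date = '2022-06-01';
const start = '2022-06-01 12:00:00';
const end = '2022-12-31 12:00:00';

console.log(isBetween(date, start, end, byDate));                          // false
console.log(isBetween(date, start, end, byDate, { inclusive: true }));     // true
console.log(isBetween(date, start, end, byDateTime, { inclusive: true })); // false
```

Only the combination of date semantics and a closed interval gives a yes, which matches the "no but maybe yes" answer.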

Question 3

Is the time 2022-06-01 12:00:00 after the date 2022-06-01?

After thinking about the previous two questions, this should be clear. If we’re using date semantics, the two are equal, so no. If we’re using date-time semantics, and we choose to convert a date to a date-time by assuming midnight as the time, then yes.

Other people’s answers

I saw a lot of answers saying that you need more context to be able to compare the two, so I estimate that the way Temporal requires that you give that context, instead of assuming one or the other, does fit with the way that many people think. However, that wasn’t the only kind of reply I saw. (Otherwise the discussion wouldn’t have been that interesting!) I’ll discuss some of the other common replies that I saw in the Twitter thread.

“Yes, no, no: truncate to just the dates and compare those, since that’s the data you have in common.” People who said this seem like they might naturally gravitate towards date semantics. I’d estimate that date semantics are probably correct for more use cases. But maybe not your use case!

“No, no, yes: a date with no time means midnight is implicit.” People who said this seem like they might naturally gravitate towards date-time semantics. It makes sense to me that programmers think this way; if you’re missing a piece of data, just fill in 0 and keep going. I’d estimate that this isn’t how a lot of nontechnical users think of dates, though.

Throughout this post I’ve assumed that the time is midnight when we convert a date to a date-time, but in the messy world of dates and times, it can make sense to assume other times than midnight, as well. This comes up especially if time zones are involved. For example, you might assume noon, or start-of-day, instead. Start-of-day is often, but not always, midnight:

Temporal.PlainDateTime.from('2018-11-04T12:00')
  .toZonedDateTime('America/Sao_Paulo')
  .startOfDay()
  .toPlainTime()  // -> 01:00

“These need to have time zones attached for the question to make sense.” If this is your first reaction when you see a question like this, great! If you write JavaScript code, you probably write fewer bugs just by being aware that JavaScript’s Date object makes it really easy to confuse time zones.

I estimate that Temporal’s ZonedDateTime type is going to fit more use cases in practice than either PlainDate or PlainDateTime. In that sense, if you find yourself with this data and these questions in your code, it makes perfect sense to ask yourself whether you should be using a time-zone-aware type instead. But, I think I’ve given some evidence above that sometimes the answer to that is no: for example, the pedometer data that I mentioned above.

“Dates without times are 24-hour intervals.” Also mentioned as “all-day events”. I can sort of see where this comes from, but I’m not sure I agree with it. In the world where JavaScript Date is the only tool you have, it probably makes sense to think of a date as an interval. But I’d estimate that a lot of non-programmers don’t think of dates this way: instead, it’s a square on your calendar!

It’s also worth noting that in some calendar software, you can create an all-day event that lasts from 00:00 until 00:00 the following day, and you can also create an event for just the calendar date, and these are separate things.

A 24-hour interval and a calendar date. Although notably, Google Calendar collapses the 24-hour event into a calendar-date event if you do this.

“Doesn’t matter, just pick one convention and stick with it.” I hope after reading this post you’re convinced that it does matter, depending on your use case.

“Ugh!” That’s how I feel too and why I wrote a whole blog post about it!

How do I feel about the choices we made in Temporal?

I’m happy with how Temporal encourages the programmer to handle these cases. When I went to try out the comparisons that were suggested in the original tweet, I found it was natural to pick either PlainDate or PlainDateTime to represent the data.

One thing that Temporal could have done instead (and in fact, we went back and forth on this a few times before the proposal reached its currently frozen stage in the JS standardization process) would be to make the choice of data type, and therefore of comparison semantics, more explicit.

For example, one might make a case that it’s potentially confusing that the 12:00:00 part of the string in Temporal.PlainDate.from('2022-06-01').equals('2022-06-01 12:00:00') is ignored when the string is converted to a PlainDate. We could have chosen, for example, to throw if the argument to PlainDate.prototype.equals() was a string with a time in it, or if it was a PlainDateTime. That would make the code for answering question 1 look like this:

> Temporal.PlainDate.from('2022-06-01').equals(
... Temporal.PlainDateTime.from('2022-06-01 12:00:00')
... .toPlainDate())
true

This approach seems like it’s better at forcing the programmer to make a choice consciously by throwing exceptions when there is any doubt, but at the cost of writing such long-winded code that I find it difficult to follow. In the end, I prefer the more balanced approach we took.

Conclusion

This was a really interesting problem to dig into. I always find it good to be reminded that no matter what I think is correct about date-time handling, someone else is going to have a different opinion, and they won’t necessarily be wrong.

I said in the beginning of the post: “to get a good answer, you need to specify which concept you are talking about.” Something we’ve tried hard to achieve in Temporal is to make it easy and natural, but not too obtrusive, to specify this. When I went to answer the questions using Temporal code, I found it pretty straightforward, and I think that validates some of the design choices we made in Temporal.

I’d like to acknowledge my employer Igalia for letting me spend work time writing this post, as well as Bloomberg for sponsoring Igalia’s work on Temporal. Many thanks to my colleagues Tim Chevalier, Jesse Alama, and Sarah Groff Hennigh-Palermo for giving feedback on a draft of this post.


[1] 777 days at the time of writing, according to Temporal.PlainDate.from('2020-01-13').until(Temporal.Now.plainDateISO()) ↩

[2] A common source of bugs with JavaScript’s legacy Date when the made-up time of day doesn’t exist due to DST ↩

[3] “Semantics” is, unfortunately, a word I’m going to use a lot in this post ↩

[4] “Time” in Temporal refers to a time on a clock face, with no date associated with it ↩

[5] We even say this on the PlainDateTime documentation page ↩

[6] We don’t have methods like isBefore()/isAfter() in Temporal, but this is a place where they’d be useful. These methods seem like good contenders for a follow-up proposal in the future ↩

[7] Intervals bring all sorts of tricky questions too! Some other date-time libraries have interval objects. We also don’t have these in Temporal, but are likewise open to a follow-up proposal in the future ↩

by Philip Chimento at March 03, 2022 02:03 AM

February 20, 2022

Hyunjun Ko

A Complementary Story to the FOSDEM 2022 Talk

On 6th February, I was fortunate to give a lightning talk in the graphics devroom at FOSDEM 2022. There I presented a summary of the status of Turnip driver development in Mesa, and I’d like to write down some more things that I couldn’t say in the talk, as a complement to it.

History of the development.

To repeat it here: Turnip is the code name of the open-source Vulkan driver for Qualcomm Adreno GPUs. I dived into this GPU in 2018 and participated in developing the Freedreno OpenGL driver, which was founded and is led by Rob Clark. Developing a user-space graphics driver was new to me at that time, and it was a good chance to learn how GPUs work, how to develop a graphics driver and how to communicate with the Mesa community. Thanks to my contributions to the Freedreno driver at that time, I became a committer in the Mesa community. At the beginning of 2020, Igalia, including me, started working on the open-source Vulkan driver, that is, the “Turnip” driver. Actually, Turnip development had already been started in 2018 by people at Google, and we were definitely lucky to get a chance to participate in it.

As I said in the talk, the driver was immature at the beginning, though it worked fine, so lots of extensions and features have been implemented since 2020. Jonathan Marek and Connor Abbott played an important role at this time (and still do): they implemented very basic and essential features from scratch (and copied from Freedreno where necessary). So Igalia could get involved in implementing extensions and features on top of what they had done. In 2021, we could speed up the progress thanks to our accumulated experience and knowledge. In particular, Danylo joining was a great addition to the project, and he has been doing a great job ever since. I won’t go into details about this here, but you can refer to the blogs of Samuel and Danylo, who are my great colleagues at Igalia. It’s a pity that there are no posts about my own work, though. Hopefully I can write some more posts about it in the near future. (I already have a list of things to write! XD)

One of the main goals: Playing games!

I’d especially like to highlight this; it is probably the reason I am writing this post.

I presented the following points under “What happened in 2021”:

  • Make Windows games run with DXVK/VKD3D on Linux/Arm
    • with x86 emulators (FEX, Box86)
    • Some Windows games started running!

And what about these, under “What’ll happen in 2022”?

  • Focusing on real world use cases.
    • Still not enough games running on arm.
    • Trying to run more Windows games via wine(proton)

Cool! Isn’t it?

Actually, I realized many people out there are very interested in this, that is, in “making games run on linux/arm”, including, of course, Windows/x86 games. I know people want to test this themselves with their own devices if possible, but there are some obstacles:

  • First, Qualcomm devices are very expensive compared to the Raspberry Pi or other arm devices.
  • Second, the setup is tricky to put together and there’s no good documentation for it.
  • Third, even once you’ve completed the setup, it is not yet stable enough to run every game.

Honestly, I’m not going to write how-to details in this post. Instead, I’d like to show some of these efforts as examples of what we’ve tried. I said in the talk that we’ve been trying this complicated setup because we want to test turnip with real-world cases, but there are not enough native games running on linux/arm yet. Here are the cases we’ve been trying.

The first case is here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4738.

One day in 2021, an issue was raised in the community. As you can see in the system information, someone was trying to run Windows games with turnip on a Qualcomm device.

There were 2 kinds of setups on Android. The first, and most complicated, uses virgl-vtest to get OpenGL calls from Windows games, as you can see below. It seems that, at that time, ExaGear couldn’t access the GPU directly, so they used virgl-vtest-server to access the GPU with turnip, which causes performance degradation.

-------------------------------------------------------
Windows Games
-------------------------------------------------------
ExaGear:    (virgl) | <----> |  virgl-vtest
                    |        | ------------------------
 Wine on            |        |  turnip/zink
  x86 Ubuntu 18.04  |        | ------------------------
                    |        |  Ubuntu 20.04 on chroot
-------------------------------------------------------
                  Android 10
-------------------------------------------------------

So there was another attempt that avoids virgl-vtest-server, and it seems it was successful for a few games with a setup like this:

---------------------------------
Windows Games
---------------------------------
WINE(proton using dxvk)
---------------------------------
X86 emulator(Box86)
---------------------------------
turnip/zink
---------------------------------
Ubuntu 20.04 on Termux proot
---------------------------------
Android 10
---------------------------------

This looks simpler but is still complicated. As you can see in the issue, Danylo managed to set up the system, found the root causes and fixed all of them.

Now it looks like someone has succeeded in accessing the GPU directly in this emulator, so we wouldn’t need to use virgl-vtest any more, since we got a new issue for a setup including ExaGear. :) See https://gitlab.freedesktop.org/mesa/mesa/-/issues/6024 for details.

The second case is here: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5723

That is similar to the second setup of the first case, and looks like this:

---------------------------------
Windows Games
---------------------------------
WINE
---------------------------------
X86 emulator(Box86)
---------------------------------
turnip/zink
---------------------------------
Linux
---------------------------------

Yeah, this one uses Linux directly instead of Termux on Android, so it got a bit simpler. You can see the instructions and some troubleshooting tips for this setup here: https://github.com/Heasterian/Box86-64-on-SD845-mobian

The third case is to run Android games by just replacing the proprietary driver with turnip.

This is the simplest one. You can see instructions on how to build the turnip driver for Android here. But note that there don’t seem to be enough Vulkan games on Android yet, which means that we need to try x86 emulators to test real games. You can also see some efforts on this setup in Danylo’s blog:

https://blogs.igalia.com/dpiliaiev/turnips-in-the-wild-part-1/
https://blogs.igalia.com/dpiliaiev/turnips-in-the-wild-part-2
https://blogs.igalia.com/dpiliaiev/gfxreconstruct-test-mobile-gpus/

Additionally, there was also a talk about emulation on arm for playing games: “FEX-Emu: Fast(-er) x86 emulation for AArch64”. I think you can get more information from it about running x86 games on arm/linux, and I believe you can have fun with this talk too.

OK, up to here I’ve covered a bit more of what I missed in the talk. Hopefully it’s useful to anyone interested. I’ll also try to write far more detail in the near future, for example how to set up and run games step by step.

Thanks for reading!

February 20, 2022 03:00 PM

February 12, 2022

Ricardo García

My FOSDEM 2022 talk: Fun with border colors in Vulkan

FOSDEM 2022 took place this past weekend, on February 5th and 6th. It was a virtual event for the second year in a row, but this year the Graphics devroom made a comeback and I participated in it with a talk titled “Fun with border colors in Vulkan”. In the talk, I explained the context and origins behind the VK_EXT_border_color_swizzle extension that was published last year and in which I’m listed as one of the contributors.

Big kudos and a big thank you to the FOSDEM organizers one more year. FOSDEM is arguably the most important free and open source software conference in Europe and one of the most important FOSS conferences in the world. It’s run entirely by volunteers, doing an incredible amount of work that makes it possible to have hundreds of talks and dozens of different devrooms in the span of two days. Special thanks to the Graphics devroom organizers.

For the virtual setup, one more year FOSDEM relied on Matrix. It’s great because at Igalia we also use Matrix for our internal communications and, thanks to the federated nature of the service, I could join the FOSDEM virtual rooms using the same interface, client and account I normally use for work. The FOSDEM organizers also let participants create ad-hoc accounts to join the conference, in case they didn’t have a Matrix account previously. Thanks to Matrix widgets, each virtual devroom had its corresponding video stream, which you could also watch freely on their site, embedded in each of the virtual devrooms, so participants wanting to watch the talks and ask questions had everything in a single page.

Talks were pre-recorded and submitted in advance, played at the scheduled times, and Jitsi was used for post-talk Q&A sessions, in which moderators and devroom organizers read aloud the most voted questions in the devroom chat.

Of course, a conference this big is not without its glitches. Video feeds from presenters and moderators were sometimes cut automatically by Jitsi allegedly due to insufficient bandwidth. It also happened to me during my Q&A section while I was using a wired connection on a 300 Mbps symmetric FTTH line. I can only suppose the pipe was not wide enough on the other end to handle dozens of streams at the same time, or Jitsi was playing games as it sometimes does. In any case, audio was flawless.

In addition, some of the pre-recorded videos could not be played at the scheduled time, resulting in a black screen with no sound, due to an apparent bug in the video system. It’s worth noting all pre-recorded talks had been submitted, processed and reviewed prior to the conference, so this was an unexpected problem. This happened with my talk and I had to do the presentation live. Fortunately, I had written a script for the talk and could use it to deliver it without issues by sharing my screen with the slides over Jitsi.

Finally, a possible point of improvement for future virtual or mixed editions is the fact that the deadline for submitting talk videos was only communicated directly and prominently by email on the day the deadline ended, a couple of weeks before the conference. It was also mentioned in the presenter’s guide that was linked in a previous email message, but an explicit warning a few days or a week before the deadline would have been useful to avoid last-minute rushes and delays submitting talks.

In any case, those small problems don’t take away the great online-only experience we had this year.

Transcription

Another advantage of having a written script for the talk is that I can use it to provide a pseudo-transcription of its contents for those that prefer not to watch a video or are unable to do so. I’ve also summed up the Q&A section at the end below. The slides are available as an attachment in the talk page.

Enjoy and see you next year, hopefully in Brussels this time.

Slide 1 (Talk cover)

Hello, my name is Ricardo Garcia. I work at Igalia as part of its Graphics Team, where I mostly work on the CTS project creating new Vulkan tests and fixing existing ones. Sometimes this means I also contribute to the specification text and other pieces of the Vulkan ecosystem.

Today I’m going to talk about the story behind the “border color swizzle” extension that was published last year. I created tests for this one and I also participated in its release process, so I’m listed as one of the contributors.

Slide 2 (Sampling in Vulkan)

I’ve already started mentioning border colors, so before we dive directly into the extension let me give you a brief introduction to sampling operations in Vulkan and explain where border colors fit in that.

Sampling means reading pixels from an image view and is typically done in the fragment shader, for example to apply a texture to some geometry.

In the example you see here, we have an image view with 3 8-bit color components in BGR order and in unsigned normalized format. This means we’ll suppose each image pixel is stored in memory using 3 bytes, with each byte corresponding to the blue, green and red components in that order.

However, when we read pixels from that image view, we want to get back normalized floating point values between 0 (for the lowest value) and 1 (for the highest value, i.e. when all bits are 1 and the natural number in memory is 255).

As you can see in the GLSL code, the result of the operation is a vector of 4 floating point numbers. Since the image does not have alpha information, it’s natural to think the output vector may have a 1 in the last component, making the color opaque.

If the coordinates of the sample operation make us read the pixel represented there, we would get the values you see on the right.

It’s also worth noting the sampler argument is a combination of two objects in Vulkan: an image view and a sampler object that specifies how sampling is done.

Slide 3 (Normalized Coordinates)

Focusing a bit on the coordinates used to sample from the image, the most common case is using normalized coordinates, which means using floating point values between 0 and 1 in each of the image axes, like the 2D case you see on the right.

But, what happens if the coordinates fall outside that range? That means sampling outside the original image, in points around it like the red marks you see on the right.

That depends on how the sampler is configured. When creating it, we can specify a so-called “address mode” independently for each of the 3 texture coordinate axes that may be used (2 in our example).

Slide 4 (Address Mode)

There are several possible address modes. The most common one is probably the one you see here on the bottom left, which is the repeat addressing mode, which applies some kind of modulo operation to the coordinates as if the texture was virtually repeating in the selected axis.

There’s also the clamp mode on the top right, for example, which clamps coordinates to 0 and 1 and produces the effect of the texture borders extending beyond the image edge.

The case we’re interested in is the one on the top left, which is the border mode. When sampling outside we get a border color, as if the image was surrounded by a virtually infinite frame of a chosen color.
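As a rough illustration of the three address modes just described, here is a minimal Python sketch for a single normalized coordinate. The function name and the mode strings are hypothetical, and it deliberately ignores that real implementations work on integer texel coordinates and filtering footprints; it only shows the idea of repeating, clamping, or falling back to the border color.

```python
import math
from typing import Optional


def apply_address_mode(u: float, mode: str) -> Optional[float]:
    """Map one normalized coordinate according to a simplified address mode.

    Returns a coordinate in [0, 1], or None meaning "use the border color".
    """
    if mode == "repeat":
        # Keep only the fractional part, as if the texture tiled forever.
        return u - math.floor(u)
    if mode == "clamp":
        # Coordinates outside the image stick to the nearest edge.
        return min(max(u, 0.0), 1.0)
    if mode == "border":
        # Outside the image there is no texel to read at all.
        return u if 0.0 <= u <= 1.0 else None
    raise ValueError(f"unknown address mode: {mode}")


print(apply_address_mode(1.25, "repeat"))  # 0.25
print(apply_address_mode(1.25, "clamp"))   # 1.0
print(apply_address_mode(1.25, "border"))  # None
```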

Slide 5 (Border Color)

The border color is specified when creating the sampler, and initially could only be chosen among a restricted set of values: transparent black (all zeros), opaque white (all ones) or the “special” opaque black color, which has a zero in all color components and a 1 in the alpha component.

The “custom border color” extension introduced the possibility of specifying arbitrary RGBA colors when creating the sampler.

Slide 6 (Image View Swizzle)

However, sampling operations are also affected by one parameter that’s not part of the sampler object. It’s part of the image view and it’s called the component swizzle.

In the example I gave you before we got some color values back, but that was supposing the component swizzle was the identity swizzle (i.e. color components were not reordered or replaced).

It’s possible, however, to specify other swizzles indicating what the resulting final color should be for each of the 4 components: you can reorder the components arbitrarily (e.g. saying the red component should actually come from the original blue one), you can force some of them to be zero or one, you can replicate one of the original components in multiple positions of the final color, etc. It’s a very flexible operation.
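To make the operation concrete, here is a small Python sketch of a component swizzle applied to an RGBA color. The selector strings ("r", "g", "b", "a", "zero", "one", "identity") are made up for this illustration and only loosely mirror the choices Vulkan offers per component; the color values are arbitrary.

```python
def swizzle(rgba, mapping):
    """Build a new RGBA color, choosing the source of each output component.

    mapping has four selectors, one per output component (R, G, B, A).
    """
    source = dict(zip("rgba", rgba))
    source.update(zero=0.0, one=1.0)
    result = []
    for position, selector in zip("rgba", mapping):
        if selector == "identity":
            selector = position  # keep the component that is already there
        result.append(source[selector])
    return tuple(result)


color = (0.25, 0.5, 0.75, 1.0)
# Reorder: red comes from blue, blue comes from red.
print(swizzle(color, ("b", "identity", "r", "identity")))  # (0.75, 0.5, 0.25, 1.0)
# Replicate red everywhere and force alpha to one.
print(swizzle(color, ("r", "r", "r", "one")))               # (0.25, 0.25, 0.25, 1.0)
```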

Slide 7 (Border Color and Swizzle pt. 1)

While working on the Zink Mesa driver, Mike discovered that the interaction between non-identity swizzle and custom border colors produced different results for different implementations, and wondered if the result was specified at all.

Slide 8 (Border Color and Swizzle pt. 2)

Let me give you an example: you specify a custom border color of 0, 0, 1, 1 (opaque blue) and an addressing mode of clamping to border in the sampler.

The image view has this strange swizzle in which the red component should come from the original blue, the green component is always zero, the blue component comes from the original green and the alpha component is not modified.

If the swizzle applies to the border color you get red. If it does not, you get blue.

Any option is reasonable: if the border color is specified as part of the sampler, maybe you want to get that color no matter which image view you use that sampler on, and expect to always get a blue border.

If the border color is supposed to act as if it came from the original image, it should be affected by the swizzle as the normal pixels are and you’d get red.

Slide 9 (Border Color and Swizzle pt. 3)

Jason pointed out the spec laid out the rules in a section called “Texel Input Operations”, which specifies that swizzling should affect border colors, and non-identity swizzles could be applied to custom border colors without restrictions according to the spec, contrary to “opaque black”, which was considered a special value and non-identity swizzles would result in undefined values with that border.

Slide 10 (Texel Input Operations)

The Texel Input Operations spec section describes what the expected result is according to some steps which are supposed to happen in a defined order. It doesn’t mean the hardware has to work like this. It may need instructions before or after the hardware sampling operation to simulate things happening in the order described there.

I’ve simplified and removed some of the steps but if border color needs to be applied we’re interested in the steps we can see in bold, and step 5 (border color applied) comes before step 7 (applying the image view swizzle).

I’ll describe the steps with a bit more detail now.

Slide 11 (Coordinate Conversion)

Step 1 is coordinate conversion: this includes converting normalized coordinates to integer texel coordinates for the image view and clamping and modifying those values depending on the addressing mode.

Slide 12 (Coordinate Validation)

Once that is done, step 2 is validating the coordinates. Here, we’ll decide if texel replacement takes place or not, which may imply using the border color. In other sampling modes, robustness features will also be taken into account.

Slide 13 (Reading Texel from Image)

Step 3 happens when the coordinates are valid, and is reading the actual texel from the image. This immediately implies reordering components from the in-memory layout to the standard RGBA layout, which means a BGR image view gets its components immediately put in RGB order after reading.

Slide 14 (Format Conversion)

Step 4 also applies if an actual texel was read from the image and is format conversion. For example, unsigned normalized formats need to convert pixel values (stored as natural numbers in memory) to floating point values.

Our example texel, already in RGB order, results in the values you see on the right.
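Steps 3 and 4 can be sketched in a few lines of Python for the 3-component, 8-bit, BGR, unsigned normalized case used in the example. The function name and the pixel bytes are hypothetical (the values on the slide are not reproduced here); the sketch only shows the reordering followed by the division by 255.

```python
def read_bgr8_unorm(pixel: bytes):
    """Read one 3-byte BGR texel and return normalized floats in RGB order."""
    blue, green, red = pixel              # in-memory order of the image view
    rgb = (red, green, blue)              # step 3: reorder to standard RGB
    return tuple(c / 255.0 for c in rgb)  # step 4: unsigned normalized -> float


# Hypothetical texel: blue = 255, green = 0, red = 51.
print(read_bgr8_unorm(bytes([255, 0, 51])))  # (0.2, 0.0, 1.0)
```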

Slide 15 (Texel Replacement)

Step 5 is texel replacement, and is the alternative to the previous two steps when the coordinates were not valid. In the case of border colors, this means taking the border color and cutting it short so it only has the components present in the original image view, to act as if the border color was actually part of the image.

Because this happens after the color components have already been reordered, the border color is always specified in standard red, green, blue and alpha order when creating the sampler. The fact that the original image view was in BGR order is irrelevant for the border color. We care about the alpha component being missing, but not about the in-memory order of the image view.

Our transparent blue border is converted to just “blue” in this step.

Slide 16 (Expansion to RGBA)

Step 6 takes us back to a unified flow of steps: it applies to the color no matter where it came from. The color is expanded to always have 4 components as expected in the shader. Missing color components are replaced with zeros and the alpha component, if missing, is set to one.

Our original transparent blue border is now opaque blue.

Slide 17 (Component Swizzle)

Step 7, finally the swizzle is applied. Let’s suppose our image view had that strange swizzle in which the red component is copied from the original blue, the green component is set to zero, the blue one is set to one and the alpha component is not modified.

Our original transparent blue border is now opaque magenta.
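Putting steps 5 to 7 together, the journey of the border color can be sketched in Python as below. This is only an illustration under simplifying assumptions: the function names and selector strings are invented, the image view is assumed to have 3 color components and no alpha, and texel replacement is modelled as keeping the first components of the border color, which is enough for R, RG, RGB and RGBA views.

```python
def replace_texel(border_rgba, component_count):
    """Step 5: keep only the components the image view actually has."""
    return tuple(border_rgba[:component_count])


def expand_to_rgba(color):
    """Step 6: missing color components become 0, a missing alpha becomes 1."""
    color = list(color)
    while len(color) < 3:
        color.append(0.0)
    if len(color) < 4:
        color.append(1.0)
    return tuple(color)


def apply_swizzle(rgba, mapping):
    """Step 7: build each output component from the selector given for it."""
    source = {"r": rgba[0], "g": rgba[1], "b": rgba[2], "a": rgba[3],
              "zero": 0.0, "one": 1.0}
    return tuple(source[selector] for selector in mapping)


border = (0.0, 0.0, 1.0, 0.0)     # transparent blue, always in RGBA order
texel = replace_texel(border, 3)  # (0.0, 0.0, 1.0): just "blue"
texel = expand_to_rgba(texel)     # (0.0, 0.0, 1.0, 1.0): opaque blue
# Red from blue, green forced to zero, blue forced to one, alpha untouched.
result = apply_swizzle(texel, ("b", "zero", "one", "a"))
print(result)                     # (1.0, 0.0, 1.0, 1.0): opaque magenta
```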

Slide 18 (VK_EXT_custom_border_color)

So we had this situation in which some implementations swizzled the border color and others did not. What could we do?

We could double down on the existing spec and ask vendors to fix their implementations but, what happens if they cannot fix them? Or if the fix is impractical due to its impact on performance?

Unfortunately, that was the actual situation: some implementations could not be fixed. After discovering this problem, CTS tests were going to be created for these cases. If an implementation failed to behave as mandated by the spec, it wouldn’t pass conformance, so those implementations only had one way out: stop supporting custom border colors, but that’s also a loss for users if those implementations are in widespread use (and they were).

The second option is backpedaling a bit, making behavior undefined unless some other feature is present and designing a mechanism that would allow custom border colors to be used with non-identity swizzles at least in some of the implementations.

Slide 19 (VK_EXT_border_color_swizzle)

And that’s how the “border color swizzle” extension was created last year. Custom colors with non-identity swizzle produced undefined results unless the borderColorSwizzle feature was available and enabled. Some implementations could advertise support for this almost “for free” and others could advertise lack of support for this feature.

In the middle ground, some implementations can indicate they support the case, but the component swizzle has to be indicated when creating the sampler as well. So it’s both part of the image view and part of the sampler. Samplers created this way can only be used with image views having a matching component swizzle (which means they are no longer generic samplers).

The drawback of this extension, apart from the obvious observation that it should’ve been part of the original custom border color extension, is that it somehow lowers the bar for applications that want to use a single code path for every vendor. If borderColorSwizzle is supported, it’s always legal to pass the swizzle when creating the sampler. Some implementations will need it and the rest can ignore it, so the unified code path is now harder or more specific.

And that’s basically it. Sometimes the Vulkan Working Group in Khronos has had to backpedal and mark as undefined something that previous versions of the Vulkan spec considered defined. It’s not frequent nor ideal, but it happens. But it usually does not go as far as publishing a new extension as part of the fix, which is why I considered this interesting.

Slide 20 (Questions?)

Thanks for watching! Let me know if you have any questions.

Q&A Section

Martin: The first question is from "ancurio" and he’s asking if swizzling is implemented in hardware.

Me: I don’t work on implementations so take my answer with a grain of salt. It’s my understanding you can usually program that in hardware and the hardware does the swizzling for you. There may be implementations which need to do the swizzling in software, emitting extra instructions.

Martin: Another question from "ancurio". When you said lowering the bar do you mean raising it?

I explain that, yes, I meant to say raising the bar for the application. Note: I meant to say that it lowers the bar for the specification and API, which means a more complicated solution has been accepted.

Martin: "enunes" asks if this was originally motivated by some real application bug or by something like conformance tests/spec disambiguation?

I explain it has both factors. Mike found the problem while developing Zink, so a real application hit the problematic case, and then the Vulkan Working Group inside Khronos wanted to fix this, make the spec clear and provide a solution for apps that wanted to use non-identity swizzle with border colors, as it was originally allowed.

Martin: no more questions in the room but I have one more for you. How was your experience dealing with Khronos coordinating with different vendors and figuring out what was the acceptable solution for everyone?

I explain that the main driver behind the extension in Khronos was Piers Daniell from NVIDIA (NB: listed as the extension author). I mention that my experience was positive, that the Working Group is composed of people who are really interested in making a good specification and implementations that serve app developers. When this problem was detected I created some tests that worked as a poll to see which vendors could make this work easily and what others may need to make this work if at all. Then, this was discussed in the Working Group, a solution was proposed (the extension), then more vendors reviewed and commented that, then tests were adapted to the final solution, and finally the extension was published.

Martin: How long did this whole process take?

Me: A few months. Take into account the Working Group does not meet every day, and they have a backlog of issues to discuss. Each of the previous steps takes several weeks, so you end up with a few months, which is not bad.

Martin: Not bad at all.

Me: Not at all, I think it works reasonably well.

Martin: Especially when you realize you broke something and the specification needs fixing. Definitely decent.

February 12, 2022 09:25 PM

February 04, 2022

Clayton Craft

Diffing binaries, in living color!

I recently needed to compare two binary files (ISO images) in order to debug why one ISO would boot with legacy BIOS and the other wouldn't, even though they were presumably generated by the same tooling (turns out they weren't exactly, but that's not what this post is about.)

This post is about a quick way to generate a binary diff, with color, which really helps with visually seeing differences in a sea of hex. Using colordiff and xxd, it's easy!

colordiff -y <(xxd a.iso) <(xxd b.iso)|less -R

Where the output looks something like:

For large binary files, you can use the -l option to xxd to limit length, e.g. xxd -l 1000 foo.bin. The -R option to less tells it to output control characters, which is what colordiff uses for coloring output. If you don't include that option, you get this disaster instead:

February 04, 2022 12:00 AM

February 03, 2022

Pablo Saavedra

http503

TL;DR: The balena-browser-wpe has been released. This is the result of using the WPE WebKit browser as the chosen web engine for the Balena Browser block. This opens a lot of doors for all kinds of things, really lowering the bar to checking out and exploring an official WPE build with Balena’s very convenient system (more below).


It is my pleasure to announce the public release of the new Balena Browser Block based on WPE WebKit (balena-browser-wpe). This was completed by a close collaboration between Igalia and Balena developers, and was several months in the making. It was made possible in large part by the decision to use the WPE WebKit browser as the web engine for the Balena Browser block. Thanks to everyone involved for making it happen!

As a quick introduction for those who don’t know what Balena is or what they do, Balena.io is a company well known as the author of balenaEtcher, the open-source utility widely used for flashing disk images onto storage media to create live SD cards and USB flash drives. But for some time now, they have been working on what they call Balena Cloud, a complete open-source stack of tools, images and services for deploying IoT services.


Why might you be interested in reading the rest of this post?

You might find this news especially interesting if:

  • You are interested in building a Balena project using the new Linux graphical stack based on Wayland.
  • You are looking for a browser solution with a very low memory footprint. This block is intended to be usable as an easy and fast evaluation channel for the WPE WebKit web rendering engine for embedded platforms.
  • You are looking for a fully open ecosystem with standardized specifications for your project.
  • You are optimizing your project for RaspberryPi 3 and RaspberryPi 4.

… and, specifically about WebKit, if:

  • You are interested in a platform that uses the latest stable versions of WPE WebKit available.
  • You are interested in playing with the experimental features for WPE WebKit.
  • You are looking for a WPE WebKit solution using the WPE Freedesktop (FDO) backend (wpebackend-fdo).
  • You are looking for a WPE WebKit solution using the Yocto meta-webkit recipes to build the binary images.

The Balena Cloud, as I introduced before, is a complete set of tools for building, deploying, and managing IoT services on connected Linux devices. Balena currently provides service for around half a million connected devices via the Balena Cloud. What I find especially interesting is that every Fleet (Balena’s term for a collection of devices) hosted on the Balena Cloud is running on a full open-source stack, from the OS flashed on the devices to the applications running on top of the OS.

Another service they provide in this ecosystem is the Balena Hub, a catalog of IoT and edge projects created by the community. In this catalog you can find reusable blocks or projects that you can reuse or adapt to build your own Balena project. The idea is that you can connect blocks like Lego bricks: you can choose an X server, then connect a dashboard, later a browser, and so on. In summary, in this Balena ecosystem you can find:

  • Blocks:
    • Drop-in chunk of functionality built to handle the basics.
    • Defined as a Docker image (Dockerfile, docker-compose.yml).
  • Projects:
    • Allows you to design your services in a plug&play way by using blocks.
    • Source code of a Fleet (forkeable).
  • Fleets (== Applications):
    • Groups of devices running the same code for a specific task.
    • It can be private or public.

Coming back to the initial point, what we are announcing here is two new Balena blocks that will be part of the Balena Hub: 1) the balena-browser-wpe block and 2) the balena-weston block.

The design of the balena-browser-wpe block comes with significant innovations with respect to the Balena Browser block (balena-browser), which make it significantly different from the former. For example, unlike balena-browser, which uses a Chromium browser via the classical X11 Linux graphical system, the new balena-browser-wpe block provides a hardware-accelerated web browser display based on WPE WebKit on top of the new Linux graphical stack, Wayland, using the Weston compositor.

Also WPE WebKit allows embedders to create simple and performant systems based on Web platform technologies. It is a WebKit port designed with flexibility and hardware acceleration in mind, leveraging common 3D graphics APIs for best performance.

Block diagram of the Balena Browser WPE project

Another important difference is that this project is intended to run entirely on a fully open graphical stack for the Raspberry Pi. That means the use of the Mesa VC4 graphics driver instead of the proprietary Broadcom driver for Raspberry Pi.

The Raspberry Pi Broadcom VideoCore 4 GPU (Graphics Processing Unit) is an OpenGL ES 2.0 (GLES 2.0) compatible 3D engine. The closed-source graphics stack runs on the VC4 GPU and talks to the V3D and display components using proprietary protocols. Instead, the Mesa VC4 driver provides an open-source implementation of open standards: OpenGL (Open Graphics Library), Vulkan and other graphics API specifications (e.g. GLES2).

Finally, the API for interacting with the GPU is enabled by the Mesa VC4 driver, which provides, through Mesa, access to the DRM (Direct Rendering Manager) subsystem of the Linux kernel, responsible for interfacing with the GPU, and to the DMA Buffer Sharing Framework, required for the efficient buffer export mechanism that the Wayland compositor needs 🚀.

How can I start to play with the Balena Browser WPE?

This is the enjoyable part of the article. Balena provides many of the pieces that you will need, at least, from the point of view of the software (the hardware still has to be supplied by you 🙃). From Balena you will get:

  • The Balena OS, a downloadable OS image; the blocks run on top of this base system as isolated containers.
  • The Balena Hub, a source repository for the projects to run on top of Balena Cloud.
  • and the Balena Cloud, a container-based platform for deploying IoT applications over all the connected devices.

Additional requirements are the sources for the blocks that we provide:

  • The Balena WPE project, the reference project for building all of the required Balena blocks for running the WPE WebKit browser.
  • The Balena Browser WPE block source code.
  • The Balena Weston block source code.

To get the Balena Browser WPE project working on your Raspberry Pi 3 or 4, begin by following the Getting started guide. Once you reach the Running your first Container section, use the balena-wpe GitHub URL of the repository instead of the one provided. For example, instead of git clone https://github.com/balenalabs/multicontainer-getting-started.git, run git clone https://github.com/Igalia/balena-wpe.git

Last but not least …

… now that the sources of the project are public, I intend to keep publishing short posts explaining in detail what I consider the relevant features of this project. We also intend to create a public Balena Fleet based on this project. Personally, I think it could be a nice and easy way to familiarize yourself with the Balena Browser WPE project, for those just getting started.

That’s all for now! I hope you will enjoy this contribution. More things are coming soon.

by Pablo Saavedra at February 03, 2022 02:23 PM

Brian Kardell

Enter: Wolvic

Today Igalia announced that we're taking over the browser project formerly known as "Firefox Reality" and re-introducing it with a new browser brand: Wolvic. You can read more details on this (including what the name is about) in Igalia's announcement post, as well as why we're intentionally leaving the door to this browser's evolution pretty wide open. However, it might surprise you to learn that I am (somewhat newly) personally really excited by this as I'm not a long-time XR superfan - let me tell you what changed things for me.

VR has had a long history of passionate promoters. Even before the W3C was formed, a web interest group was established to work on bringing VR to the web. Over the years I've heard a lot, occasionally read a lot, seen videos of demos and tried a few things now and then. But, to be completely honest, I've also kind of "meh'ed" a lot. Really, I just never found it personally broadly exciting, and I couldn't understand why others did.

However, with some of my peers doing work in this space at Igalia, over the past year I got my hands on an Oculus Quest 2 to check it out and suddenly... Wow. I get it.

Really: There have been very few things in my life where I've experienced something so different and exciting that I just had to show people. This was definitely one. Every time I've shown it to someone so far, it has turned into people taking turns using it, sharing their excitement as they experienced it, and watching each other react...

"Wow". "Oh my god". "Get... out.."

The experience itself has been pretty cool, but there seems to also be a universal agreement with the people I've talked to: We weren't prepared for that - and we can suddenly see why this is exciting as hell.

On the other hand, in retrospect, I guess this is kind of typical in a way.

It happened with the Web, in fact

In November of 1990, Tim Berners-Lee approached a man named Ian Ritchie and tried to hand the idea of a web browser to him. Ritchie's company made a popular piece of hypermedia software which was almost exactly a web browser already. It dealt with networking, multimedia, and had a serialized form for documents. That serialized form was even called Hypertext Markup Language (.hml). Literally all that was missing was the URL and Tim tried to convince him what a huge leap URLs would be and convince him to add support, even offering to provide code. But no dice. He didn't get it.

Tim's 1992 paper on the web was rejected from the world wide Hypertext Conference. None of them got it.

Tim had to build it and show it and build a small community of people to help him do it. At the hypertext conference in Seattle in 1993 Ian Ritchie recounts seeing a demo of Mosaic and thinking

Yep... that's it. I get it.

He gave a brief TED Talk on the subject, if you'd like to watch...

That's pretty much how it was for me with the Oculus Quest 2 and XR, and I expect that's how it will be for a lot of people. Like so many other things, many of us will have to experience it first hand. Pictures and videos of XR, or even some not-so-long-ago attempts, give you entirely the wrong impression. Additionally, XR is a big space, so descriptions aren't even universal. So let me tell you what I think is pretty key for me, and why devices like the Quest 2 (though not exclusively that at all) really convince me that there's something big and exciting here.

The Immersive OS

Probably the thing I most failed to imagine is that when you're putting on a lot of these standalone headsets, what you're really doing is entering an immersive operating system.

The initial appeal of this, of course, is the stuff that is relatively easier to imagine: A game where you can swing light sabers, shoot targets, box, ski, or whatever, and it feels pretty real. That is a thing you really just have to experience, there's really no substitute.

But what really surprised me was how interesting it was for stuff that isn't that.

But, again, in retrospect, I guess this isn't the first time it was just hard to see from where we were sitting. We've done "Why would someone choose to read the web on their tiny little phone?" or "Why would someone watch videos on their phone?". But... Then you try it and after a while, you realize: Wow, yeah... There are, of course, tons of reasons. Literally tons. So many that they sometimes eclipse the old ways of consuming.

It was really the same with both points here for me. I mean, "Why would someone choose to watch a non-immersive movie with this headset on their face?" seemed like a pretty obvious question to me. I was incredibly dubious.

But now that I've seen it - I can definitely see that there are plenty of use cases where this is really compelling. Watching a movie in a small, cluttered and overly bright room was... amazing. It made me see a massive theatrical sized screen, where I had the perfect seat, without challenges of ambient light, angle or glare. You could have a nearly perfect viewing experience, from a cramped airplane seat. You can also attend a watch party with friends you can't meet up with because, for example, there is a global pandemic or because they live thousands of miles away, and share something very much like the movie theater experience.

And, really, that's the case with lots of apps in that kind of OS. You can get all kinds of interesting new UI mechanisms too - heads up display features and so on. There's still a lot being sorted out, but you can easily see where this is going and the potential.

The Web, in XR

Even more amazing to me was how obvious, in retrospect, the need for the web is here - I mean, even the regular old 2d, non-immersive web. Just think about how much you turn to the web. I found myself using the browser constantly while doing things like reporting bugs, checking out links, quickly replying to a message, looking something up - all without leaving the immersive OS.

Of course that's just the start - because things get even more interesting when we mash these two superpowers together with WebXR. Suddenly we can have non-app store links to XR experiences and jump into immersive experiences from the web. Suddenly we have a way of getting to them and sharing them across platforms, without stores, and embedding them, or using them as enhancements. Amazing, really.

The basic usefulness of the web in XR seems to be lost on no one who makes such a device: Each one ships with a default browser. Stats seem to suggest that I'm not alone in that: People spend a lot of time in the browser here.

Enter: Wolvic

The trouble is, there's just not a lot of choice. I'm not just talking about a choice of rendering engines, it's even a lack of choice of open source browsers. In fact, for most of these standalone headsets, without Firefox Reality, there wouldn't be any. We can't let that happen. Now feels like a critically important moment to make sure the Web has the same kinds of opportunities it does in other, similar ecosystems.

We, at Igalia, have been interested for a while in all of the problems and opportunities in the ecosystem. Last year we demoed a prototype browser for Android which merged our own WPE WebKit rendering engine with Firefox Reality, and provided some updates and fixes to Firefox Reality itself. When most of us think about Android, we primarily think about mobile devices - but the truth is that the open part of Android serves as the basis for the OS on most of these standalone devices too.

Our initial focus with Wolvic is toward taking custody of, updating, and nurturing the Firefox Reality codebase. Our beta today launches with a large number of patches and improvements from Firefox Reality, and surely a few growing pains as well. Wherever we start, we're choosing a new name, in part to make it clear that this project will follow its own evolutionary path, wherever that leads, to survive as a choice in bringing another quality browser to these devices. Looking forward to seeing where it leads.

February 03, 2022 05:00 AM

January 29, 2022

Tim Chevalier

Records and objects: both the same and different

I wrote in my most recent “daily” post about the choices involved in representing records in MIR and LIR, Warp’s intermediate languages. Since then, more questions occurred to me.

MIR has a simple type system and some internal typechecking, which is helpful. LIR erases most type details but does have an even simpler type system that distinguishes JavaScript Values from raw integers/pointers/etc. I already added a Record type to MIR; an alternative would have been to give the NewRecord operation the result type Object, reflecting that records are (currently) represented as objects, but I thought that more internal type checks are always a good thing. Without a separate Record type, it’s theoretically possible that compiler bugs could result in record fields being mutable (of course, there are still ways such bugs could arise even with MIR typechecking, but it’s one more thing that makes it less likely).

I realized there’s a problem with this approach: if I want to take option 2 in the previous post — translating FinishRecord into MIR by breaking it down into smaller operations like Elements, LoadFixedSlot, etc. — it’s not possible to generate type-correct MIR. These operations expect an Object as an operand, but I want them to work on Records.

The point of representing records as objects was (at least in part) to reuse existing code for objects, but that goal exists in tension with distinguishing records from objects and keeping the MIR type system simple. Of course, we could add some form of subtyping to MIR, but that’s probably a bad idea.

So orthogonally to the three options I mentioned in my previous post, there’s another set of three options:

  1. Add Record as an MIR type, like I started out doing; don’t re-use MIR instructions like Elements for records, instead making those operations explicit in CodeGenerator, the final translation from LIR to assembly.
  2. Don’t add any new MIR types; give NewRecord the result type Object and FinishRecord the argument and result types Object.
  3. Compile away records when translating JS to MIR, adding whatever new MIR instructions are necessary to make this possible; in this approach, records would desugar to Objects and any operations needed to enforce record invariants would be made explicit.

Option 1 rules out some code re-use and loses some internal checking, since more of the heavy lifting happens in the untyped world. Option 2 allows more code re-use, but also loses some internal checking, since it would be unsound to allow every Object operation to work on records. Option 3 has the advantage that only the JS-to-MIR translation (WarpBuilder / WarpCacheIRTranspiler) would need to be modified, but seems riskier than the first two options with respect to type-correctness bugs — it seems like debugging could potentially be hard if records were collapsed into objects, too.

    A colleague pointed out that MIR already has some types that are similar to other types, but where certain additional invariants need to be upheld, such as RefOrNull (used in implementing WebAssembly); this type is effectively equivalent to “pointer”, but keeping it a separate type allows for the representation to change later if the WebAssembly implementors want to. Record seems very similar; we want to avoid committing to the Object representation more than necessary.

    My colleague also suggested adding an explicit RecordToObject conversion, which would allow all the Object operations to be re-used for records without breaking MIR typechecking. As long as there’s no need to add an ObjectToRecord conversion (which I don’t think there is), this seems like the most appealing choice, combined with option 1 above. There’s still potential for bugs, e.g. if a compiler bug generates code that applies RecordToObject to a record and then does arbitrary object operations on it, such as mutating fields. Ideally, RecordToObject would produce something that can only be consumed by reads, not writes, but this isn’t practical to do within MIR’s existing type system.
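To make the RecordToObject idea concrete, here is a toy sketch of a typed IR in JavaScript. It is not SpiderMonkey's MIR: the `signatures` table, `makeInstr`, and the string type tags are all hypothetical names invented for illustration. The only thing it tries to show is that an explicit conversion lets an Object-only operation such as Elements accept a record without loosening the type check.

```javascript
// Toy model only: none of these names are SpiderMonkey's real MIR classes.
const MIRType = Object.freeze({
  Object: "Object",
  Record: "Record",
  Elements: "Elements",
});

// Each toy instruction declares its operand types and its result type.
const signatures = {
  NewRecord:      { operands: [],               result: MIRType.Record },
  RecordToObject: { operands: [MIRType.Record], result: MIRType.Object },
  Elements:       { operands: [MIRType.Object], result: MIRType.Elements },
};

// Build an instruction, rejecting operands whose types do not match.
function makeInstr(op, ...operands) {
  const sig = signatures[op];
  if (!sig) throw new Error(`unknown op ${op}`);
  if (operands.length !== sig.operands.length) {
    throw new TypeError(`${op}: expected ${sig.operands.length} operand(s)`);
  }
  sig.operands.forEach((expected, i) => {
    if (operands[i].type !== expected) {
      throw new TypeError(
        `${op}: operand ${i} has type ${operands[i].type}, expected ${expected}`);
    }
  });
  return { op, operands, type: sig.result };
}

const rec = makeInstr("NewRecord");

// Applying Elements directly to a Record is rejected by the checker...
let rejected = false;
try {
  makeInstr("Elements", rec);
} catch (e) {
  rejected = e instanceof TypeError;
}

// ...but going through the explicit conversion typechecks.
const elems = makeInstr("Elements", makeInstr("RecordToObject", rec));
console.log(rejected, elems.type);
```

Because the toy has no ObjectToRecord entry, nothing can turn an arbitrary Object back into a Record, which mirrors the one-way conversion described above. It also shares the weakness noted there: once converted, the value is an ordinary Object as far as the checker is concerned.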

    (All of the same considerations come up for tuples as well; I just started with records, for no particular reason.)

    I don’t think there’s any way to know which approach is best without having a working prototype, so I’m just going to try implementing this and seeing what happens.

    by Tim Chevalier at January 29, 2022 03:33 AM

    January 28, 2022

    Tim Chevalier

    JavaScript is a very interesting language

    The other day I learned about indexed setters in JavaScript. It hadn’t occurred to me that you can do the following:

    var z = 5;
    Object.defineProperty(Array.prototype, 0, { set: function(y) { z = 42; }});

    and the interpreter will happily accept it. This code mutates the global object called Array.prototype, overriding the (implicit) setter method for the property '0'. If I’m to compare this with a more conventional programming language (using C-like syntax for illustrative purposes), this would be like if you could write:

    int n = /* ... */;
    int[n] a;
    a[0] = 5;
    

    but the semantics of the last assignment statement could involve arbitrary side effects, and not necessarily mutating the 0th element of a, depending on the value of some global variable.

    Most languages don’t allow you to customize this behavior. In JavaScript, it wasn’t an intentional design decision (as far as I know), but a consequence of a few design choices interacting:

    • Everything is an object, which is to say a bag of key-value pairs (properties)
    • What’s more, properties aren’t required to be simple lookups, but can be defined by arbitrary accessor and mutator functions (setters and getters) that can interact with global mutable state
    • Numbers and strings can be cast to each other implicitly
    • Inheritance is prototype-based, prototypes are mutable, and built-in types aren’t special (their operations are defined by prototypes that user code can mutate)

    It’s not immediately obvious that together, these decisions imply user-defined semantics for array writes. But if an array is just a bag of key-value pairs where the keys happen to be integers (not a contiguous aligned block of memory), and if strings are implicitly coerced to integers (“0” can be used in place of 0 and vice versa), and if specifications for built-in types can be dynamically modified by anybody, then there’s nothing to stop you from writing custom setters for array indices.
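As a small sketch of how those choices combine, the snippet below extends the earlier example so the effect can be observed. The `calls` counter, the `configurable: true` flag, and the cleanup at the end are additions for the demo (assumptions, not part of the original snippet); they let the change to Array.prototype be undone afterwards.

```javascript
// Sketch: watch an indexed setter on Array.prototype intercept ordinary writes.
let z = 5;
let calls = 0;

Object.defineProperty(Array.prototype, 0, {
  set(value) {        // runs instead of storing `value`
    z = 42;
    calls += 1;
  },
  configurable: true, // assumption added for this demo so it can be undone
});

const a = [];
a[0] = 5;             // `a` has no own "0", so the inherited setter runs
const ownAfterWrite = Object.prototype.hasOwnProperty.call(a, 0);
const lengthAfterWrite = a.length;

const b = [];
b["0"] = 7;           // the string key "0" names the same property

// Undo the change before doing anything else.
delete Array.prototype[0];
Array.prototype.length = 0; // defining index 0 had bumped this to 1

const normal = [];
normal[0] = 5;        // behaves as usual again

console.log({ z, calls, ownAfterWrite, lengthAfterWrite, normal });
```

While the setter is installed, neither write creates an element: the arrays stay empty and only the side effect happens. Note that array literals such as `[5]` define their elements directly and would not go through the setter; it is the assignment form that does.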

    To someone used to modern statically typed languages, this is remarkable. Programming is communication between humans before it’s communication with machines, and communication is hard when you say “rat” when you really mean “bear” but you won’t tell your conversation partners that that’s what you mean and instead expect them to look it up in a little notebook that you carry around in your pocket (and regularly erase and rewrite). If that’s what you’re going to do, you had better be prepared for your hiking partners to react with less alarm than is warranted when you say “hey, there’s a rat over there.” If your goal is to be understood, it pays to think about how other people will interpret what you say, even if that conflicts with being able to say whatever you want.

    And if you’re trying to understand someone else’s program (including if that someone else is you, a month ago), local reasoning is really useful; mental static analysis is hard if random behavior gets overridden in the shadows. More concretely, the kind of dynamic binding that JavaScript allows makes it hard to implement libraries in user space. For example, if you want to build a linked list abstraction and use arrays as the underlying representation, the semantics of your library is completely dependent on whatever modifications anybody in scope has made to the meaning of the array operations. So that’s not an abstraction at all, since it forces people who just want to use your data structure to think about its representation.
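A minimal sketch of that problem, using a stack rather than a linked list to keep it short: the `Stack` class below is a hypothetical user-space library that stores its items in an array. The setter installed in the middle stands in for "whatever modifications anybody in scope has made"; `configurable: true` and the cleanup lines are assumptions added so the demo can restore the prototype.

```javascript
// A hypothetical user-space abstraction that uses an array as its representation.
class Stack {
  #items = [];
  push(x) { this.#items.push(x); }
  pop() { return this.#items.pop(); }
}

// With an untouched Array.prototype the stack behaves as expected.
const before = new Stack();
before.push("a");
const okResult = before.pop();

// Unrelated code redefines what writing index 0 of any array means.
Object.defineProperty(Array.prototype, 0, {
  set(_value) { /* silently drop the write */ },
  configurable: true, // assumption: lets this demo undo the change
});

// The same library code now loses data, although Stack itself did not change.
const s = new Stack();
s.push("a");          // push writes index 0 via the inherited setter
const brokenResult = s.pop();

// Restore the prototype.
delete Array.prototype[0];
Array.prototype.length = 0;

console.log({ okResult, brokenResult });
```

The private field hides the array from callers, but it cannot hide the array from Array.prototype, so users of `Stack` are still exposed to its representation.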

    This came up for me in my work in progress implementing the tuple prototype methods — the most straightforward ways to re-use existing array library code to build an immutable tuple abstraction don’t work, because of the ability users have to override the meaning of any property of an array, including length and indexed elements. The workarounds I used depend on the code being in a self-hosted library, meaning that it has to live inside the compiler. It makes sense for the record and tuple implementations to live inside the compiler for other reasons (it would be hard for a userland library to guarantee immutability), but you can easily imagine data structures that don’t need to be built into the language, but nevertheless can’t take full advantage of code re-use if they’re implemented as libraries.

    Now, plenty of people who use and implement JavaScript agree with me; perhaps even most. The problem is that existing code may depend on this behavior, so it’s very difficult to justify changing the language to reject programs that were once accepted. And that’s all right; it’s a full employment guarantee for compiler writers.

    Even so, I can’t help but wonder what code that depends on this functionality — and it must be out there — is actually doing; the example I showed above is contrived, but surely there’s code out there that would break if array setters couldn’t be overridden, and I’d like to understand why somebody chose to write their code that way. I’m inclined to think that this kind of code springs from dependence on prototype inheritance as the only mechanism for code reuse (how much JS imposes that and how much has to do with individual imaginations, I can’t say), but maybe there are other reasons to do this that I’m not thinking of.

    I also wonder if anyone would stand up for this design choice if there was an opportunity to magically redesign JS from scratch, with no worries about backwards compatibility; I’m inclined to think there must be someone who would, since dynamic languages have their advocates. I have yet to find an argument for dynamic checking that wasn’t really just an argument for better type systems; if you’re writing a program, there’s a constraint simpler than the program itself that describes the data the program is supposed to operate on. The “static vs. dynamic” debate isn’t about whether those constraints exist or not — it’s about whether they exist only in the mind of the person who wrote the code, or if they’re documented for others to understand and potentially modify. I want to encourage people to write code that invites contributors rather than using tacit knowledge to concentrate power within a clique, and that’s what motivates me to value statically typed languages with strong guarantees. I’ve never seen a substantive argument against them.

    But what’s easier than persuading people is to wear my implementor hat and use types as much as possible underneath an abstraction barrier, to improve both performance and reliability, while making it look like you’re getting what you want.

    “The sad truth is, there’s very little that’s creative in creativity. The vast majority is submission–submission to the laws of grammar, to the possibilities of rhetoric, to the grammar of narrative, to narrative’s various and possible structurings. In a society that privileges individuality, self-reliance, and mastery, submission is a frightening thing.”
    — Samuel R. Delany, “Some Notes for the Intermediate and Advanced Creative Writing Student”, in About Writing

    Even when languages offer many degrees of freedom, most people find it more manageable to pick a set of constraints and live within them. Programming languages research offers many ways for people who want to declare their constraints to do so within the framework of an existing, less constrained language, as well as ways to infer those constraints when they are left implicit. When it comes to putting those ideas into practice, there’s more than enough work to keep me employed in this industry for as long as I want to be.

    by Tim Chevalier at January 28, 2022 07:52 AM

    January 26, 2022

    Clayton Craft

    Using gdb to inspect a crashing app

    This is more or less a story about how one can attempt to debug an application crash by attaching to it with gdb and poking around, while resisting the urge to build the application manually. This is useful when running something that takes a long time to compile, or that has a complicated build system. It's easy to run into these situations when the system is a relatively underpowered phone running Linux.

    I recently came across a strange issue on my phone when running Phosh, where it would crash if you (or the system package manager) ran dconf update. This is being done in postmarketOS by a UI package in the distro, for "applying" some configuration settings for UI scaling that are useful for phones. The crash, however, is really not useful. If the system is performing an upgrade using a shell or app started in Phosh, the upgrade goes down with Phosh.

    Well, we can't have that! So let's see if we can at least figure out why this is happening by debugging directly on the phone (via an SSH session)!

    The first hint something is truly going sideways is this single line from the desktop manager (tinydm) log:

    gnome-session-binary[32555]: WARNING: Application 'sm.puri.Phosh.desktop' killed by signal 11
    

    Signal 11 is a segmentation fault. Since it's easy to trigger the crash manually, and not (as) easy running Phosh manually, we can use gdb to attach to the running Phosh process. Before doing that, it's helpful to install debug symbols for some things we'll likely encounter in any backtrace in gdb. Phosh is a GTK/GLib app, and I'm running on Alpine Linux which uses musl for libc. So let's start out by installing symbols for these components:

    librem5:~/src/phosh $ doas apk add musl-dbg glib-dbg gtk+3.0-dbg phosh-dbg
    

    Big shout out to the kind soul who added the debug symbols package for Phosh in Alpine's aports!!! Manual local build of phosh averted!

    With symbols installed, let's fire up gdb:

    librem5:~/src/phosh $ gdb --pid $(pidof phosh)
    GNU gdb (GDB) 11.2
    ....
    28      src/thread/aarch64/syscall_cp.s: No such file or directory.
    (gdb) c
    Continuing.
    

    And trigger the crash:

    librem5:~/src/phosh $ doas dconf update
    

    Boom!

    Thread 1 "phosh" received signal SIGSEGV, Segmentation fault.
    get_meta (p=p@entry=0xffffbb8528a0 "\005") at src/malloc/mallocng/meta.h:135
    135     src/malloc/mallocng/meta.h: No such file or directory.
    (gdb)
    

    When running the full backtrace (with the bt command in gdb), there are several messages similar to: glib/gmain.c: No such file or directory. Having the source files is really helpful if you need to jump to different frames in the backtrace to poke around. I usually just clone the source code and check out the tag relevant for the version I have installed, then inform gdb of the new search directory. Something like:

    librem5:~/src/phosh $ cd ../
    librem5:~/src/phosh $ git clone https://github.com/GNOME/glib.git
    librem5:~/src/phosh $ cd glib
    librem5:~/src/glib $ apk info glib
    glib-2.70.1-r0 description:
    ...
    librem5:~/src/glib $ git checkout refs/tags/2.70.1
    librem5:~/src/glib $ cd -
    # back in gdb session:
    (gdb) directory ../glib
    Source directories searched: /home/clayton/src/phosh/../glib:$cdir:$cwd
    

    With that out of the way, we stand a chance of having a somewhat useful backtrace, let's see!

    (gdb) bt
    #0  get_meta (p=p@entry=0xffffbb8528a0 "\005") at src/malloc/mallocng/meta.h:135
    #1  0x0000ffffbed37294 in __libc_free (p=0xffffbb8528a0) at src/malloc/mallocng/free.c:105
    #2  0x0000ffffbed36974 in free (p=<optimized out>) at src/malloc/free.c:5
    #3  0x0000ffffbdebcdb8 in g_free (mem=<optimized out>) at ../glib/gmem.c:199
    #4  0x0000ffffbded3498 in g_strfreev (str_array=<optimized out>) at ../glib/gstrfuncs.c:2560
    #5  g_strfreev (str_array=0xffffbaa55d90) at ../glib/gstrfuncs.c:2553
    #6  0x0000aaaac52ebe7c in on_keybindings_changed (self=self@entry=0xffffbaa718f0 [PhoshRunCommandManager])
        at ../src/run-command-manager.c:134
    #7  0x0000ffffbdfaa990 in g_cclosure_marshal_VOID__STRINGv
       Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0x1f
     (closure=0xffffb9f6b960, return_value=<optimized out>, instance=<optimized out>, args=#8  0x0000ffffbdfa80b0 in _g_closure_invoke_va
        (closure=closure@entry=0xffffb9f6b960, return_value=return_value@entry=0x0, instance=instance@entry=0xffffb9f65b00, args=..., n_params=1, param_types=0xffffbd0bfd10) at ../gobject/gclosure.c:893
    #9  0x0000ffffbdfbc914 in g_signal_emit_valist
        (instance=instance@entry=0xffffb9f65b00, signal_id=<optimized out>, detail=<optimized out>, var_args=...)
        at ../gobject/gsignal.c:3406
    #10 0x0000ffffbdfbd224 in g_signal_emit
        (instance=instance@entry=0xffffb9f65b00, signal_id=<optimized out>, detail=<optimized out>)
        at ../gobject/gsignal.c:3553
    #11 0x0000ffffbe0ebf74 in g_settings_real_change_event
        (settings=0xffffb9f65b00 [GSettings], keys=0xffffbc253e40, n_keys=<optimized out>) at ../gio/gsettings.c:392
    #12 0x0000ffffbe076d78 in _g_cclosure_marshal_BOOLEAN__POINTER_INTv
       Python Exception <class 'gdb.MemoryError'>: Cannot access memory at address 0xb9fb4280
     (closure=<optimized out>, return_value=0xffffc69a41d8, instance=<optimized out>, args=#13 0x0000ffffbdfa65e0 in g_type_class_meta_marshalv
        (closure=<optimized out>, return_value=<optimized out>, instance=<optimized out>, args=..., marshal_data=<optimized out>, n_params=<optimized out>, param_types=<optimized out>) at ../gobject/gclosure.c:1058
    #14 0x0000ffffbdfa80b0 in _g_closure_invoke_va (closure=closure@entry=0xffffbc6dba90, return_value=0xffffc69a41d8,
        return_value@entry=0x0, instance=instance@entry=0xffffb9f65b00, args=..., n_params=2, param_types=0xffffbc7c2b70)
        at ../gobject/gclosure.c:893
    #15 0x0000ffffbdfbc914 in g_signal_emit_valist
        (instance=instance@entry=0xffffb9f65b00, signal_id=<optimized out>, detail=detail@entry=0, var_args=...)
        at ../gobject/gsignal.c:3406
    #16 0x0000ffffbdfbd224 in g_signal_emit
        (instance=instance@entry=0xffffb9f65b00, signal_id=<optimized out>, detail=detail@entry=0)
        at ../gobject/gsignal.c:3553
    #17 0x0000ffffbe0ed498 in settings_backend_path_changed
        (target=<optimized out>, backend=<optimized out>, path=0xffffbb562d40 "/", origin_tag=<optimized out>)
        at ../gio/gsettings.c:467
    #18 0x0000ffffbe0e7620 in g_settings_backend_invoke_closure (user_data=0xffffb9867800) at ../gio/gsettingsbackend.c:273
    #19 0x0000ffffbdeb7dd0 in g_main_dispatch (context=0xffffbdb49ec0) at ../glib/gmain.c:3381
    #20 g_main_context_dispatch (context=context@entry=0xffffbdb49ec0) at ../glib/gmain.c:4099
    #21 0x0000ffffbdeb8030 in g_main_context_iterate
        (context=0xffffbdb49ec0, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>)
        at ../glib/gmain.c:4175
    #22 0x0000ffffbdeb8480 in g_main_loop_run (loop=loop@entry=0xffffbac235b0) at ../glib/gmain.c:4373
    #23 0x0000ffffbe643180 in gtk_main () at ../gtk/gtkmain.c:1329
    #24 0x0000aaaac5286814 in main (argc=<optimized out>, argv=<optimized out>) at ../src/main.c:142
    

    That's a lot to take in at once... but...

    Ah ha! At the top of the stack (frames 0-4), the g_free, free, and so on suggest there's probably an invalid pointer being freed (like, either because of a double free or some corruption.) The numbers on the far left (e.g. #0, #1, ...) are the frame numbers.

    Frame #6 looks interesting, it's the top-most function from Phosh (on_keybindings_changed) itself in the stack, right before glib went boom trying to free some stuff.

    I spent a lot of time inspecting the call to on_keybindings_changed, and the subsequent frames in the stack to try and figure out what might be going wrong, but wasn't able to see anything obvious. Oh well. On the bright side, we now have a useful backtrace that can be shared with the Phosh developers, since all of the symbols are there.

    So, this is where the post ends abruptly. After chatting with the Phosh developers, this particular area received a few changes/fixes in the last development cycle and, indeed, I can no longer reproduce the crash when using the main branch. At the very least, hopefully this is a helpful "how to" for getting set up to debug similar issues, and shows the importance of having debug symbols available!

    January 26, 2022 12:00 AM

    January 24, 2022

    Eric Meyer

    No, Apple Did Not Crowdfund :focus-visible in Safari

    It’s not every week the release notes for a preview build of a web browser ignite Yet Another Twitter Teacup Storm (YATTS™), but that’s what happened when Safari Technology Preview 138 dropped late last week. At least, it’s what happened in the Twitter Teacups I tend to sip.

    Just in case you missed it, here’s the summary:

    1. The WebKit team released Safari Technology Preview 138, and the release notes for same.
    2. The “CSS” section of the release notes started with a line saying:
      Enabled :focus-visible pseudo-class by default (r286783, r286776, r286775)
    3. A few people, including Jen Simmons, gave credit to Igalia for implementing :focus-visible by means of a crowdfunding project (more on that in a moment).
    4. KABOOM

    I suppose I could be a bit more explicit in step 4, but I don’t really want to get into speculating on apparent motives and assumptions by others, because that’s not the point of this post. The point of this post is to clear up what seems to be a very common misunderstanding.

    What I kept seeing people saying was something to the effect of, “Why the hell did Apple have to crowdfund this feature?” And that’s wrong in two ways:

    1. Apple doesn’t have to crowdfund anything, up to and including colonization of the Moon. (They might have to ask for a few bucks to do Venus or Mars.)
    2. Apple didn’t crowdfund :focus-visible.

    This isn’t me splitting hairs, either. Nobody at Apple asked the crowd to fund anything. Nobody at Apple asked Igalia to crowdfund anything. They didn’t even ask Igalia to implement :focus-visible, and then Igalia decided to crowdfund the work. In fact, all of those assumptions get things almost exactly backwards — which is understandable! It’s what we expect from our experience of how the web has developed since at least the late 1990s. But here, something new happened.

    So, let me summarize what happened using yet another ordered list:

    1. Igalia noticed they’d done a fair bit of work adding features to all the browser engines (e.g., CSS Grid), with each project supported by a single paying client, and thought, “Wait a minute, the web is a commons. Why are features being driven one client at a time?”
    2. Of its own volition, Igalia decided to experiment with the idea of letting the web community (the “crowd”) vote for implementation of a missing browser feature with their wallets (the “funding”). They called this ongoing experiment Open Prioritization, and launched it in 2019.
    3. There were six possible projects, chosen by Igalia through their own set of criteria, for the community to vote on by pledging monetary support:
      • CSS lab() colors in Firefox
      • :focus-visible in WebKit/Safari
      • HTML inert in WebKit/Safari
      • Selector list arguments for :not() in Chrome
      • CSS Containment support in WebKit/Safari
      • CSS d (SVG path) support in Firefox
    4. The winner was implementing :focus-visible in WebKit/Safari, and by “winner”, I mean that project got the most monetary commitment from the members of the community.
    5. Igalia matched the community contributions dollar for dollar, and moved forward with the work.
    6. The work was done, and submitted to the WebKit code base. (Along the way, inconsistencies and other problems were discovered, addressed, and fixes contributed to engines other than WebKit.)
    7. The WebKit team accepted Igalia’s contributions, and are now shipping them in a preview build of Safari for developers to test out.

    In other words: the community (more precisely, a portion of it) voted on which feature was most needed, Igalia implemented it, and Apple accepted it. Apple’s role in this process came at the end, not the beginning.

    And no, this is not the usual thing! It’s not supposed to be. Igalia is deeply committed to not just advancing the web, but to an unprecedented extent democratizing that advancement. It isn’t anything like a pure democratic effort, at least not yet, but these are early days and the initiative is structured to meet the current constraints of the environment (read: living under capitalism means coders gotta get paid).

    But why is Igalia doing this? Time for another list! Just to switch things up, this one will be unordered:

    • Because the community should have more of a say in what gets prioritized in browsers. The community can be large collections of individuals, or it could be small collections of small companies, or a mix.
    • Because in every browser team, there’s always a priority list, and sometimes good features get pushed down that list for various reasons. It could be lack of expertise. It could be lack of time. It could be lack of interest. It could be interference by higher-ups. It doesn’t matter.
    • Because browser teams — not any one team, but the unfortunately small number of browser teams — are a bottleneck. No matter how much money the companies who employ those teams throw at them, they will always be a bottleneck, because resources are finite.

    And this brings us to why I think “Wait, shouldn’t the $browser_name team have already done $feature_name by now? Why did an outside party have to do it?” is a little short-sighted. There will always be a $feature_name that the $browser_name team hasn’t done yet, for any value of $browser_name you care to posit. Today it could be WebKit; tomorrow, Chromium. In ten years, maybe there will be teams at Amazon and Huawei, making browser engines that compete for user share. Maybe not. Doesn’t actually matter, because however many or few engines there are, no matter what their priorities are, this problem will persist.

    This is also why I’m not getting into Apple’s funding levels and priorities for WebKit and the web. Yes, there is much Apple-the-company can be criticized about, and personally, I am one of the biggest fans browser-engine diversity ever had, but that is a different conversation. Even if you could somehow wave a magic wand and open all platforms everywhere to engine diversity, and simultaneously cause a thousand browsers to bloom, we would still have the same basic problem. Open Prioritization would still need to exist.

    For another piece of evidence on that point, look at the second Open Prioritization project: MathML-Core, whose goal is to bring full cross-browser support for the MathML Core specification to browsers, starting with Chrome (which needs the most work in this area) and then moving on to other engines (which need less work, but still need work). Doing this will not only improve support for web-wide math markup and its visual rendering, but will also improve the accessibility of math content on the web by making math a first-class content type in browsers. And you can even now contribute to this effort with a pledge of your own!

    “But wait, why didn’t $browser_name already finish implementing MathML Core?” It doesn’t matter. Whether or not $browser_name (whichever one that is) should have done this by now, they haven’t. Maybe they would have done it eventually, but again, that doesn’t matter. We can make it happen now.

    That’s what happened with :focus-visible in WebKit, which helped improve other engines; it’s what will happen with MathML Core in various browsers; and it could very well be what happens with other features in the future. Igalia would love nothing more than to see more and more projects launch, even if they don’t get hired to do the work for a single one of them. This isn’t us spackling over the cracks of browser teams’ neglect. This is us trying to chart an entirely new way to advance browser engines.

    I go deeper into all of the above, as well as how Open Prioritization is designed to be an open forum and not some private reserve of Igalia’s, in a 17-minute talk delivered at W3C TPAC in fall 2021, available and captioned on Igalia’s YouTube channel. This post sort of summarizes it, but there are more examples and details in the talk, so if you’re interested, please do check that out.

    Just in case your eyes sort of glazed and you skipped to the end to see if there was a TL;DR, here it is:

    The addition of :focus-visible to WebKit was led by the community, done by Igalia, and contributed to WebKit without any involvement from Apple except in the sense of their reviewing patches and accepting the contributions. Many of us are mad at Apple for a lot of good reasons, but please don’t let the process of venting that anger tar the goals and achievements of Open Prioritization. The future browser-feature priority you save may be your own.



    by Eric Meyer at January 24, 2022 05:21 PM

    January 21, 2022

    Ziran Sun

    WPT Python 3 Migration

    In 2020, Igalia was involved in the Python 3 migration work for the web-platform-tests (WPT) project, with sponsorship from Google. After a year-long effort, the flag for Python 3 was switched on in WPT in December 2020. Now, over a year on, I have only just managed to write about this migration work. Better late than never, I hope :).

    Why migrate?

    Python 2 reached its end of life (EOL) on the 1st of January 2020, which marked the end of bugfix support and security patches for Python 2 from the Python maintainers. Code for the final Python 2 release, 2.7.18 (released in April 2020), was also frozen in January 2020. As a widely used cross-browser test suite for the web-platform stack, the web-platform-tests (WPT) project uses Python in many places, from infrastructure to test scripts. From the point of view of both maintenance and support for active development, it was imperative for WPT to make its code Python 3 compatible sooner rather than later.

    Challenges

    Both the dynamic quality of the Python language and the complexity of the WPT present significant challenges to the upgrade.

    Language challenges

    Python is a dynamically typed language. There are no formal semantics for Python. As its de facto reference implementation, CPython maintains high coding standards but is not written with legibility as its primary focus. This means that code paths in Python can contain illegal semantics that are hard to detect even with non-static analyzers. Python 3 is a new version of Python, but it’s not backwards compatible with code written for Python 2. The changes between Python 2 and Python 3 are not just syntactic; many of them are semantic. In particular, string literals are fundamentally different types in Python 2 and Python 3. Along with the change in the nature of the language, library support has also shifted. Many older libraries created for Python 2 are not forward-compatible, and many newer libraries can only be used with Python 3. We can run tools such as caniusepython3 to take in a set of dependencies and figure out which of them are holding us up from porting to Python 3. The tricky part, though, is finding and porting to replacement libraries that will work.

    Project challenges

    WPT is a massive suite of tests (over one million in total), and serves many auxiliary functions. It uses Python in many places including but not limited to:
    • The majority of the infrastructure code. This is the code underlying the main wpt commands, such as ‘wpt runner’.
    • WPT file handlers, which test authors can define to run custom code in response to a particular request to the WPT server.
    • WebDriver tests, which use pytest structured tests.
    • Linting
    • Interacting with Docker and CI systems
    • Rebasing expectations, …
    The complexity of the code base requires us to take a step back and have a good overview of the relations of the components that are involved and make a good plan on porting principles, pathways and methodologies.

    The Porting Plan

    The WPT community was well aware of the challenges of moving the project to Python 3. It set principles, suggested possible approaches and planned timelines before and while the major practical work took place.

    Principles

    • The migration work should happen in the background since the project is quite active. 
    • The pathway to Python 3 was to make code dual Python 2 and Python 3 compatible and gradually switch over the runtime to Python 3. 
    • The porting should not reduce test coverage without explicit agreement from test authors.

    Approaches

    To make the porting tractable, it was decided to start with two very specific goals, each approaching the problem from a different angle. One was to get the actual runner utility up and running in Python 3, starting by getting a basic ‘wpt run’ command to execute under Python 3. The other was to target wider test coverage by running all relevant unit tests under Python 3.

    Timelines

    For a project of non-trivial size like WPT, a flag day transition from Python 2 to Python 3 was simply not viable at the early stage of the project. Before 2020, there were already a few in-depth discussions and some work going on within the community for the migration. The major work, though, happened in 2020. As the porting progressed, the timelines became clearer. A concrete timeline for dropping Python 2 support in WPT was set in September 2020:
    • “Py3-first” targeting 2021-01-01: switch test runs to Python 3 on CI, but keep running unit tests and infrastructure tests in Python 2 and 3.
    • “Py3-only” on 2021-02-01: drop all Python 2 tests from CI, and start accepting Python 3-only changes.
    WPT successfully moved to the “Py3-first” stage before the targeted date. The minimum Python 3 version supported for this move is 3.6, with the main focus on 3.8+.

    Implementations

    Porting test runner utility

    As we mentioned earlier, one of the starting points was to have the actual runner utility, the ‘wpt run’ command, execute under Python 3. This porting was pretty straightforward. We came across some typical Python 2 to Python 3 migration issues, such as:
    • absolute imports. Absolute imports have become the default in Python 3 and relative imports should be explicit. For example, “from conftest import product, flatten” in Python 2 needs to be declared as “from .conftest import product, flatten” in Python 3.
    • built-in types comparison. In Python 2, objects of different built-in types can be ordered: the choice of whether one object is smaller or larger than another is made arbitrarily but consistently within one execution of a program. In CPython 2, ‘mismatched’ types are generally ordered by type name (with numbers sorting before everything else), so a “list” compares greater than an “int”. In Python 3, ordering comparisons between mismatched types raise TypeError instead. For example, in Python 2, we have

    latest_release = 0
    version = [int(item) for item in m.groups()]
    if version > latest_release:

    This is not valid in Python 3, where comparing a list with an int raises TypeError. Instead, both sides need to be sequences of the same type, for example by declaring latest_release as latest_release = (0, 0, 0) and building version as a tuple.
    • API changes. There are some API changes between the two versions, for example the removal of the optional parameter strict from HTTPConnection(). In Python 2 we have httplib.HTTPConnection(self.host, self.port, strict=True, **conn_kwargs). In Python 3 it has become HTTPConnection(self.host, self.port, **conn_kwargs).
    • order of dict. In Python 2, dict is organized via a hash table that puts the keys into buckets according to their hash() value, so iteration order is not guaranteed. In CPython 3.6, dict retains insertion order as an implementation detail, and from Python 3.7 this is guaranteed by the language. One solution to make code work for both versions is to use the alternative type OrderedDict instead of the plain dict.
    • iteration. Python 3 changes the return values of several basic functions from lists to iterators or views, mainly because these usually consume less memory than lists. This change has little impact on common use cases. Furthermore, the iter* counterparts (which return iterators in Python 2) have been removed. To make code work for both versions, we can call the six library APIs, replacing these calls with six.iter* to avoid a memory regression in Python 2; for example, six.iteritems(dictionary) corresponds to dictionary.iteritems() in Python 2 and dictionary.items() in Python 3. six is a Python 2 and 3 compatibility library. It provides utility functions for smoothing over the differences between the Python versions, with the goal of writing Python code that is compatible with both. We called the six library APIs in a few places during the dual Python 2/3 compatible stage. These API calls were removed after WPT moved to Python 3 only.
    • Bytes vs. str. In Python 2, bytes is basically an alias of str. In Python 3, binary data is a different type from a string. We had to convert some binary data to string type in order to be compatible with both Python 2 and Python 3. This issue, at the utility script level, presented different challenges from those at the core level, which we discuss in the next section. Most cases in the utility scripts can be resolved by adding a prefix to quoted string literals. Quoted string literals can be prefixed with “b” or “u” to get bytes or Unicode, respectively. In other words, prefix a native string with “u” in Python 2 to get a Unicode object, and prefix it with “b” in Python 3 to get bytes. Note that in Python 3 the “u” prefix does nothing; likewise, the “b” prefix does nothing in Python 2. In the context of this blog post, we are mostly talking about prefixing a native string with “b” to get bytes in Python 3.
    There were also a few other issues, such as integer division, the use of exceptions and calls to print, but they were generally very minor and easy to resolve.
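
    Several of the issues listed above can be seen in a few lines of Python 3. The snippet below is only an illustrative sketch and is not taken from the WPT code base; the variable names are made up for the example.

```python
# Python 3 only: a few of the Python 2 -> 3 differences listed above.

# 1. Ordering comparisons between mismatched built-in types raise TypeError.
version = [1, 2, 3]
try:
    version > 0
    mismatched_ok = True
except TypeError:
    mismatched_ok = False

# Comparing like with like works: sequences compare element by element.
latest_release = (0, 0, 0)
newer = tuple(version) > latest_release

# 2. dict.items() returns a view, not a list; wrap it in list() when a
#    real list is needed (for example, to index into it).
counts = {"pass": 3, "fail": 1}
items_view = counts.items()
items_list = list(items_view)

# 3. String literal prefixes: b"..." is bytes, u"..." (or no prefix) is str.
raw = b"text/html"
text = u"text/html"
same_after_decode = raw.decode("ascii") == text

# 4. "/" is true division; "//" is floor division.
true_div = 7 / 2
floor_div = 7 // 2
```

    Under Python 2 the first comparison would have succeeded silently, which is exactly why this kind of code is easy to miss during a port.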

    Handling string types in core

    One of the biggest hurdles in our porting effort was how to overcome the string literal type mismatch between Python 2 and 3 in core, specifically in infrastructure and file handlers. As we discussed earlier, in Python 2, a string literal is a sequence of bytes. In Python 3, a string literal is a sequence of Unicode code points. The rationale behind the change was to move to a Unicode-by-default world. The Web Platform Test Server (wptserve), however, often intends to use byte sequences. To overcome this mismatch, we need to either always use byte sequences or always use str. [RFC49] illustrates the pros and cons of both approaches. It was decided within the community to take the byte sequence path in order to keep a consistent and semantically correct encoding model; that is, to always use byte sequences: str in Python 2 and bytes in Python 3. This required some noticeable changes in WPT core. In wptserve:
    • A pair of ISO-8859-1 encode and decode helper functions was introduced. Both of them accept either binary or text strings, but always return binary or text strings respectively, regardless of the Python version.
    • Most public APIs for custom handlers only accept and return binary strings, with the notable exception of the response body.
    For Python file handlers, string types were specified for headers on both requests and responses, request URL/form parameters, response bodies, etc. After the necessary changes in the core part were done, Robert Ma (@robertma) and Stephen Mcgruer (@smcgruer) from Google created the porting guidelines. Based on the guidelines, we re-examined pretty much every line of the test scripts in the existing handlers to add prefixes to string literals where necessary. Here we’d like to walk through some examples of porting handler-related tests following the guidelines, and share some tips.
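
    To give an idea of what such a pair of ISO-8859-1 helpers might look like, here is a minimal Python 3 sketch. The function names to_bytes_latin1 and to_text_latin1 are hypothetical and chosen for this example; they are not the names used in wptserve, and the real helpers may differ in detail.

```python
def to_bytes_latin1(value):
    """Return value as bytes; text is encoded as ISO-8859-1."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("iso-8859-1")
    raise TypeError("expected bytes or str, got %r" % type(value))


def to_text_latin1(value):
    """Return value as text; bytes are decoded as ISO-8859-1."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode("iso-8859-1")
    raise TypeError("expected bytes or str, got %r" % type(value))


# ISO-8859-1 maps every byte value 0-255 to the code point with the same
# number, so any byte sequence survives a decode/encode round trip.
all_bytes = bytes(range(256))
round_trip = to_bytes_latin1(to_text_latin1(all_bytes))
```

    ISO-8859-1 is a natural fit here because decoding can never fail on arbitrary bytes. Encoding text that contains code points above U+00FF still raises UnicodeEncodeError, so callers need to know their input fits in that range.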

    Writing Python 3 compatible tests

    According to the guidelines, the rule of thumb for porting is to make sure all strings are either always text or always bytes; all string literals in handlers should be prefixed with "b" or "u".

    Headers of request and response

    Header data should always be binary strings for both keys and values. Prefer adding "b" prefixes to encoding/decoding.
    • The Request.headers dictionary-like interface (accessed via […], get, items).
    headers = [(b"Content-Type", b"text/html")]
    if b"allow_csp_from" in request.GET:
        headers.append((b"Allow-CSP-From", request.GET[b"allow_csp_from"]))
    • The Request.headers.get_list method example:
    assert isinstance(headers.get_list(b'x-bar')[0], bytes)
    • Response.headers.{get,set,append,update,items} examples:
    response.headers.set(b'Access-Control-Allow-Origin', request.headers.get(b"origin"))
    response.headers.append(b"Access-Control-Allow-Origin", b"*")

    HTTP Basic Authentication

    Request.auth.{username,password} are binary strings. For example,
    response.headers.set(b'Access-Control-Allow-Origin', request.headers.get(b"origin"))
    response.headers.append(b"Access-Control-Allow-Origin", b"*")
    response.headers.set(b'Content-type', b'text/plain')
    content = b""

    Cookies

    • Request.cookies (similar to Request.headers; it’s a MultiDict with all APIs of dict plus first, last, get_list). For example,
    response.content = request.cookies[b"foo"].value
    • Response.{set,unset,delete}_cookie.
    response.set_cookie(b"name", b"value")
    response.unset_cookie(b"name")

    Request URL/form parameters

    • Both the keys and values of URL/form parameters for the request (accessible via request.GET or request.POST) are all binary strings. Prefer adding “b” prefixes to encoding/decoding.
    b"realm" in request.POST
    request.GET.first(b"type", None) == b"value"

    Response Status Message

    • The response status message is a binary string, as follows.
    response.status = 401
    response.headers.set(b'Status', b'401 Authorization required')
    response.headers.set(b'WWW-Authenticate', b'Basic realm="test"')

    Response body

    The data put into the response body can be either text or binary strings, but the two types should never be mixed, and string literals must be prefixed. For example,
    response.writer.write(b"This is a body!")
    return u"Hello, 世界!"
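
    Putting the rules above together, a bytes-only handler might look roughly like the sketch below. FakeParams and FakeRequest are stand-ins invented for this example so that it runs without a WPT checkout; they only mimic the small part of request.GET used here. The main(request, response) entry point and the (headers, body) return value follow the handler convention as I understand it, so treat those as assumptions too.

```python
class FakeParams(dict):
    """Stand-in for request.GET: bytes keys and bytes values."""

    def first(self, key, default=None):
        return self.get(key, default)


class FakeRequest:
    """Stand-in for the request object passed to a handler."""

    def __init__(self, params):
        self.GET = FakeParams(params)


def main(request, response):
    # Header names and values are always bytes.
    headers = [(b"Content-Type", b"text/plain")]
    if b"allow_csp_from" in request.GET:
        headers.append((b"Allow-CSP-From", request.GET[b"allow_csp_from"]))

    # URL parameters are bytes too, so the body is built from bytes only
    # and text is never mixed in.
    name = request.GET.first(b"name", b"world")
    body = b"Hello, " + name + b"!"
    return headers, body


headers, body = main(FakeRequest({b"name": b"WPT"}), None)
```

    Keeping every literal prefixed makes a mixed-type mistake fail loudly in Python 3, since concatenating bytes with str raises TypeError.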

    Status

    WPT successfully moved to the “Py3-first” stage in December 2020. In February 2021 it dropped all Python 2 tests from CI, and started accepting Python 3-only changes.

    by zsun at January 21, 2022 10:05 AM

    January 20, 2022

    Tim Chevalier

    Implementing records in Warp

    Toward the goal of implementing records and tuples in Warp, I’m starting with code generation for empty records. The two modules I’ve been staring at the most are WarpBuilder.cpp and WarpCacheIRTranspiler.cpp.

    While the baseline compiler translates bytecode and CacheIR directly into assembly code (using the MacroAssembler), the optimizing compiler (Warp) uses two intermediate languages: it translates bytecode and CacheIR to MIR; MIR to LIR; and then LIR to assembly code.

    As explained in more detail here, WarpBuilder takes a snapshot (generated by another module, WarpOracle.cpp) of running code, and for each bytecode, it generates MIR instructions, either from CacheIR (for bytecode ops that can have inline caches), or directly. For ops that can be cached, WarpBuilder calls its own buildIC() method, which in turn calls the TranspileCacheIRToMIR() method in WarpCacheIRTranspiler.

    A comment in WarpBuilderShared.h says “Because this code is used by WarpCacheIRTranspiler we should generally assume that we only have access to the current basic block.” From that, I’m inferring that WarpCacheIRTranspiler maps each CacheIR op onto exactly one basic block. In addition, the addEffectful() method in WarpCacheIRTranspiler enforces that each basic block contains at most one effectful instruction.

    In the baseline JIT implementation that I already finished, the InitRecord and FinishRecord bytecodes each have their own corresponding CacheIR ops; I made this choice by looking at how existing ops like NewArray were implemented, though in all of these cases, I’m still not sure I fully understand what the benefit of caching is (rather than just generating code) — my understanding of inline caching is that it’s an optimization to avoid method lookups when polymorphic code is instantiated repeatedly at the same type, and in all of these cases, there’s no type-based polymorphism.

    I could go ahead and add InitRecord and FinishRecord into MIR and LIR as well; this would be similar to my existing code where the BaselineCacheIRCompiler compiles these operations to assembly. To implement these operations in Warp, I would add similar code to CodeGenerator.cpp (the module that compiles LIR to assembly) as what is currently in the BaselineCacheIRCompiler.

    But, MIR includes some lower-level operations that aren’t present in CacheIR — most relevantly to me, operations for manipulating ObjectElements fields: Elements, SetInitializedLength, and so on. Using these operations (and adding a few more similar ones), I could translate FinishRecord to a series of simpler MIR operations, rather than adding it to MIR. To be more concrete, it would look something like:

    (CacheIR)
    
    FinishRecord r
    
    == WarpCacheIRTranspiler ==>
    
    (MIR)
    
    e = Elements r
    Freeze e
    sortedKeys = LoadFixedSlot r SORTED_KEYS_SLOT
    sortedKeysElements = Elements sortedKeys
    CallShrinkCapacityToInitializedLength sortedKeys
    SetNonWritableArrayLength sortedKeysElements
    recordInitializedLength = InitializedLength r
    SetArrayLength sortedKeysElements recordInitializedLength
    CallSort sortedKeys
    

    (I’m making up a concrete syntax for MIR.)

    This would encapsulate the operations involved in finishing a record, primarily sorting the keys array and setting flags to ensure that the record and its sorted keys array are read-only. Several of these are already present in MIR, and the others would be easy to add, following existing operations as a template.

    The problem with this approach is that FinishRecord in CacheIR would map onto multiple effectful MIR instructions, so I can’t just add a case for it in WarpCacheIRTranspiler.

    I could also push the lower-level operations up into CacheIR, but I don’t know if that’s a good idea, since presumably there’s a reason why it hasn’t been done already.

    To summarize, the options I’m considering are:

    1. Pass down InitRecord and FinishRecord through the pipeline by adding them to MIR and LIR
    2. Open up FinishRecord (InitRecord isn’t as complicated) in the translation to MIR, which might involve making FinishRecord non-cacheable altogether
    3. Open up FinishRecord in the translation to CacheIR, by adding more lower-level operations into CacheIR

    I’ll have to do more research and check my assumptions before making a decision. A bigger question I’m wondering about is how to determine if it’s worth it to implement a particular operation in CacheIR at all; maybe I’m going about things the wrong way by adding the record/tuple opcodes into CacheIR right away, and instead I should just be implementing code generation and defer anything else until benchmarks exist?

    by Tim Chevalier at January 20, 2022 07:31 AM