Planet Igalia

June 24, 2022

Tim Chevalier

on impostor syndrome, or: worry dies last

According to Sedgwick, it was just this kind of interchange that fueled her emotional re-education. She came to see that the quickness of her mind was actually holding back her progress, because she expected emotional change to be as easy to master as a new theory: “It’s hard to recognize that your whole being, your soul doesn’t move at the speed of your cognition,” she told me. “That it could take you a year to really know something that you intellectually believe in a second.” She learned “how not to feel ashamed of the amount of time things take, or the recalcitrance of emotional or personal change.”

Maria Russo, “The reeducation of a queer theorist”, 1999

My colleague Ioanna Dimitriou told me “worry dies last”, and it made me remember this passage from an interview with Eve Kosofsky Sedgwick.

It’s especially common in fields where people’s work is constantly under review by talented peers, such as academia or Open Source Software, or taking on a new job.

Geek Feminism Wiki, “Impostor Syndrome”

At the end of 2012/beginning of 2013 I wrote a four-part blog post about my experiences with impostor syndrome. That led to me getting invited to speak on episode 113 of the “Ruby Rogues” podcast, which was dedicated to impostor syndrome. (Unfortunately, from what I can tell, their web site is gone.)

Since then, my thinking about impostor syndrome has changed.

“Impostor syndrome” is an entirely rational behavior for folks who do get called impostors (ie. many underrepresented people). It’s part coping mechanism, part just listening to the feedback you’re getting….

We call it “impostor syndrome”, but we’re not sick. The real sickness is an industry that calls itself a meritocracy but over and over and over fails to actually reward merit.

This is fixable. It will take doing the work of rooting out bias in all its forms, at all levels – and critically, in who gets chosen to level up. So let’s get to work.

Leigh Honeywell, “Impostor Syndrome”, 2016

I agree with everything Leigh wrote here. Impostor syndrome, like any response to past or ongoing trauma, is not a pathology. It’s a reasonable adaptation to an environment that places stresses on your mind and body that exhaust your resources for coping with those demands. I wrote a broader post about this point in 2016, called “Stop Moralizing About Personality Traits”.

Acceptance is the first step towards change. By now, I’ve spent over a decade consciously reckoning with the repercussions of growing up and into young adulthood without emotional support, both on the micro-level (family and intimate relationships) and the macro-level (being a perennial outsider with no home on either side of a variety of social borders: for example, those of gender, sexuality, disability, culture, and nationality). When I started my current job last year, I wasn’t over it. That made it unnecessarily hard to get started and put up a wall between me and any number of people who might have offered help if they’d only known what I was going through. I’m still not over it.

To recognize, and name as a problem, the extent to which my personality has been shaped by unfair social circumstances: that was one step. Contrary to my acculturation as an engineer, the next step is not “fix the problem”. In fact, there is no patch you can simply apply to your own inner operating system, because all of your conscious thoughts run in user space. Maybe you can attach a debugger to your own kernel, but some changes can’t be made to a running program without a cold reboot. I don’t recommend trying that at home.

Learning to identify impostor syndrome (or, as you might call it, “dysfunctional environment syndrome”; or, generalizing, “complex trauma” or “structural violence”) is one step, but a bug report isn’t the same thing as a passing regression test. As with free software, improvement has to come a little bit at a time, from many different contributors; there are few successful projects with a single maintainer.

I am ashamed of the amount of time things take, of looking like a senior professional on the outside as long as my peers don’t know (or aren’t thinking about) how I’ve never had a single job in tech for more than two years, about what it was like for me to move from job to job never picking up enough momentum to accomplish anything that felt real to me. I wonder whether they think I’ve got it all figured out, which I don’t, but it often feels easier to just let people think that and suffer in silence. Learning to live with trauma requires trusting relationships; you can’t do it on your own. But the trauma itself makes it difficult to impossible to trust and to enter into genuine relationships.

I am not exaggerating when I say that my career has been traumatic for me; it has both echoed much older traumas and created entirely new ones. That’s a big part of why I had to share how I felt about finally meeting more of my co-workers in person. I’m 41 years old and I feel like I should be better at this by now. I’m not. But I’ll keep trying, if it takes a lifetime.

by Tim Chevalier at June 24, 2022 12:52 PM

June 23, 2022

Tim Chevalier

we belong

I am about 70% robot and 30% extremely sentimental and emotional person, generally in series rather than in parallel. But last week’s Igalia summit was a tidal wave of feelings, unexpected but completely welcome. Some of those feelings are ones I’ve already shared with the exact people who need to know, but there are some that I need to share with the Internet. I am hoping I’m not the only one who feels this way, though I don’t think I am.

A lot of us are new and this was our first summit. Meeting 60 or 70 people who were previously 2D faces on a screen for half an hour a week, at best, was intense. I was told by others that reuniting with long-time friends/colleagues/comrades/whatever words you want to use (and it’s hard to find the exact right one for a workplace like this) who they hadn’t seen since pre-pandemic was intense as well.

For me, there was more to it. I doubt I’m alone in this either, but it might explain why I’m feeling so strongly.

I tried to quit tech in 2015. I couldn’t afford to in the end, and I went to Google. They fired me for (allegedly) discriminating against white men, in late 2017. I decided it was time to quit again. I became an EMT and then a patient care coordinator, and applied to nursing schools. I got rejected. I decided I didn’t want to try again because I had learned that unless I became a physician, working in health care would never give me the respect I need. Unfortunately, I have an ego. I like to think that I balance it out with empathy more than some people in tech do, but it’s still there.

I got a DM in 2018 from some guy I knew from Twitter asking if I wanted to apply to Igalia, and I waited three years to take him up on it. Now I’m here.

Getting started wasn’t easy. The two weeks working from the office before the summit wasn’t easy either. But it all fell away sometime between Wednesday and Friday of last week, and quite unexpectedly, I decided I’m moving to Europe as soon as I can, probably to A Coruña (where Igalia’s headquarters is) at first but who knows where life will take me next? Listing all the reasons would take too long. But: I found a safe space, one where I felt welcome, accepted, like I belonged. It’s not a perfect one; I stood up during one of the meetings and expressed my pain at the dissonance between the comfort I feel here and the knowledge that most of the faces in the room were white and most belonged to men. I want to say we’re working on it, but our responsibility is to finish the work, not to feel good that we’ve started it. That’s true for writing code to deliver to a customer, and it’s true for achieving fairness.

I am old enough now to accommodate multiple conflicting truths. My desire to improve the unfairness, and to get other people to open their hearts enough to risk all-consuming rage at just how unfair things can be, those things coexist with my joy about finding a group of such consistently caring, thoughtful, and justice-minded people — ones who don’t seem to mind having me around.

I’m normally severely allergic to words like “love” and “family” in a corporate context. As an early childhood trauma survivor, these words are fraught, and I would rather keep things at work a bit more chill. At the same time, when I heard Igalians use these words during the summit to talk about our collective, it didn’t feel as menacing as it usually does. Maybe the right word to use here — the thing that we really mean when we generalize the words “love” and “family” because we’ve been taught (incorrectly) that it can only come from our lovers or parents — is “safety”. Safety is one of the most underrated concepts there is. Feeling safe means believing that you can rely on the people around you, that they know where you’re coming from or else if they don’t, that they’re willing to try to find out, that they’re willing to be changed by what happens if they do find out. I came in apprehensive. But in little ways and in big ways, I found safe people, not just one or two but a lot.

I could say more, but if I did, I might never stop. To channel the teenaged energy that I’m feeling right now (partly due to reconnecting with that version of myself who loved computers and longed to find other people who did too), I’ll include some songs that convey how I feel about this week. I don’t know if this will ring true for anyone else, but I have to try.

Allette Brooks, “Silicon Valley Rebel”

We lean her bike along the office floor
They never know what to expect shaved into the back of her head when she walks in the door
And she says ‘I don’t believe in working like that for a company
It’s not like they care about you
It’s not like they care about me’

Please don’t leave us here alone in this silicon hell, oh
Life would be so unbearable without your rebel yell...

Vienna Teng, “Level Up”

Call it any name you need
Call it your 2.0, your rebirth, whatever –
So long as you can feel it all
So long as all your doors are flung wide
Call it your day number 1 in the rest of forever

If you are afraid, give more
If you are alive, give more now
Everybody here has seams and scars

Namoli Brennet “We Belong”

Here's to all the tough girls
And here's to all the sensitive boys
We Belong
Here's to all the rejects
And here's to all the misfits
We Belong

Here's to all the brains and the geeks
And here's to all the made up freaks, yeah
We Belong

And when the same old voices say
That we'd be better off running away
We belong, We belong anyway

The Nields, “Easy People”

You let me be who I want to be

Bob Franke, “Thanksgiving Eve”

What can you do with your days
But work and hope
Let your dreams bind your work to your play

And most of all, the Mountain Goats, “Color in Your Cheeks”

They came in by the dozens, walking or crawling
Some were bright-eyed, some were dead on their feet
And they came from Zimbabwe, or from Soviet Georgia
East Saint Louis, or from Paris, or they lived across the street
But they came, and when they finally made it here
It was the least that we could do to make our welcome clear

Come on in
We haven't slept for weeks
Drink some of this
This'll put color in your cheeks

This is a different kind of post from the ones I was originally planning to do on this blog. And I never thought I’d be talking about my job this way. Life comes at you fast. To return to the Allette Brooks lyric: it’s because at a co-op, there’s no “they” that’s separate from “you” and “me”. It’s just “you” and “me”, and we care about each other. It turns out that safe spaces and cooperative structure aren’t just political ideas that happen to correlate — in a company, neither can exist without the other. It’s not a safe space if you can get fired over one person’s petty grievance, like being reminded that white men don’t understand everything. Conversely, cooperative structure can’t work without deep trust. Trust is hard to scale, and as Igalia grows I worry about what will happen (doubtless, the people who were here when it was a tenth of the size have a different view). There is no guarantee of success, but I want to be one of the ones to try.

And we’re hiring. I know how hard it is to try again when you’ve been humiliated, betrayed, and disappointed at work before, something that’s more common than not in tech when you don’t look like, sound like, or feel like everybody else. I’m here because somebody believed in me. I’m glad both that they did and that I was able to return that leap of faith and believe that I truly was believed in. And I would like to pass along the favor to you. Of course, to do that, I have to get to know you a little bit first. As long as I continue to have some time, I want to talk to people in groups that are systematically underrepresented in tech who may be intrigued by what I wrote here but aren’t sure if they’re good enough. My email is tjc at (obvious domain name) and the jobs page is at https://www.igalia.com/jobs. Even if you don’t see a technical role there that exactly fits you, please don’t let that stop you from reaching out; what matters most is willingness to learn and to tolerate the sometimes-painful, always-rewarding process of creating something together with mutual consent rather than coercion.

by Tim Chevalier at June 23, 2022 09:33 PM

June 22, 2022

Brian Kardell

Achievement Unlocked: Intent to Mathify

We are about to reach a unique and honestly pretty epic moment in standards history, and most people probably won’t even notice. It’s also personally meaningful to me, so I’d like to tell you about it and why it’s worth celebrating…

Igalia just filed an Intent to Ship support for MathML-Core in Chromium. If you’re not familiar with what an intent to ship means: the Blink release process involves several stages before something gets to ship. An intent to ship marks the beginning of the final step, which allows the feature to just work by default in stable releases.

I know, I know. Many website developers will think “that doesn’t help me much and I have all these other problems which I would love to solve instead”. But… that’s kind of the problem, and why this is momentous in several ways.

MathML has a wildly interesting history. At some level, the need to be able to display mathematical text seems like it would have been obvious from the web’s start at CERN, right? It was! In fact, support for rendering some math existed in CERN’s experimental browser in 1993. Graphics too. It’s unsurprising, then, that SVG and MathML were among the first active working groups at the W3C when it was established. MathML was, in fact, the first XML-oriented standard that the W3C ever published, with its first recommendation coming in April 1998. For reference, that’s over a year before HTML 4.01 reached REC… During the “HTML5” split, it, along with SVG, was specially integrated into the new, very well defined parser.

So why is it suddenly “news”?

Well… It is complicated, and I wrote one of my personal favorite pieces explaining it all at the beginning of 2019, before I came to work at Igalia. Really, I think it is enjoyable and you should read it, but the TL;DR version is that standards implementation, and its prioritization, is voluntary. In the same way that many developers are focused on shopping and selling and animations and layout and better modularization and… lots of other problems - there were just more appealing things on implementers’ plates which would appease a wider audience. There are millions of equations in Wikipedia alone, but we don’t think about it so much because we’ve developed plenty of clever (and often not-so-clever) workarounds to get by “until the last one lands”.

And so progress was slower than normal. Volunteers did an amazing amount of the actual implementation work in many cases. And then, just as it looked like we were about to cross a finish line, Chrome forked WebKit and decided that - for now - they were going to rip out the newly landed MathML, which had some problems, to make it easier for them to refactor major parts of the engine. And then the way we do standards changed; we got more rigorous over the years. Basically - the story just kept getting worse. It was almost like we were going backwards for math on the web.

By 2018 or so, the situation was looking very hard to put right. Igalia was presented with many arguments about how hard the problem was and the scale of actually righting the ship. It was more than just one more implementation, which would already have been a huge effort - it was about re-establishing a working group, specifying all of the previously unspecified things in a way that fit the platform (coordinating with many other working groups and the WHATWG on various details), going through a review with the W3C Technical Architecture Group, and so on.

But, here we are.

Setting this right isn’t just historically unique in that way either: I’m pretty sure it’s safe to say that no non-browser organization has ever landed something of this kind of scale before - and they’ve never done it with some aid from various funding sources either. While Igalia wound up footing the lion’s share of the bill ourselves, we also had financial support at stages from NISO and the Alfred P Sloan Foundation, APS Physics, and Pearson and a small collection of donors that included $75k from two people. Really, in many ways, that effort was the pre-cursor to our whole Open Prioritization idea.

Personal notes

For me, I have a lot of personal connections to this that make it meaningful. As I said in Harold Crick and the Web Platform, I might be partially responsible for its delays.

Little did he know, he’d get deeply involved in helping solve that problem.

When I came to Igalia, it was one of my first projects.

I helped fix some things. The first Web Platform Test I ever contributed was for MathML. I think the first WHATWG PR I ever sent was on MathML. The first BlinkOn Lightning Talk I ever did was about - you guessed it, MathML. The first W3C Charter I ever helped write? That’s right: About MathML. The first actual Working Group I’ve ever chaired (well, co-chaired) is about Math. The first explainer I ever wrote myself was about MathML. The first podcast I ever hosted was on… Guess what? MathML. And so on.

And here’s the thing: I am perhaps the least mathematically minded person you will ever meet. But we did the right thing. A good thing. And we did it a good way, and for good reasons.

A few episodes later on our podcast we had Rick Byers from Chrome and Rossen Atanassov from Microsoft on the show, and Rick brought up MathML. Both of them were hugely impressed and supportive of it even back then - Rick said

I fully expect MathML to ship at some point and it’ll ship in Chrome… Even though from Google’s business perspective, it probably wouldn’t have been a good return on investment for us to do it… I’m thrilled that Igalia was able to do it.

By then, there was a global pandemic underway and Rossen pointed out…

I’m looking forward to it. Look, I… I’m a huge supporter of having a native math into into the engines, MathML… And having the ability to, at the end of the day, increase the edu market, which will benefit the most out of it. Especially, you know, having been through one full semester of having a middle school student at home, and having her do all of her work through online tools… Having better edu support will go a long way. So, thank you on behalf all of the students, future students that will benefit

And… yeah, that’s kind of it right? There’s this huge thing that has no special seeming obvious ROI for browsers, that doesn’t have every developer in the world screaming for it but it’s really great for society and kind of important to serve this niche because it underpins the ability of student, physicists, mathematicians etc to communicate with text.

Mission Accomplished Progress

Anyway… Wow. This is huge, I think, in so many ways.

I’m gonna go raise a glass to everyone who helped achieve this astonishingly huge thing.

I think it says so much about our ability to do things together and promise for the ecosystem and how it could work.

I sort of wish I could just say “Mission Accomplished” but the truth is that this is a beginning, not an end. It means we have really good support for parsing and rendering (assuming some math-capable fonts) tons and tons of math interoperably, and a spec for how it needs to integrate with the whole rest of the platform, but only one implementation of that bit. Now we have to align the other implementations the same way, so that we really have just One Platform and all of it can move forward together and no part gets left behind again.

Then, beyond that is MathML-Core Level 2. Level 1 draws a box around what was practically able to be aligned in the first pass - but it leaves a few hard problems on the table which really need solving. Many of these have partial solutions in the other two browsers already but are hard to specify and integrate.

I have a lot of faith that we can reach both of those goals, but to do it takes investment. I really hope that reaching this milestone helps convince organizations and vendors to contribute toward reaching them. I’d encourage everyone to give the larger ecosystem some thought and consider how we can support good efforts - even maybe directly.

Help us support this work through Open Prioritization / Open Collective

If you'd like to understand the argument for this better, my friend and colleague at Igalia Eric Meyer presented an excellent talk when we announced it...

June 22, 2022 04:00 AM

June 21, 2022

Ziran Sun

My first time attending Igalia Summit

With employees distributed across over 20 countries and most working remotely, Igalia holds summits twice a year to give employees opportunities to meet up face-to-face. A summit normally runs over a week, with code camps, team building, and recreational activities. This year’s summer summit was held between the 15th and 19th of June in A Coruña, Galicia.

I joined Igalia at the beginning of 2020. Because of the COVID-19 pandemic, this was my first time attending an Igalia Summit. Due to my personal schedule, I only managed to stay in A Coruña for three nights. The overall experience was great and I thoroughly enjoyed it!

Getting to know A Coruña

Igalia’s HQ is based in A Coruña in Galicia, Spain. A beautiful white-sanded beach is just a few yards away from the hotel we stayed. I did a barefoot stroll along the beach in one morning. The beach itself was reasonably quiet. There were a few people swimming in the shallow part of the sea. Weather that morning was perfect with warm sunshine and occasional cool gentle breezes.

The smell of the sea in the air and people wandering around relaxed in the evenings somehow brought a familiar feeling and reminded me of my home town.

On Wednesday I joined a guided visit to A Coruña set in 1906. It was a very interesting walk around the historic part of the city. The tour guide, Suso Martín, put on an amazing one-man show, walking us through A Coruña‘s history back to the 12th century and the historic buildings, religion, and romances associated with the city.

On Thursday evening we went on a guided tour of MEGA, Mundo Estrella Galicia. En route we passed the office that Igalia occupied in its early days. According to some Igalians who worked there, it was a small office with just over 10 Igalians in those days. Today Igalia has over 100 employees, and last year we celebrated our 20th anniversary in Open Source.

The highlights of the MEGA tour for me were watching the production line at work and tasting the beers. We were spoiled with beers and local cheeses.

Meeting other Igalians

Since I joined Igalia, all meetings had happened online due to the pandemic. I was glad that I could finally meet other Igalians “physically” 😊. During the summit I had chances to chat with other Igalians at the code camp, during meals, and on the guided tours. It was pleasant. Playing board games together after an evening meal (I’d call it a “night meal”) was great fun. Some Igalians can be very funny and witty. I had quite a few 😂 moments. It was very enjoyable.

During the team code camp, I had the chance to spend more time with my teammates. The technical meetings on both days were very engaging and effective. I was very happy to see Javi Fernández in person. Last year I was involved in CSS Grid compatibility work (one of the 5 key areas for the Compat2021 effort that Igalia was involved in). I personally had never touched CSS Grid layout at the beginning of the assignment. The web platform team in Igalia, though, has very good knowledge and working experience in this area. My teammates Manuel Rego, Javi Fernández and Sergio Villar were among the pioneer developers in this field. To ensure the success of the task, the team created a safe environment by providing me with constant guidance and support. Specifically, I had Javi’s help throughout the whole task. Javi has been an amazing technical mentor with great expertise and patience. We had numerous calls for technical discussions. The team also had a couple of debugging sessions with me for some very tricky bugs.

The team meal on Wednesday evening was very nice. Delicious food, great companions and nice fruity beers – what could go wrong? Well, we did walk back to the hotel in the rain…

Thanks

The event was very well organized and productive. I really appreciate the hard work and great effort those Igalians put in to make it happen. I just want to say – a big THANK YOU! I’m glad that I managed to make the trip. Traveling from the UK is not straightforward, but I’d say it’s well worth it 😁.

by zsun at June 21, 2022 01:02 PM

Andy Wingo

an optimistic evacuation of my wordhoard

Good morning, mallocators. Last time we talked about how to split available memory between a block-structured main space and a large object space. Given a fixed heap size, making a new large object allocation will steal available pages from the block-structured space by finding empty blocks and temporarily returning them to the operating system.

Today I'd like to talk more about nothing, or rather, why might you want nothing rather than something. Given an Immix heap, why would you want it organized in such a way that live data is packed into some blocks, leaving other blocks completely free? How bad would it be if instead the live data were spread all over the heap? When might it be a good idea to try to compact the heap? Ideally we'd like to be able to translate the answers to these questions into heuristics that can inform the GC when compaction/evacuation would be a good idea.

lospace and the void

Let's start with one of the more obvious points: large object allocation. With a fixed-size heap, you can't allocate new large objects if you don't have empty blocks in your paged space (the Immix space, for example) that you can return to the OS. To obtain these free blocks, you have four options.

  1. You can continue lazy sweeping of recycled blocks, to see if you find an empty block. This is a bit time-consuming, though.

  2. Otherwise, you can trigger a regular non-moving GC, which might free up blocks in the Immix space but which is also likely to free up large objects, which would result in fresh empty blocks.

  3. You can trigger a compacting or evacuating collection. Immix can't actually compact the heap all in one go, so you would preferentially select evacuation-candidate blocks by choosing the blocks with the least live data (as measured at the last GC), hoping that little data will need to be evacuated.

  4. Finally, for environments in which the heap is growable, you could just grow the heap instead. In this case you would configure the system to target a heap size multiplier rather than a heap size, which would scale the heap to be e.g. twice the size of the live data, as measured at the last collection.

If you have a growable heap, I think you will rarely choose to compact rather than grow the heap: you will either collect or grow. Under constant allocation rate, the rate of empty blocks being reclaimed from freed lospace objects will be equal to the rate at which they are needed, so if collection doesn't produce any, then that means your live data set is increasing and so growing is a good option. Anyway let's put growable heaps aside, as heap-growth heuristics are a separate gnarly problem.

The question becomes, when should large object allocation force a compaction? Absent growable heaps, the answer is clear: when allocating a large object fails because there are no empty pages, but the statistics show that there is actually ample free memory. Good! We have one heuristic, and one with an optimum: you could compact in other situations but from the point of view of lospace, waiting until allocation failure is the most efficient.
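
To make that concrete, here is a minimal C sketch of that decision, under assumed bookkeeping; the `heap_stats` structure, the helper name, and the 2x slack factor are illustrative, not from any particular collector.

```c
/* Hypothetical bookkeeping for the heuristic: trigger evacuation only when
   a large-object allocation fails for lack of empty blocks even though the
   heap as a whole has ample free memory. */
#include <stdbool.h>
#include <stddef.h>

struct heap_stats {
  size_t heap_bytes;         /* total heap size */
  size_t live_bytes;         /* live data measured at the last GC */
  size_t empty_block_bytes;  /* bytes currently available as empty blocks */
};

/* Called when a lospace allocation of `bytes` found no empty blocks. */
static bool should_evacuate_for_lospace(const struct heap_stats *s, size_t bytes) {
  size_t free_bytes = s->heap_bytes - s->live_bytes;
  /* Plenty of free memory overall, just not in the form of empty blocks:
     compaction is likely to pay off.  The 2x slack factor is arbitrary. */
  return s->empty_block_bytes < bytes && free_bytes >= 2 * bytes;
}
```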

shrinkage

Moving on, another use of empty blocks is when shrinking the heap. The collector might decide that it's a good idea to return some memory to the operating system. For example, I enjoyed this recent paper on heuristics for optimum heap size, that advocates that you size the heap in proportion to the square root of the allocation rate, and that as a consequence, when/if the application reaches a dormant state, it should promptly return memory to the OS.

Here, we have a similar heuristic for when to evacuate: when we would like to release memory to the OS but we have no empty blocks, we should compact. We use the same evacuation candidate selection approach as before, also, aiming for maximum empty block yield.

fragmentation

What if you go to allocate a medium object, say 4kB, but there is no hole that's 4kB or larger? In that case, your heap is fragmented. The smaller your heap size, the more likely this is to happen. We should compact the heap to make the maximum hole size larger.

side note: compaction via partial evacuation

The evacuation strategy of Immix is... optimistic. A mark-compact collector will compact the whole heap, but Immix will only be able to evacuate a fraction of it.

It's worth dwelling on this a bit. As described in the paper, Immix reserves around 2-3% of overall space for evacuation overhead. Let's say you decide to evacuate: you start with 2-3% of blocks being empty (the target blocks), and choose a corresponding set of candidate blocks for evacuation (the source blocks). Since Immix is a one-pass collector, it doesn't know how much data is live when it starts collecting. It may not know that the blocks that it is evacuating will fit into the target space. As specified in the original paper, if the target space fills up, Immix will mark in place instead of evacuating; an evacuation candidate block with marked-in-place objects would then be non-empty at the end of collection.

In fact if you choose a set of evacuation candidates hoping to maximize your empty block yield, based on an estimate of live data instead of limiting to only the number of target blocks, I think it's possible to actually fill the targets before the source blocks empty, leaving you with no empty blocks at the end! (This can happen due to inaccurate live data estimations, or via internal fragmentation with the block size.) The only way to avoid this is to never select more evacuation candidate blocks than you have in target blocks. If you are lucky, you won't have to use all of the target blocks, and so at the end you will end up with more free blocks than not, so a subsequent evacuation will be more effective. The defragmentation result in that case would still be pretty good, but the yield in free blocks is not great.
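
As an illustration, conservative candidate selection might look like the following C sketch, operating on per-block metadata; the `block_info` fields and function names are hypothetical.

```c
/* Sort block descriptors by live bytes measured at the last GC and mark at
   most as many evacuation candidates as there are empty target blocks, so
   the targets cannot fill up before the source blocks are emptied. */
#include <stddef.h>
#include <stdlib.h>

struct block_info {
  size_t live_bytes;
  int evacuation_candidate;
};

static int by_live_bytes(const void *a, const void *b) {
  const struct block_info *x = a, *y = b;
  return (x->live_bytes > y->live_bytes) - (x->live_bytes < y->live_bytes);
}

static void select_evacuation_candidates(struct block_info *blocks, size_t n_blocks,
                                         size_t n_target_blocks) {
  /* Least live data first: the cheapest blocks to empty out. */
  qsort(blocks, n_blocks, sizeof *blocks, by_live_bytes);
  for (size_t i = 0; i < n_blocks && i < n_target_blocks; i++)
    blocks[i].evacuation_candidate = 1;
}
```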

In a production garbage collector I would still be tempted to be optimistic and select more evacuation candidate blocks than available empty target blocks, because it will require fewer rounds to compact the whole heap, if that's what you wanted to do. It would be a relatively rare occurrence to start an evacuation cycle. If you ran out of space while evacuating, in a production GC I would just temporarily commission some overhead blocks for evacuation and release them promptly after evacuation is complete. If you have a small heap multiplier in your Immix space, occasional partial evacuation in a long-running process would probably reach a steady state with blocks being either full or empty. Fragmented blocks would represent newer objects and evacuation would periodically sediment these into longer-lived dense blocks.

mutator throughput

Finally, the shape of the heap has its inverse in the shape of the holes into which the mutator can allocate. It's most efficient for the mutator if the heap has as few holes as possible: ideally just one large hole per block, which is the limit case of an empty block.

The opposite extreme would be having every other "line" (in Immix terms) be used, so that free space is spread across the heap in a vast spray of one-line holes. Even if fragmentation is not a problem, perhaps because the application only allocates objects that pack neatly into lines, having to stutter all the time to look for holes is overhead for the mutator. Also, the result is that contemporaneous allocations are more likely to be placed farther apart in memory, leading to more cache misses when accessing data. Together, allocator overhead and access overhead lead to lower mutator throughput.

When would this situation get so bad as to trigger compaction? Here I have no idea. There is no clear maximum. If compaction were free, we would compact all the time. But it's not; there's a tradeoff between the cost of compaction and mutator throughput.

I think here I would punt. If the heap is being actively resized based on allocation rate, we'll hit the other heuristics first, and so we won't need to trigger evacuation/compaction based on mutator overhead. You could measure this, though, in terms of average or median hole size, or average or maximum number of holes per block. Since evacuation is partial, all you need to do is to identify some "bad" blocks and then perhaps evacuation becomes attractive.

gc pause

Welp, that's some thoughts on when to trigger evacuation in Immix. Next time, we'll talk about some engineering aspects of evacuation. Until then, happy consing!

by Andy Wingo at June 21, 2022 12:21 PM

Carlos García Campos

Thread safety support in libsoup3

In libsoup2 there’s some thread safety support that allows to send messages from a thread different than the one where the session is created. There are other APIs that can be used concurrently too, like accessing some of the session properties, and others that aren’t thread safe at all. It’s not clear what’s thread safe and even sending a message is not fully thread safe either, depending on the session features involved. However, several applications relay on the thread safety support and have always worked surprisingly well.

In libsoup3 we decided to remove the (broken) thread safety support and only allow using the API from the same thread where the session was created. This simplified the code and made it easier to add the HTTP/2 implementation. Note that HTTP/2 supports multiple requests over the same TCP connection, which is a lot more efficient than starting multiple requests from several threads in parallel.

When apps started to be ported to libsoup3, those that relied on the thread safety support ended up being a pain to port. Major refactorings were required to either stop using the sync API from secondary threads, or move all the soup usage to the same secondary thread. We managed to make it work in several modules like gstreamer and gvfs, but others like evolution required a lot more work. The extra work was definitely worth it and resulted in much better and more efficient code. But we also understand that porting an application to a new version of a dependency is not a top priority task for maintainers.

So, in order to help with the migration to libsoup3, we decided to add thread safety support to libsoup3 again, but this time trying to cover all the APIs involved in sending a message and documenting what’s expected to be thread safe. Also, since we didn’t remove the sync APIs, it’s expected that we support sending messages synchronously from secondary threads. We still encourage using only the async APIs from a single thread, because that’s the most efficient way, especially for HTTP/2 requests, but apps currently using threads can be easily ported first and then refactored later.

The thread safety support in libsoup3 is expected to cover only one use case: sending messages. All other APIs, including accessing session properties, are not thread safe and can only be used from the thread where the session is created.

There are a few important things to consider when using multiple threads in libsoup3:

  • In the case of HTTP/2, two messages for the same host sent from different threads will not use the same connection, so the advantage of HTTP/2 multiplexing is lost.
  • Only the API to send messages can be called concurrently from multiple threads. So, in case of using multiple threads, you must configure the session (setting network properties, features, etc.) from the thread where it was created and before any request is made.
  • All signals associated with a message (SoupSession::request-queued, SoupSession::request-unqueued, and all SoupMessage signals) are emitted from the thread that started the request, and all the IO will happen there too.
  • The session can be created in any thread, but all session APIs except the methods to send messages must be called from the thread where the session was created.
  • To use the async API from a thread different than the one where the session was created, the thread must have a thread-default main context where the async callbacks are dispatched.
  • The sync API doesn’t need any main context at all.

by carlos garcia campos at June 21, 2022 07:36 AM

June 20, 2022

Tiago Vignatti

Short blog post from Madrid's bus

This post was inspired by “Short blog post from Madrid’s hotel room” from my colleague Frédéric Wang. You should really check his post instead! To Fred: thanks for the feedback and review here. I’m in for football in the next Summit, alright? :-) This week, I finally went to A Coruña for the Web Engines Hackfest and internal company meetings. These were my first on-site events since the COVID-19 pandemic. After two years of non-super-exciting virtual conferences I was so glad to finally be able to meet with colleagues and other people from the Web.

by Tiago Vignatti at June 20, 2022 10:34 PM

Andy Wingo

blocks and pages and large objects

Good day! In a recent dispatch we talked about the fundamental garbage collection algorithms, also introducing the Immix mark-region collector. Immix mostly leaves objects in place but can move objects if it thinks it would be profitable. But when would it decide that this is a good idea? Are there cases in which it is necessary?

I promised to answer those questions in a followup article, but I didn't say which followup :) Before I get there, I want to talk about paged spaces.

enter the multispace

We mentioned that Immix divides the heap into blocks (32kB or so), and that no object can span multiple blocks. "Large" objects -- defined by Immix to be more than 8kB -- go to a separate "large object space", or "lospace" for short.

Though the implementation of a large object space is relatively simple, I found that it has some points that are quite subtle. Probably the most important of these points relates to heap size. Consider that if you just had one space, implemented using mark-compact maybe, then the procedure to allocate a 16 kB object would go:

  1. Try to bump the allocation pointer by 16kB. Is it still within range? If so we are done.

  2. Otherwise, collect garbage and try again. If after GC there isn't enough space, the allocation fails.

In step (2), collecting garbage could decide to grow or shrink the heap. However when evaluating collector algorithms, you generally want to avoid dynamically-sized heaps.

cheatery

Here is where I need to make an embarrassing admission. In my role as co-maintainer of the Guile programming language implementation, I have long noodled around with benchmarks, comparing Guile to Chez, Chicken, and other implementations. It's good fun. However, I only realized recently that I had a magic knob that I could turn to win more benchmarks: simply make the heap bigger. Make it start bigger, make it grow faster, whatever it takes. For a program that does its work in some fixed amount of total allocation, a bigger heap will require fewer collections, and therefore generally take less time. (Some amount of collection may be good for performance as it improves locality, but this is a marginal factor.)

Of course I didn't really go wild with this knob but it now makes me doubt all benchmarks I have ever seen: are we really using benchmarks to select for fast implementations, or are we in fact selecting for implementations with cheeky heap size heuristics? Consider even any of the common allocation-heavy JavaScript benchmarks, DeltaBlue or Earley or the like; to win these benchmarks, web browsers are incentivised to have large heaps. In the real world, though, a more parsimonious policy might be more appreciated by users.

Java people have known this for quite some time, and are therefore used to fixing the heap size while running benchmarks. For example, people will measure the minimum amount of memory that can allow a benchmark to run, and then configure the heap to be a constant multiplier of this minimum size. The MMTK garbage collector toolkit can't even grow the heap at all currently: it's an important feature for production garbage collectors, but as they are just now migrating out of the research phase, heap growth (and shrinking) hasn't yet been a priority.

lospace

So now consider a garbage collector that has two spaces: an Immix space for allocations of 8kB and below, and a large object space for, well, larger objects. How do you divide the available memory between the two spaces? Could the balance between immix and lospace change at run-time? If you never had large objects, would you be wasting space at all? Conversely is there a strategy that can also work for only large objects?

Perhaps the answer is obvious to you, but it wasn't to me. After much reading of the MMTK source code and pondering, here is what I understand the state of the art to be.

  1. Arrange for your main space -- Immix, mark-sweep, whatever -- to be block-structured, and able to dynamically decomission or recommission blocks, perhaps via MADV_DONTNEED. This works if the blocks are even multiples of the underlying OS page size.

  2. Keep a counter of however many bytes the lospace currently has.

  3. When you go to allocate a large object, increment the lospace byte counter, and then round that up to a number of blocks to decommission from the main paged space (see the sketch after this list). If this is more than are currently decommissioned, find some empty blocks and decommission them.

  4. If no empty blocks were found, collect, and try again. If the second try doesn't work, then the allocation fails.

  5. Now that the paged space has shrunk, lospace can allocate. You can use the system malloc, but probably better to use mmap, so that if these objects are collected, you can just MADV_DONTNEED them and keep them around for later re-use.

  6. After GC runs, explicitly return the memory for any object in lospace that wasn't visited when the object graph was traversed. Decrement the lospace byte counter and possibly return some empty blocks to the paged space.
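
Here is a rough C sketch of steps 2 through 4 above, with hypothetical helper names (`take_empty_block`, `decommission_block`, `collect`) standing in for whatever the real collector provides.

```c
#include <stdbool.h>
#include <stddef.h>

#define BLOCK_SIZE (32 * 1024)

/* Step 2: running count of bytes allocated in the large object space. */
static size_t lospace_bytes;
static size_t decommissioned_blocks;

extern void *take_empty_block(void);          /* NULL if none available */
extern void decommission_block(void *block);  /* e.g. via MADV_DONTNEED */
extern void collect(void);

/* Steps 3 and 4: make room in the paged space before a lospace allocation. */
static bool reserve_lospace(size_t bytes) {
  lospace_bytes += bytes;
  size_t needed = (lospace_bytes + BLOCK_SIZE - 1) / BLOCK_SIZE; /* round up */
  for (int attempt = 0; attempt < 2; attempt++) {
    while (decommissioned_blocks < needed) {
      void *block = take_empty_block();
      if (!block) break;
      decommission_block(block);
      decommissioned_blocks++;
    }
    if (decommissioned_blocks >= needed)
      return true;   /* step 5: the lospace can now allocate, e.g. via mmap */
    if (attempt == 0)
      collect();     /* one GC, then try again */
  }
  lospace_bytes -= bytes;
  return false;      /* allocation fails */
}
```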

There are some interesting aspects about this strategy. One is, the memory that you return to the OS doesn't need to be contiguous. When allocating a 50 MB object, you don't have to find 50 MB of contiguous free space, because any set of blocks that adds up to 50 MB will do.

Another aspect is that this adaptive strategy can work for any ratio of large to non-large objects. The user doesn't have to manually set the sizes of the various spaces.

This strategy does assume that address space is larger than heap size, but only by a factor of 2 (modulo fragmentation for the large object space). Therefore our risk of running afoul of user resource limits and kernel overcommit heuristics is low.

The one underspecified part of this algorithm is... did you see it? "Find some empty blocks". If the main paged space does lazy sweeping -- only scanning a block for holes right before the block will be used for allocation -- then after a collection we don't actually know very much about the heap, and notably, we don't know what blocks are empty. (We could know it, of course, but it would take time; you could traverse the line mark arrays for all blocks while the world is stopped, but this increases pause time. The original Immix collector does this, however.) In the system I've been working on, instead I have it so that if a mutator finds an empty block, it puts it on a separate list, and then takes another block, only allocating into empty blocks once all blocks are swept. If the lospace needs blocks, it sweeps eagerly until it finds enough empty blocks, throwing away any nonempty blocks. This causes the next collection to happen sooner, but that's not a terrible thing; this only occurs when rebalancing lospace versus paged-space size, because if you have a constant allocation rate on the lospace side, you will also have a complementary rate of production of empty blocks by GC, as they are recommissioned when lospace objects are reclaimed.

What if your main paged space has ample space for allocating a large object, but there are no empty blocks, because live objects are equally peppered around all blocks? In that case, often the application would be best served by growing the heap, but maybe not. In any case in a strict-heap-size environment, we need a solution.

But for that... let's pick up another day. Until then, happy hacking!

by Andy Wingo at June 20, 2022 02:59 PM

June 17, 2022

Frédéric Wang

Short blog post from Madrid's hotel room

This week, I finally went back to A Coruña for the Web Engines Hackfest and internal company meetings. These were my first on-site events since the COVID-19 pandemic. After two years of non-super-exciting virtual conferences I was so glad to finally be able to meet with colleagues and other people from the Web.

Igalia has grown considerably and I finally got to know many new hires in person. Obviously, some people were still not able to travel despite the effort we put into strong sanitary measures. Nevertheless, our infrastructure has also improved a lot and we were able to provide remote communication during these events, in order to give people a chance to attend and participate!

Work on the Madrid–Galicia high-speed rail line was finally completed last December, meaning one can now travel on fast trains between Paris - Barcelona - Madrid - A Coruña. This takes about a day and a half, though, and because I’m voting in the legislative elections in France, I had to shorten my stay a bit and miss some nice social activities 😥… That’s a pity, but I’m looking forward to participating more next time!

Finally on the technical side, my main contribution was to present our upcoming plan to ship MathML in Chromium. The summary is that we are happy with this first implementation and will send the intent-to-ship next week. There are minor issues to address, but the consensus from the conversations we had with other attendees (including folks from Google and Mozilla) is that they should not be a blocker and can be refined depending on the feedback from API owners. So let’s do it and see what happens…

There is definitely a lot more to write and nice pictures to share, but it’s starting to be late here and I have a train back to Paris tomorrow. 😉

June 17, 2022 12:00 PM

June 16, 2022

Iago Toral

V3DV Vulkan 1.2 status

A quick update on my latest activities around V3DV: I’ve been focusing on getting the driver ready for Vulkan 1.2 conformance, which mostly involved fixing a few CTS tests of the kind that only fail occasionally; these are always fun :). I think we have fixed all the issues now and we are ready to submit conformance to Khronos. My colleague Alejandro Piñeiro is now working on that.

by Iago Toral at June 16, 2022 09:21 AM

June 15, 2022

Andy Wingo

defragmentation

Good morning, hackers! Been a while. It used to be that I had long blocks of uninterrupted time to think and work on projects. Now I have two kids; the longest such time-blocks are on trains (too infrequent, but it happens) and in a less effective but more frequent fashion, after the kids are sleeping. As I start writing this, I'm in an airport waiting for a delayed flight -- my first since the pandemic -- so we can consider this to be the former case.

It is perhaps out of mechanical sympathy that I have been using my reclaimed time to noodle on a garbage collector. Managing space and managing time have similar concerns: how to do much with little, efficiently packing different-sized allocations into a finite resource.

I have been itching to write a GC for years, but the proximate event that pushed me over the edge was reading about the Immix collection algorithm a few months ago.

on fundamentals

Immix is a "mark-region" collection algorithm. I say "algorithm" rather than "collector" because it's more like a strategy or something that you have to put into practice by making a concrete collector, the other fundamental algorithms being copying/evacuation, mark-sweep, and mark-compact.

To build a collector, you might combine a number of spaces that use different strategies. A common choice would be to have a semi-space copying young generation, a mark-sweep old space, and maybe a treadmill large object space (a kind of copying collector, logically; more on that later). Then you have heuristics that determine what object goes where, when.

On the engineering side, there's quite a number of choices to make there too: probably you make some parts of your collector to be parallel, maybe the collector and the mutator (the user program) can run concurrently, and so on. Things get complicated, but the fundamental algorithms are relatively simple, and present interesting fundamental tradeoffs.


figure 1 from the immix paper

For example, mark-compact is most parsimonious regarding space usage -- for a given program, a garbage collector using a mark-compact algorithm will require less memory than one that uses mark-sweep. However, mark-compact algorithms all require at least two passes over the heap: one to identify live objects (mark), and at least one to relocate them (compact). This makes them less efficient in terms of overall program throughput and can also increase latency (GC pause times).

Copying or evacuating spaces can be more CPU-efficient than mark-compact spaces, as reclaiming memory avoids traversing the heap twice; a copying space copies objects as it traverses the live object graph instead of after the traversal (mark phase) is complete. However, a copying space's minimum heap size is quite high, and it only reaches competitive efficiencies at large heap sizes. For example, if your program needs 100 MB of space for its live data, a semi-space copying collector will need at least 200 MB of space in the heap (a 2x multiplier, we say), and will only run efficiently at something more like 4-5x. It's a reasonable tradeoff to make for small spaces such as nurseries, but as a mature space, it's so memory-hungry that users will be unhappy if you make it responsible for a large portion of your memory.

Finally, mark-sweep is quite efficient in terms of program throughput, because like copying it traverses the heap in just one pass, and because it leaves objects in place instead of moving them. But! Unlike the other two fundamental algorithms, mark-sweep leaves the heap in a fragmented state: instead of having all live objects packed into a contiguous block, memory is interspersed with live objects and free space. So the collector can run quickly but the allocator stops and stutters as it accesses disparate regions of memory.

allocators

Collectors are paired with allocators. For mark-compact and copying/evacuation, the allocator consists of a pointer to free space and a limit. Objects are allocated by bumping the allocation pointer, a fast operation that also preserves locality between contemporaneous allocations, improving overall program throughput. But for mark-sweep, we run into a problem: say you go to allocate a 1 kilobyte byte array, do you actually have space for that?

Generally speaking, mark-sweep allocators solve this problem via freelist allocation: the allocator has an array of lists of free objects, one for each "size class" (say 2 words, 3 words, and so on up to 16 words, then more sparsely up to the largest allocatable size maybe), and services allocations from their appropriate size class's freelist. This prevents the 1 kB free space that we need from being "used up" by a 16-byte allocation that could just as well have gone elsewhere. However, freelists prevent objects allocated around the same time from being deterministically placed in nearby memory locations. This increases variance and decreases overall throughput, both for the allocation operations and for pointer-chasing in the course of the program's execution.
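
For concreteness, a stripped-down freelist allocator along those lines might look like this C sketch; the size-class boundaries and the fallback policy (splitting larger free objects, sweeping for more, triggering GC) are simplified assumptions.

```c
#include <stddef.h>

#define WORD sizeof(void *)
#define NUM_SIZE_CLASSES 16            /* classes for 2-word .. 17-word objects */

struct freelist_node { struct freelist_node *next; };

/* One list of free objects per size class. */
static struct freelist_node *freelists[NUM_SIZE_CLASSES];

static size_t size_class(size_t bytes) {
  size_t words = (bytes + WORD - 1) / WORD;  /* round up to whole words */
  return words < 2 ? 0 : words - 2;          /* class 0 holds 2-word objects */
}

static void *freelist_alloc(size_t bytes) {
  /* Serve from the matching class, or the next non-empty larger one. */
  for (size_t c = size_class(bytes); c < NUM_SIZE_CLASSES; c++) {
    struct freelist_node *node = freelists[c];
    if (node) {
      freelists[c] = node->next;
      return node;                     /* may waste space if c is oversized */
    }
  }
  return NULL;                         /* no fit: sweep more, or collect */
}
```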

Also, in a mark-sweep collector, we can still reach a situation where there is enough space on the heap for an allocation, but that free space broken up into too many pieces: the heap is fragmented. For this reason, many systems that perform mark-sweep collection can choose to compact, if heuristics show it might be profitable. Because the usual strategy is mark-sweep, though, they still use freelist allocation.

on immix and mark-region

Mark-region collectors are like mark-sweep collectors, except that they do bump-pointer allocation into the holes between survivor objects.

Sounds simple, right? To my mind, though, the fundamental challenge in implementing a mark-region collector is how to handle fragmentation. Let's take a look at how Immix solves this problem.


part of figure 2 from the immix paper

Firstly, Immix partitions the heap into blocks, which might be 32 kB in size or so. No object can span a block. Block size should be chosen to be a nice power-of-two multiple of the system page size, not so small that common object allocations wouldn't fit. "Large" objects -- greater than 8 kB, for Immix -- go to a separate space that is managed in a different way.

Within a block, Immix divides space into lines -- maybe 128 bytes long. Objects can span lines. Any line that does not contain (a part of) an object that survived the previous collection is part of a hole. A hole is a contiguous span of free lines in a block.

On the allocation side, Immix does bump-pointer allocation into holes. If a mutator doesn't have a hole currently, it scans the current block (obtaining one if needed) for the next hole, via a side-table of per-line mark bits: one bit per line. Lines without the mark are in holes. Scanning for holes is fairly cheap, because the line size is not too small. Note, there are also per-object mark bits as well; just because you've marked a line doesn't mean that you've traced all objects on that line.
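
A simplified C sketch of that hole search and the subsequent bump-pointer allocation might look like the following; the block layout and constants are illustrative (real Immix keeps line marks in a side table and also tracks per-object mark bits).

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE (32 * 1024)
#define LINE_SIZE  128
#define LINES_PER_BLOCK (BLOCK_SIZE / LINE_SIZE)

struct block {
  uint8_t line_marks[LINES_PER_BLOCK]; /* 1 if the line survived the last GC */
  uint8_t data[BLOCK_SIZE];
};

struct bump { uint8_t *ptr, *limit; };

/* Find the next hole at or after line index `start`; returns 0 if none. */
static int find_hole(struct block *b, size_t start, struct bump *out) {
  size_t i = start;
  while (i < LINES_PER_BLOCK && b->line_marks[i]) i++;     /* skip marked lines */
  if (i == LINES_PER_BLOCK) return 0;
  size_t end = i;
  while (end < LINES_PER_BLOCK && !b->line_marks[end]) end++;
  out->ptr = b->data + i * LINE_SIZE;
  out->limit = b->data + end * LINE_SIZE;
  return 1;
}

/* Bump-pointer allocation into the current hole. */
static void *bump_alloc(struct bump *b, size_t bytes) {
  if ((size_t)(b->limit - b->ptr) < bytes) return NULL;    /* need a new hole */
  void *obj = b->ptr;
  b->ptr += bytes;
  return obj;
}
```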

Allocating into a hole has good expected performance as well, as it's bump-pointer, and the minimum size isn't tiny. In the worst case of a hole consisting of a single line, you have 128 bytes to work with. This size is large enough for the majority of objects, given that most objects are small.

mitigating fragmentation

Immix still has some challenges regarding fragmentation. There is some loss in which a single (piece of an) object can keep a line marked, wasting any free space on that line. Also, when an object can't fit into a hole, any space left in that hole is lost, at least until the next collection. This loss could also occur for the next hole, and the next and the next and so on until Immix finds a hole that's big enough. In a mark-sweep collector with lazy sweeping, these free extents could instead be placed on freelists and used when needed, but in Immix there is no such facility (by design).

One mitigation for fragmentation risks is "overflow allocation": when allocating an object larger than a line (a medium object), and Immix can't find a hole before the end of the block, Immix allocates into a completely free block. So actually mutator threads allocate into two blocks at a time: one for small objects and medium objects if possible, and the other for medium objects when necessary.

Another mitigation is that large objects are allocated into their own space, so an Immix space will never be asked to hold objects larger than, say, 8 kB.

The other mitigation is that Immix can choose to evacuate instead of mark. How does this work? Is it worth it?

stw

This question about the practical tradeoffs involving evacuation is the one I wanted to pose when I started this article; I have gotten to the point of implementing this part of Immix and I have some doubts. But, this article is long enough, and my plane is about to land, so let's revisit this on my return flight. Until then, see you later, allocators!

by Andy Wingo at June 15, 2022 12:47 PM

June 10, 2022

Alex Surkov

Automated accessibility testing


Intro

I think it’s fair to say that every application developer needs to take care of accessibility at some point. Indeed, even if you use an accessible toolkit to create an application, the app isn’t always accessible, simply because a combination of accessible parts is not always accessible itself. In reality, many things can go wrong, and without care the app may become completely inaccessible as complexity increases.

If you’re a web developer, then you’re (hopefully) already familiar with WCAG, ARIA and other cool stuff helping to address accessibility issues in web apps. If you are a platform developer, such as an Android or iOS developer, then you probably know a bazillion tricks to keep mobile apps accessible. You might also know how to run the app with screen readers and other assistive technology software to ensure everything goes smoothly and works as expected. And you might already have started thinking of how you can use automated testing to cover all the accessibility caveats, to avoid regressions and to reduce the overhead of manual testing.

So let’s talk about automated accessibility testing. There’s no universal solution that would embrace each and every platform and every single case. However, there are many existing approaches; some are better than others, and some are fairly solid. Having said that, looking at the diversity of the existing solutions and realizing how much manual testing people still do (AAM specs, such as the ARIA or HTML accessibility API mappings, are a great example of it), I think it’d be amazing to systemize the existing techniques and come up with a better, universal solution that would cover the majority of cases. I hope this post can help to find the right way to do this.

What to test

First things first: what is the scope of automated accessibility testing, i.e. what exactly do we want to test?

Web

Without a doubt, the web is vast and complex and plays a significant role in our lives. It surely has to be accessible. But what is an accessible web, exactly?

The main conductors on the web are web browsers. They make up the web platform by providing all of the tiny building blocks, such as HTML elements to create web content or ARIA attributes to define semantics. Browsers deliver web content to the users by rendering it on a screen and expose its semantics to assistive technologies such as screen readers. All these blocks must be accessible; in particular, that means the browsers are responsible for exposing all building blocks to the assistive technologies correctly. It’s very tempting to say that if a browser is doing the job well, then a web author cannot go wrong by using these blocks (provided, of course, that the web author is not doing anything strange on purpose). That sounds about right and it works nicely in theory, but in practice accessibility issues suddenly pop up as the complexity of a web app goes up.

Browsers

The browsers are certainly the major use case in web accessibility testing and I’d like to get them covered, simply because the web cannot be accessible without accessible browsers. Also, they already do a decent job on accessibility testing, and we can learn a lot from them. Indeed, they’re all stuffed with accessibility testing solutions and each has its own test harness for automating the process. Their systems could be unified and adjusted to a broader range of uses.

Web apps

The web applications are the second piece of the web puzzle. They also must be covered.

Webapps are made up of small and accessible building blocks (if the browser does a good job). However, as I previously noted, it is not sufficient to have individually accessible parts — it is necessary that all of the combinations of parts are accessible. It may sound quite obvious, since QA and end-to-end testing weren’t invented yesterday, but this is something overlooked quite often. So this is yet one more use case on the table: the overall, integrated web application’s accessibility.

Platform

Although the web is vital, there’s a sizable market of desktop and mobile applications which also need accessibility testing. Some desktop/mobile apps use embedded browsers under the hood as a rendering engine, which brings them into the web scope. But generally speaking, desktop/mobile apps are not the web.

Having said that, it’s worth noting that browsers and desktop/mobile applications coexist in the same environment, and they use the very same platform accessibility APIs to expose content to the assistive technologies. This means web and desktop/mobile applications have more in common than people usually tend to think. I’m from the browser world and I keep my focus on web accessibility, as many of you probably do, but let’s keep desktop and mobile applications in mind as one more use case. We will have them covered as well.

How to test

The next question to address is how to test accessibility, or how to ensure that the application is accessible. You could test your entire application with a screen reader or a magnifier on your own, and that would be pretty trustworthy, but what exactly should be tested when it comes to automated testing?

Unit testing

Unit testing is the first level of automated testing, and accessibility is no exception. It allows you to perform very low-level testing, such as testing individual C++ classes or JS modules, or testing internals that are never disclosed and can never be tested via any public API, as the purely under-the-hood things. As a result, unit testing is a critical component in automated testing.

However, it has nothing to do with accessibility specifically, and there’s nothing to generalize or systemize here for the benefit of accessibility. It’s simply something that all systems must have.

Accessibility APIs mappings

When it comes to automated accessibility testing, the first and probably the best thing to start from is to test accessibility APIs. Accessibility APIs are the universal language that any accessible application can be expressed in. If the accessibility properties of the UI look about right, then there’s a good chance the app is accessible. By the way, this is the most common strategy in accessibility testing in web browsers today.

To give an example of how accessibility API testing can be used in practice, you can think of the AAM web specifications. These specs define how ARIA attributes or HTML/SVG elements should be mapped to the accessibility APIs. This is a somewhat restrictive example of the capabilities of accessibility API testing, because the AAM specs exercise only a few things peculiar to accessibility APIs (for example, they miss key things like accessible relations, hierarchies or actions), but it’s a good example to get a sense of what kind of things can be tested when it comes to accessibility APIs.

Accessibility APIs are complex by nature. They differ from platform to platform, but they share many traits. That’s not surprising at all, because they are all designed to expose the same thing: the semantics of the user interface.

Here are the main categories you can find in many accessibility APIs.

  • States and properties: an example for these could be focused or selected states on a menu item or accessible name/description for properties.
  • Methods or interfaces allow retrieving complex information about accessible elements, for example, information about list or grid controls.
  • The relation concept is a key one. It defines how accessible elements relate to each other. In particular they are used to navigate content and/or to get extra information about elements such as labels for a control or headers for a table cell.
  • Accessible trees are a special case of accessible relations. However they are often defined as a separate entity. The accessible tree is a hierarchical representation of the content, and the ATs frequently rely on it.
  • As a rule of thumb there is also special support for text and selection.
  • There are also accessible actions, for example a button click or expanding/collapsing a dropdown.
  • Accessible events are used to notify the assistive technologies about app changes.

All of this is a fairly typical, if not comprehensive, list of what can be (or should be) tested when it comes to accessibility platform API testing.
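
To make a few of these categories concrete (tree, role, name, states), here is a small C sketch that walks an accessible tree using one particular platform API, ATK on Linux. It is only a dump helper, not part of any test harness mentioned in this post, and obtaining the root object (from inside an ATK-enabled application, or via AT-SPI for out-of-process inspection) is left out.

/* Recursively dump role, name and a couple of states from an ATK subtree.
   Illustrative only; a real harness would compare this output against
   expectations instead of just printing it. */
#include <atk/atk.h>
#include <stdio.h>

static void dump_accessible(AtkObject *obj, int depth)
{
  AtkRole role = atk_object_get_role(obj);
  const gchar *name = atk_object_get_name(obj);          /* may be NULL */
  AtkStateSet *states = atk_object_ref_state_set(obj);

  printf("%*s%s \"%s\"%s%s\n", depth * 2, "",
         atk_role_get_name(role),
         name ? name : "",
         atk_state_set_contains_state(states, ATK_STATE_FOCUSED) ? " focused" : "",
         atk_state_set_contains_state(states, ATK_STATE_SELECTED) ? " selected" : "");
  g_object_unref(states);

  gint n = atk_object_get_n_accessible_children(obj);
  for (gint i = 0; i < n; i++) {
    AtkObject *child = atk_object_ref_accessible_child(obj, i);
    if (child) {
      dump_accessible(child, depth + 1);
      g_object_unref(child);
    }
  }
}

The equivalent walk looks much the same on IAccessible2, UI Automation or NSAccessibility, which is exactly why this layer is such an attractive target for a cross-platform test harness.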

Accessibility API testing is fairly low level, which may or may not be a good thing depending on the use case. I would say it’s nearly perfect for testing browsers, for example, how the browsers expose HTML elements and such. It also should be the right choice for different kinds of toolkits which provide you with a set of building blocks, for example, extra controls on top of HTML5 elements. I’m not quite confident that this type of testing is exactly what web developers look for when it comes to webapp accessibility testing, because it’s fairly low level and requires understanding of under-the-hood things, but they can certainly benefit from checking accessibility properties on certain elements/structures, for example, for web components.

Assistive Technologies testing

Assistive Technology (AT) testing is another layer of automated testing. Unlike accessibility API testing, it is a great way to ensure your app is spoken exactly as you want by different screen readers, or that it gets zoomed properly when it comes to screen magnifiers. Any application, including both browsers and web apps, can benefit from integration-level AT testing by running individual control elements or separate UI blocks through the ATs.

AT testing can also be used for end-to-end testing. However, like any end-to-end testing, it cannot be comprehensive because of the steep rise in testing scenarios as the app complexity increases. Having said that, it certainly can be helpful for checking the most crucial scenarios and makes a really great addition to integration-level accessibility API testing.

Testing flow

Testing a single HTML page by checking its accessible tree and/or the relevant accessibility properties makes a nice pattern of atomic testing. It is quite similar to unit testing. However, the reality is that not every use case can be reduced to a simple static page, which makes such testing quite restrictive. The real world demands that we also test accessibility dynamically, for a full spectrum of testing scenarios, such as clicking a button and checking what happens next.

This brings us to another important piece of the puzzle: a testing flow, or in other words, a way to control an application so you can test its dynamics and query accessibility properties at the right time.

A typical test flow can be described in three simple steps:

  • Query-n-check: test accessibility properties against expectations.
  • Action: change a value or trigger an action: it represents the dynamics part and describes how a test interacts with an application.
  • On-hold: wait for an event to hold the test execution until the app gets to the right state.

To summarize, there are two key pieces we’d need for a testing flow. First, a way to trigger an action: this can be emulation of user actions, triggering accessible actions or changing accessible properties, anything that allows the user to operate and control the application. Secondly, we need the ability to hold a test execution until certain criteria are met. That could be an accessible event, the presence of certain accessible properties on an element, waiting until a certain text is visible, whatever, just to put the test execution on pause until it’s good to proceed.

What we have now

Let’s take a look at what the market has to offer. I don’t know much about desktop or mobile applications, as many of them are not open source, so there’s no chance to sneak a peek. But I think it’d be a good idea to start from the web, and from the web browsers in particular, which have made some really good progress on accessibility testing.

AOM

The first thing I would like to mention is AOM. It’s not the browsers, but it’s something closely related. AOM stands for Accessible Object Model. This is an attempt to expose accessibility properties in a cross-platform way in browsers. Similarly to platform accessibility APIs, AOM defines roles, properties, relations, and an accessible tree. You can think of AOM as the web accessibility API. So having AOM with all typical accessibility features can make a great platform for cross-browser accessibility testing.

Certain things, like platform-dependent mappings, are for sure not very suitable for AOM. For example, the AAM specs, the accessibility API mappings which define how to expose the web to the platform accessibility APIs, do not make a great target for AOM testing. Indeed, despite the similarities between platform APIs on a conceptual level, they differ in the details. However, if we care only about the logical side, AOM will suit nicely. For example, the ARIA name computation algorithms make a great match for AOM testing, and AOM can be used to test relations between a label and a control. In this case we don’t worry about what the platform name for the relation is; the only thing we want to test is that the right relations are exposed.

Sadly, it’s not something that has been implemented. AOM is a long-term vision, and the main focus so far has been on ARIA reflection, which has been prototyped by a number of browsers by now. But the ARIA reflection spec is just a DOM reflection of ARIA attributes and thus has somewhat lower testing capabilities. For example, HTML elements that don’t have a proper ARIA mapping cannot be tested by ARIA reflection, nor can any accessibility events.

So AOM is something that has great testing potential but it’s not yet implemented or even specified.

ATTA

ATTA (Accessible Technology Test Adapter API) was the answer to manual accessibility testing for ARIA and HTML AAM specifications. Roughly speaking ATTA defines a protocol used to describe accessibility properties for a given HTML code snippet to create the test expectations. The expectations are sent to the ATTA server, which queries an ATTA driver for an accessible tree for the given HTML snippet and then checks it against the given expectations.

ATTA is integrated into the WPT test suite, and it could make a great solution for web accessibility API testing, if it worked. There are implementations of ATTA drivers for IAccessible2, UI Automation and ATK, but apparently none of them ever reached ready-to-use status.

So ATTA has a bunch of worthy ideas, like built-in WPT integration and a modular system which allows you to connect various platform API drivers, but sadly it was never finished and is no longer being actively worked on.

Browsers

Let’s take a look at the browsers. Browsers have quite a long history of accessibility support and they’ve made great progress on accessibility testing.

Firefox

In Firefox all testing is done in JavaScript. It’s worth noting that Gecko’s accessibility core is a massive system that includes practically every feature of any desktop accessibility API. Gecko has a fairly mature set of cross-platform accessibility interfaces, and because the platform implementations are thin wrappers around the accessibility core, if you test something on the cross-platform layer in Gecko, then you can be confident that it will work well on all platforms. Gecko does, however, also offer native NSAccessibility objects to JavaScript, the same way it does for cross-platform testing. It works nicely and allows one to poke all NSAccessibility attributes and parameterized attributes, as well as listen to accessibility events. This approach is not portable as is, because it relies on a somewhat ancient Netscape-era technology. It could be adapted to work through WebIDL if you wanted to make it portable to other browsers, but it would be fairly useless outside the browser world. Nevertheless, it is certainly good to get inspired by.

Here’s an example of the Gecko accessibility test. The test represents a typical scenario for the accessibility testing: you get a property, call an action, wait for an event, and then make sure the property value was adjusted properly. You can imagine that cross platform tests are quite similar to this one.

Gecko implements its own test suite, which provides a number of util functions such as ok() or is() that are responsible for handling and reporting successes or failures. This is the kind of testing system where test expectations are listed explicitly in a test body; in other words, the test says what to test and what the expectations are. As a direct consequence, if you need to change expectations, you have to adjust the test manually. It’s quite a typical testing system though.

WebKit/Safari

I think it’s fair to say that WebKit, being the engine behind the popular Safari browser, has decent NSAccessibility protocol support, but it also supports ATK and MSAA. The WebKit testsuite is rather straightforward and implements its testing capabilities in a cross-platform style. It exposes a helper object to the DOM, and you can query platform-dependent accessible properties/methods from JavaScript. It’s quite similar to what Firefox does.

The testsuite itself is quite different from Firefox though, which is not surprising. A WebKit test generates output which is compared to expectations stored in files. The test expectations are also listed in the test body, which makes the approach close to Gecko’s.

Similar to Gecko, WebKit also supports event testing; here’s an example of a typical test. It might look bulky for the simple thing it does, but all of this can be wrapped in a nice promise-based wrapper which makes the test more readable. The most important thing here is that WebKit also supports testing all parts of a typical accessibility API.

Chromium

Beyond low-level C++ unit testing, Chromium relies on platform accessibility API testing. It is quite similar to Firefox or WebKit, with one key difference.

Chromium can dump an accessibility tree with the relevant accessibility properties, or it can record accessibility events, and it subsequently compares the output to expectation files. The key point here is that these tests can be rebaselined easily, unlike other kinds of tests. If something changes at the API level, for example if an accessible tree is spec’d out differently, new properties are added or old ones are removed — any change, then all you need is to rerun the test and capture its output, which becomes the new expectations. It’s as if you take a snapshot that becomes the new standard, and all following runs are matched against it.

Here’s a typical example of a tree test with mac and win expectation files.

Chromium also provides basic scripting capabilities. Those are mainly Mac-targeted though, and thus scoped to NSAccessibility protocol testing. However, it allows testing all bits of the API, including methods, attributes, parameterized attributes, actions and events.

What would make a great test suite?

Let’s pull things together. Accessibility exists within the context of a platform, which glues an application and the assistive technology together via accessibility APIs. We want a platform-wide testing solution to test platform accessibility APIs. It should be possible to test a variety of things such as accessible trees, properties, methods and events.

The solution should not be limited to just web browsers. It should be capable of covering web applications running in web browsers. We’d also like to test any application running on the system.

Multiple platforms should be supported such as AT-SPI/ATK on Linux, MSAA/IA2 and UIA on Windows and NSAccessibility protocol on Mac. The solution has to be extensible in case we need to support other platforms in future.

Test flow control should be supported out of the box, such as interacting with an app and then waiting for an accessible event. Another example of such interactions would be flow-control directives allowing communication with the assistive technologies. Getting those covered will allow writing end-to-end tests.

Last but not least, easy test rebaselining is a key feature to have. If you ever need to change a test’s expectations, then you just rerun the test and record its output. This happens more often than you probably think, and such a testsuite allows you to adjust expectations with about zero effort.

Chromium accessibility tools

Chromium has decent accessibility tools to inspect a platform accessibility tree and to listen to accessibility events. They are available on all major platforms and can be easily ported to other platforms, essentially on any platform Chromium is running on. They are capable of inspecting any application on a system including web applications running in a web browser. All major desktop APIs are supported as well, namely AT-SPI on Linux, MSAA/IA2 and UIA on Windows and NSAccessibility protocol on Mac.

In Chromium these tools are integrated into a test harness to perform platform accessibility testing. The testsuite supports test flow instructions and rebaselining mechanism.

The tools can be beefed up with testing capabilities to become a new test harness, or be easily integrated into existing test suites or CI/CD systems.

The tools are not perfect and are not feature-complete, but they have great potential. Since they are capable of providing all the must-have testing features we discussed above, all they need is some love, I think. The tools are open source and anyone can contribute. However, having them as an integral part of the Chromium project makes it seem like they are inherently tied to that project, and that doesn’t make life easy for new contributors. It’s possible, however, that eventually the tools could get a new home. If new contributors bring fresh vision and new use cases to shape the tools’ features, then it could make a great start for a new open source project which would embrace all innovations and hopefully solve the long-standing problem of automated accessibility testing.

Is it worth taking a shot?

Are you ready to join the efforts?

by Alexander Surkov at June 10, 2022 12:50 PM

June 06, 2022

Alejandro Piñeiro

Playing with the rpi4 CPU/GPU frequencies

In recent days I have been testing how modifying the default CPU and GPU frequencies on the rpi4 increases the performance of our reference Vulkan applications. By default Raspbian uses 1500MHz and 500MHz respectively. But with a good heat dissipation (a good fan, rpi400 heat spreader, etc) you can play a little with those values.
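
For reference, on Raspberry Pi OS these clocks are normally changed in /boot/config.txt. The snippet below just mirrors the frequencies discussed in this post; whether they are stable depends on your particular board, cooling and firmware, and higher clocks may additionally need an over_voltage setting.

# /boot/config.txt (example only; adjust for your board and cooling)
arm_freq=1800    # CPU clock in MHz (default 1500)
v3d_freq=750     # V3D GPU clock in MHz (default 500)
#over_voltage=4  # may be required for stability at higher clocks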

One of the tools we usually use to check performance changes is gfxreconstruct. This tool allows you to record all the Vulkan calls during the execution of an application, and then replay the captured file. So we have traces of several applications, and we use them to test any hypothetical performance improvement, or to verify that some change doesn’t cause a performance drop.

So, let’s see what we get if we increase the CPU/GPU frequencies, focusing on the Unreal Engine 4 demos, which are the most shader-intensive:

Unreal Engine 4 demos FPS chart

So as expected, with higher clock speed we see a good boost in performance of ~10FPS for several of these demos.

Some could wonder why the increase in CPU frequency has so little impact. As I mentioned, we didn’t get those values from the real applications, but from gfxreconstruct traces, which only capture the Vulkan calls. So in those replays there are no tasks like collision detection, user input, etc. that are usually handled on the CPU. Also, as mentioned, all the Unreal Engine 4 demos use really complex shaders, so the bottleneck there is the GPU.

Let’s move now from the cold numbers, and test the real applications. Let’s start with the Unreal Engine 4 SunTemple demo, using the default CPU/GPU frequencies (1500/500):

Even if it runs fairly smooth most of the time at ~24 FPS, there are some places where it dips below 18 FPS. Let’s see now increasing the CPU/GPU frequencies to 1800/750:

Now the demo runs at ~34 FPS most of the time. The worst dip is ~24 FPS. It is a lot smoother than before.

Here is another example with the Unreal Engine 4 Shooter demo, already increasing the CPU/GPU frequencies:

Here the FPS never dips below 34 FPS, staying at ~40 FPS most of the time.

It has been around 1 year and a half since we announced a Vulkan 1.0 driver for Raspberry Pi 4, and since then we have made significant performance improvements, mostly around our compiler stack, that have notably improved some of these demos. In some cases (like the Unreal Engine 4 Shooter demo) we got a 50%-60% improvement (if you want more details about the compiler work, you can read the details here).

In this post we can see how, on top of that work, taking advantage of increased CPU and GPU frequencies lets us really start to get reasonable framerates in more demanding demos. Even if this is still at low resolutions (for this post all the demos were running at 640×480), it is still great to see this on a Raspberry Pi.

by infapi00 at June 06, 2022 09:58 AM

June 02, 2022

Brian Kardell

Spicy Progress

Spicy Progress

For a while, we had lots of updates and excitement on our "spicy-sections" proposal - but you've probably noticed it's gotten a little quiet. In this post, I'll explain why and lay out where I think we are and how I hope we can move forward.

A month or so ago, the Tabvengers were working hard to advance some open issues, work out details (including the name 'oui-panelset') and create a stable, high-parity version of what our actual proposal would look like so that we could hand ownership of it to OpenUI officially and work with a more official proposal. Jon Neal, in particular, worked very hard to make something pretty great.

However, as is often the case, with more brains and eyeballs comes new scrutiny and feedback. In particular, during accessibility reviews to make sure we hadn't messed something up along the way (we did, because it was a total rewrite that broke all of the tests), we hit a new perspective.

Based on some questions asked during that review about examples of where it could be used, we quickly assembled a series of about 50 links to content on public home pages where we believed our "spicy-sections" (aka "oui-panelset") proposal could be used to great effect, as supporting material for the discussion.

In this discussion, one of the co-chairs of the ARIA Working Group suggested that perhaps most of those shouldn't even be using ARIA tabs. Worse, there was some concern that if that was true, then making it exceptionally easy for them to do just that is potentially actively harmful.

Suspend any opinion on what you might think that means, or your own thoughts for a moment... It's definitely worth slowing down and trying to understand and take some care. It's already led to a number of interesting conversations. We need to understand "why not?" and try to articulate some specifics. At the moment, we're still working to sort out a lot here, but you can expect more on that soon.

In further discussion, we began to tease out that there are sort of distinct "kinds" of interfaces that many people would refer to as tabs. This is somewhat unsurprising, as our research points out that many/most modern UI kits actually have more than one control for tab-like things. My own posts and notes also make sure to note that examples like browser tabs are a different control than the kind that spicy-sections/panelset are about - those kinds of tabs are sort of window managers with additional features that even users understand differently. No one would ever expect Ctrl+F, for example, to search all the open windows. We expect close buttons, perhaps status indicators, the ability to reorder them.

So, the first thing was to introduce some separate terminology for clarity. The things we've been doing with "spicy sections" (and perhaps the sorts of things you'll be able to build with CSS Toggles) could be classified as "content tabs" (I believe it was Sarah Higley who offered the term). Again, this makes sense of past discussions - Hixie said many years ago that "tabs are a matter of overflow". That makes a lot of sense for "content tabs" and absolutely none for something like browser tabs.

If not that...?

Ok, but... Assume, for the moment, that ARIA tabs are not always appropriate. Assume that we can articulate the right specifics about when they are and are not. It kind of begs the question "if not that, what?". Because it's not like those pages will cease to have interfaces that generally look and act like that. It's very clear that "content tabs" exist in the wild and have plenty of benefits, and in order to judge whether it is true that some other, non-ARIA-using solution is better, we need something concrete to compare it with, and that we can do user testing with. If it is true that something else is better, we have to be very careful in describing how to do the better thing, so that we prevent authors from merely falling into a different series of pitfalls - and maybe we even still give them that element.

But what is that other thing specifically?

Further thought and experiments

So... I've been thinking about this a lot. I found several examples of "not ARIA tabs" which have very different characteristics, including from an accessibility standpoint. There are a lot of ways we can get these wrong. I've been looking for something without glaring issues.

To this end, and through some discussion with Adam Argyle and Jon Neal, I have created this very rough functional sketch of what I am thinking makes a good springboard for discussing specifics and trying to work through problems. Just like with the original, if your window is wide enough, this will display as 'content tabs' otherwise it will just show plain content.

See the Pen spicy-alternative-sketch by вкαя∂εℓℓ (@briankardell) on CodePen.

As of this posting, please don't use the code in this pen. It needs additional testing and discussion, but perhaps more importantly it is affected by one (or two, depending on your system) bugs in Firefox, being worked on now.

What it is..

It is a somewhat reduced version of the current oui-panelset which simply lets you specify only a 'content-tabs' affordance (or not). The markup, parts, etc. are identical. However, instead of realizing this as ARIA tabs, it turns these into a TOC of links and uses a scroll-port and scroll-snapping to create the tab-like presentation, based on one of Adam's experiments.

What I like about it

Well, it is pretty simple, and it doubles down on all of the original points and observations. It's now not just a case that it is "similar to scroll" - it is literally scroll. But with that come some other interesting good qualities right out of the box: Find-in-page Just Works™. Headings aren't "turned into" tabs, they continue to exist - so screen reader users experience this as normal content and it is navigable by headings. We could probably apply it even to something like a carousel or paging if we're not fighting about the strict semantics of "what it is". You can leave the headings visible or easily hide them (assuming something isn't busted, see below) with a simple rule like

/* you can use a negative margin to hide */
oui-panelset h2 {
  margin-top: -4rem;
}  
        

If it turned out to be useful, then this is the proof for CSS Toggles and it makes the job of what they need to accomplish much clearer and easier.

What I am not so sure of...

As I say often: the devil is in the details. This is a crude pen prototype, we have to do a lot of additional questioning and make sure this is resilient, and we have to test and improve it - a lot, probably, to know if it is really viable. For example, currently it 'jumps' the content when you click a link - I'm not sure how much we can accurately avoid this and remain resilient - but maybe.

It raises other new questions though too: What happens if you do 'select all' -- currently it will literally select everything. Is that right? Or should that limit to the scrollport - and so on. It doesn't 'project' the headings into slots, it 'mirrors' them - as far as I know, that isn't something anything else does. Is it a deal breaker? I don't know!

These 'tabs' are really links, and as such they have all of the qualities of regular links. For example, one navigates them with the tab key and they have to be manually activated - there is no roving focus or automatic activation. Some people see this as a pro, some as a con. We'll need to do A/B testing.

Finally - most importantly - do I think it is actually 'better'?

To be honest, I'm somewhat unconvinced that there is a "right" answer here. It would not surprise me at all if the result was that we learn that some users prefer tabs to use automatic activation and a roving tab index, and others don't. It would not surprise me at all if some users prefer that all of these present as ARIA tabs, and others are confused by that. I expect that many of these questions have something to do with classes of users and individual preferences.

That said, we've been working to find ways to answer all the problems, and we think we have some ideas.

What I am hoping

What I am hoping is that having something concrete to discuss and debate, even if it is somewhat sketchy, actually lets us do that.

I would like to see us focus on ::parts and styling specification that is largely shared for both "kinds" of tabs (I think we're actually very close) such that even if they wind up being two different things, we can reuse much learning and code.

I would like to see us figure out how we sort this out via a custom element (or elements) when we don't know the final answer. Perhaps one option would be to add this affordance as an option to our <oui-panelset> proposal, perhaps even discussing whether it should be the only tabs-like affordance it supports initially. However, I would like to keep the door open to having the current use of ARIA as an option too, since the truth is we don't know yet.

What I think _could_ be ideal is that we could expose a user preference in browsers and let users pick what they prefer. This means that if we get it wrong by default, remediation is as simple as changing a single attribute value or CSS property.

June 02, 2022 04:00 AM

May 30, 2022

Christopher Michael

Modesetting: A Glamor-less RPi adventure

The goal of this adventure is to have hardware acceleration for applications when we have Glamor disabled in the X server.

What is Glamor ?

Glamor is a GL-based rendering acceleration library for the X server that can use OpenGL, EGL, or GBM. It uses GL functions & shaders to complete 2D graphics operations, and uses normal textures to represent drawable pixmaps where possible. Glamor calls GL functions to render to a texture directly and is, in that sense, largely hardware independent. If the GL rendering cannot complete due to failure (or not being supported), then Glamor will fall back to software rendering (via llvmpipe) which uses framebuffer functions.

Why disable Glamor ?

On current RPi images like bullseye, Glamor is disabled by default for RPi 1-3 devices. This means that there is no hardware acceleration out of the box. The main reason for not using Glamor on RPi 1-3 hardware is that it uses GPU memory (CMA memory), which is limited to 256 MB. If you run out of CMA memory, then the X server cannot allocate memory for pixmaps and your system will crash. RPi 1-3 devices currently use V3D as the render GPU. V3D can only sample from tiled buffers, but it can render to tiled or linear buffers. If V3D needs to sample from a linear buffer, then we allocate a shadow buffer, transform the linear image to a tiled layout in the shadow buffer, and sample from the shadow buffer. Any update of the linear texture implies updating the shadow image… and that is SLOW. With Glamor enabled in this scenario, you will quickly run out of CMA memory and crash. This issue is especially apparent if you try launching Chromium in full screen with many tabs opened.

Where has my hardware acceleration gone ?

On RPi 1-3 devices, we default to the modesetting driver from the X server. For those that are not aware, ‘modesetting’ is an Xorg driver for Kernel Modesetting (KMS) devices. The driver supports TrueColor visuals at various framebuffer depths and also supports RandR 1.2 for multi-head configurations. This driver supports all hardware where a KMS device is available and uses the Linux DRM ioctls or dumb buffer objects to create & map memory for applications to use. This driver can be used with Glamor to provide hardware acceleration, however that can lead to the X server crashing as mentioned above. Without Glamor enabled, the modesetting driver cannot do hardware acceleration and applications will render using software (dumb buffer objects). So how can we get hardware acceleration without Glamor ? Let’s take an adventure into the land of Direct Rendering…
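
For readers who haven’t seen the dumb buffer path before, here is a minimal C sketch of how a KMS client allocates and maps such a software-rendered buffer through the standard DRM ioctls. It is generic example code (built against libdrm), not something taken from the modesetting driver, and error handling is trimmed for brevity.

/* Allocate and map a "dumb" (software-rendered) buffer from a DRM/KMS device.
   Compile with the libdrm include/library flags (pkg-config libdrm). */
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <xf86drm.h>

void *create_dumb_buffer(int *out_fd, uint32_t width, uint32_t height)
{
  int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);

  struct drm_mode_create_dumb create = { .width = width, .height = height, .bpp = 32 };
  drmIoctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &create);   /* fills handle, pitch, size */

  struct drm_mode_map_dumb map = { .handle = create.handle };
  drmIoctl(fd, DRM_IOCTL_MODE_MAP_DUMB, &map);         /* fills the mmap offset */

  void *pixels = mmap(NULL, create.size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, map.offset);
  memset(pixels, 0, create.size);                      /* CPU-side rendering goes here */

  *out_fd = fd;
  return pixels;
}

Everything drawn into that mapping is rendered by the CPU, which is exactly the situation we want to escape.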

What is Direct Rendering ?

Direct rendering allows for X client applications to perform 3D rendering using direct access to the graphics hardware. User-space programs can use the DRM API to command the GPU to do hardware-accelerated 3D rendering and video decoding. You may be thinking “Wow, this could solve the problem” and you would be correct. If this could be enabled in the modesetting driver without using Glamor, then we could have hardware acceleration without having to worry about the X server crashing. It cannot be that difficult, right ? Well, as it turns out, things are not so simple. The biggest problem with this approach is that the DRI2 implementation inside the modesetting driver depends on Glamor. DRI2 is a version of the Direct Rendering Infrastructure (DRI). It is a framework comprising the modern Linux graphics stack that allows unprivileged user-space programs to use graphics hardware. The main use of DRI is to provide hardware acceleration for the Mesa implementation of OpenGL. So what approach should be taken ? Do we modify the modesetting driver code to support DRI2 without Glamor ? Is there a better way to get direct rendering without DRI2 ? As it turns out, there is a better way…enter DRI3.

DRI3 to the rescue ?

The main purpose of the DRI3 extension is to implement the mechanism to share direct rendered buffers between DRI clients and the X Server. With DRI3, clients can allocate the render buffers themselves instead of relying on the X server for doing the allocation. DRI3 clients allocate and use GEM buffer objects as rendering targets, while the X Server represents these render buffers using a pixmap. After initialization the client doesn’t make any extra calls to the X server, except perhaps in the case of window resizing. Utilizing this method, we should be able to avoid crashing the X server if we run out of memory, right ? Well once again, things are not as simple as they appear to be…

So using DRI3 & GEM can save the day ?

With GEM, a user-space program can create, handle and destroy memory objects living in the GPU memory. When a user-space program needs video memory (for a framebuffer, texture or any other data), it requests the allocation from the DRM driver using the GEM API. The DRM driver keeps track of the used video memory and is able to comply with the request if there is free memory available. You may recall from earlier that the main reason for not using Glamor on RPi 1-3 hardware is because it uses GPU memory (CMA memory) which is limited to 256 MB, so how can using DRI3 with GEM help us ? The short answer is “it does not”…at least, not if we utilize GEM.

Where do we go next ?

Surely there must be a way to have hardware acceleration without using all of our GPU memory ? I am glad you asked because there is a solution that we will explore in my next blog post.

by cmichael at May 30, 2022 11:17 AM

May 23, 2022

Iago Toral

Vulkan 1.2 getting closer

Lately I have been exposing a bit more functionality in V3DV and was wondering how far we are from Vulkan 1.2. Turns out that a lot of the new Vulkan 1.2 features are actually optional and what we have right now (missing a few trivial patches to expose a few things) seems to be sufficient for a minimal implementation.

We actually did a test run with CTS enabling Vulkan 1.2 to verify this and it went surprisingly well, with just a few test failures that I am currently looking into, so I think we should be able to submit conformance soon.

For those who may be interested, here is a list of what we are not supporting (all of these are optional features in Vulkan 1.2):

VK_EXT_descriptor_indexing

I think we should be able to support this in the future.

VK_KHR_shader_float16_int8

This we can support in theory, since the hardware has support for half-float; however, the way this is designed in hardware comes with significant caveats that I think would make it really difficult to take advantage of in practice. It would also require significant work, so it is not something we are planning at present.

VK_KHR_buffer_device_address

We can’t implement this without hacks, because the Vulkan spec explicitly defines these addresses as 64-bit values and the V3D GPU only deals with 32-bit addresses and is not capable of doing any kind of native 64-bit operation. At first I thought we could just lower these to 32-bit (since we know they will be 32-bit), but because the spec makes these explicitly 64-bit values, it allows shaders to cast a device address from/to uvec2, which generates 64-bit bitcast instructions that require both the destination and source to be 64-bit values.

VK_EXT_sampler_filter_minmax
VK_KHR_draw_indirect_count
VK_EXT_scalar_block_layout
VK_EXT_shader_viewport_index_layer
VK_KHR_shader_atomic_int64

These lack required hardware support, so we don’t expect to implement them.

by Iago Toral at May 23, 2022 10:46 AM

May 16, 2022

Alejandro Piñeiro

v3dv status update 2022-05-16

We haven’t posted updates to the work done on the V3DV driver since we announced the driver becoming Vulkan 1.1 Conformant.

But after reaching that milestone, we’ve been very busy working on more improvements, so let’s summarize the work done since then.

Multisync support

As mentioned on past posts, for the Vulkan driver we tried to focus as much as possible on the userspace part. So we tried to re-use the already existing kernel interface that we had for V3D, used by the OpenGL driver, without modifying/extending it.

This worked fine in general, except for synchronization. The V3D kernel interface only supported one synchronization object per submission. This didn’t properly map with Vulkan synchronization, which is more detailed and complex, and allowed defining several semaphores/fences. We initially handled the situation with workarounds, and left some optional features as unsupported.

After our 1.1 conformance work, our colleague Melissa Wen started to work on adding support for multiple semaphores on the V3D kernel side. Then she also implemented the changes on V3DV to use this new feature. If you want more technical info, she wrote a very detailed explanation on her blog (part1 and part2).

For now the driver has two codepaths that are used depending on if the kernel supports this new feature or not. That also means that, depending on the kernel, the V3DV driver could expose a slightly different set of supported features.

More common code – Migration to the common synchronization framework

For a while, Mesa developers have been doing a great effort to refactor and move common functionality to a single place, so it can be used by all drivers, reducing the amount of code each driver needs to maintain.

During these months we have been porting V3DV to some of that infrastructure, from small bits (common VkShaderModule to NIR code), to a really big one: common synchronization framework.

As mentioned, the Vulkan synchronization model is really detailed and powerful. But that also means it is complex. V3DV support for Vulkan synchronization included heavy use of threads. For example, V3DV needed to rely on a CPU wait (polling with threads) to implement vkCmdWaitEvents, as the GPU lacked a mechanism for this.

This was common to several drivers. So at some point there were multiple versions of complex synchronization code, one per driver. But, some months ago, Jason Ekstrand refactored Anvil support and collaborated with other driver developers to create a common framework. Obviously each driver would have their own needs, but the framework provides enough hooks for that.

After some gitlab and IRC chats, Jason provided a Merge Request with the port of V3DV to this new common framework, that we iterated and tested through the review process.

Also, with this port we got timeline semaphore support for free. Thanks to this change, we got ~1.2k fewer total lines of code (and have more features!).

Again, we want to thank Jason Ekstrand for all his help.

Support for more extensions:

Since 1.1 was announced, the following extensions have been implemented and exposed:

  • VK_EXT_debug_utils
  • VK_KHR_timeline_semaphore
  • VK_KHR_create_renderpass2
  • VK_EXT_4444_formats
  • VK_KHR_driver_properties
  • VK_KHR_16_bit_storage and VK_KHR_8bit_storage
  • VK_KHR_imageless_framebuffer
  • VK_KHR_depth_stencil_resolve
  • VK_EXT_image_drm_format_modifier
  • VK_EXT_line_rasterization
  • VK_EXT_inline_uniform_block
  • VK_EXT_separate_stencil_usage
  • VK_KHR_separate_depth_stencil_layouts
  • VK_KHR_pipeline_executable_properties
  • VK_KHR_shader_float_controls
  • VK_KHR_spirv_1_4

If you want more details about VK_KHR_pipeline_executable_properties, Iago recently wrote a blog post about it (here).

Android support

Android support for V3DV was added thanks to the work of Roman Stratiienko, who implemented this and submitted Mesa patches. We also want to thank the Android RPi team, and the Lineage RPi maintainer (Konsta) who also created and tested an initial version of that support, which was used as the baseline for the code that Roman submitted. I didn’t test it myself (it’s in my personal TO-DO list), but LineageOS images for the RPi4 are already available.

Performance

In addition to new functionality, we also have been working on improving performance. Most of the focus was done on the V3D shader compiler, as improvements to it would be shared among the OpenGL and Vulkan drivers.

But one of the features specific to the Vulkan driver (pending to be ported to OpenGL), is that we have implemented double buffer mode, only available if MSAA is not enabled. This mode would split the tile buffer size in half, so the driver could start processing the next tile while the current one is being stored in memory.

In theory this could improve performance by reducing tile store overhead, so it would be more beneficial when vertex/geometry shaders aren’t too expensive. However, it comes at the cost of reducing tile size, which also causes some overhead on its own.

Testing shows that this helps in some cases (i.e the Vulkan Quake ports) but hurts in others (i.e. Unreal Engine 4), so for the time being we don’t enable this by default. It can be enabled selectively by adding V3D_DEBUG=db to the environment variables. The idea for the future would be to implement a heuristic that would decide when to activate this mode.

FOSDEM 2022

If you are interested in watching an overview of the improvements and changes to the driver during the last year, we made a presentation at FOSDEM 2022: “v3dv: Status Update for Open Source Vulkan Driver for Raspberry Pi 4”.

by infapi00 at May 16, 2022 09:48 AM

May 10, 2022

Melissa Wen

Multiple syncobjs support for V3D(V) (Part 2)

In the previous post, I described how we enable multiple syncobjs capabilities in the V3D kernel driver. Now I will tell you what was changed on the userspace side, where we reworked the V3DV sync mechanisms to use Vulkan multiple wait and signal semaphores directly. This change represents greater adherence to the Vulkan submission framework.

I was not familiar with Vulkan concepts or the V3DV driver. Fortunately, I counted on the guidance of Igalia’s Graphics team, mainly Iago Toral (thanks!), to understand the Vulkan graphics pipeline, sync scopes, and submission order. Therefore, we changed the original V3DV implementation of vkQueueSubmit and all related functions to allow direct mapping of multiple semaphores from V3DV to the V3D kernel interface.

Disclaimer: Here’s a brief and probably inaccurate background, which we’ll go into more detail later on.

In Vulkan, GPU work submissions are described as command buffers. These command buffers, with GPU jobs, are grouped in a command buffer submission batch, specified by vkSubmitInfo, and submitted to a queue for execution. vkQueueSubmit is the command called to submit command buffers to a queue. Besides command buffers, vkSubmitInfo also specifies semaphores to wait before starting the batch execution and semaphores to signal when all command buffers in the batch are complete. Moreover, a fence in vkQueueSubmit can be signaled when all command buffer batches have completed execution.
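
As a reminder of the API shape that is being mapped onto the kernel interface, here is a bare-bones C sketch of a queue submission with multiple wait and signal semaphores plus a fence. The handles are assumed to have been created elsewhere; the only point is to show where the semaphore arrays live in the structures.

/* Sketch: submit one command buffer that waits on two semaphores and signals
   one, with a fence covering the whole batch. All handles (queue, cmd,
   wait_sems, signal_sem, fence) are assumed to already exist. */
#include <vulkan/vulkan.h>

void submit_batch(VkQueue queue, VkCommandBuffer cmd,
                  VkSemaphore wait_sems[2], VkSemaphore signal_sem, VkFence fence)
{
  VkPipelineStageFlags wait_stages[2] = {
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
    VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT,
  };

  VkSubmitInfo submit = {
    .sType = VK_STRUCTURE_TYPE_SUBMIT_INFO,
    .waitSemaphoreCount = 2,
    .pWaitSemaphores = wait_sems,           /* wait before the batch starts */
    .pWaitDstStageMask = wait_stages,
    .commandBufferCount = 1,
    .pCommandBuffers = &cmd,
    .signalSemaphoreCount = 1,
    .pSignalSemaphores = &signal_sem,       /* signaled when the batch completes */
  };

  vkQueueSubmit(queue, 1, &submit, fence);  /* fence signals after all batches */
}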

From this sequence, we can see some implicit ordering guarantees. Submission order defines the start order of execution between command buffers; in other words, it is determined by the order in which pSubmits appear in vkQueueSubmit and pCommandBuffers appear in VkSubmitInfo. However, we don’t have any completion guarantees for jobs submitted to different GPU queues, which means they may overlap and complete out of order. Of course, jobs submitted to the same GPU engine follow start and finish order. A fence is ordered after all semaphore signal operations in signal operation order. In addition to implicit sync, we also have some explicit sync resources, such as semaphores, fences, and events.

Considering these implicit and explicit sync mechanisms, we rework the V3DV implementation of queue submissions to better use multiple syncobjs capabilities from the kernel. In this merge request, you can find this work: v3dv: add support to multiple wait and signal semaphores. In this blog post, we run through each scope of change of this merge request for a V3D driver-guided description of the multisync support implementation.

Groundwork and basic code clean-up:

As the original V3D kernel interface allowed only one semaphore, V3DV resorted to booleans to “translate” multiple semaphores into one. Consequently, if a command buffer batch had at least one semaphore, it needed to wait for all previously submitted jobs to complete before starting its execution. So, instead of just a boolean, we created and changed structs that store semaphore information to accept the actual list of wait semaphores.

Expose multisync kernel interface to the driver:

In the two commits below, we basically updated the DRM V3D interface from that one defined in the kernel and verified if the multisync capability is available for use.

Handle multiple semaphores for all GPU job types:

At this point, we were only changing the submission design to consider multiple wait semaphores. Before supporting multisync, V3DV waited for the last submitted job to be signaled when at least one wait semaphore was defined, even when serialization wasn’t required. V3DV handles GPU jobs according to the GPU queue in which they are submitted:

  • Control List (CL) for binning and rendering
  • Texture Formatting Unit (TFU)
  • Compute Shader Dispatch (CSD)

Therefore, we changed the submission setup so that jobs submitted to any GPU queue are able to handle more than one wait semaphore.

These commits created all mechanisms to set arrays of wait and signal semaphores for GPU job submissions:

  • Checking the conditions to define the wait_stage.
  • Wrapping them in a multisync extension.
  • According to the kernel interface (described in the previous blog post), configure the generic extension as a multisync extension.

Finally, we extended the ability of GPU jobs to handle multiple signal semaphores, but at this point, no GPU job is actually in charge of signaling them. With this in place, we could rework part of the code that tracks CPU and GPU job completions by verifying the GPU status and threads spawned by Event jobs.

Rework the QueueWaitIdle mechanism to track the syncobj of the last job submitted in each queue:

As we had only a single in/out syncobj interface for semaphores, we used a single last_job_sync to synchronize job dependencies of the previous submission. Although the DRM scheduler guarantees the order in which jobs submitted to the same queue start executing in kernel space, the order of completion isn’t predictable. On the other hand, we still needed to use syncobjs to follow job completion, since we have event threads on the CPU side. Therefore, a more accurate implementation requires last_job syncobjs to track when each engine (CL, TFU, and CSD) is idle. We also needed to keep the driver working on previous versions of the v3d kernel driver with single semaphores, so we kept tracking an ANY last_job_sync to preserve the previous implementation.

Rework synchronization and submission design to let the jobs handle wait and signal semaphores:

With multiple semaphores support, the conditions for waiting on and signaling semaphores changed according to the particularities of each GPU job (CL, CSD, TFU) and CPU job restrictions (Events, CSD indirect, etc.). In this sense, we redesigned V3DV’s semaphore handling and job submission for command buffer batches in vkQueueSubmit.

We scrutinized possible scenarios for submitting command buffer batches in order to change the original implementation carefully. This resulted in three more commits:

We keep track of whether we have submitted a job to each GPU queue (CSD, TFU, CL) and a CPU job for each command buffer. We use syncobjs to track the last job submitted to each GPU queue and a flag that indicates if this represents the beginning of a command buffer.

The first GPU job submitted to a GPU queue in a command buffer should wait on wait semaphores. The first CPU job submitted in a command buffer should call v3dv_QueueWaitIdle() to do the waiting and ignore semaphores (because it is waiting for everything).

If the job is not the first but has the serialize flag set, it should wait on the completion of all last jobs submitted to any GPU queue before running. In practice, this means using syncobjs to track the last job submitted to each queue and adding these syncobjs as job dependencies of the serialized job.

If this job is the last job of a command buffer batch, it may be used to signal semaphores if the command buffer batch has only one type of GPU job (because then we have guarantees of execution ordering). Otherwise, we emit a no-op job just to signal semaphores; it waits on the completion of all last jobs submitted to any GPU queue and then signals the semaphores. Note: we later changed this approach to correctly deal with ordering changes caused by event threads. Whenever we have an event job in the command buffer, we cannot rely on the last-job-in-the-last-command-buffer assumption; we have to wait for all event threads to complete before signaling.

After submitting all command buffers, we emit a no-op job to wait on the completion of all last jobs by queue and signal the fence. Note: at some point, we changed this approach to correctly deal with ordering changes caused by event threads, as mentioned before.

Final considerations

With many changes and many rounds of reviews, the patchset was merged. After more validations and code review, we polished and fixed the implementation together with external contributions:

Also, multisync capabilities enabled us to add new features to V3DV and switch the driver to the common synchronization and submission framework:

  • v3dv: expose support for semaphore imports

    This was waiting for multisync support in the v3d kernel, which is already available. Exposing this feature however enabled a few more CTS tests that exposed pre-existing bugs in the user-space driver so we fix those here before exposing the feature.

  • v3dv: Switch to the common submit framework

    This should give you emulated timeline semaphores for free and kernel-assisted sharable timeline semaphores for cheap once you have the kernel interface wired in.

We used a set of games to ensure there were no performance regressions in the new implementation. For this, we used GFXReconstruct to capture Vulkan API calls while playing those games. Then, we compared results with and without multisync capabilities in kernel space, and also with multisync enabled in v3dv. We didn’t observe any performance penalty, and we even saw improvements when replaying scenes of the vkQuake game.

May 10, 2022 09:00 AM

Multiple syncobjs support for V3D(V) (Part 1)

As you may already know, we at Igalia have been working on several improvements to the 3D rendering drivers of the Broadcom VideoCore GPU found in Raspberry Pi 4 devices. One of our recent efforts focused on improving the V3D(V) drivers’ adherence to the Vulkan submission and synchronization framework. We had to cross various layers of the Linux graphics stack to add support for multiple syncobjs to V3D(V), from the Linux/DRM kernel to the Vulkan driver. We have delivered bug fixes, a generic mechanism to extend the job submission interfaces, and a more direct mapping of the Vulkan synchronization framework. These changes did not impact the performance of the tested games and brought greater precision to the synchronization mechanisms. Ultimately, support for multiple syncobjs opened the door to new features and other improvements to the V3DV submission framework.

DRM Syncobjs

But first, what are DRM syncobjs?

* DRM synchronization objects (syncobj, see struct &drm_syncobj) provide a
* container for a synchronization primitive which can be used by userspace
* to explicitly synchronize GPU commands, can be shared between userspace
* processes, and can be shared between different DRM drivers.
* Their primary use-case is to implement Vulkan fences and semaphores.
[...]
* At its core, a syncobj is simply a wrapper around a pointer to a struct
* &dma_fence which may be NULL.

And Jason Ekstrand summarized dma_fence features well in a talk at the Linux Plumbers Conference 2021; a small sketch of these guarantees in code follows his summary:

A struct that represents a (potentially future) event:

  • Has a boolean “signaled” state
  • Has a bunch of useful utility helpers/concepts, such as refcount, callback wait mechanisms, etc.

Provides two guarantees:

  • One-shot: once signaled, it will be signaled forever
  • Finite-time: once exposed, it is guaranteed to signal in a reasonable amount of time
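
To make those two guarantees a little more concrete, here is a minimal kernel-style sketch. Only the dma_fence calls are real API; the surrounding function is hypothetical:

/* Hypothetical driver code (not from v3d): wait on one dma_fence and then
 * signal another. Once dma_fence_signal() has run, the fence stays signaled
 * forever (the one-shot guarantee above). */
#include <linux/dma-fence.h>

static int example_wait_then_signal(struct dma_fence *in, struct dma_fence *out)
{
	/* Block (interruptibly) until the input fence signals. */
	long ret = dma_fence_wait(in, true);

	if (ret < 0)
		return ret;

	/* ... do the work that depended on "in" ... */

	/* Wake up everyone waiting on "out"; it is now signaled for good. */
	return dma_fence_signal(out);
}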

What does multiple semaphores support mean for Raspberry Pi 4 GPU drivers?

For our main purpose, the multiple syncobjs support means that V3DV can submit jobs with more than one wait and signal semaphore. In the kernel space, wait semaphores become explicit job dependencies to wait on before executing the job. Signal semaphores (or post dependencies), in turn, work as fences to be signaled when the job completes its execution, unlocking following jobs that depend on its completion.

The development of multisync support comprised many decision-making points and steps, summarized as follows:

  • added to the v3d kernel driver the capability to handle multiple syncobjs;
  • exposed multisync capabilities to userspace through a generic extension;
  • reworked the synchronization mechanisms of the V3DV driver to benefit from this feature;
  • enabled the simulator to work with multiple semaphores; and
  • tested Vulkan games to verify correctness and possible performance enhancements.

We decided to refactor parts of the V3D(V) submission design in both kernel space and userspace during this development. We improved job scheduling in the V3D kernel driver and the V3DV job submission design. We also delivered more accurate synchronization mechanisms and further updates to the Broadcom Vulkan driver running on the Raspberry Pi 4. Below, we summarize the changes in kernel space, describing the previous state of the driver, the decisions taken, side improvements, and fixes.

From single to multiple binary in/out syncobjs:

Initially, V3D was very limited in the number of syncobjs per job submission. The V3D job interfaces (CL, CSD, and TFU) only supported one syncobj (in_sync) to be added as an execution dependency and one syncobj (out_sync) to be signaled when a submission completes. The only exception was CL submission, which accepts two in_syncs: one for the binner job and another for the render job; but that didn’t change the overall limitation.
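
For reference, the synchronization-related part of the old CL submit struct looked roughly like this (abridged sketch; the remaining fields of the real uAPI struct are omitted):

/* Abridged sketch of the pre-multisync v3d uAPI: one out_sync per
 * submission and, for CL only, one in_sync per binner/render job. */
struct drm_v3d_submit_cl {
	/* ... command list addresses, BO handles, etc. omitted ... */
	__u32 in_sync_bcl; /* wait syncobj for the binner job */
	__u32 in_sync_rcl; /* wait syncobj for the render job */
	__u32 out_sync;    /* syncobj signaled when the submission completes */
};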

Meanwhile, in userspace, the V3DV driver followed alternative paths to meet Vulkan’s synchronization and submission framework. It needed to handle multiple wait and signal semaphores, but the V3D kernel-driver interface only accepted one in_sync and one out_sync. In short, V3DV had to squash multiple semaphores into one when submitting every GPU job.

Generic ioctl extension

The first decision was how to extend the V3D interface to accept multiple in and out syncobjs. We could have extended each ioctl with two syncobj-array entries and two entries for their counters. We could have created new ioctls with multiple in/out syncobjs. But after examining how other drivers extend their submission interfaces, we decided to extend the V3D ioctls (v3d_cl_submit_ioctl, v3d_csd_submit_ioctl, v3d_tfu_submit_ioctl) with a generic ioctl extension.

I found a curious commit message when I was examining how other developers handled the issue in the past:

Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Fri Mar 22 09:23:22 2019 +0000

    drm/i915: Introduce the i915_user_extension_method
    
    An idea for extending uABI inspired by Vulkan's extension chains.
    Instead of expanding the data struct for each ioctl every time we need
    to add a new feature, define an extension chain instead. As we add
    optional interfaces to control the ioctl, we define a new extension
    struct that can be linked into the ioctl data only when required by the
    user. The key advantage being able to ignore large control structs for
    optional interfaces/extensions, while being able to process them in a
    consistent manner.
    
    In comparison to other extensible ioctls, the key difference is the
    use of a linked chain of extension structs vs an array of tagged
    pointers. For example,
    
    struct drm_amdgpu_cs_chunk {
    	__u32		chunk_id;
        __u32		length_dw;
        __u64		chunk_data;
    };
[...]

So, inspired by amdgpu_cs_chunk and i915_user_extension, we opted to extend the V3D interface through a generic extension mechanism. After applying some suggestions from Iago Toral (Igalia) and Daniel Vetter, we arrived at the following struct:

struct drm_v3d_extension {
	__u64 next;
	__u32 id;
#define DRM_V3D_EXT_ID_MULTI_SYNC		0x01
	__u32 flags; /* mbz */
};

This generic extension has an id to identify the feature/extension we are adding to an ioctl (which maps to the related struct type), a pointer to the next extension, and flags (if needed). Whenever we need to extend the V3D interface again for another specific feature, we subclass this generic extension into the specific one instead of extending the ioctls indefinitely.
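
On the kernel side, an ioctl that receives such a chain can walk it with something along these lines. This is a simplified sketch, not the literal v3d code, and error handling is trimmed down:

/* Simplified sketch: walk a chain of drm_v3d_extension structs passed by
 * userspace, dispatching on the extension id. */
static int example_get_extensions(u64 user_handle)
{
	while (user_handle) {
		struct drm_v3d_extension ext;

		if (copy_from_user(&ext, u64_to_user_ptr(user_handle), sizeof(ext)))
			return -EFAULT;

		switch (ext.id) {
		case DRM_V3D_EXT_ID_MULTI_SYNC:
			/* Copy the full drm_v3d_multi_sync struct here and stash
			 * the in/out syncobj arrays for the submission code. */
			break;
		default:
			return -EINVAL;
		}

		user_handle = ext.next; /* follow the chain to the next extension */
	}

	return 0;
}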

Multisync extension

For the multiple syncobjs extension, we define a multi_sync extension struct that subclasses the generic extension struct. It has arrays of in and out syncobjs, the respective number of elements in each of them, and a wait_stage value used in CL submissions to determine which job needs to wait for syncobjs before running.

struct drm_v3d_multi_sync {
	struct drm_v3d_extension base;
	/* Array of wait and signal semaphores */
	__u64 in_syncs;
	__u64 out_syncs;

	/* Number of entries */
	__u32 in_sync_count;
	__u32 out_sync_count;

	/* set the stage (v3d_queue) to sync */
	__u32 wait_stage;

	__u32 pad; /* mbz */
};

If a multisync extension is defined, the V3D driver ignores the previous single in/out syncobj interface.
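
From userspace, a submission that uses the extension would be chained together roughly as in the sketch below. I am assuming here that the submit struct gained an extensions pointer and a flag announcing it (as added by the patchset); treat those field and flag names, as well as the wait-stage constant, as illustrative rather than quoted from the header:

/* Illustrative userspace sketch: attach a multisync extension to a CL
 * submission. The submit-struct field names, the DRM_V3D_SUBMIT_EXTENSION
 * flag and the V3D_BIN constant are assumed for the sake of the example. */
#include <stdint.h>
#include <xf86drm.h>
#include "v3d_drm.h" /* the v3d uAPI header */

static int example_submit_cl(int fd, struct drm_v3d_submit_cl *submit,
                             uint32_t *in_syncs, uint32_t in_count,
                             uint32_t *out_syncs, uint32_t out_count)
{
	struct drm_v3d_multi_sync ms = {
		.base = {
			.id = DRM_V3D_EXT_ID_MULTI_SYNC,
			.next = 0, /* last (and only) extension in the chain */
		},
		.in_syncs = (uintptr_t)in_syncs,
		.out_syncs = (uintptr_t)out_syncs,
		.in_sync_count = in_count,
		.out_sync_count = out_count,
		.wait_stage = V3D_BIN, /* assumed: the binner job waits on the in_syncs */
	};

	submit->flags |= DRM_V3D_SUBMIT_EXTENSION; /* assumed flag name */
	submit->extensions = (uintptr_t)&ms;       /* assumed field name */

	return drmIoctl(fd, DRM_IOCTL_V3D_SUBMIT_CL, submit);
}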

Once we had the interface to support multiple in/out syncobjs, the v3d kernel driver needed to handle it. As V3D uses the DRM scheduler for job execution, changing from a single syncobj to multiple ones is quite straightforward. V3D copies the in syncobjs from userspace and uses drm_syncobj_find_fence() + drm_sched_job_add_dependency() to add all in_syncs (wait semaphores) as job dependencies, i.e. syncobjs to be checked by the scheduler before running the job. On CL submissions, we have the bin and render jobs, so V3D follows the value of wait_stage to determine which job depends on those in_syncs to start its execution.
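
In sketch form (simplified from the real driver code), turning the in_syncs into scheduler dependencies looks something like this:

/* Simplified sketch: resolve each in_sync handle into a dma_fence and hand
 * it to the DRM scheduler as a dependency of the job. */
static int example_add_wait_deps(struct drm_file *file_priv,
				 struct drm_sched_job *job,
				 const u32 *handles, u32 count)
{
	u32 i;

	for (i = 0; i < count; i++) {
		struct dma_fence *fence = NULL;
		int ret;

		ret = drm_syncobj_find_fence(file_priv, handles[i], 0, 0, &fence);
		if (ret)
			return ret;

		/* The scheduler will not start the job until this fence signals;
		 * the fence reference is handed over to the scheduler. */
		ret = drm_sched_job_add_dependency(job, fence);
		if (ret)
			return ret;
	}

	return 0;
}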

When V3D defines the last job in a submission, it replaces the dma_fence of the out_syncs with the done_fence of this last job. It uses drm_syncobj_find() + drm_syncobj_replace_fence() to do that. Therefore, when a job completes its execution and signals done_fence, all out_syncs are signaled too.
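
And, in equally simplified form, pointing every out_sync at the last job’s done_fence looks roughly like this:

/* Simplified sketch: make every out_sync carry the last job's done_fence,
 * so that signaling that one fence signals all of the out_syncs. */
static int example_attach_out_syncs(struct drm_file *file_priv,
				    struct dma_fence *done_fence,
				    const u32 *handles, u32 count)
{
	u32 i;

	for (i = 0; i < count; i++) {
		struct drm_syncobj *syncobj = drm_syncobj_find(file_priv, handles[i]);

		if (!syncobj)
			return -ENOENT;

		drm_syncobj_replace_fence(syncobj, done_fence);
		drm_syncobj_put(syncobj);
	}

	return 0;
}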

Other improvements to v3d kernel driver

This work also made possible some improvements to the original implementation. Following Iago’s suggestions, we refactored the job initialization code to allocate memory and initialize a job in one go. With this, we started to clean up resources more cohesively, clearly distinguishing cleanup on failure from cleanup on job completion. We also fixed the resource cleanup for the case where a job is aborted before the DRM scheduler arms it - at that point, drm_sched_job_arm() had recently been introduced to job initialization. Finally, we prepared the semaphore interface to implement timeline syncobjs in the future.

Going Up

The patchset that adds multiple syncobjs support and improvements to V3D is available here and comprises four patches:

  • drm/v3d: decouple adding job dependencies steps from job init
  • drm/v3d: alloc and init job in one shot
  • drm/v3d: add generic ioctl extension
  • drm/v3d: add multiple syncobjs support

After extending the V3D kernel interface to accept multiple syncobjs, we worked on V3DV to benefit from V3D multisync capabilities. In the next post, I will describe a little of this work.

May 10, 2022 08:00 AM

May 09, 2022

Iago Toral

VK_KHR_pipeline_executable_properties

Sometimes you want to go and inspect details of the shaders that are used with specific draw calls in a frame. With RenderDoc this is really easy if the driver implements VK_KHR_pipeline_executable_properties. This extension allows applications to query the driver about various aspects of the executable code generated for a Vulkan pipeline.

I implemented this extension for V3DV, the Vulkan driver for Raspberry Pi 4, last week (it is currently in the review process) because I was tired of jumping through hoops to get the info I needed when looking at traces. For V3DV we expose the NIR and QPU assembly code as well as various other stats, some of which are quite relevant to performance, such as spill or thread counts.


Some shader statistics

Final NIR code

QPU assembly

by Iago Toral at May 09, 2022 10:38 AM

May 02, 2022

Víctor Jáquez

From gst-build to local-projects

Two years ago I wrote a blog post about using gst-build inside of WebKit SDK flatpak. Well, all that has changed. That’s the true upstream spirit.

There were two main reasons for the change:

  1. Since the switch to the GStreamer mono repository, gst-build has been deprecated. The mechanism in WebKit was added, basically, to allow working on GStreamer upstream, so keeping a gst-build directory just polluted the conceptual framework.
  2. By using gst-build one could override almost any other package in the WebKit SDK. For example, for developing gamepad handling in WPE I added libmanette as a GStreamer subproject, to link a modified version of the library rather than the one in flatpak. But that approach added an unneeded conceptual depth to the tree.

In order to simplify these operations by taking advantage of Meson’s subproject support directly, gst-build handling was removed and a new mechanism was put in place: Local Dependencies. With local dependencies, you can add or override almost any dependency while flattening the tree layout, placing GStreamer and any other library at the same level. Of course, in order to add dependencies, they must be built with Meson.

For example, to override libsoup and GStreamer, just clone both repositories below Tools/flatpak/local-projects/subprojects and declare them in the WEBKIT_SDK_LOCAL_DEPS environment variable:


$ export WEBKIT_SDK_LOCAL_DEPS=libsoup,gstreamer-full
$ export WEBKIT_SDK_LOCAL_DEPS_OPTIONS="-Dgstreamer-full:introspection=disabled -Dgst-plugins-good:soup=disabled"
$ build-webkit --wpe

by vjaquez at May 02, 2022 11:11 AM

Brian Kardell

Slightly Random Interface Thoughts

Slightly Random Interface Thoughts

Not the normal fare for my blog, but my mind makes all kinds of weird connections after work, I suppose attempting to synthesize things. Occasionally it leads me down a road I think is interesting. Today it had me thinking about interfaces.

When I was very young, everything (TVs, radios, stereos, etc) had analog controls - mainly 2 kinds, switches and knobs. Almost everything was controlled and tuned by big physical knobs. Some of my grandparents' things were probably 25 years old by then, and that's how they worked. While "new stuff" at the time certainly had differences, there was a definite "sameness" to it. They were mainly switches and dial knobs. We were tweaking, more or less, the same old things, for a pretty long time.

That makes sense on a whole lot of levels - it has benefits. You know it works. Your users understand how to use it. It's proven.

Of course, there were improvements or experiments at the edges: minor evolutions of varying degrees. Maybe they changed the size or shape of the knobs, or gave you additional knobs for different kinds of fine-tuning.

But then, sometimes something was interestingly new. The first takes are almost always not great, but they inspire other ideas. Ideas from all over start to mix and smash together and, ultimately, we get a kind of whole new "species".

My 4k smart TV is about as different from my grandmother's TV as you can imagine. It's a few speciations removed.

Human Machine Interfaces

Really, that's the case with pretty much everything we use to interface with machines, whether it is a physical interface, or a digital one. Nothing stays entirely the same. Comparing Windows 3.1 programs to their counterparts today, for example, would show you lots of variation and evolution in controls. Last year we wrote up some documentation surrounding research while working on "tabs" and if you breeze through it a bit, you can see a bit of discussion showing a fair bit of evolution and cross influencing over the years for just that one control.

But, what connected in my mind was something else entirely: "Game controllers". I say "game controllers" because historically, again, there is a lot of cross-pollination of ideas - these things often wind up being used for far more than just games. Indeed, in many of these devices they are your primary interface to a whole immersive operating system. That's basically your entire means of interacting with Igalia's new Wolvic Browser, for example, which is geared toward XR environments like this. It's interesting how we've smashed ideas from lots of different things together here and it's got me to thinking about the changes I've seen along the way.

Brief recollections

Way back, before my time, people started making games on computers (PDPs) with SpaceWar! That's just a neat thing you can learn a bunch about here if you want, and see in action.

You can see though that our ideas were very rough - maybe you could repurpose keys on a keyboard or, as they did in the video - wire on/off buttons for everything: a button to turn left, a button to turn right, a button to fire, a button to thrust, etc.

By the time I was a kid we'd popularly moved on to Pong and pong-like games which had a physical knob, initially on the device itself. Later we'd separate those into physical corded paddles and add a button. Ohh, neat.

Very quickly came many takes on a joystick with a button (or two). Again, some tried more radical takes, like ColecoVision, which had a short stick with a fat head, two side triggers at the top, and a whole number pad!

But, really, "a joystick that you 'grab' with one hand and a button or two" became the dominant paradigm. They had a definite "sameness" to the Atari 2600 model.

And for the next several years most things just tweaked this a little bit. It was applied in arcades and on home computers and "game systems", but those ideas also wound up being applied to things like controlling heavy machinery.

Then, suddenly in the mid 1980s the original Nintendo was introduced here in the US. It said "Joystick? Hell no. They don't bring us joy." Instead, it introduced the d-pad, and two primary buttons, and two 'special' non-gameplay buttons arranged in the center. This was kind of the first big change in a while.

And then came some tweaking on the edges... A little bit later we got the Super Nintendo, which gave us 4 gameplay buttons instead of two.

Then came something more radical. The Nintendo 64 came with a Frankenstein d-pad + tiny thumb-operated joystick and nine gameplay buttons, as well as radically changing the physical shape into some kind of monstrosity by introducing "handles". Holy crap, what a monstrosity.

Then in 1997 we got the Playstation 1 controller: A d-pad, two thumbsticks with better shape and placement and basically 8 buttons - and... better handles.

What struck me was...

Wow - 1997 was a quarter of a century ago! A bit like my grandmother's radio, while there are certainly differences, I feel like there is a definite 'sameness' to controls for game systems today. They have many other kinds of advances - haptics, pitch and roll stuff, a touch pad inspired by other innovations, and now in the ps5 some resistance stuff... But really, they are very similar at their core in terms of how you interact with them, and these have all been bolted on at the edges. Anyone who played PS1 games could pretty easily jump into a PS5 controller. And there's a lot of good to that. It clearly works well.

But then, suddenly, all of these XR devices did something new and different... Just like all of the other examples, they're clearly very inspired by the controllers that came before them, but because they really focus on the sensors as a primary mechanism rather than a secondary one, they've basically split the controller in half.

A few of my favorite games blend them rather nicely, allowing you to use the left thumbstick to walk around as you would in any game, but control your immediate body with movement. I have to say, this feels kind of more natural to me in a lot of ways - a nicer blend.

I have to wonder what kind of other uses we'll put these kinds of advancements to - how the ideas will mix together. Very possibly XR will continue trying to free you up even more, maybe these controls won't even last long there. But, they're clearly on to something, I think - and I can easily imagine these being great ways to do all sorts of things that we're still using the current "normal" gamepads for - and a lot more.

I'm excited to see what happens next.

May 02, 2022 04:00 AM

April 26, 2022

Eric Meyer

Flexibly Centering an Element with Side-Aligned Content

In a recent side project that I hope will become public fairly soon, I needed to center a left-aligned list of links inside the sides of the viewport, but also line-wrap in cases where the lines got too long (as in mobile). There are a few ways to do this, but I came up with one that was new to me. Here’s how it works.

First, let’s have a list.  Pretend each list item contains a link so that I don’t have to add in all the extra markup.

<ol>
	<li>Foreword</li>
	<li>Chapter 1: The Day I Was Born</li>
	<li>Chapter 2: Childhood</li>
	<li>Chapter 3: Teachers I Admired</li>
	<li>Chapter 4: Teenage Dreaming</li>
	<li>Chapter 5: Look Out World</li>
	<li>Chapter 6: The World Strikes Back</li>
	<li>Chapter 7: Righting My Ship</li>
	<li>Chapter 8: In Hindsight</li>
	<li>Afterword</li>
</ol>

Great. Now I want it to be centered in the viewport, without centering the text. In other words, the text should all be left-aligned, but the element containing them should be as centered as possible.

One way to do this is to wrap the <ol> element in another element like a <div> and then use flexbox:

div.toc {
	display: flex;
	justify-content: center;
}

That makes sense if you want to also vertically center the list (with align-items: center) and if you’re already going to be wrapping the list with something that should be flexed, but neither really applied in this case, and I didn’t want to add a wrapper element that had no other purpose except centering. It’s 2022, there ought to be another way, right? Right. And this is it:

ol {
	max-inline-size: max-content;
	margin-inline: auto;
}

I also could have used width there in place of max-inline-size since this is in English, so the inline axis is horizontal, but as Jeremy pointed out, it’s a weird clash to have a physical property (width) and a logical property (margin-inline) working together. So here, I’m going all-logical, which is probably better for the ongoing work of retraining myself to instinctively think in logical directions anyway.

Thanks to max-inline-size: max-content, the list can’t get any wider (more correctly: any longer along the inline axis) than the longest list item. If the container is wider than that, then margin-inline: auto means the ol element’s box will be centered in the container, as happens with any block box where the width is set to a specific amount, there’s leftover space in the container, and the side margins of the box are set to auto. This is as if I’d pre-calculated the maximum content size to be (say) 434 pixels wide and then declared max-inline-size: 434px.

The great thing here is that I don’t have to do that pre-calculation, which would be very fragile in any case. I can just use max-content instead. And then, if the container ever gets too small to fit the longest bit of content, because the ol was set to max-inline-size instead of just straight inline-size, it can fill out the container as block boxes usually do, and the content inside it can wrap to multiple lines.

Perhaps it’s not the most common of layout needs, but if you find yourself wanting a lightweight way to center the box of an element with side-aligned content, maybe this will work for you.

What’s nice about this is that it’s one of those simple things that was difficult-to-impossible for so long, with hacks and workarounds needed to make it work at all, and now it… just works.  No extra markup, not even any calc()-ing, just a couple of lines that say exactly what they do, and are what you want them to do.  It’s a nice little example of the quiet revolution that’s been happening in CSS of late.  Hard things are becoming easy, and more than easy, simple.  Simple in the sense of “direct and not complex”, not in the sense of “obvious and basic”.  There’s a sense of growing maturity in the language, and I’m really happy to see it.


Have something to say to all that? You can add a comment to the post, or email Eric directly.

by Eric Meyer at April 26, 2022 09:31 PM

April 21, 2022

Qiuyi Zhang (Joyee)

Fixing snapshot support of class fields in V8

Up until V8 10.0, the class field initializers had been

April 21, 2022 03:28 AM

April 19, 2022

Manuel Rego

Web Engines Hackfest 2022

Once again Igalia is organizing the Web Engines Hackfest. This year the event is going to be hybrid. Though most things will happen on-site, online participation in some part of the event is going to be possible too.

Regarding dates, the hackfest will take place on June 13 & 14 in A Coruña. If you’re interested in participating, you can find more information and the registration form on the event website: https://webengineshackfest.org/2022/.

What’s the Web Engines Hackfest?

This event started a long way back. The first edition happened in 2009, when 12 folks visited the Igalia offices in A Coruña and spent a whole week there working on the WebKitGTK port. At that time the project was in its early stages and lots of work was needed, so those joint weeks were very productive to move things forward, discuss plans and implement features.

As the event grew and more people got interested, in 2014 it was renamed to Web Engines Hackfest and started to welcome people working on different web engines. This brought the opportunity for engineers of the different browsers to come together for a few days and discuss different features.

The hackfest has continued to grow and these days we welcome anyone that is somehow involved on the web platform. In this year’s event there will be people from different parts of the web platform community, from implementors and spec editors, to people interested in some particular feature.

This event has an unconference format. People attending are the ones defining the topics, and work together in breakout sessions to discuss them. They could be issues on a particular browser, generic purpose features, new ideas, even sometimes tooling demos. In addition, we always arrange a few talks as part of the hackfest. But the most important part of the event is being together with very different folks and having the chance to discuss a variety of topics with them. There are not lots of places where people from different companies and browsers join together to discuss topics. The idea of the hackfest is to provide a venue for that to happen.

2022 edition

This year we’re hosting the event in a new place, as Igalia’s office is no longer big enough to host all the people that will be attending the event. The venue is called Palexco and it’s close to the city center and just by the seaside (with views of the port). It’s a great place with lots of spaces and big rooms, so we’ll be very comfortable there. Note that we’ll have childcare service for the ones that might need it.

New venue: Palexco (picture by Jose Luis Cernadas Iglesias)

The event is going to be 2 days this time, June 13th and 14th. Hopefully the weather will be great at that time of the year, and the folks visiting A Coruña should be able to really enjoy the trip. There are going to be lots of daylight hours too; sunrise is going to be around 7am and sunset past 10pm.

The registration form is still open. So far we’ve got a good amount of people registered from different companies like: Arm, Deno Land, Fission, Google, Igalia, KaiOS, Mozilla, Protocol Labs, Red Hat and Salesforce.

Arm, Google and Igalia will be sponsoring the 2022 edition, and we’re really thankful for their support! If your company is also interested in sponsoring the hackfest, please contact us at hackfest@webengineshackfest.org.

Apart from that, there are going to be some talks that will be live streamed during the event. We have a Call For Papers with a deadline at the end of this month. Talks can be on-site or remote, so if you’re interested in giving one, please fill in the form.

We know we’re in complex times and not everyone can attend onsite this year. We’re sorry about that, and we hope you all can make it in future editions.

Looking forward to the Web Engines Hackfest 2022!

April 19, 2022 10:00 PM

April 11, 2022

Byungwoo Lee

April 10, 2022

Clayton Craft

-h --help -help help --? -? ????

Scenario: Congratulations, you won the lottery! You can barely believe your eyes as you stand there holding the winning ticket! It's amazing - so many feelings rush over you as you realize that some of your dreams are within reach now! You run over, nay, you float over to the lottery office to collect your winnings in pure excitement. You push open the doors to the building, scamper up to the front desk, present your ticket to the clerk, and the exchange goes something like this:

You: Hi! I won! Here's my ticket! Where do I collect my winnings?

Clerk: Hello. I understand you would like to collect your winnings, but I'm afraid I cannot let you do that unless you ask me in a very specific way.

You: .....

Clerk: Perhaps try something like "May I ..., please?"

You: May I have my winnings, please?

Clerk: Hello. I understand you would like to collect your winnings, but I'm afraid I cannot let you do that unless you ask me in a very specific way.

You: May I collect my winnings, please?

Clerk: Congrats on winning! Here you go!

Of course this would never happen in real life, right? There's no possible situation where the above interaction would make any sense in any way.

$ podman -h
Error: pflag: help requested
See 'podman --help'

Ya... Ok. I'm picking on podman[1] above, but it's pervasive in many, many command line tools. There are innumerable ways to ask a tool for help, and this blog's title has the most common ways I've seen, though I'm quite sure there are more. Anyway, the point of this post is to talk a little about the various ways to ask for help on the command line and quickly go over pitfalls.

-h / --help

Ah, the POSIX short/long help options. These are classics. Any competent, POSIX-compliant argument parser will handle them just fine. There are command argument parsers in many, many languages that are (or claim to be) compliant. A myriad of tools use these options, so there's a good chance your users are familiar with using them to ask for help. In my humble opinion, these are the best options to support because of how pervasive support is for them. In other words, many users have been trained with plentiful tools over considerable time, and have built these into their muscle memory. There's a reason why emergency phone numbers don't change arbitrarily every time some operator wants to "disrupt" the scene, thinking they know better. When it comes to asking for help, you probably want your users to get what they need quickly so they can use your tool.

[Edit 2021-04-12] Ok, I was wrong about long options being a POSIX thing, I guess they're a GNU thing.
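
For what it's worth, honoring both -h and --help takes only a handful of lines with the classic getopt_long(3) API. A minimal sketch:

/* Minimal sketch: accept both -h and --help with getopt_long(3). */
#include <getopt.h>
#include <stdio.h>
#include <stdlib.h>

static void usage(const char *prog)
{
	printf("usage: %s [-h|--help]\n", prog);
}

int main(int argc, char **argv)
{
	static const struct option long_opts[] = {
		{ "help", no_argument, NULL, 'h' },
		{ 0 }
	};
	int opt;

	while ((opt = getopt_long(argc, argv, "h", long_opts, NULL)) != -1) {
		switch (opt) {
		case 'h':               /* the user asked for help: show it */
			usage(argv[0]);
			return EXIT_SUCCESS;
		default:                /* unknown option: show help anyway */
			usage(argv[0]);
			return EXIT_FAILURE;
		}
	}

	return EXIT_SUCCESS;
}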

-help

This one might save you 1 keystroke over --help, but it breaks any attempt to support short option chaining. For example, tar -xjf becomes impossible to parse correctly if the tool expects long option names to be preceded by a single dash. Did the user mean -x -j -f? Or some option called xjf?

Honestly, in practice, I've seen many tools that support -help also allow --help and -h for those who have the muscle memory reflex for those, so it's not nearly as problematic.

help

Some folks like to treat "help" as a command/verb on the command line. Some examples might include:

$ go help build

$ podman help run

This pattern is cumbersome to deal with in practice, especially in tools that use subcommands. Instead of typing foo bar -h to get help, you have to move the cursor between foo and bar to insert help: foo help bar in order to get help about the bar subcommand. Then, once you presumably know how to use it, up-arrow, remove help from between the tool and subcommand, move to the end of the line, and continue on.

"Help" is commonly used in speech as an interjection, "Help!", and as a noun, "I need help with ____." It's also used as a verb, e.g., "Can you help me with ____?" However, I feel using it as a verb in command line tools that use the command/subcommand structure is awkward at best, as demonstrated above. It's also a verb in the following sentence: "Are you going to help me?!" Which is exactly what I feel like shouting every time I am forced to deal with tools that insist on using this pattern.

--? / -?

???? I have no idea where these came from, but my guess is that they are migrants from the wild west of Windows-land, where I assume the shell won't try to expand ? into anything. Using these options will cause problems for anyone using common shells like bash, zsh, and others. Don't do it.

asking for help

One final bit to end with: As in the case of podman above, if you know your user is asking for help, show them the damn help. It serves no one to chide them for not guessing the specific way your app wants them to ask for help. Better yet, support a more "common" way to allow users to ask for help if your app doesn't already. /rant

  1. Handling -h aside, podman is a really great alternative to docker. I highly recommend it, for many technical and non-technical reasons!

April 10, 2022 12:00 AM

April 07, 2022

Manuel Rego

:focus-visible is shipping in Safari/WebKit

This is the final report about the work Igalia has been doing to add support for :focus-visible in WebKit. As you probably already know this work is part of the Open Prioritization campaign by Igalia that has been funded by different people and organizations. Big thanks to all of you! If you’re curious and want to know all the details you can find the previous reports on this blog.

The main highlight for this blog post is that :focus-visible has been enabled by default in WebKit (r286783). 🚀 This change was included in Safari Technology Preview 138, with its own post on the official WebKit blog. And finally reached a stable release in Safari 15.4. It’s also included in WebKitGTK 2.36 and WPE WebKit 2.36.

Open Prioritization

Let’s start from the beginning: my colleague Brian Kardell had an idea to find more diverse ways to sponsor the development of the web platform, and after some internal discussion that idea materialized into what we call Open Prioritization. In summer 2020 Igalia announced Open Prioritization, which initially had six different features on the list:

  • CSS lab() colors in Firefox
  • :focus-visible in WebKit/Safari
  • HTML inert in WebKit/Safari
  • Selector list arguments for :not() in Chrome
  • CSS Containment support in WebKit/Safari
  • CSS d (SVG path) support in Firefox

At that time I wrote a blog post about this effort and the CSS Containment in WebKit proposal, and my colleagues did the same for the rest of the contenders.

After some months :focus-visible was the winner. By the end of 2020 we launched the Open Prioritization Collective to collect funds and we started our work on the implementation side.

Last year at TPAC, Eric Meyer gave an awesome talk called Adventures in Collective Implementation, explaining the Open Prioritization effort and the ideas behind it. This presentation also explains why there’s room for external investments (like this one) in the web platform, and that all open source projects (in particular the web browser engines) always have to make decisions regarding priorities. Investing on them will help to influence those priorities and speed up the development of features you’re interested in.

It’s been quite a while since we started all this, but now :focus-visible is supported in WebKit/Safari, so we can consider that the first Open Prioritization experiment has been successful. When :focus-visible was first enabled by default in Safari Technology Preview early this year, there were lots of misunderstandings about how the development of this feature was funded. Happily Eric wrote a great blog post on the matter, explaining all the details and going over some of the ideas from his TPAC talk.

:focus-visible is shipping in WebKit, how did that happen?

In November last year, I gave a talk at CSS Conf Armenia about the status of things regarding :focus-visible implementation in WebKit. In that presentation I explained some of the open issues and why :focus-visible was not enabled by default yet in WebKit.

The main issue was that Apple was not convinced about not showing a focus indicator (focus ring) when clicking on a focusable element (like a <div tabindex="0">). However, this is one of the main goals of :focus-visible itself: avoiding a focus indicator in such situations. As Chromium and Firefox were already doing this, and aiming for better interoperability between the different implementations, Apple finally accepted this behavioral change in WebKit.

Then Antti Koivisto reviewed the implementation, suggesting a few changes and spotting some issues (thanks for that). Those things were fixed and the feature was enabled by default in the codebase last December. As usual, once a feature is enabled some more issues appear, and they were fixed too, including a generic issue regarding accesskey on focusable elements, which required adding support for testing accesskey in WebKit Web Platform Tests (WPT).

As part of all this work since my previous blog post we landed 9 more patches on WebKit, making a total of 36 patches for the whole feature, together with a few new WPT tests.

Buttons and :focus-visible on Safari

This topic has been mentioned in my previous posts and also in my talk. Buttons (and other form controls) are not mouse focusable in Safari (both on macOS and iOS); this means that when you click a button in Safari, the button is not focused. The goal of this behavior is to match Apple platform conventions, where focus doesn’t move when you click a button. However, Safari’s implementation differs from the platform one, as focus actually gets lost when you click on such elements. There are some very old issues in the WebKit bug tracker about this topic (see #22261 from 2008 or #112968 from 2013, for example).

There’s a kind of coincidence related to this. Before :focus-visible existed, buttons never showed a focus indicator in Safari after a mouse click, as they are not mouse focusable. This was different in other browsers, where a focus ring was shown when clicking on buttons. So while :focus-visible fixed this issue for other browsers, it didn’t change the default behavior for buttons in Safari.

However, with the :focus-visible implementation we introduced a problem somehow related to this. Imagine a page that has an element, and when you click it, the page moves the focus via script (using HTMLElement.focus()) to a different element. Should the newly focused element show a focus indicator? Or in other words, should it match :focus-visible?


The answer varies depending on whether or not the clicked element is mouse focusable:

  1. If you click on a focusable element and the focus gets moved via script to a different element, the newly focused element does NOT show a focus indicator and thus it does NOT match :focus-visible.
  2. If you click on a NON focusable element and the focus gets moved via script to a different element, the newly focused element shows a focus indicator and thus it matches :focus-visible.

All implementations agree on this, and Chromium and Firefox have been shipping this behavior for more than a year without known issues so far. But a problem appeared on Safari because, unlike the rest of the browsers, buttons are not mouse focusable there. So when you click a button in Safari, you fall into point 2) above, and end up showing a focus indicator on the newly focused element. Web authors don’t want to show a focus indicator in those situations, and that’s something that :focus-visible fixes through point 1) in the rest of the browsers, but not in Safari (see bug #236782 for details).

We landed a workaround to fix this problem in Safari, which adds an exception so that buttons follow point 1) even though they are not mouse focusable. Anyway, this doesn’t look like a long-term solution, and looking into making buttons mouse focusable in Safari might be the way to go in the future. That would also help to solve other interop issues.

And now what?

The feature is complete and shipped, but as usual there are some other things that could be done as next steps:

  • The :focus-visible specification is kind of vague and has no normative text about when to show a focus indicator or not. This was done on purpose, to allow advancing in this area and to have the flexibility to adapt to user needs. Anyway, now that all 3 major web engines agree on the implementation, maybe there’s a chance to define this in some spec. We tried to write a PR for the HTML spec when we started the work on this feature; at that time it was closed, probably because it was not the right time anyway. But maybe something like that could be retaken at some point in the future.
  • WebKit Web Inspector (Dev Tools) doesn’t allow you to force :focus-visible yet. We sent a patch for forcing :focus-within first, but some UI refactoring is needed; once that’s done, adding support for :focus-visible too should be straightforward.
  • Coming back to the topic of buttons not being mouse focusable in Safari: the web platform provides a way to make elements not keyboard focusable via tabindex="-1". Why not provide a way to mark an element as not mouse focusable? Maybe there could be a proposal for a new HTML attribute that allows making elements not mouse focusable; that way websites could mimic Apple platform conventions. There are nice use cases for this, for example when you’re editing an input and then click on some button to show some contextual information: with something like this you could avoid losing the focus from the input and carry on with your editing.

Wrap-up

So yeah, more than a year after Igalia started working on :focus-visible in WebKit, we can now consider this work complete. We can call the first Open Prioritization experiment a success, and we can celebrate together with all the people that have supported us during this achievement. 🎉

Thank you very much to all the people that sponsored this work, and also to all the people that helped reviewing patches, reporting bugs, discussing things, etc. during all this time. Without all your support we wouldn’t have been able to make this happen. 🙏

Last but not least, we’d like to highlight how this work has helped the web platform as a whole. Now the major web browser engines have shipped :focus-visible and are using it in the default UA stylesheet. This makes tweaking the focus indicator on websites easier than ever.

April 07, 2022 10:00 PM

April 06, 2022

Qiuyi Zhang (Joyee)

Uncaught exceptions in Node.js

In this post, I’ll jot down some notes that I took when refactoring the uncaught exception handling routines in Node.js. Hopefully it

April 06, 2022 07:47 AM

On deps/v8 in Node.js

I recently ran into a V8 test failure that only showed up in the V8 fork of Node.js but not in the upstream. Here I’ll write down my

April 06, 2022 07:47 AM

March 30, 2022

Samuel Iglesias

Igalia Coding Experience, GSoC, Outreachy, EVoC

Do you want to start a career in open-source? Do you want to learn amazing skills while getting paid? Keep reading!

Igalia Coding Experience

Igalia logo

Igalia has a grant program that gives students with a background in Computer Science, Information Technology and Free Software their first exposure to the professional world, working hand in hand with Igalia programmers and learning with them. It is called Igalia Coding Experience.

While this experience is open for everyone, Igalia expressly invites women (both cis and trans), trans men, and genderqueer people to apply. The Coding Experience program gives preference to applications coming from underrepresented groups in our industry.

You can apply to any of the offered grants this year: Web Standards, WebKit, Chromium, Compilers and Graphics.

In the case of Graphics, the student will have the opportunity to deal with the Linux DRM subsystem. Specifically, the student will improve the test coverage of DRM drivers through IGT, a testing framework designed for this purpose. This includes learning how to contribute to the Linux kernel/DRM, interacting with the DRI-devel community, understanding DRM core functionality, and increasing the test coverage of the IGT tool.

The conditions of our Coding Experience program are:

  • Mentorship by one of Igalia’s outstanding open source contributors in the field.
  • It is remote-friendly. Students can participate in it wherever they live.
  • Hours: 450h
  • Compensation: 6,500€
  • Usual timetables:
    • 3 months full-time
    • 6 months part-time

The submission period goes from March 16th until April 30th. Students will be selected in May. We will work with the student to arrange a suitable starting date during 2022, from June onwards, and finishing on a date to be agreed that suits their schedule.

Google Summer of Code (GSoC)

GSoC logo

The popular Google Summer of Code is another option for students. This year, the X.Org Foundation participates as an Open Source organization. We have some proposed ideas, but you can propose any project idea as well.

The timeline for proposals goes from April 4th to April 19th. However, you should contact us beforehand in order to discuss your ideas with potential mentors.

GSoC gives a stipend to students too (from 1,500 to 6,000 USD, depending on the size of the project and your location). The hours to complete the project vary from 175 to 350, also depending on the size of the project.

Of course, this is a remote-friendly program, so any student in the world can participate in it.

Outreachy

Outreachy logo

Outreachy is another internship program for applicants from around the world who face under-representation, systemic bias or discrimination in the technology industry of their country. Outreachy supports diversity in free and open source software!

Outreachy internships are remote, paid ($7,000), and last three months. Outreachy internships run from May to August and December to March. Applications open in January and August.

The projects listed cover many areas of the open-source software stack: from kernel to distributions work. Please check current proposals to find anything that is interesting for you!

X.Org Endless Vacation of Code (EVoC)

X.Org logo

The X.Org Foundation voted in 2008 to initiate a program known as the X.Org Endless Vacation of Code (EVoC) program, in order to give more flexibility to students: an EVoC mentorship can be initiated at any time during the calendar year, and the Board can fund as many of these mentorships as it sees fit.

Like the other programs, EVoC is remote-friendly as well. The stipend goes as follows: an initial payment of 500 USD and two further payments of 2,250 USD upon completion of project milestones. EVoC does not set limits on hours, but there are some requirements and steps to complete before applying. Please read the X.Org Endless Vacation of Code website to learn more.

Conclusion

As you can see, there are many ways to enter the open-source community. Although I focused on the programs related to the open-source graphics stack, there are many more.

With all of these possibilities (and many more, including internships at companies), I hope that you can apply and that the experience will encourage you to start a career in the open-source community.

Happy hacking!

March 30, 2022 09:15 AM

Brian Kardell

UA gotta be kidding

UA gotta be kidding

The UA String... It's a super weird, complex string that browsers send to servers, and is mostly dealt with behind the scenes. How big a deal could it be, really? I mean... It's a string. Well, pull up a chair.

I am increasingly dealing with an ever larger number of things which involve very complex discussions, interrelationships of money, history, new standards, maybe even laws, that are ultimately, somehow, about... A string. It's kind of wild to think about.

If you're interested in listening instead, I recently did an Igalia Chats podcast on this topic as well with fellow Igalians Eric Meyer and Alex Dunayev.

To understand any of this, a little background is helpful.

How did it get so complicated?

HTTP's first RFC, RFC 1945, was published in 1996. Section 10.15 defined the User-Agent header as a tokenized string which, it said, wasn't required, but you should send it. Its intent was for

statistical purposes, the tracing of protocol violations, and automated recognition of user agents for the sake of tailoring responses to avoid particular user agent limitations

Seems reasonable enough, and early browsers did exactly that.

So we got things like NCSA_Mosaic/2.0 (Windows 3.1), and we could count how many of our users were using that (statistical purposes).

But the web was new and there were lots of browsers popping up. Netscape came along, phenomenally well funded, intending to be a "Mosaic killer"; they sent Mozilla/1.0 (Win3.1). Their IPO was the thing that really made the broad public sit up and take notice of the web. It wasn't long before they had largely been declared the winners, impossible to unseat.

However, about this same time, Microsoft licensed the Mosaic source through NCSA's partner (called Spyglass) and created the initial IE in late 1995. It sent Microsoft Internet Explorer/1.0 (Windows 3.1). Interestingly, Apple too got into the race with a browser called Cyberdog released in Feb 1996. It sent a similarly simple string like Cyberdog/2.0 (Macintosh; 68k).

While we say things were taking off fast, it's worth mentioning that most people didn't have access to a computer at all. Among those that did, only a small number of them were really capable systems with graphical UIs. So text-based browsers, like the line mode browser from CERN, which could be used in university systems, for example, really helped expand the people exposed to the bigger idea of the web. It sent a simple string like W3CLineMode/5.4.0 libwww/5.4.0.

So far, so good.

But just then, the interwebs were really starting to hit a tipping point. Netscape quickly became the Chrome of their day (more, really): Super well funded, wanting to be first, and occasionally even just making shit up and shipping it. And, as a result, they had a hella good browser (for the first time). This created a runaway market share.

Oh hai! Are UA Netscape Browser?

Now, if you were a web master in those days, the gaps and bugs between the runaway top browser and the others were kind of frustrating to manage. Netscape was really good in comparison to others. It supported frames and lots of interesting things. So, web masters just began creating two websites: a really nice one, with all the bells and whistles, and a much simpler plain one that had all of the content but worked fine even in text-based browsers... Or they just blocked others and told them to get a real browser. And they did this via the UA string.

Not too long after this became common, many other browsers (like IE and Cyberdog) did implement framesets and started getting a lot better… But it didn't matter.

It didn't matter because people had already placed them in the "less good/doesn't support framesets and other fancy features" column. And they weren't rushing out to change it. Even if they wanted to, we all have other things to do, so it would take a long while before it would be changed everywhere.

If web masters wouldn't change, end-users wouldn't adopt. If users don't adopt, why would your organization even try to fund and compete? Perhaps you can see the chicken-and-egg problem that Microsoft faced at this critical stage...

And so, they lied.

IE began sending Mozilla/1.22 (compatible; MSIE 2.0; Windows 3.1).

Note that in the product token, which was intended to identify the product, they knocked on the door and identified themselves as "Mozilla". Note also that they did identify themselves as MSIE in there elsewhere.

Why? Well, it's complicated.

For one, they needed to get the content. Secondly, they needed a way to take credit and build on it. Finally though, intentionally or not: if you start to win, the tables can turn. Web masters might send the good stuff to MSIE and something less to everyone else. So, effectively, they deployed a clever workaround that cheated the particular parsing that was employed at the time (because that's what the spec said it should do) to achieve detection. It was the thing that was in their control.

Wash, rinse, repeat (and fork)...

So, basically, this just keeps happening. Every time a browser comes along, it's this problem all over again. We have to figure out a new lie that will fall through all of the right cracks in how people are currently parsing/using UA strings. And we've got all the same pressures.

By the time you get to the release of Chrome 1.0 in 2008 it is sending something like Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.39 Safari/525.19.

Yikes. What is that Frankenstein thing?

But wait! There's more!

As flawed and weird as that is, it's just the beginning of the problem, because, as I say, this string is useful in ways that are sometimes at odds. Perhaps unintentionally, we've also created a system of adversarial advances.

Knowing stuff about the browser does let you do useful things. But the decisioning powers available to you are mostly debatable, weird, and incomplete: you are reasoning about a thing stuck in time, which can become problematic. And so, on the other end, we have to cheat.

That doesn't prevent people from wanting to know the answers to those questions or to do ever more seemingly useful things. "Useful things" can mean even something as simple as product planning and testing, as I say, even for browsers.

This goes wrong in so many ways. For example, until not long ago, everything in the world counted Samsung Internet as "Chrome". However, that's not great for Samsung, and it's not necessarily great for all websites either. It is very much not Chrome; it is Chromium-based. Its support matrix and qualities are not the same, in ways that do sometimes matter, at least in the moment. The follow-on effects and ripples of that are just huge - from web masters routing content, to sites making project choices, to which polyfills to send, or even whether users have the inkling to want to try it - all of this is based on our perceptions of those stats.

But, it turns out that if you actually count them right - wow, yes - Samsung Internet is the third most popular mobile browser worldwide, and by a good margin too! And also, a lot of stuff that totally should have let them in the door as totally capable all along should have done that, and they should've gotten a good experience with the right polyfills too.

Even trying to keep track of all of this is gnarly, so we've built up whole industries to collect data, make sense of it and allow people to do "useful stuff" in ways that shield them from all of that. For example, if you use a popular CMS with things that let you say "if it's an iPad", or even just summarizes your stats in far more understandable ways like that - it's probably consulting one of these massive databases. Things like "whatismybrowser.com" which claims to have information about over 150 million unique UA strings in the wild.

Almost always, these systems involve mapping the UA string (including its lies) to "the facts, as we know them". These are used, often, not just for routing whole pages, but to deliver workarounds for specific devices, for example.

God of the UA Gaps

As you can imagine, it's just gotten harder and harder to slide through all the right holes. So now we have a kind of a new problem...

What happens when you have a lie that works for 95% of sites, but fails on, say, a few Alexa top 1k sites, or important properties you or your partners own?

Well, you lie differently to those ones.

That's right, there are many levels of lies. Your browser will send different UA strings to some domains, straight up spoofing another browser entirely.

Why? Because it has to. It's basically impossible to slip through all the cracks, and that's the only way to make things work for users that's in the browser's control.

What if the lie isn't enough? Well, you special case another kind of lie. Maybe you force that domain into quirks mode. You have to, because while the problem is on the site, that doesn't matter to regular users - they'll blame your "crappy browser". Worse still, if you're unlucky enough to be a newbie working on a brand new site in that domain, surprise! It doesn't work like almost anything else for some reason you can't explain! So, you try to find a way, another kind of workaround... and on and on it goes.

Privacy side effects

Of course, a side effect of all of this is that ultimately all of those simple variants in the UA and work that goes into those giant databases mean that we could know an awful lot about you, by default. So that's not great.

WebKit led on privacy by getting rid of most third-party cookies way, way back. Mozilla followed. Now only Chrome still allows them, and they're trying to figure out how to follow too.

But, back in 2017, WebKit also froze the UA string. And, since then we've been working to sort out a path that strikes all of the right balances. We do an experiment, and something breaks. We talk about doing another an experiment and some people get very cross. There are, after all, businesses built on the status quo.

Lots of things happening in standards (and chromium) surround trying to wrestle all of this into a manageable place. Efforts like UA reduction or Client Hints and many others are trying to find a way.

Obviously, it isn't easy.

Y2UA

Because of all of this complexity, there's even some worry that, as browser versions hit triple digits (which once seemed like it would take generations), some things could get tripped up in important ways.

There are several articles which discuss the various plans to deal with that - and, how this might involve (we hope, temporarily) some more lies.

Virtual (Reality) Lies

An interesting part of this is that occasionally we spawn a whole new paradigm - like mobile devices, dual screen, foldables - or XR.

The XR space is really dominated by new standalone devices that run Android and have a default Chromium browser with, realistically, no competition. Like, none. Not just no engine choice - no actively developed browser choice at all. This is always the case in new paradigms, it seems, until it isn't.

As you might know, Igalia is changing that with our new Wolvic browser. Unfortunately, a lot of really interesting things fall into this same old trap - the "enter vr" button is only presented if the browser is what was previously the only real choice, and everything else is treated as mobile or desktop. I'm not sure if it is the sites themselves, or a service or library reasoning about it that way, but that's what happens.

So guess what? We selectively have to lie.

It's hard to overstate just how complex and intertwined this all is and what astounding amounts of money across the industry have been spent adversarially on ... a string.

March 30, 2022 04:00 AM

March 28, 2022

Joseph Griego

implementing ShadowRealm in WebKit

March 28, 2022

Igalia has been working in collaboration with Salesforce on advancing the ShadowRealm proposal through the TC39 process, and part of that work is actually getting the feature implemented in the Javascript engines and their embedders (browsers, nodejs, deno, etc.)

Since joining the compilers group at Igalia, I’ve been working (with some wonderful peers) to advance the implementation of ShadowRealms in JavaScriptCore (the Javascript engine used by WebKit, hereafter, ‘JSC’) and also integrating this functionality with WebKit proper.

You can read about some of the work done so far in the blog post Hanging in the Shadow Realm with JavaScriptCore by Phillip Mates, who implemented ShadowRealm support in JSC.

what is a ShadowRealm, anyways?

To explain what a ShadowRealm is, let’s start by explaining what a realm is more broadly:

“Realm” is from the Javascript spec, and is used to describe part of the environment a script executes in. For instance, different windows, frames, iframes, workers, and more all get their own realm to run code in.

Each realm also comes with an associated “global object”; this is where top-level identifiers are stored as properties. For example, on a typical webpage, your Javascript runs in a realm whose global object exposes window, Promise, Event, and more. (The global object is always accessible under the name globalThis.)
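
For instance, in a classic (non-module) script running on a web page:

// In a window realm, the global object is the window itself:
console.log(globalThis === window);     // true
// Top-level `var` declarations become properties of the global object:
var answer = 42;
console.log(globalThis.answer);         // 42
// Built-ins like Promise live there too:
console.log('Promise' in globalThis);   // true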

The usual isolation between these is informed by the mother of all browser security design principles, the same-origin policy: briefly, resources loaded from one domain (“origin”) shouldn’t normally be able to access resources from another; in the context of realms, this usually means that code running in one realm shouldn’t be able to directly access the objects associated with code running in another.

ShadowRealms are a new sandboxing primitive being added to the Javascript language, which allows Javascript code to create new realms that have similar isolation properties; these script-created realms are unique and disconnected from other realms the browser (or other host, like node or deno) creates.

Any realm may create a new ShadowRealm:

const r = new ShadowRealm();

It’s useful to have a name for a realm that does this; we’ll steal from the proposal and call such a realm the “incubating realm” of its respective shadow realm.

cross-realm boundary enforcement

Part of the design of ShadowRealms is to provide a level of isolation around a ShadowRealm similar to what is provided today between other realms in the browser, as required by the same-origin policy. That is, code in the incubating realm (and, by extension, all other realms) should not be able to affect the content of the global object of a ShadowRealm and vice versa, except by using the ShadowRealm object directly (by calling myShadowRealm.evaluate or myShadowRealm.importValue).
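
For a feel of that interface, here's a small sketch based on the current proposal (the module specifier './fancy-math.js' and its add export are made-up names):

const realm = new ShadowRealm();

// evaluate() runs source text inside the shadow realm; only primitives and
// callables may cross back out:
console.log(realm.evaluate('1 + 2'));   // 3

// importValue() loads a module inside the realm and resolves to a wrapped
// version of the named export; calling it re-enters the realm:
realm.importValue('./fancy-math.js', 'add')
  .then((add) => console.log(add(2, 3)));  // 5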

This requires fairly careful scrutiny of what we allow to pass between a ShadowRealm and its incubating realm. For instance, we basically cannot allow objects to pass between them at all, since if you obtain an object o from another realm, you can play nasty tricks by abusing prototype objects to get the Function constructor from the other realm.

const o = /* obtained via magical means */

// we can play games to obtain the constructor from o's prototype, which is bad enough on its own ...
const P = Object.getPrototypeOf(o).constructor;
// we can get the constructor _of that constructor_ which will be Function, but from the wrong realm!
const Funktion = Object.getPrototypeOf(P).constructor;
// now we can do basically whatever we please to o's global object, since
// constructing a new Function with a string gives us a function with that source text.
const farawayGlobalObject = (new Funktion("return globalThis;"))();

farawayGlobalObject.Array.prototype.slice = /* insert evil here */;

Note that this game (getting at the Function constructor from the other realm via any prototype chain) works in either direction, too! It can be used to access the ShadowRealm from the incubating realm, as well as to access the incubating realm from the ShadowRealm.

We want to prevent leaks of this nature, since they allow action-at-a-distance not controlled by the normal ShadowRealm interface. This is important if we want modifications to the global objects of either realm to be only performed by code deliberately loaded into that realm.
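
The proposal enforces this with what it calls a callable boundary: only primitives and callable values may cross, and everything else is rejected. Roughly, as I understand the current proposal's observable behavior:

const realm = new ShadowRealm();

// Primitives are fine:
realm.evaluate('"hello"');                        // "hello"

// Functions cross as "wrapped functions" that re-enter their own realm when called:
const greet = realm.evaluate('(name) => "hi, " + name');
greet('there');                                   // "hi, there"

// But plain objects are rejected outright, closing the prototype-walking
// loophole shown above:
try {
  realm.evaluate('({ some: "object" })');
} catch (e) {
  console.log(e instanceof TypeError);            // true
}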

venturing into the WebCore weeds

As he describes in the post linked above, Phillip implemented ShadowRealms and the ShadowRealm interface in JSC, but we left a host hook[1] in to handle module loads and to allow the host to customize the ShadowRealm global object:

ShadowRealmObject*
ShadowRealmObject::create(VM& vm, Structure* structure, JSGlobalObject* globalObject) {
  ShadowRealmObject* object = /* ... */ ;
  /* ... */
  object->m_globalObject.set(
    vm, object,
    globalObject->globalObjectMethodTable()
    ->deriveShadowRealmGlobalObject(globalObject)
    // \__________________________________________/
    //       provided by the engine host
  );
  /* ... */
  return object;
}

When using JSC alone, deriveShadowRealmGlobalObject does little more than make a plain old new JSC::JSGlobalObject for the ShadowRealm to use. However, on WebKit, we needed to make sure it could create a JSGlobalObject that could perform module loads for the web page, and is otherwise customized to WebKit’s requirements, and that’s what we’ll describe here.

detour: wrappers for free

Central to WebKit’s use of JSC is that certain objects associated with a webpage all get associated “wrapper objects”: these are instances of the type JSC::JSObject whose job it is to turn Javascript calls to a method of the wrapper object into calls to the C++ method that implements the object.

For example, in WebCore, we have an Element class which is responsible for modelling an HTML element in your web page—however, it cannot be used directly by Javascript code: that interaction is controlled by its wrapper object, which is an instance of JSElement (which is a subclass, ultimately, of JSObject)

Most wrapper classes in WebKit are, in fact, generated! Most web standards specify which Javascript objects should be available using a special language just for this purpose called WebIDL (IDL = interface definition language). For example, the WebIDL for TextEncoder looks like:

[
    Exposed=*,
] interface TextEncoder {
    constructor();

    readonly attribute DOMString encoding;

    [NewObject] Uint8Array encode(optional USVString input = "");
    TextEncoderEncodeIntoResult encodeInto(USVString source, [AllowShared] Uint8Array destination);
};

This is used during the WebKit build to produce the wrapper class, JSTextEncoder, which looks something like this: (though I am omitting a lot of boilerplate)

class JSTextEncoder : public JSDOMWrapper<TextEncoder> {
public:
    using Base = JSDOMWrapper<TextEncoder>;
  /* snip */
    static TextEncoder* toWrapped(JSC::VM&, JSC::JSValue);
  /* snip */
};

Here, the class JSDOMWrapper<TextEncoder> provides the most basic possible kind of wrapper object: the wrapper holds a reference to a TextEncoder, and generated code in JSTextEncoder.cpp instructs the JS engine how to dispatch to it:

/* Hash table for prototype */

static const HashTableValue JSTextEncoderPrototypeTableValues[] = {
  { "constructor",
    static_cast<unsigned>(JSC::PropertyAttribute::DontEnum),
    NoIntrinsic,
    { (intptr_t)static_cast<PropertySlot::GetValueFunc>(jsTextEncoderConstructor),
      (intptr_t) static_cast<PutPropertySlot::PutValueFunc>(0) } },
  { "encoding",   /* snip */ },
  { "encode",     /* snip */ },
  { "encodeInto", /* snip */ },
};

JSC_DEFINE_CUSTOM_GETTER(jsTextEncoderConstructor, (JSGlobalObject* lexicalGlobalObject,
                                                    EncodedJSValue thisValue,
                                                    PropertyName))
{ /* dispatch code goes here */ }

/* much more generated code goes here, using the above */

Usually, we don’t care much about the details here; that’s why the code is generated! The relevant information is typically that calling e.g. encoder.encode from Javascript should result in a call, in C++, to the encode method on TextEncoder.
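
For orientation, the JavaScript-visible half of that round trip is just the ordinary TextEncoder API:

const encoder = new TextEncoder();
console.log(encoder.encoding);          // "utf-8"  (the readonly attribute from the IDL above)
const bytes = encoder.encode('hi');     // dispatches, via the wrapper, to TextEncoder::encode in C++
console.log(bytes);                     // Uint8Array [104, 105]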

There’s also a variety of attributes we can put on IDL declarations. Some change the meaning of those declarations, for instance by specifying which kinds of realms they should be available in, and others affect WebKit-specific aspects of the declaration; notably, they give us more control over the code generation we just described.

ShadowRealm global objects

To make sure that ShadowRealms behave appropriately in WebKit, we need to make sure that we can create a JSGlobalObject that also cooperates with the wrapping machinery in WebCore; the typical way to do this is to make the wrapper object for the realm global object an instance of WebCore::JSDOMGlobalObject: this both provides functionality to ensure that the wrappers used in that realm can be tracked and also that they are distinct from wrappers used in other realms.

For ShadowRealms we need to make sure that our new ShadowRealm global object is wrapped as a subclass of JSDOMGlobalObject; we can do this pretty directly with WebKit IDL attributes:

[
    Exposed=ShadowRealm,
    JSLegacyParent=JSShadowRealmGlobalScopeBase,
    Global=ShadowRealm,
    LegacyNoInterfaceObject,
] interface ShadowRealmGlobalScope {
    /* snip */
};

These have the meaning:

  • Exposed=ShadowRealm + LegacyNoInterfaceObject: these two together don’t make much difference: Exposed=ShadowRealm tells us that the interface should be available in ShadowRealms; LegacyNoInterfaceObject tells us that there shouldn’t actually be a globalThis.ShadowRealmGlobalScope available anywhere; so, there is, in fact, nothing really to expose… but, because this is the global object for ShadowRealms, any members on it will be available on globalThis.

  • JSLegacyParent=JSShadowRealmGlobalScopeBase tells WebKit’s code generation that the wrapper class for this interface should use our custom JSShadowRealmGlobalScopeBase (which we have yet to write) as the base class.

  • Global=ShadowRealm tells other people reading this IDL file that this interface is the global object for ShadowRealms.

Now we just need two more things: the implementation of the unwrapped ShadowRealmGlobalScope, and the implementation of our wrapper class, JSShadowRealmGlobalScopeBase

the unwrapped global scope

We can start with the unwrapped global object, since it ends up being simpler: the main thing we need from a ShadowRealm global object is to be able to find our way back to the incubating realm—it turns out a convenient way to do this is to make a new type and have it keep its incubating realm around:

class ShadowRealmGlobalScope : public RefCounted<ShadowRealmGlobalScope> {
  /* ... snip  ... */
private:
  // a (weak) pointer to the JSDOMGlobalObject that created this ShadowRealm
  JSC::Weak<JSDOMGlobalObject> m_incubatingWrapper;

  // the module loader from our incubating realm
  ScriptModuleLoader* m_parentLoader { nullptr };

  // a pointer to the JSDOMGlobalObject that wraps this realm (it's unique!)
  JSC::Weak<JSShadowRealmGlobalScopeBase> m_wrapper;

  // a separate module loader for this realm to use
  std::unique_ptr<ScriptModuleLoader> m_moduleLoader;
};

Astute readers will note that the ShadowRealmGlobalScope does not, in fact, keep its parent realm around; this is because it is retained by the ShadowRealmObject from above! Having the ShadowRealm global scope retain its incubating realm would form a loop of retaining pointers and therefore leak memory! Since these are WTF::RefCounted<...>, there’s no garbage collector to help out, either; we really need to avoid the reference cycle.

We can, however, get away with a weak pointer: if the incubating global object became unreachable, the only ways back into the shadow realm would be code running in the incubating realm or its event loop, neither of which should be possible at that point, so the weak pointer will always be valid when we need it.

the wrapper global object

Let’s go ahead and add the wrapper class now:

class JSShadowRealmGlobalScopeBase : public JSDOMGlobalObject { /* snip */ }

… and, since we get to pick our base class, we can pick JSDOMGlobalObject instead of JSObject, how convenient! This has the effect of implicitly making other parts of the engine treat our new global object as a separate realm that requires its own wrapper objects. This doesn’t come for free, though: we have several virtual methods on JSDOMGlobalObject that we are obliged to implement. Thankfully, we have another JSDOMGlobalObject around that we can happily delegate to! For example:

// a shared utility to retrieve the incubating realm's global object
const JSDOMGlobalObject* JSShadowRealmGlobalScopeBase::incubatingRealm() const
{
  auto incubatingWrapper = m_wrapped->m_incubatingWrapper.get();
  ASSERT(incubatingWrapper);
  return incubatingWrapper;
}

// discharge one of our obligations by delegating to `incubatingRealm()`
//
// (this method is static; we get `this` as JSGlobalObject*, annoyingly, but
// the downcast should always succeed)
bool JSShadowRealmGlobalScopeBase::supportsRichSourceInfo(const JSGlobalObject* object)
{
  auto incubating = jsCast<const JSShadowRealmGlobalScopeBase*>(object)->incubatingRealm();
  return incubating->globalObjectMethodTable()->supportsRichSourceInfo(incubating);
}

Finally, we need only add branches in some (admittedly awkward[2]) parts of JSDOMGlobalObject for our new ShadowRealm global object, for example:

static ScriptModuleLoader* scriptModuleLoader(JSDOMGlobalObject* globalObject)
{
  /* snip */
  if (globalObject->inherits<JSShadowRealmGlobalScopeBase>(vm))
    return &jsCast<const JSShadowRealmGlobalScopeBase*>(globalObject)->wrapped().moduleLoader();
  /* snip */
}

the grand finale … almost

Now we can actually implement deriveShadowRealmGlobalObject, right? Well, not quite. It turns out <iframe> acts rather differently when the page it contains has the same origin as the parent page—in that case, their global objects are actually reachable from one another! (This came as an unpleasant surprise to me at the time …)

This won’t do for us—it breaks the invariant we described above. There’s nothing to prevent a child <iframe> from creating a new ShadowRealm and allowing it to escape to the parent frame; then the ShadowRealm can outlive its incubating realm’s global object :(

We can solve the problem by actually walking up the hierarchy of frames until we either hit the top or find one with a different origin, and use the top-most global object with the same origin, which re-establishes our invariant, since there now really should be no way for the ShadowRealm object to escape :)

JSC::JSGlobalObject* JSDOMGlobalObject::deriveShadowRealmGlobalObject(JSC::JSGlobalObject* globalObject)
{
  auto& vm = globalObject->vm();

  auto domGlobalObject = jsCast<JSDOMGlobalObject*>(globalObject);
  auto context = domGlobalObject->scriptExecutionContext();
  if (is<Document>(context)) {
    // Same-origin iframes present a difficult circumstance because the
    // ShadowRealm global object cannot retain the incubating realm's
    // global object (that would be a refcount loop); but, same-origin
    // iframes can create objects that outlive their global object.
    //
    // Our solution is to walk up the parent tree of documents as far as
    // possible while still staying in the same origin to ensure we don't
    // allow the ShadowRealm to fetch modules masquerading as the wrong
    // origin while avoiding any lifetime issues (since the topmost document
    // with a given wrapper world should outlive other objects in that
    // world)
    auto document = &downcast<Document>(*context);
    auto const& originalOrigin = document->securityOrigin();
    auto& originalWorld = domGlobalObject->world();

    while (!document->isTopDocument()) {
      auto candidateDocument = document->parentDocument();

      if (!candidateDocument->securityOrigin().isSameOriginDomain(originalOrigin))
        break;

      document = candidateDocument;
      domGlobalObject = candidateDocument->frame()->script().globalObject(originalWorld);
    }
  }
  /* snip */
  auto scope = ShadowRealmGlobalScope::create(domGlobalObject, scriptModuleLoader(domGlobalObject));
  /* snip */
}

a brief note on debugging

Of course, none of the above went as smoothly as I make it sound; I ended up encountering many crashes and inscrutable error messages as I fumbled my way around WebKit internals. After printf debugging, a classic technique to interactively explore program state when in unfamiliar territory is the iconic ASSERT(false)—WebKit even provides a marginally more convenient macro for this purpose, CRASH(), which proved invaluable.

Simply run your test case in a debugger and set a breakpoint on WTFCrash and you will have a convenient gdb prompt; I find it to be a fun, slightly more powerful flavor of printf-debugging :)

the road ahead

Now, we have a working ShadowRealm available in the browser!

If you’re interested to try them out for yourself, you can find them in the latest Safari Technology Preview release!

However, this is only part of the work for this project: it is also planned to add certain Web interfaces to ShadowRealm contexts, and more testing coverage is needed.

exposing web interfaces

ShadowRealms are actually part of the Javascript standard and not a Web standard, so we need to be careful about this work; ShadowRealms are supposed to be a sandbox, and it wouldn’t do much good if scripts you load into a shadow realm could start mucking around with the markup on your web site!

So the interfaces that are planned to be exposed are strictly those that provide some extra computational facility to Javascript but do not really have an effect outside of the script where they are invoked. For example, TextEncoder is quite likely to be exposed; Document is not.
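
From script, the distinction would look something like this (a sketch that assumes TextEncoder does end up exposed and Document does not):

const realm = new ShadowRealm();

realm.evaluate('typeof TextEncoder');   // "function"  - pure computation, safe to expose
realm.evaluate('typeof Document');      // "undefined" - no DOM access from inside the realm

// e.g. encoding inside the realm and passing a primitive back out:
const len = realm.evaluate('new TextEncoder().encode("hej").length');  // 3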

A patch adding several of these APIs to ShadowRealm contexts has already landed, but it probably won’t appear in Safari until after ShadowRealms do.

never enough testing

ShadowRealms are already unit tested in both the existing WebKit implementation and in test262, the test suite accompanying the Javascript standard. However, more tests are needed in WPT, the web platform test suite, to cover the correctness of the module loading support and the newly exposed interfaces; some work here is underway and should be finished in the coming few weeks.

notes


  1. “host” here refers to whatever piece of software is running Javascript with JSC—usually the host is a web browser, but it doesn’t have to be. For our purposes, a “host hook” is a function that the Javascript engine cannot provide on its own—it requires the host to cooperate in some way.↩︎

  2. The awkwardness here is that scriptModuleLoader is not actually part of the interface of JSDOMGlobalObject, but probably should be; however, we have now arrived at the delicate argument over whether or not patches like this should minimize the changes or clean up ugliness everywhere they find it: you can even see this in the code review of this patch if you look closely.↩︎

by Joseph Griego at March 28, 2022 12:00 AM

Clayton Craft

Never miss completion of a long-running command again!

This is a really short, simple thing I use to alert me when a long-running shell command/script, like building (some) containers or compiling the kernel, is done. It effectively allows me to switch context in the meantime and pick up where I left off when the long-running dependency is finished.

There are two versions of this: one triggers the shell bell after the command/script has completed, and the other uses notify-send to trigger a desktop notification. I prefer the shell bell approach most of the time, since it works nicely with my tmux setup, highlighting the window where it was triggered. It also works if there's no graphical notification daemon running.

alert () {
        "$@"; tput bel
}

And the notify-send version:

alert () {
        "$@"; notify-send "ding!" "$@"
}

These can be used by adding the function to your shell's rc script (e.g. ~/.zshrc for zsh or ~/.bashrc for bash). It may need to be adjusted for shells that use a different syntax for defining user functions.

To use it, simply run the function and pass the script+args to it, for example: $ alert make -j1 foo or whatever.

March 28, 2022 12:00 AM

March 21, 2022

Joseph Griego

hello, world

This blog is where I will put some blog-things. Enjoy this ascii-art of a cat, for now:
                zzz
___________      \
\           \ _/\___/\________
  \            = - . - =       \
  \           \                |
    \           \      _______ /
    \           \____(_______/__
      \___________\_-_-__________]

by Joseph Griego at March 21, 2022 12:00 AM

March 17, 2022

Samuel Iglesias

Igalia work within the GNU/Linux graphics stack in 2021

We had a busy 2021 within the GNU/Linux graphics stack at Igalia.

Would you like to know what we have done last year? Keep reading!

Open Source Raspberry Pi GPU (VideoCore) drivers

Raspberry Pi 4, model B

Last year both the OpenGL and the Vulkan drivers received a lot of love. For example, we implemented several optimizations, such as improvements in the v3dv pipeline cache. In this blog post, Alejandro Piñeiro presents how we improved the v3dv pipeline cache times by replacing the two cache lookups done previously with a single one, and shows some numbers on both a synthetic test (a modified CTS test) and some games.

We also made performance improvements to the v3d compilers for OpenGL and Vulkan. Iago Toral explains our work on optimizing the backend compiler with techniques such as improving memory lookup efficiency, reducing instruction counts, instruction packing, and uniform handling, among others. There are some numbers that show framerate improvements from ~6% to ~62% on different games / demos.

Framerate improvement after optimization (in %). Taken from Iago’s blog post.

Of course, there was work related to feature implementation. This blog post from Iago lists some Vulkan extensions implemented in the v3dv driver in 2021… Although not all the implemented extensions are listed there, you can see the driver is quickly catching up in its Vulkan extension support.

My colleague Juan A. Suárez implemented performance counters in the v3d driver (an OpenGL driver) which required modifications in the kernel and in the Mesa driver. More info in his blog post.

There was more work in other areas done in 2021 too, like the improved support for RenderDoc and GFXReconstruct. And not to forget the kernel contributions to the DRM driver done by Melissa Wen, who not only worked on developing features for it, but also reviewed all the patches that came from the community.

However, the biggest milestone for the v3dv driver was becoming Vulkan 1.1 conformant in the last quarter of 2021. That was just one year after becoming Vulkan 1.0 conformant. As you can imagine, that implied a lot of work implementing features, fixing bugs and, of course, improving the driver in many different ways. Great job, folks!

If you want to know more about all the work done on these drivers during 2021, there is an awesome talk from my colleague Alejandro Piñeiro at FOSDEM 2022: “v3dv: Status Update for Open Source Vulkan Driver for Raspberry Pi 4”, and another one from my colleague Iago Toral at XDC 2021: “Raspberry Pi Vulkan driver update”. Below you can find the video recordings of both talks.

FOSDEM 2022 talk: “v3dv: Status Update for Open Source Vulkan Driver for Raspberry Pi 4”

XDC 2021 talk: “Raspberry Pi Vulkan driver update”

Open Source Qualcomm Adreno GPU drivers

Photo of the Qualcomm® Robotics RB3 Platform embedded board that I use for Turnip development.

There were also several achievements done by igalians on both Freedreno and Turnip drivers. These are reverse engineered open-source drivers for Qualcomm Adreno GPUs: Freedreno for OpenGL and Turnip for Vulkan.

At the start of 2021, my colleague Danylo Piliaiev helped with implementing the missing bits in Freedreno for supporting OpenGL 3.3 on Adreno 6xx GPUs. His blog post explains his work, such as implementing ARB_blend_func_extended and ARB_shader_stencil_export, and fixing a variety of CTS test failures.

Related to this, my colleague Guilherme G. Piccoli worked on porting a recent kernel to one of the boards we use for Freedreno development: the Inforce 6640. He did an awesome job getting a 5.14 kernel booting on that embedded board. If you want to know more, please read the blog post he wrote explaining all the issues he found and how he fixed them!

Picture of the Inforce 6640 board that Guilherme used for his development. Image from his blog post.

However, the biggest chunk of work was done in the Turnip driver. We have implemented a long list of Vulkan extensions: VK_KHR_buffer_device_address, VK_KHR_depth_stencil_resolve, VK_EXT_image_view_min_lod, VK_KHR_spirv_1_4, VK_EXT_descriptor_indexing, VK_KHR_timeline_semaphore, VK_KHR_16bit_storage, VK_KHR_shader_float16, VK_KHR_uniform_buffer_standard_layout, VK_EXT_extended_dynamic_state, VK_KHR_pipeline_executable_properties, VK_VALVE_mutable_descriptor_type, VK_KHR_vulkan_memory_model and many others. Danylo Piliaiev and Hyunjun Ko are terrific developers!

But not all our work was related to feature development. For example, I implemented the Low-Resolution Z-buffer (LRZ) HW optimization, and Danylo fixed a long list of rendering bugs that happened in real-world applications (blog post 1, blog post 2), like D3D games run on Vulkan (thanks to DXVK and VKD3D), and instrumented the backend compiler to dump register values, among many other fixes and optimizations.

However, the biggest achievement was getting Vulkan 1.1 conformance for Turnip. Danylo wrote a blog post mentioning all the work we did to achieve that this year.

If you want to know more, don’t miss this FOSDEM 2022 talk given by my colleague Hyunjun Ko called “The status of turnip driver development. What happened in 2021 and will happen in 2022 for turnip.”. Video below.

FOSDEM 2022 talk: “The status of turnip driver development. What happened in 2021 and will happen in 2022 for turnip.”

Vulkan contributions

Our graphics work doesn’t cover only driver development, we also participate in Khronos Group as Vulkan Conformance Test Suite developers and even as spec contributors.

My colleague Ricardo Garcia is a very productive developer. He worked on implementing tests for Vulkan Ray Tracing extensions (read his blog post about ray tracing for more info about this big Vulkan feature), implemented tests for a long list of Vulkan extensions like VK_KHR_present_id and VK_KHR_present_wait, VK_EXT_multi_draw (watch his talk at XDC 2021), VK_EXT_border_color_swizzle (watch his talk at FOSDEM 2022) among many others. In many of these extensions, he contributed to their respective specifications in a significant way (just search for his name in the Vulkan spec!).

XDC 2021 talk: “Quick Overview of VK_EXT_multi_draw”

FOSDEM 2022 talk: “Fun with border colors in Vulkan. An overview of the story behind VK_EXT_border_color_swizzle”

Similarly, I participated modestly in this effort by developing tests for some extensions like VK_EXT_image_view_min_lod (blog post). Of course, both Ricardo and I implemented many new CTS tests by adding coverage to existing ones, we fixed lots of bugs in existing ones and reported dozens of driver issues to the respective Mesa developers.

Not only that, both Ricardo and I appeared as Vulkan 1.3 spec contributors.

Vulkan 1.3

Another interesting piece of work we started in 2021 is Vulkan Video support in GStreamer. My colleague Víctor Jaquez presented the Vulkan Video extension at XDC 2021, and soon after he started working on Vulkan Video's H.264 decoder support. You can find more information in his blog post, or by watching his talk below:

FOSDEM 2022 talk: “Video decoding in Vulkan: VK_KHR_video_queue/decode APIs”

Before I leave this section, don’t forget to take a look at Ricardo’s blogpost on debugPrintfEXT feature. If you are a Graphics developer, you will find this feature very interesting for debugging issues in your applications!

Along those lines, Danylo presented at XDC 2021 a talk about dissecting and fixing Vulkan rendering issues in drivers with RenderDoc. Very useful for driver developers! Watch the talk below:

XDC 2021 talk: “Dissecting Vulkan rendering issues in drivers with RenderDoc”

To finalize this blog post, remember that you now have vkrunner (the Vulkan shader tester created by Igalia) available for RPM-based GNU/Linux distributions. In case you are working with embedded systems, maybe my blog post about cross-compiling with icecream will help to speed up your builds.

This is just a summary of the highlights we did last year. I’m sorry if I am missing more work from my colleagues.

March 17, 2022 12:00 PM

March 16, 2022

Brian Kardell

A case for CSS-Like Languages

A case for CSS-Like Languages

For many years now, it seems that almost not a week goes by where I don't wind up thinking about the same topic while reading threads. Occasionally, I bring it up in private conversations, and recently it seems some others are starting to discuss something around the edges, so I thought that I should probably write a post...

The first "S" in CSS ("Style") governs a lot of its design in both theory and practice. Much about it, ultimately, is designed toward, and limited by, constraints around potentially fast-changing visual style. As my colleague Eric Meyer cleverly noted:

[W]eb browsers are actually 60fps+ rendering environments. They’re First-Person Scrollers.

What's interesting, though, is how natural it seems to want to write things "in CSS" that don't fit into those neat little constraints. CSS is literally full of concepts which could be deployed toward other problems: separation of concerns, sheets, media queries, selectors, pseudos, rules, UA stylesheets, properties, functions, computed values, and the complex and automatic application of all of those things.

It's been an almost regular occurrence that people desire to somehow deploy those concepts toward ends that are not, strictly speaking, about potentially fast-changing visual style.

Some of many examples

Over the years, this has taken many shapes. Sometimes we try to rationalize about how it is style; sometimes we try to shoehorn a solution. Occasionally we have even had proposals and experiments that attempted to bring some aspect of this problem, and some of those same concepts, to bear on a different problem. Before CSS, Action Sheets proposed that actions and styles should be considered. Simple Tree Transformation Sheets offered ways to transform the DOM (and would be mentioned in Håkon's thesis on CSS itself). Shortly after, there was an attempt to add behavioral extensions to CSS - Microsoft even implemented some stuff. Another take on this introduced XBL to create a 'binding' that could be applied via CSS. Public Web Components discussions began when trying to decide what to do with XBL, and they initially included a similar concept to bind in CSS via decorators.

A completely different angle of this was CSS Speech which reasoned that this was simply "aural styles".

Except, of course, in each of those cases the particular needs and constraints are a bit different. They shouldn't change at 60fps, in fact. Things that are kind of verboten or impossible in CSS today might be totally solvable and fine, if only they weren't somehow shoehorned into CSS proper.

CSS...ish

Several years ago, discussions like these led Elika Etemad (aka fantasai) of the CSS Working Group to make a suggestion to Tab Atkins (now a prolific CSS editor) which yielded a sketch for something called Cascading Attribute Sheets. As they say, it's not the first such take; it's just a nicely linkable, well-informed and dated illustration of a proposal to create a CSS-like language which repurposes major concepts and parts of the architecture toward other aims. As Tab noted in his CAS post, internally, browsers do this to an extent already.

In 2012, I was on a very similar page. I was doing things at my company which did exactly this. My partner and I went about trying to decouple what we could in order to share it with Tab and others and hopefully participate in the discussion. We began creating a version of this called Bess on GitHub, but it was incomplete and full of some bad internal ideas. It did, ultimately, lead us to sharing something far more limited (HitchJS) and allowed us to begin a much bigger discussion about what made this, and a whole lot of other useful things (like polyfilling something in CSS), way too hard. I even used this to create a kind of polyfill for Tab's CAS proposal. I don't think it's particularly a great proposal as it stands - but there's clearly something there.

These discussions also led to the establishment of a new joint task force between members of the W3C Technical Architecture Group and the CSS Working Group: Houdini. In the very first meeting of this task force, the group agreed that making it possible for us to repurpose architectural aspects in order to explore "CSS-like languages" was ideal.

Now...

A lot has happened since that time, but realistically, we really haven't had a lot of time to talk about or pursue the stuff that would better enable us to explore CSS-like languages.

I think that's a real shame because we continue to have problems and ideas where at least advancing discussions on this would seem very useful.

Cascading Spicy Stuff

Consider our <spicy-sections> work in OpenUI, for example. It seems very natural to use the basic language and paradigms of CSS to express this. We're not entirely sure about some things, so we're waiting to see what pans out with the CSS Toggles.

This is also (I think naturally) shaping up larger conversations and ideas about whether we could just have "state machines" in CSS, and how we can share state and so on.

However, at the same time, it is also very unclear whether something which changes semantics and interactions at fixed points really belongs in CSS and the 60fps profile itself.

I pretty much agree with Mia here

There probably are things that work just fine in CSS - but at some point, we've entered something of an uncanny valley, and things get harder. We can't know where things should develop and live without a larger conversation.

I guess what I am trying to say, in the end, is that I love all of the conversations that are suddenly happening, and I'd love it even more if we spent some time thinking about how we might draw these lines and explore CSS-like solutions.

March 16, 2022 04:00 AM

March 14, 2022

Eric Meyer

When or If

The CSSWG (CSS Working Group) is currently debating what to name a conditional structure, and it’s kind of fascinating.  There are a lot of strong opinions, and I’m not sure how many of them are weakly held.

Boiled down to the bare bones, the idea is to take the conditional structures CSS already has, like @supports and @media, and allow more generic conditionals that combine and enhance what those structures make possible.  To pick a basic example, this:

@supports (display: grid) {
	@media (min-width: 33em) {
		…
	}
}

…would become something like this:

@conditional supports(display: grid) and media(min-width: 33em) {
	…
}

This would also be extended to allow for alternates, something like:

@conditional supports(display: grid) and media(min-width: 33em) {
	…
} @otherwise {
	…
}

Except nobody wants to have to type @conditional and @otherwise, so the WG went in search of shorter names.

The Sass-savvy among you are probably jumping up and down right now, shouting “We have that! We have that already! Just call them @if and @else and finally get on our level!”  And yes, you do have that already: Sass uses exactly those keywords.  There are some minor syntactic differences (Sass doesn’t require parentheses around the conditional tests, for example) and it’s not clear whether CSS would allow testing of variable values the way Sass does, but they’re very similar.

And that’s a problem, because if CSS starts using @if and @else, there is the potential for syntactic train wrecks.  If you’re writing with Sass, how will it tell the difference between its @if and the CSS @if?  Will you be forever barred from using CSS conditionals in Sass, if that’s what goes into CSS?  Or will Sass be forced to rename those conditionals to something else, in order to avoid clashing — and if so, how much upheaval will that create for Sass authors?

The current proposal, as I write this, is to use @when and @else in CSS Actual.  Thus, something like:

@when supports(display: grid) and media(min-width: 33em) {
	…
} @else {
	…
}

Even though there is overlap with @else, apparently starting the overall structure with @when would allow Sass to tell the difference.  So that would sidestep clashing with Sass.

But should the CSS WG even care that a third-party code base’s syntax gets trampled on by CSS syntax?  I imagine Sass authors would say, “Uh, hell yeah they should”, but does that outweigh the potential learning hurdle of all the non-Sass authors, both now and over the next few decades, learning that @when doesn’t actually have temporal meaning and is just an alias for the more recognizable if statement?

Because while it’s true that some programming languages have a when conditional structure (kOS being the one I’ve used most recently), they usually also have an if structure, and the two sometimes mean different things.  There is a view held by some that using the label when when we really mean if is a mistake, one that will stand out as a weird choice and a design blunder, 10 years hence, and will create a cognitive snag in the process of learning CSS.  Others hold the view that when is a relatively common programming term, it’s sometimes synonymous with if, every language has quirks that new learners need to learn, and it’s worth avoiding a clash with tools and authors that already exist.

If you ask me, both views are true, and that’s the real problem.  I imagine most of the participants in the discussion, even if their strong opinions are strongly held, can at least see where the other view is rooted, and sympathize with it.  And it’s very likely the case that even if Sass and other tools didn’t exist, the WG would still be having the same debate, because both terms work in context.  I suspect if would have won by now, but who knows?  Maybe not.  There have been longer debates over less fundamental concepts over the years.

A lot of my professional life has been spent explaining CSS to people new to it, so that may be why I personally lean toward @if over @when.  It’s a bit easier to explain, it looks more familiar to anyone who’s done programming at just about any level, and semantically it makes a bit more sense to me.  It’s also true that I come from a place of not having to worry about Sass changing on me, because I’ve basically never used it (or any other CSS pre-processor, for that matter) and I don’t have to do the heavy lifting of rewriting Sass to deal with this.  So, easy for me to say!

That said, I have an instinctive distrust of arguments by majority.  Yes, the number of Sass developers who’d have to adapt Sass to @if in CSS Actual is vanishingly small compared to the population of current and future CSS authors, and the number of Sass authors is likely much smaller than the number of total CSS authors.  That doesn’t automatically mean they should be discounted. It’s good to keep CSS as future-proof as possible, but it should also be kept as present-proof as possible.

The rub comes in with “as possible”, though.  This isn’t a situation where all things are possible. Something’s going to give, and there will be a group of people ill-served by the result.  Will it be Sass authors?  Future CSS learners?  Another group?  Everyone?  We’ll see!


Have something to say to all that? You can add a comment to the post, or email Eric directly.

by Eric Meyer at March 14, 2022 03:57 PM

March 03, 2022

Philip Chimento

A screenshot of calendar software showing a visual difference between one calendar event spanning 24 hours, and a second all-day event the next day.

Via Zach Holman’s blog post I found an interesting Twitter discussion that kicked off with these questions:

A couple of tough questions for all of you:
1. Is the date 2022-06-01 equal to the time 2022-06-01 12:00:00?
2. Is the date 2022-06-01 between the time 2022-06-01 12:00:00 and the time 2022-12-31 12:00:00?
3. Is the time 2022-06-01 12:00:00 after the date 2022-06-01?

I’ve been involved for two years and counting[1] in the design of Temporal, an enhancement for the JavaScript language that adds modern facilities for handling dates and times. One of the principles of Temporal that was established long before I got involved is that we should use different objects to represent different concepts. For example, if you want to represent a calendar date that’s not associated with any specific time of day, you use a class that doesn’t require you to make up a bogus time of day.[2] Each class has a definition for equality, comparison, and other operations that are appropriate to the concept it represents, and you get to specify which one is appropriate for your use case by your choice of which one you use. In other, more jargony, words, Temporal offers different data types with different semantics.[3]

For me these questions all boil down to, when we consider a textual representation like 2022-06-01, what concept does it represent? I would say that each of these strings can represent more than one concept, and to get a good answer, you need to specify which concept you are talking about.

So, my answers to the three questions are “it depends”, “no but maybe yes”, and “it depends.” I’ll walk through why I think this, and how I would solve it with Temporal, for each question.

You can follow along or try out your own answers by going the Temporal documentation page, and opening your browser console. That will give you an environment where you can try these examples and experiment for yourself.

Question 1

Is the date 2022-06-01 equal to the time 2022-06-01 12:00:00?

As I mentioned above, Temporal has different data types with different semantics. In the case of this question, what the question refers to as a “time” we call a “date-time” in Temporal[4], and the “date” is still a date. The specific types we’d use are PlainDateTime and PlainDate, respectively. PlainDate is a calendar date that doesn’t have a time associated with it: a single square on a wall calendar. PlainDateTime is a calendar date with a wall-clock time. In both cases, “plain” refers to not having a time zone attached, so we know we’re not dealing with any 23-hour or 25-hour or even more unusual day lengths.

The reason I say that the answer depends, is that you simply can’t say whether a date is equal to a date-time. They are two different concepts, so the answer is not well-defined. If you want to do that, you have to convert one to the other so that you either compare two dates, or two date-times, each with their accompanying definition of equality.

You do this in Temporal by choosing the type of object to create, PlainDate or PlainDateTime, and the resulting object’s equals() method will do the right thing:

> Temporal.PlainDate.from('2022-06-01').equals('2022-06-01 12:00:00')
true
> Temporal.PlainDateTime.from('2022-06-01').equals('2022-06-01 12:00:00')
false

I think either of PlainDate or PlainDateTime semantics could be valid based on your application, so it seems important that both are within reach of the programmer. I will say that I don’t expect PlainDateTime will get used very often in practice.[5] But I can think of a use case for either one of these:

  • Say you have a list of PlainDateTime events to present to a user, and you want to filter them by date. Let’s say we have data from a pedometer, where we care about what local time it was in the user’s time zone when they got their exercise, and the user has asked to see all the exercise they got yesterday. In this case I’d use date semantics: convert the PlainDateTime data to PlainDate data (there’s a small sketch of this just after this list).
  • On the other hand, if the 2022-06-01 input comes from a date picker widget where the user could have input a time but didn’t, then we might decide that it makes sense to default the time of day to midnight, and therefore use date-time semantics.
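
Here’s a minimal sketch of that first, pedometer case with made-up data, using date semantics to do the filtering:

const entries = [
  { walked: Temporal.PlainDateTime.from('2022-05-31T08:15'), steps: 900 },
  { walked: Temporal.PlainDateTime.from('2022-06-01T07:50'), steps: 1200 },
  { walked: Temporal.PlainDateTime.from('2022-06-01T18:30'), steps: 3400 },
];

// "Yesterday" in the ISO calendar, relative to the current date:
const yesterday = Temporal.Now.plainDateISO().subtract({ days: 1 });

// Date semantics: drop the time of day, then compare calendar dates.
const yesterdaysEntries = entries.filter(
  (entry) => entry.walked.toPlainDate().equals(yesterday)
);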

Question 2

Is the date 2022-06-01 between the time 2022-06-01 12:00:00 and the time 2022-12-31 12:00:00?

I think the answer to this one is more unambiguously a no. If we use date-time semantics (in Temporal, PlainDateTime.compare()) the date implicitly converts to midnight on that day, so it comes before both of the date-times. If we use date semantics (PlainDate.compare()), 2022-06-01 and 2022-06-01 12:00:00 are equal as we determined in Question 1, so I wouldn’t say it’s “between” the two date-times.

> Temporal.PlainDateTime.compare('2022-06-01', '2022-06-01 12:00:00')
-1
> Temporal.PlainDateTime.compare('2022-06-01', '2022-12-31 12:00:00')
-1
> Temporal.PlainDate.compare('2022-06-01', '2022-06-01 12:00:00')
0
> Temporal.PlainDate.compare('2022-06-01', '2022-12-31 12:00:00')
-1

(Why these numbers?[6] The compare methods return −1, 0, or 1, according to the convention used by Array.prototype.sort, so that you can do things like arr.sort(Temporal.PlainDate.compare). 0 means the arguments are equal and −1 means the first comes before the second.)

But maybe the answer still depends a little bit on what your definition of “between” is. If it means the date-times form a closed interval instead of an open interval, and we are using date semantics, then the answer is yes.[7]
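
To make the open/closed distinction concrete with date semantics, you could write the two flavors of “between” yourself (a sketch; Temporal doesn’t ship a between helper):

function betweenInclusive(date, start, end) {
  return Temporal.PlainDate.compare(start, date) <= 0
      && Temporal.PlainDate.compare(date, end) <= 0;
}

function betweenExclusive(date, start, end) {
  return Temporal.PlainDate.compare(start, date) < 0
      && Temporal.PlainDate.compare(date, end) < 0;
}

betweenInclusive('2022-06-01', '2022-06-01 12:00:00', '2022-12-31 12:00:00'); // true
betweenExclusive('2022-06-01', '2022-06-01 12:00:00', '2022-12-31 12:00:00'); // false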

Question 3

Is the time 2022-06-01 12:00:00 after the date 2022-06-01?

After thinking about the previous two questions, this should be clear. If we’re using date semantics, the two are equal, so no. If we’re using date-time semantics, and we choose to convert a date to a date-time by assuming midnight as the time, then yes.

Other people’s answers

I saw a lot of answers saying that you need more context to be able to compare the two, so I estimate that the way Temporal requires that you give that context, instead of assuming one or the other, does fit with the way that many people think. However, that wasn’t the only kind of reply I saw. (Otherwise the discussion wouldn’t have been that interesting!) I’ll discuss some of the other common replies that I saw in the Twitter thread.

“Yes, no, no: truncate to just the dates and compare those, since that’s the data you have in common.” People who said this seem like they might naturally gravitate towards date semantics. I’d estimate that date semantics are probably correct for more use cases. But maybe not your use case!

“No, no, yes: a date with no time means midnight is implicit.” People who said this seem like they might naturally gravitate towards date-time semantics. It makes sense to me that programmers think this way; if you’re missing a piece of data, just fill in 0 and keep going. I’d estimate that this isn’t how a lot of nontechnical users think of dates, though.

In this whole post I’ve assumed that we take the time to be midnight when we convert a date to a date-time, but in the messy world of dates and times, it can make sense to assume other times than midnight as well. This comes up especially if time zones are involved. For example, you might assume noon, or start-of-day, instead. Start-of-day is often, but not always, midnight:

Temporal.PlainDateTime.from('2018-11-04T12:00')
  .toZonedDateTime('America/Sao_Paulo')
  .startOfDay()
  .toPlainTime()  // -> 01:00

“These need to have time zones attached for the question to make sense.” If this is your first reaction when you see a question like this, great! If you write JavaScript code, you probably make fewer bugs just by being aware that JavaScript’s Date object makes it really easy to confuse time zones.

I estimate that Temporal’s ZonedDateTime type is going to fit more use cases in practice than either PlainDate or PlainDateTime. In that sense, if you find yourself with this data and these questions in your code, it makes perfect sense to ask yourself whether you should be using a time-zone-aware type instead. But, I think I’ve given some evidence above that sometimes the answer to that is no: for example, the pedometer data that I mentioned above.

“Dates without times are 24-hour intervals.” Also mentioned as “all-day events”. I can sort of see where this comes from, but I’m not sure I agree with it. In the world where JavaScript Date is the only tool you have, it probably makes sense to think of a date as an interval. But I’d estimate that a lot of non-programmers don’t think of dates this way: instead, it’s a square on your calendar!

It’s also worth noting that in some calendar software, you can create an all-day event that lasts from 00:00 until 00:00 the following day, and you can also create an event for just the calendar date, and these are separate things.

A 24-hour interval and a calendar date. Although notably, Google Calendar collapses the 24-hour event into a calendar-date event if you do this.

“Doesn’t matter, just pick one convention and stick with it.” I hope after reading this post you’re convinced that it does matter, depending on your use case.

“Ugh!” That’s how I feel too and why I wrote a whole blog post about it!

How do I feel about the choices we made in Temporal?

I’m happy with how Temporal encourages the programmer to handle these cases. When I went to try out the comparisons that were suggested in the original tweet, I found it was natural to pick either PlainDate or PlainDateTime to represent the data.

One thing that Temporal could have done instead (and in fact, we went back and forth on this a few times before the proposal reached its currently frozen stage in the JS standardization process) would be to make the choice of data type, and therefore of comparison semantics, more explicit.

For example, one might make a case that it’s potentially confusing that the 12:00:00 part of the string in Temporal.PlainDate.from('2022-06-01').equals('2022-06-01 12:00:00') is ignored when the string is converted to a PlainDate. We could have chosen, for example, to throw if the argument to PlainDate.prototype.equals() was a string with a time in it, or if it was a PlainDateTime. That would make the code for answering question 1 look like this:

> Temporal.PlainDate.from('2022-06-01').equals(
... Temporal.PlainDateTime.from('2022-06-01 12:00:00')
... .toPlainDate())
true

This approach seems like it’s better at forcing the programmer to make a choice consciously by throwing exceptions when there is any doubt, but at the cost of writing such long-winded code that I find it difficult to follow. In the end, I prefer the more balanced approach we took.

Conclusion

This was a really interesting problem to dig into. I always find it good to be reminded that no matter what I think is correct about date-time handling, someone else is going to have a different opinion, and they won’t necessarily be wrong.

I said in the beginning of the post: “to get a good answer, you need to specify which concept you are talking about.” Something we’ve tried hard to achieve in Temporal is to make it easy and natural, but not too obtrusive, to specify this. When I went to answer the questions using Temporal code, I found it pretty straightforward, and I think that validates some of the design choices we made in Temporal.

I’d like to acknowledge my employer Igalia for letting me spend work time writing this post, as well as Bloomberg for sponsoring Igalia’s work on Temporal. Many thanks to my colleagues Tim Chevalier, Jesse Alama, and Sarah Groff Hennigh-Palermo for giving feedback on a draft of this post.


[1] 777 days at the time of writing, according to Temporal.PlainDate.from('2020-01-13').until(Temporal.Now.plainDateISO()) ↩

[2] A common source of bugs with JavaScript’s legacy Date when the made-up time of day doesn’t exist due to DST ↩

[3] “Semantics” is, unfortunately, a word I’m going to use a lot in this post ↩

[4] “Time” in Temporal refers to a time on a clock face, with no date associated with it ↩

[5] We even say this on the PlainDateTime documentation page ↩

[6] We don’t have methods like isBefore()/isAfter() in Temporal, but this is a place where they’d be useful. These methods seem like good contenders for a follow-up proposal in the future ↩

[7] Intervals bring all sorts of tricky questions too! Some other date-time libraries have interval objects. We also don’t have these in Temporal, but are likewise open to a follow-up proposal in the future ↩

by Philip Chimento at March 03, 2022 02:03 AM