Planet Igalia WebKit

January 16, 2020

Paulo Matos

Cross-Arch Reproducibility using Containers

I present the use of containers for cross-architecture reproducibility using Docker and Podman, which I then go on to apply to JSC. If you are trying to understand how to create cross-arch reproducible environments for your software, this might help you!

More…

by Paulo Matos at January 16, 2020 04:00 PM

January 08, 2020

Angelos Oikonomopoulos

A Dive Into JavaScriptCore

Recently, the compiler team at Igalia was discussing the available resources for the WebKit project, both for the purpose of onboarding new Igalians and for lowering the bar for third-party contributors. As compiler people, we are mainly concerned with JavaScriptCore (JSC), WebKit's JavaScript engine implementation. There are many high-quality blog posts on the WebKit blog that describe various phases in the evolution of JSC, but finding one's bearings in the actual source can be a daunting task.

The aim of this post is twofold: first, document some aspects of JavaScriptCore at the source level; second, show how one can figure out what a piece of code actually does in a large and complex source base (which JSC’s certainly is).

In medias res

As an exercise, we’re going to arbitrarily use a commit I had open in a web browser tab. Specifically, we will be looking at this snippet:

Operands<Optional<JSValue>> mustHandleValues(codeBlock->numParameters(), numVarsWithValues);
int localsUsedForCalleeSaves = static_cast<int>(CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters());
for (size_t i = 0; i < mustHandleValues.size(); ++i) {
    int operand = mustHandleValues.operandForIndex(i);
    if (operandIsLocal(operand) && VirtualRegister(operand).toLocal() < localsUsedForCalleeSaves)
        continue;
    mustHandleValues[i] = callFrame->uncheckedR(operand).jsValue();
}

This seems like a good starting point for taking a dive into the low-level details of JSC internals. Virtual registers look like a concept that’s good to know about. And what are those “locals used for callee saves” anyway? How do locals differ from vars? What are “vars with values”? Let’s find out!

Backstory

Recall that JSC is a multi-tiered execution engine. Most JavaScript code is only executed once; compiling takes longer than simply interpreting the code, so JavaScript code is always interpreted the first time through. If it turns out that a piece of code is executed frequently though [1], compiling it becomes a more attractive proposition.

Initially, the tier up happens to the baseline JIT, a simple and fast non-optimizing compiler that produces native code for a JavaScript function. If the code continues to see much use, it will be recompiled with the DFG, an optimizing compiler that is geared towards low compilation times and decent performance of the produced native code. Eventually, the code might end up being compiled with the FTL backend too, but the upper tiers won't be making an appearance in our story here.

What do tier up and tier down mean? In short, tier up is when code execution switches to a more optimized version, whereas tier down is the reverse operation. So a function might tier up from the baseline JIT to the DFG, but later tier down (under conditions we'll briefly touch on later) back to the baseline JIT. You can read a more extensive overview here.

Diving in

With this context now in place, we can revisit the snippet above. The code is part of operationOptimize. Just looking at the two sites it’s referenced in, we can see that it’s only ever used if the DFG_JIT option is enabled. This is where the baseline JIT ➞ DFG tier up happens!

The sites that make use of operationOptimize both run during the generation of native code by the baseline JIT. The first one runs in response to the op_enter bytecode opcode, i.e. the opcode that marks entry to the function. The second one runs when encountering an op_loop_hint opcode (an opcode that only appears at the beginning of a basic block marking the entry to a loop). Those are the two kinds of program points at which execution might tier up to the DFG.

Notice that calls to operationOptimize only occur during execution of the native code produced by the baseline JIT. In fact, if you look at the emitted code surrounding the call to operationOptimize for the function entry case, you’ll see that the call is conditional and only happens if the function has been executed enough times that it’s worth making a C++ call to consider it for optimization.

The function accepts two arguments: a vmPointer which is, umm, a pointer to a VM structure (i.e. the "state of the world" as far as this function is concerned) and the bytecodeIndex. Remember that the bytecode is the intermediate representation (IR) that all higher tiers start compiling from. In operationOptimize, the bytecodeIndex is used in a couple of places.

Again, the bytecodeIndex is a parameter that has already been set in stone during generation of the native code by the baseline JIT.

The other parameter, the VM, is used for a number of things. The part that's relevant to the snippet we started out to understand is that the VM is (sometimes) used to give us access to the current CallFrame. CallFrame inherits from Register, which is a thin wrapper around a (maximally) 64-bit value.

The CodeBlock

In this case, the various accessors defined by CallFrame effectively treat the (pointer) value that CallFrame consists of as a pointer to an array of Register values. Specifically, a set of constant expressions

struct CallFrameSlot {
    static constexpr int codeBlock = CallerFrameAndPC::sizeInRegisters;
    static constexpr int callee = codeBlock + 1;
    static constexpr int argumentCount = callee + 1;
    static constexpr int thisArgument = argumentCount + 1;
    static constexpr int firstArgument = thisArgument + 1;
};

give the offset (relative to the call frame) of the pointer to the CodeBlock, the callee, the argument count and the this pointer. Note that the first CallFrameSlot is the CallerFrameAndPC, i.e. a pointer to the CallFrame of the caller and the returnPC.

The CodeBlock is definitely something we’ll need to understand better, as it appears in our motivational code snippet. However, it’s a large class that is intertwined with a number of other interesting code paths. For the purposes of this discussion, we need to know that it

  • is associated with a code block (i.e. a function, eval, program or module code block)
  • holds data relevant to tier up/down decisions and operations for the associated code block

We’ll focus on three of its data members:

int m_numCalleeLocals;
int m_numVars;
int m_numParameters;

So, it seems that a CodeBlock can have at least some parameters (makes sense, right?) but also has both variables and callee locals.

First things first: what's the difference between callee locals and vars? Well, it turns out that m_numCalleeLocals is only incremented in BytecodeGeneratorBase<Traits>::newRegister whereas m_numVars is only incremented in BytecodeGeneratorBase<Traits>::addVar(). Except, addVar calls into newRegister, so vars are a subset of callee locals (and therefore m_numVars ≤ m_numCalleeLocals).

Somewhat surprisingly, newRegister is only called in 3 places.

So there you have it. Callee locals

  1. are allocated by a function called newRegister
  2. are either vars or temporaries.

Let’s start with the second point. What is a var? Well, let’s look at where vars are created (via addVar):

There is definitely a var for every lexical variable (VarKind::Stack), i.e. a non-local variable accessible from the current scope. Vars are also generated (via BytecodeGenerator::createVariable) for a number of other constructs.

So, intuitively, vars are allocated more or less for “every JS construct that could be called a variable”. Conversely, temporaries are storage locations that have been allocated as part of bytecode generation (i.e. there is no corresponding storage location in the JS source). They can store intermediate calculation results and what not.

Coming back to the first point regarding callee locals, how come they’re allocated by a function called newRegister? Why, because JSC’s bytecode operates on a register VM! The RegisterID returned by newRegister wraps the VirtualRegister that our register VM is all about.

Virtual registers, locals and arguments, oh my!

A virtual register (of type VirtualRegister) consists simply of an int (which is also called its offset). Each virtual register corresponds to either a local, an argument or a constant.

There is no differentiation between locals and arguments at the type level (everything ends up as an int); however, virtual registers that map to locals have negative offsets and those that map to arguments have nonnegative ones. In the context of bytecode generation, the int is ultimately used as an index into the callee locals, the arguments or the constant pool.

It feels like JSC is underusing C++ here.

In all cases, what we get after indexing with a local, argument or constant is a RegisterID. As explained, the RegisterID wraps a VirtualRegister. Why do we need this indirection?

Well, there are two extra bits of info in the RegisterID: the m_refcount and an m_isTemporary flag. The reference count is always greater than zero for a variable, but the rules under which a RegisterID is ref'd and unref'd are too complicated to go into here.

When you have an argument, you get the VirtualRegister for it by directly adding the argument index to CallFrame::thisArgumentOffset.

When you have a local, you map it to (-1 - local) to get the corresponding VirtualRegister. So

local    vreg
    0      -1
    1      -2
    2      -3

(remember, virtual registers that correspond to locals are negative).

For an argument, you map it to (arg + CallFrame::thisArgumentOffset()):

argument    vreg
       0    this
       1    this + 1
       2    this + 2

Which makes all the sense in the world when you remember what the CallFrameSlot looks like. So argument 0 is always the this pointer.

If the vreg is greater than some large offset (s_firstConstantRegisterIndex), then it is an index into the CodeBlock's constant pool (after subtracting the offset).

Bytecode operands

If you’ve followed any of the links to the functions doing the actual mapping of locals and arguments to a virtual register, you may have noticed that the functions are called localToOperand and argumentToOperand. Yet they’re only ever used in virtualRegisterForLocal and virtualRegisterForArgument respectively. This raises the obvious question: what are those virtual registers operands of?

Well, of the bytecode instructions in our register VM of course. Instead of recreating the pictures, I’ll simply encourage you to take a look at a recent blog post describing it at a high level.

How do we know that's what "operand" refers to? Well, let's look at a use of virtualRegisterForLocal in the bytecode generator. BytecodeGenerator::createVariable will allocate [2] the next available local index (using the size of m_calleeLocals to keep track of it). This calls into virtualRegisterForLocal, which maps the local to a virtual register by calling localToOperand.

The newly allocated local is inserted into the function symbol table, along with its offset (i.e. the ID of the virtual register).

The SymbolTableEntry is looked up when we generate bytecode for a variable reference. A variable reference is represented by a ResolveNode [3].

So looking into ResolveNode::emitBytecode, we dive into BytecodeGenerator::variable and there’s our symbolTable->get() call. And then the symbolTableEntry is passed to BytecodeGenerator::variableForLocalEntry which uses entry.varOffset() to initialize the returned Variable with offset. It also uses registerFor to retrieve the RegisterID from m_calleeLocals.

ResolveNode::emitBytecode will then pass the local RegisterID to move which calls into emitMove, which just calls OpMov::emit (a function generated by the JavaScriptCore/generator code). Note that the compiler implicitly converts the RegisterID arguments to VirtualRegister type at this step. Eventually, we end up in the (generated) function

template<OpcodeSize __size, bool recordOpcode, typename BytecodeGenerator>
static bool emitImpl(BytecodeGenerator* gen, VirtualRegister dst, VirtualRegister src)
{
    if (__size == OpcodeSize::Wide16)
        gen->alignWideOpcode16();
    else if (__size == OpcodeSize::Wide32)
        gen->alignWideOpcode32();
    if (checkImpl<__size>(gen, dst, src)) {
        if (recordOpcode)
            gen->recordOpcode(opcodeID);
        if (__size == OpcodeSize::Wide16)
            gen->write(Fits<OpcodeID, OpcodeSize::Narrow>::convert(op_wide16));
        else if (__size == OpcodeSize::Wide32)
            gen->write(Fits<OpcodeID, OpcodeSize::Narrow>::convert(op_wide32));
        gen->write(Fits<OpcodeID, __size>::convert(opcodeID));
        gen->write(Fits<VirtualRegister, __size>::convert(dst));
        gen->write(Fits<VirtualRegister, __size>::convert(src));
        return true;
    }
    return false;
}

where Fits::convert(VirtualRegister) will trivially encode the VirtualRegister into the target type. Specifically the mapping is nicely summed up in the following comment

// Narrow:
// -128..-1  local variables
//    0..15  arguments
//   16..127 constants
//
// Wide16:
// -2**15..-1  local variables
//      0..64  arguments
//     64..2**15-1 constants

You may have noticed that the Variable returned by BytecodeGenerator::variableForLocalEntry already has been initialized with the virtual register offset we set when inserting the SymbolTableEntry for the local variable. And yet we use registerFor to look up the RegisterID for the local and then use the offset of the VirtualRegister contained therein. Surely those are the same? Oh well, something for a runtime assert to check.

Variables with values

Whew! Quite the detour there. Time to get back to our original snippet:

Operands<Optional<JSValue>> mustHandleValues(codeBlock->numParameters(), numVarsWithValues);
int localsUsedForCalleeSaves = static_cast<int>(CodeBlock::llintBaselineCalleeSaveSpaceAsVirtualRegisters());
for (size_t i = 0; i < mustHandleValues.size(); ++i) {
    int operand = mustHandleValues.operandForIndex(i);
    if (operandIsLocal(operand) && VirtualRegister(operand).toLocal() < localsUsedForCalleeSaves)
        continue;
    mustHandleValues[i] = callFrame->uncheckedR(operand).jsValue();
}

What are those numVarsWithValues then? Well, the definition is right before our snippet:

unsigned numVarsWithValues;
if (bytecodeIndex)
    numVarsWithValues = codeBlock->numCalleeLocals();
else
    numVarsWithValues = 0;

OK, so this looks straightforward for a change. If the bytecodeIndex is not zero, we're doing the tier up from the baseline JIT to the DFG in the body of a function (i.e. at a loop entry). In that case, we consider all our callee locals to have values. Conversely, when we're running for the function entry (i.e. bytecodeIndex == 0), none of the callee locals are live yet. Do note that the variable is incorrectly named. Vars are not the same as callee locals; we're dealing with the latter here.

A second gotcha is that, whereas vars are always live, temporaries might not be. The DFG compiler will do liveness analysis at compile time to make sure it’s only looking at live values. That must have been a fun bug to track down!

Values that must be handled

Back to our snippet, numVarsWithValues is used as an argument to the constructor of mustHandleValues which is of type Operands<Optional<JSValue>>. Right, so what are the Operands? They simply hold a number of T objects (here T is Optional<JSValue>) of which the first m_numArguments correspond to, well, arguments whereas the remaining correspond to locals.

What we’re doing here is recording all the live (non-heap, obviously) values when we try to do the tier up. The idea is to be able to mix those values in with the previously observed values that DFG’s Control Flow Analysis will use to emit code which will bail us out of the optimized version (i.e. do a tier down). According to the comments and commit logs, this is in order to increase the chances of a successful OSR entry (tier up), even if the resulting optimized code may be slightly less conservative.

Remember that the optimized code that we tier up to makes assumptions with regard to the types of the incoming values (based on what we've observed when executing at lower tiers) and will bail out if those assumptions are not met. Taking the values of the current execution at the time of the tier up attempt ensures we won't be doing all this work only to immediately have to tier down again.

Operands provides an operandForIndex method which will directly give you a virtual reg for every kind of element. For example, if you had called Operands<T> opnds(2, 1), then the first iteration of the loop would give you

operandForIndex(0)
-> virtualRegisterForArgument(0).offset()
  -> VirtualRegister(argumentToOperand(0)).offset()
    -> VirtualRegister(CallFrame::thisArgumentOffset).offset()
      -> CallFrame::thisArgumentOffset

The second iteration would similarly give you CallFrame::thisArgumentOffset + 1.

In the third iteration, we’re now dealing with a local, so we’d get

operandForIndex(2)
-> virtualRegisterForLocal(2 - 2).offset()
  -> VirtualRegister(localToOperand(0)).offset()
    -> VirtualRegister(-1).offset()
      -> -1

Callee save space as virtual registers

So, finally, what is our snippet doing here? It’s iterating over the values that are likely to be live at this program point and storing them in mustHandleValues. It will first iterate over the arguments (if any) and then over the locals. However, it will use the “operand” (remember, everything is an int…) to get the index of the respective local and then skip the first locals up to localsUsedForCalleeSaves. So, in fact, even though we allocated space for (arguments + callee locals), we skip some slots and only store (arguments + callee locals - localsUsedForCalleeSaves). This is OK, as the Optional<JSValue> values in the Operands will have been initialized by the default constructor of Optional<> which gives us an object without a value (i.e. an object that will later be ignored).

Here, callee-saved register (csr) refers to a register that is available for use by the LLInt and/or the baseline JIT. This is described a bit in LowLevelInterpreter.asm, but is more apparent when one looks at which csr sets are used on each platform (or, in C++).

platform          metadataTable    PC-base (PB)    numberTag    notCellMask
X86_64            csr1             csr2            csr3         csr4
X86_64 (Windows)  csr3             csr4            csr5         csr6
ARM64 / ARM64E    csr6             csr7            csr8         csr9
C_LOOP (64-bit)   csr0             csr1            csr2         csr3
C_LOOP (32-bit)   csr3             -               -            -
ARMv7             csr0             -               -            -
MIPS              csr0             -               -            -
X86               -                -               -            -

On 64-bit platforms, offlineasm (JSC’s portable assembler) makes a range of callee-saved registers available to .asm files. Those are properly saved and restored. For example, for X86_64 on non-Windows platforms, the returned RegisterSet contains registers r12-r15 (inclusive), i.e. the callee-saved registers as defined in the System V AMD64 ABI. The mapping from symbolic names to architecture registers can be found in GPRInfo.

On 32-bit platforms, the assembler doesn't make any csr regs available, so there's nothing to save, except if the platform makes special use of some register (as C_LOOP does for the metadataTable [4]).

What are the numberTag and notCellMask registers? Out of scope, that’s what they are!

Conclusion

Well, that wraps it up. Hopefully now you have a better understanding of what the original snippet does. In the process, we learned about a few concepts by reading through the source and, importantly, we added lots of links to JSC’s source code. This way, not only can you check that the textual explanations are still valid when you read this blog post, you can use the links as spring boards for further source code exploration to your heart’s delight!

Footnotes

[1] Both the interpreter – better known as the LLInt – and the baseline JIT keep track of execution statistics, so that JSC can make informed decisions on when to tier up.

[2] Remarkably, no RegisterID has been allocated at this point – we used the size of m_calleeLocals but never modified it. Instead, later in the function (after adding the new local to the symbol table!) the code will call addVar which will allocate a new "anonymous" local. But then the code asserts that the index of the newly allocated local (i.e. the offset of the virtual register it contains) is the same as the offset we previously used to create the virtual register, so it's all good.

[3] How did we know to look for the ResolveNode? Well, the emitBytecode method needs to be implemented by subclasses of ExpressionNode. If we look at how a simple binary expression is parsed (and given that ASTBuilder defines BinaryOperand as std::pair<ExpressionNode*, BinaryOpInfo>), it's clear that any variable reference has already been lifted to an ExpressionNode.

So instead, we take the bottom up approach. We find the lexer/parser token definitions, one of which is the IDENT token. Then it’s simply a matter of going over its uses in Parser.cpp, until we find our smoking gun. This gets us into createResolve aaaaand

return new (m_parserArena) ResolveNode(location, ident, start);

That’s the node we’re looking for!

[4] C_LOOP is a special backend for JSC's portable assembler. What is special about it is that it generates C++ code, so that it can be used on otherwise unsupported architectures. Remember that the portable assembler (offlineasm) runs at compilation time.

January 08, 2020 12:00 PM


December 12, 2019

Nikolas Zimmermann

CSS 3D transformations & SVG

As mentioned in my first article, I have a long relationship with the WebKit project, and its SVG implementation. In this post I will explain some exciting new developments and possible advances, and I present some demos of the state of the art (if you cannot wait, go and watch them, and come back for the details). To understand why these developments are both important and achievable now though, we’ll have to first understand some history.

by zimmermann@kde.org (Nikolas Zimmermann) at December 12, 2019 12:00 AM

December 08, 2019

Philippe Normand

HTML overlays with GstWPE, the demo

Once again this year I attended the GStreamer Conference and, just before that, the Embedded Linux Conference Europe, which took place in Lyon (France). Both events were a good opportunity to demo one of the use-cases I have in mind for GstWPE: HTML overlays!

As we, at Igalia, usually have a …

by Philippe Normand at December 08, 2019 02:00 PM

December 04, 2019

Manuel Rego

Web Engines Hackfest 2020: New dates, new venue!

Igalia is pleased to announce the 12th annual Web Engines Hackfest. It will take place on May 18-20 in A Coruña, and in a new venue: Palexco. You can find all the information, together with the registration form, on the hackfest website: https://webengineshackfest.org/2020/.

Mission and vision

The main goal behind this event is to have a place for people from different parts of the web platform community to meet together for a few days and talk, discuss, draft, prototype, implement, etc. on different topics of interest for the whole group.

There are not many events where browser implementors from different engines can sit together and talk about their last developments, their plans for the future, or the controversial topics they have been discussing online.

However, this is an event not only for developers: other roles that are part of the community, like people working on standards, are welcome too.

It's really nice to have people from different backgrounds, working on a variety of things around the web, as that helps reach better solutions, enlighten the conversations and draft higher-quality conclusions during the discussions.

We believe the combination of all these factors makes the Web Engines Hackfest a unique opportunity to push forward the evolution of the web.

2020 edition

We realized that autumn is usually full of browser events (TPAC, BlinkOn, WebKit Contributors Meeting, … just to name a few), and most of the people coming to the hackfest also attend some of them. For that reason we thought it would be a good idea to move the event from fall to spring, in order to better accommodate everyone's schedules and avoid unfortunate conflicts or unnecessary hard choices. So next year the hackfest will happen in May, from Monday 18th to Wednesday 20th (both days included).

At this stage the event is becoming popular, and during the past three years we have had around 60-70 participants. Igalia's office has been a great venue for the hackfest during all this time, but on the last occasions we were using it at full capacity. So this time we decided to move the hackfest to a new venue, which will allow us to grow to 100 or more participants; let's see how things go. The new venue is Palexco, a lovely conference building at A Coruña's port, very close to the city center. We really hope you like the new place and enjoy it.

New venue: Palexco (picture by Jose Luis Cernadas Iglesias)

Having more people and a new venue brings us lots of challenges but also new possibilities. So we're changing the format of the event a little: the first day will follow a more regular conference fashion (with some talks and lightning talks), while also including some space for discussions and hacking, and the last two days will keep the usual unconference format, with a bunch of breakout sessions, informal discussions, etc. We believe the conversations and discussions that happen during the hackfest are one of the best things about the event, and we hope this new format will work well.

Join us

Thanks to the change of venue, the event is no longer invitation-only (as it used to be). We'll still be sending invitations to the people usually interested in the hackfest, but you can already register by yourself just by filling in the registration form.

Soon we will open a call for papers for the talks, stay tuned! We'll also have room for lightning talks, so attendees can take advantage of them to explain their work and plans at the event.

Last but not least, Arm, Google and Igalia will be sponsoring the 2020 edition, thank you very much! We hope more companies join the trend and help us to arrange the event with their support. If your company is willing to sponsor the hackfest, please don't hesitate to contact us at hackfest@webengineshackfest.org.

Some historical information

Igalia has been organizing and hosting this event since 2009. Back then, the event was called the "WebKitGTK+ Hackfest". The WebKitGTK+ project was, in those days, in its early stages. There was lots of work to do around the project, and a few people (11 to be specific) decided to work together for a whole week to move the project forward. The event was really successful and kept happening in a similar fashion for 5 years.

In 2014 we decided to broaden the scope of the event, no longer restricting it to people working on WebKitGTK+ (or WebKit) but opening it to members from all parts of the web platform community (including folks working on other engines like Blink, Servo and Gecko). We changed the name to "Web Engines Hackfest", we got a very positive response, and the event has been running yearly since then, growing more and more every year.

And now we’re looking forward to 2020 edition, in a new venue and with more people than ever. Let’s hope everything goes great.

December 04, 2019 11:00 PM

November 25, 2019

Nikolas Zimmermann

Back in town

Welcome to my blog!

Finally I’m back after my long detour to physics :-)

Some of you might know that my colleague Rob Buis and I founded the ksvg project a little more than 18 years ago (announcement mail to kfm-devel) and met again after many years in Galicia last month.

by zimmermann@kde.org (Nikolas Zimmermann) at November 25, 2019 12:00 AM

November 13, 2019

Manuel Rego

Web Engines Hackfest 2019

A month ago Igalia hosted another edition of the Web Engines Hackfest in our office in A Coruña. This is my personal summary of the event, obviously biased as I’m part of the organization.

Talks

During the event we arranged six talks on a variety of topics.

Emilio Cobos during his talk at the Web Engines Hackfest 2019

Web Platform Tests (WPT)

Apart from the talks, the main and most important part of the hackfest (at least from my personal point of view) are the breakout sessions and discussions we organize about different interest topics.

During one of these sessions we talked about the status of things regarding WPT. WPT is working really well for Chromium and Firefox; however, WebKit is still lagging behind, as synchronization there is still manual. Let's hope things will improve on the WebKit side in the future.

We also highlighted that the number of dynamic tests on WPT is lower than expected. We ruled out issues with the infrastructure, and think the problem lies more with the people writing the tests, who somehow forget to cover the cases where things change dynamically.

Apart from that, James Graham put on the table the results from the last MDN survey, which showed interoperability as one of the most important issues for web developers. WPT is helping with interop, but despite the improvements in that regard, this is still a big problem for authors. We didn't have any good answer on how to fix that; in my case I shared some ideas that could help improve things at some point:

  • Mandatory tests for specs: This is already happening for some specs like HTML but not for all of them. If we manage to reach a point where every change to a spec comes with a test, interoperability of initial implementations will probably be much better. It's easy to understand why this is not happening, as people working on specs are usually very overloaded.
  • Common forum to agree on shipping features: This is a kind of utopia, as each company has its own priorities, but if we had a place where the different browser vendors talked in order to reach an agreement about when to ship a feature, that would make web authors' lives much easier. We somehow managed to do that when we shipped CSS Grid Layout almost simultaneously in the different browsers; if we could repeat that success story for more features in the future, that would be awesome.

Debugging tools

One of the afternoons we did a breakout session related to debugging tools.

First Christian Biesinger showed us JsDbg, which is an amazing tool to explore data structures in the web browser (like the DOM, layout or accessibility trees). All the information is updated live while you debug your code, and you can access all of it in a single view very comfortably.

Afterwards Emilio Cobos explained how to use the reverse debugger rr. With this tool you can record a bug and then replay it as many times as you need, going back and forward in time. Emilio also showed how to annotate the output so you can jump directly to a given moment in time, and how to randomize the execution to help catch race conditions. As a result of this explanation we got a bug fixed in WebKitGTK+.

Other

Regarding MathML, Fred's talk concluded with the sending of the intent-to-implement mail to blink-dev, officially announcing the beginning of the upstreaming process. Since then a bunch of patches have already landed behind a runtime flag; you can follow the progress on Chromium issue #6606 if you're interested.

On the last day a few of us even attended the CSS Working Group conference call during the hackfest, which served as a test of the Igalia office's infrastructure ahead of the face-to-face meeting we'll be hosting next January.

People attending the CSSWG confcall (from left to right: Oriol, Emilio, fantasai, Christian and Brian)

As a side note, this time we arranged a guided city tour around A Coruña and, despite the weather, people seemed to have enjoyed it.

Acknowledgements

Thanks to everyone coming, we’re really happy for the lovely feedback you always share about the event, you’re so kind! ☺

Of course, kudos to the speakers for the effort working on such a nice set of interesting talks. 👏

Last, but not least, big thanks to the hackfest sponsors: Arm, Google, Igalia and Mozilla. Your support is critical to make this event possible, you rock. 🎸

Web Engines Hackfest 2019 sponsors: Arm, Google, Igalia and Mozilla

See you all next year. Some news about the next edition will be announced very soon, stay tuned!

November 13, 2019 11:00 PM

October 28, 2019

Adrián Pérez de Castro

The Road to WebKit 2.26: a Six Month Retrospective

Now that version 2.26 of both WPE WebKit and WebKitGTK ports have been out for a few weeks it is an excellent moment to recap and take a look at what we have achieved during this development cycle. Let's dive in!

  1. New Features
  2. Security
  3. Cleanups
  4. Releases, Releases!
  5. Buildbot Maintenance
  6. One More Thing

New Features

Emoji Picker

The GTK emoji picker has been integrated into WebKitGTK, and can be accessed with Ctrl-Shift-; while typing in input fields.

GNOME Web showing the GTK emoji picker.

Data Lists

WebKitGTK now supports the <datalist> HTML element (reference), which can be used to list possible values for an <input> field. Form fields using data lists are rendered as a hybrid between a combo box and a text entry with type-ahead filtering.

GNOME Web showing an <input> entry with completion backed by <datalist>.

WPE Renderer for WebKitGTK

The GTK port now supports reusing components from WPE. While there are no user-visible changes, with many GPU drivers a more efficient buffer sharing mechanism—which takes advantage of DMA-BUF, if available—is used for accelerated compositing under Wayland, resulting in better performance.

Packagers can disable this feature at build time by passing -DUSE_WPE_RENDERER=OFF to CMake, which may be needed on systems that cannot provide the required libwpe and WPEBackend-fdo libraries. It is recommended to leave this build option enabled, and it might become mandatory in the future.

In-Process DNS Cache

Running a local DNS caching service avoids doing queries to your Internet provider’s servers when applications need to resolve the same host names over and over—something web browsers do! This results in faster browsing, saves bandwidth, and partially compensates for slow DNS servers.

Patrick and Carlos have implemented a small cache inside the Network Process which keeps in memory a maximum of 400 entries, each valid for 60 seconds. Even though it may not seem like much, this improves page loads because most of the time the resources needed to render a page are spread across a handful of hosts, and their cache entries will be reused over and over.
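The behaviour described above can be sketched in a few lines of Python. This is an illustrative model only: the real cache is C++ code inside the Network Process, and the class and method names here are made up for the example; only the 400-entry / 60-second limits come from the text.

```python
import time
from collections import OrderedDict

class DnsCache:
    """Illustrative in-memory DNS cache: bounded size, fixed TTL."""

    def __init__(self, max_entries=400, ttl=60.0):
        self.max_entries = max_entries
        self.ttl = ttl
        self._entries = OrderedDict()  # hostname -> (addresses, expiry)

    def lookup(self, hostname, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        addresses, expiry = entry
        if now >= expiry:               # stale entry: drop it, report a miss
            del self._entries[hostname]
            return None
        return addresses

    def store(self, hostname, addresses, now=None):
        now = time.monotonic() if now is None else now
        if hostname in self._entries:
            del self._entries[hostname]
        elif len(self._entries) >= self.max_entries:
            self._entries.popitem(last=False)  # evict the oldest insertion
        self._entries[hostname] = (addresses, now + self.ttl)
```

Even such a simple scheme pays off: a page load typically re-resolves the same handful of hosts many times within the 60-second window.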

Promotional image of the “Gone in 60 Seconds” movie.
This image has nothing to do with DNS, except for how long entries are kept in the cache.

While it is certainly possible to run a full-fledged DNS cache locally (like dnsmasq or systemd-resolved, which many GNU/Linux setups have configured nowadays), WebKit can be used in all kinds of devices and operating systems which may not provide such a service. The caching benefits all kinds of systems, with embedded devices (where running an additional service is often prohibitive) benefiting the most, and therefore it is always enabled by default.

Security

Remember Meltdown and Spectre? During this development cycle we worked on mitigations against side channel attacks like these. They are particularly important for a Web engine, which can download and execute code from arbitrary servers.

Subprocess Sandboxing

Both WebKitGTK and WPE WebKit follow a multi-process architecture: there is at least the “UI Process”, an application that embeds a WebKitWebView widget; the “Web Process” (WebKitWebProcess, WPEWebProcess), which performs the actual rendering; and the “Network Process” (WebKitNetworkProcess, WPENetworkProcess), which takes care of fetching content from the network and also manages caches and storage.

Patrick Griffis has led the effort to add support in WebKit for isolating the Web Process, running it with restricted access to the rest of the system. This is achieved using Linux namespaces—the same underlying building blocks used by containerization technologies like LXC, Kubernetes, or Flatpak. As a matter of fact, we use the same building blocks as Flatpak: Bubblewrap, xdg-dbus-proxy, and libseccomp. This not only makes it more difficult for a website to snoop on other processes' data: it also limits the potential damage that maliciously crafted content can cause to the rest of the system, because the Web Process is where most of that content is parsed and processed.

This feature is built by default, and using it in applications is only one function call away.

PSON

Process Swap On (cross-site) Navigation is a new feature which makes it harder for websites to steal information from each other: pages from different sites are always rendered in different processes. In practice, each security origin uses a different Web Process (see above) for rendering, and while navigating from one page to another, new processes will be launched or terminated as needed. Chromium's Site Isolation works in a similar way.

Unfortunately, the needed changes ended up breaking a few important applications which embed WebKitGTK (like GNOME Web or Evolution) and we had to disable the feature for the GTK port just before its 2.26.0 release—it is still enabled in WPE WebKit.

Our plan for the next development cycle is to keep the feature disabled by default, and to provide a way for applications to opt in. Unfortunately it cannot be done the other way around, because the public WebKitGTK API has been stable for a long time and we cannot afford to break backwards compatibility.

HSTS

This security mechanism helps protect websites against protocol downgrade attacks: Web servers can declare that clients must interact using only secure HTTPS connections, and never revert to using unencrypted HTTP.

During the last few months Claudio Saavedra has completed the support for HTTP Strict Transport Security in libsoup—our networking backend—and the needed support code in WebKit. As a result, HSTS support is always enabled.
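To make the mechanism concrete, here is a rough Python sketch of what an HSTS store does: remember hosts that sent a Strict-Transport-Security header, and upgrade plain-HTTP URLs for those hosts. The real logic lives in libsoup (its SoupHSTSEnforcer); the class and helper names below are invented for illustration, and directives such as includeSubDomains are deliberately ignored here.

```python
import time
from urllib.parse import urlsplit, urlunsplit

class HstsStore:
    """Illustrative HSTS (RFC 6797) policy store, host-only."""

    def __init__(self):
        self._policies = {}  # host -> policy expiry time

    def process_header(self, host, header_value, now=None):
        # header_value is e.g. "max-age=31536000"; a max-age of zero
        # revokes any previously stored policy for the host.
        now = time.monotonic() if now is None else now
        for directive in header_value.split(";"):
            name, _, value = directive.strip().partition("=")
            if name.lower() == "max-age":
                if int(value) == 0:
                    self._policies.pop(host, None)
                else:
                    self._policies[host] = now + int(value)

    def rewrite(self, url, now=None):
        """Upgrade http:// URLs to https:// for known-HSTS hosts."""
        now = time.monotonic() if now is None else now
        parts = urlsplit(url)
        expiry = self._policies.get(parts.hostname)
        if parts.scheme == "http" and expiry is not None and now < expiry:
            return urlunsplit(("https",) + tuple(parts[1:]))
        return url
```

Once a policy is recorded, even a link or redirect pointing at http:// never reaches the network unencrypted—that is what defeats the downgrade attack.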

New WebSockets Implementation

The WebKit source tree includes a cross-platform WebSockets implementation that the GTK and WPE ports have been using. While that is great for new ports, which can support the feature out of the box, it is far from optimal for us: we were duplicating network code, because libsoup implements WebSockets as well.

Now that HSTS support is in place, Claudio and Carlos decided it was a good moment to switch to libsoup's implementation, so WebSockets can now also benefit from it. This also made it possible to provide the RFC 7692 permessage-deflate extension, which enables applications to request compression of message payloads.
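The core trick of permessage-deflate can be shown with Python's zlib module: per RFC 7692, each message is compressed with raw DEFLATE, and the trailing 0x00 0x00 0xFF 0xFF bytes produced by the sync flush are stripped before the payload goes on the wire; the receiver appends them back before inflating. This sketch ignores extension negotiation and context takeover, and the function names are made up for the example.

```python
import zlib

_TAIL = b"\x00\x00\xff\xff"  # empty stored block emitted by Z_SYNC_FLUSH

def deflate_message(payload: bytes) -> bytes:
    # Raw DEFLATE (negative wbits means no zlib header/trailer).
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    data = compressor.compress(payload) + compressor.flush(zlib.Z_SYNC_FLUSH)
    assert data.endswith(_TAIL)
    return data[:-4]            # strip the tail, as RFC 7692 requires

def inflate_message(payload: bytes) -> bytes:
    # Re-append the tail the sender stripped, then inflate.
    decompressor = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    return decompressor.decompress(payload + _TAIL)
```

For text-heavy traffic (JSON, chat protocols) this can shrink frames considerably, at the cost of some CPU time on both ends.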

To ensure that no regressions would be introduced, Claudio also added support in libsoup for running the Autobahn 🛣 test suite, which resulted in a number of fixes.

Cleanups

During this release cycle we have deprecated the single Web Process mode, and trying to enable it using the API is now a no-op. The motivation for this is twofold: in the same vein as PSON and sandboxing, we would rather not allow applications to make side channel attacks easier; not to mention that the changes needed in the code to accommodate PSON would have made it extremely complicated to keep the existing API semantics. As this can potentially cause trouble for some applications, we have been in touch with packagers, supporting them as best we can to ensure that the new WebKitGTK versions can be adopted without regressions.

Another important removal was the support for GTK2 NPAPI browser plug-ins. Note that NPAPI plug-ins are still supported, but if they use GTK they must use version 3.x—otherwise they will not be loaded. The reason for this is that GTK2 cannot be used in a program which uses GTK3, and vice versa. To circumvent this limitation, in previous releases we were building some parts of the WebKit source code twice, each one using a different version of GTK, resulting in two separate binaries: we have only removed the GTK2 one. This allowed for a good clean up of the source tree, reduced build times, and killed one build dependency. With NPAPI support being sunsetted in all browsers, the main reason to keep some degree of support for it is the Flash plug-in. Sadly its NPAPI version uses GTK2 and it does not work starting with WebKitGTK 2.26.0; on the other hand, it is still possible to run the PPAPI version of Flash through FreshPlayerPlugin if needed.

Releases, Releases!

Last March we released WebKitGTK 2.24 and WPE WebKit 2.24 in sync, and the same for the current stable, 2.26. As a matter of fact, most releases since 2.22 have been done in lockstep and this has been working extremely well.

Hannibal Smith, happy about the simultaneous releases.

Both ports share many of their components, so it makes sense to stabilize and prepare them for a new release series at the same time. Many fixes apply to both ports, and the few that do not add hardly any noise to the branch. This also allows Carlos García and me to split the effort of backporting fixes to the stable branch—though I must admit that Carlos has often done more.

Buildroot ♥ WebKit

Those using Buildroot to prepare software images for various devices will be happy to know that packages for the WPE WebKit components were imported into the source tree a while ago, and have been available since the 2019.05 release.

Two years ago I dusted off the webkitgtk package, bringing it up to the most recent version at the time; since then I have been keeping up with updates, and over time I have also been taking care of some of its dependencies (libepoxy, brotli, and woff2). Buildroot LTS releases are now receiving security updates, too.

Last February I had a great time meeting some of the Buildroot developers during FOSDEM, where we had the chance of discussing in person how to go about adding WPE WebKit packages to Buildroot. This ultimately resulted in the addition of the libwpe, wpebackend-fdo, wpewebkit, and cog packages to the tree.

My plan is to keep maintaining the Buildroot packages for both WebKit ports. I also have a few improvements in the pipeline, like enabling the sandboxing support (see this patch set) and usage of the WPE renderer in the WebKitGTK package.

Buildbot Maintenance

Breaking the Web is not fun, so WebKit needs extensive testing. The source tree includes tens of thousands of tests which are used to avoid regressions, and those are run on every commit using Buildbot. The status can be checked at build.webkit.org.

Additionally, there is another set of builders which run before a patch has had the chance of being committed to the repository. The goal is to catch build failures and certain kinds of programmer errors as early as possible, ensuring that the source tree is kept “green”—that is, buildable. This is the EWS, short for Early Warning System, which trawls Bugzilla for new—or updated—patches, schedules builds with them applied, and adds a set of status bubbles next to them in Bugzilla. Igalia also contributes EWS builders.

EWS bot bubbles as shown in Bugzilla
For each platform the EWS adds a status bubble after trying a patch.

Since last April there has been an ongoing effort to revamp the EWS infrastructure, which is now based on Buildbot as well. Carlos López recently updated our machines to Debian Buster, and then I switched them over to the new EWS at ews-build.webkit.org. The new setup brings niceties in the user interface, like being able to conveniently check the status of both the GTK and WPE WebKit ports in real time. Most importantly, this change has brought the average build time down from thirteen minutes to eight, making the “upload patch, check EWS build status” cycle shorter for developers.

Big props to Aakash Jain, who has been championing all the EWS improvements.

One More Thing

Finally, I would like to extend our thanks to everybody who has contributed to WebKit during the 2.26 development cycle, and in particular to the Igalia Multimedia team, who have been hard at work improving our WebRTC support and the GStreamer back-end 🙇.

October 28, 2019 12:30 AM

October 02, 2019

Paulo Matos

A Brief Look at the WebKit Workflow

As I learned about the workflow for contributing to JSC (JavaScriptCore) in WebKit, I took a few notes along the way. I decided to write them up as a post in the hope that they are useful for you as well. If you use git and a Unix-based system, and want to start contributing to WebKit, keep on reading.

More…

by Paulo Matos at October 02, 2019 03:56 PM