Planet Igalia

September 13, 2018

Diego Pino

YANG alarms

Alarm management is a fundamental part of network monitoring. The motivation for defining a standard alarm interface for network devices isn’t new. In the early 90s, ITU-T standardized X.733 (OSI model). This continued in mobile networks with the standardization of Alarm IRP (Integration Reference Point) by 3GPP. In TCP/IP networks, SNMP is the preferred choice for network management, along with ad hoc tools (usually command-line scripts). In SNMP, object information is stored as MIBs (Management Information Base), formal descriptions of the network objects that can be managed. Usually MIBs have a tree structure.

The IETF did not standardize an alarm MIB early on. Instead, management systems interpreted the enterprise-specific traps per MIB to build an alarm list. When RFC 3877 (Alarm Management Information Base (MIB)) was finally published, it had to address the existence of these enterprise traps and map them into alarms. This requirement led to a MIB that was not easy to use.

Introducing NETCONF and YANG

SNMP is still the dominant protocol for network management, although it has started showing its age. In recent years, several alternatives have been proposed with the goal of replacing it. Among all the proposals, the most promising alternative is NETCONF (RFC 6241: Network Configuration Protocol). NETCONF is, like SNMP, a network management protocol. It provides mechanisms to install, manipulate, and delete the configuration of network devices. NETCONF uses an RPC mechanism to execute its operations, and protocol messages are encoded in XML (or JSON).
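To make the RPC-over-XML idea a bit more concrete, here is a minimal Python sketch that builds a NETCONF get-config request. The namespace and element names come from RFC 6241; the helper name build_get_config and the choice of Python are mine, purely for illustration, and have nothing to do with how Snabb talks to its clients.

```python
import xml.etree.ElementTree as ET

# Base NETCONF namespace, as defined in RFC 6241.
NETCONF_NS = "urn:ietf:params:xml:ns:netconf:base:1.0"


def build_get_config(message_id, source="running"):
    """Serialize a <get-config> RPC for the given datastore (hypothetical helper)."""
    # Emit the NETCONF namespace as the default namespace of the message.
    ET.register_namespace("", NETCONF_NS)
    rpc = ET.Element(f"{{{NETCONF_NS}}}rpc", {"message-id": str(message_id)})
    get_config = ET.SubElement(rpc, f"{{{NETCONF_NS}}}get-config")
    src = ET.SubElement(get_config, f"{{{NETCONF_NS}}}source")
    ET.SubElement(src, f"{{{NETCONF_NS}}}{source}")
    return ET.tostring(rpc, encoding="unicode")


print(build_get_config(101))
```

The server would answer with an rpc-reply element carrying the same message-id, which is how a client matches replies to requests.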

The NETMOD WG (NETCONF Data Modeling Working Group) defines the semantics of operational data, configuration data, notifications and operations, using a data modeling language called YANG (See RFC 6020 and RFC 6021).

YANG is a very rich language. It allows defining much more complex data structures than other modeling languages such as DTD or XML Schema. For instance, YANG features a wide range of primitive data types (uint32, string, boolean, decimal64, etc.), simple data (leaf), structured data elements (container, list, leaf-list), definition of customized types (typedef), definition of remote procedure calls, references (instance-identifier, leafref), notifications, etc.

Take the following model as example:

container students {
   list student {
      leaf name {
         type string;
      }
      leaf date-of-birth {
         type yang:date;
      }
   }
}

students {
   student { name "Jane"; date-of-birth "01-01-1995"; }
   student { name "John"; date-of-birth "31-03-1995"; }
}

That very same model could be written in DTD/XML form as:

<?xml version="1.0"?>
<!DOCTYPE students [
<!ELEMENT students (student*)>
<!ELEMENT student (name,date-of-birth)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT date-of-birth (#PCDATA)>
]>
<students>
   <student>
      <name>Jane</name>
      <date-of-birth>01-01-1995</date-of-birth>
   </student>
   <student>
      <name>John</name>
      <date-of-birth>31-03-1995</date-of-birth>
   </student>
</students>

One obvious difference is that the field date-of-birth is encoded as a string in the DTD/XML model. By contrast, it’s defined as a date in the YANG model. Supporting date as a native data type in the language improves value checking. If date-of-birth is not a valid date, our YANG library will report the error.
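The kind of check a typed model buys you can be sketched in a few lines of Python. This is only an illustration of the idea, not how Snabb's YANG library validates values; the DD-MM-YYYY format is an assumption taken from the sample data above, and the function names are made up.

```python
from datetime import datetime


def parse_birth_date(text):
    """Return a date for a DD-MM-YYYY string, or raise ValueError."""
    return datetime.strptime(text, "%d-%m-%Y").date()


def validate_students(students):
    """Collect (name, error) pairs for students whose date is invalid."""
    errors = []
    for student in students:
        try:
            parse_birth_date(student["date-of-birth"])
        except ValueError as exc:
            errors.append((student["name"], str(exc)))
    return errors


students = [
    {"name": "Jane", "date-of-birth": "01-01-1995"},
    {"name": "John", "date-of-birth": "31-02-1995"},  # February has no day 31
]
for name, error in validate_students(students):
    print(f"{name}: {error}")
```

With a plain #PCDATA field, the second record would be accepted silently and the bad date would only surface later, wherever the value is finally used.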

YANG also allows composing several YANG modules into one single document. Data types from a different module can be accessed via a namespace prefix, as in the example above in the case of yang:date.

Covering all aspects of YANG would require a blog post of its own, so I will end this introduction here. To summarize, the two main ideas I’d like to highlight are the following:

  • NETCONF is a relatively new network management protocol, aimed at replacing SNMP and tightly coupled with YANG.
  • YANG is the data modeling language used to define NETCONF’s data models.

YANG alarms

The YANG alarms module is defined in draft-vallin-netmod-alarm-module-02.txt. Its implementation in Snabb was sponsored, like most lwAFTR-related work, by Deutsche Telekom. The module specification is still a draft, but even in this state it features enough functionality to make an implementation valuable.

The implementation uses Snabb’s native YANG library and Snabb’s config tool, a simple implementation of NETCONF. Both tools were mostly developed by Igalia (more precisely by my colleagues Andy Wingo and Jessica Tallon), also as part of the work of Snabb’s lwAFTR.

At a high level view, the YANG alarms module is organized in two parts:

  • Configuration data: stores all the attributes and variables that control how the module should operate. For example, max-alarm-status-changes controls the size of an alarm’s status-change list (default: 32); notify-status-changes controls whether notifications are sent on alarm status updates.
  • State data: actually stores alarm information and consists of 4 containers: alarm-list, alarm-inventory, shelved-alarms and summary.

The main component of the state data container is the alarm-list container:

list alarm {
   key "resource alarm-type-id alarm-type-qualifier";

   uses common-alarm-parameters;
}

grouping common-alarm-parameters {
   leaf resource {
      type resource;
      mandatory true;
   }
   leaf alarm-type-id {
      type alarm-type-id;
      mandatory true;
   }
   leaf alarm-type-qualifier {
      type alarm-type-qualifier;
   }
}

The alarm-list container stores all the active alarms managed in the system. But before going any further, we should define what an alarm is. Basically, an alarm is a persistent indication of a fault that clears only when its triggering condition has been resolved. An alarm is always in one of at least two states: raised or cleared.

When an alarm is raised, a new entry is created in alarm-list. An alarm is identified by the triple {resource, alarm-type-id, alarm-type-qualifier}, describing the affected resource, an identifier of the alarm type and a qualifier that carries other optional information. Besides this information, an alarm also stores other data (omitted in the example for simplicity) such as whether the alarm is-cleared, its last-changed timestamp, its perceived-severity and a list of status changes. When an alarm is created, a new item is added to this list. If the alarm later increases or decreases its severity, or changes some other properties as defined in the draft, a new status change is added to this list.
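The bookkeeping described above can be modeled roughly as follows. This is a Python sketch of the data structure only, under my own simplifying assumptions: the class and method names are hypothetical, timestamps are opaque values, and the real logic lives in Snabb's Lua code and tracks more fields than shown here.

```python
from collections import deque
from dataclasses import dataclass, field

MAX_STATUS_CHANGES = 32  # default of max-alarm-status-changes


@dataclass
class Alarm:
    resource: str
    alarm_type_id: str
    alarm_type_qualifier: str = ""
    is_cleared: bool = False
    perceived_severity: str = "critical"
    # Bounded history: the oldest entries fall off once the limit is reached.
    status_changes: deque = field(
        default_factory=lambda: deque(maxlen=MAX_STATUS_CHANGES))


class AlarmList:
    def __init__(self):
        self.alarms = {}  # keyed by (resource, alarm-type-id, alarm-type-qualifier)

    def raise_alarm(self, resource, type_id, qualifier, severity, time):
        key = (resource, type_id, qualifier)
        alarm = self.alarms.get(key)
        if alarm is None:
            alarm = Alarm(resource, type_id, qualifier)
            self.alarms[key] = alarm
        elif not alarm.is_cleared and alarm.perceived_severity == severity:
            return alarm  # already raised with this severity: nothing to record
        alarm.is_cleared = False
        alarm.perceived_severity = severity
        alarm.status_changes.append((time, severity))
        return alarm

    def clear_alarm(self, resource, type_id, qualifier, time):
        alarm = self.alarms.get((resource, type_id, qualifier))
        if alarm is not None and not alarm.is_cleared:
            alarm.is_cleared = True
            alarm.status_changes.append((time, "cleared"))
        return alarm
```

Keying the dictionary by the triple means that raising the same alarm twice updates the existing entry instead of creating a duplicate, which mirrors the key statement of the YANG list.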

Most of the YANG alarms module business logic is implemented in lib/yang/alarms.lua. This library provides an API for defining alarms and deciding when to raise or clear them. If we want to monitor a special condition, we simply need to import the alarms module and create a check point. For instance:

-- Send an ARP request for the next hop, unless its MAC address is
-- already known.
function ARP:maybe_send_arp_request (output)
   if self.next_mac then return end
   self.next_arp_request_time = self.next_arp_request_time or engine.now()
   if self.next_arp_request_time <= engine.now() then
      self:arp_resolving(self.next_ip)
      ...
   end
end
-- Log the pending resolution and raise the ARP alarm if alarm
-- notifications are enabled.
function ARP:arp_resolving (ip)
   print(("ARP: Resolving '%s'"):format(ipv4:ntop(ip)))
   if self.alarm_notification then
      arp_alarm:raise()
   end
end

When the condition is not met (self.next_mac hasn’t been resolved yet and self.next_arp_request_time has expired), an alarm is raised. But what if this check point is executed repeatedly, for instance every second, until an operator fixes the alarm condition? To avoid saturating the alarm list, the standard specifies an interval of 2 minutes before the same alarm is raised again. This interval is managed by the alarms library.
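A rate limit like this one is easy to picture with a small Python sketch. It is not the code in lib/yang/alarms.lua; the class name, the callback and the injectable clock are assumptions of mine, chosen so the behavior can be exercised without waiting two minutes.

```python
import time

RAISE_INTERVAL = 120  # seconds: the 2-minute interval mentioned above


class RateLimitedAlarm:
    """Forward a raise at most once per interval (hypothetical helper)."""

    def __init__(self, on_raise, interval=RAISE_INTERVAL, clock=time.monotonic):
        self.on_raise = on_raise
        self.interval = interval
        self.clock = clock
        self.last_raised = None

    def raise_alarm(self):
        now = self.clock()
        if self.last_raised is not None and now - self.last_raised < self.interval:
            return False  # too soon: swallow the repeated raise
        self.last_raised = now
        self.on_raise()
        return True


# Simulate a check point that fires every second for five minutes.
fake_now = [0.0]
raised_at = []
alarm = RateLimitedAlarm(lambda: raised_at.append(fake_now[0]),
                         clock=lambda: fake_now[0])
for second in range(300):
    fake_now[0] = float(second)
    alarm.raise_alarm()
print(raised_at)
```

The check point can therefore call raise as often as it likes; only the library needs to know about the interval.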

Besides a list of alarms, the module also defines these other containers:

  • alarm-inventory: It contains all possible alarm types for the system.
  • summary: Summary of numbers of alarms and shelved alarms.
  • shelved-alarms: A shelved alarm is ignored and won’t emit raise or clear events. Shelved alarms don’t emit notifications either. Shelving an alarm is a convenient way to silence it.

When an alarm is raised, cleared or changes its status, a notification is sent. The alarms module specifies three types of notifications:

  • alarm-notification: Used to report a state change for an alarm. This notification is emitted when an alarm is raised, is cleared or changes its status.
  • alarm-inventory-changed: Used to report that the list of possible alarms has changed.
  • operator-action: Used to report that an operator acted upon an alarm.

Continuing with the ARP alarm example, here’s what a notification looks like when such an alarm is raised:

$ sudo ./snabb alarms listen lwaftr
{"event":"alarm-notification",
 "resource":"16446", "alarm_type_id":"arp-resolution", "alarm_type_qualifier":"",
 "perceived_severity":"critical", "alarm_text":"Make sure you can resolve..."}

Upon receiving a notification, an operator, or an external program, can act on the affected resource signaled by the alarm and fix the condition that triggered it. For instance, when the lwAFTR is unable to resolve the next-hop IPv4 address, the alarm indicates that the host isn’t reachable (the host is down, or there’s no route to that address).

Lastly, the module also specifies one YANG action and two YANG RPCs:

  • set-operator-state: Allows an operator to change the state of an alarm. The specification defines 4 possible operator states: cleared-not-closed, cleared-closed, not-cleared-closed and not-cleared-not-closed.
  • purge-alarms: Deletes entries from the alarm list according to the supplied criteria. It can be used to delete alarms that are in closed state or are older than a specified time.
  • compress-alarms: Compresses entries in the alarm list by removing all but the latest state change for all alarms.
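The two RPCs boil down to simple list operations. The Python sketch below shows the idea under heavy simplification: alarms are plain dictionaries, the closed and last-changed fields and both function signatures are hypothetical, and the real purge-alarms input accepts more filters than the two modeled here.

```python
def compress_alarms(alarms):
    """Drop all but the latest status change; return the number of alarms trimmed."""
    trimmed = 0
    for alarm in alarms:
        changes = alarm["status-change"]
        if len(changes) > 1:
            alarm["status-change"] = changes[-1:]
            trimmed += 1
    return trimmed


def purge_alarms(alarms, now, older_than=None, closed_only=False):
    """Return the alarms that survive the purge.

    An alarm is purged only when it matches every criterion supplied;
    with no criteria at all, every alarm matches.
    """
    def matches(alarm):
        if closed_only and not alarm["closed"]:
            return False
        if older_than is not None and now - alarm["last-changed"] <= older_than:
            return False
        return True

    return [alarm for alarm in alarms if not matches(alarm)]


alarms = [
    {"closed": True, "last-changed": 0, "status-change": ["a", "b", "c"]},
    {"closed": False, "last-changed": 90, "status-change": ["a"]},
]
print(compress_alarms(alarms))
print(len(purge_alarms(alarms, now=100, closed_only=True)))
```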

NETCONF side

Adding alarms support to Snabb, and more precisely to the lwAFTR, has brought in many good things. First of all, Snabb’s YANG library has added support for more data types such as empty, identityref and leafref. It has also improved parsing and validation of other data types such as ipv4-prefix, ipv6-prefix and enum, in addition to other minor improvements and bug fixes. For the moment, the lwAFTR is the poster child for alarms, but the mechanism is generic enough and it can be used by other data-planes.

A new program has been added to Snabb, unsurprisingly called alarms. It consists of five sub-commands:

  • listen: Listens to a Snabb instance which provides alarms support. The subprogram can send RPC requests to the server program or listen to notifications.
  • get-state: Sends an XPath request to a target Snabb instance that provides alarms state information.
  • set-operator-state: User interface to set-operator-state action.
  • purge-alarms: User interface to purge-alarms.
  • compress-alarms: User interface to compress-alarms.

Below is an example of running the get-state subprogram, with an excerpt of its output:

$ sudo ./snabb alarms get-state lwaftr /
alarm-list {
   alarm {
      alarm-type-id arp-resolution;
      alarm-type-qualifier '';
      resource 21385;
      alarm-text
         "Make sure you can resolve external-interface.next-hop.ip address manually."
         "If it cannot be resolved, consider setting the MAC address of the next-hop directly."
         "To do it so, set external-interface.next-hop.mac to the value of the MAC address.";
      is-cleared false;
      last-changed 2018-06-18T14:57:40Z;
      perceived-severity critical;
      status-change {
         time 2018-06-18T14:57:40Z;
         alarm-text 
            "Make sure you can resolve external-interface.next-hop.ip address manually."
            "If it cannot be resolved, consider setting the MAC address of the next-hop directly."
            "To do it so, set external-interface.next-hop.mac to the value of the MAC address.";
         perceived-severity critical;
      }
      time-created 2018-06-18T14:57:40Z;
   }
   last-changed 2018-06-18T14:57:40Z;
   number-of-alarms 1;
}

The alarms module keeps all its state in one Snabb instance, the leader process. As a reminder, since v3.0 the lwAFTR runs in a multiprocess architecture which consists of:

  • One leader, which manages changes in the lwAFTR configuration file. For instance, changes in softwires (add, remove, update).
  • One or more workers, each of which runs an lwAFTR data-plane.

Both processes communicate via an IPC (inter-process communication) mechanism, in this case a message channel implemented using sockets. When a worker raises an alarm, a message is sent to the leader via this channel. The leader polls the alarms channel periodically, consuming all the stored messages. The result of processing a message is an action that alters the alarms state, for instance, adding a new alarm to the inventory, raising an alarm, clearing it, etc. All this logic is coded in lib/ptree/ptree.lua and lib/ptree/alarm_codec.lua.

Besides alarms, there are also notifications. A notification is a sort of simple message that is emitted under certain circumstances: when an alarm is raised, when its status changes or when a new alarm type is added to the inventory. Notifications are a native YANG element, not particular to alarms.

In Snabb, the notifications mechanism is also implemented via sockets. In this case, a socket connects an lwAFTR leader to a series of peers that listen on the socket. When a notification is triggered, a new notification is added to the leader’s list of notifications. The leader process runs a fiber that constantly polls this list. If it finds new entries, the notifications are serialized to JSON objects and sent through the socket. Once a notification is sent, it’s removed from the alarms state. This logic is implemented in lib/ptree/ptree.lua and lib/yang/alarms.lua.
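The queue-then-flush pattern described above can be sketched in Python as follows. The class is hypothetical and stands in for the leader's list of notifications; the send callback stands in for the socket write, and nothing here is taken from the actual Lua implementation.

```python
import json
from collections import deque


class NotificationQueue:
    """Pending notifications, drained by a periodic poller (hypothetical)."""

    def __init__(self):
        self.pending = deque()

    def add(self, event, **fields):
        self.pending.append({"event": event, **fields})

    def flush(self, send):
        """Serialize every pending notification to JSON, hand it to send()
        and drop it from the queue. Returns how many were sent."""
        sent = 0
        while self.pending:
            send(json.dumps(self.pending.popleft()))
            sent += 1
        return sent


queue = NotificationQueue()
queue.add("alarm-notification", resource="16446",
          alarm_type_id="arp-resolution", perceived_severity="critical")
queue.flush(print)  # a real peer would receive this line through the socket
```

Removing each entry as soon as it is sent keeps the queue from growing when notifications are frequent and peers are listening.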

Summary and conclusions

YANG Alarms is a simple mechanism to notify erroneous conditions. The main strengths of this module are:

  • It’s encoded as a YANG module, with all the advantages which that represents (common vocabulary and semantics, reusable).
  • Signaling errors by simply printing messages to stdout is not reliable, as they can easily be missed. Alarms are stored in memory and keep state that can be consulted later on demand.
  • Active notifications for the most important state changes. This makes it possible to hook up external programs, which do not need to constantly poll the current state to check whether a change happened.

On the downside, I personally think that the amount of information tracked per alarm is excessive, making the YANG specification more complex than one might think at first. Fortunately, programs interested in supporting this module do not need to implement all the features specified and can settle for just a subset of them. At the moment of writing this, the YANG alarms proposal is still a draft, but hopefully it will become a standard after several revisions.

September 13, 2018 06:00 AM

August 31, 2018

Eleni Maria Stea

SIGGRAPH 2018

About 2 weeks ago, I attended SIGGRAPH 2018. I am still very excited about the whole event, and I am very thankful that Igalia (the consultancy company I work for) and specifically the Graphics Team selected me to go, despite this being my first year at the company!  😀 SIGGRAPH is the biggest event for companies and individuals …

by hikiko at August 31, 2018 08:13 PM

August 09, 2018

Manuel Rego

Changes on CSS Grid Layout in percentages and indefinite height

This is a blog post about a change of behavior in CSS Grid Layout related to percentage row tracks and gutters in grid containers with indefinite height. Igalia has just implemented the change in Chromium and WebKit, which can affect some websites out there. So here I am going to explain several things about how percentages work in CSS and all the issues around them. Of course, I will also explain the change we are making in Grid Layout and how to keep your previous behavior in the new version with very simple changes.

Sorry for the length but I have been dealing with these issues since 2015 (probably earlier but that is the date of the first commit I found about this topic), and I went too deep explaining the concepts. Probably the post has some mistakes, this topic is not simple at all, but it represents a kind of brain dump of my knowledge about it.

Percentages and definite sizes

This is the easy part. If you have an element with fixed width and height, resolving percentages on children dimensions is really simple: they are just computed against the width or height of the containing block.

A simple example:

<div style="width: 500px; height: 200px; border: dashed thick black;">
  <div style="width: 50%; height: 75%; background: magenta;"></div>
</div>

Example of percentage dimensions in a containing block with definite sizes

Things are a bit trickier for percentage margins and paddings. In the inline direction (width in horizontal writing mode) they work as expected and are resolved against the inline size. However, in the block direction (height) they are not resolved against the block size (as one might initially expect) but against the inline size (width) of the containing block.

Again a very simple example:

<div style="width: 500px; height: 200px; border: dashed thick black;">
  <div style="margin-left: 10%; margin-top: 10%;
              height: 150px; background: magenta;"></div>
</div>

Example of percentage margins in a containing block with definite sizes
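The arithmetic for the margins example can be written down as a tiny Python sketch (the function name and its arguments are made up for illustration; browsers obviously do much more than this):

```python
def resolve_percentage_margins(inline_size, block_size, margins):
    """Resolve percentage margins of a child, in pixels.

    block_size is accepted only to make explicit that it is NOT used:
    every side, including top and bottom, resolves against the inline size.
    """
    return {side: inline_size * percent / 100
            for side, percent in margins.items()}


# Containing block of 500px x 200px, child with 10% left and top margins.
resolved = resolve_percentage_margins(
    500, 200, {"margin-left": 10, "margin-top": 10})
print(resolved)  # both margins are 50px; margin-top is not 20px
```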

Note that there is something more here: in the past, both the Flexbox and Grid Layout specifications stated that percentage margins and paddings resolve against their corresponding dimension, for example inline margins against the inline axis and block margins against the block axis.

This was implemented like that in Firefox and Edge, but Chromium and WebKit kept the usual behavior of always resolving against the inline size. So for a while the spec allowed resolving them either way.

This was a source of interoperability issues between the different browsers but finally the CSS Working Group (CSSWG) resolved to keep the behavior for regular blocks also for flex and grid items. And both Firefox and Edge modified their behavior and all browsers have the same output nowadays.

Percentages and indefinite sizes

First question is, what is an indefinite size? The simple answer is that a definite size is a size that you can calculate without taking into account the contents of the element. An indefinite size is the opposite, in order to compute it you need to check the contents first.

But then, what happens when the containing block dimensions are indefinite? For example, a floated element has indefinite width (unless otherwise manually specified), a regular block has indefinite height by default (height: auto).

For heights this is very simple: percentages are simply ignored, so they have no effect on the element and are treated as auto.

For widths it starts to get funny. Web rendering engines have two phases to compute the width of an element. A first one to compute the minimum and maximum intrinsic width (basically the minimum and maximum width of its contents), and a second one to compute the final width for that box.

So let’s use an example to explain this properly. Before getting into that, let me tell you that I am going to use the Ahem font in some examples, as it makes it very easy to know the size of the text and resolve the percentages accordingly. If we use font: 50px/1 Ahem; we know that the size of an X character is a square of 50x50 pixels.

<div style="float: left; font: 50px/1 Ahem;
            border: solid thick black; background: magenta; color: cyan;">
  XX XXXXX
</div>

Example of intrinsic width without constraints

The browser first calculates the intrinsic width. As the minimum it computes 250px (the size of the longest word, XXXXX in this case, which cannot be broken), and as the maximum 400px (the size of the whole text XX XXXXX without line breaking). So after this phase the browser knows that the element should have a width between 250px and 400px.

Then, during the layout phase, the browser will decide the final size. If there are no constraints imposed by the containing block, it will use the maximum intrinsic width (400px in this case). But if you have a wrapper with a 300px width, the element will have to use 300px as its width. If you have a wrapper smaller than the minimum intrinsic width, for example 100px, the element will still use the minimum 250px as its size. This is a quick and dirty explanation, but I hope it is useful to get the general idea.

Example of intrinsic width with different constraints
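The two phases can be condensed into a few lines of Python. This is a deliberately naive sketch: it assumes the Ahem font, where every character (spaces included) is as wide as the font size, and both function names are made up.

```python
def ahem_intrinsic_widths(text, font_size):
    """Return (min, max) intrinsic widths of a text rendered in Ahem."""
    words = text.split()
    min_width = max(len(word) for word in words) * font_size  # longest word
    max_width = len(" ".join(words)) * font_size              # no line breaks
    return min_width, max_width


def used_width(min_intrinsic, max_intrinsic, available=None):
    """Pick the final width: the available space, clamped to the intrinsic range."""
    if available is None:
        return max_intrinsic
    return max(min_intrinsic, min(max_intrinsic, available))


min_w, max_w = ahem_intrinsic_widths("XX XXXXX", 50)
print(min_w, max_w)                  # 250 400
print(used_width(min_w, max_w))      # 400: no constraint
print(used_width(min_w, max_w, 300)) # 300: wrapper inside the range
print(used_width(min_w, max_w, 100)) # 250: never below the minimum
```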

In order to resolve percentage widths (in the indefinite width situations) the browser does a different thing depending on the phase. During intrinsic size computations the percentage width is ignored (treated as auto like for the heights). But in the layout phase the width is resolved against the intrinsic size computed earlier.

Trying to summarize the above paragraphs, we can say that somehow the width is only indefinite while the browser is computing the intrinsic width of the element, afterwards during the actual layout the width is considered definite and percentages are resolved against it.

So now let’s see an example of indefinite dimensions and percentages:

<div style="float: left;
            border: solid thick black; background: magenta;">
  <div style="width: 50%; height: 50%; background: cyan;">Hello world!</div>
</div>

Example of percentage dimensions in a containing block with indefinite sizes

First the size of the magenta box is calculated based on its contents. As it does not have any constraints, it uses the maximum intrinsic width (the length of Hello world!). Then, as you can see, the width of the cyan box is 50% of the text length, but the height is the same as if we had used height: auto (the default value), so the 50% height is ignored.

Back-compute percentages

For margins and paddings things work more or less the same, remember that all of them are resolved against the inline direction (so they are ignored during intrinsic size computation and resolved later during layout).

But there is something special about this too. Nowadays all the browsers have the same behavior, but that was not always the case. Not so long ago (before Firefox 61, which was released last June) things worked differently in Firefox than in the rest of the browsers.

Again let’s go to an example:

<div style="float: left; font: 50px/1 Ahem;
            border: solid thick black; background: magenta;">
  <div style="margin-left: 50%; height: 100px;
              background: cyan; color: blue;">XXXXX</div>
</div>

Example of percentage margins in a containing block with indefinite sizes

In this example the size of the magenta box (the floated div) is the width of the text, 250px in this case. Then the margin is 50% of that size (125px), which reduces the size of the cyan box to 125px too and causes overflow.

But for these cases (percentage width margins and paddings in a container with indefinite width) Firefox did something extra that was called back-computing percentages. For that it did something similar to the following formula:

Intrinsic width / (1 - Sum of percentages)

Which for this case would be 250px / (1 - 0.50) = 500px. So it takes as intrinsic size of the magenta box 500px, and then it resolves the 50% margin against it (250px). Thanks to this there is no overflow, and the margin is 50% of the containing block size.

Example of old Firefox behavior back-computing percentage margins

This Firefox behavior seems really smart and avoids overflows, but the CSSWG discussed it and decided to use the other behavior. The main reason is what happens when the percentages get close to 100%, or go over that value. The size of the box starts to be quite big (with a 90% margin it would be 2500px), and when you reach 100% or go over it you cannot use that formula, so the size is considered infinite (basically the viewport size in this example) and there is a discontinuity in how percentages are resolved.
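The formula and its discontinuity are easy to see in a small Python sketch. The function name is mine, and percentages are passed as whole numbers so the arithmetic stays exact; this only reproduces the formula quoted above, not Firefox's actual code.

```python
import math


def back_computed_width(intrinsic_width, percent_sum):
    """Intrinsic width / (1 - sum of percentages), with percent_sum in 0..100+."""
    if percent_sum >= 100:
        return math.inf  # the formula breaks down: division by zero or negative
    return intrinsic_width * 100 / (100 - percent_sum)


for percent in (50, 90, 99, 100):
    print(percent, back_computed_width(250, percent))
# 50% gives 500px and 90% gives 2500px, as in the text;
# 99% already yields 25000px and 100% is infinite.
```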

So after that resolution Firefox changed their implementation and removed the back-computing percentages logic, thus we have now interoperability in how percentage margins and paddings are resolved.

CSS Grid Layout and percentages

And now we arrive at CSS Grid Layout and how to resolve percentages in two places: grid tracks and grid gutters.

Of course when the grid container has definite dimensions there are no problems in resolving percentages against them, that is pretty simple.

As usual the problem starts with indefinite sizes. Originally this was not a controversial topic: percentages for tracks behaved similarly to percentages for dimensions in regular blocks. A percentage column was treated as auto for intrinsic size computation and later resolved against that size during layout. Percentage rows were simply treated as auto. It does not mean that this is very easy to understand (actually it took me a while), but once you get it, it is fine and not hard to implement.

But when percentage support was added to grid gutters the big party started. Firefox was the first browser implementing them and they decided to use the back-compute technique explained in the previous section. Then, when we added support in Chromium and WebKit, we did something different from Firefox: we basically mimicked the behavior of percentage tracks. As browsers started to diverge, different discussions appeared.

One of the first agreements on the topic was that both percentage tracks and gutters should behave the same. That invalidated the back-computing approach, as it was not going to work fine for percentage tracks as they have contents. In addition it was finally discarded even for regular blocks, as commented earlier, so this was out of the discussion.

However, the debate moved to how percentage row tracks and gutters should be resolved: similarly to what we do for regular blocks, or similarly to what we do for columns. The CSSWG decided they would like to keep CSS Grid Layout as symmetric as possible, so making row percentages resolve against the intrinsic height would achieve that goal.

So finally the CSSWG resolved to modify how percentage row tracks and gutters are resolved for grid containers with indefinite height. The two GitHub issues with the last discussions are: #509 and #1921.

Let’s finish this point with a pair of examples to understand the change better comparing the previous and new behavior.

Percentage tracks:

<div style="display: inline-grid; border: solid thick;
            grid-template-columns: 75%; grid-template-rows: 50%;">
  <div style="background: magenta;">Testing</div>
</div>

Example of percentage tracks in a grid container with indefinite sizes

Here the intrinsic size of the grid container is the width and height of the text Testing, and then the percentage tracks are resolved against that size for both columns and rows (before, that was only done for columns).

Percentage gutters:

<div style="display: inline-grid; grid-gap: 10%; border: solid thick;
            grid-template-columns: 200px 200px; grid-template-rows: 100px 100px;">
  <div style="background: magenta;"></div>
  <div style="background: cyan;"></div>
  <div style="background: yellow;"></div>
  <div style="background: lime;"></div>
</div>

Example of percentage gutters in a grid container with indefinite sizes

In this example we can see the same thing, with the new behavior both the percentage column and row gaps are resolved against the intrinsic size.
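For the gutters example, the numbers work out as in the following Python sketch. It assumes fixed-size tracks only and a single percentage used for both gaps, as in the markup above; the function name and the new_behavior flag are made up to contrast both behaviors.

```python
def resolve_gaps(column_tracks, row_tracks, gap_percent, new_behavior=True):
    """Return (column_gap, row_gap) in pixels for an indefinite-size grid."""
    # Percentage gaps count as zero while the intrinsic size is computed,
    # so the intrinsic size is just the sum of the tracks.
    intrinsic_width = sum(column_tracks)
    intrinsic_height = sum(row_tracks)
    column_gap = intrinsic_width * gap_percent / 100
    # Before the change, percentage row gaps in this situation were zero.
    row_gap = intrinsic_height * gap_percent / 100 if new_behavior else 0
    return column_gap, row_gap


print(resolve_gaps([200, 200], [100, 100], 10))                      # (40.0, 20.0)
print(resolve_gaps([200, 200], [100, 100], 10, new_behavior=False))  # (40.0, 0)
```

Note that the gaps are added after the container size has been decided, so the tracks plus the gaps (440px wide, 220px tall) overflow the 400px by 200px grid container.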

Change behavior for indefinite height grid containers

For a while all browsers were behaving the same (after Firefox dropped the back-computing approach) so changing this behavior would imply some kind of risks, as some websites might be affected by that and get broken.

For that reason we added a use counter to track how many websites were hitting this situation, that is, using percentage row tracks in an indefinite height grid container. The number is not very high, but there is an increasing trend as Grid Layout is being adopted (almost 1% of websites are using it today).

And then Firefox changed the behavior for percentage row gutters to follow the new text on the spec, so they are resolved against the intrinsic height (this happened in version 62). However it did not change the behavior for percentage row tracks yet.

This was a trigger to take up the topic again and go deeper into it. After analyzing it carefully and crafting a prototype implementation, we sent an intent to implement and ship to the blink-dev mailing list.

The intent was approved, but we were requested to analyze the sites that were hitting the use counter. After checking 178 websites, only 8 got broken due to this change. We contacted them to try to get them fixed, explaining how to keep the previous behavior (more about this in the next section). You can find more details about this research in this mail.

Apart from that we added a deprecation message in Chromium 69, so if you have a website that is affected by this (it does not mean that it has to get broken but that it uses percentage row tracks in a grid container with indefinite height) you will get the following warning in the JavaScript console:

[Deprecation] Percentages row tracks and gutters for indefinite height grid containers will be resolved against the intrinsic height instead of being treated as auto and zero respectively. This change will happen in M70, around October 2018. See https://www.chromestatus.com/feature/6708326821789696 for more details.

Finally, this week the patch has been accepted and merged in master, so since Chromium 70.0.3516 (current Canary) you will have the new behavior. Apart from that, we also made the fix in WebKit, which will hopefully be part of the next Safari releases.

In addition Firefox and Edge developers have been notified and we have shared the tests in WPT as usual, so hopefully those implementations will get updated soon too.

Update your website

Yes, this change might or might not affect your website. Even if you get the deprecation warning, it can be the case that your website is still working perfectly fine, but in some cases it can break quite badly. The good news is that the solution is really straightforward.

If you find issues in your website and you want to keep the old behavior you just need to do the following for grid containers with indefinite height:

  • Change percentages in grid-template-rows or grid-auto-rows to auto.
  • Modify percentages in row-gap or grid-row-gap to 0.

With those changes your website will keep behaving like before. In most cases you will realize that the percentages were unneeded and were not doing anything useful for you; you might even be able to drop the declaration completely.

One of these cases would be websites that have grid containers with just one single row of 100% height (grid-template-rows: 100%); many of the sites hitting the use counter are like this. All these are not affected by this change, unless they have extra implicit rows, but the 100% is not really useful at all there, so they can simply remove the declaration.

Other sites that have issues are the ones that have, for example, two rows that sum up to 100% in total (grid-template-rows: 25% 75%). These percentages were ignored before, so the contents always fit in each of the rows. Now the contents might not fit in each row and the results might not be the desired ones. Example:

<div style="display: grid; grid-template-rows: 25% 75%; border: solid thick;">
  <div style="background: magenta;">First<br>two lines</div>
  <div style="background: cyan;">Second</div>
</div>

Example of overlapping rows in the new behavior
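
To see where the overlap comes from, here is a rough worked calculation for this example. It assumes that every line of text is h tall (h is just an illustrative symbol) and it ignores the border:

```latex
\begin{align*}
\text{intrinsic height (rows sized as auto)}: \quad H &= 2h + h = 3h \\
\text{row 1}: \quad 25\% \times H &= 0.75h \quad (\text{its content needs } 2h) \\
\text{row 2}: \quad 75\% \times H &= 2.25h \quad (\text{its content needs } h)
\end{align*}
```

Under that assumption the first row ends up 1.25h shorter than its two lines of text, so the text spills over the second row.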

The sites that were most broken usually have several rows and use percentages for only a few of them, or for all of them. Now the rows overflow the height of the grid container and overlap other content on the website. There were cases like this example:

<div style="display: grid; grid-template-rows: 50%; border: solid thick;">
  <div style="background: magenta;">First</div>
  <div style="background: cyan; height: 200px;">Second</div>
  <div style="background: yellow; height: 100px;">Third</div>
</div>

Example of overflowing rows in the new behavior
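
The overflow in this example can be estimated with a similar back-of-the-envelope calculation. It assumes the text in the first item is h tall (an illustrative symbol), that the second and third rows are implicit auto rows, and it ignores the border:

```latex
\begin{align*}
H &= h + 200\,\text{px} + 100\,\text{px} = 300\,\text{px} + h \\
\text{row 1} &= 50\% \times H = 150\,\text{px} + h/2 \\
\text{sum of rows} &= (150\,\text{px} + h/2) + 200\,\text{px} + 100\,\text{px} = 450\,\text{px} + h/2 \\
\text{overflow} &= (450\,\text{px} + h/2) - (300\,\text{px} + h) = 150\,\text{px} - h/2
\end{align*}
```

So for any ordinary line height the rows stick out of the grid container by a bit less than 150px.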

Closing

This topic has been a kind of never-ending story for the CSSWG, but finally it seems we are reaching the end. Let’s hope this does not go any further and things settle down after all this time. We hope that this change is the best solution for web authors and that everyone will be happy with the final outcome.

As usual I could not forget to highlight that all this work has been done by Igalia thanks to Bloomberg sponsorship as part of our ongoing collaboration.

Igalia and Bloomberg working together to build a better web

Thanks for reading this far; this ended up being much more verbose and covering more topics than originally planned, but I hope it is useful for understanding the whole thing. You can find all the examples from this blog post in this pen; feel free to play with them.

And to finish this blog post I could only do it by quoting fantasai:

this is why I hate percentages in CSS

I cannot agree more with her. 😇

August 09, 2018 10:00 PM

August 07, 2018

Manuel Rego

CSS Logical Properties and Values in Chromium and WebKit

Since the beginning of the web we have been used to dealing with physical CSS properties for different features. For example, we all know how to set a margin on an element using margin-left, margin-right, margin-top and/or margin-bottom. But with the appearance of CSS Writing Modes features, the concepts of left, right, top and bottom have somehow lost their meaning.

Imagine that you have some right-to-left (RTL) content on your website: your left is probably the physical right, so if you usually set margin-left: 100px for some elements, you might want to replace that with margin-right: 100px. But what happens if you have mixed left-to-right (LTR) and RTL content at the same time? Then you will need different CSS properties to set left or right depending on the content. Similar issues arise with vertical writing modes, where left for that content may be the physical top or bottom.

CSS Logical Properties and Values is a CSS specification that defines a set of logical (instead of physical) properties and values to prevent these kinds of issues. So when you want to set that margin-left: 100px independently of the direction and writing mode of your content, you can directly use margin-inline-start: 100px, which will be smart enough. Rachel Andrew has a nice blog post explaining this specification and its relevance in depth.

Example of margin-inline-start: 100px in different combinations of directions and writing modes

Oriol Brufau, an active collaborator on the CSS Working Group (CSSWG), has been doing an Igalia Coding Experience, implementing support for CSS Logical Properties and Values in Blink and WebKit. Maybe you were already aware of this, as my colleague Frédéric Wang talked about it in his last blog post reviewing the activities of the Igalia Web Platform team in the past semester.

Some history

Chromium and WebKit have long supported some of the CSS logical properties defined by the spec. However, they did not use the standard names defined in the specification, but -webkit- prefixed ones with different names.

For setting the dimensions of an element, Chromium and WebKit have properties like -webkit-logical-width and -webkit-logical-height, whereas CSS Logical defines inline-size and block-size instead. There are equivalent properties for minimum and maximum sizes too. These were already unprefixed at the beginning of 2017 and have been included in Chromium since version 57 (March 2017). In WebKit they are still only supported using the prefixed version.

But there are more similar properties for margins, paddings and borders in Chromium and WebKit that use start and end for inline direction and before and after for block direction. In CSS Logical we have inline-start and inline-end for inline direction and block-start and block-end for block direction, which are much less confusing. There was an attempt in the past to unprefix these properties but the work was abandoned and never completed. These ones were still using the -webkit- prefix so we decided to tackle them as the first task.

The post has been only talking about properties so far, but the same thing applies to some CSS values, that is why the spec is called CSS Logical Properties and Values. For example a very well-known property like float has the physical values left and right. The spec defines inline-start and inline-end as the logical values for float. However these were not supported yet in Chromium and WebKit, not even using -webkit- prefixes.

Firefox used to have some -moz- prefixed properties, but since Firefox 41 (September 2015) it has been shipping many of the standard logical properties and values. Firefox has been using these properties extensively in its own tests, so having them supported in Chromium will make it easier to share those tests.

At the beginning of this work, Oriol wrote a document explaining the implementation plan, where you can check the status of all these properties in Chromium and Firefox.

Unprefix existent properties

We originally sent an intent to implement and ship for the whole spec (actually not the whole spec, but the parts that the CSSWG considered ready to implement). But the Chromium community decided it was better to split it in two parts: unprefixing the existing -webkit- prefixed properties, and adding support for the rest.

The work on the first part, making the old -webkit- prefixed properties use the new standard names, has already been completed by Oriol and is going to be included in the upcoming release of Chromium 69.

In addition to the Chromium work, Oriol has just started to do this in WebKit too. Work is in its early stages here, but hopefully things will move forward in parallel with the Chromium work.

Adding support for the rest

The next step was to add support for the new stuff behind an experimental flag. This work is ongoing and you can check the current status in the latest Canary by enabling the Experimental Web Platform features flag.

So far Oriol has added support for a bunch of shorthands and the flow-relative offset properties. You can follow the work in issue #850004 in the Chromium bug tracker.

We will talk more about this in a future blog post once this task is completed and the new logical properties and values are shipped.

Tests!

Of course testing is a key part of all these tasks, and the web-platform-tests (WPT) repository plays a fundamental role in ensuring interoperability between the different implementations. As we have been doing at Igalia lately in all our developments, we used WPT as the primary place to store all the tests related to this work.

Oriol has been creating tests in WPT to cover all these features. The initial tests were based on the ones already available in Firefox, modified to cover the rest of the things that need to be checked.

Note that in Chromium all the sideways writing modes test cases are failing as there is no support for sideways in Chromium yet.

Plans for the future

As explained before, this is an ongoing task but we already have some extra plans for it. These are some of the tasks (in no particular order) that we would like to do in the coming months:

  • Complete the implementation of CSS Logical Properties and Values in Chromium. This was explained in the previous point and is moving forward at a good pace.
  • Get rid of the usage of -webkit- prefixed properties in the Chromium source code. Oriol has also started this task and it is currently a work in progress.
  • Deprecate and remove the -webkit- prefixed properties. It is still too early for that, but we will keep an eye on the metrics and do it once usage has decreased.
  • Implement it in WebKit too, first by unprefixing the current properties (which has been already started) and later continuing with the new things. It would be really nice if WebKit follows Chromium on this. Edge also has plans to add support for this spec, so that would make logical properties and values available in all the major browsers.

Wrap up

Oriol has been doing a good job here as part of his Igalia Coding Experience. Apart from all the new stuff that is landing in Chromium, he has also been fixing some related bugs.

We have just started the WebKit tasks, but we hope all this work can be part of future Chromium and Safari releases in the short term.

And that is all for now, we will keep you posted! 😉

August 07, 2018 10:00 PM

July 31, 2018

Thibault Saunier

WebKitGTK and WPE gain WebRTC support back!

WebRTC is a W3C draft protocol that “enables rich, high-quality RTP applications to be developed for the browser, mobile platforms, and IoT devices, and allow them all to communicate via a common set of protocols”. The protocol is mainly used to provide video conferencing systems from within web browsers.

https://appr.tc running in WebKitGTK

A brief history

At the very beginning of the WebRTC protocol, before 2013, Google was still using WebKit in Chrome and started to implement support using LibWebRTC, but when they started the Blink fork the implementation stopped in WebKit.

Around 2015/2016 Ericsson and Igalia (later sponsored by Metrological) implemented WebRTC support in WebKit, but instead of using LibWebRTC from Google, OpenWebRTC was used. This had the advantage of being implemented on top of the GStreamer framework, which happens to be used for the multimedia processing inside WebKitGTK and WebKitWPE. At that point in time, the standardization of the WebRTC protocol was still moving fast, mostly pushed by Google itself, and it was hard to be interoperable with the rest of the world. Despite that, the WebKit/GTK/WPE WebRTC implementation started to be usable with websites like appr.tc at the end of 2016.

Meanwhile, in late 2016, Apple decided to implement WebRTC support on top of Google’s LibWebRTC in their ports of WebKit, which led to the announcement of WebRTC support in WebKit in June 2017.

Later in 2017 the OpenWebRTC project lost momentum and as it was considered unmaintained, we, at Igalia, decided to use LibWebRTC for WebKitGTK and WebKitWPE too. At that point, the OpenWebRTC backend was completely removed.

GStreamer/LibWebRTC implementation

Given that Apple had implemented a LibWebRTC based backend in WebKit, and because this library is being used by the two main web browsers (Chrome and Firefox), we decided at the end of 2017 to reuse Apple’s work to implement support in our ports based on LibWebRTC. At that point, the two main APIs required to allow video conferencing with WebRTC needed to be implemented:

  • MediaDevices.getUserMedia and MediaStream: Allow retrieving audio and video streams from the user’s cameras and microphones (potentially more than that, but those are the main use cases we cared about).
  • RTCPeerConnection: Represents a WebRTC connection between the local computer and a remote peer.

As WebKit/GTK/WPE heavily relies on GStreamer for the multimedia processing, and given its flexibility, we made sure that our implementation of those APIs leverages the power of the framework and the existing integration of GStreamer in our WebKit ports.

Note that the whole implementation is reusing (after refactoring) big parts of the infrastructure introduced during the previously described history of WebRTC support in WebKit.

GetUserMedia/MediaStream

To implement that part of the API the following main components were developed:

  • RealtimeMediaSourceCenterLibWebRTC: Main entry point for our GStreamer based LibWebRTC backend.
  • GStreamerCaptureDeviceManager: A class to list and manage local Video/Audio devices using the GstDeviceMonitor API.
  • GStreamerCaptureDevice: Implementation of WebKit abstraction for capture devices, basically wrapping GstDevices.
  • GStreamerMediaStreamSource: A GStreamer Source element which wraps WebKit abstraction of MediaStreams to be used directly in a playbin3 pipeline (through a custom mediastream:// protocol). This implementation leverages latest GstStream APIs so it is already one foot into the future.

The main commit can be found here

RTCPeerConnection

Enabling the PeerConnection API meant bridging previously implemented APIs and the LibWebRTC backend developed by Apple:

  • RealtimeOutgoing/Video/Audio/SourceLibWebRTC: Passing local stream (basically from microphone or camera) to LibWebRTC to be sent to the peer.
  • RealtimeIncoming/Video/Audio/SourceLibWebRTC: Passing remote stream (from a remote peer) to the MediaStream object and in turn to the GStreamerMediaStreamSource element.

On top of that, and to leverage GStreamer memory management and negotiation capabilities, we implemented encoders and decoders for LibWebRTC (namely GStreamerVideoEncoder and GStreamerVideoDecoders). This gives us a huge number of hardware-accelerated encoder and decoder implementations, especially on embedded devices, which is a big advantage in particular for WPE, which is tuned for those platforms.

The main commit can be found here

WebKitWebRTC dataflow diagram

Conclusion

While we were able to make GStreamer and LibWebRTC work well together in that implementation, using the new GstWebRTC component (that is now in upstream GStreamer) as a WebRTC backend would be cleaner. Many pieces of the current implementation could be reused and it would allow us to have a simpler infrastructure and avoid having several RTP stacks in the WebKitGTK and WebKitWPE ports.

Most of the required APIs and features have been implemented, but a few are still under development (namely MediaDevices.enumerateDevices, canvas captureStream, and WebAudio and MediaStream bridging). This means that many Web applications using WebRTC already work, but some don’t yet. We are working on those!

A big thanks to my employer Igalia and Metrological for sponsoring our work on that!

by thiblahute at July 31, 2018 06:06 PM

July 21, 2018

Michael Catanzaro

On Flatpak Nightlies

Here’s a little timeline of some fun we had with the GNOME master Flatpak runtime last week:

  • Tuesday, July 10: a bad runtime build is published. Trying to start any application results in error while loading shared libraries: libdw.so.1: cannot open shared object file: No such file or directory. The problem is that the library is present in org.gnome.Sdk instead of org.gnome.Platform, where it is required.
  • Thursday, July 12:  the bug is reported on WebKit Bugzilla (since it broke Epiphany Technology Preview)
  • Saturday, July 14: having returned from GUADEC, I notice the bug report and bisect the issue to a particular runtime build. Mathieu Bridon fixes the issue in the freedesktop SDK and opens a merge request.
  • Monday, July 16: Mathieu’s fix is committed. We now have to wait until Tuesday for the next build.
  • Tuesday, Wednesday, and Thursday: we deal with various runtime build failures. Each day, we get a new build log and try to fix whatever build failure is reported. Then, we wait until the next day and see what the next failure is. (I’m not aware of any way to build the runtime locally. No doubt it’s possible somehow, but there are no instructions for doing so.)
  • Friday, July 20: we wait. The build has succeeded and the log indicates the build has been published, but it’s not yet available via flatpak update.
  • Saturday, July 21: the successful build is now available. The problem is fixed.

As far as I know, it was not possible to run any nightly applications during this two week period, except developer applications like Builder that depend on org.gnome.Sdk instead of the normal org.gnome.Platform. If you used Epiphany Technology Preview and wanted a functioning web browser, you had to run arcane commands to revert to the last good runtime version.
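
For reference, such a revert can be sketched roughly as follows. This is only a minimal sketch, not the exact commands used at the time: the remote name gnome-nightly and the two helper function names are assumptions, and the commit hash has to be picked by hand from the log output. The sketch only prints the commands (a dry run); flatpak remote-info --log and flatpak update --commit= are the stock Flatpak options for listing a ref's history and pinning it to an older commit.

```shell
#!/bin/sh
# Assumed names: adjust the remote and ref to match your installation.
REMOTE="gnome-nightly"
REF="org.gnome.Platform//master"

# Print the command that lists the commit history of the runtime.
runtime_log_cmd() {
    printf 'flatpak remote-info --log %s %s\n' "$REMOTE" "$REF"
}

# Print the command that pins the runtime to a given commit.
# $1: commit hash chosen by hand from the log output.
runtime_revert_cmd() {
    printf 'flatpak update --commit=%s %s\n' "$1" "$REF"
}

# Dry run: show the commands instead of executing them.
runtime_log_cmd
runtime_revert_cmd "<last-good-commit>"
```

Printing the commands first makes it easy to review them before running anything against a system-wide installation.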

This multi-week response time is fairly typical for us. We need to improve our workflow somehow. It would be nice to be able to immediately revert to the last good build once a problem has been identified, for instance.

Meanwhile, even when the runtime is working fine, some apps have been broken for months without anyone noticing or caring. Perhaps it’s time for a rethink on how we handle nightly apps. It seems likely that only a few apps, like Builder and Epiphany, are actually being regularly used. The release team has some hazy future plans to take over responsibility for the nightly apps (but we have to take over the runtimes first, since those are more important), and we’ll need to somehow avoid these issues when we do so. Having some form of notifications for failed builds would be a good first step.

P.S. To avoid any possible misunderstandings: the client-side Flatpak technology itself is very good. It’s only the server-side infrastructure that is problematic here. Clearly we have a lot to fix, but it won’t require any changes in Flatpak.

by Michael Catanzaro at July 21, 2018 02:12 PM

July 10, 2018

Pablo Saavedra

http503

Many times during these last months I thought about updating my blog with a new post. Unfortunately, for one reason or another, I always found an excuse not to do so. Well, I think that time is over, because I finally found something useful and worth the time spent on the writing.

– That is OK but … what are you talking about?
– Be patient Pablo, if you didn’t skip the headline of the post you probably already know what I’m talking about :-).

Yes, I’m talking about how to set up a Mac Pro computer in an icecc cluster based on Linux hosts, to take advantage of them to get more CPU power and build heavy software projects, like Chromium, faster. The idea behind this is to distribute all the computational work over Linux nodes (considerably cheaper than any Mac), which receive cross-compiling tasks from the Mac host.

I’ve been working as a sysadmin at Igalia for the last couple of years. One of my duties here is to support and improve the developers’ build infrastructure. Recently we’ve faced long build times for heavy software projects like, for instance, Chromium. In this context, one of the main issues that I had to solve is how to build Chromium for MacOS in a reasonable time without spending a lot of money on expensive bleeding-edge Apple hardware to get CPU power.

This is what this post is about: an explanation of how to configure a Mac Pro to use a Linux based icecc cluster to speed up build times using cross-compilation. For simplicity, the explanation focuses on the particular case of just one Linux host as icecc node and just one MacOS host requesting compile tasks but, in any case, you can extrapolate the instructions provided here to have as many nodes as you need.

So let’s go with the explanation but, first of all, a summary for those who want to go directly to the minimal and essential information …

TL;DR

On the Linux host:

# Configure the iceccd
$ sudo apt install icecc
$ sudo systemctl enable icecc-scheduler
$ sudoedit /etc/icecc/icecc.conf  # then set these values in the file:
ICECC_MAX_JOBS="32"
ICECC_ALLOW_REMOTE="yes"
ICECC_SCHEDULER_HOST="192.168.1.10"
$ sudo systemctl restart icecc

# Generate the clang cross-compiling toolchain
$ sudo apt install build-essential icecc
$ git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git ~/depot_tools
$ export PATH=$PATH:~/depot_tools
$ git clone https://github.com/psaavedra/chromium_clang_darwin_on_linux ~/chromium_clang_darwin_on_linux
$ cd ~/chromium_clang_darwin_on_linux
$ export CLANG_REVISION=332838  # or CLANG_REVISION=$(./get-chromium-clang-revision)
$ ./icecc-create-darwin-env
# copy the clang_darwin_on_linux_332838.tar.gz to your MacOS host

On the Mac:

# Configure the iceccd
$ git clone https://github.com/darktears/icecream-mac.git ~/icecream-mac/
$ sudo ~/icecream-mac/install.sh 192.168.1.10
$ sudo launchctl load /Library/LaunchDaemons/org.icecream.iceccd.plist
$ sudo launchctl start org.icecream.iceccd

# Set the ICECC env vars
$ export ICECC_CLANG_REMOTE_CPP=1
$ export ICECC_VERSION=x86_64:~/clang_darwin_on_linux_332838.tar.gz
$ export PATH=~/icecream-mac/bin/icecc/:$PATH

# Get the depot_tools
$ cd ~
$ git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
$ export PATH=$PATH:~/depot_tools

# Download and build the Chromium sources
$ mkdir chromium && cd chromium && fetch chromium && cd src
$ gn gen out/Default --args='cc_wrapper="icecc"
  treat_warnings_as_errors=false
  clang_use_chrome_plugins=false
  use_debug_fission=false
  linux_use_bundled_binutils=false
  use_lld=false
  use_jumbo_build=true'
$ ninja -j 32 -C out/Default chrome

… and now the detailed explanation

Installation and setup of icecream on Linux hosts

The installation of icecream on a Debian based Linux host is pretty simple. The latest version (1.1) of icecc has been available in Debian testing and sid for a while, so all you have to do is install it from the APT repositories. For stretch, there is a backport available in the publicly accessible apt.igalia.com repository:

sudo apt install icecc

The second important part of an icecc cluster is the icecc-scheduler. This daemon is in charge of routing the requests from the icecc nodes that need CPUs for compiling to the nodes of the icecc cluster that are allowed to run remote build jobs.

In this setup we will activate the scheduler in the Linux node (192.168.1.10). The key here is that only one scheduler should be up at the same time in the same network to avoid errors in the cluster.

sudo systemctl enable icecc-scheduler

Once the scheduler is configured and up, it is time to add icecc hosts to the cluster. We will start by adding the Linux host with this setup:

  • The IP of the icecc scheduler is 192.168.1.10
  • The Linux host is allowed to run remote jobs
  • The Linux host is allowed to run up to 32 concurrent jobs (this is an arbitrary decision and can be adjusted for each particular host)
    # edit /etc/icecc/icecc.conf
    ICECC_NICE_LEVEL="5"
    ICECC_LOG_FILE="/var/log/iceccd.log"
    ICECC_NETNAME=""
    ICECC_MAX_JOBS="32"
    ICECC_ALLOW_REMOTE="yes"
    ICECC_BASEDIR="/var/cache/icecc"
    ICECC_SCHEDULER_LOG_FILE="/var/log/icecc_scheduler.log"
    ICECC_SCHEDULER_HOST="192.168.1.10"

We will need to restart the service to apply those changes:

sudo systemctl restart icecc
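
To double-check the values that the daemon will pick up, a small helper like the following can be used. It is only a sketch: the function name icecc_conf_get and the ICECC_CONF override variable are made up for this example, and it assumes the file uses plain KEY="value" lines like the ones shown above.

```shell
#!/bin/sh
# Print the value of a KEY="value" line from an icecc.conf-style file.
# $1: key name, $2: path to the configuration file.
icecc_conf_get() {
    sed -n 's/^'"$1"'="\(.*\)"$/\1/p' "$2"
}

# ICECC_CONF is a hypothetical override; the default is the Debian path.
conf="${ICECC_CONF:-/etc/icecc/icecc.conf}"
if [ -r "$conf" ]; then
    for key in ICECC_SCHEDULER_HOST ICECC_MAX_JOBS ICECC_ALLOW_REMOTE; do
        printf '%s=%s\n' "$key" "$(icecc_conf_get "$key" "$conf")"
    done
fi
```

If the file is not readable the script simply prints nothing.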

Installation and setup of icecream on MacOS hosts

The next step is to install and configure the icecc service on our Mac. The easy way to get icecc available on Mac is the icecream-mac project from darktears. We will do the installation assuming the following facts:

  • The local user account in Mac is psaavedra
  • The IP of the icecc scheduler is 192.168.1.10
  • The Mac is not allowed to accept remote jobs
  • We don’t want to use the Mac as a worker.

To get the icecream-mac software we will clone the project from GitHub:

git clone https://github.com/darktears/icecream-mac.git /Users/psaavedra/icecream-mac/
sudo /Users/psaavedra/icecream-mac/install.sh 192.168.1.10

We will slightly edit the /Library/LaunchDaemons/org.icecream.iceccd.plist daemon definition as follows:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
  <dict>
    <key>Label</key>
    <string>org.icecream.iceccd</string>
    <key>ProgramArguments</key>
    <array>
      <string>/Users/psaavedra/icecream-mac/bin/icecc/iceccd</string>
      <string>-s</string>
      <string>192.168.1.10</string>
      <string>-m</string>
      <string>2</string>
      <string>--no-remote</string>
    </array>
    <key>KeepAlive</key>
    <true/>
    <key>UserName</key>
    <string>root</string>
  </dict>
</plist>

Note that we are setting 2 workers on the Mac. Those workers are needed to run jobs on the client host itself, for things like linking. We will load the service with this configuration:

sudo launchctl load /Library/LaunchDaemons/org.icecream.iceccd.plist
sudo launchctl start org.icecream.iceccd

Getting the cross-compilation toolchain for the icecream-mac

We already have the icecc cluster configured but, before starting to build Chromium on MacOS using icecc, there is still something left to do. We still need a cross-compiled clang for Darwin on Linux and, to avoid incompatibilities between versions, we need a clang based on the very same version as the one used by the Chromium code to be compiled.

You can check and get the cross-compilation clang revision that you need as follows:

cd src
CLANG_REVISION=$(cat tools/clang/scripts/update.py | grep CLANG_REVISION | head -n 1 | cut -d "'" -f 2)
echo $CLANG_REVISION
332838
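
To see what that pipeline does step by step, here is a self-contained sketch that runs the same grep, head and cut commands against a throwaway file. The sample file content is made up for illustration (it only mimics the kind of CLANG_REVISION = '...' assignment the pipeline expects), and the function name is hypothetical.

```shell
#!/bin/sh
# Extract the first single-quoted CLANG_REVISION value from a file.
# $1: path to a file containing a line such as: CLANG_REVISION = '332838'
clang_revision_from() {
    # grep keeps the lines mentioning CLANG_REVISION, head keeps the first
    # one, and cut returns the text between the first pair of single quotes.
    grep CLANG_REVISION "$1" | head -n 1 | cut -d "'" -f 2
}

# Demo on a temporary file standing in for tools/clang/scripts/update.py.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Illustrative content only, not the real script.
CLANG_REVISION = '332838'
CLANG_SUB_REVISION = 2
PACKAGE_VERSION = CLANG_REVISION + '-' + str(CLANG_SUB_REVISION)
EOF
clang_revision_from "$sample"
rm -f "$sample"
```

The head -n 1 step matters because later lines can also mention CLANG_REVISION without being the assignment itself.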

In order to simplify this step, I made some scripts which make it easy to generate this cross-compiled clang toolchain. On a Linux host:

  • Install the build dependencies:
    sudo apt install build-essential icecc
  • Get the Chromium project depot tools
    git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git ~/depot_tools  
    export PATH=$PATH:~/depot_tools
  • Download the psaavedra’s scripts (yes, my scripts):
    git clone https://github.com/psaavedra/chromium_clang_darwin_on_linux ~/chromium_clang_darwin_on_linux
    cd ~/chromium_clang_darwin_on_linux
  • You can use the get-chromium-clang-revision script to get the latest clang revision used in Chromium master:
    ./get-chromium-clang-revision
  • and then, to build the cross-compiled toolchain:
    ./icecc-create-darwin-env

    This script encapsulates the download, configuration and build of clang.

  • A clang_darwin_on_linux_999999.tar.gz file will be generated.

Set up the icecc environment variables

Once you have /Users/psaavedra/clang_darwin_on_linux_332838.tar.gz on your MacOS host, you are ready to set the icecc environment variables.

export ICECC_CLANG_REMOTE_CPP=1
export ICECC_VERSION=x86_64:/Users/psaavedra/clang_darwin_on_linux_332838.tar.gz

The first variable tells icecc to run the preprocessing step on the remote nodes when using clang (CPP here stands for the C preprocessor). The second one establishes the toolchain to be used by the x86_64 hosts (the Linux nodes) to build the code sent from the Mac.

Finally, remember to add the icecc binaries to the $PATH:

export PATH=/Users/psaavedra/icecream-mac/bin/icecc/:$PATH
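
Since a wrong tarball path is easy to miss, the exports above can be wrapped in a small guard. This is just a sketch with a made-up function name (setup_icecc_env). It has to be defined or sourced in the shell where the build will run, because a child process cannot change its parent's environment.

```shell
#!/bin/sh
# Export the icecc variables only if the toolchain tarball really exists.
# $1: path to the clang_darwin_on_linux_*.tar.gz tarball
# $2: directory holding the icecc binaries
setup_icecc_env() {
    if [ ! -f "$1" ]; then
        echo "toolchain tarball not found: $1" >&2
        return 1
    fi
    export ICECC_CLANG_REMOTE_CPP=1
    export ICECC_VERSION="x86_64:$1"
    export PATH="$2:$PATH"
}

# Example call, using the paths from this post:
# setup_icecc_env /Users/psaavedra/clang_darwin_on_linux_332838.tar.gz \
#                 /Users/psaavedra/icecream-mac/bin/icecc
```

The function returns a non-zero status and leaves the environment untouched when the tarball is missing.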

… and building Chromium, at last

Having reached this point, it’s time to build Chromium using the icecc cluster and the cross-compiled clang toolchain created previously. These steps follow the official Chromium build procedure, adapted only to set up the icecc wrapper.

Ensure depot_tools is in the path:

cd ~
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
export PATH=$PATH:~/depot_tools
ninja --version
# 1.8.2

Get the code:

git config --global core.precomposeUnicode true
mkdir chromium
cd chromium
fetch chromium

Configure the build:

cd src
gn gen out/Default --args='cc_wrapper="icecc" treat_warnings_as_errors=false clang_use_chrome_plugins=false linux_use_bundled_binutils=false use_jumbo_build=true'
# or with ccache
export CCACHE_PREFIX=icecc
gn gen out/Default --args='cc_wrapper="ccache" treat_warnings_as_errors=false clang_use_chrome_plugins=false linux_use_bundled_binutils=false use_jumbo_build=true'

And build, at last:

ninja -j 32 -C out/Default chrome

icemon allows you to monitor the icecc cluster graphically. Run it remotely from your Linux host if you don’t want to install it on MacOS:

ssh -X user@yourlinuxbox icemon

With icemon you should see how each build task is distributed across the icecc cluster.

by Pablo Saavedra at July 10, 2018 08:15 AM

July 09, 2018

Frédéric Wang

Review of Igalia's Web Platform activities (H1 2018)

This is the semiyearly report to let people know a bit more about Igalia’s activities around the Web Platform, focusing on the activity of the first semester of year 2018.

Projects

JavaScript

Igalia has proposed and developed the specification for BigInt, enabling math on arbitrary-sized integers in JavaScript. Igalia has been developing implementations in SpiderMonkey and JSC, where core patches have landed. Chrome and Node.js shipped implementations of BigInt, and the proposal is at Stage 3 in TC39.

Igalia is also continuing to develop several features for JavaScript classes, including class fields. We developed a prototype implementation of class fields in JSC. We have maintained Stage 3 in TC39 for our specification of class features, including static variants.

We also participated in WebAssembly (now at First Public Working Draft) and in internationalization, with new features such as Intl.RelativeTimeFormat (currently at Stage 3).

Finally, we have written more tests for JS language features, performed maintenance and optimization, and participated in other spec discussions at TC39. Among performance optimizations, we have contributed a significant optimization to Promise performance to V8.

Accessibility

Igalia has continued the standardization effort at the W3C. We are pleased to announce that the following milestones have been reached:

A new charter for the ARIA WG as well as drafts for ARIA 1.2 and Core Accessibility API Mappings 1.2 are in preparation and are expected to be published this summer.

On the development side, we implemented new ARIA features and fixed several bugs in WebKit and Gecko. We have refined platform-specific tools that are needed to automate accessibility Web Platform Tests (examine the accessibility tree, obtain information about accessible objects, listen for accessibility events, etc) and hope we will be able to integrate them in Web Platform Tests. Finally we continued maintenance of the Orca screen reader, in particular fixing some accessibility-event-flood issues in Caja and Nautilus that had significant impact on Orca users.

Web Platform Predictability

Thanks to support from Bloomberg, we were able to improve interoperability for various Editing/Selection use cases, for example when using backspace to delete text content just after a table (W3C issue) or deleting a list item inside a content cell.

We were also pleased to continue our collaboration with the AMP project. They provide us with a list of bugs and enhancement requests (mostly for the WebKit iOS port) with concrete use cases and repro cases. We check the status and plans in WebKit, do debugging/analysis and of course actually submit patches to address the issues. That’s not always easy (e.g. when it touches proprietary code or requires finding some specific reviewers) but at least we make discussions move forward. The topics are very diverse: MessageChannel API, CSSOM View, CSS transitions, CSS animations, iOS frame scrolling, custom elements, navigating special links and many others.

In general, our projects are always a good opportunity to write new Web Platform Tests, import them into WebKit/Chromium/Mozilla or improve the testing infrastructure. We have been able to work on tests for several specifications we work on.

CSS

Thanks to support from Bloomberg we’ve been pursuing our activities around CSS:

We also got more involved in the CSS Working Group, in particular by participating in the face-to-face meeting in Berlin, and we will attend the TPAC meeting in October.

WebKit

We have also continued improving the web platform implementation of some Linux ports of WebKit (namely GTK and WPE). A lot of this work was possible thanks to the financial support of Metrological.

Other activities

Preparation of Web Engines Hackfest 2018

Igalia has been organizing and hosting the Web Engines Hackfest since 2009, a three-day event where Web Platform developers can meet, discuss and work together. We are still working on the list of invitees, sponsors and talks but you can already save the date: it will happen from the 1st to the 3rd of October in A Coruña!

New Igalians

This semester, new developers have joined Igalia to pursue the Web platform effort:

  • Rob Buis, a Dutch developer currently living in Germany. He is a well-known member of the Chromium community and is currently helping on the web platform implementation in WebKit.

  • Qiuyi Zhang (Joyee), based in China, is a prominent member of the Node.js community who is now also assisting our compilers team on V8 developments.

  • Dominik Infuer, an Austrian specialist in compilers and programming language implementation who is currently helping on our JSC effort.

Coding Experience Programs

Two students have started a coding experience program some weeks ago:

  • Oriol Brufau, a recent graduate in math from Spain who has been an active collaborator of the CSS Working Group and a contributor to the Mozilla project. He is working on the CSS Logical Properties and Values specification, implementing it in Chromium.

  • Darshan Kadu, a computer science student from India, who contributed to GIMP and Blender. He is working on Web Platform Tests with focus on WebKit’s infrastructure and the GTK & WPE ports in particular.

Additionally, Caio Lima is continuing his coding experience in Igalia and is among other things working on implementing BigInt in JSC.

Conclusion

Thank you for reading this blog post and we look forward to more work on the web platform this semester!

July 09, 2018 10:00 PM

June 26, 2018

Gyuyoung Kim

How to develop Chromium with Visual Studio Code on Linux?

How have you been developing Chromium? I have often been asked what the best tool for developing Chromium is. I guess Chromium developers have usually been using vim, emacs, cscope, Sublime Text, Eclipse, etc., with GDB or console logs for debugging. On Windows, developers have used Visual Studio. Although Visual Studio offers powerful features for developing C/C++ programs, unfortunately it can’t be used by developers on other platforms. Recently, however, I noticed that Visual Studio Code can also be used to develop Chromium, with a nice editor, powerful debugging tools, and a lot of extensions. It even works on Linux and Mac because it is based on Electron. Nowadays I’m developing Chromium with VS Code, and I feel that it is one of the nicest tools for the job. So I’d like to share my experience of developing Chromium with Visual Studio Code on Ubuntu.

Default settings for Chromium

There is already an article about how to set up the environment to build Chromium in VS Code. So to start, you need to prepare the environment settings. This section just lists the key points of those instructions.
* https://chromium.googlesource.com/chromium/src/+/lkcr/docs/vscode.md

  1. Download VS Code
    https://code.visualstudio.com/docs/setup/setup-overview
  2. Launch VS Code in chromium/src
    $ code .
  3. Install useful extensions
    1. Python – Linting, intellisense, code formatting, refactoring, debugging, snippets.
    2. c/c++ for visual studio code – Code formatting, debugging, Intellisense.
    3. Toggle Header/Source – Toggles between .cc and .h with F4. The C/C++ extension supports this as well through Alt+O but sometimes chooses the wrong file when there are multiple files in the workspace that have the same name.
    4. you-complete-me – YouCompleteMe code completion for VS Code. It works fairly well in Chromium. To install You-Complete-Me, enter these commands in a terminal:
      $ git clone https://github.com/Valloric/ycmd.git ~/.ycmd 
      $ cd ~/.ycmd 
      $ git submodule update --init --recursive 
      $ ./build.py --clang-completer
    5. Rewrap –  Wrap lines at 80 characters with Alt+Q.
  4. Setup for Chromium
  5. Key mapping
    * Ctrl+P: opens a search box to find and open a file.
    * F1 or Ctrl+Shift+P: opens a search box to find a command
      (e.g. Tasks: Run Task).
    * Ctrl+K, Ctrl+S: opens the key bindings editor.
    * Ctrl+`: toggles the built-in terminal.
    * Ctrl+Shift+M: toggles the problems view (linter warnings,
      compile errors and warnings). You'll switch a lot between
      terminal and problem view during compilation.
    * Alt+O: switches between the source/header file.
    * Ctrl+G: jumps to a line.
    * F12: jumps to the definition of the symbol at the cursor
      (also available on right-click context menu).
    * Shift+F12 or F1: CodeSearchReferences, Return shows all
      references of the symbol at the cursor.
    * F1: CodeSearchOpen, Return opens the current file in
      Code Search.
    * Ctrl+D: selects the word at the cursor. Pressing it multiple
      times multi-selects the next occurrences, so typing in one
      types in all of them, and Ctrl+U deselects the last occurrence.
    * Ctrl+K, Z: enters Zen Mode, a fullscreen editing mode with
      nothing but the current editor visible.
    * Ctrl+X: without anything selected cuts the current line.
      Ctrl+V pastes the line.
  6. (Optional) Color setting
    • Press Ctrl+Shift+P, color, Enter to pick a color scheme for the editor

Additional settings after the default settings.

  1. Set workspaceRoot in .bashrc. (It will be needed by some extensions.)
    export workspaceRoot=$HOME/chromium/src
  2. Add new tasks to tasks.json in order to build Chromium by using ICECC.

 

{
  "taskName": "1-build_chrome_debug_icecc",
  "command": "buildChromiumICECC.sh Debug",
  "isShellCommand": true,
  "isTestCommand": true,
  "problemMatcher": [
    {
      "owner": "cpp",
      "fileLocation": [
        "relative",
        "${workspaceRoot}"
      ],
      "pattern": {
        "regexp": "^../../(.*):(\\d+):(\\d+):\\s+(warning|\\w*\\s?error):\\s+(.*)$",
        "file": 1,
        "line": 2,
        "column": 3,
        "severity": 4,
        "message": 5
      }
    },
    {
      "owner": "cpp",
      "fileLocation": [
        "relative",
        "${workspaceRoot}"
      ],
      "pattern": {
        "regexp": "^../../(.*?):(.*):\\s+(warning|\\w*\\s?error):\\s+(.*)$",
        "file": 1,
        "severity": 3,
        "message": 4
      }
    }
  ]
},
  3. Update “Chrome Debug” configuration in launch.json
    {
      "name": "Chrome Debug",
      "type": "cppdbg",
      "request": "launch",
      "targetArchitecture": "x64",
      "environment": [
        {"name":"workspaceRoot", "value":"${HOME}/chromium/src"}
      ],
      "program": "${workspaceRoot}/out/Debug/chrome",
      "args": ["--single-process"],  // The debugger only can work with the single process mode for now.
      "preLaunchTask": "1-build_chrome_debug_icecc",
      "stopAtEntry": false,
      "cwd": "${workspaceRoot}/out/Debug",
      "externalConsole": true,
    },

Screenshot after the settings

You will see this window after finishing all settings.

Build Chromium with ICECC in VS Code

  1. Run Task (F1 or Ctrl+Shift+P)
  2. Select 1-build_chrome_debug_icecc. VS Code will show an integrated terminal when building Chromium, as below. On the ICECC monitor, you can see that VS Code builds Chromium by using ICECC.

 

Start debugging in VS Code

After completing the build, it is time to start debugging Chromium.

  1. Set a breakpoint
    1. Press F9 or just click to the left of the line number
  2. Launch debug
    1. Press F5 button
  3. Screen captures when debugger stopped at a breakpoint
    1. Overview
    2. Editor
    3. Call stack
    4. Variables
    5. Watch
    6. Breakpoints

Todo

In the multi-process model, VS Code can’t debug child processes yet (e.g. the renderer process). The C/C++ extension project site suggests adding the command below to launch.json, but it didn’t work for Chromium when I tried.

"setupCommands": [
    { "text": "-gdb-set follow-fork-mode child" }
],

Reference

  1. Chromium VS Code setup: https://chromium.googlesource.com/chromium/src/+/lkcr/docs/vscode.md
  2. https://github.com/Microsoft/vscode-cpptools/blob/master/launch.md

by gyuyoung at June 26, 2018 12:55 AM

June 14, 2018

Diego Pino

Fast checksum computation

An Internet packet generally includes two checksums: a TCP/UDP checksum and an IP checksum. In both cases, the checksum value is calculated using the same algorithm. For instance, the IP header checksum is computed as follows:

  • Set the packet’s IP header checksum to zero.
  • Fetch the IP header octets as 16-bit words and calculate the accumulated sum.
  • In case there’s an overflow while adding, add the carry bit to the total sum.
  • Once the sum is done, set the checksum to the one’s complement of the accumulated sum.

Here is an implementation of this algorithm in Lua:

local ffi = require("ffi")
local bit = require("bit")

local function checksum_lua (data, size)
   local function r16 (data)
      return ffi.cast("uint16_t*", data)[0]
   end
   local csum = 0
   local i = size
   -- Accumulated sum.
   while i > 1 do
      local word = r16(data + (size - i))
      csum = csum + word
      i = i - 2
   end
   -- Handle odd sizes.
   if i == 1 then
      csum = csum + data[size-1]
   end
   -- Add accumulated carry.
   while true do
      local carry = bit.rshift(csum, 16)
      if carry == 0 then break end
      csum = bit.band(csum, 0xffff) + carry
   end
   -- One's complement.
   return bit.band(bit.bnot(csum), 0xffff)
end

The IP header checksum is calculated only over the IP header octets. However, the TCP checksum is calculated over the TCP header, the packet’s payload plus an extra header called the pseudo-header.

A pseudo-header is a 12-byte data structure composed of:

  • IP source and destination addresses (8 bytes).
  • Protocol (TCP=0x6 or UDP=0x11), preceded by a zero byte (2 bytes).
  • TCP length, calculated from the IP header’s total length minus the IP header size, that is, the TCP or UDP header plus the payload (2 bytes).
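
As a minimal sketch of how the pseudo-header enters the computation, the C function below sums the 12 pseudo-header bytes first and then the transport segment. The names sum16 and l4_checksum and the argument layout are illustrative assumptions, not Snabb’s API. Bytes are read as big-endian 16-bit words so the result does not depend on the host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Add len bytes to a running sum, reading them as big-endian 16-bit words.
   An odd trailing byte is padded with a zero on the right. */
static uint32_t sum16(const uint8_t *p, size_t len, uint32_t sum)
{
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    }
    if (i < len) {
        sum += (uint32_t)p[i] << 8;
    }
    return sum;
}

/* Hypothetical helper: checksum of a TCP (proto = 6) or UDP (proto = 17)
   segment over IPv4. seg holds the transport header plus payload, with
   its checksum field set to zero. */
uint16_t l4_checksum(const uint8_t src[4], const uint8_t dst[4],
                     uint8_t proto, const uint8_t *seg, uint16_t seg_len)
{
    uint8_t pseudo[12];
    for (int i = 0; i < 4; i++) {
        pseudo[i] = src[i];                 /* Source address. */
        pseudo[4 + i] = dst[i];             /* Destination address. */
    }
    pseudo[8] = 0;                          /* Zero padding. */
    pseudo[9] = proto;                      /* Protocol number. */
    pseudo[10] = (uint8_t)(seg_len >> 8);   /* Segment length, big-endian. */
    pseudo[11] = (uint8_t)(seg_len & 0xff);

    uint32_t sum = sum16(pseudo, sizeof pseudo, 0);
    sum = sum16(seg, seg_len, sum);
    while (sum >> 16) {                     /* Fold carries into 16 bits. */
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

Storing the returned value in the segment’s checksum field (high byte first) and calling the function again should yield zero, which is the property a receiver relies on.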

You may wonder what the purpose of the pseudo-header is. David P Reed, often considered the father of UDP, provides a great explanation in this thread: Purpose of pseudo header in TCP checksum. Basically, the original goal of the pseudo-header was to take into account IP addresses as part of the TCP checksum, since they’re relevant fields in an end-to-end communication. Back in those days, the original plan for TCP safe communications was to leave source and destination addresses clear but encrypt the rest of the TCP fields. That would avoid man-in-the-middle attacks. However NAT, which is essentially a man-in-the-middle, happened, trashing this original plan. In summary, the pseudo-header exists today for legacy reasons.

Lastly, it is worth mentioning that the UDP checksum is optional in IPv4, so you might often see it set to zero. However, the field is mandatory in IPv6.

Verifying a packet’s checksum

Verifying a packet’s checksum is easy. On receiving a packet, the receiver computes the checksum over all the relevant octets, including the checksum field. The result must be zero if the packet is correct, since in one’s complement arithmetic the sum of a number and its complement is all ones (0xffff), which is a representation of zero, and complementing it gives 0.
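
The check described above can be sketched in C as follows. The function names inet_checksum and checksum_ok are illustrative, and bytes are read as big-endian 16-bit words so that the code behaves the same on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes, read as big-endian 16-bit words. */
uint16_t inet_checksum(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    }
    if (i < len) {
        sum += (uint32_t)p[i] << 8;     /* Odd trailing byte, zero-padded. */
    }
    while (sum >> 16) {                 /* Fold carries into 16 bits. */
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* Hypothetical helper: non-zero when the data, checksum field included,
   verifies correctly. */
int checksum_ok(const uint8_t *p, size_t len)
{
    return inet_checksum(p, len) == 0;
}
```

For the illustrative 20-byte IPv4 header 45 00 00 73 00 00 40 00 40 11 b8 61 c0 a8 00 01 c0 a8 00 c7, whose checksum field holds 0xb861, checksum_ok returns 1. With the checksum field zeroed, inet_checksum returns 0xb861.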

From a developer’s perspective there are several tools for verifying the correctness of a packet’s checksum. Perhaps my preferred tool is Wireshark, which features an option to check the validity of TCP checksums (Edit->Preferences->Protocols[TCP]. Mark Validate the TCP checksum if possible). When this option is enabled, packets with a wrong checksum are highlighted with a black background.

Bad checksums

Seeing packets with wrong checksums is common when capturing packets with tcpdump and opening them in Wireshark. The reason why checksums are not correct is that TCP checksumming is generally offloaded to the NIC, since it’s a relatively expensive operation (nowadays, NICs have specialized hardware to do that operation fast). Since tcpdump captures outgoing packets before they hit the NIC, the checksum value hasn’t been calculated yet and likely contains garbage. It’s possible to check whether checksum offloading is enabled in a NIC by typing:

$ ethtool --show-offload <nic> | grep checksumming
rx-checksumming: on
tx-checksumming: on

Another option for verifying checksum values is using tshark:

$ tshark -r packets.pcap -V -o tcp.check_checksum:TRUE | grep -c "Error/Checksum"
20

Lastly, in case you’d like to fix wrong checksums in a pcap file, it is possible to do that with tcprewrite:

$ tcprewrite -C -i packets.pcap -o packets-fixed.pcap
$ tshark -r packets-fixed.pcap -V -o tcp.check_checksum:TRUE | grep -c "Error/Checksum"
0

Fast checksum computation

Since TCP checksum computation involves a large chunk of data, improving its performance is important. There are, in fact, several RFCs dedicated exclusively to discussing this topic. RFC 1071 (Computing the Internet Checksum) includes a detailed explanation of the algorithm and also explores different techniques for speeding up checksumming. In addition, it features reference implementations for several hardware architectures such as Motorola 68020, Cray and IBM 370.

Perhaps the fastest way to recompute a checksum of a modified packet is to incrementally update the checksum as the packet gets modified. Take for instance the case of NAT which modifies origin and destination ports and addresses. Those operations affect both the TCP and IP checksums. In the case of the IP checksum, if the source address gets modified we can recompute the new IP checksum as:

new_checksum = ~(~checksum + (-pkt.source_ip) + new_source_ip);

Or more generically, where HC is the old checksum, HC' the new checksum, m the old value of a 16-bit field and m' its new value:

HC' = one-complement(one-complement(HC) + (-m) + m')

This technique is covered in RFC 1071 and further polished over two other RFCs: RFC 1141 and RFC 1624 (Incremental Updating of the Internet Checksum).
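
A minimal C sketch of the incremental update, following the form given in RFC 1624, HC' = ~(~HC + ~m + m'). In one’s complement arithmetic -m is ~m. The function name checksum_update16 is an assumption made for illustration, and all three values must be in the same byte order.

```c
#include <stdint.h>

/* Update checksum hc after a 16-bit field changes from m_old to m_new. */
uint16_t checksum_update16(uint16_t hc, uint16_t m_old, uint16_t m_new)
{
    uint32_t sum = (uint16_t)~hc;       /* ~HC */
    sum += (uint16_t)~m_old;            /* + ~m */
    sum += m_new;                       /* + m' */
    while (sum >> 16) {                 /* Fold carries into 16 bits. */
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

A 32-bit field such as an IPv4 address is handled by calling the function twice, once per 16-bit half. For example, an IP header with checksum 0xb861 whose TTL/protocol word changes from 0x4011 to 0x3f11 gets the new checksum 0xb961, the same value a full recomputation produces.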

If we decide to recompute the checksum, there are several techniques to do it fast. In its canonical form, the algorithm says octets are summed as 16-bit words. If there’s a carry after an addition, the carry should be added to the accumulated sum. Truth is, it’s not necessary to add octets as 16-bit words. Due to the associative property of addition, it is possible to do parallel addition using larger word sizes such as 32-bit or 64-bit words. In those cases the variable that stores the accumulated sum has to be bigger too. Once the sum is computed, a final step folds the sum to a 16-bit word (adding the carry, if any).

Here’s an implementation in C using 32-bit words:

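#include <stdint.h>
#include <arpa/inet.h>  // ntohs

uint16_t checksum (uint8_t* data, uint16_t len)
{
    uint64_t sum = 0;
    uint32_t* p = (uint32_t*) data;
    uint16_t i = 0;
    // Sum 32-bit words. Carries accumulate in the upper half of sum.
    while (len >= 4) {
        sum = sum + p[i++];
        len -= 4;
    }
    // i counts 32-bit words, so the remaining bytes start at data + i * 4.
    uint8_t* rest = data + i * 4;
    if (len >= 2) {
        sum = sum + *(uint16_t*) rest;
        rest += 2;
        len -= 2;
    }
    // Handle odd sizes.
    if (len == 1) {
        sum += *rest;
    }

    // Fold sum into 16-bit word.
    while (sum>>16) {
        sum = (sum & 0xffff) + (sum>>16);
    }
    // Words were summed in host byte order, so swap the result if needed.
    return ntohs((uint16_t)~sum);
}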

Using larger word sizes increases speed as it reduces the total number of operations. What about using this technique on 64-bit integers? It definitely would be possible, but it requires handling the carry in the loop body. In the algorithm above, 32-bit words are summed into a 64-bit word. The carry, if any, is stored in the higher part of sum, which later gets summed in the folding step.
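
To illustrate what handling the carry in the loop body means, here is a sketch in C that sums 64-bit words and adds the carry back after each addition. The names load_be64 and checksum64 are hypothetical. Words are assembled byte by byte in big-endian order to keep the example independent of the host byte order, which a tuned implementation would not do.

```c
#include <stddef.h>
#include <stdint.h>

/* Load up to 8 bytes as a big-endian 64-bit word. Bytes beyond avail
   are treated as zero padding. */
static uint64_t load_be64(const uint8_t *p, size_t avail)
{
    uint64_t w = 0;
    for (size_t i = 0; i < 8; i++) {
        w = (w << 8) | (i < avail ? p[i] : 0);
    }
    return w;
}

uint16_t checksum64(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;
    for (size_t off = 0; off < len; off += 8) {
        uint64_t w = load_be64(data + off, len - off);
        sum += w;
        if (sum < w) {      /* The addition wrapped around: */
            sum++;          /* add the carry back (end-around carry). */
        }
    }
    /* Fold 64 bits into 32, then 32 into 16. Each fold is done twice
       because the first pass can itself produce a carry. */
    sum = (sum >> 32) + (sum & 0xffffffff);
    sum = (sum >> 32) + (sum & 0xffffffff);
    sum = (sum >> 16) + (sum & 0xffff);
    sum = (sum >> 16) + (sum & 0xffff);
    return (uint16_t)~sum;
}
```

In assembly the comparison disappears, because an add-with-carry instruction consumes the carry flag directly.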

Using SIMD instructions should allow us to sum larger sizes of data in parallel. For instance, using AVX2’s VPADD (vector-packed addition) instructions it should be possible to sum 16x16-bit words in parallel. The issue here once again is handling the carry that each addition may generate. So instead of a 16x16-bit vector, an 8x32-bit vector is used. From a functional point of view this is equivalent to summing using 128-bit words.

Snabb features implementations of checksum computation, both generic and using SIMD instructions. In the latter case there are versions for the SSE2 and AVX2 instruction sets. Snabb’s philosophy is to do everything in software and rely as little as possible on offloaded NIC functions. Thus checksum computation is something done in code. Snabb’s implementation using AVX2 instructions is available at src/arch/avx2.c (Luke pushed a very interesting implementation in machine code as well. See PR#899).

Going back to RFC 1071, many of the reference implementations do additions in the main loop taking into account the carry bit. For instance, in the Motorola 68020 implementation that is done using the ADDXL instruction. In X86 there’s an equivalent add-with-carry instruction (ADC). Basically this instruction performs a sum of two operands plus the carry-flag.

Another technique described in RFC 1071, and also used in the reference implementations, is loop unrolling. Instead of summing one word per iteration, we could sum 2, 4 or 8 words. A loop that sums 64-bit words in strides of 8 consumes 64 bytes (512 bits) per iteration, so packets smaller than that never enter the loop at all. Unrolling a loop requires adding waterfall code after the loop to handle the leftover data that doesn’t fill a whole stride.
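
A small C sketch of loop unrolling with waterfall code, assuming 16-bit words and an unrolling factor of 4 for readability. The function name checksum_unrolled is illustrative and this is not Snabb’s implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one big-endian 16-bit word. */
static uint32_t be16(const uint8_t *p)
{
    return ((uint32_t)p[0] << 8) | p[1];
}

uint16_t checksum_unrolled(const uint8_t *p, size_t len)
{
    uint32_t sum = 0;
    /* Main loop: four words (8 bytes) per iteration. */
    while (len >= 8) {
        sum += be16(p);
        sum += be16(p + 2);
        sum += be16(p + 4);
        sum += be16(p + 6);
        p += 8;
        len -= 8;
    }
    /* Waterfall: at most 7 bytes are left, handled without looping. */
    if (len >= 4) {
        sum += be16(p);
        sum += be16(p + 2);
        p += 4;
        len -= 4;
    }
    if (len >= 2) {
        sum += be16(p);
        p += 2;
        len -= 2;
    }
    if (len == 1) {
        sum += (uint32_t)p[0] << 8;     /* Odd byte, zero-padded. */
    }
    while (sum >> 16) {                 /* Fold carries into 16 bits. */
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)~sum;
}
```

The 32-bit accumulator is enough for packet-sized inputs, since each iteration adds at most four 16-bit values.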

As an exercise to teach myself more DynASM and X86-64 assembly, I decided to rewrite the generic checksum algorithm and see if performance improved. The first implementation followed the canonical algorithm, summing words as 16-bit values. Performance was much better than the generic Lua implementation posted at the beginning of this article, but it wasn’t better than Snabb’s C implementation, which does loop unrolling.

After this initially disappointing result, I decided to apply some of the optimization techniques discussed above. Summing octets as 32-bit words definitely improved performance. The advantage of writing the algorithm in assembly is that I could make use of the ADC instruction. That allowed me to use 64-bit words. Performance improved once again. Finally I tried out several degrees of loop unrolling. With an unrolling of 4 strides the algorithm proved to be better than the SSE2 algorithm for several packet sizes: 64 bytes, 570 bytes and 1520 bytes. It doesn’t beat the AVX2 implementation in the large packet case, but it showed better performance for small and medium sizes.

And here’s the final implementation:

; Prologue.
push rbp
mov rbp, rsp

; Accumulative sum.
xor rax, rax                ; Clear out rax. Stores accumulated sum.
xor r9, r9                  ; Clear out r9. Stores value of array.
xor r8, r8                  ; Clear out r8. Stores array index.
mov rcx, rsi                ; rcx = size (2nd argument, rsi). Bytes left to sum.
1:
cmp rcx, 32                 ; If fewer than 32 bytes are left.
jl >2                       ; Jump to branch '2'.
add rax, [rdi + r8]         ; Sum acc with qword[0].
adc rax, [rdi + r8 + 8]     ; Sum with carry qword[1].
adc rax, [rdi + r8 + 16]    ; Sum with carry qword[2].
adc rax, [rdi + r8 + 24]    ; Sum with carry qword[3].
adc rax, 0                  ; Sum carry-bit into acc.
sub rcx, 32                 ; Decrease bytes left by 32.
add r8, 32                  ; Advance four qwords.
jmp <1                      ; Go to beginning of loop.
2:
cmp rcx, 16                 ; If fewer than 16 bytes are left.
jl >3                       ; Jump to branch '3'.
add rax, [rdi + r8]         ; Sum acc with qword[0].
adc rax, [rdi + r8 + 8]     ; Sum with carry qword[1].
adc rax, 0                  ; Sum carry-bit into acc.
sub rcx, 16                 ; Decrease bytes left by 16.
add r8, 16                  ; Advance two qwords.
3:
cmp rcx, 8                  ; If fewer than 8 bytes are left.
jl >4                       ; Jump to branch '4'.
add rax, [rdi + r8]         ; Sum acc with qword[0].
adc rax, 0                  ; Sum carry-bit into acc.
sub rcx, 8                  ; Decrease bytes left by 8.
add r8, 8                   ; Next 64-bit.
4:
cmp rcx, 4                  ; If fewer than 4 bytes are left.
jl >5                       ; Jump to branch '5'.
mov r9d, dword [rdi + r8]   ; Fetch 32-bit from data + r8 into r9d.
add rax, r9                 ; Sum acc with r9. Accumulate carry.
sub rcx, 4                  ; Decrease bytes left by 4.
add r8, 4                   ; Next 32-bit.
5:
cmp rcx, 2                  ; If fewer than 2 bytes are left.
jl >6                       ; Jump to branch '6'.
movzx r9, word [rdi + r8]   ; Fetch 16-bit from data + r8 into r9.
add rax, r9                 ; Sum acc with r9. Accumulate carry.
sub rcx, 2                  ; Decrease bytes left by 2.
add r8, 2                   ; Next 16-bit.
6:
cmp rcx, 1                  ; If no bytes are left.
jl >7                       ; Jump to branch '7'.
movzx r9, byte [rdi + r8]   ; Fetch 8-bit from data + r8 into r9.
add rax, r9                 ; Sum acc with r9. Accumulate carry.

; Fold 64-bit into 16-bit.
7:
mov r9, rax                 ; Assign acc to r9.
shr r9, 32                  ; Shift r9 32-bit. Stores higher part of acc.
and rax, 0x00000000ffffffff ; Clear out higher-part of rax. Stores lower part of acc.
add eax, r9d                ; 32-bit sum of acc and r9.
adc eax, 0                  ; Sum carry to acc.
mov r9d, eax                ; Repeat for 16-bit.
shr r9d, 16
and eax, 0x0000ffff
add ax, r9w
adc ax, 0

; One's complement.
not rax                     ; One-complement of rax.
and rax, 0xffff             ; Clear out higher part of rax.

; Epilogue.
mov rsp, rbp
pop rbp

; Return.
ret

Benchmark results for several data sizes:

Data size: 44 bytes

Algorithm   Time per csum   Time per byte
Generic     87.77 ns        1.99 ns
SSE2        86.06 ns        1.96 ns
AVX2        83.20 ns        1.89 ns
New         52.10 ns        1.18 ns

Data size: 550 bytes

Algorithm   Time per csum   Time per byte
Generic     1058.04 ns      1.92 ns
SSE2        510.40 ns       0.93 ns
AVX2        318.42 ns       0.58 ns
New         270.79 ns       0.49 ns

Data size: 1500 bytes

Algorithm   Time per csum   Time per byte
Generic     2910.10 ns      1.94 ns
SSE2        991.04 ns       0.66 ns
AVX2        664.98 ns       0.44 ns
New         743.88 ns       0.50 ns

All in all, it has been a fun exercise. I have learned quite a lot about the Internet checksum algorithm. I also learned how loop unrolling can help improve performance in a dramatic way (more than I initially expected). I also found it very interesting how changing the context of a problem, in this case the target programming language, forces you to think about the problem in a different way, but also enables optimizations that were not possible before.

June 14, 2018 06:00 AM

June 13, 2018

Jacobo Aragunde

Chromium official/release builds and icecc

You may already be using icecc to compile your Chromium, either by following some instructions like the ones published by my colleague Gyuyoung or using the popular icecc-chromium set of scripts. In those cases, you will probably run into some trouble if you try to generate an official build with that configuration.

First, let me recap what an “official build” means in Chromium. You may know that build optimization in Chromium builds depends on two flags:

  • is_debug
    Debug build. Enabling official builds automatically sets is_debug to false.
  • is_official_build
    Set to enable the official build level of optimization. This has nothing
    to do with branding, but enables an additional level of optimization above
    release (!is_debug). This might be better expressed as a tri-state
    (debug, release, official) but for historical reasons there are two
    separate flags.

    The GN documentation is pretty verbose about this. To sum up, to get full binary optimization you should enable is_official_build which will also disable is_debug in the background. This is what other projects would call a release build.

    Back to the main topic, I was running an official build distributed via icecc and stumbled on some compilation problems:

    clang: error: no such file or directory: /usr/lib/clang/7.0.0/share/cfi_blacklist.txt
    clang: error: no such file or directory: ../../tools/cfi/blacklist.txt
    clang: error: no such file or directory: /path/to/src/chrome/android/profiles/afdo.prof
    

    These didn’t happen when icecc build was disabled, so I was certain to have found some limitations in the distributed compiler. The icecc-chromium set of scripts was already disabling a number of clang cleanup/sanitize tools, so I decided to take the same approach. First, I checked the GN args that could be related to these errors and identified two:

    • is_cfi
      Current value (from the default) = true
      From //build/config/sanitizers/sanitizers.gni:53

    Compile with Control Flow Integrity to protect virtual calls and casts.
    See http://clang.llvm.org/docs/ControlFlowIntegrity.html

    TODO(pcc): Remove this flag if/when CFI is enabled in all official builds.

  • clang_use_default_sample_profile
    Current value (from the default) = true
    From //build/config/compiler/BUILD.gn:117

    Some configurations have default sample profiles. If this is true and
    clang_sample_profile_path is empty, we’ll fall back to the default.

    We currently only have default profiles for Chromium in-tree, so we disable
    this by default for all downstream projects, since these profiles are likely
    nonsensical for said projects.

    These two args were enabled, so I just disabled them and got rid of the compilation flags that were causing trouble: -fprofile-sample-use=/path/to/src/chrome/android/profiles/afdo.prof -fsanitize=cfi-vcall -fsanitize-blacklist=../../tools/cfi/blacklist.txt. I’ve learned that support for -fsanitize-blacklist is available in upstream icecc, but most distros don’t package it yet, so it’s safer to disable that.

    To sum up, if you are using icecc and you want to run an official build, you have to add a couple more GN args:

    clang_use_default_sample_profile = false
    is_cfi = false
    

    by Jacobo Aragunde Pérez at June 13, 2018 08:06 AM

    June 04, 2018

    Michael Catanzaro

    Security vulnerability in Epiphany Technology Preview

    If you use Epiphany Technology Preview, please update immediately and ensure you have revision 3.29.2-26 or newer. We discovered and resolved a vulnerability that allowed websites to access internal Epiphany features and thereby exfiltrate passwords from the password manager. We apologize for this oversight.

    The unstable Epiphany 3.29.2 release is the only affected release. Epiphany 3.29.1 is not affected. Stable releases, including Epiphany 3.28, are also not affected.

    There is no reason to believe that the issue was discovered or exploited by any attackers, but you might wish to change your passwords if you are concerned.

    by Michael Catanzaro at June 04, 2018 11:00 PM

    May 29, 2018

    Maksim Sisov

    Chromium with Ozone/Wayland: BlinkOn9, dmabuf and more refactorings…

    It has been quite a long while since we last blogged about our Chromium Ozone/Wayland effort, and there is a lot of news right now. Igalia participated in the BlinkOn9 conference and gave a talk (https://www.youtube.com/watch?v=DREywLVAVeo) about the Ozone/Wayland support, and had many discussions on how to continue with upstreaming desktop integration related patches for the Mus service.

    Even though we had been able to make Chromium run natively with the Wayland backend, and had upstreamed a lot of Ozone/Wayland related patches, some disagreements had to be resolved before proceeding with upstreaming. To be precise, the future of the Mus service was unclear and it was decided to abandon it in favor of a platform desktop integration directly into Aura without Mus. Thanks to a good Ozone design, we were able to quickly redesign our solution and upstream more patches to make it possible to run Chromium Ozone/Wayland from the ToT. Of course, some functionality patches are still missing, but the effort is going on steadily and we expect to have all the patches upstreamed in the following months. A lot of work still has to be done.

    Another point I would like to mention is our UI/GPU split effort. This is the effort to make Chromium Ozone/Wayland run with a separate gpu process. For that, a proper communication channel must be established between the browser process, where the Wayland connection is established, and the gpu process. We had tried a nested compositor approach, but decided to abandon it in favor of a dmabuf based approach, which will also allow us to have gpu native memory buffers, rasterization and, perhaps, zero-copy features enabled on some platforms. The idea behind the dmabuf approach is to reuse some of the Ozone/Drm codebase and, by utilizing drm render nodes, create shared GBM buffers, whose file descriptors are then shared with the browser process, where Wayland creates a dmabuf based wl_buffer and attaches it to wayland windows. At this stage, we already have a working PoC, and Ozone/Drm refactorings are being upstreamed now. It will also support presentation feedback if such a protocol is available on a system with a Wayland compositor.

    Last but not least, we would like to clarify the situation around the accelerated media decode in Chromium Ozone/Wayland mentioned in https://www.phoronix.com/scan.php?page=news_item&px=Chromium-Igalia-Wayland-V4L2VDA.
    To make it clear, we are not currently working on V4L2; rather, the patches have just been merged by an external contributor to simplify compiling our Chromium solution on ARM based boards, especially the Renesas M3 R-car board.

    by msisov at May 29, 2018 08:51 AM

    May 27, 2018

    Michael Catanzaro

    Thoughts on Flatpak after four months of Epiphany Technology Preview

    It’s been four months since I announced Epiphany Technology Preview — which I’ve been using as my main browser ever since — and five months since I announced the availability of a stable channel via Flatpak. For the most part, it’s been a good experience. Having the latest upstream development code for everything is wonderful and makes testing very easy. Any user can painlessly download and install either the latest stable version or the bleeding-edge development version on any Linux system, regardless of host dependencies, either via a couple clicks in GNOME Software or one command in the terminal. GNOME Software keeps it updated, so I always have a recent version. Thanks to this, I’m often noticing problems shortly after they’re introduced, rather than six months later, as was so often the case for me in the past. Plus, other developers can no longer complain that there’s a problem with my local environment when I report a bug they can’t reproduce, because Epiphany Technology Preview is a canonical distribution environment, a ground truth of sorts.

    There have been some rough patches where Epiphany Technology Preview was not working properly — sometimes for several days — due to various breaking changes, and the long time required to get a successful SDK build when it’s failing. For example, multimedia playback was broken for all of last week, due to changes in how the runtime is built. H.264 video is still broken, since the relevant Flatpak extension is only compatible with the 3.28 runtime, not with master. Opening files was broken for a while due to what turned out to be a bug in mutter that was causing the OpenURI portal to crash. I just today found another bug where closing a portal while visiting Slack triggered a gnome-shell crash. For the most part, these sorts of problems are expected by testers of unstable nightly software, though I’m concerned about the portal bugs because these affect stable users too. Anyway, these are just bugs, and all software has bugs: they get fixed, nothing special.

    So my impression of Flatpak is still largely positive. Flatpak does not magically make our software work properly in all host environments, but it hugely reduces the number of things that can go wrong on the host system. In recent years, I’ve seen users badly break Epiphany in various ways, e.g. by installing custom mimeinfo or replacing the network backend. With Flatpak, either of these would require an incredible amount of dedicated effort. Without a doubt, Flatpak distribution is more robust to user error. Another advantage is that we get the latest versions of OS dependencies, like GStreamer, libsoup, and glib-networking, so we can avoid the many bugs in these components that have been fixed in the years since our users’ LTS distros froze the package versions. I appreciate the desire of LTS distros to provide stability for users, but at the same time, I’m not impressed when users report issues with the browser that we fixed two years ago in one dependency or another. Flatpak is an excellent compromise solution to this problem: the LTS distro retains an LTS core, but specific applications can use newer dependencies from the Flatpak runtime.

    But there is one huge downside to using Flatpak: we lose crash reports. It’s at best very difficult, and often totally impossible, to investigate crashes when using Flatpak, and that’s frankly more important than any of the gains I mention above. For example, today Epiphany Technology Preview is crashing pretty much constantly. It’s surely a bug in WebKit, but that’s all I can figure out. The way to get a backtrace from a crashing app in Flatpak is to use coredumpctl to manually dump the core dump to disk, then launch a bash shell in the Flatpak environment and manually load it up in gdb. The process is manual, old-fashioned, primitive, and far too frustrating for me, so I wrote a little pyexpect script to automate this process for Epiphany, thinking I could eventually generalize it into a tool that would be useful for other developers. It’s a horrible hack, but it worked pretty well the day I tested it. I haven’t seen it work since. Debuginfo seems to be constantly broken, so I only see a bunch of ???s in my backtraces, and how are we supposed to figure out how to debug that? So I have no way to debug or fix the WebKit bug, because I can’t get a backtrace. The broken, inconsistent, or otherwise-unreliable debuginfo is probably just some bug that will be fixed eventually (and which I half suspect may be related to our recent freedesktop SDK upgrade. Update: Alex has debugged the debuginfo problem and it looks like that’s on track to be solved), but even once it is, we’re back to square one: it’s still too much effort to get the backtrace, relative to developing on the host system, and that’s a hard problem to solve. It requires tools that do not exist, which we have no plans to create, and which we do not even know how to create.
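
    For readers unfamiliar with the manual procedure, here is a rough sketch of the two steps it involves. It only builds the command lines and does not run them, and it is not the script mentioned above. The helper name backtrace_commands and every value passed to it (application ID, process name, binary path, core file location) are assumptions for illustration.

```python
import shlex


def backtrace_commands(app_id, comm, binary, core_path):
    """Return the two argv lists for the manual Flatpak backtrace procedure.

    All four arguments are supplied by the caller; nothing is executed here.
    """
    # Step 1: ask coredumpctl to write the latest core dump whose process
    # name matches `comm` to a file on disk.
    dump = ["coredumpctl", "dump", comm, "--output", core_path]
    # Step 2: open a bash shell inside the sandbox. --devel selects the SDK
    # runtime so that gdb is available; --filesystem exposes the core file.
    gdb_line = f"gdb {shlex.quote(binary)} {shlex.quote(core_path)}"
    shell = [
        "flatpak", "run", "--devel", "--command=bash",
        f"--filesystem={core_path}:ro",
        app_id, "-c", gdb_line,
    ]
    return dump, shell
```

    Even when both commands succeed, the backtrace is only as good as the debuginfo available inside the sandbox, which is the problem described above.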

    This isn’t working. I need to be able to effortlessly get a backtrace out of my application, with little or no more effort than running coredumpctl gdb as I would without Flatpak in the way. Every time I see Epiphany or WebKit crash, knowing I can do nothing to debug or investigate, I’m very sorely tempted to switch back to using Fedora’s Epiphany, or good old JHBuild. (I can’t promote BuildStream here, because BuildStream has the same problem.)

    So the developer experience is just not good, but set that aside: the main benefits of Flatpak are for users, not developers, after all. Now, what if users run into a crash, how can they report the bug? Crash reports are next to useless without a backtrace, and wise developers refuse to look at crash reports until a quality backtrace has been posted. So first we need to fix the developer experience to work properly, but even then, it’s not enough: we need an automatic crash reporter, along the lines of ABRT or apport, to make reporting crashes realistically-achievable for users, as it already is for distro-packaged apps. But this is a much harder problem to solve. Such a tool will require integration with coredumpctl, and I have not the faintest clue how we could go about making coredumpctl support container environments. Yet without this, we’re asking application developers to give up their most valuable data — crash reports — in order to use Flatpak.

    Eventually, if we don’t solve crash reporting, Epiphany’s experiment with Flatpak will have to come to an end, because that’s more important to me than the (admittedly-tremendous) benefits of Flatpak. I’m still hopeful that the ingenuity of the Flatpak community will find some solutions. We’ll see how this goes.

    by Michael Catanzaro at May 27, 2018 11:39 PM

    May 21, 2018

    Andy Wingo

    correct or inotify: pick one

    Let's say you decide that you'd like to see what some other processes on your system are doing to a subtree of the file system. You don't want to have to change how those processes work -- you just want to see what files those processes create and delete.

    One approach would be to just scan the file-system tree periodically, enumerating its contents. But when the file system tree is large and the change rate is low, that's not an optimal thing to do.
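
    For reference, the periodic-scan approach can be sketched in a few lines of Python. The function names snapshot and diff are made up for this illustration, and the sketch tracks regular file names only.

```python
import os


def snapshot(root):
    """Return the set of file paths under root, relative to root."""
    paths = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            paths.add(os.path.relpath(full, root))
    return paths


def diff(before, after):
    """Return (created, deleted) path lists between two snapshots."""
    return sorted(after - before), sorted(before - after)
```

    Every call to snapshot walks the whole tree, so the cost is proportional to the number of files rather than to the number of changes. A file created and deleted between two scans is never seen at all.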

    Fortunately, Linux provides an API to allow a process to receive notifications on file-system change events, called inotify. So you open up the inotify(7) manual page, and are greeted with this:

    With careful programming, an application can use inotify to efficiently monitor and cache the state of a set of filesystem objects. However, robust applications should allow for the fact that bugs in the monitoring logic or races of the kind described below may leave the cache inconsistent with the filesystem state. It is probably wise to do some consistency checking, and rebuild the cache when inconsistencies are detected.

    It's not exactly reassuring, is it? I mean, "you had one job" and all.

    Reading down a bit farther, I thought that with some "careful programming", I could get by. After a day of trying, I am now certain that it is impossible to build a correct recursive directory monitor with inotify, and I am not even sure that "good enough" solutions exist.

    pitfall the first: buffer overflow

    Fundamentally, inotify races the monitoring process with all other processes on the system. Events are delivered to the monitoring process via a fixed-size buffer that can overflow, and the monitoring process provides no back-pressure on the system's rate of filesystem modifications. With inotify, you have to be ready to lose events.

    This I think is probably the easiest limitation to work around. The kernel can let you know when the buffer overflows, and you can tweak the buffer size. Still, it's a first indication that perfect is not possible.
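
    To make the overflow case concrete: according to inotify(7), the queue length is bounded by /proc/sys/fs/inotify/max_queued_events, and an overflow shows up as an event with the IN_Q_OVERFLOW bit set and a watch descriptor of -1. Below is a minimal sketch that decodes the bytes read from an inotify file descriptor and checks for that flag. The function names are hypothetical, and the sketch assumes the native struct inotify_event layout from <sys/inotify.h>.

```python
import os
import struct

# struct inotify_event { int wd; uint32_t mask, cookie, len; char name[]; }
EVENT_HEADER = struct.Struct("iIII")
IN_CREATE = 0x00000100
IN_DELETE = 0x00000200
IN_Q_OVERFLOW = 0x00004000


def parse_events(buf):
    """Yield (wd, mask, cookie, name) for each event in a raw read buffer."""
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        raw_name = buf[offset:offset + name_len]
        offset += name_len
        # The kernel pads the name with NUL bytes for alignment.
        yield wd, mask, cookie, os.fsdecode(raw_name.split(b"\0", 1)[0])


def lost_events(events):
    """Return True if any event reports that the kernel queue overflowed."""
    return any(mask & IN_Q_OVERFLOW for _wd, mask, _cookie, _name in events)
```

    Once lost_events returns True, the only safe reaction is to assume the cached state is stale and rescan the tree.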

    pitfall the second: now you see it, now you don't

    This one is the real kicker. Say you get an event that says that a file "frenemies.txt" has been created in the directory "/contacts/". You go to open the file -- but is it still there? By the time you get around to looking for it, it could have been deleted, or renamed, or maybe even created again or replaced! This is a TOCTTOU race, built into the inotify API. It is literally impossible to use inotify without this class of error.

    The canonical solution to this kind of issue in the kernel is to use file descriptors instead. Instead of or possibly in addition to getting a name with the file change event, you get a descriptor to a (possibly-unlinked) open file, which you would then be responsible for closing. But that's not what inotify does. Oh well!
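
    Given that inotify hands over only a name, about the best a monitor can do is try to open that name and treat failure as a normal outcome. Here is a sketch of that defensive pattern, with a hypothetical helper name. It narrows the race but cannot close it.

```python
import os


def open_reported_file(directory, name):
    """Try to open a file named in an event; return an fd, or None if gone.

    Even on success, the descriptor may refer to a different file than the
    one the event described: the name could have been deleted and re-created
    in between.
    """
    try:
        return os.open(os.path.join(directory, name), os.O_RDONLY)
    except (FileNotFoundError, NotADirectoryError):
        return None
```

    After a successful open, further queries such as os.fstat should go through the descriptor rather than the name, so that at least they all refer to the same file.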

    pitfall the third: race conditions between inotify instances

    When you inotify a directory, you get change notifications for just that directory. If you want to get change notifications for subdirectories, you need to open more inotify instances and poll on them all. However now you have N² problems: as poll and the like return an unordered set of readable file descriptors, each with their own ordering, you no longer have access to a linear order in which changes occurred.

    It is impossible to build a recursive directory watcher that definitively says "ok, first /contacts/frenemies.txt was created, then /contacts was renamed to /peeps, ..." because you have no ordering between the different watches. You don't know that there was ever even a time that /contacts/frenemies.txt was an accessible file name; it could have been only ever openable as /peeps/frenemies.txt.
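
    The ambiguity can be illustrated without touching inotify at all. If each event stream is ordered only with respect to itself, then every merge that preserves the per-stream order is a possible history. The toy sketch below enumerates those merges; the function name and the event strings are made up.

```python
def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    # Either the next event overall came from a ...
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    # ... or it came from b.
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


root_events = ["rename /contacts -> /peeps"]
subdir_events = ["create frenemies.txt"]
histories = list(interleavings(root_events, subdir_events))
```

    With one event on each stream, histories already holds two candidate orderings, and nothing in the events themselves says which one happened. The number of candidates grows combinatorially as the streams get longer.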

    Of course, this is the most basic ordering problem. If you are building a monitoring tool that actually wants to open files -- good luck bubster! It literally cannot be correct. (It might work well enough, of course.)

    reflections

    As far as I am aware, inotify came out to address the needs of desktop search tools like the belated Beagle (11/10 good pupper just trying to get his pup on). Especially in the days of spinning metal, grovelling over the whole hard-drive was a real non-starter, especially if the search database needs to be up-to-date.

    But after looking into inotify, I start to see why someone at Google said that desktop search was in some ways harder than web search -- I mean we all struggle to find files on our own machines, even now, 15 years after the whole dnotify/inotify thing started. Part of it is that, given the choice between supporting reliable, fool-proof file system indexes on the one hand, and overclocking the IOPS benchmarks on the other, the kernel gave us inotify. I understand it, but inotify still sucks.

    I dunno about you all but whenever I've had to document such an egregious uncorrectable failure mode as any of the ones in the inotify manual, I have rewritten the software instead. In that spirit, I hope that some day we shall send inotify to the pet cemetery, to rest in peace beside Beagle.

    by Andy Wingo at May 21, 2018 02:29 PM