Planet Igalia

August 18, 2014

Andy Wingo

on gnu and on hackers

Greetings, gentle hackfolk. 'Tis a lovely waning light as I write this here in Munich, Munich the green, Munich full of dogs and bikes, Munich the summer-fresh.

Last weekend was the GNU hackers meeting up in Garching, a village a few metro stops north of town. Garching is full of quiet backs and fruit trees and small gardens bursting with blooms and beans, as if an eddy of Christopher Alexander whirled out and settled into this unlikely place. My French suburb could learn a thing or ten. We walked back from the hack each day, ate stolen apples and corn, and schemed the nights away.

The program of GHM this year was great. It started off with a bang, as GNUnet hackers Julian Kirsch and Christian Grothoff broke the story that the Five-Eyes countries (US, UK, Canada, Australia, NZ) regularly port-scan the entire internet, looking for vulnerabilities. They then proceed to exploit those vulnerabilities, in regular hack-a-thons, trying to own as many boxes in as many countries as they can. They then use them as launchpads for attacks and for exfiltration of information from other networks.

The presentation that broke this news also proposed a workaround based on port-knocking, Knock. Knock embeds the hash of a pre-shared key with some other information into the 32-bit initial sequence number of a TCP connection. Unlike previous incarnations of port-knocking, Knock also authenticates the first n payload bytes, so that the connection isn't vulnerable to hijacking (e.g. via GCHQ "quantum injection", where well-placed evil routers race the true destination server to provide the first response packet of a connection). Leaking the pwn-the-internet documents with Laura Poitras at the same time as the Knock unveiling was a pretty slick move!
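
To make that concrete, here is a toy sketch of the general idea (deriving the 32-bit initial sequence number from a pre-shared key plus per-connection data), using a stand-in hash rather than Knock's actual cryptographic construction, which lives in a kernel patch:

/* Toy sketch of the Knock idea: derive the TCP initial sequence number
 * from a pre-shared key plus per-connection data, so that only peers
 * that know the key produce "valid" connections.  FNV-1a here is just
 * a stand-in; the real Knock uses its own cryptographic construction
 * inside the kernel's TCP stack. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t fnv1a(const void *data, size_t len, uint32_t h)
{
    const unsigned char *p = data;
    while (len--) {
        h ^= *p++;
        h *= 16777619u;  /* FNV prime */
    }
    return h;
}

int main(void)
{
    const char *psk = "our-pre-shared-key";            /* known to both peers */
    uint32_t src_ip = 0xc0a80001, dst_ip = 0xc0a80002; /* example addresses */

    uint32_t isn = fnv1a(psk, strlen(psk), 2166136261u);
    isn = fnv1a(&src_ip, sizeof src_ip, isn);
    isn = fnv1a(&dst_ip, sizeof dst_ip, isn);
    printf("initial sequence number: 0x%08x\n", isn);
    return 0;
}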

I was also really impressed by Christian's presentation on the GNU name system. GNS is a replacement for DNS whose naming structure mirrors our social naming structure. For example, www.alice.gnu would be my friend Alice, and www.alice.bob.gnu would be Alice's friend Bob. With some integration, it can work on normal desktops and mobile devices. There are lots more details, so check gnunet.org/gns for more information.

Of course, a new naming system does need some operating system support. In this regard Ludovic Courtès' update on Guix was particularly impressive. Guix is a Nix-like system whose goal is reproducible, user-controlled GNU/Linux systems. A couple years ago I didn't think much of it, but now it's actually booting on raw hardware, not just under virtualization, and things seem to be rolling forth as if on rails. Guix manages to be technically innovative at the same time as being GNU-centered, so it can play a unique role in propagating GNU work like GNS.

and yet.

But now, as the dark clouds race above and the light truly goes, we arrive at the point I really wanted to discuss. GNU has a terrible problem with gender balance, and with diversity in general. Of about 70 attendees at this recent GHM, only two were women. We talk the talk about empowering users and working for freedom but, to a first approximation, it's really just a bunch of dudes that think the exact same things.

There are many reasons for this, of course. Some people like to focus on what's called the "pipeline problem" -- that there aren't as many women coming out of computer science programs as men. While true, the proportion of women CS graduates is much higher than the proportion of women at GHM events, so something must be happening in between. And indeed, the attrition rates of women in the tech industry are higher than those of men -- often because we men make it a needlessly unpleasant place for women to be. Sometimes it's even dangerous. The incidence of sexual harassment and assault in tech, especially at events, is something terrible. Scroll down in that linked page to June, July, and August 2014, and ask yourself whether that's OK. (Hint: hell no.)

And so you would think that people who consider themselves to be working for some abstract liberatory principle, as GNU is, would be happy to take a stand against this kind of asshaberdashery. There you would be wrong. Voilà a timeline of an incident.

timeline

March 2014
Someone at the FSF asks a GHM organizer to add an anti-harassment policy to GHM. The organizer does so and puts one on the web page, copying the text from Libreplanet's web site. The policy posted is:

Offensive or overly explicit sexual language or imagery is inappropriate during the event, including presentations.

Participants violating these rules may be sanctioned or expelled from the meeting at the discretion of the organizers.

Harassment includes offensive comments related to gender, sexual orientation, disability, appearance, body size, race, religion, sexual images in public spaces, deliberate intimidation, stalking, harassing photography or recording, persistent disruption of talks or other events, repeated unsolicited physical contact, or sexual attention.

Monday, 11 August 2014
The first mention of the policy is made on the mailing list, in a mail with details about the event that also includes the line:

An anti-harrasment policy applies at GHM: http://gnu.org/ghm/policy.html

Monday, 11 August 2014
A speaker writes the list to say:

Since I do not desire to be denounced, prosecuted and finally sanctioned or expelled from the event (especially considering the physical pain and inconvenience of attending due to my very recent accident) I withdraw my intention to lecture "Introducing GNU Posh" at the GHM, as it is not compliant with the policy described in the page above.

Please remove the talk from the official schedule. Thanks.

PS: for those interested, I may perform the talk off-event in case we find a suitable place, we will see..

The resulting thread goes totes clownshoes and rages on up until the event itself.
Friday, 15 August 2014
Sheepish looks between people that flamed each other over the internet. The hallway-track discussion starts up quickly, though. Disagreeing people are still not rational enough to have a conversation (myself included).
Saturday, 16 August 2014
In the morning and lunch break, people start to really discuss the issues (spontaneously). It turns out that the original mail that sparked the thread was based, to an extent, on a misunderstanding: that "offensive or overly explicit sexual language or imagery" was parsed (by a few non-native English speakers) as "offensive language or ...", which people thought was too broad. Once this misunderstanding was removed, there were still people that thought that any policy at all was unneeded, and others that were concerned that someone could say something without intending offense, but then be kicked out of the event. Arguments back and forth. Some people wonder why others can be hurt by "just words". Some discussion of rape culture as continuum between physical violence and cultural tropes. One of the presentations after lunch is by a GNU hacker. He starts his talk by stating his hope that he won't be seen as "offensive or part of rape culture or something". His microphone wasn't on, so once he gets it on he repeats the joke. I stomp out, slam the door, and tweet a few angry things. Later in the evening the presenter and I discuss the issue. He apologizes to me.
Sunday, 17 August 2014
A closed meeting for GNU maintainers to round up the state of GNU and GHM. No women present. After dealing with a number of banalities, we finally broach the topic of the harassment policy. More opposition to the policy Sunday than Saturday lunch. Eventually a proposal is made to replace "offensive" with "disparaging", and people start to consent to that. We run out of time and have to leave; resolution unclear.
Monday, 18 August 2014
GHM organizer updates the policy to remove the words "offensive or" from the beginning of the harassment policy.

I think anyone involved would agree on this timeline.

thoughts

The problems seen over the last week with this anti-harassment policy are entirely to do with the men. It was a man who decided that he should withdraw his presentation because he found it offensive that he could be perceived as offensive. It was men who willfully misread the policy, raising objections like "I should have the right to offend Microsoft supporters" and "if I say the wrong word I will go to jail", and who raised the ignorant, vacuous spectre of "political correctness" to argue that they should be able to say what they want, in a GNU conference, no matter who they hurt, no matter what the effects. That they are able to argue this position from a status-quo perspective is the definition of privilege.

Now, there is ignorance, and there is malice. Both must be opposed, but the former may find a cure. Although I didn't begin my contribution to the discussion in the smoothest way, linking to an amusing article on the alchemy of intent that is probably misunderstood, one of the main points of the discussion ended up being about intent. I know Ralph (say) and Ralph is a great person, so how could it be that anything Ralph would say could be a slur? You know he wouldn't mean it like that!

To that, we of course have to say that as GNU grows, not everyone knows that Ralph is a great person. In the end what would it mean for someone to be anti-racist but who says racist things all the time? You would have to call them racist, right? Or if you just said something one time, but refused to own up to your error, and instead persisted in repeating a really racist slur -- you would be racist right? But I know you... But the thing that you said...

But then to be honest I wonder sometimes. If someone repeats a joke trivializing rape culture, after making sure that the microphone is picking up his words -- I mean, that's a misogynist action, right? Put aside the question of whether the person is, in essence, misogynist or not. They are doing misogynist things. How do I know that this person isn't going to do it again, private apology or not?

And how do I know that this community isn't going to permit it again? That remark was made to a room of 40 dudes or so. Not one woman was present. Although there was some discussion afterwards, if people left because of the phrase, it was only two or three. How can we then say that GNU is not a misogynist community -- is not a community that tolerates misogyny?

Given all of this, what do you expect? Do you expect to grow GNU into a larger organization in the future, rich and strong and diverse? If that's not your goal, you are not my colleague. And if it is your goal, why do you put up with this kind of behavior?

The discussion on intent and offense seems to have had its effect in the removal of "offensive or" from the anti-harassment policy language. I think it's terrible, though -- if you don't trust someone who says they were offended by sexual language or imagery, why would you trust them when they report sexual harassment or assault? I can only imagine this leading to some sort of argument where the person who has had the courage to report such an incident finds himself or herself in a public witness box, justifying that the incident was offensive. "I'm sorry my words offended you, but that was not my intent, and anyway the words were not offensive." Lolnope.

There were so many other wrong things about this -- a suggestion that we the GNU cabal (lamentably, a bunch of dudes) should form a committee to make the wording less threatening to us; that we're just friends anyway; that illegal things are illegal anyway... it's as if the Code of Conduct FAQ that Ashe Dryden assembled were a Bingo card and we all lost.

Finally, I don't think I trusted the organizers enough with this policy. Both organizers expressed skepticism about the policy in such terms that, if I personally hadn't won the privilege lottery (white male "western" hetero already-established GNU maintainer), I wouldn't feel comfortable bringing up a concern to them.

In the future I will not be attending any conferences without strong, consciously applied codes of conduct, and I enjoin you to do the same.

conclusion


Propagandhi, "Refusing to Be a Man", Less Talk, More Rock (1996)

There is no conclusion yet -- working for the better world we all know is possible is a process, as people and as a community.

To outsiders, to outsiders everywhere, please keep up the vocal criticisms. Thank you for telling your story.

To insiders, to insiders everywhere, this is your problem. The problem is you. Own it.

by Andy Wingo at August 18, 2014 09:07 PM

August 08, 2014

Iago Toral

Diving into Mesa

Recap

In my last post I gave a quick introduction to the Linux graphics stack. There I explained how what we call a graphics driver in Linux is actually a combination of three different drivers:

  • the user space X server DDX driver, which handles 2D graphics.
  • the user space 3D OpenGL driver, that can be provided by Mesa.
  • the kernel space DRM driver.

Now that we know where Mesa fits let’s have a more detailed look into it.

DRI drivers and non-DRI drivers

As explained, Mesa handles 3D graphics by providing an implementation of the OpenGL API. Mesa OpenGL drivers are usually called DRI drivers too. Remember that, after all, the DRI architecture was brought to life precisely to enable efficient implementation of OpenGL drivers in Linux and, as I introduced in my previous post, DRI/DRM are the building blocks of the OpenGL drivers in Mesa.

There are other implementations of the OpenGL API available too. Hardware vendors that ship drivers for Linux will provide their own implementation of the OpenGL API, usually in the form of a binary blob. For example, if you have an NVIDIA GPU and install NVIDIA’s proprietary driver, this will install its own libGL.so.

Notice that it is possible to create graphics drivers that do not follow the DRI architecture in Linux. For example, the NVIDIA proprietary driver installs a kernel module that implements similar functionality to DRM but with a different API designed by NVIDIA, and obviously, their corresponding user space drivers (DDX and OpenGL) will use this API instead of DRM to communicate with the NVIDIA kernel space driver.

Mesa, the framework

You have probably noticed that when I talk about Mesa I usually say ‘drivers’, in plural. That is because Mesa itself is not really a driver, but a project that hosts multiple drivers (that is, multiple implementations of the OpenGL API).

Indeed, Mesa is best seen as a framework for OpenGL implementors that provides abstractions and code that can be shared by multiple drivers. Obviously, there are many aspects of an OpenGL implementation that are independent of the underlying hardware, so these can be abstracted and reused.

For example, if you are familiar with OpenGL you know it provides a state-based API. This means that many API calls do not have an immediate effect; they only modify the values of certain variables in the driver and do not require the driver to push these new values to the hardware immediately. Indeed, that will usually happen later, when we actually render something by calling glDrawArrays() or a similar API: it is at that point that the driver will configure the 3D pipeline for rendering according to all the state that has been set by the previous API calls. Since these APIs do not interact with the hardware, their implementation can be shared by multiple drivers, and then each driver, in its implementation of glDrawArrays(), can fetch the values stored in this state and translate them into something meaningful for the hardware at hand.
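
As a minimal illustration, consider the snippet below (it assumes a GL context has already been created and made current by some GLX/EGL glue): everything before the draw call just records state in the driver.

/* Assumes an OpenGL context has already been created and made current
 * (e.g. via GLX or EGL); error checking omitted for brevity. */
#include <GL/gl.h>

static const GLfloat verts[] = {
    -0.5f, -0.5f,
     0.5f, -0.5f,
     0.0f,  0.5f,
};

void draw(void)
{
    /* These calls only record state inside the driver: */
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(2, GL_FLOAT, 0, verts);
    glClearColor(0.0f, 0.0f, 0.0f, 1.0f);

    /* Only now does the driver configure the pipeline and kick the GPU: */
    glClear(GL_COLOR_BUFFER_BIT);
    glDrawArrays(GL_TRIANGLES, 0, 3);
}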

As such, Mesa provides abstractions for many things and even complete implementations for multiple OpenGL APIs that do not require interaction with the hardware, at least not immediate interaction.

Mesa also defines hooks for the parts where drivers may need to do hardware specific stuff, for example in the implementation of glDrawArrays().

Looking into glDrawArrays()

Let’s see an example of these hooks into a hardware driver by inspecting the stacktrace produced from a call to glDrawArrays() inside Mesa. In this case, I am using the Mesa Intel DRI driver and I am calling glDrawArrays() from a function named render() in my program. This is the relevant part of the stacktrace:

brw_upload_state () at brw_state_upload.c:651
brw_try_draw_prims () at brw_draw.c:483
brw_draw_prims () at brw_draw.c:578
vbo_draw_arrays () at vbo/vbo_exec_array.c:667
vbo_exec_DrawArrays () at vbo/vbo_exec_array.c:819
render () at main.cpp:363

Notice that glDrawArrays() is actually vbo_exec_DrawArrays(). What is interesting about this stack is that vbo_exec_DrawArrays() and vbo_draw_arrays() are hardware independent and reused by many drivers inside Mesa. If you don’t have an Intel GPU like I do, but you also use Mesa, your backtrace should be similar. These generic functions would usually do things like checking for API usage errors, reformatting inputs in a way that is more appropriate for later processing, or fetching additional information from the current state that will be needed to implement the actual operation in the hardware.

At some point, however, we need to do the actual rendering, which involves configuring the hardware pipeline according to the command we are issuing and the relevant state we have set in prior API calls. In the stacktrace above this starts with brw_draw_prims(). This function is part of the Intel DRI driver; it is the hook where the Intel driver does the work required to configure the Intel GPU for drawing and, as you can see, it will later call something named brw_upload_state(), which will upload a bunch of state to the hardware to do exactly this, like configuring the various shader stages required by the current program, etc.

Registering driver hooks

In future posts we will discuss how the driver configures the pipeline in more detail, but for now let’s just see how the Intel driver registers its hook for the glDrawArrays() call. If we look at the stacktrace, and knowing that brw_draw_prims() is the hook into the Intel driver, we can just inspect how it is called from vbo_draw_arrays():

static void
vbo_draw_arrays(struct gl_context *ctx, GLenum mode, GLint start,
                GLsizei count, GLuint numInstances, GLuint baseInstance)
{
   struct vbo_context *vbo = vbo_context(ctx);
   (...)
   vbo->draw_prims(ctx, prim, 1, NULL, GL_TRUE, start, start + count - 1,
                   NULL, NULL);
   (...)
}

So the hook is draw_prims() inside vbo_context. Doing some trivial searches in the source code, we can see that this hook is set up in brw_draw_init() like this:

void brw_draw_init( struct brw_context *brw )
{
   struct gl_context *ctx = &brw->ctx;
   struct vbo_context *vbo = vbo_context(ctx);
   (...)
   /* Register our drawing function:
    */
   vbo->draw_prims = brw_draw_prims;
   (...)
}

Let’s put a breakpoint there and see when Mesa calls into that:

brw_draw_init () at brw_draw.c:583
brwCreateContext () at brw_context.c:767
driCreateContextAttribs () at dri_util.c:435
dri2_create_context_attribs () at dri2_glx.c:318
glXCreateContextAttribsARB () at create_context.c:78
setupOpenGLContext () at main.cpp:411
init () at main.cpp:419
main () at main.cpp:477

So there it is: Mesa (unsurprisingly) calls into the Intel DRI driver when we set up the OpenGL context, and it is then that the driver registers various hooks, including the one for drawing primitives.

We could do a similar thing to see how the driver registers its hook for context creation. We would see that the Intel driver (as well as other drivers in Mesa) assigns the hooks it needs to a global variable like this:

static const struct __DriverAPIRec brw_driver_api = {
   .InitScreen           = intelInitScreen2,
   .DestroyScreen        = intelDestroyScreen,
   .CreateContext        = brwCreateContext,
   .DestroyContext       = intelDestroyContext,
   .CreateBuffer         = intelCreateBuffer,
   .DestroyBuffer        = intelDestroyBuffer,
   .MakeCurrent          = intelMakeCurrent,
   .UnbindContext        = intelUnbindContext,
   .AllocateBuffer       = intelAllocateBuffer,
   .ReleaseBuffer        = intelReleaseBuffer
};

PUBLIC const __DRIextension **__driDriverGetExtensions_i965(void)
{
   globalDriverAPI = &brw_driver_api;

   return brw_driver_extensions;
}

This global is then used throughout the DRI implementation in Mesa to call into the hardware driver as needed.
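
The mechanism itself is just a table of function pointers that generic code dispatches through. Here is a toy, self-contained illustration of the pattern (the names are made up; Mesa’s actual types are more involved):

#include <stdio.h>

/* Toy illustration of the hook-table pattern; names are made up. */
struct driver_api {
    void (*create_context)(void);
    void (*destroy_context)(void);
};

static void hw_create_context(void)  { printf("hw: create context\n"); }
static void hw_destroy_context(void) { printf("hw: destroy context\n"); }

/* The hardware driver fills in a table of hooks... */
static const struct driver_api hw_driver_api = {
    .create_context  = hw_create_context,
    .destroy_context = hw_destroy_context,
};

/* ...and hands it to the generic code through a global. */
static const struct driver_api *global_driver_api;

int main(void)
{
    global_driver_api = &hw_driver_api;   /* like __driDriverGetExtensions_i965() */
    global_driver_api->create_context();  /* generic code calls into the driver */
    global_driver_api->destroy_context();
    return 0;
}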

We can see, then, that there are two types of hooks: the ones needed to link the driver into the DRI implementation (which are the main entry points of the driver in Mesa), and the hooks the driver adds for tasks related to the hardware implementation of OpenGL bits, typically registered at context creation time.

In order to write a new DRI driver one would only have to write implementations for all these hooks; the rest is already implemented in Mesa and reused across multiple drivers.

Gallium3D, a framework inside a framework

Currently, we can split Mesa DRI drivers in two kinds: the classic drivers (not based on the Gallium3D framework) and the new Gallium drivers.

Gallium3D is part of Mesa and attempts to make 3D driver development easier and more practical than it was before. For example, classic Mesa drivers are tightly coupled with OpenGL, which means that implementing support for other APIs (like Direct3D) would pretty much require writing a completely new implementation/driver. The Gallium3D framework addresses this by providing an API that exposes hardware functions as present in modern GPUs, rather than focusing on a specific API like OpenGL.

Other benefits of Gallium include, for example, support for various operating systems, by separating out the parts of the driver that rely on specific aspects of the underlying OS.

In recent years we have seen a lot of drivers moving to the Gallium infrastructure, including nouveau (the open source driver for NVIDIA GPUs), various radeon drivers, some software drivers (swrast, llvmpipe) and more.


Gallium3D driver model (image via wikipedia)

Although there were some efforts to port the Intel driver to Gallium in the past, development of the Intel Gallium drivers (i915g and i965g) is stalled now as far as I know. Intel is focusing on the classic version of the drivers instead. This is probably because it would take a large amount of time and effort to bring the current classic driver to Gallium with the same features and stability that it has in its current classic form for many generations of Intel GPUs. Also, there is a lot of work going on to add support for new OpenGL features to the driver at the moment, which seems to be the priority right now.

Gallium and LLVM

As we will see in more detail in future posts, writing a modern GPU driver involves a lot of native code generation and optimization. Also, OpenGL includes the OpenGL Shading Language (GLSL), which directly requires a GLSL compiler to be available in the driver too.

It is no wonder then that Mesa developers thought that it would make sense to reuse existing compiler infrastructure rather than building and using their own: enter LLVM.

By introducing LLVM into the mix, Mesa developers expect to bring new and better optimizations to shaders and produce better native code, which is critical to performance.

This would also make it possible to eliminate a lot of code from Mesa and/or the drivers. Indeed, Mesa has its own complete implementation of a GLSL compiler, which includes a GLSL parser, compiler and linker as well as a number of optimizations, both for abstract representations of the code, in Mesa, and for the actual native code for a specific GPU, in the actual hardware driver.

The way that Gallium plugs in LLVM is simple: Mesa parses GLSL and produces an LLVM intermediate representation of the shader code that it can then pass to LLVM, which will take care of the optimization. The role of hardware drivers in this scenario is limited to providing LLVM backends that describe their respective GPUs (instruction set, registers, constraints, etc.) so that LLVM knows how it can do its work for the target GPU.

Hardware and Software drivers

Even today I see people who believe that Mesa is just a software implementation of OpenGL. If you have read my posts so far it should be clear that this is not true: Mesa provides multiple implementations (drivers) of OpenGL; most of these are hardware-accelerated drivers, but Mesa also provides software drivers.

Software drivers are useful for various reasons:

  • For development and testing purposes, when you want to take the hardware out of the equation. From this point of view, a software implementation can provide a reference for expected behavior that is not tied to or constrained by any particular hardware. For example, if you have an OpenGL program that does not work correctly, you can run it with the software driver: if it works fine then you know the problem is in the hardware driver; otherwise you can suspect that the problem is in the application itself.
  • To allow execution of OpenGL in systems that lack 3D hardware drivers. It would obviously be slow, but in some scenarios it could be sufficient and it is definitely better than not having any 3D support at all.

I initially intended to cover more stuff in this post, but it is already getting long enough so let’s stop here for now. In the next post we will discuss how we can check and change the driver in use by Mesa, for example to switch between a software and hardware driver, and we will then start looking into Mesa’s source code and introduce its main modules.

by Iago Toral at August 08, 2014 10:31 AM

August 06, 2014

Carlos García Campos

GTK+ 3 Plugins in WebKitGTK+ and Evince Browser Plugin

GTK+ 3 plugins in WebKitGTK+

The WebKit2 GTK+ API has always been GTK+ 3 only, but WebKitGTK+ still had a hard dependency on GTK+ 2 because of the plugin process. Some popular browser plugins, like Flash or Java, use GTK+ 2 unconditionally (and it seems they are not going to be ported to GTK+ 3, at least not in the short term). These plugins stopped working in Epiphany when it switched to GTK+ 3 and started to work again when Epiphany moved to WebKit2.

To support GTK+ 2 plugins we had to build the plugin process with GTK+ 2, but also some parts of WebCore and WebKit2 (the ones depending on GTK+ and used by the plugin process) were built twice. As a result we had a WebKitPluginProcess binary of ~40MB that was always used for all the plugins. This kind of made sense, since there were no plugins using GTK+ 3, and the GTK+ 2 dependency was harmless for plugins not using GTK+ at all. However, we realized we were making a rule for the exception, since most plugins don’t even use GTK+, and there weren’t plugins using GTK+ 3 because they were not supported by any browser (a chicken-and-egg problem).

Since WebKitGTK+ 2.5.1 we have two binaries for the plugin process: WebKitPluginProcess2, which is exactly the same 40MB binary using GTK+ 2 that we have always had, but that is now only used to load plugins using GTK+ 2; and WebKitPluginProcess, a 7.4K binary that is now used by default for everything except loading plugins that use GTK+ 2. And since it links to GTK+ 3, it can load plugins using GTK+ 3 as well. Another side effect is that we can now make GTK+ 2 optional: in that case WebKitPluginProcess2 wouldn’t be built and only plugins using GTK+ 2 would be unsupported.

Evince Browser Plugin

For a long time, we have maintained that PDF documents shouldn’t be opened inside the browser, but downloaded and then opened by the default document viewer. But then the GNOME design team came up with new mockups for Epiphany where everything was integrated in the browser, including PDF documents. It’s something all the major browsers do nowadays, using different approaches though (a custom PDF plugin inside the web engine, JavaScript libraries, etc.).

At the WebKitGTK+ hackfest in 2012 we started to think about how to implement the integrated document reading in Epiphany based on the design mockups. We quickly discarded the idea of implementing it as an NPAPI plugin, because that would mean we had to use a very old evince version using GTK+ 2. We can’t implement it inside WebKit using libevince because it’s a GPL library, so the first approach was to implement it inside Epiphany using libevince. I wrote a first patch, mostly a proof-of-concept hack, that added a new view widget based on EvView to be used instead of a WebView when a document supported by evince was requested. This approach has a lot of limitations, since it only works when the main resource is a document, but not for documents embedded in an HTML page or an iframe, and it has a lot of integration problems that make it quite difficult to maintain inside Epiphany. All of these issues would be solved by implementing it as an NPAPI plugin, and it wouldn’t require any change in Epiphany. Now that WebKitGTK+ supports GTK+ 3 plugins, there’s no reason not to do so.

Epiphany Evince Plugin

Thanks to a project in Igalia I’ve been able to work on it, and today I’ve landed an initial implementation of the browser plugin to Evince git master. It’s only a first implementation (written in C++11) with the basic features (page navigation, view modes, zoom and printing), and a very simple UI that needs to be updated to match the mockups. It can be disabled at compile time like all other frontends inside Evince (thumbnailer, previewer, nautilus properties page).

Epiphany embedded PDF document
Epiphany standalone PDF document

Another advantage of being an NPAPI plugin is that it’s scriptable, so you can control the viewer using JavaScript.

Epiphany scriptable PDF

And you can pass initial parameters (like current page, zoom level, view mode, etc.) from the HTML <object> tag.

<object data="test.pdf" type="application/pdf" width="600" height="300" 
                currentPage="2" zoomMode="fit-page" continuous="false">
  The pdf could not be rendered.
</object>

You can even hide the default toolbar and build your own one using HTML and JavaScript.

by carlos garcia campos at August 06, 2014 10:45 AM

August 05, 2014

Víctor Jáquez

GUADEC 2014

Friday the 25th of July, the National Day of Galicia, started very early because I had to travel to Strasbourg, official seat of the European Parliament, not for any political duty, but for the GNOME Users and Developers European Conference, the GUADEC!

My last GUADEC was in The Hague, in 2010, though in 2012, when it was hosted in Coruña, I attended a couple of talks. Nonetheless, it had been a long time since I had met the community, and it was a pleasure to meet them again.

My biggest impression was the number of attendees. I remember the times in Turkey or in Gran Canaria when hundreds packed the auditoriums and halls. This time the audience was smaller, but that is a good thing, since now you can easily get in touch with the core developers who drive and move the project.

We, Igalia, as sponsors, had a banner in the main room and a table in a corridor. Here is a picture of Juan to prove it:


Juan at Igalia’s booth.

I also ran across Emmanuele Bassi, setting up a booth to show off the Endless Mobile OS, based on GNOME 3. The people at GUADEC welcomed its user experience and the purpose of the project with enthusiasm. Personally, I love it. If you don’t know the project, you should visit their web site.

The first talk I attended was the classic GStreamer update by Sebastian Dröge and Tim Müller. They talked about the new features in GStreamer 1.4. Neat stuff in there. I like the new pace of GStreamer, compared to the old, stagnant evolution of the 0.10 series.

Afterwards, Jim Hall gave us a keynote about usability in GNOME. I really enjoyed that talk. He studied the usability of several GNOME applications such as Nautilus (aka Files), GEdit, Epiphany (aka Web), etc., as part of his Master’s research. It was a pleasure to hear that Epiphany is regarded as having good usability.

After lunch I was in the main room hearing Sylvain Le Bon about sustainable business models for free software. He talked about crowd funding, community management and related stuff.

The next talk was by Christian Hergert, about his project GOM, an object mapper from GObjects to SQLite, which is used in Grilo to prevent SQL injection in plugins that use SQLite.

Later on, Marina Zhurakhinskaya gave us one of the best talks of the GUADEC: How to be an ally to women in tech. I encourage you to download the slides and read them. There I learned about the unicorn law and the impostor syndrome.

The day closed with the GNOME Foundation’s teams reports.

Sunday came and I arrived at the venue for the second keynote: Should We Teach The Robot To Kill by Nathan Willis. In his particular style, Nathan presented a general survey of GNU/Linux in the automotive industry.

Next came one of the main talks from Igalia: Web 3.12: a browser to make us proud, presented by Edu. It was fairly good. Edu showed us the latest developments in WebKitGTK+ and Epiphany (aka Web). There were quite a few questions at the end of the talk. Epiphany nowadays is actively used by a lot of people in the community.

Afterwards, Zeeshan presented GNOME Boxes, a user interface for running virtual machines. Later on Alberto Ruiz showed us Fleet Commander, a web application to handle large desktop deployments.

And we took our classic group photo:


Group photo

That Sunday closed with the interns’ lightning talks. Cool stuff is being cooked by them.

On Monday I was in the venue when Emmanuele Bassi talked to us about GSK, the GTK+ Scene Graph Kit, his new project, which uses as a starting point the lessons learned in Clutter. Its objective is to have a scene graph library fully integrated in GTK+.

After lunch and the second part of the Foundation’s Annual General Meeting, Benjamin Otte gave an amusing talk about the CSS implementation in GTK+. Later, Jasper St. Pierre talked about the Wayland support in GNOME.

When the coffee break ended, the almighty Žan Doberšek gave the other talk from Igalia: Wayland support in WebKit2GTK+.

On the last day of GUADEC, I attended Bastien Nocera’s talk: Hardware integration, the GNOME way, where he reviewed the history of his contributions to GNOME related to hardware integration, and the goal of nicely supporting most hardware in GNOME: compasses, gyroscopes, et cetera.

Afterwards, Owen Taylor talked to us about GNOME’s continuous integration performance testing, whose aim is to know exactly why one release of GNOME is faster or slower than the last.

And the third keynote came: Matthew Garrett talked to us about his experiences with the GNOME community and his vision of where it should go: toward enhancing the privacy and security of users, something that many GNOMErs are excited about, such as Federico Mena.

Later on, David King talked about his plans for Cheese, the webcam application, turning it into a DBus service, using the current development of kdbus to sandbox the interaction with the hardware.

Afterwards Christian Hergert talked to us about his plans for Builder, a new IDE for GNOME. Promising stuff, but we will see how it goes. Christian said that he is going to spend a full year working on this project.

The GUADEC ended with the lightning talks, where I enjoyed one about the problems around current encryption and security tools.

Finally, the next GUADEC host was unveiled: the Sweden Conspiracy: Gothenburg!

by vjaquez at August 05, 2014 11:46 AM

August 01, 2014

Jacobo Aragunde

Tales of LibreOffice interoperability: the missing files

I’m welcoming the release of LibreOffice 4.3.0 in the name of Igalia with the last post in this series. This time we will talk about the preservation of embedded content in OOXML text documents, and devote some lines to the support of Structured Document Tags.

Embedded content in documents

Our goal this time was the preservation of embedded content in OOXML text documents, as a first step towards full support like we achieved with other features. The insertion of new embedded objects in .docx documents, or the editing of existing ones, will have to come later.

An embedded document usually consists of two files; one of them is a preview picture to be shown in the parent document, and the other one is the actual embedded document. For the case of a spreadsheet embedded in a text document, the most common case, you will find these two files in the document:

/word/media/image1.emf  
/word/embeddings/Microsoft_Excel_Worksheet1.xlsx 

With the corresponding entries in the relations file:

<Relationship Id="rId2"
  Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/package"
  Target="embeddings/oleObject1.xlsx" />
<Relationship Id="rId3"
  Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
  Target="media/image1.emf" />

The relevant bits in the document.xml file are below. Notice a w:object consists of one shape, which is filled with data from an image file, and the OLE object itself, linked to the embedded spreadsheet. Also notice the attribute ProgID, which defines the program, document type and version:

<w:object>
  <v:shape id="ole_rId2"
  style="width:362.25pt;height:146.25pt" o:ole="">
    <v:imagedata r:id="rId3" o:title="" />
  </v:shape>
  <o:OLEObject Type="Embed" ProgID="Excel.Sheet.12"
  ShapeID="ole_rId2" DrawAspect="Content"
  ObjectID="_570182397" r:id="rId2" />
</w:object>

There is one more element that allows the embedded file to be properly detected by Word, it’s a content type definition in the content types file:

<Override PartName="/word/embeddings/oleObject1.xlsx"
ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" />

As you can see there are three elements that determine the kind of embedding we are dealing with, and Word requires the right combination of the three of them:

  • The properties in the <o:OLEObject> tag in document.xml
  • The ContentType for the file defined in [Content_Types].xml
  • The Type of the Relationship defined in document.xml.rels

The most convenient way to achieve our goal was using the grab bag technique to store the ProgID attribute of the object, and infer the correct content type and relation type. Some examples:

  • An object with ProgID Excel.Sheet.12 is a OOXML spreadsheet. Its media type must be application/vnd.openxmlformats-officedocument.spreadsheetml.sheet and the relation type is http://schemas.openxmlformats.org/officeDocument/2006/relationships/package.
  • If the ProgID is Excel.Sheet.8, this is an old Office spreadsheet. Now the media type must be application/vnd.ms-excel and the relation type http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject.
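
Schematically, the logic boils down to a small lookup table keyed by ProgID. The standalone sketch below (not LibreOffice’s actual code) illustrates the idea using the two mappings above:

#include <stdio.h>
#include <string.h>

/* Standalone sketch of the ProgID -> content type / relation type
 * mapping described above; not LibreOffice's actual code. */
struct embedding_type {
    const char *prog_id;
    const char *content_type;
    const char *relation_type;
};

static const struct embedding_type known_types[] = {
    { "Excel.Sheet.12",
      "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
      "http://schemas.openxmlformats.org/officeDocument/2006/relationships/package" },
    { "Excel.Sheet.8",
      "application/vnd.ms-excel",
      "http://schemas.openxmlformats.org/officeDocument/2006/relationships/oleObject" },
};

int main(void)
{
    const char *prog_id = "Excel.Sheet.12";  /* read from the grab bag */
    for (size_t i = 0; i < sizeof known_types / sizeof known_types[0]; i++) {
        if (strcmp(known_types[i].prog_id, prog_id) == 0) {
            printf("content type:  %s\nrelation type: %s\n",
                   known_types[i].content_type, known_types[i].relation_type);
            return 0;
        }
    }
    fprintf(stderr, "unknown ProgID: %s\n", prog_id);
    return 1;
}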

If you detect a particular type of embedding in your documents that isn’t being preserved, drop us a line in the Bugzilla. A patch to add new relations of this kind should be quick and easy.

Bonus track: Structured Document Tags

Structured Document Tags (SDTs) are a family of document objects that contain form-like controls, citations, tables of contents or bibliography tables, among many others. This variety of uses means that they can live inside a paragraph or they can be a high-level element that contains several paragraphs and even shapes, which of course is tricky to implement.

For 4.3.0 we have worked on some of these tags, and we can say we properly implemented the import and export of combo, date and check boxes. We also wrote some code to preserve generic SDTs and now most of the tags are preserved but there are formatting issues. The proper way to support every kind of SDT is translating them to the equivalent objects in LibreOffice on import and translate them back to SDTs on export, but that will require time and work. Any volunteers? ;)

Wrap-up

Despite the 6-month development cycles, I feel like the development of the 4.3 line started a long time ago, and I may have forgotten to write about some little feature or fix… Anyway, it’s time to close this batch of blog posts about interoperability features, all of them developed by Igalia and sponsored by CloudOn.

Enjoy our shiny new LibreOffice, and happy hacking!

by Jacobo Aragunde Pérez at August 01, 2014 11:10 AM

Carlos García Campos

WebKitGTK+ 2.5.1: Good bye WebKit1

WebKitGTK+ 2.5.1 is the first version of this release cycle. It comes very late, mainly due to the regressions introduced by the switch to CMake and the problems we found after removing WebKit1 from the tree. It also includes some new features that I’ll talk about in other posts, probably when 2.6.0 is released. In this post I’ll only focus on the breaks introduced in this release, in order to help everybody adapt their applications to the API changes if needed.

Wait, but why breaking the API?

Since the release of WebKitGTK+ 2.0 the WebKit1 API has been considered deprecated and in maintenance mode. The new WebKit2 API is quite complete and stable now, so the plan for WebKitGTK+ 2.6 was removing WebKit1, leaving it alive, but still in maintenance mode, in the 2.4 branch. After removing the code from trunk we realized that newer versions of WebKitGTK+ that are WebKit2-only should be parallel-installable with older versions of WebKitGTK+ that also include WebKit1. After some discussions trying to find the best solution, we reached the conclusion that we had to bump the binary version. But then I thought, since we were going to force everybody to recompile, why not take the opportunity to introduce some small (but necessary) API changes that in most cases will not affect users anyway? And so I started to review the API and propose some changes. I also wanted to make sure all API changes were introduced in the first unstable release, so that users only have to adapt their applications once, and that’s the main reason why the release has taken so long.

Binary version bump

The new binary version is 4.0, so to use this new release you need to update your build system to look for webkit2gtk-4.0 pkg-config file.

GObject DOM Bindings

The GObject DOM bindings API situation was actually the main reason for breaking the API. The problem was that the code for the DOM bindings is generated automatically from the IDL files. This means that every time a new IDL file was added to the build system, we ended up exposing a new class in our public API without even noticing. The same happened when an API-incompatible change was introduced in an IDL file, for example to update it to the current standard. We added a script to our build bots to warn us when that happened, and then we had to manually deprecate the existing API and add exceptions to the code generator. This was a lot of work just to keep backwards compatibility of an API nobody was using. Most people actually use only 5-10% of the DOM bindings API.

Since WebKitGTK+ 2.5.1 the GObject DOM bindings API is split into stable and unstable parts. The stable part contains the most commonly used API, which is unlikely to change. We will keep maintaining backwards compatibility of this part of the API. The rest of the API is considered unstable and might change at any time; you can still use it, but at your own risk. We thought this solution was better than just removing the unstable API. There are two kinds of unstable API:

  • Classes that are considered unstable: the entire class is considered unstable. The header is not included in the main webkitdom.h header, so to use them you have to include the header file explicitly.
  • Unstable symbols of stable classes: a method or constant in a stable class that is considered unstable. In this case the header file is included by the main webkitdom.h header, but it doesn’t contain any unstable symbols; they are included in a new header WebKitDOMClassNameUnstable.h that also needs to be included explicitly.

In both cases you need to define WEBKIT_DOM_USE_UNSTABLE_API before including the headers:

#define WEBKIT_DOM_USE_UNSTABLE_API
#include <webkitdom/WebKitDOMHTMLMediaElement.h>
#include <webkitdom/WebKitDOMElementUnstable.h>

WebKit2 GTK+ API

The API changes in the WebKit2 GTK+ API could have been avoided, by deprecating symbols and adding new ones, but since we were going to break the API anyway, and the affected symbols are not that commonly used we thought it was worth it.

  • WebKitWebView::create: the signal now receives a WebKitNavigationAction parameter containing information about the navigation action that triggered the event. So now you can know the type of event (whether it was a link clicked, a form submitted, etc.), the mouse button and keyboard modifiers, the URI request or even whether it was a user gesture. This information is very useful to implement a popup blocker, for example.
    /* before */
    static WebKitWebView *
    web_view_created_cb (WebKitWebView *web_view,
                         gpointer       user_data)
    
    /* after */
    static WebKitWebView *
    web_view_created_cb (WebKitWebView          *web_view,
                         WebKitNavigationAction *navigation_action,
                         gpointer                user_data)
  • WebKitWebViewGroup has been removed. This class was only introduced to add the user stylesheets API, since most people actually use the default web view group. The grouping of pages inside WebKit2 is something that will eventually be removed, in favor of users creating the groups they need. The user stylesheets API has been moved to a new class, WebKitUserContentManager, that will also be extended to support user scripts. The settings can still be handled directly with the WebKitWebView API, so if you want a group of web views to share the same settings you can simply call webkit_web_view_set_settings() for all the web views, passing the same WebKitSettings object.
    /* before */
    WebKitWebViewGroup *group = webkit_web_view_get_group (web_view);
    
    webkit_web_view_group_add_user_style_sheet (group, 
                                                buffer, 
                                                NULL, /* base URI */
                                                NULL, /* whitelist */
                                                NULL, /* blacklist */
                                                WEBKIT_INJECTED_CONTENT_FRAMES_ALL);
    
    /* after */
    WebKitUserContentManager *user_content;
    WebKitUserStyleSheet     *style_sheet;
    
    style_sheet = webkit_user_style_sheet_new (buffer,
                                               WEBKIT_USER_CONTENT_INJECT_ALL_FRAMES,
                                               WEBKIT_USER_STYLE_LEVEL_USER,
                                               NULL, /* whitelist */
                                               NULL /* blacklist */);
    user_content = webkit_web_view_get_user_content_manager (web_view);
    webkit_user_content_manager_add_style_sheet (user_content, style_sheet);
    webkit_user_style_sheet_unref (style_sheet);
  • WebKitCertificateInfo has been removed. This was supposed to be a convenient way of handling TLS certificates, but when trying to use it in a real case it ended up being inconvenient. The WebKitWebView::load-failed-with-tls-errors signal now receives a GTlsCertificate and TlsCertificateFlags, and webkit_web_context_allow_tls_certificate_for_host() receives a GTlsCertificate.
    /* before */
    static gboolean
    load_failed_with_tls_errors_cb (WebKitWebView         *web_view,
                                    WebKitCertificateInfo *info,
                                    const gchar           *host,
                                    gpointer               user_data)
    {
      WebKitWebContext *context = webkit_web_view_get_context (web_view);
      GTlsCertificate *certificate = webkit_certificate_info_get_tls_certificate (info);
      GTlsCertificateFlags errors = webkit_certificate_info_get_tls_errors (info);
    
      if (add_exception_for_error (host, errors))
        webkit_web_context_allow_tls_certificate_for_host (context, info, host);
    }
    
    /* after */
    static gboolean
    load_failed_with_tls_errors_cb (WebKitWebView       *web_view,
                                    GTlsCertificate     *certificate,
                                    GTlsCertificateFlags errors,
                                    const gchar         *host,
                                    gpointer             user_data)
    {
      WebKitWebContext *context = webkit_web_view_get_context (web_view);
    
      if (add_exception_for_error (host, errors))
        webkit_web_context_allow_tls_certificate_for_host (context, certificate, host);
    }
  • View mode API: The view source mode was removed from WebCore, and the API was already marked as deprecated. Since it’s very unlikely that we will add more view modes, we just removed the API. There’s no replacement for this, but it could be easily implemented either using an external window with a GtkSourceView or embedded into a WebKitWebView by using a custom URI scheme and a JavaScript library for syntax highlighting.

CMake

Since version 2.5.1 WebKitGTK+ uses CMake instead of autotools as its build system. The equivalent of configure, make and make install now looks something like this:

$ cd webkitgtk-2.5.1
$ cmake -DCMAKE_INSTALL_PREFIX= -DCMAKE_INSTALL_LIBDIR=lib -DPORT=GTK \
-DCMAKE_BUILD_TYPE=Release .
$ make
(enjoy the summer in the meantime)
# make install

Help!

Sure, we are available as usual in the #webkitgtk+ IRC channel at FreeNode and our mailing list webkit-gtk@lists.webkit.org.

by carlos garcia campos at August 01, 2014 10:27 AM

July 30, 2014

Carlos García Campos

Evince Hackfest

The Evince hackfest took place last week, from the 23rd to the 25th of July, in Strasbourg. Yes, only 3 days, but very productive in my opinion. I’ll summarize all the cool stuff we worked on.

HiDPI

This work was initially started by Owen, and then Germán kept the patches up to date with evince git master. I reviewed all the pending patches and updated the thumbnails one, and the result is that evince doesn’t look blurry on HiDPI screens any more.


Evince running with GDK_SCALE=2

Recent View

This was a GSoC project from 2013, but the patch provided by the student was never in an “upstreamable” state. Again Germán, who always loved this feature, took care of the patch, addressing my review comments. By the beginning of the hackfest most of the work had already been done; we only needed a few more review iterations to finally push this feature to master. The idea is to show the list of recent documents as an icon view with thumbnails and document metadata. This view is loaded when evince is launched without any document, replacing the useless empty window we had before. It also replaces the recent documents submenu in the gear menu.

Evince Recent View

UI improvements

The move to the header bar recently made the toolbar look a bit cluttered, mainly because the title might use a lot of space. We discussed several ideas to improve the header bar and implemented some of them:

Evince header bar improvements


Juanjo Marín also wrote a patch to change the default zoom mode to “Automatic”, since several people commented that the current “Fit Width” mode doesn’t look good on screens with higher resolutions. The patch is still awaiting review.

Annotations

Giselle and Anuj, our GSoC students this year, worked on their projects to improve the annotations support in both Evince and poppler.

    • Anuj wrote some patches to add support for Free Text annotations to the poppler glib API. After a couple of review iterations and discussions about the API, the patches are now in bugzilla waiting for a final review (I hope to find the time soon).
    • Giselle was focused on adding support for highlight annotations to Evince, since poppler already has all the required API for this. The patches are not yet ready, but they look really promising.


Caret navigation and accessibility

Joanie and API continued improving the evince a11y support and fixing some remaining issues from the FoG project. Antía fought with the caret navigation implementation again, implementing some missing key bindings and fixing other issues.

Comics backend

Juanjo Marín focused on the comics backend, working on a patch to use libarchive to uncompress the documents instead of spawning external command line tools.

Gestures

I started to review the gestures branch during the hackfest. The patches looked clean and simple, but since I was not familiar with the new GTK+ touch API and I didn’t have a touch screen to try it out either, I decided to wait until after the hackfest and see it in action on garnacho’s laptop during GUADEC. Carlos explained to me how the touch API works in GTK+ and I could check that it actually works great. The code doesn’t affect normal use with the mouse, so the branch will be merged into master soon.

Evince hackfest dinner

And of course not everything was hacking

THANKS!

Many thanks to Alexandre Franke for the local organization, everything worked perfectly. Of course thanks to the GNOME Foundation for sponsoring the GSoC students, Giselle and Anuj, and Igalia for sponsoring all the Igalians attending the hackfest. Thanks also to Epitech for allowing us to do the hackfest there before the GUADEC.

Igalia S.L. · GNOME Foundation · EPITECH

by carlos garcia campos at July 30, 2014 03:37 PM

July 29, 2014

Iago Toral

A brief introduction to the Linux graphics stack

This post attempts to be a brief and simple introduction to the Linux graphics stack. I will focus on giving enough context to understand the role that Mesa and 3D drivers in general play in the stack, and leave it to follow-up posts to dive deeper into the guts of Mesa in general and the Intel DRI driver specifically.

A bit of history

In order to understand some of the particularities of the current graphics stack it is important to understand how it had to adapt to new challenges throughout the years.

You see, nowadays things are significantly more complex than they used to be, but in the early times there was only a single piece of software that had direct access to the graphics hardware: the X server. This approach made the graphics stack simpler because it didn’t need to synchronize access to the graphics hardware between multiple clients.

In these early days applications would do all their drawing indirectly, through the X server. By using Xlib they would send rendering commands over the X11 protocol that the X server would receive, process and translate to actual hardware commands on the other side of a socket. Notice that this “translation” is the job of a driver: it takes a bunch of hardware agnostic rendering commands as its input and translates them into hardware commands as expected by the targeted GPU.
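
A minimal Xlib client makes this concrete: every call below becomes an X11 protocol request that travels over the socket to the X server, which then drives the hardware on the application’s behalf.

/* Minimal Xlib client: every drawing call becomes an X11 protocol
 * request that the X server translates into hardware commands.
 * Build with: cc xline.c -lX11 */
#include <X11/Xlib.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);   /* connect to the X server */
    if (!dpy) {
        fprintf(stderr, "cannot open display\n");
        return 1;
    }

    int scr = DefaultScreen(dpy);
    Window win = XCreateSimpleWindow(dpy, DefaultRootWindow(dpy),
                                     0, 0, 200, 200, 0,
                                     BlackPixel(dpy, scr),
                                     WhitePixel(dpy, scr));
    XSelectInput(dpy, win, ExposureMask);
    XMapWindow(dpy, win);

    XEvent ev;
    do {                                 /* wait until the window is shown */
        XNextEvent(dpy, &ev);
    } while (ev.type != Expose);

    XDrawLine(dpy, win, DefaultGC(dpy, scr), 0, 0, 200, 200);
    XFlush(dpy);                         /* push the requests to the server */
    sleep(5);                            /* keep the window around briefly */
    XCloseDisplay(dpy);
    return 0;
}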

Since the X server was the only piece of software that could talk to the graphics hardware by design, these drivers were written specifically for it, became modules of the X server itself and an integral part of its architecture. These userspace drivers are called DDX drivers in X server argot and their role in the graphics stack is to support 2D operations as exported by Xlib and required by the X server implementation.


DDX drivers in the X server (image via wikipedia)

In my Ubuntu system, for example, the DDX driver for my Intel GPU comes via the xserver-xorg-video-intel package and there are similar packages for other GPU vendors.

3D graphics

The above covers 2D graphics as that is what the X server used to be all about. However, the arrival of 3D graphics hardware changed the scenario significantly, as we will see now.

In Linux, 3D graphics is implemented via OpenGL, so people expected an implementation of this standard that would take advantage of the fancy new 3D hardware, that is, a hardware accelerated libGL.so. However, in a system where only the X server was allowed to access the graphics hardware we could not have a libGL.so that talked directly to the 3D hardware. Instead, the solution was to provide an implementation of OpenGL that would send OpenGL commands to the X server through an extension of the X11 protocol and let the X server translate these into actual hardware commands as it had been doing for 2D commands before.

We call this Indirect Rendering, since applications do not send rendering commands directly to the graphics hardware, and instead, render indirectly through the X server.


OpenGL with Indirect Rendering (image via wikipedia)

Unfortunately, developers would soon realize that this solution was not sufficient for intensive 3D applications, such as games, that needed to render large amounts of 3D primitives while maintaining high frame rates. The problem was clear: wrapping OpenGL calls in the X11 protocol was not a viable solution.

In order to achieve good performance in 3D applications we needed them to access the hardware directly, and that would require rethinking a large chunk of the graphics stack.

Enter Direct Rendering Infrastructure (DRI)

Direct Rendering Infrastructure is the new architecture that allows X clients to talk to the graphics hardware directly. Implementing DRI required changes to various parts of the graphics stack including the X server, the kernel and various client libraries.

Although the term DRI usually refers to the complete architecture, it is often also used to refer only to the specific part of it that involves the interaction of applications with the X server, so be aware of this dual meaning when you read about this stuff on the Internet.

Another important part of DRI is the Direct Rendering Manager (DRM). This is the kernel side of the DRI architecture. Here, the kernel handles sensitive aspects like hardware locking, access synchronization, video memory and more. DRM also provides userspace with an API that it can use to submit commands and data in a format that is adequate for modern GPUs, which effectively allows userspace to communicate with the graphics hardware.

Notice that many of these things have to be done specifically for the target hardware, so there are different DRM drivers for each GPU. In my Ubuntu system, the userspace library that talks to the Intel DRM driver comes via the libdrm-intel1:amd64 package.
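As a quick illustration, the smallest possible conversation with a DRM driver from userspace looks like this; it opens the device node and asks the kernel which driver sits behind it (a sketch using the libdrm API, built with -ldrm):

#include <xf86drm.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR);
    if (fd < 0) {
        perror("open /dev/dri/card0");
        return 1;
    }

    /* ask the kernel which DRM driver backs this node (e.g. "i915") */
    drmVersionPtr v = drmGetVersion(fd);
    if (v) {
        printf("DRM driver: %s (%d.%d.%d)\n", v->name,
               v->version_major, v->version_minor, v->version_patchlevel);
        drmFreeVersion(v);
    }

    close(fd);
    return 0;
}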


OpenGL with Direct Rendering (image via wikipedia)

DRI/DRM provide the building blocks that enable userspace applications to access the graphics hardware directly in an efficient and safe manner, but in order to use OpenGL we need another piece of software that, using the infrastructure provided by DRI/DRM, implements the OpenGL API while respecting the X server requirements.

Enter Mesa

Mesa is a free software implementation of the OpenGL specification, and as such, it provides a libGL.so, which OpenGL based programs can use to output 3D graphics in Linux. Mesa can provide accelerated 3D graphics by taking advantage of the DRI architecture to gain direct access to the underlying graphics hardware in its implementation of the OpenGL API.

When our 3D application runs in an X11 environment it will output its graphics to a surface (window) allocated by the X server. Notice, however, that with DRI this will happen without intervention of the X server, so naturally there is some synchronization to do between the two, since the X server still owns the window Mesa is rendering to and is the one in charge of displaying its contents on the screen. This synchronization between the OpenGL application and the X server is part of DRI. Mesa’s implementation of GLX (the extension of the OpenGL specification that addresses the X11 platform) uses DRI to talk to the X server and accomplish this.

Mesa also has to use DRM for many things. Communication with the graphics hardware happens by sending commands (for example “draw a triangle”) and data (for example the vertex coordinates of the triangle, their color attributes, normals, etc). This process usually involves allocating a bunch of buffers in the graphics hardware where all these commands and data are copied, so that the GPU can access them and do its work. This is enabled by the DRM driver, which is the piece that takes care of managing video memory and offers APIs to userspace (Mesa in this case) to do this for the specific target hardware. DRM is also required whenever we need to allocate and manage video memory in Mesa, so things like creating textures, uploading data to textures, or allocating color, depth or stencil buffers all require using the DRM APIs for the target hardware.
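To give a flavor of what “allocating a buffer through DRM” means, here is a sketch that allocates a so-called dumb buffer, the simplest, non-accelerated kind the DRM API offers. Mesa drivers use hardware-specific interfaces instead, but the bookkeeping (a handle, a pitch, a size) looks much the same:

#include <xf86drm.h>
#include <xf86drmMode.h>
#include <stdio.h>
#include <string.h>

/* 'fd' is an open DRM device node, as in the previous snippet. */
int alloc_dumb_buffer(int fd)
{
    struct drm_mode_create_dumb creq;
    memset(&creq, 0, sizeof(creq));
    creq.width  = 1024;
    creq.height = 768;
    creq.bpp    = 32;             /* bits per pixel */

    if (drmIoctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &creq) < 0) {
        perror("DRM_IOCTL_MODE_CREATE_DUMB");
        return -1;
    }

    printf("buffer handle %u, pitch %u, size %llu\n", creq.handle,
           creq.pitch, (unsigned long long) creq.size);
    return (int) creq.handle;    /* handle to the buffer object */
}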


OpenGL/Mesa in the context of 3D Linux games (image via wikipedia)

What’s next?

Hopefully I have managed to explain the role that Mesa plays in the Linux graphics stack and how it works together with the Direct Rendering Infrastructure to enable efficient 3D graphics via OpenGL. In the next post we will cover Mesa in more detail: we will see that it is actually a framework where multiple OpenGL drivers live together, including both hardware and software variants; we will also have a look at its directory structure, identify its main modules, introduce the Gallium framework and more.

by Iago Toral at July 29, 2014 02:51 PM

July 24, 2014

Víctor Jáquez

See you at GUADEC!

Woah! It has been a long time since I attended a GUADEC (besides the one in Coruña, of course), and I am very excited about it!

My plan is to chill with other GNOME developers and talk about the future of GNOME OS.

I'm going to GUADEC 2014

by vjaquez at July 24, 2014 10:03 AM

Juan A. Suárez

Another year, another GUADEC

It’s 2014, and like previous years:

GUADEC 2014 badge

This time I won’t give any talk; I will just relax, enjoy talks from others, and hopefully enjoy Strasbourg too.

And what is more important, meet those hackers you interact with frequently, and maybe share some beers.

So if you go there and want to have a nice chat with me, or talk about the Grilo project, don’t hesitate to do so. Igalia, which is kindly sponsoring my attendance, will have a place there during the core days, so you will likely find me around it, or you can ask anyone there for me.

Enjoy!

by Juan A. Suárez at July 24, 2014 08:30 AM

July 21, 2014

July 19, 2014

Philippe Normand

Moving to Pelican

Time for a change! Almost 10 years ago I started to hack on a blog engine with two friends. It was called Alinea and it powered this website for a long time. Back then, hacking on your own blog engine was the prerequisite to hosting your blog :) But nowadays people just use Wordpress or similar platforms, if they still have a blog at all. Alinea fell into oblivion as I didn’t have the time and motivation to maintain it.

Moving to Pelican was quite easy: since I had been writing content in ReST on the previous blog, I only had to pull the data from the database and hack pelican_import.py a bit :)

Now that this website looks almost modern, I’ll hopefully start to blog again. Expect at least news about the WebKit work I still enjoy doing at Igalia.

by Philippe Normand at July 19, 2014 02:13 PM

July 18, 2014

Iago Toral

A tour around the world of Mesa and Linux graphics drivers

Some time ago I decided to focus my work at Igalia on the graphics stack. As a result, I have had the chance to participate in a couple of very interesting projects, like implementing Wayland support in WebKitGtk+ (a topic I have visited in this blog a number of times) and, lately, working on graphics drivers for Linux within the Mesa framework.

The graphics stack in Linux is complex, and it is not always easy to find information and technical documentation that can aid beginners in their first steps. This is a very demanding domain; the brave individuals who decide to put their energy into it usually have their hands full hacking on the code, and they don’t have much room left for documenting what they do in a way that is accessible to newcomers.

As I mentioned above, I have been hacking on Mesa lately (particularly on the Intel i965 driver) and so far it has been a lot of fun, probably the most exciting work I have done at Igalia in all these years. But it is also certainly challenging, requiring me to learn a lot of new things and sometimes fairly complex stuff.

Getting involved in this is no easy endeavor. The learning curve is steep because the kind of work you do here is probably unlike anything you have done before: for starters, it requires a decent understanding of OpenGL and the capacity to understand OpenGL specifications and what they mean in the context of the driver; you also need a general understanding of how modern 3D-capable GPUs work; and finally, you have to dig deeper and understand how the specific GPU that your driver targets works and what role the driver needs to play to make that hardware work as intended. And that’s not all of it: a driver may need to support multiple generations of GPUs, which can sometimes differ significantly from each other, requiring driver developers to write and maintain multiple code paths that handle these differences. You can imagine the maintenance burden and extra complexity that comes from this.

Finally, we should also consider the fact that graphics drivers are among the most critical pieces of code in a system: they need to be performant and stable for all supported hardware generations, which adds to the overall complexity.

All this stuff can be a bit overwhelming in the beginning for those who attempt to take their first steps in this world, but I believe this initial steep learning curve can be smoothed out by introducing some of the most important concepts in a way that is oriented specifically to new developers. The rest will still not be an easy task: it requires hard work, some passion, a willingness to learn and a lot of attention to detail, but I think anyone passionate enough should be able to get into it with enough dedication.

I had to go through this process myself lately, so I figured I am in a good position to try to address the problem. That’s why I decided to write a series of posts introducing people to the world of Mesa and 3D graphics drivers, with a focus on OpenGL and Intel GPUs, which is the area where I am currently doing my work. Although I’ll focus on Intel hardware, I believe that many of the concepts introduced here are general enough to be useful to people interested in other GPUs as well. I’ll try to be clear about when I am introducing general concepts and when I am discussing Intel-specific details.

My next post, which will be the first in this series, will serve as an introduction to the Linux graphics stack and Linux graphics drivers. We will discuss what exactly Mesa brings to the table and what we mean when we talk about graphics drivers in Linux. That should put us on the right track to start looking into the internals of Mesa.

So that’s it, if you are interested in learning more about Linux graphics and specifically Mesa and 3D graphics drivers, stay tuned! I’ll try my best to post regularly and often.

by Iago Toral at July 18, 2014 09:41 AM

July 10, 2014

Jacobo Aragunde

Hot accessibility for LibreOffice 4.3.0

The first release candidates for LibreOffice 4.3.0 are already bouncing around the internet and, besides great new features like the ones I’ve been explaining in my latest posts, they come with a set of fixes to ease the life of screen reader users and developers alike. This is, once again, a part of the accessibility work we do at Igalia.

Back in April, when the Gran Canaria hackfest took place, I started working on the bug #71556. The problem was that typing on a document triggered a lot of unnecessary text-attributes-changed events. These events had a variety of origins, but most of them were caused by modifications in the internal text attribute rsid, used for change tracking and with no relevance per se for the user. The bug is not completely fixed, but addressing the problem with rsid attribute allowed us to get rid of the most annoying part of it; now LibreOffice only sends one unnecessary event when we type the first character in a new paragraph and not with every keystroke.

Bug #71558 is also related to the same kind of events; in this case, text-attributes-changed was not being triggered when a word became marked as misspelled. Actually, the spell-checking status is not internally treated as a text attribute, and because of that there were no events indicating its change. The patch explicitly raises the event, which lets the accessibility code check the status of the text attributes and find out about the spelling mistake. While I was working on this issue, I also detected a weird behavior when checking the text attributes through the Python API; it turned out to be a bug in the bridge between AT-SPI and ATK, which I reported and fixed too.

A triaging session took me to bug #74681; the main issue reported there had already been fixed for a while, and only small bits regarding missing accessible names in some buttons remained. I fixed that by allowing toolbar buttons to use their tooltip text as the accessible name when it is not explicitly set, and now the paragraph properties panel is fully accessible.

Finally, I took up again the work I had been doing on ATK roles in bug #39944 and detected wrong mappings for the LibreOffice EDIT_BAR, EMBEDDED_OBJECT and HYPER_LINK accessible roles. I fixed them and opened a ticket in the ATK bugzilla to create ATK roles for the five cases that were still registering custom roles. Once that ticket is managed, hopefully we will be able to close bug #75191 too, which is related to the deprecation of atk_role_register.

These fixes are added on top of the ones coded in the February hackfest, making 4.3.0 the most accessible LibreOffice so far… Until the next version arrives, of course!

by Jacobo Aragunde Pérez at July 10, 2014 10:55 AM

July 04, 2014

Eduardo Lima Mitev

A possibly faster approach to OpenGL rasterization of 2D Web content

Even though it has been a while since my last entry on this blog, I have been quite busy. During most of last year I brought my modest contributions to an awesome startup that you have probably heard of by now, helping them integrate GNOME technologies into their products. I was lucky to join their team at an early stage of the development and participate in key discussions.

But more on that project on future entries.

Today I want to talk about things that keep me busy these days, and are of course related to Web engines. Specifically, I want to talk about 2D rasterization and the process of putting pixels on the screen as fast as possible (aka, the 60 frames-per-second holy grail). I want to discuss an idea that has the potential to significantly increase the performance of page rendering by utilizing modern GPU capabilities, OpenGL, and a bit of help from Web engines’ magic.

This is a technical article, so if you are not very familiar with 2D rasterization, OpenGL or how Web engines draw stuff, I recommend you take some time to read about them first. It is truly a world of wonders (and sometimes pain).

Instanced rendering

The core of the idea is based on instanced rendering. It is a fairly well known technique that entered core OpenGL in 3.1 and OpenGL ES in 3.0, and was previously available through the GL_EXT_draw_instanced extension.

To draw geometry with OpenGL, one normally submits a primitive to the rendering pipeline. The primitive consists of a collection of vertices and a number of attributes for each vertex. Traditionally, you could only submit one primitive at a time.

With instanced rendering, it is possible to send several “instances” of the same primitive (the same collection of vertices and attributes) in a single call. This dramatically reduces the overhead of pipeline state changes and gives the GPU driver a better chance at optimizing the rendering of instances of a particular geometry.

Hence, it is common practice for OpenGL applications to group the rendering of similar geometry into batches and submit them to the pipeline all at once as instances. This technique is commonly known as batching and merging.
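As a minimal sketch of the idea, assuming a GL 3.1+ context and a vertex array object already set up with one quad's vertices, a single call renders a thousand instances:

/* One quad uploaded once, 1000 instances rendered in a single call,
 * instead of 1000 separate draw calls. */
#define GL_GLEXT_PROTOTYPES 1   /* Mesa exposes core 3.x entry points */
#include <GL/gl.h>
#include <GL/glext.h>

void draw_batched_quads(GLuint vao)
{
    glBindVertexArray(vao);
    /* 4 vertices per quad (a triangle strip), 1000 instances */
    glDrawArraysInstanced(GL_TRIANGLE_STRIP, 0, 4, 1000);
}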

Skia, the 2D rasterizer used by the Chromium and Android projects, and Cairo, a popular 2D rasterizer backing many projects such as GNOME and previous versions of Mozilla Firefox, both support, to some extent, some form of instanced rendering in their respective GL backends.

Telling instances apart

Ok, so it is possible to draw a bunch of primitives at once, but how can we make them look different? A way of customizing individual instances is necessary; otherwise they will all render on top of one another. Not very useful.

There are two ways of submitting per-instance information. One is adding a “divisor” to the buffers containing vertex attribute information (VBOs), which tells the pipeline to use the divided chunks as per-instance information instead of per-vertex information; glVertexAttribDivisor is used in this case.

The other way is to upload the per-instance information to a buffer texture (or any texture, for that matter) and fetch the information for the corresponding instance by sampling, using the gl_InstanceID variable available in shader code as the key. This variable increases for each instance of the geometry being rendered (as opposed to per vertex, for which you have gl_VertexID).
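A small sketch of the first method; the attribute location (1) and the one-vec4-per-instance layout are assumptions for the sake of illustration:

#define GL_GLEXT_PROTOTYPES 1
#include <GL/gl.h>
#include <GL/glext.h>

/* Feed one vec4 per *instance* (say, a rectangle's x/y/width/height)
 * from 'instance_vbo' through vertex attribute 1. */
void setup_instance_attrib(GLuint instance_vbo)
{
    glBindBuffer(GL_ARRAY_BUFFER, instance_vbo);
    glEnableVertexAttribArray(1);
    glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE,
                          4 * sizeof(float), (const void *) 0);
    /* divisor 1: advance this attribute once per instance, not once
     * per vertex (the alternative is sampling a buffer texture with
     * gl_InstanceID in the shader) */
    glVertexAttribDivisor(1, 1);
}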

So, a quick recap so far. We are able to draw several instances of the same geometry at once, very efficiently, and are able to upload arbitrary data to customize each of these instances at will.

But wait, there are caveats.

The ordering problem

So, let’s say we can now group together all drawing operations that involve the same primitive (rectangle, line, circle, etc). What happens if we draw (say) a filled rectangle, then a circle on top, and then another rectangle on top of the circle?

Following the simple grouping rule, what will happen is that the two rectangles will be grouped together and drawn first in one call, then the circle. This will not render the expected result, since the circle will end up lying on top of the rectangle that was drawn after it.

This problem is commonly known as “ordering”, and it clearly breaks our otherwise super-performing batching and merging case.

So, in scenes that involve lots of overlapping geometry, grouping is limited to contents that do not overlap, if we want to preserve the right order of operations.

In practice, it means that we first need to separate the content in layers, then group the same primitives within a single layer, and finally submit the batches from each layer in the right order.

But guess what? Browser engines already do exactly that. Engines build a layer tree (among several other trees) from the information contained in the HTML and CSS content (layout, styling, transformations, etc), where the content is separated into render nodes whose contents do not normally overlap. The actual process is much more complicated than that, but this simplification is enough to illustrate the idea.

Now, what if?

First, for the sake of presenting the idea, let’s ignore the 2D context of a canvas element for now; more on that later. Let’s focus on most of the web sites out there.

If we look at the primitives typically used to render a page, they boil down to a handful. Essentially, they are:

  • A rectangle: for almost all HTML elements, which are boxes. And character glyphs! which are normally rendered ahead of time and cached in a texture layout, then texture-mapped onto a rectangle. And images! which are also texture-mapped onto rectangles.
  • A thin line: for thin (<=1 pixel) borders, inset/outset effect, hr, etc. Thicker borders can be drawn as thin rectangles.
  • A round corner: the quarter of a circle, filled or stroke, used to implement rounded rectangles (hello border-radius).
  • A circle: for bulleted listings, radio-buttons, etc. Arguably, these can be rendered using unicode characters, so no need for specific geometry.

Let’s stay with these. There are other cases that I will purposely ignore, like the one seen in a rounded rectangle with different thicknesses in two consecutive borders.

Then we have, for each of these primitives, an evolutionary-like variety of background styles (imaged, colored, repeated, gradient, etc); transformations (rotation, translation, scaling, etc); border styles (again imaged, colored, with different thickness, etc), shadow and blurring effects, and so on.

With a working texture cache, we have a potentially good chance at aggressively grouping together drawing of these primitives, like rectangles for example, for all text glyphs, boxes and images.

So, what if we could submit to a smart shader all the information that describes and tells apart these grouped instances? Is it possible to efficiently pack and then re-interpret in a shader all the styling and transformation complexities of today's CSS-styled HTML elements?
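Just to make the question concrete, here is a hypothetical sketch, not taken from any actual engine, of the kind of per-instance record that would have to be packed into a buffer and re-interpreted by such a shader:

/* Everything that tells one instance apart, in a plausible packing;
 * each drawn element appends one of these to a buffer the shader reads. */
typedef struct {
    float left, top, width, height;    /* layout on the canvas */
    float rotation, scale_x, scale_y;  /* 2D transformation */
    float color[4];                    /* RGBA, or... */
    float tex_rect[4];                 /* ...a region of a texture atlas */
    unsigned flags;                    /* filled/stroked, has-texture, ... */
} InstanceDesc;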

A new approach

Existing 2D rasterizers used in Web engines (at least Skia and Cairo, whose source code is available to me) are general purpose drawing libraries. That means they should render deterministically for any kind of application, not only Web engines. Specifically, they need to avoid the ordering problem explained above, where the result of a set of overlapped drawing operations is different if you change their order.

There are several reasons why modern Web engines use general purpose 2D rasterizers (as opposed to rasterizers written specifically for the needs of Web content rendering). One clear reason is that they existed before (in the case of Cairo at least) as generic 2D graphics libraries and were later used for Web rendering. Another reason is that the implementation of the Canvas 2D spec requires a general purpose 2D API, because that's what it is. And there is a clear benefit in reusing your beautifully optimized Canvas 2D implementation to draw the rest of the Web contents. Also, these libraries evolved from pixmap (image) backed rendering targets into libraries exploiting the hardware acceleration of GPUs. Both libraries now feature an OpenGL(ES) backend that is somehow forced to comply with the previously existing API and behaviors.

But that is sub-optimal for Web engines that simply want to draw non-overlapping content into layers, then draw the layers in order. And even though batching and merging do occur in the GL backends today, it is apparently far from optimal as we will see later.


So, if we completely ignore the ordering problem for the case of Web engines drawing already layered nodes onto an OpenGL based render target, we might be able to aggressively group together potentially all the operations that share the same primitive.

This is of course if, as mentioned above, we are able to describe the particularities of each instance of these primitives, hand them down to a smart shader for rendering, then do all that efficiently so that the performance gained in batching is not lost by uploading tons of instance information to the GPU or running heavy shader code.

It is unclear (to me) whether this is at all possible. That is why this approach is just an idea, still lacking validation in the real world. But it is research that could potentially boost the performance of Web content rendering.

It shares some similarities with (and was partially inspired by) the way Android does font rendering.

A proof of concept

So, I set out to write a proof of concept to validate or discard the idea as quickly as possible. The purpose is to write the minimum code that allows a meaningful comparison between this approach and existing rasterizers (Skia being my first target), for specific use cases that are relevant to generic Web content rendering (not Canvas 2D).

My proof of concept is being developed at: https://github.com/elima/glr

So far, it provides just a few primitives, rectangle and rounded corner, allowing for 3 basic drawing operations: rectangle (filled or stroked), rounded rectangle (only filled) and character glyph (not text, just single characters).

Then each element drawn can be transformed (scaled, rotated and/or translated), laid out on the canvas (top, left, width and height), and given a color or texture.

Anti-aliasing is achieved by multisampling, with 8 samples per pixel. Character glyphs are not anti-aliased; that was too complex to put in a proof of concept, and it is a problem already solved by others anyway. I used the simplest possible path to put a pre-cached glyph on the screen, and for that I wrote a super naive texture cache and used FreeType2 to rasterize the glyphs.
The idea of including chars was to explore whether text glyphs, which account for most of a typical Web page's content, could be batched together with all the other drawing that uses the rectangle primitive.
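For reference, an 8-sample render target in GL looks roughly like this; the sketch uses a multisampled renderbuffer attached to an offscreen framebuffer (a multisample texture, which the canvas target described later uses, is set up analogously):

#define GL_GLEXT_PROTOTYPES 1
#include <GL/gl.h>
#include <GL/glext.h>

void create_msaa_target(GLuint *fbo, GLuint *rbo, int width, int height)
{
    glGenRenderbuffers(1, rbo);
    glBindRenderbuffer(GL_RENDERBUFFER, *rbo);
    /* 8 samples per pixel, as in the proof of concept */
    glRenderbufferStorageMultisample(GL_RENDERBUFFER, 8, GL_RGBA8,
                                     width, height);

    glGenFramebuffers(1, fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, *fbo);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                              GL_RENDERBUFFER, *rbo);
}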

Note that this proof of concept is not intended to become a new library. It is just a vehicle to validate an idea by providing the minimum implementation needed to test its limits. Eventually, all this would have to be implemented in existing libraries. I just happen to be very fluent in glib and C :), which lets me prototype fast.

Comparisons

Before we jump into FPS excitement, let’s clarify that any comparison here should be taken with a grain of salt. 2D rasterizers are complex libraries that do a lot of non-trivial things like anti-aliasing, sub-pixel alignment, color space conversion, and adaptation to the specifics of the underlying hardware/driver/GL-version combos, to name just a few.

Thus, any comparison should be put in the context of what code paths are being selected, what rendering operations are being grouped, and when and why they aren't; how many GL operations are submitted to the pipeline to render the same scene, etc.

I have included 3 initial examples that try to illustrate how batching and merging of "compatible" draws (those sharing the same underlying primitive) improve performance when ordering is ignored, while each element can still have its own color, layout and transformation. For each example, I have written a Skia counterpart that tries to render exactly the same thing, to the extent possible, for the sake of comparison.

The data below corresponds to runs on my laptop, a ThinkPad T440p running Debian GNU/Linux, with an integrated Intel GPU (4th generation) and the OpenGL driver provided by Mesa 10.2.1.

I used apitrace to look at what GL commands are actually sent to the driver.

Let’s start with the RectsAndText example. It draws a lot of alternating filled rectangles and character glyphs, each with its own color, transformation and layout. In the screencast below, both examples (Skia and glr) are running at the same time. This of course does not reflect real performance, since both compete for GPU resources, but I decided to record it this way because the improvement is much easier to notice. The frames per second decrease proportionally for both examples when run at the same time, so the comparison remains relevant.

The window on the left corresponds to the Skia example, and the one on the right to glr. The same goes for all the screencasts below.



This video file is also available for download.

The difference is considerable. Skia performs at an average of 6-7 FPS while the new approach gives around 40 FPS. That’s a 5x-6x improvement, not bad. Also, notice that CPU usage is considerably higher in the case of Skia.

The interesting thing here is that in the case of glr, all elements are batched together (both rectangles and chars), so only one drawing operation is actually submitted to the pipeline, as you can see in the available apitrace dump. A trace for the corresponding Skia example is also available.



apitrace output for RectAndText glr example



apitrace output for RectAndText Skia example

The next example is Rects, which is similar but renders only rectangles, alternating between filled and stroked. The interesting bit is that in the case of glr, each style of rectangle is drawn onto a different layer, each layer operating in its own separate thread, demonstrating that parallelism is now possible during batching.



This video file is available for download.

In this example, the performance difference is even higher. glr is around 8x faster. Again, apitrace traces for the glr example and the Skia version are available. This time glr submits a total of 2 instanced drawing operations, one for filled rects and one for stroked.



apitrace output for Rects glr example



apitrace output for Rects Skia example

The last example draws several layers of non-overlapping rounded rectangles. As with previous examples, every element is given a unique layout, color and transformation. This example tries to illustrate that, because batching operates only at the layer level, the more layers you have the less you benefit from this technique. In this particular example the gap narrows considerably. In fact, it looks like Skia is faster by a few FPS, but that is not actually the case: when both examples run at the same time Skia is faster, but when run separately, the glr example is faster (though not by much). I’m still figuring this out.



This video file is available for download.

And the traces for the glr example and the Skia example.



apitrace output for Layers glr example



apitrace output for Layers Skia example


If you are curious about the implementation, take a look at where most of the magic happens: the GlrContext, GlrCanvas and GlrBatch objects, and the vertex shader. The rest of the code is mostly API and glue to provide a coherent way to use this approach. Specifically, an abstract concept of "layer" is introduced. The workflow goes this way:

  • For initialization, a context, a rendering target and a canvas object are created. This is similar to how other 2D libraries work.
  • In the rendering loop and for each frame, the canvas is first notified that a new frame will be rendered.
  • Then any number of layer objects are created and attached to the canvas. The drawing API works against a layer (instead of a canvas), and will group and batch all the drawing operations in internal commands. When drawing to a layer finishes, the layer is notified that it is ready.
  • Finally, the canvas is requested to finish the frame, right before swapping buffers. This call will wait for all the attached layers to finish (blocking if needed). Once all complete, the canvas will take the batched commands from each layer, in the order they were attached, and submit them to the pipeline for rendering.

One thing to remark is that layers are self-contained stateful objects, and can survive frames without needing to redraw.
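To make that workflow concrete, here is a sketch of a rendering loop against this kind of API. The function and type names below are illustrative only (check the repository for the real API); what matters is the call sequence:

/* Hypothetical names; the call order mirrors the four steps above. */
typedef struct GlrCanvas GlrCanvas;
typedef struct GlrLayer  GlrLayer;
typedef struct GlrStyle  GlrStyle;

void glr_canvas_start_frame(GlrCanvas *canvas);
void glr_layer_draw_rect(GlrLayer *layer, float x, float y,
                         float w, float h, const GlrStyle *style);
void glr_layer_finish(GlrLayer *layer);
void glr_canvas_finish_frame(GlrCanvas *canvas);

void render_frame(GlrCanvas *canvas, GlrLayer *layer, const GlrStyle *style)
{
    /* 1. notify the canvas that a new frame starts */
    glr_canvas_start_frame(canvas);

    /* 2. draw against the layer, possibly from a worker thread; the
     *    commands are only batched here, nothing reaches the GPU yet */
    glr_layer_draw_rect(layer, 0, 0, 800, 600, style);

    /* 3. mark the layer as ready */
    glr_layer_finish(layer);

    /* 4. wait for all attached layers, then submit their batches to
     *    the pipeline in attachment order; swap buffers afterwards */
    glr_canvas_finish_frame(canvas);
}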

Other benefits

One by-product of the fact that layers cache drawing operations as internal commands (which in turn use locally allocated buffers) is that layers become data-parallel. This is a term rarely used in the context of OpenGL because, as you probably know, the way its API is designed makes it a giant state machine, rendering any parallelization impractical.

With this approach, layers can be drawn in separate threads (or fully moved to OpenCL), which can bring extra performance when several complex layers need drawing at the same time.

Another potential benefit comes from the fact that the canvas renders to a target that is actually a framebuffer backed by a multisample texture. This means we can use any previously rendered frame as a texture, the same way it currently works in both Chromium and WebKit, where layers are texture-mapped and then composited into the final scene.

So we have the flexibility that, if a particular layer is too complex or slow to draw, we can attach it alone to a canvas, render it, and use the resulting texture as in the current model. But if we are short on texture memory, it is possible to keep the commands batched in layers and render them on every frame. This is somewhat similar to what Chromium does, recording draw operations into an SkPicture and then playing them back when needed.

Future work

This is an approach that needs validation against a number of real-world use cases before it can even be considered for testing in a Web engine. It is key to explore how complex information (for example, multi-step gradient backgrounds, or complex border styling with rounded rectangles) can be passed to the shaders and rendered correctly and efficiently. There are also shadows and blurring effects, all parametrized to cover the most creative use cases, that need verification against this model.

Basically, we need to understand the limits of the approach by trying to implement modern W3C specs, selecting the most complex features first.

Other important priorities are:

  • Understand how much workload can be imposed on the shader side before the gained performance starts to degrade.
  • Test on OpenGL-ES and constrained GPUs on embedded ARM boards, to determine the minimum requirements.
  • Figure out how to implement a mid-frame flushing mechanism for when the texture cache exhausts or command buffers get too large. This is not trivial: to flush a layer (that is possibly running in a separate thread) it has to be blocked, then the canvas has to wait for all layers below it to finish and execute their commands, and then signal the blocked layer to continue.
  • Try out how scrolling would behave if previously batched layers are drawn on every frame, instead of using current scrolling techniques that rely on rolling big textures or moving several tiles up and down. These techniques impose either great pressure on texture memory or a lot of complexity in tile management (or both), especially in the context of new super-high-resolution screens.

Conclusions and final words

I have tried to detail an idea that, although not new, I believe has not been explored in full in the context of Web engines. It relies on two essential hypotheses:

  • That it is possible to batch not only geometry, but also the complex attributes of arbitrarily styled HTML elements, and render that geometry as instances using shader code.
  • That it is safe to ignore the ordering of draw operations during the rasterization phase, and rely on the Web engine’s layer tree to solve overlapping.

Modern GPUs and OpenGL APIs have great potential for optimizing 2D rasterization but, as happens most of the time, there is no one solution that fits all. Instead, each particular application and use case requires a different set of strategies and trade-offs for optimum performance.

This approach, even if it proves valid for a sufficient number of use cases, is unlikely to be faster than existing approaches for all test cases, much less replace them. This is pretty clear in the case of Canvas 2D, for example, which will continue to require a general purpose rasterizer. But if there is a sufficient number of use cases that would benefit from this approach to some degree, then maintaining one code path that enables it will already be a win.

Finally, I want to thank Samsung SRA for partially sponsoring the time I dedicated to pursuing this idea, and also Igalia and the igalians, who are always there to back me up and help me move forward.

Now, is there anyone interested in helping me explore this idea further?

by elima at July 04, 2014 08:28 AM

July 01, 2014

Andy Wingo

flow analysis in guile

Greets, and welcome back to the solipsism! I've been wandering the wilderness with my Guile hackings lately, but I'm finally ready to come back to civilization. Hopefully you will enjoy my harvest of forest fruit.

Today's article is about flow analysis and data structures. Ready? Let's rock!

flow analysis

Many things that a compiler would like to know can be phrased as a question of the form, "What do I know about the data flowing through this particular program point?" Some things you might want to know are:

  1. The set of variables that must be live.

  2. The set of variables that happen to be live. This is the same as (1) except it includes variables that aren't needed but haven't been clobbered by anything.

  3. The set of expressions whose results are still valid (i.e., haven't been clobbered by anything else).

  4. An upper and lower bound on the range of numeric variables.

Et cetera. I'll talk about specific instances of flow analysis problems in the future, but today's article is a bit more general.

The first thing to note about these questions is that they don't necessarily need or have unique answers. If GCC decides that it can't prove anything about the ranges of integers in your program, it's not the end of the world -- it just won't be able to do some optimizations that it would like to do.

At the same time, there are answers that are better and worse than others, and answers that are just invalid. Consider a function of the form:

int f():
  int a = 1
  int b = 2
  int c = a + b
  int d = b + c
  ...
  int z = x + y
  return z

In this function, there are 27 different expressions, including the return, and 27 different program points. (You can think of a program point as a labelled sub-expression. In this example none of the expressions have sub-expressions.) If we number the program points in order from 0 to 26, we will have a program that first executes expression 0 (int a = 1), then 1, and so on to the end.

Let's plot some possible solutions to the live variable flow-analysis problem for this program.

Here we see two solutions to the problem (in light and dark blue), along with a space of invalid solutions (in red). The Y axis corresponds to the variables of the program, starting with a on the bottom and finishing with z on the top.

For example, consider position 4 in the program, corresponding to int e = c + d. It is marked in the graph with a vertical bar. After position 4, the only values that are used in the rest of the program are d and e. These are the variables that are contained within the light-blue area. It wouldn't be invalid to consider a, b, and c to be live also, but it also wouldn't be as efficient to allocate space and reason about values that won't contribute to the answer. The dark blue space holds those values that may harmlessly be considered to be live, but which actually aren't live.

It would, however, be invalid to consider the variable f to be live after position 4, because it hasn't been defined yet. This area of the variable space is represented in red on the graph.

Of course, the space of all possible solutions isn't possible to represent nicely on a two-dimensional graph; we're only able to show two with colors, and that not very well as they overlap. This difficulty cuts close to the heart of the data-flow problem: that it ultimately requires computing a two-dimensional answer, which necessarily takes time and space O(n²) in program size.

Or does it?

classical flow analysis frameworks

The classical way to do flow analysis is to iterate a set of data-flow equations over a finite lattice until you reach a fixed point.

That's a pithy sentence that deserves some unpacking. If you're already comfortable with what it means, you can skip a couple sections.

Still here? Cool, me too. Let's take a simple example of sign analysis. The problem is to determine, for the integer variables of a program, at every point in the program, which ones may be negative (-), which ones may be zero (0), and which may be positive (+). All of these are conceptually bit-flags.

For example, in this program:

int f(int x):
 L0:  while (x >= 0)
 L1:    int y = x - 1
 L2:    x = y
 L3:  return x

We can assign the flags -0+ to the argument x as the initial state that flows into L0, because we don't know what it is ahead of time, and it is the only variable in scope. We start by representing the initial state of the solution as a set of sets of state values:

state := {L0: {x: -0+}, L1: Ø, L2: Ø, L3: Ø}

In this notation, Ø indicates a program point that hasn't been visited yet.

Now we iterate through all labels in the program, propagating state to their successors. Here is where the specific problem being solved "hooks in" to the generic classical flow analysis framework: before propagating to a successor, a flow equation transforms the state that flows into a program point to a state that flows out, to the particular successor. In this case we could imagine equations like this:

visit_test(expr, in, true_successor, false_successor):
  if expr matches "if var >= 0":
    # On the true branch, var is not negative.
    propagate(in + {var: in[var] - -}, true_successor)
    # On the false branch, var is not zero and not positive.
    propagate(in + {var: in[var] - 0+}, false_successor)
  else if ...

visit_expr(expr, in, successor):
  if expr matches "left = right - 1":
    if in[right] has +:
      if in[right] has 0:
        # Subtracting one from a non-negative arg may be negative.
        propagate(in + {left: in[right] + -}, successor)
      else
        # Subtracting one from a positive arg may be 0.
        propagate(in + {left: in[right] + 0}, successor)
    else:
      # Subtracting one from a nonpositive arg will be negative.
      propagate(in + {left: -}, successor)
  else if expr matches "left = right":
    propagate(in + {left: in[right]}, successor)
  ...

The meat of classical data-flow analysis is the meet operation:

propagate(out, successor):
  if state[successor] is Ø:
    state[successor] = out
  else
    state[successor] = meet(out, state[successor])

# A version of meet for sign analysis
meet(out, in):
  return intersect_vars_and_union_values(out, in)

Let's run this algorithm by hand over the example program. Starting from the initial state, we propagate the L0→L1 and L0→L3 edges:

visit_test("if x >= 0", {x: -0+}, L1, L3)
→ propagate({x: 0+}, L1)
→ state[L1] = {x: 0+}
→ propagate({x: -}, L3)
→ state[L3] = {x: -}

Neat. Let's keep going. The successor of L1 is L2:

visit_expr("y = x - 1", {x: 0+}, L2)
→ propagate({x: 0+, y: -0+}, L2)
→ state[L2] = {x: 0+, y: -0+}

L2→L0 is a back-edge, returning to the top of the loop:

visit_expr("x = y", {x: 0+, y: -0+}, L0)
→ propagate({x: -0+, y: -0+}, L0)
→ state[L0] = meet({x: -0+, y: -0+}, state[L0])
→ state[L0] = meet({x: -0+, y: -0+}, {x: -0+})
→ state[L0] = {x: -0+}

Finally, L3 has no successors, so we're done with this iteration. The final state is:

{L0: {x: -0+},
 L1: {x: 0+},
 L2: {x: 0+, y: -0+},
 L3: {x: -}}

which indeed corresponds with what we would know intuitively.

fixed points and lattices

Each of the steps in our example flow analysis was deterministic: the result was calculated from the inputs and nothing else. However the backwards branch in the loop, L2→L0, could have changed inputs that were used by the previous L0→L1 and L0→L3 forward edges. So what we really should do is iterate the calculation to a fixed point: start it over again, and run it until the state doesn't change any more.

It's easy to see in this case that running it again won't end up modifying the state. But do we know that in all cases? How do we know that iteration would terminate at all? It turns out that a few simple conditions are sufficient.

The first thing to ensure is that the state space being explored is finite. Here we can see this is the case, because there are only so many ways you can combine -, 0, and +. Each one may be present or not, and so we have 2^n = 2^3 = 8 possible states. The elements of the state array will be a set with at most one entry for each variable, so the whole state space is finite. That at least ensures that an answer exists.

Next, the "meet" operation has to be commutative, associative, and idempotent. The above example used intersect_vars_and_union_values. We intersect vars because it only makes sense to talk about a variable at a program point if the variable dominates the program point. It didn't make sense to propagate y on the L2→L0 branch, for example. It's usually a good idea to model a data-flow problem using sets, as set union and intersection operations fulfill these commutative, associative, and idempotent requirements.

Finally, the state being modelled should have a partial order, and functions that add information along control-flow edges -- above, visit_test and visit_expr -- should preserve this partial ordering. That is to say, visit_test and visit_expr should be monotonic. This means that no matter on what control paths data propagates, we keep building towards an answer with more information, making forward progress. This condition is also easily fulfilled with sets, or more generally with any lattice. (A lattice is nothing more than a data type that fulfills these conditions.)

Iterating the data-flow equations until the state stops changing will find a fixed point of the lattice. Whether you find the greatest or least fixed point is another question; I can't help linking to Paul Khuong's old article on Québécois student union strikes for a lovely discussion.

Another question is, how many iterations are necessary to reach a fixed point? I would first note that although in our walk-through we iterated in forward order (L0, L1, L2, L3), we could have visited nodes in any order and the answer would be the same. I'll cut to the chase and say that if:

  1. you represent your state with bitvectors

  2. the control-flow graph is reducible (has only natural loops)

  3. the meet operation on values is bitvector union or intersection

  4. you visit the program points in topologically sorted order

If these conditions are fulfilled, then you will reach a fixed point after LC + 2 iterations, where LC is the "loop-connectness number" of your graph. You can ensure (1), (3), and (4) by construction. (Reverse post-order numbering is an easy way to fulfill (4).) (2) can be ensured by using programming languages without goto (a for loop is always a natural loop) but can be violated by optimizing compilers (for example, via contification).

Loop connectedness is roughly equivalent to the maximum nesting level of loops in the program, which has experimentally been determined to rarely exceed 3. Therefore in practice, data-flow analysis requires a number of steps that is O(n * 5) = O(n) in program size.

For more information on data-flow analysis, including full proofs and references, see Carl Offner's excellent, excellent manuscript "Notes on Graph Algorithms used in Optimizing Compilers". I don't know of any better free resource than that. Thanks, Carl!

an aside: the kCFA algorithms

I just finished describing what I called "classical" data-flow analysis. By that I mean to say that people have been doing it since the 1970s, which is classical enough as far as our industry goes. However with the rise of functional languages in the 1980s, it became unclear how to apply classical data-flow analysis on a language like Scheme. Let's hear it from the horse's mouth:

This brings us to the summer of 1984. The mission was to build the world's most highly-optimising Scheme compiler. We wanted to compete with C and Fortran. The new system was T3, and the compiler was to be called Orbit. We all arrived at WRL and split up responsibility for the compiler. Norman was going to do the assembler. Philbin was going to handle the runtime (as I recall). Jonathan was project leader and (I think) wrote the linker. Kranz was to do the back end. Kelsey, the front end. I had passed the previous semester at CMU becoming an expert on data-flow analysis, a topic on which I completely grooved. All hot compilers do DFA. It is necessary for all the really cool optimisations, like loop-invariant hoisting, global register allocation, global common subexpression elimination, copy propagation, induction-variable elimination. I knew that no Scheme or Lisp compiler had ever provided these hot optimisations. I was burning to make it happen. I had been writing 3D graphics code in T, and really wanted my floating-point matrix multiplies to get the full suite of DFA optimisation. Build a DFA module for T, and we would certainly distinguish ourselves from the pack. So when we divided up the compiler, I told everyone else to back off and loudly claimed DFA for my own. Fine, everyone said. You do the DFA module. Lamping signed up to do it with me.

Lamping and I spent the rest of the summer failing. Taking trips to the Stanford library to look up papers. Hashing things out on white boards. Staring into space. Writing little bits of experimental code. Failing. Finding out *why* no one had ever provided DFA optimisation for Scheme. In short, the fundamental item the classical data-flow analysis algorithms need to operate is not available in a Scheme program. It was really depressing. I was making more money than I'd ever made in my life ($600/week). I was working with *great* guys on a cool project. I had never been to California before, so I was discovering San Francisco, my favorite city in the US and second-favorite city in the world. Silicon Valley in 1984 was beautiful, not like the crowded strip-mall/highway hell hole it is today. Every day was perfect and beautiful when I biked into work. I got involved with a gorgeous redhead. And every day, I went in to WRL, failed for 8 hours, then went home.

It was not a good summer.

At the end of the summer, I slunk back to CMU with my tail between my legs, having contributed not one line of code to Orbit.

Olin Shivers, A history of T

It took him another 7 years, but Shivers stuck with it, and in the end came out with the family of algorithms known as k-CFA. Instead of focusing on loops, which Scheme doesn't have syntactically, Shivers used continuation-passing style to ruthlessly simplify Scheme into a dialect consisting of not much more than function calls, and focused his attention on function calls. The resulting family of flow algorithms can solve flow equations even in the presence of higher-order functions -- a contribution to computer science born out of necessity, failure, and stubbornness.

With all those words, you'd think that I'd be itching to use k-CFA in Guile, and I'm not. Unfortunately even the simplest, least expressive version (0-CFA) is O(n³); 1-CFA is exponential. I don't have time for that. Instead, Guile is able to use classical DFA because it syntactically distinguishes labelled continuations and functions, and contifies functions to continuations where possible, which makes the Scheme DFA problem exactly the same as in any other language.

n times what?

Now that we have established that the number of visit operations is O(n), it remains to be seen what the individual complexity of a visit operation is in order to determine the total complexity. The naïve thing is just to use bitvectors, with each of the bitvectors having as many entries as the program has variables, times however many bits we are using.

This leads to O(|L|*|V|) space and time complexity, where |L| is the number of program points (labels) and |V| is the number of variables. As the number of variables is generally proportional to the size of the program, we can approximate this as O(n²).

In practice, this means that we can use data-flow analysis on programs up to about 10000 labels in size. Sign analysis on a 10000-label function would require 10000² * 3 bits / 8 = 37.5 MB of memory, which is already a bit hefty. It gets worse if you need to represent more information. I was recently doing some flow-sensitive type and range inference, storing 12 bytes per variable per program point; for a 10000-label function, that's more than a gigabyte of memory. Badness.
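For the record, here is a sketch of the dense representation being discussed: one bitvector per label, one bit per variable (the 1-bit case), with meet as bitwise union. The code is C for concreteness, while Guile implements the equivalent in Scheme; the |L|*|V| allocation is exactly the quadratic cost in question:

#include <stdlib.h>
#include <stdint.h>

typedef struct {
    size_t nlabels, nvars, words_per_label;
    uint32_t *bits;    /* nlabels * words_per_label 32-bit words */
} dfa_state;

dfa_state *dfa_state_new(size_t nlabels, size_t nvars)
{
    dfa_state *s = malloc(sizeof(*s));
    s->nlabels = nlabels;
    s->nvars = nvars;
    s->words_per_label = (nvars + 31) / 32;
    s->bits = calloc(nlabels * s->words_per_label, sizeof(uint32_t));
    return s;
}

/* Meet as bitvector union. Returns nonzero if 'dest' changed, which
 * is the signal to keep iterating towards the fixed point. */
int dfa_meet_union(dfa_state *s, size_t dest, size_t src)
{
    uint32_t *d = s->bits + dest * s->words_per_label;
    const uint32_t *c = s->bits + src * s->words_per_label;
    uint32_t changed = 0;
    for (size_t i = 0; i < s->words_per_label; i++) {
        uint32_t old = d[i];
        d[i] |= c[i];
        changed |= d[i] ^ old;
    }
    return changed != 0;
}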

shared tails

Although it was the type inference case that motivated this investigation, sign inference is similar and simpler, so let's go with that. The visit_expr and visit_test functions above are only ever going to add additional information about the variables that are used in or defined by an expression; in practice this is a small finite number. What if we chose a representation of state that could exploit this fact by only adding O(1) amounts of data, sharing a common tail with preceding expressions?

If we draw a control-flow graph for the sign analysis program, we get something like:

The goal is to create a data structure that looks like the dominator tree. For "normal" control-flow edges -- those whose destination only have one predecessor -- we can avoid the "meet" operations, and just copy the predecessor's out set to the successor's in set. We then define "meet" as an adjoin operation that effectively conses the new information onto a shared tail, if it wasn't there already. The first iteration through the CFG will initialize the shared tail of a given control-flow join to the set of variables flowing into the join's dominator. Subsequent information will adjoin (cons) on new incoming values. In this case the resulting data structure ends up looking like:

Here the italic references like L1 indicate shared structure, and the tuples annotating the edges represent additional information flow, beyond that information that was already present in the successor's dominator.

Of course, you can implement this with linked lists and it will work fine. The problem there will be lookup speed -- when your visit operation (visit_expr or visit_test) goes to look up the sign of a variable, or the same happens via the meet operation, you get O(n) lookup penalties. Anticipating this, I implemented this with a version of Phil Bagwell's vhashes, which promise O(log n) variable lookup. See Guile's documentation, or Bagwell's excellent paper.

Note that you can't remove items from sets once they have been added in a shared-tail flow analysis; to keep the meet function monotonic, you have to instead insert tombstone entries. Not so nice, but it is what it is.

A shared-tail flow analysis consumes only O(1) additional memory per node, leading to O(n) space complexity. I have some measured space and time graphs below that show this experimentally as well.

space and time

Unfortunately, lookup time on branchy vhashes is really terrible: O(log n) in the best case, and O(n) at worst. This is especially acute because there is no easy way to do unions or intersections on vhashes -- you end up having to compute the unshared heads of the two vhashes you are merging, and looking up elements in one in the other... I could go on, but I think you believe me when I say it gets complicated and slow. It's possible to beat a bitvector approach in time for relatively "big" problems like type analysis, but for common subexpression elimination where I was just storing a bit per expression, it was tough to beat the speed of bitvectors.

I started looking for another solution, and in the end came on a compromise that I am much happier with, and again it's Phil Bagwell to the rescue. Instead of relying on vhashes that explicitly share state, I use Clojure-style persistent sparse bit-sets and bit-maps that share state opportunistically.

Guile's intset module implements a bitvector as a functional tree whose branches are vectors and whose leaves are fixnums. Each leaf represents one range of 32 integers, and each branch on top of it increases the range by a factor of 8. Branches can be sparse, so not all integers in the range of an intset need leaves.

As you would expect, adjoining an element onto such a tree is O(log n). Intersecting is much faster than vhashes though, as intsets partition the key space into power-of-two blocks. Intsets try hard to share state, so that if your adjoin would return the same value, the result is the same object, at the same address. This allows sub-trees to be compared for equality via pointer comparison, which is a great fast-path for intersection and union.

Likewise, Guile's new intmap module allows the association of larger values with integer keys.

science! fetch your white coats and lab books!

I had the chance to actually test the system with all three of these data structures, so I compiled one of Guile's bigger files and recorded the memory used and time taken when solving a 1-bit flow analysis problem. This file has around 600 functions, many of them small nested functions, many of them macro-generated, some of them quite loopy, and one big loopless one (6000 labels) to do the initialization.

First, a plot of how many bytes are consumed per label while solving this 1-bit DFA.

Note that the X axis is on a log scale.

The first thing that pops out at me from these graphs is that the per-label vhash overhead is indeed constant. This is a somewhat surprising result for me; I thought that iterated convergence would make this overhead depend on the size of the program being compiled.

Secondly we see that the bitvector approach, while quadratic in overall program size, is still smaller until we get to about 1000 labels. It's hard to beat the constant factor for packed bitvectors! Note that I restricted the Y range, and the sizes for the bitvector approach are off the charts for N > 1024.

The intset size is, as we expected, asymptotically worse than vhashes, but overall not bad. It stays on the chart at least. Surprisingly, intsets are often better than vhashes for small functions, where we can avoid allocating branches at all -- note the "shelves" in the intset memory usage, at 32 and 256 entries, respectively, corresponding to the sizes that require additional levels in the tree. Things keep on rising with n, but sublinearly (again, recall that the X axis is on a log scale).

Next, a plot of how many nanoseconds it takes per label to solve the DFA equation above.

Here we see, as expected, intsets roundly beating vhashes for all n greater than about 200 or so, and show sublinear dependence on program size.

The good results for vhashes on the largest function are because that function doesn't have any loops, and hardly any branching either. It's the best case for vhashes: all appends and no branches. Unfortunately, it's not the normal case.

It's not quite fair to compare intsets to bitvectors, as Guile's bitvectors are implemented in C and intsets are implemented in Scheme, which runs on a bytecode VM right now. But still, the results aren't bad, with intsets even managing to beat bitvectors for the biggest function. The gains there probably pay for the earlier losses.

This is a good result, considering that the goal was to reduce the space complexity of the algorithm. The 1-bit case is also the hardest case; when the state size grows, as in type inference, the gains of using structure-sharing trees grow accordingly.

conclusion

Let's wrap up this word-slog. A few things to note.

Before all this DFA work in Guile, I had very little appreciation of the dangers of N-squared complexity. I mean, sometimes I had to think about it, but not often, especially when the constant factors are low, or so I thought. But I got burned by it; hopefully the next time, if any, will be a long time coming.

I was happily, pleasantly surprised at the expressiveness and power of Bagwell/Clojure-style persistent data structures when applied to the kinds of problems that I work on. Space-sharing can make a fundamental difference to the characteristics of an algorithm, and Bagwell's data structures can do that well. Intsets simplified my implementations because I didn't have to reason much about space-sharing on my own -- finding the right shared tail for vhashes is, as I said, an unmitigated mess.

Finally I would close by saying that I was happy to fail in such interesting (to me) ways. It has been a pleasant investigation and I hope I have been able to convey some of the feeling of it. If you want to see what the resulting algorithm looks like in practice, see compute-truthy-expressions.

Until next time, happy hacking!

by Andy Wingo at July 01, 2014 08:00 AM

June 26, 2014

Jacobo Aragunde

Tales of LibreOffice interoperability: shape effects

We continue introducing features that will be part of the 4.3.0 release of LibreOffice, which is coming soon. After having worked on the preservation of color in shapes, we moved on to the different effects that can be applied to shapes and bitmaps.

There are three types of effects that are managed separately in the DrawingML specification.

General shape effects

Examples of these effects are inner or outer shadows, reflections, glow… They can be applied to both vector shapes and bitmaps, and several of them can be applied at the same time.

Shape effects sample

These effects are indicated with the a:effectLst tag inside the shape properties tag spPr. I won’t explain their specification in detail because you can find a very good description on this website, but you can get an idea by taking a look at the following example, where three effects are applied: glow, inner shadow and reflection:

<a:effectLst>
  <a:glow rad="63500">
    <a:schemeClr val="accent2">
      <a:satMod val="175000" />
      <a:alpha val="40000" />
    </a:schemeClr>
  </a:glow>
  <a:innerShdw blurRad="63500" dist="50800"
  dir="2700000">
    <a:prstClr val="black">
      <a:alpha val="50000" />
    </a:prstClr>
  </a:innerShdw>
  <a:reflection blurRad="6350" stA="52000"
  endA="300" endPos="35000" dir="5400000"
  sy="-100000" algn="bl"
  rotWithShape="0" />
</a:effectLst>

Notice that some effects are defined by attributes alone, while others contain color specifications as child elements, like the ones explained in the previous post.

3D effects

Shapes and bitmaps can be transformed into 3D objects and get lighting and camera modifications applied to them.

Shape 3D effects sample

These effects are basically controlled by two child tags of spPr. One of them is a:scene3d, which controls the camera and lighting; the other one is a:sp3d, which controls the transformation of the shape into a 3D object, adding extrusion, bevels and a material effect to the surface. On the same website I linked before, you can read a description of the scene3d and sp3d tags and their children. Find an example of their combined use below:

<a:scene3d>
  <a:camera prst="perspectiveRelaxedModerately"
  zoom="150000">
    <a:rot lat="19490639" lon="0"
    rev="12900001" />
  </a:camera>
  <a:lightRig rig="threePt" dir="t">
    <a:rot lat="0" lon="0" rev="4800000" />
  </a:lightRig>
</a:scene3d>
<a:sp3d z="488950" extrusionH="63500"
prstMaterial="metal">
  <a:bevelT w="165100" prst="coolSlant" />
  <a:extrusionClr>
    <a:schemeClr val="tx2" />
  </a:extrusionClr>
</a:sp3d>

Artistic effects

Effects from the last category act like the filters found in image manipulation programs (blur, grain or background removal, among others), and that’s why they can only be applied to bitmaps. This is actually not part of the DrawingML spec but an extension of it.

There is an important difference from other filters: these come pre-calculated in the document. The bitmap linked by the DrawingML shape already has the effect applied, and the effect specification links a second bitmap that contains the original picture so the effect can be undone. This second bitmap is saved in the relatively new lossless Windows Media Photo format.

Writer screenshot showing artistic effects

Check the following example of a blip-filled shape; the actual filling comes from the file linked as rId6, while the effect definition is linked to rId7, which is a copy of the original image before the filter was applied:

<a:blip r:embed="rId6" cstate="print">
  <a:extLst>
    <a:ext uri="{BEBA8EAE-BF5A-486C-A8C5-ECC9F3942E4B}">
      <a14:imgProps xmlns:a14="http://schemas.microsoft.com/office/drawing/2010/main">
        <a14:imgLayer r:embed="rId7">
          <a14:imgEffect>
            <a14:artisticLightScreen trans="10000" gridSize="6" />
          </a14:imgEffect>
        </a14:imgLayer>
      </a14:imgProps>
    </a:ext>
  </a:extLst>
</a:blip>

These are the relations between the ids and the files contained in the document, as specified at document.xml.rels:

<Relationship Id="rId6"
  Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
  Target="media/image2.png" />
<Relationship Id="rId7"
  Type="http://schemas.microsoft.com/office/2007/relationships/hdphoto"
  Target="media/hdphoto1.wdp" />

The funny thing about this approach is that LO was able to render these effects with no effort, although the program was not aware of the effect parameters or the original bitmap, and these were being lost on save.

Preservation

We again use the grab bag technique to save all the tags and attributes related to the effects as a hidden property, which will be used later in the export phase to rebuild the effect definitions. In the case of artistic effects, we additionally need to make sure that the original bitmap is preserved; LibreOffice doesn’t support the Windows Media Photo format yet, but we can keep the raw data stream and output it to a properly named file in the exported document. A small cache table is maintained by the exporter code to prevent the same original file from being saved more than once when two or more pictures apply effects to the same image.
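
The cache can be as simple as a map from the identity of the original image stream to the file name already emitted for it; here is a hypothetical sketch, not the actual exporter code:

#include <map>
#include <string>

// Map an original WMP stream's identity to the name of the file we
// already wrote, so a shared original image is exported only once.
std::map<std::string, std::string> exportedOriginals;

std::string exportOriginal(const std::string& streamId) {
    auto it = exportedOriginals.find(streamId);
    if (it != exportedOriginals.end())
        return it->second;                       // already saved: reuse
    std::string name = "media/hdphoto" +
        std::to_string(exportedOriginals.size() + 1) + ".wdp";
    // ... write the raw stream out to `name` in the package here ...
    exportedOriginals[streamId] = name;
    return name;
}

int main() {
    // Two shapes sharing one original image produce a single file.
    std::string a = exportOriginal("stream-A");
    std::string b = exportOriginal("stream-A");
    return a == b ? 0 : 1;   // same name: saved only once
}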

We have finished with the improvements related to shapes and pictures, but there are a few interoperability features not yet mentioned which will be covered in a future post. Like the current and previous ones, they were developed by Igalia and sponsored by CloudOn.

Happy hacking!

by Jacobo Aragunde Pérez at June 26, 2014 12:15 PM

June 19, 2014

Manuel Rego

CSS Grid Layout Automatic Placement

In his last post, my mate Sergio explained the different syntaxes to position elements inside a grid. Now it’s time to talk about the automatic placement feature, explain how it works and show some examples of its potential.

Auto-placement

The concept is defined in the specification:

Grid items that aren’t explicitly placed are automatically placed into an unoccupied space in the grid container.

So, let’s start with a simple example to show how it works. Imagine that you have the following 2×2 grid:

<div style="display: grid;">
    <div style="grid-row: 1; grid-column: 1 / span 2;">first</div>
    <div style="grid-row: 2; grid-column: 2;">second</div>
    <div>auto</div>
</div>

The first item uses the 2 columns of the 1st row, and the second item is placed in the 2nd row, 2nd column. That leaves an empty space in the 2nd row, 1st column, where the auto item is placed.

Simple example where the auto-positioned item is placed into the 2nd row, 1st column

Of course you can do more complex things, like setting a specific position in one dimension (row or column) and leaving the other as auto, requesting more than one cell for the item, etc., as you can see in the following example of a 3×3 grid:

<div style="display: grid;">
    <div style="grid-column: span 2;">item 1</div>
    <div style="grid-column: 3;">item 2</div>
    <div style="grid-row: span 2;">item 3</div>
    <div style="grid-row: span 2; grid-column: span 2;">item 4</div>
    <div style="grid-row: 2;">item 5</div>
</div>

Example animation of how different auto-positioned items are placed in the grid

Let’s analyze why elements are positioned as you can see in the animation above:

  • item 1: This is the first element placed; it needs 2 columns and goes in the 1st row, 1st-2nd columns, as they’re empty.
  • item 2: This item only specifies the column, so it’s placed in the 1st row, 3rd column, which is the first empty cell in the 3rd column.
  • item 3: In this case the item needs 2 rows, and it’s placed in the 2nd-3rd rows, 1st column.
  • item 4: Here the item needs 2 rows and 2 columns; we’re lucky, as there are enough empty cells, so it’s placed in the 2nd-3rd rows, 2nd-3rd columns.
  • item 5: This item needs to be placed in the 2nd row; as there are no empty cells left in that row, it’s placed in a new column outside the current grid, in the 2nd row, 4th column. Imagine that this item were completely auto (removing the “grid-row: 2;” style); then it’d be placed in a new row outside the current grid, in the 4th row, 1st column.
    Why in a new row and not in a new column? That’s managed by the grid-auto-flow property, which is explained below.

grid-auto-flow property

Again from the spec:

The grid-auto-flow property controls the direction in which the search for unoccupied space takes place, and whether rows or columns are added as needed to accommodate the content.

The simplest (and easiest to understand) values for this property are row (the default) and column. These values define the direction the auto-placement algorithm follows to place the auto-positioned items. For example, if the value is row, the algorithm will try to fill each row before jumping to the next one, adding new rows as necessary.

Also, this property allows combining these values (row and column) with other keywords (which cannot be used together):

  • dense: By default the auto-placement algorithm is sparse; that means that if placing an auto item has left some empty cells in the grid (because the item doesn’t fit), they won’t be used anymore. If dense is used, those holes will be filled if smaller items come up later.
  • stack: In this case the auto-placement algorithm looks for the first empty cell (as if it were adding a 1×1 item). Then it places all the auto-positioned items in that cell (regardless of their size), stacked atop one another.

As usual, it’s easier to understand with an HTML example:

<div style="display: grid;">
    <div style="grid-row: 1; grid-column: 2;">item 1</div>
    <div style="grid-column: span 2;">item 2</div>
    <div>item 3</div>
</div>

Example grid with different values for grid-auto-flow property

In the image you can check how the auto-positioned elements (item 2 and item 3) are placed depending on the different values of grid-auto-flow (a sketch of the container styles for each case follows the list):

  • sparse (grid-auto-flow: row;):
    • item 2: It looks for an empty area 2 columns wide, so it skips the cell in the 1st row, 1st column, leaving it as an empty space. The item is placed in the 2nd row, 1st-2nd columns.
    • item 3: It looks for the next empty space and adds the item in the 3rd row, 1st column. This leaves a hole in the 1st row, 1st column.
  • dense (grid-auto-flow: row dense;):
    • item 2: This is placed exactly the same as in the sparse case: in the 2nd row, 1st-2nd columns.
    • item 3: It looks for an empty cell and adds the item in the 1st row, 1st column, as it’s still empty (item 2 didn’t use it before because it doesn’t fit there).
  • stack (grid-auto-flow: stack row;):
    • item 2: It looks for the first empty cell and finds the 1st row, 1st column. All the auto-positioned items will then be placed there. So item 2 is placed in the 1st row, 1st-2nd columns (even if it takes more than one cell, it keeps using the first empty cell as reference).
    • item 3: Again, this is added in the 1st row, 1st column, on top of item 2.
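
For reference, here is a sketch of the container styles that would produce the three cases above; this follows the editor’s draft syntax of the time, and the stack keyword in particular was still experimental:

<!-- sparse: the default behaviour -->
<div style="display: grid; grid-auto-flow: row;">...</div>

<!-- dense: later items may fill holes left by earlier ones -->
<div style="display: grid; grid-auto-flow: row dense;">...</div>

<!-- stack: all auto items pile onto the first empty cell -->
<div style="display: grid; grid-auto-flow: stack row;">...</div>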

Possibilities

As you can imagine, this brings a lot of power to the CSS Grid Layout spec. You won’t always need to place your items explicitly in the grid; you can simply hand the grid itself the responsibility of finding the best position.

The typical example is a regular form:

<form>
    <label>Name</label><input type="text" />
    <label>Mail</label><input type="text" />
    <label>Comments</label><textarea></textarea>
    <label>Accept policy</label><input type="checkbox" />
    <div id="buttons">
        <button>Accept</button>
        <button>Cancel</button>
    </div>
</form>

You could just apply the following CSS and you’d have a formatted form “for free”:

    form            { display: grid;           }
    label           { grid-column: 1;          }
    input, textarea { grid-column: 2;          }
    #buttons        { grid-column: 1 / span 2; }

The result will be something like the following picture:

Formatting regular form using grid auto-placement feature

Wrap-up

We’ve created a small demo (as part of our grid examples repository) that will allow you to play with the auto-placement feature and understand it better:

http://igalia.github.io/css-grid-layout/autoplacement.html

As you read through the post, the auto-placement algorithm was mentioned several times. The algorithm and its implementation details will be explained in a subsequent post.

If you’re wondering about current support in Blink and WebKit, the basis of the auto-placement feature is already working in both engines. In Blink, span support in auto-positioned items has recently landed, and in WebKit the patch is ready to be integrated.
However, the new grid-auto-flow syntax (sparse, dense and stack) is still on the way (follow the Blink and WebKit bugs if you’re interested).
As you can see, we’ve been working hard on implementing the missing bits of this feature, and we hope it will be fully supported soon in both Blink and WebKit. :-)

Finally, we’d like to thank again Bloomberg for sponsoring our work around CSS Grid Layout.

Igalia & Bloomberg logos

Igalia and Bloomberg working together to build a better web

by Manuel Rego Casasnovas at June 19, 2014 03:36 PM

June 15, 2014

Javier Muñoz

Visit to INTECO's Cyber-Security Headquarters

Several weeks ago, I was invited by INTECO to attend the seminar 'Secure Coding in C and C++'. The event took place at INTECO's Cyber-Security headquarters in León, Spain.

It was a great coincidence because Robert, the person teaching this seminar, and I are part of the teams in SEI and Igalia collaborating on browser security.

As you may know, the mission of the National Institute for Communication Technologies (INTECO), located in León (Spain), is to strengthen cybersecurity, trust, and the protection of privacy with respect to services offered within the information society, providing value to the public, businesses, the Spanish Government, the Spanish academic and research network, the information technology sector and strategic sectors in general. It is a huge responsibility indeed.

This seminar was included in the Program on Excellence in Cybersecurity (PECS), which aims to find and promote talent in cybersecurity. The seminar focused on producing secure programs and designs using the C and C++ languages, and ran for four days, providing a detailed explanation of common programming errors in C and C++.

Robert organized the material to cover the following topics: string management, dynamic memory management, integer security, formatted output and file I/O. The lessons interleaved theory, practical exercises, mapping of issues to CERT coding standards, and debate around the flaws and vulnerabilities covered.
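
To give a flavour of the kind of material covered, here is a generic illustration in the spirit of CERT rule INT32-C ("ensure that operations on signed integers do not result in overflow"); this is not an excerpt from the seminar:

// Check for signed overflow before adding, instead of relying on
// undefined behaviour; in the spirit of CERT INT32-C.
#include <climits>
#include <iostream>

bool checked_add(int a, int b, int& sum) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return false;          // the addition would overflow: reject
    sum = a + b;
    return true;
}

int main() {
    int s;
    if (!checked_add(INT_MAX, 1, s))
        std::cout << "overflow avoided\n";
    return 0;
}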

Robert is a person with long experience in Information Security. He is currently the Secure Coding Technical Manager in the CERT Program of Carnegie Mellon's Software Engineering Institute. I enjoyed a great time talking to him about the state of the art in Information Security, avoidable software defects and vulnerabilities.

Finally, I would like to thank INTECO for hosting and organizing this event. They did a great job with the materials, infrastructure and organization. Congrats guys!

by Javier at June 15, 2014 10:00 PM

Collaborating with the Carnegie Mellon Software Engineering Institute on browser security

These last weeks I was really busy here at Igalia. We were hacking broadly on Chromium/Blink, attending BlinkOn 2, holding our Assembly, enjoying one of our summits and so on. On top of all these things, we started collaborating with the Carnegie Mellon Software Engineering Institute (SEI) on browser security too. Great news!

On the other hand, Robert commented on the SEI blog about the importance of keeping secure coding practices in mind to prevent vulnerabilities while coding, and how the CERT Secure Coding Initiative at the SEI is supporting this approach with completed standards for C and Java. By the way, coding standards for C++, Perl and other languages are under development too.

CERT coding standards are valuable resources for any programmer who cares about information security. As Robert highlights, secure coding standards itemize coding errors that are the root causes of current software vulnerabilities, prioritizing them by severity, likelihood of exploitation, and remediation costs. Each rule in the standard includes examples of insecure code, as well as secure alternative implementations.

If you are interested in our collaboration with SEI and the research project to evaluate the costs of producing a CERT-conforming implementation of the Chromium browser, you should not skip his post. It also introduces the other lines of work and the colleagues collaborating on this project.

by Javier at June 15, 2014 10:00 PM

June 13, 2014

Carlos Alberto López Pérez

WebKitGTK+ Performance Bot!

Lately I have been working on the WebKitGTK+ build bots that Igalia maintains.

Some of the improvements we achieved include a new performance bot, which is the most interesting one. You can check the numbers at http://perf.webkit.org. Those numbers are not valid for comparing performance between the different WebKit ports (EFL/GTK+/Mac), because each of these performance bots runs on different hardware. The idea is to track performance between revisions, and identify the commits that have a performance impact.

The performance bot we deployed is not very powerful (i5 661), but it does the job. I took extra care to ensure that external factors don’t affect the performance results of the tests on the machine. This is a summary of those measures:

  • I disabled cron and every other daemon.
  • Gave the performance test processes very high priority (nice -19, ionice -c1).
  • Disabled Hyper-Threading: While the tests are run in sequential order, the JavaScript engine (JSC) or other parts of WebKit can be parallelized. HT is great if you want to squeeze the maximum performance out of your machine, but it can become a problem when you care about consistent performance results (or about real time).
  • By default the performance tests are run inside Xvfb, and this is not great if you are interested in the most realistic scenario possible. So I patched WebKit to allow running the tests on the native X display.

These are some details of the bot we deployed:

  • The machine is an Intel Core i5 661 with 8GB of RAM.
  • As explained, we run the tests on a real X server:
    $ glxinfo | grep "render.*:"
    direct rendering: Yes
    OpenGL renderer string: Mesa DRI Intel(R) Ironlake Desktop 
    
  • The OS where the tests are run is Debian Jessie.
  • The DE is GNOME 3: I disabled every power-save feature of GNOME (including the screensaver). Since the window of the browser that runs the tests is positioned at the top-left of the screen, I installed the Activities Configurator extension, which allows hiding the activities top bar and disabling the hot upper-left corner.

If you want to run the performance tests on your machine, this is as easy as:

Tools/Scripts/run-perf-tests --platform=gtk

If you want to run them on your X display, simply export the variable USE_NATIVE_XDISPLAY before:

export USE_NATIVE_XDISPLAY=1
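
Putting the pieces together, a hypothetical one-liner combining the native display with the priority settings used on the bot would be:

# Raising priority requires root; env preserves the variable across sudo.
sudo env USE_NATIVE_XDISPLAY=1 nice -n -19 ionice -c1 \
    Tools/Scripts/run-perf-tests --platform=gtk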

If you don’t want to run the full suite (it takes no less than 2 hours), you can specify the test names (look in the PerformanceTests directory for the names).

For example, to run the Speedometer test (DoYouEvenBench was renamed to Speedometer):

Tools/Scripts/run-perf-tests --platform=gtk Speedometer/Full.html

Check the WebKit wiki for further information and keep an eye on http://perf.webkit.org: if you notice any performance regression, please let us know by opening a bug.

by clopez at June 13, 2014 01:33 PM

June 05, 2014

Jacobo Aragunde

Tales of LibreOffice interoperability: shape theme colors and styles

This is the latest chapter of the series about interoperability features that will be part of LibreOffice 4.3, brought to you by Igalia and sponsored by CloudOn, like the previous ones.

Last week we explained our work on theme color preservation for fonts and paragraphs, and now we move to a closely related topic: the preservation of theme colors and styles on shapes.

XML definitions

DrawingML includes new conventions to name colors, broader than those in the initial OOXML spec used for paragraphs and fonts. Such color definitions can be used in the context of shape properties or style definitions to define fillings, line colors, gradients…

In the first place, English color names (red, black, etc.) can be used besides RGB values and theme color names; and for any color, a set of transformations can be defined, similar to the tint and shade properties in font and paragraph colors, but with more options. An important transformation is alpha, which defines the transparency of the color.

Find below one example of each color tag (for color names, RGB values or theme colors), with or without transformations:

<!-- color by name, with an alpha transform -->
<a:prstClr val="black">
  <a:alpha val="50000"/>
</a:prstClr>

<!-- color by RGB value, no transforms -->
<a:srgbClr val="FF0000"/>

<!-- theme color, with two transforms -->
<a:schemeClr val="accent1">
  <a:satMod val="175000"/>
  <a:alpha val="40000"/>
</a:schemeClr>

Shapes in DrawingML receive their attributes from two different sources: the style definitions and the shape’s own properties. The former act as a fallback for the latter: attributes from style definitions are only applied if they are not overwritten by shape properties. For example, if a shape contains both a filling definition among its properties and a filling style, the filling definition prevails.

The theme file defines a list of filling and line styles so they can be used by shapes. This is an example of an area filling style which adds a couple of color transformations:

<a:fillStyleLst>
  <a:solidFill>
    <a:schemeClr val="phClr">
      <a:tint val="15000"/>
      <a:satMod val="350000"/>
    </a:schemeClr>
  </a:solidFill>
  <!-- more fill style definitions -->
</a:fillStyleLst>

Notice it’s not using any particular color; phClr is an entry parameter of sorts for a color that will be indicated in the shape style definition. The style definition can add some transformations or even define a gradient fill based on that color.

A line style definition can be a bit more complex, because it has more format options: width, dash pattern, single or double… These options are available for shape properties too.

<a:lnStyleLst>
  <a:ln w="9525" cap="flat" cmpd="sng" algn="ctr">
    <a:solidFill>
      <a:schemeClr val="phClr"/>
    </a:solidFill>
    <a:prstDash val="solid"/>
  </a:ln>
  <!-- more line style definitions -->
</a:lnStyleLst>

This is how a shape in the document would be assigned the second line style and the first filling style, specifying the "accent1" and "accent2" colors respectively as parameters:

<wps:style>
  <a:lnRef idx="2"> <!-- use the second element from lnStyleLst -->
    <a:schemeClr val="accent1"/>
  </a:lnRef>
  <a:fillRef idx="1"> <!-- use the first element from fillStyleLst -->
    <a:schemeClr val="accent2"/>
  </a:fillRef>
  ...
</wps:style>

Finally, this is how a shape can overwrite the style-defined attributes for its filling and line. The following XML chunk defines a solid filling for the area using the "accent3" color and a solid color for the shape line using "accent4":

<wps:spPr> <!-- shape properties tag -->
  ...
  <a:solidFill> <!-- area filling -->
    <a:schemeClr val="accent3"/>
  </a:solidFill>
  <a:ln> <!-- line properties -->
    <a:solidFill>
      <a:schemeClr val="accent4"/>
    </a:solidFill>
  </a:ln>
  ...
</wps:spPr>

The importer code

The duty of the importer code is, in general, applying the style definitions, merging them with the shape properties and transforming the properties in a way that matches LibreOffice’s internal data model.

Applying the style definitions means getting the corresponding styles from the theme file and applying them to the specific colors indicated in the shape wps:style block. This results in a set of properties that has to be merged with the shape-specific ones, and this is done so the latter have priority, as explained before.

All properties must be transformed to match the LibreOffice data model, especially colors because of their complexity. Theme colors are translated to their RGB value by checking the theme definition to match the color name with its value. As for color transformations, the importer applies all the transformations to the original color to get a final RGB value. The only exception is the alpha transformation, which is converted into a transparency value to be stored in the fill properties object (alpha and transparency are complementary; after homogenizing their units, alpha = 1 – transparency).
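
For instance, the a:alpha val="40000" in the first example above denotes 40% (these values are expressed in thousandths of a percent); after homogenizing units, the importer stores a transparency of 1 - 0.40 = 0.60, that is, 60%.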

LibreOffice doesn’t have equivalent concepts to theme colors and color transformations other than alpha, and doesn’t have shape styles either. To prevent information loss, we use a hidden property in the Shape object, called the interop grab bag, and fill it with any properties we need to rebuild this information on export:

  • To preserve a theme color, we need to store its name and the complete list of transformations. We will also need the RGB value of the color with all the transformations applied, so we can compare it with the final color to know if the user has changed it. We do it both for line and filling colors.
  • For preset and RGB colors, the loss of the transformations is not important so we don’t consider this special case.
  • To preserve the style definitions, we store the idx attribute of each and their color parameter, with all transformations if necessary. We also store the RGB value of the color including transformations because we will need it too.

The exporter code

The normal behavior of the exporter is translating every LibreOffice shape property to DrawingML to write it to the document. In the case of colors, that means using an a:srgbClr tag to save it in RGB format and optionally adding an alpha transformation calculated from the shape transparency value.

The preservation of shape styles and area and line colors increases the complexity a bit. The shape grab bag must be checked for interoperability information saved in the import phase, which should be written back to the document under certain circumstances.

If there are any style definitions in the shape grab bag, they are always written back to the document; we need no additional verification because style definitions act as a fallback, as we already explained. On the other hand, the case of shape properties has some complexity, because we must perform several checks before deciding which area and line information we want to save (a sketch of this decision logic follows the list):

  • Does the final color match the original one? If it doesn’t, it means the user has changed it during editing, so the new one must be saved and the information in the grab bag discarded.
  • If the original and final colors match, is there any theme color information in the grab bag? If there is, we must write it to the shape properties.
  • If the original and final colors match and there is no theme color information in the grab bag, we must compare the color with the style color; if they match, it means the shape was using the style color so we must not write a shape property that would overwrite it.
  • Otherwise, the shape was using a custom color different from the style color, and it must be written to the shape properties.
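
A compact sketch of that decision logic follows; this is illustrative only, and the types and names here are hypothetical, not LibreOffice’s actual API:

// Decide what to export for a shape color, as described above.
// GrabBag, decideExport and friends are hypothetical names.
#include <iostream>
#include <optional>
#include <string>

struct GrabBag {
    std::string originalColor;              // RGB recorded at import
    std::optional<std::string> themeColor;  // e.g. "accent3", if any
};

std::string decideExport(const std::string& finalColor,
                         const std::string& styleColor,
                         const GrabBag& bag) {
    if (finalColor != bag.originalColor)
        return "srgbClr " + finalColor;     // user edited it: discard grab bag
    if (bag.themeColor)
        return "schemeClr " + *bag.themeColor;  // restore the theme color
    if (finalColor == styleColor)
        return "";                  // style already provides it: write nothing
    return "srgbClr " + finalColor; // custom color differing from the style
}

int main() {
    GrabBag bag{"FF0000", std::nullopt};
    std::cout << decideExport("00FF00", "FF0000", bag) << "\n"; // srgbClr 00FF00
    return 0;
}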

Bonus track: multiple color gradients

While I was working on this topic, I noticed that DrawingML allows specifying gradient fillings with any number of steps, unlike previous MS Office documents (and LibreOffice ones), where you could only use two. This is an example of a gradient with three steps:

<wps:spPr>
  ...
  <a:gradFill>
    <a:gsLst>
      <a:gs pos="0">
        <a:srgbClr val="ffff00" />
      </a:gs>
      <a:gs pos="50000">
        <a:srgbClr val="ffff33">
          <a:alpha val="20000" />
        </a:srgbClr>
      </a:gs>
      <a:gs pos="100000">
        <a:srgbClr val="ff0000" />
      </a:gs>
    </a:gsLst>
    <a:lin ang="5400000" />
  </a:gradFill>
  ...
</wps:spPr>

Every a:gs tag indicates a point in the gradient extension with the pos attribute, measured as a percentage, and the color of that point. The system should calculate the gradient between the colors of two consecutive points.

When it imports the document, LibreOffice approximates gradients with more than two steps using only two colors. To prevent information loss, we again use the grab bag to store the complete gradient definition. Analogously, we store the original, approximated gradient information to be able to compare it with the gradient information at the moment of the save operation.

We can identify three situations in the exporter:

  • The original and final gradients don’t match: it means the user has changed it during editing, so the new one must be saved and the information in the grab bag discarded.
  • The original and final gradients match and there is a full gradient definition stored in the grab bag: we must write that gradient definition to the shape properties.
  • The original and final gradients match and there isn’t any gradient definition stored in the grab bag: the gradient definition comes from the shape style and in that case we must not write the shape property that would, again, overwrite it.

One more thing or two

As you can see, this feature relies on the hidden interop grab bag property, which contains more and more information as we increase its use to preserve unsupported properties. We hope to reduce its weight as those properties become natively supported. I would also like to mention that we have complemented our work with a set of unit tests that will help detect regressions in the future.

We haven’t finished with shapes yet; the next post will be related to the preservation of different kinds of shape effects. It will be ready soon!

by Jacobo Aragunde Pérez at June 05, 2014 12:00 PM

May 28, 2014

Jacobo Aragunde

Tales of LibreOffice interoperability: font and paragraph colors

Last week, the LibreOffice community created the 4.3 branch in preparation for the next release in July. We already introduced one of the new interoperability features that will be part of that release in the previous post in this series, and now we continue by explaining how OOXML theme colors will be preserved in LibreOffice.

Theme colors have a prominent place in the color palette in the latest Microsoft Office versions, as you can see:
Word color selector

They consist of a palette of ten colors, each of them with five more variations. When a user changes the document theme, a new palette of colors is loaded, and any objects in the document that used one of those colors are updated with the corresponding one from the new palette.

In the XML document, a theme color is identified by a name, and its variations are implemented with two attributes: tint to lighten the color and shade to darken it.

This is an example of theme color applied to the characters in a text run – a chunk of text with the same properties. In the run properties (rPr) we find the following tag (reference):

<w:rPr>
  <w:color w:val="E5B8B7" w:themeColor="accent2" w:themeTint="66"/>
</w:rPr>

Theme color for character

This is the theme color named accent2, lightened with a tint of 0x66 (about 40%). The w:val attribute indicates the RGB code of the final color; it’s ignored if w:themeColor is present.
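
As a worked check, assuming the stock Office theme where accent2 is C0504D: a tint of 0x66 keeps 0x66/0xFF ≈ 40% of each channel and fills the remainder with white, so the red channel becomes 0xC0 × 0.4 + 0xFF × 0.6 ≈ 0xE5, which reproduces (up to rounding) the E5B8B7 given in w:val.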

The preservation of this tag is simple:

  • On import, store w:themeColor, w:themeTint and w:themeShade properties as hidden properties in the text run.
  • Store the original color too!
  • On export, check if the original color has changed:
    • if it did, write only the new color in the w:val property.
    • if it didn’t, write the stored w:themeColor, w:themeTint and w:themeShade properties.

Now, an example of paragraph shade with theme colors (reference):

<w:pPr>
  <w:shd w:val="thinDiagStripe"
         w:color="215868" w:themeColor="accent5" w:themeShade="80"
         w:fill="DBE5F1" w:themeFill="accent1" w:themeFillTint="33" />
</w:pPr>

Paragraph filling with pattern

Paragraph shades can specify two colors, one for the background and another one for the pattern if present, with the type of pattern specified by the w:val attribute. The set of attributes w:color, w:themeColor, w:themeShade and w:themeTint works as in the case of font colors, specifying the RGB code, theme color name, and shade and tint modifiers for the pattern; w:fill, w:themeFill, w:themeFillTint and w:themeFillShade do the same for the background color.

The preservation of these properties works analogously; they are also stored as a hidden property of the text on import, and saved back on export provided the user hasn’t changed the fill color while editing the document with LibreOffice.

The w:shd tag can also be used in table or table cell properties with the same meaning. And speaking of tables, we also made sure that the table style property was preserved:

<w:tblPr>
  <w:tblStyle w:val="Tablaconcuadrcula" />
  ...
</w:tblPr>

We will continue next week with more shiny features brought to you by Igalia and CloudOn. Stay tuned!

by Jacobo Aragunde Pérez at May 28, 2014 03:05 PM

May 23, 2014

Claudio Saavedra

Sat 2014/May/24

I am a GNOME developer and I do not share Philip Van Hoof's views. He doesn't represent me.

May 23, 2014 11:44 PM

May 20, 2014

Samuel Iglesias

BlinkOn 2: Zürich

Last week I traveled to Zürich to attend BlinkOn 2 at the office Google has there. This was the first time the event was held in Europe (the previous one was in the US), and it was a very good opportunity to meet all the people working around Blink, Chromium and Skia.

Zürich

As Igalia has long experience working on browser technologies, it was represented by four Igalians working on different areas: Manuel Rego on CSS, Xabier Rodríguez on multimedia, and Eduardo Lima together with me on graphics and Skia.

The attendees of this conference were not only Googlers; there were people from Opera, Samsung, Intel and many more companies interested in meeting each other and collaborating to improve Blink.

Google Zürich office

Conference

The talks covered a lot of different areas of a browser: from memory leaks or image filters to security models, to name a few. Worth mentioning is Manuel Rego’s talk about his latest work on CSS Grid Layout (more details on his blog).

BlinkOn 2

I attended the talks most related to graphics in the browser, as this is one of the areas I am currently working on: "Rendering for Glass Time", "Responsive images", "Chasing Pixels", "Filter effects", "Storage modes for 2D canvas" and "60fps on mobile".

I would like to highlight the talk given by Justin Novosad (Google), where he explained his proposal of different storage modes for 2D canvas: persistent memory, discardable memory and none, each with different use cases and consequences for memory usage, with a focus on mobile devices dealing with very large canvases. There was also a discussion about how to provide a useful API for web developers.

I attended more than the aforementioned talks: the always interesting "Memory leaks in Blink" report, "Web animations", "Out-of-process iframes", "Smooth to the touch: challenges in input on mobile", the opening talk and the "Trace reading clinic", which was very useful to learn more about the tracing tools available in Chromium.

This conference was also the excuse to meet some Skia developers in person and talk about Igalia’s contributions and future work on this 2D graphics library.

I would like to thank Google for its great job organizing BlinkOn at its office in Zürich and also Igalia for sponsoring my trip.

Logo Igalia

Miscellanea

But since not everything was work, I also met an old friend who works at Google, and he gave me an update on his last few years while we drank good coffee.

On the last day, just before returning home, I bought a much-appreciated gift for my relatives… good Swiss chocolate! Yummy!

by Samuel Iglesias at May 20, 2014 10:00 AM

May 19, 2014

Manuel Rego

CSS Grid Layout at BlinkOn 2

Zurich Google Office

Last week I attended BlinkOn 2 at the Google office in Zurich. I think the best part of this kind of event is having the chance to meet many well-known people that you usually only see in the bug tracker and on IRC.

It was nice to see other companies around too, like Opera, Samsung, Intel, etc. There were four Igalians there, talking about the work we’ve lately been doing on Blink (check Sam’s post for more information).

CSS Grid Layout

As you might expect, this was my primary focus at the conference, as that’s what I’ve been working on during the last months.

Me during the CSS Grid Layout talk (photo by Calvaris)

On Tuesday, I gave a talk about the work we’re doing on CSS Grid Layout, sponsored by Bloomberg. It seems that people are interested in this feature and liked the presentation; you can check the slides.

Julien Chaffraix (grid’s reviewer) was there, and later we discussed the approach to fix some issues that have been hanging around in Bugzilla for a while.

Selection

On Wednesday, I attended the selection talk given by Yoshifumi Inoue and Hayato Ito. They’re now focused on fixing the issues in Shadow DOM with selection.

Actually, they have some interesting ideas regarding how to avoid the problems with DOM ranges. Basically, they’ll try to use start and end points for the selection instead of the range itself, and then iterate over the composed tree in the right order.

On top of that, we talked about the issues that are present with other layout models, but we just concluded that selection is very complex and we should try to solve things step by step.

Thanks to the editing team for arranging the session. :-)

Panel discussion in the main room (Tech Talk)

Other

Many other interesting talks were given during the event, like the 60fps talk by Nat Duca, the multi-column sessions by the Opera guys, the Oilpan presentation by Mads Ager and many more. If you missed any of them, you can check the slides on the BlinkOn 2 web page and see the videos of the talks in the main room whenever they’re available online.

Finally, I’d like to thank Igalia for sponsoring my trip and, of course, Google for the event (especially Max for the great organization, and also Philip for arranging the hike on Thursday, which, despite the foggy weather, was pretty nice).

Zurich view from Felsenegg

by Manuel Rego Casasnovas at May 19, 2014 11:26 AM

May 18, 2014

Andy Wingo

effects analysis in guile

OK kids, so I had a bit of time recently. I've been hacking on Guile's new CPS-based compiler, which should appear in a stable release in a few months. I have a few things to write about, but today's article is on effects analysis.

what it is, yo

The job of the optimization phase of a compiler is to remove, replace and reorder the sub-expressions of a program. The optimizer recognizes ways in which the program can be better, and then needs to check if those transformations are valid. A transformation is valid if a program exhibits the same behavior both before and after the transformation.

Effects analysis is a proof technique that builds a conservative model of how expressions can affect each other. It can be used to prove the validity of some transformations. For example, if we determine that an expression A reads the first field in a vector, and expression B sets the second field in a vector, then we can use effects analysis to show that the expressions don't affect each others' value and can be freely reordered, for example to hoist either one out of a loop.
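
In Scheme terms, the reordering claim looks like this (a source-level sketch; the compiler actually works on CPS):

;; A reads field 0 of v; B writes field 1 of w.  Effects analysis
;; proves them independent (different fields, even if v and w alias),
;; so the loop-invariant read may be hoisted out of the loop.
(define (f v w n)
  (let loop ((i 0) (sum 0))
    (if (< i n)
        (let ((a (vector-ref v 0)))   ; expression A: read
          (vector-set! w 1 i)         ; expression B: write
          (loop (+ i 1) (+ sum a)))
        sum)))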

In effects analysis, we model the program state coarsely and conservatively with a limited set of categories. The precise set of categories depends on the domain. In graphics, for example, you might have a state bit for the current coordinate system, the current color, the current fill mode, etc. In general-purpose compilation, the effects are mostly about memory and (sometimes) exceptions.

In Guile we model four kinds of effects: type checks (T), allocations (A), reads (R), and writes (W). Each of these effects is allocated to a bit. An expression can have any or none of these effects.

For the purposes of Guile, a "type check" is the possibility that this expression will throw an exception if its arguments are somehow out of range. For example, cons will never throw an exception, except in out-of-memory situations, but + may throw an exception if its arguments aren't numbers. However if an expression is dominated by a copy of itself -- that is, if we see that (+ a b) (which may throw an exception) is dominated by (+ a b), perhaps after CSE -- then we can determine that only the first will exhibit the type-check effects, and that the second will not.

Allocation indicates that an expression allocates a fresh object. In Scheme (and many other languages), each allocated object has its own identity: (eq? (cons 1 2) (cons 1 2)) must be false. Note that this restriction does not apply to constant literals, in Scheme; (eq? '(1 . 2) '(1 . 2)) may or may not be true. In Guile both results are possible. They are the same object when compiled (and thus deduplicated), but different when interpreted; the objects are just the ones returned from `read', and those are different. Anyway, we use this allocation bit to indicate that an expression allocates a fresh object with a fresh identity.

The remaining effect kinds are "read" and "write", which indicate reads or writes to memory. So there are just 4 kinds of effects.

Allocation, read, and write effects are associated at run-time with particular memory locations. At compile-time we cannot in general know which of these locations are distinct, and which are actually the same. To simplify the problem, we simply record the type of the object, knowing that (say) a pair and a vector will never be at the same location. We also record the field in the object in the case of objects with multiple fields. There are special catch-all values to indicate "all memory kinds", when we really don't know what an expression will do (which is the case for all expression kinds without specific support in the effect analyzer), and for "all fields" when we don't know which field we are accessing.

One example of the use of this analysis is in common subexpression elimination (CSE). If at a program point P we have a set of available expressions A, then to determine which members of A are still available after the expression at P, you subtract members of A that are clobbered by P. Computation of A at each P plus value numbering is most of CSE; but more on that in some later dispatch. Anyway, here's the definition of effect-clobbers?.

(define (effect-clobbers? a b)
  "Return true if A clobbers B.  This is the case
if A might be a write, and B might be a read or a
write to the same location as A."
  (define (locations-same?)
    (let ((a (ash a (- &effect-kind-bits)))
          (b (ash b (- &effect-kind-bits))))
      (or (eqv? &unknown-memory-kinds
                (logand a &memory-kind-mask))
          (eqv? &unknown-memory-kinds
                (logand b &memory-kind-mask))
          (and (eqv? (logand a &memory-kind-mask)
                     (logand b &memory-kind-mask))
               ;; A negative field indicates "the
               ;; whole object".  Non-negative fields
               ;; indicate only part of the object.
               (or (< a 0) (< b 0) (= a b))))))
  (and (not (zero? (logand a &write)))
       (not (zero? (logand b (logior &read &write))))
       (locations-same?)))
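
Using it, the availability subtraction described above is a one-liner; effects-of, mapping an expression to its effects bitfield, is a hypothetical accessor:

(define (kill-clobbered available p-effects effects-of)
  ;; Keep only the available expressions whose effects P does not
  ;; clobber, per the effect-clobbers? predicate above.
  (filter (lambda (expr)
            (not (effect-clobbers? p-effects (effects-of expr))))
          available))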

This analysis is simple, small, and fast. It's also coarse and imprecise -- if you are reading from and writing to two vectors at once, you're almost sure to miss some optimization opportunities as accesses to all vectors are conflated into one bit. Oh well. If you get into this situation, you'll know it, and be able to invest a bit more time into alias analysis; there's lots of literature out there. A simple extension would be to have alias analysis create another mapping from expression to equivalence class, and to use those equivalence classes in the locations-same? check above.

Of course this assumes that expressions just access one object. This is the case for Guile's CPS intermediate language, because in CPS, as in SSA or ANF, expressions don't have subexpressions.

This contrasts with direct-style intermediate languages, in which expressions may have nested subexpressions. If you are doing effects analysis on such a language, it's probably more appropriate to allocate a bit to each kind of effect on each kind of object, so that you can usefully union effects for a tree of expressions. Since we don't have to do this for CPS, we can allocate a fixed bit-budget towards more precision as to which field of an object is being accessed. The inability to be precise as to which field was being accessed due to the direct-style IL was one of the problems in Guile's old CSE pass.

Finally, a note about type checks. Guile includes type checks as part of the effects analysis for two reasons. The first is the obvious case of asking whether an expression is effect-free or not, which can lead to some optimizations in other parts of the compiler. The other is to express the potential for elision of duplicate expressions, if one dominates the other. But it's also possible to remove type checks in more cases than that: one can run a type inference pass to remove type-check effects if we can prove that the arguments to an expression are in range. Obviously this is more profitable for dynamically-typed languages, but the same considerations apply to any language with sum types.

Guile's effects analysis pass is here. V8 seems to have two effects analysis passes; one is in effects.h and typing.cc, and operates over the direct-style AST, and the other is in the value numbering pass (hydrogen-instructions.h and hydrogen-gvn.h; search for GVNFlag).

More on how effects analysis is used in Guile in a future missive. Until then, happy hacking.

by Andy Wingo at May 18, 2014 07:19 PM

May 08, 2014

Jacobo Aragunde

Document Freedom Day 2014 recap

You already know I took part in the celebration of Document Freedom Day, speaking at an event at the University of Coruña. The nice people from GPUL recorded the talks, and now they are ready for everyone to enjoy, distribute or reuse under the terms of the Attribution-ShareAlike 4.0 International license.

Here are the talks:

Estándares abertos, a situación dende o punto de legal (Open standards: the situation from the legal point of view)

LibreOffice, o proxecto de referencia para a edición de documentos libres (LibreOffice, the reference project for free document editing)

Sobre a importancia dos estándares e formatos libres (On the importance of open standards and formats)

by Jacobo Aragunde Pérez at May 08, 2014 09:33 AM

May 05, 2014

Manuel Rego

Selection interaction in CSS Regions is now spec compliant

This post is a recap of all the work we’ve been doing at Igalia during the last few months related to selection in CSS Regions.

Wrap-up

Back in October 2013, my mate Javi explained the different problems we detected in how selection worked in pages using CSS Regions. As part of this initial work, we improved test coverage, as explained in a previous post. Originally we were working on both WebKit and Blink, but as you probably know, CSS Regions support was removed from the Blink code this year.

In parallel, we’ve also been working on optimizations in the CSS Regions code: first providing some new selection performance tests, and later implementing some performance patches in the part related to the repaint phase, which is very important during the selection process.

Making selection spec compliant

Our first solution was put aside, as it had some rough edges and was probably too ambitious as an initial step.

Nonetheless, in my last post we introduced a new idea to make selection in CSS Regions spec compliant. We called it the “Subtrees approach”. We’ve finally implemented it, and the patch is already integrated upstream.

Example of selection in CSS Regions before and after the Subtrees approach patch

You can check the new behaviour using our test suite. As you can see, the selected content now always matches the highlighted text in the web page.

Future

In some ways this is just the tip of the iceberg, a small step to make selection in CSS Regions work better. However, selection is still not completely natural from the user’s point of view in several cases. On top of that, selection in other layout models (flexbox, grid, multi-column, etc.) has very similar issues.

Our plan is to start a discussion inside the WebKit and Blink communities about how selection works in the different layout models and what kind of changes can be made there to create a better user experience around selection interaction on the web (as Javi introduced in his last post).

Finally, we’d like to publicly thank the Adobe Web Platform team for giving us the chance to collaborate with them, especially Mihnea for all his support. Also, thanks to all the reviewers for their insightful feedback and precious time.

by Manuel Rego Casasnovas at May 05, 2014 03:56 PM

April 30, 2014

Javier Fernández

2014 Webkit Contributors Meeting

A few weeks ago, I had the opportunity to represent Igalia at the WebKit Contributors Meeting. It was hosted by Apple at their campus in Cupertino and, unlike what some Igalia fellows had told me, it wasn’t the huge event it used to be, which gave me the chance to personally meet some of the well-known hackers I interact with via IRC and Bugzilla; that was definitely nice.

The usual unconference-like environment was something I liked a lot, especially how we did the session scheduling on the fly; it was fun to directly notice the audience’s interest in the talk I proposed. My involvement in WebKit during the last year has been improving and implementing CSS standards like CSS Regions and CSS Grid Layout, and the latter is precisely what my talk was about.

CSS Grid Layout

When I learned I was going to attend the meeting, I saw the opportunity to spread the word about the work we are doing at Igalia, with the support of Bloomberg, on the implementation of the CSS Grid Layout standard, which we both consider very important for the Web because of the use cases it solves.

I had the feeling that the situation of this new feature within the WebKit community could be considerably improved, perhaps getting the attention of more hackers and reviewers willing to collaborate, accelerating the implementation and, eventually, shipping it in future releases of the Safari browser. Considering that IE/Trident is already shipping it, Blink/Chromium plans to do so as soon as possible, and Mozilla is also willing to in the long term (the editor’s draft is being written by members of those web engines), we at Igalia considered it sensible to suggest that the WebKit community go in that direction.

Regarding the talk itself, I think it helped to increase the visibility of the CSS Grid Layout implementation in WebKit, so let’s see what this means for our work in the following months. If you are interested in the slides, you can check them out here. We talked about ways to enable the feature by default in the nightly builds, so it can be tested by WebKit hackers more easily, and also about the work to be done in the next months. I was happy to see this point moved to the mailing list and finally become a landed patch. We have a web site with many examples and guidelines for activating the feature in different browsers:

http://igalia.github.io/css-grid-layout/

Selection

One of my goals in attending this meeting was to unblock the issue of Selection with CSS Regions, something Igalia was involved in for some months last year in collaboration with Adobe. I discussed with David Hyatt the Sub-Trees Approach as a way of making Selection specification-compliant when using CSS Regions, even though the issues from the end-user perspective are still there. I consider this an important milestone because it puts the CSS Regions standard in a similar position to the rest of the layout models affected by these Selection problems (CSS Grid Layout, MathML, Shadow DOM, Absolute Positioning and Multi-Column, among others).

Now a different challenge has to be addressed: how do we provide a usable Selection for these layout models? Even though it was not in the approved schedule, we managed to set up a meeting among different people interested in Selection. I have to say that it was a nice and interesting discussion; these are the main conclusions I’d extract from it:

  • We need to go beyond the editing/selection specification, since the DOM nature of its definition is very limited for the new layout models.
  • Emulating the iOS selection behavior was something most of the participants liked, so perhaps we should consider it in the future.
  • There are several approaches we can follow (multi-range, sub-trees, regions-as-containers, …); all of them have benefits and drawbacks and, what is worse, some of them only address issues of specific layout models. The idea that there is no single solution for all the layout models was always present.
  • Perhaps we should define different implementations of Selection for these specific features; something incremental, since it’s very important to keep such an old and important feature as Selection as stable as it has been all these years.

These ideas are not actually breaking news, since they were already mentioned in previous meetings, but I perceived a better mood now to implement more ambitious approaches to improve Selection. Some of the standards affected by these issues, like CSS Regions and CSS Grid Layout, are really important features, so these technical challenges must be faced as soon as possible.

One of the agreed action points was to analyze, case by case, the issues each of these layout models has, defining very specific examples and test cases; we could later start a discussion on the best way to solve them. At Igalia we have been thinking about this for some time already, so we have created a web site and a test-case repository to analyze the different issues present in the layout models we have considered most interesting so far. We’d like to invite any hacker interested in this topic to contribute to the wiki and the test-case repository, even adding other layout models to be included in the analysis.

http://igalia.github.io/web-selection-examples/

Misc topics

I also attended some other talks and discussions relevant to the future of the WebKit project, which Igalia is quite committed to, especially as maintainers of the WebKitGTK+ port.

We participated in the “How to make WebKit more awesome” discussion representing the WebKitGTK+ port community; it was a nice discussion with plenty of good ideas.

I attended the CSS Regions and CSS Shapes talks, which were quite rich in progress and roadmap announcements, and featured the usual awesome demos. The CSS Regions talk presented by Andrei Abucur got quite a lot of attention and generated an interesting discussion about its future.

I also attended the session about Subpixel Layout, something I was involved in some months ago. I had found some CSS Regions bugs related to the Subpixel Layout feature whose root cause was that the SATURATED_LAYOUT_ARITHMETIC flag was not enabled. We’ve recently enabled it in the WebKitGTK+ port with quite good results, although there are a few regressions I’m still working on.
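
For readers unfamiliar with that flag: WebKit’s LayoutUnit type stores lengths as fixed-point integers (64ths of a pixel), so sufficiently large style values can overflow and wrap around; the flag makes the arithmetic saturate at the type’s limits instead. Here is a toy C sketch of the idea, with names of my own invention rather than WebKit’s:

    /* Toy illustration of saturated fixed-point arithmetic, in the spirit
     * of WebKit's LayoutUnit with SATURATED_LAYOUT_ARITHMETIC enabled.
     * All names here are mine, not WebKit's. */
    #include <limits.h>
    #include <stdio.h>

    typedef int layout_unit; /* length in 1/64ths of a CSS pixel */

    static layout_unit saturated_add(layout_unit a, layout_unit b)
    {
        /* Clamp at the representable limits instead of wrapping. */
        if (b > 0 && a > INT_MAX - b)
            return INT_MAX;
        if (b < 0 && a < INT_MIN - b)
            return INT_MIN;
        return a + b;
    }

    int main(void)
    {
        layout_unit huge = INT_MAX - 32; /* an enormous margin or offset */

        /* With plain '+' this would wrap around to a large negative
         * number, which is exactly the kind of misplacement bug seen
         * when the flag is disabled; saturating clamps it instead. */
        printf("%d\n", saturated_add(huge, 64 /* one more pixel */));
        return 0;
    }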


by jfernandez at April 30, 2014 05:39 PM

April 28, 2014

Alejandro Piñeiro

GNOME 3.12.1 out: PDF accessibility progress

Welcome to a new “GNOME 3.12 is out” blog post, somewhat late because I wanted to focus on 3.12.1 instead of the usual 3.12.0, and because I was away for several days due to the Easter holidays.

[Photo: Flowers and a mill at Keukenhof]

As I said just after 3.10, Antía worked hard on adding keyboard navigation support to Evince, and Adrián provided an implementation of the tagged PDF specification for Poppler. The plan for the 3.12 cycle was to build upon their work in order to improve Evince’s accessibility support.

Thanks to the tagged-PDF implementation in Poppler, we were able to start experimenting with tagged PDF documents in Evince and playing with all the cool things tagged PDFs bring to the table. We finally have information about whether we are in a paragraph, where a list starts, the different levels of headings, and pretty much anything else one can put in an element tag. But while Adrián and Carlos García kept working on getting their patches pushed upstream (more than 15 patches were pushed during this cycle), Joanmarie Diggs and I realized that a bare implementation of this specification would be a hard animal to tame for use by assistive technologies: additional parsing and structuring will be needed in Poppler to properly implement ATK support in Evince.
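
To give a feel for what this looks like in practice, here is a small C sketch that walks the structure tree of a tagged PDF with the poppler-glib structure-element API that has been landing during this cycle, printing the kind of each element; treat the exact calls as an assumption to check against your poppler version:

    /* Sketch: dump the structure tree of a tagged PDF via poppler-glib. */
    #include <glib.h>
    #include <poppler.h>

    static void dump_structure(PopplerStructureElementIter *iter, int depth)
    {
        do {
            PopplerStructureElement *element =
                poppler_structure_element_iter_get_element(iter);
            /* Print the raw kind enum: paragraph, heading, list, ... */
            g_print("%*skind: %d\n", depth * 2, "",
                    (int)poppler_structure_element_get_kind(element));
            g_object_unref(element);

            /* Recurse into child elements, if any. */
            PopplerStructureElementIter *child =
                poppler_structure_element_iter_get_child(iter);
            if (child) {
                dump_structure(child, depth + 1);
                poppler_structure_element_iter_free(child);
            }
        } while (poppler_structure_element_iter_next(iter));
    }

    int main(void)
    {
        GError *error = NULL;
        PopplerDocument *document = poppler_document_new_from_file(
            "file:///tmp/tagged.pdf", NULL, &error);
        if (!document)
            g_error("%s", error->message);

        PopplerStructureElementIter *iter =
            poppler_structure_element_iter_new(document);
        if (iter) { /* NULL for untagged documents */
            dump_structure(iter, 0);
            poppler_structure_element_iter_free(iter);
        }
        g_object_unref(document);
        return 0;
    }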

Additionally, having working keyboard support in Evince finally made it possible to test real-world document accessibility with Orca (as opposed to just Accerciser). But in doing so, we found that the existing ATK support was incomplete or wrong in several places. So even the more basic PDF documents, those that should also be accessible without tagged PDF, were not properly accessible. Taking all this into account, we decided to focus on fixing the bugs in Evince’s core accessibility support, as doing so would make all PDFs more accessible, while at the same time continuing to work on tagged-PDF support in order to develop a concrete list of the improvements we will need in Poppler.

So the main tasks in Evince during this cycle were:

  • Reimplementing AtkText (see the sketch after this list)
  • Exposing all document pages to the accessibility tools, not only the current one
  • Implementing AtkDocument
  • Fixing caret navigation and hyperlink management
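
As a rough idea of what “reimplementing AtkText” involves, here is a minimal, self-contained C sketch of a GObject exposing the AtkText interface; EvDemoPage and its flat text buffer are hypothetical stand-ins for Evince’s real page accessible:

    /* Sketch: a GObject implementing two AtkText methods over a flat
     * UTF-8 buffer. EvDemoPage is illustrative, not Evince's class. */
    #include <atk/atk.h>

    typedef struct {
        AtkObject parent;
        gchar *text; /* flat text extracted from one page */
    } EvDemoPage;

    typedef struct {
        AtkObjectClass parent_class;
    } EvDemoPageClass;

    static gchar *
    ev_demo_page_get_text(AtkText *atk_text, gint start, gint end)
    {
        /* The object itself implements AtkText, so the cast is safe. */
        EvDemoPage *page = (EvDemoPage *)atk_text;
        /* AtkText offsets are character offsets, not bytes. */
        return g_utf8_substring(page->text, start, end);
    }

    static gint
    ev_demo_page_get_character_count(AtkText *atk_text)
    {
        EvDemoPage *page = (EvDemoPage *)atk_text;
        return g_utf8_strlen(page->text, -1);
    }

    static void
    atk_text_iface_init(AtkTextIface *iface)
    {
        iface->get_text = ev_demo_page_get_text;
        iface->get_character_count = ev_demo_page_get_character_count;
    }

    G_DEFINE_TYPE_WITH_CODE(EvDemoPage, ev_demo_page, ATK_TYPE_OBJECT,
                            G_IMPLEMENT_INTERFACE(ATK_TYPE_TEXT,
                                                  atk_text_iface_init))

    static void ev_demo_page_init(EvDemoPage *page)
    {
        page->text = g_strdup("");
    }

    static void ev_demo_page_class_init(EvDemoPageClass *klass)
    {
        (void)klass;
    }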

As a result of these changes:

  • Several accessibility-triggered crashes have been eliminated
  • Orca’s SayAll feature now works with Evince
  • Prosody when reading documents with Orca has been improved
  • The caret can be positioned and text selected via AT-SPI2 (see the sketch below)
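
For instance, a test program can now drive the caret and selection over AT-SPI2 along the following lines. This is only a sketch: find_evince_document() is hypothetical glue for locating the document accessible, e.g. by walking the tree from atspi_get_desktop() or by inspecting with Accerciser first.

    /* Sketch: position the caret and select text in Evince via AT-SPI2. */
    #include <atspi/atspi.h>

    extern AtspiAccessible *find_evince_document(void); /* hypothetical */

    int main(void)
    {
        atspi_init();

        AtspiAccessible *doc = find_evince_document();
        AtspiText *text = atspi_accessible_get_text_iface(doc);

        /* Move the caret, select a range, and read the text back. */
        atspi_text_set_caret_offset(text, 42, NULL);
        atspi_text_add_selection(text, 42, 62, NULL);
        gchar *chunk = atspi_text_get_text(text, 42, 62, NULL);
        g_print("selected: %s\n", chunk ? chunk : "(nothing)");

        g_free(chunk);
        g_object_unref(text);
        g_object_unref(doc);
        return atspi_exit();
    }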

Some of this work was not quite in time for the 3.12.0 release, but has been included in 3.12.1. In addition, we are continuing to work on accessibility-related bug fixes which we anticipate will be included in 3.12.2.

As for what’s next: We encourage Orca users to give Evince a try and help us identify the bugs that remain in Evince’s core accessibility support. Anything that they find will be added to our high-priority TODO list. In the meantime, we will continue to work on enhancing Poppler’s tagged-PDF support and then exposing that structural information through Evince to assistive technologies.

Finally, I would like to thank the GNOME Foundation and the Friends of GNOME supporters for their contributions towards making a more accessible GNOME, as this work would not be possible without them.

Sponsored by GNOME Foundation

by api at April 28, 2014 12:37 PM