The HTML document in Cocktail is represented by multiple data structures, most of them trees. The first of those trees, as seen before, is the HTML DOM tree. The DOM tree maps directly to the content of the HTML or XML document and implements the standard W3C methods. All the other trees in Cocktail are specific to its implementation and are not defined by a standards body. Their high-level concepts and, in some cases, their naming are heavily influenced by WebKit.
Can you see me ?
An HTML document can be consumed in multiple ways. It can be used by search bots to index a site, or it can be consulted on a device for visually impaired users. However, we first think of an HTML document as a visual one, as rendered by our browser.
Each node in an HTML document can either be rendered or not. Whether or not an element is rendered depends on the CSS and attributes applied to it. In CSS this is controlled with the « display » or « visibility » styles; in HTML an element can be hidden with the « hidden » attribute. Funnily enough, no HTML tag is inherently rendered or hidden. For instance, tags we assume to be non-visual, such as those contained in the <head> section, can be rendered by changing their « display » CSS style, which defaults to « none ».
This distinction between rendered and hidden tags justifies the creation of a data structure separate from the DOM: the rendering tree.
The rendering tree is owned by the DOM tree: the root element of the rendering tree is owned by the root element of the DOM tree (the <html> tag). Each node in the DOM tree is responsible for deciding whether to create a corresponding node in the rendering tree. It only creates one if its CSS and attributes make it a rendered node. The rendering tree is thus never larger than the DOM tree, and usually sparser.
The process where a DOM node creates a rendering tree node and attaches it to the rendering tree is called « attachment ». It is implemented in the HTMLElement class, mostly in the « attach » method. The opposite process of removing a node from the rendering tree, for instance if the DOM node is removed from the DOM or if a CSS style change makes it a hidden node, is called « detachment » and is implemented in the « detach » method.
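The attachment and detachment steps described above can be sketched as follows. This is a minimal illustration in Python, not Cocktail's actual code: the DomNode and RenderNode classes, the is_rendered check and the simplified style dictionary are all assumptions made for the example. Only the « attach » and « detach » method names come from the text.

```python
class RenderNode:
    """Hypothetical rendering tree node created by a DOM node."""

    def __init__(self, dom_node):
        self.dom_node = dom_node
        self.parent = None
        self.children = []


class DomNode:
    """Hypothetical DOM node; `styles` stands in for computed CSS styles."""

    def __init__(self, tag, styles=None, hidden=False):
        self.tag = tag
        self.styles = styles or {}
        self.hidden = hidden
        self.parent = None
        self.children = []
        self.render_node = None  # set only while the node is attached

    def append_child(self, child):
        child.parent = self
        self.children.append(child)

    def is_rendered(self):
        # Assumption: only the « hidden » attribute and « display » are checked.
        return not self.hidden and self.styles.get("display", "block") != "none"

    def attach(self):
        # Start from a clean state so attach() can be called repeatedly.
        self.detach()
        if not self.is_rendered():
            return
        # A node whose parent is not in the rendering tree is not rendered.
        if self.parent is not None and self.parent.render_node is None:
            return
        self.render_node = RenderNode(self)
        if self.parent is not None:
            self.render_node.parent = self.parent.render_node
            self.parent.render_node.children.append(self.render_node)
        for child in self.children:
            child.attach()

    def detach(self):
        for child in self.children:
            child.detach()
        if self.render_node is None:
            return
        if self.render_node.parent is not None:
            self.render_node.parent.children.remove(self.render_node)
        self.render_node = None


# Build a tiny DOM: <html><head/><body><div/></body></html>
html = DomNode("html")
head = DomNode("head", styles={"display": "none"})
body = DomNode("body")
div = DomNode("div")
html.append_child(head)
html.append_child(body)
body.append_child(div)
html.attach()
```

After the call to attach, the <head> node has no rendering node while <body> and <div> do. Setting the « display » style of the <div> to « none » and calling attach on it again removes it from the rendering tree.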
Rendering and layout
The rendering tree is in charge of two critical tasks: layout and rendering. Each node in the rendering tree is capable of laying itself out (determining its bounds in the document space) and of rendering itself (drawing its background, borders, asset for an image…). Rendering and layout will be detailed in following articles.
When the document needs to perform a layout, it calls the layout method on the root element of the rendering tree. The element first lays itself out, then calls the layout method on all its children, thus laying out the whole rendering tree recursively. A similar process takes place when the document needs to perform a rendering.
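The recursive traversal can be illustrated with a short Python sketch. The RenderNode class, its intrinsic_height parameter and the rule of stacking children vertically are assumptions chosen to keep the example small; they do not reflect Cocktail's real layout algorithm.

```python
class RenderNode:
    """Hypothetical rendering tree node with a recursive layout method."""

    def __init__(self, name, intrinsic_height=0, children=None):
        self.name = name
        self.intrinsic_height = intrinsic_height
        self.children = list(children or [])
        self.bounds = None  # (x, y, width, height) in document space

    def layout(self, x, y, width):
        # Step 1: the node lays itself out inside the space it was given.
        self.bounds = (x, y, width, self.intrinsic_height)
        # Step 2: it asks each child to lay itself out, stacked vertically.
        cursor_y = y
        for child in self.children:
            child.layout(x, cursor_y, width)
            cursor_y += child.bounds[3]
        # The node grows, if needed, to contain all of its children.
        height = max(self.intrinsic_height, cursor_y - y)
        self.bounds = (x, y, width, height)


img = RenderNode("img", intrinsic_height=100)
div = RenderNode("div", intrinsic_height=50)
body = RenderNode("body", children=[img, div])
root = RenderNode("html", children=[body])

# The document only calls layout on the root; the rest is recursion.
root.layout(0, 0, 800)
print(img.bounds)   # (0, 0, 800, 100)
print(div.bounds)   # (0, 100, 800, 50)
print(root.bounds)  # (0, 0, 800, 150)
```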
All the rendering tree code is in the « renderer » package located in the « core » package. This package contains multiple files defining classes in charge of rendering specific HTML elements. For instance the rendering and layout of an <img> tag will be different from those of a <div> tag. The <img> will need to render its asset, if any, while the <div> element will also be in charge of starting the rendering of its children. The layout algorithm also varies for each tag.
Still, all renderer classes share some code, for instance to draw their background and borders. As a consequence, they all inherit from the abstract base class ElementRenderer.
There are two main categories of renderers:
- The flow box renderers, for rendering HTML tags which can be nested, typically <div> tags.
- The embedded or replaced renderers, used to render embedded assets such as the <img> or <video> tags.
The layout and rendering algorithms vary for those categories. The embedded renderers inherit from the EmbeddedBoxRenderer class while the flow boxes inherit from FlowBoxRenderer.
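The class hierarchy can be pictured with the following Python sketch. The three class names come from the text, but everything else is an assumption made for illustration: the render and render_content methods, the constructor arguments, and the list of strings standing in for real drawing commands.

```python
from abc import ABC, abstractmethod


class ElementRenderer(ABC):
    """Abstract base: holds the code shared by every renderer."""

    def __init__(self, tag):
        self.tag = tag

    def render(self, out):
        # Shared by all renderers: background first, then borders.
        out.append(f"{self.tag}: background")
        out.append(f"{self.tag}: borders")
        # What comes next depends on the renderer category.
        self.render_content(out)

    @abstractmethod
    def render_content(self, out):
        """Draw whatever is specific to this kind of renderer."""


class FlowBoxRenderer(ElementRenderer):
    """Renderer for nestable tags such as <div>: it renders its children."""

    def __init__(self, tag, children=None):
        super().__init__(tag)
        self.children = list(children or [])

    def render_content(self, out):
        for child in self.children:
            child.render(out)


class EmbeddedBoxRenderer(ElementRenderer):
    """Renderer for replaced tags such as <img>: it renders its asset."""

    def __init__(self, tag, asset=None):
        super().__init__(tag)
        self.asset = asset

    def render_content(self, out):
        if self.asset is not None:
            out.append(f"{self.tag}: asset {self.asset}")


commands = []
page = FlowBoxRenderer("div", [EmbeddedBoxRenderer("img", asset="logo.png")])
page.render(commands)
for command in commands:
    print(command)
```

Keeping the background and border drawing in the base class means a new renderer category only has to say what is specific to it.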