## How To Layer A Directed Acyclic Graph?

### Description

The layering problem for directed acyclic graphs (DAGs) arises as one of the steps of the classical Sugiyama algorithm for drawing directed graphs. If the nodes of a DAG are not pre-assigned to specific layers, they must be separated into layers before the DAG can be drawn in Sugiyama fashion. We call an algorithm that finds a layering of a DAG a layering algorithm. Normally a layering algorithm must find a layering subject to certain aesthetic criteria that matter for the final drawing. While these may be subjective, some are generally agreed upon: the drawing should be compact; large edge spans should be avoided; and the edges should be as straight as possible.

Compactness is often achieved by specifying bounds W and H on the width and the height of the layering, respectively. The span of an edge (u, v) with u ∈ Vi and v ∈ Vj is i − j. Short edge spans are desirable aesthetically because they increase the readability of the drawing, but also because the forced introduction of dummy nodes, when an edge spans multiple layers, degrades the later stages of drawing algorithms. Further, the dummy nodes can cause additional bends on edges, since edge bends mainly occur at dummy nodes.
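To make the span definition concrete, here is a minimal sketch that computes edge spans and the number of dummy nodes a given layering forces. It assumes the convention that every edge (u, v) points from a higher-numbered layer to a lower one (i > j), so that an edge of span s needs s − 1 dummy nodes. The function names `edge_span` and `count_dummy_nodes` and the tiny example graph are illustrative, not taken from any library.

```python
def edge_span(layer, u, v):
    """Span of edge (u, v): layer index of u minus layer index of v."""
    return layer[u] - layer[v]


def count_dummy_nodes(edges, layer):
    """Count the dummy nodes needed so that every edge joins adjacent layers.

    Assumes each edge (u, v) satisfies layer[u] > layer[v]; an edge of
    span s is replaced by a chain passing through s - 1 dummy nodes.
    """
    total = 0
    for u, v in edges:
        span = edge_span(layer, u, v)
        if span < 1:
            raise ValueError(f"edge ({u!r}, {v!r}) does not point to a lower layer")
        total += span - 1
    return total


# Hypothetical example: a -> b -> c plus a shortcut edge a -> c.
example_edges = [("a", "b"), ("b", "c"), ("a", "c")]
example_layer = {"a": 3, "b": 2, "c": 1}

print(count_dummy_nodes(example_edges, example_layer))
```

Only the shortcut edge a -> c has a span greater than 1 here, so it is the only edge that contributes a dummy node.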

At present, there are three widely used layering algorithms that find layerings of a DAG subject to some of the above aesthetic criteria. All of them have polynomial running time: the Longest Path algorithm finds a layering with minimum height; the Coffman-Graham algorithm finds a layering of width at most W and height h ≤ (2 − 2/W) h_min, where h_min is the minimum height of a layering of width W; and the ILP algorithm of Gansner et al. finds a layering with a minimum number of dummy nodes.

A bound on the width of the layering can be specified only in the Coffman-Graham algorithm. In the classical version of the Coffman-Graham algorithm, the width of a layer is taken to be the number of real nodes the layer contains, neglecting the introduced dummy nodes. The algorithm can easily be modified to take the widths of the real nodes into consideration, but the width of the final drawing may still be much greater than expected because of the contribution of the dummy nodes. The algorithm of Gansner et al. usually leads to compact layerings, but the dimensions of the drawing are not controlled, and they may be undesirable.
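As an illustration of the simplest of the three, the following is a minimal sketch of longest-path layering. It assumes the convention that sinks go in layer 1 and every other node is placed one layer above its highest successor, which gives each node the smallest layer index it can legally have. The function name `longest_path_layering` and the example graph are hypothetical; the sketch relies on the standard-library `graphlib` module (Python 3.9 or later) for the topological order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def longest_path_layering(nodes, edges):
    """Assign each node to layer 1 + (longest path from it to a sink).

    Sinks land in layer 1. graphlib raises CycleError if the input
    graph is not acyclic.
    """
    successors = {u: [] for u in nodes}
    for u, v in edges:
        successors.setdefault(u, []).append(v)
        successors.setdefault(v, [])

    # TopologicalSorter expects node -> predecessors and yields predecessors
    # first. Passing node -> successors therefore yields successors first,
    # so every successor is already layered when its source is visited.
    order = TopologicalSorter(successors).static_order()

    layer = {}
    for u in order:
        layer[u] = 1 + max((layer[v] for v in successors[u]), default=0)
    return layer


# Hypothetical example graph with a single sink "c".
example_nodes = ["a", "b", "c", "d"]
example_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "c")]

layering = longest_path_layering(example_nodes, example_edges)
print(layering)
print("height:", max(layering.values()))
```

Note that this sketch does nothing to limit the width of a layer or the number of dummy nodes; it only minimizes the height, which is the trade-off described above.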

With the functional API, not only can we build models with multiple inputs and multiple outputs, but we can also implement networks with a complex internal topology. Neural networks in Keras are allowed to be arbitrary directed acyclic graphs of layers. The qualifier acyclic is important: these graphs can’t have cycles. It’s impossible for a tensor x to become the input of one of the layers that generated x. The only processing loops that are allowed (that is, recurrent connections) are those internal to recurrent layers. Several common neural-network components are implemented as graphs. Two notable ones are Inception modules and residual connections. To better understand how the functional API can be used to build graphs of layers, let’s take a look at how we can implement both of them in Keras.

### Inception Modules

Inception is a popular type of network architecture for convolutional neural networks; it was developed by Christian Szegedy and his colleagues at Google in 2013–2014, inspired by the earlier network-in-network architecture. It consists of a stack of modules that themselves look like small independent networks, split into several parallel branches. The most basic form of an Inception module has three to four branches starting with a 1 × 1 convolution, followed by a 3 × 3 convolution, and ending with the concatenation of the resulting features. This setup helps the network separately learn spatial features and channel-wise features, which is more efficient than learning them jointly. More complex versions of an Inception module are also possible, typically involving pooling operations, different spatial convolution sizes (for example, 5 × 5 instead of 3 × 3 on some branches), and branches without a spatial convolution (only a 1 × 1 convolution).