Time Complexity Evaluation of Good Binary Tree Traversal | by Daniel García Solla | Oct, 2024

Concretely, the blue textual content denotes the index of the mother or father node, whereas the opposite textual content in black illustrates the label of the node in query.

Implementation

Now with a well-defined enter, it’s attainable to understand at a extra summary stage what the operations of the tree traversal really do. On one facet, the outer For[] loop traverses by way of all of the nodes in stage order from the bottom stage to the one the place the foundation is situated. And, for every node the Whereas[] loop traverses the trail from the foundation to the visited node in reverse order, though the necessary facet for the time complexity certain is its size.

TreeIteration[v_List] := Module[{n, pos}, n = Length[v];
For[i = n, i >= 0, i--,
pos = v[[i]];
Print["FOR: ", i];
Whereas[pos > 0,
Print["While: ", pos];
pos = v[[pos + 1]];
]]]

So, after implementing it in Wolfram, some Print[] are included to show the index of the mother or father node of the nodes it traverses throughout its execution, enabling a better reconstruction of its hint.

Output

n = 7;
treeList = GenerateTree[n]
TreeIteration[treeList]
PlotBinaryTreeFromList[Drop[treeList, 1]]

Operating the algorithm with a 7-node tree, represented as L={-1, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3}, yields the next end result:

FOR: 8
Whereas: 3
Whereas: 1
FOR: 7
Whereas: 3
Whereas: 1
FOR: 6
Whereas: 2
Whereas: 1
FOR: 5
Whereas: 2
Whereas: 1
FOR: 4
Whereas: 1
FOR: 3
Whereas: 1
FOR: 2
FOR: 1
FOR: 0

At first look, the hint isn’t too revealing, so it ought to be mixed with the tree depiction:

Picture by creator

Within the first iteration of the for loop, the traversal begins on the final node of the bottom stage, whose mother or father has index 3. Subsequently, this node can be visited by the whereas loop, till within the subsequent iteration it reaches the foundation and ends. Within the succeeding for iteration, the identical course of is carried out with the distinction that it begins with the node with index 6, whose mother or father node is identical as earlier than. Thus, it may be famous that the for is definitely traversing all the prevailing paths within the tree that join every of the nodes to the foundation.

With the algorithm in place, and after having totally characterised its enter and understood its operation, it’s appropriate to proceed with its evaluation, each by way of reminiscence and time. On the one hand, the evaluation of the reminiscence occupied is easy on this case, because the algorithm doesn’t want further reminiscence to carry out its activity, past the integer worth pos wherein the node traversed in every iteration is saved. Accordingly, the asymptotic certain representing the extra reminiscence consumption is fixed O(1). And, if we think about the area occupied by the enter, an quantity of reminiscence of order O(n) can be required, the place n is the variety of nodes within the tree, or extra exactly O(2^d), the place d is the tree depth.

Then again, to find out the time complexity certain, we should outline an elementary operation to be accounted for. For this algorithm, it’s established because the asingnation executed contained in the whereas loop, which at an summary stage might be thought-about because the traversal of an edge between a node and its mother or father. Subsequently, to determine what number of instances this operation is invoked, the price of the algorithm is first decomposed into two capabilities. There may be, on one facet, the price Tf(n) of the for loop, which represents the whole of 1 algorithm’s execution. This, in flip, is outlined because the sum of the prices incurred by the whereas loop, designated as Tw(i).

Picture by creator.

For all nodes i contained within the array, we have to decide what number of elementary operations are concerned within the traversal of the trail from the foundation to i, so we append the corresponding Tw(i) analysis. Particularly, that perform will return the precise variety of assignments brought on by a sure enter node. Thus, since we all know that the primary L[0] can’t stroll any path to the foundation, it’s not counted, protecting the sum limits between 1 and the variety of nodes n within the tree.

Earlier than persevering with, we proceed to exhibit that the appliance of the perform P(i) to a node i situated at stage l of the tree leads to the label of a node situated on the instantly higher stage, because the elementary operation thought-about on this evaluation is equal to pos=P(pos), primarily as a result of enter options:

Picture by creator.

As proven, we start with the inequality that any node should fulfill with respect to the extent at which it’s discovered, being its label bounded by the capabilities m(l) and M(l), and assuming that l is its stage. Afterwards, when making use of P, a number of simplifications might be effected, resulting in the conclusion that P(i) lies between 2^(l-1) and 2^l-1, each coinciding with the evaluations m(l-1) and M(l-1), suggesting that after the transformation the ensuing node is situated at stage l-1. With this, we’re demonstrating that after a number of iterations of the whereas loop, the node saved in pos can have a stage nearer to the tree root. Consequently, if sufficient of them are accomplished, the trail is assured to succeed in the foundation and terminate. Though, in case of contemplating an infinite tree this may not maintain.

Method 1

In the meanwhile, we all know that the complexity is pushed by Tf(n), regardless of the shortage of an actual expression for Tw(i), so we proceed to debate three other ways to characterize this perform, and thereby the general asymptotic development of the execution time.

No matter how the remaining perform is discovered, a constraint on the tree nodes might be met in all analyses. Specifically, since all of them have a single mother or father, besides the foundation, we are able to make sure that the trail size between an arbitrary node situated at a stage l and the foundation is the same as l. Primarily that is as a result of property demonstrated above, though it will also be evidenced by the conclusion that every node current on such a path is at a unique stage, which might fluctuate from 0 to l. Then, because the whereas loop traverses each node within the path, it’s concluded that the operations counted in Tw(i) are precisely l.

Picture by creator.

Now, the evaluation is concentrated on computing the extent of a given node, so we begin from the higher inequality. That’s, the label of a node i is at a stage l bounded by m(l) and M(l), yielding two situations with which the extent might be precisely quantified:

Picture by creator.

On the one hand, fixing from the left facet of the inequality results in a situation reducible to a flooring operation on log_2(i), by its personal definition. From this, it may be inferred that the extent is the same as that amount, though the opposite situation of the unique inequality nonetheless must be verified.

Picture by creator.

Ranging from the right-hand facet, we arrive at a decrease certain for l, which a priori seems to be complementary to the previous. Nevertheless, after working and making use of the definition of the ceiling perform, we arrive on the following formulation for the extent, since its worth is the minimal integer that satisfies the final inequality proven above.

Picture by creator.

Recapitulating, to date we now have derived a number of expressions for the extent of a node i, which at first might be regarded as bounds of that worth as a result of their nature. Nonetheless, the extent have to be an integer, so it’s conceivable to verify the space between them, simply in case it had been sufficiently small to uniquely determine a single worth.

Picture by creator.

In abstract, it’s confirmed that each expressions are equivalent for all of the values that the node labels could have. Subsequently, the extent of a node might be inferred by both of the above formulae, the left one being the best, and so the one which might be used within the evaluation.

Picture by creator.

As the extent of i coincides with the elementary operations of the whereas loop, the price Tw(i) is outlined analogously to the node’s stage from which the trail to the highest should start.

Picture by creator.

Subsequent, with an expression for the price of every iteration of the for loop as a perform of the preliminary node, we are able to attempt to discover the sum of all the prices generated by the nodes of the tree. However, as there’s a flooring perform in every summand, we are going to first research the affect of not making use of this perform on the final word certain, with a purpose to simplify the summation, in addition to the ensuing certain in case the ground turns into dispensable.

Picture by creator

If we plot Tf(n) for an honest vary of n, a slight distinction is discernible between the perform with the ground of every summand eliminated and the unique one. Notably, the one which instantly sums the logarithm values with none further transformation seems to be a higher certain of the particular complexity, so if we proceed to unravel the sum the place every time period is instantly log_2(i), we are able to arrive at a certain that asymptotically could also be considerably bigger than the precise one, establishing itself because the higher one:

Picture by creator.

By expressing the sum in a closed kind, we may assume that the algorithm requires an execution time no larger than the order O(log(n!)) with respect to the variety of nodes within the enter tree. Nonetheless, this certain might be additional simplified. As an example, if we think about that in every iteration of the for loop, which is executed as many instances as n nodes, a piece is carried out proportional and never increased than the utmost stage of the tree, we’d get an higher certain of order O(n log(n)). As a consequence, if we examine it with the earlier order O(log(n!)) by way of the restrict of its ratio when the enter tends to infinity, we conclude that each are equal, permitting the simplification of the higher certain of the algorithm’s runtime to O(n log(n)).

Picture by creator.

At this juncture, the higher certain ensures that the runtime overhead of the algorithm doesn’t exceed the order O(n log(n)) by way of development with respect to the enter. Nevertheless, the curiosity of the evaluation resides in bounding the price as a lot as attainable, i.e., discovering the tight certain, not an higher one, which in some circumstances could differ considerably. For this, it’s essential to discover a closed kind for the sum Tf(n) above, particularly when the ground perform is utilized to the summands. Intuitively, the appliance of the ground will scale back the worth of every time period to some extent, and the final word worth could fluctuate as a result of dependence between the higher restrict and the scale of the tree.

Picture by creator.

Firstly, for log_2(i) to be an integer and to keep away from making use of the ground transformation, the node label have to be of the shape 2^l, the place l should essentially discuss with the extent at which it’s encountered.

Picture by creator.

Coincident with m(l), above it’s proven that every one nodes i=m(l) whose label is the minimal of their stage will end in log_2(i) being an integer, specifically l.

Picture by creator.

Subsequently, by feeding all labels between m(l) and M(l) as enter to the flooring(log_2(i)) perform, it ought to yield its stage, which has been discovered to coincide with that of the “consultant” m(l) node of that stage. Briefly, this permits to imagine that each node of a selected stage will incur in the identical price Tw(i), as the trail’s size from any considered one of them to the foundation is precisely equal to l.

Picture by creator.

Subsequently, the variety of nodes at every stage is deduced, which as one would possibly guess with out this step is 2^l, that’s, if at every stage the variety of nodes of the earlier one is doubled, for a sure stage this amount might be given by the product of the branching issue by itself l instances.

Picture by creator.

In conclusion, the runtime price of the algorithm in any respect nodes of the identical stage l is the product between the size of the trail to the foundation, coincident with the extent, and the variety of nodes in it. And, from this outcome a closed kind for Tf(n) depending on the depth d of the tree might be drawn:

Picture by creator.

By rewriting the sum as a perform of the degrees from 0 to the depth, we arrive on the above expression, which might be concretized by defining the connection between d and the whole variety of nodes n:

Picture by creator.

Since n is the label of the final node, flooring(log_2(n)) ensures to return the worth of the final stage, which in flip coincides with the depth d. Thus, by the above formulation of the entire price Tf(n) we conclude with the next tight certain:

Picture by creator.

At this level, it’s price making an attempt to simplify it, in order that it’s featured by an easier expression. For that reason, we proceed to calculate its ratio with the earlier higher bounds, which can primarily present the distinction between each in case they’re asymptotically equal, or diverge within the reverse case (though it may additionally converge to 0).

Picture by creator.

However, the bounds of the ratio produce the identical outcome for each higher bounds, being asymptotically equal. And, as they lie on an actual interval, it may be inferred that the tight certain is equal to the higher one, no less than asymptotically, because the ratio signifies a negligible distinction at infinity.

Picture by creator.

Lastly, the time complexity of the algorithm is set by the highest order, which might be achieved in a number of methods as we are going to see under. Earlier than persevering with, although, it’s price noting the connection between the 2 expressions discovered for the tight certain. Whereas the latter relies upon instantly on the variety of nodes, the unique one might be shaped by rewriting the one proven above changing n by the variety of nodes on the final stage, which contributes to a greater understanding of the dependence between the runtime and the properties of the information construction concerned.

Method 2

One other option to proceed with the evaluation is by defining every worth contained within the enter array:

Picture by creator.

Every one is recognized by a concrete analysis P(i), from which the next constraint might be inferred:

Picture by creator.

By representing P(i) an ascent within the tree, any enter bounded by [0,n] that may be offered to the perform will at all times return a outcome current in the identical interval, which results in the formalization of the traversal carried out by the whereas loop and whereby we are going to obtain Tw(i):

Picture by creator.

Originally of the traversal, any node i is chosen, ascending to its mother or father P(i), then to its ancestor P(P(i)) and so forth till reaching the foundation with label 1. Truly, the loop stops when reaching the “node” L[0], nevertheless, right here it’s thought-about that it stops on the root, because the distinction in price might be fixed. So, above we formalize this course of by composing P(i) a variable variety of instances, which as we all know coincides with the size of the trail to the foundation, might be set equal to the node’s stage l.

Picture by creator.

With this strategy, the price Tw(i) is outlined as the extent of the enter node, which will also be acquired by discovering the integer that satisfies the higher equality.

Picture by creator.

At this level, when acquiring the integer l that causes the repeated composition to end in 1, we first apply the properties of the ground perform to explain the composition in a closed kind. Additionally, it’s demonstrable that the composition of the perform P leads to the above expression.

Picture by creator.

Thereafter, by definition of the ground perform, an inequality is established between the closed type of the composition and the result it ought to attain. That’s, the equality dictates that after l compositions precisely the worth of the foundation is reached, though, because the argument of the ground could also be larger than 1, we proceed from the inferred inequality. Lastly, we conclude with an expression for the extent of a sure node i, which we are going to use to search out Tf(n), and therefore the complexity.

Picture by creator.

When changing Tw(i) by the extent of node i, the summation produced is equal to the one solved within the earlier evaluation, so the ultimate expression can be equal.

Picture by creator.

Finally, the tight certain derived from this process is of order nlog(n), coinciding with the beforehand inferred one. In flip, it could even be rewritten as a perform of tree’s depth, which in sure conditions turns into useful.

Method 3

Lastly, we are going to discover an alternate option to carry out this evaluation and purchase the prior asymptotic certain. On this case, we will begin from the label i of a mother or father node saved within the array. This label at low stage is represented as a optimistic integer, particularly in base 2. Subsequently, its binary kind might be denoted as follows:

Picture by creator.

On one hand, it’s outlined because the sum of the merchandise of the bits by their worth within the corresponding base, which in a compact format is formalized as a gaggle of bits whose subscript denotes such worth.

Picture by creator.

Every of the bits is an integer 0 or 1, whose subscript belongs to the interval of the integers comprised between 0 and B(i)-1, the place B(i) is the perform that returns the size of the binary illustration of the integer i.

Picture by creator.

As for its formulation, it stays confirmed that the variety of bits wanted to explain an integer in base 2 is given by the above equality. A priori, the logarithmic time period is equivalent to the expression describing the extent at which node i is situated, so we are able to start to elucidate the remainder of the process.

Picture by creator.

To calculate Tw(i), it’s essential to account for the impact of P(i) on the binary illustration of the node label. Merely put, the label ensuing from the repeated utility of P(i) have to be 1, or on this case for simplicity 0. Subsequently, by dividing the label by 2 and making use of the ground perform, it may be assured that in binary the equal of this perform is a shift operation to the correct. So, after B(i) shifts, the ensuing label might be 0, concluding the trail of the whereas loop and incurring a value proportional to flooring(log_2(i))+1.

Picture by creator.

Likewise, when substituting B(i) within the sum of the general price, on this evaluation we find yourself with a further time period n, which, being smaller than the ultimate worth, is asymptotically negligible.

Picture by creator.

In conclusion, with this process the identical tight certain is deduced, protecting the runtime price of the algorithm categorised by the order nlog(n).

Time Measurements

Lastly, after the theoretical evaluation, experimental measurements might be collected of the time it takes to complete the algorithm for inputs of various sizes, with a purpose to present how effectively or poorly the runtime development matches the tight certain.

information = Flatten[
ParallelTable[
Table[{n,
AbsoluteTiming[TreeIteration[GenerateTree[n]]][[1]]}, {index, 0,
n}], {n, 1, 300}], 1];

To this finish, a number of Wolfram capabilities are used within the measurement course of. Probably the most vital of those is AbsoluteTiming[], which information the time in seconds it took to run the algorithm with a tree consisting of n nodes. Right here, we don’t choose values of n which can be powers of two, we merely think about that the enter is an entire tree as an alternative of an ideal one with a purpose to observe how the execution time grows in relation to the variety of nodes. Then, measurements are taken for n from 1 to 300, performing n runs for every corresponding variety of nodes.

nonlinearModel = NonlinearModelFit[points, a*x*Log[x], {a}, x]
ListPlot[{points, Table[{x, nonlinearModel[x]}, {x, 0, 300}]},
PlotStyle -> {Pink, Inexperienced}, ImageSize -> Giant,
AxesLabel -> {"n", "Time (s)"}]

Afterwards, a becoming mannequin of the shape c*nlog(n) is outlined wherein c represents a relentless used as a parameter, adjusting its worth to the measurement dataset as dictated by the NonLinearModelFit[] perform.

Picture by creator.

As soon as the mannequin has been fitted, the highest end result is generated, the interpretation of which is considerably extra significant when plotted towards the information factors:

Picture by creator.

As seen, the dataset exhibits some variability as a result of sensible interferences within the measurement course of. Nevertheless, the expansion as n is elevated is clearly much like an order nlog(n), which can be exceptional as compared with the placement of the mannequin, being located in a zone considerably decrease than the common between the 2 areas that visibly present the next density of measurements.

Picture by creator.

Lastly, with the earlier becoming outcomes and an adjusted R² of 0.934551, it may be concluded that the mannequin appropriately captures the expansion pattern of the dataset. Although, its variability interprets right into a slight uncertainty within the worth of the c fixed.

The formal evaluation of the algorithm characterizes the asymptotic development of its execution time by the order Θ(nlog(n)). Such certain has been calculated from three completely different approaches, though all of them are based mostly on the identical concept of figuring out the depth of every tree node. Within the first one, the extent was used because the depth measure, which is equal to the variety of instances P(i) should compose with itself to succeed in the label of the foundation node, and, in flip, to the variety of bits wanted to characterize the label of the preliminary node i in binary.

Additionally, as a ultimate be aware, it’s price mentioning that many of the Wolfram code concerned on this evaluation was generated by the GPT-4o mannequin from ChatGPT.