Ethical Hacking Programming, Blogging, Hosting, All Computer Software, PC Software Download, JAVA in hindi, HTML, PHP, C, C++, Free Learning, Software's Download, Technical Videos, Technical Tricks and Tips, How Make Money

Binary Trees in C programming Class 48

15.7 Binary Trees


Binary trees are self-referential structures like linked-lists; they consist of nodes that point to other nodes. However, where list nodes point to adjacent nodes in a linear manner, tree nodes point to child nodes in a branching descent structure as shown in Figure 15.4. Each node has two pointers, which point to a left and a right child node, respectively. If a particular child does not exist, the pointer is NULL, and if both pointers are NULL, the node is called a leaf node. The top node in the tree is called the root node.




Figure 15.4: Binary tree. The top node is the root node. Each node contains an item of data and has two links to child nodes, which may point to NULL. In this figure, the bottom and righthand nodes are leaf nodes.

Binary trees are useful for storing data in sorted order. If the data were to be stored first and sorted second, then an array or list would be the right data-structure for the job. But neither is particularly e cient for storing data in sorted order as it arrives; they both take O(n) time to insert a new item into a set that already contains n elements. For a balanced tree containing n elements, insertion of a new node in sorted order takes O(log n) time. (A balanced tree is a tree with the minimum possible depth; it has no nodes at a level lower than the highest NULL branch. Figure 15.4 is a balanced tree.) Furthermore, a balanced tree permits O(log n) search for items, and the entire tree may be examined in sorted order in O(n) operations.

The following implementation presents a binary tree that contains two items of ancillary data: a string and an integer. This structure, and the associated functions, are used to implement a word-counting algorithm similar to that presented in [KR88, pages 139–143]. From this example, more generic use of binary trees should be apparent. The Tree structure contains the two data items and pointers to left and right Tree nodes as follows.

struct Tree {

char *item; /* Contained string. */

int count; /* Count of string appearances. */

Tree *left; /* Pointer to left-child. */

Tree *right; /* Pointer to right-child. */

};

The module’s public interface exports the name of the Tree data-type and four functions to ma-nipulate the tree. These perform operations to add new strings, count the occurrences of a specified string, print all strings in lexicographic order, and delete the entire tree structure, respectively.



1 typedef struct Tree Tree;

2

3 Tree *add item(Tree *root, const char *item); 4 int count item(Tree *root, const char *item); 5 void print inorder(Tree *root);

6   void destroy tree(Tree *root);


The function below is the implementation for adding a new string to the tree. Its basic operation is to first search for whether the word already exists in the tree. If so, it increments the count for that word. Otherwise, it adds the new word to the tree with a count of one. Trees (and linked-lists also), being self-referential structures, are well suited to recursive algorithms, and this is the case here: add_item() calls itself recursively until either the word is found, or an empty space is located in which to store a new word.



1 Tree *add item(Tree *node, const char *item)

2 /* Search for whether item already exists in tree. If not, add it to first empty

3 * node location (in lexicographical order), if found, increment word count. Perform

4* recursive descent for search, return pointer to current node. */

5   {
6 int cmp;

7

8if (node == NULL) /* found empty tree location, add item */

9 return make node(item);

10

11/* Recursive comparison to put item in correct location. */

12cmp = strcmp(item, node->item);

13if (cmp < 0)


14 node->left = add item(node->left, item);

15 else if (cmp > 0)

16 node->right = add item(node->right, item);

17 else

18 ++node->count; /* item already in tree, increment count */
19

20 return node;

21 }


8–9 If the passed pointer is NULL, a new node is created by calling make_node() and a pointer to this node is returned. The make_node() function is part of the module’s private interface; it allocates memory for the new node, and initialises it with a copy of the passed string and a count of one.

12 The binary tree stores words in lexicographic order. This ordering is accomplished using strcmp() to determine whether a word is less than, greater than, or equal to the word contained by the current node.

13–14 If the word is less than the node word, we recurse using the node’s left-hand child. Notice how the return value of add_item() is used to connect lower-level nodes with their parent nodes.

15–16 If the word is greater than the node word, we recurse using the node’s right-hand child.

17–18 If the words are equal (i.e., a match has been found), the count for that node is incremented and the recursive search terminates.

There are three points to note from this function. The first is that the recursive search terminates when either a word match is found (lines 17 and 18) or we reach a NULL node (lines 8 and 9) indicating

that we have a new word. Second, when a new child node is created, it is attached to its parent node via the return value (lines 9, 14 and 16). And third, the recursion, as it splits to the left and right, orders insertion and search, giving the tree its O(log n) properties.

The next two functions perform binary search and in-order visitation, respectively. The first, count_item() searches the tree for a word match, and returns the word count if a match is found, and zero if it is not. (The word count is the number of times a particular word was sent to add_item().) This function demonstrates an iterative (i.e., non-recursive) method for traversing the tree. The second function, print_inorder() visits every node in the tree and prints the stored word and word count. (The function print_node() is part of the module’s private interface.) The recursive implementation of print_inorder() causes the nodes to be printed in sorted order.


1 int count item(Tree *root, const char *item)

2 /* Search for item in tree and return its count. */

3   {

4 while (root) {

5 int cmp = strcmp(item, root->item);
6 if (cmp < 0)

7 root = root->left;
8 else if (cmp > 0)

9 root = root->right;
10 else

11 return root->count;

12 }

13 return 0;

14 }

15

16 void print inorder(Tree *node)

17 /* Print tree in lexicographical order */

18 {

19 if (node == NULL) return;

20 print inorder(node->left);

21 print node(node);

22 print inorder(node->right);

23 }


The basic binary tree, as presented here, is su cient for a great many situations. It is well suited to problems where the data arrives in random order, such as the words from a book. However, it behaves very ine ciently if the data does not arrive in random order and the tree becomes unbalanced. In the worst case, if the data is added to the tree in sorted order, the tree obtains the appearance and properties of a linked-list, with insert and search times being O(n).

Various solutions exist that resolve this problem. Advanced binary tree implementations, such as red-black trees, remain balanced for any input. Also, a data-structure called a skip list, while entirely di erent to a binary tree, possesses the same insertion and search properties as for a balanced binary tree.

Finally, even when the data is suitable for a simple binary tree implementation, it might not be the best data-structure for the job. Trees are best suited to tasks where the data is to be in sorted order during the insertion phase. However, if the data is to be stored in any order, and fast search is required subsequently, it is usually more e cient to store the data in an (expandable) array and then sort the array. With a good sorting algorithm, the time required to sort an array is less than the time to insert data into a tree. Similarly, the time to perform a binary search of a sorted array is generally less than the time to search a tree. Also, arrays consume less space than trees. The key advice here is to be aware of the tradeo s between data-structures, and know when one is likely to be more suitable than another.


Share:

No comments:

Post a Comment

Follow On YouTube