diff --git a/.travis.yml b/.travis.yml new file mode 100644 index 0000000..3d65f30 --- /dev/null +++ b/.travis.yml @@ -0,0 +1,9 @@ +language: python +python: + - "2.7" + - "3.6" + +install: + - pip install pytest +script: + - pytest diff --git a/README.md b/README.md index 518a54e..90a6e94 100644 --- a/README.md +++ b/README.md @@ -1,2 +1,46 @@ -# data-structures -Data Structures in Python. Code Fellows 401. +# Data-Structures + +**Author**: Chelsea Dole + +**Coverage**: [![Build Status](https://travis-ci.org/chelseadole/data-structures.svg?branch=master)](https://travis-ci.org/chelseadole/data-structures) + +**Resources/Shoutouts**: Nathan Moore (lab partner/amigo) + +**Testing Tools**: pytest, pytest-cov + +## Data Structures: + +* **Binary Search Tree** - *a BST is a "tree shaped" data structure containing nodes. Each node can have a maximum of two children or "leaves," and each node's value is placed according to its parent's and siblings' values. Nodes to the left of the "root"/head node have values smaller than the root. Those to the right have values larger than the root. There are no duplicate values.* + +* **Trie Tree** - *a Trie Tree is a "tree shaped" data structure containing nodes with references to letters. These nodes string together (using each node's "children" and "parent" attributes) to form words. This tree allows for quick lookup of words, and is used for things such as word suggestion/auto-complete.* + +## Time Complexities: + +* balance() = *This BST function returns the balance (or depth difference) between the left and right parts of the tree. Its runtime is O(1), because it always takes the same amount of time to run regardless of tree size, and only performs simple subtraction.* + +* size() = *This BST function returns the number of nodes/leaves in a tree. Its runtime is O(1), because runtime never changes regardless of tree size. 
It only returns the value of tree_size, which is created at BST initialization, and changed during the insertion of nodes.* + +* insert() = *This BST function inserts a new node into a tree, and uses a helper function called find_home() to find its correctly sorted place in the tree. Depending on the tree, its runtime is anywhere between O(log n) and O(n). If it's a relatively balanced tree, every decision roughly halves the number of nodes one has to traverse. But if it's a one-sided tree, one may look over every node -- making it O(n).* + +* search() = *This BST function calls check_for_equivalence(), which is recursive: each call compares the value against a single node and then descends either left or right. Like insert(), it has a runtime between O(log n) for a relatively balanced tree and O(n) for a one-sided tree.* + +* contains() = *This BST function calls search(), described above, and for the same reasons has a runtime between O(log n) and O(n).* + +* depth() = *This BST function returns the number of "levels" that the tree has, by finding which of the two sides has the greatest depth, and returning that. It has a runtime of O(1), because no matter the size of the tree, it only performs a comparison operation.* + +* in_order() = *This BST traversal function traverses the tree and returns a generator that outputs the node values in numerical order. It has a runtime of O(n), not because you visit every node once (you visit them more than once here) but because the work you do/time you take is constant and grows constantly per node addition.* + +* pre_order() = *This BST traversal function returns a generator that outputs the node values in order of each parent, then its left child, then its right child. This traversal then backs up to the parent, and repeats until the whole tree has been traversed. 
Like in_order, it has a runtime of O(n), not because you visit every node once (you visit them more than once here) but because the work you do/time you take is constant and grows constantly per node addition.* + +* post_order() = *This BST traversal function returns a generator that outputs the node values in order of the bottom-most left node, the bottom-most right node, and then those nodes' parent. Then it backs up, and repeats this action with the parent as a new child node, until the whole tree has been traversed. Like in_order and pre_order, it has a runtime of O(n), not because you visit every node once (you visit them more than once here) but because the work you do/time you take is constant and grows constantly per node addition.* + +* breadth_first() = *This BST traversal returns a generator that outputs the node values in order of their "levels". It produces first the root, then all nodes (left to right) in the first depth level, then all nodes (left to right) in the second depth level, et cetera. Like in_order, pre_order, and post_order, it has a runtime of O(n), not because you visit every node once (you visit them more than once here) but because the work you do/time you take is constant and grows constantly per node addition.* + +* insert() = *This Trie insert method adds a word to the trie tree. It first checks to see if the word is already in the tree (in which case it does nothing). Then, it goes through each letter of the word and uses the dictionary function setdefault to add a new letter node if it doesn't already exist, and string together the letters. Finally, it increases the tree's size attribute. The time complexity is O(len(word)), because the length of runtime depends on the size of the word you're inserting.* + +* contains() = *This Trie method checks if the tree contains a certain word. 
It does this by iterating through each letter of the word and checking if the letter node's children dictionary contains a key to the next letter in the word. If at any point it doesn't (or if the last letter of the word doesn't have the "end" attribute as True) it returns False. The worst-case time complexity is O(n), where n is the length of the word, because in the worst case you have to check through all the letters of that word.* + +* size() = *This Trie method returns the number of words in the tree by returning the tree's size attribute, which is incremented and decremented in insert() and remove() respectively. Time complexity should be O(1), because it just returns a number: the size attribute of the Trie.* + +* remove() = *This Trie method removes a word from the tree. First it traverses to the node of the last letter of the word (and raises an error if the word doesn't exist). Once at the last letter, it moves backwards, deleting references to the children/letters below. 
Time complexity is O(n), where n is the length of the word: in the worst case the word you're removing is the only word in the tree, so you traverse all the way down its letters and then come back up, which is about 2n steps and simplifies to O(n).* + diff --git a/bst.py b/bst.py new file mode 100644 index 0000000..4665490 --- /dev/null +++ b/bst.py @@ -0,0 +1,273 @@ +"""Implementation of a binary search tree data structure.""" +import timeit as time + + +class Node(object): + """Define the Node-class object.""" + + def __init__(self, value, left=None, right=None, parent=None): + """Constructor for the Node class.""" + self.val = value + self.left = left + self.right = right + self.parent = parent + self.depth = 0 + + +class BST(object): + """Define the BST-class object.""" + + def __init__(self, starting_values=None): + """Constructor for the BST class.""" + self.tree_size = 0 + self.left_depth = 0 + self.right_depth = 0 + self.visited = [] + + if starting_values is None: + self.root = None + + elif isinstance(starting_values, (list, str, tuple)): + self.root = Node(starting_values[0]) + self.tree_size += 1 + for i in range(len(starting_values) - 1): + self.insert(starting_values[i + 1]) + + else: + raise TypeError('Only iterables or None\ + are valid parameters!') + + def balance(self): + """Return the current balance of the BST.""" + return self.right_depth - self.left_depth + + def size(self): + """Return the current size of the BST.""" + return self.tree_size + + def insert(self, value): + """Insert a new node into the BST, and adjust the balance.""" + new_node = Node(value) + + if self.root: + if new_node.val > self.root.val: + if self.root.right: + self._find_home(new_node, self.root.right) + if new_node.depth > self.right_depth: + self.right_depth = new_node.depth + else: + new_node.parent = self.root + self.root.right = new_node + self.root.right.depth = 1 + if self.root.right.depth > self.right_depth: + self.right_depth = self.root.right.depth + self.tree_size += 1 + + elif new_node.val < self.root.val: + if 
self.root.left: + self._find_home(new_node, self.root.left) + if new_node.depth > self.left_depth: + self.left_depth = new_node.depth + else: + new_node.parent = self.root + self.root.left = new_node + self.root.left.depth = 1 + if self.root.left.depth > self.left_depth: + self.left_depth = self.root.left.depth + self.tree_size += 1 + else: + self.root = new_node + self.tree_size += 1 + + def _find_home(self, node_to_add, node_to_check): + """. + Check if the node_to_add belongs on the left or right + of the node_to_check, then place it there if that spot is empty, + otherwise recur. + """ + if node_to_add.val > node_to_check.val: + if node_to_check.right: + self._find_home(node_to_add, node_to_check.right) + else: + node_to_add.parent = node_to_check + node_to_check.right = node_to_add + node_to_check.right.depth = node_to_check.depth + 1 + self.tree_size += 1 + + elif node_to_add.val < node_to_check.val: + if node_to_check.left: + self._find_home(node_to_add, node_to_check.left) + else: + node_to_add.parent = node_to_check + node_to_check.left = node_to_add + node_to_check.left.depth = node_to_check.depth + 1 + self.tree_size += 1 + + def search(self, value): + """If a value is in the BST, return its node.""" + return self._check_for_equivalence(value, self.root) + + def contains(self, value): + """Return whether or not a value is in the BST.""" + return bool(self.search(value)) + + def _check_for_equivalence(self, value, node_to_check): + """. + Check if the value matches that of the node_to_check + if it does, return the node. If it doesn't, go left or right + as appropriate and recur. If you reach a dead end, return None. 
+ """ + try: + if value == node_to_check.val: + return node_to_check + + except AttributeError: + return None + + if value > node_to_check.val and node_to_check.right: + return self._check_for_equivalence(value, node_to_check.right) + + elif value < node_to_check.val and node_to_check.left: + return self._check_for_equivalence(value, node_to_check.left) + + def depth(self): + """Return the depth of the BST.""" + if self.left_depth > self.right_depth: + return self.left_depth + return self.right_depth + + def in_order(self): + """Return a generator to perform an in-order traversal.""" + self.visited = [] + + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._in_order_gen() + return gen + + def _in_order_gen(self): + """Recursive helper method for in-order traversal.""" + current = self.root + + while len(self.visited) < self.tree_size: + if current.left: + if current.left.val not in self.visited: + current = current.left + continue + + if current.val not in self.visited: + self.visited.append(current.val) + yield current.val + + if current.right: + if current.right.val not in self.visited: + current = current.right + continue + + current = current.parent + + def pre_order(self): + """Return a generator to perform an pre-order traversal.""" + self.visited = [] + + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._pre_order_gen() + return gen + + def _pre_order_gen(self): + """Recursive helper method for pre-order traversal.""" + current = self.root + + while len(self.visited) < self.tree_size: + if current.val not in self.visited: + self.visited.append(current.val) + yield current.val + + if current.left: + if current.left.val not in self.visited: + current = current.left + continue + + if current.right: + if current.right.val not in self.visited: + current = current.right + continue + + current = current.parent + + def post_order(self): + """Return a generator to perform an post-order traversal.""" + self.visited = [] 
+ + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._post_order_gen() + return gen + + def _post_order_gen(self): + """Recursive helper method for post-order traversal.""" + current = self.root + + while len(self.visited) < self.tree_size: + if current.left: + if current.left.val not in self.visited: + current = current.left + continue + + if current.right: + if current.right.val not in self.visited: + current = current.right + continue + + if current.val not in self.visited: + self.visited.append(current.val) + yield current.val + + current = current.parent + + def breadth_first(self): + """Return a generator to perform a breadth-first traversal.""" + self.visited = [] + + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._breadth_first_gen(self.root) + return gen + + def _breadth_first_gen(self, root_node): + """Helper generator for breadth-first traversal.""" + queue = [self.root] + while queue: + current = queue[0] + yield current.val + queue = queue[1:] + + if current not in self.visited: + self.visited.append(current) + + if current.left: + if current.left not in self.visited: + queue.append(current.left) + + if current.right: + if current.right not in self.visited: + queue.append(current.right) + + +if __name__ == '__main__': # pragma: no cover + left_bigger = BST([6, 5, 4, 3, 2, 1]) + right_bigger = BST([1, 2, 3, 4, 5, 6]) + bal_tree = BST([20, 12, 10, 1, 11, 16, 30, 42, 28, 27]) + + left_bigger = time.timeit("left_bigger.search(5)", setup="from __main__ import left_bigger") + right_bigger = time.timeit("right_bigger.search(5)", setup="from __main__ import right_bigger") + bal_tree = time.timeit("bal_tree.search(8)", setup="from __main__ import bal_tree") + + print('Left-Skewed Search Time: ', left_bigger) + print('Right-Skewed Search Time: ', right_bigger) + print('Balanced Search Time: ', bal_tree) diff --git a/setup.py b/setup.py new file mode 100644 index 0000000..6751995 --- /dev/null +++ b/setup.py @@ 
-0,0 +1,15 @@ +"""Setup module for Chelsea's data structures.""" + +from setuptools import setup + +setup( + name='Data structures', + description='Various data structures in python', + author='Chelsea Dole', + author_email='chelseadole@gmail', + package_dir={'': 'src'}, + py_modules=['bst'], + install_requires=[], + extras_require={ + 'test': ['pytest', 'pytest-cov', 'pytest-watch', 'tox'], + 'development': ['ipython']}) diff --git a/src/bst.py b/src/bst.py new file mode 100644 index 0000000..96faef5 --- /dev/null +++ b/src/bst.py @@ -0,0 +1,258 @@ +"""Implementation of a binary search tree data structure.""" + + +class Node(object): + """Define the Node-class object.""" + + def __init__(self, value, left=None, right=None, parent=None): + """Constructor for the Node class.""" + self.val = value + self.left = left + self.right = right + self.parent = parent + self.depth = 0 + + +class BST(object): + """Define the BST-class object.""" + + def __init__(self, starting_values=None): + """Constructor for the BST class.""" + self.tree_size = 0 + self.left_depth = 0 + self.right_depth = 0 + self.visited = [] + + if starting_values is None: + self.root = None + + elif isinstance(starting_values, (list, str, tuple)): + self.root = Node(starting_values[0]) + self.tree_size += 1 + for i in range(len(starting_values) - 1): + self.insert(starting_values[i + 1]) + + else: + raise TypeError('Only iterables or None\ + are valid parameters!') + + def balance(self): + """Return the current balance of the BST.""" + return self.right_depth - self.left_depth + + def size(self): + """Return the current size of the BST.""" + return self.tree_size + + def insert(self, value): + """Insert a new node into the BST, and adjust the balance.""" + new_node = Node(value) + + if self.root: + if new_node.val > self.root.val: + if self.root.right: + self._find_home(new_node, self.root.right) + if new_node.depth > self.right_depth: + self.right_depth = new_node.depth + else: + 
new_node.parent = self.root + self.root.right = new_node + self.root.right.depth = 1 + if self.root.right.depth > self.right_depth: + self.right_depth = self.root.right.depth + self.tree_size += 1 + + elif new_node.val < self.root.val: + if self.root.left: + self._find_home(new_node, self.root.left) + if new_node.depth > self.left_depth: + self.left_depth = new_node.depth + else: + new_node.parent = self.root + self.root.left = new_node + self.root.left.depth = 1 + if self.root.left.depth > self.left_depth: + self.left_depth = self.root.left.depth + self.tree_size += 1 + else: + self.root = new_node + self.tree_size += 1 + + def _find_home(self, node_to_add, node_to_check): + """. + Check if the node_to_add belongs on the left or right + of the node_to_check, then place it there if that spot is empty, + otherwise recur. + """ + if node_to_add.val > node_to_check.val: + if node_to_check.right: + self._find_home(node_to_add, node_to_check.right) + else: + node_to_add.parent = node_to_check + node_to_check.right = node_to_add + node_to_check.right.depth = node_to_check.depth + 1 + self.tree_size += 1 + + elif node_to_add.val < node_to_check.val: + if node_to_check.left: + self._find_home(node_to_add, node_to_check.left) + else: + node_to_add.parent = node_to_check + node_to_check.left = node_to_add + node_to_check.left.depth = node_to_check.depth + 1 + self.tree_size += 1 + + def search(self, value): + """If a value is in the BST, return its node.""" + return self._check_for_equivalence(value, self.root) + + def contains(self, value): + """Return whether or not a value is in the BST.""" + return bool(self.search(value)) + + def _check_for_equivalence(self, value, node_to_check): + """. + Check if the value matches that of the node_to_check + if it does, return the node. If it doesn't, go left or right + as appropriate and recur. If you reach a dead end, return None. 
+ """ + try: + if value == node_to_check.val: + return node_to_check + + except AttributeError: + return None + + if value > node_to_check.val and node_to_check.right: + return self._check_for_equivalence(value, node_to_check.right) + + elif value < node_to_check.val and node_to_check.left: + return self._check_for_equivalence(value, node_to_check.left) + + def depth(self): + """Return the depth of the BST.""" + if self.left_depth > self.right_depth: + return self.left_depth + return self.right_depth + + def in_order(self): + """Return a generator to perform an in-order traversal.""" + self.visited = [] + + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._in_order_gen() + return gen + + def _in_order_gen(self): + """Recursive helper method for in-order traversal.""" + current = self.root + + while len(self.visited) < self.tree_size: + if current.left: + if current.left.val not in self.visited: + current = current.left + continue + + if current.val not in self.visited: + self.visited.append(current.val) + yield current.val + + if current.right: + if current.right.val not in self.visited: + current = current.right + continue + + current = current.parent + + def pre_order(self): + """Return a generator to perform an pre-order traversal.""" + self.visited = [] + + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._pre_order_gen() + return gen + + def _pre_order_gen(self): + """Recursive helper method for pre-order traversal.""" + current = self.root + + while len(self.visited) < self.tree_size: + if current.val not in self.visited: + self.visited.append(current.val) + yield current.val + + if current.left: + if current.left.val not in self.visited: + current = current.left + continue + + if current.right: + if current.right.val not in self.visited: + current = current.right + continue + + current = current.parent + + def post_order(self): + """Return a generator to perform an post-order traversal.""" + self.visited = [] 
+ + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._post_order_gen() + return gen + + def _post_order_gen(self): + """Recursive helper method for post-order traversal.""" + current = self.root + + while len(self.visited) < self.tree_size: + if current.left: + if current.left.val not in self.visited: + current = current.left + continue + + if current.right: + if current.right.val not in self.visited: + current = current.right + continue + + if current.val not in self.visited: + self.visited.append(current.val) + yield current.val + + current = current.parent + + def breadth_first(self): + """Return a generator to perform a breadth-first traversal.""" + self.visited = [] + + if self.root is None: + raise IndexError("Tree is empty!") + + gen = self._breadth_first_gen(self.root) + return gen + + def _breadth_first_gen(self, root_node): + """Helper generator for breadth-first traversal.""" + queue = [self.root] + while queue: + current = queue[0] + yield current.val + queue = queue[1:] + + if current not in self.visited: + self.visited.append(current) + + if current.left: + if current.left not in self.visited: + queue.append(current.left) + + if current.right: + if current.right not in self.visited: + queue.append(current.right) diff --git a/src/test_bst.py b/src/test_bst.py new file mode 100644 index 0000000..77b764f --- /dev/null +++ b/src/test_bst.py @@ -0,0 +1,363 @@ +"""Tests for the Binary Search Tree.""" + +import pytest +from bst import BST +from bst import Node + + +@pytest.fixture +def sample_bst(): + """Make a sample_bst for testing.""" + from bst import BST + return BST() + + +def test_bst_exists(sample_bst): + """Test that the BST class makes something.""" + assert sample_bst + + +def test_bst_can_take_list_at_initialization(): + """Test that the BST can take a list.""" + from bst import BST + b = BST([1, 2, 3]) + assert b.size() == 3 + assert b.depth() == 2 + assert b.left_depth == 0 + assert b.right_depth == 2 + + +def 
test_bst_can_take_tuple_at_initialization(): + """Test that the BST can take a tuple.""" + from bst import BST + b = BST((1, 2, 3)) + assert b.size() == 3 + assert b.depth() == 2 + assert b.left_depth == 0 + assert b.right_depth == 2 + + +def test_bst_can_take_string_at_initialization(): + """Test that the BST can take a string.""" + from bst import BST + b = BST('abc') + assert b.size() == 3 + assert b.depth() == 2 + assert b.left_depth == 0 + assert b.right_depth == 2 + + +def test_insert_increases_depth(sample_bst): + """Test that the insert method increases the output of the depth method.""" + assert sample_bst.depth() == 0 + sample_bst.insert(1) + assert sample_bst.depth() == 0 + sample_bst.insert(2) + assert sample_bst.depth() == 1 + + +def test_insert_increases_size(sample_bst): + """Test that the insert method increases the output of the size method.""" + assert sample_bst.size() == 0 + sample_bst.insert(1) + assert sample_bst.size() == 1 + sample_bst.insert(2) + assert sample_bst.size() == 2 + + +def test_insert_increases_tree_size(sample_bst): + """Test that the insert method increases the tree_size attribute.""" + assert sample_bst.tree_size == 0 + sample_bst.insert(1) + assert sample_bst.tree_size == 1 + sample_bst.insert(2) + assert sample_bst.tree_size == 2 + + +def test_search_right(sample_bst): + """Assert that the search method returns something and that it's a Node.""" + from bst import Node + sample_bst.insert(1) + sample_bst.insert(2) + sample_bst.insert(3) + found = sample_bst.search(3) + assert found.val == 3 + assert isinstance(found, Node) + + +def test_search_left(sample_bst): + """Assert that the search method returns something and that it's a Node.""" + from bst import Node + sample_bst.insert(3) + sample_bst.insert(2) + sample_bst.insert(1) + found = sample_bst.search(3) + assert found.val == 3 + assert isinstance(found, Node) + + +def test_search_not_found_returns_none(sample_bst): + """Assert that the search method returns None when 
value isn't found.""" + sample_bst.insert(1) + sample_bst.insert(2) + sample_bst.insert(3) + found = sample_bst.search(4) + assert found is None + + +def test_contains_right(sample_bst): + """Assert that the contains method returns True.""" + sample_bst.insert(1) + sample_bst.insert(2) + sample_bst.insert(3) + found = sample_bst.contains(3) + assert found is True + + +def test_contains_left(sample_bst): + """Assert that the contains method returns True.""" + sample_bst.insert(3) + sample_bst.insert(2) + sample_bst.insert(1) + found = sample_bst.contains(3) + assert found is True + + +def test_contains_not_found_returns_none(sample_bst): + """Assert that the contains method returns False when value isn't found.""" + sample_bst.insert(1) + sample_bst.insert(2) + sample_bst.insert(3) + found = sample_bst.contains(4) + assert found is False + + +def test_that_bst_doesnt_work_with_non_iterable(): + """Test that BST only takes iterable inputs.""" + with pytest.raises(TypeError): + BST({0: 0, 1: 1, 2: 2}) + + +def test_adding_preexisting_node_is_not_added(sample_bst): + """Test that adding a node val that exists does not increase BST.""" + assert sample_bst.size() == 0 + sample_bst.insert(5) + sample_bst.insert(5) + sample_bst.insert(5) + assert sample_bst.size() == 1 + + +def test_that_negative_numbers_work_with_insert(sample_bst): + """Test that negative numbers are covered in insert.""" + sample_bst.insert(-500) + assert sample_bst + + +def test_node_attributes_exist(): + """Test that node attributes are in Node class.""" + n = Node(1) + assert n.val == 1 + assert n.left is None + assert n.right is None + assert n.depth == 0 + + +def test_node_attribute_depth_changes(sample_bst): + """Test that node attribute depth increases.""" + sample_bst.insert(4) + sample_bst.insert(3) + sample_bst.insert(2.5) + assert sample_bst.search(4).depth == 0 + assert sample_bst.search(3).depth == 1 + assert sample_bst.search(2.5).depth == 2 + + +def 
test_node_left_and_right_attributes_change(): + """Test that left and right node attributes are added with insert.""" + b = BST([5]) + b.insert(4) + b.insert(6) + assert b.root.left.val == 4 + assert b.root.right.val == 6 + + +def test_root_val_with_no_val_at_initialization(sample_bst): + """Test that root is None.""" + assert sample_bst.root is None + + +def test_in_order_indexerrors_with_empty_tree(sample_bst): + """Test that in_order raises an IndexError if the tree is empty.""" + with pytest.raises(IndexError): + sample_bst.in_order() + + +def test_pre_order_indexerrors_with_empty_tree(sample_bst): + """Test that pre_order raises an IndexError if the tree is empty.""" + with pytest.raises(IndexError): + sample_bst.pre_order() + + +def test_post_order_indexerrors_with_empty_tree(sample_bst): + """Test that post_order raises an IndexError if the tree is empty.""" + with pytest.raises(IndexError): + sample_bst.post_order() + + +def test_breadth_first_indexerrors_with_empty_tree(sample_bst): + """Test that breadth_first raises an IndexError if the tree is empty.""" + with pytest.raises(IndexError): + sample_bst.breadth_first() + + +def test_in_order_size_one(sample_bst): + """Check for the correct output of in_order on a tree of size 1.""" + sample_bst.insert(1) + gen = sample_bst.in_order() + assert next(gen) == 1 + + +def test_pre_order_size_one(sample_bst): + """Check for the correct output of pre_order on a tree of size 1.""" + sample_bst.insert(1) + gen = sample_bst.pre_order() + assert next(gen) == 1 + + +def test_post_order_size_one(sample_bst): + """Check for the correct output of post_order on a tree of size 1.""" + sample_bst.insert(1) + gen = sample_bst.post_order() + assert next(gen) == 1 + + +def test_breadth_first_size_one(sample_bst): + """Check for the correct output of breadth_first on a tree of size 1.""" + sample_bst.insert(1) + gen = sample_bst.breadth_first() + assert next(gen) == 1 + + +LEFT_IMBALANCED = [6, 5, 4, 3, 2, 1] +RIGHT_IMBALANCED = 
[1, 2, 3, 4, 5, 6] +SAMPLE_TREE = [20, 12, 10, 1, 11, 16, 30, 42, 28, 27] + + +def test_in_order_left_imba(): + """Check for the correct output of iot on a left-imbalanced tree.""" + tree = BST(LEFT_IMBALANCED) + gen = tree.in_order() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [1, 2, 3, 4, 5, 6] + + +def test_pre_order_left_imba(): + """Check for the correct output of preo-t on a left-imbalanced tree.""" + tree = BST(LEFT_IMBALANCED) + gen = tree.pre_order() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [6, 5, 4, 3, 2, 1] + + +def test_post_order_left_imba(): + """Check for the correct output of posto-t on a left-imbalanced tree.""" + tree = BST(LEFT_IMBALANCED) + gen = tree.post_order() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [1, 2, 3, 4, 5, 6] + + +def test_breadth_first_left_imba(): + """Check for the correct output of bft on a left-imbalanced tree.""" + tree = BST(LEFT_IMBALANCED) + gen = tree.breadth_first() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [6, 5, 4, 3, 2, 1] + + +def test_in_order_right_imba(): + """Check for the correct output of iot on a right-imbalanced tree.""" + tree = BST(RIGHT_IMBALANCED) + gen = tree.in_order() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [1, 2, 3, 4, 5, 6] + + +def test_pre_order_right_imba(): + """Check for the correct output of preo-t on a right-imbalanced tree.""" + tree = BST(RIGHT_IMBALANCED) + gen = tree.pre_order() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [1, 2, 3, 4, 5, 6] + + +def test_post_order_right_imba(): + """Check for the correct output of posto-t on a right-imbalanced tree.""" + tree = BST(RIGHT_IMBALANCED) + gen = tree.post_order() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [6, 5, 4, 3, 2, 1] + + +def test_breadth_first_right_imba(): + """Check 
for the correct output of bft on a right-imbalanced tree.""" + tree = BST(RIGHT_IMBALANCED) + gen = tree.breadth_first() + output = [] + for i in range(6): + output.append(next(gen)) + assert output == [1, 2, 3, 4, 5, 6] + + +def test_in_order_sample_tree(): + """Check for the correct output of iot on a sample tree.""" + tree = BST(SAMPLE_TREE) + gen = tree.in_order() + output = [] + for i in range(10): + output.append(next(gen)) + assert output == [1, 10, 11, 12, 16, 20, 27, 28, 30, 42] + + +def test_pre_order_sample_tree(): + """Check for the correct output of preo-t on a sample tree.""" + tree = BST(SAMPLE_TREE) + gen = tree.pre_order() + output = [] + for i in range(10): + output.append(next(gen)) + assert output == [20, 12, 10, 1, 11, 16, 30, 28, 27, 42] + + +def test_post_order_sample_tree(): + """Check for the correct output of posto-t on a sample tree.""" + tree = BST(SAMPLE_TREE) + gen = tree.post_order() + output = [] + for i in range(10): + output.append(next(gen)) + assert output == [1, 11, 10, 16, 12, 27, 28, 42, 30, 20] + + +def test_breadth_first_sample_tree(): + """Check for the correct output of bft on a right-imbalanced tree.""" + tree = BST(SAMPLE_TREE) + gen = tree.breadth_first() + output = [] + for i in range(10): + output.append(next(gen)) + assert output == [20, 12, 30, 10, 16, 28, 42, 1, 11, 27] diff --git a/src/test_trie.py b/src/test_trie.py new file mode 100644 index 0000000..d034a63 --- /dev/null +++ b/src/test_trie.py @@ -0,0 +1,195 @@ +"""Tests for Trie tree.""" + +import pytest +from trie import Trie +from trie import Node + + +@pytest.fixture +def empty(): + """Sample Trie without nodes for testing.""" + return Trie() + + +@pytest.fixture +def filled_1(): + """Sample Trie with contents for testing.""" + t = Trie() + t.insert('hello') + t.insert('goodbye') + t.insert('helsinki') + t.insert('goodlord') + t.insert('squish') + t.insert('heckingoodboye') + return t + + +@pytest.fixture +def filled_2(): + """Sample Trie, with simpler 
contents.""" + t = Trie() + t.insert('abc') + t.insert('az') + t.insert('a') + t.insert('q') + return t + + +def test_created_node_has_attributes(): + """Test attributes of Node.""" + n = Node() + assert n.letter is None + assert n.children == {} + assert n.end is False + + +def test_trie_has_correct_attributes(empty): + """Test that Trie has correct attributes on init.""" + assert empty.root.letter == '*' + assert isinstance(empty.root, Node) + + +def test_insert_adds_word_to_trie(empty): + """Test basic insert method on single word.""" + empty.insert('abc') + assert 'a' in empty.root.children + assert 'b' in empty.root.children['a'].children + assert 'c' in empty.root.children['a'].children['b'].children + + +def test_word_has_end_attribute(empty): + """Test that nothing comes after the sign inserted.""" + empty.insert('a') + assert 'a' in empty.root.children + assert empty.root.children['a'].end is True + + +def test_word_is_not_added_twice(empty): + """Test that the same word cannot be added twice.""" + empty.insert('yo') + a = empty.root.children + empty.insert('yo') + b = empty.root.children + assert a == b + + +def test_one_letter_word_works(empty): + """Test insert method on one letter word.""" + empty.insert('a') + assert len(empty.root.children) == 1 + + +def test_insert_adds_multiple_words(filled_2): + """Test that insert works with multiple words.""" + keys = filled_2.root.children.keys() + assert 'a' in keys and 'q' in keys + assert len(keys) == 2 + assert len(filled_2.root.children['a'].children) == 2 + assert 'b' in filled_2.root.children['a'].children + assert 'z' in filled_2.root.children['a'].children + + +def test_insert_adds_multiple_words_using_contains(filled_1): + """Test combo of contains and insert method.""" + assert filled_1.contains('hello') + assert filled_1.contains('goodbye') + assert filled_1.contains('helsinki') + assert filled_1.contains('goodlord') + assert filled_1.contains('squish') + assert filled_1.contains('heckingoodboye') + 
assert not filled_1.contains('thisisnothere') + + +def test_contains_where_it_returns_false(filled_2, filled_1): + """Test false contains.""" + assert not filled_2.contains('nooooope') + assert not filled_1.contains('h') + assert not filled_1.contains('good') + assert not filled_1.contains('squi') + + +def test_size_method_on_empty_trie(empty): + """Test size on empty trie instance.""" + assert empty.size() == 0 + + +def test_size_method_on_filled_trie(filled_1): + """Test size on filled trie instance.""" + assert filled_1.size() == 6 + + +def test_size_method_on_second_filled_trie(): + """Test size on a second filled trie instance.""" + t = Trie() + t.insert('abc') + t.insert('az') + t.insert('a') + t.insert('q') + assert t.size() == 4 + + +def test_remove_method_doesnt_work_without_word(filled_1): + """Test that the remove method will raise TypeError.""" + with pytest.raises(TypeError): + filled_1.remove('thiswordisnotindict') + + +def test_deleting_single_word(empty): + """Test that removing the only word leaves an empty Trie.""" + empty.insert('ace') + empty.remove('ace') + assert empty.size() == 0 + assert empty.contains('ace') is False + + +def test_remove_will_remove_word_from_dict(filled_1): + """Test remove method will remove word from Trie.""" + assert filled_1.contains('heckingoodboye') + filled_1.remove('heckingoodboye') + assert filled_1.contains('heckingoodboye') is False + + +def test_remove_wont_remove_words_with_same_beginning(empty): + """Test that remove method won't remove words if they start with the same letters.""" + empty.insert('antidisestablishmentarianism') + empty.insert('antimatter') + empty.remove('antimatter') + assert empty.contains('antidisestablishmentarianism') + assert empty.contains('antimatter') is False + + +def test_size_decreases_with_removing_node(filled_2): + """Test size of tree reduces when you delete a word.""" + assert filled_2.size() == 4 + filled_2.remove('az') + assert filled_2.size() == 3 + + +def test_trie_autocomplete_on_filled_tree_letter_h(filled_1): + """Autocomplete tests on 
filled tree.""" + a = filled_1.autocomplete('h') + assert next(a) == 'hello' + assert next(a) == 'helsinki' + assert next(a) == 'heckingoodboye' + with pytest.raises(StopIteration): + assert next(a) + + +def test_trie_autocomplete_on_filled_tree_letter_g(filled_1): + """Autocomplete tests on filled tree.""" + a = filled_1.autocomplete('good') + assert next(a) == 'goodbye' + assert next(a) == 'goodlord' + + +def test_trie_autocomplete_where_no_suggestions(filled_1): + """Autocomplete with a letter not in Trie tree, makes empty list.""" + a = filled_1.autocomplete('z') + assert a == [] + + +def test_trie_auto_with_non_string(filled_1): + """Autocomplete with a non-string.""" + with pytest.raises(TypeError): + a = filled_1.autocomplete(5) + assert next(a) diff --git a/src/trie.py b/src/trie.py new file mode 100644 index 0000000..94e2a6f --- /dev/null +++ b/src/trie.py @@ -0,0 +1,124 @@ +"""Trie tree structure.""" + + +class Node(object): + """Node class.""" + + def __init__(self, letter=None, parent=None, end=False): + """Initialization of Trie node attributes.""" + self.letter = letter + self.children = {} + self.parent = parent + self.end = end + + def __iter__(self): + """Make children iterable.""" + return iter(self.children.values()) + + +class Trie(object): + """Trie class.""" + + def __init__(self): + """Initialization of Trie tree.""" + self.root = Node('*') + self.tree_size = 0 + + def insert(self, word): + """Insert a new word into the tree.""" + current = self.root + if type(word) is not str or self.contains(word): + return + for letter in word: + current.children.setdefault(letter, Node(letter, current)) + current = current.children[letter] + current.end = True + self.tree_size += 1 + return + + def contains(self, word): + """Check if Trie contains word.""" + current = self.root + for letter in word: + if letter not in current.children: + return False + current = current.children[letter] + if current.end: + return True + return False + + def size(self): + 
"""Return number of words in Trie tree.""" + return self.tree_size + + def remove(self, word): + """Remove word from trie.""" + current = self.root + for letter in word: + if letter not in current.children: + raise TypeError('This word is not in Trie.') + current = current.children[letter] + current = current.parent + while len(current.children) == 1: + current.children.clear() + if current.parent: + current = current.parent + self.tree_size -= 1 + + def trie_traversal(self, start=None): + """Depth-first traversal of Trie.""" + self.visited = [] + + if start: + curr = self.root + for char in start: + if char in curr.children: + curr = curr.children[char] + return 'Invalid starting string.' + return self._combo_gen(curr) + else: + return self._combo_gen(self.root) + + def _combo_gen(self, start): + """Generator yielding the letters collected in self.visited.""" + for child, child_node in start.children.items(): + self.visited.append(child) + for node in child_node.children: + self.visited.append(child_node.children[node].letter) + if child_node.children[node].end and not child_node.children: + continue + child_node.children = child_node.children[node].children + for let in self.visited: + yield let + + def _trie_gen(self, start): + """Generator for traversal function.""" + for child in start.children: + return self._recursive_depth(start.children[child]) + + def _recursive_depth(self, node): + """Recursive helper fn for generator.""" + self.visited.append(node.letter) + for child in node.children: + if node.children[child].end: + break + yield self._recursive_depth(node.children[child]) + + def autocomplete(self, start): + """Autocomplete using Trie Tree.""" + if isinstance(start, str): + curr = self.root + for letter in start: + if letter not in curr.children: + return [] + curr = curr.children[letter] + return self._auto_helper(curr, start) + raise TypeError('Autocomplete takes only strings.') + + def _auto_helper(self, node, start): + """Helper fn for autocomplete.""" + if node.end: + yield start + for letter in node.children: + for 
word in self._auto_helper(node.children[letter], start + letter): + yield word diff --git a/tox.ini b/tox.ini new file mode 100644 index 0000000..082b2bc --- /dev/null +++ b/tox.ini @@ -0,0 +1,8 @@ +[tox] +envlist = py27, py36 + +[testenv] +commands = py.test --cov --cov-report term-missing +deps = + pytest + pytest-cov \ No newline at end of file
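
Review note on `Trie.remove` in `src/trie.py`: the method never reads or resets a node's `end` flag, so a bare prefix such as `good` does not raise, and a word that is a prefix of another (for example `hell` and `hello`) is the delicate case. The following is a minimal sketch of one possible alternative, not the author's implementation. It assumes the same `Node` attributes (`letter`, `children`, `parent`, `end`) and reuses the `TypeError` the existing tests expect; the cut-down `Trie` here is hypothetical and carries only the methods needed to show the idea.

```python
class Node(object):
    """Trie node with the same attributes as Node in src/trie.py."""

    def __init__(self, letter=None, parent=None, end=False):
        self.letter = letter
        self.children = {}
        self.parent = parent
        self.end = end


class Trie(object):
    """Hypothetical cut-down Trie: insert, contains, size, remove only."""

    def __init__(self):
        self.root = Node('*')
        self.tree_size = 0

    def insert(self, word):
        """Insert word unless it is a non-string or already present."""
        if not isinstance(word, str) or self.contains(word):
            return
        current = self.root
        for letter in word:
            if letter not in current.children:
                current.children[letter] = Node(letter, current)
            current = current.children[letter]
        current.end = True
        self.tree_size += 1

    def contains(self, word):
        """Return True only if word was inserted as a complete word."""
        current = self.root
        for letter in word:
            if letter not in current.children:
                return False
            current = current.children[letter]
        return current.end

    def size(self):
        """Return the number of stored words."""
        return self.tree_size

    def remove(self, word):
        """Unmark word, then prune only the nodes nothing else uses."""
        current = self.root
        for letter in word:
            if letter not in current.children:
                raise TypeError('This word is not in Trie.')
            current = current.children[letter]
        if not current.end:
            # The path exists but only as a prefix of longer words.
            raise TypeError('This word is not in Trie.')
        current.end = False
        # Walk upward, deleting nodes that are childless and end no word.
        while (current.parent is not None
               and not current.children and not current.end):
            del current.parent.children[current.letter]
            current = current.parent
        self.tree_size -= 1
```

Pruning stops at the first ancestor that still has children or ends another word, so removing `hello` leaves `hell` and `help` reachable.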