WebCrawler

## A basic Python web crawler.

MAIN FUNCTIONS / FEATURES

crawl_web(seed)

Given a seed page, builds an index of all links reachable from it and a graph of the links between the pages.
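The crawl loop can be sketched roughly as follows. This is a minimal sketch, not the code in Crawler.py: the `FAKE_WEB` dictionary, the regex-based `get_all_links` and `add_page_to_index` helpers, and the return shape `(index, graph)` are all assumptions made so the example runs without network access.

```python
import re

# Hypothetical in-memory "web" so the sketch runs offline.
FAKE_WEB = {
    "http://a.example": 'hello world <a href="http://b.example">b</a>',
    "http://b.example": 'hello again <a href="http://a.example">a</a> '
                        '<a href="http://c.example">c</a>',
    "http://c.example": "the end",
}


def get_page(url):
    """Stand-in for the real get_page: returns the HTML of a page, or ""."""
    return FAKE_WEB.get(url, "")


def get_all_links(html):
    """Return every href target found in anchor tags (naive regex, assumed helper)."""
    return re.findall(r'<a\s+href="([^"]+)"', html)


def add_page_to_index(index, url, html):
    """Map each word on the page to the list of URLs that contain it."""
    text = re.sub(r"<[^>]*>", " ", html)  # strip tags, keep visible text
    for word in text.split():
        urls = index.setdefault(word, [])
        if url not in urls:
            urls.append(url)


def crawl_web(seed):
    """Crawl from seed; return (index, graph) where graph maps url -> outlinks."""
    to_crawl = [seed]
    crawled = set()
    index = {}
    graph = {}
    while to_crawl:
        url = to_crawl.pop()
        if url in crawled:
            continue
        html = get_page(url)
        add_page_to_index(index, url, html)
        outlinks = get_all_links(html)
        graph[url] = outlinks
        to_crawl.extend(link for link in outlinks if link not in crawled)
        crawled.add(url)
    return index, graph
```

Calling `crawl_web("http://a.example")` on the fake web above visits all three pages and records, for example, that `http://b.example` links to both of the other pages.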

compute_ranks(graph)

Computes a rank for each page in the graph from its inlinks and outlinks.
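One common way to rank pages from inlinks and outlinks is a simplified PageRank iteration. The sketch below assumes that approach; the damping factor of 0.8, the fixed 10 iterations, and the `graph` shape (a dict mapping each URL to its list of outlinks) are illustrative assumptions, not values confirmed by Crawler.py.

```python
def compute_ranks(graph, damping=0.8, iterations=10):
    """Simplified PageRank over graph: {url: [outlink, ...]} -> {url: rank}."""
    n = len(graph)
    # Start with rank spread evenly across all pages.
    ranks = {page: 1.0 / n for page in graph}
    for _ in range(iterations):
        new_ranks = {}
        for page in graph:
            # Baseline share every page receives regardless of links.
            rank = (1 - damping) / n
            # Add a share from every page that links here (an inlink),
            # split evenly across that page's outlinks.
            for other, outlinks in graph.items():
                if page in outlinks:
                    rank += damping * ranks[other] / len(outlinks)
            new_ranks[page] = rank
        ranks = new_ranks
    return ranks


# Usage with a tiny hypothetical graph.
example_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
example_ranks = compute_ranks(example_graph)
```

In this example graph, page `c` ends up ranked above page `b`, because `c` receives links from both `a` and `b` while `b` receives only half of `a`'s share.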

lucky_search(index, ranks, keyword)

Returns the highest-ranked page for a given keyword.
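A minimal sketch of this lookup, assuming `index` maps each keyword to a list of URLs and `ranks` maps each URL to a number. Returning `None` when the keyword is missing, and treating unranked pages as rank 0, are assumptions of this sketch.

```python
def lucky_search(index, ranks, keyword):
    """Return the best-ranked URL for keyword, or None if there is no match."""
    pages = index.get(keyword)
    if not pages:
        return None
    # Pages missing from ranks are treated as rank 0.0 (assumption).
    return max(pages, key=lambda url: ranks.get(url, 0.0))


# Usage with hypothetical data.
example_index = {"python": ["http://a.example", "http://b.example"]}
example_ranks = {"http://a.example": 0.2, "http://b.example": 0.5}
best = lucky_search(example_index, example_ranks, "python")
```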

lookup(index, keyword)

Returns a list of all the URLs associated with the given keyword.
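Assuming the index is a dictionary from keyword to a list of URLs, the lookup could be as small as the sketch below. Returning an empty list for an unknown keyword is an assumption; the real function may behave differently.

```python
def lookup(index, keyword):
    """Return a copy of the URL list for keyword, or [] if it is not indexed."""
    return list(index.get(keyword, []))


# Usage with a hypothetical index.
example_index = {"crawler": ["http://a.example", "http://c.example"]}
urls = lookup(example_index, "crawler")
```

Returning a copy means callers can modify the result without changing the index.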

BUGS

get_page(page)

Currently works with only 3 specific URLs.
To do: update it to work on any URL, using Beautiful Soup to parse the HTML.
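As a rough idea of what a general `get_page` could look like, here is a sketch that fetches any URL with the standard library. Note the swap: it uses `html.parser` from the standard library as a stand-in for Beautiful Soup so the example has no third-party dependency. The `LinkCollector` class and `extract_links` helper are hypothetical names, and returning an empty string on failure is an assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


def get_page(url):
    """Fetch url and return its HTML as text, or "" if the request fails."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        # URLError is a subclass of OSError; ValueError covers malformed URLs.
        return ""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return absolute URLs for all anchor links found in html."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

With Beautiful Soup installed, `extract_links` could instead iterate over the anchor tags the library finds; the overall shape of the function would stay the same.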
