GitHub - luofei2011/image-spider: image spider, written by node.js.

A spider that crawls a website and collects the images it finds.

INSTALL

# clone git repository
git clone https://github.com/luofei2011/image-spider.git
cd image-spider

# install nodejs packages
npm install

# create test.js and open it in an editor
touch test.js
vim test.js

# insert the following into test.js
var Spider = require('./spider');
var spider = new Spider('http://poised-flw.com', {
    level: 3,
    maxSockets: 4,
    downloadImage: true
});
spider.start();

# save and quit
# then execute the file
node test.js

OPTIONS

useAgent: the User-Agent string the spider sends with its requests.

maxSockets: the maximum number of concurrent connections the spider opens.

level: the crawling depth of the spider.

onlyHost: whether the spider only crawls pages on the same domain as the start URL. Default: true.

downloadImage: whether to download the images while crawling. Default: false.
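
As an illustration of how these options fit together, here is a minimal sketch. `buildOptions` is a hypothetical helper that is not part of image-spider; it only fills in the two defaults documented above (`onlyHost: true`, `downloadImage: false`). The other values, including the `useAgent` string, are made up for the example, and the option names are spelled as they appear in this README.

```javascript
// Hypothetical helper (not part of image-spider): merges the two defaults
// documented in this README with caller-supplied options.
function buildOptions(overrides) {
  var documentedDefaults = {
    onlyHost: true,       // README: default true
    downloadImage: false  // README: default false
  };
  return Object.assign({}, documentedDefaults, overrides);
}

var options = buildOptions({
  level: 2,                          // crawl two levels deep (example value)
  maxSockets: 4,                     // up to four concurrent connections (example value)
  useAgent: 'image-spider-test/0.1'  // example value; option name as spelled in this README
});

console.log(options);

// With the repository checked out, the options would be passed as in INSTALL:
// var Spider = require('./spider');
// new Spider('http://poised-flw.com', options).start();
```

Passing the options object explicitly, as in test.js above, works just as well; the helper only makes the documented defaults visible.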

OUTPUT

The image URLs (the src values) are written to $(pwd)/log/images_log. You can download them afterwards with download.sh, or set downloadImage: true to download them during the crawl.
You can extend this tool to handle other file types such as js, css, and html.
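
If you want to post-process the log yourself instead of using download.sh, a sketch like the following may help. It assumes that log/images_log contains one image URL per line, which this README does not confirm, so check the file before relying on it. `parseImageLog` is a hypothetical name used only for this example.

```javascript
var fs = require('fs');
var path = require('path');

// Assumption: log/images_log holds one image URL per line.
// Returns the unique http(s) URLs in first-seen order.
function parseImageLog(text) {
  var seen = new Set();
  text.split(/\r?\n/).forEach(function (line) {
    var url = line.trim();
    if (/^https?:\/\//i.test(url)) {
      seen.add(url); // a Set drops duplicates and keeps insertion order
    }
  });
  return Array.from(seen);
}

// Read the log only if it exists, so the script is safe to run before a crawl.
var logPath = path.join(process.cwd(), 'log', 'images_log');
if (fs.existsSync(logPath)) {
  var urls = parseImageLog(fs.readFileSync(logPath, 'utf8'));
  console.log(urls.length + ' unique image URLs in ' + logPath);
}
```

Deduplicating first avoids fetching the same image twice when several pages reference it.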

ISSUES

If you run into any problems, please let me know. Thanks!

LICENSE

This project may only be used for learning Node.js.
