** Downloads files from list of source URLs at a well defined specified path
The CLI has been tested on ubuntu18 (64 bit), golang1.17.6.
Unzip the code and place it in $GO_PATH/src/github.com Then on terminal, run
make clean; make buildThis will create a binary in $GO_PATH/src/github.com/dl_cli/bin
Below are supported commands for dl_cli
% ./dl_cli --help
CLI to download files.
Usage:
dl_cli [command]
Available Commands:
download download files from list of urls
help Help about any command
version Current version of file downloader CLI being used
Flags:
-h, --help help for dl_cli
Use "dl_cli [command] --help" for more information about a command.% ./dl_cli version
dl_cli version: v1.0
# Download the files represented by comma separated urls at a specified path.
% ./dl_cli download -p <path to download> -u <comma separated list of urls>
# Sample command to download files,
% ./dl_cli download -p /tmp/test -u http://my.file.com/file,https://you.file.com/test- The code is written in Golang.
- The code is multi-threaded.
- Used waitgroups to make sure all the file downloads are completed before the main process exits.
- Currently supports http and https protocols.
- Used protocol interface to extend it to other protocols.
- Used combination of http.Get and io.copy to make sure that process does not go out-of-memory.
- This is achieved due to the fact that io.Copy copies data in chunks of 32k.
- If download fails, partial file is deleted.
- In case of failures, rollbacks are carried out.
- The code creates timestamp based directory, followed by sub-dir structure in order to avoid collisions.
- Not able to get automated unit tests implemented. I ran out of time due to my job commitments.
- Looking back, I could have structured the code a bit differently, creating different interfaces.
- Not tested with directory urls.
- Could have added checksumming logic.
- Tested with real urls from S3
- Tested for larget files
- Tested for out of memory on local file system
- Tested for protocols that are not http or https
Example output of testing
% ./bin/dl_cli download -p /tmp -u https://pmkft-assets.s3.us-west-1.amazonaws.com/appctl/macos/appctl,https://pmkft-assets.s3.us-west-1.amazonaws.com/non-existing-file,https://pmkft-assets.s3.us-west-1.amazonaws.com/appctl/windows/appctl,ftp://my.file.com/file,https://pmkft-assets.s3.us-west-1.amazonaws.com/appctl/linux/appctl
Source URL : https://pmkft-assets.s3.us-west-1.amazonaws.com/appctl/macos/appctl Destination : /tmp/2022-03-29-16-21-06/pmkft-assets.s3.us-west-1.amazonaws.com/appctl/macos/appctl Status : Success
Source URL : https://pmkft-assets.s3.us-west-1.amazonaws.com/non-existing-file Destination : /tmp/2022-03-29-16-21-06/pmkft-assets.s3.us-west-1.amazonaws.com/non-existing-file Status : Failed, Reason : Download failed for URL: https://pmkft-assets.s3.us-west-1.amazonaws.com/non-existing-file
Source URL : https://pmkft-assets.s3.us-west-1.amazonaws.com/appctl/windows/appctl Destination : /tmp/2022-03-29-16-21-06/pmkft-assets.s3.us-west-1.amazonaws.com/appctl/windows/appctl Status : Success
Source URL : ftp://my.file.com/file Destination : /tmp/2022-03-29-16-21-06/my.file.com/file Status : Failed, Reason : Error: Protocol Unsupported: ftp, URL: ftp://my.file.com/file
Source URL : https://pmkft-assets.s3.us-west-1.amazonaws.com/appctl/linux/appctl Destination : /tmp/2022-03-29-16-21-06/pmkft-assets.s3.us-west-1.amazonaws.com/appctl/linux/appctl Status : Success
Download operation complete, all URLs processed.
% tree /tmp/2022-03-29-16-21-06/
/tmp/2022-03-29-16-21-06/
-- pmkft-assets.s3.us-west-1.amazonaws.com -- appctl
|-- linux
| -- appctl |-- macos | -- appctl
-- windows -- appctl
5 directories, 3 files