Paperdl documentation

Statements


This repository was created for learning Python.

If I find that anyone is using this project illegally, I will delete it immediately.

Install Paperdl

Environment

  • OS: Linux or macOS or Windows

  • Python version: Python 3.6+
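
As a quick sanity check before installing, the snippet below tests whether an interpreter meets the Python 3.6+ requirement listed above. It is only a sketch: the function name `python_is_supported` is hypothetical and is not part of paperdl.

```python
import sys

MIN_PYTHON = (3, 6)  # minimum version stated in the Environment section


def python_is_supported(version_info=None):
    """Return True if the given (or the running) interpreter is Python 3.6+.

    Hypothetical helper for illustration; not part of paperdl.
    """
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) against the documented minimum.
    return tuple(version_info[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    print("Python", sys.version.split()[0], "supported:", python_is_supported())
```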

Pip install

Run the following command in the terminal to install paperdl (make sure that python is on your PATH):

pip install paperdl --upgrade

Source code install

1. Online

Run the following command in the terminal to install paperdl (make sure that python and git are on your PATH):

pip install git+https://github.com/CharlesPikachu/paperdl.git@master

2. Offline

First, run the following command in the terminal to download the source code to your computer:

git clone https://github.com/CharlesPikachu/paperdl.git

Then, enter the corresponding directory:

cd paperdl

Finally, run the following command in the terminal to install paperdl:

python setup.py install

Quick Start

Calling API

If you want to search for and download papers from arXiv and Google Scholar, you can write code as follows:

from paperdl import paperdl

config = {'logfilepath': 'paperdl.log', 'savedir': 'papers', 'search_size_per_source': 5, 'proxies': {}}
target_srcs = ['arxiv', 'googlescholar']
client = paperdl.Paperdl(config=config)
client.run(target_srcs)

In addition, if you cannot access Google, you can set config as follows:

config = {'logfilepath': 'paperdl.log', 'savedir': 'papers', 'search_size_per_source': 5, 'proxies': {}, 'area': 'CN'}
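
Since the two config dictionaries above differ only in the optional 'area' key, a small builder can keep them consistent. The sketch below uses only the keys that appear in the examples above. The helper name `make_config` is hypothetical, and it is an assumption that paperdl treats a missing 'area' key the same way as in the first example.

```python
def make_config(savedir="papers", logfilepath="paperdl.log",
                search_size_per_source=5, proxies=None, area=None):
    """Build a config dict using only the keys shown in the examples above.

    Hypothetical helper for illustration; not part of paperdl.
    """
    config = {
        "logfilepath": logfilepath,
        "savedir": savedir,
        "search_size_per_source": search_size_per_source,
        # Copy the mapping so the caller's dict is never modified.
        "proxies": dict(proxies or {}),
    }
    # 'area' is only added when requested, e.g. area="CN".
    if area is not None:
        config["area"] = area
    return config
```

The result can be passed in the same way as the hand-written dictionaries, for example paperdl.Paperdl(config=make_config(area='CN')).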

You can also download papers using only Sci-Hub, as follows:

from paperdl import paperdl

config = {'logfilepath': 'paperdl.log', 'savedir': 'papers', 'search_size_per_source': 5, 'proxies': {}}
client = paperdl.SciHub(config=config, logger_handle=paperdl.Logger('paper.log'))
paperinfo = {
    'savename': '9193963',
    'ext': 'pdf',
    'savedir': 'outputs',
    'input': 'https://ieeexplore.ieee.org/document/9193963/',
    'source': 'scihub',
}
client.download([paperinfo])
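
When downloading several papers, the paperinfo dictionaries can be generated instead of written by hand. The sketch below reproduces the shape of the example above. The helper name `make_paperinfo` is hypothetical, and taking the last path segment of the URL as 'savename' is an assumption based on that single example (9193963), not a rule stated by paperdl.

```python
from urllib.parse import urlparse


def make_paperinfo(url, savedir="outputs", source="scihub", ext="pdf"):
    """Build a paperinfo dict shaped like the example above.

    Hypothetical helper for illustration; not part of paperdl.
    Assumption: the last non-empty path segment of the URL is the save name.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError("cannot derive a save name from URL: %r" % url)
    return {
        "savename": segments[-1],
        "ext": ext,
        "savedir": savedir,
        "input": url,
        "source": source,
    }
```

A list of such dictionaries can then be passed to client.download(...) as in the example above.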

Calling EXE

You can also use paperdl directly in the terminal. The usage is as follows:

Usage: paperdl [OPTIONS]

Options:
  --version               Show the version and exit.
  -m, --mode TEXT         the used mode, support "search" and "download"
  -i, --inp TEXT          the paper to download, the supported format is the
                          same as sci-hub
  -s, --source TEXT       the used source, support "arxiv", "scihub" and
                          "googlescholar", you can use "," to split multi
                          sources
  -d, --savedir TEXT      the directory for saving papers
  -l, --logfilepath TEXT  the logging filepath
  -z, --size INTEGER      search size per source
  -p, --proxies TEXT      the proxies to be adopted
  -a, --area TEXT         your area, support "CN" and "EN"
  -c, --cookie TEXT       the cookie copied from the target website, only used
                          in "baiduwenku"
  --help                  Show this message and exit.

Here is an example:

paperdl -i https://ieeexplore.ieee.org/document/7485869/ -m download
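
To drive the command line from a script, the argument list can be assembled from the options listed in the help text above. This is a sketch only: the helper names `build_paperdl_command` and `format_command` are hypothetical, only the flags -i, -m, -s, -d and -z from the help text are used, and how paperdl combines these options in each mode is not specified here.

```python
import shlex


def build_paperdl_command(inp, mode="download", sources=None,
                          savedir=None, size=None):
    """Assemble a paperdl argument list from the documented options.

    Hypothetical helper for illustration; not part of paperdl.
    """
    args = ["paperdl", "-i", inp, "-m", mode]
    if sources:
        # The help text says multiple sources are separated with ",".
        args += ["-s", ",".join(sources)]
    if savedir:
        args += ["-d", savedir]
    if size is not None:
        args += ["-z", str(size)]
    return args


def format_command(args):
    """Render the argument list as a shell-safe string for display."""
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    cmd = build_paperdl_command("https://ieeexplore.ieee.org/document/7485869/")
    print(format_command(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) once paperdl is installed and on your PATH.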

Changelog

2022-02-23

  • Version: v0.1.3

  • Update: support arxiv, googlescholar and scihub.

2022-02-25

  • Version: v0.1.4

  • Update: some improvements, such as automatic exe building and a progress bar.

2022-05-08

  • Version: v0.1.5

  • Update: support baiduwenku.

2022-05-13

  • Version: v0.1.6

  • Update: fix bugs in the downloader.

About Me

I’m a student whose research interests include computer vision and information security.

WeChat public account: Charles_pikachu

Github: https://github.com/CharlesPikachu

Zhihu: https://www.zhihu.com/people/charles_pikachu

Bilibili: https://space.bilibili.com/406756145

Email: charlesblwx@gmail.com