
Where to configure shub

shub is configured via two YAML files:

  • ~/.scrapinghub.yml – this file contains global configuration like your API key. It is automatically created in your home directory when you run shub login. You can change its default location with an environment variable; see the section on environment variables below.

  • scrapinghub.yml – this file contains local configuration like the project ID or the location of your requirements file. It is automatically created in your project directory when you run shub deploy for the first time.

All configuration options listed below can be used in both of these configuration files. In case they overlap, the local configuration file will always take precedence over the global one.
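The precedence rule can be illustrated with a small Python sketch. This is not shub's actual implementation: merge_config is a hypothetical helper, and it assumes a plain top-level override, whereas shub may combine nested options more finely.

```python
def merge_config(global_cfg, local_cfg):
    """Return a new dict in which local values override global ones.

    Hypothetical helper: assumes a shallow, top-level merge.
    """
    merged = dict(global_cfg)   # start from ~/.scrapinghub.yml values
    merged.update(local_cfg)    # scrapinghub.yml wins on overlap
    return merged


# Values as they might look after loading the two YAML files.
global_cfg = {"apikey": "0bbf4f0f691e0d9378ae00ca7bcf7f0c", "stack": "scrapy:1.3-py3"}
local_cfg = {"project": 12345, "stack": "scrapy:1.1"}

merged = merge_config(global_cfg, local_cfg)
print(merged["stack"])  # scrapy:1.1
```

Options that appear only in the global file, such as the API key here, remain available in the merged result.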

Defining target projects

A very basic scrapinghub.yml, as generated when you first run shub deploy, could look like this:

project: 12345

This tells shub to deploy to the Scrapy Cloud project 12345 when you run shub deploy. Often, you will have multiple projects on Scrapy Cloud, e.g. one for development and one for production. For these cases, you can replace the project option with a projects dictionary:

projects:
  default: 12345
  prod: 33333

shub will now deploy to project 12345 when you run shub deploy, and deploy to project 33333 when you run shub deploy prod.
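The target lookup described above could be sketched in Python roughly as follows. resolve_project is a hypothetical name, and the sketch assumes the configuration has already been loaded into a dictionary; it is meant as an illustration, not as shub's own code.

```python
def resolve_project(config, target="default"):
    """Map a target name (as in `shub deploy prod`) to a project ID.

    Hypothetical helper; handles only plain integer entries.
    """
    projects = config.get("projects")
    if projects is None:
        # Single-project form: `project: 12345` acts as the default target.
        projects = {"default": config["project"]}
    if target not in projects:
        raise KeyError(f"unknown target: {target!r}")
    return projects[target]


config = {"projects": {"default": 12345, "prod": 33333}}
print(resolve_project(config))          # 12345
print(resolve_project(config, "prod"))  # 33333
```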

The configuration options

A deployed project contains more than your Scrapy code. Among other things, it has a version tag, and often has additional package requirements or is bound to a specific Scrapy version. All of these can be configured in scrapinghub.yml.

Sometimes the requirements differ between target projects, e.g. because you want to run your development project on Scrapy 1.3 but use Scrapy 1.0 for your production project. For these cases, some options can be configured either globally or per project.

A global configuration option serves as default for all projects. E.g., to set scrapy:1.3-py3 as default Scrapy Cloud stack, use:

projects:
  default: 12345
  prod: 33333

stack: scrapy:1.3-py3

If you wish to use the stack only for project 12345, expand its entry in projects as follows:

projects:
  default:
    id: 12345
    stack: scrapy:1.3-py3
  prod: 33333
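One way to picture how a project-specific option falls back to the global default is the following Python sketch. get_option is a hypothetical helper written for this illustration, and the lookup order is an assumption based on the description above, not a copy of shub's logic.

```python
def get_option(config, target, option):
    """Look up an option for a target, preferring the project-specific value.

    Hypothetical helper: checks the expanded project entry first,
    then the top-level (global default) option.
    """
    entry = config["projects"][target]
    if isinstance(entry, dict) and option in entry:
        return entry[option]
    return config.get(option)


config = {
    "projects": {
        "default": {"id": 12345, "stack": "scrapy:1.3-py3"},
        "prod": 33333,
    },
    "stack": "scrapy:1.0",
}

print(get_option(config, "default", "stack"))  # scrapy:1.3-py3
print(get_option(config, "prod", "stack"))     # scrapy:1.0
```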

The following is a list of all available configuration options:

  • requirements – Path to the project’s requirements file, and to any additional eggs that should be deployed to Scrapy Cloud. See Deploying dependencies. Scope: global default and project-specific.

  • stack – Scrapy Cloud stack to use (this is the environment that your project will run in, e.g. the Scrapy version that will be used). Scope: global default and project-specific.

  • image – Whether to use a custom Docker image on deploy. See Deploying custom Docker images. Scope: global default and project-specific.

  • version – Version tag to use when deploying. This can be an arbitrary string or one of the magic keywords AUTO (default), GIT, or HG. By default, shub will auto-detect your version control system and use its branch/commit ID as version. Scope: global only.

  • apikey – API key to use for deployments. You will typically not have to touch this setting as it will be configured inside ~/.scrapinghub.yml in your home directory, via shub login. Scope: global only.

Configuration via environment variables

Your Scrapinghub API key can also be set through the SHUB_APIKEY environment variable, which is useful for non-interactive deploys (e.g. in a CI workflow).

On Linux-based systems:

export SHUB_APIKEY=0bbf4f0f691e0d9378ae00ca7bcf7f0c

On Windows:

SET SHUB_APIKEY=0bbf4f0f691e0d9378ae00ca7bcf7f0c

You can also override the location of the global configuration file with the SHUB_GLOBAL_CONFIG environment variable (default: ~/.scrapinghub.yml).
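As an illustration, a script that needs to find the global configuration file might resolve its path along these lines. global_config_path is a hypothetical helper, and expanding the tilde with os.path.expanduser is an assumption of this sketch rather than documented shub behaviour.

```python
import os


def global_config_path(environ=None):
    """Return the path of the global config file.

    Hypothetical helper: uses SHUB_GLOBAL_CONFIG when set,
    otherwise falls back to ~/.scrapinghub.yml.
    """
    environ = os.environ if environ is None else environ
    path = environ.get("SHUB_GLOBAL_CONFIG", "~/.scrapinghub.yml")
    return os.path.expanduser(path)


# Passing a dict instead of os.environ keeps the example deterministic.
print(global_config_path({"SHUB_GLOBAL_CONFIG": "/tmp/ci-shub.yml"}))  # /tmp/ci-shub.yml
```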

When working with custom Docker images, please be aware that shub relies on a set of standard DOCKER_-prefixed environment variables:

  • DOCKER_HOST – The URL or Unix socket path used to connect to the Docker API.

  • DOCKER_API_VERSION – The version of the Docker API running on the host. Defaults to the latest version of the API supported by docker-py.

  • DOCKER_CERT_PATH – Specify a path to the directory containing the client certificate, client key and CA certificate.

  • DOCKER_TLS_VERIFY – Enables securing the connection to the API by using TLS and verifying the authenticity of the Docker Host.

Example configurations

Custom requirements file and fixed version information:

project: 12345
requirements:
  file: requirements_scrapinghub.txt
version: 0.9.9

Custom Scrapy Cloud stack, requirements file and additional private dependencies:

project: 12345
stack: scrapy:1.1
requirements:
  file: requirements.txt
  eggs:
    - privatelib.egg
    - path/to/otherlib.egg

Using the latest Scrapy 1.3 stack in staging and development, but pinning the production stack to a specific release:

projects:
  default: 12345
  staging: 33333
  prod:
    id: 44444
    stack: scrapy:1.3-py3-20170322

stack: scrapy:1.3-py3

Using a custom Docker image:

projects:
  default: 12345
  prod: 33333

image: true

Using a custom Docker image only for the development project:

projects:
  default:
    id: 12345
    image: true
  prod: 33333

Using a custom Docker image in staging and development, but a Scrapy Cloud stack in production:

projects:
  default: 12345
  staging: 33333
  prod:
    id: 44444
    image: false
    stack: scrapy:1.3-py3-20170322

image: true

Setting the API key used for deploying:

project: 12345
apikey: 0bbf4f0f691e0d9378ae00ca7bcf7f0c

Advanced use cases

It is possible to configure multiple API keys:

projects:
  default: 123
  otheruser: someoneelse/123

apikeys:
  default: 0bbf4f0f691e0d9378ae00ca7bcf7f0c
  someoneelse: a1aeecc4cd52744730b1ea6cd3e8412a

as well as different API endpoints:

projects:
  dev: vagrant/3

endpoints:
  vagrant: http://vagrant:3333/api/

apikeys:
  default: 0bbf4f0f691e0d9378ae00ca7bcf7f0c
  vagrant: a1aeecc4cd52744730b1ea6cd3e8412a
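Project entries such as someoneelse/123 and vagrant/3 pair a name with a project ID. A Python sketch of how such a value could be split is shown below. parse_target is a hypothetical helper, and treating a bare ID as belonging to the name "default" is an assumption drawn from the examples above.

```python
def parse_target(value):
    """Split a project entry into (name, project_id).

    Hypothetical helper: "someoneelse/123" -> ("someoneelse", 123),
    a bare ID such as 123 -> ("default", 123).
    """
    text = str(value)
    if "/" in text:
        name, project_id = text.split("/", 1)
        return name, int(project_id)
    return "default", int(text)


print(parse_target("someoneelse/123"))  # ('someoneelse', 123)
print(parse_target(123))                # ('default', 123)
```

The returned name can then be used to look up the matching API key or endpoint for that target.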

Global and project-specific requirements: requirements.txt is used for the projects prod and some, while dev uses requirements-dev.txt and additional eggs:

projects:
  prod: 12345
  dev:
      id: 345
      requirements:
          file: requirements-dev.txt
          eggs:
          - ./egg1.egg
          - ./egg2.egg
  some: 567
requirements:
  file: requirements.txt
stacks:
  default: "scrapy:2.8"