Scrapy Log Saving

Overview

By default, Scrapy writes log messages to the console (stderr) but does not persist them to disk: once the spider finishes, the output is gone. This document explains how to control log verbosity with LOG_LEVEL and how to persist logs to a file using Scrapy’s built-in settings.


Log Settings in settings.py

LOG_LEVEL

Controls the minimum severity of messages that are displayed. Any message below this level is silenced.

Value Description
DEBUG Most verbose: all internal Scrapy details (default)
INFO General progress information
WARNING Non-critical problems
ERROR Errors that need attention
CRITICAL Fatal errors only
# settings.py
LOG_LEVEL = 'DEBUG'    # default: show everything
LOG_LEVEL = 'INFO'     # general progress only
LOG_LEVEL = 'WARNING'  # production-friendly, less noise
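
Scrapy’s LOG_LEVEL maps directly onto Python’s standard logging levels, so the threshold behaves exactly like the stdlib’s. A stdlib-only sketch of what the filtering does (the logger name myspider and the messages are just illustrations):

```python
import logging

# Collect emitted records in memory so the filtering is easy to inspect.
records = []

class CaptureHandler(logging.Handler):
    def emit(self, record):
        records.append(record)

logger = logging.getLogger('myspider')
logger.addHandler(CaptureHandler())
logger.setLevel(logging.INFO)  # the equivalent of LOG_LEVEL = 'INFO'

logger.debug('parsed item details')  # below the threshold: silenced
logger.info('crawled 10 pages')      # at the threshold: emitted
logger.warning('retrying request')   # above the threshold: emitted

print([r.levelname for r in records])  # ['INFO', 'WARNING']
```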

LOG_FILE — Persist Logs to Disk

Set LOG_FILE to a file path to redirect all log output from the console to a file.

# settings.py
LOG_FILE = 'scrapy.log'

Once set, nothing is printed to the console; everything goes to the specified file.

Tip: Use an absolute path to ensure the file lands in a predictable location regardless of where you invoke scrapy crawl.

import os

BASE_DIR = os.path.dirname(os.path.abspath(__file__))
# Scrapy does not create the directory for you; a missing 'logs/' folder
# raises an error when the log file is opened.
os.makedirs(os.path.join(BASE_DIR, 'logs'), exist_ok=True)
LOG_FILE = os.path.join(BASE_DIR, 'logs', 'scrapy.log')

LOG_FILE_APPEND

Controls whether each run appends to the existing log file or overwrites it. Introduced in Scrapy 2.6; earlier versions always append.

LOG_FILE_APPEND = True   # append — keeps history across runs (default: True)
LOG_FILE_APPEND = False  # overwrite — fresh log every run
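
Under the hood this amounts to the file mode the logging file handler is opened with ('a' versus 'w'). A stdlib-only sketch of the difference, using a throwaway path:

```python
import logging
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'scrapy.log')

def run_once(mode, message):
    # Simulate one crawl run writing a single line to the log file.
    handler = logging.FileHandler(path, mode=mode, encoding='utf-8')
    logger = logging.getLogger('demo')
    logger.addHandler(handler)
    logger.warning(message)
    logger.removeHandler(handler)
    handler.close()

run_once('a', 'first run')   # like LOG_FILE_APPEND = True
run_once('a', 'second run')  # both lines survive
print(open(path).read())

run_once('w', 'third run')   # like LOG_FILE_APPEND = False: file truncated
print(open(path).read())     # only 'third run' remains
```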

LOG_ENCODING

Character encoding used when writing the log file. Defaults to utf-8.

LOG_ENCODING = 'utf-8'

LOG_FORMAT / LOG_DATEFORMAT

Customize the format of each log line. The values below are in fact Scrapy’s defaults, shown explicitly:

LOG_FORMAT     = '%(asctime)s [%(name)s] %(levelname)s: %(message)s'
LOG_DATEFORMAT = '%Y-%m-%d %H:%M:%S'
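
These are standard logging format strings, so a format can be previewed without running a spider. A stdlib-only sketch (the record contents are illustrative):

```python
import logging

formatter = logging.Formatter(
    fmt='%(asctime)s [%(name)s] %(levelname)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)
# Build a log record by hand instead of going through a logger.
record = logging.LogRecord(
    name='scrapy.core.engine', level=logging.INFO,
    pathname='', lineno=0, msg='Spider opened', args=(), exc_info=None,
)
line = formatter.format(record)
print(line)  # e.g. 2024-01-01 12:00:00 [scrapy.core.engine] INFO: Spider opened
```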

Minimal Production Configuration

# settings.py

LOG_LEVEL       = 'WARNING'          # suppress DEBUG / INFO noise
LOG_FILE        = 'logs/scrapy.log'  # save to file
LOG_FILE_APPEND = True               # keep log history
LOG_ENCODING    = 'utf-8'
LOG_FORMAT      = '%(asctime)s [%(name)s] %(levelname)s: %(message)s'
LOG_DATEFORMAT  = '%Y-%m-%d %H:%M:%S'

Override via Command Line

You can override any setting at runtime without editing settings.py:

# Save logs to a specific file for this run only
scrapy crawl myspider -s LOG_FILE=run_$(date +%F).log -s LOG_LEVEL=DEBUG
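
If you would rather bake a per-day log file into the project instead of passing it on every invocation, the same date-stamped path can be computed in settings.py itself (a sketch; as above, the logs/ directory must already exist):

```python
# settings.py
from datetime import date

# Evaluated once per process start, e.g. logs/scrapy_2024-01-01.log
LOG_FILE = f"logs/scrapy_{date.today():%Y-%m-%d}.log"
```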

Rotating Logs (Advanced)

Scrapy’s built-in LOG_FILE does not support rotation. For production use, integrate Python’s logging.handlers.RotatingFileHandler in a custom extension or use an external tool like logrotate (Linux).

Example: Custom Scrapy Extension for Rotating Logs

# myproject/extensions/rotating_log.py
import logging
import os
from logging.handlers import RotatingFileHandler

from scrapy import signals


class RotatingLogExtension:
    @classmethod
    def from_crawler(cls, crawler):
        ext = cls()
        crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_opened(self, spider):
        # RotatingFileHandler fails if the target directory is missing.
        os.makedirs('logs', exist_ok=True)
        self.handler = RotatingFileHandler(
            filename='logs/scrapy.log',
            maxBytes=10 * 1024 * 1024,  # rotate after 10 MB
            backupCount=5,              # keep 5 rotated files
            encoding='utf-8',
        )
        self.handler.setFormatter(logging.Formatter(
            '%(asctime)s [%(name)s] %(levelname)s: %(message)s'
        ))
        logging.getLogger().addHandler(self.handler)

    def spider_closed(self, spider):
        # Detach the handler so repeated runs in one process don't stack handlers.
        logging.getLogger().removeHandler(self.handler)
        self.handler.close()

Register it in settings.py:

EXTENSIONS = {
    'myproject.extensions.rotating_log.RotatingLogExtension': 100,
}
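
If you prefer to keep rotation out of the crawler entirely, a logrotate rule over the same log file does the job. A sketch, assuming the path and retention shown (adjust both to your deployment); copytruncate matters because Scrapy keeps the file handle open while running:

```
/var/log/scrapy/scrapy.log {
    daily
    rotate 7
    compress
    missingok
    notifempty
    # Truncate in place instead of moving the file, since the
    # crawler holds the file open for the whole run.
    copytruncate
}
```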

Summary

Setting Purpose Default
LOG_LEVEL Minimum log severity shown DEBUG
LOG_FILE Path to persist logs; None = console only None
LOG_FILE_APPEND Append vs overwrite on each run True
LOG_ENCODING File encoding utf-8
LOG_FORMAT Log line format string '%(asctime)s [%(name)s] %(levelname)s: %(message)s'
LOG_DATEFORMAT Date format in log lines %Y-%m-%d %H:%M:%S