The accepted answer leverages the `launch.json` provided by Scrapy's documentation. However, depending on where and how you run your scrapers, the `scrapy` module may not be available to the debugger. Here is an alternative that listens on localhost and a port of your choosing, and allows debugging with Python's `debugpy`. The example below is specifically for debugging spiders, but it may work for other Scrapy objects as well.
Python version: 3.11.0
```json
// launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Attach",
            "type": "debugpy",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 5678
            },
            "justMyCode": false
        }
    ]
}
```
At the bottom of whatever Spider you are running, include the following:
```python
# foo_bar_spider.py
import scrapy


class FooBarSpider(scrapy.Spider):
    name = "foo_bar"
    ...


if __name__ == '__main__':
    import debugpy

    print('waiting for client...')
    debugpy.listen(('localhost', 5678))  # should match the port in launch.json
    debugpy.wait_for_client()

    from scrapy.crawler import CrawlerProcess
    process = CrawlerProcess()
    process.crawl(FooBarSpider)
    process.start()
    process.join()
```
Lastly, run the file with Python (e.g. `python foo_bar_spider.py`) and you should see "waiting for client..." printed in your terminal. Launch the VSCode debugger with the "Python: Attach" configuration, and you should be good to go.
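As a side note, `debugpy.wait_for_client()` blocks until a debugger attaches, so a plain `python foo_bar_spider.py` will hang if you forget to attach. One option (not from the original answer; the helper name and the `SCRAPY_DEBUG` variable are my own invention) is to gate the debug hook behind an environment variable so normal runs proceed immediately:

```python
import os


def maybe_wait_for_debugger(host='localhost', port=5678):
    """Block for a debugpy client only if SCRAPY_DEBUG is set in the environment."""
    if not os.environ.get('SCRAPY_DEBUG'):
        return False  # normal run: skip the debug hook entirely
    import debugpy  # imported lazily so non-debug runs don't need debugpy installed
    print('waiting for client...')
    debugpy.listen((host, port))
    debugpy.wait_for_client()
    return True
```

Call `maybe_wait_for_debugger()` at the top of the `__main__` block, then start a debug session with `SCRAPY_DEBUG=1 python foo_bar_spider.py` and attach as before.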