Running Scrapy from a Flask application

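I have a crawler that I want to run every time someone visits a link. Since all my other modules are in Flask, I was told to build this one in Flask as well. I have installed scrapy and selenium both in the virtual environment and globally on the machine as root.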

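When I run the crawler from the terminal, everything works. When I start the Flask application myself and visit xx.xx.xx.xx:8080/whats in the browser, that also works: it runs my crawler and gets me the file. But as soon as I go live, any time a person visits the link, the browser shows an internal server error.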

To run the crawler, we have to type "scrapy crawl whateverthespidernameis" in the terminal. I did this using Python's os module.

Here is my Flask code:

    import sys
    from flask import request, jsonify, render_template, url_for, redirect, session, abort, render_template_string, send_file, send_from_directory
    from flask import *
    #from application1 import *
    from main import *
    from test123 import *
    import os

    app = Flask(__name__)
    filename = ''

    @app.route('/whats')
    def whats():
        os.getcwd()
        # Switch to the Scrapy project directory before invoking the CLI
        os.chdir("/var/www/myapp/whats")
        #cmd = "scrapy crawl whats"
        cmd = "sudo scrapy crawl whats"
        os.system(cmd)
        # Return the CSV written by the spider
        return send_file("/var/www/myapp/staticcsv/whats.csv", as_attachment=True)

    if __name__ == "__main__":
        app.run(host='0.0.0.0', port=8080, debug=True)

This is the error recorded in the log file when I hit the live link:

    sh: 1: scrapy: not found

This is the error recorded in the log file when I use sudo in the command (the variable cmd):

    sudo: no tty present and no askpass program specified

I am using uwsgi and nginx.

How can I run this crawler so that when anyone visits "xx.xx.xx.xx/whats" the crawler runs and the csv file is returned?

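When you use sudo, it asks for a password on the tty; it does not read standard input for this. Since Flask and other web applications typically run detached from a terminal, sudo has no way to ask for a password, so it looks for a program that can supply one (an askpass program). You can find more information on this topic in this answer.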
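
One way to confirm this from inside the running application is to check whether the process has a controlling terminal at all. This is a minimal sketch, assuming a POSIX system where /dev/tty refers to the controlling terminal, which is where sudo normally prompts; the helper name has_controlling_tty is made up for illustration.

```python
import os


def has_controlling_tty():
    """Return True if this process can open its controlling terminal."""
    try:
        fd = os.open("/dev/tty", os.O_RDWR)
    except OSError:
        # No controlling terminal, which is typical for a daemonized web worker.
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    print("controlling tty available:", has_controlling_tty())
```

If this reports False when logged from the uwsgi worker, sudo has nowhere to prompt, which matches the "no tty present" message.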

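The reason scrapy is not found is most likely a difference in $PATH between the interactive shell you used for testing and the process that is running Flask. The easiest way around this is to give the full path to the scrapy program in your command.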
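
A rough sketch of the full-path approach follows, using subprocess instead of os.system so that no shell (and no $PATH lookup) is involved. The path in SCRAPY_BIN is an assumption, not something stated in the question; running `which scrapy` in the activated virtual environment should show the real one. The helper names build_crawl_command and run_spider are hypothetical.

```python
import os
import subprocess

# Assumed location of the scrapy executable inside the virtualenv; adjust it.
SCRAPY_BIN = "/var/www/myapp/venv/bin/scrapy"
# Scrapy project directory, taken from the question.
PROJECT_DIR = "/var/www/myapp/whats"


def build_crawl_command(spider, scrapy_bin=SCRAPY_BIN):
    """Return the argv list that runs a spider via an absolute scrapy path."""
    if not os.path.isabs(scrapy_bin):
        raise ValueError("scrapy_bin must be an absolute path")
    return [scrapy_bin, "crawl", spider]


def run_spider(spider, project_dir=PROJECT_DIR, scrapy_bin=SCRAPY_BIN):
    """Run the spider in project_dir; raises CalledProcessError on failure."""
    # cwd= replaces os.chdir(), which would change the working directory
    # for the whole worker process rather than just the child.
    return subprocess.run(
        build_crawl_command(spider, scrapy_bin),
        cwd=project_dir,
        check=True,
    )
```

In the view, `run_spider("whats")` would take the place of the os.chdir and os.system calls, followed by the existing send_file line. No sudo is needed as long as the user that uwsgi runs as can read the project and write the csv output directory.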