Hello, I'll get straight to my setup.
In the code:
import os

from selenium import webdriver

op = webdriver.ChromeOptions()
# Paths to the Chrome binary and chromedriver come from Heroku config vars
op.binary_location = os.environ.get('GOOGLE_CHROME_BIN')
op.add_argument('--headless')
op.add_argument('--no-sandbox')
op.add_argument('--disable-dev-shm-usage')
driver = webdriver.Chrome(executable_path=os.environ.get('CHROMEDRIVER_PATH'), options=op)
Heroku buildpacks:
- heroku/python
- https://github.com/heroku/heroku-buildpack-google-...
- https://github.com/heroku/heroku-buildpack-chromedriver
Config vars:
- CHROMEDRIVER_PATH = /app/.chromedriver/bin/chromedriver
- GOOGLE_CHROME_BIN = /app/.apt/usr/bin/google-chrome
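Since both paths are taken from config vars, it may be worth confirming that they actually resolve to executable files on the dyno before the driver is created. Below is a minimal sketch using only the standard library; the helper name check_binaries and the status strings are hypothetical, not part of Selenium or Heroku.

```python
import os
from typing import Dict, Mapping

# Config var names taken from the setup above.
REQUIRED_VARS = ("GOOGLE_CHROME_BIN", "CHROMEDRIVER_PATH")


def check_binaries(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return 'unset', 'missing' or 'ok' for each required config var.

    'missing' means the variable is set, but the path is not an
    executable regular file.
    """
    report = {}
    for name in REQUIRED_VARS:
        path = env.get(name)
        if not path:
            report[name] = "unset"
        elif not (os.path.isfile(path) and os.access(path, os.X_OK)):
            report[name] = "missing"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for var, status in check_binaries().items():
        print(f"{var}: {status}")
```

If both entries report ok, the paths are probably not the cause, which would be consistent with the error appearing only at driver.get() rather than when the driver is created.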
The error raised when calling driver.get():
selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT
Full traceback:
2020-11-20T16:37:52.441037+00:00 app[worker.1]: Task exception was never retrieved
2020-11-20T16:37:52.441073+00:00 app[worker.1]: future: exception=WebDriverException('unknown error: net::ERR_CONNECTION_TIMED_OUT\n (Session info: headless chrome=87.0.4280.66)', None, None)>
2020-11-20T16:37:52.441075+00:00 app[worker.1]: Traceback (most recent call last):
2020-11-20T16:37:52.441076+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 388, in _process_polling_updates
2020-11-20T16:37:52.441077+00:00 app[worker.1]: for responses in itertools.chain.from_iterable(await self.process_updates(updates, fast)):
2020-11-20T16:37:52.441078+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 225, in process_updates
2020-11-20T16:37:52.441079+00:00 app[worker.1]: return await asyncio.gather(*tasks)
2020-11-20T16:37:52.441079+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/handler.py", line 117, in notify
2020-11-20T16:37:52.441080+00:00 app[worker.1]: response = await handler_obj.handler(*args, **partial_data)
2020-11-20T16:37:52.441081+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 246, in process_update
2020-11-20T16:37:52.441081+00:00 app[worker.1]: return await self.message_handlers.notify(update.message)
2020-11-20T16:37:52.441082+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/handler.py", line 117, in notify
2020-11-20T16:37:52.441082+00:00 app[worker.1]: response = await handler_obj.handler(*args, **partial_data)
2020-11-20T16:37:52.441082+00:00 app[worker.1]: File "/app/main.py", line 72, in listening_to_links
2020-11-20T16:37:52.441083+00:00 app[worker.1]: await tasks_distribution(message.text, 60)
2020-11-20T16:37:52.441083+00:00 app[worker.1]: File "/app/main.py", line 49, in tasks_distribution
2020-11-20T16:37:52.441084+00:00 app[worker.1]: data = await crawl_data(link)
2020-11-20T16:37:52.441084+00:00 app[worker.1]: File "/app/main.py", line 27, in crawl_data
2020-11-20T16:37:52.441085+00:00 app[worker.1]: driver.get(link)
2020-11-20T16:37:52.441085+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
2020-11-20T16:37:52.441085+00:00 app[worker.1]: self.execute(Command.GET, {'url': url})
2020-11-20T16:37:52.441086+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
2020-11-20T16:37:52.441086+00:00 app[worker.1]: self.error_handler.check_response(response)
2020-11-20T16:37:52.441087+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
2020-11-20T16:37:52.441087+00:00 app[worker.1]: raise exception_class(message, screen, stacktrace)
2020-11-20T16:37:52.441087+00:00 app[worker.1]: selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT
2020-11-20T16:37:52.441088+00:00 app[worker.1]: (Session info: headless chrome=)
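net::ERR_CONNECTION_TIMED_OUT is reported by Chrome when it cannot establish a connection to the target host in time, so one way to narrow the problem down is to try a plain TCP connection to the same host from the dyno, without Selenium involved. The sketch below uses only the standard library; the function name can_connect and the 5-second default timeout are assumptions made for illustration.

```python
import socket
from urllib.parse import urlsplit


def can_connect(url: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    # Fall back to the scheme's default port when the URL has none.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures.
        return False
```

If this returns False on Heroku for the link in question but True on a local machine, the timeout likely comes from the network path between the dyno and the site (for example, the site not responding to the dyno's IP range) rather than from the Chrome or chromedriver configuration.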
Just in case: when the app starts (I launch it with heroku scale worker=1), I get the following:
/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/firefox/firefox_profile.py:208: SyntaxWarning: "is" with a literal. Did you mean "=="?
if setting is None or setting is '':
Maybe this will be useful.
And finally, the code that hangs and eventually raises the error:
import asyncio
from typing import Union

from aiogram.types import Message
from bs4 import BeautifulSoup

# driver, settings, parser and dp are defined elsewhere in main.py


async def crawl_data(link: str) -> Union[dict, None, str]:
    driver.get(link)
    # execution never gets this far anyway
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # ....


async def tasks_distribution(link: str, wait_duration: int) -> None:
    while True:
        data = await crawl_data(link)
        if data == settings.PINNACLE_LATE:
            return None
        elif data:
            message_text = data['commands'] + '\n\n' + data['link']
            await parser.send_message(chat_id=settings.OWNER_ID, text=message_text)
            return None
        await asyncio.sleep(wait_duration)


@dp.message_handler()
async def listening_to_links(message: Message):
    await tasks_distribution(message.text, 60)
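A side note on the code above: driver.get() is a blocking call, so while it waits for the page the asyncio event loop (and with it the bot's polling) is stalled. This does not explain the connection timeout itself, but it may explain why the app appears to hang. A minimal sketch of one way to move a blocking call off the event loop, using only asyncio; the helper name run_blocking and the 30-second default are hypothetical.

```python
import asyncio
import functools
import time


async def run_blocking(func, *args, timeout: float = 30.0):
    """Run a blocking callable in the default thread pool and await its result.

    Raises asyncio.TimeoutError if the result is not ready within `timeout`
    seconds. Note that the worker thread itself is not interrupted; only
    the waiting coroutine is released.
    """
    loop = asyncio.get_running_loop()
    future = loop.run_in_executor(None, functools.partial(func, *args))
    return await asyncio.wait_for(future, timeout=timeout)


async def demo() -> str:
    # time.sleep stands in for a blocking call such as driver.get(link).
    await run_blocking(time.sleep, 0.1)
    return "done"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In crawl_data this would presumably look like await run_blocking(driver.get, link), keeping in mind that a single WebDriver instance should not be used from several threads at the same time.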
Please help!