@Mikkkch

Selenium doesn't work on Heroku?

Hello, let me get straight to my setup.
In the code:
import os

from selenium import webdriver

op = webdriver.ChromeOptions()
op.binary_location = os.environ.get('GOOGLE_CHROME_BIN')
op.add_argument('--headless')
op.add_argument('--no-sandbox')
op.add_argument('--disable-dev-shm-usage')

driver = webdriver.Chrome(executable_path=os.environ.get('CHROMEDRIVER_PATH'), options=op)


Heroku buildpacks:
  1. heroku/python
  2. https://github.com/heroku/heroku-buildpack-google-...
  3. https://github.com/heroku/heroku-buildpack-chromedriver


Config vars:
  • CHROMEDRIVER_PATH = /app/.chromedriver/bin/chromedriver
  • GOOGLE_CHROME_BIN = /app/.apt/usr/bin/google-chrome
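
Before looking at the network, it may be worth confirming that both config vars point at real, executable files on the dyno. The sketch below is only an illustration: the variable names come from the list above, while the `check_binaries` helper is hypothetical and not part of Selenium or of the buildpacks.

```python
import os
from typing import Dict, Mapping

# Variable names are taken from the post; the helper itself is hypothetical.
REQUIRED_VARS = ("CHROMEDRIVER_PATH", "GOOGLE_CHROME_BIN")


def check_binaries(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return a dict of problems keyed by variable name (empty if all is well)."""
    problems = {}
    for name in REQUIRED_VARS:
        path = env.get(name)
        if not path:
            problems[name] = "variable is not set"
        elif not os.path.isfile(path):
            problems[name] = f"no file at {path}"
        elif not os.access(path, os.X_OK):
            problems[name] = f"{path} is not executable"
    return problems
```

Calling `check_binaries()` at worker startup and logging a non-empty result would rule out a wrong path early. In this post the session does start (the traceback reports headless chrome=87.0.4280.66), so the paths are probably fine here.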


The error raised when calling driver.get():
selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT

Full traceback:

2020-11-20T16:37:52.441037+00:00 app[worker.1]: Task exception was never retrieved
2020-11-20T16:37:52.441073+00:00 app[worker.1]: future: exception=WebDriverException('unknown error: net::ERR_CONNECTION_TIMED_OUT\n (Session info: headless chrome=87.0.4280.66)', None, None)>
2020-11-20T16:37:52.441075+00:00 app[worker.1]: Traceback (most recent call last):
2020-11-20T16:37:52.441076+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 388, in _process_polling_updates
2020-11-20T16:37:52.441077+00:00 app[worker.1]: for responses in itertools.chain.from_iterable(await self.process_updates(updates, fast)):
2020-11-20T16:37:52.441078+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 225, in process_updates
2020-11-20T16:37:52.441079+00:00 app[worker.1]: return await asyncio.gather(*tasks)
2020-11-20T16:37:52.441079+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/handler.py", line 117, in notify
2020-11-20T16:37:52.441080+00:00 app[worker.1]: response = await handler_obj.handler(*args, **partial_data)
2020-11-20T16:37:52.441081+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/dispatcher.py", line 246, in process_update
2020-11-20T16:37:52.441081+00:00 app[worker.1]: return await self.message_handlers.notify(update.message)
2020-11-20T16:37:52.441082+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/aiogram/dispatcher/handler.py", line 117, in notify
2020-11-20T16:37:52.441082+00:00 app[worker.1]: response = await handler_obj.handler(*args, **partial_data)
2020-11-20T16:37:52.441082+00:00 app[worker.1]: File "/app/main.py", line 72, in listening_to_links
2020-11-20T16:37:52.441083+00:00 app[worker.1]: await tasks_distribution(message.text, 60)
2020-11-20T16:37:52.441083+00:00 app[worker.1]: File "/app/main.py", line 49, in tasks_distribution
2020-11-20T16:37:52.441084+00:00 app[worker.1]: data = await crawl_data(link)
2020-11-20T16:37:52.441084+00:00 app[worker.1]: File "/app/main.py", line 27, in crawl_data
2020-11-20T16:37:52.441085+00:00 app[worker.1]: driver.get(link)
2020-11-20T16:37:52.441085+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
2020-11-20T16:37:52.441085+00:00 app[worker.1]: self.execute(Command.GET, {'url': url})
2020-11-20T16:37:52.441086+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
2020-11-20T16:37:52.441086+00:00 app[worker.1]: self.error_handler.check_response(response)
2020-11-20T16:37:52.441087+00:00 app[worker.1]: File "/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
2020-11-20T16:37:52.441087+00:00 app[worker.1]: raise exception_class(message, screen, stacktrace)
2020-11-20T16:37:52.441087+00:00 app[worker.1]: selenium.common.exceptions.WebDriverException: Message: unknown error: net::ERR_CONNECTION_TIMED_OUT
2020-11-20T16:37:52.441088+00:00 app[worker.1]: (Session info: headless chrome=)
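
net::ERR_CONNECTION_TIMED_OUT is a network error reported by Chrome itself, and the session info shows that headless Chrome did start. That suggests the browser launched but could not open a connection to the target host. One way to separate a network problem from a Selenium problem is to try a plain TCP connection from the same dyno. This is a minimal sketch using only the standard library; `can_connect` is a hypothetical helper name.

```python
import socket


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False
```

Running it in a one-off dyno (for example via `heroku run python`) against the hostname from the failing link would show whether the site is reachable from Heroku at all. If it returns False there but True locally, the target may be blocking or dropping traffic from the dyno's address range, which no Selenium option would change.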


Just in case: at startup (I start the app with heroku scale worker=1) I get the following:
/app/.heroku/python/lib/python3.9/site-packages/selenium/webdriver/firefox/firefox_profile.py:208: SyntaxWarning: "is" with a literal. Did you mean "=="?
if setting is None or setting is '':

This might be useful.
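
For context, this warning originates in Selenium's own firefox_profile.py, not in the application code. Python 3.8 and later emit it at compile time whenever `is` is used to compare against a literal. It is a warning rather than an error, so it is most likely unrelated to the timeout. The sketch below reproduces it by compiling the quoted expression; `syntax_warnings` is a hypothetical helper.

```python
import warnings
from typing import List


def syntax_warnings(source: str) -> List[str]:
    """Compile `source` and return the text of any SyntaxWarning it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<sketch>", "exec")
    return [str(w.message) for w in caught if issubclass(w.category, SyntaxWarning)]


# The comparison quoted in the log versus the equality-based spelling.
flagged = syntax_warnings("setting = ''\nresult = setting is None or setting is ''\n")
clean = syntax_warnings("setting = ''\nresult = setting is None or setting == ''\n")
```

Here `flagged` holds the warning text while `clean` stays empty, because `==` compares values and `is` compares object identity.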

And finally, the code that hangs and then raises the error:

async def crawl_data(link: str) -> Union[dict, None, str]:
    driver.get(link)
    # execution never gets this far anyway
    soup = BeautifulSoup(driver.page_source, 'html.parser')
    # ....


async def tasks_distribution(link: str, wait_duration: int) -> None:
    while True:

        data = await crawl_data(link)

        if data == settings.PINNACLE_LATE:
            return None

        elif data:

            message_text = data['commands'] + '\n\n' + data['link']

            await parser.send_message(chat_id=settings.OWNER_ID, text=message_text)

            return None

        await asyncio.sleep(wait_duration)

@dp.message_handler()
async def listening_to_links(message: Message):
    await tasks_distribution(message.text, 60)
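
A separate point about the code above: driver.get() is a synchronous call, so invoking it directly inside an async function blocks the event loop for the whole page load, and the bot cannot process other updates while it waits. This would not fix the timeout itself, but it may explain the "hang". A minimal sketch of moving the blocking call to a worker thread follows; `blocking_fetch` is a hypothetical stand-in for driver.get(link) plus driver.page_source, and `heartbeat` and `demo` exist only to show that the loop stays responsive.

```python
import asyncio
import time
from functools import partial


def blocking_fetch(link: str, delay: float = 0.2) -> str:
    """Hypothetical stand-in for driver.get(link) followed by driver.page_source."""
    time.sleep(delay)  # simulates a slow, blocking page load
    return f"<html>{link}</html>"


async def crawl_data(link: str) -> str:
    loop = asyncio.get_running_loop()
    # Run the blocking call in the default thread pool so the event loop stays free.
    return await loop.run_in_executor(None, partial(blocking_fetch, link))


async def heartbeat(ticks: list) -> None:
    """Record ticks while the crawl runs, to show the loop is not blocked."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.05)


async def demo() -> tuple:
    ticks: list = []
    beat = asyncio.create_task(heartbeat(ticks))
    page = await crawl_data("https://example.com")
    beat.cancel()
    return page, len(ticks)
```

With the real driver, calling driver.set_page_load_timeout(seconds) beforehand bounds how long driver.get() may wait. A single shared driver should probably not be used from several threads at once, so concurrent crawls would need a lock or one driver per task.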


Please help!