I'm a complete novice at this and don't know what to delete or how.
var spreadsheet = SpreadsheetApp.getActive();
var spreadsheet = SpreadsheetApp.getActive().getSheetByName('НазваниеЛиста');
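from bs4 import BeautifulSoup

# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup(html, 'html.parser')
# Each field is marked by an empty <i> icon tag
author_text = soup.find('i', {'class': 'icon icon-user'})
email_text = soup.find('i', {'class': 'icon icon-support'})
phone_text = soup.find('i', {'class': 'icon icon-phone'})
# The wanted text is the node that immediately follows each icon
print(author_text.next_element)
print(email_text.next_element)
print(phone_text.next_element)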
Пыльнев Анатолий
tollik36@mail.ru
89055663563
from bs4 import BeautifulSoup
html="""
<tr>
<td>
<br/><br/>
<i class="icon icon-user" data-selector=".icon" title="Автор"></i> Барышева Олеся<br/>
<i class="icon icon-support" data-selector=".icon" title="E-mail"></i> olesya052019@bk.ru<br/>
<i class="icon icon-phone" data-selector=".icon" title="Телефон"></i> 89188565504<br/>
</td>
</tr>
"""
soup = BeautifulSoup(html, 'html.parser')
my_text = soup.find('td')
print(my_text.get_text().split())
['Барышева', 'Олеся', 'olesya052019@bk.ru', '89188565504']
Барышева Олеся
olesya052019@bk.ru
89188565504
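The split() result above breaks the author's name into two separate tokens. A minimal sketch, assuming one string per field is wanted (as in the three-line output): BeautifulSoup's stripped_strings generator yields each text node with the surrounding whitespace removed. The 'html.parser' argument and the name `fields` are choices made for this illustration only.

```python
from bs4 import BeautifulSoup

html = """
<tr>
<td>
<br/><br/>
<i class="icon icon-user" data-selector=".icon" title="Автор"></i> Барышева Олеся<br/>
<i class="icon icon-support" data-selector=".icon" title="E-mail"></i> olesya052019@bk.ru<br/>
<i class="icon icon-phone" data-selector=".icon" title="Телефон"></i> 89188565504<br/>
</td>
</tr>
"""

soup = BeautifulSoup(html, 'html.parser')
# stripped_strings skips whitespace-only nodes and strips the rest,
# so the two-word name stays together as one item
fields = list(soup.find('td').stripped_strings)
for field in fields:
    print(field)
```

If the cell always holds exactly three fields, they can be unpacked directly: author, email, phone = fields.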
from bs4 import BeautifulSoup
html="""
<tr>
<td>
<br/><br/>
<i class="icon icon-user" data-selector=".icon" title="Автор"></i> Барышева Олеся<br/>
<i class="icon icon-support" data-selector=".icon" title="E-mail"></i> olesya052019@bk.ru<br/>
<i class="icon icon-phone" data-selector=".icon" title="Телефон"></i> 89188565504<br/>
</td>
</tr>
<tr>
<td>
<br/><br/>
<i class="icon icon-user" data-selector=".icon" title="Автор"></i> Иван Иванович<br/>
<i class="icon icon-support" data-selector=".icon" title="E-mail"></i> obi_van_ia9@bk.ru<br/>
<i class="icon icon-phone" data-selector=".icon" title="Телефон"></i> 232321113312<br/>
</td>
</tr>
<tr>
<td>
<br/><br/>
<i class="icon icon-user" data-selector=".icon" title="Автор"></i> Темный лорд<br/>
<i class="icon icon-support" data-selector=".icon" title="E-mail"></i> pirojok51@mail.ru<br/>
<i class="icon icon-phone" data-selector=".icon" title="Телефон"></i> 80002111122<br/>
</td>
</tr>
"""
soup = BeautifulSoup(html, 'html.parser')
my_text = soup.findAll('td')
for text in my_text:
    print(text.get_text().split())
['Барышева', 'Олеся', 'olesya052019@bk.ru', '89188565504']
['Иван', 'Иванович', 'obi_van_ia9@bk.ru', '232321113312']
['Темный', 'лорд', 'pirojok51@mail.ru', '80002111122']
I've gone through all the guides, including the Arch Wiki.
from bs4 import BeautifulSoup
import re
html = """
<p class="order-quantity j-orders-count-wrapper" data-link="class{merge: selectedNomenclature^ordersCount < 1 toggle='hide'}">Купили
<span data-link="{include tmpl='productCardOrderCount' ^~ordersCount=selectedNomenclature^ordersCount}">
<script type="jsv#29_"></script>
<script type="jsv#27^"></script>
<script type="jsv#30_"></script>
<script type="jsv#26^"></script>более 700 раз<script type="jsv/26^">
</script>
<script type="jsv/30_"></script>
<script type="jsv/27^"></script>
<script type="jsv/29_"></script>
</span>
</p>
"""
soup = BeautifulSoup(html, 'html.parser')
full_text = re.sub(' +', ' ',soup.find('p').get_text().strip().replace(u'\n', u' '))
print(full_text)
number = re.findall("[0-9]+",soup.find('p').get_text())
print(number)
C:\Users\david\Desktop>test.py www.google.com
#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/resources/registry/whois/tou/
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/resources/registry/whois/inaccuracy_reporting/
#
# Copyright 1997-2021, American Registry for Internet Numbers, Ltd.
#
NetRange: 216.58.192.0 - 216.58.223.255
CIDR: 216.58.192.0/19
NetName: GOOGLE
NetHandle: NET-216-58-192-0-1
Parent: NET216 (NET-216-0-0-0-0)
NetType: Direct Allocation
OriginAS: AS15169
Organization: Google LLC (GOGL)
RegDate: 2012-01-27
Updated: 2012-01-27
Ref: https://rdap.arin.net/registry/ip/216.58.192.0
OrgName: Google LLC
OrgId: GOGL
Address: 1600 Amphitheatre Parkway
City: Mountain View
StateProv: CA
PostalCode: 94043
Country: US
RegDate: 2000-03-30
Updated: 2019-10-31
Comment: Please note that the recommended way to file abuse complaints are located in the following links.
Comment:
Comment: To report abuse and illegal activity: https://www.google.com/contact/
Comment:
Comment: For legal requests: http://support.google.com/legal
Comment:
Comment: Regards,
Comment: The Google Team
Ref: https://rdap.arin.net/registry/entity/GOGL
OrgAbuseHandle: ABUSE5250-ARIN
OrgAbuseName: Abuse
OrgAbusePhone: +1-650-253-0000
OrgAbuseEmail: network-abuse@google.com
OrgAbuseRef: https://rdap.arin.net/registry/entity/ABUSE5250-ARIN
OrgTechHandle: ZG39-ARIN
OrgTechName: Google LLC
OrgTechPhone: +1-650-253-0000
OrgTechEmail: arin-contact@google.com
OrgTechRef: https://rdap.arin.net/registry/entity/ZG39-ARIN
#
# ARIN WHOIS data and services are subject to the Terms of Use
# available at: https://www.arin.net/resources/registry/whois/tou/
#
# If you see inaccuracies in the results, please report at
# https://www.arin.net/resources/registry/whois/inaccuracy_reporting/
#
# Copyright 1997-2021, American Registry for Internet Numbers, Ltd.
How can I make Python monitor a folder (for example, once every 10 seconds)?
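One possible approach is plain polling with the standard library: take a snapshot of the folder, sleep, take another one and compare. This is only a sketch under assumptions. The function names snapshot, diff_snapshots and watch, and the max_polls parameter, are made up for this illustration, and only the top level of the folder is checked (no subfolders).

```python
import os
import time


def snapshot(folder):
    """Return {entry name: last-modified time} for the top level of folder."""
    return {entry.name: entry.stat().st_mtime for entry in os.scandir(folder)}


def diff_snapshots(old, new):
    """Compare two snapshots and return (added, removed, modified) name lists."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(
        name for name in new.keys() & old.keys() if new[name] != old[name]
    )
    return added, removed, modified


def watch(folder, interval=10, max_polls=None):
    """Poll folder every `interval` seconds and print whatever changed.

    max_polls=None means run until interrupted (Ctrl+C).
    """
    previous = snapshot(folder)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(folder)
        added, removed, modified = diff_snapshots(previous, current)
        if added or removed or modified:
            print("added:", added, "removed:", removed, "modified:", modified)
        previous = current
        polls += 1


# Example call (runs until interrupted):
# watch("path/to/folder", interval=10)
```

Polling is simple and has no dependencies, but it only notices a change at the next check. If changes must be picked up immediately, an event-based file watcher may be a better fit.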
from datetime import date
today = date.today()
d1 = today.strftime("%d.%m.%Y")
print("d1 =", d1)