vvkSeven
@vvkSeven
Junior Python Dev

Counting identical words in a file in Python?

Good day, everyone. I have a file containing many lines of English text. I need to count how often each word occurs in the file, using functions and map but no loops. I couldn't manage it myself because I ran out of time, and the exercise is due today. I would be very grateful for any help.

Text from the file:

The ancient Greeks first had the idea of getting men together every four years to hold and witness sporting events (in those days women did not participate, though they had their own, independent, events). The idea was to have the best athletes from all over Greece gather in one field and compete every four years. All wars and fighting had to stop while the athletes and their supporters came together in the town of Olympia for a few days to compete in a few events, mostly related to warfare (throwing the javelin, running, wrestling, boxing and chariot racing).
The first written reference to the Games is 776 BC. They lasted until 389 AD. The idea of having the modern Games was suggested in the mid 19th century but they weren't a world event until 1896. Besides being postponed because of wars, they have been held since then every four years in different cities around the world.
The Olympic Games have many important symbols that most people recognize. The five rings that appear on the Olympic flag (coloured yellow, green, blue, black and red) were introduced in 1914. They represent the five continents of Africa, the Americas, Australia, Asia and Europe. The flag is raised in the host city and then flown to the next one where it is kept until the next Games. The Olympic torch, a major part of the ancient Games, was brought back in 1928 and is carried with great fanfare and publicity to the host city where it lights the burning flame of the Games. It is kept burning until the close of the Games. The torch symbolizes purity, the drive for perfection and the struggle for victory.
The rousing Olympic anthem is the simply named "Olympic Music" by John Williams, who wrote it for the 1984 Olympics, held in Los Angeles. What you hear first are the forty or so notes played on horns which form the "Bugler's Dream" (also called "Olympic Fanfare") by Leo Arnaud, first played in the 1968 Games.
The torch, fanfare and flag are clearly evident in the Opening Ceremony, when everyone formally welcomes the participants and the Games can begin. Here we find the dramatic and colourful March of Nations, in which all the athletes from each country go into the venue to the sound of their country's anthem and march behind their flags, thus becoming representatives of their countries.
Accepted solutions: 1
fox_12
@fox_12 Python tag curator
Arranging bits, steering charged particles
> but using functions and map, and without loops
Well, if it strictly has to be functions and map with no loops, then here you go:
def count_word(word):
    if word in total_count:
        total_count[word] += 1
    else:
        total_count[word] = 1

total_count = dict()
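# list(...) forces the lazy map to run; the missing closing parenthesis after x.lower() is restored
list(map(lambda x: count_word(''.join(filter(str.isalpha, x.lower()))), str1.split()))  # str1 holds your text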
print(total_count)


{'the': 46, 'ancient': 2, 'greeks': 1, 'first': 4, 'had': 3, 'idea': 3, 'of': 11, 'getting': 1, 'men': 1, 'together': 2, 'every': 3, 'four': 3, 'years': 3, 'to': 9, 'hold': 1, 'and': 15, 'witness': 1, 'sporting': 1, 'events': 3, 'in': 13, 'those': 1, 'days': 2, 'women': 1, 'did': 1, 'not': 1, 'participate': 1, 'though': 1, 'they': 5, 'their': 5, 'own': 1, 'independent': 1, 'was': 3, 'have': 3, 'best': 1, 'athletes': 3, 'from': 2, 'all': 3, 'over': 1, 'greece': 1, 'gather': 1, 'one': 2, 'field': 1, 'compete': 2, 'wars': 2, 'fighting': 1, 'stop': 1, 'while': 1, 'supporters': 1, 'came': 1, 'town': 1, 'olympia': 1, 'for': 4, 'a': 4, 'few': 2, 'mostly': 1, 'related': 1, 'warfare': 1, 'throwing': 1, 'javelin': 1, 'running': 1, 'wrestling': 1, 'boxing': 1, 'chariot': 1, 'racing': 1, 'written': 1, 'reference': 1, 'games': 9, 'is': 6, '': 7, 'bc': 1, 'lasted': 1, 'until': 4, 'ad': 1, 'having': 1, 'modern': 1, 'suggested': 1, 'mid': 1, 'th': 1, 'century': 1, 'but': 1, 'werent': 1, 'world': 2, 'event': 1, 'besides': 1, 'being': 1, 'postponed': 1, 'because': 1, 'been': 1, 'held': 2, 'since': 1, 'then': 2, 'different': 1, 'cities': 1, 'around': 1, 'olympic': 6, 'many': 1, 'important': 1, 'symbols': 1, 'that': 2, 'most': 1, 'people': 1, 'recognize': 1, 'five': 2, 'rings': 1, 'appear': 1, 'on': 2, 'flag': 3, 'coloured': 1, 'yellow': 1, 'green': 1, 'blue': 1, 'black': 1, 'red': 1, 'were': 1, 'introduced': 1, 'represent': 1, 'continents': 1, 'africa': 1, 'americas': 1, 'australia': 1, 'asia': 1, 'europe': 1, 'raised': 1, 'host': 2, 'city': 2, 'flown': 1, 'next': 2, 'where': 2, 'it': 4, 'kept': 2, 'torch': 3, 'major': 1, 'part': 1, 'brought': 1, 'back': 1, 'carried': 1, 'with': 1, 'great': 1, 'fanfare': 3, 'publicity': 1, 'lights': 1, 'burning': 2, 'flame': 1, 'close': 1, 'symbolizes': 1, 'purity': 1, 'drive': 1, 'perfection': 1, 'struggle': 1, 'victory': 1, 'rousing': 1, 'anthem': 2, 'simply': 1, 'named': 1, 'music': 1, 'by': 2, 'john': 1, 'williams': 1, 'who': 1, 'wrote': 1, 
'olympics': 1, 'los': 1, 'angeles': 1, 'what': 1, 'you': 1, 'hear': 1, 'are': 2, 'forty': 1, 'or': 1, 'so': 1, 'notes': 1, 'played': 2, 'horns': 1, 'which': 2, 'form': 1, 'buglers': 1, 'dream': 1, 'also': 1, 'called': 1, 'leo': 1, 'arnaud': 1, 'clearly': 1, 'evident': 1, 'opening': 1, 'ceremony': 1, 'when': 1, 'everyone': 1, 'formally': 1, 'welcomes': 1, 'participants': 1, 'can': 1, 'begin': 1, 'here': 1, 'we': 1, 'find': 1, 'dramatic': 1, 'colourful': 1, 'march': 2, 'nations': 1, 'each': 1, 'country': 1, 'go': 1, 'into': 1, 'venue': 1, 'sound': 1, 'countrys': 1, 'behind': 1, 'flags': 1, 'thus': 1, 'becoming': 1, 'representatives': 1, 'countries': 1}
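
The `'': 7` entry in the output above appears to come from tokens that contain no letters at all (776, 389, 1896, 1914, 1928, 1984, 1968), which collapse to an empty string once non-alphabetic characters are removed. A minimal sketch of the same map-based idea that drops those empty strings and avoids the global dictionary; the names `normalize`, `count_words` and `sample` are illustrative assumptions, not part of the original answer:

```python
def normalize(token):
    """Lowercase a token and keep only its alphabetic characters."""
    return ''.join(filter(str.isalpha, token.lower()))


def count_words(text):
    """Count words using only functions, map and filter (no explicit loops)."""
    counts = {}

    def add(word):
        counts[word] = counts.get(word, 0) + 1

    # filter(None, ...) drops the empty strings left by purely numeric tokens
    words = filter(None, map(normalize, text.split()))
    # list(...) forces the lazy map to run
    list(map(add, words))
    return counts


sample = "The Games began in 776 BC. The games lasted until 389 AD."
print(count_words(sample))
```

With this change the numeric tokens simply disappear from the result instead of being grouped under an empty key.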


But in general, this is the better way:
from collections import Counter
# normalize each token once, then skip the ones that end up empty; str1 holds your text
words = (''.join(filter(str.isalpha, x.lower())) for x in str1.split())
print(Counter(w for w in words if w))


Counter({'the': 46, 'and': 15, 'in': 13, 'of': 11, 'to': 9, 'games': 9, 'is': 6, 'olympic': 6, 'they': 5, 'their': 5, 'first': 4, 'for': 4, 'a': 4, 'until': 4, 'it': 4, 'had': 3, 'idea': 3, 'every': 3, 'four': 3, 'years': 3, 'events': 3, 'was': 3, 'have': 3, 'athletes': 3, 'all': 3, 'flag': 3, 'torch': 3, 'fanfare': 3, 'ancient': 2, 'together': 2, 'days': 2, 'from': 2, 'one': 2, 'compete': 2, 'wars': 2, 'few': 2, 'world': 2, 'held': 2, 'then': 2, 'that': 2, 'five': 2, 'on': 2, 'host': 2, 'city': 2, 'next': 2, 'where': 2, 'kept': 2, 'burning': 2, 'anthem': 2, 'by': 2, 'are': 2, 'played': 2, 'which': 2, 'march': 2, 'greeks': 1, 'getting': 1, 'men': 1, 'hold': 1, 'witness': 1, 'sporting': 1, 'those': 1, 'women': 1, 'did': 1, 'not': 1, 'participate': 1, 'though': 1, 'own': 1, 'independent': 1, 'best': 1, 'over': 1, 'greece': 1, 'gather': 1, 'field': 1, 'fighting': 1, 'stop': 1, 'while': 1, 'supporters': 1, 'came': 1, 'town': 1, 'olympia': 1, 'mostly': 1, 'related': 1, 'warfare': 1, 'throwing': 1, 'javelin': 1, 'running': 1, 'wrestling': 1, 'boxing': 1, 'chariot': 1, 'racing': 1, 'written': 1, 'reference': 1, 'bc': 1, 'lasted': 1, 'ad': 1, 'having': 1, 'modern': 1, 'suggested': 1, 'mid': 1, 'th': 1, 'century': 1, 'but': 1, 'werent': 1, 'event': 1, 'besides': 1, 'being': 1, 'postponed': 1, 'because': 1, 'been': 1, 'since': 1, 'different': 1, 'cities': 1, 'around': 1, 'many': 1, 'important': 1, 'symbols': 1, 'most': 1, 'people': 1, 'recognize': 1, 'rings': 1, 'appear': 1, 'coloured': 1, 'yellow': 1, 'green': 1, 'blue': 1, 'black': 1, 'red': 1, 'were': 1, 'introduced': 1, 'represent': 1, 'continents': 1, 'africa': 1, 'americas': 1, 'australia': 1, 'asia': 1, 'europe': 1, 'raised': 1, 'flown': 1, 'major': 1, 'part': 1, 'brought': 1, 'back': 1, 'carried': 1, 'with': 1, 'great': 1, 'publicity': 1, 'lights': 1, 'flame': 1, 'close': 1, 'symbolizes': 1, 'purity': 1, 'drive': 1, 'perfection': 1, 'struggle': 1, 'victory': 1, 'rousing': 1, 'simply': 1, 'named': 1, 'music': 1, 
'john': 1, 'williams': 1, 'who': 1, 'wrote': 1, 'olympics': 1, 'los': 1, 'angeles': 1, 'what': 1, 'you': 1, 'hear': 1, 'forty': 1, 'or': 1, 'so': 1, 'notes': 1, 'horns': 1, 'form': 1, 'buglers': 1, 'dream': 1, 'also': 1, 'called': 1, 'leo': 1, 'arnaud': 1, 'clearly': 1, 'evident': 1, 'opening': 1, 'ceremony': 1, 'when': 1, 'everyone': 1, 'formally': 1, 'welcomes': 1, 'participants': 1, 'can': 1, 'begin': 1, 'here': 1, 'we': 1, 'find': 1, 'dramatic': 1, 'colourful': 1, 'nations': 1, 'each': 1, 'country': 1, 'go': 1, 'into': 1, 'venue': 1, 'sound': 1, 'countrys': 1, 'behind': 1, 'flags': 1, 'thus': 1, 'becoming': 1, 'representatives': 1, 'countries': 1})
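
If the "no loops" requirement is read as also ruling out comprehensions, the Counter approach can probably be kept by feeding it a map/filter pipeline instead. A small sketch under that assumption; `normalize`, `word_frequencies` and `sample` are hypothetical names introduced here for illustration:

```python
from collections import Counter


def normalize(token):
    """Lowercase a token and keep only its alphabetic characters."""
    return ''.join(filter(str.isalpha, token.lower()))


def word_frequencies(text):
    """Build a Counter from a lazy map/filter pipeline, with no loop or comprehension."""
    return Counter(filter(None, map(normalize, text.split())))


sample = "The torch, the flag and the anthem: the Olympic symbols."
freq = word_frequencies(sample)
print(freq.most_common(1))  # [('the', 4)]
```

`Counter.most_common(n)` returns the n most frequent words, which is often what a frequency exercise actually asks for.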
Answers to the question: 2
@xtonypythonx
No
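# read the file and close it again once the text is in memory
with open("file.txt") as f:
    words = f.read().split()
unique = set(words)

for word in unique:
    # count whole tokens; str.count on the raw text would also match substrings
    print(f'{word}: {words.count(word)}')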
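
Since the question asks for a solution without loops, the same file-based idea can be sketched with `Counter` and `map`. This is only an illustration: the helper names `count_file_words` and `format_counts` are assumptions, and it counts whitespace-separated tokens as they are, without stripping punctuation or changing case:

```python
from collections import Counter


def count_file_words(path):
    """Return a Counter of whitespace-separated tokens in the file at `path`."""
    with open(path, encoding="utf-8") as f:
        return Counter(f.read().split())


def format_counts(counts):
    """Render 'word: n' lines with map instead of a for loop."""
    return "\n".join(map(lambda item: f"{item[0]}: {item[1]}", counts.items()))


# Usage, assuming the text is stored in file.txt:
# print(format_counts(count_file_words("file.txt")))
```

Counting everything in one pass with `Counter` also avoids rescanning the whole text once per unique word.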