Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Divide the text by 2 characters

I have a Python script that counts how many times each character occurs in a text file.

from __future__ import unicode_literals
import string
from collections import Counter
freqs = {}
text = sorted(open("rabi2.txt", "r" ,encoding='utf-8').read())

bad_chars = [')', '(', '-', '?', '?', ',', '!', '—', ' ', '!', '.', '\n']
text1 = ''.join(i for i in text if not i in bad_chars) 
texts = [[words for words in sentences.lower().split()] for sentences in text1]
for line in texts:
       for char in line:
           if char in freqs:
               freqs[char] += 1
           else:
               freqs[char] = 1

print(freqs)

I need to split the text into groups of 2 characters (and, in a separate program, groups of 3 characters), including spaces, and count how many times each group ("syllable") occurs. For example:

input: hello world hello everybody
output: he,ll,o(space),wo,rl,d(space),he,ll,o(space),ev,er,yb,od,y(space)

Then count how many times each group occurs, e.g.:

he - 2 times
ll - 2 times
wo - 1 time
and so on.



1 Answer


Is this what you are looking for? Note: it does not filter out the special characters.

import re
from collections import Counter

filePath = 'path/to/your/file.txt'  # replace with your file path
with open(filePath, 'r') as filePointer:
    groupList = re.findall('..', filePointer.read())

outputString = ','.join(groupList)
print(outputString)
print(Counter(groupList))
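
If the special characters should be dropped before grouping, one possible approach is to clean the text first and then apply the same regex. This is only a sketch: `BAD_CHARS` and `count_pairs` are hypothetical names, the character set is taken from the question's `bad_chars` list with the space left out so that spaces are still counted, and the text is lowercased as in the question. Removing `'\n'` also joins lines together, which matters because `.` in a regex does not match a newline by default.

```python
import re
from collections import Counter

# Assumed set of characters to drop (based on the question's bad_chars list).
# The space is deliberately NOT included, so it remains part of the groups.
BAD_CHARS = set(')(-?,!—.\n')

def count_pairs(text, bad_chars=BAD_CHARS):
    """Lowercase the text, drop unwanted characters, count 2-character groups."""
    cleaned = ''.join(ch for ch in text.lower() if ch not in bad_chars)
    pairs = re.findall('..', cleaned)  # a trailing single character is dropped
    return Counter(pairs)

counts = count_pairs('Hello, world! Hello everybody.')
print(counts['he'], counts['o '])
```

To use it on a file, pass the result of `filePointer.read()` as `text`.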

For your example:


import re
from collections import Counter

test = 'hello world hello everybody'
groupList = re.findall('..', test)

outputString = ','.join(groupList)
print(outputString)
print(Counter(groupList)) #two chars

OUTPUT:
he,ll,o ,wo,rl,d ,he,ll,o ,ev,er,yb,od
Counter({'ll': 2, 'o ': 2, 'he': 2, 'od': 1, 'wo': 1, 'yb': 1, 'd ': 1, 'rl': 1, 'ev': 1, 'er': 1})
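
Note that `re.findall('..', ...)` drops the final `y`, because the example has an odd number of characters. If the leftover characters should be kept, and the group size should be adjustable (2 or 3 characters, as in the question), plain slicing is one alternative. A minimal sketch; `chunk` is a hypothetical helper name, not part of any library:

```python
from collections import Counter

def chunk(text, size):
    """Split text into consecutive groups of `size` characters.

    The last group may be shorter if len(text) is not a multiple of size.
    """
    return [text[i:i + size] for i in range(0, len(text), size)]

test = 'hello world hello everybody'

pairs = chunk(test, 2)    # groups of 2; the trailing 'y' is kept
triples = chunk(test, 3)  # groups of 3

print(','.join(pairs))
print(Counter(pairs))
print(','.join(triples))
print(Counter(triples))
```

Unlike the regex version, slicing also treats newline characters like any other character, so nothing is silently skipped when the text spans several lines.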
