26 Commits

Author SHA1 Message Date
Tim Mahrt a36d7c8d17 REFACTOR: Gave the tonic vowel tier a more representative name 2016-03-16 12:00:19 +01:00
Tim Mahrt 65ac652dea DOCUMENTATION: Version changed to 1.3 in the setup.py file 2016-03-16 11:17:16 +01:00
Tim Mahrt ee08c347d5 DOCUMENTATION: Bolding text 2016-03-15 17:51:01 +01:00
Tim Mahrt c16c68a6ac DOCUMENTATION: Added to the acknowledgements. 2016-03-15 17:48:24 +01:00
Tim Mahrt bc4f19c74c FEATURE: Index to stressed vowel; marking of stressed vowels on textgrids
- the index to the stressed syllable was provided in the past.  Now
  the library also includes the index to the stressed vowel.  This is
  provided with relation to the phones in the syllable and all phones
  in the word.

- the code that marks the stressed syllables in the textgrids also
  now marks the stressed vowels

- several variables renamed to be more informative
2016-03-15 17:42:33 +01:00
Tim Mahrt c19cde7165 DOCUMENTATION: The link in the last update didn't work. 2016-02-18 14:21:06 +01:00
Tim Mahrt 38ebc7f3f9 BUGFIX: Python 3.x compatibility
Changed xrange -> range
Also added some documentation and changed the version number.
2016-02-18 14:17:49 +01:00
Tim Mahrt 102e8a7488 DOCUMENTATION: Removed duplicated text 2016-01-25 13:26:40 +01:00
Tim Mahrt 6b786cd00a DOCUMENTATION: Bolding 2016-01-25 13:25:33 +01:00
Tim Mahrt fb1e638cb8 DOCUMENTATION: Fixed link 2016-01-25 13:19:55 +01:00
Tim Mahrt e5acdfce30 DOCUMENTATION: Corrected islex reference, bolded grant numbers. 2016-01-25 13:18:20 +01:00
Tim Mahrt d47c312de7 DOCUMENTATION: Added requirements text about Python 3 to readme file. 2016-01-25 13:05:32 +01:00
Tim Mahrt 303d9bfcf2 DOCUMENTATION: Added revision information to pysle and more acknowledgements 2016-01-25 13:02:57 +01:00
Tim Mahrt 9c0ccd5748 DOCUMENTATION: Acknowledgements and citing information added 2016-01-25 12:39:43 +01:00
timmahrt 393182500e REFACTOR: Synchronized changes with the praatio library
Optional textgrid functionality requires praatio 2.1.0 or
greater.
2015-07-28 14:30:20 -05:00
timmahrt 985d68da6c REFACTOR: Change print statement to print function 2015-06-19 17:29:19 -05:00
timmahrt 0e53ed654e REFACTOR: PEP 8 compliance and minor bugfix
For bugfix, see last change in pronunciationtools.py
2015-06-18 19:56:15 -05:00
timmahrt ce633d0590 BUGFIX: Reflect changes in praatio library 2015-06-16 02:27:46 -05:00
timmahrt e2a2025f5b Merge remote-tracking branch 'origin/master' 2015-06-11 15:46:36 -05:00
timmahrt c10e3cf05f BUGFIX: Was unable to read islev2.txt with trailing newline
My custom islev2.txt did not have a trailing newline.
2015-06-11 15:43:27 -05:00
timmahrt 06222bf176 REFACTOR: PEP 8 compliance 2015-06-11 15:00:26 -05:00
Tim 6353e0172e Update README.rst 2015-06-01 15:01:29 -05:00
timmahrt fad0dd2902 SPEED BOOST: Now word lookup ~65 times faster.
Used to iterate through the isle text file for each search.
Now builds a dictionary of the form {word: pronunciation list}
2015-01-29 23:02:13 -06:00
timmahrt 475053eee2 DOCUMENTATION: Moved the project description up. 2014-10-23 15:53:57 -05:00
timmahrt 08f8e859cc DOCUMENTATION: Added link to praatio. Added table of contents.
Also added some clarification about the requirements.
2014-10-23 15:51:35 -05:00
timmahrt 9cd6a7e68b DOCUMENTATION: Added/cleaned up the readme file
Added a new section 'common use cases' since I
get that question a lot.
2014-10-23 15:41:02 -05:00
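The speed-up described in commit fad0dd2902 comes from reading the lexicon once into a dictionary rather than scanning the text file on every search. A minimal sketch of that idea is below. The function name `build_lexicon` and the sample lines are hypothetical and only approximate the layout of the ISLE file.

```python
def build_lexicon(lines):
    """Map each word to the list of its raw pronunciation strings."""
    lexicon = {}
    for row in lines:
        row = row.rstrip("\n")
        if not row:
            continue  # tolerate blank lines, e.g. after a trailing newline
        # Everything before the first space is the word plus a "(pos)" tag
        word, pronunciation = row.split(" ", 1)
        word = word.split("(")[0]
        lexicon.setdefault(word, []).append(pronunciation)
    return lexicon


# Made-up entries for illustration only (not copied from islev2.txt)
sample_lines = [
    "cat(nn) # k 'a t #\n",
    "record(nn) # r 'e . k &r d #\n",
    "record(vb) # r i . k 'o r d #\n",
]
lexicon = build_lexicon(sample_lines)
print(lexicon["record"])
```

Each lookup is then a single dictionary access, and words with several entries keep all of their pronunciations in file order.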
7 changed files with 273 additions and 162 deletions
+84 -6
@@ -3,6 +3,9 @@
pysle
---------
.. image:: https://img.shields.io/badge/license-MIT-blue.svg?
:target: http://opensource.org/licenses/MIT
Pronounced like 'p' + 'isle'.
An interface for the ISLEX (international speech lexicon) dictionary,
@@ -11,6 +14,52 @@ pronunciations (e.g. a list of phones someone said versus a standard or
canonical dictionary pronunciation).
.. sectnum::
.. contents::
Common Use Cases
================
What can you do with this library?
- look up the list of phones and syllables for canonical pronunciations
of a word::
pysle.isletool.LexicalTool.lookup('cat')
- map an actual pronunciation to a dictionary pronunciation (can be used
to automatically find speech errors)::
pysle.pronunciationtools.findClosestPronunciation(isleDict, 'cat', ['kh', 'ae',])
- automatically syllabify a praat textgrid containing words and phones
(e.g. force-aligned text) -- requires my
`praatIO <https://github.com/timmahrt/praatIO>`_ library::
pysle.syllabifyTextgrid(isleDict, praatioTextgrid, "words", "phones")
Major revisions
================
Ver 1.3 (March 15, 2016)
- added indices for stressed vowels
Ver 1.2 (June 20, 2015)
- Python 3.x support
Ver 1.1 (January 30, 2015)
- word lookup ~65 times faster
Ver 1.0 (October 23, 2014)
- first public release.
Requirements
================
@@ -20,17 +69,26 @@ Requirements
`ISLEX project page <http://www.isle.illinois.edu/sst/data/dict/>`_
`Direct link to the ISLEX file used in this project
<http://www.isle.illinois.edu/sst/data/dict/islev2.txt)>`_
<http://www.isle.illinois.edu/sst/data/dict/islex/islev2.txt>`_ (islev2.txt)
- ``Python 2.7.*`` or above
- ``Python 3.3.*`` or above
- The `praatIO <https://github.com/timmahrt/praatIO>`_ library is required IF
you want to use the textgrid functionality. It is not required
for normal use.
Installation
================
From a command-line shell, navigate to the directory this is located in
and type::
If you are on Windows, you can use the installer found here (check that it is up to date, though):
`Windows installer <http://www.timmahrt.com/python_installers>`_
python setup.py install
Otherwise, to install manually, download the source from GitHub, then from a command-line shell navigate to the directory containing setup.py and type::
python setup.py install
If python is not in your path, you'll need to enter the full path e.g.::
@@ -61,7 +119,27 @@ and another::
print syllableList
>> [["''"], ['n', '@'], ['th', 'r']]
stressedSyllable, syllableList, syllabification, stressedIndex = returnList
Please see \test for example usage
Please see \\examples for example usage
Citing pysle
===============
Pysle is general-purpose code and doesn't need to be cited
(you should cite the
`ISLEX project <http://www.isle.illinois.edu/sst/data/dict/islex/index.shtml>`_
instead) but if you would like to, it can be cited like so:
Tim Mahrt. Pysle. https://github.com/timmahrt/pysle, 2016.
Acknowledgements
================
Development of Pysle was possible thanks to NSF grant **IIS 07-03624**
to Jennifer Cole and Mark Hasegawa-Johnson, NSF grant **BCS 12-51343**
to Jennifer Cole, José Hualde, and Caroline Smith, and
to the A*MIDEX project (n° **ANR-11-IDEX-0001-02**) to James Sneed German
funded by the Investissements d'Avenir French Government program, managed
by the French National Research Agency (ANR).
+67 -60
@@ -5,86 +5,92 @@ Created on Oct 11, 2012
'''
vowelList = ['a', '@', 'e', 'i', 'o', 'u', '^', '&', '>',]
vowelList = ['a', '@', 'e', 'i', 'o', 'u', '^', '&', '>', ]
class WordNotInISLE(Exception):
def __init__(self, word):
super(WordNotInISLE, self).__init__()
self.word = word
def __str__(self):
return "Word '%s' not in ISLE dictionary. Please add it to continue." % self.word
return ("Word '%s' not in ISLE dictionary. "
"Please add it to continue." % self.word)
class LexicalTool():
def __init__(self, islePath):
self.islePath = islePath
self.data = None
self.pronDict = None
self.data = self._buildDict()
def _buildDict(self):
'''
Builds the isle textfile into a dictionary for fast searching
'''
lexDict = {}
wordList = [line.rstrip('\n') for line in open(self.islePath, "rU")]
for row in wordList:
word, pronunciation = row.split(" ", 1)
word = word.split("(")[0]
lexDict.setdefault(word, [])
lexDict[word].append(pronunciation)
return lexDict
def lookup(self, word):
'''
Lookup a word and receive a list of syllables and stressInfo
'''
# All words must be lowercase with no extraneous whitespace
word = word.lower()
word = word.strip()
# Find indices in the dictionary
pronList = self.data.get(word, None)
if self.data == None:
self.data = open(self.islePath, "r").read()
if pronList is None:
raise WordNotInISLE(word)
else:
pronList = [_parsePronunciation(pronunciationStr)
for pronunciationStr in pronList]
wordList = []
searchIndex = 0
while True:
# (The +1 skips over the "\n" which marks the start of every word)
startIndex = self.data.find("\n"+word + "(", searchIndex) + 1
# find() returns -1 if it does not find anything, but
# note that we added 1 to the return value
try:
assert(startIndex != 0)
except AssertionError:
if searchIndex == 0:
raise WordNotInISLE(word)
else:
break
endIndex = self.data.find("\n", startIndex)
searchIndex = endIndex
wordList.append((startIndex, endIndex))
returnList = []
for startIndex, endIndex in wordList:
isleWord = self.data[startIndex:endIndex]
syllableTxt = isleWord.split("#")[1].strip()
syllableList = [x for x in syllableTxt.split(' . ')]
# Find stress
stressList = []
for i, syllable in enumerate(syllableList):
# Primary stress
if "'" in syllable:
stressList.insert(0, i)
# Secondary stress
elif '"' in syllable:
stressList.append(i)
syllableList = [x.split(" ") for x in syllableList]
returnList.append((syllableList, stressList))
return returnList
return pronList
def _parsePronunciation(pronunciationStr):
'''
Parses the pronunciation string
Returns the list of syllables and a list of primary and
secondary stress locations
'''
syllableTxt = pronunciationStr.split("#")[1].strip()
syllableList = [x.split() for x in syllableTxt.split(' . ')]
# Find stress
stressedSyllableList = []
stressedPhoneList = []
for i, syllable in enumerate(syllableList):
for j, phone in enumerate(syllable):
if "'" in phone:
stressedSyllableList.insert(0, i)
stressedPhoneList.insert(0, j)
break
elif '"' in phone:
stressedSyllableList.append(i)
stressedPhoneList.append(j)
return syllableList, stressedSyllableList, stressedPhoneList
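The stress parsing above can be tried in isolation. The sketch below is a standalone approximation, not the library's code: the name `parse_pronunciation` and the sample string are hypothetical. It assumes the stress mark is attached to the phone token, and that secondary stress is appended after any primary stress.

```python
def parse_pronunciation(pronunciation):
    """Return (syllables, stressed syllable indices, stressed phone indices)."""
    # The pronunciation sits between the two '#' delimiters
    syllable_txt = pronunciation.split("#")[1].strip()
    syllables = [s.split() for s in syllable_txt.split(" . ")]

    stressed_syllables = []
    stressed_phones = []
    for i, syllable in enumerate(syllables):
        for j, phone in enumerate(syllable):
            if "'" in phone:
                # Primary stress always goes to the front of both lists
                stressed_syllables.insert(0, i)
                stressed_phones.insert(0, j)
                break
            elif '"' in phone:
                # Secondary stress is appended (list.insert needs an index)
                stressed_syllables.append(i)
                stressed_phones.append(j)
    return syllables, stressed_syllables, stressed_phones


# Made-up entry: secondary stress on syllable 0, primary on syllable 2
result = parse_pronunciation("# \"a . l @ . b 'a m . @ #")
print(result)
```

Because primary stress is inserted at position 0, callers can read index 0 of either list to find the main stress, as `syllabifyTextgrid` does.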
def getNumPhones(isleDict, label, maxFlag):
'''
If maxFlag=True, use the longest pronunciation. Otherwise, take the
If maxFlag=True, use the longest pronunciation. Otherwise, take the
average length.
'''
phoneCount = 0
@@ -94,24 +100,28 @@ def getNumPhones(isleDict, label, maxFlag):
phoneListOfLists = isleDict.lookup(word)
syllableCountList = []
for syllableList, stressIndex in phoneListOfLists:
for row in phoneListOfLists:
syllableList = row[0]
syllableCountList.append(len(syllableList))
# In ISLE, there can be multiple pronunciations for each word
# as we have no reason to believe one pronunciation is more
# likely than another, we take the average of all of them
phoneCountList = []
for syllableList, stressIndex in phoneListOfLists:
phoneCountList.append(len([phon for phoneList in syllableList for phon in phoneList]))
for row in phoneListOfLists:
syllableList = row[0]
phoneCountList.append(len([phon for phoneList in syllableList for
phon in phoneList]))
# The average number of phones for all possible pronunciations
# of this word
if maxFlag == True:
if maxFlag is True:
syllableCount += max(syllableCountList)
phoneCount += max(phoneCountList)
else:
syllableCount += sum(syllableCountList) / float(len(syllableCountList))
phoneCount += sum(phoneCountList) / float(len(phoneCountList))
syllableCount += (sum(syllableCountList) /
float(len(syllableCountList)))
phoneCount += sum(phoneCountList) / float(len(phoneCountList))
return syllableCount, phoneCount
@@ -131,6 +141,3 @@ def findOODWords(isleDict, wordList):
oodList.sort()
return oodList
+51 -31
@@ -4,13 +4,14 @@ Created on Oct 22, 2014
@author: tmahrt
'''
class OptionalFeatureError(ImportError):
def __str__(self):
return "ERROR: You must have praatio installed to use pysle.praatTools"
try:
import praatio
from praatio import tgio
except ImportError:
raise OptionalFeatureError()
@@ -18,7 +19,7 @@ from pysle import isletool
from pysle import pronunciationtools
def syllabifyTextgrid(isleDict, tg, wordTierName, phoneTierName,
def syllabifyTextgrid(isleDict, tg, wordTierName, phoneTierName,
skipLabelList=None):
'''
Given a textgrid, syllabifies the phones in the textgrid
@@ -34,11 +35,12 @@ def syllabifyTextgrid(isleDict, tg, wordTierName, phoneTierName,
wordTier = tg.tierDict[wordTierName]
phoneTier = tg.tierDict[phoneTierName]
if skipLabelList == None:
if skipLabelList is None:
skipLabelList = []
syllableEntryList = []
tonicEntryList = []
tonicSEntryList = []
tonicPEntryList = []
for start, stop, word in wordTier.entryList:
if word in skipLabelList:
@@ -46,28 +48,43 @@ def syllabifyTextgrid(isleDict, tg, wordTierName, phoneTierName,
subPhoneTier = phoneTier.crop(start, stop, True, False)[0]
phoneList = [phone for startP, endP, phone in subPhoneTier.entryList if phone != '']
# entry = (start, stop, phone)
phoneList = [entry[2] for entry in subPhoneTier.entryList
if entry[2] != '']
try:
returnList = pronunciationtools.findBestSyllabification(isleDict,
word,
returnList = pronunciationtools.findBestSyllabification(isleDict,
word,
phoneList)
except isletool.WordNotInISLE:
print "Word ('%s') not is isle -- skipping syllabification" % word
print("Word ('%s') not in ISLE -- skipping syllabification" % word)
continue
except (pronunciationtools.NullPronunciationError):
print "Word ('%s') has no provided pronunciation" % word
print("Word ('%s') has no provided pronunciation" % word)
continue
stressedSyllable, syllableList, syllabification, stressIndexList = returnList
syllableList = returnList[1]
stressedSyllableIndexList = returnList[3]
stressedPhoneIndexList = returnList[4]
flattenedPhoneIndexList = returnList[5]
try:
stressI = stressedSyllableIndexList[0]
stressJ = stressedPhoneIndexList[0]
except IndexError:
stressI = None # Function word probably
stressJ = None #
if stressI is not None:
syllableList[stressI][stressJ] += "'"
i = 0
# print syllableList
# print(syllableList)
for k, syllable in enumerate(syllableList):
# Create the syllable tier entry
j = len(syllable)
stubEntryList = subPhoneTier.entryList[i:i+j]
stubEntryList = subPhoneTier.entryList[i:i + j]
i += j
# The whole syllable was deleted
@@ -76,29 +93,32 @@ def syllabifyTextgrid(isleDict, tg, wordTierName, phoneTierName,
syllableStart = stubEntryList[0][0]
syllableEnd = stubEntryList[-1][1]
label = "-".join([phone for start, end, phone in stubEntryList])
label = "-".join([entry[2] for entry in stubEntryList])
syllableEntryList.append( (syllableStart, syllableEnd, label) )
syllableEntryList.append((syllableStart, syllableEnd, label))
# Create the tonic tier entry
try:
stressIndex = stressIndexList[0]
except IndexError:
stressIndex = None # Function word probably
tonicLabel = ''
if k == stressIndex:
tonicLabel = 'T'
tonicEntryList.append( (syllableStart, syllableEnd, tonicLabel) )
# Create the tonic syllable tier entry
if k == stressI:
tonicSEntryList.append((syllableStart, syllableEnd, 'T'))
# Create the tonic phone tier entry
if k == stressI:
syllablePhoneTier = phoneTier.crop(syllableStart, syllableEnd,
True, False)[0]
phoneList = [entry for entry in syllablePhoneTier.entryList
if entry[2] != '']
phoneStart, phoneEnd = phoneList[stressJ][:2]
tonicPEntryList.append((phoneStart, phoneEnd, 'T'))
# Create a textgrid with the two syllable-level tiers
syllableTier = praatio.TextgridTier("syllable", syllableEntryList, praatio.INTERVAL_TIER)
tonicTier = praatio.TextgridTier('tonic', tonicEntryList, praatio.INTERVAL_TIER)
syllableTier = tgio.IntervalTier("syllable", syllableEntryList)
tonicSTier = tgio.IntervalTier('tonicSyllable', tonicSEntryList)
tonicPTier = tgio.IntervalTier('tonicVowel', tonicPEntryList)
syllableTG = praatio.Textgrid()
syllableTG = tgio.Textgrid()
syllableTG.addTier(syllableTier)
syllableTG.addTier(tonicTier)
syllableTG.addTier(tonicSTier)
syllableTG.addTier(tonicPTier)
return syllableTG
+55 -53
@@ -9,10 +9,10 @@ import itertools
from pysle import isletool
class NullPronunciationError(Exception):
def __init__(self, word):
super(NullPronunciationError, self).__init__()
self.word = word
def __str__(self):
@@ -49,7 +49,7 @@ def _lcs(xs, ys):
ll_b = _lcs_lens(xb, ys)
ll_e = _lcs_lens(xe[::-1], ys[::-1])
_, k = max((ll_b[j] + ll_e[ny - j], j)
for j in range(ny + 1))
for j in range(ny + 1))
yb, ye = ys[:k], ys[k:]
return _lcs(xb, yb) + _lcs(xe, ye)
@@ -58,14 +58,13 @@ def _prepPronunciation(phoneList):
retList = []
for phone in phoneList:
if 'r' in phone:
phone = ['r',]
phone = ['r', ]
try:
phone = phone[0] # Only represent the str by its first letter
phone = phone[0] # Only represent the string by its first letter
phone = phone.lower()
except IndexError:
raise NullPhoneError()
phone = phone.lower()
if phone in isletool.vowelList:
phone = 'V'
retList.append(phone)
@@ -85,14 +84,14 @@ def _adjustSyllabification(adjustedPhoneList, syllableList):
retSyllableList = []
for syllable in syllableList:
j = len(syllable)
tmpPhoneList = adjustedPhoneList[i:i+j]
tmpPhoneList = adjustedPhoneList[i:i + j]
numBlanks = -1
phoneList = tmpPhoneList[:]
while numBlanks != 0:
numBlanks = tmpPhoneList.count("''")
if numBlanks > 0:
tmpPhoneList = adjustedPhoneList[i+j:i+j+numBlanks]
tmpPhoneList = adjustedPhoneList[i + j:i + j + numBlanks]
phoneList.extend(tmpPhoneList)
j += numBlanks
@@ -116,27 +115,32 @@ def _findBestPronunciation(isleDict, wordText, aPron):
isleWordList = isleDict.lookup(wordText)
aP = _prepPronunciation(aPron) # Mapping to simplified phone inventory
aP = _prepPronunciation(aPron) # Mapping to simplified phone inventory
origPronDict = dict((newPron,oldPron) for newPron, oldPron in zip(aP, aPron))
origPronDict = dict((newPron, oldPron)
for newPron, oldPron in zip(aP, aPron))
numDiffList = []
withStress = []
i = 0
alignedSyllabificationList = []
alignedActualPronunciationList = []
for syllableList, stressList in isleWordList:
for wordTuple in isleWordList:
syllableList = wordTuple[0] # syllableList, stressList
iP = [phone for phoneList in syllableList for phone in phoneList]
iP = _prepPronunciation(iP)
alignedIP, alignedAP = alignPronunciations(iP, aP)
alignedAP = [origPronDict.get(phon, "''") for phon in alignedAP] # Remapping to actual phones
# Remapping to actual phones
alignedAP = [origPronDict.get(phon, "''") for phon in alignedAP]
alignedActualPronunciationList.append(alignedAP)
# Adjusting the syllabification for differences between the dictionary
# pronunciation and the actual pronunciation
alignedSyllabification = _adjustSyllabification(alignedIP, syllableList)
alignedSyllabification = _adjustSyllabification(alignedIP,
syllableList)
alignedSyllabificationList.append(alignedSyllabification)
# Count the number of misalignments between the two
@@ -147,7 +151,7 @@ def _findBestPronunciation(isleDict, wordText, aPron):
hasStress = False
for syllable in syllableList:
for phone in syllable:
hasStress = "'" in phone or hasStress
hasStress = "'" in phone or hasStress
if hasStress:
withStress.append(i)
@@ -164,16 +168,16 @@ def _findBestPronunciation(isleDict, wordText, aPron):
for i, numDiff in enumerate(numDiffList):
if numDiff != minDiff:
continue
if bestIndex == None:
if bestIndex is None:
bestIndex = i
bestIsStressed = i in withStress
else:
if not bestIsStressed and i in withStress:
bestIndex = i
bestIsStressed = True
return isleWordList, alignedActualPronunciationList, alignedSyllabificationList, bestIndex
return (isleWordList, alignedActualPronunciationList,
alignedSyllabificationList, bestIndex)
def _syllabifyPhones(phoneList, syllableList, isleStressList):
@@ -193,9 +197,9 @@ def _syllabifyPhones(phoneList, syllableList, isleStressList):
start = 0
syllabifiedList = []
for i, end in enumerate(numPhoneList):
for end in numPhoneList:
syllable = phoneList[start:start+end]
syllable = phoneList[start:start + end]
syllabifiedList.append(syllable)
start += end
@@ -212,21 +216,6 @@ def alignPronunciations(pronI, pronA):
pronI = [char for char in pronI]
pronA = [char for char in pronA]
# -- allow for some flexibility in pronunciation
correctionsTuple = (('d', 't'), ('t', 'd'), ('s', 'z'), ('z', 's'),
('m', 'n'), ('n', 'm'),)
doMatch = lambda i, a: ((i == a) or
((i, a) in correctionsTuple))
def matchExists(targetPhone, pron):
match = False
for phone in pron:
match = match or doMatch(targetPhone, phone)
return match
# Remove vowels
# Remove any elements not in the other list (but maintain order)
pronITmp = pronI
pronATmp = pronA
@@ -244,7 +233,7 @@ def alignPronunciations(pronI, pronA):
startA = pronA.index(phone, startA)
startI = pronI.index(phone, startI)
sequenceIndexListA.append(startA)
sequenceIndexListA.append(startA)
sequenceIndexListI.append(startI)
# An index on the tail of both will be used to create output strings
@@ -254,17 +243,19 @@ def alignPronunciations(pronI, pronA):
# Fill in any blanks such that the sequential items have the same
# index and the two strings are the same length
for x in xrange(len(sequenceIndexListA)):
for x in range(len(sequenceIndexListA)):
indexA = sequenceIndexListA[x]
indexI = sequenceIndexListI[x]
if indexA < indexI :
for x in xrange(indexI - indexA):
if indexA < indexI:
for x in range(indexI - indexA):
pronA.insert(indexA, "''")
sequenceIndexListA = [val + indexI - indexA for val in sequenceIndexListA]
sequenceIndexListA = [val + indexI - indexA
for val in sequenceIndexListA]
elif indexA > indexI:
for x in xrange(indexA - indexI):
for x in range(indexA - indexI):
pronI.insert(indexI, "''")
sequenceIndexListI = [val + indexA - indexI for val in sequenceIndexListI]
sequenceIndexListI = [val + indexA - indexI
for val in sequenceIndexListI]
return pronI, pronA
@@ -273,23 +264,36 @@ def findBestSyllabification(isleDict, wordText, actualPronunciationList):
'''
Find the best syllabification for a word
First find the closest pronunciation to a given pronunciation. Then take
the syllabification for that pronunciation and map it onto the
First find the closest pronunciation to a given pronunciation. Then take
the syllabification for that pronunciation and map it onto the
input pronunciation.
'''
retList = _findBestPronunciation(isleDict, wordText, actualPronunciationList)
retList = _findBestPronunciation(isleDict, wordText,
actualPronunciationList)
isleWordList, alignedAPronList, alignedSyllableList, bestIndex = retList
alignedPhoneList = alignedAPronList[bestIndex]
alignedSyllables = alignedSyllableList[bestIndex]
syllabification = isleWordList[bestIndex][0]
stressedIndex = isleWordList[bestIndex][1]
stressedSyllableIndexList = isleWordList[bestIndex][1]
stressedPhoneIndexList = isleWordList[bestIndex][2]
stressedSyllable, syllableList = _syllabifyPhones(alignedPhoneList,
alignedSyllables,
stressedIndex)
stressedSyllable, syllableList = _syllabifyPhones(alignedPhoneList,
alignedSyllables,
stressedSyllableIndexList)
return stressedSyllable, syllableList, syllabification, stressedIndex
# Count the index of the stressed phones, if the stress list has
# become flattened (no syllable information)
flattenedStressIndexList = []
for i, j in zip(stressedSyllableIndexList, stressedPhoneIndexList):
k = j
for l in range(i):
k += len(syllableList[l])
flattenedStressIndexList.append(k)
return (stressedSyllable, syllableList, syllabification,
stressedSyllableIndexList, stressedPhoneIndexList,
flattenedStressIndexList)
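The flattening step at the end of `findBestSyllabification` converts a (syllable index, phone-in-syllable index) pair into an index over all phones in the word. A small standalone sketch of the same arithmetic follows; the helper name `flatten_stress_indices` and the sample syllables are hypothetical.

```python
def flatten_stress_indices(syllables, stressed_syllables, stressed_phones):
    """Convert per-syllable phone indices into word-level phone indices."""
    flattened = []
    for i, j in zip(stressed_syllables, stressed_phones):
        # Count every phone in the syllables that come before syllable i
        offset = sum(len(syllable) for syllable in syllables[:i])
        flattened.append(offset + j)
    return flattened


# Made-up word: primary stress on phone 1 of syllable 2,
# secondary stress on phone 0 of syllable 0
syllables = [['"a'], ['l', '@'], ['b', "'a", 'm'], ['@']]
flat = flatten_stress_indices(syllables, [2, 0], [1, 0])
all_phones = [phone for syllable in syllables for phone in syllable]
print(flat, [all_phones[k] for k in flat])
```

Indexing the flattened phone list with the result recovers the stressed phones, which is a quick way to check the offsets.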
def findClosestPronunciation(isleDict, wordText, aPron):
@@ -298,9 +302,7 @@ def findClosestPronunciation(isleDict, wordText, aPron):
'''
retList = _findBestPronunciation(isleDict, wordText, aPron)
isleWordList, actualPronunciationList, bestIndex = retList
isleWordList = retList[0]
bestIndex = retList[3]
return isleWordList[bestIndex]
+2 -2
@@ -5,7 +5,7 @@ Created on Oct 15, 2014
'''
from distutils.core import setup
setup(name='pysle',
version='1.0.0',
version='1.3.0',
author='Tim Mahrt',
author_email='timmahrt@gmail.com',
package_dir={'pysle':'pysle'},
@@ -13,4 +13,4 @@ setup(name='pysle',
license='LICENSE',
long_description=open('README.rst', 'r').read(),
# install_requires=[], # No requirements! # requires 'from setuptools import setup'
)
)
+6 -6
@@ -20,13 +20,13 @@ firstEntry = lookupResults[0]
firstSyllableList = firstEntry[0]
firstStressList = firstEntry[1]
print searchWord
print firstSyllableList, firstStressList # 3rd syllable carries stress
print(searchWord)
print(firstSyllableList, firstStressList) # 3rd syllable carries stress
# Here we determine the syllabification of a word, as it was said.
# (Of course, this is just a guess)
print '-'*50
print('-'*50)
searchWord = 'another'
anotherPhoneList = ['n', '@', 'th', 'r']
@@ -37,8 +37,8 @@ returnList = pronunciationtools.findBestSyllabification(isleDict,
stressedSyllable, syllableList, syllabification, stressedIndex = returnList
print searchWord
print anotherPhoneList
print syllableList # We can see the first syllable was elided
print(searchWord)
print(anotherPhoneList)
print(syllableList) # We can see the first syllable was elided
+8 -4
@@ -12,21 +12,25 @@ This snippet shows you how to use this function.
from os.path import join
import praatio
from praatio import tgio
from pysle import isletool
from pysle import praattools
path = join('.', 'files')
path = "/Users/tmahrt/Dropbox/workspace/pysle/test/files"
tg = praatio.openTextGrid(join(path, "pumpkins.TextGrid"))
isleDict = isletool.LexicalTool('/Users/tmahrt/Dropbox/workspace/pysle/test/islev2.txt') # Needs the full path to the file
tg = tgio.openTextGrid(join(path, "pumpkins.TextGrid"))
# Needs the full path to the file
islevPath = '/Users/tmahrt/Dropbox/workspace/pysle/test/islev2.txt'
isleDict = isletool.LexicalTool(islevPath)
# Get the syllabification tiers and add them to the textgrid
syllableTG = praattools.syllabifyTextgrid(isleDict, tg, "word", "phone",
skipLabelList=["",])
tg.addTier(syllableTG.tierDict["syllable"])
tg.addTier(syllableTG.tierDict["tonic"])
tg.addTier(syllableTG.tierDict["tonicSyllable"])
tg.addTier(syllableTG.tierDict["tonicVowel"])