- Change the egg requirement to use the right name, beautifulsoup4, instead of bs4
- Update to 2.0.3:
  * Added new tokenizer case for ':' preventing cut in the middle of a time notation
- Update to 2.0.2:
  Features
  * Added Python 3.7 support, modernization of packaging, testing and CI
  Bugfixes
  * Fixed language retrieval/validation broken from new Google Translate page
- Update to 2.0.1:
  Bugfixes
  * Fixed a UnicodeDecodeError when installing gTTS if system locale was not utf-8
  Improved Documentation
  * Added Pre-processing and tokenizing > Minimizing section about the API's 100-character limit and how larger tokens are handled
- Update to 2.0.0:
  Features
  * The gtts module
    + New logger ("gtts") replaces all occurrences of print()
    + Languages list is now obtained automatically (gtts.lang)
    + Added a curated list of language sub-tags that have been observed to provide different dialects or accents (e.g. "en-gb", "fr-ca")
    + New gTTS() parameter lang_check to disable language checking
    + gTTS() now delegates the text tokenizing to the API request methods (i.e. write_to_fp(), save()), allowing gTTS instances to be modified/reused
    + Rewrote tokenizing and added pre-processing (see below)
    + New gTTS() parameters pre_processor_funcs and tokenizer_func to configure pre-processing and tokenizing (or use a 3rd party tokenizer)
    + Error handling:
      - Added new exception gTTSError raised on API request errors. It attempts to guess what went wrong based on known information and observed behaviour
      - gTTS.write_to_fp() and gTTS.save() also raise gTTSError on gtts_token error
      - gTTS.write_to_fp() raises TypeError when fp is not a file-like object or one that doesn't take bytes
      - gTTS() raises ValueError on unsupported languages (when lang_check is True)
      - More fine-grained error handling throughout (e.g. request failed vs. request successful with a bad response)
  * Tokenizer (and new pre-processors):
    + Rewrote and greatly expanded tokenizer (gtts.tokenizer)
    + Smarter token 'cleaning' that will remove tokens that only contain characters that can't be spoken (i.e. punctuation and whitespace)
    + Decoupled token minimizing from tokenizing, making the latter usable in other contexts
    + New flexible speech-centric text pre-processing
    + New flexible full-featured regex-based tokenizer (gtts.tokenizer.core.Tokenizer)
    + New RegexBuilder, PreProcessorRegex and PreProcessorSub classes to make writing regex-powered text pre-processors and tokenizer cases easier
    + Pre-processors:
      - Re-form words cut by end-of-line hyphens
      - Remove periods after a (customizable) list of known abbreviations (e.g. "jr", "sr", "dr") that can be spoken the same without a period
      - Perform speech corrections by doing word-for-word replacements from a (customizable) list of tuples
    + Tokenizing:
      - Keep punctuation that modifies the inflection of speech (e.g. "?", "!")
      - Don't split in the middle of numbers (e.g. "10.5", "20,000,000")
      - Don't split on "dotted" abbreviations and acronyms (e.g. "U.S.A")
      - Added Chinese comma (","), ellipsis ("…") to punctuation list to tokenize on
  * The gtts-cli command-line tool
    + Rewrote cli as first-class citizen module (gtts.cli), powered by Click
    + Windows support using setuptools' entry_points
    + Better support for Unicode I/O in Python 2
    + All arguments are now pre-validated
    + New --nocheck flag to skip language pre-checking
    + New --all flag to list all available languages
    + Either the --file option or the <text> argument can be set to "-" to read from stdin
    + The --debug flag uses logging and doesn't pollute stdout anymore
  Bugfixes
  * _minimize(): Fixed an infinite recursion loop that would occur when a token started with the minimizing delimiter (i.e. a space)
  * _minimize(): Handle the case where a token of more than 100 characters did not contain a space (e.g. in Chinese)
  * Fixed an issue that fused multiline text together if the total number of characters was less than 100
  * Fixed gtts-cli Unicode errors in Python 2.7
  Deprecations and Removals
  * Dropped Python 3.3 support
  * Removed debug parameter of gTTS (in favour of logger)
  * gtts-cli: Changed long option name of -o to --output instead of --destination
  * gTTS() will raise a ValueError rather than an AssertionError on unsupported language
  Improved Documentation
  * Rewrote all documentation files as reStructuredText
  * Comprehensive documentation written for Sphinx, published to http://gtts.readthedocs.io
  * Changelog built with towncrier
  Misc
  * Major test re-work
  * Language tests can read a TEST_LANGS environment variable so not all language tests are run every time
  * Added AppVeyor CI for Windows
  * PEP 8 compliance
- Add remove-pip-requirement.patch to remove the dependency on pip to build the package.
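The two _minimize() fixes in the 2.0.0 notes above (a token that starts with the delimiter, and an over-long token with no space at all) can be illustrated with a short standalone sketch. This is not gTTS's own code: the function name minimize and its parameters delim and max_size are hypothetical, and only the 100-character limit and the space delimiter come from the changelog.

```python
def minimize(text, delim=" ", max_size=100):
    """Split text into chunks of at most max_size characters.

    Illustrative only (not the gTTS implementation). Assumes delim is
    a non-empty string. Cuts at the last delimiter that keeps the chunk
    within the limit, and falls back to a hard cut when there is none.
    """
    chunks = []
    while text:
        # Drop a leading delimiter first; otherwise a cut at index 0
        # would produce an empty chunk and never make progress.
        if text.startswith(delim):
            text = text[len(delim):]
            continue
        if len(text) <= max_size:
            chunks.append(text)
            break
        # Last delimiter whose start index is <= max_size.
        idx = text.rfind(delim, 0, max_size + len(delim))
        if idx <= 0:
            # No usable delimiter (e.g. Chinese text): hard cut.
            idx = max_size
        chunks.append(text[:idx])
        text = text[idx:]
    return chunks


if __name__ == "__main__":
    sample = " ".join(["word"] * 50)
    for chunk in minimize(sample):
        print(len(chunk), repr(chunk[:20]))
```

An iterative loop is used here rather than recursion so that very long inputs cannot hit Python's recursion limit.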
- Run spec-cleaner
- Use %license for the LICENSE file
- Focus description on gTTS.
- Initial release of python-gTTS 1.2.2