Version: 3.7-bp155.2.13
* Sat Mar 14 2020 Tomáš Chvátal <tchvatal@suse.com>
- Fix build without python2
* Wed Sep 11 2019 Tomáš Chvátal <tchvatal@suse.com>
- Update to 3.4.5 (bsc#1146427, CVE-2019-14751):
* Fixed security bug in downloader: Zip slip vulnerability - for the
unlikely situation where a user configures their downloader to use
a compromised server CVE-2019-14751
* Tue Jul 23 2019 Tomáš Chvátal <tchvatal@suse.com>
- Update to 3.4.4:
* fix bug in plot function (probability.py)
* add improved PanLex Swadesh corpus reader
* add Text.generate()
* add QuadgramAssocMeasures
* add SSP to tokenizers
* return confidence of best tag from AveragedPerceptron
* make plot methods return Axes objects
* don't require list arguments to PositiveNaiveBayesClassifier.train
* fix Tree classes to work with native Python copy library
* fix inconsistency for NomBank
* fix random seeding in LanguageModel.generate
* fix ConditionalFreqDist mutation on tabulate/plot call
* fix broken links in documentation
* fix misc Wordnet issues
* update installation instructions
Version: 3.7-bp152.3.3.1
* Tue Mar 22 2022 Matej Cepl <mcepl@suse.com>
- Update to 3.7
- Improve and update the NLTK team page on nltk.org (#2855,
[#2941])
- Drop support for Python 3.6, support Python 3.10 (#2920)
- Update to 3.6.7
- Resolve IndexError in `sent_tokenize` and `word_tokenize`
(#2922)
- Update to 3.6.6
- Refactor `gensim.doctest` to work for gensim 4.0.0 and up
(#2914)
- Add Precision, Recall, F-measure, Confusion Matrix to Taggers
(#2862)
- Added warnings if .zip files exist without any corresponding
.csv files. (#2908)
- Fix `FileNotFoundError` when the `download_dir` is
a non-existing nested folder (#2910)
- Rename omw to omw-1.4 (#2907)
- Resolve ReDoS opportunity by fixing incorrectly specified
regex (#2906, bsc#1191030, CVE-2021-3828).
- Support OMW 1.4 (#2899)
- Deprecate Tree get and set node methods (#2900)
- Fix broken inaugural test case (#2903)
- Use Multilingual Wordnet Data from OMW with newer Wordnet
versions (#2889)
- Keep NLTK's "tokenize" module working with pathlib (#2896)
- Make prettyprinter more readable (#2893)
- Update links to the nltk book (#2895)
- Add `CITATION.cff` to nltk (#2880)
- Resolve serious ReDoS in PunktSentenceTokenizer (#2869)
- Delete old CI config files (#2881)
- Improve Tokenize documentation + add TokenizerI as superclass
for TweetTokenizer (#2878)
- Fix expected value for BLEU score doctest after changes from
[#2572]
- Add multi Bleu functionality and tests (#2793)
- Deprecate 'return_str' parameter in NLTKWordTokenizer and
TreebankWordTokenizer (#2883)
- Allow empty string in CFG's + more (#2888)
- Partition `tree.py` module into `tree` package + pickle fix
(#2863)
- Fix several TreebankWordTokenizer and NLTKWordTokenizer bugs
(#2877)
- Rewind Wordnet data file after each lookup (#2868)
- Correct __init__ call for SyntaxCorpusReader subclasses
(#2872)
- Documentation fixes (#2873)
- Fix Levenshtein distance for duplicated letters (#2849)
- Support alternative Wordnet versions (#2860)
- Remove hundreds of formatting warnings for nltk.org (#2859)
- Modernize `nltk.org/howto` pages (#2856)
- Fix Bleu Score smoothing function from taking log(0) (#2839)
- Update third-party tools to newer versions and remove the
fixed MaltParser version (#2832)
- Fix TypeError: _pretty() takes 1 positional argument but 2
were given in sem/drt.py (#2854)
- Replace `http` with `https` in most URLs (#2852)
- Update to 3.6.5
- modernised nltk.org website
- addressed LGTM.com issues
- support ZWJ sequences emoji and skin tone modifier emoji in
TweetTokenizer
- METEOR evaluation now requires pre-tokenized input
- Code linting and type hinting
- implement get_refs function for DrtLambdaExpression
- Enable automated CoreNLP, Senna, Prover9/Mace4, Megam,
MaltParser CI tests
- specify minimum regex version that supports regex.Pattern
- avoid re.Pattern and regex.Pattern which fail for Python 3.6,
3.7
- Update to 3.6.4
- deprecate `nltk.usage(obj)` in favor of `help(obj)`
- resolve ReDoS vulnerability in Corpus Reader
- solidify performance tests
- improve phone number recognition in tweet tokenizer
- refactored CISTEM stemmer for German
- identify NLTK Team as the author
- replace travis badge with github actions badge
- add SECURITY.md
- Update to 3.6.3
- Dropped support for Python 3.5
- Run CI tests on Windows, too
- Moved from Travis CI to GitHub Actions
- Code and comment cleanups
- Visualize WordNet relation graphs using Graphviz
- Fixed large error in METEOR score
- Apply isort, pyupgrade, black, added as pre-commit hooks
- Prevent debug_decisions in Punkt from throwing IndexError
- Resolved ZeroDivisionError in RIBES with dissimilar sentences
- Initialize WordNet IC total counts with smoothing value
- Fixed AttributeError for Arabic ARLSTem2 stemmer
- Many fixes and improvements to lm language model package
- Fix bug in nltk.metrics.aline, C_skip = -10
- Improvements to TweetTokenizer
- Optional show arg for FreqDist.plot, ConditionalFreqDist.plot
- edit_distance now computes Damerau-Levenshtein edit-distance
- Update to 3.6.2
- move test code to nltk/test
- fix bug in NgramAssocMeasures (order preserving fix)
- Update to 3.6
- add support for Python 3.9
- add Tree.fromlist
- compute Minimum Spanning Tree of unweighted graph using BFS
- fix bug with infinite loop in Wordnet closure and tree
- fix bug in calculating BLEU using smoothing method 4
- Wordnet synset similarities work for all pos
- new Arabic light stemmer (ARLSTem2)
- new syllable tokenizer (LegalitySyllableTokenizer)
- remove nose in favor of pytest
* Thu Apr 23 2020 John Vandenberg <jayvdb@gmail.com>
- Update to v3.5
* add support for Python 3.8
* drop support for Python 2
* create NLTK's own Tokenizer class distinct from the Treebank
reference tokeniser
* update Vader sentiment analyser
* fix JSON serialization of some PoS taggers
* minor improvements in grammar.CFG, Vader, pl196x corpus reader,
StringTokenizer
* change implementation of <= and >= for FreqDist so they are partial
orders
* make FreqDist iterable
* correctly handle Penn Treebank trees with an unlabeled branching
top node
Version: 3.4.5-bp151.4.3.1
* Sat Mar 14 2020 Tomáš Chvátal <tchvatal@suse.com>
- Fix build without python2
* Mon Oct 14 2019 Matej Cepl <mcepl@suse.com>
- Replace %fdupes -s with plain %fdupes; hardlinks are better.
* Wed Sep 11 2019 Tomáš Chvátal <tchvatal@suse.com>
- Update to 3.4.5 (bsc#1146427, CVE-2019-14751):
* Fixed security bug in downloader: Zip slip vulnerability - for the
unlikely situation where a user configures their downloader to use
a compromised server CVE-2019-14751
* Tue Jul 23 2019 Tomáš Chvátal <tchvatal@suse.com>
- Update to 3.4.4:
* fix bug in plot function (probability.py)
* add improved PanLex Swadesh corpus reader
* add Text.generate()
* add QuadgramAssocMeasures
* add SSP to tokenizers
* return confidence of best tag from AveragedPerceptron
* make plot methods return Axes objects
* don't require list arguments to PositiveNaiveBayesClassifier.train
* fix Tree classes to work with native Python copy library
* fix inconsistency for NomBank
* fix random seeding in LanguageModel.generate
* fix ConditionalFreqDist mutation on tabulate/plot call
* fix broken links in documentation
* fix misc Wordnet issues
* update installation instructions
* Thu May 23 2019 pgajdos@suse.com
- version update to 3.4.1
* add chomsky_normal_form for CFGs
* add meteor score
* add minimum edit/Levenshtein distance based alignment function
* allow access to collocation list via text.collocation_list()
* support corenlp server options
* drop support for Python 3.4
* other minor fixes
* Sun Feb 10 2019 John Vandenberg <jayvdb@gmail.com>
- Remove Python 3 dependency on singledispatch
* Sat Feb 09 2019 John Vandenberg <jayvdb@gmail.com>
- Update to v3.4
+ Support Python 3.7
+ New Language Modeling package
+ Cistem Stemmer for German
+ Support Russian National Corpus incl POS tag model
+ Krippendorff's Alpha inter-rater reliability test
+ Comprehensive code clean-ups
+ Switch continuous integration from Jenkins to Travis
- from v3.3
+ Support Python 3.6
+ New interface to CoreNLP
+ Support synset retrieval by sense key
+ Minor fixes to CoNLL Corpus Reader
+ AlignedSent
+ Fixed minor inconsistencies in APIs and API documentation
+ Better conformance to PEP8
+ Drop Moses Tokenizer (incompatible license)
* Wed Feb 06 2019 John Vandenberg <jayvdb@gmail.com>
- Add missing dependency six
- Remove unnecessary build dependency six
- Recommend all optional dependencies
Version: 3.2.5-bp150.2.3
* Tue Mar 06 2018 jengelh@inai.de
- Trim redundant wording from description.
* Mon Mar 05 2018 badshah400@gmail.com
- Use \%license instead of \%doc to install License.txt.
* Tue Jan 30 2018 guigo.lourenco@gmail.com
- Depend on the full python interpreter to fix sqlite3 import
during %check
* Tue Jan 16 2018 guigo.lourenco@gmail.com
- Depend on python-rpm-macros
- Build for both Python2 and Python3
* Tue Dec 19 2017 badshah400@gmail.com
- Update to version 3.2.5:
* Arabic stemmers (ARLSTem, Snowball)
* NIST MT evaluation metric and added NIST
international_tokenize
* Moses tokenizer
* Document Russian tagger
* Fix to Stanford segmenter
* Improve treebank detokenizer, VerbNet, Vader
* Misc code and documentation cleanups
* Implement fixes suggested by LGTM
- Convert specfile to python single-spec style.
- Drop unneeded BuildRequires: python-PyYAML, python-xml,
python-devel; not required for building.
- Change existing Requires to Recommends: these are really needed
for additional features, and not required for basic nltk usage.
- Add new Recommends: python-scipy, python-matplotlib,
python-pyparsing, and python-gensim; enables other optional
features.
- Run fdupes to link-up duplicate files.
- Remove exec permissions for a file not intended to be executed
(not in exec path, no hashbang, etc.)
- Remove hashbangs from non-executable files.
- Run tests following the suggestion from
http://www.nltk.org/install.html.
* Tue Feb 21 2017 stephan.barth@suse.com
- update to version 3.2.2
Upstream changelog:
Support for Aline, ChrF and GLEU MT evaluation metrics, Russian POS tagger
model, Moses detokenizer, rewrite Porter Stemmer and FrameNet corpus reader,
update FrameNet Corpus to version 1.7, fixes: stanford_segmenter.py,
SentiText, CoNLL Corpus Reader, BLEU, naivebayes, Krippendorff’s alpha,
Punkt, Moses tokenizer, TweetTokenizer, ToktokTokenizer; improvements to
testing framework
* Fri Oct 14 2016 toddrme2178@gmail.com
- Update to version 3.2.1
+ No changelog available
* Thu May 21 2015 toddrme2178@gmail.com
- Remove upstreamed nltk-2.0.4-dont-use-python-distribute.patch
- Update to version 3.0.2
+ No changelog available
* Sun Dec 08 2013 p.drouand@gmail.com
- Update to version 2.0.4
+ No changelog available
- Add nltk-2.0.4-dont-use-python-distribute.patch; force use of
python-setuptools instead of python-distribute
* Thu Oct 24 2013 speilicke@suse.com
- Require python-setuptools instead of distribute (upstreams merged)
* Fri Sep 23 2011 saschpe@suse.de
- Update to version 2.0.1rc1
* Mon Feb 08 2010 oddrationale@gmail.com
- fixed copyright and license statements
- removed PyYAML, and added dependency to installers and download
instructions
- updates to LogicParser, DRT (Dan Garrette)
- WordNet similarity metrics return None instead of -1 when
they fail to find a path (Steve Bethard)
- shortest_path_distance uses instance hypernyms (Jordan
Boyd-Graber)
- clean_html improved (Bjorn Maeland)
- batch_parse, batch_interpret and batch_evaluate functions allow
grammar or grammar filename as argument
- more Portuguese examples (portuguese_en.doctest, examples/pt.py)
* Thu Dec 10 2009 oddrationale@gmail.com
- added python-nltk-remove-yaml.patch to prevent conflict with
python-yaml
- added Requires: python-yaml
* Wed Dec 09 2009 oddrationale@gmail.com
- Initial Release (Version 2.0b7): Sun Feb 7 18:50:18 CST 2010