The 'Lingua::EN::Sentence' module provides the function get_sentences, which splits text into its constituent sentences using a regular expression and a list of abbreviations (both built-in and user-supplied).
Certain well-known exceptions, such as abbreviations, may cause incorrect segmentation. Many of them are already built into the module and handled correctly. If you find words that still cause get_sentences to split in the wrong place, you can add them to the module's abbreviation list so that it recognises them. Note that abbreviations are case-sensitive, so 'Mrs.' is recognised but 'mrs.' is not.
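The idea described above can be illustrated with a minimal Python sketch. This is not the module's Perl implementation or its API: the function name `split_sentences`, the `extra_abbreviations` parameter, and the short abbreviation list are all hypothetical and chosen only to show how a case-sensitive abbreviation list keeps a splitter from breaking after 'Mrs.'.

```python
import re

# Hypothetical abbreviation list for illustration only;
# the real module ships its own, much longer built-in list.
ABBREVIATIONS = {"Mr", "Mrs", "Dr", "Prof"}


def split_sentences(text, extra_abbreviations=()):
    """Split after '.', '!' or '?' followed by whitespace, unless the
    period ends a known abbreviation (matched case-sensitively)."""
    known = ABBREVIATIONS | set(extra_abbreviations)
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        # Look at the word immediately before the punctuation mark.
        words = text[start:match.start()].split()
        last_word = words[-1] if words else ""
        if text[match.start()] == "." and last_word in known:
            continue  # period belongs to an abbreviation, keep going
        sentences.append(text[start:match.start() + 1].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = "Mrs. Smith arrived. She met mrs. Jones."
print(split_sentences(text))
print(split_sentences(text, extra_abbreviations=["mrs"]))
```

In this sketch the first call splits wrongly after the lowercase 'mrs.' because only 'Mrs' is in the list, while the second call, which supplies 'mrs' as an extra abbreviation, keeps 'She met mrs. Jones.' together.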
Package Version | Update ID | Released | Package Hub Version | Platforms | Subpackages |
---|---|---|---|---|---|
0.34-bp156.1.1 | GA Release | 2023-09-07 | 15 SP6 | | |
0.33-bp155.1.4 | GA Release | 2023-05-17 | 15 SP5 | | |
0.31-bp154.1.18 | GA Release | 2022-05-09 | 15 SP4 | | |
0.30-bp153.1.14 | GA Release | 2021-03-06 | 15 SP3 | | |
0.30-bp152.3.13 | GA Release | 2020-04-16 | 15 SP2 | | |
0.30-bp151.3.1 | GA Release | 2019-07-17 | 15 SP1 | | |
0.30-bp151.2.12 | GA Release | 2019-05-18 | 15 SP1 | | |
0.30-bp150.2.4 | GA Release | 2018-07-30 | 15 | | |