Improve single language detection when words in other languages are quoted #112
Labels: enhancement (New feature or request)
Comments
pemistahl changed the title from "Increase single language detection when words in other languages are quoted" to "Improve single language detection when words in other languages are quoted" on Jan 19, 2023
Thanks for reaching out to me. I will try to improve language detection for inputs like yours, even though it's not a trivial problem to solve.
@pemistahl If you could point me to the relevant area of the code, I could look at a few options and test adding this feature.
When I pass in German sentences that contain quoted Japanese words, lingua sometimes claims that the text is 100% Japanese.
For example:
Wir stoßen an: "かんぱい". Er lächelte.
(In English, if you are interested: »We toasted: "kanpai". He smiled«) This leads to a ConfidenceValue of 1.0 for Japanese, while
Wir stoßen an. Er lächelte.
has a ConfidenceValue of 0.6014287047855706 for German and 0.0 for Japanese (I included all languages for detection).
The expected result in both cases should be German, perhaps with a slight Japanese confidence in the first case since a Japanese word is quoted, but it should not be 100% Japanese.
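Until the detection itself handles this case, one possible workaround is to strip quoted spans from the input before passing it to the detector. The following is only a minimal sketch under stated assumptions: `strip_quoted` is a hypothetical helper, not part of lingua, and the set of quote pairs it recognizes is an illustrative guess rather than a complete list.

```python
import re

# Quote pairs to strip. This list is an assumption made for illustration
# and is not exhaustive: straight quotes, curly quotes, German low-high
# quotes, both guillemet orientations, and Japanese corner brackets.
QUOTED = re.compile(
    r'"[^"]*"'
    r'|“[^”]*”'
    r'|„[^“”]*[“”]'
    r'|»[^«]*«'
    r'|«[^»]*»'
    r'|「[^」]*」'
)


def strip_quoted(text: str) -> str:
    """Hypothetical helper: drop quoted spans, then collapse whitespace."""
    without_quotes = QUOTED.sub(" ", text)
    return re.sub(r"\s+", " ", without_quotes).strip()


if __name__ == "__main__":
    print(strip_quoted('Wir stoßen an: "かんぱい". Er lächelte.'))
```

For the example above, this leaves `Wir stoßen an: . Er lächelte.`, which contains only German words and can then be passed to whichever detection call is in use. The trade-off is that text consisting mostly of quotations loses most of its content, so this is only suitable when quotes make up a small part of the input.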