Strict internationalized domain names (IDN) validation #13

Open
marmeladema opened this issue Sep 14, 2021 · 4 comments

@marmeladema

Hello!

First allow me to thank you for your work 👍 That crate has been really useful and very simple to use!

I am not exactly sure if it's actually a goal of the crate but I figured I might ask.
Should internationalized domain names be properly validated?
I was looking at test cases from https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/master/tests/draft7/optional/format/idn-hostname.json and it seems that some domain names are accepted whereas they should probably be rejected.

A few examples:

  • 〮실례.테스트 should be rejected because it contains a forbidden leading combining character
$ dig 〮실례.테스트
dig: '〮실례.테스트' is not a legal IDNA2008 name (string contains a forbidden leading combining character), use +noidnin
  • 실〮례.테스트 should be rejected because it contains a disallowed character
$ dig 실〮례.테스트
dig: '실〮례.테스트' is not a legal IDNA2008 name (string contains a disallowed character), use +noidnin
  • xn--X should be rejected because it contains invalid punycode data
$ dig xn--X
dig: 'xn--X' is not a legal IDNA2008 name (string contains invalid punycode data), use +noidnin
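
To illustrate the third case, here is a minimal sketch of a Punycode well-formedness check that follows the decoding procedure of RFC 3492 without building the decoded string, so it needs no heap allocation. The function names (`adapt`, `digit_value`, `is_well_formed_punycode`) are hypothetical and not part of this crate or of idna. The sketch expects the label with its `xn--` prefix already stripped, and it checks only that the decoder would succeed; it does not apply any other IDNA2008 rule.

```rust
// Punycode parameters from RFC 3492, section 5.
const BASE: u32 = 36;
const T_MIN: u32 = 1;
const T_MAX: u32 = 26;
const SKEW: u32 = 38;
const DAMP: u32 = 700;
const INITIAL_BIAS: u32 = 72;
const INITIAL_N: u32 = 128;

/// Bias adaptation function from RFC 3492, section 6.1.
fn adapt(delta: u32, num_points: u32, first_time: bool) -> u32 {
    let mut delta = if first_time { delta / DAMP } else { delta / 2 };
    delta += delta / num_points;
    let mut k = 0;
    while delta > ((BASE - T_MIN) * T_MAX) / 2 {
        delta /= BASE - T_MIN;
        k += BASE;
    }
    k + ((BASE - T_MIN + 1) * delta) / (delta + SKEW)
}

/// Maps a Punycode digit to its value: a-z / A-Z are 0..=25, 0-9 are 26..=35.
fn digit_value(b: u8) -> Option<u32> {
    match b {
        b'a'..=b'z' => Some((b - b'a') as u32),
        b'A'..=b'Z' => Some((b - b'A') as u32),
        b'0'..=b'9' => Some((b - b'0') as u32 + 26),
        _ => None,
    }
}

/// Hypothetical helper: returns true if `input` (the label without its
/// "xn--" prefix) would be accepted by an RFC 3492 decoder.
fn is_well_formed_punycode(input: &str) -> bool {
    let bytes = input.as_bytes();
    // Everything before the last '-' is copied verbatim; the delimiter is
    // only consumed when at least one basic code point precedes it.
    let (basic_len, mut pos) = match bytes.iter().rposition(|&b| b == b'-') {
        Some(idx) if idx > 0 => (idx, idx + 1),
        _ => (0, 0),
    };
    if !bytes[..basic_len].is_ascii() {
        return false;
    }
    let mut out_len = basic_len as u32; // number of code points decoded so far
    let (mut n, mut i, mut bias) = (INITIAL_N, 0u32, INITIAL_BIAS);
    while pos < bytes.len() {
        let old_i = i;
        let (mut w, mut k) = (1u32, BASE);
        loop {
            // Running out of input in the middle of an integer is an error.
            let digit = match bytes.get(pos).and_then(|&b| digit_value(b)) {
                Some(d) => d,
                None => return false,
            };
            pos += 1;
            i = match digit.checked_mul(w).and_then(|d| i.checked_add(d)) {
                Some(v) => v,
                None => return false, // overflow
            };
            let t = if k <= bias {
                T_MIN
            } else if k >= bias + T_MAX {
                T_MAX
            } else {
                k - bias
            };
            if digit < t {
                break;
            }
            w = match w.checked_mul(BASE - t) {
                Some(v) => v,
                None => return false, // overflow
            };
            k += BASE;
        }
        out_len += 1;
        bias = adapt(i - old_i, out_len, old_i == 0);
        n = match n.checked_add(i / out_len) {
            Some(v) => v,
            None => return false, // overflow
        };
        // The decoded value must be a Unicode scalar value.
        if char::from_u32(n).is_none() {
            return false;
        }
        i = i % out_len + 1;
    }
    true
}
```

With this sketch, the payload `X` from `xn--X` is rejected because the input ends in the middle of a variable-length integer, while `bcher-kva` (from `xn--bcher-kva`, i.e. bücher) is accepted.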

What do you think? Could the crate be enhanced to provide such domain validation? If not, do you recommend some alternatives?

Thank you for taking the time to read this.

@rushmorem
Collaborator

Hello :)

> First allow me to thank you for your work

It's my pleasure 🙂

> That crate has been really useful and very simple to use!

I'm glad to hear that. Thank you for the feedback!

> I am not exactly sure if it's actually a goal of the crate but I figured I might ask. Should internationalized domain names be properly validated?

Yes, absolutely!

> ...it seems that some domain names are accepted whereas they should probably be rejected.

I thought that using this crate in conjunction with the idna crate would be able to cover all the cases. Turns out I was wrong. Thanks for bringing this to my attention. I have added those tests to this crate's integration tests and added this issue to the README.

@marmeladema
Author

Yes, unfortunately idna does not seem to be fully compliant either. I hesitated to open an issue there too, but it doesn't seem to be maintained much nowadays.
Moreover, I looked at the implementation of idna, and it's really tailored for converting an input string into either an ASCII or a Unicode version of the domain, not for parsing and validation.
Ideally I'd want a heap-allocation-free validator that is fully compliant with IDNA, but I haven't been able to find one.
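
As a rough sketch of what an allocation-free check could look like for the first two test cases, the code below walks each label with iterators only. Everything in it is an assumption for illustration: the names `LabelError`, `is_combining_mark`, `is_disallowed`, and `check_domain` are hypothetical, and the two character tables are tiny hand-picked subsets (a full validator would need the complete Unicode general-category data and the complete derived property tables of RFC 5892).

```rust
#[derive(Debug, PartialEq)]
enum LabelError {
    LeadingCombiningMark,
    DisallowedCharacter,
}

/// Illustrative subset only: Combining Diacritical Marks (U+0300..=U+036F)
/// and the ideographic/Hangul tone marks (U+302A..=U+302F).
fn is_combining_mark(c: char) -> bool {
    matches!(c, '\u{0300}'..='\u{036F}' | '\u{302A}'..='\u{302F}')
}

/// Illustrative subset only: the two Hangul tone marks listed as DISALLOWED
/// among the exceptions in RFC 5892, section 2.6.
fn is_disallowed(c: char) -> bool {
    matches!(c, '\u{302E}' | '\u{302F}')
}

/// Checks every dot-separated label without allocating: `split` and `chars`
/// are lazy iterators over the borrowed input.
fn check_domain(domain: &str) -> Result<(), LabelError> {
    for label in domain.split('.') {
        // A label must not begin with a combining mark (RFC 5891, 4.2.3.2).
        if label.chars().next().map_or(false, is_combining_mark) {
            return Err(LabelError::LeadingCombiningMark);
        }
        if label.chars().any(is_disallowed) {
            return Err(LabelError::DisallowedCharacter);
        }
    }
    Ok(())
}
```

Under these assumptions, `check_domain("\u{302E}실례.테스트")` reports a leading combining mark and `check_domain("실\u{302E}례.테스트")` reports a disallowed character, matching the two dig diagnostics quoted in the issue, while `check_domain("실례.테스트")` passes.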

@L020Isry8fuLjSL7r0Gmxw

Have you seen the stringprep crate? It claims to implement parsing and validation of IDN names as defined by RFC 3491.

@marmeladema
Author

I tried it, but it fails on the second test case of the file I mentioned, so it doesn't seem compliant either.
