jrflat commented Dec 6, 2025

Improves IDNA encoding performance by converting UTF8 to UTF16 in Swift before calling the UTF16-native ICU functions. Improves compatibility by allowing specific UIDNA errors during nameToASCII.

Motivation:

Performance: ICU uidna_nameToASCII_UTF8 and uidna_nameToUnicodeUTF8 are just convenience wrappers around the UTF16-native functions. Performing the conversions to and from UTF16 ourselves in Swift is faster than having ICU do it for us, and we can also use the fact that nameToASCII produces ASCII on success to efficiently truncate the returned UInt16 buffer.
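A minimal sketch of the transcoding steps described above, using only the Swift standard library. The helper names (`utf16Buffer`, `asciiString`, `unicodeString`) are hypothetical and are not taken from this PR, and the ICU call itself is replaced by a hard-coded stand-in so the snippet is self-contained:

```swift
// Hypothetical helpers for illustration only; not the code from this PR.

/// Transcodes a String (stored as UTF-8) into UTF-16 code units, as one
/// would before handing the buffer to a UTF-16-native C function.
func utf16Buffer(from host: String) -> [UInt16] {
    Array(host.utf16)
}

/// Narrows UTF-16 code units that are expected to be ASCII into a String.
/// Returns nil if any code unit falls outside the ASCII range.
func asciiString(fromUTF16 units: [UInt16]) -> String? {
    var bytes = [UInt8]()
    bytes.reserveCapacity(units.count)
    for unit in units {
        guard unit < 0x80 else { return nil }
        bytes.append(UInt8(truncatingIfNeeded: unit))
    }
    return String(decoding: bytes, as: UTF8.self)
}

/// Builds a String from UTF-16 code units; the String initializer performs
/// the UTF-16 to UTF-8 transcoding.
func unicodeString(fromUTF16 units: [UInt16]) -> String {
    String(decoding: units, as: UTF16.self)
}

let input = utf16Buffer(from: "bücher.example")
// Stand-in for the buffer a nameToASCII call would fill in (assumption:
// no ICU call is made in this sketch).
let encoded = Array("xn--bcher-kva.example".utf16)
print(asciiString(fromUTF16: encoded) ?? "not ASCII")
print(unicodeString(fromUTF16: input))
```

The ASCII check in `asciiString` is a safety net for the sketch; the PR relies on nameToASCII producing ASCII on success, so the real code may skip it.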

Compatibility: Resolves #1560. As the issue describes, URL handles ASCII hosts and IDNA-encoded hosts inconsistently. For instance, if an IDNA-encoded host has a domain label longer than 63 bytes, or the entire domain is longer than 255 bytes, URL previously returned nil because the uidna functions reported the corresponding non-fatal errors. Hosts that do not need IDNA encoding are not subject to this limitation. Allowing these errors also aligns our behavior with Safari and other WHATWG URL parsers.

Modifications:

  • In cases where we would previously call the UTF8 uidna functions, this PR instead performs the UTF8 to UTF16 transcoding in Swift before passing the UTF16 buffer to the ICU function. On nameToASCII success, we truncate the returned UInt16 ASCII elements to UInt8 and initialize the resulting String. On nameToUnicode success, we create the String from UTF16, which performs the UTF16 to UTF8 transcoding.

  • Use the same allowed errors for nameToASCII that are currently allowed for nameToUnicode: UIDNA_ERROR_EMPTY_LABEL | UIDNA_ERROR_LABEL_TOO_LONG | UIDNA_ERROR_DOMAIN_NAME_TOO_LONG | UIDNA_ERROR_LEADING_HYPHEN | UIDNA_ERROR_TRAILING_HYPHEN | UIDNA_ERROR_HYPHEN_3_4
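The allowed-error check in the second bullet can be sketched as a bit-mask test. This is an illustration under assumptions, not the PR's implementation: `IDNAErrors` and `isAcceptable` are hypothetical names, and the raw values are meant to mirror the `UIDNA_ERROR_*` bit flags in ICU's `uidna.h`, so verify them against the header before relying on them.

```swift
// Hypothetical mirror of ICU's UIDNA_ERROR_* bit flags (values assumed to
// match uidna.h; real code would use the imported C constants instead).
struct IDNAErrors: OptionSet {
    let rawValue: UInt32

    static let emptyLabel        = IDNAErrors(rawValue: 0x01)
    static let labelTooLong      = IDNAErrors(rawValue: 0x02)
    static let domainNameTooLong = IDNAErrors(rawValue: 0x04)
    static let leadingHyphen     = IDNAErrors(rawValue: 0x08)
    static let trailingHyphen    = IDNAErrors(rawValue: 0x10)
    static let hyphen3And4       = IDNAErrors(rawValue: 0x20)
    static let disallowed        = IDNAErrors(rawValue: 0x80)

    /// The errors this PR tolerates for both nameToASCII and nameToUnicode.
    static let allowed: IDNAErrors = [
        .emptyLabel, .labelTooLong, .domainNameTooLong,
        .leadingHyphen, .trailingHyphen, .hyphen3And4,
    ]
}

/// Returns true when every reported error bit is in the allowed set.
func isAcceptable(_ errors: IDNAErrors) -> Bool {
    errors.subtracting(.allowed).isEmpty
}

print(isAcceptable([.labelTooLong, .domainNameTooLong]))
print(isAcceptable([.labelTooLong, .disallowed]))
```

Masking out the allowed bits and checking for an empty remainder means any error outside the list still causes the conversion to be rejected.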

Result:

[Chart: IDNA Encoding Performance]

  • ~10% speedup for IDNA encoding and decoding.

  • Compatibility with WHATWG parsers/browsers and consistent behavior for IDN and non-IDN host name lengths.

Testing:

  • Added benchmarks for IDNA encoding and decoding.

  • Added unit test for allowed nameToASCII errors.

jrflat commented Dec 6, 2025

@swift-ci please test

Development

Successfully merging this pull request may close these issues.

Inconsistent handling of IDN vs non-IDN hosts in URL()