@vstinner (Member) commented Nov 12, 2025

  • Copy functions from unicodeobject.c:

    • _PyUnicode_UTF8()
    • PyUnicode_UTF8(), PyUnicode_SET_UTF8()
    • PyUnicode_UTF8_LENGTH(), PyUnicode_SET_UTF8_LENGTH()
    • get_latin1_char()
    • findchar()
  • Share code with unicodeobject.c:

    • _PyUnicode_FromUCS1()
    • _PyUnicode_FiniEncodings()
    • _PyUnicode_TranslateCharmap()
    • _Py_EncodingMapType
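
The two groups above suggest two different ways of making a helper available to a new source file. The following is a minimal sketch of that distinction, not CPython's actual code: it assumes "copy" means duplicating a small `static` helper in each `.c` file and "share" means giving a function external linkage with a single declaration in an internal header. The names `sketch_findchar()` and `sketch_count_non_ascii()` are hypothetical and much simpler than the real functions listed above.

```c
#include <stddef.h>

/* Strategy 1 ("copy"): a small static helper. Each .c file that needs it
 * carries its own private copy, so no new symbol is exported.
 * Hypothetical, simplified helper: returns the index of the first byte
 * equal to ch, or -1 if it is not found. */
static ptrdiff_t
sketch_findchar(const unsigned char *s, ptrdiff_t size, unsigned char ch)
{
    for (ptrdiff_t i = 0; i < size; i++) {
        if (s[i] == ch) {
            return i;
        }
    }
    return -1;
}

/* Strategy 2 ("share"): one definition with external linkage. In a real
 * project this declaration would live in an internal header included by
 * every file that calls the function (assumption for illustration). */
size_t sketch_count_non_ascii(const unsigned char *s, size_t size);

/* The single shared definition: counts bytes with the high bit set. */
size_t
sketch_count_non_ascii(const unsigned char *s, size_t size)
{
    size_t count = 0;
    for (size_t i = 0; i < size; i++) {
        if (s[i] >= 0x80) {
            count++;
        }
    }
    return count;
}
```

Copying suits short helpers where a duplicate is cheaper than a new cross-file symbol. Sharing suits larger functions and types, where a second definition would be costly to keep in sync.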

@vstinner (Member, Author) commented:

Line count:

$ wc -l Objects/unicode_codecs.c Objects/unicodeobject.c 
  6671 Objects/unicode_codecs.c
  8541 Objects/unicodeobject.c
 15212 total

In PR gh-139354, I created 3 files for codecs:

  • Objects/unicode_convert.c (835 lines)
  • Objects/unicode_codecs_utf.c (2,171 lines)
  • Objects/unicode_codecs.c (3,239 lines)
