cyx2015s

Added better wrapping for mixed-language text.

  1. A line can break at any Chinese character.
  2. Head-prohibited punctuation (characters that must not start a line) overflows the wrap boundary instead of appearing at the start of the next line.
  3. Tail-prohibited punctuation (characters that must not end a line) never appears at the end of a line.
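The three rules above can be sketched as a single "may a line break go here?" check. This is a minimal illustration, not the code in this PR: the punctuation sets are deliberately tiny and hypothetical, `isCjk` reuses the range check listed under TODO as the current heuristic, and the names `canBreakBefore` and `breakPoints` are made up for the example. The overflow behavior of rule 2 is not modeled; the sketch only refuses to break before such a character.

```typescript
// Hypothetical, deliberately small sets for illustration only.
const HEAD_PROHIBITED = new Set(",。、;:?!)》」』】".split("")); // must not start a line
const TAIL_PROHIBITED = new Set("(《「『【".split(""));          // must not end a line

// The range check the PR description gives as its current heuristic.
function isCjk(ch: string): boolean {
  const c = ch.charCodeAt(0);
  return 0x3003 <= c && c <= 0xffff;
}

// True if a line break may be placed between text[i - 1] and text[i].
function canBreakBefore(text: string, i: number): boolean {
  if (i <= 0 || i >= text.length) return false;
  const prev = text[i - 1];
  const next = text[i];
  if (HEAD_PROHIBITED.has(next)) return false; // rule 2: never start a line with it
  if (TAIL_PROHIBITED.has(prev)) return false; // rule 3: never end a line with it
  if (next === " ") return false;              // keep spaces attached to the line end
  if (prev === " ") return true;               // ordinary break after a space
  return isCjk(prev) || isCjk(next);           // rule 1: break next to any CJK character
}

// All indices at which a break is allowed.
function breakPoints(text: string): number[] {
  const out: number[] = [];
  for (let i = 1; i < text.length; i++) {
    if (canBreakBefore(text, i)) out.push(i);
  }
  return out;
}
```

For the string "说(明)。好" this allows a break only before "(" and before "好": the opening bracket cannot end a line, and neither the closing bracket nor the full stop can start one.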

Notice how "PR 的同时" in the second paragraph breaks in each of the two videos.

Demo text used: The Chinese translation of CONTRIBUTING.md in this repo and 离骚

After.mp4
Before.mp4

TODO:

  1. Find a better way to detect CJK characters. Currently I use 0x3003 <= c && c <= 0xFFFF.
  2. I don't speak Japanese or Korean, so I don't know their punctuation. Add support for more punctuation.
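One possible direction for the first TODO item is a table of Unicode blocks instead of a single wide range. The sketch below is an assumption about what "CJK" should cover, not a decision made in this PR; the block boundaries are standard Unicode block ranges, and the names `CJK_RANGES` and `isCjkCodePoint` are hypothetical.

```typescript
// Inclusive [first, last] code point ranges of some BMP blocks commonly treated as CJK.
// Which blocks belong in this list is a judgment call; this selection is illustrative.
const CJK_RANGES: ReadonlyArray<readonly [number, number]> = [
  [0x3000, 0x303f], // CJK Symbols and Punctuation
  [0x3040, 0x309f], // Hiragana
  [0x30a0, 0x30ff], // Katakana
  [0x3400, 0x4dbf], // CJK Unified Ideographs Extension A
  [0x4e00, 0x9fff], // CJK Unified Ideographs
  [0xac00, 0xd7af], // Hangul Syllables
  [0xf900, 0xfaff], // CJK Compatibility Ideographs
  [0xff00, 0xffef], // Halfwidth and Fullwidth Forms
];

function isCjkCodePoint(c: number): boolean {
  return CJK_RANGES.some(([lo, hi]) => lo <= c && c <= hi);
}

// The single-range heuristic quoted in the TODO, kept for comparison.
function isCjkWideRange(c: number): boolean {
  return 0x3003 <= c && c <= 0xffff;
}
```

The two checks differ at the edges. U+FB01 (the Latin "fi" ligature) passes the wide range but not the table, while U+3002 (the ideographic full stop) passes the table but falls just below the wide range. Note that the Halfwidth and Fullwidth Forms block also contains fullwidth Latin letters, so a real implementation may want to narrow it.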

Needs feedback from real use cases; so far it has only been tested with a few examples.

@heroboy
Contributor

heroboy commented Jul 28, 2025

In the VSCode codebase, this list is more complete.

wordWrapBreakAfterCharacters: register(new EditorStringOption(
    EditorOption.wordWrapBreakAfterCharacters, 'wordWrapBreakAfterCharacters',
    // allow-any-unicode-next-line
    ' \t})]?|/&.,;¢°′″‰℃、。。、¢,.:;?!%・・ゝゞヽヾーァィゥェォッャュョヮヵヶぁぃぅぇぉっゃゅょゎゕゖㇰㇱㇲㇳㇴㇵㇶㇷㇸㇹㇺㇻㇼㇽㇾㇿ々〻ァィゥェォャュョッー”〉》」』】〕)]}」',
)),
wordWrapBreakBeforeCharacters: register(new EditorStringOption(
    EditorOption.wordWrapBreakBeforeCharacters, 'wordWrapBreakBeforeCharacters',
    // allow-any-unicode-next-line
    '([{‘“〈《「『【〔([{「£¥$£¥++'
)),

@cyx2015s
Author

Thank you. That list is longer than I expected, and some of the characters are too wide for the overflow strategy. I need to come up with a more robust method rather than patching the current one. Stay tuned.

@cyx2015s
Author

cyx2015s commented Jul 30, 2025

Demo: added a text input to the word wrapping section so that wrapping can be tested without recompiling the code.

The current code is unoptimized, still buggy, and seems to report incorrect line widths (see Demo -> Text -> Word Wrapping; the boundary takes the space into account).

@cyx2015s
Author

I found why my behavior is different: my algorithm places wrap candidates just before each word (after the preceding space), while the old algorithm places them at the end of each word. The rendered result is the same.

Wrap candidates before this change (old algorithm):

This| is| a| sentence.|

Wrap candidates after this change (new algorithm):

This |is |a |sentence.|
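The two candidate placements can be reproduced with a small sketch. It is only an illustration of the difference described in this comment, not the PR's implementation; the function names are hypothetical, and only the ASCII space is treated as a separator.

```typescript
// Old placement as described above: a candidate at the end of each word.
function candidatesAtWordEnd(text: string): number[] {
  const out: number[] = [];
  for (let i = 1; i <= text.length; i++) {
    const endsWord = text[i - 1] !== " " && (i === text.length || text[i] === " ");
    if (endsWord) out.push(i);
  }
  return out;
}

// New placement as described above: a candidate just before each word
// (after the preceding space), plus the end of the text.
function candidatesBeforeWord(text: string): number[] {
  const out: number[] = [];
  for (let i = 1; i <= text.length; i++) {
    const startsWord = i < text.length && text[i - 1] === " " && text[i] !== " ";
    if (startsWord || i === text.length) out.push(i);
  }
  return out;
}

// Render the candidates as '|' markers inside the text.
function mark(text: string, candidates: number[]): string {
  let out = "";
  let prev = 0;
  for (const c of candidates) {
    out += text.slice(prev, c) + "|";
    prev = c;
  }
  return out + text.slice(prev);
}

const sample = "This is a sentence.";
console.log(mark(sample, candidatesAtWordEnd(sample)));  // This| is| a| sentence.|
console.log(mark(sample, candidatesBeforeWord(sample))); // This |is |a |sentence.|
```

The only difference is which side of the space the candidate falls on, so the visible lines match as long as whitespace at a line boundary is not drawn. That difference could still matter wherever the trailing space is counted toward a line's measured width.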

@cyx2015s
Author

Also fixed issues #8503 and #8139, and implemented pull request #8439.
