Closed Bug 1854031 Opened 2 years ago Closed 2 years ago

Upgrade segmenter to ICU4X 1.4

Categories

(Core :: Internationalization, task)

task

Tracking

()

RESOLVED FIXED
122 Branch
Tracking Status
firefox122 --- fixed

People

(Reporter: TYLin, Assigned: m_kato)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(9 files, 1 obsolete file)


ICU4X 1.3 is scheduled to be released soon, and it fixes bugs in Segmenter. We should upgrade to it once it is released.

Summary: Update segmenter to ICU4X 1.3 → Upgrade segmenter to ICU4X 1.3
Blocks: 1854032

Note that when ICU4X 1.3 is used with the compiled_data feature, https://github.com/rust-lang/cargo/issues/10801 occurs. This means that ./mach vendor rust copies unnecessary crates.

(In reply to Ting-Yu Lin [:TYLin] (UTC-8) from comment #0)

ICU4X 1.3 is scheduled to be released soon, and it fixes bugs in Segmenter. We should upgrade to it once it is released.

FYI. https://crates.io/crates/icu_segmenter/1.3.2

(In reply to Makoto Kato [:m_kato] from comment #1)

Note that when ICU4X 1.3 is used with the compiled_data feature, https://github.com/rust-lang/cargo/issues/10801 occurs. This means that ./mach vendor rust copies unnecessary crates.

Yeah, introducing unnecessary crates into gecko is a pain ... Do we have any workaround in gecko?

Also, I filed an issue: https://github.com/unicode-org/icu4x/issues/4109. This may be a bug in diplomat_runtime.

Additionally,

  • We don't need dictionary_wl_ext_v1, but it is required when the compiled_data feature is used.
  • The license has changed to Unicode License V3 (https://www.unicode.org/license.txt), so we have to update python/mozbuild/mozbuild/vendor/vendor_rust.py and get approval for this license.
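
To illustrate why a new license needs a change on the vendoring side, here is a minimal sketch of an allowlist check over SPDX-style license expressions. It is not the code in vendor_rust.py: the names ALLOWED_LICENSES and license_is_allowed are hypothetical, the set contents are made up for the example, and "Unicode-3.0" is used only as the SPDX identifier for Unicode License V3. How the ICU4X crates actually declare their license is not stated in this bug.

```python
import re

# Hypothetical allowlist for illustration only. The real list lives in
# python/mozbuild/mozbuild/vendor/vendor_rust.py and is not reproduced here.
ALLOWED_LICENSES = {"Apache-2.0", "MIT", "MPL-2.0", "Unicode-3.0"}


def license_is_allowed(expression, allowed=ALLOWED_LICENSES):
    """Return True if a simple SPDX-style expression is acceptable.

    "A OR B" is accepted when at least one alternative is allowed.
    "A AND B" is accepted only when every term is allowed.
    Nested parentheses are not handled; this is a sketch, not a parser.
    """
    alternatives = re.split(r"\s+OR\s+", expression.strip())
    for alternative in alternatives:
        terms = re.split(r"\s+AND\s+", alternative)
        if all(term.strip("() ") in allowed for term in terms):
            return True
    return False


if __name__ == "__main__":
    for expr in ("Unicode-3.0", "MIT OR GPL-3.0-only", "MIT AND GPL-3.0-only"):
        print(expr, "->", license_is_allowed(expr))
```

With a check of this shape, a crate whose license is missing from the allowlist fails vendoring until the list is updated, which is why the approval has to come first.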

So we cannot upgrade to 1.3.2 yet. We have to wait at least until the dictionary_wl_ext_v1 issue from comment #3 is fixed.

Although icu_capi uses weak dependency syntax, cargo vendor doesn't support
it, so this command will copy unnecessary crates. To avoid that, we use a
modified version of icu_capi.
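
As a rough illustration of the weak dependency syntax mentioned above, the sketch below resolves a Cargo-style [features] table written as a Python dict. In Cargo, "dep:name" and "name/feat" activate the optional dependency name, while the weak form "name?/feat" only enables feat if name is already activated elsewhere. The FEATURES table and both function names are hypothetical; the table is not icu_capi's real one.

```python
# Hypothetical [features] table, written as a dict. In Cargo.toml this would be:
#   default = ["compiled_data"]
#   compiled_data = ["icu_segmenter/compiled_data", "icu_calendar?/compiled_data"]
#   icu_calendar = ["dep:icu_calendar"]
FEATURES = {
    "default": ["compiled_data"],
    "compiled_data": ["icu_segmenter/compiled_data", "icu_calendar?/compiled_data"],
    "icu_calendar": ["dep:icu_calendar"],
}


def activated_deps(features, requested):
    """Return the optional dependencies activated by the requested features."""
    activated, seen, stack = set(), set(), list(requested)
    while stack:
        feature = stack.pop()
        if feature in seen:
            continue
        seen.add(feature)
        for item in features.get(feature, []):
            if item.startswith("dep:"):
                activated.add(item[len("dep:"):])
            elif "?/" in item:
                continue  # weak form: never activates the dependency by itself
            elif "/" in item:
                activated.add(item.split("/", 1)[0])
            else:
                stack.append(item)  # plain feature name: resolve recursively
    return activated


def mentioned_deps(features):
    """Return every dependency named anywhere in the table, weak or not."""
    names = set()
    for items in features.values():
        for item in items:
            if item.startswith("dep:"):
                names.add(item[len("dep:"):])
            elif "/" in item:
                names.add(item.split("/", 1)[0].rstrip("?"))
    return names


if __name__ == "__main__":
    needed = activated_deps(FEATURES, ["default"])
    print("needed:", sorted(needed))
    print("extra:", sorted(mentioned_deps(FEATURES) - needed))
```

The difference between the two sets corresponds to the kind of crates that end up vendored even though the build never uses them.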

Also, icu_capi's C++ headers aren't compatible with clang [*1].

Starting with ICU4X 1.3, there are new icu_*_data crates for custom data
files, instead of icu_testdata.

*1 https://github.com/llvm/llvm-project/issues/70162

Depends on D192900

Asking whether Unicode License V3 is allowed (https://phabricator.services.mozilla.com/D193036).

Depends on: 1806348
Attachment #9362253 - Attachment description: WIP: Bug 1854031 - Part 2. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin → WIP: Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin
Assignee: nobody → m_kato
Attachment #9362251 - Attachment description: WIP: Bug 1854031 - Part 1. Allow Unicode License V3 to vendor rust for ICU4X 1.3. r=sylvestre → WIP: Bug 1854031 - Part 2. Allow Unicode License V3 to vendor rust for ICU4X 1.3. r=sylvestre
Status: NEW → ASSIGNED

Done by running the update-icu4x.sh script.

Depends on D192900

Remove unnecessary patches in Cargo.toml

Depends on D193880

Depends on D193883

Depends on D193885

Summary: Upgrade segmenter to ICU4X 1.3 → Upgrade segmenter to ICU4X 1.4
Attachment #9362251 - Attachment description: WIP: Bug 1854031 - Part 2. Allow Unicode License V3 to vendor rust for ICU4X 1.3. r=sylvestre → WIP: Bug 1854031 - Part 2. Allow Unicode License V3 to vendor rust for ICU4X 1.4. r=sylvestre
Attachment #9364101 - Attachment description: WIP: Bug 1854031 - Part 4. Update Cargo.toml for ICU4X 1.3. r=TYLin → WIP: Bug 1854031 - Part 4. Update Cargo.toml for ICU4X 1.4. r=TYLin,#supply-chain-reviewers!
Attachment #9364103 - Attachment description: WIP: Bug 1854031 - Part 6. Gecko changes for ICU4X 1.3. r=TYLin! → WIP: Bug 1854031 - Part 6. Gecko changes for ICU4X 1.4. r=TYLin!
Attachment #9364105 - Attachment description: WIP: Bug 1854031 - Part 8. Update lint file path. r=#lint-reviewers → WIP: Bug 1854031 - Part 7. Update lint file path. r=#linter-reviewers
Attachment #9364104 - Attachment description: WIP: Bug 1854031 - Part 7. Add Unicode License V3. r=sylvestre → WIP: Bug 1854031 - Part 8. Add Unicode License V3. r=sylvestre
Attachment #9362253 - Attachment description: WIP: Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin → Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin
Attachment #9362251 - Attachment description: WIP: Bug 1854031 - Part 2. Allow Unicode License V3 to vendor rust for ICU4X 1.4. r=sylvestre → Bug 1854031 - Part 2. Allow Unicode License V3 to vendor rust for ICU4X 1.4. r=sylvestre
Attachment #9364100 - Attachment description: WIP: Bug 1854031 - Part 3. Import icu_capi and icu_segmenter_data crate in tree. r=TYLin → Bug 1854031 - Part 3. Import icu_capi and icu_segmenter_data crate in tree. r=TYLin
Attachment #9364105 - Attachment description: WIP: Bug 1854031 - Part 7. Update lint file path. r=#linter-reviewers → WIP: Bug 1854031 - Part 4. Update lint file path. r=#linter-reviewers
Attachment #9364104 - Attachment is obsolete: true
Attachment #9364106 - Attachment description: WIP: Bug 1854031 - Part 9. Update ICU4X document. r=TYLin → WIP: Bug 1854031 - Part 8. Update ICU4X document. r=TYLin
Attachment #9364105 - Attachment description: WIP: Bug 1854031 - Part 4. Update lint file path. r=#linter-reviewers → Bug 1854031 - Part 4. Update lint file path. r=#linter-reviewers
Attachment #9364101 - Attachment description: WIP: Bug 1854031 - Part 4. Update Cargo.toml for ICU4X 1.4. r=TYLin,#supply-chain-reviewers! → Bug 1854031 - Part 5. Update Cargo.toml for ICU4X 1.4. r=TYLin!,#supply-chain-reviewers!
Attachment #9364102 - Attachment description: WIP: Bug 1854031 - Part 5. Run ./mach vendor rust. r=#supply-chain-reviewers,TYLin! → Bug 1854031 - Part 6. Run ./mach vendor rust. r=#supply-chain-reviewers!,TYLin!
Attachment #9364103 - Attachment description: WIP: Bug 1854031 - Part 6. Gecko changes for ICU4X 1.4. r=TYLin! → Bug 1854031 - Part 7. Gecko changes for ICU4X 1.4. r=TYLin!
Attachment #9364106 - Attachment description: WIP: Bug 1854031 - Part 8. Update ICU4X document. r=TYLin → Bug 1854031 - Part 8. Update ICU4X document. r=TYLin
Attachment #9362253 - Attachment description: Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin → Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change.

Since ICU4C 74 is backed out in tree, Bug 1806348 has not landed yet.

ICU4X 1.4 uses Unicode License V3, so we have to update the license text.

Blocks: 1423593
Duplicate of this bug: 1867586
Attachment #9362253 - Attachment description: Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. → Bug 1854031 - Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin
Attachment #9366280 - Attachment description: Bug 1854031 - Part 9. Add Unicode License V3 for ICU4X. → Bug 1854031 - Part 9. Add Unicode License V3 for ICU4X. r=sylvestre
Pushed by m_kato@ga2.so-net.ne.jp:
https://hg.mozilla.org/integration/autoland/rev/69f846d2ca22 Part 1. Update update-icu4x.sh script to import icu_capi to local and for datagen change. r=TYLin
https://hg.mozilla.org/integration/autoland/rev/3639285c0b4c Part 2. Allow Unicode License V3 to vendor rust for ICU4X 1.4. r=glandium
https://hg.mozilla.org/integration/autoland/rev/b3838936ce45 Part 3. Import icu_capi and icu_segmenter_data crate in tree. r=TYLin
https://hg.mozilla.org/integration/autoland/rev/1bf11c6f147b Part 4. Update lint file path. r=linter-reviewers,sylvestre
https://hg.mozilla.org/integration/autoland/rev/94260c8116da Part 5. Update Cargo.toml for ICU4X 1.4. r=TYLin
https://hg.mozilla.org/integration/autoland/rev/1478a3bac570 Part 6. Run ./mach vendor rust. r=TYLin,supply-chain-reviewers
https://hg.mozilla.org/integration/autoland/rev/676ce33ea864 Part 7. Gecko changes for ICU4X 1.4. r=TYLin
https://hg.mozilla.org/integration/autoland/rev/980a2ae3ed27 Part 8. Update ICU4X document. r=TYLin
https://hg.mozilla.org/integration/autoland/rev/7b7ac6e2bc52 Part 9. Add Unicode License V3 for ICU4X. r=sylvestre
Blocks: 1868454
Blocks: 1899411