Proofreader API: support correction types #54586
Merged
The existing API supports only basic proofreading: given an input, it
executes the model to return the fully corrected text, then
algorithmically derives the list of corrections that, applied to the
input, produce the corrected text.
This CL adds model executions that fetch a correction type label for
each correction and returns the labels as part of ProofreadResult if
the user requests them.
- proofread(): If correction types are requested, after finding the list
of corrections, execute the model once per correction, one at a time, to
get its correction type. Once all correction type labels are received,
resolve the promise for proofread() with the final result.
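The one-by-one flow described above can be sketched as a chain of callbacks. This is only an illustration, not the code in this CL: `Correction`, `ModelExecutor`, `LabelingState`, `LabelNext`, and `LabelCorrections` are hypothetical names, and `std::function` stands in for whatever callback and model-execution types the real implementation uses.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one correction found by the diffing step.
struct Correction {
  std::size_t start_index;  // Start of the error in the original input.
  std::size_t end_index;    // One past the end of the error.
  std::string replacement;  // Text that replaces the error.
  std::string type;         // Label from the model; empty until fetched.
};

// Hypothetical stand-in for a single model execution. It receives one
// correction and reports the label through the callback, possibly later.
using LabelCallback = std::function<void(std::string)>;
using ModelExecutor = std::function<void(const Correction&, LabelCallback)>;
using DoneCallback = std::function<void(std::vector<Correction>)>;

// State shared across the chain of model executions.
struct LabelingState {
  std::vector<Correction> corrections;
  std::size_t next = 0;  // Index of the next correction to label.
  ModelExecutor execute;
  DoneCallback on_done;
};

// Requests the label for the next unlabeled correction. When none are
// left, hands the fully labeled list to on_done (the point at which a
// caller could resolve its promise).
void LabelNext(std::shared_ptr<LabelingState> state) {
  assert(state->next <= state->corrections.size());
  if (state->next == state->corrections.size()) {
    state->on_done(std::move(state->corrections));
    return;
  }
  const Correction& current = state->corrections[state->next];
  state->execute(current, [state](std::string label) {
    state->corrections[state->next].type = std::move(label);
    ++state->next;
    LabelNext(state);  // Only now start the following execution.
  });
}

// Entry point: labels every correction sequentially, then calls on_done.
void LabelCorrections(std::vector<Correction> corrections,
                      ModelExecutor execute,
                      DoneCallback on_done) {
  auto state = std::make_shared<LabelingState>();
  state->corrections = std::move(corrections);
  state->execute = std::move(execute);
  state->on_done = std::move(on_done);
  LabelNext(std::move(state));
}
```

Because the next execution starts only inside the previous execution's callback, at most one request is in flight at a time, and `on_done` runs exactly once after the last label arrives.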
The model execution for fetching correction types expects requests in
the following format:
Input: "`can` you profread fir me",
Corrected input: "`Can` you proofread for me?",
Correction instruction: "Correcting `can` to `Can`".
The model is trained with the above format, where backticks are used to
annotate the error and correction.
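As a rough sketch of how such a request could be assembled from a raw correction's locations: the names `RawCorrection`, `WrapInBackticks`, and `BuildCorrectionTypeRequest` are hypothetical, and the ",\n" separator between fields is an assumption, since the exact delimiters the model expects are not spelled out here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical raw correction: half-open [start, end) ranges locating the
// error in the original input and its fix in the corrected text.
struct RawCorrection {
  std::size_t error_start;
  std::size_t error_end;
  std::size_t correction_start;
  std::size_t correction_end;
};

// Returns `text` with backticks inserted around the range [start, end).
std::string WrapInBackticks(const std::string& text,
                            std::size_t start,
                            std::size_t end) {
  assert(start <= end && end <= text.size());
  return text.substr(0, start) + "`" + text.substr(start, end - start) +
         "`" + text.substr(end);
}

// Builds the three-field request for one correction. The field separator
// (",\n") is an assumption made for this sketch.
std::string BuildCorrectionTypeRequest(const std::string& input,
                                       const std::string& corrected,
                                       const RawCorrection& c) {
  assert(c.error_start <= c.error_end && c.error_end <= input.size());
  assert(c.correction_start <= c.correction_end &&
         c.correction_end <= corrected.size());
  const std::string error =
      input.substr(c.error_start, c.error_end - c.error_start);
  const std::string fix = corrected.substr(
      c.correction_start, c.correction_end - c.correction_start);
  return "Input: \"" + WrapInBackticks(input, c.error_start, c.error_end) +
         "\",\nCorrected input: \"" +
         WrapInBackticks(corrected, c.correction_start, c.correction_end) +
         "\",\nCorrection instruction: \"Correcting `" + error + "` to `" +
         fix + "`\".";
}
```

Called with the input "can you profread fir me", the corrected text "Can you proofread for me?", and both ranges set to [0, 3), this yields the request shown in the example above.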
Minor change:
- GetCorrections() is added to get the raw corrections (describing the
locations of both the errors in the original text and the corresponding
corrections in the corrected text). This is refactored from the
previous GetProofreadingCorrections(), which set the relevant fields of
ProofreadResult directly. The raw corrections' locations are needed to
annotate the model execution request for fetching correction types.
For more context about the API, please see the explainer published here:
https://github.com/webmachinelearning/proofreader-api/blob/main/README.md
Bug: 403313556, 429259028
Change-Id: I058518a8da37ca94b44620dbecf40438bb8ba63f
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/6863178
Reviewed-by: Mike Wasserman <msw@chromium.org>
Reviewed-by: Alex Gough <ajgo@chromium.org>
Commit-Queue: Queenie Zhang <queeniezhang@google.com>
Cr-Commit-Position: refs/heads/main@{#1508070}