Ginger's Grammar Correction API
Overview
This document describes how to use Ginger’s proofreading API to correct spelling, grammar, punctuation, vocabulary and style issues in documents.

The first section (API details) describes all the information a developer needs to integrate with the API. The next section (Common use cases) describes various ways in which the API can be put to use. It is recommended to read that section before beginning implementation. Some readers, such as product managers, may find it more useful to begin with it.

The two appendices describe in detail two types of information returned by the API: special correction types (contextual synonyms and recommendations) and correction categories, respectively.
API details
Endpoints

Sandbox: an endpoint for development and integration purposes. https://sb-partner-services.gingersoftware.com/correction/v1/document

Production: https://doc-partner-services.gingersoftware.com/correction/v1/document

Request: To correct a document, create a POST request with the document text in the body of the request. The document text is assumed to be url-encoded.

Sample request to the production endpoint:

https://doc-partner-services.gingersoftware.com/correction/v1/document?apiKey=someApiKey
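
As a rough sketch of the request described above, the following Python snippet builds (but does not send) a POST request using only the standard library. The helper name build_correction_request is hypothetical, and percent-encoding the body with urllib.parse.quote is one reading of "url-encoded"; adjust it if your integration expects a different encoding.

```python
import urllib.parse
import urllib.request

PRODUCTION_URL = "https://doc-partner-services.gingersoftware.com/correction/v1/document"


def build_correction_request(text, api_key, base_url=PRODUCTION_URL, content_type="text/plain"):
    """Build (without sending) a POST request for the correction endpoint."""
    query = urllib.parse.urlencode({"apiKey": api_key})
    # Assumption: "url-encoded" means the body is percent-encoded UTF-8 text.
    body = urllib.parse.quote(text).encode("ascii")
    return urllib.request.Request(
        f"{base_url}?{query}",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


request = build_correction_request("Her closes the door quiet.", "someApiKey")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request).read()
```

For development, the sandbox endpoint can be passed as base_url instead of the production one.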

Headers
Content-Type
The document type should be specified in the Content-Type header. Supported types and subtypes (type/subtype - case insensitive):
  • text/plain for plain text documents
  • text/html for html documents
If an unrecognized type or subtype is passed, the server will return HTTP 415. For example, the following will return 415:
  • application/plain: only type “text” is supported
  • text/pdf: only subtypes “plain” and “html” are supported
accept-encoding
Compression is supported via the accept-encoding header. When it is set to “gzip”, the response body is returned gzip-compressed.
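
A client that requests gzip needs to decompress the body before parsing it. The sketch below assumes the server signals compression through the standard HTTP Content-Encoding response header (this document does not state that explicitly) and uses a hypothetical helper name, decode_body. It works on a simulated payload so that it runs without network access.

```python
import gzip
import json


def decode_body(raw_bytes, content_encoding=None):
    """Return the parsed JSON payload, decompressing first if the body is gzip-encoded."""
    if content_encoding and content_encoding.lower() == "gzip":
        raw_bytes = gzip.decompress(raw_bytes)
    return json.loads(raw_bytes.decode("utf-8"))


# Simulated response body standing in for a real API reply.
payload = json.dumps(
    {"GingerTheDocumentResult": {"Corrections": [], "Sentences": []}}
).encode("utf-8")

decoded = decode_body(gzip.compress(payload), "gzip")
print(decoded["GingerTheDocumentResult"]["Corrections"])
```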
URL Params

URL params can be passed in any order.

If an illegal param value is passed, the server returns HTTP error 400 with an appropriate error message.

An unknown param name in the query string will be silently ignored.

Following is the list of supported URL params and their values.

Parameters

apiKey
Values: the API key string obtained from Ginger.
Mandatory.
If you do not have an API key yet, click here for your API key.

lang
Values: US / UK / Indifferent
Not mandatory. Defaults to “Indifferent”.
The English language locale to be used for correction. UK/US will enforce British English or American English spelling respectively. “Indifferent” will make the correction agnostic to locale, so either variation (e.g. colour or color) will be considered correct.

generateSynonyms
Values: true/false
Not mandatory. Defaults to false.
If set to true, returns contextual synonym suggestions in addition to corrections. For more details about contextual synonyms, see the appendix.

generateRecommendations
Values: true/false
Not mandatory. Defaults to false.
If set to true, returns style and vocabulary recommendations in addition to other corrections. For more details about recommendations, see the appendix.

avoidCapitalization
Values: true/false
Not mandatory. Defaults to false.
If set to true, beginning-of-sentence capitalization will not be checked. For example, when the flag is set to true the sentence “the boy is tall” will be returned with no corrections (as opposed to suggesting to capitalize the word “the”).
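
Since an illegal value causes an HTTP 400, it can be convenient to validate the params on the client before sending. The following is a minimal sketch under that assumption. The function name build_query is hypothetical, and the exact-case check on lang is stricter than the server may require.

```python
from urllib.parse import urlencode

ALLOWED_LANGS = {"US", "UK", "Indifferent"}


def build_query(api_key, lang="Indifferent", generate_synonyms=False,
                generate_recommendations=False, avoid_capitalization=False):
    """Validate the documented params locally and return the query string."""
    if not api_key:
        raise ValueError("apiKey is mandatory")
    if lang not in ALLOWED_LANGS:
        raise ValueError(f"lang must be one of {sorted(ALLOWED_LANGS)}")
    params = {
        "apiKey": api_key,
        "lang": lang,
        # Booleans are sent as lowercase "true" / "false".
        "generateSynonyms": str(bool(generate_synonyms)).lower(),
        "generateRecommendations": str(bool(generate_recommendations)).lower(),
        "avoidCapitalization": str(bool(avoid_capitalization)).lower(),
    }
    return urlencode(params)


query = build_query("someApiKey", lang="US", generate_synonyms=True)
print(query)
```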

Response

Below is the response for sending the following two-sentence document for correction:

Her closes the door quiet. She not hear anything.

The response contains an array of three corrections with all their details as described below.

The response consists of two arrays: Corrections and Sentences.

Corrections array:

Each element represents an error found in the document and its suggested corrections.

Confidence:

A discrete value indicating how likely the correction is to be precise. Possible values: 4 (High), 3 (Medium), 2 (Low), 1 (None). The higher the value, the more reliable the correction.

ShouldReplace

A Boolean indicating whether it is recommended to automatically replace the error word with the top suggestion. For a review of various client implementation strategies, see the “Common use cases” section in this doc.

CorrectionType:

A classification of the correction into one of the following high-level categories: 1 – Spelling, 2 – Misused Word, 3 – Grammar, 4 – Synonym, 5 – Recommendation, 6 - Punctuation

TopCategoryId:

The id of the grammatical category the top suggestion belongs to. For a detailed description and a full list of category ids see the appendix to this document.

MistakeText:

The word or words which contain an error.

MistakeDefinition:

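Whenever available, a dictionary definition of the mistake word.

From, To:

Zero-based indices of the detected error, relative to the beginning of the document. In the response example below, To is the index of the last character of the error (for instance, “Her” spans 0 to 2).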

LrnFrg:

Whenever available, the grammatical context of the error word/s. A short, not necessarily consecutive, fragment from the original sentence which gives grammatically intact context around the error word/s.

MistakeWordsInLrnFrg:

The indices of the words with errors within the LrnFrg. The indices are zero based, relative to the beginning of the LrnFrg.

Suggestions Array:

An array of suggested corrections for the error. The suggestions are ordered by relevance. Each suggestion contains the following data:

Text: the text with which to replace the error word/s in the original document.

CategoryId: The id of the grammatical category which the suggestion belongs to.
For a detailed description and a full list of category ids see the appendix to this document.

Definition: Whenever available, a dictionary definition of the suggested word.

Recommendations Array:

Similar to the Corrections array. Contains vocabulary and style recommendations, as described in the Recommendations section of the “Contextual Synonyms and Recommendations” appendix. This array will appear only if there are recommendation corrections returned by the API.

Sentences array:

An array in which each element represents a sentence in the original document. A client can use this section to determine how Ginger’s correction engine split the document into sentences, determine which sentences were not corrected and why, compute statistics like the number of sentences in the document and so on.

Each sentence in the array contains the following information:

FromIndex,ToIndex: zero based indices of the sentence, relative to the beginning of the document.

IsEnglish: False means the sentence was flagged as not being in English and thus was not corrected.

ExceededCharactersLimit: Sentences which exceed 300 characters are not corrected. Such sentences will have “true” in this field.

Response Example:

{
  "GingerTheDocumentResult": {
    "Corrections": [
      {
        "CorrectionType": 3,
        "From": 0,
        "Suggestions": [
          {
            "CategoryId": 5,
            "Text": "She"
          }
        ],
        "To": 2,
        "TopCategoryId": 5
      },
      {
        "CorrectionType": 3,
        "From": 20,
        "Suggestions": [
          {
            "CategoryId": 23,
            "Text": "quietly"
          }
        ],
        "To": 24,
        "TopCategoryId": 23
      },
      {
        "CorrectionType": 3,
        "From": 31,
        "Suggestions": [
          {
            "CategoryId": 21,
            "Text": "doesn't hear"
          }
        ],
        "To": 38,
        "TopCategoryId": 21
      }
    ],
    "Sentences": [
      {
        "ExceededCharacterLimit": false,
        "FromIndex": 0,
        "IsEnglish": true,
        "ToIndex": 26
      },
      {
        "ExceededCharacterLimit": false,
        "FromIndex": 27,
        "IsEnglish": true,
        "ToIndex": 49
      }
    ]
  }
}
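
To illustrate how the fields in this response fit together, here is a small sketch that applies the top suggestion of each correction to the sample document. The helper name apply_top_suggestions is hypothetical, and the code assumes that To is inclusive, which is what the indices in the example suggest. Only the fields needed for the replacement are reproduced.

```python
def apply_top_suggestions(text, corrections):
    """Replace each error span with its first suggestion."""
    # Work from the end of the document backwards, so that a replacement
    # never shifts the indices of the spans still to be processed.
    for correction in sorted(corrections, key=lambda c: c["From"], reverse=True):
        suggestions = correction.get("Suggestions") or []
        if not suggestions:
            continue  # nothing to apply for this error
        start, end = correction["From"], correction["To"]
        # Assumption: "To" is the index of the last character of the error.
        text = text[:start] + suggestions[0]["Text"] + text[end + 1:]
    return text


document = "Her closes the door quiet. She not hear anything."
corrections = [
    {"From": 0, "To": 2, "Suggestions": [{"CategoryId": 5, "Text": "She"}]},
    {"From": 20, "To": 24, "Suggestions": [{"CategoryId": 23, "Text": "quietly"}]},
    {"From": 31, "To": 38, "Suggestions": [{"CategoryId": 21, "Text": "doesn't hear"}]},
]

corrected = apply_top_suggestions(document, corrections)
print(corrected)
```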

Returns a 200 response on success

Failure Responses
The response for a failed request includes an error type (ErrorType) and an error message (Message).
Failure response examples:


Error 400

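A mandatory param was not passed. The Message property provides information about the missing parameter. In the first example below it is the mandatory apiKey parameter. The other two examples show the responses for an unrecognized API key and for an expired or disabled API key.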

{
  "ErrorType": "Missing parameter",
  "Message": "APIKey is mandatory"
}

{
  "ErrorType": "Authentication error",
  "Message": "The api key is not recognized"
}

{
  "ErrorType": "Authentication error",
  "Message": "The api key is expired or disabled"
}


Error 400

A URL param was passed with an illegal value. The Message property provides detailed information. In the example below it is the lang parameter, which was passed with the illegal value “USS” (instead of “US”). Other examples of illegal values: generateSynonyms=falsee, avoidCapitalization=tru.

{
  "ErrorType": "Illegal parameter value",
  "Message": "lang param value is invalid: uss"
} 


Error 400

The document sent for correction exceeded the maximum allowed size. The maximum document size is normally between 10,000 and 50,000 characters.

{
  "ErrorType": "Server-side error",
  "Message": "The document is too long"
} 


Error 415

An unsupported document type was passed

{
  "ErrorType": "Unsupported media-type in Content-Type header",
  "Message": "text/plaind is unsupported"
} 
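
A client will typically want to turn these failure bodies into a single error object. The sketch below shows one way to do so. The class GingerApiError and the function parse_failure are hypothetical names, not part of the API. A body that is not the documented JSON object is kept as the raw message.

```python
import json


class GingerApiError(Exception):
    """Hypothetical exception carrying the fields of a failure response."""

    def __init__(self, status, error_type, message):
        super().__init__(f"HTTP {status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


def parse_failure(status, body_text):
    """Turn a non-200 response into a GingerApiError (returned, not raised)."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = None
    if not isinstance(payload, dict):
        # Not the documented JSON object: keep the raw text as the message.
        payload = {"Message": body_text}
    return GingerApiError(
        status,
        payload.get("ErrorType", "Unknown error"),
        payload.get("Message", ""),
    )


error = parse_failure(
    400,
    '{"ErrorType": "Illegal parameter value", "Message": "lang param value is invalid: uss"}',
)
print(error)
```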


Common use cases

Ginger’s correction API returns suggested corrections for words suspected as errors. It is up to the client of the API to decide how to apply the corrections to the text. The following section outlines common use cases. These can be broadly categorized as interactive and offline scenarios. Ginger’s API contains all the data required to serve both of these scenarios.

There is some overlap between the two, so it is recommended to read both sections below.

Interactive Corrections

In this scenario, a user submits their text for grammar and spelling correction through an online text editor. The following are some common practices for displaying the results of the API.

Highlighting errors

Highlighting or otherwise marking the parts of the text which were identified as errors can help the user focus on correcting them. This can be done using the “From” and “To” fields of each “Correction” object from the “Corrections” array in the response.

Displaying alternative suggestions

When an error is identified, three options exist with regard to how to correct it: there is either a single suggested correction, multiple suggested corrections or no suggested corrections.

Single suggested correction: In terms of the API response, this is the case when the “Suggestions” array of a certain “Correction” object contains a single element. The client application should decide how to display this suggestion to the user. This can be done either by replacing the error word automatically or by only highlighting it and letting the user view the suggestion and decide whether to apply it.

Implementations which choose to replace automatically should consider the “ShouldReplace” property of the Correction object. If it is false, it is advised not to perform an automatic replacement.

Multiple suggested corrections: this case is similar to the previous one except that since there are several suggested replacements the user should be able to choose the one they would like to replace the error with. The items in the “Suggestions” array in the response are ordered according to their likelihood, so it is recommended that they are displayed to the user in the same order they are received in the response from the API.

Implementations which choose to replace automatically should thus do so using the first item in the Suggestions array, and should also consider the “ShouldReplace” property of the Correction object as mentioned above.

No suggested corrections: this is the case when Ginger detected that there is a mistake in the text, but is not able to suggest any reasonable option for correction. Here is a simple example of such a case: “My hshfjedkskhd is new.” In such cases it is still helpful to highlight the mistake text, but it is advisable to make it clear to the user that no suggestions are available, either by a different highlight color, by explicitly stating so in the UI, or both.
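
The three cases above can be sketched as a small decision function. The function name choose_action and the returned labels are illustrative only. The code assumes the flag is spelled ShouldReplace, as in the response description, and treats a missing flag as false.

```python
def choose_action(correction, auto_replace=True):
    """Pick a UI action for one Correction object."""
    suggestions = correction.get("Suggestions") or []
    if not suggestions:
        # Error detected, but nothing to offer: highlight it differently.
        return "highlight_no_suggestions"
    if auto_replace and correction.get("ShouldReplace", False):
        # Replace with the first (most likely) suggestion.
        return "auto_replace"
    if len(suggestions) == 1:
        return "offer_single"
    return "offer_choices"


print(choose_action({"ShouldReplace": True, "Suggestions": [{"Text": "She"}]}))
print(choose_action({"ShouldReplace": False, "Suggestions": [{"Text": "a"}, {"Text": "b"}]}))
print(choose_action({"Suggestions": []}))
```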

Displaying additional information about the correction

Implementations may choose to reflect several other parts of the API response to users:

Confidence: A user will likely benefit from knowing that a certain correction is considered low confidence, and can thus be more careful when deciding whether to select one of the suggested alternatives. The Confidence field in the response contains this information. Values of 4 and 3 (High and Medium) can be considered very reliable. Lower values (2 and 1) indicate low confidence. It is recommended to differentiate between the two groups in the user interface, e.g. by using a different highlight color in each case.

Correction Types: Corrections can be thought of as belonging to one of several groups or types: spelling mistakes, grammar mistakes, punctuation issues, word usage or vocabulary mistakes, synonyms and style recommendations. This information is contained in the CorrectionType field for each correction. An implementation can choose to use this field in various ways, such as:

  • Help a user distinguish between different types of mistakes by using different colors for spelling, grammar and vocabulary errors
  • Distinguish between mistakes and style recommendations and synonym suggestions
  • Allow users to toggle the display of the different types of corrections on or off
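
As one possible way to support the uses listed above, corrections can be grouped by their CorrectionType before rendering. The mapping below copies the type values given in the Response section. The function name group_by_type and the "Unknown" label are assumptions of this sketch.

```python
from collections import defaultdict

# CorrectionType values as listed in the Response section.
CORRECTION_TYPES = {
    1: "Spelling",
    2: "Misused Word",
    3: "Grammar",
    4: "Synonym",
    5: "Recommendation",
    6: "Punctuation",
}


def group_by_type(corrections):
    """Group Correction objects under a readable type label."""
    groups = defaultdict(list)
    for correction in corrections:
        label = CORRECTION_TYPES.get(correction.get("CorrectionType"), "Unknown")
        groups[label].append(correction)
    return dict(groups)


sample = [{"CorrectionType": 1}, {"CorrectionType": 3}, {"CorrectionType": 3}]
groups = group_by_type(sample)
print({label: len(items) for label, items in groups.items()})
```

A user interface could then assign one highlight color per label, or hide a whole group when the user toggles that type off.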

Grammatical category: Each suggestion for correction is classified by the API as belonging to a certain grammatical category or topic. Knowing the category makes it possible to display to the user more fine-grained information about why the correction is being suggested. This both facilitates learning and increases the user’s confidence in the system.

For example, in the sentence “She live in my neighborhood”, the verb “live” is corrected to “lives” so that it properly matches the subject of the sentence. Implementations can choose to explain to users at this point what subject-verb agreement is, or why this suggested correction is offered to them. This can be done, for example, by presenting the explanation alongside the suggested correction, using a tooltip when hovering over the error word and so on.

This information is contained in the “CategoryId” field of each suggestion. To make things more convenient, the category of the first suggestion, which is the most likely one, is also given in the TopCategoryId property of the Correction object itself.

For a full list of categories, their description and examples, see the “Correction Categories” appendix in this document.

Definitions: Each suggestion often also has a dictionary definition of it in the Definition field of the Suggestion object. This info can be displayed to the user alongside the suggestion to enhance their understanding of the suggested word and thus make them more confident about their selection. Note that sometimes the definition will not appear, so the implementation must be ready to not display anything in this case.

Sentence level information

Sentence boundaries

Ginger’s API response contains information about where each sentence in the document begins and ends. This information is contained in the FromIndex and ToIndex properties of each Sentence object in the Sentences section. This can be useful for user interfaces which wish to mark sentences which contain errors, for example, or similar needs. To know which sentence contains a specific error, an implementation needs to locate the sentence for which the From and To indices of the correction are contained within the FromIndex and ToIndex of the Sentence.
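
The containment check described above can be sketched as follows. The function name find_sentence_index is hypothetical. The sample values are taken from the response example earlier in this document.

```python
def find_sentence_index(correction, sentences):
    """Return the index of the sentence whose span contains the correction, or None."""
    for index, sentence in enumerate(sentences):
        if (sentence["FromIndex"] <= correction["From"]
                and correction["To"] <= sentence["ToIndex"]):
            return index
    return None


sentences = [
    {"FromIndex": 0, "ToIndex": 26},
    {"FromIndex": 27, "ToIndex": 49},
]

print(find_sentence_index({"From": 20, "To": 24}, sentences))
print(find_sentence_index({"From": 31, "To": 38}, sentences))
```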

The document is split into sentences according to standard end-of-sentence punctuation: period, question mark, exclamation mark and combinations of several such punctuation marks. Common abbreviations ending with a period (such as Dr., Mrs. and the like) are accounted for. Text that contains no end-of-sentence punctuation will be considered a single sentence, which may result in inferior correction quality or in exceeding the maximum allowed size of a single sentence (300 characters).

Unproofed sentences

The API response contains information which allows an application to determine which sentences were not proofread and why. This can happen for one of several reasons, each designated by a specific property in the relevant Sentence object of the response.

  • IsEnglish: false indicates the sentence is not in English and was thus not checked
  • ExceededCharactersLimit: true means the sentence was not checked because it is longer than the maximum limit of 300 characters. Implementations may wish to flag these sentences in a specific way, prompt the user to split them and recheck and so on.
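
A minimal sketch of reading these two properties is shown below. The function name unproofed_reasons and the reason strings are illustrative. Because the field description uses ExceededCharactersLimit while the sample response shows ExceededCharacterLimit, the code accepts either spelling.

```python
def unproofed_reasons(sentence):
    """List the reasons (possibly none) why a Sentence object was not proofread."""
    reasons = []
    if sentence.get("IsEnglish") is False:
        reasons.append("not English")
    # Accept both spellings of the flag that appear in this document.
    if sentence.get("ExceededCharactersLimit") or sentence.get("ExceededCharacterLimit"):
        reasons.append("longer than 300 characters")
    return reasons


print(unproofed_reasons({"IsEnglish": True, "ExceededCharacterLimit": False}))
print(unproofed_reasons({"IsEnglish": False, "ExceededCharacterLimit": False}))
```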
Offline Document correction

There are use cases for the correction API which do not involve an interactive user interface. One such common case is when a large number of documents is sent for correction as part of some business flow. Examples:

  • A legal firm may wish to send, at the end of each day, all the documents produced during that day for spelling and grammar proofreading. Each morning, contracts flagged with more than a certain number of errors, or containing certain error types, are passed for manual review.
  • As part of its process of reviewing a newly submitted manuscript, a book publisher or a scientific journal may wish to proofread it automatically and, based on the results, send it directly to an editor, pass it to a human proofreader or reject it altogether.
  • A website may wish to perform a periodic review of all newly published content, user reviews and so on.

Many other such use cases exist. What these cases have in common is that the response from the API is processed automatically by a machine whose output is data or decisions for the next point in the pipeline, a human-readable report, a new document with proofreading applied to it, and so on.

Summary Information

Interactive and offline implementations alike may find it useful to create summary reports about a single document or a set of documents. These may include both document information and proofreading information.

Document level information can include data such as number of sentences in each document, average sentence length, total document length, percent of non-English sentences and percent of too long sentences.

Proofreading information can show statistics about types of errors (Spelling vs. Grammar vs. Vocabulary), a breakdown of the user’s mistakes by grammatical topic, the overall number of mistakes, mistakes per sentence and the like.

Implementations may even want to consider tracking such statistics over time for profiling or progress monitoring purposes.

All of the information to generate the above statistics is contained in the various fields of the API response discussed in previous sections.
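
As an illustration of such a report, the sketch below computes a few of the statistics mentioned above from the contents of GingerTheDocumentResult. The function name summarize and the keys of the returned dictionary are assumptions of this example. The sample input reproduces only the relevant fields of the response example.

```python
from collections import Counter


def summarize(result):
    """Compute simple summary statistics from a GingerTheDocumentResult object."""
    corrections = result.get("Corrections", [])
    sentences = result.get("Sentences", [])
    sentence_count = len(sentences)
    non_english = sum(1 for s in sentences if s.get("IsEnglish") is False)
    return {
        "sentence_count": sentence_count,
        "correction_count": len(corrections),
        "corrections_per_sentence": (
            len(corrections) / sentence_count if sentence_count else 0.0
        ),
        "non_english_percent": (
            100.0 * non_english / sentence_count if sentence_count else 0.0
        ),
        "by_correction_type": dict(Counter(c.get("CorrectionType") for c in corrections)),
        "by_top_category": dict(Counter(c.get("TopCategoryId") for c in corrections)),
    }


result = {
    "Corrections": [
        {"CorrectionType": 3, "TopCategoryId": 5},
        {"CorrectionType": 3, "TopCategoryId": 23},
        {"CorrectionType": 3, "TopCategoryId": 21},
    ],
    "Sentences": [
        {"FromIndex": 0, "ToIndex": 26, "IsEnglish": True},
        {"FromIndex": 27, "ToIndex": 49, "IsEnglish": True},
    ],
}

summary = summarize(result)
print(summary)
```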

Appendix: Contextual Synonyms and Recommendations
This section covers in more detail two special types of suggestions: contextual synonyms, and style and vocabulary recommendations.
Contextual Synonyms
English words often have a large number of synonyms. For example, the word “very” has the following synonyms, among many others: highly, awfully, really, extremely, terribly, deeply, seriously, selfsame, actual, much. However, in the context of a particular sentence, only some of them are applicable. When the generateSynonyms param is set to “true”, if a word in the sentence has synonyms which are relevant in the context of that sentence, Ginger’s API will return them as suggestions.

Example:

My coussin is very picky about the restaurants he dines in.

Without any flags, only the spelling mistake coussin->cousin will be corrected.

With generateSynonyms=true, synonyms for “very” (really) and for “picky” (particular, fussy, finicky) will be returned in addition to the spelling correction.

Recommendations

Recommendations are style and vocabulary suggestions.

There are three types of recommendations currently supported:

  • Spelling out numbers: 0-9 replaced with the spelled-out number (when in the context of counting something)
  • Overused Adjectives: alternative adjectives suggested whenever commonplace/banal ones are being used
  • Flagging of passive voice usage. The active form is not suggested, only the fact that passive voice was used is flagged.

Examples:

  1. There are 2 very good reason for this.

    Without any flags, only the mistake reason->reasons will be corrected.

    With generateRecommendations=true, the “2” will also be spelled out as “two”, in addition to the grammar correction.

  2. I live in a big house.

    Without any flags, there will be no correction.

    With generateRecommendations=true, there will be two suggestions for the adjective “big” (large, spacious)

  3. The action was taken by her

    Without any flags, there will be no correction.

    With generateRecommendations=true, the use of the passive voice will be flagged

Appendix: Correction Categories

General

A correction category specifies the grammatical topic a suggestion returned by the API belongs to. A category is identified by a unique numeric id which is returned in the LrnCatId field (the sample response in the Response section shows it as CategoryId), which is part of each member of the Suggestions array in the response.

Each suggestion has a category associated with it. Thus, if there are multiple suggested corrections for a specific word, each will have its own, potentially different, category. For example, in the sentence “Book is on the shelf” there are two suggested corrections for “Book”: “The book” and “A book”. The first gets LrnCatId = 13 (DefiniteArticle), the second gets LrnCatId = 12 (IndefiniteArticle).

The following fragment shows a Suggestions array in which both suggestions belong to the same category, 43 (ConsecutiveNouns):

{
  "Suggestions": [
    {
      "LrnCatId": 43,
      "Text": "boy's"
    },
    {
      "LrnCatId": 43,
      "Text": "boys'"
    }
  ]
} 


List of Correction Categories

Following is the full list of correction grammatical categories the Ginger API may return. Each category is listed with its id and name.

Category ID   Category name
1             SplitAndMerge
2             CommonAndProperNouns
5             Pronouns
12            IndefiniteArticle
13            DefiniteArticle
15            Tenses
16            PrimaryVerbs
18            PresentProgressive
19            PastSimple
20            Future
21            SubjectVerbAgreement
23            AdverbialModifiers
29            Prepositions
30            PrepositionsInOnAtConfusion
31            Spelling
34            PresentSimple
35            PresentPerfect
36            PastProgressive
37            PastPerfect
38            TheInfinitive
39            Participles
40            Punctuation
42            Plurality
43            ConsecutiveNouns
44            UK/US English
45            BeginningOfSentenceCapitalization
46            MisusedWords
47            DoubleWords
48            Synonyms
49            CommaAddition
50            ComparativeSuperlative
51            QuestionMarkAddition
100           Vocabulary
102           InformalLanguage
103           OverusedWord
104           PassiveVoice
105           NumeralSpellingOut
1000          Other

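
For clients that want to show a category name next to a suggestion, the list above can be kept as a lookup table. The following sketch copies the ids and names from that list. The function names and the "Unknown" fallback are assumptions of this example, and the id is read from either CategoryId or LrnCatId because both field names appear in this document.

```python
# Category ids and names copied from the list above.
CATEGORY_NAMES = {
    1: "SplitAndMerge", 2: "CommonAndProperNouns", 5: "Pronouns",
    12: "IndefiniteArticle", 13: "DefiniteArticle", 15: "Tenses",
    16: "PrimaryVerbs", 18: "PresentProgressive", 19: "PastSimple",
    20: "Future", 21: "SubjectVerbAgreement", 23: "AdverbialModifiers",
    29: "Prepositions", 30: "PrepositionsInOnAtConfusion", 31: "Spelling",
    34: "PresentSimple", 35: "PresentPerfect", 36: "PastProgressive",
    37: "PastPerfect", 38: "TheInfinitive", 39: "Participles",
    40: "Punctuation", 42: "Plurality", 43: "ConsecutiveNouns",
    44: "UK/US English", 45: "BeginningOfSentenceCapitalization",
    46: "MisusedWords", 47: "DoubleWords", 48: "Synonyms",
    49: "CommaAddition", 50: "ComparativeSuperlative",
    51: "QuestionMarkAddition", 100: "Vocabulary", 102: "InformalLanguage",
    103: "OverusedWord", 104: "PassiveVoice", 105: "NumeralSpellingOut",
    1000: "Other",
}


def category_name(category_id):
    """Return the category name for an id, or "Unknown" if it is not in the list."""
    return CATEGORY_NAMES.get(category_id, "Unknown")


def suggestion_category(suggestion):
    """Read the category of a suggestion under either documented field name."""
    category_id = suggestion.get("CategoryId", suggestion.get("LrnCatId"))
    return category_name(category_id)


print(suggestion_category({"LrnCatId": 43, "Text": "boy's"}))
print(suggestion_category({"CategoryId": 21, "Text": "doesn't hear"}))
```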
Correction Categories – Description and Examples

The following section describes the types of mistakes captured by each category, with examples.

Spelling

Name
Category Id
Description
Examples
Spelling
31
A word was not spelled correctly

Fizix is a great sudgekt

The marble statue had a big hed

Name
Category Id
Description
Examples
MisusedWords
46
Confusion between words which sound similar or have a similar spelling

This problem is to complicated

I wasn't sure what to except

I want to rid a camel

We decided to remove the item form our store

Name
Category Id
Description
Examples
SplitAndMerge
1
A word was accidentally split in two or vice-versa, two words were accidentally merged into a single word

The bed room is comfortable

This type of behavior can mad den me

This looks makebelieve, not real

I amnot going there

Name
Category Id
Description
Examples
UK/US
44
Usage of British instead of American English spelling or vice-versa. Preference is determined by the value set in the “lang” field in the request

The colour purple is my favorite [colour is corrected to color if lang=US]

I live near the town center [center is corrected to centre if lang=UK]

Name
Category Id
Description
Examples
CommonAndProperNouns
2
A proper or common noun is not capitalized

My friend john is not well today

We will always have paris

My english teacher is stern

My freudian slip turned out to be fatal

Name
Category Id
Description
Examples
DoubleWords
47
Accidental repetition of a word

I went to to the store

I won't let let her do it

Grammar – Nouns
Name
Category Id
Description
Examples
IndefiniteArticle
12
Confusion between a/an or omission of an indefinite article when it is required

John is studying for a MBA degree

This is an great show

This is great show

Name
Category Id
Description
Examples
DefiniteArticle
13
Omission of the definite article (“the”) when it is required

I had time of my life on this vacation

These are some of things I have to deal with

Name
Category Id
Description
Examples
ConsecutiveNouns
43
Wrong usage of two or more nouns in a row, either in possessive form or as modifiers of each other

Sheryl went to the tickets office

My wife name is Sara

The boys teacher is not coming today

Name
Category Id
Description
Examples
Plurality
42
Confusion between the singular and plural form of a noun

We bought a number of item

Six people lost their life in the accident

I sleep in a small beds

Name
Category Id
Description
Examples
Pronouns
5
Using the wrong pronoun or not using a pronoun when one is required

I need you help

Mary and me just had a long conversation

Grammar – Adjectives
Name
Category Id
Description
Examples
ComparativeSuperlative
50
Wrong usage of comparative and superlative structures

This movie is bad than anything I have seen

She is the most pretty girl in her class

This circle is more round than it seems

Grammar – Verbs
Name
Category Id
Description
Examples
Tenses
15
The wrong verb tense or form is being used and the mistake does not fall under one of the more specific verb related categories (16-21 and 34-39)

Jane couldn’t located your phone number

It is important to submitted the paper on time

Deploying this in the cloud will allowing scaling

Name
Category Id
Description
Examples
PresentSimple
34
Error in forming or applying the present simple tense

The battery is lasting for only 2 hours

Name
Category Id
Description
Examples
PresentPerfect
35
Error in forming or applying the present perfect tense

I have develop this idea for days

Has we met before?

Name
Category Id
Description
Examples
PresentProgressive
18
Error in forming or applying the present progressive tense

Terry has writing a letter at the moment

I am go to the market

Name
Category Id
Description
Examples
PastSimple
19
Error in forming or applying the past simple tense

I go to the store yesterday

Yesterday I download an app for long distance calls

Name
Category Id
Description
Examples
PastPerfect
37
Error in forming or applying the past perfect tense

My parents has never been to Florida before this summer

Tom have already left work before he realized he forgot his notes

Name
Category Id
Description
Examples
PastProgressive
36
Error in forming or applying the past progressive tense

He was sell fruit on the side of the road

I am having a problem with my connection yesterday

Name
Category Id
Description
Examples
Future
20
Error in forming or applying the future tense

Amy try going to the chess club tomorrow

I'll will review the test later

Name
Category Id
Description
Examples
SubjectVerbAgreement
21
Mismatch between the plurality of the verb and the subject

A bouquet of yellow roses lend color and fragrance to the room

My aunt or my uncle are arriving by train today

Name
Category Id
Description
Examples
TheInfinitive
38
Errors in forming or using the infinitive form

I refuse give up

She expects it is ready by five

The tutor allowed us talking as long as it was in English

Name
Category Id
Description
Examples
Participles
39
Errors related to incorrect usage or incorrect form of participles

We are looking for employees with proved experience

My dream has giving me strength to lead others

It's take Sam two hours to get home

Name
Category Id
Description
Examples
AdverbialModifiers
23
Incorrect use of adverbs instead of adjectives or vice-versa or using the wrong adverb

She is a beautifully woman

He closed the door quiet

He sings good

Grammar – Prepositions
Name
Category Id
Description
Examples
Prepositions
29
Applying a wrong preposition. The specific common case of in-on-at confusion is categorized separately (30)

We arrived to the station

The differences among English, Chinese, and Arabic are significant

Do you have a good picture I can incorporate in the presentation?

You can get there with bus or on foot

Name
Category Id
Description
Examples
PrepositionsInOnAtConfusion
30
Confusion between the usage of the prepositions in, on and at

I arrived in five o'clock

She lives in 666 Elm Street

The relevant data appeared on the rightmost column

Grammar – Other
Name
Category Id
Description
Examples
BeginningOfSentenceCapitalization
45
The first word in a sentence not capitalized

this is not right

what a nice house!

Name
Category Id
Description
Examples
CommaAddition
49
Not adding a comma where there should be one, e.g. after an introductory phrase, between list items and so on

Oh well let's go for it

Students will demonstrate understanding of spoken words syllables and sounds (phonemes)

98 West Pulaski Road Huntington Station NY

Name
Category Id
Description
Examples
QuestionMarkAddition
51
Not adding a question mark at the end of a sentence

Is this really true

How did you manage to do it

Word Usage
Name
Category Id
Description
Examples
Vocabulary
100
Using a semantically unsuitable word in a sentence

Can you make me a favor?

I want to stay till the finish of the party

They said us to come quickly

Name
Category Id
Description
Examples
InformalLanguage
102
Usage of slang and informal language. Enabled when generateRecommendations is set to true

A lotta things depend on his success

I coulda made it

Name
Category Id
Description
Examples
OverusedWord
103
Usage of banal or general adjectives instead of more specific ones. Enabled when generateRecommendations is set to true.

He lives in a big house

He is a nice guy

Style
Name
Category Id
Description
Examples
PassiveVoice
104
Usage of the passive voice. Enabled when generateRecommendations is set to true.

The job was done by him

The deed is being done as we speak

Name
Category Id
Description
Examples
NumeralSpellingOut
105
Writing digits instead of spelling out the numbers 0-9

We raise 3 cats and a newborn baby

The bad weather delayed me by 5 days

Name
Category Id
Description
Examples
Synonyms
48
Suggestions of synonyms for the given word

The trip was boring, but the road was beautiful [“tiring” will be suggested as a synonym for “boring” and “route” as a synonym for “road”]

How I would love a good night's sleep! [“slumber” will be suggested as a synonym for “sleep”]

Other
Name
Category Id
Description
Other
1000
A mistake that does not fall under any of the previous categories.