Closed Bug 1026131 Opened 10 years ago Closed 10 years ago

build gengo translation bookkeeping infrastructure

Categories

(Input Graveyard :: Submission, defect, P1)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

References

Details

(Whiteboard: u=dev c=translations p=2 s=input.2014q2)

The Gengo human translation system will create translation jobs once per hour. When feedback comes in that needs to be translated, we'll queue it in a database table until the next translation job creation run.

This bug covers going through all the requirements, designing the tables, and implementing them along with the required migrations.
Assignee: nobody → willkg
Priority: -- → P1
The rough requirements can be extrapolated from this list:

1. A GengoJob is about translating a single field from a single model
   instance and putting the translated text into another field of that
   model instance. Right now we are only translating the
   Response.description field, so this is a bit more general than we
   need, but it'll help us a ton when Input expands and we have other
   models to translate and potentially multiple fields per model to
   translate. (There's a rough model sketch after this list.)

2. We send translation jobs to Gengo in batches. Each batch is a
   single GengoOrder and has a unique order id.

3. The GengoJob has a status field. When the Response is saved to the
   database, we call a method on that object to get a list of things
   that need to be translated. Those get sent to a celery task. The
   celery task calls a method in the GengoHumanTranslationSystem class
   (which hasn't been written yet, but you can see the other
   translation system classes) which will create a GengoJob instance
   for each thing that needs to be translated. These will have a
   "created" status. (There's a rough sketch of this flow after this
   list.)

4. A cron job will kick off once an hour and (hand-waving general
   explanation here) will look for all GengoJob items in the "created"
   status. Then it'll bucket them by src_language, use the Gengo API
   to create orders, create GengoOrder instances in our db, and update
   the status of all GengoJob instances for a given order to
   "in-progress". (Sketched after this list, too.)

5. The cron job will also use the Gengo API to pull back any completed
   translations and update the GengoJob and GengoOrder accordingly.

6. Over the course of translating something, we make a bunch of Gengo
   API calls. I want to record the responses we get back and be able
   to tie each response back to the specific GengoJob or GengoOrder it
   relates to. This will make it a lot easier to find bugs in the
   system or edge cases we're not handling correctly. Plus it'll let
   us do metrics later so we can see how everything is performing.
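To make this concrete, here's a rough sketch of what the bookkeeping
tables could look like as Django models. The field names, status
values, and the extra GengoAPIRecord model are illustrative guesses on
my part, not necessarily what the actual migrations will create:

from django.db import models

STATUS_CREATED = 'created'
STATUS_IN_PROGRESS = 'in-progress'
STATUS_COMPLETE = 'complete'


class GengoOrder(models.Model):
    """One batch of jobs submitted to Gengo; maps to a Gengo order id."""
    order_id = models.CharField(max_length=100, unique=True)
    status = models.CharField(max_length=20, default=STATUS_IN_PROGRESS)
    created = models.DateTimeField(auto_now_add=True)


class GengoJob(models.Model):
    """Translate one field of one model instance into another field."""
    # Generic pointer to the instance being translated. Today this is
    # always a Response, but keeping it generic covers future models.
    content_type = models.CharField(max_length=100)
    object_id = models.PositiveIntegerField()

    src_field = models.CharField(max_length=50)
    dst_field = models.CharField(max_length=50)
    src_language = models.CharField(max_length=10)
    dst_language = models.CharField(max_length=10, default='en')

    status = models.CharField(max_length=20, default=STATUS_CREATED)
    order = models.ForeignKey(GengoOrder, null=True, blank=True,
                              on_delete=models.SET_NULL)
    created = models.DateTimeField(auto_now_add=True)


class GengoAPIRecord(models.Model):
    """Raw Gengo API responses, tied back to a specific job or order."""
    job = models.ForeignKey(GengoJob, null=True, blank=True,
                            on_delete=models.SET_NULL)
    order = models.ForeignKey(GengoOrder, null=True, blank=True,
                              on_delete=models.SET_NULL)
    response_body = models.TextField()
    created = models.DateTimeField(auto_now_add=True)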
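And a rough sketch of the save-to-celery flow in requirement 3. The
task name, the fields_to_translate() helper, and the import paths are
all placeholders standing in for whatever the real code ends up
calling them:

from celery import shared_task

# Import paths are guesses; GengoJob/STATUS_CREATED come from the
# model sketch above.
from fjord.feedback.models import Response
from fjord.translations.models import GengoJob, STATUS_CREATED


@shared_task
def create_gengo_jobs(response_id):
    """Record a "created" GengoJob for each field needing translation.

    The real flow routes through a GengoHumanTranslationSystem class;
    that's skipped here for brevity.
    """
    response = Response.objects.get(id=response_id)
    # Hypothetical method on the model that says which fields need
    # translating and where the translated text should land.
    for src_field, dst_field in response.fields_to_translate():
        GengoJob.objects.create(
            content_type='response',
            object_id=response.id,
            src_field=src_field,
            dst_field=dst_field,
            src_language=response.locale,
            status=STATUS_CREATED,
        )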
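Finally, a sketch of the hourly pass from requirements 4 and 5. The
two helpers wrapping the Gengo API are left as stubs, and the import
path is again a guess:

from collections import defaultdict

from fjord.translations.models import (
    GengoJob, GengoOrder,
    STATUS_CREATED, STATUS_IN_PROGRESS, STATUS_COMPLETE)


def create_gengo_order(src_language, jobs):
    """Stub for the Gengo API call that submits a batch of jobs and
    returns the new Gengo order id."""
    raise NotImplementedError


def pull_completed_translations(order):
    """Stub for polling Gengo for finished translations and writing
    them back into the right model fields."""
    raise NotImplementedError


def run_hourly_translation_pass():
    # Bucket pending jobs by source language so each Gengo order only
    # covers a single language pair.
    by_language = defaultdict(list)
    for job in GengoJob.objects.filter(status=STATUS_CREATED):
        by_language[job.src_language].append(job)

    # One order per source language; mark its jobs in-progress.
    for src_language, jobs in by_language.items():
        order_id = create_gengo_order(src_language, jobs)
        order = GengoOrder.objects.create(order_id=order_id)
        for job in jobs:
            job.order = order
            job.status = STATUS_IN_PROGRESS
            job.save()

    # Pull back anything Gengo has finished and update jobs/orders.
    for order in GengoOrder.objects.exclude(status=STATUS_COMPLETE):
        pull_completed_translations(order)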
First pass is in a PR: https://github.com/mozilla/fjord/pull/310

It's likely that further development on this project will require changes to those tables, but that's ok. This is a good first pass.
Landed in:

* https://github.com/mozilla/fjord/commit/38d8584
* https://github.com/mozilla/fjord/commit/98d30fb

Waiting to push this to production until I have other bits done.
Summary: build gengo translation queue → build gengo translation bookkeeping infrastructure
Pushed to prod already.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Fixed this in 2014q2. Fixing the whiteboard data.
Whiteboard: u=dev c=translations p=2 s=input.2014q3 → u=dev c=translations p=2 s=input.2014q2
Product: Input → Input Graveyard