Closed
Bug 1362434
Opened 8 years ago
Closed 6 years ago
Send ALL processor exceptions to Sentry
Categories
(Socorro :: General, task, P2)
Socorro
General
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: peterbe, Unassigned)
References
Details
We have Raven installed and working in the processor thanks to https://github.com/mozilla/socorro/commit/02ed76354658367fe17766da7382ad9d476874f7 but that commit message is just not accurate.
We do get Sentry errors from the processor (in stage and prod) but they're only around the crash storage piece.
Earlier this week we had unexpected exceptions within a processor rule (either in the _predicate() or the _action() call) and nobody noticed until, by chance, I watched the /var/log/messages on one of the processor prod nodes.
* Will's point is that much of the code within the processor rules system is NOT ready to be pointed to something like Sentry. The code is too OK with exceptions happening. (Which is odd in my view because if it's OK that exceptions happen, then why bother to log them?)
* We might get a crazy flood of Sentry exceptions if we start injecting Raven in the processor rule execution loop.
* Sentry is pretty good at just incrementing a counter.
* Generally, as a team, we prefer to surface errors and deal with them rather than relying on logging. If an error isn't important then the code isn't important.
* Most likely, Processor 2017 will be done very differently and have a different approach to exception handling. A better one. But Processor 2017 is many months away.
* This week we had a terrible exception happening in processors that required quick feet and a slightly rushed afternoon production hotfix.
Reporter | ||
Comment 1•8 years ago
|
||
By the way, later that same day I found another exception repeatedly happening in the processor logs.
https://bugzilla.mozilla.org/show_bug.cgi?id=1362125
This one might be an example of something that's NOT worth correcting. Perhaps it's fine to fail the processing if the json_dump doesn't have a `crashing_thread` key. Who knows?
Reporter | ||
Updated•8 years ago
|
Assignee: nobody → peterbe
Reporter | ||
Comment 2•8 years ago
|
||
PR https://github.com/mozilla/socorro/pull/3762
See https://github.com/mozilla/socorro/pull/3762#issuecomment-300297587 for why there's more work to do on this topic.
Comment 3•8 years ago
|
||
Commit pushed to master at https://github.com/mozilla/socorro
https://github.com/mozilla/socorro/commit/60f074c1bec1c2357e2345266406e0d98febcdaa
bug 1362434 - send processor exceptions to sentry (#3762)
* bug 1362434 - send processor exceptions to sentry
* addressing review comments
Reporter | ||
Comment 4•8 years ago
|
||
In Comment #2 I pointed out at least two more places where exceptions are swallowed.
Comment #3 is just one of these "holes" fixed.
I'm not intending to own this bug any more for the other places in the processor.
Perhaps someone else takes over or we decide to not bother more until we re-write the processor.
Assignee: peterbe → nobody
Comment 5•6 years ago
|
||
I redid Rule in bug #1505233 so it's not doing any error handling at all anymore and instead Processor2015.process_crash is now doing the error handling.
When a rule throws an error--regardless of whether it's in predicate or action--it will send an item to sentry and add a note to the processor notes in the processed crash. This lets us reprocess crashes that were affected.
Previously rules sent errors to sentry, but it was a lot of code to do it and it wasn't clear if there were cases where errors might slip through.
Comment #2 points out these two places where errors might get swallowed:
1. https://github.com/mozilla-services/socorro/blob/fc6841672e8179cd6721a52cdc1a34d233338bf2/socorro/processor/processor_2015.py#L243-L252
2. https://github.com/mozilla-services/socorro/blob/fc6841672e8179cd6721a52cdc1a34d233338bf2/socorro/processor/processor_app.py#L103-L113
In redoing Rule and moving the error handling to Processor2015.process_crash, I covered the first one.
The second one is covered by bug #1426257. I think we should work on that there, so I'm going to have this bug depend on that one.
Comment 6•6 years ago
|
||
I looked into bug #1426257 and that's now resolved. I think this is all set now. Yay!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•