Closed
Bug 1307478
Opened 8 years ago
Closed 7 years ago
Elasticsearch Indexer / Bulk Indexer
Categories
(bugzilla.mozilla.org :: General, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: dylan, Assigned: dylan)
References
Details
Attachments
(1 file, 5 obsolete files)
31.38 KB,
patch
|
dkl
:
review+
|
Details | Diff | Splinter Review |
Split out the search indexer from bug 1184823 so that that it can be reviewed and deployed separately from the more complicated search code
Assignee | ||
Comment 1•8 years ago
|
||
here's the indexer code. It's a little improved from before.
Attachment #8800789 -
Flags: review?(dkl)
Assignee | ||
Comment 2•8 years ago
|
||
Fixed some warnings and a performance issue. This loads about 900 bugs per second now.
Attachment #8800789 -
Attachment is obsolete: true
Attachment #8800789 -
Flags: review?(dkl)
Attachment #8800815 -
Flags: review?(dkl)
Assignee | ||
Comment 3•8 years ago
|
||
accidentally included the commented out comment indexer part, fixed now.
Attachment #8800815 -
Attachment is obsolete: true
Attachment #8800815 -
Flags: review?(dkl)
Attachment #8800817 -
Flags: review?(dkl)
Assignee | ||
Comment 4•8 years ago
|
||
Added more code to the bulk indexer, it can run as a service now. Also added dependency info It doesn't perform daemonization itself but I think that's okay. Rest of the code is the same.
Attachment #8800817 -
Attachment is obsolete: true
Attachment #8800817 -
Flags: review?(dkl)
Attachment #8800917 -
Flags: review?(dkl)
Comment 5•8 years ago
|
||
Comment on attachment 8800917 [details] [diff] [review] 1307478_4.patch Review of attachment 8800917 [details] [diff] [review]: ----------------------------------------------------------------- 1. missing template/en/default/admin/params/elastic.html.tmpl for config descriptions. Have a few more things I want to try but you can work on these in meantime. ::: Bugzilla/Comment.pm @@ +22,4 @@ > use Bugzilla::Util; > > use List::Util qw(first); > +use Scalar::Util qw(blessed); Still need weaken and isweak for line 287 so do not change this line. ::: Bugzilla/Elastic/Indexer.pm @@ +175,5 @@ > +sub bulk_load { > + my ( $self, $class ) = @_; > + > + $self->put_mapping($class); > + my $bulk = $self->_bulk_helper($class); nit: extra whitespace ::: bulk_index.pl @@ +33,5 @@ > +); > + > +my $indexer = Bugzilla::Elastic::Indexer->new( > + $debug_sql ? ( debug_sql => 1 ) : (), > + $progress_bar ? ('Term::ProgressBar' ) : (), Change to: $progress_bar ? ( progress_bar => 'Term::ProgressBar' ) : (), @@ +60,5 @@ > + my $start_comments = time; > + $indexer->bulk_load('Bugzilla::Comment'); > + print " ", time - $start_comments, " seconds\n" if $verbose > 1; > + > + $loop->stop unless $loop; $once is not used currently. Change to: $loop->stop if $once || !$loop;
Attachment #8800917 -
Flags: review?(dkl) → review-
Assignee | ||
Comment 6•8 years ago
|
||
also on dev
Attachment #8800917 -
Attachment is obsolete: true
Attachment #8808399 -
Flags: review?(dkl)
Assignee | ||
Comment 7•8 years ago
|
||
Forgot one thing. :-)
Attachment #8808399 -
Attachment is obsolete: true
Attachment #8808399 -
Flags: review?(dkl)
Attachment #8808402 -
Flags: review?(dkl)
Comment 8•8 years ago
|
||
Comment on attachment 8808402 [details] [diff] [review] 1307478_7.patch Review of attachment 8808402 [details] [diff] [review]: ----------------------------------------------------------------- fubar would know best how he wants the bulk-index.pl to run on the jobqueue server so the it should be a service daemon probably. Or maybe just a cron job. Not sure. Also should bulk-index.pl live in scripts/? No fussed either way. Awesome work! r=dkl ::: Bugzilla/Comment.pm @@ +22,4 @@ > use Bugzilla::Util; > > use List::Util qw(first); > +use Scalar::Util qw(blessed); Need to add back weaken and isweak. ::: bulk_index.pl @@ +5,5 @@ > +# > +# This Source Code Form is "Incompatible With Secondary Licenses", as > +# defined by the Mozilla Public License, v. 2.0. > + > +# This file has detailed POD docs, do "perldoc checksetup.pl" to see them. Remove this line. Not relevant. @@ +14,5 @@ > +BEGIN { Bugzilla->extensions } > + > +use Bugzilla::Elastic::Indexer; > + > +use Term::ProgressBar; You have this as a recommends in Makefile.PL but bulk-index.pl will fail if this module is not installed. @@ +29,5 @@ > + 'verbose|v+' => \$verbose, > + 'debug-sql' => \$debug_sql, > + 'progress-bar' => \$progress_bar, > + 'once|n' => \$once, > +); Would like to see some usage output if --help (maybe a new bug)
Attachment #8808402 -
Flags: review?(dkl) → review+
Comment 9•8 years ago
|
||
(In reply to David Lawrence [:dkl] from comment #8) > > fubar would know best how he wants the bulk-index.pl to run on the jobqueue > server so the it should be a service daemon probably. Or maybe just a cron > job. Not sure. One question, in two parts: What happens if the daemon/cronjob fail to run for an extended amount of time, and what about if/when we fail over to AWS where ES does not exist (and probably won't for a while) ? If the answer is "nothing bad" then I'm ok with treating it like other cronjobs that run on the admin node. Otherwise, we can run it as a daemon and have Nagios check for it and alert the MOC like the push or jobqueue daemons.
Comment 10•8 years ago
|
||
(In reply to Kendall Libby [:fubar] from comment #9) > (In reply to David Lawrence [:dkl] from comment #8) > > > > fubar would know best how he wants the bulk-index.pl to run on the jobqueue > > server so the it should be a service daemon probably. Or maybe just a cron > > job. Not sure. > > One question, in two parts: What happens if the daemon/cronjob fail to run > for an extended amount of time, and what about if/when we fail over to AWS > where ES does not exist (and probably won't for a while) ? Dylan may be able to provide better answer but from the way it is coded currently, it should be a non-issue as it will import the bugs that do not yet exist and update any that have changed based on the modification times. > If the answer is "nothing bad" then I'm ok with treating it like other > cronjobs that run on the admin node. Otherwise, we can run it as a daemon > and have Nagios check for it and alert the MOC like the push or jobqueue > daemons. I feel that having a nagios monitor would be a nice to have. Right now it is positioned as a convenience system to speed up user completion and bug searches and BMO would continue to operate fine if it were to disappear temporarily but I could see us definitely relying on it more in the future. dkl
Assignee | ||
Comment 11•8 years ago
|
||
(In reply to Kendall Libby [:fubar] from comment #9) > (In reply to David Lawrence [:dkl] from comment #8) > > > > fubar would know best how he wants the bulk-index.pl to run on the jobqueue > > server so the it should be a service daemon probably. Or maybe just a cron > > job. Not sure. > > One question, in two parts: What happens if the daemon/cronjob fail to run > for an extended amount of time, and what about if/when we fail over to AWS > where ES does not exist (and probably won't for a while) ? > > If the answer is "nothing bad" then I'm ok with treating it like other > cronjobs that run on the admin node. Otherwise, we can run it as a daemon > and have Nagios check for it and alert the MOC like the push or jobqueue > daemons. If the indexer isn't running, searches will return stale data. Stale data is bad -- so I think it should be an alert if the daemon isn't running.
Comment 12•8 years ago
|
||
(In reply to Dylan Hardison [:dylan] from comment #11) > If the indexer isn't running, searches will return stale data. Stale data is > bad -- so I think it should be an alert if the daemon isn't running. Good point. I was still thinking the stale data would be harmless but I see where it could cause confusion if search results returned bugs that it was not supposed to or bugs were missing that should be there. dkl
Comment 13•8 years ago
|
||
So... what happens if/when we fail over to AWS? Exactly how bad is bad?
Comment 14•8 years ago
|
||
(In reply to Kendall Libby [:fubar] from comment #13) > So... what happens if/when we fail over to AWS? Exactly how bad is bad? IMO and it may work this way in Dylans patch for quicksearch and user autocomplete is that if the elastic server is not available it just falls back to old behavior without any error to the user. That would work for AWS being that we do not have an elastic server there. dkl
Assignee | ||
Comment 15•7 years ago
•
|
||
To git@github.com:mozilla-bteam/bmo.git 840bbad83..419f3ae9f master -> master
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•