Closed Bug 1307485 Opened 8 years ago Closed 7 years ago

Add code to run a subset of buglist.cgi search queries against the ES backend

Categories

(bugzilla.mozilla.org :: Search, defect, P2)

Production
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: dylan, Assigned: dylan)

References

Details

Attachments

(1 file, 3 obsolete files)

This is the part of the elasticsearch code that pretends to be a Bugzilla::Search and does the complicated translation between our internal search structure and ES queries. This should be easier to review and deploy.
Depends on: 1307478
Attached patch 1307485_1.patch (obsolete) — Splinter Review
Assignee: nobody → dylan
Attachment #8826352 - Flags: review?(dkl)
Priority: -- → P2
Comment on attachment 8826352 [details] [diff] [review]
1307485_1.patch

Review of attachment 8826352 [details] [diff] [review]:
-----------------------------------------------------------------

I would also like to see Simple bug search and possible duplicate detection use elasticsearch if possible. Not necessarily a blocker but seem like they would be a good fit.

Also simple searches that return results for DB search are not returning results for Elastic. Such as:

Component: (is equal to any of the strings) Administration
Product: (is equal to any of the strings) bugzilla.mozilla.org 
Resolution: (is equal to any of the strings) FIXED

I get 0 results for elasticsearch. We need to make sure all searches that fall in the searchable criteria for elastic returns the same results as the DB would (> 500 in this case). bulk_index.pl was up to date.

::: Bugzilla/Elastic/Search.pm
@@ +172,5 @@
> +}
> +
> +sub _describe {
> +    my ($thing) = @_;
> +    

remove all whitespace

@@ +234,5 @@
> +    };
> +
> +    my $func = $class_to_func->{ref $thing} or die "nothing for $thing\n";
> +
> +    return $func->($thing);

could you just call _describe($thing) here? duplicate code.

::: Bugzilla/WebService/Elastic.pm
@@ +33,5 @@
> +
> +sub suggest_users {
> +    my ($self, $params) = @_;
> +
> +    Bugzilla->switch_to_shadow_db();

Remove this since we are not accessing the DB.

Also we should check to see that elastic_nodes is set in Params() otherwise user autocomplete 
will not work if Elastic is down. We should fall back to Bugzilla::WebService::User::get if
Elastic is not available. You are doing similar in buglist.cgi.

@@ +46,5 @@
> +    # if ($params->{limit}) {
> +    #     ThrowCodeError('param_must_be_numeric',
> +    #                    { function => 'Elastic.suggest_users', param => 'limit' })
> +    #       unless detaint_natural($params->{limit});
> +    # }

remove this section. does elasticsearch have a way to limit the amount of results? we do not want people trying to retrieve all users.

@@ +67,5 @@
> +__END__
> +
> +=head1 NAME
> +
> +Bugzilla::Webservice::Elastic - Elasticsearch-backed info

Fix POD docs to match what is in this module.

::: buglist.cgi
@@ +780,5 @@
> +    $fulltext = 1;
> +}
> +
> +$vars->{'search_description'} = $search->search_description;
> +$vars->{elastic} = $elastic;

not used in any template from what i can tell
Attachment #8826352 - Flags: review?(dkl) → review-
Attached patch 1307485_3.patch (obsolete) — Splinter Review
Everything except the docs, I'll work on that. You'll need to delete the bugzilla ES database and reindex.
Attachment #8826352 - Attachment is obsolete: true
Attachment #8838362 - Flags: review?(dkl)
Attached patch 1307485_4.patch (obsolete) — Splinter Review
glob pointed out that quicksearch=cheese was matching way more than it should -- this was because my ngram implementation was incorrect. Instead of fixing this, I've removed ngrams and turned substring searches into wildcard searches on the non-analyzed .eq fields. 

This is probably going to be slower (than the non-working ngrams ;)) but we can work at this. Maybe people won't mind if we treat substring searches on product/component as prefix searches? 

we should also encourage the use of allwords/anywords.
Attachment #8838362 - Attachment is obsolete: true
Attachment #8838362 - Flags: review?(dkl)
Attachment #8838377 - Flags: review?(dkl)
Depends on: 1330768
Comment on attachment 8838377 [details] [diff] [review]
1307485_4.patch

Review of attachment 8838377 [details] [diff] [review]:
-----------------------------------------------------------------

Looking better but atill does not proper handle resolution == '---' which means bugs with no resolution (open)

Elastic:

Status: (is equal to any of the strings) NEW () Component: (is equal to any of the strings) WebService () Product: (is equal to any of the strings) Bugzilla () Resolution: (is equal to any of the strings) --- () 
Zarro Boogs found.

elastic=0:

Status: (is equal to any of the strings) NEW ( bugs.bug_status IN ('NEW') ) Component: (is equal to any of the strings) WebService ( bugs.component_id IN (605) ) Product: (is equal to any of the strings) Bugzilla ( bugs.product_id IN (19) ) Resolution: (is equal to any of the strings) --- ( bugs.resolution IN ('') ) 
45 bugs found.

::: Bugzilla/Elastic/Indexer.pm
@@ +44,3 @@
>                  analysis => {
> +                    filter => {
> +                        asciifolding_original => { 

whitespace

::: Bugzilla/WebService/Elastic.pm
@@ +44,5 @@
> +    ThrowUserError('user_access_by_match_denied')
> +      unless Bugzilla->user->id;
> +
> +    trick_taint($params->{match});
> +    my $results = Bugzilla->elastic->suggest_users($params->{match} . "");

Should we add a fallback here to Bugzilla::WebService::User::get in case elastic is not configured or user search fails inside and eval()? We do similar for buglist.cgi.
If ElasticSearch is unreachable, we would no longer have user auto complete which I guess would not be the end of the world.

@@ +61,5 @@
> +__END__
> +
> +=head1 NAME
> +
> +Bugzilla::Webservice::Elastic - Elasticsearch-backed info

Still need to either remove POD or update to match this module and its functions.

::: template/en/default/list/list.html.tmpl
@@ +304,5 @@
> +[% END %]
> +
> +[% BLOCK elastic_disable %]
> +<br>
> +<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try without ElasticSearch</a>

Would rather see:

<strong>Note:</strong>Using ElasticSearch.
<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try same search without</a>.
Attachment #8838377 - Flags: review?(dkl) → review-
(In reply to David Lawrence [:dkl] from comment #5)
> Comment on attachment 8838377 [details] [diff] [review]
> 1307485_4.patch
> 
> Review of attachment 8838377 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Looking better but atill does not proper handle resolution == '---' which
> means bugs with no resolution (open)
> 
> Elastic:
> 
> Status: (is equal to any of the strings) NEW () Component: (is equal to any
> of the strings) WebService () Product: (is equal to any of the strings)
> Bugzilla () Resolution: (is equal to any of the strings) --- () 
> Zarro Boogs found.

Aha, I'll have to do that in the tree-pruning code.

> elastic=0:
> 
> Status: (is equal to any of the strings) NEW ( bugs.bug_status IN ('NEW') )
> Component: (is equal to any of the strings) WebService ( bugs.component_id
> IN (605) ) Product: (is equal to any of the strings) Bugzilla (
> bugs.product_id IN (19) ) Resolution: (is equal to any of the strings) --- (
> bugs.resolution IN ('') ) 
> 45 bugs found.
> 
> ::: Bugzilla/Elastic/Indexer.pm
> @@ +44,3 @@
> >                  analysis => {
> > +                    filter => {
> > +                        asciifolding_original => { 
> 
> whitespace
> 
> ::: Bugzilla/WebService/Elastic.pm
> @@ +44,5 @@
> > +    ThrowUserError('user_access_by_match_denied')
> > +      unless Bugzilla->user->id;
> > +
> > +    trick_taint($params->{match});
> > +    my $results = Bugzilla->elastic->suggest_users($params->{match} . "");
> 
> Should we add a fallback here to Bugzilla::WebService::User::get in case
> elastic is not configured or user search fails inside and eval()? We do
> similar for buglist.cgi.
> If ElasticSearch is unreachable, we would no longer have user auto complete
> which I guess would not be the end of the world.

The code works exactly that way. Actually, it does more -- if ES is down it just uses the regular matching.

https://gist.github.com/dylanwh/5aa8fb27fab2a6dd0bd565939a144ca8

> @@ +61,5 @@
> > +__END__
> > +
> > +=head1 NAME
> > +
> > +Bugzilla::Webservice::Elastic - Elasticsearch-backed info
> 
> Still need to either remove POD or update to match this module and its
> functions.
> 
> ::: template/en/default/list/list.html.tmpl
> @@ +304,5 @@
> > +[% END %]
> > +
> > +[% BLOCK elastic_disable %]
> > +<br>
> > +<a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try without ElasticSearch</a>
> 
> Would rather see:
> 
> <strong>Note:</strong>Using ElasticSearch.
> <a href="buglist.cgi?[% urlquerypart FILTER html %]&amp;elastic=0">Try same
> search without</a>.
That works for me.
(In reply to Dylan Hardison [:dylan] from comment #6)
> > Should we add a fallback here to Bugzilla::WebService::User::get in case
> > elastic is not configured or user search fails inside and eval()? We do
> > similar for buglist.cgi.
> > If ElasticSearch is unreachable, we would no longer have user auto complete
> > which I guess would not be the end of the world.
> 
> The code works exactly that way. Actually, it does more -- if ES is down it
> just uses the regular matching.
> 
> https://gist.github.com/dylanwh/5aa8fb27fab2a6dd0bd565939a144ca8

Sorry bout that. Was looking for an eval() like you have in buglist.cgi and didnt see it.

dkl
Attached patch 1307485_6.patchSplinter Review
Note: this adds reporter to the index.
Attachment #8838377 - Attachment is obsolete: true
Attachment #8843450 - Flags: review?(dkl)
Comment on attachment 8843450 [details] [diff] [review]
1307485_6.patch

Review of attachment 8843450 [details] [diff] [review]:
-----------------------------------------------------------------

r=dkl
Attachment #8843450 - Flags: review?(dkl) → review+
Summary: Bugzilla::Elastic::Search → Add search to run a subset of buglist.cgi search queries against the ES backend
Summary: Add search to run a subset of buglist.cgi search queries against the ES backend → Add code to run a subset of buglist.cgi search queries against the ES backend
To git@github.com:mozilla-bteam/bmo.git
   1ef7b44..9c26c01  master -> master
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Backing this out while I investigate bug 1343533
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 1348379
Depends on: 1348380
To github.com:mozilla-bteam/bmo.git
   4cc54c4c9..1ad85fd04  master -> master

This will go out assuming bug 1348380 can go out.
To github.com:mozilla-bteam/bmo.git
   e53de8c34..b921e3142  master -> master

Next week, with bug 1348380
Depends on: 1359224
To github.com:mozilla-bteam/bmo.git
   0ea91b298..ffe838a92  master -> master
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: