Closed
Bug 1428227
Opened 6 years ago
Closed 6 years ago
Google doesn't index bugzilla.mozilla.org any more
Categories
(bugzilla.mozilla.org :: Search, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: fireattack, Assigned: dylan)
Details
Attachments
(1 file)
User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.71 Safari/537.36

Steps to reproduce:

Recently I found that a Google site search for BMO returns barely any results (a regression, as it worked fine before). An extreme example is searching for just "site:bugzilla.mozilla.org":
https://www.google.com/search?q=site%3Abugzilla.mozilla.org

Actual results:

It returns only 3 results here. DDG and Bing don't seem to be affected:
https://duckduckgo.com/?q=site%3Abugzilla.mozilla.org&t=hg&ia=web
https://www.bing.com/search?q=site%3Abugzilla.mozilla.org

I noticed this problem 2 days ago and thought it might be temporary, but it persists. It is the same with other keywords. Compare:
https://www.google.com/search?q=webext+site%3Abugzilla.mozilla.org
with
https://duckduckgo.com/?q=webext+site%3Abugzilla.mozilla.org&t=hg&ia=web
https://www.bing.com/search?q=webext%20site%3Abugzilla.mozilla.org

Expected results:

It should show more results.

Additional notes:

It also affects the built-in BMO Google search feature (https://bugzilla.mozilla.org/query.cgi?format=google), since it just redirects you to Google.

I checked https://bugzilla.mozilla.org/robots.txt but didn't see anything strange; it does have `Allow: /show_bug.cgi`. Another user reports that pages are now marked as <meta name="robots" content="noindex" />; I'm not sure whether that is new or whether it has any effect.
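One way to check the reported meta tag is to parse a saved copy of a page and list its robots directives. The following is only a minimal sketch using Python's standard library; the names `RobotsMetaParser`, `robots_directives`, and `is_noindex` are made up for illustration and are not part of BMO.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; self-closing
        # tags such as <meta ... /> are also routed through this method.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def robots_directives(html):
    """Return all robots directives found in the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_noindex(html):
    """True if the page asks crawlers not to index it."""
    return "noindex" in robots_directives(html)
```

Feeding it the source of a show_bug.cgi page would show whether that page carries `noindex`; a normal bug page is expected to return False.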
Comment 1 • 6 years ago (Assignee)
Only error pages should have noindex. However we're putting nofollow on all links -- maybe we should only do that for links that lead off-site.
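The idea of restricting nofollow to off-site links could look roughly like the sketch below. It is an assumption about how such a check might be written, not BMO's link-rendering code; `SITE_BASE`, `is_offsite`, and `rel_for` are hypothetical names.

```python
from urllib.parse import urljoin, urlsplit

# Assumed base URL of the site; relative links are resolved against it.
SITE_BASE = "https://bugzilla.mozilla.org/"


def is_offsite(href, base=SITE_BASE):
    """True if href resolves to an http(s) URL on a different host than base."""
    target = urlsplit(urljoin(base, href))
    if target.scheme not in ("http", "https"):
        # Non-web schemes (mailto:, javascript:, ...) are not treated as off-site here.
        return False
    return target.hostname != urlsplit(base).hostname


def rel_for(href, base=SITE_BASE):
    """Return the rel attribute value to use for a link: nofollow only off-site."""
    return "nofollow" if is_offsite(href, base) else ""
```

With this, `rel_for("show_bug.cgi?id=1428227")` yields an empty string, while a link to another host gets `nofollow`.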
Comment 2 • 6 years ago (Assignee)
Oh wow, would you look at that? The logic is totally wrong. I'll have a fix up in a jiffy.
Assignee: nobody → dylan
Comment 3 • 6 years ago (Assignee)
Instead of a "noindex" flag, which is a boolean (and a double negative), we just mirror the 'robots' meta directive as a variable. So we set robots = noindex on the error pages, and robots = index everywhere else by default. I originally reviewed this code, and I didn't test that anything *wasn't* set to noindex.
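The approach described here can be sketched as follows. This is not the attached patch; it is a small Python illustration of "mirror the directive as a variable with a default", and `robots_meta` and `error_page_vars` are hypothetical names.

```python
def robots_meta(page_vars):
    """Render the robots meta tag, defaulting to "index" when nothing is set."""
    robots = page_vars.get("robots", "index")
    return f'<meta name="robots" content="{robots}">'


def error_page_vars(message):
    """Template variables for an error page: the only place that opts out of indexing."""
    return {"robots": "noindex", "message": message}


# Ordinary pages set nothing and are indexable by default.
normal_tag = robots_meta({})
# Error pages explicitly carry noindex.
error_tag = robots_meta(error_page_vars("Bug not found"))
```

Because the default is stated positively, a test can assert that an ordinary page is *not* noindex, which is the check that was missing.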
Attachment #8940040 -
Flags: review?(ehumphries)
Attachment #8940040 -
Flags: review?(ehumphries) → review+
Comment 4 • 6 years ago (Assignee)
Note to self: after the push, use Google Webmaster Tools to request a re-index.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(dylan)
Resolution: --- → FIXED
Comment 5 • 6 years ago (Assignee)
Hey, how can I get Google to re-index us? Will it just happen or ... ?
Flags: needinfo?(dylan) → needinfo?(glob)
IIRC it should happen automatically because of the sitemap. https://support.google.com/webmasters/answer/6065812?hl=en hints this may take several days.

Google's sitemap view does indeed show that a lot of pages are being ignored:
> 1,317,302 Submitted; 3,388 Indexed
https://www.google.com/webmasters/tools/sitemap-list?hl=en&siteUrl=http://bugzilla.mozilla.org/#MAIN_TAB=1&CARD_TAB=-1
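To put those two sitemap numbers in proportion, a quick calculation (a throwaway sketch; `indexed_fraction` is just an illustrative helper name):

```python
def indexed_fraction(submitted, indexed):
    """Fraction of submitted sitemap URLs that the search engine reports as indexed."""
    return indexed / submitted


# Figures quoted from the sitemap view above.
fraction = indexed_fraction(1_317_302, 3_388)
print(f"{fraction:.2%} of submitted URLs are indexed")
```

That works out to roughly a quarter of one percent of the submitted URLs.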
Flags: needinfo?(glob)