984726 - [Gaia] Add screenshot reftests

Reporter

Description

•

10 years ago

We would like to have UI tests to keep the result of css changes and the UI of each app always correct.

For this, we could compare screenshots of correct version and the patch version.
If they are same, we could believe the patch is good for css changes and UI.

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Summary: UI test for style changes. → UI test for UI and css changes.

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Summary: UI test for UI and css changes. → Tests for UI and CSS changes.

Evan Tseng [:evanxd]

Reporter

Comment 1

•

10 years ago

There is one way we already had WIP patch to achieve this.

It is a local tool to generate a report includes screenshots of current and previous version in the git branch and the diff results between these two versions.
Please refer to http://bugzil.la/926801 - Generate a report shows screenshots in current and previous version.

How do you guys think?

Flags: needinfo?(gaye)

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Flags: needinfo?(jlal)

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Flags: needinfo?(mike)

Mike Pennisi [:jugglinmike]

Comment 2

•

10 years ago

Hey Evan,

This is an interesting but tricky problem. How do you see this functionality being used? During continuous integration? Or only locally during development?

Flags: needinfo?(mike)

Evan Tseng [:evanxd]

Reporter

Comment 3

•

10 years ago

Hi Mike,

The tool in Comment 1 is only for locally development.

For continuous integration, I think we could add screenshot tests in Gaia to check CSS and UI related changes in patches.

The below is an idea to run the screenshot tests on Travis.
We could store all correct screenshot images in a GitHub repo,
then we could add marionette tests to capture screenshot and compare with the correct one. If they are same, the test is passed.

Once we have these thing on Travis, I think we could have less and less CSS and UI related regression.

How do you think?

Thanks.

Evan Tseng [:evanxd]

Reporter

Comment 4

•

10 years ago

And do you guys have any good idea to skip the CSS and UI related regression?
I would like to know that.

I hope we could have a solution to skip the CSS and UI related regression.
Thanks.

Mike Pennisi [:jugglinmike]

Comment 5

•

10 years ago

(In reply to Evan Tseng [:evanxd][:愛聞插低] from comment #4)
> And do you guys have any good idea to skip the CSS and UI related regression?
> I would like to know that.
> 
> I hope we could have a solution to skip the CSS and UI related regression.
> Thanks.

If I get your meaning, then you may be interested in TravisCI's "allow_failures" option.

Evan Tseng [:evanxd]

Reporter

Comment 6

•

10 years ago

Hi Mike,

Sorry, we don't have any "allow_failures" currently.
See it at https://github.com/mozilla-b2g/gaia/blob/master/.travis.yml.

So what do you mean that interested in TravisCI's "allow_failures" option.
Thanks.

Evan Tseng [:evanxd]

Reporter

Comment 7

•

10 years ago

Hi Kevin and Marcus,

I have an idea to skip the css and UI related regression.

We could store all correct screenshot images in a GitHub repo,
then we could add marionette tests to capture screenshot and compare it with the correct one. If they are same, the test is passed.

Once we have these thing on Travis, I think we could have less and less CSS and UI related regression.

How do you guys think?

Thanks.

Flags: needinfo?(kgrandon)

Summary: Tests for UI and CSS changes. → Test cases to skip css and UI related regression.

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Flags: needinfo?(m)

Kevin Grandon :kgrandon

Comment 8

•

10 years ago

Hi, so it looks like you want to implement reftests for gaia. This is definitely a good thing and would help catch a lot of the regressions that we have been seeing. I am a bit worried about fonts, and rendering on different platforms. If we only deal with one CI env (travis), it may not be too bad - we just need a way to easily generate new screenshots.

We should try to get this working with a single test first. Also - why do you want it to be an allowed failure? Any jobs we add to travis now need to fail the tree if they fail.

I'm also thinking that you could do this today inside of a marionette test by calling screenshot() and comparing to a data URI in a file on disk. It would be interesting to put together a prototype that did this. Perhaps we could add a folder inside of each app, apps/calendar/test/ref/add_account_test.js for example.

Flags: needinfo?(kgrandon)

Kevin Grandon :kgrandon

Comment 9

•

10 years ago

I'm renaming this bug to something that I think is a little bit more clear to me.

Also if you are not familiar you may want to see how the platform handles this: https://developer.mozilla.org/en/docs/Creating_reftest-based_unit_tests (though our cases are quite a bit different)

Summary: Test cases to skip css and UI related regression. → [Gaia] Add screenshot reftests

Kevin Grandon :kgrandon

Updated

•

10 years ago

OS: Mac OS X → All

Hardware: x86 → All

Marcus Cavanaugh [:mcav]

Comment 10

•

10 years ago

I think reftests could be neat if implemented in Gaia, per what Kevin says.

Flags: needinfo?(m)

Evan Tseng [:evanxd]

Reporter

Comment 11

•

10 years ago

(In reply to Kevin Grandon :kgrandon from comment #8)
> Hi, so it looks like you want to implement reftests for gaia. This is
> definitely a good thing and would help catch a lot of the regressions that
> we have been seeing. I am a bit worried about fonts, and rendering on
> different platforms. If we only deal with one CI env (travis), it may not be
> too bad - we just need a way to easily generate new screenshots.
Yes, I think we could run the reftests on Travis first.

> We should try to get this working with a single test first. Also - why do
> you want it to be an allowed failure? Any jobs we add to travis now need to
> fail the tree if they fail.
I think there might be some misunderstanding happened here. I do not want it to be an allowed failure. I hope patches must pass the tests.
 
> I'm also thinking that you could do this today inside of a marionette test
> by calling screenshot() and comparing to a data URI in a file on disk. It
> would be interesting to put together a prototype that did this. Perhaps we
> could add a folder inside of each app,
> apps/calendar/test/ref/add_account_test.js for example.
Sure, let's do a prototype to get the detailed feedback.

Evan Tseng [:evanxd]

Reporter

Comment 12

•

10 years ago

(In reply to Marcus Cavanaugh [:mcav] <mcav@mozilla.com> from comment #10)
> I think reftests could be neat if implemented in Gaia, per what Kevin says.

Thanks. :)

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Assignee: nobody → evanxd

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Depends on: 993247

Evan Tseng [:evanxd]

Reporter

Comment 13

•

10 years ago

We could add reference test for keyboard app as a prototype of the idea in Comment 7.

Kevin Grandon :kgrandon

Comment 14

•

10 years ago

I assume you've checked with Rudy or a keyboard peer first? We can do it for calendar no problem, but I just want to make sure the module owner is up to speed :)

Evan Tseng [:evanxd]

Reporter

Updated

•

10 years ago

Depends on: 993276

Evan Tseng [:evanxd]

Reporter

Comment 15

•

10 years ago

Hi Kevin,

Yes, Rudy knew these things.
And the test(Bug 993247) for keyboard app is blocked by Bug 952443(about b2g focus issue).

Sure, let's do this for Calendar first, and the bug is Bug 993276.

Miller Medeiros [:millermedeiros]

Comment 16

•

10 years ago

TL;DR; I think it's a bad idea, this won't make us move any faster.

I don't think the amount of work/bureaucracy is justified. This will probably cause more headaches than it will avoid. There is no such a thing as a pixel-perfect layout.

The calendar app changes the UI based on the current date... It will also make it really hard to make CSS changes and still pass the test (small design tweaks). You basically need to make the change, execute test on travis, grab a screenshot from travis, update the reference image and push a new commit.. People will make mistakes, the tree will be closed because of perma-red tests... We won't know if error was caused by a real regression or if it was a valid update.

If people are making mistakes than they should be encouraged to write more unit/integration tests (that works locally) to validate the correctness of the app, and also to always test things manually before doing a pull request. Visual errors are very easy to catch by a human, specially since the apps doesn't have that many different screens.

Andrew Sutherland [:asuth] (he/him)

Comment 17

•

10 years ago

As a counterpoint, I do think this makes sense on an opt-in basis for apps, probably via a plugin helper that lets you checkpoint the current visual state with a name.  Ex:
  client.screenshot.saveNow('email-message-list-full');
  ...
  client.screenshot.saveNow('email-message-list-empty');

Specifically as a value proposition:
- email has seen visual regressions from building blocks changing things.  (It would still be nice if apps each had their own specific checkout of building blocks, of course.)
- it can be a hassle to manually verify that there aren't visual regressions in e-mail because there are so many different cards.  It would be amazing if I could see all the e-mail cards that changed from a single commit.  (Note that this is mainly for building block changes again, and with web components and scoped CSS and such the chances for fallout are hopefully reduced.)

Obviously apps would only want to trigger snapshots in cases where:
- They can keep a pixel-identical layout.  So date-dependent and other potential random factors need to be eliminated.
- The developers care about how all of the given things look.

It might also make sense to include some type of normalization transform mechanism in the screenshot process where it's a hassle to control the data values.  For example, give it some selectors that it should synchronously replace the textContent of with 'lorem ipsum', synchronously screenshot, then synchronously put everything back.

Evan Tseng [:evanxd]

Reporter

Comment 18

•

10 years ago

Good idea, Andrew. We could capture screenshots on an pot-in basis for apps.

So I think we could do like `make test-integration APP=email SCREENSHOT=1` to show the screenshots in the terminal. The default value of `SCREENSHOT` argument is 0, and it means we will not capture and print screenshots.

And we could just reuse https://github.com/mozilla-b2g/marionette-debug and do the below script in test code to capture and print the screenshot.
```
client.debug.screenshot('email-message-list-full');
client.debug.screenshot('email-message-list-empty');
```

How do you guys think?

Flags: needinfo?(mike)

Evan Tseng [:evanxd]

Reporter

Comment 19

•

10 years ago

Hi Gareth, James, Mike,

How do you think about Comment 18?

Arthur Chen [:arthurcc][inactive after 6/30]

Comment 20

•

10 years ago

This kind of test is good for catching regressions for apps that heavily rely on shared styles. To have tolerance for changes introduced by string changes or font, maybe we can have a threshold and warns developers only when the amount of difference is over a certain threshold. And we should be able to update the reference screenshots very easily as it may be painful doing the update when changing shared styles.

James Lal [:lightsofapollo]

Comment 21

•

10 years ago

We need to revisit this soonish... I think having reftests is useful I will write up something post verticalhome (we could use this for some tests there at a minimum)

Flags: needinfo?(jlal)

Mike Pennisi [:jugglinmike]

Updated

•

10 years ago

Flags: needinfo?(mike)

Evan Tseng [:evanxd]

Reporter

Updated

•

8 years ago

Assignee: evan → nobody

Evan Tseng [:evanxd]

Reporter

Updated

•

8 years ago

Flags: needinfo?(gaye)

Firefox Bug Husbandry Bot

Comment 22

•

6 years ago

Bulk closed as part of https://bugzilla.mozilla.org/show_bug.cgi?id=1422750

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → INCOMPLETE

BMO Automation

Updated

•

6 years ago

Product: Testing → Testing Graveyard