Closed Bug 778077 Opened 12 years ago Closed 10 years ago

Implement fastSeek API on media elements (and switch the built-in controls over to it)

Categories

(Core :: Audio/Video, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla31
feature-b2g 2.0
Tracking Status
firefox31 --- fixed

People

(Reporter: kinetik, Assigned: cpearce)

References

(Depends on 1 open bug, )

Details

(Keywords: dev-doc-complete, Whiteboard: [ucid:multimedia13,2.0,ft:multimeida-platform])

Attachments

(2 files, 1 obsolete file)

I can't find a bug for this, despite being fairly sure we had one.

Implement the fastSeek API for media elements being proposed here: https://www.w3.org/Bugs/Public/show_bug.cgi?id=14851, specifically Hixie's proposal in comment 9.
We currently use fast seek for WebM and accurate seek for all other formats.
Should elem.mozFastSeek() fail as not implemented or simply perform an accurate seek in a media format where faster seek can't be done (like raw)?
(In reply to Reuben Morais [:reuben] from comment #1)
> We currently use fast seek for WebM and accurate seek for all other formats.

That's not correct, WebM seeking is frame and sample accurate.  It does use an index if one is available in the file, but so does Ogg if an index is present.  Most of the actual seeking logic is shared between all decoder implementations.

Sadly, doing "fast" seeking in WebM without an index is fairly painful, since it doesn't provide the same resync guarantees that Ogg does.

> Should elem.mozFastSeek() fail as not implemented or simply perform an
> accurate seek in a media format where faster seek can't be done (like raw)?

Accurate seeking is usually implemented by seeking to the nearest preceding resync point and then decoding up to the exact position requested.  It's hard to imagine a format where accurate seeking is possible but approximate seeking is not.  Why don't you think it's possible for raw?
(In reply to Matthew Gregan [:kinetik] from comment #2)
> (In reply to Reuben Morais [:reuben] from comment #1)
> > We currently use fast seek for WebM and accurate seek for all other formats.
> 
> That's not correct, WebM seeking is frame and sample accurate.  It does use
> an index if one is available in the file, but so does Ogg if an index is
> present.  Most of the actual seeking logic is shared between all decoder
> implementations.

Hm, the comment at https://mxr.mozilla.org/mozilla-central/source/media/libnestegg/include/nestegg.h#183 made me think the algorithm would always seek to the nearest key frame.

> > Should elem.mozFastSeek() fail as not implemented or simply perform an
> > accurate seek in a media format where faster seek can't be done (like raw)?
> 
> Accurate seeking is usually implemented by seeking to the nearest preceding
> resync point and then decoding up to the exact position requested.  It's
> hard to imagine a format where accurate seeking is possible but approximate
> seeking is not.  Why don't you think it's possible for raw?

I had the impression raw videos had no interframe compression so every frame was a keyframe, making approximate seeking just as costly as accurate seeking.
(In reply to Reuben Morais [:reuben] from comment #3)
> Hm, the comment at
> https://mxr.mozilla.org/mozilla-central/source/media/libnestegg/include/
> nestegg.h#183 made me think the algorithm would always seek to the nearest
> key frame.

Look at nsWebMReader::Seek and nsBuiltinDecoderReader::DecodeToTarget.

> I had the impression raw videos had no interframe compression so every frame
> was a keyframe, making approximate seeking just as costly as accurate
> seeking.

More resync points make seeking easier, because there's less to decode before arriving at the desired timecode.
(In reply to Matthew Gregan [:kinetik] from comment #4)
> nsBuiltinDecoderReader::DecodeToTarget.

A-ha! There it is, I knew I was missing something in this complex class hierarchy. So, assuming all readers have the ability to seek to the nearest key frame, essentially all you'd need to do is skip the DecodeToTarget call and adjust mCurrentTime appropriately on the media element?
Attached patch WIP patch (obsolete) — Splinter Review
I've been working on this bug. Attached is a work in progress combined patch.
Todo, mostly as a note to myself:
- Enforce the invariants mentioned in https://www.w3.org/Bugs/Public/show_bug.cgi?id=14851
- Ogg backend seeks to offset in bytes, needs extra logic to make sure we're in a keyframe.
- Make sure currentTime, seeking, etc are all updated accordingly on the media element during a fast seek
- Test other backends extensively
Sorry guys, college is consuming 25 hours of my days so I can't work on this right now, feel free to take it.
Depends on: 882718
I'll do this. We want this to speed up thumbnailing on B2G, specifically on low end devices like the Tarako.
Assignee: nobody → cpearce
Block Firefox OS multimedia platform 1.5 feature meta bug
* Implement HTMLMediaElement.fastSeek(), basically by changing all the MediaDecoderReader::Seek() overrides to not call MediaDecoderReader::DecodeToTarget(), and have MediaDecoderReader::DecodeSeek() call DecodeToTarget() if we're doing an accurate (non-fast) seek.
* Update gizmo.mp4 to have a keyframe every second, instead of only 1 keyframe at the start of stream. This makes the unit test I added more useful for mp4...
* I pushed most of the seek target clamping logic in MediaDecoder up into HTMLMediaElement, so that we're clamping in fewer places. Note MediaDecoderStateMachine::Seek() still sanity checks the seek target.
* We have to update the currentTime/MediaDecoder playback position after a seek completes now, rather than assuming the seek always got it exactly right.
* Removed those pesky assertions about seek target lying in the first frame after seek, since actually sometimes the media doesn't have samples for all streams after a seek (either due to the media being encoded like that, or because of a bug in the platform's decoder, not entirely sure).
* Green: https://tbpl.mozilla.org/?tree=Try&rev=b028258565e2
Attachment #662418 - Attachment is obsolete: true
Attachment #8396889 - Flags: review?(cajbir.bugzilla)
Comment on attachment 8396889 [details] [diff] [review]
Patch v1: Implement HTMLMediaElement.fastSeek()

Review of attachment 8396889 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with changes below.

::: content/html/content/public/HTMLMediaElement.h
@@ +35,5 @@
>  class ErrorResult;
>  class MediaResource;
>  class MediaDecoder;
>  class VideoFrameContainer;
> +class SeekTarget;

Why is this needed? I don't see SeekTarget used - although perhaps it should be to avoid the boolean parameter added to Seek.

@@ +875,5 @@
>    void PopulatePendingTextTrackList();
>  
> +  // Seeks to aTime seconds. If aDoFastSeek is true, this instead seeks to
> +  // the preceeding sync point, or thereabouts.
> +  void Seek(double aTime, bool aDoFastSeek, ErrorResult& aRv);

I'm not a fan of introducing boolean parameters. I'd prefer a SeekType enum with values like Exact and EarliestKeyframe (assuming that's what fastseek is). Then have this function take that as an argument.

::: content/html/content/src/HTMLMediaElement.cpp
@@ +1336,5 @@
> +}
> +
> +/**
> + * Returns true if aValue is inside a range of aRanges, and put the range
> + * index in aIntervalIndex if it is not null.

If what is not null? aIntervalIndex is a reference and can't be null.

@@ +1338,5 @@
> +/**
> + * Returns true if aValue is inside a range of aRanges, and put the range
> + * index in aIntervalIndex if it is not null.
> + * If aValue is not inside a range, false is returned, and aIntervalIndex, if
> + * not null, is set to the index of the range which ends immediately before aValue

Same here re 'not null'

@@ +1345,5 @@
> +static bool
> +IsInRanges(dom::TimeRanges& aRanges, double aValue, int32_t& aIntervalIndex)
> +{
> +  uint32_t length;
> +  aRanges.GetLength(&length);

aRanges.GetLength is defined to return an nsresult. I know that the implementation always returns NS_OK but if that's ever changed then this will silently pass and 'length' will be used uninitialized further on. I suggest either (a) check the return value and do something, or (b) initialize 'length' to zero so failure at least won't have undefined results. I'd go for (b).

@@ +1348,5 @@
> +  uint32_t length;
> +  aRanges.GetLength(&length);
> +  for (uint32_t i = 0; i < length; i++) {
> +    double start, end;
> +    aRanges.Start(i, &start);

Same here regarding initialization and error result. In this case however Start() can actually fail so it is possibly an issue.

@@ +1353,5 @@
> +    if (start > aValue) {
> +      aIntervalIndex = i - 1;
> +      return false;
> +    }
> +    aRanges.End(i, &end);

Check error result or intialize 'end'.

::: content/media/MediaDecoder.h
@@ +344,5 @@
>  
>    // Seek to the time position in (seconds) from the start of the video.
> +  // If aDoFastSeek is true, we'll seek to the sync point/keyframe preceeding
> +  // the seek target.
> +  virtual nsresult Seek(double aTime, bool aDoFastSeek);

Don't use boolean parameter.

::: content/media/test/test_fastSeek.html
@@ +24,5 @@
> +      v.fastSeek(v.target);
> +      ok(Math.abs(v.currentTime - v.target) < 0.01,
> +         v.name + " seekTo=" + v.target + " currentTime (" + v.currentTime + ") should be close to seek target initially");
> +    }
> +    

trailing white space here and in other places in the file.
Attachment #8396889 - Flags: review?(cajbir.bugzilla) → review+
Comment on attachment 8396889 [details] [diff] [review]
Patch v1: Implement HTMLMediaElement.fastSeek()

Review of attachment 8396889 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the review!

::: content/html/content/public/HTMLMediaElement.h
@@ +875,5 @@
>    void PopulatePendingTextTrackList();
>  
> +  // Seeks to aTime seconds. If aDoFastSeek is true, this instead seeks to
> +  // the preceeding sync point, or thereabouts.
> +  void Seek(double aTime, bool aDoFastSeek, ErrorResult& aRv);

OK, I'll #include "MediaDecoder.h" in HTMLMediaElement.h and use SeekTarget::Type.
bug 6331058 is to blame, here's a try push without bug 6331058, and it's green:
https://tbpl.mozilla.org/?tree=Try&rev=46f7f3566c8d
(In reply to Ed Morley [:edmorley UTC+0] from comment #18)
> Backed out for reftest failures:
> https://tbpl.mozilla.org/?tree=Mozilla-
> Inbound&rev=00fa39c23b44&jobname=reftest
> 
> remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/525d4e2eab84

seems https://tbpl.mozilla.org/php/getParsedLog.php?id=36868539&tree=Mozilla-Inbound is also a related failure
(In reply to Ed Morley [:edmorley UTC+0] from comment #18)
> Backed out for reftest failures:
> https://tbpl.mozilla.org/?tree=Mozilla-
> Inbound&rev=00fa39c23b44&jobname=reftest
> 
> remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/525d4e2eab84

This is due to the logic changes I made to DecodeToTarget() while I was adding logging to it. I will revert the logic changes I made, but keep the additional logging.


(In reply to Carsten Book [:Tomcat] from comment #19)
> (In reply to Ed Morley [:edmorley UTC+0] from comment #18)
> > Backed out for reftest failures:
> > https://tbpl.mozilla.org/?tree=Mozilla-
> > Inbound&rev=00fa39c23b44&jobname=reftest
> > 
> > remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/525d4e2eab84
> 
> seems
> https://tbpl.mozilla.org/php/getParsedLog.php?id=36868539&tree=Mozilla-
> Inbound is also a related failure

This is caused by MediaOMXReader demuxing and seeking audio and video streams separately. When doing fastSeek the MediaOMXReader seeks both audio and video to the "key frame" preceeding the seek target. But because typically audio has many more sync points than video, the audio stream can be seeked to much closer to the seek target, and because we use the audio to drive the clock when we have both audio and video streams, we'll set the current time to the timestamp of the audio stream after the seek, which is not near the video stream's keyframe, so the test is failing.

We can fix this by changing the MediaOMXReader to seek the video stream first, getting the timestamp of the first video sample after the seek, and then seeking the audio stream to that timestamp.
This applies on top of the previous. Fixes failures are per previous comment.

https://tbpl.mozilla.org/?tree=Try&rev=63b5ecb927eb
Attachment #8399725 - Flags: review?(cajbir.bugzilla)
Attachment #8399725 - Flags: review?(cajbir.bugzilla) → review+
https://hg.mozilla.org/mozilla-central/rev/9c0afbe41ce8
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
Does this need any QA testing before this code is released, perhaps some video control regression testing?
Flags: needinfo?(cpearce)
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #24)
> Does this need any QA testing before this code is released, perhaps some
> video control regression testing?

Yes please! Currently this is enabled in Fennec Nightly builds in the built-in video controls. You can test by just seeking with the built in controls.

I still need to create a Gaia bug to get this used for seeking in the Video App in Firefox OS, but seeking in the Firefox OS Browser App should be using this, and should be faster.
Flags: needinfo?(cpearce)
Does this only apply to mobile builds? Is desktop impacted?
The feature ships in desktop builds, but the video controls don't use it on desktop. The videocontrols only use fastSeek when operating in "touch" mode, i.e. on mobile.
For b2g, do we have STR or pass criteria for this bug?
(In reply to Paul Yang [: pyang] from comment #28)
> For b2g, do we have STR or pass criteria for this bug?

Yes. To verify that this works correctly, you test that seeking in an HTML5 <video> works using the built in controls in the B2G Browser App.

i.e.
1. Open http://pearce.org.nz/h in B2G Browser App.
2. Seek the video on that page using the controls displayed on the video

If that works, it's verified.

Note that the Gaia Video App doesn't yet use fastSeek, that's being worked on in bug 996398.
I just tried and find some difficulties.
Can't scroll controll since events of moving screen, long tapping are triggered at the same time.  Who should I ask for?
(In reply to Chris Pearce (:cpearce) from comment #29)
> Yes. To verify that this works correctly, you test that seeking in an HTML5
> <video> works using the built in controls in the B2G Browser App.
> 
> i.e.
> 1. Open http://pearce.org.nz/h in B2G Browser App.
> 2. Seek the video on that page using the controls displayed on the video
> 
> If that works, it's verified.

That doesn't actually verify fastSeek works does it? It just verifies seeking works. The verification steps above work with or without the fastSeek patch. What's probably needed is an example that uses keyframes wide enough apart to tell the difference between fastSeek and normal seek.
Whiteboard: [ucid:multimedia13,2.0,ft:multimeida-platform]
Sounds reasonable.  Chris - Do we have an appropriate sample for this, or we can reuse the one you provided, but just show the trick to enable fastseek?  Thanks in advance.

(In reply to cajbir (:cajbir) from comment #31)
> (In reply to Chris Pearce (:cpearce) from comment #29)
> > Yes. To verify that this works correctly, you test that seeking in an HTML5
> > <video> works using the built in controls in the B2G Browser App.
> > 
> > i.e.
> > 1. Open http://pearce.org.nz/h in B2G Browser App.
> > 2. Seek the video on that page using the controls displayed on the video
> > 
> > If that works, it's verified.
> 
> That doesn't actually verify fastSeek works does it? It just verifies
> seeking works. The verification steps above work with or without the
> fastSeek patch. What's probably needed is an example that uses keyframes
> wide enough apart to tell the difference between fastSeek and normal seek.
Flags: needinfo?(cpearce)
You can use this test case:
http://people.mozilla.org/~cpearce/fastSeek/
Use the big sliders beneath the video to test slow and fast seek, an average time taken is displayed.

This video has a 600-frame keyframe distance. Which is ridiculous, but it demonstrates that fastSeek is better on this video.
Flags: needinfo?(cpearce)
Flags: in-moztrap-
feature-b2g: --- → 2.0
Depends on: 1023771
Depends on: 1026330
Flags: in-qa-testsuite?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: