Closed Bug 1238038 Opened 8 years ago Closed 2 years ago

Allow capture from more than one mic at a time in getUserMedia

Categories

(Core :: WebRTC: Audio/Video, defect, P2)

defect

Tracking

()

RESOLVED FIXED
101 Branch
Tracking Status
firefox45 --- unaffected
firefox46 --- unaffected
firefox47 --- unaffected
firefox48 --- unaffected
firefox-esr45 --- unaffected
relnote-firefox --- 101+
firefox-esr91 --- wontfix
firefox99 --- wontfix
firefox100 --- wontfix
firefox101 --- fixed

People

(Reporter: jesup, Assigned: chunmin)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: regression)

Attachments

(12 files, 17 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
As part of the full-duplex work, we've temporarily regressed an ability - capturing from more than one mic with getUserMedia at a time.  This is only partly/poorly supported in Chrome, and our understanding is that virtually no one is currently using it.

It is a useful feature for music recording and a few other cases, but music recording requires disabling of the AEC and NS, which we haven't landed yet.

We plan to land support for this within 1 revision of full-duplex landing.
Rank: 17
Blocks: 1188022
Has no affect until full-duplex actually is turned on
Assignee: rjesup → dminor
I'll pick this up as a Q4 goal. It is important because it is necessary to support switching between microphones during a call.
Status: NEW → ASSIGNED
Discussion on IRC in reference to implementation using multiple, communicating MediaStreamGraphs:

jesup> padenot: you'd need to plan for: open output; open input (convert MSG to full_duplex); open second input (new MSG); close input 1 -- you now have an input MSG and an output MSG - do you combine them into a single full-duplex MSG?  How?  do you leave them as separate MSGs with a rate converter and  buffer inbetween? 
13:20 < jesup> btw, that's the "normal" case of "I want to switch my mic during a call"
13:23 < jesup> The alternative has it's own issues - on second open, start an input-only callback driver, and route that through a rate converter.  On close of stream 1, you close that input-only driver and reopen the output driver as full-duplex, and dump the rate converter.  Note: a small glitch may be impossible to avoid, but you are switching mics already, and in most cases the new input is just-opened.
13:27 < jesup> in this alternative there's one MSG.  When you add multiple outputs into the picture, you have multiple MSGs run from (output or full-duplex) callback drivers.  Which MSG(s) do inputs get associated with (for full-duplex)?  What if we choose the "wrong" one?  My guess would be to put the first input on the "default" output MSG (if we can know which is default), and then:
13:28 < jesup> for multiple MSGs for multi-mic input: add input-only MSGs for additional inputs, and for the alternative put all additional inputs onto the same MSG as the first.     Note that both of these come with added complications if an output being used for full-duplex closes and we have other active outputs
13:28 < jesup> (flip side of the original case I gave above)

It looks like the plan is to try the one MSG alternative as setting up multiple MSGs is a substantial amount of work.
I started in on this, but it got ugly quickly. I think it either needs someone with more experience with the MSG code to do some refactoring along the way, or we should back up and consider tackling the multiple MSG approach that Paul suggested.
Assignee: dminor → nobody
Status: ASSIGNED → NEW
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
how does this affect sites that want to allow changing the microphone in a settings screen during the call?
The current microphone has to be kept open while that dialog is open (might be subject to discussion), then GUM is called again with a new device id which opens a second microphone.

We do this on appear.in and I am quite surprised this doesn't error. It probably doesn't work as intended though but with microphones that is hard to notice.
We won't initially allow multiple microphone in the same document. This problem is caused by an insane amount of technical debt and I can't fix everything at once.

Jitsi issue https://github.com/jitsi/jitsi-meet/issues/2835 depends on this one

I'm interested in finding out when this might get fixed.

This here is required for Jitsi Meet to work properly.
As such, as you had some whiteboard for [jitsi-meet], please assign that to this. (It is not, currently.)

I've involved in the discussion about how to implement it. I am taking the bug for now. Perhaps we can split this goal into several smaller tasks so it will be clearer what we are going to do.

Assignee: nobody → cchang

(In reply to C.M.Chang[:chunmin] from comment #14)

I've involved in the discussion about how to implement it. I am taking the bug for now. Perhaps we can split this goal into several smaller tasks so it will be clearer what we are going to do.

Thank you!

Brief update from the discussion: Now our architecture (simplified version) is something like this (for microphone device). The goal is to implement this design.

Depends on: 1702646

Why many local audio streams can't be merged into a single stream ? Maybe this way could spare the bandwidth ? Even if you keep 2 channels (for stereo) into the single stream, would it allow the codec to achieve a few optimizations ?

(In reply to Jérôme from comment #18)

Why many local audio streams can't be merged into a single stream ?

They might not come from the same physical device, and so will drift with respect to each other. This is the main complexity in this bug, keep the low-latency, but reclock audio input devices.

To add a non-native input source in the MediaTrackGraph, we need a
input-only cubeb stream.

Depends on D116534

Add a AudioInputStream class that will generate audio input data from
one thread and it allows its user to read the data from another thread.

Attachment #9224575 - Attachment description: WIP: Bug 1238038 - Implement CubebInputStream class → WIP: Bug 1238038 - Create a cubeb input stream wrapper

The following patches will add an abstract MediaTrack for the audio
input devices. The NativeInputTrack will be one implementation of it.
These MediaTracks holding the audio input data should be put together so
this patch moves the NativeInputTrack into individual .h and .cpp files,
which will also be the places implementing this new type of MediaTrack.

This patch addes a new DeviceInputTrack class, which is also the base
class for our NativeInputTrack class. The AudioInputTrack's
data-producer should always be DeviceInputTrack. We will implement
another subclass of DeviceInputTrack for non-native input device in the
following patches. As a result, the AudioInputTrack has no need to care
whether its requested device source is native or non-native.

This patch creates a new class, DeviceInputTrackManager, to manage all
the DeviceInputTrack. It helps to lower the complexity of
MediaTrackGraph and it is the only way to get access to the
DeviceInputTrack.

Depends on D118552

By the comment in AudioDriftCorrection: "The construction can happen
in any thread", the AudioDriftCorrection class is ought to be created
in any thread, but it's not true actually. The fact that ctor calls
Preferences::GetInt makes it can only be called on main-thread, which
is the thread Preferences::GetInt should be.

To make AudioDriftCorrection work as what it's expected, the
preference getter should be moved to the AudioDriftCorrection's caller
itself.

Attachment #9225787 - Attachment is obsolete: true

InitDataHolderIfNeeded is used to initialize an AudioDataBuffers but
there is no memory needed to be allocated and there is no need to wrap
the AudioDataBuffers by a Maybe.

Depends on D118552

AudioBufferInfo::Scope::All is a shortcut version of
static_cast<AudioBufferInfo::Scope>(AudioBufferInfo::Scope::Input | AudioBufferInfo::Scope::Output).

Depends on D122513

D116534 is probably solved by D122338 in bug 1725137

Attachment #9224574 - Attachment is obsolete: true
Depends on: 1725810
Attachment #9236045 - Attachment description: WIP: Bug 1238038 - Remove InitDataHolderIfNeeded() → WIP: Bug 1238038 - Don't clear input data unless necessary
Attachment #9236083 - Attachment is obsolete: true
Attached file Bug 1238038 - Add debugging logs (obsolete) —

Add more debugging log showing device information

Depends on D118862

Attachment #9224575 - Attachment description: WIP: Bug 1238038 - Create a cubeb input stream wrapper → Bug 1238038 - Create a cubeb input stream wrapper
Attachment #9227313 - Attachment description: WIP: Bug 1238038 - Move NativeInputTrack to individual files → Bug 1238038 - Move NativeInputTrack to individual files
Attachment #9227798 - Attachment description: WIP: Bug 1238038 - Add a base type of MediaTrack for different devices → Bug 1238038 - Add a base type of MediaTrack for different devices
Attachment #9236045 - Attachment description: WIP: Bug 1238038 - Don't clear input data unless necessary → Bug 1238038 - Don't clear input data unless necessary
Attachment #9228457 - Attachment description: WIP: Bug 1238038 - Create a class managing MediaTrack for devices → Bug 1238038 - Create a class managing MediaTrack for devices
Attachment #9235703 - Attachment description: WIP: Bug 1238038 - Relax AudioDriftCorrection's thread limitation → Bug 1238038 - Relax AudioDriftCorrection's thread limitation
Attachment #9229030 - Attachment description: WIP: Bug 1238038 - Create a MediaTrack for non-native device → Bug 1238038 - Create a MediaTrack for non-native device
Attachment #9241255 - Attachment description: WIP: Bug 1238038 - Add debugging logs → Bug 1238038 - Add debugging logs

The patches can be tested here: https://chunminchang.github.io/playground/webrtc/multi_microphones.html.

The test page can open multiple AudioInputTrack with different devices, and the channel count can be reset anytime. The audio-processing including autoGainControl, echoCancellation and noiseSuppression can be set when the track are opened. Those audio-processing settings are used to set the AudioInputTrack to passthrough mode. It will lead to a different path when ProcessInput is called.

Attached file Bug 1238038 - Remove useless comment (obsolete) —

This comment should be deleted in Bug 1725810.

Depends on: 1735201
Attachment #9236045 - Attachment description: Bug 1238038 - Don't clear input data unless necessary → Bug 1238038 - Move audio data processing to ProcessInput
Attachment #9229030 - Attachment description: Bug 1238038 - Create a MediaTrack for non-native device → WIP: Bug 1238038 - Create a MediaTrack for non-native device

The NativeInputTrack should only care about input data. Other
non-data related members should be moved out of the class. This makes
NativeInputTrack become a simple class focusing on input data storing.

Depends on D122513

The NativeInputTrack should be the only track that can open or close the
audio stream.

Depends on D130142

Add a simple test demonstrating the usage of NativeInputTrack

Depends on D118242

This patch implements a NonNativeInputTrack class that inherits the
DeviceInputTrack class. The audio data in this class instance is
produced by a non-graph audio stream via NotifyInputData(), and is
consumed by the graph thread via ProcessInput().

Depends on D122297

Attachment #9236045 - Attachment description: Bug 1238038 - Move audio data processing to ProcessInput → WIP: Bug 1238038 - Move audio data processing to ProcessInput
Attachment #9236045 - Attachment description: WIP: Bug 1238038 - Move audio data processing to ProcessInput → Bug 1238038 - Move audio data processing to ProcessInput

Comment on attachment 9244448 [details]
Bug 1238038 - Remove useless comment

Revision D127605 was moved to bug 1741959. Setting attachment 9244448 [details] to obsolete.

Attachment #9244448 - Attachment is obsolete: true

Comment on attachment 9241255 [details]
Bug 1238038 - Add debugging logs

Revision D125626 was moved to bug 1741959. Setting attachment 9241255 [details] to obsolete.

Attachment #9241255 - Attachment is obsolete: true

Comment on attachment 9227313 [details]
Bug 1238038 - Move NativeInputTrack to individual files

Revision D117965 was moved to bug 1741959. Setting attachment 9227313 [details] to obsolete.

Attachment #9227313 - Attachment is obsolete: true

Comment on attachment 9236045 [details]
Bug 1238038 - Move audio data processing to ProcessInput

Revision D122513 was moved to bug 1741959. Setting attachment 9236045 [details] to obsolete.

Attachment #9236045 - Attachment is obsolete: true

Comment on attachment 9248980 [details]
Bug 1238038 - Add test for NativeInputTrack

Revision D130232 was moved to bug 1741959. Setting attachment 9248980 [details] to obsolete.

Attachment #9248980 - Attachment is obsolete: true
Depends on: 1741959
Depends on: 1742655
Depends on: 1746893

Comment on attachment 9235703 [details]
Bug 1238038 - Relax AudioDriftCorrection's thread limitation

Revision D122297 was moved to bug 1746893. Setting attachment 9235703 [details] to obsolete.

Attachment #9235703 - Attachment is obsolete: true
Attachment #9248820 - Attachment is obsolete: true
Attachment #9248821 - Attachment is obsolete: true

Separate common interfaces that will be used for NonNativeInputTrack
class, from NativeInputTrack, into a base class: DeviceInputTrack, where
the NonNativeInputTrack will be implemented in the next patch.

Depends on D137781

Attachment #9248981 - Attachment description: Bug 1238038 - Create a MediaTrack for non-native device → WIP: Bug 1238038 - Create a MediaTrack for non-native device
Attachment #9227798 - Attachment is obsolete: true

Depends on D130233

Attachment #9224575 - Attachment description: Bug 1238038 - Create a cubeb input stream wrapper → WIP: Bug 1238038 - Create a cubeb input stream wrapper

Depends on D116535

Comment on attachment 9262249 [details]
WIP: Bug 1238038 - Fix comment for include guard

Revision D137817 was moved to bug 1754250. Setting attachment 9262249 [details] to obsolete.

Attachment #9262249 - Attachment is obsolete: true

Comment on attachment 9262163 [details]
WIP: Bug 1238038 - Narrower down the thread limits

Revision D137781 was moved to bug 1754254. Setting attachment 9262163 [details] to obsolete.

Attachment #9262163 - Attachment is obsolete: true

Depends on D137911

Depends on: 1754615

Depends on D116535

Depends on D138189

Attachment #9262461 - Attachment description: WIP: Bug 1238038 - Allow creating AudioChunk from interleaved buffer → WIP: Bug 1238038 - Create AudioChunk from interleaved buffer

This patch creates a class to manage all DeviceInputTracks in one place,
which simplify the code in the MediaTrackGraph.

Depends on D138726

The mInputDeviceID can be removed now since we can get the input-device
id via the DeviceInputTrackManager on the graph-thread side now.

Depends on D139444

Attachment #9229030 - Attachment is obsolete: true
Attachment #9228457 - Attachment is obsolete: true
Attachment #9262164 - Attachment description: WIP: Bug 1238038 - Add a base type of MediaTrack for different devices → Bug 1238038 - Add a base type of MediaTrack for different devices
Attachment #9248981 - Attachment description: WIP: Bug 1238038 - Create a MediaTrack for non-native device → Bug 1238038 - Create a MediaTrack for non-native device
Attachment #9262461 - Attachment description: WIP: Bug 1238038 - Create AudioChunk from interleaved buffer → Bug 1238038 - Create AudioChunk from interleaved buffer
Attachment #9224575 - Attachment description: WIP: Bug 1238038 - Create a cubeb input stream wrapper → Bug 1238038 - Create a cubeb input stream wrapper
Attachment #9263357 - Attachment description: WIP: Bug 1238038 - Disable input device switching by default → Bug 1238038 - Disable input device switching by default
Attachment #9262462 - Attachment description: WIP: Bug 1238038 - Create an input-only audio source → Bug 1238038 - Create an input-only audio source
Attachment #9262905 - Attachment description: WIP: Bug 1238038 - Start non-native audio in NonNativeInputTrack → Bug 1238038 - Start non-native audio in NonNativeInputTrack
Attachment #9263856 - Attachment description: WIP: Bug 1238038 - Allow opening multiple devices → Bug 1238038 - Allow opening multiple devices
Attachment #9265085 - Attachment description: WIP: Bug 1238038 - Manage MediaTrack for devices in one class → Bug 1238038 - Manage MediaTrack for devices in one class
Attachment #9265086 - Attachment description: WIP: Bug 1238038 - Remove mInputDeviceID in MediaTrackGraph → Bug 1238038 - Remove mInputDeviceID in MediaTrackGraph
Depends on: 1756930
Depends on: 1763385

Depends on D137910

Add a waitable event that will be fired when the state is changed, so
the user can wait for certain state of the cubeb stream.

Depends on D143100

Add a test to verify we can open multiple microphones

Depends on D139445

Depends on: 1763630
Depends on: 1764861
Pushed by cchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ed855828ee19
Add a base type of MediaTrack for different devices r=padenot
https://hg.mozilla.org/integration/autoland/rev/cee7aa3686a2
Create a MediaTrack for non-native device r=padenot
https://hg.mozilla.org/integration/autoland/rev/3e81a5bebba2
Create AudioChunk from interleaved buffer r=padenot
https://hg.mozilla.org/integration/autoland/rev/f2c23215c842
Relax channel limit in FromInterleavedBuffer r=padenot
https://hg.mozilla.org/integration/autoland/rev/28f666109dfa
Notify state changed in MockCubeb r=padenot
https://hg.mozilla.org/integration/autoland/rev/eff158481883
Create a cubeb input stream wrapper r=padenot,pehrsons
https://hg.mozilla.org/integration/autoland/rev/b66220158398
Create an input-only audio source r=padenot
https://hg.mozilla.org/integration/autoland/rev/80575371d79f
Start non-native audio in NonNativeInputTrack r=padenot
https://hg.mozilla.org/integration/autoland/rev/8f7feefeda0c
Allow opening multiple devices r=padenot
https://hg.mozilla.org/integration/autoland/rev/a1e8aebd2e88
Manage MediaTrack for devices in one class r=padenot
https://hg.mozilla.org/integration/autoland/rev/eb898ad02abd
Remove mInputDeviceID in MediaTrackGraph r=padenot
https://hg.mozilla.org/integration/autoland/rev/2b2de9f71e7f
Mochitest for opening multi mics r=padenot
Attachment #9263357 - Attachment is obsolete: true
Regressions: 1765216
See Also: → 1764380
See Also: → 1763736
Regressions: 1765312
Crash Signature: [@ mozilla::MediaEngineWebRTCMicrophoneSource::Start]
Blocks: 1731141

This should be for bug 1731141. The patches here remove the code that causes the crash.

Crash Signature: [@ mozilla::MediaEngineWebRTCMicrophoneSource::Start]

Release Note Request (optional, but appreciated)
[Why is this notable]:
Audio input support improvements for web conferencing.
[Affects Firefox for Android]:
Yes
[Suggested wording]:
Firefox now allows users to use as many microphones as you want at the same time during video conferencing! The most important benefit is that you can easily switch your microphones at any time, if your conferencing service provider enables this flexibility.
[Links (documentation, blog post, etc)]:

relnote-firefox: --- → ?

jitsi enables this in PR 1988 🎉

Added to the final Fx101 relnotes.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: