Skip to content
This repository has been archived by the owner on Jan 12, 2019. It is now read-only.

Blacklist playlist for 2 minutes on early abort to prevent cache loop #1220

Merged
merged 6 commits into from
Aug 14, 2017

Conversation

gesinger
Copy link
Contributor

@gesinger gesinger commented Aug 3, 2017

Description

This is a simpler, more reliable, and overall better fix for loops due to inconsistent network/caching than #1211

Requirements Checklist

  • Feature implemented / Bug fixed
  • If necessary, more likely in a feature request than a bug fix
  • Reviewed by Two Core Contributors

@@ -15,6 +15,7 @@ import { minRebufferMaxBandwidthSelector } from './playlist-selectors';

// in ms
const CHECK_BUFFER_DELAY = 500;
const ABORT_EARLY_BLACKLIST_MILLIS = 1000 * 60 * 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 minutes seems very long to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a tough one, and I can see the argument from both sides. But if we don't have the bandwidth now to support the playlist, either because of inconsistent network or because of a low bandwidth network, but can support a different one, I can see this playlist being a poor choice for future selection, as it will likely cause more problems. Two minutes is an arbitrary selection, but seemed long enough to help prevent the segment loops. Maybe it is too long though. Do you think one minute might be better, or did you have another duration in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that if we had to abort early because of poor network, it is likely that it will still be an issue in the future. My concern is that this would happen during startup or very early in playback, but then as we have substantial forward buffer, we could up-switch without causing rebuffering, but it may still be within that 2 minutes. This could just be a hypothetical situation that wouldn't actually play out in

@@ -825,6 +826,7 @@ export default class SegmentLoader extends videojs.EventTarget {
// BANDWIDTH_VARIANCE and add one so the playlist selector does not exclude it
this.bandwidth =
switchCandidate.playlist.attributes.BANDWIDTH * Config.BANDWIDTH_VARIANCE + 1;
this.playlist_.excludeUntil = Date.now() + ABORT_EARLY_BLACKLIST_MILLIS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we log a warning like we do when blacklisting other playlists? Related, should we change this to be an event that can notify MasterPlaylistController that it needs to blacklist the playlist so that all blacklisting goes through the same path?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, good call. I'll do both.

*/
blacklistCurrentPlaylist(error = {}) {
blacklistCurrentPlaylist(error = {}, blacklistDuration, skipPlaylistSelection = false) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is skipPlaylistSelection really necessary? I think it would be better to leave playlist selection here as is and just trigger abortearly and not bandwidthupdate from segmentLoader. If we are blacklisting the current playlist, I don't really see a scenario in which we would not want to switch from it. I think it's also reasonable to not trigger bandwidthupdate for the early abort case, as the bandwidth set in the segment loader after an early abort is not necessarily the bandwidth of the user, but a specific bandwidth to force playlist selection to select a certain variant. It's a bit of a lie for segment loader to claim it has an update to calculated bandwidth.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call on the bandwidth being artificial, therefore not triggering. Will do that instead.

videojs.log.warn = origWarn;
});

QUnit.test('does not get stuck in a loop due to inconsistent network/caching',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test looks like a bit of a bear. It might be helpful to do one of a few things to make things clearer to future explorers:

  1. Find a way to break this test up into a few smaller tests... maybe evaluate each stage as its own test. I don't know if qunit lets your group things as you can in mocha with describe blocks, but you could have something like:
describe('caching', () => {
  describe('does not get stuck in a loop due to inconsistent network/caching' () => {
    before(() => { // bootstrap data these tests use });
    function setUpStep1() => { ... };
    function setUpStep2() => { ... };

    it('tests the first step', () => {
     setupStep1();
     assert(x,y, 'output of step 1 looks good!');
    });

   it ('tests second step', () => {
     setupStep1();
     setupStep2();
     assert(x,y, 'output of step 2 looks good!');
   });
  })
})

This might yield more code, but the tests themselves might end up looking a bit clearer, since each is testing a distinct scenario with regards to your use case. It might make debugging a bit easier (instead of the caching test as a whole breaking on an assert that you need to dig through, the test that broke should give you the information that you need to debug a little bit faster).

There is a library that could work (there could be others), I don't know if we want to introduce it here: https://github.com/square/qunit-bdd

  1. If you don't want to do that, maybe adding a comment block on this test that describes what's going on would be useful. Looking at this test I'm aware that there is some kind of loop that is caused by caching, but the use case is hard to tell without reading through the entire test. If someone introduces something that breaks this test it might be difficult to debug whether something is supposed to break based on the change, or if their change broke functionality that we need to preserve.

  2. Maybe find a way to break some things up into separate files. The path of having a test per file seems to have kind of outgrown its usefulness here. We could break the per-file tests into folders, then have test files for various scenarios (or something). Basically breaking this part of the file into something like test/master-playlist-controller/caching.test.js.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the test is too complicated. We discussed a bit via chat, about how this seems to combine unit and integration testing to create one inordinately long test, and that it would be better to split it up into functional pieces. However, for the time being, I think I'm going to go with option 2 (a comment block explaining the test steps) as this is a bit of an odd case, and we should eventually find a way to extract this kind of integration test into a separate test setup.

'warning message is correct');

videojs.log.warn = origWarn;
});

QUnit.test('does not get stuck in a loop due to inconsistent network/caching',
function(assert) {
/*
* This test is a long one, but it is meant to follow a true path to a possible loop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention this behavior is amplified by browser and cdn caching

* This test is a long one, but it is meant to follow a true path to a possible loop
* This test is a long one, but it is meant to follow a true path to a possible loop.
* The reason for the loop is due to inconsistent network bandwidth, often caused or
* amplified by caching at the browser or edge server level.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎸 🔈

@mjneil mjneil added tested and removed in progress labels Aug 14, 2017
@gesinger gesinger merged commit e22d9b4 into videojs:master Aug 14, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants