The Kitchen Sync

What genius he employs in selecting titles for posts! Bravo!

Notes on syncing animations and sound in Flash.

Pause/Resume: There’s no way to pause a playing sound, only stop it. So to pause you have to store the position at which you stopped it, then when you want to unpause you start the sound playing from that position. Great (?), but apparently Flash isn’t real concerned about coming in on time, so it may start a few ms early. Which wouldn’t be a problem unless you were counting on the playhead position never going backwards between updates, in which case you’re plumb out of luck. In my case I’m doing a calculation to see how much time has passed in a looping sound. If the current playhead position is less than the last checked position then I assume the sound has looped and calculate accordingly. If, say, the current position = last position – 1, then it looks like almost an entire loop has gone by, which information is then used to update animations, which as you may imagine makes things really fucking wrong. The solution I used was to hold off on any updates until the playhead has passed the stored pause position, then reset the pause position to 0. But I’m not real happy with that and I’m hoping as I learn more a better way will be revealed.
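
For the record, here is roughly what that workaround looks like. This is a minimal sketch in TypeScript rather than ActionScript (the syntax is close enough to read across), and `LoopTracker`, `resumeAt` and `update` are names made up for illustration. The `position` argument stands in for whatever playhead value Flash reports, in ms.

```typescript
// Tracks elapsed time in a looping sound from successive playhead readings.
// All names are hypothetical; positions and lengths are in milliseconds.
class LoopTracker {
    private loopLength: number;
    private lastPosition = 0;
    private pausePosition = 0; // 0 means "not waiting on a resume"

    constructor(loopLength: number) {
        this.loopLength = loopLength;
    }

    // Call when restarting playback from a stored pause position.
    resumeAt(position: number): void {
        this.pausePosition = position;
        this.lastPosition = position;
    }

    // Returns the ms elapsed since the previous accepted reading.
    update(position: number): number {
        if (this.pausePosition > 0) {
            // The sound may restart slightly early, so ignore readings
            // until the playhead has caught up to where we paused.
            if (position < this.pausePosition) return 0;
            this.pausePosition = 0;
        }
        let elapsed = position - this.lastPosition;
        if (elapsed < 0) elapsed += this.loopLength; // the sound wrapped around
        this.lastPosition = position;
        return elapsed;
    }
}
```

One weakness that probably explains the unease: if the pause lands very close to the end of the loop, the playhead can wrap around before it ever passes the stored position, and updates are then ignored for most of a loop.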

II.

Here’s an attempt at a summary of what I’ve learned from a magnificent post at reddit (who knew it was possible?) about syncing:

It is possible to make a good rhythm game with Flash. I was already coming to this conclusion, but it’s good to see another project (TREBL: Rhythm Arcade) that uses it well.
We need a way to do audio and video calibration. Audio is more important, since that is (hopefully) what the user will use to aim for notes. In a video showing calibration from Rock Band creators Harmonix, the audio adjustment is under 10ms. Video adjustment is < 100ms (!) on a large TV. I don’t know yet whether it’s fair to generalize those figures or not.
Audio position isn’t guaranteed to update regularly. It will more likely update in a step-wise fashion, i.e. it may report the same number several times in a row and then jump to a larger number, but on average it will be correct. This corresponds roughly to what I’m seeing in our initial demos, where notes jitter a little bit. I will note that our jitter is very minimal, though, and is not noticeable at smaller sizes. Of course that is n=1, and the results are likely to vary a lot depending on the audio hardware and drivers.
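Here’s his block of code for determining song time, lightly adapted on the assumption that we’re in ActionScript 3 (where the position is reported by the SoundChannel that play() returns, not by the Sound itself):
```
import flash.media.Sound;
import flash.media.SoundChannel;
import flash.utils.getTimer;

var mySong:Sound;            // assumed to be loaded elsewhere
var channel:SoundChannel;    // play() returns this; it reports the position
var songTime:Number = 0;     // our running estimate, in ms
var previousFrameTime:int;
var lastReportedPlayheadPosition:Number;

function songStarted():void {
    songTime = 0;
    previousFrameTime = getTimer();
    lastReportedPlayheadPosition = 0;
    channel = mySong.play();
}

// Call once per frame, e.g. from an Event.ENTER_FRAME listener.
function everyFrame():void {
    var now:int = getTimer(); // read the timer once so no time is lost
    songTime += now - previousFrameTime;
    previousFrameTime = now;
    if (channel.position != lastReportedPlayheadPosition) {
        songTime = (songTime + channel.position) / 2;
        lastReportedPlayheadPosition = channel.position;
    }
}
```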
Analysis: we update 60 times/second, and on each update we check whether the song position has changed. If it has, we average it with the internal song time that we’re keeping. The internal song time is advanced using the more accurate getTimer(). But of course it’s likely to drift away from the recording, so averaging them together keeps it in check. Question: is it still possible to drift away significantly? Let’s say the reported playhead position updates only every 100ms or so. And maybe our internal counter has a constant drift of 10ms per 100ms. At the first update the song is at 100 but we’re at 110, so we compromise at 105. At the next update the song is at 200, the internal is at 215, so we get 207.5. Then 300, 317.5 -> 308.75. 400, 418.75 -> 409.4. So the error does keep growing (5, 7.5, 8.75, 9.4), but by half as much each time, so it should level off at about 10ms, i.e. one update interval’s worth of drift, rather than running away. That’s assuming the drift is constant, which is kind of a big question. So that bears looking into.
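
To check the arithmetic, here is a small simulation of the averaging scheme. It is only a sketch, written in TypeScript for convenience. The 100ms update interval and 10ms drift are the made-up numbers from the thought experiment, not measurements, and `simulateDrift` is a hypothetical helper.

```typescript
// Simulates the "average on update" scheme under a constant clock drift.
// updateInterval: ms between playhead updates (assumed, e.g. 100).
// driftPerUpdate: extra ms the internal timer gains per interval (assumed, e.g. 10).
// Returns the error (songTime minus true position) right after each averaging step.
function simulateDrift(updateInterval: number, driftPerUpdate: number, updates: number): number[] {
    const errors: number[] = [];
    let songTime = 0;
    for (let i = 1; i <= updates; i++) {
        const playhead = i * updateInterval;          // true song position
        songTime += updateInterval + driftPerUpdate;  // internal timer runs fast
        songTime = (songTime + playhead) / 2;         // split the difference
        errors.push(songTime - playhead);
    }
    return errors;
}
```

Calling `simulateDrift(100, 10, 4)` gives errors of 5, 7.5, 8.75 and 9.375. Each step halves the remaining gap to 10, so with a constant drift the error approaches one interval’s worth of drift and stays there. Whether real drift is constant is a separate question.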
Delays to be concerned with: audio processing -> speakers -> ears (sound in air is like 1ms/foot). Key press -> listeners in program. Graphics -> screen. As discussed before, graphics delays are probably the largest. Audio delays are smaller but matter most. Key press registration delays are probably minimal if you use listeners, but they’re hard to separate from audio/visual delays in a calibrator. And if you use two separate calibrators then you may be counting the key delay twice. Accounting for it may take fiddling (read: uninformed hacking about with the numbers).
Question: for a calibration tool, could you play an audio click and a visual flash at the same time and have the user adjust a slider until they line up? Then we have an idea of when they perceive things to be simultaneous, i.e. the relative delay between sight and sound. Note that it only gives you that relative offset. It does not tell you how late either one actually reaches the user, so it’s not a complete test.
In the calibration process we may also have to deal with user interpretation, i.e. they may play a little behind the beat or anticipate it even when focusing intently on a simplified calibration task. In the Rock Band vid they did automatic calibration with a tool built into the guitar, which is neat, but we don’t have that luxury (and plus, WHO CALIBRATES THE CALIBRATORS?)
This isn’t in the article, but I’m going to assert that calibration tests should be done at the indifference point, around 96bpm (one beat every 625ms), a tempo where people tend neither to anticipate nor to drag too much. That’s according to Vierordt’s Law, and I honestly don’t know how well it holds up.
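
As a thought experiment, here is one way a tap-along calibrator could boil a user’s taps down to a single offset. None of this comes from the reddit post. It is a sketch in TypeScript that assumes clicks start at time 0 and repeat on every beat, and the function names are all made up.

```typescript
// Beat interval in ms for a given tempo, e.g. 96 bpm -> 625 ms.
function beatInterval(bpm: number): number {
    return 60000 / bpm;
}

// Signed offset (ms) of each tap from its nearest click. Positive means late.
// Assumes clicks happen at 0, interval, 2 * interval, and so on.
function tapOffsets(tapTimes: number[], interval: number): number[] {
    return tapTimes.map(t => t - Math.round(t / interval) * interval);
}

// Median rather than mean, so one wild tap doesn't skew the result.
// Assumes at least one value.
function median(values: number[]): number {
    const sorted = values.slice().sort((a, b) => a - b);
    const mid = Math.floor(sorted.length / 2);
    return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Estimated offset for one calibration run at the given tempo.
function estimateOffset(tapTimes: number[], bpm: number): number {
    return median(tapOffsets(tapTimes, beatInterval(bpm)));
}
```

For example, taps at 30, 650, 1290, 1900 and 2700ms against a 96bpm click give offsets of 30, 25, 40, 25 and 200, and the median ignores the 200ms straggler. The number that comes out still lumps together output delay, key press delay and the user’s own tendency to rush or drag, so it is a starting point for fiddling rather than a measurement of any one of them.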

Science Ninja Team

Developing awesome things
