Welcome back to another WebGL tutorial 🎉

This time around we’re going to learn how to trigger a sound effect at a precise moment during a skeletal animation.

We’ll combine the WebGL and Web Audio APIs in order to render an animated baseball player model and then play a sound effect whenever the bat and ball make contact.

You can use this technique in a whole host of situations. For example, you might trigger sound effects precisely as your character’s feet make contact with the ground. Or play a clapping sound as your character claps her hands. Or any other use case that your imagination cooks up.

We will not be using any third party rendering libraries. We’ll be writing our own vertex and fragment shaders from scratch.

You’ll hopefully walk away with a better sense of how to implement skeletal animation, as well as how to perfectly time your sound effects to your character’s activities.

Precisely triggering our sound effect

The concept behind our technique is simple. Our skeletal animation has 20 discrete keyframes. Each time we render our model we sample from a lower and upper keyframe, based on how much time has elapsed.

If our sound effect is meant to be played on the eighth keyframe, we’ll trigger it every time our lower keyframe changes from keyframe #7 to keyframe #8.

(Interactive demo: click and drag to move the camera.)

If you wanted to, you could trigger your sound effect at some offset in time between your lower and upper keyframes, but this time around we’re going to keep our scope simple and only trigger sound effects on exact keyframes.
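
To make that concrete, here’s a minimal sketch of what an offset-based trigger could look like. The keyframeProgress field below is a hypothetical stand-in for wherever your animation system reports how far along you are between the lower and upper keyframes. skeletal-animation-system only gives us lowerKeyframeNumber, which is all we’ll need for this tutorial.

var soundTriggered = false
function maybeTriggerSound (animationInfo) {
  // Fire once we're at least 40% of the way from keyframe #7 to keyframe #8.
  // `keyframeProgress` (0.0 - 1.0) is an assumed field, not part of
  // skeletal-animation-system's documented output.
  if (animationInfo.lowerKeyframeNumber === 7 &&
      animationInfo.keyframeProgress >= 0.4) {
    if (!soundTriggered) {
      // play your sound effect here
      soundTriggered = true
    }
  } else {
    soundTriggered = false
  }
}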

Alright, now that you have a sense of what we’re about to do, let’s dive in!

Getting set up

Before we can start writing code, we’ll need to set up our environment.

First, create a new directory and file for this tutorial.

mkdir webgl-skeletal-sound-tutorial
cd webgl-skeletal-sound-tutorial
touch tutorial.js

Next, let’s download our 3D model and texture.

curl -L https://github.com/chinedufn/\
webgl-skeletal-animation-sound-tutorial/raw/\
master/baseball-player.dae > baseball-player.dae
curl -L https://github.com/chinedufn/\
webgl-skeletal-animation-sound-tutorial/raw/\
master/baseball-player-uvs.png > baseball-player-uvs.png
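
Our code will later reference a bat-hit-ball.mp3 sound effect as well. Assuming it lives in the same repository as the model and texture, you can grab it the same way:

curl -L https://github.com/chinedufn/\
webgl-skeletal-animation-sound-tutorial/raw/\
master/bat-hit-ball.mp3 > bat-hit-ball.mp3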

Next we’ll install our COLLADA parser, along with helpers for expanding compressed vertex data and for converting matrices into dual quaternions, so that we can get the vertex and joint data out of our COLLADA .dae file.

npm install collada-dae-parser@0.12.0 \
expand-vertex-data@1.0.1 \
mat4-to-dual-quat@1.0.0

gl-mat4, gl-mat3, gl-vec3, and skeletal-animation-system will power the math calculations that we’ll need for our experience.

npm install skeletal-animation-system@0.6.5 \
gl-mat4@1.1.4 gl-mat3@1.0.0 gl-vec3@1.0.3

Let’s install a development server so that we can easily view our work locally.

npm install budo@10.0.4
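
If you’d like to preview your work as you go, you can start the development server against our tutorial file. The exact flags are up to you, but something along these lines (using budo’s --live and --open options) should do the trick:

./node_modules/.bin/budo tutorial.js --open --live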

Lastly, we’ll want to parse our COLLADA file into JSON.

./node_modules/collada-dae-parser/bin/dae2json.js \
baseball-player.dae > baseball-player.json

Creating our animation / sound effect experience

You’ll learn more if you type out the tutorial, but all of the source code is on GitHub should you need a full reference.

Now open up tutorial.js and let’s dive in!


var glMat4 = require('gl-mat4')
var glMat3 = require('gl-mat3')
var glVec3 = require('gl-vec3')
var expandVertexData = require('expand-vertex-data')
var mat4ToDualQuat = require('mat4-to-dual-quat')
var animationSystem = require('skeletal-animation-system')

We first require all of the dependencies that we’ll be using. gl-mat4, gl-mat3, and gl-vec3 are our standard matrix and vector libraries that we use to calculate the data for our uniforms.

expand-vertex-data de-compresses our JSON model vertex data so that we can use it for our vertex attributes.

mat4-to-dual-quat is used to convert 4x4 matrices into dual quaternions. I’ve written about why I tend to prefer dual quaternions over matrices for vertex skinning, so we’re going to convert all of our joint matrices into dual quaternions.

And lastly, skeletal-animation-system is a wrapper around some math for interpolating and blending our model’s joint data over time.


var model = require('./baseball-player.json')
var baseballPlayer = expandVertexData(model)
baseballPlayer.keyframes = Object.keys(model.keyframes)
.reduce(function (dualQuats, keyframe) {
  dualQuats[keyframe] = []
  for (var k = 0; k < model.keyframes[keyframe].length; k++) {
    dualQuats[keyframe][k] = mat4ToDualQuat(
      model.keyframes[keyframe][k]
    )
  }
  return dualQuats
}, {})

Earlier we converted our COLLADA file into JSON, and here we require that JSON. As mentioned above, we’ll then convert all of the joint matrices into dual quaternions using mat4-to-dual-quat.
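
To give a rough sense of the shape of the data we just built: each keyframe maps to an array with one pose per joint, and each pose is now an eight element dual quaternion where the first four numbers are the rotation quaternion and the last four are the translation quaternion.

// Roughly what baseballPlayer.keyframes looks like after the conversion
// {
//   '0': [
//     [rotX, rotY, rotZ, rotW, transX, transY, transZ, transW], // joint 0
//     // ... one dual quaternion per joint
//   ],
//   // ... one entry per keyframe
// }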


var canvas = document.createElement('canvas')
canvas.style.cursor = 'pointer'
canvas.width = 500
canvas.height = 500

var isDragging = false
var xCamRot = Math.PI / 20
var yCamRot = 0
var lastX
var lastY
canvas.onmousedown = function (e) {
  isDragging = true
  lastX = e.pageX
  lastY = e.pageY
}
canvas.onmouseup = function () {
  isDragging = false
}
canvas.onmousemove = function (e) {
  if (isDragging) {
    xCamRot += (e.pageY - lastY) / 60
    yCamRot -= (e.pageX - lastX) / 60

    xCamRot = Math.min(xCamRot, Math.PI / 2.5)
    xCamRot = Math.max(-0.5, xCamRot)

    lastX = e.pageX
    lastY = e.pageY
  }
}

canvas.addEventListener('touchstart', function (e) {
  lastX = e.touches[0].clientX
  lastY = e.touches[0].clientY
})
canvas.addEventListener('touchmove', function (e) {
  e.preventDefault()
  xCamRot += (e.touches[0].clientY - lastY) / 60
  yCamRot -= (e.touches[0].clientX - lastX) / 60

  xCamRot = Math.min(xCamRot, Math.PI / 2.5)
  xCamRot = Math.max(xCamRot, -0.5)

  lastX = e.touches[0].clientX
  lastY = e.touches[0].clientY
})

As per usual, we’ll set up our canvas events for detecting mouse and finger drags so that we can control our camera.


var gl = canvas.getContext('webgl')
gl.clearColor(0.0, 0.0, 0.0, 1.0)
gl.enable(gl.DEPTH_TEST)

var vertexGLSL = `
attribute vec3 aVertexPosition;
attribute vec3 aVertexNormal;
attribute vec2 aVertexUV;

attribute vec4 aJointIndex;
attribute vec4 aJointWeight;

varying vec3 vNormal;

uniform mat4 uMVMatrix;
uniform mat4 uPMatrix;
uniform mat3 uNMatrix;

uniform vec4 boneRotQuaternions[20];
uniform vec4 boneTransQuaternions[20];

varying vec3 vLightWeighting;
varying vec2 vUV;

void main (void) {
  vec4 weightedRotQuats =
    boneRotQuaternions[int(aJointIndex.x)] * aJointWeight.x +
    boneRotQuaternions[int(aJointIndex.y)] * aJointWeight.y +
    boneRotQuaternions[int(aJointIndex.z)] * aJointWeight.z +
    boneRotQuaternions[int(aJointIndex.w)] * aJointWeight.w;

  vec4 weightedTransQuats =
    boneTransQuaternions[int(aJointIndex.x)] * aJointWeight.x +
    boneTransQuaternions[int(aJointIndex.y)] * aJointWeight.y +
    boneTransQuaternions[int(aJointIndex.z)] * aJointWeight.z +
    boneTransQuaternions[int(aJointIndex.w)] * aJointWeight.w;

  float xRot = weightedRotQuats[0];
  float yRot = weightedRotQuats[1];
  float zRot = weightedRotQuats[2];
  float wRot = weightedRotQuats[3];
  float magnitude = sqrt(xRot * xRot + yRot * yRot + zRot * zRot +
  wRot * wRot);
  weightedRotQuats = weightedRotQuats / magnitude;
  weightedTransQuats = weightedTransQuats / magnitude;

  float xR = weightedRotQuats[0];
  float yR = weightedRotQuats[1];
  float zR = weightedRotQuats[2];
  float wR = weightedRotQuats[3];

  float xT = weightedTransQuats[0];
  float yT = weightedTransQuats[1];
  float zT = weightedTransQuats[2];
  float wT = weightedTransQuats[3];

  float t0 = 2.0 * (-wT * xR + xT * wR - yT * zR + zT * yR);
  float t1 = 2.0 * (-wT * yR + xT * zR + yT * wR - zT * xR);
  float t2 = 2.0 * (-wT * zR - xT * yR + yT * xR + zT * wR);

  mat4 convertedMatrix = mat4(
      1.0 - (2.0 * yR * yR) - (2.0 * zR * zR),
      (2.0 * xR * yR) + (2.0 * wR * zR),
      (2.0 * xR * zR) - (2.0 * wR * yR),
      0,
      (2.0 * xR * yR) - (2.0 * wR * zR),
      1.0 - (2.0 * xR * xR) - (2.0 * zR * zR),
      (2.0 * yR * zR) + (2.0 * wR * xR),
      0,
      (2.0 * xR * zR) + (2.0 * wR * yR),
      (2.0 * yR * zR) - (2.0 * wR * xR),
      1.0 - (2.0 * xR * xR) - (2.0 * yR * yR),
      0,
      t0,
      t1,
      t2,
      1
      );

  vec3 transformedNormal = (convertedMatrix *
    vec4(aVertexNormal, 0.0)).xyz;

  float y;
  float z;
  y = transformedNormal.z;
  z = -transformedNormal.y;
  transformedNormal.y = y;
  transformedNormal.z = z;

  transformedNormal = uNMatrix * transformedNormal;

  vec4 leftWorldSpace = convertedMatrix *
  vec4(aVertexPosition, 1.0);
  y = leftWorldSpace.z;
  z = -leftWorldSpace.y;
  leftWorldSpace.y = y;
  leftWorldSpace.z = z;

  vec4 leftHandedPosition = uPMatrix * uMVMatrix * leftWorldSpace;

  gl_Position = leftHandedPosition;

  vNormal = transformedNormal;
  vUV = aVertexUV;
}
`

I’ve written an explanation of a dual quaternion linear blending vertex shader so I won’t dive into detail here.

In short, we’re writing a vertex shader that will position all of our vertices based on where our bones are located in model space.


var fragmentGLSL = `
precision mediump float;

varying vec3 vLightWeighting;
varying vec3 vNormal;
varying vec2 vUV;

uniform vec3 uAmbientColor;
uniform vec3 uDirectionalColor;
uniform vec3 uLightingDirection;
uniform sampler2D uSampler;

void main(void) {
  float directionalLightWeighting = max(
    dot(vNormal, uLightingDirection),
  0.0);
  vec3 lightWeighting = uAmbientColor + uDirectionalColor *
  directionalLightWeighting;

  vec4 baseColor = texture2D(uSampler, vec2(vUV.s, vUV.t));
  gl_FragColor = baseColor * vec4(lightWeighting, 1.0);
}
`

Our simple fragment shader combines our model’s texture and some directional lighting in order to render our fragments.


var vertexShader = gl.createShader(gl.VERTEX_SHADER)
gl.shaderSource(vertexShader, vertexGLSL)
gl.compileShader(vertexShader)

var fragmentShader = gl.createShader(gl.FRAGMENT_SHADER)
gl.shaderSource(fragmentShader, fragmentGLSL)
gl.compileShader(fragmentShader)

var shaderProgram = gl.createProgram()
gl.attachShader(shaderProgram, vertexShader)
gl.attachShader(shaderProgram, fragmentShader)
gl.linkProgram(shaderProgram)

gl.useProgram(shaderProgram)

var vertexPosAttrib = gl.getAttribLocation(shaderProgram,
                                           'aVertexPosition')
var vertexNormalAttrib = gl.getAttribLocation(shaderProgram,
                                              'aVertexNormal')
var vertexUVAttrib = gl.getAttribLocation(shaderProgram,
                                          'aVertexUV')
var jointIndexAttrib = gl.getAttribLocation(shaderProgram,
                                            'aJointIndex')
var jointWeightAttrib = gl.getAttribLocation(shaderProgram,
                                             'aJointWeight')

gl.enableVertexAttribArray(vertexPosAttrib)
gl.enableVertexAttribArray(vertexNormalAttrib)
gl.enableVertexAttribArray(vertexUVAttrib)
gl.enableVertexAttribArray(jointIndexAttrib)
gl.enableVertexAttribArray(jointWeightAttrib)

var ambientColorUni = gl.getUniformLocation(shaderProgram,
                                            'uAmbientColor')
var lightingDirectionUni = gl.getUniformLocation(
  shaderProgram,
  'uLightingDirection')
var directionalColorUni = gl.getUniformLocation(
  shaderProgram,
  'uDirectionalColor')
var mVMatrixUni = gl.getUniformLocation(shaderProgram, 'uMVMatrix')
var pMatrixUni = gl.getUniformLocation(shaderProgram, 'uPMatrix')
var nMatrixUni = gl.getUniformLocation(shaderProgram, 'uNMatrix')
var uSampler = gl.getUniformLocation(shaderProgram, 'uSampler')

var boneRotQuaternions = {}
var boneTransQuaternions = {}
for (var i = 0; i < 20; i++) {
  boneRotQuaternions[i] = gl.getUniformLocation(
    shaderProgram,
    `boneRotQuaternions[${i}]`)
  boneTransQuaternions[i] = gl.getUniformLocation(
    shaderProgram,
    `boneTransQuaternions[${i}]`)
}

var vertexPosBuffer = gl.createBuffer()
gl.bindBuffer(gl.ARRAY_BUFFER, vertexPosBuffer)
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(
  baseballPlayer.positions
), gl.STATIC_DRAW)
gl.vertexAttribPointer(vertexPosAttrib, 3, gl.FLOAT, false, 0, 0)

var vertexNormalBuffer = gl.createBuffer()
gl.bindBuffer(gl.ARRAY_BUFFER, vertexNormalBuffer)
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(
  baseballPlayer.normals
), gl.STATIC_DRAW)
gl.vertexAttribPointer(vertexNormalAttrib, 3, gl.FLOAT, false,
                       0, 0)

var jointIndexBuffer = gl.createBuffer()
gl.bindBuffer(gl.ARRAY_BUFFER, jointIndexBuffer)
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(
  baseballPlayer.jointInfluences), gl.STATIC_DRAW)
gl.vertexAttribPointer(jointIndexAttrib, 4, gl.FLOAT, false, 0, 0)

var jointWeightBuffer = gl.createBuffer()
gl.bindBuffer(gl.ARRAY_BUFFER, jointWeightBuffer)
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(
  baseballPlayer.jointWeights), gl.STATIC_DRAW)
gl.vertexAttribPointer(jointWeightAttrib, 4, gl.FLOAT, false, 0, 0)

var vertexUVBuffer = gl.createBuffer()
gl.bindBuffer(gl.ARRAY_BUFFER, vertexUVBuffer)
gl.bufferData(gl.ARRAY_BUFFER, new Float32Array(
  baseballPlayer.uvs), gl.STATIC_DRAW)
gl.vertexAttribPointer(vertexUVAttrib, 2, gl.FLOAT, false, 0, 0)

var vertexIndexBuffer = gl.createBuffer()
gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, vertexIndexBuffer)
gl.bufferData(gl.ELEMENT_ARRAY_BUFFER, new Uint16Array(
  baseballPlayer.positionIndices), gl.STATIC_DRAW)

gl.uniform3fv(ambientColorUni, [0.3, 0.3, 0.3])
var lightingDirection = [1, -1, -1]
glVec3.scale(lightingDirection, lightingDirection, -1)
glVec3.normalize(lightingDirection, lightingDirection)
gl.uniform3fv(lightingDirectionUni, lightingDirection)
gl.uniform3fv(directionalColorUni, [1, 1, 1])

gl.uniformMatrix4fv(pMatrixUni, false, glMat4.perspective(
  [], Math.PI / 3, 1, 0.1, 100))

var texture = gl.createTexture()
var textureImage = new window.Image()
var imageHasLoaded
textureImage.onload = function () {
  gl.bindTexture(gl.TEXTURE_2D, texture)
  gl.pixelStorei(gl.UNPACK_FLIP_Y_WEBGL, true)
  gl.texParameteri(
    gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST)
  gl.texParameteri(
    gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST)
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA,
                gl.UNSIGNED_BYTE, textureImage)
  imageHasLoaded = true
}
textureImage.src = 'baseball-player-uvs.png'

Nothing too new here that we haven’t done in previous tutorials. We’re setting up all of our attributes and uniforms so that we can send the relevant data to the GPU.
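
One thing the snippet above doesn’t do is check whether our shaders actually compiled and linked. While you’re developing, a small sanity check like this one can save you a lot of head scratching:

if (!gl.getShaderParameter(vertexShader, gl.COMPILE_STATUS)) {
  console.error(gl.getShaderInfoLog(vertexShader))
}
if (!gl.getShaderParameter(fragmentShader, gl.COMPILE_STATUS)) {
  console.error(gl.getShaderInfoLog(fragmentShader))
}
if (!gl.getProgramParameter(shaderProgram, gl.LINK_STATUS)) {
  console.error(gl.getProgramInfoLog(shaderProgram))
}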


var keyframesToPlaySoundOn = {
  7: true,
  9: true
}

var clockTime = 0
var lastStartTime = new Date().getTime()

var previousLowerKeyframe

And here’s where things start to come together. We have an object specifying the keyframes that we want to play our sound effect on. And then we’re getting set up to keep track of our lower keyframe after each render.

Sampling keyframes: we’ll be keeping track of the lower keyframe that we’re sampling.

Every frame we sample from two keyframes, lower and upper, based on the current clock time. When our lower keyframe is different from our previous lower keyframe, we know that we’ve started a new section of the animation. If this new section of the animation is the ball hitting the bat, we’ll play a sound effect.


function draw () {
  var currentTime = new Date().getTime()

  var timeElapsed = (currentTime - lastStartTime) / 1000 *
    playbackSpeed
  clockTime += timeElapsed
  lastStartTime = currentTime

  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT)

Here we advance our clock based on how much real time has elapsed, scaled by our playback speed. So as you drag the playback slider in the demo toward zero, the clock time will advance more slowly. For example, at a playback speed of 50% a real-world 16 milliseconds only advances our clock by 8 milliseconds.


  var animationData = animationSystem.interpolateJoints({
    currentTime: clockTime,
    keyframes: baseballPlayer.keyframes,
    jointNums: [
      0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
      10, 11, 12, 13, 14, 15, 16, 17, 18, 19
    ],
    currentAnimation: {
      startTime: 0,
      range: [0, 12]
    }
  })

Next we use skeletal-animation-system to interpolate our joints’ dual quaternions between the lower and upper keyframes that we’re currently sampling. There are other options for skeletal-animation-system, but these are the only ones that we need.


  var newLowerKeyframe =
    animationData.currentAnimationInfo.lowerKeyframeNumber
  if (keyframesToPlaySoundOn[newLowerKeyframe] &&
      previousLowerKeyframe !== newLowerKeyframe) {
    if (!muted) {
      audio.play()
    }
  }
  previousLowerKeyframe = newLowerKeyframe

As mentioned above, we check if we’re transitioning over to the ball hitting the bat. If so, we play our bat hitting ball sound.


  for (var j = 0; j < 20; j++) {
    var rotQuat = animationData.joints[j].slice(0, 4)
    var transQuat = animationData.joints[j].slice(4, 8)

    gl.uniform4fv(boneRotQuaternions[j], rotQuat)
    gl.uniform4fv(boneTransQuaternions[j], transQuat)
  }

We push all of our interpolated dual quaternions to the GPU to be used during dual quaternion linear blending.


  var modelMatrix = [1, 0, 0, 0, 0, 1, 0, 0,
    0, 0, 1, 0, 0, 0, 0, 1]
  var nMatrix = glMat3.fromMat4([], modelMatrix)

  var camera = glMat4.create()
  glMat4.translate(camera, camera, [0, 0, 2.5])
  var yAxisCameraRot = glMat4.create()
  var xAxisCameraRot = glMat4.create()
  glMat4.rotateX(xAxisCameraRot, xAxisCameraRot, -xCamRot)
  glMat4.rotateY(yAxisCameraRot, yAxisCameraRot, yCamRot)
  glMat4.multiply(camera, xAxisCameraRot, camera)
  glMat4.multiply(camera, yAxisCameraRot, camera)
  glMat4.lookAt(camera, [camera[12], camera[13], camera[14]],
                [0, 0, 0], [0, 1, 0])
  var mVMatrix = glMat4.multiply([], camera, modelMatrix)

  gl.uniformMatrix3fv(nMatrixUni, false, nMatrix)
  gl.uniformMatrix4fv(mVMatrixUni, false, mVMatrix)

  // Once our texture has loaded we begin drawing our model
  if (imageHasLoaded) {
    gl.drawElements(
      gl.TRIANGLES,
      baseballPlayer.positionIndices.length, gl.UNSIGNED_SHORT, 0)
  }

  window.requestAnimationFrame(draw)
}

Here we set up our camera and render our model.


var audio = new window.Audio()
audio.crossOrigin = 'anonymous'
audio.src = 'bat-hit-ball.mp3'

var context = new window.AudioContext()
var analyzer = context.createScriptProcessor(1024, 1, 1)
var source = context.createMediaElementSource(audio)
var gainNode = context.createGain()

source.connect(analyzer)
analyzer.connect(gainNode)
gainNode.connect(context.destination)

Next we take advantage of the Web Audio API to power our volume bars display. We route our sound effect through a script processor node (which we’ve named analyzer) so that we can inspect the raw audio samples as they play. We’ll take the loudest sample in each buffer and use it to determine how many bars to color.
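
A quick aside: createScriptProcessor works, but it has since been deprecated in favor of AnalyserNode and AudioWorklet. If you’d rather avoid the deprecated API, here’s a rough sketch of the same peak-finding idea using an AnalyserNode in place of the script processor above. You’d call readPeak from your render loop instead of relying on an audio processing callback.

// Sketch only: an AnalyserNode alternative to the script processor above
var analyserNode = context.createAnalyser()
source.connect(analyserNode)
analyserNode.connect(gainNode)

var samples = new Uint8Array(analyserNode.fftSize)
function readPeak () {
  // Time-domain samples are unsigned bytes centered at 128
  analyserNode.getByteTimeDomainData(samples)
  var max = 0
  for (var i = 0; i < samples.length; i++) {
    var value = Math.abs((samples[i] - 128) / 128)
    max = value > max ? value : max
  }
  return max
}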


var muted = true
var hasClickedBefore = false
var muteButton = document.createElement('button')
muteButton.innerHTML = 'Click to un-mute'
muteButton.style.cursor = 'pointer'
muteButton.style.marginRight = '10px'
muteButton.style.marginLeft = '10px'
muteButton.style.width = '100px'
muteButton.style.height = '40px'
muteButton.onclick = function () {
  if (!hasClickedBefore) {
    hasClickedBefore = true
    gainNode.gain.value = 0
    audio.play()
    setTimeout(function () {
      gainNode.gain.value = 1
    }, 500)
  }

  muted = !muted

  muteButton.innerHTML = muted ? 'Click to un-mute' :
    'Click to mute'
}

var playbackSlider = document.createElement('input')
playbackSlider.type = 'range'
playbackSlider.max = 100
playbackSlider.min = 0
playbackSlider.value = 85

var playbackSpeed = 0.85

var playbackDisplay = document.createElement('div')
playbackDisplay.innerHTML = 'Playback: 85%'

playbackSlider.oninput = function (e) {
  playbackSpeed = e.target.value / 100
  playbackDisplay.innerHTML = 'Playback: ' + e.target.value + '%'
}

var volumeBarContainer = document.createElement('div')
volumeBarContainer.style.display = 'flex'
var volumeBars = []
for (var k = 0; k < 5; k++) {
  var volumeBar = document.createElement('div')
  volumeBar.style.width = '40px'
  volumeBar.style.height = '40px'
  volumeBar.style.border = 'solid #333 1px'
  volumeBar.style.transition = '0.9s ease background-color'
  volumeBar.style.marginRight = '2px'
  volumeBars.push(volumeBar)
  volumeBarContainer.appendChild(volumeBar)
}

Here we create our mute button, playback speed slider, and volume bar display. One thing to note about the mute button: browsers typically won’t play audio until the user has interacted with the page, so on the very first click we play the sound effect once while the gain is zeroed out in order to unlock audio playback, then restore the gain shortly afterwards.


analyzer.onaudioprocess = function (e) {
  var out = e.outputBuffer.getChannelData(0)
  var input = e.inputBuffer.getChannelData(0)
  var max = 0

  for (var i = 0; i < input.length; i++) {
    out[i] = input[i]
    max = input[i] > max ? input[i] : max
  }

  var volume = max * 50
  for (var j = 0; j < 5; j++) {
    if (j < volume) {
      volumeBars[j].style.backgroundColor = 'red'
    } else {
      volumeBars[j].style.backgroundColor = 'white'
    }
  }
}

As mentioned above, we use our audio processing callback to determine how many bars to color in our volume display. It copies the input samples through to the output unchanged, finds the loudest sample in each buffer, and scales that peak by 50. We then light one bar for each whole number below that value, so a peak sample of 0.06 lights up three bars.


var controls = document.createElement('span')
controls.style.display = 'flex'
controls.style.marginBottom = '5px'
controls.style.alignItems = 'center'

var mountLocation = document.getElementById(
  'webgl-skeletal-sound-tutorial'
) || document.body
controls.appendChild(playbackSlider)
controls.appendChild(muteButton)
controls.appendChild(volumeBarContainer)
mountLocation.appendChild(controls)
mountLocation.appendChild(playbackDisplay)
mountLocation.appendChild(canvas)

draw()

And lastly we dump our controls into the page and begin our requestAnimationFrame render loop.

Whoa, you did it!

Congrats! What would you like to learn about next? Let me know on Twitter.

‘Til next time,

- CFN