Audition as Physical Inference

Author: James Traer (jtraer@mit.edu)
Last Updated 2017-09-30 20:11:50

1 Research

I am a research scientist at MIT, working in the Computational Audition Laboratory. I study how humans infer the physical properties of the world from sounds. We all make such inferences regularly: when we hear a coin fall on the ground, or something rattle inside a box, or the sickening "crack" when we set down something expensive with a little too much force. Yet it is not obvious how the pattern of vibrations arriving at our ears maps onto the physical structure of the sound source. In general, many physical variables affect the sound, so attributing any part of the sound to a single cause is an ill-posed problem.

For example, consider the sound of striking a box with a mallet. The sound is louder if the box is struck with more force, but it is also louder for some boxes than others (e.g. metal vs. wood). Thus the amplitude can tell us how hard the box was struck, but only if we first know the box material. We can infer the material from the decay rate of the sound, but only if we first know the size and shape of the box (and the material of the mallet, the location of the strike, the reverberation of the room, the distance between the object and the microphone, etc.). Without all of this information, we ("we" here meaning me, a scientist with many years of experience in audio signal processing and physics-based inverse models) cannot disambiguate the various causal mechanisms from a single sound.
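
To make the ambiguity concrete, here is a minimal sketch of a modal model of impact sounds: a struck object rings at a set of modal frequencies, each decaying exponentially at a material-dependent rate. All numbers below (frequencies, decay rates, strike forces) are hypothetical, chosen only to illustrate that a hard strike on a damped "wooden" object and a soft strike on a ringing "metal" one can be equally loud, so loudness alone cannot separate force from material, while the decay rate can.

    import numpy as np

    SR = 44100  # sample rate (Hz)

    def impact_sound(force, mode_freqs, decay_rates, duration=1.0):
        """Sum of exponentially decaying sinusoids excited by a strike."""
        t = np.arange(int(SR * duration)) / SR
        return sum(force * np.exp(-d * t) * np.sin(2 * np.pi * f * t)
                   for f, d in zip(mode_freqs, decay_rates))

    # Hypothetical parameters: "wood" is heavily damped, "metal" rings on.
    wood  = impact_sound(force=3.2, mode_freqs=[400, 950], decay_rates=[50, 70])
    metal = impact_sound(force=1.0, mode_freqs=[400, 950], decay_rates=[5, 7])

    rms = lambda x: np.sqrt(np.mean(x ** 2))
    print(rms(wood), rms(metal))    # comparable overall loudness (~0.3 each)
    late = int(0.25 * SR)           # but very different decay behaviour:
    print(rms(wood[late:]), rms(metal[late:]))  # wood is silent, metal rings

The force ratio that equates the two loudnesses is, of course, unknown to a listener; that is exactly the ill-posedness described above.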

However, we ("we" meaning humans) can easily listen to just such a sound and imagine the object, the mallet and the room. This demonstrates that the auditory system is capable of intuitively solving the physics problem that I cannot solve via an algorithm. At least not yet.

2 Current projects

2.1 Intuitive Physical Inference from Sound

  • I am running a battery of perceptual experiments to assess the human ability to separately infer different causal mechanisms from real-world recordings of impact sounds.
  • I am exploring the physics behind mundane everyday sounds (hitting, scraping, rolling, cracking, crumpling and shattering) and recording the sounds everyday objects make. I analyse these sounds, looking for statistical measures that are diagnostic of material and other physical properties.
  • I synthesize sounds using models inspired by physics and natural sound statistics, and assess whether they are realistic enough to fool human listeners. If so, I then use them to assess how changing waveform parameters (e.g. mode frequencies, decay rates, attack time) affects human perception of physical properties.
  • Inspired by human responses, I am developing a probabilistic inference algorithm to make similar material inferences from sound (a toy sketch of such inference follows this list).
  • I then test how both humans and the model change their judgements in the presence of new information about the sound source (e.g. video of the contact, or images of the objects), which partially disambiguates the original task.
  • If the model succeeds where humans succeed, fails where humans fail, and changes its judgements as humans do when presented with more information, we argue that the model is likely similar to the neural mechanism employed by the human brain.
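
As a toy illustration of the kind of probabilistic inference meant above (not the actual model; the per-material decay statistics are entirely hypothetical), the sketch below applies Bayes' rule over a discrete set of candidate materials given a measured decay rate, and then shows how new information, encoded as a sharper prior, shifts the posterior:

    import numpy as np

    materials  = ["metal", "glass", "wood", "plastic"]
    mean_decay = np.array([5.0, 15.0, 60.0, 90.0])  # hypothetical means (1/s)
    spread     = np.array([3.0, 8.0, 25.0, 30.0])   # spread from unknown size/shape

    def material_posterior(observed_decay, prior):
        """Bayes' rule on a discrete material grid with Gaussian likelihoods."""
        like = np.exp(-0.5 * ((observed_decay - mean_decay) / spread) ** 2) / spread
        post = prior * like
        return post / post.sum()

    # Sound alone: the posterior is spread over wood and plastic.
    print(material_posterior(40.0, np.full(4, 0.25)))

    # A video showing a wooden object acts as a sharper prior, and the
    # posterior shifts accordingly, mirroring how listeners revise judgements.
    print(material_posterior(40.0, np.array([0.05, 0.05, 0.80, 0.10])))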

2.2 Reverberation

  • I am measuring how acoustic reverberation changes with the physical parameters of a room, such as its size, furnishings, wall materials, and the locations of the source and the listener (a sketch of one standard reverberation measure follows this list). In the future we hope to test whether humans are sensitive to these acoustic changes.
  • I am running a battery of perceptual experiments to determine how reverberation changes human perception of sound sources.
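
One standard way to quantify reverberation from a measured room impulse response is Schroeder backward integration followed by a line fit, which yields RT60, the time for reverberant energy to decay by 60 dB. The sketch below is illustrative: it uses synthetic decaying noise in place of a real measurement, and the fit range is a common but arbitrary choice.

    import numpy as np

    SR = 44100  # sample rate (Hz)

    def rt60_schroeder(ir, sr=SR, fit_db=(-5.0, -25.0)):
        """Estimate RT60 from an impulse response via Schroeder integration."""
        # Backward-integrated energy decay curve, normalised, in dB.
        edc = np.cumsum(ir[::-1] ** 2)[::-1]
        edc_db = 10 * np.log10(edc / edc[0])
        # Fit a line to the decay between -5 and -25 dB, extrapolate to -60 dB.
        idx = np.where((edc_db <= fit_db[0]) & (edc_db >= fit_db[1]))[0]
        slope, _ = np.polyfit(idx / sr, edc_db[idx], 1)  # dB per second
        return -60.0 / slope

    # Synthetic impulse response: noise whose energy decays ~60 dB per second,
    # so the estimate should come out near 1 s.
    t = np.arange(SR) / SR
    ir = np.random.default_rng(0).standard_normal(SR) * np.exp(-6.9 * t)
    print(f"estimated RT60: {rt60_schroeder(ir):.2f} s")

On real measurements the fit range matters: early reflections and the recording noise floor both bend the decay curve away from a straight line.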

2.3 Audio-visual Integration

  • I am building a virtual-reality interface to test whether visual information (about the objects or the room) changes the sounds that listeners report hearing.

3 Publications