Aural frontiers: audio engineering on the web


"Beep" said the computer. All is good. "Beep, Boop"; you got a boot problem. For many, human-computer audio interactions were not always terribly exciting. Although computers playing music existed since the 1950's, home systems had to wait. For most, the first real excitement of computer audio came from what would later be know as chiptunes, music composed from the extremely limited palette of contemporary machines, mostly for arcades and home game systems and later for some staples of computing like the C64. Although lacking fidelity, these compositions brought life to the games and ignited a spark of imagination for many. But let's turn back the clock even more:

“Supposing, for instance, that the fundamental relations of pitched sounds in the science of harmony and of musical composition were susceptible of such expression and adaptations, the engine might compose elaborate and scientific pieces of music of any degree of complexity or extent.”

— from Ada Lovelace's notes on Charles Babbage's Analytical Engine, 1843

That's Ada Lovelace, the first computer programmer, writing about generative music, another aspect of computer sound that has enjoyed many incarnations over the last two centuries. What's most exciting today is how accessible such a feat has become. After twenty-odd years of hosting plain old HTML documents, browsers are getting in on the audio turf. If you want your browser to dance, you can use a neural network from Google together with the Web Audio API, by virtue of magenta.js. I bet Ada would love to hack on that.
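If you're curious what that looks like in practice, here is a minimal sketch (assuming the @magenta/music package and one of the MusicRNN checkpoints published by the Magenta team; the two-note seed melody is made up) that asks the network to continue a melody and plays the result through the Web Audio API:

```typescript
import * as mm from '@magenta/music';

// A two-note seed melody, quantized at 4 steps per quarter note.
const seed: mm.INoteSequence = {
  notes: [
    { pitch: 60, quantizedStartStep: 0, quantizedEndStep: 2 },
    { pitch: 64, quantizedStartStep: 2, quantizedEndStep: 4 },
  ],
  quantizationInfo: { stepsPerQuarter: 4 },
  totalQuantizedSteps: 4,
};

async function improvise() {
  // Load a pre-trained recurrent model hosted by the Magenta team.
  const rnn = new mm.MusicRNN(
    'https://storage.googleapis.com/magentadata/js/checkpoints/music_rnn/basic_rnn'
  );
  await rnn.initialize();

  // Ask the network to continue the seed for 32 steps; the temperature
  // argument controls how adventurous the continuation is.
  const continuation = await rnn.continueSequence(seed, 32, 1.1);

  // mm.Player synthesises the result through the Web Audio API.
  await new mm.Player().start(continuation);
}

// Start on a user gesture so the browser's autoplay policy is satisfied.
document.addEventListener('click', improvise, { once: true });
```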

New horizons for web audio

The standardisation of the Web Audio API, now implemented across browsers, is opening up new possibilities to new audiences (well past playing Bon Jovi's "Now or never" on your homepage). The Web Audio API is a Candidate Recommendation at the W3C, which means it's here to stay; a host of other APIs work well with it (e.g. streaming) and more are coming (e.g. MIDI). But what kind of things can we build with it? Can we move audio on the web from the purely functional to something that can enhance, or even drive, our experience in a way that was not feasible before?
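Before answering that, it's worth seeing how little ceremony the basics require. The API is a graph of audio nodes wired together; here is a rough sketch of the humble "beep" from the opening paragraph, assuming nothing beyond a browser that ships AudioContext:

```typescript
// A minimal Web Audio graph: oscillator -> gain -> speakers.
const ctx = new AudioContext();

function beep(frequency = 440, duration = 0.3): void {
  const osc = ctx.createOscillator();
  const gain = ctx.createGain();

  osc.frequency.value = frequency; // pitch in Hz
  osc.type = 'square';             // a suitably retro, chiptune-ish timbre

  // A short envelope so the beep doesn't click when it stops.
  const now = ctx.currentTime;
  gain.gain.setValueAtTime(0.5, now);
  gain.gain.exponentialRampToValueAtTime(0.001, now + duration);

  osc.connect(gain).connect(ctx.destination);
  osc.start(now);
  osc.stop(now + duration);
}

// Browsers require a user gesture before audio can start.
document.addEventListener(
  'click',
  () => {
    ctx.resume();
    beep();
  },
  { once: true }
);
```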

In "Audio Technology and Mobile Human Computer Interaction: From Space and Place, to Social Media, Music, Composition and Creation" the authors explore audio-enhanced autoethnography. They connect sounds to places, whether it is music or geo-located tweets, extending the experience with an additional dimension. We naturally perceive with all our senses so when we receive more information, a different perspective and way to interact with the environment can arise. On the recent WAC2018 multiple performers used the audiences mobiles, playing sounds, sometimes interactively. The audience's attention divided between the performance and what's happening on the their devices, creating a somewhat chaotic but interactive experience. Although located audio experiences are mostly a domain of artists there is plenty of opportunities here as and when technology gets more sophisticated. At the moment the biggest issue is timing as the web does not provide anything to sync audio (or in fact, anything) across multiple devices (yet).

Immersive experiences

Another avenue of experimentation is VR and AR. With their advent, the role of audio becomes more important, as its absence makes these experiences feel much more artificial. Spatial audio has made its way to the web too, complementing the rise of immersive APIs for the web. AR in particular opens interesting possibilities for interfaces, as sound can widen interface design options and reduce cognitive load. Examples include AR conferencing and tools that let architects explore a space before it is built, not only visually but also acoustically. There are also ideas for the visually impaired, adding audio information where previously there was none.
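Under the hood, spatialisation in the Web Audio API comes down to a PannerNode. A minimal sketch, assuming headphones and a browser that supports the HRTF panning model, placing a tone one metre to the listener's right:

```typescript
const ctx = new AudioContext();

// Place a sound source one metre to the listener's right.
const panner = ctx.createPanner();
panner.panningModel = 'HRTF';      // head-related transfer function: convincing 3D on headphones
panner.distanceModel = 'inverse';  // volume falls off with distance
panner.positionX.value = 1;
panner.positionY.value = 0;
panner.positionZ.value = 0;

// Any source will do; a plain oscillator is enough to hear the effect.
const osc = ctx.createOscillator();
osc.frequency.value = 330;
osc.connect(panner).connect(ctx.destination);
osc.start();

// In an AR/VR scene you would update positionX/Y/Z (and the
// ctx.listener orientation) every frame as the camera moves.
```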

Using sound to convey non-audio information

Transforming non-audio information into sound (sonification) is another use case being explored, and one that web tech is making more accessible. At WAC2018, two projects really stood out in this regard. Nick Violi presented a novel idea for transforming live Google Analytics data into ambient music that changes based on intensity and other adjustable parameters. Although quite far from anything in use today, it shows a potential direction "sonified" interfaces could take. Another fascinating project, by multiple researchers, tracks body movement and turns it into sound as an aid during physiotherapy. The research looks into extending biofeedback (BFB) training instruments beyond visual cues on screens, which can distract from the activity being undertaken, to audio cues, which could be less intrusive.
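At its core, sonification is just a mapping from data to synthesis parameters. A rough sketch of the idea (the /api/metrics endpoint and the 0-500 range are hypothetical, purely for illustration): a drone whose pitch and loudness follow a live metric.

```typescript
const ctx = new AudioContext();
const osc = ctx.createOscillator();
const gain = ctx.createGain();
osc.connect(gain).connect(ctx.destination);
osc.start();

// Map a metric (illustrative range 0..500) onto pitch and loudness.
// An exponential pitch mapping keeps the changes perceptually even.
function sonify(value: number, min = 0, max = 500): void {
  const t = Math.min(Math.max((value - min) / (max - min), 0), 1);
  const freq = 110 * Math.pow(2, t * 3); // 110 Hz up to three octaves higher
  const level = 0.1 + 0.4 * t;           // louder when there is more activity

  const now = ctx.currentTime;
  osc.frequency.linearRampToValueAtTime(freq, now + 0.5);
  gain.gain.linearRampToValueAtTime(level, now + 0.5);
}

// Hypothetical polling loop: fetch a metric and let the drone follow it.
setInterval(async () => {
  const { activeUsers } = await fetch('/api/metrics').then(r => r.json());
  sonify(activeUsers);
}, 5000);
```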

Brave new aural frontiers

The world of web audio engineering is a young one, but we can already see some very novel ideas in music generation, sonification, and augmented, virtual and located experiences, thanks in part to developments in the Web Audio API. Although many of these are still just experiments and commercial products might still be a few years away, we could be on the verge of a big change in what computer interfaces and computer-aided experiences look (or rather, sound) like.

If the above got you interested, I highly recommend checking out the WAC2018 playlist from the conference that inspired this blog post. Thanks to Red Badger for sponsoring the trip.