Cogito Blog

Interspeech 2019 – Machine Learning-enabled Creativity and Innovation In Speech Tech

Dr. John Kane

One thing that was evident at the 2019 Interspeech conference, held in Graz, Austria, is that the speech technology scene continues to flourish. The level of industry attention and investment continues to rise rapidly. There were 2,075 registered attendees (more than in any previous year), and colleagues I met when we were earning our PhDs have grown into academic and industry leaders.

 

A wide range of topics specific to the speech science discipline was covered by the four keynote speakers. Keiichi Tokuda presented a wonderful overview of the recent history of speech synthesis. Manfred Kaltenbacher gave a detailed, in-depth talk on the physiology and physics of speech production. Mirella Lapata’s keynote showed just how far natural language processing interfaces have come. They are not quite at the level of Samantha from the film “Her”, but science fiction is becoming less and less fantastical.

  

But it was Tanja Schultz’s talk that elicited the strongest reaction. Hearing synthesized speech reverberate in the main conference hall, produced using electrodes placed directly on a human cortex, was certainly the eeriest moment of the week. Brain-to-speech synthesis is as troubling as it is intriguing! At the same time, her core point, “Think beyond acoustics”, was inspiring for many in attendance.

 

To read more about my experience at Interspeech 2019, including my paper highlights, head on over to our Medium page: https://medium.com/@CogitoCorp/interspeech-2019-machine-learning-enabled-creativity-and-innovation-in-speech-tech-30d46df482e8