The Ephemeral Nature of Voice

November 19, 2012

Martin Geddes thinks voice need not be ephemeral. I beg to differ.

In a recent keynote speech, Martin Geddes coined a new term, Hypervoice, and then gave it some substance. While I didn't attend that event, Martin was kind enough to share the presentation with the world on Slideshare.

My main problem is with slide 62:

Just as we now routinely digitally capture our words and images, we will capture our voices. Voices need no longer be ephemeral.

Capturing voice (and video for that matter) is quite different from capturing text and images. It doesn't behave the same. It doesn't work the same.

We capture images today. We do so when it fits the need and when we have a camera with us, often with an Instagram application that lets us edit the captured image before uploading it, giving it an ambiance that wasn't there to begin with. We do the editing part "offline" and then we publish it.

Writing this post took a lot of editing (just this sentence was written and rewritten more than once). Text is a process of putting thoughts into words. And then updating it – deleting, adding, trimming, emphasizing – things that are natural to do to text.

Instant messaging – chatting with someone online – is an ephemeral type of interaction. Yes – most systems today do store the chat history, but no – this doesn't make it any less ephemeral. The interaction is stored, but it isn't an edited experience. There's no real URL for that interaction – no hypertext to compare to.

And voice? We interact and communicate with people through voice. It is the natural way of doing things. We do it face to face, where we use body language as well. We do it over a video chat session – talking-heads-style, and we do it over the phone with only our voice.

We could record it all, translate it automatically into text, archive it in the cloud and make it searchable. But there's no editing here either – what you said is simply recorded and then served later when needed.

There's a lot more to do with voice – that's for sure. And archiving it intelligently is probably the first step we should think of. Translation should be there as well. Gleaning insights out of it would be nice. But at the end of the day – the basic experience is ephemeral. We chat in the moment – the uncut version – and then we store it.

