References to TTS in other posts should be made when the text of the work referenced is suitable to “play” as audio, e.g. on a smartphone while walking or doing other activities. They are especially useful to qualify authors' claims that they have read or are reading a work when in fact they have only “played” it, which inevitably results in missing a lot.

Texts with extensive mathematics, tables, diagrams etc. are less suitable, but TTS could sometimes be used alongside the text, alternating between listening and looking.

The TTS tag should be applied to posts that mention it; use the Audio tag instead for recordings by an actual human reader.

I find the Samsung “Text To Speech” voices with FBReader for .epub and similar formats surprisingly acceptable. I expect the standard Google voices and those on the Apple iPhone are also adequate. Actual Audio is better, but TTS does not sound robotic like a Dalek.

FBReader TTS does not work with .pdf files. For those I use @VoiceAloud, which is less satisfactory for bookmarking etc., although it can use the same voice engines.

Could also process .pdf files offline to extract the text (if necessary using OCR first), so that it is easier to use as .epub and/or to produce an Audio file.
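Text extracted from a .pdf usually keeps the page layout (hard line wraps, words hyphenated across lines, stray page numbers), which makes TTS stumble. A minimal sketch of a cleanup pass follows, assuming the raw text has already been extracted by some other tool (for example `pdftotext` from poppler, or an OCR program). The function name `clean_extracted_text` and the heuristics are illustrative assumptions, not a tested pipeline.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy text extracted from a PDF so a TTS engine reads it more smoothly.

    Heuristics only: real documents will need tuning.
    """
    text = raw.replace("\r\n", "\n")
    # Drop lines that contain nothing but a number (assumed to be page numbers).
    text = re.sub(r"^[ \t]*\d+[ \t]*$\n?", "", text, flags=re.MULTILINE)
    # Rejoin words hyphenated across a line break ("para-\ngraph" -> "paragraph").
    # Caveat: this also merges genuine hyphens that happen to end a line.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single newlines inside a paragraph into spaces; blank lines stay
    # as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of blank lines and runs of spaces.
    text = re.sub(r"\n{3,}", "\n\n", text)
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


if __name__ == "__main__":
    sample = "The quick brown fox jum-\nped over\nthe lazy dog.\n\n12\n\nNext para-\ngraph here.\n"
    print(clean_extracted_text(sample))
```

The cleaned text can then be fed to whatever produces the .epub or the Audio file.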

Could also process Audio offline to change the playback rate if using a cheap player that cannot adjust it (I usually find 2x speed more comfortable to listen to).
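One way to do the rate change offline is the `atempo` audio filter in ffmpeg, which changes tempo without shifting pitch. The sketch below only builds the command line; it assumes ffmpeg is installed, and the helper names `atempo_chain` and `ffmpeg_speed_command` are made up for illustration. Some ffmpeg builds limit a single `atempo` stage to the range 0.5 to 2.0, so the sketch conservatively chains stages for rates outside that range.

```python
from typing import List


def atempo_chain(rate: float) -> str:
    """Build an ffmpeg filter string for the given speed-up factor.

    Each stage is kept within 0.5..2.0 and stages are chained, so the
    product of all stages equals the requested rate.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    factors = []
    remaining = rate
    while remaining > 2.0:
        factors.append(2.0)
        remaining /= 2.0
    while remaining < 0.5:
        factors.append(0.5)
        remaining /= 0.5
    factors.append(remaining)
    return ",".join(f"atempo={f:g}" for f in factors)


def ffmpeg_speed_command(src: str, dst: str, rate: float) -> List[str]:
    """Return an argument list suitable for subprocess.run (not executed here)."""
    return ["ffmpeg", "-i", src, "-filter:a", atempo_chain(rate), dst]


if __name__ == "__main__":
    # File names are placeholders.
    print(" ".join(ffmpeg_speed_command("book.mp3", "book_2x.mp3", 2.0)))
```

Passing the returned list to `subprocess.run` would perform the conversion; building the command separately makes it easy to check before running it over a whole library.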

Offline processing can also easily deal with DRM copy protection problems (e.g. add-ons for calibre).

I don’t have time at the moment and already have plenty to read in .epub format, but will eventually want to set up utilities to handle this stuff smoothly, together with cataloging, peer to peer exchange, Calibre, Library Genesis, Zotero etc.

Will also want better control from a headset and/or cheap smart watch. I can pause and resume using a single press of the headset center button (with the other two for volume up and down). But it is too awkward using the smartphone to step back for bits missed due to drifting off, to make and use bookmarks, to add annotations (voice or text) etc.

Should be easy enough to add this for Android using existing remote audio APIs and “intents”. But I don’t know how and don’t have time to figure it out.
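Leaving the Android side aside, the control logic itself is small. The sketch below is platform-independent and purely illustrative: it groups press times of a single headset button into multi-press gestures and maps the press count to an action. The action names, the `ACTIONS` table and the 0.4 second gap are all assumptions, and nothing here calls a real Android API.

```python
from typing import Dict, List

# Hypothetical mapping from number of quick presses to a player action.
ACTIONS: Dict[int, str] = {
    1: "toggle_pause",
    2: "step_back_30s",
    3: "add_bookmark",
}


def decode_presses(timestamps: List[float], max_gap: float = 0.4) -> List[str]:
    """Group button press times (in seconds) into gestures.

    Presses separated by no more than max_gap belong to the same gesture;
    the number of presses in a gesture selects the action.
    """
    actions = []
    count = 0
    last = None
    for t in sorted(timestamps):
        if last is not None and t - last > max_gap:
            # Gap too long: the previous gesture is complete.
            actions.append(ACTIONS.get(count, "ignored"))
            count = 0
        count += 1
        last = t
    if count:
        actions.append(ACTIONS.get(count, "ignored"))
    return actions


if __name__ == "__main__":
    # One single press, one double press, one triple press.
    print(decode_presses([0.0, 5.0, 5.2, 10.0, 10.2, 10.4]))
```

A real app would feed this from whatever media-button events the platform delivers and would have to wait `max_gap` after the last press before acting, which adds a slight delay to the plain pause.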

There should also be some app that simply allows one to reconfigure watch and/or headset buttons for custom handling of this stuff but I haven’t found it (nor looked hard).

This sub-category will be for more on such issues.

Also relevant to publishing. I believe it is possible to provide “hints” that improve default TTS voice pronunciations and also to embed such hints in .epub etc.
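One existing format for such hints is the W3C Pronunciation Lexicon Specification (PLS), an XML file listing written forms and how to speak them; EPUB 3 has described ways to attach such lexicons, though how many reading apps and voice engines honor them is an open question here. A minimal sketch that writes a lexicon using only spoken-form aliases (no phonetic alphabet needed) follows. The function name `build_pls_lexicon` and the example entries are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_pls_lexicon(aliases: Dict[str, str], lang: str = "en") -> str:
    """Return a PLS document mapping written forms to spoken replacements."""
    # Serialize PLS elements without a prefix (sets a default namespace).
    ET.register_namespace("", PLS_NS)
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for written, spoken in aliases.items():
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = written
        # <alias> gives a respelling in ordinary text instead of phonemes.
        ET.SubElement(lexeme, f"{{{PLS_NS}}}alias").text = spoken
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_pls_lexicon({"TTS": "text to speech", "e.g.": "for example"}))
```

Exact phoneme strings could be given with `<phoneme>` elements instead of `<alias>`, at the cost of having to write IPA by hand.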


