Text-to-speech (TTS) converts text into human-like speech. Baidu TTS is a free TTS SDK; this post uses it to build a TTS app for Android.
Step 1: Create Application
First, register an account at Baidu. Then log in to the Baidu Voice Developer Platform and create an application:
Choose speech technology.
Choose create application.
Enter the application information.
The package name must match the package name of your app.
Step 2: Download SDK and library
Click the download SDK button on the left-hand side, then choose speech synthesis and download the SDK for Android.
Unzip the SDK, then copy the Baidu-TTS-Android-2.3.5.20180713_6101c2a/app/src/main/jniLibs/armeabi folder into your project's jniLibs folder.
Also copy Baidu-TTS-Android-2.3.5.20180713_6101c2a/app/libs/com.baidu.tts_2.3.2.jar into your project's libs folder.
Step 3: Import jar
In build.gradle, add the following:

build.gradle
dependencies {
    ...
    // Use 'compile' instead on Android Gradle Plugin versions before 3.0.
    implementation files('libs/com.baidu.tts_2.3.2.jar')
}
Step 4: Init TTS
Initialize the SpeechSynthesizer:

// Field of the activity.
private SpeechSynthesizer mSpeechSynthesizer;

// Inside an activity method, where `this` is the activity.
mSpeechSynthesizer = SpeechSynthesizer.getInstance();
mSpeechSynthesizer.setContext(this);
Set up the TTS listener:

public class MainActivity extends AppCompatActivity implements SpeechSynthesizerListener {

    private void initTTS() {
        ...
        mSpeechSynthesizer.setSpeechSynthesizerListener(this);
    }

    @Override
    public void onSynthesizeStart(String utteranceId) {
    }

    @Override
    public void onSynthesizeDataArrived(String utteranceId, byte[] bytes, int progress) {
    }

    @Override
    public void onSynthesizeFinish(String utteranceId) {
    }

    @Override
    public void onSpeechStart(String utteranceId) {
    }

    @Override
    public void onSpeechProgressChanged(String utteranceId, int progress) {
    }

    @Override
    public void onSpeechFinish(String utteranceId) {
    }

    @Override
    public void onError(String utteranceId, SpeechError speechError) {
    }
}
Step 5: Set AppId, AppKey and AppSecretKey
Once the application has been created on the Baidu website, you will get an AppId, AppKey and AppSecretKey.

int result = mSpeechSynthesizer.setAppId(appId);
result = mSpeechSynthesizer.setApiKey(appKey, secretKey);
Step 6: Verify and download the authorization file
TtsMode.ONLINE: online only; the authorization file is downloaded automatically.
TtsMode.MIX: mixed offline/online synthesis, with online taking priority.

private boolean checkAuth() {
    // ttsMode is a field holding the mode in use, e.g. TtsMode.MIX.
    AuthInfo authInfo = mSpeechSynthesizer.auth(ttsMode);
    if (!authInfo.isSuccess()) {
        String errorMsg = authInfo.getTtsError().getDetailMessage();
        Log.e(TAG, "checkAuth failed: " + errorMsg);
        return false;
    } else {
        Log.i(TAG, "checkAuth success!");
        return true;
    }
}
Step 7: Import TTS Model
Copy Baidu-TTS-Android-2.3.5.20180713_6101c2a/app/src/main/assets into your_project/src/main/assets.
Before using TTS, copy the model files out of the assets into a folder on the device. The code here uses getFilesDir(), which is the app's internal files directory rather than the SD card.

// Needs: java.io.Closeable, java.io.File, java.io.FileOutputStream,
// java.io.IOException, java.io.InputStream, android.util.Log
private String MODEL_FILENAME;
private String TEXT_FILENAME;
private static final String SPEECH_FEMALE_MODEL_NAME = "bd_etts_common_speech_f7_mand_eng_high_am-mix_v3.0.0_20170512.dat";
private static final String TEXT_MODEL_NAME = "bd_etts_text.dat";

@Override
protected void onResume() {
    super.onResume();
    copyModelFileToSD();
    initTTS();
}

private void copyModelFileToSD() {
    String folder = getFilesDir().getAbsolutePath();
    // Each path now carries the name of the model it actually holds.
    TEXT_FILENAME = folder + "/" + TEXT_MODEL_NAME;
    MODEL_FILENAME = folder + "/" + SPEECH_FEMALE_MODEL_NAME;
    copyAssetIfMissing(TEXT_MODEL_NAME, new File(TEXT_FILENAME));
    copyAssetIfMissing(SPEECH_FEMALE_MODEL_NAME, new File(MODEL_FILENAME));
}

// Copies one asset to target unless the file already exists.
// Each call opens and closes its own streams, so nothing leaks.
private void copyAssetIfMissing(String assetName, File target) {
    if (target.exists()) {
        return;
    }
    InputStream is = null;
    FileOutputStream fos = null;
    try {
        is = getApplicationContext().getAssets().open(assetName);
        fos = new FileOutputStream(target);
        copyFile(is, fos);
    } catch (IOException e) {
        Log.e(TAG, "Error: " + e.toString());
    } finally {
        closeObject(is);
        closeObject(fos);
    }
}

private void copyFile(InputStream is, FileOutputStream fos) throws IOException {
    byte[] buffer = new byte[2048];
    int byteCount;
    while ((byteCount = is.read(buffer)) != -1) {
        fos.write(buffer, 0, byteCount);
    }
    fos.flush();
}

private void closeObject(Closeable obj) {
    try {
        if (null != obj) {
            obj.close();
        }
    } catch (IOException e) {
        Log.e(TAG, "Error: " + e.toString());
    }
}
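The copy-if-missing step can also be sketched in plain Java, which makes it easy to try outside an Android project. This is only an illustration: ModelCopier and copyIfMissing are hypothetical names that are not part of the Baidu SDK, and the sketch relies on java.nio.file, which on Android is only available from API level 26.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of the Baidu SDK. */
final class ModelCopier {
    private ModelCopier() {}

    /**
     * Copies the stream to target unless target already exists.
     * Returns true if a copy was made. The caller closes the stream.
     */
    static boolean copyIfMissing(InputStream source, Path target) throws IOException {
        if (Files.exists(target)) {
            return false; // already copied on an earlier run
        }
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.copy(source, target);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tts-models");
        Path target = dir.resolve("bd_etts_text.dat");
        byte[] fakeModel = {1, 2, 3}; // stand-in for the real asset bytes
        try (InputStream in = new ByteArrayInputStream(fakeModel)) {
            System.out.println(copyIfMissing(in, target)); // true: file was created
        }
        try (InputStream in = new ByteArrayInputStream(fakeModel)) {
            System.out.println(copyIfMissing(in, target)); // false: file already exists
        }
    }
}
```

In an activity, the stream would presumably come from getAssets().open(...) and the target from a File under getFilesDir(), wrapped in try-with-resources so the stream is closed even when the copy fails.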
After checking authorization, set up the parameters for the speech model.

private void setupParam() {
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_TTS_TEXT_MODEL_FILE, TEXT_FILENAME);
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_TTS_SPEECH_MODEL_FILE, MODEL_FILENAME);
    // PARAM_SPEAKER selects the voice, e.g. emotional male, or "4" for child.
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_SPEAKER, "0");
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_VOLUME, "9");
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_SPEED, "4");
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_PITCH, "4");
    mSpeechSynthesizer.setParam(SpeechSynthesizer.PARAM_MIX_MODE, SpeechSynthesizer.MIX_MODE_DEFAULT);
}
MIX_MODE_DEFAULT: on Wi-Fi, synthesis runs online; on any other connection it runs offline. While online, a request that takes more than 6 seconds falls back to offline synthesis automatically.
MIX_MODE_HIGH_SPEED_SYNTHESIZE_WIFI: the same as MIX_MODE_DEFAULT, but the fallback happens after 1.2 seconds.
MIX_MODE_HIGH_SPEED_NETWORK: synthesis runs online on 3G/4G as well as Wi-Fi, and the fallback happens after 1.2 seconds.
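To make the three rules easier to compare, here is a small model of them in plain Java. It is a reading of the descriptions above, not SDK code: MixModePolicy, Network and usesOffline are hypothetical names, and it assumes that the automatic mode change means falling back to offline synthesis.

```java
/** Hypothetical model of the mix-mode rules; not part of the Baidu SDK. */
final class MixModePolicy {
    enum Network { WIFI, MOBILE_3G_4G, OTHER }

    enum MixMode {
        DEFAULT(6000, false),
        HIGH_SPEED_SYNTHESIZE_WIFI(1200, false),
        HIGH_SPEED_NETWORK(1200, true);

        final long timeoutMillis;     // online request time allowed before fallback
        final boolean onlineOnMobile; // whether 3G/4G also counts as online-capable

        MixMode(long timeoutMillis, boolean onlineOnMobile) {
            this.timeoutMillis = timeoutMillis;
            this.onlineOnMobile = onlineOnMobile;
        }
    }

    private MixModePolicy() {}

    /** True if a request in this mode starts out online on the given network. */
    static boolean startsOnline(MixMode mode, Network network) {
        if (network == Network.WIFI) {
            return true;
        }
        return network == Network.MOBILE_3G_4G && mode.onlineOnMobile;
    }

    /** True if the request ends up offline, either from the start or after a timeout. */
    static boolean usesOffline(MixMode mode, Network network, long elapsedMillis) {
        return !startsOnline(mode, network) || elapsedMillis > mode.timeoutMillis;
    }

    public static void main(String[] args) {
        System.out.println(usesOffline(MixMode.DEFAULT, Network.WIFI, 3000));                 // false
        System.out.println(usesOffline(MixMode.DEFAULT, Network.WIFI, 6500));                 // true
        System.out.println(usesOffline(MixMode.DEFAULT, Network.MOBILE_3G_4G, 0));            // true
        System.out.println(usesOffline(MixMode.HIGH_SPEED_NETWORK, Network.MOBILE_3G_4G, 0)); // false
    }
}
```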
The full TTS initialization flow is as follows:

private void initTTS() {
    mSpeechSynthesizer = SpeechSynthesizer.getInstance();
    mSpeechSynthesizer.setContext(this);
    mSpeechSynthesizer.setSpeechSynthesizerListener(this);

    int result = mSpeechSynthesizer.setAppId(appId);
    result = mSpeechSynthesizer.setApiKey(appKey, secretKey);

    // Stop here if the authorization check fails.
    if (!checkAuth()) {
        return;
    }

    setupParam();

    result = mSpeechSynthesizer.loadModel(TEXT_FILENAME, MODEL_FILENAME);
    result = mSpeechSynthesizer.initTts(TtsMode.MIX);
    if (result != 0) {
        Log.e(TAG, "init failed");
    } else {
        Log.i(TAG, "init success");
    }
}
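The flow above only inspects the result of the last call, so a failure in setAppId or setApiKey would go unnoticed. One way to check every step is sketched below in plain Java. InitSteps and firstFailure are hypothetical names, and the sketch assumes that every call returns 0 on success, which the source only shows for initTts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntSupplier;

/** Hypothetical helper for illustration; not part of the Baidu SDK. */
final class InitSteps {
    private InitSteps() {}

    /**
     * Runs the steps in insertion order and stops at the first non-zero code.
     * Returns the name of the failing step, or null if all returned 0.
     */
    static String firstFailure(Map<String, IntSupplier> steps) {
        for (Map.Entry<String, IntSupplier> step : steps.entrySet()) {
            if (step.getValue().getAsInt() != 0) {
                return step.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Lambdas stand in for the real SDK calls.
        Map<String, IntSupplier> steps = new LinkedHashMap<>();
        steps.put("setAppId", () -> 0);
        steps.put("setApiKey", () -> -1); // simulated failure
        steps.put("initTts", () -> 0);    // never runs
        System.out.println(firstFailure(steps)); // prints "setApiKey"
    }
}
```

In the activity, each entry would wrap one SDK call, for example a lambda returning mSpeechSynthesizer.setAppId(appId), and a non-null result could be logged with Log.e before returning.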
Step 8: Speech
To speak text immediately, use the speak API. The synthesize API only synthesizes the text; use the speak API to read it out.

String text = "test baidu TTS";
mSpeechSynthesizer.speak(text);