Vosk and Scala

Vosk is a speech recognition toolkit. It can work offline. I thought it would be interesting to feed some text into a multimodal graph, so I started testing it.

I added two dependencies to my SBT project definition, as suggested in the documentation:


lazy val root = project
  .aggregate(memory, hexagon, semantic)
  .settings(
    name := "binet",
    libraryDependencies ++= commonDependencies,
    libraryDependencies += "dev.zio" %% "zio" % "2.0.19",
    libraryDependencies += "net.java.dev.jna" % "jna" % "5.13.0", // <- VOSK
    libraryDependencies += "com.alphacephei" % "vosk" % "0.3.45" // <- VOSK
  )
  .dependsOn(
    memory % "test->test;compile->compile",
    hexagon % "test->test;compile->compile",
    semantic % "test->test;compile->compile"
  )

I downloaded a model for the Russian language from here:


This page on Stack Overflow was very useful for getting it working:


I had to set the correct sound format, which I hadn't done at first.
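
To illustrate what "the correct sound format" can mean here, below is a minimal sketch that describes a microphone format with the standard javax.sound.sampled API and checks whether the default mixer supports it. The concrete values (16 kHz, 16-bit, mono, signed little-endian PCM) are an assumption based on what Vosk examples commonly use, not necessarily the exact settings from this project, and the object name MicFormat is made up for the example.

```scala
import javax.sound.sampled.{AudioFormat, AudioSystem, DataLine, TargetDataLine}

object MicFormat {
  // Assumed value: 16 kHz is a common rate for Vosk models,
  // but check the model you actually downloaded.
  val sampleRate: Float = 16000f

  // 16-bit signed mono PCM, little-endian (assumed, see above).
  val format = new AudioFormat(
    AudioFormat.Encoding.PCM_SIGNED,
    sampleRate, // samples per second
    16,         // bits per sample
    1,          // channels (mono)
    2,          // frame size in bytes = channels * bytes per sample
    sampleRate, // frame rate, equal to the sample rate for PCM
    false       // bigEndian = false, i.e. little-endian
  )

  // Description of the capture line we would like to open.
  val lineInfo = new DataLine.Info(classOf[TargetDataLine], format)

  // True if the audio system offers a microphone line in this format.
  def isSupported: Boolean = AudioSystem.isLineSupported(lineInfo)

  def main(args: Array[String]): Unit =
    println(s"format: $format, supported: $isSupported")
}
```

The sample rate chosen here should match the rate passed to Vosk when the recognizer is created. If the two differ, or the stream is stereo while the recognizer expects mono, the recognizer is likely to produce poor or empty output even though the audio is captured without errors.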

The recognition lags behind by a second or two. I suspect that's because my computer has no graphics card, but it may be something else; I'm not sure.

I'm happy that speech recognition works and is free of charge. Everything is ready for new experiments!