is it possible to have realtime keyword spotting in flutter #1248

Closed
jtdLab opened this issue Aug 12, 2024 · 5 comments

Comments

@jtdLab

jtdLab commented Aug 12, 2024

Hi, is it possible to use sherpa-onnx for keyword spotting on a stream from the mic in a Flutter app?

@jtdLab
Author

jtdLab commented Aug 12, 2024

I tried to modify the Dart example that reads from a file, but could not make it work for streaming. I get the audio from the mic via https://pub.dev/packages/flutter_sound, but it never detects any keyword.

@csukuangfj
Collaborator

Could you show your changes?

@jtdLab
Author

jtdLab commented Aug 12, 2024

Model setup should be okay (ModelLoader just loads sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01 and unpacks it so the app can use it)

  Future<void> initialize({
    required int sampleRate,
    required String language,
  }) async {
    sherpa_onnx.initBindings();

    final transducer = sherpa_onnx.OnlineTransducerModelConfig(
      encoder: await _modelLoader.encoderPath(_modelName(language)),
      decoder: await _modelLoader.decoderPath(_modelName(language)),
      joiner: await _modelLoader.joinerPath(_modelName(language)),
    );
    final modelConfig = sherpa_onnx.OnlineModelConfig(
      transducer: transducer,
      tokens: await _modelLoader.tokensPath(_modelName(language)),
    );
    final config = sherpa_onnx.KeywordSpotterConfig(
      model: modelConfig,
      keywordsFile: await _modelLoader.keywordsPath(_modelName(language)),
    );
    _spotter = sherpa_onnx.KeywordSpotter(config);
    _stream = _spotter.createStream();
    _sampleRate = sampleRate;
  }

When I now call predict with the samples emitted by the flutter_sound stream, it returns null every time.

  String? predict(Uint8List samples) {
    final samplesFloat32 = _convertBytesToFloat32(samples);
    _stream.acceptWaveform(
      samples: samplesFloat32,
      sampleRate: _sampleRate,
    );

    while (_spotter.isReady(_stream)) {
      _spotter.decode(_stream);
    }

    final keyword = _spotter.getResult(_stream).keyword;
    if (keyword.isNotEmpty) {
      print('Detected: $keyword');
      return keyword;
    }

    return null;
  }
}

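Float32List _convertBytesToFloat32(
  Uint8List bytes, [
  Endian endian = Endian.little,
]) {
  // One float per 16-bit sample; a trailing odd byte is ignored.
  final values = Float32List(bytes.length ~/ 2);

  // sublistView respects the offset and length of `bytes` within its buffer.
  final data = ByteData.sublistView(bytes);

  for (var i = 0; i < values.length; i++) {
    final short = data.getInt16(i * 2, endian);
    // Int16 spans [-32768, 32767], so divide by 32768 to get [-1, 1).
    values[i] = short / 32768.0;
  }

  return values;
}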
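
Since the converted samples should end up roughly in the range [-1, 1), a quick check on each chunk can help rule out a conversion or microphone problem before looking at the model. The following is only a minimal sketch; `peakAmplitude` is a hypothetical helper name and is not part of sherpa-onnx or flutter_sound.

```dart
import 'dart:typed_data';

/// Hypothetical debugging helper (not part of sherpa-onnx or flutter_sound):
/// returns the largest absolute sample value in a chunk, or 0.0 if it is empty.
double peakAmplitude(Float32List samples) {
  var peak = 0.0;
  for (final s in samples) {
    final a = s.abs();
    if (a > peak) peak = a;
  }
  return peak;
}
```

Printing `peakAmplitude(samplesFloat32)` inside predict while speaking is one way to use it. A peak that stays near 0 would suggest silent or misread input rather than a model problem.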

The flutter_sound config looks like this (16-bit PCM, 16000 sampleRate):

  await _recorder.startRecorder(
    toStream: _audioController.sink,
    codec: base.Codec.pcm16,
  );

@csukuangfj
Collaborator

Please check _spotter.ptr and _stream.ptr and see if they are null.

I suspect that model initialization failed.

Make sure you read the logs carefully.

Note you can pass debug: true to ModelConfig to get more logs.
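
The pointer check suggested above could be sketched as follows. This assumes the `ptr` fields are `dart:ffi` `Pointer` values, which the thread implies but does not show. `describeHandle` is a hypothetical helper, not a sherpa-onnx API.

```dart
import 'dart:ffi';

/// Hypothetical helper: reports whether a native handle is null.
/// Accepts a nullable Pointer so it works either way the field is declared.
String describeHandle(String name, Pointer? ptr) {
  final isNull = ptr == null || ptr.address == 0;
  return isNull ? '$name is null' : '$name looks valid';
}
```

Calling `print(describeHandle('spotter', _spotter.ptr))` and the same for `_stream.ptr` right after initialize would show whether either handle is null.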

@csukuangfj
Collaborator

By the way, please change this part:

    while (_spotter.isReady(_stream)) {
      _spotter.decode(_stream);
    }

    final keyword = _spotter.getResult(_stream).keyword;
    if (keyword.isNotEmpty) {
      print('Detected: $keyword');
      return keyword;
    }

You need to call _spotter.getResult inside the while loop, right after each _spotter.decode call.

In case you have not read our KWS dart example, please read it now:

  while (spotter.isReady(stream)) {
    spotter.decode(stream);
    final result = spotter.getResult(stream);
    if (result.keyword != '') {
      print('Detected: ${result.keyword}');
    }
  }

It clearly shows how to do that.
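
Applied to the predict method from earlier in the thread, the same pattern could be factored out roughly as below. This is a sketch under assumptions, not the library's API: `drainKeywords` and its callback parameters are hypothetical names, chosen so the loop can be tried without a model. It also assumes a result may only be visible right after the decode step that produced it, which is why it is checked on every iteration.

```dart
/// Hypothetical helper: decodes while the spotter is ready and checks the
/// result after every decode step, collecting each non-empty keyword.
List<String> drainKeywords({
  required bool Function() isReady,
  required void Function() decode,
  required String Function() currentKeyword,
}) {
  final detected = <String>[];
  while (isReady()) {
    decode();
    // Read the result inside the loop, immediately after decoding.
    final keyword = currentKeyword();
    if (keyword.isNotEmpty) {
      detected.add(keyword);
    }
  }
  return detected;
}
```

In predict, the three callbacks would wrap the calls already shown in the thread: `() => _spotter.isReady(_stream)`, `() => _spotter.decode(_stream)`, and `() => _spotter.getResult(_stream).keyword`.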
