Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn, without an Internet connection. Supports iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A, etc.

k2-fsa/sherpa-ncnn

Supported functions

| Real-time speech recognition | Voice activity detection |
|------------------------------|--------------------------|
| ✔️                           | ✔️                       |

Supported platforms

Architecture Android iOS Windows macOS Linux
x64 ✔️ ✔️ ✔️ ✔️
x86 ✔️ ✔️
arm64 ✔️ ✔️ ✔️ ✔️ ✔️
arm32 ✔️ ✔️
riscv64 ✔️

Supported programming languages

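| 1. C++ | 2. C  | 3. Python | 4. JavaScript |
|--------|-------|-----------|---------------|
| ✔️     | ✔️    | ✔️        | ✔️            |

| 5. Go  | 6. C# | 7. Kotlin | 8. Swift      |
|--------|-------|-----------|---------------|
| ✔️     | ✔️    | ✔️        | ✔️            |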

It also supports WebAssembly.

Introduction

This repository supports running the following functions locally

  • Streaming speech-to-text (i.e., real-time speech recognition)
  • VAD (e.g., silero-vad)

on the platforms and operating systems listed under Supported platforms, with the following APIs:

  • C++, C, Python, Go, C#
  • Kotlin
  • JavaScript
  • Swift
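
As a rough illustration of what streaming recognition looks like from the Python API, the sketch below reads a WAV file and feeds it to a recognizer in small chunks, as a microphone would. The helpers `read_wave`, `iter_chunks`, and `transcribe_streaming` are hypothetical names introduced here. The `sherpa_ncnn.Recognizer` arguments, the `accept_waveform` / `input_finished` / `text` members, and the model file names are written from memory of the project's Python examples and should be treated as assumptions; please check them against the documentation for the model you download.

```python
import struct
import wave
from pathlib import Path


def read_wave(path):
    """Read a 16-bit mono WAV file; return (sample_rate, floats in [-1, 1))."""
    with wave.open(str(path), "rb") as f:
        assert f.getnchannels() == 1, "expected mono audio"
        assert f.getsampwidth() == 2, "expected 16-bit samples"
        sample_rate = f.getframerate()
        num_frames = f.getnframes()
        raw = f.readframes(num_frames)
    # WAV stores little-endian signed 16-bit integers.
    pcm = struct.unpack("<%dh" % num_frames, raw)
    return sample_rate, [s / 32768.0 for s in pcm]


def iter_chunks(samples, chunk_size):
    """Yield consecutive slices of `samples`, each at most `chunk_size` long."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


def transcribe_streaming(model_dir, wav_path, chunk_seconds=0.1):
    """Feed a WAV file to the recognizer chunk by chunk (simulated streaming)."""
    # Imported lazily so the helpers above work without these packages.
    import numpy as np
    import sherpa_ncnn  # assumption: installed via `pip install sherpa-ncnn`

    d = Path(model_dir)
    # Assumption: keyword names and file names follow the project's examples.
    recognizer = sherpa_ncnn.Recognizer(
        tokens=str(d / "tokens.txt"),
        encoder_param=str(d / "encoder_jit_trace-pnnx.ncnn.param"),
        encoder_bin=str(d / "encoder_jit_trace-pnnx.ncnn.bin"),
        decoder_param=str(d / "decoder_jit_trace-pnnx.ncnn.param"),
        decoder_bin=str(d / "decoder_jit_trace-pnnx.ncnn.bin"),
        joiner_param=str(d / "joiner_jit_trace-pnnx.ncnn.param"),
        joiner_bin=str(d / "joiner_jit_trace-pnnx.ncnn.bin"),
        num_threads=4,
    )

    sample_rate, samples = read_wave(wav_path)
    chunk_size = max(1, int(sample_rate * chunk_seconds))
    for chunk in iter_chunks(samples, chunk_size):
        recognizer.accept_waveform(sample_rate, np.array(chunk, dtype=np.float32))
        print(recognizer.text)  # partial result so far

    # Pad with silence so the last words are flushed, then finish.
    tail = np.zeros(int(0.5 * sample_rate), dtype=np.float32)
    recognizer.accept_waveform(sample_rate, tail)
    recognizer.input_finished()
    return recognizer.text
```

With a model directory and a 16-bit mono recording at hand, `transcribe_streaming("path/to/model", "test.wav")` would print partial results as chunks arrive and return the final text. Replacing the file reader with a microphone callback gives the real-time case.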

We support all platforms that ncnn supports.

Everything can be compiled from source with static linking. The generated executable depends only on system libraries.

HINT: It does not depend on PyTorch or any inference framework other than ncnn.

Please see the documentation at https://k2-fsa.github.io/sherpa/ncnn/index.html for installation and usage instructions, e.g.,

  • How to build an Android app
  • How to download and use pre-trained models

We provide a few YouTube videos demonstrating real-time speech recognition with sherpa-ncnn using a microphone.

Links for pre-built Android APKs

| Description                  | URL     |
|------------------------------|---------|
| Streaming speech recognition | Address |

Links for pre-trained models

https://github.com/k2-fsa/sherpa-ncnn/releases/tag/models
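
A minimal sketch of fetching and unpacking a model archive from that release page follows. It assumes GitHub's usual release-asset URL layout (`releases/download/<tag>/<asset>`) with the `models` tag shown above. The helper names `model_url`, `download_model`, and `extract_model` are hypothetical, and the archive name must be copied from the release page rather than guessed.

```python
import tarfile
from pathlib import Path

# Assumption: assets of the "models" release follow GitHub's standard layout.
RELEASE_BASE = "https://github.com/k2-fsa/sherpa-ncnn/releases/download/models"


def model_url(archive_name):
    """Build the download URL for an archive listed on the release page."""
    return f"{RELEASE_BASE}/{archive_name}"


def download_model(archive_name, dest_dir):
    """Download one archive into `dest_dir` and return its local path."""
    import urllib.request  # imported lazily; only needed for network access

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive_path = dest / archive_name
    urllib.request.urlretrieve(model_url(archive_name), str(archive_path))
    return archive_path


def extract_model(archive_path, dest_dir):
    """Unpack a tar archive (any compression) and list the top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r:*") as tar:
        tar.extractall(str(dest))
    return sorted(p.name for p in dest.iterdir())
```

Typical use would be `extract_model(download_model("<archive-name>.tar.bz2", "downloads"), "models")`, where `<archive-name>` is a placeholder for a file listed on the release page. Only extract archives from sources you trust, since `extractall` writes whatever paths the archive contains.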

Useful links

How to reach us

Please see https://k2-fsa.github.io/sherpa/social-groups.html for the next-gen Kaldi WeChat discussion group and QQ discussion group.

See also