Speech Recognition Engineer

Remote:

Full Remote

Contract:

Full time

Experience:

Mid-level (2-5 years)

Work from:

United States

Offer summary

Qualifications:

Deep knowledge of speech recognition theory
Experience with Kaldi or K2 libraries
Experience in developing ASR models
Highly skilled in C++ and software engineering
Proficiency in data analysis for tech development

Key responsibilities:

Implement new BSR version for platforms
Improve system efficiency and reduce model footprint
Automate model training and optimize performance
Develop integration with natural language processing
Conduct research and develop data processing pipelines

Art2Hire Tech Recruiters Human Resources, Staffing & Recruiting TPE https://www.art2hire.com/

Job description

Our client, a Silicon Valley startup whose voice-based virtual tutor teaches over a million children worldwide to speak English, is looking for a Speech Recognition Engineer to join its team.

Location: CA, United States
Type: Full-time, Remote
Start date: ASAP

About the project:

Headquartered in Mountain View, California, the company has a growing international team combining expertise in voice and virtual human technologies, education, and game design.
The product is an AI-powered English voice tutor for children: using AI and voice technology, it gives children practice speaking English. The project has received multiple international awards and nominations, including GESA (Global Edtech Startup Awards) in London, EnlightEd in Madrid, and Voice Summit Awards in New York. In addition, the application is the #1 app for kids on the App Store and Google Play.

About the Role:

We are seeking a Speech Recognition Engineer to elevate the communication technology to the next level. In this role, you will enhance the system's ability to understand millions of children worldwide.
Your work will play a crucial role in improving the interaction between the platform and young learners, making the learning experience more effective and engaging.
You will be tasked with improving the company's Speech Recognition solution.
You will strengthen the core tech stack to deliver higher-quality recognition and a smaller model footprint.
You'll also conduct R&D in speech recognition for children and develop audio data analysis and processing pipelines with team members.

Responsibilities include but are not limited to:

Implement a new version of BSR (BSR2) for iOS and Android using K2 & Icefall. Improve the system efficiency and cut the model footprint by a factor of four.
Automate model training, experiment with different model architectures, and optimize model performance for different classes of devices.
Develop a seamless integration between the BSR and the natural language processing pipeline to better support dialogue interactions.

Requirements:

Speech Recognition Experience: Deep knowledge and understanding of speech recognition theory and processes.
Kaldi Experience: A background in working with and building solutions using Kaldi or K2 libraries. While optional, this skill is highly desirable as it will speed up onboarding tremendously.
Machine Learning Experience: Experience developing ASR models. A good grasp of the machine learning skills needed for training and testing new models.
Excellent Technical Skills: Highly skilled in C++ and software engineering. Writes reliable and efficient code.
Data-Based Idea Generator: Independently and proactively analyzes data to generate insights and ideas for technology development.
Proactive Problem-Solver: Takes initiative, proposes solutions, and acts without waiting for detailed instructions.
Perfectionist: Strives for excellence and aims to work with A-players to learn from the best.
Excellent Communicator: Can express themselves reliably and clearly in spoken and written form.
Team Player: Values collaboration, seeks and provides feedback, and listens to stakeholders, especially product development people.
Continuous Learner: Stays up-to-date with the latest in speech recognition, always hungry for new knowledge.

Position details:

Fully remote, based in a time zone between Pacific Time (Los Angeles, U.S.) and GMT+3 (Istanbul), with 2-4 strategic offsite sessions per year.
Language: Must speak English. Russian is a plus.
Competitive market salary
Paid sick leave and vacation days
Significant equity compensation (stock options)