Innovating Guest Services With Voice-Controlled Amenities and Personalized Recommendations 

Client

Our client, an emerging hotel chain, sought to elevate the guest experience by bringing cutting-edge technology into its rooms. The chain needed a robust wake word detection system that would let guests effortlessly control room amenities such as lighting, temperature, and entertainment systems. It also wanted to integrate voice-activated concierge services that offer personalized recommendations and instant access to hotel information throughout each guest's stay.

Challenges

  • Build an end-to-end system that detects a custom wake word in a continuous audio stream.
  • The system should accurately detect the wake word in diverse acoustic environments, including noisy backgrounds and echo-prone spaces.
  • The system should generalize well across different user voices, accents, and speech patterns.
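
To illustrate the first challenge, a continuous stream is commonly handled by cutting it into short, overlapping windows and scoring each window separately. The sketch below shows that idea only; the function name, the 16 kHz sample rate, the one-second window, and the quarter-second hop are all assumptions made for illustration, not details of the deployed system.

```python
def sliding_windows(samples, window_size, hop_size):
    """Yield (start_index, window) pairs that cover `samples` with overlap."""
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    # Stop at the last position where a full window still fits.
    for start in range(0, len(samples) - window_size + 1, hop_size):
        yield start, samples[start:start + window_size]


# Assumed values for illustration: 16 kHz audio, 1 s windows, 0.25 s hop.
SAMPLE_RATE = 16_000
stream = [0.0] * (3 * SAMPLE_RATE)  # three seconds of silence as a stand-in
windows = list(sliding_windows(stream, SAMPLE_RATE, SAMPLE_RATE // 4))
print(len(windows), "windows to score")
```

The overlap means a wake word that straddles a window boundary still appears whole in a neighboring window, at the cost of scoring more windows per second.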

Approach

  • Used a DNN-based, pre-trained multilingual embedding model, trained on 760 frequent words from nine languages using the following datasets:
    • MLCommons Multilingual Spoken Words
    • Common Voice Corpus
    • Google Speech Commands for background noise samples 
  • Adopted a few-shot transfer learning approach to create a five-shot keyword-spotting context model.
  • The few-shot learning framework enables the pre-trained model to generalize to new categories of data, which speeds up fine-tuning of the context model.
  • Treated wake word detection as a three-class classification problem that labels each input as target word, unknown word, or background.
  • Fine-tuned the model with only five custom target keyword samples and 5,000 non-target examples to classify a custom target keyword irrespective of the language.
  • Extracted MFCC (mel-frequency cepstral coefficient) features from the input audio data.
  • Used a softmax output layer to classify the input, identifying the custom wake word as the target word.
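
The final classification step can be sketched as follows. This is a minimal illustration, not the production model: the three class names come from the approach above, while the class order, the example scores, and the 0.8 confidence threshold are assumptions introduced here.

```python
import math

# Class order is assumed for illustration; the three classes are from the approach.
CLASSES = ("target", "unknown", "background")


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, threshold=0.8):
    """Return (label, probability) for one audio window.

    `threshold` is a hypothetical confidence cutoff: a "target" prediction
    that falls below it is downgraded to "unknown".
    """
    probs = softmax(logits)
    best = max(range(len(CLASSES)), key=lambda i: probs[i])
    label = CLASSES[best]
    if label == "target" and probs[best] < threshold:
        label = "unknown"
    return label, probs[best]


# Made-up scores standing in for the fine-tuned model's output.
print(classify([4.0, 1.0, 0.5]))  # confident target
print(classify([1.2, 1.0, 0.9]))  # weak target, treated as unknown
```

A confidence cutoff of this kind is one common way to trade missed detections against false activations; the right value would have to be tuned on real room recordings.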

Impact

  • Achieved an accuracy of 97% in detecting the custom wake words from speech.
  • Achieved 99.9% uptime during stress testing and system performance evaluation, ensuring scalability and reliability.
  • Increased guest satisfaction scores by 23% following system implementation.
  • Seamlessly integrated wake word detection and concierge services with existing hotel systems.