Enhancing Patient Experience with Intelligent Age and Gender Detection
Client
Our client, a healthcare technology company focused on telemedicine and remote monitoring, wanted demographic insights to improve patient care. They needed an advanced speech-based system for detecting age and gender, so that demographic data could be collected seamlessly during patient interactions. The goal was to enable personalized care plans, empower healthcare professionals, and raise the quality of care.
Challenges
- Build a system that predicts a user's age and gender from their speech.
- Predict both age and gender from spoken utterances with high accuracy.
- Evaluate the system for biases that could lead to unfair predictions across demographic groups.
Approach
- Collected TIMIT, NISP, and noise datasets, covering various accents, languages, and speaking styles.
- Preprocessed the speech data to balance gender representation within the datasets.
- Designed a multi-scale deep learning architecture with three parallel CNNs, each using a different kernel size.
- Defined and configured the hyperparameters.
- Extracted features through convolutional and pooling layers, which learn feature representations automatically and detect patterns and structures in the spectrograms.
- Framed age prediction as a regression task, mapping age ranges to numerical values.
- Framed gender prediction as a classification task, using binary encoding for the gender labels.
- Trained the model on an NVIDIA A100 GPU, using the preprocessed features and labeled data from the TIMIT and NISP datasets.
- Evaluated the model on a validation set, measuring metrics such as accuracy, mean squared error, and mean absolute error.
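The multi-scale design described above can be illustrated with a minimal forward-pass sketch. It is not the production model: the kernel sizes (3, 5, 7), channel count, 40-band spectrogram input, 1D convolution over time, and random untrained weights are all assumptions made for illustration, and the class and function names are hypothetical. It shows three parallel convolutional branches whose pooled outputs are concatenated and fed to an age regression head and a gender classification head.

```python
import numpy as np


def conv1d_valid(x, w):
    """Cross-correlate x (in_ch, T) with w (out_ch, in_ch, k) -> (out_ch, T - k + 1)."""
    k = w.shape[2]
    steps = x.shape[1] - k + 1
    # At each time step, contract the (in_ch, k) window with every filter.
    cols = [np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1])) for t in range(steps)]
    return np.stack(cols, axis=1)


class MultiScaleSketch:
    """Hypothetical stand-in for a three-branch multi-scale CNN (untrained)."""

    def __init__(self, n_bands=40, channels=8, kernel_sizes=(3, 5, 7), seed=0):
        rng = np.random.default_rng(seed)
        # One filter bank per branch; only the kernel size differs (assumed values).
        self.branches = [rng.normal(0.0, 0.1, size=(channels, n_bands, k))
                         for k in kernel_sizes]
        feat_dim = channels * len(kernel_sizes)
        self.w_age = rng.normal(0.0, 0.1, size=feat_dim)     # regression head
        self.w_gender = rng.normal(0.0, 0.1, size=feat_dim)  # classification head

    def features(self, spec):
        pooled = []
        for w in self.branches:
            act = np.maximum(conv1d_valid(spec, w), 0.0)  # ReLU
            pooled.append(act.max(axis=1))                # global max pool over time
        return np.concatenate(pooled)

    def forward(self, spec):
        f = self.features(spec)
        age = float(f @ self.w_age)                               # unbounded numeric output
        gender_prob = float(1.0 / (1.0 + np.exp(-(f @ self.w_gender))))  # sigmoid
        return age, gender_prob


# Usage with a random array standing in for a (bands, frames) spectrogram.
spec = np.random.default_rng(1).normal(size=(40, 100))
model = MultiScaleSketch()
age, gender_prob = model.forward(spec)
```

Sharing one feature vector between the two heads lets a single network serve both tasks. In a trained model the age output would be fitted with a regression loss and the gender probability with a binary classification loss.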
Impact
- Achieved a Mean Squared Error of 5.6 for age detection and an accuracy of 97% for gender detection across demographic variations, languages, and accents.
- Demonstrated consistent and unbiased age predictions with less than 6% variation in performance across diverse demographic groups.
- Improved patient throughput by 7% and reduced administrative costs by 9% with automated data collection and processing.
- Increased telehealth utilization rates by 13% due to enhanced effectiveness and personalized experiences.
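The evaluation metrics named in this case study (accuracy, mean squared error, and mean absolute error) and a per-group consistency check can be sketched in plain Python. The case study does not say how "variation in performance" was measured; the relative spread used here (max minus min of the per-group scores, divided by their mean) is one assumed definition. The helper names and the toy data are hypothetical and are not results from the project.

```python
from collections import defaultdict


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_group(metric, groups, y_true, y_pred):
    """Compute a metric separately for each demographic group label."""
    buckets = defaultdict(lambda: ([], []))
    for g, t, p in zip(groups, y_true, y_pred):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: metric(ts, ps) for g, (ts, ps) in buckets.items()}


def relative_spread(scores):
    """Assumed definition of variation: (max - min) / mean of per-group scores."""
    values = list(scores.values())
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean


# Toy data for illustration only (not from the project).
groups = ["a", "a", "b", "b"]
age_true = [30, 40, 30, 40]
age_pred = [32, 39, 29, 43]
gender_true = [1, 0, 1, 1]   # binary-encoded labels; which class is 1 is arbitrary here
gender_pred = [1, 0, 0, 1]

age_mae_by_group = per_group(mae, groups, age_true, age_pred)
spread = relative_spread(age_mae_by_group)
```

A spread close to zero means the model performs similarly for every group. A threshold on this spread is one simple way to flag uneven performance before deployment.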