Text-To-Speech (TTS)

The Text-To-Speech (TTS) project delivered an accessibility service that reads text aloud to students while they take online tests. The service uses Amazon Polly's deep learning technologies to synthesize natural-sounding speech, and it uses Polly's speech marks metadata to highlight text in sync with the audio. It is currently integrated with the SchoolCity student portal and DnA online testing platforms, giving students with diverse learning needs both auditory and visual support.

The Challenge

SchoolCity needed a high-quality text-to-speech solution that paired natural-sounding speech synthesis with visual text highlighting for accessibility. The work involved integrating Amazon Polly for speech synthesis, using speech marks to synchronize highlighting with audio, keeping audio delivery latency low, supporting multiple voices and languages, and integrating with existing educational platforms without compromising performance or reliability.

Text-To-Speech (TTS)
  • Natural-sounding speech synthesis
  • Visual text highlighting with speech marks
  • Multi-voice and multi-language support
  • Real-time audio streaming
  • Global content delivery network
  • High-performance API architecture

The Solution & Results

The TTS service uses Amazon Polly for natural-sounding speech synthesis and Polly's speech marks metadata to highlight text in sync with audio playback. The API is a Node.js service built on Fastify, with Amazon DynamoDB for data management, Amazon S3 for audio storage, and Amazon CloudFront for global content delivery. The service is deployed in Docker containers, with Terraform managing the infrastructure and Concourse CI automating deployment.
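To illustrate how speech marks can drive synchronized highlighting, here is a minimal sketch, not the project's actual code. It assumes the client receives Polly's speech marks output as newline-delimited JSON, where each word-level mark carries a `time` in milliseconds plus `start` and `end` offsets into the input text. The function names `parseSpeechMarks` and `activeWordIndex` are hypothetical and introduced only for this example.

```typescript
// One speech mark, following the fields Amazon Polly documents for its
// JSON speech marks output. How the real service stores or forwards
// these marks is not described in the source; this is an assumption.
interface SpeechMark {
  time: number;  // milliseconds from the start of the audio stream
  type: string;  // "word", "sentence", "viseme", or "ssml"
  start: number; // offset of the first byte of the item in the input text
  end: number;   // offset just past the last byte of the item
  value: string; // the word or sentence text
}

// Hypothetical helper: parse newline-delimited JSON and keep word marks.
function parseSpeechMarks(body: string): SpeechMark[] {
  return body
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as SpeechMark)
    .filter((mark) => mark.type === "word");
}

// Hypothetical helper: binary-search for the last word whose start time
// is at or before the current playback position. Returns -1 when the
// playback position is before the first word. Assumes marks are sorted
// by time, which is the order Polly emits them in.
function activeWordIndex(marks: SpeechMark[], playbackMs: number): number {
  let lo = 0;
  let hi = marks.length - 1;
  let found = -1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (marks[mid].time <= playbackMs) {
      found = mid;
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found;
}
```

A player could call `activeWordIndex` on each `timeupdate` event of an audio element, passing `currentTime * 1000`, and highlight the returned word's range. Because Polly reports `start` and `end` as byte offsets rather than character offsets, text containing multi-byte characters would need an offset conversion before it is used to slice a JavaScript string.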

The resulting TTS service provides natural-sounding speech with synchronized visual text highlighting. It now serves thousands of students across the SchoolCity student portal and DnA online testing platforms, improving accessibility and learning outcomes for students with diverse needs.
