Abstract - The task of sign language translation is crucial for enhancing communication for the deaf and hard-of-hearing community. In this paper, we build upon previous work by enhancing the large language model (LLM) component of a hybrid approach for sign language translation from continuous video streams. The prior hybrid approach employed a GPT-3.5 Turbo prompt to generate translations from identified sign spottings. We improve on this by incorporating fine-tuning techniques and revising the prompt. We employ a sign spotter from the existing literature to identify individual gestures within the video stream. These identified gestures are then processed by an LLM, which constructs grammatically correct and coherent sentences. Our evaluation of two models demonstrates significant improvements, with the fine-tuned Gemini-1.0 Pro model achieving the largest gains in translation accuracy and coherence.
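To make the spotter-to-LLM pipeline concrete, here is a minimal sketch of the prompting step, using the OpenAI chat API since the prior approach used GPT-3.5 Turbo. The gloss list and the prompt wording are illustrative assumptions, not the paper's actual spotter output or prompt.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical output of a sign spotter: ordered glosses from one clip.
glosses = ["TOMORROW", "WEATHER", "RAIN", "MAYBE"]

# Illustrative prompt, not the authors' exact wording.
prompt = (
    "The following sign language glosses were spotted, in order, from a "
    f"continuous signing video: {' '.join(glosses)}. "
    "Write one grammatically correct, coherent English sentence that "
    "conveys their meaning. Reply with the sentence only."
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)  # e.g., "It might rain tomorrow."
```

A fine-tuned model (such as the Gemini-1.0 Pro variant evaluated in the paper) would be called the same way, with gloss-to-sentence pairs supplied during fine-tuning instead of relying solely on the prompt.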
Abstract - LiDAR has become ubiquitous in autonomous vehicles for its distinct ability to capture a highly detailed map of the vehicle's environment in real time, including dynamic changes. Compared to other sensors (such as stereo camera vision), LiDAR remains effective at long range while still maintaining high resolution. However, deploying and testing these systems is a major hurdle for any entity seeking to do so: training and testing models on physical vehicles demands extensive resources and is computationally expensive. In a simulated environment, such testing becomes far more feasible. Accordingly, I developed an approach to LiDAR odometry that localizes a vehicle in real time, using sensor data to construct a trajectory path. I then used the CARLA simulation environment to test it under varying vehicle environments and sensor conditions.
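As a rough illustration of the odometry idea, the sketch below chains scan-to-scan ICP registrations (via Open3D) to accumulate a pose and trajectory. The `scans` iterable, the correspondence distance, and the choice of point-to-point ICP are assumptions for illustration, not necessarily the poster's actual method.

```python
import numpy as np
import open3d as o3d

def icp_step(prev_pcd, curr_pcd, init=np.eye(4)):
    """Estimate the rigid transform mapping curr_pcd into prev_pcd's frame."""
    result = o3d.pipelines.registration.registration_icp(
        curr_pcd, prev_pcd,
        max_correspondence_distance=1.0,  # metres; tune per sensor/scene
        init=init,
        estimation_method=o3d.pipelines.registration
            .TransformationEstimationPointToPoint(),
    )
    return result.transformation

pose = np.eye(4)                     # world-from-vehicle pose, starts at origin
trajectory = [pose[:3, 3].copy()]    # recorded positions
prev = None
# `scans` is a stand-in for successive o3d.geometry.PointCloud frames,
# e.g. converted from a CARLA LiDAR sensor callback.
for scan in scans:
    scan = scan.voxel_down_sample(voxel_size=0.5)  # thin the cloud for speed
    if prev is not None:
        # world_from_curr = world_from_prev @ prev_from_curr
        pose = pose @ icp_step(prev, scan)
        trajectory.append(pose[:3, 3].copy())
    prev = scan
```

Each iteration estimates only the relative motion between consecutive scans; composing those relative transforms yields the vehicle's trajectory, which in CARLA can be compared against the simulator's ground-truth poses.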