experience
lossfunk
aug 2025 - present | resident researcher | remote
- exploring a novel dual-stream reinforcement learning architecture that uses multi-scale temporal hierarchies and physics-aware learning to address limitations of current rl methods, with applications to safety-critical robotics simulation.
- investigating a multi-algorithm comparison framework for systematic evaluation of conventional vs dual-stream reward schemes across dqn, ppo, sac, and a3c, with real-time genesis physics simulation and taichi backend integration for realistic robotic control.
- contributing to production-ready rl systems through modular environment adapters, automated training pipelines with progress tracking, and comprehensive model management for physics simulation and game environments, aiming to advance safe ai.
alan turing institute
dec 2024 - apr 2025 | research associate | collaboration: university of birmingham & u.s. army research institute
- developed a heterogeneous-agent reinforcement learning framework using human-proxy agents that simulate realistic human constraints and capabilities to improve ai-human collaboration in multi-agent systems.
- designed a cooperative grid-world capture environment based on the stag hunt game, where machine agents had full observability but could not detect target health, while human-proxy agents had limited vision but a unique ability to detect disease.
- conducted experiments across various environment configurations, varying disease probability and penalty severity to analyze cooperation patterns.
- demonstrated that rl agents trained with human-proxy teammates achieved superior cross-environment performance, with teams trained under moderate risk conditions showing 30-40% higher collaboration rates.
riskopsai
jun 2023 - aug 2024 | ai/ml intern | san jose, remote
- built a deep learning classification pipeline with tensorflow and pytorch using resnet-50 and transformer models. used tensorrt to optimize inference, achieving 25% faster inference and 15% higher accuracy.
- created distributed ml infrastructure with apache airflow and mlflow on aws gpu clusters using horovod, reducing training time by 20%. set up automated data pipelines for feature engineering.
- developed a predictive analytics system using postgresql and bigquery with scikit-learn and xgboost on imbalanced data. optimized queries to improve decision-making efficiency by 30%.
srm institute of science & technology
oct 2022 - feb 2024 | researcher under dr. vaishnavi moorthy | chennai, india
- developed an autonomous navigation system using ros2 by fusing lidar, rgb-d, and imu data through an extended kalman filter. this improved localization accuracy by 15%.
- built a slam system using sac and trpo algorithms in pytorch to improve path planning with rrt* and a*. reduced navigation errors by 25% using dynamic obstacle avoidance.
- created a real-time perception pipeline using opencv and pcl, integrating yolov7 for object detection. achieved 20 ms latency and 95% detection accuracy in changing environments.