Machine Learning Engineer · Bengaluru, India

Soham Shinde

I teach robots to see.

machine learning · computer vision · robotics

Now: at Clutterbot, building the perception pipelines that tell indoor robots where things are.

About

I'm a Machine Learning Engineer at Clutterbot. I build and ship the segmentation and detection models that let our indoor robots see and move. Since we put them on Qualcomm DragonWing and NVIDIA edge hardware, the same models run about 30% faster.

Before Clutterbot I was in research labs, pointing models at images and asking what they understood: video transformers for classroom activity recognition at NTU, annotation tooling around Segment Anything at TCS Research, and restoration of deteriorated Rajasthani murals at CSIR-CEERI. The thread through all of it is computer vision, 3D modeling, applied ML, and robotics. The details are in the experience section below.

I studied Electronics & Communication Engineering with a minor in Data Science at BITS Pilani Goa, and wrote autonomous navigation for Project Kratos, our student-built Mars rover. Off the keyboard I brew Blue Tokai in a French press, boulder, and trek in the Himalayas. Same loop as debugging: find the next hold, commit, sometimes fall.

Experience

Research labs, internships, and now a robotics startup. Newest first.

  1. Machine Learning Engineer · Clutterbot

    May '25 — Present

    I build the pipelines that train and ship our segmentation and detection models for indoor navigation, and package them for the Autonomous Navigation team. Moving the perception stack onto the Qualcomm DragonWing QCS6490P and NVIDIA IoT kits, running bare-bones Linux with GStreamer, made edge inference about 30% faster. To keep target locking stable through that migration, I built a custom 6DOF pose-estimation proof of concept.

    • edge inference
    • perception
    • 6DOF pose
    • GStreamer
  2. Research Associate · Nanyang Technological University

    Jul '24 — May '25

    Built and benchmarked a Video Vision Transformer (ViViT) for classroom activity recognition: 88% test accuracy on a 927-clip EduNet subset, 72% on an independent 100-video set. Saliency maps confirmed the model watched the right things: raised hands and writing on the board. With Dr. Yuvaraj and Dr. Amalin, I also worked on gaze estimation for student-engagement measures.

    • ViViT
    • gaze estimation
    • video
    • HCI
  3. Intern · TCS Research

    Jun '24 — Aug '24

    Wrapped Segment Anything (SAM) in a PyQt5 GUI for AI-assisted image annotation. Automated the stitching of segmented regions into larger sub-scenes for semantic scene understanding.

    • SAM
    • PyQt5
    • segmentation
    • tooling
  4. Research Intern · CSIR-CEERI, Pilani

    May '23 — Aug '23

    Restored and segmented deteriorated Rajasthani murals with U-Net++, DeepLabV3+, PSPNet, and FPN under Dr. Dhiraj Sangwan. Generated synthetic damage with StyleGAN2-ADA and reached an SSIM of 0.9812 on the inpainting pipeline.

    • inpainting
    • GANs
    • cultural heritage
    • SSIM 0.9812

Publications

Two papers on segmenting and restoring artwork and ancient wall paintings. One accepted, one under review.

Projects

Point clouds, smart glasses, a Mars rover.

  • Feb '24 — May '24

    CloSe++

    with Dr. Garvita Tiwari

    Extended the CloSe-Net framework for fine-grained 3D clothing segmentation from coloured point clouds. Sharpened edge detection between clothing types and automated clothing-type detection to remove manual input.

    • 3D
    • point clouds
    • segmentation
  • Oct '22 — Aug '23

    Project Visio

    with Prof. Sougata Sen

    Smart glasses to aid the visually impaired: object detection and scene understanding, Tiny-ML inference on ESP microcontrollers, and a companion Android app talking to the glasses over Wi-Fi.

    • Tiny-ML
    • embedded
    • assistive
  • Sep '22 — Jan '23

    Project Kratos

    BITS Goa Mars rover

    Worked on the autonomous subsystem of BITS Goa's student-built Mars rover prototype: autonomous navigation in ROS, path planning with A* and Dijkstra, SLAM in Gazebo on an NVIDIA Jetson Xavier, and OpenCV object detection.

    • ROS
    • SLAM
    • path planning

Contact

Email me at sohams.web@gmail.com. I'm also on GitHub and LinkedIn, and the CV has the long version.