Technical Projects
This page describes projects built at the Center for Innovation (CFI) to tackle fun, real-world challenges using off-the-shelf hardware and software components. Refer to LinkedIn for information about my professional experience.
Cube, an assistive device for the visually-impaired
- Cube is a seamless assistive device for the blind that helps them learn, type and read in braille, and navigate easily.
- Arrived at this solution after pretotyping several experiments, building many working prototypes in quick iterations and testing them with blind users from NGOs.
- Selected as an incubatee by Nirmaan, a pre-incubation cell at IITM, where we attended workshops by Silicon Valley VCs, analyzed product-market fit & received ∼$5000 to develop the initial prototype. Find video, media, field-test, article.
Smart-bin
- Designed a smart waste-segregator bin to categorize waste into 4 types: paper, composite, plastic & wet-waste, to tackle the waste-segregation problem at the very root.
- Trained DL models to classify waste with ∼95% accuracy.
- Implemented the electronics subsystem with servos to control the bin lids, RFID scanners to scan IDs, a BeagleBone AI board to run the DL models at the edge, an LCD to display the bin number and an SMPS for power management.
- Field tested at Campus Cafe, IITM. Find slides, video, code.
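As a rough illustration of how a classifier output could be turned into a lid command, here is a minimal Python sketch. It is not the project's code: the `choose_bin` helper, the category order and the `min_confidence` threshold are all assumptions made for the example.

```python
import numpy as np

# The four waste types named above; the ordering is an assumption.
CATEGORIES = ("paper", "composite", "plastic", "wet-waste")

def choose_bin(probabilities, min_confidence=0.6):
    """Return the category whose lid should open, or None if the model is unsure.

    probabilities: one class probability per category, e.g. a softmax output.
    min_confidence: hypothetical threshold below which no lid is opened.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    if probabilities.shape != (len(CATEGORIES),):
        raise ValueError("expected one probability per category")
    best = int(np.argmax(probabilities))
    if probabilities[best] < min_confidence:
        return None  # too uncertain: keep all lids closed
    return CATEGORIES[best]
```

Returning `None` for low-confidence predictions keeps an uncertain item from being dropped into the wrong compartment; what the bin should do in that case is left open here.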
Accelerating bouncing balls using CUDA DSL
- Achieved a 1.5× speed-up of a 2D ball-collision game by splitting the rendering into multiple windows, each of which is handled by a CUDA thread.
- Used OpenGL for graphics.
- Utilized pinned memory, texture buffer, read/write coalescing, ternary operator to reduce thread divergence, stream kernels, atomically updated list for ball-states, cache-efficient nested for-loop, nvprof profiler. Play the game, find code.
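The idea behind using a ternary (select) expression to reduce thread divergence can be sketched on the CPU with NumPy, where `np.where` plays the role of `cond ? a : b`. This is only an analogy under stated assumptions, not the game's CUDA code: the `step` function is hypothetical, balls are treated as points and ball-to-ball collisions are left out.

```python
import numpy as np

def step(pos, vel, dt, width, height):
    """Advance all balls by one time step and bounce them off the walls.

    pos, vel: float arrays of shape (n_balls, 2). Returns the new (pos, vel).
    """
    pos = pos + vel * dt
    limits = np.array([width, height], dtype=pos.dtype)
    # Select-style update: every element evaluates the same expression,
    # so there is no per-ball if/else branch.
    out_low = pos < 0.0
    out_high = pos > limits
    vel = np.where(out_low | out_high, -vel, vel)
    # Reflect positions that crossed a wall back inside the box.
    pos = np.where(out_low, -pos, pos)
    pos = np.where(out_high, 2.0 * limits - pos, pos)
    return pos, vel
```

In a CUDA kernel the same pattern would be written per thread as a conditional expression, which the compiler can typically turn into a predicated select instead of a divergent branch.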
Automatic Cooking Machine (in collaboration with Butterfly)
- Worked on revamping and improving the application, mechanical, electronics and electrical stacks of AutoChef-v1.0, developed by Butterfly, to design AutoChef-v2.0. Exact details are confidential. Find certificate.
Sulabh: A gateway for accessibility
- To make websites truly accessible, we proposed Sulabh, a web app that asks users to choose the kind of assistance or feature they want (instead of choosing a disability category). A few of these are given below:
- Voice Assistance: Screen structure & screen reader, voice recognition & commands, navigation cues, error prioritization;
- Gesture Based Assistance: Eye and facial movement tracker for cursor movement and keyboard control;
- Image/Video Captions, Automatic site maps & Hyperlinks briefing assistance;
- Appearance Settings: Brightness, font, font & button size, line & paragraph spacing, color scheme & contrast control;
- Features for even more “Easy Access & Understanding”: Search, spelling & meaning assistant, keyboard shortcuts for thumbnails and bookmarks, pop-up blocking, keyword highlighting, refreshable braille display support for deaf-blind users, etc.
- This work was appreciated by the Ministry of Rural Development, Government of India and reached the finals of Smart India Hackathon - 2020.
- Find slides, flow-chart.
ATBERT: Multi-modal music classifier
- Music can be classified by genre, beats, themes, etc., and large labelled datasets are difficult to obtain. A self-supervised (to learn music representations from limited labelled data), multi-modal (multiple inputs from similar distributions act like additional data) transformer (to tackle long-range sequence dependence; parallelizable, with reduced computation) is therefore expected to be efficient.
- Correct Pair Prediction (CPP) is used to help the model learn cross-modal dynamics. The model tries to predict whether a pair of audio-text embeddings corresponds to the same piece of music or to different ones. The audio-text summary embedding and a multi-modal encoder output taken at a random time step are passed through an external 2-layer, non-linear head that acts as a binary classifier.
- Info-Noise-Contrastive-Estimation (InfoNCE) loss is also used, where an audio-text pair is classified as positive or negative and the negatives are sampled from a mini-batch using a novel CANS-Similar (L2 norm based) method. Find code.
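For readers unfamiliar with InfoNCE, the loss can be sketched in a few lines of NumPy. This is a generic version that treats every other item in the mini-batch as a negative. It does not reproduce the CANS-Similar sampling used in ATBERT, and the `info_nce` name and `temperature` value are assumptions made for the example.

```python
import numpy as np

def info_nce(audio, text, temperature=0.1):
    """InfoNCE loss over a mini-batch of paired embeddings.

    audio, text: arrays of shape (batch, dim). Row i of each comes from the
    same piece of music (positive pair); all other rows act as negatives.
    """
    # L2-normalise so the dot product is a cosine similarity.
    audio = audio / np.linalg.norm(audio, axis=1, keepdims=True)
    text = text / np.linalg.norm(text, axis=1, keepdims=True)
    logits = audio @ text.T / temperature  # shape (batch, batch)
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive for row i sits on the diagonal.
    return -np.mean(np.diag(log_prob))
```

The loss is small when each audio embedding is closest to its own text embedding and grows when a mismatched pair scores higher, which is what pushes the two modalities into a shared space.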
Cloud Beacon: IoT for seamless local broadcast
- Aims to share data anywhere, with anyone within proximity, without the recipient actually having to connect to the network (to avoid security issues). Find concept doc.
- Use-cases include museums, malls, etc., where information about nearby artifacts can be displayed on a phone without the user connecting to the open network.