AI Engineer · Computer Vision · Multimodal AI

Gayathri Adulla

I build production computer vision and multimodal AI systems — from model architecture to deployed pipelines that ship real outcomes.

Currently architecting a dashcam-based computer vision pipeline that automates housing-code-violation detection at city scale. M.S. in Artificial Intelligence, University at Buffalo.

90% · Violation detection accuracy
98% · Deepfake detection accuracy
300K+ · Parcels indexed (production)
0.997 · ROC-AUC, β-VAE classifier

01 / About

Systems that ship, not just notebooks that run.

I'm an AI Engineer building computer vision, multimodal AI, and machine learning systems that solve real operational problems. My work spans research, internships, and one live production deployment — converting raw image, video, and structured data into reliable, scalable AI systems.

At Third Estate Analytics, I architected a computer vision pipeline that turns dashcam footage into structured housing-code-violation reports for municipal use — currently running at 90% detection accuracy across a 300,000-parcel dataset. Previously, as a Graduate Research Assistant at the University at Buffalo, I built 3D human reconstruction pipelines using SMPL-X and PyTorch, and engineered distributed simulation testbeds for multi-agent evaluation.

I care about the full lifecycle: model design, data pipelines, evaluation rigor, and the infrastructure (Docker, Kubernetes, FastAPI, Redis) that keeps a system running after the demo ends.

02 / Education

Academic foundation

University at Buffalo · Jan 2025 – May 2026

M.S. in Artificial Intelligence

  • Relevant coursework: Data-Intensive Computing, Computer Vision, Data Structures & Algorithms, Data Models & Query Languages

Mahindra University · 2020 – 2024

B.Tech in Artificial Intelligence

03 / Experience

Where the work happened

Third Estate Analytics · Jan 2026 – Present · Buffalo, NY

Artificial Intelligence Engineer

  • Architected a production CV pipeline converting GoPro dashcam footage into address-resolved housing-code-violation reports using GPMF telemetry, SSIM keyframe extraction, and Gemini Vision VLM classification.
  • Achieved 90% violation detection accuracy and 90% address-resolution accuracy across Erie County's 300,000-parcel dataset.
  • Designed a Human-in-the-Loop validation UI (Salesforce LWC/Apex) with an active learning loop for continuous model retraining.
  • Replaced manual municipal inspection workflows, shortening the end-to-end property inspection lifecycle.
Python · PyTorch · VLMs · Active Learning · Geospatial Data
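
To illustrate the SSIM keyframe extraction step named above, here is a minimal, dependency-free sketch. It keeps a frame only when its structural similarity to the last kept frame drops below a threshold. The single global SSIM window, the flattened grayscale frame representation, the function names, and the 0.9 threshold are all assumptions made for illustration; the production pipeline's actual windowing, threshold, and video decoding are not described here.

```python
from statistics import fmean

# Stabilizing constants from the standard SSIM definition for 8-bit images.
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def global_ssim(a, b):
    """SSIM over the whole frame treated as one window (a simplification)."""
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = fmean((x - mu_a) ** 2 for x in a)
    var_b = fmean((y - mu_b) ** 2 for y in b)
    cov = fmean((x - mu_a) * (y - mu_b) for x, y in zip(a, b))
    numerator = (2 * mu_a * mu_b + C1) * (2 * cov + C2)
    denominator = (mu_a ** 2 + mu_b ** 2 + C1) * (var_a + var_b + C2)
    return numerator / denominator


def select_keyframes(frames, threshold=0.9):
    """Return indices of frames that differ enough from the last kept frame.

    `frames` is a list of equally sized, flattened grayscale frames
    (pixel values 0-255). The threshold is a hypothetical default.
    """
    if not frames:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(frames)):
        if global_ssim(frames[kept[-1]], frames[i]) < threshold:
            kept.append(i)
    return kept
```

Comparing against the last kept frame, rather than the immediately preceding one, prevents slow drift from going unnoticed when consecutive frames each change only slightly.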

University at Buffalo · Aug 2025 – Dec 2025 · Buffalo, NY

Graduate Research Assistant

  • Reconstructed 3D human body meshes from monocular video using SMPL-X, with a CPU-optimized rendering pipeline (PyTorch, Trimesh, Pyrender).
  • Built a Gradio/Hugging Face Spaces visualization tool comparing motion sequence outputs across model variants.
  • Engineered distributed simulation testbeds (Docker, Redis) for evaluating multi-agent systems at scale.
PyTorch · SMPL-X · Docker · Redis

TresVista · Jan 2024 – Aug 2024 · Bengaluru

AI Infrastructure Engineering Intern

  • Automated financial ETL pipelines (Python, SQL, VBA, Spark/Hadoop) across AWS and GCP.
  • Built Power BI / Tableau dashboards that cut manual reporting effort by 50%.
  • Implemented scalable data versioning pipelines, validating security isolation across staging cycles for multi-client deep learning models.
SQL · Spark · AWS · GCP

ZS Associates · Jan 2023 – Dec 2023 · Pune

Machine Learning Engineering Intern

  • Developed LSTM time-series forecasting models, improving macro-precision by 18.4% over the baseline.
  • Built scalable healthcare data pipelines (SQL, Spark, Hadoop) with governance and secure VM workflows.
TensorFlow · PyTorch · Forecasting
04 / Projects

Technical work, in production and in research

🎬 Multimodal Movie Genre Classification

Live Demo

A multimodal deep learning system fusing DistilBERT text embeddings with ResNet-18 visual features to predict movie genres from posters and plot overviews, deployed via an interactive web interface.

  • Dual-branch late-fusion architecture aligning text and image latent features
  • Addressed class imbalance with weighted BCE loss and label co-occurrence analysis
  • Evaluated with macro/micro F1, confusion matrices, and threshold tuning
PyTorch · DistilBERT · ResNet-18 · Multimodal
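
The weighted BCE loss mentioned above can be sketched in plain Python as follows. Multi-label genre prediction treats each genre as its own binary problem, so rare genres can be upweighted on their positive examples. The weighting rule used here (negatives divided by positives per class, similar in spirit to the `pos_weight` argument of PyTorch's `BCEWithLogitsLoss`) is an assumption, as are the function names; the project's actual weighting scheme is not stated.

```python
import math


def pos_weights(labels):
    """Per-class weight = #negatives / #positives (an assumed heuristic).

    `labels` is a list of multi-hot rows, one row per sample.
    """
    n = len(labels)
    weights = []
    for c in range(len(labels[0])):
        pos = sum(row[c] for row in labels)
        weights.append((n - pos) / pos if pos else 1.0)
    return weights


def weighted_bce(logits, targets, weights):
    """Mean weighted binary cross-entropy over all (sample, class) pairs."""
    total, count = 0.0, 0
    for z_row, y_row in zip(logits, targets):
        for z, y, w in zip(z_row, y_row, weights):
            # Numerically stable log(sigmoid(z)).
            if z >= 0:
                log_p = -math.log1p(math.exp(-z))
            else:
                log_p = z - math.log1p(math.exp(z))
            # log(1 - sigmoid(z)) equals log(sigmoid(z)) - z.
            log_not_p = log_p - z
            # Only the positive term is scaled by the class weight.
            total += -(w * y * log_p + (1 - y) * log_not_p)
            count += 1
    return total / count
```

With this scheme a missed positive for a rare genre costs more than a missed positive for a common one, which pushes the classifier away from always predicting the majority genres.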

🧠 Deepfake Detection — β-Variational Autoencoder

98% Accuracy

A hybrid deepfake detection framework combining unsupervised representation learning with supervised latent-space classification — detects synthetic faces by modeling structured latent distributions rather than surface-level pixel artifacts.

  • β-VAE (β=4) with 128-dimensional latent space, trained on FFHQ real faces vs. Stable-Diffusion-generated fakes
  • KL-divergence anomaly detection combined with logistic regression on standardized embeddings
  • 98% classification accuracy, 0.997 ROC-AUC, t-SNE-validated class separability
PyTorch · VAEs · Generative AI · AI Security
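
As a small worked example of the KL-divergence term referred to above: for a diagonal Gaussian encoder, the KL divergence to a standard normal prior has a closed form, and the β-VAE objective scales it by β. The sketch below uses that closed form as a per-sample anomaly score. The function names and the thresholding rule are hypothetical; only β = 4 comes from the project description, and the real model computes these quantities on 128-dimensional latents in PyTorch.

```python
import math

BETA = 4.0  # value stated in the project description


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one sample."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, logvar)
    )


def beta_vae_loss(recon_error, mu, logvar, beta=BETA):
    """Reconstruction error plus the beta-weighted KL term."""
    return recon_error + beta * kl_to_standard_normal(mu, logvar)


def is_anomalous(mu, logvar, threshold):
    """Flag a sample whose latent posterior sits far from the prior.

    The threshold is a hypothetical value that would be tuned on
    held-out real images.
    """
    return kl_to_standard_normal(mu, logvar) > threshold
```

A posterior that matches the prior exactly scores 0, so larger values indicate inputs the encoder maps to unusual regions of latent space.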

🤖 AI Agent Microservice Platform

Distributed Systems

A containerized, cloud-native runtime orchestrating autonomous AI agents — a centralized FastAPI orchestrator coordinates task routing and execution across independent agent microservices via a Redis message queue.

  • Simulates 50+ concurrent agents with health checks, retry logic, and structured JSON logging
  • Modular architecture for rapid integration of new agent types
  • Demonstrates cloud-native patterns used in modern agentic AI platforms
FastAPI · Docker · Redis · Multi-Agent Systems
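
The routing-with-retries pattern described above can be sketched with the standard library alone. Here an in-process `asyncio.Queue` stands in for the Redis message queue and plain coroutines stand in for agent microservices, so this is an illustration of the pattern rather than the platform's code. The retry budget, worker count, task shape, and log fields are assumptions.

```python
import asyncio
import json
import logging

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("orchestrator")

MAX_RETRIES = 3  # hypothetical retry budget


async def run_with_retry(agent, task, max_retries=MAX_RETRIES):
    """Call an agent coroutine, retrying on failure and logging JSON events."""
    for attempt in range(1, max_retries + 1):
        try:
            result = await agent(task)
            log.info(json.dumps(
                {"task": task["id"], "status": "ok", "attempt": attempt}))
            return result
        except Exception as exc:
            log.info(json.dumps(
                {"task": task["id"], "status": "failed_attempt",
                 "attempt": attempt, "error": str(exc)}))
    return None  # retries exhausted


async def worker(queue, agent, results):
    """Pull tasks from the queue until cancelled."""
    while True:
        task = await queue.get()
        try:
            results[task["id"]] = await run_with_retry(agent, task)
        finally:
            queue.task_done()


async def orchestrate(tasks, agent, n_workers=4):
    """Fan tasks out to workers and collect results keyed by task id."""
    queue = asyncio.Queue()  # stand-in for the Redis queue
    results = {}
    for task in tasks:
        queue.put_nowait(task)
    workers = [asyncio.create_task(worker(queue, agent, results))
               for _ in range(n_workers)]
    await queue.join()  # wait until every task has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Calling `asyncio.run(orchestrate(tasks, agent))` with a list of `{"id": ..., "payload": ...}` dictionaries returns a mapping from task id to result, with `None` for tasks that exhausted their retries.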

🧍 3D Human Mesh Reconstruction

Live Demo

A research pipeline reconstructing full 3D human body meshes from monocular video using SMPL-X parametric models, with a CPU-optimized rendering pipeline and a Gradio-based comparison interface.

  • PyTorch tensor pipeline integrated with Trimesh mesh manipulation and Pyrender physically-based rendering
  • Converts raw NPZ pose/shape parameters into frame-wise rendered, vertex-normalized mesh outputs
  • Vectorized batch processing to minimize memory overhead without GPU dependency
PyTorch · SMPL-X · Pyrender · Gradio

👁 Partial Face Recognition Under Occlusion

A classical computer vision pipeline recognizing faces under heavy occlusion — eyes-and-nose-only inputs simulating masks, low-quality footage, or cropped images.

  • Fused HOG, SIFT, LBP, and color histogram features with PCA dimensionality reduction
  • SVM classification combined with cosine/Euclidean similarity matching
  • Ensemble feature-fusion strategies improved robustness under rotation, scale, and lighting variation
OpenCV · SVM · Feature Engineering

🛒 Role-Based E-Commerce Management System

A Java desktop application managing multi-role e-commerce workflows — Admin, Customer Support, Warehouse, Delivery, Supplier, and Fraud Analyst — with role-specific dashboards and a normalized MySQL backend.

Java · Swing · MySQL · JDBC

🛒 Scalable E-Commerce Data Management System

Collaborative · Team Project

A normalized (3NF) relational data system modeling enterprise e-commerce workflows — customers, products, orders, regional hierarchies — built as part of a Data Models & Query Languages course project.

  • Contributed schema design, SQL query optimization, and Python-based data ingestion/validation tooling
SQL · SQLite3 · Data Modeling

Team project — repository owned and maintained by a teammate.
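
For a sense of what a 3NF layout for these entities might look like, here is a small sketch using Python's built-in `sqlite3` module. Every table, column, and sample row is hypothetical and chosen only to illustrate the normalization idea (regions, customers, products, orders, and order line items each in their own table, linked by foreign keys); it is not the team's schema.

```python
import sqlite3

# Hypothetical 3NF schema: each fact is stored once and referenced by key.
SCHEMA = """
CREATE TABLE region (
    region_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    region_id   INTEGER NOT NULL REFERENCES region(region_id)
);
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    unit_price REAL NOT NULL CHECK (unit_price >= 0)
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
);
CREATE TABLE order_item (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES product(product_id),
    quantity   INTEGER NOT NULL CHECK (quantity > 0),
    PRIMARY KEY (order_id, product_id)
);
"""


def build_demo_db():
    """Create an in-memory database with a few illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO region VALUES (1, 'Northeast')")
    conn.execute("INSERT INTO customer VALUES (1, 'Ada', 1)")
    conn.executemany("INSERT INTO product VALUES (?, ?, ?)",
                     [(1, "Keyboard", 40.0), (2, "Mouse", 15.0)])
    conn.execute("INSERT INTO orders VALUES (1, 1)")
    conn.executemany("INSERT INTO order_item VALUES (?, ?, ?)",
                     [(1, 1, 1), (1, 2, 2)])
    return conn


def revenue_by_region(conn):
    """Join across the normalized tables to total revenue per region."""
    return conn.execute("""
        SELECT r.name, SUM(oi.quantity * p.unit_price)
        FROM order_item AS oi
        JOIN orders   AS o ON o.order_id = oi.order_id
        JOIN customer AS c ON c.customer_id = o.customer_id
        JOIN region   AS r ON r.region_id = c.region_id
        JOIN product  AS p ON p.product_id = oi.product_id
        GROUP BY r.name
        ORDER BY r.name
    """).fetchall()
```

Because prices live only in `product` and region names only in `region`, an update touches one row, and the foreign keys reject rows that point at a missing parent.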

05 / Skills

Tools of the trade

Machine Learning & AI

PyTorch · TensorFlow · Scikit-learn · Computer Vision · NLP / Transformers · Generative Models (VAE) · Multimodal AI · RAG

Infra & MLOps

Docker · Kubernetes · FastAPI · Redis · AWS · GCP · Airflow

Data Engineering

Spark · Hadoop · SQL · PostgreSQL · MongoDB · ETL Pipelines

Languages & Tools

Python · Java · SQL · Git · Jupyter · VS Code