Huzaifa Ahmad Gill

🏗️

★ Flagship

Multi-Tenant SaaS Platform

📦 api-gateway
📦 auth-service
📦 subscription-service
📦 billing-service
📦 usage-service
📦 proto
📦 saas-services-infra
📦 saas-continious-delivery

⚡ In Development

Production-grade multi-tenant SaaS platform built as a true multi-repo microservices system — each service lives in its own dedicated repository with independent CI/CD, versioning, and deployment lifecycle. Services communicate via gRPC using typed contracts from a central proto repo (Buf Schema Registry) as the single source of truth. The platform is backed by a full AWS environment provisioned in Terraform and delivered via ArgoCD GitOps across dev, staging, and prod.

api-gateway Go 1.25

⚡ go-kit framework · port :9000

🔐 Keycloak JWKS JWT validation

🚦 Redis token-bucket rate limiting

🔌 Sony gobreaker circuit breaker

🔀 GraphQL proxy → auth, subscription

🔀 REST proxy → billing

📡 OpenTelemetry OTLP traces + logs

📖 Swagger UI · health endpoints
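The token-bucket rate limiting listed above can be illustrated with a small in-memory sketch. This is not the gateway's code: the real service is written in Go and keeps bucket state in Redis, while the class name `TokenBucket`, its parameters, and the injectable `now` timestamps here are assumptions made for illustration.

```python
import time
from typing import Optional


class TokenBucket:
    """In-memory token bucket (hypothetical; the gateway stores this state in Redis)."""

    def __init__(self, capacity: float, refill_rate: float,
                 now: Optional[float] = None) -> None:
        self.capacity = capacity        # maximum burst size, in tokens
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity          # start with a full bucket
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None, cost: float = 1.0) -> bool:
        """Refill for the elapsed time, then try to spend `cost` tokens."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per tenant or API key is a common choice (an assumption here).
bucket = TokenBucket(capacity=2, refill_rate=1.0, now=0.0)
decisions = [bucket.allow(now=0.0) for _ in range(3)]  # burst of three requests
```

A burst larger than `capacity` is rejected until enough time has passed for the bucket to refill.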

auth-service Node.js 22

Express 5 · Apollo GraphQL · port :8080

Registration, login, token refresh, logout

PostgreSQL via Prisma ORM

Prometheus metrics · OTel + Winston

Custom JWT (dev) / Keycloak OIDC (prod)

subscription-service NestJS 11

GraphQL API + gRPC server :50051

Kafka publisher (Avro, Glue Schema Registry)

PostgreSQL via TypeORM · circuit breaker

OTel + Winston observability

billing-service Java 21

Spring Boot 4 · port :8082

Stripe payment processing + invoicing

Kafka consumer (billing.usage-charge, Avro)

gRPC client → subscription :50051

Redis Bucket4j rate limiting · JPA/Hibernate

usage-service Python 3.11

Airflow 2.10 · 3 DAGs (ingest → aggregate → embed)

ChromaDB + OpenAI embeddings RAG layer

pybreaker circuit breaker · tenacity retry
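The circuit-breaker pattern used across these services (gobreaker in the gateway, pybreaker in usage-service) can be sketched with the standard library alone. The class below is a hypothetical illustration of the closed / open / half-open state machine, not the API of either library; `fail_max`, `reset_timeout`, and the injectable `clock` are assumed names.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(self, fail_max: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.fail_max = fail_max            # consecutive failures before opening
        self.reset_timeout = reset_timeout  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"
        return "open"

    def call(self, func: Callable[..., T], *args, **kwargs) -> T:
        if self.state == "open":
            raise CircuitOpenError("circuit open; call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call in half-open, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.fail_max:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Retries (tenacity in the real service) would normally wrap the call outside the breaker, so that a rejected call fails fast instead of being retried against a dependency that is already down.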

🔗 proto repo — Buf Schema Registry

Single source of truth for all gRPC contracts

subscription/v1/subscription.proto

Buf linting + validation rules (buf.yaml)

Consumed by NestJS (subscription-service) and Spring Boot (billing-service)

Migration guide · registry comparison docs · per-language usage examples

☁️ Terraform Infra (saas-services-infra)

modules/eks — EKS 1.32, IRSA, AWS Verified Access

modules/rds — isolated PostgreSQL per service

modules/msk — Amazon MSK + Glue Schema Registry

modules/elk / grafana / otel — full observability

modules/k8s-and-helm — ArgoCD, Keycloak, Airflow

🤖 Atlantis — PR-driven Terraform: autoplan on *.tf changes, gated atlantis apply, per-env workspaces (dev / test / stage / prod) for root, api-gateway, auth, subscription, billing

break-glass: workflow_dispatch GitHub Action fallback

🚀 saas-continious-delivery — GitOps Layer

charts/* — per-service Helm charts rendering argoproj.io/Rollout (Istio canary) with Prometheus AnalysisTemplate gates + KEDA ScaledObject

saas-chart/ — master infra chart (Gateway API CRDs, Istio, Keycloak, Airflow, observability)

infra/overlays/ — Kustomize dev / staging / prod overlays consumed by ArgoCD

🧬 CUE schemas — schemas/service/values.cue is the source of truth; generates strict values.schema.json validated by Helm at install time

🌐 Crossplane — AppDatabase XRD + Composition provisions Postgres DBs / roles / grants inside TF-provisioned RDS via provider-sql

Argo Rollouts + KEDA + Istio + Crossplane reconciled by ArgoCD across dev-local / staging / prod

graph TD
  A([External Traffic]) --> B[Istio Gateway\nmTLS · PeerAuthentication]
  B --> C[api-gateway :9000\nGo · go-kit · JWT · Redis rate limit · gobreaker]
  C -->|GraphQL| D[auth-service :8080\nNode.js · Express 5 · Prisma · PostgreSQL]
  C -->|GraphQL| E[subscription-service :8081\nNestJS 11 · TypeORM · Kafka publisher]
  C -->|REST| F[billing-service :8082\nSpring Boot 4 · Stripe · Kafka consumer]
  E -->|gRPC :50051| F
  F --> G[usage-service :8083\nPython · Airflow 2.10 · ChromaDB RAG]
  H[proto repo\nBuf Registry] -. contracts .-> E & F
  K[Atlantis\nPR-driven Terraform] -. provisions .-> L[(AWS\nEKS · RDS · MSK · ECR)]
  M[CUE schemas\nvalues.schema.json] -. validates .-> N[Helm Charts]
  O[Crossplane\nAppDatabase XRD] -. provisions .-> P[(Postgres\nDBs · roles · grants)]
  I[ArgoCD GitOps\nArgo Rollouts · KEDA · Istio] -. deploys .-> C & D & E & F & G
  I -. reconciles .-> O
  J[OTel DaemonSet] -. traces .-> C & D & E & F & G

Go · go-kit Node.js · Express 5 NestJS 11 Spring Boot 4 Python 3.11 gRPC Protobuf · Buf Registry GraphQL · Apollo Kafka / MSK (Avro) Stripe Istio mTLS Keycloak OIDC Sony gobreaker Redis rate limiting ArgoCD GitOps Argo Rollouts KEDA CUE Schemas Crossplane Atlantis Helm OCI / ECR EKS 1.32 Terraform RDS PostgreSQL Airflow 2.10 ChromaDB OpenTelemetry AWS Verified Access Glue Schema Registry

🎬

⚙ MLOps Flagship

Kinetics MLOps Platform — HyperPod on EKS

📦 kinetics-pipeline
📦 Kinetics-Continious-Delivery

🧠 ML Platform

End-to-end MLOps platform for training a video action-recognition model (PyTorch CNN-LSTM, ImageNet-pretrained ResNet backbone → LSTM) on Kinetics-400, running on SageMaker HyperPod orchestrated by EKS — with cost controls engineered in from day one. GPUs default to scale-to-zero: Karpenter only provisions a Spot GPU node when a job pod is pending, then consolidates back to zero. Distributed DDP training (AMP bf16 + torch.compile + transfer learning) checkpoints to S3 for auto-resume after Spot interruption, with full MLflow tracking, a versioned Model Registry, and dual serving paths — all delivered via ArgoCD GitOps from a dedicated delivery repo.

$0 idle-GPU cost

~−60% Spot savings

12 Terraform modules

k8s 1.34 · standard support

2 repos · infra + GitOps CD

training PyTorch

🎞️ CNN-LSTM · ResNet backbone → LSTM

⚡ Distributed DDP · AMP bf16 · torch.compile

🧊 Freeze→unfreeze transfer learning

💾 S3 checkpoints · auto-resume latest.pt

🚀 HyperPodPyTorchJob · torchrun
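The checkpoint and auto-resume behaviour listed above can be sketched as follows. This is a simplified stand-in under stated assumptions: a local directory replaces the S3 checkpoint bucket, JSON replaces `torch.save`, and `SpotInterruption`, `save_checkpoint`, `load_latest`, and `train` are hypothetical names. Only the `latest.pt` filename comes from the project description.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


class SpotInterruption(Exception):
    """Simulates the node being reclaimed mid-training."""


def save_checkpoint(ckpt_dir: Path, epoch: int, state: dict) -> None:
    """Write latest.pt via a temp file + rename so a crash never leaves a torn file."""
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=ckpt_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump({"epoch": epoch, "state": state}, tmp)
    os.replace(tmp_name, ckpt_dir / "latest.pt")


def load_latest(ckpt_dir: Path) -> Optional[dict]:
    """Return the last saved checkpoint, or None on a fresh start."""
    path = ckpt_dir / "latest.pt"
    if not path.exists():
        return None
    with path.open() as fh:
        return json.load(fh)


def train(ckpt_dir: Path, total_epochs: int,
          interrupt_after: Optional[int] = None) -> dict:
    """Resume from latest.pt if present; optionally simulate an interruption."""
    ckpt = load_latest(ckpt_dir)
    start_epoch = ckpt["epoch"] + 1 if ckpt else 0
    state = ckpt["state"] if ckpt else {"steps": 0}
    for epoch in range(start_epoch, total_epochs):
        state["steps"] += 1  # placeholder for one real training epoch
        save_checkpoint(ckpt_dir, epoch, state)
        if interrupt_after is not None and epoch == interrupt_after:
            raise SpotInterruption(f"interrupted after epoch {epoch}")
    return {"start_epoch": start_epoch, "state": state}
```

Because the job always looks for `latest.pt` first, a rescheduled pod picks up at the epoch after the last completed one with no manual step.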

cost controls 💸

Karpenter scale-to-zero · Spot-first GPUs

Auto-stop Lambda (autoscaling-off fallback)

Budgets + anomaly detection · single NAT

S3 lifecycle · DCGM GPU-util visibility

storage / data

FSx for Lustre lazy-loads from S3 data bucket

DVC versions manifests (not raw video)

MLOps + serving SageMaker

SageMaker-managed MLflow · params · top1/top5

Model Registry · approval-gated versions

SageMaker endpoint + self-hosted FastAPI

Shared Predictor core · Prometheus drift metrics

GitOps delivery

ArgoCD app-of-apps + ApplicationSet

Training job on manual sync (no accidental GPU run)

☁️ Terraform Infra (kinetics-pipeline)

modules/eks · hyperpod — EKS 1.34 + HyperPod operator add-on

modules/karpenter — Spot GPU + SQS interruption queue

modules/storage — S3 + FSx Lustre lifecycle

modules/mlflow · ecr · cost — tracking, registry, budgets

🔐 EKS Pod Identity (not IRSA) · access entries · AWS Client VPN (SAML / IAM Identity Center)

🤖 CI/CD — GitHub Actions, keyless OIDC

3 least-privilege OIDC roles · no static AWS keys

Terraform fmt / validate / tflint / plan-on-PR / apply-on-main

buildx amd64 → ECR: training + inference images

Cross-repo GitOps bump via GitHub App token → delivery repo

🧬 CUE schemas strict-vet every rendered Helm + gitops manifest

graph TD
  Dev[GitHub Actions\nkeyless OIDC] --> ECR[(ECR\ntraining + inference)]
  ECR --> CD[Kinetics-CD repo\nGitOps source of truth]
  CD --> Argo[ArgoCD\napp-of-apps]
  Argo --> EKS[EKS 1.34\nPod Identity]
  Argo -.manual sync.-> Job[HyperPodPyTorchJob\ntorchrun DDP]
  Karp[Karpenter\nscale-to-zero] -->|pending pod| GPU[g5 GPU node\nSpot]
  Job --> GPU
  S3D[(S3 data)] --> FSx[FSx Lustre\nlazy-load] --> GPU
  GPU --> Train[CNN-LSTM\nAMP · torch.compile]
  Train --> CKPT[(S3 checkpoints\nauto-resume)]
  Train --> ML[SageMaker MLflow\ntop1/top5]
  CKPT --> Reg[Model Registry\napproval gate]
  Reg --> EP[SageMaker Endpoint]
  Reg --> API[FastAPI serving\nPrometheus]
  DCGM[DCGM · Prometheus\nGrafana · OTel] -.observes.-> GPU
  DCGM -.observes.-> API

PyTorch · CNN-LSTM DDP · AMP bf16 torch.compile Kinetics-400 SageMaker HyperPod EKS 1.34 Karpenter (Spot · scale-to-zero) FSx for Lustre MLflow Model Registry SageMaker Endpoint FastAPI Serving DVC Terraform ArgoCD GitOps Helm CUE Schemas EKS Pod Identity AWS Client VPN (SAML) DCGM / Prometheus OpenTelemetry GitHub Actions OIDC ECR

🔧

FlowForge

AI

AI-powered project planning platform. Feed it a client proposal and a LangGraph multi-agent pipeline produces project timelines, Gantt charts, and 6 types of technical diagrams — with self-healing retry loops that automatically fix invalid outputs before delivery. Built on HuggingFace Inference API for diagram generation and Together AI for prompt optimisation, with PostgreSQL session tracking and full output versioning.

graph TD
  A([Client Proposal]) --> B[PromptOptimizer\nLangChain · Together AI]
  B --> C[TimeAgent\nMilestones · Gantt]
  C --> D{TimeValidator\n↺ 3 retries}
  D -->|invalid| C
  D -->|valid| E[PlanAgent\nTask breakdown · Risks]
  E --> F[ImageGenerator\nMermaid × 6 types\nHuggingFace API]
  F --> G{Validator\n↺ 3 retries}
  G -->|invalid| F
  G -->|valid| H([FastAPI Response\nPostgreSQL Session Store])

LangGraph LangChain HuggingFace API Together AI FastAPI PostgreSQL Alembic Mermaid Docker Compose Self-healing Agents
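The self-healing retry loops in FlowForge follow a generate, validate, retry shape. A minimal sketch of that shape is below. It is plain Python rather than a LangGraph graph; `generate_with_retries` and the toy generator and validator are hypothetical, and only the limit of 3 retries comes from the diagram.

```python
from typing import Callable, Optional, Tuple


def generate_with_retries(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_retries: int = 3,
) -> Tuple[str, int]:
    """Generate, validate, and retry with the validation error as feedback.

    generate(feedback) returns a candidate output; validate(candidate)
    returns None when it is valid, otherwise an error message.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        candidate = generate(feedback)
        feedback = validate(candidate)
        if feedback is None:
            return candidate, attempt
    raise ValueError(f"still invalid after {max_retries} retries: {feedback}")


def toy_validate(diagram: str) -> Optional[str]:
    """Toy check standing in for a real Mermaid validator."""
    return None if diagram.startswith("graph ") else "diagram must start with 'graph '"


def toy_generate(feedback: Optional[str]) -> str:
    """Toy generator standing in for an LLM call; fixes itself once it sees feedback."""
    return "A --> B" if feedback is None else "graph TD\n  A --> B"


diagram, attempts = generate_with_retries(toy_generate, toy_validate)
```

Feeding the validator's message back into the next generation call is what lets the loop repair an output rather than simply re-roll it.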

🕸️

Multi-Agent RAG System

AI

Production-grade multi-agent RAG system orchestrated by LangGraph. A Planner node dynamically routes queries across three specialised agents — RAG (LLaMA 3.3 70B via Groq + ChromaDB), Web Search (Tavily + Wikipedia + GPT-4o-mini), and Memory (SQLite session history) — then a ReAct-style Replan node decides if more tool calls are needed before a final Claude Sonnet Aggregator synthesises the answer. Each agent communicates with its own MCP microservice over HTTP. Vue 3 frontend, FastAPI backend, Docker Compose, and LangSmith tracing.

graph TD
  A([User Query]) --> B[Chat Service\nQuery Contextualization]
  B --> C[Planner Node\nClaude Sonnet]
  C --> D[Memory Agent\nSQLite · MCP :8003]
  C --> E[RAG Agent\nLLaMA 3.3 70B · ChromaDB · MCP :8001]
  C --> F[Web Agent\nTavily · Wikipedia · MCP :8002]
  D & E & F --> G{Replan Node\nReAct — done?}
  G -->|more tools needed| C
  G -->|done| H[Aggregator\nClaude Sonnet]
  H --> I([FastAPI :8000\nVue 3 · Vite])
  J[LangSmith] -. traces .-> C & G & H

LangGraph Claude Sonnet LLaMA 3.3 70B GPT-4o-mini MCP ChromaDB FastAPI Vue 3 + Vite Groq Tavily SQLite Memory LangSmith Docker Compose

☁️

Cloud-Based RAG Architecture

Done

Serverless RAG pipeline on AWS — ingests documents via Lambda, stores embeddings in a vector-capable DynamoDB layer, and queries through Bedrock foundation models behind an API Gateway with Terraform-managed infrastructure.

graph LR
  A([Client]) --> B[API Gateway]
  B --> C[Lambda\nProcessor]
  C --> D[(DynamoDB\nVectors)]
  C --> E[Bedrock LLM]
  E --> B
  F[Terraform IaC] -. provisions .-> B

Terraform, AWS Lambda, API Gateway, Bedrock, DynamoDB

🧠

MCP Code Advisor Web App

AI

Full-stack AI code advisor using the Model Context Protocol. Deployed on EKS behind Route 53 with TLS via Certbot, containerised with Docker, and provisioned via Terraform + Ansible with GitHub Actions CI/CD.

graph LR
  A([React UI]) --> B[FastAPI\nMCP Server]
  B --> C[LLM\nCode Advisor]
  D[GitHub Actions] --> E[Docker → ECR]
  E --> F[EKS Cluster]
  F --> B
  G[Route 53\n+ Certbot TLS] --> F
  H[Terraform\n+ Ansible] -. provisions .-> F

FastAPI, React, EKS, ECR, Terraform, Ansible, GitHub Actions

🤖

Agentic RAG (Multi-LLM)

AI

Multi-agent RAG system orchestrating LLaMA 3.1, LLaMA 3.2, and GPT-4.1 Mini through LangChain. Agents autonomously decide which model to call per sub-task, using ChromaDB as the vector store.

graph TD
  A([Query]) --> B[LangChain\nRouter Agent]
  B --> C[LLaMA 3.1]
  B --> D[LLaMA 3.2]
  B --> E[GPT-4.1 Mini]
  C & D & E --> F[(ChromaDB\nVector Store)]
  F --> G[FastAPI\nResponse]

FastAPI, LangChain, LLaMA 3.x, GPT-4.1 Mini, ChromaDB, Docker

🔍

Multimodal RAG

AI

Extends standard RAG to handle text, images, and structured data together. Uses multimodal embeddings via Hugging Face and OpenAI API for cross-modal retrieval and generation.

graph LR
  A[Text] --> D
  B[Images] --> D
  C[Structured\nData] --> D
  D[Multimodal\nEmbeddings\nHugging Face] --> E[(Vector\nIndex)]
  E --> F[Cross-modal\nRetrieval]
  F --> G[OpenAI API\nGeneration]

Python, LangChain, Hugging Face, OpenAI API, Multimodal Embeddings

🛒

Voice AI Web Application

Done

Voice-based AI platform with real-time WebSockets, Whisper speech-to-text, Meta Llama integration, XTTS v2 text-to-speech, and an async Django backend. Deployed on Kubernetes with Helm, PgBouncer connection pooling, and GitHub Actions pipelines.

graph TD
  A([Next.js]) --> B[Django Backend\ngRPC + WebSockets]
  B --> C[RabbitMQ]
  B --> D[(PostgreSQL\nPgBouncer)]
  B --> E[(Redis)]
  F[Whisper AI\nMeta Llama] --> B
  G[Kubernetes\n+ Helm] -. orchestrates .-> B
  h[XTTS v2 Encoder] --> B

Django, Next.js, gRPC, Whisper AI, XTTS v2 Encoder, Meta Llama, RabbitMQ, Redis, PostgreSQL, Kubernetes, Helm

₿

Cryptocurrency Web Application

Done

High-performance crypto trading platform in Go with gRPC microservices. AWS-native deployment on EKS with RDS + ElastiCache, GitOps via ArgoCD, and full IaC through Terraform + Helm charts.

graph LR
  A([React]) --> B[Go Gin API]
  B --> C[gRPC\nMicroservices]
  C --> D[(RDS\nPostgreSQL)]
  C --> E[(ElastiCache\nRedis)]
  F[EKS + Terraform] -. hosts .-> B
  G[ArgoCD] -. GitOps .-> F

Go, Gin, gRPC, React, EKS, RDS, ElastiCache, ArgoCD, Terraform

🏦

Online Banking System

Done

Enterprise-grade banking platform with Spring Boot microservices, event-driven architecture via Kafka, multi-database strategy (PostgreSQL + MySQL + MongoDB), and complete observability with Grafana, Prometheus, Loki & Tempo.

graph TD
  A([Next.js]) --> B[Keycloak\nAuth]
  B --> C[Spring Boot\nMicroservices]
  C --> D[Kafka\nEvent Bus]
  D --> C
  C --> E[(PostgreSQL\nMySQL\nMongoDB)]
  C --> F[(Redis)]
  G[Grafana\nPrometheus\nLoki] -. observes .-> C
  H[Jenkins\nKubernetes] -. deploys .-> C

Spring Boot, Next.js, Kafka, Keycloak, Redis, Grafana, Prometheus, Jenkins, Kubernetes

🚚

Logistics Supply-Chain Analyzer

Done

Supply-chain logistics engine on ASP.NET Core (.NET 10) + Neo4j, built with Clean Architecture & CQRS (MediatR + FluentValidation). Models warehouses, weighted routes, and shipments as a graph, exposing a JWT-secured, rate-limited REST API for weighted shortest-path analytics, an enforced shipment lifecycle with memory-safe streaming, route cost/duration estimation, and delivery-risk scoring. Redis caching, Kafka (Redpanda / Confluent Schema Registry) + RabbitMQ messaging, RFC 7807 errors, and Testcontainers integration tests.

graph TD
  A([Client]) --> B[ASP.NET Core API\n.NET 10 · JWT · rate limit]
  B --> C[MediatR\nCQRS · FluentValidation]
  C --> D[Application\nCommands / Queries]
  D --> E[Domain\nWarehouses · Routes · Shipments]
  D --> F[Infrastructure\nNeo4j Cypher repos]
  F --> G[(Neo4j\nGraph DB)]
  B --> H[(Redis\ncache)]
  B --> I[Kafka / Redpanda\nSchema Registry]
  B --> J[RabbitMQ]
  G --> K[Shortest Path\nRisk Scoring]

.NET 10, ASP.NET Core, Neo4j, Clean Architecture, CQRS · MediatR, FluentValidation, Redis, Kafka / Redpanda, RabbitMQ, JWT Auth, Docker, Testcontainers
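The weighted shortest-path analytics in this project run inside Neo4j. As an in-memory illustration of the same idea, here is a sketch of Dijkstra's algorithm over a warehouse graph. It is not the project's Cypher or C# code; the function name, the `Graph` alias, and the warehouse IDs `W1` to `W3` are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # warehouse -> {neighbour: route cost}


def shortest_path(graph: Graph, source: str, target: str) -> Tuple[float, List[str]]:
    """Dijkstra's algorithm; returns (total cost, path) or (inf, []) if unreachable."""
    best: Dict[str, float] = {source: 0.0}
    previous: Dict[str, str] = {}
    queue: List[Tuple[float, str]] = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            break  # the first time the target is popped, its cost is final
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical routes: the direct W1 -> W3 leg costs more than going via W2.
routes: Graph = {"W1": {"W2": 4.0, "W3": 10.0}, "W2": {"W3": 5.0}, "W3": {}}
cost, path = shortest_path(routes, "W1", "W3")
```

The sketch assumes non-negative route weights, which Dijkstra's algorithm requires.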

⚡

ETL Pipeline Project

Done

Scalable ML data pipeline with Apache Airflow for ETL automation, MLflow experiment tracking, DVC data versioning, and AWS S3 storage. Pushgateway + Prometheus for full pipeline observability.

graph LR
  A([Data Source]) --> B[Apache Airflow\nJob Trigger]
  B --> C[scikit-learn\nML Pipeline]
  C --> D[(AWS S3)]
  C --> E[MLflow\nExperiment Tracking]
  D --> F[DVC\nVersioning]
  G[Pushgateway\nPrometheus] -. monitors .-> C

Python, Apache Airflow, MLflow, scikit-learn, AWS S3, DVC, Prometheus, Pushgateway

📡

Video Streaming & Event Pipeline

Done

End-to-end serverless video ingestion pipeline using AWS Kinesis Video Streams, Kinesis Data Streams, Lambda, Redshift analytics, and OpenSearch for indexing. FFmpeg + OpenCV for stream processing with X-Ray distributed tracing.

graph LR
  A([Camera /\nVideo Source]) --> B[OpenCV\nFFmpeg]
  B --> C[Kinesis Video\nStreams]
  C --> D[Lambda\nProcessor]
  D --> E[Kinesis Data\nStreams]
  E --> F[(Redshift)]
  E --> G[(OpenSearch)]
  H[X-Ray] -. traces .-> D

Python, OpenCV, FFmpeg, Kinesis, Lambda, Redshift, OpenSearch, X-Ray, Terraform

📈

Stock Price Forecasting (FYP)

AI

Final Year Project — LSTM-based deep learning model for stock price prediction. Flask web interface exposing real-time inference on live market data with NumPy/Pandas feature engineering pipelines.

graph LR
  A([Market Data]) --> B[Pandas\nFeature Engineering]
  B --> C[LSTM Model\nTensorFlow]
  C --> D[Flask API\nInference]
  D --> E([Web UI\nReal-time Charts])

Python, TensorFlow, LSTM, Flask, Pandas, NumPy

👁️

Object Detection Model

AI

CNN-based computer vision pipeline for real-time object detection. Built and trained from scratch with TensorFlow, exploring custom architecture design and transfer learning optimisations.

graph LR
  A([Image Input]) --> B[CNN Backbone\nTensorFlow]
  B --> C[Feature\nExtraction]
  C --> D[Classification\nHead]
  C --> E[Bounding Box\nRegression]
  D & E --> F([Detection\nOutput])

Python, TensorFlow, CNNs, Computer Vision

🎞️

Sequence-Based Neural Network

AI

Spatio-temporal video prediction model trained on the FlyingThings3D sample dataset. Fuses 8 modalities into a 14-channel tensor per frame — RGB stereo (left/right), disparity, disparity change, forward/backward optical flow, material, and motion segmentation — then learns to predict the next frame's disparity change from a 5-frame sequence. Two architectures are benchmarked head-to-head: a custom CNN (spatial) + LSTM (temporal) pipeline and TensorFlow's built-in ConvLSTM2D, with matplotlib visual comparison of predicted vs ground-truth disparity maps.

graph TD
  A([FlyingThings3D\nSampler.tar.gz]) --> B[Multi-modal Loader\n.pfm · .png · .pgm]
  B --> C[14-channel Tensor\nRGB L/R · disp · flow · motion]
  C --> D[Augmentation\nflip · rotate · brightness]
  D --> E[Sequence Builder\nlen=5 · stride=1]
  E --> F{Architecture}
  F -->|approach 1| G[CNN Encoder\n270×480 → 64-dim] --> H[LSTM\ntemporal]
  F -->|approach 2| I[ConvLSTM2D\nspatial + temporal]
  H & I --> J[Predicted\nDisparity Change]
  J --> K[Matplotlib\nVisual Comparison]

TensorFlow / Keras CNN + LSTM ConvLSTM2D FlyingThings3D Multi-modal Fusion Optical Flow Stereo Disparity OpenCV NumPy Matplotlib
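The sequence-building step (len=5, stride=1) can be sketched as a sliding window that pairs five consecutive frames with the frame that follows. The code below is an assumption-laden illustration in plain Python: `build_sequences` is a hypothetical name, the whole next frame is returned as the target (the project predicts only its disparity-change channel), and the per-modality channel counts in `ASSUMED_CHANNELS` are one plausible breakdown of the 14 channels, not confirmed by the project.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")

# Assumed channel layout: 8 modalities summing to 14 channels per frame.
ASSUMED_CHANNELS: Dict[str, int] = {
    "rgb_left": 3,
    "rgb_right": 3,
    "disparity": 1,
    "disparity_change": 1,
    "flow_forward": 2,
    "flow_backward": 2,
    "material": 1,
    "motion_segmentation": 1,
}


def build_sequences(frames: Sequence[Frame], seq_len: int = 5,
                    stride: int = 1) -> List[Tuple[List[Frame], Frame]]:
    """Slide a window over the frames: seq_len inputs, the next frame as target."""
    samples: List[Tuple[List[Frame], Frame]] = []
    # Stop early enough that every window still has a following frame.
    for start in range(0, len(frames) - seq_len, stride):
        window = list(frames[start:start + seq_len])
        target = frames[start + seq_len]
        samples.append((window, target))
    return samples


# Integers stand in for 14-channel frame tensors.
samples = build_sequences(list(range(8)))
```

With 8 frames this yields 3 samples, since each window of 5 needs a sixth frame to serve as its target.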

🎮

A3C Reinforcement Learning Agent

AI

Asynchronous Advantage Actor-Critic (A3C) agent trained on Gymnasium environments (the maintained successor to OpenAI Gym) using PyTorch. Runs multiple actor-learner workers in parallel, each pushing gradients asynchronously to a shared global network.

graph TD
  A[Global Network\nShared Weights] --> B[Worker 1\nActor-Critic]
  A --> C[Worker 2\nActor-Critic]
  A --> D[Worker N\nActor-Critic]
  B & C & D --> E[Gymnasium\nEnvironment]
  E --> B & C & D
  B & C & D -. async gradients .-> A

Python, PyTorch, A3C, Gymnasium, Reinforcement Learning
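The "advantage" in A3C is typically an n-step return minus the critic's value estimate. A minimal sketch of that calculation, assuming each worker collects a short rollout and bootstraps from the critic's estimate of the final state, is shown below. The function name and arguments are hypothetical, and the project's own PyTorch implementation may differ.

```python
from typing import List


def n_step_advantages(rewards: List[float], values: List[float],
                      bootstrap_value: float, gamma: float = 0.99) -> List[float]:
    """Discounted returns built backwards from a bootstrap value, minus V(s_t)."""
    returns: List[float] = []
    running = bootstrap_value  # critic's estimate for the state after the rollout
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return [ret - value for ret, value in zip(returns, values)]


# Two-step rollout ending in a terminal state (bootstrap value 0).
advantages = n_step_advantages(rewards=[1.0, 1.0], values=[0.5, 0.5],
                               bootstrap_value=0.0, gamma=0.5)
```

A positive advantage means the action did better than the critic expected, so the actor's policy gradient increases its probability.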
