Available for new opportunities

Huzaifa Ahmad Gill

I build robust, scalable systems — from intelligent AI pipelines and cloud-native microservices to full-stack web applications that solve real-world problems.

Who I Am

I'm a Computer Science graduate passionate about building systems that are not just functional — but architecturally sound, scalable, and intelligent. I love working at the intersection of cloud infrastructure, AI/ML, and backend engineering.

My work ranges from designing multi-LLM agentic pipelines and RAG architectures to deploying Kubernetes-orchestrated microservices on AWS with full observability stacks. I treat every project as an engineering problem worth solving properly.

Currently exploring distributed systems design patterns — particularly API Gateway federation, circuit breakers, and event-driven microservice architectures with gRPC and Protocol Buffers.

BSc (Hons) Computer Science
University of Hertfordshire
Oct 2022 – Aug 2025
🎓 Upper Second Class Honours (2:1)

Tech Stack

☁️ Cloud & Infrastructure
AWS Lambda, API Gateway, EKS, ECR, RDS, ElastiCache, Bedrock, DynamoDB, S3, Kinesis, Terraform, Route 53
🤖 AI / ML
LangChain, OpenAI API, LLaMA 3.x, ChromaDB, Hugging Face, TensorFlow, PyTorch, LSTM, RAG, Whisper AI, OpenCV
🐳 DevOps & Containers
Docker, Kubernetes, Helm, ArgoCD, GitHub Actions, Jenkins, Ansible, Prometheus, Grafana, Loki, OpenTelemetry
⚙️ Backend & APIs
Python, Go, Java, FastAPI, Django, Spring Boot, gRPC, Protobuf, REST, WebSockets, Apache Kafka, RabbitMQ
🗄️ Databases
PostgreSQL, MySQL, MongoDB, Redis, DynamoDB, PgBouncer, ElastiCache
🌐 Frontend
React, Next.js, TypeScript, HTML/CSS, Tailwind CSS

What I've Built

A collection of systems, pipelines, and applications — each solving a distinct engineering challenge.

🏗️
Multi-Tenant SaaS Platform
📦 api-gateway 📦 auth-service 📦 subscription-service 📦 billing-service 📦 usage-service 📦 proto 📦 saas-services-infra 📦 saas-continious-delivery
⚡ In Development

Production-grade multi-tenant SaaS platform built as a true multi-repo microservices system — each service lives in its own dedicated repository with independent CI/CD, versioning, and deployment lifecycle. Services communicate via gRPC using typed contracts from a central proto repo (Buf Schema Registry) as the single source of truth. The platform is backed by a full AWS environment provisioned in Terraform and delivered via ArgoCD GitOps across dev, staging, and prod.

api-gateway Go 1.25
⚡ go-kit framework · port :9000
🔐 Keycloak JWKS JWT validation
🚦 Redis token-bucket rate limiting
🔌 Sony gobreaker circuit breaker
🔀 GraphQL proxy → auth, subscription
🔀 REST proxy → billing
📡 OpenTelemetry OTLP traces + logs
📖 Swagger UI · health endpoints
auth-service Node.js 22
Express 5 · Apollo GraphQL · port :8080
Registration, login, token refresh, logout
PostgreSQL via Prisma ORM
Prometheus metrics · OTel + Winston
Custom JWT (dev) / Keycloak OIDC (prod)
subscription-service NestJS 11
GraphQL API + gRPC server :50051
Kafka publisher (Avro, Glue Schema Registry)
PostgreSQL via TypeORM · circuit breaker
OTel + Winston observability
billing-service Java 21
Spring Boot 4 · port :8082
Stripe payment processing + invoicing
Kafka consumer (billing.usage-charge, Avro)
gRPC client → subscription :50051
Redis Bucket4j rate limiting · JPA/Hibernate
usage-service Python 3.11
Airflow 2.10 · 3 DAGs (ingest → aggregate → embed)
ChromaDB + OpenAI embeddings RAG layer
pybreaker circuit breaker · tenacity retry
🔗 proto repo — Buf Schema Registry
Single source of truth for all gRPC contracts
subscription/v1/subscription.proto
Buf linting + validation rules (buf.yaml)
Consumed by NestJS (subscription-service) and Spring Boot (billing-service)
Migration guide · registry comparison docs · per-language usage examples
☁️ Terraform Infra (saas-services-infra)
modules/eks — EKS 1.32, IRSA, AWS Verified Access
modules/rds — isolated PostgreSQL per service
modules/msk — Amazon MSK + Glue Schema Registry
modules/elk / grafana / otel — full observability
modules/k8s-and-helm — ArgoCD, Keycloak, Airflow
graph TD
  A([External Traffic]) --> B[Istio Gateway\nmTLS · PeerAuthentication]
  B --> C[api-gateway :9000\nGo · go-kit · JWT · Redis rate limit · gobreaker]
  C -->|GraphQL| D[auth-service :8080\nNode.js · Express 5 · Prisma · PostgreSQL]
  C -->|GraphQL| E[subscription-service :8081\nNestJS 11 · TypeORM · Kafka publisher]
  C -->|REST| F[billing-service :8082\nSpring Boot 4 · Stripe · Kafka consumer]
  E -->|gRPC :50051| F
  F --> G[usage-service :8083\nPython · Airflow 2.10 · ChromaDB RAG]
  H[proto repo\nBuf Registry] -. contracts .-> E & F
  I[ArgoCD GitOps\nHelm OCI → ECR] -. deploys .-> C & D & E & F & G
  J[OTel DaemonSet] -. traces .-> C & D & E & F & G
Go · go-kit, Node.js · Express 5, NestJS 11, Spring Boot 4, Python 3.11, gRPC, Protobuf · Buf Registry, GraphQL · Apollo, Kafka / MSK (Avro), Stripe, Istio mTLS, Keycloak OIDC, Sony gobreaker, Redis rate limiting, ArgoCD GitOps, Helm OCI / ECR, EKS 1.32, Terraform, RDS PostgreSQL, Airflow 2.10, ChromaDB, OpenTelemetry, AWS Verified Access, Glue Schema Registry
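
The token-bucket rate limiting listed for api-gateway can be illustrated with a minimal in-memory sketch. This is not the gateway's code (which is written in Go and backed by Redis); the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made purely for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """In-memory token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.rate = rate
        self.clock = clock              # injectable so tests can control time
        self.tokens = capacity          # start full so short bursts are allowed
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A shared store such as Redis would typically hold `tokens` and `updated` per tenant or per API key so that every gateway replica enforces the same limit; the refill arithmetic stays the same.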
🔧
AI

AI-powered project planning platform. Feed it a client proposal and a LangGraph multi-agent pipeline produces project timelines, Gantt charts, and 6 types of technical diagrams — with self-healing retry loops that automatically fix invalid outputs before delivery. Built on HuggingFace Inference API for diagram generation and Together AI for prompt optimisation, with PostgreSQL session tracking and full output versioning.

graph TD
  A([Client Proposal]) --> B[PromptOptimizer\nLangChain · Together AI]
  B --> C[TimeAgent\nMilestones · Gantt]
  C --> D{TimeValidator\n↺ 3 retries}
  D -->|invalid| C
  D -->|valid| E[PlanAgent\nTask breakdown · Risks]
  E --> F[ImageGenerator\nMermaid × 6 types\nHuggingFace API]
  F --> G{Validator\n↺ 3 retries}
  G -->|invalid| F
  G -->|valid| H([FastAPI Response\nPostgreSQL Session Store])
LangGraph, LangChain, HuggingFace API, Together AI, FastAPI, PostgreSQL, Alembic, Mermaid, Docker Compose, Self-healing Agents
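
The self-healing retry loop described above (generate, validate, regenerate on failure, up to 3 retries) can be sketched in plain Python. This is an illustrative outline only, not the project's LangGraph code: `generate_with_retries`, the `generate` and `validate` callables, and the reading of "3 retries" as one initial attempt plus three retries are all assumptions.

```python
from typing import Callable, Optional, Tuple


def generate_with_retries(
    generate: Callable[[Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_retries: int = 3,
) -> Tuple[str, int]:
    """Generate an output, validate it, and regenerate with feedback on failure.

    `validate` returns None when the output is valid, otherwise an error
    message that is handed back to `generate` as feedback for the next try.
    Returns (output, number_of_attempts).
    """
    feedback: Optional[str] = None
    # One initial attempt plus `max_retries` retries (an assumed interpretation).
    for attempt in range(1, max_retries + 2):
        output = generate(feedback)
        feedback = validate(output)
        if feedback is None:
            return output, attempt
    raise RuntimeError(
        f"output still invalid after {max_retries} retries: {feedback}"
    )
```

Passing the validator's error message back into the generator is what makes the loop "self-healing" rather than a blind retry: each new attempt can target the specific defect found in the previous one.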

Production-grade multi-agent RAG system orchestrated by LangGraph. A Planner node dynamically routes queries across three specialised agents — RAG (LLaMA 3.3 70B via Groq + ChromaDB), Web Search (Tavily + Wikipedia + GPT-4o-mini), and Memory (SQLite session history) — then a ReAct-style Replan node decides if more tool calls are needed before a final Claude Sonnet Aggregator synthesises the answer. Each agent communicates with its own MCP microservice over HTTP. Vue 3 frontend, FastAPI backend, Docker Compose, and LangSmith tracing.

graph TD
  A([User Query]) --> B[Chat Service\nQuery Contextualization]
  B --> C[Planner Node\nClaude Sonnet]
  C --> D[Memory Agent\nSQLite · MCP :8003]
  C --> E[RAG Agent\nLLaMA 3.3 70B · ChromaDB · MCP :8001]
  C --> F[Web Agent\nTavily · Wikipedia · MCP :8002]
  D & E & F --> G{Replan Node\nReAct · done?}
  G -->|more tools needed| C
  G -->|done| H[Aggregator\nClaude Sonnet]
  H --> I([FastAPI :8000\nVue 3 · Vite])
  J[LangSmith] -. traces .-> C & G & H
LangGraph, Claude Sonnet, LLaMA 3.3 70B, GPT-4o-mini, MCP, ChromaDB, FastAPI, Vue 3 + Vite, Groq, Tavily, SQLite Memory, LangSmith, Docker Compose
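
The plan, act, replan, aggregate flow described above can be outlined without any framework. The sketch below is a simplified stand-in, not the LangGraph implementation: `run_pipeline`, the callable signatures, and the `max_rounds` cap are hypothetical, and the agents are plain functions rather than MCP microservices.

```python
from typing import Callable, Dict, List

Agent = Callable[[str], str]


def run_pipeline(
    query: str,
    agents: Dict[str, Agent],
    plan: Callable[[str, List[str]], List[str]],
    is_done: Callable[[str, List[str]], bool],
    aggregate: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Plan -> run agents -> replan until done, then aggregate the findings."""
    findings: List[str] = []
    for _ in range(max_rounds):
        # The planner picks which agents to call given what is known so far.
        for name in plan(query, findings):
            findings.append(f"{name}: {agents[name](query)}")
        # The replan step decides whether another round of tool calls is needed.
        if is_done(query, findings):
            break
    return aggregate(query, findings)
```

Capping the number of rounds is a design choice assumed here to guarantee termination when the replan step never reports completion.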

Serverless RAG pipeline on AWS — ingests documents via Lambda, stores embeddings in a vector-capable DynamoDB layer, and queries through Bedrock foundation models behind an API Gateway with Terraform-managed infrastructure.

graph LR
  A([Client]) --> B[API Gateway]
  B --> C[Lambda\nProcessor]
  C --> D[(DynamoDB\nVectors)]
  C --> E[Bedrock LLM]
  E --> B
  F[Terraform IaC] -. provisions .-> B
Terraform, AWS Lambda, API Gateway, Bedrock, DynamoDB

Full-stack AI code advisor using the Model Context Protocol. Deployed on EKS behind Route 53 with TLS via Certbot, containerised with Docker, and provisioned via Terraform + Ansible with GitHub Actions CI/CD.

graph LR
  A([React UI]) --> B[FastAPI\nMCP Server]
  B --> C[LLM\nCode Advisor]
  D[GitHub Actions] --> E[Docker → ECR]
  E --> F[EKS Cluster]
  F --> B
  G[Route 53\n+ Certbot TLS] --> F
  H[Terraform\n+ Ansible] -. provisions .-> F
FastAPI, React, EKS, ECR, Terraform, Ansible, GitHub Actions

Multi-agent RAG system orchestrating LLaMA 3.1, LLaMA 3.2, and GPT-4.1 Mini through LangChain. Agents autonomously decide which model to call per sub-task, using ChromaDB as the vector store.

graph TD
  A([Query]) --> B[LangChain\nRouter Agent]
  B --> C[LLaMA 3.1]
  B --> D[LLaMA 3.2]
  B --> E[GPT-4.1 Mini]
  C & D & E --> F[(ChromaDB\nVector Store)]
  F --> G[FastAPI\nResponse]
FastAPI, LangChain, LLaMA 3.x, GPT-4.1 Mini, ChromaDB, Docker
🔍
AI

Extends standard RAG to handle text, images, and structured data together. Uses multimodal embeddings via Hugging Face and OpenAI API for cross-modal retrieval and generation.

graph LR
  A[Text] --> D
  B[Images] --> D
  C[Structured\nData] --> D
  D[Multimodal\nEmbeddings\nHugging Face] --> E[(Vector\nIndex)]
  E --> F[Cross-modal\nRetrieval]
  F --> G[OpenAI API\nGeneration]
Python, LangChain, Hugging Face, OpenAI API, Multimodal Embeddings

Feature-rich voice-based AI platform with real-time WebSockets, speech recognition via Whisper AI, Meta Llama integration, XTTS v2 text-to-speech, and a robust async backend. Full Kubernetes deployment with Helm, PgBouncer connection pooling, and GitHub Actions pipelines.

graph TD
  A([Next.js]) --> B[Django Backend\ngRPC + WebSockets]
  B --> C[RabbitMQ]
  B --> D[(PostgreSQL\nPgBouncer)]
  B --> E[(Redis)]
  F[Whisper AI\nMeta Llama] --> B
  G[Kubernetes\n+ Helm] -. orchestrates .-> B
  H[XTTS v2 Encoder] --> B
Django, Next.js, gRPC, Whisper AI, XTTS V2 Encoder, Meta Llama, RabbitMQ, Redis, PostgreSQL, Kubernetes, Helm

High-performance crypto trading platform in Go with gRPC microservices. AWS-native deployment on EKS with RDS + ElastiCache, GitOps via ArgoCD, and full IaC through Terraform + Helm charts.

graph LR
  A([React]) --> B[Go Gin API]
  B --> C[gRPC\nMicroservices]
  C --> D[(RDS\nPostgreSQL)]
  C --> E[(ElastiCache\nRedis)]
  F[EKS + Terraform] -. hosts .-> B
  G[ArgoCD] -. GitOps .-> F
Go, Gin, gRPC, React, EKS, RDS, ElastiCache, ArgoCD, Terraform

Enterprise-grade banking platform with Spring Boot microservices, event-driven architecture via Kafka, multi-database strategy (PostgreSQL + MySQL + MongoDB), and complete observability with Grafana, Prometheus, Loki & Tempo.

graph TD
  A([Next.js]) --> B[Keycloak\nAuth]
  B --> C[Spring Boot\nMicroservices]
  C --> D[Kafka\nEvent Bus]
  D --> C
  C --> E[(PostgreSQL\nMySQL\nMongoDB)]
  C --> F[(Redis)]
  G[Grafana\nPrometheus\nLoki] -. observes .-> C
  H[Jenkins\nKubernetes] -. deploys .-> C
Spring Boot, Next.js, Kafka, Keycloak, Redis, Grafana, Prometheus, Jenkins, Kubernetes

Scalable ML data pipeline with Apache Airflow for ETL automation, MLflow experiment tracking, DVC data versioning, and AWS S3 storage. Pushgateway + Prometheus for full pipeline observability.

graph LR
  A([Data Source]) --> B[Apache Airflow\nJob Trigger]
  B --> C[scikit-learn\nML Pipeline]
  C --> D[(AWS S3)]
  C --> E[MLflow\nExperiment Tracking]
  D --> F[DVC\nVersioning]
  G[Pushgateway\nPrometheus] -. monitors .-> C
Python, Apache Airflow, MLflow, scikit-learn, AWS S3, DVC, Prometheus, Pushgateway

End-to-end serverless video ingestion pipeline using AWS Kinesis Video Streams, Kinesis Data Streams, Lambda, Redshift analytics, and OpenSearch for indexing. FFmpeg + OpenCV for stream processing with X-Ray distributed tracing.

graph LR
  A([Camera /\nVideo Source]) --> B[OpenCV\nFFmpeg]
  B --> C[Kinesis Video\nStreams]
  C --> D[Lambda\nProcessor]
  D --> E[Kinesis Data\nStreams]
  E --> F[(Redshift)]
  E --> G[(OpenSearch)]
  H[X-Ray] -. traces .-> D
Python, OpenCV, FFmpeg, Kinesis, Lambda, Redshift, OpenSearch, X-Ray, Terraform

Final Year Project — LSTM-based deep learning model for stock price prediction. Flask web interface exposing real-time inference on live market data with NumPy/Pandas feature engineering pipelines.

graph LR
  A([Market Data]) --> B[Pandas\nFeature Engineering]
  B --> C[LSTM Model\nTensorFlow]
  C --> D[Flask API\nInference]
  D --> E([Web UI\nReal-time Charts])
Python, TensorFlow, LSTM, Flask, Pandas, NumPy

CNN-based computer vision pipeline for real-time object detection. Built and trained from scratch with TensorFlow, exploring custom architecture design and transfer learning optimisations.

graph LR
  A([Image Input]) --> B[CNN Backbone\nTensorFlow]
  B --> C[Feature\nExtraction]
  C --> D[Classification\nHead]
  C --> E[Bounding Box\nRegression]
  D & E --> F([Detection\nOutput])
Python, TensorFlow, CNNs, Computer Vision

Asynchronous Advantage Actor-Critic (A3C) agent trained on Gymnasium environments (the successor to OpenAI Gym) using PyTorch. Implements distributed training across multiple parallel actor-learner threads.

graph TD
  A[Global Network\nShared Weights] --> B[Worker 1\nActor-Critic]
  A --> C[Worker 2\nActor-Critic]
  A --> D[Worker N\nActor-Critic]
  B & C & D --> E[Gymnasium\nEnvironment]
  E --> B & C & D
  B & C & D -. async gradients .-> A
Python, PyTorch, A3C, OpenAI Gymnasium, Reinforcement Learning
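
In A3C, each worker turns a short rollout into n-step discounted returns and advantages before pushing gradients to the global network. The sketch below shows that calculation in plain Python rather than PyTorch; the function name `n_step_advantages` and its arguments are illustrative and not taken from the project.

```python
from typing import List, Tuple


def n_step_advantages(
    rewards: List[float],
    values: List[float],
    bootstrap_value: float,
    gamma: float = 0.99,
) -> Tuple[List[float], List[float]]:
    """Return (returns, advantages) for one worker rollout.

    rewards[t] and values[t] are the reward and critic estimate V(s_t) at
    step t; bootstrap_value is the critic estimate for the state after the
    last step (0.0 if the episode terminated).
    """
    returns: List[float] = []
    running = bootstrap_value
    # Walk the rollout backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    # Advantage = how much better the observed return was than the critic's guess.
    advantages = [ret - value for ret, value in zip(returns, values)]
    return returns, advantages
```

The advantages weight the policy-gradient term for the actor, while the returns serve as regression targets for the critic.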

Let's Connect

Open to Opportunities

Whether it's a full-time role, a contract project, or an interesting engineering problem — I'd love to hear from you.

huzaifaahmad2210@gmail.com

+92 301 0507689