Currently in the US — open to relocation globally

I build ML systems
that survive production.

My name is Pratiksha. I'm an AI/ML Engineer with 4+ years building production ML systems at Deutsche Bank and HCLTech. I specialize in RAG pipelines, MCP-based agentic workflows, LLM evaluation, and compliance-grade model governance in regulated financial environments.

I'm actively looking for ML Engineer roles in Germany, the Netherlands, Canada, and Australia. My Basel III, MiFID II, and HIPAA background isn't something I learned from a course — it shaped how I design and deploy real production systems. I'm EU Blue Card eligible and can relocate quickly.

Open to relocation — EU Blue Card eligible
RAG · MCP · Agentic AI · Python · AWS · Azure · PyTorch · LLM Evaluation · Fine-tuning · MS Data Science · SUNY Buffalo
RAG Pipeline · Financial Compliance · GenAI
Regulatory RAG System

Compliance analysts at Deutsche Bank were spending hours manually searching thousands of pages of Basel III and MiFID II documentation. I built a retrieval system that makes that search instant, cited, and reliable.

The key engineering decision was in retrieval strategy. Pure vector search scored 79% on our eval set — hybrid BM25 plus vector scored 91%. Regulatory text has dense, exact terminology like LCR and CET1 that semantic search alone misses. The 70/30 weighting was tuned on a 200-question eval set built with compliance analysts.
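The relevance@5 number above comes from that analyst-built eval set. A minimal sketch of how such a harness can score a retriever (names and toy data here are illustrative, not the actual eval set or pipeline):

```python
# Hypothetical relevance@5 harness: fraction of questions where at least
# one analyst-labeled relevant chunk appears in the top-k retrieved chunks.
def relevance_at_k(retrieve, eval_set, k=5):
    hits = 0
    for question, relevant_ids in eval_set:
        top_k = [chunk_id for chunk_id, _score in retrieve(question)[:k]]
        if any(cid in top_k for cid in relevant_ids):
            hits += 1
    return hits / len(eval_set)

# Toy stand-in retriever: returns (chunk_id, score) pairs, best first.
def toy_retrieve(question):
    return [("lcr-001", 0.92), ("cet1-004", 0.88), ("nsfr-002", 0.71)]

eval_set = [("What is the LCR floor?", {"lcr-001"}),
            ("Define CET1 capital.", {"cet1-004"}),
            ("NSFR reporting frequency?", {"nsfr-009"})]
print(relevance_at_k(toy_retrieve, eval_set))  # 2 of 3 questions hit
```

The same harness, run once per retrieval strategy, is what makes a claim like "91% vs 79%" checkable rather than anecdotal.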

LlamaIndex · ChromaDB · FAISS · OpenAI Embeddings · BM25 · FastAPI · Basel III · MiFID II
pipeline.py
# Hybrid retrieval: 70% vector + 30% BM25
def _build_retriever(self):
    return QueryFusionRetriever(
        retrievers=[
            VectorIndexRetriever(index=self.index, similarity_top_k=10),
            BM25Retriever.from_defaults(nodes=self.nodes, similarity_top_k=10),
        ],
        retriever_weights=[0.7, 0.3],
        similarity_top_k=5,
    )
# Result: 91% relevance@5 vs 79% pure vector
91%
relevance@5 on internal compliance eval set
60%
reduction in analyst research time
10K+
regulatory documents indexed
View on GitHub →

Agentic AI · MCP · Financial Services
MCP Financial Data Assistant

Risk analysts needed to query live trading and liquidity data for ad-hoc questions — but every query meant writing SQL or waiting for a data engineer. I built an MCP server that lets Claude do it in natural language.

The server exposes 6 structured financial tools. Claude decides which tools to call and in what order, executes them, and synthesizes the answer. The hardest part was tool design: too granular and the model takes too many steps; too broad and it loses precision. Getting the boundaries right took several iterations with actual analyst workflows.
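What a right-sized tool boundary looks like can be sketched with plain dicts in the shape of MCP tool schemas (the tool names and fields here are illustrative, not the production server's):

```python
# Illustrative MCP-style tool boundaries: one tool per analyst
# question-shape, not one tool per table (too granular) and not a
# single "query_anything" tool (too broad).
RISK_TOOLS = [
    {
        "name": "get_risk_metrics",
        "description": "VaR and exposure metrics for one portfolio.",
        "inputSchema": {
            "type": "object",
            "properties": {"portfolio_id": {"type": "string"}},
            "required": ["portfolio_id"],
        },
    },
    {
        "name": "get_exposure_by_region",
        "description": "Credit exposure aggregated by region.",
        "inputSchema": {
            "type": "object",
            "properties": {"region": {"type": "string"}},
            "required": ["region"],
        },
    },
]

def validate_call(name: str, args: dict) -> bool:
    """Reject tool calls that don't match a declared schema."""
    tool = next((t for t in RISK_TOOLS if t["name"] == name), None)
    if tool is None:
        raise ValueError(f"unknown tool: {name}")
    missing = [k for k in tool["inputSchema"]["required"] if k not in args]
    if missing:
        raise ValueError(f"missing required args: {missing}")
    return True
```

Keeping the schemas small and required fields explicit is what lets the model chain three tools reliably instead of guessing at one overloaded endpoint.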

Anthropic MCP · Claude API · Python · Tool Use · Agentic Workflows · SQLite · FastAPI
server.py
# MCP tool: expose financial data to Claude
@server.call_tool()
async def call_tool(name: str, args: dict):
    if name == "get_risk_metrics":
        # db.row_factory = sqlite3.Row, so rows convert cleanly to dicts
        row = db.execute("SELECT * FROM risk_metrics WHERE portfolio_id=?",
                         (args["portfolio_id"],)).fetchone()
        return dict(row)
# Claude resolves: "What's our APAC credit exposure this week?"
# → get_portfolio → get_risk_metrics → get_exposure_by_region
90s
analyst time-to-insight, down from 15 minutes
6
financial data tools exposed via MCP
100%
audit-logged — every tool call traceable
View on GitHub →

LLM Evaluation · Financial Services
LLM Benchmarking Framework

At Deutsche Bank I needed to figure out which LLMs were actually worth deploying in a regulated financial environment. So I built an evaluation framework to find out.

The framework compares models on what actually matters for enterprise use: TTFT, throughput, hallucination rate, and cost per token. The finding that shaped the final architecture: GPT-4 had the best accuracy but cost 15x more than Gemini. The answer wasn't "use the best model" — it was "use the right model for the right risk level."
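In practice that rule collapses into a small routing table: pick the cheapest model rated for the task's risk tier. A sketch, with illustrative tiers and costs in the spirit of the benchmark's finding:

```python
# Route each task to the cheapest model that clears its risk tier.
# Tier assignments and per-token costs here are illustrative.
MODEL_TIERS = {
    "gpt-4":      {"max_risk": "high",   "cost_per_1k_tokens": 0.030},
    "gemini-pro": {"max_risk": "medium", "cost_per_1k_tokens": 0.002},
}
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}

def route(task_risk: str) -> str:
    """Pick the cheapest model rated for at least this risk level."""
    eligible = [(spec["cost_per_1k_tokens"], name)
                for name, spec in MODEL_TIERS.items()
                if RISK_ORDER[spec["max_risk"]] >= RISK_ORDER[task_risk]]
    if not eligible:
        raise ValueError(f"no model rated for risk level {task_risk!r}")
    return min(eligible)[1]

print(route("high"))    # gpt-4: only model rated for high-stakes tasks
print(route("medium"))  # gemini-pro: cheaper, and rated for this tier
```

The 65% cost saving falls out of exactly this asymmetry: most synthesis traffic is medium-risk and never needs the expensive model.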

GPT-4 · Gemini Pro · Grok · TTFT · Hallucination Rate · Python · Langfuse
benchmark.py
# Compare models on what matters in production
results = LLMBenchmark(
    models=["gpt-4", "gemini-pro", "grok-1"],
    metrics=["ttft", "throughput", "hallucination_rate", "cost_per_token"],
    risk_level="high"
).run(dataset="regulatory_qa.jsonl")
# GPT-4 → high-stakes tasks
# Gemini → high-volume synthesis (65% cheaper)
65%
reduction in regulatory research time
3
LLMs evaluated head-to-head in production
15x
cost difference that changed the architecture

Time-Series Forecasting · Risk Models
Market Risk Forecasting Pipeline

At Deutsche Bank I built the forecasting layer for market risk and liquidity models. The pipeline runs daily across 25M+ records from multiple asset class systems.

The interesting part was model selection. ARIMA is interpretable and fast. Prophet handles seasonality. LSTM captures non-linear patterns. The choice isn't about which model is best in theory — it's about which one you can explain to a risk committee and audit under Basel III.
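Whatever the committee constraints, the candidates still have to be compared the same way: walk-forward backtests, refitting on history and scoring one step ahead. A stdlib-only sketch of that harness (the forecasters below are naive stand-ins, not the actual ARIMA/Prophet/LSTM candidates):

```python
# Walk-forward backtest: at each step, forecast one step ahead from
# history only, then score by mean absolute error. Any forecaster is
# a callable: history -> predicted next value.
def walk_forward_mae(forecast, series, warmup=3):
    errors = []
    for t in range(warmup, len(series)):
        pred = forecast(series[:t])
        errors.append(abs(pred - series[t]))
    return sum(errors) / len(errors)

# Naive stand-ins for the real model candidates:
persistence = lambda hist: hist[-1]           # carry last value forward
moving_avg = lambda hist: sum(hist[-3:]) / 3  # 3-step rolling mean

series = [100, 102, 101, 105, 107, 110, 108, 112]
scores = {"persistence": walk_forward_mae(persistence, series),
          "moving_avg": walk_forward_mae(moving_avg, series)}
print(min(scores, key=scores.get))  # persistence wins on this toy series
```

The same loop works unchanged whether `forecast` wraps a fitted ARIMA, Prophet, or LSTM, which is what makes the committee comparison apples-to-apples.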

ARIMA · Prophet · LSTM · Apache Airflow · MLflow · AWS SageMaker · Basel III
21%
improvement in VaR forecasting accuracy
50min
pipeline runtime, down from 5 hours
25M+
daily records processed

NLP · Healthcare · MLOps
Clinical NLP Anomaly Detection

At HCLTech I built a pipeline to catch ICD-10 coding errors in physician notes before they caused billing or compliance issues. Medical coders were spending hours on manual review — the goal was to make that review targeted.

The pipeline uses SpaCy for entity extraction and TF-IDF for anomaly scoring. The deployment constraint made this more interesting than the model — everything had to be HIPAA-compliant with strict audit logging and zero PII in model inputs or outputs.
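The TF-IDF anomaly-scoring idea can be sketched in pure stdlib Python (the production pipeline scores SpaCy-extracted entities; here plain word tokens stand in, the scoring rule is a simple nearest-neighbor distance rather than the deployed one, and no PII appears anywhere):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF: term frequency weighted by inverse document frequency."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc.split()))
    vecs = []
    for doc in docs:
        tf = Counter(doc.split())
        vecs.append({t: (c / len(doc.split())) * math.log(n / df[t])
                     for t, c in tf.items()})
    return vecs

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def anomaly_scores(docs):
    """1 - similarity to nearest neighbor; high score = review this note."""
    vecs = tfidf_vectors(docs)
    return [1.0 - max(cosine(v, w) for j, w in enumerate(vecs) if j != i)
            for i, v in enumerate(vecs)]

notes = ["chest pain icd I20", "chest pain icd I20",
         "chest pain icd I20", "fracture femur icd Z99"]
scores = anomaly_scores(notes)
print(scores.index(max(scores)))  # the odd note gets the highest score
```

That ranking is the whole point of "targeted review": coders start from the highest-scoring notes instead of reading everything.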

SpaCy · TF-IDF · Flask · Docker · AWS EC2 · HIPAA · MLflow
25%
improvement in ICD-10 coding accuracy
75%
faster model deployment time
90%+
model consistency maintained in production

I'm making a deliberate move, not a desperate one.

The kind of ML work I find most interesting — production systems in regulated industries, models that have to be explainable and auditable, AI that's held to a real standard — that work is happening seriously across Europe, Canada, and Australia right now.

My background in Basel III, MiFID II, and HIPAA isn't incidental. Working at Deutsche Bank in the US means I already understand how European financial regulation shapes ML system design. I'm not learning that on the job — I've been doing it. That same rigor maps directly to OSFI in Canada and APRA in Australia.

I'm eligible for the EU Blue Card and the Dutch Highly Skilled Migrant visa. I can relocate quickly. The countries I'm targeting aren't random — they're where the work I want to do is actually happening.

🇩🇪 Germany 🇳🇱 Netherlands 🇦🇺 Australia 🇨🇦 Canada 🇸🇪 Sweden 🇬🇧 UK
Let's talk.

If you're hiring ML engineers in Germany, the Netherlands, Canada, or Australia — or if you're curious about the RAG pipeline, the MCP assistant, or the LLM evaluation work — I'd genuinely like to hear from you. Email works best.