I build what the data team does not yet have — the pipeline that replaces the spreadsheet, the model that goes beyond the notebook, the dashboard that needs no user manual. Twenty years across particle physics, NLP, geospatial analysis, agricultural finance, healthcare, education, and media: different domains, the same discipline. Quality first. Accessible outputs. Value demonstrable before the scale-up.
“We have data scattered across spreadsheets, vendor feeds, and siloed systems with no single source of truth.”
At FIRA (Mexico's national agricultural development trust), I replaced the institution's
ad-hoc Excel workflows with an automated, governed pipeline — integrating satellite vegetation indices,
climate risk models, and regulatory databases into credit decisions for agricultural producers.
Credit analysts adopted it without technical training.
Not because it was simple — because it was designed for them.
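The shape of that integration, as a minimal sketch: a vegetation index computed from satellite bands and joined to per-parcel records. Names, values, and columns here are illustrative, not FIRA's production schema.

```python
import numpy as np
import pandas as pd

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    """Normalized Difference Vegetation Index from near-infrared and red bands."""
    return (nir - red) / np.clip(nir + red, 1e-6, None)  # guard against zero denominators

# Illustrative per-parcel band means; in production these came from the satellite feed.
parcels = pd.DataFrame({
    "parcel_id": [101, 102],
    "nir_mean": [0.62, 0.35],
    "red_mean": [0.18, 0.30],
})
parcels["ndvi"] = ndvi(parcels["nir_mean"].to_numpy(), parcels["red_mean"].to_numpy())

# Downstream, rows like these were joined with climate-risk scores and
# regulatory flags into one governed feature table per credit assessment.
print(parcels)
```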
At D3Clarity, I built the analytics practice from scratch, adding MDM and
data lake infrastructure for international clients. Semarchy xDM certified
— MDM, Data Profiling, Data Lineage, Golden Records.
“We have thousands of documents — meeting minutes, contracts, reports — and can't extract anything useful from them.”
At D3Clarity, I built the full NLP pipeline: automated scraping of public
school board meeting minutes → ML topic classification → Power BI dashboard mapping
sales opportunities per district. Public records turned into a production commercial tool.
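A minimal sketch of the classification step, using scikit-learn with placeholder minutes and labels; the production corpus and topic taxonomy were far larger.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder excerpts and topics; the real corpus was scraped from
# public school board sites and labeled against sales-relevant topics.
docs = [
    "Board approves budget for new science lab equipment",
    "Motion to renew the district transportation contract",
    "Discussion of the facilities maintenance backlog",
    "Vote on the classroom technology procurement plan",
]
labels = ["lab_equipment", "transportation", "facilities", "technology"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(docs, labels)

# Per-district predictions like this one fed the Power BI opportunity map.
print(clf.predict(["RFP issued for a district-wide laptop refresh"]))
```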
At BairesDev, a RAG and NLP pipeline made unstructured legal contracts
queryable — insight extraction per lawyer, per contract. My MSc thesis addressed
automatic text classification in 2009; this line of work predates the LLM era.
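The retrieval half of that pipeline, sketched with TF-IDF standing in for the dense embeddings a production RAG system would use; the clauses and query are placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder clauses; in the real pipeline, contracts were split into
# clauses and indexed for querying per lawyer and per contract.
clauses = [
    "The supplier shall indemnify the client against third-party claims.",
    "Either party may terminate with ninety days written notice.",
    "Payment is due within thirty days of invoice receipt.",
]

vec = TfidfVectorizer().fit(clauses)
index = vec.transform(clauses)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k clauses most similar to the query. The retrieved text
    is what gets passed as context to the generative model."""
    scores = cosine_similarity(vec.transform([query]), index)[0]
    return [clauses[i] for i in scores.argsort()[::-1][:k]]

print(retrieve("What are the termination conditions?"))
```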
“Our models work in notebooks. They don't work in the organization.”
Four years at D3Clarity: every project followed PoC → production,
without exception. At CITEIM, Transformer-based models for continuous
sign language recognition — fully local inference pipeline, on-device. No cloud dependency by design.
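What "fully local" means in practice, as a sketch: load an exported model once, run every clip on the machine, make no network call anywhere. The model path, tensor shapes, and TorchScript export are assumptions for illustration.

```python
import torch

# Hypothetical export of the sign language model; the production model was a
# Transformer over per-frame feature sequences, packaged for on-device use.
model = torch.jit.load("slr_transformer.pt")
model.eval()

@torch.no_grad()
def predict(frames: torch.Tensor) -> int:
    """Classify one clip entirely on-device. frames: (1, T, D)."""
    logits = model(frames)
    return int(logits.argmax(dim=-1).item())

clip = torch.randn(1, 64, 128)  # dummy 64-frame clip, 128 features per frame
print(predict(clip))
```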
At the HAWC Observatory, distributed computing pipelines processed
petabyte-scale cosmic ray datasets across US and Mexican institutions.
The physics background is not incidental — it is where production-grade scientific
software is built under real constraints.
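The shape of those pipelines, reduced to a single-node sketch with multiprocessing: map a reader over raw files, reduce the partial results into one spectrum. The file layout and reader are hypothetical; the real runs used cluster schedulers across institutions.

```python
from multiprocessing import Pool
from pathlib import Path
import numpy as np

def process_run(path: Path) -> np.ndarray:
    """Histogram the events in one raw file (np.load stands in for the detector format)."""
    events = np.load(path)
    return np.histogram(events, bins=50, range=(0.0, 1.0))[0]

if __name__ == "__main__":
    runs = sorted(Path("data").glob("run_*.npy"))  # hypothetical layout
    with Pool() as pool:
        partial_hists = pool.map(process_run, runs)
    # Partial histograms sum into one spectrum: the same map-reduce
    # shape, scaled up, that handled the petabyte-scale datasets.
    spectrum = np.sum(partial_hists, axis=0)
    print(spectrum)
```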
| Years | Organization | Role | Highlights |
| --- | --- | --- | --- |
| 2025 – 2026 | FIRA | Data Scientist | Digital transformation of Mexico's national agricultural credit process. Automated pipeline replacing manual workflows — satellite, climate, and regulatory data integrated into repeatable, governed credit assessments. Adopted by non-technical analysts without retraining. |
| 2024 – 2025 | CITEIM | Data Scientist | Transformer-based models for continuous sign language recognition. Fully local inference pipeline — on-device processing, no cloud dependency; deliberate architectural choice to reduce latency and eliminate operational cost. Current-generation AI applied to a direct human communication problem with social impact. |
| 2023 – 2024 | Punto Singular | Data Scientist & Team Lead | KPI and Quality dashboards (R, Python, Tableau, Looker) for business stakeholders. Financial market streaming prototype on AWS (WebSocket pipelines). Multi-sector client engagements: EdTech, environmental monitoring, industrial analytics. Founded an internal Data Science group — students solving real problems for nonprofits and clients. |
| 2023 | BairesDev | Data Scientist | Graph database modeling (Neo4j) for relationship analysis. Transformer-based text summarization. RAG + NLP pipeline for legal contracts — clause extraction, obligation mapping, per-lawyer portfolio querying. AirTable data curation. |
| 2019 – 2023 | D3Clarity | Data Scientist | Built the analytics practice from scratch — expanding the company's offering beyond MDM. Four years: NLP pipelines, data lakes, ETL, forecasting models, MDM, executive dashboards for international clients. Agile/Scrum with Jira. Every project: PoC to production. |
| 2004 – 2019 | HAWC · Tec MTY · UMSNH | Researcher · Lecturer · Consultant | PhD candidate at the HAWC Observatory — petabyte-scale cosmic ray data, distributed computing, Bayesian spectral analysis (14 co-authored publications). Teaching at Tec de Monterrey. IT infrastructure and sysadmin at UMSNH. Consulting across real estate, media, healthcare, education, and public sector. |
| 2002 | CERN / U. Lausanne | DAQ System Member | Data acquisition system for a Positron Emission Tomography scanner (ClearPET project). Signal conversion and data handling at CERN, under CONACYT-CERN cooperation. The benchmark for data precision was set here. |
Contributing member of The HAWC Collaboration — an international multi-institution observatory operating at 4,100m altitude, Sierra Negra, Mexico. PhD research published in Phys. Rev. D (2022). The Collaboration's research has appeared in Science (2017), among other peer-reviewed journals and international conference proceedings.
Publication record on ORCID.

Available to relocate. Open to international positions. Logistical transition is not a constraint.