- Extended the clinical-trial eligibility pipeline with new LangGraph processing steps and Langfuse traceability; built criteria-level evaluation harnesses and resolved structured-output failures to raise matching reliability.
- Shipped patient-facing product features end to end — Next.js / React / TypeScript frontend through gRPC + FastAPI Python services to the LLM matching engine and EHR data layer.
- Designed a three-layer env-var safety system (typed schema → pre-deploy CI lint → boot-time validation) and standardized CI across four repos with a blocking test gate and Playwright post-deploy smoke tests.
- Collapsed a fragmented three-repo local stack into a single Makefile-driven command.
Colin Murphy
Founding engineer and full-stack builder with 8+ years shipping end-to-end, AI-powered products in fast-moving, data-intensive environments. Strongest in applied AI — LLM services, evaluation, observability, and grounding for domain-specific accuracy — and comfortable owning the full stack around it, from Python backends to React/Next.js frontends and cloud infrastructure (AWS & GCP). I have built solo from zero-to-MVP and led a small, craft-focused engineering team.
Experience
- Developed LLM pipelines using RAG and LangChain, grounding model outputs in domain literature for accurate, well-supported answers in high-stakes clinical decision-making.
- Architected and led the build of an LLM-assisted annotation platform (React/TypeScript, Node.js, PostgreSQL) with OAuth role-based review workflows that accelerated expert labeling throughput; personally owned the database schema and migrations.
- Built ingestion pipelines to parse, OCR, and extract insight from 100,000+ unstructured PDF patient reports and 9M+ images into a structured knowledge base.
- Trained a custom classification model spanning 60+ dermatological conditions, with metrics surpassing best-in-class numbers in peer-reviewed literature.
- Established the model development lifecycle and deployment scheme for at-scale machine learning applications.
- Conducted data analysis for New Product Development and Manufacturing Sustainability: ANOVAs, sampling plans, specification alignment, and designed experiments, as well as more advanced techniques such as machine learning for predictive modeling and interactive data visualizations.
- Developed data tools and applications for use in production and R&D.
- Conducted experimental design and hypothesis testing for product development and product sustainability.
- Led an interdisciplinary team of scientists to introduce data collection and exploration for 2 implant cleaning systems.
- Built a data processing pipeline for industrial manufacturing lines. Built random forest classification models to predict the presence of implants in 12 different chemical processing stages.
- Designed an LSTM model for time-series forecasting of solution chemistry, predicting downstream rinse water conductivity for process optimization and sustainability.
- Parsed archived PDF documents to collect quality-related information and built a relationship graph tool.
- Tasked with managing the Residual Manufacturing Materials Management (RM3) program for all U.S.-based manufacturing locations.
- Manage and coordinate medical device validation testing projects with both internal and external customers.
- Analyze data and prepare test reports for established Quality Assurance programs.
- Perform investigative lab testing of medical devices to evaluate cleanliness.
- Patterned Arrays of Gold Nanoframes Using High Resolution E-Beam Lithography (Renewable Energy Lab, College of Engineering)
- Optical detection of hot electron induced dissociation of H2 on gold nanoparticles (Borguet Research Group, Department of Chemistry)
Skills
Education
Publications & Project Papers
Publications
Project Papers
- E-Weaver: Sustainable Clothing Aggregation and Recommendation System (Part 1: Data acquisition and EDA) — paper available upon request
- E-Weaver: Sustainable Clothing Aggregation and Recommendation System (Part 2: Modeling) (Colin Murphy, Matthew Merenich, Dayun Piao, Zifeng Wang) — paper available upon request
- Deep Learning and Chemometrics: Quantitative and Qualitative Spectroscopy Interpretation of Aqueous Solutions (Colin Murphy)
- The Cryptocurrency Filibuster: An Analysis of Blindly Signed Contracts as a Feather Forking Countermeasure (Colin Murphy)
- Classifying Short Text Reddit Posts (Colin Murphy, Mithila Guha, & Angela Mastrianni)
- Utilizing Machine Learning to Classify Chronic Disease: Quantifying Biopsychosocial Risk and Resilience (Colin Murphy, Meghan Colosimo, John Obuch, & Fatih Catpinar)
- Impact of the COVID-19 Pandemic on BIPOC populations (Colin Murphy, Madhumitha Santhana Krishnan, Nicholas Lawrence, Joshua Lister, & Rebecca Topper)
- Patterned Arrays of Gold Nanoframes Using High Resolution E-Beam Lithography (Colin Murphy, Maryam Hajfathalian, Carrigan Braun, & Svetlana Neretina)
Interests
Outside of the digital realm, I enjoy staying active by running, but my true passion is cycling. I have been an avid cyclist for over 10 years and prefer cycling as my mode of transportation in the city. I have also participated in the American Cancer Society Philadelphia Bike-a-thon (a fundraising event) for the past 10 years, biking 65-100 miles from Philadelphia to Atlantic City.
I also like to "geek out" over analog synthesizers and produce ambient music by experimenting with different soundscapes.
Like many others of my generation, I am very conscious of our impact on our environment and strive to incorporate sustainability into every aspect of my personal life, while also advocating for a more sustainable society as a whole.