Software Engineer

Machine Learning Engineer (LLM / Clinical NLP)

A remote, project-based contract to build and fine-tune the oncology AI model at the core of our product, and own the backend it runs on, alongside an early-stage clinical AI team.

Employment type: Contract, project-based

Location: Fully remote (United States)

Start date and duration: Defined per project, set by the contract scope

Work authorization: Applicants must be legally authorized to work in the United States. LUMIQO is unable to sponsor work visas for this role.

About LUMIQO

LUMIQO is an EHR-integrated, AI-powered platform that automates adverse event (AE) reporting in oncology clinical trials. Capturing and grading drug side effects today is slow, manual, and error-prone: 250+ hours per patient, costly data mistakes, and high staff burnout, all of which slow down the trials that bring new cancer treatments to patients. We replace that with a purpose-built oncology small language model that reads clinical notes, detects adverse events, grades them per the Common Terminology Criteria for Adverse Events (CTCAE), and produces export-ready safety logs, always with a human in the loop.

We are an early-stage, founder-led team with advisors from Johns Hopkins and senior AI engineers from companies like OpenAI. We are building toward our MVP, so the work you do here will shape the product from the ground up. We care about getting safety right, learning fast, and giving every team member real ownership.

About the Role

We are hiring a hands-on Machine Learning Engineer on a project-based contract to build the model and the backend it runs on. To be clear about what this is and is not: this is not a model-serving or MLOps-only role, and it is not a pure data-pipeline role. You will be training and fine-tuning the language model itself, building the detection logic around it, and writing the backend code that turns it into a working product.

You will work directly with the founders and our AI/ML advisors, with a short path from your idea to something shipped.

What You'll Do

Fine-tune and evaluate small and open language models (clinical or general LLMs adapted to the medical domain) to extract adverse events from oncology EHR notes.
Build the hybrid detection engine that combines rules with ML to identify AEs and assign CTCAE grades.
Develop and validate NER and text-classification components for clinical entities such as drugs, events, severity, and causality signals.
Write and own the backend: APIs, data-processing pipelines for clinical text, and the evaluation harnesses that wrap the model. You are the person who codes this.
Stand up rigorous evaluation: track sensitivity, specificity, and F1 on clinical datasets, and design the uncertainty handling and human-in-the-loop review workflow for cases where the model is not confident.
Work with synthetic and de-identified clinical data, following HIPAA-aware practices and keeping an audit trail for regulatory readiness.
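
For a concrete picture of the evaluation metrics named above, here is a minimal sketch of how sensitivity, specificity, precision, and F1 could be computed for a binary "adverse event present / absent" label. The names `BinaryMetrics` and `binary_metrics` are hypothetical and the labels are made up for illustration; this is not LUMIQO's evaluation harness.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    sensitivity: float  # recall on the positive class: TP / (TP + FN)
    specificity: float  # recall on the negative class: TN / (TN + FP)
    precision: float    # TP / (TP + FP)
    f1: float           # harmonic mean of precision and sensitivity


def _safe_div(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a class is absent from the data.
    return num / den if den else 0.0


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute metrics for labels where 1 = AE present and 0 = AE absent."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = _safe_div(tp, tp + fn)
    specificity = _safe_div(tn, tn + fp)
    precision = _safe_div(tp, tp + fp)
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    return BinaryMetrics(sensitivity, specificity, precision, f1)


if __name__ == "__main__":
    # Illustrative labels only, not real clinical data.
    gold = [1, 1, 1, 0, 0, 0, 0, 1]
    pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(gold, pred))
```

In a safety setting, sensitivity is usually the number to watch first, since a missed adverse event (a false negative) tends to cost more than a false alarm that a human reviewer can dismiss.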

What We're Looking For (Required)

We hire for skills, not years on a resume. Skills you have built through projects, research, coursework, internships, or work all count. At a minimum, we are looking for:

Strong programming skills, primarily in Python, with the ability to own backend code (APIs, services, data pipelines) end-to-end.
Hands-on experience fine-tuning language models (required). You have personally fine-tuned and evaluated transformer-based models, for example with LoRA / PEFT or full fine-tuning, whether through coursework, research, a personal project, or a job. Calling a hosted API is not enough on its own.
Solid ML fundamentals: named-entity recognition, text classification, dataset curation, and evaluation metrics (F1, precision and recall, sensitivity and specificity).
Working knowledge of the modern ML stack: PyTorch and Hugging Face Transformers, or close equivalents, with code versioned in Git.
Comfort working with messy biomedical or clinical text, or strong adjacent NLP work you can point to.
Soft skills that matter on a small team: clear communication, ownership, and comfort with ambiguity and a fast pace.

Education

A bachelor's degree in computer science, machine learning, computational biology, biomedical informatics, or a related field is helpful, but equivalent hands-on skill counts just as much. A master's degree is ideal but not required.

Nice to Have (Bonus)

Clinical or biomedical NLP exposure: BioBERT, PubMedBERT, or ClinicalBERT; datasets like MIMIC, n2c2 / i2b2, or BC5CDR.
Experience in HIPAA-regulated environments.
Published research or open-source model contributions.
Some MLOps or model-serving experience is a plus, but this is a build-the-model role, not an infrastructure role.

You do not need to check every box. If you are excited about this work and can show real, hands-on machine learning, we want to hear from you.

Benefits

Beyond compensation, this role offers:

Hands-on exposure to real-world AI in healthcare and clinical research, including the chance to explore emerging AI methods applied to clinical data.
Direct experience in an early-stage startup environment, including how an AI company operates day-to-day.
A seat in strategy discussions and cross-functional team meetings.
Networking with industry professionals through LUMIQO's partner network, including Johns Hopkins Technology Ventures.
Attendance at relevant industry conferences, with travel reimbursement.
A flexible, fully remote schedule that can accommodate academic commitments.

Equal Opportunity

LUMIQO is an equal opportunity employer. We are building a diverse team and welcome applicants of all backgrounds. We do not discriminate on the basis of race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, veteran status, or any other protected characteristic.