Job Description

Overview
Neutralis is building the learning brain for industrial heat‑pump plants. We fuse model‑based RL with digital twins and strict safety constraints to turn messy plant telemetry into better decisions, hour by hour. This is paper‑to‑plant work with real impact on energy, reliability, and decarbonization.
Challenge
Industrial plants are complex, safety‑critical, and non‑stationary. Off‑policy data, partial observability, actuator limits, drift, and human‑in‑the‑loop operations make naïve RL fail fast. Your mission is to own a safe, reproducible path from data to control: offline → simulated → shadow → live, with guardrails at every step.
Responsibilities

Own the RL/control roadmap: architect offline RL + model‑based control with a digital twin in the loop; define safety envelopes and verification gates.
Build the pipeline: data curation, policy learning, simulation/gym environments, evaluation harnesses, and promotion criteria from sim to plant.
Ship reproducible research to production: baselines, ablations, and clear experiment tracking; transform results into services/APIs.
Lead and mentor a 15–20 person cohort of MSc/PhD thesis students and research engineers; set standards for code, experiments, and writing.
Partner with domain experts (HVAC/OT/BMS) on constraints, actuation limits, failure modes, and alarm triage.
Land safety: define fallback controllers, interlocks, and shadow‑mode strategies; quantify risk and uncertainty.
Collaborate across the stack with our FastAPI services, time‑series store, and observability/MLOps tooling.
Communicate: write crisp technical notes, contribute to publications where useful, and present results to partners.

What you’ll bring

Track record shipping RL/controls for physical systems (energy, robotics, process, automotive, etc.).
Deep hands‑on skill in offline RL (e.g., CQL/IQL/TD3‑BC) and model‑based RL/MPC; comfort with system identification and constrained optimization.
Strong engineering in Python and PyTorch or JAX; experience with experiment tracking (MLflow/W&B), containers, and CI.
Rigor around evaluation and safety: distribution shift, uncertainty, guardrails, fallback policies.
Ability to lead, mentor, and scale a research‑engineering team.
Clear writing and stakeholder communication.
Degree in CS/EE/ME/Controls or equivalent experience.

Nice to have

Familiarity with OT/BMS/historians (OPC UA, Modbus, BACnet, PI), time‑series modeling, anomaly detection.
Experience with digital twins/simulation, domain randomization, and sim‑to‑real transfer.
MLOps on AWS; FastAPI; PostgreSQL plus a time‑series DB.
Italian language skills.

Why Neutralis

Hard problems, real plants: your work moves real energy, not just a leaderboard.
Ownership: technical stewardship from first principles to deployment.
Talent platform: lead a serious thesis cohort and shape a next‑gen team.
Impact: measurable COP uplift, energy savings, reliability gains.
Compensation: competitive package with meaningful equity; conference and equipment budget.

Location & working model
On‑site in Milan (primary). Some flexibility for exceptional candidates. Occasional visits to partner sites.
What success looks like (6–12 months)

A documented, reproducible RL pipeline from data → policy → evaluation → shadow.
Benchmarked policies that outperform baselines in sim and shadow with clear safety margins.
A mentored student cohort delivering publishable experiments and production‑ready components.
An agreed path to controlled live trials with partners.

How to apply
Apply on LinkedIn or send a short note with "RL — Senior", a link to work you’re proud of (GitHub/Google Scholar/website), and availability. DMs welcome.
Neutralis is an equal‑opportunity employer. We value clarity, safety, and results over pedigree. If you’ve shipped control systems that matter, we want to hear from you.

Senior Reinforcement Learning Builder
