AI Cloud Solution Architect & Engineer

Neurons Lab, Rome, 20-12-2025

Summary

Location

Rome

Business Division

Contract type

Publication date

20-12-2025

Job Description

About The Project
Join Neurons Lab as an AI Cloud Solution Architect & Engineer, a hybrid role that combines strategic solution design with hands-on engineering. You’ll bridge the gap between client requirements and technical implementation, designing AI / ML architectures and then building them yourself using modern cloud infrastructure practices.
Our Focus
We specialize in serving Banking, Financial Services, and Insurance (BFSI) enterprise customers with stringent compliance, security, and regulatory requirements. You’ll work on mission-critical AI / ML systems where security architecture, data governance, and regulatory compliance are paramount.
Duration
Part-time long-term engagement with project-based allocations
Reporting
Reports directly to the Head of Cloud
Objective
Deliver end-to-end AI cloud solutions by combining architectural excellence with hands-on engineering capabilities:

Architecture & Design: Gather requirements, design cloud architectures, calculate ROI, and create technical proposals for AI / ML solutions
Engineering Excellence: Build production-grade infrastructure using IaC, develop APIs and prototypes, implement CI / CD pipelines, and manage AI workload operations
Client Success: Transform business requirements into working solutions that are secure, scalable, cost-effective, and aligned with AWS best practices
Knowledge Transfer: Create reusable artifacts, comprehensive documentation, and architectural patterns that accelerate future project delivery

KPIs
Architecture & Pre-Sales

Design and document 3+ solution architectures per month with comprehensive diagrams and specifications
Achieve 80%+ client acceptance rate on proposed architectures and estimates
Deliver ROI calculations and cost models within 2 business days of request

Engineering Delivery

Deploy infrastructure through IaC (AWS CDK / Terraform) with zero manual configuration
Create at least 3 reusable IaC components or architectural patterns per quarter
Implement CI / CD pipelines for all projects with automated testing and deployment
Maintain 95%+ uptime for production AI / ML inference endpoints
Document architecture and implementation details weekly for knowledge sharing

Quality & Best Practices

Ensure all solutions pass AWS Well-Architected Review standards
Deliver comprehensive documentation within 1 week of architecture completion
Create simplified UIs / demos for PoC validation and client presentations

Areas of Responsibility
Solution Architecture (40%)
Requirements & Design

Elicit and document business and technical requirements from clients
Design end-to-end cloud architectures for AI / ML solutions (training, inference, data pipelines)
Create architecture diagrams, technical specifications, and implementation roadmaps
Evaluate technology options and recommend optimal AWS services for specific use cases

Business Analysis

Calculate ROI and TCO, and perform cost-benefit analysis for proposed solutions
Estimate project scope, timelines, team composition, and resource requirements
Participate in presales activities: technical presentations, demos, and proposal support
Collaborate with sales team on SOW creation and customer workshops

Strategic Planning

Design for scalability, security, compliance, and cost optimization from day one
Create reusable architectural patterns and reference architectures
Stay current with AWS AI / ML services and emerging cloud technologies

Cloud Engineering & AI Infrastructure (60%)
Infrastructure as Code Development

Build and maintain cloud infrastructure using AWS CDK (primary) and Terraform
Develop reusable IaC components and modules for common patterns
Implement infrastructure for AI / ML workloads: GPU clusters, model serving, data lakes
Manage compute resources: EC2, ECS, EKS, Lambda, SageMaker compute instances

Application Development

Develop Python applications: FastAPI backends, data processing scripts, automation tools
Create prototype interfaces using Streamlit, React, or similar frameworks
Build and integrate RESTful APIs for AI model serving and data access
Implement authentication, authorization, and API security best practices

AI / ML Operations (MLOps)

Deploy and manage AI / ML model serving infrastructure (SageMaker endpoints, containerized models)
Build ML pipelines: data ingestion, preprocessing, training automation, model deployment
Implement model versioning, experiment tracking, and A/B testing frameworks
Manage GPU resource allocation, training job scheduling, and compute optimization
Monitor model performance, inference latency, and system health metrics

DevOps & Automation

Design and implement CI / CD pipelines using GitHub Actions, GitLab CI, or AWS CodePipeline
Automate deployment processes with infrastructure testing and validation
Implement monitoring, logging, and alerting using CloudWatch, Prometheus, Grafana
Manage containerization with Docker and orchestration with Kubernetes / ECS

Data Engineering

Build data pipelines for AI training and inference using AWS Glue, Step Functions, Lambda
Design and implement data lakes using S3, Lake Formation, and data cataloging
Implement automated and scheduled data synchronization processes
Optimize data storage and retrieval for ML workloads

Security & Compliance

Implement cloud security best practices: IAM, VPC design, encryption, secrets management
Build enterprise security and compliance strategies for AI / ML workloads
Ensure solutions meet regulatory requirements (PCI-DSS, GDPR, SOC2, MAS TRM, etc., where applicable)
Conduct security reviews and implement remediation strategies

Cost & Performance Optimization

Optimize cloud spend for compute-intensive AI workloads
Implement spot instance strategies, auto-scaling, and resource scheduling
Monitor and optimize GPU utilization, inference latency, and throughput
Perform cost analysis and implement cost-saving measures

Operations & Support

Implement disaster recovery procedures for AI models and training data
Manage backup strategies and business continuity planning
Troubleshoot and resolve production issues in AI infrastructure
Provide technical guidance to project teams during implementation

Skills
Cloud Architecture & Design

Strong solution architecture skills with the ability to translate business requirements into technical designs
Experience with AWS Well-Architected reviews and remediation
Deep understanding of AWS services, particularly compute, storage, networking, and AI / ML services
Experience designing scalable, highly available, and fault-tolerant systems
Ability to create clear architecture diagrams and technical documentation
Cost modeling and ROI calculation capabilities

Technical Leadership

Comfortable leading technical discussions with clients and stakeholders
Ability to guide engineers and share knowledge effectively
Strong problem-solving and analytical thinking skills
Experience with architectural decision-making and trade-off analysis

Programming & Development

Advanced Python programming: object-oriented design, async programming, testing
API development with FastAPI, Flask, or similar frameworks
Frontend development basics: React, etc. (for prototypes and demos built with AI code generation tools)
Shell scripting for automation and deployment
Git version control and collaborative development workflows

Infrastructure as Code

AWS CDK (required); CloudFormation experience is valuable
Terraform (highly preferred) for multi-cloud or hybrid scenarios
Understanding of IaC best practices: modularity, reusability, testing
Experience with infrastructure testing and validation frameworks

AI / ML Infrastructure

Hands-on experience with AWS SageMaker: training jobs, endpoints, pipelines, notebooks
Understanding of ML lifecycle: data preparation, training, deployment, monitoring
Experience with GPU management and optimization for training / inference
Knowledge of containerization for ML models (Docker, container registries)
Familiarity with ML frameworks: PyTorch, TensorFlow, LangChain, LlamaIndex, etc.

DevOps & Automation

CI / CD pipeline design and implementation (GitHub Actions, GitLab CI, AWS CodePipeline)
Container orchestration: Docker, Kubernetes, Amazon ECS
Configuration management and deployment automation
Monitoring and observability: CloudWatch, Prometheus, Grafana, ELK stack

Communication & Collaboration

Excellent written and verbal communication in English (advanced level)
Ability to explain complex technical concepts to non-technical stakeholders
Comfortable with client-facing presentations and technical demos
Strong documentation skills with attention to detail
Collaborative mindset with the ability to work across functional teams

Problem-Solving

Advanced task breakdown and estimation abilities
Debugging and troubleshooting complex distributed systems
Performance optimization and tuning
Incident response and root cause analysis

Knowledge
AWS Cloud Platform (Required)

AWS Certified Solutions Architect Associate (minimum requirement)
AWS Certified Solutions Architect Professional or AWS Certified Machine Learning - Specialty (highly preferred)
Deep knowledge of core AWS services:
Compute: EC2, Lambda, ECS, EKS, SageMaker
Storage: S3, EFS, EBS, FSx
Networking: VPC, Route53, CloudFront, API Gateway, Load Balancers
AI / ML: SageMaker, Bedrock, Rekognition, Textract, Comprehend, Lex, Polly
Data: RDS, DynamoDB, Redshift, Glue, Athena, Kinesis
Security: IAM, KMS, Secrets Manager, Security Hub, GuardDuty
DevOps: GitHub Actions, CodePipeline, CodeBuild, CodeDeploy, CloudFormation, CDK, Terraform

AI / ML Technologies

Understanding of machine learning concepts and model training / deployment lifecycle
Familiarity with Generative AI technologies: LLMs, RAG, vector databases, prompt engineering
Knowledge of ML frameworks and libraries: PyTorch, TensorFlow, scikit-learn, pandas, numpy
Experience with MLOps practices and tools
Understanding of model serving patterns: real-time vs batch inference

Software Development

Modern software development practices: testing, code review, documentation
API design principles: RESTful, GraphQL, event-driven architectures
Database design and optimization: SQL and NoSQL
Authentication and authorization: OAuth, JWT, IAM

DevOps & Infrastructure

Linux / UNIX system administration
Networking fundamentals: TCP / IP, DNS, HTTP / HTTPS, load balancing
Security best practices for cloud environments
Disaster recovery and business continuity planning

Industry Knowledge

Understanding of cloud consulting delivery models
Familiarity with agile / scrum methodologies
Awareness of compliance frameworks: GDPR, HIPAA, SOC2, ISO27001
Knowledge of FinTech or other regulated industries (a plus)

Additional Knowledge (Preferred)

Azure or GCP certifications and experience
Multi-cloud architecture patterns
Serverless architecture patterns
Data engineering and data lake design
Cost optimization strategies and FinOps practices

Experience
Cloud Engineering & Architecture

5+ years in cloud engineering, DevOps, or solution architecture roles
3+ years hands-on experience with AWS services and architecture
Proven track record of designing and implementing cloud solutions from scratch
Experience with both greenfield projects and cloud migration initiatives

AI / ML Infrastructure

2+ years working with AI / ML workloads on cloud platforms
Hands-on experience deploying and managing ML models in production
Experience with GPU-based compute for training or inference
Understanding of AI / ML infrastructure challenges and optimization techniques

Infrastructure as Code

3+ years building infrastructure using IaC tools (AWS CDK, Terraform, CloudFormation)
Experience creating reusable IaC modules and components
Track record of infrastructure automation and standardization

Software Development

4+ years programming experience in Python (required)
Experience building APIs with FastAPI, Flask, or similar frameworks
History of creating prototypes, MVPs, or PoC applications
Comfortable with full-stack development for demos and prototypes

DevOps & Automation

3+ years implementing CI / CD pipelines and deployment automation
Experience with containerization (Docker) and orchestration (Kubernetes / ECS)
Linux / UNIX system administration experience
Monitoring and observability implementation

Client-Facing Work

Experience gathering requirements and translating them into technical solutions
History of presenting technical architectures to clients and stakeholders
Participation in presales activities, demos, or technical workshops
Ability to work directly with customers to solve complex problems

Industry Experience (Preferred)

Consulting or professional services background
Experience in regulated industries (FinTech, insurance, banking)
Work with enterprise clients on large-scale implementations
Startup or fast-paced environment experience
