Senior Software Engineer Manager, Risk Engineering Operations
Galileo Financial Technologies
Who we are:
Welcoming, collaborative, and full of opportunities to make an impact: that is how our employees describe working here. Galileo is a financial technology company that provides innovative software products and services that power some of the world's largest Fintechs. We are the only payments innovator that applies tech and engineering capabilities to empower Fintechs and financial institutions to unleash their full creativity and achieve their most inspired goals. Galileo leads its industry with superior fraud detection, security, decision-making analytics, and regulatory compliance functionality, combined with customized, responsive, and flexible programs that accelerate the success of all payments companies and solve tomorrow's payments challenges today. We hire energetic and creative employees and give them the opportunity to excel in their careers and make a difference for our clients. Learn more about us and why we work here at https://www.galileo-ft.com/working-at-galileo.
The role:
We are seeking a highly capable, hands-on software engineering lead to establish and manage our new Level 1 and Level 2 (L1/L2) support team for IRM Analytics production applications. The role is crucial for ensuring the reliable operation of our systems and for freeing our core engineering team to deliver new development initiatives. You will be responsible for day-to-day hands-on support and small enhancements, and for managing a nearshore team focused on operations support and service delivery.
What you’ll do:
- Lead and mentor a team of 2–3 support engineers, growing and scaling the team as production applications are onboarded.
- Establish and enforce a clear L1 → L2 → L3 escalation process, defining thresholds for when issues are elevated to core engineering (L3).
- Oversee all ticketing and tracking processes, ensuring all incidents, service requests, and enhancements are logged and managed in Jira.
- Define and deliver weekly or bi-weekly reporting on key metrics, including Mean Time to Resolution (MTTR), ticket volume, resolution times, and escalation ratio.
- Ensure the team maintains a high level of responsiveness and communication quality, meeting SLA adherence targets for issue acknowledgment.
Hands-on Technical Support:
- Act as the primary Level 2 technical resource, investigating and resolving complex production issues that Level 1 triage cannot handle.
- Perform root cause analysis (RCA) and implement timely fixes or workarounds to restore service.
- Coordinate across the team to execute small enhancements, configuration changes, and non-disruptive updates to production applications.
- Create and maintain comprehensive runbooks, playbooks, and FAQs to establish a robust knowledge base and shorten resolution time for recurring issues.
What you’ll need:
- Proven experience in a hands-on operations support or application support role (L2 equivalent).
- Strong end-to-end technical proficiency in the following:
  - Data engineering: SQL, Snowflake, Python scripting, dbt, Airflow, and Airtable.
  - Software engineering: solid DevOps and infrastructure-as-code experience using ArgoCD and GitLab for deployment, CI/CD, and environment management, plus REST APIs and React.
- Familiarity with data observability tools and practices.
- Knowledge of cloud-native data platforms (e.g., AWS, GCP).
- Familiarity with monitoring tools and ticketing/alerting systems such as Jira, ServiceNow, and Slack-based alerting.
- Exceptional communication and client-facing interaction skills to interface effectively with risk management stakeholders.
- Experience managing, or comfort with managing, a remote or nearshore team that works in alignment with US time zones.
- Advanced level of English (mandatory).
Success criteria:
Success in this role will be measured by the following team performance indicators:
- Support Team Autonomy: The L1/L2 support team resolves >70% of tickets without escalation to L3 (core engineering).
- SLA Adherence: At least 95% of issues are acknowledged within 24 hours.
- Stakeholder Satisfaction: Positive feedback from business stakeholders on responsiveness and reliability.
- Core Engineering Efficiency: Core IRM Analytics engineers spend >70% of their time on new builds and development.