
    Bucharest, RO
9-9A Dimitrie Pompeiu Boulevard
020335 Bucharest
Romania

Site Reliability Engineer

This is US
Our ambition is to be the leading European provider of Enterprise Service Management Software!
By using our platform, customers can manage IT and business processes, assets, endpoints, and identities for improved productivity, agility, security, and employee experience. By enabling digital working environments and IT self-service through holistic integration and automated processes, we digitalize and automate our customers' everyday tasks. Join our diverse team of over 600 professionals spread across Europe!

We care deeply about our people and the work we do. Our culture is built on strong values, and customer success is our top priority. Start your development journey with us: whether your goal is personal or professional growth, we want you to reach your full potential through personalized goals and a life that you love.
We want to do the right things - and do them right!

Remote work is an essential part of everyday life, though we also deeply value the magic that happens when we all come together. Do you want to be part of building the European leader in service management software, and work in a culture that inspires you to grow? At Matrix42, you can!

YOUR MISSION

As an SRE on the product squad, you will treat operations as a software problem. Our mission is to protect our cloud products with an ever-watchful eye on their availability, latency, performance, and capacity.

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, distributed, fault-tolerant systems. SRE ensures that Matrix42's services, both internally critical and externally visible, have the reliability and uptime appropriate to users' needs and a fast rate of improvement.

SREs also keep a close watch on our systems' capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure, and eliminating manual work through automation.

As an SRE, you'll have the opportunity to manage the complex challenges of scale, while using your expertise in coding, algorithms, complexity analysis and large-scale system design.

  • Engage in and improve the whole lifecycle of services--from inception and design, through deployment, operation, and refinement.
  • Support services before they go live through activities such as system design & architecture consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
  • Make informed trade-offs between security, performance, and maintainability.
  • Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
  • Research and upskill in new technologies.
  • Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
  • Practice sustainable incident response and blameless retrospectives (postmortems).
  • Collaborate with security teams to implement and maintain security best practices and measures.
  • Ensure compliance with industry regulations and standards related to security and data privacy.

MUST HAVES

  • 5+ years of experience as a Site Reliability Engineer or in a comparable cloud engineering position with Microsoft Azure or AWS.
  • Excellent understanding of running production container-based workloads in the public cloud using a major provider (Azure, AWS, GCP, etc.) and cloud architecture design.
  • Extensive experience building production observability mechanisms (system health, metrics aggregation, log management, dashboards) using third-party tools such as Prometheus, Grafana, the ELK stack, or New Relic, or cloud-provider-native tools such as Azure Monitor, Azure Application Insights, Azure Log Analytics, AWS CloudWatch, and AWS X-Ray.
  • Strong database skills (MSSQL, Azure SQL, MongoDB).
  • Good knowledge of Linux and networking.
  • A network-design-first mindset in cloud environments for security reasons: private access and isolation using VNets, Application Gateway, and Traffic Manager.
  • Knowledge of infrastructure operations and current cloud technologies.
  • Experience with IaC tooling and willingness to work with Bicep.
  • Good scripting skills (Bash/PowerShell/Python).
  • Experience working in an empowered product team, an agile environment, and a DevOps culture.

NICE TO HAVES

  • Knowledge of Elasticsearch (Elastic Stack)
  • Familiarity with SLI/SLO/SLA concepts
  • Experience with Microsoft Cloud Adoption Framework for Azure
  • Experience building hybrid cloud solutions
  • Experience setting up and operating alerting for production workloads
  • Ability to explore infrastructure topics from scratch and create PoCs

FOR YOU

We could tell you all about the 25 days of vacation or the flexible working hours that are part of everyday life, but in our eyes those are not benefits; they are standard. Here are some of the benefits we offer:

Learning & Development Opportunities

- Up to 6 additional days off for personal or professional development

- Access to our online platforms to expand your knowledge or improve your language skills

One Social Day for you to help out in social settings or attend events that help improve our environment.

The option to choose the benefits that work for you, such as a fitness membership, a retirement plan, or meal tickets.

And many more. Ask us about them!

JOIN US

Send us your application, including your salary requirements and earliest possible starting date, directly through our online portal via the "APPLY NOW" button. If you have any questions, please do not hesitate to contact Irina Neculae.

Please understand that, due to current EU data protection regulations, Matrix42 can only accept applications submitted online through the applicant portal of our applicant management system.
