Infrastructure/Automation Manager

Full Time
Irvine, CA 92612
Job description

Essential Functions:

  • Hires, trains, manages, develops, assigns, evaluates, and sets goals for department and staff. Conducts performance evaluations, recognizes and acknowledges positive and productive behavior, and provides constructive/corrective feedback for performance issues.
  • Responsible for managing the day-to-day operations of a large-scale enterprise corporate infrastructure to ensure high availability and uptime of business-critical hardware and software.
  • Leads, develops, and implements an enterprise-wide instrumentation strategy to support real-time observability, health checks, and remediations.
  • Troubleshoots, isolates, and diagnoses network/systems and production issues/problems, independently and/or in collaboration with cross-functional teams.
  • Proactively monitors and analyzes alert information, initiates corrective actions to promptly isolate issues and restore service, implements changes, and provides analysis and consultation to appropriate teams.
  • Monitors and remediates an extensive IT environment (databases, servers, voice, network, on-premises data center, AWS infrastructure, and business-critical applications) using Datadog, SolarWinds, and other enterprise monitoring tools.
  • Manages application performance monitoring (APM) tools and ensures they are set up properly to monitor metrics, traces, and logs. Ensures completion of monitoring, logging, and alerting activities.
  • Owns and utilizes PagerDuty for on-call and/or escalation orchestration.
  • Collaborates with Engineering teams to ensure applications are emitting the right metrics, debugs issues to identify improvements, automates tool visibility and usability, and enables those teams to quickly set up instrumentation.
  • Develops operational dashboards and reviews reports tracking key performance indicators (KPIs) and trends to present to team and management.
  • Promotes automation to replace manual processes and implements changes to improve processes and workflows.
  • Updates and manages tickets through resolution with other internal support teams.
  • Performs preventive maintenance tasks and creates, maintains, and updates Ops runbooks.
  • Participates in the department budget planning process and manages the team’s budget accordingly.
  • Reviews and updates department procedures. Participates in the creation and revision of department policies.
  • Analyzes trends from customer requests, and recommends and develops improvements to overall service levels or department processes to ensure high-quality service delivery at every step. Drives support metrics by documenting end-user interactions.
  • Manages team’s workload using Jira, ServiceNow, and/or other solutions used by the Company.
  • Contributes to and encourages team to create knowledge base articles of problems and solutions.
  • Provides timely notifications to management/outage team and escalates tickets to appropriate management as needed.
  • Manages vendor escalations for P1/P2 outages and recurring issues with any part of the global infrastructure.
  • Provides updates to senior management through email and phone calls about pending tasks and projects.
  • Participates in the development, evaluation, and execution of the company’s strategic plans.
  • Collaborates with Product Owners in an Agile environment to meet project deadlines.
  • Provides feedback to Agile Scrum Product Owners on application improvements.
  • Monitors/follows up with Agile Scrum team Product Owners on backlog item status.
  • Performs other related duties and projects as business needs require, at the direction of management.

Education and Experience:

  • Bachelor’s Degree in Information Technology, Computer Science, or related field preferred.
  • Minimum seven (7) years of experience in IT infrastructure.
  • Minimum three (3) years of supervisory or managerial experience in an enterprise environment, or any equivalent education and/or experience from which comparable knowledge, skills and abilities have been demonstrated/achieved.
  • Hands-on, advanced-level experience with virtualization (VMware or Nutanix) and core infrastructure systems.
  • Experience with monitoring systems and APMs including but not limited to SolarWinds, Datadog, AppDynamics, CloudWatch, and Dynatrace.
  • Automation experience (PowerShell, scripting) preferred.
  • Experience working in a DevOps culture.

Job Type: Full-time

Pay: $150,000.00 - $180,000.00 per year

Benefits:

  • 401(k)
  • Dental insurance
  • Flexible schedule
  • Health insurance
  • Paid time off
  • Vision insurance

Experience level:

  • 5 years

Schedule:

  • 8-hour shift

Ability to commute/relocate:

  • Irvine, CA 92612: Reliably commute or plan to relocate before starting work (Required)

Experience:

  • APM: 3 years (Required)
  • Automation: 2 years (Required)

Work Location: In person
