Site Reliability Engineer
Job Description
Intel is seeking well-rounded Site Reliability Engineers to develop automated solutions for monitoring, performance and capacity planning, high availability, and disaster response for on-prem, multi-server, clustered, mission-critical applications. This individual will work with a team of developers and other infrastructure engineers to design, implement, and maintain our private-cloud infrastructure.
Primary Duties and Responsibilities:
- Engage in and improve the whole lifecycle of services: designing, building, deploying, monitoring, performance tuning, and supporting clustered multi-tier applications.
- Develop and implement operational performance and reliability indicators for the applications.
- Synthesize operational metrics and incidents to drive actions that improve performance and reliability.
- Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews. Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
- Drive improvements to reduce complexity with clean and simple implementations.
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
- Practice sustainable incident response and blameless postmortems.
- Work as a self-starter who can manage ambiguity with a bias for action.
Key Skills Required:
- Infrastructure and application configuration automation, monitoring, and visualization using industry-standard tools.
- Strong hands-on knowledge of Linux fundamentals, system administration scripting, performance tuning and scalability, and troubleshooting.
- Knowledge of tuning database engines (e.g. PostgreSQL) on Linux.
- Experience architecting CI/CD and DevOps solutions.
- Linux/Windows systems administration experience, familiarity with storage technologies, and user-level knowledge of firewalls and load balancers.
- Strong knowledge of network protocol layers and their configuration and tuning.
- Knowledge of storage fundamentals (block/object, direct-attached/networked solutions).
- Prior experience running large-scale clustered enterprise Linux applications with a proven DevOps mindset is a plus.
Qualifications
Inside this Business Group
Intel's Information Technology Group (IT) designs, deploys and supports the information technology architecture and hardware/software applications for Intel. This includes the LAN, WAN, telephony, data centers, client PCs, backup and restore, and enterprise applications. IT is also responsible for e-Commerce development, data hosting and delivery of Web content and services.
IN, Experienced Hire, JR0147533