Job ID: JR0172501
Job Category: Engineering
Primary Location: Albuquerque, NM US
Other Locations:
Job Type: Experienced Hire

HPC & AI Systems Administrator

Job Description


The HPC & AI Systems Administrator has deep technical knowledge of the design and deployment of data centers and their associated subsystems. This can include expertise in data center layout, mechanical design systems, cooling, power delivery, and other critical areas of data center design. The deliverables of the role may take the form of designs for Intel's data centers, support for customers designing their own data centers, or the development of new products and technologies based on data center design expertise.

HPC Frontier Lab / CRT-DC runs Endeavour, the Intel High Performance Computing benchmarking cluster. Endeavour is our largest and most renowned cluster; it showcases Intel Architecture and supports deals, development, performance optimization, and much more. We are system integrators of future platforms. We also host other clusters to support HPC, Cloud, and Enterprise, as well as clusters for Technology, Pathfinding, and Innovation.

The HPC & AI Systems Administrator will be responsible for:

  • Providing support and maintenance of large cluster hardware and software for high availability, consistency, and optimized performance.
  • Managing various operating systems.
  • Supporting hardware such as rack-mounted servers and workstations.
  • Supporting the latest Intel HPC data center technologies, including servers, fabric, and storage.
  • Utilizing their skills in the areas of cluster debugging, Linux scripting, cluster validation tests, server expansion, and file system tests and benchmarks.
  • Serving as a consultant for all projects and customers of the CRT Datacenter, and creating and improving methodologies used in the datacenter to enhance the performance, reliability, and manageability of the CRT clusters.

The ideal candidate should exhibit the following behavioral skills:

  • Relationship management
  • Effective influencing
  • Agile written and verbal communicator


Qualifications

Minimum qualifications are required to be initially considered for this position. Preferred qualifications are in addition to the minimum requirements and are considered a plus factor in identifying top candidates.

Minimum Qualifications:

  • Bachelor's degree in Computer Science, Computer Engineering or any other related field and 4+ years of experience OR Master's degree in Computer Science, Computer Engineering or any other related field and 3+ years of experience
  • 3+ years of Linux experience supporting complex servers
  • 3+ years of experience installing and managing Linux operating systems on a server
  • 3+ years of experience administering a Linux server for multiple users
  • 3+ years of experience managing several identical Linux servers in a cluster
  • 1+ year of experience with the technical concepts, architecture, systems, development methods, and disciplines associated with the defined program, and with using that knowledge to accelerate project completion


Preferred Qualifications:

  • Programming in at least one of the following languages: C, Python, or Bash
  • Experience managing cluster systems with 100+ nodes
  • Experience with Gigabit Ethernet
  • Experience with high performance interconnects, preferably Mellanox InfiniBand or Omni-Path
  • Experience managing HPC clusters with discrete GPUs (Nvidia, AMD or Intel)
  • Experience with containers (Singularity, Podman, Charliecloud, Docker, Kubernetes, others)
  • Experience administering high performance cluster file systems (Lustre, GPFS, others)
  • Experience with supporting AI frameworks (TensorFlow, others)
  • Experience with Extreme or Cisco network hardware setup and configuration
  • Experience with MPI libraries, preferably Intel MPI
  • Experience writing HPC applications
  • Experience with containerization as it pertains to HPC / AI workloads
  • Experience with virtualized networks
  • Experience managing Cloud based cluster systems
  • A+ certification or equivalent experience
  • RHCSA certification or equivalent
  • CCENT certification or equivalent
  • RHCE certification or equivalent
  • CCNA certification or equivalent

The requirements listed would be obtained through a combination of industry-relevant job experience, internship experience, and/or schoolwork, classes, or research.

Inside this Business Group

Intel Architecture, Graphics, and Software (IAGS) brings Intel's technical strategy to life. We have embraced the new reality of competing at a product and solution level—not just a transistor one. We take pride in reshaping the status quo and thinking exponentially to achieve what's never been done before. We've also built a culture of continuous learning and persistent leadership that provides opportunities to practice until perfection and filter ambitious ideas into execution.



Posting Statement

All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.
