Brief Description
We are looking for an experienced Linux engineer for a full-time position. The position's main focus will be supporting high-performance computing (HPC) clusters. Most of the Linux footprint in the environment is RHEL-based, deployed both on-prem and in the cloud. An additional focus will be supporting, building, and maintaining scientific software, which requires some basic development skills (scripting languages, makefiles, etc.). Candidates are expected to work with various internal technical teams to manage and further develop the growing infrastructure and to streamline the management and operation of the footprint.
Responsibilities
- Manage HPC cluster configuration and Linux-based computation servers, and provide performance tuning.
- Assist in resolving performance issues on the cluster and help with designing HPC jobs.
- Install, configure, and manage Linux systems throughout the environment using HPC-specific software.
- Apply consistent security configuration standards across the Linux infrastructure.
- Implement and maintain management and monitoring tools.
- Develop opportunities to streamline and automate deployment and configuration tasks.
- Provide technical support and guidance for Linux deployments.
- Build from source, install, configure, and manage Linux applications on the cluster.
Requirements
- Experience supporting production RHEL-like Linux servers and applications (5+ years), and general knowledge of common open-source applications such as NGINX, PostgreSQL, MariaDB, and Git
- General understanding of services like DHCP/DNS/NTP
- Experience with Ansible or similar configuration-as-code software (3+ years)
- Comfortable with configuring OS images from scratch
- Comfortable with building software from source code, with an understanding of dependencies, libraries, and linking
- Background in remote management of large on-prem infrastructure (multiple servers, switches, storage systems, etc.)
- Excellent communication and interpersonal skills
- Performance optimization skills in Linux
- English proficiency of at least B2+
- AWS: hands-on experience with production environments
- Hands-on experience with HPC cluster software: Bright Cluster Manager, Univa Grid Engine, Slurm, EasyBuild, Spack
- Python/R or similar scripting language
Conditions
- Competitive compensation
- Remote or office work
- Flexible working hours
- Healthcare benefits: medical insurance
- Continuous education, mentoring, and professional development programs
- A team with excellent technical expertise
- Certifications paid by the company
Contact Information