Systems Administrator – Cloud

Université McGill

This is a contract position in Montreal, QC, posted August 3, 2022.

Position Summary/Description

McGill University is seeking a Systems Administrator, Cloud (OpenStack) Systems to take a significant role in the operation and maintenance of present initiatives, and in planning for future initiatives, in the area of Advanced Research Computing (ARC).

Reporting to the Associate Director, Operations of the McGill High Performance Computing Centre (“HPC Centre”), the incumbent will work within the Calcul Québec organization and join a vibrant team of HPC systems administrators and analysts across several Quebec institutions. It is an opportunity to work with leading-edge technology and with a team that has installed and operated supercomputers in the Top 500.

McGill is a founding member of Calcul Québec, a consortium of Québec universities whose objective is to provide advanced research computing (ARC) to the research community, including leading-edge HPC data centres and highly qualified computing experts. More than 600 research groups use the resources made available by Calcul Québec to conduct research in various fields. Calcul Québec is a Regional Partner of the Digital Research Alliance of Canada, the non-profit organization in charge of coordinating ARC efforts throughout Canada. The ARC environment includes an HPC cluster of over one thousand nodes with a mixture of CPU and GPU processors, a cloud environment, and 50+ petabytes of Lustre parallel file systems with backup and archive capabilities. The system also includes 90+ petabytes of tape storage.

Main Duties and Responsibilities

The systems administrator will be responsible for the core operations, maintenance, and growth planning of the OpenStack platform.

Specific tasks:

  • Investigate, pilot and recommend unique hardware and software solutions.
  • Make decisions and recommendations to Management regarding major acquisitions.
  • Use programming skills to automate common and repetitive administration tasks. Create procedures whereby tasks can be delegated to users or user support groups (e.g., Help Desk).
  • Ensure systems and technology are upgraded to remain technically current.
  • Develop and maintain software and processes to enhance systems’ maintainability, functionality, security and integrity. Contribute features and bug fixes to relevant open source software projects.
  • Set up performance and security monitoring.
  • Develop and maintain documentation of services, configurations and procedures.
  • Provide support to clients; identify, research and resolve technical issues; track and monitor problems and escalations to ensure timely resolutions.
  • Diagnose and ensure correct operation of server and storage systems in consultation with vendors and other technical staff.
  • Monitor the status of alerts, tickets and processes to ensure timely completion of tasks and resolution of issues.
  • Communicate and collaborate with clients regarding impacts and implications of system failures, maintenance, and cyber security incidents.
  • Manage national projects and teams, and provide support to researchers and technical experts from other institutions.
  • Make decisions and recommendations to senior management for procedural changes and improvements.
  • Maintain and evolve OpenStack and Ceph.
  • Perform daily routine maintenance such as reviewing activity logs and performing storage pool maintenance.
  • Troubleshoot and resolve server and client issues.
  • Manage future software upgrades.
  • Deliver reliable, high-performance access to storage in a high-availability environment, including multipath and automated failover for storage operations.

Qualifications/ Requirements

At least 3 years of experience in a large enterprise environment containing hundreds of server, storage and network elements operating in a clustered setup.

Demonstrated expertise in the following areas:

  • Linux systems administration including Debian or Ubuntu.
  • Deployment tools – Ansible (openstack-ansible)
  • Git and review process
  • Shell scripting and other scripting languages such as Python
  • Automation and monitoring of systems administration tasks

Knowledge in the following areas is considered an asset:

  • Ceph (including CephFS, radosgw, RBD)
  • Linux container technology (LXC/LXD)
  • Virtual machines (libvirt/QEMU)
  • Database management including MySQL/Galera
  • Open vSwitch (https://www.openvswitch.org/), or OVN

Other required skills:

Attention to detail in the level of work performed, taking pride, responsibility and a sense of ownership for the successful operations of the systems under their administration and the availability and reliability of those systems in support of all research users. Advanced problem-solving skills. Good oral and written communication skills in both French and English. Ability to work in complex technical environments.

Ability to work effectively with multiple concurrent tasks and priorities to achieve successful outcomes. Ability to take supervisory and management direction and to complete tasks effectively with little direct supervision. Ability to work effectively with a distributed team in a collaborative environment across Quebec and Canada. Ability to work cooperatively with a diverse team of professionals, acting as a technical resource for others in the team, and to work with other staff on projects of significant importance and value to the organization and to the clients we serve. Ability to identify problems and resolve issues in a complex environment. A demonstrated aptitude for learning new technologies.

Minimum Education and Experience:

Bachelor’s Degree, 3 Years Related Experience

Annual Salary:

(MPEX Grade 05) $65,500.00 – $81,870.00 – $98,240.00

Hours per Week:

33.75 (Full time)

Supervisor:

Associate Director Operations

Position End Date:

2024-03-31

Deadline to Apply:

2022-08-31

McGill University hires on the basis of merit and is strongly committed to equity and diversity within its community. We welcome applications from racialized persons/visible minorities, women, Indigenous persons, persons with disabilities, ethnic minorities, and persons of minority sexual orientations and gender identities, as well as from all qualified candidates with the skills and knowledge to productively engage with diverse communities. McGill implements an employment equity program and encourages members of designated groups to self-identify. Persons with disabilities who anticipate needing accommodations for any part of the application process may contact, in confidence, accessibilityrequest.hr@mcgill.ca.

Package

Salary: 65500.00-81870.00