The Tenstorrent team combines technologists from different disciplines who come together with a shared passion for AI and a deep desire to build great products. We value collaboration, curiosity, and a commitment to solving hard problems.
Tenstorrent is seeking a High-Performance Computing (HPC) Systems Engineer to support Accelerated ML Storage platforms. You will focus on delivering ML storage services, with an emphasis on multi-tenant cloud storage requirements. Duties include administering both high-speed and archival cloud storage services. You will also be responsible for understanding workload bottlenecks and working with all necessary teams to drive resolution.
- Build and maintain high-performance storage environments designed for multi-tenant HPC cloud
- Work closely with other AI/ML Engineers and Data Engineering Subject Matter Experts
- Work with Central IT, Cybersecurity, and Engineering teams for both on-premises and cloud deployments
- Ensure user and technical issues are promptly prioritized and resolved
- Communicate effectively with cloud tenants as required
- Monitor resource usage and plan for increased capacity
- Additional responsibilities assigned from time to time
- Bachelor’s Degree in a related discipline or equivalent experience, with 3 years of professional experience
- Ceph Certified Specialist or equivalent experience
- Strong sense of urgency, client orientation, and the ability to maintain positive partnerships
- Experience with Ceph, Swift, Lustre, NFS, S3, and/or high-performance storage
- Experience with performance measuring/modelling of high-performance storage
- Experience with OpenStack
- Automation using tools such as Ansible and Puppet, and scripting in Bash and Python
- Willing to roll up your sleeves and help out with hardware and software issues
- Experience with Lustre, Weka, VAST, and/or Spectrum Scale
- Familiarity with Container Storage, including Container Storage Interfaces (CSI) and Persistent Volumes
- Familiarity with Infrastructure Automation
- Familiarity with data center design, including server hardware, rack diagrams, power, and cooling requirements
- Knowledge of monitoring and performance tools, such as Prometheus, Grafana, Dynatrace, and Sysdig
Austin, TX; Santa Clara, CA
Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.