AI is redefining the computing paradigm. The computational demands of this new paradigm cannot be met by existing software and hardware standards. The best AI solutions require unifying innovations in the software programming model, compiler technology, heterogeneous computation platforms, networking technology, and semiconductor process and packaging technology. Tenstorrent drives these innovations by taking a holistic view of each software and hardware component and unifying them to create the best AI platform.
As a performance architect on the dynamic and motivated Tenstorrent Platform Architecture team, you will work cross-functionally on ML software stacks, HPC and general-purpose workloads, graph compilers, cache coherency protocols, superscalar CPUs, fabric/interconnect, networking, and DPUs.
- Collaborate with the software and platform architecture teams to understand CPU hardware requirements for the AI accelerator compiler, OS, video/image/voice processing, security, networking, and virtualization technology. Identify application performance bottlenecks and functional requirements.
- Identify representative benchmarks for the software applications. Perform data-driven analysis based on simulation or analytical models to evaluate software, architecture, and microarchitecture solutions that improve performance and power efficiency or reduce hardware cost.
- Set CPU architecture direction based on the data analysis, and work with a cross-functional team to achieve the best hardware/software solutions that meet power, performance, and area (PPA) goals.
- Develop a cycle-accurate CPU model that describes the microarchitecture, and use it to evaluate new features.
- Collaborate with RTL and physical design engineers to make power, performance, and area trade-offs.
- Drive analysis and correlation of performance features both pre- and post-silicon.
Experience and qualifications:
- BS/MS/PhD in EE/ECE/CE/CS
- Strong background in CPU ISAs, microarchitecture research, and performance benchmarks.
- Understanding of SoC fabric, coherency protocols, memory technology, and accelerator technology is a plus.
- Prior experience with, or strong understanding of, ML/AI algorithms, compilers, and OS kernels is a plus.
- Proficiency in C/C++ programming, with experience developing highly efficient C/C++ CPU models.
We have a presence in Toronto, Austin, Santa Clara, Portland, and Raleigh. We are open to remote candidates on a case-by-case basis.
Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.