LLM Architect
Edinburgh (on-site)
£100k-120k + exceptional benefits

A rare chance to drive the future of AI infrastructure at one of the world's leading R&D tech organisations.

This is a senior opportunity with a global research leader, where you'll architect and optimise the platforms that deliver large-scale language models to production. You'll be working on some of the hardest challenges in distributed AI systems: building ultra-reliable, ultra-scalable environments for inference and deployment.

What you'll be doing
Designing cloud-native architectures to run large language models on serverless frameworks (e.g. Kubernetes, Knative, or custom-built FaaS).
Developing approaches to minimise cold-start latency through advanced container snapshotting, weight pre-loading, and graph partitioning.
Building distributed inference pipelines with tensor parallelism, model sharding, and efficient memory scheduling to serve LLMs at scale.
Experimenting with quantisation, pruning, and KV-cache management to squeeze maximum throughput from GPU/accelerator clusters.
Working closely with applied researchers to turn state-of-the-art methods into robust, production-grade systems.

What you'll bring
Deep understanding of large-scale ML systems engineering, with direct experience in deploying or optimising LLMs.
Hands-on expertise in C++/Rust/Go for systems programming, plus Python for model integration.
Strong knowledge of distributed runtimes and scheduling frameworks (e.