Awesome post, Logan! I was fascinated the whole way through. It's cool to see the way your discipline both differs and overlaps with my daily. Though the tech stacks are wildly different, we are both focused on enabling other developers to build amazing user-facing stuff on top of a set of primitives. I agree that staying nimble and using resources in an optimal manner really sets platform teams apart—I'm sure that's all the more true in ML infra where, as you say, resources are precious and experimentation is really at the fore.
Thanks for your insights—you're clearly an amazing technologist 👏
Thanks Drew! Admittedly, I haven't worked on a non-ML platform team, but I think it's really interesting how the focus of a platform affects design decisions. Especially coming from modeling experience, this was something super interesting to me about ML infra.
FWIW, if you ever feel like posting about it, I would personally love to hear more about how you evaluate data center networks and compute architectures when deploying a model.
I like stories where an understanding of the actual computer system allows us to write much more efficient software. Tends to be a bit of a blindspot for me.
This is a really good idea. While not directly what I work on, that's been super interesting to learn about. I'll have to make sure my understanding of it is rock solid first.
Great article. I think everyone is focused on the fancy algorithms, but not many talk about what goes on behind making these algorithms operational. Would love to read more about each part of the ML infrastructure.
Agreed. I see a lot about machine learning but everyone seems to gloss over how we bring models to users and the complexities involved in it. I’ll definitely be writing more about it.
Thanks, Logan. I learned a lot. I know that because I don't know much about what you wrote about yet. Onto some learning!
I’m glad it was helpful! I’ll be expanding on these topics in future articles 😊