Top 10 Machine Learning Resources and Updates 06/14/2024
A new AI employee, a four hour video on recreating GPT-2, meritocracy at Scale, and more
Here are the top 10 machine learning updates and resources from this week. I’m trying a new format where I distill all the recent updates down to the 10 most important and share them with everyone. Paid subscribers will get another email containing the complete list. You can get that email by supporting Society's Backend for just $1/mo.
If you want all the ML updates from X, follow me there.
Today we're thrilled to introduce Jace, your AI employee
Zeta Labs introduces Jace, an AI employee that can handle tasks in the digital world. Jace can interact with websites like a human, making it useful for tasks like booking flights and setting up a company. Exciting examples show Jace's capabilities in business setup and email handling. Join the waitlist to explore Jace's potential at jace.ai.
📽️ New 4 hour (lol) video lecture on YouTube:
A new 4-hour YouTube video walks through building a GPT-2 model from scratch. It covers network creation, optimization, training, and evaluation, and it references the Zero To Hero series and the nanoGPT GitHub repo. The video is detailed and comprehensive.
Researchers use large language models to help robots navigate
Researchers are using language models to help robots navigate by converting what the robot sees into text descriptions instead of relying on raw visual data. This method simplifies training and improves robot performance in situations where visual training data is limited.
What If We Recaption Billions of Web Images with LLaMA-3?
Researchers improved image descriptions using LLaMA-3, an advanced AI model. They recaptioned 1.3 billion images, enhancing training for vision-language tasks. This boosted performance in image retrieval and text-to-image generation, making models more accurate and useful.
5 Free Datasets to Start Your Machine Learning Projects
The blog lists 5 free datasets to build machine learning models: Boston House Prices, Stroke Prediction, Netflix Stock Prices, ImageNet, and Yelp. These datasets help practice regression, classification, time series, computer vision, and NLP, essential for a robust ML portfolio.
Meritocracy at Scale
Scale hires based on merit, excellence, and intelligence (MEI). This means they look for the best person for the job, demand high standards, and prefer smart individuals. They evaluate people as individuals, not based on demographics.
The author argues that meritocracy and diversity are not in conflict. A merit-based process naturally brings diverse backgrounds and ideas. They focus on objective selection without bias towards race, gender, etc. This approach is fair, legal, and good for business. It ensures a strong team and fair treatment of colleagues.
Mira Murati says the AI models that OpenAI have in...
Mira Murati states that the models OpenAI has in its labs are not far ahead of the ones available to the public. This is an interesting departure from Sam Altman’s previous statement that the things they are working on eclipse their current offerings.
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Samba is a new model that combines State Space Models with Sliding Window Attention to handle sequences of effectively unlimited length efficiently. The authors report that it outperforms models built purely on attention or purely on State Space Models, and that it scales well, offering faster processing and better memory recall even with very long sequences.
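To make the Sliding Window Attention part concrete, here is a minimal sketch of the attention pattern it implies. This is not Samba's implementation; the function name and the window size are made up for illustration. It only shows which earlier positions each token is allowed to look at.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Build a causal sliding-window mask.

    mask[i][j] is True when query position i may attend to key position j,
    meaning j is one of the `window` most recent positions up to and
    including i. (Hypothetical helper, for illustration only.)
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(seq_len=6, window=3)

# Print the pattern: "x" marks an allowed (query, key) pair.
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Each row has at most `window` allowed positions, so the attention cost grows linearly with sequence length instead of quadratically. In a hybrid design like the one described above, the State Space Model layers are what carry information from outside the window.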
Compute Better Spent: Replacing Dense Layers with Structured Matrices
Dense linear layers account for most of the compute in large neural networks. This study explores structured matrices as cheaper alternatives. When initialization and learning rates are scaled appropriately for each structure, structured matrices like Block Tensor-Train (BTT) outperform dense ones for the same compute, making models faster and more efficient.
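For a feel of why structure saves compute, here is a small sketch using a block-diagonal matrix. This is a simpler stand-in chosen for illustration, not the Block Tensor-Train structure from the paper, and all names and numbers below are made up.

```python
def dense_matvec(weight, x):
    """Multiply a dense matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def block_diag_matvec(blocks, x):
    """Multiply a block-diagonal matrix by a vector.

    Only the diagonal blocks are stored, and each block touches only
    its own slice of the input.
    """
    out, offset = [], 0
    for block in blocks:
        size = len(block[0])
        out.extend(dense_matvec(block, x[offset:offset + size]))
        offset += size
    return out


def num_params(matrices):
    """Count the stored entries across a list of matrices."""
    return sum(len(m) * len(m[0]) for m in matrices)


# Two 2x2 blocks stand in for one 4x4 layer (illustrative values).
blocks = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
dense_equivalent = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 6],
    [0, 0, 7, 8],
]
x = [1, 1, 1, 1]

y_structured = block_diag_matvec(blocks, x)
y_dense = dense_matvec(dense_equivalent, x)

print(y_structured == y_dense)
print(num_params(blocks), "vs", num_params([dense_equivalent]), "parameters")
```

The structured version gives the same output as the equivalent dense matrix while storing and multiplying half as many entries. The savings grow with layer width and the number of blocks, and that freed-up compute is what can be spent elsewhere, such as on a wider layer.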
What Apple Intelligence Means for You
Apple Intelligence, announced at WWDC, integrates advanced AI into Siri, enhancing context understanding and privacy. It uses on-device and cloud-based Large Language Models (LLMs). This innovation boosts AI productivity and security, impacting both Apple and the tech industry.
Thanks for reading!