Meta's New Segmentation Model, A New Open-Source Image Generation Model, Apple Intelligence Model Reports, and More
Machine learning resources and updates 8/5/2024
This was a huge week for machine learning. So much so that I’ve included a top 13 for free subscribers. Follow me on X for more frequent posts and updates.
Support the Society's Backend community for just $1/mo to get the full list each week. Society's Backend is reader-supported. Thanks to all paying subscribers! 😊
Introducing GitHub Models: A new generation of AI engineers building on GitHub
Google's Character.AI Investment Boosts Chatbot Game, AI LABS' Role in Training Models
How to Use Benchmarks to Build Successful Machine Learning Systems
Perplexity planning revenue sharing program with web publishers next month
Smaller, Safer, More Transparent: Advancing Responsible AI with Gemma
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Friend is a $99 AI necklace that wants to help you remake the movie "Her"
Artificial Intelligence at Morgan Stanley – Three Use Cases
Synchron Announces First Use of Apple Vision Pro with a Brain Computer Interface
Deepfakes Part 3: How Deepfakes Will Impact Society [Deepfakes]
MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts
Zuckerberg says Meta will need 10x more computing power to train Llama 4 than Llama 3
Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation
How Google uses AI to reduce stop-and-go traffic on your route — and fight fuel emissions
Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education
OpenAI starts roll-out of advanced voice mode to some ChatGPT Plus users
Gemma Scope: helping the safety community shed light on the inner workings of language models
Meta Segment Anything Model 2 design
SAM 2 is a unified model for segmenting objects in both images and videos, using simple prompts like clicks or masks. It offers robust, real-time segmentation and outperforms existing models, even on unfamiliar videos. SAM 2’s design includes a memory module for tracking objects across frames, and it was released alongside a large video segmentation dataset to support diverse real-world applications.
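For a sense of what those simple prompt inputs look like in practice, here's a minimal sketch of point-prompted image segmentation using the `sam2` package from Meta's release. The config and checkpoint names are assumptions based on the repository and may differ in your install.

```python
# Minimal sketch: point-prompted segmentation with SAM 2.
# Config and checkpoint names below are assumptions from Meta's sam2 repo
# and may differ depending on the release you download.
import numpy as np
from PIL import Image
from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

predictor = SAM2ImagePredictor(
    build_sam2("sam2_hiera_l.yaml", "checkpoints/sam2_hiera_large.pt")
)

image = np.array(Image.open("frame.jpg").convert("RGB"))
predictor.set_image(image)

# A single positive click (x, y) is enough to prompt a mask.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),  # 1 = foreground, 0 = background
)
print(masks.shape, scores)
```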
Google’s Gemini 1.5 Pro dethrones GPT-4o
Google’s Gemini 1.5 Pro has outperformed OpenAI's GPT-4o in generative AI benchmarks. The experimental version scored 1,300 in the LMSYS Chatbot Arena, surpassing GPT-4o and Anthropic’s Claude-3. On other benchmarks, however, the experimental Gemini 1.5 Pro doesn’t beat GPT-4o, highlighting the discrepancies between benchmark evaluations. Always know what a benchmark actually shows you.
Introducing GitHub Models: A new generation of AI engineers building on GitHub
GitHub Models is a new platform that allows developers to easily access, experiment with, and deploy AI models directly within GitHub. It offers tools like a playground for testing models and integration with Codespaces and Azure for seamless development and production. This initiative aims to democratize AI, empowering over 100 million developers to become AI engineers.
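As a rough idea of what that developer workflow might look like, here's a hypothetical sketch that calls a catalog model through an OpenAI-compatible client authenticated with a GitHub token. The endpoint URL and model name are assumptions; the playground generates the exact values for you.

```python
# Hypothetical sketch of calling a GitHub Models catalog model via an
# OpenAI-compatible client. The base_url and model name are assumptions;
# copy the real values from the code snippet in the GitHub Models playground.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://models.inference.ai.azure.com",  # assumed endpoint
    api_key=os.environ["GITHUB_TOKEN"],  # a GitHub personal access token
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any model listed in the catalog
    messages=[{"role": "user", "content": "What is GitHub Models?"}],
)
print(response.choices[0].message.content)
```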
Announcing Black Forest Labs
Black Forest Labs has launched, focusing on advancing generative AI models for media like images and videos. They introduced the FLUX.1 suite of text-to-image models, aiming to set new standards in image synthesis. The company successfully raised $31 million in funding and is looking to hire more engineers.
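If you want to try FLUX.1 yourself, a minimal sketch with Hugging Face diffusers might look like the following, assuming FLUX pipeline support in a recent diffusers release and the openly available `black-forest-labs/FLUX.1-schnell` weights.

```python
# Minimal sketch of text-to-image generation with FLUX.1 [schnell] via
# diffusers. Assumes a recent diffusers release with FluxPipeline and a GPU
# with enough memory; the model id is the openly released schnell variant.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    "a black forest cake on a rustic wooden table, studio lighting",
    num_inference_steps=4,   # schnell is distilled for few-step generation
    guidance_scale=0.0,
).images[0]
image.save("flux_sample.png")
```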
Google's Character.AI Investment Boosts Chatbot Game, AI LABS' Role in Training Models
Google is investing heavily in Character.AI to boost its chatbot capabilities. As part of the deal, Noam Shazeer, CEO of Character.AI, is returning to Google DeepMind. This comes soon after both Microsoft and Amazon acquired talent from smaller AI companies, showing how large tech companies can outcompete smaller ones for talent.
How to Use Benchmarks to Build Successful Machine Learning Systems
Machine learning engineers should use benchmarks as initial guides but must test models in real-world scenarios before finalizing them. Benchmarks often miss real-world complexities and can be manipulated, leading to poor performance in practical applications. For successful ML systems, engineers must focus on relevant, representative, recent, and repeatable benchmarks while also evaluating models for latency, cost, scalability, and domain-specific performance.
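As a concrete illustration of that last point, here's a small sketch of an evaluation harness that checks a model against your own domain-specific examples and records latency alongside accuracy; `call_model` is a placeholder for whatever inference client you actually use.

```python
# Illustrative harness: complement public benchmark scores with accuracy and
# latency measured on your own domain-specific examples.
import statistics
import time

def call_model(prompt: str) -> str:
    # Placeholder: plug in your model or API client here.
    raise NotImplementedError

def evaluate(examples: list[dict]) -> dict:
    latencies, correct = [], 0
    for ex in examples:
        start = time.perf_counter()
        answer = call_model(ex["prompt"])
        latencies.append(time.perf_counter() - start)
        correct += int(ex["expected"].lower() in answer.lower())
    return {
        "accuracy": correct / len(examples),
        "p50_latency_s": statistics.median(latencies),
        "max_latency_s": max(latencies),
    }

# examples = [{"prompt": "...", "expected": "..."}, ...]
# print(evaluate(examples))
```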
Perplexity planning revenue sharing program with web publishers next month
Perplexity will start a revenue-sharing program with web publishers next month, sharing ad revenue from search result ads. The program will include both free and paid versions of Perplexity, rewarding publishers whose links are cited. Despite facing criticism and legal issues, Perplexity's chief business officer claims the company has always cited sources and that the revenue-sharing plan predates these criticisms.
The SearchGPT Paradigm
OpenAI's SearchGPT is a new AI-powered search engine that directly answers questions with cited sources, unlike traditional search engines. It is still a prototype with a minimalist design and some rough edges. The launch raises questions about the future of search engines, content monetization, and the sustainability of AI-driven models.
Smaller, Safer, More Transparent: Advancing Responsible AI with Gemma
Google introduced Gemma 2, a high-performing AI model, emphasizing safety and transparency. The new additions include a smaller 2B model, ShieldGemma safety classifiers, and Gemma Scope for model interpretability. These tools aim to help developers create safer, more efficient, and transparent AI applications.
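For reference, running the new 2B model locally is only a few lines with Hugging Face transformers; the model id below assumes the official instruction-tuned release on the Hub, which requires accepting Google's license first.

```python
# Minimal sketch of generating text with Gemma 2 2B via transformers.
# Assumes the google/gemma-2-2b-it checkpoint on the Hub (license acceptance
# required) and a recent transformers version with Gemma 2 support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

inputs = tokenizer("What does a safety classifier do?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```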
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge
Large Language Models (LLMs) are advancing quickly but usually need expensive human data to improve. A new method lets LLMs judge and refine their own responses without human help. This self-improvement technique has significantly boosted the models' performance in following instructions.
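To make the idea concrete, here's a heavily simplified sketch of the loop: the same model answers prompts, judges its own answers, and a meta-judge step compares two judgments of the same answer so the judging skill improves too. Every function and prompt below is a placeholder for illustration, not the paper's implementation.

```python
# Simplified sketch of a Meta-Rewarding-style loop. `llm` is any callable that
# maps a prompt string to a completion string; nothing here is the paper's code.

def generate_answers(llm, prompt, n=4):
    """Sample n candidate answers from the model."""
    return [llm(f"Answer the question:\n{prompt}") for _ in range(n)]

def judge_answer(llm, prompt, answer):
    """Ask the same model to score an answer from 0 to 10."""
    reply = llm(f"Rate this answer to '{prompt}' from 0 to 10. Reply with a number.\n{answer}")
    return float(reply.strip().split()[0])

def meta_judge(llm, prompt, answer, judgment_a, judgment_b):
    """Ask the model which of two judgments of the same answer is better."""
    reply = llm(
        f"Question: {prompt}\nAnswer: {answer}\n"
        f"Judgment A: {judgment_a}\nJudgment B: {judgment_b}\n"
        "Which judgment is more accurate? Reply with A or B."
    )
    return reply.strip().upper().startswith("A")

def build_preference_data(llm, prompts):
    actor_pairs = []  # (prompt, preferred answer, rejected answer)
    judge_pairs = []  # (prompt, preferred judgment, rejected judgment)
    for prompt in prompts:
        answers = generate_answers(llm, prompt)
        ranked = sorted(answers, key=lambda a: judge_answer(llm, prompt, a), reverse=True)
        actor_pairs.append((prompt, ranked[0], ranked[-1]))

        # Two independent judgments of the top answer; the meta-judge's pick
        # becomes a preference pair for improving the judging behavior itself.
        j_a = llm(f"Judge this answer to '{prompt}':\n{ranked[0]}")
        j_b = llm(f"Judge this answer to '{prompt}':\n{ranked[0]}")
        if meta_judge(llm, prompt, ranked[0], j_a, j_b):
            judge_pairs.append((prompt, j_a, j_b))
        else:
            judge_pairs.append((prompt, j_b, j_a))
    return actor_pairs, judge_pairs  # used for preference training (e.g. DPO)
```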
🥇Top ML Papers of the Week
The newsletter highlights the top machine learning papers of the week, featuring advancements in self-improving alignment techniques and multi-agent frameworks for complex web searches. It also covers improvements in reliability and traceability of RAG systems, and methods for limiting reasoning output length. Additionally, it discusses safety content moderation models, persona agent evaluation benchmarks, and approaches to address inefficiencies in KV cache memory consumption.
If you haven’t subscribed to their newsletter, you should. I include it in this list of resources each week because of how valuable going through papers is.
Friend is a $99 AI necklace that wants to help you remake the movie "Her"
The Friend pendant is a $99 AI necklace designed to be an emotional companion, not a productivity tool or phone replacement. It listens and responds to users, aiming to help combat loneliness.
I included this because it's such a terrible idea. It's an AI band-aid for a problem we need to address properly. It's also an always-listening AI device, something everyone should be aware of.
Apple Intelligence Foundation Language Models
Apple has developed advanced language models for on-device and server use, enhancing features in iOS, iPadOS, and macOS. These models improve tasks like text writing, notification management, and image creation. Apple emphasizes Responsible AI principles in their model development and shared insights at their Natural Language Understanding workshop. View the full publication at the source below.