Understanding Reasoning LLMs, How AI Companies Get Around Regulation, Understanding AI Engineering, and More
Must-reads for 2-6-25
Hi everyone!
Each week I send out a succinct article highlighting events and sharing resources AI engineers should know about. Subscribe if you want them in your inbox, and please support the authors of the resources below. A huge thanks to all my supporters! If you would like, you can support my writing for just $2/mo.
If you want to learn AI/machine learning, I created a roadmap to do it all entirely for free. Check it out here. Enjoy this week’s resources!
Always be (machine) learning,
Logan
Events you should know about
DeepSeek's Possible Ban: There’s potential legislation in the US to ban DeepSeek and hit any DeepSeek users with a hefty fine of up to $1 million and jail time. As far as I can tell, this targets the DeepSeek app specifically (not running the model locally) and echoes the TikTok legislation we saw about a month ago. The US is worried about national security.
Gemini 2.0: Google released the Gemini 2.0 suite of models, available both in the Gemini app/website and in AI Studio. The 2.0 models show impressive performance gains and are highly cost-effective.
OpenAI Deep Research: OpenAI released Deep Research, an AI agent for multi-step research, available to ChatGPT subscribers on the $200/mo tier. It’s powered by o3, OpenAI’s latest reasoning model, and has shown impressive research abilities, producing results in about 10 minutes.
OpenAI Operator: OpenAI introduced Operator, an AI agent that can perform tasks on the web for you. It is currently in research preview, but a reliable AI web agent opens up many possibilities for what can be built with AI by extending how it interacts with services.
What you missed last week
Why Medical AI is Garbage, Realistic Perspectives on DeepSeek Models, Understanding Reasoning Models, and More
Resources you should read
Knowledge Navigator
The Knowledge Navigator, a concept from a 1987 Apple video, has become a reality with OpenAI's Deep Research tool, which assists users in conducting extensive research efficiently. This tool enhances information gathering and analysis, making complex research tasks quicker and easier, but it still has limitations in deep inquiry. As AI technology continues to improve, tools like Deep Research will likely become essential for knowledge creation and insight generation.
How AI companies get around data regulation
Many countries are enacting data localization regulations to protect citizens' data and gain a competitive edge in AI development. These laws complicate data access for companies, especially startups, limiting their ability to train AI models across borders. Federated machine learning offers a solution by allowing models to be trained without transferring sensitive data, thus complying with these regulations.
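As a toy illustration of the federated idea, here is a minimal sketch of federated averaging (FedAvg) on a linear least-squares model: each client runs gradient steps on its own private data, and only the model weights, never the raw data, are sent back and averaged. The setup and all names are illustrative, not from any specific framework.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # least-squares gradient
        w -= lr * grad
    return w

def federated_average(global_w, client_data):
    """Each client trains locally; only weight vectors cross the border."""
    updates = [local_update(global_w, X, y) for X, y in client_data]
    return np.mean(updates, axis=0)  # server aggregates the weights

# Three "clients", each holding private samples of the same linear relation.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(20):  # communication rounds
    w = federated_average(w, clients)
```

The server ends up with a model close to what centralized training would produce, without any client's dataset ever leaving its jurisdiction.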
How Transformer LLMs Work
The course "How Transformer LLMs Work" provides an in-depth look at the key components of transformer architecture that power large language models (LLMs). It covers essential topics like tokenization, embeddings, self-attention, and recent improvements in attention mechanisms. By the end of the course, learners will understand how LLMs process language and gain skills to implement these models using the Hugging Face library.
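To give a flavor of one of those components, here is a minimal single-head scaled dot-product self-attention sketch in plain numpy. The dimensions and weight matrices are illustrative; real LLMs use multiple heads, causal masking, and learned parameters.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a token sequence."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # pairwise token affinities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V            # each token mixes in other tokens' values

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 8))       # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
```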
Deep Dive into LLMs like ChatGPT
Andrej Karpathy released a 3.5-hour video going over LLMs. He’s made many educational videos in the past, but this one is specifically designed to be informative for a wider audience.
AI Engineering with Chip Huyen
AI engineering focuses on building products using readily available models and APIs, shifting away from the traditional model-centric approach of machine learning engineering. It emphasizes solving real problems rather than overcomplicating solutions with advanced AI tools. Learning can be enhanced through project-based experiences alongside structured education, as AI tools can automate parts of coding but not the entire problem-solving process.
Native Speakers
Two main approaches are emerging for integrating AI into our digital world: restructuring software to be more AI-compatible or adapting AI to work with existing human-centric interfaces. The future of AI integration will likely involve a balance between these structured and unstructured methods, leveraging the strengths of both. Successful AI systems will need to navigate both worlds, enhancing human interaction without requiring drastic changes.
Making the U.S. the home for open-source AI
Building a sustainable open-source AI ecosystem is challenging but crucial for a future where AI is accessible to more people, rather than controlled by a few wealthy companies. The U.S. must invest in open AI research to compete with emerging models from countries like China and ensure a thriving ecosystem. Open-source AI can foster collaboration and innovation, but it requires changing incentives and creating effective feedback loops to succeed.
Understanding Reasoning LLMs
The article explains four main approaches to enhance reasoning capabilities in language models (LLMs), including pure reinforcement learning and supervised fine-tuning. It discusses the development of DeepSeek models, highlighting the benefits of combining reinforcement learning and supervised fine-tuning for better performance. Additionally, the author offers insights on developing reasoning models even on a limited budget.
How to Scale Your Model
This is a comprehensive guide from Google DeepMind engineers detailing how they scale models on TPUs. It includes detailed sections on TPUs, transformers, sharding matmuls, training, and inference, and aims to teach readers how to analyze performance, parallelize models, and implement these techniques in JAX.
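To give a feel for the sharding idea, here is a toy numpy sketch of a data-parallel matmul that splits the batch dimension across simulated devices and replicates the weights. A real TPU implementation would express this with JAX's sharding APIs, which the guide covers in depth.

```python
import numpy as np

def sharded_matmul(X, W, n_devices=4):
    """Data-parallel matmul: shard X along the batch axis, replicate W.

    Each simulated 'device' computes a local product; results are then
    concatenated back into the full output.
    """
    shards = np.array_split(X, n_devices, axis=0)
    partials = [shard @ W for shard in shards]  # one per device, in parallel
    return np.concatenate(partials, axis=0)

rng = np.random.default_rng(2)
X = rng.normal(size=(8, 16))   # activations: batch 8, features 16
W = rng.normal(size=(16, 4))   # replicated weight matrix
out = sharded_matmul(X, W)
```

Because only the batch axis is split, the result is bitwise identical to the unsharded `X @ W`; sharding the contraction axis instead would require a cross-device reduction.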
An agents economy
AI agents are becoming capable of taking on tasks traditionally performed by humans, prompting organizations to rethink their structures and workflows. As agents learn and integrate into the workforce, they could replace many human roles, especially in process-oriented tasks. However, challenges remain in ensuring agents can navigate complex interpersonal dynamics and retain valuable institutional knowledge.
LLM Gateway: The One Decision That Removes 100 AI Engineering Decisions
The LLM Gateway simplifies AI engineering by consolidating many common decisions into a single solution, preventing developers from reinventing the wheel. It helps manage complexities like model routing, logging, and output validation, which can otherwise become overwhelming. By adopting this streamlined approach, teams can focus on building and scaling their AI applications more efficiently.
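To make the idea concrete, here is a toy gateway sketch, not any specific product: it routes a request to a backend, validates the output, falls back to a stronger model on failure, and logs every call. The backends and validator are stand-in functions, not real provider SDKs.

```python
import time

class LLMGateway:
    """Toy gateway: route requests, validate outputs, log every call."""

    def __init__(self, backends, validator=None):
        self.backends = backends        # e.g. {"fast": fn, "smart": fn}
        self.validator = validator
        self.log = []

    def complete(self, prompt, route="fast"):
        start = time.time()
        reply = self.backends[route](prompt)
        if self.validator and not self.validator(reply):
            route = "smart"             # fall back to a stronger model
            reply = self.backends[route](prompt)
        self.log.append({"route": route, "prompt": prompt,
                         "latency_s": time.time() - start})
        return reply

backends = {
    "fast": lambda p: "",                     # stub that fails validation
    "smart": lambda p: f"answer to: {p}",     # stub that succeeds
}
gw = LLMGateway(backends, validator=lambda r: len(r) > 0)
result = gw.complete("What is a gateway?")
```

Centralizing routing, logging, and validation behind one interface like this is the "one decision" the article argues for: application code calls `complete` and never touches provider details.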
4 Open-Source Alternatives to OpenAI’s $200/Month Deep Research AI Agent
OpenAI's Deep Research agent costs $200 a month, but four open-source alternatives are available. Options such as Deep-Research and OpenDeepResearcher offer customizable features for efficient research tasks without the high cost, and all are fully open-source, so users can modify and self-host them to fit their needs.
Deep Agent Released R1-V: Reinforcing Super Generalization in Vision-Language Models with Cost-Effective Reinforcement Learning to Outperform Larger Models
Deep Agent's R1-V uses reinforcement learning to improve the generalization of vision-language models (VLMs) while being cost-effective. It outperforms larger models in out-of-distribution tests, demonstrating that smaller, well-trained models can achieve high performance. R1-V's efficient training process and curated datasets enable robust learning without extensive computational resources.
ByteDance Proposes OmniHuman-1: An End-to-End Multimodality Framework Generating Human Videos based on a Single Human Image and Motion Signals
ByteDance has unveiled OmniHuman-1, an advanced AI model that generates realistic human videos from a single image using various motion signals. This model improves motion realism, gesture accuracy, and adaptability by integrating audio, video, and other inputs in its training process. OmniHuman-1 demonstrates superior performance compared to existing animation models, making it a valuable tool for diverse applications in digital content creation.
Meta AI Introduces VideoJAM: A Novel AI Framework that Enhances Motion Coherence in AI-Generated Videos
Meta AI has launched VideoJAM, a framework that significantly improves motion coherence in AI-generated videos by integrating motion representation into the training and inference processes. This approach enhances video quality with minimal modifications to existing models, resulting in more realistic and fluid motion. Evaluations show that VideoJAM reduces common artifacts and consistently achieves higher motion coherence scores compared to traditional methods.
Google DeepMind Achieves State-of-the-Art Data-Efficient Reinforcement Learning (RL) with Improved Transformer World Models
Google DeepMind has developed a new model-based reinforcement learning (MBRL) method that achieves state-of-the-art performance in the Craftax-classic environment, surpassing previous benchmarks. Their approach incorporates advanced techniques like Dyna with warmup and patch nearest-neighbor tokenization, leading to significant improvements in sample efficiency and reward scores. This research shows the effectiveness of using transformer models and better observation encoding in enhancing reinforcement learning capabilities.
OpenAI's Deep Research is an IMPRESSIVE agent! (Tested)
OpenAI's Deep Research is an advanced AI agent that performs impressively in various tests. It showcases strong capabilities in understanding and generating human-like responses. Overall, its performance highlights significant advancements in AI technology.
AI, Broken Tech Job Market and Learning to Code
The tech job market has become saturated with developers due to a surge in coding bootcamps and the rise of AI, leading to concerns about job security for coders. Despite fears of AI replacing developers, learning to code remains valuable, as understanding programming concepts is essential for effectively utilizing AI tools. Upskilling in coding will enhance your abilities and help you navigate the evolving tech landscape.
Responsible AI: Our 2024 report and ongoing work
Google's 2024 Responsible AI Progress Report highlights significant advancements in AI technology and governance while emphasizing the importance of safety and ethical standards. The report outlines updated AI principles focused on bold innovation, responsible development, and collaborative progress to address the complexities and risks associated with AI. As AI continues to evolve, Google commits to adapting its approach to ensure the technology benefits society and addresses global challenges.
Updating the Frontier Safety Framework
Google DeepMind has updated its Frontier Safety Framework to enhance security protocols for advanced AI systems, focusing on mitigating risks from powerful models. The new framework emphasizes heightened security levels, rigorous deployment procedures, and proactive measures against deceptive alignment risks. Collaboration across the AI industry is crucial for establishing common standards and ensuring the safe development of AI technologies.
How Google Spanner Powers Trillions of Rows with 5 Nines Availability
Google Cloud Spanner is a highly scalable, globally distributed database that ensures data consistency and durability through innovative technologies like TrueTime and the Paxos consensus algorithm. It partitions data into manageable chunks, enabling efficient load balancing and fault tolerance, while simplifying user interactions with a familiar SQL interface. Spanner supports strong consistency for transactions, making it suitable for modern applications requiring high availability and reliability.
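As a toy illustration of the TrueTime idea, the sketch below models "now" as an uncertainty interval and performs Spanner-style commit wait: the writer picks a timestamp at the top of the interval, then waits until the clock's earliest bound has passed it, guaranteeing the timestamp is in the past on every node. The class and epsilon value are purely illustrative.

```python
import time

class TrueTime:
    """Toy TrueTime: 'now' is an interval [earliest, latest] bounded by a
    fixed uncertainty epsilon (real TrueTime uses GPS and atomic clocks)."""

    def __init__(self, epsilon_s=0.003):
        self.epsilon = epsilon_s

    def now(self):
        t = time.time()
        return t - self.epsilon, t + self.epsilon

def commit_wait(tt):
    """Spanner-style commit wait: choose a timestamp at the top of the
    uncertainty interval, then wait until it is safely in the past."""
    _, latest = tt.now()
    commit_ts = latest
    while tt.now()[0] < commit_ts:
        time.sleep(0.0005)      # wait out the clock uncertainty
    return commit_ts

tt = TrueTime()
ts = commit_wait(tt)
```

The wait costs roughly twice the clock uncertainty per commit, which is why Spanner invests so heavily in keeping that uncertainty small.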
What Investors and Product People Should Know about DeepSeek Part 1: Correcting Important Misunderstandings [Markets]
DeepSeek's new reasoning model R1 has sparked concerns about the value of GPUs in AI training, leading to bearish reactions in the market. Many AI companies are hindered by misaligned incentives that prioritize scaling over innovation, while DeepSeek has the flexibility to experiment and refine its approach. Additionally, the rise of open-source models like DeepSeek doesn't necessarily threaten AI companies' profitability due to the significant investments required for secure and stable implementations.
OpenAI Introduces Deep Research: An AI Agent that Uses Reasoning to Synthesize Large Amounts of Online Information and Complete Multi-Step Research Tasks
OpenAI has launched Deep Research, a tool that helps users conduct in-depth, multi-step research by synthesizing information from various online sources. Unlike traditional search engines, Deep Research provides detailed, well-cited reports tailored for professionals in fields like finance and science. The tool can process complex queries and deliver comprehensive insights, making research tasks more efficient and manageable.
Fine-Tuning Your LLM to "Think" Like DeepSeek R1, on Your Computer
DeepSeek R1 is a large AI model that needs powerful GPUs, but smaller distilled versions are available for easier use. The AI community is creating datasets from R1 to help fine-tune other models, making it possible for them to emulate R1's reasoning. This article explains how to fine-tune Llama 3.2 3B using these datasets on consumer-grade hardware with a cost-effective supervised approach.
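One small, concrete piece of that pipeline is turning distilled R1 traces into chat-format supervised examples where the reasoning precedes the final answer. The field names and `<think>` tags below are hypothetical; real R1-distillation datasets differ in the details, but the shape is similar.

```python
def to_sft_example(record, system_prompt="You are a helpful assistant."):
    """Turn one distilled reasoning trace into a chat-format SFT example.

    The record fields and <think> tags are hypothetical; the point is that
    the model is trained to produce the reasoning before the answer.
    """
    target = f"<think>\n{record['reasoning']}\n</think>\n{record['answer']}"
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": record["question"]},
            {"role": "assistant", "content": target},
        ]
    }

raw = {
    "question": "What is 17 * 6?",
    "reasoning": "17 * 6 = 10 * 6 + 7 * 6 = 60 + 42 = 102.",
    "answer": "102",
}
example = to_sft_example(raw)
```

A fine-tuning library then trains the smaller model to reproduce the assistant turn, reasoning included, which is what lets it emulate R1's style without R1's size.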
The End of Search, The Beginning of Research
AI is evolving with the development of Reasoners and autonomous agents, enabling systems to conduct research at a speed and depth comparable to human experts. OpenAI's Deep Research demonstrates this advancement by effectively engaging with complex academic topics and producing high-quality analysis quickly. While general-purpose agents like Operator face limitations, narrow agents are already achieving impressive results and hint at a transformative future for AI in research and other fields.
Educating for AI fluency: Managing cognitive bleed and AI dependency
Generative AI tools like ChatGPT can enhance learning but may also lead to overreliance, weakening critical thinking and creativity. Educators must teach AI fluency, which combines human and AI strengths for better communication and cognitive engagement. A structured training approach, where students actively interact with AI, can help maintain their independence and improve their skills.
Introducing deep research
OpenAI’s announcement post introduces Deep Research, an agent in ChatGPT that conducts multi-step research on the web, synthesizing many online sources into detailed, cited reports in tens of minutes. It is powered by a version of the o3 reasoning model and launched first for subscribers on the $200/mo Pro tier.
January 2025 AI Ethics Round-up
January 2025 saw significant developments in AI, particularly surrounding the controversial emergence of DeepSeek, which caused a $1 trillion loss in the AI market. OpenAI has accused DeepSeek of illegally using its technology, igniting discussions on AI ethics and governance. Amidst this chaos, practical insights on AI governance have emerged, highlighting the need for organizations to establish effective oversight.
Does DeepSeek Wiping out $1T of Market Value Make Sense?
DeepSeek's recent market impact has led to a significant decline in tech stocks, particularly Nvidia, which lost $500 billion in value. The author argues that while the reaction to DeepSeek might seem excessive, US tech stocks are generally overvalued and due for a correction. Ultimately, the situation reflects a complex interplay between technology developments and inflated equity prices, rather than a straightforward cause-and-effect relationship.
Computer-Using Agent
OpenAI has introduced a new model called Computer-Using Agent (CUA) that can perform tasks on the web by interacting with graphical user interfaces like humans do. CUA combines advanced vision capabilities and reasoning skills, achieving high success rates in various digital tasks. While still in early development, it prioritizes safety and aims to gather user feedback for continuous improvement.
Strengthening America’s AI leadership with the U.S. National Laboratories
OpenAI has partnered with the U.S. National Laboratories to enhance scientific research using its advanced AI models. This collaboration aims to drive breakthroughs in fields like healthcare, energy, and national security. The initiative aligns with OpenAI's mission to ensure AI technology benefits humanity while supporting U.S. global leadership.
OpenAI o3-mini
OpenAI has launched o3-mini, a new, cost-effective model optimized for STEM reasoning, available in ChatGPT and the API. It offers enhanced speed and accuracy for tasks in science, math, and coding, while allowing developers to choose different levels of reasoning effort. o3-mini replaces o1-mini, providing users with higher message limits and improved performance, making it a powerful tool for technical applications.
Why Artificial Neural Networks Are So Damn Powerful - Part I
Neural networks are powerful mathematical tools that can adapt to various tasks and model complex relationships in data. They utilize specialized structures, like convolutional and recurrent layers, to effectively process different types of information. Additionally, their ability to automatically learn features and optimize learning objectives makes them essential in modern AI applications.
16 Techniques to Supercharge and Build Real-world RAG Systems—Part 1
Implementing a RAG (Retrieval-Augmented Generation) system involves several critical steps, from data preparation to embedding and querying in a vector database. Achieving high performance requires addressing challenges like retrieval accuracy, chunk size optimization, and system reliability over time. This guide offers 16 practical techniques to enhance RAG applications, focusing on refining retrieval mechanisms and improving response quality.
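The retrieval core of such a system can be sketched in a few lines: embed the chunks and the query, then rank by cosine similarity. The bag-of-words "embedding" below is a deterministic toy stand-in for a real embedding model, and all names are illustrative.

```python
import numpy as np

def embed(text, dim=64):
    """Deterministic toy bag-of-words 'embedding' -- a stand-in for a real
    embedding model such as a sentence transformer."""
    v = np.zeros(dim)
    for token in text.lower().split():
        v[sum(ord(ch) for ch in token) % dim] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def retrieve(query, chunks, k=2):
    """Rank chunks by cosine similarity to the query embedding."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: -float(embed(c) @ q))
    return ranked[:k]

chunks = [
    "RAG systems retrieve relevant passages before generation.",
    "Chunk size strongly affects retrieval accuracy.",
    "The softmax converts scores to probabilities.",
]
top = retrieve("how does chunk size affect retrieval", chunks)
```

The article's techniques mostly target the gaps this toy version glosses over: better embeddings, smarter chunking, reranking, and keeping retrieval quality stable as the corpus grows.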
Artificial Neural Networks Are Nothing Like Brains
Artificial neural networks (ANNs) are powerful AI tools but are fundamentally different from biological brains. While inspired by neural structures, ANNs operate through mathematical functions and lack the brain's complex learning mechanisms. Understanding these differences is crucial to avoid misconceptions about AI's capabilities and limitations.