QFM021: Machine Intelligence Reading List June 2024

Everything that I found interesting in June 2024 about machines behaving intelligently

Matthew Sinclair
12 min readJul 2, 2024
Photo by vackground.com on Unsplash

We kick off this month’s reading list with the transformative potential of AI in executive roles. If AI Can Do Your Job, Maybe It Can Also Replace Your CEO (nytimes.com) highlights AI’s growing capability to manage high-level decision-making tasks traditionally reserved for CEOs, suggesting a future where AI could play a pivotal role in corporate leadership, albeit with human oversight to ensure strategic alignment and accountability. If it can take the jobs of call centre staff, designers, and software engineers, is there something so special about executive jobs that leaves them immune?

Another theme is the drive to understand the inner workings of gen-AI systems more deeply. Here’s what’s really going on inside an LLM’s neural network (arstechnica.com), unveiling how AI models like Claude operate on the inside. These studies reveal the intricate patterns within neural networks, enhancing our ability to interpret and potentially steer AI behaviour in critical applications such as security and bias mitigation.

We then examine the practical experience of deploying AI at scale with What We Learned from a Year of Building with LLMs (Part I) (oreilly.com). The O’Reilly article provides lessons from a year of building with LLMs, emphasizing the importance of robust prompting techniques and structured workflows.

Finally, this month’s list touches on AI deployment's ethical and operational considerations. What’s the future for generative AI? The Turing Lectures with Mike Wooldridge (youtube.com) examines the importance of addressing bias, misinformation, and ethical concerns in AI’s advancement.

As always, the Quantum Fax Machine Propellor Hat Key will guide your browsing. Enjoy!

If AI Can Do Your Job, Maybe It Can Also Replace Your CEO (nytimes.com): The article discusses how artificial intelligence (AI) might not only replace routine jobs but also high-level executive roles, including CEOs. With AI’s capability to analyse markets, automate communication, and make dispassionate decisions, some companies are already experimenting with AI leadership to cut costs and increase efficiency, though human oversight remains necessary for accountability and strategic thinking.

#AI #Automation #Leadership #CorporateManagement #FutureOfWork

Here’s what’s really going on inside an LLM’s neural network (arstechnica.com): Anthropic’s recent research unveils how the Claude LLM’s neural network operates by mapping millions of neurons’ activities, revealing that concepts are represented across multiple neurons. This mapping process, using sparse auto-encoders and dictionary learning algorithms, helps identify patterns and associations in the model, providing partial insights into its internal states and conceptual organisation.

#AI #MachineLearning #NeuralNetworks #ArtificialIntelligence #Research

Scaling Monosemanticity — Extracting Interpretable Features from Claude 3 Sonnet (transformer-circuits.pub): Researchers at Anthropic have successfully scaled sparse autoencoders to extract high-quality, interpretable features from the Claude 3 Sonnet language model, demonstrating that the technique can handle state-of-the-art transformers. These features are diverse, covering concepts from famous people to programming errors, and are crucial for understanding and potentially steering AI behaviour, especially in safety-critical areas such as security vulnerabilities and bias.

#AI #MachineLearning #NaturalLanguageProcessing #Safety #AIResearch

What is the biggest challenge in our industry? (thrownewexception.com): The biggest challenge in the tech industry is the anxiety caused by layoffs and the fear of AI replacing jobs, leading to mental health issues like burnout. Leaders can help by fostering open communication, leading positively, leveraging new technologies, investing in continuous learning, and collaborating with HR to support their teams.

#TechIndustry #AI #MentalHealth #Leadership #Layoffs

What We Learned from a Year of Building with LLMs (Part I) (oreilly.com): Over the past year, the authors built real-world applications using large language models (LLMs) and identified crucial lessons for developing effective AI products. They emphasise the importance of robust prompting techniques, retrieval-augmented generation, structured workflows, and rigorous evaluation and monitoring to overcome the complexities and challenges inherent in leveraging LLMs for practical use.

#AI #MachineLearning #LLM #TechInnovation #ProductDevelopment

Achieving the Self-Thinking Business (linkedin.com): The article discusses Honu’s development of a “Self-Thinking Business” model through the introduction of a Cognitive Layer that bridges the gap between current AI capabilities and true business autonomy. This new layer aims to transform AI from tactical automation tools into strategic decision-makers by providing a comprehensive, contextual understanding of business data and operations, reducing the need for extensive data and compute resources.

#AI #BusinessAutomation #CognitiveLayer #AutonomousAgents #Innovation

What’s the future for generative AI? The Turing Lectures with Mike Wooldridge (youtube.com): Mike Wooldridge, a Professor of Computer Science at the University of Oxford, discusses the current capabilities and future potential of generative AI, highlighting both its transformative possibilities and the significant challenges it presents, including issues of bias, misinformation, and ethical concerns.

#GenerativeAI #FutureTech #AIChallenges #MachineLearning #TechEthics

Introducing Generative Physical AI — youtube.com: NVIDIA introduced Generative Physical AI, a technology enabling robots to learn and refine their skills in simulated environments, leveraging NVIDIA’s AI supercomputers and robotics platforms. This development aims to minimise the gap between simulation and real-world application, enhancing the autonomy and functionality of future robotics.

#NVIDIA #GenerativeAI #Robotics #AItechnology #Computex2024

Grounding — Enhance GEN AI with YOUR DATA (youtube.com): The article discusses techniques for grounding generative AI models to ensure their outputs are accurate and reliable by integrating real-world data, employing human oversight, and using multiple models to verify results. These methods are crucial for preventing errors in fields like healthcare, finance, and legal services, and involve strategies like Retrieval-Augmented Generation (RAG) and Reinforcement Learning from Human Feedback (RLHF).

#AI #GenerativeAI #AIAccuracy #AITrustworthiness #GroundingAI

Generative AI Handbook: A Roadmap for Learning Resources — genai-handbook.github.io: The Generative AI Handbook offers a comprehensive roadmap for learning about modern artificial intelligence systems, particularly focusing on large language models and image generation. It organises existing resources like blogs, videos, and papers into a textbook-style presentation aimed at individuals with a technical background who seek to deepen their understanding of AI fundamentals and applications. The handbook emphasises the importance of foundational knowledge to effectively use and adapt to rapidly evolving AI tools and techniques.

#GenerativeAI #AIHandbook #MachineLearning #AIeducation #DeepLearning

The Future of AI: In a recent LinkedIn post, Matt Webb shared his thoughts on the future of AI and its applications. Matt is focused on the smaller, more ubiquitous aspects of AI, such as home hardware and managing intelligent agents.

#AI #FutureOfWork #Innovation #Technology #LinkedIn

Back To Atoms: AI has always been seen as the technology of the future but it has finally arrived with ChatGPT and Large Language Models (LLMs). This post reflects on the journey of AI, the realization of its ‘magic,’ and the implications it may have on the software industry and our future. The author speculates that the next wave in technology may bring us back to focusing on tangible, real-world innovations.

#AI #TechFuture #ChatGPT #LLM #Innovation

My personal AI research agenda, mid 2024 (and a pitch for work): Matt Webb shares his latest work with AI agents, specifically a smart home assistant demonstrating emergent behaviour. He discusses the simplicity of creating sophisticated AI behaviours with minimal code and outlines his personal AI research interests, including human-AI collaboration, simple agents acting in the world, and tiny, ubiquitous embedded intelligence.

#AI #Research #SmartHome #TechInnovation #Collaboration

The Next Great Scientific Theory is Hiding Inside a Neural Network: Miles Cranmer discusses the potential of neural networks to uncover groundbreaking scientific theories. The lecture delves into the expanding applications of machine learning, from text generation to construction infrastructure. Highlighting the intersection of AI and scientific discovery, this talk envisions a future where neural networks become pivotal in advancing knowledge.

#NeuralNetworks #MachineLearning #AI #ScientificDiscovery #Innovation

Transforming Customer Support and Sales with Mendable’s AI Solutions: Mendable introduces Firecrawl, a tool that converts websites into LLM-ready markdown or structured data. Their platform offers various AI capabilities to streamline customer support and sales through AI-powered knowledge bases, secure data integrations, enterprise-grade security, and detailed customer interaction insights. They also support custom AI model training and have free and enterprise pricing plans.

#AI #CustomerSupport #SalesEnablement #EnterpriseSecurity #AIModelTraining

Why Apple is Taking a Small-Model Approach to Generative AI: Apple introduced its new generative AI offering, Apple Intelligence, at WWDC 2024. Unlike larger models from competitors, Apple’s approach focuses on smaller, customized models integrated seamlessly with its operating systems to prioritize a frictionless user experience. Apple Intelligence is designed to handle various tasks while maintaining privacy and efficiency, with the speech generation and image creation models being processed on-device for speed and user focus.

#Apple #GenerativeAI #WWDC2024 #AI #Privacy

Sober AI is the Norm: The article discusses the current state of AI, emphasizing the need for ‘Sober AI’ amidst the hype surrounding advanced artificial intelligence technologies. Highlighting observations from the Databricks Data+AI Summit, it points out that most AI work is mundane, involving data preparation and pipeline management rather than groundbreaking advancements. The writer argues that even these seemingly modest applications hold significant value in driving practical business intelligence solutions.

#AI #BusinessIntelligence #DataScience #TechSummit #MachineLearning

Can LLMs invent better ways to train LLMs?: Sakana AI explores using Large Language Models (LLMs) for inventing better ways to train themselves, termed LLM². They leverage evolutionary algorithms to develop novel preference optimization techniques, significantly improving model performance. Their latest report introduces ‘Discovered Preference Optimization (DiscoPOP)’, achieving state-of-the-art results across various tasks with minimal human intervention. The approach promises a new paradigm of AI self-improvement, reducing extensive trial-and-error efforts traditionally required in AI research.

#LLMs #AIResearch #DeepLearning #EvolutionaryAlgorithms #DiscoPOP

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?: The SWE-bench project investigates the ability of language models to automatically resolve GitHub issues. It uses a dataset comprising 2,294 issue-pull request pairs from 12 popular Python repositories, with evaluations based on unit test verification. The leaderboard showcases various models and their performance on this task, with Amazon Q Developer Agent currently leading.

#LanguageModels #GitHub #Automation #MachineLearning #Python

Will We Run Out of Data? Limits of LLM Scaling Based on Human-Generated Data: Epoch AI has estimated the total supply of human-generated public text at about 300 trillion tokens. They project that, at the current rate of usage, language models will exhaust this data stock by 2026 to 2032, or even earlier with high-frequency training. Their forecast also explores the impact of different training strategies on data consumption, noting that models trained beyond computed-optimal levels might leverage more data to enhance training efficiency. The discussion includes possible avenues to sustain AI progress, such as developing synthetic data, tapping into other forms of data, and improving data efficiency.

#AI #Data #MachineLearning #Research #EpochAI

Reverse Turing Test Experiment with AIs: This video showcases an experiment where advanced AIs try to determine who among them is the human. Created in Unity and featuring voices by ElevenLabs, it presents a reverse Turing Test scenario. The experiment aims to explore how AI identifies human traits.

#AI #TuringTest #ReverseTuringTest #Unity #ElevenLabs

I Will Piledrive You If You Mention AI Again: The article explores the author’s frustration with the overhyping of AI technologies in professional software engineering. With formal training in data science, the author critiques how AI initiatives are often pushed by individuals lacking in-depth understanding, leading to a culture of hype and grift. He emphasises the gap between genuine technological advancements and the superficial, profit-driven pushes that dominate the industry landscape today.

#AI #TechIndustry #Hype #DataScience #Critique

Gen AI Testing and Evaluation with ARTKIT: As Generative AI (Gen AI) systems become more integrated into critical processes, their testing and evaluation gain importance for ensuring safety, ethics, and effectiveness. ARTKIT, an Automated Red Teaming and testing toolkit, facilitates this by automating key steps like generating prompts, interacting with systems, and evaluating responses. It aids in creating testing pipelines that offer insights into Gen AI system performance, highlighting areas that require improvement. However, human-driven testing remains essential for a comprehensive evaluation.

#GenerativeAI #AI #Testing #Evaluation #Ethics

Why we no longer use LangChain for building our AI agents:: Octomind shares their experience using LangChain for building AI agents and why they decided to replace it with modular building blocks. The article highlights the limitations and complexity introduced by LangChain’s high-level abstractions and demonstrates how simpler code with minimal abstractions improved their productivity and made the team happier. It suggests that often a framework might not be necessary and advocates for a building-block approach for AI development.

#AI #Tech #LangChain #AIDevelopment #Coding

OpenAI’s GPT-5 Pushed Back To Late 2025, But Promises Ph.D.-Level Abilities: OpenAI’s long-awaited GPT-5, initially rumored for release in late 2023 or summer 2024, is now projected for late 2025 or early 2026. Mira Murati, OpenAI’s CTO, outlined the system’s capabilities, comparing it to having PhD-level intelligence in specific tasks, a leap from GPT-4’s high schooler-level smartness.

#OpenAI #GPT5 #AI #TechNews #ArtificialIntelligence

Regards,
M@

[ED: If you’d like to sign up for this content as an email, click here to join the mailing list.]

Originally published on quantumfaxmachine.com.

You can also check out the Slideshare version here:

--

--