Elon's in Trouble for This One
Published 4 months ago • 5 min read
Imagine training a powerful ChatGPT-like model, but purposefully leaving out all of Einstein's work.
How much prompting would it take to get a theory of General Relativity?
Is it even possible?
It'd be a cool experiment to see if AI can really imagine world-changing ideas.
Today's Ai5:
Grok 2 Enters the Chat
Sakana's AI Scientist
Simple Idea: $1 Billion Valuation
Google's AI-Powered Pixel 9
AI in 2023
Prompt of the Day
The fact you can now make images like this on X is pretty wild!
Grok 2 Enters the Chat
Yesterday, xAI released two new language models, Grok-2 and Grok-2 mini, in beta on X. Alongside the upgraded chat model, X has integrated Flux as well. This means you can use the new image generator right there on X! (The catch is you have to be a premium subscriber.)
The language models show good improvements in various benchmarks, including reasoning, reading comprehension, math, and coding. I tested Grok with these 2 popular reasoning tests (which most LLMs still fail) and it passed with flying colors!
But let me tell you about the bigger story here. In true Elon/X style, you can get away with a lot more with Grok and Flux. For example, Midjourney doesn't even allow the words "President Trump", while Grok... well...
And trust me, there's no shortage of these kinda images if you search "Grok" on X. Elon really likes living on the edge, doesn't he?
Sakana's AI Scientist
Sakana AI just unveiled a new concept known as The AI Scientist. The Japanese startup aims to automate the entire academic research process, from brainstorming ideas to writing papers and even conducting peer reviews. Here's how it works.
The model starts by brainstorming ideas and checking for originality. It then conducts research by running code and analyzing the results. Finally, it produces a full paper and an automated peer review.
The AI Scientist has already churned out papers on topics like diffusion models, with some scoring "Weak Accept" ratings from its own review.
It's an impressive idea, but it's far from perfect. The AI has been known to make critical errors along the way. There are also obvious questions around maintaining scientific integrity.
Despite all that, these kinds of ideas showcase what's possible in the future, and how AI will continue to accelerate basically everything. You can learn more about Sakana's work here.
Simple Idea: $1 Billion Valuation
While the idea is simple, the execution is not.
World Labs is a 4-month-old AI startup founded by renowned Stanford University professor Fei-Fei Li. While still relatively unknown, the company is already valued at over $1 billion after its recent second capital raise.
So, what's the big idea?
Well, World Labs aims to create AI models that can accurately estimate the 3D structure of real-world objects and environments. The idea is to enable detailed digital replicas without the need for extensive data collection.
It's a simple idea, but getting it right will take how humans interact with machines to another level, with massive implications for robotics, healthcare, and AI advancement as a whole.
Check out this recent 15-minute TED talk from Fei-Fei. It's a really interesting watch if you have any interest in computer vision, robotics, and generative AI.
Google's AI-Powered Pixel 9
Google had their 2024 Made by Google keynote on Tuesday, launching their new range of Pixel 9 phones. To say there's some heavy AI integration is an understatement.
Google's multi-modal Gemini is deeply integrated into the Android OS for Pixel 9 devices. Gemini will control a range of AI-powered features and assistance throughout the entire user experience.
Here are all the AI features you'll find on the next-generation Pixel phones:
Gemini Assistant: A rebuilt AI assistant experience powered by Gemini models
Call Notes: Provides a private, on-device summary of phone conversations
Pixel Screenshots: Analyzes and makes screenshots searchable
Pixel Studio: An on-device image generator for creating custom images
Add Me: Allows users to add themselves to group photos after they're taken
Made You Look: Uses fun visuals on the screen to capture children's attention for photos
Video Boost: Enhances video quality using cloud processing and AI
Loss of Pulse Detection: Uses AI to detect if a person has lost their pulse and automatically calls emergency services (Pixel Watch)
Gemini Live: Enables free-flowing conversations with the AI assistant through voice (Google's answer to OpenAI's Voice Mode)
There were a bunch of live demos, not all of which went off without a hitch. (This is hard to watch, but they got there in the end!)
Considering Google's track record so far, it's a decent gamble integrating AI this hard into Pixel. Also, there are some obvious privacy concerns... like the fact it listens to your phone calls and takes notes for later.
You can watch the full keynote from Google on YouTube.
AI in 2023
I was looking for a specific AI statistic the other day when I came across the AI Index Report. This massive 502-page PDF from Stanford summarizes basically everything you can think of related to AI that happened in 2023.
I put together some of the more interesting finds here for your viewing pleasure. (Yes, I scrolled the whole thing.)
Number of Foundational Models: A foundational model is something like GPT-4o. Interesting that Google has the most models, but arguably some of the worst results. Maybe they should focus their efforts.
Training Costs: Gemini Ultra cost over $190 million! GPT-4 came in second at just under $90 mil. Interestingly, Meta's Llama 2 came in under $4 million.
Midjourney: Progress of Midjourney image generations over 22 months for an identical prompt.
AI Incidents: This includes autonomous cars causing pedestrian fatalities or facial recognition systems leading to wrongful arrests. This is WAY better than I thought it would be. Only 123?
Alright I better stop there, this newsletter is getting outta hand. If you want to check out more of these stats here's the link.
Snack Sized 5
1. If you're a user of Ideogram, they've released a Creator's Club where you can get perks and early access to new models.
3. Guy builds a Chrome plugin that blocks your internet every hour and uses computer vision to check you've done 10 push ups. What could go wrong?