This week’s cover depicts a ferret reading a newspaper, which ties together three themes: first, it showcases the incredible power of MidJourney v6’s text-to-image modeling; second, The New York Times is suing OpenAI for billions of dollars; third, Apple has released a new open-source AI model called Ferret. The title is set in Times New Roman, and the cover was created using MidJourney and Photoshop.

Executive Summary

  • NY Times Sues OpenAI: The New York Times is suing OpenAI “for ‘billions of dollars in statutory and actual damages’ related to the ‘unlawful copying and use of The Times’s uniquely valuable works.’ It also calls for the companies to destroy any chatbot models and training data that use copyrighted material from The Times.”
  • Waymo: While we pondered LLMs, Google’s driverless Waymo cars took 700,000+ trips in 2023.
  • Image Recognition: AI is getting almost too good at guessing where a photo was taken and identifying everything in it, without context or metadata. For example, a multimodal AI placed a user’s photo of a backcountry hiking trail within roughly 35 miles of where it was taken.
  • Apple Releases AI: Without fanfare, Apple quietly launched Ferret, a powerful open-source machine learning model.
  • The Year of AI Robots: Brett Adcock, a robotics CEO who previously thought robot hardware would outpace software, now says they are progressing at the same rate.  It jarred him.  This means, as quickly as robots develop agility, they will equally develop the ability to interact with their environment.
  • Microsoft Copilot: Microsoft continues to rapidly integrate AI features into its existing products like Office, under the product name Copilot.
  • AI Video: Elon Musk predicts 2024 will be the year of AI video.  Users are creating compelling video using image-to-video and text-to-video prompting.  The latest update from MidJourney is so good that animating stills from MidJourney is (for the moment) outpacing video prompts using text.
  • Job upheaval: While Newsweek announced layoffs attributed to AI advancements, one of the world’s oldest newspapers is leveraging AI to create new jobs.
  • OpenAI Wearables: In a significant move, OpenAI has recruited a top Apple design executive to develop innovative AI wearables.

Top 9 Stories

These are the must-click links if you only have time for a few.  Even if they look boring, click them!  I did the work, so you don’t have to worry.  All are 10/10 would recommend.

The Rest: AI News of The Week

Don’t let the volume overwhelm you.  Have fun and skim it. The links are organized by topic, sorted from ‘coolest’ to ‘least cool’, and each topic is clearly defined with a headline.  Beneath each label, I’ve added a plain-language description and glossary of what the topic means.  I do the work so you don’t have to!   The link descriptions are often pulled directly from tweets or articles, so it’s not always my voice.  Pause when you see something that interests you.  Reach out to me any time.  I enjoy sharing and discussing these items!

Apple Ferret

Apple launched Ferret so quietly that it was made public in October yet no one noticed until December.  The main strength of the model is the ability to recognize elements in an image and draw a line around them (similar to last week’s theme of ‘segmentation’). 

Introducing Ferret, a new MLLM that can refer and ground anything anywhere at any granularity.

Apple Ferret 

Did Apple build a multimodal LLM that rivals Google’s Gemini already?

https://github.com/apple/ml-ferret

Apple releases Ferret

An End-to-End MLLM that Accept Any-Form Referring and Ground Anything in Response

Apple’s ‘Ferret’ is a new open-source machine learning model

https://appleinsider.com/articles/23/12/24/apples-ferret-is-a-new-open-source-machine-learning-model

A Bit of Fun

I got a sincere kick out of reading ChatGPT play the apathetic office banter meta-statement game.  Anyone who works in an office will get it.  The game of tennis from Rosencrantz & Guildenstern are Dead is fantastic.  The security footage with huge boots was also great.

This is spot-on.  

“ChatGPT, Play along, making meta-statements without actual content. I’ll start: Tentative question?”

“Hey ChatGPT, lets play question tennis from Rosencrantz & Guildenstern are Dead. It is a game. Do you know what it is?”  

ai generated cctv footage of police arresting ppl for wearing huge boots

ChatGPT as information Swiss army knife:

“Look up this bottle of wine, tell me how it is rated on various sites and also how it should talk about it so i sound sophisticated”

What the average redditor looks like according to midjourney (all versions)

Legal/Ethics

The power of multimodal image recognition plus AI context and prediction has led to AI becoming the world’s champion “GeoGuessr” player.  And… elections.

Artificial intelligence can find your location in photos, worrying privacy experts

The PIGEON algorithm was able to geolocate this 2012 photo of the author on a backcountry trail in Yellowstone National Park to within roughly 35 miles of where it was taken.

https://www.npr.org/2023/12/19/1219984002/artificial-intelligence-can-find-your-location-in-photos-worrying-privacy-expert

world’s best ai vs geoguessr pro

Google: How we’re approaching the 2024 U.S. elections

https://blog.google/outreach-initiatives/civics/how-were-approaching-the-2024-us-elections/

AR/VR

The biggest news in AR is the Apple Vision Pro, which is coming out in a matter of weeks.  The biggest theme is the concept of “gaussian splatting”, which (to my very layperson’s knowledge) takes still images or frames of a video and extrapolates a 3D image.  For a single image, it’s OK.  For a series of images, it can remarkably stitch them together into a real-time AR/VR model.  And for recorded video, it essentially turns the video into a gaming environment.  That’s very dumbed down, but it’s important not to let terms turn us off.  Gaussian Splatting should not be a trigger term.  Get over it.  As Brad Hamilton says, “Learn it.  Love it.  Live it.”

Apple Vision Pro tipped for late Jan/early Feb release

Apple Vision Pro Currently In Mass Production, Says Analyst, Believes The Headset Is Company’s Most Important Product For Next Year

https://wccftech.com/apple-vision-pro-currently-in-mass-production-most-important-product-2024/

Recording a family house in AR/VR: “I’m still convinced the killer use case for 3d reconstruction tech is memory capture. No surprise Apple is headed in this direction.”

Introducing Marigold, a universal monocular depth estimator, delivering incredibly sharp predictions in the wild! Based on Stable Diffusion, it is trained with synthetic depth data only and excels in zero-shot adaptation to real-world imagery. Check it out:

LangSplat: 3D Language Gaussian Splatting (these are the little headlines that are the big headlines, if you’re technical)

ground CLIP features into a set of 3D language Gaussians, which attains precise 3D language fields while being 199 × faster than LERF

Fun with real-time diffusion & controlnet – turning a webcam video subject into an alien monster

“Alibaba: Make-A-Character: High Quality Text-to-3D Character Generation within Minutes”

https://github.com/Human3DAIGC/Make-A-Character

Business/Enterprise

The Bloomberg story is the one that deserves some unpacking.  Six months after Bloomberg spent $1,000,000 on a custom finance model, another finance model came out that outperforms it and costs $100.  Separately, I tend to take Google’s hiring and layoff numbers with a large grain of salt. 

“Bloomberg invested over a million dollars in developing a finance-domain focused Large Language Model (LLM) named BloombergGPT.  And within just six months of the release of BloombergGPT, a model (AdaptLLM-7B) costing merely $100 came out surpassing BloombergGPT in performance.”

Google likely to layoff 30,000 employees post new AI innovation

“The proposed restructuring is anticipated to primarily impact Google’s ad sales department, where the company is exploring the benefits of leveraging AI for operational efficiency.”

Artificial intelligence checks whether your Louis Vuitton bag is fake

Technology company Entrupy claims that it can use AI to detect whether a luxury item is fake with near-perfect accuracy.

Anthropic Projected At Least $850 Million in Annualized Revenue in 2024

Anthropic has projected it will generate more than $850 million in annualized revenue by the end of 2024, The Information reported. That’s a 70% increase from a projection it gave to some investors just three months ago.

https://www.theinformation.com/briefings/anthropic-projected-at-least-850-million-in-annualized-revenue-in-2024

ChatGPT Helps, and Worries, Business Consultants, Study Finds

The A.I. tool helped most with creative tasks. With more analytical work, however, the technology led to more mistakes.

OpenAI competitor Anthropic projects $850 million in annualized revenue

https://the-decoder.com/openai-competitor-anthropic-projects-850-million-in-annualized-revenue

OpenAI 

OpenAI Is in Talks to Raise New Funding at Valuation of $100 Billion or More

OpenAI would be second-most valuable US startup behind SpaceX

Company also in talks for billions from G42 for chip venture

https://www.bloomberg.com/news/articles/2023-12-22/openai-in-talks-to-raise-new-funding-at-100-billion-valuation

OpenAI Is in Talks to Sell Shares at an $86 Billion Valuation

https://www.bnnbloomberg.ca/openai-is-in-talks-to-sell-shares-at-an-86-billion-valuation-1.1986538

OpenAI Is in Talks to Raise New Funding at Valuation of $100 Billion or More

https://finance.yahoo.com/news/openai-talks-raise-funding-100-211552141.html

Images

MidJourney’s improvement over one year is outstanding.  For every cartoonish DALL-E 3 image we see (all of them, it seems), MidJourney is out there making photorealistic images that push the boundaries of what I think is possible with diffusion.  Diffusion is taking random static and telling the computer, “Hey, if this was a photo of a walrus smoking a cigar on a bullet train in Wisconsin, what would it look like?”…  and the computer “denoises the random static” into an image that matches the text.  Whoever figured it out should get a prize in computing (pretty sure it’s Jonathan Ho, Ajay Jain, and Pieter Abbeel).  Be sure to look at the examples below.
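
For the technically curious, here is a toy Python sketch of the “denoise the random static” idea. It is not how MidJourney works internally (that is not public). It uses a deterministic DDIM-style update as a simplification, and `oracle_noise` is a hypothetical stand-in that already knows the target “image”; in a real system a trained neural network, guided by the text prompt, plays that role. The schedule, the function names, and the four-pixel “image” are all illustrative assumptions.

```python
import math
import random


def make_alpha_bars(steps):
    """Hypothetical linear schedule: 1.0 means a clean image, near 0 means almost pure static."""
    return [1.0 - 0.999 * t / steps for t in range(steps + 1)]


def oracle_noise(x_t, target, alpha_bar):
    """Stand-in for the trained network: work out which noise separates x_t from the target."""
    signal = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [(x - signal * t) / noise_scale for x, t in zip(x_t, target)]


def denoise(target, steps=50, seed=0):
    """Start from random static and step it back toward the target, one noise level at a time."""
    rng = random.Random(seed)
    alpha_bars = make_alpha_bars(steps)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure random static
    for t in range(steps, 0, -1):
        eps = oracle_noise(x, target, alpha_bars[t])
        # Estimate the clean image implied by the predicted noise.
        x0_hat = [
            (xi - math.sqrt(1.0 - alpha_bars[t]) * e) / math.sqrt(alpha_bars[t])
            for xi, e in zip(x, eps)
        ]
        # Re-noise that estimate to the next, slightly cleaner level.
        x = [
            math.sqrt(alpha_bars[t - 1]) * h + math.sqrt(1.0 - alpha_bars[t - 1]) * e
            for h, e in zip(x0_hat, eps)
        ]
    return x


# A four-"pixel" grayscale image standing in for the walrus photo (made-up values).
walrus = [0.0, 0.5, 1.0, 0.25]
result = denoise(walrus)
```

Because the oracle cheats by knowing the answer, `result` lands back on `walrus` up to rounding error. The hard part in real life is training a network that can make that noise guess from the prompt alone.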

MidJourney v6 Fanfare Continues

“Midjourney v1 until v6, same prompt”

white background, closeup portrait of a very old mean man, 92 years old, wrinkles, realistic skin, studio lighting,, canon f/4

The skin details in #midjouney v6 are insane.

MidJourney V6 can replicate almost any animation style.  10 flawless examples with prompts.

Futuristic Nike explorations

Make-A-Character

Make-A-Character: High Quality Text-to-3D Character Generation within Minutes

https://huggingface.co/spaces/Human3DAIGC/Make-A-Character

Wearables

Embodied devices (from Siri to robots) are objects with AI built in, so you can talk to them in plain language, e.g. “get the red sock from the couch and bring it over here”.   OpenAI is making moves in that direction by hiring the head of design for iPhone.  Feels like a big deal.

Apple’s iPhone Design Chief Enlisted by Jony Ive, Sam Altman to Work on AI Devices

Design executive Tang Tan is set to leave Apple in February

Tan will join Ive’s LoveFrom design studio, work on AI project

https://www.bloomberg.com/news/articles/2023-12-26/apple-iphone-design-head-tang-tan-to-work-with-jony-ive-sam-altman-on-ai-tech

Humane’s AI Pin will start shipping in March

https://www.theverge.com/2023/12/22/24012429/humane-ai-pin-shipping-marc

Robotics/Embodiment

Brett Adcock, a robotics CEO who previously thought robot hardware would outpace software, now says they are progressing at the same rate.  It jarred him.  This means, as quickly as robots develop agility, they will equally develop the ability to interact with their environment.  The first link is his tweet, and it’s worth reading.

2024 will be the year of Embodied AI

Waymo cars took 700,000+ trips in 2023

https://waymo.com/blog/2023/12/dear-waymo-community-reflections-from-2023.html.html

Waymo has 7.1 million driverless miles — how does its driving compare to humans?

The Google spinoff’s robotaxis led to a reduction in injury-related and police-reported crashes when compared to human benchmarks, according to new research.

https://www.theverge.com/2023/12/20/24006712/waymo-driverless-million-mile-safety-compare-human

LG Ushers in ‘Zero Labor Home’ With Its Smart Home AI Agent at CES 2024

With its advanced ‘two-legged’ wheel design, LG’s smart home AI agent is able to navigate the home independently. The intelligent device can verbally interact with users and express emotions through movements made possible by its articulated leg joints. Moreover, the use of multi-modal AI technology, which combines voice and image recognition along with natural language processing, enables the smart home AI agent to understand context and intentions as well as actively communicate with users.

Science/Medicine

The first link is the coolest one.  AI is able to read damaged scrolls without opening them!

AI Continues to Chip Away At Ancient Scrolls (especially ones we cannot open without ruining)

(see previously: https://www.semafor.com/article/10/12/2023/ai-deciphers-ancient-scrolls-burned-and-buried-for-2000-years)

We launched the world’s first Gen AI bot that helps radio-technologists and young radiologists with the appropriate scan protocols in every scenario. It’s already being pinged 500+ times a day.

The Race to Put Brain Implants in People Is Heating Up (Paywall)

Thanks in part to Elon Musk, the field of brain-computer interfaces has captured both public and investor interest, with a cadre of companies now developing implantable devices.

https://www.wired.com/story/the-race-to-put-brain-implants-in-people-is-heating-up/

AI reveals how microplastics are harming global soil and agriculture

https://www.earth.com/news/ai-shows-how-microplastics-are-harming-global-soil-and-agriculture

AI companion ElliQ: Reducing senior loneliness

ElliQ, created by Intuition Robotics, is notable for being the first artificial intelligence device explicitly designed to reduce loneliness and isolation in older Americans.

https://www.aiacceleratorinstitute.com/ai-companion-elliq-reducing-senior-loneliness/

https://elliq.com/

MyHeritage Releases AI Record Finder™ and AI Biographer™ — Two Groundbreaking Features That Transform Genealogy Using Artificial Intelligence

https://www.myheritage.com/research/ai-record-finder/

https://www.businesswire.com/news/home/20231226134311/en/MyHeritage-Releases-AI-Record-Finder%E2%84%A2-and-AI-Biographer%E2%84%A2-%E2%80%94-Two-Groundbreaking-Features-That-Transform-Genealogy-Using-Artificial-Intelligence

Scientists discover the first new antibiotics in over 60 years using AI

https://www.euronews.com/next/2023/12/31/scientists-discover-the-first-new-antibiotics-in-over-60-years-using-ai

Video

While text-to-image has been the star this year, text-to-video has struggled to become camera-ready beyond clips of a few seconds.  The challenge with AI images is that it’s tough to keep them consistent, and video frames have to be consistent.  For now, it’s easier to animate a still image (using different technology) than to prompt a video with text.  Below are three incredible examples of how MidJourney’s image improvements have led to essentially full-fledged movie trailers (quick cuts help) using AI.  There are two big video AI platforms out there: Pika and Runway.  A new startup, Assistive, wants to join them.

MidJourney + Runway 

MidJourney + SVD + Topaz

Movie Trailer using AI

Pika Labs’ text-to-video AI platform opens to all: Here’s how to use it

https://pika.art/login

Elon Musk: AI movies next year

Domo AI Real Time Filters

Introducing two new models for our Video-to-Video function:

1. Storybook Cartoon

2. Color Illustration

Introducing Leonardo Motion  

Generate videos from your images in just a couple of clicks.

Now available to all users, paid and free. Our top plan now also includes unlimited video generations.

https://app.leonardo.ai/

Assistive 

Today, we’re launching our first product, Assistive Video. It’s a generative video platform for creating high quality videos from text and image prompts.

https://assistive.chat/blog/introducing-assistive-video

https://assistive.chat/product/video

http://assistive.chat/video 

News/Journalism

I’m not sure how to take Newsweek’s layoffs, since my instinct is they were not in the greatest shape anyway.  Which is why the second link is much cooler.  Berrow’s Worcester Journal is creating new jobs called “AI Reporters”.  These roles appear to resemble those of coordinators or producers who take data-driven reporting or community records/minutes and use AI to transform them into content, which the AI Reporters then manually proof and polish.  Sort of an “AI assisted” role.

Newsweek: Massive Layoffs Are Coming in 2024

4/10 to be replaced by AI

How one of the world’s oldest newspapers is using AI to reinvent journalism

The AI reporters use an in-house copywriting tool based on the technology ChatGPT, a souped-up chatbot that draws on information gleaned from text on the internet. Reporters input mundane but necessary “trusted content” – such as minutes from a local council planning committee – which the tool turns into concise news reports in the publisher’s style.

https://www.theguardian.com/technology/2023/dec/28/how-one-of-the-worlds-oldest-newspapers-is-using-ai-to-reinvent-journalism

New York Times Lawsuit

Next week’s newsletter will have more extensive coverage and analysis. There are valid concerns, but also instances of miscommunication and a lack of nuance. The lawsuit appears to lean heavily on emotional appeals and connects a lot of dots that may or may not be related.  However, beneath these emotional appeals lie some solid points.  I highly recommend reading the articles and links explaining the nuances of the case, below.  

The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work

Millions of articles from The New York Times were used to train chatbots that now compete with it, the lawsuit said.

The New York Times is suing Microsoft/OpenAI over copyright infringement, claiming the companies are responsible for “billions of dollars” in damages and demanding any chatbot models and data that pulled copyrighted work from The Times be destroyed.

https://variety.com/2023/digital/news/new-york-times-sues-openai-microsoft-copyright-infringement-1235851238/

OpenAI’s Napster/Google Moment

The New York Times got rolled by Google in the 2000s, but they’re not getting rolled this time around.

https://calacanis.substack.com/p/openais-napstergoogle-moment

A thread on some misconceptions about the NYT lawsuit against OpenAI. Morality aside, the legal issues are far from clear cut. Gen AI makes an end run around copyright and IMO this can’t be fully resolved by the courts alone.

In the New York Times OpenAI lawsuit, you can see how complex the relationship of training data to output can be. On one hand, they find that you can induce ChatGPT to produce exact content from famous Times articles, on the other, they show it also hallucinates false articles.

IP concerns may threaten smaller players in AI, but the large generative AI companies (Adobe, Microsoft, OpenAI, Anthropic) all agreed to defend their users against any copyright or infringement claims. I wonder if this will prove a barrier to startup & open source entrants.

Anthropic Joins the Party, Offers Copyright Shield to Enterprise AI Customers

Artificial General Intelligence (AGI)

Will scaling work?

Data bottlenecks, generalization benchmarks, primate evolution, intelligence as compression, world modelers, and other considerations

https://www.dwarkeshpatel.com/p/will-scaling-work

Multimodality

This is the ability for a language model to “see, hear, etc”.   Just like Apple’s Ferret, GPT-4 is multimodal and can identify objects in photos.

New Multi-Modal with Search Grounding.

Microsoft is combining GPT-4 Vision, Bing image search and web data to deliver a better understanding of queries.  Search Grounding was not only able to identify the image, but also the EXACT shuttle.

Microsoft

Microsoft’s next Surface laptops will reportedly be its first true ‘AI PCs’

https://www.theverge.com/2023/12/28/24017890/microsoft-ai-surface-laptops-arm

EXCLUSIVE: Microsoft readies ‘next-gen’ AI-focused Surface Pro 10 and Surface Laptop 6 with Arm chips and design upgrades for 2024

https://www.windowscentral.com/hardware/surface/microsoft-surface-pro-10-laptop-6-major-update-intel-arm-ai-2024

Copilot for Windows Features Overview

https://www.microsoft.com/en-us/windows/copilot-ai-features

Microsoft Copilot is now available as a ChatGPT-like app on Android

You no longer need the Bing mobile app to access Copilot on Android devices.

https://www.theverge.com/2023/12/26/24015198/microsoft-copilot-mobile-app-android-launch

Copilot for Web

https://copilot.microsoft.com/

Copilot Launches for iOS

https://apps.apple.com/us/app/microsoft-copilot/id6472538445

Explainers

Autonomous AI Video Clip Generator (GPT-4 API, Whisper, PyTube ++)

I created a system in Python that searches YouTube and downloads the video, transcribes it with OpenAI Whisper, uses GPT-4 to find the clip the user asked for in the video, and cuts the video down to that clip. The user can choose between 16:9 or 9:16 format and whether they want subtitles. Perfect for YouTube automation and TikToks / Reels
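
The 16:9 versus 9:16 choice comes down to a bit of crop arithmetic. The description does not say how the author does it, so the Python sketch below is only one plausible approach: a centered crop to the target aspect ratio. The function name `center_crop_box` and the sample resolutions are assumptions for illustration, not taken from the original project.

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (x, y, crop_width, crop_height) for the largest centered box with the target aspect ratio."""
    if width * target_h > height * target_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is as narrow as or narrower than the target: keep full width, trim top and bottom.
        crop_w = width
        crop_h = width * target_h // target_w
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h


# A 1920x1080 landscape video cropped for a vertical 9:16 clip.
landscape_to_vertical = center_crop_box(1920, 1080, 9, 16)
# The same video kept at 16:9 needs no cropping at all.
unchanged = center_crop_box(1920, 1080, 16, 9)
```

Here `landscape_to_vertical` works out to `(656, 0, 607, 1080)`: a 607-pixel-wide strip from the middle of the frame. Some video encoders expect even dimensions, so a real pipeline may need to round that width down to 606.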

Google’s Video Poet Elevates AI Video!

Multimodal AI, Gemini, and Google’s Data Moat – Can it beat OpenAi

20/Dec/2023 – AI report Q&A, phi-2 demo, Mistral, Apple HUGS – Weekly livestream Nov-Dec/2023 – LIVE

The Stanford AI INDEX REPORT

Measuring trends in Artificial Intelligence

Google’s Year In Review: 2023: A year of groundbreaking advances in AI and computing

https://blog.research.google/2023/12/2023-year-of-groundbreaking-advances-in.html

Jeff Bezos on Generative AI: “They’re not inventions. They’re discoveries.”

https://aninternetreference.substack.com/p/jeff-bezos-on-generative-ai-theyre

How Not to Be Stupid About AI, With Yann LeCun (paywall)

It’ll take over the world. It won’t subjugate humans. For Meta’s chief AI scientist, both things are true.

https://www.wired.com/story/artificial-intelligence-meta-yann-lecun-interview/

China/Baidu

Baidu’s ChatGPT-like Ernie Bot has more than 100 mln users -CTO

https://www.reuters.com/technology/baidus-chatgpt-like-ernie-bot-has-more-than-100-mln-users-cto-2023-12-28/

We asked GPT-4 and Chinese rival ERNIE the same questions. Here’s how they answered

https://edition.cnn.com/2023/12/15/tech/gpt4-china-baidu-ernie-ai-comparison-intl-hnk/index.htm

Audio

Nendo is a generative music AI

https://colab.research.google.com/drive/1uGQIejuCKKEQrFBgzaHtdCYHEIFbJD6l

CassetteAI is another generative music AI

https://cassetteai.com/dashboard

New Models

Release of Robin v1.0 – a Suite of Multimodal Models

The Robin team is proud to present Robin, a suite of  Multimodal (Visual-Language) Models.   These models outperform, or perform on par with, the state of the art models of similar scale.

https://sites.google.com/view/irinalab/blog/robin-v1-0

Nous Hermes 2 – Yi-34B is a state of the art Yi Fine-tune.

Nous Hermes 2 Yi 34B was trained on 1,000,000 entries of primarily GPT-4 generated data, as well as other high quality data from open datasets across the AI landscape.

https://huggingface.co/NousResearch/Nous-Hermes-2-Yi-34B

Chips/Hardware

Nvidia’s biggest Chinese competitor unveils cutting-edge new AI GPUs — Moore Threads S4000 AI GPU and Intelligent Computing Center server clusters using 1,000 of the new AI GPUs. Beefy clusters with 200 petaops of AI compute.

https://www.tomshardware.com/pc-components/gpus/nvidias-biggest-chinese-competitor-unveils-cutting-edge-new-ai-gpus-moore-threads-s4000-ai-gpu-and-intelligent-computing-center-server-clusters-using-1000-of-the-new-ai-gpus

Technical/Dev/IT

Beyond Memorization: Violating Privacy Via Inference with Large Language Models

Current privacy research on large language models (LLMs) primarily focuses on the issue of extracting memorized training data. At the same time, models’ inference capabilities have increased drastically. This raises the key question of whether current LLMs could violate individuals’ privacy by inferring personal attributes from text given at inference time. In this work, we present the first comprehensive study on the capabilities of pretrained LLMs to infer personal attributes from text.

https://arxiv.org/abs/2310.07298

Evaluating Language-Model Agents on Realistic Autonomous Tasks

In this report, we explore the ability of language model agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild. We refer to this cluster of capabilities as “autonomous replication and adaptation” or ARA. We believe that systems capable of ARA could have wide-reaching and hard-to-anticipate consequences, and that measuring and forecasting ARA may be useful for informing measures around security, monitoring, and alignment. Additionally, once a system is capable of ARA, placing bounds on a system’s capabilities may become significantly more difficult.

https://arxiv.org/abs/2312.11671

Quantum Computing’s Hard, Cold Reality Check

Hype is everywhere, skeptics say, and practical applications are still far away

https://spectrum.ieee.org/quantum-computing-skeptics

LLM in a flash: Efficient Large Language Model Inference with Limited Memory

https://huggingface.co/papers/2312.11514

Google Addresses the Mysteries of Its Hypercomputer

It turns out that the Hypercomputer is Google’s take on a modular supercomputer with a healthy dose of its homegrown TPU v5p AI accelerators, which were also announced this month.   

Happy to announce the open sourcing of the Capybara dataset! 

All of this diversity is contained in less than 20K examples, already aggressively filtered to keep out censorship and undesirable responses.

LG’s latest Gram laptops are predictably stuffed with AI features

https://www.engadget.com/lgs-latest-gram-laptops-are-predictably-stuffed-with-ai-features-163910204.html

A deep dive into training dynamics of diffusion models

Introducing AskVideos-VideoCLIPv0.1, a versatile text-grounded video embedding model. Like its image-only counterpart, CLIP, VideoCLIP enables you to compute a single embedding for videos that can be used to compute similarity with text and perform vector retrieval.
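
To show what “compute similarity with text and perform vector retrieval” means in practice, here is a minimal Python sketch. It does not call the VideoCLIP model or any real API; the three-number embeddings and the video names are made up for illustration (real embeddings have hundreds of dimensions and come from the model). The ranking step, cosine similarity between a text embedding and each video embedding, is the standard technique for this kind of retrieval.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two vectors by angle: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_videos(text_embedding, video_embeddings):
    """Return (video_name, score) pairs, best match for the text first."""
    scored = [
        (name, cosine_similarity(text_embedding, vec))
        for name, vec in video_embeddings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embeddings; in practice the model would produce these.
video_embeddings = {
    "cooking_demo": [0.9, 0.1, 0.0],
    "skate_tricks": [0.0, 1.0, 0.2],
    "news_clip": [0.3, 0.3, 0.9],
}
query = [1.0, 0.0, 0.1]  # pretend embedding of the text "someone cooking"
ranking = rank_videos(query, video_embeddings)
```

With these made-up numbers, `cooking_demo` ranks first because its vector points in nearly the same direction as the query.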

Consistent lesson from 70 years of AI progress is both how counterintuitive the shape of progress is (chess was much easier than AGI), but also how predictable (neural nets were conceptualized in the 1940s!) with the right mix of intuition and science.

Credits/Sources

Most of these links come from just a few incredible sources.  Please follow them:

38 responses to “AI News #13: Week Ending 12/29/2023 with Executive Summary and Top 9 Stories”
