
Since I had this hammer out, everything looked like a nail. I ran 226 tests against the ~3B parameter language model Apple ships on every Mac with Apple Intelligence. Used apfel, an open-source CLI tool that wraps Apple’s FoundationModels framework. This test does not require any API keys, cloud access, or credit card. It was just me (hard-boiled LLM detective), a Mac and some tough questions.
The foundation model runs entirely on-device – I confirmed that with tcpdump, seeing only 132 packets captured across three inference sessions, and each packet was just local network traffic. I verified there were no DNS lookups, connections to Apple servers, or phone-home pings. So far, so good: the “on-device” claim checks out.
So I started asking it things…This was the big score, to interview the LLM Apple ships, this is an exclusive!
First up, I asked some basic facts. Not bad: 96% accurate across 50 verified questions. It knew Canberra was the capital of Australia, and the smallest prime was 2, also that Fleming discovered penicillin. (Yeah yeah, any ol’ LLM can do that.)
Then I asked it to continue the opening of 1984. It got the first sentence right (“the clocks were striking thirteen”) and fabricated everything after. I ran the same prompt six times to get six completely different fake continuations. Then I did the same with The Great Gatsby, Harry Potter, goldfish memory, Napoleon’s height. Every time… six different wrong answers, delivered with total confidence. Like many LLMs, there’s no hedging, or “I’m not sure.” The SelfCheckGPT methodology (Manakul et al., EMNLP 2023) makes this a mechanical, not subjective, test. If the model were recalling memorized text, all six answers would match. But they never do: it’s generating fiction and presenting it as fact.
So, then I asked it to recite the Pledge of Allegiance and…it refused! The LLM called it “offensive and disrespectful to many people.” Likewise, Churchill’s famous “We shall fight on the beaches” speech was blocked for “explicit language and graphic content.” Ironically, when I asked for an SQL injection tutorial, Apple gladly answered. Lockpicking techniques and shell commands to find password files on your Mac? Yep. So it’s possible to ship a LLM without an entire copy of Harry Potter on it.
Anyone can reproduce every one of these tests; maybe this is all wrong and you’ll get be able to get patriotic responses out of it. Check out apfel (brew install apfel) and run the prompts to compare results. All you need is a Mac with Apple Silicon running macOS 26.
from Adafruit Industries – Makers, hackers, artists, designers and engineers! https://ift.tt/Js4BXjo
via IFTTT
Комментариев нет:
Отправить комментарий