🦥 How To Train AI Without Sharing Data

Hello friends!
Welcome to this week’s Sloth Bytes. I hope you had an amazing week.

Outsmart college costs
Ready for next semester? June is a key time to assess how you’ll cover college costs. And considering federal aid often isn’t enough, you might have to consider private student loans.
You’re just in time, though—most schools recommend applying about two months before tuition is due. By now, colleges start sending final cost-of-attendance letters, revealing how much you’ll need to bridge the gap.
Understanding your options now can help ensure you’re prepared and avoid last-minute stress. View Money’s best student loans list to find lenders with low rates and easy online application.

Federated Learning

This is a really cool topic I recently learned about.
Have you ever felt like tech companies know way too much about you?
That’s because they do.
Every time you type something, search something, or breathe near your phone, someone’s training a model on it.
Well, Google engineers actually had an ethical thought in 2017:
"What if we could train AI on everyone's data... without actually seeing anyone's data?"
And that’s when federated learning (also called collaborative learning) was invented.
It lets AI learn from millions of users without anyone having to upload a single message, photo, or medical record.
Your data stays local, your embarrassing search history stays secret, and the model still gets smarter.
Why does this exist?
Well, for privacy and ethical reasons.
Traditional machine learning usually works like this:
# Pretend this is fancy ML code (simplified obviously)
all_user_data = collect_from_everyone() # hmm, sketchy
model = train_model(all_user_data)
And this worked great from a technical view…
But in the real world, there were problems:
GDPR had problems with this (not surprising)
Data breaches became weekly news
Hospitals couldn’t legally share patient records
Companies got sued for billions
Federated Learning (FL)
Instead of bringing data to the model, FL brings the model to the data:
# Federated learning example (not actual code of course)
model = initial_model()
for round in range(100):
    local_updates = []
    for user_device in devices:
        # Send lightweight model to users' devices
        local_update = train_on_device(model, user_device.local_data)
        local_updates.append(local_update)
    # Aggregate updates (not data!)
    model = average_updates(local_updates)
Your data never leaves your device. Only the lessons learned (model updates) get shared.
How It Actually Works
You get a baby model from Google, Apple, or whoever.
Your device trains it on your private data:
# On your phone
my_texts = ["hey whats up", "running late", "lol"]
improved_model = train(base_model, my_texts)
Send back only the updates:
server_model += average(user_updates)
Your texts stay on your phone. The server just learns that humans like saying "lol."
Multiply that by millions of devices and the model gets smarter without anyone seeing your messages, voice, or health info.
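The whole cycle above can be boiled down to a toy simulation. This is a sketch, not real FL code: the "model" is a single number, each device's private data is just a list of floats, and `train_on_device` is a made-up one-step update. Real systems (like TensorFlow Federated) do this with full neural networks.

```python
import random

random.seed(42)

# Hypothetical: the "model" is one weight, and each device nudges it
# toward the average of its own local data.
def train_on_device(model, local_data, lr=0.5):
    target = sum(local_data) / len(local_data)
    return model + lr * (target - model)  # locally improved model

def federated_round(model, all_device_data):
    # Each device trains locally; only the resulting models come back
    local_models = [train_on_device(model, data) for data in all_device_data]
    # Federated averaging: the server never touches the raw data
    return sum(local_models) / len(local_models)

# Three devices, each holding private data the server never sees
devices = [[1.0, 2.0], [3.0, 5.0], [10.0, 12.0]]

model = 0.0
for _ in range(20):
    model = federated_round(model, devices)

print(round(model, 2))  # converges toward the global mean, ~5.5
```

Notice the server only ever receives `local_models`, never `devices` — that's the entire trick.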
Real-World Federated Learning Examples
Google Gboard: Learns your typing habits and slang without uploading your texts
Apple Siri: Improves voice recognition while keeping your voice on your iPhone
Banking: Detects fraud across banks without banks sharing customer data
But Wait… Is It Really Private?
Couldn't someone reverse-engineer your data from the updates?
Yep, but companies use something called differential privacy to prevent this:
# Add noise to hide individual contributions
update = train_on_my_data()
noisy_update = update + random_noise() # Can't trace back to you
send(noisy_update)
It's like everyone whispering secrets in a crowded room. The overall message gets through, but you can't pick out individual voices.
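That whisper analogy is easy to demo. Here's a minimal sketch (made-up scalar updates, Gaussian noise): any single noisy value is useless for guessing one user's data, but averaged over thousands of users the signal comes through.

```python
import random

random.seed(0)

def noisy_update(true_update, noise_scale=1.0):
    # Each user adds random noise so their own value can't be recovered
    return true_update + random.gauss(0, noise_scale)

# Hypothetical: 10,000 users whose true update is around 0.3
true_updates = [0.3 for _ in range(10_000)]
received = [noisy_update(u) for u in true_updates]

# One received value tells you almost nothing about one user...
print(received[0])
# ...but the average over many users still recovers the signal
print(sum(received) / len(received))  # close to 0.3
```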
The Challenges (Why isn’t every company using this?)
1. Your Phone Is Now a Data Center
FL happens on your device. It’s efficient, but it still uses battery and CPU.
# This runs ON YOUR DEVICE at 3am
model.train(your_data, epochs=10) # RIP battery
2. Not Everyone Has the latest iPhone 1000 XR blazing pro max
Some devices are 100x slower
How do you average updates fairly?
What about that one guy still on iOS 3?
3. Malicious Updates
# Evil user
fake_update = mess_up_model_on_purpose()
send(fake_update) # Try to poison the model
There are defenses (like robust aggregation), but it’s still an active research area.
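One simple robust-aggregation idea: replace the plain average with a median, which shrugs off extreme values. A toy sketch with hypothetical scalar updates:

```python
import statistics

def robust_aggregate(updates):
    # The median ignores extreme outliers, unlike the mean
    return statistics.median(updates)

honest_updates = [0.9, 1.0, 1.1, 1.0, 0.95]
poisoned = honest_updates + [1000.0]  # one attacker sends garbage

print(sum(poisoned) / len(poisoned))  # mean is wrecked (~167.5)
print(robust_aggregate(poisoned))     # median stays at 1.0
```

Real defenses (trimmed means, Krum, norm clipping) are fancier, but the intuition is the same: don't let one loud liar move the average.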
4. Communication Costs
Model updates can be megabytes
Millions of devices × MB = $$$$
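A quick back-of-envelope sketch (hypothetical model size) shows why this hurts, and why a common fix is quantizing updates from 32-bit floats down to 8-bit ints:

```python
# Hypothetical: a modest model with 5 million parameters
params = 5_000_000

float32_bytes = params * 4  # full-precision update
int8_bytes = params * 1     # 8-bit quantized update

print(float32_bytes / 1e6)  # 20.0 MB per device per round
print(int8_bytes / 1e6)     # 5.0 MB, a 4x saving

# Across 1 million devices, a single full-precision round moves:
print(float32_bytes * 1_000_000 / 1e12)  # 20.0 TB of traffic
```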
The Future Is Federated
Imagine:
Medical AI trained on every hospital's data without sharing patient records
Financial fraud detection without banks sharing transaction data
Phones that understand you perfectly without uploading your life
Cars that learn to drive better without recording your routes
That’s the future federated learning is building.
It’s not perfect. It’s slower. It’s harder to build, and your phone might get warm at night while training models.
But it might be the only scalable way to build ethical, private AI in the modern world.
If you’re interested in learning more, read this article:


Thanks for the feedback!



Thanks to everyone who submitted!
Factorial of Factorials
Create a function that takes an integer n and returns the factorial of factorials. See the examples below for a better understanding:
Examples
fact_of_fact(4)
output = 288
// 4! * 3! * 2! * 1! = 288
fact_of_fact(5)
output = 34560
fact_of_fact(6)
output = 24883200
How To Submit Answers
Reply with
A link to your solution (github, twitter, personal blog, portfolio, replit, etc)
or if you’re on the web version leave a comment!
If you want to be mentioned here, I’d prefer if you sent a GitHub link or Replit!
That’s all from me!
Have a great week, be safe, make good choices, and have fun coding.
If I made a mistake or you have any questions, feel free to comment below or reply to the email!
See you all next week.
What'd you think of today's email?
Want to advertise in Sloth Bytes?
If your company is interested in reaching an audience of developers and programming enthusiasts, you may want to advertise with us here.
Reply