Today's Deep-Dive: ELATO
Ep. 280


Episode description

The episode discusses ELATO, a project aiming to integrate sophisticated conversational AI into physical objects like toys and plushies. This technology goes beyond simple talking toys, focusing on merging hardware, software, and distinct AI personalities to create hyper-realistic interactions. The ELATO device is a small IoT gadget with a microphone and speaker, attachable to existing toys via silicone straps. Setup is designed to be simple: clip the device, connect to Wi-Fi, and choose an AI personality. Two products are available: a consumer version for $69 with unlimited AI character access and a developer kit for $59 with open-source firmware. The device boasts a week-long battery life and has garnered significant community interest. ELATO AI emphasizes the creation of over a hundred unique and often complex AI personalities, ranging from comforting characters like Dottie Mae to flamboyant figures like Captain Star Flash and dark-humored personalities like Sugar Plum. The technology relies on real-time speech-to-speech conversion, leveraging multiple AI models from providers like OpenAI and Google to ensure low latency and high-quality conversations. The architecture involves the IoT device, a fast edge server for routing AI requests, and a front-end app for character selection and customization. ELATO AI aims for under two seconds of round-trip latency, with updates delivered over-the-air. The project’s core idea is to move AI interaction from screens into physical objects, offering personalized and potentially unrestrained digital companionship, prompting reflection on the implications of designing unhinged or provocative AI companions.

Gain digital sovereignty now and save costs

Let’s have a look at your digital challenges together. What tools are you currently using? Are your processes optimal? How is the state of backups and security updates?

Digital sovereignty is easily achieved with Open Source software (which usually costs way less, too). Our division Safeserver offers hosting, operation and maintenance for countless Free and Open Source tools.

Try it now for 1 Euro - 30 days free!

Download transcript (.srt)
0:00

Welcome to the Deep Dive. Before we jump in today, a quick thank you to our

0:03

supporter, Safe Server.

0:05

Safe Server handles software hosting, and they're really focused on supporting your

0:09

digital transformation

0:10

So if you're looking for reliable hosting you can check them out at

0:13

www.safeserver.de. Like I said, today

0:18

We're embarking on a mission that feels a bit

0:22

Well, a bit sci-fi maybe. It does lean that way. We're looking at ELATO AI.

0:27

Right, and the core idea is taking sophisticated conversational AI like really

0:32

advanced stuff and getting it out of the screen

0:34

Out of your phone or computer and into physical things specifically toys plushies

0:40

even. It sounds simple

0:41

But the sources suggest this is way beyond just a talking toy. Exactly.

0:45

We're gonna break down how they're trying to merge the hardware the software and

0:48

these really distinct AI personalities

0:51

The goal seems to be making these interactions feel hyper realistic. That's the

0:56

hook isn't it?

0:57

The source material literally says they're giving plushies voices that feel

1:00

ridiculously real

1:01

Yeah, it made me think of that movie Ted the talking teddy bear, right?

1:05

But imagine that powered by actual live cutting-edge AI. That's kind of wild and

1:10

that's what we're diving into

1:11

It's sort of the next step for digital companions moving them into the physical

1:15

world. Okay, let's start with the basics: the hardware.

1:18

For anyone maybe new to this kind of tech, what is the ELATO device?

1:23

Physically? Well, at its heart, it's a small gadget, an IoT client technically. It's

1:29

got the microphone

1:30

It's got the speaker but the clever part is how it attaches. Okay, it uses two

1:35

simple silicone straps

1:36

So you can clip it on to pretty much any toy you already have. Oh, right

1:40

So you don't need to buy their specific toy. Nope that old teddy bear in the attic

1:45

Yeah, suddenly you can have, you know, a brain and a voice. That flexibility seems

1:49

like a big deal

1:50

It really is, and the setup sounds incredibly simple, aimed at, well, anyone. No tech

1:55

skills needed

1:56

How simple are we talking? Like, three steps simple. First, clip the device onto the

2:00

toy.

2:01

Okay, second connect it to your home Wi-Fi. It uses what's called a captive portal

2:06

Uh-huh, like when you connect at a hotel? Exactly that. It makes its own little

2:09

network

2:10

temporarily, to guide you. Super easy. And third, pick a character personality from

2:14

their list and just start talking to it

2:17

Wow, okay. Now I saw there are actually two different products mentioned. Yeah, they're

2:21

catering to slightly different people

2:22

There's the main AI device. That's the consumer one, right? Pre-order price

2:26

mentioned was $69. That gets you the device,

2:29

access to all the AI characters, unlimited apparently, and a free month of their

2:34

premium subscription

2:35

It's the plug-and-play version. And the other one, for tinkerers?

2:38

That's the AI dev kit, a bit cheaper, $59 on pre-order.

2:42

This one's really for developers, makers, people who want to mess around with it. How

2:47

so? It has open-source firmware,

2:48

runs over a standard USB-C connection, and lets you load your own custom voices or

2:54

even your own AI models

2:55

if you want. Much more flexible if you're technically inclined. Gotcha. And practical

2:59

things.

3:00

Battery life: is this thing always plugged in? Apparently not. They claim a week of

3:04

battery life, which is pretty good. Makes it actually portable.

3:08

Yeah, that's essential if it's meant to be a companion, and I saw something about

3:12

community support. Uh-huh, over 1,200 stars,

3:15

they said, which suggests there's already a decent buzz around it. People are

3:19

interested. That early engagement is usually a good sign.

3:22

Definitely shows people are intrigued by the idea. Yeah, and maybe even want to

3:26

build on it themselves. Okay.

3:27

Let's shift gears. The hardware is neat, but the sources really emphasize the

3:32

personalities, the who.

3:34

This seems to be where ELATO really tries to stand out. Absolutely. This isn't

3:39

just about making a toy talk

3:40

It's about giving it a very specific often complex character

3:44

They mentioned over a hundred AI characters available. A hundred? And they're not

3:49

just slight variations

3:50

Yeah, not at all. The examples they give are incredibly diverse. They seem to be

3:54

leaning into strong personalities even flawed ones

3:57

Not just a helpful assistant. Okay, give us some examples. What kind of range are we

4:01

talking? Well, you've got the comforting,

4:04

nostalgic types

4:05

Like Dottie Mae. Dottie Mae? Described as a classic Southern diner waitress. Uses

4:10

terms like hun, sweetie,

4:12

gives unsolicited advice

4:15

recommends the pie

4:17

Pure comfort food in voice form, basically. Ah, okay, so that's one end. What about

4:22

the other end? Oh, they go there. Dramatic, flamboyant characters.

4:25

There's Captain Star Flash, a super overconfident space captain who thinks lasers

4:30

solve everything. Right. Or Dr.

4:32

Voltanus, the classic mad scientist, full of manic energy. Apparently shouts catchphrases,

4:36

think loud thunder effect

4:38

So you could clip this onto, like, a superhero toy or something. Exactly, or maybe

4:42

something completely incongruous for comedic effect

4:45

And what about more thoughtful characters? Yep.

4:48

They mentioned Paradox Pithius, an ancient Greek philosopher type. Sounds wise. Wise,

4:53

but also apparently kind of smug

4:54

He answers your deep questions

4:56

with even deeper, possibly more annoying questions. Makes you think, but maybe grinds

5:02

your gears a little. Okay,

5:03

this is where that uncensored aspect might come into play, right? The comedy take, Sugar Plum.

5:07

the description is fascinating

5:08

Speaks in a super sweet, bubbly, childlike voice. Sounds innocent.

5:13

But apparently drops comments so dark it makes Satan clutch his pearls. Whoa.

5:18

Okay, that's a choice. It's intentional friction, right? That contrast creates shock

5:23

value makes it memorable

5:25

It's not trying to be bland and they seem to lean into existing pop culture stuff,

5:29

too. I saw Ted mentioned

5:30

Yeah, Ted the inappropriate Teddy. Yeah, clearly referencing the movie character

5:35

Boston accent, barfly mouth.

5:37

Can you imagine where that goes? Uncensored indeed. Any other specific types? Loads.

5:42

They mentioned Mikey Sally Sullivan, hardcore Boston guy, swearing, rants. Then there's

5:47

the proper British lad

5:49

What's his deal? Judges your tea-making skills, apologizes constantly if you bump

5:53

into him. Very specific cultural niche.

5:55

It seems like they're aiming for very defined archetypes. Totally and it's not just

5:59

comedy or stereotypes

6:00

They even list Zohran Mamdani, the political activist. Yeah, described as empathetic,

6:04

focused on social equity and justice

6:07

So the range covers serious and specific viewpoints too, not just jokes.

6:11

So the strategy isn't just make a friend

6:14

It's pick a very specific, memorable character. Exactly. Depth and distinctiveness

6:18

over just being generally agreeable

6:21

You clip it on, you get that personality fully formed. Which brings us neatly to the

6:26

how. We know the what, the device. We know

6:28

these wild personalities.

6:31

How does the tech actually pull this off in real time making a toy have a

6:35

continuous natural conversation globally?

6:38

That sounds hard. It is hard. The key seems to be what the source calls real-time

6:43

speech to speech conversion

6:45

We're talking potentially up to 15 minutes of uninterrupted chat. 15 minutes? Wow.

6:50

How? They use what the source referred to as a brain trust. They're not relying on

6:54

just one AI model

6:55

Oh, okay. So they're pulling from multiple sources, which ones? Yeah, it's quite a

6:59

list of the big names right now

7:00

OpenAI's Realtime API,

7:02

Google's Gemini Live API, ElevenLabs AI agents, and also Hume AI's EVI. Four different

7:09

ones?

7:09

Why so many? Wouldn't that be complicated? It probably is, but the idea is that each

7:13

model has strengths

7:14

Maybe one is faster, one sounds more natural, one is better at catching emotional

7:18

cues

7:19

By using several they can kind of pick the best tool for the job for each part of

7:22

the conversation or blend them

7:25

It helps keep the latency low and the quality high. Like hedging your bets. That

7:30

makes sense, redundancy and optimization.
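
The sources don't show how that per-conversation routing decision is made, but in rough terms it could look like the minimal TypeScript sketch below. The provider names mirror the ones mentioned in the episode; the character-to-provider mapping and the health check are purely illustrative assumptions, not ELATO's actual logic.

```typescript
// Minimal sketch: route each conversation to one speech-to-speech provider.
// Provider IDs follow the episode; the mapping and health check are illustrative.
type Provider = "openai-realtime" | "gemini-live" | "elevenlabs-agents" | "hume-evi";

interface CharacterProfile {
  name: string;
  preferredProvider: Provider; // the provider whose voice/latency fits the character best
  fallback: Provider;          // used if the preferred provider is slow or unavailable
}

function pickProvider(
  profile: CharacterProfile,
  isHealthy: (p: Provider) => boolean,
): Provider {
  return isHealthy(profile.preferredProvider) ? profile.preferredProvider : profile.fallback;
}

// Example: a character that prefers one provider but can fall back to another.
const dottieMae: CharacterProfile = {
  name: "Dottie Mae",
  preferredProvider: "elevenlabs-agents",
  fallback: "openai-realtime",
};

console.log(`Routing this session to: ${pickProvider(dottieMae, () => true)}`);
```

The point is simply that the choice can be made per character and per request, which is what lets the system treat several providers as interchangeable parts.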

7:32

Okay, for someone listening who isn't a developer, can we simplify the architecture?

7:36

You mentioned a triangle earlier

7:37

Yeah, think of it as three core pieces working together really fast. First,

7:41

you've got the device itself, the IoT client, that ESP32 thing

7:45

we talked about, clipped to the toy. It just captures your voice and plays the AI's

7:49

voice, and sends the audio securely using WebSockets.

7:52

Okay, piece one, the ears and mouth on the toy. Exactly. Piece two is the edge server.

7:57

This runs on something called Deno. Think of it as the super-fast traffic controller

8:01

or router. Why "edge"?

8:02

It means it's located geographically close to you and also close to the big AI

8:06

models

8:07

Its whole job is to grab the audio from the toy

8:10

Instantly fire it off to the right AI service, like Gemini or ElevenLabs, get the

8:15

response back, and zap it straight to the toy's speaker. Minimizes delay.
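
The episode only describes that relay role in words, but a stripped-down Deno sketch of it might look like the following. The upstream provider URL is a placeholder, authentication and the provider's message protocol are omitted, and only the WebSocket plumbing reflects the described design.

```typescript
// Sketch of the edge relay on Deno: accept a WebSocket from the toy, open a
// second WebSocket to an AI provider, and pipe audio frames in both directions.
// The upstream URL is a placeholder; a real provider needs auth and a protocol.
Deno.serve((req: Request) => {
  if (req.headers.get("upgrade") !== "websocket") {
    return new Response("expected a WebSocket upgrade", { status: 400 });
  }

  const { socket: toy, response } = Deno.upgradeWebSocket(req);
  const upstream = new WebSocket("wss://ai-provider.example.com/realtime"); // placeholder

  toy.binaryType = "arraybuffer";
  upstream.binaryType = "arraybuffer";

  // Toy microphone audio -> AI provider (frames arriving before connect are dropped).
  toy.onmessage = (e) => {
    if (upstream.readyState === WebSocket.OPEN) upstream.send(e.data);
  };

  // AI reply audio -> toy speaker.
  upstream.onmessage = (e) => {
    if (toy.readyState === WebSocket.OPEN) toy.send(e.data);
  };

  // If either side goes away, tear down both connections.
  const closeBoth = () => {
    toy.close();
    upstream.close();
  };
  toy.onclose = closeBoth;
  upstream.onclose = closeBoth;

  return response;
});
```

Running that relay close to both the user and the model providers is what keeps each hop short.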

8:18

Got it, the middleman ensuring speed. And the third piece? That's the

8:25

front end

8:25

Basically the website or app you use, built with Next.js. This is where you choose

8:30

your characters

8:31

Maybe create custom ones, adjust the volume, that kind of thing. Ah, and I saw you

8:34

can tweak the pitch

8:35

Yeah, the pitch factor so you could take a serious character's voice and make it

8:39

sound high pitched and cartoonish if you wanted more

8:42

Customization. Okay. So the whole thing relies on speed. If there's a big delay, it

8:46

ruins the illusion of conversation

8:48

What kind of performance are they claiming? The numbers are pretty impressive,

8:51

especially for a global system

8:52

They're aiming for under two seconds of round-trip latency. Under two seconds, from you

8:58

speaking to hearing the reply?

9:00

Yeah, which is generally fast enough to feel pretty conversational, not like a walkie-talkie.
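
As a rough way to sanity-check a claim like that from the client side, you could timestamp the end of your speech and the arrival of the first reply audio frame. Everything in this sketch, the endpoint URL and the end-of-speech message, is hypothetical, not ELATO's actual protocol.

```typescript
// Hypothetical round-trip check: stamp the moment the user stops speaking and
// the moment the first reply audio frame arrives. URL and message are placeholders.
const session = new WebSocket("wss://edge.example.com/session"); // placeholder endpoint
session.binaryType = "arraybuffer";

let stoppedSpeakingAt: number | null = null;

// Call this from your VAD or push-to-talk handler when the user finishes a turn.
function onUserStoppedSpeaking(): void {
  stoppedSpeakingAt = performance.now();
  session.send(JSON.stringify({ type: "input_audio_done" })); // hypothetical message
}

session.onmessage = (event) => {
  if (event.data instanceof ArrayBuffer && stoppedSpeakingAt !== null) {
    const roundTripMs = performance.now() - stoppedSpeakingAt;
    console.log(`Round trip: ${roundTripMs.toFixed(0)} ms (claimed target: under 2000 ms)`);
    stoppedSpeakingAt = null;
  }
};
```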

9:05

And the audio quality? Does it sound clear?

9:07

They mentioned using the Opus codec at 12 kilobits per second.

9:11

Which in non-technical terms means it should sound pretty clear and crisp even

9:17

though they're keeping the data rate low for speed
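
The device itself encodes on its own hardware, so the snippet below isn't ELATO code; it's just a quick browser-side way to hear roughly what 12 kbps Opus sounds like, using the standard MediaRecorder API.

```typescript
// Browser-only illustration (not ELATO firmware): record the microphone as
// Opus at roughly 12 kbps to get a feel for audio quality at that bitrate.
async function recordLowBitrateOpus(): Promise<MediaRecorder> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream, {
    mimeType: "audio/webm;codecs=opus", // Opus, as mentioned in the episode
    audioBitsPerSecond: 12_000,         // ~12 kbps target bitrate
  });
  recorder.ondataavailable = (e) => console.log(`encoded chunk: ${e.data.size} bytes`);
  recorder.start(1000); // emit one encoded chunk per second
  return recorder;
}
```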

9:19

Okay, one more tech thing. How does it know when I've finished talking? Do I have

9:24

to press a button? No, and that's crucial

9:26

They use something called Server VAD, voice activity detection. Server VAD?

9:31

Right, instead of the little device trying to guess, the powerful server analyzes

9:36

the audio stream in real time

9:38

It figures out precisely when you've naturally paused or finished speaking. Ah, so

9:42

it makes turn-taking much smoother

9:44

Exactly, less awkward silence, fewer interruptions, key for making it feel real.
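
To make that concrete, this is roughly the shape of a "turn on server VAD" message for OpenAI's Realtime API, one of the providers the episode mentions; the numeric values are example settings, not ELATO's actual configuration.

```typescript
// Illustrative session update for OpenAI's Realtime API: let the server detect
// when the speaker has finished (server VAD). Values are examples only.
const enableServerVad = {
  type: "session.update",
  session: {
    turn_detection: {
      type: "server_vad",        // the server decides when the user stopped talking
      threshold: 0.5,            // speech-detection sensitivity (example)
      prefix_padding_ms: 300,    // audio kept from just before speech starts (example)
      silence_duration_ms: 500,  // pause length that counts as end of turn (example)
    },
  },
};

// Sent over the provider's realtime WebSocket once the session is open, e.g.:
// realtimeSocket.send(JSON.stringify(enableServerVad));
console.log(JSON.stringify(enableServerVad, null, 2));
```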

9:50

Plus they mentioned OTA updates. Over the air?

9:52

Yeah, means the software on the device can be updated automatically over Wi-Fi

9:56

So it can get better over time without you needing to plug it into a computer. Okay,

10:00

so putting it all together

10:01

It's quite an ambitious project. Merging these very specific, sometimes wild

10:06

personalities with hardware that enables smooth, fast

10:09

conversation. It really is. The big takeaway seems to be shifting AI interaction

10:14

away from just typing in a box. And

10:16

into a physical object you can actually talk with. Like, really talk with. Whether

10:21

you want that companion to be a nurturing waitress like Dottie Mae, or

10:24

a sarcastic philosopher or an inappropriate teddy bear. Right, it's that

10:29

customization delivered through a physical form.

10:32

So the final thought for you listening, the source emphasizes this device has no

10:37

filters, no rules.

10:39

We have the tech now to give an innocent looking plushie a voice that could be,

10:44

well,

10:44

deliberately offensive like Ted, or shockingly dark like Sugar Plum, or maybe even

10:48

politically charged.

10:50

If digital companionship becomes totally personalized and unrestrained, what does

10:55

that mean?

10:55

What happens when we start designing companions not to be helpful or polite, but

10:59

maybe

10:59

unhinged?

11:01

Provocative. Something to think about as this tech develops.

11:04

Well, that's all we have time for on this deep dive and thanks again to our

11:07

supporters Safe Server.

11:08

Remember, they handle software hosting and support digital transformation.

11:12

sources.
