1 00:00:00,000 --> 00:00:04,640 Welcome to the Deep Dive, the place where we unlock the secrets of emerging tech and 2 00:00:04,640 --> 00:00:08,950 really figure out why it matters to you. Today, we're tackling something that I 3 00:00:08,950 --> 00:00:10,820 think a lot of people are thinking about. Absolutely. 4 00:00:10,820 --> 00:00:16,420 We're talking about personal assistants, but with a huge twist: one that doesn't spy 5 00:00:16,420 --> 00:00:16,680 on you. 6 00:00:16,680 --> 00:00:23,060 We are diving into Leon AI. Right, if you've ever wanted that convenience, 7 00:00:23,060 --> 00:00:27,320 you know, of a digital helper, but without that feeling that some big company is 8 00:00:27,320 --> 00:00:28,800 listening in on your living room, 9 00:00:28,960 --> 00:00:31,640 this is for you. It's a very common feeling these days. 10 00:00:31,640 --> 00:00:36,790 It is. We're talking about an open-source, totally private, and self-hosted virtual 11 00:00:36,790 --> 00:00:37,620 assistant. 12 00:00:37,620 --> 00:00:41,680 But before we get into why Leon is sometimes called a virtual brain, 13 00:00:41,680 --> 00:00:45,970 we really want to thank the supporter of this deep dive. Of course, this show is 14 00:00:45,970 --> 00:00:47,340 brought to you by Safe Server. 15 00:00:47,340 --> 00:00:51,320 So if you're looking to host software just like Leon and you need some support for 16 00:00:51,320 --> 00:00:52,640 your digital transformation, 17 00:00:52,800 --> 00:00:58,010 Safe Server is there for you. You can find all the info at www.safeserver.de. Okay, 18 00:00:58,010 --> 00:00:59,320 so let's unpack this. 19 00:00:59,320 --> 00:01:03,330 We've all used the big-name digital assistants. Yeah, but they all work on a trade. 20 00:01:03,330 --> 00:01:05,280 You give them your data for convenience. 21 00:01:05,280 --> 00:01:09,140 That's the deal. 
Yeah, our mission today is to figure out what makes Leon 22 00:01:09,140 --> 00:01:10,600 fundamentally different. 23 00:01:10,600 --> 00:01:14,880 We want to focus on its core idea of ownership and 24 00:01:14,880 --> 00:01:19,480 the really cool modular architecture behind it. And that's the hook right there. 25 00:01:19,920 --> 00:01:25,000 Leon is designed to be your virtual brain, but, and this is the critical part, 26 00:01:25,000 --> 00:01:31,370 this brain can live entirely on your server. On my server, not theirs. Exactly. That's 27 00:01:31,370 --> 00:01:31,880 the whole shift. 28 00:01:31,880 --> 00:01:35,620 It's an assistant that does stuff when you ask it to. Whether you're talking or just 29 00:01:35,620 --> 00:01:38,560 typing, you control the entire relationship. 30 00:01:38,560 --> 00:01:42,450 So functionally it's similar to what we already know, but the whole infrastructure 31 00:01:42,450 --> 00:01:44,560 is flipped upside down. Completely inverted. 32 00:01:44,560 --> 00:01:48,560 So let's lead with open source. That MIT license, which I know the creator 33 00:01:48,560 --> 00:01:51,740 chose, what does that actually mean for a regular person? 34 00:01:51,740 --> 00:01:54,760 It means freedom and it means speed of development. 35 00:01:54,760 --> 00:01:58,810 The MIT license is basically the most permissive one out there. So no strings 36 00:01:58,810 --> 00:02:00,480 attached? Pretty much. 37 00:02:00,480 --> 00:02:05,940 It signals that the goal is growth, it's contribution, with almost no restrictions. 
38 00:02:05,940 --> 00:02:06,900 Anyone can see the code. 39 00:02:06,900 --> 00:02:10,600 They can change it, use it for their own stuff, commercial or not, and then contribute 40 00:02:10,600 --> 00:02:10,920 back. 41 00:02:11,080 --> 00:02:15,000 So you're not just a user, you're part of a community that actually owns the tool. 42 00:02:15,000 --> 00:02:16,160 And that idea of ownership 43 00:02:16,160 --> 00:02:21,090 feeds directly into its main selling point, which is privacy. Yes, that is the 44 00:02:21,090 --> 00:02:21,480 central pillar. 45 00:02:21,480 --> 00:02:25,820 This is where Leon goes head-to-head with all the concerns about data tracking from 46 00:02:25,820 --> 00:02:27,900 commercial AI. Because it's on my server. 47 00:02:27,900 --> 00:02:31,240 Exactly, you are in complete control of your data. 48 00:02:31,240 --> 00:02:36,450 No conversation goes to a third-party server unless you explicitly set it up that 49 00:02:36,450 --> 00:02:39,320 way. And the source material really highlights this: 50 00:02:39,880 --> 00:02:45,270 you can configure Leon to work while being completely offline. Wait, okay, offline 51 00:02:45,270 --> 00:02:46,960 mode. That's a total game-changer. 52 00:02:46,960 --> 00:02:51,890 You're talking about sensitive business data or just conversations in your own home. 53 00:02:51,890 --> 00:02:53,440 That gives you complete 54 00:02:53,440 --> 00:02:59,150 digital sovereignty. Mm-hmm. 
But here's the question then: if I host it myself, am I 55 00:02:59,150 --> 00:03:02,080 sacrificing all the power of, you know, 56 00:03:02,080 --> 00:03:06,020 massive cloud computers? And that's the exact challenge Leon is built to solve. 57 00:03:06,160 --> 00:03:09,960 It uses its modularity to stay powerful while keeping you in control. 58 00:03:09,960 --> 00:03:13,240 The whole goal is to automate stuff and make your virtual life easier. 59 00:03:13,240 --> 00:03:18,320 Right, so it uses this really flexible structure to manage all that complexity 60 00:03:18,320 --> 00:03:19,860 right there, locally. 61 00:03:19,860 --> 00:03:23,040 So tell me about that structure. It's not just one big program, 62 00:03:23,040 --> 00:03:27,480 is it? It's built on a system of modules they call skills. Correct. 63 00:03:27,480 --> 00:03:31,400 Think of Leon's core as, like, the central nervous system. 64 00:03:31,400 --> 00:03:37,160 Yeah, it's the conductor. The skills are the individual musicians in the orchestra. 65 00:03:37,160 --> 00:03:41,390 I like that analogy. Each skill is a tiny self-contained module that does one 66 00:03:41,390 --> 00:03:42,400 specific thing. 67 00:03:42,400 --> 00:03:46,520 It could check the weather, manage your calendar, control your smart lights, 68 00:03:46,520 --> 00:03:49,980 anything. So if I want Leon to learn something new, I don't have to update the 69 00:03:49,980 --> 00:03:50,760 entire system, 70 00:03:50,760 --> 00:03:52,760 I just add a new skill. Precisely. 71 00:03:52,760 --> 00:03:57,760 It makes it super scalable without making the core program bloated or fragile. The 72 00:03:57,760 --> 00:03:59,500 structure lets anyone, 73 00:03:59,500 --> 00:04:03,040 you know, any developer out there, create their own skills and then share them. Like 74 00:04:03,040 --> 00:04:04,200 an app store for abilities. 75 00:04:04,200 --> 00:04:08,520 That's a great way to put it. The creators even say there is only one core to rule 76 00:04:08,520 --> 00:04:08,760 them all, 77 
00:04:08,760 --> 00:04:12,440 to keep it all consistent. And the plan is to build a dedicated skill registry 78 00:04:12,440 --> 00:04:13,200 platform. 79 00:04:13,200 --> 00:04:18,140 So like npm for JavaScript or pip for Python? Exactly like that, but just for Leon's 80 00:04:18,140 --> 00:04:19,080 automation skills. 81 00:04:19,080 --> 00:04:24,180 It'll make finding and installing new abilities super simple. That's really elegant. 82 00:04:24,280 --> 00:04:29,210 But let's look behind the curtain for a second. The repository, the actual code, has a 83 00:04:29,210 --> 00:04:30,440 bunch of different parts. 84 00:04:30,440 --> 00:04:32,440 What are those other things doing? 85 00:04:32,440 --> 00:04:35,320 Like, why does a personal assistant need a TCP server? 86 00:04:35,320 --> 00:04:40,160 Yeah, that gets a little technical, but it's all about performance. 87 00:04:40,160 --> 00:04:44,000 The different parts of Leon need to talk to each other and they need to do it fast. 88 00:04:44,000 --> 00:04:49,640 You have the main server, the skills, the web app for the interface, and the hotword 89 00:04:49,640 --> 00:04:52,860 node that's always listening for you to say its name. So they're all separate 90 00:04:52,860 --> 00:04:53,360 processes? 91 00:04:53,360 --> 00:04:57,320 They are, and the TCP server is like the internal switchboard. 92 00:04:57,320 --> 00:05:00,840 It just makes sure data gets passed between all those different parts reliably and 93 00:05:00,840 --> 00:05:01,440 quickly. And 94 00:05:01,440 --> 00:05:06,380 since so many AI tools are written in Python, they also needed a dedicated Python 95 00:05:06,380 --> 00:05:08,720 bridge to translate. Because it translates? 96 00:05:08,720 --> 00:05:14,040 Yeah, it lets the JavaScript core talk seamlessly with all the Python AI libraries 97 00:05:14,040 --> 00:05:17,870 they need to use. That paints a much clearer picture. 
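To make the "internal switchboard" idea a bit more concrete, here is a minimal sketch of how two separate processes might frame the messages they pass over a TCP connection. It assumes newline-delimited JSON, which is an illustrative choice and not Leon's documented wire format; the function names and the message fields (`skill`, `utterance`) are hypothetical.

```python
from __future__ import annotations

import json


def encode_message(message: dict) -> bytes:
    """Serialize one message as a single newline-terminated JSON line.

    json.dumps escapes newlines inside strings, so a raw newline byte
    can safely mark the end of a message.
    """
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_messages(buffer: bytes) -> tuple[list[dict], bytes]:
    """Split a receive buffer into complete messages plus leftover bytes.

    TCP is a byte stream, so a read may end in the middle of a message.
    The unfinished tail is returned so the caller can prepend it to the
    next chunk it receives.
    """
    *complete, remainder = buffer.split(b"\n")
    return [json.loads(line) for line in complete if line], remainder
```

The receiving side keeps the leftover bytes between reads, so a message split across two TCP packets is still decoded exactly once.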
It's like a distributed system 98 00:05:17,870 --> 00:05:18,320 in a box. 99 00:05:18,320 --> 00:05:23,140 So speaking of cutting-edge tools, what's the current state of development? 100 00:05:23,760 --> 00:05:27,220 I read the dev branch was going through some major changes around June 101 00:05:27,220 --> 00:05:32,160 2024. It is. The biggest shift is the integration of what are called foundation 102 00:05:32,160 --> 00:05:34,340 models. Hmm. Okay, for a beginner, 103 00:05:34,340 --> 00:05:38,700 what does that mean? Think of foundation models as those huge general-purpose AI 104 00:05:38,700 --> 00:05:39,200 brains, 105 00:05:39,200 --> 00:05:42,720 like the ones that power the most famous chatbots you hear about. 106 00:05:42,720 --> 00:05:48,520 Leon is starting to use those but, and this is key, in a hybrid approach. Hybrid? 107 00:05:48,520 --> 00:05:51,560 So it's not just using that massive AI brain for everything? 108 00:05:51,560 --> 00:05:55,740 No, because that would be really slow and heavy for a self-hosted assistant. Their 109 00:05:55,740 --> 00:05:56,960 hybrid model is smart. 110 00:05:56,960 --> 00:06:00,320 It balances the power of those big models for complex tasks, 111 00:06:00,320 --> 00:06:05,260 like, say, "summarize this long email for me," with simpler, faster techniques for basic 112 00:06:05,260 --> 00:06:06,520 commands. Like "turn on the light"? 113 00:06:06,520 --> 00:06:10,970 Exactly. For something simple like that, it uses a much lighter, almost instant 114 00:06:10,970 --> 00:06:12,120 classification method. 115 00:06:12,120 --> 00:06:16,050 It only calls in the big guns, the foundation models, when it actually needs that 116 00:06:16,050 --> 00:06:16,820 extra brainpower. 117 00:06:17,040 --> 00:06:21,970 So you get the best of both worlds: speed for simple things and power for complex 118 00:06:21,970 --> 00:06:22,080 ones. 119 00:06:22,080 --> 00:06:27,650 Optimal speed and accuracy without sacrificing that local-first efficiency. Okay. 
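The hybrid routing just described can be sketched in a few lines. This is an illustration only and not Leon's implementation: simple keyword overlap stands in for the lightweight classifier, a stub function stands in for the foundation model, and every name here (`SKILLS`, `classify`, `route`, the two example skills) is hypothetical.

```python
from typing import Callable, Optional


def lights_on(utterance: str) -> str:
    """Hypothetical skill: one small module that does one specific thing."""
    return "Turning on the light."


def weather(utterance: str) -> str:
    """Hypothetical skill for weather questions."""
    return "Checking the weather."


# Each skill is registered with the keywords that identify it.
SKILLS = [
    ({"turn", "on", "light"}, lights_on),
    ({"weather", "forecast"}, weather),
]


def classify(utterance: str, threshold: float = 0.5) -> Optional[Callable[[str], str]]:
    """Return the best-matching skill, or None if nothing matches well enough."""
    words = set(utterance.lower().split())
    best_handler, best_score = None, 0.0
    for keywords, handler in SKILLS:
        # Score is the fraction of the skill's keywords found in the utterance.
        score = len(words & keywords) / len(keywords)
        if score > best_score:
            best_handler, best_score = handler, score
    return best_handler if best_score >= threshold else None


def foundation_model_stub(utterance: str) -> str:
    """Placeholder for the slow, heavy model used only for complex requests."""
    return f"[large model] handling: {utterance}"


def route(utterance: str, fallback: Callable[[str], str] = foundation_model_stub) -> str:
    """Try the fast path first and fall back to the big model otherwise."""
    handler = classify(utterance)
    return handler(utterance) if handler else fallback(utterance)
```

With these definitions, `route("turn on the light")` is answered by the fast path, while `route("summarize this long email for me")` matches no skill and goes to the fallback. Adding an ability means appending one entry to `SKILLS`, which mirrors the "just add a new skill" idea from earlier in the conversation.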
120 00:06:27,650 --> 00:06:29,380 Here's where it gets really interesting for me: 121 00:06:29,380 --> 00:06:32,040 the AI toolbox itself. 122 00:06:32,040 --> 00:06:37,120 How does Leon get its ears and mouth in a way that still respects my privacy? Right. 123 00:06:37,120 --> 00:06:42,840 We're talking about three core AI concepts here: NLP, or natural language processing, 124 00:06:42,840 --> 00:06:46,860 which is for understanding what you mean. The brain part. The brain part. Then TTS, or 125 00:06:46,860 --> 00:06:48,640 text-to-speech, for its voice, and 126 00:06:48,640 --> 00:06:55,170 STT, speech-to-text, for listening to you. And I, the user, get to choose which tech I 127 00:06:55,170 --> 00:06:57,620 want for all of those? You get total control. 128 00:06:57,620 --> 00:07:01,680 Absolute control. This modularity means you can choose what you're comfortable with. 129 00:07:01,680 --> 00:07:04,600 If you want the absolute best performance and the clearest voice, 130 00:07:04,600 --> 00:07:09,540 you can connect it to cloud services like Google Cloud, AWS, or IBM Watson. But if I'm 131 00:07:09,540 --> 00:07:10,760 doing this whole thing for privacy, 132 00:07:10,760 --> 00:07:13,840 I'd want the local options. Are they any good? Are they robust enough? 133 00:07:13,840 --> 00:07:17,890 They're getting better all the time. And yes, there are strong offline options. For 134 00:07:17,890 --> 00:07:19,000 text-to-speech, 135 00:07:19,000 --> 00:07:23,720 you have tools like CMU Flite. For the listening part, speech-to-text, 136 00:07:23,720 --> 00:07:29,170 they support Coqui STT, and more are on the way. What's the trade-off then? 
The trade-off 137 00:07:29,170 --> 00:07:31,120 is usually a little bit of speed, and 138 00:07:31,120 --> 00:07:34,800 maybe the voice doesn't sound quite as natural as the big cloud ones. 139 00:07:35,320 --> 00:07:39,880 But for people who are serious about privacy, the peace of mind you get from running 140 00:07:39,880 --> 00:07:41,660 it all locally is, well, 141 00:07:41,660 --> 00:07:43,540 it's often worth that small dip in quality. 142 00:07:43,540 --> 00:07:47,500 It's just amazing how the flexibility is built into every single layer, from the 143 00:07:47,500 --> 00:07:49,320 core right down to the voice engine. 144 00:07:49,320 --> 00:07:54,410 Now, open-source projects like this live and die by their community. For someone 145 00:07:54,410 --> 00:07:56,200 listening who's getting excited about this, 146 00:07:56,200 --> 00:07:58,520 how is it all maintained and how can they get involved? 147 00:07:58,520 --> 00:08:03,440 Leon is really driven by that idea that the more skills he has, the more skillful he 148 00:08:03,440 --> 00:08:04,320 becomes. The author, 149 00:08:04,440 --> 00:08:06,840 Louis Grenard, is big on community. 150 00:08:06,840 --> 00:08:10,600 He encourages people to join their Discord channel to share ideas or, you know, 151 00:08:10,600 --> 00:08:15,060 even contribute code. And building something this complex has to take a ton of time. 152 00:08:15,060 --> 00:08:17,140 It does. It's mostly a spare-time project. 153 00:08:17,140 --> 00:08:20,240 So what's the plan for sustainability? Yeah. To make sure it doesn't just, you know, 154 00:08:20,240 --> 00:08:20,840 fade out? 155 00:08:20,840 --> 00:08:24,800 That's a super important point for any big open-source project. 156 00:08:24,800 --> 00:08:28,760 The source material notes that, at the end of the day, sustainability is key. The 157 00:08:28,760 --> 00:08:30,560 author has bills to pay, right? 
158 00:08:30,560 --> 00:08:35,800 We all do. We all do. So sponsoring the project lets him and other core contributors 159 00:08:35,800 --> 00:08:40,740 dedicate more real, focused time to it instead of just grabbing a few hours here and 160 00:08:40,740 --> 00:08:40,980 there. 161 00:08:40,980 --> 00:08:46,660 Financial support helps turn a passion project into, well, a full-time dedicated job. 162 00:08:46,660 --> 00:08:47,820 That makes perfect sense. 163 00:08:47,820 --> 00:08:52,760 Okay, let's get practical. For anyone listening who's ready to install this virtual 164 00:08:52,760 --> 00:08:54,480 brain on their own machine, 165 00:08:54,480 --> 00:08:58,680 what do they need to get started? It's actually not as scary as it sounds. 166 00:08:58,680 --> 00:09:03,840 You just need a few standard things installed first. This is for Linux, macOS, or 167 00:09:03,840 --> 00:09:04,400 Windows. 168 00:09:04,400 --> 00:09:08,000 You need a modern version of Node.js and its package manager, npm. 169 00:09:08,000 --> 00:09:12,120 Okay, Node.js and npm, got it. And we always recommend using the latest stable 170 00:09:12,120 --> 00:09:12,680 versions. 171 00:09:12,680 --> 00:09:17,020 It just helps avoid headaches down the line. So once that's on my machine, what's 172 00:09:17,020 --> 00:09:18,600 the first Leon-specific step? 173 00:09:18,600 --> 00:09:23,360 The first step is to install the Leon command line interface, or CLI. 174 00:09:23,520 --> 00:09:28,760 You just open your terminal and type npm install --global @leon-ai/cli. 175 00:09:28,760 --> 00:09:32,280 That one command gives you the main tool to manage Leon. 176 00:09:32,280 --> 00:09:36,260 Okay. Then, to actually download all the files and set up your assistant, 177 00:09:36,260 --> 00:09:42,110 you run a really simple, kind of cool command: leon create birth. "leon create birth." 
I 178 00:09:42,110 --> 00:09:42,520 like that. 179 00:09:42,520 --> 00:09:46,120 Yeah, it downloads everything, gets the whole structure ready. For beginners, 180 00:09:46,120 --> 00:09:48,800 we definitely say stick to the stable version for now, 181 00:09:48,800 --> 00:09:52,040 especially with the big changes we talked about happening on the development branch. 182 00:09:52,040 --> 00:09:55,740 Good tip. And then to actually fire it up, you just run leon start. The server will 183 00:09:55,740 --> 00:09:57,400 boot up and then you just open a 184 00:09:57,400 --> 00:10:01,960 web browser and go to http://localhost:1337, and that's it. 185 00:10:01,960 --> 00:10:02,520 That's it? 186 00:10:02,520 --> 00:10:05,690 You should see the web interface ready for you to configure skills and start 187 00:10:05,690 --> 00:10:07,560 talking to your own private AI. 188 00:10:07,560 --> 00:10:11,740 They've really tried to make self-hosting less intimidating. This has been a really 189 00:10:11,740 --> 00:10:13,200 fascinating deep dive. 190 00:10:13,200 --> 00:10:17,240 Leon AI is clearly this powerful, privacy-first, and 191 00:10:17,760 --> 00:10:22,690 super customizable assistant that's built by and for its community. Yeah, it's more 192 00:10:22,690 --> 00:10:23,360 than just an alternative. 193 00:10:23,360 --> 00:10:27,810 It really is. Yeah, it feels like it's demanding a higher standard for how we think 194 00:10:27,810 --> 00:10:28,920 about our data. 195 00:10:28,920 --> 00:10:33,420 And that kind of leads us to a final provocative question for you to talk about. If 196 00:10:33,420 --> 00:10:35,500 we connect the rise of these 197 00:10:35,500 --> 00:10:40,200 open-source, self-hosted AIs like Leon to the bigger picture, 198 00:10:40,200 --> 00:10:45,040 how will this shift change what we expect from technology? 
199 00:10:45,360 --> 00:10:50,290 Will self-hosting tools like this eventually make us see those big centralized 200 00:10:50,290 --> 00:10:51,160 assistants as being, you know, 201 00:10:51,160 --> 00:10:55,160 fundamentally broken because they're missing the most important feature, 202 00:10:55,160 --> 00:11:00,250 privacy? That is something to mull over as you decide what tech you want in your 203 00:11:00,250 --> 00:11:01,880 life and where you want to draw that line 204 00:11:01,880 --> 00:11:03,880 on your own digital sovereignty. 205 00:11:03,880 --> 00:11:07,560 Thank you again to Safe Server for supporting this deep dive into open-source 206 00:11:07,560 --> 00:11:08,240 architecture. 207 00:11:08,240 --> 00:11:11,240 You can learn more about how they can help with your software hosting and digital 208 00:11:11,240 --> 00:11:13,920 transformation at www.safeserver.de. 209 00:11:14,320 --> 00:11:15,640 We'll catch you next time. 210 00:11:15,640 --> 00:11:17,040 We will, for another Deep Dive.