Okay, let's dive in. Welcome back to the Deep Dive, where we try to make sense of complex tech stuff and make it useful for you. Today we're tackling Apache Airflow. It's this, well, this orchestration tool you hear about everywhere, handling everything from simple scripts right up to really complex ML pipelines. And our mission today, especially for you, the listener maybe just starting out, is to really pull back the curtain on Airflow. We want you to walk away understanding what it actually is, its core ideas, and why, frankly, it's become such a standard for managing workflows.

Yeah, absolutely. I mean, if you've ever had that horrible 2 a.m. wake-up call because some critical nightly script failed, and you're scrambling across a dozen systems just to figure out what went wrong, well, that's the core problem Airflow sets out to solve. The sources define it pretty clearly: it's a platform built by the community to programmatically author, schedule, and monitor workflows.

Programmatically. Okay, that sounds key.

It is key. That's the shift in thinking we want you to grasp. It's not just about setting timers. It's treating the whole workflow, the entire process, as actual software. And when it's all code, suddenly it's maintainable. You can version it like any other code. You can test it properly. And maybe most importantly, multiple people can collaborate on it effectively.

Right. That leap from scattered scripts everywhere to a codified, managed process, that's the aha moment. That's what makes Airflow feel almost indispensable once you've used it.

Okay, that makes sense. And handling these complex automations needs solid infrastructure, so we really want to thank the supporter of this Deep Dive, SafeServer. SafeServer supports hosting for tools like Airflow and helps with your digital transformation journey. You can find out more at www.safeserver.de.

All right, let's get into the nuts and bolts then. The building blocks. The sources really stress that Airflow defines these sometimes really complex multi-step processes entirely in pure Python.
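[Editor's note: to make that concrete, here is a minimal sketch of what such a DAG file can look like. It is not from the episode; the DAG name, schedule, and task bodies are invented, and it assumes Airflow 2.4 or later.]

```python
# Minimal DAG sketch: two Python tasks, run daily, "load" after "extract".
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pulling data...")


def load():
    print("loading data...")


with DAG(
    dag_id="example_nightly_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                  # ordinary cron strings work here too
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task  # "load" only starts after "extract" succeeds
```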
Yeah, and that's a huge, huge advantage for getting started and for keeping things maintainable down the line. If you know Python, you're basically good to go. You just use standard Python features you already know, like datetime objects for scheduling, and you can use loops to generate tasks dynamically. It gives you complete flexibility.

So no weird XML files or obscure command-line flags to learn?

Nope. None of that black magic. Just Python.

Okay. So we define all the steps in Python. How does Airflow actually know the order to run things in? What manages the dependencies?

Ah, that brings us to a really core concept: the DAG. That stands for Directed Acyclic Graph.

Right. DAGs. Heard that term.

It's the fundamental unit of work in Airflow. The DAG is essentially the blueprint for your workflow. It lays out all the individual tasks and, crucially, the dependencies between them. Then the Airflow scheduler looks at that DAG and executes the tasks, making absolutely sure that, say, step B doesn't even think about starting until step A has finished successfully.

That control aspect sounds really powerful. But hang on. If my process is always the same, you know, pretty static, why wouldn't I just use a basic scheduled script or maybe a simple serverless function? Why add the overhead of Airflow?

That's a really good question, and it touches on the difference between basic scheduling and proper orchestration. Airflow brings all the extras: handling complex failures and retries, figuring out dependencies, providing visibility. Things simple schedulers just don't do. But you're right about the scope. Airflow really shines when your workflows are, let's say, mostly static or change slowly over time. And critically, the tasks inside your workflow ideally need to be idempotent.

Idempotent. Okay, define that for us.

Idempotent means that running the same task multiple times produces the exact same result as running it successfully just once.

Ah, so if a task fails halfway through and Airflow automatically reruns it...

Exactly. You don't want it creating duplicate data, or sending the same email twice, or charging a credit card again.
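[Editor's note: here is a hedged sketch of one common way to get idempotency, the delete-then-insert pattern on a date partition. The table and column names are hypothetical and the connection is assumed to be a DB-API-style object; the point is that rerunning the function for the same date leaves the database in the same final state.]

```python
# Idempotent daily load: replace the day's rows wholesale instead of appending.

def load_daily_sales(conn, ds: str) -> None:
    """Load one day of sales; safe to rerun for the same `ds` (YYYY-MM-DD)."""
    with conn.cursor() as cur:
        # Wipe whatever a previous (possibly failed) run wrote for this date...
        cur.execute("DELETE FROM sales_daily WHERE sale_date = %s", (ds,))
        # ...then rebuild the full day. Running this twice produces exactly
        # the same table contents as running it once.
        cur.execute(
            """
            INSERT INTO sales_daily (sale_date, store_id, total)
            SELECT sale_date, store_id, SUM(amount)
            FROM sales_raw
            WHERE sale_date = %s
            GROUP BY sale_date, store_id
            """,
            (ds,),
        )
    conn.commit()
```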
You have to design your tasks so that rerunning them is safe and leads to the same final state. It's a vital discipline for stable pipelines.

Got it. So Airflow is like the orchestra conductor, making sure everyone plays their part, even if they need to restart a measure. But I remember the sources warning it's not for moving huge amounts of data between tasks.

That's correct. It's definitely not a streaming solution. Tasks can pass small bits of information, little pieces of metadata, between each other using something called XComs, short for cross-communication.

XComs.

Think of XComs like passing a little note: maybe a file path, a database record ID, or just a status flag. They're absolutely not designed for shuffling gigabytes of data around. If you have tasks that need to process large volumes of data, the best practice is always, always to delegate that heavy lifting to an external system built for it, like a database query, a Spark job, or a dedicated data processing service. Airflow just triggers and monitors it.

Okay, makes sense. Keep Airflow focused on the orchestration. Now, thinking about best practices, let's talk about the core design ideas that make Airflow so popular. The sources mention four key principles. Let's start with dynamic. What's the benefit there?

This is a massive difference compared to older scheduling tools. Because your pipelines are just Python code, you can use all the power of Python, loops, functions, classes, imports, conditional logic, to actually generate your pipelines dynamically.

Wait, so say I onboard 50 new clients and each one needs a slightly different daily reporting pipeline. I don't have to manually create 50 separate DAG files?

Exactly. You could write Python code that reads a list of clients and generates a unique, parameterized DAG instance for each one automatically. That's incredibly powerful for managing complexity at scale.

Wow, okay. That saves a ton of manual effort.
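[Editor's note: a rough sketch of that generate-DAGs-in-a-loop pattern follows. The client list and task body are invented; in practice the list would often come from a config file or a database. Assumes Airflow 2.4 or later.]

```python
# Dynamic DAG generation: one parameterized DAG per client.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

CLIENTS = ["acme", "globex", "initech"]  # hypothetical; often read from config


def build_report(client: str) -> None:
    print(f"building daily report for {client}")


for client in CLIENTS:
    with DAG(
        dag_id=f"daily_report_{client}",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(
            task_id="build_report",
            python_callable=build_report,
            op_kwargs={"client": client},
        )
    # Airflow discovers DAGs by scanning module-level globals,
    # so keep a reference to each DAG object created in the loop.
    globals()[dag.dag_id] = dag
```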
What's next? Scalable?

Right. Airflow is built with a modular architecture. It uses a message queue, think Celery or RabbitMQ, to distribute tasks out to potentially many worker machines. It's designed from the ground up to scale out horizontally. You can add more workers as your workload grows. It's meant to scale, theoretically, to infinity.

Okay. Dynamic generation, scales out. What about connecting to everything? That sounds like extensible.

Precisely. You're not stuck with just the built-in tools. Airflow has this concept of operators. If the standard operator for, say, interacting with your specific database or cloud service doesn't quite do what you need...

You can just write your own.

Yep. You can easily define your own custom operators, hooks, and sensors. You can extend the libraries to create the exact level of abstraction that makes sense for your team and your environment. It's very flexible.

And the last principle, elegant. That sounds a bit subjective.

It speaks more to the developer experience, I think. The idea is that the pipelines themselves, the Python code defining the DAGs, should be lean, clear, and explicit. It also uses the Jinja templating engine pretty heavily, which is built right into the core. This lets you parameterize your tasks really effectively, passing in dates, configurations, things like that, without making the Python code itself overly complicated or messy. It keeps things readable.
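[Editor's note: a small, hedged example of that templating. The DAG name is invented and `export.py` is assumed to exist; `{{ ds }}` is one of Airflow's built-in template variables, the run's logical date, rendered into the command at run time.]

```python
# Jinja templating sketch: the date changes per run, the code does not.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="templated_example",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
):
    BashOperator(
        task_id="export_partition",
        # {{ ds }} is rendered to the logical date (YYYY-MM-DD) at run time.
        bash_command="python export.py --date {{ ds }}",
    )
```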
That internal elegance seems to carry over to the outside, too. Because, honestly, having spent way too many hours staring at cryptic cron logs in a terminal, the fact that Airflow has a proper visual UI feels like more than just a nice-to-have. It feels essential.

Oh, it's a huge part of the appeal. It's definitely the anti-cron experience. One of its absolute standout features is the useful UI. It gives you this really robust, modern web application where you can see everything: you monitor workflows, you can trigger them manually, you can manage connections and variables. It's all visual. You get full insight into what's running, what failed, and access to the logs for every single task run.

And it's not just one basic dashboard, right? There are specific views.

Yeah, several really helpful ones. There's the main DAGs overview showing all your pipelines. There's the grid view, which is great for seeing task statuses laid out over time. And critically, there's the graph view. This actually draws out your DAG, showing all the tasks and their dependencies, and it colors them based on the status of a specific run. You can instantly see the flow and pinpoint exactly where something went wrong, and often why. There's also a code view to see the DAG's source code directly in the UI. That visual debugging is invaluable.

And this ties into the robust integrations, doesn't it? The UI manages connections to all sorts of systems.

Absolutely. Airflow comes packed with plug-and-play operators for basically everything you'd expect in a modern tech stack: all the major clouds, Google Cloud Platform, AWS, Microsoft Azure, plus databases, messaging systems, container orchestrators, data warehouses, and tons of third-party services too. So chances are, whatever infrastructure you're using now or planning to use, Airflow probably has ready-made components to interact with it. That makes adoption much smoother.

It definitely feels enterprise-ready, which probably explains the huge community around it. It's an official Apache Software Foundation project, right? Open source?

Completely open source. And the community is massive and very active. You mentioned the GitHub stars earlier: tens of thousands of stars, thousands of contributors. There's a really busy Slack channel where people help each other out. It's a very vibrant ecosystem.

That open source nature makes it easy to get started, which is great for beginners. But you mentioned earlier there's a difference between just running it locally and setting it up for real, for production.

Yes, that's a critical distinction. Anyone with some Python knowledge can probably get a simple workflow running on their laptop fairly quickly. It is easy to use in that sense.
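[Editor's note: one convenience worth knowing at that laptop stage, with the caveat that this assumes Airflow 2.5 or later: a DAG file can run itself end to end for debugging, no scheduler or web server required, via `DAG.test()`.]

```python
# Appended to the bottom of a DAG file for quick local debugging.
# Runs every task of the DAG in-process, in dependency order.
if __name__ == "__main__":
    dag.test()  # `dag` is the DAG object defined earlier in the same file
```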
However, deploying Airflow for production workloads has some strict requirements you absolutely need to be aware of.

Okay, like what?

The big one is the operating system. Airflow is only officially supported for production on POSIX-compliant operating systems. Basically, that means Linux. The community maintains a reference Docker image based on Debian Bookworm, which is a good standard to follow.

And if you're a Windows user? No native support?

No native production support. You must use either the Windows Subsystem for Linux version 2, WSL 2, or run Airflow within Linux containers, perhaps using Docker Desktop. That's non-negotiable for a stable production setup.

Okay, that's a major infrastructure point. Linux first. What else?

The database. When you first install Airflow, it defaults to using SQLite. That's fine, but only for local development and testing, just to try things out.

But not for production.

Absolutely not recommended for production. SQLite doesn't handle concurrent access well, which you definitely have in a production Airflow setup with multiple components hitting the database. For production, you really need a proper, robust database like PostgreSQL or MySQL.

Got it. Use a real database. That need for stability seems reflected in how they manage the project itself, too.

Yeah, they've become quite rigorous about it. Since version 2.0.0, Airflow follows strict semantic versioning: major.minor.patch. This gives you predictability. You know a patch release won't break things, and a minor release might add features but should be backward compatible within that major version. They also actively manage their dependencies on other big libraries, things like SQLAlchemy for the database interaction, Flask for the web UI, Celery or Kubernetes for scaling. They often set upper version bounds on these dependencies.

Why do that?

To ensure stability. It prevents a situation where you upgrade Airflow but an underlying library it depends on has also updated with a breaking change you weren't expecting. By pinning or capping dependency versions, they provide a more predictable and stable upgrade experience for you.
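[Editor's note: that same pinning philosophy shows up when you install Airflow itself. The project publishes constraint files so pip resolves the exact dependency set tested for a given Airflow and Python version. A hedged sketch; the version numbers below are just examples, substitute your own.]

```bash
# Install Airflow against the project's tested dependency set.
# 2.9.2 and 3.11 are example versions; match them to your environment.
pip install "apache-airflow==2.9.2" \
  --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.9.2/constraints-3.11.txt"
```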
Okay, so wrapping this up, what's the main takeaway for someone listening, maybe new to this? It seems like if you're dealing with complex automation tasks, especially ones that run regularly and need to be reliable, Airflow is pretty much the standard way to go. It lets you define everything in Python, making it maintainable and testable, and gives you that powerful UI to see what's going on, ditching those old command-line headaches.

That sums it up well. It brings software engineering best practices to your automation workflows. And maybe a final thought, a challenge for you to consider as you start building your own DAGs. We talked a lot about idempotency, how crucial it is that tasks can be rerun safely. So pause and really think about the real-world consequences when that principle is violated. What are the most common, maybe even disastrous, pipeline failures you can imagine that happen precisely because a step wasn't idempotent? Think about things like duplicate financial transactions, or accidentally sending marketing emails out multiple times. Then consider: how does the very act of defining your workflow, step by step, in structured Python code force you to confront and design against those kinds of devastating mistakes up front? That enforced discipline is where Airflow adds a huge layer of safety and security.

That's a great point. Designing for failure and reruns right from the start. Excellent food for thought. And once again, a big thank you to SafeServer for supporting this deep dive. SafeServer helps you host software like Airflow and manage your digital transformation. You can find more details at www.safeserver.de. Okay, that's all for this session. We'll catch you on the next deep dive.