Today's Deep-Dive: Agentset
Ep. 279

Episode description

The episode discusses Retrieval Augmented Generation (RAG) and introduces Agentset as a platform designed to simplify building reliable and traceable AI agents. It addresses the common problem of AI hallucination, where language models invent facts, which is a significant barrier when agents need to access specific or internal knowledge. Agentset aims to solve this by making RAG more accessible, even for beginners, by handling complex engineering tasks like data ingestion, chunking, and retrieval.

The platform supports over 22 file formats for ingestion and uses a built-in chunking strategy to prepare documents for retrieval. Key features for accuracy include hybrid search, which combines vector search with keyword search, and re-ranking to ensure the most relevant information is provided to the AI. Agentset also offers deep research capabilities for more in-depth answers and provides automatic citations, allowing users to verify the source of information. Metadata filtering allows for limiting AI responses to specific data subsets, ensuring compliance and relevance.

The platform emphasizes developer experience with SDKs and APIs, and importantly, is model-agnostic, allowing users to choose their preferred AI models, vector databases, and embedding models. For integration, it offers standard SDKs and a Model Context Protocol Server. Security is a priority, with end-to-end encryption and options for EU data residency and on-premise deployment for maximum control. Agentset offers both a cloud version with a free tier for easy experimentation and an open-source option for self-hosting, providing a choice between speed to market and complete control. The core takeaway is that Agentset simplifies the complex infrastructure of RAG, enabling the creation of trustworthy AI applications by bypassing engineering headaches and focusing on reliability and traceability.

Gain digital sovereignty now and save costs

Let’s have a look at your digital challenges together. What tools are you currently using? Are your processes optimal? How is the state of backups and security updates?

Digital sovereignty is easily achieved with Open Source software (which usually costs way less, too). Our division Safeserver offers hosting, operation, and maintenance for countless Free and Open Source tools.

Try it now for 1 Euro - 30 days free!

Download transcript (.srt)
0:00

Okay, let's unpack this. If you're working with AI agents, you've probably run

0:04

smack into the trust

0:05

barrier. We're talking about that fundamental problem with large language models,

0:09

the dreaded

0:10

hallucination where the AI just invents stuff. Yeah, invents facts. And it's more

0:16

than just

0:17

annoying, right? It's a huge challenge if your agent needs to know about your

0:20

specific,

0:21

maybe internal knowledge. Exactly. So today we're doing a deep dive into the tech

0:26

built to fix

0:27

this trust crisis: Retrieval Augmented Generation, RAG for short. Ah. But before

0:33

we really get

0:33

into the weeds of grounding these agents, we really want to thank SafeServer. Ah,

0:37

yes. They

0:38

focus on hosting exactly this kind of complex, cutting edge software. They're all

0:43

about supporting

0:43

your digital transformation journey, making sure you've got the right setup for

0:46

advanced RAG apps.

0:47

You can find out more about how they help with hosting over at www.safeserver.de.

0:52

So, our mission today, to give you a crucial shortcut, we're going to demystify

0:57

this platform

0:57

called AgentSet. It's designed so pretty much anyone can build these really

1:01

reliable, traceable

1:02

frontier RAG apps. We'll break down how it works so even if you're totally new to

1:08

RAG,

1:08

you'll get why it promises to skip all that painful, expensive trial and error you

1:12

often see.

1:12

And that's key, focusing on beginners too. Because RAG, well, fundamentally it's

1:16

about giving the AI

1:17

proof. You ground the agent in a specific knowledge base so it stops being a

1:21

general know-it-all and

1:22

becomes an expert on your stuff, your documents. We're looking at a system designed

1:26

to make that

1:27

really complex engineering job more plug-and-play. Right, reliable answers right

1:32

out of the box.

1:33

So let's start right there with that fundamental difference. The promise is

1:37

building reliable AI

1:38

agents fast, cutting down hallucinations and, you know, impressing people from the

1:43

get-go.

1:44

Yeah, and it's interesting to think about the pain point Agentset is trying to

1:47

solve here. If you,

1:48

the listener, tried the DIY route, maybe using tools like LangChain or LlamaIndex,

1:53

the source material suggests you hit a wall pretty fast, a steep learning curve,

1:57

complex setup,

1:58

loads of boilerplate code, and maybe worst of all, the retrieval quality. It's just

2:02

all over the place. Inconsistent.

2:05

Let's pause on that inconsistency. What does that complexity actually mean for an

2:09

engineer?

2:10

It's not just like calling one API, is it?

2:12

Oh, absolutely not. No. The trouble starts immediately with getting the documents

2:15

in and

2:16

chopping them up. Ingestion and chunking. When you build a RAG yourself, you have

2:20

to figure out,

2:20

okay, how do I break this huge document into pieces? Small enough for the LLM,

2:25

but big enough to keep the meaning. Right.

2:27

Do I use paragraphs, fixed number of words, some recursive method? You choose wrong,

2:33

and the whole retrieval thing can just fail. It's a huge decision that,

2:37

if you're doing it yourself, needs tons of tuning and testing.
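To make that chunking tradeoff concrete, here's a minimal sketch of the kind of fixed-size, overlapping word chunker a DIY builder might start with. It's plain Python for illustration, not Agentset's actual strategy; the overlap keeps sentences that straddle a boundary from losing their context.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping word-based chunks.

    Overlap means the tail of one chunk repeats at the head of the
    next, so meaning that spans a boundary isn't cut in half.
    """
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break
    return chunks
```

Even this toy version shows why tuning matters: too small and each chunk loses meaning, too large and irrelevant text dilutes the LLM's context window.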

2:40

And Agentset comes in and basically says, look, we've got a ready-to-use engine.

2:44

It handles that complexity, those architectural choices,

2:47

right away. And it starts with ingestion. Exactly. Your documents, your knowledge,

2:51

get automatically parsed. And from over 22 file formats.

2:54

That's a lot. It is. And it's important because it's not

2:57

just the easy ones like PDF and Word docs. It includes tricky ones like

3:01

EML emails, CSVs, even image files like BMP and complex XMLs. That breadth alone

3:08

solves a huge integration headache for developers dealing with messy corporate

3:12

data silos. Okay, so the documents are in. How does

3:15

the system prep them for retrieval? The platform uses its own built-in chunking

3:20

strategy. It automatically breaks everything down

3:22

into these manageable searchable bits trying to maximize the context.

3:26

Then they get embedded, turned into numbers basically, vectors and stored in

3:30

a vector database. This makes finding them later really fast

3:34

using math, that whole ingestion and prep stage. That's

3:37

where a lot of DIY RAG projects stumble because of bad choices early on.
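For intuition on the "finding them later really fast using math" part, here's a toy nearest-neighbor lookup over embedding vectors using cosine similarity. This is a hypothetical stand-in for what a real vector database does at much larger scale with approximate-search indexes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors:
    1.0 means pointing the same way, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def nearest_chunks(query_vec, chunk_vecs, k=3):
    """Indices of the k stored chunks closest to the query vector."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(query_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The point of the embedding step is exactly this: once text is numbers, "which chunk is most relevant" becomes cheap arithmetic instead of reading documents.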

3:41

AgentSet aims to make those good choices for you. Okay, let's move to section two

3:45

then, the core function. How does AgentSet actually guarantee

3:48

accuracy and fight off those hallucinations?

3:50

This is really key if you need dependable enterprise-grade answers.

3:54

Right. They aim for reliable accuracy by using what the sources call

3:59

best-in-class RAG techniques right from the start.

4:02

They've essentially optimized the retrieval bit before you even think

4:05

about customizing anything. Two features really jumped out at me

4:08

from the material, hybrid search and re-ranking.

4:12

Let's unpack why these are like safety nets against bad answers.

4:16

Okay, hybrid search is kind of the proactive step. See,

4:19

basic vector search is good at finding stuff that's semantically similar chunks

4:23

talking about the same topic. Getting similar, yeah.

4:25

But similar meaning doesn't always mean it's the right context for the specific

4:29

question. Hybrid search casts a wider net. It

4:32

combines that vector search with good old keyword and full text search.

4:36

This finds more potentially relevant bits of information,

4:40

making sure something isn't missed just because the vector math was slightly off.
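A rough sketch of that blending idea, assuming you already have a 0-to-1 vector-similarity score per chunk from the semantic side. The weighting scheme and the word-overlap keyword scorer here are illustrative assumptions, not Agentset's actual hybrid ranking.

```python
def keyword_score(query, chunk):
    """Fraction of query words that literally appear in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def hybrid_search(query, chunks, vector_scores, alpha=0.5):
    """Blend semantic (vector) scores with exact keyword overlap.

    alpha weights the two signals: a chunk that the vector math
    slightly missed can still rank high if the literal keywords match.
    """
    scored = [(alpha * v + (1 - alpha) * keyword_score(query, c), c)
              for c, v in zip(chunks, vector_scores)]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in scored]
```

This is the "wider net": exact terms like error codes or product names, which embeddings sometimes blur, get a second chance to surface.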

4:44

And then re-ranking. That's the quality control, right?

4:47

Hybrid search finds maybe a thousand relevant looking chunks.

4:50

How does the system pick the best three to actually show the LLM?

4:54

Precisely. Re-ranking is like the final editor.

4:58

It takes all those candidates from the hybrid search and sorts them based on true

5:02

relevance and

5:03

quality. It ensures the absolute best, most contextually spot-on material gets

5:08

passed to

5:09

the large language model. That's how you get the highest accuracy by cleaning up

5:13

the retrieved

5:13

information before the LLM even sees it. That's a critical distinction. The AI isn't

5:18

just grabbing nearby stuff. It's prioritizing the quality of the evidence.
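The two-stage idea can be sketched like this, with a toy word-overlap scorer standing in for the expensive cross-encoder model a real re-ranker would use. Illustrative only; the actual scoring function is the part that matters in production.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Second-stage filter: re-score first-stage candidates with a
    costlier relevance function and keep only the best few, so the
    LLM sees quality evidence rather than everything nearby."""
    ranked = sorted(candidates, key=lambda c: score_fn(query, c),
                    reverse=True)
    return ranked[:top_k]

def overlap_score(query, chunk):
    """Toy stand-in for a cross-encoder: shared-word count."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))
```

The design point: the first stage is cheap and broad, the second stage is expensive but only runs on a shortlist, which is how you afford quality control at scale.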

5:22

Exactly. And they add another layer, too. Built-in support for deep research.

5:27

You can choose quick answer or a deeper dive. Deep research takes longer, naturally,

5:32

but it looks at way more sources and gives back really in-depth answers with more

5:36

context.

5:38

Great for complex questions or high stakes decisions.

5:40

That may be the most vital feature for building trust with the person asking the

5:43

question.

5:44

Citations.

5:45

Absolutely non-negotiable, usually. The system automatically cites the exact

5:50

sources for its

5:51

answers. This lets you, the user, click through and see the original document, the

5:55

page, even

5:56

the paragraph where the AI got its info. In a business setting, that traceability

6:00

is essential

6:01

for compliance, for validation.
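A trivial sketch of how retrieved source metadata can be surfaced as citations alongside a generated answer. The dict shape with doc and page keys is assumed for illustration and is not Agentset's actual response format.

```python
def answer_with_citations(answer, sources):
    """Append numbered source references so every claim is checkable.

    Each source is a dict with 'doc' and 'page' keys (an assumed,
    illustrative shape), carried along from the retrieval step.
    """
    lines = [answer, "", "Sources:"]
    for i, src in enumerate(sources, start=1):
        lines.append(f"[{i}] {src['doc']}, p. {src['page']}")
    return "\n".join(lines)
```

The key is that citations fall out of retrieval for free: the system already knows which chunk the answer came from, so pointing back to it costs nothing.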

6:03

And building on that control idea, there's metadata filtering.

6:06

Give us a quick, practical example of why that matters.

6:09

Sure. So this lets you limit the AI's answers to only a specific slice of your data

6:14

based on

6:15

tags you added when you uploaded the documents. Imagine a big company. You might

6:19

need an agent

6:20

that only answers using documents tagged Legal 2023 Q4 to make sure it's compliant,

6:26

maybe excluding

6:27

marketing stuff entirely. It keeps the agent operating within very specific

6:30

boundaries. Again,

6:32

making sure the answers are traceable and vetted from your chosen sources.
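The filtering step itself is conceptually simple, as this sketch shows; the tag keys like dept and quarter are made up for the example, and real systems would push this filter into the vector database query rather than Python.

```python
def filter_by_metadata(chunks, required):
    """Keep only chunks whose metadata matches every required tag,
    so retrieval never even sees out-of-scope documents."""
    return [c for c in chunks
            if all(c["metadata"].get(k) == v for k, v in required.items())]
```

Filtering before retrieval, rather than after generation, is what makes the compliance guarantee hard: the agent cannot leak what it never retrieved.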

6:35

Okay. That level of reliability, that optimized architecture, that usually takes a

6:39

dedicated

6:39

expensive engineering team. But Agentset talks about production in hours. So let's

6:44

let's

6:44

shift to developer experience and flexibility section three.

6:47

Right. If we pivot from accuracy to just making it easy to implement, what's really

6:53

important for

6:53

scaling is how accessible they've made deployment. They offer ready to go SDKs for

6:58

JavaScript and

6:59

Python, clean APIs, typed SDKs too. This means developers can upload data and plug

7:05

it into

7:05

existing systems fast without wrestling with, you know, messy or undocumented code.

7:11

Okay. But let's be a bit skeptical. If Agentset is prepackaging all these fancy

7:15

RAG techniques, what's the catch? Am I locked into their way of doing things?

7:20

Their cloud,

7:21

their choice of AI model. Excellent question. And the source material really

7:25

stresses this.

7:25

Agentset is extremely model-agnostic. You are specifically not locked into one

7:30

vendor's AI.

7:31

That's a huge strategic plus. You keep control, you pick your own vector database,

7:36

your own embedding model, and critically your own large language model.

7:39

And that's not just a tech detail, is it? That's about cost, it's about strategy.

7:43

Absolutely. It lets you fine tune performance and manage your budget.

7:46

Maybe you use a powerful pricey model like GPT-4 for that deep research feature.

7:50

Or maybe a cheaper, faster model from Anthropic or Cohere for everyday customer

7:55

questions.

7:56

Agentset works with all the big players: OpenAI, Anthropic, Google AI,

8:00

Cohere. You keep control of your underlying tech stack, avoid getting locked in.

8:04

That makes a lot of sense. Now once I've got Agentset running,

8:08

how do I actually connect that knowledge base to other applications?

8:11

So for bringing that knowledge out, they offer a couple of ways.

8:14

One is just easy integration using their standard AI SDK. Pretty straightforward.

8:19

But for connecting that powerful, grounded knowledge base to external apps or maybe

8:24

microservices, they have something called the Model Context Protocol Server or MCP

8:28

server.

8:28

Okay. What does the MCP server do exactly?

8:31

Think of it like a secure gateway, a dedicated one.

8:34

It lets your other applications query the Agentset knowledge base,

8:38

that whole sophisticated RAG engine, without needing to rebuild the retrieval and

8:42

LLM logic

8:42

themselves. It essentially serves up the contextual proof, ready for any external

8:47

app to use to generate a reliable answer. And to help developers get going faster,

8:51

they even threw in a chat playground. Yes. With message editing and citations built

8:56

right in, it's brilliant for just quickly trying things out, prototyping.

9:00

Developers can immediately see if the RAG is working well, check the accuracy of

9:04

answers,

9:04

without having to push anything live, it cuts down testing time dramatically.

9:08

Okay, let's move to section four. This is huge. Security and control. Especially

9:14

when we're talking

9:14

about grounding AI in sensitive proprietary company data. What protections are

9:19

baked in?

9:20

How do they build that trust? Security seems to be multilayered,

9:23

often aiming higher than standard practice. Your data is secured with end-to-end

9:27

encryption using

9:28

bank-grade AES-256. And all the data moving around is secured using TLS. Standard,

9:34

but essential.

9:35

But the real decider for many companies is control. Who actually owns the data?

9:38

Who controls the infrastructure? Where does it live?

9:41

Absolutely. And the key differentiator here seems to be flexibility and hosting

9:45

control.

9:46

The platform lets you host your data on top of your own vector database,

9:50

your own storage bucket, and your own chosen AI models. For sensitive systems,

9:54

keeping ownership and control over that data stack is paramount.

9:57

And what about organizations with really strict rules,

10:00

especially around where data can physically be? Data residency requirements?

10:04

Yep. For compliance needs, Agentset offers specific options for EU data residency.

10:10

That means ensuring data is processed only on servers within the EU,

10:13

which ticks a big box for GDPR and similar regulations.

10:17

And for the absolute maximum control, maybe for finance or healthcare,

10:21

they support on-premise deployment.

10:23

So on-prem means putting the whole Agentset system behind your own company firewall,

10:28

in your own cloud.

10:28

That's exactly it. It lets you deploy Agentset inside your existing cloud

10:33

environment,

10:33

AWS, Azure, GCP, whatever, but completely under your security rules behind your

10:39

firewalls.

10:40

That's the ultimate level of control for really critical applications.

10:43

So it sounds like if you're looking to adopt this,

10:45

there are two pretty clear paths depending on your needs.

10:48

Yeah, basically. First, there's Agentset Cloud. That's the quickest way to get

10:52

started,

10:52

kick the tires. Crucially, they have a generous free tier, 1,000 pages of documents

10:57

you can ingest,

10:58

10,000 retrievals. Makes it really easy to experiment without commitment.

11:02

And then, for those who need that total control, you can self-host.

11:06

Right. Because Agentset is open source under the MIT license, which is very permissive. You can

11:11

can

11:11

just download the whole thing and run it yourself. That gives you absolute maximum

11:15

technical control

11:16

over the entire RAG setup, the knowledge, the infrastructure. It's really a choice

11:21

between

11:22

getting to market fast versus having complete uncompromised control.

11:25

So, to kind of wrap up the key takeaway for you, the listener, Agentset takes the

11:31

really painful

11:32

complex bits of RAG, getting diverse data in, figuring out chunking, doing advanced

11:36

retrieval

11:36

like hybrid search, adding citations, and packages it all into an accessible engine

11:41

that's accurate

11:41

out of the box. It essentially lets you skip building that whole complex

11:45

infrastructure from

11:46

scratch. You bypass the RAG engineering headaches and can focus straight away on

11:50

building a reliable

11:51

AI app you can actually trust. And here's where it gets really interesting for me,

11:54

the final thought. Given how easy they're making deployment, the model flexibility,

11:59

these powerful accuracy features like hybrid search and deep research, how quickly

12:03

will these

12:04

traceable knowledge-backed AI agents just become the standard? Will they totally

12:08

replace the simpler,

12:10

less reliable chatbots that, you know, can't actually prove where they got their

12:13

answers from

12:14

using your specific documents? It feels like the demand for trust is making traceability

12:19

basically mandatory. A huge thank you once again to SafeServer for supporting this

12:23

deep dive and

12:24

enabling digital transformation. If you want to explore hosting solutions for this

12:28

kind of

12:28

advanced software, please do visit www.safeserver.de. And thank you for joining us.

12:34

Go forth and ground your agents.
