July 26, 2025

Recap: Making MCP actually work with YC founder Wojciech Błaszak of Golf.dev


Earlier this week, we hosted a workshop with Wojciech Błaszak, CEO of Golf.dev. We dug into how companies are deploying MCP today, and what is still broken under the hood.

If you're building LLM-native systems and the phrase "M by N integration problem" gives you whiplash, this one is for you.

Here are the big takeaways.

What even is MCP?

MCP (Model Context Protocol) is the standard for plugging tools, data, and prompts into your agent system. Instead of writing M×N custom bridges between every tool and every agent, MCP simplifies the graph to M+N.

Each setup includes:

  • Host: the main AI application
  • Client: the bridge between host and server
  • Server: the tool and data layer agents interact with

MCP exposes tools, resources, prompts, and sampling. Communication happens over structured JSON-RPC, using transports such as local stdio streams for same-machine setups, and Server-Sent Events (SSE) or Streamable HTTP for remote servers.
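As a concrete illustration, a tool invocation on the wire is just a JSON-RPC 2.0 request. The `tools/call` method and its `name`/`arguments` params follow the MCP spec's shape; the tool name and arguments below are hypothetical:

```python
import json

# A hypothetical MCP tool invocation, framed as a JSON-RPC 2.0 request.
# "tools/call" is MCP's method for invoking a tool; the tool name
# ("lookup_order") and its arguments are made up for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_order",
        "arguments": {"order_id": "ORD-1234"},
    },
}

# Over the stdio transport, each message travels as one line of JSON.
wire_message = json.dumps(request)
print(wire_message)
```

The same JSON-RPC envelope is reused across transports; only the delivery mechanism (stdio line, SSE event, HTTP body) changes.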

What breaks in the wild

Common pitfalls that limit agent performance:

  • Mapping OpenAPI specs directly into tools (100+ endpoints is too much)
  • Bundling complex parameters instead of breaking them into simpler calls
  • Exposing DELETE endpoints without proper safeguards

Key recommendation from Golf.dev: keep MCP servers under 40 tools; beyond that, tool selection becomes unreliable and agent performance degrades.
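One way to put a safeguard in front of a destructive endpoint is to require explicit confirmation before the call goes through. A minimal sketch, assuming a hypothetical `delete_record` tool backed by an in-memory store:

```python
# Hypothetical in-memory store standing in for a real backend.
records = {"rec-1": "invoice", "rec-2": "draft"}

def delete_record(record_id: str, confirm: bool = False) -> str:
    """A destructive tool that refuses to act without explicit confirmation.

    Instead of exposing a raw DELETE endpoint, the tool makes the agent
    echo back its intent, which gives a human (or a policy layer) a
    chance to intervene before anything is destroyed.
    """
    if record_id not in records:
        return f"error: no record {record_id}"
    if not confirm:
        return f"confirmation required: pass confirm=True to delete {record_id}"
    del records[record_id]
    return f"deleted {record_id}"
```

The first call without `confirm` returns a prompt instead of deleting, so a careless agent cannot destroy data in a single step.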

Security and access best practices

As MCP adoption grows, so does the need for robust security design. The latest spec updates and field experience highlight several critical practices:

  • Separate your resource and verification servers: This architectural shift improves security by isolating access control logic from the data itself

  • Implement OAuth for private data access: If your MCP server touches sensitive information, OAuth is a must. Support for SSO makes integration cleaner and more secure

  • Enforce role-based access control (RBAC): Match your MCP server's permissions to the user roles defined in your main application, so agents inherit the right access by default

  • Restrict tool visibility on a per-user basis: Tool-level access control lets you expose only a subset of tools to each user (for example, 7 out of 20 tools), improving both security and usability
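The last two points combine naturally into one filter: map application roles to allowed tools, and only advertise that subset in each user's session. The role names and tools below are hypothetical:

```python
# Hypothetical role-to-tool mapping, mirroring the RBAC roles defined
# in the main application.
ROLE_TOOLS = {
    "support": {"lookup_order", "refund_order", "search_docs"},
    "viewer": {"lookup_order", "search_docs"},
}

# Everything the server implements, including tools most roles never see.
ALL_TOOLS = ["lookup_order", "refund_order", "delete_account", "search_docs"]

def visible_tools(role: str) -> list[str]:
    """Return only the tools a user's role is allowed to see.

    Tools not listed for the role (like delete_account here) are never
    advertised, so the agent cannot even attempt to call them.
    """
    allowed = ROLE_TOOLS.get(role, set())
    return [t for t in ALL_TOOLS if t in allowed]
```

Filtering at listing time, rather than rejecting calls after the fact, also improves usability: the agent never wastes a turn reasoning about tools it cannot use.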

Why MCP gets adopted (or not)

Right now, dev tool companies are leading adoption. IDE integration, backed by tutorials and docs, is the most common use case today.

Non-dev MCP servers often see zero usage. The reason? Workflow friction. If the user has to leave their existing interface, adoption drops.

Embedding MCP into existing products works. Asking people to open a separate app like Claude Desktop doesn't.

Remote MCP servers are the future

The shift is underway. Notion released its remote MCP server six days ago.

The emerging standard:

  • Companies host their own remote MCP servers
  • Agents plug in via simple URLs
  • Authentication, permissioning, and testing are baked in
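In practice, "plugging in via a simple URL" means POSTing JSON-RPC to the server's HTTP endpoint. The sketch below only constructs the opening `initialize` request rather than sending it; the URL is hypothetical, and the headers follow the Streamable HTTP transport's convention of accepting both JSON and SSE responses:

```python
import json

# Hypothetical remote MCP endpoint; real servers publish their own URLs.
MCP_URL = "https://mcp.example.com/mcp"

# Streamable HTTP: one endpoint receives JSON-RPC POSTs and may answer
# with plain JSON or an SSE stream, so the client accepts both.
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}

# The first message in an MCP session negotiates protocol version
# and capabilities.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1"},
    },
}

body = json.dumps(initialize)
```

From here, a real client would POST `body` to `MCP_URL` with `headers`, then continue the session with `tools/list` and `tools/call` requests over the same endpoint.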

Wojciech expects remote-first adoption to be widespread within 6 to 12 months.

Customer support use cases

While MCP is gaining traction in dev tools, it is also quietly transforming how AI agents operate in customer support.

Here are a few use cases we covered:

  • End-to-end resolution: Agents can access multiple internal systems via MCP to fully resolve support cases without human handoff

  • Complex issue investigation: Agents can analyze internal logs, customer history, and usage data across tools to troubleshoot technical issues

  • Knowledge sharing and retrieval: Internal wikis, docs, and ticket archives can be exposed as MCP resources, making agent responses more informed and consistent

  • Ticket creation and ops automation: 14.ai integrates with Linear via MCP to generate new tickets directly from AI-led conversations, reducing manual workflows and improving context handoff

Why evals matter

If you’re deploying AI agents, evals are not optional.

Evals are repeatable tests that assess how well your agent performs on real tasks. Unlike unit tests, which validate narrow behavior such as whether an API call returns the expected response, evals measure whether the agent’s behavior actually produces useful outcomes.

They help answer questions like:

  • Did the agent select the right tool?
  • Was the response accurate, clear, and actionable?
  • Did the agent follow the correct escalation path?
  • Did the task succeed from the user’s perspective?

Good evals are scenario-based. They simulate high-leverage interactions, like a failed billing sync or a stuck onboarding flow, and evaluate whether the agent handled them correctly using the tools and data it had access to.

As systems grow more complex (multiple tools, retrieval sources, and model variants), evals become the connective tissue for debugging, tracking regressions, and aligning behavior with product goals.
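A scenario-based eval can start as small as a table of inputs and expected tool choices, run against the agent and scored. Everything below is hypothetical, including the stub agent; a real harness would call your actual agent:

```python
# Hypothetical scenarios: each pairs a user message with the tool the
# agent should select to handle it.
SCENARIOS = [
    {"input": "My billing sync failed again", "expected_tool": "check_billing_sync"},
    {"input": "I'm stuck on the onboarding step", "expected_tool": "get_onboarding_state"},
]

def stub_agent(message: str) -> str:
    """Stand-in for a real agent: picks a tool by keyword matching."""
    if "billing" in message.lower():
        return "check_billing_sync"
    if "onboarding" in message.lower():
        return "get_onboarding_state"
    return "search_docs"

def run_evals(agent) -> float:
    """Return the fraction of scenarios where the agent chose the right tool."""
    passed = sum(
        1 for s in SCENARIOS if agent(s["input"]) == s["expected_tool"]
    )
    return passed / len(SCENARIOS)

print(run_evals(stub_agent))  # prints 1.0 for this stub
```

Tracking this score over time is what turns evals into the regression-detection layer described above: a model or prompt change that drops the score is caught before it ships.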

TL;DR

If you are building with MCP, keep these in mind:

  • Stay under 40 tools
  • Carefully select which API endpoints to expose, and pay particular attention to ones that can mutate or delete state
  • Design for agents, not humans
  • Secure everything
  • Build remote-compatible infrastructure now
  • Evals are not optional

Big thanks to Wojciech for joining and to everyone who showed up for the conversation.

More deep dives coming soon.