Sunday, April 20, 2025

Finally, a break

Hi guys,

The news was out last Friday. I am going to leave Parcel Perform. 

For a sabbatical leave. Between May and October.

Sorry for the gasp. You come here for drama.

This has been in the works since last year. I originally planned to leave soon after the 2024 BFCM, once the horizontal scalability of the system became a solved problem. I would like to say permanently solved, but I have learned that nothing ever is - the scalability, that is, not me leaving for good. But then there were such and such issues with our data source (if you know, you know), and AI took the industry by storm. So I stayed. Eventually though, I knew I needed this break.

I have been on this job for almost 10 years. The first line of code was written in October 2015. I thought I would be done and moving on in 5 years! I have been around longer than most furniture in the office and gone through 4 office changes. A decade is indeed a long time. It is a wet blanket that dampens any excitement and buzz that comes out of the work. Things get repetitive. Take system incidents: I have lost count of how many ways things can combine to blow up. I am grateful every morning I wake up to no new alert.

When I was 23, I was fired from my job, and I was unemployed for 6 months. More like unemployable. It could have been longer had my savings not run dry. I wrote, read, cooked, rode, swam, organized events, and lived a different life I didn't know I needed. It was the time I needed to recover from depression. It was the best time of my life. I want to experience that one more time.

Upon this news, I received some questions; the most common ones are below.

Why are you leaving now? Is something bad happening?

I am still the CTO of Parcel Perform, just on sabbatical leave. And on the contrary, I think this is a good window for me to take a break. The business is in its best shape since the painful layoff in 2022. We are positive about the future, and we are actively expanding the team for the first time in 3 years. The Tech Hub, for which I am directly responsible, has demonstrated that in the face of unprecedented incidents, we are resilient, innovative, and get things done. With the multi-cluster architecture and other innovations, we won't face an existential scalability problem for a long time.

In the last 6 months, we have invested in incorporating GenAI into our product. I believe we have the right data, tech stack, and an experiment-focused process, though only time can tell. To be frank, all the fast-paced experiments we are doing, known internally as POCs, remind me of all the things I loved about working here in the early days. Ideas are discussed in a small circle, sketched out on a whiteboard, implemented in less than a day, and repeated. It has been so fun recently that I got cold feet. Perhaps I shouldn't take this break yet. But I am not getting any younger, I am getting married, and soon I will start a family. I won't have time for myself for a long while. It has to be now.

What will you do during the break?

Oh wow, I'm gonna play so many computer games my brain rots. I have been an avid fan of Age of Empires II since the time there was only one kid in my neighborhood with a computer good enough to play the game. I am an average player, slow even, so perhaps we are looking at more losses than wins. But hey, it builds character.

I will host board game nights here and there. It's another long-standing hobby of mine, and a perfect social glue for my group of friends. While I am at it, I probably want to up my mocktail game too. My friends are largely in my age bracket, so for the same reasons above, my ultimate goal is to have more quality time with them.

As far as dopamine goes, that's it. I am not planning for retirement after all. Can't afford that yet.

To be frank, the pretext of this break is that I want to work on my sleep, which has been less than ideal for a long time. I couldn't figure out any single thing that would improve my sleep, so it is gonna be a total revamp. Distance from work. White bed sheets. Sleep hygiene. Gotta catch 'em all.

I will probably still be awake more than 12 hours a day though. I will be reading as much as I can: fiction, non-fiction, and whatnot. Real life is crazy these days; the line between the two is getting blurry. There are some long Murakami novels I want to get through. I find that reading his works in one go, or at least with minimal pauses, offers the ideal immersive experience.

What I read, I write. I hope I can find an interesting topic to write about every month. If you are keeping track, the last few days have been quite productive ;) I am starting my first subtrack, AI Crossroads, because that's how I feel these days: an important moment in my life, in our lives, that I cannot afford to miss. I am excited and confused. I am sure somebody out there is feeling the same.

And I will pick up Robotics as a new hobby. As GenAI gets "solved", its reach will not stop at the virtual world. Robotics seems to be the next logical frontier where a new generation of autonomous devices crops up and occupies our world. 

Writing about these things, I'm already pumped!

Who will replace your role?

The good thing about cooking up this plan last year is that I have had plenty of time to arrange for my departure. The level of disruption should be minimal. People won't notice when I am gone or when I am back. Or so I hope.

There isn't a simple one-to-one replacement. Parcel Perform is not getting a new CTO, and there will still be a job for me when I am back. Or so I hope.

As a CTO, my work comes in 3 buckets: feature delivery, technical excellence, and AI research.

Feature delivery is where we have the most robust structure. Over the years, we have managed to find and grow a Head of Engineering and two Engineering Managers. The tribe-squad structure is stable. We are getting exponentially better at cross-squad collaboration as well as reorganization. There is a handful of external support, ranging from QA and infrastructure to data and insight, to ensure the troops have the oomph they need to crush deliveries.

Technical excellence means making sure Parcel Perform's tech stack stays relevant for the next 5 years. This is an increasingly ambitious goal. Our tech stack is no longer a single web server. The team is growing. The world is moving faster. But we have 4 extremely bright Staff Engineers. They have each spent years at the organization, are widely regarded for the depth of their technical knowledge, and are definitely better than me on my best day in their fields of expertise. We have spent the last couple of months aligning their goals with the needs of Parcel Perform. The alignment is at its strongest point yet, thanks to the goal-oriented performance review system we adopted.

Lastly, AI research is about preparing the organization for the AI future, across technologies, processes, and strategic values. While I will continue the research in my own time, there is now a dedicated AI team that has been made the center of Parcel Perform's AI movement. Despite its humble beginnings, the team will double in size in the coming months and won't let us "go gentle into that good night" that is our post-apocalyptic life with the AI overlords.

I think we are in good hands.

What will you do when you come back?

Honest answer, I don't know.

Also honest answer: I don't think it is gonna be the same as what I am doing today. Sure, some aspects are gonna be the same. 5 months isn't that long. Neither is it short. The organization will continue to grow and evolve to meet the demands of the market, and the gap I leave will be filled. When I am back, the organization will undoubtedly be different from what it is today. I will have to relearn how to operate effectively. I will need to identify the overlap between my interests, my abilities, and the needs of the new Parcel Perform.

Final honest answer: I am anxious about that future, and that is the best part.

AI is an insult to life itself. Or is it?

The Quote That's Suddenly Everywhere

"AI is an insult to life itself."

I only stumbled across this quote a couple of months ago, attributed to Hayao Miyazaki - the man behind the famed Studio Ghibli. Since then, I've noticed it everywhere. As we speak, there's a massive trend of people using ChatGPT and other image generators to create pictures in the distinctive Ghibli aesthetic. My social feeds are flooded with AI-crafted images of ordinary scenes transformed with that unmistakable Ghibli magic - the soft lighting, the whimsical elements, the characteristic designs, the stuff of my childhood. You know what, I am not good with words like these. Here is one from yours truly.

The trend has brought Mr Miyazaki's quote back into the spotlight, wielded as a battle cry against this specific type of AI creation. There's just one small problem - this quote is being horribly misused.

I got curious about the context (because apparently I have nothing better to do than fact-check AI quotes when I should be testing that MCP server), so I dug deeper. Turns out, Mr Miyazaki wasn't condemning all artificial intelligence. He was reacting to a specific demonstration in 2016, where researchers showed him an AI program that had created a disturbing, headless humanoid figure crawling across the ground like something straight out of a horror movie. For God's sake, the animation reminded him of a disabled friend who couldn't move freely. Yeah, I quite agree, that was the stuff of visceral nightmares.

Src: https://www.youtube.com/watch?v=ngZ0K3lWKRc&t=3s

It's also worth noting that the AI Mr Miyazaki was shown in 2016 was primitive compared to today's models like ChatGPT, Claude, or Midjourney. We have no idea how he might react to the current generation of AI systems that can create stunningly convincing Ghibli-style imagery. His reaction to that zombie-like figure doesn't necessarily tell us what he'd think about today's much more advanced and coherent AI creations. Yet the quote lives on, stripped of this crucial context, repurposed as a blanket condemnation of all generative AI.

The Eerie Valley of AI Art

Here's where it gets complicated for me. When I look at these AI-generated Ghibli scenes, they instantly evoke powerful emotions - nostalgia, wonder, warmth - all the feelings I've associated with films like "Spirited Away" or "Princess Mononoke" over years of watching them (for what it's worth, not a big fan of Totoro, it's ok). The visual language of Ghibli taps directly into something deep and meaningful in my experience.

That is what art does. That is what magic does. But this is not it, is it? These mass-produced imitations feel like they're borrowing those emotions without earning them. I feel an unsettling hollowness in the "art" - like hearing your mother's voice coming from a stranger. The signal is correct, but the source feels wrong.

I'm confronted with a puzzling contradiction: if a human artist were to draw in the Ghibli style (and many talented illustrators do), I wouldn't feel nearly the same unease. Fan art is celebrated, artistic influence is natural, and learning by imitation is as old as art itself. So why does the AI version feel different?

So while the quote is misused, I wonder if Mr Miyazaki's statement might contain an insight that applies here too. These AI creations, in their skillful but soulless imitation, do feel like a kind of insult, not to life itself perhaps, but to the deep human relationship between artist, art, and audience that developed organically over decades.

The Other Side

As a human, I consume AI features. Yet as an engineer, I build AI features. And there is another side to this.

You have probably heard it. Every time someone on the Internet complains that the US's gun culture is a nut case, a voice from some dark and forgotten 4chan corner screams back: "A gun is just a tool. A gun does not kill people. People do. And if you take the gun away, they will find something else anyway."

But this argument increasingly fails to capture the reality of AI image generators. These systems aren't neutral tools - they've been trained on massive datasets of human art, often without explicit permission from the artists. When I prompt an AI to create "art in Ghibli style," I'm not merely using a neutral tool - I'm activating a complex system that has analyzed and learned from thousands of frames created by Studio Ghibli artists.

This is fundamentally different from a human artist studying and being influenced by Mr Miyazaki's work. The human artist brings their own lived experience, makes conscious choices about what to incorporate, and adds their unique perspective. The AI system statistically aggregates patterns at a scale no human could match, without discernment, attribution, or compensation.

I've built enough software systems to know that complexity breeds emergence. When algorithms make thousands or millions of decisions across vast datasets, the traditional model of direct human control becomes more of a fiction than a reality. You can't just look at the code and know exactly what it will do in every situation. Trust me, I've tried to "study" deep learning.

Perhaps most significantly, as these systems advance, the distance between the creator's intentions and the system's outputs grows. The developers at OpenAI didn't specifically write code that says "here's exactly how to draw a flattering image of a dude taking notes on a motorcycle" - they created a system that learned from millions of images, and now it can generate Ghibli-style art that no human specifically programmed it to make. These AI systems develop abilities their creators didn't directly put there and often can't fully predict. This expanding gap between intention and outcome makes the "tools are neutral" argument increasingly unsatisfying.

This isn't to say humans have lost control entirely. Through system design, regulation, and deployment choices, we retain significant influence. But the "tools are neutral" framing no longer adequately captures the complex, bidirectional relationship between humans and increasingly sophisticated AI.

Why We Can't Resist Oversimplification

So far, there are two camps. "AI is an insult" reflects people whose work and lives are negatively impacted. "Tools are neutral" defends the AI creators. I tried, but I am sure I have done a less-than-stellar job capturing the thought processes of both camps. Still, poorly captured as it is, the debate feels fairly complex. This complexity is exactly why we humans lean toward simplified narratives like "AI is an insult to life itself" or "It's just a tool like any other." The reality is messy, contradictory, and doesn't fit neatly into either camp.

Humans are notoriously lazy thinkers. I know I am. Give me a simple explanation over a complex one any day of the week. My brain has enough to worry about with keeping our production systems alive. I've reached the point where I celebrate every morning that the #system-alert-critical channel has no new messages.

This pattern repeats throughout history. Complex truths get routinely reduced to easily digestible (and often wrong) summaries. Darwin's nuanced theory of evolution became "survival of the fittest." Einstein's revolutionary relativity equations became "everything is relative." Nietzsche's exploration of morality became "God is dead." In each case, profound ideas were flattened into bumper sticker slogans that lost the original meaning. They make good YouTube thumbnails though.

This happens because complexity requires effort. Our brains, evolved for quick decision-making in simpler environments (like not getting eaten by tigers), naturally gravitate toward cognitive shortcuts. A single authoritative quote from someone like Mr Miyazaki provides an easy way to validate existing beliefs without engaging with the messier reality.

There's also power in simple narratives. "AI threatens human creativity" creates a clear villain and a straightforward moral framework. It's far more emotionally satisfying than grappling with the ambiguous benefits and risks of a transformative technology. I get it - it's much easier to be either terrified of AI or blindly optimistic about it than to sit with the uncertainty.

I am afraid that in the coming weeks and months, we cannot afford such simplification.

The choice we have to make

Young technologists today (myself included) find ourselves in an extraordinary position. We're both consumers of AI tools created by others and creators of systems that will be used by countless others. We stand at the edge of perhaps the most transformative wave of innovation in human history, with the collective power to influence how this technology shapes our future and the future of our children. FWIW, I don't have a child yet, but I like to think I will.

The questions raised by AI-generated Ghibli art - about originality, attribution, the value of human craft, the economics of creation - aren't going away. They'll only become more urgent as these systems improve and proliferate.

The longer I work in tech, the more I realize that the most important innovations aren't purely technical - they're sociotechnical. Building AI systems that benefit humanity requires more than clever algorithms; it requires thoughtful consideration of how these systems integrate with human values and creative traditions.

For those of us in this pivotal position, neither absolute rejection nor blind embrace provides adequate guidance. We will need to navigate through this, hopefully with better clarity than Christopher Columbus when he "lost" his way to discovering America. My CEO made me read AI-2027 - there is a scenario where humans fail to align AI superintelligence and get wiped out. Brave new world.

1. Embrace Intentional Design and Shared Responsibility

We need to be deliberate about what values and constraints we build into creative AI systems, considering not just what they can do, but what they should do. This might mean designing systems that explicitly credit their influences, or that direct compensation to original creators whose styles have been learned.

When my team started writing our first agent, we focused entirely on what was technically possible. Is this an agent or a workflow? Is this a tool call or a node in the graph? Long context or knowledge base? I know, technical gibberish. The point is, we will soon move past that learning curve, and what comes next is thinking through the implications.

2. Prioritize Augmentation Over Replacement

The most valuable AI applications enhance human creativity rather than simply mimicking or replacing it. We should seek opportunities to create tools that make artists more capable, not less necessary.

When I see the flood of AI-generated Ghibli art, I wonder if we're asking the right questions. The most exciting creative AI tools don't just imitate existing styles - they help artists discover new possibilities they wouldn't have found otherwise. The difference between a tool that helps you create and one that creates instead of you may seem subtle, but it's profound. 

I have been lucky enough to be part of meetings where the goal of AI agents is to free the human colleagues from boring, repetitive tasks. I sure hope that trajectory continues. Technology should serve human values, not the other way around.

3. Ensure Diverse Perspectives and Continuous Assessment

The perspectives that inform both the creation and governance of AI systems should reflect the diversity of populations affected by them. This is especially true for creative AI, where cultural context and artistic traditions vary enormously across different communities.

It's so easy to build for people in my immediate circle and call it a day. As an Asian, I see how AI systems trained predominantly on Western datasets create a distorted view of creativity and culture. Without clarification, a genAI model would assume I am a white male living in the US. Bias, prejudice, stereotype. We have seen this.

Finding My Way in the AI Landscape

The reality of our relationship with AI is beyond simple characterization. It is neither an existential threat to human creativity nor a neutral tool entirely under our control. It represents something new - a technology with growing capabilities that both reflects and reshapes our creative traditions.

Those Ghibli-style images generated by AI leave me with mixed feelings that I'm still sorting through. On one hand, I'm amazed by the technical achievement and can't deny the genuine emotions they evoke. On the other hand, I feel I am being conditioned to feel that way.

Perhaps this ambivalence is exactly where we need to be right now - neither rejecting the technology outright nor embracing it uncritically, but sitting with the discomfort of its complexity while we figure out how to move forward thoughtfully.

For our generation that will guide AI's development, the challenge is to move beyond reductive arguments. Neither blind techno-optimism nor reflexive technophobia will serve us well. Instead, we need the wisdom to recognize both the extraordinary potential and the legitimate concerns about these systems, and the courage to chart a course that honors what makes human creativity valuable in the first place.


This post was written with assistance from Claude, which felt a bit meta given the topic. It suggested a lot of the structure, but all the half-baked jokes and questionable analogies are mine alone. And it still took me a beautiful Saturday to pull everything together in my style.


Sunday, April 6, 2025

MCP vs Agent

On 26th Nov 2024, Anthropic introduced the world to MCP - the Model Context Protocol. Four months later, OpenAI announced the adoption of MCP across its products, making the protocol the de facto standard of the industry. Put another way: a few months ago, we were figuring out when something is a workflow and when it is an agent (and we still are). Today, the question is how much MCP Kool-Aid we should drink.

What is MCP?

MCP is a standard for connecting AI models to external data sources and tools. It allows models to integrate with external systems independently of platform and implementation. Before MCP, there was tool use, but the integration was platform-specific, be it LangChain, CrewAI, LlamaIndex, or whatnot. An MCP Postgres server, however, works with all platforms and applications supporting the protocol. MCP is to AI what HTTP is to the internet. I think so, anyway; I was a baby when HTTP was invented.
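To make that concrete, here is a minimal sketch of an MCP server using the FastMCP helper from the official Python SDK. The server name and the track_parcel tool are hypothetical placeholders I made up for illustration; the point is that any MCP-compatible client can discover and call the tool without platform-specific glue.

```python
# A minimal MCP server sketch using the official Python SDK (pip install mcp).
# The server name and tool are hypothetical, invented for this example.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("parcel-demo")

@mcp.tool()
def track_parcel(tracking_id: str) -> str:
    """Return the latest status of a parcel (stubbed for illustration)."""
    # A real server would query your own system of record here.
    return f"Parcel {tracking_id}: out for delivery"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport used by desktop clients
```

Point the Claude Desktop App (or any other MCP client) at this server, and the tool shows up alongside every other server the user has installed.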

I won't attempt to paraphrase the components of MCP; modelcontextprotocol.io is dedicated to that. Here is a quick screenshot.

If you have invested extensively in tool use, the rise of MCP doesn't necessarily mean that your system is obsolete. Mind you, tool use is probably still the most popular integration method out there today, and can be made MCP-compatible with a wrapper layer. Here is one from LangChain.
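In spirit, the wrapper boils down to re-exposing the same function through both interfaces. Here is a hand-rolled sketch of that idea, assuming langchain-core and the mcp SDK are installed; the weather tool is made up, and the attribute names on LangChain's tool object reflect my understanding of its StructuredTool and may vary across versions.

```python
# Hand-rolled sketch: keep an existing LangChain tool working as-is,
# and re-expose its underlying function over MCP. The tool is made up;
# .func/.name/.description are my understanding of StructuredTool.
from langchain_core.tools import tool
from mcp.server.fastmcp import FastMCP

@tool
def get_weather(city: str) -> str:
    """Return a (stubbed) weather report for a city."""
    return f"It is sunny in {city}"

mcp = FastMCP("legacy-tools")

# Re-register the tool's underlying function with the MCP server.
mcp.add_tool(get_weather.func, name=get_weather.name,
             description=get_weather.description)

if __name__ == "__main__":
    mcp.run()
```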

A system with both tool use and MCP looks something like this.

For a 4-month-old protocol, MCP is extremely promising, yet far from a silver bullet for all AI integration problems. The most noticeable problems MCP has not resolved are:

  • Installation. MCP servers run on local machines and are meant for desktop use. While this is okay-ish for professional users, especially developers, it is unsuitable for casual use cases such as customer support.
  • Security. This post describes "tool poisoning", and it is definitely not the last vulnerability of its kind.
  • Authorization. MCP's OAuth 2.1 adoption is a work in progress. OAuth 2.1 itself is still a draft.
  • Official MCP Registry. As far as I know, there are attempts to collect MCP servers, but they are all ad hoc and incomplete, like this and this. The quality is hit and miss; official MCP servers tend to fare better than community ones.
  • Lack of features. Streaming support, stateless connection, proactive server behavior, to name a few.
None of these problems indicates a fundamental flaw in MCP, and in time all of them will be fixed. As with any emerging technology, it is good to be cautiously optimistic when dealing with these hiccups.

MCP in SaaS

I work at a SaaS company, building applications with AI interfaces. I was initially confused by the distinction between agents and MCP. Comparing an agent to MCP is apples to oranges, I know. MCP is more on par with tool use (on steroids). Yet the comparison makes sense in this context: if I only have so many hours to build assistance features for our Customer Support staff, should I build a proprietary agent or an MCP server connecting to, say, the Claude Desktop App, assuming both get the work done? Perhaps at this point, it is also worth noting that I believe office workers in the future will spend as much time on AI clients like Claude or ChatGPT as they do on browsers today. If not more. Because the AI overlord doesn't sleep and neither will you!

Though I ranted about today's limitations of MCP, the advantages are appealing. Established MCP clients, like the Claude Desktop App, handle gnarly engineering challenges such as output token pagination, human-in-the-loop, or rich content processing with satisfying UX. More importantly, a user can install multiple MCP servers on their device, both in-house and open-sourced, which opens up various workflow automation possibilities.

When I build agents, I notice that a considerable amount of time goes into plumbing - the overhead tasks required to get an agent up and running. It can't be fully overcome with better boilerplate code, but still... An agent is also a closed environment where any non-trivial (and sometimes trivial) change requires code modification and deployment. The usability is limited to what the agent's builder provides. And the promise of multi-agent collaboration, compelling as it is, has not quite been delivered yet. Finally, an agent is only as smart as the system prompts it was given. Bad prompts, like the ones made by yours truly, can make an agent perform far worse than the Claude Desktop App.

Then why are agents not yesterday's news yet? As far as I know, implementing an agent is the only method that provides full behavior control. Tool calls (even via MCP) in an agent are dispatched deterministically, and everything related to agent reasoning can be logged and monitored. The Claude Desktop App, as awesome as it is, offers little explanation of why it does what it does (though that too may change in the future). Drinking too much MCP Kool-Aid could also mean handing too much control of the system to third-party components - the MCP clients - which can become an existential threat.
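To illustrate the control argument, here is a toy agent loop in plain Python. No framework, and the tool and the llm callable are hypothetical placeholders, but it shows the point: tool dispatch happens in code you own, behind an explicit allow-list, with every call logged.

```python
# Toy agent loop: tool dispatch is deterministic code that you own,
# so it can be allow-listed, logged, and capped. The tool and the
# llm callable are hypothetical placeholders, not a framework's API.
import json
import logging

logger = logging.getLogger("agent")

TOOLS = {  # explicit allow-list: the agent can call nothing else
    "track_parcel": lambda args: f"Parcel {args['tracking_id']}: in transit",
}

def run_agent(task: str, llm) -> str:
    """llm(history) is assumed to return a JSON decision: a tool call or an answer."""
    history = [{"role": "user", "content": task}]
    for _ in range(10):  # hard cap on reasoning steps
        reply = llm(history)
        history.append({"role": "assistant", "content": reply})
        decision = json.loads(reply)
        if decision["type"] == "answer":
            return decision["content"]
        name, args = decision["tool"], decision["args"]
        logger.info("tool_call name=%s args=%s", name, args)  # full audit trail
        result = TOOLS[name](args)  # fails loudly on anything not allow-listed
        history.append({"role": "tool", "content": result})
    raise RuntimeError("agent exceeded step budget")
```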

Conclusion

The rise of MCP is certain. Just as MCP clients might replace browsers as the most important productivity tool, at some point, vendors might cease asking for APIs and seek MCP integration instead. Yet it will not be a binary choice between MCP servers and agents. Each offers distinct advantages for different use cases.
  • MCP allows incorporating a large number of tools, making it suitable for internal dogfooding and fast exploration. MCP would also see more adoption among professional users than casual ones.
  • Agents, being more controlled, tamed, and preferably well tested, would be the default choice for customer-facing applications.
Within a year, we'll likely see hybrid systems where MCP and agent-based approaches coexist. Of course, innovations such as AGI, or MCP growing beyond its initial local, desktop-bound use, can change this balance. There is no point in predicting the long-term future right now.