Sunday, September 10, 2017

Can Kafka guarantee message ordering?

If you have never heard of Kafka, this post is not for you. It is short and won't repeat the concepts of a messaging system and its components, internal or external.

Kafka has proven to be a versatile tool for many use cases. In some of them, message ordering, meaning that messages are consumed in the same order they were produced, matters more than the speed of either production or consumption. There is a big difference between depositing $100 and then withdrawing it, and the other way around.

This illustration is here to please the eye rather than to serve any particular purpose.


Partition

In Kafka, order is only guaranteed within a partition. This means that if the producer sends messages in a specific order, the broker will write them to the partition in that order, and all consumers will read them in that order. So naturally, ordering is easier to enforce on a single-partition topic than on its multiple-partition siblings.

But sometimes this setting is not desirable. A single partition limits throughput to that of a single consumer. And in most systems, enforcing a global order over all messages ever created is unnecessary. The order of deposits and withdrawals on my account shouldn't interfere with that of my colleagues' accounts (they are not causally related). In these cases, as long as related messages are sent to the same partition, (causal) ordering is still guaranteed. In Kafka, this is achieved with keyed messages. The Kafka partitioner hashes the key and ensures a given key always goes to the same partition, provided the number of partitions does not change.
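A minimal sketch of the idea in plain Python, not Kafka's actual code: the hash here is CRC32 as a stand-in (Kafka's default partitioner uses a different hash), and `pick_partition`, the account names, and the three-partition topic are all made up for illustration.

```python
import zlib


def pick_partition(key: bytes, num_partitions: int) -> int:
    # Stand-in hash for illustration; the real Kafka partitioner differs.
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3  # hypothetical topic size
events = [
    ("alice", "deposit 100"),
    ("bob", "deposit 50"),
    ("alice", "withdraw 30"),
]

# Each partition is modeled as an append-only list.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for account, action in events:
    partitions[pick_partition(account.encode(), NUM_PARTITIONS)].append((account, action))

# All of alice's events sit in one partition, in the order they were sent.
alice_partition = pick_partition(b"alice", NUM_PARTITIONS)
alice_events = [a for acct, a in partitions[alice_partition] if acct == "alice"]
```

Because the partition is `hash % num_partitions`, adding partitions later can move a key to a different partition, which is why the guarantee only holds while the partition count stays fixed.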

If the naive hash scheme skews the size of your partitions, e.g. an overactive bitcoin broker with a disproportionate number of activities makes his partition twice as large as any other, Kafka allows a custom partitioner.
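One possible shape for such a partitioner, sketched in plain Python rather than against Kafka's partitioner interface: the known heavy key gets a partition of its own, and every other key is hashed over the remaining ones. `HOT_KEYS`, `custom_partition`, and the CRC32 hash are assumptions made for this example.

```python
import zlib

# Hypothetical set of keys known to be disproportionately busy.
HOT_KEYS = {b"busy-broker"}


def custom_partition(key: bytes, num_partitions: int) -> int:
    if num_partitions < 2:
        raise ValueError("need at least two partitions to isolate a hot key")
    if key in HOT_KEYS:
        # Reserve the last partition for the hot key.
        return num_partitions - 1
    # Everyone else shares the remaining partitions (stand-in hash).
    return zlib.crc32(key) % (num_partitions - 1)
```

The hot key still maps to exactly one partition, so its ordering is preserved; it just stops crowding the partitions used by other keys.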

Configuration

Kafka's numerous settings include `retries`, the number of times a message will be retried on an intermittent error, and `max.in.flight.requests.per.connection`, the number of unacknowledged requests the producer will send on a single connection before blocking (they are pipelined, not sent concurrently). The combination of the two can cause a bit of trouble. It is hard to build a reliable system with zero `retries`. On the other hand, with retries enabled it is possible that the broker fails the first batch of messages, succeeds with the second (already in flight), and then succeeds with the retried first batch, so the two batches end up in reverse order. With positive `retries`, `max.in.flight.requests.per.connection` should be set to 1. This comes with a penalty on producer throughput and should only be used where order enforcement is a must.
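The reordering can be reproduced with a toy model. This is a deliberately simplified simulation, not Kafka client code: `send_all` and its parameters are invented for illustration, `max_in_flight` plays the role of `max.in.flight.requests.per.connection`, and a batch listed in `fail_once` fails on its first attempt and succeeds on retry.

```python
from collections import deque


def send_all(batches, max_in_flight, fail_once):
    """Toy model: return the order in which batches land in the partition log."""
    pending = deque(batches)
    failed_already = set()
    log = []
    while pending:
        # Pipeline up to max_in_flight batches on the connection.
        window = min(max_in_flight, len(pending))
        in_flight = [pending.popleft() for _ in range(window)]
        retry = []
        for batch in in_flight:
            if batch in fail_once and batch not in failed_already:
                failed_already.add(batch)
                retry.append(batch)  # transient error: resend later
            else:
                log.append(batch)  # written to the log
        # Failed batches go back to the front of the queue.
        pending.extendleft(reversed(retry))
    return log


pipelined = send_all(["b1", "b2", "b3"], max_in_flight=2, fail_once={"b1"})
one_at_a_time = send_all(["b1", "b2", "b3"], max_in_flight=1, fail_once={"b1"})
```

With two requests in flight, `b2` is written before the retried `b1`; with one request in flight, the retry of `b1` completes before `b2` is ever sent.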

Producer

Sadly, there is no ordering guarantee for messages written by different producers; the order of reception at the broker is used. If the system design insists on multiple producers, one topic, and strong ordering, the responsibility falls on the producer application code. This means a bigger investment, which is why it comes last. As neither clocks nor networks in a distributed system are reliable, sequence numbers can replace timestamps to show the order of messages. The Lamport timestamp is one such simple and compact mechanism. A Lamport timestamp, however, cannot tell whether two messages are concurrent or causally dependent. The more heavyweight version vectors would gladly take that job for a few extra dimes.
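A minimal sketch of a Lamport clock, assuming each producer keeps one counter and attaches its value to every message it sends. The class and method names are illustrative, and how one producer learns about another's timestamp (for example by reading the topic) is left to the application.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Call before sending a message; the result is the message's stamp.
        self.time += 1
        return self.time

    def merge(self, remote_time: int) -> int:
        # Call when a message stamped by another producer is observed.
        self.time = max(self.time, remote_time) + 1
        return self.time


producer_a = LamportClock()
producer_b = LamportClock()

deposit_stamp = producer_a.tick()    # A stamps the deposit
producer_b.merge(deposit_stamp)      # B observes A's message
withdraw_stamp = producer_b.tick()   # B stamps the withdrawal that depends on it
```

If one message causally depends on another, its stamp is larger. The reverse does not hold: a larger stamp alone does not prove a causal link, which is the gap version vectors fill.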

TLDR

Total order is easiest to maintain with a single partition and a single producer. With multiple partitions, send causally dependent messages to the same partition using keys and the partitioner. With multiple producers, an independent sequencing scheme is needed. Accept the tradeoff of giving up message pipelining if you want retries and ordering in the same place.

Sunday, August 6, 2017

Stories of the underdogs II

The internet as we know it today started in the late 1980s. By the mid-1990s, it had already had a revolutionary impact on culture, commerce, and technology. Around that time Vietnam got its first commercial internet connection. The tipping point was 2004, when internet usage grew by 200% and gave 10% of the population access to the information highway. With a young population and a network of locally connected game centers, the first profound impact of the internet in Vietnam was on the gaming industry. After some competition, when the dust settled, two dominant players emerged: Singapore-based Garena and homegrown VNG. When Garena came to Vietnam in 2010, the press described the event as the doom of VNG, predicting that the little Vietnamese startup would soon be devoured. And there was a basis for that: Garena was led by a Stanford MBA graduate, had a seed fund of $2M, and operated in 5 different countries before entering Vietnam. VNG's founder, on the other hand, was originally a bank officer by day who ran his game center in an alley at night. The startup got $80k in seed funding and was non-existent outside the country. Seven years later, both are still around and extremely successful, being the first two SEA startups worth more than $1B. There is a little twist, though. VNG is still virtually non-existent outside Vietnam. It managed to reach the same tier as Garena with only one market, while the latter has 8. And within Vietnam, while Garena is known solely as a game publisher, VNG has transformed into an internet giant whose products touch every Vietnamese internet user.

The story of VNG is inspiring, against all odds, and exactly the kind of folk tale you are supposed to hear in the world of startups. Most startups are inexperienced, understaffed, underfinanced, and overloaded. Of course, once in a while there are exceptions, like Color, a photo-based social network led by veteran Vietnamese American founders with $41M in initial funding. But for the majority of startups, those 4 horsemen of entrepreneurship keep chasing them until exit. Yet despite being disadvantaged, startups succeed and disrupt markets from time to time. Again, unless you were Color: that startup was a flop and was sold to Apple for $7M. So what if we have been reading the startup stories wrong? That is, what if instead of succeeding despite all the challenges, great startups succeed exactly because of the challenges they face?

To listen to these stories differently, we first need to redefine what an advantage is. The same qualities that make big companies appear fearsome are often the source of great weakness. For example, when it comes to technology startups, being big isn't exactly an advantage. While a bigger developer team can in theory produce software faster, it is also significantly harder to manage, might suffer from low productivity because of communication and bureaucracy overhead, and is more expensive to keep. Instagram was acquired for a billion dollars with a team of 13. WhatsApp has 50 engineers for 900 million users. With the power of software, hardware, and automation, size is more a liability than an advantage.

On another point, it is arduous for a company to expand its core competencies into other areas. Once its core competencies are formed, much of the company's resources are spent on extending the scope of features so the product applies to more use cases, and on refining the development process and organizational structure to sustain all those activities. Innovation is still cultivated, usually in the form of team specialization, which results in a stream of small, continuous improvements to an existing product or service. That also makes disruptive innovation, like creating products or services that did not exist before, harder. Google, for example, is an extremely successful company, yet despite its size and number of projects, much of its revenue revolves around its core competency as a search engine. Most of the projects meant to ensure Google's relevance in the next decade, such as Android, self-driving cars, or AI, came from acquisitions and not from in-house work.

While big corporations focus on regional and global scale, Vietnamese startups, even the successful ones, think and act locally. This in turn results in products that solve very specific problems existing only in the country and virtually unheard of elsewhere. As a game distributor, Garena did and still does beat VNG in every respect. It is well tuned for international partnerships, multi-nation online game operations, and game tournament hosting. What was left for VNG were games that are much less popular and operate in only one country. And it was meant to be that way. VNG's resources were spent on another problem Vietnam had in the late 2000s. Back then, to run a game center you needed to be some sort of computer or game enthusiast, because the hardware constantly went out of date, new games kept popping up, and your computers needed to be protected against kids who were too eager to try whatever tricks they found on the internet. But that wouldn't scale once game centers became a business and an investment. So VNG came up with a business development team that would advise small business owners on new locations and hardware and software setups, and on top of that install homegrown management software that not only protected the computers against threats and kept them updated with the latest games, but also enabled a country-wide VNG membership. So all of a sudden, credit a player accumulated in one center could be used in another, as long as the other place had also installed the VNG management software. The software spread like wildfire, and so did VNG's presence in every corner of the country.

The trend among Vietnamese startups in recent years can be summed up as follows:
  • Solve local problems
  • Avoid direct confrontation with bigger players
  • Disproportionately abundant in the Entertainment and Lifestyle segment, due to its size
  • Far fewer B2B efforts, even among each other, which sometimes unintentionally leads to fewer strategic partnerships that would enhance value propositions and create win-win situations
In other words, many startups here find that guerrilla tactics fit them just right. But then, if this can be formulated, why aren't we seeing more successful startups from Vietnam? Because guerrilla tactics are hard, to the point of desperation. Niche local problems aren't always obvious; more often they lurk in a corner, hiding in plain sight. Finding them is like shooting an arrow at a bullseye you cannot see. Sometimes the bullseye doesn't even exist. And you have to go on like that day after day, with marginal profit, while investors and employees show doubt, sometimes very expressively, and none of the effort seems worth it. You only turn to guerrilla tactics when nothing else seems to work. Even startups that succeed with these tactics drop them as soon as they can. As soon as VNG's revenue from game distribution was stable, they started looking to expand into conventional markets, like social networking, media streaming, and messaging, all of which eventually replaced game distribution as VNG's cash cow.

Doing a startup is one of the many ways to gain perspective on the world. A great one. Much of what we consider valuable in our world arises out of these kinds of challenges, because the act of facing overwhelming odds produces greatness and beauty. But just as important, the nature of these challenges might not be what it appears to be: giants have weaknesses, and underdogs, over and over, accomplish the unexpected with the right preparation.

Sunday, February 26, 2017

Bye Nguyễn Huệ Apartment Building. Bye Saigon.

You have to read this news piece to know what I'm muttering about. So go read it, I'll wait. If you're feeling lazy, the headline will do.



Clear out these shops and the Nguyễn Huệ apartment building will probably be nothing but a concrete shell. People stopped living here long ago because, as the article says, the infrastructure is too run-down. In other words, these shops are now the only thing keeping the building alive. Shut them down, the residents won't (want to) move back, and the building will probably give way to an office tower that better suits the face of the city.

In Saigon, small household businesses are the backbone of the economy; do they have no place on the face of the city? It has been a long time since Saigon got a structure like Hồ Con Rùa, built for no damn purpose at all, just because it is beautiful. When Hồ Con Rùa was built, the architect probably never imagined people selling bánh tráng trộn there, or the 30/4 park becoming a place to gather newspapers to sit on. Like the Nguyễn Huệ apartment building, these landmarks became environments where business models grow naturally (organically). That is the kind of work that needs the government: creating an environment in which society can develop naturally, and leaving the micromanagement (filling an empty mall with shops) to the private sector. When the government joins forces with the big private players, the small ones become hired hands and tenants.

The Tôn Thất Đạm apartment building. That cool, artsy look you can see but can't put a name to comes from how organic the whole thing is. Like a Swiss village.
Trà Lạc Đình at Hồ Con Rùa, gone too? The couple has been here since the building first went up.
Hồ Con Rùa seen from Lạc Đình.

Since March 25, 2015, Saigon has often been compared with (the past of) Singapore. Standing shoulder to shoulder with Singapore has become a KPI. In Singapore, the Tiong Bahru area has more than a dozen old apartment blocks, built around the founding of the nation, so the government decided to classify them as national heritage, and any change or repair requires permission. It is now a mecca for mainstream-hating hipsters. Booksactually, where I bought my copy of Sorrow of War, is right here. The first units in Tiong Bahru were built back in 1920, but most of what we see today dates from 1960. Saigon has no shortage of Singapore-style national heritage.

A Singapore national asset. By the way, Marina Bay Sands is not one.

Could Decree No. 99/2015 be sidestepped if the building at 42 Nguyễn Huệ gave up its function as an apartment building? I don't know, I'm an amateur, and you shouldn't trust some random guy on the Internet. But a mall-that-looks-like-an-apartment-building becoming Saigon's new trend? I can live with that. Heck, after it was decided that a selfie stick should be called "gậy tự sướng", I can live with anything.