Saturday, December 29, 2018

The Unintentional Side Effect of a Bad Concurrency Model

This post, The Unintentional Side Effect of a Bad Concurrency Model, comes from Joe Armstrong's blog. The Erlang worldview it describes runs counter to the Clean Architecture advocated by Robert C. Martin. Martin holds that "micro-services are only a deployment option, not an architecture." The Erlang worldview disagrees. Erlang programmers believe:
(1) A micro-services design should be the default choice. We should grow a system by continually adding new, small, independent communicating objects (micro-services), rather than by making non-communicating programs larger and larger (monolithic).
(2) Modules should talk to each other through communication protocols and messages, rather than depend on APIs.

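The author's conclusion: most of the world's programs are not developed with the Erlang worldview mainly because most other languages lack a good concurrency model, and that lack produces an unintended side effect.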

Sequential languages are designed to write sequential programs, and the only way for a sequential program to grow in functionality is for it to get larger. It's technically difficult to split it into cooperating processes so this is not usually done. The concurrency in the application cannot be used to structure the application.

We should grow things by adding more small communicating objects, rather than making larger and larger non-communicating objects.

Concentrating on the communication provides a higher level of abstraction than concentrating on the function APIs used within the system. Black-box equivalence says that two systems are equivalent if they cannot be distinguished by observing their communication patterns. Two black-boxes are equivalent if they have identical input/output behavior.
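Black-box equivalence can be sketched in a few lines. The example below is only an illustration, not code from Armstrong's post: it uses Python threads and queues as a stand-in for Erlang processes and mailboxes, and the names `counter_loop`, `counter_history`, and `observe` are hypothetical. The two "processes" have different internals but speak the same protocol, so an observer who only sees messages cannot tell them apart.

```python
import queue
import threading


def counter_loop(inbox, outbox):
    """Keeps a running total in a local variable."""
    total = 0
    while True:
        msg = inbox.get()
        if msg == "stop":
            break
        total += msg
        outbox.put(total)


def counter_history(inbox, outbox):
    """Same protocol, different internals: stores every message and re-sums."""
    seen = []
    while True:
        msg = inbox.get()
        if msg == "stop":
            break
        seen.append(msg)
        outbox.put(sum(seen))


def observe(process, messages):
    """Run `process` in its own thread and record only what it sends back."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=process, args=(inbox, outbox))
    worker.start()
    replies = []
    for msg in messages:
        inbox.put(msg)
        replies.append(outbox.get())
    inbox.put("stop")
    worker.join()
    return replies
```

Calling `observe(counter_loop, [1, 2, 3])` and `observe(counter_history, [1, 2, 3])` yields the same list of replies, so either implementation can replace the other without its callers noticing.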

Friday, December 28, 2018

How to build stable systems

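How to build stable systems is a comprehensive article. It covers many aspects of software development and lays out the practices its author considers reasonable and effective.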

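On preparation, he argues that a project trying out new technology should place only a single gamble. Do not try too many new technologies at once, because that raises the project's risk too far.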
A project usually have a single gamble only. Doing something you’ve never done before or has high risk/reward is a gamble. Picking a new programming language is a gamble. Using a new framework is a gamble. Using some new way to deploy the application is a gamble. Control for risk by knowing where you have gambled and what is the stable part of the software. Be prepared to re-roll (mulligan) should the gamble come out unfavorably.

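On system planning:
Because the author comes from an Erlang background, he treats micro-services as the default, so he argues that modules should communicate through protocols. (Note: conventional system planning usually advises building the system as a monolith first, with modules communicating through APIs, and then turning modules into independently running services step by step as the need arises.)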
Your system is a flat set of modules with loose coupling. Each module have one responsibility and manages that for the rest of the software. Modules communicate loosely via a protocol, which means any party in a communication can be changed, as long as they still speak the protocol in the same way. Design protocols for future extension. Design each module for independence. Design each module so it could be ripped out and placed in another system and still work.

He advocates the end-to-end principle: only the endpoints need complex logic.
In a communication chain, the end points have intelligence and the intermediaries just pass data on. That is, exploit parametricity on multiple levels: build systems in which any opaque blob of data is accepted and passed on. Avoid intermediaries parsing and interpreting on data. Code and data changes over time, so by being parametric over the data simplifies change.
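A minimal sketch of the idea, assuming JSON as the wire format chosen by the endpoints (the article does not prescribe one). The names `encode`, `relay`, and `decode` are hypothetical. The intermediary accepts any bytes and never looks inside them, so the endpoints can change the payload format without touching it.

```python
import json


def encode(order):
    """Sending endpoint: one of the two places that know the wire format."""
    return json.dumps(order).encode("utf-8")


def relay(blob):
    """Intermediary: accepts any bytes and forwards them untouched."""
    if not isinstance(blob, bytes):
        raise TypeError("relay only carries opaque bytes")
    return blob


def decode(blob):
    """Receiving endpoint: the other place that knows the wire format."""
    return json.loads(blob.decode("utf-8"))
```

Because `relay` is parametric over its payload, it forwards a JSON document and an arbitrary binary blob in exactly the same way.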

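On configuration, he argues that unless the system is truly huge, fully dynamic configuration is unnecessary. There is no need to adopt etcd/Consul or the like too early; putting the configuration file on S3 is enough.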
Avoid the temptation of too early etcd/Consul/chubby setups. Unless you are large, you don’t need fully dynamic configuration systems. A file on S3 downloaded at boot will suffice in many cases.
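A rough sketch of "read a static file once at boot". It assumes the file has already been downloaded to a local path (the S3 download step is not shown), that the file is JSON, and that missing keys fall back to defaults. The `DEFAULTS` keys and the `load_config` helper are hypothetical.

```python
import json
import os

# Hypothetical defaults; a real service would define its own keys.
DEFAULTS = {"port": 8080, "log_level": "info"}


def load_config(path):
    """Read the configuration once at boot; keep defaults for missing keys."""
    config = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            config.update(json.load(handle))
    return config
```

The service calls `load_config` once during startup and treats the result as read-only; changing a setting means replacing the file and restarting.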

Tuesday, December 25, 2018

Lessons from Erlang

(1) Remote procedure call is not a good idea
What seems like a good, simple idea on the surface -- hiding networks and messages behind a more familiar application-development idiom -- often causes far more harm than good.

Request-response is a network-level message exchange pattern, whereas RPC is an application-level abstraction intended.

Equating RPC with synchronous messaging means the latter wrongly suffers the same criticisms as RPC.

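RPC has two main problems:
(a) It provides no reasonable abstraction for handling network failure.
(b) It hides the cost of calling a procedure over the network from engineers, which makes it easy to design systems with excessive response times.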

Erlang does not offer remote procedure call as a primitive. Its primitives are for message passing and failure handling, namely send/receive and link.
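To make the shape of these primitives concrete, here is a loose imitation in Python. It is not Erlang semantics: the hypothetical `Process` class wraps a thread and a queue, and its `link` is one-way (closer to an Erlang monitor), whereas real Erlang links are bidirectional. The point it illustrates is that failure arrives as an ordinary message that the receiver must handle explicitly, instead of being hidden behind a function call.

```python
import queue
import threading


class Process:
    """A thread with a mailbox, loosely imitating an Erlang process."""

    def __init__(self, body):
        self.mailbox = queue.Queue()
        self.links = []
        self._thread = threading.Thread(target=self._run, args=(body,))

    def link(self, other):
        """Ask for an exit message in `other`'s mailbox when this one stops."""
        self.links.append(other)

    def send(self, message):
        self.mailbox.put(message)

    def receive(self, timeout=None):
        return self.mailbox.get(timeout=timeout)

    def start(self):
        self._thread.start()
        return self

    def join(self):
        self._thread.join()

    def _run(self, body):
        reason = "normal"
        try:
            body(self)
        except Exception as exc:
            reason = repr(exc)
        # Failure handling: tell every linked process why we stopped.
        for other in self.links:
            other.send(("EXIT", self, reason))


def demo():
    def worker(self):
        request = self.receive()
        raise ValueError(f"cannot handle {request!r}")

    watcher = Process(lambda self: None)  # never started; only its mailbox is used
    child = Process(worker)
    child.link(watcher)
    child.start()
    child.send("ping")
    child.join()
    return watcher.receive(timeout=1)
```

`demo()` returns a tuple whose first element is `"EXIT"`, followed by the crashed process and the reason for the crash. The watcher learns about the failure by receiving a message, which is the abstraction RPC does not provide.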

(2) What scalable, fault-tolerant, and hot-upgradable have in common
For a system to satisfy all three at once, it needs to implement the following primitives:
(a) detect failure
(b) move state from one node to another

When designing a system for fail-over, scalability, dynamic code upgrade we have to think about the following:

What information do I need to recover from a failure?
How can we replicate the information we need to recover from a failure?
How can we mask failures/code_upgrades/scaling operations from the clients?

(3) Cooperative multitasking is fragile
Cooperative multitasking is a fragile design. (That said, cooperative multitasking usually performs somewhat better than preemptive multitasking.)

The weakness of the cooperative model is its fragility in server settings. If one of the tasks in the task queue monopolizes the CPU, hangs or blocks, then the impact is worse throughput, higher latency, or a deadlocked server. The fragility has to be avoided in large-scale systems, so the Erlang runtime is built preemptively. The normal code is “instrumented” such that any call which may block automatically puts that process to sleep, and switches in the next one on the CPU.
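The fragility is easy to reproduce with a toy cooperative scheduler built on Python generators. This is only a sketch and says nothing about how the Erlang runtime is implemented; `polite`, `greedy`, and `run` are hypothetical names. A task keeps the CPU until it chooses to `yield`, so one task that does not yield delays every other task.

```python
from collections import deque


def polite(name, steps, log):
    """Does one unit of work, then yields the CPU back to the scheduler."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield


def greedy(name, steps, log):
    """Does all of its work before yielding once."""
    for i in range(steps):
        log.append(f"{name}{i}")
    yield


def run(tasks):
    """Round-robin over generators; a task runs until it chooses to yield."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished; do not reschedule it
        ready.append(task)
```

With two polite tasks the log interleaves (`a0, b0, a1, b1`). Replace the first with a greedy task and the second one only starts after the greedy task has finished all of its work (`g0, g1, g2, b0, b1`). A greedy task that never returns would stall the scheduler completely.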

(4) Stacking Theory for Systems Design
Define several modes of operation for a service's availability, for example:
The VM is up and running: that is level 0.
The database connection works: that is level 1.

When certain conditions are met, the system moves up one operational level. When an error occurs, it automatically drops down one operational level.

By "stacking" service, we can go back to level 0 and start best-effort transitions to level 1 again. This structure is a ratchet-mechanism in Erlang systems: once at a higher level, we continue operating there. Errors lead to the fault-tolerance handling acting upon the system. It moves our operating level down — by resetting and restarting the affected processes — and continues operation at the lower level.

With a design built on multiple operational modes like this, the system gains fault tolerance more easily.
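The level transitions can be sketched as a small function. This is an illustration under assumptions, not how an Erlang supervision tree is implemented: `tick` is a hypothetical name, and `conditions[i]` is assumed to be a callable that reports whether the requirement for level `i + 1` currently holds (level 0, "the VM is up", has no requirement).

```python
def tick(level, conditions):
    """Return the operating level after one evaluation.

    conditions[i]() must be true to reach, and to stay at, level i + 1.
    """
    # An error at the current level: fall back one level.
    if level > 0 and not conditions[level - 1]():
        return level - 1
    # The condition for the next level holds: move up one level.
    if level < len(conditions) and conditions[level]():
        return level + 1
    # Otherwise keep operating where we are (the ratchet).
    return level
```

A service with one condition, "the database connection works", starts at level 0, rises to level 1 once the connection succeeds, stays there on later ticks, and drops back to level 0 when the connection fails. At level 0 it keeps running and keeps retrying, so boot order stops mattering.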

A system assuming the presence of other systems has an unwritten dependency chain. You have to boot your production systems in a certain order or things will not work. This is often bounds for trouble.

(5) Processes are the units of error encapsulation --- errors occurring in a process will not affect other processes in the system. We call this property strong isolation.

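There has always been debate about how software should be modularized. Compiler designers usually imagine the hardware to be perfect and argue that static compile-time type checking provides good isolation. Operating system designers, by contrast, argue for run-time checks and for making processes the units of protection and the units of failure.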
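The operating-system view of isolation can be observed with ordinary OS processes. The sketch below uses Python's standard `subprocess` module instead of Erlang's lightweight processes, and `run_isolated` is a hypothetical helper. An uncaught error kills only the child process; the parent sees a non-zero exit code and carries on.

```python
import subprocess
import sys


def run_isolated(source):
    """Run Python source in a separate OS process; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()
```

Running `run_isolated("raise RuntimeError('boom')")` returns a non-zero exit code without raising anything in the caller, and a later call such as `run_isolated("print(6 * 7)")` still works normally.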

Friday, December 21, 2018

why does 12 factor recommend not to daemonize processes?

I have been studying Erlang/Elixir lately and rediscovered something I had always carelessly overlooked: the supervisor.

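So I looked up the 12 factor app. Why? I vaguely recall reading an article somewhere saying that many of Erlang's designs were later adopted by the 12 factor app, the point being that Erlang really was a pioneer of distributed systems design.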

It turns out the 12 factor app also has a rule that touches on process supervisors. But the clearest explanation I found was a Stack Overflow answer to this question:
"Why does 12 factor recommend not to daemonize processes?"

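The answer:
The point is not that daemonizing processes is a bad solution in itself. The point is that *nix has long had process management frameworks, so do not build your own without good reason. Making good use of existing tools is the spirit of DevOps.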