程式人蔘: Data-driven programming

Thursday, September 7, 2017

Data-driven programming

Data-driven programming, as discussed here, comes from the book The Art of UNIX Programming. The basic idea goes like this:

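First, the human brain is not good at following sequential "logic". It finds "data" much easier to grasp. What counts as data? Tables, markup languages, macros, template systems: all of these count as data, and all of them are easier to understand than sequential logic.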

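Building on this insight, Unix designers reach for the tools in their kit (very high-level languages, data-driven programming, code generators, and domain-specific minilanguages) so that code can be generated automatically from a specification that is as small as possible.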

These insights ground in theory a set of practices that have always been an important part of the Unix programmer's toolkit — very high-level languages, data-driven programming, code generators, and domain-specific minilanguages. What unifies these is that they are all ways of lifting the generation of code up some levels, so that specifications can be smaller.

Compared with OO, data-driven programming differs in two main ways:

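In data-driven programming, the "data" is not merely the state of an object; it often defines the program's control structure. The primary concern in OO is encapsulation, whereas the primary concern in data-driven programming is writing as little "fixed code" as possible.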

In data-driven programming, the data is not merely the state of some object, but actually defines the control flow of the program. Where the primary concern in OO is encapsulation, the primary concern in data-driven programming is writing as little fixed code as possible.
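To make "the data defines the control flow" concrete, here is a minimal sketch of a table-driven state machine. The turnstile, its states, and the names `TURNSTILE` and `run` are all made up for illustration and do not come from the book. The only fixed code is the small loop; changing the behavior means editing the table.

```python
# Hypothetical example: a turnstile whose behavior lives entirely in a table.
# Each entry maps (current_state, event) -> (next_state, action_name).
TURNSTILE = {
    ("locked", "coin"): ("unlocked", "unlock"),
    ("locked", "push"): ("locked", "alarm"),
    ("unlocked", "push"): ("locked", "lock"),
    ("unlocked", "coin"): ("unlocked", "refund"),
}


def run(table, events, state="locked"):
    """Generic engine: the only fixed code. It knows nothing about turnstiles."""
    actions = []
    for event in events:
        # The table, not an if/else chain, decides what happens next.
        state, action = table[(state, event)]
        actions.append(action)
    return state, actions


final_state, actions = run(TURNSTILE, ["coin", "push", "push"])
print(final_state, actions)
```

Passing a different table to `run` would give a different machine without touching the loop, which is the sense in which the fixed code stays small.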

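The book uses Python for its data-driven programming example, but Python has only been around since about 1990 (its first public release was in 1991).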

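So the book also has a passage about the older Unix habit (Unix itself dates back to 1969): Unix programmers are used to writing a "parser specification" to generate a "parser" for processing "markup languages". Once the parser exists, the rest of the job can be done with a generic "tree walk" over the configuration structure. Solving the problem cleanly takes two stages of data-driven programming, with one (the tree walk) building on the other (the parsing).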

Unix programmers are very used to writing parser specifications to generate parsers for processing language-like markups; from there it was a short step to believing that the rest of the job could be done by some kind of generic tree-walk of the configuration structure. Two separate stages of data-driven programming, one building on the other, were needed to solve the design problem cleanly.
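The two stages can be sketched roughly as follows. This is only an illustration under stated assumptions: the standard-library `json` parser stands in for a parser generated from a specification (the book's context is yacc-style tooling, not JSON), and the configuration text and the `walk` helper are hypothetical.

```python
import json

# Hypothetical configuration text; the keys and values are made up.
CONFIG_TEXT = (
    '{"server": {"host": "example.org", "ports": [110, 995]}, "verbose": true}'
)


def walk(node, path=()):
    """Stage two: a generic tree walk over any nested dict/list structure.

    Yields (path, leaf) pairs; it has no knowledge of specific keys.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))
    else:
        yield path, node


# Stage one: parsing turns text into a tree (json.loads is the stand-in parser).
tree = json.loads(CONFIG_TEXT)

# Stage two builds on stage one: flatten the tree into dotted setting names.
settings = {".".join(map(str, path)): value for path, value in walk(tree)}
print(settings)
```

Adding a new option to the configuration text requires no change to `walk`, since the walk is driven by the shape of the tree.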

Under data-driven programming, a program is not just an engine but a data-programmable engine. The classic implementations of data-driven programming are Ant and interpreters.
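As a rough sketch of what a "data-programmable engine" might look like, the toy below is loosely in the spirit of a build tool such as Ant. It is not Ant's actual file format or API: the `BUILD` table, the target names, and the `plan` and `execute` functions are all hypothetical. The engine only knows how to order targets by their dependencies; what gets built is decided by the data.

```python
# Hypothetical build description: targets, their dependencies, and their steps.
BUILD = {
    "clean": {"depends": [], "steps": ["remove build/"]},
    "compile": {"depends": ["clean"], "steps": ["compile src/ into build/"]},
    "test": {"depends": ["compile"], "steps": ["run unit tests"]},
    "package": {"depends": ["compile", "test"], "steps": ["archive build/"]},
}


def plan(build, target, done=None):
    """Return the targets to run, dependencies first, each at most once.

    Assumes the dependency graph has no cycles (no cycle detection here).
    """
    if done is None:
        done = []
    for dep in build[target]["depends"]:
        plan(build, dep, done)
    if target not in done:
        done.append(target)
    return done


def execute(build, target):
    """The engine: walks the plan and 'runs' each step by logging it."""
    log = []
    for name in plan(build, target):
        for step in build[name]["steps"]:
            log.append(f"[{name}] {step}")
    return log


for line in execute(BUILD, "package"):
    print(line)
```

An interpreter follows the same pattern at a larger scale: the engine is fixed, and the program it runs is data.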