Summer Camp on Programming 2018

General information

Time
July 30 ~ August 1, 2018
Contact
Josh Ko, hsiang-shang [at] nii [dot] ac [dot] jp
Yongzhe Zhang, zyz915 [at] nii [dot] ac [dot] jp
Access
The easiest, fastest and smoothest way to get from Tokyo to Karuizawa is by Shinkansen. Direct options between Tokyo and Karuizawa are the Asama and Hakutata trains. The time table can be found in this page. Costs and journey times are below.
StationUnreserved seat Reserved seatGreen seat Gran classTime (mins)
Tokyo5,3906,1107,45010,540~ 75
Ueno5,1805,9007,24010,330~ 70
Map
Map (PDF/310kB)

Talks

A unified approach to solving seven programming problems (functional pearl)

Byrd et al. In ICFP 2017.

We present seven programming challenges in Racket, and an elegant, unified approach to solving them using constraint logic programming in miniKanren.

Speaker: Kanae Tsushima (National Institute of Informatics, Japan)

A lambda-calculus foundation for universal probabilistic programming

Borgström et al. In ICFP 2016.

We develop the operational semantics of an untyped probabilistic λ-calculus with continuous distributions, and both hard and soft constraints,as a foundation for universal probabilistic programming languages such as Church, Anglican, and Venture. Our first contribution is to adapt the classic operational semantics of λ-calculus to a continuous setting via creating a measure space on terms and defining step-indexed approximations. We prove equivalence of big-step and small-step formulations of this distribution-based semantics. To move closer to inference techniques, we also define the sampling-based semantics of a term as a function from a trace of random samples to a value. We show that the distribution induced by integration over the space of traces equals the distribution-based semantics. Our second contribution is to formalize the implementation technique of trace Markov chain Monte Carlo (MCMC) for our calculus and to show its correctness. A key step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distribution-based semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language, or for a language with soft constraints.

Speaker: Taro Sekiyama (National Institute of Informatics, Japan)

Deep specifications and certified abstraction layers.

Gu et al. In POPL 2015

Modern computer systems consist of a multitude of abstraction layers (e.g., OS kernels, hypervisors, device drivers, network protocols), each of which defines an interface that hides the implementation details of a particular set of functionality. Client programs built on top of each layer can be understood solely based on the interface, independent of the layer implementation. Despite their obvious importance, abstraction layers have mostly been treated as a system concept; they have almost never been formally specified or verified. This makes it difficult to establish strong correctness properties, and to scale program verification across multiple layers.

In this paper, we present a novel language-based account of abstraction layers and show that they correspond to a strong form of abstraction over a particularly rich class of specifications which we call deep specifications. Just as data abstraction in typed functional languages leads to the important representation independence property, abstraction over deep specification is characterized by an important implementation independence property: any two implementations of the same deep specification must have contextually equivalent behaviors. We present a new layer calculus showing how to formally specify, program, verify, and compose abstraction layers. We show how to instantiate the layer calculus in realistic programming languages such as C and assembly, and how to adapt the CompCert verified compiler to compile certified C layers such that they can be linked with assembly layers. Using these new languages and tools, we have successfully developed multiple certified OS kernels in the Coq proof assistant, the most realistic of which consists of 37 abstraction layers, took less than one person year to develop, and can boot a version of Linux as a guest.

Speaker: Josh Ko (National Institute of Informatics, Japan)

Program slicing

Mark Weiser In ICSE 1981.

Program slicing is a method used by experienced computer programmers for abstracting from programs. Starting from a subset of a program's behavior, slicing reduces that program to a minimal form which still produces that behavior. The reduced program, called a “slice”, is an independent program guaranteed to faithfully represent the original program within the domain of the specified subset of behavior. Finding a slice is in general unsolvable. A dataflow algorithm is presented for approximating slices when the behavior subset is specified as the values of a set of variables at a statement. Experimental evidence is presented that these slices are used by programmers during debugging. Experience with two automatic slicing tools is summarized. New measures of program complexity are suggested based on the organization of a program's slices.

Speaker: Huynh Hoang Long (Hanoi University of Science and Technology, Vietnam)

Designing distributed applications with mobile code paradigms

Carzaniga et al. In ICSE 1997.

Large scale distributed systems are becoming of paramount importance, due to the evolution of technology and to the interest of market. Their development, however, is not yet supported by a sound technologiecal and methodological backgound, as the results develooped for small size distributed systems often do not scale up. Recently, mobile code languages (MCLs) have been proposed as a technological answer to the problem. In this work, we abstract away from the details of these languages by deriving design paradigms exploiting code mobility that are independent of any particular technology. We present such design paradigms, together with a discussion of their features, their application domain, and some hints about the selection of the correct paradigm for a given distributed application.

Speaker: Jumpei Tanaka (National Institute of Informatics, Japan)

Generic schema matching with Cupid

Madhavan et al. In ICSE 2018.

Schema matching is a critical step in many applications, such as XML message mapping, data warehouse loading, and schema integration. In this paper, we investigate algorithms for generic schema matching, outside of any particular data model or application. We first present a taxonomy for past solutions, showing that a rich range of techniques is available. We then propose a new algorithm, Cupid, that discovers mappings between schema elements based on their names, data types, constraints, and schema structure, using a broader set of techniques than past approaches. Some of our innovations are the integrated use of linguistic and structural matching, context-dependent matching of shared types, and a bias toward leaf structure where much of the schema content resides. After describing our algorithm, we present experimental results that compare Cupid to two other schema matching systems.

Speaker: Van-Dang Tran (National Institute of Informatics, Japan)

The pairing heap: a new form of self-adjusting heap

Fredman et al. In Algorithmica (1986).

Recently, Fredman and Tarjan invented a new, especially efficient form of heap (priority queue) called the Fibonacci heap. Although theoretically efficient, Fibonacci heaps are complicated to implement and not as fast in practice as other kinds of heaps. In this paper we describe a new form of heap, called thepairing heap, intended to be competitive with the Fibonacci heap in theory and easy to implement and fast in practice. We provide a partial complexity analysis of pairing heaps. Complete analysis remains an open problem.

Speaker: Yongzhe Zhang (National Institute of Informatics, Japan)

On data structures and asymmetric communication complexity

Miltersen et al. In Journal of Computer and System Sciences (1991).

In this paper we consider two-party communication complexity, the “asymmetric case”, when the input sizes of the two players differ significantly. Most of previous work on communication complexity only considers the total number of bits sent, but we study trade-offs between the number of bits the first player sends and the number of bits the second sends. These types of questions are closely related to the complexity of static data structure problems in the cell probe model. We derive two generally applicable methods of proving lower bounds and obtain several applications. These applications include new lower bounds for data structures in the cell probe model. Of particular interest is our “round elimination” lemma, which is interesting also for the usual symmetric communication case. This lemma generalizes and abstracts in a very clean form the “round reduction” techniques used in many previous lower bound proofs.

Speaker: Zhixuan Yang (National Institute of Informatics, Japan)

Random number generators: good ones are hard to find.

Park and Miller In Communications of the ACM (1988).

Practical and theoretical issues are presented concerning the design, implementation, and use of a good, minimal standard random number generator that will port to virtually all systems.

Speaker: Zirun Zhu (National Insititute of Informatics, Japan)

Hijacking Bitcoin: routing attacks on cryptocurrencies

Apostolaki et al. In IEEE Symposium on Security and Privacy (SP 2017).

As the most successful cryptocurrency to date, Bitcoin constitutes a target of choice for attackers. While many attack vectors have already been uncovered, one important vector has been left out though: attacking the currency via the Internet routing infrastructure itself. Indeed, by manipulating routing advertisements (BGP hijacks) or by naturally intercepting traffic, Autonomous Systems (ASes) can intercept and manipulate a large fraction of Bitcoin traffic. This paper presents the first taxonomy of routing attacks and their impact on Bitcoin, considering both small-scale attacks, targeting individual nodes, and large-scale attacks, targeting the network as a whole. While challenging, we show that two key properties make routing attacks practical: (i) the efficiency of routing manipulation; and (ii) the significant centralization of Bitcoin in terms of mining and routing. Specifically, we find that any network attacker can hijack few (<;100) BGP prefixes to isolate ~50% of the mining power-even when considering that mining pools are heavily multi-homed. We also show that on-path network attackers can considerably slow down block propagation by interfering with few key Bitcoin messages. We demonstrate the feasibility of each attack against the deployed Bitcoin software. We also quantify their effectiveness on the current Bitcoin topology using data collected from a Bitcoin supernode combined with BGP routing data. The potential damage to Bitcoin is worrying. By isolating parts of the network or delaying block propagation, attackers can cause a significant amount of mining power to be wasted, leading to revenue losses and enabling a wide range of exploits such as double spending. To prevent such effects in practice, we provide both short and long-term countermeasures, some of which can be deployed immediately.

Speaker: Chunmiao Li (National Insititute of Informatics, Japan)

Hygienic macro expansion

Kohlbecker et al. In ACM Conference on LISP and functional programming (LFP 1986).

Macro expansion in current Lisp systems is naive with respect to block structure. Every macro function can cause the capture of free user identifiers and thus corrupt intended bindings. We propose a change to the expansion algorithm so that macros will only violate the binding discipline when it is explicitly intended.

Speaker: Liye Guo (National Institute of Informatics, Japan)

Contracts for higher-order functions

Findler and Felleisen In ICFP 2002.

Assertions play an important role in the construction of robust software. Their use in programming languages dates back to the 1970s. Eiffel, an object-oriented programming language, wholeheartedly adopted assertions and developed the "Design by Contract" philosophy. Indeed, the entire object-oriented community recognizes the value of assertion-based contracts on methods.In contrast, languages with higher-order functions do not support assertion-based contracts. Because predicates on functions are, in general, undecidable, specifying such predicates appears to be meaningless. Instead, the functional languages community developed type systems that statically approximate interesting predicates.In this paper, we show how to support higher-order function contracts in a theoretically well-founded and practically viable manner. Specifically, we introduce λcon, a typed lambda calculus with assertions for higher-order functions. The calculus models the assertion monitoring system that we employ in DrScheme. We establish basic properties of the model (type soundness, etc.) and illustrate the usefulness of contract checking with examples from DrScheme's code base.We believe that the development of an assertion system for higher-order functions serves two purposes. On one hand, the system has strong practical potential because existing type systems simply cannot express many assertions that programmers would like to state. On the other hand, an inspection of a large base of invariants may provide inspiration for the direction of practical future type system research.

Speaker: Zhenjiang Hu (National Institute of Informatics, Japan)


The camp starts from 15:00
If you plan to have lunch in Tokyo, we recommend the following plans.
If you plan to have lunch in Zushi, the following travel plans will work:
In eithr case
14:05, get on the Keikyu bus NO.16 (逗16) at bus platform 1 and get off at Shonan Kokusaimura Center Mae.

Timetable for Keikyu bus No.16 (逗16) from Shonan Kokusaimura Center Mae (湘南国際村センター前) to Zushi (逗子) station on
August 11, after lunch: 13:45 , 14:45 , 15:45 , 16:45 , 17:45 , 18:31

Trains from Zushi to Tokyo/Shinjuku stations are somehow dense, for every 10 ~ 15 minutes.


If you do not want to make any transfer, use the following trains:
To Tokyo:
14:42, JR Yokosuka line, heading for Tokyo, platform 2
14:54, JR Yokosuka line, heading for Kazusa-Ichinomiya (上総一ノ宮), platform 1
...
15:53, JR Yokosuka line, heading for Kazusa-Ichinomiya (上総一ノ宮), platform 1
16:09, JR Airport line, heading for Narita / Narita Airport (成田・成田空港), platform 1
...


To Shinjuku:
14:33, JR Shonan-Shinjuku line, heading for Utsunomiya (宇都宮), platform 1
15:02, JR Shonan-Shinjuku line (express), heading for Utsunomiya (宇都宮), no platform info
...
15:32, JR Shonan-Shinjuku line, heading for Koganei (小金井), platform 3
16:02, JR Shonan-Shinjuku line (express), heading for Utsunomiya (宇都宮), platform 2
...