BISCUITS: Bidirectional (information) System for Collaborative, Updatable, Interoperable, and Trusted Sharing

Big data processing is now widely employed in all aspects of our lives. Usually, parts or copies of a huge amount of data are stored in separate locations, and is infeasible to be collected and processed in a centralized manner, as it would be exceedingly inefficient to transfer them over the network. We therefore need new software foundations based on which big data can be efficiently analyzed and shared in a distributed way.

A highly relevant research area is bidirectional transformations, which provide a reliable mechanism for data synchronization. The study of bidirectional transformations originates from the long-standing problem of view updating in databases, and has led to a rich collection of bidirectional languages with new programming models tailored for data synchronization. Despite the potential in solving practical synchronization problems including data interoperability, bidirectional technologies are not widely employed yet, and most applications of bidirectional transformations remain only proof of concept.

In this research, we aim to further develop bidirectional technologies to make them more reliable, scalable, and efficient, so as to establish solid foundations for integration, sharing, and interoperability of autonomic distributed big data.

BiGra: Big-Graphs Querying Made Easy

BiGra is a graph query processing framework implemented in GraphX. It aims to provide a platform to execute navigational queries and transformations on big graphs. Its core is built based on a solid foundation of strutural recursion on graphs and program transformations, making it a reliable and efficient framework.

Putback-based bidirectional transformations

Bidirectional transformations developed rapidly in the last ten years and have seen many interesting approaches as well as applications, most of which however are get-based in the sense that the system asks users to write programs describing the get and intends to derive a backward put for free. Nevertheless, the major problem with the get-based approach is that there are usually more than one incomparable puts for a get and thus the backward behaviour of a get-based program becomes ambiguous and unpredictable. Distinct from others, we focus on the putback-based approach, under which there is at most one corresponding get program satisfying round-trip properties according to the input program and thus provide users with full control over the program. A robust core language and several surface languages and applications are developed on the fly.

Bidirectional Programming Languages

Inspired by the view updating techniques, program inversion and transformational programming, we are now working on development of a new general framework that can support bidirectional computation (bidirectional transformation language, bidirectionalization transformation, and bidirectionality analysis), and focus on its application to XML/graph processing and to bidirectional model transformation in software engineering. The BiG Project contains more information of design and implementation of a bidirectional graph transformation language.

High Level Parallel Programming and Parallelization

The problems involved in developing efficient and correct parallel programs have proved much harder than those in developing efficient sequential ones, both for programmers and for compilers. We have made the first attempt to construct a calculational framework for parallelization of sequential programs, including a novel inductive synthesis lemma and a calculation algorithm for parallelization of general recursive functions. Being constructive, this work has proved to be useful both in parallelizing compiler and in parallel programming.

Program Transformation and Optimization in Calculational Form

Correctness-preserving program transformation plays an important role in compiler optimization in functional programming, and it calls for a more systematic study. We have demonstrated that many important program transformations such as deforestation (or fusion), tupling, accumulation, and IO swapping calculation, can be effectively and elegantly formalized in calculational form. In general, program transformation in calculational form enjoys the nice properties of modularity and cheap implementation, which can be embedded in compilers of functional languages.
In addition, program calculation is very helpful in developing efficient algorithms. We have developed some systematic derivation strategies with which we derived several new algorithms: a new divide-and-conquer algorithms for solving two-dimensional version of maximal segment sum problem, a novel efficient parallel program for bracket matching, a completely new algorithm for computing frequent sets in data mining. In particular, we proposed a generic algorithm for solving the maximum marking problems, a class of useful optimization problems.

Bidiectionalization of Graph Transformation Languages (and related studies)

Bidirectional transformation has been studied for decades in different research fields, and gaining more and more attentions lately, for example, in model driven development, where modifications are desired to be propagated not only from upstream to downstream, but also backwards in the model transformation chains. By representing various artifacts as graphs (tree like data structure that includes confluences and cycles), bidirectionalized graph transformation can support such bidirectional interactions. Our particular approach started from giving bidirectional semantics to an existing graph query language from the database research field, and is lately extending the graph model as well as the expressiveness of transformations. This research topic is also an integral part of the BiG project described above, aiming at providing robust and easy-to-use bidirectional graph programming infrastructure that has desirable and clearly defined behavior, to support evolutionary software systems. However, it also includes various topics that might be of general interests to programming language community such as static analysis, program transformation, performance and cost analysis, optimization and integrated development environments.

Optimization on Database Programming Languages

Fusion is a known technique for improving the efficiency by removing unnecessary intermediate data from the computation. Although it has been applied to optimize query languages such as SQL and object query languages, it remains as a challenge to implement fusion for XQuery optimization. This is because XQuery has more complicated semantics; it is context sensitive and requires preservation of document order. In this project, we propose, as far as we are aware, the first XQuery fusion that can deal with both the document order and the context of XQuery expressions. More specifically, we carefully design a context representation of XQuery expressions based on the Dewey order encoding, develop a context-preserving XQuery fusion for ordered trees by static emulation of the XML store.