March 19-21, 2018, Kyoto University, Japan
|March 19||March 20||March 21|
|9:00||(start at 9:15)||
The Inversion Problem for Data Science (tentative)
Rakesh Agrawal (Data Insights Laboratories / Kyoto University)
|(start at 9:30)|
Can Programming be Liberated from Unidirectional Programming? -- An Overview of the BISCUITS Project
Zhenjiang Hu (National Institute of Informatics)
Observing SQL Queries in their Natural Habitat
Torsten Grust (Universität Tübingen)
Towards a practical checker for well-behavedness of bidirectional transformations
Keisuke Nakano (The University of Electro-Communications)
Collaborative Transportation: Towards Privacy-preserving Ridesharing Alliance
Yasuhito Asano (Kyoto University)
On Software Foundations for Data Interoperability
Guang R. Gao (University of Delaware)
Principles of inverse computation
Robert Glück (University of Copenhagen)
Framework for data integration -- Work in progress --
Makoto Onizuka (Osaka University)
Condition checking in reversible programming language
Kanae Tsushima (National Institute of Informatics)
Data Integration: Observations from the last 20 years
Alon Halevy (Recruit Institute of Technology)
Bidirectional Information Systems for Data Sharing
Privacy preserving federated learning with distributed data
Xiong Li (Emory University)
Bidirectional programming: developments and prospects
Hsiang-Shang Ko (National Institute of Informatics)
Personal Data as “New Oil” and its Market
Masatoshi Yoshikawa (Kyoto University)
Towards reduction of source access in incremental updates of relational views
Soichiro Hidaka (Hosei University)
The Absolute Consistency Problem of Graph Schema Mappings with Uniqueness Constraints
Yasunori Ishihara (Nanzan University)
Determinacy and Rewriting of Tree Transformations
Sebastian Maneth (Universität Bremen)
In-Database Text Mining with Neural Embeddings
Alexander Löser (Beuth University of Applied Sciences Berlin)
Toward Controllable Curation of Scientific Metadata
Toshiyuki Shimizu (Kyoto University)
Order-Sensitive XQuery Rewriting
Hiroyuki Kato (National Institute of Informatics)
Query reformulation for advanced queries
Yuya Sasaki (Osaka University)
Unlike unidirectional programming, bidirectional programming is a new programming paradigm for developing well-behaved bidirectional transformations in order to solve various synchronization problems. In this talk, I'd like to briefly review recent work on bidirectional programming, and give an introduction to our current BISCUITS project, which aims to establish a new software foundation based on bidirectional transformation for controlling, integrating, and coordinating decentralized data.
A bidirectional transformation is a pair of functions, get and put, which must conform to the get/put and put/get laws to be well-behaved. Checking this conformity can be reduced to an equivalence problem for functions. Although the equivalence problem is undecidable in general, it becomes decidable under an appropriate restriction on the class of functions. In this talk, I will report on an implementation of an equivalence checking algorithm for tree-to-string transformations proposed by Seidl, Maneth, and Kemper. This is joint work with Yuta Takahashi.
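As a rough illustration of the two laws mentioned above, the following Python sketch defines a toy get/put pair and checks the laws on sample inputs. The `Contact` record and the helper names are hypothetical and chosen only for this example. Checking a few samples is not the equivalence-checking algorithm discussed in the talk.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Contact:
    """Hypothetical source record; the view exposes only the name."""
    name: str
    email: str


def get(source: Contact) -> str:
    # Forward direction: extract the view from the source.
    return source.name


def put(source: Contact, view: str) -> Contact:
    # Backward direction: write an updated view back into the source,
    # keeping the part of the source that the view does not show.
    return replace(source, name=view)


def get_put_holds(source: Contact) -> bool:
    # get/put law: putting back an unmodified view leaves the source unchanged.
    return put(source, get(source)) == source


def put_get_holds(source: Contact, view: str) -> bool:
    # put/get law: the view read after a put is exactly the view that was put.
    return get(put(source, view)) == view


if __name__ == "__main__":
    s = Contact("Ada", "ada@example.org")
    print(get_put_holds(s))
    print(put_get_holds(s, "Grace"))
```

Sample-based checks like these can only refute well-behavedness for a given pair. Establishing it for all inputs is the kind of question a static checker would address.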
We are facing the challenges of the end of Moore’s Law, as well as the challenges from applications in intelligent data analytics and machine learning. The speaker believes that it may be time to initiate a new forum encouraging broader participation and direct interaction of scientists and engineers working on computer system architecture, system software (including compilers, runtime systems, and OS), and high-level programming models and methodology. This talk will provide a personal reflection from the speaker’s many years of research, practice, and experience with parallel models of computation and systems: dataflow and beyond. The talk will focus on the software foundation of program composability and scalability in large-scale parallel and distributed systems.
Data integration is a field that is constantly facing new challenges due to technology and social trends. I will give an overview of some of the stages in the evolution of data integration in the last couple of decades, and then discuss some current challenges. In particular, I'll discuss the need for an open-source data integration eco-system, the challenges in combining structured and unstructured data, and the potential of using data integration in the development of technology that furthers users' wellbeing.
I will give a brief introduction to bidirectional transformations (BXs), talk about some of our recent developments in bidirectional programming (which is about reliable and efficient construction of state-based asymmetric lenses, a particular BX model), and discuss some possible future directions.
Challenges in propagating updates through relational views include reduction of materialization. In the current state of the art of relational lenses, although Bohannon et al.'s approach has been incrementalized by Horn and Cheney, their approach still requires querying source data to reinforce functional dependencies after updates. Reduction of materialization may be achieved in a compositional setting, where the selectivity of the component query near the source is sufficient, so that access to the materialized view by the backward execution of the subsequent queries is reduced accordingly. A more promising approach could be to translate the above reinforcement through each relational incremental lens. In this talk, we describe our initial attempt for selection and projection lenses.
In the first part of the talk we give an overview of results known from the database community concerning view/query determinacy and rewriting. We then present new results for these problems where views and queries are tree transformations. As it turns out, some of our techniques are quite different and hopefully shed new light on these classical problems.
The reason why XML/XQuery ended up failing in practice lies in its "heavily ordered" nature, as presented at the FADS@VLDB Workshop 2017 (Failed Aspirations in Database Systems). In this talk, I will show how to escape from the "heavily ordered" world even if the input query itself has an order-sensitive feature.
On the heels of rapid advances in the understanding of nucleic acids (genomics) and proteins (proteomics), glycomics, the study of glycan expression in biological systems, has emerged as the next frontier in the molecular biology revolution. A key problem in glycomics is the inversion problem: Given a database of individual glycan characteristics and an unknown mixture of glycans, discover the composition of the mixture, i.e. determine which glycans are present in the mixture and in what proportion. We present a mathematical optimization formulation of this problem and present some early experimental results. Our formulation is quite general and is likely to find application in many domains beyond glycomics. We also point out some hard open problems.
(Joint work with Odysseas Papapetrou and Thomas R. Rizzo, done at the DIAS Lab, EPFL, Switzerland).
Collaborative transportation, including ridesharing services like Uber and Lyft, has become popular. We have researched several problems in collaborative transportation in recent years. In this talk, we will introduce these problems briefly. In particular, we will explain a data integration problem for a privacy-preserving ridesharing alliance system. This problem is also a running example of our vision paper.
This talk surveys fundamental concepts for inverse programming and presents reversible computing, a special case that has received growing attention during the past years. We discuss the key concepts of program inversion and illustrate an algorithm. Program inversion and inverse interpretation are two sides of the same coin, and we discuss their relation by the Futamura projections.
We describe Habitat, a declarative observational debugger for SQL. Habitat facilitates true language-level (not: plan-level) debugging of probably flawed SQL queries that yield unexpected results. Users mark SQL subexpressions, ranging from literals, over fragments of predicates, to entire subquery blocks, to observe whether these evaluate as expected. From the marked SQL text, Habitat derives a new query whose result represents the desired observations.
To prevent debugger users from "drowning in a sea of large observations," we build on data provenance to help find reduced database instances that still exercise the query's flaw but lead to smaller observations. We sketch the derivation of fine-grained where- and why-provenance for rich query dialects based on a non-standard interpretation of the SQL semantics (realized in SQL itself).
I will talk about the current status of the framework team. Our framework supports bi-directional transformation across/inside database servers and analysis over integrated views. We employ triggers and FDW for update propagation of the bi-directional transformation, and have implemented a ride-sharing alliance system as an application. In addition, I would like to discuss technical challenges for the next step, in particular, efficient data analysis over integrated views.
A reversible programming language supports deterministic forward and backward computation. Among other things, this means that the inverse can be calculated for free. For reversible computation, reversible programming languages need additional assertions for if-expressions and while-loops. In current systems, these assertions are checked dynamically. In this presentation, we introduce a static analysis of these assertions.
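To make the role of these assertions concrete, here is a small Python sketch of a reversible conditional with a dynamically checked exit assertion, in the style of reversible languages such as Janus. The function names and the integer state are assumptions made for this illustration. It shows only the dynamic check, not the static analysis proposed in the talk.

```python
from typing import Callable

State = int
Pred = Callable[[State], bool]
Step = Callable[[State], State]


def rev_if_forward(state: State, test: Pred, then_fwd: Step,
                   else_fwd: Step, exit_assert: Pred) -> State:
    # Forward run: the entry test selects the branch.
    took_then = test(state)
    state = then_fwd(state) if took_then else else_fwd(state)
    # The exit assertion must be true exactly when the then-branch ran;
    # otherwise the backward run could not tell which branch to undo.
    if exit_assert(state) != took_then:
        raise AssertionError("exit assertion disagrees with the branch taken")
    return state


def rev_if_backward(state: State, test: Pred, then_bwd: Step,
                    else_bwd: Step, exit_assert: Pred) -> State:
    # Backward run: the exit assertion now selects which branch to undo.
    took_then = exit_assert(state)
    state = then_bwd(state) if took_then else else_bwd(state)
    # The original entry test plays the role of the assertion here.
    if test(state) != took_then:
        raise AssertionError("entry test disagrees with the branch undone")
    return state


# Hypothetical example: add 10 to non-negative values, subtract 10 otherwise.
def is_non_negative(x: State) -> bool:
    return x >= 0


def at_least_ten(x: State) -> bool:
    return x >= 10


def add_ten(x: State) -> State:
    return x + 10


def sub_ten(x: State) -> State:
    return x - 10


if __name__ == "__main__":
    y = rev_if_forward(3, is_non_negative, add_ten, sub_ten, at_least_ten)
    x = rev_if_backward(y, is_non_negative, sub_ten, add_ten, at_least_ten)
    print(y, x)
```

In the backward direction the inverse steps are passed in swapped roles: the then-branch is undone with `sub_ten` and the else-branch with `add_ten`. A static analysis would aim to show ahead of time that the two runtime checks can never fail.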
In the big data era, personal data is perceived as a new oil or currency in the digital world. Both public and private sectors wish to use such data for studies and businesses. However, access to such data is restricted due to privacy issues. Given the commercial opportunities in the gap between demand and supply, the notion of a personal data market has been introduced. We describe opportunities and challenges of personal data markets.
A schema mapping is a formal representation of the correspondence between source and target instances in a data exchange setting. Schema mappings have been extensively studied so far in relational and XML databases. However, in graph databases, they have not received much attention yet. A given schema mapping is said to be absolutely consistent if every source instance has a corresponding target instance. Absolute consistency is an important property because it guarantees that data exchange never fails for any source instance. In this talk, we define schema mappings for graph databases with uniqueness constraints. Our graph databases consist of nodes, edges, and properties, where a property is a key-value pair and gives detailed information to nodes. A uniqueness constraint guarantees the uniqueness of specified properties in the whole graph database, and therefore, is useful for realizing the functionality of primary keys in graph databases. Next, in this talk, we show that the absolute consistency problem is in coNP in general. Then, we present four subclasses of graph schema mappings for which absolute consistency is decidable in polynomial time. Lastly, we present a subclass of graph schema mappings for which absolute consistency is coNP-hard.
We present a novel architecture, In-Database Entity Linking (IDEL), in which we integrate the analytics-optimized RDBMS MonetDB with neural text mining abilities. Our system design abstracts core tasks of most neural entity linking systems for MonetDB. To the best of our knowledge, this is the first de facto implemented system integrating entity linking in a database. In this talk I will briefly discuss my personal research history of Text Mining in Databases, reaching from IBM SystemT, over SAP HANA, Cloudera IMPALA, EXASOL to finally MonetDB and IDEL. I will also highlight important application scenarios learned during our work with major industry partners in Europe.
For the management of scientific data, metadata plays an important role. Generally, scientific metadata are in different formats depending on the target domains, and are managed by different organizations. In this talk, I will show the current situation of scientific metadata management with the case study of DIAS, and discuss some possible scenarios for controllable curation of scientific metadata by applying BX techniques.
Query reformulation is an essential technique for sharing data stored in local databases. In my talk, I explain the necessity of query reformulation techniques for advanced queries, such as ranking and approximate queries, from the perspective of data sharing systems.