Hiroyuki Yamada

Japan
672 followers, 500+ connections

Experience

  • Scalar, Inc.

    Within 23 wards, Tokyo, Japan

  • -

    Tokyo, Japan

  • -

  • -

  • -

    Within 23 wards, Tokyo, Japan

  • -

    Tokyo, Japan

  • -

    Tokyo, Japan

  • -

    Tokyo, Japan

  • -

    Tokyo, Japan

Education

  • The University of Tokyo

    Worked on high-performance parallel and distributed database systems and proposed a parallel database with a new execution mechanism that achieves more than 10x (in some cases more than 100x) the performance of state-of-the-art systems on a 128-node cluster.
    Also applied the mechanism to the Hadoop ecosystem and column-store databases.

  • Proposed a scalable dynamic indexing method for inverted indexes in a modern server equipped with multi-core CPUs and multiple hard drives, and
    achieved sub-linear scalability with such hardware resources without query performance degradation.

  • Proposed a relaxation method for optimal resource allocation on a large-scale probabilistic variable-length network.

    In 2001, I took a year off and stayed in Canada to learn English and worked at Gap in Vancouver as a Sales Associate.

Publications

  • LakeHarbor: Making Structures First-Class Citizens in Data Lakes

    IEEE ICDE (Special Track)

    Authors: Hiroyuki Yamada, Masaru Kitsuregawa, and Kazuo Goda

    Abstract: This paper introduces LakeHarbor, a new data management paradigm that makes structures (e.g., indexes) first-class citizens in data lakes. The LakeHarbor paradigm enables a data lake system to flexibly construct structures based on registered access method functions and execute data processing jobs efficiently with the potential parallelism that the structures inherently hold by exploiting the functions while not sacrificing flexible data processing such as schema-on-read. This paper also presents ReDe, a prototype data processing engine that implements LakeHarbor, and a motivating evaluation and a case study of ReDe to explore the potential of LakeHarbor.

  • ScalarDB: Universal Transaction Manager for Polystores

    PVLDB (VLDB'23)

    Authors: Hiroyuki Yamada, Toshihiro Suzuki, Yuji Ito, and Jun Nemoto

    Abstract: This paper presents ScalarDB, a universal transaction manager that achieves distributed transactions across multiple disparate databases. ScalarDB provides a database-agnostic transaction manager on top of its database abstraction; thus, it achieves transactions spanning various databases without depending on the transactional capability of underlying databases. ScalarDB is based on several research works and extended to provide a strong correctness guarantee (i.e., strict serializability), further performance optimizations, and several critical mechanisms for productization. In this paper, we describe the design and implementation of ScalarDB. We also present evaluation results showing that ScalarDB achieves database-spanning transactions with reasonable performance and near-linear scalability without sacrificing correctness. Finally, we share some case studies and lessons learned while building and running ScalarDB.

  • Nested Loops Revisited Again

    IEEE ICDE (Special Track)

    Authors: Hiroyuki Yamada, Kazuo Goda, and Masaru Kitsuregawa

    Abstract: Hash joins and sort-merge joins have been considered the algorithms of choice for analytical relational queries in most parallel database systems because of their performance robustness and ease of parallelization. On the other hand, nested loop joins have been considered less attractive and are conservatively used. In this paper, we revisit the potential of nested loop joins in a cluster environment. We focus on exploring the parallelism aspect of nested loop joins because there could still be space for improvement by fully exploiting the parallelism of current commodity hardware, which could handle more than thousands of concurrent IOs. We also introduce scalable massively-parallel execution as one of the approaches for achieving massive parallelism in nested loop joins to explore how it widens the potential benefit of nested loop joins. Finally, we discuss future research directions based on our exploration.

  • Scalar DL: Scalable and Practical Byzantine Fault Detection for Transactional Database Systems

    PVLDB (VLDB'22)

    Authors: Hiroyuki Yamada, Jun Nemoto

    Abstract: This paper presents Scalar DL, a Byzantine fault detection (BFD) middleware for transactional database systems. Scalar DL manages two separately administered database replicas in a database system and can detect Byzantine faults in the database system as long as either replica is honest (not faulty). Unlike previous BFD works, Scalar DL executes non-conflicting transactions in parallel while preserving a correctness guarantee. Moreover, Scalar DL is database-agnostic middleware so that it achieves the detection capability in a database system without either modifying the databases or using database-specific mechanisms. Experimental results with YCSB and TPC-C show that Scalar DL outperforms a state-of-the-art BFD system by 3.5 to 10.6 times in throughput and works effectively on multiple database implementations. We also show that Scalar DL achieves near-linear (91%) scalability when the number of nodes composing each replica increases.

  • Out-of-order Execution of Database Queries

    PVLDB (VLDB'20)

    Authors: Kazuo Goda, Yuto Hayamizu, Hiroyuki Yamada, and Masaru Kitsuregawa

    Abstract: Intra-query parallelism is a key for database software to offer acceptable responsiveness for data-intensive queries. Many researchers have studied how to achieve greater execution parallelism for database queries. Partitioning is a representative approach, which divides a query into multiple sub-tasks and executes them in parallel. However, given a new query, optimal division is not necessarily obvious. Database software utilizes heuristic rules or statistical information to decide how to divide the query before execution. As yet another approach to achieve execution parallelism, this paper presents out-of-order database execution (OoODE), a massively-parallel query execution method to offer significant speedup for database queries consistently. OoODE dynamically decomposes query work by making the best use of the exact knowledge of the potential execution parallelism for each operation ready to be performed during query execution. With OoODE, the database software is allowed to automatically squeeze out the execution parallelism that the query inherently holds. Hence, for a wide spectrum of queries, OoODE performs significantly faster than the serial (non-parallelized) execution, while it performs better than or comparably with alternative parallelizing methods without the need for dividing the query before execution. This paper presents the experiments that we conducted using the prototyped database software and demonstrates that OoODE is two to three orders of magnitude faster than the serial execution, whereas it is substantially (up to 2.07 times) faster than the best achievable case of partitioning. Besides, OoODE performs two to four orders of magnitude faster than major DBMSs.

  • What’s So Different about Blockchain? — Blockchain is a Probabilistic State Machine

    ICDCS Workshops

    Authors: Kenji Saito, Hiroyuki Yamada

    Abstract: Blockchain is a distributed timestamp server technology introduced for realization of Bitcoin, a digital cash system. It has been attracting much attention especially in the areas of financial and legal applications. But such applications would fail if they are designed without knowledge of the fundamental differences in blockchain from existing technology. We show that blockchain is a probabilistic state machine in which participants can never commit on decisions, we also show that this probabilistic nature is necessarily deduced from the condition where the number of participants remains unknown. This work provides useful abstractions to think about blockchain, and raises discussion for promoting the better use of the technology.

  • Scalable Online Index Construction with Multi-core CPUs

    ADC

    Authors: Hiroyuki Yamada, Motomichi Toyama

    Abstract: Inverted index is a core element of current text retrieval systems. They can be dynamically constructed using online indexing approaches in the environment which even a small delay in timeliness cannot be tolerated, and the index must always be queryable and up to date. Recently, efficient online index construction schemes have been proposed, however, previous works have not focused on scalability with the modern commodity hardware resources such as multi-core CPUs. In this paper, we propose a scalable online index construction method that better utilizes multi-core CPUs. Using experiments on 30 GB of web data, we demonstrate the efficiency of our method in practice, showing that it dramatically reduces online index construction time without sacrificing query performance.


Honors & Awards

  • Best Paper Award in The IEICE Transactions on Information and Systems

    The Institute of Electronics, Information and Communication Engineers

    Awarded for a paper on a fast and scalable parallel processing engine.

  • Gold Prize in Rakuten Technology Award 2014

    Rakuten, Inc

    For the development of the fastest database engine for the era of very large databases, with Masaru Kitsuregawa, Kazuo Goda, and Yuto Hayamizu.

  • Best Paper Award in DEIM 2014

    The Institute of Electronics, Information and Communication Engineers (IEICE), Japan

    Six papers were selected out of more than 200.

  • Best Ph.D. Award in the Information and Communication Engineering Department at The University of Tokyo

    The University of Tokyo

  • Best Poster Award in DEIM 2014

    The Institute of Electronics, Information and Communication Engineers (IEICE), Japan

  • International Conference Encouragement Award

    Keio University

  • Super Developer/Programmer Award

    IPA (Ministry of Economy, Trade and Industry in Japan)

    I worked on the development of the distributed search engine Lux.
    I was one of 7 developers selected out of 190 nominees.

Languages

  • English

    Full professional proficiency

  • Japanese

    Native or bilingual proficiency
