Adrianto Wirawan

Greater Cambridge Area
2K followers 500+ connections

About

Adrianto is a results-driven leader, with more than 18 years of experience in leading…

Experience

  • Genomics England

    United Kingdom

  • -

    Cambridge, England, United Kingdom

  • -

    Cambridge, England, United Kingdom

  • -

    Cambridge, United Kingdom

  • -

    Singapore

  • -

    Singapore

  • -

    Mainz Area, Germany

  • -

    Singapore

  • -

    Singapore

Education

  • Nanyang Technological University

    -

    Activities and Societies: SCE, CSA, CPG, PINTU

  • -

    Activities and Societies: SCE, CSA, CPG, Hall 12, PINTU

  • -

    Activities and Societies: OSIS, Pramuka, CC Cup, Central Cup, POR CC

Publications

  • CLOVE: classification of genomic fusions into structural variation events

    BMC Bioinformatics

    A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and so do not collect related breakpoints and classify them as a particular type of SV. Due to the rapidly increasing usage of high throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than one breakpoint or fusion).
    We present CLOVE, an algorithm for integrating the results of multiple breakpoint or SV callers and classifying the results as a particular SV. CLOVE is based on a graph data structure that is created from the breakpoint information. The algorithm looks for patterns in the graph that are characteristic of more complex rearrangement types. CLOVE is able to integrate the results of multiple callers, producing a consensus call.
    We demonstrate using simulated and real data that re-classified SV calls produced by CLOVE improve on the raw call set of existing SV algorithms, particularly in terms of accuracy.

  • HECTOR: A parallel multistage homopolymer spectrum based error corrector for 454 sequencing data

    BMC Bioinformatics

    Current-generation sequencing technologies are able to produce low-cost, high-throughput reads. However, the produced reads are imperfect and may contain various sequencing errors. Although many error correction methods have been developed in recent years, none explicitly targets homopolymer-length errors in the 454 sequencing reads.
    We present HECTOR, a parallel multistage homopolymer spectrum based error corrector for 454 sequencing data. In this algorithm, for the first time we have investigated a novel homopolymer spectrum based approach to handle homopolymer insertions or deletions, which are the dominant sequencing errors in 454 pyrosequencing reads. We have evaluated the performance of HECTOR, in terms of correction quality, runtime and parallel scalability, using both simulated and real pyrosequencing datasets. This performance has been further compared to that of Coral, a state-of-the-art error corrector which is based on multiple sequence alignment and Acacia, a recently published error corrector for amplicon pyrosequences. Our evaluations reveal that HECTOR demonstrates comparable correction quality to Coral, but runs 3.7× faster on average. In addition, HECTOR performs well even when the coverage of the dataset is low.
    Our homopolymer spectrum based approach is theoretically capable of processing arbitrary-length homopolymer-length errors, with a linear time complexity. HECTOR employs a multi-threaded design based on a master-slave computing model. Our experimental results show that HECTOR is a practical 454 pyrosequencing read error corrector which is competitive in terms of both correction quality and speed. The source code and all simulated data are available at: http://hector454.sourceforge.net.

  • CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions

    BMC Bioinformatics

    The maximal sensitivity for local alignments makes the Smith-Waterman algorithm a popular choice for protein sequence database search based on pairwise alignment. However, the algorithm is compute-intensive due to a quadratic time complexity. Corresponding runtimes are further compounded by the rapid growth of sequence databases.
    We present CUDASW++ 3.0, a fast Smith-Waterman protein database search algorithm, which couples CPU and GPU SIMD instructions and carries out concurrent CPU and GPU computations. For the CPU computation, this algorithm employs SSE-based vector execution units as accelerators. For the GPU computation, we have investigated for the first time a GPU SIMD parallelization, which employs CUDA PTX SIMD video instructions to gain more data parallelism beyond the SIMT execution model. Moreover, sequence alignment workloads are automatically distributed over CPUs and GPUs based on their respective compute capabilities. Evaluation on the Swiss-Prot database shows that CUDASW++ 3.0 gains a performance improvement over CUDASW++ 2.0 of up to 2.9× and 3.2×, with a maximum performance of 119.0 and 185.6 GCUPS, on a single-GPU GeForce GTX 680 and a dual-GPU GeForce GTX 690 graphics card, respectively. In addition, our algorithm has demonstrated significant speedups over other top-performing tools: SWIPE and BLAST+.
    CUDASW++ 3.0 is written in CUDA C++ and PTX assembly languages, targeting GPUs based on the Kepler architecture. This algorithm obtains significant speedups over its predecessor, CUDASW++ 2.0, by benefiting from the use of CPU and GPU SIMD instructions as well as the concurrent execution on CPUs and GPUs. The source code and the simulated data are available at http://cudasw.sourceforge.net.

  • Feature Selection for Computer-Aided Angle Closure Glaucoma Mechanism Detection

    Journal of Medical Imaging and Health Informatics

    Selection of relevant features is of fundamental importance in building robust classifiers for computer-aided detection (CAD) of angle closure glaucoma mechanism. Typically one is interested in determining which, of a large number of potentially redundant or noisy features, are most discriminative for classification. The objective of the paper is to exploit machine learning algorithms for automated classification of different angle closure mechanisms based on the quantitative assessment of Anterior Segment Optical Coherence Tomography (AS-OCT). In this paper, we propose an effective combination of Minimum Redundancy Maximum Relevance (MRMR), a mutual information based feature selection method, with AdaBoost, an adaptive boosting classifier, to detect angle closure glaucoma. The proposed method effectively combines the best feature selection and classifier for the problem. A sequential forward search was conducted to determine the optimal feature subset by the proposed criteria. The optimal selected feature subset, which uses only 11.90% of all available features, outperforms the result obtained using all 84 features on our angle closure glaucoma dataset. It also achieves better prediction compared to 4 other methods, i.e., Classification Tree, Support Vector Machine (SVM), Random Forest and Naïve Bayes. The reduced set of features avoids computation of unnecessary features and thus improves the efficiency. Furthermore, 9 out of the 10 selected features have been clinically proven to be important in determining the type of angle closure glaucoma. The algorithm shows promising and encouraging results in detecting and determining the type of angle closure glaucoma, which may help early recognition and treatment of the disease. The reduced complexity of the generated models achieves better generalization and improves the efficiency.

  • Review of Tandem Repeat Search Tools: A Systematic Approach to Evaluating Algorithmic Performance

    Briefings in Bioinformatics

    The prevalence of tandem repeats in eukaryotic genomes and their association with a number of genetic diseases has raised considerable interest in locating these repeats. Over the last 10–15 years, numerous tools have been developed for searching tandem repeats, but differences in the search algorithms adopted and difficulties with parameter settings have confounded many users, resulting in widely varying results. In this review, we have systematically separated the algorithmic aspect of the search tools from the influence of the parameter settings. We hope that this will give a better understanding of how the tools differ in algorithmic performance, their inherent constraints and how one should approach evaluating and selecting them.

  • INVERTER: INtegrated Variable numbER Tandem rEpeat findeR

    Computational Systems-Biology and Bioinformatics

    A tandem repeat in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats occur in the genomes of both eukaryotic and prokaryotic organisms. They are important in numerous fields including disease diagnosis, mapping studies, human identity testing (DNA fingerprinting), sequence homology and population studies. Although tandem repeats have been used by biologists for many years, there are few tools available for performing an exhaustive search for all tandem repeats in a given sequence. In this paper, we present INVERTER, a de novo tandem repeat finder without the need to specify either the pattern or a particular pattern size, integrated with a data visualization tool. INVERTER is implemented in Java and has a built-in user-friendly Graphical User Interface. A standalone version of the program can be downloaded from http://bmserver.sce.ntu.edu.sg/INVERTER. A comparison of INVERTER's search results with those of an existing software tool is presented. The use of INVERTER will assist biologists in discovering new ways of understanding both the structure and function of DNA and protein.

    Other authors
    • Chee Keong Kwoh
    • Li Yang Hsu
    • Tse Hsien Koh
  • INVERTER: INtegrated Variable numbER Tandem rEpeat findeR

    Communications in Computer and Information Science

    A tandem repeat in DNA is a sequence of two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats occur in the genomes of both eukaryotic and prokaryotic organisms. They are important in numerous fields including disease diagnosis, mapping studies, human identity testing (DNA fingerprinting), sequence homology and population studies. Although tandem repeats have been used by biologists for many years, there are few tools available for performing an exhaustive search for all tandem repeats in a given sequence. In this paper, we present INVERTER, a de novo tandem repeat finder without the need to specify either the pattern or a particular pattern size, integrated with a data visualization tool. INVERTER is implemented in Java and has a built-in user-friendly Graphical User Interface. A standalone version of the program can be downloaded from http://bmserver.sce.ntu.edu.sg/INVERTER. A comparison of INVERTER's search results with those of an existing software tool is presented. The use of INVERTER will assist biologists in discovering new ways of understanding both the structure and function of DNA and protein.

  • Multi-threaded vectorized distance matrix computation on the CELL/BE and x86/SSE2 architectures

    Oxford University Press

    Multiple sequence alignment is an important tool in bioinformatics. Although efficient heuristic algorithms exist for this problem, the exponential growth of biological data demands an even higher throughput. The recent emergence of multi-core technologies has made it possible to achieve a highly improved execution time for many bioinformatics applications. In this article, we introduce an implementation that accelerates the distance matrix computation on x86 and Cell Broadband Engine, a homogeneous and heterogeneous multi-core system, respectively. By taking advantage of multiple processors as well as Single Instruction Multiple Data vectorization, we were able to achieve speed-ups of two orders of magnitude compared to the publicly available implementation utilized in ClustalW.

  • High Performance Protein Sequence Database Scanning on the Cell B.E. Processor

    Scientific Programming

    The enormous growth of biological sequence databases has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing rapidly as well. The recent emergence of low cost parallel multicore accelerator technologies has made it possible to reduce execution times of many bioinformatics applications. In this paper, we demonstrate how the Cell Broadband Engine can be used as a computational platform to accelerate two approaches for protein sequence database scanning: exhaustive and heuristic. We present efficient parallelization techniques for two representative algorithms: the dynamic programming based Smith–Waterman algorithm and the popular BLASTP heuristic. Their implementation on a Playstation®3 leads to significant runtime savings compared to corresponding sequential implementations.

  • Pairwise Distance Matrix Computation for Multiple Sequence Alignment on the Cell Broadband Engine

    Lecture Notes in Computer Science

    Multiple sequence alignment is an important tool in bioinformatics. Although efficient heuristic algorithms exist for this problem, the exponential growth of biological data demands an even higher throughput. The recent emergence of accelerator technologies has made it possible to achieve a highly improved execution time for many bioinformatics applications compared to general-purpose platforms. In this paper, we demonstrate how the PlayStation®3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the distance matrix computation utilized in multiple sequence alignment algorithms.

  • CBESW: Sequence Alignment on the Playstation 3

    BMC Bioinformatics

    Background
    The exponential growth of available biological data has caused bioinformatics to be rapidly moving towards a data-intensive, computational science. As a result, the computational power needed by bioinformatics applications is growing exponentially as well. The recent emergence of accelerator technologies has made it possible to achieve an excellent improvement in execution time for many bioinformatics applications, compared to current general-purpose platforms. In this paper, we demonstrate how the PlayStation® 3, powered by the Cell Broadband Engine, can be used as a computational platform to accelerate the Smith-Waterman algorithm.

    Results
    For large datasets, our implementation on the PlayStation® 3 provides a significant improvement in running time compared to other implementations such as SSEARCH, Striped Smith-Waterman and CUDA. Our implementation achieves a peak performance of up to 3,646 MCUPS.

    Conclusion
    The results from our experiments demonstrate that the PlayStation® 3 console can be used as an efficient low cost computational platform for high performance sequence alignment applications.

  • Parallel DNA Sequence Alignment on the Cell Broadband Engine

    Lecture Notes in Computer Science

    Sequence alignment is one of the most important techniques in Bioinformatics. Although efficient dynamic programming algorithms exist for this problem, the alignment of very long DNA sequences still requires significant time on traditional computer architectures. In this paper, we present a scalable and efficient mapping of DNA sequence alignment onto the Cell BE multi-core architecture. Our mapping uses two types of parallelization techniques: (i) SIMD vectorization within a processor and (ii) wavefront parallelization between processors.
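
Several of the publications listed above (CUDASW++ 3.0, CBESW, and the Cell B.E. database scanning work) accelerate the Smith-Waterman algorithm, whose quadratic time complexity is noted in the CUDASW++ 3.0 abstract. As a point of reference only, a minimal scalar sketch of the scoring recurrence could look like the following. It is not code from any of these tools: the function name, the linear gap model, and the scoring values are assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score of a and b.

    Illustrative assumptions: a linear gap penalty and simple
    match/mismatch scores instead of a substitution matrix.
    Time is O(len(a) * len(b)); memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)  # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("ACGT", "ACGT"))
```

The nested loops visit len(a) × len(b) matrix cells, which is the quantity that throughput figures such as MCUPS and GCUPS (cell updates per second) are measured against.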


Patents

  • PLATFORM FOR VISUAL SYNTHESIS OF GENOMIC, MICROBIOME, AND METABOLOME DATA

    Filed US 62/296,986

Honors & Awards

  • RSS Scholarship Award

    Nanyang Technological University

    The RSS Scholarships aim to provide opportunities to a local or international student seeking admission as a full-time candidate pursuing a Doctor of Philosophy (Ph.D.) programme by research at NTU.

    The award is tenable for one year in the first instance and is renewable subject to good progress. The maximum period of the award is up to four years for Ph.D. candidates, subject to good performance and progress, as well as availability of research funding. Upon confirmation of Ph.D. candidature, the student will be required to assist their School in teaching duties for three hours a week.

  • ASEAN Undergraduate Scholarship

    Nanyang Technological University

    The ASEAN Scholarships aim to provide opportunities to the young people of ASEAN to develop their potential and equip them with skills that will enable them to confidently step into the new millennium.

    The ASEAN Scholarships lead to the award of the Singapore-Cambridge General Certificate of Education Advanced Level (GCE A-Level) (or equivalent) certificates. Outstanding students who perform well in the GCE A-Level (or equivalent) Examination may apply for the ASEAN Undergraduate Scholarships offered by the National University of Singapore, Nanyang Technological University, Singapore Management University or Singapore University of Technology and Design.

Languages

  • English

    Native or bilingual proficiency

  • Indonesian

    Native or bilingual proficiency

  • German

    Limited working proficiency

  • Chinese

    Elementary proficiency
