About
Wondering how to unlock your organization's full potential? I help data leaders age 35-50…
Services
Articles by Dr. Markus
Contributions
Activity
-
DevOps for Startups 🤔 and broadband in Ireland 🤯 💥💣🧨🤯 Running a new tech-startup, building all for scaling and in the cloud, following most…
Liked by Dr. Markus Schmidberger
-
DevOps for Startups 🤔 and broadband in Ireland 🤯 💥💣🧨🤯 Running a new tech-startup, building all for scaling and in the cloud, following most…
Posted by Dr. Markus Schmidberger
-
Being data-driven is the skill that supposedly makes you successful. But in your personal life, it will slow you down. 💣 It took me some years to…
Liked by Dr. Markus Schmidberger
Experience
Education
-
Homodea - Veit Lindau
-
In one year, learn everything you need to take your life to a whole new level and be successful as an integral Life Trust Coach TM. Even if your goal is still fuzzy, in a year you will know exactly how to serve the world as a coach.
-
-
This PhD thesis demonstrates the usefulness of the R language and parallel computing for biological research.
-
-
-
-
Licenses & Certifications
Publications
-
Beziehungsmagie für Männer: 100 Impulse für mehr Verbindung und Liebe
Self Publishing
Book is written in German!
In today's hectic world, strong and fulfilling relationships are crucially important. This workbook is your ticket to a fulfilling relationship with yourself and to a partnership built on love, connection and magic.
A book for men who are looking for ways to deepen and strengthen their relationship with themselves and their partner. This book offers field-tested impulses and steps for building a lasting connection. From communication to intimacy to conflict resolution, you will discover how to shape a relationship that not only survives but flourishes.
In "Beziehungsmagie für Männer" you will find:
* One personal-development impulse and one relationship impulse for each of 50 weeks. Change begins with yourself, and you are invited to start a journey toward more clarity and authenticity in your relationships.
* Proven techniques for better communication: learn to communicate more effectively, handle conflicts and minimize misunderstandings.
* Practical exercises and instructions: step-by-step guides and exercises that help you put what you have learned into practice and bring about change in your relationships.
* Prompts for inner processes that build a deeper bond: discover how to build a strong emotional connection with yourself and your partner and strengthen your relationship for the long term.
"Beziehungsmagie für Männer" is a book that can transform not only your life but also your partnership. Discover how you can restore and deepen the love, connection and magic in your relationship.
Take the first step toward a fulfilling partnership today.
-
An eight-gene expression signature for the prediction of survival and time to treatment in chronic lymphocytic leukemia
Leukemia
The clinical course of chronic lymphocytic leukemia (CLL) is highly variable, ranging from slow progression and survival for several decades to rapidly progressive and chemotherapy-resistant disease with death within 1 year of diagnosis. The hierarchical model of common genomic aberrations determined by interphase fluorescence in situ hybridization (FISH) and the analysis of the mutational status of the immunoglobulin heavy-chain variable region genes (IGVH status) are broadly used molecular markers to predict the prognosis of CLL patients.
-
Conceptual aspects of large meta-analyses with publicly available microarray data: A case study in oncology
Bioinformatics and Biology Insights
Abstract: Large public repositories of microarray experiments offer an abundance of biological data. It is of interest to use and to combine the available material to create new biological information and to develop a broader view on biological phenomena.
Meta-analyses recombine similar information over a series of experiments to sketch scientific aspects which were not accessible by each of the single experiments. Meta-analysis of high-throughput experiments has to handle methodological as well as technical challenges. Methodological aspects concern the identification of homogeneous material which can be combined by appropriate statistical procedures. Technical challenges come from the data management of large amounts of high-dimensional data, long computation time, as well as the handling of the stored phenotype data.
In a meta-analysis of a large series of microarray experiments, this paper compares the interaction structure within selected pathways between different tumour entities. The feasibility of such a study is explored and a technical as well as a statistical framework for its completion is presented. Multiple obstacles were met during completion of this project. They are mainly related to the quality of the available data and influence the biological interpretation of the results derived.
The sobering experience of our study calls for combined efforts to improve the data quality in public repositories of high-throughput data. The exploration of the available data in large meta-analyses is limited by incomplete documentation of essential aspects of experiments and studies, by technical deficiencies in the data stored, and by careless duplications of data.
-
Hands-on Tutorial for Parallel Computing with R
Springer - Computational Statistics
Due to the increasing availability of powerful hardware resources, parallel computing is becoming an important issue, as a noticeable speedup may be achieved. The statistical programming language R allows for parallel computing on computer clusters as well as multicore systems through several packages. This tutorial gives a short, practical overview of four packages that the authors consider important for parallel computing in R, namely multicore, snow, snowfall and nws. First, the general principle of parallelizing simple tasks is briefly illustrated based on a statistical cross-validation example. Afterwards, the usage of each of the introduced packages is demonstrated on the example. Furthermore, we address some specific features of the packages and provide guidance for selecting an adequate package for the computing environment at hand.
-
Indirect Comparison of Interaction Graphs
Book Chapter: Statistical Modelling and Regression Structures -- Festschrift in Honour of Ludwig Fahrmeir; Kneib, Thomas; Tutz, Gerhard (Eds.)
A strategy for testing differential conditional independence structures (CIS) between two graphs is introduced. The graphs have the same set of nodes and are estimated from data sampled under two different conditions. The test uses the entire pathplot in a Lasso regression as the information on how a node connects with the remaining nodes in the graph.
The interpretation of the paths as random processes allows defining stopping times which make the statistical properties of the test statistic accessible to analytic reasoning. A resampling approach is proposed to calculate p-values simultaneously for a hierarchical testing procedure. The hierarchical testing steps through a given hierarchy of clusters. First, collective effects are measured at the coarsest level possible (the global null hypothesis that no node in the graph shows a differential CIS). If the global null hypothesis can be rejected, finer resolution levels are tested for an effect until the level of individual nodes is reached.
The strategy is applied to association patterns of categories from the ICF in patients under post-acute rehabilitation. The patients are characterized by two different conditions. A comprehensive understanding of differences in the conditional independence structures between the patient groups is pivotal for evidence-based intervention design on the policy, the service and the clinical level related to their treatment. Due to extensive computation, parallel computing offers an effective approach to implement our explorative tool and to locate nodes in a graph which show differential CIS between two conditions.
-
Simulation Study for the Agreement between Statistical Methods in Quality Assessment and Control of Microarray Data
Springer - Computational Statistics
As microarray data quality can affect each step of the microarray analysis process, quality assessment and control is an integral part. It detects divergent measurements beyond the acceptable level of random fluctuations. This empirical study identifies association and correlation between the six quality assessment methods for microarray outlier detection used in the arrayQualityMetrics package version 2.2.2. For evaluation, two different agreement tests (Cohen's Kappa, after a marginal homogeneity criterion, and the AC1 statistic), the Pearson correlation coefficient and realistic microarray data from the public ArrayExpress database have been used. It is possible to assess the quality of a data set using only four of the six currently proposed statistical methods to comprehensively quantify the quality information in large series of microarrays. This saves computation time and reduces decision complexity for the analyst. The newly proposed rule is validated with data sets from biomedical studies.
-
affyPara - a Bioconductor Package for Parallelized Preprocessing Algorithms of Affymetrix Microarray Data
Bioinformatics and Biology Insights
Microarray data repositories as well as large clinical applications of gene expression make it possible to analyse several hundred microarrays at once. The preprocessing of large amounts of microarrays is still a challenge. The algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets will be very time consuming. Here, preprocessing has to be a part of the cross-validation and resampling strategy which is necessary to estimate the rule's prediction quality honestly. This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. Partition of data can be applied on arrays and parallelization of algorithms is a straightforward consequence. The partition of data and distribution to several nodes solves the main memory problems and accelerates preprocessing by up to a factor of 20 for 200 or more arrays. affyPara is a free and open source package, under GPL license, available from the Bioconductor project at www.bioconductor.org. A user guide and examples are provided with the package.
-
Parallel Computing with the R Language in a Supercomputing Environment
Book Chapter: High Performance Computing in Science and Engineering, Garching 2009, Springer
R is an open-source programming language and software environment for statistical computing and graphics. During the last decade a great deal of research has been conducted on parallel computing techniques with the R language. Two packages (snow and Rmpi) stand out as particularly useful for general use on computer clusters and the multicore package for the use on multi-core machines.
This article describes the operation of the R language at the supercomputer HLRB2 hosted at the Leibniz-Rechenzentrum in Munich, Germany. Additionally, a small benchmark is provided and the article explains and discusses two parallel biostatistical applications calculated at the HLRB2. The indirect comparison of interaction graphs example outlines the requirements for more than 10,000 processors.
-
State of the Art in Parallel Computing with R
Journal of Statistical Software
R is a mature open-source programming language for statistical computing and graphics. Many areas of statistical research are experiencing rapid growth in the size of data sets. Methodological advances drive increased use of simulations. A common approach is to use parallel computing.
This paper presents an overview of techniques for parallel computing with R on computer clusters, on multi-core systems, and in grid computing. It reviews sixteen different packages, comparing them on their state of development, the parallel technology used, as well as on usability, acceptance, and performance.
Two packages (snow, Rmpi) stand out as particularly suited to general use on computer clusters. Packages for grid computing are still in development, with only one package currently available to the end user. For multi-core systems five different packages exist, but a number of issues pose challenges to early adopters. The paper concludes with ideas for further developments in high performance computing with R. Example code is available in the appendix.
-
Book review: Introduction to Machine Learning and Bioinformatics
Journal of Statistical Software
-
Parallelized preprocessing algorithms for high-density oligonucleotide array data
22nd International Parallel and Distributed Processing Symposium (IPDPS 2008)
Studies of gene expression using high-density oligonucleotide microarrays have become standard in a variety of biological contexts. The data recorded using the microarray technique are characterized by high levels of noise and bias. These errors have to be removed, so preprocessing of raw data has been a research topic of high priority over the past few years. Current research and computations are limited by the available computer hardware. Furthermore, most of the existing preprocessing methods are very time consuming. To solve these problems, the potential of parallel computing should be used. For parallelization on multicomputers, the communication protocol MPI (message passing interface) and the R language will be used. This paper proposes the new R language package affyPara for parallelized preprocessing of high-density oligonucleotide microarray data. Partition of data can be done on arrays, so parallelization of algorithms becomes intuitively possible. The partition of data and distribution to several nodes solves the main memory problems and accelerates the methods by up to a factor of ten.
Honors & Awards
-
BARC Best Practice Award für Business Intelligence und Analytics 2018
BARC
* Scout24 wins in the category "medium-sized businesses" with a comprehensive transformation approach to data-driven work
* Expert jury recognizes approach that encompasses change in technology, organization and corporate culture
* Markus Schmidberger, Head of Data Technology at Scout24: "Data-controlled work is one of the central corporate values of Scout24. Thanks to our new data organisation, we are enabling more and more employees to evaluate data independently. This also benefits our users, customers and partners".
https://www.scout24.com/en/Press-Media/News-Archive/Scout24-wins-BARC-Best-Practice-Award.aspx
https://barc.de/news/wmf-und-scout24-gewinnen-den-barc-best-practice-award-fur-business-intelligence-und-analytics-2018
-
Gartner Data & Analytics Excellence Award for Best Data Management and Infrastructure
Gartner Inc.
Gartner, Inc. has announced the winners of the Gartner Data & Analytics Excellence Awards. The awards recognize excellence in data and analytics technology to drive best-in-class initiatives.
Six winners and 12 finalists were chosen from a pool of 152 submissions across all six categories. While the criteria were different for each category, all submissions were assessed by a team of Gartner analysts, and honorees were selected by benchmarking against world-class performance standards. Gartner looked for submissions with a strong organizational and leadership component, effective use of modern technologies, and most of all, clear business outcomes.
glomex: The global media exchange based in Germany built a scalable data management infrastructure through big data investments, enabling it to meet customer demand.
http://www.gartner.com/newsroom/id/3591318
Recommendations received
-
LinkedIn User
17 people have recommended Dr. Markus
More activity by Dr. Markus
-
🎄 5 must-reads on Data Governance for Christmas break : 1️⃣ High impact Data Governance teams : what do they do? https://shorturl.at/1fdKD 2️⃣…
Liked by Dr. Markus Schmidberger
-
Being data-driven is the skill that supposedly makes you successful. But in your personal life, it will slow you down. 💣 It took me some years to…
Shared by Dr. Markus Schmidberger