Machine learning (ML) in metabolomics
Time:2021-11-10

Introduction

Machine learning (ML) techniques have been instrumental in advancing metabolomics research. As mass spectrometers become more sensitive, the number of features detected in untargeted experiments grows rapidly, and ML methods such as principal component analysis (PCA), linear discriminant analysis (LDA), and autoencoders make it easier to reveal the global characteristics of a sample. ML is not limited to downstream bioinformatic data analysis. It can also be applied to front-end data processing (such as peak alignment of mass spectra and automated targeted annotation in mass spectrometry) and to research on disease risk screening and early diagnosis, especially for complex diseases in which genetics and environment combine, such as cancer, cardiovascular diseases and metabolic diseases. ML is widely used as a powerful tool for processing large amounts of data and building research models. From time to time, Metanotitia Health will introduce knowledge about ML, so please stay tuned.

Principal component analysis (PCA)

Principal component analysis (PCA) is one of the most commonly used methods in metabolomics analysis. It applies an orthogonal linear transformation to observations of a set of potentially correlated variables, projecting them onto a set of linearly uncorrelated variables known as principal components. The transformation is defined so that the greatest variance of any projection of the data lies along the first coordinate (the first principal component, or PC1), the second greatest variance lies along the second coordinate (the second principal component, or PC2), and so on. In simple terms, the method shows the main components of the data by reducing the dimensionality of high-dimensional features. As shown in the figure below, the second, third and fourth planes each show some of the features of the house, but none of them alone reveals the full picture of the house.

Figure 1 (Template source: https://templates.office.com/en-us/fabrikamresidences-the-ultimate-in-modern-livingtm16411224)
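
The projection described above can be sketched in a few lines of Python. This is a minimal illustration using NumPy's singular value decomposition, not the procedure of any particular metabolomics package. The function name `pca` and the simulated data (20 samples, 50 features driven by two hidden factors) are assumptions made for the example.

```python
import numpy as np

def pca(X, n_components=2):
    """Project the rows of X onto the top principal components via SVD."""
    X_centered = X - X.mean(axis=0)  # PCA works on mean-centred features
    # Rows of Vt are orthonormal directions, ordered by decreasing variance
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]            # PC1, PC2, ... as unit vectors
    scores = X_centered @ components.T        # sample coordinates on the PCs
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, components, explained_ratio[:n_components]

# Hypothetical data: 20 samples x 50 features driven by two hidden factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 2)) * np.array([5.0, 2.0])
loadings = rng.normal(size=(2, 50))
X = latent @ loadings + 0.1 * rng.normal(size=(20, 50))

scores, components, ratio = pca(X, n_components=2)
print("Share of variance captured by PC1 and PC2:", ratio)
```

Plotting the first column of `scores` against the second gives the familiar PCA score plot, in which each point is one sample.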

Partial Least Squares Regression (PLS)

A limitation of principal component analysis (PCA) is that the rotation and dimensionality reduction it uses to capture the maximum variation in X do not guarantee latent features that are predictive of Y. In other words, the PCA directions are dominated by the variables with the largest variance, whether or not they are related to the response, while low-variance variables contribute little to the leading principal components. Therefore supervised methods such as partial least squares regression (PLS), and in particular PLS discriminant analysis (PLS-DA) for discrete responses, are often used to deal with more complex feature differentiation. The basic objective is to project the data into a space of latent variables in such a way as to maximise the covariance between the feature space X and the response Y. This allows regression modelling to be carried out even in the presence of significant multicollinearity among the independent variables.
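
The covariance-maximising projection can be illustrated with a minimal single-response PLS (PLS1) sketch in NumPy, used here in the PLS-DA manner by coding two groups as 0 and 1. The function names `pls1_fit` and `pls1_predict`, the simulated data, and the 0.5 decision threshold are all assumptions made for the example. Real studies would normally rely on an established implementation and validate the model on held-out samples.

```python
import numpy as np

def pls1_fit(X, y, n_components=2):
    """Fit a single-response PLS model by successive deflation."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # latent scores for this component
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # regression of y on the scores
        Xk = Xk - np.outer(t, p)      # remove the explained part of X
        yk = yk - c * t               # remove the explained part of y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)  # coefficients in feature space
    return coef, x_mean, y_mean

def pls1_predict(X, coef, x_mean, y_mean):
    return (X - x_mean) @ coef + y_mean

# Hypothetical data: 40 samples, 60 strongly collinear features.
# Factor 0 has high variance but is unrelated to the groups;
# factor 1 has low variance and carries the group difference.
rng = np.random.default_rng(1)
labels = np.repeat([0.0, 1.0], 20)
latent = np.column_stack([
    5.0 * rng.normal(size=40),
    2.0 * (labels - 0.5) + 0.2 * rng.normal(size=40),
])
X = latent @ rng.normal(size=(2, 60)) + 0.1 * rng.normal(size=(40, 60))

coef, x_mean, y_mean = pls1_fit(X, labels, n_components=2)
predicted = pls1_predict(X, coef, x_mean, y_mean) > 0.5  # assumed threshold
accuracy = np.mean(predicted == (labels > 0.5))
print("Training accuracy:", accuracy)
```

Because each weight vector is built from the covariance between X and the response, the low-variance factor that separates the groups is picked up even though a larger, unrelated source of variation is present. The accuracy printed here is measured on the training samples only and says nothing about performance on new data.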

PCA and PLS-DA are common analytical methods in metabolomics research. Together, PCA and PLS/PLS-DA can help researchers assess data quality, identify differences between sample groups, and search for differential metabolites and candidate molecular markers.

This issue marks the start of my journey into machine learning (ML) for metabolomics. In the future, together with the technical experts of Metanotitia, I will share more knowledge of ML technology with interested partners!
