Bringing new possibilities to cancer care! The world's first whole-slide digital pathology model is released

Contributing author: Hanwen Xu (second-year doctoral student at the University of Washington)

In recent years, the rapid development of digital pathology has become an important driver of breakthroughs in precision medicine. In cancer care, whole-slide imaging, which converts tumor tissue samples into high-resolution digital images, has become routine. Gigapixel pathology images contain rich information about the tumor microenvironment, providing unprecedented opportunities for cancer subtyping and diagnosis, survival analysis, and precision immunotherapy.

Recently, the generative AI revolution has provided a powerful way to perceive and analyze the massive amount of information in pathology images. At the same time, breakthroughs in multimodal generative AI will help interpret digital pathology images across multiple temporal and spatial scales and integrate them with other biomedical modalities, giving a better picture of how a patient's disease evolves and assisting doctors in clinical diagnosis and treatment.

However, digital pathology images are large, high-resolution, and complex, which makes it very challenging to process them efficiently and to understand their patterns computationally. Once digitized, each whole slide can contain billions of pixels, with an area more than 100,000 times that of a typical natural image, making it difficult to apply existing computer vision models directly. The computational cost of conventional vision models such as the Vision Transformer grows rapidly with the size of the input image, because self-attention scales quadratically with the number of input tokens. At the same time, clinical data are cross-scale, multimodal, and noisy, and most existing pathology models are built on standard public datasets, which leaves them a long way from real-world application.
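To make the scale concrete, here is a rough back-of-the-envelope sketch. It assumes the 256 × 256 tile size mentioned later in this article and a hypothetical slide size chosen only for illustration; the function names are not from GigaPath.

```python
import math


def tile_count(width_px: int, height_px: int, tile_px: int = 256) -> int:
    """Number of tile_px x tile_px tiles needed to cover a slide."""
    return math.ceil(width_px / tile_px) * math.ceil(height_px / tile_px)


def dense_attention_pairs(num_tokens: int) -> int:
    """Query-key pairs scored by one layer of full self-attention."""
    return num_tokens * num_tokens


# Hypothetical slide dimensions (about 4 billion pixels), for illustration only.
slide_w, slide_h = 80_000, 50_000
tiles = tile_count(slide_w, slide_h)

print("tiles in the slide-level sequence:", tiles)
print("query-key pairs under dense attention:", dense_attention_pairs(tiles))
```

Even when every tile is reduced to a single token, the slide-level sequence is tens of thousands of tokens long, and dense attention over it has to score billions of pairs per layer.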

To this end, researchers from Microsoft Research, Providence Health System, and the University of Washington jointly proposed GigaPath, the first whole-slide-scale digital pathology model.

According to reports, GigaPath adopts a two-stage cascade structure together with the LongNet architecture recently developed by Microsoft Research, which efficiently addresses the problem of processing and understanding gigapixel images. The researchers pre-trained GigaPath at scale on real-world data: 170,000 whole-slide pathology images from 30,000 patients across 28 US hospitals in the Providence system, totaling 1.3 billion pathology tiles.

Experimental results show that GigaPath achieves leading results on 25 of 26 tasks, which comprise 9 cancer subtyping tasks and 17 pathomics tasks, and significantly outperforms existing methods on 18 of them.

The researchers said the study shows that whole-slide-scale modeling and pre-training on large-scale real-world data are both extremely important. GigaPath also opens up new possibilities for more advanced cancer care and clinical discovery. The GigaPath model and code have been open-sourced.

Method

GigaPath adopts two-stage curriculum learning: tile-level pre-training with DINOv2, followed by whole-slide-level pre-training with a masked autoencoder built on LongNet, as shown in the figure below.

Figure | GigaPath model diagram

DINOv2 is a standard self-supervised method that combines a contrastive loss with a masked reconstruction loss when training teacher and student Vision Transformers. However, because of the computational cost of self-attention, its application is limited to small images such as 256 × 256 tiles. For whole-slide-level modeling, the research team applied dilated attention from LongNet to digital pathology, as shown below.

Figure | LongNet model diagram

To handle the long sequence of image tiles across a whole slide, they introduce a series of increasing segment sizes, each of which subdivides the tile sequence into segments of that size. For larger segments, LongNet uses sparse attention with sparsity proportional to the segment length, which counteracts the quadratic growth. The largest segment covers the entire slide. This captures long-range dependencies systematically while keeping the computation tractable (linear in the context length).
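The cost argument above can be checked with a small counting sketch. It does not compute attention itself and is not the LongNet implementation: it only enumerates which token positions would attend to each other at each segment size and counts the resulting query-key pairs. The schedule (base segment of 16, doubling segment length and dilation together) is an assumption made for illustration, and details such as per-head offsets are omitted.

```python
def dilated_segments(num_tokens, segment_len, dilation):
    """Index groups that attend to each other for one (segment_len, dilation) pair."""
    groups = []
    for start in range(0, num_tokens, segment_len):
        end = min(start + segment_len, num_tokens)
        # Keep every `dilation`-th position inside the segment (sparse selection).
        groups.append(list(range(start, end, dilation)))
    return groups


def geometric_configs(num_tokens, base_segment=16, ratio=2):
    """Hypothetical schedule: segment length and dilation grow together
    until a single segment covers the whole sequence."""
    configs, segment_len, dilation = [], base_segment, 1
    while True:
        configs.append((min(segment_len, num_tokens), dilation))
        if segment_len >= num_tokens:
            break
        segment_len *= ratio
        dilation *= ratio
    return configs


def attention_pairs(num_tokens, configs):
    """Total query-key pairs scored across all segment sizes."""
    total = 0
    for segment_len, dilation in configs:
        for group in dilated_segments(num_tokens, segment_len, dilation):
            total += len(group) ** 2  # dense attention within each sparse group
    return total


for n in (1024, 2048, 4096):
    sparse = attention_pairs(n, geometric_configs(n))
    print(n, "tokens:", sparse, "dilated pairs vs", n * n, "dense pairs")
```

Because the dilation grows in step with the segment length, every segment contributes the same small number of pairs, so doubling the sequence length roughly doubles the total instead of quadrupling it.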

Main experimental results

In cancer classification and diagnosis, the goal is to classify fine-grained subtypes from pathology slides. For ovarian cancer, for example, the model needs to distinguish six subtypes: clear cell ovarian cancer, endometrioid ovarian cancer, high-grade serous ovarian cancer, low-grade serous ovarian cancer, mucinous ovarian cancer, and ovarian carcinosarcoma. **GigaPath achieved leading results on all nine cancer classification tasks, and the accuracy improvement was significant for six of the cancer types.** For six cancers (breast cancer, kidney cancer, liver cancer, brain cancer, ovarian cancer, and central nervous system cancer), GigaPath's AUROC reached 90% or higher. This is a good start for downstream applications in precision health, such as cancer diagnosis and prognosis.
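For readers unfamiliar with the metric, AUROC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that definition for a single binary (one-vs-rest) task; the labels and scores are made up for illustration and are not results from the study.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = slide belongs to the subtype, 0 = it does not.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auroc(labels, scores))
```

For a multi-class subtyping task, one common convention is to compute this value once per subtype against all others and average the results; whether the study averages in exactly this way is not stated here.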

In the pathomics tasks, the goal is to predict whether a tumor carries specific clinically relevant gene mutations from the whole-slide image alone. This prediction task helps reveal rich connections between tissue morphology and genetic pathways that are difficult for humans to perceive. Beyond a few known pairs of cancer type and gene mutation, how much gene mutation signal whole-slide images contain remains an open question. In some experiments, the researchers also considered pan-cancer scenarios, that is, identifying universal signals of a gene mutation across all cancer types and very diverse tumor morphologies. In this challenging setting, GigaPath again achieved leading performance on 17 of the 18 tasks, significantly outperforming the second-best method on 12 of the 18. GigaPath can extract genetically relevant pan-cancer and subtype-specific morphological features at the whole-slide level, opening the door to complex future research in real-world scenarios.

In addition, the researchers demonstrated GigaPath's potential in multimodal vision-language tasks by introducing pathology reports. Previous work on pathology vision-language pre-training has mostly focused on small images at the tile level. In contrast, GigaPath explores vision-language pre-training at the whole-slide level. By continuing pre-training on pairs of whole-slide images and pathology reports, it uses the semantics of the reports to align the latent-space representations of the pathology images.

This is more challenging than conventional vision-language pre-training. Even without any fine-grained alignment information between individual image patches and text snippets, GigaPath significantly outperforms three state-of-the-art pathology vision-language models on standard vision-language tasks.
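One common way to align image and text representations using only slide-level pairs is a symmetric contrastive objective, in which each slide embedding is pulled toward its own report embedding and pushed away from the other reports in the batch. The sketch below illustrates that general idea only. This article does not specify GigaPath's training objective, so the loss form, the temperature value, and all names here are assumptions.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one row of logits."""
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(v - top) for v in logits))
    return log_sum - logits[target]


def contrastive_alignment_loss(slide_embs, report_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of (slide, report) pairs.

    Pair i is (slide_embs[i], report_embs[i]); every other combination in
    the batch is treated as a negative. No patch-level alignment is used.
    """
    slides = [_normalize(v) for v in slide_embs]
    reports = [_normalize(v) for v in report_embs]
    n = len(slides)
    sim = [[sum(a * b for a, b in zip(s, r)) / temperature for r in reports]
           for s in slides]
    slide_to_report = sum(_cross_entropy(sim[i], i) for i in range(n)) / n
    report_to_slide = sum(
        _cross_entropy([sim[j][i] for j in range(n)], i) for i in range(n)
    ) / n
    return 0.5 * (slide_to_report + report_to_slide)


# Toy embeddings: matched pairs should give a lower loss than swapped pairs.
slides = [[1.0, 0.0], [0.0, 1.0]]
print(contrastive_alignment_loss(slides, [[1.0, 0.0], [0.0, 1.0]]))
print(contrastive_alignment_loss(slides, [[0.0, 1.0], [1.0, 0.0]]))
```

The supervision here comes only from knowing which report belongs to which slide, which matches the setting described above where no patch-to-text alignment is available.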

Summary

Through extensive experiments, the researchers showed that GigaPath is an effective approach to whole-slide pre-training and multimodal vision-language modeling. Although GigaPath achieved leading results on many tasks, there is still considerable room for improvement on some of them. And although the researchers explored multimodal vision-language tasks, many specific questions remain open on the road to building a multimodal conversational assistant for pathology.

GigaPath is a collaborative project between Microsoft Research, Providence Health System, and the Paul Allen School of Computer Science at the University of Washington. Hanwen Xu, a second-year doctoral student from Microsoft Research and the University of Washington, and Naoto Usuyama, a principal researcher from Microsoft Research, are the co-first authors of the paper. Dr. Hoifung Poon, General Manager of the Health Futures team at Microsoft Research, Professor Sheng Wang from the University of Washington, and Dr. Carlo Bifulco from Providence are the co-corresponding authors of the paper.

Hanwen Xu: A second-year doctoral student at the University of Washington. His research lies at the intersection of AI and medicine. His work has been published in Nature, Nature Communications, Nature Machine Intelligence, and AAAI. He has served as a reviewer for Nature Communications, Nature Computational Science, and other journals.

Sheng Wang: Assistant Professor of Computer Science at the University of Washington. His research focuses on the intersection of AI and medicine. His work has been published in Nature, Science, Nature Biotechnology, Nature Machine Intelligence, and The Lancet Oncology, and has been used by medical institutions including Mayo Clinic, Chan Zuckerberg Biohub, UW Medicine, and Providence.

Hoifung Poon: General Manager of Health Futures at Microsoft Research. His research interests include fundamental research in generative AI and applications in precision medicine. He has won best paper awards at multiple AI conferences, and his open-source biomedical models published on HuggingFace have been downloaded tens of millions of times. Some of his research is already being applied at partner medical institutions and pharmaceutical companies.
