Cloud-based mNGS Analysis: Virus Sequence Comparison in 60 Seconds

By Eric Lee, nicknamed Zhuang Huai, from Alibaba Container Service.

Metagenomic next-generation sequencing (mNGS) for pan-pathogen detection, applied to upper or lower respiratory tract samples within weeks of onset, contributed to the early discovery and accurate sequencing of SARS-CoV-2 RNA. Alibaba Cloud Genomics Service (AGS) lets researchers quickly compare mNGS metagenomic sequencing data against reference genomes. Using only an Alibaba Cloud Object Storage Service (OSS) bucket and the AGS command line tool, you can process 3.2 Gbase (22 million reads) of metagenomic data from alveolar sample sequencing and compare it with known pathogen genomes, including COVID-19 (SARS-CoV-2) and 39 BetaCoV RNA reference sequences, all within 60 seconds. You can also upload custom virus libraries for comparison.

For COVID-19, nucleic acid detection uses a multiplex fluorescence RT-PCR kit, which primarily detects the ORF1ab, N, and E gene targets of the virus. Antibody detection looks for the IgM and IgG antibodies that the body's immune system produces when exposed to COVID-19. If such antibodies are detected, this indicates that the subject has been infected. In addition, mNGS provides metagenomic data, obtained through deep sequencing of diseased tissue, that can be used to detect and investigate various pathogens.

Although many RT-PCR kits are available for COVID-19, their results depend on the concentration of the virus in the sample and on the quality of the kit, and RT-PCR and other reagent kits produce a large number of false negatives. As a result, doctors and patients often need to repeat the test many times, which extends the wait for results.

The advantage of mNGS is that all known pathogens can be checked for in a single process. mNGS testing avoids the repeated sampling that burdens both doctors and patients, and it does not need the large number of samples that PCR testing requires. Because mNGS analysis is based on nucleic acid sequence alignment, once the genome of a pathogen is known, researchers only need to update the reference database to detect subsequent cases efficiently and accurately.

AGS provides researchers with the ability to quickly compare mNGS metagenomic sequencing data. With this ability, you can process 3.2 Gbase (22 million reads) of metagenomic data from alveolar sample sequencing and compare it with known pathogen genomes, including COVID-19 and 39 BetaCoV RNA reference sequences, all within 60 seconds. In addition, you can upload custom virus libraries for comparison. The Chinese Center for Disease Control and Prevention, hospitals, and labs need only an OSS bucket and the AGS command line to complete the entire comparison process and produce high-quality read-matching data and preliminary quality reports. This provides fast and accurate data support for the detection of various pathogens and for further protein and variation research on COVID-19.

Working with the community to combat the epidemic, AGS has opened up its mNGS RNA comparison computing capabilities to gene sequencing vendors, the Chinese Center for Disease Control and Prevention, hospitals, schools, and pharmaceutical companies.

Preparation

1. Download and install the AGS command line interface.

2. Download and install OSSutil.

3. Prepare an Alibaba Cloud account and an OSS bucket to store mNGS sequencing data, such as oss://my-test-shenzhen.

4. Configure bucket access permissions for AGS. For example: ags config oss my-test-shenzhen.

5. Upload mNGS data to the bucket.

6. Run the comparison task to compare the mNGS data with known RNA sequences and sequence databases. Repeat steps 5 and 6 to compare different samples.
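Steps 4 and 5 can be scripted. The following is a minimal sketch, not an official workflow: it only builds and optionally runs the `ags config oss` command quoted in step 4 and an `ossutil cp` upload for each file. The local file name, the `mngs` prefix, and the helper names are assumptions made for illustration, and the comparison command for step 6 is deliberately left out because its exact form is not shown here.

```python
import os
import shlex
import subprocess
from typing import List


def build_prep_commands(bucket: str, local_files: List[str], prefix: str = "") -> List[List[str]]:
    """Build the command lines for step 4 (grant AGS access) and step 5 (upload)."""
    # Step 4: same form as the example in the article.
    commands = [["ags", "config", "oss", bucket]]
    clean_prefix = prefix.strip("/")
    for path in local_files:
        name = os.path.basename(path)
        key = f"{clean_prefix}/{name}" if clean_prefix else name
        # Step 5: "ossutil cp <local file> oss://<bucket>/<key>" uploads one file.
        commands.append(["ossutil", "cp", path, f"oss://{bucket}/{key}"])
    return commands


def run_commands(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical sample file; replace with your own sequencing data.
    run_commands(build_prep_commands("my-test-shenzhen", ["data/sample_R1.fastq.gz"], "mngs"))
```

Keeping `dry_run=True` by default lets you review the commands before anything is uploaded.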

COVID-19 Comparison

1. Submit a Comparison Task to Compare the Similarity Between the ICU6 G_S2_L001 Sequence Sample and COVID-19.

2. Check the Comparison Task and Comparison Results.

In this comparison task, 10 million reads (1.4 Gbase) and the COVID-19 sequence MN908947.3 were compared in 43 seconds, generating 3,629 high-quality mapped reads, and 404 reads that scored over 120 points in the COVID-19 feature range. This indicates that COVID-19 RNA sequences can be accurately detected in the sequencing data of this sample.
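To put these figures in perspective, the short sketch below derives throughput and mapping ratios from the numbers quoted above. The function name and dictionary keys are illustrative, and because 1.4 Gbase is a rounded value, the derived read length is only approximate.

```python
def summarize_run(total_reads: int, total_bases: int, seconds: float,
                  mapped_reads: int, feature_reads: int) -> dict:
    """Derive simple throughput and ratio figures from a comparison run."""
    return {
        "mean_read_length": total_bases / total_reads,
        "reads_per_second": total_reads / seconds,
        "bases_per_second": total_bases / seconds,
        "mapped_fraction": mapped_reads / total_reads,
        "feature_fraction_of_mapped": feature_reads / mapped_reads,
    }


if __name__ == "__main__":
    # Figures from the comparison task described in the article.
    stats = summarize_run(
        total_reads=10_000_000,
        total_bases=1_400_000_000,
        seconds=43,
        mapped_reads=3_629,
        feature_reads=404,
    )
    for name, value in stats.items():
        print(f"{name}: {value:.6g}")
```

The mapped fraction is small, which is expected for a metagenomic sample where most reads come from the host and other organisms; what matters is that thousands of reads align to the reference with high quality.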

3. Download the Comparison Data BAM File and Report.

Below is an example for when SARS-CoV-2 RNA is detected.

Further Analysis of the Comparison Data

You can run samtools stats, plot-bamstats, and similar tools on the BAM output to further analyze coverage and depth. Then, you can use the BAM data for protein composition and variation analysis.
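As one way to work with this output programmatically, the sketch below runs `samtools stats` on a BAM file and collects the numeric values from its summary (`SN`) lines. It assumes samtools is installed and on the PATH; the function names are illustrative.

```python
import subprocess
from typing import Dict, Union

Number = Union[int, float]


def parse_summary_numbers(stats_text: str) -> Dict[str, Number]:
    """Collect numeric values from the tab-separated 'SN' lines of samtools stats."""
    summary: Dict[str, Number] = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN\t"):
            continue  # skip comments and the other sections of the report
        fields = line.split("\t")
        if len(fields) < 3:
            continue
        key = fields[1].rstrip(":")
        raw = fields[2].strip()
        try:
            summary[key] = int(raw)
        except ValueError:
            try:
                summary[key] = float(raw)
            except ValueError:
                continue  # ignore non-numeric summary values
    return summary


def samtools_stats(bam_path: str) -> Dict[str, Number]:
    """Run 'samtools stats <bam>' and parse its summary section."""
    result = subprocess.run(
        ["samtools", "stats", bam_path],
        check=True, capture_output=True, text=True,
    )
    return parse_summary_numbers(result.stdout)
```

For example, `summary["reads mapped"] / summary["raw total sequences"]` gives the mapped fraction for the downloaded BAM file.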

Example stats

Coverage Analysis
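Coverage can also be summarized numerically. The sketch below is one possible approach, assuming you have produced per-position depth with `samtools depth -a`, which prints tab-separated contig, position, and depth columns. It reports coverage breadth (the fraction of reference positions with at least `min_depth` reads) and mean depth. The function name and the toy values in the usage example are illustrative only.

```python
from typing import Dict, Iterable


def summarize_depth(depth_lines: Iterable[str], reference_length: int,
                    min_depth: int = 1) -> Dict[str, float]:
    """Compute coverage breadth and mean depth from 'samtools depth' lines."""
    covered = 0
    total_depth = 0
    for line in depth_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and an optional header
        _contig, _position, depth_text = line.split("\t")[:3]
        depth = int(depth_text)
        total_depth += depth
        if depth >= min_depth:
            covered += 1
    return {
        "breadth": covered / reference_length,
        "mean_depth": total_depth / reference_length,
    }


if __name__ == "__main__":
    # Toy input for a 10-base reference; positions not listed count as depth 0.
    toy = ["ref\t1\t5", "ref\t2\t0", "ref\t3\t3", "ref\t4\t2"]
    print(summarize_depth(toy, reference_length=10))
```

For a single-contig reference such as MN908947.3, `reference_length` is the length of that genome (29,903 bases). For a multi-contig reference, filter the lines by contig name first.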

Comparison with 39 Known Beta Coronaviruses

Using Custom Virus Databases for Comparison

1. Download Reference Sequences from NCBI GenBank and Merge Them into a Multi-contig Reference Sequence. For example, search for all nucleotide reference sequences containing betacov and download them.

2. Rename the Downloaded Sequence.fa File as betacov-ncbi-test.fa.

3. Upload the reference to the OSS bucket.

4. Submit the Comparison Task and Specify the Reference Path.

5. View the Comparison Report and Obtain the Matched Data.

6. Download the Matched Data for Further Analysis.
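If the reference sequences from step 1 were downloaded as several separate FASTA files, they can be merged into one multi-contig reference with a few lines of Python. This is a minimal sketch under that assumption; the function names and input file names are hypothetical, and only the output name betacov-ncbi-test.fa comes from step 2.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def merge_fasta_text(chunks: Iterable[str]) -> Tuple[str, List[str]]:
    """Concatenate FASTA texts and return (merged text, contig IDs in order)."""
    contig_ids: List[str] = []
    seen = set()
    lines_out: List[str] = []
    for chunk in chunks:
        for line in chunk.splitlines():
            line = line.strip()
            if not line:
                continue  # drop blank lines between records
            if line.startswith(">"):
                parts = line[1:].split()
                if not parts:
                    raise ValueError("FASTA header without an identifier")
                contig_id = parts[0]
                if contig_id in seen:
                    raise ValueError(f"duplicate contig ID: {contig_id}")
                seen.add(contig_id)
                contig_ids.append(contig_id)
            elif not contig_ids:
                raise ValueError("sequence data found before the first header")
            lines_out.append(line)
    return "\n".join(lines_out) + "\n", contig_ids


def merge_fasta_files(input_paths: Iterable[str], output_path: str) -> List[str]:
    """Merge FASTA files on disk into a single multi-contig reference file."""
    merged, contig_ids = merge_fasta_text(Path(p).read_text() for p in input_paths)
    Path(output_path).write_text(merged)
    return contig_ids


if __name__ == "__main__":
    # Hypothetical input files; the output name follows step 2.
    ids = merge_fasta_files(["part1.fasta", "part2.fasta"], "betacov-ncbi-test.fa")
    print(f"{len(ids)} contigs written")
```

Rejecting duplicate contig IDs up front avoids ambiguous read assignments later, since aligners identify each reference contig by the first word of its header line.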

While continuing to wage war against the worldwide outbreak, Alibaba Cloud will play its part and will do all it can to help others in their battles with the coronavirus. Learn how we can support your business continuity at https://www.alibabacloud.com/campaign/fight-coronavirus-covid-19
