Overview of Antibody Sequencing

Page Contents
  • Background
  • Antibody sequencing protocol
  • Antibody sequencing mass spectrometry
  • Other methods for antibody sequencing
  • Antibody sequence database
  • Antibody sequence analysis tool

The human body is equipped with a powerful immune system. When stimulated by external triggers, it mounts specific immune responses that protect the body from a wide range of pathogens, and this specificity is achieved through the production of a vast number of natural antibodies. It is estimated that an adult has 10^10 to 10^11 B cells, which together can produce about 10^12 different natural antibodies. Antibodies are key molecules of the immune system that specifically recognize and help eliminate foreign pathogens such as bacteria and viruses. Their applications extend beyond immunological research to disease diagnosis, therapy, and vaccine development. However, our current understanding of natural antibodies is only the tip of the iceberg. To explore them further, scientists have developed antibody sequencing technology. Antibody sequencing is a technique for determining the amino acid sequence of specific antibody molecules in a monoclonal or polyclonal antibody sample. It provides key information for understanding an antibody's specificity, activity, three-dimensional structure, and affinity for its antigen, and it is therefore important for antibody engineering, vaccine development, and research on disease diagnosis and treatment.

This article introduces the key technologies of antibody sequencing, including sequencing protocols and methods, antibody sequencing by mass spectrometry, and antibody sequence databases and analysis tools, and discusses the applications and challenges of de novo antibody sequencing.

Background

Antibodies (also known as immunoglobulins) are proteins produced by B cells and are divided into five main classes (IgA, IgD, IgE, IgG, and IgM). Each antibody consists of two heavy chains (HC) and two light chains (LC). The variable regions of the HC and LC, which contain the complementarity-determining regions CDR1, CDR2, and CDR3, determine the specificity and affinity of the antibody. The core goal of antibody sequencing is to obtain and analyze the sequences of these variable regions.

With the rapid growth of antibody drug discovery, and especially the widespread clinical use of monoclonal antibodies (mAbs), antibody structure and sequence information has become increasingly important. Antibody sequencing gives researchers an accurate and comprehensive way to study the mechanisms of the immune response, and it supports antibody drug discovery and development.

Figure: Approaches for high-throughput sequencing of functional antibody repertoires (Robinson WH, 2015).

Antibody sequencing protocol

1. Sample collection and B cell extraction

  • Sample selection after immunization: B cells can be isolated from blood, spleen, lymph nodes, or bone marrow. B cells in these tissues are usually activated after an immune response and carry richer antibody gene information.
  • Flow cytometry sorting: Fluorescence-activated cell sorting (FACS) selects target B cells by surface markers (such as CD19 and CD20) and further excludes non-B cells by cell size and granularity to ensure the purity of the collected cells.
  • Purity verification of B cells: Cell surface markers (such as IgM and IgD) are commonly used to confirm the purity of the sorted B cells.

2. RNA extraction and reverse transcription

  • RNA extraction: Total RNA is extracted from the sorted B cells with a high-quality RNA extraction kit to ensure that the RNA is not degraded.
  • Reverse transcription: Reverse transcriptase is used to convert the extracted RNA into cDNA. Because antibody genes contain many GC-rich regions and repetitive sequences, appropriate primers and reaction conditions are needed for efficient reverse transcription.

3. Antibody gene amplification

  • PCR amplification: Specific primers are used to amplify the variable regions (VH and VL) of the antibody genes. The primer sets usually include universal primers covering the different VH and VL subtypes.
  • Primer design: Primers must be designed according to the species and the antibody isotypes (IgM, IgG, etc.) expressed by the B cells. Because the heavy chain variable region (VH) and light chain variable region (VL) determine antibody specificity, their amplification is the core of the sequencing workflow.
  • Capturing sequence diversity: Amplification bias should be minimized during PCR so that as many antibody variants as possible are captured.

4. High-throughput sequencing

  • Library construction: The PCR products are purified and used to build a sequencing library. During library construction, the adapter sequences required for sequencing are added to the antibody gene fragments so that they can be read on a high-throughput sequencing platform.
  • High-throughput sequencing platform: Commonly used platforms include Illumina (short-read), PacBio (long-read), and Oxford Nanopore (long-read).
  • Sequencing depth: To improve the coverage and reliability of the antibody sequences, a depth of millions to tens of millions of reads is usually needed to capture enough antibody gene sequences.

5. Data processing and analysis

  • Quality control and denoising: Raw sequencing data contain some noise, such as low-quality and contaminating sequences. Data analysis therefore starts with quality control: low-quality reads are removed and short reads are merged into longer sequences.
  • Read merging and de-duplication: Dedicated tools (such as FLASH and PEAR) merge paired-end reads, after which duplicate reads are removed and identical sequences are collapsed. This allows an accurate assessment of antibody gene diversity.
  • Antibody sequence annotation and clustering: Bioinformatics tools (such as IMGT and VBASE2) annotate the antibody gene sequences and identify the V and J gene combinations. Cluster analysis then helps to identify the clonal and subtype diversity of the antibodies.
  • Affinity and specificity analysis: Cluster analysis can point to candidate high-affinity antibodies and help predict their likely antigen specificity. Advanced analyses such as antibody-antigen binding simulation can further locate the antigen-binding site.
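
The quality-control, de-duplication, and clustering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline of any particular tool: the function names, the mean-Phred cutoff of 30, and the one-mismatch clustering rule are all assumptions chosen for the example, and the input is assumed to be CDR3 amino acid sequences that have already been merged and annotated.

```python
from collections import Counter


def mean_quality(phred_scores):
    """Average Phred score of one read."""
    return sum(phred_scores) / len(phred_scores)


def filter_and_collapse(reads, min_mean_q=30):
    """Drop low-quality reads, then collapse identical sequences into counts.

    `reads` is an iterable of (sequence, phred_scores) tuples.
    The cutoff of 30 is an illustrative assumption.
    """
    kept = [seq for seq, quals in reads if mean_quality(quals) >= min_mean_q]
    return Counter(kept)


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def cluster_cdr3(cdr3_counts, max_mismatch=1):
    """Greedy clustering: visit sequences from most to least abundant and
    attach each one to the first cluster whose seed has the same length and
    differs by at most `max_mismatch` residues; otherwise start a new cluster.
    """
    clusters = []
    for seq, _count in cdr3_counts.most_common():
        for cluster in clusters:
            seed = cluster["seed"]
            if len(seed) == len(seq) and hamming(seed, seq) <= max_mismatch:
                cluster["members"].append(seq)
                break
        else:
            clusters.append({"seed": seq, "members": [seq]})
    return clusters
```

Seeding clusters with the most abundant sequence is a common heuristic because low-count neighbors of an abundant sequence are often sequencing or PCR errors; production tools use more elaborate models.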

6. Antibody expression and function verification

  • Antibody cloning: After antibody sequences with the desired specificity and diversity have been selected, they are inserted into suitable expression vectors (such as pET or pcDNA) by molecular cloning and introduced into an appropriate expression system (such as Escherichia coli, insect cells, or mammalian cells) for antibody expression.
  • Antibody expression: High-purity antibody is obtained by optimizing the expression conditions. Affinity chromatography (such as Protein A or Protein G affinity chromatography) is usually used for purification.
  • Functional verification: ELISA detects binding of the antibody to the target antigen and verifies its specificity. Flow cytometry verifies binding to cell-surface antigens and can assess antibody affinity and interaction with cell-surface receptors. A neutralization assay is used, when the antibody is intended to neutralize a virus or bacterium, to verify that it effectively prevents infection of cells.

For more detailed antibody sequencing steps, please refer to "Antibody Sequencing Protocol".

Antibody sequencing mass spectrometry

Mass spectrometry (MS) measures the mass-to-charge ratio (m/z) of ionized molecules in an electric or magnetic field, which allows accurate analysis of the composition, structure, and modification status of antibodies. In antibody sequencing, MS is mainly used to analyze the protein sequence, structure, and functional characteristics of antibodies, especially in de novo antibody sequencing.
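
As a worked illustration of the mass-to-charge ratio: for a peptide of neutral monoisotopic mass M carrying z protons, m/z = (M + z × 1.007276) / z. The sketch below computes this for an unmodified peptide. The function names are illustrative assumptions, and the residue masses are standard monoisotopic values rounded to five decimals.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide for the free N- and C-termini
PROTON = 1.007276   # mass of the charge carrier


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence, charge):
    """m/z of the peptide ion carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge
```

Note that leucine and isoleucine have identical masses, so they cannot be told apart by precursor or standard fragment masses alone.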

MS analysis steps

  • Sample preparation: Monoclonal antibodies are generally obtained from cultured hybridoma cells; polyclonal antibodies may need to be extracted from serum or other biological fluids. Commonly used methods include affinity chromatography and ion exchange chromatography.
  • Antibody purification: Antibody samples are purified on Protein A, Protein G, or antigen-specific affinity columns to remove non-specific proteins and other impurities.
  • Cleavage and reduction: The purified antibody is split into smaller units, usually by treatment with a reducing agent that breaks the disulfide bonds between chains to yield separate LC and HC.
  • Digestion: The antibody protein is broken down into smaller peptides by enzymatic digestion (with trypsin or other specific proteases).
  • Peptide purification and separation: The resulting peptides are further purified and separated by HPLC or similar techniques.
  • MS analysis: The purified peptides are analyzed on a high-resolution mass spectrometer (such as an Orbitrap or Q-TOF) to obtain their m/z information and determine the peptide sequences.
  • Data analysis: From the peptide sequences derived from the mass spectrometry data, the antibody sequence is reconstructed by database search or a de novo algorithm.
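
The digestion step can be simulated in silico to see which peptides a protease would be expected to produce. The sketch below applies the commonly used trypsin rule (cleave after K or R, but not when the next residue is P). The function name and the toy sequences are assumptions for illustration; missed cleavages and modifications are ignored.

```python
import re


def trypsin_digest(sequence):
    """Split a protein sequence after every K or R that is not followed by P.

    Uses a zero-width regular expression, so the cleaved residue stays at the
    C-terminus of the upstream peptide. Requires Python 3.7 or later, where
    re.split accepts patterns that match the empty string.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if p]  # drop the empty tail after a final K/R
```

For example, `trypsin_digest("AKRPGK")` leaves the R-P bond intact and returns two peptides rather than three.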

Advantages and challenges

  • Advantages: MS supports high-throughput analysis and is suitable for different antibody formats, including full-length antibodies, antibody fragments, and engineered antibodies. MS can also reveal post-translational modifications such as glycosylation and phosphorylation, which helps to explain the biological functions of antibodies.
  • Challenges: MS requires high-quality samples and high-resolution instruments, and it places demands on peptide length and sequence. For antibodies with complex structures, multi-step analysis may be needed to obtain complete sequence information.

For more about mass spectrometry, please refer to "Antibody Sequencing Mass Spectrometry".

Other methods for antibody sequencing

In addition to MS, other methods include PCR-based antibody gene sequencing, single-cell antibody sequencing, immune repertoire sequencing, sequencing of antibody HC and LC clones, and antibody epitope sequencing.

PCR-based antibody gene sequencing

  • mRNA is first extracted from B cells and reverse transcribed into cDNA; the antibody genes (HC and LC) are then amplified by PCR and sequenced to obtain the antibody gene sequence. This method is usually used for sequence analysis of monoclonal antibodies and yields comprehensive sequence information.
  • Advantages: The gene sequence of a specific antibody can be obtained by PCR, and the technology is mature and widely used in antibody development and screening.
  • Disadvantages: The method depends on B cell isolation and high-quality RNA extraction; without them the success rate is low and sequence deviations may occur.

Single-cell antibody sequencing

  • Single-cell antibody sequencing uses single-cell RNA sequencing (scRNA-seq) to resolve the antibody genes of individual B cells, efficiently recovering antibody sequences cell by cell.
  • Advantages: Antibody sequences are obtained directly from individual B cells, avoiding the antibody cloning step of traditional methods. The rich diversity data produced are well suited to studies of the immune response, vaccine research, and monoclonal antibody development.
  • Disadvantages: scRNA-seq is costly and requires a high-throughput platform.

Immune repertoire sequencing

  • Immune repertoire sequencing enables a comprehensive analysis of the antibody gene pool of the entire immune system and helps to identify and analyze diverse antibody sequences. By sequencing the receptor genes of B cells or T cells, it captures the breadth and depth of an individual's immune response, supporting analysis of immune response, immune memory, and antibody diversity.
  • Advantages: It comprehensively assesses the antibody pool of the immune system and yields rich diversity data. It can be applied to studies of immune response, vaccine efficacy, antibody therapy, and immune escape mechanisms. It can also efficiently detect rare, low-frequency antibody variants, which helps to discover potential therapeutic antibodies.
  • Disadvantages: It places high demands on sample quality, has higher sequencing costs, and requires substantial computational resources and professional bioinformatics support for data analysis.

Sequencing of antibody HC and LC clones

  • Antibody HC and LC clone sequencing determines the specific sequence of an antibody by cloning and sequencing the HC and LC regions (including the variable regions VH and VL) of the antibody gene. It is often used to screen or study antibodies with specific functions and is especially common in monoclonal antibody development, optimization, and functional analysis.
  • Advantages: The complete sequence of a single antibody can be obtained accurately, which helps to understand its specificity and affinity. The method supports antibody screening, antibody optimization, and clinical applications, and provides accurate data for monoclonal antibody development and engineering.
  • Disadvantages: The process is relatively cumbersome, because the cloning and screening steps are time-consuming and costly. With complex samples, problems such as clonal overlap and limited sequence resolution may arise.

Antibody epitope sequencing

  • Antibody epitope sequencing studies the binding properties of an antibody by analyzing the specific region of the antigen (the epitope) to which it binds. It reveals structural information about the interaction and helps to determine which parts of the antigen are critical binding sites. It is used in vaccine research and studies of immune escape mechanisms, and often in the optimization of therapeutic antibodies.
  • Advantages: It accurately maps the epitopes involved in antibody-antigen binding and provides a structural basis for antibody design and optimization. It can be applied to vaccine design, immune escape research, and antibody optimization, and it deepens our understanding of how the immune system recognizes and attacks pathogens or tumor cells.
  • Disadvantages: It requires a high level of technical and facility support and is often combined with other techniques such as X-ray crystallography or cryo-electron microscopy. Costs are higher, and data analysis requires strong bioinformatics tool support.

For more antibody sequencing methods, please refer to "Antibody Sequencing Methods".

De novo antibody sequencing

De novo antibody sequencing determines the complete amino acid sequence directly from the antibody protein or its enzymatically digested peptides by combining high-resolution mass spectrometry with bioinformatics analysis, without relying on known gene or amino acid sequence databases. The coding gene sequence is then deduced from the protein sequence by reverse translation.
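
Deducing a coding sequence from a protein sequence is a one-to-many mapping, because most amino acids are encoded by several codons. The sketch below simply picks one codon per residue. The codon choices and the function name are assumptions made for illustration; real workflows typically optimize codon usage for the expression host.

```python
# One valid codon per amino acid (an illustrative choice, not an
# optimized codon table for any specific expression host).
CODON = {
    "A": "GCC", "C": "TGC", "D": "GAC", "E": "GAG", "F": "TTC",
    "G": "GGC", "H": "CAC", "I": "ATC", "K": "AAG", "L": "CTG",
    "M": "ATG", "N": "AAC", "P": "CCC", "Q": "CAG", "R": "AGA",
    "S": "AGC", "T": "ACC", "V": "GTG", "W": "TGG", "Y": "TAC",
}


def back_translate(protein):
    """Return one possible DNA coding sequence for a protein sequence."""
    try:
        return "".join(CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]!r}") from None
```

Because the mapping is not unique, the deduced DNA encodes the same protein as the original antibody gene but need not match it at the nucleotide level.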

1. Antibody isolation and sample preparation

  • Separation of light and heavy chains: A reducing agent (such as β-mercaptoethanol) breaks the disulfide bonds, and size exclusion chromatography (SEC) or reducing SDS-PAGE then separates the light chain (~25 kDa) from the heavy chain (~50 kDa).
  • Enzymatic cleavage strategy: Trypsin, Lys-C, or Glu-C is used to cleave each chain specifically and generate overlapping peptides (for example, the peptides covering the CDR regions should provide > 90% sequence coverage).
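
A sequence coverage figure such as the > 90% target above can be checked by mapping the identified peptides back onto a candidate chain sequence. The sketch below is a minimal version of that calculation. The function name and toy sequences are assumptions, and exact string matching is used, so peptides with sequencing errors or modifications would not be counted.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in `protein` covered by at least one peptide.

    Every exact occurrence of every peptide is marked, so overlapping
    peptides from different proteases are only counted once per residue.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)
```

Pooling the peptide lists from several proteases before calling the function shows how much each additional digest contributes to the overall coverage.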

2. High resolution mass spectrometry analysis

  • Instrument configuration: Liquid chromatography: nanoflow UHPLC (C18 column, 150 μm × 15 cm); mass spectrometer: Orbitrap Fusion Lumos (resolution 240,000 at m/z 200).
  • Data acquisition mode: DDA (data-dependent acquisition): TopN strategy (N = 20) with MS/MS fragmentation (HCD energy 28%); DIA (data-independent acquisition): used for samples with complex modifications (such as glycosylated antibodies).
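
The TopN idea behind DDA can be illustrated as follows: from each survey (MS1) scan, the N most intense precursor ions are chosen for MS/MS fragmentation. The sketch below shows only this selection step. The function name, the peak representation as (m/z, intensity) tuples, and the optional intensity threshold are assumptions; real instrument logic also applies features such as charge-state filtering and dynamic exclusion, which are not modeled here.

```python
import heapq


def select_top_n(peaks, n=20, min_intensity=0.0):
    """Return the `n` most intense peaks, strongest first.

    `peaks` is a list of (mz, intensity) tuples from one MS1 scan.
    Peaks below `min_intensity` are ignored.
    """
    candidates = [p for p in peaks if p[1] >= min_intensity]
    return heapq.nlargest(n, candidates, key=lambda p: p[1])
```

With N = 20, a scan containing fewer than 20 qualifying peaks simply returns all of them.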

3. Sequence reconstruction and gene derivation

  • Peptide de novo sequencing: Software such as PEAKS Studio and Byonic analyzes b/y ion pairs and constructs candidate amino acid sequences.
  • Variable region assembly: The V(D)J recombination site is located using overlapping peptides (such as CDR3 peptides), and the germline gene origin is predicted with the IMGT database.
  • Verification by gene synthesis: The deduced DNA sequence is chemically synthesized and transfected into HEK293 cells, and the binding activity of the expressed antibody is verified by ELISA (for example, an anti-HER2 antibody with affinity ≥ that of the original drug).
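
The assembly of overlapping peptides into a longer chain sequence can be illustrated with a simple greedy merge: repeatedly join the two fragments that share the longest suffix-prefix overlap. This is a toy sketch under stated assumptions, not the algorithm used by PEAKS Studio, Byonic, or any other package; the function names, the minimum overlap of three residues, and the example peptides are all hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(peptides, min_len=3):
    """Merge peptides by largest overlap until no pair overlaps any more."""
    frags = list(dict.fromkeys(peptides))  # remove exact duplicates
    # Drop peptides fully contained in another peptide.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # remaining fragments cannot be joined
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags
```

Short or repetitive overlaps can produce wrong joins, which is one reason digests with several proteases and germline templates are used to constrain the assembly.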

Core advantages

  • It can analyze antibodies from new species or highly mutated antibodies (such as broadly neutralizing HIV antibodies).
  • It can handle crude samples such as serum and tissue lysate directly (after pre-treatment by immunoaffinity purification).
  • It simultaneously identifies key post-translational modifications such as glycosylation (for example, at N297 in the Fc region) and oxidation (for example, at Met252).

For more information about de novo antibody sequencing, please refer to "Overview of De Novo Antibody Sequencing".

Antibody sequence database

  • IMGT: an internationally standardized immunoglobulin gene database, with germline gene classification covering 200+ species, including human and mouse (e.g. IGHV1-69*01).
  • abYsis: an antibody sequence and structure database that supports H3 loop length statistics for the CDR regions and tracing of germline gene origin.
  • PDB: contains crystal structures of 30,000+ antibody/antigen complexes (e.g. PDB 7CH4: COVID-19 neutralizing antibody S309) and provides structural parameters such as electrostatic potential and B factor.
  • Thera-SAbDab: specializes in therapeutic antibody structures (including clinical drug candidates). Fc engineering mutations can be screened (such as YTE, which extends half-life).
  • SAbDab: a dynamically updated resource of antibody-antigen binding affinities (KD values), with integrated epitope prediction tools (such as Paratome).
  • IEDB: an immune epitope database containing 200,000+ antibody-antigen interaction records (such as HIV gp120 neutralizing epitopes).

For more about antibody sequence databases, please refer to "Antibody Sequence Database".

Antibody sequence analysis tool

  • ANARCI: an antibody sequence annotation tool based on hidden Markov models (HMMs). It identifies CDR regions (according to the Kabat, Chothia, or IMGT scheme) and predicts the germline gene origin. Output: JSON-format sequence feature report (including FR/CDR boundaries and mutation hotspots).
  • IgBLAST: an immune repertoire analysis tool developed by NCBI. It supports alignment of high-throughput NGS data (such as batch processing of 10^6 scFv sequences).
  • RosettaAntibody: integrates antibody-specific energy functions and loop modeling (such as the KIC algorithm for the CDR-H3 loop). It can predict the conformational stability (ΔΔG calculation) of nanobodies (such as VHH).
  • AlphaFold-Multimer: deep learning-based structure prediction of antibody-antigen complexes. Accuracy: RMSD < 2 Å (validated on known complexes).
  • AbRSA: an antibody rational design platform that combines molecular dynamics (MD) simulation with machine learning (XGBoost) to predict aggregation tendency (Tm values). Output: mutant stability ranking (for example, F404Y improves the thermal stability of IgG1 by 5 °C).
  • AbAdapt: an antibody humanization tool based on an adversarial network. It takes a mouse antibody sequence as input and outputs a humanization scheme (framework selection for CDR grafting).
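
The RMSD accuracy figure quoted above for structure prediction is a root-mean-square deviation over matched atoms. The sketch below shows the basic calculation, assuming the two coordinate sets are already superposed, listed in the same atom order, and expressed in the same unit. The function name is illustrative, and the optimal superposition step that structural biology tools perform before computing RMSD is not included.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two matched lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

A lower value means the predicted and reference structures agree more closely over the atoms compared, so the choice of atoms (for example, only CDR loops versus the whole complex) strongly affects the number reported.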

For more information about antibody sequence analysis, please refer to "Antibody Sequence Analysis".

Reference

  1. Robinson WH. "Sequencing the functional antibody repertoire--diagnostic and therapeutic discovery." Nat Rev Rheumatol. 2015;11(3):171-82. doi: 10.1038/nrrheum.2014.220
* For Research Use Only. Not for use in diagnostic procedures.