An antibody database is a bioinformatics resource for storing and managing antibody-related genes, protein sequences, structures, functions and diversity information. Typical content includes antibody sequences (such as VH/VL), three-dimensional structures, antibody-antigen interactions, diversity data (such as V(D)J rearrangement) and functional annotations (such as classes, subtypes and post-translational modifications). These databases provide functions such as data retrieval, sequence alignment, structure prediction and immune repertoire analysis, and support antibody engineering, drug development, immunological research and vaccine design. Users can access the data through a web interface or an API and download it for offline analysis, which promotes collaboration and innovation in antibody research. Some common antibody databases are listed below:
Database Name | Key Focus & Features | Applications & Utilities |
---|---|---|
IMGT | Comprehensive database for Ig and TCR, including sequences, structures, and gene rearrangements | Supports immunological research, integrates data from literature and structural studies |
IMGT/mAb-KG | Knowledge graph for therapeutic monoclonal antibodies, comprising 139,629 triples | Offers extensive information on mAbs, including engineered antibodies |
NanoLAS | Focuses on nanobody data, especially from camelids | Provides flexible search tools for nanobody structures |
IEDB | Stores information on linear and nonlinear immune epitopes | Maps immune epitopes to 3D antigen structures for better understanding |
ABSD | Public database of unique antibody sequences, focusing on ensuring sequence uniqueness and provenance | Represents antibody diversity effectively, supports easy expansion through automated processes |
SAbDab | Provides structural data of antibodies, especially therapeutic ones | Aligns therapeutic sequences with germline genes using ANARCI |
Thera-SAbDab | Focuses on INN-listed antibody therapeutics, providing structural data and clinical information | Tracks therapeutics with clinical data, allows search by variable domain sequence |
PDB-2-PBv3.0 | Enhanced PDB-based database for efficient protein structure analysis, including antibody-antigen complexes | Supports detailed study of molecular interactions and antibody optimization |
AbDb | Collects pre-numbered structures of Fv fragments, used for antibody modeling | Supports epitope conformation studies and antibody modeling |
ATCC | Provides antibody products and related genetic information | Offers customized antibody services and extensive catalog access |
VBASE2 | Germline sequences of human and mouse Ig V genes, integrates data from multiple sources | Supports Ig gene studies, regularly updated to sync with latest information |
BCRdb | Dedicated to storing and analyzing B cell receptor gene, protein, and diversity data | Allows immune repertoire analysis, supports cross-referencing and data export |
HDB | Focuses on hybridoma technology and monoclonal antibody data | Provides data on cell lines and mAb functional properties |
IMGT
IMGT (the international ImMunoGeneTics information system) is a comprehensive database and toolset dedicated to the study of immunoglobulins (Ig) and T cell receptors (TCR) in jawed vertebrates, from fish to humans. It provides detailed information on immune receptor sequences, gene rearrangements, V(D)J gene selection, amino acid sequences, and more. The IMGT database and tool system consists of the following components.
IMGT/LIGM-DB: covers Ig and TCR gene, sequence and structure data, and also provides analysis of the V-like and G domains of the IgSF.
IMGT/2Dstructure-DB and IMGT/3Dstructure-DB: provide two-dimensional and three-dimensional structural data of antibodies and T cell receptors, together with tools for comparing, analyzing and visualizing gene, sequence and structure data. IMGT/3Dstructure-DB includes the three-dimensional structures of Ig (antibodies) and TCR, the protein structures of MHC-I and MHC-II, and the peptide-MHC complexes pMHC-I and pMHC-II. It also contains Ig/antigen (Ag) and TCR/pMHC complexes and structural information on immunoglobulin and MHC superfamily members found in those complexes. The structural data come from the Protein Data Bank (PDB) and are annotated and aligned with tools such as IMGT/DomainGapAlign.
IMGT/DomainGapAlign: a tool that uses the standard IMGT nomenclature to describe immune-related molecular domains (such as V domains and C domains) accurately. By alignment, these domains can be matched to known genes and alleles, which keeps the data standardized and consistent.
IMGT/mAb-DB: provides queries for therapeutic monoclonal antibodies and other immune-related molecules.
IMGT/V-QUEST: used for nucleotide sequence analysis to help reveal gene diversity.
IMGT integrates knowledge from multiple sources, such as literature data, X-ray diffraction studies and structural data, to ensure that the data in its databases are accurate and consistent, providing strong support for immunological research.
IMGT (Lefranc MP et al., 2020)
IMGT/mAb-KG
IMGT/mAb-KG is a knowledge graph dedicated to recording and classifying monoclonal antibodies (mAbs). It contains 139,629 triples and covers 1,867 concepts or categories, which help clarify the function, origin and application of each mAb, along with 114 attributes or relationships that link 21,842 entities and help establish associations across research and clinical applications. IMGT/mAb-KG contains about 35,000 research products covering about 500 targets. These studies involve more than 500 diseases or clinical indications, which shows the breadth of the resource's value in clinical application. More than 150 mAbs conjugated to drugs or other molecules that enhance their function or activity are recorded, as are radiolabeled mAbs and mAbs fused to proteins or peptides (more than 175 fusion antibodies). Most engineered mAbs use the kappa light chain, which predominates in mice: about 95% of mouse antibodies are of the kappa type. In recent years the lambda light chain has been used more and more, because mice are no longer the only source.
In IMGT/mAb-KG, more than 475 mAbs are IgG1-kappa, which is the most common IgG type. Many mAbs in IMGT/mAb-KG are intended for oncology, especially solid tumors, with more than 480 mAbs applied in this field. More than 190 mAbs have been developed for the treatment of rheumatoid arthritis. At present, most clinical studies of mAbs in IMGT/mAb-KG are in phase I or II, and about 600 mAbs have been tested in clinical studies and approved for commercialization. In addition to monospecific antibodies, the resource also includes more than 120 bispecific mAbs, which can simultaneously bind two antigens or two epitopes of the same antigen.
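A knowledge graph stores each fact as a subject-predicate-object triple. The following Python sketch illustrates how such triples can be indexed and queried, for example to find every IgG1-kappa antibody. The entity names and predicates (`mAb_A`, `hasIsotype`, and so on) are hypothetical and are not taken from the actual IMGT/mAb-KG vocabulary.

```python
from collections import defaultdict

# Hypothetical (subject, predicate, object) triples; the names are
# illustrative only and do not reflect the real IMGT/mAb-KG schema.
TRIPLES = [
    ("mAb_A", "hasLightChain", "kappa"),
    ("mAb_A", "hasIsotype", "IgG1"),
    ("mAb_A", "hasFormat", "monospecific"),
    ("mAb_B", "hasLightChain", "lambda"),
    ("mAb_B", "hasIsotype", "IgG1"),
    ("mAb_B", "hasFormat", "bispecific"),
    ("mAb_C", "hasLightChain", "kappa"),
    ("mAb_C", "hasIsotype", "IgG4"),
]


def build_index(triples):
    """Index subjects by (predicate, object) so lookups avoid a full scan."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(predicate, obj)].add(subject)
    return index


def find_subjects(index, **constraints):
    """Return the subjects that satisfy every predicate=object constraint."""
    matches = [index.get(item, set()) for item in constraints.items()]
    return set.intersection(*matches) if matches else set()


index = build_index(TRIPLES)
igg1_kappa = find_subjects(index, hasIsotype="IgG1", hasLightChain="kappa")
print(sorted(igg1_kappa))
```

Indexing by predicate and object keeps each constraint a dictionary lookup, and combining constraints reduces to a set intersection.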
NanoLAS
NanoLAS includes nanobody data from many species, among which Lama glama (llama) is the most important source species. When analyzing nanobodies, the NanoLAS authors compared the lengths of the complementarity-determining regions (CDR1, CDR2 and CDR3). The lengths of CDR1 and CDR2 vary little, while the length of CDR3 varies widely, from 5 to 28 amino acids; CDR3 sequences of 14, 16, 17 and 21 amino acids are common and representative in the database. NanoLAS provides flexible search and filtering tools. Users can search by PDB ID, amino acid sequence or CDR region, and filter by year, source organism, ligand receptor and active residue. Search results can be viewed in the interface, including the 3D structure of the nanobody (rendered with 3Dmol.js), and the PDB file can be downloaded for more detailed analysis. Downloads also include details of the antibody, such as pid, total structure weight, total number of atoms, residue count and release date. NanoLAS can also compare different nanobodies and show their similarities and differences in a 3D view, which helps in analyzing and understanding nanobody structures. To keep enriching the database, NanoLAS provides a data submission page where users can upload their own nanobody data; submissions go through a review process to ensure the accuracy and reliability of the data.
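The CDR length comparison described above can be reproduced on any set of sequences. The following Python sketch summarizes the length distribution of a few CDR3 sequences. The sequences are made up for illustration, and no NanoLAS file format or API is assumed.

```python
from collections import Counter

# Hypothetical CDR3 amino acid sequences, invented for illustration.
cdr3_sequences = [
    "ARDYW",            # 5 residues
    "AAGGSYDY",         # 8 residues
    "NVGSWYDY",         # 8 residues
    "AADLGTWYSDYEYDY",  # 15 residues
]


def cdr_length_summary(sequences):
    """Summarize the length distribution of a list of CDR sequences."""
    lengths = [len(seq) for seq in sequences]
    counts = Counter(lengths)
    return {
        "min": min(lengths),
        "max": max(lengths),
        "most_common": counts.most_common(1)[0][0],
        "counts": counts,
    }


summary = cdr_length_summary(cdr3_sequences)
print(summary["min"], summary["max"], summary["most_common"])
```

Running the same function on CDR1 and CDR2 sequences would show whether their lengths are as stable as the text reports.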
IEDB
IEDB (Immune Epitope Database) stores information on a wide range of immune epitopes, which are divided mainly into linear and nonlinear epitopes. Linear epitopes are continuous amino acid sequences that can be recognized by T cells or B cells, while nonlinear epitopes usually consist of amino acid residues that are brought into spatial proximity by the three-dimensional structure of the protein. In addition to the epitopes themselves, IEDB records the types of immune response they activate (T cell or B cell responses) and provides immune response data for different species (such as humans, mice and birds). IEDB also provides the source of each epitope, including pathogens (viruses, bacteria, parasites, etc.), vaccines or other antigen types. Each epitope has detailed sequence information, such as UniProt ID, residue positions and its linear or nonlinear classification. Through links to the Protein Data Bank (PDB) and structure resources such as AlphaFold, IEDB helps researchers map immune epitopes onto the corresponding 3D antigen structures, so as to better understand the spatial distribution and function of these epitopes.
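The distinction between linear and nonlinear epitopes can be expressed in terms of residue positions. The following Python sketch labels an epitope as linear when its residue positions form one unbroken run in the antigen sequence, and as nonlinear otherwise. This is a simplification for illustration and is not IEDB's curation procedure; the function name and example positions are hypothetical.

```python
def classify_epitope(positions):
    """Label an epitope 'linear' if its residue positions are consecutive.

    positions: iterable of 1-based residue positions in the antigen sequence.
    """
    ordered = sorted(set(positions))
    if not ordered:
        raise ValueError("an epitope needs at least one residue position")
    # Every neighbouring pair must differ by exactly one position.
    contiguous = all(b - a == 1 for a, b in zip(ordered, ordered[1:]))
    return "linear" if contiguous else "nonlinear"


print(classify_epitope([45, 46, 47, 48]))    # one continuous stretch
print(classify_epitope([12, 57, 58, 103]))   # residues far apart in sequence
```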
ABSD
ABSD (Antibody Sequence Database) is a public database designed to provide unique antibody sequences, especially the heavy chain sequences of human antibodies. It automatically extracts and integrates data from multiple sources and is updated monthly to keep its content current and accurate. The goal of ABSD is to ensure the uniqueness of each antibody sequence in the database and to solve the problems of redundancy and missing data. The antibody sequences come from different studies and databases, such as PDB, Kabat, OAS and IMGT. Although many sequences can be found in several of these databases, ABSD keeps each sequence only once and maintains source records so that its provenance stays clear. In addition, ABSD can be extended easily through automated parsers, which add antibody sequences from other species or incorporate new data sources. To evaluate how representative the ABSD dataset is, the authors compared the clustering of its VH gene segments with that of other published human antibody libraries. The analysis shows that the distribution of antibody sequences in ABSD across subsets and databases is consistent with expectations, which suggests that ABSD can effectively represent the antibody diversity of the human immune system.
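Keeping each sequence once while remembering every database it came from can be sketched with a dictionary keyed by sequence. The Python example below is a minimal illustration of that idea, not ABSD's actual pipeline. The input records and the normalization rule (strip whitespace, upper-case) are assumptions.

```python
from collections import defaultdict


def merge_unique(records):
    """Collapse duplicate sequences and keep the list of sources for each.

    records: iterable of (source_name, sequence) pairs.
    Returns a dict mapping each unique sequence to its sorted sources.
    """
    provenance = defaultdict(set)
    for source, sequence in records:
        # Assumed normalization: drop whitespace and ignore letter case.
        normalized = "".join(sequence.split()).upper()
        if normalized:
            provenance[normalized].add(source)
    return {seq: sorted(sources) for seq, sources in provenance.items()}


# Hypothetical records; the sequences are short invented fragments.
records = [
    ("PDB", "EVQLVESGG"),
    ("OAS", "evqlvesgg"),
    ("IMGT", "QVQLQESGP"),
    ("Kabat", "EVQLVESGG"),
]
unique = merge_unique(records)
for sequence, sources in unique.items():
    print(sequence, sources)
```

Because the sources are stored per sequence, a duplicate found in a new database only extends the provenance list instead of adding a redundant entry.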
SAbDab and Thera-SAbDab
Thera-SAbDab (the Therapeutic Structural Antibody Database), a companion resource to SAbDab (the Structural Antibody Database), relies heavily on the Proposed International Nonproprietary Names (INN) lists published by the WHO for most of its antibody sequences. The INN list is published twice a year (January/February and June/July) and covers the variable domain sequences of all antibody- and nanobody-related therapeutics awarded an INN. Through databases such as IMGT/mAb-DB, Thera-SAbDab is able to find sequence data for a wide range of therapeutics; for example, sequence information can be found for 47 of the 129 antibody therapeutics proposed before 2006. The ANARCI numbering tool is used to align input sequences with pre-numbered germline sequences, which helps the user understand the effect of mutations on the binding site. Thera-SAbDab provides structural information on antibody therapeutics, including a wide range of monoclonal and bispecific antibodies, with particular attention to the detailed structures of variable domains, such as those of antibodies against TNF-α and IL17A.
Thera-SAbDab compares its data with the structures in SAbDab, and all therapeutic sequences are aligned to B-cell germline genes. The database also contains static and dynamic metadata related to therapeutics. Static metadata includes basic information about the antibody, such as light chain type and target, while dynamic metadata covers clinical trial information, development status and survey data. These data are sourced from frequently updated databases such as AdisInsight and ClinicalTrials.gov. As of 2019, Thera-SAbDab tracked 558 INNs covering 543 unique therapeutics, 87.1% of which had sequence data for their variable domains. Of the 25 bispecific therapeutics, 44.0% had at least one variable domain with exact or close structural coverage. Thera-SAbDab not only allows querying by metadata but, uniquely, also allows searching by variable domain sequence. This lets researchers identify all therapeutics associated with any variable domain in the query sequence.
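Searching by variable domain sequence amounts to scoring a query against every stored VH and VL sequence and returning the close matches. The Python sketch below illustrates the idea with the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. It is not the alignment method used by Thera-SAbDab, and the therapeutic names, sequence fragments and the 0.9 threshold are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical therapeutics with invented, truncated variable domain fragments.
THERAPEUTICS = {
    "drug_alpha": {"VH": "EVQLVESGGGLVQPGGSLRLSCAAS",
                   "VL": "DIQMTQSPSSLSASVGDRVTITC"},
    "drug_beta": {"VH": "QVQLQESGPGLVKPSETLSLTCTVS",
                  "VL": "EIVLTQSPGTLSLSPGERATLSC"},
}


def search_by_variable_domain(query, database, min_similarity=0.9):
    """Return (name, domain, score) hits at or above min_similarity, best first."""
    hits = []
    for name, domains in database.items():
        for domain, sequence in domains.items():
            matcher = SequenceMatcher(None, query, sequence, autojunk=False)
            score = matcher.ratio()  # 1.0 means the sequences are identical
            if score >= min_similarity:
                hits.append((name, domain, round(score, 3)))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


exact_hits = search_by_variable_domain("EIVLTQSPGTLSLSPGERATLSC", THERAPEUTICS)
# Same light chain fragment with a single substitution (T10 -> A).
near_hits = search_by_variable_domain("EIVLTQSPGALSLSPGERATLSC", THERAPEUTICS)
print(exact_hits)
print(near_hits)
```

A production search would use a proper sequence alignment with numbered positions; the ratio used here only approximates sequence identity.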
PDB-2-PBv3.0
PDB-2-PBv3.0 is an enhanced database based on Protein Data Bank (PDB) data, designed for efficient protein structure analysis. The database not only integrates more accurate and detailed three-dimensional protein structure information, but also systematically collates antibody-related data. PDB-2-PBv3.0 contains a variety of antibody structure data, including monoclonal antibodies, polyclonal antibodies and antibody-antigen complexes, which gives researchers a platform for analyzing in depth how antibodies recognize and bind specific antigens. The database presents the complete structural information of antibody heavy and light chains and supports in-depth study of antibody-antigen binding mechanisms, affinity and specificity. With PDB-2-PBv3.0, researchers can analyze the interaction between antibodies and antigens and identify key features such as the binding interface and binding site. The data on antibody-antigen complexes support the analysis of the binding modes and biological functions of different antibodies. PDB-2-PBv3.0 uses a systematic scheme to classify antibodies by class (such as IgG, IgM, etc.), source and structural type. The database also annotates antibody domains in detail, including the variable region (located in the Fab) and the constant region (which includes the Fc), providing a comprehensive structural basis for antibody engineering. This structural information is of great value for antibody engineering and optimization: researchers can design and modify antibodies to improve their affinity, stability and therapeutic effect by changing the sequence or structure of the variable region (VH/VL) and the constant region (CH1, CH2, CH3, etc.).
AbDb
AbDb (Antibody Database) is a valuable resource for structural immunologists, dedicated to collecting and making available processed antibody structural data, including pre-numbered structures of Fv fragments and, where available, their cognate antigens. The resource is used for applications such as antibody modeling, antigen-binding loop conformation analysis and epitope conformation studies. Key features of AbDb include: (1) It provides processed PDB files containing the variable domains (VH and VL), split into individual antibodies with their cognate antigens (including multichain and non-protein antigens). (2) Data are numbered using the Kabat, Chothia and Martin numbering schemes. (3) It provides 36 categorized datasets containing complete, light-chain-only or heavy-chain-only antibodies, grouped according to the type of antigen with which the antibody forms a complex (e.g., protein or non-protein antigens). (4) It provides a non-redundant version of each dataset, with information describing the redundant clusters. (5) When searching by PDB code, it reports redundancy information and lists all processed PDB files that contain redundant antibodies. (6) It provides a detailed information file on complexed and uncomplexed antibodies and defines which antibodies are considered antigens.
ATCC
ATCC (American Type Culture Collection) is a world-renowned biological resource center that provides a large number of antibody products and related genetic information. Through its online platform, researchers can find and order the antibodies they need. The ATCC website contains a detailed catalogue of antibody products that can be screened and searched by target, application field, species source and other dimensions, so that users can quickly find suitable antibody resources. The antibodies provided by ATCC include, but are not limited to, mAbs, polyclonal antibodies, recombinant antibodies, antibody conjugates and specific antibodies (for example, antibodies against certain pathogens or cell surface markers). ATCC also provides technical support to help users solve problems in antibody applications. In addition to standardized antibody products, ATCC offers customized antibody services: users can order custom mAbs or polyclonal antibodies that match their own research objectives.
VBASE2
VBASE2 is a database providing germline sequences of the human and mouse immunoglobulin (Ig) variable (V) genes, primarily for immunological studies. VBASE2 integrates data from multiple sources, such as the EMBL, IMGT, Kabat and VBASE databases, to cover the germline sequences of the V genes. The database not only provides the genomic sequence, but also links to the V gene in the Ensembl Genome Browser and supports data presentation through a DAS server. Links to references for rearranged sequences are provided, and matches between rearranged V genes and VBASE2 entries are indicated. VBASE2 data are generated automatically from EMBL, Ensembl and other high-throughput data sources via BLAST searches, and are updated regularly to stay synchronized with the latest genome sequence and rearrangement information. DNAPLOT software is used to align and compare gene sequences, and synthetic sequences are identified and removed automatically. The V genes are classified into three classes: (1) sequences for which both a genomic reference and a rearranged sequence are known; (2) genomic sequences that have not been found in a rearrangement and therefore lack evidence of function (including pseudogenes and orphan genes); (3) sequences that have been observed in multiple V(D)J rearrangements but lack a genomic reference. Each V gene entry includes reference links to EMBL and Ensembl, and may include BAC sequence information as well as associated annotations. VBASE2 can be queried by species, V gene locus, V gene family and other options, and supports retrieval of information from multiple databases. The DNAPLOT query compares a complete V gene sequence with the VBASE2 dataset, returns the alignment with the best-matching V gene, and assigns the V gene family automatically.
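The three-class scheme can be read as a decision over two kinds of evidence: whether a genomic (germline) reference exists, and how many rearranged sequences use the gene. The Python sketch below encodes one such reading. The function name, its inputs and the threshold of two rearrangements for class 3 are assumptions made for illustration, not VBASE2's implementation.

```python
def assign_vbase2_class(has_genomic_reference, rearrangement_count):
    """Assign a V gene sequence to class 1, 2 or 3, or None if unsupported.

    Assumed reading of the scheme:
      class 1: genomic reference and at least one rearranged sequence
      class 2: genomic reference only (no evidence of function)
      class 3: no genomic reference, seen in several rearrangements
    """
    if rearrangement_count < 0:
        raise ValueError("rearrangement_count cannot be negative")
    if has_genomic_reference and rearrangement_count > 0:
        return 1
    if has_genomic_reference:
        return 2
    if rearrangement_count >= 2:  # assumed threshold for "multiple"
        return 3
    return None  # not enough evidence to create an entry


print(assign_vbase2_class(True, 5))
print(assign_vbase2_class(True, 0))
print(assign_vbase2_class(False, 3))
```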
BCRdb
BCRdb is a database dedicated to storing and analyzing gene, protein and diversity data related to the B cell receptor (BCR). It brings together BCR-related genes and variation information, including germline sequences of the human and mouse immunoglobulin heavy chain (IGH) and light chain (IGK, IGL) genes, as well as variation data related to BCR rearrangement. BCRdb provides detailed information on BCR gene rearrangements, mutations and their biological effects. Rearrangement is the main source of BCR diversity; the database shows patterns of rearrangements and mutations and explores their potential effects on immune system function. BCRdb also supports immune repertoire analysis: researchers can query the repertoires of different B cell populations, including the diversity and affinity of BCR clones, which helps in understanding antibody affinity maturation and the dynamics of the B cell response. BCRdb is cross-referenced with related databases (such as IMGT and VBASE2) and integrates annotation commonly used in immunological research. It can also analyze the diversity of BCR sequences, mutation types, gene rearrangement events and BCR-antigen binding information. The platform offers a simple web interface for text search and sequence alignment, and an API for programmatic access, so that developers can integrate its data into custom analysis pipelines. Users can download data, including immunoglobulin gene sequences, rearrangement patterns and gene mutation information, for offline analysis or export it to other analysis tools. The content of BCRdb is updated regularly to keep the data accurate and current.
HDB
Hybridoma Database (HDB) is a specialized database focusing on hybridoma technology and monoclonal antibody data. It systematically collects detailed information on a variety of hybridoma cell lines, including cell sources, culture conditions, fusion methods, immunogen types (such as proteins, peptides or whole-cell antigens) and key experimental parameters such as immunization protocols. The database also records practical information such as the growth characteristics, antibody secretion efficiency and stability of the cell lines. HDB provides detailed data on mAbs, covering the amino acid sequences of heavy and light chains, antibody classes (such as IgG, IgM, etc.) and subtype classification. The database pays special attention to the functional properties of antibodies, including their specificity (such as targeting bacteria, viruses or tumor markers), binding affinity (Kd value) and performance in ELISA, immunohistochemistry, flow cytometry and other assays. HDB also includes functional validation data on antibodies in virus neutralization, blocking of pathological processes and quantitative analysis. HDB provides an efficient search function that supports multi-dimensional queries by target, cell line source, antibody type and application field.
References
- Lefranc MP, Lefranc G. "Immunoglobulins or Antibodies: IMGT® Bridging Genes, Structures and Functions." Biomedicines. 2020;8(9):319. doi: 10.3390/biomedicines8090319
- Sanou G, Manso T, Todorov K, Giudicelli V, Duroux P, Kossida S. "IMGT/mAb-KG: the knowledge graph for therapeutic monoclonal antibodies." Front Immunol. 2024;15:1393839. doi: 10.3389/fimmu.2024.1393839
- Xiong S, Liu Z, Yi X, Liu K, Huang B, Wang X. "NanoLAS: a comprehensive nanobody database with data integration, consolidation and application." Database (Oxford). 2024;2024:baae003. doi: 10.1093/database/baae003
- Mendes M, Mahita J, Blazeska N, Greenbaum J, Ha B, Wheeler K, Wang J, Shackelford D, Sette A, Peters B. "IEDB-3D 2.0: Structural data analysis within the Immune Epitope Database." Protein Sci. 2023;32(4):e4605. doi: 10.1002/pro.4605
- Raybould MIJ, Marks C, Lewis AP, Shi J, Bujotzek A, Taddese B, Deane CM. "Thera-SAbDab: the Therapeutic Structural Antibody Database." Nucleic Acids Res. 2020;48(D1):D383-D388. doi: 10.1093/nar/gkz827
- Karuppasamy MP, Venkateswaran S, Subbiah P. "PDB-2-PBv3.0: An updated protein block database." J Bioinform Comput Biol. 2020;18(2):2050009. doi: 10.1142/S0219720020500092
- Ferdous S, Martin ACR. "AbDb: antibody structure database-a database of PDB-derived antibody structures." Database (Oxford). 2018;2018:bay040. doi: 10.1093/database/bay040
- Retter I, Althaus HH, Münch R, Müller W. "VBASE2, an integrative V gene database." Nucleic Acids Res. 2005;33(Database issue):D671-4. doi: 10.1093/nar/gki088