Edman degradation, also known as stepwise N-terminal sequencing, is a classic technique for determining the N-terminal amino acid sequence of a protein or polypeptide by directional chemical cleavage and identification of one residue at a time. The method was introduced by the Swedish biochemist Pehr Edman in 1950 and was automated in 1967 (liquid-phase sequencer). Automation provided the first high-throughput solution for protein primary structure analysis, in the same era that produced landmark protein sequences such as insulin (1955) and ribonuclease (1960).
Although mass spectrometry (for example, de novo sequencing) has gradually become the mainstream approach, Edman degradation remains hard to replace because it reads the sequence directly, without depending on a database, and can directly verify modification sites. Typical scenarios include: N-terminal sequencing of unknown proteins from new species; identification of truncated variants of synthetic peptides; and independent verification of regions where mass spectrometry signals are ambiguous.
Reaction steps
1. Coupling of PITC to the N-terminal amino group of the protein
- Principle: Phenyl isothiocyanate (PITC) reacts with the N-terminal α-amino group of the protein under weakly alkaline conditions (pH 9.0) to form a phenylthiocarbamoyl (PTC) derivative.
- Key details: Specificity control: the ε-amino group of lysine side chains should be blocked in advance (for example, by acetylation) to avoid side reactions. Reaction conditions: an inert gas (such as nitrogen) is needed to protect the reagent from oxidative decomposition.
- Common problem: If the side chains are not blocked, multiple labeling may occur and interfere with subsequent results.
2. Trifluoroacetic acid (TFA) cleavage
- Principle: Under anhydrous acidic conditions (TFA), the peptide bond next to the PTC-labeled residue breaks, releasing the N-terminal residue as an unstable anilinothiazolinone (ATZ) derivative and leaving the peptide chain one residue shorter.
- Key details: Generation of PTH: the released derivative is converted in aqueous acid to the stable phenylthiohydantoin (PTH) amino acid; this conversion is described here together with cleavage rather than as a separate step. Condition control for cleavage: keep the environment strictly water-free to prevent hydrolysis of the peptide chain, and maintain a low temperature (4 ℃) to reduce side reactions.
- Common problem: Incomplete cleavage leaves uncleaved peptide that carries over into the next cycle and interferes with it.
3. Separation and extraction of PTH-amino acids
- Method: The hydrophobic PTH-amino acids are extracted with an organic solvent (such as diethyl ether or ethyl acetate), leaving the hydrophilic peptide chain behind.
- Solvent selection: PTH-amino acids are highly soluble in ether, but its volatility requires care; ethyl acetate is safer.
- Purification step: after extraction, dry the extract under a stream of nitrogen and redissolve it in methanol to improve HPLC injection efficiency.
4. HPLC chromatographic analysis
- Separation mechanism: Reversed-phase HPLC (C18 column) separates PTH-amino acids by their differences in hydrophobicity, using an acetonitrile-water gradient as the mobile phase.
- Detection method: UV detection (254 nm), because the PTH group absorbs strongly.
- LC-MS: Complex samples can be verified by mass spectrometry to avoid misassignment caused by overlapping retention times.
5. Chromatogram comparison and amino acid identification
- Standard library: a retention-time database of known PTH-amino acid standards must be established.
- Note: Isomers (such as isoleucine and leucine) need to be distinguished by high-resolution chromatography or mass spectrometry.
6. Cyclic degradation sequencing
- Efficiency per cycle: about 98%. Losses compound, so after 50 cycles the signal has decayed to roughly 36% of its starting level (0.98^50 ≈ 0.36), which limits the sequencing length (usually 30-50 residues).
- Error accumulation: key positions should be verified by repeat experiments or with N-terminally blocked samples.
- Automation: an automated Edman sequencer runs the cycles at high throughput and reduces human error.
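The link between per-cycle efficiency and usable read length in step 6 is simple compounding. The following is a minimal sketch, assuming a constant per-cycle efficiency and ignoring background and carry-over; the function names are illustrative and not part of any sequencer software.

```python
import math


def cumulative_yield(per_cycle_yield: float, cycles: int) -> float:
    """Fraction of the starting signal remaining after `cycles` rounds,
    assuming the same efficiency in every round."""
    return per_cycle_yield ** cycles


def cycles_until_below(per_cycle_yield: float, threshold: float) -> int:
    """Smallest number of cycles after which the remaining fraction
    is strictly below `threshold`."""
    return math.floor(math.log(threshold) / math.log(per_cycle_yield)) + 1


if __name__ == "__main__":
    for n in (10, 30, 50):
        print(n, round(cumulative_yield(0.98, n), 3))
    print(cycles_until_below(0.98, 0.30))
```

With an efficiency of 98%, this model gives about 0.36 after 50 cycles and first drops below 0.30 at cycle 60. Small changes in efficiency shift these numbers considerably, which is why read length is so sensitive to reaction conditions.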
Edman chemistry and automated N-terminal sequence analysis (Reim DF et al., 2001).
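The retention-time comparison in step 5 can be sketched as a lookup with a tolerance window. The retention times below are hypothetical placeholders, not measured values; real values depend on the column, gradient and instrument, and the function name and tolerance are assumptions of this sketch.

```python
from typing import Optional

# Hypothetical retention times in minutes for a few PTH-amino acid standards.
# These numbers are illustrative only and must be replaced by measured values.
STANDARDS = {
    "PTH-Gly": 8.4,
    "PTH-Ala": 10.1,
    "PTH-Val": 15.7,
    "PTH-Ile": 18.9,
    "PTH-Leu": 19.2,
}


def identify_peak(rt: float, standards: dict, tol: float = 0.2) -> Optional[str]:
    """Return the single standard within `tol` minutes of `rt`.

    Returns None when no standard matches or when more than one matches,
    since an ambiguous peak should not be assigned automatically.
    """
    hits = [name for name, ref in standards.items() if abs(rt - ref) <= tol]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    print(identify_peak(10.0, STANDARDS))   # close to one standard only
    print(identify_peak(19.05, STANDARDS))  # falls between two standards
```

Returning None for a peak that lies between two standards mirrors the note on leucine and isoleucine: such cases call for better chromatographic resolution or an orthogonal check rather than a forced assignment.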
Edman degradation and Sanger sequencing
Edman degradation is mainly used to determine the amino acid sequence of a protein or peptide chain. Its principle is to remove the N-terminal amino acid of the protein or peptide step by step, one residue at a time, and to identify each released residue by chromatography. Specifically, the N-terminal amino acid is labeled with phenyl isothiocyanate (PITC), cleaved off under acidic conditions, and converted into a stable phenylthiohydantoin (PTH) amino acid. The PTH-amino acids are then separated and analyzed by high-performance liquid chromatography (HPLC) to determine their identities. The Edman degradation method is especially suitable for sequence analysis of small proteins or peptides, although its efficiency is low for long-chain proteins.
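The stepwise logic described above can be illustrated with a toy model that treats the peptide as a string and releases one N-terminal residue per cycle. This is only a sketch of the bookkeeping, assuming perfect chemistry; the example sequence and the function name are arbitrary.

```python
def edman_cycles(peptide: str, max_cycles: int):
    """Yield (cycle, released_residue, remaining_peptide) for each round.

    Stops early when the peptide is exhausted.
    """
    remaining = peptide
    for cycle in range(1, max_cycles + 1):
        if not remaining:
            break
        # The N-terminal residue is released; the chain shortens by one.
        released, remaining = remaining[0], remaining[1:]
        yield cycle, released, remaining


if __name__ == "__main__":
    for cycle, residue, rest in edman_cycles("GIVEQ", 3):
        print(cycle, residue, rest)
```

Reading the released residues in cycle order reconstructs the sequence from the N-terminus, which is exactly what the chromatographic identification does in each round.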
Sanger sequencing is mainly used to determine the nucleotide sequence of DNA or RNA. Its principle is to interrupt extension of the DNA strand during synthesis by incorporating chain-terminating dideoxynucleotides (ddNTPs). Reaction mixtures containing the different ddNTPs produce a series of DNA fragments of different lengths, which are separated by capillary electrophoresis so that the DNA sequence can be read. Sanger sequencing is a traditional and widely used DNA sequencing technology, often applied in genome and transcriptome analysis and other DNA-related research, including large-scale genome sequencing, gene mutation analysis and single nucleotide polymorphism (SNP) detection.
If you want to know more about the difference between Edman degradation and Sanger sequencing, please refer to "Edman Degradation vs Sanger Sequencing".
Advantages and disadvantages
- Advantages: The N-terminal amino acid sequence of a protein can be determined accurately. Sample requirements are relatively low, and large quantities are usually not needed. The procedure is mature and widely used for sequence analysis of small proteins and peptides.
- Disadvantages: For long proteins or peptides, degradation efficiency falls and complete analysis of long sequences becomes difficult. Proteins with complex structures or modifications cannot be processed directly. The experimental procedure is relatively cumbersome and requires sophisticated instrumentation (e.g., HPLC).
C terminal and N terminal sequencing
Protein N-terminal sequencing analyzes the amino terminus of a protein (the beginning of the amino acid sequence); commonly used methods include Edman degradation and mass spectrometry. Protein C-terminal sequencing analyzes the carboxyl terminus (the end of the amino acid sequence); commonly used techniques include the carboxypeptidase method, chemical methods and mass spectrometry.
C-terminal sequencing is technically more complicated than N-terminal sequencing, and the analysis is especially challenging when the C-terminus is modified or resistant to enzymatic digestion. N-terminal sequencing is better suited to studying functional domains and post-translational modifications, while C-terminal sequencing focuses more on protein maturation and processing mechanisms.
In addition, because of protein folding and post-translational modification, the C-terminus is harder to resolve than the N-terminus, especially in polypeptide chains or complex protein mixtures.
To learn more about the differences between C-terminal and N-terminal protein sequencing, please refer to "C terminal and N terminal Sequencing".
Applications
Study of protein structure and function
The Edman degradation method is often used as an auxiliary technique when analyzing the tertiary structure of proteins and studying their functions. Sequence analysis of different protein fragments helps scientists understand a protein's functional sites, active regions and interactions with other molecules.
- Chromaffin cells secrete bioactive molecules, mainly neuropeptides and catecholamines, that mediate intercellular communication and thereby participate in various physiological functions, including the stress response and hormonal regulation (such as cardiovascular regulation). This study focused on the soluble components of chromaffin granules, especially the proteolytic fragments of chromogranin A (CgA) and chromogranin B (CgB). Two-dimensional gel electrophoresis was used to separate the soluble proteins of the vesicles, and the proteins were then identified by NH2-terminal Edman sequencing. CgA was found to produce multiple fragments by cleavage at specific dibasic and monobasic sites, and these fragments constitute the main protein component of chromaffin secretory vesicles. CgB undergoes similar proteolysis at dibasic and monobasic residue sites, yielding multiple CgB fragments that are also important components of the vesicles. Overexpression of CgA can promote granule formation, while its down-regulation leads to granule loss and impaired secretion. ProSAAS, a precursor protein, was also identified as a proteolytic fragment; the fragment was generated by cleavage at a specific RR dibasic site, suggesting involvement in the processing of neuropeptide and hormone precursors. Comparison of the bovine and mouse proSAAS sequences further supports a role for this fragment in chromaffin granules. In short, the study reveals the complex processing of CgA and CgB in chromaffin granules and provides new insight into their biological functions in secretory granules and their roles in the neuroendocrine system (Lee JC et al., 2009).
- To assign the disulfide bonds in the spider toxin HNTX-XXI, the research team adopted a multi-enzyme digestion strategy and used two-step digestion to solve the assignment problem caused by multiple cysteine residues (such as Cys2/Cys16/Cys17). First, trypsin was used for primary digestion, cleaving the peptide bonds on the carboxyl side of arginine (Arg) and lysine (Lys). Then Staphylococcus aureus V8 protease was used for secondary cleavage, targeting the peptide bond on the carboxyl side of glutamic acid (Glu). This double-enzymatic digestion strategy breaks complex peptides into smaller pieces and avoids the ambiguity caused by adjacent cysteine residues remaining in the same fragment. In the HPLC separation, peptides containing a single disulfide bond, such as peptide A (Cys10-Cys23) and peptide B (Cys14-Cys63), were successfully separated on a C18 reversed-phase column with an acetonitrile-water gradient. For complex fragments containing adjacent cysteine residues, low-pH conditions were used to improve resolution. The target peptides were then analyzed by Edman degradation, which established the disulfide pairings Cys2-Cys17 and Cys16-Cys36. MALDI-TOF MS analysis further verified the existence of the disulfide bonds and confirmed their specific connection sites by comparing the molecular weights of the reduced and oxidized peptides. Finally, the disulfide bond network of HNTX-XXI was identified as Cys2-Cys17, Cys10-Cys23, Cys14-Cys63 and Cys16-Cys36 (Chen B et al., 2024).
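The two-step digestion used for HNTX-XXI can be sketched in silico. The code below is a simplified model, assuming that trypsin cleaves after every Lys or Arg and V8 protease after every Glu, with no missed cleavages and no proline exception. It also treats the chain as linear, whereas in a real disulfide-mapping experiment the fragments stay linked through their disulfide bonds. The input sequence is a made-up example, not the HNTX-XXI sequence.

```python
def cleave_after(peptide: str, residues: str) -> list:
    """Split `peptide` on the carboxyl side of any residue in `residues`."""
    fragments, current = [], ""
    for aa in peptide:
        current += aa
        if aa in residues:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments


def double_digest(peptide: str) -> list:
    """Trypsin-like digestion (after K/R) followed by V8-like digestion (after E)."""
    tryptic = cleave_after(peptide, "KR")
    return [frag for t in tryptic for frag in cleave_after(t, "E")]


if __name__ == "__main__":
    # Hypothetical sequence chosen only to show both cleavage rules.
    print(double_digest("ACKGECCRTEW"))
```

The second digestion is what separates fragments that a single enzyme would leave together, which is the point of the double-enzyme strategy when cysteines sit close to each other in the sequence.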
Biomarkers and disease research
In biomarker research, the Edman degradation method can be used to analyze specific proteins or peptides associated with diseases (such as cancer and neurodegenerative diseases). Accurate determination of the sequences of these peptides supports the discovery of markers and their application in disease diagnosis.
- Trop-2 is a transmembrane protein that is cleaved in the first thyroglobulin domain of its extracellular region, between residues R87 and T88. Antibody targeting and N-terminal Edman degradation showed that this cleavage leads to a profound rearrangement of the Trop-2 structure, which may affect its biological function. Trop-2 is not cleaved in normal human tissues but is cleaved in many tumors (such as skin, ovarian, colon and breast cancer). This post-translational modification may be specifically induced and may give cancer cells a growth advantage over normal cells; the cleavage at R87-T88 in particular indicates that Trop-2 undergoes specific structural changes after translation. Mass spectrometry confirmed that Trop-2 is cleaved at R87-T88, and the interaction between ADAM10 and Trop-2 was also confirmed. ADAM10 was found to activate Trop-2 by cleaving it at the R87-T88 site, which is important for cell growth and malignant transformation. Inhibition of ADAM10 significantly reduces Trop-2 cleavage and impairs cell proliferation and tumor formation. ADAM10 inhibitors or ADAM10 siRNA inhibit the growth of cells expressing Trop-2 but have no effect on cells expressing the A87-A88 mutant (Trop-2 that cannot be cleaved by ADAM10). In a mouse model, cells carrying the cleavage-resistant A87-A88 mutant could not activate tumor growth, and Trop-2 cleavage was also linked to tumor metastasis. In transfected colon cancer cells, the mutant Trop-2 significantly reduced metastasis volume, indicating that Trop-2 cleavage is a key step in activating metastasis. In breast cancer samples, the extent of Trop-2 cleavage varied widely, with some samples showing low and others high cleavage levels. This contrasts sharply with normal breast samples, in which Trop-2 was not cleaved. Overall, the proteolytic cleavage of Trop-2 plays a key role in the occurrence, development and metastasis of cancer, and its cleavage mechanism may provide a potential target for new cancer treatment strategies (Trerotola M et al., 2021).
Protein sequence analysis
The Edman degradation method is mainly used to analyze the amino acid sequence of small proteins or peptides.
- Hemiptera is one of the most species-rich insect orders and includes many agriculturally destructive species, such as whiteflies, psyllids and aphids. Some Hemiptera (such as kissing bugs) are also medical pests. Lipid oxidation plays a key role in powering flight muscle contraction, and in aquatic bugs it also provides ATP for the leg muscles used in swimming. Adipokinetic hormone (AKH) regulates the mobilization of lipids (mainly triacylglycerol) from the fat body to the hemolymph and so provides energy for insect flight and swimming. By searching the literature and genome and transcriptome databases, the researchers collected AKH sequences from 191 hemipteran species. Most AKH sequences were predicted by bioinformatics, and only a few were verified by Edman degradation or mass spectrometry. A total of 42 different AKH sequences were identified, of which about 50% are shared with other insect orders (such as Diptera and Lepidoptera), while the remaining 50% are unique to Hemiptera. The study also found nine novel AKHs that had not been found in any insect before. Most hemipteran AKHs are octapeptides, but the proportion of decapeptides is higher in Hemiptera than in other insect orders. The possibility of using aphid AKH sequences as lead peptides to develop green insecticides was also considered (Gäde G et al., 2022).
- Cone snails are carnivorous marine snails, and many species are venomous. Their venoms contain many active components known as conotoxins. αD-FrXXA, a conotoxin of the αD superfamily, modulates nicotinic acetylcholine receptors (nAChRs), which are associated with many nervous system diseases (such as Alzheimer's disease and Parkinson's disease), and shows a slight preference for inhibiting neuronal nAChRs. The mass spectrum of subfraction 4a shows two signals, a secondary signal (z = +1) at m/z 11,074.174 and a primary signal (z = +2) at m/z 5,513.533, corresponding to average masses of 11,073.164 Da and 11,025.046 Da, respectively. The difference between them, 48.118 Da, may be related to γ-carboxylation. The mass spectrum of subfraction 4b shows similar signals: a secondary signal (z = +1) at m/z 11,115.118 and a primary signal (z = +2) at m/z 5,481.605, with corresponding average masses of 11,114.108 Da and 10,961.190 Da. Subfraction 4b was subjected to Edman degradation, which yielded a sequence of 26 amino acids. Some positions (such as 6, 17, 18 and 23) were blank and are presumed to be cysteine residues. Other positions (such as 8 and 9) gave more than one signal, which may indicate amino acid variation (such as S8P and T9R). By analyzing transcriptome data from the C. fergusoni venom duct, the researchers found precursor sequences similar to the amino acid sequence obtained from 4b. Transcriptomics helped to identify three precursors whose mature toxin regions resemble the sequence produced by Edman degradation of 4b. These precursors are 92 amino acids long and comprise a signal sequence, a propeptide region and the mature toxin. Precursors 1 and 2 are very similar and their predicted mature toxins are identical, so they are collectively referred to as toxin 1, while precursor 3 corresponds to toxin 2 (Rodriguez-Ruiz XC et al., 2022).
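The conversion from the m/z values quoted for subfractions 4a and 4b to average masses can be reproduced with a short calculation. This is a sketch that assumes each charge adds 1.010 Da, the value implied by the quoted figures (11,074.174 versus 11,073.164); the constant and function name are assumptions of this example, not values taken from the cited paper's methods.

```python
# Mass added per charge, in Da. Inferred from the numbers quoted in the text;
# treat it as an assumption of this sketch.
CHARGE_CARRIER_MASS = 1.010


def neutral_mass(mz: float, z: int, carrier: float = CHARGE_CARRIER_MASS) -> float:
    """Convert an observed m/z at charge z into the mass of the neutral molecule."""
    return z * mz - z * carrier


if __name__ == "__main__":
    m1 = neutral_mass(11074.174, 1)  # subfraction 4a, z = +1
    m2 = neutral_mass(5513.533, 2)   # subfraction 4a, z = +2
    print(round(m1, 3), round(m2, 3), round(m1 - m2, 3))
```

Applied to the 4a signals, this reproduces the quoted masses and their 48.118 Da difference; the same function applied to the 4b signals gives the two 4b masses listed in the text.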
Peptide sequence analysis
For the sequence analysis of small peptides, the Edman degradation method is very efficient.
- Fluorescence imaging has become an indispensable technology in chemical biology, especially for activity probes and for visualizing cellular structures. Small-molecule fluorophores are small, easy to use and modular, which gives them unique advantages in analytical applications. Xanthene-based fluorophores (such as fluorescein and rhodamine) have been widely used since the end of the 19th century. They have good spectral properties, but their main disadvantage is a tendency toward fluorescence self-quenching, which may affect experimental results. The authors used "fluorosequencing", a technique that identifies amino acid sequences by labeling peptides with fluorophores and analyzing them with single-molecule fluorescence microscopy. The technique uses Edman degradation to remove amino acids one at a time, so that the identity of the amino acids can be inferred from changes in the fluorescence signal. Twelve different peptides (with short or long linkers) were prepared to study the effect of fluorophore labeling on fluorescence lifetime (FLT) and quantum yield (QY). When a second fluorophore was introduced, the FLT and QY of the short-linker peptides decreased. For example, when the TMR-labeled short-linker peptide changed from single label (14) to double label (13), FLT decreased from 2.53 ns to 2.19 ns and QY decreased from 0.55 to 0.33, decreases of 13% and 40%, respectively. The short-linker peptides labeled with Atto 647N (16 to 15) showed similar changes, with FLT decreasing by 11% and QY by 15%. When a long linker (such as a (PEG)10 linker) is used, the fluorophores are held farther apart, which reduces quenching and improves the photophysical properties. For TMR, when the long-linker peptide changed from single label (20) to double label (19), FLT decreased from 2.56 ns to 2.31 ns and QY decreased from 0.51 to 0.39, decreases of 9.8% and 24%, respectively. The Atto 647N-labeled long-linker peptides (22 and 21) showed smaller decreases of 4.8% in FLT and 5.1% in QY. The study shows that longer linkers can effectively reduce fluorescence quenching and improve signal intensity, thereby optimizing sequencing results (Bachman JL et al., 2022).
- Cytotoxins (CTXs), the main components of cobra venom, induce apoptosis through different signaling pathways. Although no CTX derived from cobra venom has entered clinical trials so far, studies suggest that they may have anti-cancer potential, especially for diseases with a disordered apoptosis mechanism. In a study on glioblastoma, peptides were isolated from the venom of two Naja species and their antiproliferative activities were examined. The researchers reduced the peptides with dithiothreitol (DTT) and alkylated them with 4-vinylpyridine (4VP). The sequences of the treated peptides were then analyzed by N-terminal Edman sequencing, and BLAST alignment was carried out to confirm their homology. The results showed that CTNsen1, CTNanc1 and CTNanc2 had 100% sequence identity with known cytotoxins, indicating that their structures are highly conserved. The C-terminal sequence of CTNsen1 was successfully identified by MALDI TOF-TOF MS/MS, which indicated that the peptide is a cytotoxin. Further BLAST analysis showed that CTNsen2, CTNsen3 and CTNanc3 share up to 98.33% similarity with known cytotoxins. This suggests that these toxins may be new variants, which may affect their biological functions and interactions. CTNsen1 and CTNanc1 are classified as S-type cytotoxins, while CTNsen2, CTNsen3, CTNanc2 and CTNanc3 are classified as P-type cytotoxins. The two types differ significantly in their mode of action: S-type cytotoxins interact with the cell membrane mainly through electrostatic interactions, while P-type cytotoxins cause extensive membrane damage by inserting deeply into the membrane. CTXs have four conserved disulfide bonds, which are very important for the structural stability of the toxins, and some of these bonds have a greater impact on toxin function than others (Boughanmi Y et al., 2024).
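The percentage decreases quoted for the fluorosequencing study above follow directly from the before and after values of fluorescence lifetime and quantum yield. A minimal sketch of that arithmetic, using only the TMR figures given in the text (the function name is illustrative):

```python
def percent_decrease(before: float, after: float) -> float:
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    # TMR, short linker: single label -> double label
    print(round(percent_decrease(2.53, 2.19), 1), round(percent_decrease(0.55, 0.33), 1))
    # TMR, long linker: single label -> double label
    print(round(percent_decrease(2.56, 2.31), 1), round(percent_decrease(0.51, 0.39), 1))
```

The short-linker values come out near 13% and 40%, and the long-linker values near 9.8% and 24%, matching the figures reported in the summary.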
Sequence confirmation of recombinant protein
In the production of recombinant proteins, the Edman degradation method can be used to confirm that a protein obtained from a genetically engineered expression system has the expected amino acid sequence.
- The pathological protein aggregates observed in patients with Parkinson's disease and Alzheimer's disease commonly contain α-synuclein (aSyn), so the study of aSyn is very important for understanding these diseases. Researchers developed a method to purify monomeric aSyn without chromatography or denaturants, taking advantage of the amyloidogenic properties of aSyn. Full-length untagged aSyn was expressed in Escherichia coli, the cells were disrupted by sonication, and the aggregates were removed by centrifugation. After a series of centrifugation steps, purified water-soluble aSyn was obtained. aSyn can induce E. coli proteins to aggregate and form fibrillar structures. After removal of the aggregates, the final aSyn was identified by Western blot and Edman sequencing, and its mass was determined by mass spectrometry (MALDI TOF-MS) (Kamboj S et al., 2021).
References
- Lee JC, Hook V. "Proteolytic fragments of chromogranins A and B represent major soluble components of chromaffin granules, illustrated by two-dimensional proteomics with NH(2)-terminal Edman peptide sequencing and MALDI-TOF MS." Biochemistry. 2009;48(23):5254-62. doi: 10.1021/bi9002953
- Chen B, He J, Hu Z, Zeng X. "Assignment of Disulfide Bonds in HNTX-XXI by Double-Enzymatic Digestion and Edman Degradation." J Am Soc Mass Spectrom. 2024;35(12):3089-3094. doi: 10.1021/jasms.4c00319
- Trerotola M, Guerra E, Ali Z, Aloisi AL, Ceci M, Simeone P, Acciarito A, Zanna P, Vacca G, D'Amore A, Boujnah K, Garbo V, Moschella A, Lattanzio R, Alberti S. "Trop-2 cleavage by ADAM10 is an activator switch for cancer growth and metastasis." Neoplasia. 2021;23(4):415-428. doi: 10.1016/j.neo.2021.03.006
- Rodriguez-Ruiz XC, Aguilar MB, Ortíz-Arellano MA, Safavi-Hemami H, López-Vera E. "A Novel Dimeric Conotoxin, FrXXA, from the Vermivorous Cone Snail Conus fergusoni, of the Eastern Pacific, Inhibits Nicotinic Acetylcholine Receptors." Toxins (Basel). 2022;14(8):510. doi: 10.3390/toxins14080510
- Gäde G, Marco HG. "The Adipokinetic Peptides of Hemiptera: Structure, Function, and Evolutionary Trends." Front Insect Sci. 2022;2:891615. doi: 10.3389/finsc.2022.891615
- Bachman JL, Wight CD, Bardo AM, Johnson AM, Pavlich CI, Boley AJ, Wagner HR, Swaminathan J, Iverson BL, Marcotte EM, Anslyn EV. "Evaluating the Effect of Dye-Dye Interactions of Xanthene-Based Fluorophores in the Fluorosequencing of Peptides." Bioconjug Chem. 2022;33(6):1156-1165. doi: 10.1021/acs.bioconjchem.2c00103
- Boughanmi Y, Berenguer-Daizé C, Balzano M, Mosrati H, Moulard M, Mansuelle P, Fourquet P, Torre F, de Pomyers H, Gigmes D, Ouafik L, Mabrouk K. "Antiproliferative Effects of Naja anchietae and Naja senegalensis Venom Peptides on Glioblastoma Cell Lines." Toxins (Basel). 2024;16(10):433. doi: 10.3390/toxins16100433
- Kamboj S, Harms C, Kumar L, Creamer D, West C, Klein-Seetharaman J, Sarkar SK. "A method of purifying alpha-synuclein in E. coli without chromatography." Heliyon. 2021;7(1):e05874. doi: 10.1016/j.heliyon.2020.e05874