Background: Catechins, one of the characteristic metabolites of tea, together with theanine and caffeine, determine the taste and quality of tea. Type 1A serine carboxypeptidase-like acyltransferases (SCPL1As) are known to play an important role in the synthesis of catechins in various plants. At present, the phylogenetic relationships among SCPL1A genes in tea plants and their regulatory mechanisms are still unclear.
Results: In this study, the evolutionary relationship and family structure of the SCPL1A genes in tea plants were determined. Promoter analysis of the SCPL1A genes indicated that photosynthesis likely affects the biosynthesis of catechins. RNA-seq data were used to analyze the differential expression of the SCPL1A genes under mechanical injury and insect feeding stress. In total, 14 genes were found to be differentially expressed in a similar manner under both stresses. The RNA-seq results were further verified by quantitative real-time PCR (qRT-PCR). In addition, the correlations between the expression of SCPL1A genes and transcription factors under different stresses were investigated by weighted gene co-expression network analysis (WGCNA). This analysis revealed that mechanical injury led to an increase in the interaction of transcription factors with SCPL1A genes, while insect feeding stress led to a decrease in transcription factor interaction.
Conclusion: We analyzed the evolutionary relationship, structures, and promoters of the SCPL1A gene family, and discovered that light may affect the biosynthesis of catechins through SCPL1A genes. We also identified several genes with differential expression under mechanical wounding and insect stresses. The regulatory relationships between SCPL1A genes and transcription factors were analyzed by WGCNA, which provided an important reference for further study on how tea plants produce catechins.
Keywords: Co-expression network analysis; RNA-seq; SCPL1A; Tea plant; Transcription factor
Tea (Camellia sinensis) is one of the world's most common non-alcoholic beverages [1-3], and is consumed both for its taste and health benefits. Tea has been cultivated for nearly 5,000 years [4] and its production can be traced back to the medicinal liquor of the Chinese Shang Dynasty [5]. At present, the main tea varieties in China belong to the genus Camellia (Camellia sinensis) [6,7], accounting for about 67% of all tea production. Xinyang is located in the old revolutionary area of Dabie Mountain, and its geological structure is not conducive to the cultivation of traditional food crops. However, the mild climate and abundant rainfall provide ideal conditions for the growth of Camellia.
Weighted gene co-expression network analysis (WGCNA) is used to analyze large-scale gene expression data [8,9], and this correlation-based technique enables transcriptome data to be transformed into a visual co-expression network between genes [10]. The technique has been successfully used to identify genetic modules associated with drought and bacterial stress in Arabidopsis thaliana and rice [11]. Module allocation in WGCNA is a flexible process that reduces the complexity of a dataset from hundreds of genes to a small number of modules.
A growing number of studies have shown that diverse and complex secondary metabolites help plants adapt to a variety of biotic and abiotic stress environments. Serine carboxypeptidase-like (SCPL) genes are involved in many plant metabolic and growth processes and play an important role in disease resistance and abiotic stress tolerance. Previous studies have shown that an allele of the AsSCPL gene in oats increases resistance to soil pathogens in oat roots [12]. Studies in Arabidopsis, rice, wheat, and barley have also found that the SCPL49 gene plays a significant role in drought tolerance [13]. SCPL19 knockouts in potato and Arabidopsis thaliana were found to have increased tolerance to water stress [14]. The rice SCPL1 gene has been shown to play an important role in abscisic acid (ABA) response and abiotic stress tolerance [15], while SCPL46 is a major regulator of rice seed germination and seed size [16]. Studies in poplars have found that the SCPL27 and SCPL40 genes participate in the process of membrane lipid peroxidation through the conduction of reactive oxygen species and also play important roles in disease resistance [17]. Studies in rapeseed have shown that SCPL17 is a key promoter of oil synthesis [18]. In addition, SCPL1A (type 1A serine carboxypeptidase-like acyltransferase) genes participate in secondary metabolism in various plants [19]. SCPL genes have also been shown to catalyze a step in the synthesis of indole-3-acetate, which plays a role in stress adaptation [20].
Recent studies further analyzed SCPL1A regulatory networks in different tissues of different tea plants and identified 22 SCPL1A genes, 86% of which were generated by duplications [2]. Gene duplication is considered to be the main driving force of plant evolution and diversity [21,22]. Moreover, genome studies of Camellia sinensis (Puer tea) have been completed, and the key genes involved in the synthesis of important secondary metabolites of tea have been identified [1,23]. The transferases UGGT and ECGT encoded by the SCPL1A genes have been shown to play a significant catalytic role in the synthesis of tea polyphenols [24]. It was also found that the expression of SCPL1A genes was significantly increased in long-pruned tea plants compared with unpruned wild Camellia sinensis, leading to an increase in polyphenols in pruned tea [25]. This indicates that SCPL1A genes are likely involved in the synthesis of secondary metabolites. However, during the process of evolution, duplicated genes constantly appear and develop new functions, meaning that SCPL1A gene functions may have diverged significantly from those of other SCPL family members [26]. Tea plants are susceptible to Ectropis oblique in warm and humid climates. The tea plant has evolved a series of mechanisms in response to two kinds of stimuli related to E. oblique feeding: mechanical injury and oral secretion stimuli [27].
SCPL1A has been duplicated a number of times, and these duplicates are often differentially expressed during different stress conditions. This indicates that SCPL1As may be closely related to environmental signal transduction and transcription factor regulation. Moreover, as an important gene family for the synthesis of tea polyphenols, SCPL1A genes play crucial roles in the synthesis of catechins. Therefore, the identification of SCPL1A genes associated with environmental stress signaling and the construction of their interaction networks are key steps in the study of tea polyphenol synthesis.
Transcriptome sequencing and data availability
Total RNA was extracted using TRIzol reagent (Invitrogen). RNA-seq libraries were constructed according to the manufacturer’s protocol and were sequenced on the BGISEQ-500. We selected four time points for both mechanical injury and Ectropis oblique stress: 3 h, 12 h, 24 h and 48 h. The Camellia sinensis reference genome [2,23] was downloaded from TPIA: http://tpia.teaplant.org/. Salmon v0.8.2 was used with default parameters to map all the clean reads to the coding sequences (CDS) of all genes and to count the number of reads mapped to the CDS of each gene. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed using KOBAS software. DEGs were considered significantly enriched in a metabolic pathway at a q-value < 0.05 compared with the whole transcriptome background.
Determination of gene expression and differentially-expressed genes
The expression level of each gene was calculated by the FPKM method. First, Bowtie2 (version 2.1.0, http://bowtie-bio.sourceforge.net/bowtie2/index.shtml) was used to align the reads to annotated gene models. Based on the Bowtie2 mapping results, the FPKM values for each gene were then calculated by RSEM (version 1.2.29) using the default parameters [28]. We then applied edgeR [29] and DESeq2 [30] to the count matrix to identify differentially-expressed genes (DEGs). Genes with |log2FC| > 1 and a false discovery rate (FDR, adjusted p-value) < 0.05 were counted as DEGs. FPKM values were also used to build the co-expression network. The TPIA online website (http://tpia.teaplant.org/) was used to identify the possible biological processes associated with differentially-expressed SCPL1A genes, as well as their molecular features [23].
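The DEG criterion stated above (|log2FC| > 1 and FDR < 0.05) can be illustrated with a minimal Python sketch. It assumes the log2 fold changes and FDR values have already been produced by a tool such as edgeR or DESeq2; the record type, function names, and gene identifiers are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene_id: str   # hypothetical identifier, not a real annotation
    log2fc: float  # log2 fold change (treatment vs. control)
    fdr: float     # false discovery rate (adjusted p-value)


def is_deg(result, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Return True when a gene passes both cutoffs given in the text."""
    return abs(result.log2fc) > lfc_cutoff and result.fdr < fdr_cutoff


def select_degs(results, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Return the IDs of all genes that count as DEGs."""
    return [r.gene_id for r in results if is_deg(r, lfc_cutoff, fdr_cutoff)]


# Made-up values for illustration only.
example = [
    GeneResult("gene_a", 2.3, 0.001),   # up-regulated, significant
    GeneResult("gene_b", -1.8, 0.020),  # down-regulated, significant
    GeneResult("gene_c", 0.6, 0.001),   # significant, but fold change too small
    GeneResult("gene_d", 3.1, 0.200),   # large fold change, not significant
]
degs = select_degs(example)
```

Both conditions must hold at once, so a gene with a large fold change but a weak FDR (such as `gene_d` here) is not counted.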
Multiple sequence alignment of SCPL1A family proteins, phylogenetic analysis and exon/intron identification
Multiple alignments of SCPL1A family proteins were performed using default parameters in DNAMAN (Lynnon Biosoft, USA). The full-length protein sequences of tea plant, Arabidopsis thaliana, Glycine max, Oryza sativa, Solanum lycopersicum, and Zea mays were downloaded from the NCBI protein database to study their evolutionary relationships. Using the MEGA6 program, the Neighbor-Joining (NJ) [31] method with Poisson correction was employed to construct the phylogenetic tree, and bootstrap analysis was performed with 1,000 replicates. The exon/intron gene structure map was obtained using the Gene Structure Display Server 2.0 (http://gsds.cbi.pku.edu.cn/) [32]. The MEME online program for protein sequence analysis (http://meme.nbcr.net/meme/intro.html) was used to identify conserved motifs in the identified SCPL1A proteins [33]. The optimization parameters were as follows: number of repetitions, any; maximum number of motifs, 10; optimal width for each motif, 6 to 100 residues.
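As an illustration of the Poisson correction mentioned above, the following Python sketch computes the Poisson-corrected distance d = -ln(1 - p) between two aligned protein sequences, where p is the proportion of differing sites. It is not MEGA's implementation: ignoring columns that contain a gap is an assumption made here, and the toy sequences are invented.

```python
import math


def p_distance(seq1, seq2, gap="-"):
    """Proportion of differing residues between two aligned sequences.

    Columns where either sequence has a gap are skipped (an assumption
    of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def poisson_distance(seq1, seq2):
    """Poisson-corrected amino acid distance: d = -ln(1 - p)."""
    p = p_distance(seq1, seq2)
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)


# Toy aligned sequences: one of four comparable sites differs (p = 0.25).
d = poisson_distance("ACDE", "ACDF")
```

The correction accounts for multiple substitutions at the same site, so d is always at least as large as p; a distance matrix of such values is what a neighbor-joining procedure takes as input.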
Promoter prediction analysis
To investigate cis-acting elements of the predicted promoter regions, the online site PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) was used to analyze the genomic sequence from the start codon to 2000 bp upstream of each SCPL1A gene.
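The extraction of a 2000 bp upstream region can be sketched as follows. This is a minimal illustration rather than the procedure actually used: it assumes the chromosome sequence is available as a plain string, that coordinates are 0-based on the plus strand, and that `start_pos` is the position of the first base of the start codon (for a minus-strand gene, the highest plus-strand coordinate of the codon). The function names and toy sequences are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, start_pos, strand, length=2000):
    """Return up to `length` bases immediately upstream of the start codon.

    The result is always written 5'->3' relative to the gene, and is
    truncated if the gene lies near the end of the sequence.
    """
    if strand == "+":
        begin = max(0, start_pos - length)
        return chrom_seq[begin:start_pos]
    if strand == "-":
        end = min(len(chrom_seq), start_pos + 1 + length)
        return reverse_complement(chrom_seq[start_pos + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Toy examples with a 3 bp window instead of 2000 bp.
plus_promoter = upstream_region("GGACTATGCCC", 5, "+", length=3)
minus_promoter = upstream_region("CCCCATGGTAA", 5, "-", length=3)
```

The extracted sequences could then be submitted to a tool such as PlantCARE; handling the strand correctly matters because cis-element motifs are reported relative to the orientation of the gene.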
Weighted gene co-expression network analysis
Weighted gene co-expression network analysis (WGCNA) [34] is a comprehensive R software package that outlines and standardizes the methods and functions of co-expression network analysis. A co-expression network of differentially-expressed SCPL1A genes and TFs under mechanical injury and Ectropis oblique stress was constructed using WGCNA. To build the network, correlations were first calculated between the expression levels of different genes, followed by node and edge generation. Nodes corresponded to genes and the edges were determined by pairwise correlation between gene expression levels. The blockwiseModules function of the WGCNA package was then employed with the following parameters: power = 10, minModuleSize = 30, mergeCutHeight = 0.25. Other parameters were set to the default. Second, the nodes with edges of correlation r < 0.5 and edge weight < 0.3 were removed. Finally, we used the Cytoscape (https://cytoscape.org/) tool to map the interactions through the nodes and edges of the conserved genes.
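The core idea of the weighted network can be sketched in a few lines of Python. This is not the WGCNA implementation: it only shows the soft-thresholding step of an unsigned network, in which the edge weight is the absolute Pearson correlation raised to the chosen power (10 here), followed by edge filtering. Treating the "weight" as this adjacency value, and keeping an edge only when both cutoffs are met, are assumptions of the sketch; the gene names and expression values are invented.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def build_edges(expr, power=10, min_r=0.5, min_weight=0.3):
    """Return (gene1, gene2, weight) for every gene pair passing both cutoffs.

    expr maps a gene name to its expression values across samples.
    The weight is |r| ** power (unsigned soft-threshold adjacency).
    """
    genes = sorted(expr)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            r = pearson(expr[g1], expr[g2])
            weight = abs(r) ** power
            if abs(r) >= min_r and weight >= min_weight:
                edges.append((g1, g2, weight))
    return edges


# Invented expression profiles across four samples.
expression = {
    "scpl": [1.0, 2.0, 3.0, 4.0],
    "tf_a": [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with scpl
    "tf_b": [4.0, 1.0, 3.0, 2.0],   # weakly anti-correlated (r = -0.4)
}
edges = build_edges(expression)
```

Raising the correlation to a high power suppresses weak correlations far more than strong ones, so with power = 10 an edge weight of 0.3 already corresponds to |r| of roughly 0.89. An edge list of this form is what would be exported to a viewer such as Cytoscape.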
qRT-PCR analysis
To confirm the reliability of the RNA-seq results, 15 representative genes were selected for qRT-PCR with the KAPA SYBR FAST qPCR Master Mix (KAPA Biosystems, Woburn, MA, USA) and an ABI 7500 real-time PCR system. To obtain relative expression values, the delta-delta CT method was used [35].
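The delta-delta CT calculation can be written out as a short Python sketch. It assumes the standard form of the method, in which amplification efficiency is taken to be about 100%, so that relative expression equals 2 raised to the negative delta-delta CT. The CT values are invented, and the reference gene is hypothetical because the text does not name one.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression by the delta-delta CT method.

    delta CT       = CT(target) - CT(reference), per condition
    delta-delta CT = delta CT(treated) - delta CT(control)
    fold change    = 2 ** (-delta-delta CT)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Invented CT values: the target gene crosses the threshold two cycles
# earlier after treatment, while the reference gene is unchanged.
fold = relative_expression(
    ct_target_treated=22.0, ct_ref_treated=18.0,
    ct_target_control=24.0, ct_ref_control=18.0,
)
```

Because each cycle corresponds to a doubling of product, a delta-delta CT of -2 translates into a four-fold increase in relative expression.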
SCPL1A gene family evolutionary tree, gene structure and motif analysis
Structural differences in duplicated genes are common and often lead to the production of functionally distinct paralogs [36]. Comparative genomic analysis has previously revealed that tea plants have 22 SCPL1A genes [2], while there are 19 in Arabidopsis thaliana, 11 in Glycine max, six in Oryza sativa, eight in Solanum lycopersicum, and 21 in Zea mays. To study the evolutionary and phylogenetic relationships of the SCPL1A genes in tea plants and other species at the molecular level, we compared their full-length protein sequences. The phylogenetic tree constructed by the neighbor-joining method showed that the SCPL1A genes from tea plants and other species could be easily separated (Figure 1A), with the fewest differences present when comparing tea and Arabidopsis SCPL1A genes. To better understand the structural diversity of the SCPL1A genes after duplication events, their exon/intron structure was compared using the Gene Structure Display Server 2.0 (http://gsds.cbi.pku.edu.cn/). This analysis clearly showed that most paralogs have similar gene structures (Figure 1B). For example, five gene pairs (TEA023451.1 / TEA034036.1, TEA034033.1 / TEA034039.1, TEA034056.1 / TEA034055.1, TEA023432.1 / TEA023444.1 and TEA034050.1 / TEA034049.1) have highly consistent gene structures, including exon/intron number and exon length. However, it is worth noting that there is no annotated UTR region in SCPL1As, and there are some differences in the length of introns and exons of some paralogs, which may be related to the regulation of expression. This also suggests that two paralogs may have subtle expression differences during the growth and development of tea plants. The MEME motif analysis was then used to construct a schematic representation of the protein structure of all SCPL1A genes.
As shown in Figure 1C, most SCPL1A members have a similar motif composition, indicating that the protein structure is highly conserved, although the function of most of these conserved motifs remains to be elucidated. Overall, the conserved motif composition and similar gene structure of SCPL1A gene members, as well as phylogenetic analysis results, strongly support the reliability of this family classification [2].
Figure 1: The neighbor-joining tree of the SCPL1A gene family of 6 species and the neighbor-joining tree of the tea plant SCPL1A gene family, with intron-exon structure and conserved motifs indicated.
(A) NJ tree. The MEGA 5.0 program was used to construct the neighbor-joining phylogenetic tree of SCPL1A genes from 6 plants including tea plant, Glycine max, Arabidopsis thaliana, Solanum lycopersicum, Oryza sativa, and Zea mays.
(B) Neighbor-joining phylogenetic tree of tea SCPL1A genes (left). GSDS 2.0 was used to depict the intron-exon structure of all 22 genes (right). Exons and introns are represented by green rectangles and single lines, respectively.
(C) Identification of conserved protein motifs of SCPL1A genes through MEME. The different colored boxes represent different motifs and their positions in each SCPL1A protein sequence.
Cis-element analysis of the SCPL1A genes
To further investigate the potential functions and regulatory mechanisms of the SCPL1A genes, cis-regulatory elements up to 2000 bp upstream of each SCPL1A gene were analyzed. We found that of all the cis-acting elements shown in Figure 2A, 57.9% were light-responsive (Figure 2C), and every SCPL1A promoter contained at least five cis-acting elements (Figure 2B). Previous studies have pointed out that SCPL1A is an important gene family for the synthesis of catechins, and light is an important environmental parameter driving photosynthesis [2,37]. In addition, four stress-responsive elements, including a MYB binding site involved in drought-inducibility (MBS), a low-temperature responsive element (LTR), a wound-responsive element (WUN-motif) and a defense and stress-responsive element (TC-rich repeats), make up 9.6% of all elements. These stress-responsive elements also suggest that the SCPL1A family may be involved in the transcriptional control of abiotic or biotic stress responses. The remaining elements include auxin-responsive, abscisic acid-responsive, and gibberellin-responsive elements, which account for 32.5% of all elements (Figure 2C). The names of all putative cis-regulatory elements are shown in Figure 2D.
Figure 2: Prediction of major cis-acting elements in the promoter of SCPL1A genes.
(A) The distribution of cis-elements in the 2000 bp promoter region upstream of SCPL1A genes is shown. Different elements are shown in different colors.
(B) The number of major cis-regulatory elements in each SCPL1A gene promoter.
(C) Pie chart representation of the proportions of each cis-type element.
(D) Classification and designation of cis elements.
Differential expression analysis of SCPL1As under different stresses
To study the expression differences of the 22 SCPL1A genes under mechanical injury and Ectropis oblique stress, we first generated a Venny map of the SCPL1A genes with differential expression at each time point under the two stresses (Figure 3A). The 15 differentially expressed SCPL1A genes were then divided into two gene clusters: cluster 1, containing genes differentially expressed under both stresses, and cluster 2, containing stress-specific genes. It is worth noting that the stress-specific gene was differentially expressed only under Ectropis oblique stress, and no gene was specific to mechanical injury stress. This finding suggests that cluster 2 genes may play different roles under Ectropis oblique stress. The heat maps of the two clusters are shown in Figure 3C. Under different stresses, most SCPL1A genes are induced, which indicates that most of the genes in cluster 1 may be responsive to both stresses. Overall, the expression levels of SCPL1A genes under different stresses showed different patterns. Under mechanical injury stress, the SCPL1A genes showed significant expression differences at 3 h and 24 h, indicating that they play a role in the early and intermediate responses to mechanical injury stress. Under Ectropis oblique feeding stress, SCPL1A genes showed significant differences at 3 h and 48 h, indicating that they play a role in the early and late responses to Ectropis oblique stress (Figure 3B).
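The division into shared and stress-specific genes amounts to simple set operations on the two DEG lists. The Python sketch below illustrates this with invented gene identifiers; it is not the clustering procedure used to produce Figure 3, and the function name is hypothetical.

```python
def classify_degs(mechanical_degs, insect_degs):
    """Split DEGs into shared and stress-specific groups.

    Both arguments are collections of gene IDs found to be
    differentially expressed at any time point under one stress.
    """
    mechanical = set(mechanical_degs)
    insect = set(insect_degs)
    return {
        "shared": sorted(mechanical & insect),
        "mechanical_only": sorted(mechanical - insect),
        "insect_only": sorted(insect - mechanical),
    }


# Invented identifiers for illustration only.
groups = classify_degs(
    mechanical_degs=["g1", "g2", "g3"],
    insect_degs=["g1", "g2", "g3", "g4"],
)
```

In this toy input the shared group plays the role of cluster 1 and the insect-only group the role of cluster 2, mirroring the situation described above in which no gene is specific to mechanical injury.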
In addition, we performed GO analysis on cluster 1 and cluster 2, and assigned functions to each module (Figure 3D). Nearly all cluster 1 genes were associated with GO functional categories involving secondary metabolic activity, including the glycosphingolipid metabolic, glycolipid metabolic, and liposaccharide metabolic processes. This indicates that the SCPL1A genes play important roles in the metabolism of plants, which is further supported by the fact that Arabidopsis thaliana homologs of these genes have recently been confirmed to be involved in secondary metabolite synthesis [38,39]. Cluster 2 contained one SCPL1A gene, which is mainly involved in chaperone-mediated autophagy. Since this particular gene is not mapped in Arabidopsis, the function of TEA000223 requires further study. These results indicate that the expression patterns of SCPL1A genes change under different stress conditions, with potential impacts on a variety of biological processes.
Figure 3: Differential SCPL1A gene expression classification modules, with a time-varying gene expression map and functional analysis.
(A) UpSet plot of the differentially expressed SCPL1A genes under mechanical injury and Ectropis oblique stress.
(B) The vertical axis of the box plot represents the statistical data for gene expression, and the horizontal axis represents different time points from different stresses.
(C) Expression profiles of 22 tea plant SCPL1A genes (columns) in mechanical injury and Ectropis oblique stress (rows). Gene expression levels are represented in FPKM.
(D) GO functional analysis of different expression modules. Only significant categories are shown (FDR < 0.05).
Co-expression network of the SCPL1A genes and transcription factors
The differential expression of Transcription Factors (TFs) is often responsible for broad changes in the expression of other downstream genes [2], and the tea genome contains a total of 1,380 currently annotated TFs. In order to study the interaction between tea SCPL1A genes and TFs, we constructed a co-expression network of differentially-expressed SCPL1A genes and TFs during Ectropis oblique (Figure 4A) and mechanical injury (Figure 4C) stresses. The co-expression network under mechanical injury stress revealed several phenomena. First, we found the five largest families of transcription factors, MYB, bHLH, WRKY, AP2/ERF-ERF, and NAC, all contained members that were differentially expressed during these stress treatments (Figure 4B). Network modeling indicated that nearly all of these transcription factors interact with SCPL1A members, suggesting that they play a major role in the regulation of SCPL1A genes after mechanical injury. The co-expression network under Ectropis oblique stress also revealed that the same transcription factor families contained members that interacted with SCPL1A family members (Figure 4D). It is also worth noting that the number of transcription factors varied greatly between the two stresses. During mechanical injury, more transcription factors had interactions with differentially expressed SCPL1A genes, while the number of transcription factors interacting with SCPL1A genes under Ectropis oblique stress was small. MYB is the largest known transcription factor family, and its members can work together with WD40 and bHLH family members in ternary complexes that positively or negatively regulate flavonoid synthesis [40,41]. 
Based on our results, several TFs associated with biotic or abiotic stress responses, such as MYB, WRKY, AP2/ERF-ERF, NAC, and ERF family members appear to affect the expression of catechin biosynthesis genes, and these results provide insights into mechanisms of the interaction between transcription factors and SCPL1A members.
Figure 4: Co-expression network of SCPL1A genes and TFs.
(A) and (C) show co-expression networks of DE SCPL1A genes and TFs under mechanical injury and Ectropis oblique stress, respectively. (B) and (D) show pie charts of the TF families co-expressed with differentially expressed SCPL1A genes under mechanical injury and Ectropis oblique stress, respectively. In the networks, the outer circle contains differentially expressed SCPL1A genes, represented by green hexagons. The inner circle contains key TFs, indicated by circles of different colors.
qRT-PCR analysis
In order to confirm the RNA-seq data, 15 genes with differential expression under mechanical injury and Ectropis oblique stress were selected for qRT-PCR analysis. The expression trends of these genes were highly consistent with the RNA-seq results, suggesting that the RNA-seq results were very reliable (Figure 5).
Figure 5: The relative expression levels of DEGs were verified by qRT-PCR. The red line represents RNA-seq results, and the green bar represents qRT-PCR results. All qRT-PCR analyses were performed with three biological replicates and four technical replicates. The error bars represent standard deviation.
Catechins are synthesized primarily through the phenylpropanoid and flavonoid pathways, and key regulatory genes for these pathways have been previously reported [2,42]. The expression values of annotated flavonoid biosynthesis genes during mechanical injury and Ectropis oblique stress are shown in Figure 6. Interestingly, the CHS, F3′H, LAR, F3′,5′H, DFR, ANS, and ANR genes mostly play a role only in the early stress response. However, the expression patterns of the CHS, F3′H, FLS, and ANR genes were very similar to those of the SCPL1A genes under Ectropis oblique stress, indicating that they may act in concert. Some functions of CHS, CHI, DFR, LAR, and ANR have been identified in previous studies [2], but further studies are needed to investigate how these genes specifically regulate catechin synthesis in tea plants during stress.
Figure 6: Biosynthesis pathway of flavonoids in tea plants. The expression levels of transcripts at different time points of stress are indicated by heat maps. Labels represent different genes: CHS, chalcone synthase; F3′H, flavonoid 3′-hydroxylase; FLS, flavonol synthase; LAR, leucoanthocyanidin reductase; F3′,5′H, flavonoid 3′,5′-hydroxylase; ANS, anthocyanidin synthase; ANR, anthocyanidin reductase.
In this study, we analyzed the evolutionary relationships of the SCPL1A gene family across several species, together with the gene structures, motifs, and promoters of the SCPL1A genes in tea plants. Interestingly, promoter prediction indicated that photosynthesis may affect the biosynthesis of catechins by regulating SCPL1A gene expression. The differential expression patterns of SCPL1A gene family members under two different stresses (mechanical injury and Ectropis oblique) were analyzed, and the genes were classified into two categories according to their expression patterns. Finally, a co-expression network of differentially expressed SCPL1A genes and TFs was constructed for each stress, which revealed that SCPL1A genes were co-expressed with more transcription factors under mechanical injury stress than under Ectropis oblique stress. These results provide an important reference for further research into how tea plants produce catechins.
This work was supported by Henan Province Central Leading Local Science and Technology Development Fund Project Funding (Z20231811160).
Citation: Lian S, Ren F, Cai S, Wang Z, Zhang W (2024) The Co-expression Networks of Differentially-Expressed SCPL1As and Transcription Factors in Tea Plants with Mechanical Injury and Ectropis oblique Feeding Stress. J Genet Genomic Sci 9: 048.
Copyright: © 2024 Shuaibin Lian, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.