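Yang Qiuxue (杨秋雪), Wo Jiameixue (沃佳美雪), Xu Xiaomin (徐晓敏), Guo Zhonglu (郭忠录), Liu Shumin (刘树民), Gao Hongwei (高宏伟) (1. Heilongjiang University of Chinese Medicine, Harbin 150040; 2. Daxing’anling Prefecture Association of Traditional Chinese Medicine, Daxing’anling, Heilongjiang 165000; 3. Institute of Traditional Chinese Medicine, Heilongjiang University of Chinese Medicine, Harbin 150040; 4. First Affiliated Hospital, Heilongjiang University of Chinese Medicine, Harbin 150040)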
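Chinese abstract: Boschniakia rossica is an authentic (daodi) medicinal herb of northeastern China, used to tonify the kidney and strengthen yang, moisten the intestines to relieve constipation, and stop bleeding. It contains active constituents such as Boschniakia rossica polysaccharides, iridoid glycosides, and Boschniakia rossica acid, which give it considerable medicinal value. Existing research on B. rossica has concentrated on its active constituents and pharmacology, with comparatively little work on its transcriptome and functional genes. To better characterize the transcriptome of B. rossica from the Daxing’an Mountains and enrich its genetic information, transcriptome data were obtained from B. rossica seeds by high-throughput sequencing. Unigene sequences were obtained from the transcriptome sequencing data, annotated against seven major functional databases, analyzed for simple sequence repeats (SSRs), and used for coding sequence (CDS) prediction. In total, 18.87 Gb of clean data were obtained; the effective data per sample ranged from 5.4 to 7.06 Gb, Q30 bases ranged from 95.66% to 97.45%, and the average GC content was 49.35%. Assembly yielded 57 799 Unigenes with a total length of 48 308 661 bp and an average length of 835.8 bp. The sequences of B. rossica from the Daxing’an Mountains were most similar to those of rapeseed (Brassica napus). In total, 47 035 genes (81.38%) were annotated in the non-redundant protein sequence database (NR), 33 653 (58.22%) in the protein sequence database Swissprot, 13 174 (22.79%) in the Kyoto Encyclopedia of Genes and Genomes (KEGG), 27 886 (48.25%) in the eukaryotic orthologous groups database (KOG), 42 506 (73.54%) in the orthologous protein group database eggNOG, 30 928 (53.51%) in the Gene Ontology database (GO), and 29 642 (51.28%) in the protein family database Pfam. A total of 11 031 SSRs were predicted, with 57 799 Unigenes containing SSRs, 1 886 Unigenes containing more than one SSR, and 943 compound SSRs. A total of 53 033 CDS sequences were predicted, 47 140 by database alignment and 5 893 by ESTScan. These results enrich the transcriptome data for B. rossica from the Daxing’an Mountains, provide a reference for further study of its biological characteristics and molecular mechanisms, and support the development, protection, and use of this resource.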
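Chinese keywords: Boschniakia rossica  full-length transcriptome sequencing  Daxing’an Mountains  metabolic pathway  bioinformatics analysis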
Full-length Transcriptome Sequencing and Bioinformatics Analysis of Boschniakia rossica in Daxing’an Mountains
Abstract: Boschniakia rossica is a valuable medicinal material native to northeastern China, known for its therapeutic effects in strengthening kidney function, relieving constipation, and promoting hemostasis. It contains active compounds such as Boschniakia rossica polysaccharides, iridoid glucosides, and Boschniakia rossica acid, all of which contribute to its medicinal value. To further explore the transcriptome of B. rossica from the Daxing’an Mountains and enrich its genetic information, we conducted a full-length transcriptomic analysis of B. rossica seeds using high-throughput sequencing. Transcriptome sequencing yielded Unigene sequences, which were then subjected to functional annotation, SSR (simple sequence repeat) analysis, and CDS (coding sequence) prediction. A total of 18.87 Gb of clean data was obtained, with the effective data per sample ranging from 5.4 to 7.06 Gb. The proportion of Q30 bases ranged from 95.66% to 97.45%, and the average GC content was 49.35%. We identified 57 799 Unigene sequences, with a total length of 48 308 661 bp and an average length of 835.8 bp. The B. rossica seed transcriptome showed the highest sequence similarity to Brassica napus. Functional annotation assigned 47 035 (81.38%) genes to the NR database, 33 653 (58.22%) to Swissprot, 13 174 (22.79%) to KEGG, 27 886 (48.25%) to KOG, 42 506 (73.54%) to eggNOG, 30 928 (53.51%) to GO, and 29 642 (51.28%) to Pfam. A total of 11 031 SSRs were predicted, with 57 799 Unigenes containing SSRs, 1 886 Unigenes containing more than one SSR, and 943 compound SSRs. A total of 53 033 CDS sequences were predicted, of which 47 140 were predicted by database alignment and 5 893 by ESTScan. These findings enrich the transcriptome data of B. rossica and provide a reference for further study of its biological characteristics and molecular mechanisms. This study also lays a foundation for the sustainable exploitation, protection, and utilization of B. rossica resources.
Keywords: Boschniakia rossica  Full-length transcriptome sequencing  Daxing’an Mountains  Metabolic pathway  Bioinformatics analysis
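
The summary statistics quoted in the abstract are tied together by simple arithmetic: each annotation rate is a count divided by the 57 799 assembled Unigenes, the average length is the total length divided by the Unigene count, and the CDS total is the sum of the two prediction routes. The short Python sketch below recomputes these values from the reported counts. It is only an illustrative consistency check, not part of the authors' pipeline; the variable and function names are hypothetical.

```python
# Illustrative check of the arithmetic behind the figures in the abstract.
# All names here are hypothetical; only the numbers come from the abstract.

TOTAL_UNIGENES = 57_799
TOTAL_LENGTH_BP = 48_308_661

# Number of genes annotated in each database, as reported.
ANNOTATED = {
    "NR": 47_035,
    "Swissprot": 33_653,
    "KEGG": 13_174,
    "KOG": 27_886,
    "eggNOG": 42_506,
    "GO": 30_928,
    "Pfam": 29_642,
}

# CDS sequences predicted by each route, as reported.
CDS_BY_ALIGNMENT = 47_140
CDS_BY_ESTSCAN = 5_893


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


mean_length_bp = round(TOTAL_LENGTH_BP / TOTAL_UNIGENES, 1)
annotation_rates = {db: percent(n, TOTAL_UNIGENES) for db, n in ANNOTATED.items()}
cds_total = CDS_BY_ALIGNMENT + CDS_BY_ESTSCAN

print(f"Mean Unigene length: {mean_length_bp} bp")
for db, rate in annotation_rates.items():
    print(f"{db}: {rate}%")
print(f"Total predicted CDS: {cds_total}")
```

Run as written, the recomputed values can be compared directly with the percentages, the average length, and the CDS total given in the abstract.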
