A method of peanut breeding of large pod and high yield based on genomic selection

Min-jie GUO, Li DENG, Jian-li MIAO, Jun-hua YIN, Li REN

CHINESE JOURNAL OF OIL CROP SCIENCES ›› 2024, Vol. 46 ›› Issue (3) : 697-702.

CHINESE JOURNAL OF OIL CROP SCIENCES ›› 2024, Vol. 46 ›› Issue (3) : 697-702. DOI: 10.19802/j.issn.1007-9084.2023002

Abstract

This study developed a batch-selection approach for peanut cross combinations to provide theoretical guidance for the efficient breeding of new high-yield peanut varieties. Genomic selection analysis was carried out on 220 peanut germplasm resources using phenotypic data for single plant productivity and 100-pod weight collected at multiple locations over multiple years, together with re-sequencing data at 10× depth. The phenotypic data were normally distributed, and genotype quality control yielded 527 469 high-quality single nucleotide polymorphism (SNP) sites. Estimated breeding values for single plant productivity and 100-pod weight were calculated from the phenotypic data with the genomic best linear unbiased prediction (GBLUP) model. The estimated breeding values were standardized and weighted 70% for single plant productivity and 30% for 100-pod weight to obtain a comprehensive breeding index for each germplasm individual. The 20 materials with the highest comprehensive breeding index give 190 possible hybrid combinations, and the comprehensive breeding index of each pairwise combination was calculated. The coefficient of parentage (kinship coefficient) between each pair of materials was calculated from the G matrix built on the genomic data. The comprehensive breeding index of each combination and the coefficient of parentage were standardized and weighted 80% and 20%, respectively, to calculate the comprehensive score of the combination. Parents for hybrid combinations can then be selected directly from the ranking of these scores. In conclusion, germplasm materials derived from the combination Kainong30 × Kaixuan016 are suitable as high-yield parents. Genomic selection can rank combinations efficiently and accurately, so that parents can be selected and crosses made in batches, improving breeding efficiency.

Key words

peanut / high yield / phenotype / re-sequencing / genomic selection / hybrid combination

Cite this article

Min-jie GUO , Li DENG , Jian-li MIAO , Jun-hua YIN , Li REN. A method of peanut breeding of large pod and high yield based on genomic selection[J]. CHINESE JOURNAL OF OIL CROP SCIENCES, 2024, 46(3): 697-702 https://doi.org/10.19802/j.issn.1007-9084.2023002
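Peanut (Arachis hypogaea L.) is an important oil crop in China. In recent years its planting area has risen steadily, yield per unit area has continued to improve, and total production has grown continuously [1]; national peanut production reached 17.99 million tonnes in 2021 [2]. High yield is the most basic and most universal requirement for a peanut variety, and breeding high-yield varieties is the primary goal of peanut breeding [3,4]. Parent selection is the first step in breeding, so predicting high-yield parents in batches is of great significance for high-yield peanut breeding.
Genome re-sequencing determines the DNA sequences of different individuals of a species that has a reference genome. Genomic selection (GS), first proposed by Meuwissen in 2001 [5], uses high-density SNP markers covering the whole genome, combined with phenotypes or pedigrees, to estimate the breeding values of individuals. It assumes that every QTL controlling a trait is in linkage disequilibrium with at least one of these markers, so that the markers jointly capture many genes and quantitative traits can be evaluated accurately [6]. GS has been widely applied in livestock breeding [7-9] and, more recently, in crops [10-12]. Peanut is an allotetraploid crop with a genome of about 2.8 Gb and a large number of SNP marker sites; by estimating the effects of all SNP markers, GS can estimate the effects of all genes across the genome.
From marker-assisted breeding to genome-wide association studies, molecular breeding techniques have become an indispensable part of plant breeding, but they still cannot provide a comprehensive evaluation of individual peanut materials. This study is the first to use genomic selection to estimate the breeding values and kinship coefficients of peanut germplasm materials. Standardization and weighting were used to obtain a comprehensive score for each cross combination and, from it, a ranking of parental pairings, with the aim of greatly increasing the probability of breeding the target variety and providing a theoretical basis for batch breeding of high-yield peanut.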

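1 Materials and methods

1.1 Experimental materials and field design

The 220 peanut germplasm materials (Supplementary Table 1, available via the OSID code on the first page) were grown in both Kaifeng and Xinyang, Henan Province, in 2019 and 2020. A randomized block design [13] with three replicates was used, with one seed per hole; each material occupied 13.34 m² (2 m wide × 6.67 m long). Sowing took place around May 20 and harvest around September 20. Water and fertilizer were managed according to local practice. After harvest, the 100-pod weight and single plant productivity of each plot were measured strictly following Principles and Techniques of DUS Testing for New Peanut Varieties [14].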
Table 1 GBLUP breeding values of single plant productivity and 100-pod weight for part of the germplasm resources

Cultivar | GBLUP of single plant productivity | GBLUP of 100-pod weight
G1 1.5546 48.9476
G2 -1.8278 -14.0914
G3 0.3472 69.1758
G4 -0.4957 -1.4890
G5 0.3152 26.1357
G6 -0.4381 5.7234
G7 -1.2601 -17.8866
G8 1.0415 55.0564
G9 1.7146 39.9853
G10 1.0923 -43.2856

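1.2 Genotypic data

DNA was extracted from young peanut leaves at the seedling stage, and the 220 germplasm resources were re-sequenced across the whole genome (10×) by Novogene Co., Ltd. (Beijing) using Illumina next-generation sequencing. Libraries were constructed from the DNA samples through purification, end repair, adapter ligation, and fragment size selection; clean reads were obtained after data filtering and evaluation [15].

1.3 Data processing and analysis

Raw phenotypic data were organized and calculated in Microsoft Excel 2010. Best linear unbiased estimation (BLUE) values of the phenotypes were calculated with ASReml, and histograms were drawn with the ggplot2 package in R. Sequencing data were quality-controlled with PLINK 1.9: the missingness filter --geno 0.1 removed sites with a missing rate above 10%. Missing genotypes were imputed with Beagle 5.3 [16], and the imputed data were then filtered on minor allele frequency with --maf 0.05. The G matrix was constructed with the ASRgenomics package. Genotypes were coded numerically: at each site, the major-allele homozygote was coded 0, the heterozygote 1, and the minor-allele homozygote 2; for PLINK files this coding was produced with --recodeA. Cluster analysis was performed with the R packages cluster [17] and ggplot2 [18]. GBLUP breeding values of the traits were calculated with ASReml.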
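The two site filters named above (missing rate and minor allele frequency) can be illustrated with a minimal pure-Python sketch. This is not the PLINK implementation: the function names and the list-of-lists layout are hypothetical, missing genotypes are represented as None, and the imputation step that the study runs between the two filters is omitted.

```python
def missing_rate(site):
    """Fraction of samples whose genotype is missing (None) at one site."""
    return sum(g is None for g in site) / len(site)


def minor_allele_freq(site):
    """Minor allele frequency from called genotypes coded 0/1/2."""
    called = [g for g in site if g is not None]
    p = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(p, 1 - p)


def filter_sites(sites, max_missing=0.1, min_maf=0.05):
    """Keep sites with missing rate <= max_missing and MAF >= min_maf."""
    kept = []
    for site in sites:
        if missing_rate(site) > max_missing:
            continue  # same intent as a --geno 0.1 style filter
        if minor_allele_freq(site) < min_maf:
            continue  # same intent as a --maf 0.05 style filter
        kept.append(site)
    return kept
```

Each inner list holds the genotypes of all samples at one site, so a site is dropped as a whole, which mirrors how the site counts in the results shrink after each filtering step.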
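GBLUP breeding values and kinship coefficients were calculated in three steps:
(1) Construct the G matrix:
$$G = \frac{ZZ'}{2\sum_{i} p_i(1-p_i)}$$
where $p_i$ is the minor allele frequency at site $i$, $Z$ is the design matrix of the SNP markers, and $Z'$ is the transpose of $Z$.
(2) Calculate the GBLUP breeding values:
$$\begin{bmatrix} \hat{b} \\ \hat{u} \end{bmatrix} = \begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + G^{-1}k \end{bmatrix}^{-1} \begin{bmatrix} X'Y \\ Z'Y \end{bmatrix}$$
where $X$ is the design matrix of the fixed factors, $Z$ is the design matrix of the random factors, $Y$ is the vector of observations, $G^{-1}$ is the inverse of the kinship matrix $G$, $\hat{b}$ is the vector of fixed-factor effects (BLUE), $\hat{u}$ is the vector of random-factor effects, and $k$ is the ratio of the residual variance component to the additive variance component.
(3) Calculate the kinship coefficient:
$$r_{ij} = \frac{G_{ij}}{\sqrt{G_{ii}G_{jj}}}$$
where $i$ and $j$ are two individual materials, $r_{ij}$ is the kinship coefficient between $i$ and $j$, $G_{ij}$ is the element of the G matrix for $i$ and $j$, $G_{ii}$ is the diagonal element of the G matrix for $i$, and $G_{jj}$ is the diagonal element of the G matrix for $j$.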
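Steps (1) and (3) can be sketched in a few lines of pure Python. This is an illustration under stated assumptions rather than the ASRgenomics routine used in the study: the function names are hypothetical, genotypes are assumed to be 0/1/2 counts without missing values, and Z is assumed to be the genotype matrix centered by twice the allele frequency, as in VanRaden-style G matrices.

```python
import math


def g_matrix(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts.

    genotypes: one list per individual, one entry per SNP site.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Allele frequency p_i of the counted allele at each site.
    p = [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]
    # Assumption: Z is the count matrix centered by 2 * p_i.
    z = [[ind[i] - 2 * p[i] for i in range(m)] for ind in genotypes]
    denom = 2 * sum(pi * (1 - pi) for pi in p)
    return [
        [sum(a * b for a, b in zip(z[r], z[c])) / denom for c in range(n)]
        for r in range(n)
    ]


def kinship(g, i, j):
    """Kinship coefficient r_ij = G_ij / sqrt(G_ii * G_jj)."""
    return g[i][j] / math.sqrt(g[i][i] * g[j][j])
```

Dividing by the square root of the two diagonal elements makes the coefficient of any material with itself exactly 1, which matches the self-comparison row reported for G109.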

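1.4 Data standardization

1.4.1 Comprehensive breeding index of individual materials

Single plant productivity and 100-pod weight were given weights of 70% and 30%, respectively, and the weighted sum was transformed to a distribution with a mean of 100 and a standard deviation of 25. Comprehensive breeding index: gblupa$index = gblupa$y1_GBLUP*0.3 + gblupa$y2_GBLUP*0.7. Standardization of the comprehensive breeding index: gblupa$index = 100 + 25*(gblupa$index - mean(gblupa$index))/sd(gblupa$index).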
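The two formulas above translate directly into a short Python sketch. The function name is hypothetical, and the weights are left as parameters because the formula does not state which trait y1 and y2 denote; the defaults simply follow the formula as written. The sample standard deviation is used, which is what sd() returns in R.

```python
from statistics import mean, stdev


def comprehensive_index(y1, y2, w1=0.3, w2=0.7):
    """Weighted sum of two breeding values, rescaled to mean 100 and SD 25."""
    raw = [w1 * a + w2 * b for a, b in zip(y1, y2)]
    mu = mean(raw)
    sd = stdev(raw)  # sample standard deviation (n - 1 denominator)
    return [100 + 25 * (r - mu) / sd for r in raw]


# Toy breeding values for three hypothetical materials.
index = comprehensive_index([1.0, 0.0, -1.0], [2.0, 0.0, -2.0])
```

Because the rescaling is linear, it changes the units of the index but not the ranking of the materials.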

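1.4.2 Comprehensive score of combinations between materials

The comprehensive breeding index of each combination and the kinship coefficient were standardized and given weights of 80% and 20% (negative), respectively, to calculate the comprehensive score of the combination: zonghe_defen = mean_index * 0.8 + y * (-0.2), where mean_index is the standardized comprehensive breeding index of the combination and y is the standardized kinship coefficient.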
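A minimal sketch of this scoring rule follows. The function names are hypothetical, and standardization is assumed to be a z-score with the sample standard deviation, which the text does not specify. The negative weight means that, at equal breeding index, a pair of less related parents scores higher.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize to mean 0 and sample standard deviation 1 (assumed method)."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


def score_from_standardized(mean_index, y, w_index=0.8, w_kin=-0.2):
    """zonghe_defen = mean_index * 0.8 + y * (-0.2) for one combination."""
    return mean_index * w_index + y * w_kin


def combination_scores(combo_index, kinship):
    """Standardize both inputs, then apply the weighted sum to each pair."""
    zi = zscore(combo_index)
    zk = zscore(kinship)
    return [score_from_standardized(a, b) for a, b in zip(zi, zk)]
```

Applied to the standardized values reported for the top-ranked combination (3.7805 and -0.9688), score_from_standardized gives about 3.218, in line with the reported score of 3.2181 up to rounding of the inputs.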

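2 Results and analysis

2.1 Phenotypic and genotypic data statistics

2.1.1 Phenotypic data analysis

A mixed linear model was fitted with germplasm resource as a fixed factor and environment, environment × variety interaction, and block within environment as random factors. Variance components were estimated and the model was solved in ASReml to obtain the best linear unbiased estimation (BLUE) value of each germplasm resource, which served as the phenotypic value for the subsequent GBLUP estimation. Histograms of the adjusted phenotypic values were drawn to inspect the distribution of the two-year, two-location data for single plant productivity and 100-pod weight. As Fig. 1 shows, both traits followed a normal distribution.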
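One common way to write a model with these factors is sketched below. The symbols are notation introduced here for illustration and are not taken from the source; the exact model specification used in ASReml is not given in the text.

$$y_{ijk} = \mu + g_i + e_j + (ge)_{ij} + b_{k(j)} + \varepsilon_{ijk}$$

Here $y_{ijk}$ is the observation for germplasm $i$ in block $k$ of environment $j$, $\mu$ is the overall mean, $g_i$ is the fixed effect of germplasm $i$, $e_j$, $(ge)_{ij}$ and $b_{k(j)}$ are the random effects of environment, environment × variety interaction and block within environment, and $\varepsilon_{ijk}$ is the residual. Under this formulation the BLUE of germplasm $i$ is the estimate of $\mu + g_i$.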
Fig. 1 Histogram of single plant productivity and 100-pod weight
Note: A represents histogram of single plant productivity; B represents histogram of 100-pod weight


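2.1.2 Genotypic data analysis

The DNA reads were aligned to the reference genome Tifrunner [19], and genome-wide SNP markers were identified and quality-controlled. Initial SNP calling on the genotype data yielded 608 809 SNPs. Quality control removed 80 433 sites, leaving 528 376. After imputation with Beagle 5.3, a further 907 sites were removed, leaving 527 469 valid sites, which were written to a VCF file.

2.2 Comprehensive breeding index and kinship coefficient

2.2.1 Estimating breeding values with the GBLUP model

Cluster analysis of the G matrix (Fig. 2) divided the 220 materials into three groups. Group I contained 23 peanut germplasm resources, most of which had multi-seeded pods. Group II and Group III contained 131 and 66 peanut materials, respectively, and showed no obvious differentiation in phenotypic traits.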
Fig. 2 Cluster map of 220 peanut germplasm resources


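Combining the G matrix with the two traits, single plant productivity and 100-pod weight, gave the GBLUP breeding values of both traits; Table 1 lists the trait breeding values of the 10 germplasm resources G1 to G10.

2.2.2 Comprehensive breeding index of individual materials (single plant productivity and 100-pod weight)

The estimated breeding values of single plant productivity and 100-pod weight were standardized and, in line with high-yield breeding requirements, given weights of 70% and 30%, respectively; the weighted sum was then standardized to give the comprehensive breeding index of each peanut germplasm material. The materials were ranked by comprehensive breeding index, and the top 20 are listed in Table 2. The index decreased in the order G103, G130, G56, G108, and so on.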
Table 2 Comprehensive breeding index of the top 20 individuals

Cultivar | GBLUP of single plant productivity | GBLUP of 100-pod weight | Comprehensive breeding index
G103 0.8639 3.1718 169.5928
G130 2.8529 2.2649 168.5229
G56 1.3225 2.1229 152.8460
G108 0.3389 2.4902 151.7811
G100 1.3729 1.7461 145.8683
G3 0.2858 2.1172 144.0051
G21 1.3112 1.5930 142.3395
G38 1.2366 1.6227 142.2953
G8 0.8573 1.6851 140.3270
G1 1.2796 1.4981 140.2097
G80 0.9894 1.6011 139.7902
G166 -0.1630 2.0001 137.9245
G113 1.3960 1.3148 137.5874
G170 1.0297 1.4321 136.8081
G132 1.3874 1.2557 136.3545
G23 0.9294 1.4442 136.2024
G70 1.5240 1.1888 136.1911
G131 0.9505 1.4300 136.0994
G9 1.4113 1.2238 135.9293
G112 1.4482 1.1973 135.7183

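2.2.3 Comprehensive breeding index of pairwise combinations

The top 20 materials by comprehensive breeding index for yield (Table 2) were combined in pairs. Ignoring reciprocal crosses, this gives 190 combinations ($C_{20}^{2}$), and the comprehensive breeding index of each pairwise combination was calculated; part of the results are shown in Table 3. Without considering the kinship coefficient, combinations such as G130 and G103, and G56 and G103, had the highest comprehensive breeding index.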
Table 3 Comprehensive breeding index of part of the combinations

ID1 (Cultivar) | ID2 (Cultivar) | Comprehensive breeding index of combination
G130 G103 169.0578
G56 G103 161.2194
G108 G103 160.6869
G56 G130 160.6844
G130 G108 160.1520
G103 G100 157.7305
G130 G100 157.1956
G3 G103 156.7989
G3 G130 156.2640
G21 G103 155.9661
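The pairing step can be sketched as follows. The averaging rule is inferred from the tables rather than stated in the text: the values in Table 3 agree with the mean of the two parents' indexes in Table 2, for example (168.5229 + 169.5928) / 2 = 169.05785 against the reported 169.0578. The function name is hypothetical.

```python
from itertools import combinations
from math import comb


def pair_indexes(index_by_id):
    """Mean comprehensive breeding index for every unordered pair of parents."""
    pairs = {}
    for a, b in combinations(index_by_id, 2):
        # Assumption: the combination index is the mean of the two parents.
        pairs[(a, b)] = (index_by_id[a] + index_by_id[b]) / 2
    return pairs


# Three of the top 20 materials, with indexes taken from Table 2.
top = {"G103": 169.5928, "G130": 168.5229, "G56": 152.8460}
pairs = pair_indexes(top)

# Number of unordered pairs among 20 parents, ignoring reciprocal crosses.
n_pairs = comb(20, 2)
```

Using unordered pairs is what reduces 20 parents to 190 combinations rather than 380 ordered crosses.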

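2.2.4 Kinship coefficients between materials

Using the genotypic data, pairwise kinship coefficients among the 220 peanut germplasm materials were calculated from the G matrix; Table 4 shows part of the results. The larger the kinship coefficient, the closer the relationship. The kinship coefficient of a material with itself is 1 (for example G109), and that of unrelated materials is 0 (for example G46 and G109, or G23 and G109).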
Table 4 Kinship coefficients among part of the germplasm resources

Cultivar | Cultivar | Kinship coefficient
G109 G109 1.0000
G87 G109 0.0054
G46 G109 0.0000
G23 G109 0.0000
G47 G109 0.0000
G56 G109 0.4029
G169 G109 0.2918
G64 G109 0.0000
G98 G109 0.2134
G71 G109 0.0000

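2.3 Comprehensive score of combinations between materials

The comprehensive breeding index of each combination was merged with its kinship coefficient; both were standardized, given different weights, and used to calculate the comprehensive score of the combination, by which the combinations were ranked. The 60 top-scoring cross combinations are listed in Table 5. Based on these high-yield combinations, parents can be picked directly from the germplasm materials to draw up a crossing plan, selecting parents in batches and guiding cross design; this makes the breeding objective more attainable, improves breeding efficiency, and provides important technical support for high-yield peanut breeding. In addition, this study performed GBLUP-based genomic selection with a G matrix built from molecular markers, and the materials include peanut varieties that are currently grown over large areas. In subsequent breeding, progeny derived from these varieties can use this reference population to construct the kinship G matrix and thereby estimate breeding values.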
Table 5 The top 60 hybrid combinations by comprehensive score

Each row holds two entries side by side (Nos. 1-30 on the left, Nos. 31-60 on the right); each entry has the five columns below.

No. | Combination (♀×♂) | Standardized comprehensive breeding index of combination | Standardized kinship coefficient | Comprehensive score of combination
1 G130 × G103 3.7805 -0.9688 3.2181 31 G112 × G103 1.3713 -0.2001 1.1370
2 G130 × G108 2.4724 -1.2451 2.2269 32 G9 × G130 1.3082 -0.4481 1.1362
3 G56 × G103 2.6291 -0.5762 2.2186 33 G23 × G103 1.4068 -0.0297 1.1314
4 G130 × G100 2.0381 -0.3559 1.7017 34 G130 × G112 1.2927 -0.4724 1.1286
5 G56 × G130 2.5506 1.8192 1.6766 35 G170 × G103 1.4513 0.3512 1.0908
6 G3 × G130 1.9013 -0.7194 1.6649 36 G131 × G103 1.3993 0.1893 1.0815
7 G108 × G103 2.5509 2.0757 1.6256 37 G166 × G103 1.5333 0.7762 1.0714
8 G103 × G100 2.1167 0.6990 1.5536 38 G80 × G130 1.5917 1.0516 1.0631
9 G38 × G103 1.8543 -0.3422 1.5519 39 G56 × G100 0.8868 -0.5322 0.8159
10 G21 × G130 1.7790 -0.5031 1.5238 40 G56 × G38 0.6244 -1.0736 0.7142
11 G38 × G130 1.7757 -0.4143 1.5034 41 G56 × G3 0.7500 -0.4690 0.6938
12 G80 × G103 1.6703 -0.7874 1.4937 42 G56 × G21 0.6276 -0.7604 0.6542
13 G8 × G130 1.6312 -0.7546 1.4558 43 G38 × G108 0.5462 -0.7768 0.5923
14 G130 × G1 1.6225 -0.5612 1.4103 44 G8 × G56 0.4798 -0.9442 0.5727
15 G8 × G103 1.7097 -0.1595 1.3997 45 G108 × G100 0.8086 0.3840 0.5701
16 G3 × G103 1.9799 0.9723 1.3894 46 G56 × G1 0.4712 -0.8239 0.5418
17 G21 × G103 1.8575 0.5234 1.3813 47 G80 × G108 0.3622 -1.0917 0.5081
18 G103 × G1 1.7011 0.0943 1.3420 48 G56 × G113 0.2786 -1.0661 0.4361
19 G131 × G130 1.3207 -1.3264 1.3218 49 G21 × G108 0.5494 0.0773 0.4241
20 G166 × G130 1.4547 -0.6422 1.2922 50 G3 × G108 0.6718 0.5931 0.4188
21 G113 × G103 1.5085 -0.2005 1.2469 51 G108 × G1 0.3930 -0.4010 0.3946
22 G130 × G113 1.4300 -0.4721 1.2384 52 G56 × G166 0.3034 -0.6606 0.3748
23 G170 × G130 1.3727 -0.6675 1.2317 53 G56 × G132 0.1881 -1.0613 0.3627
24 G56 × G108 1.3210 -0.8448 1.2258 54 G56 × G131 0.1694 -1.1276 0.3610
25 G23 × G130 1.3282 -0.6890 1.2004 55 G70 × G56 0.1761 -1.0600 0.3529
26 G132 × G103 1.4180 -0.2039 1.1752 56 G9 × G56 0.1569 -1.0395 0.3334
27 G70 × G103 1.4060 -0.2066 1.1661 57 G56 × G112 0.1414 -1.0665 0.3264
28 G132 × G130 1.3394 -0.4697 1.1655 58 G113 × G108 0.2004 -0.6541 0.2912
29 G70 × G130 1.3274 -0.4758 1.1571 59 G56 × G170 0.2214 -0.4546 0.2681
30 G9 × G103 1.3868 -0.2053 1.1505 60 G56 × G23 0.1769 -0.5486 0.2513

References

[1] Liao B S. Analysis of the current status and potential of peanut production development in China[J]. Chinese Journal of Oil Crop Sciences, 2020, 42(2): 161-166. DOI: 10.19802/j.issn.1007-9084.2020115. (in Chinese)
[2] National Bureau of Statistics of China[DB/OL]. National data query, 2022-05-30. [2022-11-20] (in Chinese)
[3] Yu S L. Peanut genetics and breeding in China[M]. Shanghai: Shanghai Scientific and Technical Publishers, 2011. (in Chinese)
[4] Deng L, Guo M J, Miao J L, et al. Comprehensive analysis of large-seeded peanut based on path coefficients and GGE biplot[J]. Jiangsu Agricultural Sciences, 2021, 49(19): 129-133. DOI: 10.15889/j.issn.1002-1302.2021.19.023. (in Chinese)
[5] Yu Y, Zhang X J, Li F H, et al. Genomic selection breeding strategies and prospects for their application in aquatic animal breeding[J]. Journal of Fishery Sciences of China, 2011, 18(4): 936-943. DOI: 10.3724/SP.J.1118.2011.00935. (in Chinese)
[6] Goddard M E, Hayes B J. Genomic selection[J]. J Anim Breed Genet, 2007, 124(6): 323-330. DOI: 10.1111/j.1439-0388.2007.00702.x.
[7] Hayes B J, Lewin H A, Goddard M E. The future of livestock breeding: genomic selection for efficiency, reduced emissions intensity, and adaptation[J]. Trends Genet, 2013, 29(4): 206-214. DOI: 10.1016/j.tig.2012.11.009.
[8] Pryce J E, Wales W J, de Haas Y, et al. Genomic selection for feed efficiency in dairy cattle[J]. Animal, 2014, 8(1): 1-10. DOI: 10.1017/S1751731113001687.
[9] Lillehammer M, Meuwissen T H E, Sonesson A K. Genomic selection for two traits in a maternal pig breeding scheme[J]. J Anim Sci, 2013, 91(7): 3079-3087. DOI: 10.2527/jas.2012-5113.
[10] Shikha M, Kanika A, Rao A R, et al. Genomic selection for drought tolerance using genome-wide SNPs in maize[J]. Front Plant Sci, 2017, 8: 550. DOI: 10.3389/fpls.2017.00550.
[11] Liu C, Meng H W, Cheng Z H. Principles and research progress of genomic selection breeding technology in plants[J]. Molecular Plant Breeding, 2020, 18(16): 5335-5342. DOI: 10.13271/j.mpb.018.005335. (in Chinese)
[12] Xu Y, Ma K X, Zhao Y, et al. Genomic selection: a breakthrough technology in rice breeding[J]. Crop J, 2021, 9(3): 669-677. DOI: 10.1016/j.cj.2021.03.008.
[13] Gai J Y. Experimental statistical methods: Field experiments and statistical methods[M]. Beijing: China Agriculture Press, 2000. (in Chinese)
[14] Liu H, Ren Y H. Principles and techniques of DUS testing for new peanut varieties[M]. Guangzhou: South China University of Technology Press, 2012. (in Chinese)
[15] Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform[J]. Bioinformatics, 2009, 25(14): 1754-1760. DOI: 10.1093/bioinformatics/btp324.
[16] Browning B L, Tian X W, Zhou Y, et al. Fast two-stage phasing of large-scale sequence data[J]. Am J Hum Genet, 2021, 108(10): 1880-1890. DOI: 10.1016/j.ajhg.2021.08.005.
[17] VanRaden P M. Efficient methods to compute genomic predictions[J]. J Dairy Sci, 2008, 91(11): 4414-4423. DOI: 10.3168/jds.2007-0980.
[18] Villanueva R A M, Chen Z J. ggplot2: elegant graphics for data analysis (2nd Ed.)[J]. Meas, 2019, 17(3): 160-167. DOI: 10.1080/15366367.2019.1565254.
[19] Peanutbase[DB/OL]. The peanut genome, 2021-06-04. [2022-06-25]
[20] Kang J M, Zhang T J, Wang M Y, et al. Research progress and application of QTL and genomic selection in alfalfa[J]. Acta Prataculturae Sinica, 2014, 23(6): 304-312. DOI: 10.11686/cyxb20140636. (in Chinese)
[21] He S, Schulthess A W, Mirdita V, et al. Genomic selection in a commercial winter wheat population[J]. Theor Appl Genet, 2016, 129(3): 641-651. DOI: 10.1007/s00122-015-2655-1.
[22] Bertioli D J, Jenkins J, Clevenger J, et al. The genome sequence of segmental allotetraploid peanut Arachis hypogaea[J]. Nat Genet, 2019, 51(5): 877-884. DOI: 10.1038/s41588-019-0405-z.
[23] Chen X P, Lu Q, Liu H, et al. Sequencing of cultivated peanut, Arachis hypogaea, yields insights into genome evolution and oil improvement[J]. Molecular Plant, 2019, 12(7): 920-934. DOI: 10.1016/j.molp.2019.03.005.
[24] Zhuang W J, Chen H, Yang M, et al. The genome of cultivated peanut provides insight into legume karyotypes, polyploid evolution and crop domestication[J]. Nature Genetics, 2019, 51(5): 865-876. DOI: 10.1038/s41588-019-0402-2.
[25] Jannink J L, Lorenz A J, Iwata H. Genomic selection in plant breeding: from theory to practice[J]. Brief Funct Genomics, 2010, 9(2): 166-177. DOI: 10.1093/bfgp/elq001.
