Characterisation of starch traits and genes in Australian rice germplasm


1 Southern Cross University Theses 2013 Characterisation of starch traits and genes in Australian rice germplasm Ardashir Kharabian Masouleh Southern Cross University Publication details Kharabian Masouleh, A 2013, 'Characterisation of starch traits and genes in Australian rice germplasm', PhD thesis, Southern Cross University, Lismore, NSW. Copyright A Kharabian Masouleh 2013 epublications@scu is an electronic repository administered by Southern Cross University Library. Its goal is to capture and preserve the intellectual output of Southern Cross University authors and researchers, and to increase visibility and impact through open access to researchers around the world. For further information please contact epubs@scu.edu.au.

2 Characterisation of starch traits and genes in Australian rice germplasm Ardashir Kharabian Masouleh (B.Sc, M.Sc) A thesis submitted to Southern Cross University in fulfillment of the requirements for the degree of Doctor of Philosophy Southern Cross Plant Science Southern Cross University Lismore, NSW Australia March 2013 i

Statement of originality

I certify that the work presented in this thesis is, to the best of my knowledge and belief, original, except as acknowledged in the text, and that the material has not been submitted, either in whole or in part, for a degree at this or any other university. I acknowledge that I have read and understood the University's rules, requirements, procedures and policy relating to my higher degree research award and to my thesis. I certify that I have complied with the rules, requirements, procedures and policy of the University.

Ardashir Kharabian Masouleh
March 2013

4 Acknowledgements First, my great gratitude to my principal supervisors, Robert J Henry, Daniel LE Waters and Russell F. Reinke for allowing me to undertake this project at the Southern Cross Plant Science. I would also like to thank them for their direction and endless support during this PhD project. Next I would like to thank my other supervisors in the centre, Graham King and Michael Heinrich, for their help, thoughts and valuable suggestions throughout my candidature. Thanks to the many people who have been of great help in the lab, Stirling Bowen, Peter Bundock, Timothy Sexton and everyone else who helped me out learning various techniques. Thanks to all in the post grad room and beyond, especially Tiffeny Byrnes and Cathy Nock who have been great support as Lab manager and administration. And last but not least thanks to my family, especially my wife Shiva for the endless support during these four and a half years. I could not have this big commitment done without my family support. iii

Abstract

Starch is a major component of human diets. The physicochemical properties of starch influence its nutritional value and the functional properties of starch-containing foods. Many of these traits have been under strong selection during the domestication of rice as a food. A population of 233 rice breeding lines was analysed for variation in 17 starch synthesis genes encoding seven classes of enzymes: ADP-glucose pyrophosphorylase (AGPase), granule bound starch synthase (GBSS), soluble starch synthase (SS), starch branching enzyme (BE), starch debranching enzyme (DBE), starch phosphorylase (SPHOL) and glucose-6-phosphate translocator (GPT1). The approach employed semi- to long-range PCR (LR-PCR) followed by next-generation sequencing. The amplification products were pooled in equimolar amounts and sequenced using massively parallel sequencing (MPS). SNPs and indels in both coding and non-coding regions were identified, and their distribution patterns among the individual starch candidate genes were characterised. Approximately 60.9 million reads were generated, of which 54.8 million (90%) mapped to the reference sequences. Coverage ranged from 12,708 for SSIIa to 38,300 for SSIIIb. SNPs and single- or multiple-base indels were analysed across a total assembled length of 116,403 bp. In total, 501 SNPs, of which 110 were non-synonymous (functional), and 113 indels were detected across the 17 starch-related loci. Five genes, AGPL2a, Isoamylase1, SPHOL, SSIIb and SSIVb, showed no polymorphism. The ratio of non-synonymous to synonymous SNPs (Ka/Ks) suggested that GBSSI and Isoamylase1 (ISA1) are the least diversified (most purified) and most conserved genes, as the studied populations have been through several cycles of selection for low amylose content and gelatinization temperature. The 110 functional SNP loci were analysed for associations with rice pasting and cooking quality, and associations of 65 functional SNPs with starch traits were detected.

GBSSI (the waxy gene) and SSIIa had a major influence on starch properties, while the other genes had minor associations. The G/T SNP at the exon 1/intron 1 boundary of GBSSI showed the strongest association with retrogradation and amylose content. The TT allele has been selected in much of the domesticated japonica genepool, providing rice with a desirable texture but with less resistant starch, which is associated with human health advantages. The GC/TT SNP in exon 8 of SSIIa showed a very significant association with pasting temperature (PT), gelatinization temperature (GT) and peak time. No significant association was found between SSIIa and retrogradation. Other genes contributing to retrogradation were SSI, BEI and SSIIIa. The highest level of polymorphism was observed in SSIIIa, with 22 SNPs, but only limited associations were observed with starch phenotypic values. None of the SNPs was strongly associated with chalkiness, apart from a weak link with a T/C SNP at position 960 (Thr482 to Ala) in Isoamylase2. These associations provide new tools for deliberate selection of rice genotypes for specific functional and nutritional outcomes.

Resistant-retrograded starch is widely associated with human health. The highly retrograded starches of cereals usually have a lower glycemic index (GI), which may be beneficial in many human diets. The data reported here suggest that glucose-6-phosphate translocator (GPT1), an enzyme early in the biochemical pathway of starch synthesis, has a major influence on resistant starch production in rice. A T/C SNP at position 1188 of the GPT1 encoding gene alters Leu24 to Phe and is highly associated with resistant-retrograded starch and amylose content. The T and C alleles produce high and low levels of retrograded starch, respectively. An association study of 233 genotypes demonstrated a highly significant correlation (R²) of 0.57 and 0.36 (P= ) between this SNP and retrogradation degree and apparent amylose content, respectively. Haplotype and association analysis of this SNP together with the G/T SNP at the exon 1/intron 1 boundary of the GBSSI encoding gene explains most of the variability in retrogradation degree and amylose content in the rice population.

Together, these two SNPs give higher levels of resistant-retrograded starch when the T allele in GPT1 and the G allele in GBSSI are present. This T:G haplotype can provide a new tool for deliberate selection of rice genotypes for specific functional and nutritional outcomes such as resistant-retrograded starch and high amylose content, non-sticky rices.

Granule Bound Starch Synthase I (GBSSI) influences the grain quality of all cereals and, particularly, rice. Using GBSSI as a model plant gene, a number of different computational tools and programs were used to explore the functional SNPs of this important rice gene and the possible relationships between genetic mutation and phenotypic variation. A total of 51 SNPs/indels were retrieved from databases, including three important coding non-synonymous SNPs, namely those in exons 6, 9 and 10. Sorting Intolerant from Tolerant (SIFT) results showed that a candidate [C/A] SNP (ID: OryzaSNP2) in exon 6 (coordinate 2494) is the most important non-synonymous SNP, with the highest phenotypic impact on GBSSI. This SNP alters a tyrosine to serine at position 224 of the waxy protein. Computational simulation of the GBSSI protein with Geno3D suggested that this mutant SNP creates a bigger loop on the surface of GBSSI and results in a shape different from that of native GBSSI. Here, we suggest a potential transcription factor binding site (TBF8) which has one [C/T] SNP [rs ] at coordinate 2777, at the intron 7/exon 8 boundary, according to Transcriptional Factor (TF) Search analysis. This SNP might have a major effect on the regulation and function of GBSSI.

The application of single nucleotide polymorphisms (SNPs) in plant breeding involves the analysis of a large number of samples, and therefore requires rapid, inexpensive and highly automated multiplex methods to genotype the sequence variants. A high-throughput multiplexed SNP assay was optimised for eight polymorphisms which explain two agronomic and three grain quality traits in rice. Gene fragments coding for the agronomic traits plant height (semi-dwarf, sd-1) and blast disease resistance (Pi-ta) and the quality traits amylose content (waxy), gelatinization temperature (alk) and fragrance (fgr) were amplified in a multiplex polymerase chain reaction. A single base extension reaction carried out at the polymorphism responsible for each of these phenotypes generated extension products which were quantified by a matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) system. The assay detects both SNPs and indels and is co-dominant, simultaneously detecting both homozygous and heterozygous samples in a multiplex system. This assay analyses eight functional polymorphisms in one 5 μl reaction, demonstrating the high-throughput and cost-effective capability of this system. At this conservative level of multiplexing, 3072 assays can be performed in a single 384-well microtitre plate, allowing the rapid production of valuable information for selection in rice breeding.

9 Table of Contents Title page i Statement of originality ii Acknowledgements iii Abstract iv Table of contents viii List of abbreviations xv Publications arising from thesis xvi Chapter 1 Allele mining and characterization of starch genes in rice: from SNPs to phenotype Starch structure 1 Starch synthesis 1 Starch synthesis enzymes and genes 2 Ka/Ks ratio ("purifying" vs "diversifying" genes) 3 Definition of purifying and diversifying genes 4 ADP-glucose pyrophosphorylase (AGPase), Starch phosphorylase (PHO) and Glucose phosphate translocator (GPT) gene families 5 AGPS2b (small subunit) 5 SPHOL (alpha 1,4 glucan starch phospholrylase) 5 GPT1 (Glucose-6-phosphate translocator) 6 Pathway to amylose 6 Granule bound starch synthesis (GBSS) 6 Pathway to amylopectin 8 Starch Synthase (SS) genes 8 SSI 10 SSII 11 SSIIa 11 SSIIb 12 SSIIIa 13 SSIIIb 13 SSIVa 14 Starch Branching enzymes (SBEs) 14 viii

10 BEI 14 BEIIa 15 BEIIb 15 Debranching Enzymes (DBEs) 16 ISA1 (Iso 1) 16 ISA2 (Iso2) 17 Pullulanase (PUL) 17 Proteins 18 Lipids 19 Environmental factors: Nitrogen (N), Phosphorous (P) and Potassium (K) 19 Thermal stress 20 CO2 21 Objectives of thesis 21 Key concepts 21 Major activities reported in the thesis 22 Chapter 2 Discovery of polymorphisms in starch related genes in rice germplasm by amplification of pooled DNA and deeply parallel sequencing Summary 24 Introduction 25 Materials and methods 27 Plant materials 27 Variability of genotypes 27 Sample preparation and DNA extraction 27 Designation of starch-metabolizing enzymes/genes involved in starch synthesis 28 Target genes for sequence analysis 28 Designing primers to capture target genes 28 Long range PCR protocol (LR-PCR) 29 DNA equimolar pooling 29 Massively parallel sequencing 30 SNP detection and data analysis 30 Total polymorphism rate and functional SNPs 31 Results 31 ix

11 Number of reads and average coverage 31 Polymorphism discovery and SNP/Indel detection 32 SNP variation across the starch related candidate loci 33 ADP-glucose pyrophosphorylase (AGPase), Starch phosphorylase (PHO) and Glucose phosphate translocator (GPT) gene families 34 AGPS2b (small subunit) 34 SPHOL (alpha 1,4 glucan starch phospholrylase) 35 GPT1 (Glucose-6-phosphate translocator) 35 Granule bound starch synthase (GBSS) gene family 37 GBSSI (Granule bound starch synthase I) 37 GBSSII (Granule bound starch synthase II) 38 Starch synthase (SS) family 38 SSI 38 SSIIa 40 SSIIb 41 SSIIIa 41 SSIIIb 42 SSIVa 42 Starch Branching enzymes (SBEs) 43 BEI 43 BEIIa 44 BEIIb 44 Debranching Enzymes (DBEs) 45 ISA1 (Iso 1) 45 ISA2 (Iso2) 46 Pullulanase (PUL) 46 Distribution of SNPs across the loci 47 Ka/Ks ratio ("purifying" vs "diversifying" genes) 47 Discussion 48 Chapter 3 Bioinformatic tools assist screening of functional SNPs in plants: GBSSI in rice as a model gene Summary 52 Introduction 52 x

12 Materials and methods 54 GBSSI gene as a case study 54 Sequence alignment 54 SNP dataset 55 Computational tools for SNP analysis 55 3D Modelling of GBSSI and comparative study 56 Functional flow chart 58 Results 60 SNPs in GBSSI gene and comparative study 60 Computational algorithm tools 60 UTR Scan 60 TF Search 60 SIFT (Sorting Tolerant from Intolerant) 62 GeneSplicer 63 SEE ESE (Sequence Evaluator of Exonic Splicing Enhancers) 64 FAS-ESS (Systematic identification and analysis of exonic splicing silencers) 65 Simulation for finding functional, constructive changes of ns-coding SNPs 66 Discussion 68 Conclusion 71 Chapter 4 SNP in starch biosynthesis genes associated with the nutritional and functional properties of domesticated rice Summary 73 Introduction 74 Materials and methods 77 Plant materials 77 Physiochemical properties 77 Designation of starch-synthesis genes involved in starch metabolize 78 Candidate genes/enzymes for SNP genotyping 78 SNP dataset 78 Primer design and SNP genotyping 79 Capture PCR protocol, primer extension and mass spectrometry 79 Association analysis 79 xi

13 Statistical parameters 80 Results 80 AGPS2b (small subunit) 80 SPHOL (alpha 1,4 glucan starch phospholrylase) 81 GBSSI (Granule bound starch synthase I) 81 GBSSII (Granule bound starch synthase II) 82 SSI 82 SSIIa 82 SSIIb 83 SSIIIa 83 SSIIIb 84 SSIVa 84 SSIVb 85 BEI 85 BEIIb 85 Debranching Enzymes (DBEs) 86 ISA1 (Isoamylase 1) 86 ISA2 (Isoamylase 2) 86 Pullulanase 86 Discussion 86 Neutral genes with no polymorphism or association 87 Major genes with highly significant associations 88 Contributory genes with low-medium associations 89 Minor genes with very low associations 89 Chapter 5 A SNP in GPT1 is closely associated with nutritionally important resistant-retrograded starch in rice Summary 91 Introduction 92 Materials and methods 95 Plant materials 95 Physiochemical properties 95 Designation of starch-synthesis genes involved in starch metabolism 95 xii

14 Discovery of novel SNP in GPT1 and SNP genotyping in population 96 Association analysis 96 Results 96 GPT1 (Glucose-6-phosphate translocator) 97 GBSSI (Granule bound starch synthase I) 100 Allelic combination of SNPs in GPT1 and GBSSI 100 Discussion 101 Chapter 6 SNPs and marker assisted selection (MAS) in plant breeding. A high-throughput assay for rapid and simultaneous analysis of perfect markers for important quality and agronomic traits in rice using multiplexed MALDI-TOF Mass Spectrometry Summary 104 Introduction 105 Materials and methods 106 Genotypes 106 DNA extraction 106 Primer design/generation of SNP markers 107 Capture PCR protocol 107 Shrimp alkaline phosphatase (SAP) incubation 108 Primer extension and mass spectrometry 110 Results 110 Analysis of PCR products 110 Optimal capture primer concentration 110 MgCl2 concentration 111 Identification of SNPs and polymorphisms in agronomic and quality loci 112 sd Pi-ta 115 waxy 115 alk 116 fgr 117 Missing data and heterozygosity 118 Discussion 118 xiii

15 CHAPTER 7 General discussion - Characterisation of starch traits and genes in Australian rice germplasm Background principles 122 Search in SNP data bases and discovery of polymorphisms 122 Screening of functional SNPs 124 Gene copy number in the rice genome 125 Multiplexed MALDI-TOF Mass Spectrometry markers help to genotype individuals in a cost effective manner 125 Association between SNPs in starch biosynthesis genes and the nutritional and functional properties of domesticated rice 126 The 6-glucose-phosphate translocator (GPT1) may contribute to resistant starch 128 Conclusion and further directions 129 References 132 Appendices 150 xiv

List of abbreviations

AC    Amylose content
BDV   Breakdown viscosity
CHK   Chalkiness
FV    Final viscosity
GPT1  Glucose-6-phosphate translocator gene
GT    Gelatinisation temperature
MT    Martin Test
MPS   Massively parallel sequencing
NGS   Next generation sequencing
Ns    Non-synonymous
PT    Pasting temperature
PaT   Paste temperature
PeT   Peak time
P1    Peak
PKT   Peak time
PKV   Peak viscosity
PN    Predicted N
TF    Transcriptional factors
TFBS  Transcriptional factor binding site
SB    Set back
T1    Trough
UTR   Untranslated region

Publications arising from thesis

1) Masouleh AK, Waters DLE, Reinke RF, Henry RJ (2009) A high-throughput assay for rapid and simultaneous analysis of perfect markers for important quality and agronomic traits in rice using multiplexed MALDI-TOF mass spectrometry. Plant Biotechnology Journal. 7:
2) Kharabian, A (2010) An efficient computational method for screening functional SNPs in plants. Journal of Theoretical Biology 265(1):
3) Kharabian-Masouleh A, Waters DLE, Reinke RF, Henry RJ (2011) Discovery of polymorphisms in starch-related genes in rice germplasm by amplification of pooled DNA and deeply parallel sequencing. Plant Biotechnology Journal. 9:
4) Kharabian-Masouleh A, Waters DLE, Reinke RF, Ward R, Henry RJ (2012) SNP in starch biosynthesis genes associated with nutritional and functional properties of rice. Scientific Reports. 2:557; DOI: /srep


CHAPTER 1

Allele mining and characterization of starch genes in rice: From SNPs to phenotype

Starch constitutes most of the dry matter in the harvested organs of crop plants and is one of the most important human foods. Starch is an end product of photosynthesis that is mainly stored as granules in the endosperm of grains and in specialized organelles such as chloroplasts and amyloplasts. Numerous studies have sought to elucidate starch biosynthesis and its genetic control, and to discover the relationships between starch structure, physical properties and the influence of the environment on starch properties. Although a number of comprehensive research and review articles have been published on starch chemistry and pathways of synthesis, much remains unknown, which means it is not yet possible to modify starch components or quality in a predictable way.

Starch structure

Starch, a complex carbohydrate, is a polymer of glucose molecules. It occurs in two main forms: amylose, consisting of predominantly linear chains of glucose monomers linked by α1-4 glycosidic bonds, and amylopectin, in which the chains are branched by the addition of α1-6 glycosidic bonds. Depending upon the species and the site of storage, amylose generally constitutes approximately 10 to 35% of the starch found in plants and the remainder is amylopectin.

Starch synthesis

The biochemistry of starch synthesis is relatively well understood although it is a complex process (Buléon et al., 1998; Libessart et al., 1995). Many enzymes are involved in starch synthesis and several isoforms of these enzymes exist, leading to a highly complex biosynthetic process. The starting point of starch synthesis is glucose derived from photosynthesis in the green parts of plants. This glucose is transported to and deposited in storage tissue, including grain endosperm and tuberous roots. In the amyloplast, glucose is activated by the addition of ADP by ADP-glucose pyrophosphorylase (AGPase) (James et al., 2003). The ADP-glucose is then used by starch synthases, which add glucose units to the growing polymer chain to build the starch molecules (Buléon et al., 1998).

Starch synthesis enzymes and genes

A significant number of enzyme isoforms and activities contribute to starch synthesis and therefore many genes are involved in the process. A simplified pathway diagram of starch biosynthesis and the enzymes and genes involved is shown in Figure 1. If ADP-glucose is considered the main substrate, two different pathways lead to starch, one toward amylose and the other toward amylopectin. Different enzymes and genes play a role in each of these biochemical pathways. These enzymes and genes work in a complex process and each one makes a partial contribution to the starch end product and its quality (Tester et al., 2004). Some starch genes, such as SSIIa, are mainly expressed in the endosperm, others only in leaves, and others in both green and storage tissues. Genes of the non-endosperm type are often expressed together with a gene of the both-tissue type. For example, GBSSII, SSIIb and SSIIIb are leaf expressed and are co-ordinately expressed with SSI, which is expressed in both tissue types (Hirose and Terao, 2004). For this reason, when investigating the association of starch genes with grain quality it is necessary to consider all genes with a possible phenotypic effect. Mutations in genes which operate early in the starch biosynthesis pathway (Fig 1) are likely to influence starch quality or quantity.

Figure 1. A schematic diagram showing the biochemical pathways of cereal starch production.

Ka/Ks ratio ("purifying" vs "diversifying" genes)

The ratio of non-synonymous (Ka) to synonymous (Ks) SNPs can reveal whether a gene has been under purifying, neutral or diversifying selection. The Ka/Ks ratio is used here to classify candidate genes into two main categories, purifying and diversifying genes.

Under neutral conditions of evolution, at the amino acid level, Ka should equal Ks and hence Ka/Ks = 1. Any deviation from this value indicates selection pressure on the genetic structure of the population or the candidate genes. A ratio of Ka/Ks < 1 indicates negative (purifying) selection and a ratio of Ka/Ks > 1 indicates positive (diversifying) selection (Roth and Liberles, 2006). SNPs in the genes studied in this thesis were retrieved from The International Rice Functional Genomics Consortium (IRFGC) database, which holds records of the sequence analysis, including SNPs, of 20 diverse rice (Oryza sativa L.) cultivars. Ka/Ks ratios were calculated for the candidate genes and ranged from 0.11 for SSI to 2.40 for SSIIIa (Table 1). These results indicate that genes such as SSIIIa are under diversifying selection whereas others such as SSI are under purifying selection.

Definition of purifying and diversifying genes

These terms extend the concept of evolution, in which genes, or more accurately allele frequencies, are diversified (diversifying) or purified (purifying) under natural or artificial selection pressure. In natural selection, purifying selection is equivalent to negative selection, in which deleterious alleles (SNPs, point mutations) are gradually removed from the population, which tends to stabilise the population (stabilising selection). In contrast, diversifying (or disruptive) selection is where allele frequencies change and extreme trait values are favoured over intermediate values. This normally follows positive natural or artificial selection.
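The classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the thesis: it approximates Ka/Ks by the ratio of non-synonymous to synonymous SNP counts, and it assumes that a pseudocount of 1 is added to both counts whenever either count is zero (one possible reading of the footnote to Table 1). The function names and the tolerance used to call a ratio neutral are hypothetical.

```python
def ka_ks_ratio(n_nonsyn: int, n_syn: int) -> float:
    """Count-based approximation of Ka/Ks from SNP counts (assumption)."""
    if n_nonsyn < 0 or n_syn < 0:
        raise ValueError("SNP counts must be non-negative")
    # Assumed pseudocount rule: add 1 to both counts if either is zero,
    # so the ratio is never zero or undefined.
    if n_nonsyn == 0 or n_syn == 0:
        n_nonsyn += 1
        n_syn += 1
    return n_nonsyn / n_syn


def selection_type(ratio: float, tol: float = 1e-9) -> str:
    """Map a Ka/Ks ratio to the selection categories used in the text."""
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    if ratio < 1.0:
        return "negative (purifying)"
    return "positive (diversifying)"


# SPHOL is described in the text as having two non-synonymous and
# four synonymous SNPs.
sphol = ka_ks_ratio(2, 4)
print(sphol, selection_type(sphol))
```

With the SPHOL counts quoted in this chapter the sketch gives a ratio below 1, which agrees with the negative selection listed for that gene in Table 1. Values in Table 1 may have been computed differently, so the sketch should not be expected to reproduce every entry.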

ADP-glucose pyrophosphorylase (AGPase), Starch phosphorylase (PHO) and Glucose phosphate translocator (GPT) gene families

These enzymes/genes reside at the top of the starch biosynthetic pathway and are the starting point of grain starch production. Glucose is first activated by the addition of ADP by AGPase and then becomes the substrate for the other major starch enzymes. There are several genes/isozymes in this class but AGPS2b has the highest expression level in rice endosperm (Hirose et al., 2006).

AGPS2b (small subunit)

The role of this subunit in starch granule synthesis has been identified by way of its association with rice shrunken mutants (Kawagoe et al., 2005). A dramatic inhibition of starch synthesis has been observed in AGPase-deficient mutants of rice and some other species, resulting in increased soluble sugars, a large number of underdeveloped granules, small grains and pleomorphic amyloplasts (Rolletschek et al., 2002). In total, 12 SNPs have been retrieved for 20 fully sequenced cultivars in the OryzaSNP@MSU database. The polymorphism rate detected for AGPS2b in the OryzaSNP database is relatively high, and the Ka/Ks ratio (0.25) indicates that this gene has been under negative or purifying human selection.

SPHOL (alpha 1,4 glucan starch phosphorylase)

This gene is generally considered to be involved in starch degradation but recent studies suggest some important roles in starch biosynthesis. Although its precise mechanism and influence are still not well known, the mechanism appears to be associated with phosphorylation of some starch-related enzymes and proteins such as starch branching enzymes (SBEs) and starch synthase (SSIIa) (Tetlow et al., 2004). In total, 11 SNPs are known in this gene, including two non-synonymous and four synonymous. The SNP rate is 1.46 and the gene has been under negative selection (Table 1).

GPT1 (Glucose-6-phosphate translocator)

GPT1 is strongly expressed in the endosperm. This gene is believed to be responsible for the import of essential carbon substrates such as Glc6P into the plastids during grain development (Fischer and Weber, 2002; Jiang et al., 2003). Three SNPs, all in introns, and the Ka/Ks ratio suggest this gene has not been under any selection pressure by humans.

Pathway to amylose

This is the shortest and simplest pathway of starch synthesis and the best characterised. ADP-glucose is converted to amylose by one major enzyme, granule bound starch synthase I (GBSSI).

Granule bound starch synthase (GBSSI)

GBSSI, encoded by the Waxy gene, is the best characterised starch biosynthesis enzyme in plants and has a very significant effect on starch composition and quality. The α1-4 glycosidic bonds of amylose are synthesised by GBSSI. In rice, high GBSSI activity produces high amylose content, leading to a non-waxy, non-sticky or non-glutinous phenotype. Conversely, if the GBSSI gene is partially active or inactive, the waxy (sticky), glutinous phenotype is produced. In maize the waxy phenotype contains no amylose due to a defect in the GBSS encoding gene (Kiesselbach, 1944), while amylose-free potato and cassava cultivars have been generated by GBSS suppression (Raemakers et al., 2005; Visser et al., 1991; Hovenkamp-Hermelink et al, 1987; Kuipers et al., 1994). Several wx mutants and isoforms of GBSS have been reported in barley waxy cultivars which synthesize small amounts of endosperm amylose (Ishikawa et al., 1995; Patron et al, 2002).

There are two isoforms of GBSS, GBSSI and GBSSII. These isoforms are homologous and have approximately 66-69% amino acid sequence identity but their encoding genes are situated at different loci. The gene encoding GBSSI is predominantly expressed in endosperm whereas GBSSII is expressed in leaves and other non-storage tissues (Vrinten and Nakamura, 2000). Therefore, GBSSI is the most important enzyme responsible for endosperm amylose content. GBSSI has been widely studied in different plant species (Nakamura et al., 1998; Nakamura, 2002; Domon et al., 2002; Saito et al, 2004; Shapter et al., 2009). In rice, a significant association between RVA pasting properties and the waxy gene sequence has been found. Three SNP sites in the waxy gene, at the exon 1/intron 1 boundary, in exon 6 and in exon 10, were determined to be responsible for differences in apparent amylose content and pasting properties (Larkin and Park, 2003; Chen et al., 2008). Chen et al. (2008) identified four SNP haplotypes/alleles that explained the high variability of RVA pasting properties in international rice germplasm.

GBSSII in rice is exclusively bound to the starch granules of leaves and has an important function in amylose synthesis in the pericarp of the mature ovary (Nakamura et al, 1998). Starch produced by GBSSII may be stored temporarily in the pericarp and later converted to sugar and transferred to the endosperm as a substrate for starch synthesis during endosperm development (Sato, 1984). The possible existence of SNPs/indels in GBSSII and their impact on starch properties and grain quality is still unknown.

GBSSI has been widely selected by breeders over the past two decades and is thus under purifying selection, or has almost been purified. In contrast, GBSSII appears to be one of the most conserved of all starch genes, as only one SNP has been detected, suggesting this gene has not undergone any artificial selection pressure (Novaes et al., 2008).

Pathway to amylopectin

This is the second branch of the starch synthesis pathway (Fig 1), the end product of which is amylopectin. This pathway is more complex, with many genes/enzymes and their isoforms involved in the process. Although amylopectin is the most abundant constituent of grain starch, the role of the different genes in this pathway in determining starch composition is relatively unknown, perhaps because of the complexity of the pathway.

Starch Synthase (SS) genes

The starch synthases (SS), or soluble starch synthases (SSS), exist in all plants in multiple isoforms and are responsible for the construction of α1-4 glycosidic bonds in amylopectin. There are five genes encoding five different SS isoforms in the rice genome (SSI, SSII, SSIII, SSIV and SSVI). All classes of SS are expressed in the endosperm of plants (Li et al., 1999a; Li et al., 1999b; Li et al., 2000) and probably in all starch synthesising cells (Smith, 1999). There is good evidence that amylopectin chains are synthesized by the coordinated actions of the SSI, SSIIa and SSIIIa isoforms.

Table 1. Polymorphism in rice genes responsible for starch synthesis and their status during domestication.

No | Gene | Locus No (Rice genome annotation Project) | Chro# | Nucleotide length (bp) | Total number of SNPs | Number of n.s SNPs | Number of Synonymous SNPs | Functional SNP rate n.s/total | Polymorphism rate (SNP/Kb) | Ka/Ks ratio* | Selection type | Gene status during domestication
 | AGPS2b | LOC_Os08g25734 | | | 12 | | 3 | 0.00 | | | Negative | Purifying
 | SPHOL | LOC_Os03g | | | | | | | | | Negative | Purifying
 | GPT1 | LOC_Os08g | | | | | | | | | Neutral | Intact
 | GBSSI | LOC_Os06g | | | | | | | | | Negative | Purifying
 | GBSSII | LOC_Os07g | | | | | | | | | Neutral | Intact
 | SSI | LOC_Os06g | | | | | | | | | Negative | Purifying
 | SSIIa | LOC_Os06g | | | | | | | | | Positive | Diversifying
 | SSIIb | LOC_Os02g | | | | | | | | | Positive | Diversifying
 | SSIIIa | LOC_Os08g | | | | | | | | | Positive | Diversifying
 | SSIIIb | LOC_Os04g | | | | | | | | | Positive | Diversifying
 | SSIVa | LOC_Os01g | | | | | | | | | Positive | Diversifying
 | BEI | LOC_Os06g | | | | | | | | | Positive | Diversifying
 | BEIIa | LOC_Os04g | | | | | | | | | Neutral | Conservative
 | BEIIb | LOC_Os02g | | | | | | | | | Neutral | Conservative
 | ISA1 | LOC_Os08g | | | | | | | | | Negative | Purifying
 | ISA2 | LOC_Os05g | | | | | | | | | Positive | Diversifying
 | PUL | LOC_Os04g | | | | | | | | | Positive | Diversifying

Sequencing data and polymorphism of SNPs derived from the OryzaSNP@MSU database for 20 cultivated rices.
*To avoid a value of zero for the Ka/Ks ratio, +1 is added when the number of non-synonymous and synonymous SNPs is zero.

28 SSI SSI is primarily responsible for the synthesis of the shortest chains of amylopectin of about 10 glucosyl units or less (DP 7-11). This gene/protein is presumed to be expressed in the endosperm and leaf of rice (Fujita et al., 2006). The SSI gene is located on chromosome 7S of wheat and encodes a Mr protein that is distributed between starch granule and the soluble phase (Li et al, 1999). Studies on chain-length specificities of maize SSI affinities have revealed that the entire carboxy-terminal region of this protein is necessarily required for starch binding (Commuri and Keeling, 2001). RT-PCR analysis shows that there is only one SSI isoform in rice which has steady expression (Hirose and Terao, 2004). The transcript level of SSI is higher in endosperm than leaf sheaths and blades and has therefore been classified as an endosperm and non-endosperm expressing gene (Hirose et al., 2006). The measurement of SSI transcript levels at different seed developmental stages found high expression at 1-3 days after flowering (DAF), peaking at 5 DAF, and remaining almost constant during endosperm starch synthesis, suggesting SSI is the major SS form in cereals (Cao et al., 1999). A comprehensive analysis of mutant rice with a retrotransposon inserted into the SSI encoding gene revealed SSI has a capacity for the synthesis of chains with DP8-12 with the extension of smaller chains (Nakamura, 2002). Fujita et al. (2006) generated four SSIdeficient rice mutant lines using retrotransposon Tos17 insertion. The deficient mutants exhibited a 0%-20% decrease in the amount of SSI protein in comparison to wild type, changed amylopectin structure and increased the gelatinization temperature of endosperm starch, although the complete absence of 10

SSI had no effect on the size and shape of seeds and starch granules or on the crystallinity of endosperm starch (Fujita et al., 2006). This gene has a very small phenotypic effect on rice eating quality, although a significant negative correlation between the ratio of short chains (DP 6-12) and gelatinization temperature has been reported (Umemoto et al., 2008). Although 46 SNPs, including one non-synonymous and four synonymous, have been detected in rice, none has yet been reported in the SSI gene of any plant species as being associated with starch composition or with amylopectin quality or quantity. This gene is highly polymorphic, with 8.64 SNPs/Kb, and is undergoing purifying selection.

SSII

SSII is responsible for the synthesis of the shortest chains, and further extensions to produce longer chains are catalysed by SSIIa and/or SSIII (Commuri and Keeling, 2001). Previous studies show that there are three isoforms of SSII in monocots: SSIIa, SSIIb and SSIIc. The role of the latter two in starch biosynthesis, especially SSIIc, which is only expressed in source tissue, is unknown as no mutants have been found yet (Tetlow et al., 2004).

SSIIa

SSIIa is known to have a major effect on starch quality. This gene is predominantly expressed in cereal endosperm at very high levels and affects amylopectin structure (Craig et al., 1998; Morell et al., 2003). Loss of SSIIa results in reduced starch content and amylopectin chain length, and in modified granule morphology and crystallinity. In monocots, SSIIa elongates the short glucan chains of DP 10 to the intermediate size of DP 12-24, thus its loss or down-regulation has a dramatic impact on the amount and composition of starch (Tetlow et al., 2004). The effect of this gene on rice cooking quality and rice starch texture has clearly been demonstrated by virtue of a significant correlation between gelatinisation temperature (GT)

and particular SSIIa alleles (Umemoto et al., 2002; Umemoto et al., 2004). Alk, a major gene regulating alkali disintegration, resides at the same position as SSIIa on chromosome 6 of rice (Gao et al., 2003). Further studies have shown the GT of rice flour, the chain length distribution of amylopectin and the alkali spreading score are associated with different SSIIa haplotypes (Umemoto and Aoki, 2005). GT, alkali disintegration and eating quality of rice starch have been explained by polymorphism at two SNPs, [A/G] and [GC/TT], within exon 8 of the alk locus (Waters et al., 2006). These two SNPs were able to explain the classification of 70 rice genotypes into either high GT or low GT types, which differed in GT by 8 °C (Waters et al., 2006). Polymorphism analysis of this gene found 2.4 SNPs per Kb and indicates it is under positive human selection (Table 1).

SSIIb

SSIIb is a low-level, early expressed gene which is primarily expressed in leaf blades and sheaths (leaf specific) at an early stage of grain filling (Hirose and Terao, 2004). However, a recent study presented evidence that SSIIb contributes, with six other starch genes, to altering some Rapid Visco Analyser (RVA) parameters in glutinous rice (Yan et al., 2010). The exact role of SSIIb in starch synthesis is currently unknown, mainly due to a lack of mutant phenotypes. There were 12 SNPs, including six non-synonymous SNPs, indicating this gene may have some phenotypic impact due to the high number of SNPs in exonic regions. Domestication has exerted positive selection pressure on SSIIb.

SSIIIa

The SSIIIa encoding gene is highly expressed in endosperm, although some reports reveal expression in green tissues (Dian et al., 2005). A recent study of an SSIIIa-deficient rice mutant found amylose content and the extra-long chains of amylopectin increased by 1.3- and 12-fold, respectively, due to an increase in GBSSI activity (Fujita et al., 2007). In spite of a relatively high functional SNP rate of 1.686, this gene does not show a highly significant association with rice physiochemical characteristics. For example, Yan et al. (2010) found no functional effect on RVA parameters, at least among glutinous cultivars. Out of 52 SNPs, 19 are non-synonymous and 11 synonymous, suggesting SSIIIa is under diversifying selection, with a Ka/Ks ratio above one.

SSIIIb

SSIIIb is mainly expressed in rice endosperm, but transient expression in leaf sheaths and leaves has also been reported (Hirose et al., 2006). It has been placed in two different categories on the basis of the timing of its expression in the developing seed: the late expression category, in which it is expressed in the mid to later stage of grain filling (Hirose and Terao, 2004), and the early expression category, in which the transcript level increases to a maximum at 3-5 days after flowering (Ohdan et al., 2005). An association study of glutinous near-isogenic rice lines suggested SSIIIb has a significant impact on RVA parameters such as peak time and pasting temperature (Yan et al., 2010). The total number of SNPs reported in the OryzaSNP database is 24, of which eight are non-synonymous and five synonymous, giving a high Ka/Ks ratio of 1.6. This ratio suggests this is a diversifying gene which has been under positive selection during domestication.

SSIVa

SSIVa is one of the least known starch genes in plants. Like most starch synthase genes, SSIVa is exclusively involved in amylopectin biosynthesis. Expression analysis by reverse transcription PCR indicated SSIVa is preferentially expressed in rice endosperm, and to a degree in leaf blades, as a late or steady expresser gene during grain filling (Hirose and Terao, 2004). QTL mapping and expression profile analysis have shown that high temperature during grain filling can increase the transcription level of SSIVa by up to 1.11-fold, which is considerably higher than for other starch synthase genes (Yamakawa et al., 2007), and may contribute to grain chalkiness (Yamakawa et al., 2008). In total, 25 SNPs have been reported for this gene in the OryzaSNP database, of which five are nsSNPs and two synonymous, giving a Ka/Ks ratio of 2.5 and indicating SSIVa is diversifying under human selection. SSIVa may also affect some secondary RVA parameters such as breakdown and setback (Yan et al., 2010).

Starch Branching enzymes (SBEs)

Starch branching enzymes (SBEs) break α-(1→4)-linkages in existing chains and attach the released reducing ends to C6 hydroxyls, forming the branched glucan, amylopectin (Tetlow et al., 2004).

BEI

BEI is mainly expressed in the endosperm and transcript levels increase rapidly 3-5 days after flowering. Biochemical observations with purified BEI from maize endosperm indicate BEI preferentially branches amylose-type polyglucans and has a high capacity for branching less branched α-glucans (Takeda et al., 1993). Analysis of the catalytic properties of BEI has indicated the N- and C-termini play a critical role in chain length transfer and substrate

preference (Kuriki et al., 1997). A rice BEI-deficient mutant induced by mutagenesis exhibited modified amylopectin structure and grain morphology but the same quantity of starch as the wild type (Satoh et al., 2003), and the BEI encoding gene also affects the RVA profile (Yan et al., 2010). The OryzaSNP@MSU database showed 14 SNPs in total, of which only two were nsSNPs and one synonymous. Therefore, the Ka/Ks ratio of two suggests this is a diversifying gene.

BEIIa

BEIIa is a leaf-expressed gene involved in amylopectin synthesis. BEIIa is also expressed in the endosperm, but at levels 10-fold lower than in leaf tissue (Gao et al., 1997). An association study of this gene and RVA properties demonstrated a low F value (6.60), with a very slight influence in glutinous rice (Yan et al., 2010). No SNP/Indel has been reported in the OryzaSNP database, suggesting BEIIa might be one of the most conserved starch-related genes in rice.

BEIIb

BEIIb is known as amylose extender (ae) in maize and other cereals (Yun and Matheson, 1993), and many studies have reported the significance of this gene for starch properties in various plant species (Fisher et al., 1993; Sun et al., 1997; Sun et al., 1998). This is a granule- and soluble-associated enzyme which is only expressed in the endosperm. Expression of three different functional maize SBE genes in BE-deficient yeast strains demonstrated the presence of BEIIb is necessary to activate BEI and BEIIa (Seo et al., 2002). Additionally, a 0.5- to 0.7-fold decrease in the expression of BEIIb during grain filling creates chalky rice (Tanaka et al., 2004).

Only five SNPs are in the OryzaSNP database, none of which is in the exonic regions, despite the results of a recent association study that determined a very high F value between BEIIb and RVA properties in rice (Yan et al., 2010).

Debranching Enzymes (DBEs)

DBEs belong to the α-amylase family, of which two classes exist in plants, Isoamylase and Pullulanase. These enzymes debranch (hydrolyse) α-(1-6)-linkages in amylopectin and pullulan. Defective DBEs in plants are thought to be responsible for the accumulation of phytoglycogen rather than starch and, in turn, for changes in the phenotypic appearance of the endosperm (Bustos et al., 2004).

ISA1 (Iso 1)

In wheat, the expression of ISA1 cDNA was highest in developing endosperm and undetectable in mature grains, suggesting a fundamental biosynthetic role of Isoamylase 1 in plant starch, although the precise roles of DBEs are not yet known (Tetlow et al., 2004). Transcript level regulation of ISA1 during rice grain filling in response to high temperatures has been reported by Yamakawa et al. (2007), in which the expression level of ISA1 mRNA increased by 0.94-fold under high temperatures, 8 to 30 days after flowering. In rice endosperm, antisense inhibition of Isoamylase 1 altered the structure of amylopectin and the physiochemical properties of starch (Fujita et al., 2003). ISA genes are also thought to contribute to the degree of setback in glutinous rice cultivars (Yan et al., 2010). In the OryzaSNP database 16 SNPs were detected, of which two are nsSNPs. The Ka/Ks ratio signifies this gene has undergone negative selection during domestication.

ISA2 (Iso2)

Isoamylase 2, which corresponds to sugary1 (su1), was first reported in maize endosperm and could be separated from Isoamylase 1 by anion-exchange chromatography (Beatty et al., 1999; Doehlert and Knutson, 1991). A high rate of functional SNPs and total polymorphisms was observed for this gene (Table 1). The Ka/Ks ratio of 1.5 suggests this gene is under positive selection and is one of the most diversifying of the starch-related genes. The high polymorphism rate of 4.16 supports this assertion. The association between ISA2 and rice grain quality is unclear. There is no intron in this relatively small gene (2625 bp), thus each detected SNP/Indel is potentially important.

Pullulanase (PUL)

In rice endosperm a defect in pullulanase-type DBE activity triggers and modulates some phenotypic effects (Nakamura et al., 1998). In maize endosperm, it is believed that pullulanase has a dual role, contributing to both starch synthesis and degradation (Dinges et al., 2003). Kubo et al. (1999) suggest pullulanase plays a predominant and essential role in amylopectin synthesis and compensates for shortages of isoamylase activity in the construction of the multiple cluster structure of amylopectin. The highest polymorphism rate in the OryzaSNP database was observed for PUL. In total, 108 SNPs were detected in this relatively large gene (10399 bp), of which 10 are non-synonymous and 9 synonymous. A Ka/Ks ratio of 1.11 indicates that, although this is a diversifying gene, its ratio is close enough to one that it could easily become a neutral gene. A recent association study between PUL and RVA profile parameters in glutinous rice has shown strong relations of this gene with peak viscosity, hot paste viscosity, breakdown viscosity and peak time (Yan et al., 2010). Nevertheless, our study only showed very minor

association between PUL and some physiochemical properties such as chalkiness, gelatinization and pasting temperature (chapter 4).

Proteins

A wide range of starch granule-associated proteins has been found in different botanical sources, and these are diverse in number, identity and possibly function (Baldwin, 2001; Schofield and Greenwell, 1987). They are classified into two main categories, low and high molecular weight proteins. It is widely accepted that these proteins are located inside and/or on the surface of starch granules and influence starch properties. The composition and content of these proteins significantly affect the structure and quality of starch and the baking quality of cereals. Juliano et al. (1965) studied the relation of amylose content, protein content, water absorption and gelatinization temperature to the cooking and eating qualities of non-waxy rice and found protein content and cooked rice colour are positively correlated.

Retrogradation is the hardening of cooked rice after storage or cooling. Retrogradation rate has significant implications for rice consumers, as many of them cook rice in the morning and consume it after several hours of refrigerated storage. Recent studies have shown removal of total proteins causes softer gels at different storage temperatures such as 20, 40 and 60 ºC but does not affect firmness following refrigerated storage (Philpot et al., 2006). This suggests protein could have a major influence on retrogradation at room temperature; however, other factors or biochemical mechanisms, such as lipid content, might be involved at lower temperatures. Application of pronase, which digests peptides and proteins, to rice kernels and milled starch caused a significant change in thermal properties and gelatinization profile (Marshall et al., 1990).

Ragaee and Abdel-Aal (2006) observed significant differences among a number of cereals in physiochemical properties such as starch peak, breakdown and setback viscosities (RVA curves), as well as in protein peak viscosity.

However, proteins inside hydrated, swollen maize or wheat starch granules after gelatinization are degraded extensively by proteases without any apparent change in properties (Debet and Gidley, 2007).

Lipids

A number of lines of evidence suggest lipids have a significant role in influencing physiochemical properties of rice such as retrogradation. Debet and Gidley (2007) found proteins and lipids on the granule surface are determinants of ghost robustness and have a role in ghost formation and integrity: a surface film, rich in protein and lipid, limits expansion of starch granules and prevents dissolution after gelatinization. Lipid removal from the rice variety Koshihikari, grown in different countries, increased the retrogradation rate and the firmness of gels after storage at different temperatures. The greater the amount of long chain amylose complexed with lipids, the greater the reduction in the degree of retrogradation, which is caused by the unavailability of long amylose chains (Philpot et al., 2006). Lipids are also involved in the physical structure of the rice kernel. Treatment of different rice varieties with hexane caused significant changes in gelatinization parameters and kernel shape and caused extensive fissure formation (Marshall et al., 1990).

Environmental factors: Nitrogen (N), Phosphorus (P) and Potassium (K)

The optimal application of nitrogen fertilizers for Australian rice cultivars is kg/ha, and an amount higher or lower than this is considered to be a high or low level. Application of different levels of nitrogen in the field during rice plant development influences solid loss and water uptake ratio during cooking. Rice grown under nitrogen application has higher cooked grain hardness, cohesiveness and chewiness, lower amylose content, and higher pasting and gelatinisation temperatures and enthalpy (Singh et al., 2011). Pot and field

experiments confirmed that increasing N application decreased amylose content, peak viscosity and breakdown viscosity, while setback and consistency increased (Dayong et al., 2004). Other macro-nutrients such as phosphorus (P) and potassium (K) have also been studied in relation to rice grain amylose content and starch viscosity properties. Application of P has no obvious effect on amylose content, peak viscosity, breakdown, setback or gel consistency. However, increasing amounts of K increased amylose content, peak viscosity and breakdown, while setback and gel consistency were reduced. The interaction of NPK fertilizers on the quality characters of different varieties appeared to be significant, and reducing N while increasing K improved rice cooking and eating quality (Dayong et al., 2004).

Thermal stress

Thermal stress (temperatures above 37 °C) during the critical grain filling period can affect the biochemical processes of starch deposition, causing yield loss and starch defects (Peng et al., 2004). Failure in starch deposition results in lightly packed granules and grain chalkiness (Zakaria et al., 2002). It is believed that during heat shock, especially at grain filling, expression of some starch related genes such as GBSSI, BEIIb and a cytosolic dikinase gene is down-regulated, whereas some heat shock proteins and alpha-amylases are up-regulated (Yamakawa et al., 2008). Yamakawa et al. (2007) suggested that the decreased level of amylose and the long chain-enriched amylopectin in high temperature-ripened grains are mainly due to repressed expression of GBSSI and BEIIb, respectively. They also reported the expression level of various genes in response to high temperature. Tashiro and Wardlaw (1991a) showed that high temperature at the milky stage of grain filling has the most extensive influence on rice grain chalkiness, because the panicle is the organ most sensitive to high temperature (Sato et al., 1973).

CO2

Differences in carbon dioxide concentration have no consistent effect on grain and starch parameters of wheat, although small effects have been detected on thousand grain weight, starch content and lipid-free amylose content (Tester et al., 1995). Evaluation of the long-term effects of different CO2 concentrations on carbohydrate status and partitioning of rice (Oryza sativa L. cv. IR-30) found the photosynthesis rate increased substantially with CO2 concentrations up to 500 μmol mol⁻¹ and then reached a plateau at higher concentrations (Rowland-Bamford et al., 1990). The ratio of starch to sucrose concentration was positively correlated with the CO2 concentration, although CO2 concentration had no effect on the carbohydrate concentration in the grain at maturity.

Objectives of thesis

The objectives of this thesis were to:
1) Characterise rice starch biosynthesis genes;
2) Discover DNA polymorphisms in Australian rice germplasm using new cutting edge technologies (Next Generation Sequencing);
4) Detect and prioritise functional SNPs in starch related genes using computational tools;
5) Associate SNPs (genes) with the physiochemical properties of rice grain.

Key concepts

The key concepts encompassed by this thesis are:
1) Rice starch varies between rice cultivars due to differences in the gene sequence of the enzymes which synthesise the rice starch;
2) Humans can detect differences in rice starch and these differences define rice quality. These differences can be instrumentally quantified;

3) Differences in gene sequence can be defined and the extent to which they control, or are associated with, rice quality differences measured;
4) The chemical properties and structure of the 20 constituent protein amino acids, and the accumulated knowledge of how protein structure and function are linked, can be used in algorithms which predict how amino acid differences in any one protein may impact the function of that protein.

Major activities reported in the thesis

The major activities reported in the thesis were:
1) DNA sequences of 18 starch related genes were retrieved from databases and the location and type of SNPs identified;
2) The retrieved SNPs were analysed and then prioritised based on their predicted importance using bioinformatic tools and algorithms;
3) Long range PCRs (LR-PCR) amplified the 18 starch related genes in 233 Australian rice lines/cultivars;
5) The amplified products were pooled and then sequenced using an Illumina GAIIx platform;
6) The sequencing data were analysed and SNPs detected;
7) The SNPs retrieved from databases were compared with SNPs discovered in the sequencing experiment and novel variations identified;
8) Specific markers were designed and generated for multiplexed MALDI-TOF assay of SNPs;
9) All 233 genotypes were assayed (genotyped) individually for all SNPs using multiplexed MALDI-TOF;
10) The phenotypic data for physiochemical traits of 233 rice individuals were obtained from an Australian breeding program;

11) Association between SNPs (genes) and traits was assessed using the software TASSEL following a General Linear Model;
12) Data flowing from these activities were discussed within the context of published work in the field.

CHAPTER 2

Discovery of polymorphisms in starch related genes in rice germplasm by amplification of pooled DNA and deeply parallel sequencing

Summary

High-throughput sequencing of pooled DNA was utilised for polymorphism discovery in candidate genes involved in starch synthesis. A total of 17 rice starch synthesis genes, encoding seven classes of enzymes, namely ADP-glucose pyrophosphorylase (AGPase), granule bound starch synthase (GBSS), soluble starch synthase (SS), starch branching enzyme (BE), starch debranching enzyme (DBE), starch phosphorylase (SPHOL) and phosphate translocator (GPT1), were PCR amplified from 233 genotypes using semi- to long range PCR. The amplification products were equimolarly pooled and sequenced using massively parallel sequencing technology (MPS). By detecting SNPs/Indels in both coding and non-coding areas of the genes, the SNP/Indel variation and distribution patterns among individual starch candidate genes were identified and characterised. Approximately 60.9 million reads were generated, of which 54.8 million (90%) mapped to the reference sequences. The average coverage ranged from 12,708 for SSIIa to 38,300 for SSIIIb. SNPs and single/multiple-base Indels were analysed in a total assembled length of 116,403 bp. A total of 501 SNPs and 113 Indels were detected across the 17 starch related loci. The ratio of non-synonymous to synonymous SNPs (Ka/Ks) indicated GBSSI and Isoamylase 1 (ISA1) as the least diversified (most purified) genes, reflecting the population's history of selection for low amylose content and gelatinization temperature. This report demonstrates a useful strategy for screening germplasm by MPS to discover variants in a specific target group of genes.

Introduction

The capacity of massively parallel sequencing to simultaneously assay millions of single nucleotide polymorphisms (SNPs) has made genome-wide studies possible (Schuster, 2008). The use of next generation sequencing (Thomas et al., 2006) platforms for population-based sequencing of targeted genomic regions enables the discovery of new variants and their frequencies across selected genes (Harismendy and Frazer, 2009), and allows the identification of errors in previously published reference sequences (Bentley, 2006) or SNP databases (Velicer et al., 2006). Massively parallel sequencing technology (Genome Analyser) is a flexible, high-throughput platform for genetic analysis and functional genomics which is based on ultra-deep sequencing of short reads and a huge number of sequencing reactions (Imelfort et al., 2009). This platform utilises a sequencing-by-synthesis approach in which all four nucleotides are added simultaneously, followed by an optical imaging procedure at each base incorporation step (Mardis, 2008b), and has been widely used by researchers to discover SNPs associated with human genetic diseases, particularly in cancer studies (Bentley, 2006; Mardis, 2008a). This platform can be utilised in different ways, from whole genome sequencing (WGS) of plants and animals to sequencing of specific genomic regions or even functional encoding genes or loci (Bentley et al., 2008; Hillier et al., 2008; Kim et al., 2009). Massively parallel sequencing (MPS) is an attractive, cost-efficient technology that enables characterisation of genetic traits on an unprecedented scale in terms of the number of genes, the number of samples and the allele frequency, which is necessary if rare alleles are to be found (Kaiser, 2008; Pettersson et al., 2009). Recently, targeted MPS has been effectively integrated with Long Range PCR (LR-PCR) of pooled DNA samples, which minimises the cost of sequencing, amplification, oligonucleotides and labour (Out et al., 2009).
LR-PCR targeted MPS can be employed to deeply sequence regions surrounding candidate genes

containing SNPs/indels (Varley and Mitra, 2008). Utilising this approach, the full extent of allelic variation in a vast number of encoding genes involved in various aspects of physiology, disease etc. can be recovered and large regions of linkage disequilibrium (~5-11 kb) identified (Bodmer and Bonilla, 2008). One of the major advantages of this approach is the capacity of MPS and targeted gene amplification to provide a high sequence depth at all studied loci simultaneously. For example, a total sequence yield of 1 Gb means a fragment of 10 kb will be read approximately 100,000 times (Out et al., 2009), which meets the requirements for the discovery of rare alleles (Druley et al., 2009; Thomas et al., 2006; Ingman and Gyllensten, 2008). The flexibility of the platform is extended when multiple genomic regions of numerous individuals from wild or segregating populations are pooled.

Rice (Oryza sativa L.) starch, a complex carbohydrate, is one of the most important crop products for humankind (Fitzgerald, 2004). Starch is synthesised by the activity of several enzymes and has been the subject of extensive study (Morell et al., 2003). Each of the starch synthesis enzymes exists as a number of different isoforms and is usually classified into one of several specific groups of genes, such as ADP-glucose pyrophosphorylase (AGPases and GPT1), starch synthase, starch branching enzyme, starch debranching enzyme and starch phosphorylase (James et al., 2003).

In this study, DNA of 233 individuals from a breeding population was equimolarly pooled, and 17 rice starch quality-related genes, encoding seven classes of starch enzymes which are part of the starch biosynthesis pathway, were amplified by an LR- or Semi LR-PCR (SLR-PCR) protocol. The pooled, targeted amplification products were subsequently sequenced using MPS technology (Illumina Inc., San Diego, CA). By detecting SNPs/Indels contained in

both coding and non-coding areas of the genes, SNP/Indel distribution patterns were characterised.

Materials and methods

Plant materials

All plant material was supplied by Industry and Investment NSW, Yanco Agricultural Research Institute, Australia. Two hundred and thirty-three rice lines from a breeding program were analysed. These lines were significantly diverse in starch quality properties, providing a high rate of variation for starch traits.

Variability of genotypes

This population comprised a series of lines at the F6 stage, from harvested pedigree rows entering the first stage of plot testing. This captured a wide set of lines from a breeding program focused on temperate (japonica-type) rice. Selection was primarily based on plant height, the capacity to flower and set seed in the temperate environment of Australia, and visual inspection of grain size and shape. No selection had taken place for quality traits such as gelatinization temperature and RVA curve data (Appendix 2).

Sample preparation and DNA extraction

Rice seeds of each line were germinated at 25 °C and seedlings from each individual selected for DNA extraction. Total genomic DNA was extracted from 15-day-old seedlings using the DNeasy 96 Plant kit (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions.

Designation of starch-metabolizing enzymes/genes involved in starch synthesis

The major databases, such as NCBI and the Rice Genome Annotation Project, were searched for the general entries of nucleotide sequences (gDNA) and full-length cDNAs of important gene classes which are presumed to be involved in starch biosynthesis. The available literature was used to choose the most likely candidate genes associated with rice starch quantity and quality (Ohdan et al., 2005; Waters and Henry, 2007; Nakamura, 2002; Hirose et al., 2006; Rahman et al., 2000). Multiple sequence alignment of the selected genes was carried out using Sequencher (Gene Codes Corporation, Ann Arbor, MI, USA) and CLUSTAL W, and a consensus sequence alignment was generated for each candidate gene to design the amplification primers.

Target genes for sequence analysis

The present study focused on the genes encoding seven groups of enzymes, namely ADP-glucose pyrophosphorylase (AGPase), granule bound starch synthase (GBSS), starch synthase (SS), branching enzyme (BE), debranching enzyme (DBE), starch phosphorylase (PHO) and glucose phosphate translocator (GPT).

Designing primers to capture target genes

The sequence of each target gene, including exons and introns, was divided into two relatively equal fragments. Each selected sequence included 500 bp up- and downstream of the coding regions (5′ and 3′ UTRs) and an approximately 300 bp overlap in the middle. A set of specific primers was designed for each half using Clone Manager V9.1 (Sci-Ed Software, NC, USA) (Appendix 3).
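The fragment layout described above (two roughly equal halves, 500 bp flanks and an overlap of about 300 bp) can be sketched as follows. This is only an illustration of the coordinate arithmetic, not the procedure used in Clone Manager; the function name, the 1-based inclusive coordinates and the example gene position are hypothetical.

```python
def split_into_two_amplicons(gene_start, gene_end, flank=500, overlap=300):
    """Return two (start, end) target windows in 1-based inclusive coordinates.

    The gene region is extended by `flank` bp on each side and then cut
    at its midpoint, with the two halves sharing `overlap` bp in the middle.
    """
    start = max(1, gene_start - flank)   # do not run off the reference start
    end = gene_end + flank
    mid = (start + end) // 2
    half = overlap // 2
    first = (start, mid + half)          # upstream half plus half the overlap
    second = (mid - half + 1, end)       # downstream half plus half the overlap
    return first, second


if __name__ == "__main__":
    # Hypothetical gene spanning positions 1001-9000 on a reference sequence.
    first, second = split_into_two_amplicons(1001, 9000)
    shared = first[1] - second[0] + 1
    print(first, second, shared)
```

For the hypothetical 8-kb gene in the example, the two windows are each 4,650 bp long and share 300 bp, so both amplicons stay within a typical long range PCR size while the overlap confirms that the halves join correctly.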

Long range PCR protocol (LR-PCR)

The concentration of extracted DNA was quantified by the automated fluorometric protocol of PicoGreen (PicoGreen dsDNA Quantification Kit, Invitrogen, CA, USA) and then diluted to ng/µl for amplifications. A unified LR-PCR approach was applied to amplify all genes (with few exceptions) simultaneously. Bio-Rad iProof™ High-Fidelity DNA polymerase was used for PCR amplifications in 10 µl reactions containing 20 ng of pooled genomic DNA of 20 individuals. The extreme fidelity of iProof™ makes it the enzyme of choice for SNP detection in long amplicons. PCRs were performed using 2 µl of HF or GC buffer (HF buffer is used for normal and GC buffer for GC-rich sequences), 0.2 µl of dNTPs (10 mM), 2 µl each of forward and reverse primer (2.5 µM), 1 µl of pooled DNA (20 ng/µl), 0.1 µl of iProof™ polymerase (0.2 unit) and 2.7 µl of sterile water. Although the different genes needed unique optimal conditions for amplification, a unified PCR method to amplify all targeted genes simultaneously was attempted. The touchdown PCR protocol was performed using a Corbett PCR thermocycler as follows: 98 °C for 1 min (1 cycle), followed by 10 touchdown cycles of 98 °C for 10 s, annealing temperature of (10 °C touchdown) and 72 °C extension for 4 min, followed by 28 cycles of normal amplification at 98 °C for 10 s, 62 °C for 20 s and 72 °C for 4 min. The final extension was one cycle of 72 °C for 10 min. Prior to Illumina sequencing, the PCR products were Sanger sequenced using BigDye Terminator version 3.1 (Applied Biosystems, Foster City, CA). The generated sequences were aligned with the reference sequence to ensure the correct gene had been captured.

DNA equimolar pooling

A uniform pooling strategy was applied for all samples. The genomic DNA of 233 breeding lines, which had already been normalised for PCR in the previous stage (30 ng/µl), was divided into

12 sections, each containing a pool of approximately 20 individuals, and LR-PCRs were carried out. The concentration of PCR products from these pools was measured using PicoGreen (PicoGreen dsDNA Quantification Kit, Invitrogen, CA, USA). A second pool was made for each fragment from the PCR products. To facilitate the final equimolar pooling of PCR products, the concentrations of the second pools (33 second pools/amplicons) were individually normalised to 25 ng/µl and then equimolarly pooled into a mega pool based on the predicted lengths, giving consideration to the requirement that larger amplicons need a greater mass of DNA than smaller fragments to give the same number of copies. The final mega pool was prepared with the aim of having a final quantity of 2.5 µg of long amplicons, including all 233 individuals.

Massively parallel sequencing

The final mega pool was subjected to Illumina GA sequencing (Illumina Inc., San Diego, CA). PCR product fragmentation and library preparation were carried out according to the manufacturer's instructions. Fragments with a length of approximately 200 bp were selected for sequencing and 4 pmol of the library was added onto one flowcell.

SNP detection and data analysis

Data analysis steps such as filtering, trimming and mapping to the reference sequences were performed with the CLCbio Work Bench 4 using 17 reference sequences with the specified coordinates, extracted from GenBank (Table 3). The CLC Work Bench general parameters were set as follows: conflict resolution was changed to all four nucleotides (vote A, C, G, T), and non-specific matches and masking of references were ignored. The read parameters were left at their defaults, as the min-max distance, mismatch cost, length fraction and similarity were , 3, 0.9, and 0.9, respectively, for both single and paired end reads. This set of parameters was selected

in order to minimize read alignment ambiguities as well as to detect rare SNPs. The minimum coverage and the minimum variant frequency were set at 20 and 0.5%, respectively, which meant that all variants at or above a frequency of 0.5% were considered SNPs.

Total polymorphism rate and functional SNPs

The total polymorphism rate was calculated as:

$$\text{Total polymorphism rate} = \frac{TSI}{TL} \times 100$$

where TSI is the total number of SNPs and Indels and TL is the total length of each candidate gene. The functional or non-synonymous SNP rate was calculated as:

$$\text{Non-synonymous SNP rate} = \frac{NS}{TL} \times 1000$$

where NS is the number of non-synonymous SNPs in each locus and TL is the total length of each candidate gene.

Results

Number of reads and average coverage

Sequencing of the LR-PCR products of all 17 studied loci generated ~60.9 million reads, of which 54.8 million (90%) mapped to the reference sequences. Table 1 shows the summary statistics of the mapping report. The average coverage differed among loci and ranged from 12,708 for SSIIa to 38,300 for SSIIIb (Fig. 1). This difference may be related to factors such as the concentration of amplicons, PCR efficiency, the number of non-specific products and contamination with external PCR products. For example, LR-PCR products of the SSIIa gene showed a number of non-specific bands on agarose gel, which led to more unmapped reads and lower coverage. The highest and lowest numbers of reads were obtained for SSIIIa and SSIIa, with 5,920,785 and 876,986 reads, respectively.
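The calling thresholds stated in the Methods (minimum coverage of 20 and minimum variant frequency of 0.5%) can be illustrated with a small filter. This is only a sketch of the threshold logic under those two stated values; it is not the CLC Workbench algorithm, and the function and constant names are hypothetical.

```python
# Illustrative threshold filter; names are hypothetical, not CLC Workbench API.
MIN_COVERAGE = 20            # minimum read depth at a position (value from the text)
MIN_VARIANT_FREQ_PCT = 0.5   # minimum variant frequency, in percent (value from the text)


def is_called_variant(variant_reads: int, coverage: int,
                      min_coverage: int = MIN_COVERAGE,
                      min_freq_pct: float = MIN_VARIANT_FREQ_PCT) -> bool:
    """Return True if a position passes both the depth and frequency thresholds."""
    if coverage < min_coverage:
        return False  # too few reads to evaluate the position
    freq_pct = 100.0 * variant_reads / coverage
    return freq_pct >= min_freq_pct


# Example at the lowest average coverage reported (12,708 for SSIIa):
# 64 variant reads is just above 0.5%, 63 is just below.
print(is_called_variant(64, 12708))
print(is_called_variant(63, 12708))
```

At this depth a variant needs at least 64 supporting reads to reach the 0.5% threshold, which shows how the frequency cut-off, rather than the coverage cut-off, does most of the filtering in a deeply sequenced pool.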

Polymorphism discovery and SNP/Indel detection

Starch quality loci of 233 breeding lines were successfully sequenced to great depth and coverage. SNPs and single- or multiple-base Indels were discovered in a total length of 116,403 bp assembled by Genome Analyser (GA). In total, 501 SNPs and 113 Indels were detected across the 17 starch-related loci (Appendix 1). The total number of polymorphisms was then compared with the SNPs available in the OryzaSNP@MSU database (Table 2). A total of 399 SNPs for the targeted loci had already been reported in this database for 20 rice cultivars. As expected, the total number of polymorphisms in this experiment was significantly higher than that reported in the rice database, and the confidence score was significantly higher owing to the very high read coverage. On average, the SNP rate was 4.31 SNPs/kb and the Indel rate was 0.97 Indels/kb. Previous data have reported an average rate of one SNP every 170 bp and one Indel every 540 bp (Goff et al., 2002; Yu et al., 2002).

Figure 1. The average coverage in starch related genes (x-axis: name of gene/enzyme; y-axis: average coverage).

The data indicate one SNP every 232 bp and one Indel every 1030 bp within this set of germplasm and for these candidate genes. Although the average rate of SNPs is gene specific and related to the species and structure of the studied population, these results are similar to previous reports (Nasu et al., 2002; Yu et al., 2002). Out of 501 identified SNPs, 75 (~14.9%) caused an amino acid change, making them potentially functional. All Indels resided in intronic regions and were thus not responsible for any stop codons, frameshift mutations or amino acid changes. The Indel rate was slightly higher than previously reported (Goff et al., 2002; Yu et al., 2002), which may have been due to the lower stringency mapping criteria applied to the short reads in CLC Workbench (Min Cost 2; Min Insert 2; Similarity 0.7). The largest and smallest Indels were 8 bp and 1 bp, respectively.

Table 1. Summary statistics of the mapping report generated by Illumina Genome Analyser sequencing
Columns: Statistics, Number of reads (count), Average length of reads (bp), Total bases
Reads 60,985, ,927,761,087
Mapped to reference 54,813, ,549,512,020
Unmapped 6,172, ,249,067
Reference sequence* 17 (count) 6, ,067
Paired reads 42,720,
*Reference sequence was taken from the NCBI database.

SNP variation across the starch related candidate loci

To evaluate the capacity of MPS to detect new variants in pools of starch synthesis genes, a comprehensive experiment was conducted on the Illumina GA platform using 17 different rice starch-related genes. Table 2 summarises the newly discovered variation in the studied genes. Seven classes of starch-related enzymes that affect starch structure and quality, namely ADP-glucose pyrophosphorylase (AGPase), granule

bound starch synthase (GBSS), starch synthase (SS), branching enzyme (BE), debranching enzyme (DBE), starch phosphorylase (PHO) and glucose phosphate translocator (GPT), were pool sequenced. The details of each gene member and its detected polymorphisms are as follows.

ADP-glucose pyrophosphorylase (AGPase), Starch phosphorylase (PHO) and Glucose phosphate translocator (GPT) gene families

These enzymes/genes reside at the top of the starch biosynthesis pathway and are classified as the starting point of grain starch production. Glucose is first activated by the addition of ADP by AGPase and then becomes the substrate for the starch synthase enzymes. There are several genes/isozymes in this class, but AGPS2b has the highest expression level in rice endosperm (Hirose et al., 2006).

AGPS2b (small subunit)

The role of this subunit in starch granule synthesis has been identified through its association with rice shrunken mutants (Kawagoe et al., 2005). A dramatic inhibition of starch synthesis has been observed in AGPase-deficient mutants of rice and some other species, resulting in increased soluble sugars, a large number of underdeveloped granules, small grains and pleomorphic amyloplasts (Rolletschek et al., 2002). In total, 30 SNPs and 4 Indels were found across the population for this gene. None of them caused an amino acid change, suggesting this gene has little impact on starch quality in this population. However, the number of SNPs found was significantly higher than the number previously reported in rice databases (Table 2).

SPHOL (alpha 1,4 glucan starch phosphorylase)

This gene is generally considered to be involved in starch degradation, but recent studies suggest it has some important roles in starch biosynthesis. Although its precise mechanism and influence are still not well known, the mechanism appears to be associated with phosphorylation of some starch-related enzymes and proteins such as starch branching enzymes (SBEs) and starch synthase (SSIIa) (Tetlow et al., 2004). In total, five SNPs and seven Indels, all non-functional, were found in this gene. The SNP rate was lower than that reported in databases (Tables 2 and 3).

GPT1 (Glucose-6-phosphate translocator)

GPT1 is strongly expressed in endosperm. This gene is believed to be responsible for the import of essential carbon substrates such as Glc6P into the plastids during grain development (Fischer and Weber, 2002; Jiang et al., 2003). There were 16 SNPs, one of which causes an amino acid change, and 8 Indels. A C/T SNP at reference position 1188 changes an amino acid from Leu to Phe (Leu42Phe). This is a conservative non-polar amino acid substitution (L→F) and therefore might not significantly alter protein activity. However, this is a new functional SNP which has not previously been reported in databases.

54 o Table 2. Total polymorphism detected across the 17 starch quality related genes. NC-Number Genbank No# Gene ID in NCBI Gene ID in database Gene Chr No AGPS2b, [ADPglucose pyrophosphorylase (small Unt) SPHOL (alpha 1,4glucan phosphorylase) GPT1 (Glucose-6phosphate/ phosphatetranslocator GBSSI 8 GBSSII (expressed in leaf) SSI SSIIa Average coverage Length Assembeled by Illumina (bp) Number of variants Detected by GA Number of SNPs in OryzaSNP@MSU database SNPs In/dels* High quality 19,203 5, Perlegen+ Machine learning ,561 7, N/A 8 21,042 4, ,486 3, ,260 8, ,707 12,708 7,750 4, N/A ,494 5, N/A 8 19,203 11, Number of Functional (amino acid changes) 1 NC_ Os08g LOC_Os08g NC_ Os03g LOC_Os03g NC_ Os08g LOC_Os08g NC_ Os06g LOC_Os06g NC_ Os07g LOC_Os07g NC_ NC_ Os06g Os06g LOC_Os06g06560 LOC_Os06g NC_ Os02g LOC_Os02g NC_ Os08g LOC_Os08g09230 SSIIb (expressed in leaf) SSIIIa 10 NC_ Os04g LOC_Os04g53310 SSIIIb 4 38,300 8, NC_ Os01g LOC_Os01g52250 SSIVa 1 16,497 10, NC_ Os06g LOC_Os06g51084 BEI 6 30,255 7, NC_ Os04g LOC_Os04g33460 BEIIa 4 15,958 2,265 6 N/A N/A N/A 1 14 NC_ Os02g LOC_Os02g ,491 10, NC_ Os08g LOC_Os08g40930 BEIIb (Amylose extender) Isoamylase1 (DBE) 8 37,526 6, N/A 16 AC LOC_Os05g32710 Isoamylase2 (DBE) 5 20,373 2, N/A NC_ OSJNBa0014C03.3 Os04g LOC_Os04g08270 Pullulanase (DBE) 4 30,067 10, ,403 bp * Total N/A * The lower stringency used for in/dels as follows: Min Cost 2; Min Inser: 2; Similarity 0.7. SNPs in OryzaSNP@MSU database detected and analysed using the Perlegen model-based method as well as a machine learning method. Totally, over 158,000 high quality SNPs have been identified in the rice genome by these two technologies. 36

Granule bound starch synthase (GBSS) gene family

This family of genes is responsible for production of the amylose component of starch in plants.

GBSSI (Granule bound starch synthase I)

GBSSI, or the waxy gene, is one of the most important genes involved in starch synthesis and influences cereal grain quality, particularly in rice. The major role of GBSSI in determining amylose content is well known and several SNPs associated with starch quality have been characterised in rice (Chen et al., 2008b). Previous studies have shown that three SNPs, one each at the intron/exon 1 boundary, exon 6 and exon 10, have the most significant impact on starch quality (Cai et al., 1998). Only one functional A/C SNP was detected, at position 1086 of the reference sequence; it corresponds to the previously reported exon 6 SNP and causes a Tyr→Ser substitution at amino acid position 224. This substitution is non-conservative and changes the polarity of the amino acid, the function of GBSSI, enzyme activity and amylose content. Larkin and Park (2003) have suggested this SNP affects amylose content. One non-functional Indel was also found in this gene. The sole SNP detected in GBSSI in this population, compared with the eight SNPs retrieved from the OryzaSNP database, indicates there has been significant selection pressure imposed on this locus during the course of breeding. A multiplex SNP verification experiment was conducted to validate the data (Masouleh et al., 2009). The results showed that only this SNP, at very low frequency, exists in this population. The breeding selection criteria applied to this population have somehow restricted the polymorphism of the GBSSI gene. The Ka/Ks data also suggest GBSSI is a gene under purifying selection in this population. The highest Ka/Ks ratio of 2.00 was calculated for this gene (Table 3).
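The Ka/Ks ratio quoted above is described in this chapter as the number of non-synonymous SNPs relative to the number of synonymous SNPs, with values below, at and above 1 read as purifying, neutral and positive (diversifying) selection. A minimal sketch of that count-based reading follows. The function names are hypothetical and the counts in the example are invented for illustration; standard Ka/Ks estimators additionally normalise by the numbers of non-synonymous and synonymous sites, which this sketch does not attempt.

```python
from typing import Optional


def ka_ks_ratio(nonsyn_snps: int, syn_snps: int) -> Optional[float]:
    """Count-based Ka/Ks: non-synonymous SNPs divided by synonymous SNPs.

    Returns None when there are no synonymous SNPs (ratio undefined).
    """
    if nonsyn_snps < 0 or syn_snps < 0:
        raise ValueError("SNP counts must be non-negative")
    if syn_snps == 0:
        return None
    return nonsyn_snps / syn_snps


def classify_selection(ratio: Optional[float], tol: float = 1e-9) -> str:
    """Map a Ka/Ks ratio onto the three categories used in the text."""
    if ratio is None:
        return "undefined"
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


# Hypothetical counts, for illustration only (not taken from Appendix 1).
for ka, ks in [(2, 1), (3, 3), (1, 4), (2, 0)]:
    r = ka_ks_ratio(ka, ks)
    print(ka, ks, r, classify_selection(r))
```

Returning `None` for loci without synonymous SNPs keeps such loci visibly separate from loci with a genuine ratio of zero, which matters for genes with very few coding SNPs.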

GBSSII (Granule bound starch synthase II)

This gene/enzyme is predominantly expressed at a low level in leaf, leaf sheath, culm and pericarp tissue, particularly during pre-heading and 1-3 days after flowering (Ohdan et al., 2005). The impact of GBSSII on the elongation of amylose in non-storage tissues of cereals has been confirmed (Vrinten and Nakamura, 2000). GBSSII is found exclusively bound to starch granules in green tissues and synthesises amylose which is subsequently consumed by the plant or accumulated in the endosperm (Dian et al., 2003). There were 4 SNPs and 8 Indels identified. One of the SNPs occurred at coordinate 1638 of the reference sequence and altered a Leu to Ser at position 523 of GBSSII. This A/G SNP changes the polarity of the amino acid and hence may affect the activity and function of the protein. All Indels were detected in introns.

Starch synthase (SS) family

This gene/enzyme family is primarily involved in the production of the amylopectin component of starch in plants.

SSI

This protein is presumed to be expressed in the endosperm and leaf of rice (Fujita et al., 2006). The transcript level of SSI is higher in endosperm than in leaf sheaths and blades, and SSI has therefore been classified as both an endosperm and a non-endosperm expressed gene (Hirose et al., 2006). Measurement of SSI transcript levels at different seed developmental stages found high expression at 1-3 DAF, peaking at 5 DAF and remaining almost constant during starch synthesis in the endosperm, suggesting SSI is the major SS form in cereals (Cao et al., 1999).

Table 3. The polymorphism analysis of starch-related candidate genes in rice
Columns: No | Gene/Enzyme symbol | Gene/Enzyme name | Gene coordinates in GenBank | Length assembled by Illumina (bp) | Total polymorphism rate | Non-synonymous SNP rate | SNP per kb | Indel per kb | Ka/Ks ratio
1 | AGPS2b | ADP-glucose pyrophosphorylase (small unit) | ,754,206 | 5,635
2 | SPHOL | alpha 1,4-glucan phosphorylase | ,183,093-32,190,581 | 7,
3 | GPT1 | Glucose-6-phosphate/phosphate translocator | 5,138,640-5,142,712 | 4,
4 | GBSSI | Granule Bound Starch Synthase I (Waxy gene) | 1,764,623-1,769,657 | 3,
5 | GBSSII (expressed in leaves) | Granule Bound Starch Synthase II | 13,584,483-13,576,435 | 8,
6 | SSI | Starch Synthase I | 3,078,060-3,085,809 | 7,
7 | SSIIa | Starch Synthase IIa | 6,747,562-6,751,981 | 4,
8 | SSIIb (expressed in leaves) | Starch Synthase IIb | 32,125,071-32,119,749 | 5,
9 | SSIIIa | Starch Synthase IIIa | 5,351,108-5,362,370 | 11,
10 | SSIIIb | Starch Synthase IIIb | 32,149,493-32,158,120 | 8,
11 | SSIVa | Starch Synthase IVa | 31,786,842-31,797,321 | 10,
12 | BEI | Branching Enzyme I | ,782,688 | 7,
13 | BEIIa | Branching Enzyme IIa | 20,260,837-20,265,349 | 2,
14 | BEIIb (Amylose extender) | Branching Enzyme IIb | 20,213,965-20,224,864 | 10,
15 | ISA1 (DBE) | Debranching Enzyme-Isoamylase 1 | 25,981,756-25,988,347 | 6,
16 | ISA2 (DBE) | Debranching Enzyme-Isoamylase 2 | 23,596-25,998 | 2,
17 | PUL (DBE) | Debranching Enzyme-Pullulanase | 4,399,980-4,410,318 | 10,

Ka/Ks ratio: the proportion of non-synonymous (Ka) relative to synonymous (Ks) SNPs can reveal whether a gene has been under purifying, neutral or diversifying selection. The data for the calculation of Ka/Ks (numbers of Ka and Ks) can be found in columns R and S of Appendix 1. Column R shows the SNPs that occur in the coding region (marked as CDS or mRNA) and column S shows the number of nsSNPs. The total polymorphism rate was calculated as $\frac{TSI}{TL} \times 100$, where TSI is the total number of SNPs and Indels and TL is the total length of each candidate gene.
The functional or non-synonymous SNP rate was calculated as $\frac{NS}{TL} \times 1000$, where NS is the number of non-synonymous SNPs in each locus and TL is the total length of each candidate gene.
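The rate definitions in the Table 3 footnotes, and the overall per-kb rates and spacings quoted in the Results, can be reproduced with a few lines of arithmetic. The sketch below uses only totals stated in the text (501 SNPs, 113 Indels, 116,403 bp); the function names are hypothetical, and small differences from the quoted figures may reflect rounding in the original.

```python
def rate_per_kb(count: int, length_bp: int) -> float:
    """Events per kilobase of sequence."""
    return 1000.0 * count / length_bp


def mean_spacing_bp(count: int, length_bp: int) -> float:
    """Average number of base pairs per event."""
    return length_bp / count


def total_polymorphism_rate(snps: int, indels: int, length_bp: int) -> float:
    """(TSI / TL) x 100, as defined in the footnote."""
    return 100.0 * (snps + indels) / length_bp


def nonsynonymous_snp_rate(ns: int, length_bp: int) -> float:
    """(NS / TL) x 1000, as defined in the footnote."""
    return 1000.0 * ns / length_bp


# Totals reported for the 17 loci combined.
SNPS, INDELS, NONSYN, LENGTH = 501, 113, 75, 116_403

print(f"SNPs per kb:   {rate_per_kb(SNPS, LENGTH):.2f}")
print(f"Indels per kb: {rate_per_kb(INDELS, LENGTH):.2f}")
print(f"bp per SNP:    {mean_spacing_bp(SNPS, LENGTH):.0f}")
print(f"bp per Indel:  {mean_spacing_bp(INDELS, LENGTH):.0f}")
print(f"Total polymorphism rate: {total_polymorphism_rate(SNPS, INDELS, LENGTH):.3f}")
print(f"Non-synonymous SNP rate: {nonsynonymous_snp_rate(NONSYN, LENGTH):.3f}")
```

Applying the same functions locus by locus requires the per-gene counts in Appendix 1 and the assembled lengths in Table 2.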

A comprehensive analysis of mutant rice with a retrotransposon inserted into the SSI encoding gene revealed that SSI has a capacity for the synthesis of chains of DP 8-12 by the extension of smaller chains (Nakamura, 2002). This gene has a very small phenotypic effect on rice eating quality, although a significant negative correlation between the ratio of short chains (DP 6-12) and gelatinisation temperature has been reported (Umemoto et al., 2008). There were 73 SNPs and 8 Indels detected in this gene. No functional SNPs or Indels were found, in comparison with two amino acid changes that have been reported in the OryzaSNP database.

SSIIa

SSIIa is known to have a major effect on starch quality. This gene is expressed in the endosperm at very high levels and presumably affects amylopectin structure (Craig et al., 1998; Morell et al., 2003). The effect of this gene on cooking quality and starch texture has clearly been shown (Umemoto et al., 2008; Umemoto et al., 2004). The gelatinisation temperature (GT), alkali disintegration and eating quality of rice starch have been explained by polymorphism at two SNPs, [A/G] and [GC/TT], within exon 8 of the alk locus (Umemoto and Aoki, 2005; Waters et al., 2006). In total, 31 SNPs and 1 Indel were detected in this gene, which was significantly higher than the number reported in the OryzaSNP database (12 SNPs). Surprisingly, 22 of the 31 SNPs were functional and introduced an amino acid change as determined by CLC Workbench. SNP distribution analysis revealed that 80% of these low frequency SNPs (25) were located at the beginning of the reference sequence, between coordinates 13 and 553, bringing about 17 amino acid changes. This high SNP rate may be associated with inefficient PCR and consequent low coverage ( ) of this GC-rich region (Appendix 1). Re-sequencing verified only four SNPs between coordinates , with a minimum frequency of 8-10% for the minor allele. Taking the high false positive rate

into account, a total of nine SNPs and one Indel (six amino acid changes) were identified in this gene (Appendix 1). Three single nucleotide 3-allelic SNPs [G/T/A] and a G/T SNP, at positions 72, 77, 81 and 87 of the reference sequence respectively, are new polymorphisms. Of the three single nucleotide 3-allelic SNPs, one G/T/A SNP is presumed to cause the most critical amino acid substitution, Arg26Met/Lys, which induces a polar to non-polar alteration in the protein.

SSIIb

SSIIb is believed to be an early expressed gene with a low expression level, expressed primarily in sink and source leaf blades and sheaths (leaf specific) at an early stage of grain filling (Hirose and Terao, 2004). However, a recent study presented evidence that it contributes, with six other starch genes, to altering some RVA (Rapid Viscosity Analyser) parameters in glutinous rice (Yan et al., 2010). Only three SNPs and two Indels were found, all of which were non-functional, indicating this gene does not affect phenotypic variation in this population.

SSIIIa

The highest rate of polymorphism in terms of amino acid changes was observed in this gene. In total, 83 SNPs and 15 Indels, including 23 non-synonymous and nine synonymous substitutions, were detected, indicating this is the most diverse gene in our population. Previous findings have detected 52 SNPs. Of the 23 non-synonymous substitutions, 10 are amino acid changes which alter polarity and may produce significant changes in protein structure (Appendix 1). The SSIIIa encoding gene is highly expressed in the endosperm, although some reports have shown expression in green tissues (Dian et al., 2005). A recent study of amylopectin chain length in an SSIIIa-deficient mutant suggests SSIIIa plays an important role in the elongation of amylopectin B2 to B4 chains. Furthermore, in these

mutants, the amylose content and the extra long chains of amylopectin increased by 1.3- and 12-fold, respectively, due to an increase in GBSSI activity (Fujita et al., 2007). Conversely, no functional effect of SSIIIa differentiation was observed on RVA parameters, at least between glutinous cultivars (Yan et al., 2010).

SSIIIb

SSIIIb is mainly expressed in endosperm, but transient expression in leaf sheaths and leaves has also been reported (Hirose et al., 2006). This might be due to the existence of two divergent groups of SSIIIb in rice that are expressed in different tissues (Dian et al., 2005). It has also been classified into two different categories on the basis of the timing of expression in the developing seed. In the late expression category the gene is expressed in the mid to later stages of grain filling (Hirose and Terao, 2004), and in the early expression category the transcript level usually increases to a peak at 3-5 days after flowering (Ohdan et al., 2005). An association study of glutinous rice near-isogenic lines suggested SSIIIb has a significant impact on RVA parameters such as peak time and pasting temperature (Yan et al., 2010). In total, 26 SNPs and 11 Indels were found in this gene. No functional Indels were detected, and of the seven amino acid changes, three changed the polarity of the amino acid: Thr1176Ala, Glu634Gly and Ser756Ile.

SSIVa

SSIVa is one of the least well known starch genes in plants. Like most starch synthase genes, SSIVa is exclusively involved in amylopectin biosynthesis. Expression analysis with reverse transcription PCR has indicated SSIVa is preferentially expressed in rice endosperm, and to a degree in leaf blades, as a late or steady expresser gene during grain filling (Hirose and Terao, 2004). QTL mapping and expression profile analysis have shown that high

temperature during grain filling can considerably increase the transcription level of SSIVa, up to 1.11-fold, which is considerably higher than for the other starch synthase genes, with a general expression level range of 0.8 to 1.2 (Yamakawa et al., 2007), and may contribute to grain chalkiness (Yamakawa et al., 2008). SSIVa may also affect some secondary RVA parameters such as breakdown and setback (Yan et al., 2010). Of the 27 SNPs identified, five were non-synonymous; six intronic Indels were also detected. Only one SNP modified amino acid polarity: a C/T SNP at coordinate 4048 of the gene nucleotide sequence induced a Gly708Asp substitution.

Starch Branching enzymes (SBEs)

Starch branching enzymes (SBEs) determine the structure of amylopectin by breaking α(1→4)-linkages in existing chains and attaching the released reducing ends to C6 hydroxyls, forming the elongated and branched glucan, amylopectin (Tetlow et al., 2004). The nucleotide polymorphisms of different isoforms of branching enzymes were studied and the results are as follows.

BEI

BEI is mainly expressed in the endosperm. Biochemical observations with purified BEI from maize endosperm indicate that BEI preferentially branches amylose-type polyglucans and has a high capacity for branching less branched α-glucans (Takeda et al., 1993). Analysis of the catalytic properties of BEI has indicated that the N- and C-termini play a critical role in chain length transfer and substrate preference (Kuriki et al., 1997). BEI transcript levels increase rapidly 3-5 days after flowering. A rice BEI-deficient mutant induced by mutagenesis exhibited modified amylopectin structure and grain morphology but the same quantity of starch as the wild type (Satoh et al., 2003), and the BEI encoding gene also affects

the RVA profile (Yan et al., 2010). The maize sugary gene arises from a mutation in the maize BEI encoding ortholog (Boyer and Preiss, 1978). In total, 18 SNPs and 6 Indels were found in this gene. One of the SNPs is a non-synonymous C/T SNP causing a Gly607Asp substitution, which is potentially very important as it changes the polarity of the amino acid.

BEIIa

BEIIa is a leaf expressed gene involved in amylopectin synthesis. BEIIa is also expressed in the endosperm, but at levels 10-fold lower than in leaf tissue (Gao et al., 1997). Variation in this gene/enzyme may have a significant influence on rice starch properties, considering that BEIIa is preferentially expressed along with at least one important starch synthesis gene expressed in both leaf and endosperm (Hirose et al., 2006). An association study of this gene and RVA properties demonstrated a low F value (6.60), with a very slight influence in glutinous rice (Yan et al., 2010). Application of BEIIa-specific antibodies has demonstrated this protein is present in both soluble and granule bound forms in developing wheat endosperm (Rahman et al., 2001). In total, six SNPs were detected, including a non-synonymous T/G SNP which causes a Tyr140Ser substitution with no polarity alteration. No SNP or Indel has previously been reported for this gene, suggesting BEIIa might be one of the most conserved starch-related genes in rice.

BEIIb

A relatively high variation rate of was detected for this important gene (Table 3), which is also known as amylose extender (ae) in maize and other cereals (Yun and Matheson, 1993). Many studies have reported the significance of this gene for starch properties in various plant species (Fisher et al., 1993; Sun et al., 1997; Sun et al., 1998). This is a granule- and soluble-associated enzyme which is only expressed in the endosperm. Expression of

three different functional maize SBE genes in BE-deficient yeast strains demonstrated that the presence of BEIIb is necessary to activate BEI and BEIIa (Seo et al., 2002). A recent association study has determined a very high F value of between SSIIb and RVA properties in rice (Yan et al., 2010). Additionally, a 0.5- to 0.7-fold decrease in the expression of BEIIb (amylose extender) during grain filling creates chalky rice (Tanaka et al., 2004). There were 53 SNPs, three of which were non-synonymous, and 17 Indels in amylose extender. No functional polymorphisms are recorded in the available databases, but three non-synonymous SNPs, C/T (Val403Ile), C/T (His196Arg) and C/A (Leu97Val), were detected here, none of which changed amino acid polarity.

Debranching Enzymes (DBEs)

DBEs belong to the α-amylase family, of which two classes exist in plants, isoamylase and pullulanase. These enzymes debranch (hydrolyse) α-(1-6)-linkages in amylopectin and pullulan. Defective DBEs in plants are thought to be responsible for the accumulation of phytoglycogen rather than starch, which in turn changes the phenotypic appearance of the endosperm (Bustos et al., 2004).

ISA1

In wheat, the expression of ISA1 cDNA was highest in the developing endosperm and undetectable in mature grains. This suggests a fundamental biosynthetic role for Isoamylase 1 in plant starch, although the precise roles of DBEs are not yet known (Tetlow et al., 2004). Regulation of the ISA1 gene at the transcriptional level during grain filling of rice in response to high temperatures has been reported (Yamakawa et al., 2008). In rice endosperm, antisense inhibition of Isoamylase 1 altered the structure of amylopectin and the physicochemical properties of starch (Fujita et al., 2003). The ISA genes are also presumed to make some sort

of contribution to the degree of setback in glutinous rice cultivars (Yan et al., 2010). No functional polymorphism was found in this gene in the studied population. Only 9 Indels in the intronic regions were detected. This suggests that this gene has no or minimal effect on variation in starch properties in this population.

ISA2

The existence of this type of isoamylase was first reported in maize endosperm (Doehlert and Knutson, 1991). It was suggested that two isoamylase isoforms, I and II, exist in maize endosperm and are distinguishable by anion-exchange chromatography. On the basis of enzymic characteristics, the sugary1 (su1) protein corresponds to the isoamylase II form in maize endosperm (Beatty et al., 1999). The association between ISA2 and rice grain quality is unknown. There is no intron in this relatively small gene (2625 bp), thus each detected SNP or Indel is potentially important. The polymorphism rate was significantly high, about . There were 16 SNPs, including nine non-synonymous SNPs, and no Indels in the ISA2 gene. Three of the non-synonymous SNPs altered the polarity of amino acids as follows: T/C, C/A and T/G at coordinates 960, 1712 and 2067 of the reference sequence, which cause Thr482Ala, Arg231Leu and Thr113Pro substitutions, respectively.

Pullulanase (PUL)

In rice endosperm a defect in pullulanase-type DBE activity triggers and modulates some phenotypic effects (Nakamura et al., 1998). In maize endosperm, it is believed that pullulanase has a dual role, contributing to both starch synthesis and degradation (Dinges et al., 2003). Kubo et al. suggest pullulanase plays a predominant and essential role in amylopectin synthesis and compensates for shortages of isoamylase activity in the construction of the multiple cluster structure of amylopectin. A recent association study between pullulanase

and RVA profile parameters in glutinous rice has shown strong relationships between this gene and peak viscosity, hot paste viscosity, breakdown viscosity and peak time (Yan et al., 2010). The highest polymorphism rate (1.14) was seen in pullulanase, where, in total, 109 SNPs and 10 Indels were detected. This number of SNPs exactly equals the number already reported in the OryzaSNP database. In our population, only one non-synonymous SNP was detected, at coordinate 2319 of the reference, which substitutes Ser with Asn at position 217 of the protein. This alteration might not be very influential as it does not change the polarity of the molecule.

Distribution of SNPs across the loci

The distributions of detected polymorphisms and the coverage patterns of short reads across the length of the candidate genes indicated no specific correlation among the 17 studied loci (Fig. 2). Some genes, such as SPHOL and GBSSI, exhibited similar distribution patterns. However, there were no associations among the patterns of different genes. Based on the distribution patterns, it can be concluded that most of the candidate genes showed a higher polymorphism rate in the central intron/exon regions than at the UTR ends.

Ka/Ks ratio ("purifying" vs "diversifying" genes)

The proportion of non-synonymous (Ka) relative to synonymous (Ks) SNPs can reveal whether a gene has been under purifying, neutral or diversifying selection. The Ka/Ks ratio was used to classify candidate genes into two main categories, purifying and diversifying genes. Under neutral conditions of evolution, at the amino acid level, Ka should equal Ks and hence the ratio Ka/Ks = 1. Any deviation from this value indicates selection pressure on the genetic structure of the population or candidate genes. A Ka/Ks ratio < 1 indicates negative (purifying) selection, while positive (adaptive) selection is indicated when

Ka/Ks > 1. This indicator was applied to assess the diversity of samples in the database (20 diverse cultivars) and in the Australian population of 233 genotypes. GBSSI and GBSSII were classified as highly conserved genes which are being passed through the adaptive phase by rice breeders' artificial selection pressure, as they showed a low polymorphism rate and a high Ka/Ks ratio. Some genes, such as AGPS2b, SSIIa and SPHOL, had a Ka/Ks ratio of one, which means they probably have not been under significant selection pressure.

Discussion

This MPS analysis of rice starch metabolism candidate genes identified relatively high SNP/Indel variation at all loci. In total, 501 SNPs and 113 Indels were detected, in comparison with 399 SNPs that are already available in the public domain. No Indels are recorded in public databases such as OryzaSNP. Out of 501 SNPs, 75 (~14.9%) were non-synonymous, leading to amino acid changes. All Indels resided in intronic regions, so no obvious functional Indels were found. The highest and lowest polymorphism rates were observed in Pullulanase (11.443) and GBSSI (0.574), respectively (Table 3). The low polymorphism rate in the important GBSSI gene is one of the surprising results of this study. A possible cause is the source of this population, which was an Australian rice breeding population. Indica cultivars do not grow in temperate Australia because the day length is not suitable, and so indica cultivars are rarely used as parents. Therefore, one of the waxy gene (GBSSI) SNPs had a very low frequency (<0.05), which was confirmed by Sequenom resequencing (Chapter 6). The low frequency of SNPs in some genes can also be attributed to the negative (purifying) selection pressure imposed by breeders within this population. Massively parallel sequencing combined with LR-PCR ensured high sequence depth in terms of the number of candidate genes and number of samples at all studied loci (Pettersson et al.,

2009). Among the numerous elements involved in MPS, amplification efficiency and pooling strategy are the most important parameters. The error rate of Bio-Rad iProof™ High-Fidelity DNA polymerase is low, , which is approximately 50-fold lower than that of normal polymerases, and the extension rate is high, 5-30 s/kb, which is four times faster and shortens the PCR, making it the enzyme of choice. However, establishing an efficient long range PCR for large genomic regions can be costly and time consuming (Ingman and Gyllensten, 2008). To solve this problem, semi-long range PCR (SLR-PCR), which is generally more robust and saves time and primer cost, can be used. Preparing an optimal SLR-PCR will increase the performance of MPS and must be established before pooling genomic DNA samples. The error rate of the Illumina GA is reportedly about %. In this experiment 233 DNA samples were pooled. This means that a variant present in only one of the 233 samples has a frequency of ~0.43%, which is lower than the reported error rate. A two-step pooling strategy and high coverage depth reduced the risk of false SNP detection. These two strategies significantly overcome the effect of Illumina GA platform error. Figure 1 displays the coverage for all loci, starting from . The raw data show that the maximum coverage in some regions reaches 240,000 (data not shown). This high coverage has significantly neutralised the error rate. Out et al. (2009) have discussed the correlations between allele frequencies, pool size, coverage depth and error rates on the Illumina GA. They demonstrated that a coverage depth of would be enough for the detection of SNP frequencies at or above 0.3%. Currently, there is much interest in applying the Illumina GA platform to targeted sequencing of specific candidate genes, particularly for finding SNPs in a large number of individuals in targeted populations (Hodges et al., 2007).
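The pool-size arithmetic in the preceding paragraph can be made explicit. The sketch below computes the frequency of a variant carried by a single line in a pool and the number of reads expected to support it at a given depth. It assumes every line contributes equally to the pool and is homozygous at the site, which is an idealisation; the function names are hypothetical.

```python
def singleton_frequency_pct(pool_size: int) -> float:
    """Frequency (percent) of an allele carried by exactly one line in the pool.

    Assumes equal contribution from every line and homozygous lines.
    """
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return 100.0 / pool_size


def expected_variant_reads(coverage: float, freq_pct: float) -> float:
    """Expected number of reads carrying the variant at a given depth."""
    return coverage * freq_pct / 100.0


# Whole population (233 lines) versus one first-stage pool (about 20 lines).
for pool in (233, 20):
    print(pool, round(singleton_frequency_pct(pool), 2))

# Reads expected to support a singleton at the lowest average coverage
# reported in the Results (12,708 for SSIIa).
print(round(expected_variant_reads(12_708, singleton_frequency_pct(233)), 1))
```

Under these assumptions a singleton sits at roughly 0.43% in the full population but at about 5% within a first-stage pool of 20, which is one way to see why amplifying in small pools before combining products helps rare alleles survive PCR bias.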
An incorrect pooling strategy is another important issue that may be encountered in generating and analysing data. DNA samples

from different individuals may not be amplified with similar efficiency in PCRs, creating random bias. To rectify this issue, smaller pools (20) were tried, which minimised the chance of biased amplification of target regions. Using this strategy, rare SNPs occurring at a frequency lower than 1% were detected.

Coverage is also critical. It is believed that 20-fold coverage is sufficient for accurate SNP detection (Dohm et al., 2008). Even coverage is also highly desirable. Given this, with an average coverage of 90, reasonable SNP data should have been obtained from the beginning of the SSIIaH1 fragment, but only 18.5% of the SNPs observed in this region were validated. This can be attributed to the difficulties encountered in amplifying this high-GC region. The main reason for different coverage patterns is still unknown (Mardis, 2008b). The highest peak was observed at position 4885 of BEIIa, with a coverage of 239,019. However, coverage only affects the accuracy of the SNP frequency estimate and not the number of discovered SNPs (Morozova and Marra, 2008). Ingman and Gyllensten (2008) studied the effect of different pooling strategies and coverage levels on the SNP frequencies of pooled and un-pooled individuals in a ~17 kb region and found that all SNPs, including low frequency SNPs (not under 0.4%), can be detected at coverage levels above 500. They suggested that, for pooled PCR products, 50-fold coverage would be sufficient for SNP frequencies at or above 4%. The very high coverage obtained here enabled discovery of rare SNPs with frequencies lower than 0.5%. Sequencing errors are common in NGS and are easily confounded with low frequency SNPs if the minimum number of reads is too low (Futschik and Schlotterer, 2010). The high level of coverage for all candidate genes enabled us to recognise rare SNPs effectively.
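The relationship between coverage and the chance of observing a rare allele can be sketched with a simple binomial model. This is an illustrative simplification, not the method used in the cited studies: it assumes each read samples the pool independently and ignores sequencing error, and the minimum read count of 5 is a hypothetical threshold.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of seeing at least k variant
    reads at coverage n when the true allele frequency in the pool is p."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


# Hypothetical scenario: a 0.4% allele, requiring at least 5 supporting reads.
for coverage in (50, 500, 5000):
    print(coverage, round(prob_at_least(5, coverage, 0.004), 4))
```

The probability of collecting enough supporting reads rises steeply with depth, which is consistent with the observation that rare variants only become reliably detectable at coverage levels in the hundreds or above.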
It has been demonstrated that allelic variation of amino acids and structure of proteins correlate with the effect of natural selection seen as an excess of rare SNPs which affect

actual phenotypes (Sunyaev et al., 2000). The distribution of genetic variation in the 17 starch candidate genes indicates that they have been selected for starch properties. The pressure of natural selection can significantly influence the extant pattern of genetic variation (Akey et al., 2002; Barreiro et al., 2008). In this study, the total polymorphism rate and distribution pattern indicate that the candidate genes have been subjected to selection by breeders, as some important genes with a high impact on starch properties, such as GBSSI and ISA1, showed unusually low levels of polymorphism. Artificial selection during the breeding program has had a major influence on the genetic variation of the population studied. These changes in population structure mainly occur through narrowing of the gene pool and through changes in the balance between genetic drift and population size during the breeding process.

Appendices: Chapter 2
Appendix 1: Full list of discovered SNPs/Indels in the 17 studied starch-related genes.
Appendix 2: Full list of Australian breeding lines (population) and their pedigree information.
Appendix 3: Target genes and sequences of gene-specific LR-PCR primers.
Appendix 4: SNP/Indel distribution and short read co

CHAPTER 3

Bioinformatic tools assist screening of functional SNPs in plants: rice GBSSI as a model

Summary

Granule Bound Starch Synthase I (GBSSI) influences cereal grain quality and is one of the most important plant genes. Using GBSSI as a model, a number of computational tools and programs were used to explore functional SNPs and the possible relationships between genetic mutation and phenotypic variation. A total of 51 SNPs/indels were retrieved from databases, including three non-synonymous SNPs, in exons 6, 9 and 10. Sorting Intolerant from Tolerant (SIFT) results showed that a candidate [C/A] SNP (ID: OryzaSNP2) in exon 6 (coordinate 2494) is most likely the non-synonymous SNP with the highest phenotypic impact on GBSSI. This SNP changes a tyrosine to a serine at position 224 of GBSSI. Computational simulation of GBSSI with Geno3D suggested that this mutation creates a bigger loop on the surface of GBSSI and results in a shape different from that of native GBSSI. According to Transcriptional Factor (TF) Search analysis, a potential transcription factor binding site (TFBS8) carrying one [C/T] SNP [rs ] at coordinate 2777, at the intron 7/exon 8 boundary, might affect the regulation and function of GBSSI. Combining SNP mining data with in silico structural analysis of GBSSI provides a computational pathway that can be applied to other plant genes.

Introduction

Single nucleotide polymorphisms (SNPs) are the most common and simplest type of genetic variation in organisms. SNPs occur at a frequency of approximately one in a thousand base pairs in the human genome (Brookes, 1999) and one in every 170 bp in rice (Yu et al., 2002).

Although SNPs can be found throughout the genome, including gene promoter regions, coding sequences and intronic sequences, most of them are probably located in intergenic regions, and most of these are believed to be stable, without any deleterious effect. The occurrence of human disease and evolution (Shastry, 2002), as well as many important traits in plants (Bryan et al., 2000; Kennedy et al., 2006; Edwards et al., 2007), can be attributed to the presence of SNPs. SNPs can be categorized and named according to their location and function. For example, SNPs within the coding regions (cSNPs) of functional genes that introduce amino acid sequence variation are called non-synonymous SNPs (nsSNPs) and are of major interest. SNPs that occur in coding sequences but do not change amino acids are called synonymous SNPs. However, most SNPs occur in intronic regions. Study of these SNPs is also important because they can influence gene expression through different molecular pathways, such as changes to regulatory elements, splicing patterns, and up- and down-regulation of exonic splice enhancers (ESE), intronic splice enhancers (ISE) and so forth (El Sharawy et al., 2006).

Understanding the functional effect of SNPs is a major challenge. SNPs that lead to a single amino acid substitution, stop codon or frame shift mutation are normally recognized as functional and are easily detected. An experimental approach can provide the strongest evidence for the functional role of genetic variation. Consequently, many different types of SNP assay have been applied for experimental prioritization of SNPs (Chen and Sullivan, 2003). However, owing to the lack of reliable genotypic and phenotypic data, such experiments are not always easy to set up for characterizing the real effect of SNPs.
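The distinction between synonymous and non-synonymous coding SNPs can be illustrated with a short sketch using the standard genetic code. The codons below are hypothetical examples chosen only because they encode the relevant amino acids; the actual GBSSI codons are not given in this chapter, and `classify_snp` is an illustrative helper, not a published tool.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(codon: str, pos: int, alt: str) -> tuple[str, str, str]:
    """Return (reference aa, alternative aa, class) for a substitution at
    0-based position `pos` of `codon`."""
    mutant = codon[:pos] + alt + codon[pos + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutant]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# Hypothetical tyrosine codon TAC: an A->C change at the second base gives TCC (serine).
print(classify_snp("TAC", 1, "C"))
# A change at the third base (TAC -> TAT) leaves tyrosine unchanged.
print(classify_snp("TAC", 2, "T"))
```

A lookup of this kind only separates synonymous from non-synonymous changes; it says nothing about whether a substitution is functionally important, which is why tools such as SIFT are applied afterwards.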
For example, functional analysis of SNPs in important plant genes needs a segregating population or breeding lines, such as near isogenic lines (NILs) (Umemoto et al., 2008; Mikami et al.,

2008). On the other hand, many genes may have a vast number of intronic SNPs that cannot easily be associated with in vivo variation in plant populations. Previous studies have focused on non-synonymous SNPs of human disease genes (George Priya Doss et al., 2008; Rajasekaran et al., 2008). Here, a plant-relevant computational pipeline was developed which covers most of the important functional elements at the DNA level in addition to non-synonymous SNPs. The model gene chosen to identify and prioritize substitutions was Granule Bound Starch Synthase I (GBSSI), a major and well characterised gene affecting the amylose content of rice grain (Chen et al., 2004). Different computational tools, including Sorting Intolerant from Tolerant (SIFT), Exonic Splicing Enhancer Finder (ESE Finder), Transcriptional Factor search (TF Search) and Exonic Splicing Silencer Search (FAS-ESS), were used to prioritize the candidate SNPs most likely to affect the encoded protein and, subsequently, amylose content and rice grain quality.

Materials and methods

GBSSI encoding gene as a case study

Granule Bound Starch Synthase I (GBSSI) was analysed as an example to provide a guide for further confirmatory experimental studies of this or other genes. This gene is well known in most cereals for its effect on amylose content (Chen et al., 2008a), pasting properties (Chen et al., 2008b) and eating quality (Umemoto et al., 2008).

Sequence alignment

GBSSI coding DNA and mRNA sequences for Oryza sativa L. were downloaded from GenBank (GenBank locus number for genomic DNA is X and NM_ for mRNA).

Nucleotide coordinates of on chromosome 6 (LOC_Os06g04200) were extracted from the Rice Genome Annotation Project at Michigan State University. Total sequence lengths of 5035 bp, 1830 bp and 609 amino acids were recognized for the genomic DNA, cDNA and GBSSI protein, respectively.

SNP dataset

The SNP dataset for the GBSSI gene (Figure 1) was retrieved from the NCBI database (Sherry et al., 2001) for the relevant chromosome range (gene coordinates) and then checked against the SNP dataset of the OryzaSNP Consortium using the following TIGR gene ID: LOC_Os06g. An extra 2 kbp of DNA from each end of the coding region was also searched for possible SNPs in the 3' and 5' UTRs. Final alignment was carried out with Sequencher 4.6 software (Ann Arbor, MI) and ClustalW2 to identify the exact location of SNPs in UTR or intronic/exonic regions. Five functional classes of SNPs were selected to cover the entire gene region, as follows: (1) non-synonymous coding, (2) intronic, (3) coding synonymous, (4) locus region and (5) 5' and 3' UTRs.

Computational tools for SNP analysis

Several computational programs were applied to predict the actual or possible impact of SNPs on plant phenotypes, as follows: (1) UTRScan, (2) TF Search, (3) SIFT: Sorting Intolerant from Tolerant, (4) GeneSplicer, (5) SEE-ESE, (6) FAS-ESS, (7) Geno3D, (8) PDB viewer and (9) RasMol.

3D Modelling of GBSSI and comparative study

The native and mutant structures of GBSSI were modelled with Geno3D software. This program predicts the 3D structures of proteins and enzymes from their amino acid sequences. It extracts the 3D structures of very similar proteins from different databases (specifically, PDB) and then models the query sequence on the available structure, which, for GBSSI, has the PDB identification number 3D1J. The modelled structure can be validated by PROCHECK (Laskowski et al., 1993).

Table 1. List of SNPs extracted from dbSNP and the OryzaSNP consortium, sorted by chromosome base position from the 5' UTR region.

No.  SNP ID  SNP/indel  Symbol*  Position  Functional class
1    rs      [A/C]      M        5' UTR    UTR
2    rs      [C/T]      Y        Intron 7  Intronic
3    rs      [G/T]      K        Exon 8    Coding-synonymous
4    rs      [A/T]      W        Intron 8  Intronic
5    rs      [A/G]      R        Intron 8  Intronic
6    rs      [A/T]      W        Intron 8  Intronic
7    rs      [A/C]      M        Intron 8  Intronic
8    rs      [C/T]      Y        Intron 8  Intronic
9    rs      [-/ATAT]   in/del   Intron 8  Intronic
10   rs      [-/T]      in/del   Intron 8  Intronic
11   rs      [-/C]      in/del   Intron 8  Intronic
12   rs      [A/G]      R        Intron 8  Intronic
13   rs      [C/T]      Y        Intron 8  Intronic
14   rs      [C/T]      Y        Intron 8  Intronic
15   rs      [C/T]      Y        Intron 8  Intronic
16   rs      [A/G]      R        Intron 8  Intronic
17   rs      [-/A]      in/del   Intron 8  Intronic
18   rs      [A/T]      W        Intron 8  Intronic
19   rs      [C/T]      Y        Intron 8  Intronic
20   rs      [A/G]      R        Exon 9    Coding-synonymous
21   rs      [C/T]      Y        Exon 9    Coding-synonymous
22   rs      [A/T]      W        Exon 9    Coding-synonymous
23   rs      [A/G]      R        Exon 9    Coding-synonymous
24   rs      [C/T]      Y        Exon 9    Coding-ns
25   rs      [A/C]      M        Intron 9  Intronic
26   rs      [C/T]      Y        Intron 9  Intronic
27   rs      [-/TT]     in/del   Intron 9  Intronic
28   rs      [-/TTC]    in/del   Intron 9  Intronic
29   rs      [-/G]      in/del   Intron 9  Intronic
30   rs      [G/T]      K        Intron 9  Intronic
31   rs      [A/G]      R        Intron 9  Intronic
32   rs      [A/T]      W        Intron 9  Intronic

*IUPAC nucleotide symbol.

Coordinates are counted from the beginning of the genomic DNA (GenBank accession: X65138).

Table 1. Continued

No.  SNP ID     SNP/indel  Symbol*  Position  Functional class
33   rs         [-/C]      in/del   Intron 9  Intronic
34   rs         [A/C]      M        Intron 9  Intronic
35   rs         [G/T]      K        Intron 9  Intronic
36   rs         [-/GA]     in/del   Intron 9  Intronic
37   rs         [-/C]      in/del   Intron 9  Intronic
38   rs         [A/T]      W        Intron 9  Intronic
39   rs         [-/GAA]    in/del   Intron 9  Intronic
40   rs         [A/G]      R        Intron 9  Intronic
41   rs         [A/G]      R        Intron 9  Intronic
42   rs         [A/C]      M        Intron 9  Intronic
43   rs         [A/C]      M        Intron 9  Intronic
44   rs         [A/T]      W        Intron 9  Intronic
45   rs         [G/T]      K        Exon 10   Coding-synonymous
46   OryzaSNP1  [G/T]      K        Intron 1  Intronic
47   OryzaSNP2  [C/A]      M        Exon 6    Coding-ns
48   OryzaSNP3  [C/T]      Y        Exon 9    Coding-synonymous
49   OryzaSNP4  [A/-]      in/del   Intron 9  Intronic
50   OryzaSNP5  [G/-]      in/del   Intron 9  Intronic
51   OryzaSNP6  [C/T]      Y        Exon 10   Coding-ns

*IUPAC nucleotide symbol.

Figure 1: The structure of the GBSSI encoding gene. Blue boxes represent exons.
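The Symbol column of Table 1 follows the IUPAC ambiguity codes for two-base combinations, with insertions/deletions marked separately. A minimal sketch of that mapping, using a hypothetical helper name, is:

```python
# IUPAC ambiguity codes for the six possible two-base combinations.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("GT"): "K",
    frozenset("AC"): "M", frozenset("CG"): "S", frozenset("AT"): "W",
}


def iupac_symbol(snp: str) -> str:
    """Map a bracketed allele pair such as '[C/A]' to its IUPAC symbol;
    anything involving '-' or a multi-base allele is reported as 'in/del'."""
    alleles = snp.strip("[]").split("/")
    if "-" in alleles or any(len(a) != 1 for a in alleles):
        return "in/del"
    return IUPAC_TWO_BASE[frozenset(alleles)]


for snp in ("[A/C]", "[C/T]", "[G/T]", "[A/T]", "[A/G]", "[-/ATAT]", "[A/-]"):
    print(snp, iupac_symbol(snp))
```

Because the lookup is keyed on an unordered set of alleles, [C/A] and [A/C] both map to M, which matches how the two notations are used interchangeably in the table.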

Figure 2. Distribution and percentage of SNPs in the GBSSI encoding (Waxy) gene: intronic 78%, csSNPs 14%, nsSNPs 6%, UTR 2%. csSNPs: coding synonymous SNPs; nsSNPs: coding non-synonymous SNPs; UTR: 5' untranslated region.

Comparative studies were also carried out with Swiss PDB viewer and RasMol, based on the superimposed structures and homology analysis of the native and mutant proteins (Rajesh et al., 2008).

Functional flow chart

A flow chart was prepared for computational analysis and prioritization of SNPs based on their functionality and possible impact on plant phenotypes (Figure 3).

Figure 3. Computational pipeline for in silico analysis of functional SNPs.

Results

SNPs in GBSSI and comparative study

A total of 51 SNPs and in/dels were extracted from the databases: one in the 5' UTR, three coding non-synonymous, seven coding synonymous and forty in intronic sequences (Table 1, Figure 2).

Computational algorithm tools

The following computational tools were used consecutively for comprehensive functional analysis of the GBSSI encoding gene.

UTR Scan

UTR Scan identifies patterns of regulatory region motifs from the UTR database and gives information about important elements in the 5' and 3' UTRs, including whether the matched pattern is damaged (Pesole and Liuni, 1999). One regulatory element was found in the 3' UTR. No functional element was recognized by UTR Scan in the 5' UTR of the GBSSI gene. One A/C SNP [rs ] was found in a non-regulatory part of the 5' UTR. Since only one SNP was found in the UTRs of the GBSSI gene, and none in the regulatory regions, it may be presumed that SNPs in the GBSSI UTRs do not change the protein expression level.

TF Search

Two of the most important functional elements in plant genomes are transcriptional factors (TFs) and transcriptional factor binding sites (TFBSs). These sites are usually short DNA sequences, around 5-15 bp, to which the TF elements bind to begin the transcriptional

process involving RNA polymerase and the promoter. Occurrence of any mutation in these regions can alter motifs and possibly transcriptional patterns (Bulyk, 2004).

Table 2. Transcriptional factor binding sites in the GBSSI gene as distinguished by the TF Search program.

No.  TFBS sequence    SNP/Indel*
1    TTCTAATTATTTGA   N/A
2    TCCAACCAA        N/A
3    GCGGTCGGT        N/A
4    GAGGTAGGA        N/A
5    ATGGTTGGA        N/A
6    AGCTACCTG        N/A
7    AACTACCAG        N/A
8    CAGGTTGCT        C/T
9    TCCTACCAG        N/A
10   TCAAATAATTAGAA   /TAA
11   TCATTGTTAAATAT   N/A
12   ATATTTAACCAAAT   N/A

*SNPs/indels that may be involved in TFBSs and their function.

Various experimental and computational approaches have been used to identify the genomic locations of transcription factor binding sites, particularly in higher eukaryotic genomes (Sinha and Tompa, 2002). TF Search recognizes the transcription factor binding sites of a gene and can therefore show whether a non-coding SNP alters such a site (Heinemeyer et al., 1998). A total of twelve TFBSs were recognized in the GBSSI gene. The scoring scheme in this version of TF Search (v 1.3) is straightforward: scores ranged from 85.1 to 89.6, with the highest score indicating the most important TFBS. A [C/T] SNP (rs ) was found in TFBS number 8 (TFBS8), which begins at position 2777 at the junction of intron 7 and exon 8 (Table 2). This [C/T] SNP at position 2777 (the 5' end of the TFBS8 sequence) potentially has the highest impact on transcriptional factor

binding sites in this gene. It should be noted that, in humans, the strongest selective pressure was detected for proteins involved in transcription regulation (Ramensky et al., 2002).

SIFT (Sorting Intolerant from Tolerant)

Each amino acid substitution has the potential to affect protein function. SIFT is a Web-based program which predicts whether an amino acid substitution affects protein function, based on sequence homology and the physical properties of amino acids (Ng and Henikoff, 2003). SIFT focuses on non-synonymous SNPs and can be applied to spontaneous and laboratory-induced point mutations. SIFT is based on the premise that important amino acids will be conserved among the sequences of a protein family. As a consequence, changes at amino acids conserved in the family should affect protein function (Ng and Henikoff, 2002). The program uses a standard tolerance index of 0.05, and a separate value is calculated for each amino acid position. Values above this threshold are assumed to have lower impact on plants, whereas lower values (<0.05) are indexed as important, with higher phenotypic impact.

Of the 51 investigated SNPs, two (in exons 6 and 10) have already been verified to be non-synonymous with a functional impact on the waxy protein (Chen et al., 2008a, b; Larkin and Park, 2003). The impact of the other nsSNP, in exon 9, has not been characterized and needs to be confirmed by association analysis (Tables 1 and 3). These nsSNPs at positions 2494, 3486 and 3235 can cause functional mutations at amino acids 224, 415 and 370, respectively (Table 3). Among these, the tyrosine to serine (Y→S) substitution at amino acid 224, which arises from the C/A SNP in exon 6, has the highest impact on GBSSI (Figure 4) and corresponds to the lowest possible SIFT score, less than 0.05 (Table 3).
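The thresholding rule described above can be written as a short sketch and applied to the three SIFT scores reported in Table 3. The function name is hypothetical, and the cut-off follows the convention stated in this chapter (scores under 0.05 are treated as not tolerated); it is not a reimplementation of SIFT itself.

```python
SIFT_THRESHOLD = 0.05  # tolerance index used in this chapter


def sift_call(score: float) -> str:
    """Classify a substitution from its SIFT tolerance score."""
    return "not tolerated" if score < SIFT_THRESHOLD else "tolerated"


# Scores as reported in Table 3 for the three non-synonymous SNPs.
table3_scores = {
    "Y224S (exon 6, OryzaSNP2)": 0.00,
    "A370V (exon 9)": 1.00,
    "P415S (exon 10, OryzaSNP6)": 1.00,
}
for substitution, score in table3_scores.items():
    print(substitution, score, sift_call(score))
```

Only the Y224S substitution falls under the threshold, which is why it is carried forward to the structural modelling step.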

The other two nsSNPs, in exons 9 and 10, were predicted to be tolerated by SIFT analysis, with an index of 1.00, which indicates that these SNPs have smaller effects, such as a reduction (or increase) of amylose content and endosperm quality.

Table 3. nsSNPs predicted to have functional significance by SIFT.

SNP No.  SNP ID     Position  Coordinate (gDNA)  SNP  Coordinate (protein)  Amino acid substitution  SIFT score (tolerance)  Impact on protein  Confidence of prediction
47       OryzaSNP2  Exon 6    2494               C/A  224                   Y→S                      0.00*                   High               High
24       rs         Exon 9    3235               C/T  370                   A→V                      1.00                    Low                High
51       OryzaSNP6  Exon 10   3486               C/T  415                   P→S                      1.00                    Low                High

*Substitutions with scores under 0.05 are predicted to be not tolerated. The confidence of the predictions was calculated based on the default median conservation value of 3.0.

GeneSplicer

Splicing is a post-transcriptional modification of RNA in which introns are removed and exons are joined to form mature mRNA. SNPs or in/dels at splice sites may lead to a truncated or mutant protein (El Sharawy et al., 2006). GeneSplicer is a computational tool which predicts splice sites in DNA sequence (Pertea et al., 2001). Although the splice sites in the GBSSI encoding gene have already been identified, this tool was used to identify the exact location of exon-intron boundaries and the possible existence of SNPs/indels in unusual donor-acceptor sites which might change GBSSI structure. A maximum deviation of 2 bp was found in the splice sites predicted by this software. This deviation probably results from the tendency of the software to recognize alternative splicing patterns. Only one putative SNP was recognized, at the exon 1/intron 1 junction at position 246. The functional effect of this SNP on the waxy protein has already been reported (Cai et al., 1998).

SEE ESE (Sequence Evaluator of Exonic Splicing Enhancers)

Exonic Splicing Enhancers (ESE) are prevalent in plant sequences and normally promote exon recognition and inclusion. These sequences have been identified in several plant genes and reside at variable distances from splice sites. Although such splicing enhancers have been identified in both exons and introns, exonic splicing enhancers are generally better characterized and are probably more common.

Table 4. Exon/intron boundaries in the rice GBSSI gene recognized by GeneSplicer and existence of possible SNPs/Indels.

ID             Confidence  SNP/Indel  Position
Exon/Intron1   High        [G/T]      246
Exon/Intron2   High        N/A
Exon/Intron3   High        N/A
Exon/Intron4   High        N/A
Exon/Intron5   High        N/A
Exon/Intron6   High        N/A
Exon/Intron7   High        N/A
Exon/Intron8   High        N/A
Exon/Intron9   High        N/A
Exon/Intron10  High        N/A
Exon/Intron11  High        N/A
Exon/Intron12  High        N/A
Exon/Intron13  High        N/A
Exon/Intron14  High        N/A

*The confidence score must be higher than 12.

Most of the ESE candidates are hexamers, and the most important candidates are highlighted by this software whenever they overlap the three 9-mers GAAGAAGAA, CGATCAACG and TGCTGCTGG, which have been found to be very effective ESEs in plants (Tacke and Manley, 1999). Occurrence of SNPs in these regions may generate aberrant mRNAs that are either unstable or code for defective, truncated or deficient protein isoforms. Sequence Evaluator for ESEs (SEE ESE) was applied to locate conserved motifs represented by these hexamers in exonic regions near splice sites in the GBSSI gene.
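The idea of checking whether a SNP falls inside a known enhancer motif can be sketched as a plain string search for the three 9-mers named above. This is not how SEE ESE works internally; the helper names and the toy sequence are hypothetical and serve only to show the overlap test.

```python
ESE_9MERS = ("GAAGAAGAA", "CGATCAACG", "TGCTGCTGG")


def find_motifs(seq: str, motifs=ESE_9MERS):
    """Return sorted (start, end, motif) hits, 1-based inclusive,
    including overlapping occurrences."""
    seq = seq.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start + 1, start + len(motif), motif))
            start = seq.find(motif, start + 1)
    return sorted(hits)


def snps_in_motifs(hits, snp_positions):
    """List (SNP position, motif) pairs where the SNP lies inside a hit."""
    return [(pos, motif) for pos in snp_positions
            for (start, end, motif) in hits if start <= pos <= end]


toy_exon = "ATGAAGAAGAAGAACCTGCTGCTGGTT"   # hypothetical sequence, not GBSSI
hits = find_motifs(toy_exon)
print(hits)
print(snps_in_motifs(hits, [1, 12, 20]))  # hypothetical SNP coordinates
```

With real data the sequence would be an exon of the gene of interest and the SNP coordinates would be converted to the same coordinate system before the overlap test.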

Although a total of 17 potential ESE motifs were found, no SNPs/indels were found in these regions.

FAS-ESS (Systematic identification and analysis of exonic splicing silencers)

Exonic splicing silencers (ESSs) are cis-regulatory sequences (elements) in exons or introns that either inhibit the use of adjacent splice sites, often contributing to alternative splicing (AS), or promote exon skipping. Bioinformatic analyses suggest that these ESS motifs play important roles in the suppression of pseudo-exons, in splice site definition and in AS (Wang et al., 2004).

Table 5. List of Exonic Splicing Silencers in the GBSSI gene recognized by the FAS-ESS program.

ESS ID  ESS sequence  SNP ID     SNP position*  SNP
ESS1    AGGTATA       OryzaSNP1  246            [G/T]
ESS2    TTATGG        OryzaSNP2  2494           [C/A]
ESS3    TCGTTCA       rs                        [C/T]
ESS4    TCGTTCA       rs                        [A/G]
ESS5    TCCTGG        rs                        [G/A]

*All coordinates and positions are calculated from the GBSSI genomic DNA (GenBank accession number X65183). Underline-highlighted nucleotides show the position of SNPs.

With the FAS-hex2 set, 77 of the 113 predicted domains were located in the SNP high-density regions, which are exons/introns 8, 9 and 10. FAS-ESS analysis identified five exonic splicing silencer sites in the GBSSI gene (Table 5). The most important silencer elements are probably ESS2, ESS1 and ESS3, in that order, because they contain SNPs in coding regions or at exon-intron splice sites. ESS2, in exon 6, contains a putative non-synonymous [C/A] SNP (ID: OryzaSNP2) which is responsible for a Y→S change that may have significant effects on GBSSI protein characteristics.

Simulation for finding functional, constructive changes of ns-coding SNPs

The secondary and tertiary structure of a protein can alter its function and activity (Fersht, 1985; Hrmova and Fincher, 2001). Modelling of GBSSI is the final stage of the functional analysis. These computational algorithms can recognize the impact of different nsSNPs by simulation and comparison of native and mutant molecular structures. Results from the SIFT analysis were included at this level. Of the three non-synonymous SNPs, only the C/A SNP in exon 6 [OryzaSNP2] was recognized as important; the other two SNPs, in exons 9 and 10, were found to have lower impact by SIFT analysis (Table 3). This nsSNP is associated with the Y→S amino acid substitution at position 224 of GBSSI. The 3D modelling was performed with Geno3D. This software identifies the structure most similar to the query amino acid sequence and simulates a 3D protein automatically. Based on a Geno3D search of different protein databases, the template structure for GBSSI has the PDB ID 3D1J. 3D1J is a glycogen synthase, and the 3D crystal structure of this protein has been elucidated in E. coli (Buschiazzo et al., 2004; Sheng et al., 2009). It is thought that synthesis of storage polysaccharides in bacteria and plants is carried out by a similar ADP glucose-based pathway (Ball and Morell, 2003). The exact location of the exon 6 SNP was identified with Swiss PDB viewer. Figures 5a and 5c show the exact position of this high impact substitution in the modelled protein; the Y→S substitution at residue 224 may change the shape, stability and, in turn, the activity of the protein (Figures 5b and 5d).

Figure 4. ClustalW2 alignment of native (WxN) and mutant (WxM) GBSSI. Y→S, A→V and P→S substitutions were found at residues 224, 370 and 415, corresponding to SNPs number 47, 24 and 51 in exons 6, 9 and 10, respectively. SIFT analysis found that the [C/A] SNP in exon 6 had the highest impact on protein structure.

Discussion

Computational algorithms are useful and cost-effective tools for the analysis of SNPs and genes. Even with the emergence of new high-throughput technologies to sequence whole plant genomes (Henry, 2008), it is not possible to recognize all functional SNPs in a pool of sequencing data which also contains neutral SNPs. Assessment of functional SNPs can be performed by phylogenetic comparison (George Priya Doss et al., 2008), such as the study of statistical correlation with residue substitution. Recently, SNP-linkage disequilibrium and association studies, which need accurate phenotypic data from appropriate populations, have gained acceptance as procedures to assess functional SNPs (Carlson et al., 2003). However, these populations can be difficult to generate (Gupta et al., 2005), and they must have high variation in the studied traits. The efficiency of computational tools for identification of functional SNPs in human cancer-related genes such as BRCA1 and BRCA2 has been reported by a number of authors (Rajasekaran et al., 2007; Rajasekaran et al., 2008). Shen et al. (2006) demonstrated the application of in silico analysis tools such as SIFT, PolyPhen and UTRScan to recognize SNPs in a cytokine gene that has a known role in human immune-related diseases. Based on the success of the latter, a computational screening pathway was developed to prioritize and rank plant SNPs according to their functionality and impact on plant phenotypes. The results here show that there are significant numbers of important elements in the GBSSI gene and that SNPs are found within these regions; such correlations have already been demonstrated by Soussi et al. (2006). Four of these elements were found to have the highest functional effects, and these effects appeared to result from the existence of SNPs in these regions.
The non-synonymous [C/A] SNP in exon 6 [OryzaSNP2], for example, has the most significant effect on amylose content according to SIFT, and this has previously been shown experimentally (Chen et al., 2008b).

Figure 5. 3D molecular modelling of GBSSI generated by Geno3D and viewed with Swiss PDB viewer and RasMol. Panels (a) and (b): tyrosine; panels (c) and (d): serine. Arrows indicate the exact location of the Y→S (tyrosine to serine) substitution at residue 224, derived from the most important SNP [C/A] in the waxy protein gene, in exon 6 (OryzaSNP2). (5a and 5b) The native GBSSI protein contains the A nucleotide. (5c and 5d) The mutant protein carries C instead of A. The arrows in Figures 5b and 5d also indicate a significant difference in the structural loop at the substitution region.

SIFT assumes that important amino acids will be conserved in the protein family, and so changes at well conserved, charged or polar residues are predicted to be high impact, that is, to affect protein function. If a position in an alignment contains hydrophobic amino acids, SIFT assumes that this position can only tolerate amino acids with hydrophobic character without a substantial effect on protein function, and such substitutions can be prioritized by SIFT score.

The quantitative SIFT score allows amino acid changes to be prioritised and their possible functional effects to be ranked. An important feature of this algorithm is the confidence value. Confidence in a predicted high impact substitution depends on the diversity of the aligned protein sequences and on how closely related they are. If the aligned sequences are too closely related, many amino acid positions will appear conserved and SIFT will predict most substitutions to affect protein function, which leads to a high false positive error: functionally neutral substitutions are predicted as high impact. The converse, in which functional substitutions are predicted as neutral, is a false negative error. To alert the user to these situations, SIFT calculates the median conservation value, which measures the diversity of the sequences in the alignment. In SIFT, conservation is calculated for each position in the alignment and the median of these values is taken. By default, SIFT builds alignments with a median conservation value of 3.0. Predictions based on alignments with higher median conservation values, which are less diverse, will have a higher false positive error (Ng and Henikoff, 2003). As the default median conservation value of 3.0 was used, and only the few available homologous sequences could be aligned, the confidence reported for SIFT in this study is the highest possible under these settings (Table 3).

Based on SIFT analysis, the [C/A] SNP in exon 6 [OryzaSNP2] is located in a conserved region and affects a charged or polar residue. Larkin and Park (2003) found two other coding SNPs, C/T [OryzaSNP3] and C/T [OryzaSNP6], in exons 9 and 10 of GBSSI, which have non-functional and functional effects, respectively (Tables 1 and 3). They also verified that haplotypes composed of the SNPs at the exon 1/intron 1 boundary site, exon 6 and exon 10 regulate GBSSI function. Chen et al. (2008a, b) have also confirmed that these SNPs can alter the apparent amylose content and pasting properties of rice.
Since the [C/A] SNP in exon 6 [OryzaSNP2] had the highest possible impact on GBSSI, both native (Y) and mutant (S) GBSSI proteins (Y→S = tyrosine to serine at residue 224)

were simulated (Fig 5). The superimposed structures of the two proteins showed a distinctive deformed loop at the mutation position in comparison with the native structure. This deformed loop is located on the outer layer (surface) of GBSSI and alters the 3D shape, structure and function of the protein, possibly owing to a change in the accuracy of the protein binding site. Sequence similarity confers structural similarity (Chothia and Lesk, 1986; Hegyi and Gerstein, 1999), but unfortunately the relationship between a protein's sequence similarity and functional similarity is not straightforward (Bork and Koonin, 1998).

Exonic Splicing Silencers have relatively major effects on splicing patterns through recognition of splice sites (Wang et al., 2006). Application of FAS-ESS suggested that OryzaSNP2 also has a silencing action because it is located in an exonic splicing silencer region, although there are no reports of experimental evidence for alternatively spliced mRNAs or altered protein size in rice plants with regard to this SNP (Table 5). The effect of the [C/A] SNP in exon 6 on amylose content and grain quality has been confirmed by many authors (Sano, 1984; Larkin and Park, 2003; Chen et al., 2008a). FAS-ESS analysis also suggested another important silencer (ESS1) at the splice site of exon 1/intron 1, which has a [G/T] SNP [OryzaSNP1]. The significance of this SNP in reducing amylose content has already been reported from experimental analysis, which found that it decreases the activity of GBSSI by altering the mRNA splice site (Cai et al., 1997). Application of TF Search also identified an important TFBS, including one [C/T] SNP [rs ] at coordinate 2777, which may potentially have a major effect on GBSSI function.

Conclusion

Most genetic analysis software has been designed for human or animal genetic studies. Application of a number of programs allowed construction of a computational pathway for

SNP analysis in plants. There is a significant relationship between in silico and experimental results, confirming that computational tools can help identify and characterize functional SNPs. Following Transcription Factor (TF) Search analysis, a new [C/T] SNP [rs ] at coordinate 2777 near the boundary of intron 7/exon 8 was predicted, which may have a major impact on GBSSI and related phenotypes.
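The distinction between synonymous and non-synonymous SNPs that underlies this chapter (for example the Tyr→Ser change at residue 224 of GBSSI) can be illustrated with a minimal sketch. It uses the standard genetic code; the codon `TAC` and the function name `classify_snp` are assumptions chosen for illustration only, since the actual codon at residue 224 is not stated in the text.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(codon, pos, alt):
    """Classify a single-base change inside one codon.

    codon: reference codon (e.g. "TAC"); pos: 0-based position in the codon;
    alt: alternative base. Returns (ref_aa, alt_aa, kind).
    """
    ref_aa = CODON_TABLE[codon]
    mutant = codon[:pos] + alt + codon[pos + 1:]
    alt_aa = CODON_TABLE[mutant]
    if alt_aa == ref_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "non-synonymous"
    return ref_aa, alt_aa, kind


# Hypothetical example: an A->C change at the second codon position of a
# tyrosine codon gives serine, i.e. a non-synonymous (Y->S) substitution.
print(classify_snp("TAC", 1, "C"))
```

Only the non-synonymous class is relevant to the nsSNP genotyping described in the next chapter; splice-site SNPs such as OryzaSNP1 fall outside this simple codon-level check.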

CHAPTER 4

SNP in starch biosynthesis genes associated with the nutritional and functional properties of domesticated rice

Summary

Starch is a major component of human diets. The physiochemical properties of starch influence its nutritional value and the functional properties of starch-containing foods. Many of these traits have been under strong selection during the domestication of rice as a food. A population of 233 breeding lines of rice was analysed for variation at 110 functional SNP loci in exonic regions of 18 starch-related genes, and the results were related to rice pasting and cooking quality. Associations of 65 functional SNPs were detected. Five genes, AGPL2a, Isoamylase1, SPHOL, SSIIb and SSIVb, showed no polymorphism. GBSSI (waxy gene) and SSIIa had a major influence on starch properties and the other genes had minor associations. The G/T SNP at the boundary site of exon/intron 1 in GBSSI showed the strongest association with retrogradation and amylose content. The TT allele has been selected in much of the domesticated japonica genepool, providing rice with a desirable texture but with less resistant starch, which is associated with human health advantages. The GC/TT SNP at exon 8 of SSIIa showed a very significant association with pasting temperature (PT), gelatinization temperature (GT) and peak time. No significant association was found between SSIIa and retrogradation. Other genes contributing to retrogradation were SSI, BEI and SSIIIa. The highest level of polymorphism was observed in SSIIIa, with 22 SNPs, but only limited associations were observed with starch phenotypic values. None of the SNPs was strongly associated with chalkiness, except for a weak link with a T/C SNP at position 960 (Thr482 to Ala) in Isoamylase2. These associations provide new tools for deliberate selection of rice genotypes for specific functional and nutritional outcomes.

Introduction

Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variation. Many important plant traits and human genetic diseases are attributed to these sequence variations (Shastry, 2002; Bryan et al., 2000), acting either through gene expression or protein function (Kennedy et al., 2006). Identifying SNPs associated with grain starch quality advances our understanding of the starch biosynthesis pathway and highlights potential ways to generate crops with higher yield and better quality, which directly impacts human nutrition and health.

Starch is mainly composed of amylose and amylopectin (Miles et al., 1985). Seven classes of starch-related enzymes with a high impact on grain starch structure and quality are known: ADP-glucose pyrophosphorylase (AGPase), granule bound starch synthase (GBSS), starch synthase (SS), branching enzyme (BE), debranching enzyme (DBE), starch phosphorylase (PHO) and glucose phosphate translocator (GPT, in Chapter 5). These genes/enzymes contribute directly or indirectly to the production of starch granules composed of amylose and amylopectin.

The Rapid Visco Analyser (RVA) is one of the most important means of measuring grain quality parameters (Limpisut and Jindal, 2002). Using 43 gene-specific molecular markers, Yan et al. (2010) analysed the association of 17 starch synthesis-related genes with RVA profile parameters in a collection of 118 glutinous rice accessions. They found that 10 of the 17 starch-related genes are involved in controlling RVA profile parameters, with the most significant being pullulanase, which plays an important role in the control of peak viscosity (PKV), hot paste viscosity (HPV), cool paste viscosity (CPV), breakdown viscosity (BDV), peak time (PKT) and paste temperature (PT), while seven other starch genes had minor impacts on a few RVA profile parameters. The RVA parameters are controlled by a complex

genetic system involving many starch-related genes (Tester et al., 1995). This complexity can be attributed to many factors, such as genetic, epigenetic and environmental effects and G×E interaction in the studied population (Tester et al., 1995; Morell, 2003).

Granule bound starch synthase (GBSSI) is the most important starch synthesis gene in rice and other cereal grains. A number of SNPs in rice GBSSI (waxy gene), at the intron/exon 1 junction site, exon 6 and exon 10, have a significant impact on starch quality (Chen et al., 2008a, b; Cai et al., 1998; Larkin et al., 2003) via their impact on amylose content. Starch synthase IIa (SSIIa) has a major effect on starch quality through its impact on amylopectin structure (Craig et al., 1998; Morell, 2003). The effect of this gene on cooking quality and starch texture, as measured by alkali spreading value, gelatinisation temperature (GT) and eating quality of rice starch, has been extensively studied through the polymorphism of two SNPs, [A/G] and [GC/TT], in alk, the gene which codes for SSIIa (Umemoto et al., 2008; Umemoto et al., 2004; Umemoto and Aoki, 2005; Waters et al., 2006). Except for GBSSI and SSIIa, there are no reports of associations between SNPs and starch quality parameters, with most studies focusing on the gene level rather than the SNP level by comparing gene-deficient mutants (Fujita et al., 2006).

Massively parallel sequencing (MPS) technology is a flexible and high-throughput platform for genetic analysis and functional genomics which is based on ultra-deep sequencing of short read lengths and a huge number of sequencing reactions (Imelfort et al., 2009). Kharabian-Masouleh et al. (2011) discovered more than 501 SNPs and 113 In/dels in 17 starch-related genes in an Australian rice breeding population using a combination of target-pooled long range PCR and MPS, clearly indicating the capacity of high-throughput MPS technology to discover new SNP variants in plant populations.
This technology can be used in combination with multiplexed MALDI-TOF (Sequenom) to quickly identify genetic variation within plant populations and then assign this variation to individual plants and

phenotypes, improving the efficiency of marker assisted selection (MAS) (Masouleh et al., 2009).

Resistant starch has a significant impact on human health (Sajilata et al., 2006). The incomplete digestion and absorption of resistant starch in the small intestine leads to non-digestible starch fractions with physiological functions similar to those of dietary fibre, with significant beneficial impact (Asp and Björck, 1992). Starch retrogradation describes the hardening of cooked starch after cooling due to re-crystallization of gelatinized starch components (Fan and Marks, 1998). There is a significant association between retrograded and resistant starch and hence, in this study, the term retrograded-resistant starch is used. The in vivo digestibility and structural features of resistant-retrograded starch with high amylose content in maize, bean and potato flakes were assessed using the ileal contents of four human populations (Faisant et al., 1993). The main resistant starch fraction consisted primarily of retrograded amylose with a degree of polymerization of approximately 35 glucose units and a melting temperature of 150 °C. Likewise, retrograded amylose in peas, maize, wheat and potatoes was found to be highly resistant to amylolysis (Ring et al., 1988), suggesting this fraction had a high amylose content. Other characters which may have an influence on the rate of retrogradation, firmness and resilience of rice starch after cooking are protein and lipid contents (Philpot et al., 2006). High amylose rice cultivars are characterized by low RVA parameters, high resistant starch (RS) content and a lower estimated glycemic score (EGS), and highly retrograded rice starch tends to have a reduced hydrolysis index (HI) and glycemic index (Hu et al., 2004). Waxy and low amylose rice starch is hydrolysed more quickly and completely than intermediate and high amylose rice starch (Chung et al., 2006; Hu et al., 2004).
In this study a novel SNP in the glucose-6-phosphate translocator 1 (GPT1) gene which is highly associated with amylose content and retrogradation rate of resistant starch (Chapter 5)

is reported. In addition, an explicit, coherent gene-by-gene approach is established to reveal the association of 18 starch-related genes and their SNP polymorphisms with the physiochemical properties of rice starch.

Materials and methods

Plant materials

Plant material was supplied by Industry and Investment NSW, Yanco Agricultural Research Institute, Australia. A population of 233 F6 lines was selected from the Australian temperate (japonica-type) rice breeding program (Appendix 5). Selection was primarily based on capacity to flower and set seed and on the morphological traits of plant height, grain size and shape. No selection took place for quality traits such as gelatinization temperature and RVA curve characteristics.

Physiochemical properties

In total, 13 physiochemical traits, including four phenotypic traits and RVA characteristics, were measured. The phenotypic traits, consisting of apparent amylose content (AC), gelatinization temperature (GT), grain chalkiness and retrogradation rate [scored by the Martin test (Philpot et al., 2006)], were quantified according to standard methods. RVA characteristics such as peak viscosity (PKV), trough viscosity (TV), final viscosity (FV), breakdown, setback, peak time (PKT) and pasting temperature (PT) were measured by a Rapid Visco Analyser (Model, City, country) according to the manufacturer's instructions.
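Breakdown and setback are not measured directly; they are derived from the three viscosity readings named above. The following minimal sketch shows the usual arithmetic. The function name `rva_derived` and the example values are assumptions for illustration, and because the text does not state which setback convention was used, both common definitions (final minus peak, and final minus trough) are returned.

```python
def rva_derived(peak, trough, final):
    """Derive breakdown and setback from RVA viscosity readings.

    All three inputs must be in the same viscosity unit (e.g. RVU or cP).
    Two setback conventions are in common use, so both are reported here;
    which one applies to a given data set is an assumption to verify.
    """
    return {
        "breakdown": peak - trough,            # loss of viscosity during holding
        "setback_from_peak": final - peak,     # convention often used for rice
        "setback_from_trough": final - trough  # alternative convention
    }


# Hypothetical readings, for illustration only.
print(rva_derived(peak=3000, trough=1800, final=3200))
```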

Designation of starch-synthesis genes involved in starch metabolism

The available literature was used to identify the most likely candidate genes associated with rice starch quality (Ohdan et al., 2005; Waters and Henry, 2007; Nakamura, 2002; Hirose et al., 2006; Rahman et al., 2000). The genetic map and approximate location of the genes on chromosomes are shown in Appendix 8. The general entries of nucleotide sequences (gDNA) and full-length cDNAs of important gene classes involved in starch biosynthesis were retrieved from the NCBI and the Rice Genome Annotation Project databases and resequenced using long range PCR and massively parallel sequencing (Illumina GAII) to find novel SNPs/Indels in the studied population (Kharabian-Masouleh et al., 2011). Amplification primers were designed based on the consensus sequence alignment of each candidate gene.

Candidate genes/enzymes for SNP genotyping

In total, eighteen genes representing seven groups of enzymes, namely ADP-glucose pyrophosphorylase (AGPase), granule bound starch synthases (GBSSI and GBSSII), starch synthases (SSI, SSIIa, SSIIb, SSIIIa, SSIIIb, SSIVa, SSIVb), branching enzymes (BEI, BEIIa, BEIIb), debranching enzymes (ISA1, ISA2, Pullulanase), starch phosphorylase (SPHOL) and glucose-6-phosphate translocator (GPT1) (in Chapter 5), were selected for SNP genotyping.

SNP dataset

SNPs for each gene were retrieved from the previous SNP discovery approach (Kharabian-Masouleh et al., 2011). The total number of functional polymorphisms discovered in the

population was then compared to SNPs available at the OryzaSNP MSU database and 59 extra SNPs were harvested to ensure that all known non-synonymous SNPs (nsSNPs) were assayed. In total, 110 nsSNPs with possible functional effects (amino acid change) were chosen for primer design and genotyping, of which 65 were successfully genotyped, with different status as polymorphic or non-polymorphic (Appendix 6). In total, 45 SNPs/primer sets either failed to genotype individuals or did not exist in the population (such as those extracted from OryzaSNP databases) and were therefore disregarded in the analysis.

Primer design and SNP genotyping

Several multiplexed assays were designed with Sequenom MassARRAY Assay Design 3.1 software to cover all available SNPs. The optimal amplicon size containing the polymorphic site in the software was set to bp. A 10-mer tag (5′-ACGTTGGATG-3′) was added to the 5′ end of each amplification primer to avoid confusion in the mass spectrum and to improve PCR performance (Masouleh et al., 2009).

Capture PCR protocol, primer extension and mass spectrometry

The steps of capture PCR, primer extension, resin cleanup and mass spectrometry were undertaken according to the manufacturer's (Sequenom MassARRAY) instructions.

Association analysis

Assays were constructed for 110 polymorphisms defining each of the alleles of 18 genes controlling starch quality traits and retrogradation. SNP call data for genotyped polymorphic alleles, along with phenotypic data, were then transferred into TASSEL v2.1 (Bradbury et al., 2007) to find associations between SNPs and physiochemical properties. A gene by gene approach

was employed to understand the association of each individual gene/SNP with target traits. A comprehensive association analysis including all SNPs significantly associated with starch properties was also carried out. The latter analysis shows the impact of significant SNPs on starch quality traits when combined with one another, representing the possible compensating or balancing effects of polymorphism on the final phenotypic value of individuals.

Statistical parameters

Critical statistics such as the F-test, p-value, adjusted p-value and R2 were calculated to measure associations, while the permutation was set to

Results

65 SNP assays were designed and 233 individuals genotyped. To avoid complicating the association study, a gene by gene approach was applied to find the impact and possible linkage of individual genes on physiochemical and quality-related properties of rice grain. Appendices 6 and 7 present the identification code, coordinate and associations of all studied SNPs.

AGPS2b (small subunit)

No functional polymorphism was found in AGPS2b, suggesting that this gene does not have any impact on physiochemical properties of grain quality in this population.
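The statistics named under Statistical parameters (F-test, R2 and a permutation-based p-value) can be illustrated for a single SNP and a single trait. The sketch below is a plain one-way analysis of variance with genotype as the grouping factor; it is not TASSEL's implementation, and the function names, the number of permutations and the toy genotype and trait values are all assumptions made for illustration.

```python
import random


def one_way_anova(groups):
    """Return (F, R2) for a trait grouped by genotype.

    groups: dict mapping genotype label -> list of trait values.
    R2 is the share of total variation explained by genotype.
    """
    values = [v for g in groups.values() for v in g]
    n, k = len(values), len(groups)
    grand = sum(values) / n
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2
                     for g in groups.values())
    ss_within = ss_total - ss_between
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, ss_between / ss_total


def permutation_p(genotypes, trait, n_perm=1000, seed=0):
    """Empirical p-value: how often shuffled genotype labels give F >= observed."""
    def f_for(labels):
        groups = {}
        for g, y in zip(labels, trait):
            groups.setdefault(g, []).append(y)
        return one_way_anova(groups)[0]

    observed = f_for(genotypes)
    rng = random.Random(seed)
    shuffled = list(genotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_for(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up toy data (not from the thesis): eight lines, one biallelic SNP.
genotypes = ["GG"] * 4 + ["TT"] * 4
amylose = [24, 25, 26, 23, 15, 16, 14, 17]
groups = {"GG": amylose[:4], "TT": amylose[4:]}
f_stat, r2 = one_way_anova(groups)
print(round(f_stat, 2), round(r2, 3), permutation_p(genotypes, amylose))
```

Read this way, an R2 of 0.66 for a SNP means that genotype at that SNP accounts for about two thirds of the observed variation in the trait within the population.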

SPHOL (alpha 1,4 glucan starch phosphorylase)

Five SNPs were retrieved from the OryzaSNP database and genotyped, but no functional polymorphism was recognised in this population. Therefore, this gene had no effect on the studied traits.

GBSSI (Granule bound starch synthase I)

This gene is the most important gene involved in starch synthesis in rice and other cereal grains. The association study showed a strong correlation between the WAXYEXIN1 (G/T) SNP at the junction site of exon 1/intron 1 and RVA curve characteristics such as peak viscosity (PKV) and breakdown. The highest F-value= in this experiment was observed for this SNP, which shows a significant link to retrogradation rate (Martin test) and amylose content (F-value=121.52). The R2 values for retrogradation and amylose content were 0.66 and 0.51, respectively. The second SNP in GBSSI associated with grain properties was WAXYEX10. This C/T SNP at coordinate 3486 of the gene creates a P→S substitution and has a very significant association with trough, final viscosity (FV), setback, retrogradation and amylose content, although with lower linkage than WAXYEXIN1. The R2 values for retrogradation and amylose content were 0.39 and 0.16, respectively. The last SNP, WAXYEX6, also revealed some significant association according to the calculated p-values 0.01 but did not show any remarkable F and R2 values, which suggests only small control of critical pasting properties. Overall, the results indicate that this gene alone can explain a significant portion of the variation in retrograded-resistant starch in rice. Section 3 of Appendix 7 shows the comprehensive results of the association study for the GBSSI gene. The data suggest that the SNPs WAXYEXIN1 and WAXYEX10 contribute jointly, while WAXYEX6 has less value in controlling starch properties.

GBSSII (Granule bound starch synthase II)

GBSSII is found exclusively bound to starch granules in green tissues and synthesises amylose. The synthesised amylose is subsequently consumed by the plant or accumulated in the endosperm (Dian et al., 2003). During pre-heading, about 1-3 days after flowering, this gene/enzyme is expressed in leaf, leaf sheath, culm and pericarp tissue at a low level (Ohdan et al., 2005). Vrinten and Nakamura (2000) confirmed the role of GBSSII in the elongation of amylose in non-storage tissues of cereals. One nsSNP was found at position 1638 of this gene and tested for association with starch physiochemical traits (Appendix 6). Only one considerable association, with pasting temperature (PT) with an R2 value of 0.20, was observed for this SNP, although some minor associations were also calculated with GT and peak time (sect. 4, Appendix 7).

SSI

Only one T/C SNP at position 5153 of this gene showed minor associations with FV, SB and Martin test (MT), with R2 values of 0.16, 0.11 and 0.16, respectively (Appendix 7).

SSIIa

The starch synthase IIa (SSIIa) gene affects starch quality, presumably by affecting amylopectin structure. Two SNPs at positions 631 and (ALKSSIIA4) were tested for association (Appendix 6). The effect of [GC/TT] on alkali disintegration and eating quality of rice starch is already known (Umemoto and Aoki, 2005; Waters and Henry, 2007). Highly significant associations were found between SNPs of SSIIa and important physiochemical properties such as pasting temperature (PT), peak time (PKT), GT and breakdown viscosity. A highest F-test value of was observed for the ALKSSIIA4 [GC/TT] SNP and PT, suggesting this SNP controls PT, PKT and BDV with R2 values of

0.642, and 0.168, respectively. This SNP has one of the strongest associations with the physiochemical properties of rice studied in this population (R2=0.642). This suggests the [GC/TT] SNP at position of SSIIa is one of the most influential SNPs among the assayed polymorphisms. The other G/T SNP at position 631 showed no significant association with any trait.

SSIIb

It is believed SSIIb is an early expressed gene with a low expression level, primarily expressed in green tissues and at an early stage of grain filling (Hirose and Terao, 2004). In total, 6 different SNPs were genotyped in this population (Appendix 6) and no polymorphism was found, suggesting that this gene has no effect in our study.

SSIIIa

The highest polymorphism was observed in this gene, with 22 SNPs in the coding region causing an amino acid change. The available polymorphism in this gene showed association with a number of studied properties such as FV, SBV, PT, M-test, AC, Predicted N, Dif, GT and chalkiness. However, most of these associations had very low R2 values of less than 0.1, indicating that although the SNPs are associated, they do not have a highly significant effect on physiochemical properties (sect. 8, Appendix 7). Some SNPs in SSIIIa are apparently highly associated with GT and M-test. The highest R2 values of 0.243, 0.156, 0.130, and were observed for GT, M-test, Dif, AC and predicted N, respectively (Appendix 6).

SSIIIb

The main effect of SSIIIb was observed on pasting temperature (PT). Strong associations were found between T/G and C/A SNPs at positions 7232 and 4543 of SSIIIb, with R2 values of and 0.225, respectively. These relatively high R2 values suggest an influence of SNPs in the coding regions of this gene on PT, although minor associations were found with peak viscosity (PKV) and difference as well. These SNPs cause Lys→Asn and Ser→Ile changes at positions 207 and 756 of the corresponding amino acid sequence, respectively. This gene can be classified as a major contributor to pasting temperature, as some of its other SNPs also exhibited significant associations with PT (sect. 9, Appendix 7). Hence, this gene is termed here the pasting temperature (PT) gene.

SSIVa

SSIVa is one of the least known starch genes expressed in rice endosperm. Our study showed the impact of this gene on PT and GT. Five SNPs were examined in this gene (Appendix 6), of which four showed significant association with PT (sect. 10, Appendix 7). A relatively high R2 of was observed for the A/G SNP at position 7160, which influences PT. In addition, four other SNPs, with R2 values ranging from , had an influence on this property. Considering all the influential SNPs in SSIVa, a large portion of the phenotypic variation of PT in this population of rice is explained by variation within SSIVa. Some minor associations were also observed with GT, PKT, AC and PN. Together, these data suggest SSIIIb and SSIVa in combination make a very strong contribution to PT.

SSIVb

No variation was observed in the two SNPs assayed in this population (sect. 11, Appendix 7).

BEI

Only one C/T SNP, at position 1558 of this gene, was discovered (Kharabian-Masouleh et al., 2011). Nine of the 13 studied physiochemical traits were associated with this SNP at a medium level, with the highest R2 values observed for AC, MT, SBV and FV, respectively. The relatively high R2 values of and for AC and MT suggest there is a significant contribution of this gene to amylose content and retrogradation. Minor associations were also found between this SNP and PV, BDV and FV (sect. 12, Appendix 7).

BEIIb

BEIIb is coded by the amylose extender (ae) gene in maize and other cereals (Yun and Matheson, 1993). Two SNPs were examined in this gene (Appendix 6) but no significant association was found with starch properties. Previous biochemical analysis of the amylose-extender (ae) mutant of rice (Oryza sativa) revealed the influence of mutation in this gene on gelatinization properties through structural alteration of amylopectin, by reducing short chains and the degree of polymerization (Nishi et al., 2001). No pleiotropic effect with other genes such as BEIIa and SSI was found, suggesting this is a neutral gene in this population. The main reason for this inconsistency may be the nature and minor significance of the SNPs. Most studies of this gene have focused on mutant populations, where a large segment of the gene has been deleted. Therefore, the results of these experiments are not comparable; they must be interpreted only at gene level and cannot be extended to naturally occurring SNPs (sect. 14, Appendix 7).

Debranching Enzymes (DBEs)

ISA1 (Isoamylase 1)

Two SNPs were retrieved from databases and genotyped. No polymorphism was detected in this population, indicating no association with physiochemical properties of rice (sect. 15, Appendix 7).

ISA2 (Isoamylase 2)

Variation of two SNPs was assessed in this gene and very minor associations were found with BDV, PT and chalkiness. All R2 values were less than 0.1, which indicates very low association with the variability of the associated traits (sect. 16, Appendix 7).

Pullulanase (PUL)

A recent association study between pullulanase and RVA profile parameters in glutinous rice has shown strong relations of this gene with peak viscosity, hot paste viscosity, breakdown viscosity and peak time (Yan et al., 2010). In this study there were only weak associations between two SNPs in pullulanase and PT, GT and CHK, with R2 values of 0.174, and 0.066, respectively. Values above 0.1 represent a low degree of association and can explain a portion of the current variability in this population of rice (sect. 17, Appendix 7).

Discussion

SSI transcript level has been measured at different seed developmental stages. A high expression level was reported at 1-3 DAF, peaking at 5 DAF and remaining almost constant during starch synthesis in the endosperm. This suggests SSI is a major SS form in cereals (Cao et al., 1999).

Neutral genes with no polymorphism or association

In total, 65 SNPs were successfully genotyped in 233 breeding lines (Appendix 6). No polymorphism was detected for AGPS2b, SPHOL, SSIIb, SSIVb and ISA1. Moreover, there was no association between BEIIa or BEIIb and any physiochemical property in rice. Therefore, seven genes out of eighteen did not contribute to the physiochemical properties of this population. Although there have been no reports of associations between naturally occurring SNPs in these genes and quality properties, many studies have reported the importance of these genes to the physiochemical properties and quality of starch granules. For example, Kawagoe et al. (2005) described how the AGPS2b subunit plays an important role in starch granule synthesis and is associated with rice shrunken mutants. SPHOL is also supposedly involved in starch degradation and biosynthesis. The mechanism appears to be associated with phosphorylation of some starch-related enzymes and proteins such as starch branching enzymes (SBEs) and starch synthase (SSIIa) (Tetlow et al., 2004). As almost all of these studies have been based on deficient mutants (Rolletschek et al., 2002), it can be concluded that massive mutations, such as In/dels, which abolish gene function have an impact on soluble sugar content, the structure and appearance of starch granules and the quality of endosperm in rice and other species, but SNPs may not have any impact on starch quality. Despite the reported impact of BEIIb (amylose extender) and ISA1 on physiochemical properties in several cereal species (Fisher et al., 1993; Sun et al., 1997; Sun et al., 1998; Yamakawa et al., 2008), this study found no association with any physiochemical property of rice starch. In rice endosperm, antisense inhibition of ISA1 has altered the structure of amylopectin and the physiochemical properties of starch (Fujita et al., 2003).
The ISA genes are also presumed to contribute to the degree of setback in glutinous rice cultivars (Yan et al., 2010). No significant association was found

between two detected SNPs in BEIIb and quality traits in this population. The contradictory results can be attributed to the composition or structure of populations. Not all alleles which affect any one trait may be represented in this or any particular population; different minor genes might have peculiar regulatory roles and impacts which are mediated by different genetic backgrounds.

Major genes with highly significant associations

GBSSI and SSIIa are major genes involved in some of the most important grain quality properties, such as amylose content and gelatinization temperature. Highly significant associations were found between GBSSI and retrogradation and amylose content, in addition to more significant relationships with RVA properties such as BDV, SBV and FV. A number of authors have reported the importance of this gene for starch physiochemical properties of rice and other cereals, where SNPs at the intron/exon 1 junction site, exon 6 and exon 10 in rice GBSSI (waxy gene) have the most significant impact on starch quality (Chen et al., 2008a, b; Cai et al., 1998). Larkin and Park (2003) have already reported that a SNP in exon 6 affects amylose content. This study confirms the T/G SNP at the intron/exon 1 junction site has a major influence on a number of physiological properties.

SSIIa presented a very high association with pasting temperature, gelatinization temperature and peak time. The effect of this gene on cooking quality and starch texture has been extensively studied (Umemoto et al., 2004; Umemoto et al., 2008). Umemoto and Aoki (2005) explained the alkali disintegration and eating quality of rice starch by the polymorphism of two SNPs, [A/G] and [GC/TT]. These SNPs within exon 8 of the alk locus are significantly associated with gelatinisation temperature (GT) (Waters et al., 2006), and here it has been confirmed there is a very significant association between the SSIIa exon 8 GC/TT SNP and pasting temperature (R2=0.642).

Contributory genes with low-medium associations

In this study, six genes, GBSSII, SSI, SSIIIa, SSIIIb, SSIVa and BEI, had low to medium effects on the final phenotypic variation of individuals. SNPs in these genes showed significant association with a number of the studied characters with low to medium R2 values, and here these genes are termed contributory, where the addition of their effects can, in part or in full, represent the phenotypic values. Some of these genes might work with one another to reach a certain level of phenotypic expression. The effect of contributing genes and how they might act together have been widely studied at gene level (Dian et al., 2003; Fujita et al., 2006; Hirose et al., 2006; Umemoto et al., 2008). SSIIIb and SSIVa are PT-associated genes with a medium to high level of association with pasting and/or gelatinization temperatures.

Minor genes with very low associations

Debranching enzymes showed minor influences in this population. Isoamylase was first reported in maize endosperm (Doehlert and Knutson, 1991). ISA2 is a relatively small gene (2625 bp) with no intron. Therefore, it is presumed each detected SNP/Indel in this gene could be potentially important. However, no strong association between the two SNPs in this gene and any physiochemical property was found. Another debranching enzyme, pullulanase, had low associations with PT, GT and chalkiness (CHK). A recent association study between pullulanase and RVA profile parameters in glutinous rice has shown strong relations of this gene with peak viscosity, hot paste viscosity, breakdown viscosity and peak time (Yan et al., 2010). Our results differ from these, which may be attributed to the structure of the population. Minor genes are very population-specific and in

each population, different minor genes might contribute to the final phenotypic variability of physiochemical properties.

Appendices: Chapter 4

Appendix 5: Full list of 233 studied Australian rice genotypes and their pedigree information.
Appendix 6: Name and characteristics of SNPs genotyped in the rice population.
Appendix 7: The results of the association study among 13 physiochemical traits and SNPs of 18 different starch-related genes.
Appendix 8: Linkage map of 17 starch-related genes, showing the approximate chromosomal location of each gene.

CHAPTER 5

Rice GPT1 SNP associated with resistant-retrograded starch

Summary

Resistant-retrograded starch is widely associated with human health. The highly retrograded starches of cereals usually have a lower glycemic index (GI), which may be beneficial in many human diets. Presented here is evidence that the GPT1 gene, which acts early in the biochemical pathway of starch synthesis and encodes the glucose-6-phosphate translocator, has a major influence on resistant starch production in rice. A T/C SNP at position 1188 of the GPT1 gene alters Leu42 to Phe and is associated with resistant-retrograded starch and amylose content. The T and C alleles produce high and low levels of retrograded starch, respectively. An association study of 233 genotypes demonstrated a significant correlation (R2) of 0.57 and 0.36 (P= ) between this SNP and retrogradation degree and apparent amylose content, respectively. Haplotype and association analysis of this SNP and another G/T SNP at the boundary site of exon/intron 1 in the GBSSI gene can explain most of the variability of retrogradation degree and amylose content in this rice population. These two SNPs, the T allele in GPT1 and the G allele in GBSSI, combine to produce higher levels of resistant-retrograded starch and may provide a new tool for deliberate selection of rice genotypes for specific functional and nutritional outcomes, such as resistant-retrograded starch and high amylose content non-sticky rices.
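The idea of combining the GPT1 and GBSSI SNPs into haplotypes can be sketched as a simple grouping step: each line is labelled by its allele at both loci and the trait is averaged within each combination. This is only an illustration of the grouping logic, not the haplotype analysis used in the thesis; the function name `haplotype_means` and the trait scores are made-up assumptions.

```python
from collections import defaultdict


def haplotype_means(records):
    """Average a trait within each two-locus allele combination.

    records: iterable of (gpt1_allele, gbssi_allele, trait_value).
    Returns a dict keyed by (gpt1_allele, gbssi_allele).
    """
    groups = defaultdict(list)
    for gpt1, gbssi, value in records:
        groups[(gpt1, gbssi)].append(value)
    return {hap: sum(vals) / len(vals) for hap, vals in groups.items()}


# Made-up retrogradation scores, for illustration only.
toy_lines = [
    ("T", "G", 8.0), ("T", "G", 9.0),   # T in GPT1 with G in GBSSI
    ("C", "T", 2.0), ("C", "T", 3.0),   # C in GPT1 with T in GBSSI
    ("T", "T", 5.0),
]
for haplotype, mean in sorted(haplotype_means(toy_lines).items()):
    print(haplotype, mean)
```

Comparing the group means (or fitting the combined label as a single factor in an association test) indicates whether the two loci together explain more of the trait than either SNP alone.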

Introduction

Resistant starch is a major contributor to starch quality with a significant impact on human health (Sajilata et al., 2006). The incomplete digestion and absorption of resistant starch in the small intestine leads to non-digestible starch fractions with a physiological function similar to the beneficial impact of dietary fibre in food (Asp and Björck, 1992). On the other hand, the formation of resistant starch due to retrogradation results in the hardening of cooked starch after cooling, due to re-crystallization of gelatinized starch components, leading to loss of desirable food texture during the storage of some starch-containing foods (Fan and Marks, 1998). The staling of bread and the hardening of pasta or rice on refrigeration after cooking are examples of this process. The concept of starch retrogradation and appropriate methods to measure and score its rate in rice have already been described (Philpot et al., 2006). It is believed there is a significant association between retrograded and resistant starches (Sajilata et al., 2006) and so in this study the term retrograded-resistant starch in rice is used. The in vivo digestibility and structural features of resistant-retrograded starch with high amylose content in maize, bean and potato flakes were assessed using the ileal contents of different human populations (Faisant et al., 1993), and it was found that resistant starch consisted mainly of retrograded amylose with a degree of polymerization of approximately 35 glucose units and a melting temperature of 150 °C. Retrograded amylose in peas, maize, wheat and potatoes was found to be highly resistant to amylolysis and digestion (Ring et al., 1988). The factors which might have a direct or indirect influence on the rate of retrogradation, firmness and resilience of rice starch after cooking are amylose, protein and lipid contents (Philpot et al., 2006).
Highly retrograded cooked rice has a low hydrolysis index (HI) and glycemic index (Hu et al., 2004) while waxy and low amylose rice shows more rapid and complete hydrolysis (Chung et al., 2006; Hu et al., 2004).

111 The Rapid Visco Analyser (RVA) has been widely used to measure grain quality parameters (Limpisut and Jindal, 2002). Hu et al. (2004) reported that high amylose rice cultivars are normally characterized by low RVA parameters, such as peak viscosity (PKV), hot paste viscosity (HPV) and cool paste viscosity (CPV), with higher resistant starch (RS) content and lower estimated glycemic score (EGS). Yan et al. (2010) analysed the association of 17 starch synthesis-related genes with RVA profile parameters in a collection of 118 glutinous rice accessions using 43 gene-specific molecular markers. They concluded that 10 of the 17 starch-related genes are involved in controlling RVA profile parameters. The association analysis revealed that pullulanase plays an important role in the control of peak viscosity (PKV), hot paste viscosity (HPV), cool paste viscosity (CPV), breakdown viscosity (BDV), peak time (PeT) and paste temperature (PaT) in glutinous rice. Alleles associated with starch quality have been characterized. Granule bound starch synthase (GBSSI) is the most important gene involved in starch synthesis in rice and other cereal grains. A number of SNPs in rice GBSSI (the waxy gene), at the intron/exon 1 junction site and in exons 6 and 10, with significant impact on starch quality have been characterized (Chen et al., 2008a, b; Cai et al., 1998; Larkin and Park, 2003). Starch synthase IIa (SSIIa) is also known to have a major effect on starch quality and is exclusively expressed in the endosperm at very high levels. SSIIa affects the amylopectin structure of starch (Craig et al., 1998; Morell, 2003). The influence of this gene on cooking quality and starch texture has been studied extensively (Umemoto et al., 2008; Umemoto et al., 2004). Umemoto and Aoki (2005) explained the alkali disintegration and eating quality of rice starch by two polymorphisms, [A/G] and [GC/TT], in SSIIa.
These SNPs within exon 8 of SSIIa are also significantly associated with gelatinisation temperature (GT) (Waters et al., 2006). Although the effect of many starch related genes on grain quality has been widely studied, little is known about how polymorphisms in starch related genes influence starch quality

112 parameters, except for those reported for GBSSI and SSIIa. In fact, most studies have focused on comparison of gene-deficient mutants (Fujita et al., 2006) at the gene level rather than the DNA sequence level, probably due to a lack of high-throughput technologies to discover new variants in widely diverse populations. The emergence of new technologies such as next generation sequencing and multiplexed MALDI-TOF has removed the limitations of traditional sequencing and genotyping methods and improved the efficiency of SNP-trait analysis in plants. The glucose-6-phosphate translocator (GPT) was first isolated from plastid envelope membranes of maize (Zea mays) endosperm (Kammerer et al., 1998). GPT acts early in the starch biosynthesis pathway and controls the production of precursors for starch and fatty acid biosynthesis. Plant genomes normally contain two functional homologous GPT genes, GPT1 and GPT2, both of which have glucose 6-phosphate translocator activity in the plastids of non-green tissues and can import carbon in the form of glucose 6-phosphate. Mutation in the GPT genes of Arabidopsis disrupts starch synthesis and the oxidative pentose phosphate cycle, which in turn affects fatty acid biosynthesis and oil accumulation (Niewiadomski et al., 2005; Wakao et al., 2008). Sequencing 17 starch-related genes in rice using a long-range PCR protocol combined with massively parallel sequencing discovered a number of novel SNPs in the GPT1 gene, indicating that this gene potentially has an influence on rice starch quality properties. This study reports a novel SNP in the rice glucose-6-phosphate translocator 1 (GPT1) gene closely associated with resistant-retrograded starch and amylose content and identifies an allelic combination with the waxy gene which explains most of the variability in retrogradation degree and amylose content in rice.

113 Materials and methods Plant materials A population of 233 F6 lines from the Australian temperate (japonica-type) rice breeding program was supplied by Industry and Investment NSW, Yanco Agricultural Research Institute, Australia. Selections for the capacity to flower and set seed and the morphological traits of plant height, grain size and shape had been made on this population. No selection had taken place for quality traits. Physiochemical properties Thirteen physiochemical traits including four phenotypic and RVA characteristics were measured. The phenotypic traits, consisting of apparent amylose content (AC), gelatinization temperature (GT), grain chalkiness and retrogradation rate [scored by the Martin test (Philpot et al., 2006)], were quantified according to standard methods. RVA characteristics such as peak viscosity (PKV), trough viscosity (TV), final viscosity (FV), breakdown, setback, peak time (PKT) and pasting temperature (PT) were measured by a Rapid Visco Analyser (Perten RVA 4500, Segeltorp, Sweden) according to the manufacturer's instructions. Designation of starch-synthesis genes involved in starch metabolism The available literature was used to identify the most likely candidate genes associated with rice starch quality (Ohdan et al., 2005; Waters and Henry, 2007; Nakamura, 2002; Hirose et al., 2006; Rahman et al., 2000). The general entries of nucleotide sequences (gDNA) and full-length cDNAs of gene classes involved in starch biosynthesis were retrieved from the NCBI

114 and the Rice Genome Annotation Project databases. Amplification primers were designed based on consensus sequence alignment of each candidate gene. SNP/Indels were identified by long range PCR and massively parallel sequencing (Illumina GAII) of the starch biosynthesis genes in this population (Kharabian-Masouleh et al., 2011). Discovery of novel SNP in GPT1 and SNP genotyping A C/T SNP at reference position 1188 of GPT1 was found in this breeding population; it changes an amino acid from Leu to Phe (Leu42Phe). The position and function of a SNP at the boundary of intron/exon 1 of GBSSI (waxy gene) have been well characterized (Cai et al., 1998; Isshiki et al., 1998). A specific multiplexed mass spectrometry assay (Sequenom MassARRAY) was designed for simultaneous genotype analysis of each of these SNPs according to Masouleh et al. (2009) with modification of the sequences of the capture and extension primers (Table 1). Association analysis SNP data and phenotypic data were analysed in TASSEL v2.1 (Bradbury et al., 2007) to identify SNPs associated with physiochemical properties. The input genotypic and phenotypic files were prepared according to Bradbury et al. (2007) and then imported into the software. The general linear model (GLM) was used for the association analysis with 1000 permutations. Results Two SNPs were genotyped in 233 individuals. To avoid complications in the association study, a gene by gene approach was applied. The results of the association study for each
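The single-marker association test described above can be sketched in a few lines of Python. This is a minimal illustration and not the TASSEL implementation: it assumes a one-way ANOVA of a trait on the allele at one SNP, reports the F statistic and R2 (the share of trait variation explained by the SNP), and estimates a permutation p-value by shuffling the trait values. The function names `f_and_r2` and `permutation_p` and the toy data in the usage note are hypothetical.

```python
import random


def f_and_r2(alleles, trait):
    """One-way ANOVA F statistic and R2 for trait values grouped by allele."""
    groups = {}
    for allele, value in zip(alleles, trait):
        groups.setdefault(allele, []).append(value)
    n, k = len(trait), len(groups)
    if k < 2:
        raise ValueError("at least two alleles are needed")
    grand_mean = sum(trait) / n
    ss_total = sum((v - grand_mean) ** 2 for v in trait)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    ss_within = ss_total - ss_between
    if ss_within <= 0:
        # Perfect separation of the groups: F is unbounded.
        return float("inf"), 1.0
    f_value = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_value, ss_between / ss_total


def permutation_p(alleles, trait, n_perm=1000, seed=0):
    """Permutation p-value: share of shuffles with F at least as large as observed."""
    rng = random.Random(seed)
    observed, _r2 = f_and_r2(alleles, trait)
    shuffled = list(trait)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        f_perm, _r2 = f_and_r2(alleles, shuffled)
        if f_perm >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

For example, `f_and_r2(["C"] * 5 + ["T"] * 5, [0.3, 0.4, 0.2, 0.5, 0.35, 2.6, 2.8, 2.7, 2.9, 2.5])` gives a large F and an R2 close to 1 for this invented data. With the add-one correction used here, the smallest p-value that 1000 permutations can return is 1/1001.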

115 gene were then related to the physiochemical properties to find the combinations which cause the highest and lowest retrogradation degree and amylose content.
Table 1. MassARRAY primers for GPT1 and GBSSI.
Primer; Sequence 5′-3′
GPT1_GA_Ref1188ER F; *ACGTTGGATGGCTTCGGTTTCATCTGTCTC
GPT1_GA_Ref1188ER R; *ACGTTGGATGTAGTGGTGCAAGGTAGAGTG
GPT1_GA_Ref1188ER E; AAGGTAGAGTGGTCTGA
GBSSI_EXIN1 F; *ACGTTGGATGGATCGATCTGAATAAGAGGG
GBSSI_EXIN1 R; *ACGTTGGATGCTGCTTGTGTTGTTCTGTTG
GBSSI_EXIN1 E; AGGAAGAACATCTGCAAG
*A 10-mer tag, sequence 5′-ACGTTGGATG-3′, was added to the 5′ end of each amplification primer to avoid confusion in the mass spectrum and improve PCR performance. E, extension primer.
GPT1 (Glucose-6-phosphate translocator) GPT1 is found early in the starch biosynthesis pathway (Fig 2). Theoretically, any polymorphism in the coding regions or critical domains of genes can influence the starch properties. GPT1 is strongly expressed in the endosperm and imports essential carbon substrates such as Glc6P into plastids during grain development (Fischer and Weber, 2002; Jiang et al., 2003). A number of SNP/Indels in GPT1 and a novel non-synonymous C/T SNP at position 1188 of the gene were detected. This SNP generates two alleles that encode either a Leu or a Phe. The results of this association study revealed a significant association between this SNP and some physiochemical properties of rice starch. The C/T SNP showed an association with amylose content, predicted N, difference and set back (Table 2). However, the strongest association was found between this SNP and retrogradation degree. The R2 value for this important starch property was 0.57. Apparent amylose content, which is one of the other critical components of starch, also has a strong association with this SNP, with an R2 value of 0.36. The C allele in GPT1 results in the lowest degree of

116 retrogradation of about 0.34 while the T allele conversely renders the highest value of 2.74 (Fig 1a). Highly retrograded resistant starch releases glucose monomers very slowly, which is highly desirable in human diets. Fig 1b also shows how
Figure 1. Effect of GPT1 and GBSSI SNPs on retrogradation degree and amylose content (%). (a) Allele C in GPT1 represents the low retrogradation rate and (b) low amylose content, whereas T produces the highest values in both studied traits (±SD=0.34 and 1.57, respectively). (c) Haplotype combinations of the studied SNPs in GPT1 and GBSSI which create high and low retrogradation degree and (d) amylose content (%) (±SD=0.34 and 1.57, respectively).

117 Table 2. Physiochemical properties associated with C/T SNP in GPT1. The R2 values show the portion of total variation explained by the SNP in the GPT1 gene.
Trait: Peak viscosity (PV); Break down (BD); Final viscosity (FV); Set back viscosity (SB); Retrogradation degree (Martin-test); Amylose content (AC); Predicted N; Difference
F-test p-value: 7.10E E E E E E E E-20
p-value adjusted: 9.99E E E E E E E E-04
R2:
*Values<0.01 are significant; p-values were adjusted for multiple tests with the number of permutations set to 1000.
Figure 2. Simplified biochemical pathway of starch synthesis in cereals. GPT1 directly affects the structural fatty acids of amylose and results in high and low resistant-retrograded starch with the T and C alleles, respectively.
different alleles of this gene influence the production of amylose, in which C and T generate the lowest and highest amylose content of 17.8 and 24.7%, respectively.

118 GBSSI (Granule bound starch synthase I) GBSSI is probably the most important gene involved in starch synthesis in rice and other cereal grains. This association study showed a strong correlation between the WaxyIN1 (G/T) SNP at the junction site of exon 1/intron 1 and important RVA curve characteristics such as peak viscosity (PKV), set back and breakdown. This SNP has an influence similar to GPT1 on physiochemical properties of rice starch (Table 3). The association study showed significant F-values of and for this G/T SNP, indicating significant association with retrogradation degree (Martin test) and amylose content, respectively. The R2 values were 0.66 and 0.51 for retrogradation and amylose content, respectively, suggesting this SNP also controls these important traits and can explain a substantial portion of variability in this rice population. Individuals with the T allele exhibited the lowest retrogradation degree of 0.730, whereas the G allele gave highly retrograded resistant starch (2.60) with high amylose content. The amylose content for the T and G alleles was 17.5 and 24.5, respectively (Fig 1c and 1d). Allelic combination of SNPs in GPT1 and GBSSI An association between allelic combinations of GPT1 and GBSSI and the control of retrogradation degree and amylose content was detected. The T:G GPT1:GBSSI allelic combination produces the highest amylose content and amount of retrograded resistant starch (Fig 1c and 1d). Conversely, the C:T allelic combination (GPT1:GBSSI) produces the lowest retrogradation and amylose content. Other combinations of T:G and C:T SNPs resulted in values of and % for retrogradation and amylose content, respectively. Some other starch related genes such as BEI, SSI and SSIIa may also play a complementary role.
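The allelic combinations described above can be written as a small lookup. The following is a hedged sketch only: the function name `classify_haplotype` and the three class labels are hypothetical, and the rule simply counts how many of the two alleles reported as favourable for retrogradation and amylose content (T in GPT1, G in GBSSI) a line carries.

```python
def classify_haplotype(gpt1_allele, gbssi_allele):
    """Label a GPT1:GBSSI allele pair by its expected retrogradation/amylose class.

    Assumption for illustration: T at the GPT1 SNP and G at the GBSSI SNP
    each push the traits upward, as reported in the text.
    """
    if gpt1_allele not in ("C", "T"):
        raise ValueError("GPT1 allele must be 'C' or 'T'")
    if gbssi_allele not in ("G", "T"):
        raise ValueError("GBSSI allele must be 'G' or 'T'")
    # Count the alleles associated with higher retrogradation and amylose.
    favourable = (gpt1_allele == "T") + (gbssi_allele == "G")
    return {2: "high", 1: "intermediate", 0: "low"}[favourable]
```

With this rule, `classify_haplotype("T", "G")` returns `"high"` and `classify_haplotype("C", "T")` returns `"low"`; the two remaining combinations fall into the intermediate class.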

119 Table 3. Physiochemical properties associated with G/T SNP in GBSSI. The R2 values show the portion of total variation explained by the waxy1 SNP in the GBSSI gene.
Trait: Peak viscosity (PV); Break down (BD); Final viscosity (FV); Set back (SB); Retrogradation degree (Martin-test); Amylose content (AC); Predicted N; Difference
F-test p-value: 8.87E E E E E E E E-04
p-value adjusted: 9.99E E E E E E E E-04
R2:
*Values<0.01 are significant; p-values were adjusted for multiple tests with the number of permutations set to 1000.
Discussion Resistant starch can play an important role in human nutrition and health. Resistant starch is digested more slowly than non-resistant starch and releases glucose slowly into the blood stream, resulting in a low glycemic index (GI). An in vitro enzymatic starch digestion study showed a close relationship between resistant starch, amylose content, retrogradation and glycemic index (Hu et al., 2004). The study revealed that high amylose rice cultivars, characterized by low major RVA parameters, such as peak viscosity, hot paste viscosity and cool paste viscosity, had more resistant starch and a lower estimated glycemic index (Hu et al., 2004). When the retrogradation degree is higher, the starch is more resistant to digestion and the GI is lower. In this study, a significant association was found between a T/C SNP at position 1188 of GPT1, which alters Leu42 to Phe, and the presence of resistant-retrograded starch and high amylose content, in which the T and C alleles produce a high and low retrogradation rate, respectively. It is believed amylose content has a significant influence on retrogradation rate (Hu et al., 2004) but some studies show that these two important starch properties might work independently from one another.

120 Table 4. Allelic combinations of GPT1 and GBSSI represent different classifications of amylose content and retrogradation degree.
Classification; GPT1; GBSSI; Retrogradation degree (M-test); Amylose content (%); *Status; No of lines in each class
Group 1-High; T:T; G:G; 2.840±0.73; 24.45±1.63; High AC and Ret; 15
Group 2-High medium; C:C; G:G; 1.577±; ±1.08; High-medium AC and Ret; 1
Group 3-Low medium; C:C; G:T; 1.032±; ±1.16; Low-medium AC and Ret; 5
Group 4-Low; C:C; T:T; 0.679±; ±3.12; Low AC and Ret; 205
*AC=Amylose content; Ret=Retrogradation; M-test=Martin test. The values are presented as Mean±SD.
Panlasigui et al. (1991) revealed that rice cultivars with very similar amylose content have different digestibility and glycemic index in humans, suggesting that some other mechanisms such as retrogradation must be involved in the process. In spite of a correlation coefficient of 0.70 between amylose content and retrogradation degree in this study, the conclusion here is that these two traits work independently but have some contributing influence on each other. Major genes such as GBSSI and SSIIa and their functional SNPs have a major influence on amylose and amylopectin content in cereals (Nakamura et al., 2005; Umemoto et al., 2004b; Yamamori et al., 2006). Glucose 6-phosphate/phosphate translocator (GPT1) imports carbon resources into non-photosynthetic plastids (Kammerer et al., 1998) and it appears to be a key gene controlling retrogradation degree in rice. Andriotis et al. (2010) conducted an experiment to determine the importance of GPT1 in embryo development in Arabidopsis. They reported a major influence of GPT1 on seed development, where a strong reduction in activity of this gene resulted in abortion of the embryo due to ultra-structural and biochemical defects including proliferation of starch granules.
It was proposed that GPT1 is necessary for early embryo development because it catalyses the import of glucose 6-phosphate into plastids as the substrate for NADPH generation via the oxidative pentose phosphate pathway (Andriotis et al., 2010). Loss of GPT1 activity in developing bean embryos has large effects on storage product synthesis (Rolletschek et al.,

121 2007). A similar loss of, or variation in, activity (caused by SNPs) in the biochemical pathway of GPT1 can change the constitution of starch and particularly amylose content, which normally accounts for 20-30% of starch content. GPT1 is important in fatty acid synthesis in oilseeds, where oil can account for up to 30-40% of dry matter. In Brassica species, application of exogenous Glc6P changed the activity of GPT1, and the uptake and metabolism of Glc6P to fatty acids through plastidial glycolysis was altered significantly (Eastmond and Rawsthorne, 2000; Hutchings et al., 2005). The influence of GPT1 on starch retrogradation may be explained by its role in fatty acid synthesis. The role of fatty acids and lipids in the helical structure of amylose has long been studied (Tester and Morrison, 1990). It is thought lipids play a structural role as a core scaffold holding together the helical architecture of amylose and it has been suggested amylose content is correlated with lipid content (Morrison, 1988). Philpot et al. (2006) reported that removal of lipids significantly increased the retrogradation rate and the firmness of rice starch gels. They found O. sativa cv Koshihikari grown in Japan had a lower retrogradation rate than O. sativa cv Koshihikari grown in Australia, despite the fact that flour from both origins was 18% amylose. This can be attributed to the amount of long amylose chains complexed with lipids. Apparently, the amount of long amylose chains associated with lipid is greater for the Japanese rice, and the higher lipid content linked to long amylose chains explains the lower retrogradation in the Japanese rice. It is possible then that GPT1 influences retrogradation degree via its influence on fatty acid content rather than by directly influencing amylose content. The lipids complex with long chain amylose and relatively high concentrations of lipid disrupt recrystallisation, lowering the extent of starch retrogradation.

122 CHAPTER 6 A high-throughput assay for rapid and simultaneous analysis of perfect markers for important quality and agronomic traits in rice using multiplexed MALDI-TOF Mass Spectrometry Summary The application of single nucleotide polymorphisms (SNPs) in plant breeding involves the analysis of a large number of samples, and therefore requires rapid, inexpensive and highly automated multiplex methods to genotype the sequence variants. A high-throughput multiplexed SNP assay for eight polymorphisms which explain two agronomic and three grain quality traits in rice was optimised. Gene fragments coding for the agronomic traits plant height (semi-dwarf, sd-1) and blast disease resistance (Pi-ta) and the quality traits amylose content (waxy), gelatinization temperature (alk) and fragrance (fgr) were amplified in a multiplex polymerase chain reaction. A single base extension reaction carried out at the polymorphism responsible for each of these phenotypes within these genes generated extension products which were quantified by a matrix-assisted laser desorption ionization-time of flight system. The assay detects both SNPs and indels and is co-dominant, simultaneously detecting both homozygous and heterozygous samples in a multiplex system. This assay analyses eight functional polymorphisms in one 5 μl reaction, demonstrating the high-throughput and cost-effective capability of this system. At this conservative level of multiplexing, 3072 assays can be performed in a single 384-well microtitre plate, allowing the rapid production of valuable information for selection in rice breeding.

123 Introduction Single nucleotide polymorphisms (SNPs) are the most abundant class of sequence variation and explain the occurrence of human genetic disease (Shastry, 2002) and many important traits in plants (Bryan et al., 2000; Kennedy et al., 2006). The high frequency of SNPs in many plant species, including rice, where comparison of data from japonica and indica cultivars identified one SNP every 170 bp and one indel every 540 bp (Yu et al., 2002), in combination with their genome-wide distribution (Garg et al., 1999; Drenkard et al., 2000; Nasu et al., 2002; Batley et al., 2003), means that they have the capacity to generate high-resolution genetic maps (Bhattramakki et al., 2002). The capacity for high resolution means SNP markers are an attractive tool for gene identification. When identified, causal SNPs are the perfect markers within marker-assisted selection programs (Gupta et al., 2001; Rafalski, 2002; Batley et al., 2003). Several techniques have been developed to assay SNPs, including SNP microarray hybridization-based methods (Rapley and Harbron, 2004) and enzyme-based methods including those involving the use of DNA ligase, polymerase and nuclease (McGuigan and Ralston, 2002; Olivier, 2005; Costabile et al., 2006; Gunderson et al., 2006). Other methods, such as Pyrosequencing (Ahmadian et al., 2000), and PCR based approaches (Hayashi et al., 2004) including TaqMan (Livak, 1999) have been designed for SNP and indel detection; however, they are generally not cost- or time-effective per sample. PCR-based markers are preferable because they are efficient, cost-effective and require only a small quantity of genomic DNA for genotyping, and are thus suitable at all stages of plant growth, including early seedling stages. An increasing number of genes controlling important traits in plants are being discovered, and the underlying polymorphisms can be converted into perfect molecular markers.
Some recent examples of perfect markers for important traits in plants include rice fragrance

124 (Bradbury et al., 2005), wheat grain hardness (Morris, 2002), rice blast resistance (Kennedy et al., 2006) and a range of other disease resistance genes (Jeong et al., 2002); however, each of these has been a single-trait, uniplex assay. Plant breeders often track and select for more than one trait within any one cross, and as the number of genes which control important traits expands, the need for rapid, simple, inexpensive, reliable multiplex genotyping methods will become more urgent (Hayashi et al., 2004). The objective of this study was to investigate the capability of the multiplex matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry system (Sequenom MassARRAY, San Diego, CA, USA) as a high-throughput platform for the rapid, simultaneous and robust multiplex assay of SNPs responsible for important agronomic and grain quality traits in rice. In this article, an assay for distinguishing between eight different important polymorphisms simultaneously in a single 5 μl reaction is reported. Materials and methods Genotypes All plant material was supplied by the Australian Plant DNA Bank. Twenty-five commercial rice cultivars were analysed: Amaroo, Amber, Basmati 370, BL24, Calrose, Calmochi 202, Dawn, Della, Dellmont, Domsorkh, Doongara, Dragon Eye Ball, Goolarah, Jarrah, Jasmine, Kyeema, Khao Dawk Mali 105, L202, Langi, Millin, M7, Nipponbare, Opus, Teqing and YRF204. DNA extraction Total plant DNA was extracted from individual seedlings at 10 days after germination using a Qiagen (Valencia, CA, USA) DNeasy Plant Kit, according to the manufacturer's instructions.

125 Primer design/generation of SNP markers Capture and extension primers were designed by Sequenom MassARRAY Assay design 3.1 software, with the exception of the sd-1del primers which were designed by Primer3 (frodo.wi.mit.edu). The optimal amplicon size containing the polymorphic site in the software was set to bp. A 10-mer tag (5′-ACGTTGGATG-3′) was added to the 5′ end of each amplification primer to avoid confusion in the mass spectrum and to improve PCR performance. Capture PCR protocol Platinum Taq DNA Polymerase (Invitrogen, Carlsbad, CA, USA) in a final volume of 5 μl was used for all capture PCRs. The eight-plex reaction was optimized by testing a number of capture primer and MgCl2 concentrations in the ranges μm and μm, respectively. Uniplex assays using identical PCR conditions confirmed the results of all eight-plex experiments. The optimal eight-plex capture PCR consisted of 3-5 ng of template DNA, 0.5 μl of 10× PCR buffer (Invitrogen), 3 mM MgCl2, 2.5 mM of each deoxynucleoside triphosphate (dNTP), 5 μM of each primer and 1 unit of Taq polymerase (5 U/μl). The reactions were heated to 94 °C for 15 min, followed by 45 cycles of amplification at 94 °C for 20 s, 56 °C for 30 s and 72 °C for 1 min, followed by a final extension at 72 °C for 3 min. As the sd-1del is relatively large, the amplification protocol was modified as follows: 3.75 μl of 10× PCR buffer (50 mM), 2.25 μl of MgCl2 (50 mM), 2.1 μl of 10 μM primers (each), 6 μl of dNTPs (2.5 mM), 12 μl of 2× Enhancer [6% glycerol + 10% dimethyl sulphoxide (DMSO)], 0.3 μl of Platinum Taq polymerase (5 U/μl) and 1.5 μl of template. The thermocycling program was 94 °C for 5 min, followed by 45 cycles of amplification at 94 °C for 30 s, 55 °C for 30 s and 72 °C for 1 min, followed by a final extension at 72 °C for 3 min. Finally, 1 μl of PCR product was added to the multiplex test tubes.
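The two thermocycling programs above can be compared by adding up their programmed hold times. The sketch below is illustrative arithmetic only: the helper name `program_minutes` is hypothetical, and the totals exclude temperature ramping between steps, which depends on the instrument.

```python
def program_minutes(initial_s, cycles, step_seconds, final_s):
    """Total programmed hold time in minutes (ramping between steps excluded)."""
    return (initial_s + cycles * sum(step_seconds) + final_s) / 60


# Eight-plex capture PCR: 15 min, then 45 x (20 s, 30 s, 60 s), then 3 min.
eight_plex_min = program_minutes(15 * 60, 45, (20, 30, 60), 3 * 60)

# Modified sd-1del PCR: 5 min, then 45 x (30 s, 30 s, 60 s), then 3 min.
sd1del_min = program_minutes(5 * 60, 45, (30, 30, 60), 3 * 60)

print(eight_plex_min, sd1del_min)  # 100.5 98.0
```

Under these assumptions both programs hold for roughly 100 minutes, so the uniplex sd-1del amplification adds little run time beyond the need for a separate reaction.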

126 Shrimp alkaline phosphatase (SAP) incubation Unincorporated dNTPs were removed by SAP incubation according to the manufacturer's (Sequenom, San Diego, CA, USA) instructions.

127 Table 1. MassARRAY markers for eight different functional polymorphisms
Polymorphism; Trait; Capture primers; Extension primer; Expected polymorphism
sd-1snp; Semi-dwarf; F:*CGATGTTGATGACCATGGCG R:*CATCCTCCTCCAGGACGAC; AGGACGACGTCGGCGGC; [C/T]
sd-1del; Semi-dwarf**; F:*CACGCACGGGTTCTTCCAG R:*AGGAGTTCCATGATCGTCAG; GCGACAGCTCCTTCATCTCCTCGC; [C/T/A]
Pi-ta; Blast resistance; F:*GCTTCTTTCTTTCTCTGCCG R:*CAAACAATCATCAAGTCAGG; AAGTCAGGTTGAAGATGCATAG; [G/T]
waxyin1; Amylose content; F:*GATCGATCTGAATAAGAGGG R:*CTGCTTGTGTTGTTCTGTTG; CAGGAAGAACATCTGCAAG; [G/T]
waxyex6; Amylose content; F:*ACCTCAACAACAACCCATAC R:*GATCATCATGGATTCCTTCG; CCCATACTTCAAAGGAACTT; [C/A]
alk3; Gelatinization temp; F:*TGTCCTCGAACGGGTCGAAC R:*CTCAACCAGCTCTACGCCAT; CTTCTGCGGGCTGAGGGACACC; [A/G]
alk4; Gelatinization temp; F:*TGACAAGGACCTCCTCGTAG R:*CGCAAGTACAAGGAGAGCTG; AAGGAGAGCTGGAGGGG; [GC/TT]
fgr; Fragrance; F:*ACCTCAACAACAACCCATAC R:*GTTAGGTTGCATTTACTGGG; TGGGAGTTATGAAACTGGTA; [TATAT/AAAAGATTATGGC]
* A 10-mer tag, sequence 5′-ACGTTGGATG-3′, was added to the 5′ end of each amplification primer to avoid confusion in the mass spectrum and improve PCR performance.
** A modified method was applied to amplify this allele.
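The tagging step in the table footnote amounts to a string prefix. A minimal sketch follows, using the waxyin1 capture primers from the table; the constant `TAG` and the function `tag_primer` are names introduced here for illustration only.

```python
TAG = "ACGTTGGATG"  # 10-mer tag given in the table footnote


def tag_primer(sequence):
    """Return the capture primer with the 10-mer tag added to its 5' end."""
    return TAG + sequence.upper()


waxyin1_forward = tag_primer("GATCGATCTGAATAAGAGGG")
waxyin1_reverse = tag_primer("CTGCTTGTGTTGTTCTGTTG")
print(waxyin1_forward)  # ACGTTGGATGGATCGATCTGAATAAGAGGG
print(waxyin1_reverse)  # ACGTTGGATGCTGCTTGTGTTGTTCTGTTG
```

The two tagged sequences are identical to the GBSSI_EXIN1 amplification primers listed in Table 1 of Chapter 5, which are given there with the tag already attached.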

128 Primer extension and mass spectrometry The remaining assay steps of primer extension, resin cleanup and mass spectrometry were undertaken according to the manufacturer's (Sequenom MassARRAY) instructions. Results Analysis of PCR products Assays were constructed for eight polymorphisms defining each of the alleles of five genes controlling five important commercial traits. The traits and genes were semi-dwarf (sd-1, two alleles) (Sasaki et al., 2002; Spielmeyer et al., 2002), blast disease resistance (Pi-ta, one allele) (Bryan et al., 2000), amylose content (waxy, two SNPs) (Cai et al., 1998; Larkin and Park, 1999; Chen et al., 2008), gelatinization temperature (alk, two SNPs) (Umemoto, 2005; Waters et al., 2006) and fragrance (fgr, one allele) (Bradbury et al., 2005) (Table 1). Optimal capture primer concentration The optimal primer concentration for the amplification of each target polymorphism in uniplex and eight-plex was 0.3 μM. Polymorphism detection at eight-plex was consistent with uniplex data. Increasing the uniplex primer concentration to 0.5 μM led to PCR products of higher concentration, except for waxyin1, in which there was nonspecific amplification at this concentration. The concentrations of PCR products, as measured by a Bioanalyser 2100 (Agilent Technologies, Palo Alto, CA, USA) DNA 500 LabChip Kit, ranged from 7.8 ng/μl (sd-1 SNP) to 12.2 ng/μl (alk4) in uniplex (Figure 1a,b), and were relatively lower in eight-plex, ranging from 6.40 ng/μl (sd-1 SNP) to ng/μl (alk4), which was sufficient to produce an excellent mass spectrum (Figure 1c,d).

129 Figure 1. Concentration of PCR products in uniplex and eight-plex: (a) concentration of sd-1snp = 7.8 ng/µl (major peak), minor peaks correspond to size standard; (b) alk4 = 12.2 ng/µl (major peak), minor peaks correspond to size standard; (c) concentration of PCR products in eight-plex; (d) concentration of PCR products in an eight-plex which has been analysed individually (all in ng/µl). MgCl2 concentration MgCl2 concentration is one of the most important factors for accurate concurrent amplification of different loci in a multiplex system. The optimal concentration for the amplification of all loci in uniplex and multiplex was 3 mM. At this MgCl2 concentration, all target loci were amplified free from nonspecific amplicons and primer dimers. At lower MgCl2 concentrations of 2 and 2.5 mM, no target DNA was amplified and there were a surprising number of nonspecific bands and primer

130 dimers. At concentrations higher than 3 mM, nonspecific bands were present in addition to the target loci. These results were consistent and reproducible in both uniplex and eight-plex. Identification of SNPs and polymorphisms in agronomic and quality loci All eight loci were amplified in 25 cultivars and genotyped by multiplex MALDI-TOF analysis of single-base extension products, and the polymorphisms were compared (Table 2). Of these, three were responsible for important agronomic traits and five for grain quality traits, including six nucleotide substitutions and two insertions/deletions (indels). Polymorphisms were distinguished at all agronomic and quality loci, as described below. sd-1 The semi-dwarf phenotype is caused by a loss of function of the enzyme gibberellin 20-oxidase (GA 20-oxidase). Plants carrying the non-functional form of the gene, sd-1, have a diminished capacity to produce gibberellin, resulting in a reduced plant height and enhanced grain yield. Two alleles of sd-1 were assayed. One sd-1 allele, here called sd-1snp, contains a C/T SNP in exon 2 of the gene (CTC = leucine/TTC = phenylalanine), which does not affect phenotype as it causes a synonymous mutation (Spielmeyer et al., 2002; Monna et al., 2002). The other allele, here called sd-1del, is characterized by a 280-bp (Spielmeyer et al., 2002) or 278-bp (Sasaki et al., 2002) deletion of part of exon 1 and exon 2 and bp of the intron sequence, a bp deletion in total (Figure 2).

131 Table 2. Polymorphisms (SNP) in 25 commercial rice cultivars at eight different functional loci. Values in each row are listed in cultivar order.
Cultivars: Amaroo; Amber; Basmati 370; BL24; Calrose; Calmochi 202; Dawn; Della; Dellmont; Domsorkh; Doongara; Dragon Eye Ball; Goolarah; Jarrah; Jasmin; Kyeema; Khao Dawk Mali 105; L 202; Langi; Millin; M7; Nipponbare; Opus; Teqing; YRF 204
sd-1snp: T C C C C T C C C C C C C T T C C C T T C C T C C
sd-1del: A A A T A A A A T A T A A A T A A T A A T A A T T
Pi-ta: T T T G T T T T T T T T T T T T T T T T T T T G T
waxyin1: T G G G T T G G G G G T T T T T T T T T T T T G T
waxyex6: A C C A A A C C C C C A A A A A A A A A A A A A A
alk3: G G G G G G G G G G G A G G G G G G G A G A A G G
alk4: TT GC GC GC TT TT GC GC GC GC GC GC GC TT TT GC TT GC GC GC GC GC GC GC GC
fgr: AAAGATT TATAT TATAT AAAGATT AAAGATT AAAGATT AAAGATT TATAT TATAT TATAT AAAGATT TATAT TATAT AAAGATT TATAT TATAT TATA AAAGATT AAAGATT AAAGATT AAAGATT AAAGATT AAAGATT AAAGATT TATAT
Semi-dwarf/Tall: Semi-dwarf Semi-dwarf Semi-dwarf Tall Semi-dwarf Semi-dwarf Semi-dwarf Semi-dwarf Tall Semi-dwarf Tall Semi-dwarf Semi-dwarf Semi-dwarf Tall Semi-dwarf Semi-dwarf Tall Semi-dwarf Semi-dwarf Tall Semi-dwarf Semi-dwarf Tall Tall

Figure 2. Determination of the sd-1del allele on a 2% agarose gel. Fragments of around 300 bp indicate the 383-bp deletion in sd-1del, which is responsible for the semi-dwarf phenotype. Fragments of approximately 700 bp represent the intact gene of tall plants. Lanes from left to right: 100 bp ladder; negative control; rice varieties Nipponbare, Kyeema, Doongara, Amaroo, BL24, Della and Domsorkh.

Although a large deletion, such as sd-1del, can be determined by the size difference of amplification products on a simple 2% agarose gel (Figure 2), the suitability of MALDI-TOF for the identification of large indels was assessed. In theory, only one base (terminator) is added to the SNP site downstream of the extension primer. Therefore, accurate gene sequence information, particularly for the flanking region just before and after the indel, is necessary because single-base extension recognizes one base either inside or outside the indel. Theoretically, the indel can be determined by the ddNTP which terminates the extension reaction (Figure 3). However, when using the assay designed by Sequenom (MassARRAY Assay design 3.1) in both uniplex and eight-plex, no logical call was detected and all genotypes showed A, which corresponds to the sd-1del allele. Modification of the method substantially improved the analysis of this allele, reducing missing data from 43.7% to 5% (Table 3). The modification involved amplification of the region containing the deletion in uniplex using PCR primers designed by Primer 3, followed by the addition of these uniplex amplicons to the other loci, which had been amplified in seven-plex, for all subsequent manipulations.

Figure 3. Determination of the sd-1del allele by MALDI-TOF. Semi-dwarf plants carry a 383-bp deletion, so the extended single base (mass-modified terminator) matches the C or A located just after the deletion; tall plants instead give a peak for T.

Pi-ta

Pi-ta is a major blast resistance gene in rice. Pi-ta encodes a 928-amino-acid polypeptide with a molecular mass of 105 kDa. A [G/T] SNP distinguishes susceptible and resistant genotypes (Bryan et al., 2000); amino acid 918 differs between resistant and susceptible genotypes: all susceptible genotypes have a serine (T) at this position, whereas resistant plants have alanine (G). Most of the cultivars in this study carried the T allele that translates to serine (susceptible), whereas BL24 and Teqing contained the resistant G allele (alanine).

waxy

The waxy gene encodes the enzyme granule-bound starch synthase, which is one of the key factors influencing rice starch quality by affecting apparent amylose content (Sano, 1984; Webb, 1991; Chen et al., 2008). The [G/T] SNP at the intron 1/exon 1 splice site (waxyin1) differentiates between varieties of high and low amylose content (Cai et al., 1998) and, in combination with the exon 6 [C/A] SNP (waxyex6), differentiates between varieties of high, intermediate and low amylose content in southern US germplasm (Chen et al., 2008). Cultivars with T in waxyin1 and A in waxyex6 have the lowest amylose content or even

glutinous starch. High polymorphism was found at waxyin1 in the studied cultivars: Amber, Basmati 370, BL24, Dawn, Della, Dellmont, Domsorkh, Doongara and Teqing contained the G allele, whereas Jasmine, Nipponbare, Langi and M7 carried the T allele (Figure 4). At waxyex6, 18 of the 25 cultivars displayed A, which suggests low amylose content.

alk

The major gene regulating alkali disintegration in rice grains, alk (Gao et al., 2003), encodes the enzyme starch synthase IIa (Umemoto et al., 2004). Alkali disintegration is a convenient indirect measure of the gelatinization temperature of rice starch, which is, in turn, associated with rice cooking and eating quality. Two polymorphisms within exon 8 of alk, [A/G] (alk3) and [GC/TT] (alk4), are associated with gelatinization temperature class (Umemoto, 2005; Waters et al., 2006).

Figure 4. Sequenom MassARRAY waxyin1 uni-plex spectrum for cv Langi, which shows a peak for T.

Figure 5. An 8-plex Sequenom MassARRAY spectrum for cv Langi.

A combination of alk3 G and alk4 GC is found within varieties of high gelatinization temperature and low alkali spreading, whereas varieties with either alk3 A or alk4 TT are low gelatinization temperature varieties. Both the [GC/TT] (alk4) and [A/G] (alk3) polymorphisms were determined in all cultivars.

fgr

A recessive gene (fgr) on chromosome 8 controls rice fragrance. The intact Fgr allele encodes a betaine aldehyde dehydrogenase (BADH2) in non-fragrant rice, whereas fragrant rice contains an 8-bp deletion and three SNPs which prematurely terminate the translation of BADH2. This changes the biosynthetic pathways in which BADH2 is active, resulting in the accumulation of 2-acetyl-1-pyrroline, which is responsible for fragrance (Bradbury et al., 2005). The eight-plex assay identified 11 varieties with the fragrant allele fgr.
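The alk haplotype rule described above (alk3 G together with alk4 GC indicates high gelatinization temperature; alk3 A or alk4 TT indicates low) can be written as a small lookup. This is an illustrative sketch only: the function name `gt_class` is hypothetical, and heterozygous or missing calls are deliberately rejected rather than guessed.

```python
def gt_class(alk3: str, alk4: str) -> str:
    """Classify gelatinization temperature from the two exon 8 alk calls.

    Rule taken from the text: alk3 G with alk4 GC gives "high";
    alk3 A or alk4 TT gives "low". Any other call raises ValueError.
    """
    if alk3 not in {"A", "G"}:
        raise ValueError(f"unexpected alk3 call: {alk3!r}")
    if alk4 not in {"GC", "TT"}:
        raise ValueError(f"unexpected alk4 call: {alk4!r}")
    return "high" if (alk3 == "G" and alk4 == "GC") else "low"


if __name__ == "__main__":
    # Three haplotypes as recorded in Table 2.
    table2_examples = {
        "Amaroo": ("G", "TT"),
        "Amber": ("G", "GC"),
        "Dragon Eye Ball": ("A", "GC"),
    }
    for cultivar, (alk3, alk4) in table2_examples.items():
        print(cultivar, gt_class(alk3, alk4))
```

Keeping the rule in one function makes it easy to apply to every row of a genotype table and to see which lines would need phenotypic confirmation.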

Missing data and heterozygosity

The highest rate of missing data belonged to sd-1del in eight-plex, which suggests that this allele is not compatible with the multiplex system (Table 3). No missing data were found in waxyin1, Pi-ta, alk4 and fgr. The apparent heterozygosity values were 3.9% and 3.1% in sd-1snp and alk4, respectively.

Discussion

This report demonstrates that DNA polymorphisms can be efficiently confirmed and analysed in rice using a MALDI-TOF mass spectrometry system (Ding and Cantor, 2003). These assays can be used as a marker-assisted selection tool in conventional breeding programs. Rice has been at the forefront of the application of genomics and genomics tools to plant breeding and serves as a model for other crops. A whole rice genome sequence has been available for several years (Goff et al., 2002; Yu et al., 2002), and a comprehensive DNA polymorphism database has recently become available online. The availability of these resources has accelerated the rate at which gene function has been elucidated. Emerging DNA sequencing technologies are revolutionizing the field of genomics, bringing the reality of relatively inexpensive comparative genome sequencing of all the major crops much closer. MALDI-TOF mass spectrometry, in combination with comparative genome sequence data, will become increasingly useful in marker-assisted breeding as more genes that control important traits are identified.

An efficient PCR is the most important predictor for producing a reliable and consistent assay on this platform (Figure 5). The uniform simultaneous amplification of all loci will resolve the most commonly encountered problems (Siebert and Larrick, 1992). The number and intensity of correct SNP calls are increased with higher PCR product concentrations. The minimum concentration of PCR product is 4 ng/μl for loci, which falls within the default size of bp; however, longer PCR products require a higher concentration as measured by mass to

maintain the molar concentration at acceptable levels for iPLEX extension reactions. The concentration of PCR products differs between uniplex and eight-plex systems, which may have an effect on peak height calls. These differences are a result of competition between the individual PCRs in multiplex, and show 5.7% to 17.8% reductions in the final eight-plex PCR assay compared with the uniplex assay. Even spectral peak heights (Figure 5) are critical for accurate genotype calls using MALDI-TOF mass spectrometry, and this is achieved by increasing the concentration of individual extension primers, not by modifying capture PCR conditions, because the latter does not have a significant effect on the final spectra. PCR yield is intrinsic to the PCR conditions and, once optimized, these conditions should be adhered to; increasing the concentration of template, primer and Taq enzyme above the recommended concentration may increase yield in uniplex; however, in multiplex, it may lead to the generation of dimers and spurious PCR products.

Table 3. Percentage of missing data in uni-plex and 8-plex, and apparent heterozygosity in 8-plex.

Assay                                                sd-1snp  sd-1del  Pi-ta  waxyin1  waxyex6  alk3  alk4  fgr
Missing data, uni-plex*                              0%       8%       0%     0%       0%       1.1%  0%    0%
Missing data, 8-plex*                                4.5%     43.7%    0%     0%       4.2%     3.1%  0%    0%
Missing data, modified 8-plex†                       4.5%     5%       0%     0%       4.2%     3.1%  0%    0%
Apparent heterozygosity, 8-plex and modified 8-plex  3.9%     0%       0%     0%       0%       0%    3.1%  0%

* Assays designed with Sequenom MassARRAY Assay design 3.1.
† Assays designed with Sequenom MassARRAY Assay design 3.1 except for sd-1del, where PCR primers were designed by Primer 3; sd-1del was amplified in uni-plex and extended and analysed in 8-plex.

Accurate DNA sequence data for each polymorphism represent the most important prerequisite for accurate assay design. However, public domain databases and published papers can have conflicting data for each locus. For example, three different sequences for sd-1del appear in the public domain: the deletion has been reported to be 382 bp (Spielmeyer et al., 2002) or 383 bp (Monna et al., 2002; Sasaki et al., 2002), and the reports differ in the length of the intron and the exact location of the deletion. In cases such as this, re-sequencing the target region is necessary for accurate primer design, which ultimately leads to an accurate, consistent assay.

The capture PCR stage is important in uniplex reactions, but it is critical in the multiplex system because of the high rate of competition between primers consuming templates and enzyme. Some primers worked well in uniplex, but had missing calls in multiplex, suggesting that there were interactions between primers in eight-plex (Table 3). For example, interactions between waxyin1 and fgr increased the number of missing calls in eight-plex. There was, however, a high correlation of more than 98% between uniplex and eight-plex calls, and missing calls were around 0.15% and 1.68% (not including sd-1del) for uniplex and eight-plex respectively, which compares favourably with other sequencing methods (Jones et al., 2007).

Multiplex MALDI-TOF is a powerful tool for the detection and confirmation of SNPs in rice. It has been suggested that this platform has the capability of determining more than 40 SNPs in multiplex (Sequenom, 2006) and, given that the platform can process ten 384-well plates per day, users can theoretically analyse in excess of SNPs daily (Perkel, 2008). This technique can be applied to segregating populations in the early stages of breeding programs to positively select desired polymorphisms and traits, and is a co-dominant system, having the ability to detect alleles in hybrids, heterozygotes (Jones et al., 2007) and polyploids (Henry et al., 2008). The capacity of the system to accurately identify haplotypes at one or

more loci, alk and waxy for example, allows for the efficient selection of target phenotypes within breeding programs.
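The overall missing-call rates quoted in the Discussion (around 0.15% for uniplex and 1.68% for eight-plex, excluding sd-1del) can be approximated from the per-locus values in Table 3. The sketch below assumes these figures are unweighted means across the seven remaining loci; the source does not state how they were computed, so this is an assumption that happens to reproduce the reported values closely.

```python
# Per-locus missing data (%) transcribed from Table 3.
MISSING_UNIPLEX = {
    "sd-1snp": 0.0, "sd-1del": 8.0, "Pi-ta": 0.0, "waxyin1": 0.0,
    "waxyex6": 0.0, "alk3": 1.1, "alk4": 0.0, "fgr": 0.0,
}
MISSING_EIGHTPLEX = {
    "sd-1snp": 4.5, "sd-1del": 43.7, "Pi-ta": 0.0, "waxyin1": 0.0,
    "waxyex6": 4.2, "alk3": 3.1, "alk4": 0.0, "fgr": 0.0,
}


def mean_missing(rates, exclude=("sd-1del",)):
    """Unweighted mean missing-data rate (%) over loci not in `exclude`."""
    kept = [value for locus, value in rates.items() if locus not in exclude]
    if not kept:
        raise ValueError("no loci left after exclusion")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    print(f"uniplex:    {mean_missing(MISSING_UNIPLEX):.3f}%")
    print(f"eight-plex: {mean_missing(MISSING_EIGHTPLEX):.3f}%")
```

Passing `exclude=()` instead shows how strongly the sd-1del assay dominates the eight-plex missing-data rate when it is included.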

CHAPTER 7

General discussion - Characterisation of starch traits and genes in Australian rice germplasm

Background principles

Starch is a carbohydrate consisting of a large number of glucose units. A significant number of enzyme isoforms and activities contribute to starch synthesis in cereals, including rice. Therefore, a substantial number of genes are involved in the process of starch synthesis. A simplified pathway diagram of starch biosynthesis in Figure 1 (Chapter 1) shows how starch is synthesised by different enzymes and genes in plant green tissues and then deposited in the grain of cereals. Starch consists of two major components, amylose (~20-25%) and amylopectin (~75-80%). Variation in the genes and enzymes involved in the synthesis of starch can change the composition and structure of starch considerably, which can significantly affect the quality and palatability of rice. The variations normally occur at the DNA level due to spontaneous or induced mutations. SNPs are the most abundant type of variation in all organisms (Bryan et al., 2000; Kennedy et al., 2006). The main hypothesis of this thesis emerges from the fact that SNPs can change the gene, leading to alteration of enzymes, which in turn modifies the biochemical and physicochemical properties of starch (quality). For this purpose, a diverse set of Australian rice germplasm was obtained, the variation of starch-related genes at the SNP level was studied, and a comprehensive association study was pursued to ascertain the effect of each gene and its alleles on starch quality.

Search in SNP databases and discovery of polymorphisms

First, a range of databases such as OryzaSNP were interrogated with BLAST to identify the previously reported SNPs in the targeted genes. In total, 399 SNPs were detected in 18 starch-related genes in database records. In contrast, sequencing 233 Australian rice breeding lines resulted in the detection of 501 SNPs and 113 indels, 102 more SNPs than were available in the public domain. One of the advantages of this approach was the capacity to detect indels, none of which were recorded in public databases. However, all indels detected resided in introns and therefore had no obvious impact on gene function. Of the 501 SNPs, only 75 (~14.9%) were nonsynonymous, leading to amino acid changes. This study clearly demonstrated that massively parallel sequencing (MPS) in combination with long-range PCR (LR-PCR) allows analysis of many candidate genes and ensures high sequence depth at all loci (Pettersson et al., 2009).

There are a number of available databases which curate DNA polymorphism data that can be converted to DNA markers. Much time, money and intellectual energy has been expended in building and maintaining these databases, which are now being rendered redundant by MPS technologies. MPS allows markers to be easily developed for any population of interest at reasonable cost. This means questions can now be tailored for each population or species and answers provided with much greater precision than was possible with marker information derived from unrelated germplasm.

The error rate of the Illumina GAIIx and bias in coverage are challenges for this method and for the accuracy of the data. The error rate is reportedly about % (Out et al., 2009). In this experiment, 233 DNA samples were pooled, which means the detection of one SNP (variant) out of 233 corresponds to a SNP frequency of ~0.43%, which is lower than the reported error

rate. However, this error rate corresponds to single reads and does not take into account high coverage and the creation of a consensus sequence. The coverage at each base pair reported in this thesis was so high, generally ranging from to and in one gene reaching 240,000, that errors could be identified and screened out by imposing a minimum SNP requirement. This has been discussed by Out et al. (2009), who reported strong correlations between allele frequencies, pool size, coverage and error rates in Illumina GAIIx sequencing. They demonstrated coverage of would be sufficient for detection of SNPs with frequencies at or above 0.3%. High coverage ensured that Illumina GAIIx platform error was neutralised in this experiment.

Screening of functional SNPs

Since the emergence of high-throughput whole genome sequencing technologies (Henry, 2008), it has become difficult to recognize functional SNPs in a pool of DNA sequence data which also contains neutral SNPs (George Priya Doss et al., 2008). Computational algorithms are useful and cost-effective tools for the analysis of SNPs and genes. Recently, SNP-linkage disequilibrium and association studies, which need accurate phenotypic data from appropriate populations, have gained acceptance as procedures to assess functional SNPs (Carlson et al., 2003). However, these populations can be difficult to generate (Gupta et al., 2005), and they must have high variation in the studied traits. In this thesis, a computational screening pathway was developed to prioritize and rank plant SNPs to predict their functionality and impact on plant phenotypes. This showed there are significant numbers of important elements in the GBSSI gene, some of which have a strong association with starch physicochemical properties (Soussi et al., 2006). Based on computational analysis, the [C/A] SNP at exon 6 [OryzaSNP2] and the C/T SNPs [OryzaSNP3] and [OryzaSNP6] at exons 9 and 10 of GBSSI have been the most

influential SNPs in this population. Larkin and Park (2003) verified that haplotypes composed of SNPs at the exon 1/intron 1 boundary site, exon 6 and exon 10 regulate GBSSI function. Chen et al. (2008a and 2008b) have also confirmed that these SNPs can alter apparent amylose content and pasting properties of rice. The effect of the [C/A] SNP at exon 6 on amylose content and grain quality has been confirmed by many authors (Sano, 1984; Larkin and Park, 2003; Chen et al., 2008a). In silico analysis with FAS-ESS has also suggested another important silencer (ESS1) at the splice site of exon 1/intron 1, which has a [G/T] SNP [OryzaSNP1]. The significance of this SNP, which reduces amylose content, was confirmed by Cai et al. (1997) and in the association study (Chapter 4). With the advent of whole genome sequencing, this suggests that computational analysis of whole genome data is a means by which identification of important polymorphisms can be accelerated.

Gene copy number in the rice genome

In polyploid species which have two or more genomes, such as wheat and brassica, there are as many copies of each gene as there are haploid genomes. Tetraploids have at least two copies of each gene, hexaploids have three copies, and so on. Rice is a diploid species with mostly one copy of each gene. Some genes, such as the starch synthases, exist as gene families. The rice genome has been fully sequenced and annotated and an explicit list of genes is available. In this study, there were 17 different genes, some of which were similar in sequence, such as the starch synthase genes SSI, SSIIa, SSIIb etc. However, this gene family has been extensively characterised at both the gene and enzyme level, and so each member of the gene family is now uniquely identified. In addition, these genes are located on different chromosomes, a means by which they are further separated.
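The arithmetic behind the pooled-sequencing figures quoted earlier in this chapter (233 pooled lines, 399 database SNPs, 501 sequenced SNPs, 75 nonsynonymous) can be reproduced in a few lines. This is a minimal sketch for checking the numbers; the function names are hypothetical, and the frequency calculation follows the thesis in treating one variant line out of 233 as the smallest detectable frequency.

```python
POOL_SIZE = 233          # DNA samples pooled for sequencing
DATABASE_SNPS = 399      # SNPs found in public database records
SEQUENCED_SNPS = 501     # SNPs detected by sequencing the pool
NONSYNONYMOUS_SNPS = 75  # SNPs leading to amino acid changes


def min_variant_frequency_percent(pool_size: int) -> float:
    """Frequency (%) of a variant carried by exactly one sample in the pool."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    return 100.0 / pool_size


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"single-line variant: {min_variant_frequency_percent(POOL_SIZE):.2f}%")
    print(f"new SNPs vs databases: {SEQUENCED_SNPS - DATABASE_SNPS}")
    print(f"nonsynonymous share: {percent(NONSYNONYMOUS_SNPS, SEQUENCED_SNPS):.2f}%")
```

The nonsynonymous share evaluates to just under 15%, in line with the approximate figure given in the text.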

Multiplexed MALDI-TOF mass spectrometry markers help to genotype individuals in a cost-effective manner

Multiplex MALDI-TOF is a powerful tool for the detection and confirmation of SNPs in rice. It has been suggested that this platform has the capability of determining more than 40 SNPs in multiplex (Sequenom, 2006) and, given that the platform can process ten 384-well plates per day, users can theoretically analyse in excess of SNPs daily (Perkel, 2008). This technique can be applied to segregating populations in the early stages of breeding programs to select desired polymorphisms and traits, and is a co-dominant system, having the ability to detect alleles in hybrids, heterozygotes (Jones et al., 2007) and polyploids (Henry et al., 2008).

After recognition and prioritization of important SNPs, it was essential to find an appropriate way to genotype 110 functional SNPs in 233 individuals. Based on regular sequencing methods, at least 110 × 233 = 25,630 assays were needed, which would be too elaborate and expensive. In this thesis, it was demonstrated that DNA polymorphisms can be efficiently confirmed and analysed in rice using a MALDI-TOF mass spectrometry system. These assays can be used as a marker-assisted selection tool in conventional breeding programs. For this reason, an 8-plex assay was designed to check, for the first time in plants, the suitability of multiplexed MALDI-TOF SNP-specific markers. The results showed that optimal conditions can be achieved and the method can be efficiently used in the genotyping of rice individuals in association studies (Chapter 4).

The only drawback of this system was the inefficient recognition of large indels. However, this study found no indels in protein coding sequences, suggesting indels are of minor importance in terms of trait determination generally. In the specific case of this rice breeding population, there were no functional indels and they therefore did not influence these data.
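The assay count given above (110 SNPs in 233 individuals) and the saving from multiplexing can be illustrated with a short calculation. This is a sketch under a simplifying assumption that is not claimed in the thesis: every SNP can be packed into a full multiplex group, so the number of reactions per individual is the SNP count divided by the plex level, rounded up. The function names are hypothetical.

```python
import math


def singleplex_assays(n_snps: int, n_individuals: int) -> int:
    """One assay per SNP per individual."""
    return n_snps * n_individuals


def multiplex_reactions(n_snps: int, n_individuals: int, plex: int) -> int:
    """Reactions needed if SNPs are grouped into multiplexes of size `plex`.

    Assumes any SNPs can share a reaction, which real assay design
    (primer interactions, mass overlap) will not always allow.
    """
    if plex < 1:
        raise ValueError("plex must be at least 1")
    return math.ceil(n_snps / plex) * n_individuals


if __name__ == "__main__":
    print("single-plex assays:", singleplex_assays(110, 233))
    print("8-plex reactions:  ", multiplex_reactions(110, 233, 8))
```

With `plex=1` the second function reduces to the single-plex count, which gives a quick consistency check on the grouping logic.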

Association between SNPs in starch biosynthesis genes and the nutritional and functional properties of domesticated rice

The main aim of this thesis was to find associations among SNPs in different starch-related genes and rice physicochemical properties. For this reason, 110 functional SNPs derived from database searches and direct sequencing (Chapter 2) were chosen and then genotyped in 233 Australian rice lines using multiplexed MALDI-TOF (Chapter 6). In total, 65 SNPs were successfully genotyped in 233 breeding lines. No polymorphism was detected for AGPS2b, SPHOL, SSIIb, SSIVb and ISA1 (Yan et al., 2010). Moreover, there was no association between BEIIa and BEIIb (Fisher et al., 1993; Sun et al., 1997; Sun et al., 1998; Yamakawa et al., 2008) and any physicochemical properties in rice. Therefore, seven genes out of eighteen did not contribute to the physicochemical properties of this population. Despite the existence of some reports at the gene level of the importance of these genes, no association between any of these genes and quality properties was found (Kawagoe et al., 2005; Tetlow et al., 2004). This may be because almost all of these studies have been based on artificially induced mutants that abolish enzyme activity (Rolletschek et al., 2002). The data here suggest that, although artificially induced mutations, such as indels, which abolish gene function may have utility in understanding the role of particular genes and enzymes in starch biosynthesis, they provide little guidance as to which genes are important in the natural system.

In contrast, GBSSI and SSIIa are major determinants of the important grain quality properties amylose content and gelatinization temperature. Highly significant associations were found between GBSSI and retrogradation and amylose content, in addition to more significant relationships with RVA properties such as BDV, SBV and FV (Chen et al., 2008a, b; Cai et al., 1998). Larkin and Park (2003) have already reported that a SNP in exon 6 affects amylose content. This study confirms that the T/G SNP at the exon 1/intron 1 junction site has a major influence on a number of physicochemical properties.

SSIIa showed a very strong association with pasting temperature, gelatinization temperature and peak time (Umemoto et al., 2004; Umemoto et al., 2008). Umemoto and Aoki (2005) explained the alkali disintegration and eating quality of rice starch by polymorphism of two SNPs, [A/G] and [GC/TT]. These SNPs within exon 8 of the alk locus are also significantly associated with gelatinisation temperature (GT) (Waters et al., 2006). We also confirmed a very significant association between the GC/TT SNP at exon 8 of SSIIa and pasting temperature (R² = 0.642).

In this thesis, six genes, GBSSII, SSI, SSIIIa, SSIIIb, SSIVa and BEI, had low to medium effects on the final phenotypic variation of individuals, so these genes were called contributory (Dian et al., 2003; Fujita et al., 2006; Hirose et al., 2006; Umemoto et al., 2008). Here, for the first time, SSIIIb and SSIVa were identified as PT-associated genes with relatively medium to high levels of association with pasting and/or gelatinization temperature. A statistically high R² value of was calculated between a T/G SNP in position 7232 of SSIIIb and pasting temperature.

Analysis of this population identified a mixture of genes previously known to have a major impact on rice starch quality, GBSSI and SSIIa, and a set of genes which play relatively minor roles. It is possible that the minor genes make a contribution to starch quality in the Australian rice growing environment only, and that other sets of genes are important in other environments. This may be a blueprint for future approaches to developing molecular markers for plant breeding. In the past, the relative paucity of data has meant only genes of major effect could be pursued in plant breeding. It is now possible to identify more easily environmentally affected genes of small effect, which in combination may have a significant impact on traits of interest.
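The R² values reported above summarise how much of the variation in a trait such as pasting temperature is explained by a single SNP. The sketch below shows one simple way such a value can be obtained: a biallelic marker is coded numerically and R² is computed as the squared Pearson correlation with the phenotype. This is a stand-in for illustration, not the association model used in the thesis, and the allele coding and function names are assumptions.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or phenotype")
    return (sxy * sxy) / (sxx * syy)


def encode_alk4(calls):
    """Code the alk4 [GC/TT] marker as 0 (GC) or 1 (TT); hypothetical coding."""
    coding = {"GC": 0, "TT": 1}
    return [coding[c] for c in calls]


if __name__ == "__main__":
    # Toy values for illustration only; not measurements from the thesis.
    genotypes = encode_alk4(["GC", "GC", "TT", "TT", "GC", "TT"])
    toy_trait = [3.0, 2.8, 1.1, 1.4, 2.5, 0.9]
    print(round(r_squared(genotypes, toy_trait), 3))
```

For a biallelic marker in inbred lines this is equivalent to the R² of a one-way comparison between the two genotype classes; a full association analysis would also account for population structure.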

The glucose 6-phosphate translocator (GPT1) may contribute to resistant starch

The glucose 6-phosphate/phosphate translocator (GPT1) imports carbon into non-photosynthetic plastids (Kammerer et al., 1998) and appears to be a key gene controlling retrogradation degree in rice. The results of this study revealed that high amylose rice cultivars, characterized by low major RVA parameters, such as peak viscosity, hot paste viscosity and cool paste viscosity, had more resistant starch and a lower estimated glycaemic index (GI). When the retrogradation degree is higher, the starch is more resistant to digestion and the GI is lower (Hu et al., 2004). In this study, a significant association was found between a T/C SNP at position 1188 of GPT1, which determines Leu42Phe, and resistant-retrograded starch and amylose content. It is believed that amylose content has a significant influence on retrogradation rate (Hu et al., 2004), but some studies show that these two important starch properties might work independently of one another. In spite of a correlation coefficient of 0.70 between amylose content and retrogradation degree in this study, the conclusion here is that these two traits work independently but have some influence on each other.

Conclusion and future directions

I conclude that there are three different gene categories affecting starch quality in Australian rice germplasm. First, there are genes with major effects, such as GBSSI and SSIIa, which greatly impact starch characteristics. Any new SNP found in these genes can significantly influence starch phenotype. The second category contains genes with an intermediate impact on starch properties. In this thesis, I named this category contributory genes, which act together with the major genes to shape the final starch composition. Any variation in these genes has a low to intermediate impact on starch. I suggest six genes, GBSSII, SSI, SSIIIa, SSIIIb, SSIVa and BEI, reside in this category. Finally, there are genes, such as the debranching enzyme genes, which had minor or no impact on starch physicochemical properties. It should be noted that contributory

and minor genes may differently impact starch phenotypic variation in different germplasm and environments. Environmental factors may have a major influence on plant growth, starch genes, enzyme activities and traits and, therefore, on the results of any association study.

Do all roads lead to GBSSI and SSIIa? This study suggests that the primary determinants of rice starch quality are GBSSI and SSIIa. However, there might be some other genes in the genome that have a significant impact on starch quality. In this thesis, I suggested GPT1 is one of these novel important contributing genes. Expanding the collection of known alleles of starch synthesis structural genes through whole genome sequencing and associating these with starch traits will improve resolution of the interactions in the starch gene network. Whole genome sequencing of up to 3000 cultivars is underway. When complete, associations between starch gene alleles and starch quality parameters will help us to reach an understanding of the role of natural variation in starch genes in determining starch quality.

Protein quantity influences rice eating quality. However, does protein composition influence rice eating quality? Protein bodies (PBs) may be an important factor influencing rice composition and quality. PBs reside among starch granules and their distribution may have a significant impact on starch quality. There are two types of protein bodies: PB-I are prolamins, which constitute 18-20% of grain protein and comprise several subunits of known amino acid and gene sequence; PB-II are glutelins, which constitute 70-80% of grain protein. There has been much activity investigating the mechanism of storage protein synthesis and relatively little investigating the quantity and amino acid composition of the different storage protein classes in mature grain. A possible path for assessing the role of proteins in rice quality may involve assembling a panel of rice genotypes with known differences in eating quality parameters, as measured by taste panel, and associating these differences with protein body subunit concentration (proteomics). Protein body subunit composition could be confirmed through whole genome sequencing and assembly to known subunit gene sequence and

correlations/associations of protein body subunit concentration and composition with eating quality parameters tested. Regardless of whether the target for further investigation of rice grain quality is starch or protein, high-throughput sequencing applied to structured genetic populations will prove to be a powerful tool, which will be invaluable in determining the contribution of each of these entities to grain quality.

References

Ahmadian A, Gharizadeh B, Gustafsson AC, Sterky F, Nyrén P, Uhlén M and Lundeberg J (2000) Single-nucleotide polymorphism analysis by pyrosequencing. Anal. Biochem. 280,
Akey JM, Zhang G, Zhang K, Jin L and Shriver MD (2002) Interrogating a high-density SNP map for signatures of natural selection. Genome. Res. 12,
Andriotis VME, Pike MJ, Bunnewell S, Hills MJ and Smith AM (2010) The plastidial glucose 6-phosphate/phosphate antiporter GPT1 is essential for morphogenesis in Arabidopsis embryos. Plant. J. 64 (1)
Andriotis VME, Pike MJ, Kular B, Rawsthorne S and Smith AM (2010) Starch turnover in developing oilseed embryos. New Phytol. 187:
Asp NG and Björck I (1992) Resistant starch. Trends. Food. Sci. Tech. 3:
Baldwin PM (2001) Starch Granule Associated Proteins and Polypeptides: A Review. Starch-Stärke 53:
Ball SG and Morell MK (2003) From bacterial glycogen to starch: Understanding the biogenesis of the plant starch granule. Annu. Rev. Plant. Biol. 54(1):
Barreiro LB, Laval G, Quach H, Patin E and Quintana-Murci L (2008) Natural selection has driven population differentiation in modern humans. Nat. Genet. 40,
Batley J, Mogg R, Edwards D, O'Sullivan H and Edwards KJ (2003) A high-throughput SNuPE assay for genotyping SNPs in the flanking regions of Zea mays sequence tagged simple sequence repeats. Mol. Breed. 11,
Beatty MK, Rahman A, Cao H, Woodman W, Lee M, Myers AM and James MG (1999) Purification and molecular genetic characterization of ZPU1, a pullulanase-type starch-debranching enzyme from maize. Plant. Physiol. 119,
Bentley DR (2006) Whole-genome re-sequencing. Curr. Opin. Genet. Dev. 16,
Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL and Bignell HR (2008) Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 456:
Bhattramakki D, Dolan M, Hanafey M, Wineland R, Vaske D, Register JC, Tingey SV and Rafalski A (2002) Insertion-deletion polymorphisms in 3′ regions of maize genes occur frequently and can be used as highly informative genetic markers. Plant Mol. Biol. 48,

Bodmer W and Bonilla C (2008) Common and rare variants in multifactorial susceptibility to common diseases. Nat. Genet. 40,
Bork P and Koonin EV (1998) Predicting functions from protein sequences where are the bottlenecks? Nat. Genet. 18(4):
Boyer CD and Preiss J (1978) Multiple forms of starch branching enzyme of maize: evidence for independent genetic control. Biochem. Biophys. Res. Commun. 80,
Bradbury LMT, Fitzgerald TL, Henry RJ, Jin Q and Waters DLE (2005) The gene for fragrance in rice. Plant Biotechnol. J. 3,
Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y and Buckler ES (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 23:
Brookes AJ (1999) The essence of SNPs. Gene. 234 (2):
Bryan GT, Wu KS, Farrall L, Jia Y, Hershey HP, McAdams SA, Faulk KN, Donaldson GK, Tarchini R and Valent B (2000) A single amino acid difference distinguishes resistant and susceptible alleles of the rice blast resistance gene Pi-ta. Plant. Cell. Online. 12(11):
Buléon A, Colonna P, Planchot V and Ball S (1998) Starch granules: structure and biosynthesis. Int. J. Biological. Macromol. 23:
Bulyk ML (2004) Computational prediction of transcription-factor binding site locations. Genome. Biol. 5(1):
Buschiazzo A, Ugalde JE, Guerin ME, Shepard W, Ugalde RA and Alzari PM (2004) Crystal structure of glycogen synthase: homologous enzymes catalyze glycogen synthesis and degradation. EMBO. J. 23(16): 3196
Bustos R, Fahy B, Hylton CM, Seale R, Nebane NM, Edwards A, Martin C and Smith AM (2004) Starch granule initiation is controlled by a heteromultimeric isoamylase in potato tubers. Proc. Natl. Acad. Sci. USA. 101,
Cai XL, Wang ZY, Zheng FQ and Hong MM (1997) Regulation-related Intron in 5' Untranslated Region of Rice Waxy Gene. Acta. Phytophysio. Sini. 23:
Cai XL, Wang ZY, Xing YY, Zhang JL and Hong MM (1998) Aberrant splicing of intron 1 leads to the heterogeneous 5′ UTR and decreased expression of waxy gene in rice cultivars of intermediate amylose content. Plant. J. 14(4):

Cao H, Imparl-Radosevich J, Guan H, Keeling PL, James MG and Myers AM (1999) Identification of the soluble starch synthase activities of maize endosperm. Plant. Physiol. 120,
Carlson CS, Eberle MA, Rieder MJ, Smith JD, Kruglyak L and Nickerson DA (2003) Additional SNPs and linkage-disequilibrium analyses are necessary for whole-genome association studies in humans. Nat. Genet. 33(4):
Clark RM, Schweikert G, Toomajian C, Ossowski S, Zeller G, Shinn P, Warthmann N, Hu TT, Fu G and Hinds DA (2007) Common sequence polymorphisms shaping genetic diversity in Arabidopsis thaliana. Science. 317:338
Craig J, Lloyd JR, Tomlinson K, Barber L, Edwards A, Wang TL, Martin C, Hedley CL and Smith AM (1998) Mutations in the gene encoding starch synthase II profoundly alter amylopectin structure in pea embryos. Plant. Cell. Online. 10,
Chen X and Sullivan PF (2003) Single nucleotide polymorphism genotyping: biochemistry, protocol, cost and throughput. Pharmacogenomics. J. 3(2):
Chen MH, Bergman C and Fjellstrom R (2004) Waxy locus genetic variation associated with amylose content in international rice germplasm. In: 30th Proceedings of the Rice Technical Working Group Meeting. Beaumont, Texas, USA
Chen MH, Bergman C, Pinson S and Fjellstrom R (2008a) Waxy gene haplotypes: Associations with apparent amylose content and the effect by the environment in an international rice germplasm collection. J. Cereal. Sci. 47(3):
Chen MH, Bergman CJ, Pinson S and Fjellstrom R (2008b) Waxy gene haplotypes: Associations with pasting properties in an international rice germplasm collection. J. Cereal. Sci. 48(3):
Chothia C and Lesk AM (1986) The relation between the divergence of sequence and structure in proteins. EMBO. J. 5(4):
Chung HJ, Lim HS and Lim ST (2006) Effect of partial gelatinization and retrogradation on the enzymatic digestion of waxy rice starch. J. Cereal. Sci. 43:
Commuri PD and Keeling PL (2001) Chain-length specificities of maize starch synthase I enzyme: studies of glucan affinity and catalytic properties. Plant. J. 25:
Costabile M, Quach A and Ferrante A (2006) Molecular approaches in the diagnosis of primary immunodeficiency diseases. Hum. Mutat. 27,

Dayong X, Jun J, Suyun H, Xiehong W, Yun G and Qingsen Z (2004) Effects of N, P and K fertilizer amount on rice grain amylose content and starch viscosity properties. Chinese. Agri. Sci. Bull. 20:
Debet MR and Gidley MJ (2007) Why do gelatinized starch granules not dissolve completely? Roles for amylose, protein, and lipid in granule ghost integrity. J. Agri. Food. Chem. 55:
Dian W, Jiang H, Chen Q, Liu F and Wu P (2003) Cloning and characterization of the granule-bound starch synthase II gene in rice: gene expression is regulated by the nitrogen level, sugar and circadian rhythm. Planta. 218,
Dian W, Jiang H and Wu P (2005) Evolution and expression analysis of starch synthase III and IV in rice. J. Exp. Bot., 56,
Ding C and Cantor CR (2003) A high-throughput gene expression analysis technique using competitive PCR and matrix-assisted laser desorption ionization time-of-flight MS. Proc. Natl. Acad. Sci. USA. 100:
Dinges JR, Colleoni C, James MG and Myers AM (2003) Mutational analysis of the pullulanase-type debranching enzyme of maize indicates multiple functions in starch metabolism. Plant. Cell. Online. 15,
Doehlert DC and Knutson CA (1991) Two classes of starch debranching enzymes from developing maize kernels. Jahresheft der Albrecht-Thaer-Gesellschaft (Germany). 138(5)
Dohm JC, Lottaz C, Borodina T and Himmelbauer H (2008) Substantial biases in ultra-short read data sets from high-throughput DNA sequencing. Nucleic. Acids. Res. 36 (16) e105
Domon E, Saito A and Takeda K (2002) Comparison of the waxy locus sequence from a nonwaxy strain and two waxy mutants of spontaneous and artificial origins in barley. Genes. Genet. Sys. 77:
Druley TE, Vallania FLM, Wegner DJ, Varley KE, Knowles OL, Bonds JA, Robison SW, Doniger SW, Hamvas A and Cole FS (2009) Quantification of rare allelic variants from pooled genomic DNA. Nat. Methods. 6:
Drenkard E, Richter BG, Rozen S, Stutius LM, Angell NA, Mindrinos M, Cho RJ, Oefner PJ, Davis RW and Ausubel FM (2000) A simple procedure for the analysis of single nucleotide polymorphisms facilitates map-based cloning in Arabidopsis. Plant. Physiol. 124,
Eagles HA, Cane K, Appelbee M, Kuchel H, Eastwood RF and Martin PJ (2012) The storage protein activator gene Spa-B1 and grain quality traits in southern Australian wheat breeding programs. Crop and Pasture Sci. 63(4)

Eastmond PJ and Rawsthorne S (2000) Coordinate changes in carbon partitioning and plastidial metabolism during the development of oilseed rape embryos. Plant. Physiol. 122:
Edwards D, Forster JW, Cogan NOI, Batley J and Chagné D (2007) Chapter 4: Single nucleotide polymorphism discovery in plants. Association mapping in plants. Springer, New York:
ElSharawy A, Manaster C, Teuber M, Rosenstiel P, Kwiatkowski R, Huse K, Platzer M, Becker A, Nurnberg P and Schreiber S (2006) SNPSplicer: systematic analysis of SNP-dependent splicing in genotyped cDNAs. Hum. Mutat. 27(11)
Fairbrother WG and Chasin LA (2000) Human genomic sequences that inhibit splicing. Mol. Cell. Biol. 20(18):
Fairbrother WG, Yeh RF, Sharp PA and Burge CB (2002) Predictive identification of exonic splicing enhancers in human genes. Science 9(297)
Faisant N, Champ M, Colonna P, Buleon A, Molis C, Langkilde A, Schweizer T, Flourie B and Galmiche J (1993) Structural features of resistant starch at the end of the human small intestine. Europ. J. Clin. Nutr. 47:285.
Fan J and Marks B (1998) Retrogradation kinetics of rice flours as influenced by cultivar. Cereal. Chem. 75:
Fersht AR (1985) Enzyme structure and function. New York, Freeman and Co. USA
Fischer K and Weber A (2002) Transport of carbon in non-green plastids. Trends. Plant. Sci., 7,
Fisher DK, Boyer CD and Hannah LC (1993) Starch branching enzyme II from maize endosperm. Plant. Physiol. 102,
Fitzgerald M (2004) Starch. Rice Chemistry and Technology
Fujita N, Kubo A, Suh DS, Wong KS, Jane JL, Ozawa K, Takaiwa F, Inaba Y and Nakamura Y (2003) Antisense inhibition of isoamylase alters the structure of amylopectin and the physicochemical properties of starch in rice endosperm. Plant. Cell. Physiol., 44,
Fujita N, Yoshida M, Asakura N, Ohdan T, Miyao A, Hirochika H and Nakamura Y (2006) Function and characterization of starch synthase I using mutants in rice. Plant. Physiol. 140:1070.
Fujita N, Yoshida M, Kondo T, Saito K, Utsumi Y, Tokunaga T, Nishi A, Satoh H, Park JH and Jane JL (2007) Characterization of SSIIIa-deficient mutants of rice: the function of SSIIIa and pleiotropic effects by SSIIIa deficiency in the rice endosperm. Plant. Physiol. 144,

Futschik A and Schlotterer C (2010) The next generation of molecular markers from massively parallel sequencing of pooled DNA samples. Genetics. 186,
Gao M, Fisher DK, Kim KN, Shannon JC and Guiltinan MJ (1997) Independent genetic control of maize starch-branching enzymes IIa and IIb (Isolation and characterization of a Sbe2a cDNA). Plant. Physiol. 114,
Gao ZY, Zeng DL, Cui X, Zhou YH, Yan MX, Huang DN, Li JY and Qian Q (2003) Map-based cloning of the ALK gene, which controls the gelatinization temperature of rice. Sci. Chin. Ser. C, Life Sci. 46,
Garg K, Green P and Nickerson DA (1999) Identification of candidate coding region single nucleotide polymorphisms in 165 human genes using assembled expressed sequence tags. Genome. Res. 9,
George Priya Doss C, Sudandiradoss C, Rajasekaran R, Choudhury P, Sinha P, Hota P and Batra UP (2008) Applications of computational algorithm tools to identify functional SNPs. Funct. Integr. Genomic. 8(4)
Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P and Varma H (2002) A Draft Sequence of the Rice Genome (Oryza sativa L. ssp. japonica). Science 296:
Gunderson KL, Steemers FJ, Ren H, Ng P, Zhou L, Tsan C, Chang W, Bullis D, Musmacker J and King C (2006) Whole-genome genotyping. Methods. Enzymol. 410,
Gupta PK, Roy JK and Prasad M (2001) Single nucleotide polymorphisms: a new paradigm for molecular marker technology and DNA polymorphism detection with emphasis on their use in plants. Curr. Sci. 80,
Gupta PK, Rustgi S and Kulwal PL (2005) Linkage disequilibrium and association studies in higher plants: present status and future prospects. Plant Mol Biol 57(4):
Hanashiro I, Itoh K, Kuratomi Y, Yamazaki M, Igarashi T, Matsugasako J and Takeda Y (2008) Granule-bound starch synthase I is responsible for biosynthesis of extra-long unit chains of amylopectin in rice. Plant. Cell. Physiol. 49:925.
Harismendy O and Frazer K (2009) Method for improving sequence coverage uniformity of targeted genomic intervals amplified by LR-PCR using Illumina GA sequencing-by-synthesis technology. Biotechniques. 46,
Harn C, Knight M, Ramakrishnan A, Guan H, Keeling PL and Wasserman BP (1998) Isolation and characterization of the zSSIIa and zSSIIb starch synthase cDNA clones from maize endosperm. Plant. Mol Biol. 37:

Hayashi K, Hashimoto N, Daigen M and Ashikawa I (2004) Development of PCR-based SNP markers for rice blast resistance genes at the Piz locus. Theor. Appl. Genet. 108,
Hebsgaard SM, Korning PG, Tolstrup N, Engelbrecht J, Rouze P and Brunak S (1996) Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic. Acids. Res. 24(17):
Hegyi H and Gerstein M (1999) The relationship between protein structure and function: a comprehensive survey with application to the yeast genome. J. Mol. Biol. 288(1)
Heinemeyer T, Wingender E, Reuter I, Hermjakob H, Kel AE, Kel OV, Ignatieva EV, Ananko EA, Podkolodnaya OA and Kolpakov FA (1998) Databases on transcriptional regulation: TRANSFAC, TRRD and COMPEL. Nucleic. Acids. Res. 26(1):
Henry RJ (2008) Future prospects for plant genotyping, (ed), Plant genotyping II: SNP technology, CABI Publishing, Wallingford, UK, pp
Henry RJ, Pattemore JA, Waters DLE, Kharabian-Masouleh A, Bundock PC and Eliott FG (2008) Applications of the Sequenom platform to SNP analysis in plants. In: Plant and Animal Genomes XVI Conference. San Diego, CA: Town and Country Convention Center.
Hillier LDW, Marth GT, Quinlan AR, Dooling D, Fewell G, Barnett D, Fox P, Glasscock JI, Hickenbotham M and Huang W (2008) Whole-genome sequencing and variant discovery in C. elegans. Nat. Methods. 5:
Hirose T and Terao T (2004) A comprehensive expression analysis of the starch synthase gene family in rice (Oryza sativa L.). Planta
Hirose T, Ohdan T, Nakamura Y and Terao T (2006) Expression profiling of genes related to starch synthesis in rice leaf sheaths during the heading period. Physiol. Plant. 128,
Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, Middle CM, Rodesch MJ, Albert TJ and Hannon GJ (2007) Genome-wide in situ exon capture for selective resequencing. Nat. Genet. 39,
Hovenkamp-Hermelink JHM, Jacobsen E, Ponstein AS, Visser RGF, Vos-Scheperkeuter GH, Bijmolt EW, Vries JN, Witholt B and Feenstra WJ (1987) Isolation of an amylose-free starch mutant of the potato (Solanum tuberosum L.). Theor. Appl. Genet. 75:
Hrmova M and Fincher GB (2001) Structure-function relationships of β-D-glucan endo- and exohydrolases from higher plants. Plant. Mol. Biol. 47(1):

Hu P, Zhao H, Duan Z, Linlin Z and Wu D (2004) Starch digestibility and the estimated glycemic score of different types of rice differing in amylose contents. J. Cereal. Sci. 40:
Hutchings D, Rawsthorne S and Emes MJ (2005) Fatty acid synthesis and the oxidative pentose phosphate pathway in developing embryos of oilseed rape (Brassica napus L.). J. Exp. Bot. 56:577
Imelfort M, Duran C, Batley J and Edwards D (2009) Discovering genetic polymorphisms in next generation sequencing data. Plant Biotechnol J. 7:
Imparl-Radosevich JM, Nichols DJ, Li P, McKean AL, Keeling PL and Guan H (1999) Analysis of purified maize starch synthases IIa and IIb: SS isoforms can be distinguished based on their kinetic properties. Arch. Biochem. Biophys. 362:
Imparl-Radosevich JM, Gameon JR, McKean A, Wetterberg D, Keeling PL and Guan H (2003) Understanding catalytic properties and functions of maize starch synthase isozymes. J. Appl. Glycoscience. 50:
Ingman M and Gyllensten U (2008) SNP frequency estimation using massively parallel sequencing of pooled DNA. Eur. J. Hum. Genet. 17,
Ishikawa N, Ishihara J and Itoh M (1995) Artificial induction and characterization of amylose-free mutants of barley. Barley. Genet. Newsl. 24:
Isshiki M, Morino K, Nakajima M, Okagaki RJ, Wessler SR, Izawa T and Shimamoto K (1998) A naturally occurring functional allele of the rice waxy locus has a GT to TT mutation at the 5' splice site of the first intron. Plant. J. 15:
James MG, Denyer K and Myers AM (2003) Starch synthesis in the cereal endosperm. Curr. Opin. Plant Biol. 6,
Jeong SC, Kristipati S, Hayes AJ, Maughan PJ, Noffsinger SL, Gunduz I, Buss GR and Maroof MAS (2002) Genetic and sequence analysis of markers tightly linked to the soybean mosaic virus resistance gene, Rsv 3. Crop. Sci. 42,
Jiang H, Dian W, Liu F and Wu P (2003) Cloning and characterization of a glucose 6-phosphate/phosphate translocator from Oryza sativa. J. Zhejiang. Univ-Sci. A., 4,
Jones ES, Sullivan H, Bhattramakki D and Smith JSC (2007) A comparison of simple sequence repeat and single nucleotide polymorphism marker technologies for the genotypic analysis of maize (Zea mays L.). Theor. Appl. Genet. 115:

Juliano BO, Onate LU and Del Mundo AM (1965) Relation of starch composition, protein content, and gelatinization temperature to cooking and eating qualities of milled rice. Food. Technol. 19.
Kaiser J (2008) DNA sequencing: A plan to capture human diversity in 1000 Genomes. Science. 319:395.
Kammerer B, Fischer K, Hilpert B, Schubert S, Gutensohn M, Weber A and Flügge UI (1998) Molecular characterization of a carbon transporter in plastids from heterotrophic tissues: the glucose 6-phosphate/phosphate antiporter. Plant. Cell. Online. 10:105.
Kawagoe Y, Kubo A, Satoh H, Takaiwa F and Nakamura Y (2005) Roles of isoamylase and ADP glucose pyrophosphorylase in starch granule synthesis in rice endosperm. Plant. J. 42,
Kennedy BG, Waters DLE, Henry RJ (2006) Screening for the rice blast resistance gene Pi-ta using LNA displacement probes and real-time PCR. Mol Breeding 18(3):
Kharabian-Masouleh A, Waters DLE, Reinke RF and Henry RJ (2011) Discovery of polymorphisms in starch related genes in rice germplasm by amplification of pooled DNA and deeply parallel sequencing. Plant. Biotechnol. J. 9 (9):
Kiesselbach TA (1944) Character, field performance, and commercial production of waxy corn. J. Am. Soc. Agron. 36:
Kim JI, Ju YS, Park H, Kim S, Lee S, Yi JH, Mudge J, Miller NA, Hong D and Bell CJ (2009) A highly annotated whole-genome sequence of a Korean individual. Nature. 460:
Kubo A, Fujita N, Harada K, Matsuda T, Satoh H, Nakamura Y (1999) The starch-debranching enzymes isoamylase and pullulanase are both involved in amylopectin biosynthesis in rice endosperm. Plant. Physiol. 121(2):
Kuipers AGJ, Jacobsen E and Visser RGF (1994) Formation and deposition of amylose in the potato tuber starch granule are affected by the reduction of granule-bound starch synthase gene expression. Plant. Cell. Online. 6:43.
Kuriki T, Stewart DC and Preiss J (1997) Construction of chimeric enzymes out of maize endosperm branching enzymes I and II. J. Biol. Chem. 272,
Larkin PD and Park WD (2003) Association of waxy gene single nucleotide polymorphisms with starch characteristics in rice (Oryza sativa L.). Mol. Breeding. 12(4):
Laskowski RA, MacArthur MW, Moss DS, Thornton JM (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26(2):

Li Z, Rahman S, Kosar-Hashemi B, Mouille G, Appels R and Morell MK (1999) Cloning and characterization of a gene encoding wheat starch synthase I. Theor. Appl. Genet. 98:
Li Z, Chu X, Mouille G, Yan L, Kosar-Hashemi B, Hey S, Napier J, Shewry P, Clarke B and Appels R (1999) The localization and expression of the class II starch synthases of wheat. Plant. Physiol. 120:1147.
Li Z, Mouille G, Kosar-Hashemi B, Rahman S, Clarke B, Gale KR, Appels R and Morell MK (2000) The structure and expression of the wheat starch synthase III gene. Motifs in the expressed gene define the lineage of the starch synthase III gene family. Plant. Physiol. 123:613.
Libessart N, Maddelein ML, Koornhuyse NV, Decq A, Delrue B, Mouille G, D'Hulst C and Ball S (1995) Storage, photosynthesis, and growth: the conditional nature of mutations affecting starch synthesis and structure in Chlamydomonas. Plant. Cell. Online. 7:1117.
Limpisut P and Jindal VK (2002) Comparison of rice flour pasting properties using Brabender Viscoamylograph and Rapid Visco Analyser for evaluating cooked rice texture. Starch/Stärke. 54:
Livak KJ (1999) Allelic discrimination using fluorogenic probes and the 5' nuclease assay. Genetic analysis: Biomol. Eng. 14,
Lumdubwong N and Seib P (2000) Rice starch isolation by alkaline protease digestion of wet-milled rice flour. J. Cereal. Sci. 31:
Mardis ER (2008a) The impact of next-generation sequencing technology on genetics. Trends. Genet. 24,
Mardis ER (2008b) Next-generation DNA sequencing methods. Annu. Rev. Genom. Hum. G
Marshall W, Normand F and Goynes W (1990) Effects of lipid and protein removal on starch gelatinization in whole grain milled rice. Cereal. Chem. 67:
Masouleh AK, Waters DLE, Reinke RF and Henry RJ (2009) A high-throughput assay for rapid and simultaneous analysis of perfect markers for important quality and agronomic traits in rice using multiplexed MALDI-TOF mass spectrometry. Plant. Biotechnol. J. 7,
McGuigan FE and Ralston SH (2002) Single nucleotide polymorphism detection: allelic discrimination using TaqMan. Psychiatr. Genet. 12,

McNally KL, Childs KL, Bohnert R, Davidson RM, Zhao K, Ulat VJ, Zeller G, Clark RM, Hoen DR and Bureau TE (2009) Genomewide SNP variation reveals relationships among landraces and modern varieties of rice. Proceedings of the National Academy of Sciences 106:
Mikami I, Uwatoko N, Ikeda Y, Yamaguchi J, Hirano HY, Suzuki Y and Sano Y (2008) Allelic diversification at the wx locus in landraces of Asian rice. Theor. Appl. Genet. 116(7):
Miles MJ, Morris VJ, Orford PD and Ring SG (1985) The roles of amylose and amylopectin in the gelation and retrogradation of starch. Carbohydr. Res. 135:
Monna L, Kitazawa N, Yoshino R, Suzuki J, Masuda H, Maehara Y, Tanji M, Sato M, Nasu S and Minobe Y (2002) Positional cloning of rice semidwarfing gene, sd-1: Rice green revolution gene encodes a mutant enzyme involved in gibberellin synthesis. DNA. Res. 9,
Morell MK, Kosar-Hashemi B, Cmiel M, Samuel MS, Chandler P, Rahman S, Buleon A, Batey IL and Li Z (2003) Barley sex6 mutants lack starch synthase IIa activity and contain a starch with novel properties. Plant. J. 34,
Morozova O and Marra MA (2008) Applications of next-generation sequencing technologies in functional genomics. Genomics. 92,
Morrison WR (1988) Lipids in cereal starches: A review. J. Cereal. Sci. 8:1-15.
Morris CF (2002) Puroindolines: the molecular genetic basis of wheat grain hardness. Plant. Mol. Biol. 48,
Nakamura T, Vrinten P, Hayakawa K and Ikeda J (1998) Characterization of a granule-bound starch synthase isoform found in the pericarp of wheat. Plant. Physiol. 118,
Nakamura Y (2002) Towards a better understanding of the metabolic system for amylopectin biosynthesis in plants: rice endosperm as a model tissue. Plant. Cell. Physiol. 43,
Nakamura Y, Francisco PB, Hosaka Y, Sato A, Sawada T, Kubo A and Fujita N (2005) Essential amino acids of starch synthase IIa differentiate amylopectin structure and starch quality between japonica and indica rice varieties. Plant Mol. Biol. 58:
Nasu S, Suzuki J, Ohta R, Hasegawa K, Yui R, Kitazawa N, Monna L and Minobe Y (2002) Search for and analysis of single nucleotide polymorphisms (SNPs) in rice (Oryza sativa, Oryza rufipogon) and establishment of SNP markers. DNA Res. 9,
Ng PC and Henikoff S (2002) Accounting for human polymorphisms predicted to affect protein function. Cold Spring Harbor Laboratory Press, pp

Ng PC and Henikoff S (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic. Acids. Res. 31(13): 3812
Niewiadomski P, Knappe S, Geimer S, Fischer K, Schulz B, Unte US, Rosso MG, Ache P, Flügge UI and Schneider A (2005) The Arabidopsis plastidic glucose 6-phosphate/phosphate translocator GPT1 is essential for pollen maturation and embryo sac development. Plant. Cell. Online. 17:760.
Nishi A, Nakamura Y, Tanaka N and Satoh H (2001) Biochemical and Genetic Analysis of the Effects of amylose-extender Mutation in Rice Endosperm. Plant. Physiol. 127:459.
Novaes E, Drost DR, Farmerie WG, Pappas GJ, Grattapaglia D, Sederoff RR and Kirst M (2008) High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome. BMC. Genomics. 9:312.
Ohdan T, Francisco Jr PB, Sawada T, Hirose T, Terao T, Satoh H and Nakamura Y (2005) Expression profiling of genes involved in starch synthesis in sink and source organs of rice. J. Exp. Bot. 56,
Olivier M (2005) The Invader assay for SNP genotyping. Mutat. Res. 573,
Out AA, van Minderhout I, Goeman JJ, Ariyurek Y, Ossowski S, Schneeberger K, Weigel D, van Galen M, Taschner PEM and Tops CMJ (2009) Deep sequencing to reveal new variants in pooled DNA samples. Hum. Mutat. 30,
Panlasigui L, Thompson L, Juliano B, Perez C, Yiu S and Greenberg G (1991) Rice varieties with similar amylose content differ in starch digestibility and glycemic response in humans. Am. J. Clin. Nutr. 54:871.
Patron NJ, Smith AM, Fahy BF, Hylton CM, Naldrett MJ, Rossnagel BG and Denyer K (2002) The altered pattern of amylose accumulation in the endosperm of low-amylose barley cultivars is attributable to a single mutant allele of granule-bound starch synthase I with a deletion in the 5'-non-coding region. Plant. Physiol. 130:190.
Peng S, Huang J, Sheehy JE, Laza RC, Visperas RM, Zhong X, Centeno GS, Khush GS and Cassman KG (2004) Rice yields decline with higher night temperature from global warming. Proc. Natl. Acad. Sci. USA. 101:9971.
Perkel J (2008) SNP genotyping: six technologies that keyed a revolution. Nat Methods, 5,
Pertea M, Lin X, Salzberg SL (2001) GeneSplicer: a new computational method for splice site prediction. Nucleic. Acids. Res. 29(5):

Pesole G and Liuni S (1999) Internet resources for the functional analysis of 5' and 3' untranslated regions of eukaryotic mRNAs. Trends. Genet. 15(9):
Pettersson E, Lundeberg J and Ahmadian A (2009) Generations of sequencing technologies. Genomics. 93,
Philpot K, Martin M, Butardo Jr V, Willoughby D and Fitzgerald M (2006) Environmental factors that affect the ability of amylose to contribute to retrogradation in gels made from rice flour. J. Agri. Food. Chem. 54:
Raemakers K, Schreuder M, Suurs L, Furrer-Verhorst H, Vincken JP, Vetten N, Jacobsen E and Visser RGF (2005) Improved cassava starch by antisense inhibition of granule-bound starch synthase I. Mol. Breeding. 16:
Rafalski A (2002) Applications of single nucleotide polymorphisms in crop genetics. Curr. Opin. Plant. Biol. 5,
Ragaee S and Abdel-Aal ESM (2006) Pasting properties of starch and protein in selected cereals and quality of their food products. Food. Chem. 95:9-18.
Rahman S, Li Z, Batey I, Cochrane MP, Appels R and Morell M (2000) Genetic alteration of starch functionality in wheat. J. Cereal Sci. 31,
Rahman S, Regina A, Li Z, Mukai Y, Yamamoto M, Kosar-Hashemi B, Abrahams S and Morell MK (2001) Comparison of starch-branching enzyme genes reveals evolutionary relationships among isoforms. Characterization of a gene for starch-branching enzyme IIa from the wheat D genome donor Aegilops tauschii. Plant. Physiol. 125,
Rajasekaran R, Sudandiradoss C, Doss CGP, Sethumadhavan R (2007) Identification and in silico analysis of functional SNPs of the BRCA1 gene. Genomics. 90(4):
Rajasekaran R, George Priya Doss C, Sudandiradoss C, Ramanathan K, Rituraj P and Rao S (2008) Computational and structural investigation of deleterious functional SNPs in breast cancer BRCA2 gene. Chinese. J. Biotech. 24(5):
Rajesh S, Raveendran M and Manickam A (2008) Prediction of 3-dimensional structure of EMV1, a group 1 late embryogenesis abundant protein of Vigna radiata Wilczek. Plant. Omics. J. 1(1):
Ramensky V, Bork P and Sunyaev S (2002) Human non-synonymous SNPs: server and survey. Nucleic. Acids. Res. 30(17):
Rapley R and Harbron SE (2004) Molecular analysis and genome discovery. Sussex, UK: Wiley. 144

Ring SG, Gee JM, Whittam M, Orford P and Johnson IT (1988) Resistant starch: its chemical form in foodstuffs and effect on digestibility in vitro. Food. Chem. 28:
Roth C and Liberles DA (2006) A systematic search for positive selection in higher plants (Embryophytes). BMC. Plant. Biol. 6, 12.
Rowland-Bamford AJ, Allen LH, Baker JT and Boote K (1990) Carbon dioxide effects on carbohydrate status and partitioning in rice. J. Exp. Bot. 41:1601.
Rolletschek H, Hajirezaei MR, Wobus U and Weber H (2002) Antisense-inhibition of ADP-glucose pyrophosphorylase in Vicia narbonensis seeds increases soluble sugars and leads to higher water and nitrogen uptake. Planta. 214,
Rolletschek H, Nguyen TH, Häusler RE, Rutten T, Göbel C, Feussner I, Radchuk R, Tewes A, Claus B and Klukas C (2007) Antisense inhibition of the plastidial glucose 6-phosphate/phosphate translocator in Vicia seeds shifts cellular differentiation and promotes protein storage. Plant. J. 51:
Saito M, Konda M, Vrinten P, Nakamura K and Nakamura T (2004) Molecular comparison of waxy null alleles in common wheat and identification of a unique null allele. Theor. Appl. Genet. 108:
Sajilata M, Singhal RS and Kulkarni PR (2006) Resistant starch - a review. Comprehensive. Rev. Food. Sci. F. 5:1-17.
Sano Y (1984) Differential regulation of waxy gene expression in rice endosperm. Theor. Appl. Genet. 68(5):
Sasaki A, Ashikari M, Ueguchi-Tanaka M, Itoh H, Nishimura A, Swapan D, Ishiyama K, Saito T, Kobayashi M and Khush GS (2002) A mutant gibberellin-synthesis gene in rice. Nature. 416,
Sato K, Inaba K and Tozawa M (1973) High temperature injury of ripening in rice plant. I. The effects of high temperature treatments at different stages of panicle development on the ripening, pp
Sato K (1984) Starch granules in tissues of rice plants and their changes in relation to plant growth. Jap. Agric. Res. Quarter. 18:
Satoh H, Nishi A, Yamashita K, Takemoto Y, Tanaka Y, Hosaka Y, Sakurai A, Fujita N and Nakamura Y (2003) Starch-branching enzyme I-deficient mutation specifically affects the structure and properties of starch in rice endosperm. Plant. Physiol. 133,

Schofield J and Greenwell P (1987) Wheat starch granule proteins and their technological significance. Cereals in a European context. First European Conference on Food Science and Technology, Morton, I.D. (eds.). New York, NY (USA): VCH, ISBN p
Schuster SC (2008) Next-generation sequencing transforms today's biology. Nat. Methods. 5, 1,
Seo B, Kim S, Scott MP, Singletary GW, Wong K, James MG and Myers AM (2002) Functional interactions between heterologously expressed starch-branching enzymes of maize and the glycogen synthases of brewer's yeast. Plant. Physiol. 128,
Sequenom I (2006) iPLEX Gold Assay for SNP Genotyping. Biotechniques. Protocol. Guide. 41. San Diego, CA: Sequenom.
Shapter FM, Eggler P, Lee LS and Henry RJ (2009) Variation in Granule Bound Starch Synthase I (GBSSI) loci amongst Australian wild cereal relatives (Poaceae). J. Cereal. Sci. 49:4-11.
Shastry BS (2002) SNP alleles in human disease and evolution. J. Hum. Genet. 47(11):
Shen J, Deininger PL and Zhao H (2006) Applications of computational algorithm tools to identify functional SNPs in cytokine genes. Cytokine. 35(1-2)
Sheng F, Jia X, Yep A, Jack P and Geiger JH (2009) The crystal structures of the open and catalytically competent closed conformation of Escherichia coli glycogen synthase. J. Biol. Chem. 284(26):
Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K (2001) dbSNP: the NCBI database of genetic variation. Nucleic. Acids. Res. 29(1):
Siebert PD and Larrick JW (1992) Competitive PCR. Nature 359:
Singh N, Pal N, Mahajan G, Singh S and Shevkani K (2011) Rice grain and starch properties: effects of nitrogen fertilizer application. Carbohyd. Polym. 86 (1)
Sinha S and Tompa M (2002) Discovery of novel transcription factor binding sites by statistical overrepresentation. Nucleic. Acids. Res. 30(24):
Smith AM (1999) Making starch. Curr. Opin. Plant. Biol. 2:
Soussi T, Asselain B, Hamroun D, Kato S, Ishioka C, Claustres M and Beroud C (2006) Meta-analysis of the p53 mutation database for mutant p53 biological activity reveals a methodologic bias in mutation detection. Clin. Cancer. Res. 12:

Spielmeyer W, Ellis MH and Chandler PM (2002) Semidwarf (sd-1), green revolution rice, contains a defective gibberellin 20-oxidase gene. Proc. Natl. Acad. Sci. USA 99:
Sun C, Sathish P, Ahlandsberg S, Deiber A and Jansson C (1997) Identification of four starch branching enzymes in barley endosperm: partial purification of forms I, IIa and IIb. New. Phytol. 137,
Sun C, Sathish P, Ahlandsberg S and Jansson C (1998) The two genes encoding starch-branching enzymes IIa and IIb are differentially expressed in barley. Plant. Physiol. 118,
Sunyaev SR, Lathe WC, Ramensky VE and Bork P (2000) SNP frequencies in human genes - an excess of rare alleles and differing modes of selection. Trends. Genet. 16,
Tacke R and Manley JL (1999) Determinants of SR protein specificity. Curr. Opin. Cell. Biol. 11(3):
Takeda Y, Guan HP and Preiss J (1993) Branching of amylose by the branching isoenzymes of maize endosperm. Carbohydr. Res. 240,
Tanaka N, Fujita N, Nishi A, Satoh H, Hosaka Y, Ugaki M, Kawasaki S and Nakamura Y (2004) The structure of starch can be manipulated by changing the expression levels of starch branching enzyme IIb in rice endosperm. Plant. Biotech. J. 2,
Tashiro T and Wardlaw I (1991) The effect of high temperature on kernel dimensions and the type and occurrence of kernel damage in rice. Aust. J. Agric. Res. 42:
Tester R, Morrison W, Ellis R, Piggo J, Batts G, Wheeler T, Morison J, Hadley P and Ledward D (1995) Effects of elevated growth temperature and carbon dioxide levels on some physicochemical properties of wheat starch. J. Cereal. Sci. 22:
Tester RF and Morrison WR (1990) Swelling and gelatinization of cereal starches. I. Effects of amylopectin, amylose, and lipids. Cereal. Chem. 67:
Tester RF, Karkalas J and Qi X (2004) Starch - composition, fine structure and architecture. J. Cereal. Sci. 39:
Tetlow IJ, Wait R, Lu Z, Akkasaeng R, Bowsher CG, Esposito S, Kosar-Hashemi B, Morell MK and Emes MJ (2004) Protein phosphorylation in amyloplasts regulates starch branching enzyme activity and protein-protein interactions. Plant. Cell. Online. 16,
Thomas RK, Nickerson E, Simons JF, Jänne PA, Tengs T, Yuza Y, Garraway LA, LaFramboise T, Lee JC and Shah K (2006) Sensitive mutation detection in heterogeneous cancer specimens by massively parallel picoliter reactor sequencing. Nature. Med. 12,

Umemoto T, Yano M, Satoh H, Shomura A and Nakamura Y (2002) Mapping of a gene responsible for the difference in amylopectin structure between japonica-type and indica-type rice varieties. Theor. Appl. Genet. 104:1-8.
Umemoto T, Aoki N, Lin H, Nakamura Y, Inouchi N, Sato Y, Yano M, Hirabayashi H and Maruyama S (2004) Natural variation in rice starch synthase IIa affects enzyme and starch properties. Funct. Plant Biol. 31,
Umemoto T and Aoki N (2005) Single-nucleotide polymorphisms in rice starch synthase IIa that alter starch gelatinisation and starch association of the enzyme. Funct. Plant Biol. 32,
Umemoto T, Horibata T, Aoki N, Hiratsuka M, Yano M and Inouchi N (2008) Effects of variations in starch synthase on starch properties and eating quality of rice. Plant Prod. Sci. 11,
Varley KE and Mitra RD (2008) Nested Patch PCR enables highly multiplexed mutation discovery in candidate genes. Genome Res. 18,
Velicer GJ, Raddatz G, Keller H, Deiss S, Lanz C, Dinkelacker I and Schuster SC (2006) Comprehensive mutation identification in an evolved bacterial cooperator and its cheating ancestor. Proc. Natl. Acad. Sci. USA. 103 (21)
Visser RGF, Somhorst I, Kuipers GJ, Ruys NJ, Feenstra WJ and Jacobsen E (1991) Inhibition of the expression of the gene for granule-bound starch synthase in potato by antisense constructs. Mol. General. Genet. 225:
Vrinten PL and Nakamura T (2000) Wheat granule-bound starch synthase I and II are encoded by separate genes that are expressed in different tissues. Plant. Physiol. 122,
Wakao S, Andre C and Benning C (2008) Functional analyses of cytosolic glucose-6-phosphate dehydrogenases and their contribution to seed oil accumulation in Arabidopsis. Plant. Physiol. 146:277.
Wang YJ, White P, Pollak L and Jane J (1993) Characterization of starch structures of 17 maize endosperm mutant genotypes with Oh43 inbred line background. Cereal Chem. 70:
Wang Z, Rolish ME, Yeo G, Tung V, Mawson M and Burge CB (2004) Systematic identification and analysis of exonic splicing silencers. Cell. 119(6):
Wang Z, Xiao X, Van Nostrand E and Burge CB (2006) General and specific functions of exonic splicing silencers in splicing control. Mol. Cell. 23(1):
Waters DLE, Henry RJ, Reinke RF and Fitzgerald MA (2006) Gelatinization temperature of rice explained by polymorphisms in starch synthase. Plant. Biotechnol. J. 4,

Waters DLE and Henry RJ (2007) Genetic manipulation of starch properties in plants: patents Recent. Pat. Biotechnol. 1(3):
Webb BD (1991) Rice quality and grades. In: Rice (Bor S. Luh ed.), pp College Station, TX: USDA, Rice Quality Laboratory, Texas A&M University.
Yamakawa H, Hirose T, Kuroda M and Yamaguchi T (2007) Comprehensive expression profiling of rice grain filling-related genes under high temperature using DNA microarray. Plant. Physiol. 144,
Yamakawa H, Ebitani T and Terao T (2008) Comparison between locations of QTLs for grain chalkiness and genes responsive to high temperature during grain filling on the rice chromosome map. Breed. Sci. 58,
Yamamori M, Kato M, Yui M and Kawasaki M (2006) Resistant starch and starch pasting properties of a starch synthase IIa-deficient wheat with apparent high amylose. Aust J Agri Res. 57:
Yan CJ, Tian ZX, Fang YW, Yang YC, Li J, Zeng SY, Gu SL, Xu CW, Tang SZ and Gu MH (2010) Genetic analysis of starch paste viscosity parameters in glutinous rice (Oryza sativa L.). Theor. Appl. Genet. 122(1)
Yu J, Hu S, Wang J, Wong GKS, Li S, Liu B, Deng Y, Dai L, Zhou Y and Zhang X (2002) A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 296,
Yun SH and Matheson NK (1993) Structures of the amylopectins of waxy, normal, amylose-extender, and wx: ae genotypes and of the phytoglycogen of maize. Carbohydr. Res. 243,
Zakaria S, Matsuda T, Tajima S and Nitta Y (2002) Effect of high temperature at ripening stage on the reserve accumulation in seed in some rice cultivars. Plant Production Science-Tokyo 5:

Appendices 1-8: Attached

Appendices: Chapter 2
Appendix 1: Full list of discovered SNPs/Indels in 17 studied starch-related genes.
Appendix 2: Full list of Australian breeding lines (population) and their pedigree information.
Appendix 3: Target genes and sequence of gene-specific LR-PCR primers.
Appendix 4: SNP/Indel distribution and short read coverage pattern across candidate loci.

Appendices: Chapter 4
Appendix 5: Full list of 233 studied Australian rice genotypes and their pedigree information.
Appendix 6: Name and characteristics of SNPs genotyped in the rice population.
Appendix 7: The results of the association study among 13 physicochemical traits and SNPs of 18 different starch-related genes.
Appendix 8: Linkage map of 17 starch-related genes, showing the approximate chromosomal location of each gene.

Appendix 1: Full list of discovered SNPs/Indels in 17 starch related genes

Columns: Gene; Reference position; Consensus position; Variation type; Length; Reference; Variants; Allele variations; Frequencies; Counts; Coverage; Variant #1; Frequency of #1; Count of #1; Variant #2; Frequency of #2; Count of #2; Overlapping annotations; Amino acid change; Quality

AGPS2b SNP 1 C 2 C/T 94.7/ / C T Gene: Os08g N/A High
AGPS2b SNP 1 G 2 G/C 95.2/ / G C Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/A 99.0/ / C A Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/T 94.8/ / C T Gene: Os08g N/A High
AGPS2b SNP 1 T 2 T/C 98.9/ / T C Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/G 94.9/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 G 2 G/A 94.8/ / G A Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/G 99.0/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/A 98.3/ / C A Gene: Os08g N/A High
AGPS2b SNP 1 T 2 T/A 95.0/ / T A Gene: Os08g N/A High
AGPS2b SNP 1 T 2 T/G 98.9/1.1 90/1 91 T G Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/T 97.6/ / C T Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/T 99.3/ / C T Gene: Os08g N/A Low
AGPS2b SNP 1 G 2 G/T 98.6/ / G T Gene: Os08g N/A High
AGPS2b SNP 1 T 2 T/A 99.3/ / T A Gene: Os08g N/A Low
AGPS2b SNP 1 C 2 C/T 99.2/ / C T Gene: Os08g N/A Low
AGPS2b SNP 1 C 2 C/A 99.2/ / C A Gene: Os08g N/A Low
AGPS2b SNP 1 A 2 A/G 98.6/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/G 95.7/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 T 2 T/A 95.6/ / T A Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/A 97.5/ / C A Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/G 95.8/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/T 98.6/ / A T Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/G 98.6/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/G 96.1/ / A G Gene: Os08g N/A High
AGPS2b SNP 1 A 2 A/T 97.5/ / A T Gene: Os08g N/A High
AGPS2b SNP 1 T 2 T/G 96.4/ / T G Gene: Os08g N/A High
AGPS2b SNP 1 C 2 C/T 98.6/ / C T Gene: Os08g N/A High
AGPS2b SNP 1 A 2 C/A 77.1/ / C A Gene: Os08g N/A High
AGPS2b SNP 1 G 2 G/C 99.2/ / G C Gene: Os08g N/A Low
BEI SNP 1 G 2 G/A 89.2/ / G A Gene: Os06g , CDS: Os06g0 N/A High
BEI SNP 1 C 2 C/T 88.9/ / C T Gene: Os06g N/A High
BEI SNP 1 G 2 G/A 88.7/ / G A Gene: Os06g N/A High
BEI SNP 1 C 2 C/T 88.3/ / C T Gene: Os06g N/A High
BEI SNP 1 C 2 C/A 88.9/ / C A Gene: Os06g N/A High
BEI SNP 1 A 2 A/G 89.0/ / A G Gene: Os06g N/A High
BEI SNP 1 C 2 C/T 94.5/ / C T Gene: Os06g , CDS: Os06g0 Gly607Asp High
BEI SNP 1 C 2 C/T 88.9/ / C T Gene: Os06g N/A High
BEI SNP 1 G 2 G/A 94.3/ / G A Gene: Os06g N/A High
BEI SNP 1 G 2 G/T 93.9/ / G T Gene: Os06g , CDS: Os06g0 N/A High
BEI SNP 1 G 2 G/A 88.5/ / G A Gene: Os06g N/A High
BEI SNP 1 G 2 G/C 87.3/ / G C Gene: Os06g N/A High
BEI SNP 1 G 2 G/A 86.8/ / G A Gene: Os06g N/A High
BEI SNP 1 A 2 A/G 97.8/ / A G Gene: Os06g N/A High
BEI SNP 1 T 2 T/G 91.8/ / T G Gene: Os06g N/A High
BEI SNP 1 G 2 G/A 86.8/ / G A Gene: Os06g N/A High
BEI SNP 1 A 2 A/G 99.4/ / A G Gene: Os06g N/A Low
BEI SNP 1 G 2 G/T 99.0/ / G T Gene: Os06g , CDS: Os06g0 N/A High
BEIIa SNP 1 N 1 T T Gene: Os04g N/A Low
BEIIa SNP 1 G 1 T T Gene: Os04g N/A Low
BEIIa SNP 1 G 1 T T Gene: Os04g N/A Low
BEIIa SNP 1 G 1 T T Gene: Os04g N/A Low
BEIIa SNP 1 A 1 T T Gene: Os04g , CDS: Os04g0 N/A Low
BEIIa SNP 1 T 2 T/G 97.2/ / T G Gene: Os04g , CDS: Os04g0 Tyr140Ser High
BEIIb SNP 1 T 2 G/T 92.1/ / G T Gene: Os02g N/A High
BEIIb SNP 1 T 2 T/C 99.2/ / T C Gene: Os02g N/A Low
BEIIb SNP 1 A 2 C/A 90.5/ / C A Gene: Os02g N/A High
BEIIb SNP 1 C 2 T/C 90.5/ / T C Gene: Os02g N/A High
BEIIb SNP 1 C 2 T/C 90.0/ / T C Gene: Os02g N/A High
BEIIb SNP 1 A 2 G/A 89.8/ / G A Gene: Os02g N/A High
BEIIb SNP 1 T 2 T/A 99.5/ / T A Gene: Os02g N/A Low
BEIIb SNP 1 T 2 T/A 99.1/ / T A Gene: Os02g N/A Low
BEIIb SNP 1 A 2 C/A 89.7/ / C A Gene: Os02g N/A High
BEIIb SNP 1 C 2 G/C 89.9/ / G C Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 86.8/ / C T Gene: Os02g N/A High

BEIIb SNP 1 A 2 C/A 90.1/ / C A Gene: Os02g N/A High
BEIIb SNP 1 C 2 A/C 90.0/ / A C Gene: Os02g N/A High
BEIIb SNP 1 C 2 T/C 90.6/ / T C Gene: Os02g N/A High
BEIIb SNP 1 A 2 G/A 89.9/ / G A Gene: Os02g N/A High
BEIIb SNP 1 A 2 A/G 99.4/ / A G Gene: Os02g N/A Low
BEIIb SNP 1 A 2 C/A 90.4/ / C A Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 A 2 A/G 99.1/ / A G Gene: Os02g , CDS: Os02g0 N/A Low
BEIIb SNP 1 G 2 T/G 85.0/ / T G Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 T 2 C/T 86.5/ / C T Gene: Os02g N/A High
BEIIb SNP 1 C 2 A/C 86.4/ / A C Gene: Os02g N/A High
BEIIb SNP 1 G 2 A/G 86.1/ / A G Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 87.0/ / C T Gene: Os02g N/A High
BEIIb SNP 1 T 2 A/T 86.2/ / A T Gene: Os02g N/A High
BEIIb SNP 1 G 2 A/G 84.3/ / A G Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 84.2/ / C T Gene: Os02g N/A High
BEIIb SNP 1 T 2 T/C 99.5/ / T C Gene: Os02g N/A Low
BEIIb SNP 1 C 2 C/G 93.4/ / C G Gene: Os02g N/A High
BEIIb SNP 1 C 2 C/T 98.4/ / C T Gene: Os02g , CDS: Os02g0 Val403Ile High
BEIIb SNP 1 A 2 A/G 98.1/ / A G Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 A 2 T/A 86.1/ / T A Gene: Os02g N/A High
BEIIb SNP 1 T 2 T/C 96.1/ / T C Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 C 2 C/T 96.1/ / C T Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 T 2 T/C 92.7/ / T C Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 G 2 G/A 89.9/ / G A Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 A 2 A/T 89.9/ / A T Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 G 2 G/A 96.4/ / G A Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 T 2 T/A 96.4/ / T A Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 C 2 C/T 99.1/ / C T Gene: Os02g , CDS: Os02g0 N/A Low
BEIIb SNP 1 C 2 T/C 84.8/ / T C Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 85.1/ / C T Gene: Os02g N/A High
BEIIb SNP 1 C 2 T/C 85.5/ / T C Gene: Os02g N/A High
BEIIb SNP 1 A 2 T/A 87.7/ / T A Gene: Os02g , CDS: Os02g0 N/A High
BEIIb SNP 1 C 2 T/C 85.2/ / T C Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 60.3/ / C T Gene: Os02g N/A High
BEIIb SNP 1 A 2 A/T 57.2/ / A T Gene: Os02g N/A High
BEIIb SNP 1 G 2 A/G 84.0/ / A G Gene: Os02g N/A High
BEIIb SNP 1 T 2 A/T 84.4/ / A T Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 83.6/ / C T Gene: Os02g , CDS: Os02g0 His196Arg High
BEIIb SNP 1 A 2 C/A 84.8/ / C A Gene: Os02g N/A High
BEIIb SNP 1 T 2 C/T 87.0/ / C T Gene: Os02g N/A High
BEIIb SNP 1 A 2 C/A 85.1/ / C A Gene: Os02g , CDS: Os02g0 Leu94Val High
BEIIb SNP 1 C 2 C/A 99.2/ /1 126 C A Gene: Os02g , CDS: Os02g0 N/A Low
Isoamylase SNP 1 A 2 A/G 54.2/ / A G Gene: OSJNBa0014C03.3, CDS: OS N/A High
Isoamylase SNP 1 A 2 A/G 53.3/ / A G Gene: OSJNBa0014C03.3, CDS: OS N/A High
Isoamylase SNP 1 T 2 T/C 51.9/ / T C Gene: OSJNBa0014C03.3, CDS: OS Thr482Ala High
Isoamylase SNP 1 C 2 C/T 59.7/ / C T Gene: OSJNBa0014C03.3, CDS: OS N/A High
Isoamylase SNP 1 G 2 G/A 65.8/ / G A Gene: OSJNBa0014C03.3, CDS: OS N/A High
Isoamylase SNP 1 C 2 C/A 66.1/ / C A Gene: OSJNBa0014C03.3, CDS: OS Arg231Leu High
Isoamylase SNP 1 G 2 G/T 65.3/ / G T Gene: OSJNBa0014C03.3, CDS: OS Leu122Met High
Isoamylase SNP 1 T 2 T/G 85.7/ / T G Gene: OSJNBa0014C03.3, CDS: OS Thr113Pro High
Isoamylase SNP 1 C 2 C/T 99.5/ /1 184 C T Gene: OSJNBa0014C03.3, CDS: OS N/A Low
Isoamylase SNP 1 C 2 C/A 99.5/ /1 197 C A Gene: OSJNBa0014C03.3, CDS: OS Gly92Cys Low
Isoamylase SNP 1 C 2 C/A 98.3/1.7 58/1 59 C A Gene: OSJNBa0014C03.3, CDS: OS N/A High
Isoamylase SNP 1 C 2 C/A 98.3/1.7 59/1 60 C A Gene: OSJNBa0014C03.3, CDS: OS Gly81Trp High
Isoamylase SNP 1 C 2 C/T 97.6/2.4 41/1 42 C T Gene: OSJNBa0014C03.3, CDS: OS Glu79Lys High
Isoamylase SNP 1 C 2 C/T 97.8/2.2 45/1 46 C T Gene: OSJNBa0014C03.3, CDS: OS N/A High
Isoamylase SNP 1 C 2 C/A 98.6/1.4 72/1 73 C A Gene: OSJNBa0014C03.3, CDS: OS Gly64Trp High
Isoamylase SNP 1 C 2 C/A 98.0/2.0 50/1 51 C A Gene: OSJNBa0014C03.3, CDS: OS Gly63Cys High
GBSSI SNP 1 A 2 A/C 99.1/ / A C Gene: Os06g , CDS: Os06g0 Tyr224Ser Low
GBSSII SNP 1 T 2 T/A 97.3/ / T A Gene: Os07g , mrna: Os07 N/A High
GBSSII SNP 1 A 2 A/G 97.5/ / A G Gene: Os07g , CDS: Os07g0 Leu523Ser High
GBSSII SNP 1 C 2 C/G 97.9/ / C G Gene: Os07g N/A High
GBSSII SNP 1 C 2 C/G 98.0/ / C G Gene: Os07g N/A High
GPT SNP 1 G 2 G/A 95.2/ / G A Gene: Os08g N/A High
GPT SNP 1 C 2 C/G 75.0/ / C G Gene: Os08g N/A High
GPT SNP 1 A 2 A/G 95.1/ / A G Gene: Os08g N/A High
GPT SNP 1 G 2 G/T 97.9/ / G T Gene: Os08g N/A High
GPT SNP 1 C 2 C/T 94.9/ / C T Gene: Os08g , CDS: Os08g0 N/A High
GPT SNP 1 C 2 C/T 95.8/ / C T Gene: Os08g , CDS: Os08g0 Leu42Phe High

GPT SNP 1 G 2 G/A 95.6/ / G A Gene: Os08g , CDS: Os08g0 N/A High
GPT SNP 1 C 2 C/G 96.2/ / C G Gene: Os08g N/A High
GPT SNP 1 A 2 A/G 68.6/ / A G Gene: Os08g , CDS: Os08g0 N/A High
GPT SNP 1 C 2 C/T 73.5/ / C T Gene: Os08g N/A High
GPT SNP 1 T 2 T/G 95.9/ / T G Gene: Os08g N/A High
GPT SNP 1 G 2 G/A 95.7/ / G A Gene: Os08g N/A High
GPT SNP 1 A 2 A/G 73.0/ / A G Gene: Os08g N/A High
GPT SNP 1 A 2 A/G 95.8/ / A G Gene: Os08g N/A High
GPT SNP 1 T 2 T/C 96.8/ / T C Gene: Os08g N/A High
GPT SNP 1 C 2 C/A 97.0/ / C A Gene: Os08g N/A High
Pullulanase SNP 1 C 2 C/T 99.0/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/G 98.8/ / T G Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.8/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/C 99.1/ / A C Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/C 98.9/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.9/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.9/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/A 99.0/ / T A Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/G 99.1/ / A G Gene: Os04g N/A Low
Pullulanase SNP 1 G 2 G/C 99.1/ / G C Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/G 98.9/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/A 98.7/ / T A Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.7/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.7/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.8/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/T 98.7/ / G T Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/T 99.3/ / G T Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/G 98.8/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.7/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.7/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.7/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.7/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.4/ / T C Gene: Os04g , CDS: Os04g0 N/A High
Pullulanase SNP 1 T 2 T/C 99.1/ / T C Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/T 99.4/ / A T Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/T 98.2/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/G 98.8/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/T 99.4/ / G T Gene: Os04g N/A Low
Pullulanase SNP 1 G 2 G/A 98.5/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 99.1/ / G A Gene: Os04g , CDS: Os04g0 Ser217Asn Low
Pullulanase SNP 1 G 2 G/A 98.4/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/C 98.3/ / G C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/A 98.2/ / T A Gene: Os04g N/A High
Pullulanase SNP 1 A 2 G/A 64.7/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.4/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/G 98.5/ / T G Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.5/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.9/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/A 98.5/ / T A Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.6/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.2/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.4/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.4/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 99.4/ / C T Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/G 99.5/ / A G Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/T 99.3/ / C T Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/T 98.0/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 99.3/ / G A Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/G 98.9/ / T G Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 99.4/ / G A Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/G 99.2/ / C G Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/C 97.9/ / A C Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.2/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 99.0/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/G 99.2/ / C G Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/A 99.4/ / T A Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/G 98.7/ / C G Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.3/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.3/ / T C Gene: Os04g N/A High

Pullulanase SNP 1 C 2 C/A 98.5/ / C A Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/G 99.4/ / A G Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/T 99.0/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/C 99.4/ / A C Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/G 98.5/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/C 98.9/ / G C Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 99.1/ / C T Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/C 98.9/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.9/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/G 98.8/ / T G Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 99.0/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/C 99.0/ / A C Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.8/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.8/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.8/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/G 98.8/ / T G Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/A 99.0/ / T A Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 98.8/ / C T Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.6/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/G 98.6/ / C G Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/T 98.7/ / G T Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/A 98.8/ / T A Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 99.4/ / C T Gene: Os04g N/A Low
Pullulanase SNP 1 G 2 G/C 98.7/ / G C Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/G 98.9/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/C 98.6/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 99.5/ / G A Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/C 99.3/ / T C Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/C 98.6/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 T 2 T/G 97.7/ / T G Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/T 97.8/ / A T Gene: Os04g N/A High
Pullulanase SNP 1 C 2 C/T 99.2/ / C T Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/T 99.2/ / C T Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/G 99.1/ / C G Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/G 98.9/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/T 98.6/ / A T Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 98.7/ / G A Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/T 98.8/ / G T Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/G 98.4/ / A G Gene: Os04g N/A High
Pullulanase SNP 1 A 2 A/T 99.3/ / A T Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/A 99.4/ / T A Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 C/A 99.4/ / C A Gene: Os04g N/A Low
Pullulanase SNP 1 G 2 G/A 99.3/ / G A Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 A/T 67.2/ / A T Gene: Os04g N/A High
Pullulanase SNP 1 C 2 T/C 69.2/ / T C Gene: Os04g N/A High
Pullulanase SNP 1 G 2 G/A 99.4/ / G A Gene: Os04g N/A Low
Pullulanase SNP 1 A 2 A/G 99.4/ / A G Gene: Os04g N/A Low
Pullulanase SNP 1 G 2 G/C 99.4/ / G C Gene: Os04g N/A Low
Pullulanase SNP 1 T 2 T/A 99.5/ / T A Gene: Os04g N/A Low
Pullulanase SNP 1 C 2 T/C 68.3/ / T C Gene: Os04g N/A High
SPHOL SNP 1 T 1 A A Gene: Os03g , mrna: Os03 N/A Low
SPHOL SNP 1 G 1 A A Gene: Os03g N/A Low
SPHOL SNP 1 C 1 A A Gene: Os03g N/A Low
SPHOL SNP 1 A 1 T T Gene: Os03g N/A Low
SPHOL SNP 1 C 2 C/G 98.9/1.1 89/1 90 C G Gene: Os03g N/A High
SSI SNP 1 A 2 A/G 99.5/ / A G Gene: Os06g N/A Low
SSI SNP 1 T 2 T/A 99.5/ / T A Gene: Os06g N/A Low
SSI SNP 1 G 2 G/A 99.5/ / G A Gene: Os06g N/A Low
SSI SNP 1 C 2 C/T 99.4/ / C T Gene: Os06g N/A Low
SSI SNP 1 G 2 G/T 99.3/ / G T Gene: Os06g N/A Low
SSI SNP 1 C 2 C/A 99.1/ / C A Gene: Os06g , mrna: Os06 N/A Low
SSI SNP 1 C 2 C/A 99.4/ / C A Gene: Os06g N/A Low
SSI SNP 1 G 2 G/A 99.5/ / G A Gene: Os06g N/A Low
SSI SNP 1 A 2 A/C 99.4/ / A C Gene: Os06g N/A Low
SSI SNP 1 A 2 A/G 93.5/ / A G Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 99.4/ / T C Gene: Os06g , mrna: Os06 N/A Low
SSI SNP 1 T 2 T/C 98.1/ / T C Gene: Os06g N/A High
SSI SNP 1 T 2 T/A 98.2/ / T A Gene: Os06g N/A High
SSI SNP 1 A 2 A/T 98.5/ / A T Gene: Os06g N/A High

SSI SNP 1 G 2 G/A 98.3/ / G A Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.1/ / C T Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 98.0/ / G A Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.1/ / C T Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 98.1/ / T C Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.1/ / C T Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.1/ / C T Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.1/ / C T Gene: Os06g N/A High
SSI SNP 1 T 2 T/G 98.2/ / T G Gene: Os06g N/A High
SSI SNP 1 T 2 T/G 98.1/ / T G Gene: Os06g N/A High
SSI SNP 1 A 2 A/C 98.2/ / A C Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 98.4/ / G A Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.4/ / C T Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.0/ / C T Gene: Os06g N/A High
SSI SNP 1 G 2 G/T 97.9/ / G T Gene: Os06g N/A High
SSI SNP 1 A 2 A/T 98.0/ / A T Gene: Os06g , mrna: Os06 N/A High
SSI SNP 1 G 2 G/A 97.6/ / G A Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 97.7/ / G A Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 98.3/ / G A Gene: Os06g N/A High
SSI SNP 1 G 2 G/T 98.0/ / G T Gene: Os06g , mrna: Os06 N/A High
SSI SNP 1 A 2 A/G 98.0/ / A G Gene: Os06g , mrna: Os06 N/A High
SSI SNP 1 T 2 T/C 97.7/ / T C Gene: Os06g N/A High
SSI SNP 1 C 2 C/G 97.5/ / C G Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 98.0/ / G A Gene: Os06g N/A High
SSI SNP 1 A 2 A/C 98.0/ / A C Gene: Os06g N/A High
SSI SNP 1 G 2 G/T 98.2/ / G T Gene: Os06g N/A High
SSI SNP 1 C 2 C/A 98.1/ / C A Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 97.5/ / C T Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 97.7/ / G A Gene: Os06g N/A High
SSI SNP 1 A 2 A/T 97.8/ / A T Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 97.8/ / C T Gene: Os06g N/A High
SSI SNP 1 G 2 G/T 98.0/ / G T Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 97.9/ / T C Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 97.8/ / C T Gene: Os06g N/A High
SSI SNP 1 G 2 G/A 97.8/ / G A Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 97.8/ / T C Gene: Os06g N/A High
SSI SNP 1 A 2 A/G 98.0/ / A G Gene: Os06g N/A High
SSI SNP 1 A 2 A/G 98.1/ / A G Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 98.2/ / T C Gene: Os06g , mrna: Os06 N/A High
SSI SNP 1 G 2 G/A 98.2/ / G A Gene: Os06g , mrna: Os06 N/A High
SSI SNP 1 A 2 A/G 98.3/ / A G Gene: Os06g , mrna: Os06 N/A High
SSI SNP 1 C 2 C/T 98.2/ / C T Gene: Os06g N/A High
SSI SNP 1 A 2 A/C 98.8/ / A C Gene: Os06g N/A High
SSI SNP 1 C 2 C/G 98.7/ / C G Gene: Os06g N/A High
SSI SNP 1 T 2 T/G 98.1/ / T G Gene: Os06g N/A High
SSI SNP 1 A 2 A/G 97.3/ / A G Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 97.9/ / T C Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 98.5/ / C T Gene: Os06g N/A High
SSI SNP 1 A 2 A/G 98.6/ / A G Gene: Os06g N/A High
SSI SNP 1 C 2 C/A 97.9/ / C A Gene: Os06g N/A High
SSI SNP 1 A 2 A/C 97.7/ / A C Gene: Os06g N/A High
SSI SNP 1 T 2 T/A 98.3/ / T A Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 98.3/ / T C Gene: Os06g N/A High
SSI SNP 1 T 2 T/C 97.4/ / T C Gene: Os06g N/A High
SSI SNP 1 A 2 A/G 98.0/ / A G Gene: Os06g N/A High
SSI SNP 1 G 2 G/C 92.8/ / G C Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 97.7/ / C T Gene: Os06g N/A High
SSI SNP 1 C 2 C/T 97.8/ / C T Gene: Os06g N/A High
SSI SNP 1 A 2 A/G 97.8/ / A G Gene: Os06g N/A High
SSIIa SNP 1 G 2 G/T 97.8/2.2 44/1 45 G T Gene: Os06g , CDS: Os06g , mrnalow
SSIIa SNP 1 G 2 G/T 98.4/1.6 61/1 62 G T Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 98.9/1.1 92/1 93 G T Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 97.9/2.1 95/2 97 G T Gene: Os06g , CDS: Os06g0 Gly23Stp Low
SSIIa Complex S 1 G 3 G/C/T 95.5/2.2/2.2 85/2/2 89 G C Gene: Os06g , CDS: Os06g0 Gly24Ala,Val Low
SSIIa Complex S 1 G 3 G/T/A 90.8/8.0/1.1 79/7/1 87 G T Gene: Os06g , CDS: Os06g0 N/A High
SSIIa Complex S 1 G 3 G/T/A 87.2/11.5/1.3 68/9/1 78 G T Gene: Os06g , CDS: Os06g0 Arg26Met,Lys Low
SSIIa SNP 1 G 2 G/T 96.2/3.8 77/3 80 G T Gene: Os06g , CDS: Os06g0 Arg26Ser Low
SSIIa SNP 1 G 2 G/A 98.8/1.2 79/1 80 G A Gene: Os06g , CDS: Os06g0 Arg27Lys Low
SSIIa Complex S 1 G 3 G/T/A 93.8/4.9/1.2 76/4/1 81 G T Gene: Os06g , CDS: Os06g0 Arg27Ser,Arg Low

SSIIa SNP 1 G 2 G/T 96.3/3.7 79/3 82 G T Gene: Os06g , CDS: Os06g0 Gly28Trp Low
SSIIa SNP 1 G 2 G/T 95.5/4.5 84/4 88 G T Gene: Os06g , CDS: Os06g0 Gly28Val Low
SSIIa Complex S 1 G 3 G/T/A 95.3/3.5/1.2 82/3/1 86 G T Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 90.9/9.1 80/8 88 G T Gene: Os06g , CDS: Os06g0 Arg29Ser Low
SSIIa Complex S 1 G 3 G/T/A 93.1/4.6/2.3 81/4/2 87 G T Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 97.6/2.4 83/2 85 G T Gene: Os06g , CDS: Os06g0 Val31Leu Low
SSIIa SNP 1 G 2 G/T 97.6/2.4 80/2 82 G T Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 97.5/2.5 78/2 80 G T Gene: Os06g , CDS: Os06g0 Gly32Cys Low
SSIIa SNP 1 G 2 G/T 97.4/2.6 76/2 78 G T Gene: Os06g , CDS: Os06g0 Gly32Val Low
SSIIa SNP 1 G 2 G/T 95.7/4.3 67/3 70 G T Gene: Os06g , CDS: Os06g0 Ala34Ser Low
SSIIa SNP 1 C 2 C/A 98.8/1.2 84/1 85 C A Gene: Os06g , CDS: Os06g0 Pro36Gln Low
SSIIa SNP 1 G 2 G/A 99.2/ /1 127 G A Gene: Os06g , CDS: Os06g0 Gly43Asp Low
SSIIa SNP 1 C 2 C/A 99.2/ /1 129 C A Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 99.2/ /1 125 G T Gene: Os06g , CDS: Os06g0 Arg44Leu Low
SSIIa SNP 1 T 2 T/C 98.6/1.4 71/1 72 T C Gene: Os06g , CDS: Os06g0 N/A Low
SSIIa SNP 1 G 2 G/T 99.4/ /1 170 G T Gene: Os06g , CDS: Os06g0 Gly135Val Low
SSIIa SNP 1 G 2 G/T 99.1/ /1 110 G T Gene: Os06g , CDS: Os06g0 Ala142Ser Low
SSIIa SNP 1 G 2 G/C 96.2/ / G C Gene: Os06g N/A High
SSIIa SNP 1 A 2 G/A 87.9/ / G A Gene: Os06g , Gene: Os06g Met737Val High
SSIIa SNP 1 G 2 G/T 63.9/ / G T Gene: Os06g , Gene: Os06g Pro32Thr High
SSIIa SNP 1 C 2 C/T 55.6/ / C T Gene: Os06g , Gene: Os06g Leu781Phe High
SSIIb SNP 1 C 2 C/A 99.5/ / C A Gene: Os02g , Gene: Os02g N/A Low
SSIIb SNP 1 G 2 G/A 61.7/ / G A Gene: Os02g , Gene: Os02g N/A High
SSIIb SNP 1 G 2 G/T 97.6/ / G T Gene: Os02g , Gene: Os02g N/A High
SSIIIa SNP 1 G 2 G/A 87.3/ / G A N/A High
SSIIIa SNP 1 G 2 G/A 87.5/ / G A N/A High
SSIIIa SNP 1 A 2 A/G 80.7/ / A G N/A High
SSIIIa SNP 1 C 2 C/T 94.7/ / C T N/A High
SSIIIa SNP 1 C 2 C/T 78.4/ / C T N/A High
SSIIIa SNP 1 T 2 T/A 94.8/ / T A N/A High
SSIIIa SNP 1 A 2 A/G 95.0/ / A G N/A High
SSIIIa SNP 1 G 2 G/A 87.6/ / G A N/A High
SSIIIa SNP 1 A 2 A/C 94.7/ / A C N/A High
SSIIIa SNP 1 A 2 A/C 87.7/ / A C N/A High
SSIIIa SNP 1 C 2 C/T 79.9/ / C T N/A High
SSIIIa SNP 1 G 2 G/A 94.3/ / G A N/A High
SSIIIa SNP 1 G 2 G/A 94.2/ / G A N/A High
SSIIIa SNP 1 G 2 G/A 94.7/ / G A N/A High
SSIIIa SNP 1 C 2 C/T 79.8/ / C T N/A High
SSIIIa SNP 1 G 2 G/A 87.0/ / G A N/A High
SSIIIa SNP 1 C 2 C/T 86.9/ / C T N/A High
SSIIIa SNP 1 T 2 T/C 62.6/ / T C N/A High
SSIIIa SNP 1 C 2 C/T 87.7/ / C T N/A High
SSIIIa SNP 1 C 2 C/T 87.2/ / C T N/A High
SSIIIa SNP 1 G 2 G/A 87.3/ / G A N/A High
SSIIIa SNP 1 G 2 G/A 87.0/ / G A N/A High
SSIIIa SNP 1 C 2 C/A 80.0/ / C A N/A High
SSIIIa SNP 1 G 2 G/A 87.2/ / G A N/A High
SSIIIa SNP 1 C 2 C/T 86.9/ / C T N/A High
SSIIIa SNP 1 G 2 G/A 87.1/ / G A N/A High
SSIIIa SNP 1 T 2 T/A 88.8/ / T A N/A High
SSIIIa SNP 1 G 2 G/A 88.2/ / G A N/A High
SSIIIa SNP 1 T 2 T/A 62.8/ / T A N/A High
SSIIIa SNP 1 A 2 A/T 92.0/ / A T N/A Low
SSIIIa SNP 1 G 2 G/A 89.6/ / G A N/A High
SSIIIa SNP 1 A 2 A/T 58.2/ / A T N/A High
SSIIIa SNP 1 A 2 A/G 57.7/ / A G N/A High
SSIIIa SNP 1 C 2 C/T 89.7/ / C T N/A High
SSIIIa SNP 1 G 2 G/A 91.2/ / G A N/A High
SSIIIa SNP 1 T 2 T/A 56.1/ / T A N/A High
SSIIIa SNP 1 T 2 T/C 58.1/ / T C N/A High
SSIIIa SNP 1 A 2 A/G 92.6/ / A G N/A High
SSIIIa SNP 1 C 2 C/T 72.2/ / C T N/A High
SSIIIa SNP 1 A 2 A/C 56.4/ / A C N/A High
SSIIIa SNP 1 A 2 A/C 71.8/ / A C N/A High
SSIIIa SNP 1 C 2 C/T 95.5/ / C T N/A High
SSIIIa SNP 1 G 2 G/A 95.3/ / G A N/A High
SSIIIa SNP 1 G 2 G/A 56.9/ / G A N/A High
SSIIIa SNP 1 T 2 T/C 95.8/ / T C N/A High

SSIIIa SNP 1 C 2 C/T 89.0/ / C T N/A High
SSIIIa SNP 1 A 2 A/G 56.9/ / A G N/A High
SSIIIa SNP 1 A 2 A/G 89.6/ / A G N/A High
SSIIIa SNP 1 C 2 C/T 89.6/ / C T N/A High
SSIIIa SNP 1 T 2 T/C 89.5/ / T C N/A High
SSIIIa SNP 1 A 2 A/T 89.8/ / A T N/A High
SSIIIa SNP 1 A 2 A/G 95.4/ / A G N/A High
SSIIIa SNP 1 T 2 T/C 96.0/ / T C N/A High
SSIIIa SNP 1 A 2 A/G 99.0/ / A G N/A Low
SSIIIa SNP 1 C 2 C/T 90.7/ / C T N/A High
SSIIIa SNP 1 T 2 T/C 96.0/ / T C N/A High
SSIIIa SNP 1 G 2 G/A 92.7/ / G A N/A High
SSIIIa SNP 1 C 2 C/T 59.6/ / C T N/A High
SSIIIa SNP 1 A 2 A/G 58.0/ / A G N/A High
SSIIIa SNP 1 C 2 C/T 56.9/ / C T N/A High
SSIIIa SNP 1 C 2 C/T 92.6/ / C T N/A High
SSIIIa SNP 1 T 2 T/C 93.0/ / T C N/A High
SSIIIa SNP 1 C 2 C/T 61.2/ / C T N/A High
SSIIIa SNP 1 C 2 C/A 99.4/ / C A N/A Low
SSIIIa SNP 1 G 2 A/G 59.8/ / A G N/A High
SSIIIa SNP 1 A 2 A/G 95.4/ / A G N/A High
SSIIIa SNP 1 A 2 A/T 52.5/ / A T N/A High
SSIIIa SNP 1 T 2 T/C 95.0/ / T C N/A Low
SSIIIa SNP 1 G 2 G/T 94.9/ / G T N/A High
SSIIIa SNP 1 G 2 A/G 57.9/ / A G N/A High
SSIIIa SNP 1 T 2 T/A 94.8/ / T A N/A High
SSIIIa SNP 1 T 2 G/T 61.5/ / G T N/A High
SSIIIa SNP 1 C 2 C/T 97.4/ / C T N/A High
SSIIIa SNP 1 T 2 A/T 62.1/ / A T N/A High
SSIIIa SNP 1 G 2 A/G 63.2/ / A G N/A High
SSIIIa SNP 1 G 2 A/G 61.8/ / A G N/A High
SSIIIa SNP 1 C 2 C/A 94.9/ / C A N/A High
SSIIIa SNP 1 A 2 A/T 94.9/ / A T N/A High
SSIIIa SNP 1 T 2 C/T 59.1/ / C T N/A High
SSIIIa SNP 1 T 2 C/T 59.5/ / C T N/A High
SSIIIa SNP 1 G 2 A/G 54.3/ / A G N/A High
SSIIIa SNP 1 C 2 T/C 51.4/ / T C N/A High
SSIIIa SNP 1 T 2 A/T 58.0/ / A T Gene: Os08g , mrna: Os08 N/A High
SSIIIb SNP 1 T 2 T/C 92.7/ / T C Gene: Os04g , CDS: Os04g0 Thr1176Ala High
SSIIIb SNP 1 A 2 A/T 90.1/ / A T Gene: Os04g , CDS: Os04g0 N/A High
SSIIIb SNP 1 A 2 A/G 90.6/ / A G Gene: Os04g N/A High
SSIIIb SNP 1 A 2 A/G 90.2/ / A G Gene: Os04g , CDS: Os04g0 N/A High
SSIIIb SNP 1 A 2 A/G 90.8/ / A G Gene: Os04g N/A High
SSIIIb SNP 1 A 2 A/T 91.2/ / A T Gene: Os04g N/A High
SSIIIb SNP 1 A 2 A/G 91.2/ / A G Gene: Os04g N/A High
SSIIIb SNP 1 C 2 C/A 90.0/ / C A Gene: Os04g , CDS: Os04g0 Ser756Ile High
SSIIIb SNP 1 T 2 T/C 90.8/ / T C Gene: Os04g N/A High
SSIIIb SNP 1 C 2 C/T 91.4/ / C T Gene: Os04g N/A High
SSIIIb SNP 1 T 2 T/G 90.0/ / T G Gene: Os04g , CDS: Os04g0 N/A High
SSIIIb SNP 1 C 2 C/T 90.5/ / C T Gene: Os04g , CDS: Os04g0 N/A High
SSIIIb SNP 1 G 2 G/A 90.2/ / G A Gene: Os04g , CDS: Os04g0 N/A High
SSIIIb SNP 1 T 2 T/C 90.6/ / T C Gene: Os04g , CDS: Os04g0 Glu643Gly High
SSIIIb SNP 1 C 1 A A Gene: Os04g N/A Low
SSIIIb SNP 1 C 1 A A Gene: Os04g N/A Low
SSIIIb SNP 1 G 1 A A Gene: Os04g N/A Low
SSIIIb SNP 1 C 1 A A Gene: Os04g N/A Low
SSIIIb SNP 1 T 1 A A Gene: Os04g N/A Low
SSIIIb SNP 1 C 2 C/T 94.7/ / C T Gene: Os04g , CDS: Os04g0 Glu460Lys High
SSIIIb SNP 1 T 2 T/G 89.2/ / T G Gene: Os04g , CDS: Os04g0 Lys207Asn High
SSIIIb SNP 1 C 2 C/A 90.3/ / C A Gene: Os04g , CDS: Os04g0 Val200Phe High
SSIIIb SNP 1 A 2 A/C 90.6/ / A C Gene: Os04g , CDS: Os04g0 Phe139Cys High
SSIIIb SNP 1 A 2 A/T 90.7/ / A T Gene: Os04g N/A High
SSIIIb SNP 1 C 2 C/T 77.7/ / C T N/A High
SSIIIb SNP 1 A 2 A/G 79.2/ / A G N/A High
SSIVa SNP 1 T 2 T/C 86.7/ / T C N/A High
SSIVa SNP 1 A 2 A/G 87.6/ / A G Gene: Os01g , mrna: Os01 N/A High
SSIVa SNP 1 A 2 A/G 87.9/ / A G Gene: Os01g N/A High
SSIVa SNP 1 A 2 A/T 87.7/ / A T Gene: Os01g N/A High
SSIVa SNP 1 T 2 T/G 87.9/ / T G Gene: Os01g N/A High

SSIVa SNP 1 T 2 T/C 88.9/ / T C Gene: Os01g N/A High
SSIVa SNP 1 T 2 T/A 88.2/ / T A Gene: Os01g N/A High
SSIVa SNP 1 C 2 C/T 88.3/ / C T Gene: Os01g , CDS: Os01g0 Gly708Asp High
SSIVa SNP 1 G 2 G/A 86.8/ / G A Gene: Os01g N/A High
SSIVa SNP 1 T 2 T/A 87.8/ / T A Gene: Os01g N/A High
SSIVa SNP 1 G 2 G/A 87.7/ / G A Gene: Os01g N/A High
SSIVa SNP 1 C 2 C/T 86.7/ / C T Gene: Os01g N/A High
SSIVa SNP 1 T 2 T/C 87.0/ / T C Gene: Os01g N/A High
SSIVa SNP 1 C 2 C/T 87.9/ / C T Gene: Os01g N/A High
SSIVa SNP 1 A 2 A/G 74.1/ / A G Gene: Os01g , CDS: Os01g0 Val480Ala High
SSIVa SNP 1 T 2 T/G 87.1/ / T G Gene: Os01g N/A High
SSIVa SNP 1 A 2 A/T 87.4/ / A T Gene: Os01g N/A High
SSIVa SNP 1 A 2 A/T 86.8/ / A T Gene: Os01g , CDS: Os01g0 Ser469Thr High
SSIVa SNP 1 T 2 T/C 87.5/ / T C Gene: Os01g , CDS: Os01g0 N/A High
SSIVa SNP 1 T 2 T/A 87.7/ / T A Gene: Os01g , CDS: Os01g0 N/A High
SSIVa SNP 1 T 2 T/C 87.1/ / T C Gene: Os01g , CDS: Os01g0 His363Arg High
SSIVa SNP 1 C 2 C/A 86.9/ / C A Gene: Os01g , CDS: Os01g0 Leu176Phe High
SSIVa SNP 1 G 2 G/A 86.1/ / G A Gene: Os01g N/A High
SSIVa SNP 1 A 2 A/G 87.5/ / A G Gene: Os01g N/A High
SSIVa SNP 1 C 2 C/T 86.7/ / C T Gene: Os01g N/A High
SSIVa SNP 1 G 2 G/A 88.5/ / G A Gene: Os01g N/A High
SSIVa SNP 1 A 2 A/C 88.5/ / A C Gene: Os01g , mrna: Os01 N/A High

List of Indels (insertions/deletions)

Columns: Mapping; Reference position; Consensus position; Variation type; Length; Reference; Variants; Allele variations; Frequencies; Counts; Coverage; Variant #1; Frequency of #1; Count of #1; Variant #2; Frequency of #2; Count of #2; Variant #3; Frequency of #3; Count of #3; Variant #4; Frequency of #4; Count of #4; Overlapping annotations; Amino acid change

AGPS2b INDEL 1 - 2 -/G 99.1/ / G Gene: Os08 N/A
AGPS2b INDEL 1 - 2 -/A 98.9/ / A Gene: Os08 N/A
AGPS2b INDEL 4 AACT 2 AACT/ / / AACT Gene: Os08 N/A
AGPS2b INDEL 1 - 2 -/T 99.1/ / T Gene: Os08 N/A
BEI INDEL 1 - 2 -/A 97.9/ / A Gene: Os06 N/A
BEI INDEL 1 A 2 A/- 95.1/ / A Gene: Os06 N/A
BEI INDEL 1 - 2 -/G 92.0/ / G Gene: Os06 N/A
BEI INDEL 1 - 2 -/A 99.2/ / A Gene: Os06 N/A
BEI INDEL /CCTG 99.2/ / CCTG Gene: Os06 N/A
BEI INDEL 2 GA 2 GA/ / / GA Gene: Os06 N/A
BEI INDEL 1 - 2 -/A 99.3/ / A Gene: Os06 N/A
BEIIb INDEL 1 - 2 G/- 81.4/ / G Gene: Os02 N/A
BEIIb INDEL CA/ / / CA Gene: Os02 N/A
BEIIb INDEL 1 A 2 A/- 96.6/ / A Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/A 99.3/ / A Gene: Os02 N/A
BEIIb INDEL 1 T 2 T/- 96.9/ / T Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/T 99.0/ / T Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/A 98.1/ / A Gene: Os02 N/A
BEIIb INDEL 1 A 2 A/- 98.9/ / A Gene: Os02 N/A
BEIIb INDEL 1 A 2 A/- 98.9/ / A Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/A 99.1/ / A Gene: Os02 N/A
BEIIb INDEL 1 T 2 -/T 73.1/ / T Gene: Os02 N/A
BEIIb INDEL /AA 41.7/ / AA Gene: Os02 N/A
BEIIb INDEL 1 - 2 A/- 56.4/ / A Gene: Os02 N/A
BEIIb INDEL 1 - 2 T/- 81.0/ / T Gene: Os02 N/A
BEIIb INDEL 8 GGGGG 3 GGGGGGGG97.2/1.4/ /2/1 142 GGGGGG GGTGGGG Gene: Os02 N/A
BEIIb INDEL 2 AT 2 AT/ / / AT Gene: Os02 N/A
BEIIb INDEL 2 AA 3 AA/AT/ /33.9/ /706/ AA AT Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/T 65.1/ / T Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/T 64.7/ / T Gene: Os02 N/A
BEIIb INDEL /TAT 64.7/ / TAT Gene: Os02 N/A
BEIIb INDEL /TATAT 64.7/ / TATAT Gene: Os02 N/A
BEIIb INDEL 1 T 2 -/T 76.1/ / T Gene: Os02 N/A
BEIIb INDEL 1 - 2 -/T 98.6/ / T Gene: Os02 N/A
BEIIb INDEL 1 T 2 T/- 96.4/ / T Gene: Os02 N/A
BEIIb INDEL 3 TTA 2 ---/TTA 76.7/ / TTA Gene: Os02 N/A
Isoamylase INDEL 1 - 2 -/A 99.3/ / A Gene: Os08 N/A
Isoamylase INDEL 1 - 2 -/T 98.2/ / T Gene: Os08 N/A
Isoamylase INDEL 2 TA 2 TA/ / / TA Gene: Os08 N/A
Isoamylase INDEL 1 - 2 -/A 99.0/ / A Gene: Os08 N/A
Isoamylase INDEL 1 A 2 A/- 98.5/ / A Gene: Os08 N/A

Isoamylase INDEL 1-2 -/A 99.2/ / A Gene: Os08 N/A
Isoamylase INDEL 2 AA 2 AA/ / / AA Gene: Os08 N/A
Isoamylase INDEL 3 AAA 2 AAA/ / / AAA Gene: Os08 N/A
Isoamylase INDEL 1 A 2 A/- 59.1/ / A Gene: Os08 N/A
Isoamylase INDEL 1-2 -/A 98.8/ / A Gene: Os08 N/A
Isoamylase INDEL 1-2 -/A 98.9/ / A Gene: Os08 N/A
Isoamylase INDEL 1 A 2 A/- 96.9/ / A Gene: Os08 N/A
Isoamylase INDEL 1 A 2 A/- 97.8/ / A Gene: Os08 N/A
GBSSI INDEL 1-2 -/A 97.1/ / A Gene: Os06 N/A
GBSSI INDEL 1 A 2 A/- 96.1/ / A Gene: Os06 N/A
GBSSII INDEL 1 A 2 A/- 99.3/ / A Gene: Os07 N/A
GBSSII INDEL 1 A 2 A/- 99.0/ / A Gene: Os07 N/A
GBSSII INDEL 1 A 2 A/- 97.6/ / A Gene: Os07 N/A
GBSSII INDEL 1 A 2 A/- 99.2/ / A Gene: Os07 N/A
GBSSII INDEL 1-2 -/A 99.5/ / A Gene: Os07 N/A
GBSSII INDEL 1 T 2 T/- 96.5/ / T Gene: Os07 N/A
GBSSII INDEL 1-2 -/T 98.5/ / T Gene: Os07 N/A
GBSSII INDEL 1 C 2 C/- 98.0/ / C Gene: Os07 N/A
GPT INDEL 1 T 2 T/- 75.7/ / T Gene: Os08 N/A
GPT INDEL 2 TT 2 TT/ / / TT Gene: Os08 N/A
GPT INDEL 1-2 -/T 99.0/ / T Gene: Os08 N/A
GPT INDEL /TA 96.6/ / TA Gene: Os08 N/A
GPT INDEL 1-2 -/A 96.6/ / A Gene: Os08 N/A
GPT INDEL /GCC 96.8/ / GCC Gene: Os08 N/A
GPT INDEL 1-2 -/A 99.4/ / A Gene: Os08 N/A
GPT INDEL /TG 75.0/ / TG Gene: Os08 N/A
GPT INDEL 1-2 -/T 94.3/ / T Gene: Os08 N/A
GPT INDEL 1 T 2 T/- 96.6/ / T Gene: Os08 N/A
GPT INDEL 4 TTAC 2 TTAC/ / / TTAC Gene: Os08 N/A
Pullulanase INDEL 1 T 2 T/- 98.8/ / T Gene: Os04 N/A
Pullulanase INDEL 1-2 -/T 98.6/ / T Gene: Os04 N/A
Pullulanase INDEL 1-2 -/T 99.3/ / T Gene: Os04 N/A
Pullulanase INDEL 1-2 -/A 98.7/ / A Gene: Os04 N/A
Pullulanase INDEL 1-2 -/G 99.5/ / G Gene: Os04 N/A
Pullulanase INDEL /GA 98.6/ / GA Gene: Os04 N/A
Pullulanase INDEL 1 G 2 G/- 97.8/ / G Gene: Os04 N/A
Pullulanase INDEL /GC 98.7/ / GC Gene: Os04 N/A
Pullulanase INDEL 1 A 2 A/- 99.2/ / A Gene: Os04 N/A
Pullulanase INDEL 1 T 2 T/- 98.4/ / T Gene: Os04 N/A
Pullulanase INDEL 1 T 2 T/- 97.7/ / T Gene: Os04 N/A
Pullulanase INDEL 1-2 -/T 98.8/ / T Gene: Os04 N/A
SPHOL INDEL 1 T 2 T/- 99.4/ / T Gene: Os03 N/A
SPHOL INDEL 1-2 -/T 99.4/ / T Gene: Os03 N/A
SPHOL INDEL 1 A 2 A/- 98.7/ / A Gene: Os03 N/A
SPHOL INDEL 1 A 2 A/- 83.0/ / A Gene: Os03 N/A
SPHOL INDEL 1 C 2 -/C 96.5/ / C Gene: Os03 N/A
SPHOL INDEL 1-2 C/- 96.1/ / C Gene: Os03 N/A
SPHOL INDEL 1-2 T/- 92.2/ / T Gene: Os03 N/A
SPHOL INDEL 1 T 2 -/T 92.4/ / T Gene: Os03 N/A
SSI INDEL 1 C 2 C/- 99.2/ / C Gene: Os06 N/A
SSI INDEL 1 A 3 A/-/G 81.7/9.6/ /602/ A G Gene: Os06 N/A
SSI INDEL 1 A 2 A/- 99.4/ / A Gene: Os06 N/A
SSI INDEL 1-2 -/T 97.9/ / T Gene: Os06 N/A
SSI INDEL 1 A 2 A/- 97.9/ / A Gene: Os06 N/A
SSI INDEL 1 G 2 G/- 97.5/ / G Gene: Os06 N/A
SSI INDEL 3 GGT 2 GGT/ / / GGT Gene: Os06 N/A
SSI INDEL 1 G 2 G/- 96.4/ / G Gene: Os06 N/A
SSIIa INDEL 1-2 -/T 93.3/ / T Gene: Os06 N/A
SSIIb INDEL 1-2 -/T 99.4/ / T Gene: Os02 N/A
SSIIb INDEL 1 A 2 A/- 99.4/ / A Gene: Os02 N/A
SSIIb INDEL 1-2 -/A 99.3/ / A Gene: Os02 N/A
SSIIIa INDEL 1-2 -/A 79.7/ / A N/A
SSIIIa INDEL 1 T 2 T/- 64.0/ / T N/A
SSIIIa INDEL 1-2 -/A 73.8/ / A N/A
SSIIIa INDEL /AA 73.8/ / AA N/A
SSIIIa INDEL 1-2 -/T 98.3/ / T N/A
SSIIIa INDEL 1 T 2 T/- 98.7/ / T N/A
SSIIIa INDEL 1-2 -/T 97.0/ / T N/A
SSIIIa INDEL 1 C 2 C/- 99.1/ / C N/A

SSIIIa INDEL 1 T 2 T/- 95.7/ / T N/A
SSIIIa INDEL /AA 93.0/ / AA N/A
SSIIIa INDEL 1-2 -/A 93.0/ / A N/A
SSIIIa INDEL /CTTA 70.6/ / CTTA N/A
SSIIIa INDEL 1-2 -/A 90.2/ / A N/A
SSIIIa INDEL 1-2 -/T 98.5/ / T N/A
SSIIIa INDEL 1 A 3 A/T/- 53.7/42.1/ /4636/ A T N/A
SSIIIa INDEL 1-2 -/T 63.7/ / T N/A
SSIIIa INDEL /TT 63.7/ / TT N/A
SSIIIa INDEL 1 T 2 T/- 98.7/ / T N/A
SSIIIa INDEL /CCT 54.0/ / CCT N/A
SSIIIa INDEL 1-2 -/T 91.6/ / T Gene: Os08 N/A
SSIIIb INDEL 2 TC 2 TC/ / / TC Gene: Os04 N/A
SSIIIb INDEL 3 AAA 2 AAA/ / / AAA Gene: Os04 N/A
SSIIIb INDEL 1 A 2 A/- 97.0/ / A Gene: Os04 N/A
SSIIIb INDEL 2 AA 2 AA/ / / AA Gene: Os04 N/A
SSIIIb INDEL 2 AT 2 AT/ / / AT Gene: Os04 N/A
SSIIIb INDEL /GA 97.8/ / GA Gene: Os04 N/A
SSIIIb INDEL 2 GA 2 GA/ / / GA Gene: Os04 N/A
SSIIIb INDEL 1 A 2 A/- 90.3/ / A Gene: Os04 N/A
SSIIIb INDEL 1 G Gene: Os04 N/A
SSIIIb INDEL 1 A 2 A/- 94.8/ / A Gene: Os04 N/A
SSIIIb INDEL 1-2 -/A 96.4/ / A Gene: Os04 N/A
SSIIIb INDEL 1-2 -/A 98.3/ / A Gene: Os04 N/A
SSIIIb INDEL 1 A 2 A/- 98.8/ / A Gene: Os04 N/A
SSIIIb INDEL 3 CCT 2 CCT/ / /4 374 CCT Gene: Os04 N/A
SSIIIb INDEL 1 C 2 C/- 99.3/ /3 435 C Gene: Os04 N/A
SSIIIb INDEL 1-2 -/C 97.9/ / C Gene: Os04 N/A
SSIVa INDEL 1-2 -/A 88.8/ / A Gene: Os01 N/A
SSIVa INDEL 1-2 -/A 99.2/ / A Gene: Os01 N/A
SSIVa INDEL 1 A 2 A/- 99.1/ / A Gene: Os01 N/A
SSIVa INDEL 1 A 2 A/- 98.4/ / A Gene: Os01 N/A
SSIVa INDEL 1-2 -/C 99.4/ / C Gene: Os01 N/A
SSIVa INDEL 1 A 2 A/- 98.0/ / A Gene: Os01 N/A
SSIVa INDEL /CG 92.6/ / CG Gene: Os01 N/A

Stringency criteria for INDEL detection (DIP): mismatch cost 2; similarity 0.7; minimum variant frequency 0.5%
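The minimum variant frequency criterion listed above can be illustrated with a short sketch. This is not the DIP detection procedure used to produce these tables: the `Variant` record, its field names and the example counts are hypothetical, and only the 0.5% threshold is taken from the stringency criteria. Mismatch cost and similarity are not modelled here.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Threshold in percent, taken from the stringency criteria above.
MIN_VARIANT_FREQUENCY = 0.5


@dataclass
class Variant:
    """Hypothetical record for one allele observed at one position."""
    gene: str
    allele: str
    count: int      # reads supporting this allele
    coverage: int   # total reads covering the position

    @property
    def frequency_percent(self) -> float:
        return 100.0 * self.count / self.coverage


def passes_frequency_filter(v: Variant,
                            threshold: float = MIN_VARIANT_FREQUENCY) -> bool:
    """Keep a variant only if it is covered and reaches the threshold."""
    return v.coverage > 0 and v.frequency_percent >= threshold


def filter_variants(variants: Iterable[Variant]) -> List[Variant]:
    return [v for v in variants if passes_frequency_filter(v)]


# Illustrative counts only; these are not values from the tables.
candidates = [
    Variant("GeneX", "T", count=7, coverage=1000),   # 0.7 %, kept
    Variant("GeneX", "A", count=3, coverage=1000),   # 0.3 %, dropped
]
kept = filter_variants(candidates)
```

With a threshold this low, the filter mainly removes alleles supported by only a handful of reads at deeply covered positions.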

Appendix 2. The full list of breeding lines (studied population) and their pedigree information.

barcode | barcode 08 | pedigree | Cross
*YRR07=01-01* | *YRR08=01-03* | ILLABONG/SARA | YC B
*YRR07=01-15* | *YRR08=01-05* | ILLABONG/VIALONE NANO-Y4 | YC B
*YUR07=08-19* | *YRR08=01-07* | ILLABONG///M102//M201/YRM3 | YC B
*YRR07=01-19* | *YRR08=01-08* | YRB4 | YC
*YRR07=03-14* | *YRR08=01-08* | YRB4 | YC
*YUR07=11-18* | *YRR08=01-09* | ILLABONG/4/YRB3///YRM2//M7/RI | YC B
*YUR07=02-17* | *YRR08=01-10* | ILLABONG/MILLIN | YC B
*YRR07=02-10* | *YRR08=02-11* | ILLABONG///YR83/M9//M7 | YC B
*YRR07=02-20* | *YRR08=02-13* | ILLABONG/VIALONE NANO-Y4 | YC B
*YRR07=02-16* | *YRR08=02-16* | ILLABONG///YR83/M9//M7 | YC B
*YRE07=10-02* | *YRR08=02-19* | YRM67 | YC
*YRR07=02-13* | *YRR08=02-20* | JARRAH | YC
*YRR07=04-17* | *YRR08=02-20* | JARRAH | YC
*YRR07=02-06* | *YRR08=03-02* | ILLABONG/YRM39 | YC B
*YRR07=02-07* | *YRR08=03-09* | M103///YRM34//YRM3/HUNG.NO.1 | YC B
*YUR07=02-16* | *YRR08=04-02* | ILLABONG/4/YRB3///YRM2//M7/RI | YC B
*YRR07=01-03* | *YRR08=04-12* | ILLABONG/IR | YC B
*YRR07=01-08* | *YRR08=07-01* | YRB3/ARBORIO//MILLIN/WC1043 | YC B
*YRR07=01-05* | *YRR08=07-06* | ILLABONG | YC
*YRR07=03-11* | *YRR08=07-06* | ILLABONG | YC
*YRI07=04-08* | *YRI08=01-01* | /H1//INGA///YRL113 | YC B
*YUD07=14-15* | *YRI08=01-02* | YRL118///INGA/M9//213D.25 | YC S-7
*YRI07=01-04* | *YRI08=01-04* | /H1//INGA///YRL113 | YC B
*YUD07=03-24* | *YRI08=01-05* | BBL//M9/PELDE///YRL30/4/YRL113 | YC S-6
*YRI07=01-03* | *YRI08=01-06* | YRL39/IR //YRL113 | YC B
*YRI07=02-05* | *YRI08=01-07* | YRL113/WAB P31-1-HB | YC B
*YRI07=01-02* | *YRI08=01-08* | YRL39/IR //YRL113 | YC B
*YRI07=03-08* | *YRI08=02-01* | BBL//M9/PELDE///YRL30/4/YRL101 | YC B
*YRI07=03-03* | *YRI08=01-09* | YRL113///RD91V55//P/GO(4)/D.10 | YC B
*YUI07=08-12* | *YRI08=01-10* | YRL113/WAB P31-1-HB | YC S-12
*YUI07=14-10* | *YRI08=02-03* | YRL113//L203/YRL34 | YC S-6
*YRI07=02-03* | *YRI08=02-06* | BBL//M9/PELDE///YRL30/4/LANGI | YC B
*YRI07=02-08* | *YRI08=02-08* | YRL113 | YC 89045J-0-17
*YRI07=02-04* | *YRI08=03-02* | 71011//73/M7///P/4/YRL34/5/BBL//M | YC B
*YRI07=01-07* | *YRI08=03-03* | /H1//INGA///YRL113 | YC B
*YRI07=03-02* | *YRI08=04-04* | YRL113//L203/YRL34 | YC B
*YUI07=12-12* | *YRI08=04-09* | BBL//M9/PELDE///YRL30/4/YRL113 | YC S-9
*YUI07=12-10* | *YRI08=04-10* | YRL113//L203/YRL34 | YC S-10
*YRI07=04-05* | *YRI08=05-09* | BBL//M9/PELDE///YRL30/4/YRL113 | YC B
*YUI07=06-10* | *YRI08=05-10* | RIZABELL/YRL113 | YC S-14
*YRI07=02-02* | *YRI08=06-02* | YRL39/IR //YRL113 | YC B
*YRI07=01-06* | *YRI08=06-05* | YRL113//L203/YRL34 | YC B
*YRJ07=05-15* | *YRJ08=02-33* | M103///M201//YR196/ARDITO | YC B
*YRJ07=01-08* | *YRJ08=02-30* | M103/OEIRAS | YC B
*YRJ07=04-08* | *YRJ08=03-28* | M103/YRK2 | YC 94161S B
*YUJ07=09-22* | *YRJ08=03-26* | YRM49///ILLABONG/YRM54 | YC B-2S-9
*YRJ07=03-13* | *YRJ08=03-35* | AKIHIKARI//KOSHIHIKARI (T)/YR | YC 00238A B
*YRJ07=03-12* | *YRJ08=03-30* | M103/YRM44 | YC 95050S-5-0-B
*YRI07=04-06* | *YRI08=06-07* | BBL//M9/PELDE///YRL30/4/LANGI | YC B
*YRI07=04-03* | *YRI08=06-08* | YRL113//L203/YRL34 | YC B
*YRI07=04-01* | *YRI08=07-03* | BBL//M9/PELDE///YRL30/4/YRL113 | YC B
*YRI07=03-07* | *YRI08=07-07* | BBL//M9/PELDE///YRL30/4/LANGI | YC B
*YRI07=03-01* | *YRI08=08-10* | LANGI/LAGRUE//YRL113 | YC B
*YRI07=01-08* | *YRI08=09-01* | YRL101///PELDE/M9//M101 | YC B
*YRI07=02-07* | *YRI08=09-10* | BBL//M9/P///YRL30/4/LANGI/INGA | YC B
*YRI07=01-01* | *YRI08=10-04* | YRL113//L203/YRL34 | YC B
*YUD07=07-19* | *YRI08=14-09* | L203//YRL101/LANGI | YC B-2S-10
*YRJ07=01-02* | *YRJ08=01-13* | M103///M201//196/ARD/4/M103/YRM | YC B
*YRJ07=04-13* | *YRJ08=01-17* | M103///M201//196/ARD/4/M103/YRM | YC B
*YRJ07=03-09* | *YRJ08=01-20* | M103/YRM54 | YC B
*YUJ07=19-23* | *YRJ08=01-22* | M103//M201/YRM3///IR / | YC S-6
*YRJ07=05-08* | *YRJ08=01-24* | M103///M201/EIKO//CALROSE | YC B
*YRJ07=03-04* | *YRJ08=01-23* | M103/YRM54 | YC 99114S-26-B
*YRJ07=04-01* | *YRJ08=01-27* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YRJ07=04-03* | *YRJ08=01-30* | ECHUCA/SHIMUZI MOCHI//MILLIN | YC B
*YRJ07=02-14* | *YRJ08=01-32* | M103///M201//196/ARD/4/M103/YRM | YC B
*YRJ07=05-01* | *YRJ08=01-33* | M103/YRM54 | YC 99114S-30-B
*YRJ07=01-20* | *YRJ08=01-35* | M103/YRM54 | YC B
*YRJ07=02-15* | *YRJ08=02-16* | M103///M201//196/ARD/4/M103/YRM | YC B
*YRJ07=04-02* | *YRJ08=02-19* | M102/M103//M103 | YC B
*YRJ07=01-12* | *YRJ08=02-20* | ECHUCA/80023-TR | YC 96041T B
*YRJ07=01-07* | *YRJ08=02-21* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YRJ07=05-14* | *YRJ08=02-22* | M103///M201//196/ARD/4/M103/YRM | YC B
*YUJ07=19-25* | *YRJ08=02-24* | M103//M201/R///M201/YRM3//BOG. | YC S-6
*YRJ07=04-19* | *YRJ08=02-26* | M103///M201//YR196/ARDITO | YC 92243J-0-5-B
*YRJ07=01-18* | *YRJ08=02-27* | M103/YRM18 | YC 90019J-50-0-B
*YRJ07=02-09* | *YRJ08=02-28* | M103//M401/CALROSE | YC B
*YRJ07=03-15* | *YRJ08=02-35* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YRJ07=01-03* | *YRJ08=03-12* | M102/M103 | YC 92076S B
*YUR07=01-16* | *YRJ08=03-13* | VIALONE NANO | Y01/008
*YRJ07=05-19* | *YRJ08=03-14* | M103///M201//YR196/ARDITO | YC 92243J-0-11-B
*YRJ07=02-04* | *YRJ08=03-15* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YRJ07=02-19* | *YRJ08=03-17* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YRJ07=02-03* | *YRJ08=03-18* | M104 | Y03/009
*YRJ07=04-09* | *YRJ08=03-16* | JARRAH | YC
*YRJ07=02-12* | *YRJ08=03-25* | M103/YRM54 | YC B
*YRJ07=05-06* | *YRJ08=03-23* | LIMAN | Y98/001
*YRE07=10-04* | *YRA08=01-03* | YRM64/NORIN PL11 | YC B
*YRA07=03-10* | *YRA08=01-06* | REIZIQ | YC 86003S-12-0
*YRE07=19-01* | *YRA08=01-05* | M401/YRM42//YRM54 | YC B
*YRE07=15-01* | *YRA08=01-09* | NAMAGA///M201/YRM3//BOGAN | YC 98140S-89-B
*YUA07=04-28* | *YRA08=01-10* | KOSHIHIKARI(T)/M202//BOGAN | YC S-10
*YUA07=05-27* | *YRA08=02-01* | YRM65//JARRAH/AMAROO | YC B-2S-19
*YRA07=04-05* | *YRA08=02-02* | M201//YR196/ARDITO///YRM54 | YC 99113S-57-B
*YRA07=03-04* | *YRA08=02-03* | PARAGON | YC
*YRA07=02-10* | *YRA08=02-05* | M201//YR196/ARDITO///YRM54 | YC B
*YRA07=04-14* | *YRA08=02-06* | M7/M201//M103 | YC B
*YRA07=02-16* | *YRA08=02-07* | YRM54/YRM61 | YC 97073S-93-0-B
*YRA07=03-05* | *YRA08=02-08* | YRM54/YRM61 | YC 97073S B
*YUA07=05-20* | *YRA08=02-12* | YRM65//JARRAH/AMAROO | YC B-2S-1
*YRA07=04-09* | *YRA08=03-01* | YRM66 | YC
*YUA07=17-19* | *YRA08=03-02* | IR /MILLIN//YRM63 | YC B
*YRA07=05-03* | *YRA08=03-03* | YRM68 | YC
*YRA07=05-02* | *YRA08=03-04* | NAMAGA | YC
*YRA07=03-03* | *YRA08=03-05* | M201//YR196/ARDITO///YRM54 | YC B
*YRA07=04-12* | *YRA08=03-06* | M201//YR196/ARDITO///YRM54 | YC B
*YRA07=03-12* | *YRA08=03-08* | QUEST_CT19 | YC
*YUA07=17-22* | *YRA08=03-10* | KOSHIHIKARI/M102//YRM43 | YC B
*YRA07=03-06* | *YRA08=03-11* | YRK4/SR | YC B
*YRA07=01-06* | *YRA08=04-02* | M201//YR196/ARDITO///YRM54 | YC B
*YRA07=05-10* | *YRA08=04-04* | OPUS/MATSURIBARE | YC B
*YRE07=14-04* | *YRA08=04-05* | YRM64/NORIN PL11 | YC B
*YUA07=05-21* | *YRA08=04-07* | M201/YRM3//BOGAN///YRM33 | YC B
*YRA07=03-07* | *YRA08=04-10* | M201//YR196/ARDITO///YRM54 | YC B
*YRA07=02-08* | *YRA08=04-11* | YRM54/YRM61 | YC 97073S B
*YRA07=02-01* | *YRA08=04-12* | QUEST | YC
*YRA07=05-08* | *YRA08=05-02* | M201//YR196/ARDITO///YRM54 | YC B
*YRA07=01-07* | *YRA08=05-03* | MILLIN | YC
*YRA07=01-16* | *YRA08=05-04* | JARRAH | YC
*YRA07=04-03* | *YRA08=05-08* | CALHIKARI | Y03/004
*YRA07=02-04* | *YRA08=05-11* | YRM42//BOGAN/M302 | YC 97048S-28-0-B
*YRA07=01-02* | *YRA08=06-01* | OPUS/MATSURIBARE | YC B
*YRA07=01-03* | *YRA08=06-02* | YRM64 | YC
*YRA07=05-06* | *YRA08=06-04* | AMAROO | YC 79011S-0-32
*YRA07=01-11* | *YRA08=06-05* | YRM54//ECHUCA/SHIMUZI MOCH | YC B
*YUA07=17-25* | *YRA08=06-06* | M204/YRM43 | YC B-2S-13-6B-S-5
*YRA07=03-16* | *YRA08=06-07* | ILLABONG/YRM54 | YC B
*YUA07=01-28* | *YRA08=06-08* | M401/YRM42//YRM54 | YC B
*YUA07=20-22* | *YRA08=06-10* | YRM64/NORIN PL11 | YC B
*YRA07=01-13* | *YRA08=06-11* | OPUS | YC
*YRA07=05-05* | *YRA08=06-12* | KOSHIHIKARI | YC
*YRB07=03-08* | *YRB08=01-03* | YRF205/LANGI | YC B
*YRB07=04-07* | *YRB08=01-06* | DELLMONT/LANGI | YC B
*YRB07=05-08* | *YRB08=01-09* | DOONGARA/YRL38 | YC 95096S B
*YUB07=14-24* | *YRB08=01-10* | LANGI///&(DAWN/K//IR579/K)/P//D | YC B-2S-4
*YUB07=08-26* | *YRB08=01-11* | YRL39/IR //YRL123 | YC 02093B-B-2S-13
*YRB07=04-02* | *YRB08=01-16* | YRL34//INGA/M9(5)/PEL///DOONGA | YC B
*YRB07=05-03* | *YRB08=02-01* | YRL125 | YC
*YRB07=05-01* | *YRB08=02-03* | YRL122/4/71011//M9/PEL//YRL29 | YC B
*YUB07=13-20* | *YRB08=02-04* | YRL39/IR //YRL123 | YC 02093B-B-2S-31
*YRB07=01-09* | *YRB08=02-05* | PELDE/GOPALBHOG(4)/YR | YC
*YRB07=04-15* | *YRB08=02-08* | YRF205/LANGI | YC B
*YRB07=01-16* | *YRB08=02-10* | PELDE/GOPALBHOG(4)/YR | YC
*YRD07=04-10* | *YRB08=02-12* | 71011//73/M7///PEL/4/YRL34/5/IR20 | YC B
*YRB07=05-12* | *YRB08=02-16* | YRM54/CN /1 | YC B
*YRD07=04-02* | *YRB08=03-03* | INGA/L201//DOONGARA///L202 | YC B
*YRB07=04-03* | *YRB08=03-06* | KYEEMA | YC
*YUE07=08-07* | *YRE08=05-11* | M102/M103//YRM42/SR | YC B-2S-4
*YRB07=02-12* | *YRB08=03-08* | DOONGARA/YRL38 | YC 95096S B
*YRB07=05-11* | *YRB08=03-09* | DOONGARA/SERATUS MALAM | YC 00128T-0-2-B
*YRB07=02-15* | *YRB08=03-12* | YRL123 | YC
*YRB07=03-17* | *YRB08=03-13* | LANGI | YC
*YRD07=04-11* | *YRB08=04-01* | INGA/L201//DOONGARA///L202 | YC B
*YRB07=04-13* | *YRB08=04-02* | YRF205/LANGI | YC B
*YRE07=12-02* | *YRE08=05-07* | QUEST | YC
*YUB07=02-21* | *YRB08=04-03* | YRL39/IR //YRL123 | YC 02093B-B-2S-2
*YRB07=03-10* | *YRB08=04-06* | YRF205/LANGI | YC B
*YRB07=05-09* | *YRB08=04-07* | INGA/L201//DOONGARA///L202 | YC B
*YRB07=02-07* | *YRB08=04-10* | DOONGARA | YC
*YRB07=02-01* | *YRB08=04-11* | YRL118 | YC 89198J-1-1
*YRB07=03-04* | *YRB08=04-16* | YRF207/L202 | YC B
*YRB07=05-10* | *YRB08=05-02* | YRL125_S_CT18 | YC
*YRB07=04-05* | *YRB08=05-11* | LANGI/JOJUTLA 4 | YC B
*YRE07=17-02* | *YRE08=01-01* | YRM64/TOYONISHIKI | YC B
*YUE07=05-02* | *YRE08=01-11* | OPUS/4/M7/KITAKOGANE///M201/ | YC B
*YRE07=14-02* | *YRE08=02-01* | YRM65 | YC
*YUE07=05-03* | *YRE08=05-06* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YRE07=11-02* | *YRE08=02-02* | M103/YRM54 | YC 99114S-38-B
*YRE07=22-05* | *YRE08=02-03* | M103 | YU87/001
*YRE07=22-01* | *YRE08=02-12* | YRM49//IR /MILLIN | YC B
*YRE07=18-02* | *YRE08=02-09* | M103/YRM54 | YC 99114S-25-B
*YUE07=15-01* | *YRE08=05-08* | OPUS//KOSHIHIKARI (T)/M202 | YC S-5
*YRE07=22-03* | *YRE08=02-10* | M103///YRM3/HUNG.NO.1//M401/4/ | YC B
*YUE07=11-04* | *YRE08=02-11* | HITOMEBORE//YRM39/AKITAKOM | YC S-1
*YUE07=01-12* | *YRE08=02-05* | YRM49///ILLABONG/YRM54 | YC B-2S-5
*YRE07=09-05* | *YRE08=05-05* | M102//M201/BOGAN///YRM54 | YC 99104S-9-B
*YRE07=20-02* | *YRE08=03-03* | MILLIN | YC
*YRE07=23-04* | *YRE08=03-06* | JARRAH | YC
*YUE07=20-13* | *YRE08=03-05* | M103/YRM54 | YC 99114S-11-B
*YRE07=14-03* | *YRE08=03-09* | ECHUCA | YC 81121DS
*YUE07=14-11* | *YRE08=05-03* | M201/YRM3//BOGAN///OPUS | YC B
*YUE07=10-03* | *YRE08=03-10* | M103/YRM49 | YC 99219S-7-B
*YRE07=10-03* | *YRE08=03-11* | QUEST_CT19 | YC
*YRE07=09-02* | *YRE08=04-01* | YRM64/TOYONISHIKI | YC B
*YUE07=08-02* | *YRE08=04-07* | M103/HITOMEBORE | YC S-11
*YUE07=01-10* | *YRE08=04-09* | YRM54//YRK4/KOSHIHIKARI (TYN | YC B-2S-5
*YUE07=17-11* | *YRE08=04-10* | OPUS//KOSHIHIKARI (T)/M202 | YC S-8
*YRE07=20-03* | *YRE08=04-11* | YRM54/M202 | YC 97027S-22-0-B
*YUE07=02-01* | *YUE08=02-14* | MILLIN | YC
*YUE07=04-09* | *YUE08=02-18* | JARRAH | YC
*YRJ07=04-06* | *YUJ08=13-20* | SPRINT | Y98/005
*YUD07=02-14* | *YUD08=01-22* | L205 | Y03/008
*YUD07=02-19* | *YUD08=01-15* | YRL113 | YC 89045J-0-17
*YUD07=01-16* | *YUD08=01-20* | YRL118 | YC 89198J-1-1
*YRD07=04-03* | *YRD08=01-03* | YRL111 | YC
*YUD07=05-19* | *YRD08=01-04* | L202///BASMATI 370/PELDE//BASM | YC B
*YRD07=04-01* | *YRD08=01-08* | YRL113//H /YRL34 | YC B
*YRD07=02-05* | *YRD08=01-09* | LANGI/IR | YC B
*YUD07=14-22* | *YRD08=01-10* | THAIBONNET/YRL101 | YC S-6
*YRD07=01-11* | *YRD08=01-07* | BBL//M9/P///YRL30/4/LANGI/INGA | YC B
*YRD07=01-07* | *YRD08=02-03* | L205 | Y03/008
*YRD07=03-04* | *YRD08=02-04* | YRL118 | YC 89198J-1-1
*YRD07=02-08* | *YRD08=02-05* | LANGI/IR | YC B
*YRD07=05-12* | *YRD08=02-06* | 213D.25/83//M7/IRR.ING///YRL38 | YC B
*YRD07=05-08* | *YRD08=02-08* | BBL//M9/P///YRL30/4/LANGI/INGA | YC B
*YUD07=08-15* | *YRD08=02-09* | DELLMONT//BASMATI 370/PELDE | YC S-16
*YRD07=02-02* | *YRD08=03-01* | BBL//M9/P///YRL30/4/LANGI/INGA | YC B
*YRD07=02-07* | *YRD08=03-02* | LANGI/INGA//PELDE | YC B
*YRD07=02-04* | *YRD08=03-04* | LANGI/INGA//PELDE | YC B
*YRD07=05-11* | *YRD08=03-05* | LANGI/LAGRUE | YC B
*YRD07=01-03* | *YRD08=03-06* | (PELDE*2/CALROSE76)*2//DOONG | YC 99248S-10-B
*YRD07=01-04* | *YRD08=03-07* | YRL101//IR72/YRL39 | YC B
*YRD07=03-05* | *YRD08=03-10* | YRL122/THAIBONNET | YC B
*YUD07=03-22* | *YRD08=04-01* | M103/DOONGARA DH1 | YC 03370DH-15
*YRD07=03-07* | *YRD08=04-02* | L203/YRL39//YRL101 | YC B
*YRD07=03-08* | *YRD08=04-03* | BBL//M9/P///YRL30/4/LANGI/INGA | YC B
*YUD07=06-17* | *YRD08=04-04* | YRL39/IR //YRL113 | YC B
*YRD07=02-03* | *YRD08=04-05* | YRL113 | YC 89045J-0-17
*YRD07=01-05* | *YRD08=04-08* | LANGI/IR | YC B
*YUD07=03-20* | *YRD08=04-09* | YRF205/LANGI | YC B
*YRD07=03-10* | *YRD08=05-01* | L202/DOONGARA | YC B
*YRD07=05-04* | *YRD08=05-03* | BBL//M9/P///YRL30/4/LANGI/INGA | YC B
*YRD07=05-02* | *YRD08=05-05* | YRL101/4/YRL39///213D.25/YR83//M | YC B
*YUD07=11-14* | *YRD08=05-06* | M103/DOONGARA DH1 | YC 03370DH-24
*YRD07=01-10* | *YRD08=05-07* | YRB90 V31/YRL34 | YC 90041J-1-24-B
*YUD07=05-18* | *YRD08=05-09* | I/M9(5)/3/M101/73//P(2)/4/I/5/YRL10 | YC S-6
*YRD07=03-11* | *YRD08=05-10* | LANGI/IR | YC 97182A B
*YRD07=05-07* | *YRD08=06-01* | GULFMONT//YRL39/IR | YC B
*YRD07=01-02* | *YRD08=06-02* | YRL39/IR //YRL113 | YC B
*YRD07=05-10* | *YRD08=06-04* | //R/IR36///I/M9(5)/P | YC B
*YRD07=02-10* | *YRD08=06-06* | LANGI/IR | YC B
*YRD07=03-02* | *YRD08=06-09* | L202///BASMATI 370/PELDE//BASM | YC B
*YRD07=02-06* | *YRD08=06-10* | L203 | YU85/001

Appendix 3. Target genes and sequence of gene-specific LR-PCR primers.

No | Gene | Fragments | Length (bp) | Forward primer (5'-3') | Reverse primer (5'-3')
1 | AGPS2b | H | | AATCTTGACCGCAGTGTCG | AAGTGTTGCCTGTGCATTAC
1 | AGPS2b | H | | TTGTAATGCACAGGCAACAC | ATCTCGACTGCCCAGTTAAG
2 | SPHOL | H | | TGCTGGTTGGTCGTAATGTG | AGCCTCATTCCAGCTTAACC
2 | SPHOL | H | | CACTGTGCATTCCTGAGTTG | TTGTCCATAGCTGCAGGTAG
3 | GPT1 | C | 3940 | TATCAGATTCCGAGGGCTTG | GTTACCTTCCCACACCCAGA
4 | GBSSI | H | | CCAACTAGCTCCACAAGATG | CATTGGGCTGGTAGTTGTTC
4 | GBSSI | H | | CCTTCCGGTTTGTTACTGAC | CACACCCAGAAGAGTACAAC
5 | GBSSII | C | 5405 | CCATCGCATAGGATGAGTGA | TGGAACACAACCCAGATGAA
6 | SSI | H | | AGCCCGATCTAGAAGGTACG | TGATAGGCTCAAACCTGATG
6 | SSI | H | | ACATCAGGTTTGAGCCTATC | GACACTTGACATCGCAGTAG
7 | SSIIa | H | | AGAAGAAACGGCTACTCCTC | TCATGCGCTTCATGATTTCC
7 | SSIIa | H | | AGGAGACGCAAATTCCTTAG | CATTGGTACTTGGCCTTGAC
8 | SSIIb | H | | ACACCTCCGGCAGATCTTTC | CAACAGCAGCTTTGCAGAAC
8 | SSIIb | H | | TCTGCAAAGCTGCTGTTGAG | ACTTCCACCGTTGCTCCTAC
9 | SSIIIa | H | | GGTTCTCAGTGTGGTGTTTG | CATCCTTCGGAGTTCTTGTG
9 | SSIIIa | H | | GTCACCACAGGACAATATCG | ACCCTGCATCTTAGGCTTAG
9 | SSIIIa | H | | GTTCCTGTCGAGTACAAGAG | AGCCATAGTCCAGATGTAGC
10 | SSIIIb | H | | GGAGCCTTTCTTTCTCCTTC | CTCCACTTGGGTTTCATGTC
10 | SSIIIb | H | | AGCAACCTTGGGTAGGAATG | CAATGTAGAAGCCGGGATTG
11 | SSIVa | H | | GTCCCGATACTGTTGTCTTG | ATTGCCAGCAGACTACTTTC
11 | SSIVa | H | | AGAAGGTGCTAGGTTTGTTG | CTTAGCCACCCATTCTTCTC
12 | BEI | H | | AATCGCCGCCGATTTCGAAG | TTGCCAGGCGGAAGTCAAAC
12 | BEI | H | | TCTGGGATAGTCGTCTGTTC | TACTTGTCTGGTGCTAGGAG
13 | BEIIa | H | | GGTGCTCTTCAGGAGGAAGG | TACCTGCGGGTGAATCCAAG
13 | BEIIa | H | | GCGCTGAAGGCATTACCTAC | AGTTGAACAGGCGAGAATCC
14 | BEIIb | H | | GGTAAATCGTCGTGATCTTC | AAGGGAAGTAGCGATTAACG
14 | BEIIb | H | | CATCATTGGATGTGGGATTC | GATGTACAGAAGTGCAGAAC
15 | ISA1 | H | | GTCAATTTCGCCGTCTACTC | CAGTAGCCCTGAGAAATAGC
15 | ISA1 | H | | GGTCAGAATGGAATGGAAAG | TCATCTTTCGTCCACTCAAC
16 | ISA2 | H | | GCGGCGGAAGAGTTGTAGCG | GCTTCTGAGTCACCGGATGG
16 | ISA2 | H | | CGTGGCATTGATAATTCCTC | TCAGGGAACATGAAGGTAAC
17 | PUL | H | | TAACCCAGATGGTCCTAGTC | ACCAGTGGTCAACCTGTATG
17 | PUL | H | | CTATTGGTTTCCAGCCTAGC | CCTTACGGAGATGACAAAGC
Total | | | ,067 | |

H1 = first half, H2 = second half, H3 = third fragment, C = complete fragment
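The primer table lends itself to two routine checks, sketched below: the GC content of a primer, and how much the reverse primer of one fragment overlaps the forward primer of the next. The helper names are illustrative and are not from the thesis. The sequences are the AGPS2b primers copied from the table.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_common_substring(a: str, b: str) -> str:
    """Longest contiguous stretch shared by a and b (dynamic programming)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# AGPS2b primers as listed in the table.
first_fragment_forward = "AATCTTGACCGCAGTGTCG"
first_fragment_reverse = "AAGTGTTGCCTGTGCATTAC"
second_fragment_forward = "TTGTAATGCACAGGCAACAC"

# The reverse primer binds the opposite strand, so compare its
# reverse complement with the next fragment's forward primer.
shared = longest_common_substring(
    reverse_complement(first_fragment_reverse), second_fragment_forward
)
print(f"GC content of first forward primer: {gc_content(first_fragment_forward):.2f}")
print(f"Shared stretch: {shared} ({len(shared)} bases)")
```

For this pair the shared stretch is 18 bases long, which is consistent with the two AGPS2b fragments meeting at the same primer site.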

Appendix 4. SNP/Indel distribution and short read coverage pattern across candidate loci. The name of each gene is indicated at the top. X-Y plots (top): the X-axis indicates the length of the sequenced region (gene) in kb and the Y-axis shows the number of detected SNPs/Indels. The plots show the distribution of SNPs across the gene (values below zero must be regarded as zero). The graphics in the middle show the corresponding gene structure (blue = introns, yellow = exons). The graphics at the bottom (pink) show the coverage pattern of each gene.


190 Appendix 5. The full list of breeding lines (studied population) and their pedigree information. barcode barcode 08 pedigree Cross *YRR07=01-01* *YRR08=01-03* ILLABONG/SARA YC B *YRR07=01-15* *YRR08=01-05* ILLABONG/VIALONE NANO-Y4 YC B *YUR07=08-19* *YRR08=01-07* ILLABONG///M102//M201/YRM3 YC B *YRR07=01-19* *YRR08=01-08* YRB4 YC *YRR07=03-14* *YRR08=01-08* YRB4 YC *YUR07=11-18* *YRR08=01-09* ILLABONG/4/YRB3///YRM2//M7/RIYC B *YUR07=02-17* *YRR08=01-10* ILLABONG/MILLIN YC B *YRR07=02-10* *YRR08=02-11* ILLABONG///YR83/M9//M7 YC B *YRR07=02-20* *YRR08=02-13* ILLABONG/VIALONE NANO-Y4 YC B *YRR07=02-16* *YRR08=02-16* ILLABONG///YR83/M9//M7 YC B *YRE07=10-02* *YRR08=02-19* YRM67 YC *YRR07=02-13* *YRR08=02-20* JARRAH YC *YRR07=04-17* *YRR08=02-20* JARRAH YC *YRR07=02-06* *YRR08=03-02* ILLABONG/YRM39 YC B *YRR07=02-07* *YRR08=03-09* M103///YRM34//YRM3/HUNG.NO.1 YC B *YUR07=02-16* *YRR08=04-02* ILLABONG/4/YRB3///YRM2//M7/RIYC B *YRR07=01-03* *YRR08=04-12* ILLABONG/IR YC B *YRR07=01-08* *YRR08=07-01* YRB3/ARBORIO//MILLIN/WC1043 YC B *YRR07=01-05* *YRR08=07-06* ILLABONG YC *YRR07=03-11* *YRR08=07-06* ILLABONG YC *YRI07=04-08* *YRI08=01-01* /H1//INGA///YRL113 YC B *YUD07=14-15* *YRI08=01-02* YRL118///INGA/M9//213D.25 YC S-7 *YRI07=01-04* *YRI08=01-04* /H1//INGA///YRL113 YC B *YUD07=03-24* *YRI08=01-05* BBL//M9/PELDE///YRL30/4/YRL113YC S-6 *YRI07=01-03* *YRI08=01-06* YRL39/IR //YRL113 YC B *YRI07=02-05* *YRI08=01-07* YRL113/WAB P31-1-HB YC B *YRI07=01-02* *YRI08=01-08* YRL39/IR //YRL113 YC B *YRI07=03-08* *YRI08=02-01* BBL//M9/PELDE///YRL30/4/YRL101YC B

191 *YRI07=03-03* *YRI08=01-09* YRL113///RD91V55//P/GO(4)/D.10 YC B *YUI07=08-12* *YRI08=01-10* YRL113/WAB P31-1-HB YC S-12 *YUI07=14-10* *YRI08=02-03* YRL113//L203/YRL34 YC S-6 *YRI07=02-03* *YRI08=02-06* BBL//M9/PELDE///YRL30/4/LANGI YC B *YRI07=02-08* *YRI08=02-08* YRL113 YC 89045J-0-17 *YRI07=02-04* *YRI08=03-02* 71011//73/M7///P/4/YRL34/5/BBL//MYC B *YRI07=01-07* *YRI08=03-03* /H1//INGA///YRL113 YC B *YRI07=03-02* *YRI08=04-04* YRL113//L203/YRL34 YC B *YUI07=12-12* *YRI08=04-09* BBL//M9/PELDE///YRL30/4/YRL113YC S-9 *YUI07=12-10* *YRI08=04-10* YRL113//L203/YRL34 YC S-10 *YRI07=04-05* *YRI08=05-09* BBL//M9/PELDE///YRL30/4/YRL113YC B *YUI07=06-10* *YRI08=05-10* RIZABELL/YRL113 YC S-14 *YRI07=02-02* *YRI08=06-02* YRL39/IR //YRL113 YC B *YRI07=01-06* *YRI08=06-05* YRL113//L203/YRL34 YC B *YRJ07=05-15* *YRJ08=02-33* M103///M201//YR196/ARDITO YC B *YRJ07=01-08* *YRJ08=02-30* M103/OEIRAS YC B *YRJ07=04-08* *YRJ08=03-28* M103/YRK2 YC 94161S B *YUJ07=09-22* *YRJ08=03-26* YRM49///ILLABONG/YRM54 YC B-2S-9 *YRJ07=03-13* *YRJ08=03-35* AKIHIKARI//KOSHIHIKARI (T)/YR YC 00238A B *YRJ07=03-12* *YRJ08=03-30* M103/YRM44 YC 95050S-5-0-B *YRI07=04-06* *YRI08=06-07* BBL//M9/PELDE///YRL30/4/LANGI YC B *YRI07=04-03* *YRI08=06-08* YRL113//L203/YRL34 YC B *YRI07=04-01* *YRI08=07-03* BBL//M9/PELDE///YRL30/4/YRL113YC B *YRI07=03-07* *YRI08=07-07* BBL//M9/PELDE///YRL30/4/LANGI YC B *YRI07=03-01* *YRI08=08-10* LANGI/LAGRUE//YRL113 YC B *YRI07=01-08* *YRI08=09-01* YRL101///PELDE/M9//M101 YC B *YRI07=02-07* *YRI08=09-10* BBL//M9/P///YRL30/4/LANGI/INGA YC B *YRI07=01-01* *YRI08=10-04* YRL113//L203/YRL34 YC B *YUD07=07-19* *YRI08=14-09* L203//YRL101/LANGI YC B-2S-10 *YRJ07=01-02* *YRJ08=01-13* M103///M201//196/ARD/4/M103/YRMYC B

192 *YRJ07=04-13* *YRJ08=01-17* M103///M201//196/ARD/4/M103/YRMYC B *YRJ07=03-09* *YRJ08=01-20* M103/YRM54 YC B *YUJ07=19-23* *YRJ08=01-22* M103//M201/YRM3///IR /YC S-6 *YRJ07=05-08* *YRJ08=01-24* M103///M201/EIKO//CALROSE YC B *YRJ07=03-04* *YRJ08=01-23* M103/YRM54 YC 99114S-26-B *YRJ07=04-01* *YRJ08=01-27* M103///YRM3/HUNG.NO.1//M401/4/YC B *YRJ07=04-03* *YRJ08=01-30* ECHUCA/SHIMUZI MOCHI//MILLINYC B *YRJ07=02-14* *YRJ08=01-32* M103///M201//196/ARD/4/M103/YRMYC B *YRJ07=05-01* *YRJ08=01-33* M103/YRM54 YC 99114S-30-B *YRJ07=01-20* *YRJ08=01-35* M103/YRM54 YC B *YRJ07=02-15* *YRJ08=02-16* M103///M201//196/ARD/4/M103/YRMYC B *YRJ07=04-02* *YRJ08=02-19* M102/M103//M103 YC B *YRJ07=01-12* *YRJ08=02-20* ECHUCA/80023-TR YC 96041T B *YRJ07=01-07* *YRJ08=02-21* M103///YRM3/HUNG.NO.1//M401/4/YC B *YRJ07=05-14* *YRJ08=02-22* M103///M201//196/ARD/4/M103/YRMYC B *YUJ07=19-25* *YRJ08=02-24* M103//M201/R///M201/YRM3//BOG.YC S-6 *YRJ07=04-19* *YRJ08=02-26* M103///M201//YR196/ARDITO YC 92243J-0-5-B *YRJ07=01-18* *YRJ08=02-27* M103/YRM18 YC 90019J-50-0-B *YRJ07=02-09* *YRJ08=02-28* M103//M401/CALROSE YC B *YRJ07=03-15* *YRJ08=02-35* M103///YRM3/HUNG.NO.1//M401/4/YC B *YRJ07=01-03* *YRJ08=03-12* M102/M103 YC 92076S B *YUR07=01-16* *YRJ08=03-13* VIALONE NANO Y01/008 *YRJ07=05-19* *YRJ08=03-14* M103///M201//YR196/ARDITO YC 92243J-0-11-B *YRJ07=02-04* *YRJ08=03-15* M103///YRM3/HUNG.NO.1//M401/4/YC B *YRJ07=02-19* *YRJ08=03-17* M103///YRM3/HUNG.NO.1//M401/4/YC B *YRJ07=02-03* *YRJ08=03-18* M104 Y03/009 *YRJ07=04-09* *YRJ08=03-16* JARRAH YC *YRJ07=02-12* *YRJ08=03-25* M103/YRM54 YC B *YRJ07=05-06* *YRJ08=03-23* LIMAN Y98/001 *YRE07=10-04* *YRA08=01-03* YRM64/NORIN PL11 YC B

193 *YRA07=03-10* *YRA08=01-06* REIZIQ YC 86003S-12-0 *YRE07=19-01* *YRA08=01-05* M401/YRM42//YRM54 YC B *YRE07=15-01* *YRA08=01-09* NAMAGA///M201/YRM3//BOGAN YC 98140S-89-B *YUA07=04-28* *YRA08=01-10* KOSHIHIKARI(T)/M202//BOGAN YC S-10 *YUA07=05-27* *YRA08=02-01* YRM65//JARRAH/AMAROO YC B-2S-19 *YRA07=04-05* *YRA08=02-02* M201//YR196/ARDITO///YRM54 YC 99113S-57-B *YRA07=03-04* *YRA08=02-03* PARAGON YC *YRA07=02-10* *YRA08=02-05* M201//YR196/ARDITO///YRM54 YC B *YRA07=04-14* *YRA08=02-06* M7/M201//M103 YC B *YRA07=02-16* *YRA08=02-07* YRM54/YRM61 YC 97073S-93-0-B *YRA07=03-05* *YRA08=02-08* YRM54/YRM61 YC 97073S B *YUA07=05-20* *YRA08=02-12* YRM65//JARRAH/AMAROO YC B-2S-1 *YRA07=04-09* *YRA08=03-01* YRM66 YC *YUA07=17-19* *YRA08=03-02* IR /MILLIN//YRM63 YC B *YRA07=05-03* *YRA08=03-03* YRM68 YC *YRA07=05-02* *YRA08=03-04* NAMAGA YC *YRA07=03-03* *YRA08=03-05* M201//YR196/ARDITO///YRM54 YC B *YRA07=04-12* *YRA08=03-06* M201//YR196/ARDITO///YRM54 YC B *YRA07=03-12* *YRA08=03-08* QUEST_CT19 YC *YUA07=17-22* *YRA08=03-10* KOSHIHIKARI/M102//YRM43 YC B *YRA07=03-06* *YRA08=03-11* YRK4/SR YC B *YRA07=01-06* *YRA08=04-02* M201//YR196/ARDITO///YRM54 YC B *YRA07=05-10* *YRA08=04-04* OPUS/MATSURIBARE YC B *YRE07=14-04* *YRA08=04-05* YRM64/NORIN PL11 YC B *YUA07=05-21* *YRA08=04-07* M201/YRM3//BOGAN///YRM33 YC B *YRA07=03-07* *YRA08=04-10* M201//YR196/ARDITO///YRM54 YC B *YRA07=02-08* *YRA08=04-11* YRM54/YRM61 YC 97073S B *YRA07=02-01* *YRA08=04-12* QUEST YC *YRA07=05-08* *YRA08=05-02* M201//YR196/ARDITO///YRM54 YC B *YRA07=01-07* *YRA08=05-03* MILLIN YC

194 *YRA07=01-16* *YRA08=05-04* JARRAH YC *YRA07=04-03* *YRA08=05-08* CALHIKARI Y03/004 *YRA07=02-04* *YRA08=05-11* YRM42//BOGAN/M302 YC 97048S-28-0-B *YRA07=01-02* *YRA08=06-01* OPUS/MATSURIBARE YC B *YRA07=01-03* *YRA08=06-02* YRM64 YC *YRA07=05-06* *YRA08=06-04* AMAROO YC 79011S-0-32 *YRA07=01-11* *YRA08=06-05* YRM54//ECHUCA/SHIMUZI MOCHYC B *YUA07=17-25* *YRA08=06-06* M204/YRM43 YC B-2S-13-6B-S-5 *YRA07=03-16* *YRA08=06-07* ILLABONG/YRM54 YC B *YUA07=01-28* *YRA08=06-08* M401/YRM42//YRM54 YC B *YUA07=20-22* *YRA08=06-10* YRM64/NORIN PL11 YC B *YRA07=01-13* *YRA08=06-11* OPUS YC *YRA07=05-05* *YRA08=06-12* KOSHIHIKARI YC *YRB07=03-08* *YRB08=01-03* YRF205/LANGI YC B *YRB07=04-07* *YRB08=01-06* DELLMONT/LANGI YC B *YRB07=05-08* *YRB08=01-09* DOONGARA/YRL38 YC 95096S B *YUB07=14-24* *YRB08=01-10* LANGI///&(DAWN/K//IR579/K)/P//DYC B-2S-4 *YUB07=08-26* *YRB08=01-11* YRL39/IR //YRL123 YC 02093B-B-2S-13 *YRB07=04-02* *YRB08=01-16* YRL34//INGA/M9(5)/PEL///DOONGAYC B *YRB07=05-03* *YRB08=02-01* YRL125 YC *YRB07=05-01* *YRB08=02-03* YRL122/4/71011//M9/PEL//YRL29 YC B *YUB07=13-20* *YRB08=02-04* YRL39/IR //YRL123 YC 02093B-B-2S-31 *YRB07=01-09* *YRB08=02-05* PELDE/GOPALBHOG(4)/YR YC *YRB07=04-15* *YRB08=02-08* YRF205/LANGI YC B *YRB07=01-16* *YRB08=02-10* PELDE/GOPALBHOG(4)/YR YC *YRD07=04-10* *YRB08=02-12* 71011//73/M7///PEL/4/YRL34/5/IR20 YC B *YRB07=05-12* *YRB08=02-16* YRM54/CN /1 YC B *YRD07=04-02* *YRB08=03-03* INGA/L201//DOONGARA///L202 YC B *YRB07=04-03* *YRB08=03-06* KYEEMA YC *YUE07=08-07* *YRE08=05-11* M102/M103//YRM42/SR YC B-2S-4

195 *YRB07=02-12* *YRB08=03-08* DOONGARA/YRL38 YC 95096S B *YRB07=05-11* *YRB08=03-09* DOONGARA/SERATUS MALAM YC 00128T-0-2-B *YRB07=02-15* *YRB08=03-12* YRL123 YC *YRB07=03-17* *YRB08=03-13* LANGI YC *YRD07=04-11* *YRB08=04-01* INGA/L201//DOONGARA///L202 YC B *YRB07=04-13* *YRB08=04-02* YRF205/LANGI YC B *YRE07=12-02* *YRE08=05-07* QUEST YC *YUB07=02-21* *YRB08=04-03* YRL39/IR //YRL123 YC 02093B-B-2S-2 *YRB07=03-10* *YRB08=04-06* YRF205/LANGI YC B *YRB07=05-09* *YRB08=04-07* INGA/L201//DOONGARA///L202 YC B *YRB07=02-07* *YRB08=04-10* DOONGARA YC *YRB07=02-01* *YRB08=04-11* YRL118 YC 89198J-1-1 *YRB07=03-04* *YRB08=04-16* YRF207/L202 YC B *YRB07=05-10* *YRB08=05-02* YRL125_S_CT18 YC *YRB07=04-05* *YRB08=05-11* LANGI/JOJUTLA 4 YC B *YRE07=17-02* *YRE08=01-01* YRM64/TOYONISHIKI YC B *YUE07=05-02* *YRE08=01-11* OPUS/4/M7/KITAKOGANE///M201/ YC B *YRE07=14-02* *YRE08=02-01* YRM65 YC *YUE07=05-03* *YRE08=05-06* M103///YRM3/HUNG.NO.1//M401/4/YC B *YRE07=11-02* *YRE08=02-02* M103/YRM54 YC 99114S-38-B *YRE07=22-05* *YRE08=02-03* M103 YU87/001 *YRE07=22-01* *YRE08=02-12* YRM49//IR /MILLIN YC B *YRE07=18-02* *YRE08=02-09* M103/YRM54 YC 99114S-25-B *YUE07=15-01* *YRE08=05-08* OPUS//KOSHIHIKARI (T)/M202 YC S-5 *YRE07=22-03* *YRE08=02-10* M103///YRM3/HUNG.NO.1//M401/4/YC B *YUE07=11-04* *YRE08=02-11* HITOMEBORE//YRM39/AKITAKOMYC S-1 *YUE07=01-12* *YRE08=02-05* YRM49///ILLABONG/YRM54 YC B-2S-5 *YRE07=09-05* *YRE08=05-05* M102//M201/BOGAN///YRM54 YC 99104S-9-B *YRE07=20-02* *YRE08=03-03* MILLIN YC *YRE07=23-04* *YRE08=03-06* JARRAH YC

196 *YUE07=20-13* *YRE08=03-05* M103/YRM54 YC 99114S-11-B *YRE07=14-03* *YRE08=03-09* ECHUCA YC 81121DS *YUE07=14-11* *YRE08=05-03* M201/YRM3//BOGAN///OPUS YC B *YUE07=10-03* *YRE08=03-10* M103/YRM49 YC 99219S-7-B *YRE07=10-03* *YRE08=03-11* QUEST_CT19 YC *YRE07=09-02* *YRE08=04-01* YRM64/TOYONISHIKI YC B *YUE07=08-02* *YRE08=04-07* M103/HITOMEBORE YC S-11 *YUE07=01-10* *YRE08=04-09* YRM54//YRK4/KOSHIHIKARI (TYNYC B-2S-5 *YUE07=17-11* *YRE08=04-10* OPUS//KOSHIHIKARI (T)/M202 YC S-8 *YRE07=20-03* *YRE08=04-11* YRM54/M202 YC 97027S-22-0-B *YUE07=02-01* *YUE08=02-14* MILLIN YC *YUE07=04-09* *YUE08=02-18* JARRAH YC *YRJ07=04-06* *YUJ08=13-20* SPRINT Y98/005 *YUD07=02-14* *YUD08=01-22* L205 Y03/008 *YUD07=02-19* *YUD08=01-15* YRL113 YC 89045J-0-17 *YUD07=01-16* *YUD08=01-20* YRL118 YC 89198J-1-1 *YRD07=04-03* *YRD08=01-03* YRL111 YC *YUD07=05-19* *YRD08=01-04* L202///BASMATI 370/PELDE//BASMYC B *YRD07=04-01* *YRD08=01-08* YRL113//H /YRL34 YC B *YRD07=02-05* *YRD08=01-09* LANGI/IR YC B *YUD07=14-22* *YRD08=01-10* THAIBONNET/YRL101 YC S-6 *YRD07=01-11* *YRD08=01-07* BBL//M9/P///YRL30/4/LANGI/INGA YC B *YRD07=01-07* *YRD08=02-03* L205 Y03/008 *YRD07=03-04* *YRD08=02-04* YRL118 YC 89198J-1-1 *YRD07=02-08* *YRD08=02-05* LANGI/IR YC B *YRD07=05-12* *YRD08=02-06* 213D.25/83//M7/IRR.ING///YRL38 YC B *YRD07=05-08* *YRD08=02-08* BBL//M9/P///YRL30/4/LANGI/INGA YC B *YUD07=08-15* *YRD08=02-09* DELLMONT//BASMATI 370/PELDEYC S-16 *YRD07=02-02* *YRD08=03-01* BBL//M9/P///YRL30/4/LANGI/INGA YC B *YRD07=02-07* *YRD08=03-02* LANGI/INGA//PELDE YC B

197
*YRD07=02-04* *YRD08=03-04* LANGI/INGA//PELDE YC B
*YRD07=05-11* *YRD08=03-05* LANGI/LAGRUE YC B
*YRD07=01-03* *YRD08=03-06* (PELDE*2/CALROSE76)*2//DOONGYC 99248S-10-B
*YRD07=01-04* *YRD08=03-07* YRL101//IR72/YRL39 YC B
*YRD07=03-05* *YRD08=03-10* YRL122/THAIBONNET YC B
*YUD07=03-22* *YRD08=04-01* M103/DOONGARA DH1 YC 03370DH-15
*YRD07=03-07* *YRD08=04-02* L203/YRL39//YRL101 YC B
*YRD07=03-08* *YRD08=04-03* BBL//M9/P///YRL30/4/LANGI/INGA YC B
*YUD07=06-17* *YRD08=04-04* YRL39/IR //YRL113 YC B
*YRD07=02-03* *YRD08=04-05* YRL113 YC 89045J-0-17
*YRD07=01-05* *YRD08=04-08* LANGI/IR YC B
*YUD07=03-20* *YRD08=04-09* YRF205/LANGI YC B
*YRD07=03-10* *YRD08=05-01* L202/DOONGARA YC B
*YRD07=05-04* *YRD08=05-03* BBL//M9/P///YRL30/4/LANGI/INGA YC B
*YRD07=05-02* *YRD08=05-05* YRL101/4/YRL39///213D.25/YR83//MYC B
*YUD07=11-14* *YRD08=05-06* M103/DOONGARA DH1 YC 03370DH-24
*YRD07=01-10* *YRD08=05-07* YRB90 V31/YRL34 YC 90041J-1-24-B
*YUD07=05-18* *YRD08=05-09* I/M9(5)/3/M101/73//P(2)/4/I/5/YRL10YC S-6
*YRD07=03-11* *YRD08=05-10* LANGI/IR YC 97182A B
*YRD07=05-07* *YRD08=06-01* GULFMONT//YRL39/IR YC B
*YRD07=01-02* *YRD08=06-02* YRL39/IR //YRL113 YC B
*YRD07=05-10* *YRD08=06-04* //R/IR36///I/M9(5)/P YC B
*YRD07=02-10* *YRD08=06-06* LANGI/IR YC B
*YRD07=03-02* *YRD08=06-09* L202///BASMATI 370/PELDE//BASMYC B
*YRD07=02-06* *YRD08=06-10* L203 YU85/001

198
Appendix 6. Names and characteristics of the SNPs genotyped in the rice population.
Columns: No; Gene; SNP ID*; Coordinates on gDNA; Expected SNP; SNP assayed; Association with physicochemical traits; Status
1 AGPS2b TBGU G/T G/G N/A No polymorphism
2 AGPS2b TBGI T/C T/T N/A No polymorphism
3 SPHOL TBGU G/T G/G N/A No polymorphism
4 SPHOL TBGU C/T C/C N/A No polymorphism
5 SPHOL TBGU C/A C/C N/A No polymorphism
6 SPHOL TBGU G/T G/G N/A No polymorphism
7 SPHOL TBGU G/T G/G N/A No polymorphism
8 GBSSI WAXYEXIN1 246 T/G T/G P1,BD,FV,SB,MT,AC,PN,Dif Highly associated
9 GBSSI WAXYEX A/C A/C SB,BD,MT,AC Highly associated
10 GBSSI WAXYEX C/T C/T T1,FV,SB,MT,AC,PN,Dif Highly associated
11 GBSSII GBSSII_GA_ G/A G/A PT,GT Low-Medium association
12 SSI TBGU T/C T/C FV,SBV,MT Low-Medium association
13 SSIIa SSIIa_GA_Ref G/T G/T N/A No association
14 SSIIa ALKSSIIA GC/TT GC/TT BDV,SB,PKT,PT,GT,CHK Highly associated
15 SSIIb TBGU A/G A/A N/A No polymorphism
16 SSIIb TBGU G/C G/G N/A No polymorphism
17 SSIIb TBGU T/C T/T N/A No polymorphism
18 SSIIb TBGU G/A A/A N/A No polymorphism
19 SSIIb TBGU C/T C/C N/A No polymorphism
20 SSIIb TBGU T/G T/T N/A No polymorphism
21 SSIIIa GA_Ref T/A T/A PT,MT Low-Medium association
22 SSIIIa GA_Ref G/A G/A SBV,PT,MT,AC,PN,Dif,GT Low association
23 SSIIIa GA_Ref G/A G/A N/A No association
24 SSIIIa GA_Ref T/A T/A N/A No association
25 SSIIIa GA_Ref T/A T/A CHK Low association
26 SSIIIa GA_Ref G/A G/A N/A No association
27 SSIIIa GA_Ref A/C A/C FV,SBV,PT,MT,AC,PN,Dif Low-Medium association
28 SSIIIa GA_Ref G/A G/A MT,AC,PN,Dif,GT Low-Medium association
29 SSIIIa GA_Ref G/A G/A N/A No association
30 SSIIIa GA_Ref T/C T/C N/A No association
31 SSIIIa GA_Ref A/C A/C N/A No association
32 SSIIIa GA_Ref C/T C/T N/A No association
33 SSIIIa GA_Ref C/T C/T N/A No association

199
34 SSIIIa GA_Ref G/A G/A N/A No association
35 SSIIIa GA_Ref1722ER 1722 G/A G/A FV,SBV,PT,MT,AC,PN,Dif,GT Low-Medium association
36 SSIIIa GA_Ref C/T C/T N/A No association
37 SSIIIa GA_Ref G/A G/A N/A No association
38 SSIIIa GA_Ref G/A G/A MT No association
39 SSIIIa GA_Ref C/T C/T N/A No association
40 SSIIIa GA_Ref G/A G/A N/A No association
41 SSIIIa GA_Ref G/A G/A FV,SBV,PT,MT,AC,PN,Dif Low-Medium association
42 SSIIIa GA_Ref C/T C/T PT Low association
43 SSIIIb GA_Ref T/C T/C PT Medium association
44 SSIIIb GA_Ref C/A C/A PT Medium association
45 SSIIIb GA_Ref T/C T/C PT Medium association
46 SSIIIb GA_Ref T/G T/G PT Medium-High association
47 SSIIIb GA_Ref7255ER 7255 C/A C/A PKV Medium association
48 SSIIIb GA_Ref A/C A/C PT,Dif Low-Medium association
49 SSIVa GA_Ref C/T C/T PT,GT Low-Medium association
50 SSIVa GA_Ref A/G A/G PKT,PT,AC,PN,GT Low-Medium association
51 SSIVa GA_Ref A/T A/T PT,GT Low-Medium association
52 SSIVa GA_Ref T/C T/C PT,GT Low-Medium association
53 SSIVa GA_Ref C/A C/A PT,GT Medium association
54 SSIVb TBGU G/C G/G N/A No polymorphism
55 SSIVb TBGU G/A G/G N/A No polymorphism
56 BEI GA_Ref C/T C/T PV,BDV,FV,SBV,PT,MT,AC,PN,Dif Low-Medium association
57 BEIIa GA_Ref T/G T/G N/A No association
58 BEIIb GA_Ref C/T C/T N/A No association
59 BEIIb GA_Ref C/A C/A N/A No association
60 ISA1 TBGU G/A G/G N/A No polymorphism
61 ISA1 TBGU C/G C/C N/A No polymorphism
62 ISA2 Iso2_GA_Ref T/C T/C BDV,PT,CHK Low association
63 ISA2 Iso2_GA_Ref C/A C/A BDV,PT,CHK Low association
64 Pullulanase TBGU G/A G/A PT,GT Low association
65 Pullulanase TBGU T/C T/C CHK Low association
*SNP identification can be found in Kharabian-Masouleh et al., 2011 (IDs starting with the GA code) or in the OryzaSNP MSU database (IDs starting with TBG or TBU codes). A homozygous SNP call means no polymorphism at the corresponding position. MT=Martin test (retrogradation), PN=Predicted N, Dif=Difference, CHK=Chalkiness (%).
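The footnote rule for the Status column (a homozygous assayed call means no polymorphism) can be illustrated with a minimal Python sketch. The function name `is_polymorphic` and the `allele/allele` string format are assumptions made for illustration, based on how calls such as `G/G` or `GC/TT` are written in the table; this is not the genotyping pipeline used in the thesis.

```python
def is_polymorphic(assayed_call: str) -> bool:
    """Return True if an assayed SNP call contains more than one allele.

    The call is assumed to be written as 'allele/allele', e.g. 'G/G' or 'T/G'
    (hypothetical helper, named here for illustration only).
    """
    alleles = [a.strip().upper() for a in assayed_call.split("/")]
    # A call such as G/G collapses to a single distinct allele: no polymorphism.
    return len(set(alleles)) > 1


# Example calls in the same notation as the "SNP assayed" column.
for call in ("G/G", "T/G", "GC/TT"):
    status = "polymorphic" if is_polymorphic(call) else "No polymorphism"
    print(call, status)
```

Under this rule a row with expected SNP G/T but assayed call G/G would be reported as "No polymorphism", matching how such rows are labelled in the table.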

200
Appendix 7. Results of the association study between 13 physicochemical traits and SNPs of 18 genes. The most important columns are F-test and R2_Marker.
Columns: Trait; Locus/SNP; df_marker; F-test; p-value; #perm_ma; p-perm_m; p-adjusted; df_model; df_error; MS_Error; R2_Model; R2_Marker
AGPS2b Section 1: No functional polymorphism found in this gene
SPHOL Section 2: No functional polymorphism found in this gene
GBSSI Section 3
Peak1 WAXYEXIN E E E
Trough1 WAXYEX E E E
Breakdown WAXYEXIN E E E
Breakdown WAXYEX E E
Final Viscosity WAXYEXIN E E E
Final Viscosity WAXYEX E E E
Setback WAXYEXIN E E E
Setback WAXYEX E E E
Martin_N WAXYEXIN E E E
Martin_N WAXYEX E E E
Martin_N WAXYEX E E
AC_percent WAXYEXIN E E E
AC_percent WAXYEX E E E
AC_percent WAXYEX E E
predicted_n WAXYEXIN E E E
predicted_n WAXYEX E E E
predicted_n WAXYEX E E
diff WAXYEXIN E E E
diff WAXYEX E E E
GBSSII Section 4
Past_temp GBSSII_GA_Ref E E E
GT GBSSII_GA_Ref E E
SSI Section 5
Trough1 SSI_TBGU272768_ E E E
FinalVisc SSI_TBGU272768_ E E E
Setback SSI_TBGU272768_ E E E
Martin_N SSI_TBGU272768_ E E
AC_percent SSI_TBGU272768_ E E
predicted_n SSI_TBGU272768_ E E
diff SSI_TBGU272768_ E E
SSIIa Section 6
Breakdown ALKSSIIA E E E

201
Setback ALKSSIIA
PeakTime ALKSSIIA E E E
Past_temp ALKSSIIA E E E
GT ALKSSIIA E E
Chalk% ALKSSIIA E E
SSIIb Section 7: No Functional Polymorphism found in this gene
SSIIIa Section 8
FinalVisc SSIIIa_GA_Ref E E
FinalVisc SSIIIa_GA_Ref1722ER E E
FinalVisc SSIIIa_GA_Ref E E
Setback SSIIIa_GA_Ref E E
Setback SSIIIa_GA_Ref E E
Setback SSIIIa_GA_Ref1722ER E E E
Setback SSIIIa_GA_Ref E E
Past_temp SSIIIa_GA_Ref E E E
Past_temp SSIIIa_GA_Ref E E E
Past_temp SSIIIa_GA_Ref E E
Past_temp SSIIIa_GA_Ref1722ER E E E
Past_temp SSIIIa_GA_Ref E E
Past_temp SSIIIa_GA_Ref E E E
Martin_N SSIIIa_GA_Ref E E
Martin_N SSIIIa_GA_Ref E E
Martin_N SSIIIa_GA_Ref E E
Martin_N SSIIIa_GA_Ref E E
Martin_N SSIIIa_GA_Ref1722ER E E E
Martin_N SSIIIa_GA_Ref E E
Martin_N SSIIIa_GA_Ref E E
AC_percent SSIIIa_GA_Ref E E
AC_percent SSIIIa_GA_Ref E E E
AC_percent SSIIIa_GA_Ref E E
AC_percent SSIIIa_GA_Ref1722ER E E
AC_percent SSIIIa_GA_Ref E E
predicted_n SSIIIa_GA_Ref E E
predicted_n SSIIIa_GA_Ref E E E
predicted_n SSIIIa_GA_Ref E E
predicted_n SSIIIa_GA_Ref1722ER E E
predicted_n SSIIIa_GA_Ref E E
diff SSIIIa_GA_Ref E E
diff SSIIIa_GA_Ref E E

202
diff SSIIIa_GA_Ref E E
diff SSIIIa_GA_Ref1722ER E E
diff SSIIIa_GA_Ref E E
GT SSIIIa_GA_Ref E E
GT SSIIIa_GA_Ref E E
GT SSIIIa_GA_Ref1722ER E E
Chalk% SSIIIa_GA_Ref E E E
SSIIIb Section 9
Peak Viscosity SSIIIb_GA_Ref7255ER E E
Past_temp SSIIIb_GA_Ref E E E
Past_temp SSIIIb_GA_Ref E E E
Past_temp SSIIIb_GA_Ref E E E
Past_temp SSIIIb_GA_Ref E E E
Past_temp SSIIIb_GA_Ref7255ER E E E
Past_temp SSIIIb_GA_Ref E E E
diff SSIIIb_GA_Ref E E
SSIVa Section 10
PeakTime SSIva_GA_Ref E E E
Past_temp SSIva_GA_Ref E E E
Past_temp SSIva_GA_Ref E E E
Past_temp SSIva_GA_Ref E E E
Past_temp SSIva_GA_Ref E E E
Past_temp SSIva_GA_Ref E E E
AC_percent SSIva_GA_Ref E E E
predicted_n SSIva_GA_Ref E E
GT SSIva_GA_Ref E E
GT SSIva_GA_Ref E E E
GT SSIva_GA_Ref E E
GT SSIva_GA_Ref E E
GT SSIva_GA_Ref E E
SSIVb Section 11
SSIvb_TBGU260749_5090, SSIvb_TBGU260765_9525: No polymorphism detected
BEI Section 12
Peak Viscosity BEI_GA_Ref E E
Breakdown Visco BEI_GA_Ref E E E
FinalViscosity BEI_GA_Ref E E
Setback Viscosity BEI_GA_Ref E E E
Past_temp BEI_GA_Ref E E E
Martin_N BEI_GA_Ref E E

203
AC_percent BEI_GA_Ref E E E
predicted_n BEI_GA_Ref E E E
diff BEI_GA_Ref E E
BEIIa Section 13
BEIIa_GA_Ref3266: No significant association with starch properties was observed for this SNP
BEIIb Section 14
BEIIb_GA_Ref9035, BEIIb_GA_Ref10068: No significant association with starch properties was observed for these SNPs
Iso1 Section 15
TBGU362347_1748ER, TBGU362346_1746EF: No polymorphism detected
Iso2 Section 16
Breakdown Iso2_GA_Ref E E
Breakdown Iso2_GA_Ref E E
Past_temp Iso2_GA_Ref E E E
Past_temp Iso2_GA_Ref E E E
Chalk% Iso2_GA_Ref E E E
Chalk% Iso2_GA_Ref E E
Pullulanase Section 17
Past_temp Pullu_TBGU185983_ E E E
GT Pullu_TBGU185983_ E E E
Chalk% Pullu_TBGU185989_ E E
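To make the two columns highlighted in the Appendix 7 caption concrete, the following Python sketch computes an F-test statistic and a marker R2 for a single SNP and a single trait. It assumes a plain one-way ANOVA on genotype classes with no population-structure covariates, which is a simplification and not necessarily the model used in the thesis. The function name `marker_anova`, the genotype labels and the trait values are all hypothetical and synthetic.

```python
from collections import defaultdict


def marker_anova(genotypes, trait_values):
    """One-way ANOVA of a trait on SNP genotype classes.

    Returns (f_stat, r2_marker, df_marker, df_error).
    Illustrative helper only; assumes at least two genotype classes and
    more observations than classes.
    """
    groups = defaultdict(list)
    for genotype, value in zip(genotypes, trait_values):
        groups[genotype].append(value)

    n = len(trait_values)
    df_marker = len(groups) - 1
    df_error = n - len(groups)
    if df_marker < 1 or df_error < 1:
        raise ValueError("need at least two genotype classes and spare error df")

    grand_mean = sum(trait_values) / n
    ss_total = sum((y - grand_mean) ** 2 for y in trait_values)
    # Between-class sum of squares: variation explained by the marker.
    ss_marker = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    ss_error = ss_total - ss_marker

    f_stat = (ss_marker / df_marker) / (ss_error / df_error)
    r2_marker = ss_marker / ss_total
    return f_stat, r2_marker, df_marker, df_error


# Synthetic example: six lines, two genotype classes, a made-up trait.
snp_calls = ["T/T", "T/T", "T/T", "G/G", "G/G", "G/G"]
trait = [18.0, 19.0, 20.0, 24.0, 25.0, 26.0]
f_stat, r2_marker, df_marker, df_error = marker_anova(snp_calls, trait)
print(f_stat, r2_marker, df_marker, df_error)
```

In this sketch R2_Marker is the share of total trait variation explained by the genotype classes, and the F-test compares the between-class mean square with the within-class mean square.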

204
Appendix 8. The linkage map of 17 starch-related genes, showing the approximate location of each gene on the chromosomes (Chr). The red lines show the exact location of each gene on the chromosomes.
Panels (pages 204-206): SSIVa, SSIIb, BEIIb, SPHOL, Pullulanase, BEIIa, SSIIIb, ISA2, GBSSI, SSI, SSIIa, BEI


More information

Structural Polysaccharides

Structural Polysaccharides Carbohydrates & ATP Carbohydrates include both sugars and polymers of sugars. The simplest carbohydrates are the monosaccharides, or simple sugars; these are the monomers from which more complex carbohydrates

More information

BIOCHEMISTRY & MEDICINE:

BIOCHEMISTRY & MEDICINE: BIOCHEMISTRY & MEDICINE: INTRODUCTION Biochemistry can be defined as the science of the chemical basis of life (Gk bios "life"). The cell is the structural unit of living systems. Thus, biochemistry can

More information

Lecture 34. Carbohydrate Metabolism 2. Glycogen. Key Concepts. Biochemistry and regulation of glycogen degradation

Lecture 34. Carbohydrate Metabolism 2. Glycogen. Key Concepts. Biochemistry and regulation of glycogen degradation Lecture 34 Carbohydrate Metabolism 2 Glycogen Key Concepts Overview of Glycogen Metabolism Biochemistry and regulation of glycogen degradation Biochemistry and regulation of glycogen synthesis What mechanisms

More information

Starch in western diets

Starch in western diets Starches How much do we eat? Where does it come from? Characteristics of starch Starch digestion - rate and extent Starch gelatinisation Glycaemic index of starchy foods Resistant starch Conclusions Starch

More information
