This guest post is by Ewan Birney on Genomic and phenotypic characterisation of a wild Medaka population: Establishing an isogenic population genetic resource in fish, arXived here.
Our lab is part of a collaboration spanning Japan, two groups in Germany and EMBL-EBI in the UK, which put the paper “Genomic and phenotypic characterisation of a wild Medaka population: Establishing an isogenic population genetic resource in fish” on the arXiv pre-print server. Here I’ve taken up a kind invitation to post about this paper.
Before getting into the details of the paper, I should introduce Medaka, scientific name Oryzias latipes. Medaka is a small fish which lives in both fresh and brackish water across the majority of the Japanese archipelago (the exception is Hokkaido, the large northern island in Japan), the Korean peninsula and the eastern China region. It has been kept as a garden pet (and then in aquaria) in Japan for a long time, with documented colourful strains in Japanese art since the 17th century (this is a woodblock from the celebrated artist Ando Hiroshige (安藤 広重) depicting Goldfish and the much smaller Medaka fish – the medaka fish are the small horizontal shoals, not the colourful goldfish sadly). It is widespread in the wild, in particular in rice paddies, hence its other common name, “the Japanese rice paddy fish”. After the rediscovery of Mendel’s work at the turn of the twentieth century, scientists in Japan started to use the established colourful strains to explore genetics. The most famous paper from this era is the first discovery in any species of crossover of the sex chromosomes, X-Y. Rather brilliantly this is an open access paper from Genetics (Aida T: On the Inheritance of Color in a Fresh-Water Fish, APLOCHEILUS LATIPES Temmick and Schlegel, with Special Reference to Sex-Linked Inheritance. Genetics 1921, 6:554-573. http://europepmc.org/articles/PMC1200522 Note: at the time, the systematics in this area of fish was different, hence the different genus name). Since this early genetics, Medaka has been used for research both inside and outside Japan over the 20th century, with a well established linkage map, transgenic procedures, genome sequence, and considerable probing of different phenotypes.
In my experience of describing Medaka to European and American audiences, this is the point at which to answer the usual questions about similarities and contrasts with Zebrafish, the most commonly used laboratory fish in the West. The first important thing to realise is that Medaka is on a very different branch of the Teleost (bony fish) lineage from Zebrafish, separated by an estimated 250-300 million years of divergence – so the Medaka fish genome is only marginally closer to the Zebrafish genome than either is to mammals. One should expect quite different biological details in the two systems, and each system to be equally applicable to mammalian systems. Medaka is somewhat smaller than Zebrafish and will nearly always live in the same tank format and the same water system as Zebrafish (many labs co-culture both Medaka and Zebrafish). Generation time is similar (6 weeks to 3 months). Zebrafish lay around 1,000 eggs in a single mating, which is a distinct advantage over Medaka’s clutch of around 30 eggs in a single mating, held to the female. However, whereas zebrafish mate only once per week, requiring a ‘recovery phase’, medaka mate every day. Thus the difference in fecundity over time is small. Both zebrafish and medaka have transparent chorions (egg shells); the embryo itself is also completely transparent in both species, making them ideal model systems for studying development. Generally all techniques that have been established for one species are also applicable to the other, such as transgenesis by simple injection of suitable DNA vectors, or antisense morpholinos. Medaka fish genetics is cleaner, with many inbred laboratory lines maintained by single brother-sister mating, and thus very homozygous throughout. Finally these medaka inbred lines are often made from wild-catch individuals, with an established breeding protocol to achieve homozygosity from wild individuals.
It is this last feature that we would like to leverage here. With an inbreeding protocol from the wild, one can set up a near-isogenic wild panel, similar to the panels that have already been developed in Arabidopsis and Drosophila. These panels are proving very informative for quantitative genetics in both of these fields. Once such a panel is established it is a powerful mapping resource and is one of the few ways to study gene-environment effects, as one can repeat phenotyping experiments over the same genotypes but in differing laboratory environments (say, high to low calorie diets, or different temperatures, or different small molecules added to the water). Being able to have such a panel with a vertebrate will be very powerful (Medaka fish has all the common cell types and tissues for a vertebrate – brain, heart, liver, muscle, gut, pancreas – both endocrine and exocrine, kidney). The main purpose of this paper is to find and characterise the source wild population for such a panel.
A good genetic panel ideally needs to be free of population structure. One also needs the right linkage disequilibrium (LD) properties. In previous decades there was a sweet spot for LD in the genome; the longer the LD the cheaper it was to genotype, but shorter LD gave better resolution of where a functional variant is. With the advent of cheap sequencing, this trade-off logic has changed to finding a population with short LD to have the best possible mapping resolution. Finally there are also practical aspects for choice of population – one would like to be able to resample from the same region easily (for example, to add to the panel in the future). After looking at a number of sites where we assessed population properties via mitochondrial typing, we chose a site, Kiyosu (https://maps.google.com/maps?q=34.78113,+137.347928333056(Kiyosu%20Sampling%20Site)&iwloc=A&hl=en), close to Nagoya where the NIBB Medaka resource group under Kiyoshi Naruse is housed. From this population we caught a number of individuals and set up 8 breeding pairs. For each of the 8 pairs, we sequenced the two parents and one child (a “trio” in the parlance of genetics). We chose this sequencing structure as it means we can phase the parental genotypes using the child’s genotype, and in effect sample 16 haplotypes from this population.
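To make the trio idea concrete, here is a minimal sketch of how a child's genotype can phase its parents at a single biallelic site. It is an illustration only, not the pipeline used in the paper: the function name `phase_trio_site` and the encoding of genotypes as alternate-allele counts (0, 1 or 2) are assumptions made for this example.

```python
def phase_trio_site(mother, father, child):
    """Phase one biallelic site in a parent-parent-child trio.

    Genotypes are alternate-allele counts (0, 1 or 2); this encoding is
    an assumption of the sketch. Returns
        ((mother_transmitted, mother_untransmitted),
         (father_transmitted, father_untransmitted))
    or None when the site cannot be phased: either all three individuals
    are heterozygous, or the genotypes violate Mendelian inheritance.
    """
    unphased = {0: (0, 0), 1: (0, 1), 2: (1, 1)}
    m_alleles = unphased[mother]
    f_alleles = unphased[father]

    solutions = set()
    for mi, m_allele in enumerate(m_alleles):
        for fi, f_allele in enumerate(f_alleles):
            # The child carries exactly one allele from each parent.
            if m_allele + f_allele == child:
                solutions.add(((m_allele, m_alleles[1 - mi]),
                               (f_allele, f_alleles[1 - fi])))

    # Exactly one consistent assignment means the site is phased.
    if len(solutions) == 1:
        return solutions.pop()
    return None
```

For example, a heterozygous mother, a homozygous-reference father and a heterozygous child means the mother must have transmitted the alternate allele. Sites where all three are heterozygous stay unphased in this simple per-site view; real phasing tools also use neighbouring sites and sequencing reads.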
From this we show we have a good population for an isogenic panel. There is no discernible population structure, judged both from a distance matrix across all individuals and from the lack of long LD in the population. As expected for a large teleost population, there are a relatively large number of variants, with a segregating SNP every 150 bp on average. In this limited sample the LD is, as expected, quite tight, with the correlation between SNPs (expressed as r²) dropping to “baseline” levels between 5 and 10 kb. From even this limited panel we estimate that almost 40% of SNPs would be mappable to a single exon. We expect this will improve in a more complete panel.
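For readers less familiar with r², the sketch below computes the standard squared-correlation measure of LD between two biallelic SNPs from phased haplotypes. The function name `r_squared` and the 0/1 allele encoding are assumptions for illustration; the paper's own LD analysis may differ in detail.

```python
def r_squared(hap_a, hap_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    hap_a and hap_b are equal-length sequences of 0/1 alleles, one entry
    per phased haplotype, in the same haplotype order.
    Returns None if either SNP is monomorphic in the sample.
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("need two non-empty haplotype lists of equal length")

    p_a = sum(hap_a) / n                                  # allele frequency at SNP A
    p_b = sum(hap_b) / n                                  # allele frequency at SNP B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype

    d = p_ab - p_a * p_b                                  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    return d * d / denom
```

Computing this for many SNP pairs and averaging by physical distance gives the kind of LD decay curve described above, where r² falls to a baseline within a few kilobases.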
To augment this population characterisation, we also performed some high throughput phenotyping, showing that the population has the expected phenotypic diversity driven by genetics. To do this we took advantage of a small number of existing inbred southern line strains. As the numbers are low here, we cannot map any phenotypes (this will require the panel), but we can get an estimate of broad sense heritability, i.e., the proportion of variance of a measurement explained by the differences between inbred lines compared to the differences between individuals of the same line. This is the sort of calculation which is easy to do on inbred panels, and harder to deconvolute in family or outbred cases. For 6 out of 7 traits we chose to measure we get reasonably high broad sense heritability measures.
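As a rough illustration of that calculation, the sketch below estimates broad sense heritability as the share of variance attributable to differences between lines, using one-way ANOVA variance components. This is a common textbook estimator and an assumption on my part, not necessarily the one used in the paper; the function name and the line labels in the usage note are hypothetical.

```python
from statistics import mean


def broad_sense_heritability(lines):
    """Estimate H^2 from replicate measurements on inbred lines.

    lines maps a line label to a list of measurements on individuals of
    that line. Variance components come from a one-way ANOVA:
        between-line variance = (MS_between - MS_within) / n0
        H^2 = between / (between + within)
    A negative between-line estimate is truncated to zero.
    """
    groups = [list(values) for values in lines.values()]
    k = len(groups)
    total_n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or total_n <= k:
        raise ValueError("need at least two lines and replicates within lines")

    grand_mean = sum(sum(g) for g in groups) / total_n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total_n - k)

    # Effective number of replicates per line (handles unequal group sizes).
    n0 = (total_n - sum(len(g) ** 2 for g in groups) / total_n) / (k - 1)

    var_between = max((ms_between - ms_within) / n0, 0.0)
    total = var_between + ms_within
    return var_between / total if total > 0 else 0.0
```

Calling `broad_sense_heritability({"line_a": [1.0, 1.2], "line_b": [3.1, 2.9]})` gives a value close to 1, because almost all of the variance lies between the two lines rather than within them.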
We conclude that we have a good source wild population that is likely to lead to a successful near isogenic panel, and have started inbreeding. We are somewhere between the 3rd and 4th round of brother-sister inbreeding for 200 founder pairs, and all the lines look healthy. Traditionally one considers lines to be inbred after 8 brother-sister matings, so we’re almost half way through. Although in theory the generation time is 3 months, when one does this on 200 lines, the logistics mean it is closer to 4 to 5 months. Felix Loosli is overseeing the inbreeding at the KIT (Karlsruhe Institute of Technology), in Germany.
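To give a feel for what successive rounds of brother-sister mating achieve, the sketch below applies the textbook recursion for the expected inbreeding coefficient under repeated full-sib mating, F_t = (1 + 2 F_{t-1} + F_{t-2}) / 4. It assumes unrelated, non-inbred founders and no selection, which is an idealisation for wild-caught fish; the function name is hypothetical and this is not a calculation from the paper.

```python
def sib_mating_inbreeding(generations):
    """Expected inbreeding coefficient after each round of full-sib mating.

    Uses F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4, starting from unrelated,
    non-inbred founders (an assumption of this sketch).
    Returns a list whose entry i is F for the offspring of round i + 1.
    """
    f_prev2, f_prev1 = 0.0, 0.0   # founders, and their (sibling) offspring
    coefficients = []
    for _ in range(generations):
        f = 0.25 * (1.0 + 2.0 * f_prev1 + f_prev2)
        coefficients.append(f)
        f_prev2, f_prev1 = f_prev1, f
    return coefficients


if __name__ == "__main__":
    for round_number, f in enumerate(sib_mating_inbreeding(8), start=1):
        print(f"round {round_number}: F = {f:.3f}")
```

Under these assumptions the expected coefficient is 0.25 after one round, 0.5 after three and about 0.83 after eight, so some residual heterozygosity is still expected at that point.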
In addition to the main thrust of this paper, we also look at the population genetics of Medaka, as this is a large wild catch with complete genome wide coverage of SNPs. For me the most interesting thing is the relationship with the Northern strain of Medaka. Japan has a large mountain range running roughly down the middle of the main island (called the “Japanese Alps” in English; 日本アルプス, Nihon Arupusu, in Japanese). The Medaka fish found to the north of this are phenotypically different (more heavily pigmented, and prefer to live in shallower tanks), but do interbreed in the laboratory with Southern strains. An open question is whether there has been any partial interbreeding (called introgression) in the wild. Using sensitive tests for introgression between lineages, first developed to detect the Human/Neanderthal interbreeding, we do not see evidence for wild interbreeding between the Northern and Southern strains. This lends support to the view that these two “strains” are best thought of as separate species, and one might expect there to have been specific selection events for the phenotypic properties in both strains.
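One widely used test of this kind is the ABBA-BABA test (Patterson's D statistic). Whether it is exactly the test applied in the paper is an assumption here, and the sketch below is a simplified count-based version using one haplotype per population. The function name and the population roles (P1 and P2 as two Southern samples, P3 as a Northern sample, plus an outgroup) are hypothetical choices for illustration.

```python
def d_statistic(sites):
    """Patterson's D from per-site alleles of (P1, P2, P3, outgroup).

    Each site is a 4-tuple of alleles, one haplotype per population.
    The outgroup allele is treated as ancestral ("A"), the other as
    derived ("B"). Only biallelic sites where P3 carries the derived
    allele are informative:
        ABBA: P2 shares the derived allele with P3
        BABA: P1 shares the derived allele with P3
    D = (ABBA - BABA) / (ABBA + BABA); None if no informative sites.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, outgroup in sites:
        if len({p1, p2, p3, outgroup}) != 2:
            continue                      # skip invariant or multi-allelic sites
        if p3 == outgroup:
            continue                      # P3 must carry the derived allele
        if p1 == outgroup and p2 == p3:
            abba += 1
        elif p2 == outgroup and p1 == p3:
            baba += 1
    if abba + baba == 0:
        return None
    return (abba - baba) / (abba + baba)
```

Without introgression, ABBA and BABA patterns arise equally often through incomplete lineage sorting, so D is expected to be near zero. A D that differs significantly from zero (usually assessed with a block jackknife, omitted here) would suggest gene flow between P3 and one of the other two populations.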
We welcome comments on this paper, but also more generally on the use of the future Kiyosu near isogenic wild panel. We will be completely open about data collection and distribution of data for this panel, wherever possible using global archives to minimise the complications in getting access to the data.
If you are a teleost biologist, the fact that one can co-culture Medaka and Zebrafish means that many phenotypic assays set up for Zebrafish are probably quite easily transferable to Medaka; there is an existing (small) set of inbred lines which you could use to develop assays on. If you are a molecular quantitative geneticist, this panel should have better mapping properties than human or mouse, but a full range of vertebrate cell types. I am looking forward to using some of the statistical techniques developed on Arabidopsis and Drosophila, in particular the gene/environment partitioning components, and if you are interested in gene/environment interactions, this panel will offer some unique opportunities. Of course, this is not a panacea – it takes investment in husbandry details and logistics to bring another species into any system, and quantitative genetics is just one way of exploring genetic effects – forward and reverse genetics are very powerful techniques (which, incidentally, have both been used in Medaka).
If you are interested do contact Ewan Birney (firstname.lastname@example.org) or Ian Dunham (email@example.com) on aspects of the genomics/variation, and Felix Loosli (firstname.lastname@example.org), Jochen Wittbrodt (Jochen.Wittbrodt@cos.uni-Heidelberg.de) or Kiyoshi Naruse (email@example.com) on Medaka husbandry and molecular biology.