SVs are present in every human genome, affecting molecular and cellular processes, regulatory functions, 3D structure, and transcriptional machinery5,7,8. In step 3, filters and heuristics based on the project aims are applied to remove false positives and merge calls (see BOX 2 for details). By contrast, structural variation, defined as genomic events that involve 50 or more base pairs, is difficult to predict confidently in traditional whole-genome sequencing studies. To address this, Hi-C Breakfinder uses a probabilistic model that incorporates information about expected spatial features when determining aberrant contact frequencies91. In spite of these improvements, we are still unable to interpret the functional consequences of the vast majority of variants. The choice of individual technologies will depend on logistical variables such as cost, required resolution, and project scope. Reference sets also vary widely in orthogonal validation: some employ multiple orthogonal platforms, while others perform none, opting to maximize quality metrics instead. New platform ensemble tools are expected to be developed as the cost of sequencing continues to drop and access to new technologies improves. PacBio single-molecule real-time (SMRT) sequencing leverages a stationary polymerase attached to the bottom of a nanosized well and passes single DNA strands through the enzyme to produce long reads that significantly improve unambiguous mappability across the genome93. A number of methods, such as pooled-clone sequencing and Illumina Synthetic Long Reads, produce synthetic long reads, using specific library preparations to infer long-range information from existing short-read sequencers64,65. However, as Hi-C relies on the presence of digestion sites kilobases apart in the linear genome, its resolution is limited. Large, population-scale detection efforts then started to emerge.
A plethora of emerging technologies seek to expand beyond the capabilities of short reads. Additional resources can be found at dbVar174. Molecular cytogenetics techniques, particularly chromosome banding and fluorescence in situ hybridization, powered seminal work involving the detection of microscopic chromosomal aberrations but were unable to identify submicroscopic variants (for brief historical perspectives on cytogenetic-based SV detection, see REFS22,155). Indeed, some groups are still extremely underrepresented: Hispanic and Latin American individuals make up only 7.8% and 16% of the gnomAD-SV and Abel et al. Inter-read signatures involve multiple reads and detect SVs from inconsistencies in orientation, location, and size during mapping, analogous to SR signatures. However, most of the intrachromosomal SVs detected by this method are >2 Mb, as distinguishing them from local interactions is still difficult. Structural variation (SV) is generally defined as a region of DNA approximately 1 kb and larger in size [1] and can include inversions and balanced translocations or genomic imbalances (insertions and deletions), commonly referred to as copy number variants (CNVs). Two ends are sequenced ~100–250 bp with an unsequenced insert size of ~100–600 bp, High-resolution SV datasets typically deriving from, Each putative SV detected by a program is an individual call. Structural variation in the sequencing era. Steve S Ho, Alexander E Urban, Ryan E Mills. Nature Reviews. Single-molecule strategies exist in two dominant forms: (1) long-read sequencing by Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), and (2) optical mapping (OM) by Bionano.
The long reads enabled identification of clustered, complex translocations and inverted duplications that amplified the oncogene ERBB2 to >32 copies, later confirmed in a separate long-read analysis by Sedlazeck et al., providing insight into a possible breast-cancer-specific mechanism96,148. In step 4, final decisions are made to designate and preserve high-confidence calls, which are output as a consolidated list of putative variants. Structural variations (SVs) play important roles in human evolution and diseases, but there is a lack of data resources concerning representative samples, especially for East Asians. SNVs detected by short reads can be sequence-resolved during the discovery stage owing to their smaller size, whereas most SVs require computational inference post hoc. After signature detection, callers typically cluster and merge similar signatures from multiple reads, delineate proximal but different signatures, and choose the highest-quality reads that support the putative SV.
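The clustering and merging of signatures after detection can be illustrated with a minimal sketch. It is not taken from any specific caller: the `Signature` record, the `max_dist` window of 100 bp, and the `min_support` threshold are all hypothetical choices made for illustration, and real callers use considerably richer models.

```python
from dataclasses import dataclass
from statistics import median_low
from typing import Dict, List, Optional


@dataclass
class Signature:
    """One SV-supporting observation from a single read (hypothetical record)."""
    chrom: str
    pos: int       # putative breakpoint position
    sv_type: str   # e.g. "DEL" or "INS"
    read_id: str
    mapq: int      # mapping quality of the supporting read


def cluster_signatures(sigs: List[Signature], max_dist: int = 100) -> List[List[Signature]]:
    """Greedy single-linkage clustering: signatures of the same type on the same
    chromosome join a cluster if they lie within max_dist of its last member."""
    clusters: List[List[Signature]] = []
    for sig in sorted(sigs, key=lambda s: (s.chrom, s.sv_type, s.pos)):
        last = clusters[-1][-1] if clusters else None
        if (last is not None and last.chrom == sig.chrom
                and last.sv_type == sig.sv_type
                and sig.pos - last.pos <= max_dist):
            clusters[-1].append(sig)
        else:
            clusters.append([sig])  # proximal but different signatures stay separate
    return clusters


def summarize(cluster: List[Signature], min_support: int = 2) -> Optional[Dict]:
    """Merge a cluster into one putative call, keeping the best-mapped read."""
    if len(cluster) < min_support:
        return None  # too little read support to report
    best = max(cluster, key=lambda s: s.mapq)
    return {
        "chrom": cluster[0].chrom,
        "sv_type": cluster[0].sv_type,
        "pos": median_low(s.pos for s in cluster),
        "support": len(cluster),
        "best_read": best.read_id,
    }
```

Each cluster yields at most one putative call. Raising `min_support` trades sensitivity for fewer false positives, which is the kind of project-dependent heuristic the filtering step refers to.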
Rare variant discovery by deep whole-genome sequencing of 1,070 Japanese individuals, A standardized framework for representation of ancestry data in genomics studies, with application to the NHGRI-EBI GWAS Catalog, A robust benchmark for germline structural variant detection, Resolving the complexity of the human genome using single-molecule sequencing, Repetitive DNA and next-generation sequencing: computational challenges and solutions, Computational methods for discovering structural variation with next-generation sequencing, Haplotype-resolved genome sequencing of a Gujarati Indian individual, Illumina TruSeq Synthetic Long-Reads Empower De Novo Assembly and Resolve Complex, Highly-Repetitive Transposable Elements, Haplotyping germline and cancer genomes with high-throughput linked-read sequencing, Read clouds uncover variation in complex regions of the human genome, Resolving the full spectrum of human genome variation using Linked-Reads, Genome-wide reconstruction of complex structural variants using read clouds, Identifying structural variants using linked-read sequencing data, Discovery of large genomic inversions using long range information, Characterization of segmental duplications and large inversions using Linked-Reads, LinkedSV: Detection of mosaic structural variants from linked-read exome and genome sequencing data, Identification of large rearrangements in cancer genomes with barcode linked reads, De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse populations, Weisenfeld NI, Kumar V, Shah P, Church DM & Jaffe DB, Direct determination of diploid genome sequences, Meleshko D, Marks P, Williams S & Hajirasouliha I, Detection and assembly of novel sequence insertions using Linked-Read technology, Sedlazeck FJ, Lee H, Darby CA & Schatz MC, Piercing the dark matter: bioinformatics of long-range sequencing and mapping, Shajii A, Numanagi I, Whelan C & Berger B, Statistical Binning for Barcoded Reads Improves Downstream 
Analyses, DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution, Characterizing polymorphic inversions in human genomes by single-cell sequencing, Hills M, ONeill K, Falconer E, Brinkman R & Lansdorp PM, BAIT: Organizing genomes and mapping rearrangements in single cells, Sanders AD, Falconer E, Hills M, Spierings DCJ & Lansdorp PM, Single-cell template strand sequencing by Strand-seq enables the characterization of individual homologs, Comprehensive Mapping of Long-Range Interactions Reveals Folding Principles of the Human Genome, Hi-C as a tool for precise detection and characterisation of chromosomal rearrangements and copy number variation in human tumours, Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions, Genome-Wide Analysis of Interchromosomal Interaction Probabilities Reveals Chained Translocations and Overrepresentation of Translocation Breakpoints in Genes in a Cutaneous T-Cell Lymphoma Cell Line, Nucleome Analysis Reveals Structure-Function Relationships for Colon Cancer, Identification of copy number variations and translocations in cancer cells from Hi-C data, Local and global chromatin interactions are altered by large genomic deletions associated with human brain development, Integrative detection and analysis of structural variation in cancer genomes, Chromatin conformation analysis of primary patient tissue using a low input Hi-C method, Genomic dark matter: the reliability of short read mapping illustrated by the genome mappability score, PBHoney: identifying genomic variants via long-read discordance and interrupted mapping, Discovery and genotyping of structural variation from long-read haploid genome sequence data, Accurate detection of complex structural variations using single-molecule sequencing, SVIM: structural variant identification using mapped long reads, Detection and visualization of complex structural variants from long reads, NextSV: a meta-caller for 
structural variants from low-coverage long-read sequencing data, Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome, Assembly and diploid architecture of an individual human genome via single-molecule technologies, Long-read sequencing and de novo assembly of a Chinese genome, De novo assembly and phasing of a Korean human genome, De Novo Assembly of Two Swedish Genomes Reveals Missing Segments from the Human GRCh38 Reference and Improves Variant Calling of Population-Scale Sequencing Data, High-resolution comparative analysis of great ape genomes, Construction of JRG (Japanese reference genome) with single-molecule real-time sequencing, Characterizing the Major Structural Variant Alleles of the Human Genome, Continuous base identification for single-molecule nanopore DNA sequencing, Real-Time DNA Sequencing from Single Polymerase Molecules, Picky comprehensively detects high-resolution structural variants in nanopore long reads, Mapping and phasing of structural variation in patient genomes using nanopore sequencing, Structural variants identified by Oxford Nanopore PromethlON sequencing of the human genome, Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly, Ordered restriction maps of Saccharomyces cerevisiae chromosomes constructed by optical mapping, High-resolution human genome structure by single-molecule analysis, Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology, Genome-Wide Structural Variation Detection by Genome Mapping on Nanochannel Arrays, Genome maps across 26 human populations reveal population-specific patterns of structural variation, OMSV enables accurate and comprehensive identification of large structural variations from nanochannel-based single-molecule optical maps, Rapid Automated Large Structural Variation Detection in a Diploid Genome by NanoChannel Based Next-Generation Mapping, 
Comparative assessment of long-read error correction software applied to Nanopore RNA-sequencing data, A comparative evaluation of hybrid error correction methods for error-prone long reads, A comprehensive evaluation of long read error correction methods, Next generation mapping reveals novel large genomic rearrangements in prostate cancer, An Integrated Framework for Genome Analysis Reveals Numerous Previously Unrecognizable Structural Variants in Leukemia Patients Samples, Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562, Haplotype-resolved and integrated genome analysis of the cancer cell line HepG2, Optical mapping reveals a higher level of genomic architecture of chained fusions in cancer, Assessing structural variation in a personal genometowards a human reference diploid genome, HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies, Pan-cancer analysis of somatic copy-number alterations implicates IRS4 and IGF2 in enhancer hijacking, deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data, nFuse: Discovery of complex genomic rearrangements in cancer using high-throughput sequencing, Dissect: detection and characterization of novel structural alterations in transcribed sequences, Formation of new chromatin domains determines pathogenicity of genomic duplications, Structural Variation-Associated Expression Changes Are Paralleled by Chromatin Architecture Modifications, Chromatin features constrain structural variation across evolutionary timescales, Genomic Hallmarks and Structural Variation in Metastatic Prostate Cancer, Relative Impact of Nucleotide and Copy Number Variation on Gene Expression Phenotypes, The impact of structural variation on human gene expression, Long-read genome sequencing identifies causal structural variation in a Mendelian disease, Long-read sequencing identified a causal structural variant in an exome-negative 
case and enabled preimplantation genetic diagnosis, Linked-read Sequencing Analysis Reveals Tumor-specific Genome Variation Landscapes in Neurofibromatosis Type 2 (NF2) Patients: Otol, Complex structural variants in Mendelian disorders: identification and breakpoint resolution using short- and long-read genome sequencing, Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family, Complex rearrangements and oncogene amplifications revealed by long-read DNA and RNA sequencing of a breast cancer cell line, Dissecting the Causal Mechanism of X-Linked Dystonia-Parkinsonism by Integrating Genome and Transcriptome Assembly, Long-read single-molecule maps of the functional methylome, Simultaneous profiling of chromatin accessibility and methylation on human cell lines with nanopore sequencing: Supplemental Materials, Megabase Length Hypermutation Accompanies Human Structural Variation at 17p 11.2, Structural Alterations Driving Castration-Resistant Prostate Cancer Revealed by Linked-Read Genome Sequencing, TAD fusion score: discovery and ranking the contribution of deletions to genome structure, Large-Scale Copy Number Polymorphism in the Human Genome, Global variation in copy number in the human genome, Integrated detection and population-genetic analysis of SNPs and copy number variation, Mapping and sequencing of structural variation from eight human genomes, Whole-genome sequencing analysis of CNV using low-coverage and paired-end strategies is efficient and outperforms array-based CNV analysis, Challenges and standards in integrating surveys of structural variation, The new cytogenetics: blurring the boundaries with molecular biology, Copy number variations and clinical cytogenetic diagnosis of constitutional disorders, Detection of Genomic Structural Variants from Next-Generation Sequencing Data, Structural variation detection using next-generation sequencing data, Characterizing complex structural 
variation in germline and somatic genomes, An Evaluation of Copy Number Variation Detection Tools from Whole-Exome Sequencing Data, The clinical implementation of copy number detection in the age of next-generation sequencing, Exome sequencing and whole genome sequencing for the detection of copy number variation, Towards a comprehensive structural variation map of an individual human genome, Discovery of common Asian copy number variants using integrated high-resolution array CGH and massively parallel DNA sequencing, Nanopore sequencing and assembly of a human genome with ultra-long reads, dbVar and DGVa: public archives for genomic structural variation, Assembly of a pan-genome from deep sequencing of 910 humans of African descent, Extensive and deep sequencing of the Venter/HuRef genome for developing and benchmarking genome analysis tools, The Diploid Genome Sequence of an Individual Human, Telomere-to-telomere assembly of a complete human X chromosome, High-coverage, long-read sequencing of Han Chinese trio reference samples, Extensive sequencing of seven human genomes to characterize benchmark reference materials, The impact of translocations and gene fusions on cancer causation, Structural variant analysis for linked-read sequencing data with gemtools, Clinical application of single-molecule optical mapping to a multigeneration FSHD1 pedigree, Norris AL, Workman RE, Fan Y, Eshleman JR & Timp W, Nanopore sequencing detects structural variants in cancer, Same-day genomic and epigenomic diagnosis of brain tumors using real-time nanopore sequencing, Hi-C detects novel structural variants in HL-60 and HL-60/S4 cell lines, Strong Association of De Novo Copy Number Mutations with Autism, Structural Variation of Chromosomes in Autism Spectrum Disorder, Defining the Genetic, Genomic, Cellular, and Diagnostic Architectures of Psychiatric Disorders, Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA, Genome-wide 
characteristics of de novo mutations in autism, Paired-Duplication Signatures Mark Cryptic Inversions and Other Complex Structural Variation, Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome, Paternally inherited cis-regulatory structural variants are associated with autism, Genomic Patterns of De Novo Mutation in Simplex Autism, Detecting a long insertion variant in SAMD12 by SMRT sequencing: implications of long-read whole-genome sequencing for repeat expansion diseases, A 12-kb structural variation in progressive myoclonic epilepsy was newly identified by long-read whole-genome sequencing, Next-generation mapping: a novel approach for detection of pathogenic structural variants with a potential utility in clinical diagnosis, Comprehensive structural variation genome map of individuals carrying complex chromosomal rearrangements, Breakpoint mapping of a novel de novo translocation t(X;20)(q11.1;p13) by positional cloning and long read sequencing, The 22q11 low copy repeats are characterized by unprecedented size and structure variability, Mechanisms underlying structural variant formation in genomic disorders, Long-read sequence and assembly of segmental duplications, nplnv: accurate detection and genotyping of inversions using long read sub-alignment, Bakhtiari M, Shleizer-Burko S, Gymrek M, Bansal V & Bafna V, Targeted genotyping of variable number tandem repeats with adVNTR, Resolving complex tandem repeats with long reads, Interrogating the unsequenceable genomic trinucleotide repeat disorders by long-read sequencing, Tandem-genotypes: robust detection of tandem repeat expansions from long DNA reads, A survey of localized sequence rearrangements in human DNA, rMETL: sensitive mobile element insertion detection with long read realignment, TSD: A computational tool to study the complex structural variants using PacBio targeted sequencing data, Characterization of structural variants with 
single molecule and hybrid sequencing approaches, https://github.com/StanfordBioinformatics/HugeSeq, https://github.com/timothyjamesbecker/FusorSV, https://sourceforge.net/projects/pb-jelly/, https://github.com/PacificBiosciences/pbsv, https://github.com/fritzsedlazeck/Sniffles, https://github.com/TheJacksonLaboratory/Picky, https://bionanogenomics.com/support/softwaredownloads/, https://support.10xgenomics.com/genomeexome/software/pipelines/latest/what-is-longranger, https://github.com/1dayac/novel_insertions, https://sourceforge.net/p/bait/wiki/Home/, https://sourceforge.net/projects/strandseq-invertr/, https://github.com/dixonlab/hic_breakfinder, https://github.com/raphael-group/multibreak-sv, https://bitbucket.org/xianfan/hybridassemblysv, https://www.ebi.ac.uk/ena/data/view/PRJEB3l736, https://ijgvd.megabank.tohoku.ac.jp/download_lkjpn/, https://www.biorxiv.org/content/10.1101/508515v1.supplementary-material, https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs001123.v1.p1, https://gnomad.broadinstitute.org/downloads, https://eichlerlab.gs.washington.edu/publications/chml-structural-variation/, https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd137/, https://github.com/nanopore-wgs-consortium/CHMl3, https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd162/, https://www.mdpi.com/2073-4425/9/10/486/s1, https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd168/, https://github.com/nanopore-wgs-consortium/NA12878/blob/master/nanopore-human-genome/NA12878.hq.sv.vcf, ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/N, ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/ChineseTrio/analysis/, ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/analysis/NIST_SVs_Integration_v0.6, https://www.ncbi.nlm.nih.gov/dbvar/studies/nstd152/, http://www.internationalgenome.org/data-portal/data-collection/structural-variation, https://github.com/mcfrith/local-rearrangements, https://github.com/mehrdadbakhtiari/adVNTR, Affordability; accessible, as infrastructure is 
widely available; high base-calling accuracy; detection of well-characterized SVs; low cost makes read-depth methods more effective; deletion detection; high throughput. Amplification bias; insert sizes are inherently limiting; ambiguous mapping to repetitive regions; low phasing power; lack of standardized merging and ensemble choice; poor insertion detection. PE, SR, and RD signals with integration of two specialized insertion callers. At the completion of phase 3, the 1KGP had sequenced 2,504 individuals across 26 populations and investigated all major SV classes, in contrast to the deletion focus of the phase 1 marker paper5. Currently, no single method or technology has been shown to be comprehensive enough to detect all SVs within a genome. Given the variability in SV type and size, along with the unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. These datasets differ in sample size, ancestry, depth, platform, merging methodology, sensitivity, and specificity, all of which should be considered before deciding which set to use, as biases introduced by these choices are inherently passed on to the applications that employ them. Indeed, the technologies and methods discussed have produced a rapid influx of detectable variants, but there is little ability to assign impact. A sequencing-based approach proved to be more comprehensive when in 2011 Mills et al. In another example, Dixon et al. Algorithms detect SVs from SMRT data by leveraging intra- and inter-read signatures (FIG. An increase in similar population-specific SV detection projects will be necessary to narrow the diversity gap in genetics research and help identify rare SVs specific to ancestral backgrounds59. A fifth factor, coordinate overlap, is considered by all EA methods to varying degrees.
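Coordinate overlap between two calls is commonly quantified as reciprocal overlap. The following is a minimal sketch under assumptions not stated in the text: intervals are half-open `[start, end)`, and the 50% default threshold in `same_event` is an illustrative value rather than one prescribed by any particular EA method.

```python
def reciprocal_overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> float:
    """Return the smaller of the two fractions of each interval covered by the other.

    Intervals are assumed to be half-open [start, end) on the same chromosome.
    """
    if a_end <= a_start or b_end <= b_start:
        raise ValueError("intervals must have positive length")
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0  # disjoint or merely adjacent intervals
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def same_event(a: tuple, b: tuple, min_ro: float = 0.5) -> bool:
    """Treat two (start, end) calls as the same event if they reciprocally
    overlap by at least min_ro (hypothetical threshold)."""
    return reciprocal_overlap(a[0], a[1], b[0], b[1]) >= min_ro
```

Taking the minimum of the two fractions prevents a small call nested inside a much larger one from being counted as a match, which a one-sided overlap fraction would allow.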
The most recognized forms of structural variation include deletions, duplications, inversions, insertions, and translocations. A structural variant that consists of multiple combinations of structural variant types nested or clustered with one another. Standard sequencing libraries fragmented to ~600–800 bp in length. One of the first sequence mapping approaches, performed with a single fosmid library, reported a similar number of SVs, ~300 variants11. Multiplatform discovery is often employed to investigate SVs in cancer. Structural variants (SVs) are genomic rearrangements that involve at least 50 nucleotides and are known to have a serious impact on human health. Intra-read signatures enable direct detection of SVs and are derived from reads spanning entire SV events, resulting in missing sequence (deletion) or a soft-clip (insertion) within properly aligned flanking sequences. also used LRs to study the genomic architecture of the AR oncogene in castration-resistant prostate cancer and found that SVs were likely to inactivate tumor-suppressor genes in complex patterns where each haplotype could harbor a different type of inactivating SV153.
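The intra-read signatures described above can be read directly from an alignment's CIGAR string. The sketch below is a simplified illustration, not the method of any named caller: it reports only `D` and `I` operations of at least `min_size` bases (a hypothetical parameter, set to the 50 bp SV threshold used in the text) and ignores insertions that appear as soft clips.

```python
import re
from typing import List, Tuple

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def intra_read_signatures(ref_start: int, cigar: str,
                          min_size: int = 50) -> List[Tuple[str, int, int, int]]:
    """Scan one alignment and return (type, ref_start, ref_end, length) tuples
    for deletions and insertions of at least min_size bases."""
    calls = []
    ref_pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "D" and n >= min_size:
            # Reference sequence missing from the read: candidate deletion.
            calls.append(("DEL", ref_pos, ref_pos + n, n))
        elif op == "I" and n >= min_size:
            # Read sequence absent from the reference: candidate insertion.
            calls.append(("INS", ref_pos, ref_pos, n))
        if op in "MDN=X":
            ref_pos += n  # only these operations consume reference bases
    return calls
```

For instance, an alignment starting at position 1000 with CIGAR `500M120D300M75I200M10D50M` yields one deletion and one insertion. The 10 bp deletion falls below the size threshold and is skipped.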
A high-quality human reference panel reveals the complexity and distribution of genomic structural variants, An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder, SpeedSeq: ultra-fast personal genome analysis and interpretation, svtools: population-scale analysis of structural variation, MetaSV: an accurate and integrative structural-variant caller for next generation sequencing, Parliament2: Fast Structural Variant Calling Using Optimized Combinations of Callers, iSVP: an integrated structural variant calling pipeline from high-throughput sequencing data, Transient structural variations have strong effects on quantitative traits and reproductive isolation in fission yeast, FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods, Pounraja VK, Jayakar G, Jensen M, Kelkar N & Girirajan S, A machine-learning approach for accurate detection of copy number variants from exome sequencing, An Incomplete Understanding of Human Genetic Variation, Detection of large-scale variation in the human genome, Characteristics of de novo structural changes in the human genome. The platforms discussed can be employed combinatorially to complement strengths and mitigate weaknesses102. Uses an RD caller to annotate copy number at each variant locus, User choice of six individual callers. Structural variants in more than 17,000 human genomes are mapped and characterized using whole-genome sequencing, showing how this type of variation contributes to rare deleterious coding and . NanoSV iteratively clusters all reads that support a breakpoint junction, whereas Picky stitches together split reads with surrounding reads and calls SVs from the best alignments. Additionally, a study investigating non-recurrent SVs with arrays, short reads, and long reads found enrichment of de novo SNVs and indels near SV breakpoints, the majority of which are intragenic152.
A philosophical ideal would involve sequencers that read entire genomes, without bias, as a contiguous whole. Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult to resolve. For example, ONT analysis identified a heterozygous point mutation and an exon-disrupting deletion in an affected individual where the disease genotype involves bi-allelic point mutations144. Depending on the level of sensitivity a project aims to achieve, applications will either intersect calls or take a union: intersecting decreases both sensitivity and the FDR, whereas taking a union increases both. However, Sedlazeck et al. GROC-SVs additionally performs local reassembly to detect complex SVs 10 kb–100 kb in length. Calls are merged by overlap, PE, SR, and RD signals, along with breakpoint junction mapping, Calls are merged by overlap that prioritizes read signatures by their respective resolution and are refined with local reassembly, PE and SR signals, along with a Bayesian likelihood genotyper. Along with integrating short-read SV callers, we consider integrating data generated from multiple genomic platforms as a way to comprehensively detect the broad range of SVs. performed SV discovery in 15 individuals long-read sequenced to an average ~57X and found 86,761 SVs absent from the 1KGP and the Genomes of the Netherlands project datasets109. We thank Y. Wang, W. Zhou, A. Weber, and B. Zhou for their valuable comments and help with proofreading the manuscript. Integration of SV calls from differing technologies is analogous to EA approaches: most methods are in-house and consider coordinate overlap, breakpoint proximity, mapping orientation, read support, putative SV type, and resolution of the underlying technology.
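The trade-off between intersecting calls and taking their union can be sketched for two callsets. This is a deliberately simple illustration under stated assumptions: the `Call` record, the matching rule (same chromosome and SV type, with both breakpoints within `max_bp_dist`), and the 200 bp default are hypothetical, and the in-house methods mentioned above weigh additional evidence such as orientation and read support.

```python
from typing import List, NamedTuple


class Call(NamedTuple):
    """A putative SV call (hypothetical minimal record)."""
    chrom: str
    start: int
    end: int
    sv_type: str


def matches(a: Call, b: Call, max_bp_dist: int = 200) -> bool:
    """Two calls match if type and chromosome agree and both breakpoints are close."""
    return (a.chrom == b.chrom and a.sv_type == b.sv_type
            and abs(a.start - b.start) <= max_bp_dist
            and abs(a.end - b.end) <= max_bp_dist)


def intersect_calls(set_a: List[Call], set_b: List[Call],
                    max_bp_dist: int = 200) -> List[Call]:
    """Keep only calls from set_a that are supported by set_b:
    fewer false positives, at the cost of sensitivity."""
    return [a for a in set_a if any(matches(a, b, max_bp_dist) for b in set_b)]


def union_calls(set_a: List[Call], set_b: List[Call],
                max_bp_dist: int = 200) -> List[Call]:
    """Keep every call from either set, collapsing matched pairs:
    higher sensitivity, at the cost of a higher FDR."""
    merged = list(set_a)
    for b in set_b:
        if not any(matches(a, b, max_bp_dist) for a in set_a):
            merged.append(b)
    return merged
```

In this sketch, matched pairs are represented by the coordinates from `set_a`. A production merger would instead prefer the call from the platform with the higher breakpoint resolution.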
These strategies are similar to long-insert short-read libraries (reviewed elsewhere)63, which trade lowered sequence coverage for high physical coverage, improving power to detect large variants while decreasing power to detect small ones. Hybrid-signature algorithms such as Genome STRiP, Delly, Manta, and LUMPY, among others, mitigate the limited scope of single-approach algorithms, improving sensitivity by integrating two or more disparate signatures to call putative SVs based on combined supporting evidence30–36. The single-molecule approach detected 76% more SVs than an ensemble of three short-read callers (with two-caller concordance), most of which derive from repetitive regions. Below are selected studies that either estimate the extent of SV content or provide estimates of detectable SVs according to technology within phenotypically healthy human genomes, showing the relationship between detectable SVs and available technologies. Selected sequencing-driven reference datasets representing phenotypically normal individuals are listed below. The Genome of the Netherlands Consortium. Given this large variation, projects often use more than one reference set to maximize inclusivity and avoid overfitting. The authors declare no competing interests.
Population-scale genotyping of structural variation in the era of long Haploid human hydatidiform mole; target sequencing of BAC clones, African, Asian, European, American, and South Asian ancestry; BAC and fosmid libraries, One male and one female Swedish individual, 156 samples from the 1KGP; concordance with 10x-Genomics LRs, Genome in a Bottle, HG005, HG003, HG004 (son/father/mother), A preliminary callset containing deletions and insertions from a Han Chinese family trio, Genome in a Bottle, HG002, HG003, HG004 (son/father/mother), Contains high-confidence deletions and insertions from an Ashkenazi family trio; concordance across multiple trios, Human Genome Structural Variation Consortium, Three family trios of Han Chinese, Puerto Rican, and Yoruban Nigerian ancestry; concordance across multiple genomic platforms, (SV) Operationally defined as sequence variants > 50 bp in size.