FINDMAP

Metadata Updated: November 10, 2020

The findmap.f90 program aligns sequence reads to a reference map, calls previously known variants, and identifies new variants. Program and download information can be found at the Animal Improvement Program (AIP) web site (http://aipl.arsusda.gov/software/findhap).

Sequencing research requires efficient computation, yet few programs use already known information about DNA variants when aligning sequence data to the reference map. The new program findmap.f90 reads the previous variant list and then, in a single pass, aligns the sequence, calls variant alleles, and sums the allele counts for each DNA source. Advantages are faster processing, more precise alignment, more useful data summaries, more compact output, and fewer steps.

Programs findmap and BWA were compared using simulated paired-end reads of length 150 from fragments of length 1,000 at random locations within the UMD3.1 bovine reference assembly. Each base had a 1% probability of error and a 1% probability of being missing. The 39 million variants from run 5 of the 1,000 bull genomes project were included, with every other variant set to reference or alternate. With 1 processor, BWA required 629 minutes per 1X for alignment, whereas findmap required 12 minutes per 1X for alignment and variant calling. The percentage of correctly mapped reads was 90.5% from BWA and 92.9% from findmap. Variant calls were output by findmap only for the 88.2% of pairs where both ends were located within the fragment length and were of opposite orientation. Percentages of variants called correctly were 99.8% for SNPs and 99.9% for deletions, while insertions had 99.9% of alternate calls correct but only 98.6% of reference calls. Memory required by BWA was 4.6 Gbytes per processor, whereas findmap required 46 Gbytes that could be shared by multiple processors.

Simultaneous alignment and variant calling is an efficient and accurate strategy.
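The timing and memory figures reported above can be put side by side with a few lines of arithmetic. The Python sketch below is only an illustration and is not part of findmap or BWA. All function and constant names are hypothetical. It assumes that BWA memory grows linearly with the number of processors and that the findmap memory is fully shared, which is one reading of the description.

```python
from typing import Tuple

# Figures taken from the description above (1 processor, per 1X of coverage).
BWA_MINUTES_PER_1X = 629.0      # alignment only
FINDMAP_MINUTES_PER_1X = 12.0   # alignment plus variant calling
BWA_GB_PER_PROCESSOR = 4.6      # memory needed by each BWA processor
FINDMAP_GB_SHARED = 46.0        # memory that findmap processors can share


def speedup() -> float:
    """Ratio of BWA time to findmap time per 1X on one processor."""
    return BWA_MINUTES_PER_1X / FINDMAP_MINUTES_PER_1X


def total_memory_gb(processors: int) -> Tuple[float, float]:
    """Return (BWA total, findmap total) memory in Gbytes.

    Assumption: BWA memory scales linearly with processors, while the
    findmap memory is a single block shared by all processors.
    """
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return BWA_GB_PER_PROCESSOR * processors, FINDMAP_GB_SHARED


def break_even_processors() -> float:
    """Processor count at which both programs need the same total memory."""
    return FINDMAP_GB_SHARED / BWA_GB_PER_PROCESSOR


if __name__ == "__main__":
    print(f"Speedup per 1X: {speedup():.1f}x")
    for n in (1, 10, 20):
        bwa_gb, findmap_gb = total_memory_gb(n)
        print(f"{n:2d} processors: BWA {bwa_gb:.1f} GB, findmap {findmap_gb:.1f} GB")
    print(f"Break-even at about {break_even_processors():.0f} processors")
```

Under these assumptions, findmap is roughly 52 times faster per 1X on one processor, and its larger memory footprint is matched by BWA's total at about 10 processors.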

Access & Use Information

Public: This dataset is intended for public access and use.
License: Creative Commons CCZero

Dates

Metadata Created Date November 10, 2020
Metadata Updated Date November 10, 2020

Metadata Source

Harvested from USDA JSON

Additional Metadata

Resource Type Dataset
Publisher Agricultural Research Service
Unique Identifier Unknown
Identifier 5f5b982f-3d78-40ba-88da-f7760aec0419
Data Last Modified 2019-08-05
Public Access Level public
Bureau Code 005:18
Metadata Context https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld
Schema Version https://project-open-data.cio.gov/v1.1/schema
Catalog Describedby https://project-open-data.cio.gov/v1.1/schema/catalog.json
License https://creativecommons.org/publicdomain/zero/1.0/
Program Code 005:040
Source Datajson Identifier True
Source Hash 8f7dafbe14239851e3d00aeaf00695fd4cc03628
Source Schema Version 1.1
