Frequently Asked Questions (FAQs)
- What genes are in the KOMP Database?
- What genes are on the KOMP Target gene list?
- How are the KOMP Target gene list and the KOMP Master gene list generated?
- How can I search the KOMP database?
- How can I browse the KOMP database?
- How can I determine the knockout status of my gene?
- Can I nominate genes for targeting by KOMP?
- Can I raise the priority for genes that are on the KOMP Target gene list?
- My gene is in the production pipeline. How long will it take until vectors, mutant ES cells, mutant mice are available?
- Can I receive email notification when KOMP products for my gene are available?
- What products are being generated by KOMP?
- How can I order KOMP products?
- My gene is being targeted by EUCOMM. How can I obtain more information on EUCOMM products? How can I order EUCOMM products?
What genes are in the KOMP Database?
The KOMP database includes all genes from the Mouse Genome Informatics (MGI) database for which sequences and genome coordinates are available. This includes all genes predicted by the NCBI, Ensembl, and Vega (Vertebrate Genome Annotation) pipelines for mouse Genome Build 36. MGI continuously updates its Mouse Gene Catalog by comparing gene annotations from Ensembl, Vega and NCBI, and correlating them with information in MGI. Gene model differences that are revealed by this comparison are being resolved in close collaboration with Vega, Ensembl, and NCBI.
What genes are on the KOMP Target gene list?
How are the KOMP Target gene list and the KOMP Master gene list generated?
The KOMP Target gene list is generated in four steps (A to D):
- A. Generation of the KOMP Master gene list.
The design of targeting vectors requires good knowledge of the genomic structure of the target gene. Therefore, the CCDS (Consensus Coding Sequence) set of genes was selected as the starting point for the generation of the KOMP Master gene list. It is a list of mouse protein coding genes for which the NCBI and Ensembl annotation pipelines predict the same coding region. All genes in the CCDS set are annotated as having full-length coding sequences, can be translated from the genome without frameshifts, and use consensus splice sites. The genes in the CCDS set are compared with the Mouse Gene Catalog, generated by MGI by comparing gene annotations from Ensembl, Vega (Vertebrate Genome Annotation), and NCBI and correlating them with information in MGI. Those genes from the CCDS set for which there are no contradicting gene models predicted from Ensembl, Vega, NCBI and MGI are included in the KOMP Master gene list. Genes annotated by the Vertebrate Genome Annotation Group that are not part of the CCDS set but don't have conflicting gene model predictions with Ensembl or NCBI are also included.
- B. Annotation of the KOMP Master gene list.
  - Genes trapped by the International Gene Trap Consortium (IGTC)
  - Genes on EUCOMM target gene list, including pipeline status
  - Genes for which targeted mutants are reported in MGI
  - Genes for which other mutants are reported in MGI
  - Genes for which mutants are available through the International Mouse Strain Resource (IMSR)
  - Genes whose human orthologs are associated with disease entries in OMIM
  - Genes assigned to the CSD production center, including pipeline status
  - Genes assigned to the Regeneron production center, including pipeline status
- C. Generation of the KOMP Target gene list.
The following genes are deleted from the KOMP Master gene list to create the KOMP Target gene list:
  - Genes for which IGTC gene traps are available
  - Genes targeted by EUCOMM
  - Genes for which targeted mutants are reported in MGI
Steps A to C are done computationally by the KOMP Database and associated load and QC programs. The database, and thus the KOMP Master gene list and the KOMP Target gene list, is updated daily to reflect changes in the Mouse Gene Catalog and in gene annotations, as well as manual changes of the KOMP Target gene list, described in step D.
- D. Manual changes of the KOMP Target gene list.
The KOMP Target gene list generated by steps A to C is modified manually based on additional considerations by KOMP. Additions to or deletions from the target list, or prioritizing genes on the target list, might take into account additional information about genes and gene nominations by the research community. For example, genes might be prioritized for targeting because there are human orthologs with disease entries in OMIM. Or, genes for which targeted mutants are reported in MGI might be included in the target list because no mutant mice are publicly available through the IMSR.
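The subtraction that turns the Master gene list into the Target gene list, followed by the manual changes, can be pictured as simple set arithmetic. The following is only an illustrative sketch, not the KOMP Database's actual load code: the function name, its parameters, and the gene symbols are hypothetical placeholders.

```python
def build_target_list(master, igtc_trapped, eucomm_targeted, mgi_targeted,
                      manual_additions=(), manual_deletions=()):
    """Sketch of target list generation (hypothetical helper, not KOMP code).

    All arguments are iterables of gene identifiers.
    """
    # Computational step: remove genes that already have gene traps,
    # are targeted by EUCOMM, or have targeted mutants reported in MGI.
    target = set(master)
    target -= set(igtc_trapped)
    target -= set(eucomm_targeted)
    target -= set(mgi_targeted)

    # Manual step: curators may add genes back or remove further genes.
    target |= set(manual_additions)
    target -= set(manual_deletions)
    return target


# Placeholder gene symbols, for illustration only.
master = {"GeneA", "GeneB", "GeneC", "GeneD"}
target = build_target_list(
    master,
    igtc_trapped={"GeneB"},
    eucomm_targeted={"GeneC"},
    mgi_targeted={"GeneD"},
    manual_additions={"GeneD"},  # e.g. no mutant mice available through the IMSR
)
print(sorted(target))
```

In this example GeneD is removed computationally because a targeted mutant is reported, then restored by a manual addition, which mirrors the IMSR case described above.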
How can I search the KOMP database?
The search box offers many methods of searching the KOMP database.
You can specify multiple search terms separated by spaces or commas. Each of these terms will be used to match genes in the database. The search terms are case insensitive, so "PAX6" is treated the same as "Pax6" by the search.
The search box makes the following assumptions about each term:
- The term is an MGI ID if it starts with the letters MGI
- The term is an Ensembl ID if it starts with the letters ENSMUSG
- The term is a Vega ID if it starts with the letters OTTMUSG
- The term is an NCBI ID if it is a number
- The term is a genomic region if it fits the pattern "chrX:####-####" where X is a valid chromosome, and #### are valid start and stop coordinates
- Otherwise, the term is assumed to be a gene symbol or a gene name
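The rules above amount to a short chain of prefix and pattern checks. The sketch below shows one way such a classifier could be written. It is not the search box's real implementation: the function name, the returned labels, the order of the checks, and the set of chromosomes treated as valid (1 to 19, X, Y, MT) are assumptions. The coordinates are made optional only because the example searches include "chr7:".

```python
import re

# Assumed region pattern: "chr" + chromosome + ":" + optional "start-stop".
REGION_RE = re.compile(
    r"^chr(1[0-9]|[1-9]|x|y|mt?):(?:(\d+)-(\d+))?$",
    re.IGNORECASE,
)


def classify_term(term):
    """Guess what kind of identifier a search term is (hypothetical helper)."""
    t = term.strip()
    upper = t.upper()  # search terms are case insensitive
    if upper.startswith("MGI"):
        return "mgi_id"
    if upper.startswith("ENSMUSG"):
        return "ensembl_id"
    if upper.startswith("OTTMUSG"):
        return "vega_id"
    if re.fullmatch(r"[0-9]+", t):
        return "ncbi_id"
    if REGION_RE.match(t):
        return "region"
    return "symbol_or_name"


for example in ["ENSMUSG00000031633", "Chr5:115767675-115816986", "zfp"]:
    print(example, "->", classify_term(example))
```

Note that under these rules a gene symbol that happens to start with "MGI" would be read as an MGI ID.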
Searching symbols
If you enter a gene symbol or a gene name, the system will perform a "begins with" match against all current gene symbols. It will also perform a "begins with" match against current names, old symbols, old names and synonyms. All matches will be displayed in alphabetical order by gene symbol. You may also search for multi-word terms by grouping the term in quotes. See the exact usage in the examples section.
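To make the handling of multiple terms and quoted multi-word terms concrete, here is a minimal sketch of how a query string might be split. It assumes that terms are separated by spaces or commas and that double quotes group words into one term. The regular expression and the function name are illustrative and are not taken from the KOMP search code.

```python
import re

# Either a double-quoted run (kept as one term) or a run of characters
# that contains no whitespace and no commas.
TERM_RE = re.compile(r'"([^"]*)"|([^\s,]+)')


def split_terms(query):
    """Split a search query into individual terms (hypothetical helper)."""
    terms = []
    for quoted, bare in TERM_RE.findall(query):
        term = quoted if quoted else bare
        if term:
            terms.append(term)
    return terms


print(split_terms('kit, adam'))
print(split_terms('"*bone marrow*" pax6'))
```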
Wildcard searches
You may include asterisk (*) wildcards anywhere in your search term to indicate that you want partial matches. For example, to find all genes that contain the term skeletal, search for "*skeletal*". This returns all genes that contain the word "skeletal" in the gene name (or old name, symbol, or synonym).
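The combination of "begins with" matching and asterisk wildcards can be approximated with a regular expression anchored at the start of the string. The sketch below assumes that an asterisk stands for any run of characters and that a term without wildcards only has to match the beginning of a symbol or name. The function names and the sample strings are illustrative, and the real search may behave differently at the edges.

```python
import re


def term_to_regex(term):
    """Turn a search term into a case-insensitive regex (hypothetical helper)."""
    escaped = re.escape(term.strip())
    # re.escape turns "*" into "\*"; swap that for "any run of characters".
    pattern = escaped.replace(r"\*", ".*")
    return re.compile(pattern, re.IGNORECASE)


def find_matches(term, candidates):
    """Return candidates matched from their first character, sorted alphabetically."""
    rx = term_to_regex(term)
    # re.match anchors at the start only, which gives "begins with" behavior.
    return sorted((c for c in candidates if rx.match(c)), key=str.lower)


symbols = ["Pax6", "Zfp1", "Epax", "Azfp"]  # sample strings for illustration
print(find_matches("zfp", symbols))    # plain term: begins with "zfp"
print(find_matches("*pax*", symbols))  # wildcard term: contains "pax"
```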
Searching genomic coordinates
Searching by genomic coordinates returns all genes that lie entirely within the specified region. A gene that only partially overlaps the region at either end is excluded. Results of a coordinate search are listed in coordinate order.
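The containment rule can be stated as a pair of comparisons on the gene's start and end positions. The following sketch uses made-up genes and assumes that the region boundaries are inclusive and that "coordinate order" means ordering by start position. Neither detail is stated explicitly here, and the `Gene` type and function name are hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    symbol: str
    chrom: str
    start: int
    end: int


def genes_within(genes, chrom, start, end):
    """Return genes lying entirely inside chrom:start-end, ordered by start."""
    hits = [
        g for g in genes
        if g.chrom == chrom and g.start >= start and g.end <= end
    ]
    return sorted(hits, key=lambda g: g.start)


# Placeholder genes around the example region Chr5:115767675-115816986.
genes = [
    Gene("GeneC", "5", 115800000, 115816986),  # inside, touches the end boundary
    Gene("GeneA", "5", 115770000, 115780000),  # fully inside
    Gene("GeneB", "5", 115760000, 115770000),  # overlaps the start: excluded
    Gene("GeneD", "7", 115770000, 115780000),  # other chromosome: excluded
]
result = [g.symbol for g in genes_within(genes, "5", 115767675, 115816986)]
print(result)
```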
Example searches:
| You want to find... | Search for... |
|---|---|
| all genes that contain pax | *pax* |
| all genes that start with zfp | zfp |
| all bone marrow genes | "*bone marrow*" |
| the Ensembl gene ENSMUSG00000031633 | ENSMUSG00000031633 |
| all genes that start with kit or with adam | kit, adam |
| all genes on chromosome 7 | chr7: |
| all genes on chromosome 5 between 115767675 and 115816986 | Chr5:115767675-115816986 |
How can I browse the KOMP database?
You can browse the KOMP database by selecting the "Browse" tab in the upper right of the screen. You are presented with two options:
- Browse by gene symbol
- Browse by chromosome
Browsing by symbol separates the genes in the database by the first letter of the gene's official symbol. The genes are presented 50 genes at a time. Clicking on "Browse by gene symbol" brings you to the first page of genes that start with the letter "A" (case insensitive).
You are presented with navigation to go to the first page, the last page, or the previous/next page. You can jump to each letter of the alphabet (along with 0-9) by selecting the letter from the list.
How can I determine the knockout status of my gene?
Enter the gene symbol into the search box on the upper right and search. The resulting query summary lists one record (row) for each gene that matches your query. The KOMP status column indicates if your gene is currently on the KOMP Target gene list and if targeting is in progress. If your gene has already been assigned to one of the KOMP production centers (CSD, Regeneron), its pipeline status is indicated in the CSD or Regeneron column.
The Regeneron pipeline comprises the following statuses:
- Regeneron Selected
- Parental BAC Obtained
- Design Finished/Oligos Ordered
- Targeting Vector QC Completed
- Vector Electroporated into ES Cells
- ES cell colonies picked
- ES cell colonies screened / QC no positives
- ES cell colonies screened / QC one positive
- ES cell colonies screened / QC positives
- ES Cell Clone Microinjected
- Germline Transmission Achieved
A detailed description of the CSD pipeline statuses is available here.
Genes that are not on the KOMP Target gene list might be targeted by EUCOMM, or mutant ES cells or mice for your gene might already be publicly available from the International Gene Trap Consortium (IGTC) or through the International Mouse Strain Resource (IMSR). Please check the Other Status column for this information.
Can I nominate genes for targeting by KOMP?
Can I raise the priority for genes that are on the KOMP Target gene list?
The answer to both questions is yes. KOMP welcomes gene nomination and prioritization requests from the research community. Please complete and submit the online gene nomination form. All submissions will be forwarded to NIH and considered by the NIH KOMP administrators and a panel of scientific advisors.
My gene is in the production pipeline. How long will it take until vectors, mutant ES cells, mutant mice are available?
The KOMP project is at an early stage, and the knockout production pipelines are still being optimized. Further, due to the high-throughput nature of KOMP, no special efforts for individual genes can be pursued. Therefore, it is currently difficult to provide production timelines. We encourage you to check the knockout status of your gene regularly. Generally, once an ES cell line is transferred from the Production Center to the KOMP Repository, it takes about 6 weeks before it becomes available for distribution. During this time, a series of quality control tests is applied to the ES cell clones to ensure their identity, viability, and pathogen-free status.
Can I receive email notification when KOMP products for my gene are available?
Yes. Go to the KOMP Repository web site and search for your gene of interest using the Product Search box. For genes for which KOMP targeting is in progress, there will be an Express Interest link. Follow that link to register interest. You will receive email notification as soon as KOMP products for your gene are available.
What products are being generated by KOMP?
The main emphasis of KOMP (as currently funded) is to generate targeting vectors and mutant ES cells. The production centers will generate mutant mice, embryos, or sperm for only a limited number of genes. The KOMP Repository offers additional services, on a cost recovery basis, to obtain mice, embryos, or sperm from mutant ES cells.
How can I order KOMP products?
Go to the KOMP Repository web site and search for your gene of interest using the Product Search box. If the search results show "Available for Distribution", follow that link to the ordering page. More information on ordering KOMP products and services is available at the KOMP Repository home page.
My gene is being targeted by EUCOMM. How can I obtain more information on EUCOMM products? How can I order EUCOMM products?
You can contact our EUCOMM colleagues at eucomm@sanger.ac.uk. For more information, see http://www.eucomm.org/info/contact.shtml.

