Searching Pathway/Genome Databases

Selecting the Database to Search

Unless otherwise indicated, all Pathway/Genome Database searches are restricted to a single database. In most cases, a database describes a single organism -- although a small number of multi-organism Pathway/Genome Databases exist (examples include MetaCyc and PlantCyc). The database against which searches will be conducted is indicated below the Quick Search box in the page banner, and at the bottom of the Search and Tools pull-down menus.

To search a different database, click on the "change" link (found below the Quick Search box, and at the bottom of the Search and Tools menus). In the dialog that pops up, you can either search for the organism of interest in the scrollable list, or you can start typing in its name.

When a large number of databases is available, the alphabetical index to the left of the database list provides a convenient shortcut for scrolling to a desired part of the alphabet. If you start typing an organism name, the full list of databases will be replaced by a list of databases matching the string you typed -- you can use the mouse or the up/down arrows on your keyboard to select the desired database. Lists of your recently used databases and the site's most popular databases provide shortcuts for selecting those databases.

If the site supports user accounts, and you are logged in, you may select one database as your preferred database. This database will be your default selection when starting a new web session.

Once you have selected the desired database, click OK to exit the dialog. The page will reload, and the text under the Quick Search box should now indicate the newly selected database. Note that if you are looking at a page that contains data from a particular organism, selecting a new database will not affect the contents of the current page -- the new selection will apply only to your future searches.

Quick Search

The Quick Search box in the upper right-hand corner of every page is useful if you know the name (or part of the name) or the database identifier of the object you are searching for. You may use this box to search for genes, proteins, compounds, RNAs, reactions, pathways, operons, and GO terms. If the query string matches a single object, the page for that object will be displayed immediately. If there are multiple matches, the full list of matches will be shown, organized by type of object (for example, gene or protein).
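The single-match versus multiple-match behavior described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the site's actual matching logic: the records, the identifiers, and the `quick_search` function are hypothetical, and case-insensitive substring matching is assumed.

```python
from collections import defaultdict

# Hypothetical toy records: (object type, identifier, name).
RECORDS = [
    ("gene", "G-1", "trpA"),
    ("gene", "G-2", "trpB"),
    ("protein", "P-1", "tryptophan synthase"),
    ("pathway", "PWY-1", "tryptophan biosynthesis"),
]


def quick_search(query, records=RECORDS):
    """Return ("page", record) for a single match, else ("list", matches by type)."""
    q = query.lower()
    # Assumed rule: match a name fragment or a complete identifier.
    matches = [r for r in records if q in r[2].lower() or q == r[1].lower()]
    if len(matches) == 1:
        return ("page", matches[0])
    grouped = defaultdict(list)
    for obj_type, _ident, name in matches:
        grouped[obj_type].append(name)
    return ("list", dict(grouped))
```

In this sketch, a query that matches exactly one record returns that record (standing in for its data page), while any other query returns the matching names grouped by object type.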

Some examples of what can be entered into the Quick Search box include:

A few additional rules govern searches:

Search Menu: Object Searches

The Search menu contains links to specialized search pages for Compounds, Genes/Proteins/RNAs, Reactions and Pathways. Each such page contains options for searching using a number of different criteria, either individually or in combination. When the page is initially loaded, only the name searches are active, but by clicking on the different search bars, you can enable or disable additional search criteria. If multiple search criteria are specified for a given search, then unless otherwise specified the results must satisfy all of them (that is, an AND connector is used to combine the different criteria).
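The AND rule described above can be pictured with a short Python sketch. It is only an illustration, not the Pathway Tools implementation: the gene records, the field names, and the `matches_all` helper are all hypothetical.

```python
def matches_all(obj, criteria):
    """Return True only if obj satisfies every enabled criterion (AND semantics)."""
    return all(test(obj) for test in criteria)


# Hypothetical toy records standing in for database objects.
genes = [
    {"name": "abcA", "length_bp": 900},
    {"name": "abcB", "length_bp": 2400},
    {"name": "xyzA", "length_bp": 1200},
]

# Each enabled search bar contributes one predicate.
criteria = [
    lambda g: "abc" in g["name"],      # name search (substring)
    lambda g: g["length_bp"] > 1000,   # a second, hypothetical criterion
]

results = [g["name"] for g in genes if matches_all(g, criteria)]
```

Only `abcB` satisfies both predicates here. Disabling a search bar corresponds to removing its predicate from the list, which can only widen the result set.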

The result of every object search is a table listing the names of all objects that satisfy the search, with hyperlinks to their corresponding data pages, along with any additional columns relevant to the particular search. The table is initially sorted alphabetically by name, but small triangles in the column headers let you sort by any column, in either ascending or descending order.

The sections below describe the different search criteria that are available for each object type.

Search Menu → Compounds

Search Menu → Genes/Proteins/RNAs

Search Menu → Reactions

  • Search for reaction by EC number or name
    Enter a reaction EC number or name (typically an enzyme name). EC numbers can be either full or partial. The software will attempt to do auto-completion on the name or EC number. If you select one of the auto-complete options, then when you submit the form you will be taken directly to the data page for the selected reaction or reaction class, regardless of any other search criteria you may have specified (i.e., other search criteria will be ignored). If you do not select one of the auto-complete options, then the string you typed will be the target of a substring search, which may be combined with other search criteria.

  • Search/Filter by substrates or products
    Enter a compound name to retrieve all reactions in which that compound participates either as a substrate or product. If you enter more than one compound, then the reaction must involve all specified compounds in order to be included in the results. We recommend taking advantage of the auto-complete facility to select the correct compound, as only an exact match to a compound name can be accepted here.

  • Search/Filter by ontology
    This option allows you to browse the Pathway Tools reaction ontology. Each reaction class includes in parentheses after its name the number of reactions that are members of that class. The ontology may be used in one of two ways. By selectively clicking on + icons, you can browse to find a reaction of interest, and click directly on its name to visit the data page for that reaction. Alternatively, you can check the checkbox next to one or more class names to limit your search (which may also include other search criteria) so as to only include reactions that belong to one of the checked classes. Note that there are two parallel reaction classification systems, one in which reactions are classified by conversion type (this includes the entire EC hierarchy), and another in which the reactions are classified by substrate. Most reactions in the database have parents in both classification systems.
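The way the reaction criteria above combine can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual search code: the `REACTIONS` records and the `find_reactions` function are hypothetical, the name search is assumed to be a case-insensitive substring match, and compound names are compared exactly, as the text describes.

```python
# Hypothetical toy records; a real database holds far more detail.
REACTIONS = [
    {"name": "alcohol dehydrogenase",
     "compounds": {"ethanol", "NAD+", "acetaldehyde", "NADH"}},
    {"name": "lactate dehydrogenase",
     "compounds": {"lactate", "NAD+", "pyruvate", "NADH"}},
    {"name": "hexokinase",
     "compounds": {"D-glucose", "ATP", "D-glucose 6-phosphate", "ADP"}},
]


def find_reactions(name_part=None, compounds=()):
    """Return names of reactions that satisfy every supplied criterion."""
    wanted = set(compounds)
    return [
        r["name"]
        for r in REACTIONS
        # Substring search on the name (skipped when no name is given).
        if (name_part is None or name_part.lower() in r["name"].lower())
        # The reaction must involve ALL listed compounds (exact names).
        and wanted <= r["compounds"]
    ]
```

For example, `find_reactions(compounds=["NAD+", "pyruvate"])` keeps only the lactate dehydrogenase record, while `find_reactions(compounds=["nad+"])` returns nothing, because this sketch requires an exact compound name.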

Search Menu → Pathways

  • Search for pathway by name
    Enter a pathway name, name fragment, or internal Pathway/Genome Database identifier. The software will attempt to auto-complete the string you have entered based on the contents of the database. If you select one of the auto-complete options, then when you submit the form you will be taken directly to the data page for the selected pathway. This is true regardless of any other search criteria you may have specified (i.e., other search criteria will be ignored). If you do not select one of the auto-complete options, then the string you typed will be the target of a substring search, which may be combined with other search criteria.

  • Search/Filter by ontology
    This option allows you to browse the Pathway Tools pathway ontology. Each pathway class includes in parentheses after its name the number of pathways that are members of that class. The ontology may be used in one of two ways. By selectively clicking on + icons, you can browse to find a pathway of interest, and click directly on its name to visit the data page for that pathway. Alternatively, you can check the checkbox next to one or more class names to limit your search (which may also include other search criteria) so as to only include pathways that belong to one of the checked classes.

  • Search/Filter by number of reactions
    Enter a minimum and/or maximum number of desired reactions in the pathway. If either the minimum or maximum field is left blank, then the number of reactions is unconstrained in that direction.

  • Search/Filter by substrates present
    Enter one or more compound names to retrieve all pathways in which those compounds participate as a reactant, a product, or an intermediate. If you enter more than one compound, then the pathway must involve all specified compounds in order to be included in the results. We recommend taking advantage of the auto-complete facility to select the correct compound, as only an exact match to a compound name can be accepted here.

  • Search/Filter by evidence code
    The Pathway Tools evidence ontology appears here in browseable form. Each evidence code includes in parentheses after its name the number of pathways annotated with that code. Selecting one or more codes to filter on allows you to restrict your search, for example, to all pathways whose presence has been established experimentally. The Pathway Tools evidence codes and ontology are described here.

  • Search/Filter by organism
    This search option is available only if a multi-organism database (such as MetaCyc) is the selected database. It allows you to browse for pathways that are curated as occurring in a particular organism based on experimental information. The fact that a pathway is not stated to be present in a given organism does not mean that the organism does not have the pathway -- pathways are curated for only a small subset of the organisms in which they appear.

  • Search/Filter by expected taxonomic range
    This search option will be available only if a multi-organism database (such as MetaCyc) is the selected database. Each pathway in MetaCyc has been annotated with its expected taxonomic range. This search option allows you to restrict your search to include only those pathways you could reasonably expect to see for a given taxonomic grouping, for example, to restrict your search to pathways seen in plants.

  • Search/Filter by publication
    This search option is useful for retrieving a list of all pathways that cite (either directly or through one of the pathway's enzymes, genes, subpathways or substrates) a given publication or author. Enter either the PubMed ID, the author surname, or part or all of an article title.
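Two of the pathway filters above, the reaction-count range and the substrate filter, can be sketched in Python. This is a hedged illustration rather than the real implementation: the `PATHWAYS` records and the `find_pathways` function are hypothetical, the compound names are placeholders, and treating the minimum and maximum as inclusive bounds is an assumption.

```python
# Hypothetical toy pathways with placeholder compound names.
PATHWAYS = [
    {"name": "pathway-1", "n_reactions": 3, "compounds": {"A", "B", "C", "D"}},
    {"name": "pathway-2", "n_reactions": 7, "compounds": {"A", "E"}},
    {"name": "pathway-3", "n_reactions": 12, "compounds": {"B", "C", "F"}},
]


def find_pathways(min_reactions=None, max_reactions=None, compounds=()):
    """Return names of pathways that pass every supplied filter."""
    wanted = set(compounds)
    hits = []
    for p in PATHWAYS:
        n = p["n_reactions"]
        # A blank (None) bound leaves that direction unconstrained.
        if min_reactions is not None and n < min_reactions:
            continue
        if max_reactions is not None and n > max_reactions:
            continue
        # The pathway must involve ALL listed compounds (exact names).
        if not wanted <= p["compounds"]:
            continue
        hits.append(p["name"])
    return hits
```

With these toy records, `find_pathways(min_reactions=5)` returns the two larger pathways, and `find_pathways(compounds=["B", "C"])` returns only the pathways that involve both compounds.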

Search Menu → Advanced Search

The Advanced Search tool facilitates generation of queries that are more complex than those supported by the object search tools described above. Using the Advanced Search tool, you can write queries that combine data from multiple organisms or multiple types of objects, and you can search fields that are not supported by the individual object search pages. Detailed instructions for using the Advanced Search tool to construct complex queries are available here.

Ontology Searches

An ontology is a carefully constructed vocabulary of terms, often called a controlled vocabulary. The terms are organized into a classification hierarchy (also called a taxonomy). Ontologies can be used to browse and search for objects by drilling down from more general categories to more specific ones. Each Pathway/Genome Database contains several ontologies. Those that can be searched are available from the Ontologies sub-menu in the Search menu. These ontologies can also be accessed from the object search page for their particular object type. The browseable ontologies are:

Search Menu → Google This Site

The Search Menu → Google This Site command uses Google to perform a full text search over this entire Web site. Searches will not be restricted to the selected database, and can locate text strings found in page comments, help pages, and other page content not queryable by other means. Submitting this form will direct the user outside this Web site to a page generated by Google. A Google full text search is also offered as an option when a Quick Search fails to return any result (or does not return the desired result).

Search Menu → BLAST

This facility (not available for MetaCyc) allows you to perform sequence-similarity searches using the BLAST program to compare your protein or nucleic acid sequence against the complete genome of the selected organism database.

Search Menu → Search Full-text Articles

Textpresso is a package for indexing and searching a corpus of biological literature. Textpresso searches over a large Escherichia coli literature corpus are available only at the BioCyc Web site, and only when EcoCyc is the selected database.