ALATIS
Atom Label Assignment Tool using InChI String

News:
  • Aug 2018: We deployed the ALATIS application programming interface (API) here.
  • May 2018: More than 91M PubChem entries have been processed and are available via the search engine.
  • April 2018: ALATIS naming system adopted by NMReData: https://doi.org/10.1002/mrc.4737.
  • December 2017: ALATIS naming system adopted by BMRB.

This page contains lists of flagged registry IDs from our target databases. These lists are provided as complementary information to our publications and can be used for remediation of the target databases.

Release date: June 2018

We processed all of the 3D structure files archived in PubChem and compared the formulas and InChI strings generated by ALATIS with those deposited in PubChem. The lists of PubChem CIDs that have been flagged due to discrepancies are:
  1. PubChem entries with charged formulas can be downloaded from here (csv file, 37.7 MB)
  2. Inconsistency between the archived 3D structures and InChI strings:

    1. Inconsistency in atom connectivity
      • Flagged entries due to inconsistency in connectivity of heavy atoms (/c layer) can be downloaded from here
      • Flagged entries due to inconsistency in assigned hydrogen atoms (/h layer) can be downloaded from here
    2. Inconsistency in charge distribution
      • Flagged entries due to inconsistency in /p layer of InChI strings can be downloaded from here
      • Flagged entries due to inconsistency in /q layer of InChI strings can be downloaded from here
    3. Inconsistency in stereochemistry
      • Flagged entries due to inconsistency in double bond sp2 stereochemistry (/b layer) can be downloaded from here
      • Flagged entries due to inconsistency in stereochemistry of chiral centers (/t layer) can be downloaded from here
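
The layer-by-layer flags listed above can be illustrated with a minimal sketch. It is not the ALATIS implementation: the function names `split_inchi_layers` and `differing_layers` are hypothetical, and the sketch assumes that only the first occurrence of each layer prefix (the main layer) is of interest, ignoring repeated prefixes in isotopic or fixed-H sections.

```python
# Hypothetical helpers for illustration only; not part of ALATIS.

# InChI layer prefixes named on this page:
# /c heavy-atom connectivity, /h hydrogens, /p and /q charge,
# /b double-bond stereochemistry, /t chiral-center stereochemistry.
COMPARED_LAYERS = ("c", "h", "p", "q", "b", "t")


def split_inchi_layers(inchi: str) -> dict:
    """Split an InChI string into a {layer_name: content} mapping."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    segments = inchi[len("InChI="):].split("/")
    layers = {"version": segments[0]}
    for segment in segments[1:]:
        if not segment:
            continue
        if segment[0].islower():
            # Prefixed layer such as "c1-2(4)3(5)6" or "t2-".
            # Assumption: keep only the first occurrence of a prefix.
            layers.setdefault(segment[0], segment[1:])
        else:
            # The formula layer carries no prefix letter.
            layers.setdefault("formula", segment)
    return layers


def differing_layers(inchi_a: str, inchi_b: str) -> list:
    """Return the prefixes of the compared layers that differ."""
    a = split_inchi_layers(inchi_a)
    b = split_inchi_layers(inchi_b)
    return [name for name in COMPARED_LAYERS if a.get(name) != b.get(name)]


# Example: alanine with and without a stereochemistry layer.
with_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
without_stereo = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)"
print(differing_layers(with_stereo, without_stereo))  # prints ['t']
```

In this sketch, a pair of strings whose result contains "t" would belong to the /t list, one whose result contains "c" to the /c list, and so on.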

Release date: Jan 2017

The results of processing our target databases (BMRB, PubChem, HMDB, PDB RCSB Ligand-Exp) are shown on this page.
  1. Correct and complete molecule and atom identifiers of entries of the target databases:
    You can query a registry ID of the processed databases or a metabolite name to see ALATIS results. If you need the complete set of the processed data of the target databases, please contact Hesam Dashti (dashti@wisc.edu).

  2. Flagged improper usage of InChI strings in databases:
    1. Incorrect/incomplete InChI strings deposited in databases:
      1. This section lists the inconsistencies between the standard InChI strings that we generated and those deposited in the databases:
        BMRB list can be found here.
        HMDB list can be found here.
        PubChem list can be found here.
        PDB list can be found here.

      2. This section lists the inconsistencies between the standard InChI strings that we generated and those deposited in the databases, after discarding the standard flag ("/S") of the InChI strings:
        BMRB list can be found here.
        HMDB list can be found here.
        PubChem list can be found here.
        PDB list can be found here.

    2. Flagged deposited cross-links from BMRB and HMDB entries to PubChem:
      For every cross-link from a BMRB or HMDB entry to a PubChem entry, we compared the standard InChI strings of the two entries and flagged the cross-links whose molecule identifiers do not match:
      Flagged cross-links in BMRB can be found here.
      Flagged cross-links in HMDB can be found here.

  3. Created cross-links from PDB entries to BMRB, HMDB, and PubChem entries:
    We used the standard molecule identifiers of PDB entries and compared them with the corresponding identifiers of BMRB, HMDB, and PubChem entries. Here are the resulting cross-links:
    Created cross-links to BMRB can be found here.
    Created cross-links to HMDB can be found here.
    Created cross-links to PubChem can be found here.

  4. Incompatible atom labels between BMRB and HMDB:
    A comparison of the deposited atom labels of entries of the same compounds can be found here.
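
The two comparisons described in this list (matching InChI strings after discarding the standard flag, and flagging cross-links whose identifiers disagree) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the ALATIS code: the function names, the entry IDs, and the choice to treat a missing InChI string as a mismatch are all hypothetical.

```python
# Hypothetical helpers for illustration only; not part of ALATIS.

def strip_standard_flag(inchi: str) -> str:
    """Drop the trailing 'S' of the version segment ('InChI=1S' -> 'InChI=1')."""
    prefix, sep, rest = inchi.partition("/")
    if prefix.endswith("S"):
        prefix = prefix[:-1]
    return prefix + sep + rest


def same_ignoring_standard_flag(inchi_a: str, inchi_b: str) -> bool:
    """Compare two InChI strings after discarding the standard flag."""
    return strip_standard_flag(inchi_a) == strip_standard_flag(inchi_b)


def flag_cross_links(links, source_inchi, target_inchi):
    """Return the (source_id, target_id) pairs whose InChI strings differ.

    links        : iterable of (source_id, target_id) pairs
    source_inchi : dict mapping source_id -> InChI string
    target_inchi : dict mapping target_id -> InChI string
    Assumption: a cross-link with a missing InChI on either side is flagged.
    """
    flagged = []
    for source_id, target_id in links:
        a = source_inchi.get(source_id)
        b = target_inchi.get(target_id)
        if a is None or b is None or a != b:
            flagged.append((source_id, target_id))
    return flagged


# Example with made-up entry IDs and the InChI strings of water and methane.
water = "InChI=1S/H2O/h1H2"
methane = "InChI=1S/CH4/h1H4"
links = [("entry_A", "cid_1"), ("entry_B", "cid_2")]
source = {"entry_A": water, "entry_B": methane}
target = {"cid_1": water, "cid_2": water}

print(same_ignoring_standard_flag(water, "InChI=1/H2O/h1H2"))  # prints True
print(flag_cross_links(links, source, target))  # prints [('entry_B', 'cid_2')]
```

Creating cross-links, as done for the PDB entries, is the complementary operation: instead of reporting pairs that disagree, one would index the target database by InChI string and report the entries whose strings are identical.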

Citation:
  • Hesam Dashti, William M. Westler, John L. Markley, Hamid R. Eghbalnia, "Unique identifiers for small molecules enable rigorous labeling of their atoms", Scientific Data 4, Article number: 170073 (2017), doi:10.1038/sdata.2017.73, https://www.nature.com/articles/sdata201773
  • Hesam Dashti, Jonathan R. Wedell, William M. Westler, John L. Markley, Hamid R. Eghbalnia, "Automated evaluation of consistency within the PubChem Compound database", Scientific Data 6, Article number: 190023 (2019), doi:10.1038/sdata.2019.23, https://www.nature.com/articles/sdata201923
Disclaimer:
  • ALATIS is available to the public as a web service via our web server, and also through the NMRBox virtual machine. The custom source code was developed using the academic license of MATLAB® in the Linux environment (MATLAB® 2016a for CentOS 6.5). This work is copyrighted under the terms of the GPL. The web service and the source code are provided on an "as is" basis without warranty of any kind, either expressed or implied. Use of the web server, and modification and application of the source code, are free for academic use provided that the ALATIS publication is cited.
  • The input/output file formats of ALATIS are Mol V2000 and its corresponding SDF. Other acceptable input/output file formats of this website are provided by the Open Babel (The Open Source Chemistry Toolbox) software package; please comply with the Open Babel license agreements.
Contact:
    For any questions or concerns, please contact Hesam Dashti (dashti@wisc.edu).