Atom Label Assignment Tool using InChI String

This page contains lists of flagged registry IDs from our target databases. These lists are provided as complementary information to our publications and can be used for remediation of the target databases.

Release date: June 2018

We processed all of the 3D structure files archived in PubChem and compared the formulas and InChI strings generated by ALATIS with those deposited in PubChem. The lists of PubChem CIDs that were flagged due to discrepancies in
  1. formulas can be downloaded from here (csv file, 37.7 MB)
  2. standard InChI strings:

    1. flagged due to discrepancies in formula layers can be downloaded from here (json file, 87.1 KB)
    2. flagged due to discrepancies in heavy atom connectivity layers can be downloaded from here (json file, 5.1 KB)
    3. flagged due to discrepancies in hydrogen assignment layers can be downloaded from here (json file, 306.6 KB)
    4. flagged due to discrepancies in stereochemistry layers can be downloaded from here (json file, 12.2 GB)
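
The four InChI categories above correspond to segments of the InChI string itself. As a rough illustration only (this is not the ALATIS implementation, and the function names `split_inchi_layers` and `differing_layers` are hypothetical), the following Python sketch splits two InChI strings on `/` and reports which group of layers differs. It assumes the usual layer prefixes: no prefix for the formula, `c` for heavy atom connectivity, `h` for hydrogens, and `b`, `t`, `m`, `s` for stereochemistry. Prefixes repeated inside isotopic or fixed-H layers are not treated separately here.

```python
# Minimal sketch, assuming standard InChI layer prefixes; not the ALATIS code.
STEREO_PREFIXES = {"b", "t", "m", "s"}


def split_inchi_layers(inchi: str) -> dict:
    """Group the '/'-separated segments of an InChI string by layer type."""
    if not inchi.startswith("InChI="):
        raise ValueError("not an InChI string: %r" % inchi)
    version, *segments = inchi[len("InChI="):].split("/")
    layers = {
        "version": version,
        "formula": [],
        "connectivity": [],
        "hydrogen": [],
        "stereo": [],
        "other": [],
    }
    for seg in segments:
        if not seg:
            continue
        prefix = seg[0]
        if not prefix.islower():
            # The formula layer has no lowercase prefix letter.
            layers["formula"].append(seg)
        elif prefix == "c":
            layers["connectivity"].append(seg)
        elif prefix == "h":
            layers["hydrogen"].append(seg)
        elif prefix in STEREO_PREFIXES:
            layers["stereo"].append(seg)
        else:
            layers["other"].append(seg)
    return layers


def differing_layers(inchi_a: str, inchi_b: str) -> list:
    """Return the names of the layer groups in which two InChI strings differ."""
    a = split_inchi_layers(inchi_a)
    b = split_inchi_layers(inchi_b)
    return [name for name in a if a[name] != b[name]]


# Illustrative strings that differ only in the /m stereo segment.
generated = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m0/s1"
deposited = "InChI=1S/C3H7NO2/c1-2(4)3(5)6/h2H,4H2,1H3,(H,5,6)/t2-/m1/s1"
print(differing_layers(generated, deposited))
```

A comparison like this only says where two strings disagree; it does not say which of the two is correct.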

Release date: Jan 2017

The results of processing our target databases (BMRB, PubChem, HMDB, RCSB PDB Ligand Expo) are shown on this page.
  1. Correct and complete molecule and atom identifiers of entries of the target databases:
    You can query a registry ID of the processed databases or a metabolite name to see ALATIS results. If you need the complete set of the processed data of the target databases, please contact Hesam Dashti (dashti@wisc.edu).

  2. Flagged improper usage of InChI strings in databases:
    1. Incorrect/incomplete InChI strings deposited in databases:
      1. This section lists the inconsistencies between the standard InChI strings that we generated and those deposited in the databases:
        BMRB list can be found here.
        HMDB list can be found here.
        PubChem list can be found here.
        PDB list can be found here.

      2. This section lists the inconsistencies between the standard InChI strings that we generated and those deposited in the databases, after discarding the standard flag of the InChI strings:
        BMRB list can be found here.
        HMDB list can be found here.
        PubChem list can be found here.
        PDB list can be found here.

    2. Flagged deposited cross-links from BMRB and HMDB entries to PubChem:
      For every cross-link from a BMRB or HMDB entry to a PubChem entry, we compared the standard InChI strings of the two entries and flagged the cross-links whose corresponding molecule identifiers do not match:
      Flagged cross-links in BMRB can be found here.
      Flagged cross-links in HMDB can be found here.

  3. Created cross-links from PDB entries to BMRB, HMDB, and PubChem entries:
    We used the standard molecule identifiers of PDB entries and compared them with the corresponding identifiers of BMRB, HMDB, and PubChem entries. Here are the results of created cross-links:
    Created cross-links to BMRB can be found here.
    Created cross-links to HMDB can be found here.
    Created cross-links to PubChem can be found here.

  4. Incompatible atom labels between BMRB and HMDB:
    A comparison of the deposited atom labels of entries for the same compounds can be found here.
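
The cross-link checks described above come down to comparing the InChI strings of two linked entries, optionally ignoring the standard flag. The Python sketch below shows one way this could look. It is an illustration under stated assumptions, not the ALATIS code: the "standard flag" is taken to be the `S` that follows the version number in `InChI=1S/`, the helper names `strip_standard_flag` and `flag_cross_links` are hypothetical, and the registry IDs (`src-1`, `tgt-1`, and so on) are placeholders rather than real database entries.

```python
# Minimal sketch, assuming the standard flag is the "S" in "InChI=1S/".
def strip_standard_flag(inchi: str) -> str:
    """Rewrite 'InChI=1S/...' as 'InChI=1/...'; leave other strings unchanged."""
    prefix, sep, rest = inchi.partition("/")
    if prefix.startswith("InChI=") and prefix.endswith("S"):
        prefix = prefix[:-1]
    return prefix + sep + rest


def flag_cross_links(links, source_inchi, target_inchi, ignore_standard_flag=False):
    """Return the (source_id, target_id) pairs whose InChI strings do not match.

    links        : iterable of (source_id, target_id) cross-links
    source_inchi : dict mapping source IDs to InChI strings
    target_inchi : dict mapping target IDs to InChI strings
    A link with a missing InChI on either side is also flagged (a choice made
    for this sketch).
    """
    normalize = strip_standard_flag if ignore_standard_flag else (lambda s: s)
    flagged = []
    for src, tgt in links:
        a = source_inchi.get(src)
        b = target_inchi.get(tgt)
        if a is None or b is None or normalize(a) != normalize(b):
            flagged.append((src, tgt))
    return flagged


# Placeholder IDs; the InChI strings are water, methane, and ethanol.
source = {
    "src-1": "InChI=1/H2O/h1H2",
    "src-2": "InChI=1S/CH4/h1H4",
}
target = {
    "tgt-1": "InChI=1S/H2O/h1H2",
    "tgt-2": "InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3",
}
links = [("src-1", "tgt-1"), ("src-2", "tgt-2")]

strict = flag_cross_links(links, source, target)
lenient = flag_cross_links(links, source, target, ignore_standard_flag=True)
print(strict)
print(lenient)
```

In this example the strict comparison flags both links, while the comparison that ignores the standard flag flags only the link between different molecules.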

Citation:
  • Hesam Dashti, William M. Westler, John L. Markley, Hamid R. Eghbalnia, "Unique identifiers for small molecules enable rigorous labeling of their atoms", Scientific Data 4, Article number: 170073 (2017), doi:10.1038/sdata.2017.73, https://www.nature.com/articles/sdata201773
Disclaimer:
  • ALATIS is available to the public as a web-service via our web-server, and also through the NMRBox virtual machine. The custom source code was developed using the academic license of MATLAB® in the Linux environment (MATLAB® 2016a for CentOS 6.5). This work is copyrighted under the terms of GPL. The web-service and the source code are provided on an “as is” basis without warranty of any kind, either expressed or implied. Use of the web-server, and modification and application of the source code, are free for academic use provided that the ALATIS publication is cited.
  • The input/output file formats of ALATIS are Mol V2000 and its corresponding SDF. Other acceptable input/output file formats of this website are provided by the Open Babel (The Open Source Chemistry Toolbox) software package; please comply with the Open Babel license agreements.
Contact:
    For any questions or concerns, please contact Hesam Dashti (dashti@wisc.edu).