README
------

This file contains electronic supplementary material. The following
files are included:

ATOMTYPE_AMBER.DEF
c-RDFs.xlsx
REFCODES.txt
study_temps.csv
MAIN.m
extract_workspaces.m
scripts
|__aa_spherops.m
|__al_spherops.m
|__assym_set.m
|__cart2frac.m
|__hist_gen.m
|__la_spherops.m
|__lattice_gen.m
|__lim_set.m
|__matrix_set.m
|__merge_test.m
|__pdb_extract.m
|__sort_atom_types.m
|__sp_no_gen.m
|__sym_mat_trans.m
|__sym_ops.m
|__topology_extract.m

Our Dataset
-----------

The refcodes for the crystal structures used to produce the c-RDFs
explained in this paper are given in REFCODES.txt. This file may be
renamed to a .gcd file to load the relevant structures directly into
Mercury for your own investigations. Please refer to the relevant
documentation on the Cambridge Crystallographic Data Centre (CCDC)
website if you are unsure how to do this:
https://www.ccdc.cam.ac.uk

Our c-RDFs
----------

A summary of our c-RDFs is given in c-RDFs.xlsx. There is one row for
each atom-pair c-RDF calculated, as presented in this work. Each number
is the calculated g(r) at the distance (r) given in the first row of
the spreadsheet. Also included are graphs of the maximum value of g(r)
and the distance (r) at which it occurs.

Program Outline
---------------

The script used to run the relevant modules for obtaining raw data is
MAIN.m. The program’s primary operation can be summarised as follows:

1. Determine atom types according to AMBER forcefield definitions for a
   crystal structure .pdb file with Antechamber
2. Apply the crystallographic algorithms needed to generate the
   equivalent atom positions and to expand the lattice by one unit cell
   in each direction
3. Sort all atoms for each structure into individual arrays
4. Move the structure coordinate system origin to a target atom nucleus
   position (either water oxygen or hydrogen)
5. Convert to a spherical coordinate system
6. Calculate distance, azimuth and elevation for all atom pairs within
   a specified cut-off distance (10 Å)
7. Repeat from step 4 for every target atom in the system
8. Save the data as a MATLAB workspace for manipulation with further
   routines

This is executed in a directory loop, so that a user can place multiple
.pdb files in a single directory and run the program automatically for
all crystal structures.

Step 1 requires atom typing by antechamber, which is an external
routine. The antechamber software is available from
http://ambermd.org/antechamber/ac.html. We recommend that the user
refer to the user manuals on the ambermd website for full instructions
on how to use antechamber. A custom definition file for AMBER atom
types is given in this supplementary material (ATOMTYPE_AMBER.DEF).
To reproduce the results given in this paper, use this file to replace
'./antechamber/dat/antechamber/ATOMTYPE_AMBER.DEF' in your own
installation of antechamber. We recommend that you back up the original
definition file first.

How to Run the Program
----------------------

This program is written as a series of MATLAB (.m) module scripts. The
main program is executed via the 'MAIN.m' script. You will need to edit
this file to point to your own directories for input and output files.

The input files required are:

- a .pdb file for each structure for which you wish to calculate
  c-RDFs. These files are also required for atom typing within
  antechamber, and we recommend that identical .pdb files be used for
  both atom typing and this program.
- the topology files (.top) output by antechamber.

You will also need to change the user preferences in MAIN.m before
running the program. Once the preferences in the main program script
have been set and the correct input files are present, the program can
be run in the same way as any other MATLAB script. If you are
unfamiliar with this, please refer to the MATLAB documentation:
http://uk.mathworks.com/products/matlab/index.html?s_tid=gn_loc_drop

By default, the program saves the calculated results for each structure
as a MATLAB workspace. To calculate c-RDFs for each atom-type pair, use
the extract_workspaces.m script. Please ensure that you have changed
the user variables to suit your own working directory and required
output files. By default, this script outputs one Excel spreadsheet for
each atom-type pair. We suggest that you first run it on a subset of
workspaces, and then edit extract_workspaces.m to produce the output
that you require.
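
For readers who want to check their understanding of steps 4 to 7 of
the program outline before editing MAIN.m, the geometry can be sketched
in a few lines. The example below is written in Python purely for
illustration and is not the MATLAB code supplied in the scripts
directory. The function names, atom labels and coordinates are
hypothetical. The angle convention (azimuth measured in the x-y plane
from the +x axis, elevation measured up from the x-y plane) and the
exclusion of the target atom from its own neighbour list are
assumptions. Only the 10 Å cut-off is taken from the outline above.

```python
import math

CUTOFF = 10.0  # cut-off distance in angstroms, as stated in step 6


def to_spherical(dx, dy, dz):
    """Return (distance, azimuth, elevation) for a Cartesian offset.

    Assumed convention: azimuth is the angle in the x-y plane from the
    +x axis, elevation is the angle above the x-y plane (radians).
    """
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.atan2(dy, dx)
    elevation = math.atan2(dz, math.hypot(dx, dy))
    return distance, azimuth, elevation


def neighbours_of(target, atoms, cutoff=CUTOFF):
    """Steps 4 to 6 for one target atom."""
    tx, ty, tz = target
    result = []
    for label, (x, y, z) in atoms:
        # Step 4: move the origin to the target atom position.
        dx, dy, dz = x - tx, y - ty, z - tz
        # Step 5: convert the shifted position to spherical coordinates.
        distance, azimuth, elevation = to_spherical(dx, dy, dz)
        # Step 6: keep pairs inside the cut-off. Skipping zero distance
        # (the target atom itself) is an assumption of this sketch.
        if 0.0 < distance <= cutoff:
            result.append((label, distance, azimuth, elevation))
    return result


def all_pairs(targets, atoms, cutoff=CUTOFF):
    """Step 7: repeat the calculation for every target atom."""
    return [neighbours_of(t, atoms, cutoff) for t in targets]


# Hypothetical coordinates in angstroms (not taken from the dataset).
atoms = [
    ("OW", (0.0, 0.0, 0.0)),
    ("HW", (0.0, 0.0, 1.0)),
    ("C", (3.0, 4.0, 0.0)),
    ("N", (20.0, 0.0, 0.0)),
]
targets = [(0.0, 0.0, 0.0)]
pairs = all_pairs(targets, atoms)
```

With these made-up coordinates, the atom placed 20 Å from the target
lies outside the cut-off and is dropped, while the other two
neighbouring atoms are kept with their distance, azimuth and elevation.
The actual modules may handle the angle convention and self-pairs
differently, so please check the scripts themselves before relying on
this sketch.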