diff --git a/i19-workshop/README.md b/i19-workshop/README.md
new file mode 100644
index 0000000..c84a0d5
--- /dev/null
+++ b/i19-workshop/README.md
@@ -0,0 +1,11 @@
+# I19 Data Processing Workshop (Feb '23)
+
+## Example data
+
+Data recorded "back in the day" from tetrabromobenzene (C6H2Br4) at 320 K on i19-1. The crystal has symmetry P21/n and unit cell 4.08, 11.21, 9.84, 90, 100.7, 90. The data are in FIXME add directory, and consist of four runs of images following a standard strategy.
+
+## Processing tutorials
+
+There are two tutorials here - one using `xia2` and the other (much more interesting) one running through the individual steps using the `dials` command line tools.
+
+[Boring one using `xia2`](./xia2/README.md) and [much more interesting using `dials`](./dials/README.md).
diff --git a/i19-workshop/dials/README.md b/i19-workshop/dials/README.md
new file mode 100644
index 0000000..3e171c4
--- /dev/null
+++ b/i19-workshop/dials/README.md
@@ -0,0 +1,398 @@
+# DIALS
+
+Processing X-ray diffraction data with the `dials` command-line tools lets you build up a much more detailed understanding of what the data are like, but comes at the cost of having to type some occasionally arcane commands into a UNIX terminal. Today these commands have been substantially trimmed, so this is surprisingly straightforward in most cases, but it nevertheless does involve some potentially non-obvious options.
+
+## Processing data with DIALS - overview
+
+In the general sense this is exactly the same "workflow" as with `xia2` but this time we are going through the steps by hand, looking at the results and making up our own minds about what choices to make rather than delegating the choices to an automated system.
+
+As before, the data are imported, spots found, indexing and integration performed and the data scaled, so you should end up with pretty much the same answer as `xia2` gets 🙂 but you will learn a little more and it will probably take a little longer.
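Each of the steps in this overview corresponds to one `dials` command. As a minimal sketch, assuming a `bash` shell, the loop below lists the commands in the order this tutorial runs them and reports whether each one is currently on your `PATH`. The `steps` variable and the status wording are only for illustration.

```bash
#!/usr/bin/env bash
# Commands in the order this tutorial runs them.
steps="dials.import dials.find_spots dials.index dials.refine dials.integrate dials.symmetry dials.scale dials.export"

for cmd in $steps; do
    if command -v "$cmd" >/dev/null 2>&1; then
        status="available"
    else
        # On Diamond workstations DIALS is enabled with: module load dials
        status="not on PATH"
    fi
    printf '%-18s %s\n' "$cmd" "$status"
done
```

If every line reports "not on PATH", DIALS has probably not been loaded in the current terminal yet.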
+
+### DIALS Commands
+Below you will use several DIALS commands: accessing the commands is done through a terminal. DIALS is already installed on Diamond Linux workstations, but you need to tell Linux that you want to use it by typing `module load dials` into the terminal.
+
+This will give you access to the dials commands, which are all of the format `dials.the_thing_you_want_to_do`. Obviously we don't want the first step to be memorizing every single command (there will _not_ be a test at the end) - so just remember that the Tab key is your friend. Typing some of the command and double-tapping Tab will autocomplete as far as it can and show you all the matching options. Try it yourself by typing `dials.` and double-tapping Tab.
+
+You also don't want to have to remember the arguments you need to type into any given command. If you just type one of the commands and hit Enter then it will tell you some of the potential arguments / options it's expecting.
+
+There are a couple of _really_ useful DIALS commands which you can basically use whenever you like as you progress through the chain of commands to inspect your data.
+
+`dials.image_viewer` can be used to... _view your images_
+
+`dials.show` will show you a synopsis of the information contained in a DIALS file (see below)
+
+`dials.reciprocal_lattice_viewer` launches a graphical application to view your reflection data in reciprocal space
+
+### DIALS Files
+There are two main types of file which DIALS uses. They are not ASCII files so they're not that easy to look at with a text editor - use `dials.show` instead!
+
+Files with the extension `expt` are _experiment_ files which contain information describing the experiment: things like the detector type and geometry, incident beam definition, goniometer, UB matrix, etc.
+
+Files with the extension `refl` are _reflection_ files, which contain information describing the reflections: their positions in real space, intensities, etc.
+
+Many of the DIALS commands will finish with a line telling you that they have created or edited some more files:
+
+```bash
+...
+Writing experiments to imported.expt
+```
+The files are automatically named to reflect the DIALS command which created them - perhaps you can guess the name of the command that creates this file?! Whenever a command makes a file like this, feel free to explore it using the helpful commands above.
+
+## Importing
+
+In all honesty importing the data is usually the most troublesome point in any practical tutorial, as you have to correctly type the location of the data to import the right frames. For our example the data are in FIXME add this, so we want to import _all_ the data with:
+
+```bash
+dials.import /dls/i19-2/data/2023/cy12345-6/foo/foo_*.cbf
+```
+
+which should show you something which looks a little like:
+
+```
+DIALS (2018) Acta Cryst. D74, 85-97. https://doi.org/10.1107/S2059798317017235
+DIALS 3.12.1-2-g1c07bc2b3-release
+The following parameters have been modified:
+
+input {
+ experiments =
+}
+
+--------------------------------------------------------------------------------
+ format:
+ num images: 3450
+ sequences:
+ still: 0
+ sweep: 4
+ num stills: 0
+--------------------------------------------------------------------------------
+Writing experiments to imported.expt
+```
+
+This expresses what DIALS importing understood about your data set: that we have 4 runs and a total of 3,450 images in the data set. From this point on you can carry all the data through all steps of processing using something I call "ensemble processing".
+
+## Spot Finding
+
+This is just a matter of:
+
+```bash
+dials.find_spots imported.expt
+```
+
+which will make your computer make a whirring-busy-sound for a few minutes and produce _quite a lot_ of output.
At the end of this is a spot finding summary which is very similar to the one in `xia2` as: + +``` +Histogram of per-image spot count for imageset 0: +1451 spots found on 900 images (max 42 / bin) + * + * + * * * * * ** +* * * * * ** *** *** * * *** ** +* * * * * ****** ******************** ** * +* ** * * ** * *** ************************************ +**** * ***** ** ****************************************** +**** ********* ********************************************* +************************************************************ +************************************************************ +1 image 900 + +Histogram of per-image spot count for imageset 1: +1224 spots found on 850 images (max 29 / bin) + * * * + ** ** * * * *** * * ** * * ** + ** ** * ** * ** ** ***** * * ** * ** * ** * + * ** ***** * ** * ** ** ***** ****** ***** * ** *** * * + * ** *********** * ***** ****************** * ** *** * * + ** ** *********** ************************** * *** *** * ** + ********************************************** ******* **** +************************************************************ +************************************************************ +************************************************************ +1 image 850 + +Histogram of per-image spot count for imageset 2: +1174 spots found on 848 images (max 29 / bin) + * * * + * * * * * * * ** * + * * * * * * ** * *** * * * * *** * +** * * ******* ** * * *** * ** * * *** *** * ** * +** ***** ******* *** ** * *** ****** * * *********** **** * +** ************* *** ***** ************ * ****************** +************************** ********************************* +************************************************************ +************************************************************ +************************************************************ +1 image 848 + +Histogram of per-image spot count for imageset 3: +1109 spots found on 850 images (max 33 / bin) + * + * + ** * * * * * * +* *** * * ** * * * * * * * *** +* 
********** ** * * * * * *** * * ** ****
+***************** * ***** ****** ******* * *** * * ** ****
+******************* ***** ****** ******* ************** ****
+**************************************** *******************
+************************************************************
+************************************************************
+1 image 850
+
+--------------------------------------------------------------------------------
+Saved 4958 reflections to strong.refl
+```
+
+As before, these should make sense in the context of your experiment. However, because you are running through each of the steps by hand you can now take a short sojourn in reciprocal space with the `dials.reciprocal_lattice_viewer`...
+
+## An Excursion in Reciprocal Space
+
+_You do not need to do this to process your data!_ It is, however, very instructive to learn your way around these tools as you may later need this insight to process more challenging data sets. The reciprocal lattice viewer allows you to look at every spot found in your data set all at once, in _reciprocal_ space. This means you can see how the lattices align between runs, whether you have one or more lattices and so on.
+
+```bash
+dials.reciprocal_lattice_viewer imported.expt strong.refl
+```
+
+This will pop up a viewer which looks a little like this:
+
+![The dials reciprocal lattice viewer](./rlv.png)
+
+Here each of the runs is a different colour, which you can switch on and off with the "experiment id" toggles. You can also rotate the view with the mouse. I cannot really type out a tutorial here beyond this:
+
+Go play. Have fun. Come back here when you feel like you have found everything. You can run this again later too. You can also run `dials.image_viewer imported.expt strong.refl` to explore _real_ space with your diffraction data. I should really write something on that in this tutorial.
+
+## Indexing
+
+OK, so you have just come back from a little voyage in reciprocal space, I hope you had fun.
Now we will get back to work and figure out the lattice parameters and crystal orientation for these data using `dials.index`: + +```bash +dials.index imported.expt strong.refl +``` + +This will map all the spots to reciprocal space, do a bunch of Fourier transforms and other mathematics and try to figure out a three dimensional basis which does a good job of matching up with the spot positions in reciprocal space. This starts off at low resolution, indexes the spots, refines and then iterates, so you may see stuff which looks very familiar fly past. This is by design, as it makes processing the high resolution data more robust. This also indexes all of the sweeps at once, so there is one `UB` matrix only. This is particularly important if you have the detector at very high two-theta angles for some of the runs. + +The output here you want to pay attention to is really at the end: + +``` +RMSDs by experiment: ++-------+--------+----------+----------+------------+ +| Exp | Nref | RMSD_X | RMSD_Y | RMSD_Z | +| id | | (px) | (px) | (images) | +|-------+--------+----------+----------+------------| +| 0 | 964 | 0.68836 | 0.62197 | 0.46931 | +| 1 | 874 | 0.46612 | 0.86498 | 0.55976 | +| 2 | 781 | 0.51527 | 0.63582 | 0.7987 | +| 3 | 789 | 0.59745 | 1.0156 | 1.067 | ++-------+--------+----------+----------+------------+ + +Refined crystal models: +model 1 (1129 reflections): +Crystal: + Unit cell: 4.08082(13), 9.8441(5), 11.2146(6), 89.996(2), 89.998(3), 100.684(2) + Space group: P 1 + U matrix: {{-0.6607, -0.7296, 0.1762}, + { 0.0610, 0.1817, 0.9815}, + {-0.7481, 0.6592, -0.0755}} + B matrix: {{ 0.2450, 0.0000, 0.0000}, + { 0.0462, 0.1034, 0.0000}, + {-0.0000, -0.0000, 0.0892}} + A = UB: {{-0.1956, -0.0754, 0.0157}, + { 0.0233, 0.0188, 0.0875}, + {-0.1528, 0.0682, -0.0067}} + +---------8<------- more matrices etc. 
not very interesting -----
+
++------------+-------------+---------------+-------------+
+| Imageset | # indexed | # unindexed | % indexed |
+|------------+-------------+---------------+-------------|
+| 0 | 1129 | 322 | 77.8% |
+| 1 | 1022 | 202 | 83.5% |
+| 2 | 938 | 236 | 79.9% |
+| 3 | 896 | 213 | 80.8% |
++------------+-------------+---------------+-------------+
+
+Saving refined experiments to indexed.expt
+Saving refined reflections to indexed.refl
+```
+
+Here we see the R.M.S. deviations between where we found the spots on the images and where they should be according to our model, the matrix we determined, the unit cell with some uncertainties (which are not as good as we are going to get them at the end) and the fraction of indexed reflections. Here we have indexed most of the spots - if we later look closely we could find that the spots are a little split, but if you have e.g. two distinct lattices then you may find that the indexed percentage is 50% or lower.
+
+Guess what? You can go explore reciprocal space again but this time looking at the _indexed_ data - which will allow you to look at what was not indexed (exploring the buttons to do this is an exercise for the reader):
+
+```bash
+dials.reciprocal_lattice_viewer indexed.expt indexed.refl
+```
+
+## Refinement
+
+The indexing just assigns a single orientation matrix to your data, which is very helpful for ensuring consistent indexing but less so when it comes to actually modelling the data, as our instrumentation is not ångstrom precise. Therefore, with the refinement, we firstly allow a little bit of variation in orientation and unit cell between sweeps and secondly within sweeps, to account for beam damage or sample wobbling or similar.
+
+In general the output from refinement is not very interesting, with many numbers, but the very end is:
+
+```
+RMSDs by experiment:
++-------+--------+----------+----------+------------+
+| Exp | Nref | RMSD_X | RMSD_Y | RMSD_Z |
+| id | | (px) | (px) | (images) |
+|-------+--------+----------+----------+------------|
+| 0 | 824 | 0.46029 | 0.37353 | 0.23481 |
+| 1 | 721 | 0.34769 | 0.49482 | 0.18088 |
+| 2 | 678 | 0.26417 | 0.39351 | 0.29746 |
+| 3 | 685 | 0.2785 | 0.41499 | 0.35178 |
++-------+--------+----------+----------+------------+
+```
+
+These are pretty reasonable numbers - the predictions match up well between where we found spots and where we calculate them to be. Mathematically we are never going to get better than about 1/6th of a pixel, and in real life we rarely see anything better than about half a pixel. If these are much more than a couple of pixels we may need to go back and look at the images, because the integration shoeboxes will be made big enough to accommodate this deviation.
+
+## Integration
+
+Despite this being one of the most computationally expensive steps, fundamentally it is very simple. Draw a box around where we calculate spots to be, subtract an estimate of the background, then write out the intensity. There is not much to see from the console output of this, but the image viewer is now really useful to see how the spot predictions match up with the observed spots:
+
+```bash
+dials.image_viewer integrated.expt integrated.refl
+```
+
+Which shows:
+
+![Image viewer with integration shoebox](./image.png)
+
+I note here that stacking images is _really_ useful for getting an idea of what the diffraction actually looks like. Again, it is worth playing.
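Neither the refinement nor the integration command is spelled out in the sections above. The sketch below is one way to run both with default settings, assuming the default output names (`indexed.*` from `dials.index`, `refined.*` from `dials.refine`). The `refine_and_integrate` wrapper is a hypothetical convenience for this tutorial, not part of DIALS.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: refine the indexed model, then integrate with it.
refine_and_integrate() {
    # Both inputs are written by dials.index; stop early if they are missing.
    if [ ! -f indexed.expt ] || [ ! -f indexed.refl ]; then
        echo "indexed.expt / indexed.refl not found: run dials.index first"
        return 1
    fi
    # Writes refined.expt and refined.refl by default.
    dials.refine indexed.expt indexed.refl || return 1
    # Writes integrated.expt and integrated.refl by default.
    dials.integrate refined.expt refined.refl
}

refine_and_integrate || echo "refinement / integration not run"
```

On success this leaves `integrated.expt` and `integrated.refl` in the working directory, which are the files the image viewer command above opens.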
+
+## Symmetry Determination
+
+This is usually a straightforward step which is reliable and robust. Run:
+
+```bash
+dials.symmetry integrated.expt integrated.refl
+```
+
+The output first scores the individual symmetry elements:
+
+```
+Scoring individual symmetry elements
+
++--------------+--------+------+------+-----+--------------+
+| likelihood | Z-CC | CC | N | | Operator |
+|--------------+--------+------+------+-----+--------------|
+| 0.926 | 9.63 | 0.96 | 5754 | *** | 1 |(0, 0, 0) |
+| 0.085 | 2.29 | 0.23 | 5533 | | 2 |(1, 0, 0) |
+| 0.928 | 9.68 | 0.97 | 5657 | *** | 2 |(0, 0, 1) |
+| 0.084 | 2.25 | 0.23 | 5579 | | 2 |(1, 2, 0) |
++--------------+--------+------+------+-----+--------------+
+```
+
+and then composes the selected operations into a space group:
+
+```
+Scoring all possible sub-groups
+
++-------------------+-----+--------------+----------+--------+--------+------+-------+---------+--------------------+
+| Patterson group | | Likelihood | NetZcc | Zcc+ | Zcc- | CC | CC- | delta | Reindex operator |
+|-------------------+-----+--------------+----------+--------+--------+------+-------+---------+--------------------|
+| P 1 2/m 1 | *** | 0.909 | 7.38 | 9.65 | 2.27 | 0.97 | 0.23 | 0 | x+y,-z,y |
+| P -1 | | 0.07 | 3.74 | 9.63 | 5.89 | 0.96 | 0.48 | 0 | -a,a-b,c |
+| C m m m | | 0.008 | 7.01 | 7.01 | 0 | 0.6 | 0 | 1.3 | -a,a-2*b,c |
+| C 1 2/m 1 | | 0.007 | -0.03 | 7 | 7.03 | 0.6 | 0.6 | 1.3 | a-2*b,a,c |
+| C 1 2/m 1 | | 0.006 | -0.04 | 6.99 | 7.03 | 0.6 | 0.6 | 1.3 | -a,a-2*b,c |
++-------------------+-----+--------------+----------+--------+--------+------+-------+---------+--------------------+
+```
+
+Here we see a clear winner: at the very least the point group is probably correct.
+
+## Scaling
+
+This is the first real point where we have a good idea of what the data quality is like. The scaling figures out corrections for the experimental contributions to the data e.g. sample absorption, overall scale factor, radiation damage.
Usually the defaults work well e.g.: + +```bash +dials.scale symmetrized.expt symmetrized.refl +``` + +If you have a lot of heavy atoms, or a large crystal, it could be beneficial to relax the constraints on the absorption correction with `absorption_level=medium` or `high`. This will give a summary at the end which includes an estimate of the resolution limit and some overall scaling statistics: + +``` +Resolution limit suggested from CC½ fit (limit CC½=0.3): 0.61 + + -------------Summary of merging statistics-------------- + + Suggested Low High Overall +High resolution limit 0.61 1.64 0.61 0.59 +Low resolution limit 11.21 11.21 0.62 11.21 +Completeness 99.6 100.0 93.4 94.2 +Multiplicity 4.5 7.9 2.0 4.4 +I/sigma 8.7 51.1 0.0 8.4 +Rmerge(I) 0.043 0.033 1.285 0.043 +Rmerge(I+/-) 0.040 0.032 1.171 0.040 +Rmeas(I) 0.047 0.036 1.624 0.047 +Rmeas(I+/-) 0.047 0.037 1.656 0.047 +Rpim(I) 0.018 0.013 0.977 0.018 +Rpim(I+/-) 0.023 0.017 1.171 0.023 +CC half 1.000 0.999 0.559 1.000 +Anomalous completeness 92.9 100.0 46.6 85.0 +Anomalous multiplicity 2.4 4.4 1.3 2.4 +Anomalous correlation -0.216 -0.192 0.188 -0.207 +Anomalous slope 0.189 +dF/F 0.042 +dI/s(dI) 0.390 +Total observations 9705 942 227 9806 +Total unique 2169 119 114 2251 +``` + +A little further up you will find some details about the error model as well, which can be useful but I am not going to discuss right now (FIXME add an appendix): + +``` +Error model details: + Type: basic + Parameters: a = 3.17464, b = 0.01101 + Error model formula: σ'² = a²(σ² + (bI)²) + estimated I/sigma asymptotic limit: 28.616 +``` + +We can re-scale the data to the recommended resolution limit with: + +```bash +dials.scale symmetrized.expt symmetrized.refl d_min=0.61 +``` + +Which gives different output with only three columns: + +``` + -------------Summary of merging statistics-------------- + + Overall Low High +High resolution limit 0.61 1.65 0.61 +Low resolution limit 11.21 11.21 0.62 +Completeness 99.9 100.0 95.4 +Multiplicity 
4.5 7.9 2.3 +I/sigma 8.8 46.4 0.1 +Rmerge(I) 0.047 0.036 1.187 +Rmerge(I+/-) 0.044 0.035 0.867 +Rmeas(I) 0.051 0.039 1.507 +Rmeas(I+/-) 0.051 0.040 1.226 +Rpim(I) 0.019 0.014 0.913 +Rpim(I+/-) 0.026 0.019 0.867 +CC half 0.999 0.999 0.819 +Anomalous completeness 94.4 100.0 63.1 +Anomalous multiplicity 2.4 4.4 1.4 +Anomalous correlation -0.267 -0.432 0.055 +Anomalous slope 0.215 +dF/F 0.045 +dI/s(dI) 0.471 +Total observations 9662 913 238 +Total unique 2130 116 104 +``` + +It is worth noting here that there are not really anomalous differences as the crystal is centric, but the scaling is barely affected by this so we can just move on with our workings. + +## Next Steps + +We can export the data into MTZ format used in MX with + +```bash +dials.export scaled.expt scaled.refl +``` + +Then make SHELX `ins` and `hkl` files with + +```bash +xia2.to_shelx scaled.mtz tbb CHBr +``` + +After which running `shelxt` and `shelxl` in the usual way is enough to get a nice structure... but that is out of context here. diff --git a/i19-workshop/dials/image.png b/i19-workshop/dials/image.png new file mode 100644 index 0000000..070744d Binary files /dev/null and b/i19-workshop/dials/image.png differ diff --git a/i19-workshop/dials/rlv.png b/i19-workshop/dials/rlv.png new file mode 100644 index 0000000..964b54b Binary files /dev/null and b/i19-workshop/dials/rlv.png differ diff --git a/i19-workshop/xia2/README.md b/i19-workshop/xia2/README.md new file mode 100644 index 0000000..1fce68b --- /dev/null +++ b/i19-workshop/xia2/README.md @@ -0,0 +1,248 @@ +# xia2 + +`xia2` as a command line tool is there to process your data when you have other things to do: it will in general make sensible decisions for you but it _won't_ tell you how your data are interesting, nor give you any real insight into what is going on. If your data are OK, it can be very effective in getting you from the diffraction images to `hkl` files in a compact period of time. 
+ +## Basic xia2 usage + +In the simplest case (for small molecule data) you just run + +```bash +xia2.small_molecule /dls/i19-2/data/2023/cy12345-6/foo/ +``` + +(say) for data collected into a directory `/dls/i19-2/data/2023/cy12345-6/foo/`. This will do all the spot finding, indexing, refinement, integration, symmetry determination and scaling for you, and just give you a very light weight summary of the processing. + +For the tutorial data this gives several blocks of output - spot finding from each of the four scans: + +``` +-------------------- Spotfinding SWEEP1 -------------------- +1388 spots found on 900 images (max 39 / bin) + * * + * * + * * ** * * * +* * * * * *** *** ** *** *** * ** +* ** * * * * * ****** ******************** ** * +* ** * ***** * *** ************************************** +**** ********* * ****************************************** +**** ********* ********************************************* +************************************************************ +************************************************************ +1 image 900 +-------------------- Spotfinding SWEEP2 -------------------- +1170 spots found on 849 images (max 29 / bin) + * * * + * * * * ** * * * * * * + ** *** * ** * * ** *** * * ** ** * ** * + ** *** * * ** * ** ** ***** ****** * *** * ** *** + * ** ****** **** * ************************ * ** *** * ** + * ** *********** ************************** * *** *** * ** + ********************************************** ******* **** +******************************************************* **** +************************************************************ +************************************************************ +2 image 850 +-------------------- Spotfinding SWEEP3 -------------------- +1108 spots found on 848 images (max 28 / bin) + * * * * + * * * * * * * * * + * * * ** * *** ** * * *** *** +** * * * * * ** * *** ** * * * *** *** * * * +** ***** ******* *** ** * *** ****** * * ********* * *** * +**************** *** ***** 
************ * ******************
+************************** *********************************
+************************** *********************************
+************************** *********************************
+************************************************************
+1 image 848
+-------------------- Spotfinding SWEEP4 --------------------
+1060 spots found on 850 images (max 31 / bin)
+ *
+ * *
+* * * * * * * *
+* *** * * ** * * * * * * * * *
+* *** *** * ** * * ** * * *** * * ** ****
+***************** * ***** ****** **** ** ** *** * * ** ****
+******************* ***** ****** ******* ************** ****
+******************* ******************** *******************
+************************************************************
+************************************************************
+1 image 850
+```
+
+If you see any gaps in here then it may be worth looking at the images. The indexing takes all four scans' worth of spots, maps the data to reciprocal space then determines an orientation matrix which best explains them all. This is then assessed for likely symmetry based on refinement against compatible Bravais lattice constraints (which I note is _different_ from the default `dials` route) - the output here is very sparse:
+
+```
+---------------- Autoindexing SWEEPS 1 to 4 ----------------
+All possible indexing solutions:
+mP 4.09 11.23 9.86 90.00 100.68 90.00
+aP 4.09 9.86 11.23 89.99 90.02 79.32
+Indexing solution:
+mP 4.09 11.23 9.86 90.00 100.68 90.00
+```
+
+All of the compatible lattices are listed, and by default the highest symmetry one tested is chosen - this may later be eliminated by analysis of the intensities (but in this case turns out to be correct). Next the models are refined (which shows no output) and the data integrated. Modern data sets consist of hundreds or thousands of images, so `xia2` summarises the output of integration as one character per image e.g.
+ +``` +-------------------- Integrating SWEEP1 -------------------- +Processed batches 2 to 901 +Standard Deviation in pixel range: 0.03 1.39 +Integration status per image (60/record): +ooooooooo.o.ooooooo.o.o.ooo..ooooooooooooooooooooooooooooo.o +oo.o.ooooo.o..oo.o.ooo.oooooo..o.oo.ooo.oooooooooooo.ooooo.. +oooooo..ooooooooooooooo.oooooo..oo..ooo..oooooooooooooooo.oo +oooooooooooooooooooo.oooooooooooooo.o.ooooo.oooooooooooooo.o +oooooo.oooooooo.o.oo.ooooooo.oooooooooooooooooo.oooo.oooooo. +ooooo.o.ooooooooooooooooooooooooooooooooooooo.ooo.ooo.oo.ooo +oooo..o.oooo.ooooo.ooooo.oooo..oooooooooooo.oooooooo..oooooo +oooo.oooooooooo.o.ooooooo.oooo.o.oo.oooooooooooo..oooooooooo +oooooo.oo.%oo.ooo.ooo.ooooo.ooooooo.oo.oooooooooo%oooooooooo +oo.oo.ooooooo.oo..oooooooo.oooooooo..oooo.ooo.oo..oooooooooo +o.oooooooooo.oooo.oooooooooo.ooooo.o.oooooooooooooo.oooooooo +o.o.ooooooooooo.o..oooo.ooooooooooooo.ooooooooooooooooo%oooo +oooo..oo%oooooooooooooo.ooooooo.ooooooo.ooo.ooooo.ooooooo.oo +oooooooo.ooooooo.oooooooooo.ooooooo.oo.ooo.ooooooooooo.%oooo +o.oooo.ooooo..oooo.oo.o.oooooooooo.ooo.o.oo.ooo..ooooo.ooooo +"o" => good "%" => ok "!" => bad rmsd +"O" => overloaded "#" => many bad "." => weak +"@" => abandoned +Mosaic spread: 0.214 < 0.214 < 0.214 +-------------------- Integrating SWEEP2 -------------------- +Processed batches 2 to 851 +Standard Deviation in pixel range: 0.03 1.07 +Integration status per image (60/record): +o.ooo.oo..o...o....ooo.oooooooo.ooo%oo..ooooo....oo.o.oo.ooo +oooooo..oooooooooo..ooooo.oo..ooooo.ooooooooooooooo.o.oooooo +oo.ooo.ooo.oooooooo.o.ooooo.o.ooooooooooo..o..o.oooo.ooooooo +oooooooo....oooooo..ooooooooo.oooooo..o.ooooo..oo.oooo.ooo.. +oooooooooo.oo%.oooo..o....o.o.ooooooooooooo.o.oooo..oooooooo +ooo.ooo.oo.oooo.oooo.ooo.ooooooo..o.oo.ooooooo.ooooooooooooo +oo.oooooooooooo.ooooo.ooo.oooooo.ooo..oooo.ooooo.ooooooooooo +ooo.oo.o.o.oo.oooooooooooo..ooo.oooooo.oo%.oo.oooooooo.ooooo +.ooooooooooooooooooooo.ooooooo.ooooooo..ooooooo.ooooo.ooooo. 
+ooo.oooooooo.oo.ooooo.ooooo..oooo.ooooooo.oooooo.o.oooo.o.oo +o.ooooooooooooo.ooo.o.oo.ooooooooooooooooo.oo.ooooooooo.oooo +ooooooo.ooo.oo.o.ooo.ooooo.ooooooo..ooooo.oooooooo%ooooo.ooo +ooooo.ooooooooooooo.oooo..ooooo.ooo.o.o.o%.ooooooo.o.%oooooo +.oo.o.o.oo.ooooo.oooooooo.oo..o.ooooooo.ooooo.ooo.oooo..ooo. +oooooo.ooo +"o" => good "%" => ok "!" => bad rmsd +"O" => overloaded "#" => many bad "." => weak +"@" => abandoned +Mosaic spread: 0.179 < 0.179 < 0.179 +-------------------- Integrating SWEEP3 -------------------- +Processed batches 2 to 851 +Standard Deviation in pixel range: 0.03 1.18 +Integration status per image (60/record): +ooo..o.oo.ooooooooo.oo.ooooo..oooo.ooo...o..ooooo..o.ooooooo +..ooooo.oooo.oooooooooooo..ooo.o..ooooo...ooooo.o.oooo.oo... +oooooo.oooooooooooooo.ooo.oooooooo..ooooo.oo%oo.ooo..oo.oooo +o.o..ooooo.ooo.o.oo.oooooo...ooooo.ooo.o.oooooooo.oo....oo.o +oo.oo.ooooo.o..o.ooooooo..oooooo.ooooooo.o.o.ooo.oo.oo.ooo.o +.oooooo..oooooooooooooooooooo%oooo.o.o.oooooooo..ooo.ooo..o. +oo.o.o.o.oooo..o...oo.ooo.oooooo%ooooo.ooooooo.ooooooooo.oo. +ooooooooooooooooooooooooooooo.ooo.oooooo.oo%oooooooooooooooo +o..ooo.ooo..ooo.oooooooooooooooooo.oo.ooo..oooo.ooooo.ooooo. +oooo..ooooooo.oooooooo.o..oooooo.o.ooooooooo.ooooooo.oooo.oo +oooo.ooooo..oooooooo.oooooooo.oooo.ooo.ooooo.oooooo.oooo.oo% +o.ooooooooooooooooo..ooooo...oooo.oo...ooooooo.oo.oooooo.o.o +.oo.ooooooo.o.o.oo.ooooo..oooooooooo.oo..oo.ooo.ooo.o.o.o..o +o..oooooooo...oooo.o.oooo.o.o.ooo.o.oo.oo..oooo.oo.oo.o...oo +oooooo..o. +"o" => good "%" => ok "!" => bad rmsd +"O" => overloaded "#" => many bad "." 
=> weak +"@" => abandoned +Mosaic spread: 0.197 < 0.197 < 0.197 +-------------------- Integrating SWEEP4 -------------------- +Processed batches 2 to 851 +Standard Deviation in pixel range: 0.01 1.24 +Integration status per image (60/record): +ooooooo.ooo..oo.o.ooooo.ooo.o%oo.oooo..oo%o.o.o.ooooooo..ooo +.oo.o.oooooooooooooo..oooooo.ooooooo.ooo.oo..oooo.ooo.oooo.o +oooo.oo..oo.oooooo.ooooooo.ooooooooooo.ooooo.o.ooo.oooooo.oo +oooooo.ooo.o.oooo.ooo.o.oo.oo.o.ooooooooooo.ooo.ooooo.o.oooo +ooooooooooo.ooo.ooo.oooooo.ooo..ooo.....o.oooooooooo.o.ooooo +o.o.ooo..oo.o...o.ooooooooo..ooooo.o.oooooooooooo%o..o.ooo.o +o..oo....oooooooooooooo.oo.ooooo.oooo.ooooo..oooooooo.ooooo. +.oo..oooo.o.oo.oooooo.oooooo.o.ooooo.ooo...o.ooo..oooooo.ooo +o.oooo..ooo.oo..oo%oo..oooooo%o.o.ooooo.ooo..oooooo.ooooo.oo +o.ooo...oooooo.oo.ooooooo.o.oo..ooo.o.o..ooo..o..o.o.oo.o.oo +o.oooo.oo..oooooo..o...o.oooooo....ooooooo.oooo.oo.oooooo.o. +oo.ooooooooooo.o.o.o.oooo.oo.ooooo..ooooo..o.oooooooooo.ooo. +oooo.ooooo.oooooooo.oooo.oo.ooooo.o..ooooooo.oo..ooooooooooo +o..ooooooo.oo...oo.oooooooooooooooooooo.o..ooo.oo.oooooooooo +ooooooooo. +"o" => good "%" => ok "!" => bad rmsd +"O" => overloaded "#" => many bad "." => weak +"@" => abandoned +Mosaic spread: 0.284 < 0.284 < 0.284 +``` + +Usually would expect to get mostly `oooo` and `.....` for small molecule data sets - with `%%%` if the spots are less than perfect. What we see above is good. 
After this the data have the correct symmetry (as in point group) identified and then scaled, after which a "sensible" resolution limit is determined - + +``` +-------------------- Preparing DEFAULT --------------------- +Reindexing all datasets to common reference +--------------------- Scaling DEFAULT ---------------------- +Resolution limit for NATIVE/SWEEP1: 0.81 ( 0.82 suggested) +Resolution limit for NATIVE/SWEEP2: 0.58 ( 0.59 suggested) +Resolution limit for NATIVE/SWEEP3: 0.58 ( 0.66 suggested) +Resolution limit for NATIVE/SWEEP4: 0.58 ( 0.59 suggested) +``` + +(in this case, using all the data). By default all of the intensity measurements are kept in the output, with the merging stats reporting all data and the lower limit if necessary. After this stage the strong reflections are used to re-refine the unit cell post integration, to get a best estimate of the "global" unit cell for subsequent analysis along with uncertainties to pass on to `shelx`: + +``` +------------------- Unit cell refinement ------------------- +Overall: 4.08 11.21 9.84 90.00 100.67 90.00 +``` + +Finally the overall merging stats are reported for the data set - + +``` +For AUTOMATIC/DEFAULT/NATIVE Suggested Low High Overall +High resolution limit 0.59 1.60 0.59 0.58 +Low resolution limit 9.67 9.67 0.60 9.67 +Completeness 95.6 100.0 38.9 93.9 +Multiplicity 4.4 7.7 1.1 4.4 +I/sigma 9.0 44.1 0.0 8.9 +Rmerge(I) 0.049 0.041 0.558 0.049 +Rmeas(I) 0.053 0.044 0.789 0.053 +Rpim(I) 0.020 0.016 0.558 0.020 +CC half 0.999 0.999 0.666 0.999 +Total observations 9769 994 48 9777 +Total unique 2235 129 44 2243 +Assuming spacegroup: P 1 2/m 1 +Unit cell (with estimated std devs): + 4.0770(2) 11.2084(7) 9.8377(5) +90.0 100.670(4) 90.0 +``` + +It is important to note here that the space group assignment is based on the systematic absences and therefore somewhat unreliable, but the point group determination (which is really all we care about at this stage) is robust. 
+
+The details of how this works can be followed through by going through the [_much more interesting_ `dials` tutorial](../dials/README.md).
+
+## Result Files
+
+All the data files you need are left in the `DataFiles` directory. In here are files prepared for input to `shelxt` as `shelxt.ins` and `.hkl` - these however do not have the atom definitions. If you want to look at intermediate steps, e.g. from indexing, you can look in (usually) `./DEFAULT/NATIVE/SWEEP1/index/` e.g. `dials.reciprocal_lattice_viewer 17_indexed.expt 17_indexed.refl` in this case.
+
+## Controlling xia2
+
+The intention of `xia2` is that the defaults "just work" but sometimes you will know better and wish to enforce your will over the computer. This is very straightforward e.g. assigning the known unit cell, space group and resolution limit (which are the key choices) as:
+
+```bash
+xia2 space_group=P2 unit_cell=4.1,11.2,9.8,90,100.7,90 d_min=0.84 /dls/i19-2/data/2023/cy12345-6/foo/
+```
+
+You can also process specific "runs" with e.g.
+
+```bash
+xia2 image=/dls/i19-2/data/2023/cy12345-6/foo/foo_2_0001.cbf image=/dls/i19-2/data/2023/cy12345-6/foo/foo_2_0001.cbf
+```
+
+which will only take the runs which belong to those images. You can also process a subset of the images in a run with
+
+```bash
+xia2 image=/dls/i19-2/data/2023/cy12345-6/foo/foo_2_0001.cbf:1:900
+```
+
+which will only process images 1...900 inclusive. This can be useful if you did a long scan and the sample decayed.
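As a small sketch of the sub-range syntax above, the snippet below builds the `image=` argument from shell variables, assuming you want only the first half of a run. The `first` and `last` values are placeholders, and the command is only printed so that you can check it before running it yourself.

```bash
#!/usr/bin/env bash
# Path of the first image in the run (taken from the example above).
run_image=/dls/i19-2/data/2023/cy12345-6/foo/foo_2_0001.cbf
first=1     # placeholder: first image to process
last=450    # placeholder: last image to process (inclusive)

# Same image=<file>:<first>:<last> form as the command above.
image_arg="image=${run_image}:${first}:${last}"
echo "xia2 ${image_arg}"
```

Keeping the range in variables makes it easy to try a few different cut-offs when deciding how much of a decaying scan to keep.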