How do I extract a molecular structure from a CIF file?

Matter Modeling Asked on November 7, 2021

When I open a CIF file of a crystal structure in a typical molecular editor like Avogadro, I see only a few symmetry-unique atoms scattered around the unit cell, not the full molecule with all its bonds. What is the easiest way to extract one complete molecule, with all its bonds, from a CIF file?

The two problems are:

  1. Some atoms are not shown at all if they are not symmetry-unique.
  2. Even if all the atoms are shown, chemical bonds may run between images in different unit cells, so the bonds are not displayed properly within the unit cell.

Let me show a screenshot below:

Avogadro screenshot

What I would like to happen is that the part with the orange atoms is moved up one unit cell, so that the carbon atoms attach to the blue nitrogen atoms shown at the top. Similarly all the scattered hydrogens have to be moved to different unit cells so that they attach to the atoms properly. This is clearly not a trivial task but it is something that would be needed whenever a CIF file is opened.
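The shift described above can be sketched in a few lines of Python. This is only an illustration of the idea, not what any of the programs mentioned on this page actually do: it assumes fractional coordinates for all atoms in the cell are already available, uses a single hypothetical distance cutoff (`max_bond`) instead of element-specific covalent radii, and finds the nearest periodic image by rounding fractional differences, which is only reliable for cells that are not strongly skewed. The function names and the test coordinates are made up for the example.

```python
from collections import deque
import math


def frac_to_cart(frac, cell):
    """Convert a fractional vector to Cartesian; rows of `cell` are a, b, c."""
    return tuple(sum(frac[i] * cell[i][k] for i in range(3)) for k in range(3))


def unwrap_molecule(fracs, cell, max_bond=1.9, start=0):
    """Collect the molecule containing atom `start`, shifting each bonded
    atom by whole lattice vectors so that it sits next to its neighbour.

    Returns {atom index: unwrapped fractional coordinates}.
    """
    placed = {start: tuple(fracs[start])}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j, fj in enumerate(fracs):
            if j in placed:
                continue
            # Nearest periodic image of atom j relative to the placed atom i.
            image = tuple(fj[k] - round(fj[k] - placed[i][k]) for k in range(3))
            delta = tuple(image[k] - placed[i][k] for k in range(3))
            # Crude bond test: one cutoff for every element pair (assumption).
            if math.dist(frac_to_cart(delta, cell), (0.0, 0.0, 0.0)) <= max_bond:
                placed[j] = image
                queue.append(j)
    return placed


# Made-up example: a 10 Å cubic cell with a bond crossing the top face.
cell = [(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)]
fracs = [
    (0.5, 0.5, 0.95),  # atom 0, near the top of the cell
    (0.5, 0.5, 0.08),  # atom 1, bonded to atom 0 through the cell boundary
    (0.1, 0.1, 0.50),  # atom 2, belongs to a different molecule
]
molecule = unwrap_molecule(fracs, cell)
for index, frac in sorted(molecule.items()):
    print(index, frac)
```

Atom 1 is moved up by one unit cell so that it attaches to atom 0, while atom 2 is left out because it is not within bonding distance of either atom.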

7 Answers

If the content of the .cif in question is fairly molecule-like, my recommendation is the cod-tools collection by the Crystallography Open Database. The tools may be obtained directly from their web site; if you use Debian 10/buster or Ubuntu 18.10LTS and higher, you will also find them in the repositories (DebiChem tracker).

A typical extraction follows the pattern of

codcif2sdf input.cif > output.sdf

to write a new record (v2000 style). This works quite well, not only for .cif files from the COD but also for those from other databases (including CSD .cif files from the CCDC, if you forgot to export a .mol2 directly from the database). The cod-tools collection shines especially for unit cells of higher space-group symmetry, where a small number of unique atoms suffices to put the molecule together.
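If the conversion has to be scripted, the command above can be wrapped in Python. This is a minimal sketch that assumes `codcif2sdf` is installed and on the `PATH`; the helper names `sdf_path_for` and `cif_to_sdf` are hypothetical and not part of cod-tools.

```python
import subprocess
from pathlib import Path


def sdf_path_for(cif_path):
    """Return the output path: same location and name, with an .sdf suffix."""
    return Path(cif_path).with_suffix(".sdf")


def cif_to_sdf(cif_path):
    """Run `codcif2sdf input.cif > output.sdf` and return the output path."""
    out_path = sdf_path_for(cif_path)
    with open(out_path, "w") as handle:
        # codcif2sdf writes the SDF record to standard output.
        subprocess.run(["codcif2sdf", str(cif_path)], stdout=handle, check=True)
    return out_path
```

Calling `cif_to_sdf("input.cif")` would then write `input.sdf` next to the input file and raise an error if the tool exits with a non-zero status.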

On occasion, an adjustment of bond orders may be necessary, e.g., in triazines like this one from the COD, where all six aromatic bonds between carbon and nitrogen are written as double bonds instead of aromatic bonds.

Answered by Buttonwood on November 7, 2021

If one has access to CCDC's Mercury app, the unique molecules (Z') can be easily written as mol2 or xyz files by the Save As option.

Of course, this requires a completely written CIF file (with metadata). Oftentimes, codes lazily output all coordinates in the unit cell instead.

Answered by gogo on November 7, 2021

This is due to an unfortunate code-change in some versions of Open Babel. AFAIK, versions 3.0 and later have a heuristic to detect if the CIF contains a molecule, or is a traditional solid-state crystal structure (e.g., NaCl, YBCO, etc.). These versions are used by the latest development versions of Avogadro (1.93 / 1.94 at the moment).

Based on your screenshot, it looks like you're using Avogadro 1.x. There isn't a great way to back-fix this, since I'm not maintaining that code. (I'd much rather just get 2.0 finished and out the door.)

There are two work-arounds:

  • Convert with Open Babel (e.g., to CML, MDL, etc.) and read that in
  • Use "fill unit cell" to get all atoms, and build a super-cell if needed.
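The first work-around can also be run from a script. The sketch below builds an `obabel` command line that converts the CIF to another format (chosen from the output file extension) and asks Open Babel to fill the unit cell with its `--fillUC` option, which accepts `strict` or `keepconnect`. Which mode gives a usable molecule, and whether the result is connected at all, depends on the file and the Open Babel version, so treat the default here as an assumption. The helper names are hypothetical.

```python
import subprocess


def obabel_command(cif_path, out_path, fill_mode="keepconnect"):
    """Build the obabel argument list; the output format follows the extension."""
    if fill_mode not in ("strict", "keepconnect"):
        raise ValueError("fill_mode must be 'strict' or 'keepconnect'")
    return ["obabel", cif_path, "-O", out_path, "--fillUC", fill_mode]


def convert_cif(cif_path, out_path, fill_mode="keepconnect"):
    """Run the conversion; assumes the obabel executable is on the PATH."""
    subprocess.run(obabel_command(cif_path, out_path, fill_mode), check=True)
```

For example, `convert_cif("input.cif", "output.cml")` would write a CML file that can then be opened in Avogadro.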

The patch is #1823, although it would probably be a pain to apply it, rebuild Avogadro 1.x with the older versions of Open Babel it uses, etc.

Answered by Geoff Hutchison on November 7, 2021

I tried out a few programs. It seems like Jmol and Molden do resolve the CIF structure and directly show me one molecule with all its bonds without requiring any further input.

Other programs like Avogadro or a command-line conversion with OpenBabel just produce the raw data of atoms scattered in the unit cell.

Answered by Felix on November 7, 2021

A command line package I use very often to convert between structure formats (and other things) is c2x. You can download the code from their website, and a quick "c2x --formats" suggests that it supports both cif files and xyz files.

Answered by ProfM on November 7, 2021

If you're not tethered to Avogadro, you can use Mercury to change the system size by going to Calculate, then Packing, and adjusting the cell size.

Answered by Cody Aldaz on November 7, 2021

In the case of Avogadro, the process is really straightforward. Go to

File -> Save As

In the pop-up window, you can save the data as .xyz or .mol, among many other file types.

Answered by Magic_Number on November 7, 2021
