Installation¶
C++ library¶
Gemmi used to be a header-only library (until ver. 0.6.0).
Parts of the library (finding symmetry operations, parsing CIF grammar)
are still header-only; if you happen to use only these parts,
just ensure that Gemmi’s include directory is in your project’s include path. For example:
git clone https://github.com/project-gemmi/gemmi.git
c++ -Igemmi/include -O2 my_program.cpp
However, in most cases, you need to build a library called gemmi_cpp and link your project against it.
If you use CMake, you may:
first install gemmi and then use find_package:
find_package(gemmi 0.7.0 CONFIG REQUIRED)
or add gemmi as a git submodule and use add_subdirectory:
add_subdirectory(gemmi EXCLUDE_FROM_ALL)
or use FetchContent:
include(FetchContent)
FetchContent_Declare(
  gemmi
  GIT_REPOSITORY https://github.com/project-gemmi/gemmi.git
  GIT_TAG ...
)
FetchContent_GetProperties(gemmi)
if (NOT gemmi_POPULATED)
  FetchContent_Populate(gemmi)
  add_subdirectory(${gemmi_SOURCE_DIR} ${gemmi_BINARY_DIR} EXCLUDE_FROM_ALL)
endif()
Then link your target with the library (this also takes care of includes):
target_link_libraries(example PRIVATE gemmi::gemmi_cpp)
If a target only needs gemmi headers, do this instead:
target_link_libraries(example PRIVATE gemmi::headers)
The gemmi::headers interface, which is also included in gemmi::gemmi_cpp, adds two things: the include directory and the compile feature cxx_std_14 (a minimal requirement for compilation).
Gemmi can be compiled with either zlib or zlib-ng. The only difference is that zlib-ng is faster. Here are the relevant CMake options:
FETCH_ZLIB_NG – download, build statically, and use zlib-ng.
USE_ZLIB_NG – find zlib-ng installed on the system.
INTERNAL_ZLIB – compile third_party/zlib (a subset of zlib distributed with gemmi).
None of the above – find zlib installed on the system; if not found, use third_party/zlib.
On Windows, when a program or library is linked with a zlib(-ng) DLL,
it may require the DLL to be in the same directory.
It is simpler to build zlib-ng statically or use -D FETCH_ZLIB_NG=ON.
Note on Unicode: if a file name is passed to Gemmi (through std::string), it is assumed to be in ASCII or UTF-8.
Python module¶
From PyPI¶
To install the gemmi module, run:
pip install gemmi
We have binary wheels for several Python versions (for all supported CPython versions and one PyPy version), so the command usually downloads binaries. If a matching wheel is not available, the module is compiled from source – it takes a few minutes and requires a C++ compiler that supports C++17.
Gemmi 0.7+ supports only Python 3.8+.
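As a quick sanity check before or after running pip, the requirement above can be tested from Python itself. This is only a minimal sketch: the helper names `python_is_supported` and `installed_gemmi_version` are made up for this example and are not part of gemmi. The only gemmi attribute used is `gemmi.__version__`.

```python
import sys

# Minimum interpreter version, taken from the statement above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info):
    """Return True if the given (major, minor, ...) version is new enough."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_gemmi_version():
    """Return gemmi.__version__, or None if the module cannot be imported."""
    try:
        import gemmi
    except ImportError:
        return None
    return gemmi.__version__


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("gemmi version:", installed_gemmi_version() or "not installed")
```

If the second line reports "not installed" after a seemingly successful installation, pip probably installed the module for a different interpreter than the one running the script.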
Other binaries¶
Gemmi is also distributed through other channels:
If you use the CCP4 suite, gemmi is included there.
If you use conda, the gemmi package, which also includes a command-line program and C++ dev files, can be installed from conda-forge:
conda install -c conda-forge gemmi
These distribution channels may have an older version of gemmi.
From git¶
The latest version can be installed directly from the repository. Either use:
pip install git+https://github.com/project-gemmi/gemmi.git
or clone the project (or download a zip file) and from the top-level directory run:
pip install .
Building with pip uses scikit-build-core and CMake underneath.
You can pass options to CMake either using the --config-settings option in recent pip versions:
pip install . --config-settings="cmake.args=-DFETCH_ZLIB_NG=ON"
or by using environment variables such as CMAKE_ARGS. See scikit-build-core docs for details.
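When the build is driven from a script, the environment-variable route can be prepared in Python. The sketch below only assembles the command and the environment without running anything. The helper name `pip_build_command` is hypothetical, and the sketch assumes that CMAKE_ARGS holds space-separated options (check the scikit-build-core docs before relying on this).

```python
import os
import sys


def pip_build_command(source=".", cmake_args=("-DFETCH_ZLIB_NG=ON",), env=None):
    """Build (argv, environment) for a pip install that forwards CMake options.

    The options go into the CMAKE_ARGS environment variable, joined with
    spaces (an assumption of this sketch). Any existing CMAKE_ARGS value is
    overwritten. Nothing is executed here.
    """
    base = os.environ if env is None else env
    new_env = dict(base)  # copy, so the caller's environment is not modified
    new_env["CMAKE_ARGS"] = " ".join(cmake_args)
    argv = [sys.executable, "-m", "pip", "install", source]
    return argv, new_env
```

To run the build, pass both values to the standard library: `subprocess.run(argv, env=new_env, check=True)`.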
If gemmi is already installed, uninstall the old version first (pip uninstall) or add the --upgrade option.
Alternatively, you can manually install nanobind and cmake (using pip) and build a cloned project directly with CMake:
cmake -D USE_PYTHON=1 .
make -j4 gemmi_py
Fortran and C bindings¶
The Fortran bindings are in an early stage and are not documented yet.
They use the ISO_C_BINDING module introduced in Fortran 2003
and shroud.
You can check the fortran/ directory to see what to expect.
This directory contains a Makefile – run make to build the bindings.
(They are currently not integrated with the CMake build.)
The C bindings are used only for making Fortran bindings, but they should be usable on their own.
Program¶
The library comes with a command-line program also named gemmi.
Binaries¶
Binaries are distributed with the CCP4 suite and with Global Phasing software.
They are also in PyPI (pip install gemmi-program), conda-forge packages, and a few Linux (and FreeBSD) distros.
The very latest builds (as well as slightly older ones) can be downloaded from CI jobs:
For Windows – click the first (green) job in AppVeyor CI and find gemmi.exe in the Artifacts tab (if there is also a dll file there, it’s a dynamically linked build and both files are needed).
For Linux and Mac – sign in to GitHub (no special permissions are needed, but GitHub requires sign-in for artifacts), go to gemmi’s CI workflow, click the latest job with ✅, scroll to the bottom of the page, and download one of the zip files from the Artifacts section.
From source¶
To build it from source, first make sure you have git, cmake and a C++ compiler installed (on Ubuntu: sudo apt install git cmake make g++), then:
git clone https://github.com/project-gemmi/gemmi.git
cd gemmi
cmake .
make
Alternatively, you can use pip install git+https://..., which installs both the Python module and the program. If you are not using the Python module, you can use pip to build only the program:
pip install git+https://github.com/project-gemmi/gemmi.git --config-settings=cmake.args=-DONLY_PROGRAM=ON
Testing¶
The main automated tests are in Python:
python3 -m unittest discover -v tests/
We also have Python doctest tests in the documentation,
and a few other test routines.
All the commands used for testing are listed in the run-tests.sh script in the repository.
Credits¶
This project uses code from a number of third-party open-source projects.
Projects used in the C++ library, included under include/gemmi/third_party/ (if used in headers) or third_party/:
PEGTL – library for creating PEG parsers. License: MIT.
sajson – high-performance JSON parser. License: MIT.
PocketFFT – FFT library. License: 3-clause BSD.
stb_sprintf – locale-independent snprintf() implementation. License: Public Domain.
fast_float – locale-independent number parsing. License: Apache 2.0.
tinydir – directory (filesystem) reader. License: 2-clause BSD.
Code derived from the following projects is used in the library:
ksw2 – sequence alignment in seqalign.hpp is based on the ksw_gg function from ksw2. License: MIT.
QCProt – superposition method in qcp.hpp is taken from QCProt and adapted to our project. License: BSD.
Larch – calculation of f’ and f” in fprime.cpp is based on CromerLiberman code from Larch. License: 2-clause BSD.
Projects included under third_party/ that are not used in the library itself, but are used in command-line utilities, Python bindings or tests:
zpp serializer – serialization framework. License: MIT.
The Lean Mean C++ Option Parser – command-line option parser. License: MIT.
doctest – testing framework. License: MIT.
linalg.h – linear algebra library. License: Public Domain.
zlib – a subset of the zlib library for decompressing gz files, used as a fallback when the zlib library is not found in the system. License: zlib.
Not distributed with Gemmi:
nanobind – used for creating Python bindings. License: 3-clause BSD.
zlib-ng – optional, can be used instead of zlib for faster reading of gzipped files.
cctbx – used in tests (if cctbx is not present, these tests are skipped) and in scripts that generated space group data and 2-fold twinning operations. License: 3-clause BSD.
Mentions:
NLOpt was used to try out various optimization methods for class Scaling. License: MIT.
Email me if I forgot about something.
List of C++ headers¶
Here is a list of C++ headers in gemmi/include/.
This list also provides an overview of the library.
- gemmi/addends.hpp
Addends to scattering form factors used in DensityCalculator and StructureFactorCalculator.
- gemmi/align.hpp
Sequence alignment, label_seq_id assignment, structure superposition.
- gemmi/assembly.hpp
Generating biological assemblies by applying operations from struct Assembly to a Model. Includes chain (re)naming utilities.
- gemmi/asudata.hpp
AsuData for storing reflection data.
- gemmi/asumask.hpp
AsuBrick and MaskedGrid that is used primarily as direct-space asu mask.
- gemmi/atof.hpp
Functions that convert strings to floating-point numbers ignoring locale. Simple wrappers around fastfloat::from_chars().
- gemmi/atox.hpp
Locale-independent functions that convert strings to integers, equivalents of standard isspace and isdigit, and a few helper functions.
- gemmi/bessel.hpp
Functions derived from modified Bessel functions I1(x) and I0(x).
- gemmi/binner.hpp
Binning - resolution shells for reflections.
- gemmi/blob.hpp
Finding maxima or “blobs” in a Grid (map). Similar to CCP4 PEAKMAX and COOT’s “Unmodelled blobs”.
- gemmi/bond_idx.hpp
BondIndex: for checking which atoms are bonded, calculating graph distance.
- gemmi/c4322.hpp
Electron scattering factor coefficients from the International Tables.
- gemmi/calculate.hpp
Calculate various properties of the model.
- gemmi/ccp4.hpp
CCP4 format for maps and masks. See also read_map.hpp.
- gemmi/cellred.hpp
Unit cell reductions: Buerger, Niggli, Selling-Delaunay.
- gemmi/chemcomp.hpp
ChemComp - chemical component that represents a monomer from Refmac monomer library, or from PDB CCD.
- gemmi/cif.hpp
CIF parser (based on PEGTL) with pluggable actions, and a set of actions that prepare Document. To just read the CIF format, include read_cif.hpp instead.
- gemmi/cif2mtz.hpp
A class for converting SF-mmCIF to MTZ (merged or unmerged).
- gemmi/cifdoc.hpp
struct Document that represents the CIF file (but can also be read from a different representation, such as CIF-JSON or mmJSON).
- gemmi/contact.hpp
Contact search, based on NeighborSearch from neighbor.hpp.
- gemmi/crd.hpp
Generate Refmac intermediate (prepared) files crd and rst
- gemmi/ddl.hpp
Using DDL1/DDL2 dictionaries to validate CIF/mmCIF files.
- gemmi/dencalc.hpp
Tools to prepare a grid with values of electron density of a model.
- gemmi/dirwalk.hpp
Classes for iterating over files in a directory tree, top-down, in alphabetical order. Wraps the tinydir library (as we cannot yet depend on C++17 <filesystem>).
- gemmi/ecalc.hpp
Normalization of amplitudes F->E (“Karle” approach, similar to CCP4 ECALC).
- gemmi/eig3.hpp
Eigen decomposition code for symmetric 3x3 matrices.
- gemmi/elem.hpp
Elements from the periodic table.
- gemmi/enumstr.hpp
Converts between enums (EntityType, PolymerType, Connection::Type, SoftwareItem::Classification) and mmCIF strings.
- gemmi/fail.hpp
fail(), unreachable() and __declspec/__attribute__ macros
- gemmi/fileutil.hpp
File-related utilities.
- gemmi/floodfill.hpp
The flood fill (scanline fill) algorithm for Grid. Assumes periodic boundary conditions in the grid and 6-way connectivity.
- gemmi/formfact.hpp
Calculation of atomic form factors approximated by a sum of Gaussians. Tables with numerical coefficients are in it92.hpp and c4322.hpp.
- gemmi/fourier.hpp
Fourier transform applied to map coefficients.
- gemmi/fprime.hpp
Cromer-Liberman calculation of anomalous scattering factors, with corrections from Kissel & Pratt.
- gemmi/fstream.hpp
Ofstream and Ifstream: wrappers around std::ofstream and std::ifstream.
- gemmi/grid.hpp
3d grids used by CCP4 maps, cell-method search and hkl data.
- gemmi/gz.hpp
Functions for transparent reading of gzipped files. Uses zlib.
- gemmi/input.hpp
Input abstraction. Used to decouple file reading and decompression.
- gemmi/intensit.hpp
Class Intensities that reads multi-record data from MTZ, mmCIF or XDS_ASCII and merges it into mean or anomalous intensities. It can also read merged data.
- gemmi/interop.hpp
Interoperability between Model (MX) and SmallStructure (SX).
- gemmi/it92.hpp
X-ray scattering factor coefficients from International Tables for Crystallography Volume C, edition from 1992 or later.
- gemmi/iterator.hpp
Bidirectional iterators (over elements of any container) that can filter, uniquify, group, or iterate with a stride.
- gemmi/json.hpp
Reading CIF-JSON (COMCIFS) and mmJSON (PDBj) formats into cif::Document.
- gemmi/levmar.hpp
Least-squares fitting - Levenberg-Marquardt method.
- gemmi/linkhunt.hpp
Searching for links based on the _chem_link table from monomer dictionary.
- gemmi/logger.hpp
Logger - a tiny utility for passing messages through a callback.
- gemmi/math.hpp
Math utilities. 3D linear algebra.
- gemmi/metadata.hpp
Metadata from coordinate files.
- gemmi/mmcif.hpp
Read mmCIF (PDBx/mmCIF) file into a Structure from model.hpp.
- gemmi/mmcif_impl.hpp
Functions used in both mmcif.hpp and refln.hpp (for coordinate and reflection mmCIF files).
- gemmi/mmdb.hpp
Converts between gemmi::Structure and mmdb::Manager.
- gemmi/mmread.hpp
Read any supported coordinate file. Usually, mmread_gz.hpp is preferred.
- gemmi/mmread_gz.hpp
Functions for reading possibly gzipped coordinate files.
- gemmi/model.hpp
Data structures to store macromolecular structure models.
- gemmi/modify.hpp
Modify various properties of the model.
- gemmi/monlib.hpp
Monomer library - (Refmac) restraints dictionary, which consists of monomers (chemical components), links, and modifications.
- gemmi/mtz.hpp
MTZ reflection file format.
- gemmi/mtz2cif.hpp
A class for converting MTZ (merged or unmerged) to SF-mmCIF
- gemmi/neighbor.hpp
Cell-linked lists method for atom searching (a.k.a. grid search, binning, bucketing, cell technique for neighbor search, etc).
- gemmi/neutron92.hpp
Neutron coherent scattering lengths of the elements, from Neutron News, Vol. 3, No. 3, 1992.
- gemmi/numb.hpp
Utilities for parsing CIF numbers (the CIF spec calls them ‘numb’).
- gemmi/pdb.hpp
Read the PDB file format and store it in Structure.
- gemmi/pdb_id.hpp
Handling PDB ID and $PDB_DIR: is_pdb_code(), expand_pdb_code_to_path(), …
- gemmi/pirfasta.hpp
Read sequences from PIR or (multi-)FASTA formats.
- gemmi/polyheur.hpp
Heuristic methods for working with chains and polymers. Also includes a few well-defined functions, such as removal of waters.
- gemmi/qcp.hpp
Structural superposition, the QCP method.
- gemmi/read_cif.hpp
Functions for reading possibly gzipped CIF files.
- gemmi/read_map.hpp
Functions for reading possibly gzipped CCP4 map files.
- gemmi/recgrid.hpp
ReciprocalGrid – grid for reciprocal space data.
- gemmi/reciproc.hpp
Reciprocal space helper functions.
- gemmi/refln.hpp
Reads reflection data from the mmCIF format.
- gemmi/resinfo.hpp
List of common residues with basic data.
- gemmi/riding_h.hpp
Place hydrogens according to bond lengths and angles from monomer library.
- gemmi/scaling.hpp
Anisotropic scaling of data (includes scaling of bulk solvent parameters).
- gemmi/select.hpp
Selections.
- gemmi/seqalign.hpp
Simple pairwise sequence alignment.
- gemmi/seqid.hpp
SeqId – residue number and insertion code together.
- gemmi/seqtools.hpp
Functions for working with sequences (other than alignment).
- gemmi/serialize.hpp
Binary serialization for Structure (as well as Model, UnitCell, etc).
- gemmi/sfcalc.hpp
Direct calculation of structure factors.
- gemmi/small.hpp
Representation of a small molecule or inorganic crystal. Flat list of atom sites. Minimal functionality.
- gemmi/smcif.hpp
Read small molecule CIF file into SmallStructure (from small.hpp).
- gemmi/solmask.hpp
Flat bulk solvent mask. With helper tools that modify data on grid.
- gemmi/span.hpp
Span - span of array or std::vector. MutableVectorSpan - span of std::vector with insert() and erase()
- gemmi/sprintf.hpp
interface to stb_sprintf: snprintf_z, to_str(float|double)
- gemmi/stats.hpp
Statistics utilities: classes Covariance, Correlation, DataStats
- gemmi/symmetry.hpp
Crystallographic Symmetry. Space Groups. Coordinate Triplets.
- gemmi/to_chemcomp.hpp
Create cif::Block with monomer library _chem_comp* categories from struct ChemComp.
- gemmi/to_cif.hpp
Writing cif::Document or its parts to std::ostream.
- gemmi/to_json.hpp
Writing cif::Document or its parts as JSON (mmJSON, CIF-JSON, etc).
- gemmi/to_mmcif.hpp
Create cif::Document (for PDBx/mmCIF file) from Structure.
- gemmi/to_pdb.hpp
Writing PDB file format (Structure -> pdb file).
- gemmi/topo.hpp
Topo(logy) - restraints (from a monomer library) applied to a model.
- gemmi/twin.hpp
Twinning laws.
- gemmi/unitcell.hpp
Unit cell.
- gemmi/utf.hpp
Conversion between UTF-8 and wchar. Used only for file names on Windows.
- gemmi/util.hpp
Utilities. Mostly for working with strings and vectors.
- gemmi/version.hpp
Version number.
- gemmi/xds_ascii.hpp
Read unmerged XDS files: XDS_ASCII.HKL and INTEGRATE.HKL.