Overview
What is it for?
Gemmi is a library, accompanied by a set of programs, developed primarily for use in structural biology, and in particular in macromolecular crystallography (MX). It is intended for working with:
macromolecular models (content of PDB, PDBx/mmCIF and mmJSON files),
refinement restraints (CIF files) and small molecule models,
reflection data (MTZ and mmCIF formats),
crystallographic symmetry,
data on a 3D grid with crystallographic symmetry (electron density maps, masks, MRC/CCP4 format).
Parts of this library can be useful in structural bioinformatics (for symmetry-aware analysis of protein models), in chemical crystallography and in other molecular-structure sciences that use CIF files (we have the fastest open-source CIF parser).
Gemmi is open-source (MPL) and portable – it runs on Linux, Windows, macOS and even inside a web browser when compiled to WebAssembly. It is written in C++14, with Python (3.8+) bindings and a partial C and Fortran 2003 interface.
The project also maintains web-based tools and fancy PDB statistics.
Gemmi is a joint project of Global Phasing Ltd and CCP4. It is named after Gemmi Pass. The name can also be expanded as GEneral MacroMolecular I/o.
Source code repository: https://github.com/project-gemmi/gemmi
Note
You can ask questions in Discussions or Issues on GitHub. Alternatively, send me an email.
Contents
Prerequisites
Working with Molecules
Working with Data
Other Docs
Credits
This project uses code from a number of third-party open-source projects.
Projects used in the C++ library, included under include/gemmi/third_party/ (if used in headers) or third_party/:
PEGTL – library for creating PEG parsers. License: MIT.
sajson – high-performance JSON parser. License: MIT.
PocketFFT – FFT library. License: 3-clause BSD.
stb_sprintf – locale-independent snprintf() implementation. License: Public Domain.
fast_float – locale-independent number parsing. License: Apache 2.0.
tinydir – directory (filesystem) reader. License: 2-clause BSD.
Code derived from the following projects is used in the library:
ksw2 – sequence alignment in seqalign.hpp is based on the ksw_gg function from ksw2. License: MIT.
QCProt – superposition method in qcp.hpp is taken from QCProt and adapted to our project. License: BSD.
Larch – calculation of f′ and f″ in fprime.cpp is based on CromerLiberman code from Larch. License: 2-clause BSD.
Projects included under third_party/ that are not used in the library itself, but are used in command-line utilities, Python bindings or tests:
zpp serializer – serialization framework. License: MIT.
The Lean Mean C++ Option Parser – command-line option parser. License: MIT.
doctest – testing framework. License: MIT.
linalg.h – linear algebra library. License: Public Domain.
zlib – a subset of the zlib library for decompressing gz files, used as a fallback when the zlib library is not found in the system. License: zlib.
Not distributed with Gemmi:
nanobind – used for creating Python bindings. License: 3-clause BSD.
zlib-ng – optional, can be used instead of zlib for faster reading of gzipped files.
cctbx – used in tests (if cctbx is not present, these tests are skipped) and in scripts that generated space group data and 2-fold twinning operations. License: 3-clause BSD.
Mentions:
NLOpt was used to try out various optimization methods for class Scaling. License: MIT.
Email me if I forgot about something.