Chemistry - Is there a way to use free software to convert SMILES strings to structures?
Solution 1:
According to the website, Open Babel should do the trick: Documentation - SMILES, Sourceforge.
For example, the following code will give you a neat SVG file of the molecule benzene:
obabel -:"c1ccccc1" -O benzene.svg
If you experience problems using it, you are welcome to ask more specifically.
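If you need more than one molecule, the obabel call shown above can be driven from a script. The following is only a minimal Python sketch: it assumes obabel is installed and on your PATH, and the function names are made up for illustration.

```python
import subprocess


def obabel_svg_command(smiles, out_path):
    """Build the argument list equivalent to: obabel -:"SMILES" -O out_path"""
    # No shell is involved, so the quotes around the SMILES are not needed.
    return ["obabel", f"-:{smiles}", "-O", out_path]


def smiles_to_svg(smiles, out_path):
    """Run obabel for one SMILES string (requires obabel on PATH)."""
    # check=True only raises if obabel exits with a non-zero status.
    subprocess.run(obabel_svg_command(smiles, out_path), check=True)
```

Calling smiles_to_svg("c1ccccc1", "benzene.svg") should then produce the same file as the one-liner above.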
Alternatively, you can use a web service from the National Cancer Institute (NCI). It is queried with a URL of the following form:
http://cactus.nci.nih.gov/chemical/structure/"structure identifier"/"representation"
For example, for benzene: "structure identifier"=c1ccccc1, "representation"=image.
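The URL pattern above is easy to assemble in a script. Here is a small Python sketch; the function name is hypothetical, and percent-encoding the SMILES string is my own assumption (characters such as "#" or "/" would otherwise be read as URL syntax), so check it against the service's documentation.

```python
from urllib.parse import quote

BASE_URL = "http://cactus.nci.nih.gov/chemical/structure"


def cactus_url(identifier, representation):
    """Return BASE_URL/<identifier>/<representation>, percent-encoding both parts."""
    # safe="" also encodes "/" and "#", which can occur in SMILES strings.
    return f"{BASE_URL}/{quote(identifier, safe='')}/{quote(representation, safe='')}"


print(cactus_url("c1ccccc1", "image"))
# prints http://cactus.nci.nih.gov/chemical/structure/c1ccccc1/image
```

The resulting URL can be opened in a browser or fetched with any HTTP client.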
Another open-source option is Avogadro, a molecular editor into which you can load the structure directly. (It uses Open Babel internally, though.)
Depending on the actual problem, however, there might already be more advanced routines.
Solution 2:
In addition to the other good answers, I'd recommend rdkit, an open-source, freely available toolkit for chemoinformatics. Most people use rdkit via its Python interface.
Here are some rdkit basics:
- The code base is available on GitHub, here.
- The license is quite permissive; you don't need to worry about what type of work (commercial, personal, or academic) you are doing.
- All the core functions are written in C++, which makes rdkit fast and efficient, while the Python API exposes them in a way that is flexible and easy to learn. If you happen to be fluent in C++, a C++ API is available.
- It does a whole lot more than convert SMILES to structures; see some examples here.
Here is one way to convert a SMILES to a structure in rdkit.
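from rdkit import Chem
from rdkit.Chem import Draw
import matplotlib.pyplot as plt
# Jupyter/IPython magic to show figures inline; omit it in a plain Python script
%matplotlib inline
# SMILES for penicillin G, including stereochemistry
penicillin_g_smiles = 'CC1([C@@H](N2[C@H](S1)[C@@H](C2=O)NC(=O)Cc3ccccc3)C(=O)O)C'
# Parse the SMILES into a molecule object (None is returned if parsing fails)
penicillin_g = Chem.MolFromSmiles(penicillin_g_smiles)
# Draw the molecule as a matplotlib figure
Draw.MolToMPL(penicillin_g, size=(200, 200))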
Here's a picture of the code and the resulting image.
Solution 3:
For those who want to convert a few SMILES strings to images, you can also use the CDK 1.5-based Depict utility from John May (www.simolecule.com/cdkdepict/, GitHub). It provides various options and outputs Scalable Vector Graphics (which can be easily converted into other formats).
For example, caffeine with title: https://www.simolecule.com/cdkdepict/depict/bow/svg?smi=CN1C%3DNC2%3DC1C(%3DO)N(C(%3DO)N2C)C%20caffeine&abbr=on&hdisp=bridgehead&showtitle=true&zoom=1.6&annotate=none
Thus, with the basic web API you can write a script that converts a whole set of SMILES strings, e.g. using the RCurl package. This StackOverflow post explains how to convert the SVG to other formats.
However, since you probably prefer a pure R-based solution, please do have a look at the rcdk package.
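To illustrate the scripted web API route mentioned above, here is a rough sketch in Python rather than R. It only builds the request URLs, reusing the query parameters from the caffeine example; the function name and the second molecule are illustrative assumptions, and the parameters have not been checked against the Depict documentation.

```python
from urllib.parse import quote, urlencode

DEPICT_URL = "https://www.simolecule.com/cdkdepict/depict/bow/svg"


def depict_url(smiles, title=""):
    """Build a CDK Depict URL; the title follows the SMILES after a space."""
    params = {
        "smi": f"{smiles} {title}".strip(),
        "abbr": "on",
        "hdisp": "bridgehead",
        "showtitle": "true",
        "zoom": "1.6",
        "annotate": "none",
    }
    # quote_via=quote encodes spaces as %20; parentheses are left unescaped
    # to match the example URL.
    return DEPICT_URL + "?" + urlencode(params, quote_via=quote, safe="()")


molecules = {"caffeine": "CN1C=NC2=C1C(=O)N(C(=O)N2C)C", "benzene": "c1ccccc1"}
urls = {name: depict_url(smi, name) for name, smi in molecules.items()}
```

Each URL in urls can then be downloaded with an HTTP client of your choice and saved as an SVG file.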
Solution 4:
I'm surprised that you've had difficulty finding a toolkit. Is it that the licence must be MIT or similarly permissive? I guess that you will be using this in software you are writing, rather than for a one-off data conversion?
For example, Open Babel (C++) or the Chemistry Development Kit (CDK, Java), which can also interface with R, would seem to suit your needs.