Thursday, 7 August 2014

In memory of Jean-Claude Bradley Part II

Here is the talk I gave at the symposium in memory of Jean-Claude Bradley.

I start off with a description of an open notebook science experiment I did for JCB, a protein-ligand docking related to malaria. I was able to pull all the details from his wiki, the datasets, the method, the reasons for certain choices, the results and even the edit history.

Next I discuss his use of webservices to develop chemistry applications and show several examples from my past. Finally, I suggest that today JCB would use Minecraft instead of Second Life if he were looking for an immersive environment in which to build chemistry activities for students.

Thursday, 5 June 2014

In memory of Jean-Claude Bradley

This Autumn I will be attending an ACS Meeting in San Francisco for the second time. The first time was in 2010 when I co-organised a symposium with Jean-Claude Bradley and Andy Lang.

I was pretty nervous. I stumbled through some opening remarks before finding my feet and paying tribute to the memory of Warren DeLano, another pioneer of openness in chemistry. When Jean-Claude arrived the next day to chair the second session, I remember thinking wow, this guy is so relaxed and confident he can just turn up in bermuda shorts and a casual shirt and not worry about whether his tie is sending out the right signals - I wish I was like that.

Subsequently, I found out that it hadn't always been like that. He had been like, well, everyone else: wearing suits every day, eagerly trying to make a good impression, following the funding, playing the game. A day came when he tired of it, looked at what he was doing, and decided it was not going to make the world a better place. So he sat down and thought about how to identify what areas of chemistry were actually "useful":
The best answer I could come up with is to trust what human researchers have to say in their papers. I developed a set of search phrases such as "what is needed now" or "what is missing is" and ran them through Google Scholar and Scirus. One of the results was "there is a pressing need for identifying and developing new drug-based antimalarial therapies".
Reactive Reports #51, 2006 Interview with David Bradley.

...and that was how he started the Useful Chemistry project. You can see the genesis of the project in his initial blog posts.

Others have commented on his legacy in Open Notebook Science. For me, his story of starting Useful Chemistry was what impressed me most: how many of us have the courage to look at our work and ask ourselves, is it useful?

To pay tribute to his remarkable vision, I will be speaking at the Jean-Claude Bradley Memorial Symposium on July 14th, organised by Andy Lang, Tony Williams and Peter Murray-Rust in Cambridge, UK. I encourage you to come along to celebrate the work of an inspiring person and to hear how others are building on his legacy.

Tuesday, 6 May 2014

cclib 1.2 now available

On behalf of the cclib development team (namely Karol Langner and Adam Tenderholt who did all the work), I am pleased to announce the release of cclib 1.2, which is now available for download. This version marks the first stable release to target Python 3, and includes several new features and bug fixes (see below).

cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. It currently parses output files from ADF, Firefly, GAMESS (US), GAMESS-UK, Gaussian, Jaguar, Molpro and ORCA.

Among other data, cclib extracts:
  • coordinates and energies
  • information about geometry optimization
  • atomic orbital information
  • molecular orbital information
  • information on vibrational modes
  • the results of a TD-DFT calculation

cclib also provides some calculation methods for interpreting the electronic properties of molecules using analyses such as:
  • Mulliken and Lowdin population analyses
  • Overlap population analysis
  • Calculation of Mayer's bond orders

For information on how to use cclib, see the tutorial. If you need help, find a bug, want new features or have any questions, please send an email to our mailing list.
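As a rough illustration of typical use, here is a minimal sketch that parses an output file and summarises a few of the attributes listed above. It assumes the ccopen function can be imported from cclib.parser (the import location may differ in other cclib versions), and the helper names parse_logfile and summarise are hypothetical. The tutorial remains the authoritative description.

```python
def parse_logfile(path):
    """Parse a computational chemistry output file with cclib."""
    # Imported here so that the rest of the sketch runs without cclib.
    from cclib.parser import ccopen  # assumed import path (cclib 1.x)
    logfile = ccopen(path)  # works out which program wrote the file
    return logfile.parse()


def summarise(data):
    """Return a short text summary of a parsed data object.

    Only attributes found in the file are present on the object,
    so each one is looked up with getattr and skipped if missing.
    """
    lines = []
    natom = getattr(data, "natom", None)
    if natom is not None:
        lines.append("Number of atoms: %d" % natom)
    scfenergies = getattr(data, "scfenergies", None)
    if scfenergies is not None and len(scfenergies) > 0:
        # cclib 1.x reports SCF energies in eV
        lines.append("Final SCF energy: %.4f eV" % scfenergies[-1])
    vibfreqs = getattr(data, "vibfreqs", None)
    if vibfreqs is not None:
        lines.append("Vibrational modes: %d" % len(vibfreqs))
    return "\n".join(lines)

# Usage, with a hypothetical file name:
#   print(summarise(parse_logfile("mycalc.out")))
```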

If your published work uses cclib, please support its development by citing the following article:
N. M. O'Boyle, A. L. Tenderholt, K. M. Langner, cclib: a library for package-independent computational chemistry algorithms, J. Comp. Chem. 2008, 29, 839-845

Major changes since cclib 1.1
  • Move project to github
  • Transition to Python 3 (Python 2.7 will still work)
  • Add a multifile mode to ccget script
  • Extract vibrational displacements for ORCA
  • Extract natural atom charges for Gaussian (Fedor Zhuravlev)
  • Updated test file versions to ADF2013.01, GAMESS-US 2012, Gaussian09, Molpro 2012 and ORCA 3.0.1
  • Many bugfixes, thanks to Scott McKechnie, Tamilmani S, Melchor Sanchez, Clyde Fare, Julien Idé and Andrew Warden

Saturday, 3 May 2014

Turning vim into an IDE: Part 2

Something that's really useful is syntax checking. It can save a lot of time (not to mention frustration) if you can find errors *before* you run your scripts. The Syntastic plugin for vim can be used for this. As far as I can tell, it doesn't do any syntax checking itself, but rather is the glue that links other syntax checkers to vim.

First things first, to simplify matters (*), let's rename _gvimrc to .vimrc (easier to use the commandline for this).

Next let's install Syntastic:

1. Create the folder "vimfiles" in your home directory if not already present, e.g. C:\Users\Noel\vimfiles
2. If you don't already have it, you need to install the Pathogen plugin (this plugin makes it easy to install other plugins):
a. Download the zip
b. Extract the zip into vimfiles (thus creating vimfiles/autoload/pathogen.vim)
c. Create a directory called bundle in vimfiles (this is where to put plugins managed by Pathogen)
d. Add "call pathogen#infect()" near the start of your .vimrc
3. Install Syntastic as described on its website:
a. cd C:\Users\Noel\vimfiles\bundle
b. git clone
c. Test by opening a file in gvim, and checking for error messages after typing ":Helptags"

Next, let's say we want to add some syntax checkers for Python. Your choices are pyflakes, pylint, flake8 and Python itself. It's not either/or - you can include as many as you want. Note that flake8 is a combination of pyflakes, pep8 (PEP8 style checker), and McCabe (cyclomatic complexity checker). I'm not so keen on style checkers and so I'm going to go just with pyflakes and Python.

1. If you don't have pip (the Python package manager), then install it
2. Install pyflakes with pip ("C:\Python27\Scripts\pip.exe install --upgrade pyflakes")
3. Add C:\Python27\Scripts to the front of your Windows PATH if not already present
4. Configure Syntastic by adding the following to .vimrc:
let g:syntastic_python_checkers = ['pyflakes', 'python']
let g:syntastic_auto_loc_list = 1
5. Open a Python file and check that pyflakes and python are listed when you type ":SyntasticInfo"

The screenshot at the top of the post shows the plugin in action. It's triggered every time you save a file.
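To give a feel for the kind of error a compile-only check can report before a script is run, here is a minimal sketch. It is not Syntastic's actual implementation; the function name check_syntax is hypothetical and it relies only on Python's built-in compile().

```python
def check_syntax(source, filename="<script>"):
    """Compile the source without running it.

    Returns None if it compiles, otherwise a (line number, message) tuple.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        return (err.lineno, err.msg)
    return None


good = "for i in range(3):\n    print(i)\n"
bad = "for i in range(3)\n    print(i)\n"  # missing colon on line 1

print(check_syntax(good))  # None
print(check_syntax(bad))   # line 1; the message text depends on the Python version
```

A check like this only catches syntax errors. pyflakes goes further and reports problems such as unused imports and undefined names, which is why it is worth enabling both.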

* This is necessary because the call to pathogen#infect() is ignored if in _gvimrc (I don't know why). If instead we used _vimrc, then the _vimrc installed in C:\Program Files (x86)\_vimrc is ignored (and we miss CTRL+C/V behaviour for example). This leaves us with .vimrc.

Sunday, 20 April 2014

QM Speed Test: Firefly and wrap up

You can find my latest results, this time for Firefly, over on my Github page. The summary is that it's slightly faster than NWChem for HF calculations, but much faster for B3LYP calculations. I also tried using FreeON but although it appeared to compile fine, it didn't run correctly on the CentOS system I have been using for testing.

Someone raised the point that these are not like-for-like comparisons in terms of all of the program parameters (integration grid size, convergence criteria, etc.). Pedro Silva investigated the effect of grid size on the speed and results, and tried to normalise across GAMESS, Q-Chem and Firefly. However, our intention here is to give an idea of comparative speeds using the defaults, both as some sort of baseline and because this reflects typical use.

Unfortunately, I ran out of time to continue this project back in late Jan or so. The first victim of the time shortage was the Turbomole results. The folks at COSMOlogic came forward with an evaluation copy, but I simply ran out of time before having a chance to figure it out and use it. The "tmole" input files are already there in the repo though (from Michael Banck).

The others involved have been much more active. To see all of the results gathered so far, check out the Github pages of:
Where to from here? Well, nowhere for me, at least for the immediate future. But these input files are out there now, and if you want to compare different packages, they make a good starting point. An interesting question is whether the relative speeds change once you start looking at larger systems. I don't know the answer to that...maybe you can try finding out.

Thursday, 27 February 2014

Screencasting on Windows

I always find screencasting quite difficult to set up on Windows. Here are some notes on a recent successful attempt that uses the well-known ffmpeg library/command-line tool to record from a Direct Show screen capture device.

Set up the screen capture device

Install Screen Capture Recorder and configure by selecting the shortcut "configure by resizing a transparent window". Start broadcasting the desktop by choosing the shortcut "stream desktop local LAN", and "Start Normal 10 fps".

Capture the screen

Install 32-bit ffmpeg (even on 64-bit Windows) and capture as follows (press "q" to stop):
ffmpeg -f dshow -i video=screen-capture-recorder -r 10 -y recording.mp4

Convert to the target format

For example, convert to .wmv and retain just the first 60 seconds (see "-ss" for skipping the start and "-q:v 10" to control the size/quality):
ffmpeg -i recording.mp4 -t 60 -b:v 1000k -vcodec wmv2 myfile.wmv
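If you record often, the two steps can be scripted. The sketch below only assembles the same ffmpeg arguments as above into Python lists and hands them to subprocess. The function names are hypothetical, and it assumes ffmpeg is on your PATH.

```python
import subprocess


def capture_command(output="recording.mp4", fps=10):
    """ffmpeg arguments to record from the Direct Show capture device."""
    return ["ffmpeg", "-f", "dshow",
            "-i", "video=screen-capture-recorder",
            "-r", str(fps),
            "-y", output]


def convert_command(source, target, seconds=60, bitrate="1000k"):
    """ffmpeg arguments to convert to .wmv, keeping the first `seconds`."""
    return ["ffmpeg", "-i", source,
            "-t", str(seconds),
            "-b:v", bitrate,
            "-vcodec", "wmv2", target]


def run(command):
    """Run a command, raising CalledProcessError if it fails."""
    subprocess.check_call(command)


# Show the conversion command without running it
print(" ".join(convert_command("recording.mp4", "myfile.wmv")))
```

Call run(capture_command()) to start recording (press "q" in the console to stop), and then run(convert_command("recording.mp4", "myfile.wmv")) to convert the result.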

Monday, 3 February 2014

When Python goes wrong: Off-by-one

Many niggling problems with Python were fixed in Python 3, but here's one that recently bit me. It relates to rounding in format strings and so is a potential pitfall for anyone handling floats.

The thing is, "%f" rounds differently than "%d".
>>> "%.1f" % 1.56 ### I expect 1.6
'1.6'
>>> "%d" % 1.6    ### By analogy, I expect 2
'1'
The correct solution is either "%.0f" (yes - it exists) or explicitly round it yourself how you like, e.g. "%d" % int(x+0.5) or using round().
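The difference is easy to see side by side. Here is a quick check of these options for a positive value (Python 3 syntax; the variable names are just for illustration):

```python
x = 1.6

truncated = "%d" % x          # "%d" simply drops the fractional part
formatted = "%.0f" % x        # "%.0f" rounds to the nearest integer
manual = "%d" % int(x + 0.5)  # add 0.5, then truncate (positive values only)
builtin = "%d" % round(x)     # round() first, then format

print(truncated, formatted, manual, builtin)  # 1 2 2 2
```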

This was causing off-by-one results in my code which I fortunately identified prior to submitting a manuscript. But I'm disappointed, Python; and don't blame C.

Note: As a further issue, if you also need to handle negative values, double-check the rounding behaviour of your chosen approach, as it may not always be what you expect.
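For instance, adding 0.5 and truncating goes wrong for negative numbers. Exact halves are another trap: in Python 3, round() and "%.0f" send ties to the nearest even number. Here is a minimal sketch of one way to round halves away from zero instead (the helper name round_half_away is hypothetical):

```python
import math


def round_half_away(x):
    """Round to the nearest integer, with ties going away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


print(int(-1.6 + 0.5))        # -1: truncation moves towards zero
print(round_half_away(-1.6))  # -2
print(round(2.5))             # 2 in Python 3 (ties go to the even number)
print("%.0f" % 2.5)           # 2
print(round_half_away(2.5))   # 3
print(round_half_away(-2.5))  # -3
```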