Fork me on GitHub

Running Quantum Espresso with Fireworks

The most difficult part of adapting to using Fireworks is probably not installing it, but actually adapting to database centric workflow style. What does that mean exactly? Well, the evolution of performing quantum chemistry calculations usually begins with performing a large number of calculations in a single file and outputting the results to differently named files within that directory. With a significantly large number of calculations, this quickly becomes a chaotic mess.

The next logical choice for retaining some order is usually to distribute these calculations between directories labeled in a way that the user can find them later. Individuals who are particularly interested in reproducibility are likely to divide calculations into individual runs as well so that each file represents its own run instance. This help to avoid overwriting information which can make following the progression of a calculation difficult or impossible. This tends to be much easier to manage, but there are still some hassles, such as needing to know the order of file structures to get specific information.

Using a database to store information about where calculations are located has all the advantages of both systems. Calculations are kept well organized in their own directory, while the database stores keywords which allow that information to be accessed easily. These keywords can be as simple as storing the names of the directories you would have used previously. The trick is getting used to this way of collecting information. I'll try and go over some basic methodologies which could be generally useful for getting accustomed to this style.

1 Running Quantum Espresso

Before we try running Fireworks at all, we should make sure we can run Quantum Espresso first. In a previous post, I went over installing Fireworks on Sherlock for SUNCAT users. This post also goes over getting the libraries needed for Quantum Espresso working in any environment. If you've followed this approach, you should be able to run the following example script with little trouble.

First, we construct a script file to run. Copy the following into a file named script.py.

from ase import Atoms
from ase.io import write
from espresso import espresso

# Define the ase-espresso keys we will use.
keys = {
    'mode': 'relax',
    'opt_algorithm': 'bfgs',
    'xc': 'RPBE',
    'outdir': '.',
    'output': {'removesave': True},
    'pw': 200,
    'dw': 2000,
    'dipole': {'status': True},
    'kpts': (1, 1, 1),
    'calcstress': True,
    'convergence': {
	'energy': 1e-5,
	'mixing': 0.35}}

# Create the atoms object
atoms = Atoms(
    'H2',
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.8]])
atoms.center(vacuum=4)
atoms.info = keys
write('input.traj', atoms)

# Calculate the relaxation
calc = espresso(**keys)
atoms.set_calculator(calc)
atoms.get_potential_energy()

# Save the compressed calculator file so that we can use it for later.
calc.save_flev_output()

Now we can run the script and see if everything works properly. By submitting to the development queue, we should be able to run this test quickly.

sbatch -p dev script.py

2 Creating module Fireworks functions

Fireworks is designed to run scripts in modular fashion, so that individual tasks which are common to many unit cells can be called upon easily and assembled different types of calculations. For my purposes, I find it very useful to collect the trajectory of a relaxation after it is completed. For this, ase-espresso has a built in function called pwlog2trajectory.

In the example, I'll be calling all of the functions I use from the following python scripts. It's a long one because it draws from many modules I incorporate in my workflow. I hope to be more diligent about posting in the future to help break each of these concepts up into individual posts.

2.1 Saving a Quantum Espresso trajectory

Line 129 of this code doesn't seem to operate correctly for my purposes when I was originally attempting to store the trajectory calculation information, so I modified it into my own version which works slightly differently.

import os
from ase.calculators.singlepoint import SinglePointCalculator as SPC
import numpy as np
from ase import Atoms
from ase.io import read, write
from ase.units import Ry, Bohr
# 1.889726 is the atomic unit of length per Angstrom (aul/A)
aul = 1.889726


def attach_results(f, atoms, write_file=True):
    """ Return the TS corrected energy for a scf instance
    in a log file and attach them to the given atoms obejct.

    Will also attach the forces and stress if applicable.
    """
    energy, forces, stress = None, None, None

    line = f.readline()
    while '!    total energy' not in line:
	line = f.readline()

    energy = float(line.split()[-2]) * Ry

    # Correct for non-zero temperature smearing
    for i in range(20):

	line = f.readline()
	if '     smearing contrib.' in line:
	    energy -= 0.5 * float(line.split()[-2]) * Ry

	# Collect the forces on the atoms
	if 'Forces acting on atoms (Ry/au):' in line:
	    for _ in range(4):
		line = f.readline()
		if 'atom' in line:
		    break

	    forces = []
	    for _ in range(len(atoms)):
		forces += [line.split()[-3:]]
		line = f.readline()

	    forces = np.array(forces, dtype=float) / Ry * aul

	    # If forces were located, attempt to find stress
	    for i in range(10):
		line = f.readline()

		if 'total   stress' in line:

		    stress = []
		    for _ in range(3):
			line = f.readline()
			stress += [line.split()[-3:]]

		    stress = np.array(stress, dtype=float) / Ry * Bohr ** 3
		    break

    # attach the calculator
    calc = SPC(atoms=atoms,
	       energy=energy,
	       forces=forces,
	       stress=stress)
    atoms.set_calculator(calc)

    return atoms


def log_to_atoms(log_file='log', ent=-1, out_file=None):
    """ Parse a QE log file for atoms trajectory and return a list
    of atoms objects representative of the relaxation path.

    NOTE: trajectory information is only returned for calculations
    run with BFGS internal to QE.
    """

    images = []
    with open(log_file) as f:
	line = f.readline()

	# Flag to read trajectory 'ent' only
	with os.popen(
		'grep -n Giannozzi ' +
		log_file +
		' 2>/dev/null', 'r') as p:
	    n = int(p.readlines()[ent].split()[0].strip(':'))

	for i in range(n):
	    line = f.readline()

	# Read lines one at a time
	while line:
	    line = f.readline()

	    # Signifies a new trajectory
	    # Clear any existing values from previous runs
	    if '(npk)' in line:

		# Look for an input trajectory in the same file and use it
		# Convenient for conserving constraints, tags, and atoms info
		in_file = os.path.join(
		    '/'.join(log_file.split('/')[:-1]),
		    'input.traj')

		if os.path.exists(in_file):

		    # If it does exist, read it in as the initial configuration
		    atoms = read(in_file)
		    atoms.wrap()
		    natoms = len(atoms)
		    pos = atoms.get_positions()

		    # Skip past the geometry information
		    while 'site n.' not in line:
			line = f.readline()

		# Otherwise, collect from the data
		else:
		    atoms = None

	    # Example properties
	    ######################
	    # bravais-lattice index     =            0
	    # lattice parameter (alat)  =       1.8897  a.u.
	    # unit-cell volume          =    3209.1777 (a.u.)^3
	    # number of atoms/cell      =            5
	    # number of atomic types    =            2
	    # number of electrons       =        45.00
	    # number of Kohn-Sham states=           55
	    # kinetic-energy cutoff     =      36.7493  Ry
	    # charge density cutoff     =     367.4932  Ry
	    # convergence threshold     =      7.3E-08
	    # mixing beta               =       0.1000
	    # number of iterations used =            8  plain     mixing
	    # Exchange-correlation      = BEEF ( 1  4 27 13 2)
	    # nstep                     =           50

	    # Collect potentially relevent properties
	    # The elif can be omitted if order is assured
	    elif 'number of atoms/cell      =' in line:
		natoms = int(line.split()[-1])

	    # Collect cell dimensions
	    elif 'celldm(1)' in line:
		alat = float(line.split()[1]) / aul

	    elif 'crystal' in line:
		cell = []
		for _ in range(3):
		    line = f.readline()
		    cell += [[float(x) for x in line.split()[3:6]]]
		cell = np.array(cell) * alat

	    # Collect positions, symbols, and number of atoms
	    elif 'site n.' in line:
		pos, syms = [], []

		for _ in range(natoms):
		    line = f.readline()
		    pos += [line.split()[-4:-1]]
		    syms += [line.split()[1].strip('0123456789')]

		pos = np.array(pos, dtype=float) * alat

		# Setup the atoms object
		atoms = Atoms(syms, pos, cell=cell, pbc=(1, 1, 1))

	    # This should be the last piece of information
	    elif 'number of k points=' in line:

		atoms = attach_results(f, atoms)

		# Add atom to images
		images = [atoms]

		# Only atomic positions and energies need to be collected now
		# until the calculation ends
		while 'JOB DONE.' not in line and line:
		    line = f.readline()

		    # A duplicate of the coordinates printed previously
		    if 'Begin final coordinates' in line:
			break

		    if 'ATOMIC_POSITIONS' in line:
			atoms = atoms.copy()

			coord = line.split('(')[-1]
			for i in range(natoms):
			    line = f.readline()
			    pos[i][:] = line.split()[1:4]

			    # It's possible to recover constraints here,
			    # but not yet implemented If there are 7
			    # characters in the line, we have constraints
			    # if len(line.split()) == 7:
			    #     cons += [line.split()[-3:]]
			    # else:
			    #     cons += [[1] * 3]

			# cons = np.array(cons, dtype=float)

			if coord == 'alat)':
			    atoms.set_positions(pos * alat)
			elif coord == 'bohr)':
			    atoms.set_positions(pos * Bohr)
			elif coord == 'angstrom)':
			    atoms.set_positions(pos)
			else:
			    atoms.set_scaled_positions(pos)

			# atoms.wrap()
			atoms = attach_results(f, atoms)
			images += [atoms]

		if out_file:
		    write(out_file, images)

		return images

In this function, the atoms object information for an input.traj file is used, so I'm always relying on that being in the calculation directory before having run this script. Another important caveat is that this script is designed for incorporating the entire relaxation trajectory. This will not necessarily work correctly if you re-run your Quantum Espresso calculations inside of the same directory as this will append images on to the end of the log file. Generally speaking, this is poor practice when performing calculations for reproducibility reasons.

Now that we have a functional ability to use Quantum Espresso, we can break our calculation into nicely modular functions which we can use in different situations as needed. These will need to incorporate certain aspects in order to be easily searchable in the database later down the road.

2.2 Writing a database-friendly input file

The first function I use quite frequently is one which converts an atoms object into a string format. This allows me to use an atoms object as the input to my calculation from any computer. That way, I can stage them from my personal computer ans simply have Fireworks unpack and run them once they are on the cluster.

import numpy as np
import json


def atoms_to_encode(images):
    """ Converts an list of atoms objects to an encoding
    from a .traj file.
    """

    if not isinstance(images, list):
	images = [images]

    # Convert all constraints into dictionary format
    constraints = [_.todict() for _ in images[0].constraints]
    for i, C in enumerate(constraints):

	# Turn any arrays in the kwargs into lists
	for k, v in list(C['kwargs'].items()):
	    if isinstance(v, np.ndarray):
		constraints[i]['kwargs'][k] = v.tolist()

    # Convert any arrays from the parameter settings into lists
    keys = images[0].info
    for k, v in list(keys.items()):
	if isinstance(v, np.ndarray):
	    keys[k] = v.tolist()

    data = {'trajectory': {}}
    # Assemble the compressed dictionary of results
    for i, atoms in enumerate(images):

	if i == 0:
	    # For first images, collect cell and positions normally
	    pos = atoms.get_positions()
	    update_pos = pos

	    cell = atoms.get_cell()
	    update_cell = cell

	    # Add the parameters which do not change
	    data['numbers'] = images[0].get_atomic_numbers().tolist()
	    data['pbc'] = images[0].get_pbc().tolist()
	    data['constraints'] = constraints
	    data['calculator_parameters'] = keys

	else:
	    # For consecutive images, check for duplication
	    # If duplicates are found, do not store it
	    if np.array_equal(atoms.get_positions(), pos):
		update_pos = np.array([])
	    else:
		pos = atoms.get_positions()
		update_pos = pos

	    if np.array_equal(atoms.get_cell(), cell):
		update_cell = np.array([])
	    else:
		cell = atoms.get_cell()
		update_cell = cell

	if atoms._calc:
	    nrg = atoms.get_potential_energy()
	    force = atoms.get_forces()
	    stress = atoms.get_stress()

	    # Stage results and convert to lists in needed
	    results = {
		'positions': update_pos,
		'cell': update_cell,
		'energy': nrg,
		'forces': force,
		'stress': stress}

	else:
	    results = {
		'positions': update_pos,
		'cell': update_cell}

	for k, v in list(results.items()):
	    if isinstance(v, np.ndarray):
		results[k] = v.tolist()

	# Store trajectory, throwing out None values
	data['trajectory'][i] = {
	    k: v for k, v in list(
		results.items()) if v is not None}

    # Return the reduced results in JSON compression
    return json.dumps(data)

Now that I have this function, I can store it into my path can call it to turn an atoms object into a JSON string which is safe to add into the database. Lets revisit the original quantum espresso example and see how this works.

#!/usr/bin/env python
from ase import Atoms
from qefw import atoms_to_encode

# Define the ase-espresso keys we will use.
keys = {
    'mode': 'relax',
    'opt_algorithm': 'bfgs',
    'xc': 'RPBE',
    'outdir': '.',
    'output': {'removesave': True},
    'pw': 200,
    'dw': 2000,
    'dipole': {'status': True},
    'kpts': (1, 1, 1),
    'calcstress': True,
    'convergence': {
	'energy': 1e-5,
	'mixing': 0.35}}

# Create the atoms object
atoms = Atoms(
    'H2',
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.8]])
atoms.center(vacuum=4)
atoms.info = keys

# Call the encoding function.
encoding = atoms_to_encode(atoms)

print(encoding)
{"trajectory": {"0": {"positions": [[4.0, 4.0, 4.0], [4.0, 4.0, 4.8]], "cell": [[8.0, 0.0, 0.0], [0.0, 8.0, 0.0], [0.0, 0.0, 8.8]]}}, "numbers": [1, 1], "pbc": [false, false, false], "constraints": [], "calculator_parameters": {"mode": "relax", "opt_algorithm": "bfgs", "xc": "RPBE", "outdir": ".", "output": {"removesave": true}, "pw": 200, "dw": 2000, "dipole": {"status": true}, "kpts": [1, 1, 1], "calcstress": true, "convergence": {"energy": 1e-05, "mixing": 0.35}}}

This provides a lovely bundled up atoms object which is complete with the ase-espresso keywords we expect to be run with this atoms object. Not only is this exactly what we need to run the calculation, it's also perfect documentation for calling on later to identify what this calculation is.

2.3 Recovering an atoms object from encoding

Now that we have a JSON representation of our atoms object, we will be able to send this string to the database have it stored for future use. Once the calculation gets called up on one of the clusters however, we will need some way of turning that JSON string back into an atoms object so that the local installation of ASE knows how to use it as an atoms object.

import json
from ase.calculators.singlepoint import SinglePointCalculator as SPC
from ase.io import write
from ase import Atoms
from ase.constraints import dict2constraint


def encode_to_atoms(encode, out_file='input.traj'):
    """ Dump the encoding to a local traj file.
    """

    # First, decode the trajectory
    data = json.loads(encode, encoding='utf-8')

    # Construct the initial atoms object
    atoms = Atoms(
	data['numbers'],
	data['trajectory']['0']['positions'],
	cell=data['trajectory']['0']['cell'],
	pbc=data['pbc'])
    atoms.info = data['calculator_parameters']
    atoms.set_constraint([dict2constraint(_) for _ in data['constraints']])

    # Attach the calculator
    calc = SPC(
	atoms=atoms,
	energy=data['trajectory']['0'].get('energy'),
	forces=data['trajectory']['0'].get('forces'),
	stress=data['trajectory']['0'].get('stress'))
    atoms.set_calculator(calc)

    # Collect the rest of the trajectory information
    images = [atoms]
    for i in range(len(data['trajectory']))[1:]:
	atoms = atoms.copy()

	if data['trajectory'][str(i)]['cell']:
	    atoms.set_cell(data['trajectory'][str(i)]['cell'])

	if data['trajectory'][str(i)]['positions']:
	    atoms.set_positions(data['trajectory'][str(i)]['positions'])

	calc = SPC(
	    atoms=atoms,
	    energy=data['trajectory'][str(i)].get('energy'),
	    forces=data['trajectory'][str(i)].get('forces'),
	    stress=data['trajectory'][str(i)].get('stress'))
	atoms.set_calculator(calc)

	images += [atoms]

    # Write the traj file
    if out_file:
	write(out_file, images)

    return images

Now we can convert our JSON encoding back into an atoms object complete with the calculation parameters which were attached before they left the computer we generated them on. I do this on the same machine here simply to demonstrate the concept.

#!/usr/bin/env python
from ase import Atoms
from qefw import atoms_to_encode, encode_to_atoms

# Define the ase-espresso keys we will use.
keys = {
    'mode': 'relax',
    'opt_algorithm': 'bfgs',
    'xc': 'RPBE',
    'outdir': '.',
    'output': {'removesave': True},
    'pw': 200,
    'dw': 2000,
    'dipole': {'status': True},
    'kpts': (1, 1, 1),
    'calcstress': True,
    'convergence': {
	'energy': 1e-5,
	'mixing': 0.35}}

# Create the atoms object
atoms = Atoms(
    'H2',
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.8]])
atoms.center(vacuum=4)
atoms.info = keys

# Call the encoding function.
encoding = atoms_to_encode(atoms)

print(encoding)

recovered_atoms = encode_to_atoms(encoding)

print(recovered_atoms)

{"trajectory": {"0": {"positions": [[4.0, 4.0, 4.0], [4.0, 4.0, 4.8]], "cell": [[8.0, 0.0, 0.0], [0.0, 8.0, 0.0], [0.0, 0.0, 8.8]]}}, "numbers": [1, 1], "pbc": [false, false, false], "constraints": [], "calculatorparameters": {"mode": "relax", "optalgorithm": "bfgs", "xc": "RPBE", "outdir": ".", "output": {"removesave": true}, "pw": 200, "dw": 2000, "dipole": {"status": true}, "kpts": [1, 1, 1], "calcstress": true, "convergence": {"energy": 1e-05, "mixing": 0.35}}} [Atoms(symbols='H2', pbc=False, cell=[8.0, 8.0, 8.8], calculator=SinglePointCalculator(…))]

2.4 Performing a relaxation

Now that we have all the tools we need to transfer an atoms object (and its tags) to the database and back again, we're ready to write a simple relaxaton script for executing on a generic atoms object.

from ase.io import read
from fw.fwio import atoms_to_encode
from qeio import log_to_atoms
from espresso import espresso


def get_potential_energy(in_file='input.traj'):
    """ Performs a ASE get_potential_energy() call with
    the ase-espresso calculator and the keywords
    defined inside the atoms object information.

    This can be a singlepoint calculation or a
    full relaxation depending on the keywords.
    """

    # Read the input file from the current directory
    atoms = read(in_file)

    # Planewave basis set requires periodic boundary conditions
    atoms.set_pbc([1, 1, 1])

    # Setting up the calculator
    calc = espresso(**atoms.info)
    atoms.set_calculator(calc)

    # Perform the calculation and write trajectory from log.
    atoms.get_potential_energy()
    images = log_to_atoms(out_file='output.traj')

    # Save the calculator to the local disk for later use.
    try:
	calc.save_flev_output()
    except(RuntimeError):
	calc.save_output()

    return atoms_to_encode(images)

With this script, we can perform a relaxation test using a script very similar to our first example without ever submitting anything to Fireworks. It would be a good idea to run this using a similar sbatch command on Sherlock to ensure that all of the functions are setup correctly before proceeding to the next step.

from ase import Atoms
from qefw import atoms_to_encode, encode_to_atoms, get_potential_energy

# Define the ase-espresso keys we will use.
keys = {
    'mode': 'relax',
    'opt_algorithm': 'bfgs',
    'xc': 'RPBE',
    'outdir': '.',
    'output': {'removesave': True},
    'pw': 200,
    'dw': 2000,
    'dipole': {'status': True},
    'kpts': (1, 1, 1),
    'calcstress': True,
    'convergence': {
	'energy': 1e-5,
	'mixing': 0.35}}

# Create the atoms object
atoms = Atoms(
    'H2',
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.8]])
atoms.center(vacuum=4)
atoms.info = keys

# Call the encoding function.
encoding = atoms_to_encode(atoms)
atoms = encode_to_atoms(encoding)

# Run the calculation
images = get_potential_energy()

Completing this calculation will give you an output with the following files

ls -lha
-rw-r--r-- 1 jrboes suncat 3.6M Dec  3 18:32 calc.tgz
-rw-r--r-- 1 jrboes suncat  644 Dec  3 18:32 input.traj
-rw-r--r-- 1 jrboes suncat  28K Dec  3 18:32 log
-rw-r--r-- 1 jrboes suncat    8 Dec  3 18:32 nodefile.18552100
-rw-r--r-- 1 jrboes suncat 3.3K Dec  3 18:32 output.traj
-rw-r--r-- 1 jrboes suncat 1.2K Dec  3 18:32 pw.inp
-rw-r--r-- 1 jrboes suncat  765 Dec  3 18:32 script2.py
-rw-r--r-- 1 jrboes suncat  392 Dec  3 18:32 stderr
-rw-r--r-- 1 jrboes suncat    0 Dec  3 18:32 stdout
-rw-r--r-- 1 jrboes suncat    8 Dec  3 18:32 uniqnodefile.18552100

2.5 Recovering an existing calculator

From the previous script, we now have a calculation which is a good save point for building off of new calculations. By loading in the calc.tgz file which was saved, we can restart our calculation from the finished relaxation using the following script.

from espresso import espresso
from ase.io import read

def get_relaxed_calculation(in_file='output.traj'):
    """ Attach a stored calculator in the current directory
    to the provided atoms object.

    Then return the atoms object with the calculator attached.
    """

    # Read the last geometry from the input file
    atoms = read(in_file)

    # Reinitialize the calculator from calc.tgz and attach it.
    calc = espresso(**atoms.info)
    calc.load_flev_output()
    atoms.set_calculator(calc)

    return atoms

If we combine this with a post-processing operation, like collecting th total potential, we can restart out calculation in a new place without having to perform the relaxation a second time. You should be able to perform whatever follow up operation you want on this compressed version of the calculation, so this is a nice way to store information without performing unnecessary post-processing that you may or may not use later.

import msgpack
import json
from qefw import array_to_list, get_relaxed_calculation


def get_total_potential(out_file='potential.msg'):
    """ Calculate and save the total potential
    """

    # We require a previously relaxed calculation for this.
    atoms = get_relaxed_calculation()
    calc = atoms.get_calculator()

    # Collect the total potential and write to disk
    potential = calc.extract_total_potential()

    potential = list(potential)
    array_to_list(potential)

    # If outfile, write a MessagePack encoded version to disk
    if out_file:
	with open(out_file, 'w') as f:
	    msgpack.dump(potential, f)

    # Return a BSON friendly version
    return json.dumps(potential, encoding='utf-8')

3 Running through Fireworks

Finally, we are ready to submit a basic calculation to Fireworks using the tools discussed above. To keep this documentation interactive, I will be pulling my own credentials from a secure file. This will looks like the following.

from netrc import netrc

# Read credentials from a secure location
host = 'suncatls2.slac.stanford.edu'
username, name, password = netrc().authenticators(host)

print(username, name, password)
your_username your_database_name your_password

Now we can connect to the launchpad of Fireworks using the following code.

from fireworks import LaunchPad, Firework, Workflow, PyTask
from ase.collections import g2
from ase.units import Ry
from math import ceil
from fw.fwio import atoms_to_encode
from netrc import netrc
from ase import Atoms

# Read credentials from a secure location
host = 'suncatls2.slac.stanford.edu'
username, name, password = netrc().authenticators(host)

launchpad = LaunchPad(
    host=host,
    name=name,
    username=username,
    password=password)

# Define the ase-espresso keys we will use.
keys = {
    'mode': 'relax',
    'opt_algorithm': 'bfgs',
    'xc': 'RPBE',
    'outdir': '.',
    'output': {'removesave': True},
    'pw': 200,
    'dw': 2000,
    'dipole': {'status': True},
    'kpts': (1, 1, 1),
    'calcstress': True,
    'convergence': {
	'energy': 1e-5,
	'mixing': 0.35}}

# Create the atoms object
atoms = Atoms(
    'H2',
    [[0.0, 0.0, 0.0],
     [0.0, 0.0, 0.8]])
atoms.center(vacuum=4)
atoms.info = keys

# Encode the atoms
encoding = atoms_to_encode(atoms)

# Define some searching keys 
search_keys = {'molecule': 'H2'}

# Two steps - write the input structure to an input file, then relax
t0 = PyTask(
    func='qefw.encode_to_atoms',
    args=[encoding])
t1 = PyTask(
    func='qefw.get_potential_energy',
    stored_data_varname='trajectory')

# Package the tasks into a firework, the fireworks into a workflow,
# and submit the workflow to the launchpad
firework = Firework([t0, t1], spec={'_priority': 1}, name=search_keys)
workflow = Workflow([firework])
launchpad.add_wf(workflow)

After running this script, if you have already initiated Fireworks rapidfire on Sherlock, you can see how to do this in a previous post. Once it is turned on, Sherlock should submit a job to the queue automatically.

So what's going on here exactly? We'll for full details about the differences between a PyTask, Firework, and Wroflow, I would recommend looking into the in-depth documentation available on the Fireworks website. When this calculation runs, it will perform 2 distinct operations on the calculation node once it is started. The first, t0, is the qefw.encode_to_atoms function demonstrated previously. (NOTE: Make sure the qefw functions are on the PYHONPATH on the server before you run this example!). This will unpack the encoding which is stored in the database by passing encoding as an argument to the first function. This allows me to run this script from my local machine and still have the encoding uploaded to the database and decoded on the server. This is very useful for me because I enjoy using a heavy version of emacs which does not run well on the servers.

Once the t0 task is performed successfully, the t1 task will begin and perform the relaxation on the calculation node as well. Since the qefw.get_potential_energy function returns the images of the atoms objects encoded into JSON, we can also tell Fireworks to store that output into the 'spec.trajectory' field of the results by assigning storeddatavarname='trajectory'. This is extremely useful since most of the information I am ever looking for is included in the final trajectory file. If you need more specific information from the file, these functions should provide a pretty clear example of how you would go about doing that. Keep in mind that all information must be in a compatible format for the MongoDB database. JSON is a pretty safe bet in that regard.

You could also add in a t3 task which performs your own follow-up tasks, such as the get_total_potential example above.

You can monitor the progress of your calculation with the webgui the following bash script. For me, lpad is an alias with a -f argument which points to my my_fireworks.yaml file.

lpad webgui

4 Collecting information from Fireworks

Once the calculation has finished, you can collect the final trajectory simply by accessing the ID of the completed calcualtion from the LaunchPad.

from fireworks import LaunchPad
from qefw import encode_to_atoms
from netrc import netrc

# Read credentials from a secure location
host = 'suncatls2.slac.stanford.edu'
username, name, password = netrc().authenticators(host)

launchpad = LaunchPad(
    host=host,
    name=name,
    username=username,
    password=password)

# Select the ID of the first completed calcualtion
ID = launchpad.get_fw_ids(query={'state': 'COMPLETED'})[0]

launch = launchpad.get_fw_dict_by_id(ID)

encoding = launch['launches'][-1]['action']['stored_data']['trajectory']
images = encode_to_atoms(encoding)

print(images)
[Atoms(symbols='SH2', pbc=False, cell=[20.0, 21.948538, 20.919218], calculator=SinglePointCalculator(...)), Atoms(symbols='SH2', pbc=False, cell=[20.0, 21.948538, 20.919218], calculator=SinglePointCalculator(...)), Atoms(symbols='SH2', pbc=False, cell=[20.0, 21.948538, 20.919218], calculator=SinglePointCalculator(...))]

The truly beautiful thing about this code is that it can be performed from my local machine, just the same way the previous submission script was. If you set up your calculation inputs and outputs, there's no need to interact with anything on the server during for regular usage purposes!

This is a wonderful tool and an excellent skill for any computational catalysis researcher to poses. Fireworks also have far more capabilities for automation then what I have covered here. In future posts, I will dive into these features in more detail as I continue to explore them.

Until then, happy programming!

org-mode source

Comments !

links

social