Figure: An early test of our new 3-D agent-based cell model, growing from 10 to 80,000 agents in about 25 days (24-threaded simulation required about 5 hours). Rendered in 3D using POVRAY (with a cutaway view). [Read more ...]

Wednesday, February 24, 2016

Saving MultiCellDS data from BioFVM

Note: This is the fifth in a series of "how-to" blog posts to help new users and developers of BioFVM

Introduction

A major initiative for my lab has been MultiCellDS: a standard for multicellular data. The project aims to create model-neutral representations of simulation data (for both discrete and continuum models), which can also work for segmented experimental and clinical data. A single-time output is called a digital snapshot. An interdisciplinary, multi-institutional review panel has been hard at work to nail down the draft standard.

A BioFVM MultiCellDS digital snapshot includes program and user metadata (more information to be included in a forthcoming publication), an output of the microenvironment, and any cells that are secreting or uptaking substrates.

As of Version 1.1.0, BioFVM supports output saved to MultiCellDS XML files. Each download also includes a matlab function for importing MultiCellDS snapshots saved by BioFVM programs. This tutorial will get you going.

BioFVM (finite volume method for biological problems) is an open source code for solving 3-D diffusion of 1 or more substrates. It was recently published as open access in Bioinformatics here:

http://dx.doi.org/10.1093/bioinformatics/btv730

The project website is at http://BioFVM.MathCancer.org, and downloads are at
http://BioFVM.sf.net.

Working with MultiCellDS in BioFVM programs

We include a MultiCellDS_test.cpp file in the examples directory of every BioFVM download (Version 1.1.0 or later). Create a new project directory, copy the following files to it:

BioFVM*.cpp and BioFVM*.h (from the main BioFVM directory)
pugixml.* (from the main BioFVM directory)
Makefile and MultiCellDS_test.cpp (from the examples directory)

Open the MultiCellDS_test.cpp file to see the syntax as you read the rest of this post.

See earlier tutorials (below) if you have troubles with this.

Setting metadata values

There are few key bits of metadata. First, the program used for the simulation (all these fields are optional):

// the program name, version, and project website:
BioFVM_metadata.program.program_name = "BioFVM MultiCellDS Test";
BioFVM_metadata.program.program_version = "1.0";
BioFVM_metadata.program.program_URL
   = "http://BioFVM.MathCancer.org";

// who created the program (if known)
BioFVM_metadata.program.creator.surname = "Macklin";
BioFVM_metadata.program.creator.given_names = "Paul";
BioFVM_metadata.program.creator.email = "Paul.Macklin@usc.edu";
BioFVM_metadata.program.creator.URL 
   = "http://BioFVM.MathCancer.org"; 
BioFVM_metadata.program.creator.organization 
   = "University of Southern California";
BioFVM_metadata.program.creator.department 
   = "Center for Applied Molecular Medicine";
BioFVM_metadata.program.creator.ORCID = "0000-0002-9925-0151";

// (generally peer-reviewed) citation information for the program 
BioFVM_metadata.program.citation.DOI 
   = "10.1093/bioinformatics/btv730";
BioFVM_metadata.program.citation.PMID = "26656933";
BioFVM_metadata.program.citation.PMCID = "PMC1234567";
BioFVM_metadata.program.citation.text 
   = "A. Ghaffarizaeh, S.H. Friedman, and P. Macklin, BioFVM: an efficient parallelized diffusive transport solver for 3-D biological simulations, Bioinformatics, 2015. DOI: 10.1093/bioinformatics/btv730.";
BioFVM_metadata.program.citation.notes = "notes here";
BioFVM_metadata.program.citation.URL 
   = "http://dx.doi.org/10.1093/bioinformatics/btv730";

// user information: who ran the program 
BioFVM_metadata.program.user.surname = "Kirk";
BioFVM_metadata.program.user.given_names = "James T.";
BioFVM_metadata.program.user.email = "Jimmy.Kirk@starfleet.mil";
BioFVM_metadata.program.user.organization = "Starfleet";
BioFVM_metadata.program.user.department = "U.S.S. Enterprise (NCC 1701)";
BioFVM_metadata.program.user.ORCID = "0000-0000-0000-0000";

// And finally, data citation information (the publication where this simulation snapshot appeared)
BioFVM_metadata.data_citation.DOI = "10.1093/bioinformatics/btv730";
BioFVM_metadata.data_citation.PMID = "12345678";
BioFVM_metadata.data_citation.PMCID = "PMC1234567";
BioFVM_metadata.data_citation.text 
   = "A. Ghaffarizaeh, S.H. Friedman, and P. Macklin, BioFVM: an efficient parallelized diffusive transport solver for 3-D biological simulations, Bioinformatics, 2015. DOI: 10.1093/bioinformatics/btv730.";
BioFVM_metadata.data_citation.notes = "notes here";
BioFVM_metadata.data_citation.URL 
   = "http://dx.doi.org/10.1093/bioinformatics/btv730";

You can sync the metadata current time, program runtime (wall time), and dimensional units using the following command. (This command is automatically run whenever you use the save command below.) 

BioFVM_metadata.sync_to_microenvironment( M ); 

You can display a basic summary of the metadata via: 

BioFVM_metadata.display_information( std::cout ); 

Setting options

By default (to save time and disk space), BioFVM saves the mesh as a Level 3 matlab file, whose location is embedded into the MultiCellDS XML file. You can disable this feature and revert to full XML (e.g., for human-readable cross-model reporting) via:

set_save_biofvm_mesh_as_matlab( false ); 

Similarly, BioFVM defaults to saving the values of the substrates in a compact Level 3 matlab file. You can override this with:

set_save_biofvm_data_as_matlab( false ); 

BioFVM by default saves the cell-centered sources and sinks. These take a lot of time to parse because they require very hierarchical data structures. You can disable saving the cells (basic_agents) via:

set_save_biofvm_cell_data( false ); 

Lastly, when you do save the cells, we default to a customized, minimal matlab format. You can revert to a more standard (but much larger) XML format with:

set_save_biofvm_cell_data_as_custom_matlab( false );

Saving a file

Saving the data is very straightforward:

save_BioFVM_to_MultiCellDS_xml_pugi( "sample" , M , current_simulation_time ); 

Your data will be saved in sample.xml. (Depending upon your options, it may generate several .mat files beginning with "sample".)

If you'd like the filename to depend upon the simulation time, use something more like this:

double current_simulation_time = 10.347; 
char filename_base [1024]; 
sprintf( &filename_base , "sample_%f", current_simulation_time ); 
save_BioFVM_to_MultiCellDS_xml_pugi( filename_base , M,
   current_simulation_time ); 

Your data will be saved in sample_10.347000.xml. (Depending upon your options, it may generate several .mat files beginning with "sample_10.347000".)

Compiling and running the program:

Edit the Makefile as below:

PROGRAM_NAME := MCDS_test
all: $(BioFVM_OBJECTS) $(pugixml_OBJECTS) MultiCellDS_test.cpp
 $(COMPILE_COMMAND) -o $(PROGRAM_NAME) $(BioFVM_OBJECTS) $(pugixml_OBJECTS) MultiCellDS_test.cpp
If you're running OSX, you'll probably need to update the compiler from "g++". See these tutorials.

Then, at the command prompt:

make
./MCDS_test

On Windows, you'll need to run without the ./:

make
MCDS_test

Working with MultiCellDS data in Matlab

Reading data in Matlab

Copy the read_MultiCellDS_xml.m file from the matlab directory (included in every MultiCellDS download). To read the data, just do this:

MCDS = read_MultiCellDS_xml( 'sample.xml' ); 

This should take around 30 seconds for larger data files (500,000 to 1,000,000 voxels with a few substrates, and around 250,000 cells). The long execution time is primarily because Matlab is ghastly inefficient at loops over hierarchical data structures. Increasing to 1,000,000 cells requires around 80-90 seconds to parse in matlab. 

Plotting data in Matlab

Plotting the 3-D substrate data

First, let's do some basic contour and surface plotting:

mid_index = round( length(MCDS.mesh.Z_coordinates)/2 ); 

contourf( MCDS.mesh.X(:,:,mid_index), ...
   MCDS.mesh.Y(:,:,mid_index), ... 
   MCDS.continuum_variables(2).data(:,:,mid_index) , 20 ) ; 
axis image
colorbar 
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 
title( sprintf('%s (%s) at t = %f %s, z = %f %s', MCDS.continuum_variables(2).name , ...
   MCDS.continuum_variables(2).units , ...
   MCDS.metadata.current_time , ...
   MCDS.metadata.time_units, ... 
   MCDS.mesh.Z_coordinates(mid_index), ...
   MCDS.metadata.spatial_units ) ); 

OR

contourf( MCDS.mesh.X_coordinates , MCDS.mesh.Y_coordinates, ... 
   MCDS.continuum_variables(2).data(:,:,mid_index) , 20 ) ; 
axis image
colorbar 
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 
title( sprintf('%s (%s) at t = %f %s, z = %f %s', ...
   MCDS.continuum_variables(2).name , ...
   MCDS.continuum_variables(2).units , ...
   MCDS.metadata.current_time , ...
   MCDS.metadata.time_units, ... 
   MCDS.mesh.Z_coordinates(mid_index), ...
   MCDS.metadata.spatial_units ) );  

Here's a surface plot:

surf( MCDS.mesh.X_coordinates , MCDS.mesh.Y_coordinates, ... 
   MCDS.continuum_variables(1).data(:,:,mid_index) ) ; 
colorbar 
axis tight
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 
zlabel( sprintf( '%s (%s)', MCDS.continuum_variables(1).name, ...
   MCDS.continuum_variables(1).units ) ); 
title( sprintf('%s (%s) at t = %f %s, z = %f %s', MCDS.continuum_variables(1).name , ...
   MCDS.continuum_variables(1).units , ...
   MCDS.metadata.current_time , ...
   MCDS.metadata.time_units, ...
   MCDS.mesh.Z_coordinates(mid_index), ...
   MCDS.metadata.spatial_units ) ); 

Finally, here are some more advanced plots. The first is an "exploded" stack of contour plots:

clf
contourslice( MCDS.mesh.X , MCDS.mesh.Y, MCDS.mesh.Z , ...
  MCDS.continuum_variables(2).data , [],[], ...
  MCDS.mesh.Z_coordinates(1:15:length(MCDS.mesh.Z_coordinates)),20);
view([-45 10]);
axis tight; 
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 
zlabel( sprintf( 'z (%s)' , MCDS.metadata.spatial_units) ); 
title( sprintf('%s (%s) at t = %f %s', ...
   MCDS.continuum_variables(2).name , ...
   MCDS.continuum_variables(2).units , ...
   MCDS.metadata.current_time, ... 
   MCDS.metadata.time_units ) ); 

Next, we show how to use isosurfaces with transparency

clf
patch( isosurface( MCDS.mesh.X , MCDS.mesh.Y, MCDS.mesh.Z, ...
   MCDS.continuum_variables(1).data, 1000 ), 'edgecolor', ...
   'none', 'facecolor', 'r' , 'facealpha' , 1 );
hold on
patch( isosurface( MCDS.mesh.X , MCDS.mesh.Y, MCDS.mesh.Z, ...
MCDS.continuum_variables(1).data, 5000 ), 'edgecolor', ...
   'none', 'facecolor', 'b' , 'facealpha' , 0.7 );
patch( isosurface( MCDS.mesh.X , MCDS.mesh.Y, MCDS.mesh.Z, ...
   MCDS.continuum_variables(1).data, 10000 ), 'edgecolor', ...
   'none', 'facecolor', 'g' , 'facealpha' , 0.5 );
hold off
% shading interp
camlight
view(3)
axis image
axis tightcamlight lighting gouraud
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) );
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) );
zlabel( sprintf( 'z (%s)' , MCDS.metadata.spatial_units) );
title( sprintf('%s (%s) at t = %f %s', ...
   MCDS.continuum_variables(1).name , ...
   MCDS.continuum_variables(1).units , ...
   MCDS.metadata.current_time, ...
   MCDS.metadata.time_units ) );

You can get more 3-D volumetric visualization ideas at Matlab's website. This visualization post at MIT also has some great tips.

Plotting the cells

Here is a basic 3-D plot for the cells:

plot3( MCDS.discrete_cells.state.position(:,1) , ...
   MCDS.discrete_cells.state.position(:,2) , ...
   MCDS.discrete_cells.state.position(:,3) , 'bo' );
view(3)
axis tight
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 
zlabel( sprintf( 'z (%s)' , MCDS.metadata.spatial_units) );
title( sprintf('Cells at t = %f %s', MCDS.metadata.current_time, ...
    MCDS.metadata.time_units ) );

plot3 is more efficient than scatter3, but scatter3 will give more coloring options. Here is the syntax:

scatter3( MCDS.discrete_cells.state.position(:,1), ...
   MCDS.discrete_cells.state.position(:,2), ...
   MCDS.discrete_cells.state.position(:,3) , 'bo' );
view(3)
axis tight
xlabel( sprintf( 'x (%s)' , MCDS.metadata.spatial_units) ); 
ylabel( sprintf( 'y (%s)' , MCDS.metadata.spatial_units) ); 
zlabel( sprintf( 'z (%s)' , MCDS.metadata.spatial_units) ); 
title( sprintf('Cells at t = %f %s', MCDS.metadata.current_time, ...
    MCDS.metadata.time_units ) );

Jan Poleszczuk gives some great insights on plotting many cells in 3D at his blog. I'd recommend checking out his post on visualizing a cellular automaton model. At some point, I'll update this post with prettier plotting based on his methods.

What's next

Future releases of BioFVM will support reading MultiCellDS snapshots (for model initialization).

Matlab is pretty slow at parsing and visualizing large amounts of data. We also plan to include resources for accessing MultiCellDS data in VTK / Paraview and Python.

Tutorial Series for BioFVM

Setting up your development environment

Using BioFVM

  1. BioFVM Warmup: a 2D continuum simulation of tumor growth
  2. Saving MultiCellDS data from BioFVM

Return to News   Return to MathCancer