DXT Explorer

DXT Explorer is an interactive web-based log analysis tool to visualize Darshan DXT logs and help understand the I/O behavior of applications. Our tool adds an interactive component to Darshan trace analysis that can help researchers, developers, and end users visually inspect their applications’ I/O behavior, zoom in on areas of interest, and get a clear picture of where the I/O problem is.

Build Instructions

Dependencies

DXT Explorer requires a Darshan log file collected with tracing data. The Darshan eXtended Tracing (DXT) support is disabled by default in Darshan. To enable tracing globally for all files, you need to set the DXT_ENABLE_IO_TRACE environment variable as follows:

export DXT_ENABLE_IO_TRACE=1

To enable tracing for particular files, refer to Darshan’s documentation.
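If you launch your application from a Python driver script rather than an interactive shell, the same variable can be set on the child process only. The following is a minimal sketch under that assumption: `dxt_env` and `run_traced` are hypothetical helper names, and the application still has to be instrumented with Darshan (linked or preloaded, depending on your site) for any trace to be recorded.

```python
import os
import subprocess


def dxt_env(base=None):
    """Return a copy of an environment with DXT tracing enabled for all files."""
    env = dict(os.environ if base is None else base)
    env["DXT_ENABLE_IO_TRACE"] = "1"
    return env


def run_traced(command):
    """Run `command` (a list of arguments) with DXT tracing enabled.

    Only the child process sees the variable; the caller's environment
    is left untouched.
    """
    return subprocess.run(command, env=dxt_env(), check=True)


# Show the effect on a small example environment.
print(dxt_env({"PATH": "/usr/bin"}))
```

Calling `run_traced` with your usual launch command as a list of arguments would then start the application with tracing enabled.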

To use DXT Explorer, you need Python 3 and R installed on your system, along with the required Python libraries:

pip install -r requirements.txt

The first time it runs, DXT Explorer automatically downloads any missing R packages it requires, so generating the first plot might take longer. This is all done at user level, without any need for elevated privileges.

You also need Darshan Utils installed, with darshan-dxt-parser available in your PATH.
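Before running the tool, it can be useful to confirm that the external programs are reachable. The sketch below checks the PATH for darshan-dxt-parser and Rscript using Python's standard `shutil.which`. The helper name `find_missing` is hypothetical, and this is not necessarily how DXT Explorer performs its own check.

```python
import shutil

# External programs expected on the PATH.
REQUIRED_TOOLS = ["darshan-dxt-parser", "Rscript"]


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the subset of `tools` that cannot be found on the PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]


missing = find_missing()
if missing:
    print("Missing from PATH: " + ", ".join(missing))
else:
    print("All required tools were found.")
```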

Note

On Summit, you need to load some modules before running DXT Explorer:

module load python r cairo

Docker Image

You can also use a Docker image already pre-configured with all dependencies to run DXT Explorer:

docker pull hpcio/dxt-explorer

Since you need to provide an input file and access the generated .html files, mount your current directory in the container, and remove the container after using it. You can pass the dxt-explorer arguments described in the Exploring section after the image name (hpcio/dxt-explorer).

docker run --rm --mount \
    type=bind,source="$(pwd)",target="/dxt-explorer/darshan" \
    hpcio/dxt-explorer darshan/<FILE>.darshan
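When the container is launched from a script, building the argument list programmatically avoids shell quoting problems with the bind mount. The following is a rough sketch: `docker_command` is a hypothetical helper, and the default image name is assumed to be the one pulled from the registry (hpcio/dxt-explorer), so adjust it if your image is tagged differently.

```python
import os
import shlex


def docker_command(log_name, image="hpcio/dxt-explorer", extra_args=(), workdir=None):
    """Build a `docker run` argument list for a log in `workdir`.

    `workdir` defaults to the current directory and is bind-mounted at
    /dxt-explorer/darshan, so the log is visible as darshan/<log_name>.
    """
    workdir = os.getcwd() if workdir is None else workdir
    mount = f"type=bind,source={workdir},target=/dxt-explorer/darshan"
    return [
        "docker", "run", "--rm",
        "--mount", mount,
        image,
        *extra_args,
        f"darshan/{log_name}",
    ]


# "app.darshan" is a placeholder file name.
print(shlex.join(docker_command("app.darshan", extra_args=["-t"])))
```

The resulting list can be passed directly to `subprocess.run` without going through a shell.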

Exploring

Once you have the dependencies and DXT Explorer installed, you can run:

dxt-explorer DARSHAN_FILE_COLLECTED_WITH_DXT_ENABLE.darshan

usage: dxt-explorer [-h] [-o OUTPUT] [-t] [-s] [-d] [-l] [--start START] [--end END] [--from START_RANK] [--to END_RANK] [--browser] darshan

DXT Explorer:

positional arguments:
  darshan               Input .darshan file

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Name of the output file
  -t, --transfer        Generate an interactive data transfer explorer
  -s, --spatiality      Generate an interactive spatiality explorer
  -d, --debug           Enable debug mode
  -l, --list            List all the files with trace
  --start START         Report starts from X seconds (e.g., 3.7) from beginning of the job
  --end END             Report ends at X seconds (e.g., 3.9) from beginning of the job
  --from START_RANK     Report start from rank N
  --to END_RANK         Report up to rank M
  --browser             Open the browser with the generated plot
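The options above can be combined, for example to restrict a report to a time window and a range of ranks. As an illustration, the sketch below assembles such a command line using only flags that appear in the usage text. The function `explorer_args` and its parameter names are hypothetical and are not part of DXT Explorer itself.

```python
def explorer_args(darshan_file, output=None, transfer=False, spatiality=False,
                  start=None, end=None, from_rank=None, to_rank=None,
                  browser=False):
    """Assemble a dxt-explorer command line as a list of arguments."""
    args = ["dxt-explorer"]
    if output is not None:
        args += ["-o", output]
    if transfer:
        args.append("-t")
    if spatiality:
        args.append("-s")
    if start is not None:
        args += ["--start", str(start)]   # seconds from the beginning of the job
    if end is not None:
        args += ["--end", str(end)]
    if from_rank is not None:
        args += ["--from", str(from_rank)]
    if to_rank is not None:
        args += ["--to", str(to_rank)]
    if browser:
        args.append("--browser")
    args.append(darshan_file)             # positional argument goes last
    return args


# "app.darshan" is a placeholder file name.
print(" ".join(explorer_args("app.darshan", transfer=True,
                             start=3.7, end=3.9, from_rank=0, to_rank=15)))
```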

By default, DXT Explorer generates an explore.html file with an interactive plot that you can open in any browser. If you enabled the transfer or spatiality plots, additional .html files are generated, one for each type. You should see messages like the following in the console:

2021-10-05 03:21:34,907 explore - INFO - darshan-dxt-parser: FOUND
2021-10-05 03:21:34,907 explore - INFO - Rscript: FOUND
2021-10-05 03:21:34,907 explore - INFO - parsing darshan/<FILE>.darshan file
2021-10-05 03:21:35,248 explore - INFO - generating an intermediate CSV file
2021-10-05 03:21:36,240 explore - INFO - generating interactive operation plot
2021-10-05 03:21:54,657 explore - INFO - SUCCESS
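In batch workflows it can help to check this output automatically. The sketch below parses lines in the format shown above, reports whether the last message is SUCCESS, and computes the time between the first and last messages. It assumes the log format is exactly as printed here, and the names `parse_log` and `summarize` are hypothetical.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2021-10-05 03:21:34,907 explore - INFO - darshan-dxt-parser: FOUND
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\S+) - (\w+) - (.*)$"
)


def parse_log(text):
    """Return (timestamp, level, message) tuples for every matching line."""
    records = []
    for line in text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match:
            stamp, _logger, level, message = match.groups()
            when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
            records.append((when, level, message))
    return records


def summarize(text):
    """Return (succeeded, elapsed_seconds) for a console log."""
    records = parse_log(text)
    if not records:
        return False, 0.0
    succeeded = records[-1][2] == "SUCCESS"
    elapsed = (records[-1][0] - records[0][0]).total_seconds()
    return succeeded, elapsed


SAMPLE = """\
2021-10-05 03:21:34,907 explore - INFO - darshan-dxt-parser: FOUND
2021-10-05 03:21:36,240 explore - INFO - generating interactive operation plot
2021-10-05 03:21:54,657 explore - INFO - SUCCESS
"""

print(summarize(SAMPLE))  # (True, 19.75)
```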

You can find a couple of interactive examples of DXT traces collected from FLASH, E2E, and OpenPMD in the companion repository for our PDSW’21 paper.

Ways to contribute

We appreciate your interest in DXT Explorer, and thank you for taking the time to contribute!

We have compiled a set of instructions to help us make DXT Explorer even better.

Reporting bugs

You can open a new issue using our GitHub issue tracker. If you run into a problem, please search first to check that it has not already been reported, and open a new issue only if you find nothing similar. Please provide as much information as possible so that we can reproduce the bug quickly.

Suggesting enhancements

You can use our GitHub issue tracker to describe your proposed feature. Please provide the necessary context, covering why it is needed and what problem it solves.

Testing

DXT Explorer constantly receives updates and improvements. If you can run the latest version, please consider helping us by reporting your findings, including bugs and performance regressions. Running DXT Explorer with different configurations and platforms helps us a lot in making it more robust by quickly identifying and solving issues.

Citation

You can find more information about DXT Explorer in our PDSW’21 paper. If you use DXT Explorer in your experiments, please consider citing:

@inproceedings{dxt-explorer,
   title = {{I/O Bottleneck Detection and Tuning: Connecting the Dots using Interactive Log Analysis}},
   author = {Bez, Jean Luca and Tang, Houjun and Xie, Bing and Williams-Young, David and Latham, Rob and Ross, Rob and Oral, Sarp and Byna, Suren},
   booktitle = {2021 IEEE/ACM 6th International Parallel Data Systems Workshop (PDSW)},
   year = {2021}
}

License Agreement

DXT Explorer Copyright (c) 2021, The Regents of the University of California, through Lawrence Berkeley National Laboratory (subject to receipt of any required approvals from the U.S. Dept. of Energy). All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

(1) Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

(2) Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

(3) Neither the name of the University of California, Lawrence Berkeley National Laboratory, U.S. Dept. of Energy nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the features, functionality or performance of the source code (“Enhancements”) to anyone; however, if you choose to make your Enhancements available either publicly, or directly to Lawrence Berkeley National Laboratory, without imposing a separate written license agreement for such Enhancements, then you hereby grant the following license: a non-exclusive, royalty-free perpetual license to install, use, modify, prepare derivative works, incorporate into other computer software, distribute, and sublicense such enhancements or derivative works thereof, in binary and source code form.