From d1ed518212a1f938ac1ea78bd7d06556716026a1 Mon Sep 17 00:00:00 2001
From: Adrian Pope <apope@anl.gov>
Date: Mon, 9 Nov 2020 12:51:21 -0600
Subject: [PATCH] Update README.md

---
 README.md | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 0fb17ac..a3146d1 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,87 @@
-GenericIO - For more information, please visit the wiki: https://xgitlab.cels.anl.gov/hacc/genericio/-/wikis/home
+# GenericIO
+
+GenericIO is a write-optimized library for writing self-describing scientific data files on large-scale parallel file systems.
+
+## Reference
+
+Habib, et al., HACC: Simulating Future Sky Surveys on State-of-the-Art Supercomputing Architectures, New Astronomy, 2015
+(http://arxiv.org/abs/1410.2805).
+
+## Source Code
+
+A source archive is available here: [genericio-20190417.tar.gz](http://www.mcs.anl.gov/~turam/genericio/genericio-20190417.tar.gz), or from git:
+
+```bash
+  git clone https://xgitlab.cels.anl.gov/hacc/genericio.git
+```
+## Output file partitions (subfiles)
+
+If you're running on an IBM BG/Q supercomputer, the number of subfiles (partitions) is chosen automatically based on the I/O nodes. Otherwise, by default, the GenericIO library picks the number of subfiles using a fairly naive hostname-based hashing scheme. This works reasonably well on small clusters, but not on larger systems. On a larger system, you might want to set these environment variables:
+
+```bash
+  export GENERICIO_PARTITIONS_USE_NAME=0
+  export GENERICIO_RANK_PARTITIONS=256
+```
+
+Here, the number of partitions (256 above) determines the number of subfiles used. If you're using a Lustre file system, for example, an optimal number of files satisfies:
+
+```
+  # of files * stripe count  ~ # OSTs
+```
+
+On Titan, for example, there are 1008 OSTs and a default stripe count of 4 (1008 / 4 = 252), so we use approximately 256 files.
+
+## Benchmarks
+
+Once you build the library and associated programs (using make), you can run, for example:
+
+```bash
+  $ mpirun -np 8 ./mpi/GenericIOBenchmarkWrite /tmp/out.gio 123456 2
+  Wrote 9 variables to /tmp/out (4691036 bytes) in 0.2361s: 18.9484 MB/s
+```
+
+```bash
+  $ mpirun -np 8 ./mpi/GenericIOBenchmarkRead /tmp/out.gio
+  Read 9 variables from /tmp/out (4688028 bytes) in 0.223067s: 20.0426 MB/s [excluding header read]
+```
+
+The read benchmark always reads all of the input data. The write benchmark takes two numerical parameters: the first is the number of data rows to write, and the second is a random seed (which slightly perturbs the per-rank output sizes). Each row is 36 bytes for these benchmarks.
+
+The write benchmark can be passed the `-c` parameter to enable output compression. Both benchmarks take an optional `-a` parameter to request that homogeneous aggregates (i.e. "float4") be used instead of separate arrays for each position/velocity component.
+
+## Python module
+
+The repository includes a genericio Python module that can read genericio-formatted files and return numpy arrays. This is included in the standard build. To use it, once you've built genericio, you can read genericio data as follows:
+
+```bash
+$ export PYTHONPATH=${GENERICIO_DIR}/python
+$ python
+>>> import genericio
+>>> genericio.gio_inspect('m000-99.fofproperties')
+Number of Elements: 1691
+[data type] Variable name
+---------------------------------------------
+[i 32] fof_halo_count
+[i 64] fof_halo_tag
+[f 32] fof_halo_mass
+[f 32] fof_halo_mean_x
+[f 32] fof_halo_mean_y
+[f 32] fof_halo_mean_z
+[f 32] fof_halo_mean_vx
+[f 32] fof_halo_mean_vy
+[f 32] fof_halo_mean_vz
+[f 32] fof_halo_vel_disp
+
+(i=integer,f=floating point, number bits size)
+>>> genericio.gio_read('m000-99.fofproperties','fof_halo_mass')
+array([[  4.58575588e+13],
+       [  5.00464689e+13],
+       [  5.07078771e+12],
+       ..., 
+       [  1.35221006e+13],
+       [  5.29125710e+12],
+       [  7.12849857e+12]], dtype=float32)
+
+```
 
 [Click here to go to the README for the alternative python interface](new_python/README.md)
\ No newline at end of file
-- 
GitLab