N-Body / Particle Simulation Methods
Amara's Recap of Particle Simulation Methods ("ARPSM" !!)
My own experience with N-body simulation methods is with schemes using direct integration of forces ("Particle-Particle" methods), and with symplectic methods. When a discussion in August 1995 on the sci.math.num-analysis newsgroup about N-body/particle simulation methods caught my attention, I wrote this article to keep track of the different schemes (too many acronyms!), and to learn what researchers in the field are using today.

Amara Graps
Last Revision: 20 March 2000.
ASCII version available from: ftp://ftp.amara.com/papers/nbody.txt.
WWW version: here.
German Mirror WWW version:
http://galileo.mpi-hd.mpg.de/~graps/other/nbody.html.
UK Mirror WWW version:
http://www.maths.napier.ac.uk/gavin/nbody.html.
Contents
Particle-Particle (PP)
Particle-Mesh (PM)
Particle-Particle/Particle-Mesh (P3M)
Particle Multiple-Mesh (PM2)
Nested Grid Particle-Mesh (NGPM)
Tree-Code (TC) Top Down
Tree-Code (TC) Bottom Up
Fast-Multipole-Method (FMM)
Tree-Code Particle Mesh (TPM)
Self-Consistent Field (SCF)
Symplectic Method
Changes and Additions
Acknowledgements
Personal Correspondence
Particle-Particle (PP)
The Particle-Particle method is the simplest. The method is to:
- Accumulate forces by finding the force F(i,j) of particle j on particle i,
- Integrate the equations of motion (which includes the accumulated forces), and
- Update the time counter.
- Repeat for the next time step.
For example, in a gravitational N-body simulation, a particle of mass M attracts another particle of mass m with a force -(GMm/r^3)*r. With N particles, computing the force on each one requires summing over the other (N-1) particles. Then you separate the equation of motion into two first-order differential equations, one for the acceleration and one for the velocity. Finally, use an integration scheme like Euler or Runge-Kutta to advance the positions and velocities. (Simple, eh?)
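The steps above can be sketched in a few lines. This is a minimal illustration, not code from the article: a direct O(N^2) force accumulation with unit G, a plain Euler step, and a small softening parameter eps (an assumption added here to keep the denominators finite):

```python
import numpy as np

def accelerations(pos, mass, G=1.0, eps=1e-3):
    """Direct O(N^2) accumulation: a_i = sum_j -G m_j (r_i - r_j)/|r_i - r_j|^3."""
    n = len(mass)
    acc = np.zeros_like(pos)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = pos[i] - pos[j]
            r = np.sqrt(d @ d) + eps          # small softening keeps 1/r^3 finite
            acc[i] += -G * mass[j] * d / r**3
    return acc

def step_euler(pos, vel, mass, dt):
    """One Euler step: the 2nd-order ODE split into velocity and position updates."""
    acc = accelerations(pos, mass)
    vel = vel + dt * acc
    pos = pos + dt * vel
    return pos, vel
```

A real simulation would use at least a leapfrog or Runge-Kutta integrator, but the loop structure (accumulate forces, integrate, advance time) is the same.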
While the particle-particle method is the most straightforward of the N-body methods, the computational physicist must still think carefully about the numerical details of casting the theoretical physics of the problem into digital form, in order to derive results that are physically plausible. For example, as particles approach each other, the forces between them, and hence the accelerations, become much larger. If one uses a constant time-step in the integration scheme to calculate the velocities and positions, one is likely to encounter computer overflow errors giving nonsense numbers. To avoid this situation, consider a numerical integration scheme that uses variable time-steps instead of constant ones. Such a scheme should automatically cut down the time-step when the particles are near each other, and increase it when they are far apart.
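One simple-minded recipe for such a variable time-step (an illustrative sketch, not a scheme from the article) is to scale dt by the smallest pairwise "approach time" r/|v_rel|, so the step shrinks automatically during close encounters; eta and dt_max are invented tuning parameters:

```python
import numpy as np

def adaptive_dt(pos, vel, dt_max=0.1, eta=0.05):
    """Shrink the time-step when any pair is close and fast:
    dt = min over pairs of eta * r / |v_rel|, capped at dt_max."""
    n = len(pos)
    dt = dt_max
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(pos[i] - pos[j])
            v = np.linalg.norm(vel[i] - vel[j]) + 1e-12   # avoid divide-by-zero
            dt = min(dt, eta * r / v)
    return dt
```

Production codes (Bulirsch-Stoer, Hermite schemes) control the step with a proper local error estimate, but the qualitative behavior is the same: small steps near encounters, large steps otherwise.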
The Particle-Particle direct integration approach is flexible but has a high computational cost: O(N^2) operations are required to evaluate the forces on all N particles. If you have fewer than about N=1000 particles and are interested in the close-range dynamics of the particles (or if you have more particles but special hardware), then this method is the most straightforward.
GRAPE/HARP computer
To simulate tens of thousands of particles by a highly-accurate particle-particle method, the computational physicist can expect his simulation to perform at least 10^{14} floating-point operations. Such considerations inspired the building of the GRAPE computer by Junichiro Makino and colleagues in Tokyo. The GRAPE (acronym for "GRAvity PipE") computer is built around specialized pipeline processors for computing the gravitational force between particles. Makino et al. have refined the technology further by including chips that perform a Hermite integration, in order to reduce memory requirements and to allow somewhat larger time steps. They call this specialized GRAPE computer HARP, an acronym for "Hermite AcceleratoR Pipeline."
This figure shows a specially designed GRAPE-4 system, the HARP computer, which gives hardware support for a Hermite integration scheme. Photo taken by A. Graps at the Astronomisches Rechen-Institut in Heidelberg, Germany in December 1998. A colleague's watch provides a sense of scale for this small, innocent-looking computer.
Some WWW N-Body Particle-Particle Simulation Examples
- http://www.amara.com/past/particles.html The variable time-step integration method of Bulirsch-Stoer is my personal favorite, and I have a Web-browsable write-up of a charged-particle simulation using the Bulirsch-Stoer scheme here.
References
Any numerical analysis text worth anything will have chapters on integration methods. My two favorite references that highlight physics applications are listed below.
- Harvey Gould and Jan Tobochnik, An Introduction to Computer Simulation Methods, Parts 1 and 2, Addison Wesley, 1988.
- Alejandro Garcia, Numerical Methods for Physics, Prentice-Hall, 1994.
Particle-Mesh (PM)
The Particle-Mesh method treats the force as a field quantity by approximating it on a mesh. Differential operators, such as the laplacian, are replaced by finite difference approximations. Potentials and forces at particle positions are obtained by interpolating on the array of mesh-defined values. Mesh-defined densities are calculated by assigning particle attributes (e.g. "charge") to nearby mesh points in order to create the mesh-defined values (e.g. "charge density").
So the principal steps of the particle-mesh calculation are:
- Assign "charge" to the mesh ("particle mass" becomes "grid density"),
- Solve the field potential equation (e.g. Poisson's) on the mesh,
- Calculate the force field from the mesh-defined potential,
- Interpolate the force on the grid to find forces on the particles.
- Now like the PP: integrate the forces to get particle positions and velocities.
- Update the time counter.
Some details.
For Step 1), many schemes exist to assign charges to the mesh. The best ones are those that reduce the fluctuations of the force when the particles are near each other. This reduction in force fluctuation can be quantified with a constraint called "smoothness": the continuity of the derivatives of the approximated function used in assigning charges to the mesh.
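As a concrete illustration of Step 1), here is a 1D Cloud-in-Cell-style assignment (a hedged sketch: real codes work in 3D and vectorize this loop), in which each particle's "charge" is shared linearly between its two nearest grid points:

```python
import numpy as np

def cic_assign_1d(x, mass, ng, box):
    """Cloud-in-Cell assignment on a 1D periodic mesh of ng points: linear
    weights keep the mesh density (hence the force) continuous as particles
    move, unlike nearest-grid-point assignment."""
    dx = box / ng
    rho = np.zeros(ng)
    for xi, mi in zip(x, mass):
        s = xi / dx                   # position in grid units
        i = int(np.floor(s))
        w = s - i                     # fractional offset toward point i+1
        rho[i % ng]       += mi * (1 - w) / dx
        rho[(i + 1) % ng] += mi * w / dx
    return rho
```

The total "charge" is conserved exactly; higher-order schemes like TSC simply spread each particle over more points with smoother weights.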
For Step 2), the PM method solves the Poisson equation:
del^2(phi) = 4 * pi * G * rho
for the gravitational potential phi, (usually) using an FFT technique. The transform is computed by multiplication with a suitable Green's function for the Poisson equation (which contains a spatially-varying part of the mass density field). Periodic boundary conditions are usually used to simulate an "infinite universe."
The simplest type of PM method is the "Nearest-Grid-Point" (NGP) particle-mesh method, where the density at a mesh point is assigned as the total amount of "charge" in the cell surrounding the mesh point, divided by the cell volume. However, this method is rarely used. One of its drawbacks is that it gives forces that are discontinuous in value.
The "Cloud-in-Cell" (CIC) PM scheme gives a better approximation to the force by replacing the nearest-grid-point assignment with a scheme involving the 2^N nearest neighbors (where N is the dimensionality of the problem). The CIC method gives forces continuous in value, but discontinuous in their first derivative.
A more accurate PM scheme is the "Triangular-Shaped-Cloud" (TSC) method. This scheme has an assignment interpolation function that is piecewise quadratic. In one dimension it employs three mesh points. (see Hockney and Eastwood, pg. 136)
The main advantage of the PM methods is speed. The number of computations is of order O(N + Ng log Ng) where Ng is the number of grid points. The slowest step is where the scientist solves the potential equation, usually with the help of a Poisson solver, which often relies on a fast Fourier transform (under some assumptions, a wavelet transform can be utilized instead, further speeding up the potential calculation). Other numerical methods may be used to solve the potential (Poisson's) equation too, such as a finite element method or a finite volume method.
One large reason for using the PM method is for situations when you want to suppress two-body collisionality. Some tests documenting this are in Melott et al, 1989.
The PM method is basically unacceptable for studying close encounters between particles ("correlated systems") because the algorithm, in effect, treats particles as being fuzzy, nearly round balls (varies depending on which assignment scheme is used). This method is good for simulations where you want a "softening" of the inverse square law force. In general, the mesh spacing should be smaller than the wavelengths of importance in the physical system.
Another disadvantage of the mesh-based methods is that they have difficulty handling nonuniform particle distributions. This means that the PM methods offer limited resolution. To overcome the limited resolution, some researchers have developed PM algorithms which employ meshes of finer gridding in selected subregions of the system. These finer meshes permit a more accurate modeling of regions of higher density. If one wishes to further refine the grid due to large dynamical changes in the system (for example, due to shock waves), then one can apply moving grids or adaptive grids. "Staggered mesh" schemes have been found to significantly improve the resolution. (See Melott, 1986; also, in your literature searches on staggered mesh schemes, look for references by Park.)
References
- Edmund Bertschinger and James M. Gelb, "Cosmological N-Body Simulations," Computers in Physics, Mar/Apr 1991, pp 164-179.
- R.W. Hockney and J.W. Eastwood, Computer Simulation Using Particles, Institute of Physics Publishing, 1988.
- A.L. Melott, Comment on "Nonlinear Gravitational Clustering in Cosmology", Physical Review Letters, 56, (1986), 1992.
- P.J.E. Peebles, A.L. Melott, M.R. Holmes, and L.R. Jiang, "A Model for the Formation of the Local Group," Ap J, 345, (1989) 108.
Some Research That Uses and Tests This Technique
- G. Efstathiou, C. Frenk, S.D.M. White, M. Davis, "Gravitational Clustering from Scale-Free Initial Conditions," MNRAS, 235, 1988, 715-748.
- M. Davis, G. Efstathiou, C. Frenk, and S.D.M. White, "The Evolution of Large-Scale Structure in a Universe Dominated by Cold Dark Matter," ApJ, 292, 1985, 371-394.
- A. Melott, J. Einasto, E. Saar, I. Suisalu, A. Klypin and S. Shandarin, "Cluster Analysis of the Nonlinear Evolution of Large Scale Structure in an Axion/Gravitino/Photino Dominated Universe, Physical Review Letters, 51, (1983), 935.
- A.L. Melott and S.F. Shandarin, "Controlled Experiments in Cosmological Gravitational Clustering," Ap J, 410, (1993), 469.
- B. Kuhlman, A.L. Melott and S.F. Shandarin, "A Test of the Particle Paradigm in N-Body Simulations," Ap JL, 470, (1996), L41-L44.
- Randall J. Splinter, Adrian L. Melott, Sergei F. Shandarin, Yasushi Suto, "Fundamental Discreteness Limitations of Cosmological N-Body Clustering Simulations," preprint at http://xxx.lanl.gov/abs/astro-ph/9706099. Revised version published in Astrophysical Journal, 497 (1998), 38-61.
Particle-Particle/Particle-Mesh (P3M)
The P3M method solves the major shortcoming of the PM method: the low-resolution forces computed for particles near each other. This method supplements the interparticle forces with a direct sum over pairs separated by less than about 3*del_x, where del_x is the grid spacing. The interparticle forces are split into a rapidly-varying short-range part and a slowly-varying long-range part. The PP method is used to find the total short-range contribution to the force on each particle, and the PM method is used to find the total slowly-varying force contributions.
Two meshes are used in these algorithms: a "charge-potential" mesh, and a "chaining" (coarser) mesh. The charge-potential mesh is used at different stages of the PM calculation to store charge density values, charge harmonics, potential harmonics, and potential values. The chaining mesh is an array of cells to locate pairs of neighboring particles in the short-range calculation.
The scheme looks like:
- Start with the position and momenta of the particles,
- Update time step,
- Calculate PM forces (assign charge to mesh, solve for potentials, interpolate forces, increment momenta),
- Calculate PP forces (fill chaining mesh and update momenta),
- Create equations of motion (update positions, apply particle boundary conditions, update energy accumulators),
- Integrate to get position and momenta.
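The short-range/long-range split above can be illustrated on a single pair. The smoothstep switch S(r) below is an invented illustrative choice; real P3M codes derive the split from the mesh's charge-shaping function, but the idea is the same: the long-range part stays smooth (mesh-resolvable) as r goes to 0, while the short-range remainder vanishes beyond the cutoff:

```python
import numpy as np

def split_force(r, r_cut, G=1.0, m1=1.0, m2=1.0):
    """Split the pairwise 1/r^2 force magnitude into a short-range part
    (zero beyond r_cut, handled by direct PP summation) and a smooth
    long-range remainder (handled on the mesh)."""
    f_total = G * m1 * m2 / r**2
    s = np.clip(r / r_cut, 0.0, 1.0)
    S = s * s * (3.0 - 2.0 * s)       # smoothstep: 0 at r=0, 1 beyond r_cut
    f_long = f_total * S              # slowly varying; finite even as r -> 0
    f_short = f_total - f_long        # rapidly varying; vanishes past r_cut
    return f_short, f_long
```

Beyond r_cut the PP supplement is exactly zero, which is what keeps the direct-sum cost bounded as long as particles do not clump too strongly (the failure mode discussed below).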
The number of operations is of order O(N + Ng) (N = number of particles, Ng = number of grid points). The method has been widely used in cosmological simulations, and is easiest to use when the forces can be readily split between short-range and long-range.
Some work in applying wavelet bases and using Fast Wavelet Transforms has extended the resolution and cut down the order of computation to be closer to O(Ng).
The disadvantage of the P3M algorithm is that it too easily becomes dominated by the direct summation part (the inter-particle force "supplement").
A solution to this problem is the family of adaptive methods explored by many researchers from the mid 1980s through the mid 90s. For example: Pen's (1993) "Linear Moving Adaptive PM," the nested-grid adapted PM codes by Chan (1986) and Anninos (1993), and Villumsen's (1987) Hierarchical Particle Mesh (HPM) code. See Splinter's 1996 paper for a description of these. In the later 1980s, Hugh Couchman constructed his "mesh-refined" P3M algorithm, which is used by many people working with PM-type methods now. The algorithm uses subgrids to speed up the direct summation of nearby neighbors by shifting some of the burden to high-resolution FFTs. This solution may well be faster than some of the hierarchical tree codes, described next.
References
- Brandt & Lubrecht, J. Comp. Phys. 90, 348 (1990).
- Gregory Beylkin, Ronald R. Coifman, and Vladimir Rokhlin. "Fast Wavelet Transforms and Numerical Algorithms I." Comm on Pure and Applied Math, XLIV: 141-183, 1991.
- Edmund Bertschinger and James M. Gelb, "Cosmological N-Body Simulations," Computers in Physics, Mar/Apr 1991, pp 164-179.
- R.W. Hockney and J.W. Eastwood, Computer Simulation Using Particles, Institute of Physics Publishing, 1988.
- Hugh M. P. Couchman, ApJL, 368, L23 (1991). See also his descriptions on the Web:
http://www-hpcc.astro.washington.edu/simulations/DARK_MATTER/adapintro.html
and:
http://www-hpcc.astro.washington.edu/simulations/DARK_MATTER/
- Splinter, Randall, "A Nested Grid Particle-Mesh Code for High Resolution Simulations of Gravitational Instability in Cosmology," MNRAS, 281, (1996), 281. (This paper also has a nice review of many of the other N-body methods. 3 Mb postscript document) ftp://asta.pa.uky.edu/cosmology/ngpm.ps
- B.A. Luty, I.G. Tironi and W.F. van Gunsteren (1995). "Lattice-sum Methods for Calculating Electrostatic Interactions in Molecular Simulations," J. Chem. Phys. 103 3014-3021. (Brock Luty's study of periodic boundary conditions for P3M)
- B.A. Luty and W.F. van Gunsteren (1996). "Calculating Electrostatic Interactions Using the Particle-Particle Particle-Mesh Method with Nonperiodic Long-Range Interactions" J. Phys. Chem. 100 2581-2587.
Particle Multiple-Mesh (PM2)
Now we come to the nested-grid particle-mesh methods.
The goal of the nested-grid particle methods is to increase the length dynamic range, which is the ratio between the size of the system and the smallest resolved scale. In a plain Particle-Mesh method, the dynamic range is the number of grid zones. In a Particle-Particle/Particle-Mesh, it is the ratio between the box size and the particle softening or the interparticle separation, whichever is larger. In the nested-grid methods it's the ratio between the box size and the spacing of the finest subgrid.
The "Nested-Grid" approaches cover a broad category where one can distinguish between two large classes: particle-multiple-mesh (PM2) schemes and hierarchical particle-mesh (HPM) schemes. The philosophies underlying these two subclasses are subtly different, and dictate somewhat different treatments of the interface between grids. I discuss the hierarchical particle-mesh with the "NGPM method" in the next section. Here I describe PM2.
The PM2 method describes a nested particle mesh in which the particles have a "life" of their own, independently of the grids. This is because there is only one set of particles, existing independently of any subgrids, and so the "back-reaction" from the subgrid solution to the parent grid is automatically included, by the particles themselves. The entire grid substructure can be chosen fresh at every step.
The principal authors of this scheme are Jessop et al. (1994) and Gelato et al. (1996, and the references therein).
One implementation of PM2 by Gelato et al. uses an initial mass granularity that already matches the desired resolution on the subgrid. This differs from the NGPM method (described next), which uses less massive particles on each subgrid.
The scheme is roughly as follows. Steps 1 through 3 look like the other particle-mesh methods.
- Assume a parent or top grid. Assign "charge" to the mesh ("particle mass" becomes "grid density"), using a Cloud-in-Cell algorithm.
- Use an FFT with a discrete Green's function to solve for the potential on the grid, and obtain accelerations for each particle.
- Create boundary conditions on the parent grid. (Gelato uses isolated boundary conditions.)
- Introduce subgrids to compute inter-particle forces. The subgrid placement occurs automatically based on particle number density and other user-tunable parameters.
- Compute the forces for the subgrid particles.
- Generate the density array on the subgrid.
- Solve the field potential equation (e.g. Poisson's) on the sub-grid, for the gravitational potential phi, using an FFT technique.
- Impose boundary conditions on the subgrid. Particular to Gelato's scheme is to impose isolated boundary conditions and to use a buffer zone when one has adjacent subgrids to help smooth any force or density transients. Buffer zones are used differently here than in the NGPM scheme.
- Integrate the forces to get particle positions and velocities.
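Step 4's automatic subgrid placement might look like the following 2D sketch, where a simple particle-count threshold stands in for the user-tunable parameters mentioned above (an illustration, not Gelato et al.'s actual criterion):

```python
import numpy as np

def flag_subgrid_cells(pos, ng, box, threshold):
    """Count particles per parent-grid cell and flag the dense cells for
    refinement; a subgrid would then be laid over the flagged regions."""
    dx = box / ng
    counts = np.zeros((ng, ng), dtype=int)
    for x, y in pos:
        counts[int(x / dx) % ng, int(y / dx) % ng] += 1
    return counts > threshold
```

Because the particles exist independently of the grids, this flagging (and hence the entire subgrid structure) can be redone from scratch at every step.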
This method is useful for the simulation of collisionless systems of gravitating particles. It is aimed at simulations of relatively small volumes of space (not much larger than a single group of galaxies). It performs similarly to other Particle-Mesh techniques.
The cost per step is roughly O(Ng log Ng) for the potential solver and O(Np) for the charge assignment, and force interpolation. Ng is the number of grid zones, Np the number of particles.
References
- Sergio Gelato, David F. Chernoff, Ira Wasserman, "An Adaptive Hierarchical Particle-Mesh code with Isolated Boundary Conditions," ApJ accepted, 1997 May 1 issue.
You can download this paper from: http://xxx.lanl.gov/abs/astro-ph/9607156
- Jessop, Duncan & Chau 1994, J. Comp. Phys. 115, 339.
Nested Grid Particle-Mesh (NGPM)
The Nested Grid Particle-Mesh is a particle-mesh scheme developed by several researchers: Anninos et al. (1994), Villumsen (1988), and others; and most recently by Randall Splinter (University of Kentucky). It uses a series of nested lattices in which the force _and_ mass resolution increase for the more finely spaced sub-lattices. This latter point is an important distinction between this and the Tree / P3M methods. Those other methods increase the force resolution without increasing the mass resolution. The NGPM method does both, introducing a new set of less massive particles on each subgrid.
A good way to think of the NGPM is as a "Monte Carlo solver" for a one-particle distribution function in a collisionless system. The "particles" are constructs that sample the phase-space distribution of the much smaller, more numerous, real particles of the physical system.
- Assume a parent grid. Assign "charge" to the mesh ("particle mass" becomes "grid density").
- Define a sub-grid anywhere within the parent grid consisting of n^3 sub-grid particles. The sub-grid particle mass becomes m_p/n_s^3 where p = the parent grid and s = the sub-grid.
- One can use an arbitrary number of parent-grid cells and sub-grid cells, with a uniform spacing on both grids. Also, the sub-grid is fixed and doesn't move during the simulation. A buffer zone (which includes both an inner and outer zone) is included around the sub-grid to help smooth any force or density transients.
- Calculate the forces on the parents from the parent-grid mesh-defined potential.
- Calculate the forces on the sub-grid particles:
- Set the density over the sub-grid region equal to the mean over the sub-grid (erases all of the fluctuations on the sub-grid but leaves the total mass)
- Solve the field potential equation (e.g. Poisson's) on the sub-grid, for the gravitational potential phi, using an FFT technique. The transform is computed by multiplication with a suitable Green's function for the Poisson equation, and when inverse-transformed, it represents the contribution to the sub-grid potential from parent particles that have not entered the sub-grid.
- This potential is interpolated using a "Cloud-in-Cell" (CIC) scheme to compute the background forces on the sub-grid particles.
- Calculate the forces on the sub-grid particles due to themselves. An FFT technique is used, imposing quasi-isolated boundary conditions.
- Integrate the forces to get particle positions and velocities. Asynchronous time steps are used.
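The mass-refinement idea from step 2 (sub-grid particles of mass m_p/n_s^3) can be sketched as follows; the lattice placement of the children around the parent is an illustrative choice, and the key invariant is that the total mass is preserved exactly:

```python
import numpy as np

def refine_particle(parent_pos, parent_mass, n_s, spacing):
    """Replace one parent particle by n_s^3 lighter sub-grid particles of
    mass m_p / n_s^3 on a small cubic lattice around the parent, so the
    sub-grid gains mass resolution as well as force resolution."""
    offsets = (np.arange(n_s) - (n_s - 1) / 2.0) * spacing
    grid = np.array([(i, j, k) for i in offsets for j in offsets for k in offsets])
    children = parent_pos + grid
    child_mass = np.full(len(children), parent_mass / n_s**3)
    return children, child_mass
```

This is the step that Tree and P3M codes do not take: they sharpen the force on the same particles, whereas NGPM also samples the phase-space distribution more finely inside the sub-grid.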
The asynchronous time-steps are an unusual feature and an improvement to other nested grid methods. Splinter gives reasons in his paper for being able to evolve the sub-grid distribution separately from the parent grid safely.
The reference paper doesn't list the order of computation, but it is probably between O(log Ng) and O(Ng log Ng).
References
- Anninos, et al. ApJ, 436, (1994), 11.
- Villumsen, 1988, ApJS, 71, (1988), 407.
- Splinter, Randall, "A Nested Grid Particle-Mesh Code for High Resolution Simulations of Gravitational Instability in Cosmology," MNRAS, 281, (1996), 281. (This paper also has a nice review of many of the other N-body methods. 3 Mb postscript document) ftp://asta.pa.uky.edu/cosmology/ngpm.ps
Tree Code (TC) - Top Down
The tree codes approach the N-body simulation with the same force superposition concept as the P3M algorithm:
Force = external_force + nearest_neighbor_force + far_field_force
The external forces can be computed for each particle independently. The nearest-neighbor forces only require interactions with nearest neighbors. The far-field forces, like gravity, are more expensive to compute because the force on each particle depends on all the other particles. The simplest expression for a far-field force F(i) on particle i is

for i = 1 to n
    F(i) = sum_{j=1, j != i}^n F(i,j)
end for

where F(i,j) is the force on particle i due to particle j. Thus, the cost seems to rise as O(n^2), whereas the external and nearest-neighbor forces appear to grow like O(n). Even with parallelism, the far-field forces will be much more expensive.
The Barnes-Hut Tree Code (top down) algorithm was first presented in 1986, and is based on an earlier algorithm of A. Appel in 1985. It is widely used in astrophysics, and has been thoroughly parallelized. It addresses the expensive far-field force calculations in the following clever "divide-and-conquer" way.
- Build a "quadtree,"
- For each subsquare in the quadtree, compute the center of mass and total mass for all the particles it contains,
- For each particle, traverse the tree to compute the force on it.
Some Details.
For Step 1), building a quadtree means building a hierarchy of boxes which refine the computational domain into smaller and smaller regions. At the coarsest level, we have the entire domain. Refinement level l+1 is obtained recursively from level l by subdividing each box into two equal parts in each direction (four children in 2D, eight in 3D). And so on.
For Steps 2) and 3), the tree structures partition the mass distribution into a hierarchy of localized regions, so that when calculating the force on a given particle, the "tree" region near the particle in question is explored in detail, and the more distant regions are explored more coarsely by treating distant clumps of particles as single massive pseudo-particles.
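A compact 2D version of these ideas (a quadtree sketch with the usual opening-angle test; in 3D each box would have eight children; this is an illustration, not Barnes and Hut's actual code):

```python
import numpy as np

THETA = 0.5   # opening angle: a node whose size/distance < THETA acts as one pseudo-particle

class Node:
    """One square region of the quadtree, holding total mass and center of mass."""
    def __init__(self, center, half):
        self.center, self.half = np.asarray(center, float), half
        self.mass, self.com = 0.0, np.zeros(2)
        self.children, self.body = None, None

    def insert(self, pos, m):
        pos = np.asarray(pos, float)
        if self.body is None and self.children is None:
            self.body = (pos, m)                    # empty leaf takes the particle
        else:
            if self.children is None:               # occupied leaf: subdivide first
                self.children = [Node(self.center + 0.5 * self.half * np.array(o),
                                      0.5 * self.half)
                                 for o in ((-1, -1), (-1, 1), (1, -1), (1, 1))]
                old, self.body = self.body, None
                self._quadrant(old[0]).insert(*old)
            self._quadrant(pos).insert(pos, m)
        self.com = (self.com * self.mass + pos * m) / (self.mass + m)
        self.mass += m

    def _quadrant(self, pos):
        return self.children[2 * (pos[0] >= self.center[0]) + (pos[1] >= self.center[1])]

def tree_force(node, pos, G=1.0, eps=1e-9):
    """Walk the tree: distant clumps act as single massive pseudo-particles
    at their center of mass; nearby nodes are opened and explored in detail."""
    if node.mass == 0.0:
        return np.zeros(2)
    d = node.com - pos
    r = np.sqrt(d @ d) + eps
    if node.children is None or 2.0 * node.half / r < THETA:
        if node.body is not None and np.allclose(node.body[0], pos):
            return np.zeros(2)                      # skip self-interaction
        return G * node.mass * d / r**3             # force per unit target mass
    return sum(tree_force(c, pos, G, eps) for c in node.children)
```

Smaller THETA opens more nodes (more accuracy, more work); THETA = 0 degenerates to the O(N^2) direct sum.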
Tree Codes are gridless, have no preferred geometry and can incorporate either vacuum or periodic boundary conditions. In addition, they waste no time simulating regions devoid of matter. Hence, Tree Codes are particularly effective for modeling collisions between galaxies.
Forces on all particles are obtained with O(N log N) operations. The down side is that tree codes require a large amount of auxiliary storage.
Note: The standard Barnes-Hut Tree Code is not, however, as accurate as the Fast Multipole Method (FMM), to be discussed later. The Barnes-Hut type algorithms are essentially dumb FMM algorithms with order-0 multipole expansions. FMM would only be slower than Barnes-Hut if you want more accuracy (and for a given accuracy, FMM is likely faster).
McMillan and Aarseth (1993) remedy the limited-accuracy shortcoming by using a higher-order integration scheme, along with block time steps and multipole expansions. The tree is not reconstructed at every step, but is instead predicted along with the particles, with individual nodes rebuilt as needed.
References
- A wonderful explanation of the serial Barnes-Hut algorithm can be found on the Web at UC Berkeley as a course lecture. Go to: http://www.icsi.berkeley.edu/cs267/lecture26/lecture26.html. (For more information about this site, see the Other N-Body WWW Sources below.)
- Another very nice explanation of Tree Codes can be found at: http://www.infomall.org/npac/pcw/node278.html from the book: Parallel Computing Works. This book describes a set of application and systems software research projects undertaken by the Caltech Concurrent Computation Program (CP) from 1983-1990.
- You can find several versions of Barnes' Tree Code at: ftp://hubble.ifa.hawaii.edu/pub/barnes/treecode/
- A.W. Appel, "An Efficient Program for Many-Body Simulations," SIAM J. Sci. Stat. Comput., 6 (1985), 85-103.
- J. Barnes and P. Hut, "A Hierarchical O(n log n) Force Calculation Algorithm," Nature, v. 324, December 1986.
- Joshua E. Barnes and Lars E. Hernquist, "Computer Models of Colliding Galaxies," Physics Today, March 1993, pp 54-61.
- J. Barnes and L. Hernquist, Annual Reviews of Astronomy and Astrophysics, 1992.
- L. Hernquist, ApJ Supplement, 64, 1987, 715.
- S. L. W. McMillan and S. J. Aarseth (1993). "An O(NlogN) Integration Scheme for Collisional Stellar Systems", Ap. J., Vol. 414, p. 200-212.
- Susanne Pfalzner and Paul Gibbon, Many-Body Tree Methods in Physics, Cambridge University Press, 1996. A clear description and introduction to tree methods. I like it. You can get more information about the book from this Europe Web site and this U.S. site.
- Sellwood, J.A. 1989, MNRAS, 238, 115.
- J. Salmon and M. Warren, "Skeletons from the Treecode Closet", J. Comp. Phys. v 111, n 1, 1994; or http://www.ccsf.caltech.edu/~johns/ftp/nbody/skeletons.ps.Z
- For more Salmon bibliographic references related to Barnes-Hut, see: John Salmon's "Electronically available publications"
- The following report describes an implementation of Salmon's version of the Barnes-Hut algorithm to run on the CM-5. The code is written so that it is relatively easy to port other tree algorithms (the Greengard/Rokhlin 3D adaptive FMM, for example).
- P. Liu and S. Bhatt, Experiences with Parallel N-body simulations, to appear in the Proceedings of the 1994 ACM Symposium on Parallel Algorithms and Architectures.
- P. Liu, The parallel implementation of N-body algorithms, Ph.D. dissertation, Yale University, 1994.
- An N-body Tree method that I have not had time to look at is Andrey Kravtsov's "Adaptive Refinement Tree (ART)" (see his publication list), a mesh-based Eulerian code which adaptively refines meshes in the places where higher force resolution is desired. This method is used for large high-resolution cosmological simulations.
Tree Code (TC) - Bottom Up
In this scheme, sometimes referred to as the "Press" Tree, the tree is built from the bottom up, rather than the top down, by taking the two particles nearest each other and calling them leaf nodes. Take the next two particles that are nearest each other and tie them together, too. Keep tying particles together in pairs like this until there are fewer than two particles left. Then do the same thing, except with the centers of mass (CM) of each of these particle pairs. Then do the same thing with the CM of the particle quadruplets. Keep going until finally you connect the two CMs of the two halves of the system.
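One level of this bottom-up pairing can be sketched as follows (an illustrative O(n^2) nearest-pair search; a real code would use a faster neighbor search). Applying the function repeatedly builds the tree level by level, from particles to pair CMs to quadruplet CMs, and so on:

```python
import numpy as np

def pair_up(nodes):
    """One bottom-up pass: repeatedly tie the two nearest remaining nodes
    together into a parent node at their center of mass. `nodes` is a list
    of (position, mass) pairs."""
    nodes = list(nodes)
    parents = []
    while len(nodes) >= 2:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                d = np.linalg.norm(nodes[i][0] - nodes[j][0])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        (p1, m1), (p2, m2) = nodes[i], nodes[j]
        parents.append(((m1 * p1 + m2 * p2) / (m1 + m2), m1 + m2))
        for k in (j, i):                    # remove the larger index first
            nodes.pop(k)
    return parents + nodes                  # an odd leftover passes up unpaired
```

Each level halves the node count, so the construction terminates after about log2(n) passes with the CM of the whole system.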
One very nice feature of this algorithm is that you can give individual time steps to each of the particles and each of the CMs all the way up the tree, such that every timestep is some power-of-two multiple of the smallest time step. Giving individual time-steps allows you to easily synchronize the timesteps of all the particles, while simultaneously allowing you to use small timesteps in subsections of the system where interesting close encounters are occurring.
This algorithm is designed to handle close approaches correctly, and can even handle tight binaries at the same time it handles long-range interactions. It can also use perturbation techniques to even more quickly and accurately handle close binaries. The drawback of this code is that it's probably not as simple as the Barnes-Hut algorithm to understand and code.
The computational order is O(n log n).
References
- W.H. Press, (1986), The Use of Supercomputers in Stellar Dynamics, ed. P. Hut and S.L.W. McMillan (New York: Springer Verlag), pp. 156.
- J. Garrett Jernigan and David H. Porter, (1989). "A tree code with logarithmic reduction of force terms, hierarchical regularization of all variables, and explicit accuracy controls," Ap.J. Supp. Series, 71, 871-893.
- S. L. W. McMillan and S. J. Aarseth (1993). "An O(NlogN) Integration Scheme for Collisional Stellar Systems", Astrophys. J., Vol. 414, p. 200-212.
Fast Multipole Method (FMM)
The Fast Multipole Method (FMM) is a tree code that uses two representations of the potential field: far-field (multipole) expansions and local expansions. The relationship between the two representations is referred to as the "duality principle."
This method uses a very fast calculation of the potential field. Why would one be interested in that? It's computationally easier to deal with the potential than with the force. The force is a vector; the potential phi(x,y,z) is a scalar. (Remember from basic physics that the force is just the negative of the gradient of the potential.)
The strategy of the FMM is to compute a compact expression for the potential phi(x,y,z), which can be easily evaluated along with its derivative, at any point. It achieves this by evaluating the potential as a "multipole expansion," a kind of Taylor expansion, which is accurate when x^2+y^2+z^2 is large.
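Schematically, the first terms of such an expansion for the potential of a clump of masses m_j at offsets r'_j from the expansion center, evaluated at a distant point r, are (sign and normalization conventions vary between authors; this is the generic form, not the article's notation):

```latex
\phi(\mathbf{r}) \approx -\frac{G M}{r}
  \;-\; \frac{G\,\mathbf{d}\cdot\hat{\mathbf{r}}}{r^{2}}
  \;-\; \frac{G}{2 r^{3}} \sum_{\alpha\beta} Q_{\alpha\beta}\,
        \hat{r}_{\alpha}\hat{r}_{\beta} \;-\; \cdots
```

Here M = sum_j m_j is the monopole (the total mass, which together with the center of mass is all that Barnes-Hut keeps), d = sum_j m_j r'_j is the dipole moment (zero if the expansion is centered on the center of mass), and Q_ab is the quadrupole moment tensor. Each additional term buys accuracy at additional cost, which is the trade-off behind the Barnes-Hut comparison below.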
The FMM method shares the quadtree and divide-and-conquer paradigm of Barnes-Hut. It differs in the following ways:
- FMM computes an expression for the potential at every point, not the force as Barnes-Hut does.
- FMM uses more information than the mass and center of the particles in a box. This more complicated expansion is more accurate, but also more expensive if you use high-order expansions (zero-order expansions will give you a fast algorithm, however).
- The decision of whether the mass and center of mass of a box n can be used at a distant point, or whether n must be decomposed into its children, depends only on the position and size of the box, not on the location of the center of mass within the box.
McMillan and Aarseth (1993) briefly mention this method, noting that the algorithm does not appear suitable for studies of collisional systems because (1) it is slow compared with competing methods, and (2) it achieves its O(1) CPU time per force evaluation by determining the forces on an arbitrary number of particles in essentially constant time. The FMM is thus best suited to applications where all particles have the same or similar time steps.
FMM is capable of achieving full machine precision (or less, depending on an accuracy tuning parameter). The FMM is usually thought of as an O(N) algorithm, however see the reference by Aluru (1996), below.
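As a toy illustration of why a far-field expansion works (this is not Greengard & Rokhlin's algorithm itself, which also needs local expansions and a hierarchical tree traversal), the sketch below, with made-up particle data, compares a direct potential sum against its zeroth-order (monopole) approximation at a distant point:

```python
import math

# Minimal sketch, with hypothetical particle data: the potential of a
# distant clump of particles is approximated by a single term centered
# on the clump's center of mass (the zeroth-order multipole expansion).

def direct_potential(particles, x, y, z):
    """Exact potential at (x,y,z): sum of -m/r over all particles (G = 1)."""
    phi = 0.0
    for (m, px, py, pz) in particles:
        r = math.sqrt((x - px)**2 + (y - py)**2 + (z - pz)**2)
        phi += -m / r
    return phi

def monopole_potential(particles, x, y, z):
    """Zeroth-order expansion: the total mass placed at the center of mass."""
    M = sum(m for (m, _, _, _) in particles)
    cx = sum(m * px for (m, px, _, _) in particles) / M
    cy = sum(m * py for (m, _, py, _) in particles) / M
    cz = sum(m * pz for (m, _, _, pz) in particles) / M
    r = math.sqrt((x - cx)**2 + (y - cy)**2 + (z - cz)**2)
    return -M / r

# A small clump near the origin, evaluated far away:
clump = [(1.0, 0.1, 0.0, 0.0), (2.0, -0.1, 0.05, 0.0), (1.5, 0.0, -0.1, 0.1)]
exact = direct_potential(clump, 50.0, 0.0, 0.0)
approx = monopole_potential(clump, 50.0, 0.0, 0.0)
# The relative error shrinks as the evaluation point moves farther away.
print(abs(approx - exact) / abs(exact))
```

Higher-order terms in the expansion shrink the error further; the full FMM organizes such expansions hierarchically so that the potential everywhere is assembled in roughly O(N) work.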
References
- For a wonderful explanation (pictures etc.) see the UC Berkeley Web site: http://www.icsi.berkeley.edu/cs267/lecture27/lecture27.html. (Most of the information in this section came from there. See the below Other N-Body WWW Sources for more information about this site.)
- Srinivas Aluru, "Greengard's N-Body Algorithm Is Not Order N," SIAM Journal on Scientific Computing, May 1996, Volume 17, Number 3.
- William D. Elliott and John A. Board, Jr., "Fast Fourier Transform Accelerated Fast Multipole Algorithm," SIAM Journal on Scientific Computing, March 1996, Volume 17, Number 2.
- Leslie Greengard, "The Numerical Solution of the N-Body Problem," Computers in Physics, Mar/Apr 1990, pp. 142-152.
- This algorithm was first published in "A Fast Algorithm for Particle Simulations", L. Greengard and V. Rokhlin, J. Comp. Phys., v 73, 1987.
- Greengard's 1987 Yale dissertation "The Rapid Evaluation of Potential Fields in Particle Systems" won an ACM Distinguished Dissertation Award.
- S. L. W. McMillan and S. J. Aarseth (1993). "An O(NlogN) Integration Scheme for Collisional Stellar Systems", Astrophys. J., Vol. 414, p. 200-212.
- An adaptive FMM based upon Van Dommelen and Rundensteiner (J. Comp. Phys., v83, pp. 126-147 (1989)) has been implemented on a T8000-transputer system. It relies on sorting the vortices in the streamwise and transverse directions, and then distributing (approximately) equal numbers on each processor for even load balancing. This seems to work well, particularly when the problem granularity is coarse (i.e., a large number of vortices, O(10^6), on relatively few processors, O(10)). For more details, see the paper by Clarke and Tutty (1994) below.
Clarke, NR & Tutty, OR, "Construction and Validation of a Discrete Vortex Method for the Two-Dimensional Incompressible Navier-Stokes Equations," Computers Fluids, vol. 23, No. 6, pp 751-783 (1994).
- A possible useful reference is G. Pringle's thesis: "Numerical Study of Three-Dimensional Flow using Fast Parallel Particle Algorithms" which describes a parallel version of the FMM in both 2 and 3 dimensions.
- You may find the following paper useful: "A comparison between Fast Multipole Algorithm and Tree-Code to evaluate gravitational forces in 3-D" by R. Capuzzo-Dolcetta and P. Miocchi, submitted to Journ. Comp. Phys. You can download it at: http://xxx.lanl.gov/abs/astro-ph/9703122.
- More FMM-related work: "An Improved Fast Multipole Algorithm for Potential Fields," by Tomasz Hrycak and Vladimir Rokhlin.
Tree-Code-Particle-Mesh (TPM)
There have been a number of hybrid schemes of tree codes with meshes (Tree-Code-Particle-Mesh). One of the earlier works was by Suvendra Dutta, published in 1995 in MNRAS 276, 1109. This section describes similar work published a year later by Guohong Xu of Princeton University. This method uses a tree code (TC) in regions where the mass density is high (above some threshold), and a particle-mesh (PM) algorithm in all other areas. PM codes have much higher speed than tree codes, but limited spatial resolution. Tree codes have much higher spatial resolution, but the mass resolution is often poor with today's computers.
So, looking at the separation of forces:
F = F(short range) + F(long range),
we can separate the techniques used in each.
F(short range). Particles in overdense regions use primarily the Tree Code. The force on a Tree particle is the sum of an external force, due to the particles outside the Tree and calculated by the PM method, and an internal force, due to the other particles in the Tree containing the particle, calculated by the Tree Code method.
F(long range). Particles in low-density regions use the PM method.
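The separation F = F(short range) + F(long range) can be sketched in one dimension; note that r_split below is a hypothetical fixed cutoff radius for illustration, whereas the real TPM split is between tree regions and the mesh:

```python
# Hedged sketch of the short-/long-range force split, in 1D for brevity.
# Pairs closer than r_split contribute to F(short) (what the tree handles
# in TPM); the rest contribute to F(long) (what the PM mesh handles).
# The split is exact: the two pieces always sum to the direct total.

def split_forces(masses, positions, i, r_split):
    """Return (F_short, F_long) on particle i from all others (G = 1)."""
    f_short = f_long = 0.0
    for j, (m, x) in enumerate(zip(masses, positions)):
        if j == i:
            continue
        dx = x - positions[i]
        f = masses[i] * m * dx / abs(dx)**3   # 1D attraction toward particle j
        if abs(dx) < r_split:
            f_short += f
        else:
            f_long += f
    return f_short, f_long

masses = [1.0, 1.0, 1.0, 1.0]
positions = [0.0, 0.5, 10.0, 20.0]
fs, fl = split_forces(masses, positions, 0, 1.0)
# Only the neighbor at 0.5 falls inside r_split; the total force on
# particle 0 is recovered exactly as fs + fl.
```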
The TPM method solves the Poisson equation, using the same techniques as the PM method (usually FFT). The equations of motion are integrated from the potential and the tree-forces:
F = Sum(w_i * w_j * w_k * del(phi)) + F(tree)
using a leap-frog scheme, where the w are "Cloud-In-Cell" weights for the PM.
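The "Cloud-In-Cell" weights above are just linear interpolation between the nearest mesh points. A minimal 1D sketch, assuming a hypothetical cell spacing dx and a particle at a non-negative position x (in 3D, the weight w_i * w_j * w_k is the product of three such 1D weights):

```python
# Sketch of 1D Cloud-In-Cell (CIC) weights: a particle deposits mass
# onto (and gathers del(phi) from) its two nearest grid points, with
# weights proportional to its distance from each.

def cic_weights(x, dx):
    """Return the two nearest cell indices and their linear weights."""
    s = x / dx
    i = int(s)            # left-hand cell (x assumed non-negative)
    frac = s - i          # fractional offset within the cell
    return [(i, 1.0 - frac), (i + 1, frac)]

# A particle 30% of the way between cells 4 and 5:
weights = cic_weights(4.3, 1.0)
# weights is approximately [(4, 0.7), (5, 0.3)]; the weights sum to 1,
# so mass (and force) is conserved by the interpolation.
```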
Multiple time-scale integrations are implemented in this scheme because the particles in high-density regions have different dynamical time scales from the particles in low-density regions. In the simulation "box", the Tree particles take much shorter time steps than the PM particles, with the time step optimized for each Tree.
The reference paper doesn't list the order of computation, but it is probably between O(N) and O(N log N) per time step. This method has more free parameters than the other N-body methods, which could be a significant drawback.
Reference
- Guohong Xu, "A New Parallel N-body Gravity Solver: TPM." You can download this paper on the Web from: http://zeus.ncsa.uiuc.edu:8080/gc3/gc3006/abstract.html and see a simulation at:
http://zeus.ncsa.uiuc.edu:8080/TPM_simulation.html.
Self-Consistent Field (SCF)
The Self-Consistent Field (SCF) method is an algorithm for evolving collisionless stellar systems. The particles in the SCF technique do not directly interact with one another, but rather, through their contribution to the combined gravitational field of the system. In other words, the potential in the SCF method does not depend on the relative coordinates of particles.
The justification for this approach is that the fundamental components of a collisionless system, such as stars in galaxies, or electrons and ions in dense plasmas, move along orbits which are not perturbed by individual two-body interactions. The distinguishing physical characteristic is that the gravitational potential is determined mostly by the continuum mass distribution.
In this scheme, the orbits in a given (time-dependent) potential are computed to find the evolving density distribution. With an improved density, as determined from these orbits, a new potential can be found, and the method continues to iterate to a "collisionless limit." After the potential is determined, one integrates N one-body orbits in the given gravitational field. One advantage is that, since all orbits are computed in given potentials, one solves, iteratively, N one-body problems rather than one N-body problem, and this is easily optimized on parallel hardware.
Hernquist & Ostriker (1992) improved on the original mean-field algorithm by expanding the density and potential of Poisson's equation with a set of basis functions that fully expands both the angular and radial dependence. Other N-body mean-field schemes that relied on expansions typically used only the angular part. Hernquist & Ostriker's set of basis functions, called "ultraspherical polynomials," resembles the system being studied: for example, spherical galaxies.
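A hedged, minimal illustration of the mean-field idea, using a toy Fourier basis on a 1D periodic ring rather than Hernquist & Ostriker's ultraspherical polynomials: the particle density is reduced to a handful of expansion coefficients, Poisson's equation is solved coefficient by coefficient, and every particle then moves in the resulting smooth field, never interacting pairwise:

```python
import cmath

# Toy SCF-style force evaluation on a periodic ring.  One evaluation
# costs O(N * kmax) rather than O(N^2), because the particles only
# "interact" through the shared expansion coefficients.

def scf_accelerations(thetas, masses, kmax=4):
    ks = range(1, kmax + 1)
    # 1. Density coefficients -- the only place particle data enters.
    rho = [sum(m * cmath.exp(-1j * k * t) for m, t in zip(masses, thetas))
           for k in ks]
    # 2. Poisson solve, one coefficient at a time: phi_k = -rho_k / k^2.
    phi = [-r / (k * k) for k, r in zip(ks, rho)]
    # 3. Each particle feels only the smooth field: a = -dphi/dtheta.
    accs = []
    for t in thetas:
        dphi = sum(2.0 * (1j * k * p * cmath.exp(1j * k * t)).real
                   for k, p in zip(ks, phi))
        accs.append(-dphi)
    return accs

thetas = [0.1, 1.7, 2.9, 4.2]
masses = [1.0, 2.0, 1.0, 0.5]
accs = scf_accelerations(thetas, masses)
# Total momentum change sums to zero (to roundoff), even though no
# pairwise forces were ever computed.
```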
The SCF method is similar to Tree Codes in time-integration and round-off errors. It is an O(N) method. It retains tightly-bound particles, and may be capable of resolving steeper density gradients than direct N-body techniques.
It is inflexible in that it relies on expansions that are highly tuned to the problem. In Hernquist & Ostriker's case, it is to galaxies obeying the physical structure of galaxies exhibiting the de Vaucouleurs R^{1/4} law in projection.
This method might be considered as a kind of link between the fluid hydrodynamics codes (smoothed particle hydrodynamics) and particle methods.
References
- For a more detailed overview of the equations involved in the SCF method please see the following site: http://extreme.indiana.edu/metatext/annotated_fortran/top-level.html
- Lars Hernquist & Jeremiah Ostriker, (1992), "A Self-Consistent Field Method for Galactic Dynamics," Astrophysical Journal, 386, 375-397.
Symplectic Method
Another N-body method, actively used today in solar-system dynamical integrations, is very different from the direct integration methods, the mesh methods, and the tree-code methods. It is based on Hamiltonian mechanics, goes back to Poincare's work in the late 1800s, and is called the Symplectic Method.
The following description came from Ben Leimkuhler's Web site: http://www.math.ukans.edu/~leimkuhl/symplectic.html
Hamiltonian systems of differential equations model conservative natural phenomena such as motion in molecular systems, in multibody mechanical systems, and of the heavenly bodies. Hamiltonian systems preserve more than just the energy: their flows possess a family of symplectic invariants that characterize the long-term dynamics of a patch of nearby points in phase (position, momentum) space. Since the flow (i.e., the mapping that takes a point in phase space to its evolution through t units of time) of a Hamiltonian system is symplectic, it seems not unreasonable to ask the same of a discrete (numerical) approximation. Thus we are led to the concept of symplectic discretizations. What's all the buzz about symplectic discretizations? A symplectic method has been used to answer the question "is the solar system chaotic?" Symplectic methods are standard tools for designing particle accelerators. Symplecticness explains the unreasonable effectiveness of the leapfrog (= Verlet = Störmer) scheme compared to other second-order integrators used in applications such as molecular dynamics. In short, symplecticness is one of the few principles we have for evaluating the behavior of schemes for conservative nonlinear evolutionary phenomena on physically interesting time intervals.
The people working in this field feel that standard numerical integration schemes do not respect the global geometry arising from the Hamiltonian nature of the dynamics; some global structures are missed. [Franklin, Lecar and Quinn (1990) found that, in the case of circular planetary orbits, the direct integrations showed no evidence of instability for a few orbits that the symplectic mapping claimed were unstable.] Therefore, even if energy conservation is built into the integrations, standard numerical integration techniques for solar-system dynamics are limited to the investigation of short-time quantitative phenomena, and not long-time qualitative phenomena.
Numerical methods have been devised that approximate the time-step map of the exact dynamics to any desired order in the timestep and are symplectic. Algorithms used so far have included Runge-Kutta techniques, Adams techniques, and Pade approximants. The relative efficiency and/or simplicity of the methods vary from one problem to another, and there appears to be no "best" approach in general. These algorithms exactly preserve the Poincare invariants of the Hamiltonian system, namely the integrals:
int int sum{over_i} dq_i dp_i
and
int int int int sum{i ne k} dq_i dp_i dq_k dp_k
(Look at the Channell and Scovel 1990 reference, the Sussman and Wisdom 1992 reference, and the Wisdom and Holman 1991 reference for specific details on how the symplectic method works.)
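A minimal sketch of the simplest symplectic scheme, the leapfrog (Störmer-Verlet) integrator mentioned above, applied to a two-dimensional Kepler orbit; the timestep and orbit parameters here are arbitrary choices for illustration:

```python
import math

# Kick-drift-kick leapfrog for a unit-mass Kepler problem with GM = 1.
# Unlike forward Euler, the energy error of a symplectic scheme stays
# bounded over many orbits instead of drifting secularly.

def accel(x, y):
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def energy(x, y, vx, vy):
    return 0.5 * (vx * vx + vy * vy) - 1.0 / math.sqrt(x * x + y * y)

def leapfrog(x, y, vx, vy, dt, nsteps):
    ax, ay = accel(x, y)
    for _ in range(nsteps):
        vx += 0.5 * dt * ax; vy += 0.5 * dt * ay     # half kick
        x += dt * vx;        y += dt * vy            # drift
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax; vy += 0.5 * dt * ay     # half kick
    return x, y, vx, vy

# Circular orbit of radius 1 (v = 1), integrated for roughly 30 orbits:
x, y, vx, vy = 1.0, 0.0, 0.0, 1.0
e0 = energy(x, y, vx, vy)
x, y, vx, vy = leapfrog(x, y, vx, vy, 0.01, 20000)
e1 = energy(x, y, vx, vy)
# The relative energy error remains tiny and bounded.
print(abs((e1 - e0) / e0))
```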
Two leaders in this field are Jacques Laskar at Bureau des Longitudes in Paris, and Jack Wisdom at MIT.
Laskar has performed integrations simulating many billions of years (longer than the age of the Solar System; he said that this was to explore the limits of the chaotic zones). He said that the integration error, measured by integrating the equations back and forth over 10 Myr, amounts to ~10^{-13} after 10^7 years and grows like time^{1.4}. His integrations were performed on an IBM RS6000 and took one day of CPU time per Gyr.
Wisdom has performed the integrations with a special computer called the "Supercomputer Toolkit", which is a small multiprocessor computer built and optimized for the numerical solutions of systems of ordinary differential equations. His integrations have spanned close to a billion years. Each 100 million years takes about 1000 hours of Toolkit CPU time.
Both researchers have found that the whole Solar System is chaotic. Hyperion and other irregularly shaped satellites are "tumbling." Pluto's orbit is chaotic. The motion of the Jovian planet subsystem is chaotic. Miranda's high inclination and chaotic motions account for its tidal heating. Planet-crossing asteroid orbits were found to be unstable, and the motion of comet Halley is chaotic. The chaotic diffusion (and the eccentricity) of Mercury is so large that it could be ejected from the Solar System upon a close encounter with Venus in less than 3.5 billion years (!).
Some other results: Duncan and Quinn (1993), and Mikkola and Innanen, (1995) describe and review computer simulations of large numbers of test particles placed between the giant planets and near/within the terrestrial zone. Duncan and Quinn show that test particles that encounter one or more of the giant planets are ejected to the Kuiper belt, the Oort cloud, or beyond to interstellar space. Mikkola and Innanen show that the asteroidal zone trajectories are chaotic, except within the region of the terrestrial planets.
D. Syer and S. Tremaine (1995) describe a method based on symplectic equations of motion that solves the combined collisionless Boltzmann and Poisson equations in a discretized, or lattice, phase space. The time and the positions and velocities of "particles" take on integer values, and the forces are rounded to the nearest integer. In the continuum limit, the lattice equations become the usual integro-differential equations of stellar dynamics. Equilibria are found in a variety of shapes and sizes that do not evolve with time, even slowly, unlike existing N-body approximations to stellar systems.
References
- This paper shows some stability studies for symplectic algorithms, using long-time solar system and plasma chaos simulations. J. Candy and W. Rozmus, J. Comput. Phys. 92, (1991), 230.
- P.J. Channell and C. Scovel, "Symplectic Integration of Hamiltonian Systems," Nonlinearity, 3, (1990), 231-259.
- M.J. Duncan and T. Quinn, Ann. Rev. of Astronomy and Astrophysics, 31, (1993), 265.
- F. Franklin, M. Lecar, and T. Quinn, "On the Stability of Orbits in the Solar System: A Comparison of a Mapping with a Numerical Integration," Icarus, 88, (1990), 97-103.
- J. Laskar, "Large-scale chaos in the solar system," Astron. Astrophys. Letters 287L, (1994), 9-12.
- S. Mikkola and K. Innanen, "Solar System Chaos and the Distribution of Asteroid Orbits," Mon. Not. R. Astron., 277, (1995), 497-501.
- Ilya Prigogine, From Being to Becoming, Freeman, 1980.
- G.J. Sussman and J. Wisdom, "Chaotic Evolution of the Solar System," Science, vol. 257, no. 5066, (1992), 56-62.
- D. Syer and S. Tremaine, "Lattice Stellar Dynamics," MNRAS, 276, (1995), 467-475.
- J. Wisdom and M. Holman, "Symplectic Maps for the N-Body Problem," Astron. J., vol. 102, (1991), 1528.
- This document: Symplectic Methods for Hamiltonian Systems may be useful.
- "The solution of the n-body problem" by Florin Diacu, Mathematical Intelligencer vol. 18, no. 3 (summer 1996), 66-70, describes both Sundman's work on the 3-body problem early this century, and its more recent extension to n bodies.
- Some fascinating historical notes about orbits and gravitation can be found at http://www-groups.dcs.st-and.ac.uk/~history/HistTopics/Orbits.html.
Other N-Body WWW Sources
- Meta-Sites
- Mario's Nbody Methods Resources.
- Cosmology of the Universe (Grand Challenge Cosmology Consortium).
- Smoothed Particle Hydrodynamics Links (Tuebingen, in German).
- The Galaxy Page.
- On Numerical Simulations in Astrophysics (SEDS).
- Some Code
- NEMO- Stellar Dynamics Toolbox (plus more links).
- Sky Coyote's Nbody in Java.
- D. Thiebaut's "nbody.c" from Parallel Programming in C for the Transputer.
- NBody Particle-Particle Program w/C Source (uses 7th order Runge-Kutta for startup, and then switches to Adams-Moulton).
- A search at Netlib using GAMS will turn up the RKSUITE numerical integration library, which may be useful to PP and Symplectic methods. It is a high-order RK scheme with adaptive step sizes.
- NBody Problem: The Force-Decomposition method.
- GalaxSee: An N-Body Dynamics Probe (software).
- Overview of Long-Range N-Body Code by David Walker and extra notes. This set of programs calculates the force between N particles according to Newton's Law of Gravitation. This calculation is the kernel of particle simulation codes.
- Some Courses
- The Berkeley Web site listed in the Tree-Code and FMM sections was a course given by Jim Demmel, one of the people involved in the design of (Sca)LAPACK. Another Demmel site has links to other interesting sources, with some guest presentations. Eventually, the course material will be published by SIAM, so you may want to check the SIAM Web site for information regarding that material.
- Real-world Nbody algorithms.
- Hierarchical Methods for Simulation, a graduate course in the Carnegie Mellon University Computer Science Dept., Fall 1998, with the useful Nbody Links.
- Some Papers
- Randall Splinter's paper: "A Nested Grid Particle-Mesh Code for High Resolution Simulations of Gravitational Instability in Cosmology" has a review of many different NBody methods -- many that I don't have itemized in this summary. For example: Pen's (1993) "Linear Moving Adaptive PM," Katz and White's (1993) Smoothed Particle Hydrodynamics method "TREESPH," various multi-grid adapted PM codes, various nested-grid adapted PM codes by Chan (1986), Anninos (1993) and others, and Villumsen's (1987) Hierarchical Particle Mesh (HPM) code. You can download this paper from: http://www.pa.uky.edu/~splinter/cosmos.html, in the FTP section (ngpm.ps).
- Some Simulations
- Parallel NBody Simulations at CMU.
- Simulation Movies from the Max Planck Institute for Astrophysics Galaxy Formation Group.
- Galactic Dynamics -- Interacting Galaxies from Grand Challenge Consortium group.
- Indiana University Astronomy Movies.
- Some Groups and/or Researchers
- Cosmology Group at University of Kentucky.
- Cosmology Research Group at the University of Kansas.
- The Scientific Computing Group at Duke University.
- Cosmological Evolution Archetypes at Indiana.
- Douglas Heggie's Nbody Simulations, Papers, NBODY1 port.
- Parallel hierarchical N-body Solvers for Vortex Flows at Rutgers.
- J. Makino's Work.
- Some student projects for parallelizing Nbody algorithms.
- Grand Challenge Consortium Home Page.
Changes and Additions
20 March 2000
- Many changes...
- Took a break from the Web in Feb/March 2000 and removed my www.amara.com site.
- Put a Mirror at my Max-Planck-Institute fuer Kernphysics Heidelberg Web site.
- Checked all links.
- Removed many dead links.
- Changed other links for WWW sites that had moved.
- Added a few links to new sites.
- Re-edited text to be smoother in many places (similar to my Scientific Computing World Nbody methods article)
- Many color and layout aesthetical (my tastes..) changes.
19 November 1996
- Added:
- Particle-Multiple-Mesh (PM2) Section and some new text to Nested-Grid-Particle-Mesh (NGPM) (many thanks to S. Gelato).
- Aluru's, "Greengard's N-Body Algorithm Is Not Order N" Reference in FMM section.
- Clarke, NR & Tutty, OR's "Construction and Validation of a Discrete Vortex Method for the Two-Dimensional Incompressible Navier-Stokes Equations" Reference item in FMM section.
- Candy and Rozmus's, stability studies reference in Symplectic section.
- Paul Bode's N-Body Research, Simulations, and Papers link.
- B. Kuhlman et al's "A Test of the Particle Paradigm in N-Body Simulations" in PM section.
- Eric Jui-Lin Lu, NBody Simulations, Parallelizing with FMM Link.
- Richard Gerber's Galactic Interaction Simulations Link.
- Many-Body Tree Methods in Physics book reference in Tree-Code section.
- Excellent N-Body Bibliography list by Juergen K. Singer in Papers section.
- Interacting Galaxies Link in Simulations section.
- The NBody Problem: The Force-Decomposition Method in Code section.
- GalaxSee: An N-Body Dynamics Probe (software) in Code section.
- Directory of FMM papers in FMM section.
- J. Makino's work in Researcher's section.
- Indiana University Astronomy Movies Link in Simulation section.
- Some color and layout aesthetic changes.
- Changed:
- Removed Richard Gerber's NBody Code link (not available).
- Removed The Nbody Problem Course (15-850C) at CMU (not available).
- Removed Friends-of-Friends Method Computer Man Page link (not available).
- http://astro.berkeley.edu/~mcraig/nbody.html (new location).
23 July 1996
- Added:
- This "Changes and Additions" Section.
- Meta-Sites Section.
- A Little Bit About NBody Parallel Programming link.
- Large Scale Structure on the Net link.
- Smoothed Particle Hydrodynamics link (Tuebingen).
- Course at CMU link.
- Information about UK mirror.
- NBody Particle-Particle Program w/C Source link.
- References of Brock Luty's studies of periodic BCs for P3M.
- Reference for Splinter's NGPM method in MNRAS.
- The Galaxy Page link.
- Changed:
- Mercury's ejection from the Solar System from "would" to "could" in Symplectic section.
- dq_i dp_i dp_k dp_k to dq_i dp_i dq_k dp_k in Symplectic section.
Acknowledgements
I received valuable information on these N-Body/Particle Simulation concepts and/or the article from the following people: Vince Kerchner, Andrew Bennett, Philippe Brieu, Tomasz Plewa, Mike Gross, Wayne Hayes, Joel Phillips, Jun Makino, Gordon Sande, Peter Teuben, Ton Dammers, Randall Splinter, Adrian Melott, Jens Helmers, Gavin Pringle, Walter Scott, and Sergio Gelato.
Personal Correspondence
I am glad that you found this page useful! The Web can be a very demanding place, and I have limited "resources" to maintain this Web page and to answer email. I present this Web page as-is. Please don't be offended if I don't respond (or respond sporadically) to your request for assistance, for making changes to my site, or for friendly correspondence. I hope that you can use what I have available at my site for whatever you are seeking.
However, I am always happy to receive a "hello!" or a "thank you!" for writing this page, and because I am finishing my astrophysics PhD in April 2001, any employment offers are welcome too!
Papers and Talks Home
Last Modified by Amara Graps on 20 March 2000.
© Copyright Amara Graps, 1995-2000.