diff --git a/doc/src/Bibliography.rst b/doc/src/Bibliography.rst index 4ed8e73dfe..9778340c94 100644 --- a/doc/src/Bibliography.rst +++ b/doc/src/Bibliography.rst @@ -877,6 +877,9 @@ Bibliography **(PLUMED)** G.A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni and G. Bussi, Comp. Phys. Comm 185, 604 (2014) +**(Pavlov)** +D Pavlov, V Galigerov, D Kolotinskii, V Nikolskiy, V Stegailov, International Journal of High Performance Computing Applications, 38, 34-49 (2024). + **(Paquay)** Paquay and Kusters, Biophys. J., 110, 6, (2016). preprint available at `arXiv:1411.3019 `_. diff --git a/doc/src/Commands_fix.rst b/doc/src/Commands_fix.rst index ea50e68cdd..a7648218fa 100644 --- a/doc/src/Commands_fix.rst +++ b/doc/src/Commands_fix.rst @@ -263,6 +263,7 @@ OPT. * :doc:`wall/body/polyhedron ` * :doc:`wall/colloid ` * :doc:`wall/ees ` + * :doc:`wall/flow (k) ` * :doc:`wall/gran (k) ` * :doc:`wall/gran/region ` * :doc:`wall/harmonic ` diff --git a/doc/src/fix.rst b/doc/src/fix.rst index d03cab4687..4cd21353c7 100644 --- a/doc/src/fix.rst +++ b/doc/src/fix.rst @@ -428,6 +428,7 @@ accelerated styles exist. * :doc:`wall/body/polyhedron ` - time integration for body particles of style :doc:`rounded/polyhedron ` * :doc:`wall/colloid ` - Lennard-Jones wall interacting with finite-size particles * :doc:`wall/ees ` - wall for ellipsoidal particles +* :doc:`wall/flow ` - flow boundary conditions * :doc:`wall/gran ` - frictional wall(s) for granular simulations * :doc:`wall/gran/region ` - :doc:`fix wall/region ` equivalent for use with granular particles * :doc:`wall/harmonic ` - harmonic spring wall diff --git a/doc/src/fix_wall_flow.rst b/doc/src/fix_wall_flow.rst new file mode 100644 index 0000000000..b40ba9697f --- /dev/null +++ b/doc/src/fix_wall_flow.rst @@ -0,0 +1,175 @@ +.. index:: fix wall/flow +.. index:: fix wall/flow/kk + +fix wall/flow command +===================== + +Accelerator Variants: *wall/flow/kk* + +Syntax +"""""" + +.. code-block:: LAMMPS + + fix ID group-ID wall/flow axis vflow T seed N coords ... keyword value + +* ID, group-ID are documented in :doc:`fix ` command +* wall/flow = style name of this fix command +* axis = flow axis (*x*, *y*, or *z*) +* vflow = generated flow velocity in *axis* direction (velocity units) +* T = flow temperature (temperature units) +* seed = random seed for stochasticity (positive integer) +* N = number of walls +* coords = list of N wall positions along the *axis* direction in ascending order (distance units) +* zero or more keyword/value pairs may be appended +* keyword = *units* + + .. parsed-literal:: + + *units* value = *lattice* or *box* + *lattice* = wall positions are defined in lattice units + *box* = the wall positions are defined in simulation box units + +Examples +"""""""" + +.. code-block:: LAMMPS + + fix 1 all wall/flow x 0.4 1.5 593894 4 2.0 4.0 6.0 8.0 + +Description +""""""""""" + +.. versionadded:: TBD + +This fix implements flow boundary conditions (FBC) introduced in +:ref:`(Pavlov1) ` and :ref:`(Pavlov2) `. +The goal is to generate a stationary flow with a shifted Maxwell +velocity distribution: + +.. math:: + + f_a(v_a) \propto \exp{\left(-\frac{m (v_a-v_{\text{flow}})^2}{2 kB T}\right)} + +where :math:`v_a` is the component of velocity along the specified +*axis* argument (a = x,y,z), :math:`v_{\text{flow}}` is the flow +velocity specified as the *vflow* argument, *T* is the specified flow +temperature, *m* is the particle mass, and *kB* is the Boltzmann +constant. 
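+
+In LAMMPS reduced (LJ) units the Boltzmann constant is 1, so for a
+unit-mass particle this is simply a Gaussian with mean *vflow* and
+standard deviation :math:`\sqrt{T}`; for the example above
+(*vflow* = 0.4, *T* = 1.5) the mean is 0.4 and the standard deviation
+is about 1.22. The short C++ sketch below is illustrative only (it is
+not taken from the fix's source, and the function name is arbitrary);
+it shows how a single velocity component distributed according to
+:math:`f_a` could be drawn:
+
+.. code-block:: c++
+
+   #include <cmath>
+   #include <random>
+
+   // draw one velocity component from the shifted Maxwell distribution
+   // f_a(v) ~ exp(-m (v - v_flow)^2 / (2 kB T)), i.e. a Gaussian with
+   // mean v_flow and standard deviation sqrt(kB T / m)
+   double sample_flow_velocity(std::mt19937 &rng, double v_flow,
+                               double kT, double mass)
+   {
+     std::normal_distribution<double> gauss(v_flow, std::sqrt(kT / mass));
+     return gauss(rng);
+   }
+
+Note that this is only the target distribution for particles between
+the walls; the velocity re-assignment actually applied when a particle
+crosses a wall is described below.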
+
+This is achieved by defining a series of *N* transparent walls along
+the flow *axis* direction. Each wall is at the specified position
+listed in the *coords* argument. Note that an additional transparent
+wall is defined by the code at the boundary of the (periodic)
+simulation domain in the *axis* direction, so there are effectively
+N+1 walls.
+
+Each time a particle in the specified group passes through one of the
+transparent walls, its velocity is re-assigned. Particles not in the
+group do not interact with the wall. This can be used, for example, to
+add obstacles composed of atoms, or to simulate a solution of complex
+molecules in a one-atom liquid (note that the fix has been tested for
+one-atom systems only).
+
+Conceptually, the velocity re-assignment represents creation of a new
+particle within the system with simultaneous removal of the particle
+which passed through the wall. The velocity components in directions
+parallel to the wall are re-assigned according to the standard Maxwell
+velocity distribution for the specified temperature *T*. The velocity
+component perpendicular to the wall is re-assigned according to the
+flux-weighted form of the shifted Maxwell distribution defined above:
+
+.. math::
+
+   f_a^{\text{generated}}(v_a) \propto v_a f_a(v_a)
+
+It can be shown that, for an ideal gas, this procedure produces
+exactly the desired velocity distribution for the particles between
+the walls.
+
+Since most simulated systems are not ideal gases, a single wall may
+not be sufficient to maintain a stationary flow without "congestion",
+which can manifest itself as regions in the flow with increased
+particle density located upstream from static obstacles; multiple
+walls can be defined to counteract this.
+
+For the same reason, the actual temperature and velocity of the
+generated flow may differ from what is requested. The degree of
+discrepancy is determined by how different from an ideal gas the
+simulated system is. Therefore, a calibration procedure may be
+required for such a system as described in :ref:`(Pavlov)
+`.
+
+Note that the interactions between particles on different sides of a
+transparent wall are not disabled or neglected. Likewise, particle
+positions are not altered by the velocity re-assignment. This removes
+the need to modify the force field to work correctly in cases when a
+particle is close to a wall.
+
+For example, if particle positions were uniformly redistributed across
+the surface of a wall, two particles could end up too close to each
+other, potentially causing the simulation to explode. However, due to
+this compromise, some collective phenomena such as regions with
+increased/decreased density or collective movements are not fully
+removed when particles cross a wall. This unwanted consequence can
+also be mitigated by using more walls.
+
+.. note::
+
+   When the specified flow has a high velocity, a "lost atoms" error
+   can occur (see :doc:`error messages `). If this
+   happens, you should ensure the checks for neighbor list rebuilds,
+   set via the :doc:`neigh_modify ` command, are as
+   conservative as possible (every timestep if needed). Those are the
+   default settings.
+
+Restart, fix_modify, output, run start/stop, minimize info
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+No information about this fix is written to :doc:`binary restart files
+`.
+
+None of the :doc:`fix_modify ` options are relevant to
+this fix.
+ +No global or per-atom quantities are stored by this fix for access by +various :doc:`output commands `. + +No parameter of this fix can be used with the *start/stop* keywords of +the :doc:`run ` command. + +This fix is not invoked during :doc:`energy minimization `. + +Restrictions +"""""""""""" + +Fix *wall_flow* is part of the EXTRA-FIX package. It is only enabled +if LAMMPS was built with that package. See the :doc:`Build package +` page for more info. + +Flow boundary conditions should not be used with rigid bodies such as +those defined by a "fix rigid" command. + +This fix can only be used with periodic boundary conditions along the +flow axis. The size of the box in this direction must not change. Also, +the fix is designed to work only in an orthogonal simulation box. + +Related commands +"""""""""""""""" + +:doc:`fix wall/reflect ` command + +Default +""""""" + +The default for the units keyword is lattice. + +---------- + +.. _fbc-Pavlov1: + +**(Pavlov1)** Pavlov, Kolotinskii, Stegailov, "GPU-Based Molecular Dynamics of Turbulent Liquid Flows with OpenMM", Proceedings of PPAM-2022, LNCS (Springer), vol. 13826, pp. 346-358 (2023) + +.. _fbc-Pavlov2: + +**(Pavlov2)** Pavlov, Galigerov, Kolotinskii, Nikolskiy, Stegailov, "GPU-based Molecular Dynamics of Fluid Flows: Reaching for Turbulence", Int. J. High Perf. Comp. Appl., (2024) diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt index e46fb6ca97..030c80d30c 100644 --- a/doc/utils/sphinx-config/false_positives.txt +++ b/doc/utils/sphinx-config/false_positives.txt @@ -1770,6 +1770,7 @@ Kolafa Kollman kolmogorov Kolmogorov +Kolotinskii Kondor konglt Koning @@ -2775,6 +2776,7 @@ PEigenDense Peng peptide peratom +Perf Pergamon pergrid peri @@ -3887,7 +3889,9 @@ Verlet versa Verstraelen ves +vf vflag +vflow vfrac vhi vibrational diff --git a/examples/wall/in.wall.flow b/examples/wall/in.wall.flow new file mode 100644 index 0000000000..9dfe001a55 --- /dev/null +++ b/examples/wall/in.wall.flow @@ -0,0 +1,79 @@ +variable nrun equal 1000 +variable dump_count equal 10 + +variable nwall equal 4 +variable w1 equal 67 +variable w2 equal 71 +variable w3 equal 75 +variable w4 equal 79 + +variable x_cylinder equal 20 +variable y_cylinder equal 17 +variable r_cylinder equal 4 + +variable MASS equal 1 +variable TEMP equal 0.4 +variable VFLOW equal 0.5 + +units lj +atom_style atomic + +lattice fcc 0.3 +region sim_box block 0 84 0 34 0 10 + +boundary p p p + +create_box 2 sim_box +region reg_cylinder cylinder z ${x_cylinder} ${y_cylinder} ${r_cylinder} EDGE EDGE + +create_atoms 1 box + +## setup obstacle ## +group g_obst region reg_cylinder +group g_flow subtract all g_obst +set group g_obst type 2 + +mass 1 ${MASS} +mass 2 ${MASS} + +velocity g_flow create ${TEMP} 4928459 rot yes dist gaussian +velocity g_obst set 0.0 0.0 0.0 + +pair_style lj/cut 1.122462 +pair_coeff 1 1 1.0 1.0 +pair_coeff 1 2 1.0 1.0 +pair_coeff 2 2 1.0 1.0 +pair_modify shift yes + +neighbor 0.3 bin +neigh_modify delay 0 every 20 check no + +fix 1 g_flow nve +fix 2 g_flow wall/flow x ${VFLOW} ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4} + +variable dump_every equal ${nrun}/${dump_count} +variable thermo_every equal ${dump_every} +variable restart_every equal ${nrun}/10 + +##### uncomment for grid aggregation ##### +#variable gr_Nx equal 42 +#variable gr_Ny equal 17 +#variable gr_Nz equal 1 +#variable gr_Nevery equal ${dump_every} +#variable gr_Nrepeat equal 1 +#variable gr_Nfreq equal ${dump_every} +#fix 3 g_flow ave/grid 
${gr_Nevery} ${gr_Nrepeat} ${gr_Nfreq} ${gr_Nx} ${gr_Ny} ${gr_Nz} vx vy vz density/mass norm all ave one +#compute ct_gridId g_flow property/grid ${gr_Nx} ${gr_Ny} ${gr_Nz} id +#dump dmp_grid g_flow grid ${dump_every} grid.lammpstrj c_ct_gridId:grid:data f_3:grid:data[*] +########################################## + +#dump dmp_coord all atom ${dump_every} dump.lammpstrj + +#compute ct_Temp g_flow temp/com +#thermo_style custom step temp epair emol etotal press c_ct_Temp + +#restart ${restart_every} flow.restart + +timestep 0.005 +thermo ${thermo_every} +run ${nrun} diff --git a/examples/wall/log.7Feb24.wall.flow.g++.1 b/examples/wall/log.7Feb24.wall.flow.g++.1 new file mode 100644 index 0000000000..75e8b66fe1 --- /dev/null +++ b/examples/wall/log.7Feb24.wall.flow.g++.1 @@ -0,0 +1,182 @@ +LAMMPS (21 Nov 2023 - Development - patch_21Nov2023-758-ge33590b2fc-modified) + using 1 OpenMP thread(s) per MPI task +variable nrun equal 1000 +variable dump_count equal 10 + +variable nwall equal 4 +variable w1 equal 67 +variable w2 equal 71 +variable w3 equal 75 +variable w4 equal 79 + +variable x_cylinder equal 20 +variable y_cylinder equal 17 +variable r_cylinder equal 4 + +variable MASS equal 1 +variable TEMP equal 0.4 +variable VFLOW equal 0.5 + +units lj +atom_style atomic + +lattice fcc 0.3 +Lattice spacing in x,y,z = 2.3712622 2.3712622 2.3712622 +region sim_box block 0 84 0 34 0 10 + +boundary p p p + +create_box 2 sim_box +Created orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622) + 1 by 1 by 1 MPI processor grid +region reg_cylinder cylinder z ${x_cylinder} ${y_cylinder} ${r_cylinder} EDGE EDGE +region reg_cylinder cylinder z 20 ${y_cylinder} ${r_cylinder} EDGE EDGE +region reg_cylinder cylinder z 20 17 ${r_cylinder} EDGE EDGE +region reg_cylinder cylinder z 20 17 4 EDGE EDGE + +create_atoms 1 box +Created 114240 atoms + using lattice units in orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622) + create_atoms CPU = 0.010 seconds + +## setup obstacle ## +group g_obst region reg_cylinder +1950 atoms in group g_obst +group g_flow subtract all g_obst +112290 atoms in group g_flow +set group g_obst type 2 +Setting atom values ... 
+ 1950 settings made for type + +mass 1 ${MASS} +mass 1 1 +mass 2 ${MASS} +mass 2 1 + +velocity g_flow create ${TEMP} 4928459 rot yes dist gaussian +velocity g_flow create 0.4 4928459 rot yes dist gaussian +velocity g_obst set 0.0 0.0 0.0 + +pair_style lj/cut 1.122462 +pair_coeff 1 1 1.0 1.0 +pair_coeff 1 2 1.0 1.0 +pair_coeff 2 2 1.0 1.0 +pair_modify shift yes + +neighbor 0.3 bin +neigh_modify delay 0 every 20 check no + +fix 1 g_flow nve +fix 2 g_flow wall/flow x ${VFLOW} ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 ${nwall} ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 79 + +variable dump_every equal ${nrun}/${dump_count} +variable dump_every equal 1000/${dump_count} +variable dump_every equal 1000/10 +variable thermo_every equal ${dump_every} +variable thermo_every equal 100 +variable restart_every equal ${nrun}/10 +variable restart_every equal 1000/10 + +##### uncomment for grid aggregation ##### +#variable gr_Nx equal 42 +#variable gr_Ny equal 17 +#variable gr_Nz equal 1 +#variable gr_Nevery equal ${dump_every} +#variable gr_Nrepeat equal 1 +#variable gr_Nfreq equal ${dump_every} +#fix 3 g_flow ave/grid ${gr_Nevery} ${gr_Nrepeat} ${gr_Nfreq} ${gr_Nx} ${gr_Ny} ${gr_Nz} vx vy vz density/mass norm all ave one +#compute ct_gridId g_flow property/grid ${gr_Nx} ${gr_Ny} ${gr_Nz} id +#dump dmp_grid g_flow grid ${dump_every} grid.lammpstrj c_ct_gridId:grid:data f_3:grid:data[*] +########################################## + +#dump dmp_coord all atom ${dump_every} dump.lammpstrj + +#compute ct_Temp g_flow temp/com +#thermo_style custom step temp epair emol etotal press c_ct_Temp + +#restart ${restart_every} flow.restart + +timestep 0.005 +thermo ${thermo_every} +thermo 100 +run ${nrun} +run 1000 + +CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE + +Your simulation uses code contributions which should be cited: + +- fix wall/flow command: doi:10.1177/10943420231213013 + +@Article{Pavlov-etal-IJHPCA-2024, + author = {Daniil Pavlov and Vladislav Galigerov and Daniil Kolotinskii and Vsevolod Nikolskiy and Vladimir Stegailov}, + title = {GPU-based molecular dynamics of fluid flows: Reaching for turbulence}, + journal = {The International Journal of High Performance Computing Applications}, + year = 2024, + volume = 38, + number = 1, + pages = 34-49 +} + +CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE + +Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... 
+ update: every = 20 steps, delay = 0 steps, check = no + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.422462 + ghost atom cutoff = 1.422462 + binsize = 0.711231, bins = 281 114 34 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 26.69 | 26.69 | 26.69 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 0.39317221 0 0 0.58975315 0.11795063 + 100 0.3671684 0.045118445 0 0.59586622 0.27378331 + 200 0.3732041 0.036897471 0 0.59669873 0.24917809 + 300 0.37432305 0.036501844 0 0.5979815 0.24715194 + 400 0.37603886 0.035350565 0 0.59940392 0.24480762 + 500 0.37617142 0.036949771 0 0.60120196 0.24862985 + 600 0.37751983 0.036484268 0 0.60275905 0.24784635 + 700 0.37787831 0.037327783 0 0.60414029 0.25060427 + 800 0.37959242 0.036206184 0 0.60558983 0.2476903 + 900 0.38019033 0.036874395 0 0.6071549 0.24984211 + 1000 0.38070666 0.037068948 0 0.60812395 0.25041936 +Loop time of 5.61598 on 1 procs for 1000 steps with 114240 atoms + +Performance: 76923.319 tau/day, 178.063 timesteps/s, 20.342 Matom-step/s +99.7% CPU use with 1 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 2.6351 | 2.6351 | 2.6351 | 0.0 | 46.92 +Neigh | 1.2994 | 1.2994 | 1.2994 | 0.0 | 23.14 +Comm | 0.26576 | 0.26576 | 0.26576 | 0.0 | 4.73 +Output | 0.0030531 | 0.0030531 | 0.0030531 | 0.0 | 0.05 +Modify | 1.3019 | 1.3019 | 1.3019 | 0.0 | 23.18 +Other | | 0.1107 | | | 1.97 + +Nlocal: 114240 ave 114240 max 114240 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Nghost: 20119 ave 20119 max 20119 min +Histogram: 1 0 0 0 0 0 0 0 0 0 +Neighs: 164018 ave 164018 max 164018 min +Histogram: 1 0 0 0 0 0 0 0 0 0 + +Total # of neighbors = 164018 +Ave neighs/atom = 1.4357318 +Neighbor list builds = 50 +Dangerous builds not checked +Total wall time: 0:00:05 diff --git a/examples/wall/log.7Feb24.wall.flow.g++.4 b/examples/wall/log.7Feb24.wall.flow.g++.4 new file mode 100644 index 0000000000..1efe7bb28e --- /dev/null +++ b/examples/wall/log.7Feb24.wall.flow.g++.4 @@ -0,0 +1,182 @@ +LAMMPS (21 Nov 2023 - Development - patch_21Nov2023-758-ge33590b2fc-modified) + using 1 OpenMP thread(s) per MPI task +variable nrun equal 1000 +variable dump_count equal 10 + +variable nwall equal 4 +variable w1 equal 67 +variable w2 equal 71 +variable w3 equal 75 +variable w4 equal 79 + +variable x_cylinder equal 20 +variable y_cylinder equal 17 +variable r_cylinder equal 4 + +variable MASS equal 1 +variable TEMP equal 0.4 +variable VFLOW equal 0.5 + +units lj +atom_style atomic + +lattice fcc 0.3 +Lattice spacing in x,y,z = 2.3712622 2.3712622 2.3712622 +region sim_box block 0 84 0 34 0 10 + +boundary p p p + +create_box 2 sim_box +Created orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622) + 4 by 1 by 1 MPI processor grid +region reg_cylinder cylinder z ${x_cylinder} ${y_cylinder} ${r_cylinder} EDGE EDGE +region reg_cylinder cylinder z 20 ${y_cylinder} ${r_cylinder} EDGE EDGE +region reg_cylinder cylinder z 20 17 ${r_cylinder} EDGE EDGE +region reg_cylinder cylinder z 20 17 4 EDGE EDGE + +create_atoms 1 box +Created 114240 atoms + using lattice units in orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622) + create_atoms CPU = 0.003 seconds + +## setup obstacle ## +group g_obst region reg_cylinder +1950 atoms 
in group g_obst +group g_flow subtract all g_obst +112290 atoms in group g_flow +set group g_obst type 2 +Setting atom values ... + 1950 settings made for type + +mass 1 ${MASS} +mass 1 1 +mass 2 ${MASS} +mass 2 1 + +velocity g_flow create ${TEMP} 4928459 rot yes dist gaussian +velocity g_flow create 0.4 4928459 rot yes dist gaussian +velocity g_obst set 0.0 0.0 0.0 + +pair_style lj/cut 1.122462 +pair_coeff 1 1 1.0 1.0 +pair_coeff 1 2 1.0 1.0 +pair_coeff 2 2 1.0 1.0 +pair_modify shift yes + +neighbor 0.3 bin +neigh_modify delay 0 every 20 check no + +fix 1 g_flow nve +fix 2 g_flow wall/flow x ${VFLOW} ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 ${nwall} ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 ${w1} ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 ${w2} ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 ${w3} ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 ${w4} +fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 79 + +variable dump_every equal ${nrun}/${dump_count} +variable dump_every equal 1000/${dump_count} +variable dump_every equal 1000/10 +variable thermo_every equal ${dump_every} +variable thermo_every equal 100 +variable restart_every equal ${nrun}/10 +variable restart_every equal 1000/10 + +##### uncomment for grid aggregation ##### +#variable gr_Nx equal 42 +#variable gr_Ny equal 17 +#variable gr_Nz equal 1 +#variable gr_Nevery equal ${dump_every} +#variable gr_Nrepeat equal 1 +#variable gr_Nfreq equal ${dump_every} +#fix 3 g_flow ave/grid ${gr_Nevery} ${gr_Nrepeat} ${gr_Nfreq} ${gr_Nx} ${gr_Ny} ${gr_Nz} vx vy vz density/mass norm all ave one +#compute ct_gridId g_flow property/grid ${gr_Nx} ${gr_Ny} ${gr_Nz} id +#dump dmp_grid g_flow grid ${dump_every} grid.lammpstrj c_ct_gridId:grid:data f_3:grid:data[*] +########################################## + +#dump dmp_coord all atom ${dump_every} dump.lammpstrj + +#compute ct_Temp g_flow temp/com +#thermo_style custom step temp epair emol etotal press c_ct_Temp + +#restart ${restart_every} flow.restart + +timestep 0.005 +thermo ${thermo_every} +thermo 100 +run ${nrun} +run 1000 + +CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE + +Your simulation uses code contributions which should be cited: + +- fix wall/flow command: doi:10.1177/10943420231213013 + +@Article{Pavlov-etal-IJHPCA-2024, + author = {Daniil Pavlov and Vladislav Galigerov and Daniil Kolotinskii and Vsevolod Nikolskiy and Vladimir Stegailov}, + title = {GPU-based molecular dynamics of fluid flows: Reaching for turbulence}, + journal = {The International Journal of High Performance Computing Applications}, + year = 2024, + volume = 38, + number = 1, + pages = 34-49 +} + +CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE + +Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule +Neighbor list info ... 
+ update: every = 20 steps, delay = 0 steps, check = no + max neighbors/atom: 2000, page size: 100000 + master list distance cutoff = 1.422462 + ghost atom cutoff = 1.422462 + binsize = 0.711231, bins = 281 114 34 + 1 neighbor lists, perpetual/occasional/extra = 1 0 0 + (1) pair lj/cut, perpetual + attributes: half, newton on + pair build: half/bin/atomonly/newton + stencil: half/bin/3d + bin: standard +Per MPI rank memory allocation (min/avg/max) = 8.496 | 8.496 | 8.496 Mbytes + Step Temp E_pair E_mol TotEng Press + 0 0.39317221 0 0 0.58975315 0.11795063 + 100 0.36726398 0.045386014 0 0.59627716 0.27402111 + 200 0.37384538 0.036574547 0 0.5973377 0.24836729 + 300 0.37487455 0.036519645 0 0.59882654 0.24691726 + 400 0.37591417 0.036405755 0 0.60027207 0.24700641 + 500 0.37654714 0.037008829 0 0.60182459 0.24883444 + 600 0.3778008 0.03663706 0 0.6033333 0.24874392 + 700 0.37851338 0.036714175 0 0.60447928 0.24881829 + 800 0.37984876 0.036237049 0 0.6060052 0.24843003 + 900 0.38022763 0.036847615 0 0.60718407 0.24987198 + 1000 0.38084717 0.037139994 0 0.60840575 0.25070072 +Loop time of 2.20347 on 4 procs for 1000 steps with 114240 atoms + +Performance: 196054.093 tau/day, 453.829 timesteps/s, 51.845 Matom-step/s +95.6% CPU use with 4 MPI tasks x 1 OpenMP threads + +MPI task timing breakdown: +Section | min time | avg time | max time |%varavg| %total +--------------------------------------------------------------- +Pair | 0.67927 | 0.70882 | 0.73473 | 2.4 | 32.17 +Neigh | 0.32928 | 0.34467 | 0.36084 | 2.0 | 15.64 +Comm | 0.3211 | 0.36609 | 0.40741 | 6.1 | 16.61 +Output | 0.0017748 | 0.0032465 | 0.0046508 | 2.1 | 0.15 +Modify | 0.71135 | 0.74424 | 0.76001 | 2.3 | 33.78 +Other | | 0.03641 | | | 1.65 + +Nlocal: 28560 ave 29169 max 27884 min +Histogram: 1 0 0 0 0 2 0 0 0 1 +Nghost: 6452.25 ave 6546 max 6368 min +Histogram: 1 0 0 0 2 0 0 0 0 1 +Neighs: 40893 ave 42032 max 39445 min +Histogram: 1 0 0 0 1 0 0 1 0 1 + +Total # of neighbors = 163572 +Ave neighs/atom = 1.4318277 +Neighbor list builds = 50 +Dangerous builds not checked +Total wall time: 0:00:02 diff --git a/src/.gitignore b/src/.gitignore index ba4c4c05b0..88bb80fdc5 100644 --- a/src/.gitignore +++ b/src/.gitignore @@ -1025,6 +1025,8 @@ /fix_wall_colloid.h /fix_wall_ees.cpp /fix_wall_ees.h +/fix_wall_flow.cpp +/fix_wall_flow.h /fix_wall_region_ees.cpp /fix_wall_region_ees.h /fix_wall_reflect_stochastic.cpp diff --git a/src/EXTRA-FIX/fix_wall_flow.cpp b/src/EXTRA-FIX/fix_wall_flow.cpp new file mode 100644 index 0000000000..1f3dcfca5b --- /dev/null +++ b/src/EXTRA-FIX/fix_wall_flow.cpp @@ -0,0 +1,325 @@ +/* ---------------------------------------------------------------------- + LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator + https://www.lammps.org/, Sandia National Laboratories + LAMMPS development team: developers@lammps.org + + Copyright (2003) Sandia Corporation. Under the terms of Contract + DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains + certain rights in this software. This software is distributed under + the GNU General Public License. + + See the README file in the top-level LAMMPS directory. 
+------------------------------------------------------------------------- */ + +/* ---------------------------------------------------------------------- + Contributing authors: Vladislav Galigerov (HSE), + Daniil Pavlov (MIPT) +------------------------------------------------------------------------- */ + +#include "fix_wall_flow.h" + +#include "atom.h" +#include "citeme.h" +#include "comm.h" +#include "domain.h" +#include "error.h" +#include "force.h" +#include "input.h" +#include "lattice.h" +#include "math_const.h" +#include "memory.h" +#include "modify.h" +#include "random_mars.h" +#include "update.h" +#include "variable.h" + +#include +#include +#include + +using namespace LAMMPS_NS; +using namespace FixConst; + +/* ---------------------------------------------------------------------- */ + +static const char cite_fix_wall_flow_c[] = + "fix wall/flow command: doi:10.1177/10943420231213013\n\n" + "@Article{Pavlov-etal-IJHPCA-2024,\n" + " author = {Daniil Pavlov and Vladislav Galigerov and Daniil Kolotinskii and Vsevolod " + "Nikolskiy and Vladimir Stegailov},\n" + " title = {GPU-based molecular dynamics of fluid flows: Reaching for turbulence},\n" + " journal = {The International Journal of High Performance Computing Applications},\n" + " year = 2024,\n" + " volume = 38,\n" + " number = 1,\n" + " pages = 34-49\n" + "}\n\n"; + +FixWallFlow::FixWallFlow(LAMMPS *lmp, int narg, char **arg) : + Fix(lmp, narg, arg), flowax(FlowAxis::AX_X), flowvel(0.0), flowdir(0), rndseed(0), + current_segment(nullptr) +{ + if (lmp->citeme) lmp->citeme->add(cite_fix_wall_flow_c); + if (narg < 9) utils::missing_cmd_args(FLERR, "fix wall/flow", error); + + if (domain->triclinic != 0) + error->all(FLERR, "Fix wall/flow cannot be used with triclinic simulation box"); + + dynamic_group_allow = 1; + bool do_abort = false; + + int iarg = 3; + // parsing axis + if (strcmp(arg[iarg], "x") == 0) + flowax = FlowAxis::AX_X; + else if (strcmp(arg[iarg], "y") == 0) + flowax = FlowAxis::AX_Y; + else if (strcmp(arg[iarg], "z") == 0) + flowax = FlowAxis::AX_Z; + else + error->all(FLERR, "Illegal fix wall/flow argument: axis must by x or y or z, but {} specified", + arg[iarg]); + + if (domain->periodicity[flowax] != 1) + error->all(FLERR, + "Fix wall/flow cannot be used with a non-periodic boundary along the flow axis"); + + ++iarg; + // parsing velocity + flowvel = utils::numeric(FLERR, arg[iarg], do_abort, lmp); + if (flowvel == 0.0) error->all(FLERR, "Illegal fix wall/flow argument: velocity cannot be 0"); + if (flowvel > 0.0) + flowdir = 1; + else + flowdir = -1; + if (flowdir < 0) + error->all(FLERR, "Illegal fix wall/flow argument: negative direction is not supported yet"); + + ++iarg; + // parsing temperature + double flowtemp = utils::numeric(FLERR, arg[iarg], do_abort, lmp); + kT = lmp->force->boltz * flowtemp / force->mvv2e; + + ++iarg; + // parsing seed + rndseed = utils::inumeric(FLERR, arg[iarg], do_abort, lmp); + if (rndseed <= 0) + error->all(FLERR, "Illegal fix wall/flow argument: random seed must be positive integer"); + + ++iarg; + // parsing wall count + int wallcount = utils::inumeric(FLERR, arg[iarg], do_abort, lmp); + if (wallcount <= 0) + error->all(FLERR, "Illegal fix wall/flow argument: wall count must be positive integer"); + + ++iarg; + // parsing walls + if (narg - iarg != wallcount && narg - iarg != wallcount + 2) + error->all(FLERR, "Wrong fix wall/flow wall count"); + + double scale = 0.0; + if (flowax == FlowAxis::AX_X) + scale = domain->lattice->xlattice; + else if (flowax == 
FlowAxis::AX_Y) + scale = domain->lattice->ylattice; + else if (flowax == FlowAxis::AX_Z) + scale = domain->lattice->zlattice; + + if (narg - iarg == wallcount + 2) { + if (strcmp(arg[narg - 2], "units") != 0) error->all(FLERR, "Wrong fix wall/flow units command"); + if (strcmp(arg[narg - 1], "box") == 0) + scale = 1.0; + else if (strcmp(arg[narg - 1], "lattice") != 0) + error->all(FLERR, "Wrong fix wall/flow units command"); + } + + walls.resize(wallcount + 2); + walls.front() = domain->boxlo[flowax]; + for (int w = 1; w <= wallcount; ++w, ++iarg) { + walls[w] = utils::numeric(FLERR, arg[iarg], do_abort, lmp) * scale; + } + walls.back() = domain->boxhi[flowax]; + if (!std::is_sorted(walls.begin(), walls.end(), std::less_equal())) { + error->all(FLERR, + "Wrong fix wall/flow wall ordering or some walls are outside simulation domain"); + } + + if (std::adjacent_find(walls.begin(), walls.end()) != walls.end()) { + error->all(FLERR, + "Wrong fix wall/flow wall coordinates: some walls have the same coordinates or lie " + "on the boundary"); + } + + memory->grow(current_segment, atom->nmax, "WallFlow::current_segment"); + atom->add_callback(Atom::GROW); + if (restart_peratom) atom->add_callback(Atom::RESTART); + + maxexchange = 1; + + random = new RanMars(lmp, rndseed + comm->me); +} + +/* ---------------------------------------------------------------------- */ + +FixWallFlow::~FixWallFlow() +{ + if (copymode) return; + atom->delete_callback(id, Atom::GROW); + if (restart_peratom) atom->delete_callback(id, Atom::RESTART); + memory->destroy(current_segment); + + delete random; +} + +/* ---------------------------------------------------------------------- */ + +int FixWallFlow::setmask() +{ + int mask = 0; + + mask |= END_OF_STEP; + + return mask; +} + +/* ---------------------------------------------------------------------- */ + +void FixWallFlow::init() +{ + if (domain->triclinic != 0) + error->all(FLERR, "Fix wall/flow cannot be used with triclinic simulation box"); + + int nrigid = 0; + int box_change_flowax = 0; + for (const auto &ifix : modify->get_fix_list()) { + if (ifix->rigid_flag) nrigid++; + switch (flowax) { + case FlowAxis::AX_X: + if (ifix->box_change & Fix::BOX_CHANGE_X) box_change_flowax++; + break; + case FlowAxis::AX_Y: + if (ifix->box_change & Fix::BOX_CHANGE_Y) box_change_flowax++; + break; + case FlowAxis::AX_Z: + if (ifix->box_change & Fix::BOX_CHANGE_Z) box_change_flowax++; + break; + } + } + + if (nrigid) error->all(FLERR, "Fix wall/flow is not compatible with rigid bodies"); + if (box_change_flowax) + error->all( + FLERR, + "Fix wall/flow is not compatible with simulation box size changing along flow direction"); + + for (int i = 0; i < atom->nlocal; ++i) { + double pos = atom->x[i][flowax]; + current_segment[i] = compute_current_segment(pos); + } +} + +/* ---------------------------------------------------------------------- */ + +void FixWallFlow::end_of_step() +{ + double **x = atom->x; + int *mask = atom->mask; + int nlocal = atom->nlocal; + + for (int i = 0; i < nlocal; ++i) { + if (mask[i] & groupbit) { + double pos = x[i][flowax]; + int prev_segment = current_segment[i]; + current_segment[i] = compute_current_segment(pos); + + if (prev_segment != current_segment[i]) generate_velocity(i); + } + } +} + +/* ---------------------------------------------------------------------- */ + +void FixWallFlow::generate_velocity(int atom_i) +{ + const int newton_iteration_count = 10; + double *vel = atom->v[atom_i]; + + double *prmass = atom->rmass; + double *pmass = 
atom->mass; + double mass = 0.0; + if (prmass) + mass = prmass[atom_i]; + else + mass = pmass[atom->type[atom_i]]; + + const double gamma = 1.0 / std::sqrt(2.0 * kT / mass); + double delta = gamma * flowvel; + + const double edd = std::exp(-delta * delta) / MathConst::MY_PIS + delta * std::erf(delta); + const double probability_threshold = 0.5f * (1.f + delta / edd); + + double direction = 1.0; + + if (random->uniform() > probability_threshold) { + delta = -delta; + direction = -direction; + } + + const double xi_0 = random->uniform(); + const double F_inf = edd + delta; + const double xi = xi_0 * F_inf; + const double x_0 = (std::sqrt(delta * delta + 2) - delta) * 0.5; + double x = x_0; + for (int i = 0; i < newton_iteration_count; ++i) { + x -= (std::exp(x * x) * MathConst::MY_PIS * (xi - delta * std::erfc(x)) - 1.0) / (x + delta) * + 0.5; + } + + const double nu = x + delta; + const double v = nu / gamma; + + vel[flowax] = v * direction; + vel[(flowax + 1) % 3] = random->gaussian() / (gamma * MathConst::MY_SQRT2); + vel[(flowax + 2) % 3] = random->gaussian() / (gamma * MathConst::MY_SQRT2); +} + +/* ---------------------------------------------------------------------- */ + +int FixWallFlow::compute_current_segment(double pos) const +{ + int result = 0; + for (; result < (int)walls.size() - 1; ++result) { + if (pos >= walls[result] && pos < walls[result + 1]) { return result; } + } + return -1; // -1 is "out of box" region +} + +/* ---------------------------------------------------------------------- */ + +void FixWallFlow::grow_arrays(int nmax) +{ + memory->grow(current_segment, nmax, "WallFlow::current_segment"); +} + +/* ---------------------------------------------------------------------- */ + +void FixWallFlow::copy_arrays(int i, int j, int) +{ + current_segment[j] = current_segment[i]; +} + +/* ---------------------------------------------------------------------- */ + +int FixWallFlow::pack_exchange(int i, double *buf) +{ + buf[0] = static_cast(current_segment[i]); + return 1; +} + +/* ---------------------------------------------------------------------- */ + +int FixWallFlow::unpack_exchange(int i, double *buf) +{ + current_segment[i] = static_cast(buf[0]); + return 1; +} diff --git a/src/EXTRA-FIX/fix_wall_flow.h b/src/EXTRA-FIX/fix_wall_flow.h new file mode 100644 index 0000000000..6a662f3d94 --- /dev/null +++ b/src/EXTRA-FIX/fix_wall_flow.h @@ -0,0 +1,61 @@ +/* -*- c++ -*- ---------------------------------------------------------- + LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator + https://www.lammps.org/, Sandia National Laboratories + LAMMPS development team: developers@lammps.org + + Copyright (2003) Sandia Corporation. Under the terms of Contract + DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains + certain rights in this software. This software is distributed under + the GNU General Public License. + + See the README file in the top-level LAMMPS directory. 
+------------------------------------------------------------------------- */ + +#ifdef FIX_CLASS +// clang-format off +FixStyle(wall/flow,FixWallFlow); +// clang-format on +#else + +#ifndef LMP_FIX_WALL_FLOW_H +#define LMP_FIX_WALL_FLOW_H + +#include "fix.h" + +namespace LAMMPS_NS { + +class FixWallFlow : public Fix { + public: + enum FlowAxis { AX_X = 0, AX_Y = 1, AX_Z = 2 }; + + FixWallFlow(class LAMMPS *, int, char **); + ~FixWallFlow() override; + int setmask() override; + void init() override; + void end_of_step() override; + + void grow_arrays(int) override; + void copy_arrays(int, int, int) override; + + int pack_exchange(int, double *) override; + int unpack_exchange(int, double *) override; + + protected: + FlowAxis flowax; + double flowvel; + double kT; + std::vector walls; + + int flowdir; + int rndseed; + class RanMars *random; + int *current_segment; + + int compute_current_segment(double pos) const; + void generate_velocity(int i); +}; + +} // namespace LAMMPS_NS + +#endif +#endif diff --git a/src/KOKKOS/Install.sh b/src/KOKKOS/Install.sh index 462c0cbe57..75949c35d8 100755 --- a/src/KOKKOS/Install.sh +++ b/src/KOKKOS/Install.sh @@ -187,6 +187,8 @@ action fix_temp_rescale_kokkos.cpp action fix_temp_rescale_kokkos.h action fix_viscous_kokkos.cpp action fix_viscous_kokkos.h +action fix_wall_flow_kokkos.cpp fix_wall_flow.cpp +action fix_wall_flow_kokkos.h fix_wall_flow.h action fix_wall_gran_kokkos.cpp fix_wall_gran.cpp action fix_wall_gran_kokkos.h fix_wall_gran.h action fix_wall_gran_old.cpp fix_wall_gran.cpp diff --git a/src/KOKKOS/fix_wall_flow_kokkos.cpp b/src/KOKKOS/fix_wall_flow_kokkos.cpp new file mode 100644 index 0000000000..b6b3f7c096 --- /dev/null +++ b/src/KOKKOS/fix_wall_flow_kokkos.cpp @@ -0,0 +1,291 @@ +/* ---------------------------------------------------------------------- + LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator + https://www.lammps.org/, Sandia National Laboratories + LAMMPS development team: developers@lammps.org + + Copyright (2003) Sandia Corporation. Under the terms of Contract + DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains + certain rights in this software. This software is distributed under + the GNU General Public License. + + See the README file in the top-level LAMMPS directory. 
+------------------------------------------------------------------------- */ + +/* ---------------------------------------------------------------------- + Contributing authors: Vladislav Galigerov (HSE), + Daniil Pavlov (MIPT) +------------------------------------------------------------------------- */ + +#include "fix_wall_flow_kokkos.h" +#include "atom_kokkos.h" +#include "atom_masks.h" +#include "force.h" +#include "math_const.h" +#include "memory_kokkos.h" + +using namespace LAMMPS_NS; + +template +FixWallFlowKokkos::FixWallFlowKokkos(LAMMPS *lmp, int narg, char **arg) : + FixWallFlow(lmp, narg, arg), rand_pool(rndseed + comm->me) +{ + kokkosable = 1; + exchange_comm_device = sort_device = 1; + atomKK = (AtomKokkos *) atom; + execution_space = ExecutionSpaceFromDevice::space; + datamask_read = X_MASK | RMASS_MASK | TYPE_MASK | MASK_MASK; + datamask_modify = V_MASK; + + memory->destroy(current_segment); + current_segment = nullptr; + grow_arrays(atomKK->nmax); + + d_walls = d_walls_t("FixWallFlowKokkos::walls", walls.size()); + auto h_walls = Kokkos::create_mirror_view(d_walls); + for (int i = 0; i < (int) walls.size(); ++i) h_walls(i) = walls[i]; + Kokkos::deep_copy(d_walls, h_walls); +} + +template FixWallFlowKokkos::~FixWallFlowKokkos() +{ + if (copymode) return; + memoryKK->destroy_kokkos(k_current_segment, current_segment); +} + +template void FixWallFlowKokkos::init() +{ + atomKK->sync(execution_space, datamask_read); + k_current_segment.template sync(); + d_x = atomKK->k_x.template view(); + + copymode = 1; + Kokkos::parallel_for(Kokkos::RangePolicy(0, atom->nlocal), *this); + copymode = 0; + + k_current_segment.template modify(); +} + +template +KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos::operator()(TagFixWallFlowInit, + const int &i) const +{ + double pos = d_x(i, flowax); + d_current_segment(i) = compute_current_segment_kk(pos); +} + +template void FixWallFlowKokkos::end_of_step() +{ + atomKK->sync(execution_space, datamask_read); + k_current_segment.template sync(); + + d_x = atomKK->k_x.template view(); + d_v = atomKK->k_v.template view(); + d_type = atomKK->k_type.template view(); + d_mask = atomKK->k_mask.template view(); + d_mass = atomKK->k_mass.template view(); + d_rmass = atomKK->k_rmass.template view(); + + copymode = 1; + if (d_rmass.data()) { + Kokkos::parallel_for( + Kokkos::RangePolicy>(0, atom->nlocal), *this); + } else { + Kokkos::parallel_for( + Kokkos::RangePolicy>(0, atom->nlocal), *this); + } + copymode = 0; + atomKK->modified(execution_space, datamask_modify); + k_current_segment.template modify(); +} + +template +template +KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos::operator()(TagFixWallFlowEndOfStep, + const int &atom_i) const +{ + if (d_mask[atom_i] & groupbit) { + double pos = d_x(atom_i, flowax); + int prev_segment = d_current_segment(atom_i); + d_current_segment(atom_i) = compute_current_segment_kk(pos); + if (prev_segment != d_current_segment(atom_i)) { generate_velocity_kk(atom_i); } + } +} + +template +template +KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos::generate_velocity_kk(int atom_i) const +{ + const int newton_iteration_count = 10; + double mass = get_mass(MTag(), atom_i); + const double gamma = 1.0 / std::sqrt(2.0 * kT / mass); + double delta = gamma * flowvel; + + const double edd = std::exp(-delta * delta) / MathConst::MY_PIS + delta * std::erf(delta); + const double probability_threshold = 0.5 * (1. 
+ delta / edd); + + double direction = 1.0; + + rand_type_t rand_gen = rand_pool.get_state(); + + if (/*random->uniform()*/ rand_gen.drand() > probability_threshold) { + delta = -delta; + direction = -direction; + } + + const double xi_0 = rand_gen.drand(); //random->uniform(); + const double F_inf = edd + delta; + const double xi = xi_0 * F_inf; + const double x_0 = (std::sqrt(delta * delta + 2) - delta) * 0.5; + double x = x_0; + for (int i = 0; i < newton_iteration_count; ++i) { + x -= (std::exp(x * x) * MathConst::MY_PIS * (xi - delta * std::erfc(x)) - 1.0) / (x + delta) * + 0.5; + } + + const double nu = x + delta; + const double v = nu / gamma; + + d_v(atom_i, flowax) = v * direction; + d_v(atom_i, (flowax + 1) % 3) = + /*random->gaussian()*/ rand_gen.normal() / (gamma * MathConst::MY_SQRT2); + d_v(atom_i, (flowax + 2) % 3) = + /*random->gaussian()*/ rand_gen.normal() / (gamma * MathConst::MY_SQRT2); + + rand_pool.free_state(rand_gen); +} + +template +KOKKOS_INLINE_FUNCTION int +FixWallFlowKokkos::compute_current_segment_kk(double pos) const +{ + int result = 0; + for (; result < (int) d_walls.extent(0) - 1; ++result) { + if (pos >= d_walls[result] && pos < d_walls[result + 1]) { return result; } + } + return -1; // -1 is "out of box" region +} + +template void FixWallFlowKokkos::grow_arrays(int nmax) +{ + k_current_segment.template sync(); + memoryKK->grow_kokkos(k_current_segment, current_segment, nmax, "WallFlowKK::current_segment"); + k_current_segment.template modify(); + + d_current_segment = k_current_segment.template view(); + h_current_segment = k_current_segment.template view(); +} + +template void FixWallFlowKokkos::copy_arrays(int i, int j, int) +{ + k_current_segment.template sync(); + h_current_segment(j) = h_current_segment(i); + k_current_segment.template modify(); +} + +/* ---------------------------------------------------------------------- + sort local atom-based arrays +------------------------------------------------------------------------- */ + +template +void FixWallFlowKokkos::sort_kokkos(Kokkos::BinSort &Sorter) +{ + // always sort on the device + + k_current_segment.sync_device(); + + Sorter.sort(LMPDeviceType(), k_current_segment.d_view); + + k_current_segment.modify_device(); +} + +template int FixWallFlowKokkos::pack_exchange(int i, double *buf) +{ + k_current_segment.sync_host(); + buf[0] = static_cast(h_current_segment(i)); + return 1; +} + +template +KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos::operator()(TagFixWallFlowPackExchange, + const int &mysend) const +{ + const int send_i = d_sendlist(mysend); + const int segment = d_current_segment(send_i); + d_buf(mysend) = static_cast(segment); + + const int copy_i = d_copylist(mysend); + if (copy_i > -1) { d_current_segment(send_i) = d_current_segment(copy_i); } +} + +template +int FixWallFlowKokkos::pack_exchange_kokkos(const int &nsend, + DAT::tdual_xfloat_2d &k_buf, + DAT::tdual_int_1d k_sendlist, + DAT::tdual_int_1d k_copylist, + ExecutionSpace /*space*/) +{ + k_current_segment.template sync(); + + k_buf.template sync(); + k_sendlist.template sync(); + k_copylist.template sync(); + + d_sendlist = k_sendlist.view(); + d_copylist = k_copylist.view(); + + d_buf = typename ArrayTypes::t_xfloat_1d_um(k_buf.template view().data(), + k_buf.extent(0) * k_buf.extent(1)); + + copymode = 1; + + Kokkos::parallel_for(Kokkos::RangePolicy(0, nsend), + *this); + + copymode = 0; + + k_buf.template modify(); + k_current_segment.template modify(); + + return nsend; +} + +template int 
FixWallFlowKokkos::unpack_exchange(int i, double *buf) +{ + k_current_segment.sync_host(); + h_current_segment(i) = static_cast(buf[0]); + k_current_segment.modify_host(); + return 1; +} + +template +KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos::operator()(TagFixWallFlowUnpackExchange, + const int &i) const +{ + int index = d_indices(i); + if (index > -1) { d_current_segment(index) = static_cast(d_buf(i)); } +} + +template +void FixWallFlowKokkos::unpack_exchange_kokkos(DAT::tdual_xfloat_2d &k_buf, + DAT::tdual_int_1d &k_indices, int nrecv, + int /*nrecv1*/, int /*nextrarecv1*/, + ExecutionSpace /*space*/) +{ + d_buf = typename ArrayTypes::t_xfloat_1d_um(k_buf.template view().data(), + k_buf.extent(0) * k_buf.extent(1)); + d_indices = k_indices.view(); + + copymode = 1; + Kokkos::parallel_for(Kokkos::RangePolicy(0, nrecv), + *this); + copymode = 0; + + k_current_segment.template modify(); +} + +namespace LAMMPS_NS { +template class FixWallFlowKokkos; +#ifdef LMP_KOKKOS_GPU +template class FixWallFlowKokkos; +#endif +} // namespace LAMMPS_NS diff --git a/src/KOKKOS/fix_wall_flow_kokkos.h b/src/KOKKOS/fix_wall_flow_kokkos.h new file mode 100644 index 0000000000..8de0eded0a --- /dev/null +++ b/src/KOKKOS/fix_wall_flow_kokkos.h @@ -0,0 +1,130 @@ +/* -*- c++ -*- ---------------------------------------------------------- + LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator + https://www.lammps.org/, Sandia National Laboratories + LAMMPS development team: developers@lammps.org + + Copyright (2003) Sandia Corporation. Under the terms of Contract + DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains + certain rights in this software. This software is distributed under + the GNU General Public License. + + See the README file in the top-level LAMMPS directory. 
+------------------------------------------------------------------------- */ + +#ifdef FIX_CLASS +// clang-format off +FixStyle(wall/flow/kk,FixWallFlowKokkos); +FixStyle(wall/flow/kk/device,FixWallFlowKokkos); +FixStyle(wall/flow/kk/host,FixWallFlowKokkos); +// clang-format on +#else + +// clang-format off +#ifndef LMP_FIX_WALL_FLOW_KOKKOS_H +#define LMP_FIX_WALL_FLOW_KOKKOS_H + +#include "fix_wall_flow.h" +#include "kokkos_type.h" +#include "kokkos_base.h" +#include "Kokkos_Random.hpp" +#include "comm_kokkos.h" + +namespace LAMMPS_NS { + +struct TagFixWallFlowInit{}; +template +struct TagFixWallFlowEndOfStep{}; +struct TagFixWallFlowPackExchange{}; +struct TagFixWallFlowUnpackExchange{}; + +template +class FixWallFlowKokkos : public FixWallFlow, public KokkosBase { + public: + typedef DeviceType device_type; + typedef ArrayTypes AT; + struct MassTag{}; + struct RMassTag{}; + FixWallFlowKokkos(class LAMMPS *, int, char **); + ~FixWallFlowKokkos(); + + void init() override; + void end_of_step() override; + void grow_arrays(int) override; + void copy_arrays(int, int, int) override; + void sort_kokkos(Kokkos::BinSort &Sorter) override; + int pack_exchange(int, double *) override; + int unpack_exchange(int, double *) override; + + KOKKOS_INLINE_FUNCTION + void operator() (TagFixWallFlowInit, const int&) const; + + template + KOKKOS_INLINE_FUNCTION + void operator()(TagFixWallFlowEndOfStep, const int&) const; + + KOKKOS_INLINE_FUNCTION + void operator()(TagFixWallFlowPackExchange, const int&) const; + + KOKKOS_INLINE_FUNCTION + void operator()(TagFixWallFlowUnpackExchange, const int&) const; + + int pack_exchange_kokkos(const int &nsend,DAT::tdual_xfloat_2d &buf, + DAT::tdual_int_1d k_sendlist, + DAT::tdual_int_1d k_copylist, + ExecutionSpace space) override; + + void unpack_exchange_kokkos(DAT::tdual_xfloat_2d &k_buf, + DAT::tdual_int_1d &indices,int nrecv, + int /*nrecv1*/, int /*nextrarecv1*/, + ExecutionSpace space) override; + protected: + typename AT::t_x_array d_x; + typename AT::t_v_array d_v; + typename AT::t_int_1d d_type; + typename AT::t_int_1d d_mask; + + typename AT::t_float_1d d_mass; + typename AT::t_float_1d d_rmass; + + typedef typename AT::t_xfloat_1d d_walls_t; + typedef Kokkos::Random_XorShift64_Pool rand_pool_t; + typedef typename rand_pool_t::generator_type rand_type_t; + + typename AT::tdual_int_1d k_current_segment; + typename AT::t_int_1d d_current_segment; + typename HAT::t_int_1d h_current_segment; + + typename AT::t_int_1d d_sendlist; + typename AT::t_xfloat_1d d_buf; + typename AT::t_int_1d d_copylist; + typename AT::t_int_1d d_indices; + + d_walls_t d_walls; + + rand_pool_t rand_pool; + + template + KOKKOS_INLINE_FUNCTION + void generate_velocity_kk(int atom_i) const; + + KOKKOS_INLINE_FUNCTION + int compute_current_segment_kk(double pos) const; + + KOKKOS_INLINE_FUNCTION + double get_mass(MassTag, int atom_i) const + { + return d_mass(d_type(atom_i)); + } + + KOKKOS_INLINE_FUNCTION + double get_mass(RMassTag, int atom_i) const + { + return d_rmass(atom_i); + } +}; + +} + +#endif +#endif +