diff --git a/doc/src/Bibliography.rst b/doc/src/Bibliography.rst
index 4ed8e73dfe..9778340c94 100644
--- a/doc/src/Bibliography.rst
+++ b/doc/src/Bibliography.rst
@@ -877,6 +877,9 @@ Bibliography
**(PLUMED)**
G.A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni and G. Bussi, Comp. Phys. Comm 185, 604 (2014)
+**(Pavlov)**
+D. Pavlov, V. Galigerov, D. Kolotinskii, V. Nikolskiy and V. Stegailov, International Journal of High Performance Computing Applications, 38, 34-49 (2024).
+
**(Paquay)**
Paquay and Kusters, Biophys. J., 110, 6, (2016). preprint available at `arXiv:1411.3019 `_.
diff --git a/doc/src/Commands_fix.rst b/doc/src/Commands_fix.rst
index ea50e68cdd..a7648218fa 100644
--- a/doc/src/Commands_fix.rst
+++ b/doc/src/Commands_fix.rst
@@ -263,6 +263,7 @@ OPT.
* :doc:`wall/body/polyhedron `
* :doc:`wall/colloid `
* :doc:`wall/ees `
+ * :doc:`wall/flow (k) <fix_wall_flow>`
* :doc:`wall/gran (k) `
* :doc:`wall/gran/region `
* :doc:`wall/harmonic `
diff --git a/doc/src/fix.rst b/doc/src/fix.rst
index d03cab4687..4cd21353c7 100644
--- a/doc/src/fix.rst
+++ b/doc/src/fix.rst
@@ -428,6 +428,7 @@ accelerated styles exist.
* :doc:`wall/body/polyhedron ` - time integration for body particles of style :doc:`rounded/polyhedron `
* :doc:`wall/colloid ` - Lennard-Jones wall interacting with finite-size particles
* :doc:`wall/ees ` - wall for ellipsoidal particles
+* :doc:`wall/flow <fix_wall_flow>` - flow boundary conditions
* :doc:`wall/gran ` - frictional wall(s) for granular simulations
* :doc:`wall/gran/region ` - :doc:`fix wall/region ` equivalent for use with granular particles
* :doc:`wall/harmonic ` - harmonic spring wall
diff --git a/doc/src/fix_wall_flow.rst b/doc/src/fix_wall_flow.rst
new file mode 100644
index 0000000000..b40ba9697f
--- /dev/null
+++ b/doc/src/fix_wall_flow.rst
@@ -0,0 +1,175 @@
+.. index:: fix wall/flow
+.. index:: fix wall/flow/kk
+
+fix wall/flow command
+=====================
+
+Accelerator Variants: *wall/flow/kk*
+
+Syntax
+""""""
+
+.. code-block:: LAMMPS
+
+ fix ID group-ID wall/flow axis vflow T seed N coords ... keyword value
+
+* ID, group-ID are documented in :doc:`fix <fix>` command
+* wall/flow = style name of this fix command
+* axis = flow axis (*x*, *y*, or *z*)
+* vflow = generated flow velocity in *axis* direction (velocity units)
+* T = flow temperature (temperature units)
+* seed = random seed for stochasticity (positive integer)
+* N = number of walls
+* coords = list of N wall positions along the *axis* direction in ascending order (distance units)
+* zero or more keyword/value pairs may be appended
+* keyword = *units*
+
+ .. parsed-literal::
+
+ *units* value = *lattice* or *box*
+ *lattice* = wall positions are defined in lattice units
+ *box* = the wall positions are defined in simulation box units
+
+Examples
+""""""""
+
+.. code-block:: LAMMPS
+
+ fix 1 all wall/flow x 0.4 1.5 593894 4 2.0 4.0 6.0 8.0
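+   # illustrative variant: the same wall positions, but interpreted in box units
+   fix 1 all wall/flow x 0.4 1.5 593894 4 2.0 4.0 6.0 8.0 units box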
+
+Description
+"""""""""""
+
+.. versionadded:: TBD
+
+This fix implements flow boundary conditions (FBC) introduced in
+:ref:`(Pavlov1) <fbc-Pavlov1>` and :ref:`(Pavlov2) <fbc-Pavlov2>`.
+The goal is to generate a stationary flow with a shifted Maxwell
+velocity distribution:
+
+.. math::
+
+   f_a(v_a) \propto \exp{\left(-\frac{m (v_a-v_{\text{flow}})^2}{2 k_B T}\right)}
+
+where :math:`v_a` is the component of velocity along the specified
+*axis* argument (a = x,y,z), :math:`v_{\text{flow}}` is the flow
+velocity specified as the *vflow* argument, *T* is the specified flow
+temperature, *m* is the particle mass, and :math:`k_B` is the
+Boltzmann constant.
+
+This is achieved by defining a series of *N* transparent walls along
+the flow *axis* direction, each placed at the corresponding position
+listed in the *coords* argument. An additional transparent wall is
+defined by the code at the boundary of the (periodic) simulation
+domain in the *axis* direction, so there are effectively N+1 walls.
+
+Each time a particle in the specified group passes through one of the
+transparent walls, its velocity is re-assigned. Particles outside the
+group do not interact with the walls. This can be used, for example,
+to add obstacles composed of atoms, or to simulate a solution of
+complex molecules in a one-atom liquid (note that the fix has been
+tested for one-atom systems only).
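+
+A typical setup of this kind, mirroring the bundled example input
+``examples/wall/in.wall.flow``, keeps the obstacle atoms out of the
+group the fix acts on (the group and region names below are
+arbitrary):
+
+.. code-block:: LAMMPS
+
+   region reg_cylinder cylinder z 20 17 4 EDGE EDGE
+   group g_obst region reg_cylinder
+   group g_flow subtract all g_obst
+   fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 79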
+
+Conceptually, the velocity re-assignment represents the creation of a
+new particle within the system with the simultaneous removal of the
+particle that passed through the wall. The velocity components in
+directions parallel to the wall are re-assigned according to the
+standard Maxwell velocity distribution for the specified temperature
+*T*. The velocity component perpendicular to the wall is re-assigned
+according to the flux-weighted form of the shifted Maxwell
+distribution defined above:
+
+.. math::
+
+   f_{a,\text{generated}}(v_a) \propto v_a f_a(v_a)
+
+It can be shown that, for an ideal gas, this procedure produces
+exactly the desired velocity distribution for the particles between
+the walls.
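+
+For reference (this is an explanatory note, not an additional input of
+the fix), drawing the wall-normal component for a crossing in the
+positive *axis* direction amounts to inverse-transform sampling,
+i.e. solving :math:`F(v_a) = \xi` for a uniformly distributed random
+number :math:`\xi`, where, up to normalization, with
+:math:`\gamma = \sqrt{m/(2 k_B T)}` and
+:math:`\delta = \gamma v_{\text{flow}}`,
+
+.. math::
+
+   F(v_a) \propto e^{-\delta^2} - e^{-(\gamma v_a - \delta)^2}
+   + \sqrt{\pi}\,\delta \left[ \operatorname{erf}(\gamma v_a - \delta) + \operatorname{erf}(\delta) \right]
+
+The implementation inverts this relation numerically with a few Newton
+iterations.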
+
+Because in most cases the simulated system is not an ideal gas, a
+single wall may not be sufficient to maintain a stationary flow
+without "congestion", which can manifest itself as regions of
+increased particle density upstream of static obstacles. For this
+reason, multiple walls can be defined.
+
+For the same reason, the actual temperature and velocity of the
+generated flow may differ from what is requested. The degree of
+discrepancy is determined by how far the simulated system is from an
+ideal gas. Therefore, a calibration procedure, as described in
+:ref:`(Pavlov2) <fbc-Pavlov2>`, may be required for such a system.
+
+Note that the interactions between particles on different sides of a
+transparent wall are not disabled or neglected. Likewise, particle
+positions are not altered by the velocity re-assignment. This removes
+the need to modify the force field for it to work correctly when a
+particle is close to a wall.
+
+For example, if particle positions were uniformly redistributed across
+the surface of a wall, two particles could end up too close to each
+other, potentially causing the simulation to explode. Due to this
+compromise, however, some collective phenomena, such as regions of
+increased or decreased density or collective motion, are not fully
+removed when particles cross a wall. This unwanted effect can be
+further mitigated by using more walls.
+
+.. note::
+
+   When the specified flow has a high velocity, a "lost atoms" error
+   can occur (see :doc:`Error messages <Errors_messages>`). If this
+   happens, make sure the checks for neighbor list rebuilds, set via
+   the :doc:`neigh_modify <neigh_modify>` command, are as conservative
+   as possible (every timestep if needed); these are the default
+   settings.
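+
+   For instance, one conservative (illustrative) choice is:
+
+   .. code-block:: LAMMPS
+
+      neigh_modify delay 0 every 1 check yes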
+
+Restart, fix_modify, output, run start/stop, minimize info
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+
+No information about this fix is written to :doc:`binary restart files
+<restart>`.
+
+None of the :doc:`fix_modify <fix_modify>` options are relevant to
+this fix.
+
+No global or per-atom quantities are stored by this fix for access by
+various :doc:`output commands <Howto_output>`.
+
+No parameter of this fix can be used with the *start/stop* keywords of
+the :doc:`run <run>` command.
+
+This fix is not invoked during :doc:`energy minimization <minimize>`.
+
+Restrictions
+""""""""""""
+
+Fix *wall/flow* is part of the EXTRA-FIX package. It is only enabled
+if LAMMPS was built with that package. See the :doc:`Build package
+<Build_package>` page for more info.
+
+Flow boundary conditions should not be used with rigid bodies such as
+those defined by a "fix rigid" command.
+
+This fix can only be used with periodic boundary conditions along the
+flow axis. The size of the box in this direction must not change. Also,
+the fix is designed to work only in an orthogonal simulation box.
+
+Related commands
+""""""""""""""""
+
+:doc:`fix wall/reflect <fix_wall_reflect>` command
+
+Default
+"""""""
+
+The default for the *units* keyword is *lattice*.
+
+----------
+
+.. _fbc-Pavlov1:
+
+**(Pavlov1)** Pavlov, Kolotinskii, Stegailov, "GPU-Based Molecular Dynamics of Turbulent Liquid Flows with OpenMM", Proceedings of PPAM-2022, LNCS (Springer), vol. 13826, pp. 346-358 (2023)
+
+.. _fbc-Pavlov2:
+
+**(Pavlov2)** Pavlov, Galigerov, Kolotinskii, Nikolskiy, Stegailov, "GPU-based Molecular Dynamics of Fluid Flows: Reaching for Turbulence", Int. J. High Perf. Comp. Appl., 38(1), 34-49 (2024)
diff --git a/doc/utils/sphinx-config/false_positives.txt b/doc/utils/sphinx-config/false_positives.txt
index e46fb6ca97..030c80d30c 100644
--- a/doc/utils/sphinx-config/false_positives.txt
+++ b/doc/utils/sphinx-config/false_positives.txt
@@ -1770,6 +1770,7 @@ Kolafa
Kollman
kolmogorov
Kolmogorov
+Kolotinskii
Kondor
konglt
Koning
@@ -2775,6 +2776,7 @@ PEigenDense
Peng
peptide
peratom
+Perf
Pergamon
pergrid
peri
@@ -3887,7 +3889,9 @@ Verlet
versa
Verstraelen
ves
+vf
vflag
+vflow
vfrac
vhi
vibrational
diff --git a/examples/wall/in.wall.flow b/examples/wall/in.wall.flow
new file mode 100644
index 0000000000..9dfe001a55
--- /dev/null
+++ b/examples/wall/in.wall.flow
@@ -0,0 +1,79 @@
+variable nrun equal 1000
+variable dump_count equal 10
+
+variable nwall equal 4
+variable w1 equal 67
+variable w2 equal 71
+variable w3 equal 75
+variable w4 equal 79
+
+variable x_cylinder equal 20
+variable y_cylinder equal 17
+variable r_cylinder equal 4
+
+variable MASS equal 1
+variable TEMP equal 0.4
+variable VFLOW equal 0.5
+
+units lj
+atom_style atomic
+
+lattice fcc 0.3
+region sim_box block 0 84 0 34 0 10
+
+boundary p p p
+
+create_box 2 sim_box
+region reg_cylinder cylinder z ${x_cylinder} ${y_cylinder} ${r_cylinder} EDGE EDGE
+
+create_atoms 1 box
+
+## setup obstacle ##
+group g_obst region reg_cylinder
+group g_flow subtract all g_obst
+set group g_obst type 2
+
+mass 1 ${MASS}
+mass 2 ${MASS}
+
+velocity g_flow create ${TEMP} 4928459 rot yes dist gaussian
+velocity g_obst set 0.0 0.0 0.0
+
+pair_style lj/cut 1.122462
+pair_coeff 1 1 1.0 1.0
+pair_coeff 1 2 1.0 1.0
+pair_coeff 2 2 1.0 1.0
+pair_modify shift yes
+
+neighbor 0.3 bin
+neigh_modify delay 0 every 20 check no
+
+fix 1 g_flow nve
+fix 2 g_flow wall/flow x ${VFLOW} ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+
+variable dump_every equal ${nrun}/${dump_count}
+variable thermo_every equal ${dump_every}
+variable restart_every equal ${nrun}/10
+
+##### uncomment for grid aggregation #####
+#variable gr_Nx equal 42
+#variable gr_Ny equal 17
+#variable gr_Nz equal 1
+#variable gr_Nevery equal ${dump_every}
+#variable gr_Nrepeat equal 1
+#variable gr_Nfreq equal ${dump_every}
+#fix 3 g_flow ave/grid ${gr_Nevery} ${gr_Nrepeat} ${gr_Nfreq} ${gr_Nx} ${gr_Ny} ${gr_Nz} vx vy vz density/mass norm all ave one
+#compute ct_gridId g_flow property/grid ${gr_Nx} ${gr_Ny} ${gr_Nz} id
+#dump dmp_grid g_flow grid ${dump_every} grid.lammpstrj c_ct_gridId:grid:data f_3:grid:data[*]
+##########################################
+
+#dump dmp_coord all atom ${dump_every} dump.lammpstrj
+
+#compute ct_Temp g_flow temp/com
+#thermo_style custom step temp epair emol etotal press c_ct_Temp
+
+#restart ${restart_every} flow.restart
+
+timestep 0.005
+thermo ${thermo_every}
+run ${nrun}
diff --git a/examples/wall/log.7Feb24.wall.flow.g++.1 b/examples/wall/log.7Feb24.wall.flow.g++.1
new file mode 100644
index 0000000000..75e8b66fe1
--- /dev/null
+++ b/examples/wall/log.7Feb24.wall.flow.g++.1
@@ -0,0 +1,182 @@
+LAMMPS (21 Nov 2023 - Development - patch_21Nov2023-758-ge33590b2fc-modified)
+ using 1 OpenMP thread(s) per MPI task
+variable nrun equal 1000
+variable dump_count equal 10
+
+variable nwall equal 4
+variable w1 equal 67
+variable w2 equal 71
+variable w3 equal 75
+variable w4 equal 79
+
+variable x_cylinder equal 20
+variable y_cylinder equal 17
+variable r_cylinder equal 4
+
+variable MASS equal 1
+variable TEMP equal 0.4
+variable VFLOW equal 0.5
+
+units lj
+atom_style atomic
+
+lattice fcc 0.3
+Lattice spacing in x,y,z = 2.3712622 2.3712622 2.3712622
+region sim_box block 0 84 0 34 0 10
+
+boundary p p p
+
+create_box 2 sim_box
+Created orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622)
+ 1 by 1 by 1 MPI processor grid
+region reg_cylinder cylinder z ${x_cylinder} ${y_cylinder} ${r_cylinder} EDGE EDGE
+region reg_cylinder cylinder z 20 ${y_cylinder} ${r_cylinder} EDGE EDGE
+region reg_cylinder cylinder z 20 17 ${r_cylinder} EDGE EDGE
+region reg_cylinder cylinder z 20 17 4 EDGE EDGE
+
+create_atoms 1 box
+Created 114240 atoms
+ using lattice units in orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622)
+ create_atoms CPU = 0.010 seconds
+
+## setup obstacle ##
+group g_obst region reg_cylinder
+1950 atoms in group g_obst
+group g_flow subtract all g_obst
+112290 atoms in group g_flow
+set group g_obst type 2
+Setting atom values ...
+ 1950 settings made for type
+
+mass 1 ${MASS}
+mass 1 1
+mass 2 ${MASS}
+mass 2 1
+
+velocity g_flow create ${TEMP} 4928459 rot yes dist gaussian
+velocity g_flow create 0.4 4928459 rot yes dist gaussian
+velocity g_obst set 0.0 0.0 0.0
+
+pair_style lj/cut 1.122462
+pair_coeff 1 1 1.0 1.0
+pair_coeff 1 2 1.0 1.0
+pair_coeff 2 2 1.0 1.0
+pair_modify shift yes
+
+neighbor 0.3 bin
+neigh_modify delay 0 every 20 check no
+
+fix 1 g_flow nve
+fix 2 g_flow wall/flow x ${VFLOW} ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 79
+
+variable dump_every equal ${nrun}/${dump_count}
+variable dump_every equal 1000/${dump_count}
+variable dump_every equal 1000/10
+variable thermo_every equal ${dump_every}
+variable thermo_every equal 100
+variable restart_every equal ${nrun}/10
+variable restart_every equal 1000/10
+
+##### uncomment for grid aggregation #####
+#variable gr_Nx equal 42
+#variable gr_Ny equal 17
+#variable gr_Nz equal 1
+#variable gr_Nevery equal ${dump_every}
+#variable gr_Nrepeat equal 1
+#variable gr_Nfreq equal ${dump_every}
+#fix 3 g_flow ave/grid ${gr_Nevery} ${gr_Nrepeat} ${gr_Nfreq} ${gr_Nx} ${gr_Ny} ${gr_Nz} vx vy vz density/mass norm all ave one
+#compute ct_gridId g_flow property/grid ${gr_Nx} ${gr_Ny} ${gr_Nz} id
+#dump dmp_grid g_flow grid ${dump_every} grid.lammpstrj c_ct_gridId:grid:data f_3:grid:data[*]
+##########################################
+
+#dump dmp_coord all atom ${dump_every} dump.lammpstrj
+
+#compute ct_Temp g_flow temp/com
+#thermo_style custom step temp epair emol etotal press c_ct_Temp
+
+#restart ${restart_every} flow.restart
+
+timestep 0.005
+thermo ${thermo_every}
+thermo 100
+run ${nrun}
+run 1000
+
+CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
+
+Your simulation uses code contributions which should be cited:
+
+- fix wall/flow command: doi:10.1177/10943420231213013
+
+@Article{Pavlov-etal-IJHPCA-2024,
+ author = {Daniil Pavlov and Vladislav Galigerov and Daniil Kolotinskii and Vsevolod Nikolskiy and Vladimir Stegailov},
+ title = {GPU-based molecular dynamics of fluid flows: Reaching for turbulence},
+ journal = {The International Journal of High Performance Computing Applications},
+ year = 2024,
+ volume = 38,
+ number = 1,
+ pages = 34-49
+}
+
+CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
+
+Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule
+Neighbor list info ...
+ update: every = 20 steps, delay = 0 steps, check = no
+ max neighbors/atom: 2000, page size: 100000
+ master list distance cutoff = 1.422462
+ ghost atom cutoff = 1.422462
+ binsize = 0.711231, bins = 281 114 34
+ 1 neighbor lists, perpetual/occasional/extra = 1 0 0
+ (1) pair lj/cut, perpetual
+ attributes: half, newton on
+ pair build: half/bin/atomonly/newton
+ stencil: half/bin/3d
+ bin: standard
+Per MPI rank memory allocation (min/avg/max) = 26.69 | 26.69 | 26.69 Mbytes
+ Step Temp E_pair E_mol TotEng Press
+ 0 0.39317221 0 0 0.58975315 0.11795063
+ 100 0.3671684 0.045118445 0 0.59586622 0.27378331
+ 200 0.3732041 0.036897471 0 0.59669873 0.24917809
+ 300 0.37432305 0.036501844 0 0.5979815 0.24715194
+ 400 0.37603886 0.035350565 0 0.59940392 0.24480762
+ 500 0.37617142 0.036949771 0 0.60120196 0.24862985
+ 600 0.37751983 0.036484268 0 0.60275905 0.24784635
+ 700 0.37787831 0.037327783 0 0.60414029 0.25060427
+ 800 0.37959242 0.036206184 0 0.60558983 0.2476903
+ 900 0.38019033 0.036874395 0 0.6071549 0.24984211
+ 1000 0.38070666 0.037068948 0 0.60812395 0.25041936
+Loop time of 5.61598 on 1 procs for 1000 steps with 114240 atoms
+
+Performance: 76923.319 tau/day, 178.063 timesteps/s, 20.342 Matom-step/s
+99.7% CPU use with 1 MPI tasks x 1 OpenMP threads
+
+MPI task timing breakdown:
+Section | min time | avg time | max time |%varavg| %total
+---------------------------------------------------------------
+Pair | 2.6351 | 2.6351 | 2.6351 | 0.0 | 46.92
+Neigh | 1.2994 | 1.2994 | 1.2994 | 0.0 | 23.14
+Comm | 0.26576 | 0.26576 | 0.26576 | 0.0 | 4.73
+Output | 0.0030531 | 0.0030531 | 0.0030531 | 0.0 | 0.05
+Modify | 1.3019 | 1.3019 | 1.3019 | 0.0 | 23.18
+Other | | 0.1107 | | | 1.97
+
+Nlocal: 114240 ave 114240 max 114240 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+Nghost: 20119 ave 20119 max 20119 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+Neighs: 164018 ave 164018 max 164018 min
+Histogram: 1 0 0 0 0 0 0 0 0 0
+
+Total # of neighbors = 164018
+Ave neighs/atom = 1.4357318
+Neighbor list builds = 50
+Dangerous builds not checked
+Total wall time: 0:00:05
diff --git a/examples/wall/log.7Feb24.wall.flow.g++.4 b/examples/wall/log.7Feb24.wall.flow.g++.4
new file mode 100644
index 0000000000..1efe7bb28e
--- /dev/null
+++ b/examples/wall/log.7Feb24.wall.flow.g++.4
@@ -0,0 +1,182 @@
+LAMMPS (21 Nov 2023 - Development - patch_21Nov2023-758-ge33590b2fc-modified)
+ using 1 OpenMP thread(s) per MPI task
+variable nrun equal 1000
+variable dump_count equal 10
+
+variable nwall equal 4
+variable w1 equal 67
+variable w2 equal 71
+variable w3 equal 75
+variable w4 equal 79
+
+variable x_cylinder equal 20
+variable y_cylinder equal 17
+variable r_cylinder equal 4
+
+variable MASS equal 1
+variable TEMP equal 0.4
+variable VFLOW equal 0.5
+
+units lj
+atom_style atomic
+
+lattice fcc 0.3
+Lattice spacing in x,y,z = 2.3712622 2.3712622 2.3712622
+region sim_box block 0 84 0 34 0 10
+
+boundary p p p
+
+create_box 2 sim_box
+Created orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622)
+ 4 by 1 by 1 MPI processor grid
+region reg_cylinder cylinder z ${x_cylinder} ${y_cylinder} ${r_cylinder} EDGE EDGE
+region reg_cylinder cylinder z 20 ${y_cylinder} ${r_cylinder} EDGE EDGE
+region reg_cylinder cylinder z 20 17 ${r_cylinder} EDGE EDGE
+region reg_cylinder cylinder z 20 17 4 EDGE EDGE
+
+create_atoms 1 box
+Created 114240 atoms
+ using lattice units in orthogonal box = (0 0 0) to (199.18603 80.622915 23.712622)
+ create_atoms CPU = 0.003 seconds
+
+## setup obstacle ##
+group g_obst region reg_cylinder
+1950 atoms in group g_obst
+group g_flow subtract all g_obst
+112290 atoms in group g_flow
+set group g_obst type 2
+Setting atom values ...
+ 1950 settings made for type
+
+mass 1 ${MASS}
+mass 1 1
+mass 2 ${MASS}
+mass 2 1
+
+velocity g_flow create ${TEMP} 4928459 rot yes dist gaussian
+velocity g_flow create 0.4 4928459 rot yes dist gaussian
+velocity g_obst set 0.0 0.0 0.0
+
+pair_style lj/cut 1.122462
+pair_coeff 1 1 1.0 1.0
+pair_coeff 1 2 1.0 1.0
+pair_coeff 2 2 1.0 1.0
+pair_modify shift yes
+
+neighbor 0.3 bin
+neigh_modify delay 0 every 20 check no
+
+fix 1 g_flow nve
+fix 2 g_flow wall/flow x ${VFLOW} ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 ${TEMP} 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 ${nwall} ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 ${w1} ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 ${w2} ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 ${w3} ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 ${w4}
+fix 2 g_flow wall/flow x 0.5 0.4 123 4 67 71 75 79
+
+variable dump_every equal ${nrun}/${dump_count}
+variable dump_every equal 1000/${dump_count}
+variable dump_every equal 1000/10
+variable thermo_every equal ${dump_every}
+variable thermo_every equal 100
+variable restart_every equal ${nrun}/10
+variable restart_every equal 1000/10
+
+##### uncomment for grid aggregation #####
+#variable gr_Nx equal 42
+#variable gr_Ny equal 17
+#variable gr_Nz equal 1
+#variable gr_Nevery equal ${dump_every}
+#variable gr_Nrepeat equal 1
+#variable gr_Nfreq equal ${dump_every}
+#fix 3 g_flow ave/grid ${gr_Nevery} ${gr_Nrepeat} ${gr_Nfreq} ${gr_Nx} ${gr_Ny} ${gr_Nz} vx vy vz density/mass norm all ave one
+#compute ct_gridId g_flow property/grid ${gr_Nx} ${gr_Ny} ${gr_Nz} id
+#dump dmp_grid g_flow grid ${dump_every} grid.lammpstrj c_ct_gridId:grid:data f_3:grid:data[*]
+##########################################
+
+#dump dmp_coord all atom ${dump_every} dump.lammpstrj
+
+#compute ct_Temp g_flow temp/com
+#thermo_style custom step temp epair emol etotal press c_ct_Temp
+
+#restart ${restart_every} flow.restart
+
+timestep 0.005
+thermo ${thermo_every}
+thermo 100
+run ${nrun}
+run 1000
+
+CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
+
+Your simulation uses code contributions which should be cited:
+
+- fix wall/flow command: doi:10.1177/10943420231213013
+
+@Article{Pavlov-etal-IJHPCA-2024,
+ author = {Daniil Pavlov and Vladislav Galigerov and Daniil Kolotinskii and Vsevolod Nikolskiy and Vladimir Stegailov},
+ title = {GPU-based molecular dynamics of fluid flows: Reaching for turbulence},
+ journal = {The International Journal of High Performance Computing Applications},
+ year = 2024,
+ volume = 38,
+ number = 1,
+ pages = 34-49
+}
+
+CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE-CITE
+
+Generated 0 of 1 mixed pair_coeff terms from geometric mixing rule
+Neighbor list info ...
+ update: every = 20 steps, delay = 0 steps, check = no
+ max neighbors/atom: 2000, page size: 100000
+ master list distance cutoff = 1.422462
+ ghost atom cutoff = 1.422462
+ binsize = 0.711231, bins = 281 114 34
+ 1 neighbor lists, perpetual/occasional/extra = 1 0 0
+ (1) pair lj/cut, perpetual
+ attributes: half, newton on
+ pair build: half/bin/atomonly/newton
+ stencil: half/bin/3d
+ bin: standard
+Per MPI rank memory allocation (min/avg/max) = 8.496 | 8.496 | 8.496 Mbytes
+ Step Temp E_pair E_mol TotEng Press
+ 0 0.39317221 0 0 0.58975315 0.11795063
+ 100 0.36726398 0.045386014 0 0.59627716 0.27402111
+ 200 0.37384538 0.036574547 0 0.5973377 0.24836729
+ 300 0.37487455 0.036519645 0 0.59882654 0.24691726
+ 400 0.37591417 0.036405755 0 0.60027207 0.24700641
+ 500 0.37654714 0.037008829 0 0.60182459 0.24883444
+ 600 0.3778008 0.03663706 0 0.6033333 0.24874392
+ 700 0.37851338 0.036714175 0 0.60447928 0.24881829
+ 800 0.37984876 0.036237049 0 0.6060052 0.24843003
+ 900 0.38022763 0.036847615 0 0.60718407 0.24987198
+ 1000 0.38084717 0.037139994 0 0.60840575 0.25070072
+Loop time of 2.20347 on 4 procs for 1000 steps with 114240 atoms
+
+Performance: 196054.093 tau/day, 453.829 timesteps/s, 51.845 Matom-step/s
+95.6% CPU use with 4 MPI tasks x 1 OpenMP threads
+
+MPI task timing breakdown:
+Section | min time | avg time | max time |%varavg| %total
+---------------------------------------------------------------
+Pair | 0.67927 | 0.70882 | 0.73473 | 2.4 | 32.17
+Neigh | 0.32928 | 0.34467 | 0.36084 | 2.0 | 15.64
+Comm | 0.3211 | 0.36609 | 0.40741 | 6.1 | 16.61
+Output | 0.0017748 | 0.0032465 | 0.0046508 | 2.1 | 0.15
+Modify | 0.71135 | 0.74424 | 0.76001 | 2.3 | 33.78
+Other | | 0.03641 | | | 1.65
+
+Nlocal: 28560 ave 29169 max 27884 min
+Histogram: 1 0 0 0 0 2 0 0 0 1
+Nghost: 6452.25 ave 6546 max 6368 min
+Histogram: 1 0 0 0 2 0 0 0 0 1
+Neighs: 40893 ave 42032 max 39445 min
+Histogram: 1 0 0 0 1 0 0 1 0 1
+
+Total # of neighbors = 163572
+Ave neighs/atom = 1.4318277
+Neighbor list builds = 50
+Dangerous builds not checked
+Total wall time: 0:00:02
diff --git a/src/.gitignore b/src/.gitignore
index ba4c4c05b0..88bb80fdc5 100644
--- a/src/.gitignore
+++ b/src/.gitignore
@@ -1025,6 +1025,8 @@
/fix_wall_colloid.h
/fix_wall_ees.cpp
/fix_wall_ees.h
+/fix_wall_flow.cpp
+/fix_wall_flow.h
/fix_wall_region_ees.cpp
/fix_wall_region_ees.h
/fix_wall_reflect_stochastic.cpp
diff --git a/src/EXTRA-FIX/fix_wall_flow.cpp b/src/EXTRA-FIX/fix_wall_flow.cpp
new file mode 100644
index 0000000000..1f3dcfca5b
--- /dev/null
+++ b/src/EXTRA-FIX/fix_wall_flow.cpp
@@ -0,0 +1,325 @@
+/* ----------------------------------------------------------------------
+ LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
+ https://www.lammps.org/, Sandia National Laboratories
+ LAMMPS development team: developers@lammps.org
+
+ Copyright (2003) Sandia Corporation. Under the terms of Contract
+ DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
+ certain rights in this software. This software is distributed under
+ the GNU General Public License.
+
+ See the README file in the top-level LAMMPS directory.
+------------------------------------------------------------------------- */
+
+/* ----------------------------------------------------------------------
+ Contributing authors: Vladislav Galigerov (HSE),
+ Daniil Pavlov (MIPT)
+------------------------------------------------------------------------- */
+
+#include "fix_wall_flow.h"
+
+#include "atom.h"
+#include "citeme.h"
+#include "comm.h"
+#include "domain.h"
+#include "error.h"
+#include "force.h"
+#include "input.h"
+#include "lattice.h"
+#include "math_const.h"
+#include "memory.h"
+#include "modify.h"
+#include "random_mars.h"
+#include "update.h"
+#include "variable.h"
+
+#include <algorithm>
+#include <cmath>
+#include <cstring>
+
+using namespace LAMMPS_NS;
+using namespace FixConst;
+
+/* ---------------------------------------------------------------------- */
+
+static const char cite_fix_wall_flow_c[] =
+ "fix wall/flow command: doi:10.1177/10943420231213013\n\n"
+ "@Article{Pavlov-etal-IJHPCA-2024,\n"
+ " author = {Daniil Pavlov and Vladislav Galigerov and Daniil Kolotinskii and Vsevolod "
+ "Nikolskiy and Vladimir Stegailov},\n"
+ " title = {GPU-based molecular dynamics of fluid flows: Reaching for turbulence},\n"
+ " journal = {The International Journal of High Performance Computing Applications},\n"
+ " year = 2024,\n"
+ " volume = 38,\n"
+ " number = 1,\n"
+ " pages = 34-49\n"
+ "}\n\n";
+
+FixWallFlow::FixWallFlow(LAMMPS *lmp, int narg, char **arg) :
+ Fix(lmp, narg, arg), flowax(FlowAxis::AX_X), flowvel(0.0), flowdir(0), rndseed(0),
+ current_segment(nullptr)
+{
+ if (lmp->citeme) lmp->citeme->add(cite_fix_wall_flow_c);
+ if (narg < 9) utils::missing_cmd_args(FLERR, "fix wall/flow", error);
+
+ if (domain->triclinic != 0)
+ error->all(FLERR, "Fix wall/flow cannot be used with triclinic simulation box");
+
+ dynamic_group_allow = 1;
+ bool do_abort = false;
+
+ int iarg = 3;
+ // parsing axis
+ if (strcmp(arg[iarg], "x") == 0)
+ flowax = FlowAxis::AX_X;
+ else if (strcmp(arg[iarg], "y") == 0)
+ flowax = FlowAxis::AX_Y;
+ else if (strcmp(arg[iarg], "z") == 0)
+ flowax = FlowAxis::AX_Z;
+ else
+ error->all(FLERR, "Illegal fix wall/flow argument: axis must by x or y or z, but {} specified",
+ arg[iarg]);
+
+ if (domain->periodicity[flowax] != 1)
+ error->all(FLERR,
+ "Fix wall/flow cannot be used with a non-periodic boundary along the flow axis");
+
+ ++iarg;
+ // parsing velocity
+ flowvel = utils::numeric(FLERR, arg[iarg], do_abort, lmp);
+ if (flowvel == 0.0) error->all(FLERR, "Illegal fix wall/flow argument: velocity cannot be 0");
+ if (flowvel > 0.0)
+ flowdir = 1;
+ else
+ flowdir = -1;
+ if (flowdir < 0)
+ error->all(FLERR, "Illegal fix wall/flow argument: negative direction is not supported yet");
+
+ ++iarg;
+ // parsing temperature
+ double flowtemp = utils::numeric(FLERR, arg[iarg], do_abort, lmp);
+ kT = lmp->force->boltz * flowtemp / force->mvv2e;
+
+ ++iarg;
+ // parsing seed
+ rndseed = utils::inumeric(FLERR, arg[iarg], do_abort, lmp);
+ if (rndseed <= 0)
+ error->all(FLERR, "Illegal fix wall/flow argument: random seed must be positive integer");
+
+ ++iarg;
+ // parsing wall count
+ int wallcount = utils::inumeric(FLERR, arg[iarg], do_abort, lmp);
+ if (wallcount <= 0)
+ error->all(FLERR, "Illegal fix wall/flow argument: wall count must be positive integer");
+
+ ++iarg;
+ // parsing walls
+ if (narg - iarg != wallcount && narg - iarg != wallcount + 2)
+ error->all(FLERR, "Wrong fix wall/flow wall count");
+
+ double scale = 0.0;
+ if (flowax == FlowAxis::AX_X)
+ scale = domain->lattice->xlattice;
+ else if (flowax == FlowAxis::AX_Y)
+ scale = domain->lattice->ylattice;
+ else if (flowax == FlowAxis::AX_Z)
+ scale = domain->lattice->zlattice;
+
+ if (narg - iarg == wallcount + 2) {
+ if (strcmp(arg[narg - 2], "units") != 0) error->all(FLERR, "Wrong fix wall/flow units command");
+ if (strcmp(arg[narg - 1], "box") == 0)
+ scale = 1.0;
+ else if (strcmp(arg[narg - 1], "lattice") != 0)
+ error->all(FLERR, "Wrong fix wall/flow units command");
+ }
+
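+  // pad the list of user-specified walls with the box bounds along the
+  // flow axis; since the box is periodic in this direction, the two
+  // padding entries represent one additional transparent wall at the
+  // domain boundary (N+1 effective walls)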
+ walls.resize(wallcount + 2);
+ walls.front() = domain->boxlo[flowax];
+ for (int w = 1; w <= wallcount; ++w, ++iarg) {
+ walls[w] = utils::numeric(FLERR, arg[iarg], do_abort, lmp) * scale;
+ }
+ walls.back() = domain->boxhi[flowax];
+  if (!std::is_sorted(walls.begin(), walls.end(), std::less_equal<double>())) {
+ error->all(FLERR,
+ "Wrong fix wall/flow wall ordering or some walls are outside simulation domain");
+ }
+
+ if (std::adjacent_find(walls.begin(), walls.end()) != walls.end()) {
+ error->all(FLERR,
+ "Wrong fix wall/flow wall coordinates: some walls have the same coordinates or lie "
+ "on the boundary");
+ }
+
+ memory->grow(current_segment, atom->nmax, "WallFlow::current_segment");
+ atom->add_callback(Atom::GROW);
+ if (restart_peratom) atom->add_callback(Atom::RESTART);
+
+ maxexchange = 1;
+
+ random = new RanMars(lmp, rndseed + comm->me);
+}
+
+/* ---------------------------------------------------------------------- */
+
+FixWallFlow::~FixWallFlow()
+{
+ if (copymode) return;
+ atom->delete_callback(id, Atom::GROW);
+ if (restart_peratom) atom->delete_callback(id, Atom::RESTART);
+ memory->destroy(current_segment);
+
+ delete random;
+}
+
+/* ---------------------------------------------------------------------- */
+
+int FixWallFlow::setmask()
+{
+ int mask = 0;
+
+ mask |= END_OF_STEP;
+
+ return mask;
+}
+
+/* ---------------------------------------------------------------------- */
+
+void FixWallFlow::init()
+{
+ if (domain->triclinic != 0)
+ error->all(FLERR, "Fix wall/flow cannot be used with triclinic simulation box");
+
+ int nrigid = 0;
+ int box_change_flowax = 0;
+ for (const auto &ifix : modify->get_fix_list()) {
+ if (ifix->rigid_flag) nrigid++;
+ switch (flowax) {
+ case FlowAxis::AX_X:
+ if (ifix->box_change & Fix::BOX_CHANGE_X) box_change_flowax++;
+ break;
+ case FlowAxis::AX_Y:
+ if (ifix->box_change & Fix::BOX_CHANGE_Y) box_change_flowax++;
+ break;
+ case FlowAxis::AX_Z:
+ if (ifix->box_change & Fix::BOX_CHANGE_Z) box_change_flowax++;
+ break;
+ }
+ }
+
+ if (nrigid) error->all(FLERR, "Fix wall/flow is not compatible with rigid bodies");
+ if (box_change_flowax)
+ error->all(
+ FLERR,
+ "Fix wall/flow is not compatible with simulation box size changing along flow direction");
+
+ for (int i = 0; i < atom->nlocal; ++i) {
+ double pos = atom->x[i][flowax];
+ current_segment[i] = compute_current_segment(pos);
+ }
+}
+
+/* ---------------------------------------------------------------------- */
+
+void FixWallFlow::end_of_step()
+{
+ double **x = atom->x;
+ int *mask = atom->mask;
+ int nlocal = atom->nlocal;
+
+ for (int i = 0; i < nlocal; ++i) {
+ if (mask[i] & groupbit) {
+ double pos = x[i][flowax];
+ int prev_segment = current_segment[i];
+ current_segment[i] = compute_current_segment(pos);
+
+ if (prev_segment != current_segment[i]) generate_velocity(i);
+ }
+ }
+}
+
+/* ---------------------------------------------------------------------- */
+
+void FixWallFlow::generate_velocity(int atom_i)
+{
+ const int newton_iteration_count = 10;
+ double *vel = atom->v[atom_i];
+
+ double *prmass = atom->rmass;
+ double *pmass = atom->mass;
+ double mass = 0.0;
+ if (prmass)
+ mass = prmass[atom_i];
+ else
+ mass = pmass[atom->type[atom_i]];
+
+ const double gamma = 1.0 / std::sqrt(2.0 * kT / mass);
+ double delta = gamma * flowvel;
+
+ const double edd = std::exp(-delta * delta) / MathConst::MY_PIS + delta * std::erf(delta);
+  const double probability_threshold = 0.5 * (1.0 + delta / edd);
+
+ double direction = 1.0;
+
+ if (random->uniform() > probability_threshold) {
+ delta = -delta;
+ direction = -direction;
+ }
+
+ const double xi_0 = random->uniform();
+ const double F_inf = edd + delta;
+ const double xi = xi_0 * F_inf;
+ const double x_0 = (std::sqrt(delta * delta + 2) - delta) * 0.5;
+ double x = x_0;
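+  // the Newton iterations below invert the cumulative distribution of the
+  // flux-weighted (shifted Maxwell) distribution to sample the wall-normal
+  // velocity; x = gamma*v - delta is the scaled velocity relative to the flow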
+ for (int i = 0; i < newton_iteration_count; ++i) {
+ x -= (std::exp(x * x) * MathConst::MY_PIS * (xi - delta * std::erfc(x)) - 1.0) / (x + delta) *
+ 0.5;
+ }
+
+ const double nu = x + delta;
+ const double v = nu / gamma;
+
+ vel[flowax] = v * direction;
+ vel[(flowax + 1) % 3] = random->gaussian() / (gamma * MathConst::MY_SQRT2);
+ vel[(flowax + 2) % 3] = random->gaussian() / (gamma * MathConst::MY_SQRT2);
+}
+
+/* ---------------------------------------------------------------------- */
+
+int FixWallFlow::compute_current_segment(double pos) const
+{
+ int result = 0;
+ for (; result < (int)walls.size() - 1; ++result) {
+ if (pos >= walls[result] && pos < walls[result + 1]) { return result; }
+ }
+ return -1; // -1 is "out of box" region
+}
+
+/* ---------------------------------------------------------------------- */
+
+void FixWallFlow::grow_arrays(int nmax)
+{
+ memory->grow(current_segment, nmax, "WallFlow::current_segment");
+}
+
+/* ---------------------------------------------------------------------- */
+
+void FixWallFlow::copy_arrays(int i, int j, int)
+{
+ current_segment[j] = current_segment[i];
+}
+
+/* ---------------------------------------------------------------------- */
+
+int FixWallFlow::pack_exchange(int i, double *buf)
+{
+  buf[0] = static_cast<double>(current_segment[i]);
+ return 1;
+}
+
+/* ---------------------------------------------------------------------- */
+
+int FixWallFlow::unpack_exchange(int i, double *buf)
+{
+  current_segment[i] = static_cast<int>(buf[0]);
+ return 1;
+}
diff --git a/src/EXTRA-FIX/fix_wall_flow.h b/src/EXTRA-FIX/fix_wall_flow.h
new file mode 100644
index 0000000000..6a662f3d94
--- /dev/null
+++ b/src/EXTRA-FIX/fix_wall_flow.h
@@ -0,0 +1,61 @@
+/* -*- c++ -*- ----------------------------------------------------------
+ LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
+ https://www.lammps.org/, Sandia National Laboratories
+ LAMMPS development team: developers@lammps.org
+
+ Copyright (2003) Sandia Corporation. Under the terms of Contract
+ DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
+ certain rights in this software. This software is distributed under
+ the GNU General Public License.
+
+ See the README file in the top-level LAMMPS directory.
+------------------------------------------------------------------------- */
+
+#ifdef FIX_CLASS
+// clang-format off
+FixStyle(wall/flow,FixWallFlow);
+// clang-format on
+#else
+
+#ifndef LMP_FIX_WALL_FLOW_H
+#define LMP_FIX_WALL_FLOW_H
+
+#include "fix.h"
+
+namespace LAMMPS_NS {
+
+class FixWallFlow : public Fix {
+ public:
+ enum FlowAxis { AX_X = 0, AX_Y = 1, AX_Z = 2 };
+
+ FixWallFlow(class LAMMPS *, int, char **);
+ ~FixWallFlow() override;
+ int setmask() override;
+ void init() override;
+ void end_of_step() override;
+
+ void grow_arrays(int) override;
+ void copy_arrays(int, int, int) override;
+
+ int pack_exchange(int, double *) override;
+ int unpack_exchange(int, double *) override;
+
+ protected:
+ FlowAxis flowax;
+ double flowvel;
+ double kT;
+  std::vector<double> walls;
+
+ int flowdir;
+ int rndseed;
+ class RanMars *random;
+ int *current_segment;
+
+ int compute_current_segment(double pos) const;
+ void generate_velocity(int i);
+};
+
+} // namespace LAMMPS_NS
+
+#endif
+#endif
diff --git a/src/KOKKOS/Install.sh b/src/KOKKOS/Install.sh
index 462c0cbe57..75949c35d8 100755
--- a/src/KOKKOS/Install.sh
+++ b/src/KOKKOS/Install.sh
@@ -187,6 +187,8 @@ action fix_temp_rescale_kokkos.cpp
action fix_temp_rescale_kokkos.h
action fix_viscous_kokkos.cpp
action fix_viscous_kokkos.h
+action fix_wall_flow_kokkos.cpp fix_wall_flow.cpp
+action fix_wall_flow_kokkos.h fix_wall_flow.h
action fix_wall_gran_kokkos.cpp fix_wall_gran.cpp
action fix_wall_gran_kokkos.h fix_wall_gran.h
action fix_wall_gran_old.cpp fix_wall_gran.cpp
diff --git a/src/KOKKOS/fix_wall_flow_kokkos.cpp b/src/KOKKOS/fix_wall_flow_kokkos.cpp
new file mode 100644
index 0000000000..b6b3f7c096
--- /dev/null
+++ b/src/KOKKOS/fix_wall_flow_kokkos.cpp
@@ -0,0 +1,291 @@
+/* ----------------------------------------------------------------------
+ LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
+ https://www.lammps.org/, Sandia National Laboratories
+ LAMMPS development team: developers@lammps.org
+
+ Copyright (2003) Sandia Corporation. Under the terms of Contract
+ DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
+ certain rights in this software. This software is distributed under
+ the GNU General Public License.
+
+ See the README file in the top-level LAMMPS directory.
+------------------------------------------------------------------------- */
+
+/* ----------------------------------------------------------------------
+ Contributing authors: Vladislav Galigerov (HSE),
+ Daniil Pavlov (MIPT)
+------------------------------------------------------------------------- */
+
+#include "fix_wall_flow_kokkos.h"
+#include "atom_kokkos.h"
+#include "atom_masks.h"
+#include "force.h"
+#include "math_const.h"
+#include "memory_kokkos.h"
+
+using namespace LAMMPS_NS;
+
+template <class DeviceType>
+FixWallFlowKokkos<DeviceType>::FixWallFlowKokkos(LAMMPS *lmp, int narg, char **arg) :
+ FixWallFlow(lmp, narg, arg), rand_pool(rndseed + comm->me)
+{
+ kokkosable = 1;
+ exchange_comm_device = sort_device = 1;
+ atomKK = (AtomKokkos *) atom;
+  execution_space = ExecutionSpaceFromDevice<DeviceType>::space;
+ datamask_read = X_MASK | RMASS_MASK | TYPE_MASK | MASK_MASK;
+ datamask_modify = V_MASK;
+
+ memory->destroy(current_segment);
+ current_segment = nullptr;
+ grow_arrays(atomKK->nmax);
+
+ d_walls = d_walls_t("FixWallFlowKokkos::walls", walls.size());
+ auto h_walls = Kokkos::create_mirror_view(d_walls);
+ for (int i = 0; i < (int) walls.size(); ++i) h_walls(i) = walls[i];
+ Kokkos::deep_copy(d_walls, h_walls);
+}
+
+template <class DeviceType> FixWallFlowKokkos<DeviceType>::~FixWallFlowKokkos()
+{
+ if (copymode) return;
+ memoryKK->destroy_kokkos(k_current_segment, current_segment);
+}
+
+template <class DeviceType> void FixWallFlowKokkos<DeviceType>::init()
+{
+  atomKK->sync(execution_space, datamask_read);
+  k_current_segment.template sync<DeviceType>();
+  d_x = atomKK->k_x.template view<DeviceType>();
+
+  copymode = 1;
+  Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagFixWallFlowInit>(0, atom->nlocal), *this);
+  copymode = 0;
+
+  k_current_segment.template modify<DeviceType>();
+}
+
+template <class DeviceType>
+KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos<DeviceType>::operator()(TagFixWallFlowInit,
+ const int &i) const
+{
+ double pos = d_x(i, flowax);
+ d_current_segment(i) = compute_current_segment_kk(pos);
+}
+
+template <class DeviceType> void FixWallFlowKokkos<DeviceType>::end_of_step()
+{
+  atomKK->sync(execution_space, datamask_read);
+  k_current_segment.template sync<DeviceType>();
+
+  d_x = atomKK->k_x.template view<DeviceType>();
+  d_v = atomKK->k_v.template view<DeviceType>();
+  d_type = atomKK->k_type.template view<DeviceType>();
+  d_mask = atomKK->k_mask.template view<DeviceType>();
+  d_mass = atomKK->k_mass.template view<DeviceType>();
+  d_rmass = atomKK->k_rmass.template view<DeviceType>();
+
+  copymode = 1;
+  if (d_rmass.data()) {
+    Kokkos::parallel_for(
+        Kokkos::RangePolicy<DeviceType, TagFixWallFlowEndOfStep<RMassTag>>(0, atom->nlocal), *this);
+  } else {
+    Kokkos::parallel_for(
+        Kokkos::RangePolicy<DeviceType, TagFixWallFlowEndOfStep<MassTag>>(0, atom->nlocal), *this);
+  }
+  copymode = 0;
+  atomKK->modified(execution_space, datamask_modify);
+  k_current_segment.template modify<DeviceType>();
+}
+
+template <class DeviceType>
+template <typename MTag>
+KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos<DeviceType>::operator()(TagFixWallFlowEndOfStep<MTag>,
+ const int &atom_i) const
+{
+ if (d_mask[atom_i] & groupbit) {
+ double pos = d_x(atom_i, flowax);
+ int prev_segment = d_current_segment(atom_i);
+ d_current_segment(atom_i) = compute_current_segment_kk(pos);
+    if (prev_segment != d_current_segment(atom_i)) { generate_velocity_kk<MTag>(atom_i); }
+ }
+}
+
+template <class DeviceType>
+template <typename MTag>
+KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos<DeviceType>::generate_velocity_kk(int atom_i) const
+{
+ const int newton_iteration_count = 10;
+ double mass = get_mass(MTag(), atom_i);
+ const double gamma = 1.0 / std::sqrt(2.0 * kT / mass);
+ double delta = gamma * flowvel;
+
+ const double edd = std::exp(-delta * delta) / MathConst::MY_PIS + delta * std::erf(delta);
+ const double probability_threshold = 0.5 * (1. + delta / edd);
+
+ double direction = 1.0;
+
+ rand_type_t rand_gen = rand_pool.get_state();
+
+ if (/*random->uniform()*/ rand_gen.drand() > probability_threshold) {
+ delta = -delta;
+ direction = -direction;
+ }
+
+ const double xi_0 = rand_gen.drand(); //random->uniform();
+ const double F_inf = edd + delta;
+ const double xi = xi_0 * F_inf;
+ const double x_0 = (std::sqrt(delta * delta + 2) - delta) * 0.5;
+ double x = x_0;
+ for (int i = 0; i < newton_iteration_count; ++i) {
+ x -= (std::exp(x * x) * MathConst::MY_PIS * (xi - delta * std::erfc(x)) - 1.0) / (x + delta) *
+ 0.5;
+ }
+
+ const double nu = x + delta;
+ const double v = nu / gamma;
+
+ d_v(atom_i, flowax) = v * direction;
+ d_v(atom_i, (flowax + 1) % 3) =
+ /*random->gaussian()*/ rand_gen.normal() / (gamma * MathConst::MY_SQRT2);
+ d_v(atom_i, (flowax + 2) % 3) =
+ /*random->gaussian()*/ rand_gen.normal() / (gamma * MathConst::MY_SQRT2);
+
+ rand_pool.free_state(rand_gen);
+}
+
+template <class DeviceType>
+KOKKOS_INLINE_FUNCTION int
+FixWallFlowKokkos<DeviceType>::compute_current_segment_kk(double pos) const
+{
+ int result = 0;
+ for (; result < (int) d_walls.extent(0) - 1; ++result) {
+ if (pos >= d_walls[result] && pos < d_walls[result + 1]) { return result; }
+ }
+ return -1; // -1 is "out of box" region
+}
+
+template <class DeviceType> void FixWallFlowKokkos<DeviceType>::grow_arrays(int nmax)
+{
+  k_current_segment.template sync<DeviceType>();
+  memoryKK->grow_kokkos(k_current_segment, current_segment, nmax, "WallFlowKK::current_segment");
+  k_current_segment.template modify<DeviceType>();
+
+  d_current_segment = k_current_segment.template view<DeviceType>();
+  h_current_segment = k_current_segment.template view<LMPHostType>();
+}
+
+template <class DeviceType> void FixWallFlowKokkos<DeviceType>::copy_arrays(int i, int j, int)
+{
+  k_current_segment.template sync<LMPHostType>();
+  h_current_segment(j) = h_current_segment(i);
+  k_current_segment.template modify<LMPHostType>();
+}
+
+/* ----------------------------------------------------------------------
+ sort local atom-based arrays
+------------------------------------------------------------------------- */
+
+template <class DeviceType>
+void FixWallFlowKokkos<DeviceType>::sort_kokkos(Kokkos::BinSort<KeyViewType, BinOp> &Sorter)
+{
+ // always sort on the device
+
+ k_current_segment.sync_device();
+
+ Sorter.sort(LMPDeviceType(), k_current_segment.d_view);
+
+ k_current_segment.modify_device();
+}
+
+template <class DeviceType> int FixWallFlowKokkos<DeviceType>::pack_exchange(int i, double *buf)
+{
+  k_current_segment.sync_host();
+  buf[0] = static_cast<double>(h_current_segment(i));
+ return 1;
+}
+
+template <class DeviceType>
+KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos<DeviceType>::operator()(TagFixWallFlowPackExchange,
+ const int &mysend) const
+{
+ const int send_i = d_sendlist(mysend);
+ const int segment = d_current_segment(send_i);
+  d_buf(mysend) = static_cast<double>(segment);
+
+ const int copy_i = d_copylist(mysend);
+ if (copy_i > -1) { d_current_segment(send_i) = d_current_segment(copy_i); }
+}
+
+template <class DeviceType>
+int FixWallFlowKokkos<DeviceType>::pack_exchange_kokkos(const int &nsend,
+ DAT::tdual_xfloat_2d &k_buf,
+ DAT::tdual_int_1d k_sendlist,
+ DAT::tdual_int_1d k_copylist,
+ ExecutionSpace /*space*/)
+{
+  k_current_segment.template sync<DeviceType>();
+
+  k_buf.template sync<DeviceType>();
+  k_sendlist.template sync<DeviceType>();
+  k_copylist.template sync<DeviceType>();
+
+  d_sendlist = k_sendlist.view<DeviceType>();
+  d_copylist = k_copylist.view<DeviceType>();
+
+  d_buf = typename ArrayTypes<DeviceType>::t_xfloat_1d_um(k_buf.template view<DeviceType>().data(),
+ k_buf.extent(0) * k_buf.extent(1));
+
+ copymode = 1;
+
+  Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagFixWallFlowPackExchange>(0, nsend),
+ *this);
+
+ copymode = 0;
+
+  k_buf.template modify<DeviceType>();
+  k_current_segment.template modify<DeviceType>();
+
+ return nsend;
+}
+
+template <class DeviceType> int FixWallFlowKokkos<DeviceType>::unpack_exchange(int i, double *buf)
+{
+  k_current_segment.sync_host();
+  h_current_segment(i) = static_cast<int>(buf[0]);
+ k_current_segment.modify_host();
+ return 1;
+}
+
+template <class DeviceType>
+KOKKOS_INLINE_FUNCTION void FixWallFlowKokkos<DeviceType>::operator()(TagFixWallFlowUnpackExchange,
+ const int &i) const
+{
+ int index = d_indices(i);
+  if (index > -1) { d_current_segment(index) = static_cast<int>(d_buf(i)); }
+}
+
+template <class DeviceType>
+void FixWallFlowKokkos<DeviceType>::unpack_exchange_kokkos(DAT::tdual_xfloat_2d &k_buf,
+ DAT::tdual_int_1d &k_indices, int nrecv,
+ int /*nrecv1*/, int /*nextrarecv1*/,
+ ExecutionSpace /*space*/)
+{
+  d_buf = typename ArrayTypes<DeviceType>::t_xfloat_1d_um(k_buf.template view<DeviceType>().data(),
+ k_buf.extent(0) * k_buf.extent(1));
+  d_indices = k_indices.view<DeviceType>();
+
+ copymode = 1;
+  Kokkos::parallel_for(Kokkos::RangePolicy<DeviceType, TagFixWallFlowUnpackExchange>(0, nrecv),
+ *this);
+ copymode = 0;
+
+  k_current_segment.template modify<DeviceType>();
+}
+
+namespace LAMMPS_NS {
+template class FixWallFlowKokkos<LMPDeviceType>;
+#ifdef LMP_KOKKOS_GPU
+template class FixWallFlowKokkos<LMPHostType>;
+#endif
+} // namespace LAMMPS_NS
diff --git a/src/KOKKOS/fix_wall_flow_kokkos.h b/src/KOKKOS/fix_wall_flow_kokkos.h
new file mode 100644
index 0000000000..8de0eded0a
--- /dev/null
+++ b/src/KOKKOS/fix_wall_flow_kokkos.h
@@ -0,0 +1,130 @@
+/* -*- c++ -*- ----------------------------------------------------------
+ LAMMPS - Large-scale Atomic/Molecular Massively Parallel Simulator
+ https://www.lammps.org/, Sandia National Laboratories
+ LAMMPS development team: developers@lammps.org
+
+ Copyright (2003) Sandia Corporation. Under the terms of Contract
+ DE-AC04-94AL85000 with Sandia Corporation, the U.S. Government retains
+ certain rights in this software. This software is distributed under
+ the GNU General Public License.
+
+ See the README file in the top-level LAMMPS directory.
+------------------------------------------------------------------------- */
+
+#ifdef FIX_CLASS
+// clang-format off
+FixStyle(wall/flow/kk,FixWallFlowKokkos<LMPDeviceType>);
+FixStyle(wall/flow/kk/device,FixWallFlowKokkos<LMPDeviceType>);
+FixStyle(wall/flow/kk/host,FixWallFlowKokkos<LMPHostType>);
+// clang-format on
+#else
+
+// clang-format off
+#ifndef LMP_FIX_WALL_FLOW_KOKKOS_H
+#define LMP_FIX_WALL_FLOW_KOKKOS_H
+
+#include "fix_wall_flow.h"
+#include "kokkos_type.h"
+#include "kokkos_base.h"
+#include "Kokkos_Random.hpp"
+#include "comm_kokkos.h"
+
+namespace LAMMPS_NS {
+
+struct TagFixWallFlowInit{};
+template <typename MTag>
+struct TagFixWallFlowEndOfStep{};
+struct TagFixWallFlowPackExchange{};
+struct TagFixWallFlowUnpackExchange{};
+
+template <class DeviceType>
+class FixWallFlowKokkos : public FixWallFlow, public KokkosBase {
+ public:
+ typedef DeviceType device_type;
+  typedef ArrayTypes<DeviceType> AT;
+ struct MassTag{};
+ struct RMassTag{};
+ FixWallFlowKokkos(class LAMMPS *, int, char **);
+ ~FixWallFlowKokkos();
+
+ void init() override;
+ void end_of_step() override;
+ void grow_arrays(int) override;
+ void copy_arrays(int, int, int) override;
+  void sort_kokkos(Kokkos::BinSort<KeyViewType, BinOp> &Sorter) override;
+ int pack_exchange(int, double *) override;
+ int unpack_exchange(int, double *) override;
+
+ KOKKOS_INLINE_FUNCTION
+ void operator() (TagFixWallFlowInit, const int&) const;
+
+  template <typename MTag>
+  KOKKOS_INLINE_FUNCTION
+  void operator()(TagFixWallFlowEndOfStep<MTag>, const int&) const;
+
+ KOKKOS_INLINE_FUNCTION
+ void operator()(TagFixWallFlowPackExchange, const int&) const;
+
+ KOKKOS_INLINE_FUNCTION
+ void operator()(TagFixWallFlowUnpackExchange, const int&) const;
+
+ int pack_exchange_kokkos(const int &nsend,DAT::tdual_xfloat_2d &buf,
+ DAT::tdual_int_1d k_sendlist,
+ DAT::tdual_int_1d k_copylist,
+ ExecutionSpace space) override;
+
+ void unpack_exchange_kokkos(DAT::tdual_xfloat_2d &k_buf,
+ DAT::tdual_int_1d &indices,int nrecv,
+ int /*nrecv1*/, int /*nextrarecv1*/,
+ ExecutionSpace space) override;
+ protected:
+ typename AT::t_x_array d_x;
+ typename AT::t_v_array d_v;
+ typename AT::t_int_1d d_type;
+ typename AT::t_int_1d d_mask;
+
+ typename AT::t_float_1d d_mass;
+ typename AT::t_float_1d d_rmass;
+
+ typedef typename AT::t_xfloat_1d d_walls_t;
+  typedef Kokkos::Random_XorShift64_Pool<DeviceType> rand_pool_t;
+ typedef typename rand_pool_t::generator_type rand_type_t;
+
+ typename AT::tdual_int_1d k_current_segment;
+ typename AT::t_int_1d d_current_segment;
+ typename HAT::t_int_1d h_current_segment;
+
+ typename AT::t_int_1d d_sendlist;
+ typename AT::t_xfloat_1d d_buf;
+ typename AT::t_int_1d d_copylist;
+ typename AT::t_int_1d d_indices;
+
+ d_walls_t d_walls;
+
+ rand_pool_t rand_pool;
+
+  template <typename MTag>
+ KOKKOS_INLINE_FUNCTION
+ void generate_velocity_kk(int atom_i) const;
+
+ KOKKOS_INLINE_FUNCTION
+ int compute_current_segment_kk(double pos) const;
+
+ KOKKOS_INLINE_FUNCTION
+ double get_mass(MassTag, int atom_i) const
+ {
+ return d_mass(d_type(atom_i));
+ }
+
+ KOKKOS_INLINE_FUNCTION
+ double get_mass(RMassTag, int atom_i) const
+ {
+ return d_rmass(atom_i);
+ }
+};
+
+}
+
+#endif
+#endif
+