Fixing race condition with OpenMP for GPU styles using torque (missed in regression tests due to the first fix)
Documenting GPU package option for setting the number of threads (consistent with USER-INTEL and USER-OMP).
- document previously undocumented OpenCL tune settings
- implement OpenCL platform selection through prefixing the device type with the platform id separated by a colon
- allow passing custom tune parameters though postfixing the device type with the 13 tuneable parameters separated by commas
- remove an extra clear() that would delete device properties structs an cause LAMMPS to output garbage strings