How to run Keras on multiple cores?
TensorFlow automatically runs computations on as many cores as are available on a single machine.
If you have a distributed cluster, make sure you follow the instructions at https://www.tensorflow.org/how_tos/distributed/ to configure the cluster (e.g. create the tf.ClusterSpec correctly, etc.).
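For reference, a minimal sketch of a TF 1.x cluster spec; the host names, ports, and job layout below are placeholders, not something from the linked guide:

import tensorflow as tf

# Hypothetical two-worker, one-ps cluster; replace hosts/ports with your own.
cluster = tf.train.ClusterSpec({
    "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
    "ps": ["ps0.example.com:2222"]
})
# Start this process as worker 0 of the cluster.
server = tf.train.Server(cluster, job_name="worker", task_index=0)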
To help debug, you can use the log_device_placement configuration option on the session to have TensorFlow print out where computations are actually placed. (Note: this works both for GPUs and for distributed TensorFlow.)
import tensorflow as tf

# Create a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
Note that while TensorFlow's placement algorithm works fine for small computational graphs, you may get better performance on large graphs by manually placing computations on specific devices (e.g. using with tf.device(...): blocks).
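For instance, a minimal sketch of manual placement; the device string '/cpu:0' is illustrative, and you would pick whatever device suits your machine:

import tensorflow as tf

# Pin these ops to the first CPU device; '/gpu:0' would pin them to a GPU.
with tf.device('/cpu:0'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
    c = tf.matmul(a, b)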
For TensorFlow 1.x, you can configure a TensorFlow session yourself and register it as the Keras backend session:
import tensorflow
import keras

# Limit both TensorFlow thread pools to 8 threads (tune to your core count).
session_conf = tensorflow.ConfigProto(intra_op_parallelism_threads=8, inter_op_parallelism_threads=8)
tensorflow.set_random_seed(1)  # optional: fix the graph-level random seed
sess = tensorflow.Session(graph=tensorflow.get_default_graph(), config=session_conf)
keras.backend.set_session(sess)
For TensorFlow 2.x, most of the modules above are deprecated, so you need to call them through the compatibility shim, e.g. tensorflow.compat.v1.ConfigProto.
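For example, a minimal sketch of the same setup under TF 2.x; the thread counts are illustrative, and the compat.v1 calls mirror the 1.x snippet above:

import tensorflow as tf

# Same thread-pool configuration, routed through the compat.v1 shim.
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=8,
                                        inter_op_parallelism_threads=8)
sess = tf.compat.v1.Session(config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)

Alternatively, TF 2.x has a native threading API, tf.config.threading.set_intra_op_parallelism_threads and tf.config.threading.set_inter_op_parallelism_threads, which must be called before TensorFlow is initialized.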