diff --git a/doc/source/using-ray-on-a-large-cluster.md b/doc/source/using-ray-on-a-large-cluster.md
index f90499c9c..ea101fa72 100644
--- a/doc/source/using-ray-on-a-large-cluster.md
+++ b/doc/source/using-ray-on-a-large-cluster.md
@@ -237,6 +237,8 @@ Note that the destination argument for this command must represent an absolute p
 
 ## Troubleshooting
 
+### Problems with parallel-ssh
+
 If any of the above commands fail, verify that the head node has SSH access to
 the other nodes by running
 
@@ -253,3 +255,19 @@ node with agent forwarding enabled. This is done as follows.
 ssh-add
 ssh -A ubuntu@
 ```
+
+### Configuring EC2 instances to increase the number of allowed Redis clients
+
+This section can be ignored unless you run into problems with the maximum
+number of Redis clients.
+
+* Ensure that the hard limit for the number of open file descriptors is set
+  to a large number (e.g., 65536). This only needs to be done on instances
+  where Redis shards will run --- by default, just the _head node_.
+  - Check the hard ulimit for open file descriptors with `ulimit -Hn`
+  - If that number is smaller than 65536, set the hard ulimit for open file descriptors
+    system-wide:
+    ```
+    sudo bash -c "echo $USER hard nofile 65536 >> /etc/security/limits.conf"
+    ```
+  - Logout and log back in
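
The `ulimit -Hn` check described in the added section could also be scripted, so it can be run on each instance that hosts Redis shards. The following is a minimal sketch, not part of the patch: `REQUIRED_NOFILE` and `nofile_limit_ok` are hypothetical names chosen for illustration, and the threshold of 65536 is the example value from the text above.

```shell
#!/usr/bin/env bash
# Sketch: report whether the hard open-file-descriptor limit meets a target.
# REQUIRED_NOFILE and nofile_limit_ok are illustrative names, not part of Ray.
REQUIRED_NOFILE=65536

# Succeeds (exit status 0) if the limit in $1 is at least the target in $2.
# `ulimit -Hn` prints either a number or the word "unlimited".
nofile_limit_ok() {
  local limit="$1" required="$2"
  [ "$limit" = "unlimited" ] || [ "$limit" -ge "$required" ]
}

current="$(ulimit -Hn)"
if nofile_limit_ok "$current" "$REQUIRED_NOFILE"; then
  echo "hard nofile limit is $current: OK"
else
  echo "hard nofile limit is $current: below $REQUIRED_NOFILE; raise it in /etc/security/limits.conf"
fi
```

The script only reports the current state and changes nothing. As the added section notes, an edit to `/etc/security/limits.conf` takes effect only after logging out and back in, so the check is worth rerunning in a fresh session.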