mirror of
https://github.com/wassname/ray.git
synced 2026-06-27 19:00:36 +08:00
[DataFrame] readthedocs page for Pandas on Ray (#1714)
This commit is contained in:
committed by
Robert Nishihara
parent
adffc7bfea
commit
c19c2a4e60
@@ -80,6 +80,12 @@ Ray comes with libraries that accelerate deep learning and reinforcement learnin
|
||||
rllib.rst
|
||||
rllib-dev.rst
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:caption: Pandas on Ray
|
||||
|
||||
pandas_on_ray.rst
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
:caption: Examples
|
||||
|
||||
@@ -0,0 +1,71 @@
|
||||
Pandas on Ray
|
||||
=============
|
||||
|
||||
Pandas on Ray is an early stage DataFrame library that wraps Pandas and
|
||||
transparently distributes the data and computation. The user does not need to
|
||||
know how many cores their system has, nor do they need to specify how to
|
||||
distribute the data. In fact, users can continue using their previous Pandas
|
||||
notebooks while experiencing a considerable speedup from Pandas on Ray, even
|
||||
on a single machine. Only a modification of the import statement is needed, as
|
||||
we demonstrate below. Once you’ve changed your import statement, you’re ready
|
||||
to use Pandas on Ray just like you would Pandas.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
# import pandas as pd
|
||||
import ray.dataframe as pd
|
||||
|
||||
Currently, we have part of the Pandas API implemented and are working toward
|
||||
full functional parity with Pandas.
|
||||
|
||||
Using Pandas on Ray on a Single Node
|
||||
------------------------------------
|
||||
|
||||
In order to use the most up-to-date version of Pandas on Ray, please follow
|
||||
the instructions on the `installation page`_
|
||||
|
||||
Once you import the library, you should see something similar to the following
|
||||
output:
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
>>> import ray.dataframe as pd
|
||||
|
||||
Waiting for redis server at 127.0.0.1:14618 to respond...
|
||||
Waiting for redis server at 127.0.0.1:31410 to respond...
|
||||
Starting local scheduler with the following resources: {'CPU': 4, 'GPU': 0}.
|
||||
|
||||
======================================================================
|
||||
View the web UI at http://localhost:8889/notebooks/ray_ui36796.ipynb?token=ac25867d62c4ae87941bc5a0ecd5f517dbf80bd8e9b04218
|
||||
======================================================================
|
||||
|
||||
If you do not see output similar to the above, please make sure that you have
|
||||
built Ray using the instructions on the `installation page`_
|
||||
|
||||
One you have executed ``import ray.dataframe as pd``, you're ready to begin
|
||||
running your Pandas pipeline as you were before. Please note, the API is not
|
||||
yet complete. For some methods, you may see the following:
|
||||
|
||||
.. code-block:: text
|
||||
|
||||
NotImplementedError: To contribute to Pandas on Ray, please visit github.com/ray-project/ray.
|
||||
|
||||
If you would like to request a particular method be implemented, feel free to
|
||||
`open an issue`_. Before you open an issue please make sure that someone else
|
||||
has not already requested that functionality.
|
||||
|
||||
Using Pandas on Ray on a Cluster
|
||||
--------------------------------
|
||||
|
||||
Currently, we do not yet support running Pandas on Ray on a cluster. Coming
|
||||
Soon!
|
||||
|
||||
Examples
|
||||
--------
|
||||
You can find an example on our recent `blog post`_ or on the
|
||||
`Jupyter Notebook`_ that we used to create the blog post.
|
||||
|
||||
.. _`installation page`: http://ray.readthedocs.io/en/latest/installation.html
|
||||
.. _`open an issue`: http://github.com/ray-project/ray/issues
|
||||
.. _`blog post`: http://rise.cs.berkeley.edu/blog/pandas-on-ray
|
||||
.. _`Jupyter Notebook`: http://gist.github.com/devin-petersohn/f424d9fb5579a96507c709a36d487f24#file-pandas_on_ray_blog_post_0-ipynb
|
||||
Reference in New Issue
Block a user