# As a pip package

You can use RumbleDB from within Python programs by running:

```bash
pip install jsoniq
```
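To confirm that the installation succeeded, you can query the package metadata from Python. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of the jsoniq package.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # pyspark is listed because the jsoniq package depends on it.
    for name in ("jsoniq", "pyspark"):
        print(name, "->", installed_version(name) or "not installed")
```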

## Java version

*Important note*: since the jsoniq package depends on pyspark 4, Java 17 or Java 21 is required. If another version of Java is installed, a Python program that attempts to create a RumbleSession will fail with an error message on stderr explaining the problem.

You can check your Java version with:

```bash
java -version
```
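The same check can be done from Python before creating a session. The sketch below is illustrative only: the helper names `parse_java_major` and `detect_java_major` are hypothetical, and the parsing assumes the usual `version "..."` line that `java -version` prints (normally on stderr).

```python
import re
import subprocess
from typing import Optional

# Java versions required by the jsoniq package, as stated above.
SUPPORTED_JAVA = {17, 21}


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: "1.8.0_392" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def detect_java_major() -> Optional[int]:
    """Run `java -version` and return the major version, or None if unavailable."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        return None  # no java executable on PATH
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = detect_java_major()
    if major in SUPPORTED_JAVA:
        print(f"Java {major} detected: OK")
    else:
        print(f"Java {major} detected: Java 17 or 21 is required")
```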

Information about how this package is used can be found [in this section](/writing-jsoniq-queries-in-python.md).

## Common issue: colliding Spark version

Users who have already configured a Spark installation on their machine may encounter a version conflict if SPARK\_HOME points to that installation and it is a different version of Spark (e.g., 3.5 or 3.4). The jsoniq package requires Spark 4.0.

If this happens, RumbleDB should output an informative error message. There are two ways to fix such conflicts:

* The easiest is to remove the SPARK\_HOME environment variable completely. RumbleDB will then fall back to the Spark 4.0 installation that ships with its pyspark dependency.
* Alternatively, change the value of SPARK\_HOME to point to a Spark 4.0 installation, if you have one. This option is intended for more advanced users who know what they are doing.
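The first option can also be applied for a single Python process instead of changing your shell configuration. This is a sketch under assumptions: the helper `unset_spark_home` is hypothetical, and it assumes the variable is only read once the session is created, so the call should come before importing or using the jsoniq package.

```python
import os
from typing import MutableMapping, Optional


def unset_spark_home(env: MutableMapping[str, str] = os.environ) -> Optional[str]:
    """Remove SPARK_HOME from the given environment and return its old value.

    Only the current process (and its children) is affected; the shell that
    launched Python keeps its own value.
    """
    return env.pop("SPARK_HOME", None)


if __name__ == "__main__":
    previous = unset_spark_home()
    if previous is None:
        print("SPARK_HOME was not set")
    else:
        print(f"Removed SPARK_HOME (was {previous})")
```

Returning the previous value makes it easy to log what was overridden or to restore it later.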

If you have another working Spark installation on your machine, you can see which version it is with:

```bash
spark-submit --version
```

This command is expected to fail for first-time users who installed only the jsoniq package and never installed Spark separately on their machine.
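A script can handle both cases by first checking whether `spark-submit` is on the PATH. The helper `spark_submit_version` below is a hypothetical illustration using only the standard library; it captures both output streams because the version banner may be written to either one.

```python
import shutil
import subprocess
from typing import Optional


def spark_submit_version(executable: str = "spark-submit") -> Optional[str]:
    """Return the text printed by `spark-submit --version`, or None if not found."""
    path = shutil.which(executable)
    if path is None:
        return None  # no standalone Spark installation on PATH
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    return (result.stdout + result.stderr).strip()


if __name__ == "__main__":
    banner = spark_submit_version()
    if banner is None:
        print("No separate Spark installation found on PATH")
    else:
        print(banner)
```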

