StockDataSource

Bases: DataSource

A data source for reading stock data using the Alpha Vantage API.

Examples:

Load the daily stock data for SPY:

>>> df = spark.read.format("stock").option("api_key", "your-key").load("SPY")
>>> df.show(n=5)
+----------+------+------+------+------+--------+------+
|      date|  open|  high|   low| close|  volume|symbol|
+----------+------+------+------+------+--------+------+
|2024-06-04|526.46|529.15|524.96|528.39|33898396|   SPY|
|2024-06-03|529.02|529.31| 522.6| 527.8|46835702|   SPY|
|2024-05-31|523.59| 527.5|518.36|527.37|90785755|   SPY|
|2024-05-30|524.52| 525.2|521.33|522.61|46468510|   SPY|
|2024-05-29|525.68|527.31|525.37| 526.1|45190323|   SPY|
+----------+------+------+------+------+--------+------+
Source code in pyspark_datasources/stock.py
from pyspark.sql.datasource import DataSource


class StockDataSource(DataSource):
    """
    A data source for reading stock data using the Alpha Vantage API.

    Examples
    --------

    Load the daily stock data for SPY:

    >>> df = spark.read.format("stock").option("api_key", "your-key").load("SPY")
    >>> df.show(n=5)
    +----------+------+------+------+------+--------+------+
    |      date|  open|  high|   low| close|  volume|symbol|
    +----------+------+------+------+------+--------+------+
    |2024-06-04|526.46|529.15|524.96|528.39|33898396|   SPY|
    |2024-06-03|529.02|529.31| 522.6| 527.8|46835702|   SPY|
    |2024-05-31|523.59| 527.5|518.36|527.37|90785755|   SPY|
    |2024-05-30|524.52| 525.2|521.33|522.61|46468510|   SPY|
    |2024-05-29|525.68|527.31|525.37| 526.1|45190323|   SPY|
    +----------+------+------+------+------+--------+------+
    """
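    @classmethod
    def name(cls) -> str:
        # Short name passed to spark.read.format(...).
        return "stock"

    def schema(self) -> str:
        # DDL-formatted schema; one row per trading day for a symbol.
        return (
            "date string, open double, high double, "
            "low double, close double, volume long, symbol string"
        )

    def reader(self, schema):
        # The reader receives the schema and the options set via .option(...).
        return StockDataReader(schema, self.options)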
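
`StockDataReader` is not shown on this page. As a rough illustration of the work such a reader has to do, the sketch below turns a daily time-series JSON payload into tuples in the column order of the schema above. The function name `parse_daily_series` is hypothetical, and the JSON keys (`"Time Series (Daily)"`, `"1. open"`, and so on) are an assumption about the Alpha Vantage daily response layout, not something taken from this module.

```python
import json
from typing import Iterator, Tuple

Row = Tuple[str, float, float, float, float, int, str]


def parse_daily_series(payload: str, symbol: str) -> Iterator[Row]:
    """Yield (date, open, high, low, close, volume, symbol) tuples.

    Hypothetical helper: the key names below are assumed, not confirmed
    by the StockDataSource source code.
    """
    data = json.loads(payload)
    # A missing series key yields no rows instead of raising.
    series = data.get("Time Series (Daily)", {})
    for date, bar in series.items():
        yield (
            date,
            float(bar["1. open"]),
            float(bar["2. high"]),
            float(bar["3. low"]),
            float(bar["4. close"]),
            int(bar["5. volume"]),
            symbol,
        )


# Sample payload built from the first two rows of the example output above.
sample = json.dumps({
    "Time Series (Daily)": {
        "2024-06-04": {
            "1. open": "526.46", "2. high": "529.15", "3. low": "524.96",
            "4. close": "528.39", "5. volume": "33898396",
        },
        "2024-06-03": {
            "1. open": "529.02", "2. high": "529.31", "3. low": "522.6",
            "4. close": "527.8", "5. volume": "46835702",
        },
    }
})

rows = list(parse_daily_series(sample, "SPY"))
for row in rows:
    print(row)
```

Yielding plain tuples in schema order keeps the parsing step independent of Spark, so it can be checked without a running session.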