Quickstart
==========

This page gives a short introduction on how to get started with
Energy Quantified's Python library.

-----

2-minute guide
^^^^^^^^^^^^^^

First, make sure that ``energyquantified`` is
:doc:`installed and up-to-date ` on your workstation.

**Authenticate**

Import the library, create a client and supply the API key:

   >>> from energyquantified import EnergyQuantified
   >>> eq = EnergyQuantified(api_key='aaaa-bbbb-cccc-dddd')

You can check if your API key is valid by invoking ``is_api_key_valid()``:

   >>> eq.is_api_key_valid()
   True

**Search for curves (data series)**

Energy Quantified allows you to search for curves in two ways. By free-text
search:

   >>> curves = eq.metadata.curves(q='wind power germany actual')
   >>> curves
   [, , , , , ...

Or by filtering on specific terms:

   >>> curves = eq.metadata.curves(area='de', data_type='actual', category=['Wind', 'Power'])
   >>> curves
   [, , ]

**Load data**

When you have found your curve, you can download it. As these curves are of
``curve_type = TIMESERIES``, use the ``eq.timeseries.load()`` function.

For the ``curve`` parameter of ``load()``, you can provide either a Curve
instance or a string with the curve name. The same applies to the dates:
provide a Python date, a datetime, or an ISO-8601-like string
(``YYYY-MM-DD``).

   >>> from datetime import date
   >>> timeseries = eq.timeseries.load(
   ...     'DE Wind Power Production MWh/h 15min Actual',
   ...     begin=date(2020, 1, 1),
   ...     end=date(2020, 2, 1)
   ... )

The result is a Timeseries with all its attributes parsed into Python
objects.

   >>> timeseries.curve
   >>> timeseries.resolution
   >>> timeseries.data[:4]
   [, , , ]

You can also loop over the values in a Timeseries like this:

   >>> for v in timeseries:
   ...     print(v)
   ...

Now it's up to you to decide the next step. You could save the data to your
own database, or perhaps start doing some data analysis.

**Use pandas for data analysis**

(You need to install ``pandas`` separately to do this.)
Convert any time series to a ``pandas.DataFrame`` like so:

   >>> df = timeseries.to_dataframe(name='series')
   >>> df
                              series
   date
   2020-01-01 00:00:00+01:00    6387
   2020-01-01 00:15:00+01:00    6383
   2020-01-01 00:30:00+01:00    6640
   2020-01-01 00:45:00+01:00    6882
   ...                           ...

Mini-guide to pandas and matplotlib
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Before you continue:** You need to install ``pandas`` and ``matplotlib``
to follow this mini-guide.

**Load some data:** First, find the curves and load the data:

   >>> # Find curves
   >>> curve_wind = eq.metadata.curves(q="de wind prod actual")[0]
   >>> curve_solar = eq.metadata.curves(q="de solar photovoltaic prod actual")[0]
   >>> curve_wind, curve_solar
   (, )

   >>> # Load data
   >>> wind = eq.timeseries.load(curve_wind, begin='2020-03-25', end='2020-04-01')
   >>> solar = eq.timeseries.load(curve_solar, begin='2020-03-25', end='2020-04-01')

**Using pandas:** Import ``pandas`` and ``matplotlib``, then convert both the
wind and solar time series to ``pandas.DataFrame`` instances like so:

   >>> import pandas as pd
   >>> import matplotlib.pyplot as plt
   >>> df_solar = solar.to_dataframe(name='de solar')
   >>> df_wind = wind.to_dataframe(name='de wind')
   >>> df_wind
                              de wind
   date
   2020-03-25 00:00:00+01:00    25049
   2020-03-25 00:15:00+01:00    24810
   2020-03-25 00:30:00+01:00    24648
   2020-03-25 00:45:00+01:00    24395
   2020-03-25 01:00:00+01:00    23992
   ...                            ...
   2020-03-31 22:45:00+02:00     9919
   2020-03-31 23:00:00+02:00    10098
   2020-03-31 23:15:00+02:00    10318
   2020-03-31 23:30:00+02:00    10563
   2020-03-31 23:45:00+02:00    10556
   [668 rows x 1 columns]

You can then concatenate these two into one ``DataFrame``. Supplying
``axis=1`` means that you concatenate *columns*, which in this case places
the columns next to each other while keeping the dates as the shared index.
(Using ``axis=0`` concatenates along the index, which in this case holds the
dates. That stacks the rows of one ``DataFrame`` below the other, which is an
unwanted result here.)

   >>> df = pd.concat([df_wind, df_solar], axis=1)
   >>> df
                              de wind  de solar
   date
   2020-03-25 00:00:00+01:00    25049         0
   2020-03-25 00:15:00+01:00    24810         0
   2020-03-25 00:30:00+01:00    24648         0
   2020-03-25 00:45:00+01:00    24395         0
   2020-03-25 01:00:00+01:00    23992         0
   ...                            ...       ...
   2020-03-31 22:45:00+02:00     9919         0
   2020-03-31 23:00:00+02:00    10098         0
   2020-03-31 23:15:00+02:00    10318         0
   2020-03-31 23:30:00+02:00    10563         0
   2020-03-31 23:45:00+02:00    10556         0
   [668 rows x 2 columns]

Below is an example where we plot the data and export it to a file called
``15min_chart.png`` in the current working directory. It uses the original
``DataFrame`` with 15-minute resolution data.

   >>> # Plot and save to file
   >>> df.plot()
   >>> plt.savefig('15min_chart.png')

.. image:: /_static/wind_solar_actual.png

You can also perform aggregations:

   >>> # Use pandas to aggregate to daily mean production
   >>> df2 = df.resample('D').mean()
   >>> df2
                                   de wind     de solar
   date
   2020-03-25 00:00:00+01:00  18112.416667  9349.697917
   2020-03-26 00:00:00+01:00  18977.197917  7868.750000
   2020-03-27 00:00:00+01:00  16811.010417  8655.989583
   2020-03-28 00:00:00+01:00  15940.093750  8773.229167
   2020-03-29 00:00:00+01:00  27446.750000  3451.119565
   2020-03-30 00:00:00+02:00  12960.114583  6863.718750
   2020-03-31 00:00:00+02:00   5968.635417  7462.677083

And you can add the wind and solar production together to create a sum of
renewables. The result will be a ``pandas.Series``, as the
``Freq: D, dtype: float64`` footer at the bottom indicates.

   >>> df2['de wind'] + df2['de solar']
   date
   2020-03-25 00:00:00+01:00    27462.114583
   2020-03-26 00:00:00+01:00    26845.947917
   2020-03-27 00:00:00+01:00    25467.000000
   2020-03-28 00:00:00+01:00    24713.322917
   2020-03-29 00:00:00+01:00    30897.869565
   2020-03-30 00:00:00+02:00    19823.833333
   2020-03-31 00:00:00+02:00    13431.312500
   Freq: D, dtype: float64

**Find out more about pandas and matplotlib:** Look at the `pandas `_ and
`matplotlib `_ documentation for a more in-depth explanation of concepts and
features.
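
**Try the pandas steps without an API key:** The concatenation and
aggregation steps from this mini-guide can be reproduced offline. The
following is a minimal sketch, assuming only that ``pandas`` is installed.
The data is synthetic (two days of made-up 15-minute values in UTC) and is
not loaded from the Energy Quantified API. The names ``side_by_side``,
``stacked``, ``daily`` and ``renewables`` are illustrative only.

```python
import pandas as pd

# Synthetic 15-minute index covering two full days (96 values per day).
# Hypothetical data, not fetched from the Energy Quantified API.
index = pd.date_range("2020-03-25", periods=192, freq="15min", tz="UTC", name="date")

df_wind = pd.DataFrame({"de wind": [float(i) for i in range(192)]}, index=index)
df_solar = pd.DataFrame({"de solar": [float(i % 96) for i in range(192)]}, index=index)

# axis=1 places the columns next to each other on the shared date index.
side_by_side = pd.concat([df_wind, df_solar], axis=1)

# axis=0 stacks the rows instead, leaving NaN where a column has no value.
stacked = pd.concat([df_wind, df_solar], axis=0)

# Aggregate to daily means, then add the two columns into one Series.
daily = side_by_side.resample("D").mean()
renewables = daily["de wind"] + daily["de solar"]

print(side_by_side.shape)  # (192, 2)
print(stacked.shape)       # (384, 2)
print(renewables)
```

With ``axis=0``, the stacked result has twice as many rows and half of its
cells are ``NaN``, which is why the mini-guide uses ``axis=1``.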

Next steps
^^^^^^^^^^

Get familiar with terminology and data types used in the Energy Quantified
API and in the Energy Quantified Python library:

- :doc:`Terminology and data models <../userguide/terminology>`