Welcome to this course on forecasting using ARIMA models in Python. My name is James Fulton and I will be your guide as you learn how to predict the future of time series.
Time series data is everywhere in this world. It is used in a wide variety of fields. There are many datasets for which we would like to be able to predict the future. Knowing the future of obesity rates could help us intervene now for public health; predicting consumer energy demands could help power stations run more efficiently; and predicting how the population of a city will change could help us build the infrastructure we will need.
We can forecast all of these datasets using time series models, and ARIMA models are one of the go-to time series tools.
You will learn how to fit these models and how to optimize them.
You will learn how to make forecasts of important real-world data, and importantly how to find the limits of your forecasts.
Let's start by examining a time series. We can load a time series from a CSV file using pandas. Here we set the date column as the index and parse the dates into the datetime data type.
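As a rough sketch of that loading step, something like the following could work. The column names and values here are made up for illustration, and an in-memory string stands in for a real CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV contents; in practice you would pass a file path instead.
csv_text = """date,value
2019-01-01,10.0
2019-01-02,10.5
2019-01-03,11.2
2019-01-04,10.9
"""

# index_col makes the date column the index;
# parse_dates=True converts that index to datetimes.
df = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)

print(df.index.dtype)
print(df.head())
```

With a datetime index in place, pandas can slice and plot the data by date.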
To plot the data, we make a pyplot figure and use the DataFrame's .plot method.
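A minimal sketch of the plotting step might look like this. The small DataFrame is invented for the example, and the non-interactive Agg backend is only chosen so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: no display available; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for a loaded time series.
df = pd.DataFrame(
    {"value": [10.0, 10.5, 11.2, 10.9, 11.8]},
    index=pd.date_range("2019-01-01", periods=5, freq="D"),
)

fig, ax = plt.subplots()  # create the figure and its axes
df.plot(ax=ax)            # draw the DataFrame onto those axes
ax.set_xlabel("date")
# In an interactive session, plt.show() would display the figure.
```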
One important feature of a time series is its trend. A positive trend is a line that generally slopes up - the values increase with time. Similarly, a negative trend is where the values decrease.
Another important feature is seasonality. A seasonal time series has patterns that repeat at regular intervals, for example high sales every weekend. In contrast, cyclicality is where there is a repeating pattern but no fixed period.
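To make trend and seasonality concrete, here is a small synthetic example. All the numbers are invented: a made-up daily sales series with a steady upward trend and a fixed bump every weekend.

```python
import numpy as np
import pandas as pd

# Four weeks of daily dates, starting on a Monday.
dates = pd.date_range("2021-01-04", periods=28, freq="D")
t = np.arange(len(dates))

trend = 0.5 * t                                            # positive trend: values rise with time
weekend_bump = np.where(dates.dayofweek >= 5, 20.0, 0.0)   # seasonal pattern with a 7-day period

sales = pd.Series(100.0 + trend + weekend_bump, index=dates)

# Average sales per day of week (0 = Monday, 6 = Sunday).
by_day = sales.groupby(sales.index.dayofweek).mean()
print(by_day)
```

The weekend averages sit well above the weekday ones, and the series as a whole drifts upward.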
White noise is an important concept in time series and ARIMA models. White noise is a series of measurements, where each value is uncorrelated with previous values.
You can think of this like flipping a coin: the outcome of a coin flip doesn't depend on the outcomes of the flips that came before. Similarly, with white noise, the series value doesn't depend on the values that came before.
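One way to see this for yourself is to generate some white noise and check how strongly each value is correlated with the one before it. This is only a sketch: the normal distribution, the seed and the sample size are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Independent draws: each value is generated without reference to earlier ones.
noise = rng.normal(loc=0.0, scale=1.0, size=5000)

# Correlation between each value and the value one step before it.
lag1_corr = np.corrcoef(noise[:-1], noise[1:])[0, 1]
print(lag1_corr)
```

For white noise, this lag-1 correlation should come out close to zero.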
To model a time series, we need it to be stationary. Stationary means that the distribution of the data doesn't change with time. For a time series to be stationary it must fulfill three criteria. These are:
The series has zero trend: it isn't growing or shrinking.
The variance is constant: the average distance of the data points from the zero line isn't changing.
The autocorrelation is constant: how each value in the time series is related to its neighbors stays the same.
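As an informal illustration of the first two criteria, we can compare the mean and variance of the first and second halves of a series. This is a crude check on synthetic data, not a formal stationarity test, and the helper name half_summary is made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

stationary = rng.normal(0.0, 1.0, size=n)     # white noise: no trend, constant variance
trending = stationary + 0.01 * np.arange(n)   # same noise with an upward trend added

def half_summary(x):
    """Return (mean, variance) for the first and second half of x."""
    mid = len(x) // 2
    first, second = x[:mid], x[mid:]
    return (first.mean(), first.var()), (second.mean(), second.var())

(s_mean1, s_var1), (s_mean2, s_var2) = half_summary(stationary)
(t_mean1, t_var1), (t_mean2, t_var2) = half_summary(trending)

print("stationary means:", s_mean1, s_mean2)
print("trending means:  ", t_mean1, t_mean2)
```

The two halves of the stationary series look alike, while the mean of the trending series clearly shifts between halves.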
Generally, in machine learning, you have a training set which you fit your model on, and a test set, which you will test your predictions against. Time series forecasting is just the same.
Our train-test split will be different, however. We use the past values to make future predictions, and so we will need to split the data in time. We train on the data earlier in the time series and test on the data that comes later.
We can split a time series at a given date using the DataFrame's .loc accessor.
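Here is a minimal sketch of such a split. The monthly data and the cutoff date are invented for the example.

```python
import pandas as pd

# Hypothetical monthly series covering three years.
df = pd.DataFrame(
    {"value": range(36)},
    index=pd.date_range("2017-01-01", periods=36, freq="MS"),
)

# Date-string slices with .loc include both endpoints,
# so the two slices use different boundary dates to avoid overlap.
train = df.loc[:"2018-12-31"]   # everything up to the end of 2018
test = df.loc["2019-01-01":]    # everything from the start of 2019

print(len(train), len(test))
```

Every row in the training set comes before every row in the test set, so the model never sees the future it is asked to predict.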
We've learned the basics of stationarity and train-test splitting. Let's get used to these in practice.