Some professions require the prophetic task of using existing data to predict future outcomes as accurately as possible. This can come in the form of trying to predict the upcoming week’s weather, estimating a person’s risk of developing a disease, or anticipating the best time to sell stock.
Practitioners do this by interpreting “time-series data,” a collection of observations recorded over time. As you can imagine, gathering, inputting, and analyzing data to make these predictions requires a lot of training, because it relies on complex machine-learning algorithms.
A team of researchers from the Massachusetts Institute of Technology (MIT) wanted to change this by making a future-predicting tool that is accessible to non-experts. By combining existing data points with a function that predicts future data points, they created a simple interface called Time Series Predict Database, or tspDB for short.
The tool does all the complicated computing behind the scenes, requiring only simple data input from the user and returning easy-to-read results. When compared with other cutting-edge systems on real-world data, such as financial markets and traffic patterns, tspDB came out on top, proving more efficient and accurate at filling in missing data and predicting future values.
“Even as the time-series data becomes more and more complex, this algorithm can effectively capture any time-series structure out there. It feels like we have found the right lens to look at the model complexity of time-series data,” says senior author Devavrat Shah in an MIT press release.
Their new algorithm, posted on arXiv, builds on a powerful classic algorithm called singular spectrum analysis (SSA), which they adapted to predict values over time. Of course, as the program is trying to predict the future, it isn’t going to be 100 percent accurate. It was therefore important to the team that their variant of SSA also report a confidence interval for each prediction. This interval conveys the margin of error in tspDB’s results, allowing non-experts to make better-informed decisions.
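For readers curious about what SSA does in practice, here is a minimal sketch of the basic, textbook version of the technique: slice the series into overlapping windows, keep only the strongest patterns with a truncated singular value decomposition, and average the result back into a single smoothed series. This is not the team's variant and not tspDB's interface. The function name `ssa_denoise`, the window and rank settings, and the synthetic sine-wave data are all assumptions chosen for illustration, and the band of plus or minus two residual standard deviations is only a crude stand-in for a proper confidence interval.

```python
import numpy as np


def ssa_denoise(series, window, rank):
    """Basic SSA: embed, truncate the SVD, then average anti-diagonals."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # Keep only the `rank` strongest components of the matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: entry (i, j) is an estimate of x[i + j].
    total = np.zeros(n)
    count = np.zeros(n)
    for j in range(k):
        total[j:j + window] += low_rank[:, j]
        count[j:j + window] += 1
    return total / count


# Hypothetical data: a sine wave buried in random noise.
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(0.2 * t)
noisy = clean + rng.normal(scale=0.3, size=t.size)

smooth = ssa_denoise(noisy, window=40, rank=2)

# Crude error band from the residual spread (an assumption, not the
# estimator described in the study).
sigma = np.std(noisy - smooth)
lower, upper = smooth - 2 * sigma, smooth + 2 * sigma
print(f"residual std: {sigma:.3f}")
```

A rank of 2 is used here because a single sine wave occupies exactly two components of the window matrix. Real data would need the rank and window chosen with more care.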
The team’s next goal is to make this simple algorithm even more accessible, gathering feedback on how they can improve user-friendliness and functionality. “Our interest at the highest level is to make tspDB a success in the form of a broadly utilizable, open-source system,” stated Shah.
He continued: “Time-series data are very important, and this is a beautiful concept of actually building prediction functionalities directly into the database. It has never been done before, and so we want to make sure the world uses it.”
Source study: arXiv – On Multivariate Singular Spectrum Analysis and its Variants