|
Identifier |
000446856 |
Title |
Automated feature engineering on relational data |
Alternative Title |
Αυτοματοποιημένη κατασκευή χαρακτηριστικών σε σχεσιακά δεδομένα |
Author
|
Καλουρής, Δημήτριος Α.
|
Thesis advisor
|
Τσαμαρδινός, Ιωάννης
|
Reviewer
|
Χριστοφίδης, Βασίλης
Κομοντάκης, Νικόλαος
|
Abstract |
Machine learning typically learns from a single table. However, in the age of
big data, data are often distributed across many different tables
in a relational database for efficiency. To work with relational data, scientists
often perform feature engineering manually and intuitively. Many algorithms
that produce a single table from a relational database have been
proposed for this problem, but they either do not take complex relational
schemas into account or are limited in the paths they follow and the combinations
of joins and aggregations they perform during feature generation. Moreover, these
algorithms accumulate a large number of features during feature generation before
performing feature selection, and their feature selection algorithms are not optimized.
To this end, we created SRFGA, a novel online feature engineering algorithm that
performs joins and aggregations on the tables to create features and keeps only the
most useful features, using the residuals calculated by a model to guide the feature
selection. This algorithm can be used without any expert knowledge, and it also
unifies all previous works in terms of visited paths and actions performed.
|
Language |
English |
Subject |
Artificial intelligence |
|
Feature construction |
|
Relational databases |
|
Σχεσιακές βάσεις |
|
Τεχνητή νοημοσύνη |
Issue date |
2022-03-18 |
Collection
|
School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
|
|
Type of Work--Post-graduate theses
|