How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science

A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years

Abstract

Over the last 35 years, data management principles such as physical and logical independence, declarative querying, and cost-based optimization have made relational databases pervasive in organizations of every kind. More importantly, these technical advances enabled the first generation of business intelligence applications and laid the foundation for managing and analyzing Big Data today.

Corresponding author

Correspondence to F. Giannotti.

Copyright information

© 2018 Springer International Publishing AG

About this chapter

Cite this chapter

Amato, G. et al. (2018). How Data Mining and Machine Learning Evolved from Relational Data Base to Data Science. In: Flesca, S., Greco, S., Masciari, E., Saccà, D. (eds) A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years. Studies in Big Data, vol 31. Springer, Cham. https://doi.org/10.1007/978-3-319-61893-7_17

  • DOI: https://doi.org/10.1007/978-3-319-61893-7_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-61892-0

  • Online ISBN: 978-3-319-61893-7

  • eBook Packages: Engineering, Engineering (R0)
