DATA LAKE
its functionalities and applications
DOI:
https://doi.org/10.31510/infa.v21i1.1960
Keywords:
Data lake, Database, Raw data
Abstract
In an age of insatiable demand for data, the Data Lake has emerged as a robust and innovative repository for storing and analyzing information. Drawing on the pioneering work of authors such as James Dixon, founder of Pentaho, who wrote about the concept on his blog in 2010, and Thomas H. Davenport, a renowned data analytics expert, the Data Lake stands out as a disruptive approach in the data management landscape. This article explores the concept, examining the flexible and scalable architecture proposed by Dixon and the main traditional approaches to preserving the integrity of raw data, regardless of its source or format, in a single place. Literature on the subject is still scarce because the topic is new. In addressing the Data Lake, the article covers not only its structure but also the implications that this raw data storage environment can have for scientific research, and it explains what a Data Lake is in order to contribute to the understanding of the concept.
References
Amazon Web Services. Estudo de Caso: Coca-Cola. Available at: https://aws.amazon.com/pt/solutions/case-studies/innovators/coca-cola/. Accessed: 27 Feb. 2024.
Amazon Web Services. Estudo de Caso: Coca-Cola Andina. Available at: https://aws.amazon.com/pt/solutions/case-studies/coca-cola-andina-case-study/. Accessed: 27 Feb. 2024.
Amazon Web Services. Data Lakes and Analytics: Data Lakes. Available at: https://aws.amazon.com/pt/big-data/datalakes-and-analytics/datalakes/. Accessed: 12 Mar. 2024.
Amazon Web Services. (n.d.). AWS CloudTrail: Guia do usuário. Retrieved from https://docs.aws.amazon.com/pt_br/aescloudtrail/latest/userguide/cloudtrail-user-guide.html. Accessed: 12 Mar. 2024.
Cutting, D., & Cafarella, M. (2015). Data Lakes: The Definitive Guide. Data Lake Management: Challenges and Opportunities. Available at: http://www.vldb.org/pvldb/vol12/p1986-nargesian.pdf. DOI: https://doi.org/10.14778/3352063.3352116
Dixon, J. (2010). Pentaho, Hadoop, and Data Lakes. Available at: https://jamesdixon.wordpress.com/2010/10/14/pentaho-hadoop-and-data-lakes.
Fang, H. (2015). Managing Data Lakes in Big Data Era: What's a data lake and why has it become popular in data management ecosystem. In The 5th Annual IEEE International Conference on Cyber Technology in Automation, Control and Intelligent Systems, June 8-12, 2015, Shenyang, China. DOI: https://doi.org/10.1109/CYBER.2015.7288049
GIL, Antônio Carlos. Como elaborar projetos de pesquisa. 1991. Atlas.
Inmon, B., & Linstedt, D. Data Lake Architecture: Designing the Data Lake and Avoiding the Garbage Dump. 2017. Technics Publications.
IPSense. Estudo de Caso: AWS Neighborly Data Lake. Available at: https://www.ipsense.com.br/estudo-de-caso-aws-neighborly-data-lake/. Accessed: 27 Feb. 2024.
Khine, P.P. Data lake: a new ideology in big data era. Available at: https://doi.org/10.1051/itmconf/20181703025. Accessed: 12 Mar. 2024. DOI: https://doi.org/10.1051/itmconf/20181703025
Medium. Como Criamos Nosso Data Lake Utilizando a AWS. Available at: https://medium.com/building-soulkey/como-criamos-nosso-data-lake-utilizando-a-aws-e8cd96618929. Accessed: 12 Mar. 2024.
Miloslavskaya, N., & Tolstoy, A. Application of Big Data, Fast Data and Data Lake Concepts to Information Security Issues. In 2016 4th International Conference on Future Internet of Things and Cloud Workshops. DOI: https://doi.org/10.1109/W-FiCloud.2016.41
Serra, J., & Anton, B. (2018). "Data Lake Architecture." Available at: https://www.itm-conferences.org/articles/itmconf/pdf/2018/02/itmconf_wcsn2018_03025.pdf.
Singh, A. (2019). Architecture of Data Lake. International Journal of Scientific Research in Computer Science, Engineering and Information Technology (IJSRCSEIT), 5(2), 411-414. Available at: https://doi.org/10.32628/CSEIT1952121. Accessed: 27 Feb. 2024. Journal URL: http://ijsrcseit.com/CSEIT1952121.
Singh, A. & Ahmad, S. Architecture of Data Lake. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 2019, vol. 5. Available at: https://doi.org/10.32628/CSEIT1952121. Accessed: 12 Mar. 2024. DOI: https://doi.org/10.32628/CSEIT1952121
Wider, P. & Nolte, H. Toward data lakes as central building blocks for data management and analysis. Available at: https://www.frontiersin.org/articles/. Accessed: 12 Mar. 2024.
License
Copyright (c) 2025 Revista Interface Tecnológica

This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright of the published articles belongs to the journal Interface Tecnológica and follows the Creative Commons (CC BY 4.0) standard, which permits remixing, adapting, and creating works derived from the original, even for commercial purposes. New works must credit the author(s).