Azure Data Lake is a scalable data storage and analytics service.
The service is hosted in Azure, Microsoft's public cloud.
Developer(s) | Microsoft |
---|---|
Initial release | November 16, 2016 |
Available in | English |
Type | Data storage and analytics service |
Website | azure |
The Azure Data Lake service was released on November 16, 2016. It is based on COSMOS, which is used to store and process data for applications such as Azure, AdCenter, Bing, MSN, Skype and Windows Live. COSMOS features a SQL-like query engine called SCOPE, upon which U-SQL was built.
Users can store structured, semi-structured or unstructured data produced by sources such as social networks, relational databases, sensors, videos, web apps, and mobile or desktop devices. A single Azure Data Lake Storage account can store trillions of files, and a single file can be larger than a petabyte.
Azure Data Lake Analytics is a parallel on-demand job service. Its parallel processing system is based on Microsoft Dryad, which can represent arbitrary directed acyclic graphs (DAGs) of computation. Data Lake Analytics provides a distributed infrastructure that dynamically allocates and de-allocates resources, so customers pay only for the services they use.
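To illustrate what representing a computation as a DAG means, the following minimal Python sketch runs a small job graph in dependency order and groups its vertices into stages that could execute in parallel. It is not Dryad's API or scheduler; the task names, the `run_dag` and `parallel_stages` helpers, and the sample data are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each vertex is a function, each edge a data dependency.
tasks = {
    "extract": lambda: [3, 1, 2],
    "sort": lambda xs: sorted(xs),
    "total": lambda xs: sum(xs),
    "report": lambda s, t: {"sorted": s, "total": t},
}
# Maps each vertex to the vertices whose outputs it consumes.
deps = {
    "sort": ["extract"],
    "total": ["extract"],
    "report": ["sort", "total"],
}


def run_dag(tasks, deps):
    """Run every task after its dependencies; return all results by name."""
    results = {}
    # static_order() yields vertices so that dependencies always come first.
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[d] for d in deps.get(name, ())]
        results[name] = tasks[name](*inputs)
    return results


def parallel_stages(deps):
    """Group vertices into stages whose members do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        stages.append(ready)
        sorter.done(*ready)
    return stages


print(run_dag(tasks, deps)["report"])
print(parallel_stages(deps))
```

In this sketch, `sort` and `total` land in the same stage because neither depends on the other, which is the property a distributed engine can exploit to run vertices concurrently.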
Azure Data Lake Analytics uses Apache YARN, the part of Apache Hadoop that governs resource management across clusters. Microsoft Azure Data Lake Store supports any application that uses the Hadoop Distributed File System (HDFS) interface.
Using Data Lake Analytics, users can develop and run parallel data transformation and processing programs in U-SQL, a query language that combines SQL with C#. U-SQL was designed as an evolution of the declarative SQL language, with native extensibility through user code written in C#. U-SQL uses C# data types and the C# expression language.
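The idea of a declarative query whose filters and projections are written in a general-purpose expression language can be sketched by analogy in Python. This is not U-SQL and does not reproduce its syntax; Python expressions stand in for the C# expressions, and the `Row` type, the `select` helper, and the sample rows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Row:
    user: str
    query: str
    duration_ms: int


# Hypothetical input rowset.
rows = [
    Row("ann", "Azure Data Lake", 1200),
    Row("bob", "u-sql", 300),
    Row("ann", "dryad", 4500),
]


def select(rows, where, project):
    """SELECT-like step: keep rows matching `where`, then apply `project`."""
    return [project(r) for r in rows if where(r)]


# The query shape is declarative; the expressions are ordinary host-language code.
slow = select(
    rows,
    where=lambda r: r.duration_ms > 1000,
    project=lambda r: (r.user.upper(), r.duration_ms / 1000),
)
print(slow)
```

The string method call and the arithmetic inside the lambdas play the role that C# expressions and data types play inside a U-SQL statement.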
In 2021, Microsoft announced that the original Azure Data Lake Storage, now termed "Gen1", would be retired in 2024. The related Azure Data Lake Analytics and U-SQL technologies are also being retired. Azure Data Lake Storage Gen2, an extension of Azure Storage, will continue. The suggested replacement technologies are Azure Synapse Analytics and Apache Spark.
This article uses material from the Wikipedia English article Azure Data Lake, which is released under the Creative Commons Attribution-ShareAlike 3.0 license ("CC BY-SA 3.0"); additional terms may apply (view authors). Content is available under CC BY-SA 4.0 unless otherwise noted. Images, videos and audio are available under their respective licenses.