AWS launches AWS Lake Formation service

August 09, 2019 06:35 AM

Amazon Web Services (AWS) has announced the availability of AWS Lake Formation, a fully managed service that enables customers to build, secure and manage data lakes.

The solution not only simplifies but automates many of the complex manual steps required to create a data lake, including collecting, cleaning and cataloguing data, and securely making the data available to others.

“Our customers tell us that Amazon S3 is the ideal place to house their data lakes, which is why AWS hosts more data lakes than anyone else – with tens of thousands and growing every day. They’ve also told us that they want it to be easier and faster to set up and manage their data lakes,” said Raju Gulabani, vice president of databases, analytics and machine learning, AWS. “That’s why we built AWS Lake Formation, so customers can spend more time learning from their data and innovating, rather than wrestling that data into functioning data lakes. AWS Lake Formation is available today and we’re excited to see how customers use it as one of the building blocks for growing and transforming their businesses and customer experiences.”

Users can bring their data into a data lake from a variety of sources using pre-defined templates, automatically classify and prepare the data, and centrally define granular data access policies to oversee access by the different groups within an organisation.

“We wanted to create a data platform with the ability to manage the security settings for all the different applications in our environment. With AWS Lake Formation, we can now define policies once and enforce them in the same way, everywhere, for multiple services we use, including AWS Glue and Amazon Athena,” said Anand Desikan, director of cloud and data services, Panasonic Avionics. “The enhanced level of control gives us secure access to data and meta-data for columns and tables, not just for bulk objects, which is an important part of our data security and governance standard.”
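As an illustration of the column-level control described above, the following minimal Python sketch builds a request that grants read access to selected columns of one table. It assumes the `grant_permissions` operation of the Lake Formation client in the boto3 SDK; the role ARN, database, table and column names are hypothetical placeholders, not values from the announcement, and the actual call needs boto3 and valid AWS credentials.

```python
def column_grant_request(principal_arn, database, table, columns):
    """Build keyword arguments for a column-level SELECT grant."""
    if not columns:
        raise ValueError("at least one column is required")
    return {
        # Who receives the permission (an IAM user or role ARN).
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        # What is being shared: named columns of a single catalogue table.
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        # Read-only access to the listed columns.
        "Permissions": ["SELECT"],
    }


def apply_grant(request):
    """Send the grant to Lake Formation (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the sketch loads without the SDK

    client = boto3.client("lakeformation")
    return client.grant_permissions(**request)


# Hypothetical example values, for illustration only.
request = column_grant_request(
    "arn:aws:iam::123456789012:role/AnalystRole",
    "sales_db",
    "orders",
    ["order_id", "order_date"],
)
```

Passing `request` to `apply_grant` would submit the grant. Keeping the payload in a separate function makes it possible to review or log a policy before it is applied.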

Customers can then analyse this data using their choice of AWS analytics and machine learning services, including Amazon Redshift, Amazon Athena and AWS Glue. Support for Amazon EMR, Amazon QuickSight and Amazon SageMaker is due to follow in the coming months.

“I focus on helping clients in their ‘Data on Cloud’ journey. Specific to that, we have seen that organisations are dealing with a lack of trusted data when they need to perform analytics on data coming from multiple sources,” added Namrata Maheshwary, senior architect for the data business group, Accenture. “Data cleansing is a critical step in data analytics and can greatly impact the business outcome and decision making. The new features in AWS Lake Formation have been hugely beneficial to address the challenge of data veracity and securing access to the data lake. We found it tremendously useful to make use of the advanced machine learning techniques for data preparation to find matching records, clean, and de-duplicate data from different data sources. This will help reduce the time, effort, and cost, while improving the quality and accuracy of the data in a customer’s data lakes.”