May 20, 2020

Can Machine Learning Work with Cybersecurity? Introducing Amazon Macie

Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to discover and protect your sensitive data in AWS.

Amazon Macie is a not yet widely known Amazon product that is starting to gain attention among developers. What it offers for data privacy and protection is exciting: by combining cybersecurity with machine learning, it secures and protects your hosted data on another level.

The service is offered to AWS users and adds a second layer of security to the data they store. Macie discovers which data is the most sensitive and helps protect it automatically.

The information in your account is stored in S3 buckets (similar to file folders). Some information may be kept private and some may be shared with others; some buckets are encrypted and some are not. Macie assesses what type of data it is analyzing, so that each file gets the level of security it needs.

What does Macie do with the data?

Macie generates alerts and notifications for the most sensitive data by using algorithmic matching and pattern recognition, both machine learning techniques. Macie's output, its findings about protected and sensitive data, can then be used through other AWS services, for example the AWS Management Console or Amazon CloudWatch. This can help you meet regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR).

When it comes to the most sensitive data, Macie bases its search and protection on personally identifiable information (PII) such as names, addresses, and credit card numbers. Macie is also configurable, so you can protect custom data types: there may be data outside PII that you or your business consider sensitive. Macie is not limited to its predefined configuration.
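To make the idea of pattern matching a bit more concrete, here is a minimal Python sketch. It is not how Macie is implemented; it only illustrates the general technique of combining a regular expression with a validity check (here the Luhn checksum for card numbers), plus one custom pattern. The employee ID format "EMP-" followed by six digits and the function names are made up for this illustration.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")

# Hypothetical custom identifier: an internal employee ID such as "EMP-004217".
EMPLOYEE_ID_PATTERN = re.compile(r"\bEMP-\d{6}\b")


def passes_luhn(digits):
    """Luhn checksum, used to discard digit runs that cannot be card numbers."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_sensitive(text):
    """Return (label, matched_text) pairs for every hit found in the text."""
    findings = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if passes_luhn(digits):
            findings.append(("credit_card", match.group()))
    for match in EMPLOYEE_ID_PATTERN.finditer(text):
        findings.append(("employee_id", match.group()))
    return findings
```

The checksum step matters because a plain digit pattern would flag order numbers and phone numbers too; a custom pattern such as the employee ID plays the role of the business-specific data mentioned above.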

What is the process?

The process is simple. First, you enable Macie with one click in the AWS Management Console or with a single API call. Macie then automatically builds an inventory of your S3 buckets. Next, it starts discovering the sensitive data, and finally it generates findings and sends them to Amazon CloudWatch so they can be integrated into the relevant workflows.

Macie can also be used as a tool for temporarily hosted data. That is, your data does not necessarily have to live in S3: you can move it to S3 to use Macie and then migrate it back to its origin. Findings recently discovered by Macie are stored for 30 days in the AWS Management Console. For long-term retention, the information is written to a customer-owned S3 bucket.

Compliance teams are often legally required to run checks on sensitive information, for example on where it resides. Macie can automate this with one-time, daily, weekly, or monthly scans of your data. The resulting data, including job outputs, findings, evaluation results, timestamps, and bucket information, can be kept for the long term. Companies need to know that their sensitive information is not only safe but also analyzed. The data may be stored for internal purposes or for the company's audits.
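As a rough illustration of how such a scan might be scheduled through the API, the Python sketch below only builds the request; it does not call AWS. The field names follow my understanding of the Macie CreateClassificationJob request as exposed by boto3's "macie2" client and should be checked against the current API reference. The helper name, the fixed weekday and day of month, the account ID, and the bucket name are placeholders chosen for this example.

```python
import uuid

VALID_FREQUENCIES = {"one-time", "daily", "weekly", "monthly"}


def build_job_request(name, account_id, buckets, frequency="one-time"):
    """Build keyword arguments for a sensitive data discovery job (sketch only)."""
    if frequency not in VALID_FREQUENCIES:
        raise ValueError(f"unsupported frequency: {frequency}")

    request = {
        "clientToken": str(uuid.uuid4()),  # idempotency token for the request
        "name": name,
        "jobType": "ONE_TIME" if frequency == "one-time" else "SCHEDULED",
        "s3JobDefinition": {
            "bucketDefinitions": [
                {"accountId": account_id, "buckets": list(buckets)}
            ]
        },
    }

    # Recurring jobs carry a schedule; the weekday and day of month are
    # arbitrary defaults picked for this sketch.
    if frequency == "daily":
        request["scheduleFrequency"] = {"dailySchedule": {}}
    elif frequency == "weekly":
        request["scheduleFrequency"] = {"weeklySchedule": {"dayOfWeek": "MONDAY"}}
    elif frequency == "monthly":
        request["scheduleFrequency"] = {"monthlySchedule": {"dayOfMonth": 1}}
    return request


# Assumed usage with boto3 (requires AWS credentials and Macie enabled):
#   import boto3
#   boto3.client("macie2").create_classification_job(**request)
request = build_job_request(
    "weekly-pii-scan", "111122223333", ["example-bucket"], "weekly"
)
```

Keeping the request construction separate from the AWS call makes it easy to review what a compliance scan will cover before anything is actually scheduled.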

Macie currently works only with data in AWS, though a version for external data storage systems may follow. For now, you will have to migrate your data to an S3 environment. You can bring in files extracted from applications, file-sharing platforms, collaboration tools, and email.

Let's hope Amazon extends the service to external data too, since it could help analyze sensitive information at a much more influential level. Global data is more important than a company's own hosting. If Amazon does not extend the service to other platforms, it is basically promoting S3 through a differentiating feature. Don't get me wrong, it's an advanced piece of development and very useful for companies. But all the data has to be stored in S3.

The multi-account feature allows a single administrator to manage multiple member accounts, like a master account and its sub-accounts. It is meant to bring order and clear authority to data in a company. It also helps with ticketing systems and with syncing multiple accounts that use Macie into one output.

Is it expensive?

The price of Amazon Macie is based on two factors: the number of Amazon S3 buckets in the account (per month) and the amount of data processed (per month). The customer is charged only for the bytes processed for sensitive data discovery.

It is not common to cut the price of a product by 80%, but that is exactly what Amazon just did with one of its Amazon Web Services (AWS) products. Amazon Macie, a not yet widely known product that is looking to become more accessible to developers, is now at its lowest cost since launch.

Amazon offers a generous 30-day free trial, which includes S3 bucket and security assessments. If Macie is used on a single account, the cost is direct; in a multi-account configuration, the cost is estimated based on the sum of all accounts enabled and running. It is highly recommended to start with the 30-day free trial to get accurate information on costs and usefulness.

The cost is $0.10 per S3 bucket per month. The data processed for sensitive data discovery is priced as follows: the first GB per month is free, the next 50,000 GB are priced at $1.00 per GB, the next 450,000 GB at $0.50 per GB, and anything over 500,000 GB at $0.25 per GB.
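To see what these tiers mean for a real bill, here is a small Python sketch that applies the figures quoted above. It is only an estimate: treating the tiers as consecutive bands (1 GB free, then 50,000 GB, then 450,000 GB, then everything beyond) is my reading of the price list, and the function name is made up for this example. Check the current pricing page before relying on the numbers.

```python
from decimal import Decimal

# (band width in GB, price per GB in USD); None marks the open-ended last tier.
# Figures are the ones quoted in this article, not fetched from AWS.
TIERS = [
    (Decimal("1"), Decimal("0")),
    (Decimal("50000"), Decimal("1.00")),
    (Decimal("450000"), Decimal("0.50")),
    (None, Decimal("0.25")),
]
BUCKET_PRICE = Decimal("0.10")  # per S3 bucket per month


def estimate_monthly_cost(bucket_count, gb_processed):
    """Return the estimated monthly charge in USD as a Decimal."""
    remaining = Decimal(str(gb_processed))
    cost = BUCKET_PRICE * bucket_count
    for width, price in TIERS:
        if remaining <= 0:
            break
        # Charge only the part of the data that falls inside this band.
        band = remaining if width is None else min(remaining, width)
        cost += band * price
        remaining -= band
    return cost
```

For example, an account with 10 buckets that processes 101 GB in a month would pay $1 for the buckets plus $100 for the 100 GB beyond the free gigabyte, about $101 in total. For a multi-account setup, the same function can be summed over each member account.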

To conclude, this is a very interesting proposal from Amazon, allowing everyday customers to analyze their data and protect it with machine learning and pattern recognition. It is currently available only for S3-hosted data, but it is expected to be offered for other platforms too. The price has been reduced by 80%, and it is definitely a useful tool for companies with large amounts of data and sensitive information.

