https://www.techtarget.com/searchaws/tip/Learn-the-basics-of-Amazon-EFS-to-find-the-right-storage-fit
Amazon S3 remains the most popular storage option from AWS, but it has some limitations. Enterprises -- and, more importantly, legacy applications -- can't interact with it like a standard network file system. Instead, they have to use REST APIs to access individual objects.
AWS addresses this issue through its Elastic File System (EFS), a service that provides standards-based file shares on the cloud. But despite the benefits of Amazon EFS, some users struggle to determine if it's the best AWS storage option for them. And, if they do choose EFS, configuring the service can also pose a challenge.
EFS is a managed network-attached storage filer for EC2 instances based on network file system (NFS) version 4. Unlike DIY NFS implementations that might use one or more EC2 instances with Elastic Block Store (EBS) volumes as an NFS server, EFS is distributed across servers that span several availability zones (AZs). This eliminates I/O bottlenecks to improve performance.
This distributed design also means Amazon EFS is highly available, reliable and scalable -- up to petabytes -- with I/O throughput that increases as the file system grows. EFS volumes deliver consistent performance of 50 MBps per TB of storage; however, file systems of 1 TB or less can reach 100 MBps in short bursts. Burst performance for file systems larger than 1 TB scales linearly at 100 MBps per TB.
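The relationship between file system size and throughput can be sketched in a few lines. This is a minimal illustration of the figures quoted above, not an AWS tool; it assumes the rates are megabytes per second (AWS documents them in MiB/s per TiB), and the function and constant names are hypothetical.

```python
# Sketch of the throughput figures quoted above (sizes in TB, rates in MBps).
# Names are illustrative; the constants come from the article's numbers.
BASELINE_MBPS_PER_TB = 50   # sustained rate per TB stored
BURST_MBPS_PER_TB = 100     # burst rate per TB for file systems above 1 TB
MIN_BURST_MBPS = 100        # burst rate for file systems of 1 TB or less


def baseline_mbps(size_tb: float) -> float:
    """Sustained throughput, which grows linearly with stored data."""
    return BASELINE_MBPS_PER_TB * size_tb


def burst_mbps(size_tb: float) -> float:
    """Burst throughput: 100 MBps up to 1 TB, then 100 MBps per TB."""
    return max(MIN_BURST_MBPS, BURST_MBPS_PER_TB * size_tb)


for size in (0.1, 1, 5):
    print(f"{size} TB: baseline {baseline_mbps(size):g} MBps, "
          f"burst {burst_mbps(size):g} MBps")
```

A small file system therefore has a low sustained rate but the same 100 MBps burst ceiling as a 1 TB file system.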
As with other NFS file shares, compute instances mount the remote file system to access data. Administrators create "mount targets" in a VPC, one per AZ, so that instances in different AZs can mount the same share.
Once mounted, the Amazon EFS share looks like any other file system that's compliant with the Portable Operating System Interface (POSIX), and it uses standard NFS permissions to control access for users and groups. Mount targets also enable on-premises systems to access EFS shares through a VPC that is reachable over a Direct Connect network link. Furthermore, EFS is available to VMs hosted on VMware Cloud on AWS.
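As a rough sketch of what mounting involves, the snippet below assembles an NFS mount command for an EFS share. The file system ID, region and mount point are placeholders, the helper names are hypothetical, and the mount options are the ones commonly recommended for EFS with a standard Linux NFS client; check them against current AWS guidance before use.

```python
import shlex

# NFS client options commonly recommended for EFS (an assumption to verify).
NFS_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2"


def efs_dns_name(file_system_id: str, region: str) -> str:
    """Build the regional DNS name that resolves to a mount target."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def nfs_mount_command(file_system_id: str, region: str, mount_point: str) -> list:
    """Return the mount command as an argument list (it is not executed here)."""
    source = f"{efs_dns_name(file_system_id, region)}:/"
    return ["sudo", "mount", "-t", "nfs4", "-o", NFS_OPTIONS, source, mount_point]


# Placeholder values for illustration only.
cmd = nfs_mount_command("fs-12345678", "us-east-1", "/mnt/efs")
print(shlex.join(cmd))
```

The command is only printed, so the sketch can be run safely on a machine that has no access to an EFS share.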
Because users can customize its I/O performance, EFS is versatile and well-suited for a range of workloads, including data analytics, database backups, rich media storage, content management collaboration, user home directories and container image storage.
Enterprises typically compare EFS to S3 and EBS when they weigh their AWS storage options. EFS is generally best for traditional file-based applications, while S3 is best for cloud-native applications. EBS is ideal when users require maximum control over the file volume configuration. Other characteristics of each include:
EFS offers two performance modes, General Purpose and Max I/O, which users configure at setup to accommodate different workloads.
Users can also configure EFS in one of two throughput modes, bursting or provisioned, that control a share's I/O capacity.
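Both settings are chosen when the file system is created. The sketch below builds the request parameters for the EFS `CreateFileSystem` API, where the modes appear as `generalPurpose` or `maxIO` and `bursting` or `provisioned`. The helper name and creation token are hypothetical, and the commented-out call assumes the boto3 SDK is installed and AWS credentials are configured.

```python
from typing import Optional

PERFORMANCE_MODES = ("generalPurpose", "maxIO")


def file_system_params(token: str,
                       performance_mode: str = "generalPurpose",
                       provisioned_mibps: Optional[float] = None) -> dict:
    """Build CreateFileSystem parameters for the chosen modes."""
    if performance_mode not in PERFORMANCE_MODES:
        raise ValueError(f"unknown performance mode: {performance_mode}")
    params = {"CreationToken": token, "PerformanceMode": performance_mode}
    if provisioned_mibps is None:
        # No provisioned rate requested, so throughput follows file system size.
        params["ThroughputMode"] = "bursting"
    else:
        params["ThroughputMode"] = "provisioned"
        params["ProvisionedThroughputInMibps"] = provisioned_mibps
    return params


params = file_system_params("example-token", provisioned_mibps=10)
print(params)

# With boto3 available, the request could be sent like this (not run here):
# import boto3
# boto3.client("efs").create_file_system(**params)
```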
EFS uses a system of credits to determine how long bursts can be sustained, and larger file systems earn credits faster. For example, a 100 GB share can burst for up to 72 minutes per day, while a 1 TB file system earns more credits that enable it to burst for up to 12 hours per day. File systems larger than 1 TB can burst to proportionately larger values than 100 MBps based on size -- for example, a 5 TB share can burst to 500 MBps for up to 12 hours per day. To monitor their burst credits, users can set up a CloudWatch alarm that notifies them when the "BurstCreditBalance" metric drops below a certain threshold.
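The daily burst durations above follow from simple arithmetic if credits are assumed to accrue at the baseline rate and to be spent at the burst rate. The sketch below reproduces the three examples under that assumption; the names are hypothetical and the model ignores details such as the maximum credit balance.

```python
MINUTES_PER_DAY = 24 * 60


def burst_minutes_per_day(size_tb: float) -> float:
    """Minutes of full-rate bursting that one day of earned credits can fund."""
    baseline = 50 * size_tb            # credits earned, MBps
    burst = max(100, 100 * size_tb)    # credits spent while bursting, MBps
    return MINUTES_PER_DAY * baseline / burst


for size_tb in (0.1, 1, 5):
    minutes = burst_minutes_per_day(size_tb)
    print(f"{size_tb} TB: about {round(minutes)} minutes per day")
```

Under this model, every file system of 1 TB or more reaches the same 12 hours per day, because baseline and burst rates both scale with size.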
Provisioned throughput is an extra-cost option, so use it judiciously. For example, a 100 GB file system with 10 MBps of provisioned throughput would cost $30 per month for the basic (burstable) storage. It would cost an additional $30 per month for the provisioned I/O, after accounting for the 5 MBps of throughput -- at 50 KBps per GB of capacity -- included with burstable service.
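The pricing example can be worked through as follows. The unit prices are not stated in the text; they are inferred from its figures ($30 for 100 GB of storage and $30 for 5 MBps of billable throughput) and should be treated as assumptions, since AWS pricing varies by region and over time.

```python
# Unit prices inferred from the example above; treat them as assumptions.
STORAGE_USD_PER_GB_MONTH = 0.30
PROVISIONED_USD_PER_MBPS_MONTH = 6.00
INCLUDED_MBPS_PER_GB = 0.05  # 50 KBps per GB comes with burstable storage


def monthly_cost(size_gb: float, provisioned_mbps: float) -> tuple:
    """Return (storage cost, provisioned throughput cost) in USD per month."""
    storage = size_gb * STORAGE_USD_PER_GB_MONTH
    included = size_gb * INCLUDED_MBPS_PER_GB
    # Only throughput above the included allowance is billed.
    billable = max(0.0, provisioned_mbps - included)
    return storage, billable * PROVISIONED_USD_PER_MBPS_MONTH


storage_usd, throughput_usd = monthly_cost(100, 10)
print(f"storage ${storage_usd:.2f}, provisioned throughput ${throughput_usd:.2f}")
```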
Because Amazon EFS is based on NFS version 4, users should follow best practices for an NFS file volume. For instance, they should unmount a file system before deleting its mount target; use a current Linux version with the latest NFS code and bug fixes; configure the OS to parallelize open and close operations; and ensure that the number of open files and simultaneous users doesn't exceed EFS limits, which are 32,000 and 128, respectively.
As with other NFS implementations, each file operation incurs some latency, so accessing large numbers of small files is much slower than reading one large file. To get around this, parallelize small file access across many EC2 instances, which results in higher aggregate throughput. Also, read and write requests to an EFS share use up system memory and CPU resources, so choose larger instance types to handle the I/O for apps that make thousands of EFS accesses.
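The same idea applies within a single instance: issuing many small reads concurrently hides per-operation latency. The sketch below is a single-machine analogue that uses threads rather than multiple EC2 instances. It reads from a temporary local directory so it can run anywhere; on a real system the paths would point at the EFS mount point, and the helper names and worker count are arbitrary choices.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_file(path: Path) -> bytes:
    """Read one small file; on EFS each call pays a network round trip."""
    return path.read_bytes()


def read_all(paths, max_workers: int = 16) -> dict:
    """Read many files concurrently so their latencies overlap."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input paths.
        return dict(zip(paths, pool.map(read_file, paths)))


# Demonstration against a temporary directory standing in for an EFS mount.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(20):
        Path(tmp, f"small-{i}.txt").write_bytes(f"file {i}".encode())
    results = read_all(sorted(Path(tmp).iterdir()))

print(f"read {len(results)} files")
```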
21 Aug 2018