Re: Possible to bring in data stored on S3?
It’s quite fast. Using the AWS CLI tools, I clock downloads at ~180-200 MB/s (megabytes per second) from S3 to an r3 instance. Similar for uploads. The CLI tools run up to 10 threads per download, thus the speed. YMMV of course, especially with smaller instances. Keep in mind S3 does not provide real random access (though byte-range headers are supported), so it cannot replace NFS/EBS in all cases.
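To make the byte-range point concrete, here is a minimal sketch of how an object could be split into HTTP Range header values, each fetched by a separate GET request. That is roughly the idea behind a multi-threaded download. The function name byte_ranges and the part size are assumptions for illustration only; this is not how the AWS CLI is actually implemented.

```python
def byte_ranges(total_size, part_size):
    """Return HTTP Range header values that cover total_size bytes.

    Hypothetical helper for illustration: each value could be sent as the
    Range header of its own GET request, so parts can be fetched in parallel.
    """
    if total_size < 0 or part_size <= 0:
        raise ValueError("total_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < total_size:
        # HTTP byte ranges are inclusive on both ends.
        end = min(start + part_size, total_size) - 1
        ranges.append("bytes=%d-%d" % (start, end))
        start = end + 1
    return ranges


# A 10-byte object in 4-byte parts:
# byte_ranges(10, 4) == ["bytes=0-3", "bytes=4-7", "bytes=8-9"]
```

Each range still costs a full request round trip, which is why this is workable for large sequential chunks but a poor substitute for the fine-grained random access you get from NFS or EBS.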
Nik
On Oct 6, 2014, at 6:19 PM, John Readey <jreadey_at_hdfgroup.org> wrote:
> Is anyone pulling in data on demand using the S3 APIs? I’m curious how the performance compares to using NFS.
>
> John
>
> From: Jennifer Staab <jstaab_at_cs.unc.edu>
> Date: Monday, October 6, 2014 at 6:08 PM
> To: greg <margeemail_at_gmail.com>, "starcluster_at_mit.edu" <starcluster_at_mit.edu>
> Subject: Re: [StarCluster] Possible to bring in data stored on S3?
>
> Besides the others mentioned, AWS has its own command line interface (CLI); see here for installation instructions and here for usage. The AWS CLI provides tools for accessing S3. If you don't want to use the AWS CLI, there are many SDKs available, depending on your programming language or platform; see here.
>
> Good Luck.
>
> Jennifer
>
> On 10/6/14 3:29 PM, greg wrote:
>> Hi all,
>>
>> Is it possible to bring data into my cluster from Amazon S3? I didn't
>> see it in the manual.
>>
>> Thanks for reading!
>>
>> Greg
>> _______________________________________________
>> StarCluster mailing list
>> StarCluster_at_mit.edu
>> http://mailman.mit.edu/mailman/listinfo/starcluster
>
Received on Mon Oct 06 2014 - 21:25:13 EDT