Some important things to note. I specified ec2_spot_price, which is the maximum I am willing to pay per hour for my m1.medium instance. I set it to $0.04 an hour, which is pretty low but reasonable for a medium instance. You can find all of the current spot prices either in the AWS console or on the EC2 website. As you can see, spot prices are much lower than the on-demand price of an instance. For example, for the m1.medium instance, which has 3.75 GB of RAM and 1 core, the spot price is currently $0.013 per hour, while the on-demand price is $0.120 per hour. That's roughly an 89% discount on an m1.medium. Of course, you should always read about the downsides of using a spot instance, such as the fact that Amazon can terminate it at any time, without warning. For my benchmarks that is acceptable, since I can always re-run a benchmark if my instance is terminated. I needed to run ten 10-minute benchmarks, so after every benchmark I immediately uploaded the resulting data to S3 so that I wouldn't lose any work if the instance was terminated.
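The upload-after-every-run pattern can be sketched roughly as follows. This is a minimal sketch, not necessarily the script used for these benchmarks: `run_benchmark` is a hypothetical placeholder for the real benchmark command, the bucket name is made up, and the upload step assumes the AWS CLI's `aws s3 cp` (any S3 client would do).

```shell
#!/bin/sh
# Assumptions: the bucket name is hypothetical, and the AWS CLI is
# installed and configured on the instance.
UPLOAD_CMD=${UPLOAD_CMD:-"aws s3 cp"}
BUCKET=${BUCKET:-"s3://my-benchmark-results"}

# Hypothetical placeholder: replace the body with the real benchmark.
run_benchmark() {
    echo "benchmark $1 finished"
}

# Run ten benchmarks, uploading each result as soon as it exists so a
# spot termination costs at most the run that was in progress.
run_all() {
    i=1
    while [ "$i" -le 10 ]; do
        out="result_$i.dat"
        run_benchmark "$i" > "$out" || return 1
        # UPLOAD_CMD is intentionally unquoted so "aws s3 cp" splits into words.
        $UPLOAD_CMD "$out" "$BUCKET/$out" || return 1
        i=$((i + 1))
    done
}
```

Calling `run_all` on the instance runs the benchmarks in sequence. If the instance is terminated partway through, the results already uploaded remain in S3.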
Also, I used the regular Amazon Linux AMI. The available AMIs are listed on the Amazon website. I could just as well have used CentOS, Ubuntu, or any other Linux image for my instance, but I prefer the official Amazon Linux AMI since it provides a very up-to-date OS that feels very similar to a CentOS 6 instance. For example, it uses yum for repository management and RPMs for installation, and its package versions (except for the kernel) are similar to those in CentOS 6.
I also added a special command, periodic_remove, to terminate the instance if something goes wrong inside it. Sometimes yum can hang, or the instance may not start up properly. In those cases, Amazon will not notify you of the problem, and Bosco cannot tell that there is an issue. Since my benchmarks should not last longer than 100 minutes, I automatically remove the instance after it has been running for 150 minutes, which leaves a little breathing room.
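For reference, one common way to write such a policy in an HTCondor submit file is shown below. This is a sketch under assumptions rather than a copy of the statement in the submit file above: it relies on the standard job attributes `JobStatus` (2 means Running) and `EnteredCurrentStatus`, and on the ClassAd `time()` function.

```
# Remove the job, which terminates the instance, once it has been
# in the Running state (JobStatus == 2) for more than 150 minutes.
periodic_remove = (JobStatus == 2) && ((time() - EnteredCurrentStatus) > (150 * 60))
```

Restricting the expression to the Running state means that time spent waiting for the spot request to be fulfilled does not count toward the 150 minutes.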
You may submit the instance with the normal 'condor_submit' command. The job will move to the Running state when the instance has begun running.
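If a script needs to wait for the instance, it can poll the job status. The following is a minimal sketch, assuming the standard HTCondor status codes (2 is Running, 5 is Held); the function name `wait_for_running`, the `POLL_SECONDS` variable, and the job ID passed to it are hypothetical.

```shell
#!/bin/sh
# Poll condor_q until the given job (e.g. 42.0) is Running.
# Returns 0 once running, 1 if the job is held or has left the queue.
wait_for_running() {
    job_id=$1
    while :; do
        status=$(condor_q "$job_id" -format '%d' JobStatus)
        case "$status" in
            2)  return 0 ;;   # Running: the instance has started
            5)  return 1 ;;   # Held: something went wrong
            "") return 1 ;;   # no output: job is no longer in the queue
        esac
        sleep "${POLL_SECONDS:-30}"
    done
}
```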
Once the instance has started, you may ssh into it using the unique ssh key that Bosco generates for you. Its location is specified in the submit file as ec2_keypair_file. You also need the DNS name of the instance, which is available in the job's ClassAd.
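One way to query the ClassAd for the DNS name is sketched below. It assumes the hostname is stored in the `EC2RemoteVirtualMachineName` attribute, which is where HTCondor's EC2 grid type records it; the wrapper function `ec2_hostname` and the job ID in the comment are hypothetical.

```shell
#!/bin/sh
# Print the public DNS name of the instance behind a job.
# Usage: ec2_hostname 42.0   (42.0 is a placeholder job ID)
ec2_hostname() {
    condor_q "$1" -format '%s\n' EC2RemoteVirtualMachineName
}
```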
Querying the job's ClassAd gives you the hostname of the EC2 host. You may then connect to the EC2 instance with the following command, replacing XXXXX with the job number and hostname with the address from the job's ClassAd:
$ ssh -i keyfile.XXXXX ec2-user@hostname
Pros of using Bosco to submit Amazon EC2 Jobs:
- Simple management of Amazon instances from your workstation.
- Specify the spot price right inside the job description.
- Ability to bootstrap the instance easily with user data scripts.
- Ability to use HTCondor policies to manage the instances, such as the periodic_remove statement above.
Cons of using Bosco to submit Amazon EC2 Jobs:
- The EC2 universe is only available on Linux builds of Bosco. You cannot manage EC2 instances with the Mac version of Bosco.
- Amazon EC2 has hundreds of features, but Bosco only supports basic instance submission and spot pricing. You will not be able to use the vast majority of EC2's features when you manage your instances with Bosco. But if all you need is to run some processing, Bosco is great!