StarCluster - Mailing List Archive

Re: Tophat run on a 2-node cluster

From: Jacob Barhak <no email>
Date: Fri, 2 Aug 2013 16:35:13 -0500

Hi Manuel,

You seem to be running into issues similar to the ones I had.

I am not familiar with your application, yet I may have some answers/ideas for you.

On c1.xlarge you have 8 virtual CPUs per node, so you have 16 cores in your cluster if I read this correctly. And each such machine has 7 GB of memory.

The 10 GB issue is something I encountered myself. Unless your application is installed on a different volume, the 10 GB restriction may limit the disk space available for your temp files. Your runs may just choke because your temp is full. I am not sure about this, yet you can check it by running df during your runs.
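For example, during a run you can watch the free space with something like this (assuming the temp files land on the 10 GB root volume; adjust the paths if your application writes its temp files elsewhere):

watch -n 30 df -h / /tmp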

If I am correct, it seems you have 2 solutions:
1. Install your application on the large volume so that the tmp space will be there as well.
2. Share and mount a big volume to replace the tmp directory of your application (see the rough sketch after this list).
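Here is a minimal sketch of the idea behind option 2 on a single node, assuming your application honors the TMPDIR environment variable and that /mnt is the large ephemeral volume (both are assumptions, check your setup):

mkdir -p /mnt/tmp && chmod 1777 /mnt/tmp
# point tools that honor TMPDIR at the big volume
export TMPDIR=/mnt/tmp
# or, more forcefully, bind-mount it over /tmp:
# mount --bind /mnt/tmp /tmp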

I posted a few commands to the list that allow doing this, yet they did not reach the archives, probably due to the length of the post. So I am sending you the lines I used to share a bigger volume and run my simulation on it. Perhaps this will work for you - yet you will have to rewrite the lines for your OS if you are not using Windows. Yet it is possible your problem is not disk space.

      Jacob

#### from my other post ####
Hi Rayson,

Thanks to your guidance, and to Bruce Fields who provided additional guidance and shortcuts, I was able to reach a solution for using the larger ephemeral storage that comes with the larger instance types.
 
I am sending the solution that worked for me to this list to close this issue in a way that others can follow in the future.
 
The solution consists of 3 lines of code that should be executed after the cluster has started and is running. The following lines are for a Windows cmd prompt and assume there are 20 nodes in the cluster (the master plus node001 through node019).
 
FOR %n IN (001,002,003,004,005,006,007,008,009,010,011,012,013,014,015,016,017,018,019) DO (starcluster sshmaster mycluster "echo '/mnt/vol0 node%n(async,no_root_squash,no_subtree_check,rw)' >> /etc/exports")
 
starcluster sshmaster mycluster "service nfs start"
 
FOR %n IN (001,002,003,004,005,006,007,008,009,010,011,012,013,014,015,016,017,018,019) DO (starcluster sshnode mycluster node%n "mount master:/mnt/vol0 /mnt/vol0")
 
Those 3 lines NFS-share /mnt/vol0 on the master with all the nodes. A bash version of these lines should not be hard to write for Linux users. The number of nodes here is handled manually, so this is not the most elegant solution, yet it works and is the only solution that seems to work easily at this point in time.
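For Linux or Mac users, here is a rough, untested bash sketch of the same 3 lines, assuming the cluster is still called mycluster and has the same 20 nodes:

for n in $(seq -f "%03g" 1 19); do
  starcluster sshmaster mycluster "echo '/mnt/vol0 node$n(async,no_root_squash,no_subtree_check,rw)' >> /etc/exports"
done

starcluster sshmaster mycluster "service nfs start"

for n in $(seq -f "%03g" 1 19); do
  starcluster sshnode mycluster node$n "mount master:/mnt/vol0 /mnt/vol0"
done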
 
There are other options in the works that may lead to better solutions for the disk space problem. You can find leads to the ones I know about in the following links:
 
http://star.mit.edu/cluster/mlarchives/1803.html
https://github.com/jtriley/StarCluster/issues/44
 
I hope documenting this solution will help other users.

####### end of copied post #######
 


Sent from my iPhone

On Aug 2, 2013, at 3:51 PM, "Manuel J. Torres" <mjtorres.phd_at_gmail.com> wrote:

> I am trying to run the TopHat software to map ~38 Gb of RNA-seq reads in fastq format to a reference genome on a 2-node cluster with the following properties:
> NODE_IMAGE_ID = ami-999d49f0
> NODE_INSTANCE_TYPE = c1.xlarge
>
> Question: How many CPUs are there on this type of cluster?
>
> Here is a df -h listing of my cluster:
> root_at_master:~# df -h
> Filesystem Size Used Avail Use% Mounted on
> /dev/xvda1 9.9G 9.9G 0 100% /
> udev 3.4G 4.0K 3.4G 1% /dev
> tmpfs 1.4G 184K 1.4G 1% /run
> none 5.0M 0 5.0M 0% /run/lock
> none 3.5G 0 3.5G 0% /run/shm
> /dev/xvdb1 414G 199M 393G 1% /mnt
> /dev/xvdz 99G 96G 0 100% /home/large-data
> /dev/xvdy 20G 5.3G 14G 29% /home/genomic-data
>
> I created a third volume for the output that does not appear in this list but is listed in my config file and which I determined I can read and write to. I wrote the output files to this larger empty volume.
>
> I can't get tophat to run to completion. It appears to be generating truncated intermediate files. Here is the tophat output:
>
> [2013-08-01 17:34:19] Beginning TopHat run (v2.0.9)
> -----------------------------------------------
> [2013-08-01 17:34:19] Checking for Bowtie
> Bowtie version: 2.1.0.0
> [2013-08-01 17:34:21] Checking for Samtools
> Samtools version: 0.1.19.0
> [2013-08-01 17:34:21] Checking for Bowtie index files (genome)..
> [2013-08-01 17:34:21] Checking for reference FASTA file
> [2013-08-01 17:34:21] Generating SAM header for /home/genomic-data/data/Nemve1.allmasked
> format: fastq
> quality scale: phred33 (default)
> [2013-08-01 17:34:27] Reading known junctions from GTF file
> [2013-08-01 17:36:56] Preparing reads
> left reads: min. length=50, max. length=50, 165174922 kept reads (113024 discarded)
> [2013-08-01 18:24:07] Building transcriptome data files..
> [2013-08-01 18:26:43] Building Bowtie index from Nemve1.allmasked.fa
> [2013-08-01 18:29:01] Mapping left_kept_reads to transcriptome Nemve1.allmasked with Bowtie2
> [2013-08-02 07:34:40] Resuming TopHat pipeline with unmapped reads
> [bam_header_read] EOF marker is absent. The input is probably truncated.
> [bam_header_read] EOF marker is absent. The input is probably truncated.
> [2013-08-02 07:34:41] Mapping left_kept_reads.m2g_um to genome Nemve1.allmasked with Bowtie2
> [main_samview] truncated file.
> [main_samview] truncated file.
> [bam_header_read] EOF marker is absent. The input is probably truncated.
> [bam_header_read] invalid BAM binary header (this is not a BAM file).
> [main_samview] fail to read the header from "/home/results-data/top-results-8-01-2013/topout/tmp/left_kept_reads.m2g\
> _um_unmapped.bam".
> [2013-08-02 07:34:54] Retrieving sequences for splices
> [2013-08-02 07:35:16] Indexing splices
> Warning: Empty fasta file: '/home/results-data/top-results-8-01-2013/topout/tmp/segment_juncs.fa'
> Warning: All fasta inputs were empty
> Error: Encountered internal Bowtie 2 exception (#1)
> Command: /home/genomic-data/bin/bowtie2-2.1.0/bowtie2-build /home/results-data/top-results-8-01-2013/topout/tmp/segm\
> ent_juncs.fa /home/results-data/top-results-8-01-2013/topout/tmp/segment_juncs
> [FAILED]
> Error: Splice sequence indexing failed with err =1
>
> Questions:
>
> Am I running out of memory?
>
> How much RAM does the AMI have and can I make that larger?
>
> No matter what StarCluster configuration I define, I can't seem to make my root directory larger than 10 GB, and it appears to be full.
>
> Can I make the root directory larger than 10 GB?
>
> Thanks!
>
> --
> Manuel J Torres, PhD
> 219 Brannan Street Unit 6G
> San Francisco, CA 94107
> VOICE: 415-656-9548
> _______________________________________________
> StarCluster mailing list
> StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster
Received on Fri Aug 02 2013 - 17:41:02 EDT