Setup simulated reads

As an example, we will use a set of simulated reads designed to be representative of Illumina reads that align to the BRCA1 gene.

More details on the source of this dataset can be found in the associated paper:

http://f1000research.com/articles/1-2/v1

Details of the dataset can be found at:

http://figshare.com/articles/Simulated_Illumina_BRCA1_reads_in_FASTQ_format/92338

You will note that these are actually paired-end reads, but for the purposes of this example we will treat them as the results of a fragment run. Once you have worked through the example, you may wish to work out how to repeat your analysis treating the data as paired-end reads.

Copy the reads and the chr17 sequence to your home directory

mkdir ~/ngs

cd ~/ngs

cp /scratch/share_ngs/intro_ngs/Brca1Reads_1.1.fastq .

cp /scratch/share_ngs/intro_ngs/Brca1Reads_1.2.fastq .

cp /scratch/share_ngs/intro_ngs/chr17.fa .
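
Before moving on, it can be worth confirming that all three files arrived and are not empty. The following is a minimal sketch; the helper name `check_inputs` is hypothetical and not part of the course material. It only tests that each file exists and has a non-zero size, not that its contents are valid.

```shell
# Hypothetical helper (not part of the original tutorial):
# report whether each named file exists and is non-empty.
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            rc=1
        fi
    done
    return "$rc"
}

# Run from inside ~/ngs after the cp commands above.
check_inputs Brca1Reads_1.1.fastq Brca1Reads_1.2.fastq chr17.fa \
    || echo "One or more files are missing; re-run the cp commands."
```

The function returns a non-zero status if any file is missing, so it can also be used as a guard at the top of a longer script.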

Check the number of reads

grep @chr Brca1Reads_1.1.fastq | wc -l

100000

grep @chr Brca1Reads_1.2.fastq | wc -l

100000

Alternatively, we can use the -c flag in grep to return a count.

grep -c @chr Brca1Reads_1.1.fastq

100000

grep -c @chr Brca1Reads_1.2.fastq

100000
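
Counting with grep relies on every read header containing `@chr`, and on no other line containing it. In principle a quality line could contain the same characters. As a cross-check, here is a sketch that counts reads by dividing the number of lines by four. It assumes each FASTQ record occupies exactly four lines (no wrapped sequence lines), and the helper name `count_fastq_reads` is hypothetical.

```shell
# Hypothetical helper (not part of the original tutorial):
# count reads as (number of lines) / 4.
# Assumes every FASTQ record is exactly four lines long.
count_fastq_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Report the count for each read file, skipping any that are absent.
for f in Brca1Reads_1.1.fastq Brca1Reads_1.2.fastq; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_fastq_reads "$f")"
    fi
done
```

If this count disagrees with the grep count, or is not a whole number, the file is probably truncated or does not follow the four-line layout.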