CpG Islands

Report potential CpG island regions using 200 bp windows and Obs/Exp ≥ 0.6 with GC ≥ 50%.

Tool Configuration
Configure the parameters for CpG Islands

Paste raw DNA or FASTA-formatted sequence (limit 100,000,000 characters). Non-DNA characters are removed before analysis.

1. Input Genomic DNA Sequences

Paste genomic DNA as raw sequence or in FASTA format. CpG islands are typically found in promoter regions and other regulatory elements. The tool accepts up to 100 million characters and removes non-DNA characters automatically.

2. Understanding Detection Criteria

The tool scans 200 bp windows looking for regions where the observed CpG dinucleotide frequency divided by the expected frequency meets or exceeds 0.6, and where GC content is at least 50%. These criteria follow the original Gardiner-Garden and Frommer method.
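The two criteria can be checked for a single window with a few counts. The sketch below assumes the usual Gardiner-Garden and Frommer definition of the expected CpG count (number of C times number of G, divided by window length) and an uppercase, already cleaned window. The function names `window_stats` and `passes_criteria` are illustrative and are not part of the tool.

```python
def window_stats(window: str) -> tuple[float, float]:
    """Return (GC percent, Obs/Exp CpG ratio) for one window of uppercase DNA."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    cpg = window.count("CG")
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    # Assumed definition: expected CpG count = (#C * #G) / window length.
    expected = c * g / n if n else 0.0
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_percent, obs_exp


def passes_criteria(window: str, min_obs_exp: float = 0.6, min_gc: float = 50.0) -> bool:
    """Apply the thresholds described above to one window."""
    gc_percent, obs_exp = window_stats(window)
    return obs_exp >= min_obs_exp and gc_percent >= min_gc
```

A window with no C or no G has an expected count of zero; this sketch treats its ratio as 0 so that it never qualifies.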

3. Review Island Properties

Each reported island shows 1-based start and end coordinates, the length in base pairs, GC percentage, and the observed/expected CpG ratio. A higher Obs/Exp ratio means less CpG depletion, which is characteristic of regions that tend to stay unmethylated in normal tissue.
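As an illustration of the reported fields, the sketch below turns a 0-based window offset (as used in most programming languages) into 1-based inclusive coordinates and formats one result line. The `IslandRecord` class, its field names, and the output layout are hypothetical; the tool's actual output format may differ.

```python
from dataclasses import dataclass


@dataclass
class IslandRecord:
    start: int          # 1-based, inclusive
    end: int            # 1-based, inclusive
    gc_percent: float
    obs_exp: float

    @property
    def length(self) -> int:
        # Inclusive coordinates: a 1..200 range spans 200 bases.
        return self.end - self.start + 1


def record_from_window(offset: int, size: int, gc_percent: float, obs_exp: float) -> IslandRecord:
    """Convert a 0-based window offset into 1-based inclusive coordinates."""
    return IslandRecord(start=offset + 1, end=offset + size,
                        gc_percent=gc_percent, obs_exp=obs_exp)


def format_record(rec: IslandRecord) -> str:
    """Render one record as a tab-separated line (layout is illustrative)."""
    return (f"{rec.start}..{rec.end}\tlength={rec.length}\t"
            f"GC={rec.gc_percent:.1f}%\tObs/Exp={rec.obs_exp:.2f}")
```

For example, a 200 bp window beginning at the first base of the sequence has offset 0 and is reported as 1..200.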

4. Interpret Biological Context

CpG islands often mark transcription start sites and are usually unmethylated in normal cells. Aberrant methylation can silence tumor suppressor genes. Use coordinate information to correlate islands with known genes or regulatory elements in your sequence.
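One way to correlate islands with known features is a simple interval-overlap check. The sketch below assumes both islands and features use 1-based inclusive coordinates on the same sequence; the feature names and positions in the usage example are made up for illustration.

```python
def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two inclusive intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def annotate(islands: list[tuple[int, int]],
             features: list[tuple[str, int, int]]) -> dict[tuple[int, int], list[str]]:
    """Map each island (start, end) to the names of the features it overlaps."""
    hits: dict[tuple[int, int], list[str]] = {}
    for i_start, i_end in islands:
        hits[(i_start, i_end)] = [
            name for name, f_start, f_end in features
            if overlaps(i_start, i_end, f_start, f_end)
        ]
    return hits


# Hypothetical coordinates, for illustration only.
islands = [(101, 300), (1000, 1199)]
features = [("geneA_promoter", 250, 400), ("geneB_exon1", 2000, 2100)]
result = annotate(islands, features)
```

Here the first island overlaps the hypothetical `geneA_promoter` feature and the second overlaps nothing. For large annotation sets, an interval tree or sorted sweep would scale better than this pairwise comparison.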

Interpretation Notes
Understanding the CpG island criteria used in this tool

Obs/Exp Threshold

The observed-to-expected CpG ratio must be greater than or equal to 0.6 within the 200 bp window to qualify as an island.
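Written out, and assuming the standard Gardiner-Garden and Frommer definition, the ratio for a window of length L containing N_C cytosines, N_G guanines, and N_CpG CG dinucleotides is:

```latex
\[
\text{Obs/Exp}
  = \frac{N_{\mathrm{CpG}}}{N_{\mathrm{C}} \, N_{\mathrm{G}} / L}
  = \frac{N_{\mathrm{CpG}} \cdot L}{N_{\mathrm{C}} \cdot N_{\mathrm{G}}}
\]

% Worked example with illustrative counts (not taken from real data):
% L = 200, N_C = 50, N_G = 60, N_CpG = 12
\[
\text{Expected} = \frac{50 \times 60}{200} = 15,
\qquad
\text{Obs/Exp} = \frac{12}{15} = 0.8 \ge 0.6
\]
```

In this example the GC content is (50 + 60) / 200 = 55%, so the window would meet both criteria.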

%GC Requirement

Windows must have at least 50% GC content to be reported. The percentage is calculated across the full window length.

Sliding Window

The analysis scans windows offset by one base, matching the Gardiner-Garden & Frommer method as implemented in the Sequence Manipulation Suite (SMS).
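A one-base sliding scan does not need to recount every window from scratch: only the base leaving and the base entering change the counts. The sketch below is one possible implementation under that idea, not the tool's actual code. It assumes cleaned uppercase input and yields 0-based offsets; `scan_windows` and its parameters are illustrative names.

```python
from typing import Iterator


def scan_windows(seq: str, size: int = 200, min_obs_exp: float = 0.6,
                 min_gc: float = 50.0) -> Iterator[tuple[int, float, float]]:
    """Yield (0-based offset, GC percent, Obs/Exp) for each passing window."""
    if size < 2:
        raise ValueError("window size must be at least 2")
    if len(seq) < size:
        return
    first = seq[:size]
    c, g, cpg = first.count("C"), first.count("G"), first.count("CG")
    offset = 0
    while True:
        gc_percent = 100.0 * (c + g) / size
        expected = c * g / size
        obs_exp = cpg / expected if expected > 0 else 0.0
        if obs_exp >= min_obs_exp and gc_percent >= min_gc:
            yield offset, gc_percent, obs_exp
        nxt = offset + size              # index of the base entering the window
        if nxt >= len(seq):
            break
        out_base, in_base = seq[offset], seq[nxt]
        if out_base == "C":
            c -= 1
            if seq[offset + 1] == "G":   # the CG pair starting at the leaving base is lost
                cpg -= 1
        elif out_base == "G":
            g -= 1
        if in_base == "C":
            c += 1
        elif in_base == "G":
            g += 1
            if seq[nxt - 1] == "C":      # a CG pair ending at the entering base is gained
                cpg += 1
        offset += 1
```

Adding 1 to each yielded offset gives the 1-based start coordinate. Because every overlapping window is tested, a single GC-rich region typically produces a run of consecutive hits.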

Sequence Cleaning

Non-DNA characters are removed and uracil (U) is converted to thymine (T) prior to scanning to maintain consistent base counts.
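The cleaning step can be sketched as follows. Several details here are assumptions rather than documented behavior: FASTA header lines (starting with `>`) are dropped, the sequence is uppercased, and only A, C, G, and T are kept. Whether the tool retains ambiguity codes such as N is not stated, so adjust the character class if it does. The name `clean_sequence` is illustrative.

```python
import re


def clean_sequence(raw: str) -> str:
    """Return an uppercase A/C/G/T-only sequence from raw or FASTA-formatted text."""
    # Assumption: lines beginning with ">" are FASTA headers and carry no sequence.
    lines = [ln for ln in raw.splitlines() if not ln.startswith(">")]
    seq = "".join(lines).upper()
    # Convert uracil to thymine so RNA-style input is counted consistently.
    seq = seq.replace("U", "T")
    # Assumption: everything other than A, C, G, T counts as non-DNA and is removed.
    return re.sub(r"[^ACGT]", "", seq)
```

For example, `clean_sequence(">seq1\nacgu NN\n12ACGT-")` returns `"ACGTACGT"`.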