Report potential CpG island regions using 200 bp windows and Obs/Exp ≥ 0.6 with GC ≥ 50%.
Paste raw DNA or FASTA-formatted sequence (limit 100,000,000 characters). Non-DNA characters are removed before analysis.
Paste genomic DNA as raw sequence or in FASTA format. CpG islands are typically found in promoter regions and other regulatory elements. The tool accepts up to 100 million characters and removes non-DNA characters automatically.
The tool scans 200 bp windows looking for regions where the observed CpG dinucleotide frequency divided by the expected frequency meets or exceeds 0.6, and where GC content is at least 50%. These criteria follow the original Gardiner-Garden and Frommer method.
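The per-window test described above can be sketched as follows. This is a minimal illustration, not the tool's own code: it assumes the commonly cited form of the expected value, (number of C × number of G) / window length, and the function names `window_metrics` and `passes_criteria` are hypothetical.

```python
def window_metrics(window: str) -> tuple[float, float]:
    """Return (gc_fraction, obs_exp) for one window of uppercase DNA."""
    n = len(window)
    c = window.count("C")
    g = window.count("G")
    cpg = window.count("CG")  # observed CpG dinucleotides
    gc_fraction = (c + g) / n if n else 0.0
    # Assumed formula: expected CpG count = (#C * #G) / window length
    expected = (c * g) / n if n else 0.0
    obs_exp = cpg / expected if expected else 0.0
    return gc_fraction, obs_exp


def passes_criteria(window: str, min_obs_exp: float = 0.6, min_gc: float = 0.5) -> bool:
    """Apply the Obs/Exp >= 0.6 and GC >= 50% thresholds to one window."""
    gc_fraction, obs_exp = window_metrics(window)
    return obs_exp >= min_obs_exp and gc_fraction >= min_gc
```

For example, a 200 bp window made of repeated "CG" passes both thresholds, while a window of repeated "AT" fails because it contains no C or G.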
Each reported island shows 1-based start and end coordinates, the length in base pairs, the GC percentage, and the observed/expected CpG ratio. A higher Obs/Exp ratio indicates stronger CpG enrichment; such regions are generally less likely to be methylated in normal tissue.
CpG islands often mark transcription start sites and are usually unmethylated in normal cells. Aberrant methylation can silence tumor suppressor genes. Use coordinate information to correlate islands with known genes or regulatory elements in your sequence.
The observed-to-expected CpG ratio must be greater than or equal to 0.6 within the 200 bp window to qualify as an island.
Windows must have at least 50% GC content to be reported. The percentage is calculated across the full window length.
The analysis advances the 200 bp window one base at a time, following the Gardiner-Garden and Frommer method as implemented in SMS.
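A one-base sliding scan of this kind might look like the sketch below. It is written under assumptions rather than taken from the tool: the function name `scan_windows` is hypothetical, the expected CpG count is taken as (number of C × number of G) / window length, and each qualifying window is reported on its own without merging overlapping hits.

```python
def scan_windows(seq: str, window: int = 200,
                 min_obs_exp: float = 0.6, min_gc: float = 0.5):
    """Return (start, end, gc_percent, obs_exp) for every qualifying window.

    Coordinates are 1-based and inclusive. `seq` is expected to be
    uppercase DNA that has already been cleaned.
    """
    hits = []
    # Slide the window forward one base at a time.
    for i in range(len(seq) - window + 1):
        w = seq[i:i + window]
        c, g, cpg = w.count("C"), w.count("G"), w.count("CG")
        gc = (c + g) / window
        expected = c * g / window  # assumed expected-CpG formula
        obs_exp = cpg / expected if expected else 0.0
        if obs_exp >= min_obs_exp and gc >= min_gc:
            hits.append((i + 1, i + window, gc * 100, obs_exp))
    return hits
```

Recounting bases for every window keeps the sketch short but is slow on long inputs; a production scan would more likely update the counts incrementally as the window moves.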
Non-DNA characters are removed and uracil (U) is converted to thymine (T) prior to scanning to maintain consistent base counts.
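The cleaning step can be approximated as shown below. This is a sketch under stated assumptions: the helper name `clean_dna` is hypothetical, FASTA header lines are assumed to start with ">", and only A, C, G, and T are kept. The tool itself may retain ambiguity codes such as N, in which case the character class would need to be widened.

```python
import re


def clean_dna(text: str) -> str:
    """Strip FASTA headers and non-DNA characters; convert U to T."""
    # Drop FASTA header lines (assumed to begin with ">").
    lines = [ln for ln in text.splitlines() if not ln.startswith(">")]
    seq = "".join(lines).upper().replace("U", "T")
    # Assumption: keep only the four unambiguous bases.
    return re.sub(r"[^ACGT]", "", seq)
```

With this helper, the input ">seq1", "acgu 12", "NNgc" (three lines) is reduced to "ACGTGC".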