What does TFBSPred do?
TFBSPred predicts transcription factor
binding sites in a selected genomic locus in specific cell types.
How do I cite TFBSPred?
The paper describing the tool is in preparation. In the
meantime, we would be grateful if you could cite its URL (https://www.michalopoulos.net/tfbspred).
Which species are available?
You can search for human and mouse genes.
Why only mouse and not more species?
The mouse genome is around 90% similar to the human genome. Furthermore,
DNase I hypersensitivity data were available only for human and mouse.
Where do stored data come from?
The database contains:
Why is TFBSPred better than similar webtools?
TFBSPred combines several sources of evidence. First, it uses the novel
TFFM algorithm to predict TFBSs. Second, it takes the open chromatin
regions identified by DNase I hypersensitivity (DNase HS) data in multiple
cell types and intersects them with the corresponding regions conserved
between human and mouse. This improves the chance of finding a
biologically active region and reduces the chance of false positives.
In addition, instead of requiring a fixed search length upstream or
downstream of the promoter, TFBSPred determines the conserved open
chromatin region itself.
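The intersection step described above can be pictured with a minimal sketch. It assumes that the open chromatin regions and the conserved regions are each given as sorted, non-overlapping, half-open (start, end) intervals on the same chromosome. The function name and the coordinates are hypothetical, and this is not necessarily how TFBSPred implements the step.

```python
def intersect_intervals(open_regions, conserved_regions):
    """Return the overlaps between two sorted lists of half-open (start, end) intervals."""
    overlaps = []
    i = j = 0
    while i < len(open_regions) and j < len(conserved_regions):
        start = max(open_regions[i][0], conserved_regions[j][0])
        end = min(open_regions[i][1], conserved_regions[j][1])
        if start < end:
            overlaps.append((start, end))
        # Advance whichever interval ends first.
        if open_regions[i][1] < conserved_regions[j][1]:
            i += 1
        else:
            j += 1
    return overlaps


# Hypothetical coordinates, for illustration only.
open_chromatin = [(100, 250), (400, 600)]
conserved = [(200, 450), (550, 700)]
print(intersect_intervals(open_chromatin, conserved))
# [(200, 250), (400, 450), (550, 600)]
```

Only the positions that are both open and conserved survive, which is why a locus can yield no result when either condition fails.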
Why didn't autocomplete find my gene when I typed it?
You were probably using an alias for your gene. Only HGNC/MGI gene symbols are accepted.
Where can I find the HGNC/MGI symbol of my gene?
For human you can use GeneCards (recommended) or the HGNC website.
For mouse you can use the MGI
How do I use a genomic position as input?
Look up your genomic position in Ensembl, then format the input as:
Chromosome:Genomic-Location(Orientation) e.g.
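As a rough illustration of the Chromosome:Genomic-Location(Orientation) layout, the sketch below parses such a string in Python. The accepted orientation symbols (`+`, `-`, `1`, `-1`), the optional `chr` prefix, and the sample input are assumptions made for this example; check the input form on the site for the notation TFBSPred actually expects.

```python
import re

# Assumed layout: chromosome, colon, position, orientation in parentheses.
POSITION_RE = re.compile(r"^(?:chr)?(\w+):(\d+)\(([+-]|-?1)\)$")


def parse_position(text):
    """Split a position string into (chromosome, location, strand)."""
    match = POSITION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not in Chromosome:Location(Orientation) form: {text!r}")
    chromosome, location, orientation = match.groups()
    strand = -1 if orientation.startswith("-") else 1
    return chromosome, int(location), strand


# Made-up input, for illustration only.
print(parse_position("7:5530601(-)"))
# ('7', 5530601, -1)
```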
I’ve submitted my gene of interest and I’m redirected to a
page with a table containing my gene, its available transcripts and
their TSSs. What do I do next?
To proceed, click on one of the available TSSs. The Gene symbol, Ensembl Gene and
Transcript stable ID tabs contain links to the HGNC
Why are some of the TSS cells empty?
Several transcripts may share the same TSS. In that case the TSS is
listed only once.
I’ve picked my TSS. What is next?
Please select at least one option in each category (Lineage, Tissue, Karyotype and Sex); otherwise no
results will be returned. The cell types matching your selection are then presented.
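The reason every category needs a selection can be seen in a small sketch. It assumes a cell type is kept only if its value in each of the four categories is among the selected options; the records, option names, and function name are hypothetical and do not come from the TFBSPred database.

```python
CATEGORIES = ("lineage", "tissue", "karyotype", "sex")


def filter_cell_types(cell_types, selection):
    """Keep the names of cell types whose value matches a selected option in every category."""
    return [
        cell["name"]
        for cell in cell_types
        if all(cell[category] in selection.get(category, set()) for category in CATEGORIES)
    ]


# Hypothetical records, for illustration only.
cells = [
    {"name": "cell_A", "lineage": "mesoderm", "tissue": "blood", "karyotype": "normal", "sex": "F"},
    {"name": "cell_B", "lineage": "endoderm", "tissue": "liver", "karyotype": "cancer", "sex": "M"},
]
chosen = {
    "lineage": {"mesoderm", "endoderm"},
    "tissue": {"blood"},
    "karyotype": {"normal", "cancer"},
    "sex": {"F", "M"},
}
print(filter_cell_types(cells, chosen))
# ['cell_A']
```

Under this assumption, a category with nothing selected can never be matched, so the filtered list is empty.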
Why do I have to select specific cells?
We recommend choosing cell types similar to the ones you are interested in.
Different cell types with different treatments contain various open
What is that TFFM threshold field?
It is the threshold value for the TFFM search. Acceptable values range
from 0.85 to 1. The higher the threshold, the stricter the TFBS
search. We recommend values of around 0.9 to 0.95.
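To make the effect of the threshold concrete, here is a minimal sketch. It assumes each candidate site carries a score between 0 and 1 and that sites scoring at or above the threshold are kept; the TF names, scores, and function name are hypothetical.

```python
def filter_hits(hits, threshold=0.9):
    """Keep (tf_name, score) hits whose score reaches the threshold."""
    if not 0.85 <= threshold <= 1.0:
        raise ValueError("TFFM threshold must be between 0.85 and 1")
    return [hit for hit in hits if hit[1] >= threshold]


# Hypothetical hits, for illustration only.
hits = [("TF_A", 0.97), ("TF_B", 0.91), ("TF_C", 0.86)]
print(filter_hits(hits, 0.90))  # [('TF_A', 0.97), ('TF_B', 0.91)]
print(filter_hits(hits, 0.95))  # [('TF_A', 0.97)]
```

Raising the threshold from 0.90 to 0.95 removes the weaker match, which is what a stricter search means here.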
How long do I have to wait for the results?
On average, results will be presented in about a minute. Please do not resubmit the data until then.
I have submitted my choices and after about a minute I got a
blank page! What went wrong?
There are two possible explanations. You might have chosen a genomic
position that is not in a region conserved between human and mouse. Alternatively,
the selected cell lines might not have open chromatin at that position. To
determine which of the two is the cause, we recommend selecting
all the cell lines. If no result is produced, that could be an indication that the gene is not expressed in the specific cell lines.
I have submitted my choices and after about a minute I got
some results! What am I seeing?
The first thing shown on the results page is the pairwise alignment,
in ClustalW format, of the conserved human and mouse open chromatin
region determined by the selected TSS and cells. The chosen organism is
always first in the pair.
Why is that position in a red background?
The TSS is highlighted in red, and the arrow above it shows the direction of transcription.
What does the View MultiFASTA link do?
It outputs the pairwise alignment shown below it in MultiFASTA
format. The output can be used as input to multiple alignment visualization programs
for further analysis.
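For readers unfamiliar with MultiFASTA, the sketch below shows what such output looks like for a pair of aligned sequences: one `>` header line per sequence, followed by the sequence with gaps preserved. The record names, sequences, and line width are assumptions for illustration, not the exact output of TFBSPred.

```python
def to_multifasta(records, width=60):
    """Format (name, aligned_sequence) pairs as MultiFASTA text."""
    lines = []
    for name, sequence in records:
        lines.append(f">{name}")
        # Slice manually so that gap characters never influence line breaks.
        lines.extend(sequence[i:i + width] for i in range(0, len(sequence), width))
    return "\n".join(lines) + "\n"


# Hypothetical aligned pair, for illustration only.
pair = [("human", "ACGT-ACGTA"), ("mouse", "ACGTTACG-A")]
print(to_multifasta(pair, width=5), end="")
```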
What do the numbers at the side of each line of the
These are the genomic positions of the start and the end of each subsequence, so that
users can easily locate part of the sequence in a genome
browser. Each line is 50 characters long.
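The relationship between the line length and the flanking numbers can be sketched as follows. The sketch assumes that gap characters (`-`) do not consume genomic positions, that the sequence lies on the forward strand, and that the numbers are the positions of the first and last base on each line; the function name and coordinates are hypothetical.

```python
def line_coordinates(aligned_sequence, region_start, line_length=50):
    """Return (start, line, end) for each line of an aligned sequence."""
    rows = []
    position = region_start  # genomic position of the next non-gap base
    for offset in range(0, len(aligned_sequence), line_length):
        line = aligned_sequence[offset:offset + line_length]
        bases = len(line) - line.count("-")
        if bases:
            rows.append((position, line, position + bases - 1))
        else:
            rows.append((None, line, None))  # a line made only of gaps
        position += bases
    return rows


# Short lines and made-up coordinates, for illustration only.
for start, line, end in line_coordinates("ACGT-ACGTA", 1000, line_length=5):
    print(start, line, end)
# 1000 ACGT- 1003
# 1004 ACGTA 1008
```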
So, where are the TFBS predictions?
The TFBS results are located below the pairwise alignment of the
conserved open chromatin region, under the “TFBS Prediction
results” header. Only conserved TFBSs are presented, and their DNA
target sequences are also shown in ClustalW format. The results are sorted alphabetically by TF name,
each with a link to the relevant JASPAR
webpage. More than one TFBS may be found for a single TF.
I found the TFBS of a TF I am interested in but I cannot find
the target sequence on the conserved open chromatin alignment at the
The TFBS results take into account the strand on which the target
sequence was found. Try converting it to its reverse complement
and searching again. If you still cannot find it, the sequence may be split across two lines or contain gaps. In that case, try locating it using the position numbers.
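The search strategy above can be sketched in a few lines of Python. The sketch joins the alignment lines, removes gaps, and then looks for the site on the forward strand and as its reverse complement. The sequences and function names are hypothetical, and only the bases A, C, G and T are handled.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_site(site, alignment_lines):
    """Search for a site in gap-free, joined alignment lines on either strand."""
    target = "".join(alignment_lines).replace("-", "").upper()
    candidates = (("forward", site), ("reverse complement", reverse_complement(site)))
    for label, candidate in candidates:
        index = target.find(candidate.upper())
        if index != -1:
            return label, index  # index is an offset in the gap-free sequence
    return None


# Made-up site that spans a line break and a gap, for illustration only.
lines = ["ACGTTG-", "CAAGGT"]
print(find_site("TTGCAACG", lines))
# ('reverse complement', 1)
```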