Thanks for the info! We are trying to concatenate different tissues so that we can compare annotations across tissues directly, without the potential bias of mapping labels between them. Do you have suggestions on comparing annotations across tissues?
On a related note, could you point me to a tutorial on what the last column means in the Segway output files segway.0.bed.gz and segway.0.layered.bed.gz? Here is a snapshot of segway.bed.gz:
chr1	3000000	85347103	0	1000	.	3000000	85347103	27,158,119	69855
chr1	3000000	85347103	1	1000	.	3000000	85347103	217,95,2	36054
chr1	3000000	85347103	10	1000	.	3000000	85347103	117,112,179	46347
Also, I am trying to visualize and analyze which region labels are tissue-specific and which are consistent across tissues (across the separate Segway files generated for each tissue). Are there existing scripts to do that? Sorry, I wasn't able to find them, and I would appreciate your feedback!
Hi,
The --reverse-world option is 0-indexed, so if the 2nd track in the comma-separated list is the reverse-strand data, the value needs to be "1". There is currently no way to specify multiple worlds to reverse.
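
The 0-based counting can be checked with a short Python sketch. The helper name `reverse_world_index` and the convention that reverse-strand track names end in ".reverse" are assumptions made only for illustration; nothing here says Segway inspects track names.

```python
def reverse_world_index(track_spec, suffix=".reverse"):
    """Return the 0-based position of the single reverse-strand world.

    track_spec is the comma-separated value passed to one --track option,
    e.g. "rna.forward,rna.reverse". The ".reverse" suffix is a naming
    convention assumed for this sketch.
    """
    worlds = track_spec.split(",")
    matches = [i for i, name in enumerate(worlds) if name.endswith(suffix)]
    if len(matches) != 1:
        # Only one world can be reversed, so anything else is rejected.
        raise ValueError("expected exactly one reverse world, found %d" % len(matches))
    return matches[0]


index = reverse_world_index("rna.forward,rna.reverse")
print("--reverse-world=%d" % index)  # prints --reverse-world=1
```

A specification with four worlds and two reverse-strand entries raises an error in this sketch, which mirrors the single-world limitation described above.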
While this may seem like a significant limitation, it is not an often sought-after feature. Concatenation is the go-to way to handle stranded data, but it is less obviously useful, or perhaps necessary, with multiple cell types. For a better discussion of this, you can refer to a segmentation and genome annotation review paper:
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1009423
Currently the best way to handle this is to have two separate Segway models, one for each tissue/cell type, if you wish to preserve strandedness for both. You could also vertically stack the data and concatenate across strandedness; however, this depends on your use case and is discussed better in the review paper.
It is very unlikely either trained model will suffer any kind of ill fit from having the data split across cell types. On annotation comparison between tissue/cell types, you will likely have to manually map equivalent labels as best you can based on
the learned parameters (using Segtools or other annotation software will likely help here).
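
As a rough starting point for such a mapping, labels from two tissues can be cross-tabulated by how many base pairs they share. This is a simpler overlap heuristic, not the parameter-based comparison described above, and it is not a Segtools feature. The function names and the toy segments are hypothetical.

```python
from collections import defaultdict


def label_overlap(segs_a, segs_b):
    """Count base pairs shared by each (label_a, label_b) pair.

    Each argument is a list of (start, end, label) tuples on one chromosome,
    sorted by start, half-open, and non-overlapping within the list.
    """
    overlap = defaultdict(int)
    i = j = 0
    while i < len(segs_a) and j < len(segs_b):
        start_a, end_a, label_a = segs_a[i]
        start_b, end_b, label_b = segs_b[j]
        shared = min(end_a, end_b) - max(start_a, start_b)
        if shared > 0:
            overlap[(label_a, label_b)] += shared
        # Advance whichever segment ends first.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return dict(overlap)


def best_match(overlap):
    """For each label in A, pick the label in B with the largest overlap."""
    best = {}
    for (label_a, label_b), bp in overlap.items():
        if label_a not in best or bp > best[label_a][1]:
            best[label_a] = (label_b, bp)
    return {a: b for a, (b, _) in best.items()}


# Toy segmentations (made-up coordinates and labels).
heart = [(0, 100, "0"), (100, 250, "1"), (250, 400, "0")]
brain = [(0, 120, "2"), (120, 260, "0"), (260, 400, "2")]
print(best_match(label_overlap(heart, brain)))
```

A candidate mapping from overlap alone should still be checked against the learned parameters, since two labels can cover the same regions while representing different signal patterns.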
Hope this helps!
Eric
Hi Eric,
I'm so sorry to bother you again. I am re-reading the description of the --reverse-world option at
https://segway.hoffmanlab.org/doc/3.0.4/segway.html#segrna
and I doubt whether I used it correctly. I have RNA-seq assays with forward and reverse strands, as well as a number of other assays that are not strand-specific. I am hoping to use the different assays as different tracks and to concatenate the Heart and Brain genomes.
Would you please advise whether the following track specification is in the correct format (I duplicated the non-strand-specific assays to the forward and reverse strands), and which --reverse-world value to specify?
Track name: "--track Heart.5mc.FFPE.forward,Heart.5mc.FFPE.reverse,Brain.5mc.FFPE.forward,Brain.5mc.FFPE.reverse --track Heart.H3K4m3.FFPE.forward,Heart.H3K4m3.FFPE.reverse,Brain.H3K4m3.FFPE.forward,Brain.H3K4m3.FFPE".
Thanks a lot for your help on this issue!
Ran
Hi Eric,
Thanks a lot! These are really helpful!
Ran
For 1, you likely do not need to perform any normalization. Segway by default does an arcsinh normalization on each track (which can be turned off). Additionally, the conditional probability distribution learned is specific to the track and is evenly weighted
across all tracks.
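
To see why an arcsinh transform reduces the need for prior normalization, here is a small Python illustration of the function itself. The signal values are made up, and this shows only the mathematical transform, not Segway's internal code.

```python
import math

# Made-up signal values spanning several orders of magnitude.
signal = [0.0, 1.0, 10.0, 100.0, 1000.0]

# asinh(x) = ln(x + sqrt(x^2 + 1)): roughly linear near zero,
# roughly logarithmic for large x, and defined at zero (unlike log).
transformed = [math.asinh(x) for x in signal]

for raw, value in zip(signal, transformed):
    print("%8.1f -> %.3f" % (raw, value))
```

The transformed values grow slowly even when the raw values differ by orders of magnitude, which is why tracks with very different dynamic ranges end up on a more comparable scale.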
For 2, there is no good way of trying multiple label counts in one command. Multiple training instances are aimed at finding the best fit for a given number of labels from different initial starting parameters. If you want to try different numbers of labels, you need to train a new Segway model for each one.
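
One way to organize this is to generate one training command per label count. The Python sketch below only builds the command lines and does not run them. The helper name `train_commands` and the per-count directory naming are assumptions; the command shape is copied from the one in the question.

```python
import shlex


def train_commands(genomedata, label_counts, traindir_prefix="traindir_nlabel"):
    """Build one `segway train` command line per label count."""
    commands = []
    for n in label_counts:
        argv = [
            "segway", "train",
            "--num-labels=%d" % n,
            genomedata,
            "%s%d" % (traindir_prefix, n),  # separate output directory per model
        ]
        commands.append(shlex.join(argv))
    return commands


# Label counts 2, 3 and 4, each trained as its own model.
for command in train_commands("test.genomedata", range(2, 5)):
    print(command)
```

Each printed line can then be run or submitted as a separate job, and the resulting models compared afterwards.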
Hope that helps.
Eric
Hi Eric,
Sorry to interrupt you again, I have two more questions on running segRNA:
- For RNA-seq data, what type of data normalization should we perform before running segRNA? Also, does it have to be consistent with other ChIP-seq assay data distributions when running segway jointly on RNA-seq and ChIP-seq?
- I was trying to set "num-labels" to multiple values so that the model can try different numbers of clusters at once, but got the following error:
segway train --num-labels=2:4:1 test.genomedata traindir_nlabel
sh: -c: line 0: syntax error near unexpected token `('
sh: -c: line 0: `x86_64-conda_cos6-linux-gnu-c++ -E -x assembler-with-cpp -DCARD_SEG=slice(2, 4, 1) -DCARD_SUBSEG=1 -DCARD_FRAMEINDEX=2000000 -DSEGTRANSITION_WEIGHT_SCALE=1.0 -Itraindir_nlabel -I. traindir_nlabel/segway.str'
Unexpected EOF Error: expecting GM magic keyword at line 1
Exiting Program
Would you please advise on how to fix it? Thanks a lot for your attention on this!
Thanks,
Ran
Hi,
You're simply missing the "command" part of the arguments. This is usually "train" or "annotate". In this case, it looks like you are missing "train".
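
The check that fails here can be mimicked in a few lines of Python. The command names are taken from the error message quoted in the original question; the helper `has_command` is hypothetical and assumes no global arguments precede the command.

```python
# Command names as listed in the "invalid choice" error message.
SEGWAY_COMMANDS = {
    "train", "train-init", "train-run", "train-run-round", "train-finish",
    "annotate", "annotate-init", "annotate-run", "annotate-finish",
    "posterior", "posterior-init", "posterior-run", "posterior-finish",
    "identify", "identify+posterior",
}


def has_command(argv):
    """True if the word right after 'segway' is a known command."""
    return len(argv) > 1 and argv[1] in SEGWAY_COMMANDS


failing = ["segway", "--track", "h3k27me3", "--track", "h3k36me3",
           "--include-coords=test.bed", "test.genomedata", "traindir_track"]
fixed = failing[:1] + ["train"] + failing[1:]

print(has_command(failing))  # False: "--track" is not a command
print(has_command(fixed))    # True: "train" now follows "segway"
```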
I believe the idea behind the non-stranded data is correct.
Hope that helps!
Eric
Dear Eric,
I'm trying to run Segway (version 3.0.4) with concatenated tracks and to train on a subset of coordinates.
I went through the tutorial https://segway.readthedocs.io/en/latest/quick.html and understand the specific parts, but I couldn't find some command arguments in the Segway version I installed from conda. For example, I tried to run a concatenated segmentation by separating tracks with a comma through the following command, but got the error shown after it:
segway --track h3k27me3 --track h3k36me3 --include-coords=test.bed test.genomedata traindir_track
usage: segway [global_args] COMMAND [args]...
segway: error: argument
train create a model with learned parameters
- train-init prepare initial models for parallel training
- train-run train initial models to completion criterion, in parallel
-- train-run-round train models for one round, in parallel
- train-finish select best model and prepare for `annotate`
annotate label a genome using a model
- annotate-init prepare for parallel annotation
- annotate-run annotate the genome, in parallel
- annotate-finish concatenate parallel annotation results
posterior infer posterior probabilities of each label across a genome
- posterior-init prepare for parallel posterior inference
- posterior-run infer posterior probabilities, in parallel
- posterior-finish concatenate parallel posterior inference results
Use `segway COMMAND --help` for help specific to command COMMAND.
: invalid choice: 'h3k27me3' (choose from '', 'train-init', 'train-run', 'train-finish', 'train-run-round', 'annotate-init', 'annotate-run', 'annotate-finish', 'posterior-init', 'posterior-run', 'posterior-finish',
'train', 'annotate', 'identify', 'posterior', 'identify+posterior')
Would you please help advise how to achieve that?
And separately, if I run Segway with some stranded assays (forward and reverse RNA-seq) and some non-stranded ChIP-seq assays, is it OK if I just duplicate the non-stranded ChIP-seq data into .forward and .reverse tracks and use a command like the following:
segway --track h3k27me3.forward,h3k27me3.reverse,rna.forward,rna.reverse --track brain.h3k27me3.forward,brain.h3k27me3.reverse,brain.rna.forward,brain.rna.reverse --reverse-world=1 --include-coords=test.bed test.genomedata traindir_track
Thanks a lot for your attention to this issue!
Best,
Ran
Yes but you will probably want to duplicate the non-strand-specific data like ChIP-seq in both the forward and reverse worlds.
Dear Roberts and Michael,
Thanks for the pointer! Since we have multiple assay types (ChIP-seq, RNA-seq, etc), can we use SegRNA for ChIP-seq assays together with RNA-seq?
Best,
Ran
This is the relevant section in the docs about the --reverse-world option and concatenation.
Not sure why it didn't come up in the search!
Eric
[Bcc Bill]
Eric can you please help with this? M
Michael, I am not sure which documents you are referring to. I searched the readthedocs for Segway for "reverse-world" but came up empty:
Can you help?
Thanks.
Bill
Yeah unfortunately if we let anyone send to it the spam is out of control.
We have a preprint on SegRNA in fact! The --reverse-world option is what you need; it should be described in the docs.
Michael
Hi Michael,
I tried to figure out how to send a support message about Segway, but it seems like I can't ask a question without subscribing to the list. Is that right? Seems like a lot of overhead!
Anyway, the question (below) is whether RNA-seq is supported. Did you ever end up implementing the two-strand stuff for RNA-seq analysis?
Thanks!
Bill
You are not authorized to send mail to the SEGWAY-L list from your
[log in to unmask] account. You might be authorized to post to the list from another
account, or perhaps when using another mail program configured to use a
different email address. However, LISTSERV has no way to associate this other
account or address with yours. If you need assistance or if you have any
questions regarding the policy of the SEGWAY-L list, please contact the list
owners at
[log in to unmask].
---------- Forwarded message ----------
From: William Stafford Noble <
[log in to unmask]>
To:
[log in to unmask]
Cc:
Bcc:
Date: Wed, 1 Jun 2022 20:39:03 -0700
Subject: RNA-seq segmentation
Does Segway support RNA-seq segmentation? I know there were plans to do so, but I don't know (remember?) whether those plans were ever completed. We have data from four tissues and five assay types, and we were thinking of applying Segway to it, but
one of the assays is RNA-seq.
Thanks.
Bill
This e-mail may contain confidential and/or privileged information for the sole use of the intended recipient.
Any review or distribution by anyone other than the person for whom it was originally intended is strictly prohibited.
If you have received this e-mail in error, please contact the sender and delete all copies.
Opinions, conclusions or other information contained in this e-mail may not be that of the organization.
If you feel you have received an email from UHN of a commercial nature and would like to be removed from the sender's mailing list please do one of the following:
(1) Follow any unsubscribe process the sender has included in their email
(2) Where no unsubscribe process has been included, reply to the sender and type "unsubscribe" in the subject line. If you require additional information please go to our UHN Newsletters and Mailing Lists page.
Please note that we are unable to automatically unsubscribe individuals from all UHN mailing lists.
Patient Consent for Email:
UHN patients may provide their consent to communicate with UHN about their care using email. All electronic communication carries some risk. Please visit our website
here to learn about the risks of electronic communication and how to protect your privacy. You may withdraw your consent to receive emails from UHN at any time. Please contact your care provider, if you do not wish to receive emails from UHN.