UCHIME manual
Under Windows, use backslashes and drive letters as needed:
C:\binaries\usearch.exe -cluster_fast seqs.fasta -id 0.9 -centroids nr.fasta
If the binary is in your current directory, you can use ./ or .\ (dot-slash or dot-backslash) as the path name:
./usearch -cluster_fast seqs.fasta -id 0.9 -centroids nr.fasta
Command-line options
Options are given after a hyphen (-). Command names are options, so they must also have a hyphen. Two hyphens are allowed (Linux long option form), so both -id 0.9 and --id 0.9 are allowed. There must be a space between the option name and its value, so -id0.9 is not allowed.
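The option rules above can be illustrated with a small Python sketch. The helper name build_usearch_argv is hypothetical and not part of USEARCH; it only assembles an argument list in which the command name gets a hyphen and each option name and its value are separate tokens, using the example options already shown.

```python
def build_usearch_argv(binary, command, input_file, options):
    """Assemble an argument list following the option rules described above.

    binary, command, input_file: strings such as "./usearch",
    "cluster_fast" and "seqs.fasta".
    options: dict mapping option names (without hyphen) to values.
    """
    # The command name is itself an option, so it also gets a hyphen.
    argv = [binary, "-" + command, input_file]
    for name, value in options.items():
        # Name and value are separate tokens, i.e. separated by a space.
        argv.append("-" + name)
        argv.append(str(value))
    return argv


argv = build_usearch_argv(
    "./usearch", "cluster_fast", "seqs.fasta",
    {"id": 0.9, "centroids": "nr.fasta"},
)
print(" ".join(argv))
```

Keeping the arguments as a list, rather than one string, avoids quoting problems if the list is later passed to a process launcher.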
Commands
A command line must have exactly one command. The command name is usually followed by the name of an input or query file in FASTA format.
UCHIME algorithm
UCHIME is an algorithm for detecting chimeric sequences. It is implemented in the uchime_ref and uchime_denovo commands.
The fundamental step in UCHIME is a search for a 3-way alignment of a query sequence with two parent sequences (A and B) such that one parent is more similar to one segment of the query (Q) and the other parent is more similar over another segment, as in the figure below. A score is calculated from the alignment. Higher scores indicate a stronger chimeric signal. A score cutoff set by the -minh option (0.28 by default) determines whether the query is classified as a chimera.
This search can be performed with a reference database of parent sequences provided by the user, or the database can be constructed de novo from the query sequences. In de novo mode, the sequences are assumed to be derived from one PCR run. In this case, parent sequences should be more abundant than their chimeras because the parent amplicons will have undergone more rounds of amplification.
Parameter tuning
UCHIME parameters are optimized for detection of very low-divergence chimeras. In typical applications such as 16S OTU picking from next-gen reads, chimeras with divergence less than the OTU radius may not be important, in which case it may be better to retune parameters. This can be done by increasing -mindiv and reducing -minh.
Practical considerations
Please refer to Practical UCHIME for further comments.
Reference database mode
The reference database should contain high-quality sequences that are believed to be chimera-free. A good reference database for 16S ribosomal RNA genes is available in the Microbiome Utilities provided by the Broad Institute. An alternative could be the chimera.slayer reference database provided for use in mothur.
De novo mode
In de novo mode, abundance skew is used to distinguish chimeras from parents. Input should be estimated amplicon sequences with integer abundances specified using size annotations, e.g.:
>FQ23BBGZ5;size=23;
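As an illustration of the size annotation shown above, the following Python sketch extracts the integer abundance from a sequence label. The function name parse_size and the regular expression are assumptions made for this example, not part of USEARCH.

```python
import re

# Matches "size=<integer>" as a semicolon-delimited field in a label.
_SIZE_RE = re.compile(r"(?:^|;)size=(\d+)(?:;|$)")


def parse_size(label):
    """Return the integer abundance from a label, or None if absent."""
    match = _SIZE_RE.search(label.lstrip(">"))
    return int(match.group(1)) if match else None


print(parse_size(">FQ23BBGZ5;size=23;"))  # 23
print(parse_size(">FQ23BBGZ5"))           # None
```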
The minimum abundance skew is specified by the -abskew parameter, which defaults to 2.0 (because one round of PCR doubles the abundance). Abundance is a measure of how many amplicons with a given unique sequence were present in the sample after amplification by PCR. One way to estimate this is to sum the total number of reads in the cluster used to estimate the given amplicon sequence. UCHIME uses only ratios of abundances, so the absolute value does not matter. However, the number of reads is a useful indicator; for example, a cluster containing one read is likely to be spurious. Amplicon sequences and abundances can be estimated using USEARCH, or by using another algorithm such as Chris Quince's PyroNoise or AmpliconNoise. When using de novo mode, sequences should be estimated amplicons from one sequencing run (strictly, one PCR amplification stage), otherwise abundances may not be directly comparable.
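A minimal sketch of how an abundance-skew test of this kind could work, assuming that a sequence qualifies as a candidate parent only if its abundance is at least abskew times that of the query. The function name, the dictionary input and the exact comparison are illustrative assumptions, not USEARCH internals.

```python
def candidate_parents(query_size, sizes, abskew=2.0):
    """Return labels whose abundance is at least abskew times the query's.

    sizes: dict mapping sequence label -> integer abundance.
    Only the ratio matters, so any consistent abundance unit works.
    """
    return [
        label for label, size in sizes.items()
        if size >= abskew * query_size
    ]


# Hypothetical abundances for three sequences and a query of size 23.
sizes = {"A": 120, "B": 46, "C": 30}
print(candidate_parents(23, sizes))  # ['A', 'B']
```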
See also
UCHIME order dependency
Reproducibility of results
Reference
Edgar, RC, Haas, BJ, Clemente, JC, Quince, C, Knight, R (2011) UCHIME improves sensitivity and speed of chimera detection, Bioinformatics doi: 10.1093/bioinformatics/btr381 [PMID 21700674].
UCHIME input order dependency
UCHIME algorithm
Occasionally, UCHIME de novo produces different results when the input order is changed. This behavior is an unavoidable side-effect of the algorithm design, and does not indicate a bug. The most common reason is a tie in maximum smoothed identity when identifying the top parents. Call the tied parents A1 and A2. The tie is broken arbitrarily (by picking the first in database order). It may be that A1 creates a chimeric 3-way alignment, while A2 does not. This will result in the query sequence being classified as chimeric if A1 is chosen, but not chimeric if A2 is chosen. This usually happens because the query is a marginal hit, so the difference caused by a change in input order is comparable to the change you would see with a small change in the score threshold (-minh option).
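The tie-breaking effect can be reproduced in a toy Python sketch. The function top_parent and the identity values are made up for illustration; the sketch only shows how picking the first of several tied candidates makes the choice depend on database order.

```python
def top_parent(identities):
    """Pick the label with the highest identity; ties go to the first seen.

    identities: list of (label, smoothed_identity) in database order.
    """
    best_label, best_identity = None, float("-inf")
    for label, identity in identities:
        # Strict '>' keeps the earlier entry when identities are tied.
        if identity > best_identity:
            best_label, best_identity = label, identity
    return best_label


# Same candidates, two different database orders.
order_1 = [("A1", 98.5), ("A2", 98.5), ("B", 97.0)]
order_2 = [("A2", 98.5), ("A1", 98.5), ("B", 97.0)]
print(top_parent(order_1))  # A1
print(top_parent(order_2))  # A2
```

If only A1 yields a chimeric 3-way alignment, the first ordering would report a chimera and the second would not.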
uchime_ref command
Chimera detection using the UCHIME algorithm. See UCHIME score for parameters.
A database file of nucleotide sequences must be specified using the -db option. The database may be in FASTA or UDB format. UDB format is faster to load. The reference database should include sequences that might appear as parents in the query set. These should be high-quality sequences that are believed to be free of chimeras. Errors in reference sequences will degrade detection accuracy and increase the number of false positives. Chimeras will not be detected if their parents (or sufficiently close relatives) are not present in the database.
The -strand option is required. If the query sequences are known to be oriented on the same strand as the reference database, then -strand plus will give faster execution. Otherwise, -strand both should be used.
Multithreading is supported.
Output options are -uchimeout, -uchimealns, -chimeras and -nonchimeras. The -uchimeout5 option sets the uchimeout format to be compatible with previous versions.
The -self and -selfid options specify that a reference sequence matching the query sequence should be ignored. This is useful for estimating the false-positive rate using a database of sequences known to be free of chimeras. With -self, matching is done by the sequence label; with -selfid, matching is done from an alignment (a 100% match is ignored).
Example
usearch -uchime_ref reads.fasta -db 16s_ref.udb -uchimeout results.uchime -strand plus
uchime_denovo command
De novo chimera detection using the UCHIME algorithm. See UCHIME score for parameters.
The input file must contain estimated amplicons with abundances specified by size annotations.
Multithreading is not supported.
The -abskew option sets the abundance skew, which defaults to 2.0.
Example
usearch -uchime_denovo amplicons.fasta -uchimeout results.uchime
uchimeout file
The -uchimeout option specifies a tabbed text output file for the UCHIME commands uchime_ref and uchime_denovo. Fields are shown in the following table. If the -uchimeout5 option is specified, then the T field (#5) is not output so that the format is backwards compatible with USEARCH v5. See UCHIME scoring for explanation of the fields.
Field  Name   Description
1      Score  Higher score means more strongly chimeric alignment.
2      Q      Query label.
3      A      Parent A label.
4      B      Parent B label.
5      T      Top parent (T) label. This is the closest reference sequence; usually either A or B.
6      IdQM   Percent identity of query and the model (M) constructed as a segment of A and a segment of B.
7      IdQA   Percent identity of Q and A.
8      IdQB   Percent identity of Q and B.
9      IdAB   Percent identity of A and B.
10     IdQT   Percent identity of Q and T.
11     LY     Yes votes in left segment.
12     LN     No votes in left segment.
13     LA     Abstain votes in left segment.
14     RY     Yes votes in right segment.
15     RN     No votes in right segment.
16     RA     Abstain votes in right segment.
17     Div    Divergence, defined as (IdQM - IdQT).
18     YN     Y or N, indicating whether the query was classified as chimeric. This requires that Score >= threshold specified by -minh, Div > minimum divergence specified by -mindiv, and the number of diffs (Y+N+A) in each segment (L and R) is greater than the minimum specified by -mindiffs.
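The field layout above can be read with a short Python sketch. It assumes field 8 is IdQB and field 9 is IdAB, as their descriptions indicate, and that a missing value is written as "*", which is an assumption about the file contents. The function names, the sample record and the -mindiv and -mindiffs values passed in are made up for illustration.

```python
FIELDS = ["Score", "Q", "A", "B", "T", "IdQM", "IdQA", "IdQB", "IdAB",
          "IdQT", "LY", "LN", "LA", "RY", "RN", "RA", "Div", "YN"]
FLOAT_FIELDS = {"Score", "IdQM", "IdQA", "IdQB", "IdAB", "IdQT", "Div"}
INT_FIELDS = {"LY", "LN", "LA", "RY", "RN", "RA"}


def parse_uchimeout_line(line, uchimeout5=False):
    """Parse one tab-separated record into a dict keyed by field name."""
    # With -uchimeout5 the T field (#5) is not present.
    names = [n for n in FIELDS if not (uchimeout5 and n == "T")]
    values = line.rstrip("\n").split("\t")
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} fields, got {len(values)}")
    record = {}
    for name, value in zip(names, values):
        if value == "*":  # assumed placeholder for a missing value
            record[name] = None
        elif name in FLOAT_FIELDS:
            record[name] = float(value)
        elif name in INT_FIELDS:
            record[name] = int(value)
        else:
            record[name] = value
    return record


def passes_thresholds(rec, minh, mindiv, mindiffs):
    """Re-apply the YN rule from the table; records with missing values are not handled."""
    left = rec["LY"] + rec["LN"] + rec["LA"]
    right = rec["RY"] + rec["RN"] + rec["RA"]
    return (rec["Score"] >= minh and rec["Div"] > mindiv
            and left > mindiffs and right > mindiffs)


# Made-up record for illustration only; not real USEARCH output.
line = "\t".join(["1.2500", "Q1;size=5;", "A1;size=40;", "B1;size=30;",
                  "A1;size=40;", "99.5", "97.0", "96.0", "94.0", "97.0",
                  "6", "0", "1", "5", "0", "0", "2.5", "Y"])
rec = parse_uchimeout_line(line)
print(rec["Q"], rec["Score"], rec["Div"], rec["YN"])
# minh is the documented default; mindiv and mindiffs are illustrative values.
print(passes_thresholds(rec, minh=0.28, mindiv=0.8, mindiffs=3))
```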
The -uchimealns option specifies a file containing human-readable alignments generated by UCHIME commands. See UCHIME scoring for an example alignment.
-chimeras option
FASTA output file to contain sequences classified as chimeras.
-nonchimeras option
FASTA output file to contain sequences classified as not chimeric.
-strand option
Required for nucleotide searches. Value must be set to plus or both.
-strand plus
Search for hits on the forward (plus) strand only. This is faster, so it is recommended when the query sequences are known to be oriented on the same strand as the database.
-strand both
Search for hits on both the forward and reverse-complemented strands.
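For readers unfamiliar with the term, the reverse complement of a nucleotide sequence can be computed as in the following Python sketch. It only illustrates the operation for the bases A, C, G and T; how USEARCH handles strands internally is not described here, and the function name is an assumption.

```python
# Translation table mapping each base to its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ACCGT"))  # ACGGT
```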
Database files
Search commands require a -db option specifying a database file name. For u-commands (usearch_global, usearch_local and ublast) the database may be in FASTA format or UDB format. Other search commands (search_local and search_global) support FASTA only. The file