外文翻译-α-β剪枝& Zobrist散列.docx

文档编号：650441
上传时间：2023-04-29
格式：DOCX
页数：8
大小：22.25KB

《外文翻译-α-β剪枝& Zobrist散列.docx》由会员分享，可在线阅读，更多相关《外文翻译-α-β剪枝& Zobrist散列.docx（8页珍藏版）》请在冰点文库上搜索。

外文翻译-α-β剪枝& Zobrist散列.docx

外文文献

Alpha–betapruning&Zobristhashing

Alpha–betapruning

Alpha–betapruningisasearchalgorithmthatseekstodecreasethenumberofnodesthatareevaluatedbytheminimaxalgorithminitssearchtree.Itisanadversarialsearchalgorithmusedcommonlyformachineplayingoftwo-playergames（Tic-tac-toe,Chess,Go,etc.）.Itstopscompletelyevaluatingamovewhenatleastonepossibilityhasbeenfoundthatprovesthemovetobeworsethanapreviouslyexaminedmove.Suchmovesneednotbeevaluatedfurther.Whenappliedtoastandardminimaxtree,itreturnsthesamemoveasminimaxwould,butprunesawaybranchesthatcannotpossiblyinfluencethefinaldecision.

Thebenefitofalpha–betapruningliesinthefactthatbranchesofthesearchtreecanbeeliminated.Thisway,thesearchtimecanbelimitedtothe'morepromising'subtree,andadeepersearchcanbeperformedinthesametime.Likeitspredecessor,itbelongstothebranchandboundclassofalgorithms.Theoptimizationreducestheeffectivedepthtoslightlymorethanhalfthatofsimpleminimaxifthenodesareevaluatedinanoptimalornearoptimalorder（bestchoiceforsideonmoveorderedfirstateachnode）.

Withan（averageorconstant）branchingfactorofb,andasearchdepthofdplies,themaximumnumberofleafnodepositionsevaluated（whenthemoveorderingispessimal）isO（b*b*...*b）=O（bd）–thesameasasimpleminimaxsearch.Ifthemoveorderingforthesearchisoptimal（meaningthebestmovesarealwayssearchedfirst）,thenumberofleafnodepositionsevaluatedisaboutO（b*1*b*1*...*b）forodddepthandO（b*1*b*1*...*1）foreven

depth,or.Inthelattercase,wheretheplyofasearchiseven,the

effectivebranchingfactorisreducedtoitssquareroot,or,equivalently,thesearchcangotwiceasdeepwiththesameamountofcomputation.[10]Theexplanationofb*1*b*1*...isthatallthefirstplayer'smovesmustbestudiedtofindthebestone,butforeach,onlythebestsecondplayer'smoveisneededtorefuteallbutthefirst（andbest）firstplayermove–alpha–betaensuresnoothersecondplayermovesneedbeconsidered.Whennodesareorderedatrandom,theaveragenumberofnodesevaluatedisroughly.

0

Normallyduringalpha–beta,thesubtreesaretemporarilydominatedbyeitherafirstplayeradvantage（whenmanyfirstplayermovesaregood,andateachsearchdepththefirstmovecheckedbythefirstplayerisadequate,butallsecondplayerresponsesarerequiredtotrytofindarefutation）,orviceversa.Thisadvantagecanswitchsidesmanytimesduringthesearchifthemoveorderingisincorrect,eachtimeleadingtoinefficiency.Asthenumberofpositionssearcheddecreasesexponentiallyeachmovenearerthecurrentposition,itisworthspendingconsiderableeffortonsortingearlymoves.Animprovedsortatanydepthwillexponentiallyreducethetotalnumberofpositionssearched,butsortingallpositionsatdepthsneartherootnodeisrelativelycheapastherearesofewofthem.Inpractice,themoveorderingisoftendeterminedbytheresultsofearlier,smallersearches,suchasthroughiterativedeepening.

Thealgorithmmaintainstwovalues,alphaandbeta,whichrepresentthemaximumscorethatthemaximizingplayerisassuredofandtheminimumscorethattheminimizingplayerisassuredofrespectively.Initiallyalphaisnegativeinfinityandbetaispositiveinfinity,i.e.bothplayersstartwiththeirlowestpossiblescore.Itcanhappenthatwhenchoosingacertainbranchofacertainnodetheminimumscorethattheminimizingplayerisassuredofbecomeslessthanthemaximumscorethatthemaximizingplayerisassuredof（beta<=alpha）.Ifthisisthecase,theparentnodeshouldnotchosethethisnode,becauseitwillmakethescorefortheparentnodeworse.Therefore,theotherbranchesofthenodedonothavetobeexplored.

Additionally,thisalgorithmcanbetriviallymodifiedtoreturnanentireprincipalvariationinadditiontothescore.SomemoreaggressivealgorithmssuchasMTD（f）donoteasilypermitsuchamodification.

Furtherimprovementcanbeachievedwithoutsacrificingaccuracy,byusingorderingheuristicstosearchpartsofthetreethatarelikelytoforcealpha–betacutoffsearly.Forexample,inchess,movesthattakepiecesmaybeexaminedbeforemovesthatdonot,ormovesthathavescoredhighlyinearlierpassesthroughthegame-treeanalysismaybeevaluatedbeforeothers.Anothercommon,andverycheap,heuristicisthekillerheuristic,wherethelastmovethatcausedabeta-cutoffatthesamelevelinthetreesearchisalways

7

examinedfirst.Thisideacanbegeneralizedintoasetofrefutationtables.

Alpha–betasearchcanbemadeevenfasterbyconsideringonlyanarrowsearchwindow（generallydeterminedbyguessworkbasedonexperience）.Thisisknownasaspirationsearch.Intheextremecase,thesearchisperformedwithalphaandbetaequal;atechniqueknownaszero-windowsearch,null-windowsearch,orscoutsearch.Thisisparticularlyusefulforwin/losssearchesneartheendofagamewheretheextradepthgainedfromthenarrowwindowandasimplewin/lossevaluationfunctionmayleadtoaconclusiveresult.Ifanaspirationsearchfails,itisstraightforwardtodetectwhetheritfailedhigh（highedgeofwindowwastoolow）orlow（loweredgeofwindowwastoohigh）.Thisgivesinformationaboutwhatwindowvaluesmightbeusefulinare-searchoftheposition.

Moreadvancedalgorithmsthatareevenfasterwhilestillbeingabletocomputetheexactminimaxvalueareknown,suchasSCOUT,[11]NegascoutandMTD-f.

Sincetheminimaxalgorithmanditsvariantsareinherentlydepth-first,astrategysuchasiterativedeepeningisusuallyusedinconjunctionwithalpha–betasothatareasonablygoodmovecanbereturnedevenifthealgorithmisinterruptedbeforeithasfinishedexecution.Anotheradvantageofusingiterativedeepeningisthatsearchesatshallowerdepthsgivemove-orderinghints,aswellasshallowalphaandbetaestimates,thatbothcanhelpproducecutoffsforhigherdepthsearchesmuchearlierthanwouldotherwisebepossible.

AlgorithmslikeSSSontheotherhand,usethebest-firststrategy.Thiscanpotentiallymakethemmoretime-efficient,buttypicallyataheavycostinspace-efficiency.

Zobristhashing

Zobristhashing（alsoreferredtoasZobristkeysorZobristsignatures）isahashfunctionconstructionusedincomputerprogramsthatplayabstractboardgames,suchaschessandGo,toimplementtranspositiontables,aspecialkindofhashtablethatisindexedbyaboardpositionandusedtoavoidanalyzingthesamepositionmorethanonce.Zobristhashingisnamedforitsinventor,AlbertLindseyZobrist.Ithasalsobeenappliedasamethodforrecognizingsubstitutionalalloyconfigurationsinsimulationsofcrystallinematerials.

Zobristhashingstartsbyrandomlygeneratingbitstringsforeachpossibleelementofa

boardgame,i.e.foreachcombinationofapieceandaposition（inthegameofchess,that's12pieces×64boardpositions,or14ifakingthatmaystillcastleandapawnthatmaycaptureenpassantaretreatedseparately）.Nowanyboardconfigurationcanbebrokenupintoindependentpiece/positioncomponents,whicharemappedtotherandombitstringsgeneratedearlier.ThefinalZobristhashiscomputedbycombiningthosebitstringsusingbitwiseXOR.

Ifthebitstringsarelongenough,differentboardpositionswillalmostcertainlyhashtodifferentvalues;howeverlongerbitstringsrequireproportionallymorecomputerresourcestomanipulate.Manygameenginesstoreonlythehashvaluesinthetranspositiontable,omittingthepositioninformationitselfentirelytoreducememoryusage,andassumingthathashcollisionswillnotoccur,orwillnotgreatlyinfluencetheresultsofthetableiftheydo.Zobristhashingisthefirstknowninstanceoftabulationhashing.Theresultisa3-wiseindependenthashfamily.Inparticular,itisstronglyuniversal.

Asanexample,inchess,eachofthe64squarescanatanytimebeempty,orcontainoneofthe6gamepieces,whichareeitherblackorwhite.Thatis,eachsquarecanbeinoneof1

+6x2=13possiblestatesatanytime.Thusoneneedstogenerateatmost13x64=832randombitstrings.Givenaposition,oneobtainsitsZobristhashbyfindingoutwhichpiecesareonwhichsquares,andcombiningtherelevantbitstringstogether.

Ratherthancomputingthehashfortheentireboardeverytime,asthepseudocodeabovedoes,thehashvalueofaboardcanbeupdatedsimplybyXORingoutthebitstring（s）forpositionsthathavechanged,andXORinginthebitstringsforthenewpositions.Forinstance,ifapawnonachessboardsquareisreplacedbyarookfromanothersquare,theresultingpositionwouldbeproducedbyXORingtheexistinghashwiththebitstringsfor:

'pawnatthissquare' （XORingoutthepawnatthissquare）'rookatthissquare' （XORingintherookatthissquare）

'rookatsourcesquare' （XORingouttherookatthesourcesquare）'nothingatsourcesquare' （XORinginnothingatthesourcesquare）.

ThismakesZobristhashingveryefficientfortraversingagametree.

Incomputergo,thistechniqueisalsousedforsuperkodetection.

ThesamemethodhasbeenusedtorecognizesubstitutionalalloyconfigurationsduringMonteCarlosimulationsinordertopreventwastingcomputationaleffortonstatesthathavealreadybeencalculated.

α-β剪枝&Zobrist散列

α-β剪枝

α-β剪枝搜索算法,旨在减少节点的数量评估的极大极小算法在搜索树。

这是一

个敌对的搜索算法通常用于机玩的两人游戏（一字棋、国际象棋,等等）。

时停止完全评估移动至少一种可能已经发现,证明这一举措是比之前检查行动。

这些举措不需要进一步评估。

在应用到一个标准的极大极小树,它返回相同的举措极大极小,但树枝剪掉,不可能影响最终的决定。

α-β剪枝的好处在于,可以消除搜索树的分支。

这种方式,搜索时间可以局限于

“更有前途”子树,和一个更深层次的搜索可以在同一时间进行。

和它的前辈一样,它属于类的分支定界算法。